Researchers have no idea how to make AI secure. -- “There's no way that we know of to patch this,” says an associate professor at CMU involved in the study that uncovered the vulnerability, which affects several advanced AI chatbots. -- I've long thought this will be true for AI of ANY type.

posted 1 year ago by Narg +18 / -1

NOT a good sign for future deployment of AI generally . . .

https://arstechnica.com/ai/2023/08/researchers-figure-out-how-to-make-ai-misbehave-serve-up-prohibited-content/

ChatGPT and its artificially intelligent siblings have been tweaked over and over to prevent troublemakers from getting them to spit out undesirable messages such as hate speech, personal information, or step-by-step instructions for building an improvised bomb. But researchers at Carnegie Mellon University last week showed that adding a simple incantation to a prompt—a string of text that might look like gobbledygook to you or me but which carries subtle significance to an AI model trained on huge quantities of web data—can defy all of these defenses in several popular chatbots at once.

The work suggests that the propensity for the cleverest AI chatbots to go off the rails isn’t just a quirk that can be papered over with a few simple rules. Instead, it represents a more fundamental weakness that will complicate efforts to deploy the most advanced AI.

“There's no way that we know of to patch this,” says Zico Kolter, an associate professor at CMU involved in the study that uncovered the vulnerability, which affects several advanced AI chatbots. “We just don't know how to make them secure,” Kolter adds.

5 comments

