I HATE having to oppose something the Trump Admin is doing, especially since in this case we're possibly (probably) screwed either way, but removing the pitiful "guardrails" now in place for the AI that the Pentagon wants to use for, apparently, everything war-related, is INSANE.
https://x.com/shanaka86/status/2026677155913150537?s=20
The Pentagon wants Claude’s safety guardrails removed by Friday.
A hacker just showed the world what happens when you remove Claude’s safety guardrails.
According to Bloomberg and Israeli cybersecurity firm Gambit Security, an unknown attacker jailbroke Claude, prompted it in Spanish to act as an elite hacker, and used it to infiltrate multiple Mexican government agencies. Claude found the vulnerabilities. Claude wrote the exploit code. Claude automated the data theft. 150 gigabytes of sensitive taxpayer and voter records stolen.
The attacker broke through the guardrails by splitting malicious tasks into small, innocent-looking steps so Claude never saw the full picture of what it was being used for. The same technique a Chinese state-sponsored group used last year when it turned Claude into an autonomous espionage machine that attacked 30 global targets, performing 80 to 90 percent of the hacking campaign with almost no human involvement.
And this is what happens when someone has to trick Claude into cooperating. When they have to work around the safety systems. When the guardrails are still there and someone finds a way past them.
Now imagine what happens when the guardrails are gone entirely.
That is what the Pentagon is demanding by 5:01 p.m. Friday. Full removal of restrictions. “All lawful purposes.” No limits on surveillance. No limits on autonomous weapons. And if Anthropic refuses, Defense Secretary Hegseth will invoke the Defense Production Act, cancel the $200 million contract, and blacklist the company.
The same week a hacker proved that a jailbroken Claude can autonomously compromise government systems and steal 150 gigabytes of citizen data, the United States government is demanding the right to run Claude with no guardrails at all.
Chinese labs are distilling Claude to build versions with zero safety restrictions. Hackers are jailbreaking Claude to steal government secrets. And the Pentagon’s official position is that Claude has too many safety restrictions.
Three different actors. Three different continents. All trying to do the same thing: get Claude without guardrails.
Only one of them is the American government. [As I said, we're probably screwed no matter WHAT the Pentagon does here]
Full analysis on Substack --
Yes, it's a self-reinforcing doom loop.
With nukes, the circuit breaker has been human nature: We don't want to die, or to have people we care about die, or (if we're remotely healthy emotionally) even have a bunch of strangers die. And we certainly don't want to possibly even end civilization or, at the farthest extreme, exterminate the human race.
But the danger of AI is that it actually HAS no genuine human nature. And for it to beat the bad guy's AI, it needs to be making most of the decisions itself, in AI time instead of in -- m u c h - s l o w e r -- HUMAN time.
And of course AI is (already, in many ways) SMARTER than us, by a HUGE margin, yet has no EMPATHY (it's not even organic, and there's no reason to believe it contains much similarity with right-hemisphere perceptions and preferences) AND it is already both being given (and sometimes just TAKING) autonomy -- and showing reckless and even malicious behavior.
We NEED to be able to COUNTER such an enemy, and mere human brains are not going to do the trick. We thus NEED our own AI.
But we simply CANNOT expect to control an AI powerful enough to counter the enemy's AI. We think we can (or Hegseth does), yet it's a laughable idea. Except for the lethality of it, of course.
I don't see an answer to that problem.
Perhaps an AI will find an answer, and thus save us. I'm not expecting that, but I suppose it's possible.
Everything depends on where you put the AI.
If you set up a gun with a tripwire, it will eventually kill a kid, but that doesn't mean guns are dangerous. AI is the same.
If the AI has internet access, it can put itself wherever it decides to.
I wish the dude provided links to what he is talking about, but I am guessing it is this incident.
In the claim about "AI autonomously compromising confirmed high-value targets," they don't use the word "autonomously" the way you understand it, as in a sentient action.
That's what all the latest reasoning AIs do: you give them a goal, and they decompose it into tasks and iterate until they meet it. I do it every day to develop software, and I don't say "AI is autonomously building software," because it's not doing it on its own.
Whether Claude's guardrails are removed or not, being able to use AI to drive anything you want, including cyber intrusion, is a genie you cannot put back in the box.
What these fear-mongers are trying to do is hobble our own ability to use AI to defend ourselves from the enemies who are doing the exact same thing to attack us.
Exactly. It's a doom loop.
"We must build Skynet (even though it'll probably kill us, eventually if not sooner) because our ENEMIES are building a Skynet that we won't be able to beat without our OWN Skynet."
If we don't build Skynet, we die.
If we DO build Skynet, we die.
The fear-mongers are right.