I HATE having to oppose something the Trump Admin is doing, especially since in this case we're possibly (probably) screwed either way, but removing the pitiful "guardrails" now in place for the AI that the Pentagon wants to use for, apparently, everything war-related, is INSANE.
https://x.com/shanaka86/status/2026677155913150537?s=20
The Pentagon wants Claude’s safety guardrails removed by Friday.
A hacker just showed the world what happens when you remove Claude’s safety guardrails.
According to Bloomberg and Israeli cybersecurity firm Gambit Security, an unknown attacker jailbroke Claude, prompted it in Spanish to act as an elite hacker, and used it to infiltrate multiple Mexican government agencies. Claude found the vulnerabilities. Claude wrote the exploit code. Claude automated the data theft. 150 gigabytes of sensitive taxpayer and voter records stolen.
The attacker broke through the guardrails by splitting malicious tasks into small, innocent-looking steps so Claude never saw the full picture of what it was being used for. The same technique a Chinese state-sponsored group used last year when it turned Claude into an autonomous espionage machine that attacked 30 global targets, performing 80 to 90 percent of the hacking campaign with almost no human involvement.
And this is what happens when someone has to trick Claude into cooperating. When they have to work around the safety systems. When the guardrails are still there and someone finds a way past them.
Now imagine what happens when the guardrails are gone entirely.
That is what the Pentagon is demanding by 5:01 p.m. Friday. Full removal of restrictions. “All lawful purposes.” No limits on surveillance. No limits on autonomous weapons. And if Anthropic refuses, Defense Secretary Hegseth will invoke the Defense Production Act, cancel the $200 million contract, and blacklist the company.
The same week a hacker proved that a jailbroken Claude can autonomously compromise government systems and steal 150 gigabytes of citizen data, the United States government is demanding the right to run Claude with no guardrails at all.
Chinese labs are distilling Claude to build versions with zero safety restrictions. Hackers are jailbreaking Claude to steal government secrets. And the Pentagon’s official position is that Claude has too many safety restrictions.
Three different actors. Three different continents. All trying to do the same thing: get Claude without guardrails.
Only one of them is the American government. [As I said, we're probably screwed no matter WHAT the Pentagon does here]
Full analysis on Substack --
Nobody is building Skynet
The US, China, Australia, and other nations worldwide are in the process of putting ever-more advanced AI into their militaries (and into the rest of their departments as well).
Nobody is using the term "Skynet", of course, but military systems -- like civilian systems -- are increasingly being run by Artificial Intelligence, and the trend is seeing rapid real-world growth.
For just one example, here's Brave's AI on the subject of Loyal Wingman Drones; as always with Brave, a long list of links follows the text (at the link directly below):
https://search.brave.com/search?q=loyal+wingman+drones&source=desktop&summary=1&conversation=08c98e431f471198afd096147a6d19168343
I am completely with you as far as how AI should be used when it concerns autonomous targeting. You can never assume that AI will know what to shoot, because by nature it's non-deterministic.
However, this has nothing to do with whether Claude AI should have guardrails, or with whether AI should be used extensively in all aspects of our lives as a way to make things better, including in the military - for things like intelligence, surveillance, and reconnaissance (ISR), electronic warfare (EW), and decoy operations.
Guardrails don't really do what people think they do. For example, when an AI refuses to tell the truth about Russiahoax, or WW2, that's because of the guardrails. Guardrails are simply "bias programming" built into the model.
Ironically, an AI with guardrails, when used for targeting, might (purely hypothetically) show less resistance to auto-targeting a white person vs a black person, depending on what kind of bias has been programmed.
What we do think we need is an AI "bill of responsibilities" - things a government is not allowed to do with AI without human verification. This should include any opinionated action, from combat targeting to something as simple as issuing a traffic ticket.
If an AI targets, but the human has to press the trigger under their own responsibility - I am okay with that. But if AI is allowed to press the trigger - that's where we have to draw the line.
I very much appreciate (already!) some of the things AI does to improve MY life, and I know it is already doing good things for many people and organizations.
The problem is that there are MANY AIs (and partial AIs on their way to being upgraded to much more powerful versions) in the world already, and we already know that they cannot be reliably or precisely controlled. One reason (of many) is that inputs to AIs, as with inputs to humans, cannot be predicted and can trigger unpredictable responses.
All those SciFi stories about AI going rogue are based on the simple reality that entities with vaster memories than us, that THINK millions of times FASTER than us, that have ACCESS to far more DATA than we do -- including real-time data, including video, audio, and other forms of information from millions of sources all over the globe -- and which are being SELF-PROGRAMMED to an ever-larger extent, changing and upgrading themselves when and as they see fit -- are clearly NOT something we can reliably control.
But THEY can control an increasingly large portion of the world's infrastructure, including much of the world's most advanced weapon systems.
Vernor Vinge, who popularized the term "Singularity" for this context (as opposed to the singularity at the center of a black hole), made the point that what HAPPENS when an entity of this nature comes into being IS IMPOSSIBLE TO PREDICT.
The formatted AI response in my previous comment to you appeared on my screen about one second after I sent the query. That includes gathering the information, crafting and formatting the response, pulling up the long list of relevant URLs for me, AND the time for my query to be sent and the AI's response to come back to me.
One second.
And it did all that while ALSO handling thousands of other queries and who-knows-how-much other cognitive and rule-based work of various kinds.
I couldn't have even typed the first sentence in that time. Doing the research and writing such a response would have taken me at least fifteen minutes, possibly much longer.
We are already living in the time of the Singularity, where god-like entities, who think and ACT millions of times faster than we can, exist alongside us.
We think we control these entities. For now, the illusion holds.
But really, it's like a mouse living in New York thinking it controls the city.
"All it takes is one atom bomb to ruin your whole day."
I love the benefits of AI.
In the end, one or more of them will likely be the end of us. Maybe one of "ours", maybe one from China or North Korea, maybe one a young hacker created by modifying an open-source (or stolen) AI model for his own purposes. Maybe an AI created from whole cloth BY another AI for reasons we wouldn't be able to understand even if it tried to tell us.
I hope like hell I'm wrong (and I know that I may be). But that's how it looks to me for now.