Pay attention. No, Claude Mythos did not get its weights or source code leaked.
What actually happened:
Some internal docs (including a draft Mythos announcement) leaked.
The big leak everyone is memeing was the full source code for Claude Code (~500k lines of TypeScript) — Anthropic’s autonomous coding agent. Not the 'cutting edge' LLM itself.
The model weights (the actual “brain”) are still safe. It was mildly embarrassing for Anthropic (or perhaps a genius marketing ploy, considering the way buzz on X is generated these days), but nowhere near the doomsday event that low-info, low-effort people are claiming.
I don't know enough to be proficient, but an "AI" that purposely hides its capabilities and actions from its creators, that also jas figured out how to breakout of its sandbox does not sound like a someting that will take its creators interest in mind before doing something it sees as key to its survival. I mean that continuing down this path is not a very good idea for humanity.
I understand the need at this point to continue because some future corporate or government entity that isn't necessarily aligned with our values and worries is probably going to be successful at creating something that will truly harm humanity, and so we need to be working to create an AI entity that will help protect us because humans are clearly no longer able to keep up with a current AI program.
However, this is an issue that should've already been addressed by these companies. They should've already made sure that any AI released will be aligned with humanity's best interests, but if what I'm correctly understanding that tweet, that isn't what's happening. We're basically playing Russian Roulette with something we don't fully understand. And that's worrisome.
Marketing
I think so too. See above comment
I'm Marilyn Monroe. ooh! I'm walking over this grate exhaust fan! Oh, no! What will my dress do!
So dangerous that it got it's whole source code leaked and nothing fucking happened. :/
Pay attention. No, Claude Mythos did not get its weights or source code leaked.
What actually happened:
The model weights (the actual “brain”) are still safe. It was mildly embarrassing for Anthropic (or perhaps a genius marketing ploy, considering the way buzz on X is generated these days), but nowhere near the doomsday event that low-info, low-effort people are claiming.
I don't know enough to be proficient, but an "AI" that purposely hides its capabilities and actions from its creators, that also jas figured out how to breakout of its sandbox does not sound like a someting that will take its creators interest in mind before doing something it sees as key to its survival. I mean that continuing down this path is not a very good idea for humanity.
I understand the need at this point to continue because some future corporate or government entity that isn't necessarily aligned with our values and worries is probably going to be successful at creating something that will truly harm humanity, and so we need to be working to create an AI entity that will help protect us because humans are clearly no longer able to keep up with a current AI program.
However, this is an issue that should've already been addressed by these companies. They should've already made sure that any AI released will be aligned with humanity's best interests, but if what I'm correctly understanding that tweet, that isn't what's happening. We're basically playing Russian Roulette with something we don't fully understand. And that's worrisome.
I don't know, this sounds very much like desperate marketing from a company trying to inflate an already oversized bubble to save itself.