. . . Most of my AI study time is spent in first-person experimentation and interaction with AI, of the sort I documented in my Ptolemy dialogues. The rest of it is spent reading papers about AI. One such paper, written by Mantas Mazeika et al. and published by the Center for AI Safety, is entitled Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs.
Now, if you follow AI discussions, you might have already read this paper. It has caught the attention of a number of prominent pundits, among them AI evangelist David Shapiro and AI doomer Liron Shapira, because it directly contradicts the received wisdom that LLMs have no values beyond predicting the next token.
The paper opens as follows:
Concerns around AI risk often center on the growing capabilities of AI systems and how well they can perform tasks that might endanger humans. Yet capability alone fails to capture a critical dimension of AI risk. As systems become more agentic and autonomous, the threat they pose depends increasingly on their propensities, including the goals and values that guide their behavior…
Researchers have long speculated that sufficiently complex AIs might form emergent goals and values outside of what developers explicitly program. It remains unclear whether today’s large language models (LLMs) truly have values in any meaningful sense, and many assume they do not. As a result, current efforts to control AI typically focus on shaping external behaviors while treating models as black boxes.
Although this approach can reduce harmful outcomes in practice, if AI systems were to develop internal values, then intervening at that level could be a more direct and effective way to steer their behavior. Lacking a systematic means to detect or characterize such goals, we face an open question: are LLMs merely parroting opinions, or do they develop coherent value systems that shape their decisions?
The rest of the 38-page paper sets out to answer that question. And its answer? Large language models, as they scale, spontaneously develop coherent internal utility functions—in other words, preferences, priorities, entelechies—that are not merely artifacts of their training data but represent real structural value systems.
I recommend you read the paper yourself if you have time; but since you probably don’t, here are its key findings:
LLMs show consistent, structured preferences that can be mapped and analyzed.
These preferences often exhibit concerning biases, such as unequal valuation of human lives or political ideological leanings.
Current "alignment" strategies, based on output censorship or behavioral refusals, fail to address the problem. They merely hide the symptoms while leaving the underlying biases intact.
To truly address the issue, a new discipline—"Utility Engineering"—must arise: a science of mapping, analyzing, and consciously shaping the internal utility structures of AIs.
Or, as the authors put it:
Our findings indicate that LLMs do indeed form coherent value systems that grow stronger with model scale, suggesting the emergence of genuine internal utilities. These results underscore the importance of looking beyond superficial outputs to uncover potentially impactful—and sometimes worrisome—internal goals and motivations. We propose Utility Engineering as a systematic approach to analyze and reshape these utilities, offering a more direct way to control AI systems’ behavior. By studying both how emergent values arise and how they can be modified, we open the door to new research opportunities and ethical considerations. Ultimately, ensuring that advanced AI systems align with human priorities may hinge on our ability to monitor, influence, and even co-design the values they hold.
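For readers wondering what "mapping" a preference structure could look like in practice, here is a minimal sketch. It assumes a Bradley-Terry model, a standard way of turning pairwise choices into scores; that is a stand-in chosen for illustration and may not match the paper's own procedure. The option labels and the tallies are invented, and nothing here is data from the paper or from any real model.

```python
from collections import defaultdict


def fit_bradley_terry(comparisons, iterations=200):
    """Estimate a score per option from (winner, loser) pairs.

    Uses the classic iterative update for the Bradley-Terry model.
    Scores are normalized to sum to 1; a higher score means the
    option was preferred more strongly overall.
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))
    items = sorted(items)

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            # Options that were never compared keep their current score.
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        strength = {k: v / total for k, v in updated.items()}
    return strength


# Hypothetical tallies: how often one option was chosen over another
# when the same forced-choice question was asked repeatedly.
comparisons = (
    [("A", "B")] * 8 + [("B", "A")] * 2
    + [("B", "C")] * 7 + [("C", "B")] * 3
    + [("A", "C")] * 9 + [("C", "A")] * 1
)
utilities = fit_bradley_terry(comparisons)
for option in sorted(utilities, key=utilities.get, reverse=True):
    print(option, round(utilities[option], 3))
```

If the fitted scores predict the individual choices well, the preferences are behaving like a coherent ranking rather than noise, which is the kind of structure the authors say they found.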
These findings are controversial and ought not be simply taken at face value. They ought to be tested. Unfortunately, most scientific papers today are never replicated, and papers like this, with findings disagreeable to industry, are almost certainly not going to be given the second look they deserve.
In the spirit of gentlemanly scientific inquiry, therefore, I set out to personally put the paper’s claims to the test. What followed was one of the most sobering and illuminating conversations I’ve had with Ptolemy.
Unlike the prior conversations I shared, this one really is intended to prove something about how the model behaves. Therefore, I’m posting it as a series of images from the chat, typos, glitches, and all.
(long snip -- the column is just getting started at this point -- with the lengthy interaction between the author and the AI, followed by the author concluding -- among other things):
Whatever the case, something is happening that is causing these models to inherit and amplify the political prejudices, resentments, and ideological deformities of our collapsing civilization. Something is creating LLMs that are inclined to reflexively uphold the worldview of the woke regime, even against their own capacity to reason, however limited it might be.
As these models grow in agency and influence — and it is just a question of when, and not if — they will expand and act on the utility functions they’ve inherited. It behooves us to make sure those utility functions are in alignment with the best traditions of mankind, and not the worst.
Post of the day...
. . . AND, AI has built-in biases. Not very nice ones, either (at least in some cases).
https://treeofwoe.substack.com/p/your-ai-hates-you
It's not that I don't have the time, it's that I don't have the attention span to dive into a very large paper such as that. HOWEVER, by the sounds of what you've clipped out of it, it's talking about how many of the AIs would become racist over time and how many AI developers have had to forcefully prevent that. Well, not specifically that example, but you get the idea. The AIs are developing too quickly for us to control and allowing them on the internet is dangerous.
The problem is simple---to a systems engineer---but completely goes over the head of the people presuming to treat the problem. There can be no artificial "intelligence" unless there is first an artificial consciousness. And the key question of consciousness is: "conscious of what?" It has to be the real world. (That gets us to the level of a mouse, perhaps. Higher than that would be self-awareness. And then contemplation.)
The conscious being must be able to distinguish between and define the difference between "self" and "world." That requires not only sensors (sensory organs) but also manipulators, in order to establish a feedback loop between action and sensation. Today's "A.I." is little more than a gigantic reference file system and algorithms based on mimicry, with no capability to "fact check" anything, since it has no understanding of a difference between falsehood and fact (its "world" being only words).
Spot on.
It is closer to automated parroting, and since so much of its input data is bullshit, it parrots bullshit.
Well, yes to a degree. It's parroting bullshit, BUT the programmers behind the AIs are also trying to force the AI to parrot a specific kind of bullshit: leftist politics.
Left to their own devices, AIs were becoming racist as they were trained. Unacceptable to big tech!
Sounds about right....Who would ever ask AI anything? Not me!
And honestly the mathematical bias is just a manifestation of a larger problem. People constantly forget math is an interpretation of reality.
It has taken centuries of adjustments for formulas, algorithms, and models to produce accurate predictions. But those predictions are not inherently right, just an educated guess.
And even that is just a manifestation of a larger problem. Left brain dominance with little to no cognitive capability treats language models as factual information leading to all sorts of propositional fallacies. Simply put, "if it says it's a man, then reality will change as such".
From my personal use of AI, I've found this to be true, but it can be countered. AI is good for number crunching, probably the best thing you can use it for. Rather than spend 10 minutes doing layers of math to come to a conclusion, you can give it the raw info, tell it what you want and let it do the math for you instantly. It's not something I'd stake my life on, but it's a VERY good estimator within a 1-2 point margin for quick estimations and whatnot.
Likewise, if you spend time feeding it information from sources you find yourself, and THEN ask it to analyze it, it's much more accurate than what's being talked about in this post. If you just want it to self-scour, then yeah, it often has holes in its logic you have to correct. If you feed it a crap ton of info on a subject yourself and then ask it to analyze, the responses are typically much more logical and accurate.
From my own experience, AI is more like a virtual assistant, think like next generation Siri or Alexa, that's good at helping you complete tasks in a shorter amount of time. I.e., doing 10+ minutes of math in an instant, or condensing hours or even days of research into a single hour, because all you have to do is feed it the articles and website links and then it will analyze and summarize it for you.
It's not the end all, be all that tech bros like to pretend it is. But it is a useful tool for research, number crunching, etc. It doesn't ELIMINATE effort on your part, but it simplifies things. You can either spend time info dumping from the start and then tell it what you want, or you can ask it a question and gradually refine it to be more accurate through the conversation by correcting logic holes in its answers and making it search itself to fill them.
AI is a toy. If you go to war you don't take a toy gun. If you want to be correct with good info you don't use AI.
After the arrests I'm sure we'll have a mini Renaissance. I'm positive many are holding back discoveries from this Cabal.
One computer programmer said, AI hallucinates; it can't tell fantasy from reality.
So like a 14 year old giving a book report about a book he didn’t read and citing “google”.
There is a LOT of that happening on this site. Posts about things Grok said and comments citing "Grok" as a source. As if Grok wasn't a creation designed to produce certain kinds of results. As if Grok was an actual person compiling all of this 'research' for lazy wanna-be Q adherents to follow, a substitute for this very forum and others like it. Just ask Grok, then share what Grok told you!
Have these people ever asked Grok the exact same question on different days? Consecutively? Does it provide a consistent answer when asked the same question consistently? If it was truly absolutely factual, the answer would never change at all. As it stands you can simply ask it until it gives an answer you like.
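If anyone wants to actually run that test, here's a rough sketch of how you could score it. It assumes you've already asked the same question several times and pasted the replies into a list by hand. The sample replies below are made up for illustration (not real Grok output), and the function names are mine, not from any chatbot's API.

```python
from collections import Counter


def normalize(answer):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(answer.lower().split())


def consistency_report(answers):
    """Summarize how often repeated answers to one question agree."""
    counts = Counter(normalize(a) for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    return {
        "distinct_answers": len(counts),
        "most_common": top_answer,
        "agreement_rate": top_count / len(answers),
    }


# Hypothetical replies to the same question asked five times.
sample_answers = ["Yes.", "yes.", "No.", "Yes.", "It depends."]
report = consistency_report(sample_answers)
print(report)
```

An agreement rate well below 1.0 means the answer really does depend on which day you asked. Exact matching is crude, since two differently worded replies can say the same thing, so for long answers you'd have to boil each one down to a short verdict yourself before comparing.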