THE AI BUBBLE HAS STARTED TO BURST - The Great Awakening

– bubble_bursts 16 points 50 days ago +16 / -0

Claude has always been overhyped: a model with a very high price per unit of performance that everyone kept pushing based on raw benchmarks.

I keep telling people that OpenAI is far more economical than Claude because of this. I guess Microsoft just figured this out.

And the spin that this means "AI Bubble has burst" is so silly, even based on the headline in this screenshot itself!

– brain_dead [S] 6 points 50 days ago +6 / -0

It's not just Claude.

– bubble_bursts 11 points 50 days ago +11 / -0

Now do you guys realise why datacenters and cheap energy are so crucial for the future?

More computational power with less energy cost = more token processing power.

He who can compute more can create the most powerful LLMs and solve the hardest problems.

Compared to the PC evolution, we are at the point where we could barely fit 16KB of RAM on a PC. Today we can fit close to a TB of RAM on a PC. That's a huge slope upwards, and it took 40 years to get there.
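
For a feel of how steep that slope is, here is a quick back-of-the-envelope sketch. It takes the 16KB, 1TB and 40-year figures from the comment at face value (using binary units, which is my assumption) and works out the implied doubling time.

```python
import math

# Figures from the comment above; binary units are an assumption.
start_bytes = 16 * 1024   # 16 KB
end_bytes = 1024 ** 4     # 1 TB
years = 40

# How many times capacity doubled, and how long each doubling took.
doublings = math.log2(end_bytes / start_bytes)
years_per_doubling = years / doublings

print(f"Growth factor: {end_bytes // start_bytes:,}x")
print(f"Doublings: {doublings:.0f}")
print(f"Years per doubling: {years_per_doubling:.2f}")
```

That works out to 26 doublings, or one doubling roughly every year and a half.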

With LLMs the compute power will grow exponentially if Trump's plan works and will usher in centuries of prosperity and freedom.

– redtoe-skipper 5 points 50 days ago +5 / -0

Interesting then the direction Deepseek was going in.

BTW: I kinda like GAB as a meta AI. But indeed, Grok rules... on certain matters.

– bubble_bursts 2 points 50 days ago +2 / -0

> Interesting then the direction Deepseek was going in.

Elaborate?

– redtoe-skipper 2 points 50 days ago +2 / -0

Summary comparison — compute, GPUs, and energy (assumptions: public reporting, 2024–2026 hardware)

Key assumptions used: Grok (xAI) trains on large Nvidia H100/H800-based Colossus clusters (dense training at multi-petaflop scale); DeepSeek (Whale Lab) uses Mixture-of-Experts (MoE) designs and non‑Nvidia accelerators (Huawei Ascend / Cambricon + H800-style variants in some reports). Numbers below are order‑of‑magnitude estimates synthesized from published technical notes, reporting, and community analyses.

Raw compute (training)

Grok / Colossus-style dense training:
    Training uses dense model compute where every parameter contributes every token. A frontier dense model in the 100B–6T class typically consumes tens to hundreds of PFLOP‑years of total compute (effective TFLOP/s · years). Example-scale: multi‑million to tens of million GPU‑hours across H100-class GPUs for largest trains.
DeepSeek / MoE:
    MoE greatly reduces FLOPs per token because only a subset of experts is activated. Reported: DeepSeek‑V3 ~250 GFLOPs/token vs 2448 GFLOPs/token for a 405B dense model (paper claim). Reported GPU‑hour totals for DeepSeek‑V3 training are roughly an order of magnitude lower than comparable dense runs (papers/reporting cite low single‑digit million GPU‑hours vs tens of millions for some dense baselines).
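
A rough sanity check on the per-token figures above. This is only a sketch: it uses the common rule of thumb of about 6 FLOPs per active parameter per training token, and it assumes DeepSeek-V3 activates about 37B parameters per token. Neither number comes from this thread, and this is not the paper's own accounting.

```python
# Rule-of-thumb assumption: ~6 FLOPs per *active* parameter per training
# token (forward + backward pass). Attention cost is ignored.
FLOPS_PER_PARAM_PER_TOKEN = 6


def train_gflops_per_token(active_params: float) -> float:
    """Approximate training compute per token, in GFLOPs."""
    return FLOPS_PER_PARAM_PER_TOKEN * active_params / 1e9


# Assumed active parameter counts (not stated in the thread).
moe_active = 37e9     # MoE: only the routed experts run for each token
dense_active = 405e9  # dense: every parameter is used for every token

moe = train_gflops_per_token(moe_active)
dense = train_gflops_per_token(dense_active)

print(f"MoE:   {moe:.0f} GFLOPs/token")
print(f"Dense: {dense:.0f} GFLOPs/token")
print(f"Ratio: {dense / moe:.1f}x")
```

The estimate lands at about 222 vs 2430 GFLOPs/token, close to the 250 and 2448 quoted, so the roughly 10x gap is what you would expect from the active parameter counts alone.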

Types of GPUs / accelerators

Grok / xAI:
    Heavily Nvidia (H100/H800 family) with NVLink / NVSwitch for intra‑node high bandwidth. GPUs optimized for dense tensor compute and large memory bandwidth.
DeepSeek:
    Uses MoE-friendly deployments; reported use of Huawei Ascend-family and H800-style accelerators in some deployments. MoE benefits from high interconnect but can be optimized to reduce IB traffic (node‑limited routing); can also run on mixed hardware including lower‑cost consumer GPUs for inference with proper engine/quantization.

GPU counts and cluster design

Dense (Grok) clusters:
    Very large single‑site clusters (reports of 1–1.5 GW datacenter power footprints for Colossus‑class installs). At roughly 1 kW or more per GPU once cooling and networking are included, a footprint of that size implies hundreds of thousands of H100/H800 GPUs for frontier training and large on‑demand inference capacity.
MoE (DeepSeek) clusters:
    Fewer effective GPU hours required for equivalent capability; MoE still requires many GPUs for parameter storage and routing at scale but can hit similar performance with fewer active FLOPs and specialized routing to reduce cross‑node bandwidth. Reports estimate training DeepSeek‑V3 required a few million GPU‑hours on H800‑class gear (much lower than some dense baselines).

Electricity and power costs (training)

Dense (Grok):
    If a Colossus facility is 1–1.5 GW peak, annual electricity for continuous operation is enormous (GW × hours × $/kWh). Example: 1 GW running continuously uses 8.76×10^6 MWh/year; at $0.05–0.12/kWh that’s roughly $440 million to $1.05 billion per year just for power (actual training uses a fraction of continuous peak, but peak facility capacity correlates with high power draw during training campaigns).
MoE (DeepSeek):
    Lower active FLOPs per token reduce total energy consumed for pretraining; published estimates for large MoE runs imply substantially lower electricity bills for comparable delivered performance. Concrete example: paper claims training requiring ~2.6M GPU‑hours vs dense models requiring 30M+ GPU‑hours — that gap multiplies into energy savings roughly proportional to GPU‑hours × per‑GPU power draw.
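
The "GPU‑hours × per‑GPU power draw" relation can be sketched in a few lines. The 700 W per GPU and the PUE of 1.2 are my own assumptions for illustration; the $0.06/kWh price and the GPU-hour totals are the ones quoted in this comment. It covers electricity only, not hardware or rental cost.

```python
def electricity_cost_usd(gpu_hours: float,
                         gpu_watts: float = 700.0,   # assumed per-GPU draw
                         pue: float = 1.2,           # assumed facility overhead
                         usd_per_kwh: float = 0.06) -> float:
    """GPU-hours -> facility energy in kWh -> electricity cost in USD."""
    energy_kwh = gpu_hours * (gpu_watts / 1000.0) * pue
    return energy_kwh * usd_per_kwh


moe_cost = electricity_cost_usd(2.6e6)   # ~2.6M GPU-hours (MoE figure above)
dense_cost = electricity_cost_usd(30e6)  # ~30M GPU-hours (dense figure above)

print(f"MoE run:   ${moe_cost:,.0f}")
print(f"Dense run: ${dense_cost:,.0f}")
print(f"Ratio:     {dense_cost / moe_cost:.1f}x")
```

Under these assumptions the power bill scales one-to-one with GPU-hours, so the gap in cost is the same as the gap in GPU-hours.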

Inference cost and hardware for deployment

Dense models (Grok):
    High VRAM and throughput GPUs (H100/H800) for latency‑sensitive hosted inference; inference energy per token is higher because all parameters are active.
MoE models (DeepSeek):
    Lower per‑token activation reduces inference FLOPs and memory traffic; can be cheaper to serve and, with model‑co‑design, can be run on more diverse hardware (including non‑Nvidia accelerators or consumer GPUs with quantization) for cost‑sensitive deployments.

Capital & operational cost tradeoffs

Dense approach:
    Higher CapEx on uniform high‑end Nvidia GPUs, NVSwitch/NVLink networking, and larger datacenter power/cooling; simpler software stack for dense training and standard parallelism.
MoE approach:
    Potentially lower compute and energy costs per performance unit but higher software complexity (routing, load balancing), more sensitive communication patterns, and potential need for co‑design of hardware/topology to maximize efficiency.

Caveats and uncertainty

Public numbers vary; some figures are from vendor/industry reports and preprints (DeepSeek paper excerpts) and unconfirmed press reporting for xAI/Colossus. Exact GPU‑hour totals, power footprints, and pricing are often proprietary.
MoE savings depend on gating efficiency, routing overhead, and how many experts are actually active per token; communication overhead can erode benefits if poorly implemented.
Regional electricity prices, datacenter PUE, and ownership vs cloud‑rental change $ estimates substantially.
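
To make "how many experts are actually active per token" concrete, here is a minimal sketch of a generic top-k softmax router. It is an illustration only, not DeepSeek's actual routing scheme; the function name and the router scores are made up.

```python
import math


def top_k_route(logits: list[float], k: int) -> dict[int, float]:
    """Return {expert_index: weight} for the k highest-scoring experts."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected experts only, so their weights sum to 1.
    peak = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Hypothetical router scores for one token across 8 experts.
router_logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.9]
routes = top_k_route(router_logits, k=2)
active_fraction = len(routes) / len(router_logits)

print(routes)
print(f"Experts doing work for this token: {active_fraction:.0%}")
```

Only the selected experts run for that token, which is where the FLOP savings come from. The other experts still have to sit in memory, which is why MoE still needs many GPUs for parameter storage.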

If you want, I can produce a compact table with estimated GPU‑hours, per‑GPU power draws, and rough $ electricity costs for a few concrete training scenarios (assume H100/H800 specs and $0.06/kWh), using the numbers above.

So, in essence, Deepseek, by necessity, tries to lower the electricity bill while yielding the same type of result.

– bubble_bursts 3 points 50 days ago +3 / -0

Got it. Yes, there are a lot of optimisations possible with the parameters, and some of them can affect the quality of the model; quantifying and reducing those effects is a big part of the research. The compute and energy requirements would be scaled down, but they would still grow linearly.

Here is my prediction - we will see a completely different base paradigm for training these models. Like valves vs transistors. When this happens, we will see an order of magnitude reduction in compute+energy usage and we will probably see multiple iterations of this.

This is what makes this timeline so amazing. For those of us who were of age at the infancy of computers/internet etc, to be able to see another epoch - even bigger than that - and be able to contribute is incredible.

– redtoe-skipper 2 points 50 days ago +2 / -0

Totally agree.

And true. This kind of transformation is effectively seeing science fiction coming into being.

– killerspacerobot 3 points 50 days ago +3 / -0

From my perspective, it seems more that, with LLMs, the REQUIREMENT for computing power will grow exponentially, owing to the geometric growth of the network interconnections. Somebody needs to arrive at a cost-effectiveness metric for a process that may consume $ billions and produce intellectual morons.

I've seen some amazing hallucinations in AI-produced videos. I wouldn't want to have an AI-driven robot performing brain surgery on me. (Or MCAS flying an airplane, but that's another story.)

– TTrain237DDriver 2 points 50 days ago +2 / -0

Hopefully this race to the top makes practical embedded NPUs viable. Ever been stuck at a red light for a while even though no cars are coming? Having an embedded NPU would enable you to not connect the camera to the cloud. Cheaper and more reliable infrastructure, more privacy, and less congestion to boot.

– bubble_bursts 2 points 50 days ago +2 / -0

I strongly predict that this is where things will head. Personal cloud.

I actually have a lot of my stuff, including photos, syncing straight to my home PC via Pangolin.

– brain_dead [S] 2 points 50 days ago +2 / -0

Let's just see what happens here. I do like Grok.

– Patanon 2 points 50 days ago +2 / -0

Claude is actually pretty damn good. If people would stop acting like whiny liberals and actually learn to leverage these tools, we could all the Q said we should be doing at 1000x what we do now.

– bubble_bursts 2 points 50 days ago +2 / -0

> Claude is actually pretty damn good

It's not about whether Claude is pretty damn good - I told you that Claude consistently stays at the top of the benchmarks - but what is the price for that little incremental extra performance (typically a 3-6 month lead)?

I started using Claude and burnt through the tokens very quickly. But then using GPT, which is only slightly below Claude, cost me just $20 a month and did not affect the quality of my coding one bit - because I knew how to prompt it effectively, and its reasoning actually makes up for the base model quality.

So yes, Claude is the better model but too expensive; GPT is by far the more cost-effective one.

> If people would stop acting like whiny liberals

It's worse than being whiny liberals. It's this insane fear of technology because they learnt the wrong lessons from 2020.

> we could all the Q said we should be doing at 1000x what we do now.

Not sure what you meant to say there, but I find my productivity going up 50-100x easily.

– Patanon 1 point 48 days ago +1 / -0

We can spread memes 24/7 easily. That’s what I mean. Agree that Claude is overpriced. I use it for QA in my agent swarm via my sub and the CLI. GPT 5.5 is my brain. I use local models for content gen.

– txpenguin 1 point 50 days ago +1 / -0

Exactly.

– Lupinate 2 points 50 days ago +2 / -0

Microsoft worked out its own AI is cheaper to use for its purposes than Claude, and tbh I'm kinda shocked they didn't start with Copilot instead of going with Claude in the first place.

– bubble_bursts 1 point 50 days ago +1 / -0

The reason was that too many people inside were using Claude on their own, so they decided to roll it out. The hype around Claude is insane. I have seen this first hand.

– schiff_for_brains 1 point 50 days ago +1 / -0

Claude is better.

They had their engineers milk Claude for all it's worth.

Now they will train Copilot on Claude's work.

Copilot will be as good and cheaper to run.

– TaQo 6 points 50 days ago +6 / -0

and the first companies to actually deploy these tools at real scale are already pulling back because the invoice arrived before the productivity gain was large enough to cover it...

Well, Duh!

Laughing in 2000 .com bubble...

I love when ~~humanity~~ management consistently "overdoes it"... and a few moments later... the bill comes due and there's a massive about-face.

It's like nobody gets there's no such thing as a free lunch...as if there's no memory of, "good, fast, cheap - you can only pick two"

The light is breaking through..."humans are cheaper than AI" - I guess...

When wages haven't risen in decades and costs of Claude for instance are sky high - that holds true...but is everyone (human) - going to feel the already "running on empty" situation get squeezed even harder? Sure seems that way...

When will humans realize the juice ain't worth the squeeze anymore in this new modern slavery construct? Good thing everyone is really healthy tho... and able to brush off stress & function at peak performance...pass the honey buns and Red Bull

u/#popcornclown

– brain_dead [S] 2 points 49 days ago +2 / -0

That's not counting the stress when things break down and don't function correctly.

– ozthentic 4 points 50 days ago +4 / -0

WHEN the Grandparents are using AI it is here to STAY!!

– brain_dead [S] 1 point 49 days ago +1 / -0

LOL

– redtoe-skipper 4 points 50 days ago +4 / -0

Someone was actually budgeting [/s]

Microsoft invested $5 billion in Anthropic.. gave 100,000 engineers Claude Code access.. encouraged adoption.. watched usage explode.. then the invoices arrived.. and issued an internal order to cancel nearly all Claude Code licenses by end of June and force everyone onto their own cheaper tool..

If bills come in ...i.e. 30 day cycle? There are certain metrics that can be graphed to anticipate consumables. And no one said anything? Or was it .... eh .... unfavorable to say anything .....
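
As a sketch of the kind of metric meant here: even a plain run-rate projection on daily spend would flag the problem well before the invoice lands. The function names and the daily figures below are hypothetical, made up purely for illustration.

```python
def project_month_end_spend(daily_spend: list[float],
                            days_in_month: int = 30) -> float:
    """Linear run-rate projection: average daily spend so far x days in month."""
    if not daily_spend:
        raise ValueError("need at least one day of data")
    run_rate = sum(daily_spend) / len(daily_spend)
    return run_rate * days_in_month


def first_day_over_budget(daily_spend: list[float], budget: float):
    """1-based day on which cumulative spend first exceeds budget, else None."""
    total = 0.0
    for day, spend in enumerate(daily_spend, start=1):
        total += spend
        if total > budget:
            return day
    return None


# Hypothetical USD/day for the first five days of a billing cycle.
spend = [1200, 1500, 1900, 2400, 3100]

projected = project_month_end_spend(spend)
breach_day = first_day_over_budget(spend, budget=5000)

print(f"Projected month-end spend: ${projected:,.0f}")
print(f"Budget of $5,000 exceeded on day: {breach_day}")
```

With usage still growing, a flat run-rate like this underestimates the final bill, which only strengthens the point that someone should have seen it coming.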

This is first-rate stupid. It raises the question of how MS is really run. Can shareholders trust in management on par with the fiduciary obligation they entrust to management? How about H1B1 (sounds like a flu) hiring? Subpar?

The whole thing reminds me of 2000 => data center builds. Everybody was screaming tics, tics, tics, but there were no customers. Yet, DC were built en masse, and 20 year leases were arranged, but no one thought of simply staging over time, decreasing capital expenditures, easing on supply chains, etc, not to mention the design savings .... Man. A Box in a Box was revolutionary.

But then again, that would require cutting staff, and easy jobs ...like "making sure jobs".

– BurnNewHistoryBooks 14 points 50 days ago +14 / -0

Those “100,000 engineers” are definitely FULLY qualified indians.

This is what happens when they subvert companies, they import thousands of their low iq people so they can continue to operate their caste system and feed their shallow egos.

Then these uneducated and talentless cretins rely on AI to write their garbage code. Covering for each other and getting anyone who complains fired.

Remove. Every. Last. One.

– TTrain237DDriver 2 points 50 days ago +2 / -0

This is a huge problem. In addition to what you pointed out, the quality of big tech products has gone straight downhill since 2020. I remember the days when I could google a programming question and get a useful answer. Nowadays I regularly ask Grok for a list of search results.

The recent ban on H1B applicants staying here while in the queue is definitely a good start to solving it.

– BurnNewHistoryBooks 3 points 50 days ago +3 / -0

They have destroyed tech. Try to find a simple answer to a Linux terminal command usually ends up with 50 different bobble head meaningless answers and basically wading through spam.

All they seem to do is copy and paste.

– brain_dead [S] 2 points 49 days ago +2 / -0

It's called "Think b4 you leap."

– Patanon 3 points 50 days ago +3 / -0

Stop with this fear bullshit. AI isn’t going anywhere. Claude is notorious for wasting tokens and increasing costs with its coding tool. They aren’t using less AI, just different models. You don’t say the gas-powered era of cars is dying because someone switches from Chevron gasoline to Exxon gasoline. That’s what’s happening here.

– 5DchessWatch 2 points 50 days ago +2 / -0

What if AI is a globalist psyop to get companies to ditch all their human employees and then rugpull once there are only AI employees? Then the economy collapses all at once? Would be poetic justice to the laid off. But the only solution would be one world centralized government. Save us, government!

– brain_dead [S] 1 point 49 days ago +1 / -0

You think outside the box.

– killerspacerobot 2 points 50 days ago +2 / -0

I cannot withhold a sinister chuckle of vindication. Not only have people bitten off more than they can chew, they have bitten off more than they can swallow. That's bad news for boa constrictors and pythons.

– brain_dead [S] 1 point 49 days ago +1 / -0

yes. LOL

– Knotnow 2 points 50 days ago +2 / -0

If they read GA.W they would already have the solution. Maybe they'll re-discover this guy's theory: https://greatawakening.win/p/1ASsZF0pku/interesting-post-about-the-sun-a/

– brain_dead [S] 2 points 49 days ago +2 / -0

I have never seen this post, but it makes lots of sense. They have lied to us for so long I don't trust anything they say.

– deleted 1 point 50 days ago +1 / -0