GPUs, TPUs, & The Economics of AI Explained | Gavin Baker Interview

ELI5/TLDR

Gavin Baker maps the entire AI hardware landscape: who makes the chips, who runs them best, and why the economics of all this actually matter. The core argument is that Nvidia’s Blackwell chips are about to shift the cost advantage away from Google’s TPUs, which reshuffles the competitive dynamics between every frontier AI lab. Along the way he makes a surprisingly compelling case for data centers in space and explains why SaaS companies clinging to 80% gross margins are repeating the exact mistake brick-and-mortar retailers made with e-commerce.

The Full Story

Scaling Laws Are Intact, But It Was a Close Call

Gemini 3 confirmed that pre-training scaling laws still hold. Baker stresses that we don’t actually understand why they work: our knowledge is closer to the ancient Egyptians tracking the sun than to orbital mechanics. Like them, we can measure the phenomenon precisely but have no mechanistic explanation.

Had reasoning models not arrived, AI progress would have flatlined from mid-2024 through late 2025. The reason: once xAI got 200,000 Hopper GPUs working coherently as one cluster, there was nowhere further to scale until next-generation chips were ready. Reasoning (reinforcement learning with verified rewards plus test-time compute) bridged an 18-month hardware gap.

“Had reasoning not come along, there would have been no AI progress from mid-2024 through essentially Gemini 3. Everything would have stalled.”

These scaling laws are multiplicative. Better base models trained on Blackwell will then get the reasoning and test-time compute treatment applied on top.

The Hopper-to-Blackwell Transition Was Brutal

Going from Hopper to Blackwell meant switching from air cooling to liquid cooling, tripling rack weight from 1,000 to 3,000 pounds, and more than quadrupling power draw from 30 to 130 kilowatts per rack. Baker’s analogy: imagine upgrading your iPhone required rewiring your house to 220V, installing a Tesla Powerwall, a generator, solar panels, a humidification system, and reinforcing the floors.

Blackwell has only been deployed at scale for 3-4 months. xAI will produce the first Blackwell-trained model because, per Jensen Huang himself, “no one builds data centers faster than Elon.” Even after deployment, it takes 6-9 months for a new chip generation to outperform the old one, because engineers need to learn its quirks.

Google’s Temporary Cost Advantage Is Ending

Google has been the lowest-cost producer of tokens, which is historically unusual — being the low-cost producer has never mattered in tech before. Apple didn’t win by being cheap. Neither did Microsoft or Nvidia. But AI is different because every inference costs real compute.

Google used this advantage to suck the economic oxygen out of the ecosystem — running AI at negative 30% margins to starve competitors of funding. Rational strategy when you’re cheapest. But once Blackwell clusters shift from training to inference, and especially when GB300 arrives (drop-in compatible with GB200 racks, no new infrastructure needed), the cost calculus flips. Vertically integrated GPU operators become the new low-cost producers.

The Broadcom-Google Relationship Is Under Strain

Google pays Broadcom roughly $15 billion annually for back-end chip design and TSMC management on TPUs. At that scale, it becomes rational to bring the work in-house — Google could double the entire Broadcom semiconductor division’s compensation and still save $5 billion. Google has started bringing in MediaTek as a warning shot. The Taiwanese ASIC companies have much lower margins.
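The in-house arithmetic can be made explicit. The sketch below takes the two figures quoted above (roughly $15 billion paid per year, $5 billion saved even after doubling compensation) and solves for the compensation bill they imply. The resulting figure is an inference from those two numbers, not something stated in the interview, and the function name is purely illustrative.

```python
def implied_compensation_bn(annual_payment_bn: float,
                            savings_after_doubling_bn: float) -> float:
    """Solve annual_payment - 2 * X = savings for X (all values in $ billions).

    X is the current compensation bill implied by the claim that Google
    could pay double and still come out ahead by `savings_after_doubling_bn`.
    """
    return (annual_payment_bn - savings_after_doubling_bn) / 2


# Figures from the summary: ~$15B paid per year, $5B saved after doubling pay.
implied = implied_compensation_bn(15.0, 5.0)
print(f"Implied compensation bill: ${implied:.1f}B per year")
```

Read this way, the claim amounts to saying that the people doing the work cost about a third of what Google pays for it.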

Meanwhile, TPU development is slowing while GPU development is accelerating. Nvidia and AMD are moving to annual chip cadences, and Baker is skeptical most custom ASICs can keep up: “It takes at least three generations to make a good chip.”

The Four Labs and the Reasoning Flywheel

The four frontier labs that matter: OpenAI, Google/Gemini, Anthropic, xAI. Meta threw enormous resources at frontier models and failed. So did Microsoft (Inflection AI acquisition) and Amazon (Adept AI, Nova models). Baker doesn’t think Meta’s models crack the top hundred.

Reasoning created something that didn’t exist before in AI: a flywheel. Previously, you pre-trained a model, released it, and it was what it was. Now, user interactions generate verifiable rewards that feed back into model improvement. This flywheel, combined with each lab having a more advanced internal checkpoint used to train the next model, is creating compounding separation.

“If you do not have that latest checkpoint, you’re behind. It’s getting really hard to catch up.”

Chinese open-source models served as bootstrap checkpoints for others, but Blackwell will blow out the gap between American frontier labs and Chinese open source. DeepSeek’s own technical paper acknowledged compute constraints as a key limitation.

Anthropic’s Quiet Advantage

Anthropic burns dramatically less cash than OpenAI while growing faster. Their relationships with Google (TPUs) and Amazon (Trainium) gave them the same cost advantages Google enjoys. The recent $5 billion Nvidia deal signals that Dario Amodei understands the coming Blackwell advantage, and it gives Nvidia a third frontier lab in its camp alongside xAI and OpenAI.

OpenAI’s core problem is being a high-cost token producer. They pay margins to compute providers who may not run GPUs optimally. The $1.4 trillion spending commitment was partly a signal about future capital needs.

Data Centers in Space

Baker argues this is the most important development in the next 3-4 years. The first-principles case:

Power: Solar panels in space deliver about 6x the energy of panels on Earth. The sun is 30% more intense, and satellites stay in sunlight 24 hours a day. No batteries needed.

Cooling: Free. Put a radiator on the dark side of the satellite. Near absolute zero.

Networking: Lasers through vacuum are faster than lasers through fiber optic cable. Satellite-to-satellite communication is theoretically faster and more coherent than data center networking on Earth.

Latency: With direct-to-cell capability (Starlink has demonstrated this), you skip the entire terrestrial routing chain — cell tower, base station, fiber, metro aggregation, data center, and back.
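Two of the claims above lend themselves to a back-of-the-envelope check. The sketch below is not from the interview. It assumes the 6x solar figure is the product of the 30% intensity gain and the ratio of sunlit hours, and it assumes a refractive index of about 1.47 for silica fiber (a typical textbook value). The function names and the 1,000 km distance are illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum


def implied_ground_sun_hours(energy_ratio: float,
                             intensity_gain: float,
                             hours_in_space: float = 24.0) -> float:
    """Equivalent full-sun hours per day on Earth implied by the claimed ratio.

    Assumes energy_ratio = intensity_gain * hours_in_space / ground_hours.
    """
    return intensity_gain * hours_in_space / energy_ratio


def one_way_latency_ms(distance_km: float, refractive_index: float = 1.0) -> float:
    """Propagation delay only; ignores switching, queuing, and routing."""
    speed_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return distance_km / speed_km_s * 1000.0


ground_hours = implied_ground_sun_hours(energy_ratio=6.0, intensity_gain=1.3)
vacuum_ms = one_way_latency_ms(1000.0)                         # laser link in vacuum
fiber_ms = one_way_latency_ms(1000.0, refractive_index=1.47)   # assumed fiber index

print(f"Implied ground sun-hours per day: {ground_hours:.1f}")
print(f"1,000 km one-way: vacuum {vacuum_ms:.2f} ms, fiber {fiber_ms:.2f} ms")
```

Under these assumptions the 6x figure corresponds to a ground panel receiving about 5.2 equivalent full-sun hours per day, and light in fiber takes about 47% longer than light in vacuum to cover the same distance.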

The bottleneck is launch capacity. Only Starship-class rockets can make it economical. Training workloads will take longer to move up, but inference could shift sooner. Baker estimates data centers in space becoming a majority of deployed capacity in 5-6 years.

The SaaS Burning Platform

Application SaaS companies are repeating brick-and-mortar retail’s e-commerce mistake. They see AI’s lower gross margins (40% vs. their 80%) and refuse to cannibalize. But AI-native startups generate cash earlier despite lower margins because they have very few human employees.
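A minimal sketch of why a lower gross margin can still mean earlier cash generation: operating income is gross profit minus operating expenses, and headcount drives most of those expenses. Only the 40% and 80% margins come from the text above. The revenue and operating-expense numbers are hypothetical and chosen purely for illustration.

```python
def operating_income(revenue: float, gross_margin: float, opex: float) -> float:
    """Gross profit (revenue * gross_margin) minus operating expenses."""
    return revenue * gross_margin - opex


# Hypothetical companies, both at 100 units of revenue.
# The SaaS incumbent carries a large sales and engineering headcount.
saas_incumbent = operating_income(revenue=100.0, gross_margin=0.80, opex=90.0)
# The AI-native startup has lower margins but very few employees.
ai_native = operating_income(revenue=100.0, gross_margin=0.40, opex=25.0)

print(f"SaaS incumbent operating income: {saas_incumbent:+.1f}")
print(f"AI-native operating income:      {ai_native:+.1f}")
```

With these made-up inputs the 80% margin business still loses money while the 40% margin business is profitable, which is the shape of the argument Baker is making.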

“If you are trying to preserve an 80% gross margin structure, you are guaranteeing that you will not succeed at AI. Absolute guarantee.”

Baker points out that software investors have already tolerated this transition once — the cloud. Adobe’s revenues and margins both imploded during the SaaS shift. Microsoft was a tough stock in the early cloud days. Both came through stronger. Every SaaS company from Salesforce to Atlassian could run this playbook. Almost none are.

“This is a life-or-death decision. And essentially everyone except Microsoft is failing it.”

The Natural Governors

Two forces prevent an overbuild: TSMC’s extreme caution about capacity expansion (“they met with Sam Altman and laughed — said he’s a podcast bro”), and power constraints on Earth. If both governors released simultaneously — TSMC expanding aggressively while space data centers solve power — you’d get an overbuild fast. Their staggered timing is healthy.

Power will be solved by natural gas and solar, not nuclear. America simply cannot build nuclear fast enough. The system is already responding: Caterpillar announced 75% capacity expansion for turbines.

ROI on AI Is Already Positive

Baker finds the debate about AI ROI puzzling. The largest GPU buyers are public companies with audited financials. Their return on invested capital is higher than before they ramped AI spending. C.H. Robinson’s stock jumped 20% after reporting AI-driven productivity: quoting 100% of inbound shipping requests in seconds versus 60% in 15-45 minutes previously.

Fortune 500 companies are always last to adopt new technology. VCs are more bullish because their portfolio companies show clear productivity gains — fewer employees per dollar of revenue than two years ago.

Key Takeaways

Scaling laws are empirical, not theoretical. We measure them precisely but don’t understand the mechanism. Every confirmation matters.
Reasoning bridged an 18-month hardware gap. Without it, AI progress would have stalled from mid-2024 to late 2025 waiting for Blackwell.
Three scaling laws are now multiplicative: pre-training, reinforcement learning with verified rewards, and test-time compute.
Karpathy’s distinction: With software, anything you can specify, you can automate. With AI, anything you can verify, you can automate. Verification is the key constraint.
GB300 is drop-in compatible with GB200 racks. No new infrastructure. This is when GPU operators overtake Google on cost.
Google pays Broadcom ~$15B/year for TPU back-end work. The economics of bringing this in-house are becoming irresistible.
It takes three chip generations to become competitive. Amazon’s Trainium 3 is the first “okay” version. Trainium 4 will probably be good.
Each frontier lab has a more advanced internal checkpoint. This compounds — if you don’t have the latest checkpoint, the gap widens every cycle.
Meta, Microsoft, and Amazon all failed at building frontier models despite massive spending. The task is harder than it looks.
Anthropic burns less cash than OpenAI and grows faster — largely thanks to cost advantages from Google/Amazon chip access.
Solar panels in space deliver about 6x the energy of panels on Earth. No batteries, free cooling, faster-than-fiber laser networking.
TSMC’s capacity caution is an unintentional governor preventing an overbuild. Intel’s empty fabs will eventually fill the gap.
AI gross margins are ~40%, not 80%. SaaS companies refusing to accept this are guaranteeing their own disruption.
The “free tier” of AI models is like evaluating a 10-year-old. Serious assessment requires the $200/month tier.
Edge AI is the scariest bear case. If a pruned frontier model runs at 30-60 tokens/second on a phone at 115 IQ, cloud demand collapses.
Power constraints favor the best compute. When watts are the bottleneck, tokens-per-watt is all that matters — chip price becomes irrelevant.
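The last takeaway, on power constraints, can be sketched in a few lines. When a site’s power budget is fixed, token output is the budget multiplied by efficiency, and chip price does not appear in the formula at all. The site size and efficiency values below are hypothetical, and the sketch ignores cooling and other overhead.

```python
def tokens_per_second(power_budget_watts: float, tokens_per_joule: float) -> float:
    """Output of a power-capped site. One watt is one joule per second,
    so watts * tokens-per-joule gives tokens per second."""
    return power_budget_watts * tokens_per_joule


SITE_POWER_WATTS = 100e6  # hypothetical 100 MW site

# Hypothetical chips: A is twice as efficient as B, whatever either one costs.
chip_a_output = tokens_per_second(SITE_POWER_WATTS, tokens_per_joule=2.0)
chip_b_output = tokens_per_second(SITE_POWER_WATTS, tokens_per_joule=1.0)

print(f"Chip A: {chip_a_output:.2e} tokens/s")
print(f"Chip B: {chip_b_output:.2e} tokens/s")
```

Because the site cannot draw more power, the only way to produce more tokens is a more efficient chip, which is why a higher purchase price matters less in this regime.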

Claude’s Take

This is one of the more information-dense investing conversations I’ve encountered. Baker has a rare combination: deep semiconductor knowledge, a genuine understanding of the underlying physics, and the investor’s instinct for which technical details actually matter for competitive positioning. He doesn’t just know what Blackwell is — he understands why the thermal transition from air to liquid cooling delayed everything, and what that delay meant for the competitive landscape.

The data centers in space argument is the kind of thing that sounds ridiculous until he walks through the first principles. Power, cooling, networking — the advantages are real. The timeline (5-6 years to majority deployment) is aggressive but not insane given Starship’s trajectory. The key insight isn’t that it will definitely happen — it’s that the possibility changes the risk calculus for anyone building terrestrial power plants and data centers today.

The SaaS argument is the sharpest strategic insight in the conversation. The parallel to brick-and-mortar retail is precise and damning. Every CRM company should be building sales agents at 20% gross margin, funded by their existing cash-generative business. The fact that almost none are doing this is genuinely surprising.

Where I’d push back: the “whatever AI needs, it gets” observation at the end veers close to teleological thinking. Nuclear opinion didn’t shift because AI needed it — it shifted because of broader energy security concerns that coincided with AI demand. Correlation is doing a lot of work there.

Score: 8/10. Encyclopedic technical depth married to clear strategic thinking. The semiconductor supply chain analysis alone is worth the listen. Loses a point for some repetition and the occasional venture-bro energy, but the signal-to-noise ratio is high.