Max Tegmark Says Physics Just Swallowed AI
ELI5 / TLDR
Physics keeps eating subjects that people once swore weren’t physics: electricity, atoms, the early universe. MIT’s Max Tegmark argues AI is the latest meal, and consciousness is on the menu next. His sharpest move is separating two things we usually mash together: intelligence (getting tasks done) and consciousness (having an inner experience). You can have one without the other, so they can’t be the same. He then proposes an actual experiment to test theories of consciousness, and he picks a serious fight with the AI industry’s claim that we’ve “aligned” our machines. We haven’t aligned their goals, he says. We’ve only trained their behavior, which is a different and much weaker thing. He closes by attacking the idea that runaway superintelligence is inevitable. That word, he says, is mostly a sales tactic.
The Full Story
What “physics” actually is, and why its borders keep moving
Tegmark opens with a piece of history. When Michael Faraday proposed the electromagnetic field, his contemporaries scoffed:
“What are you talking about? You’re saying there is some stuff that exists, but you can’t see it, you can’t touch it. That sounds like total non-scientific ghosts.”
The punchline is that the field turned out to be the only thing we can see, because light is an electromagnetic wave. The lesson Tegmark draws is that the line between “physics” and “not physics” is not fixed. It moves. Astrology got kicked out. Atoms, black holes, and the state of the universe 13.8 billion years ago got let in.
His working definition of physics is deliberately broad: take any complex system, watch what it does, and try to figure out the rules underneath. By that yardstick, a neural network that translates French into Japanese is as fair a target as a hydrogen atom. This is why he thinks Geoffrey Hinton and John Hopfield winning a Nobel Prize in Physics for neural networks was correct, not a category error.
To make the connection concrete, he walks through Hopfield’s key insight using an image worth holding onto. Imagine an egg carton with 25 little valleys. Drop a marble in one valley and you’ve stored a piece of information. To read it back, just look at where the marble sits. Now picture a much bumpier landscape with many valleys. Tegmark calls each valley a minimum, a low point where a marble naturally settles. This kind of memory works differently from a normal computer. A normal computer is like a librarian: you ask for the exact shelf and address, and it fetches what’s there. Hopfield’s version is like singing “Twinkle, twinkle…” and having your brain auto-complete “little star.” You give a partial, fuzzy cue and the system slides into the nearest valley. The technical name for this is associative memory (recall triggered by a related fragment, not an exact address). Try to remember pi, start the marble at “three,” and it rolls down into the valley centered on 3.14159, as long as you started close enough. Memory, which felt like it had nothing to do with physics, turns out to be an energy landscape with marbles rolling downhill.
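The marble picture can be made concrete with a toy version of a Hopfield-style network. What follows is a minimal sketch, not code from the interview or from Hopfield's paper: the two stored patterns, the cue, and the function names (`train`, `energy`, `recall`) are all made up for illustration. Each memory is a list of +1/-1 values, the weights carve a valley around each stored pattern, and recall just flips one unit at a time in whichever direction lowers the energy.

```python
# Toy Hopfield-style associative memory. Illustrative sketch only:
# the two stored patterns and the cue are made up for this example.

def train(patterns):
    """Hebbian weights: units that agree across stored patterns get a positive link."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w

def energy(w, s):
    """Height of the landscape at state s; stored patterns sit in the valleys."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def recall(w, cue, max_sweeps=10):
    """Roll downhill: update one unit at a time until nothing changes."""
    s = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new_value = 1 if field >= 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s

memory_a = [1, 1, 1, 1, -1, -1, -1, -1]
memory_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([memory_a, memory_b])

fuzzy_cue = [-1, 1, 1, 1, -1, -1, -1, -1]  # memory_a with its first unit flipped
recalled = recall(weights, fuzzy_cue)
print(recalled == memory_a, energy(weights, fuzzy_cue), energy(weights, recalled))
```

The cue is a slightly wrong version of the first memory, and the update rule slides it into the nearest valley, which is the "start close enough and the marble rolls home" behavior described above. Nothing is looked up by address; the answer falls out of the shape of the landscape.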
Intelligence and consciousness are not the same thing
This is the spine of the conversation, so it’s worth going slow. Tegmark says most of his colleagues think consciousness-as-science is nonsense, but when he presses them on why, they split into two camps that flatly contradict each other. One camp says consciousness is just intelligence by another name. The other says machines can obviously never be conscious. They can’t both be right: if consciousness equals intelligence, and machines can be intelligent (which nobody at an AI conference would dispute), then smart machines would be conscious by definition.
He cuts the knot with two everyday observations:
“You can have intelligence without consciousness. And you can have consciousness without intelligence.”
For the first half: when you glance at a face and instantly know who it is, you did something genuinely clever, but you have zero access to the algorithm you used. Think of it like getting an email from a department you can’t see. The face-recognition team finishes the job and sends you a one-line memo: “It’s Max.” You experience the answer, not the work. A huge fraction of what your brain does runs this way, under the floorboards, and you only learn the results after the fact. Intelligent, but not conscious.
For the second half: a dream. While you’re lying there asleep, you’re accomplishing nothing, solving no task, yet there’s a vivid inner experience happening. Conscious, but not intelligent.
So the two are different phenomena. Tegmark’s picture is a Venn diagram. Some processing is both intelligent and conscious, some is intelligent only, some is conscious only. The real scientific question becomes: what kind of information processing produces intelligence, and what kind produces experience?
A surprising point about where experience lives
A small but mind-bending aside. You might assume that when you look out a sunlit window, the content of your consciousness is the world outside. Tegmark says no, and the proof is that you can have the same rich experience with your eyes shut, dreaming. What you are actually conscious of is your inner model of the world, not the world itself. Your senses are just a feed that keeps updating that model. Right now you aren’t experiencing the room. You’re experiencing your brain’s running simulation of the room.
A consciousness experiment you can actually run
Here’s where Tegmark gets ambitious. The standard objection is that you can never test a theory of consciousness, because all anyone can observe from outside is your behavior. He thinks that objection is lazy, and he sketches a way around it.
Suppose someone writes down a real mathematical formula claiming to specify which information processing is conscious. The leading candidate is Giulio Tononi’s Integrated Information Theory, which proposes a quantity called phi (a number measuring how tightly unified a system’s information is). Tononi’s core requirement is integration: a single unified experience can’t secretly be two separate processes that never talk to each other, because two non-communicating halves would be like two parallel universes, unaware of one another, and couldn’t feel like one mind. (Tegmark was curious whether Tononi’s was the only formula with the right mathematical properties, so he wrote a paper classifying all of them. There turned out to be fewer than a hundred.)
Now the test. Put yourself in a brain scanner that reads your neural activity in real time, and wire it to a computer running the theory. The computer announces its predictions about what you’re experiencing:
“I predict that you’re consciously aware of a water bottle.” And you’re like, “Yeah, that’s true.”
Then it overreaches: “I detect processing about regulating your pulse, so I predict you’re consciously aware of your heartbeat.” You’re not. The theory just made a falsifiable claim about your private experience, and you falsified it. The crucial design feature is that you’re not trying to convince anyone else. You’re alone with the machine, trying to catch it being wrong about your own inner life.
Curt pushes back hard, and his objection is the right one: isn’t this just measuring a correlation between brain patterns and what you report? Tegmark’s answer is that reporting to a third party isn’t the point. You already know, from the inside, what you’re experiencing. The machine doesn’t need to prove your consciousness to an outsider. It only needs to keep getting your experience right or wrong, and you’re the judge. He’s careful to nail down the definition: consciousness here means subjective experience, full stop, not “information stored in your brain.” You hold thousands of memories you aren’t experiencing right now; if the machine claimed you were conscious of all of them, it would fail.
The analogy he leans on is general relativity. We can’t fly into a black hole to check what’s inside. But we don’t test philosophical claims about black holes. We test a mathematical theory that also predicts things we can check (Mercury’s wobbling orbit, light bending around the sun, gravitational waves). It passed those, so we extend trust to its untestable predictions. And critically, you don’t get to cherry-pick:
“It’s not like, ‘Oh, I want coffee, but decaf.’ If you’re going to buy the theory, you need to buy all its predictions, not just the ones you like.”
If a consciousness theory survives the same kind of relentless attempted falsification, Tegmark says we’d be obliged to take seriously what it predicts about coma patients, locked-in patients, and eventually about machines, including whether they suffer.
Don’t listen to the curmudgeons
Tegmark zooms out into a recurring theme: science has repeatedly been delayed by confident people insisting something can’t be studied. Astronomers assumed other solar systems must look like ours, so why bother looking for exoplanets? When someone finally looked anyway, they found “hot Jupiters,” giant planets orbiting closer to their star than Mercury orbits the sun. That discovery could have come a decade earlier. Same story with X-ray astronomy (“there are no X-rays in the sky, what, are there dentists up there?”) and with van Leeuwenhoek’s microscope revealing a whole zoo of invisible creatures. His rule: if you can build an experiment that pokes into a genuinely new region of parameter space, just do it. More than half the time, there’s a revolution waiting.
He folds this into an extended riff on human pessimism. Picture two people in a cave 30,000 years ago, looking at the stars, melancholy that they’ll never know what those dots are, never reach them, probably starve, and that life 50,000 years on will be just as bleak. They would have been catastrophically wrong. We turned out to be, in his phrase, “the masters of underestimation,” wrong about the size of the cosmos and wrong about the power of our own minds. He retells Aristarchus working out, from the curved shadow Earth casts on the eclipsed moon, that Earth is a ball bigger than the moon, all without leaving the ground.
From overhyped to underhyped
On AI specifically, Tegmark says the field was chronically overhyped from the 1950s onward, with progress consistently slower than promised. Then, around four years ago, it flipped to underhyped. Six or seven years ago most of his MIT colleagues thought passing the Turing test was decades away. It already happened. His logic: if the experts were that wrong in the pessimistic direction once, they may be wrong about more.
He revives Turing’s own 1951 warning that once machines outperform us at every cognitive task, the default outcome is that they take control, the way humans took over from other apes. He pairs it with I.J. Good’s point that the final sprint, from “slightly better than us” to “vastly better,” could be brutally fast, because once AI can do AI research, the researchers no longer sleep, eat, or think at human speed, and each improvement can be copied instantly to all the others. The slow exponential of human-paced R&D could give way to one that doubles every day, until it eventually flattens into an S-shape (a sigmoid) when it hits hard physical limits like the speed of light. His colleague Seth Lloyd estimates we’re still some unimaginable factor (a million-million-million-million-million times) away from those limits, so there is enormous room to run.
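To get a feel for the numbers in that picture, here is a back-of-the-envelope sketch. It takes the two figures in the paragraph at face value (a doubling every day, and headroom of a million to the fifth power, which is 10^30) and assumes a plain logistic curve for the S-shape. The logistic form, the starting value of 1, and the function name `capability` are assumptions made for illustration, not anything Tegmark or Lloyd specified.

```python
import math

# "A million-million-million-million-million" is (10**6)**5 = 10**30.
HEADROOM = (10 ** 6) ** 5

def capability(t_days, ceiling=HEADROOM, start=1.0, doubling_days=1.0):
    """Logistic (S-shaped) curve: doubles every `doubling_days` early on,
    then flattens as it approaches `ceiling`. Purely illustrative."""
    decay = 2.0 ** (-t_days / doubling_days)
    return ceiling / (1.0 + (ceiling / start - 1.0) * decay)

# How many doublings fit between the start and the ceiling?
doublings_to_ceiling = math.log2(HEADROOM)

for day in (0, 1, 10, 50, 100, 150):
    print(day, f"{capability(day):.3e}")
print("doublings to ceiling:", round(doublings_to_ceiling, 1))
```

Under these assumptions the curve looks like pure exponential growth for most of its run and only bends over near the end, so "enormous room to run" and "it eventually flattens" are both true of the same curve. Nothing here is a forecast; it only shows how the two claims fit together.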
His historical rhyme is Fermi’s 1942 nuclear reactor under a Chicago stadium. The reactor itself was about as dangerous as ChatGPT, i.e. not very. What spooked the physicists was the realization that the last conceptually hard milestone had been cleared, and “the rest is just engineering.” Tegmark feels the same about AI now. We don’t yet have AI that’s better than us at building AI, but he suspects the remaining gap is mostly engineering. (When Curt guesses it took a decade from Fermi’s reactor to the first nuclear explosion, Tegmark notes it was three years.)
Goals are written into physics itself
This is the most beautiful idea in the conversation. Tegmark points out there are two completely valid ways to explain any physical event: by its past (causes pushing it forward) or by its future (a goal it’s heading toward).
The textbook case is light bending in water. Why does a straw look bent in a glass? The causal story is a brutal calculation about photons interacting with atoms and electromagnetic fields. But there’s a second story, Fermat’s principle: the light simply took the path that gets it to its destination fastest. Picture a lifeguard on a beach racing to a drowning swimmer. They don’t run in a straight line; they run farther on the sand (where they’re fast) and swim less in the water (where they’re slow), bending their path to minimize total time. That’s obviously goal-oriented for the lifeguard. For the photon, both descriptions are equally true, and the goal-oriented one is often easier to calculate.
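The lifeguard version is easy to check numerically. The sketch below uses made-up distances and speeds (10 m of sand, 10 m of water, 20 m along the shore, running at 7 m/s, swimming at 2 m/s) and searches for the shoreline crossing point that minimizes total time. It then compares the two sides of the relation sin(θ₁)/v₁ = sin(θ₂)/v₂, which is the form Snell's law of refraction takes when written in terms of speeds. All names and numbers are illustrative assumptions.

```python
import math

# Illustrative setup: lifeguard at (0, SAND), swimmer at (ALONG, -WATER),
# shoreline along y = 0. All values are made up for the example.
SAND, WATER, ALONG = 10.0, 10.0, 20.0
V_SAND, V_WATER = 7.0, 2.0

def travel_time(x):
    """Total time if the lifeguard enters the water at shoreline point (x, 0)."""
    return math.hypot(x, SAND) / V_SAND + math.hypot(ALONG - x, WATER) / V_WATER

def fastest_crossing(iterations=200):
    """Ternary search; valid because travel_time is convex in x."""
    lo, hi = 0.0, ALONG
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1) < travel_time(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

x_best = fastest_crossing()
x_straight = ALONG * SAND / (SAND + WATER)  # where a straight line would cross

# Angles are measured from the line perpendicular to the shore.
sin_sand = x_best / math.hypot(x_best, SAND)
sin_water = (ALONG - x_best) / math.hypot(ALONG - x_best, WATER)

print("straight-line crossing:", x_straight, "fastest crossing:", round(x_best, 3))
print("sin/v on sand:", sin_sand / V_SAND, "sin/v in water:", sin_water / V_WATER)
```

The fastest path crosses the shoreline farther along than the straight line would, and the two sin/v values agree at the optimum. So "minimize the travel time" (the goal story) and "obey the refraction law at the boundary" (the local, causal-looking story) pick out the same path.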
He extends this to thermodynamics via Jeremy England’s work. Leave sugar on the floor and it sits there for a year. Add ants, and it vanishes fast, and entropy (disorder) increases faster because the sugar got eaten and dissipated. England showed that systems in certain conditions tend to reorganize themselves to dissipate energy faster. Life is the grand example. Life can’t beat the second law of thermodynamics, but it pulls a trick: it keeps its own internal order low by dumping even more disorder into its surroundings. The increase in environmental entropy is the price; staying complex and reproducing is what life buys with it.
Tegmark’s big-picture claim: the universe has been getting steadily more goal-oriented as life has grown more sophisticated, and we’re now at a hinge point where the atoms in purpose-built technology rival the atoms in all living biomass. In an AI-driven future, the overwhelming majority of atoms might end up engaged in goal-directed behavior.
“They’ve aligned behavior, not goals”
Here Tegmark lands his most pointed critique of the AI industry. When companies say they’ve aligned an AI (given it good goals), what they’ve actually done, he argues, is shape its behavior through reward and punishment. He offers a deliberately uncomfortable analogy:
“That’s just like if you train a serial killer to not say anything that reveals his murderous desires.”
If you punish the killer every time he hints at violence until he stops hinting, have you changed what he wants? Or just what he says? Tegmark contrasts this with how he’s raising his two-year-old son. He doesn’t plan to simply punish the boy without explanation when he misbehaves; he wants the child to internalize the value of kindness, to actually care about others’ wellbeing. Reinforcement learning from human feedback (the standard alignment method, often abbreviated RLHF) is nothing like that. In practice, he notes, it often means paying workers in Kenya and Nigeria a dollar or two an hour to label horrific content, which is a world away from a parent patiently teaching a child why.
His honest bottom line is unsettling:
“We have no clue really what, if any, goals ChatGPT has. It acts as if it has goals… but who knows?”
A model trained to predict the next word, then nudged by feedback, might have no unified set of goals at all, the way a brilliant actor can convincingly portray a thousand different motivations without holding any of them. And this matters enormously, because if we end up living alongside machines smarter than us, our safety depends on them genuinely wanting to treat us well, not on having mouthed the right reassurances before gaining power. We’ve lived with smarter beings before, our parents, and it worked because they actually cared about us.
Goals, optimization, and why you’re not a single equation
Curt asks a sharp question: does every goal imply optimization, and vice versa? Tegmark notes Feynman observed that nearly every law of physics can be derived from some optimization principle. But he doubts humans can be modeled as maximizing a single goal. Evolution’s only “goal” was genes making copies of themselves. But a rabbit that recalculated “how will eating this carrot affect my future offspring count?” before every bite would starve. So evolution installed heuristic hacks (shortcuts): feel hungry, eat; feel thirsty, drink; fall in love, make babies. These are proxies for the gene’s true objective, but they’ve drifted loose from it. Anyone using birth control is running the “make love” heuristic while explicitly refusing the goal it was built to serve, a small rebellion against our own design. We are, in the jargon, agents of bounded rationality (thinkers with limited compute), navigating by a tangle of drives rather than one clean objective. Today’s AI, Tegmark thinks, is even more of a “random mishmash” than we are.
Understanding looks like geometry
The final big theme is understanding, which Tegmark treats as a third thing distinct from both intelligence and consciousness, and which he considers an open problem. (He notes intelligence and goals are independent: there are chess tournaments for losing chess, where the winner is whoever forces their opponent to win, and a computer can be brilliant at that. This is Nick Bostrom’s orthogonality thesis: intelligence is just skill at achieving goals, whatever those goals happen to be. A smarter Hitler would have been worse, not kinder.)
His clearest example of machine understanding is gorgeous. His group trained a network to do modular arithmetic, which is just clock math: on a 12-hour clock, 11 plus 2 doesn’t give 13, it wraps around to 1. They used numbers 0 to 58, wrapping at 59. The network was fed each number as a meaningless symbol and learned to place each one as a point in a high-dimensional space. For a long while it just memorized, performing well on examples it had seen and badly on new ones. Then, suddenly, it started getting unseen problems right. A eureka moment. When they plotted how the 59 points were arranged at that exact moment (using a standard technique called principal component analysis to flatten the high-dimensional cloud onto a plane), the points snapped into a perfect circle, 59 beads on a ring. The machine had, on its own, rediscovered the clock. It had built a representation (an internal geometric model) of the problem, and that representation was what let it generalize.
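Why would a circle help? One way to see it, sketched below, is that once each number sits at its own angle on a ring, adding two numbers becomes adding two angles, and the wrap-around comes for free. This is a hand-built illustration of the geometry, not the trained network from Tegmark's group; the functions `embed` and `add_on_circle` are hypothetical names introduced here.

```python
import math

P = 59  # numbers 0..58, wrapping at 59, as in the experiment described above

def embed(k, p=P):
    """Place number k as a point on the unit circle, like an hour mark on a clock."""
    angle = 2 * math.pi * k / p
    return (math.cos(angle), math.sin(angle))

def add_on_circle(a, b, p=P):
    """Add two numbers by composing their rotations, then read off the nearest mark."""
    (cos_a, sin_a), (cos_b, sin_b) = embed(a, p), embed(b, p)
    # Angle-addition formulas: rotating by a and then by b.
    cos_sum = cos_a * cos_b - sin_a * sin_b
    sin_sum = sin_a * cos_b + cos_a * sin_b
    angle = math.atan2(sin_sum, cos_sum) % (2 * math.pi)
    return round(angle * p / (2 * math.pi)) % p

print(add_on_circle(11, 2, p=12))  # the 12-hour clock example from the text
print(all(add_on_circle(a, b) == (a + b) % P for a in range(P) for b in range(P)))
```

A lookup table of memorized sums has nothing to say about a pair it has never seen. The ring does, because the rule lives in the geometry rather than in the individual examples, which is one plausible reading of why generalization arrived at the same moment as the circle.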
They’ve since found language models laying numbers out on a helix (a spiral, so it can encode both rough magnitude along the length and digit-cycling around the loops). This feeds a striking conjecture, the platonic representation hypothesis: when two different systems deeply understand the same thing, they tend to converge on the same internal representation. Evidence is mounting: a model trained only on English and one trained only on Italian can have their word-clouds rotated to line up, yielding a rough dictionary. Tegmark’s own team trained models on family trees (who’s whose uncle, sister, descendant) and found independent systems all built the same tree-shaped representation, literally rotatable from one model into another, even though no one ever told them the concept “family tree.” Understanding, on this view, is finding the pattern and then finding a clever shape to hold it in, where the shape itself does much of the work.
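The "rotate one cloud onto the other" idea can be shown in miniature. The sketch below is a toy in two dimensions with invented coordinates and four hypothetical word pairs; real word embeddings have hundreds of dimensions, where the same alignment problem (orthogonal Procrustes) is usually solved with a singular value decomposition. Here the best rotation angle has a closed form, fitted on three anchor pairs and then used to translate a word that was held out.

```python
import math

def rotate(point, theta):
    """Rotate a 2D point counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])

def best_rotation_angle(sources, targets):
    """Angle minimizing the summed squared distance between rotated sources and targets."""
    dot = sum(x[0] * y[0] + x[1] * y[1] for x, y in zip(sources, targets))
    cross = sum(x[0] * y[1] - x[1] * y[0] for x, y in zip(sources, targets))
    return math.atan2(cross, dot)

# Invented coordinates standing in for a model's internal word positions.
english = {"dog": (1.0, 0.2), "cat": (0.8, 0.5), "house": (-0.6, 0.9), "water": (-0.3, -1.0)}

# Pretend a second model learned the same shape, but turned by 40 degrees.
pairs = {"dog": "cane", "cat": "gatto", "house": "casa", "water": "acqua"}
italian = {pairs[word]: rotate(point, math.radians(40)) for word, point in english.items()}

# Fit the rotation on three anchor pairs only.
anchors = ["dog", "cat", "house"]
theta = best_rotation_angle([english[w] for w in anchors],
                            [italian[pairs[w]] for w in anchors])

def translate(word):
    """Rotate the English point and return the nearest Italian word."""
    moved = rotate(english[word], theta)
    return min(italian, key=lambda w: math.dist(moved, italian[w]))

print(round(math.degrees(theta), 3), translate("water"))
```

The held-out word lands on its counterpart even though the fit never saw it, which is the sense in which a shared shape yields a rough dictionary. In this toy the two clouds are identical up to rotation by construction; the interesting empirical claim is that independently trained models come close to that.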
Optimism, and the word “inevitable”
Tegmark rejects the “doomer” label, calling it the thing people say when they’ve run out of arguments. He insists he’s an optimist, and his sharpest target is the meme that superintelligence is inevitable:
“If you tell yourself that something is inevitable, it’s a self-fulfilling prophecy… It’s the oldest psyop game in town.”
He compares it to convincing a just-invaded country that resistance is futile. Companies that don’t want to be regulated have every incentive to insist the outcome is fixed, so don’t fight it (and by the way, buy the product). His counterexamples are technologies we chose not to build despite money and power being on the table: human cloning (the one scientist who tried it in China went to prison), and bioweapons, where Matthew Meselson persuaded Nixon that a weapon of mass destruction cheap enough for every adversary was a bad idea, leading to a global ban. What people actually want from AI, he argues, is tools, to cure cancer, run businesses better, even strengthen armies, all achievable with controllable AI rather than “some kind of sand god.” He notes the unusual coalition against uncontrollable superintelligence, spanning evangelicals, the Pope, Bernie Sanders, and Marjorie Taylor Greene.
He ends on his parents’ advice (don’t worry too much what others think), and on guidance for young researchers with an unpopular idea: roughly half the greatest breakthroughs were trash-talked at the time, so listen to criticism but keep pushing if the logic still holds. And hedge: do enough respectable work to pay the bills, then carve out real time for the passion project, quietly. “You’re never going to be the first to do something important if you’re just following everybody else.”
Key Takeaways
- The borders of physics move. Electromagnetism, atoms, and cosmology were all once “not physics.” Tegmark argues AI has now crossed in, and consciousness is next.
- Intelligence is not consciousness. Intelligence is task-completion; consciousness is subjective experience. Face recognition shows intelligence without awareness; dreaming shows awareness without a task. They can’t be the same thing.
- Consciousness may be testable. Wire a brain scanner to a computer running a mathematical theory (like Integrated Information Theory’s phi), let it predict what you experience, and let you falsify it from the inside. You’re the only judge you need.
- Goals are built into physics. Fermat’s principle (light takes the fastest path) and non-equilibrium thermodynamics can both be read as goal-oriented. The universe is becoming more goal-directed as life and technology spread.
- “Alignment” today is behavior, not goals. Punishing a model into silence is like training a serial killer to stop dropping hints. We genuinely don’t know what goals, if any, current AIs hold.
- Understanding shows up as geometry. A network learning clock-arithmetic spontaneously arranged its numbers in a circle at the moment it “got it.” Different systems often converge on the same representation (the platonic representation hypothesis).
- Superintelligence is not inevitable. That framing is a self-fulfilling prophecy and a convenient one for the unregulated. We chose not to build human clones or bioweapons; we have more agency than we’re told.
Claude’s Take
This is Curt Jaimungal at his best: a genuinely good interviewer who pushes back instead of nodding along, and Tegmark rewards it. The consciousness-experiment exchange is the strongest part precisely because Curt keeps poking the correlation hole, and Tegmark’s “you’re not trying to convince me, you’re convincing yourself” reframing is either a real insight or a sleight of hand, and the conversation honestly leaves that open. I lean toward thinking it sidesteps rather than solves the hard problem (it tests whether a theory predicts the boundary of your experience, which is a real and useful thing, but it still presupposes that you have experience to report). That’s a feature, not a bug, for an honest discussion.
Where I’d put up the BS filter slightly: the “intelligence vs consciousness” distinction is presented as obvious, and it mostly is, but Tegmark glides past the possibility that his everyday examples (you don’t know your face-recognition algorithm) prove something weaker than he wants. Not knowing how you did something isn’t quite the same as that processing being non-conscious. It’s suggestive, not airtight. And the optimism section, while bracing and well-argued, is doing some rhetorical work: comparing AI to cloning and bioweapons is appealing but those had clear bright-line bans and no trillion-dollar commercial flywheel; he half-acknowledges this when he notes “there’s also a lot of money in it,” then mostly sets it aside.
What earns the high score is the density of genuinely good ideas per minute, and that almost all of them are grounded in concrete, checkable things: Hopfield’s energy landscapes, Fermat’s principle, the modular-arithmetic circle, the family-tree representations. The “alignment is behavior, not goals” point is the most quotable and, to me, the most important takeaway, and Tegmark states it more bluntly than most insiders will. An 8: substantive, well-structured, occasionally over-confident, never boring.
Further Reading
- John Hopfield, “Neural networks and physical systems with emergent collective computational abilities” (1982) — the associative-memory / energy-landscape paper behind his Nobel.
- Giulio Tononi — Integrated Information Theory and the phi measure of consciousness.
- Max Tegmark, Life 3.0 — his book on the future of AI and the control problem; also Our Mathematical Universe for the multiverse / mathematical-universe ideas mentioned at the end.
- Nick Bostrom, Superintelligence — source of the orthogonality thesis (intelligence and goals are independent).
- Jeremy England — dissipation-driven adaptation and the thermodynamics of life.
- Bernard Baars / Stanislas Dehaene — Global Workspace Theory, the “small desktop / spotlight” model of consciousness Curt references.
- The grokking papers from Tegmark’s MIT group on mechanistic interpretability (the circle and helix representations), and “The Platonic Representation Hypothesis” by Huh, Cheung, Wang, and Isola (2024).