The Most Counterintuitive Way to Build a Brain
ELI5 / TLDR
Close your eyes and hum We Will Rock You. Nothing is vibrating your eardrums — your brain is generating that song from inside itself. Kirsanov’s question: how does a tangle of neurons produce a precise sequence out of thin air? The counterintuitive answer is reservoir computing: don’t try to engineer the tangle. Leave it random, give it a gentle rhythmic nudge, and just learn to read the resulting mess at the right angle. The randomness is the feature, not the bug.
The Full Story
The puzzle: where do internal sequences come from?
A pitcher’s throw, a zebra finch’s song, a tune in your head — all of these are autonomous pattern generation. The brain produces a precise sequence of activity without anything outside pushing it through. If we want a machine that thinks like us, we have to figure out how a closed box generates complex behavior on its own.
A standard feedforward network can’t do this. It’s a static input-output machine with no sense of time: the same input always gives the same output, no matter what came before. The fix is recurrence: let neurons feed their own activity back into themselves. But Kirsanov asks us to stop thinking of recurrence as an engineering hack and start thinking of it as a property of a system that has its own internal physics.
The swimming pool
Imagine a pool. You jump in. That’s the input. After you climb out, the water doesn’t go still — ripples bounce off the walls, interfere, create complex patterns. The pool remembers your jump in the form of its own ongoing dance. Now picture a thousand neurons wired to each other randomly — some strong, some weak, some excitatory, some inhibitory. That tangle is the reservoir, and it behaves a lot like the pool. Each neuron’s state is set by where it was a moment ago, plus all the ripples coming in from its neighbours.
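The pool picture can be written down in a few lines. The sketch below is one common way to model such a reservoir, not necessarily the equations used in the video: the names `W`, `leak`, and `step`, the tanh squashing function, the Gaussian weights, and the 1/sqrt(n) scaling are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 1000

# Random recurrent wiring (assumed Gaussian): positive entries act as
# excitatory connections, negative entries as inhibitory ones.
# Dividing by sqrt(n) keeps each neuron's summed input of order one.
W = rng.normal(0.0, 1.0, size=(n_neurons, n_neurons)) / np.sqrt(n_neurons)

def step(x, W, leak=0.1):
    """One time step: mostly keep the old state, mix in the neighbours' ripples."""
    return (1.0 - leak) * x + leak * np.tanh(W @ x)

# The "jump into the pool": a random kick to every neuron, then no further input.
x0 = rng.normal(0.0, 1.0, size=n_neurons)
x = x0.copy()
for _ in range(50):
    x = step(x, W)

print("activity right after the jump:", np.linalg.norm(x0))
print("activity 50 steps later:      ", np.linalg.norm(x))
```

Each neuron's new state depends only on its own previous state and on the weighted activity of the others, which is the "where it was a moment ago, plus incoming ripples" rule in code form.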
There’s a catch. Real pools have friction. Without friction, ripples build on ripples and the whole thing explodes into chaos. Chaos here has a precise meaning: sensitivity to initial conditions. If one neuron misfires by a millisecond, that error grows exponentially and the whole pattern collapses. You can’t compute with an explosion. So the network has to be tuned just right. Engineers call the well-behaved regime the echo state property: every input leaves a temporary trace that fades, so the network’s current state depends on its recent inputs rather than on how it happened to start. Energy in, slow decay, no runaway.
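One way to see the fading-trace idea in action is to start the same network from two unrelated states, feed both the same input, and watch the gap between them shrink. This is a minimal sketch under assumptions that are not from the video: a tanh network, Gaussian weights rescaled to a spectral radius of 0.8 (a common rule of thumb rather than a guarantee), and a random input stream.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_steps = 200, 500

# Random weights, rescaled so the largest eigenvalue magnitude is 0.8.
W = rng.normal(size=(n_neurons, n_neurons))
W *= 0.8 / np.max(np.abs(np.linalg.eigvals(W)))
w_in = rng.uniform(-0.5, 0.5, size=n_neurons)   # input weight per neuron
u = rng.uniform(-1.0, 1.0, size=n_steps)        # input stream shared by both copies

# Two copies of the network with unrelated starting states.
x_a = rng.normal(size=n_neurons)
x_b = rng.normal(size=n_neurons)
gap = [np.linalg.norm(x_a - x_b)]
for t in range(n_steps):
    x_a = np.tanh(W @ x_a + w_in * u[t])
    x_b = np.tanh(W @ x_b + w_in * u[t])
    gap.append(np.linalg.norm(x_a - x_b))

print("initial gap:", gap[0])
print("final gap:  ", gap[-1])
```

If the gap collapses, the network has forgotten where it started and is tracking only its input, which is what a usable reservoir needs.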
The pacemaker
But if ripples die, how do you sing a long song? You need something to keep the pool alive: a steady, boring driver. Picture a metronome dripping water at a constant rhythm. In the model, this is a sine wave z(t) that reaches each neuron through its own small weight. In a real brain, the suggested counterpart is theta and gamma oscillations, the rhythmic background hum that everything else rides on top of.
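The effect of the metronome is easy to check numerically. The sketch below assumes a tanh reservoir with Gaussian weights, hypothetical names (`run`, `w_in`), and arbitrary parameter choices. It simulates the network twice: once with no input, and once driven by a sine wave z(t) that reaches each neuron through its own small weight.

```python
import numpy as np

rng = np.random.default_rng(2)
n_neurons, n_steps, period = 200, 600, 50

W = rng.normal(size=(n_neurons, n_neurons))
W *= 0.8 / np.max(np.abs(np.linalg.eigvals(W)))      # fading-echo regime (assumed)
w_in = rng.uniform(-0.3, 0.3, size=n_neurons)        # small drive weight per neuron
z = np.sin(2 * np.pi * np.arange(n_steps) / period)  # the metronome z(t)

def run(drive):
    """Simulate the reservoir and return its state history (time x neurons)."""
    x = rng.normal(size=n_neurons)  # initial splash
    history = np.empty((n_steps, n_neurons))
    for t in range(n_steps):
        x = np.tanh(W @ x + w_in * drive[t])
        history[t] = x
    return history

silent = run(np.zeros(n_steps))  # no metronome
driven = run(z)                  # metronome on

# Spread of activity over the last two metronome cycles.
print("undriven:", silent[-2 * period:].std())
print("driven:  ", driven[-2 * period:].std())
```

Without the drive the initial splash fades away; with it, every neuron keeps producing its own distorted, shifted version of the rhythm.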
The dream is then to take that boring metronome z(t) and turn it into an interesting target — a bird song, a motor command — call it y(t). Kirsanov’s image:
It’s like dropping a stone in the pool every 10 seconds, but sculpting the walls of the pool so perfectly that the resulting ripples sound like Beethoven’s fifth symphony.
That’s hard. That’s exactly why recurrent networks are notoriously hard to train. Once you have loops, every weight you nudge has consequences that ripple through time. Tweaking a connection by 1% now might mess up the output 10 seconds from now. The interactions are knotted across time, and untying that knot is a nightmare.
The radical move: don’t train the reservoir
In the early 2000s, researchers gave up on the knot. Don’t try to tame the mess. Embrace it. Leave the random connections inside the reservoir completely alone. Don’t touch a single one.
Instead, add one extra neuron, a readout, that listens to all the others but doesn’t talk back. Its output is just a weighted sum of every neuron’s activity. Picture a microphone on each neuron: the only things you train are the volume knobs on those microphones.
Each neuron is shouting its own random gibberish into its microphone. Our job is to simply tweak the volume knobs on all of those microphones in such a way that the collective hum sounds like our target song.
The astonishing bit: finding those volume knobs is just linear regression. The same least-squares math that fits a straight line through scattered points fits a zebra finch song from a pool of random neurons. One pass over the recorded activity, closed-form solution, done.
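To make the regression step concrete, here is a minimal end-to-end sketch. It is an illustration under stated assumptions, not the video's code: a tanh reservoir with random biases, a sine-wave drive with a period of 50 steps, a made-up target built from the second and third harmonics of that drive, and ridge regression (least squares with a small penalty `ridge`) standing in for plain linear regression.

```python
import numpy as np

rng = np.random.default_rng(3)
n_neurons, period = 300, 50
washout, n_train, n_test = 200, 1500, 500
n_steps = washout + n_train + n_test

# Fixed random reservoir: none of these numbers is ever trained.
W = rng.normal(size=(n_neurons, n_neurons))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
w_in = rng.uniform(-1.0, 1.0, size=n_neurons)
bias = rng.uniform(-1.0, 1.0, size=n_neurons)

phase = 2 * np.pi * np.arange(n_steps) / period
z = np.sin(phase)                                      # boring metronome z(t)
y = 0.6 * np.cos(2 * phase) + 0.4 * np.sin(3 * phase)  # hypothetical target y(t)

# Run the driven reservoir and record every neuron's activity.
x = np.zeros(n_neurons)
states = np.empty((n_steps, n_neurons))
for t in range(n_steps):
    x = np.tanh(W @ x + w_in * z[t] + bias)
    states[t] = x

# Drop the initial transient, then split into training and held-out stretches.
X_train = states[washout:washout + n_train]
y_train = y[washout:washout + n_train]
X_test = states[washout + n_train:]
y_test = y[washout + n_train:]

# All of the "training": one closed-form ridge regression for the volume knobs.
ridge = 1e-6
w_out = np.linalg.solve(X_train.T @ X_train + ridge * np.eye(n_neurons),
                        X_train.T @ y_train)

y_hat = X_test @ w_out
nrmse = np.sqrt(np.mean((y_hat - y_test) ** 2)) / np.std(y_test)
print("held-out normalised error:", nrmse)
```

The random biases are a design choice worth noting: tanh is an odd function, so without them a reservoir driven by a pure sine would produce only odd harmonics, and no readout could recover the even-harmonic part of the target.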
Why this works — Fourier’s loophole
It feels like a magic trick. Why should a complex signal already be hiding inside a random tangle? The cleanest intuition Kirsanov offers is a detour through early 19th-century France.
Joseph Fourier was stuck on heat flow. The differential equation was easy if the initial heat profile was a smooth sine wave: a sine wave just shrinks in amplitude as it cools, with no change of shape. But if the initial profile was jagged, the math was hopeless. So Fourier had a wild idea: what if every jagged shape is secretly just a stack of sine waves added together? If true, you don’t solve the hard problem. You solve the easy problem many times over and add the answers up, which works because the heat equation is linear. He was right, at least for every reasonably well-behaved shape. Sines and cosines are a basis: universal building blocks. Stack enough of them in the right proportions and you can approximate essentially any such curve as closely as you like.
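Fourier's claim is easy to test on a deliberately jagged shape. The sketch below uses the textbook Fourier series of a square wave, (4/π) times the sum of sin(kx)/k over odd k, and measures how the root-mean-square error shrinks as more sine waves are stacked. The function name `partial_sum` and the grid size are arbitrary choices.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 2000, endpoint=False)
square = np.sign(np.sin(x))  # jagged target: jumps between -1 and +1

def partial_sum(x, n_terms):
    """Sum of the first n_terms odd harmonics of the square-wave series."""
    k = 2 * np.arange(n_terms) + 1  # 1, 3, 5, ...
    return (4 / np.pi) * np.sum(np.sin(np.outer(k, x)) / k[:, None], axis=0)

errors = {}
for n_terms in (1, 5, 50):
    approx = partial_sum(x, n_terms)
    errors[n_terms] = np.sqrt(np.mean((approx - square) ** 2))
    print(n_terms, "sine waves -> RMS error", errors[n_terms])
```

The error never reaches zero with finitely many terms, because the sum is always smooth and the target is not, but it keeps falling as the stack grows.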
Now look back at the reservoir. A thousand random neurons driven by a metronome each respond differently: a thousand random squiggly time series. That’s a random basis. Not the clean, ordered sines and cosines of Fourier, but a vast Library of Babel of temporal shapes. If your library is big and varied enough, some weighted combination of those squiggles will come very close to almost any signal you want. The readout is just finding that combination.
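The library-size claim can be checked directly, without simulating a network at all. In this sketch the "squiggles" are sine waves with random frequencies and phases (a stand-in for neuron responses, chosen purely for illustration), the target is an arbitrary made-up smooth signal, and the readout is ordinary least squares using only the first k squiggles of the library.

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 2 * np.pi, 400)

# A library of 200 random squiggles: one random frequency and phase each.
n_library = 200
freqs = rng.uniform(0.1, 10.0, size=n_library)
phases = rng.uniform(0.0, 2 * np.pi, size=n_library)
library = np.sin(np.outer(t, freqs) + phases)  # shape (time, squiggle)

# Hypothetical target signal, not itself a member of the library.
target = np.exp(-0.3 * t) * np.sin(3 * t) + 0.5 * np.cos(5 * t + 1.0)

errors = {}
for k in (5, 20, 200):
    basis = library[:, :k]  # use only the first k squiggles
    weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
    fit = basis @ weights
    errors[k] = np.sqrt(np.mean((fit - target) ** 2)) / np.std(target)
    print(k, "squiggles -> normalised error", errors[k])
```

Because each larger basis contains the smaller ones, the fit can only improve as squiggles are added; how quickly it improves depends on how diverse the library is relative to the target.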
The takeaway about brains
This says something quietly radical about biology. Maybe real neural circuits don’t need to be precisely engineered to produce complex behavior. Maybe the messy, random-looking tangle of cortical wiring isn’t a problem the brain has to solve — maybe it’s exactly what makes the brain work. The mess is the basis.
Key Takeaways
- Autonomous pattern generation is the brain’s core trick: producing precise sequences without external input. Songs in your head, motor commands, bird songs.
- A recurrent network’s hidden dynamics must sit in a narrow window: enough activity to ripple, not so much that it explodes. The well-behaved side of that window, where echoes of past input fade, is the echo state property.
- Chaos in this context has a technical meaning: tiny errors blow up exponentially. Useful computation requires controlled, fading echoes.
- Reservoir computing flips the standard ML script — leave the recurrent weights random and only train a linear readout. Training collapses from hard nonlinear optimisation to one closed-form regression.
- Brain rhythms (theta, gamma) probably aren’t decoration — they’re the metronome that keeps the reservoir alive long enough to produce sustained sequences.
- Fourier’s insight generalises: a sufficiently rich collection of basis functions, even random ones, can approximate a very wide range of signals. Order isn’t required, just diversity.
- Implication for biology: random-looking cortical wiring may be a feature, not noise to be cleaned up.
Claude’s Take
Kirsanov is one of the cleanest computational-neuroscience explainers on YouTube — patient, visual, mathematically honest without bullying. This video is him at his best. The Fourier bridge to reservoir computing is genuinely the right intuition, and I haven’t seen it framed this elegantly elsewhere. The swimming-pool metaphor carries the entire argument without ever feeling forced.
Score 9/10. A point off only because the readout-as-linear-regression bit goes by fast and the video skips why the echo state property specifically (rather than some other stability condition) is what makes the basis usable. Still, for a 14-minute video this is a remarkable amount of intellectual ground covered.
The deeper thing worth sitting with: this is a recurring pattern in modern theory — large random systems often work better than carefully engineered ones, as long as you read them out smartly. It shows up in extreme learning machines, in some interpretations of why huge transformers work, in compressed sensing. Random + smart readout > engineered + dumb readout. Worth keeping that template in your head.
Further Reading
- Jaeger & Haas (2004): Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication. The landmark echo state network paper; the idea itself dates to Jaeger’s 2001 technical report.
- Maass, Natschläger & Markram (2002) — Real-Time Computing Without Stable States. The “liquid state machine” version of the same idea, framed for biological microcircuits.
- Sussillo & Abbott (2009) — Generating Coherent Patterns of Activity from Chaotic Neural Networks. The FORCE learning algorithm, which extends reservoir computing into the chaotic regime — directly relevant to Kirsanov’s tradeoff.
- Jeff Hawkins, A Thousand Brains. Sponsor mention in the video, but loosely relevant: argues the cortex is built from many semi-independent columns, each modelling the world on its own, which invites a reservoir-like reading.
- Artem Kirsanov’s prior video on recurrent networks — the explicit predecessor to this one; worth watching first if RNNs feel hazy.
- For Fourier, any intro signal-processing text — but Grant Sanderson’s (3Blue1Brown) Fourier series video is the cleanest visual intuition.