The Most Counterintuitive Way To Build A Brain
TITLE: The Most Counterintuitive Way to Build a Brain
CHANNEL: Artem Kirsanov
DATE: 2026-05-05
---TRANSCRIPT---
You know, there is something miraculous happening in your brain right now. Close your eyes. I want you to think of the song “We Will Rock You” by Queen. Chances are you can hear it in your head. But here’s the mystery. Where is it coming from? Your eardrums are not vibrating. The outside world is not pushing the song into your brain. You are generating it internally. This is actually one of the fundamental tasks that the brain needs to perform, called autonomous pattern generation. From a zebra finch singing its song to a pitcher throwing a ball, brains constantly face the challenge of learning to produce precise sequences of neural activity. So if we want to build a machine that thinks like us, we have to solve this specific problem. How do we build a box that generates complex behavior seemingly out of thin air?
In the previous video, we saw that standard neural networks are essentially static machines with no sense of time. To fix this, we introduced recurrence, letting neurons feed their activity back into themselves. But as we hinted, there is another way to think about recurrence. Not as an engineering fix, but as a fundamental property of a dynamical system. Think of it like a swimming pool. You jump in. This is the input. You make a splash, but after you leave, the water doesn’t stop. The ripples you generated spread, reflect off the walls, and interfere with each other, creating complex patterns. Essentially, the input just gave the system a little nudge, but the water keeps dancing according to its own internal physics, creating a kind of memory of your jump.
Now, we know that brains compute with nerve cells, acting as individual units interacting with each other. In a way, they are like individual water molecules in that pool. Imagine a bucket of N neurons, say a thousand of them. We’ll call this our reservoir.
Let’s connect them randomly. Some connections are strong, some are weak, some positive, some negative. It’s a big tangled mess. Let’s write down what happens to a single neuron in that pool. At each moment, its state is determined by where it was a moment ago, plus the incoming ripples from all other neurons. Here, W_ij is the strength of the connection between neurons j and i. And sigma is our activation function, mimicking how a real neuron only fires once its input voltage crosses a threshold.
But here’s the catch. In a real swimming pool, if you wait long enough, the water settles. The friction kills the energy and the ripples die out. Now, mathematically, this friction is actually a good thing. It creates stability. If we didn’t have it, if we cranked the weights up too high, the network would generate a self-sustained dance, but it would be chaotic. Chaos here means sensitivity to initial conditions. If a single neuron misfired by a millisecond, that tiny error would explode and the whole pattern would change. You can’t compute with an explosion. So, we tune the network to have what’s called the echo state property. It means that every input leaves a temporary trace, an echo, in the network’s activity. But that echo gradually fades over time.
But this brings us back to the swimming pool problem. If the ripples eventually die out, how do we sing a long song? We need to keep the water moving. We need a driver. Let’s introduce a simple rhythmic signal z(t), something like a boring sine wave, to keep the energy levels up. Think of it like a background clock. In the brain, this might correspond to rhythmic oscillations like theta or gamma waves that act as neural pacemakers. Each neuron now receives this driving signal scaled by a value mu_i unique to that neuron. The goal then is to take this boring driving signal z(t) and transform it into an interesting target signal y(t), like a zebra finch song or a motor command.
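The driven reservoir described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the exact equation shown in the video: it assumes a discrete-time update x(t+1) = tanh(W x(t) + mu z(t)), takes tanh as the activation sigma, and uses the common heuristic of rescaling W to a spectral radius below 1 so that echoes fade. The names `make_reservoir` and `run_reservoir`, the network size, and the value 0.9 are all hypothetical choices for illustration.

```python
import numpy as np

def make_reservoir(n_neurons=200, spectral_radius=0.9, seed=0):
    """Random recurrent weights W and per-neuron input gains mu (never trained)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0, size=(n_neurons, n_neurons)) / np.sqrt(n_neurons)
    # Assumption: rescale so the largest eigenvalue magnitude equals
    # spectral_radius; values below 1 are a common heuristic for fading echoes.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    mu = rng.uniform(-1.0, 1.0, size=n_neurons)
    return W, mu

def run_reservoir(W, mu, z, x0=None):
    """Iterate x(t+1) = tanh(W @ x(t) + mu * z(t)); return states of shape (T, N)."""
    n = W.shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    states = np.empty((len(z), n))
    for t, z_t in enumerate(z):
        x = np.tanh(W @ x + mu * z_t)   # recurrent ripples plus scaled driver
        states[t] = x
    return states

if __name__ == "__main__":
    W, mu = make_reservoir()
    n = W.shape[0]
    z = np.sin(2 * np.pi * np.arange(500) / 50.0)   # the rhythmic driver z(t)
    # Same driver, two very different starting states.
    a = run_reservoir(W, mu, z, x0=np.ones(n))
    b = run_reservoir(W, mu, z, x0=-np.ones(n))
    print("gap after first step:", np.linalg.norm(a[0] - b[0]))
    print("gap after last step: ", np.linalg.norm(a[-1] - b[-1]))
```

Running the two copies from different starting states is one way to see the fading echo: with the same driver, the gap between the trajectories should shrink as the memory of the initial state dies out.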
It’s like dropping a stone in the pool every 10 seconds, but sculpting the walls of the pool so perfectly that the resulting ripples sound like Beethoven’s Fifth Symphony. That sounds extremely complicated, and that’s because it is. In fact, to this day, recurrent neural networks are notoriously hard to train.
But here comes the crucial mental shift. You see, in traditional machine learning, you act as a micromanager. You try to adjust every single connection weight between every pair of neurons to sculpt that perfect splash. The problem is that once you introduce recurrence, the interactions become entangled in time. The effect of nudging a weight by 1% right now might have unexpected consequences 10 seconds from now. Because these ripples are bouncing around in loops, it’s incredibly hard to untie the knot.
If these ideas got you curious about broader theories of neural computation, I’d recommend the book A Thousand Brains by Jeff Hawkins, which proposes that the neocortex is itself a kind of reservoir of independent cortical columns. You can find it on Shortform, who are kindly sponsoring today’s video. Shortform turns books into proper study resources. Not just condensed summaries, but deep guides that place each book’s ideas in the context of related research and other titles, offering a much richer understanding of the big picture. They cover a wide range of genres like science, technology, and education, releasing new guides every week, and letting subscribers vote on which books to cover next. There is also a browser extension that does the same thing for articles and YouTube videos you stumble across online. If you want to supercharge your reading, follow the link down in the description for a free trial and 20% off the annual subscription.
But in the early 2000s, researchers asked a radical question. What if instead of trying to tame this mess, we embraced it? What if we don’t train the reservoir at all?
This is the philosophy of reservoir computing. We leave the connections inside the bucket completely random. We don’t touch them. Rather than trying to force water molecules to bounce around perfectly, we just learn to work with the physics we already have. Let’s see what happens when we let a simple sine wave hit that random network. Examining individual neurons, it looks like a mess. But reservoir computing relies on a beautiful mathematical curiosity. The answer we’re looking for is already hidden in that noise. We just need to learn to look at the mess from the right angle.
Now, this might sound like magic, and we’ll see why it works in a moment, but here’s what I mean. Let’s add one final neuron called the readout. It listens to the activity of all other neurons, but doesn’t talk back. The state of that readout, x(t), is simply a weighted sum of the states of all neurons in the network. While we can’t touch the network, we can adjust these readout weights. In fact, this is the only thing we can do. You can think of it like this. Each neuron is shouting its own random gibberish into its microphone. Our job then is to simply tweak the volume knobs on all of those microphones in such a way that the collective hum sounds like our target song.
We let the network run for a while and record the voices of all N neurons. Mathematically, we’re looking for a set of coefficients such that when we add up all these random signals, we get our target y(t). It turns out this is a famous problem with a simple analytical solution. It is just a linear regression in disguise. The math for finding the perfect bird song is the exact same math used to fit a straight line through a set of points on a graph. I won’t go through the derivation here. I think the conceptual picture is far more important. But the upshot is this. We can calculate the optimal weights in a single sweep.
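The "volume knobs" step above can be sketched as an ordinary least-squares fit. The code below is a hedged illustration rather than the method used in the video: `fit_readout` and `demo` are hypothetical names, and the small `ridge` term is an added assumption for numerical stability (plain linear regression is the limit where it goes to zero). The tanh update, the spectral radius of 0.9, and the target signal are likewise made up for the example.

```python
import numpy as np

def fit_readout(states, target, ridge=1e-6):
    """Weights w minimizing ||states @ w - target||^2 + ridge * ||w||^2."""
    X = np.asarray(states, dtype=float)   # shape (T, N): one column per neuron
    y = np.asarray(target, dtype=float)   # shape (T,)
    # Regularized normal equations, solved in one shot (no iterative training).
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def demo(n_neurons=300, n_steps=3000, washout=500, seed=0):
    """Drive a random tanh network with a sine wave, fit a readout,
    and return the relative training error of the reconstruction."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_neurons, n_neurons)) / np.sqrt(n_neurons)
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # assumed: keeps echoes fading
    mu = rng.uniform(-1.0, 1.0, size=n_neurons)       # per-neuron input gains

    phase = 2 * np.pi * np.arange(n_steps) / 50.0     # 50 steps per cycle
    z = np.sin(phase)                                 # the "boring" driver
    # Hypothetical target built from odd harmonics of the driver. tanh is odd
    # and this sketch has no bias term, so the reservoir's steady response to a
    # sine contains only odd harmonics; a target with even harmonics would need
    # a bias (or another symmetry-breaking term) in the update.
    y = np.sin(phase) + 0.5 * np.sin(3 * phase + 0.7) + 0.3 * np.sin(5 * phase + 1.3)

    x = np.zeros(n_neurons)
    states = np.empty((n_steps, n_neurons))
    for t in range(n_steps):
        x = np.tanh(W @ x + mu * z[t])
        states[t] = x

    X, y_fit = states[washout:], y[washout:]          # discard the initial transient
    w = fit_readout(X, y_fit)
    return np.linalg.norm(X @ w - y_fit) / np.linalg.norm(y_fit)

if __name__ == "__main__":
    print(f"relative training error: {demo():.3g}")
```

Only `w` is learned here; `W` and `mu` stay exactly as they were drawn. The error reported is on the training window itself, so it says how well the random signals can be mixed into the target, not how the readout would behave on unseen input.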
Once we lock those weights in, if we drive the network with that simple sine wave, it produces a complex rippling response that the readout neuron translates into a beautiful zebra finch song. But this might feel unsatisfying, almost magical. Why on earth would we expect a complex signal to be hiding inside a bucket of randomly connected neurons?
The intuition I find the most satisfying is this. Let’s step back from neural networks for a second and go back to the early 19th century. The French mathematician Joseph Fourier was obsessed with a specific problem: heat. He wanted to describe exactly how heat spreads through a solid object like an iron bar over time. He wrote down the differential equation for it but hit a wall. If the initial heat profile was jagged or complicated, the math was impossible. He could not solve the equation. But Fourier found a loophole. He realized that if the initial temperature looked like a perfect smooth sine wave, the solution was trivial. A sine wave doesn’t change its shape as it cools down. It just gets flatter. The math for a sine wave was easy. And then he had a crazy idea. He asked, “What if the jagged complicated shape I can’t solve is actually just a bunch of simple sine waves added together?” If that were true, he wouldn’t need to solve the hard equation. He could just solve the easy equation for each individual sine wave, add the answers together, and boom, he would have the solution for the jagged mess.
And remarkably, he was right. We now know that if you have enough sine and cosine waves and if you mix them in the right proportions, you can build any curve you want. In mathematics, we say that sines and cosines form a basis. They are universal building blocks. Importantly, they are not the only basis. You may have heard of Taylor expansions, which use polynomials to do the same thing.
So, what does it all have to do with reservoir computing? Think about what we just built. We have a bucket of neurons. We drive them with a signal.
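As a side illustration of Fourier's idea of building a jagged shape out of sine waves, here is a minimal sketch. The heat profile, the grid, and the helper names `sine_basis` and `approximate` are hypothetical, and the mixing proportions are found by least squares rather than by Fourier's integral formulas.

```python
import numpy as np

def sine_basis(x, n_terms):
    """Columns sin(k * pi * x) for k = 1..n_terms, evaluated at points x in [0, 1]."""
    k = np.arange(1, n_terms + 1)
    return np.sin(np.pi * np.outer(x, k))

def approximate(profile, x, n_terms):
    """Best least-squares mix of the first n_terms sine waves."""
    B = sine_basis(x, n_terms)
    coeffs, *_ = np.linalg.lstsq(B, profile, rcond=None)
    return B @ coeffs

x = np.linspace(0.0, 1.0, 401)
# Hypothetical jagged initial heat profile: hot only in the middle third of the bar.
profile = np.where((x > 1 / 3) & (x < 2 / 3), 1.0, 0.0)

for n_terms in (3, 10, 50):
    err = np.sqrt(np.mean((approximate(profile, x, n_terms) - profile) ** 2))
    print(f"{n_terms:3d} sine waves -> RMS error {err:.3f}")
```

Because each larger set of sine waves contains the smaller one, the least-squares error can only go down as terms are added, which is the sense in which a big enough library of simple shapes can rebuild a complicated one.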
Because the connections are random, every neuron reacts differently. When we record these neurons, we’re looking at a collection of random squiggly lines. Just like Fourier had a collection of sine waves to build a heat profile, we can use this collection of neuron activities to build a bird song. In other words, we have created a random basis, a Library of Babel of temporal shapes. And just like Fourier, if our library is big enough, if we have enough random variations, we can find a linear combination of these building blocks that adds up to tell the exact story we want to hear.
So, let’s tie everything together. We started with a simple question. How does the brain generate complex patterns seemingly out of thin air? We saw that recurrent neural networks, unlike simple input-to-output machines, have their own internal dynamics, like ripples in a swimming pool. But these dynamics are notoriously hard to control. The key insight of reservoir computing is that we don’t have to control them. We leave the random network untouched and only learn a simple linear readout, adjusting the volume knobs on a choir of random voices until the collective hum matches our target. And the reason this works is almost Fourier-like. A large enough collection of random temporal patterns forms a rich basis from which virtually any signal can be reconstructed.
This tells us something interesting about the brain. Maybe biological neural circuits don’t need to be precisely engineered to produce complex behavior. The messy, random-looking tangle of connections might not be a bug. It might be exactly the feature that makes the system so powerful.
If you enjoyed the video, share it with your friends. Subscribe to the channel if you haven’t already and press the like button. Stay tuned for more computational neuroscience and machine learning topics coming up.