The Science of Learning Math (and Anything Else) with Justin Skycak

ELI5/TLDR

Learning math (or anything skill-based) works almost exactly like training for a sport: you need your fundamentals drilled to the point where you don’t think about them, you need a coach who serves you the right exercise at the right time, and watching lectures without solving problems is about as useful as watching tennis on TV and expecting your serve to improve. Justin Skycak, who self-studied 3,000 hours of college math as a teenager and now builds Math Academy, walks through the cognitive science of why most education fails and what actually works. The short version: prerequisites before everything, minimum effective doses of instruction, solve problems immediately, and let a knowledge graph handle the sequencing so your brain can handle the learning.

The Full Story

The Accidental Math Addict

Justin Skycak’s story starts with a teenager who finished pre-calculus, got curious about calculus over the summer, and found an online course that broke the subject into small lessons with lots of problems. He expected to spend the whole summer grinding through a third of it. Instead, math replaced video games. Eight hours a day, holed up in his room, knocking out lessons like levels in a game. His parents — an art major and a business major — had no frame of reference. It was like their kid had developed a wholesome addiction and they didn’t know whether to intervene.

That summer turned into 3,000 hours across high school, working through MIT OpenCourseWare up through real analysis and abstract algebra. But here’s the thing Justin keeps coming back to: most of those hours were wasted. He estimates the actual learning could have been compressed to about 750 hours with proper guidance. He violated basically every principle of cognitive science — skipping problems to watch videos, ignoring spaced review, attempting problems way above his level — and had to discover each mistake the hard way.

“Every single cognitive science principle — I violated initially during my self-study and kind of learned from that.”

The Knowledge Graph: A Textbook That Knows You

Math Academy, where Justin is now chief quant and director of analytics, is built on a simple but powerful idea: imagine an expert tutor who knows exactly what you know, what you don’t know, and has unlimited patience and an infinite bank of problems. That’s what they’re trying to build.

Think of it like a personal trainer for your brain. A good trainer doesn’t hand you a barbell and say “figure it out.” They assess where you are, identify your weak points, demonstrate the movement once, then have you practice while they watch. Math Academy does this through a knowledge graph: a directed acyclic graph (imagine a massive family tree of skills) with more than 2,000 topics, each broken into three or four atomic knowledge points.

The system starts with a diagnostic test that looks backward for missing prerequisites, sometimes years back. Want to learn calculus? It’ll test your algebra. Can’t complete the square? That goes on your learning plan. Then it serves you material at your “knowledge frontier” — the edge of what you’re ready to learn. Each lesson is a minimum effective dose: a brief introduction, then you’re solving problems immediately. Nail the first two? Move on. Struggle? Do a few more until you hit two successes in a row.
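A minimal sketch of how the frontier and the two-in-a-row rule could be computed, assuming a toy prerequisite graph. The topic names, the PREREQS dictionary, and both function names are hypothetical illustrations, not Math Academy's actual implementation.

```python
# Hypothetical toy graph: topic -> list of direct prerequisites.
PREREQS = {
    "arithmetic": [],
    "linear_equations": ["arithmetic"],
    "completing_the_square": ["linear_equations"],
    "quadratic_equations": ["linear_equations", "completing_the_square"],
    "derivatives": ["quadratic_equations"],
}


def knowledge_frontier(prereqs, mastered):
    """Topics not yet mastered whose direct prerequisites are all mastered."""
    return sorted(
        topic
        for topic, reqs in prereqs.items()
        if topic not in mastered and all(r in mastered for r in reqs)
    )


def passed_lesson(results, streak_needed=2):
    """True once the answer history contains `streak_needed` correct answers in a row."""
    streak = 0
    for correct in results:
        streak = streak + 1 if correct else 0
        if streak >= streak_needed:
            return True
    return False


# A student who has mastered only arithmetic and linear equations.
frontier = knowledge_frontier(PREREQS, mastered={"arithmetic", "linear_equations"})
```

Here `frontier` contains only `completing_the_square`: quadratics and derivatives stay locked until their prerequisites are in place, which mirrors the "can't complete the square? That goes on your learning plan" behavior described above.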

The entire knowledge graph was built by hand, not generated by an LLM. Justin and the team spent years encoding the prerequisite relationships, informed by watching real students struggle in real classrooms. This is the part that’s hard to automate: recognizing that when a topic pulls together ten prerequisites students have never practiced combining, it creates what they call “cognitive congestion,” and the topic needs to be broken into smaller steps. That kind of insight comes from teaching, not from training data.

FIRE: The Spaced Repetition Algorithm That Reads the Graph

One of the cleverest pieces of the system is their review algorithm, called FIRE — Fractional Implicit Repetition. Here’s the problem it solves.

Imagine you’re doing Anki-style flashcards, but instead of vocabulary words, each card is a math problem that takes a minute or two to work through. Your review pile grows fast. This is “review hell” — and it kills most self-study efforts.

Now imagine you notice something. When you solve a quadratic equation by factoring it into linear pieces, you’re implicitly practicing linear equations. When you solve a linear equation, you’re practicing subtraction and division. Think of it like lightning strikes — when you practice one advanced skill, electricity crackles down through the graph, partially reviewing dozens of sub-skills below it.

FIRE exploits this structure. Instead of drawing the most overdue card from your review pile, it draws the card that knocks out the most due reviews with a single problem. One well-chosen advanced problem can cascade fractional credit down through the entire prerequisite tree. The “fractional” part matters: solving one quadratic doesn’t cover every case of linear equations (what about ones with leading coefficients?), so the algorithm tracks partial coverage and distributes credit accordingly.
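The selection idea in the paragraph above can be sketched as follows. This is an illustration under stated assumptions, not the real FIRE algorithm: the ENCOMPASSES weights, the rule for combining credit along paths, and the function names are all hypothetical.

```python
# Hypothetical weights: ENCOMPASSES[a][b] is the fraction of a review of
# topic b that one problem on topic a is assumed to provide.
ENCOMPASSES = {
    "quadratic_equations": {"linear_equations": 0.5},
    "linear_equations": {"subtraction": 0.5, "division": 0.5},
    "subtraction": {},
    "division": {},
}


def implicit_credit(topic, encompasses):
    """Fractional review credit earned on every topic by one problem on `topic`."""
    credit = {topic: 1.0}
    stack = [(topic, 1.0)]
    while stack:
        current, weight = stack.pop()
        for sub, fraction in encompasses[current].items():
            passed_down = weight * fraction
            # Assumption: if a sub-skill is reachable several ways, keep the strongest path.
            if passed_down > credit.get(sub, 0.0):
                credit[sub] = passed_down
                stack.append((sub, passed_down))
    return credit


def best_review(due, encompasses):
    """Pick the due topic whose single problem covers the most due review credit."""
    def coverage(topic):
        credit = implicit_credit(topic, encompasses)
        return sum(c for t, c in credit.items() if t in due)

    # Sorting first makes tie-breaking deterministic.
    return max(sorted(due), key=coverage)


choice = best_review({"quadratic_equations", "linear_equations", "subtraction"}, ENCOMPASSES)
```

With these made-up weights, one quadratic problem earns full credit on quadratics, half on linear equations, and a quarter on subtraction, so it beats reviewing any of the three topics on its own.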

“You can visualize it almost like strikes of lightning. You practice this thing, it covers a bunch of stuff below.”

Automaticity: The Prerequisite You Didn’t Know You Needed

Justin makes a claim that sounds counterintuitive: computational automaticity is a prerequisite for conceptual understanding. Not a substitute — a prerequisite. Imagine a basketball player who has to consciously think about how to dribble. They can’t see their teammates. They can’t read the defense. The mechanics eat up all their working memory, leaving nothing for strategy.

The same thing happens in math. A student who has to consciously recall how the chain rule works can’t “feel” why gradients explode or vanish in a neural network. But Justin’s students, high schoolers who had built up automaticity through hundreds of practice problems, could work through a simplified three-layer network and predict the behavior after just one round of computation.
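To make the vanishing-gradient pattern concrete, here is a small sketch of the chain rule applied to a three-layer network. The source does not specify the network the students used, so the setup is an assumption: one sigmoid unit per layer, no biases, and illustrative weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_wrt_first_weight(x, weights):
    """d(output)/d(weights[0]) for a chain of one-unit sigmoid layers, via the chain rule."""
    activation = x
    local_slopes = []  # sigmoid'(z) at each layer, never larger than 0.25
    for w in weights:
        z = w * activation
        activation = sigmoid(z)
        local_slopes.append(activation * (1.0 - activation))

    # Chain rule: slope of every layer, times every weight after the first, times the input.
    grad = x
    for slope in local_slopes:
        grad *= slope
    for w in weights[1:]:
        grad *= w
    return grad


modest = gradient_wrt_first_weight(1.0, [1.0, 1.0, 1.0])
saturated = gradient_wrt_first_weight(1.0, [10.0, 10.0, 10.0])
```

Because each sigmoid slope is at most 0.25, the gradient with unit weights cannot exceed 0.25 cubed. With large weights every layer saturates, each slope collapses toward zero, and `saturated` comes out many orders of magnitude smaller than `modest`.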

“Wait, this thing is already saturating… I don’t even have to work out the rest. I already know how it’s going to play out.”

This is the difference between watching basketball on TV and playing it. You can learn the vocabulary of backpropagation from a YouTube video. But the deep conceptual understanding — the kind where you look at a loss curve and immediately know what’s wrong — requires having done the movements yourself.

The Confusion Paradox

Both speakers agree on a subtle distinction about confusion in learning. When you’re working through known material (the existing knowledge graph), confusion almost always means a missing prerequisite. The “aha moment” feels amazing, but it represents time wasted: if you’d had that prerequisite in place, there would have been nothing to click. The learning would have just… flowed.

“We try to smooth it out so you don’t have that — you’re climbing out of the hole and you’re like ‘I made it!’ No, they just make it smooth.”

But at the frontier — in research, or building something genuinely new — confusion is the signal. Everyone has missing prerequisites because nobody knows what the prerequisites are. That’s the whole point. If you can identify the missing piece that’s confusing everyone, that’s a high-value contribution.

The practical advice: spend your confusion budget on things with high ROI. Don’t manufacture confusion in algebra when there’s a well-mapped path through it. Save it for the frontier.

Adults, ADHD, and Breaking the Concrete

For adults wondering if it’s too late: the amount of work doesn’t change based on your age. What changes is the available time. Math Academy’s Mathematical Foundations sequence — everything from fractions to university-ready math, stripped of irrelevant school curriculum — takes about 15,000 minutes. That’s one hour per weekday for a year. Two hours per weekday and you’re done in six months.
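The timeline arithmetic can be checked with a short calculation. The 15,000-minute total comes from the text; the five-weekday study week and the function name are assumptions made for the illustration.

```python
TOTAL_MINUTES = 15_000    # figure quoted in the text
WEEKDAYS_PER_WEEK = 5     # assumption: no study on weekends


def weeks_to_finish(minutes_per_weekday):
    """Calendar weeks needed at a fixed number of study minutes per weekday."""
    weekdays_needed = TOTAL_MINUTES / minutes_per_weekday
    return weekdays_needed / WEEKDAYS_PER_WEEK


one_hour_pace = weeks_to_finish(60)    # 250 weekdays -> 50 weeks
two_hour_pace = weeks_to_finish(120)   # 125 weekdays -> 25 weeks
```

Fifty weeks is roughly a year and 25 weeks is just under six months, which matches the estimates given above.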

Justin tells the story of a co-worker who went from 911 call receiver to data scientist. Started community college in his late 20s, no math or coding background, held down his job while studying, earned the nickname “lord of the data flow” at his first tech company. The hardest part is the upskilling phase when you’re working one job and preparing for another. Once you make the transition, you get eight hours a day of paid practice in your new direction. Phase change.

On ADHD: Math Academy’s founder, Jason Roberts, is on the ADHD spectrum, and this shaped the entire design philosophy. The insight is that ADHD doesn’t mean you can’t learn — it means the standard learning environment (being talked at for an hour) is even more intolerable for you than it already is for everyone else. The minimum effective dose approach — brief instruction, immediate problem-solving, constant engagement — actually serves ADHD learners better than traditional education serves anyone.

The emotional layer matters more than people think. Many ADHD learners develop shame about not being able to sit through lectures, which becomes a bigger barrier than the attention difference itself. Feeling dumb is often the real impediment to learning, not being dumb.

LLMs as Multipliers, Not Equalizers

Both speakers converge on a sharp observation about AI tools: they’re multipliers, not equalizers. If you know what you’re doing, Claude or GPT can put zeros behind your one, taking you from 10 to 100 to 1,000. If you don’t know what you’re doing, the LLM will cheerfully lead you into what Yacine calls “some psychosis area”: 26 pages of confident nonsense.

Justin’s rule for using AI coding tools: scope the task small enough to fit in your working memory. If you can personally verify every decision the tool makes, it’s extraordinary. If you can’t, you’ll look up a day later and find a mess. Same failure mode as giving an unscoped project to a junior employee.

Key Takeaways

3,000 hours compressed to 750: Justin estimates ~75% of his self-study hours were wasted due to inefficient techniques. The gap between unguided and guided learning is enormous.
Prerequisites are the bottleneck, not explanations: When a student needs ten different explanations to understand something, the real problem is almost always a missing prerequisite — not a bad explanation.
Knowledge graph scale: More than 2,000 topics, each with 3-4 atomic knowledge points. Math Academy’s calculus course alone has ~300 topics. All hand-curated by domain experts, not generated.
FIRE (Fractional Implicit Repetition): Review algorithm that exploits the prerequisite graph to compress review load. One advanced problem cascades partial credit to dozens of sub-skills, solving the “review hell” problem.
Cognitive congestion: A topic with 10 unpracticed prerequisites pulled together for the first time will overload working memory. The knowledge graph is specifically designed to reduce this.
The Mathematical Foundations sequence: ~15,000 minutes (fractions to university-ready), stripped of ~25-33% of traditional school content that doesn’t feed forward into university math (e.g., inscribed circle theorems).
Automaticity precedes understanding: You can’t conceptually understand why gradients vanish if you haven’t manually chain-ruled through enough examples to feel the pattern.
Confusion ROI: In known material, confusion = wasted time from missing prerequisites. At the frontier, confusion = signal worth investigating. Don’t spend confusion budget on solved problems.
ADHD learners aren’t broken, the environment is: Minimum effective doses of instruction suit ADHD learners better than traditional lectures suit anyone. Math Academy’s ADHD founder designed the system around what he hated about school.
LLMs multiply existing skill: A Terence Tao-level mathematician gets enormous leverage from AI tools. A beginner gets cheerfully led into nonsense. The tools amplify the gap.
The sport-coach analogy holds up: Great athletes often make poor coaches because they never had to consciously solve problems that came naturally to them. The best teachers have had to push past their own limitations.
Ultralearning vs. mastery learning: Top-down (goal-first, backfill prerequisites) works when the gap is small. Bottom-up (systematic mastery) works when the gap is large. Best approach is often a hybrid: scope down the goal, then fill up systematically.

Claude’s Take

This is a genuinely useful conversation, and I’m giving it an 8. The reason it’s not higher is structural — at 3+ hours, there’s significant repetition and circling back to the same ideas from different angles. The signal-to-noise ratio is good but not exceptional. You could extract 90% of the value from a 90-minute version.

What’s actually valuable here is the specificity. Justin doesn’t just say “practice is important” — he explains the FIRE algorithm, the knowledge graph architecture, the cognitive congestion concept. These are mechanisms, not platitudes. The automaticity argument is particularly well-made: most learning advice stops at “do practice problems,” but the claim that automaticity is a prerequisite for (not a substitute for) conceptual understanding is a genuinely useful reframe.

The weakest parts are where the conversation drifts into biography and anecdote without connecting back to principles. Justin’s high school research stories are fine, but they don’t add much beyond “having foundations helps.” Yacine’s molecular biology stories are fun but long.

The strongest parts are the FIRE algorithm explanation, the confusion-as-signal-vs-waste distinction, and the frank discussion of what LLMs can and can’t do for learning. The ADHD section is also refreshingly practical: it avoids the usual “everyone learns differently” hand-waving and instead argues that the optimal learning flow is basically the same for everyone; ADHD just makes you less tolerant of the suboptimal version.

One thing worth noting: Math Academy is Justin’s company, so there’s an inherent bias in how he frames the problems and solutions. That said, the cognitive science principles he cites are well-established (spaced repetition, mastery learning, interleaving, retrieval practice), and the approach is coherent.