Systems Thinking for Leaders: Designing Solutions That Work

ELI5/TLDR

John Sterman, the MIT professor who literally wrote the textbook on system dynamics, spends an hour explaining why smart people with good intentions keep producing terrible outcomes. The culprit is what he calls policy resistance — your solution works for a moment, then the problem comes roaring back through a side door you didn’t see. His prescription is causal mapping (drawing the feedback loops you can’t see) and flight simulators (letting managers crash projects in a sandbox before they crash real ones), because telling people the answer never works.

The Full Story

The slogan problem

Systems thinking has become the kind of phrase that gets nodded at in boardrooms and means nothing. Sterman opens by saying as much. “Everything is connected to everything else” is true and useless. The interesting question is whether you can give a busy executive an actual tool — something they can pick up in a week and use on Monday morning.

His framing for why we need the tool is a phenomenon he calls policy resistance. It is not the failure to get your idea implemented. It is the more demoralizing thing where you do get it implemented, it appears to work, and then the problem reappears somewhere else, often worse than before.

His examples are familiar:

Build more roads to fight traffic, and congestion gets worse because driving becomes more attractive and people move farther from the city.
US healthcare has been “containing costs” for 70 years and now spends 18% of GDP.
Mergers promise synergies and routinely destroy value for the acquirer.
Six Sigma, re-engineering, TQM — the tools work, but most implementations fail, and each failure makes the next attempt harder.
Projects are “late, expensive, and wrong.”

Sterman quotes Thomas More from 500 years ago: “by applying a remedy to one sore, you provoke another.” The pattern is old. The question is what to do about it.

The open-loop fantasy

The first villain is the diagram every consultant has drawn: identify the issue, gather data, evaluate alternatives, pick the best one, implement, done. Sterman has never seen a project go this way: not the new Sloan building project he co-chaired, not even the grocery list. Real work is iterative, with discoveries forcing you back to earlier stages.

Although this is especially popular and common in organizations of all types, it is not how the world works. Instead, it’s a world of feedback.

The replacement is a causal diagram. You have a goal. You have an actual state of the system. The gap drives decisions, decisions change the system, the new state generates new information, around and around. Sterman uses his bike commute — 12 miles each morning — as the trivial case. Stay on the right side of the path. Drift left, hit pedestrians. Drift right, hit a tree. Without continuous feedback, you cannot know whether to turn the handlebars left or right.
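The goal, gap, decision, new-state cycle described above can be sketched as a very small simulation. This is only an illustration: the function name `ride`, the steady sideways drift, and the correction gain are assumptions invented for the example and do not come from the talk.

```python
def ride(drift, gain):
    """Simulate lateral position on a path under a per-step sideways push.

    gain=0.0 is the open-loop rider who never looks at the gap;
    gain>0 steers back in proportion to the gap between goal and state.
    All numbers are illustrative assumptions.
    """
    goal = 0.0       # desired lateral position (centre of the lane)
    position = 0.0   # actual state of the system
    history = []
    for push in drift:
        gap = goal - position          # information: how far off are we?
        correction = gain * gap        # decision driven by the gap
        position += push + correction  # disturbance plus decision change the state
        history.append(position)
    return history


drift = [0.2] * 20                     # a steady push to one side (assumed)
open_loop = ride(drift, gain=0.0)      # plan once, never check again
closed_loop = ride(drift, gain=0.5)    # keep comparing state to goal

print(f"open loop ends {open_loop[-1]:.2f} off course")
print(f"closed loop ends {closed_loop[-1]:.2f} off course")
```

With these assumed numbers the open-loop rider drifts steadily away from the goal, while the closed-loop rider settles at a small, bounded offset. The point is the structure of the loop, not the particular values.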

The problem is that running a company is not riding a bike. Your mental model of how the system responds is not wrong, just incomplete. You cut prices, hire better salespeople, advertise more — each of which, in isolation, ought to grow market share. But every action has multiple effects, not just the ones you planned.

There’s no such thing as a side effect in reality. There’s just effects.

This is the line of the talk. When a manager says “unintended consequence” or “side effect,” they are doing rhetorical work — claiming credit for what worked, externalising what didn’t. What they are actually revealing is the narrowness of the mental model they used to make the decision. And it gets worse: you’re not the only actor. Employees, customers, suppliers, regulators, communities all have their own goals. When you pull the system toward yours, you pull it away from theirs, and they push back. Everyone is operating with an incomplete model. Everyone is generating effects they didn’t plan for.

Case one: prior authorization in US healthcare

To show what causal mapping actually does, Sterman walks through US healthcare’s three-decade flirtation with prior authorization (PA), preferred drug lists (PDLs), and step therapy. Each one is a control mechanism — the insurer makes the doctor jump through a hoop before approving the test, drug, or procedure they want to prescribe.

The intent is clear. Costs are climbing, so make it harder to prescribe expensive things, and the unit cost of care will drop. Above the waterline this looks like a clean balancing feedback loop: cost pressure → tighter rules → lower unit costs → lower total costs.

Below the waterline, the picture is uglier. The numbers themselves are damning:

Prior authorization in the US now costs $35 billion a year in administrative overhead.
That’s roughly $11,000 per clinician per year.
Each request costs $20–30 to process.
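As a rough scale check, the three figures above can be divided into one another. This is back-of-envelope arithmetic only: the implied head count and request volume are not stated in the talk, and the request estimate rests on the assumption (almost certainly too simple) that the entire overhead is per-request processing cost.

```python
# Figures as quoted in the list above.
total_admin_cost = 35e9        # USD per year
cost_per_clinician = 11_000    # USD per clinician per year
cost_per_request = (20, 30)    # USD per request, low and high

# Head count implied if the total is spread at the quoted per-clinician rate.
implied_clinicians = total_admin_cost / cost_per_clinician

# Request volume implied IF the whole total were per-request processing cost
# (an assumption made only for this check, not a claim from the source).
implied_requests_high = total_admin_cost / cost_per_request[0]
implied_requests_low = total_admin_cost / cost_per_request[1]

print(f"implied clinicians: {implied_clinicians / 1e6:.1f} million")
print(
    "implied requests per year: "
    f"{implied_requests_low / 1e9:.2f} to {implied_requests_high / 1e9:.2f} billion"
)
```

The exercise only shows the order of magnitude the quoted numbers imply when taken together: millions of clinicians and on the order of a billion requests a year.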

And a 1996 study — thirty years old now — already concluded that drug formularies and prior approval raise total medical costs rather than lower them. A 2026 meta-review of 25 studies found PA was associated with “disease exacerbation, preventable hospitalizations, prolonged hospital stays, and lower rates of disease-free survival.” Sterman translates the last one bluntly: more people died.

The causal diagram explains why. Tighter PA delays and degrades care. Patients get sicker. Sicker patients come back, often through emergency departments and hospital readmissions, requiring more tests and more procedures. Total cost goes up. The manager, seeing costs rise, doubles down on the only lever they can see — more PA. Reinforcing loop. Around and around.
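One way to make the two loops concrete is to write each as a chain of signed causal links and apply the standard system dynamics rule that a loop's polarity is the product of its link signs. The sketch below is an assumption-laden paraphrase: the variable names and the sign assigned to each link are my reading of the loops described in the text, not Sterman's actual diagram.

```python
from math import prod

# Each link is (cause, effect, sign): +1 means the effect moves in the same
# direction as the cause, -1 means it moves the opposite way.
# Variable names are a paraphrase of the text, chosen for illustration.
intended_loop = [
    ("cost pressure", "PA strictness", +1),
    ("PA strictness", "unit cost of care", -1),
    ("unit cost of care", "total cost", +1),
    ("total cost", "cost pressure", +1),
]

hidden_loop = [
    ("cost pressure", "PA strictness", +1),
    ("PA strictness", "delayed care", +1),
    ("delayed care", "patient sickness", +1),
    ("patient sickness", "ED visits and readmissions", +1),
    ("ED visits and readmissions", "total cost", +1),
    ("total cost", "cost pressure", +1),
]


def loop_polarity(links):
    """Classify a closed causal loop by the product of its link signs."""
    # Each link must start where the previous one ended, and the last link
    # must return to the first variable, otherwise this is not a loop.
    for (_, effect, _), (cause, _, _) in zip(links, links[1:] + links[:1]):
        if effect != cause:
            raise ValueError(f"loop is not closed at {effect!r} -> {cause!r}")
    return "reinforcing" if prod(sign for _, _, sign in links) > 0 else "balancing"


print("intended loop:", loop_polarity(intended_loop))
print("hidden loop:", loop_polarity(hidden_loop))
```

The intended loop has a single negative link, so it is balancing: cost pressure eventually reduces itself. The hidden loop has none, so it is reinforcing: cost pressure feeds more cost pressure. Both loops share the same first link, which is why tightening PA activates both at once.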

Then there are the side loops. Doctors appeal denials, which raises administrative cost. Malpractice litigation rises. Unhappy patients switch plans, eroding revenue. The “control mechanism” produces every outcome it was designed to prevent.

How can we actually produce an outcome that nobody wants?

Sterman’s point is that the people running these systems are smart and well-intentioned. They are not stupid. They just cannot see what is below the waterline, and the short-run feedback — costs did go down for a quarter — strongly reinforces the broken model. Recent moves by Blue Cross Blue Shield and UnitedHealthcare to scale back PA, he notes, are decades late to a conclusion the data established in the 1990s.

Case two: the project that ate the budget

The second demonstration is a management flight simulator. Sterman runs a hardware project — a VR headset — in real time, on screen, while narrating his decisions. He names the project “kill the competition.” The simulator is built on decades of system dynamics research and lets you set workforce size, accept or decline scope changes from marketing, push for speed or quality, and overlap phases that should be sequential.

He plays it the way most managers would. Budget is tight, so he staffs lean — 70 people instead of the 75 the planner suggests. Marketing comes in mid-project with new features the competitor is launching. He accepts them. The work week climbs to 60 hours, then 78. Engineering reports that the preliminary design review went badly. HR reports burnout complaints. He applies management pressure for both progress and quality.

The detailed-design phase, originally scoped at 28 people, is now estimated to need 166. He hires 120. He tries to cut scope to recover. The simulator won’t let him — once features are accepted, you own them.

We went over budget. We took way too long. Our product quality is horrible. The goal was less than 1% defects. It’s 25%. The net present value of our losses is about $50 million.

This is not a freak run. Sterman, who has been an expert witness in half a dozen project-failure lawsuits, says this outcome is typical and often worse. The lawyers, he notes drily, are the only consistent winners.

The pedagogical point is buried in the demo. He doesn’t lecture you on Brooks’s law, the observation that adding people to a late project tends to make it later. He lets you do it, fail, and feel it. Then you replay and try a different strategy.
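The late-hiring dynamic can be illustrated with a toy model. To be clear, this is not the model inside Sterman's simulator, which the talk does not describe in detail. The function name `weeks_to_finish` and every parameter (ramp-up time, newcomer output, mentoring cost) are hypothetical values picked to make the mechanism visible.

```python
def weeks_to_finish(work, veterans, hires, ramp_weeks=10,
                    rookie_output=0.25, mentoring_cost=0.5):
    """Weeks needed to clear `work` tasks; a veteran completes 1 task per week.

    While ramping up, each hire produces `rookie_output` tasks per week and
    costs the veterans `mentoring_cost` tasks per week of lost output.
    After `ramp_weeks`, a hire counts as a veteran.
    All parameter values are made up for illustration.
    """
    week = 0
    while work > 0:
        if week < ramp_weeks:
            rate = veterans + hires * (rookie_output - mentoring_cost)
        else:
            rate = veterans + hires
        if rate <= 0:
            raise ValueError("team makes no net progress with these parameters")
        work -= rate
        week += 1
    return week


# A project that is nearly done: doubling the team makes it finish later.
print("late project, no hires:", weeks_to_finish(100, veterans=10, hires=0))
print("late project, 10 hires:", weeks_to_finish(100, veterans=10, hires=10))

# A project with a long road ahead: the same hires pay off.
print("long project, no hires:", weeks_to_finish(400, veterans=10, hires=0))
print("long project, 10 hires:", weeks_to_finish(400, veterans=10, hires=10))
```

Under these assumptions the same hiring decision helps or hurts depending on how much work remains relative to the ramp-up time, which is the kind of delay-driven result that is hard to intuit and easy to experience in a simulator.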

Why simulators, and why the sage steps off the stage

The most interesting answer in the Q&A is to a question about whether the simulator tells you what you did wrong. Sterman’s answer is no, deliberately. Telling people the answer doesn’t work.

Research shows that showing people research doesn’t work.

The mental models people carry about complex systems are powerful and constantly reinforced by short-run feedback. If an expert or a model contradicts that intuition, the natural response is to reject the model. He cites the statistician George Box: “All models are wrong. Some are useful.” If you tell someone your model says they’re wrong, they will tell you your model is too simple, or too complex, or a black box, or that you don’t understand their industry. They will not change their mind.

What does work, in Sterman’s experience, is letting people discover the failure themselves in a safe environment. The pilot analogy lands. Sully Sullenberger, after the Hudson landing, said he had never experienced a real bird strike that killed both engines — but he had practiced exactly that scenario many times in a simulator. That is the use case. Not predicting the future. Building the response patterns for things you cannot otherwise rehearse.

For organisations that don’t need a full simulator, Sterman offers a softer version: qualitative causal mapping with the system in the room. Get everyone with a stake — including your traditional adversaries, the consumer group accusing you of using sweatshop labour, the supplier you’ve been squeezing — into the same group modelling process. They will collectively see the loops a single team would miss. The substantive benefit is a better map. The meta benefit is that the participants build their own systems thinking muscle: learning to listen before concluding, to hold their expertise lightly, to act as a guide on the side rather than the sage on the stage.

What he doesn’t say

There is one thread Sterman waves at and drops. Asked whether systems thinking applies to ultra-short-term domains like trading, he admits he isn’t aware of publicly available simulators for the commodity pit, though system dynamics has been used in high-frequency domains. The implication is that the toolkit scales across time horizons, but the applications you’ve seen most are in policy-length problems — climate, healthcare, multi-year capital projects — because that is where the feedback delays make intuition reliably fail.

Key Takeaways

Policy resistance is the default failure mode: well-implemented solutions that work briefly, then get undone by the system pushing back through loops you didn’t model.
“Side effect” is rhetoric, not reality. Every effect is an effect. Calling some of them side effects is how managers protect their reputations while exposing the narrowness of their mental model.
Open-loop diagrams (identify → analyse → decide → implement) are a fantasy. Real systems are loops. Map them.
Above-the-waterline thinking is how you hit the iceberg. In US healthcare, prior authorization looks like a cost-control loop. Below the waterline, it generates the cost increases it was supposed to prevent: $35B/year in admin, sicker patients, more litigation, churn.
“All models are wrong; some are useful” (George Box) — the operating principle for working with simulators and causal maps.
Telling people they’re wrong doesn’t change their mind. Showing them research about why their intuition is wrong also doesn’t work. Letting them experience the failure in a simulator sometimes does.
The pilot analogy is the right one. You practice the bird strike that takes out both engines because you can’t practice it in flight. Same for hostile takeovers, project blowups, climate tipping points.
“Get the system in the room” — qualitative causal mapping with all the actors present (including adversaries) often delivers more value than building a formal simulator.
Project-management flight simulator failure mode: hire lean, accept scope creep, apply pressure when behind, hire more people late, find you can’t cut scope, ship late and broken. This is the modal outcome, not a freak run.
Reinforcing vs balancing loops is the fundamental vocabulary. Balancing loops control; reinforcing loops amplify. Most policy resistance comes from a balancing loop spawning a hidden reinforcing one.
The “sage on the stage” → “guide on the side” shift is part of the systems thinking skill stack — not just a teaching style, an executive style.

Claude’s Take

This is a marketing webinar for a $10,000-ish executive course, so calibrate expectations. The sales pitch is undisguised at both ends. But the middle is good.

What Sterman does well: he refuses the everything-is-connected slogan and tries to put a working tool in your hands within an hour. The “no such thing as a side effect” reframe is genuinely useful — once you hear it, you start noticing how often managers use the phrase as reputational cover. The prior authorization walkthrough is the cleanest version I’ve seen of how a balancing loop can spawn a reinforcing one and produce the opposite of the intended outcome, and the numbers are sharp enough to puncture any “but we’re saving money” defense.

What’s missing: any honest discussion of when systems thinking fails or gets misused. Causal diagrams can be drawn to justify almost any narrative if you choose your boundaries cleverly, and Sterman knows this, because he says in passing “always challenge the boundary of your model.” But he doesn’t dwell on it. The flight simulator demo is also a slightly stacked deck. The simulator wouldn’t let him cut scope at the critical moment, which is a design choice that pushes the demo toward the moral he wants. Real projects sometimes can cut scope, and the right move is often to do exactly that.

The Q&A is where you see the limits. Asked how simulation actually overturns entrenched mental models, Sterman essentially says “good question, no short answer, come to the course.” That’s honest, but it’s also the gap. The skill of running a group modelling session that changes minds is craft, not framework, and the framework alone won’t get you there.

A 7. The conceptual scaffolding is real and worth keeping — policy resistance, the open-loop fantasy, “no such thing as a side effect,” and the waterline metaphor are durable mental furniture. The healthcare case is the clearest worked example you’ll see. The project-management demo is mostly theatre but theatre that lands. Knock a point because it’s an ad; another because it papers over the boundary-choice problem; bump it back up because Sterman is the real thing and even the watered-down public version is denser than most leadership content.