Transcript: Mike Stonebraker Postgres And Future Problems

TITLE: Turing Award Winner: Postgres, Disagreeing with Google, Future Problems | Mike Stonebraker
CHANNEL: Ryan Peterman
DATE: 2026-04-20
URL: https://youtu.be/YPObBOwIrHk

---TRANSCRIPT---

Computer science may well not be a growth industry going forward.

This is Mike Stonebraker. He’s a Turing award winner famous for his fundamental contributions to database systems like creating Postgres and more. What was the hardest part of that?

Query optimizer. It’s just algorithmically difficult.

How do you identify the people who aren’t smart?

Well, I mean it’s very easy.

He shared interesting technical takes from his experience. On our benchmarks, large language models get 0%.

Why did you disagree so much with MapReduce?

That wasn’t the only thing Google was stupid about.

I’m curious your thoughts on unsolved problems in databases and what you think the future might look like. Here’s the full episode.

How Postgres got started

The first thing I want to go over is the story of how Postgres got started. But for that I kind of want to start at the beginning. How did you get into building database systems?

When I graduated I had the good fortune of being hired at Berkeley, and it was clear that continuing what I did for my PhD was not going to go anywhere. Then as well as today, you’re way ahead if you get adopted by a mentor who knows the ropes. So Gene Wong, who is still alive and still kicking, took me under his wing and said, “Well, let’s do something together.”

And this was 1971, which was the year after Ted Codd wrote his pioneering paper in CACM. Gene Wong said, “Well, let’s take a look at database stuff.” At the time the competitors were a thing called the CODASYL proposal, which you’re probably too young to have ever heard of. It was a low-level spaghetti network proposal where you executed queries by following pointers. The alternative was the IBM proposal, which was a hierarchical thing called IMS, which is still available. It’s hierarchical data. It’s a tree. You organized your data as trees.

Even at the time, IBM realized that trees were not general enough to solve many people’s problems. So they hacked on a way to make it a limited network structure, which was clearly a horrible hack. The CODASYL proposal had all kinds of bad properties. Besides being low-level and really hard to debug, it also had the property that if anything changed in your schema, you basically had to throw away everything and do it all again because it was absolutely rooted at the physical level.

Whereas Ted Codd’s stuff made perfect sense. Gene said, “Well, let’s build one of these puppies. That’s clearly the next thing to try.”

So we started building Ingres in 1972 while I was an assistant professor at Berkeley. As you know, as an assistant professor you have about five years to prove yourself, and then they either fire you or give you tenure. Ingres was my ticket to getting tenure, which happened in 1976.

That was where it started. Then again, happenstance. At the time, a lot of people would build prototypes which were sort of studenty code, which means you could get it to run, but if you gave it to anybody else, they couldn’t. So we put in the first 90% to get something we could run, and then for whatever reason we put in the next 90% to get it to where it really worked.

So the University of California version of Ingres really worked. Over the next couple years about a hundred universities started running it because Unix became the big thing. This was a free database system that ran on Unix. So it was quite popular in the academic world. We got lots of visitors at Berkeley who would say, “Gee this is really nifty looking stuff, what’s the biggest Ingres application you have?” And we’d be forced to say, “Not very big.”

This was brought home in spades when Arizona State University considered running Ingres on their student records data, all 40,000 students worth. They could get over that they had to get an unsupported operating system from Bell Labs. They could also get over that they had to run an unsupported database system from these guys at Berkeley. But the project went down in flames when they realized there was no COBOL available for Unix, and they were a COBOL shop. So unsupported OS, unsupported DB, no COBOL doomed us to irrelevance.

It was clear the only way out was to start a company. In 1980 we got venture capital as it existed then and started Ingres Corporation to move Ingres to DEC VMS — a real operating system — and we had a real company that would support Ingres. That was the start of the commercial journey.

Competing with Oracle

I saw that Ingres was competing with Larry Ellison’s offering at Oracle. I saw that Ingres was certainly better than what they were offering, but they were still competing somehow. How did they compete?

Larry Ellison is a fabulous salesman, and at the time he made present tense and future tense indistinguishable. He basically lied to customers. He would ship stuff that didn’t work and have his initial customers help him debug it. So I think he engaged in what I consider very shady business practices. Lying to customers, I think, is unconscionable.

For instance, there’s a thing called referential integrity — which is, if you fire an employee and he’s the last person in a given department, do you want to delete the department or do you want to have it be a ghost department? Ingres Corporation implemented referential integrity. Oracle Corporation wrote two manual pages that said “here’s the definition of referential integrity” which everybody agreed to. And then down at the bottom it said “not yet implemented.”

I had interviewed someone who worked at Sun Microsystems and they had a similar opinion that Larry Ellison was a little bit shady.

I also saw somewhere else that when Oracle acquired MySQL, everyone kind of got afraid and moved to Postgres. That was the genesis of Postgres replacing MySQL as the preferred open-source relational database system.

What Postgres did that Ingres didn’t

So you created Ingres and there were a lot of technical innovations so that it was better than the incumbents, but ultimately it went away and you developed Postgres. What was the thing that Ingres didn’t do that Postgres would do?

The big thing that guided us at the very beginning was GIS. The original reasoning for the academic version of Ingres was that we were going to support a geographic information system that the neighboring professor, Pravin Varaiya, wanted. To support a GIS, you need points, lines, polygons, line groups, that sort of stuff. It was clear that Ingres couldn’t do it because the data types we put into Ingres were the standard ones (integers, floats, text, strings), and you couldn’t efficiently support GIS types on top of that. As a GIS, the academic version of Ingres was a complete failure. That was in the back of our mind.

The other thing that happened (this is a little out of chronological sequence, but it helps make the point) is that around 1985 ANSI had just proposed a date and time standard for relational databases, and commercial Ingres implemented date and time using the standard Gregorian calendar. I was associated with the commercial version of Ingres as well as still being at the University of California as a professor, so I got a call from an Ingres customer who said, “You implemented date and time wrong.” I said, “Huh? We implemented the Gregorian calendar and you can subtract, and months have 30 or 31 days except for February, except for leap years. So subtraction on dates works exactly the way you would expect it to.”

But he said, “That’s not what I want.” In his particular world, he was dealing with bond financial instruments, and for some reason you got the same amount of interest on his bonds during each month no matter how long the month was. He had the date you bought the bond and the date you sold the bond. He wanted to do a subtraction, multiply it by the coupon rate, and say that’s the interest we paid you.

But of course, his version of subtraction was March 15 minus February 15 is 30 days, because that’s the definition of his calendar. So he had to retrieve two dates out to user code, do the subtraction in user code, put the answer back, and it cost him a factor of two or three in efficiency. He said, “Why can’t I just overload your definition of subtraction with what I want?” With Ingres, it was hardcoded.

The problem was this is a case where you wanted bond time just like you wanted points, lines and polygons. So Postgres was engineered to have an extendable type system. You could have whatever data types you wanted and they were very efficient. That was the main gist of Postgres — that flexibility. In business data processing most people were happy with the standard data types, but relational databases started to spread to all kinds of other places. What are called abstract data types or stored procedures — a bunch of names — had great applicability.
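A minimal sketch of the bond-time idea, in Python rather than inside a database. The `BondDate` class and `coupon_interest` function are hypothetical names, and the 30-day-month arithmetic (similar to the 30/360 day-count convention) is an assumption about what the customer's calendar looked like. The point is only that subtraction is overloaded by the type, not hardcoded.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BondDate:
    """Hypothetical user-defined date type where every month has 30 days."""
    year: int
    month: int
    day: int

    def __sub__(self, other: "BondDate") -> int:
        # Cap day-of-month at 30 so that every month contributes 30 days.
        d_self = min(self.day, 30)
        d_other = min(other.day, 30)
        return (
            360 * (self.year - other.year)
            + 30 * (self.month - other.month)
            + (d_self - d_other)
        )


def coupon_interest(bought: BondDate, sold: BondDate,
                    face: float, annual_rate: float) -> float:
    # Interest accrues on the bond calendar: 360 "days" per year.
    return face * annual_rate * (sold - bought) / 360


# Same two calendar dates, two different definitions of subtraction.
gregorian_days = (date(2023, 3, 15) - date(2023, 2, 15)).days
bond_days = BondDate(2023, 3, 15) - BondDate(2023, 2, 15)
print(gregorian_days, bond_days)
```

In a database with an extensible type system, the equivalent of `__sub__` would be registered with the engine so the subtraction runs next to the data instead of in user code.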

Postgres also supported what the AI guys at the time wanted in the way of inheritance. We also supported time travel, but the implementation absolutely sucked and it got taken out after a while. So there were a huge number of really nifty things in Postgres.

Identifying smart people (and not-smart people)

You mentioned you want to hire extraordinary software engineers and I think you’ve said before that you have no trouble finding those people. How do you identify them in your hiring?

It’s usually pretty obvious. I have a good feel for how difficult stuff is. If they get 3x the amount done in school that I think is reasonable, then they’re incredible.

On the flip side, you had this quote: “I can’t stand people who aren’t really smart. It’s challenging to talk to them.” How do you identify the people who aren’t smart?

Well, it’s very easy. You talk to them and you can rapidly surface whether they’re smart or not. What was your master’s thesis? What did you do? How did it exactly work? How did you deal with error conditions? How many processes did you have? Why didn’t you use threads? You ask them deep technical questions.

One size fits all databases (or doesn’t)

You gave a talk, and I think there’s also a paper behind it, on the idea that a one-size-fits-all database system is not optimal: one size actually fits none, and what you really want is database solutions that target specific needs.

In 2004 when I wrote the paper, we had an academic project which was building what became StreamBase. A stream processing engine looks nothing like a relational database. We had the gist of an idea for column stores for data warehouses, which was popularized by Vertica, and that looks nothing like a row store. Here were three wildly different implementations that had no resemblance to each other, and in each case they were an order of magnitude faster than the other guys. So it’s pretty clear from those three instances that with one size you give up an order of magnitude when you’re running a database system that isn’t architected for your kind of stuff.

I think that’s still true. ClickHouse is a column store. Pinecone is faster than user-defined types on text-based vector processing. It’s still very much the case. There’s no difficulty putting a common parser on top of multiple implementations. Postgres has so far chosen not to do that. They don’t implement a column store, so they are not competitive on sizable data warehouses. They also don’t have multi-node support — again for people with big data warehouses that’s table stakes.

What is true is that if you want to get going, you have a database problem — the answer is choose Postgres. There’s a huge programming community, all kinds of data type implementations, it’s free, and you can find Postgres talent easily. So it’s a great choice for lowest common denominator. Until you’re trying to do a million transactions a second it works just fine. Until you’re trying to support a petabyte data warehouse — at the low end it’s absolutely the right one-size-fits-all. At the low end it’s Postgres. At the high end that’s just not true.

GPUs

GPUs — do they make available some new opportunities to optimize databases?

Probably, but the big challenge is that GPUs are SIMD (single instruction, multiple data), and that’s anathema to indexing. Whenever indexing is the right answer, GPUs are probably not a good idea. Also, you’ve got to architect them so that the bandwidth from storage is not the bottleneck. If they’re an add-on to the CPU, as often as not the bus connecting the GPU to the CPU is a bottleneck.

Can you explain why indexing would be not as effective when there’s SIMD?

Let’s say I’m looking for Ryan’s salary and I have a B-tree. You go to the root, you find the dividers on either side of Ryan, and you follow the pointer between them. That’s a memory access for sure. Then you do it all again, and you do this three or four times. Each step depends on the one before it, so that doesn’t parallelize well.
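The contrast can be sketched as follows. This is a toy two-level tree, not a real B-tree implementation, and the `Node`, `btree_lookup`, and `scan_lookup` names and the salary figures are invented for illustration. The lookup loop cannot start reading the next node until the current comparison finishes, while the scan performs comparisons that are all independent of one another.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list                                     # dividers, or record keys in a leaf
    children: list = field(default_factory=list)   # empty for a leaf
    values: list = field(default_factory=list)     # used by leaves only


def btree_lookup(root, key):
    """Return (value, number of nodes visited)."""
    node, hops = root, 1
    while node.children:
        # Which child to read is unknown until this comparison is done,
        # so each level is a dependent memory access.
        node = node.children[bisect_right(node.keys, key)]
        hops += 1
    if key in node.keys:
        return node.values[node.keys.index(key)], hops
    return None, hops


def scan_lookup(names, salaries, key):
    # Every comparison is independent of the others, which is the shape of
    # work that data-parallel hardware handles well.
    matches = [name == key for name in names]
    return salaries[matches.index(True)] if True in matches else None


leaf1 = Node(keys=["Alice", "Bob"], values=[100, 110])
leaf2 = Node(keys=["Mike", "Ryan"], values=[120, 130])
leaf3 = Node(keys=["Sam", "Zoe"], values=[140, 150])
root = Node(keys=["Mike", "Sam"], children=[leaf1, leaf2, leaf3])

print(btree_lookup(root, "Ryan"))
print(scan_lookup(["Alice", "Bob", "Mike", "Ryan", "Sam", "Zoe"],
                  [100, 110, 120, 130, 140, 150], "Ryan"))
```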

When you first implemented that first version of Ingres, did you write all of that by hand?

Yeah, we wrote the original version of Ingres all from scratch.

What was the hardest part?

Query optimizer. It’s just algorithmically difficult. It’s still — if you ask most any senior database programmer what’s the hardest part, they’ll still say the optimizer.

Disagreeing with Google (MapReduce and eventual consistency)

MapReduce came out at some point in the early 2000s and kind of took the data world by storm. People were really impressed by it. They thought Google really knows what they’re doing. But it seems like when I look at the literature, you kind of disagreed heavily.

There were a lot of not very enlightened people who said Google is really smart, they must know what they’re doing, and so we’ll do whatever they say. They would engage with Hadoop. But Hadoop is ridiculously inefficient. Dave DeWitt and others who were involved in our 2011 paper — we understood distributed databases and understood that you could beat the heck out of Hadoop with a distributed database system. That’s basically what that 2011 paper says. And of course, it’s true.

But that wasn’t the only thing Google was stupid about. Google also had the opinion that eventual consistency was the right way to do concurrency control. That was postulated from on high by Google all during that same period of time. All the database people said, “You’re out of your friggin’ mind, because it solves only one particular kind of problem, and that very rarely occurs in practice.”

Why did they pursue eventual consistency?

The idea is that you have an east coast database and a west coast database and they’re replicas. You want them to be the same. If you say I’m going to do a transaction — I’m going to decrement by one the number of widgets in the west coast warehouse — before I commit, I’m going to update the east coast warehouse, pay a message over and back to update it, and then to make sure everything goes well it takes another roundtrip message to make sure both actually do the commit correctly. So it’s expensive to do a distributed commit, and it still is.

The idea was: well, you do the west coast update, you decrease the widgets by one, you just send a message asynchronously — not in a transaction — so that eventually the east coast warehouse gets decremented by one.

Meanwhile, if you’re on the east coast, you decrement foodstuffs by one. You send an asynchronous message. Eventually the west coast gets it and eventually everything settles out. If you’re allowed to go below zero, then what will happen is if the east coast guy and the west coast guy simultaneously sell the last widget, eventually the state of the warehouse will be minus one and somebody won’t get their widget.

If you’re allowed like Amazon to say “usually ships in 24 hours,” then maybe you can oversell. But most enterprises can’t do that. Eventual consistency just doesn’t work. Take integrity constraints in a sales system: one constraint is that stock is greater than minus one, and that fails with eventual consistency.
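The oversell scenario can be reproduced in a few lines. This is a toy simulation with a hypothetical `Warehouse` class, not a model of any real replication protocol. Each replica checks stock locally, applies the sale, and ships the decrement to its peer asynchronously.

```python
class Warehouse:
    """Toy replica that applies sales locally and replicates them later."""

    def __init__(self, name, stock):
        self.name = name
        self.stock = stock
        self.outbox = []  # decrements not yet delivered to the peer

    def sell_one(self):
        # Local check only: the peer's in-flight sale is invisible here.
        if self.stock <= 0:
            return False
        self.stock -= 1
        self.outbox.append(-1)
        return True

    def deliver_to(self, peer):
        # Asynchronous replication: apply queued decrements at the peer.
        for delta in self.outbox:
            peer.stock += delta
        self.outbox.clear()


west = Warehouse("west", 1)
east = Warehouse("east", 1)

# Both coasts sell the last widget before either message arrives.
west_ok = west.sell_one()
east_ok = east.sell_one()

# Eventually the messages are delivered and the replicas agree.
west.deliver_to(east)
east.deliver_to(west)
print(west_ok, east_ok, west.stock, east.stock)
```

Both sales succeed and the replicas converge, but they converge on a stock of minus one, which violates the constraint.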

Jeff Dean of Google finally figured that out, and when they did Spanner, Spanner had a conventional transactional system. So Google completely abandoned eventual consistency and completely abandoned MapReduce.

So the trade-off is basically correctness for performance.

Did you ever talk to the Google team while they were doing those things you thought were so wrong?

We talked to them before the 2011 paper and said, “Why don’t we partner up and do some stuff?” They weren’t interested. They declined.

Amazon has too many databases

Have you seen other examples at other big tech companies where you actively disagree with their databases or database solutions? Like Amazon or Facebook.

I gave a talk at Amazon maybe three years ago and told them all the things I thought they were doing wrong. Amazon’s problem is they are supporting 15 different database systems, and that’s about 12 too many. They have their own culture. I said you’re supporting too many database systems, and at this point they haven’t chosen to retire any of them.

Why do you say the 15 should be three?

They’re supporting a graph-based database system, and it’s well understood that a graph-based database system is almost never the performant option. If you want a graph, if you like the idea of a user interface that deals with nodes and edges, that’s fine. Put a layer on top of a relational database system that gives you that user model. For most of their database systems, there’s another of their database systems that’s better at what it does. The answer is you should retire any database system that isn’t performant in a big enough market to justify the maintenance.
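A minimal sketch of what a graph layer over relational tables could look like, using SQLite from Python's standard library as a stand-in relational engine. The `nodes` and `edges` tables and the `neighbors` and `reachable` functions are hypothetical; the user sees nodes and edges, while the storage and the traversal are plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE edges (src INTEGER REFERENCES nodes(id),
                        dst INTEGER REFERENCES nodes(id));
    CREATE INDEX edges_by_src ON edges(src);
""")
conn.executemany("INSERT INTO nodes VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c"), (4, "d")])
# a -> b -> c -> a forms a cycle; d -> a feeds into it.
conn.executemany("INSERT INTO edges VALUES (?, ?)",
                 [(1, 2), (2, 3), (3, 1), (4, 1)])


def neighbors(node_id):
    """Names of nodes one edge away from node_id."""
    rows = conn.execute(
        "SELECT n.name FROM edges e JOIN nodes n ON n.id = e.dst "
        "WHERE e.src = ? ORDER BY n.name", (node_id,))
    return [r[0] for r in rows]


def reachable(node_id):
    """Names of all nodes reachable from node_id, including itself."""
    rows = conn.execute("""
        WITH RECURSIVE reach(id) AS (
            SELECT ?
            UNION              -- UNION drops duplicates, so cycles terminate
            SELECT e.dst FROM edges e JOIN reach r ON e.src = r.id
        )
        SELECT n.name FROM reach JOIN nodes n ON n.id = reach.id
        ORDER BY n.name
    """, (node_id,))
    return [r[0] for r in rows]


print(neighbors(1), reachable(4))
```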

Why academia over industry

You’ve influenced industry significantly from academia. Why not work directly in industry, or why prefer the position of being in academia?

Because that gives you a boss. That gives you company rules, limits your ability to publish, limits your ability to go talk at conferences, limits your ability to go poke at what various competitors are doing that they won’t tell their competitors. But mostly I really like being in startups. After the commercial version of Postgres got acquired by Informix, I was working part-time for Informix, which was a 2,000 person company, and I didn’t feel like I could make a difference because it was bureaucratic and whatever the president wanted he got. I’m just not cut out for politicking. I don’t do that very well, and I have a hard time interacting with people I think are dumb. I have some problems with big companies.

DBOS — replacing the upper half of Linux with a database

I want to talk about DBOS. I just thought it was a really interesting technical model. Can you explain what DBOS is?

We started the academic project in 2019 or 2020. The gist was — at that point Matei Zaharia, who is on the faculty at Stanford, was also one of the founders of Databricks, original creator of Spark. At the time Databricks was basically running people’s Spark jobs on the cloud. He said, “At any given time we might be orchestrating a million Spark jobs. We have to write a scheduler that’s going to decide who to run next at scale of a million. We tried all the schedulers written by the OS folks and they couldn’t scale.”

So we put all the scheduling data in a Postgres database, and basically a Postgres application was doing scheduling. Then it clicked that by and large most everything you do in an operating system is managing data at scale, and you should do that using database technology. So why don’t we just replace at least the upper half of Linux with a database system?
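A minimal sketch of scheduling as a database application. SQLite stands in for Postgres here, and the `tasks` schema, the priority rule, and the `claim_next` function are assumptions made for illustration; they are not the scheduler described in the interview.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tasks (
        id       INTEGER PRIMARY KEY,
        job      TEXT NOT NULL,
        priority INTEGER NOT NULL,
        state    TEXT NOT NULL DEFAULT 'pending',
        worker   TEXT
    )
""")
# The scheduling policy is just an index plus an ORDER BY.
conn.execute("CREATE INDEX pending_by_priority ON tasks(state, priority DESC, id)")
conn.executemany("INSERT INTO tasks (job, priority) VALUES (?, ?)",
                 [("etl-nightly", 1), ("ad-hoc-query", 5), ("model-train", 5)])
conn.commit()


def claim_next(worker):
    """Pick the highest-priority pending task and mark it running."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT id, job FROM tasks WHERE state = 'pending' "
            "ORDER BY priority DESC, id LIMIT 1").fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE tasks SET state = 'running', worker = ? WHERE id = ?",
            (worker, row[0]))
        return row[1]


order = [claim_next("w1"), claim_next("w2"), claim_next("w1"), claim_next("w2")]
print(order)
```

Changing the policy means changing a query, and the scheduler's state inherits whatever durability and indexing the database already provides.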

That was the gist of the academic project. We worked on it at Berkeley and Stanford in the early 2020s and it was very successful. It clearly worked. In the process, the Stanford folks wrote an extension to JavaScript so that you could program against it, because you need some programming world that can talk to your implementation.

If you’re doing what amounts to a programming language and you’re running on top of what amounts to an operating system that is a database, then the obvious thing to do is put all your state in the database. That’s exactly what they did. We had an innovative programming language model and an innovative operating system model.

Then the idea was, can we start a company? We talked to the VCs who to a person said, “You’re dreaming if you think you’re going to displace Linux. However, this programming language stuff is really nifty.” We had extensions to JavaScript that would allow any program to have all the nice features of a database system. Stuff was durable. You could have transactions. If it failed, you’d fail over. All that nifty stuff.

We got funded to start a company in 2023, and that was DBOS Incorporated. We decided to keep the name of the project since it had always been the name. We were basically in the programming language business. Currently DBOS has versions for TypeScript, Java, Go, and Python, which are basically seamless. It runs what look like vanilla programs.

In the world of the cloud, there’s every incentive to structure your application as a workflow. So we decided we would support a workflow system, period. The workflow that DBOS supports in those four languages — the steps in a workflow, the individual micro apps, whatever you want to call them, are transactional. Workflows are durable, so once you do a step it’s not forgotten. It’s clear that we can make workflows atomic if there was a market for it, which means the whole workflow would either finish or look like it never happened. It has very nice properties and is a great deal faster and a great deal easier to use than the competition.
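The durability property can be sketched as follows. This is not the DBOS API: `run_step`, `step_log`, and the order workflow are hypothetical, and an in-memory SQLite database stands in for a durable one. Each step's output is recorded in a table, so re-running a workflow after a failure replays completed steps from the log instead of executing them again.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # a real system would use a durable database
conn.execute("""
    CREATE TABLE step_log (
        workflow_id TEXT, step_no INTEGER, output TEXT,
        PRIMARY KEY (workflow_id, step_no)
    )
""")

calls = []  # records which step bodies actually executed


def run_step(workflow_id, step_no, fn, *args):
    row = conn.execute(
        "SELECT output FROM step_log WHERE workflow_id = ? AND step_no = ?",
        (workflow_id, step_no)).fetchone()
    if row is not None:
        return json.loads(row[0])  # step already done: reuse recorded output
    result = fn(*args)
    with conn:  # checkpoint the step's output
        conn.execute("INSERT INTO step_log VALUES (?, ?, ?)",
                     (workflow_id, step_no, json.dumps(result)))
    return result


def reserve(item):
    calls.append("reserve")
    return {"item": item, "reserved": True}


def charge(amount, fail):
    calls.append("charge")
    if fail:
        raise RuntimeError("simulated crash")
    return {"charged": amount}


def order_workflow(workflow_id, fail_on_charge=False):
    reserved = run_step(workflow_id, 0, reserve, "widget")
    charged = run_step(workflow_id, 1, charge, 100, fail_on_charge)
    return {**reserved, **charged}


try:
    order_workflow("order-1", fail_on_charge=True)  # fails in step 1
except RuntimeError:
    pass
result = order_workflow("order-1")  # resumes: step 0 is not re-executed
print(calls, result)
```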

The idea is you want to make the state of your application persistent by putting it in the database, and then you figure out how to do it fast. Our business model is very much: get leaf-level programmers interested. Tell us, leaf-level programmer, what you need that we don’t have, get it built quickly, and convince people to try it. We’ve been very successful with other startups who want to choose the best thing, and we’re starting to be successful with the big boys.

About two-thirds of the customers are doing agentic AI, which means they have a large language model surrounded by a bunch of stuff that adds more signal. So far, most agentic AI is read-only, meaning you want to produce a prediction for whether Ryan is going to be a good customer or not. It just runs some stuff and then produces a new thing that’s given to somebody. Basically read-only, which means you’re not actually updating Ryan’s credit rating. I think fairly quickly the whole world is going to move to using agents to do read-write applications, and that’s going to make them very databasey. DBOS does that stuff really really well.

If you want to write an agent or two agents that move $100 from my account to your account — you debit my account, you increment your account, and these two agents have to agree to commit or you have to back everything out. The workflow needs to be what I called atomic, which is it all happens or it looks like it never happened.
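A minimal sketch of the all-or-nothing behavior, simplified to one database and one process rather than two agents coordinating a commit. The `accounts` table, the balances, and the `transfer` function are invented for illustration. The credit is applied first on purpose, so that a failed debit shows the credit being backed out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (owner TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("mike", 150), ("ryan", 20)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither."""
    try:
        with conn:  # one transaction: commit on success, rollback on error
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was rolled back.
        return False


def balances():
    return dict(conn.execute("SELECT owner, balance FROM accounts"))


first = transfer("mike", "ryan", 100)   # succeeds
second = transfer("mike", "ryan", 100)  # would overdraw, so nothing changes
print(first, second, balances())
```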

What’s being offered in the market today differs from the original research project where that was actually swapping out the guts of an operating system with a database. There’s got to be some trade-off there.

Well, a file system written on top of a DBMS is faster than the Linux file system. The scheduling engine is competitive with other scheduling engines. You can make everything fail over, so you get high availability without having to do anything else. There’s really no downside.

Then why wouldn’t Linux incorporate that and upgrade itself?

You hope they would. In other words, you should keep all the device driver junk down at the bottom because there’s a lot of it and no one wants to do that, and replace everything else with the database implementation.

Is that something you’ve mentioned to Linux people?

Back in the academic project when I’d mention that to operating system folks they would get very threatened, which is — this is the database guys trying to take over their turf. The programming language guys ditto, which is the way to implement the runtime for a programming environment is with a database.

That’s interesting. If it’s objectively true, then maybe it will take over.

Well, it took Java 10 years to become widely accepted. I just think the time constant is substantial.

The future of databases and LLMs

I’m curious your thoughts on unsolved problems in databases and what you think the future might look like.

Two different things. The first — like everyone else, three years ago we started to look at what large language models were good for. We’ve been trying to get what’s now called text-to-SQL to work on real world databases, especially real world data warehouses. We’ve been trying the technology on four different production databases where we’ve gotten the workload — the actual workload that’s run — and gotten the users to reverse engineer the text that corresponds to that sequence. So we have text and SQL for four benchmarks.

When you say text-to-SQL, you mean like a human prompting a model?

That text would be, you know, “every tenure-track professor over 40 years old” or “tell me all the professors at MIT who won the Turing Award.” An LLM is supposedly good at that. For text-to-SQL benchmarks, there’s one called Spider and another called Bird. The best LLM systems are pretty good at those benchmarks: 80% accuracy or better.

Not superhuman?

Not superhuman, but pretty good, like you would consider using them. Current leaderboard is something like 85% accuracy, which is getting there. You could say it’s not quite ready for prime time, but it certainly looks pretty good.

Well, on our benchmarks, large language models get 0%. And if you enhance them with RAG and all the tricks, it goes to 10%. And if you give as a prompt the FROM clause — in other words, all the actual tables that need to be accessed and all the actual JOIN clauses that need to be joined — then accuracy goes to about 35%.

So this stuff is definitely not ready for prime time, and it’s not going to be for a while, if ever.

What’s the difference? Number one, LLMs are trained on “the pile.” Data warehouse data is not in the pile, and there’s an adage that if you haven’t seen the data a couple times before, you have no chance of regurgitating it. Number two, query complexity on Spider and Bird is maybe 10 to 20 lines of SQL. In real-world data warehouses it’s 100 lines of SQL, so complexity is bigger. Number three, the schema in Spider and Bird is clean: table names are mnemonic, column names are mnemonic, and there’s no duplication. In data warehouses, people have materialized views all the time, so there’s redundancy, and column names are often underscore-zoppers-blah. They’re not mnemonic. That makes it a lot harder. They also have idiosyncratic data. J-term is a popular thing at MIT: it’s a one-month term in January. It’s not unique to MIT, but it’s not very common. So not being in the pile, idiosyncratic data, complex queries, and a messy schema make it not work. Those are true of every data warehouse I know of. The technology simply doesn’t work and isn’t going to work anytime soon.

So what do you do? We published our benchmark. It’s called Beaver, which is an anonymized and abstracted version of these four data warehouses. If you think you’re really good at doing text-to-SQL, try a real benchmark, not a fake one.

Number two, if you don’t have all the JOIN terms and you don’t have the FROM clause, you’re toast. What’s more, if you don’t break down the query into simpler pieces, you’re toast. That says you want to give your retrieval system simpler pieces which include the FROM clause and include JOIN terms.

Number three, the minute you want to talk to two different structured databases, like your data warehouse and your CRM system, it’s pretty clear to me that doing a structured data join using an LLM is a bad idea. You’re much better off leaving them as tables and doing a join in SQL.

Our point of view is turning everything into tables. We’re working with the Department of Transportation in the city of Munich, Germany. They have six people full-time who are answering citizens’ complaints and queries. “How come I don’t have enough time to cross this intersection next to my house before the light turns?” “How come the trolley doesn’t stop long enough for me to get on?” “How come the trolley doesn’t come more than once an hour?” Their database is the trolley schedule — that’s SQL. The light sequencing — that’s SQL. The intersections — that’s CAD. Federal German regulations — text. City of Munich regulations — text. You got to join SQL, SQL, CAD, text and text. Our point of view is turn it all into SQL, all into tables, and do a join with what amounts to a query optimizer. That’s what we’re working on. Other people will have other ideas, but it’s extremely fertile area because people really want to do it.
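A minimal sketch of the "turn it all into tables" approach for the crossing-time complaint. Every table name, column, and number below is invented for illustration, and the steps that would extract rows from CAD drawings and regulation text are assumed to have already run. Once everything is a table, the question becomes an ordinary join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Already relational: light sequencing.
    CREATE TABLE signal_timing (intersection_id INTEGER, walk_seconds REAL);
    -- Hypothetical output of an extraction step over CAD drawings.
    CREATE TABLE crossing_geometry (intersection_id INTEGER, width_m REAL);
    -- Hypothetical output of an extraction step over regulation text.
    CREATE TABLE regulation (rule TEXT, value REAL);

    INSERT INTO signal_timing VALUES (1, 12.0), (2, 20.0);
    INSERT INTO crossing_geometry VALUES (1, 18.0), (2, 18.0);
    INSERT INTO regulation VALUES ('assumed_walking_speed_m_per_s', 1.2);
""")

# Which intersections give pedestrians less time than the crossing needs?
rows = conn.execute("""
    SELECT s.intersection_id,
           s.walk_seconds,
           g.width_m / r.value AS seconds_needed
    FROM signal_timing s
    JOIN crossing_geometry g ON g.intersection_id = s.intersection_id
    JOIN regulation r ON r.rule = 'assumed_walking_speed_m_per_s'
    WHERE s.walk_seconds < g.width_m / r.value
    ORDER BY s.intersection_id
""").fetchall()
print(rows)
```

The hard part in practice is the extraction into tables, which this sketch skips entirely; the join itself is what a query optimizer already knows how to plan.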

Number two, we talked earlier about agentic AI. The minute this becomes read-write it’s a distributed database problem and you want atomicity, consistency, all that stuff. That’s pretty much what I’m working on now.

On that benchmark where it’s 0% right now, what percent is human? If you took someone who really knows SQL?

Once you disambiguate the text, a knowledgeable SQL programmer with the schema will get very high accuracy.

Like 90%?

At least.

Closing advice

For people who want to deeply understand databases and are looking for material to study, is there a book?

Joey Hellerstein and I published what’s called the Red Book, which is Readings in Database Systems. It’s now eight years old. That would be a great set of readings for eight years ago. Beyond that, popular papers from the literature.

If you could go back to yourself when you just graduated, what would you say?

Back when I first took the job at Berkeley, without thinking about it much, we said, “Let’s write a database system.” We knew nothing about databases, nothing about implementations. We were not skilled programmers like Bill Joy. Starting off doing something that crazy was really pretty crazy. You put in the effort, you make stuff work, and you learn along the way. So the answer is: think outside the box, think crazy thoughts, and try to do them.

The better question is if you were starting out today, what would you major in? Because computer science may well not be a growth industry going forward. I’m not sure I would recommend 18 year olds to major in computer science. Healthcare and the building trades are safe bets and everything else looks much riskier.

If you’re about to get your PhD and are trying to decide what to do, life is pretty easy — take the most prestigious job you can get, find a mentor who’s willing to help you, and pick some area that isn’t going with the flow. Our stuff, which is called Rubicon, is definitely not going with the flow. Choose something not going with the flow and try to make it work.

Both my wife and I said: “Follow your passion. Somehow the money will work out.” I don’t believe that for a minute, but I think that’s what you have to tell your kids.

And your grandkids.

If you don’t believe it, why do you have to tell them that?

My wife is a good example. She has a master’s degree in computer science, undergraduate degree in computer science, and she wanted to be a K-12 teacher. Her parents said, “You can’t do that. It doesn’t pay enough money.” Ever since, she’s regretted that decision. She wasn’t passionate about doing computer science — it was simply a trade.

Find something you’re passionate about and you won’t starve. You may not make a lot of money, but chances are you’ll be happier than if you do something you’re not passionate about. A lot of people I know view their job as simply a job — life is what happens between 5pm and 8am. I don’t feel that way at all. I really like what I do. Wouldn’t matter whether I made a lot of money or didn’t.

Thank you so much for your time.