How Not to Be Wrong
An Excerpt From
How Not to Be Wrong

Praise for How Not to Be Wrong


“Brilliantly engaging . . . Ellenberg’s talent for finding real-life situations that enshrine mathematical principles would be the envy of any math teacher. He presents these in fluid succession, like courses in a fine restaurant, taking care to make each insight shine through, unencumbered by jargon or notation. Part of the sheer intellectual joy of the book is watching the author leap nimbly from topic to topic, comparing slime molds to the Bush-Gore Florida vote, criminology to Beethoven’s Ninth Symphony. The final effect is of one enormous mosaic unified by mathematics.”

—Manil Suri, The Washington Post

“Easy to follow, humorously presented . . . This book will help you to avoid the pitfalls that result from not having the right tools. It will help you realize that mathematical reasoning permeates our lives—that it can be, as Mr. Ellenberg writes, a kind of ‘X-ray specs that reveal hidden structures underneath the messy and chaotic surface of the world.’”

—Mario Livio, The Wall Street Journal

“Witty, compelling, and just plain fun to read . . . How Not to Be Wrong can help you explore your mathematical superpowers.”

—Evelyn Lamb, Scientific American

“Mathematicians from Charles Lutwidge Dodgson to Steven Strogatz have celebrated the power of mathematics in life and the imagination. In this hugely enjoyable exploration of everyday maths as ‘an atomic-powered prosthesis that you attach to your common sense,’ Jordan Ellenberg joins their ranks. Ellenberg, an academic and Slate’s ‘Do the Math’ columnist, explains key principles with erudite gusto—whether poking holes in predictions of a U.S. ‘obesity apocalypse,’ or unpicking an attempt by psychologist B. F. Skinner to prove statistically that Shakespeare was a dud at alliteration.”


“The book is filled to the rim with anecdotes and ‘good-to-know’ facts. And Ellenberg does not shy away from delving deeply into most topics, both in terms of the underlying mathematical concepts and the background material, which he has researched meticulously. . . . Whereas the book may be aimed at a general audience, who wonder how the mathematics they learned at school might ever be useful, there is much on offer for those who have chosen a professional career in the sciences even when the fundamental ideas discussed are not new. It’s a bit like walking through a well-curated exhibition of a favored painter. Many works you know inside out, but the context and the logic of the presentation may offer refreshing new perspectives and insights.”

Nature Physics

“Refreshingly lucid while still remaining conceptually rigorous, this book lends insight into how mathematicians think—and shows us how we can start to think like mathematicians as well.”

The New York Times Book Review

“A poet-mathematician offers an empowering and entertaining primer for the age of Big Data. . . . A rewarding popular math book for just about anyone.”

—Laura Miller, Salon

“A fresh application of complex mathematical thinking to commonplace events . . . How Not to Be Wrong is beautifully written, holding the reader’s attention throughout with well-chosen material, illuminating exposition, wit, and helpful examples. I am reminded of the great writer of recreational mathematics, Martin Gardner: Ellenberg shares Gardner’s remarkable ability to write clearly and entertainingly, bringing in deep mathematical ideas without the reader registering their difficulty.”

Times Higher Education (London)

“Ellenberg tells engaging, even exciting stories about how ‘the problems we think about every day—problems of politics, of medicine, of commerce, of theology—are shot through with mathematics.’”

The Washington Post (blog)

“A collection of fascinating examples of math and its surprising applications . . . How Not to Be Wrong is full of interesting and weird mathematical tools and observations.”

Business Insider

“Wry, accessible, and entertaining . . . Ellenberg finds the commonsense math at work in the everyday world, and his vivid examples and clear descriptions show how ‘math is woven into the way we reason.’”

Publishers Weekly (starred review)

“Witty and expansive, Ellenberg’s math will leave readers informed, intrigued, and armed with plenty of impressive conversation starters.”

Kirkus Reviews

“Readers will indeed marvel at how often mathematics shed unexpected light on economics (assessing the performance of investment advisors), public health (predicting the likely prevalence of obesity in thirty years), and politics (explaining why wealthy individuals vote Republican but affluent states go for Democrats). Relying on remarkably few technical formulas, Ellenberg writes with humor and verve as he repeatedly demonstrates that mathematics simply extends common sense.”


“How Not to Be Wrong is a cheery manifesto for the utility of mathematical thinking. Ellenberg’s prose is a delight—informal and robust, irreverent yet serious. Maths is ‘an atomic-powered prosthesis that you attach to your common sense, vastly multiplying its reach and strength,’ he writes. Doing maths ‘is to be, at once, touched by fire and bound by reason. Logic forms a narrow channel through which intuition flows with vastly augmented force.’”

The Guardian (London)

“The title of this wonderful book explains what it adds to the honorable genre of popular writing on mathematics. Like Lewis Carroll, George Gamow, and Martin Gardner before him, Jordan Ellenberg shows how mathematics can delight and stimulate the mind. But he also shows that mathematical thinking should be in the toolkit of every thoughtful person—of everyone who wants to avoid fallacies, superstitions, and other ways of being wrong.”

—Steven Pinker, Johnstone Family Professor of Psychology, Harvard University, and author of How the Mind Works

“Brilliant and fascinating! Ellenberg shows his readers how to magnify common sense using the tools usually only accessible to those who have studied higher mathematics. I highly recommend it to anyone interested in expanding their worldly savviness—and math IQ!”

—Danica McKellar, actress and bestselling author of Math Doesn’t Suck and Kiss My Math

“Jordan Ellenberg promises to share ways of thinking that are both simple to grasp and profound in their implications, and he delivers in spades. These beautifully readable pages delight and enlighten in equal parts. Those who already love math will eat it up, and those who don’t yet know how lovable math is are in for a most pleasurable surprise.”

—Rebecca Newberger Goldstein, author of Plato at the Googleplex

“With math as with anything else, there’s smart, and then there’s street smart. This book will help you be both. Fans of Freakonomics and The Signal and the Noise will love Ellenberg’s surprising stories, snappy writing, and brilliant lessons in numerical savvy. How Not to Be Wrong is sharp, funny, and right.”

—Steven Strogatz, Jacob Gould Schurman Professor of Applied Mathematics, Cornell University, and author of The Joy of x

“Every page is a stand-alone, positive, and ontological examination of the beauty and surprise of mathematical discovery.”

—Cathy O’Neil,



Jordan Ellenberg is the Vilas Distinguished Achievement Professor of Mathematics at the University of Wisconsin-Madison. His writing has appeared in Slate, The Wall Street Journal, The New York Times, The Washington Post, The Boston Globe, and The Believer.



Right now, in a classroom somewhere in the world, a student is mouthing off to her math teacher. The teacher has just asked her to spend a substantial portion of her weekend computing a list of thirty definite integrals.

There are other things the student would rather do. There is, in fact, hardly anything she would not rather do. She knows this quite clearly, because she spent a substantial portion of the previous weekend computing a different—but not very different—list of thirty definite integrals. She doesn’t see the point, and she tells her teacher so. And at some point in this conversation, the student is going to ask the question the teacher fears most:

“When am I going to use this?”

Now the math teacher is probably going to say something like:

“I know this seems dull to you, but remember, you don’t know what career you’ll choose—you may not see the relevance now, but you might go into a field where it’ll be really important that you know how to compute definite integrals quickly and correctly by hand.”

This answer is seldom satisfying to the student. That’s because it’s a lie. And the teacher and the student both know it’s a lie. The number of adults who will ever make use of the integral of (1 − 3x + 4x²)⁻² dx, or the formula for the cosine of 3θ, or synthetic division of polynomials, can be counted on a few thousand hands.

The lie is not very satisfying to the teacher, either. I should know: in my many years as a math professor I’ve asked many hundreds of college students to compute lists of definite integrals.

Fortunately, there’s a better answer. It goes something like this:

“Mathematics is not just a sequence of computations to be carried out by rote until your patience or stamina runs out—although it might seem that way from what you’ve been taught in courses called mathematics. Those integrals are to mathematics as weight training and calisthenics are to soccer. If you want to play soccer—I mean, really play, at a competitive level—you’ve got to do a lot of boring, repetitive, apparently pointless drills. Do professional players ever use those drills? Well, you won’t see anybody on the field curling a weight or zigzagging between traffic cones. But you do see players using the strength, speed, insight, and flexibility they built up by doing those drills, week after tedious week. Learning those drills is part of learning soccer.

“If you want to play soccer for a living, or even make the varsity team, you’re going to be spending lots of boring weekends on the practice field. There’s no other way. But now here’s the good news. If the drills are too much for you to take, you can still play for fun, with friends. You can enjoy the thrill of making a slick pass between defenders or scoring from distance just as much as a pro athlete does. You’ll be healthier and happier than you would be if you sat home watching the professionals on TV.

“Mathematics is pretty much the same. You may not be aiming for a mathematically oriented career. That’s fine—most people aren’t. But you can still do math. You probably already are doing math, even if you don’t call it that. Math is woven into the way we reason. And math makes you better at things. Knowing mathematics is like wearing a pair of X-ray specs that reveal hidden structures underneath the messy and chaotic surface of the world. Math is a science of not being wrong about things, its techniques and habits hammered out by centuries of hard work and argument. With the tools of mathematics in hand, you can understand the world in a deeper, sounder, and more meaningful way. All you need is a coach, or even just a book, to teach you the rules and some basic tactics. I will be your coach. I will show you how.”

For reasons of time, this is seldom what I actually say in the classroom. But in a book, there’s room to stretch out a little more. I hope to back up the grand claims I just made by showing you that the problems we think about every day—problems of politics, of medicine, of commerce, of theology—are shot through with mathematics. Understanding this gives you access to insights accessible by no other means.

Even if I did give my student the full inspirational speech, she might—if she is really sharp—remain unconvinced.

“That sounds good, Professor,” she’ll say. “But it’s pretty abstract. You say that with mathematics at your disposal you can get things right you’d otherwise get wrong. But what kinds of things? Give me an actual example.”

And at that point I would tell her the story of Abraham Wald and the missing bullet holes.


This story, like many World War II stories, starts with the Nazis hounding a Jew out of Europe and ends with the Nazis regretting it. Abraham Wald was born in 1902 in what was then the city of Klausenburg in what was then the Austro-Hungarian Empire. By the time Wald was a teenager, one world war was in the books and his hometown had become Cluj, Romania. He was the grandson of a rabbi and the son of a kosher baker, but the younger Wald was a mathematician almost from the start. His talent for the subject was quickly recognized, and he was admitted to study mathematics at the University of Vienna, where he was drawn to subjects abstract and recondite even by the standards of pure mathematics: set theory and metric spaces.

But when Wald’s studies were completed, it was the mid-1930s, Austria was deep in economic distress, and there was no possibility that a foreigner could be hired as a professor in Vienna. Wald was rescued by a job offer from Oskar Morgenstern. Morgenstern would later immigrate to the United States and help invent game theory, but in 1933 he was the director of the Austrian Institute for Economic Research, and he hired Wald at a small salary to do mathematical odd jobs. That turned out to be a good move for Wald: his experience in economics got him a fellowship offer at the Cowles Commission, an economic institute then located in Colorado Springs. Despite the ever-worsening political situation, Wald was reluctant to take a step that would lead him away from pure mathematics for good. But then the Nazis conquered Austria, making Wald’s decision substantially easier. After just a few months in Colorado, he was offered a professorship of statistics at Columbia; he packed up once again and moved to New York.

And that was where he fought the war.

The Statistical Research Group (SRG), where Wald spent much of World War II, was a classified program that yoked the assembled might of American statisticians to the war effort—something like the Manhattan Project, except the weapons being developed were equations, not explosives. And the SRG was actually in Manhattan, at 401 West 118th Street in Morningside Heights, just a block away from Columbia University. The building now houses Columbia faculty apartments and some doctor’s offices, but in 1943 it was the buzzing, sparking nerve center of wartime math. At the Applied Mathematics Group−Columbia, dozens of young women bent over Marchant desktop calculators were calculating formulas for the optimal curve a fighter should trace out through the air in order to keep an enemy plane in its gunsights. In another apartment, a team of researchers from Princeton was developing protocols for strategic bombing. And Columbia’s wing of the atom bomb project was right next door.

But the SRG was the most high-powered, and ultimately the most influential, of any of these groups. The atmosphere combined the intellectual openness and intensity of an academic department with the shared sense of purpose that comes only with high stakes. “When we made recommendations,” W. Allen Wallis, the director, wrote, “frequently things happened. Fighter planes entered combat with their machine guns loaded according to Jack Wolfowitz’s* recommendations about mixing types of ammunition, and maybe the pilots came back or maybe they didn’t. Navy planes launched rockets whose propellants had been accepted by Abe Girshick’s sampling-inspection plans, and maybe the rockets exploded and destroyed our own planes and pilots or maybe they destroyed the target.”

The mathematical talent at hand was equal to the gravity of the task. In Wallis’s words, the SRG was “the most extraordinary group of statisticians ever organized, taking into account both number and quality.” Frederick Mosteller, who would later found Harvard’s statistics department, was there. So was Leonard Jimmie Savage, the pioneer of decision theory and great advocate of the field that came to be called Bayesian statistics.* Norbert Wiener, the MIT mathematician and the creator of cybernetics, dropped by from time to time. This was a group where Milton Friedman, the future Nobelist in economics, was often the fourth-smartest person in the room.

The smartest person in the room was usually Abraham Wald. Wald had been Allen Wallis’s teacher at Columbia, and functioned as a kind of mathematical eminence to the group. Still an “enemy alien,” he was not technically allowed to see the classified reports he was producing; the joke around SRG was that the secretaries were required to pull each sheet of notepaper out of his hands as soon as he was finished writing on it. Wald was, in some ways, an unlikely participant. His inclination, as it always had been, was toward abstraction, and away from direct applications. But his motivation to use his talents against the Axis was obvious. And when you needed to turn a vague idea into solid mathematics, Wald was the person you wanted at your side.

So here’s the question. You don’t want your planes to get shot down by enemy fighters, so you armor them. But armor makes the plane heavier, and heavier planes are less maneuverable and use more fuel. Armoring the planes too much is a problem; armoring the planes too little is a problem. Somewhere in between there’s an optimum. The reason you have a team of mathematicians socked away in an apartment in New York City is to figure out where that optimum is.

The military came to the SRG with some data they thought might be useful. When American planes came back from engagements over Europe, they were covered in bullet holes. But the damage wasn’t uniformly distributed across the aircraft. There were more bullet holes in the fuselage, not so many in the engines.

The officers saw an opportunity for efficiency; you can get the same protection with less armor if you concentrate the armor on the places with the greatest need, where the planes are getting hit the most. But exactly how much more armor belonged on those parts of the plane? That was the answer they came to Wald for. It wasn’t the answer they got.

The armor, said Wald, doesn’t go where the bullet holes are. It goes where the bullet holes aren’t: on the engines.

Wald’s insight was simply to ask: where are the missing holes? The ones that would have been all over the engine casing, if the damage had been spread equally all over the plane? Wald was pretty sure he knew. The missing bullet holes were on the missing planes. The reason planes were coming back with fewer hits to the engine is that planes that got hit in the engine weren’t coming back. Whereas the large number of planes returning to base with a thoroughly Swiss-cheesed fuselage is pretty strong evidence that hits to the fuselage can (and therefore should) be tolerated. If you go to the recovery room at the hospital, you’ll see a lot more people with bullet holes in their legs than people with bullet holes in their chests. But that’s not because people don’t get shot in the chest; it’s because the people who get shot in the chest don’t recover.

Here’s an old mathematician’s trick that makes the picture perfectly clear: set some variables to zero. In this case, the variable to tweak is the probability that a plane that takes a hit to the engine manages to stay in the air. Setting that probability to zero means a single shot to the engine is guaranteed to bring the plane down. What would the data look like then? You’d have planes coming back with bullet holes all over the wings, the fuselage, the nose—but none at all on the engine. The military analyst has two options for explaining this: either the German bullets just happen to hit every part of the plane but one, or the engine is a point of total vulnerability. Both stories explain the data, but the latter makes a lot more sense. The armor goes where the bullet holes aren’t.
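
The set-it-to-zero trick is easy to try out for yourself. What follows is only a toy sketch, not Wald’s method: the section names, their shares of the plane’s surface, the three hits per plane, and the rule that only engine hits can bring a plane down are all assumptions made up for illustration. The function `simulate` counts bullet holes on the planes that make it home, which is the only data the officers ever saw.

```python
import random

# Hypothetical sections and their share of the plane's surface (made-up numbers).
SECTIONS = {"engine": 0.2, "fuselage": 0.4, "wings": 0.3, "nose": 0.1}


def simulate(n_planes, engine_survival, hits_per_plane=3, seed=0):
    """Tally bullet holes per section, counting only planes that return."""
    rng = random.Random(seed)
    names = list(SECTIONS)
    weights = list(SECTIONS.values())
    holes = {name: 0 for name in names}
    returned = 0
    for _ in range(n_planes):
        # Bullets land on each section in proportion to its area.
        hits = rng.choices(names, weights=weights, k=hits_per_plane)
        # Toy assumption: each engine hit is survived with probability
        # engine_survival, and hits anywhere else are harmless.
        survived = all(
            rng.random() < engine_survival for h in hits if h == "engine"
        )
        if survived:
            returned += 1
            for h in hits:
                holes[h] += 1
    return holes, returned


# The variable set to zero: an engine hit is always fatal.
holes, returned = simulate(10_000, engine_survival=0.0)
print(returned, holes)
```

With `engine_survival=0.0` the returning planes show no engine holes at all, even though the simulated bullets were aimed at the engine as often as its area warrants. Raise the probability toward 1.0 and the engine holes reappear in the tally.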

Wald’s recommendations were quickly put into effect, and were still being used by the navy and the air force through the wars in Korea and Vietnam. I can’t tell you exactly how many American planes they saved, though the data-slinging descendants of the SRG inside today’s military no doubt have a pretty good idea. One thing the American defense establishment has traditionally understood very well is that countries don’t win wars just by being braver than the other side, or freer, or slightly preferred by God. The winners are usually the guys who get 5% fewer of their planes shot down, or use 5% less fuel, or get 5% more nutrition into their infantry at 95% of the cost. That’s not the stuff war movies are made of, but it’s the stuff wars are made of. And there’s math every step of the way.

Why did Wald see what the officers, who had vastly more knowledge and understanding of aerial combat, couldn’t? It comes back to his math-trained habits of thought. A mathematician is always asking, “What assumptions are you making? And are they justified?” This can be annoying. But it can also be very productive. In this case, the officers were making an assumption unwittingly: that the planes that came back were a random sample of all the planes. If that were true, you could draw conclusions about the distribution of bullet holes on all the planes by examining the distribution of bullet holes on only the surviving planes. Once you recognize that you’ve been making that hypothesis, it takes only a moment to realize it’s dead wrong; there’s no reason at all to expect the planes to have an equal likelihood of survival no matter where they get hit. In a piece of mathematical lingo we’ll come back to in chapter 15, the rate of survival and the location of the bullet holes are correlated.

Wald’s other advantage was his tendency toward abstraction. Wolfowitz, who had studied under Wald at Columbia, wrote that the problems he favored were “all of the most abstract sort,” and that he was “always ready to talk about mathematics, but uninterested in popularization and special applications.”

Wald’s personality made it hard for him to focus his attention on applied problems, it’s true. The details of planes and guns were, to his eye, so much upholstery—he peered right through to the mathematical struts and nails holding the story together. Sometimes that approach can lead you to ignore features of the problem that really matter. But it also lets you see the common skeleton shared by problems that look very different on the surface. Thus you have meaningful experience even in areas where you appear to have none.

To a mathematician, the structure underlying the bullet hole problem is a phenomenon called survivorship bias. It arises again and again, in all kinds of contexts. And once you’re familiar with it, as Wald was, you’re primed to notice it wherever it’s hiding.

Like mutual funds. Judging the performance of funds is an area where you don’t want to be wrong, even by a little bit. A shift of 1% in annual growth might be the difference between a valuable financial asset and a dog. The funds in Morningstar’s Large Blend category, whose mutual funds invest in big companies that roughly represent the S&P 500, look like the former kind. The funds in this class grew an average of 178.4% between 1995 and 2004: a healthy 10.8% per year.* Sounds like you’d do well, if you had cash on hand, to invest in those funds, no?

Well, no. A 2006 study by Savant Capital shone a somewhat colder light on those numbers. Think again about how Morningstar generates its number. It’s 2004, you take all the funds classified as Large Blend, and you see how much they grew over the last ten years.

But something’s missing: the funds that aren’t there. Mutual funds don’t live forever. Some flourish, some die. The ones that die are, by and large, the ones that don’t make money. So judging a decade’s worth of mutual funds by the ones that still exist at the end of the ten years is like judging our pilots’ evasive maneuvers by counting the bullet holes in the planes that come back. What would it mean if we never found more than one bullet hole per plane? Not that our pilots are brilliant at dodging enemy fire, but that the planes that got hit twice went down in flames.

The Savant study found that if you included the performance of the dead funds together with the surviving ones, the rate of return dropped down to 134.5%, a much more ordinary 8.9% per year. More recent research backed that up: a comprehensive 2011 study in the Review of Finance covering nearly 5,000 funds found that the excess return rate of the 2,641 survivors is about 20% higher than the same figure recomputed to include the funds that didn’t make it. The size of the survivorship effect might have surprised investors, but it probably wouldn’t have surprised Abraham Wald.
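
The step from ten-year growth to a yearly rate is a compounding calculation, and it can be checked in a few lines. This is a minimal sketch that assumes the quoted figures are cumulative growth over exactly ten years, compounded annually; the function name `annualized` is my own.

```python
def annualized(total_growth_pct, years):
    """Convert cumulative growth (in percent) to a compound annual rate (in percent)."""
    growth_factor = 1 + total_growth_pct / 100
    return (growth_factor ** (1 / years) - 1) * 100


survivors_only = annualized(178.4, 10)   # funds still alive at the end of the decade
with_dead_funds = annualized(134.5, 10)  # dead funds included
print(round(survivors_only, 1), round(with_dead_funds, 1))  # prints 10.8 8.9
```

Note that dividing 178.4% by ten would give the wrong answer; growth compounds, so the yearly rate is a tenth root, not a tenth.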


At this point my teenage interlocutor is going to stop me and ask, quite reasonably: Where’s the math? Wald was a mathematician, that’s true, and it can’t be denied that his solution to the problem of the bullet holes was ingenious, but what’s mathematical about it? There was no trig identity to be seen, no integral or inequality or formula.

First of all: Wald did use formulas. I told the story without them, because this is just the introduction. When you write a book explaining human reproduction to preteens, the introduction stops short of the really hydraulic stuff about how babies get inside Mommy’s tummy. Instead, you start with something more like “Everything in nature changes; trees lose their leaves in winter only to bloom again in spring; the humble caterpillar enters its chrysalis and emerges as a magnificent butterfly. You are part of nature too, and . . .”

That’s the part of the book we’re in now.

But we’re all adults here. Turning off the soft focus for a second, here’s what a sample page of Wald’s actual report looks like:

[Figure: a sample page of Wald’s report]

I hope that wasn’t too shocking.

Still, the real idea behind Wald’s insight doesn’t require any of the formalism above. We’ve already explained it, using no mathematical notation of any kind. So my student’s question stands. What makes that math? Isn’t it just common sense?

Yes. Mathematics is common sense. On some basic level, this is clear. How can you explain to someone why adding seven things to five things yields the same result as adding five things to seven? You can’t: that fact is baked into our way of thinking about combining things together. Mathematicians like to give names to the phenomena our common sense describes: instead of saying, “This thing added to that thing is the same thing as that thing added to this thing,” we say, “Addition is commutative.” Or, because we like our symbols, we write:

For any choice of a and b, a + b = b + a.

Despite the official-looking formula, we are talking about a fact instinctively understood by every child.

Multiplication is a slightly different story. The formula looks pretty similar:

For any choice of a and b, a × b = b × a.

The mind, presented with this statement, does not say “no duh” quite as instantly as it does for addition. Is it “common sense” that two sets of six things amount to the same as six sets of two?

Maybe not; but it can become common sense. Here’s my earliest mathematical memory. I’m lying on the floor in my parents’ house, my cheek pressed against the shag rug, looking at the stereo. Very probably I am listening to side two of the Beatles’ Blue Album. Maybe I’m six. This is the seventies, and therefore the stereo is encased in a pressed wood panel, which has a rectangular array of airholes punched into the side. Eight holes across, six holes up and down. So I’m lying there, looking at the airholes. The six rows of holes. The eight columns of holes. By focusing my gaze in and out I could make my mind flip back and forth between seeing the rows and seeing the columns. Six rows with eight holes each. Eight columns with six holes each.

And then I had it—eight groups of six were the same as six groups of eight. Not because it was a rule I’d been told, but because it could not be any other way. The number of holes in the panel was the number of holes in the panel, no matter which way you counted them.
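
The same flip between rows and columns can be acted out in code. This is just an illustrative sketch of the stereo panel as a grid of coordinates, with variable names of my own choosing: the holes are counted once row by row and once column by column.

```python
ROWS, COLS = 6, 8

# Every airhole in the panel, as a (row, column) pair.
holes = [(r, c) for r in range(ROWS) for c in range(COLS)]

# Six rows with eight holes each.
by_rows = sum(sum(1 for (r, _) in holes if r == row) for row in range(ROWS))

# Eight columns with six holes each.
by_cols = sum(sum(1 for (_, c) in holes if c == col) for col in range(COLS))

print(by_rows, by_cols, len(holes))  # prints 48 48 48
```

Both tallies run over the same list of holes, so they cannot disagree; only the grouping changes.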

We tend to teach mathematics as a long list of rules. You learn them in order and you have to obey them, because if you don’t obey them you get a C-. This is not mathematics. Mathematics is the study of things that come out a certain way because there is no other way they could possibly be.

Now let’s be fair: not everything in mathematics can be made as perfectly transparent to our intuition as addition and multiplication. You can’t do calculus by common sense. But calculus is still derived from our common sense—Newton took our physical intuition about objects moving in straight lines, formalized it, and then built on top of that formal structure a universal mathematical description of motion. Once you have Newton’s theory in hand, you can apply it to problems that would make your head spin if you had no equations to help you. In the same way, we have built-in mental systems for assessing the likelihood of an uncertain outcome. But those systems are pretty weak and unreliable, especially when it comes to events of extreme rarity. That’s when we shore up our intuition with a few sturdy, well-placed theorems and techniques, and make out of it a mathematical theory of probability.

The specialized language in which mathematicians converse with one another is a magnificent tool for conveying complex ideas precisely and swiftly. But its foreignness can create among outsiders the impression of a sphere of thought totally alien to ordinary thinking. That’s exactly wrong.

Math is like an atomic-powered prosthesis that you attach to your common sense, vastly multiplying its reach and strength. Despite the power of mathematics, and despite its sometimes forbidding notation and abstraction, the actual mental work involved is little different from the way we think about more down-to-earth problems. I find it helpful to keep in mind an image of Iron Man punching a hole through a brick wall. On the one hand, the actual wall-breaking force is being supplied, not by Tony Stark’s muscles, but by a series of exquisitely synchronized servomechanisms powered by a compact beta particle generator. On the other hand, from Tony Stark’s point of view, what he is doing is punching a wall, exactly as he would without the armor. Only much, much harder.

To paraphrase Clausewitz: Mathematics is the extension of common sense by other means.

Without the rigorous structure that math provides, common sense can lead you astray. That’s what happened to the officers who wanted to armor the parts of the planes that were already strong enough. But formal mathematics without common sense—without the constant interplay between abstract reasoning and our intuitions about quantity, time, space, motion, behavior, and uncertainty—would just be a sterile exercise in rule-following and bookkeeping. In other words, math would actually be what the peevish calculus student believes it to be.

That’s a real danger. John von Neumann, in his 1947 essay “The Mathematician,” warned:

As a mathematical discipline travels far from its empirical source, or still more, if it is a second and third generation only indirectly inspired by ideas coming from “reality” it is beset with very grave dangers. It becomes more and more purely aestheticizing, more and more purely l’art pour l’art. This need not be bad, if the field is surrounded by correlated subjects, which still have closer empirical connections, or if the discipline is under the influence of men with an exceptionally well-developed taste. But there is a grave danger that the subject will develop along the line of least resistance, that the stream, so far from its source, will separate into a multitude of insignificant branches, and that the discipline will become a disorganized mass of details and complexities. In other words, at a great distance from its empirical source, or after much “abstract” inbreeding, a mathematical subject is in danger of degeneration.*


If your acquaintance with mathematics comes entirely from school, you have been told a story that is very limited, and in some important ways false. School mathematics is largely made up of a sequence of facts and rules, facts which are certain, rules which come from a higher authority and cannot be questioned. It treats mathematical matters as completely settled.

Mathematics is not settled. Even concerning the basic objects of study, like numbers and geometric figures, our ignorance is much greater than our knowledge. And the things we do know were arrived at only after massive effort, contention, and confusion. All this sweat and tumult is carefully screened off in your textbook.

There are facts and there are facts, of course. There has never been much controversy about whether 1 + 2 = 3. The question of how and whether we can truly prove that 1 + 2 = 3, which wobbles uneasily between mathematics and philosophy, is another story—we return to that at the end of the book. But that the computation is correct is a plain truth. The tumult lies elsewhere. We’ll come within sight of it several times.

Mathematical facts can be simple or complicated, and they can be shallow or profound. This divides the mathematical universe into four quadrants:

Basic arithmetic facts, like 1 + 2 = 3, are simple and shallow. So are basic identities like sin(2x) = 2 sin x cos x or the quadratic formula: they might be slightly harder to convince yourself of than 1 + 2 = 3, but in the end they don’t have much conceptual heft.

Moving over to complicated/shallow, you have the problem of multiplying two ten-digit numbers, or the computation of an intricate definite integral, or, given a couple of years of graduate school, the trace of Frobenius on a modular form of conductor 2377. It’s conceivable you might, for some reason, need to know the answer to such a problem, and it’s undeniable that it would be somewhere between annoying and impossible to work it out by hand; or, as in the case of the modular form, it might take some serious schooling even to understand what’s being asked for. But knowing those answers doesn’t really enrich your knowledge about the world.

The complicated/profound quadrant is where professional mathematicians like me try to spend most of our time. That’s where the celebrity theorems and conjectures live: the Riemann Hypothesis, Fermat’s Last Theorem,* the Poincaré Conjecture, P vs. NP, Gödel’s Theorem . . . Each one of these theorems involves ideas of deep meaning, fundamental importance, mind-blowing beauty, and brutal technicality, and each of them is the protagonist of books of its own.

But not this book. This book is going to hang out in the upper left quadrant: simple and profound. The mathematical ideas we want to address are ones that can be engaged with directly and profitably, whether your mathematical training stops at pre-algebra or extends much further. And they are not “mere facts,” like a simple statement of arithmetic—they are principles, whose application extends far beyond the things you’re used to thinking of as mathematical. They are the go-to tools on the utility belt, and used properly they will help you not be wrong.

Pure mathematics can be a kind of convent, a quiet place safely cut off from the pernicious influences of the world’s messiness and inconsistency. I grew up inside those walls. Other math kids I knew were tempted by applications to physics, or genomics, or the black art of hedge fund management, but I wanted no such rumspringa.* As a graduate student, I dedicated myself to number theory, what Gauss called “the queen of mathematics,” the purest of the pure subjects, the sealed garden at the center of the convent, where we contemplated the same questions about numbers and equations that troubled the Greeks and have gotten hardly less vexing in the twenty-five hundred years since.

At first I worked on number theory with a classical flavor, proving facts about sums of fourth powers of whole numbers that I could, if pressed, explain to my family at Thanksgiving, even if I couldn’t explain how I proved what I proved. But before long I got enticed into even more abstract realms, investigating problems where the basic actors—“residually modular Galois representations,” “cohomology of moduli schemes,” “dynamical systems on homogeneous spaces,” things like that—were impossible to talk about outside the archipelago of seminar halls and faculty lounges that stretches from Oxford to Princeton to Kyoto to Paris to Madison, Wisconsin, where I’m a professor now. When I tell you this stuff is thrilling, and meaningful, and beautiful, and that I’ll never get tired of thinking about it, you may just have to believe me, because it takes a long education just to get to the point where the objects of study rear into view.

But something funny happened. The more abstract and distant from lived experience my research got, the more I started to notice how much math was going on in the world outside the walls. Not Galois representations or cohomology, but ideas that were simpler, older, and just as deep—the northwest quadrant of the conceptual foursquare. I started writing articles for magazines and newspapers about the way the world looked through a mathematical lens, and I found, to my surprise, that even people who said they hated math were willing to read them. It was a kind of math teaching, but very different from what we do in a classroom.

What it has in common with the classroom is that the reader gets asked to do some work. Back to von Neumann on “The Mathematician”:

“It is harder to understand the mechanism of an airplane, and the theories of the forces which lift and which propel it, than merely to ride in it, to be elevated and transported by it—or even to steer it. It is exceptional that one should be able to acquire the understanding of a process without having previously acquired a deep familiarity with running it, with using it, before one has assimilated it in an instinctive and empirical way.”

In other words: it is pretty hard to understand mathematics without doing some mathematics. There’s no royal road to geometry, as Euclid told Ptolemy, or maybe, depending on your source, as Menaechmus told Alexander the Great. (Let’s face it, famous old maxims attributed to ancient scientists are probably made up, but they’re no less instructive for that.)

This will not be the kind of book where I make grand, vague gestures at great monuments of mathematics, and instruct you in the proper manner of admiring them from a great distance. We are here to get our hands a little dirty. We’ll compute some things. There will be a few formulas and equations, when I need them to make a point. No formal math beyond arithmetic will be required, though lots of math way beyond arithmetic will be explained. I’ll draw some crude graphs and charts. We’ll encounter some topics from school math, outside their usual habitat; we’ll see how trigonometric functions describe the extent to which two variables are related to each other, what calculus has to say about the relationship between linear and nonlinear phenomena, and how the quadratic formula serves as a cognitive model for scientific inquiry. And we’ll also run into some of the mathematics that usually gets put off to college or beyond, like the crisis in set theory, which appears here as a kind of metaphor for Supreme Court jurisprudence and baseball umpiring; recent developments in analytic number theory, which demonstrate the interplay between structure and randomness; and information theory and combinatorial designs, which help explain how a group of MIT undergrads won millions of dollars by understanding the guts of the Massachusetts state lottery.

There will be occasional gossip about mathematicians of note, and a certain amount of philosophical speculation. There will even be a proof or two. But there will be no homework, and there will be no test.

Includes: the Laffer curve, calculus explained in one page, the Law of Large Numbers, assorted terrorism analogies, “Everyone in America will be overweight by 2048,” why South Dakota has more brain cancer than North Dakota, the ghosts of departed quantities, the habit of definition



A few years ago, in the heat of the battle over the Affordable Care Act, Daniel J. Mitchell of the libertarian Cato Institute posted a blog entry with the provocative title “Why Is Obama Trying to Make America More Like Sweden when Swedes Are Trying to Be Less Like Sweden?”

Good question! When you put it that way, it does seem pretty perverse. Why, Mr. President, are we swimming against the current of history, while social welfare states around the world—even rich little Sweden!—are cutting back on expensive benefits and high taxes? “If Swedes have learned from their mistakes and are now trying to reduce the size and scope of government,” Mitchell writes, “why are American politicians determined to repeat those mistakes?”

Answering this question will require an extremely scientific chart. Here’s what the world looks like to the Cato Institute:

The x-axis represents Swedishness,* and the y-axis is some measure of prosperity. Don’t worry about exactly how we’re quantifying these things. The point is just this: according to the chart, the more Swedish you are, the worse off your country is. The Swedes, no fools, have figured this out and are launching their northwestward climb toward free-market prosperity. But Obama’s sliding in the wrong direction.

At the top of the facing page I’ve drawn the same picture from the point of view of people whose economic views are closer to President Obama’s than to those of the Cato Institute.

This picture gives very different advice about how Swedish we should be. Where do we find peak prosperity? At a point more Swedish than America, but less Swedish than Sweden. If this picture is right, it makes perfect sense for Obama to beef up our welfare state while the Swedes trim theirs down.

The difference between the two pictures is the difference between linearity and nonlinearity, one of the central distinctions in mathematics. The Cato curve is a line;* the non-Cato curve, the one with the hump in the middle, is not. A line is one kind of curve, but not the only kind, and lines enjoy all kinds of special properties that curves in general may not. The highest point on a line segment—the maximum prosperity, in this example—has to be on one end or the other. That’s just how lines are. If lowering taxes is good for prosperity, then lowering taxes even more is even better. And if Sweden wants to de-Swede, so should we. Of course, an anti-Cato think tank might posit that the line slopes in the other direction, going southwest to northeast. And if that’s what the line looks like, then no amount of social spending is too much. The optimal policy is Maximum Swede.

Usually, when someone announces they’re a “nonlinear thinker” they’re about to apologize for losing something you lent them. But nonlinearity is a real thing! And in this context, thinking nonlinearly is crucial, because not all curves are lines. A moment of reflection will tell you that the real curves of economics look like the second picture, not the first. They’re nonlinear. Mitchell’s reasoning is an example of false linearity—he’s assuming, without coming right out and saying so, that the course of prosperity is described by the line segment in the first picture, in which case Sweden stripping down its social infrastructure means we should do the same.

But as long as you believe there’s such a thing as too much welfare state and such a thing as too little, you know the linear picture is wrong. Some principle more complicated than “More government bad, less government good” is in effect. The generals who consulted Abraham Wald faced the same kind of situation: too little armor meant planes got shot down, too much meant the planes couldn’t fly. It’s not a question of whether adding more armor is good or bad; it could be either, depending on how heavily armored the planes are to start with. If there’s an optimal answer, it’s somewhere in the middle, and deviating from it in either direction is bad news.

Nonlinear thinking means which way you should go depends on where you already are.

This insight isn’t new. Already in Roman times we find Horace’s famous remark “Est modus in rebus, sunt certi denique fines, quos ultra citraque nequit consistere rectum” (“There is a proper measure in things. There are, finally, certain boundaries short of and beyond which what is right cannot exist”). And further back still, in the Nicomachean Ethics, Aristotle observes that eating either too much or too little is troubling to the constitution. The optimum is somewhere in between, because the relation between eating and health isn’t linear, but curved, with bad outcomes on both ends.


The irony is that economic conservatives like the folks at Cato used to understand this better than anybody. That second picture I drew up there? The extremely scientific one with the hump in the middle? I am not the first person to draw it. It’s called the Laffer curve, and it’s played a central role in Republican economics for almost forty years. By the middle of the Reagan administration, the curve had become such a commonplace of economic discourse that Ben Stein ad-libbed it into his famous soul-killing lecture in Ferris Bueller’s Day Off:

Anyone know what this is? Class? Anyone? . . . Anyone? Anyone seen this before? The Laffer curve. Anyone know what this says? It says that at this point on the revenue curve, you will get exactly the same amount of revenue as at this point. This is very controversial. Does anyone know what Vice President Bush called this in 1980? Anyone? Something-doo economics. “Voodoo” economics.

The legend of the Laffer curve goes like this: Arthur Laffer, then an economics professor at the University of Chicago, had dinner one night in 1974 with Dick Cheney, Donald Rumsfeld, and Wall Street Journal editor Jude Wanniski at an upscale hotel restaurant in Washington, DC. They were tussling over President Ford’s tax plan, and eventually, as intellectuals do when the tussling gets heavy, Laffer commandeered a napkin* and drew a picture. The picture looked like this:

The horizontal axis here is level of taxation, and the vertical axis represents the amount of revenue the government takes in from taxpayers. On the left edge of the graph, the tax rate is 0%; in that case, by definition, the government gets no tax revenue. On the right, the tax rate is 100%; whatever income you have, whether from a business you run or a salary you’re paid, goes straight into Uncle Sam’s bag.

Which is empty. Because if the government vacuums up every cent of the wage you’re paid to show up and teach school, or sell hardware, or middle-manage, why bother doing it? Over on the right edge of the graph, people don’t work at all. Or, if they work, they do so in informal economic niches where the tax collector’s hand can’t reach. The government’s revenue is zero once again.

In the intermediate range in the middle of the curve, where the government charges us somewhere between none of our income and all of it—in other words, in the real world—the government does take in some amount of revenue.

That means the curve recording the relationship between tax rate and government revenue cannot be a straight line. If it were, revenue would be maximized at either the left or right edge of the graph; but it’s zero in both places. If the current income tax is really close to zero, so that you’re on the left-hand side of the graph, then raising taxes increases the amount of money the government has available to fund services and programs, just as you might intuitively expect. But if the rate is close to 100%, raising taxes actually decreases government revenue. If you’re to the right of the Laffer peak, and you want to decrease the deficit without cutting spending, there’s a simple and politically peachy solution: lower the tax rate, and thereby increase the amount of taxes you take in. Which way you should go depends on where you are.
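The "depends on where you are" point can be made concrete with a toy curve. The sketch below assumes a purely hypothetical revenue shape, rate × (1 − rate), chosen only because it is zero at both a 0% and a 100% rate and positive in between; the function names are invented for illustration. Nothing here says where any real economy's peak lies.

```python
def toy_revenue(rate):
    """Hypothetical revenue curve: zero at a 0% and a 100% rate, positive between."""
    return rate * (1.0 - rate)


def effect_of_small_increase(rate, step=0.01):
    """Change in toy revenue when the tax rate is raised by `step`."""
    return toy_revenue(rate + step) - toy_revenue(rate)


# The same one-point increase helps on the left of the hump and hurts on the right.
for rate in (0.10, 0.90):
    change = effect_of_small_increase(rate)
    direction = "raises" if change > 0 else "lowers"
    print(f"At a {rate:.0%} rate, a one-point increase {direction} toy revenue.")
```

On a straight line the sign of that change would be the same everywhere; here it flips, which is all the word "nonlinear" is claiming.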

So where are we? That’s where things get sticky. In 1974, the top income tax rate was 70%, and the idea that America was on the right-hand downslope of the Laffer curve held a certain appeal—especially for the few people lucky enough to pay tax at that rate, which only applied to income beyond the first $200,000.* And the Laffer curve had a potent advocate in Wanniski, who brought his theory into the public consciousness in a 1978 book rather self-assuredly titled The Way the World Works.* Wanniski was a true believer, with the right mix of zeal and political canniness to get people to listen to an idea considered fringy even by tax-cut advocates. He was untroubled by being called a nut. “Now, what does ‘nut’ mean?” he asked an interviewer. “Thomas Edison was a nut, Leibniz was a nut, Galileo was a nut, so forth and so on. Everybody who comes with a new idea to the conventional wisdom, comes with an idea that’s so far outside the mainstream, that’s considered nutty.”

(Aside: it’s important to point out here that people with out-of-the-mainstream ideas who compare themselves to Edison and Galileo are never actually right. I get letters with this kind of language at least once a month, usually from people who have “proofs” of mathematical statements that have been known for hundreds of years to be false. I can guarantee you Einstein did not go around telling people, “Look, I know this theory of general relativity sounds wacky, but that’s what they said about Galileo!”)

The Laffer curve, with its compact visual representation and its agreeably counterintuitive sting, turned out to be an easy sell for politicians with a preexisting hunger for tax cuts. As economist Hal Varian put it, “You can explain it to a Congressman in six minutes and he can talk about it for six months.” Wanniski became an adviser first to Jack Kemp, then to Ronald Reagan, whose experiences as a wealthy movie star in the 1940s formed the template for his view of the economy four decades later. His budget director, David Stockman, recalls:

“I came into the Big Money making pictures during World War II,” [Reagan] would always say. At that time the wartime income surtax hit 90 percent. “You could only make four pictures and then you were in the top bracket,” he would continue. “So we all quit working after about four pictures and went off to the country.” High tax rates caused less work. Low tax rates caused more. His experience proved it.

These days it’s hard to find a reputable economist who thinks we’re on the downslope of the Laffer curve. Maybe that’s not surprising, considering top incomes are currently taxed at just 35%, a rate that would have seemed absurdly low for most of the twentieth century. But even in Reagan’s day, we were probably on the left-hand side of the curve. Greg Mankiw, an economist at Harvard and a Republican who chaired the Council of Economic Advisers under the second President Bush, writes in his microeconomics textbook:

Subsequent history failed to confirm Laffer’s conjecture that lower tax rates would raise tax revenue. When Reagan cut taxes after he was elected, the result was less tax revenue, not more. Revenue from personal income taxes (per person, adjusted for inflation) fell by 9 percent from 1980 to 1984, even though average income (per person, adjusted for inflation) grew by 4 percent over this period. Yet once the policy was in place, it was hard to reverse.

Some sympathy for the supply-siders is now in order. First of all, maximizing government revenue needn’t be the goal of tax policy. Milton Friedman, whom we last met during World War II doing classified military work for the Statistical Research Group, went on to become a Nobel-winning economist and adviser to presidents, and a powerful advocate for low taxes and libertarian philosophy. Friedman’s famous slogan on taxation is “I am in favor of cutting taxes under any circumstances and for any excuse, for any reason, whenever it’s possible.” He didn’t think we should be aiming for the top of the Laffer curve, where government tax revenue is as high as it can be. For Friedman, money obtained by the government would eventually be money spent by the government, and that money, he felt, was more often spent badly than well.

More moderate supply-side thinkers, like Mankiw, argue that lower taxes can increase the motivation to work hard and launch businesses, leading eventually to a bigger, stronger economy, even if the immediate effect of the tax cut is decreased government revenue and bigger deficits. An economist with more redistributionist sympathies would observe that this cuts both ways; maybe the government’s diminished ability to spend means it constructs less infrastructure, regulates fraud less stringently, and generally does less of the work that enables free enterprise to thrive.

Mankiw also points out that the very richest people—the ones who’d been paying 70% on the top tranche of their income—did contribute more tax revenue after Reagan’s tax cuts.* That leads to the somewhat vexing possibility that the way to maximize government revenue is to jack up taxes on the middle class, who have no choice but to keep on working, while slashing rates on the rich; those guys have enough stockpiled wealth to make credible threats to withhold or offshore their economic activity, should their government charge them a rate they deem too high. If that story’s right, a lot of liberals will uncomfortably climb in the boat with Milton Friedman: maybe maximizing tax revenue isn’t so great after all.

Mankiw’s final assessment is a rather polite “Laffer’s argument is not completely without merit.” I would give Laffer more credit than that! His drawing made the fundamental and incontrovertible mathematical point that the relationship between taxation and revenue is necessarily nonlinear. It doesn’t, of course, have to be a single smooth hill like the one Laffer sketched; it could look like a trapezoid

or a Bactrian camel’s back

or a wildly oscillating free-for-all*

but if it slopes upward in one place, it has to slope downward somewhere else. There is such a thing as being too Swedish. That’s a statement no economist would disagree with. It’s also, as Laffer himself pointed out, something that was understood by many social scientists before him. But to most people, it’s not at all obvious—at least, not until you see the picture on the napkin. Laffer understood perfectly well that his curve didn’t have the power to tell you whether any given economy at any given time was overtaxed. That’s why he didn’t draw any numbers on the picture. Questioned during congressional testimony about the precise location of the optimal tax rate, he conceded, “I cannot measure it frankly, but I can tell you what the characteristics of it are; yes, sir.” All the Laffer curve says is that lower taxes could, under some circumstances, increase tax revenue; but figuring out what those circumstances are requires deep, difficult, empirical work, the kind of work that doesn’t fit on a napkin.

There’s nothing wrong with the Laffer curve—only with the uses people put it to. Wanniski and the politicians who followed his panpipe fell prey to the oldest false syllogism in the book:

It could be the case that lowering taxes will increase government revenue;

I want it to be the case that lowering taxes will increase government revenue;

Therefore, it is the case that lowering taxes will increase government revenue.



You might not have thought you needed a professional mathematician to tell you that not all curves are straight lines. But linear reasoning is everywhere. You’re doing it every time you say that if something is good to have, having more of it is even better. Political shouters rely on it: “You support military action against Iran? I guess you’d like to launch a ground invasion of every country that looks at us funny!” Or, on the other hand, “Engagement with Iran? You probably also think Adolf Hitler was just misunderstood.”

Why is this kind of reasoning so popular, when a moment’s thought reveals its wrongness? Why would anyone think, even for a second, that all curves are straight lines, when they’re obviously not?

One reason is that, in a sense, they are. That story starts with Archimedes.


What’s the area of the following circle?

In the modern world, that’s a problem so standard you could put it on the SAT. The area of a circle is πr², and in this case the radius r is 1, so the area is π. But two thousand years ago this was a vexing open question, important enough to draw the attention of Archimedes.

Why was it so hard? For one thing, the Greeks didn’t really think of π as a number, as we do. The numbers they understood were whole numbers, numbers that counted things: 1, 2, 3, 4 . . . But the first great success of Greek geometry—the Pythagorean Theorem*—turned out to be the ruin of their number system.

Here’s a picture:

The Pythagorean Theorem tells you that the square of the hypotenuse—the side drawn diagonally here, the one that doesn’t touch the right angle—is the sum of the squares of the other two sides, or legs. In this picture, that says the square of the hypotenuse is 1² + 1² = 1 + 1 = 2. In particular, the hypotenuse is longer than 1 and shorter than 2 (as you can check with your eyeballs, no theorem required). That the length is not a whole number was not, in itself, a problem for the Greeks. Maybe we just measured everything in the wrong units. If we choose our unit of length to make the legs 5 units long, you can check with a ruler that the hypotenuse is just about 7 units long. Just about—but a bit too long. For the square of the hypotenuse is

5² + 5² = 25 + 25 = 50

and if the hypotenuse were 7, its square would be 7 × 7 = 49.

Or if you make the legs 12 units long, the hypotenuse is almost exactly 17 units, but is tantalizingly too short, because 12² + 12² is 288, a smidgen less than 17², which is 289.
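The near misses above are easy to reproduce. The following is a minimal sketch, with a helper name made up for illustration: for a given whole-number leg, it finds the whole number closest to the true hypotenuse and reports how far that candidate's square lands from twice the leg squared.

```python
import math


def near_miss(leg):
    """Return (c, gap): the whole number c nearest the hypotenuse of an
    isosceles right triangle with the given leg, and gap = c*c - 2*leg*leg."""
    target = 2 * leg * leg          # the square of the true hypotenuse
    c = math.isqrt(target)          # largest whole number with c*c <= target
    if (c + 1) ** 2 - target < target - c * c:
        c += 1                      # the next whole number up is closer
    return c, c * c - target


for leg in (5, 12):
    c, gap = near_miss(leg)
    print(f"legs {leg}: nearest whole hypotenuse {c}, off by {gap} in the square")
```

A negative gap means the whole-number candidate is a little too short to be the hypotenuse, and a positive gap means it is a little too long. Trying more legs turns up many gaps of exactly 1 in either direction, but never a gap of 0.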

And at some point around the fifth century BCE, a member of the Pythagorean school made a shocking discovery: there was no way to measure the isosceles right triangle so that the length of each side was a whole number. Modern people would say “the square root of 2 is irrational”—that is, it is not the ratio of any two whole numbers. But the Pythagoreans would not have said that. How could they? Their notion of quantity was built on the idea of proportions between whole numbers. To them, the length of that hypotenuse had been revealed to be not a number at all.

This caused a fuss. The Pythagoreans, you have to remember, were extremely weird. Their philosophy was a chunky stew of things we’d now call mathematics, things we’d now call religion, and things we’d now call mental illness. They believed that odd numbers were good and even numbers evil; that a planet identical to our own, the Antichthon, lay on the other side of the sun; and that it was wrong to eat beans, by some accounts because they were the repository of dead people’s souls. Pythagoras himself was said to have had the ability to talk to cattle (he told them not to eat beans) and to have been one of the very few ancient Greeks to wear pants.

The mathematics of the Pythagoreans was inseparably bound up with their ideology. The story (probably not really true, but it gives the right impression of the Pythagorean style) is that the Pythagorean who discovered the irrationality of the square root of 2 was a man named Hippasus, whose reward for proving such a nauseating theorem was to be tossed into the sea by his colleagues, to his death.

But you can’t drown a theorem. The Pythagoreans’ successors, like Euclid and Archimedes, understood that you had to roll up your sleeves and measure things, even if this brought you outside the pleasant walled garden of the whole numbers. No one knew whether the area of a circle could be expressed using whole numbers alone.* But wheels must be built and silos filled;* so the measurement must be done.

The original idea comes from Eudoxus of Cnidus; Euclid included it as Book 12 of the Elements. But it was Archimedes who really brought the project to its full fruition. Today we call his approach the method of exhaustion. And it starts like this.

The square in the picture is called the inscribed square; each of its corners just touches the circle, but it doesn’t extend beyond the circle’s boundary. Why do this? Because circles are mysterious and intimidating, and squares are easy. If you have before you a square whose side has length X, its area is X times X—indeed, that’s why we call the operation of multiplying a number by itself squaring! A basic rule of mathematical life: if the universe hands you a hard problem, try to solve an easier one instead, and hope the simple version is close enough to the original problem that the universe doesn’t object.

The inscribed square breaks up into four triangles, each of which is none other than the isosceles triangle we just drew.* So the square’s area is four times the area of the triangle. That triangle, in turn, is what you get when you take a 1 × 1 square and cut it diagonally in half like a tuna fish sandwich.

The area of the tuna fish sandwich is 1 × 1 = 1, so the area of each triangular half-sandwich is 1/2, and the area of the inscribed square is 4 × 1/2, or 2.

By the way, suppose you don’t know the Pythagorean Theorem. Guess what—you do now! Or at least you know what it has to say about this particular right triangle. Because the right triangle that makes up the lower half of the tuna fish sandwich is exactly the same as the one that is the northwest quarter of the inscribed square. And its hypotenuse is the inscribed square’s side. So when you square the hypotenuse, you get the area of the inscribed square, which is 2. That is, the hypotenuse is that number which, when squared, yields 2; or, in the usual more concise lingo, the square root of 2.

The inscribed square is entirely contained within the circle. If its area is 2, the area of the circle must be at least 2.

Now we draw another square.

This one is called the circumscribed square; it, too, touches the circle at just four points. But this square contains the circle. Its sides have length 2, so its area is 4; and so we know the area of the circle is at most 4.

To have shown that π is between 2 and 4 is perhaps not so impressive. But Archimedes is just getting started. Take the four corners of your inscribed square and mark new points on the circle halfway between each adjacent pair of corners. Now you’ve got eight equally spaced points, and when you connect those, you get an inscribed octagon, or, in technical language, a “stop sign”:

Computing the area of the inscribed octagon is a bit harder, and I’ll spare you the trigonometry. The important thing is that it’s about straight lines and angles, not curves, and so it was doable with the methods available to Archimedes. And the area is twice the square root of 2, which is about 2.83.

You can play the same game with the circumscribed octagon

whose area is 8(√2 − 1), a little over 3.31.

So the area of the circle is trapped in between 2.83 and 3.31.

Why stop there? You can stick points in between the corners of the octagon (whether inscribed or circumscribed) to make a 16-gon; after some more trigonometric figuring, that tells you that the area of the circle is in between 3.06 and 3.18. Do it again, to make a 32-gon; and again, and again, and pretty soon you have something that looks like this:

Wait, isn’t that just the circle? Of course not! It’s a regular polygon with 65,536 sides. Couldn’t you tell?

The great insight of Eudoxus and Archimedes was that it doesn’t matter whether it’s a circle or a polygon with very many very short sides. The two areas will be close enough for any purpose you might have in mind. The area of the little fringe between the circle and the polygon has been “exhausted” by our relentless iteration. The circle has a curve to it, that’s true. But every tiny little piece of it can be well approximated by a perfectly straight line, just as the tiny little patch of the earth’s surface we stand on is well approximated by a perfectly flat plane.*
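The squeeze between inscribed and circumscribed polygons can be sketched in a few lines. This is a modern illustration, not a reconstruction of Archimedes' own computation: it assumes a circle of radius 1, starts from the two squares, and doubles the number of sides using standard half-angle identities that need only square roots. The function and variable names are invented for the example.

```python
import math


def polygon_bounds(doublings):
    """Return (sides, inscribed_area, circumscribed_area) for the square and
    for each polygon obtained by doubling the number of sides `doublings` times."""
    n = 4
    s = math.sqrt(2.0)  # side length of the inscribed square
    t = 1.0             # half the side length of the circumscribed square
    rows = []
    for _ in range(doublings + 1):
        inner = n * (s / 2.0) * math.sqrt(1.0 - s * s / 4.0)
        outer = n * t
        rows.append((n, inner, outer))
        # Half-angle steps: new side and half-side after doubling the sides.
        s = s / math.sqrt(2.0 + math.sqrt(4.0 - s * s))
        t = t / (1.0 + math.sqrt(1.0 + t * t))
        n *= 2
    return rows


for n, inner, outer in polygon_bounds(3):
    print(f"{n:>2}-gon: {inner:.2f} < area of circle < {outer:.2f}")
```

Calling `polygon_bounds(14)` carries the doubling all the way to the 65,536-sided polygon, where the two bounds agree to many decimal places.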

The slogan to keep in mind: straight locally, curved globally.

Or think of it like this. You are streaking downward toward the circle as from a great height. At first you can see the whole thing:

Then just one segment of arc:

And a still smaller segment:

Until, zooming in, and zooming in, what you see is pretty much indistinguishable from a line. An ant on the circle, aware only of his own tiny immediate surroundings, would think he was on a straight line, just as a person on the surface of the earth (unless she is clever enough to watch objects crest the horizon as they approach from afar) feels like she’s standing on a plane.


I will now teach you calculus. Ready? The idea, for which we have Isaac Newton to thank, is that there’s nothing special about a perfect circle. Every smooth curve, when you zoom in enough, looks just like a line. Doesn’t matter how winding or snarled it is—just that it doesn’t have any sharp corners.

When you fire a missile*, its path looks like this:

The missile goes up, then down, in a parabolic arc. Gravity makes all motion curve toward the earth; that’s among the fundamental facts of our physical life. But if we zoom in on a very short segment, the curve starts to look like this:

And then like this:

Just like the circle, the missile’s path looks to the naked eye like a straight line, progressing upward at an angle. The deviation from straightness caused by gravity is too small to see—but it’s still there, of course. Zooming in to an even smaller region of the curve makes the curve even more like a straight line. Closer and straighter, closer and straighter . . .

Now here’s the conceptual leap. Newton said, look, let’s go all the way. Reduce your field of view until it’s infinitesimal—so small that it’s smaller than any size you can name, but not zero. You’re studying the missile’s arc, not over a very short time interval, but at a single moment. What was almost a line becomes exactly a line. And the slope of this line is what Newton called the fluxion, and what we’d now call the derivative.
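The zooming-in can be imitated with ordinary arithmetic. This is a sketch under invented assumptions: the curve `path` below is a made-up parabola standing in for the missile's arc, and `secant_slope` measures the slope of the straight line through two nearby points on it. The code never reaches a window of width zero; it only shows the trend as the window shrinks.

```python
def path(x):
    """Hypothetical missile path: height as a function of ground distance."""
    return 2.0 * x - 0.05 * x * x


def secant_slope(f, x, width):
    """Slope of the straight line through (x, f(x)) and (x + width, f(x + width))."""
    return (f(x + width) - f(x)) / width


if __name__ == "__main__":
    # Shrink the field of view around x = 10 and watch the slope settle down.
    for width in (1.0, 0.1, 0.01, 0.001):
        print(f"window {width:>6}: slope {secant_slope(path, 10.0, width):.5f}")
```

For this particular parabola the slopes creep up toward 1, which is the slope of the path at x = 10.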

That’s a kind of jump Archimedes wasn’t willing to make. He understood that polygons with shorter sides got closer and closer to the circle—but he would never have said that the circle actually was a polygon with infinitely many infinitely short sides.

Some of Newton’s contemporaries, too, were reluctant to go along for the ride. The most famous objector was George Berkeley, who denounced Newton’s infinitesimals in a tone of high mockery sadly absent from current mathematical literature: “And what are these fluxions? The velocities of evanescent increments. And what are these same evanescent increments? They are neither finite quantities, nor quantities infinitely small, nor yet nothing. May we not call them the ghosts of departed quantities?”

And yet calculus works. If you swing a rock in a loop around your head and suddenly release it, it’ll shoot off along a linear trajectory at constant speed,* exactly in the direction that calculus says the rock is moving at the precise moment you let go. That’s yet another Newtonian insight; objects in motion tend to proceed in a straight-line path, unless some other force intercedes to nudge the object one way or the other. That’s one reason linear thinking comes so naturally to us: our intuition about time and motion is formed by the phenomena we observe in the world. Even before Newton codified his laws, something in us knew that things like to move in straight lines, unless given a reason to do otherwise.


Newton’s critics had a point; his construction of the derivative didn’t amount to what we’d call rigorous mathematics nowadays. The problem is the notion of the infinitely small, which was a slightly embarrassing sticking point for mathematicians for thousands of years. The trouble started with Zeno, a fifth-century-BCE Greek philosopher of the Eleatic school who specialized in asking innocent-seeming questions about the physical world that inevitably blossomed into huge philosophical brouhahas.

His most famous paradox goes like this. I decide to walk to the ice cream store. Now certainly I can’t get to the ice cream store until I’ve gone halfway there. And once I’ve gone halfway, I can’t get to the store until I’ve gone half the distance that remains. Having done so, I still have to cover half the remaining distance. And so on, and so on. I may get closer and closer to the ice cream store—but no matter how many steps of this process I undergo, I never actually reach the ice cream store. I am always some tiny but nonzero distance away from my two scoops with jimmies. Thus, Zeno concludes, to walk to the ice cream store is impossible. The argument works just as well for any destination: it’s equally impossible to walk across the street, or to take a single step, or to wave your hand. All motion is ruled out.

Diogenes the Cynic was said to have refuted Zeno’s argument by standing up and walking across the room. Which is a pretty good argument that motion is actually possible; so something must be wrong with Zeno’s argument. But where’s the mistake?

Break down the trip to the store numerically. First you go halfway. Then you go half of the remaining distance, which is 1/4 of the total distance, and you’ve got 1/4 left to go. So half of what’s left is 1/8, then 1/16, then 1/32. Your progress toward the store looks like this:

1/2 + 1/4 + 1/8 + 1/16 + 1/32 + . . .

If you add up ten terms of this sequence you get about 0.999. If you add up twenty terms it’s more like 0.999999. In other words, you are getting really, really, really close to the store. But no matter how many terms you add, you never get to 1.
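Those two figures are easy to check with exact fractions. A small sketch (the function name `zeno_partial_sum` is invented here):

```python
from fractions import Fraction


def zeno_partial_sum(terms):
    """Exact value of 1/2 + 1/4 + 1/8 + ... taken to the given number of terms."""
    return sum(Fraction(1, 2**k) for k in range(1, terms + 1))


if __name__ == "__main__":
    for terms in (10, 20):
        total = zeno_partial_sum(terms)
        print(f"{terms} terms: {float(total):.7f}, still short by {1 - total}")
```

The shortfall after n terms is exactly 1/2^n: small, but never zero.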

Zeno’s paradox is much like another conundrum: is the repeating decimal 0.99999 . . . equal to 1?

I have seen people come nearly to blows over this question.* It’s hotly disputed on websites ranging from World of Warcraft fan pages to Ayn Rand forums. Our natural feeling about Zeno is “of course you eventually get your ice cream.” But in this case, intuition points the other way. Most people, if you press them, say 0.9999 . . . doesn’t equal 1. It doesn’t look like 1, that’s for sure. It looks smaller. But not much smaller! Like Zeno’s hungry ice cream lover, it gets closer and closer to its goal, but never, it seems, quite makes it there.

And yet, math teachers everywhere, myself included, will tell them, “No, it’s 1.”

How do I convince someone to come over to my side? One good trick is to argue as follows. Everyone knows that

0.33333 . . . = 1/3.

Multiply both sides by 3 and you’ll see

0.99999 . . . = 3/3 = 1.

If that doesn’t sway you, try multiplying 0.99999 . . . by 10, which is just a matter of moving the decimal point one spot to the right.

10 × (0.99999 . . .) = 9.99999 . . .

Now subtract the vexing decimal from both sides:

10 × (0.99999 . . .) − 1 × (0.99999 . . .) = 9.99999 . . . − 0.99999 . . . .

The left-hand side of the equation is just 9 × (0.99999 . . .), because 10 times something minus that something is 9 times the aforementioned thing. And over on the right-hand side, we have managed to cancel out the terrible infinite decimal, and are left with a simple 9. So we end up with

9 × (0.99999 . . .) = 9.

If 9 times something is 9, that something just has to be 1—doesn’t it?

These arguments are often enough to win people over. But let’s be honest: they lack something. They don’t really address the anxious uncertainty induced by the claim 0.99999 . . . = 1; instead, they represent a kind of algebraic intimidation. “You believe that 1/3 is 0.3 repeating—don’t you? Don’t you?”

Or worse: maybe you bought my argument based on multiplication by 10. But how about this one? What is

1 + 2 + 4 + 8 + 16 + . . . ?

Here the “. . .” means “carry on the sum forever, adding twice as much each time.” Surely such a sum must be infinite! But an argument much like the apparently correct one concerning 0.9999 . . . seems to suggest otherwise. Multiply the sum above by 2 and you get

2 × (1 + 2 + 4 + 8 + 16 + . . .) = 2 + 4 + 8 + 16 + . . .

which looks a lot like the original sum; indeed, it is just the original sum (1 + 2 + 4 + 8 + 16 + . . .) with the 1 lopped off the beginning, which means that 2 × (1 + 2 + 4 + 8 + 16 + . . .) is 1 less than (1 + 2 + 4 + 8 + 16 + . . .). In other words,

2 × (1 + 2 + 4 + 8 + 16 + . . .) − 1 × (1 + 2 + 4 + 8 + 16 + . . .) = −1.

But the left-hand side simplifies to the very sum we started with, and we’re left with

1 + 2 + 4 + 8 + 16 + . . . = −1.

Is that what you want to believe?* That adding bigger and bigger numbers, ad infinitum, flops you over into negativeland?
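One way to probe the two arguments, offered as a sketch rather than a resolution, is to run the same multiply-and-subtract trick on finite truncations, where nothing is in dispute. With n nines, the trick misses 9 by a leftover that shrinks as n grows; with n terms of the doubling sum, it misses −1 by a leftover that grows. The helper names below are made up for illustration.

```python
from fractions import Fraction


def nines(n):
    """0.99...9 with n nines, as an exact fraction."""
    return sum(Fraction(9, 10**k) for k in range(1, n + 1))


def doubling(n):
    """1 + 2 + 4 + ... taken to n terms."""
    return sum(2**k for k in range(n))


def shift_leftover_nines(n):
    """How far 10x - x falls short of 9 when x has only n nines."""
    x = nines(n)
    return 9 - (10 * x - x)


def shift_leftover_doubling(n):
    """How far 2s - s sits above -1 when s has only n terms."""
    s = doubling(n)
    return (2 * s - s) - (-1)


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, shift_leftover_nines(n), shift_leftover_doubling(n))
```

In the first case the leftover is 9/10^n; in the second it is 2^n.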

More craziness: What is the value of the infinite sum

1 − 1 + 1 − 1 + 1 − 1 + . . .

One might first observe that the sum is

(1 − 1) + (1 − 1) + (1 − 1) + . . . = 0 + 0 + 0 + . . .

and argue that the sum of a bunch of zeroes, even infinitely many, has to be 0. On the other hand, 1 − 1 + 1 is the same thing as 1 − (1 − 1), because the negative of a negative is a positive; applying this fact again and again, we can rewrite the sum as

1 − (1 − 1) − (1 − 1) − (1 − 1) . . . = 1 − 0 − 0 − 0 . . .

which seems to demand, in the same way, that the sum is equal to 1! So which is it, 0 or 1? Or is it somehow 0 half the time and 1 half the time? It seems to depend where you stop—but infinite sums never stop!

Don’t decide yet, because it gets worse. Suppose T is the value of our mystery sum:

T = 1 − 1 + 1 − 1 + 1 − 1 + . . .

Taking the negative of both sides gives you

−T = −1 + 1 − 1 + 1 . . .

But the sum on the right-hand side is precisely what you get if you take the original sum defining T and lop off that first 1, thus subtracting 1; in other words,

−T = −1 + 1 − 1 + 1 . . . = T − 1.

So −T = T − 1, an equation concerning T which is satisfied only when T is equal to 1/2. Can a sum of infinitely many whole numbers somehow magically become a fraction? If you say no, you have the right to be at least a little suspicious of slick arguments like this one. But note that some people said yes, including the Italian mathematician/priest Guido Grandi, after whom the series 1 − 1 + 1 − 1 + 1 − 1 + . . . is usually named; in a 1703 paper, he argued that the sum of the series is 1/2, and moreover that this miraculous conclusion represented the creation of the universe from nothing. (Don’t worry, I don’t follow that last step either.) Other leading mathematicians of the time, like Leibniz and Euler, were on board with Grandi’s strange computation, if not his interpretation.

But in fact, the answer to the 0.999 . . . riddle (and to Zeno’s paradox, and to Grandi’s series) lies a little deeper. You don’t have to give in to my algebraic strong-arming. You might, for instance, insist that 0.999 . . . is not equal to 1, but rather 1 minus some tiny infinitesimal number. And, for that matter, you might further insist that 0.333 . . . is not exactly equal to 1/3, but also falls short by an infinitesimal quantity. This point of view requires some stamina to push through to completion, but it can be done. I once had a calculus student named Brian who, unhappy with the classroom definitions, worked out a fair chunk of the theory by himself, referring to his infinitesimal quantities as “Brian numbers.”

Brian was not actually the first to get there. There’s a whole field of mathematics that specializes in contemplating numbers of this kind, called nonstandard analysis. The theory, developed by Abraham Robinson in the mid-twentieth century, finally made sense of the “evanescent increments” that Berkeley found so ridiculous. The price you have to pay (or, from another point of view, the reward you get to reap) is a profusion of novel kinds of numbers; not only infinitely small ones, but infinitely large ones, a huge spray of them in all shapes and sizes.*

As it happened, Brian was in luck—I had a colleague at Princeton, Edward Nelson, who was an expert in nonstandard analysis. I set up a meeting for the two of them so Brian could learn more about it. The meeting, Ed told me later, didn’t go well. As soon as Ed made it clear that infinitesimal quantities were not in fact going to be called Brian numbers, Brian lost all interest.

(Moral lesson: people who go into mathematics for fame and glory don’t stay in mathematics for long.)

But we’re no closer to settling our dispute. What is 0.999 . . . , really? Is it 1? Or is it some number infinitesimally less than 1, a crazy kind of number that hadn’t even been discovered a hundred years ago?

The right answer is to unask the question. What is 0.999 . . . , really? It appears to refer to a kind of sum:

.9 + .09 + .009 + .0009 + . . .

But what does that mean? That pesky ellipsis is the real problem. There can be no controversy about what it means to add up two, or three, or a hundred numbers. This is just mathematical notation for a physical process we understand very well: take a hundred heaps of stuff, mush them together, see how much you have. But infinitely many? That’s a different story. In the real world, you can never have infinitely many heaps. What’s the numerical value of an infinite sum? It doesn’t have one—until we give it one. That was the great innovation of Augustin-Louis Cauchy, who introduced the notion of limit into calculus in the 1820s.*

The British number theorist G. H. Hardy, in his 1949 book Divergent Series, explains it best:

It does not occur to a modern mathematician that a collection of mathematical symbols should have a “meaning” until one has been assigned to it by definition. It was not a triviality even to the greatest mathematicians of the eighteenth century. They had not the habit of definition: it was not natural to them to say, in so many words, “by X we mean Y.” . . . It is broadly true to say that mathematicians before Cauchy asked not, “How shall we define 1 − 1 + 1 − 1 + . . .” but “What is 1 − 1 + 1 − 1 + . . . ?” and that this habit of mind led them into unnecessary perplexities and controversies which were often really verbal.

This is not just loosey-goosey mathematical relativism. Just because we can assign whatever meaning we like to a string of mathematical symbols doesn’t mean we should. In math, as in life, there are good choices and there are bad ones. In the mathematical context, the good choices are the ones that settle unnecessary perplexities without creating new ones.

The sum .9 + .09 + .009 + . . . gets closer and closer to 1 the more terms you add. And it never gets any farther away. No matter how tight a cordon we draw around the number 1, the sum will eventually, after some finite number of steps, penetrate it, and never leave. Under those circumstances, Cauchy said, we should simply define the value of the infinite sum to be 1. And then he worked very hard to prove that committing oneself to his definition didn’t cause horrible contradictions to pop up elsewhere. By the time this labor was done, he’d constructed a framework that made Newton’s calculus completely rigorous. When we say a curve looks locally like a straight line at a certain angle, we now mean more or less this: as you zoom in tighter and tighter, the curve resembles the given line more and more closely. In Cauchy’s formulation, there’s no need to mention infinitely small numbers, or anything else that would make a skeptic blanch.
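The cordon idea translates directly into a small computation. In this sketch, which uses modern phrasing rather than Cauchy's own, you name a cordon radius and the function reports how many terms of .9 + .09 + .009 + . . . it takes for the running total to get inside. The name `steps_to_enter_cordon` and the use of exact fractions are assumptions made for the example.

```python
from fractions import Fraction


def steps_to_enter_cordon(radius):
    """Number of terms of .9 + .09 + .009 + ... needed to get within `radius` of 1."""
    if radius <= 0:
        raise ValueError("the cordon must have positive radius")
    partial, term, steps = Fraction(0), Fraction(9, 10), 0
    while abs(1 - partial) >= radius:
        partial += term   # add the next term of the series
        term /= 10        # .9, then .09, then .009, ...
        steps += 1
    return steps


if __name__ == "__main__":
    for radius in (Fraction(1, 100), Fraction(1, 10**6), Fraction(1, 10**12)):
        print(f"cordon of radius {radius}: inside after {steps_to_enter_cordon(radius)} terms")
```

Because the gap after n terms is exactly 1/10^n and only shrinks, a running total that has entered a cordon stays there. However small the radius, the answer is a finite number of steps.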

Of course there is a cost. The reason the 0.999 . . . problem is difficult is that it brings our intuitions into conflict. We would like the sum of an infinite series to play nicely with arithmetic manipulations like the ones we carried out on the previous pages, and this seems to demand that the sum equal 1. On the other hand, we would like each number to be represented by a unique string of decimal digits, which conflicts with the claim that the same number can be called either 1 or 0.999 . . . , as we like. We can’t hold on to both of these desires at once; one must be discarded. In Cauchy’s approach, which has amply proved its worth in the two centuries since he invented it, it’s the uniqueness of the decimal expansion that goes out the window. We’re untroubled by the fact that the English language sometimes uses two different strings of letters (i.e., two words) to refer synonymously to the same thing in the world; in the same way, it’s not so bad that two different strings of digits can refer to the same number.

As for Grandi’s 1 − 1 + 1 − 1 + . . . , it is one of the series outside the reach of Cauchy’s theory: that is, one of the divergent series that formed the subject of Hardy’s book. The Norwegian mathematician Niels Henrik Abel, an early fan of Cauchy’s approach, wrote in 1828, “Divergent series are the invention of the devil, and it is shameful to base on them any demonstration whatsoever.”* Hardy’s view, which is our view today, is more forgiving; there are some divergent series to which we ought to assign values and some to which we ought not, and some to which we ought or ought not depending on the context in which the series arises. Modern mathematicians would say that if we are to assign the Grandi series a value, it should be 1/2, because, as it turns out, all interesting theories of infinite sums either give it the value 1/2 or decline, like Cauchy’s theory, to give it any value at all.*
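To make the 1/2 a little less mysterious, here is a sketch of one such theory, Cesàro summation, which the text above does not name: instead of asking where the partial sums settle, it asks where their running averages settle. For Grandi's series the partial sums bounce between 1 and 0 forever, but their averages close in on 1/2. The function names are invented for the example.

```python
from fractions import Fraction
from itertools import accumulate


def grandi_partial_sums(n):
    """Partial sums of 1 - 1 + 1 - 1 + ... after 1, 2, ..., n terms."""
    terms = [(-1) ** k for k in range(n)]
    return list(accumulate(terms))


def cesaro_means(partial_sums):
    """Running averages of the partial sums (the Cesaro means)."""
    running_totals = accumulate(partial_sums)
    return [Fraction(total, count)
            for count, total in enumerate(running_totals, start=1)]


if __name__ == "__main__":
    sums = grandi_partial_sums(10)
    print("partial sums:", sums)
    print("averages:    ", [str(m) for m in cesaro_means(sums)])
```

Cauchy's definition looks only at the first list, which never settles; the averaging rule looks at the second, which does.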

To write Cauchy’s definitions down precisely takes a bit more work. This was especially true for Cauchy himself, who had not quite phrased the ideas in their clean, modern form.* (In mathematics, you very seldom get the clearest account of an idea from the person who invented it.) Cauchy was an unwavering conservative and a royalist, but in his mathematics he was proudly revolutionary and a scourge to academic authority. Once he understood how to do things without the dangerous infinitesimals, he unilaterally rewrote his syllabus at the École Polytechnique to reflect his new ideas. This enraged everyone around him: his mystified students, who had signed up for freshman calculus, not a seminar on cutting-edge pure mathematics; his colleagues, who felt that the engineering students at the École had no need for Cauchy’s level of rigor; and the administrators, whose commands to stick to the official course outline he completely ignored. The École imposed a new curriculum from above that emphasized the traditional infinitesimal approach to calculus, and placed note takers in Cauchy’s classroom to make sure he complied. Cauchy did not comply. Cauchy was not interested in the needs of engineers. Cauchy was interested in the truth.

It’s hard to defend Cauchy’s stance on pedagogical grounds. But I’m sympathetic with him anyway. One of the great joys of mathematics is the incontrovertible feeling that you’ve understood something the right way, all the way down to the bottom; it’s a feeling I haven’t experienced in any other sphere of mental life. And when you know how to do something the right way, it’s hard—for some stubborn people, impossible—to make yourself explain it the wrong way.



The stand-up comic Eugene Mirman tells this joke about statistics. He says he likes to tell people, “I read that 100% of Americans were Asian.”

“But Eugene,” his confused companion protests, “you’re not Asian.”

And the punch line, delivered with magnificent self-assurance: “I read that I was!”

I thought of Mirman’s joke when I encountered a paper in the journal Obesity whose title posed the discomfiting question: “Will all Americans become overweight or obese?” As if the rhetorical question weren’t enough, the article supplies an answer: “Yes—by 2048.”

In 2048 I’ll be seventy-seven years old, and I hope not to be overweight. But I read I would be!

The Obesity paper got plenty of press, as you might imagine. ABC News warned of an “obesity apocalypse.” The Long Beach Press-Telegram went with the simple headline “We’re Getting Fatter.” The study’s results resonated with the latest manifestation of the fevered, ever-shifting anxiety with which Americans have always contemplated our national moral status. Before I was born, boys grew long hair and thus we were bound to get whipped by the Communists. When I was a kid, we played arcade games too much, which left us doomed to be outcompeted by the industrious Japanese. Now we eat too much fast food, and we’re all going to die weak and immobile, surrounded by empty chicken buckets, puddled into the couches from which we long ago became unable to hoist ourselves. The paper certified this anxiety as a fact proved by science.

I have some good news. We’re not all going to be overweight in the year 2048. Why? Because not every curve is a line.

How Not to Be Wrong

The Power of Mathematical Thinking