Weapons of Math Destruction

How Big Data Increases Inequality and Threatens Democracy

Author Cathy O'Neil

Ebook

Crown

On sale Sep 06, 2016 | 288 Pages | 9780553418828

Grades 9-12 + AP/IB

NEW YORK TIMES BESTSELLER • A former Wall Street quant sounds the alarm on Big Data and the mathematical models that threaten to rip apart our social fabric—with a new afterword

“A manual for the twenty-first-century citizen . . . relevant and urgent.”—Financial Times

NATIONAL BOOK AWARD LONGLIST • NAMED ONE OF THE BEST BOOKS OF THE YEAR BY The New York Times Book Review • The Boston Globe • Wired • Fortune • Kirkus Reviews • The Guardian • Nature • On Point

We live in the age of the algorithm. Increasingly, the decisions that affect our lives—where we go to school, whether we can get a job or a loan, how much we pay for health insurance—are being made not by humans, but by machines. In theory, this should lead to greater fairness: Everyone is judged according to the same rules.

But as mathematician and data scientist Cathy O’Neil reveals, the mathematical models being used today are unregulated and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination—propping up the lucky, punishing the downtrodden, and undermining our democracy in the process. Welcome to the dark side of Big Data.

Cathy O’Neil is the author of the bestselling Weapons of Math Destruction, which won the Euler Book Prize and was longlisted for the National Book Award. She received her PhD in mathematics from Harvard and has worked in finance, tech, and academia. She launched the Lede Program for data journalism at Columbia University and recently founded ORCAA, an algorithmic auditing company. O’Neil is a regular contributor to Bloomberg Opinion.

1

BOMB PARTS

What Is a Model?

It was a hot August afternoon in 1946. Lou Boudreau, the player-manager of the Cleveland Indians, was having a miserable day. In the first game of a doubleheader, Ted Williams had almost single-handedly annihilated his team. Williams, perhaps the game’s greatest hitter at the time, had smashed three home runs and driven home eight. The Indians ended up losing 11 to 10.

Boudreau had to take action. So when Williams came up for the first time in the second game, players on the Indians’ side started moving around. Boudreau, the shortstop, jogged over to where the second baseman would usually stand, and the second baseman backed into short right field. The third baseman moved to his left, into the shortstop’s hole. It was clear that Boudreau, perhaps out of desperation, was shifting the entire orientation of his defense in an attempt to turn Ted Williams’s hits into outs.

In other words, he was thinking like a data scientist. He had analyzed crude data, most of it observational: Ted Williams usually hit the ball to right field. Then he adjusted. And it worked. Fielders caught more of Williams’s blistering line drives than before (though they could do nothing about the home runs sailing over their heads).

If you go to a major league baseball game today, you’ll see that defenses now treat nearly every player like Ted Williams. While Boudreau merely observed where Williams usually hit the ball, managers now know precisely where every player has hit every ball over the last week, over the last month, throughout his career, against left-handers, when he has two strikes, and so on. Using this historical data, they analyze their current situation and calculate the positioning that is associated with the highest probability of success. And that sometimes involves moving players far across the field.
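As a rough illustration of the idea in the paragraph above, the sketch below counts where a batter's past balls landed and places fielders in the most frequent zones. It is a minimal sketch, not how any team actually positions its defense: the zone names, the sample history, and the functions `zone_frequencies` and `best_positions` are all invented for this example.

```python
from collections import Counter


def zone_frequencies(hit_zones):
    """Return the share of batted balls that landed in each zone."""
    total = len(hit_zones)
    if total == 0:
        return {}
    counts = Counter(hit_zones)
    return {zone: n / total for zone, n in counts.items()}


def best_positions(hit_zones, fielders):
    """Pick the zones the batter has hit into most often.

    Ties are broken alphabetically so the result is deterministic.
    """
    freqs = zone_frequencies(hit_zones)
    ranked = sorted(freqs, key=lambda zone: (-freqs[zone], zone))
    return ranked[:fielders]


# Invented sample: a hitter who usually sends the ball to right field.
history = ["right"] * 6 + ["center"] * 3 + ["left"] * 1

print(zone_frequencies(history))
print(best_positions(history, 2))
```

A real model would condition on far more than a single list of zones (the pitcher, the count, the ballpark), but the core step is the same: turn historical outcomes into probabilities and act on the most likely ones.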

Shifting defenses is only one piece of a much larger question: What steps can baseball teams take to maximize the probability that they’ll win? In their hunt for answers, baseball statisticians have scrutinized every variable they can quantify and attached it to a value. How much more is a double worth than a single? When, if ever, is it worth it to bunt a runner from first to second base?

The answers to all of these questions are blended and combined into mathematical models of their sport. These are parallel universes of the baseball world, each a complex tapestry of probabilities. They include every measurable relationship among every one of the sport’s components, from walks to home runs to the players themselves. The purpose of the model is to run different scenarios at every juncture, looking for the optimal combinations. If the Yankees bring in a right-handed pitcher to face Angels slugger Mike Trout, as compared to leaving in the current pitcher, how much more likely are they to get him out? And how will that affect their overall odds of winning?

Baseball is an ideal home for predictive mathematical modeling. As Michael Lewis wrote in his 2003 bestseller, Moneyball, the sport has attracted data nerds throughout its history. In decades past, fans would pore over the stats on the back of baseball cards, analyzing Carl Yastrzemski’s home run patterns or comparing Roger Clemens’s and Dwight Gooden’s strikeout totals. But starting in the 1980s, serious statisticians started to investigate what these figures, along with an avalanche of new ones, really meant: how they translated into wins, and how executives could maximize success with a minimum of dollars.

“Moneyball” is now shorthand for any statistical approach in domains long ruled by the gut. But baseball represents a healthy case study—and it serves as a useful contrast to the toxic models, or WMDs, that are popping up in so many areas of our lives. Baseball models are fair, in part, because they’re transparent. Everyone has access to the stats and can understand more or less how they’re interpreted. Yes, one team’s model might give more value to home run hitters, while another might discount them a bit, because sluggers tend to strike out a lot. But in either case, the numbers of home runs and strikeouts are there for everyone to see.

Baseball also has statistical rigor. Its gurus have an immense data set at hand, almost all of it directly related to the performance of players in the game. Moreover, their data is highly relevant to the outcomes they are trying to predict. This may sound obvious, but as we’ll see throughout this book, the folks building WMDs routinely lack data for the behaviors they’re most interested in. So they substitute stand-in data, or proxies. They draw statistical correlations between a person’s zip code or language patterns and her potential to pay back a loan or handle a job. These correlations are discriminatory, and some of them are illegal. Baseball models, for the most part, don’t use proxies because they use pertinent inputs like balls, strikes, and hits.

Most crucially, that data is constantly pouring in, with new statistics from an average of twelve or thirteen games arriving daily from April to October. Statisticians can compare the results of these games to the predictions of their models, and they can see where they were wrong. Maybe they predicted that a left-handed reliever would give up lots of hits to right-handed batters—and yet he mowed them down. If so, the stats team has to tweak their model and also carry out research on why they got it wrong. Did the pitcher’s new screwball affect his statistics? Does he pitch better at night? Whatever they learn, they can feed back into the model, refining it. That’s how trustworthy models operate. They maintain a constant back-and-forth with whatever in the world they’re trying to understand or predict. Conditions change, and so must the model.
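The feedback loop described above can be sketched in a few lines. This is an assumption-laden toy, not a description of any team's tooling: the numbers are invented, the Brier score is one common way to grade probability forecasts (the text does not name a metric), and `update_rate` is a hypothetical helper that simply folds new results into a running hit rate.

```python
def brier_score(predictions, outcomes):
    """Mean squared gap between predicted probabilities and 0/1 outcomes.

    Lower is better; 0.0 would mean every forecast was exactly right.
    """
    pairs = list(zip(predictions, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


def update_rate(prior_hits, prior_at_bats, new_outcomes):
    """Fold new at-bats (1 = hit, 0 = out) into a running hit rate."""
    hits = prior_hits + sum(new_outcomes)
    at_bats = prior_at_bats + len(new_outcomes)
    return hits / at_bats


# Invented scenario: the model expected right-handed batters to hit .400
# against a left-handed reliever, based on 8 hits in 20 earlier at-bats.
predicted = 8 / 20

# What actually happened in the next 10 at-bats: he mowed them down.
outcomes = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]

before = brier_score([predicted] * len(outcomes), outcomes)
revised = update_rate(8, 20, outcomes)
after = brier_score([revised] * len(outcomes), outcomes)

print(f"old estimate {predicted:.3f}, score {before:.3f}")
print(f"new estimate {revised:.3f}, score {after:.3f}")
```

Scoring the revised estimate on the same at-bats that produced it flatters the model; an honest check would grade it on games that have not happened yet, which is exactly what the daily stream of new results allows.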

Now, you may look at the baseball model, with its thousands of changing variables, and wonder how we could even be comparing it to the model used to evaluate teachers in Washington, D.C., schools. In one of them, an entire sport is modeled in fastidious detail and updated continuously. The other, while cloaked in mystery, appears to lean heavily on a handful of test results from one year to the next. Is that really a model?

The answer is yes. A model, after all, is nothing more than an abstract representation of some process, be it a baseball game, an oil company’s supply chain, a foreign government’s actions, or a movie theater’s attendance. Whether it’s running in a computer program or in our head, the model takes what we know and uses it to predict responses in various situations. All of us carry thousands of models in our heads. They tell us what to expect, and they guide our decisions.

Here’s an informal model I use every day. As a mother of three, I cook the meals at home—my husband, bless his heart, cannot remember to put salt in pasta water. Each night when I begin to cook a family meal, I internally and intuitively model everyone’s appetite. I know that one of my sons loves chicken (but hates hamburgers), while another will eat only the pasta (with extra grated parmesan cheese). But I also have to take into account that people’s appetites vary from day to day, so a change can catch my model by surprise. There’s some unavoidable uncertainty involved.

The input to my internal cooking model is the information I have about my family, the ingredients I have on hand or I know are available, and my own energy, time, and ambition. The output is how and what I decide to cook. I evaluate the success of a meal by how satisfied my family seems at the end of it, how much they’ve eaten, and how healthy the food was. Seeing how well it is received and how much of it is enjoyed allows me to update my model for the next time I cook. The updates and adjustments make it what statisticians call a “dynamic model.”

Over the years I’ve gotten pretty good at making meals for my family, I’m proud to say. But what if my husband and I go away for a week, and I want to explain my system to my mom so she can fill in for me? Or what if my friend who has kids wants to know my methods? That’s when I’d start to formalize my model, making it much more systematic and, in some sense, mathematical. And if I were feeling ambitious, I might put it into a computer program.

Ideally, the program would include all of the available food options, their nutritional value and cost, and a complete database of my family’s tastes: each individual’s preferences and aversions. It would be hard, though, to sit down and summon all that information off the top of my head. I’ve got loads of memories of people grabbing seconds of asparagus or avoiding the string beans. But they’re all mixed up and hard to formalize in a comprehensive list.

The better solution would be to train the model over time, entering data every day on what I’d bought and cooked and noting the responses of each family member. I would also include parameters, or constraints. I might limit the fruits and vegetables to what’s in season and dole out a certain amount of Pop-Tarts, but only enough to forestall an open rebellion. I also would add a number of rules. This one likes meat, this one likes bread and pasta, this one drinks lots of milk and insists on spreading Nutella on everything in sight.
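To make the pieces named above concrete (daily data, constraints, and an output that changes as data arrives), here is one way such a program might be sketched. Everything in it is hypothetical: the class `MealModel`, the 0-to-1 rating scale, the neutral default of 0.5 for untried foods, and the sample family data are assumptions made for illustration, not the author's program.

```python
from collections import defaultdict


class MealModel:
    """Toy dynamic model: learns tastes from logged meals, obeys constraints."""

    def __init__(self, in_season, max_treats_per_week):
        self.in_season = set(in_season)          # constraint: seasonal produce only
        self.max_treats = max_treats_per_week    # constraint: weekly treat cap
        self.ratings = defaultdict(list)         # (person, food) -> logged ratings

    def record(self, person, food, rating):
        """Log one response, from 0.0 (refused) to 1.0 (asked for seconds)."""
        self.ratings[(person, food)].append(rating)

    def preference(self, person, food):
        """Average logged rating, or a neutral 0.5 if the food is untried."""
        logged = self.ratings.get((person, food))
        if not logged:
            return 0.5
        return sum(logged) / len(logged)

    def allowed(self, food, kind, treats_served):
        """Apply the constraints to one menu item."""
        if kind == "produce":
            return food in self.in_season
        if kind == "treat":
            return treats_served < self.max_treats
        return True

    def suggest(self, person, menu, treats_served=0):
        """Return the allowed food this person is expected to like most."""
        candidates = [
            food for food, kind in menu.items()
            if self.allowed(food, kind, treats_served)
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda food: (self.preference(person, food), food))


model = MealModel(in_season={"asparagus"}, max_treats_per_week=2)
model.record("son_a", "chicken", 1.0)
model.record("son_a", "hamburger", 0.1)

menu = {
    "chicken": "main",
    "hamburger": "main",
    "string beans": "produce",
    "pop_tart": "treat",
}
print(model.suggest("son_a", menu, treats_served=2))
```

Each call to `record` shifts the averages that `suggest` relies on, which is the sense in which the model is dynamic: tomorrow's recommendation depends on how tonight's dinner went.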

WINNER
Euler Book Prize

“O’Neil’s book offers a frightening look at how algorithms are increasingly regulating people. . . . Her knowledge of the power and risks of mathematical models, coupled with a gift for analogy, makes her one of the most valuable observers of the continuing weaponization of big data. . . . [She] does a masterly job explaining the pervasiveness and risks of the algorithms that regulate our lives.”—The New York Times Book Review

“Weapons of Math Destruction is the Big Data story Silicon Valley proponents won’t tell. . . . [It] pithily exposes flaws in how information is used to assess everything from creditworthiness to policing tactics . . . a thought-provoking read for anyone inclined to believe that data doesn’t lie.”—Reuters

“This is a manual for the twenty-first century citizen, and it succeeds where other big data accounts have failed—it is accessible, refreshingly critical and feels relevant and urgent.”—Financial Times

“Insightful and disturbing.”—New York Review of Books

“Weapons of Math Destruction is an urgent critique of . . . the rampant misuse of math in nearly every aspect of our lives.”—Boston Globe

“A fascinating and deeply disturbing book.”—Yuval Noah Harari, author of Sapiens

“Illuminating . . . [O’Neil] makes a convincing case that this reliance on algorithms has gone too far.”—The Atlantic

“A nuanced reminder that big data is only as good as the people wielding it.”—Wired

“If you’ve ever suspected there was something baleful about our deep trust in data, but lacked the mathematical skills to figure out exactly what it was, this is the book for you.”—Salon

“O’Neil is an ideal person to write this book. She is an academic mathematician turned Wall Street quant turned data scientist who has been involved in Occupy Wall Street and recently started an algorithmic auditing company. She is one of the strongest voices speaking out for limiting the ways we allow algorithms to influence our lives. . . . While Weapons of Math Destruction is full of hard truths and grim statistics, it is also accessible and even entertaining. O’Neil’s writing is direct and easy to read—I devoured it in an afternoon.”—Scientific American

“Indispensable . . . Despite the technical complexity of its subject, Weapons of Math Destruction lucidly guides readers through these complex modeling systems. . . . O’Neil’s book is an excellent primer on the ethical and moral risks of Big Data and an algorithmically dependent world. . . . For those curious about how Big Data can help them and their businesses, or how it has been reshaping the world around them, Weapons of Math Destruction is an essential starting place.”—National Post

“Cathy O’Neil has seen Big Data from the inside, and the picture isn’t pretty. Weapons of Math Destruction opens the curtain on algorithms that exploit people and distort the truth while posing as neutral mathematical tools. This book is wise, fierce, and desperately necessary.”—Jordan Ellenberg, University of Wisconsin-Madison, author of How Not To Be Wrong

“O’Neil has become [a whistle-blower] for the world of Big Data . . . [in] her important new book. . . . Her work makes particularly disturbing points about how being on the wrong side of an algorithmic decision can snowball in incredibly destructive ways.”—Time

Additional formats

Weapons of Math Destruction

How Big Data Increases Inequality and Threatens Democracy

Author: Cathy O'Neil

$20.00 US

Paperback

Sep 05, 2017

Weapons of Math Destruction

How Big Data Increases Inequality and Threatens Democracy

Author: Cathy O'Neil

Audiobook Download

Sep 06, 2016

Other Books by this Author

The Shame Machine

Who Profits in the New Age of Humiliation

Author: Cathy O'Neil

$27.00 US

Hardcover

Mar 22, 2022

PENGUIN RANDOM HOUSE