
Dataclysm

Who We Are (When We Think No One's Looking)

A New York Times Bestseller

An audacious, irreverent investigation of human behavior—and a first look at a revolution in the making

 
Our personal data has been used to spy on us, hire and fire us, and sell us stuff we don’t need. In Dataclysm, Christian Rudder uses it to show us who we truly are.
 
For centuries, we’ve relied on polling or small-scale lab experiments to study human behavior. Today, a new approach is possible. As we live more of our lives online, researchers can finally observe us directly, in vast numbers, and without filters. Data scientists have become the new demographers.
 
In this daring and original book, Rudder explains how Facebook "likes" can predict, with surprising accuracy, a person’s sexual orientation and even intelligence; how attractive women receive exponentially more interview requests; and why you must have haters to be hot. He charts the rise and fall of America’s most reviled word through Google Search and examines the new dynamics of collaborative rage on Twitter. He shows how people express themselves, both privately and publicly. What is the least Asian thing you can say? Do people bathe more in Vermont or New Jersey? What do black women think about Simon & Garfunkel? (Hint: they don’t think about Simon & Garfunkel.) Rudder also traces human migration over time, showing how groups of people move from certain small towns to the same big cities across the globe. And he grapples with the challenge of maintaining privacy in a world where these explorations are possible.
 
Visually arresting and full of wit and insight, Dataclysm is a new way of seeing ourselves—a brilliant alchemy, in which math is made human and numbers become the narrative of our time.
© Victor G. Jeffreys II
Christian Rudder is a co-founder and former president of the dating site OkCupid, where he authored the popular OkTrends blog. He graduated from Harvard in 1998 with a degree in math and later served as creative director for SparkNotes. He has appeared on Dateline NBC and NPR's "All Things Considered" and his work has been written about in the New York Times and the New Yorker, among other places. He lives in Brooklyn with his wife and daughter.
1.

Wooderson’s Law

Up where the world is steep, like in the Andes, people use funicular railroads to get where they need to go—a pair of cable cars connected by a pulley far up the hill. The weight of the one car going down pulls the other up; the two vessels travel in counterbalance. I’ve learned that that’s what being a parent is like. If the years bring me low, they raise my daughter, and, please, so be it. I surrender gladly to the passage, of course, especially as each new moment gone by is another I’ve lived with her, but that doesn’t mean I don’t miss the days when my hair was actually all brown and my skin free of weird spots. My girl is two and I can tell you that nothing makes the arc of time more clear than the creases in the back of your hand as it teaches plump little fingers to count: one, two, tee.

But some guy having a baby and getting wrinkles is not news. You can start with whatever the Oil of Olay marketing department is running up the pole this week—as I’m writing it’s the idea of “color correcting” your face with a creamy beige paste that is either mud from the foothills of Alsace or the very essence of bullshit—and work your way back to myths of Hera’s jealous rage. People have been obsessed with getting older, and with getting uglier because of it, for as long as there’ve been people and obsession and ugliness. “Death and taxes” are our two eternals, right? And depending on the next government shutdown, the latter is looking less and less reliable. So there you go.

When I was a teenager—and it shocks me to realize I was closer then to my daughter’s age than to my current thirty-eight—I was really into punk rock, especially pop-punk. The bands were basically snottier and less proficient versions of Green Day. When I go back and listen to them now, the whole phenomenon seems supernatural to me: grown men brought together in trios and quartets by some unseen force to whine about girlfriends and what other people are eating. But at the time I thought these bands were the shit. And because they were too cool to have posters, I had to settle for arranging their album covers and flyers on my bedroom wall. My parents have long since moved—twice, in fact. I’m pretty sure my old bedroom is now someone else’s attic, and I have no idea where any of the paraphernalia I collected is. Or really what most of it even looked like. I can just remember it and smile, and wince.

Today an eighteen-year-old tacks a picture on his wall, and that wall will never come down. Not only will his thirty-eight-year-old self be able to go back, pick through the detritus, and ask, “What was I thinking?,” so can the rest of us, and so can researchers. Moreover, they can do it for all people, not just one guy. And, more still, they can connect that eighteenth year to what came before and what’s still to come, because the wall, covered in totems, follows him from that bedroom in his parents’ house to his dorm room to his first apartment to his girlfriend’s place to his honeymoon, and, yes, to his daughter’s nursery. Where he will proceed to paper it over in a billion updates of her eating mush.

A new parent is perhaps most sensitive to the milestones of getting older. It’s almost all you talk about with other people, and you get actual metrics at the doctor’s every few months. But the milestones keep coming long after babycenter.com and the pediatrician quit with the reminders. It’s just that we stop keeping track. Computers, however, have nothing better to do; keeping track is their only job. They don’t lose the scrapbook, or travel, or get drunk, or grow senile, or even blink. They just sit there and remember. The myriad phases of our lives, once gone but to memory and the occasional shoebox, are becoming permanent, and as daunting as that may be to everyone with a drunk selfie on Instagram, the opportunity for understanding, if handled carefully, is self-evident.

What I’ve just described, the wall and the long accumulation of a life, is what sociologists call longitudinal data—data from following the same people, over time—and I was speculating about the research of the future. We don’t have these capabilities quite yet because the Internet, as a pervasive human record, is still too young. As hard as it is to believe, even Facebook, touchstone and warhorse that it is, has only been big for about six years. It’s not even in middle school! Information this deep is still something we’re building toward, literally, one day at a time. In ten or twenty years, we’ll be able to answer questions like . . . well, for one, how much does it mess up a person to have every moment of her life, since infancy, posted for everyone else to see? But we’ll also know so much more about how friends grow apart or how new ideas percolate through the mainstream. I can see the long-term potential in the rows and columns of my databases, and we can all see it in, for example, the promise of Facebook’s Timeline: for the passage of time, data creates a new kind of fullness, if not exactly a new science.

Even now, in certain situations, we can find an excellent proxy, a sort of flash-forward to the possibilities. We can take groups of people at different points in their lives, compare them, and get a rough draft of life’s arc. This approach won’t work with music tastes, for example, because music itself also evolves through time, so the analysis has no control. But there are fixed universals that can support it, and, in the data I have, the nexus of beauty, sex, and age is one of them. Here the possibility already exists to mark milestones, as well as lay bare vanities and vulnerabilities that were perhaps till now just shades of truth. So doing, we will approach a topic that has consumed authors, painters, philosophers, and poets since those vocations existed, perhaps with less art (though there is an art to it), but with a new and glinting precision. As usual, the good stuff lies in the distance between thought and action, and I’ll show you how we find it.

I’ll start with the opinions of women—all the trends below are true across my sexual data sets, but for specificity’s sake, I’ll use numbers from OkCupid. This table lists, for a woman, the age of men she finds most attractive. If I’ve arranged it unusually, you’ll see in a second why.
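As a rough illustration of how a table like this might be assembled, the sketch below averages scores for each pairing of rater age and rated age, then keeps the rated age with the highest average. The record layout (rater age, rated age, score), the function name, and the sample votes are all assumptions invented for this example; they are not OkCupid's data or method.

```python
from collections import defaultdict


def most_attractive_age(votes):
    """Map each rater age to the rated age with the highest mean score.

    `votes` is an iterable of (rater_age, rated_age, score) tuples. This
    layout is a hypothetical one chosen for the illustration.
    """
    # (rater_age, rated_age) -> [sum of scores, number of votes]
    totals = defaultdict(lambda: [0.0, 0])
    for rater_age, rated_age, score in votes:
        cell = totals[(rater_age, rated_age)]
        cell[0] += score
        cell[1] += 1

    # rater_age -> (best mean seen so far, rated age that earned it)
    best = {}
    for (rater_age, rated_age), (total, count) in totals.items():
        mean = total / count
        if rater_age not in best or mean > best[rater_age][0]:
            best[rater_age] = (mean, rated_age)

    return {age: rated for age, (_, rated) in sorted(best.items())}


# Made-up votes for demonstration only; NOT real data.
toy_votes = [
    (20, 23, 5), (20, 23, 4), (20, 30, 2),
    (22, 24, 5), (22, 21, 3), (22, 24, 4),
]
table = most_attractive_age(toy_votes)
print(table)
```

With these invented votes, twenty-year-old raters land on twenty-three and twenty-two-year-old raters on twenty-four, simply because those pairings have the highest mean score in the toy input.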

Reading from the top, we see that twenty- and twenty-one-year-old women prefer twenty-three-year-old guys; twenty-two-year-old women like men who are twenty-four, and so on down through the years to women at fifty, who we see rate forty-six-year-olds the highest. This isn’t survey data, this is data built from tens of millions of preferences expressed in the act of finding a date, and even from just following along the first few entries, the gist of the table is clear: a woman wants a guy to be roughly as old as she is. Pick an age in black under forty, and the number in red is always very close. The broad trend comes through better when I let lateral space reflect the progression of the values in red:

That dotted diagonal is the “age parity” line, where the male and female years would be equal. It’s not a canonical math thing, just something I overlaid as a guide for your eye. Often there is an intrinsic geometry to a situation—it was the first science for a reason—and we’ll take advantage wherever possible. This particular line brings out two transitions, which coincide with big birthdays. The first pivot point is at thirty, where the trend of the red numbers—the ages of the men—crosses below the line, never to cross back. That’s the data’s way of saying that until thirty, a woman prefers slightly older guys; afterward, she likes them slightly younger. Then at forty, the progression breaks free of the diagonal, going practically straight down for nine years. That is to say, a woman’s tastes appear to hit a wall. Or a man’s looks fall off a cliff, however you want to think about it. If we want to pick the point where a man’s sexual appeal has reached its limit, it’s there: forty.
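The "crosses below the line, never to cross back" idea can be made concrete with a small sketch. It takes a mapping from a rater's age to her top-rated age and returns the first age from which the preferred age stays strictly below parity. The function name and the sample values are hypothetical, invented to mirror the pivot at thirty described above; they are not the book's figures.

```python
def parity_pivot(preferred):
    """Return the first age after which the preferred age stays below parity.

    `preferred` maps a rater's age to the age she rates highest. Returns
    None if the last entry is still at or above the parity line.
    """
    pivot = None
    for age in sorted(preferred):
        if preferred[age] < age:
            # Below the diagonal: remember where this run started.
            if pivot is None:
                pivot = age
        else:
            # Back on or above the diagonal: any earlier run does not count.
            pivot = None
    return pivot


# Invented values for demonstration only; NOT the table from the book.
toy_table = {27: 28, 28: 29, 29: 29, 30: 29, 31: 30, 32: 31}
print(parity_pivot(toy_table))
```

Resetting the candidate whenever the values climb back to the diagonal is what encodes "never to cross back"; a simpler check for the first value below parity would miss that condition.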

The two perspectives (of the woman doing the rating and of the man being rated) are two halves of a whole. As a woman gets older, her standards evolve, and from the man’s side, the rough 1:1 movement of the red numbers versus the black implies that as he matures, the expectations of his female peers mature as well—practically year-for-year. He gets older, and their viewpoint accommodates him. The wrinkles, the nose hair, the renewed commitment to cargo shorts—these are all somehow satisfactory, or at least offset by other virtues. Compare this to the free fall of scores going the other way, from men to women.

This graph—and it’s practically not even a graph, just a table with a couple columns—makes a statement as stark as its own negative space. A woman’s at her best when she’s in her very early twenties. Period. And really my plot doesn’t show that strongly enough. The four highest-rated female ages are twenty, twenty-one, twenty-two, and twenty-three for every group of guys but one. You can see the general pattern below, where I’ve overlaid shading for the top two quartiles (that is, top half) of ratings. I’ve also added some female ages as numbers in black on the bottom horizontal to help you navigate:

Again, the geometry speaks: the male pattern runs much deeper than just a preference for twenty-year-olds. And after he hits thirty, the latter half of our age range (that is, women over thirty-five) might as well not exist. Younger is better, and youngest is best of all, and if “over the hill” means the beginning of a person’s decline, a straight woman is over the hill as soon as she’s old enough to drink.

Of course, another way to put this focus on youth is that males’ expectations never grow up. A fifty-year-old man’s idea of what’s hot is roughly the same as a college kid’s, at least with age as the variable under consideration—if anything, men in their twenties are more willing to date older women. That pocket of middling ratings in the upper right of the plot, that’s your “cougar” bait, basically. Hikers just out enjoying a nice day, then bam.

In a mathematical sense, a man’s age and his sexual aims are independent variables: the former changes while the latter never does. I call this Wooderson’s law, in honor of its most famous proponent, Matthew McConaughey’s character from Dazed and Confused.

Unlike Wooderson himself, what men claim they want is quite different from the private voting data we’ve just seen. The ratings above were submitted without any specific prompt beyond “Judge this person.” But when you ask men outright to select the ages of women they’re looking for, you get much different results. The gray space below is what men tell us they want when asked:

Since I don’t think that anyone is intentionally misleading us when they give OkCupid their preferences—there’s little incentive to do that, since all you get then is a site that gives you what you know you don’t want—I see this as a statement of what men imagine they’re supposed to desire, versus what they actually do. The gap between the two ideas just grows over the years, although the tension seems to resolve in a kind of pathetic compromise when it’s time to stop voting and act, as you’ll see.

The next plot (the final one of this type we’ll look at) identifies the age with the greatest density of contact attempts. These most-messaged ages are described by the darkest gray squares drifting along the left-hand edge of the larger swath. Those three dark verticals in the graph’s lower half show the jumps in a man’s self-concept as he approaches middle age. You can almost see the gears turning. At forty-four, he’s comfortable approaching a woman as young as thirty-five. Then, one year later . . . he thinks better of it. While a nine-year age difference is fine, ten years is apparently too much.

It’s this kind of calculated no-man’s-land—the balance between what you want, what you say, and what you do—that real romance has to occupy: no matter how people might vote in private or what they prefer in the abstract, there aren’t many fifty-year-old men successfully pursuing twenty-year-old women. For one thing, social conventions work against it. For another, dating requires reciprocity. What one person wants is only half of the equation.
Finalist, 2014 L.A. Times Book Prize (Science and Tech)
An NPR Best Book of 2014
A Globe & Mail Best Book of 2014
A Brain Pickings Best Science Book of 2014
A Bloomberg Best Book of 2014
One of Hudson Booksellers' 5 Best Business Books of 2014
Goodreads Semifinalist for Best Nonfiction Book of the Year

"Most data-hyping books are vapor and slogans. This one has the real stuff: actual data and actual analysis taking place on the page. That’s something to be praised, loudly and at length. Praiseworthy, too, is Rudder’s writing, which is consistently zingy and mercifully free of Silicon Valley business gabble."
—Jordan Ellenberg, Washington Post

"As a researcher, Mr. Rudder clearly possesses the statistical acumen to answer the questions he has posed so well. As a writer, he keeps the book moving while fully exploring each topic, revealing his graphs and charts with both explanatory and narrative skill. Though he forgoes statistical particulars like p-values and confidence intervals, he gives an approachable, persuasive account of his data sources and results. He offers explanations of what the data can and cannot tell us, why it is sufficient or insufficient to answer some question we may have and, if the latter is the case, what sufficient data would look like. He shows you, in short, how to think about data."
—Wall Street Journal

"Rudder is the co-founder of the dating site OKCupid and the data scientist behind its now-legendary trend analyses, but he is also — as it becomes immediately clear from his elegant writing and wildly cross-disciplinary references — a lover of literature, philosophy, anthropology, and all the other humanities that make us human and that, importantly in this case, enhance and ennoble the hard data with dimensional insight into the richness of the human experience...an extraordinarily unusual and dimensional lens on what Carl Sagan memorably called ‘the aggregate of our joy and suffering.’"
—Maria Popova, Brain Pickings

"Fascinating, funny, and occasionally howl-inducing...[Rudder] is a quant with soul, and we’re lucky to have him."
—Elle

"There's another side of Big Data you haven't seen—not the one that promised to use our digital world to our advantage to optimize, monetize, or systematize every last part of our lives. It's the big data that rears its ugly head and tells us what we don't want to know. And that, as Christian Rudder demonstrates in his new book, Dataclysm, is perhaps an equally worthwhile pursuit. Before we heighten the human experience, we should understand it first."
—TIME

"At a time when consumers are increasingly wary of online tracking, Rudder makes a powerful argument in Dataclysm that the ability to tell so much about us from the trails we leave is as potentially useful as it is pernicious, and as educational as it may be unsettling. By explaining some of the insights he has gleaned from OkCupid and other social networks, he demystifies data-mining and sheds light on what, for better or for worse, it is now capable of."
—Financial Times

"Dataclysm is a well-written and funny look at what the numbers reveal about human behavior in the age of social media. It’s both profound and a bit disturbing, because, sad to say, we’re generally not the kind of people we like to think — or say — we are."
—Salon

"For all its data and its seemingly dating-specific focus, Dataclysm tells the story set forth by the book's subtitle, in an entertaining and accessible way. Informative, eye-opening, and (gasp) fun to read. Even if you’re not a giant stat head."
—Grantland

"[Rudder] doesn’t wring or clap his hands over the big-data phenomenon (see N.S.A., Google ads, that sneaky Fitbit) so much as plunge them into big data and attempt to pull strange creatures from the murky depths." 
—The New Yorker

"A hopeful and exciting journey into the heart of data collection...[Rudder's] book delivers both insider access and a savvy critique of the very machinery he is employed by. Since he's been in the data mines and has risen above them, Rudder becomes a singular and trustworthy guide."
—The Globe and Mail

"Compulsively readable — including for those with no particular affinity for numbers in and of themselves — and surprisingly personal. Starting with aggregates, Rudder posits, we can zoom in on the details of how we live, love, fight, work, play, and age; from numbers, we can derive narrative. There are few characters in the book, and few anecdotes — but the human story resounds throughout."
—Refinery29

"Rudder’s lively, clear prose…makes heady concepts understandable and transforms the book’s many charts into revealing truths…Rudder teaches us a bit about how wonderfully peculiar humans are, and how we go about hiding it."
—Flavorwire

"Dataclysm is all about what we can learn about human minds and hearts by analyzing the massive ongoing experiment that is the internet."
—Forbes

"The book reads as if it's written (well) by a curious child whose parents beg him or her to stop asking 'what-if' questions. Rudder examines the data of the website he helped create with unwavering curiosity. Every turn presents new questions to be answered, and he happily heads down the rabbit hole to resolve them."
—U.S. News

"A wonderful march through infographics created using data derived from the web…a fun, visual book—and a necessary one at that."
—The Independent (UK), 2014's Best Books on the Internet and Technology

"This is the best book that I've read on data in years, perhaps ever. If you want to understand how data is affecting the present and what it portends for the future, buy it now."
—Huffington Post

"Rudder draws from big data sets – Google searches, Twitter updates, illicitly obtained Facebook data passed shiftily between researchers like bags of weed – to draw out subtle patterns in politics, sexuality, identity and behaviour that are only revealed with distance and aggregation…Dataclysm will entertain those who want to know how machines see us. It also serves as a call to action, showing us how server farms running everything from home shopping to homeland security turn us into easily digested data products. Rudder's message is clear: in this particular sausage factory, we are the pigs."
—New Scientist

"Dataclysm offers both the satisfaction of confirming stereotypes and the fun of defying them…Such candor is disarming, as is Mr. Rudder’s puckish sense of humor." 
—Pittsburgh Post-Gazette

"Studying human behavior is a little like exploring a jungle: it's messy, hard, and easy to lose your way. But Christian Rudder is a consummate guide, revealing essential truths about who we are. Big Data has never been so fun."
—Dan Ariely, author of Predictably Irrational
 
"Dataclysm is a book full of juicy secrets—secrets about who we love, what we crave, why we like, and how we change each other’s minds and lives, often without even knowing it. Christian Rudder makes this mathematical narrative of our culture fun to read and even more fun to discuss: You will find yourself sharing these intriguing data-driven revelations with everyone you know."
—Jane McGonigal, author of Reality Is Broken
 
"In the first few pages of Dataclysm, Christian Rudder uses massive amounts of actual behavioral data to prove what I always believed in my heart: Belle and Sebastian is the whitest band ever. It only gets better from there."
—Aziz Ansari

"It’s unheard of for a book about Big Data to read like a guilty pleasure, but Dataclysm does. It’s a fascinating, almost voyeuristic look at who we really are and what we really want."
—Steven Strogatz, Schurman Professor of Applied Mathematics, Cornell University, author of The Joy of x

"Smart, revealing, and sometimes sobering, Dataclysm affirms what we probably suspected in our darker moments: When it comes to romance, what we say we want isn't what will actually make us happy. Christian Rudder has tapped the tremendous wealth of data that the Internet offers to tease out thoughts on topics like beauty and race that most of us wouldn’t cop to publicly. It's a riveting read, and Rudder is an affable and humane guide."
—Adelle Waldman, author of The Love Affairs of Nathaniel P.

"Christian Rudder has written a funny and profound book about important issues. Race, love, sex—you name it. Are we the sum of the data we produce? Read this book immediately and see if you can answer the question."
—Errol Morris

"Big Data can be like a 3D movie without 3D glasses—you know there's a lot going on but you're mainly just disoriented. We should feel fortunate to have an interpreter as skilled (and funny) as Christian Rudder. Dataclysm is filled with insights that boil down Big Data into byte-sized revelations."
—Michael Norton, Harvard Business School, coauthor of Happy Money

"With a zest for both the profound and the wacky, Rudder demonstrates how the information we provide individually tells a vast deal about who we are collectively. A visually engaging read and a fascinating topic make this a great choice not just for followers of Nate Silver and fans of infographics, but for just about anyone who, by participating in online activity, has contributed to the data set."
—Library Journal

"Demographers, entrepreneurs, students of history and sociology, and ordinary citizens alike will find plenty of provocations and, yes, much data in Rudder's well-argued, revealing pages."
—Kirkus Reviews



Excerpt

1.

Wooderson’s Law

Up where the world is steep, like in the Andes, people use funicular railroads to get where they need to go—­a pair of cable cars connected by a pulley far up the hill. The weight of the one car going down pulls the other up; the two vessels travel in counterbalance. I’ve learned that that’s what being a parent is like. If the years bring me low, they raise my daughter, and, please, so be it. I surrender gladly to the passage, of course, especially as each new moment gone by is another I’ve lived with her, but that doesn’t mean I don’t miss the days when my hair was actually all brown and my skin free of weird spots. My girl is two and I can tell you that nothing makes the arc of time more clear than the creases in the back of your hand as it teaches plump little fingers to count: one, two, tee.

But some guy having a baby and getting wrinkles is not news. You can start with whatever the Oil of Olay marketing department is running up the pole this week—­as I’m writing it’s the idea of “color correcting” your face with a creamy beige paste that is either mud from the foothills of Alsace or the very essence of bullshit—­and work your way back to myths of Hera’s jealous rage. People have been obsessed with getting older, and with getting uglier because of it, for as long as there’ve been people and obsession and ugliness. “Death and taxes” are our two eternals, right? And depending on the next government shutdown, the latter is looking less and less reliable. So there you go.

When I was a teenager—­and it shocks me to realize I was closer then to my daughter’s age than to my current thirty-­eight—­I was really into punk rock, especially pop-­punk. The bands were basically snottier and less proficient versions of Green Day. When I go back and listen to them now, the whole phenomenon seems supernatural to me: grown men brought together in trios and quartets by some unseen force to whine about girlfriends and what other people are eating. But at the time I thought these bands were the shit. And because they were too cool to have posters, I had to settle for arranging their album covers and flyers on my bedroom wall. My parents have long since moved—­twice, in fact. I’m pretty sure my old bedroom is now someone else’s attic, and I have no idea where any of the paraphernalia I collected is. Or really what most of it even looked like. I can just remember it and smile, and wince.

Today an eighteen-­year-­old tacks a picture on his wall, and that wall will never come down. Not only will his thirty-­eight-­year-­old self be able to go back, pick through the detritus, and ask, “What was I thinking?,” so can the rest of us, and so can researchers. Moreover, they can do it for all people, not just one guy. And, more still, they can connect that eighteenth year to what came before and what’s still to come, because the wall, covered in totems, follows him from that bedroom in his parents’ house to his dorm room to his first apartment to his girlfriend’s place to his honeymoon, and, yes, to his daughter’s nursery. Where he will proceed to paper it over in a billion updates of her eating mush.

A new parent is perhaps most sensitive to the milestones of getting older. It’s almost all you talk about with other people, and you get actual metrics at the doctor’s every few months. But the milestones keep coming long after babycenter .com and the pediatrician quit with the reminders. It’s just that we stop keeping track. Computers, however, have nothing better to do; keeping track is their only job. They don’t lose the scrapbook, or travel, or get drunk, or grow senile, or even blink. They just sit there and remember. The myriad phases of our lives, once gone but to memory and the occasional shoebox, are becoming permanent, and as daunting as that may be to everyone with a drunk selfie on Instagram, the opportunity for understanding, if handled carefully, is self-­evident.

What I’ve just described, the wall and the long accumulation of a life, is what sociologists call longitudinal data—­data from following the same people, over time—­and I was speculating about the research of the future. We don’t have these capabilities quite yet because the Internet, as a pervasive human record, is still too young. As hard as it is to believe, even Facebook, touchstone and warhorse that it is, has only been big for about six years. It’s not even in middle school! Information this deep is still something we’re building toward, literally, one day at a time. In ten or twenty years, we’ll be able to answer questions like . . . well, for one, how much does it mess up a person to have every moment of her life, since infancy, posted for everyone else to see? But we’ll also know so much more about how friends grow apart or how new ideas percolate through the mainstream. I can see the long-­term potential in the rows and columns of my databases, and we can all see it in, for example, the promise of Facebook’s Timeline: for the passage of time, data creates a new kind of fullness, if not exactly a new science.

Even now, in certain situations, we can find an excellent proxy, a sort of flash-­forward to the possibilities. We can take groups of people at different points in their lives, compare them, and get a rough draft of life’s arc. This approach won’t work with music tastes, for example, because music itself also evolves through time, so the analysis has no control. But there are fixed universals that can support it, and, in the data I have, the nexus of beauty, sex, and age is one of them. Here the possibility already exists to mark milestones, as well as lay bare vanities and vulnerabilities that were perhaps till now just shades of truth. So doing, we will approach a topic that has consumed authors, painters, philosophers, and poets since those vocations existed, perhaps with less art (though there is an art to it), but with a new and glinting precision. As usual, the good stuff lies in the distance between thought and action, and I’ll show you how we find it.

I’ll start with the opinions of women—all the trends below are true across my sexual data sets, but for specificity’s sake, I’ll use numbers from OkCupid. This table lists, for a woman, the age of men she finds most attractive. If I’ve arranged it unusually, you’ll see in a second why.

Reading from the top, we see that twenty- and twenty-one-year-old women prefer twenty-three-year-old guys; twenty-two-year-old women like men who are twenty-four, and so on down through the years to women at fifty, who we see rate forty-six-year-olds the highest. This isn’t survey data, this is data built from tens of millions of preferences expressed in the act of finding a date, and even from just following along the first few entries, the gist of the table is clear: a woman wants a guy to be roughly as old as she is. Pick an age in black under forty, and the number in red is always very close. The broad trend comes through better when I let lateral space reflect the progression of the values in red:

That dotted diagonal is the “age parity” line, where the male and female years would be equal. It’s not a canonical math thing, just something I overlaid as a guide for your eye. Often there is an intrinsic geometry to a situation—it was the first science for a reason—and we’ll take advantage wherever possible. This particular line brings out two transitions, which coincide with big birthdays. The first pivot point is at thirty, where the trend of the red numbers—the ages of the men—crosses below the line, never to cross back. That’s the data’s way of saying that until thirty, a woman prefers slightly older guys; afterward, she likes them slightly younger. Then at forty, the progression breaks free of the diagonal, going practically straight down for nine years. That is to say, a woman’s tastes appear to hit a wall. Or a man’s looks fall off a cliff, however you want to think about it. If we want to pick the point where a man’s sexual appeal has reached its limit, it’s there: forty.
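For readers who like to see the arithmetic, the parity-line reading can be sketched in a few lines of Python. This is only an illustration: the four age pairs are the ones quoted in the text above, and the name `parity_offset` is hypothetical, not anything from OkCupid's own code.

```python
# Preferred male age by female age. These are only the pairs quoted in the
# text above; the rest of the table is not reproduced here.
PREFERRED_MALE_AGE = {20: 23, 21: 23, 22: 24, 50: 46}


def parity_offset(preferred):
    """Return {woman's age: preferred man's age minus her own age}.

    Zero would put the point on the age-parity diagonal, a positive value
    means she prefers older men, and a negative value means younger men.
    """
    return {age: man_age - age for age, man_age in preferred.items()}


offsets = parity_offset(PREFERRED_MALE_AGE)
for age in sorted(offsets):
    print(f"woman {age}: preferred man is {offsets[age]:+d} years from parity")
```

A positive offset sits above the diagonal and a negative one below it, which is the crossing described at thirty.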

The two perspectives (of the woman doing the rating and of the man being rated) are two halves of a whole. As a woman gets older, her standards evolve, and from the man’s side, the rough 1:1 movement of the red numbers versus the black implies that as he matures, the expectations of his female peers mature as well—practically year-for-year. He gets older, and their viewpoint accommodates him. The wrinkles, the nose hair, the renewed commitment to cargo shorts—these are all somehow satisfactory, or at least offset by other virtues. Compare this to the free fall of scores going the other way, from men to women.

This graph—and it’s practically not even a graph, just a table with a couple columns—makes a statement as stark as its own negative space. A woman’s at her best when she’s in her very early twenties. Period. And really my plot doesn’t show that strongly enough. The four highest-rated female ages are twenty, twenty-one, twenty-two, and twenty-three for every group of guys but one. You can see the general pattern below, where I’ve overlaid shading for the top two quartiles (that is, top half) of ratings. I’ve also added some female ages as numbers in black on the bottom horizontal to help you navigate:
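The "top half" shading can be sketched as a median cutoff applied to one group of raters. This is a minimal sketch under stated assumptions: the ratings below are made-up placeholders, not OkCupid data, and `top_half` is a hypothetical helper name.

```python
from statistics import median


def top_half(ratings_by_age):
    """Return the ages whose rating is strictly above the median rating."""
    cutoff = median(ratings_by_age.values())
    return sorted(age for age, rating in ratings_by_age.items() if rating > cutoff)


# Placeholder ratings invented purely for illustration; NOT real data.
toy_ratings = {20: 4.0, 21: 3.8, 22: 3.5, 23: 3.1, 24: 2.9, 25: 2.6}

shaded = top_half(toy_ratings)
print(shaded)
```

Using a strict comparison means that an age sitting exactly on the median is left unshaded; that is a design choice of this sketch, not something the text specifies.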

Again, the geometry speaks: the male pattern runs much deeper than just a preference for twenty-year-olds. And after he hits thirty, the latter half of our age range (that is, women over thirty-five) might as well not exist. Younger is better, and youngest is best of all, and if “over the hill” means the beginning of a person’s decline, a straight woman is over the hill as soon as she’s old enough to drink.

Of course, another way to put this focus on youth is that males’ expectations never grow up. A fifty-year-old man’s idea of what’s hot is roughly the same as a college kid’s, at least with age as the variable under consideration—if anything, men in their twenties are more willing to date older women. That pocket of middling ratings in the upper right of the plot, that’s your “cougar” bait, basically. Hikers just out enjoying a nice day, then bam.

In a mathematical sense, a man’s age and his sexual aims are independent variables: the former changes while the latter never does. I call this Wooderson’s law, in honor of its most famous proponent, Matthew McConaughey’s character from Dazed and Confused.

Unlike Wooderson himself, what men claim they want is quite different from the private voting data we’ve just seen. The ratings above were submitted without any specific prompt beyond “Judge this person.” But when you ask men outright to select the ages of women they’re looking for, you get much different results. The gray space below is what men tell us they want when asked:

Since I don’t think that anyone is intentionally misleading us when they give OkCupid their preferences—there’s little incentive to do that, since all you get then is a site that gives you what you know you don’t want—I see this as a statement of what men imagine they’re supposed to desire, versus what they actually do. The gap between the two ideas just grows over the years, although the tension seems to resolve in a kind of pathetic compromise when it’s time to stop voting and act, as you’ll see.

The next plot (the final one of this type we’ll look at) identifies the age with the greatest density of contact attempts. These most-messaged ages are described by the darkest gray squares drifting along the left-hand edge of the larger swath. Those three dark verticals in the graph’s lower half show the jumps in a man’s self-concept as he approaches middle age. You can almost see the gears turning. At forty-four, he’s comfortable approaching a woman as young as thirty-five. Then, one year later . . . he thinks better of it. While a nine-year age difference is fine, ten years is apparently too much.

It’s this kind of calculated no-man’s-land—the balance between what you want, what you say, and what you do—that real romance has to occupy: no matter how people might vote in private or what they prefer in the abstract, there aren’t many fifty-year-old men successfully pursuing twenty-year-old women. For one thing, social conventions work against it. For another, dating requires reciprocity. What one person wants is only half of the equation.

Awards

  • FINALIST | 2014
    L.A. Times Book Prize (Science and Tech)

Praise

An NPR Best Book of 2014
A Globe & Mail Best Book of 2014
A Brain Pickings Best Science Book of 2014
A Bloomberg Best Book of 2014
One of Hudson Booksellers' 5 Best Business Books of 2014
Goodreads Semifinalist for Best Nonfiction Book of the Year
Finalist for the Los Angeles Times Book Prize

"Most data-hyping books are vapor and slogans. This one has the real stuff: actual data and actual analysis taking place on the page. That’s something to be praised, loudly and at length. Praiseworthy, too, is Rudder’s writing, which is consistently zingy and mercifully free of Silicon Valley business gabble."
—Jordan Ellenberg, Washington Post

"As a researcher, Mr. Rudder clearly possesses the statistical acumen to answer the questions he has posed so well. As a writer, he keeps the book moving while fully exploring each topic, revealing his graphs and charts with both explanatory and narrative skill. Though he forgoes statistical particulars like p-values and confidence intervals, he gives an approachable, persuasive account of his data sources and results. He offers explanations of what the data can and cannot tell us, why it is sufficient or insufficient to answer some question we may have and, if the latter is the case, what sufficient data would look like. He shows you, in short, how to think about data."
—Wall Street Journal

"Rudder is the co-founder of the dating site OKCupid and the data scientist behind its now-legendary trend analyses, but he is also — as it becomes immediately clear from his elegant writing and wildly cross-disciplinary references — a lover of literature, philosophy, anthropology, and all the other humanities that make us human and that, importantly in this case, enhance and ennoble the hard data with dimensional insight into the richness of the human experience...an extraordinarily unusual and dimensional lens on what Carl Sagan memorably called ‘the aggregate of our joy and suffering.’"
—Maria Popova, Brain Pickings

"Fascinating, funny, and occasionally howl-inducing...[Rudder] is a quant with soul, and we’re lucky to have him."
—Elle
 
"There's another side of Big Data you haven't seen—not the one that promised to use our digital world to our advantage to optimize, monetize, or systematize every last part of our lives. It's the big data that rears its ugly head and tells us what we don't want to know. And that, as Christian Rudder demonstrates in his new book, Dataclysm, is perhaps an equally worthwhile pursuit. Before we heighten the human experience, we should understand it first."
—TIME

"At a time when consumers are increasingly wary of online tracking, Rudder makes a powerful argument in Dataclysm that the ability to tell so much about us from the trails we leave is as potentially useful as it is pernicious, and as educational as it may be unsettling. By explaining some of the insights he has gleaned from OkCupid and other social networks, he demystifies data-mining and sheds light on what, for better or for worse, it is now capable of."
—Financial Times

"Dataclysm is a well-written and funny look at what the numbers reveal about human behavior in the age of social media. It’s both profound and a bit disturbing, because, sad to say, we’re generally not the kind of people we like to think — or say — we are."
—Salon

"For all its data and its seemingly dating-specific focus, Dataclysm tells the story set forth by the book's subtitle, in an entertaining and accessible way. Informative, eye-opening, and (gasp) fun to read. Even if you’re not a giant stat head."
—Grantland

"[Rudder] doesn’t wring or clap his hands over the big-data phenomenon (see N.S.A., Google ads, that sneaky Fitbit) so much as plunge them into big data and attempt to pull strange creatures from the murky depths."
—The New Yorker

"A hopeful and exciting journey into the heart of data collection...[Rudder's] book delivers both insider access and a savvy critique of the very machinery he is employed by. Since he's been in the data mines and has risen above them, Rudder becomes a singular and trustworthy guide."
—The Globe and Mail

"Compulsively readable — including for those with no particular affinity for numbers in and of themselves — and surprisingly personal. Starting with aggregates, Rudder posits, we can zoom in on the details of how we live, love, fight, work, play, and age; from numbers, we can derive narrative. There are few characters in the book, and few anecdotes — but the human story resounds throughout."
—Refinery29

"Rudder’s lively, clear prose…makes heady concepts understandable and transforms the book’s many charts into revealing truths…Rudder teaches us a bit about how wonderfully peculiar humans are, and how we go about hiding it."
—Flavorwire

"Dataclysm is all about what we can learn about human minds and hearts by analyzing the massive ongoing experiment that is the internet."
—Forbes

"The book reads as if it's written (well) by a curious child whose parents beg him or her to stop asking 'what-if' questions. Rudder examines the data of the website he helped create with unwavering curiosity. Every turn presents new questions to be answered, and he happily heads down the rabbit hole to resolve them."
—U.S. News

"A wonderful march through infographics created using data derived from the web…a fun, visual book—and a necessary one at that."
—The Independent (UK), 2014's Best Books on the Internet and Technology

"This is the best book that I've read on data in years, perhaps ever. If you want to understand how data is affecting the present and what it portends for the future, buy it now."
—Huffington Post

"Rudder draws from big data sets – Google searches, Twitter updates, illicitly obtained Facebook data passed shiftily between researchers like bags of weed – to draw out subtle patterns in politics, sexuality, identity and behaviour that are only revealed with distance and aggregation…Dataclysm will entertain those who want to know how machines see us. It also serves as a call to action, showing us how server farms running everything from home shopping to homeland security turn us into easily digested data products. Rudder's message is clear: in this particular sausage factory, we are the pigs."
—New Scientist

"Dataclysm offers both the satisfaction of confirming stereotypes and the fun of defying them…Such candor is disarming, as is Mr. Rudder’s puckish sense of humor."
—Pittsburgh Post-Gazette

"Studying human behavior is a little like exploring a jungle: it's messy, hard, and easy to lose your way. But Christian Rudder is a consummate guide, revealing essential truths about who we are. Big Data has never been so fun."
—Dan Ariely, author of Predictably Irrational
 
"Dataclysm is a book full of juicy secrets—secrets about who we love, what we crave, why we like, and how we change each other’s minds and lives, often without even knowing it. Christian Rudder makes this mathematical narrative of our culture fun to read and even more fun to discuss: You will find yourself sharing these intriguing data-driven revelations with everyone you know."
—Jane McGonigal, author of Reality Is Broken
 
"In the first few pages of Dataclysm, Christian Rudder uses massive amounts of actual behavioral data to prove what I always believed in my heart: Belle and Sebastian is the whitest band ever. It only gets better from there."
—Aziz Ansari

"It’s unheard of for a book about Big Data to read like a guilty pleasure, but Dataclysm does. It’s a fascinating, almost voyeuristic look at who we really are and what we really want."
—Steven Strogatz, Schurman Professor of Applied Mathematics, Cornell University, author of The Joy of x

"Smart, revealing, and sometimes sobering, Dataclysm affirms what we probably suspected in our darker moments: When it comes to romance, what we say we want isn't what will actually make us happy. Christian Rudder has tapped the tremendous wealth of data that the Internet offers to tease out thoughts on topics like beauty and race that most of us wouldn’t cop to publicly. It's a riveting read, and Rudder is an affable and humane guide."
—Adelle Waldman, author of The Love Affairs of Nathaniel P.

"Christian Rudder has written a funny and profound book about important issues. Race, love, sex—you name it. Are we the sum of the data we produce? Read this book immediately and see if you can answer the question."
—Errol Morris

"Big Data can be like a 3D movie without 3D glasses—you know there's a lot going on but you're mainly just disoriented. We should feel fortunate to have an interpreter as skilled (and funny) as Christian Rudder. Dataclysm is filled with insights that boil down Big Data into byte-sized revelations."
—Michael Norton, Harvard Business School, coauthor of Happy Money

"With a zest for both the profound and the wacky, Rudder demonstrates how the information we provide individually tells a vast deal about who we are collectively. A visually engaging read and a fascinating topic make this a great choice not just for followers of Nate Silver and fans of infographics, but for just about anyone who, by participating in online activity, has contributed to the data set."
—Library Journal

"Demographers, entrepreneurs, students of history and sociology, and ordinary citizens alike will find plenty of provocations and, yes, much data in Rudder's well-argued, revealing pages."
—Kirkus Reviews
