The Devil in the Data
April 2, 2018
Daniel Levitin has been many different things at many different times. He studied applied mathematics at MIT and enrolled at the Berklee College of Music; in his thirties he turned to cognitive psychology, studying at Stanford, the University of Oregon and Berkeley and making pioneering contributions to that field. He worked as a stand-up comic and wrote jokes for other comedians. He played the piano at 4, the clarinet at 8, the saxophone at 12, and the guitar at 16; by 17, he was writing songs. He was part of several bands and record labels. He consulted for AT&T, venture capital firms, record labels, and the US Navy. He has written articles for Billboard, Grammy, the New York Times, the Wall Street Journal, the Washington Post, and the Atlantic, and he wrote three bestsellers before his biggest one: the book under review.
Yet if he had done nothing else in his whole life except write this book, it would have been a life well lived.
The Field Guide to Lies and Statistics is a vital book. I strongly recommend that it be made required reading for anyone who wants to open a Facebook or Twitter account, or who wants to watch the news on TV. It is a survival kit for our turbulent times, a safeguard against fake news, false advertisements, and plain old misinformation.
Why do we need a survival kit?
It isn’t that people were smarter in the past, or that the fewer sources of information that existed then were more trustworthy than the ones available today. If you listened to only one person, and he consistently and systematically lied to you, well, you ended up believing a lot of things that weren’t true. Especially if that one person couched his pronouncements in an impregnable cocoon of infallibility, protected by religion, tradition, power, or what passed for cutting-edge science at the time. It usually helped that everyone you spoke to believed the same things you did.
Modern scientific procedure was invented precisely to guard against such tyranny. Consistent, impersonal methods were devised to test the truth of statements through measurement and logical deduction. These methods were applied initially to questions relating to the physical world, and later, as our confidence grew, to questions relating to human life – historical, forensic, economic, medical, psychometric, meteorological, political…you name it. We used them to test our experts. “Because I say so” didn’t cut it any more. Just bring the data, we said. Prove it to us. We fought superstition with data, and we thought we had won.
Flash forward to today. We are now super-saturated with data. Facts, figures, conclusions, opinions and predictions drip off every screen we glance at. Data rubs off on us, goes into tiny nooks and crevices in our brain that we hardly knew existed, and sits there as bias, infecting our thoughts and words, and using them to infect others and propagate. We are a colonized species today: biases have subjugated and domesticated us. They use our brains for homes, they use our technology to ride from one place to another at the speed of light, and they use us to fight their battles against each other. Their weapon of choice in these battles? Data. Those sneaky little double agents.
It turns out that you can misrepresent data very easily. As I’ve said before in these pages, most people have a very poor instinct for real-world data – what it looks like, at what rate it changes, what it will look like in the future. We’re also not very good at using techniques like inference and deduction to draw conclusions from data: we make mistakes while applying them, or we reserve our scientific skepticism only for statements we are already biased against.
And when we move out of the realm of arithmetic and into that of statistical distributions and probabilities, we’re way out of our depth. We expect certainty and simple linear relationships. Most importantly, many of us feel intimidated by numbers and readily accept anything said in complicated words by people in white lab coats.
Yet testing the veracity of practically any statement about our societies requires a grasp of statistics. Every one of us needs this, because on social media everyone is an expert. Data can be retweeted, forwarded or shared from a million sources, and it is far easier to ascertain the truth of the statement itself than to ascertain the expertise (or even identity) of its original author.
So, when a celebrity publishes a book in which he says, “The likelihood of death by pool (1 in 11,000) versus death by gun (1 in 1 million-plus) isn’t even close”, he is either deliberately misleading his readers or demonstrating a poor understanding of conditional probability and of the pitfalls of using crude arithmetic averages. And it isn’t about how many copies of his book he has sold, or what I say about him: you can work out for yourself whether he’s a charlatan. All you need is a high-school understanding of statistics, explained in layman’s language in Levitin’s book.
Yes, prejudice and superstition are still our masters, and they keep us in chains. But nobody else can fight this war of liberation for us. We need to cast off the shackles ourselves – and Levitin’s book tells us how.