Journal of Occupational & Environmental Medicine:
Slade, Martin D. MD, MPH, CIME, FAADEP, FACOEM
Yale School of Medicine New Haven, Conn
Books for review should be sent to Jonathan Borak, MD, Clinical Professor of Epidemiology and Medicine, Yale School of Medicine, 234 Church Street (6th Floor), New Haven, CT 06510; E-mail: email@example.com.
I have often thought of sitting down and writing a book about statistics for the nonstatistician; one that would not worry about formulae, but rather allow the reader to grasp the ideas that the mathematics captures. I believed that the essential book on this topic had yet to be published. I pictured walking the reader through both thought experiments and real-life examples to see the application of a variety of statistical concepts. In my imagined book, the examples would be simple enough to follow, yet interesting enough to make the reader eager to see how they unfolded. I am somewhat relieved to tell you that Charles Wheelan's book Naked Statistics: Stripping the Dread From the Data has allowed me to put away that thought, as he has written just that book.
Naked Statistics does an excellent job of introducing the topics of statistics in an orderly manner, often using examples that turn one's initial intuitions upside down. For example, sudden infant death syndrome is marked by the unfortunate and mysterious death of a seemingly healthy child during its sleep. Because of the lack of understanding about the mechanisms of such deaths, their occurrence often led to suspicions of child abuse. British prosecutors believed that they could differentiate between natural deaths and foul play by focusing their attention on families that had reported multiple cot deaths: “Since the incidence of cot death is rare, 1 in 8,500, the chance of having two cot deaths in the same family would be (1/8,500)², which is roughly 1 in 73 million. This reeks of foul play.”
That approach seems plausible, until the concept of statistical independence, or more correctly in this case statistical dependence, is understood. If there were some underlying link between those two sudden infant death syndrome events, for example, a shared genetic anomaly, then the two deaths would not be independent events. To the contrary, suffering one cot death would substantially increase the likelihood of there being a second one. Unfortunately, many parents were wrongly sent to prison, convicted by faulty statistical reasoning.
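The arithmetic behind this point can be sketched in a few lines of Python. The 1 in 8,500 figure comes from the quotation above; the conditional probability of a second death given a first (1 in 100) is a purely hypothetical value chosen for illustration and is taken neither from the book nor from any study.

```python
from fractions import Fraction

# Incidence of a single cot death, as quoted above.
p_single = Fraction(1, 8500)

# Prosecutors' reasoning: treat the two deaths as independent events,
# so the joint probability is simply the square of the single-event rate.
p_two_independent = p_single ** 2

# HYPOTHETICAL assumption for illustration only: if a shared cause exists,
# suppose the chance of a second death given a first is 1 in 100.
p_second_given_first = Fraction(1, 100)
p_two_dependent = p_single * p_second_given_first

# How many times more likely two deaths are under this assumed dependence.
ratio = p_two_dependent / p_two_independent

print(f"Independent: 1 in {p_two_independent.denominator:,}")
print(f"Dependent (assumed): 1 in {p_two_dependent.denominator:,}")
print(f"Ratio: {ratio}")
```

Under independence the result is 1 in 72,250,000, which the quotation rounds to roughly 73 million. Under the assumed dependence, two deaths in one family are 85 times more likely, which is why the multiplication rule cannot be applied without first establishing independence.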
The book does not limit itself to medical examples; it allows readers to understand the use of statistics throughout our world and its activities. For example, on what basis did the Joseph Schlitz Brewing Company decide that it could risk broadcasting a live taste test of its beer against a major competitor, Michelob, during halftime of the 1981 Super Bowl? Or, if you found yourself on Let's Make a Deal, should you switch your choice of prize doors (there were three doors and one of them had the grand prize) after being shown that the prize was not behind one of the two doors that you did not choose? Or, how is it that a mutual fund company can use bogus data to claim that three of its new funds have “consistently outperformed the S&P 500”? The answers to all of these questions and more such riddles can be found inside Charles Wheelan's book.
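Readers who want to test their intuition on the Let's Make a Deal question can do so with a short simulation. The following Python sketch is not from the book; it assumes the standard version of the puzzle, in which the host always opens a door that is neither the contestant's choice nor the prize. The function names, trial count, and seed are illustrative choices.

```python
import random

def play_round(switch, rng):
    """Play one game; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    choice = rng.choice(doors)
    # Assumption: the host always opens a door that is neither the
    # contestant's choice nor the prize door.
    opened = rng.choice([d for d in doors if d != choice and d != prize])
    if switch:
        # Move to the one remaining unopened door.
        choice = next(d for d in doors if d != choice and d != opened)
    return choice == prize

def win_rate(switch, trials=20000, seed=0):
    """Estimate the fraction of games won with the given strategy."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(trials))
    return wins / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"Win rate when staying:   {stay_rate:.3f}")
print(f"Win rate when switching: {switch_rate:.3f}")
```

Over many trials the staying strategy wins about one third of the time and the switching strategy about two thirds, which is the counterintuitive answer the puzzle is known for.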
Naked Statistics contains 13 chapters. The first chapter (“What's the Point?”) discusses the reasons that statistics is useful. Chapters 2 and 3 (“Descriptive Statistics: Who was the best baseball player of all time?” and “Deceptive Description: ‘He's got a great personality!’ and other true but grossly misleading statements”) discuss how numbers can give us insight into phenomena that we care about, but can also allow bad actors to commit statistical malfeasance to obscure their nefarious motives. Chapters 5, 5a, and 6 (“Basic Probability: Don't buy the extended warranty on your $99 printer,” “The Monty Hall Problem,” and “Problems With Probability: How overconfident math geeks nearly destroyed the global financial system”) focus on studying outcomes that involve elements of uncertainty. Chapter 8 (“The Central Limit Theorem: The LeBron James of statistics”) discusses the engine of statistics, that is, why it can be used to draw inferences. Chapters 9 and 10 (“Inference: Why my statistics professor thought I might have cheated” and “Polling: How we know that 64 percent of Americans support the death penalty (with a sampling error ± 3 percent)”) discuss the drawing of meaningful conclusions from observed data. Chapters 11 and 12 (“Regression Analysis: The miracle elixir” and “Common Regression Mistakes: The mandatory warning label”) discuss the determination of relationships between variables while controlling for other factors. The final chapter (“Program Evaluation: Will going to Harvard change your life?”) discusses the use of counterfactuals in statistical analyses.
In conveying the statistical concepts included in the book, the author illustrates his views with a wide variety of well-cited studies. Thus, motivated and curious readers can dig deeper into the examples and the statistics they illustrate. Also, to its credit, the book includes an appendix that contains a short description of four of the more commonly used statistical software packages along with approximate product prices.
In summary, Charles Wheelan has created something rare: a statistics book that is a pleasurable read, while also revealing the basic concepts of statistics along with their strengths and weaknesses. The book will allow people to better evaluate the inferences drawn by researchers, pollsters, and others. It should be considered essential reading for all those unfamiliar with statistics, especially in a world dominated by data.
Martin D. Slade
Yale School of Medicine
New Haven, Conn
Copyright © 2014 by the American College of Occupational and Environmental Medicine