Big Or Confusing Numbers Require Statistics
We are faced every day with oceans of facts and figures. It is impossible to consider each fact individually, so we use "statistics" to deal with these piles of numbers. "Statistics" are numbers that describe, or summarize, groups of other numbers. The study of this type of analysis and description of unmanageable bunches of data is called "Statistics". How many people attended the Million Man March? (More on crowd numbers at the bottom of this post.)
Statistics Help Us See Patterns
Sometimes these patterns, the conclusions we derive from the raw information, are important. For example:
- Who won the election? (See earlier post Why Doesn't Every Vote Get Counted?.)
- What are the President's poll numbers? (Bush Poll Numbers -- Margin Of Error)
- Is it safe to launch? (Discussion of miscommunication that led to the Challenger disaster; illustrations are clearer at this PDF site.)
- Who committed the crime? (See "The Case Of The Careless Cab")
Bad Statistics = Bad Decisions
Statistics that are used improperly or misleadingly can cause you to misinterpret the underlying data, leading to bad decisions. (The examples below assume nobody is actually lying. Of course a lot of figures you read are just completely fake, but nobody has bothered to verify them.)
Example 1
Suppose you are listening to three political candidates, and you want to vote for the one who is most likely to work to preserve the environment. Candidate Able says she voted for green legislation 20 times in her last term in office. Candidate Baker says he voted for 80% of the green bills that were proposed during his last term. Candidate Charlie says she has voted for more green legislation than either Able or Baker.
Before you vote you might want to know that:
- Although Able voted green 20 times, she voted against green legislation 100 times. She neglects to mention this.
- Although Baker voted for 80% of the green bills proposed, he voted against the most important and significant bills. He has padded his figures with many minor measures that might be considered environmental.
- Candidate Charlie has been in the legislature for much longer than either Able or Baker. In her earlier terms she voted for many pieces of green legislation, but more recently she has voted against all green measures.
Better keep looking for a candidate friendly to the environment.
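To see how a raw count or an unweighted percentage can flatter a record, here is a minimal Python sketch. Able's 20 and 100 come from the example above. Baker's bill counts and the "importance weights" are hypothetical numbers, chosen only so that he supports 80% of bills while opposing the ones that matter most.

```python
def support_rate(votes_for: float, votes_against: float) -> float:
    """Share of (possibly weighted) green votes that were cast in favor."""
    total = votes_for + votes_against
    if total == 0:
        raise ValueError("no votes recorded")
    return votes_for / total

# Able: 20 green votes sounds good until the 100 votes against are included.
able_rate = support_rate(20, 100)

# Baker (hypothetical numbers): 8 minor bills supported, 2 major bills opposed.
baker_raw = support_rate(8, 2)

# Assumption: each major bill counts 10 times as much as a minor one.
MINOR_WEIGHT, MAJOR_WEIGHT = 1, 10
baker_weighted = support_rate(8 * MINOR_WEIGHT, 2 * MAJOR_WEIGHT)

print(f"Able supported green bills {able_rate:.1%} of the time")
print(f"Baker, counting every bill equally: {baker_raw:.0%}")
print(f"Baker, weighting by importance:     {baker_weighted:.1%}")
```

Under these assumptions Able's rate is about 17%, and Baker's 80% drops to under 30% once the major bills are given more weight. The weights are a judgment call, which is exactly why a bare percentage deserves a second look.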
Example 2
The average pay at Company A is higher than the average pay at Company B. Which would you rather work for? Before you answer consider that the "average" can be misleading. The CEO at Company A makes ten times the salary of the CEO at Company B, thus "raising the average". All the other workers at Company A earn less than their counterparts at Company B.
So unless you are going to be CEO, you will get paid more at Company B.
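A quick Python sketch makes the point with made-up payrolls. All salary figures here are hypothetical; the only things taken from the example are that Company A's CEO earns ten times what Company B's CEO earns, and that everyone else at A earns less than their counterparts at B. The median (the middle salary) is one common alternative to the mean that is not dragged around by a single huge value.

```python
from statistics import mean, median

# Hypothetical payrolls: one CEO plus nine other workers at each company.
company_a = [10_000_000] + [40_000] * 9   # CEO earns 10x Company B's CEO
company_b = [1_000_000] + [50_000] * 9

mean_a, mean_b = mean(company_a), mean(company_b)
median_a, median_b = median(company_a), median(company_b)

print(f"Mean pay:   A = {mean_a:>12,.0f}   B = {mean_b:>12,.0f}")
print(f"Median pay: A = {median_a:>12,.0f}   B = {median_b:>12,.0f}")
```

With these numbers the mean at Company A is far higher than at Company B, yet the median worker at A earns 40,000 against 50,000 at B. When someone quotes an "average", it is worth asking which kind they mean.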
If You Don't Understand Statistics, You Can't Spot Bad Statistics
Statistics are widely used in news media, in government reports, and in many other information sources. The purpose should be to make the raw information easier to understand, but often misuse of statistics (sometimes deliberate, sometimes incompetent) causes misinformation or confusion.
Three things to keep in mind when you see statistics or other numbers in media articles or web sites:
- Reporters and their editors believe people like to see "facts" and figures, so they try to find some to put in.
- Reporters (like most other people) don't have a clue about statistics.
- Reporters and most other writers are on a deadline.
Three questions to ask about any number you see:
- What is the source of that number?
- How certain is that number? What is the range of uncertainty?
- What (possibly confused) calculations were used to arrive at that number?
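For poll numbers, the "range of uncertainty" is usually reported as a margin of error. The sketch below uses the standard textbook approximation for a proportion from a simple random sample at roughly 95% confidence. The poll result and sample size are hypothetical, and real polls often use weighting and other adjustments that this simple formula ignores.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p from a simple
    random sample of size n (z = 1.96 corresponds to about 95%)."""
    if n <= 0 or not 0 <= p <= 1:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return z * math.sqrt(p * (1 - p) / n)

approval = 0.40      # hypothetical poll result
sample_size = 1000   # hypothetical number of respondents

moe = margin_of_error(approval, sample_size)
low, high = approval - moe, approval + moe
print(f"{approval:.0%} plus or minus {moe:.1%} ({low:.1%} to {high:.1%})")
```

With these inputs the margin comes out to about three percentage points, so a reported "40%" is better read as "somewhere around 37% to 43%". A one-point change between two such polls tells you very little.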
Examples: Crowd Numbers
News stories about demonstrations or other events often include numbers representing the size of the crowd. Nobody actually enumerated the crowd, counting each member, so such numbers are always estimates.
- What is the source of the estimate? (Consider possible bias.)
- What method was used? (Each has its pros and cons.)
- Were the raw data further manipulated? (For example by averaging.)
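One common way to estimate a crowd is to multiply the area it occupies by an assumed density of people. The sketch below is not the method used by any particular source. The area and the three density values are hypothetical, chosen only to show how much the answer depends on the assumptions behind it.

```python
def crowd_estimate(area_m2: float, people_per_m2: float) -> int:
    """Estimate crowd size as occupied area times assumed density."""
    return round(area_m2 * people_per_m2)

occupied_area_m2 = 50_000  # hypothetical area covered by the crowd

# Hypothetical density assumptions, in people per square meter.
densities = {"loose": 1.0, "dense": 2.5, "packed": 4.0}

estimates = {
    label: crowd_estimate(occupied_area_m2, density)
    for label, density in densities.items()
}

for label, estimate in estimates.items():
    print(f"{label:>6}: about {estimate:,} people")
```

The same area yields anything from 50,000 to 200,000 people depending on the density assumed, which is why the source and the method matter as much as the headline number.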
"Counting the March" is an excellent site about using aerial imaging to count the Million Man March.
Additional Resources
Robert Niles's site on statistics for journalists. Excellent.
A good discussion of misinterpretation of statistics by the media, from Statistics Canada.
Another good site on the importance of proper use of statistics.
Numberwatch -- "All about the scares, scams, junk, panics, and flummery cooked up by the media, politicians, bureaucrats, so-called scientists and others who try to confuse you with wrong numbers."
At "stats", "We check out the facts and figures behind the news," and unsurprisingly, often find them misleading or wrong.
Nice article on innumeracy among journalists.
David Wheat's Science In Action site has articles about science and math in the real world, weird science, science news, unexpected connections, and other cool science stuff. There is an index of the articles by topic here.