11 October 2004

Do Polls Mean What the Headlines Say?

Here's a typical headline: "Kerry Opens Three-Point Lead on Bush--Poll". This headline lies. (See the whole article here.)

The article states that a tracking poll showed Kerry preferred by 47% of those asked, and Bush by 44%. Down near the bottom the article says "The poll of 1,214 likely voters was taken Friday through Sunday and has a margin of error of plus or minus 2.9 percentage points."

Here is what that "2.9" means. It is the half-width of the 95% confidence interval. A sample of people was polled. We want to know not just whom these 1,214 people prefer, but whom all likely voters prefer. All likely voters make up the "universe" that this "sample" represents. When we do the math based on the sample responses, we learn that we can be 95% confident that the result we would have gotten by polling the entire population of likely voters is within 2.9 percentage points of, in Kerry's case, 47%. That is, between 44.1% and 49.9% of the total universe of likely voters probably prefer Kerry, and between 41.1% and 46.9% of the whole universe of likely voters probably prefer Bush.
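For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes a simple random sample and the usual textbook formula for a 95% margin of error; the pollster's actual method is not given in the article, so the function names and the formula are my own illustration, not theirs. The textbook figure for 1,214 respondents comes out slightly under the reported 2.9 points, so the ranges below use the article's own number.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Textbook 95% margin of error, in percentage points.

    Assumes a simple random sample (an assumption; the article does
    not say how the pollster computed its figure).
    """
    return 100 * z * math.sqrt(proportion * (1 - proportion) / sample_size)

def interval(share, margin):
    """Return the (low, high) range around a poll share, in percent."""
    return (round(share - margin, 1), round(share + margin, 1))

# Textbook margin for the 1,214 likely voters in the poll.
textbook_moe = margin_of_error(1214)

# Ranges using the margin the article itself reports (2.9 points).
kerry_range = interval(47, 2.9)
bush_range = interval(44, 2.9)

print(f"Textbook margin of error: {textbook_moe:.2f} points")
print(f"Kerry: {kerry_range[0]}% to {kerry_range[1]}%")
print(f"Bush:  {bush_range[0]}% to {bush_range[1]}%")
```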

Note that these ranges overlap! Based on the data from this poll, it is entirely possible that 45% prefer Bush and 45% prefer Kerry, or that 44.5% prefer Kerry and 46% prefer Bush. This is what is known as "a statistical dead heat." The data from this sample DO NOT establish that Kerry is ahead of Bush. Just the opposite could be true.
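The overlap is easy to verify yourself. This is a small sketch, with helper names of my own choosing, that checks whether the two ranges overlap and whether the scenarios mentioned above fall inside them. It only tests whether a scenario is inside the stated ranges; it says nothing about how probable each scenario is.

```python
def ranges_overlap(a, b):
    """True if two (low, high) ranges share at least one point."""
    return max(a[0], b[0]) <= min(a[1], b[1])

def within(value, rng):
    """True if value lies inside the (low, high) range."""
    return rng[0] <= value <= rng[1]

# Poll shares plus or minus the reported 2.9-point margin.
kerry = (47 - 2.9, 47 + 2.9)
bush = (44 - 2.9, 44 + 2.9)

overlap = ranges_overlap(kerry, bush)

# Scenario 1: a 45-45 tie. Scenario 2: Kerry 44.5, Bush 46.
tie_possible = within(45, kerry) and within(45, bush)
bush_lead_possible = within(44.5, kerry) and within(46, bush)

print("Ranges overlap:", overlap)
print("45-45 tie fits the ranges:", tie_possible)
print("Kerry 44.5, Bush 46 fits the ranges:", bush_lead_possible)
```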

It is true that Kerry is ahead of Bush among this particular sample of likely voters. But we will never sample these same individuals again. The only reason they were polled was to try to find out the preferences of the wider universe of all voters.

Headline writers never seem to understand this. Therefore headlines on stories about poll results usually lie. You've got to DO THE MATH!
