Image: Polling
Peggy Bryant types information into a computer as she conducts a phone survey of political preferences for the research firm SRBI.
Special to msnbc.com

A hundred million Americans or so are about to vote for president. A thousand Americans are already telling the rest of us how that vote is likely to turn out. They are the 0.001 percent of us who offer their opinions when pollsters call. But how can those pollsters claim that so few Americans accurately predict what all of us are thinking? Call it science, plus or minus a few percent.

Say you have 100 million marbles in a jar. A very big jar. Some marbles are blue. Some are white. (And a few might be green.) You want to know how many of each are in the jar.

You can count them all. Or you can grab a small batch, count them out, and figure that your batch represents the whole. If you are really random about the way you pick your batch of marbles, 95 times out of 100, your batch will accurately represent the whole collection. Statisticians have fancy numbers to prove this is true. Decades of polling experience backs them up.

The trick is to be really neutral about picking your batch so that every marble in the jar has a chance of being picked. Stir the jar. Spill the marbles all over the place and, without looking, wade through the mess (try not to slip and fall on all those loose marbles) and pick up your sample. You should have a reasonably fair representation of your whole universe of marbles.

That’s the first scientific rule of polling. A representative sample will almost always accurately represent the whole universe, if everybody in that universe has a chance of being picked for the sample.
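
The "95 times out of 100" claim can be checked with a quick simulation. The sketch below is an illustration only, not any pollster's method: the 52 percent blue share, the batch size of 1,000, and the 3.1-point margin are assumed values, and the jar is treated as so large that each draw is independent.

```python
import random

def draw_batch(blue_share, batch_size, rng):
    """Return the fraction of blue marbles in one randomly drawn batch."""
    blue = sum(1 for _ in range(batch_size) if rng.random() < blue_share)
    return blue / batch_size

def coverage(blue_share=0.52, batch_size=1000, margin=0.031, trials=2000, seed=1):
    """Fraction of batches whose blue share lands within `margin` of the truth.

    All default values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(trials)
        if abs(draw_batch(blue_share, batch_size, rng) - blue_share) <= margin
    )
    return hits / trials

if __name__ == "__main__":
    print(f"{coverage():.1%} of batches landed within 3.1 points of the true share")
```

Run it with different seeds and the share of "good" batches should hover near 95 percent, which is the sense in which a random batch "almost always" represents the jar.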

Where to pick?
One good way to make sure everybody in America has a shot at participating is by randomly picking phone numbers, since just about everybody has a phone. But there are still key questions. How many calls do you need to make in order to be confident that your sample represents the whole? And does calling just any batch of Americans work, or do you have to target those calls?

The study of statistics helps answer the first question. A sample size of 600 marbles (er, voters) gives you an accuracy within 4 percent. Sample 1,000 and your accuracy improves to a margin of error of 3 percent. A sample of 1,500 gives you accuracy to within about 2.5 percent. Short of asking everybody, that’s about as accurate as you can get.
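
Those figures follow from the standard textbook formula for the margin of error of a simple random sample. The sketch below assumes a 95 percent confidence level (z = 1.96) and the worst case of an evenly split electorate (share = 0.5); the article does not say which formula its numbers come from, so treat this as one plausible reconstruction.

```python
import math

def margin_of_error(sample_size, z=1.96, share=0.5):
    """Margin of error for a simple random sample, as a fraction (0.03 = 3 points).

    z=1.96 corresponds to 95 percent confidence; share=0.5 is the worst case.
    Both defaults are assumptions, not values taken from the article.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)

if __name__ == "__main__":
    for n in (600, 1000, 1500):
        print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")
```

Under these assumptions the three sample sizes work out to about 4.0, 3.1 and 2.5 points. Because the margin shrinks with the square root of the sample size, each extra point of accuracy costs far more calls than the last.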

So let’s say you want a poll that’s good to within plus or minus 3 percent. Do you just pick 1,000 phone numbers completely at random? No, because there are different voting patterns by region and by state. So pollsters determine, from previous elections, how many people vote in each region of the country. Twenty-three percent of voters are in the East, 26 percent in the South, 31 percent in the Great Lakes/Central region, 20 percent in the West.

So you want to make sure that 23 percent of your 1,000 phone calls, 230, go to states in the East. Another 260 calls will go to the South, 310 calls will go to the central region and 200 calls will go to the West. Pollsters also break down the voter turnout by state, and make sure each state gets the appropriate number of calls.

The process of divvying up the calls by geographical area is called stratification. That’s what they mean when they say the poll was a stratified random sample.
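
The allocation step is simple arithmetic, sketched below with the regional shares quoted above. The function name and the rounding rule (leftover calls go to the regions with the largest fractional remainders) are assumptions for illustration; the article does not say how pollsters handle totals that do not divide evenly.

```python
# Regional turnout shares in percent, as quoted in the article.
REGION_SHARES = {"East": 23, "South": 26, "Great Lakes/Central": 31, "West": 20}

def allocate_calls(total_calls, shares_percent):
    """Split total_calls across regions in proportion to their turnout share."""
    exact = {region: total_calls * pct / 100 for region, pct in shares_percent.items()}
    calls = {region: int(value) for region, value in exact.items()}
    # Hand out any calls lost to rounding down, largest remainder first.
    leftover = total_calls - sum(calls.values())
    by_remainder = sorted(exact, key=lambda r: exact[r] - calls[r], reverse=True)
    for region in by_remainder[:leftover]:
        calls[region] += 1
    return calls

if __name__ == "__main__":
    for region, n in allocate_calls(1000, REGION_SHARES).items():
        print(f"{region}: {n} calls")
```

For 1,000 calls this reproduces the 230, 260, 310 and 200 split described above. The same function could be applied again inside each region to divide its calls among states.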

The meaning of ‘random’
Now comes the random part. To blindly end up with random numbers, some pollsters use databases that include every listed phone number in America. They program a computer to pick the numbers randomly. Some use computer programs that just randomly pick seven digits to go with the area codes the pollsters want to call. With either process, everybody in each state or region has an equal shot at being picked.
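
The second approach, picking seven digits at random for a chosen area code, might look something like the sketch below. It is a deliberate simplification: the function names are hypothetical, the 617 area code is only an example, and the code does not screen out digit combinations that are never assigned to real phone lines.

```python
import random

def random_phone_number(area_code, rng):
    """Build one number from an area code plus seven random digits."""
    digits = "".join(str(rng.randrange(10)) for _ in range(7))
    return f"({area_code}) {digits[:3]}-{digits[3:]}"

def dial_list(area_code, how_many, seed=None):
    """Return `how_many` distinct random numbers in the given area code."""
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < how_many:
        numbers.add(random_phone_number(area_code, rng))
    return sorted(numbers)

if __name__ == "__main__":
    for number in dial_list("617", 5, seed=42):
        print(number)
```

Because every seven-digit combination is equally likely, listed and unlisted numbers get the same chance of being dialed, which is the point of doing it this way.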

But Americans don’t differ just by where they live. Some are male, some female. Some are old, some young. There are differences in race, religion, education, income and, of course, different numbers of people registered with different political parties. Pollsters want to be able to compare their results with American reality. So when they do their poll, they don’t just ask “Who are you going to vote for?” They also ask “Who are you?”, as defined by the demographic characteristics just described.

After a poll is done, the initial results are grouped by these demographic categories. Let’s say that of the people responding to a poll, only 40 percent are women. The pollster adjusts the results from women up, and the results from men down, until they accurately match the American population. If only 2 percent of the respondents were Hispanic, the pollster juggles the Hispanic response up, and the other groups down, until everything matches “real life.” They adjust all their findings to accurately match America’s demographics in categories of age, race, religion, gender, income and education.

It may sound like a less-than-random tinkering with the numbers. But remember, everybody out there had their chance to be called when those random phone numbers were picked. These adjustments are done to more accurately reflect all the subparts of the overall universe of voters. You might call this fudging the numbers. Pollsters call it “weighting.”
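
In its simplest form, weighting means each group's answers count in proportion to its share of the population rather than its share of the sample. The sketch below uses the 40 percent women example from above; the population shares (52 percent women) and the candidate support figures are made-up numbers for illustration only, and real polls weight on several characteristics at once.

```python
def group_weights(sample_counts, population_shares):
    """Weight per respondent in each group: population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

def weighted_support(sample_counts, support_by_group, population_shares):
    """Overall support for a candidate after weighting each group."""
    weights = group_weights(sample_counts, population_shares)
    weighted_yes = sum(
        sample_counts[g] * weights[g] * support_by_group[g] for g in sample_counts
    )
    weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
    return weighted_yes / weighted_total

if __name__ == "__main__":
    sample = {"women": 400, "men": 600}          # 40 percent women, as in the example
    population = {"women": 0.52, "men": 0.48}    # hypothetical target shares
    support = {"women": 0.55, "men": 0.45}       # hypothetical poll answers
    print(group_weights(sample, population))
    print(f"weighted support: {weighted_support(sample, support, population):.1%}")
```

With these invented numbers, each woman's answer counts about 1.3 times and each man's about 0.8 times, which nudges the candidate's overall support from 49 percent unweighted to just over 50 percent.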

There are other factors that make a poll more or less accurate. Questions have to be fair and balanced. In addition, the people conducting the interviews have to ask the questions without any bias in their tone of voice. Research shows that people responding to a poll want to please the questioner by answering the way they think the questioner wants them to, so the questioners have to be carefully neutral.

It takes 7,000 to 8,000 phone numbers to get 1,000 useful responses. Some numbers aren’t working. At some, no one answers. And only a third of the people who answer agree to participate. But what those few people say, aided by a little mathematical jiggling, is how we know in advance how America is likely to vote.
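
Those numbers imply how often a dialed number reaches a live person. The back-of-the-envelope sketch below assumes that everyone who agrees to participate yields a useful response, which the article does not state; the function name is hypothetical.

```python
def implied_contact_rate(completes, numbers_dialed, share_agreeing):
    """Share of dialed numbers that must reach a person to yield `completes`.

    Assumes every person who agrees produces a usable response.
    """
    answered = completes / share_agreeing
    return answered / numbers_dialed

if __name__ == "__main__":
    for dialed in (7000, 8000):
        rate = implied_contact_rate(1000, dialed, 1 / 3)
        print(f"{dialed} numbers dialed: {rate:.1%} reach a person")
```

If one in three people who answer take part, 1,000 responses require about 3,000 answered calls, so roughly 38 to 43 percent of the dialed numbers have to connect with someone.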

David Ropeik is a longtime science journalist and currently serves as Director of Risk Communication at the Harvard Center for Risk Analysis.

© 2013 MSNBC Interactive.

