When I taught research methods and statistics to graduate
students at Brooklyn College, if a student submitted a project like the daily
and weekly political polls featured during the recent presidential campaign,
the student would receive a low grade or fail the course. I would explain to
the student that at best his or her poll was what is called a pilot
investigation--a work in progress--to identify the issues and obstacles for
designing a valid piece of research.
The same can be said about many of the political polls. Their primary flaw was in the critical first link in the chain: the sampling, which refers to how the pollsters selected the people they queried and how many participants were in the final samples from which conclusions were drawn.
If the purpose of a poll was to assess preferences or intentions to vote for Donald Trump or Hillary Clinton in a particular state, the sample should have been large enough to reflect the diversity of the voting population--by age, religion, ethnicity, political affiliation, education, and income. Many samples were not. For example, according to the Independent Voter Network (IVN), the CNN polls did not have adequate representation of 18- to 34-year-old voters, a demographic of 75 million and the largest living U.S. generation, and Fox News polls notably under-sampled independent voters.
A striking example of inadequate sampling was the Nevada Suffolk University poll conducted in August 2016. It surveyed "500 likely voters" on a variety of issues, including whom they intended to vote for among the five candidates on the Nevada ballot--plus two additional choices: "none of these candidates" and "undecided." The results of this flawed poll were boldly headlined by CNN: "Nevada Poll: Clinton and Trump Neck and Neck."
Despite faulty sampling and other possible deficiencies, some polls did correctly predict the winner in many states and nationally. But this is not a vindication of the polls or necessarily something to cheer about. After all, there were only two viable candidates. Place the name Donald Trump on one piece of paper and Hillary Clinton on another and put the papers in a hat. Then ask a few thousand people to pick the winner out of the hat. In line with probability, about 50 percent of the picks will be correct. Similarly, place different narrow margins of victory for one or the other candidate (which historically is the usual outcome) in the hat. Many of the picks will be correct, some will be close to correct, and the ones that miss by a few points could be interpreted as close when the margin of error is factored in. But these outcomes can only be determined after the fact. For legitimate discussion of an ongoing political race during the campaign, it is essential to have trusted assessments based on valid research.
The average poll sample is 1,000 people, according to PollingReport.com. Obviously, a sample as small as 500, as in the Nevada poll, is not sufficient to predict a trend for the population of an entire state. For predicting a national trend, even a sample of 1,700 (as in several Pew polls) may be inadequate.
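To put rough numbers on those sample sizes, here is a minimal sketch of the textbook margin-of-error calculation. It assumes a simple random sample, a 95 percent confidence level (z = 1.96), and an evenly split race (p = 0.5); none of these assumptions comes from the polls themselves, and the function name `margin_of_error` is only illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumes a simple random sample of size n, a true share p,
    and a 95% confidence level (z = 1.96). Illustrative only.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes mentioned in the text: 500 (Nevada), 1,000 (average), 1,700 (Pew).
for n in (500, 1000, 1700):
    print(f"n = {n}: about +/- {margin_of_error(n) * 100:.1f} points")
```

Under these assumptions the figures come out to roughly 4.4, 3.1, and 2.4 percentage points. The formula covers sampling error only; it says nothing about who was left out of the sample or who declined to answer.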
When looked at from a scientific perspective, is it any wonder that most of the polls missed the mark?
The Washington Post reported that an analysis of 145 national and 14 state polls conducted during the week before the election "consistently overestimated Clinton's vote margin against Trump." Jon Krosnick, Professor of Political Science at Stanford University, told the Washington Post that he wasn't surprised at the inaccuracy of state polls because of flawed sampling; he added, "most state polls are not scientific."
And these polls look even murkier when their faulty sampling is probed further.
Polls typically use the highly respected technique of "random sampling." But none of the 2016 political polls actually accomplished random sampling, despite what may have been the intention of the pollsters. In random sampling a population for study is identified, such as all the registered voters in a state, or for a national poll all registered voters in the nation. The researcher then selects at random, let's say, every 20th person in a list or telephone directory for that entire population. The assumption is that the random selection is highly likely to include the range of diversity within the population, thus resulting in an unbiased and inclusive sample.
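The "every 20th person" selection described above can be sketched in a few lines. This is an illustration only: the list `voters` is made-up data standing in for a registration list, and the function name `every_kth` is hypothetical. Textbooks usually call this variant systematic sampling with a random starting point.

```python
import random

def every_kth(population, k=20, seed=None):
    """Pick every k-th entry from a list, starting at a random offset.

    The researcher, not the respondent, decides who is selected.
    """
    rng = random.Random(seed)
    start = rng.randrange(k)      # random starting position in 0..k-1
    return population[start::k]   # then every k-th person after that

# Hypothetical list of 1,000 registered voters.
voters = [f"voter_{i}" for i in range(1, 1001)]
sample = every_kth(voters, k=20, seed=1)
print(len(sample))  # 50 people selected from 1,000
```

The selection step is the easy part. Whether the 50 people chosen actually take part is a separate matter.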
A key factor that defines a truly random sample is that the researcher or pollster picks the sample, as in selecting every 20th person. All well and good. But what the polling reports don't tell you is that when pollsters call or canvass every 20th person by telephone, most people hang up, don't answer, or refuse to participate.
The American Association for Public Opinion Research reports that people are increasingly unwilling to participate in surveys. Wired Magazine has reported a dramatic decline in response to telephone polls, from a 72 percent response rate in 1980 to 0.9 percent in 2016 (Pew reported a low of 8 percent by 2014). Wired traces the sharp decline over the last decade to the widespread use of cell phones. Pollsters favor robo-dialing to landlines. But "By 2014, 60 percent of Americans used cell phones either most or all of the time, making it difficult or impossible for polling firms to reach three out of five Americans."
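A back-of-the-envelope calculation shows what those response rates mean in practice. The sketch below assumes a pollster wants 1,000 completed interviews and that every person contacted responds at the quoted rate; the target of 1,000 and the function name `calls_needed` are illustrative assumptions, not figures from the reports cited.

```python
import math
from fractions import Fraction

def calls_needed(target_completes, response_rate_pct):
    """Contacts required to reach a target number of completed interviews.

    response_rate_pct is a string such as "72" or "0.9" so the
    arithmetic stays exact rather than relying on floating point.
    """
    rate = Fraction(response_rate_pct) / 100
    return math.ceil(target_completes / rate)

# Response rates quoted in the text: 72% (1980), 8% (2014), 0.9% (2016).
for pct in ("72", "8", "0.9"):
    print(f"{pct}% response rate: {calls_needed(1000, pct):,} contacts")
```

At a 72 percent response rate, about 1,400 contacts yield 1,000 interviews. At 0.9 percent it takes more than 111,000, and the people who do answer have in effect selected themselves.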