Posts Tagged 'Presidential elections'

Let’s Make Research and Polling Great Again!

The day after the US Presidential election, we quickly wrote and posted about the market research industry’s failure to accurately predict the election.  Since this has been our widest-read post (by a factor of about 10!) we thought a follow-up was in order.

Some of what we predicted has come to pass. Pollsters are being defensive, claiming their polls really weren’t that far off, and are not reaching very deep to try to understand the core of why their predictions were poor. The industry has had a couple of confabs, where the major players have denied a problem exists.

We are at a watershed moment for our industry. Response rates continue to plummet, clients are losing confidence in the data we provide, and we are swimming in so much data our insights are often not able to find space to breathe. And the public has lost confidence in what we do.

Sometimes it is everyday conversations that can illuminate a problem. Recently, I was staying at an AirBnB in Florida. The host (Dan) was an ardent Trump supporter and at one point he asked me what I did for a living. When I told him I was a market researcher the conversation quickly turned to why the polls failed to accurately predict the winner of the election. By talking with Dan I quickly realized the implications of Election 2016 polling for our industry. He felt that we can now safely ignore all polls – on issues, approval ratings, voter preferences, etc.

I found myself getting defensive. After all, the polls weren’t off that much.  In fact, they were actually off by more in 2012 than in 2016 – the problem being that this time the polling errors resulted in an incorrect prediction. Surely we can still trust polls to give a good sense of what our citizenry thinks about the issues of the day, right?

Not according to Dan. He didn’t feel our political leaders should pay attention to the polls at all because they can’t be trusted.

I’ve even seen a new term for this bandied about:  poll denialism. It is a refusal to believe any poll results because of their past failures. Just the fact that this has been named should be scary enough for researchers.

This is unnerving not just to the market research industry, but to our democracy in general.  It is rarely stated overtly, but poll results are a key way political leaders keep in touch with the needs of the public, and they shape public policy a lot more than many think. Ignoring them is ignoring public opinion.

Market research remains closely associated with political polling. While I don’t think clients have become as mistrustful about their market research as the public has become about polling, clients likely have their doubts. Much of what we do as market researchers is much more complicated than election polling. If we can’t successfully predict who will be President, why would a client believe our market forecasts?

We are at a defining moment for our industry – a time when clients and suppliers will realize this is an industry that has gone adrift and needs a righting of the course. So what can we do to make research great again?  We have a few ideas.

  1. First and foremost, if you are a client, make greater demands for data quality. Nothing will stimulate the research industry more to fix itself than market forces – if clients stop paying for low quality data and information, suppliers will react.
  2. Slow down! There is a famous saying about all projects.  They have three elements that clients want:  a) fast, b) good, and c) cheap, and on any project you can choose two of these.  In my nearly three decades in this industry I have seen this dynamic change considerably. These days, “fast” is almost always trumping the other two factors.  “Good” has been pushed aside.  “Cheap” has always been important, but to be honest budget considerations don’t seem to be the main issue (MR spending continues to grow slowly). Clients are insisting that studies are conducted at a breakneck pace and data quality is suffering badly.
  3. Insist that suppliers defend their methodologies. I’ve worked for corporate clients, but also many academic researchers. I have found that a key difference between them becomes apparent during results presentations. Corporate clients are impatient and want us to go as quickly as possible over the methodology section and get right into the results.  Academics are the opposite. They dwell on the methodology and I have noticed if you can get an academic comfortable with your methods it is rare that they will doubt your findings. Corporate researchers need to understand the importance of a sound methodology and care more about it.
  4. Be honest about the limitations of your methodology. We often like to say that everything you were ever taught about statistics assumed a random sample and we haven’t seen a study in at least 20 years that can credibly claim to have one.  That doesn’t mean a study without a random sample isn’t valuable, it just means that we have to think through the biases and errors it could contain and how that can be relevant to the results we present. I think every research report should have a page after the methodology summary that lists off the study’s limitations and potential implications to the conclusions we draw.
  5. Stop treating respondents so poorly. I believe this is a direct consequence of the movement from telephone to online data collection. Back in the heyday of telephone research, if you fielded a survey that was too long or was challenging for respondents to answer, it wasn’t long until you heard from your interviewers just how bad your questionnaire was. In an online world, this feedback never gets back to the questionnaire author – and we subsequently beat up our respondents pretty badly.  I have been involved in at least 2,000 studies and about 1 million respondents.  If each study averages 15 minutes that implies that people have spent about 28 and a half years filling out my surveys.  It is easy to lose respect for that – but let’s not forget the tremendous amount of time people spend on our surveys. In the end, this is a large threat to the research industry, as if people won’t respond, we have nothing to sell.
  6. Stop using technology for technology’s sake. Technology has greatly changed our business. But, it doesn’t supplant the basics of what we do or allow us to ignore the laws of statistics.  We still need to reach a representative sample of people, ask them intelligent questions, and interpret what it means for our clients.  Tech has made this much easier and much harder at the same time.  We often seem to do things because we can and not because we should.
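
The respondent-time figure in point 5 above can be checked with a quick calculation. This is only a back-of-the-envelope sketch: the respondent count and survey length come from the text, while the average year length and the function name are assumptions made for illustration.

```python
# Rough check of the "28 and a half years" figure quoted in point 5.
RESPONDENTS = 1_000_000               # "about 1 million respondents" (from the text)
MINUTES_PER_SURVEY = 15               # "averages 15 minutes" (from the text)
MINUTES_PER_YEAR = 60 * 24 * 365.25   # assumption: average year length

def respondent_years(respondents, minutes_each):
    """Total time spent answering surveys, expressed in years."""
    return respondents * minutes_each / MINUTES_PER_YEAR

print(f"{respondent_years(RESPONDENTS, MINUTES_PER_SURVEY):.1f} years")
```

Under these assumptions the total comes to roughly 28.5 years of respondent time.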

The ultimate way to combat “poll denialism” in a “post-truth” world is to do better work, make better predictions, and deliver insightful interpretations. That is what we all strive to do, and it is more important than ever.

 

An Epic Fail: How Can Pollsters Get It So Wrong?

Perhaps the only bigger loser than Hillary Clinton in yesterday’s election was the polling industry itself. Those of us who conduct surveys for a living should be asking if we can’t even get something as simple as a Presidential election right, why should our clients have confidence in any data we provide?

First, a recap of how poorly the polls and pundits performed:

  • FiveThirtyEight’s model had Clinton’s likelihood of winning at 72%.
  • Betfair (a prediction market) had Clinton trading at an 83% chance of winning.
  • A quick scan of Real Clear Politics on Monday night showed 25 final national polls. 22 of these 25 polls had Clinton as the winner, and the most reputable ones almost all had her winning the popular vote by 3 to 5 points. (It should be noted that Clinton seems likely to win the popular vote.)

There will be claims that FiveThirtyEight “didn’t say her chances were 100%” or that Betfair had Trump with a “17% chance of winning.” Their predictions were never to be construed to be certain.  No prediction is ever 100% certain, but this is a case where almost all forecasters got it wrong.  That is pretty close to the definition of a bias – something systematic that affected all predictions must have happened.

The polls will claim that the outcome was in the margin of error. But, to claim a “margin of error” defense is statistically suspect, as margins of error only apply to random or probability samples and none of these polls can claim to have a random sample. FiveThirtyEight also had Clinton with 302 electoral votes, way beyond any reasonable error rate.

Regardless, the end result is going to end up barely within the margin of error that most of these polls erroneously use anyway. That is not a free pass for the pollsters at all. All it means is rather than their estimate being accurate 95% of the time, it was predicted to be accurate a little bit less: between 80% and 90% of the time for most of these polls by my calculations.

Lightning can strike for sure. But this is a case of it hitting the same tree numerous times.

So, what happened? I am sure this will be the subject of many post mortems by the media and conferences from the research industry itself, but let me provide an initial perspective.

First, it seems unlikely that it had anything to do with the questions themselves. In reality, most pollsters use very similar questions to gather voter preferences and many of these questions have been in use for a long time. Asking whom you will vote for is pretty simple. The question itself seems to be an unlikely culprit.

I think the mistakes the pollsters made come down to some fairly basic things.

  1. Non-response bias. This has to be a major reason why the polls were wrong. In short, non-response bias means that the sample of people who took the time to answer the poll did not adequately represent the people who actually voted. Clearly this must have occurred. There are many reasons this could happen. Poor response rates are likely a key one, but poor selection of sampling frames, researchers getting too aggressive with weighting and balancing, and simply not being able to reach some key types of voters well all play into it.
  2. Social desirability bias. This tends to be more present in telephone and in-person polls that involve an interviewer but it happens in online polls as well. This is when the respondent tells you what you want to hear or what he or she thinks is socially acceptable. A good example of this is if you conduct a telephone poll and an online poll at the same time, more people will say they believe in God in the telephone poll. People tend to answer how they think they are supposed to, especially when responding to an interviewer. In this case, let’s take non-response bias out of the picture. Suppose pollsters had reached every single voter who actually showed up to vote. If we presume “Trump” was a socially unacceptable answer in the poll, he would do better in the actual election than in the poll. There is evidence this could have happened, as polls with live interviewers had a wider Clinton to Trump gap than those that were self-administered.
  3. Third parties. It looks like Gary Johnson’s support is going to end up being about half of what the pollsters predicted. If this erosion benefited Trump, it could very well have made a difference. Those that switched their vote from Johnson in the last few weeks might have been more likely to switch to Trump than Clinton.
  4. Herding. This season had more polls than ever before and they often had widely divergent results. But, if you look closely you will see that as the election neared, polling results started to converge. The reason could be that if a pollster had a poll that looked like an outlier, they probably took a closer look at it, toyed with how the sample was weighted, or decided to bury the poll altogether. It is possible that there were some accurate polls out there that declared a Trump victory, but the pollsters didn’t release them.

I’d also submit that the reasons for the polling failure are likely not completely specific to the US and this election. We can’t forget that pollsters also missed the recent Brexit vote, the Mexican Presidency, and David Cameron’s original election in the UK.

So, what should the pollsters do? Well, they owe it to the industry to convene, share data, and attempt to figure it out. That will certainly be done via the trade organizations pollsters belong to, but I have been to a few of these events and they devolve pretty quickly into posturing, defensiveness, and salesmanship. Academics will take a look, but they move so slowly that the implications they draw will likely be outdated by the time they are published.  This doesn’t seem to be an industry that is poised to fix itself.

At minimum, I’d like to see the polling organizations re-contact all respondents from their final polls. That would shed a lot of light on any issues relating to social desirability or other subtle biases.

This is not the first time pollsters have gotten it wrong. President Hillary Clinton will be remembered in history along with President Thomas Dewey and President Alf Landon. But, this time seems different. There is so much information out there that separating the signal from the noise is just plain difficult – and there are lessons in that for Big Data analyses and research departments everywhere.

We are left with an election result that half the country is ecstatic about and half is worried about.  However, everyone in the research industry should be deeply concerned. I am hopeful that this will cause more market research clients to ask questions about data quality, potential errors and biases, and that they will value quality more. Those conversations will go a long way to putting a great industry back on the right path.

Will Young People Vote?

Once again we are in an election cycle where the results could hinge on a simple question:  will young people vote? Galvanizing youth turnout is a key strategy for all candidates. It is perhaps not an exaggeration to say that Millennial voters hold the key to the future political leadership of the country.

But, this is nothing specific to Millennials and to this election. Young voters have effectively been the “swing vote” since the election of Kennedy in 1960. Yet, young voter turnout is consistently low relative to other age groups.

The 26th Amendment was ratified in 1971, giving 18-20 year olds the right to vote for the first time. This means that anyone born in 1953 or later has never been of age at a time when they could not vote in a Presidential election. So, only those who are currently 64 or older (approximately) will have turned 18 at a time when they were not enfranchised.

This right did not come easily. The debate about lowering the voting age started in earnest during World War II, as many soldiers under 21 (especially those drafted into the armed forces) didn’t understand how they could be expected to sacrifice so much for a country if they did not have a say in how it was governed. The movement gained steam during the cultural revolution of the 1960s and culminated in the passage of the 26th Amendment.

Young people celebrated their newfound right to vote, and then promptly failed to take advantage of it. The chart below shows 18-24 year old voter turnout compared to total voter turnout for all Presidential election years since the 26th Amendment was ratified.

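[Chart: 18-24 year old voter turnout compared to total voter turnout, Presidential election years since the 26th Amendment]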

Much was made of Obama’s success in galvanizing the young vote in 2008. However, there was only a 2 percentage point increase in young voter turnout in 2008 versus 2004. As the chart shows, there was a big falloff in young voter participation in 1996 and 2000, which were the last elections before Millennials comprised the bulk of the 18-24 age group.

It remains that young voters are far less likely to vote than older adults and that trend is likely to continue.

How can you predict an election by interviewing only 400 people?

This might be the most commonly asked question researchers get at cocktail parties (to the extent that researchers go to cocktail parties). It is also a commonly unasked question among researchers themselves: how can we predict an election by only talking to 400 people? 

The short answer is we can’t. We can never predict anything with 100% certainty from a research study or poll. The only way we could predict the election with 100% certainty would be to interview every person who will end up voting. Even then, since people might change their mind between the poll and the election we couldn’t say our prediction was 100% likely to come true.

To provide an example, if I want to flip a coin 100 times, my best estimate before I do it would be that I will get “heads” 50 times. But, it isn’t 100% certain the coin will land on heads 50 times.
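
To put a number on the coin example, here is a minimal sketch that computes the chance of getting exactly 50 heads in 100 flips of a fair coin using the binomial formula. The function name is hypothetical and chosen only for illustration.

```python
from math import comb

def prob_exact_heads(flips, heads, p_heads=0.5):
    """Binomial probability of seeing exactly `heads` heads in `flips` flips."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads)**(flips - heads)

# 50 heads is the single most likely outcome, yet it is far from certain.
print(f"{prob_exact_heads(100, 50):.3f}")
```

The result is about 0.08: the "best estimate" of 50 heads comes up only about 8% of the time, even though no other single outcome is more likely.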

The reason it is hard to comprehend how we predict elections by talking to so few people is our brains aren’t trained to understand probability. If we interview 400 people and find that 53% will vote for Hillary Clinton and 47% for Donald Trump, as long as the poll was conducted well, this result becomes our best prediction for what the vote will be. It is similar to predicting we will get 50 heads out of 100 coin tosses.  53% is our best prediction given the information we have. But, it isn’t an infallible prediction.

Pollsters provide a sampling error, which is +/-5% in this case. 400 is a bit of a magic number. It results in a maximum possible sampling error of +/-5% which has long been an acceptable standard. (Actually, we need 384 interviews for that, but researchers will use 400 instead because it sounds better.)
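
The arithmetic behind the "magic number" can be sketched as follows. This assumes the standard worst-case formula for a simple random sample (a 50/50 split) and a z-score of 1.96 for 95% confidence; the function names are made up for this illustration.

```python
from math import sqrt

Z_95 = 1.96  # z-score for a 95% confidence level

def max_margin_of_error(n, z=Z_95):
    """Worst-case (50/50 split) sampling error for a simple random sample of n."""
    return z * sqrt(0.25 / n)

def sample_size_for_margin(margin, z=Z_95):
    """Interviews needed to reach a given worst-case margin (not rounded)."""
    return z**2 * 0.25 / margin**2

print(f"n=400 gives +/-{max_margin_of_error(400):.1%}")
print(f"+/-5% needs about {sample_size_for_margin(0.05):.0f} interviews")
```

With 400 interviews the worst-case margin works out to about +/-4.9%, and a +/-5% margin needs roughly 384 interviews, in line with the figures above.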

What that means is that if we repeated this poll over and over, we would expect to find Clinton to receive between 48% and 58% of the intended vote, 95% of the time. We’d expect Trump to receive between 42% and 52% of the intended vote, 95% of the time. On average, though, if we kept doing poll after poll and averaged Clinton’s results, our best guess is that the average would be 53%.

In the coin flipping example, if we repeatedly flipped the coin 400 times, we should get between 45% and 55% heads 95% of the time. But, our average and most common result will be 50% heads.

Because the ranges of the election poll (48%-58% for Clinton and 42%-52% for Trump) overlap, you will often see reporters (and the candidate that is in second place) say that the poll is a “statistical dead heat.” There is no such thing as a statistical dead heat in polling unless exactly the same number of respondents prefer each candidate, which may never have actually happened in the history of polling.

There is a much better way to report the findings of the poll. We can statistically determine the “odds” that the 53% for Clinton is actually higher than the 47% for Trump. If we repeated the poll many times, what is the probability that the percentage we found for Clinton would be higher than what we found for Trump? In other words, what is the probability that Clinton is going to win?

The answer in this case is 91%.  Based on our example poll, Clinton has a 91% chance of winning the election. Say that instead of 400 people we interviewed 1,000. The same finding would imply that Clinton has a 99% chance of winning. This is a much more powerful and interesting way to report polling results, and we are surprised we have never seen a news organization use polling data in this way.
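
The post does not say how the 91% and 99% figures were calculated. One calculation that reproduces them, offered here only as an assumption, is a two-proportion z-test that treats the two candidates' shares as independent samples and reports the two-sided confidence level. Other reasonable conventions (a one-sided test, for example) give somewhat different numbers. The function name is hypothetical.

```python
from math import erf, sqrt

def confidence_leader_ahead(p_leader, p_trailer, n):
    """Two-sided confidence that the leader's share really differs from the
    trailer's, treating the two shares as independent samples of size n
    (an assumed method, not one confirmed by the post)."""
    diff = p_leader - p_trailer
    std_err = sqrt(p_leader * (1 - p_leader) / n + p_trailer * (1 - p_trailer) / n)
    z = diff / std_err
    # For a standard normal, P(|Z| < z) equals erf(z / sqrt(2)).
    return erf(z / sqrt(2))

for n in (400, 1000):
    print(f"n={n}: {confidence_leader_ahead(0.53, 0.47, n):.0%}")
```

Under these assumptions the 53% to 47% split gives about 91% with 400 interviews and about 99% with 1,000.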

Returning to our coin flipping example, if we flip a coin 400 times and get heads 53% of the time, there is a 91% chance that we have a coin that is unfair, and biased towards heads. If we did it 1,000 times and got heads 53% of the time, there would be a 99% chance that the coin is unfair. Of course, a poll is a snapshot in time. The closer it is to the election, the more likely it is that the numbers will not change.  And, polling predictions assume many things that are rarely true:  that we have a perfect random sample, that all subgroups respond at the same rate, that questions are clear, that people won’t change their mind on Election Day, etc.

So, I guess the correct answer to “how can we predict the election from surveying 400 people” is “we can’t, but we can make a pretty good guess.”

Polls can be as influential as the election

Many of us on the supplier side of the market research industry had our original interest in this field kindled by political polling. The market research industry was largely established as a by-product of polling. It didn’t take the founding fathers of election polling long to realize that, during a time of massive expansion of the US economy in the post WWII era, there was money to be made by polling for companies and brands.

In some ways polling has become more important than the election itself. In 2000 Elizabeth Dole was touted by many as a potential Republican candidate. While many knew her only as the wife of Bob Dole, she seemed to have a lot going for her. She had been Secretary of Labor, head of the Red Cross, was well-spoken, and seemed poised to become perhaps the first woman with a realistic shot at the White House. She was seen as a viable candidate by most pundits.

But, polls conducted before any primaries had been contested indicated that her support level was low, largely because she was unknown. As a consequence of a poor showing in early polls, she stumbled in fundraising and pulled out of the race without a voter ever having a chance to vote for or against her. Had the initial polls never been taken, she likely would have had enough fundraising support to enter the initial primaries. As she was an excellent communicator, who knows where it might have gone from there.

This made me wonder what the value of early polling is. It certainly seems to limit the viability of lesser-known candidates. I doubt that Bill Clinton would have had the chance to emerge as a contender had the polling environment in 1992 been what it is today.

As we turn to the current race, on the Republican side there soon could be as many as a dozen declared candidates, and some are predicting up to 20. Fundraising success will become the first screen to winnow the field. And, early poll results will directly affect their ability to fundraise. I believe this is why Jeb Bush has been late to declare his candidacy. He has had an incredible level of success raising money, and once he declares the pollsters will start assessing his viability. He’s best off continuing to fundraise without becoming a declared candidate as declaring probably runs a risk for him.

Further, both Fox and CNN have recently announced that they will only include the top 10 candidates in the first Republican debates. How will they winnow the field? By looking at polling data.

Should we worry that the polling industry has too much say in who gets support? I once put this question to a well-respected pollster, and he said that the issue is more about how well the polls are done. If we do our jobs well we keep politicians abreast of popular opinion and thus are a valuable contributor to democracy. There is nothing wrong with accurately measuring the truth and communicating it.

Of course, when polls are done poorly, the opposite is true. The media has an insatiable appetite for polls. As a consequence, there are many poorly-designed polls released and reported upon. There are even more polls that are really just shilling for the parties and Super PACs in disguise. The media has been either unable or unwilling to differentiate the credible from the bad, and with a continuous news cycle we’ll see more poor quality polls reported upon.

It doesn’t help that even the major pollsters struggle to get it right. In the recent UK elections, pretty much every pollster missed badly. Even FiveThirtyEight, Nate Silver’s site, which tends to be highly critical of polling and is a self-appointed arbiter of good and bad polls, had to issue a mea culpa when its own predictions rang hollow.

As long as the media is running 24/7 and starved for content, the polls will continue.  The challenge is to sort out the good from the bad and the signal from the noise.  It isn’t easy but it is important – literally who gets elected as the next US President can depend upon it.

 

Wanna bet that Hillary will be the next President?

There is a movement afoot to allow Las Vegas casinos to take bets on Presidential elections.  Betting on who will be the next President has the potential to increase the interest level in the election and perhaps voter turnout as a consequence. Of course, it will also bring more revenue to Nevada casinos. Detractors of the idea cite the typical arguments against gambling of any kind. I suppose campaign insiders could engineer a campaign emergency to sabotage their candidate and collect winning bets they have made on the other side.

One aspect that hasn’t been discussed is whether betting on elections would make for better predictions than current polling methods.

Election polling is simple at its core. Pollsters find a representative sample of likely voters and ask a basic question: if the election were held today, whom would you vote for? While the question itself is basic, polls often disagree on the answers. Differences in the polls tend to stem from how the sample was drawn, how “likely voters” are classified, and context (issue questions that may have preceded the voting question). On the whole, the major polling organizations do a good job with election polls, and, especially if you group all the polls together, they make excellent predictions. But, the pollsters are not always right. If we had left it up to the major polling organizations to select our Presidents, our children would be learning about the policies of President Alf Landon and President Thomas Dewey. (Of course if we trusted the polls and not the actual election, we also would be teaching about President Al Gore, but that is another story.)

A few election cycles back a few firms tried a new approach. Rather than ask “whom would you vote for?” the new approach asked “regardless of whom you may favor or vote for, who do you think will win the election?” This was seen as an attempt to get over the difficulty of predicting turnout. It was sort of a way to crowdsource an election poll. The approach worked well, but has been tried too infrequently to make a definitive judgment. While election polls are great experiments in that we can judge their success or failure by a real-world result, they aren’t so good in that the sample size of national elections is small.

An interesting approach to the last few election cycles was taken by Intrade. Intrade was a “prediction market” — an exchange that traded shares for future events that had a “yes/no” type outcome, for instance, “will Barack Obama win the election?” The share price for this would be between $0 and $1. Once the election is over, a share of Obama would close at $1 if he won, and $0 if he lost. Since there was an active market in this trading, you could make real bets with real money on the election depending on where you stood. For instance, if an Obama share was trading at 72 cents, this could be interpreted as saying the market feels he has a 72% chance of winning. If you felt Obama had a greater than 72% chance of winning you’d buy his “stock.” When the election was over, you’d either lose 72 cents if he lost the election or make 28 cents if he won. What was interesting about the approach was watching how the stock price would move as the campaign season progressed.
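
The payoff logic described above can be sketched in a few lines. This is an illustrative calculation only, not Intrade's actual mechanics (fees, for instance, are ignored), and the function name is hypothetical.

```python
def expected_profit_per_share(price, believed_prob):
    """Expected profit from buying one share that pays $1 if the event happens.

    price: what you pay today, in dollars (e.g. 0.72)
    believed_prob: your own estimate of the chance the event happens
    """
    win = (1.0 - price) * believed_prob   # gain 28 cents if the candidate wins
    loss = price * (1.0 - believed_prob)  # lose 72 cents if the candidate loses
    return win - loss

# At a 72-cent price, buying only pays off (on average) if you think the
# true chance of winning is above 72%.
for belief in (0.65, 0.72, 0.80):
    print(f"belief {belief:.0%}: {expected_profit_per_share(0.72, belief):+.2f}")
```

The expected profit simplifies to your believed probability minus the price, which is why the trading price can be read as the market's probability estimate.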

After the conventions or debates, Obama’s share price would change. At any moment, the share price reflected the probability of victory. A good speech would move his price (and probability of winning) up a few points. Intrade was an excellent predictor and took into account the uncertainty inherent in predictions in an understandable way. The share prices of the candidates clearly showed their probability of winning in real time.

Allowing Vegas style betting on Presidential elections would be similarly interesting. But would it be accurate?

Vegas bookmakers establish initial odds on an event, and these odds (or a point spread in the case of football) evolve depending on how the betting comes in. Many people don’t realize that the oddsmakers are not actually concerned about the probability of who might win the football game. Instead, they set and adjust odds/point spreads to attract an even amount of money bet on both sides of the game, as that is how the casino maximizes its profits. So, the spread might not reflect the probability of winning, especially for teams with large, rabid fan bases, who may irrationally wager on their team (providing a buying opportunity on the other side for the rest of us). How do they do? In a perfect world (from the casino’s point-of-view), 50% of the underdogs would win and 50% of the favorites would win. In 2013, 512 regular season NFL games were played. The favorites won 248 times (48.9%). This is not significantly different than 50% in a statistical sense, so it appears that the sports books do a pretty good job.
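
The "not significantly different than 50%" claim can be checked with a simple one-sample z-test on a proportion. The sketch below takes the stated counts at face value (248 of 512, which works out to about 48.4%; the conclusion is the same at 48.9%) and uses a normal approximation. The function name is hypothetical.

```python
from math import erfc, sqrt

def z_test_vs_half(successes, trials):
    """Two-sided z-test of an observed proportion against 50%
    (normal approximation to the binomial)."""
    expected = trials * 0.5
    std_dev = sqrt(trials * 0.25)
    z = (successes - expected) / std_dev
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail probability
    return z, p_value

z, p_value = z_test_vs_half(248, 512)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

A p-value this far above the usual 0.05 threshold is consistent with the statement that the favorites' record is statistically indistinguishable from a coin flip.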

It would not surprise me if allowing Vegas casinos to take election bets would result in a better prediction than the polls. Money tends to flow rationally and in response to new information, and it likely behaves more rationally than individual respondents in a poll. The polls won’t go out of business, as they are excellent at revealing who voted for whom and why, and these results drive campaign decisions and cable TV news content. Of course the best approach to predicting who the next President will be is probably to just ask Nate Silver. 🙂