Posts Tagged 'Biases and errors'

Pre-Election Polling and Baseball Share a Lot in Common

The goal of a pre-election poll is to predict which candidate will win an election and by how much. Pollsters work towards this goal by 1) obtaining a representative sample of respondents, 2) determining which candidate a respondent will vote for, and 3) predicting the chances each respondent will take the time to vote.

All three of these steps involve error. It is the first one, obtaining a representative sample of respondents, which has changed the most in the past decade or so.

It is the third characteristic that separates pre-election polling from other forms of polling and survey research. Statisticians must predict how likely each person they interview will be to vote. This is called their “Likely Voter Model.”

As I state in POLL-ARIZED, this is perhaps the most subjective part of the polling process. The biggest irony in polling is that it becomes an art when we hand the data to the scientists (methodologists) to apply a Likely Voter Model.

It is challenging to understand what pollsters do in their Likely Voter Models and perhaps even more challenging to explain.  

An example from baseball might provide a sense of what pollsters are trying to do with these models.

Suppose Mike Trout (arguably the most underappreciated sports megastar in history) is stepping up to the plate. Your job is to predict Trout’s chances of getting a hit. What is your best guess?

You could take a random guess between 0 and 100%. But, since that would give you a 1% chance of being correct, there must be a better way.

A helpful approach comes from a subset of statistical theory called Bayesian statistics. This theory says we can start with a baseline of Trout’s hit probability based on past data.

For instance, we might see that so far this year, the overall major league batting average is .242. So, we might guess that Trout’s probability of getting a hit is 24%.

This is better than a random guess. But, we can do better, as Mike Trout is no ordinary hitter.

We might notice there is even better information out there. Year-to-date, Trout is batting .291. So, our guess for his chances might be 29%. Even better.

Or, we might see that Trout’s lifetime average is .301 and that he hit .333 last year. Since we believe in a concept called regression to the mean, that would lead us to think that his batting average should be better for the rest of the season than it is currently. So, we revise our estimate upward to 31%.

There is still more information we can use. The opposing pitcher is Justin Verlander. Verlander is a rare pitcher who has owned Trout in the past – Trout’s average is just .116 against Verlander. This causes us to revise our estimate downward a bit. Perhaps we take it to about 25%.

We can find even more information. The bases are loaded. Trout is a clutch hitter, and his career average with men on base is about 10 points higher than when the bases are empty. So, we move our estimate back up to about 28%.

But it is August. Trout has a history of batting well early in and late in the season, but he tends to cool off during the dog days of summer. So, we decide to end this and settle on a probability of 25%.

This sort of analysis could go on forever. Every bit of information we gather about Trout can conceivably help make a better prediction for his chances. Is it raining? What is the score? What did he have for breakfast? Is he in his home ballpark? Did he shave this morning? How has Verlander pitched so far in this game? What is his pitch count?

There are pre-election polling analogies in this baseball example, particularly if you follow the probabilistic election models created by organizations like FiveThirtyEight and The Economist.

Just as we might use Trout’s lifetime average as our “prior” probability, these models will start with macro variables for their election predictions. They will look at the past implications of things like incumbency, approval ratings, past turnout, and economic indicators like inflation, unemployment, etc. In theory, these can adjust our assumptions of who will win the election before we even include polling data.

Of course, using Trout’s lifetime average or these macro variables in polling will only be helpful to the extent that the future behaves like the past. And therein lies the rub – overreliance on past experience makes these models inaccurate during dynamic times.

Part of why pollsters missed badly in 2020 is unique things were going on – a global pandemic, changed methods of voting, increased turnout, etc. In baseball, perhaps this is a year with a juiced baseball, or Trout is dealing with an injury.

The point is that while unprecedented things are unpredictable, they happen with predictable regularity. There is always something unique about an election cycle or a Mike Trout at bat.

The most common question I am getting from readers of POLL-ARIZED is, “will the pollsters get it right in 2024?” My answer is that since pollsters are applying past assumptions in their model, they will get it right to the extent that the world in 2024 looks like the world did in 2020, and I would not put my own money on it.

I make a point in POLL-ARIZED that pollsters’ models have become too complex. While in theory, the predictive value of a model never gets worse when you add in more variables, in practice, this has made these models uninterpretable. Pollsters include so many variables in their likely voter models that many of their adjustments cancel each other out. They are left with a model with no discernable underlying theory.

If you look closely, we started with a probability of 24% for Trout. Even after looking at a lot of other information and making reasonable adjustments, we still ended up with a prediction of 25%. The election models are the same way. They include so many variables that they can cancel out each other’s effects and end up with a prediction that looks much like the raw data did before the methodologists applied their wizardry.

This effort is better spent at getting better input for the models by investing in generating the trust needed to increase the response rates we get to our surveys and polls. Improving the quality of our data input will increase the predictive quality of the polls more than coming up with more complicated ways to weight the data.

Of course, in the end, one candidate wins, and the other loses, and Mike Trout either gets a hit, or he doesn’t, so the actual probability moves to 0% or 100%. Trout cannot get 25% of a hit, and a candidate cannot win 79% of an election.

As I write this, I looked up the last time Trout faced Verlander. It turns out Verlander struck him out!

The Insight that Insights Technology is Missing

The market research insights industry has long been characterized by a resistance to change. This likely results from the academic nature of what we do. We don’t like to adopt new ways of doing things until they have been proven and studied.

I would posit that the insights industry has not seen much change since the transition from telephone to online research occurred in the early 2000s. And even that transition created discord within the industry, with many traditional firms resistant to moving on from telephone studies because online data collection had not been thoroughly studied and vetted.

In the past few years, the insights industry has seen an influx of capital, mostly from private equity and venture capital firms. The conditions for this cash infusion have been ripe: a strong and growing demand for insights, a conservative industry that is slow to adapt, and new technologies arising that automate many parts of a research project have all come together simultaneously.

Investing organizations see this enormous business opportunity. Research revenues are growing, and new technologies are lowering costs and shortening project timeframes. It is a combustible business situation that needs a capital accelerant.

Old school researchers, such as myself, are becoming nervous. We worry that automation will harm our businesses and that the trend toward DIY projects will result in poor-quality studies. Technology is threatening the business models under which we operate.

The trends toward investment in automation in the insights industry are clear. Insights professionals need to embrace this and not fight it.

However, although the movement toward automation will result in faster and cheaper studies, this investment ignores the threats that declining data quality creates. In the long run, this automation will accelerate the decline in data quality rather than improve it.

It is great that we are finding ways to automate time-consuming research tasks, such as questionnaire authoring, sampling, weighting, and reporting. This frees up researchers to concentrate on drawing insights out of the data. But, we can apply all the automation in the world to the process, yet if we do not do something about data quality, it will not increase the value clients receive.

I argue in POLL-ARIZED that the elephant in the research room is the fact that very few people want to take our surveys anymore. When I began in this industry, I routinely fielded telephone projects with 70-80% response rates. Currently, telephone and online response rates are between 3-4% for most projects.

Response rates are not everything. You can make a compelling argument that they do not matter at all. There is no problem as long as the 3-4% response we get is representative. I would rather have a representative 3% answer a study than a biased 50%.

But, the fundamental problem is that this 3-4% is not representative. Only about 10% of the US population is currently willing to take surveys. What is happening is that this same 10% is being surveyed repeatedly. In the most recent project Crux fielded, respondents had taken an average of 8 surveys in the past two weeks. So, we have about 10% of the population taking surveys every other day, and our challenge is to make them represent the rest of the population.

Automate all you want, but the data that are the backbone of the insights we are producing quickly and cheaply is of historically low quality.

The new investment flooding into research technology will contribute to this problem. More studies will be done that are poorly designed, with long, tortuous questionnaires. Many more surveys will be conducted, fewer people will be willing to take them, and response rates will continue to fall.

There are plenty of methodologists working on these problems. But, for the most part, they are working on new ways to weight the data we can obtain rather than on ways to compel more response. They are improving data quality, but only slightly, and the insights field continues to ignore the most fundamental problem we have: people do not want to take our surveys.

For the long-term health of our field, that is where the investment should go.

In POLL-ARIZED, I list ten potential solutions to this problem. I am not optimistic that any of them will be able to stem the trend toward poor data quality. But, I am continually frustrated that our industry has not come together to work towards expanding respondent trust and the base of people willing to take part in our projects.

The trend towards research technology and automation is inevitable. It will be profitable. But, unless we address data quality issues, it will ultimately hasten the decline of this field.

POLL-ARIZED available on May 10

I’m excited to announce that my book, POLL-ARIZED, will be available on May 10.
 
After the last two presidential elections, I was fearful my clients would ask a question I didn’t know how to answer: “If pollsters can’t predict something as simple as an election, why should I believe my market research surveys are accurate?”
 
POLL-ARIZED results from a year-long rabbit hole that question led me down! In the process, I learned a lot about why polls matter, how today’s pollsters are struggling, and what the insights industry should do to improve data quality.
 
I am looking for a few more people to read an advance copy of the book and write an Amazon review on May 10. If you are interested, please send me a message at poll-arized@cruxresearch.com.

A forgotten man: rural respondents

I have attended hundreds of focus groups. These are moderated small group discussions, typically with anywhere from 4 to 12 participants. The discussions take place in a tricked-out conference room, decked with recording equipment and a one-way mirror. Researchers and clients sit behind this one-way mirror in a cushy, multi-tiered lounge. The lounge has comfortable chairs, a refrigerator with beer and wine, and an insane number of M&M’s. Experienced researchers have learned to sit as far away from the M&M’s as possible.

Focus groups are used for many purposes. Clients use them to test out new product ideas or new advertising under development. We recommend them to clients if their objectives do not seem quite ready for survey research. We also like to do focus groups after a survey research project is complete, to put some personality on our data and to have an opportunity to pursue unanswered questions.

I would estimate that at least half of all focus groups being conducted are being held in just three cities: New York, Chicago, and Los Angeles. Most of the other half are held in other major cities or in travel destinations like Las Vegas or Orlando. These city choices can have little to do with the project objectives – focus groups tend to be held near where the client’s offices are or in cities that are easy to fly to. Clients often cities simply because they want to go there.

The result is that early-stage product and advertising ideas are almost always evaluated by urban participants or by suburban participants who live near a large city. Smaller city, small town, and rural consumers aren’t an afterthought in focus group research. They aren’t thought about at all.

I’ve always been conscious of this, perhaps because I grew up in a rural town and have never lived in a major metropolitan area. The people I grew up with an knew best were not being asked to provide their opinions.

This isn’t just an issue in qualitative research, it happens with surveys and polls as well. Rural and small-town America is almost always underrepresented in market research projects.

This wasn’t a large issue for quantitative market research early on, as RDD telephone samples could effectively include rural respondents. Many years ago, I started adding questions into questionnaires that would allow me to look at the differences between urban, suburban, and rural respondents. I would often find differences, but pointing them out met with little excitement with clients who often seemed uninterested in targeting their products or marketing to a small-town audience.

Online samples do not include rural respondents as effectively as RDD telephone samples. The rural respondents that are in online sampling data bases are not necessarily representative of rural people. Weighting them upward does not magically make them representative.

In 30 years, I have not had a single client ask me to correct a sample to ensure that rural respondents are properly represented. The result is that most products and services are designed for suburbia and don’t take the specific needs of small-town folks into account.

All biases only matter if they affect what we are measuring. If rural respondents and suburban respondents feel the same way about something, this issue doesn’t matter. However, it can matter. It can matter for product research, it certainly matters to the educational market research we have conducted, and it is likely a hidden cause of some of the problems that have occurred with election polling.

The myth of the random sample

Sampling is at the heart of market research. We ask a few people questions and then assume everyone else would have answered the same way.

Sampling works in all types of contexts. Your doctor doesn’t need to test all of your blood to determine your cholesterol level – a few ounces will do. Chefs taste a spoonful of their creations and then assume the rest of the pot will taste the same. And, we can predict an election by interviewing a fairly small number of people.

The mathematical procedures that are applied to samples that enable us to project to a broader population all assume that we have a random sample. Or, as I tell research analysts: everything they taught you in statistics assumes you have a random sample. T-tests, hypotheses tests, regressions, etc. all have a random sample as a requirement.

Here is the problem: We almost never have a random sample in market research studies. I say “almost” because I suppose it is possible to do, but over 30 years and 3,500 projects I don’t think I have been involved in even one project that can honestly claim a random sample. A random sample is sort of a Holy Grail of market research.

A random sample might be possible if you have a captive audience. You can random sample some the passengers on a flight or a few students in a classroom or prisoners in a detention facility. As long as you are not trying to project beyond that flight or that classroom or that jail, the math behind random sampling will apply.

Here is the bigger problem: Most researchers don’t recognize this, disclose this, or think through how to deal with it. Even worse, many purport that their samples are indeed random, when they are not.

For a bit of research history, once the market research industry really got going the telephone random digit dial (RDD) sample became standard. Telephone researchers could randomly call land line phones. When land line telephone penetration and response rates were both high, this provided excellent data. However, RDD still wasn’t providing a true random, or probability sample. Some households had more than one phone line (and few researchers corrected for this), many people lived in group situations (colleges, medical facilities) where they couldn’t be reached, some did not have a land line, and even at its peak, telephone response rates were only about 70%. Not bad. But, also, not random.

Once the Internet came of age, researchers were presented with new sampling opportunities and challenges. Telephone response rates plummeted (to 5-10%) making telephone research prohibitively expensive and of poor quality. Online, there was no national directory of email addresses or cell phone numbers and there were legal prohibitions against spamming, so researchers had to find new ways to contact people for surveys.

Initially, and this is still a dominant method today, research firms created opt-in panels of respondents. Potential research participants were asked to join a panel, filled out an extensive demographic survey, and were paid small incentives to take part in projects. These panels suffer from three response issues: 1) not everyone is online or online at the same frequency, 2) not everyone who is online wants to be in a panel, and 3) not everyone in the panel will take part in a study. The result is a convenience sample. Good researchers figured out sophisticated ways to handle the sampling challenges that result from panel-based samples, and they work well for most studies. But, in no way are they a random sample.

River sampling is a term often used to describe respondents who are “intercepted” on the Internet and asked to fill out a survey. Potential respondents are invited via online ads and offers placed on a range of websites. If interested, they are typically pre-screened and sent along to the online questionnaire.

Because so much is known about what people are doing online these days, sampling firms have some excellent science behind how they obtain respondents efficiently with river sampling. It can work well, but response rates are low and the nature of the online world is changing fast, so it is hard to get a consistent river sample over time. Nobody being honest would ever use the term “random sampling” when describing river samples.

Panel-based samples and river samples represent how the lion’s share of primary market research is being conducted today. They are fast and inexpensive and when conducted intelligently can approximate the findings of a random sample. They are far from perfect, but I like that the companies providing them don’t promote them as being random samples. They involve some biases and we deal with these biases as best we can methodologically. But, too often we forget that they violate a key assumption that the statistical tests we run require: that the sample is random. For most studies, they are truly “close enough,” but the problem is we usually fail to state the obvious – that we are using statistical tests that are technically not appropriate for the data sets we have gathered.

Which brings us to a newer, shiny object in the research sampling world: ABS samples. ABS (addressed-based samples) are purer from a methodological standpoint. While ABS samples have been around for quite some time, they are just now being used extensively in market research.

ABS samples are based on US Postal Service lists. Because USPS has a list of all US households, this list is an excellent sampling frame. (The Census Bureau also has an excellent list, but it is not available for researchers to use.) The USPS list is the starting point for ABS samples.

Research firms will take the USPS list and recruit respondents from it, either to be in a panel or to take part in an individual study. This recruitment can be done by mail, phone, or even online. They often append publicly-known information onto the list.

As you might expect, an ABS approach suffers from some of the same issues as other approaches. Cooperation rates are low and incentives (sometimes large) are necessary. Most surveys are conducted online, and not everyone in the USPS list is online or has the same level of online access. There are some groups (undocumented immigrants, homeless) that may not be in the USPS list at all. Some (RVers, college students, frequent travelers) are hard to reach. There is evidence that ABS approaches do not cover rural areas as well as urban areas. Some households use post office boxes and not residential addresses for their mail. Some use more than one address. So, although ABS lists cover about 97% of US households, the 3% that they do not cover are not randomly distributed.

The good news is, if done correctly, the biases that result from an ABS sample are more “correctable” than those from other types of samples because they are measurable.

A recent Pew study indicates that survey bias and the number of bogus respondents is a bit smaller for ABS samples than opt-in panel samples.

But ABS samples are not random samples either. I have seen articles that suggest that of all those approached to take part in a study based on an ABS sample, less than 10% end up in the survey data set.

The problem is not necessarily with ABS samples, as most researchers would concur that they are the best option we have and come the closest to a random sample. The problem is that many firms that are providing ABS samples are selling them as “random samples” and that is disingenuous at best. Just because the sampling frame used to recruit a survey panel can claim to be “random” does not imply that the respondents you end up in a research database constitute a random sample.

Does this matter? In many ways, it likely does not. There are biases and errors in all market research surveys. These biases and errors vary not just by how the study was sampled, but also by the topic of the question, its tone, the length of the survey, etc. Many times, survey errors are not the same throughout an individual survey. Biases in surveys tend to be “unknown knowns” – we know they are there, but aren’t sure what they are.

There are many potential sources of errors in survey research. I am always reminded of a quote from Humphrey Taylor, the past Chairman of the Harris Poll who said “On almost every occasion when we release a new survey, someone in the media will ask, “What is the margin of error for this survey?” There is only one honest and accurate answer to this question — which I sometimes use to the great confusion of my audience — and that is, “The possible margin of error is infinite.”  A few years ago, I wrote a post on biases and errors in research, and I was able to quickly name 15 of them before I even had to do an Internet search to learn more about them.

The reality is, the improvement in bias that is achieved by an ABS sample over a panel-based sample is small and likely inconsequential when considered next to the other sources of error that can creep into a research project. Because of this, and the fact that ABS sampling is really expensive, we tend to only recommend ABS panels in two cases: 1) if the study will result in academic publication, as academics are more accepting of data that comes from and ABS approach, and 2) if we are working in a small geography, where panel-based samples are not feasible.

Again, ABS samples are likely the best samples we have at this moment. But firms that provide them are often inappropriately portraying them as yielding random samples. For most projects, the small improvements in bias they provide is not worth the considerable increased budget and increased study time frame, which is why, for the moment, ABS samples are currently used in a small proportion of research studies. I consider ABS to be “state of the art” with the emphasis on “art” as sampling is often less of a science than people think.

Jeff Bezos is right about market research

In an annual shareholder letter, Amazon’s Jeff Bezos recently stated that market research isn’t helpful. That created some backlash among researchers, who reacted defensively to the comment.

For context, below is the text of Bezos’ comment:

No customer was asking for Echo. This was definitely us wandering. Market research doesn’t help. If you had gone to a customer in 2013 and said “Would you like a black, always-on cylinder in your kitchen about the size of a Pringles can that you can talk to and ask questions, that also turns on your lights and plays music?” I guarantee you they’d have looked at you strangely and said “No, thank you.”

This comment is reflective of someone who understands the role market research can play for new products as well as its limitations.

We have been saying for years that market research does a poor job of predicting the success of truly breakthrough products. What was the demand for television sets in the 1920’s and 1930’s before there was even content to broadcast or a way to broadcast it? Just a decade ago, did consumers know they wanted a smartphone they would carry around with them all day and constantly monitor? Henry Ford once said that if he had asked customers what they wanted they would have wanted faster horses and not cars.

In 2014, we wrote a post (Writing a Good Questionnaire is Just Like Brian Surgery) that touched on this issue. In short, consumer research works best when the consumer has a clear frame-of-reference from which to draw. New product studies on line extensions or easily understandable and relatable new ideas tend to be accurate. When the new product idea is harder to understand or is outside the consumer’s frame-of-reference research isn’t as predictive.

Research can sometimes provide the necessary frame-of-reference. We put a lot of effort to be sure that concept descriptions are understandable. We often go beyond words to do this and produce short videos instead of traditional concept statements. But even then, if the new product being tested is truly revolutionary the research will probably predict demand inaccurately. The good news is few new product ideas are actually breakthroughs – they are usually refinements on existing ideas.

Failure to provide a frame-of-reference or realize that one doesn’t exist leads to costly research errors. Because this error is not quantifiable (like a sample error) it gets little attention.

The mistake people are making when reacting to Bezos’ comment is they are viewing it as an indictment of market research in general. It is not. Research still works quite well for most new product forecasting studies. For new products, companies are often investing millions or tens of millions in development, production, and marketing. It usually makes sense to invest in market research to be confident these investments will pay off and to optimize the product.

It is just important to recognize that there are cases where respondents don’t have a good frame-of-reference and the research won’t accurately predict demand. Truly innovative ideas are where this is most likely to happen.

I’ve learned recently that this anti-research mentality pervades the companies in Silicon Valley. Rather than use a traditional marketing approach of identifying a need and then developing a product to fulfill the need, tech firms often concern themselves first with the technology. They develop a technology and then look for a market for it. This is a risky strategy and likely fails more than it succeeds, but the successes, like the Amazon Echo, can be massive.

I own an Amazon Echo. I bought it shortly after it was launched having little idea what it was or what it could do. Even now I am still not quite sure what it is capable of doing. It probably has a lot of potential that I can’t even conceive of. I think it is still the type of product that might not be improved much by market research, even today, when it has been on the market for years.

Will adding a citizenship question to the Census harm the Market Research Industry?

The US Supreme Court appears likely to allow the Department of Commerce to reinstate a citizenship question on the 2020 Census. This is largely viewed as a political controversy at the moment. The inclusion of a citizenship question has proven to dampen response rates among non-citizens, who tend to be people of color. The result will be gains in representation for Republicans at the expense of Democrats (political district lines are redrawn every 10 years as a result of the Census). Federal funding will likely decrease for states with large immigrant populations.

It should be noted that the Census bureau itself has come out against this change, arguing that it will result in an undercount of about 6.5 million people. Yet, the administration has pressed forward and has not committed funds needed by the Census Bureau to fully research the implications. The concern isn’t just about non-response from non-citizens. In tests done by the Census Bureau, non-citizens are also more likely to inaccurately respond to this question than citizens, meaning the resulting data will be inaccurate.

Clearly this is a hot-button political issue. However, there is not much talk of how this change may affect research. Census data are used to calibrate most research studies in the US, including academic research, social surveys, and consumer market research. Changes to the Census may have profound effects on data quality.

The Census serves as a hidden backbone for most research studies whether researchers or clients realize it or not. Census information helps us make our data representative. In a business climate that is becoming more and more data-driven the implications of an inaccurate Census are potentially dire.

We should be primarily concerned that the Census is accurate regardless of the political implications. Adding questions that temper response will not help accuracy. Errors in the Census have a tendency to become magnified in research. For example, in new product research it is common to project study data from about a thousand respondents to a universe of millions of potential consumers. Even a small error in the Census numbers can lead businesses to make erroneous investments. These errors create inefficiencies that reverberate throughout the economy. Political concerns aside, US businesses undoubtably suffer from a flawed Census. Marketing becomes less efficient.

All is not lost though. We can make a strong case that there are better, less costly ways to conduct the Census. Methodologists have long suggested that a sampling approach would be more accurate than the current attempt at enumeration. This may never happen for the decennial Census because the Census methodology is encoded in the US Constitution and it might take an amendment to change it.

So, what will happen if this change is made? I suspect that market research firms will switch to using data that come from the Census’ survey programs, such as the American Community Survey (ACS). Researchers will rely less on the actual decennial census. In fact, many research firms already use the ACS rather than the decennial census (and the ACS currently contains the citizenship question).

The Census bureau will find ways to correct for resulting error, and to be honest, this may not be too difficult from a methodological standpoint. Business will adjust because there will be economic benefits to learning how to deal with a flawed Census, but in the end, this change will take some time for the research industry to address. Figuring things like this out is what good researchers do. While it is unfortunate that this change looks likely to be made, its implications are likely more consequential politically than it will be to the research field.

How Did Pollsters Do in the Midterm Elections?

Our most read blog post was posted the morning after the 2016 Presidential election. It is a post we are proud of because it was composed in the haze of a shocking election result. While many were celebrating their side’s victory or in shock over their side’s losses, we mused about what the election result meant for the market research industry.

We predicted pollsters would become defensive and try to convince everyone that the polls really weren’t all that bad. In fact, the 2016 polls really weren’t. Predictions of the popular vote tended to be within a percent and a half or so of the actual result which was better than for the previous Presidential election in 2012. However, the concern we had about the 2016 polls wasn’t related to how close they were to the result. The issue we had was one of bias: 22 of the 25 final polls we found made an inaccurate prediction and almost every poll was off in the same direction. That is the very definition of bias in market research.

Suppose that you had 25 people flip a coin 100 times. On average, you’d expect 50% of the flips to be “heads.” But, if say, 48% of them were “heads” you shouldn’t be all that worried as that can happen. But, if 22 of the 25 people all had less than 50% heads you should worry that there was something wrong with the coins or they way they were flipped. That is, in essence, what happened in the 2016 election with the polls.

Anyway, this post is being composed the aftermath of the 2018 midterm elections. How did the pollsters do this time?

Let’s start with FiveThirtyEight.com. We like this site because they place probabilities around their predictions. Of course, this gives them plausible deniability when their prediction is incorrect, as probabilities are never 0% or 100%. (In 2016 they gave Donald Trump a 17% chance of winning and then defended their prediction.) But this organization looks at statistics in the right way.

Below is their final forecast and the actual result. Some results are still pending, but at this moment, this is how it shapes up.

  • Prediction: Republicans having 52 seats in the Senate. Result: It looks like Republicans will have 53 seats.
  • Prediction: Democrats holding 234 and Republicans holding 231 House seats. Result: It looks like Democrats will have 235 or 236 seats.
  • Prediction: Republicans holding 26 and Democrats holding 24 Governorships. Result: Republicans now hold 26 and Democrats hold 24 Governorships.

It looks like FiveThirtyEight.com nailed this one. We also reviewed a prediction market and state-level polls, and it seems that this time around the polls did a much better job in terms of making accurate predictions. (We must say that on election night, FiveThirtyEight’s predictions were all over the place when they were reporting in real time. But, as results settled, their pre-election forecast looked very good.)

So, why did polls seem to do so much better in 2018 than 2016? One reason is the errors cancel out when you look at large numbers of races. Sure, the polls predicted Democrats would have 234 seats, and that is roughly what they achieved. But, in how many of the 435 races did the polls make the right prediction? That is the relevant question, as it could be the case that the polls made a lot of bad predictions that compensated for each other in the total.

That is a challenging analysis to do because some races had a lot of polling, others did not, and some polls are more credible than others. A cursory look at the polls suggests that 2018 was a comeback victory for the pollsters. We did sense a bit of an over-prediction favoring the Republican Senatorial candidates, but on the House side there does not seem to be a clear bias.

So, what did the pollsters do differently? Not much really. Online sampling continues to evolve and get better, and the 2016 result has caused polling firms to concentrate more carefully on their sampling. One of the issues that may have caused the 2016 problem is that pollsters are starting to almost exclusively use the top 2 or 3 panel companies. Since 2016, there has been a consolidation among sample suppliers, and as a result, we are seeing less variance in polls as pollsters are largely all using the same sample sources. The same few companies provide virtually all the sample used by pollsters.

Another key difference was that turnout in the midterms was historically high. Polls are more accurate in high turnout races, as polls almost always survey many people who do not end up showing up on election day, particularly young people. However, there are large and growing demographic differences (age, gender, race/ethnicity) in supporters of each party, and that greatly complicates polling accuracy. Some demographic subgroups are far more likely than others to take part in a poll.

Pollsters are starting to get online polling right. A lot of the legacy firms in this space are still entrenched in the telephone polling world, have been protective of their aging methodologies, and have been slow to change. After nearly 20 years of online polling the upstarts have finally forced the bigger polling firms to question their approaches and to move to a world where telephone polling just doesn’t make a lot of sense. Also, many of the old guard, telephone polling experts are now retired or have passed on, and they have largely led the resistance to online polling.

Gerrymandering helps the pollster as well. It still remains the case that relatively few districts are competitive. Pew suggests that only 1 in 7 districts was competitive. You don’t have to be a pollster to accurately predict how about 85% of the races will turn out. Only about 65 of the 435 house races were truly at stake. If you just flipped a coin in those races, in total your prediction of house seats would have been fairly close.

Of course, pollsters may have just gotten lucky. We view that as unlikely though, as there were too many races. Unlike in 2018 though, in 2016 we haven’t seen any evidence of bias (in a statistical sense) in the direction of polling errors.

So, this is a good comeback success for the polling industry and should give us greater confidence for 2020. It is important that the research industry broadcasts this success. When pollsters have a bad day, like they did in 2016, it affects market research as well. Our clients lose confidence in our ability to provide accurate information. When the pollsters get it right, it helps the research industry as well.


Visit the Crux Research Website www.cruxresearch.com

Enter your email address to follow this blog and receive notifications of new posts by email.