Why the Media Cried (Red) Wolf

Journalists are puzzled as to why a predicted “red wave” (a Republican resurgence) did not materialize in the 2022 midterm elections. The signals that the red wave would fail to form were clear. The failure of journalists to foresee the success of Democratic candidates was caused by their inability to discern the good polls from the bad.

Established, media- and college-branded polls performed historically well in this cycle. They provided all the data necessary to foresee that a red wave would not emerge.

So why was there such a widespread view that the Republicans would have a big night?

The answer is that journalists have become indiscriminate in their polling coverage. Conservative-leaning pollsters released a flood of poor-quality polls in the last two weeks before the election. These polls pointed to a brewing red tsunami, and the media covered them with little, if any, due diligence.

I have had conversations with long-time pollsters who, through rolled eyes, tell me they think some of these pollsters are simply making up their numbers. In this cycle, pollsters obtained cross-tabulations from a Trafalgar poll that indicated that almost two-thirds of Gen Z voters would vote for a MAGA candidate in Georgia (when one-third would have represented a historic swing). Yet, respected journalists widely reported the results of this very same poll.

Trafalgar’s 2022 polls were demonstrably inaccurate. Trafalgar released 19 statewide polls in the week preceding the election. They picked the correct winner in just 11 of them. Just seven were within their margin of error, and Trafalgar’s mean polling error is likely to end up being more than double the mean polling error of “name-brand” pollsters.

It is understandable that right-leaning media are interested in these polls, as they provide a hopeful, confirmatory message their audience wants to hear. Since reputable polls have erred in a liberal direction in the past few cycles, there is a sense that we cannot trust them anymore.

Journalists ignored that polls have always fluctuated between missing in a liberal or conservative direction. Because polls have been off in a liberal direction in the past two presidential elections, journalists have assumed a liberal bias is here to stay. In 2022, this proved to be incorrect.

It isn’t just the media that provide oxygen to these polls. Poll aggregators (particularly RealClearPolitics) had a horrible cycle because they were indiscriminate in which polls were included in their averages. Predictive modelers (such as FiveThirtyEight) had a solid night that could have been tremendous if they could get out of a mentality that every poll has something of value to contribute to their models.

Reporting on polls with suspect methods is simply bad journalism. Trusted journalists would never release a story without considerable fact-checking of their sources. Yet, they continue to cover polls that are not transparent, have poor track records, have no defensible methodology, and are shunned by the polling establishment.  

This is journalistic malpractice, and the result can be dire. When the election results do not match expectations set by the polls, an environment is fostered where election denialism thrives. January 6th happened partly because the partisan polls the protesters focused on had Donald Trump winning the election, and good journalists fueled this mentality by reporting on these polls. They provided these polls with a legitimacy they did not deserve.

Statistical laws imply that we cannot know in advance which polls will be correct in any given election. But we know which ones meet industry standards for methodology and disclosure and that, in the long term, have been proven to get it right far more often than they get it wrong.

It is no secret that pollsters face technological headwinds, but their occasional misses are not for lack of trying. After each election, pollsters convene, share findings, and discuss how to improve polls for the next election. In this sense, polling is one of the most honest professions.

Do you know who is missing from these conversations and not contributing to this honesty? The conservative-leaning pollsters.

My advice to journalists is this: stick to credible polls and stop giving every poll a voice. Rely more on the pollsters themselves for editorial decisions on what goes in the polls and the interpretations of their results. Stop creating the news by being too involved in the content of polls and return to doing what you do best: report on poll findings and provide context.

Above all, fact-check the polls like you would any other source.

Polling’s Winners and Losers from the Midterms

The pollsters did well last night.

Right now (the morning after the election), it is hard to know if 2022 will go down as a watershed moment when pollsters once again found their footing or if it will merely be a stay of execution. The 2018 midterms were also quite good for pollsters, yet the 2020 election was not.

To be clear, there are still many votes to count, so it is unfair to judge the polls too quickly. In POLL-ARIZED, I criticize media members who do. Nonetheless, below is a list of what I see as some winners and losers and some that seem like they are in the middle.

The Winners

  • Pre-election polling in general. For the most part, the polls did a good job of pointing out the close races, and exit polls suggest that they did an excellent job of highlighting the issues that concern voters most. I suspect the polling error rate will be far below the historical average of five+ points for midterm elections.
  • The “good” pollsters. The better-known polling brands, especially those with media partnerships, and some college polling centers had good results.
  • John King’s brain. Say what you want about CNN, but watching someone who knows the name of every county in America, the candidates in every election district, and the results of past elections perform without a net and stick the landing is impressive.
  • The CNN magic wall. I know other networks have them, but I can’t be the only data geek who marvels at the database systems and APIs behind CNN’s screen. It must have cost millions and involved dozens of people.
  • The Iowa Poll’s response rate. Their methodology statement says they contacted 1,118 Iowa residents for a final sample size of 801, with a response rate of 72%. This reminds me of the good old days. I would like to see pollsters spend more time benchmarking what Selzer & Co. are doing right with this poll.

The Losers

  • The partisan pollsters, particularly Trafalgar. These pollsters were way off this cycle. They have been way off in most cycles. I hope that non-partisan media outlets will stop covering them. They provide a story that outlets and viewers seeking a confirmation bias enjoy, but objective media should leave them behind for good.
  • The media who failed to see that there were so many less-reputable conservative polls released over the past two weeks. Most media were hoodwinked by this and ran a narrative that a red storm was brewing.
  • Response rates. I delved into the methodology of many final polls this cycle; most had net response rates of less than 2%. That is about half what response rates were just two years ago. The fact that the pollsters did so well with this low response is a testament to the brilliance of methodologists, but the data they have to work with is getting worse each cycle. They will not be able to keep pulling rabbits out of their hat.
  • The prediction markets. I have long hoped that the betting markets can emerge to provide a plausible alternative to polls regarding predicting elections so that the polls can focus on issues and not the horse races. These markets did not have a good night.
  • FiveThirtyEight’s pollster ratings. It is too early to make a definitive statement, but some of their highly rated pollsters had poor results, while many with middling grades did well. These ratings are helpful when they are accurate and have a defensible method behind them. When these gradings are inaccurate, they ruin reputations and businesses, so FiveThirtyEight must embrace that producing objective and accurate ratings is a serious responsibility.

The “So-So”

  • The Iowa Poll. Even with the high response, this poll seemed to overstate the Republican vote this time. They did get all the winners correct. This poll has a strong history of success, so it might be fair to chalk the slight miss up to normal sampling fluctuation. It isn’t statistically possible to get it right every single time. I must admit I have a bias of rooting for this poll.
  • The modelers, such as FiveThirtyEight and the Economist. On the one hand, the concept of a probabilistic forecast is spot on. On the other, it is not particularly informative in coin-toss races. In this cycle, the forecasts they made for Senate and House seats weren’t much different than what could have been made by just tossing a coin in the contested races. Their median predictions for House and Senate seats overstated where the Republicans will end up, possibly because they also fell prey to the release of so many conservative-leaning polls in the campaign’s final stages.
  • Polling error direction. In the past few cycles, the polling error has been in the direction of overcounting Democrats. In 2022, this error seemed to move in the other direction. Historically, these errors have been uncorrelated from election to election, so I must admit that I’ve probably jumped the gun by suggesting in POLL-ARIZED the pro-Democrat error direction was structural and here to stay.
  • The media’s coverage of the polls on election day. In 2016 and 2020, the press reveled in bashing the pollsters. This time, they hardly talked about them at all. That seemed a bit unfair – if pollsters are going to be criticized when they do poorly, they should be celebrated when they do well.

All-in-all, a good night for the pollsters. But, I don’t want to rush to a conclusion that the polls are now fixed because, in reality, the pollsters didn’t change much in their methods from 2020. I hope the industry will study what went right, as we tend to re-examine our methods when they fail, not when they succeed.

The value of looking at data from more than one perspective

About 20 years ago, I flew to the Midwest to present the findings from an extensive project. My audience included the head of marketing, my direct research client, and the firm’s CEO. We constructed an insightful study that profiled the market my client played in, their position, and their competitive strengths and weaknesses.

I spent about an hour presenting the study findings and fielding questions. It went great. It was one of those meetings where I knew our work would affect this company, and the CEO seemed to buy into taking action based on our recommendations.

Then, with about five minutes to go in the meeting, I asked if there were any follow-up analyses they would like us to do. The CEO said, “yes, there is one thing …”

He then instructed me to take a couple of weeks to do a new analysis and then to fly back out and present it to him. I was at first taken aback, as I thought the project was over, and I was ready to declare victory and move on to other things.

The analysis he requested? He told me to imagine that his largest competitor would call me tomorrow. I could use everything I knew about my client and the information gathered in our study. If this competitor called me, what would I tell them about how to position against my client? What are the implications of our research from his competitor’s point of view?

This is a brilliant idea. I have always believed that although research can often be quite insightful, it is more about what clients do with our data that matters. This CEO knew full well that his competitor probably had their own research firm doing a similar project to what I had just presented. He wanted to view the world from his competitor’s perspective.

It worked. I returned in a couple of weeks and did a role-play presentation where I pretended they were their competition. This led to a game-theory discussion of how their competition would likely react to initiatives they were considering, how they could address their weaknesses, and where their strengths mattered.

Since then, I have proposed similar analyses to many clients. I have been surprised at how few have taken me up on the offer. So, late in presentations, I often slip in a few slides showing what I would tell their competition based on the study findings if I worked for them.

If I were a client-side researcher, I’d ask my researchers to do this regularly. It forces us to do a better job at checking our biases, as, like it or not, we want our data to show our clients are succeeding. We know how much work they put in, and it isn’t easy to tell them where their weaknesses are. Looking at the data from another angle gives us the space to be more agnostic in our conclusions, makes us less likely to tell clients what they want to hear, and provides better insight to the clients.

The request from this CEO made me a better, more empathetic researcher. We worked with his firm for about 15 years; he recently retired. He will always be in my “client hall of fame” because of his willingness to view research results objectively and his insistence that we consider all perspectives.

Clients hire us so they can learn from us, but often they don’t realize how much we learn from them.

Your grid questions probably aren’t working

Convincing people to participate in surveys and polls has become so challenging that more attention is going toward preventing them from suspending once they choose to respond.

Most survey suspends occur in one of two places. The first is at the initial screen the respondent sees. Respondents click through an invitation, and many quickly decide that the survey isn’t for them and abandon the effort.

The second most common place is the first grid question respondents encounter. They see an imposing grid question and decide it isn’t worth their time to continue. It doesn’t matter where this question is placed – this happens whether the first grid question is early in the questionnaire, in the middle, or toward the end.

Respondents hate answering grid questions. Yet clients continue to ask them, and survey researchers include them without much thought. The quality of data they yield tends to be low.

A measurement error issue with grid questions is known as “response set bias.” When we present a list of, say, ten items, we want to get a respondent to make an independent judgment of each, unrelated to what they think of the others. But, with a long list of items, that is not what happens. Instead, when people respond to later questions, they remember what they said earlier. If I indicated that feature A in a list was “somewhat important” to me, then when I assess feature B, it is natural to think about how it compares in importance to feature A. This introduces unwanted correlations into the data set.

Instead, we want a respondent to assess feature A, clear their mind entirely, and then assess feature B. That is a challenging task, but placing features on a long, intimidating list makes it nearly impossible. Some researchers think we eliminate this error by randomizing the list order, but all that does is spread the error out. It is important to randomize the options so this error doesn’t concentrate on just a few items, but randomization does not solve the problem.
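As a rough illustration of what randomizing the list order means in practice, here is a minimal sketch. The item names, the respondent IDs, and the helper `item_order_for` are all hypothetical, and seeding by respondent ID is just one convenient way to make each order reproducible at analysis time.

```python
import random

# Placeholder grid items (hypothetical, not from a real questionnaire).
GRID_ITEMS = ["Feature A", "Feature B", "Feature C", "Feature D"]

def item_order_for(respondent_id: int, items=GRID_ITEMS) -> list:
    """Return a shuffled copy of the grid items, reproducible per respondent."""
    rng = random.Random(respondent_id)  # seed by respondent so the order can be re-created later
    order = list(items)                 # copy, so the master list is left untouched
    rng.shuffle(order)
    return order

# Each respondent rates the same items in a different order. Order effects
# are spread across the items; they are not removed.
for respondent_id in range(1, 6):
    print(respondent_id, item_order_for(respondent_id))
```

Storing the order each respondent saw (or being able to regenerate it, as here) also makes it possible to check later whether item position affected the ratings.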

Other errors you have probably heard of lurk in long grid questions: fatigue biases (respondents attend less to the items late in the list), question order biases, priming effects, recency biases, and so on. In short, grid questions are just asking for many measurement errors, and we end up crossing our fingers and hoping some of these cancel each other out.

This is admittedly a mundane topic, but it is the one questionnaire design issue I have the most difficulty convincing clients to do something about. Grid questions capture a lot of data in a short amount of questionnaire time, so they are enticing for clients.

I prefer a world where we seldom ask them. If we need to, we recommend maybe one or two per questionnaire and never more than 4 to 6 items in them. I rarely succeed in convincing clients of this.

“Textbook” explanations of problems with grid questions do not include the issue that bothers me most. In grid questions, the question respondents hear and respond to is often not the literal question that was written.

Consider a grid question like this, with a 5-point importance scale as the response options:

Q: How important were the following when you decided to buy the widget?

  1. The widget brand cares about sustainability
  2. The price of the widget
  3. The color of the widget is attractive to you
  4. The widget will last a long time

Think about the first item (“The widget brand cares about sustainability”). The client wants to understand how important sustainability is in the buying decision. How important a buying criterion is sustainability?

But that is likely not what the respondent “hears” in the question. The respondent will probably see the question as asking if they care about sustainability, and who doesn’t? So, sustainability would tend to be overstated as a decision driver when analyzing the data set. Respondents don’t leap to thinking about sustainability as a buying consideration; instead, they respond about sustainability in general.

Clients and suppliers must realize that respondents do not parse our words as we would like them to, and they do not always attend to our questions. We need to anticipate this.

How do we fix this issue? We should be more straightforward in how we ask questions. In this example, I would prefer to derive the importance of sustainability in the buying decision. I’d include a question asking how much they care about sustainability (and be careful to phrase it so it can have a response across various answer choices). Then, in a second question, I would gather a dependent variable asking how likely they are to buy the widget in the future.

A regression or correlation analysis would provide coefficients across variables that indicate their relative importance. Yes, it would be based on correlations and not necessarily causation. In reality, research studies rarely set up the experiments necessary to give evidence of causation, and we should not get too hung up on that.

I would conclude that sustainability is an essential feature if it popped in the regression as having a high coefficient and if I saw something else in other questions or open-ends that indicated sustainability mattered from another angle. Always look for another data point or another data source that supports your conclusion.
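The derived-importance idea can be sketched in a few lines. This is only an illustration under stated assumptions: the responses are made up, the variable names are hypothetical, and a simple Pearson correlation stands in for whatever regression or correlation analysis a real study would use.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up answers on 1-5 scales, one entry per respondent (hypothetical data).
drivers = {
    "cares_about_sustainability": [5, 4, 5, 3, 4, 5, 2, 4],
    "satisfied_with_price":       [2, 4, 5, 1, 3, 5, 2, 4],
    "expects_long_life":          [3, 4, 4, 2, 3, 5, 1, 4],
}
# Dependent variable: stated likelihood of buying the widget in the future.
likely_to_buy = [2, 4, 5, 1, 3, 5, 1, 4]

# Importance is derived from how closely each driver tracks purchase intent,
# rather than from asking respondents directly how important each driver is.
derived_importance = {
    name: pearson(answers, likely_to_buy) for name, answers in drivers.items()
}

for name, r in sorted(derived_importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: r = {r:.2f}")
```

In this invented data set, nearly everyone says they care about sustainability, but that answer tracks purchase intent less closely than the price item does. That is the pattern a direct importance question in a grid would tend to hide.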

Grid questions are the most over-rated and overused types of survey questions. Clients like them, but they tend to provide poor-quality data. Use them sparingly and look for alternatives.

Pre-Election Polling and Baseball Share a Lot in Common

The goal of a pre-election poll is to predict which candidate will win an election and by how much. Pollsters work towards this goal by 1) obtaining a representative sample of respondents, 2) determining which candidate a respondent will vote for, and 3) predicting the chances each respondent will take the time to vote.

All three of these steps involve error. It is the first one, obtaining a representative sample of respondents, which has changed the most in the past decade or so.

It is the third step that separates pre-election polling from other forms of polling and survey research. Statisticians must predict how likely each person they interview will be to vote. This is called their “Likely Voter Model.”
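To make the idea concrete, here is a minimal sketch of one simple way a turnout estimate can enter the numbers: weight each respondent's stated choice by an assumed probability of voting. The respondents, the `p_vote` values, and the function name are all hypothetical, and real likely voter models are more elaborate and differ from pollster to pollster.

```python
# Hypothetical respondents: stated preference plus an assumed probability of voting.
respondents = [
    {"choice": "Candidate A", "p_vote": 0.9},
    {"choice": "Candidate B", "p_vote": 0.6},
    {"choice": "Candidate A", "p_vote": 0.3},
    {"choice": "Candidate B", "p_vote": 0.8},
    {"choice": "Candidate B", "p_vote": 0.5},
]

def turnout_weighted_shares(rows):
    """Share of expected votes per candidate, weighting each respondent by p_vote."""
    totals = {}
    for row in rows:
        totals[row["choice"]] = totals.get(row["choice"], 0.0) + row["p_vote"]
    expected_voters = sum(totals.values())
    return {candidate: weight / expected_voters for candidate, weight in totals.items()}

shares = turnout_weighted_shares(respondents)
for candidate, share in shares.items():
    print(f"{candidate}: {share:.1%}")
```

The result depends heavily on the `p_vote` column, which is exactly where the pollster's judgment comes in.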

As I state in POLL-ARIZED, this is perhaps the most subjective part of the polling process. The biggest irony in polling is that it becomes an art when we hand the data to the scientists (methodologists) to apply a Likely Voter Model.

It is challenging to understand what pollsters do in their Likely Voter Models and perhaps even more challenging to explain.  

An example from baseball might provide a sense of what pollsters are trying to do with these models.

Suppose Mike Trout (arguably the most underappreciated sports megastar in history) is stepping up to the plate. Your job is to predict Trout’s chances of getting a hit. What is your best guess?

You could take a random guess between 0 and 100%. But, since that would give you a 1% chance of being correct, there must be a better way.

A helpful approach comes from a subset of statistical theory called Bayesian statistics. This theory says we can start with a baseline of Trout’s hit probability based on past data.

For instance, we might see that so far this year, the overall major league batting average is .242. So, we might guess that Trout’s probability of getting a hit is 24%.

This is better than a random guess. But, we can do better, as Mike Trout is no ordinary hitter.

We might notice there is even better information out there. Year-to-date, Trout is batting .291. So, our guess for his chances might be 29%. Even better.

Or, we might see that Trout’s lifetime average is .301 and that he hit .333 last year. Since we believe in a concept called regression to the mean, that would lead us to think that his batting average should be better for the rest of the season than it is currently. So, we revise our estimate upward to 31%.

There is still more information we can use. The opposing pitcher is Justin Verlander. Verlander is a rare pitcher who has owned Trout in the past – Trout’s average is just .116 against Verlander. This causes us to revise our estimate downward a bit. Perhaps we take it to about 25%.

We can find even more information. The bases are loaded. Trout is a clutch hitter, and his career average with men on base is about 10 points higher than when the bases are empty. So, we move our estimate back up to about 28%.

But it is August. Trout has a history of batting well early in and late in the season, but he tends to cool off during the dog days of summer. So, we decide to end this and settle on a probability of 25%.

This sort of analysis could go on forever. Every bit of information we gather about Trout can conceivably help make a better prediction for his chances. Is it raining? What is the score? What did he have for breakfast? Is he in his home ballpark? Did he shave this morning? How has Verlander pitched so far in this game? What is his pitch count?

There are pre-election polling analogies in this baseball example, particularly if you follow the probabilistic election models created by organizations like FiveThirtyEight and The Economist.

Just as we might use Trout’s lifetime average as our “prior” probability, these models will start with macro variables for their election predictions. They will look at the past implications of things like incumbency, approval ratings, past turnout, and economic indicators like inflation, unemployment, etc. In theory, these can adjust our assumptions of who will win the election before we even include polling data.

Of course, using Trout’s lifetime average or these macro variables in polling will only be helpful to the extent that the future behaves like the past. And therein lies the rub – overreliance on past experience makes these models inaccurate during dynamic times.

Part of why pollsters missed badly in 2020 is that unique things were going on – a global pandemic, changed methods of voting, increased turnout, etc. In baseball, perhaps this is a year with a juiced baseball, or Trout is dealing with an injury.

The point is that while unprecedented things are unpredictable, they happen with predictable regularity. There is always something unique about an election cycle or a Mike Trout at bat.

The most common question I am getting from readers of POLL-ARIZED is, “will the pollsters get it right in 2024?” My answer is that since pollsters are applying past assumptions in their model, they will get it right to the extent that the world in 2024 looks like the world did in 2020, and I would not put my own money on it.

I make a point in POLL-ARIZED that pollsters’ models have become too complex. While in theory, a model’s fit to the data it was built on never gets worse when you add in more variables, in practice, this has made these models uninterpretable. Pollsters include so many variables in their likely voter models that many of their adjustments cancel each other out. They are left with a model with no discernible underlying theory.

If you look closely, we started with a probability of 24% for Trout. Even after looking at a lot of other information and making reasonable adjustments, we still ended up with a prediction of 25%. The election models are the same way. They include so many variables that they can cancel out each other’s effects and end up with a prediction that looks much like the raw data did before the methodologists applied their wizardry.
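The sequence of adjustments can be replayed in a short sketch. The estimates are the ones from the Trout walk-through above; the labels and variable names are only illustrative.

```python
# Estimates from the Trout walk-through, in the order they were made.
steps = [
    ("League batting average", 0.24),
    ("Trout's year-to-date average", 0.29),
    ("Lifetime average and last season", 0.31),
    ("Facing Verlander", 0.25),
    ("Bases loaded", 0.28),
    ("August slump", 0.25),
]

previous = None
for label, estimate in steps:
    # Show each estimate and how far it moved from the one before it.
    change = "" if previous is None else f" ({(estimate - previous) * 100:+.0f} points)"
    print(f"{label}: {estimate:.0%}{change}")
    previous = estimate

# Net change from the first estimate to the last, in percentage points.
net_change_points = round((steps[-1][1] - steps[0][1]) * 100)
# Sum of the absolute size of every adjustment along the way.
total_movement_points = round(
    sum(abs(later[1] - earlier[1]) for earlier, later in zip(steps, steps[1:])) * 100
)
print(f"Net change: {net_change_points:+d} points; total movement: {total_movement_points} points")
```

The adjustments add up to far more movement than the net change, which is the sense in which they largely cancel each other out.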

This effort is better spent on getting better input for the models by investing in generating the trust needed to increase the response rates we get to our surveys and polls. Improving the quality of our data input will increase the predictive quality of the polls more than coming up with more complicated ways to weight the data.

Of course, in the end, one candidate wins, and the other loses, and Mike Trout either gets a hit, or he doesn’t, so the actual probability moves to 0% or 100%. Trout cannot get 25% of a hit, and a candidate cannot win 79% of an election.

As I was writing this, I looked up the last time Trout faced Verlander. It turns out Verlander struck him out!

Things That Surprised Me When Writing a Book

I recently published a book outlining the challenges election pollsters face and the implications of those challenges for survey researchers.

This book was improbable. I am not an author nor a pollster, yet I wrote a book on polling. It is a result of a curiosity that got away from me.

Because I am a new author, I thought it might be interesting to list unexpected things that happened along the way. I had a lot of surprises:

  • How quickly I wrote the first draft. Many authors toil for years on a manuscript. The bulk of POLL-ARIZED was composed in about three weeks, working a couple of hours daily. The book covers topics central to my career, and it was a matter of getting my thoughts typed and organized. I completed the entire first draft before telling my wife I had started it.
  • How long it took to turn that first draft into a final draft. After I had all my thoughts organized, I felt a need to review everything I could find on the topic. I read about 20 books on polling and dozens of academic papers, listened to many hours of podcasts, interviewed polling experts, and spent weeks researching online. I convinced a few fellow researchers to read the draft and incorporated their feedback. The result was a refinement of my initial draft and arguments and the inclusion of other material. This took almost a year!
  • How long it took to get the book from a final draft until it was published. I thought I was done at this point. Instead, it took another five months to get it in shape to publish – to select a title, get it edited, commission cover art, set it up on Amazon and other outlets, etc. I used Scribe Media, which was expensive, but this process would have taken me a year or more if I had done it without them.
  • That going for a long walk is the most productive writing tactic ever. Every good idea in the book came to me when I trekked in nature. Little of value came to me when sitting in front of a computer. I would go for long hikes, work out arguments in my head, and brew a strong cup of coffee. For some reason, ideas flowed from my caffeinated state of mind.
  • That writing a book is not a way to make money. I suspected this going in, but it became clear early on that this would be a money-losing project. POLL-ARIZED has exceeded my sales expectations, but it cost more to publish than it will ever make back in royalties. I suspect publishing this book will pay back in our research work, as it establishes credibility for us and may lead to some projects.
  • Marketing a book is as challenging as writing one. I guide large organizations on their marketing strategy, yet I found I didn’t have the first clue about how to promote this book. I would estimate that the top 10% of non-fiction books make up 90% of the sales, and the other 90% of books are fighting for the remaining 10%.
  • Because the commission on a book is a few dollars per copy, it proved challenging to find marketing tactics that pay back. For instance, I thought about doing sponsored ads on LinkedIn. It turns out that the per-click charge for those ads was more than the book’s list price. The best money I spent to promote the book was sponsored Amazon searches. But even those failed to break even.
  • Deciding to keep the book at a low price proved wise. So many people told me I was nuts to hold the eBook at 99 cents for so long or keep the paperback affordable. I did this because it was more important to me to get as many people to read it as possible than to generate revenue. Plus, a few college professors have been interested in adopting the book for their survey research courses. I have been studying the impact of book prices on college students for about 20 years, and I thought it was right not to contribute to the problem.
  • BookBub is incredible if you are lucky enough to be selected. BookBub is a community of voracious readers. I highly recommend joining if you read a lot. Once a week, they email their community about new releases they have vetted and like. They curate a handful of titles out of thousands of submissions. I was fortunate that my book got selected. Some authors angle for a BookBub deal for years and never get chosen. The sales volume for POLL-ARIZED went up by a factor of 10 in one day after the promotion ran.
  • Most conferences and some podcasts are “pay to play.” Not all of them, but many conferences and podcasts will not support you unless you agree to a sponsorship deal. When you see a research supplier speaking at an event or hear them on a podcast, they may have paid the hosts something for the privilege. This bothers me. I understand why they do this, as they need financial support. Yet, I find it disingenuous that they do not disclose this – it is on the edge of being unethical. It harms their product. If a guest has to pay to give a conference presentation or talk on a podcast, it pressures them to promote their business rather than have an honest discussion of the issues. I will never view these events or podcasts the same. (If you see me at an event or hear me on a podcast, be assured that I did not pay anything to do so.)
  • That the industry associations didn’t want to give the book attention. If you have read POLL-ARIZED, you will know that it is critical (I believe appropriately and constructively) of the polling and survey research fields. The three most important associations rejected my proposals to present and discuss the book at their events. This floored me, as I cannot think of any topics more essential to this industry’s future than those I raise in the book. Even insights professionals who have read the book and disagree with my arguments have told me that I am bringing up points that merit discussion. This cold shoulder from the associations made me feel better about writing that “this is an industry that doesn’t seem poised to fix itself.”
  • That clients have loved the book. The most heartwarming part of the process is that it has reconnected me with former colleagues and clients from a long research career. Everyone I have spoken to who is on the client-side of the survey research field has appreciated the book. Many clients have bought it for their entire staff. I have had client-side research directors I have never worked with tell me they loved the book.
  • That some of my fellow suppliers want to kill me. The book lays our industry bare, and not everyone is happy about that. I had a competitor ask me, “Why are you telling clients to ask us what our response rates are?” I stand behind that!
  • How much I learned along the way. There is something about getting your thoughts on paper that creates a lot of learning. There is a saying that the best way to learn a subject is to teach it. I would add that trying to write a book about something can teach you what you don’t know. That was a thrill for me. But then again, I was the type of person who would attend lectures for classes I wasn’t even taking while in college. I started writing this book to educate myself, and it has been a great success in that sense.
  • How tough it was for me to decide to publish it. At every point in the process, I considered not publishing this book. I found I wanted to write it a lot more than publish it. I suffered from typical author fears that it wouldn’t be good enough, that my peers would find my arguments weak, or that it would bring unwanted attention to me rather than the issues the book presents. I don’t regret publishing it, but it would never have happened without encouragement from the few people who read it in advance.
  • The respect I gained for non-fiction authors. I have always been a big reader. I now realize how much work goes into this process, with no guarantee of success. I have always told people that long-form journalism is the profession I respect the most. Add “non-fiction” writers to that now!

Almost everyone who has contacted me about the book has asked me if I will write another one. If I do, it will likely be on a different topic. If I learned anything, it is that this process requires selecting an issue you care about passionately. Journalists are people who can write good books about almost anything. The rest of us mortals must choose a topic we are super interested in, or our books will be awful.

I’ve got a few dancing around in my head, so who knows, maybe you’ll see another book in the future.

For now, it is time to get back to concentrating on our research business!

The Insight that Insights Technology is Missing

The market research insights industry has long been characterized by a resistance to change. This likely results from the academic nature of what we do. We don’t like to adopt new ways of doing things until they have been proven and studied.

I would posit that the insights industry has not seen much change since the transition from telephone to online research occurred in the early 2000s. And even that transition created discord within the industry, with many traditional firms resistant to moving on from telephone studies because online data collection had not been thoroughly studied and vetted.

In the past few years, the insights industry has seen an influx of capital, mostly from private equity and venture capital firms. The conditions for this cash infusion have been ripe. A strong and growing demand for insights, a conservative industry that is slow to adapt, and new technologies that automate many parts of a research project have all come together simultaneously.

Investing organizations see this enormous business opportunity. Research revenues are growing, and new technologies are lowering costs and shortening project timeframes. It is a combustible business situation that needs a capital accelerant.

Old school researchers, such as myself, are becoming nervous. We worry that automation will harm our businesses and that the trend toward DIY projects will result in poor-quality studies. Technology is threatening the business models under which we operate.

The trends toward investment in automation in the insights industry are clear. Insights professionals need to embrace this and not fight it.

However, although the movement toward automation will result in faster and cheaper studies, this investment ignores the threats that declining data quality creates. In the long run, this automation will accelerate the decline in data quality rather than improve it.

It is great that we are finding ways to automate time-consuming research tasks, such as questionnaire authoring, sampling, weighting, and reporting. This frees up researchers to concentrate on drawing insights out of the data. But we can apply all the automation in the world to the process, and if we do not do something about data quality, it will not increase the value clients receive.

I argue in POLL-ARIZED that the elephant in the research room is the fact that very few people want to take our surveys anymore. When I began in this industry, I routinely fielded telephone projects with 70-80% response rates. Currently, telephone and online response rates are between 3% and 4% for most projects.

Response rates are not everything. You can make a compelling argument that they do not matter at all. There is no problem as long as the 3-4% response we get is representative. I would rather have a representative 3% answer a study than a biased 50%.
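The claim that a representative 3% beats a biased 50% can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the population size, the true 50% share, and the 60%/40% response propensities are hypothetical numbers chosen for illustration, not figures from any real study.

```python
import random


def simulate(pop_size=100_000, true_share=0.50, seed=42):
    """Compare a small representative sample with a large biased one.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # True means the person holds the opinion we are trying to measure.
    population = [rng.random() < true_share for _ in range(pop_size)]
    actual = sum(population) / pop_size

    # Scenario A: a 3% response rate where everyone is equally likely to respond.
    representative = [p for p in population if rng.random() < 0.03]

    # Scenario B: roughly a 50% response rate, but people who hold the opinion
    # respond 60% of the time and everyone else responds 40% of the time.
    biased = [p for p in population if rng.random() < (0.60 if p else 0.40)]

    return {
        "actual": actual,
        "representative_n": len(representative),
        "representative_estimate": sum(representative) / len(representative),
        "biased_n": len(biased),
        "biased_estimate": sum(biased) / len(biased),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        if isinstance(value, float):
            print(f"{name}: {value:.3f}")
        else:
            print(f"{name}: {value}")
```

Under these assumptions, the large biased sample should settle near 60% no matter how many people respond, because 0.6 × 0.5 / (0.6 × 0.5 + 0.4 × 0.5) = 0.6. The small representative sample should land within a few points of the true 50%. A bigger sample shrinks random error, but it does nothing about who chose to answer.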

But, the fundamental problem is that this 3-4% is not representative. Only about 10% of the US population is currently willing to take surveys. What is happening is that this same 10% is being surveyed repeatedly. In the most recent project Crux fielded, respondents had taken an average of 8 surveys in the past two weeks. So, we have about 10% of the population taking surveys every other day, and our challenge is to make them represent the rest of the population.
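As a rough check on the "every other day" arithmetic, the figure of 8 surveys in two weeks can be converted into a cadence. The function names below are hypothetical, and the annualized number is an extrapolation that assumes the two-week pace holds all year, which the source does not claim.

```python
def days_between_surveys(surveys, window_days):
    """Average number of days between surveys over the window."""
    return window_days / surveys


def annualized_surveys(surveys, window_days, days_per_year=365):
    """Surveys per year if the observed pace continued (an assumption)."""
    return surveys * days_per_year / window_days


if __name__ == "__main__":
    # 8 surveys in the past 14 days, as reported for the Crux project.
    print(days_between_surveys(8, 14))       # 1.75 days between surveys
    print(round(annualized_surveys(8, 14)))  # about 209 surveys per year
```

One survey every 1.75 days is close to the "every other day" in the text, and it would work out to roughly 200 surveys per respondent per year if that pace were sustained.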

Automate all you want, but the data that are the backbone of the insights we are producing quickly and cheaply are of historically low quality.

The new investment flooding into research technology will contribute to this problem. More studies will be done that are poorly designed, with long, tortuous questionnaires. Many more surveys will be conducted, fewer people will be willing to take them, and response rates will continue to fall.

There are plenty of methodologists working on these problems. But, for the most part, they are working on new ways to weight the data we can obtain rather than on ways to compel more response. They are improving data quality, but only slightly, and the insights field continues to ignore the most fundamental problem we have: people do not want to take our surveys.

For the long-term health of our field, that is where the investment should go.

In POLL-ARIZED, I list ten potential solutions to this problem. I am not optimistic that any of them will be able to stem the trend toward poor data quality. But, I am continually frustrated that our industry has not come together to work towards expanding respondent trust and the base of people willing to take part in our projects.

The trend towards research technology and automation is inevitable. It will be profitable. But, unless we address data quality issues, it will ultimately hasten the decline of this field.

POLL-ARIZED available on May 10

I’m excited to announce that my book, POLL-ARIZED, will be available on May 10.
 
After the last two presidential elections, I was fearful my clients would ask a question I didn’t know how to answer: “If pollsters can’t predict something as simple as an election, why should I believe my market research surveys are accurate?”
 
POLL-ARIZED is the result of a year-long rabbit hole that question led me down! In the process, I learned a lot about why polls matter, how today’s pollsters are struggling, and what the insights industry should do to improve data quality.
 
I am looking for a few more people to read an advance copy of the book and write an Amazon review on May 10. If you are interested, please send me a message at poll-arized@cruxresearch.com.

