Posts Tagged 'Biases and errors'

The myth of the random sample

Sampling is at the heart of market research. We ask a few people questions and then assume everyone else would have answered the same way.

Sampling works in all types of contexts. Your doctor doesn’t need to test all of your blood to determine your cholesterol level – a few ounces will do. Chefs taste a spoonful of their creations and then assume the rest of the pot will taste the same. And, we can predict an election by interviewing a fairly small number of people.

The mathematical procedures we apply to samples to project to a broader population all assume that we have a random sample. Or, as I tell research analysts: everything they taught you in statistics assumes you have a random sample. T-tests, hypothesis tests, regressions, etc. all have a random sample as a requirement.
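
To make that concrete, here is a minimal sketch (with entirely made-up numbers) of the kind of calculation whose validity rests on that assumption – a two-sample t-test comparing ratings from two groups of respondents. The p-value it produces is only meaningful if each group is a random sample of the population it is meant to represent.

```python
# Minimal sketch with hypothetical data: a two-sample t-test of the kind
# routinely run on survey results. The p-value is only valid if each group
# is a random sample of its population -- the assumption discussed above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(7.2, 1.5, 300)   # hypothetical ratings from sample A
group_b = rng.normal(7.0, 1.5, 300)   # hypothetical ratings from sample B

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# With a convenience (non-random) sample, this p-value reflects sampling error
# only; it says nothing about selection bias, which may be far larger.
```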

Here is the problem: We almost never have a random sample in market research studies. I say “almost” because I suppose it is possible to do, but over 30 years and 3,500 projects I don’t think I have been involved in even one project that can honestly claim a random sample. A random sample is sort of a Holy Grail of market research.

A random sample might be possible if you have a captive audience. You can randomly sample some of the passengers on a flight, a few students in a classroom, or prisoners in a detention facility. As long as you are not trying to project beyond that flight, that classroom, or that jail, the math behind random sampling will apply.

Here is the bigger problem: Most researchers don’t recognize this, disclose this, or think through how to deal with it. Even worse, many purport that their samples are indeed random, when they are not.

For a bit of research history, once the market research industry really got going, the telephone random digit dial (RDD) sample became standard. Telephone researchers could randomly call landline phones. When landline telephone penetration and response rates were both high, this provided excellent data. However, RDD still wasn’t providing a true random, or probability, sample. Some households had more than one phone line (and few researchers corrected for this), many people lived in group situations (colleges, medical facilities) where they couldn’t be reached, some did not have a landline, and even at its peak, telephone response rates were only about 70%. Not bad. But, also, not random.

Once the Internet came of age, researchers were presented with new sampling opportunities and challenges. Telephone response rates plummeted (to 5-10%) making telephone research prohibitively expensive and of poor quality. Online, there was no national directory of email addresses or cell phone numbers and there were legal prohibitions against spamming, so researchers had to find new ways to contact people for surveys.

Initially, and this is still a dominant method today, research firms created opt-in panels of respondents. Potential research participants were asked to join a panel, filled out an extensive demographic survey, and were paid small incentives to take part in projects. These panels suffer from three response issues: 1) not everyone is online or online at the same frequency, 2) not everyone who is online wants to be in a panel, and 3) not everyone in the panel will take part in a study. The result is a convenience sample. Good researchers figured out sophisticated ways to handle the sampling challenges that result from panel-based samples, and they work well for most studies. But, in no way are they a random sample.

River sampling is a term often used to describe respondents who are “intercepted” on the Internet and asked to fill out a survey. Potential respondents are invited via online ads and offers placed on a range of websites. If interested, they are typically pre-screened and sent along to the online questionnaire.

Because so much is known about what people are doing online these days, sampling firms have some excellent science behind how they obtain respondents efficiently with river sampling. It can work well, but response rates are low and the nature of the online world is changing fast, so it is hard to get a consistent river sample over time. Nobody being honest would ever use the term “random sampling” when describing river samples.

Panel-based samples and river samples account for the lion’s share of primary market research conducted today. They are fast and inexpensive, and when conducted intelligently they can approximate the findings of a random sample. They are far from perfect, but I like that the companies providing them don’t promote them as being random samples. They involve some biases, and we deal with these biases as best we can methodologically. But, too often we forget that they violate a key assumption required by the statistical tests we run: that the sample is random. For most studies, they are truly “close enough,” but the problem is we usually fail to state the obvious – that we are using statistical tests that are technically not appropriate for the data sets we have gathered.

Which brings us to a newer, shiny object in the research sampling world: ABS samples. ABS (address-based samples) are purer from a methodological standpoint. While ABS samples have been around for quite some time, they are just now being used extensively in market research.

ABS samples are based on US Postal Service lists. Because USPS has a list of all US households, this list is an excellent sampling frame. (The Census Bureau also has an excellent list, but it is not available for researchers to use.) The USPS list is the starting point for ABS samples.

Research firms will take the USPS list and recruit respondents from it, either to be in a panel or to take part in an individual study. This recruitment can be done by mail, phone, or even online. They often append publicly-known information onto the list.

As you might expect, an ABS approach suffers from some of the same issues as other approaches. Cooperation rates are low and incentives (sometimes large) are necessary. Most surveys are conducted online, and not everyone in the USPS list is online or has the same level of online access. There are some groups (undocumented immigrants, homeless) that may not be in the USPS list at all. Some (RVers, college students, frequent travelers) are hard to reach. There is evidence that ABS approaches do not cover rural areas as well as urban areas. Some households use post office boxes and not residential addresses for their mail. Some use more than one address. So, although ABS lists cover about 97% of US households, the 3% that they do not cover are not randomly distributed.

The good news is, if done correctly, the biases that result from an ABS sample are more “correctable” than those from other types of samples because they are measurable.
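
As an illustration of what “measurable and correctable” means in practice, here is a minimal sketch of post-stratification weighting. The group labels and shares are hypothetical, not drawn from any actual ABS study.

```python
# Minimal sketch of post-stratification weighting, with hypothetical numbers.
# Because an ABS frame can be compared to known household totals, under- and
# over-represented groups can be weighted back into proportion.

population_share = {"urban": 0.80, "rural": 0.20}   # assumed known benchmarks
sample_share     = {"urban": 0.88, "rural": 0.12}   # hypothetical achieved sample

weights = {g: population_share[g] / sample_share[g] for g in population_share}
print(weights)   # rural respondents get a weight of about 1.67

# The correction only works for characteristics we can measure; biases tied to
# unmeasured differences (e.g., willingness to respond) remain.
```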

A recent Pew study indicates that survey bias and the number of bogus respondents is a bit smaller for ABS samples than opt-in panel samples.

But ABS samples are not random samples either. I have seen articles that suggest that of all those approached to take part in a study based on an ABS sample, less than 10% end up in the survey data set.

The problem is not necessarily with ABS samples, as most researchers would concur that they are the best option we have and come the closest to a random sample. The problem is that many firms providing ABS samples are selling them as “random samples,” and that is disingenuous at best. Just because the sampling frame used to recruit a survey panel can claim to be “random” does not imply that the respondents you end up with in a research database constitute a random sample.

Does this matter? In many ways, it likely does not. There are biases and errors in all market research surveys. These biases and errors vary not just by how the study was sampled, but also by the topic of the question, its tone, the length of the survey, etc. Many times, survey errors are not the same throughout an individual survey. Biases in surveys tend to be “known unknowns” – we know they are there, but aren’t sure what they are.

There are many potential sources of error in survey research. I am always reminded of a quote from Humphrey Taylor, the former Chairman of the Harris Poll, who said: “On almost every occasion when we release a new survey, someone in the media will ask, ‘What is the margin of error for this survey?’ There is only one honest and accurate answer to this question — which I sometimes use to the great confusion of my audience — and that is, ‘The possible margin of error is infinite.’” A few years ago, I wrote a post on biases and errors in research, and I was able to quickly name 15 of them before I even had to do an Internet search to learn more about them.

The reality is, the improvement in bias that is achieved by an ABS sample over a panel-based sample is small and likely inconsequential when considered next to the other sources of error that can creep into a research project. Because of this, and the fact that ABS sampling is really expensive, we tend to only recommend ABS panels in two cases: 1) if the study will result in academic publication, as academics are more accepting of data that comes from an ABS approach, and 2) if we are working in a small geography, where panel-based samples are not feasible.

Again, ABS samples are likely the best samples we have at this moment. But firms that provide them often inappropriately portray them as yielding random samples. For most projects, the small improvement in bias they provide is not worth the considerably larger budget and longer study time frame, which is why ABS samples are currently used in a small proportion of research studies. I consider ABS to be “state of the art,” with the emphasis on “art,” as sampling is often less of a science than people think.

Jeff Bezos is right about market research

In an annual shareholder letter, Amazon’s Jeff Bezos recently stated that market research isn’t helpful. That created some backlash among researchers, who reacted defensively to the comment.

For context, below is the text of Bezos’ comment:

No customer was asking for Echo. This was definitely us wandering. Market research doesn’t help. If you had gone to a customer in 2013 and said “Would you like a black, always-on cylinder in your kitchen about the size of a Pringles can that you can talk to and ask questions, that also turns on your lights and plays music?” I guarantee you they’d have looked at you strangely and said “No, thank you.”

This comment is reflective of someone who understands the role market research can play for new products as well as its limitations.

We have been saying for years that market research does a poor job of predicting the success of truly breakthrough products. What was the demand for television sets in the 1920s and 1930s, before there was even content to broadcast or a way to broadcast it? Just a decade ago, did consumers know they wanted a smartphone they would carry around with them all day and constantly monitor? Henry Ford once said that if he had asked customers what they wanted, they would have asked for faster horses, not cars.

In 2014, we wrote a post (Writing a Good Questionnaire is Just Like Brain Surgery) that touched on this issue. In short, consumer research works best when the consumer has a clear frame-of-reference from which to draw. New product studies on line extensions or easily understandable and relatable new ideas tend to be accurate. When the new product idea is harder to understand or is outside the consumer’s frame-of-reference, research isn’t as predictive.

Research can sometimes provide the necessary frame-of-reference. We put a lot of effort into making sure that concept descriptions are understandable. We often go beyond words to do this and produce short videos instead of traditional concept statements. But even then, if the new product being tested is truly revolutionary, the research will probably predict demand inaccurately. The good news is that few new product ideas are actually breakthroughs – they are usually refinements of existing ideas.

Failure to provide a frame-of-reference or realize that one doesn’t exist leads to costly research errors. Because this error is not quantifiable (like a sample error) it gets little attention.

The mistake people are making when reacting to Bezos’ comment is they are viewing it as an indictment of market research in general. It is not. Research still works quite well for most new product forecasting studies. For new products, companies are often investing millions or tens of millions in development, production, and marketing. It usually makes sense to invest in market research to be confident these investments will pay off and to optimize the product.

It is just important to recognize that there are cases where respondents don’t have a good frame-of-reference and the research won’t accurately predict demand. Truly innovative ideas are where this is most likely to happen.

I’ve learned recently that this anti-research mentality pervades companies in Silicon Valley. Rather than use a traditional marketing approach of identifying a need and then developing a product to fulfill it, tech firms often concern themselves first with the technology. They develop a technology and then look for a market for it. This is a risky strategy that likely fails more often than it succeeds, but the successes, like the Amazon Echo, can be massive.

I own an Amazon Echo. I bought it shortly after it was launched having little idea what it was or what it could do. Even now I am still not quite sure what it is capable of doing. It probably has a lot of potential that I can’t even conceive of. I think it is still the type of product that might not be improved much by market research, even today, when it has been on the market for years.

Will adding a citizenship question to the Census harm the Market Research Industry?

The US Supreme Court appears likely to allow the Department of Commerce to reinstate a citizenship question on the 2020 Census. This is largely viewed as a political controversy at the moment. The inclusion of a citizenship question has proven to dampen response rates among non-citizens, who tend to be people of color. The result will be gains in representation for Republicans at the expense of Democrats (political district lines are redrawn every 10 years as a result of the Census). Federal funding will likely decrease for states with large immigrant populations.

It should be noted that the Census Bureau itself has come out against this change, arguing that it will result in an undercount of about 6.5 million people. Yet, the administration has pressed forward and has not committed the funds the Census Bureau needs to fully research the implications. The concern isn’t just about non-response from non-citizens. In tests done by the Census Bureau, non-citizens were also more likely than citizens to respond inaccurately to this question, meaning the resulting data will be inaccurate.

Clearly this is a hot-button political issue. However, there is not much talk of how this change may affect research. Census data are used to calibrate most research studies in the US, including academic research, social surveys, and consumer market research. Changes to the Census may have profound effects on data quality.

The Census serves as a hidden backbone for most research studies whether researchers or clients realize it or not. Census information helps us make our data representative. In a business climate that is becoming more and more data-driven the implications of an inaccurate Census are potentially dire.

We should be primarily concerned that the Census is accurate regardless of the political implications. Adding questions that temper response will not help accuracy. Errors in the Census have a tendency to become magnified in research. For example, in new product research it is common to project study data from about a thousand respondents to a universe of millions of potential consumers. Even a small error in the Census numbers can lead businesses to make erroneous investments. These errors create inefficiencies that reverberate throughout the economy. Political concerns aside, US businesses undoubtedly suffer from a flawed Census. Marketing becomes less efficient.
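
A back-of-the-envelope sketch, using entirely hypothetical numbers, shows how that magnification works when projecting survey results to a universe estimate:

```python
# Hypothetical illustration of how a small Census error is magnified when
# purchase intent from ~1,000 respondents is projected to a universe of
# millions of households. All figures are invented for illustration.

purchase_intent = 0.10            # 10% of respondents say they would buy
true_universe   = 50_000_000      # actual number of target households
census_universe = 51_500_000      # Census overstates the universe by 3%

projected = purchase_intent * census_universe
actual    = purchase_intent * true_universe
print(f"Overstated demand: {projected - actual:,.0f} buyers")   # 150,000 buyers

# A 3% frame error becomes a six-figure error in forecast buyers -- the number
# that capacity, inventory, and marketing budgets get sized against.
```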

All is not lost though. We can make a strong case that there are better, less costly ways to conduct the Census. Methodologists have long suggested that a sampling approach would be more accurate than the current attempt at enumeration. This may never happen for the decennial Census because the Census methodology is encoded in the US Constitution and it might take an amendment to change it.

So, what will happen if this change is made? I suspect that market research firms will switch to using data that come from the Census’ survey programs, such as the American Community Survey (ACS). Researchers will rely less on the actual decennial census. In fact, many research firms already use the ACS rather than the decennial census (and the ACS currently contains the citizenship question).

The Census Bureau will find ways to correct for the resulting error, and to be honest, this may not be too difficult from a methodological standpoint. Businesses will adjust because there will be economic benefits to learning how to deal with a flawed Census, but in the end, this change will take some time for the research industry to address. Figuring things like this out is what good researchers do. While it is unfortunate that this change looks likely to be made, its implications are likely more consequential politically than they are for the research field.

How Did Pollsters Do in the Midterm Elections?

Our most read blog post was posted the morning after the 2016 Presidential election. It is a post we are proud of because it was composed in the haze of a shocking election result. While many were celebrating their side’s victory or in shock over their side’s losses, we mused about what the election result meant for the market research industry.

We predicted pollsters would become defensive and try to convince everyone that the polls really weren’t all that bad. In fact, the 2016 polls really weren’t. Predictions of the popular vote tended to be within a percent and a half or so of the actual result, which was better than in the previous Presidential election in 2012. However, the concern we had about the 2016 polls wasn’t related to how close they were to the result. The issue we had was one of bias: 22 of the 25 final polls we found made an inaccurate prediction, and almost every poll was off in the same direction. That is the very definition of bias in market research.

Suppose that you had 25 people flip a coin 100 times. On average, you’d expect 50% of the flips to be heads. If, say, 48% of the flips came up heads, you shouldn’t be all that worried, as that can happen. But if 22 of the 25 people all had less than 50% heads, you should worry that there was something wrong with the coins or the way they were flipped. That is, in essence, what happened in the 2016 election with the polls.
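
To put a number on that analogy, here is a quick calculation (ours, not part of any original polling analysis) of how often 22 or more of 25 unbiased “coins” would all lean the same way:

```python
# Rough sketch of the coin-flip analogy: if each of 25 polls were equally
# likely to err in either direction, how often would 22 or more all miss
# the same way? (Treats each poll's error direction as a fair coin flip.)
from math import comb

n, k = 25, 22
p_one_side = sum(comb(n, i) for i in range(k, n + 1)) / 2**n   # P(X >= 22)
p_either_side = 2 * p_one_side                                  # either direction
print(f"Probability: about {p_either_side:.4f}")                # roughly 0.0002

# A result this lopsided is far too unlikely to be chance alone, which is why
# it points to systematic bias rather than ordinary sampling error.
```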

Anyway, this post is being composed in the aftermath of the 2018 midterm elections. How did the pollsters do this time?

Let’s start with FiveThirtyEight.com. We like this site because they place probabilities around their predictions. Of course, this gives them plausible deniability when their prediction is incorrect, as probabilities are never 0% or 100%. (In 2016 they gave Donald Trump a 17% chance of winning and then defended their prediction.) But this organization looks at statistics in the right way.

Below is their final forecast and the actual result. Some results are still pending, but at this moment, this is how it shapes up.

  • Prediction: Republicans having 52 seats in the Senate. Result: It looks like Republicans will have 53 seats.
  • Prediction: Democrats holding 234 and Republicans holding 231 House seats. Result: It looks like Democrats will have 235 or 236 seats.
  • Prediction: Republicans holding 26 and Democrats holding 24 Governorships. Result: Republicans now hold 26 and Democrats hold 24 Governorships.

It looks like FiveThirtyEight.com nailed this one. We also reviewed a prediction market and state-level polls, and it seems that this time around the polls did a much better job in terms of making accurate predictions. (We must say that on election night, FiveThirtyEight’s predictions were all over the place when they were reporting in real time. But, as results settled, their pre-election forecast looked very good.)

So, why did polls seem to do so much better in 2018 than 2016? One reason is the errors cancel out when you look at large numbers of races. Sure, the polls predicted Democrats would have 234 seats, and that is roughly what they achieved. But, in how many of the 435 races did the polls make the right prediction? That is the relevant question, as it could be the case that the polls made a lot of bad predictions that compensated for each other in the total.
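
A hypothetical simulation makes the cancellation point concrete: independent race-level errors largely offset in the seat total, while an error shared across races does not. All numbers below are invented for illustration.

```python
# Hypothetical simulation: with independent race-level errors, wrong calls in
# individual races largely offset in the national seat total; a shared
# (correlated) error shifts the total itself.
import random
random.seed(42)

N = 435
true_margins = [random.gauss(0, 0.10) for _ in range(N)]   # invented D-minus-R margins
actual_seats = sum(m > 0 for m in true_margins)

def predicted_seats(shared_bias):
    # each poll = true margin + independent noise + any shared bias
    return sum((m + random.gauss(0, 0.03) + shared_bias) > 0 for m in true_margins)

print("actual seats:", actual_seats)
print("independent errors only:", predicted_seats(0.0))      # lands close to actual
print("with a shared 3-point bias:", predicted_seats(0.03))  # total drifts away
```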

That is a challenging analysis to do because some races had a lot of polling, others did not, and some polls are more credible than others. A cursory look at the polls suggests that 2018 was a comeback victory for the pollsters. We did sense a bit of an over-prediction favoring the Republican Senatorial candidates, but on the House side there does not seem to be a clear bias.

So, what did the pollsters do differently? Not much, really. Online sampling continues to evolve and improve, and the 2016 result has caused polling firms to concentrate more carefully on their sampling. One issue that may have contributed to the 2016 problem is that pollsters have come to rely almost exclusively on the top two or three panel companies. Since 2016, there has been a consolidation among sample suppliers, and as a result we are seeing less variance across polls, as pollsters are largely all drawing on the same few sample sources.

Another key difference was that turnout in the midterms was historically high. Polls are more accurate in high turnout races, as polls almost always survey many people who do not end up showing up on election day, particularly young people. However, there are large and growing demographic differences (age, gender, race/ethnicity) in supporters of each party, and that greatly complicates polling accuracy. Some demographic subgroups are far more likely than others to take part in a poll.

Pollsters are starting to get online polling right. A lot of the legacy firms in this space are still entrenched in the telephone polling world, have been protective of their aging methodologies, and have been slow to change. After nearly 20 years of online polling, the upstarts have finally forced the bigger polling firms to question their approaches and to move toward a world where telephone polling just doesn’t make a lot of sense. Also, many of the old-guard telephone polling experts, who largely led the resistance to online polling, have now retired or passed on.

Gerrymandering helps the pollster as well. It remains the case that relatively few districts are competitive; Pew suggests that only 1 in 7 districts was competitive. You don’t have to be a pollster to accurately predict how about 85% of the races will turn out. Only about 65 of the 435 House races were truly at stake. If you just flipped a coin in those races, your overall prediction of House seats would have been fairly close.
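
A back-of-the-envelope calculation shows why; the safe-seat split below is hypothetical, chosen only to illustrate the arithmetic.

```python
# Hypothetical sketch: if roughly 370 of 435 districts are safe and only ~65
# are competitive, even coin flips in the competitive races land near the
# right seat total. The 203/167 safe-seat split is invented for illustration.
safe_dem, safe_rep, competitive = 203, 167, 65          # hypothetical split (sums to 435)
expected_dem = safe_dem + competitive * 0.5             # coin flip in each toss-up
print(f"Expected Democratic seats: {expected_dem:.0f}") # about 235-236, close to the result
```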

Of course, pollsters may have just gotten lucky. We view that as unlikely, though, as there were too many races. Unlike in 2016, in 2018 we haven’t seen any evidence of bias (in a statistical sense) in the direction of polling errors.

So, this is a good comeback for the polling industry and should give us greater confidence heading into 2020. It is important that the research industry broadcasts this success. When pollsters have a bad day, like they did in 2016, it affects market research as well: our clients lose confidence in our ability to provide accurate information. When the pollsters get it right, it helps the research industry, too.

Is segmentation just discrimination with an acceptable name?

A short time ago we posted a basic explanation of the Cambridge Analytica/Facebook scandal (which you can read here). In it, we stated that market segmentation and stereotyping are essentially the same thing. This presents an ethical quandary for marketers as almost every marketing organization makes heavy use of market segmentation.

To review, marketers place customers into segments so that they can better understand and serve them. Segmentation is at the heart of marketing. Segments can be created along any measurable dimension, but since almost all segments have a demographic component, we will focus on that for this post.

It can be argued that segmentation and stereotyping are the same thing. Stereotyping is attaching a perceived group characteristic to an individual. For instance, if you are older, I might assume your political views lean conservative, since political views tend to be more conservative among older Americans than among younger Americans. If you are female, I might assume you are more likely to be the primary shopper for your household, since females in total do more of the family shopping than males. If you are African-American, I might assume you have a higher likelihood than others of listening to rap music, since that genre indexes high among African-Americans.

These are all stereotypes. Each can be shown to be true of the larger group, but that doesn’t necessarily imply that it applies to every individual in the group. There are plenty of liberal older Americans, females who don’t shop at all, and African-Americans who can’t stand rap music.

Segmenting consumers (which is applying stereotypes) isn’t inherently a bad thing. It leads to customized products and better customer experiences. The problem isn’t stereotyping itself; it is when doing so crosses into discrimination that we have to be careful. As marketers, we tread a fine line. Stereotyping oversimplifies the complexity of consumers by forming an easy-to-understand story. This is useful in some contexts and discriminatory in others.

Some examples are helpful. It can be shown that African-Americans have a lower life expectancy than Whites. A life insurance company could use this information to charge African-Americans higher premiums than Whites. (Indeed, many insurance companies used to do this until various court cases prevented them from doing so.) This is a segmentation practice that many would say crosses a line to become discriminatory.

In a similar vein, car insurance companies routinely charge higher-risk groups (for example, younger drivers and males) higher rates than others. That practice has held up as not being discriminatory from a legal standpoint, largely because the discrimination is not against a traditionally disadvantaged group.

At Crux, we work with college marketers to help them make better admissions offer decisions. Many colleges will document the characteristics of their admitted students who thrive and graduate in good standing. The goal is to profile these students and then look back at how they profiled as applicants. The resulting model can be used to make future admissions decisions. Prospective student segments are established that have high probabilities of success at the institution because they look like students known to be successful, and this knowledge is used to make informed admissions offer decisions.

However, this is a case where a segmentation can cross a line and become discriminatory. Suppose that the students who succeed at the institution tend to be rich, white, female, and from high-performing high schools. By benchmarking future admissions offers against them, an algorithmic bias is created. Fewer minorities, males, and students from urban districts will be extended admissions offers. What turns out to be a good model from a business standpoint ends up perpetuating a bias and places certain demographics of students at a further disadvantage.
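
Here is a stylized sketch of how that algorithmic bias arises. The data are entirely fabricated and the model deliberately simplistic, but the mechanism is the one described above: when the historical “success” label is tilted toward one demographic profile, the fitted model scores applicants outside that profile lower even at identical qualifications.

```python
# Stylized sketch with fabricated data: a model trained on historical "success"
# labels that are correlated with an advantage proxy will score otherwise
# identical applicants differently.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1_000
high_income = rng.binomial(1, 0.5, n)                  # hypothetical advantage proxy
gpa = rng.normal(3.2 + 0.2 * high_income, 0.3, n)      # correlated with the proxy
# Historical success partly reflects supports that came with advantage,
# so the label itself is tilted toward the advantaged group.
p_success = 1 / (1 + np.exp(-(2 * (gpa - 3.3) + 1.0 * high_income)))
success = rng.binomial(1, p_success)

model = LogisticRegression().fit(np.column_stack([gpa, high_income]), success)

# Two applicants with identical GPAs receive different predicted success:
applicants = np.array([[3.5, 1], [3.5, 0]])
print(model.predict_proba(applicants)[:, 1])   # higher score for the advantaged profile
```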

There is a burgeoning field in research known as “predictive analytics.” It allows data jockeys to use past data and artificial intelligence to predict how consumers will react. It is currently used mostly in media buying. Our view is that it helps media efficiency, but only if the future can be counted on to behave like the past. Over-reliance on predictive analytics will result in marketers missing truly breakthrough trends. We don’t have to look further than the 2016 election to see how it can fail: many pollsters based their models on how voters had behaved in the past and, in the process, missed a fundamental shift in voter behavior and made some very poor predictions.

That is perhaps an extreme case, but shows that segmentations can have unintended consequences. This can happen in consumer product marketing as well. Targeted advertising can become formulaic. Brands can decline distribution in certain outlets. Ultimately, the business can suffer and miss out on new trends.

Academics (most notably Kahneman and Tversky) have established that people naturally apply heuristics to decision making. These are “rules of thumb” that are often useful because they allow us to make decisions quickly. However, these academics have also demonstrated that the use of heuristics often results in sub-optimal and biased decision making.

This thinking applies to segmentation. Segmentation allows us to make marketing decisions quickly because we assume that individuals take on the characteristics of a larger group. But, it ignores the individual variability within the group, and often that is where the true marketing insight lies.

We see this all the time in the generational work we do. Yes, Millennials as a group tend to be a bit sheltered, yet confident and team-oriented. But this does not mean all of them fit the stereotype. In fact, odds are high that if you profile an individual from the Millennial generation, he/she will only exhibit a few of the characteristics commonly attributed to the generation. Taking the stereotype too literally can lead to poor decisions.

This is not to say that marketers shouldn’t segment their customers. This is a widespread practice that clearly leads to business results. But they should do so while considering the errors and biases that applying segments can create, and think hard about whether doing so can unintentionally discriminate and, ultimately, harm the business in the long term.

Market research isn’t about storytelling, it is about predicting the future

We recently had a situation that made me question the credibility of market research. We had fielded a study for a long-term client and were excited to view the initial version of the tabs. As we looked at results by age groupings we found them to be surprising. But this was also exciting because we were able to weave a compelling narrative around why the age results seemed counter-intuitive.

Then our programmer called to say a mistake had been made in the tabs and the banner points by age had been mistakenly reversed.

So, we went back to the drawing board and constructed another, equally compelling story as to why the data were behaving as they were.

This made me question the value of research. Good researchers can review seemingly disparate data points from a study and generate a persuasive story as to why they are as they are. Our entire business is based on this skill – in the end clients pay us to use data to provide insight into their marketing issues. Everything else we do is a means to this end.

Our experience with the flipped age banner points illustrates that stories can be created around any data. In fact, I’d bet that if you gave us a randomly-generated data set we could convince you as to its relevance to your marketing issues. I actually thought about doing this – taking the data we obtain by running random data through a questionnaire when testing it before fielding, handing it to an analyst, and seeing what happens. I’m convinced we could show you a random data set’s relevance to your business.

This issue is at the core of polling’s PR problem. We’ve all heard people say that you can make statistics say anything, therefore polls can’t be trusted. There are lies, damn lies, and statistics. I’ve argued against this for a long time because the pollsters and researchers I have known have universally been well-intentioned and objective and never try to draw a pre-determined conclusion from the data.

Of course, this does not mean that the stories we tell with data aren’t correct or enlightening. But, they all come from a perspective. Clients value external suppliers because of this perspective – we are third-party observers who aren’t wrapped up in the internal issues clients face, and we are often in a good position to view data with an objective mind. We’ve worked with hundreds of organizations and can bring those experiences to bear on your study. Our perspective is valuable.

But, it is this perspective that creates an implicit bias in all we do. You will assess a data set from a different set of life experiences and background than I will. That is just human nature. Like all biases in research, our implicit bias may or may not be relevant to a project. In most cases, I’d say it likely isn’t.

So, how can researchers reconcile this issue and sleep at night knowing their careers haven’t been a sham?

First and foremost, we need to stop saying that research is all about storytelling. It isn’t. The value of market research isn’t in the storytelling; it is in the predictions about the future it makes. Clients aren’t paying us to tell them stories. They are paying us to predict the future and recommend actions that will enhance their business. Compelling storytelling is a means to this end but is not the end goal. Data-based storytelling provides credibility to our predictions and gives confidence that they have a high probability of being correct.

In some sense, it isn’t the storytelling that matters, it is the quality of the prediction. I remember a college professor lecturing on this. He would say that the quality of a model is judged solely by its predictive value; its assumptions, arguments, and underpinnings really didn’t matter.

So, how do we deal with this issue … how do we ensure that the stories we tell with data are accurate and fuel confident predictions? Below are some ideas.

  1. Make predictions that can be validated at a later date. Provide a level of confidence or uncertainty around the prediction. Explain what could happen to prevent your prediction from coming true.
  2. Empathize with other perspectives when analyzing data. One of the best “tricks” I’ve ever seen is to re-write a research report as if you were writing it for your client’s top competitor. What conclusions would you draw for them? If it is an issue-based study, consider what you would conclude from the data if your client was on the opposite side of the issue.
  3. Peg all conclusions to specific data points in the study. Straying from the data is where your implicit bias may tend to take over. Being able to tie conclusions directly to data is dependent on solid questionnaire design.
  4. Have a second analyst review your work and play devil’s advocate. Show him/her the data without your analysis and see what stories and predictions he/she can develop independent of you. Have this same person review your story and conclusions and ask him/her to try to knock holes in them. The result is a strengthened argument.
  5. Slow down. It just isn’t possible to provide stories, conclusions, and predictions from research data that consider differing perspectives when you have just a couple of days to do it. This requires more negotiation upfront as to project timelines. The ever-decreasing timeframes for projects are making it difficult to have the time needed to objectively look at data.
  6. Realize that sometimes a story just isn’t there. Your perspective and knowledge of a client’s business should result in a story leaping out at you and telling itself. If this doesn’t happen, it could be because the study wasn’t designed well or perhaps there simply isn’t a story to be told. The world can be a more random place than we like to admit, and not everything you see in a data set is explainable. Don’t force it – developing a narrative that is reaching for explanations is inaccurate and a disservice to your client.

Going Mobile

There has been a critical trend happening in market research data collection that is getting little attention. If you are gathering data in online surveys and polls, chances are that most of your respondents are now answering your questionnaires on mobile devices.

This trend snuck up on us. Just three years ago we were advising clients that about 25% of respondents were answering on mobile devices. Of the last 10 projects we have completed, that percentage is now between 75% and 80%. (Our firm conducts a lot of research with younger respondents, which likely skews this higher for us than for other firms, but it remains the case that mobile response has become the norm.)

Survey response tools have evolved considerably. Respondents initially answered either by mail or in person to an interviewer with a clipboard. Then, people primarily answered surveys on a tethered landline phone. The internet revolution made it possible to move data collection to a (stationary) computer. Now, respondents are choosing to answer on a device that is always with them, whenever and wherever they choose.

There are always “mode” effects in surveys – whereby the mode itself can influence results. However, the mode effects involved in mobile data collection have not been well studied. We will sometimes compare mobile versus non-mobile respondents on a specific project, but in our data this is not a fair comparison because there is self-selection at work. Our respondents can choose to respond either on a mobile device or on a desktop/laptop. If we see differences across modes, it could simply be due to the nature of the choice respondents make and have little to do with the mode itself.

To study this properly, an experimental design would be needed – where respondents are randomly assigned to a mobile or desktop mode. After searching and asking around to the major panel companies, I wasn’t able to find any such studies that have been conducted.
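
For what it is worth, here is a sketch of what such an experiment might look like, with hypothetical response counts; the point is the random assignment and the comparison of answer distributions by mode, not the specific numbers.

```python
# Sketch of the experiment described above, with hypothetical numbers:
# invited respondents are randomly assigned to a mobile or desktop version,
# and the answer distributions are compared by mode.
import random
from scipy.stats import chi2_contingency

random.seed(7)
invited = list(range(2_000))
random.shuffle(invited)
mobile_ids, desktop_ids = invited[:1_000], invited[1_000:]   # random assignment

# Hypothetical results for the same grid question, by assigned mode:
#            top-box  other
observed = [[420, 580],   # mobile
            [470, 530]]   # desktop

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square p = {p_value:.3f}")  # a small p-value suggests a genuine mode effect
```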

That is a bit crazy – our respondents are providing data in a new and interesting fashion, and our industry has done little to study how that might influence the usefulness of the information we collect.

Here is what we do know. First, questionnaires do not look the same on mobile devices as they do on laptops. Most types of questions look similar, but grid-style questions look completely different. Typically, on a mobile device respondents will see one item at a time, while on a desktop they will see the entire list. This will create a greater response-set bias in the desktop version. I’d say this implies that a mode effect likely does occur and that it does not affect all types of questions in the same way.

Second, the limited real estate of a mobile device makes wordy questions and responses look terrible. Depending on the survey system you are using, a lengthy question can require both horizontal and vertical scrolling, almost guaranteeing that respondents won’t attend to it.

Our own anecdotal information suggests that mobile respondents will complete a questionnaire faster, are more likely to suspend the survey part-way, and provide less rich open-ended responses.

So, how can we guard against these mode effects? Well, in the absence of research-on-research that outlines their nature, we have a few suggestions:

  • First and foremost, we need to develop a “mobile-first” mentality when designing questionnaires. Design your questionnaire for mobile and adapt it as necessary for the desktop. This is likely opposite to what you are currently doing.
  • Mobile-first means minimizing wording and avoiding large grid-type questions. If you must use grids, use fewer scale points and keep the number of items to a minimum.
  • Visuals are tough … remember that you have a 5 or 6 inch display to work with when showing images. You are limited here.
  • Don’t expect much from open-ended questions. Open-ends on mobile have to be precisely worded and not vague. We often find that clients expect too much from open-ended responses.
  • Test the questionnaire on mobile. Most researchers who are designing and testing questionnaires are looking at a desktop/laptop screen all day long, and our natural tendency is to only test on a desktop. Start your testing on mobile and then move to the desktop.
  • Shorten your questionnaires. It seems likely that respondents will have more patience for lengthy surveys when they are taking them on stationary devices as opposed to devices that are with them at all (sometimes distracting) times.
  • Finally, educate respondents not to answer these surveys when they themselves are “mobile.” With the millions of invitations and questionnaires our industry is fulfilling, we need to be sure we aren’t distracting respondents while they are driving.

In the long run, as even more respondents choose mobile this won’t be a big issue. But, if you have a tracking study in place you should wonder if the movement to mobile is affecting your data in ways you aren’t anticipating.

Let’s Make Research and Polling Great Again!


The day after the US Presidential election, we quickly wrote and posted about the market research industry’s failure to accurately predict the election.  Since this has been our widest-read post (by a factor of about 10!) we thought a follow-up was in order.

Some of what we predicted has come to pass. Pollsters are being defensive, claiming their polls really weren’t that far off, and are not reaching very deep to try to understand the core of why their predictions were poor. The industry has had a couple of confabs, where the major players have denied a problem exists.

We are at a watershed moment for our industry. Response rates continue to plummet, clients are losing confidence in the data we provide, and we are swimming in so much data our insights are often not able to find space to breathe. And the public has lost confidence in what we do.

Sometimes it is everyday conversations that shed light on a problem. Recently, I was staying at an AirBnB in Florida. The host (Dan) was an ardent Trump supporter, and at one point he asked me what I did for a living. When I told him I was a market researcher, the conversation quickly turned to why the polls failed to accurately predict the winner of the election. By talking with Dan, I quickly realized the implications of Election 2016 polling for our industry. He felt that we can now safely ignore all polls – on issues, approval ratings, voter preferences, etc.

I found myself getting defensive. After all, the polls weren’t off that much.  In fact, they were actually off by more in 2012 than in 2016 – the problem being that this time the polling errors resulted in an incorrect prediction. Surely we can still trust polls to give a good sense of what our citizenry thinks about the issues of the day, right?

Not according to Dan. He didn’t feel our political leaders should pay attention to the polls at all because they can’t be trusted.

I’ve even seen a new term for this bandied about:  poll denialism. It is a refusal to believe any poll results because of their past failures. Just the fact that this has been named should be scary enough for researchers.

This is unnerving not just to the market research industry, but to our democracy in general.  It is rarely stated overtly, but poll results are a key way political leaders keep in touch with the needs of the public, and they shape public policy a lot more than many think. Ignoring them is ignoring public opinion.

Market research remains closely associated with political polling. While I don’t think clients have become as mistrustful about their market research as the public has become about polling, clients likely have their doubts. Much of what we do as market researchers is much more complicated than election polling. If we can’t successfully predict who will be President, why would a client believe our market forecasts?

We are at a defining moment for our industry – a time when clients and suppliers will realize this is an industry that has gone adrift and needs a righting of the course. So what can we do to make research great again?  We have a few ideas.

  1. First and foremost, if you are a client, make greater demands for data quality. Nothing will stimulate the research industry more to fix itself than market forces – if clients stop paying for low quality data and information, suppliers will react.
  2. Slow down! There is a famous saying about all projects.  They have three elements that clients want:  a) fast, b) good, and c) cheap, and on any project you can choose two of these.  In my nearly three decades in this industry I have seen this dynamic change considerably. These days, “fast” is almost always trumping the other two factors.  “Good” has been pushed aside.  “Cheap” has always been important, but to be honest budget considerations don’t seem to be the main issue (MR spending continues to grow slowly). Clients are insisting that studies are conducted at a breakneck pace and data quality is suffering badly.
  3. Insist that suppliers defend their methodologies. I’ve worked for corporate clients, but also many academic researchers. I have found that a key difference between them becomes apparent during results presentations. Corporate clients are impatient and want us to go as quickly as possible over the methodology section and get right into the results.  Academics are the opposite. They dwell on the methodology and I have noticed if you can get an academic comfortable with your methods it is rare that they will doubt your findings. Corporate researchers need to understand the importance of a sound methodology and care more about it.
  4. Be honest about the limitations of your methodology. We often like to say that everything you were ever taught about statistics assumed a random sample and we haven’t seen a study in at least 20 years that can credibly claim to have one.  That doesn’t mean a study without a random sample isn’t valuable, it just means that we have to think through the biases and errors it could contain and how that can be relevant to the results we present. I think every research report should have a page after the methodology summary that lists off the study’s limitations and potential implications to the conclusions we draw.
  5. Stop treating respondents so poorly. I believe this is a direct consequence of the movement from telephone to online data collection. Back in the heyday of telephone research, if you fielded a survey that was too long or was challenging for respondents to answer, it wasn’t long until you heard from your interviewers just how bad your questionnaire was. In an online world, this feedback never gets back to the questionnaire author – and we subsequently beat up our respondents pretty badly.  I have been involved in at least 2,000 studies and about 1 million respondents.  If each study averages 15 minutes that implies that people have spent about 28 and a half years filling out my surveys.  It is easy to lose respect for that – but let’s not forget the tremendous amount of time people spend on our surveys. In the end, this is a large threat to the research industry, as if people won’t respond, we have nothing to sell.
  6. Stop using technology for technology’s sake. Technology has greatly changed our business. But, it doesn’t supplant the basics of what we do or allow us to ignore the laws of statistics.  We still need to reach a representative sample of people, ask them intelligent questions, and interpret what it means for our clients.  Tech has made this much easier and much harder at the same time.  We often seem to do things because we can and not because we should.

The ultimate way to combat “poll denialism” in a “post-truth” world is to do better work, make better predictions, and deliver insightful interpretations. That is what we all strive to do, and it is more important than ever.

 


Visit the Crux Research Website www.cruxresearch.com
