Posts Tagged 'Methodology'

Less useful research questions

Questionnaire “real estate” is limited and valuable. Most surveys fielded today are too long and this causes problems with respondent fatigue and trust. Researchers tend to start the questionnaire design process with good intent and aim to keep survey experiences short and compelling for respondents. However, it is rare to see a questionnaire get shorter as it undergoes revision and review, and many times the result is impossibly long surveys.

One way to guard against this is to be mindful. All questions included should have a clear purpose and tie back to study objectives. Many times, researchers include some questions and options simply out of habit, and not because these questions will add value to the project.

Below are examples of question types that, more often than not, add little to most questionnaires. These questions are common and used out of habit. There are certainly exceptions when it makes sense to include these questions, but for the most part we advise against using them unless there is a specific reason to include them.

Marital status

Somewhere along the way, asking a respondent’s marital status became standard on most consumer questionnaires. Across thousands of studies, I can only recall a few times when I have actually used it for anything. It is appropriate to ask if it is relevant. Perhaps your client is a jewelry company or in the bridal industry. Or, maybe you are studying relationships. However, I would nominate marital status as being the least used question in survey research history.

Other (specify)

Many multiple response questions ask a respondent to select all that apply from a list, and then as a final option will have “other.” Clients constantly pressure researchers to leave a space for respondents to type out what this “other” option is. We rarely look at what they type in. I tell clients that if we expect a lot of respondents to select the other option, it probably means that we have not done a good job at developing the list. It may also mean that we should be asking the question in an open-ended fashion. Even when it is included, most of the respondents who select other will not type anything into the little box anyway.

Don’t Know Options

We recently composed an entire post about when to include a Don’t Know option on a question. To sum it up, the incoming assumption should be that you will not use a Don’t Know option unless you have an explicit reason to do so. Including Don’t Know as an option can make a data set hard to analyze. However, there are exceptions to this rule, as Don’t Know can be an appropriate choice. That said, it is overused on surveys currently.

Open-Ends

The transition from telephone to online research has completely changed how researchers can ask open-ended questions. In the telephone days, we could pose questions that were very open-ended because we had trained interviewers who could probe for meaningful answers. With online surveys, open-ended questions that are too loose rarely produce useful information. Open-ends need to be specific and targeted. We favor including just a handful of open-ends in each survey, each a bit less “open-ended” than what has traditionally been asked.

Grid questions with long lists

We have all seen these. These are long lists of items that require a scaled response, perhaps a 5-point agree/disagree scale. The most common abandon point on a survey is the first time a respondent encounters a grid question with a long list. Ideally, these lists are about 4 to 6 items and there are no more than two or three of them on a questionnaire.

We currently field a study that has a list like this with 28 items in it. There is no way we are getting good information from this question and we are fatiguing the respondent for the remainder of the survey.

Specifying time frames

Survey research often seeks to find out about a behavior across a specified time frame. For instance, we might want to know if a consumer has used a product in the past day, past week, past month, etc. The issue here is not so much the time frame, it is when we consider the responses to be literal. I have seen clients take past day usage and multiply it by 365 and assume that will equate to past year usage. Technically and mathematically, that might be true, but it isn’t how respondents react to questions.

In reality, it is likely accurate to ask if a respondent has done something in the past day. But, once the time frames get longer, we are really asking about “ever” usage. It depends a bit on the purchase cycle of the product and its cost, but for most products, asking if they have used in the past month, 6 months, year, etc. will yield similar responses.

Some researchers work around this by just asking “ever used” and “recently used.” There are times when that works, but we tend to set a reasonable time frame for recent use and go with that, typically within the past week.

Household income

Researchers have asked household income as long as the survey research field has been around. There are at least three serious problems with it. First, many respondents are not knowledgeable about what their household income is. Most households have a “family CFO” who takes the lead on financial issues, and even this person often will not know what the family income is. 

Second, the categories chosen affect the response to the income question, indicating just how unstable it is. Asking household income in say, ten categories versus five categories will not result in comparable data. Respondents tend to assume the middle of the range given is normal, and respond using that as a reference point.

Third, and most importantly, household income is a lousy measure of socio-economic status (SES). Many young people have low annual incomes but a wealthy lifestyle as they are still being supported by their parents. Many older people are retired and may have almost non-existent incomes, yet live a wealthy lifestyle off of their savings. Household income tends to only be a reasonable measure of SES for respondents aged about 30 to 60.

There are better measures of SES. Education level can work, and a particularly good question is to ask the respondent about their mother’s level of education, which has been shown to correlate strongly with SES. We also ask about their attitudes towards their income – whether they have all the money they need, just enough, or if they struggle to meet basic expenses.

Attention spans are getting shorter and as more and more surveys are being completed on mobile devices there are plenty of distractions as respondents answer questionnaires. Engage them, get their attention, and keep the questionnaire short. There may be no such thing as a dumb question, but there are certainly questions that when asked on a survey do not yield useful information.

Should you include a “Don’t Know” option on your survey question?

Questionnaire writers construct a bridge between client objectives and a line of questioning that a respondent can understand. This is an underappreciated skill.

The best questionnaire writers empathize with respondents and think deeply about tasks respondents are asked to perform. We want to strike a balance between the level of cognitive effort required and a need to efficiently gather large amounts of data. If the cognitive effort required is too low, the data captured is not of high quality. If it is too high, respondents get fatigued and stop attending to our questions.

One of the most common decisions researchers have to make is whether or not to allow for a Don’t Know (DK) option on a question. This is often a difficult choice, and the correct answer on whether to include a DK option might be the worst possible answer: “It depends.”

Researchers have genuine disagreements about the value of a DK option. I lean strongly towards not using DK’s unless there is a clear and considered reason for doing so.

Clients pay us to get answers from respondents and to find out what they know, not what they don’t know. Pragmatically, whenever you are considering adding a DK option your first inclination should be that you perhaps have not designed the question well. If a large proportion of your respondent base will potentially choose “don’t know,” odds are high that you are not asking a good question to begin with, but there are exceptions.

If you get in a situation where you are not sure if you should include a DK option, the right thing to do is to think broadly and reconsider your goal: why are you asking the question in the first place? Here is an example which shows how the DK decision can actually be more complicated than it first appears.

We recently had a client that wanted us to ask a question similar to this: “Think about the last soft drink you consumed. Did this soft drink have any artificial ingredients?”

Our quandary was whether we should just ask this as a Yes/No question or to also give the respondent a DK option. There was some discussion back and forth, as we initially favored not including DK, but our client wanted it.

Then it dawned on us that whether or not to include DK depended on what the client wanted to get out of the question. On one hand, the client might want to truly understand if the last soft drink consumed had any artificial ingredients in it, which is ostensibly what the question asks. If this was the goal, we felt it was necessary to better educate the respondent on what an “artificial ingredient” was so they could provide an informed answer and so all respondents would be working from a common definition. Or, alternatively, we could ask for the exact brand and type of soft drink they consumed and then on the back-end code which ones have artificial ingredients and which do not, and thus get a good estimate for the client.

The other option was to realize that respondents might have their own definitions of “artificial ingredients” that may or may not match our client’s definition. Or, they may have no clue what is artificial and what is not.

In the end, we decided to use the DK option in this case because understanding how many people are ignorant to artificial ingredients fit well with our objectives. When we pressed the client, we learned that they wanted to document this ambiguity. If a third of consumers don’t know whether or not their soft drinks have artificial ingredients in them, this would be useful information for our client to know.

This is a good example of how a seemingly simple question can have a lot of thinking behind it and how it is important to contextualize this reasoning when reporting results. In this case, we are not really measuring whether people are drinking soft drinks with artificial ingredients. We are measuring what they think they are doing, which is not the same thing and likely more relevant from a marketing point-of-view.

There are other times when a DK option makes sense to include. For instance, some researchers will conflate the lack of an opinion (a DK response) with a neutral opinion, and these are not the same thing. For example, we could be asking “how would you rate the job Joe Biden is doing as President?” Someone who answers in the middle of the response scale likely has a considered, neutral opinion of Joe Biden. Someone answering DK has not considered the issue and should not be assumed to have a neutral opinion of the president. This is another case where it might make sense to use DK.

However, there are probably more times when including a DK option is a result of lazy questionnaire design than any deep thought regarding objectives. In practice, I have found that it tends to be clients who are inexperienced in market research that press hardest to include DK options.

There are at least a couple of serious problems with including DK options on questionnaires. The first is “satisficing” – which is a tendency respondents have to not place a lot of effort on responding and instead choose the option that requires the least cognitive effort. The DK option encourages satisficing. A DK option also allows respondents to disengage with the survey and can lead to inattention on subsequent items.

DK responses create difficulties when analyzing data. We like to look at questions on a common base of respondents, and that becomes hard to comprehend when respondents choose DK on some questions but not others. Including DK makes it harder to compare results across questions. DK options also limit the ability to use multivariate statistics, as a DK response does not fit neatly on a scale.

Critics would say that researchers should not force respondents to express an opinion they do not have and therefore should provide DK options. I would counter by saying that if you expect a substantial number of people to not have an opinion, odds are high you should reframe the question and ask them about something they do know about. It is usually (but not always) the case that we want to find out more about what people know than what they don’t know.

“Don’t know” can be a plausible response. But, more often than not, even when it is a plausible response, if we feel a lot of people will choose it, we should reconsider why we are asking the question. Yes, we don’t want to force people to express an opinion they don’t have. But rather than include DK, it is better to rewrite a question to be more inclusive of everybody.

As an extreme example, here is a scenario that shows how a DK can be designed out of a question:

We might start with a question the client provides us: “How many minutes does your child spend doing homework on a typical night?” For this question, it wouldn’t take much pretesting to realize that many parents don’t really know the answer to this, so our initial reaction might be to include a DK option. If we don’t, parents may give an uninformed answer.

However, upon further thought, we should realize that we may not really care about how many minutes the child spends on homework and we don’t really need to know whether the parent knows this precisely or not. Thinking even deeper, some kids are much more efficient in their homework time than others, so measuring quantity isn’t really what we want at all. What we really want to know is, is the child’s homework level appropriate and effective from the parent’s perspective?

This probing may lead us down a road to consider better questions, such as “in your opinion, does your child have too much, too little, or about the right amount of homework?” or “does the time your child spends on homework help enhance his/her understanding of the material?” This is another case when thinking more about why we are asking the question tends to result in better questions being posed.

This sort of scenario happens a lot when we start out thinking we want to ask about a behavior, when what we really want to do is ask about an attitude.

The academic research on this topic is fairly inconclusive and sometimes contradictory. I think that is because academic researchers don’t consider the most basic question, which is whether or not including DK will better serve the client’s needs. There are times that understanding that respondents don’t know is useful. But, in my experience, more often than not if a lot of respondents choose DK it means that the question wasn’t designed well. 

Which quality control check questions should you use in your surveys?

While it is no secret that the quality of market research data has declined, how to address poor data quality is rarely discussed among clients and suppliers. When I started in market research more than 30 years ago, telephone response rates were about 60%. Six in 10 people contacted for a market research study would choose to cooperate and take our polls. Currently, telephone response rates are under 5%. If we are lucky, 1 in 20 people will take part. Online research is no better, as even from verified customer lists response rates are commonly under 10% and even the best research panels can have response rates under 5%.

Even worse, once someone does respond, a researcher has to guard against “bogus” interviews that come from scripts and bots, as well as individuals who are cheating on the survey to claim the incentives offered. Poor-quality data is clearly on the rise and is an existential threat to the market research industry that is not being taken seriously enough.

Maximizing response requires a broad approach with tactics deployed throughout the process. One important step is to cleanse each project of bad quality respondents. Another hidden secret in market research is that researchers routinely have to remove anywhere from 10% to 50% of respondents from their database due to poor quality.

Unfortunately, there is no industry standard way of doing this – of identifying poor-quality respondents. Every supplier sets their own policies. This is likely because there is considerable variability in how respondents are sourced for studies, so a one-size-fits-all approach might not be possible, and because some quality checks depend on the specific topic of the study. As a result, researchers are largely left to fend for themselves when trying to come up with a process for removing poor quality respondents from their data.

One of the most important ways to guard against poor quality respondents is to design a compelling questionnaire to begin with. Respondents will attend to a short, relevant survey. Unfortunately, we rarely provide them with this experience.

We have been researching this issue recently in an effort to come up with a workable process for our projects. Below, we share our thoughts. The market research industry needs to work together on this issue, as when one of us removes a bad respondent from a database it helps the next firm with their future studies.

There is a practical concern for most studies – we rarely have room for more than a handful of questions that relate to quality control. In addition to speeder and straight-line checks, studies tend to have room for about 4-5 quality control questions. We use a “three strikes and you’re out” rule: with the exception of “severe speeders” as described below, who are removed outright, respondents are automatically removed if they fail three or more of the checks. If anything, this is probably too conservative, but we’d rather err on the side of retaining some bad quality respondents than inadvertently removing some good quality ones.

When possible, we favor checks that can be done programmatically, without human intervention, as that keeps fielding and quota management more efficient. To the degree possible, all quality check questions should have a base of “all respondents” and not be asked of subgroups.
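
As a rough illustration of how the “three strikes” rule could be applied programmatically, here is a minimal Python sketch. The function name, the record layout, and the check names are all hypothetical; they are not taken from any survey platform, and the threshold of three simply mirrors the rule described above.

```python
def should_remove(failed_checks, severe_speeder, max_strikes=3):
    """Return True if a respondent should be dropped from the data.

    failed_checks: names of the quality checks this respondent failed
    severe_speeder: True if the respondent tripped the severe speeder rule
    max_strikes: number of failed checks that triggers removal (assumed 3)
    """
    if severe_speeder:
        # Severe speeders are removed regardless of their other checks.
        return True
    # Count each distinct check once, then compare against the threshold.
    return len(set(failed_checks)) >= max_strikes


# Hypothetical respondent records, for illustration only.
respondents = {
    "r1": {"failed": ["straight_line", "age_mismatch"], "severe": False},
    "r2": {"failed": ["straight_line", "age_mismatch", "low_incidence"], "severe": False},
    "r3": {"failed": [], "severe": True},
}

removed = sorted(
    rid for rid, r in respondents.items()
    if should_remove(r["failed"], r["severe"])
)
print(removed)
```

Here r1 is retained with two strikes, r2 is removed with three, and r3 is removed as a severe speeder despite failing nothing else.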

Speeder Checks

We aim to set up two criteria: “severe” speeders are those that complete the survey in less than one-third of the median time. These respondents are automatically tossed. “Speeders” are those that take between one-third and one-half of the median time, and these respondents are flagged.

We also consider setting up timers within the survey – for example, we may place timers on a particularly long grid question or a question that requires substantial reading on the part of the respondent. Note that when establishing speeder checks it is important to use the median length as a benchmark and not the mean. In online surveys, some respondents will start a survey and then get distracted for a few hours and come back to it, and this really skews the average survey length. Using the median gets around that.
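
To make the thresholds concrete, here is a small Python sketch of the two speeder rules and of why the median is the safer benchmark. The completion times are made up, and the handling of times that fall exactly on a boundary is our own assumption.

```python
from statistics import mean, median


def classify_speed(duration, median_duration):
    """Classify one completion time against the median completion time."""
    if duration < median_duration / 3:
        return "severe_speeder"  # removed automatically
    if duration < median_duration / 2:
        return "speeder"         # flagged
    return "ok"


# Hypothetical completion times in minutes. The last respondent
# left the survey open for hours before finishing it.
durations = [3, 5, 9, 10, 11, 12, 13, 14, 240]

med = median(durations)  # barely affected by the 240-minute outlier
avg = mean(durations)    # pulled far upward by the outlier

labels = [classify_speed(d, med) for d in durations]
print("median:", med, "mean:", round(avg, 1))
print(labels)
```

With these numbers the median is 11 minutes while the mean is above 35. Had the mean been used as the benchmark, nearly every respondent would have looked like a speeder.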

Straight Line Checks

Hopefully, we have designed our study well and do not have long grid type questions. However, more often than not these types of questions find their way into questionnaires.  For grids with more than about six items, we place a straight-lining check – if a respondent chooses the same response for all items in the grid, they are flagged.
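
A straight-lining check is simple to automate. The Python sketch below assumes one grid's answers arrive as a list of scale values. The function name and the cutoff of seven items (our reading of “more than about six”) are illustrative assumptions.

```python
def is_straight_liner(grid_answers, min_items=7):
    """Flag a respondent who gives the same answer to every item in a long grid.

    grid_answers: scale responses for one grid, in item order
    min_items: smallest grid size that gets checked (assumed 7)
    """
    if len(grid_answers) < min_items:
        return False  # short grids are not checked
    return len(set(grid_answers)) == 1


print(is_straight_liner([3, 3, 3, 3, 3, 3, 3, 3]))  # long grid, identical answers
print(is_straight_liner([3, 4, 3, 2, 5, 3, 3, 1]))  # long grid, varied answers
print(is_straight_liner([4, 4, 4, 4]))              # short grid, not checked
```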

Inconsistent Answers

We consider adding two questions that check for inconsistent answers. First, we re-ask a demographic question from the screener near the end of the survey. We typically use “age” as this question. If the respondent doesn’t choose the same age in both questions, they are flagged.

In addition, we try to find an attitudinal question that is asked that we can re-ask in the exact opposite way. For instance, if earlier we asked “I like to go to the mall” on a 5-point agreement scale, we will also ask the opposite: “I do not like to go to the mall” on the same scale. Those that answer the same for both are flagged. We try to place these two questions a few minutes apart in the questionnaire.
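
Both consistency checks can be sketched in a few lines of Python. The function names are hypothetical. The `exempt_midpoint` switch is our own addition: by default the code follows the rule as stated (identical answers are flagged), but a neutral answer to both a statement and its opposite is arguably consistent, so some teams may prefer to exempt it.

```python
def age_mismatch(screener_age, repeat_age):
    """Flag a respondent whose age answer changed between screener and re-ask."""
    return screener_age != repeat_age


def reversed_item_conflict(original, opposite, midpoint=3, exempt_midpoint=False):
    """Flag identical answers to a statement and its exact opposite.

    original, opposite: responses on the same agreement scale (assumed 1-5)
    midpoint: the neutral point of the scale (assumed 3)
    exempt_midpoint: if True, neutral on both items is not flagged (our assumption)
    """
    if original != opposite:
        return False
    if exempt_midpoint and original == midpoint:
        return False
    return True


print(age_mismatch("25-34", "35-44"))                        # answers differ
print(reversed_item_conflict(5, 5))                          # agrees with both
print(reversed_item_conflict(5, 1))                          # consistent pair
print(reversed_item_conflict(3, 3, exempt_midpoint=True))    # neutral on both
```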

Low Incidence items

This is a low attentiveness flag. It is meant to catch people who say they do really unlikely things and also catch people who say they don’t do likely things because they are not really paying attention to the questions we pose. We design this question specific to each survey and tend to ask what respondents have done over the past weekend. We like to have two high incidence items (such as “watched TV,” or “rode in a car”), 4 to 5 low incidence items (such as “flew in an airplane,” “read an entire book,” “played poker”) and one incredibly low incidence item (such as “visited Argentina”).  Respondents are flagged if they didn’t do at least one of our high incidence items, if they said they did more than two of our low incidence items, or if they say they did our incredibly low incidence item.
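
The three flagging rules translate directly into code. This Python sketch uses only the example items named above, so the low incidence list is shorter than the 4 to 5 items a real survey would carry. It also reads “didn’t do at least one of our high incidence items” as “selected none of them,” which is our interpretation.

```python
HIGH_INCIDENCE = {"watched TV", "rode in a car"}
LOW_INCIDENCE = {"flew in an airplane", "read an entire book", "played poker"}
VERY_LOW_INCIDENCE = {"visited Argentina"}


def low_incidence_flag(selected):
    """Return True if the weekend-activity answers suggest inattention."""
    selected = set(selected)
    missed_all_high = not (selected & HIGH_INCIDENCE)      # did nothing common
    too_many_low = len(selected & LOW_INCIDENCE) > 2       # more than two rare things
    chose_very_low = bool(selected & VERY_LOW_INCIDENCE)   # the near-impossible item
    return missed_all_high or too_many_low or chose_very_low


print(low_incidence_flag({"watched TV", "played poker"}))
print(low_incidence_flag({"watched TV", "visited Argentina"}))
print(low_incidence_flag({"played poker"}))
```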

Open-ended check

We try to include this one in all studies, but sometimes have to skip it if the study is fielding on a tight timeframe because it involves a manual process. Here, we are seeing if a respondent provides a meaningful response to an open-ended question. Hopefully, we can use a question that is already in the study for this, but when we cannot we tend to use one like this: “Now I’d like to hear your opinions about some other things. Tell me about a social issue or cause that you really care about.  What is this cause and why do you care about it?” We are manually looking to see if they provide an articulate answer and they are flagged if they do not.

Admission of inattentiveness

We don’t use this one as a standard, but are starting to experiment with it. As the last question of the survey, we can ask respondents how attentive they were. This will suffer from a large social desirability bias, but we will sometimes directly ask them how attentive they were when taking the survey, and flag those that say they did not pay attention at all.

Traps and misdirects

I don’t really like the idea of “trick questions” – there is research that indicates that these types of questions tend to trap too many “good” respondents. Some researchers feel that these questions lower respondent trust and thus answer quality. That seems to be enough to recommend against this style of question. The most common types I have seen ask a respondent to select the “third choice” below no matter what, or to “pick the color from the list below,” or “select none of the above.” We counsel against using these.

Comprehension

This was recommended by a research colleague and was also mentioned by an expert in a questionnaire design seminar we attended. We don’t use this as a quality check, but like to use it during a soft-launch period. The question looks like this: “Thanks again for taking this survey.  Were there any questions on this survey you had difficulty with or trouble answering?  If so, it will be helpful to us if you let us know what those problems were in the space below.” This is a useful question, but we don’t use it as a quality check per se.

Preamble

I have mixed feelings on this type of quality check, but we use it when we can phrase it positively. A typical wording is like this: “By clicking yes, you agree to continue to our survey and give your best effort to answer 10-15 minutes of questions. If you speed through the survey or otherwise don’t give a good effort, you will not receive credit for taking the survey.”

This is usually one of the first questions in the survey. The argument I see against this is it sets the respondent up to think we’ll be watching them and that could potentially affect their answers. Then again, it might affect them in a good way if it makes them attend more.

I prefer a question that takes a gentler, more positive approach – telling respondents we are conducting this for an important organization and that their opinions will really matter, promising them confidentiality, and then asking them to agree to give their best effort, as opposed to lightly threatening them as this one does.

Guarding against bad respondents has become an important part of questionnaire design, and it is unfortunate that there is no industry standard on how to go about it. We try to build in some quality checks that will at least spot the most egregious cases of poor quality. This is an evolving issue, and it is likely that what we are doing today will change over time, as the nature of market research changes.

Researchers should be mindful of “regression toward the mean”

There is a concept in statistics known as regression toward the mean that is important for researchers to consider as we look at how the COVID-19 pandemic might change future consumer behavior. This concept is as challenging to understand as it is interesting.

Regression toward the mean implies that an extreme example in a data set tends to be followed by an example that is less extreme and closer to the “average” value of the population. A common example is if two parents that are above average in height have a child, that child is demonstrably more likely to be closer to average height than the “extreme” height of their parents.

This is an important concept to keep in mind in the design of experiments and when analyzing market research data. I did a study once where we interviewed the “best” customers of a quick service restaurant, defined as those that had visited the restaurant 10 or more times in the past month. We gave each of them a coupon and interviewed them a month later to determine the effect of the coupon. We found that they actually went to the restaurant less often the month after receiving the coupon than the month before.

It would have been easy to conclude that the coupon caused customers to visit less frequently and that there was something wrong with it (which is what we initially thought). What really happened was a regression toward the mean. Surveying customers who had visited a large number of times in one month made it likely that these same customers would visit a more “average” amount in a following month whether they had a coupon or not. This was a poor research design because we couldn’t really assess the impact of the coupon which was our goal.

Personally, I’ve always had a hard time understanding and explaining regression toward the mean because the concept seems to be counter to another concept known as “independent trials”. You have a 50% chance of flipping a fair coin and having it come up heads regardless of what has happened in previous flips. You can’t guess whether the roulette wheel will come up red or black based on what has happened in previous spins. So, why would we expect a restaurant’s best customers to visit less in the future?

This happens when we begin with a skewed population. The most frequent customers are not “average” and have room to regress toward the mean in the future. Had we surveyed all customers across the full range of patronage there would be no mean to regress to and we could have done a better job of isolating the effect of the coupon.
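
A small simulation can make this less abstract. The Python sketch below is not a reconstruction of the actual study; it invents customers whose underlying visit habits never change and who never receive a coupon, then selects the “best” ones on month-one visits the way the study did. The visit-rate range, the number of customers, and the day-by-day visit model are all assumptions made for illustration.

```python
import random
from statistics import mean

random.seed(7)  # fixed seed so the sketch is repeatable

N_CUSTOMERS = 5_000
DAYS = 30


def visits_in_month(true_rate):
    """Simulate a month: each day the customer visits with probability true_rate / DAYS."""
    return sum(random.random() < true_rate / DAYS for _ in range(DAYS))


# Each customer has a stable underlying habit (assumed 1 to 8 visits per month).
true_rates = [random.uniform(1, 8) for _ in range(N_CUSTOMERS)]

# Two months with no coupon and no change in underlying behavior.
month1 = [visits_in_month(r) for r in true_rates]
month2 = [visits_in_month(r) for r in true_rates]

# Select the "best" customers on month one alone: 10 or more visits.
best = [i for i, v in enumerate(month1) if v >= 10]

best_month1 = mean(month1[i] for i in best)
best_month2 = mean(month2[i] for i in best)

print("best customers selected:", len(best))
print("their average visits, month 1:", round(best_month1, 2))
print("their average visits, month 2:", round(best_month2, 2))
print("all customers, month 1 vs month 2:",
      round(mean(month1), 2), round(mean(month2), 2))
```

The selected group visits less in the second month even though nothing about them changed, while the average across all customers stays roughly flat. Selecting on an extreme month guarantees that some of what was selected was luck.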

Here is another example of regression toward the mean. Suppose the Buffalo Bills quarterback, Josh Allen, has a monster game when they play the New England Patriots. Allen, who has been averaging about 220 yards passing per game in his career, goes off and burns the Patriots for 450 yards. After we are done celebrating and breaking tables in western NY, what would be our best prediction for the yards Allen will throw for the second time the Bills play the Patriots?

Well, you could say the best prediction is 450 yards as that is what he did the first time. But, regression toward the mean would imply that he’s more likely to throw close to his historic average of 220 yards the second time around. So, when he throws for 220 yards the second game it is important to not give undue credit to Bill Belichick for figuring out how to stop Allen.

Here is another sports example. I have played (poorly) in a fantasy baseball league for almost 30 years. In 2004, Derek Jeter entered the season as a career .317 hitter. After the first 100 games or so he was hitting under .200. The person in my league that owned him was frustrated so I traded for him. Jeter went on to hit well over .300 the rest of the season. This was predictable because there wasn’t any underlying reason (like injury) for his slump. His underlying average was much better than his current performance and because of the concept of regression toward the mean it was likely he would have a great second half of the season, which he did.

There are interesting HR examples of regression toward the mean. Say you have an employee that does a stellar job on an assignment – over and above what she normally does. You praise her and give her a bonus. Then, you notice that on the next assignment she doesn’t perform on the same level. It would be easy to conclude that the praise and bonus caused the poor performance when in reality her performance was just regressing back toward the mean. I know sales managers who have had this exact problem – they reward their highest performers with elaborate bonuses and trips and then notice that the following year they don’t perform as well. They then conclude that their incentives aren’t working.

The concept is hard at work in other settings. Mutual funds that outperform the market tend to fall back in line the next year. You tend to feel better the day after you go to the doctor. Companies profiled in “Good to Great” tend to have hard times later on.

Regression toward the mean is important to consider when designing sampling plans. If you are sampling an extreme portion of a population it can be a relevant consideration. Sample size is also important. When you have just a few cases of something, mathematically an extreme response can skew your mean.

The issue to be wary of is that when we fail to consider regression toward the mean, we tend to overstate the importance of correlation between two things. We think our mutual fund manager is a genius when he just got lucky, that our coupon isn’t working, or that Josh Allen is becoming the next Drew Brees. All of these could be true, but be careful in how you interpret data that result from extreme or small samples.

How does this relate to COVID? Well, at the moment, I’d say we are still in an “inflated expectations” portion of a hype curve when we think of what permanent changes may take place resulting from the pandemic. There are a lot of examples. We hear that commercial real estate is dead because businesses will keep employees working from home. Higher education will move entirely online. In-person qualitative market research will never happen again. Business travel is gone forever. We will never again work in an office setting. Shaking hands is a thing of the past.

I’m not saying there won’t be a new normal that results from COVID, but if we believe in regression toward the mean and the hype curve we’d predict that the future will look more like the past than how it is currently being portrayed. The post-COVID world will certainly look more like the past than a more extreme version of the present. We will naturally regress back toward the past and not to a more extreme version of current behaviors. The “mean” being regressed to has likely changed, but not as much as the current, extreme situation implies.

“Margin of error” sort of explained (+/-5%)

It is now September of an election year. Get ready for a two-month deluge of polls and commentary on them. One thing you can count on is reporters and pundits misinterpreting the meaning behind “margin of error.” This post is meant to simplify the concept.

Margin of error refers to sampling error and is present on every poll or market research survey. It can be mathematically calculated. All polls seek to figure out what everybody thinks by asking a small sample of people. There is always some degree of error in this.

The formula for margin of error is fairly simple and depends mostly on two things: how many people are surveyed and their variability of response. The more people you interview, the lower (better) the margin of error. The more the people you interview give the same response (lower variability), the better the margin of error. If a poll interviews a lot of people and they all seem to be saying the same thing, the margin of error of the poll is low. If the poll interviews a small number of people and they disagree a lot, the margin of error is high.
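The standard textbook formula for a proportion from a simple random sample is z × sqrt(p × (1 − p) / n). Here is a minimal Python sketch of it. The function name `margin_of_error` and the example sample sizes are only illustrative, and the default `z=1.96` is the value that corresponds to a 95% confidence level.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion from a simple random sample.

    n: number of respondents
    p: share giving a particular answer (0.5 is the most variable case)
    z: z-score for the confidence level (1.96 corresponds to 95%)
    """
    return z * math.sqrt(p * (1 - p) / n)

# More interviews -> smaller margin of error.
for n in (100, 400, 1000):
    print(n, round(margin_of_error(n) * 100, 1))

# Less disagreement (p far from 0.5) -> smaller margin of error.
print(round(margin_of_error(400, p=0.9) * 100, 1))
```

With an evenly split response, 100 interviews give a margin of error of roughly 10 points, 400 interviews roughly 5 points, and 1,000 interviews roughly 3 points. When 90% of respondents give the same answer, the margin for 400 interviews drops to about 3 points. This assumes a random sample.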

Most reporters understand that a poll with a lot of respondents is better than one with fewer respondents. But most don’t understand the variability component.

There is another assumption used in the calculation for sampling error as well: the confidence level desired. Almost every pollster will use a 95% confidence level, so for this explanation we don’t have to worry too much about that.

What does it mean to be outside the margin of error on a poll? It simply means that the two percentages being compared can be deemed different from one another with 95% confidence. Put another way, if the poll were repeated a zillion times, we’d expect that at least 19 out of 20 times the two numbers would be different.

If Biden is leading Trump in a poll by 8 points and the margin of error is 5 points, we can be confident he is really ahead because this lead is outside the margin of error. Not perfectly confident, but more than 95% confident.

Here is where reporters and pundits mess it up.  Say they are reporting on a poll with a 5-point margin of error and Biden is leading Trump by 4 points. Because this lead is within the margin of error, they will often call it a “statistical dead heat” or say something that implies that the race is tied.

Neither is true. The only way for a poll to have a statistical dead heat is for the exact same number of people to choose each candidate. In this example the race isn’t tied at all; we just have less than 95% confidence that Biden is leading. We might be 90% sure that Biden is leading Trump. So, why would anyone call that a statistical dead heat? It would be far better to report the level of confidence that we have that Biden is winning, or the p-value of the result. I have never seen a reporter do that, but some of the election prediction websites do.
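As a rough sketch of what reporting a confidence level could look like, the Python below estimates how confident we can be that the leader is truly ahead. It is one simple approach, not how any particular pollster or prediction site does it. It assumes a simple random sample, a normal approximation, and that both shares come from the same poll. The poll figures (384 respondents, 50% versus 46%) are hypothetical, and the exact confidence you get depends heavily on these assumptions.

```python
import math
from statistics import NormalDist

def prob_leader_ahead(p_leader, p_trailer, n):
    """Approximate one-sided confidence that the leader is truly ahead.

    Assumes a simple random sample of n respondents and a normal
    approximation; both shares come from the same poll.
    """
    lead = p_leader - p_trailer
    # Standard error of the difference between two shares from one sample.
    se = math.sqrt((p_leader + p_trailer - lead ** 2) / n)
    return NormalDist().cdf(lead / se)

# Hypothetical poll: 384 respondents (about a 5-point margin of error),
# leader at 50%, trailer at 46%.
confidence = prob_leader_ahead(0.50, 0.46, 384)
print(f"{confidence:.0%}")
```

The result lands well above a coin flip but below 95%. Only an exact tie in the poll produces 50%.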

Pollsters themselves will misinterpret the concept. They will deem their poll “accurate” as long as the election result is within the margin of error. In close elections this isn’t helpful, as what really matters is making a correct prediction of what will happen.

Most of the 2016 final polls were accurate if you define being accurate as coming within the margin of error. But, since almost all of them predicted the wrong winner, I don’t think we will see future textbooks holding 2016 out there as a zenith of polling accuracy.

Another mistake reporters (and researchers) make is not recognizing that the margin of error only refers to sampling error, which is just one of many errors that can occur on a poll. The poor performance of the 2016 presidential polls really had nothing to do with sampling error at all.

I’ve always questioned why there is so much emphasis on sampling error, for a couple of reasons. First, the calculation of sampling error assumes you are working with a random sample, which in today’s polling world is almost never the case. Second, there are many other types of errors in survey research that are likely more relevant to a poll’s accuracy than sampling error. The focus on sampling error persists largely because it is the easiest error to mathematically calculate. Margin of error is useful to consider, but it needs to be put in the context of all the other types of errors that can happen in a poll.

How COVID-19 may change Market Research

Business life is changing as COVID-19 spreads in the US and the world. In the market research and insights field there will be both short-term and long-term effects. It is important that clients and suppliers begin preparing for them.

This has been a challenging post to write. First, in the context of what many people are going through in their personal and business lives as a result of this disruption, writing about what might happen to one small sector of the business world can come across as uncaring and tone-deaf, which is not the intention. Second, this is a quickly changing situation and this post has been rewritten a number of times in the past week. I have a feeling it may not age well.

Nonetheless, market research will be highly impacted by this situation. Below are some things we think will likely happen to the market research industry.

  • An upcoming recession will hit the MR industry hard. Market research is not an investment that typically pays off quickly. Companies that are forced to pare back will cut their research spending and likely their staffs.
  • Cuts will affect clients more than suppliers. In previous recessions, clients have cut MR staff and outsourced work to suppliers. This is an opportunity for suppliers that know their clients’ businesses well and can step up to help.
  • Unlike a lot of other types of industries, it is the large suppliers that are most at risk of losing work. Publicly-held research suppliers will be under even more intense pressure from their investors than usual. There will most certainly be cost cutting at these firms, and if the concerns over the virus persist, it will lead to layoffs.
  • The smallest suppliers could face an existential risk. Many independent contractors and small firms are dependent on one or two clients for the bulk of their revenue. If those clients are in highly affected sectors, these small suppliers will be at risk of going out of business.
  • Smallish to mid-sized suppliers may emerge stronger. Clients are going to be under cost pressures due to a receding economy and smaller research suppliers tend to be less expensive. Smaller research firms did well post-9/11 and during the recession of 2008-09 because clients moved work from higher priced larger firms to them. Smaller research firms would be wise to build tight relationships so that when the storm over the virus abates, they will have won their clients’ trust for future projects.
  • New small firms will emerge as larger firms cut staff and create refugees who will launch new companies.

Those are all items that might pertain to any sort of sudden business downturn. There are also some things that we think will happen that are specific to the COVID-19 situation:

  • Market research conferences will never be the same. Conferences are going to have difficulty drawing speakers and attendees. Down the line, conferences will be smaller and more targeted and there will be more virtual conferences and training sessions scheduled. At a minimum, companies will send fewer people to research conferences.
  • This will greatly affect MR trade associations as these conferences are important revenue sources for them. They will rethink their missions and revenue models, and will become less dependent on their signature events. The associations will have more frequent, smaller, more targeted online events. The days of the large, comprehensive research conference may be over.
  • Business travel will not return to its previous level. There will be fewer in-person meetings between clients and suppliers and those that are held will have fewer participants. Video conferencing will become an even more important way to reach clients.
  • Clients and suppliers will allow much more “work from home.” It may become the norm that employees are only expected to be in the office for key meetings. The situation with COVID-19 will give companies who don’t have a lot of experience allowing employees to work from home the opportunity to see the value in it. When the virus is under control, they will embrace telecommuting. We will see this crisis kick-start an already existing movement towards allowing more employees to work from home. The amount of office space needed will shrink.
  • Research companies will review and revise their sick-leave policies and there will be pressure on them to make them more generous.
  • Companies that did the right thing during the crisis will be rewarded with employee loyalty. Employees will become more attached and appreciative of suppliers that showed flexibility, did what they could to maintain payroll, and expressed genuine concerns for their employees.

Probably the biggest change we will see in market research projects is to qualitative research.

  • While there will always be great value in traditional, in-person focus groups, the situation around COVID-19 is going to cause online qualitative to become the standard approach. We are at a time where the technologies available for online qualitative are well-developed, yet clients and suppliers have clung to traditional methods. To date, the technology has been ahead of the demand. Companies will be forced by travel restrictions to embrace online methods and this will be at the expense of traditional groups. This is an excellent time to be in the online qualitative technology business. It is not such a great time to be in the focus group facility management business.
  • Independent moderators who work exclusively with traditional groups are going to be in trouble, and not just in the short term. Many of these individuals will retire, look for work elsewhere, or leave research. Others will necessarily adapt to online methods. Of course, there will continue to be independent moderators, but we are predicting the demand for in-person groups will be permanently affected, and this portion of the industry will significantly shrink.
  • There is a risk that by not commissioning as much in-person qualitative, marketers may become further removed from direct human interaction with their customer base. This is a very real concern. We wouldn’t be in market research if we didn’t have an affinity for data and algorithms, but qualitative research is what keeps all of our efforts grounded. I’d caution clients to think carefully before removing all in-person interaction from your research plans.

What will happen to quantitative research? In the short-run, most studies will continue. Respondents are home, have free time, and thus far have shown they are willing to take part in studies. Some projects, typically in highly affected industries like travel and entertainment, are being postponed or canceled. All current data sets need to be viewed with a careful eye as the tumult around the virus can affect results. For instance, we conduct a lot of research with young respondents, and we now know for sure that their parents are likely nearby when they are taking our surveys, and that can influence our findings for some subjects.

Particular care needs to be taken in ongoing tracking studies. It makes sense for many trackers to add questions in to see how the situation has affected the brand in question.

But, in the longer term there will not be too much change in quantitative research methods that results directly from this situation. If anything, there will be a greater need to understand consumers.

Tough times for sure. It has been heartening to see how our industry has reacted. Research panel and technology providers have reached out to help keep projects afloat. We’ve had subcontractors tell us we can delay payments if we need to. Calls with clients have become more “human” as we hear their kids and pets in the background and see the stresses they are facing. Respondents have continued to fill out our surveys.

There is a lot of uncertainty right now. At its core, market research is a way to reduce uncertainty for decision makers by making the future more predictable, so we are needed now more than ever. Research will adapt as it always does, and I believe in the long-run it may become even more valued as a result of this crisis.

The myth of the random sample

Sampling is at the heart of market research. We ask a few people questions and then assume everyone else would have answered the same way.

Sampling works in all types of contexts. Your doctor doesn’t need to test all of your blood to determine your cholesterol level – a few ounces will do. Chefs taste a spoonful of their creations and then assume the rest of the pot will taste the same. And, we can predict an election by interviewing a fairly small number of people.

The mathematical procedures that are applied to samples that enable us to project to a broader population all assume that we have a random sample. Or, as I tell research analysts: everything they taught you in statistics assumes you have a random sample. T-tests, hypotheses tests, regressions, etc. all have a random sample as a requirement.

Here is the problem: We almost never have a random sample in market research studies. I say “almost” because I suppose it is possible to do, but over 30 years and 3,500 projects I don’t think I have been involved in even one project that can honestly claim a random sample. A random sample is sort of a Holy Grail of market research.

A random sample might be possible if you have a captive audience. You can randomly sample some of the passengers on a flight or a few students in a classroom or prisoners in a detention facility. As long as you are not trying to project beyond that flight or that classroom or that jail, the math behind random sampling will apply.

Here is the bigger problem: Most researchers don’t recognize this, disclose this, or think through how to deal with it. Even worse, many purport that their samples are indeed random, when they are not.

For a bit of research history, once the market research industry really got going the telephone random digit dial (RDD) sample became standard. Telephone researchers could randomly call land line phones. When land line telephone penetration and response rates were both high, this provided excellent data. However, RDD still wasn’t providing a true random, or probability sample. Some households had more than one phone line (and few researchers corrected for this), many people lived in group situations (colleges, medical facilities) where they couldn’t be reached, some did not have a land line, and even at its peak, telephone response rates were only about 70%. Not bad. But, also, not random.

Once the Internet came of age, researchers were presented with new sampling opportunities and challenges. Telephone response rates plummeted (to 5-10%) making telephone research prohibitively expensive and of poor quality. Online, there was no national directory of email addresses or cell phone numbers and there were legal prohibitions against spamming, so researchers had to find new ways to contact people for surveys.

Initially, and this is still a dominant method today, research firms created opt-in panels of respondents. Potential research participants were asked to join a panel, filled out an extensive demographic survey, and were paid small incentives to take part in projects. These panels suffer from three response issues: 1) not everyone is online or online at the same frequency, 2) not everyone who is online wants to be in a panel, and 3) not everyone in the panel will take part in a study. The result is a convenience sample. Good researchers figured out sophisticated ways to handle the sampling challenges that result from panel-based samples, and they work well for most studies. But, in no way are they a random sample.

River sampling is a term often used to describe respondents who are “intercepted” on the Internet and asked to fill out a survey. Potential respondents are invited via online ads and offers placed on a range of websites. If interested, they are typically pre-screened and sent along to the online questionnaire.

Because so much is known about what people are doing online these days, sampling firms have some excellent science behind how they obtain respondents efficiently with river sampling. It can work well, but response rates are low and the nature of the online world is changing fast, so it is hard to get a consistent river sample over time. Nobody being honest would ever use the term “random sampling” when describing river samples.

Panel-based samples and river samples represent how the lion’s share of primary market research is being conducted today. They are fast and inexpensive and when conducted intelligently can approximate the findings of a random sample. They are far from perfect, but I like that the companies providing them don’t promote them as being random samples. They involve some biases and we deal with these biases as best we can methodologically. But, too often we forget that they violate a key assumption that the statistical tests we run require: that the sample is random. For most studies, they are truly “close enough,” but the problem is we usually fail to state the obvious – that we are using statistical tests that are technically not appropriate for the data sets we have gathered.

Which brings us to a newer, shiny object in the research sampling world: ABS samples. ABS (address-based samples) are purer from a methodological standpoint. While ABS samples have been around for quite some time, they are just now being used extensively in market research.

ABS samples are based on US Postal Service lists. Because USPS has a list of all US households, this list is an excellent sampling frame. (The Census Bureau also has an excellent list, but it is not available for researchers to use.) The USPS list is the starting point for ABS samples.

Research firms will take the USPS list and recruit respondents from it, either to be in a panel or to take part in an individual study. This recruitment can be done by mail, phone, or even online. They often append publicly-known information onto the list.

As you might expect, an ABS approach suffers from some of the same issues as other approaches. Cooperation rates are low and incentives (sometimes large) are necessary. Most surveys are conducted online, and not everyone in the USPS list is online or has the same level of online access. There are some groups (undocumented immigrants, homeless) that may not be in the USPS list at all. Some (RVers, college students, frequent travelers) are hard to reach. There is evidence that ABS approaches do not cover rural areas as well as urban areas. Some households use post office boxes and not residential addresses for their mail. Some use more than one address. So, although ABS lists cover about 97% of US households, the 3% that they do not cover are not randomly distributed.
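To see why non-random coverage matters, here is a small Python sketch of the standard coverage-bias arithmetic: the bias in an estimate equals the uncovered share of the population times the difference between covered and uncovered households. The 97% coverage figure is from the text above. The two incidence rates are hypothetical numbers chosen only for illustration.

```python
def coverage_bias(coverage_rate, covered_mean, uncovered_mean):
    """Bias from estimating a population mean using only covered households."""
    return (1 - coverage_rate) * (covered_mean - uncovered_mean)

# 97% coverage is from the text; the two incidence rates are hypothetical.
bias = coverage_bias(0.97, covered_mean=0.60, uncovered_mean=0.30)
print(round(bias * 100, 2), "percentage points")
```

If the uncovered 3% looked just like everyone else, the bias would be zero. The more they differ on the thing being measured, the larger the bias, although with only 3% uncovered it stays small unless the difference is large.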

The good news is, if done correctly, the biases that result from an ABS sample are more “correctable” than those from other types of samples because they are measurable.

A recent Pew study indicates that survey bias and the number of bogus respondents is a bit smaller for ABS samples than opt-in panel samples.

But ABS samples are not random samples either. I have seen articles that suggest that of all those approached to take part in a study based on an ABS sample, less than 10% end up in the survey data set.

The problem is not necessarily with ABS samples, as most researchers would concur that they are the best option we have and come the closest to a random sample. The problem is that many firms that are providing ABS samples are selling them as “random samples” and that is disingenuous at best. Just because the sampling frame used to recruit a survey panel can claim to be “random” does not imply that the respondents who end up in a research database constitute a random sample.

Does this matter? In many ways, it likely does not. There are biases and errors in all market research surveys. These biases and errors vary not just by how the study was sampled, but also by the topic of the question, its tone, the length of the survey, etc. Many times, survey errors are not the same throughout an individual survey. Biases in surveys tend to be “known unknowns” – we know they are there, but aren’t sure what they are.

There are many potential sources of errors in survey research. I am always reminded of a quote from Humphrey Taylor, the past Chairman of the Harris Poll, who said, “On almost every occasion when we release a new survey, someone in the media will ask, ‘What is the margin of error for this survey?’ There is only one honest and accurate answer to this question — which I sometimes use to the great confusion of my audience — and that is, ‘The possible margin of error is infinite.’” A few years ago, I wrote a post on biases and errors in research, and I was able to quickly name 15 of them before I even had to do an Internet search to learn more about them.

The reality is, the improvement in bias that is achieved by an ABS sample over a panel-based sample is small and likely inconsequential when considered next to the other sources of error that can creep into a research project. Because of this, and the fact that ABS sampling is really expensive, we tend to only recommend ABS panels in two cases: 1) if the study will result in academic publication, as academics are more accepting of data that comes from an ABS approach, and 2) if we are working in a small geography, where panel-based samples are not feasible.

Again, ABS samples are likely the best samples we have at this moment. But firms that provide them are often inappropriately portraying them as yielding random samples. For most projects, the small improvements in bias they provide are not worth the considerably increased budget and longer study time frame, which is why, for the moment, ABS samples are used in a small proportion of research studies. I consider ABS to be “state of the art” with the emphasis on “art” as sampling is often less of a science than people think.

Will adding a citizenship question to the Census harm the Market Research Industry?

The US Supreme Court appears likely to allow the Department of Commerce to reinstate a citizenship question on the 2020 Census. This is largely viewed as a political controversy at the moment. The inclusion of a citizenship question has proven to dampen response rates among non-citizens, who tend to be people of color. The result will be gains in representation for Republicans at the expense of Democrats (political district lines are redrawn every 10 years as a result of the Census). Federal funding will likely decrease for states with large immigrant populations.

It should be noted that the Census Bureau itself has come out against this change, arguing that it will result in an undercount of about 6.5 million people. Yet, the administration has pressed forward and has not committed funds needed by the Census Bureau to fully research the implications. The concern isn’t just about non-response from non-citizens. In tests done by the Census Bureau, non-citizens are also more likely to inaccurately respond to this question than citizens, meaning the resulting data will be inaccurate.

Clearly this is a hot-button political issue. However, there is not much talk of how this change may affect research. Census data are used to calibrate most research studies in the US, including academic research, social surveys, and consumer market research. Changes to the Census may have profound effects on data quality.

The Census serves as a hidden backbone for most research studies whether researchers or clients realize it or not. Census information helps us make our data representative. In a business climate that is becoming more and more data-driven the implications of an inaccurate Census are potentially dire.
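One common way Census figures make survey data representative is post-stratification weighting, where each respondent is weighted by the ratio of their group's population share to its share of the sample. The Python below is a minimal sketch of that idea. The age groups, sample counts, and "census" shares are all hypothetical and are not real Census figures.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

# Hypothetical survey of 1,000 respondents and hypothetical census benchmarks.
sample_counts = {"18-34": 200, "35-54": 350, "55+": 450}
census_shares = {"18-34": 0.30, "35-54": 0.33, "55+": 0.37}

weights = poststratification_weights(sample_counts, census_shares)
for group, w in weights.items():
    print(group, round(w, 2))
```

In this made-up example, younger respondents are underrepresented in the sample, so each one counts for more than one person after weighting. Because the benchmark shares feed directly into every weight, any error in them carries through to every weighted result.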

We should be primarily concerned that the Census is accurate regardless of the political implications. Adding questions that temper response will not help accuracy. Errors in the Census have a tendency to become magnified in research. For example, in new product research it is common to project study data from about a thousand respondents to a universe of millions of potential consumers. Even a small error in the Census numbers can lead businesses to make erroneous investments. These errors create inefficiencies that reverberate throughout the economy. Political concerns aside, US businesses undoubtedly suffer from a flawed Census. Marketing becomes less efficient.

All is not lost though. We can make a strong case that there are better, less costly ways to conduct the Census. Methodologists have long suggested that a sampling approach would be more accurate than the current attempt at enumeration. This may never happen for the decennial Census because the Census methodology is encoded in the US Constitution and it might take an amendment to change it.

So, what will happen if this change is made? I suspect that market research firms will switch to using data that come from the Census’ survey programs, such as the American Community Survey (ACS). Researchers will rely less on the actual decennial census. In fact, many research firms already use the ACS rather than the decennial census (and the ACS currently contains the citizenship question).

The Census Bureau will find ways to correct for resulting error, and to be honest, this may not be too difficult from a methodological standpoint. Business will adjust because there will be economic benefits to learning how to deal with a flawed Census, but in the end, this change will take some time for the research industry to address. Figuring things like this out is what good researchers do. While it is unfortunate that this change looks likely to be made, its implications are likely more consequential politically than they will be for the research field.


Visit the Crux Research Website www.cruxresearch.com
