Online surveying has quickly become the dominant method of quantitative data collection in the market research industry. Concerns over the viability of online sampling are valid, yet on the whole the MR industry has done an excellent job of addressing them.
However, there is a rarely discussed data quality issue that researchers have not yet found a standard way of handling: many of your online respondents are faking it.
They may be doing so for a number of reasons. Your questionnaire may be poorly designed, difficult to understand, or just too long. Your incentive system might be working at cross purposes by encouraging respondents to answer screening questions dishonestly or to speed through the questionnaire to get the incentive you offer. Your respondents could be distracted by other things going on in their homes or by other windows open on their desktops. If they are answering on a mobile device, all sorts of things might be competing with your survey for their attention.
These respondents go by various names: speeders, satisficers, straightliners. Our experience is that they constitute at least 10% of all respondents to a study and sometimes as many as 20%.
So, what can we do about this?
One way to deal with this issue is to assume it away. This is likely what most researchers do – we know we have issues with respondents who are not considering their answers carefully, so we assume the errors associated with their responses are randomly distributed. In many projects that might actually work, but why wouldn’t we try to fix data quality issues that we know exist?
At Crux, we try to address issues of data quality at two stages: 1) when we design the questionnaires and 2) after data are collected.
Questionnaire design is a root cause of the issue. Questions must be relevant to the respondent and easy to understand, and answer choices must be unambiguous. Grid questions should be kept to a minimum. Questionnaires should have a progress bar indicating how much of the questionnaire is left to complete, and the respondent should be told up front how much time the questionnaire is expected to take. The incentive system should be well thought out, and not so “rich” as to cause unintended behaviors.
The survey research industry eventually must come to a realization that we are torturing our respondents. Our questionnaires are too long, too complicated, and ask a lot of questions that are simply not answerable. It may sound oversimplified, but we try hard to think like a respondent when we compose a questionnaire. We try to keep questions short, with unambiguous answers, and to keep scales consistent throughout the questionnaire.
Another issue to contend with is where to get your respondents. The problem is particularly pronounced with online intercepts or “river” sample, and is less prevalent with standing respondent panels or with client customer lists. But even with large and respected online panels, the percentage of “faking” respondents can vary dramatically. When evaluating panels, ask the supplier what they do about this issue, and if they don’t have a ready answer, be prepared to walk away.
Beyond panel recruitment and questionnaire design, there are other adjustments that should be made to the resulting sample. Begin by over-recruiting every study by at least 10%, as you should anticipate having to remove at least that many respondents.
“Speeders” are fairly easy to identify. We tend to remove from the database anyone who completed the survey in less than half of the median time. It is important to use the median time as the benchmark and not the mean, as some respondents are logged as taking a very long time to complete a survey (if, for instance, they start the survey, decide to eat dinner, and then come back to it), and those outliers inflate the mean.
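As a rough illustration of the half-of-median rule, here is a minimal Python sketch. The function name `flag_speeders`, the respondent IDs, and the completion times are all hypothetical, invented only to show the arithmetic; this is not a description of any particular survey platform's export format.

```python
from statistics import mean, median


def flag_speeders(durations, fraction=0.5):
    """Return IDs of respondents who finished in less than fraction * median time."""
    cutoff = fraction * median(durations.values())
    return sorted(rid for rid, secs in durations.items() if secs < cutoff)


# Hypothetical completion times in seconds, keyed by respondent ID.
# r5 started the survey, walked away, and came back two hours later.
durations = {"r1": 600, "r2": 540, "r3": 200, "r4": 660, "r5": 7200}

speeders = flag_speeders(durations)

# For comparison only: the cutoff a mean-based rule would have produced.
mean_cutoff = 0.5 * mean(durations.values())
flagged_by_mean = sorted(r for r, s in durations.items() if s < mean_cutoff)

print(speeders)
print(flagged_by_mean)
```

In this made-up sample the median is 600 seconds, so the cutoff is 300 seconds and only r3 is flagged. The single two-hour respondent pulls the mean up to 1,840 seconds, and a half-of-mean rule would flag four of the five respondents, which is why the median is the safer benchmark.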
Straightline checks are more challenging. Some suppliers will remove respondents who complete a large grid by providing the same answer for all items. We feel it is better to take a more sophisticated approach. We look at the variability in answers (the standard deviation) across all grid items in the study. A respondent who demonstrates little variability compared to the rest of the sample is targeted for further review.
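The variability check can be sketched along the following lines in Python. The threshold is an assumption made for illustration: the text says only that a respondent shows "little variability compared to the rest of the sample," so this sketch arbitrarily flags anyone whose standard deviation is at or below half of the sample's median standard deviation. The function name and the data are hypothetical.

```python
from statistics import median, pstdev


def flag_low_variability(grid_answers, fraction=0.5):
    """Flag respondents whose answer spread is small relative to the sample.

    grid_answers maps respondent ID -> list of numeric answers across all
    grid items in the study. The `fraction` cutoff is an illustrative
    assumption, not an industry standard.
    """
    spreads = {rid: pstdev(answers) for rid, answers in grid_answers.items()}
    cutoff = fraction * median(spreads.values())
    return sorted(rid for rid, sd in spreads.items() if sd <= cutoff)


# Hypothetical answers to six grid items on a 5-point scale.
grid_answers = {
    "r1": [1, 4, 2, 5, 3, 4],
    "r2": [3, 3, 3, 3, 3, 3],  # pure straightliner
    "r3": [2, 5, 1, 4, 4, 2],
    "r4": [4, 4, 4, 4, 4, 5],  # nearly straightlined
    "r5": [5, 2, 4, 1, 3, 2],
}

for_review = flag_low_variability(grid_answers)
print(for_review)
```

A simple "same answer to every item" rule would catch only r2 here. Looking at the spread also surfaces r4, who varied a single answer. Flagged respondents are candidates for further review, not automatic removal.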
Recently, we have been adding in some questions that help demonstrate that the respondent is reading the questions. So, for instance, a question might ask a respondent to choose the third item in a long list, or to choose a particular word from a list.
Another effective technique is to include, within a grid, a couple of items that are worded negatively relative to the other items. For instance, if you have a long grid using a Likert scale, you might ask respondents to react to opposite statements such as “This company provides world-class service” and “This company provides the worst service ever” in the same grid. If you get the same answer to both, it is a signal to look further, as clearly the respondent is being inconsistent.
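A check on opposite statements might look something like the Python sketch below. The item keys, respondent IDs, and answers are hypothetical. One assumption goes beyond the text: identical answers at the scale midpoint (a neutral 3 on a 5-point scale) are not flagged, since neutrality on both statements is not necessarily a contradiction.

```python
def flag_contradictions(responses, positive_item, negative_item, midpoint=3):
    """Flag respondents who gave the same non-neutral answer to opposite items.

    responses maps respondent ID -> dict of item key -> Likert answer.
    Skipping the midpoint is an illustrative assumption.
    """
    flagged = []
    for rid, answers in responses.items():
        pos, neg = answers[positive_item], answers[negative_item]
        if pos == neg and pos != midpoint:
            flagged.append(rid)
    return sorted(flagged)


# Hypothetical 5-point agreement answers (1 = strongly disagree, 5 = strongly agree).
responses = {
    "r1": {"world_class": 5, "worst_ever": 1},  # consistent
    "r2": {"world_class": 5, "worst_ever": 5},  # agrees with both opposites
    "r3": {"world_class": 3, "worst_ever": 3},  # neutral on both
    "r4": {"world_class": 2, "worst_ever": 4},  # consistent
}

inconsistent = flag_contradictions(responses, "world_class", "worst_ever")
print(inconsistent)
```

As with the other checks, a flag is evidence to look further at the respondent's full record, not proof of bad faith on its own.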
All of this takes time and effort, and, to be honest, it will probably only catch the most egregious offenders. Our experience is you will remove about 10% of your respondent base.
The MR industry has efforts in place (mainly via our trade associations) to develop standard ways of dealing with these issues. Until these standards are developed, tested, and accepted, it is important for clients to recognize these issues exist, and for suppliers to take the time to address them.