What sets good researchers apart is their ability to find a compelling story in a data set. It is what we do – we review various data points, combine that with our knowledge of a client’s business, and craft a story that leads to market insight.
Unfortunately, researchers can be too good at this. We have a running joke in our firm that we could probably hand a random data set to an analyst, and they could come up with a story that was every bit as convincing as the story they would develop from actual data.
Market researchers need to be wary of a phenomenon well known among academic researchers: “p-hacking,” the tendency to run and re-run analyses until we discover a statistically significant result.
A “p-value” is one of the most important statistics in research, and it is tricky to define precisely. It is the probability of seeing a result at least as extreme as the one you observed if there were, in truth, no difference between your test and control. It is not the probability that your hypothesis is true; it measures how surprising your data would be if chance alone were at work. We say a result is statistically significant when the p-value is less than 5%, meaning that if there were no real effect, a result like this would show up less than 5% of the time.
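To make that definition concrete, here is a minimal simulation sketch (Python with NumPy and SciPy; the sample sizes and distribution are invented for illustration). When there is truly no difference between test and control, p-values fall below 0.05 about 5% of the time, which is exactly the risk the threshold is meant to bound.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate 10,000 studies where the null is true: test and control
# are drawn from the same population, so any "effect" is pure noise.
p_values = []
for _ in range(10_000):
    test = rng.normal(loc=5.0, scale=1.0, size=200)
    control = rng.normal(loc=5.0, scale=1.0, size=200)
    _, p = stats.ttest_ind(test, control)
    p_values.append(p)

# Under the null, p-values are uniformly distributed, so about 5%
# of them land below the 0.05 threshold by chance alone.
share = np.mean(np.array(p_values) < 0.05)
print(f"Share of 'significant' results in pure noise: {share:.3f}")
```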
Researchers widely use p-values to decide whether a result is worth mentioning. In academia, a paper will usually not be published in a peer-reviewed journal unless its key p-values are below 5%. In market research, most quant analysts will not highlight a finding if its p-value isn’t under 5%.
P-hacking is what happens when the initial analysis doesn’t hit this threshold. Researchers will do things such as:
- Change the variable. Our result doesn’t hit the threshold, so we search for a new measure where it does.
- Redefine our variables. Using the full range of the response didn’t work, so we look at the top box, the top 2 boxes, the mean, etc., until the result we want pans out (a sketch of this appears just after this list).
- Change the population. It didn’t work with all respondents, but is there something among a subgroup, such as males, young respondents, or customers?
- Run a table that statistically tests every subgroup against every other. (At a 5% threshold, roughly one in 20 of those comparisons is virtually guaranteed to come up significant by chance alone.)
- Relax the threshold. The findings didn’t work at 5%, so we go ahead and report them anyway and say they are “directional.”
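As a small illustration of the variable-redefinition tactic above (again Python with NumPy and SciPy; the rating scale and cell sizes are invented), notice how one pure-noise comparison becomes several chances at significance once we start recoding:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# One pure-noise 1-10 rating, split into test and control cells.
test = rng.integers(1, 11, size=300)
control = rng.integers(1, 11, size=300)

# The same comparison under several codings. Each recoding is another
# ticket in the significance lottery for data that contains nothing.
codings = {
    "full scale": (test, control),
    "top box": (test == 10, control == 10),
    "top 2 boxes": (test >= 9, control >= 9),
}
for name, (t, c) in codings.items():
    _, p = stats.ttest_ind(t.astype(float), c.astype(float))
    print(f"{name}: p = {p:.3f}")
```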
These tactics are all inappropriate and common. If you are a market researcher and reading this, I’d be surprised if you haven’t done all of these at some point in your career. I have done them all.
P-hacking happens for understandable reasons. Other information outside the study points towards a result we should be getting. Our clients pressure us to do it. And, with today’s sample sizes being so large, p-hacking is easy to do. Give me a random data set with 2,000 respondents, and I will guarantee that I can find statistically significant results and create a story around them that will wow your marketing team.
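To put a number on that boast, here is a minimal sketch (Python with NumPy and SciPy; every question name and the subgroup split are hypothetical). Generate 2,000 respondents of pure noise, test a batch of survey questions against a subgroup split, and count the “findings”:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 2000  # respondents, matching the claim above

# Twenty pure-noise "survey questions" on a 1-10 scale, plus an
# arbitrary subgroup split. All of these names are invented.
questions = {f"q{i}": rng.integers(1, 11, size=n) for i in range(1, 21)}
is_male = rng.random(n) < 0.5

hits = []
for name, answers in questions.items():
    _, p = stats.ttest_ind(answers[is_male], answers[~is_male])
    if p < 0.05:
        hits.append((name, round(p, 4)))

print(f"Questions with a 'significant' gender gap in pure noise: {hits}")
# With 20 independent tests at the 5% level, the chance of at least one
# false positive is 1 - 0.95**20, about 64%. Add more subgroup cuts and
# more measures and a "finding" is all but guaranteed.
```

Every hit in that list is a story waiting to be told about data that contains none.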
I learned about p-hacking the hard way. Early in my career, I gathered an extensive data set for a college professor who was well-known and well-published within his field. He asked me to run some statistical analyses for him. When the ones he specified didn’t pan out, I started running the data on subgroups, changing how some variables were defined, etc., until I could present him with significant statistical output.
Fortunately, rather than chastise me, he went into teaching mode. He told me that just fishing around in the data set until you find something that works statistically is not how data analysis should be done. With a big data set and enough hooks in the water, you will always find some insight ready to bite.
Instead, he taught me that you always start with a hypothesis. If that hypothesis doesn’t pan out, first recognize that there is some learning in that. And it is okay to use that learning to adjust your hypothesis and test again, but your analysis has to be driven by the theory instead of the theory being driven by the data.
Good analysis is not about tinkering with data through trial and error. Too many researchers do this until something works, and they fail to report the many unproductive rabbit holes they dug. But at a 5% threshold, about one in 20 of those unreported attempts will come up significant purely by chance.
This sounds obscure, but I would say that it is the most common mistake I see marketing analysts make. Clients will press us to redefine variables to make a regression work better. We’ll use “top box” measures rather than the full variable range, with no real reason except that it makes our models fit. We relax the level of statistical significance. We p-hack.
In general, market researchers “fish in the data” a lot. I sometimes wonder how many lousy marketing decisions have been made over time due to p-hacking.
I used to sit next to an incredible statistician. As good a data analyst as he was, he was one of the worst questionnaire writers I have ever met. He didn’t seem to care too much, as he felt he could wrangle almost any data into submission with his talent. He was a world-class p-hacker.
I was the opposite. I’ve never been a great statistician. So, I’ve learned to compensate by developing design talent, as I quickly noticed that a well-written questionnaire makes data analysis easy and often obviates the need for complex statistics. I learned over time that a good questionnaire is an antidote to p-hacking.
Start with hypotheses and think about alternative hypotheses when you design the project. And develop these before you even compose a questionnaire. Never believe that the story will magically appear in your data – instead, start with a range of potential stories and then, in your design, allow for data to support or refute each of them. Be balanced in how you go about it, but be directed as well.
It is vital to push for the time upfront to accomplish this, as the collapsed time frames for today’s projects are a key cause of p-hacking.
Of course, nobody wants to conduct a project and be unable to conclude anything. If that happens, you likely went wrong at the project’s design stage – you didn’t lay out objectives and potential hypotheses well. Resist the tendency to p-hack, be mindful of this issue, and design your studies well so you won’t be tempted to do it.