There has been recent debate among academics and statisticians surrounding the concept of statistical significance. Some high-profile medical studies have narrowly missed the traditional statistical significance cutoff, a p-value of 0.05. This has resulted in potentially life-changing drugs not being approved by regulators or pursued for further development by pharma companies. These cases have led to a much-needed review and re-education as to what statistical significance means and how it should be applied.

In a 2014 blog post (Is This Study Significant?) we discussed common misunderstandings market researchers have regarding statistical significance. The recent debate suggests this misunderstanding isn’t limited to market researchers – it appears that academics and regulators have the same difficulty.

Statistical significance is a simple concept. However, it seems that the human brain just isn’t wired well to understand probability and that lies at the root of the problem.

A measure is typically classified as statistically significant if its p-value is 0.05 or less. Strictly speaking, this means that if there were no real effect, chance or random fluctuation alone would produce a result at least this extreme less than 5% of the time. It is often loosely read as a less than 5% probability that the result came from chance, or as a 19 out of 20 chance that two measures really are different, but that is not quite what the p-value measures.
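
To make the p-value concrete, here is a minimal Python sketch of a two-proportion z-test, one common way (though not the only one) to compare two survey percentages. The function name and the sample figures are hypothetical, chosen only for illustration, and the calculation assumes simple random samples.

```python
import math

def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for the difference between two sample proportions.

    Uses a pooled-variance z-test with a normal approximation.
    Assumes two independent simple random samples (an assumption).
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the "no real difference" hypothesis
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # Two-sided tail area of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: 55% vs. 47.5% of 400 respondents each
print(two_proportion_p_value(220, 400, 190, 400))
# Hypothetical data: 52.5% vs. 47.5% of 400 respondents each
print(two_proportion_p_value(210, 400, 190, 400))
```

With these made-up numbers, the first comparison falls below the 0.05 cutoff and the second (about 0.16) does not, even though both gaps might matter to a business decision.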

There are real problems with this approach. Foremost, it is unclear how this 5% probability cutoff was chosen. Somewhere along the line it became a standard among academics. This standard could have just as easily been 4% or 6% or some other number. This cutoff was chosen subjectively.

**What are the chances that this 5% cutoff is optimal for all studies, regardless of the situation?**

Regulators should look beyond statistical significance when they are reviewing a new medication. Let’s say a study was only significant at 6%, not quite meeting the 5% standard. That shouldn’t automatically disqualify a promising medication from consideration. Instead, regulators should look at the situation more holistically. What will the drug do? What are its side effects? How much pain does it alleviate? What is the risk of making mistakes in approval: in approving a drug that doesn’t work or in failing to approve a drug that does work? We could argue that the level of significance required in the study should depend on the answers to these questions and shouldn’t be the same in all cases.

The same is true in market research. Suppose you are researching a new product and the study is only significant at 10% and not the 5% that is standard. Whether you should greenlight the product for development depends on considerations beyond statistical significance. What is the market potential of the product? What is the cost of its development? What is the risk of failing to greenlight a winning idea or greenlighting a bad idea? Currently, too many product managers rely too much on a research project to give them answers when the study is just one of many inputs into these decisions.
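
One way to sketch this kind of reasoning is a simple expected-value calculation. The Python below is an illustration only: the function names, the costs, and the payoffs are all hypothetical, and a real decision would weigh many more factors than these two numbers.

```python
def break_even_probability(development_cost, payoff_if_success):
    """Chance of success at which launching has zero expected profit."""
    return development_cost / payoff_if_success

def expected_profit(p_success, development_cost, payoff_if_success):
    """Expected profit of greenlighting, given a believed chance of success."""
    return p_success * payoff_if_success - development_cost

# Hypothetical products (figures in millions, invented for illustration)
cheap_idea = {"development_cost": 1.0, "payoff_if_success": 20.0}
costly_idea = {"development_cost": 10.0, "payoff_if_success": 12.0}

for name, idea in [("cheap", cheap_idea), ("costly", costly_idea)]:
    threshold = break_even_probability(**idea)
    profit = expected_profit(0.6, **idea)  # assume a 60% chance of success
    print(name, round(threshold, 3), round(profit, 2))
```

In this sketch the cheap, high-upside idea is worth pursuing even on weak evidence, while the costly idea needs far stronger evidence before it pays off. A single fixed cutoff cannot capture that difference.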

There is another reason to rethink the concept of statistical significance in market research projects. Statistical significance assumes a random or a probability sample. We can’t stress this enough – **there hasn’t been a market research study conducted in at least 20 years that can credibly claim to have used a true probability sample of respondents**. Some (most notably address-based samples, or ABS) make a valiant attempt to do so but they still violate the very basis for statistical significance.

Given that, **why do research suppliers (Crux Research included) continue to do statistical testing on projects**? Well, one reason is clients have come to expect it. A more important reason is that statistical significance holds some meaning. On almost every study we need to draw a line and say that two data points are “different enough” to point out to clients and to draw conclusions from. Statistical significance is a useful tool for this. It just should no longer be viewed as a tool where we can say precise things like “these two data points have a 95% chance of actually being different”.

We’d rather use a probability approach and report to clients the chance that two data points would be different if we had been lucky enough to use a random sample. That is a much more useful way to look at data, but it probably won’t be used much until colleges start teaching it and a new generation of researchers emerges.
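
As one possible reading of such a probability approach (an assumption on our part, not an established standard), the Python sketch below estimates the chance that one percentage is truly higher than another by simulation. It assumes simple random samples and a flat prior on each percentage; the function name and sample figures are hypothetical.

```python
import random

def prob_a_exceeds_b(successes_a, n_a, successes_b, n_b, draws=20000, seed=0):
    """Estimate the chance that the true proportion for A exceeds that for B.

    Draws from Beta posteriors under a uniform prior (an assumption) and
    counts how often the draw for A is larger than the draw for B.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    wins = 0
    for _ in range(draws):
        a = rng.betavariate(successes_a + 1, n_a - successes_a + 1)
        b = rng.betavariate(successes_b + 1, n_b - successes_b + 1)
        if a > b:
            wins += 1
    return wins / draws

# Hypothetical data: 55% vs. 47.5% of 400 respondents each
print(prob_a_exceeds_b(220, 400, 190, 400))
```

The output is a single probability a client can weigh directly, rather than a yes-or-no verdict against a fixed cutoff.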

The current debate over the usefulness of statistical significance is a healthy one to have. Hopefully, it will cause researchers of all types to think more deeply about how precise a study needs to be, and we’ll move away from the one-size-fits-all thinking that has been pervasive for decades.