How to Understand & Calculate Statistical Significance [Example]
Have you ever presented results from a marketing campaign and been asked, “But are these results statistically significant?” As data-driven marketers, we need to not only measure the results of our marketing campaigns, but also demonstrate the validity of the data, and that is exactly what statistical significance helps us do.
While there are several free tools out there to calculate statistical significance for you (HubSpot even has one here), understanding what they calculate and what it all means is helpful. Below, we’ll take a closer look at the numbers using a specific example of statistical significance so you can understand why they’re critical to marketing success.
What is statistical significance?
In marketing, statistical significance is when the results of your research show that the relationships between the variables being tested (such as conversion rate and landing page type) are not random; the variables genuinely influence each other.
In marketing, you want your results to be statistically significant as it means you are not wasting money on campaigns that don’t produce the results you want. Marketers often run statistical significance tests before starting campaigns to test whether certain variables are more successful than others.
Example of statistical significance
For example, let’s say I’m running an advertising campaign on Facebook, and I want to make sure I’m using the ad that is most likely to produce the results I want. So I run an A/B test for 48 hours with Ad A as the control and Ad B as the variation. These are the results I get:
Ad | Impressions | Conversions |
Ad A | 6,000 | 430 |
Ad B | 5,869 | 560 |
Even though we can tell from the numbers that Ad B had more conversions, I need to be sure that the difference in conversions is significant and not random. When I put these numbers into a chi-square test calculator (more on that later), my p-value is 0.0, which means my results are significant and there is a performance difference between Ad A and Ad B that is not due to chance.
When I’m running my actual campaign, I want to use Ad B.
If you’re like me, you need some more explanation of what a p-value is and what 0.0 means, so we’ll go through a detailed example below.
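To make the ad example concrete, here is a minimal Python sketch of the kind of calculation a chi-square calculator performs on the table above. It rests on a few assumptions of mine: impressions that did not convert are treated as the second outcome, no continuity correction is applied (the calculator used for this post may apply one), and the function name `chi_square_2x2` is made up for illustration.

```python
import math

# Observed counts from the ad table above.
impressions = {"A": 6000, "B": 5869}
conversions = {"A": 430, "B": 560}

def chi_square_2x2(impressions, conversions):
    """Chi-square statistic and p-value for a two-variation conversion test.

    Assumption: no continuity correction is applied.
    """
    total_impressions = sum(impressions.values())
    total_conversions = sum(conversions.values())
    total_misses = total_impressions - total_conversions

    chi2 = 0.0
    for ad in impressions:
        observed_hit = conversions[ad]
        observed_miss = impressions[ad] - conversions[ad]
        # Expected counts if both ads converted at the same overall rate.
        expected_hit = impressions[ad] * total_conversions / total_impressions
        expected_miss = impressions[ad] * total_misses / total_impressions
        chi2 += (observed_hit - expected_hit) ** 2 / expected_hit
        chi2 += (observed_miss - expected_miss) ** 2 / expected_miss

    # With 1 degree of freedom, the chi-square tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = chi_square_2x2(impressions, conversions)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.6f}")
```

Under these assumptions the p-value is far below 0.05, so a calculator that shows only one decimal place would display it as 0.0.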
How to calculate statistical significance
- Determine what you want to test.
- Determine your hypothesis.
- Start collecting your data.
- Calculate the chi-square results.
- Calculate your expected values.
- See how your results differ from your expectations.
- Find your total.
- Report statistical significance to your teams.
1. Determine what you want to test.
First, decide what you want to test. This can be the comparison of conversion rates on two landing pages with different images, click rates of e-mails with different subject lines or conversion rates of different call-to-action buttons at the end of a blog post. The choices are endless.
My advice would be to keep it simple; Pick a piece of content that you want to create two different variations of and set your goal – a better conversion rate or more views are good places to start.
You can certainly test more variations or even create a multivariate test, but for this example we’ll stick with two variations of a landing page with the aim of increasing conversion rates. To learn more about A / B testing and multivariate testing, see The Critical Difference Between A / B and Multivariate Testing.
2. Determine your hypothesis.
Before I start collecting data, I find it helpful to formulate my hypothesis at the beginning of the test and determine the level of confidence I want to test at. Since I’m testing landing pages and want to see if one performs better, my hypothesis is that there is a relationship between the landing page visitors see and their conversion rate.
3. Start collecting your data.
Now that you’ve decided what to test, it’s time to start collecting your data. Since you are likely running this test to determine which content is best to use in the future, you’ll need to choose a sample size. For a landing page, this can mean setting a certain period of time for your test to run (e.g. keeping the page live for three days).
For something like an email, you can choose a random sample of your list and randomly assign each recipient one of the variations. Determining the correct sample size can be difficult, and the correct sample size varies between tests. As a rule of thumb, the expected value for each variation should be greater than 5. (We’ll cover the expected values below.)
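For the email case, one possible way to draw a random sample and split it between two variations is sketched below. The contact list, the sample size of 200, and the function name `split_for_ab_test` are all hypothetical and only serve to illustrate the idea.

```python
import random

def split_for_ab_test(contacts, sample_size, seed=None):
    """Draw a random sample from the list and split it into two groups."""
    rng = random.Random(seed)
    # sample() draws without replacement and returns the picks in random
    # order, so cutting the result in half is itself a random assignment.
    sample = rng.sample(contacts, sample_size)
    midpoint = sample_size // 2
    return sample[:midpoint], sample[midpoint:]

# Hypothetical email list, for illustration only.
contacts = [f"contact{i}@example.com" for i in range(1000)]
group_a, group_b = split_for_ab_test(contacts, sample_size=200, seed=42)

print(len(group_a), len(group_b))
```

Passing a fixed seed makes the split reproducible, which can be handy if you need to rebuild the groups later.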
4. Calculate the chi-square results.
There are several statistical tests you can run to measure the significance of your data. Choosing a test depends on what you want to test and what type of data you are collecting. In most cases, you will use a chi-square test because the data is discrete.
Discrete is a fancy way of saying that your experiment can produce a finite number of results. For example, a visitor will either convert or not convert; there are no different degrees of conversion for a single visitor.
You can run your test at different levels of confidence, which are set by the alpha of the test. The stricter you want the requirement for statistical significance to be, the lower your alpha should be. You may have seen statistical significance reported in terms of confidence.
Example: “The results are statistically significant with 95 percent confidence.” In this scenario, the alpha was 0.05 (the confidence is calculated as one minus the alpha), which means there is a one-in-20 chance of concluding that a relationship exists when it doesn’t.
After collecting the data, I’ll put it on a chart to make it easier to organize. Since I’m testing two different variations (A and B) and there are two possible outcomes (converted, not converted), I have a 2×2 chart. I add up every column and row so I can easily summarize the results.
After I’ve created my chart, the next step is to run the equation using the chi-square formula.
Statistical significance formula
The chi-square formula for statistical significance is:
χ² = Σ ((O - E)² / E)
In the equation,
- Σ means sum,
- O = observed, actual values,
- E = expected values.
When you run the equation, calculate everything after the Σ for each pair of values, then sum (add) them all.
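As a rough illustration of that description, the calculation can be written as a short Python function. The function name and the four pairs of values below are made-up placeholders, not the numbers from this post’s landing page test.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every pair of observed and expected values."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for the four cells of a 2x2 table.
observed = [10, 20, 30, 40]
expected = [15, 15, 35, 35]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 2))
```

Each term in the sum corresponds to one cell of the table, so a 2×2 table contributes four terms.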
5. Calculate your expected values.
Now I calculate the expected values. In the example above, if there was no relationship between which landing page visitors saw and their conversion rate, we would expect the same conversion rate for versions A and B. From the totals, we can see that 1,945 of the 4,935 total visitors converted, or about 39% of the visitors.
To calculate the expected frequencies (E in the chi-square formula) for each version of the landing page, we can multiply the row total for that cell by the column total and divide by the total number of visitors. To find the expected conversion value for version A in this example, I would use the following equation:
(1945 * 2401) / 4935 ≈ 946
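The same rule can be applied to all four cells at once. The sketch below uses only the totals quoted above and assumes, based on the equation, that 2,401 is the total number of visitors who saw version A; the remaining totals are obtained by subtraction.

```python
total_visitors = 4935

# Row totals: visitors per landing page version.
# Assumption: 2,401 is version A's total, as the equation above implies.
row_totals = {"A": 2401, "B": total_visitors - 2401}

# Column totals: visitors per outcome.
column_totals = {
    "converted": 1945,
    "not converted": total_visitors - 1945,
}

# Expected count for each cell = row total * column total / grand total.
expected = {
    (version, outcome): row_total * column_total / total_visitors
    for version, row_total in row_totals.items()
    for outcome, column_total in column_totals.items()
}

for cell, value in expected.items():
    print(cell, round(value, 1))
```

Every expected count here is well above 5, which satisfies the rule of thumb mentioned earlier.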
6. See how your results differ from your expectations.
To calculate the chi-square, I compare the observed frequencies (O in the chi-square equation) with the expected frequencies (E in the chi-square equation). This comparison is done by subtracting the expected value from the observed value, squaring the result, and dividing by the expected frequency value.
Essentially, I’m trying to see how different my actual results are from my expectations. Squaring the difference makes every difference positive and gives larger differences more weight, and dividing by the expected value normalizes the results. As a refresher, the equation for each cell looks like this: ((observed - expected)^2) / expected
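A quick way to see what the squaring and the division do is to try a few made-up numbers. The values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cell_contribution(observed, expected):
    """One cell's share of the chi-square total: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected

# Squaring removes the sign: 10 above and 10 below expectation count the same.
above = cell_contribution(110, 100)
below = cell_contribution(90, 100)

# Dividing by the expected value normalizes: the same gap of 10 matters
# far less when 1,000 conversions were expected than when 100 were.
large_base = cell_contribution(1010, 1000)

print(above, below, large_base)
```

Note that the exponent is written `** 2` in Python; multiplying by 2 instead of squaring is an easy mistake to make here.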
7. Find your total.
Then I sum the four results to get my chi-square number. In this case it’s 0.95. To see if the conversion rates for my landing pages are different with statistical significance or not, I compare this to the value from a chi-square distribution table based on my alpha (in this case .05) and degrees of freedom.
The degrees of freedom are based on the size of your table: multiply the number of rows minus one by the number of columns minus one. For a 2×2 table like this example, there is 1 degree of freedom.
In this case, the chi-square value would need to be 3.84 or greater for the results to be statistically significant. Since 0.95 is less than 3.84, my results are not statistically significant. This means that there is no statistically significant relationship between the version of the landing page that a visitor receives and the conversion rate.
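If you don’t have a chi-square table at hand, the threshold for 1 degree of freedom can be reproduced with Python’s standard library. This is a sketch that only works for 1 degree of freedom (a 2×2 table); the function names are my own.

```python
import math
from statistics import NormalDist

def chi_square_critical_value_df1(alpha):
    """Critical value for 1 degree of freedom.

    With 1 degree of freedom, a chi-square variable is the square of a
    standard normal variable, so the threshold is the squared two-sided
    z critical value.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2

def chi_square_p_value_df1(chi2):
    """Tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

chi2 = 0.95   # chi-square value from the landing page example
alpha = 0.05

critical = chi_square_critical_value_df1(alpha)
p_value = chi_square_p_value_df1(chi2)
significant = chi2 >= critical

print(f"critical value = {critical:.2f}, p-value = {p_value:.2f}")
print("significant" if significant else "not significant")
```

Comparing the chi-square value to the critical value and comparing the p-value to alpha are two views of the same decision, so they always agree.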
8. Report statistical significance to your teams.
After you’ve run your experiment, the next step is to report your results to your teams to make sure everyone is on the same page about the next steps. To continue with the previous example, I would need to let my teams know that the test found no statistically significant difference, so the version of the landing page we use in our upcoming campaign is unlikely to affect our conversion rate.
If the results had been significant, I would let my teams know which version of the landing page performed better and recommend that we use that version in our upcoming campaign.
Why statistical significance is important
You might be wondering why this is important when you can easily do the calculation using a free tool. Knowing how statistical significance is calculated can help you interpret the results of your own experiments.
Many tools use a 95% confidence level. However, for your tests, it may be useful to use a lower confidence level if the test doesn’t need to be as rigorous.
Understanding the underlying calculations can also help you explain to people who are new to statistics why your results matter.
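To get a feel for how the confidence level changes the bar, here is a small sketch that lists the chi-square threshold for a few common alpha values. It assumes a 2×2 test (1 degree of freedom), and the function name is hypothetical.

```python
from statistics import NormalDist

def critical_value_df1(alpha):
    """Chi-square threshold for 1 degree of freedom at a given alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2) ** 2

# Alpha 0.10, 0.05 and 0.01 correspond to 90%, 95% and 99% confidence.
thresholds = {alpha: critical_value_df1(alpha) for alpha in (0.10, 0.05, 0.01)}

for alpha, threshold in thresholds.items():
    print(f"alpha {alpha:.2f}: chi-square must be at least {threshold:.2f}")
```

Lowering the confidence level lowers the threshold, but the landing page example’s chi-square value of 0.95 would still fall short even at 90% confidence.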
If you’d like to download the spreadsheet I used in this example so you can see the calculations for yourself, click here.
Editor’s note: This blog post was originally published in April 2013, but was updated in September 2021 to be up-to-date and complete.