Confidence Is Key
Let us pretend that we are the head of marketing for an eCommerce company that just completed an incrementality study. We have reviewed our results and are quite happy that we saw a 4.5x incremental ROAS. Even better, the test is highly statistically significant. This is fantastic, as our goal ROAS is 3.7x.
Before unlocking the budget, our wise analyst clears her throat and remarks, "I had a statistics professor at UC Berkeley who said, more than 20 times in a semester, 'A data point is worthless without a confidence interval.' So, to make my Berkeley stats professor happy: what's the confidence interval?"
We ask our Facebook representative to pull the information and discover that the test is 99% confident our ROAS is between 2.0x and 7.0x. Or, to state it another way, the midpoint of our test is 4.5 ± 2.5.
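As a quick sanity check, here is how the midpoint and the "plus or minus" fall out of the reported bounds. This is a minimal sketch for illustration only; the variable names are ours, not anything from the study's output.

```python
# Recovering the point estimate and margin of error from the reported
# 99% confidence interval bounds (illustrative only).
lower, upper = 2.0, 7.0          # bounds reported for incremental ROAS
midpoint = (lower + upper) / 2   # 4.5x -> the headline ROAS
margin = (upper - lower) / 2     # 2.5  -> the "plus or minus"
print(f"ROAS estimate: {midpoint:.1f} ± {margin:.1f}")
```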
So What Happened?
When conducting a statistical test, sample size always affects the precision of the estimate. Most people are familiar with confidence intervals from political polls, where the headline number is reported alongside a margin of error. The more people Gallup surveys, the tighter the confidence interval on the poll.
Similarly, for an advertising test, the more you spend, the tighter the bounds on the test statistic being measured. In our example above, the test statistic is ROAS, and the study estimates the ROAS that Facebook returns. The more impressions served, the better the estimate and the smaller the confidence interval.
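For intuition on why more impressions mean a tighter interval, here is a minimal sketch using a simple normal approximation. This is not how Facebook computes its lift results; the mean ROAS and standard deviation below are hypothetical placeholders chosen for illustration.

```python
import math

# How a 99% confidence interval tightens as sample size grows,
# under a simple normal approximation (illustrative only).
Z_99 = 2.576      # z-score for a 99% confidence level
mean_roas = 4.5   # hypothetical point estimate of incremental ROAS
std_dev = 25.0    # hypothetical per-user standard deviation

for n in (1_000, 10_000, 100_000, 1_000_000):
    margin = Z_99 * std_dev / math.sqrt(n)   # margin of error shrinks as 1/sqrt(n)
    print(f"n={n:>9,}: 99% CI = {mean_roas - margin:.2f}x to {mean_roas + margin:.2f}x")
```

The key takeaway is the 1/sqrt(n) relationship: to cut the width of the interval in half, you need roughly four times the sample size, which in advertising terms means roughly four times the spend.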
The test above is by no means incorrect; however, it is always important to understand what a test is 99% confident about.
To give people a visceral sense of why confidence intervals matter, think of your local weatherman. He can be 99.9% confident that it will rain at some point in the next 500 days, but that forecast is so wide it tells you nothing you can act on. If he is 99.9% confident it will rain tomorrow, you should bring an umbrella. Same confidence level, very different interval.
What's Next
Anytime you are running a test, ask the analysts in the room for the confidence interval on the test statistic. Having high confidence in a wide interval is easy. But if a test shows a high degree of confidence in a tight interval, you may want to weigh that information more heavily than other sources.
--
This write-up is part 4 in the series we call Adventures in Significance. After running 257 lift studies across 30+ different clients, we thought it would be helpful to write a series on common mistakes and how to avoid them. Statistical significance is a subject many dread, but it is core to the direct response advertising world. We thought we could do our part with a round-up of our favorite opinions on the subject and how they apply to your business.
--
Thanks to Max Bender for sharing their work on Unsplash