Top 5 Misconceptions About A/B Testing Debunked
Shana Pilewski, Head of Content at Dynamic Yield, debunks the most common misconceptions about A/B testing.
Common Misconception #1: Session-based attribution is more widely used, therefore it must be the best
“Per session” KPIs are easier to implement, monitor, and optimize toward than “per unique user” KPIs, yet the best KPIs for achieving true, long-term revenue uplifts are “per user” based. This is because every statistical engine out there assumes that trials (unique users or sessions) are independent, meaning each one has the same probability of converting or of generating the expected revenue. Clearly, when the same user starts several sessions, the probability of converting changes dramatically from session to session: there is a clear statistical dependence.
In practice, many users initiate several sessions, and when this independence assumption breaks, the reliability of the results coming out of the statistical engine is badly hurt, in an unknown direction. Statistical significance, probability to be best, confidence intervals, and winner declarations all become less reliable to an unknown extent. “Per user” KPIs are much more reliable because each user is counted just once, and the assumption that different users are statistically independent remains solid.
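To see why this matters, here is a minimal simulation sketch, assuming NumPy and SciPy; the traffic volume, propensity distribution, and session counts are invented for illustration. It runs repeated A/A tests (no real difference between groups): analyzing correlated sessions as if they were independent trials pushes the false-positive rate above the nominal 5%, while per-user analysis stays honest.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def aa_test_pvalue(per_session: bool, n_users: int = 2000) -> float:
    """Run one A/A test (no real difference between groups) and return its p-value."""
    propensity = rng.beta(2, 8, size=n_users)      # each user's own conversion rate
    n_sessions = 1 + rng.poisson(3, size=n_users)  # users come back several times
    group = rng.integers(0, 2, size=n_users)       # random A/A assignment per user

    conv, grp = [], []
    for p, s, g in zip(propensity, n_sessions, group):
        outcomes = rng.random(s) < p               # same p every session: dependent trials
        if per_session:
            conv.extend(outcomes)                  # one row per session
            grp.extend([g] * s)
        else:
            conv.append(outcomes.any())            # one row per user
            grp.append(g)

    conv, grp = np.asarray(conv), np.asarray(grp)
    a, b = conv[grp == 0], conv[grp == 1]
    table = [[a.sum(), len(a) - a.sum()],          # 2x2 contingency table
             [b.sum(), len(b) - b.sum()]]
    return stats.chi2_contingency(table)[1]

for mode in (True, False):
    pvals = np.array([aa_test_pvalue(per_session=mode) for _ in range(500)])
    print(f"per_session={mode}: false-positive rate = {(pvals < 0.05).mean():.1%}"
          " (nominal: 5%)")
```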
Common Misconception #2: Multivariate is the be-all-end-all for optimizing numerous combinations of website elements
It’s true that multivariate testing is a highly sophisticated tool. A single multivariate test can answer dozens of questions, all at once, while an A/B test can only answer one question at a time. But just because a test is complex doesn’t mean that it’s better, or that the data generated is more useful.
Multivariate tests have five huge problems in practical use: they require tons of traffic, are tricky to set up, create serious opportunity costs, don't allow you to move quickly and fail fast, and are biased toward design.
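The traffic problem alone is often decisive. Here is a back-of-the-envelope sketch using the standard two-proportion sample-size formula; the factorial layout (3 headlines × 4 images × 2 calls to action), the 3% baseline conversion rate, and the 10% relative lift are all invented for illustration.

```python
import math

def sample_size_per_cell(p_base: float, lift: float) -> int:
    """Per-group sample size for a two-proportion z-test
    (alpha = 0.05 two-sided, power = 0.8; normal approximation)."""
    p2 = p_base * (1 + lift)
    z_alpha, z_beta = 1.96, 0.84                   # standard z-scores for these levels
    pooled = (p_base + p2) / 2
    n = ((z_alpha * math.sqrt(2 * pooled * (1 - pooled))
          + z_beta * math.sqrt(p_base * (1 - p_base) + p2 * (1 - p2))) ** 2
         / (p2 - p_base) ** 2)
    return math.ceil(n)

# Hypothetical full-factorial test: 3 headlines x 4 images x 2 CTAs = 24 cells
cells = 3 * 4 * 2
per_cell = sample_size_per_cell(p_base=0.03, lift=0.10)
print(f"{cells} cells x {per_cell:,} visitors each = {cells * per_cell:,} visitors total")
```

At a 3% baseline, detecting a 10% lift requires roughly 50,000 visitors per cell, so the full factorial needs well over a million visitors before any winner can be declared.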
Common Misconception #3: Your losing tests will always be losers
While an A/B test may reveal the best variation across your traffic on average, there will always be segments of visitors for whom the winning variation is not optimal. In other words, “winner takes all” may be the best bet for the majority of your visitors, but you’ll be compromising the experience for the other portion of your visitors.
Only after recognizing this flaw in test analysis and execution does it become clear that losing A/B tests can actually end up as winners, and that hidden opportunities once thought meaningless may actually bear the most fruit through a modernized, personalized way of thinking.
It's critical to discover the impact of test actions on different audience groups when running experiments, no matter how the segment is defined. Only after taking the time to thoroughly analyze the results of one's experiments across different segments can deeper optimization opportunities be identified, even for tests that fail to produce uplifts for the “average” user (who doesn't actually exist).
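As a concrete illustration, here is a short pandas sketch on synthetic data; the device split, variation names, and conversion rates are all invented. Variation B loses on average, yet clearly wins for mobile visitors.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Synthetic per-user results: B trails overall but outperforms on mobile
n = 20_000
device = rng.choice(["desktop", "mobile"], size=n, p=[0.6, 0.4])
variation = rng.choice(["A", "B"], size=n)
rates = {("desktop", "A"): 0.050, ("desktop", "B"): 0.038,
         ("mobile", "A"): 0.020, ("mobile", "B"): 0.032}
converted = rng.random(n) < [rates[(d, v)] for d, v in zip(device, variation)]
df = pd.DataFrame({"device": device, "variation": variation, "converted": converted})

print(df.groupby("variation")["converted"].mean())        # B looks like a loser...
print(df.pivot_table(index="device", columns="variation",
                     values="converted", aggfunc="mean"))  # ...but B wins on mobile
```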
Common Misconception #4: Humans don't need AI to help them scale their efforts
No matter how mathematical an individual's brain, there will always be a limit to how many segments a person can manage before things become too complicated, especially when factoring in contextual data such as user activity, affinities, and geography. With so many permutations and combinations, picking a winning variation in the face of a constantly changing customer base becomes impossible.
By using ad serving-like techniques to change the onsite experience, marketers can, instead of A/B testing five different banners or five different calls to action, create all the variations they need and let a real-time machine learning engine do the work. Using algorithms that constantly collect user data and signals, the engine can deliver the best variation to each individual user, regardless of where they arrive from, what device they are using, and so on.
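The article doesn't name a specific algorithm, but a Beta-Bernoulli Thompson sampler is one common minimal example of such a real-time engine. The sketch below, with invented conversion rates, shifts traffic toward the best of five banners as evidence accumulates; a production engine would extend this with per-user context (a contextual bandit), which this toy version omits.

```python
import numpy as np

rng = np.random.default_rng(0)

class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over banner variations."""

    def __init__(self, n_variations: int):
        self.successes = np.ones(n_variations)  # Beta(1, 1) uniform prior per arm
        self.failures = np.ones(n_variations)

    def choose(self) -> int:
        # Sample a plausible conversion rate for each variation, serve the best draw
        draws = rng.beta(self.successes, self.failures)
        return int(np.argmax(draws))

    def update(self, variation: int, converted: bool) -> None:
        if converted:
            self.successes[variation] += 1
        else:
            self.failures[variation] += 1

# Simulated traffic against five banners with hidden conversion rates
true_rates = [0.02, 0.025, 0.03, 0.045, 0.028]
bandit = ThompsonSampler(len(true_rates))
for _ in range(50_000):
    v = bandit.choose()
    bandit.update(v, rng.random() < true_rates[v])

# Most impressions end up on the genuinely best banner (index 3)
print("impressions per variation:",
      (bandit.successes + bandit.failures - 2).astype(int))
```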
Common Misconception #5: You can A/B test without segmentation
Targeting the “average” person, usually in the form of “all visitors,” is an easy trap to fall into. The idea is that by casting a wide net, you can catch everyone. In reality, there is no such thing as an “average” visitor. Sure, your website has a bell curve of visitors, but that bell curve is actually made up of subgroups of visitors, each with its own bell curve centered in a different place.
Each of these groups will have distinct preferences, and by targeting those groups specifically while running experiments, you increase the likelihood of finding each group's drivers for higher conversion rates or increased average order value.
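A tiny numerical sketch, with invented subgroup sizes and average order values, shows why the “average” visitor is a statistical fiction: the overall mean sits between the two subgroups and describes almost nobody.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical subgroups with very different average order values (AOV)
bargain_hunters = rng.normal(loc=40, scale=10, size=6_000)
big_spenders = rng.normal(loc=90, scale=15, size=4_000)
all_visitors = np.concatenate([bargain_hunters, big_spenders])

print(f"overall mean AOV: ${all_visitors.mean():.0f}")   # ~$60: describes nobody
print(f"subgroup means:   ${bargain_hunters.mean():.0f}"
      f" / ${big_spenders.mean():.0f}")
```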