The Art and Zen of Testing for Success
It’s Not Just About Lift
I often see the definition of success in a test get whittled down to a single goal: lift. Whether it’s lift in conversion rate, revenue per visitor, time spent on site, or some other KPI, I get the sense that many companies think testing is lift or die. While showing lift is certainly important, I don’t believe it should be the single measurement by which we deem a test a success or failure. I would argue that it’s equally, if not more, important to be able to present learnings from a test.
The Art of Testing
Designing a good test requires a little bit of art. By art, I don’t just mean creative. Yes, the creative should absolutely be great, otherwise you run the risk of garbage in, garbage out. But the question you’re asking is just as vital.
Start With a Question
Every test should be designed to answer a question. If you’re having a hard time finding inspiration, think back to the ideas you’ve often debated in your marketing meetings.
For example, is the form more effective on the right or left? Do users respond to a green button more than a red button? Should my subscription process be 1 page or 5 pages long? Do models really make a difference in images?
Too often, I find that the question isn’t even present. A company recently gave me a user scenario of their ideal A/B test, and it consisted of dramatically changing an entire checkout process from start to finish. I lost count of how many variables had been changed between their A and B versions. The test would also require extensive developer resources to implement because it involved a lot of backend integration. I asked them what they were trying to understand through this test, and I got a lot of blank expressions and averted eyes.
The best-case scenario in running this type of test is that you, the marketer, see a lot of lift, everybody applauds how great the test was, and then it’s back to business as usual.
Can You Answer ‘Why?’
The more likely scenario is that you find some lift in the test, and the first question you get back after presenting the results to management is, why? Why did version B generate lift? If you don’t have a firm grasp on what questions you’re asking in your test, you may find yourself at a loss to answer. Sure, you can always hypothesize that the user experience was much improved in the alternative, or that refreshing the site was impactful, but wouldn’t it be nice to know that removing the left navigation and consolidating the billing and shipping address pages were most influential?
The worst-case scenario is that version B performs poorly, and again, everyone is asking you why, but now they’re also talking about how wasteful and unsuccessful testing has proven to be. Where do you go from this point? The odds of getting the technical and political support necessary to continue testing are probably slim at this point.
But imagine if you had instead designed your test with your questions forming its foundation. Now, regardless of the outcome in lift, you could still present learnings and next steps to keep things moving forward.
Which of the following statements sounds better?
• “We learned that removing the left navigation entirely was not effective so we’re going to move on to testing a shortened navigation along with the promotional banner and call-to-action.”
• “Version B performed worse so we’re going to test something dramatically different from both A and B next time.”
I’d take the first one any day.
The Zen of Testing
Patience truly is a virtue, and it’s much easier said than done. I know from personal experience. I recently broke my #1 rule of “Do it right the first time” for a client because we were both rushing to hit some milestones. We ended up in that “likely scenario” bucket where we didn’t see the home run and were left wondering what we truly got out of the test.

Did we understand what the impact was of featuring specific products vs. including a red free shipping banner at the top? No — we had decided to forgo the multivariate for the A/B test because we weren’t sure we had the traffic and time to support the MVT. Did we understand how people progressed through the funnel in each version? No — we skipped tagging the funnel because it would take too much time.

In hindsight, I very much regret not taking the incremental time to do things right the first time so that we could avoid making up for it the second time around. However, it’s a lesson learned, and hopefully it means you won’t have to learn it through experience as well!
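As an aside, the “do we have the traffic?” question doesn’t have to be a guess. A quick back-of-envelope sample-size estimate tells you roughly how many visitors each variant needs before you commit to an A/B test or an MVT. Here’s a minimal sketch using the standard two-proportion normal approximation; the helper name and the 3% baseline / 20% relative lift figures are made up for illustration:

```python
from statistics import NormalDist
from math import sqrt, ceil

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant to detect a change
    in conversion rate from p1 to p2 (two-sided z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, e.g. 1.96
    z_beta = NormalDist().inv_cdf(power)           # power term, e.g. 0.84
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Baseline 3% conversion, hoping to detect a lift to 3.6%:
n = sample_size_per_variant(0.03, 0.036)  # roughly 14,000 visitors per variant
```

Divide that number by your daily traffic per variant and you have a rough test duration. An MVT with many cells multiplies the traffic requirement accordingly, which is exactly the math that should drive the A/B-vs-MVT decision rather than a gut call made under deadline pressure.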
So while it’s tempting to run your boil-the-ocean test all at once, it really is worth the time to break that test down into different components. Ask yourself what you are trying to understand by running that test. That should naturally lead you to the questions. From there, try to construct an iterative approach that knocks out these questions in waves. This approach allows you to get learnings faster and also breaks up the development resources you may need into smaller, bite-sized chunks.
A lot of people think testing is all about luck, but I find that the more frequently and intelligently you test, the luckier you get.