One mistake I see a lot of companies making is running a test to find a winner instead of running a test to learn something. The two-recipe test is a symptom of this. Often the test pits a complete redesign against the incumbent default. In this scenario, the redesign may edge out the default. It may not. Either way, you may have won a battle by reaching a slightly higher threshold in performance or by stopping an engagement-crippling redesign from becoming a permanent fixture on your site. You have found what can be called a local maximum. The problem is that while you may have won a battle, that battle is not helping you win the war.
What is the war? The war here is optimizing your site to move past better in search of best. Let’s say you are fortunate in that your redesign test had a challenger that performed better than the default. In this case you have found a better experience, but it tells you nothing about how to get to the best experience. Is this new baseline the best the site can do? Maybe it is, but more than likely it is not. At this point you have perfect visibility backward into how the tested change impacted performance, but no visibility forward into what types of changes will drive further increases.
How do you get to the best experience, then? You first identify which elements matter, and then iteratively improve those elements. To put this in context, consider the redesign. A redesign may involve changing five different elements. If you have a large marketing team or dedicated agency, make that 15 different elements. Either way, it will be impossible to accurately allocate any positive or negative lift among those various changes, leaving you with a real attribution problem. Instead, start with some diagnostic tests that tell you which parts of the current page are important and worthy of follow-up tests. These tests keep your focus uphill, looking past local maxima.
Here are a couple of ways to do this:
The Exclusion Test
A simple exclusion test is a perfect example. This involves creating several experiences, each of which excludes one element from the page, to see that element's overall impact on page consumption (or conversion rate, depending on which metric makes sense for your business). If you remove something and page consumption goes down, that element is helpful and should be at the top of the list of elements on which you will focus a series of A/B tests. Conversely, if you remove an element and page consumption goes up, that element is probably adding clutter and you may want to consider removing it. For most publishers, these tests are very easy to set up because of how typical main edit wells and right rails are built: they usually employ styling such that if you hide one element, everything below it collapses neatly upward. This means a simple CSS change that can be written in two minutes will produce a recipe. If your site employs proper CSS, you can have an entire test set up in an hour.
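To make the read-out concrete, here is a minimal sketch of analyzing an exclusion test's results. The element names, session counts, and conversion numbers below are all hypothetical; the point is the shape of the comparison: each "hide one element" variant against the unmodified control, with a lift figure and a quick two-proportion significance check.

```python
# Hypothetical read-out of a simple exclusion test: one variant per hidden
# element, each compared against the unmodified control page. All names
# and numbers are made up for illustration.
from math import sqrt, erf

control = {"sessions": 50_000, "conversions": 2_000}   # 4.0% conversion
variants = {
    "hide related-articles module": {"sessions": 50_000, "conversions": 1_700},
    "hide newsletter signup":       {"sessions": 50_000, "conversions": 2_050},
    "hide social share bar":        {"sessions": 50_000, "conversions": 1_990},
}

def two_proportion_p(c1, n1, c2, n2):
    """Two-sided p-value for a difference between two conversion rates."""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Normal CDF via erf; p = 2 * P(Z > |z|)
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

base = control["conversions"] / control["sessions"]
for name, v in variants.items():
    rate = v["conversions"] / v["sessions"]
    lift = (rate - base) / base
    p = two_proportion_p(v["conversions"], v["sessions"],
                         control["conversions"], control["sessions"])
    print(f"{name}: lift {lift:+.1%}, p={p:.3f}")
```

In this fabricated data, hiding the related-articles module hurts conversion badly, so that element earns a series of follow-up A/B tests; hiding the newsletter signup nudges conversion up, hinting at clutter, though its p-value says you would want more data before acting.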
The Multivariate Test
The MVT is a testing buzzword that gets many testers into trouble because it is easy to misuse. When implemented correctly, though, it can be a powerful diagnostic tool. If you are using an MVT to find the magic recipe from an array of 46 permutations, you are probably misusing it. Instead, the MVT can be very helpful in identifying which elements on the page contribute to success, or have high “element contribution.” The focus here should be on the importance of the elements, not on the specific treatments of those elements (red vs. green, etc.). Running an MVT in this way allows you to test three or more elements at once, with the goal of finding out which of them matter enough to warrant follow-up A/B tests.
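As a sketch of what "element contribution" means in practice, here is a toy full-factorial read-out. Assume a 2^3 design over three hypothetical elements (headline, image, CTA), with made-up conversion rates per cell; each element's main effect is the average conversion with its alternative treatment on, minus the average with it off. The element names and rates are illustrative, not from any real test.

```python
# Illustrative "element contribution" read-out from a small 2^3 MVT.
# Every cell key is (headline, image, cta), with 0 = control treatment
# and 1 = alternative treatment. Conversion rates are fabricated.
rates = {
    (0, 0, 0): 0.040, (0, 0, 1): 0.046,
    (0, 1, 0): 0.041, (0, 1, 1): 0.047,
    (1, 0, 0): 0.052, (1, 0, 1): 0.058,
    (1, 1, 0): 0.053, (1, 1, 1): 0.059,
}

elements = ["headline", "image", "cta"]

def main_effect(idx):
    """Avg conversion with this element's alternative on, minus off."""
    on = [r for cell, r in rates.items() if cell[idx] == 1]
    off = [r for cell, r in rates.items() if cell[idx] == 0]
    return sum(on) / len(on) - sum(off) / len(off)

effects = {name: main_effect(i) for i, name in enumerate(elements)}

# Rank elements by how much they move the metric, regardless of direction.
for name, eff in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {eff:+.3f}")
```

In this fabricated data the headline moves conversion roughly an order of magnitude more than the image, so the headline (and the CTA) would be the elements worth a follow-up A/B series, regardless of which specific treatments happened to win here.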
These two types of tests are helpful diagnostic methods for finding the levers on your site that, when pulled, will drive toward the best site instead of simply settling for better. Running a couple of extra diagnostic tests to create a map to the top of the mountain may sound like more work than stopping at the nearest peak. It is more work, but that is what will distinguish best from better on your site. If it were easy, everyone would be doing it, and you would be paid far less.