One of the more common refrains I hear as I speak with different organizations or read industry blogs is, “How do you deal with a failed test?” People speak of this as if it is a common or accepted practice, one that you need to help people understand before you move forward. The irony of these statements is that when most groups say this, they are measuring the value of the test by whether they got a “winner”, a recipe that beat their control. People almost always come into testing with the wrong idea of what a successful test is. Change what success means, and you will be able to change your entire testing program.

Success and failure of any test are determined before you launch the test, not by the measurement of one recipe versus another. A successful test may have no recipe beat the control, and an unsuccessful test may have a clear single winner. Success is not lift, because lift without context is nice but almost meaningless.

Success is putting an idea through the right system, one that enables you to find the right answers and get performance that you would not have had otherwise. If all you do is test one idea versus another that you were already considering, you are not generating lift, you are only stopping negative actions. In addition, if I find something that beats control by 5%, that sounds great, until you add the context that 3 other recipes I never tested would have produced 10%, 15%, and 25% changes. Do you reward the 5% gain, or the 20% opportunity loss?
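The arithmetic behind that question can be sketched in a few lines. This is only an illustration using the numbers from the example above; the function name `opportunity_loss` is made up for this sketch, and in practice you never know the lift of a recipe you did not run, which is the whole point.

```python
def opportunity_loss(chosen_lift, other_lifts):
    """Gap between the best lift available and the lift you actually got.

    All lifts are fractions over control (0.05 means 5%).
    """
    best_available = max([chosen_lift, *other_lifts])
    return best_available - chosen_lift


# Hypothetical numbers from the example: one tested recipe at 5%,
# three untested recipes that would have produced 10%, 15%, and 25%.
tested = 0.05
untested = [0.10, 0.15, 0.25]

loss = opportunity_loss(tested, untested)
print(f"Reported gain:    {tested:.0%}")
print(f"Opportunity loss: {loss:.0%}")
```

The 5% looks like a win only because nothing else was on the table to compare it against.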

In the long run, a great idea poorly executed will never beat a mediocre idea executed correctly.

You can measure how successful a test will be by asking some very simple questions before you consider running the test:

1) Are you prepared to act on the test? – Do you know what the metric you are using is? Do you have the ability to push results? Is everyone in agreement before you start that no matter what wins, you will go with it? Do you know what the rules of action are and when you can call a winner and when is too soon? If you answered no to any of those questions, then any test you run is going to be almost meaningless.

2) Are you challenging an assumption? – This means that you need to make sure that you can see not only if you are correct, but if you are wrong. It also means that you need to have more than 1 alternative in a test. Alternatives need to be different from each other and allow for an outcome outside of common opinion to take hold. Consider any test with a single alternative to be a failure as there is no way to get a result with context.

3) Are you focusing on should over can? – It is easy to get caught up on whether we can do a test, whether we can target a specific group, or making sure that we can track 40 metrics. It is incredibly easy to get lost in the execution of a campaign, but the reality is that most of the things we think are required aren’t, and if we cannot tie an action back to the goal of a test, then there is no reason to do it. These items should be considered based on your infrastructure and on value. Prioritize campaigns by how efficient they are to run, and never include more than you need to take the actions you need to take. Any conversation that is focused purely on the action is both inefficient and a red herring taking you away from what matters.

So how then do you make sure that you are getting success from a test? If nothing else, you need to build a framework for what will define a successful test, and then make sure that all actions you take fit that framework. Getting people to agree to these rules can seem extremely difficult at first, but having the conversation outside of a specific test, and making it a requirement that everyone follows the rules, will help ensure that your program is moving down the right path to success.

Here is a really simple sample guideline to make sure all tests you run will be valuable. Each organization should build their own, but they will most likely be very similar:

  • At least 4 recipes
  • One success metric that is site wide, same as other tests, and directly tied to revenue
  • No more than 4 other metrics, and all of these must be site wide and used in multiple tests
  • Everyone in agreement on how to act with results
  • Everyone prepared to do a follow-up test based on the outcome
  • At least 7 segments and no more than 20, with each segment at least 5-7% of your population and all must have a comparable segment
  • If interested in targeting, test must be open to larger population and use segments to either confirm beliefs or to prove yourself wrong. (e.g. if I want to target to Facebook users, I should serve the same experiences to all users and if I am right, then the content I have for Facebook users will be the highest performer for my Facebook segment).
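If it helps to make the guideline concrete, here is a minimal sketch of it as a pre-launch check. Everything in it is an assumption for illustration: the `ExperimentPlan` fields, the `check_plan` function, and the 5% floor (the low end of the 5-7% range above) are invented names and choices, not part of any testing tool. Rules that need human judgment, such as whether segments are comparable or how targeting is set up, are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentPlan:
    """Hypothetical description of a test before it launches."""
    recipes: int
    success_metric_sitewide_revenue: bool  # one site-wide, revenue-tied metric
    other_metrics: int
    team_agrees_on_action: bool
    follow_up_planned: bool
    # Share of the population in each segment, as fractions (0.08 means 8%).
    segment_shares: list = field(default_factory=list)


def check_plan(plan, min_segment_share=0.05):
    """Return a list of guideline violations; an empty list means good to go."""
    problems = []
    if plan.recipes < 4:
        problems.append("need at least 4 recipes")
    if not plan.success_metric_sitewide_revenue:
        problems.append("success metric must be site wide and tied to revenue")
    if plan.other_metrics > 4:
        problems.append("no more than 4 other metrics")
    if not plan.team_agrees_on_action:
        problems.append("everyone must agree on how to act on results")
    if not plan.follow_up_planned:
        problems.append("plan a follow-up test before launch")
    if not 7 <= len(plan.segment_shares) <= 20:
        problems.append("need between 7 and 20 segments")
    if any(share < min_segment_share for share in plan.segment_shares):
        problems.append("every segment must meet the minimum population share")
    return problems


plan = ExperimentPlan(
    recipes=2,
    success_metric_sitewide_revenue=True,
    other_metrics=3,
    team_agrees_on_action=True,
    follow_up_planned=True,
    segment_shares=[0.10] * 8,
)
for problem in check_plan(plan):
    print("Not ready:", problem)
```

Note that nothing in the check looks at the test idea itself. The plan passes or fails before anyone knows which recipe wins.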

One of the most important things that an optimization program can do is make sure that all tests follow a similar framework. Success in the long run follows from how you approach the problem, not from the outcome of a specific action. You will notice that at no point here is the focus on the creation of test ideas, which is where most people spend way too much time. Any test idea is only as good as the system by which you evaluate it. Tests should never be about my idea versus yours, but instead about the discovery and exploitation of comparative information, where we can figure out what option is best, not if my idea is better than yours.

Which variant won, whose idea it was, and how to generate test ideas are some of the biggest red herrings in testing programs. You have to be able to move the conversation away from the inputs, and instead focus people on the creation of a valuable system by which you filter all of that noise. Do not let yourself get caught in a trap of being reactive. Instead, proactively reach out and help groups understand how vital it is that we follow this type of framework.

Change the conversation, change how you measure success, and others will follow. Keep having the same conversation or let others dictate how you are going to act, and you will never be able to prove success.
