The many layers of what can run a testing program off track are both complicated and simple. All the major errors come from a need to fit an existing structure, to do what your boss wants, and, most importantly, to do what will make others happy. The hard work that actually defines success is the work no one wants to do, be it agreeing on a metric, being the leader your organization needs (even when it may not want one), or making sure people understand efficiency. All of these sins are really about defining how you are going to act when it comes time to act.

The next sin, testing only what you want, is the first one that concerns the action itself. Avoiding it means not just sitting together and brainstorming, or listening to a pitch about something that sounds great, but incorporating the need to grow and learn into your actions, so that the path your group takes is organic rather than prescribed. Groups get so caught up in testing only what the boss wants, or what the design team thinks will win, that they miss almost all of the really important successes a test can produce. Every group starts out wanting to test one feature or another, and everyone hears about a best practice or a cool thing another group did and wants to do the same. We fail to feed the system with a broad range of inputs because we cannot see past our own opinion, and because of that we dramatically lower the outcomes of our efforts.

Groups fail when they are too caught up in what wins, or in proving someone right. People are so intent on validating an idea that they fail to see what other options would do, or, even more important, to ask what happens if they are wrong. What matters is not the discovery of a single validated idea but the comparative analysis of multiple paths. This means the worst thing we can do is limit or focus any effort only on what we want, or on what is popular. The sin in testing is the desire to validate instead of learn, and the limiting of focus to our own opinion. The truth is that the least important part of any test is what wins, since the value of the “win” is only as valuable as the context of that win. If we discover a 5% lift, that may be great, but if in the same test we could have had a 10%, 20%, or 50% lift, then the 5% suddenly becomes an awful result. We get the most value when we are wrong, and when we discover this and allow ourselves to move down that path. Being aware of your ego, and not limiting your efforts to your own or your boss’s opinion, is what defines the magnitude of value you actually achieve.
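The 5%-versus-50% point can be made concrete with a few lines of arithmetic. The sketch below computes relative lift for several variants against a shared control; every variant name and conversion rate here is invented purely to illustrate the comparison, not taken from any real test.

```python
# Hypothetical conversion rates from a single multi-variant test.
# All names and numbers are illustrative, not real data.
control_rate = 0.040

variant_rates = {
    "boss_favorite": 0.042,   # the idea everyone wanted to validate
    "alternative_a": 0.044,
    "alternative_b": 0.048,
    "alternative_c": 0.060,
}

def relative_lift(rate, baseline):
    """Relative lift of a variant over the control, as a fraction."""
    return (rate - baseline) / baseline

for name, rate in sorted(variant_rates.items(),
                         key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {relative_lift(rate, control_rate):+.0%}")
```

Run alone, the boss’s favorite shows a respectable +5% lift; run alongside genuinely different alternatives, that same +5% sits at the bottom of the table, which is the whole argument for feeding the test a broad range of inputs.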

The sad truth is that we are often extremely unaware of the real value of our opinions. There is almost an inverse correlation between what people think will win and what actually wins. One of the best ways to test this is to make sure each test has a large number of very different variants, and to poll people before the test on which one they think will win. You will find almost no connection between votes and outcome, which says a lot about our ability to measure things rationally once we have formed an opinion. Any time we test only what people think will win, or what they want to see win, we have fundamentally crippled our ability to deliver meaningful value. Remember that if you test only two things, and the thing you wanted wins, all you have done is add cost. Challenge yourself and others to think in terms of possibilities, not opinions, and to scope things in terms of exploring the most options, not just the set options.
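The pre-test poll described above can be scored after the fact by correlating votes with measured lift. This is a minimal sketch of that check; the vote counts and lift figures are made up to show the mechanics, and a real program would use its own poll and test results.

```python
# Hypothetical pre-test poll votes vs. measured relative lift for six
# variants. All numbers are invented to illustrate the check.
votes = [12, 9, 7, 5, 3, 1]                      # people picking each variant
lifts = [-0.02, 0.01, 0.05, -0.01, 0.12, 0.03]   # measured relative lift

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(f"vote/outcome correlation: {pearson_r(votes, lifts):.2f}")
```

A result near zero (or negative, as in this invented data) is the point of the exercise: it gives the group hard evidence that its collective opinion is a poor predictor, which makes the case for broader variant selection far better than argument alone.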

One of the hardest tasks for groups is the need to assume they know nothing about the value of an action. You are invested in proving to others that you know best, or in the value of whatever is already happening, but the sad truth is that most actions are done out of pattern and history, not because of measured value to the organization. Even worse, we build giant project plans that we become unwilling to change or disrupt, so focused on completion that we only pretend to care about the project’s actual value. Taking a step back and understanding that “if I am right, it will prove itself out; if I am wrong, it will show me a better way to do things” is easy to say but almost impossible to internalize immediately. There is nothing worse than finishing a massive project only to discover neutral or negative performance, yet this is by far the most common outcome when groups test large redesigns. It is vital that programs, as they build out, not only test what they want to win, but build into all plans dynamic points where things can go in unexpected directions, so that they are not inflexible to the reality of quantitative results.

Ask yourself these questions before you take any action: “How do I prove myself wrong?” “What if I am focusing on all the wrong things?” “What are the other feasible alternatives for this page, section, or module?” It sounds counterintuitive, but it will help you understand just how much larger the testable world is than the world you would inherently start with. You are not limited to what you want; you are limited only by your imagination and the efficient use of your resources. Force every action to confront these questions, rather than the question of “how much better is this idea?”

It is easy to limit the possible value of your program by believing you are right. All people are wired to do so, and they build their empires on projecting that certainty to others. Building your tests to get past this sin, and instead to find the right answer and to know how it measures against the larger world, is vital to the level of value you can get from your testing program.