One of my first introductions to the larger world of testing was a chance to serve on a panel about multivariate testing. I remember how divergent the opinions were and how bad the misconceptions about the entire process were. Just about everyone I talked to had the same preconceived notions of how to use multivariate testing, and even worse, almost all of those notions were based on their need to propagate their sales pitches. Now, as I work with more and more organizations, I see the same bad ideas replicating, and groups continue to misunderstand the true value of multivariate testing. MVT holds all these promises, but when done for the wrong reasons it multiplies the worst of testing instead of facilitating the best of it. Even worse, groups then confuse the issue, focusing on the method of the test and not the fundamental mindset that created it. Many groups then get into debates about the “value” of the different multivariate methods out there, which is nothing more than a fool’s errand, since any method is going to fail when it is used to answer the wrong question.

Too many times people get caught up on the “advantages” or “disadvantages” of the various forms of multivariate analysis. There are many advantages to full factorial testing, including fewer rules, better insight into interactions across tested elements, and the ability to test non-uniform concept arrays. There are also many advantages to partial (fractional) factorial testing: speed, forced conformity to better testing rules, and more efficient use of resources. What does not matter is which one allows you to throw things at a wall and get an answer. When you are busy trying to answer the wrong question, you can fail with any tool. It is only when you are trying to succeed that the differences between tools matter.

For most groups, the fundamental use of multivariate testing is to combine multiple badly conceived A/B tests, throwing them all together quickly in the hope of finding a combination that increases results. So many groups want to try out some combination of ideas that they think an MVT campaign is the solution. You can use the test that way; it is statistically valid and will guarantee a result, but at what cost? The challenge is that you will be wasting resources and time, and you are guaranteed to get a suboptimal outcome from this flawed way of thinking. Any form of multivariate testing that is used as just a massive combination of individual tests is always going to be inefficient, since you are replicating the imperfections of those individual tests and adding them together in a way that magnifies them. If your goal is simply that individual outcome, and it is for way too many programs and especially agencies, then you will never get any true value from multivariate testing until you change your mindset.

The concept of just trying to find a combination misses a fundamental truth: you are spending a massive amount of resources creating all these permutations and offers without any understanding of the efficiency of each resource. Two problems follow from this:

1) All the ideas come from preconceptions and hypotheses about what works

2) Each new variant adds cost, both in its creation and in the data acquisition needed for the results to be meaningful
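The second point is easy to see with a little arithmetic. The following is only a rough sketch, not a sizing method: the function names and the flat 1,000-visitors-per-experience figure are assumptions made for illustration, and a real test would size traffic with a proper power calculation.

```python
from math import prod


def full_factorial_cells(levels_per_factor):
    """Number of distinct experiences in a full factorial design."""
    return prod(levels_per_factor)


def visitors_needed(levels_per_factor, visitors_per_cell):
    """Total traffic if every experience gets the same (assumed) sample."""
    return full_factorial_cells(levels_per_factor) * visitors_per_cell


# Hypothetical button test: color, copy, size with two options each.
baseline = [2, 2, 2]
# Same test after adding a single extra color.
one_more_color = [3, 2, 2]

print(full_factorial_cells(baseline))         # 8 experiences
print(full_factorial_cells(one_more_color))   # 12 experiences
print(visitors_needed(one_more_color, 1000))  # 12000 visitors at 1,000 each
```

One extra option on one factor adds four experiences to build and fill with traffic, which is why every untested assumption you add to the test has a real price.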

If we instead focus on multivariate testing as a means to filter our resources rather than simply combine them, then we are able to achieve efficiency. If we want to limit our resources and apply them only where we will get the most return, then we must always view multivariate testing as a tool to learn and be efficient, not as a way to throw things out there and see what works.

The classic example of a multivariate test is testing a button. Let us say I have a medium orange “Purchase” button currently on my site. I might think that red would be better than orange, and my UX person thinks that “Buy Now” will perform better because he saw it on a few competitor sites. We round out the test by also adding a slightly larger button, and we get a predicted best combination of large, orange, “Buy Now”. We pat ourselves on the back and move forward. The reality is that each of those factors (size, color, copy) has a massive number of feasible alternatives, and all we did was look at a very limited, biased set of them.

Let me propose a better way. Look at that same test, but instead of preconceiving the outcome, look for the value of each factor. Suppose we ran the same test and found that size matters more than color, despite what we thought going in. If we spend as few resources as possible to reach that understanding, then we have left the maximum amount of resources available to apply to the winning factor or element. Having learned that size matters, we can shift our resources away from the less influential elements and apply them to as many different feasible alternatives for the execution of the winning factor as we can. Instead of being limited to testing 3–4 sizes, we know the value of size and can create as many different alternatives as possible. Not only have we used fewer resources, but they have been applied to the most influential part of our experience.
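To make "look for the value of each factor" concrete, here is a minimal sketch of one way to rank factors by influence from a finished full factorial test. The conversion rates are invented for illustration, the function name is hypothetical, and the calculation assumes equal traffic per experience and ignores statistical significance and interactions, so treat it as a starting point rather than an analysis method.

```python
from collections import defaultdict

FACTORS = ("color", "copy", "size")

# Hypothetical conversion rates for each (color, copy, size) experience.
results = {
    ("orange", "purchase", "medium"): 0.040,
    ("orange", "purchase", "large"): 0.052,
    ("orange", "buy now", "medium"): 0.041,
    ("orange", "buy now", "large"): 0.054,
    ("red", "purchase", "medium"): 0.039,
    ("red", "purchase", "large"): 0.051,
    ("red", "buy now", "medium"): 0.040,
    ("red", "buy now", "large"): 0.051,
}


def factor_influence(results, factors):
    """Spread between the best and worst average rate across each factor's levels."""
    influence = {}
    for index, name in enumerate(factors):
        rates_by_level = defaultdict(list)
        for combination, rate in results.items():
            rates_by_level[combination[index]].append(rate)
        level_means = [sum(r) / len(r) for r in rates_by_level.values()]
        influence[name] = max(level_means) - min(level_means)
    return influence


influence = factor_influence(results, FACTORS)
ranked = sorted(influence.items(), key=lambda item: item[1], reverse=True)
for name, spread in ranked:
    print(f"{name}: {spread:.4f}")
```

With these made-up numbers, size comes out well ahead of color and copy. That ranking, not the single best-looking combination, is what tells you where to spend the next round of creative effort.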

Even better, I have now learned that size matters most, and I have an outcome that is different from, and greater than, the one I would have had before. In fact, I have shifted the system so that the absolute worst thing that can happen is that I end up with the same alternative I would have had before, but for less time and fewer resources. I have also added a much higher upside, because an alternative that I would not previously have included can come out the winner. I have tested more alternatives of the important factor, so I am not limiting my output to the single input of popular opinion. I have leveraged multivariate testing as a way to learn what matters and to focus my future efforts on it. I no longer have to create alternatives for factors that have no influence, and can instead focus resources on testing as many different feasible alternatives as I can for the things that do influence behavior.

The less you spend to reach a conclusion, the greater the ROI. The faster you move, the faster you can get to the next value as well, which also increases the outcome of your program. What is more important is to focus on the use of multivariate testing as a learning tool ONLY, one that tells us where to apply resources. It frees us up to spend as many resources as possible on feasible alternatives for the most valuable or influential factor, while eliminating the equivalent waste on factors that do not have the same impact. The goal is to get the outcome; getting overly caught up in doing it in one massive step, as opposed to smaller, easier steps, is fool’s gold.

You CAN leverage multivariate tests in a large number of ways, and let me tell you that there are enough 15×8 tests out there to show that it is a statistically valid approach. The question is never what you can do, but what you SHOULD do. Just because I can test a massive number of permutations does not mean that I am being efficient or getting the return on my efforts that I should. We can’t just ignore the context of the output to make ourselves feel better about our results. You will get a result no matter what you do; the trick is constantly getting better results for fewer resources.

If you are stuck in the realm of trying to show results from a single test, or are not thinking of your testing program as a learning optimization machine, then you aren’t going to get the results you need no matter what you do. Multivariate tests are useful only in the context of your program. If you are stuck thinking in terms of just the outcome of that specific test, you will never achieve the results that you want.

If you shift to thinking about it in the context of a larger program, then multivariate tests are just one of many tools you have at your disposal to achieve those goals. Don’t let the promises and sales pitches of a few divert your attention away from what matters. And if you are focusing on what matters, then the question of which type of multivariate test you use becomes almost completely moot.