Talk to 5 people in the optimization space and you will get 5 different stories about how best to improve your website. Talk with 50, however, and those same 5 will get repeated more often than not. Such is the world we operate in, where “best practices” become so commonplace and so often repeated that we rarely take the time to really think about or prove their effectiveness. Because of this phenomenon, a lot of actions that are less than ideal, or outright bad for companies, become reinforced as must-do items.

The reality is that discipline is always going to win out over specific actions, and that oftentimes the best answer is to measure everything against everything else and take nothing for granted. While all of that is true, it is still important that you understand these common suggestions: where they work, how, why, and more importantly, why people believe they are more valuable than they really may be.

Test Free Shipping or Price Changes

This is a common one for retail sites, as it is easy to understand, a familiar tactic (thanks, Amazon), and easy to sell to the higher-ups. The problem is not the concept itself, but how people measure its impact, and what that means for other similar tactics. What can easily seem like a huge win is often a massive loss, and even worse, due to how most back-end systems are designed, the actual amount of work needed to run these tests can be much higher than other simpler and extremely valuable uses of your finite resources.

Let’s look at the math of a basic free shipping test. In this simplified scenario, we sell one item for $90 on our site, with an actual cost to us of $70 ($20 net profit). Our shipping is $10, which means that when the item is purchased normally, someone pays us $100.

We want to test free shipping, where we pay for the shipping and sell the same widget for $90. We run the test and we see a 50% increase in sales! We should be getting promotions, and in most cases the person who ran this project is shouting their accomplishment to the entire world and everyone who will listen. Obviously this is the greatest thing ever and everyone should be doing it… except you just lost a lot of money.

The problem here is that we often confuse gross and net profit, especially because in a lot of different tests you are not directly changing the bottom line. In the case of free shipping or pricing tests, though, we are directly changing what a single sale means to us.

Let’s dive into the numbers. Say we sell 1,000 orders in our normal control group.

$100 × 1,000 = $100,000

But the real number that impacts the business is:

$20 × 1,000 = $20,000

In the free shipping option, we have cut our profit in half by paying the $10 shipping, which means that at $10 profit per order we actually need twice as many orders JUST TO BREAK EVEN.

$20,000 / $10 = 2,000 orders
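The break-even math above can be sketched in a few lines of code. This is a minimal sketch using only the article’s hypothetical numbers ($90 price, $70 cost, $10 shipping, 1,000 control orders); the variable names are my own, not any standard reporting metric.

```python
# Break-even math for the hypothetical free-shipping test.
price = 90      # sale price of the widget
cost = 70       # what the item costs us
shipping = 10   # shipping cost we absorb when it is "free"

profit_paid_shipping = price - cost             # customer pays shipping: $20/order
profit_free_shipping = price - cost - shipping  # we pay shipping: $10/order

control_orders = 1000
control_profit = control_orders * profit_paid_shipping  # $20,000

# How many free-shipping orders does it take just to match control's net profit?
break_even_orders = control_profit / profit_free_shipping
print(break_even_orders)  # → 2000.0
```

Note that the celebrated 50% lift (1,500 orders) falls well short of the 2,000 orders needed just to break even.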

This means that if we fall back to the standard RPV reporting that you look at for other types of tests, then the math says that:

$100 × 1,000 = $100,000 (control)
$90 × 2,000 = $180,000 (free shipping, at break-even volume)

So unless the free-shipping option lifts revenue to at least 180% of the control (an 80% increase in RPV), we are dramatically losing money. So many times you see reports of amazing results from these kinds of optimization efforts which mask the realities of the business. It can be hard, no matter how much this makes sense in conversation, to have the discipline to think about a 50% increase as a loss, but that is exactly what happened here. Sadly, this hypothetical story plays out often in the real world, with the most likely result being the promotion of the results and not a rational evaluation of the impact to the business.
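To make the masking effect concrete, here is a small sketch, again using only the article’s hypothetical figures, showing how the same 50% lift looks like a win in gross revenue while being a loss in net profit:

```python
# The same 50% "win" viewed through gross revenue vs. net profit.
control_orders, test_orders = 1000, 1500  # the celebrated 50% lift

control_revenue = 100 * control_orders  # $90 item + $10 paid shipping
test_revenue = 90 * test_orders         # free shipping: customer pays $90

control_profit = 20 * control_orders    # $20 margin per order
test_profit = 10 * test_orders          # margin halved by absorbed shipping

print(test_revenue - control_revenue)   # → 35000  (revenue looks great)
print(test_profit - control_profit)     # → -5000  (the business lost money)
```

The standard revenue-per-visitor report would celebrate the first number and never surface the second.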

This same scenario plays out any time we vary margin without an equally varied gross cost. The other common example is price changes, where the cost of the item remains fixed, but the test only truly impacts how much margin we make off of the item. In both cases we are forced to set minimum break-even marks prior to starting the test, and treat those as the neutral point, not the normal relative percentage lift that we might be accustomed to.
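That minimum mark can be computed before the test ever launches. A minimal sketch: the function name and the second example’s numbers are my own illustration, derived from the simple identity that net profit is orders times margin.

```python
def break_even_lift(old_margin: float, new_margin: float) -> float:
    """Minimum fractional conversion lift needed so net profit at the
    new (lower) margin matches net profit at the old margin.
    From: orders * old_margin == orders * (1 + lift) * new_margin.
    """
    return old_margin / new_margin - 1

# Free-shipping example above: $20 margin drops to $10.
print(break_even_lift(20, 10))  # → 1.0, i.e. a 100% lift just to break even

# A hypothetical price cut from $90 to $81 on a $70-cost item:
# margin falls from $20 to $11, so break-even needs roughly an 82% lift.
print(break_even_lift(20, 11))
```

Whatever number this returns is your neutral point; anything below it is a loss, no matter how the lift reads in a standard report.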

Always Repeat Content on Your Site

This, and a large number of other common personalization-type suggestions (whom to target and how to target them), actually has a number of issues inherent to it. The first is that even if the suggestion is true, that does not mean it is the most valuable way to tackle the problem. Just because repeating content improves performance by 3%, it does not mean that doing something else entirely will not result in a 10% or 50% increase.

The sad truth is that repeating content, when it does work, is often a very small incremental gain and pales in comparison to many other content concepts you could be trying. The goal is not just to do something that produces an outcome, as every action produces an outcome; the goal is to find the action that produces the maximum outcome for the lowest amount of resources. In that light, repeating content is often, but not always, a poor use of time and resources. The reason it gets talked about is often not its performance but the fact that it is easy to understand and easier to get buy-in for from above.

The second major problem with these suggestions is that they skip the entire discipline that leads to the answer. There is no problem with repeating content as long as you also try 3–4 other completely different forms of content. Repeating content may be the right answer, it may be an OK answer, and it may be the worst answer, but you only know that if you are open to discovering the truth. There is no problem having a certain group or behavior you want to see if you can target; the issue is when you target them without looking at the other feasible alternatives. If you are not testing multiple concepts against everyone and looking for the best combination, then no matter what you do you are losing revenue (and making you and your team do extra work).

The real irony, of course, is that if you test these out in a way that reveals the impact compared to other alternatives, the absolute worst-case scenario is that you were correct and you target as you would have liked. Any other scenario presents you with a piece of content, a group, or both that results in better performance. Knowing this information allows you to save time and effort in the future, as well as spend resources on actions that are more likely to produce a result.

It is not unusual to find that simply targeting a specific group results in that group showing a slight increase, and if that is all you look at, you have evidence to present and share internally as a success. Looking at the issue deeper, you commonly find that the overall impact to the business is negligible (within the standard 2% natural variance) or, even worse, negative to the whole. It is also not uncommon to find a combination you never thought of presenting a massive gain.

One of my favorite stories in this line was when I worked with an organization that had decided exactly how and what to target to a number of specific groups, based on a very complex statistical analysis of site behaviors. They had built out large amounts of infrastructure to facilitate this exact action. We instead took 100% of the content they already had and presented it to everyone, looking at the impact of serving it to the groups they envisioned as well as others, and in a few different dynamic permutations. The results showed that if they had done only what they envisioned, they would have lost 18% of total leads on the site (this is also a great example of why causal inference is so vital and why you should not rely on correlative inference). They also found that by serving 2 of their normal pieces of content based on behaviors they had not envisioned, they would see a 20% gain. They were able to go from causing dramatic harm to their business to a large, meaningful, multimillion-dollar gain simply by not relying solely on hearsay and instead testing their assumptions.

In both cases there are many different ways you can manipulate the data to look like there was a positive outcome while actually doing damage. In both cases massive amounts of time and effort were spent to try something, only to find an outcome counter to people’s assumptions. In both cases testing assumptions and exploring to discover the value of different actions beforehand would have better informed decisions and created more value.

In the end, any idea is only going to be as valuable as the system you put it through. There is nothing inherently wrong with either concept as long as it is measured for efficiency and acted on rationally. If you can take a common heuristic and evaluate it properly, there is value to be had. That does not mean these tactics will act as a magical panacea, nor should you plan your program around such flawed, simple ideas. Focus on building the proper system and you will be able to provide value no matter what concepts get thrown your way.