As the online world gets deeper and deeper into mathematical disciplines, new people are constantly being made aware of all the amazing mathematical tools that are available. Oftentimes marketers talk about or leverage these tools without really understanding the math behind them, as they get caught up in the more common names: media mix modeling, confidence, or revenue attribution modeling. The problem arises when the focus is on the tools and their power, but not on the disciplines and functions that make them viable as tools. Just having access to some way of looking at data does not inherently make it valuable, yet too many analysts end the conversation at that point. Every tool is only as good as the way you use it. So how do you enable people to get value from these tools, instead of the empty promises that come from not understanding the real nature of the tool?

Before anyone starts focusing on mathematical models, the first thing and the last thing that must be understood is by far my favorite quote about math: “All models are wrong. Some are useful.” All models are built on assumptions, and those assumptions determine the validity of any output of that system. Not only do the assumptions have to be understood at the start, but as the environment that you are modeling evolves, they must continue to hold, which can be extremely problematic given the constantly changing nature of the digital world. Because of this, a constant and vigilant awareness of not only the initial assumptions, but also the continued fit of those assumptions, is vital for getting a positive outcome in the long term.

In the world of testing, the most common models used are p-value based models. The t-test, the z-test, and the chi-squared test are all basically different versions of the same concept. What people most often miss is that these models require a number of key things before they can ever be useful. The first of the two monumental requirements is representative data. It doesn’t matter if all other parts of the model are correct if the data has no reflection on the real basis of your business. Getting confidence quickly means nothing if the confidence does not reflect the larger nature of your business. This is why you will find people who do not understand this problem shocked when they act too quickly and then find that the real impact is different than what they measured.
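To make the "different versions of the same concept" point concrete, here is a minimal sketch using only the Python standard library. It compares a pooled two-proportion z-test with a Pearson chi-squared test (no continuity correction) on the same data. The conversion counts are made up for illustration, and the function names are my own, not from any testing tool.

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z statistic and its two-sided p-value."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def chi_squared_2x2(conv_a, n_a, conv_b, n_b):
    """Pearson chi-squared statistic for a 2x2 table (1 degree of freedom)."""
    observed = [[conv_a, n_a - conv_a], [conv_b, n_b - conv_b]]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [conv_a + conv_b, total - conv_a - conv_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of chi-squared with 1 df, written via the normal tail.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical test: 200/5000 conversions on A, 245/5000 on B.
z, p_z = two_proportion_z(200, 5000, 245, 5000)
chi2, p_chi = chi_squared_2x2(200, 5000, 245, 5000)
print(f"z = {z:.3f}, z^2 = {z * z:.3f}, chi2 = {chi2:.3f}")
print(f"p (z-test) = {p_z:.4f}, p (chi-squared) = {p_chi:.4f}")
```

In this setup the squared z statistic and the chi-squared statistic agree, and so do the p-values. Neither calculation knows anything about whether the 10,000 visitors were representative of your business, which is exactly the part the math cannot check for you.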

The other large assumption is that the data distribution will approach a normal or Gaussian distribution (a bell curve). Data may, over a long enough period, approach that distribution, but the reality is that the biases, variance, and constraints of the everyday world mean that this assumption is questionable at best. Because of the nature of online data collection, be it biased visitor entry, limited catalogs, or constrained numeric outcomes, these assumptions may never really hold. This does not mean that this, or any model, is completely worthless, but it does mean that you cannot blindly follow these tools, even as a deciding factor between hypotheses.
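One rough way to see how far online data can sit from a bell curve is to look at skewness, which is zero for a perfectly symmetric distribution. The sketch below simulates a hypothetical revenue-per-visitor metric; the 3% purchase rate, the lognormal order values, and the 200-visitor sample size are all assumptions chosen for illustration, not measurements from any real site.

```python
import random
import statistics

def sample_skewness(values):
    """Mean cubed z-score (population form); 0 for symmetric data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical revenue per visitor: most visitors spend nothing,
# and the few who buy have right-skewed order values.
revenue = [
    rng.lognormvariate(4.0, 0.8) if rng.random() < 0.03 else 0.0
    for _ in range(20_000)
]

# Averages over small samples of visitors, as an early test readout would see.
small_sample_means = [
    statistics.fmean(rng.sample(revenue, 200)) for _ in range(2_000)
]

print(f"skewness of raw revenue:       {sample_skewness(revenue):.2f}")
print(f"skewness of 200-visitor means: {sample_skewness(small_sample_means):.2f}")
```

Averaging pulls the numbers toward a bell curve, but with samples this small the means stay lopsided, so a test that assumes normality is working from a shakier footing than its confidence number suggests.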

But models are not restricted to the testing world. In the analytics community, everything from attribution models to media mix modeling systems is becoming all the rage. The sophistication of these models ranges from one-time basic models to large-scale complex machine learning systems, but all of them have limitations that require you to keep a close eye on the context of their use. Even in the most advanced and relevant uses, it is important to note that the assumptions and the model that you used need to be updated and changed over time. The nature of online data collection means that so many variables impact the bias and distribution of your users that any model used as a one-time fix will almost immediately lose value. Predictive models can have an amazing impact on your business, but they can also lead you astray if you do not keep a watchful eye on the relevance of those models as the world they represent changes. The only true way to ensure value over any period of time is to keep updating your models and to incorporate learned behavior into them and their usage.
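Keeping a watchful eye on a model can start with something very simple: compare how wrong the model is now with how wrong it was when you built it. The following is a minimal sketch under stated assumptions. The weekly numbers are invented, and `needs_refresh` and its `tolerance` of 1.25 are hypothetical choices of mine, not a standard rule.

```python
from statistics import fmean

def mean_abs_error(predicted, actual):
    """Average absolute gap between what the model said and what happened."""
    return fmean(abs(p - a) for p, a in zip(predicted, actual))

def needs_refresh(baseline_error, recent_predicted, recent_actual, tolerance=1.25):
    """Flag the model when recent error exceeds the baseline by `tolerance` times."""
    recent_error = mean_abs_error(recent_predicted, recent_actual)
    return recent_error > tolerance * baseline_error, recent_error

# Hypothetical weekly conversion-rate forecasts (%) versus observed values.
baseline_error = mean_abs_error([3.1, 3.0, 3.2, 3.1], [3.0, 3.1, 3.2, 3.0])
stale, recent_error = needs_refresh(
    baseline_error,
    recent_predicted=[3.1, 3.1, 3.0, 3.2],
    recent_actual=[2.8, 2.6, 2.7, 2.5],
)
print(f"baseline MAE = {baseline_error:.3f}, "
      f"recent MAE = {recent_error:.3f}, refresh = {stale}")
```

A check like this does not tell you why the model stopped fitting, only that the world it was built on has moved. The threshold itself is a judgment call and should be revisited just like the model.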

The other limitation to keep in mind is that you start a model with only correlative information, often coming from one or more analytics solutions. This gives you a great start, but like all other uses of this information, it lacks vital information that affects its value. The most important part of using a model like this is to understand that you must constantly update it, especially as you start collecting causal impact data that tells you about your ability to influence behavior. A one-time model may sound great, and may give you a short-term boost, but in the long term it becomes almost meaningless unless you keep the model relevant and the data focused on the efficiency of outcome. The world is not static, nor should your use of approximations for that world be static.

This means that as you start to leverage any model, you must make sure that you have members of your team who understand the nature of the data and how best to leverage it. You may not need a full-time statistician, but you should be spending resources on improving the skills of your current team so that they understand both the nature of the tools and their relevance to your business. A full-time statistician may actually be a detriment to your group, as you need to make sure that you are not solely focused on classroom statistics, but instead on the real and often complex world of your particular environment. Everything you do should be focused on the pragmatic maximization of value to your organization.

I cannot suggest strongly enough that you think about and explore ways to bring models into your practices, and that you start to leverage their power to stop opinion-based decision making. That being said, if you are to get value from these tools, you must understand both sides of the coin, and make sure that you keep your use of the models as relevant and powerful as their original intent. Never stop growing your understanding of your site, your users, and the efficiency of change, but keep that focus not only on your organization, but also on each tool you leverage to achieve your goals.