In my previous article, I compared web data to water. Just like we can't survive for an extended period of time without safe drinking water, data-driven businesses can't survive without good data. In this second part, I will focus on different considerations related to data validation.

Nobody's Perfect

If you have a pre-teen daughter, you'll no doubt have the Hannah Montana song "Nobody's Perfect" burned into your cerebral cortex. I have more than one tween daughter, so just imagine its effect on my brain. Well … no implementation is ever perfect either. In fact, seeking perfection in your implementation can be a dangerous goal. You want your web data quality to be as complete and accurate as possible, but perfection or near-perfection can be costly to achieve.

The higher the level of accuracy required, the greater the investment of time and effort needed from the business to calibrate its implementation. The targeted level of accuracy may require a quick cost-benefit analysis. Is your organization willing to invest more hours in calibrating your implementation and internal systems to gain additional benefits (e.g., executive confidence, user adoption, external reporting)? In some cases, small incremental changes in the margin of error can be a big deal. In other cases, they can result in diminishing returns, delaying or wasting time that could be spent on analysis, testing, or other high-value activities.

In most cases, you're dealing with both an explainable and an unexplainable margin of error. You typically want to reduce high amounts of unexplained margin of error. If you have an explainable margin of error, you have a couple of options: close the gap or acknowledge it. For example, a retailer knows that its real-time SiteCatalyst data is consistently 10% higher in terms of revenue than its backend system. Its backend system removes fraudulent orders and product returns from its final revenue numbers. In this case, the retailer can decide to close the gap between SiteCatalyst and its backend system by feeding this post-sale data back into SiteCatalyst. Or the retailer can acknowledge the gap and move forward with optimizing its website and campaigns based on the understanding that its web data doesn't factor out fraudulent and returned orders.
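The retailer's two options can be sketched with a few lines of arithmetic. The figures below are purely hypothetical; the point is that an explainable gap can either be closed with an adjustment feed or documented as an expected offset:

```python
# Hypothetical illustration of an explainable margin of error:
# real-time analytics revenue runs consistently above the backend
# because fraud and returns are only removed downstream.

sitecatalyst_revenue = 110_000.00   # real-time, pre-adjustment (hypothetical)
backend_revenue = 100_000.00        # fraud and returns already removed

gap = sitecatalyst_revenue - backend_revenue
gap_pct = 100 * gap / backend_revenue

# Option 1: close the gap by feeding post-sale adjustments back in
fraud_and_returns = 10_000.00       # hypothetical adjustment feed
adjusted_revenue = sitecatalyst_revenue - fraud_and_returns

# Option 2: acknowledge the gap and document the expected offset
print(f"Explainable gap: {gap_pct:.1f}%")            # 10.0%
print(f"Adjusted revenue: {adjusted_revenue:,.0f}")  # 100,000
```

Either option is defensible; what matters is that the offset is understood and consistent, so the data can still be acted on.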

Jim Novo, Avinash Kaushik, and others have advocated for precision over accuracy. Precision focuses on reproducibility and repeatability, whereas accuracy focuses on obtaining the exact number. As long as your web data consistently falls within an acceptable threshold of accuracy, your business should be able to act on the data's directional insights with confidence.

Two-fold responsibility

When it comes to data validation, you need to focus on two areas. First, were the original business requirements successfully met by the implementation? Hopefully, you have a measurement strategy or business requirements document in place that you can refer to. Your team needs to verify that the desired reports were set up properly and that they're collecting data. Do you have all of the right buckets in place? Are the buckets capturing anything?

Second, is the data in the reports sound and accurate (precise)? You might be initially thrilled to see a comprehensive list of custom reports in the SiteCatalyst interface until you start looking more closely at the actual data. Is the data flowing into the buckets any good? At this stage, data validation should involve a web analyst who can bring a business perspective to the reports and determine whether or not the data is drinkable. Frequently, the data validation responsibility falls solely to technical QA staff. In their minds, if the JavaScript code executes fine and doesn't throw any errors, it passes. However, what about the actual values in the report? This is where a web analyst who is knowledgeable about the Omniture tool can help.

For example, to an untrained, inexperienced, or unfamiliar eye, the collected data in the Pages report might look fine: "we're collecting a bunch of page data, and it's nicely formatted. Booyah!" A trained eye will spot the three or four instances of the same home page in the Pages report, which is a cause for concern.

Bad interpretation, not bad data

Once you've successfully validated your implementation, your job is not done. A few days, weeks, or months after launch, you may run into concerns from different end users that the data seems too low or too high. In many cases, the data is actually sound, but it is just being interpreted incorrectly or simply misunderstood.

For example, if you serialized a "lead completed" event so it was only counted once per visit and used that metric in your new lead conversion rate, it would be much lower than a non-serialized "lead completed" event, which fires every time a visitor lands on a particular confirmation page. The serialized approach may be a better indicator of your true conversion rate, but it may also be different from how you were tracking it previously.
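The difference between the two counting methods can be sketched on a toy hit log (the visit IDs and page names below are invented for illustration):

```python
# Hypothetical hit log: (visit_id, page) pairs. A visitor who reloads
# the confirmation page fires the raw event twice in the same visit.
hits = [
    ("visit-1", "confirmation"),
    ("visit-1", "confirmation"),  # reload double-fires the raw event
    ("visit-2", "confirmation"),
    ("visit-3", "other-page"),
]

visits = len({v for v, _ in hits})  # 3 visits total

# Non-serialized: the event counts every time the page loads
non_serialized = sum(1 for _, p in hits if p == "confirmation")

# Serialized: the event counts at most once per visit
serialized = len({v for v, p in hits if p == "confirmation"})

print(non_serialized / visits)  # 1.0 -- looks like a 100% conversion rate
print(serialized / visits)      # ~0.67 -- the lower, truer rate
```

Neither number is "wrong"; they simply answer different questions, which is exactly why a change in serialization needs to be communicated to report consumers.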

Rather than jumping to the conclusion that the implementation is automatically flawed, seek to understand how the data is collected and what the data really means. Your assumptions about the data may be wrong. If you are introducing changes to the way your data is collected or how KPIs are calculated, communicate and explain those changes to the organization so they understand that the data isn't bad; it's just different (and hopefully better).

Good data today … but what about tomorrow?

After going through a thorough data validation effort, what ensures that it remains accurate (precise) and complete in 3 months? 6 months? 18 months? There are many internal and external factors that can spoil good data over time:

  • A partner fails to notify you of a significant change they made to your company's JS file
  • A new developer tags web pages without knowing the tagging standards
  • A marketing team doesn't include unique tracking codes in its email campaigns
  • An IT team adds several redirects to your website, which now interfere with your campaign tracking
  • Another IT team changes your CMS and your page naming goes awry
  • A third-party vendor doesn't understand how to set Omniture's conversion variables for your site within its hosted online application

From time to time, you may need to spot-check your implementation to ensure it's as accurate and precise as possible. You may even want to schedule regular six-month "dental check-ups" to ensure your site implementation stays clean. If your senior executives are extra sensitive about certain key data points, or if several moving pieces were required to achieve specific reports, you may need to monitor those reports more frequently to maintain your company's trust in the data. You can use SiteCatalyst's Alerts feature to notify you of significant changes in your KPIs related to these key parts of your site. Use alerts like a check engine light for your implementation.
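The "check engine light" idea reduces to a simple threshold rule. SiteCatalyst's Alerts feature handles this natively; the sketch below, with an invented helper and invented numbers, just shows the underlying logic:

```python
# Minimal "check engine light" for a KPI: flag it when today's value
# drifts too far from an expected baseline. The function name, inputs,
# and 25% threshold are all hypothetical choices for illustration.

def kpi_alert(today, baseline, threshold=0.25):
    """Return True when today's value deviates from baseline by more
    than the given fraction (e.g., 0.25 = 25%)."""
    if baseline == 0:
        return today != 0
    return abs(today - baseline) / baseline > threshold

print(kpi_alert(today=40, baseline=100))   # True: a ~60% drop, investigate
print(kpi_alert(today=95, baseline=100))   # False: normal variation
```

A tighter threshold on executive-sensitive KPIs and a looser one elsewhere mirrors the advice above about monitoring your most scrutinized reports more closely.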

I recently heard of an e-commerce team that stopped using its analytics reports for several months because it questioned the IT team's ability to accurately tag its web pages. Rather than settling for no data, these concerns should have been confronted head on. Clean and safe web data is the goal, so let's get proper training for the IT team and a data validation process in place. Shaken confidence in the data can't persist if you want your organization to be data-driven.

Your implementation needs to evolve

The completeness of the data over time is another issue. Most of the companies my consulting team works with are not static, and neither are their online properties.

  • New websites are being created
  • New online strategies are being formulated
  • New online marketing campaigns are being launched
  • New website features and technologies are being introduced
  • New online management teams are being formed

All of these factors can make an existing implementation feel incomplete to a company. Some clients like to blame the tool or implementation when they are not receiving the right data. Often, the implementation is simply guilty of standing still in a fast-paced, constantly changing business environment. A perfectly good implementation can be knocked out of alignment with the business needs of an organization when a website is redesigned, an online strategy shifts, or a new senior executive is introduced. It is like blaming our tailor-made suit for no longer fitting after we've gained or lost 30 pounds. We either need a significant alteration to the suit or an entirely new suit. Your implementation needs to evolve with your business. Don't forget to revisit your web measurement strategy if any major changes impact your organization or website.

Nobody's perfect
I gotta work it
Again and again
’Til I get it right

Sage advice from a fifteen-year-old. Thank you, Hannah Montana!

In the final post in this series, I will cover the importance of having accountability throughout the organization.