Earlier this month, Google continued its efforts to make searching the web a better experience by introducing Google Preview. This feature lets you mouse over the links in natural search results to see a small snapshot of the landing page, so you can check it out before you click. Currently, Google Preview displays only some of the time on search results pages, but as someone who often clicks through many search results (and often wastes time in the process) to find what I’m looking for, I definitely appreciate this update.

However, it wasn’t long before web analysts began to notice that the Google Web Preview bot executes JavaScript and therefore causes Page Views and Visits to appear in SiteCatalyst, Discover, and other analytics tools. We have noticed this too, and have put together some research and thoughts on how to approach this change as it relates to your data in the Online Marketing Suite.

A Few Facts

At Adobe, we like to make rational decisions based on data. Fortunately, we have lots of it. So, when we began to hear about Google Preview, and to notice its effects in our own report suites, we began to study its effect on our customers’ data. Overall, we have seen, on average, less than a 1% increase in page view traffic due to Google Preview. Because each visit from the Google Preview bot often includes many page views (more than a human user would likely generate), the increase in visits is even smaller. (Of course, this may vary from site to site.) This means that Google Preview is highly unlikely to affect your analytics data in SiteCatalyst and other tools in any significant way. In fact, the traffic bump, though interesting, is largely irrelevant.

Here’s an example that I can share, because it comes from the data for this very blog. Take a look at our Page Views for 1–22 November 2010. (FYI, Google Preview launched on or around 8 Nov 2010.)

Page Views for the Omniture Blogs stay steady, even after the release of Google Preview.

Nothing wild at all, although we know Google Preview has been hitting these blogs because we have seen it both in the Browsers report and in the raw data for the blog.

Here is what the Browsers report looks like for Safari 3.1 (which is one way to detect the Google Preview bot as reported in SiteCatalyst and Discover):

There is a definite 'spike' in traffic for Safari 3.1, but it represents less than a 1% increase overall.

Wow! A huge spike in traffic for Safari 3.1 (looking at the Page Views metric) beginning on exactly 8 Nov 2010 (the day of the Google Preview launch). But as you can see on the Y-axis, even at its high point, Google Preview accounts for less than 1% of overall site traffic.

Another observation is that the Google Preview spider is a good netizen inasmuch as it identifies itself clearly, using a user-agent string like the following:

Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
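
If you have access to raw user-agent strings (for example, in web server logs), the token in the string above is straightforward to test for. The following is a minimal Python sketch, not an Adobe or Google API: the function name `is_google_web_preview` is hypothetical, and it assumes the bot keeps sending the literal text "Google Web Preview" in its user agent.

```python
def is_google_web_preview(user_agent: str) -> bool:
    """Return True if the user-agent string carries the Google Web Preview token."""
    # Assumption: the bot identifies itself with this literal token, as in the
    # example string above. The comparison is case-insensitive to tolerate
    # small variations in capitalization.
    return "google web preview" in (user_agent or "").lower()


if __name__ == "__main__":
    preview_ua = (
        "Mozilla/5.0 (en-us) AppleWebKit/525.13 "
        "(KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13"
    )
    print(is_google_web_preview(preview_ua))
```

A substring check is used rather than a strict pattern so that changes elsewhere in the string (browser version, locale) do not break detection.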

A third important fact has to do with our approach to sudden changes like this one, especially during the last two months of each year. You may be wondering why we can’t just exclude the Page Views coming from Google Web Preview across the board, or throw together a setting in the Admin Console to do this. It’s a completely reasonable question. This is, of course, the most important time of year for online retailers, many of whom are making business-critical decisions based on data they are getting in real time from the suite. It isn’t a good time for us to make significant changes to our platform or to our production user interfaces, because any unexpected changes or issues introduced now could prove tremendously detrimental to our customers. We refuse to let that happen. This is why we (like most enterprise SaaS offerings) do not risk a bad experience by introducing new functionality within the product (even if it has been through a rigorous QA process) during the final six weeks of the calendar year.

So, What Should I Do About It?

Over the past couple of weeks, I have spent a significant amount of time discussing this with members of our customer community, Product Managers, and other analysts. The answer that many suggested to the question posed above (so, what should I do about it?) was often this: nothing. As described above, most users will see a minimal, barely detectable spike in traffic that will not throw off conversion rates, bounce rates, or general trends in traffic. However, we certainly agree with the idea that, all things being equal, it would be better to have only human data present when doing analysis. So, if you want to remove Google Preview data from your report suites, here are a few things to do.

You can begin to figure out how Google Preview is affecting your data within SiteCatalyst or Discover. In SiteCatalyst, go to the Visitor Profile > Technology > Browsers report for November 2010. Change the metric you are viewing to Visits or Visitors (or Page Views, if you have Discover or a custom event representing Page Views). Now switch to trended view, and change the items displayed to include Safari 3.1. Take a look at the trend for this browser during November 2010, but more importantly, look at the percentage of total traffic as displayed in this report. In the overwhelming majority of cases, you will find that it isn’t much at all.
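
The percentage check described above is simple arithmetic: divide the Safari 3.1 figure by the total for the same period and metric. A small Python sketch follows; the function name and the numbers are purely hypothetical placeholders, not data from any report suite.

```python
def share_of_total(segment_count: int, total_count: int) -> float:
    """Return segment_count as a percentage of total_count."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * segment_count / total_count


if __name__ == "__main__":
    # Hypothetical figures for illustration only; substitute the values
    # you read from your own Browsers report.
    safari_31_page_views = 450
    all_page_views = 60_000
    pct = share_of_total(safari_31_page_views, all_page_views)
    print(f"Safari 3.1 share of Page Views: {pct:.2f}%")
```

If the resulting share sits well under 1%, as in the blog data described earlier, the effect on trends and conversion rates is likely to be negligible.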

However, if you are still not comfortable with what you see, I am pleased to report that a VISTA solution is available (for post-collection data alteration/manipulation) to exclude all traffic from Google Preview. You can have this VISTA rule applied to your report suites by contacting your Account Management team. Again, we do not expect that everyone will want or need to apply this VISTA rule, but we are making it available to those who are especially concerned by Google Preview after checking its effects on their data.

On that note, one final thing: we are concurrently pursuing a long-term solution to these questions, and to any other questions about spider and bot filtering in the Online Marketing Suite. A comprehensive solution is on our product roadmap and currently under development, and you can follow its progress in the Idea Exchange. Yet again, you voted, and we listened.

As always, if you have any questions about anything in this post, or about anything else related to the Adobe Online Marketing Suite, please leave a comment here or contact me on Twitter and I’ll do my best to get you the information that you need.

  • Steve Fernandez

    Thank you for the info. I was wondering about this very thing. Would you be able or willing to provide any additional information about how to dig into that data? I’m curious to compare what I can see from the data about how visitors are using Google Preview to look at pages vs. what they actually do (click and visit my site). I could see some opportunities for optimization from this. I’ve tried to do some digging via Data Warehouse, but all of my queries using a setting of Browser = “safari 3.1” have come back empty, even though I first verified that DW had entries of “safari 3.1” for the day I examined.

  • Steve Fernandez

    After a false start, I figured out my misstep in configuring the DW query. It takes a little bit of database work, but I found it possible to use this to piece together what my visitors are doing on Google that may not be translating as well on my site, specifically when searching via Google. I’ll just hit the highlights in case anyone is curious to attempt this kind of analysis:

    DW report pulling out: IP, Browser, Pages, Referrers, Page Views, Visits and Search Keywords - All

    This gets you the basic data if you filter the DW segment by “safari 3.1” as a pageview function. If you want to get crazy, you can omit the segmentation and use the IP data to see the step of Preview-to-Site. Just use a temp table to match on the IP, and maybe toss any IPs with only 1 row, or use another temp table to look for IPs that don’t have something other than “safari 3.1” as a Browser.
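
The IP-matching step described in this comment could be sketched roughly as follows in Python, in place of temp tables. This is only an illustration under stated assumptions: the row layout and the field names `ip` and `browser` are hypothetical (a real Data Warehouse export may label its columns differently), and whether preview and visitor IPs actually line up depends on your data.

```python
from collections import defaultdict

# Browser label the post associates with the Google Preview bot.
PREVIEW_BROWSER = "safari 3.1"


def preview_then_visit(rows):
    """Group rows by IP and keep IPs seen with both the preview browser and another browser.

    Each row is a dict with at least "ip" and "browser" keys (hypothetical names).
    """
    by_ip = defaultdict(list)
    for row in rows:
        by_ip[row["ip"]].append(row)

    matched = {}
    for ip, ip_rows in by_ip.items():
        browsers = {r["browser"].strip().lower() for r in ip_rows}
        # Keep only IPs that have the preview browser AND at least one other
        # browser; this also drops any IP that has only a single row.
        if PREVIEW_BROWSER in browsers and len(browsers) > 1:
            matched[ip] = ip_rows
    return matched


if __name__ == "__main__":
    # Made-up sample rows using documentation-range IP addresses.
    sample = [
        {"ip": "192.0.2.1", "browser": "Safari 3.1"},
        {"ip": "192.0.2.1", "browser": "Firefox 3.6"},
        {"ip": "192.0.2.2", "browser": "Safari 3.1"},
    ]
    print(sorted(preview_then_visit(sample)))
```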

    The results I got were very interesting. It’s both somewhat revealing on what was searched for vs what Google gave as a link, and what my customers were looking for. I’ve only just completed my first report doing this and still have analysis to do.

    I really wish we had access to the user-agent string in DW so I could have more confidence in “seeing” the activity from Google Preview though.

    • http://blogs.omniture.com/author/bgaines Ben Gaines

      Steve,

      Great point about user-agent string in Data Warehouse. I completely agree that it would help in debugging a number of questions/issues. If you wouldn’t mind, stop by the Idea Exchange (http://ideas.omniture.com) and submit this as an idea.

      In any case, I’m glad you were able to get the info you needed (at least mostly) and do what sounds like some really interesting analysis based on Google Preview data!

      Thanks,
      Ben

  • http://www.whencanistop.com Alec Cochrane

    Hi Guys,

    I’m way out of date on this, but I got sent here by client care so I said I’d put in an update.

    In SiteCatalyst the Web Preview for one of our clients was appearing as Chrome v9.0 with an OS of Linux, so something may have changed. Interestingly, they also had a screen resolution of 1,024 x 1,024, so they were easy to spot as nobody else uses that resolution. It may have just been that client based on their page size, mind you, but it seems too much of a coincidence.

    Cheers,
    Alec