Earlier this month, Google continued its efforts to make searching the web a better experience by introducing Google Preview. This feature allows you to mouse over the links in natural search results to see a small snapshot of the landing page so you can check it out before you click. Currently, Google Preview displays only some of the time on search results pages, but as someone who often clicks through many search results (and often wastes time in the process) to find what I’m looking for, I definitely appreciate this update.

However, it wasn’t long before web analysts began to notice that the Google Web Preview bot executes JavaScript and, therefore, causes Page Views and Visits to appear in SiteCatalyst, Discover, and other analytics tools. We have also noticed this, and have put together a bit of research and thought on how to approach this change as it relates to your data in the Online Marketing Suite.

A Few Facts

At Adobe, we like to make rational decisions based on data. Fortunately, we have lots of it. So, when we began to hear about Google Preview, and to notice its effects in our own report suites, we began to study its effects on our customers’ data. Overall, we have seen, on average, less than a 1% increase in page view traffic due to Google Preview. Because each visit from the Google Preview bot often includes many page views (more than a human user would likely perform), the increase in visits is even smaller. (Of course, this may vary from site to site.) This means that Google Preview is highly unlikely to affect your analytics data in SiteCatalyst and other tools in any significant way. In fact, the traffic bump, though interesting, is largely irrelevant.

Here’s an example that I can share, because it comes from the data for this very blog. Take a look at our Page Views for 1–22 November 2010. (FYI, Google Preview launched on or around 8 Nov 2010.)

Page Views for the Omniture Blogs stay steady, even after the release of Google Preview.

Nothing wild at all, although we know Google Preview has been hitting these blogs because we have seen it both in the Browsers report and in the raw data for the blog.

Here is what the Browsers report looks like for Safari 3.1 (which is one way to detect the Google Preview bot reported in SiteCatalyst and Discover):

There is a definite 'spike' in traffic for Safari 3.1, but it represents less than a 1% increase overall.

Wow! A huge spike in traffic for Safari 3.1 (looking at the Page Views metric) beginning on exactly 8 Nov 2010 (the day of the Google Preview launch). But as you can see on the Y-axis, even at its high point, Google Preview accounts for less than 1% of overall site traffic.

Another observation is that the Google Preview spider is a good netizen inasmuch as it identifies itself clearly, using a user-agent string like the following:

Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
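For anyone who has access to raw user-agent strings (in server logs, for example), here is a minimal Python sketch of how this self-identification could be used to flag Preview hits. The function name and the second sample string are our own illustrative assumptions, not part of any Adobe or Google API; the only detail taken from the string above is the "Google Web Preview" marker.

```python
# Marker taken from the user-agent string quoted above. Everything else in this
# sketch (names, the second sample string) is a hypothetical illustration.
PREVIEW_MARKER = "google web preview"


def is_google_web_preview(user_agent):
    """Return True if the user-agent string contains the Google Web Preview marker."""
    if not user_agent:
        return False
    # Compare case-insensitively in case the casing of the marker varies.
    return PREVIEW_MARKER in user_agent.lower()


preview_ua = (
    "Mozilla/5.0 (en-us) AppleWebKit/525.13 "
    "(KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13"
)
# Assumed example of an ordinary Safari 3.1 browser, for contrast.
human_ua = (
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-us) "
    "AppleWebKit/525.13 (KHTML, like Gecko) Version/3.1 Safari/525.13"
)

print(is_google_web_preview(preview_ua))
print(is_google_web_preview(human_ua))
```

A plain substring check is used here instead of a full user-agent parser because the marker is all we know for certain about the string; if Google changes the format, the check would need revisiting.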

A third important fact has to do with our approach to sudden changes like this one, especially during the last two months of each year. You may be wondering why we can’t just exclude the Page Views coming from Google Web Preview across the board, or throw together a setting in the Admin Console to do this. It’s a completely reasonable question. This is, of course, the most important time of year for online retailers, many of whom are making business-critical decisions based on data they are getting in real time from the suite. It isn’t a good time for us to make significant changes to our platform or to our production user interfaces, because any unexpected changes or issues introduced now could prove tremendously detrimental to our customers. We refuse to let that happen. This is why we (like most enterprise SaaS offerings) do not risk a bad experience by introducing new functionality within the product, even if it has been through a rigorous QA process, during the final six weeks of the calendar year.

So, What Should I Do About It?

Over the past couple of weeks, I have spent a significant amount of time discussing this with members of our customer community, Product Managers, and other analysts. The answer that many suggested to the question posed above (so, what should I do about it?) was often this: nothing. As described above, most users will see a minimal, barely detectable spike in traffic that will not throw off conversion rates, bounce rates, or general trends in traffic. However, we certainly agree with the idea that, all things being equal, it would be better to have only human data present when doing analysis. So, for those of you who want to remove Google Preview data from your report suites, here are a few things to do.

You can begin to figure out how Google Preview is affecting your data within SiteCatalyst or Discover. In SiteCatalyst, go to the Visitor Profile > Technology > Browsers report for November 2010. Change the metric you are viewing to Visits or Visitors (or Page Views, if you have Discover or a custom event representing Page Views). Now switch to trended view, and change the items displayed to include Safari 3.1. Take a look at the trend for this browser during November 2010, but more importantly, look at the percentage of total traffic as displayed in this report. In the overwhelming majority of cases, you will find that it isn’t much at all.
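If you export the Browsers report and want to run the same percentage check outside the interface, the arithmetic looks something like the Python sketch below. The browser names and numbers are made up purely for illustration (they are not real report data), and the function name is our own.

```python
from typing import Mapping


def browser_share(page_views_by_browser: Mapping[str, int], browser: str) -> float:
    """Return one browser's percentage of total page views (0.0 if there is no traffic)."""
    total = sum(page_views_by_browser.values())
    if total == 0:
        return 0.0
    return 100.0 * page_views_by_browser.get(browser, 0) / total


# Hypothetical page view counts for one day, keyed by browser as shown in the report.
report = {
    "Internet Explorer 8.0": 52000,
    "Firefox 3.6": 38000,
    "Chrome 7.0": 9400,
    "Safari 3.1": 600,
}

share = browser_share(report, "Safari 3.1")
print(f"Safari 3.1 share of page views: {share:.2f}%")
```

With these invented numbers, Safari 3.1 comes to well under 1% of page views, which is the kind of result most sites should expect to see.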

However, if you are still not comfortable with what you see, I am pleased to report that a VISTA solution is available (for post-collection data alteration/manipulation) to exclude all traffic from Google Preview. You can have this VISTA rule applied to your report suites by contacting your Account Management team. Again, we do not expect that everyone will want or need to apply this VISTA rule, but we are making it available to those who are especially concerned by Google Preview after checking its effects on their data.

On that note, one final thing: We are concurrently pursuing a long-term solution to these questions, and to any other questions about spider and bot filtering in the Online Marketing Suite. A comprehensive solution is on our product roadmap and currently under development, and you can follow its progress in the Idea Exchange. Yet again, you voted, and we listened.

As always, if you have any questions about anything in this post, or about anything else related to the Adobe Online Marketing Suite, please leave a comment here or contact me on Twitter and I’ll do my best to get you the information that you need.

4 comments
Alec Cochrane

Hi Guys, I'm way out of date on this, but I got sent here by client care so I said I'd put in an update. In SiteCatalyst the Web Preview for one of our clients was appearing as Chrome v9.0 with an OS of Linux, so something may have changed. Interestingly they also had a screen resolution of 1,024 x 1,024 so they were easy to spot as nobody else uses that resolution. It may have just been that client based on their page size mind you, but it seems too much of a coincidence. Cheers, Alec

Steve Fernandez

After a false start, I figured out my misstep in configuring the DW query. It takes a little bit of database work, but I found it possible to use this to culminate what my visitors are doing on Google that may not be translating as well on my site; specifically searching via Google. I'll just hit the highlights if anyone is curious to attempt this kind of analysis: DW report pulling out: IP, Browser, Pages, Referrers, Page Views, Visits and Search Keywords - All. This gets you the basic data if you filter the DW segment by "safari 3.1" as a pageview function. If you want to get crazy, you can omit the segmentation and use the IP data to see the step of Preview-to-Site. Just use a temp table to match on the IP and maybe toss any IPs with only 1 row, or use another temp table to look for IPs that don't have something other than "safari 3.1" as a Browser. The results I got were very interesting. It's both somewhat revealing on what was searched for vs what Google gave as a link, and what my customers were looking for. I've only just completed my first report doing this and still have analysis to do. I really wish we had access to the user-agent string in DW so I could have more confidence in "seeing" the activity from Google Preview though.

Steve Fernandez

Thank you for the info. I was wondering about this very thing. Would you be able or willing to provide any additional information about how to dig into that data? I'm curious to compare what I can see from the data about how visitors are using Google Preview to look at pages vs what they actually do, click & visit, on my site. I could see some opportunities for optimization from this. I've tried to do some digging via Data Warehouse, but all of my queries using a setting of Browser = "safari 3.1" have come back empty, even though I first verified that DW had entries of "safari 3.1" for the day I examined.

Ben Gaines

Steve, Great point about user-agent string in Data Warehouse. I completely agree that it would help in debugging a number of questions/issues. If you wouldn't mind, stop by the Idea Exchange (http://ideas.omniture.com) and submit this as an idea. In any case, I'm glad you were able to get the info you needed (at least mostly) and do what sounds like some really interesting analysis based on Google Preview data! Thanks, Ben