Earlier this month, Google continued its efforts to make searching the web a better experience by introducing Google Preview. This feature allows you to mouse over the links in natural search results to see a small snapshot of the landing page so you can check it out before you click. Currently, Google Preview only displays some of the time on search results pages, but as someone who often clicks through many search results (often wasting time in the process) to find what I’m looking for, I definitely appreciated this update.

However, it wasn’t long before web analysts began to notice that the Google Web Preview bot executes JavaScript and, therefore, causes Page Views and Visits to appear in SiteCatalyst, Discover, and other analytics tools. We have also noticed this, and have put together a bit of research and thought on how to approach this change as it relates to your data in the Online Marketing Suite.

A Few Facts

At Adobe, we like to make rational decisions based on data. Fortunately, we have lots of it. So, when we began to hear about Google Preview, and to notice it in our own report suites, we began to study its effects on our customers’ data. Overall, we have seen, on average, less than a 1% increase in page view traffic due to Google Preview. Because each visit from the Google Preview bot often includes many page views (more than a human user would likely perform), the increase in visits is even smaller. (Of course, this may vary from site to site.) This means that Google Preview is highly unlikely to affect your analytics data in SiteCatalyst and other tools in any significant way. In fact, the traffic bump, though interesting, is largely irrelevant.

Here’s an example that I can share, because it comes from the data for this very blog. Take a look at our Page Views for 1-22 November 2010. (FYI, Google Preview dropped on/around 8 Nov 2010.)

Page Views for the Omniture Blogs stay steady, even after the release of Google Preview.

Nothing wild at all, although we know Google Preview has been hitting these blogs because we have seen it both in the Browsers report and in the raw data for the blog.

Here is what the Browsers report looks like for Safari 3.1 (which is one way to detect the Google Preview bot reported in SiteCatalyst and Discover):

There is a definite 'spike' in traffic for Safari 3.1, but it represents less than a 1% increase overall.

Wow! That is a huge spike in Page Views for Safari 3.1, beginning on exactly 8 Nov 2010 (the day Google Preview launched). But as you can see on the Y-axis, even at its high point, Google Preview accounts for less than 1% of overall site traffic.

Another observation is that the Google Preview spider is a good netizen inasmuch as it identifies itself clearly, using a user-agent string like the following:

Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
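
If you have access to raw user-agent strings (in your own server logs, for example), a check for this bot can be as simple as looking for the "Google Web Preview" token. Here is a minimal Python sketch of that idea. The function name and the second sample string are made up for illustration, and this is not a description of how any Adobe product filters traffic.

```python
# Token taken from the user-agent string quoted above.
PREVIEW_TOKEN = "Google Web Preview"


def is_google_web_preview(user_agent):
    """Return True if the user-agent string contains the preview token."""
    if not user_agent:
        return False
    return PREVIEW_TOKEN.lower() in user_agent.lower()


# Hypothetical sample: only the first string comes from this post.
sample_user_agents = [
    "Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; "
    "Google Web Preview) Version/3.1 Safari/525.13",
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/525.13 "
    "(KHTML, like Gecko) Version/3.1 Safari/525.13",
]

# Keep only the hits that do not identify themselves as the preview bot.
human_hits = [ua for ua in sample_user_agents if not is_google_web_preview(ua)]
```

Matching on the token rather than on "Safari 3.1" avoids throwing out real visitors who happen to use that browser version.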

A third important fact has to do with our approach to sudden changes like this one, especially during the last two months of each year. You may be wondering why we can’t just exclude the Page Views coming from Google Web Preview, across the board, or throw together a setting in the Admin Console to do this. It’s a completely reasonable question. This is, of course, the most important time of year for online retailers, many of whom are making business-critical decisions based on data they are getting in real-time from the suite. It isn’t a good time for us to make significant changes to our platform or to our production user interfaces, because any unexpected changes or issues introduced now could prove tremendously detrimental to our customers. We refuse to let that happen. This is why we (like most enterprise SaaS offerings) do not risk a bad experience by introducing new functionality within the product (even if it has been through a rigorous QA process) during the final six weeks of the calendar year.

So, What Should I Do About It?

Over the past couple of weeks, I have spent a significant amount of time discussing this with members of our customer community, Product Managers, and other analysts. The answer that many suggested to the question posed above (so, what should I do about it?) was often this: nothing. As described above, most users will see a minimal, nearly undetectable bump in traffic that will not throw off conversion rates, bounce rates, or general trends in traffic. However, we certainly agree with the idea that, all things being equal, it would be better to have only human data present when doing analysis. So, for those who want to remove Google Preview data from your report suites, here are a few things to do.

You can begin to figure out how Google Preview is affecting your data within SiteCatalyst or Discover. In SiteCatalyst, go to the Visitor Profile > Technology > Browsers report for November 2010. Change the metric you are viewing to Visits or Visitors (or Page Views, if you have Discover or a custom event representing Page Views). Now switch to trended view, and change the items displayed to include Safari 3.1. Take a look at the trend for this browser during November 2010, but more importantly, look at the percentage of total traffic as displayed in this report. In the overwhelming majority of cases, you will find that it isn’t much at all.
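
If you export the numbers from that report, the percentage check is simple arithmetic. The Python sketch below uses made-up figures purely to illustrate the calculation. The function name and both counts are hypothetical, so substitute the values from your own Browsers report.

```python
def share_of_traffic(browser_count, total_count):
    """Return a browser's share of total traffic as a percentage."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * browser_count / total_count


# Made-up numbers for illustration only; use values from your own report.
safari_31_page_views = 450
total_page_views = 60000

share = share_of_traffic(safari_31_page_views, total_page_views)
print(f"Safari 3.1 share of Page Views: {share:.2f}%")
```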

However, if you are still not comfortable with what you see, I am pleased to report that a VISTA solution is available (for post-collection data alteration/manipulation) to exclude all traffic from Google Preview. You can have this VISTA rule applied to your report suites by contacting your Account Management team. Again, we do not expect that everyone will want or need to apply this VISTA rule, but we are making it available to those who are especially concerned by Google Preview after checking its effects on their data.

On that note, one final thing: We are concurrently pursuing a long-term solution to these questions, and any other questions about spider and bot filtering in the Online Marketing Suite. A comprehensive solution is on our product roadmap, currently under development, and you can follow its progress in the Idea Exchange. Yet again, you voted, and we listened.

As always, if you have any questions about anything in this post, or about anything else related to the Adobe Online Marketing Suite, please leave a comment here or contact me on Twitter and I’ll do my best to get you the information that you need.

Alec Cochrane

Hi Guys, I'm way out of date on this, but I got sent here by client care so I said I'd put in an update. In SiteCatalyst the Web Preview for one of our clients was appearing as Chrome v9.0 with an OS of Linux, so something may have changed. Interestingly they also had a screen resolution of 1,024 x 1,024 so they were easy to spot as nobody else uses that resolution. It may have just been that client based on their page size mind you, but it seems too much of a coincidence. Cheers, Alec

Steve Fernandez

After a false start, I figured out my misstep in configuring the DW query. It takes a little bit of database work, but I found it possible to use this to piece together what my visitors are doing on Google that may not be translating as well on my site; specifically searching via Google. I'll just hit the highlights if anyone is curious to attempt this kind of analysis: a DW report pulling out IP, Browser, Pages, Referrers, Page Views, Visits and Search Keywords - All. This gets you the basic data if you filter the DW segment by "safari 3.1" as a pageview function. If you want to get crazy, you can omit the segmentation and use the IP data to see the step of Preview-to-Site. Just use a temp table to match on the IP and maybe toss any IPs with only 1 row, or use another temp table to look for IPs that don't have something other than "safari 3.1" as a Browser. The results I got were very interesting. It's somewhat revealing both on what was searched for vs what Google gave as a link, and on what my customers were looking for. I've only just completed my first report doing this and still have analysis to do. I really wish we had access to the user-agent string in DW so I could have more confidence in "seeing" the activity from Google Preview though.
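
For anyone who wants to try the IP-matching step Steve describes outside of a database, here is a rough Python sketch of the same idea: group rows by IP and keep only the IPs that show both a "safari 3.1" row and at least one other browser. The field names ("ip", "browser") and the sample rows are hypothetical, not an actual Data Warehouse export format, and the sketch does not verify that matching on IP reliably links a preview to a later visit.

```python
from collections import defaultdict

# Browser value used in the comment above to spot preview traffic.
PREVIEW_BROWSER = "safari 3.1"


def preview_to_site_ips(rows):
    """Return IPs seen with the preview browser and at least one other browser."""
    browsers_by_ip = defaultdict(set)
    for row in rows:
        browsers_by_ip[row["ip"]].add(row["browser"].strip().lower())
    return {
        ip
        for ip, browsers in browsers_by_ip.items()
        if PREVIEW_BROWSER in browsers and len(browsers) > 1
    }


# Made-up rows for illustration (documentation-range IP addresses).
rows = [
    {"ip": "192.0.2.10", "browser": "Safari 3.1"},
    {"ip": "192.0.2.10", "browser": "Firefox 3.6"},
    {"ip": "192.0.2.20", "browser": "Safari 3.1"},
    {"ip": "192.0.2.30", "browser": "Internet Explorer 8"},
]

matched = preview_to_site_ips(rows)
```

IPs that only ever appear as "safari 3.1", or that never appear as "safari 3.1", are dropped, which mirrors the two temp-table filters in the comment.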

Steve Fernandez

Thank you for the info. I was wondering about this very thing. Would you be able or willing to provide any additional information about how to dig into that data? I'm curious to compare what I can see from the data about how visitors are using Google Preview to look at pages vs what they actually do (click and visit) on my site. I could see some opportunities for optimization from this. I've tried to do some digging via Data Warehouse, but all of my queries using a setting of Browser = "safari 3.1" have come back empty, even though I first verified that DW had entries of "safari 3.1" for the day I examined.

Ben Gaines

Steve, Great point about user-agent string in Data Warehouse. I completely agree that it would help in debugging a number of questions/issues. If you wouldn't mind, stop by the Idea Exchange (http://ideas.omniture.com) and submit this as an idea. In any case, I'm glad you were able to get the info you needed (at least mostly) and do what sounds like some really interesting analysis based on Google Preview data! Thanks, Ben