The skinny on SiteCatalyst and Google Preview
Earlier this month, Google continued its efforts to make searching the web a better experience by introducing Google Preview. This feature allows you to mouse over the links in natural search results to see a small snapshot of the landing page, so you can check it out before you click. Currently, Google Preview displays only some of the time on search results pages, but as someone who often finds himself clicking through many search results to find what I’m looking for, and often wasting time in the process, I definitely appreciated this update.
A Few Facts
At Adobe, we like to make rational decisions based on data. Fortunately, we have lots of it. So, when we began to hear about Google Preview, and to notice the effects in our own report suites, we began to study its effects on our customers’ data. Overall, we have seen, on average, less than a 1% increase in page view traffic due to Google Preview. Because each visit from the Google Preview bot often includes many page views (more than a human user would likely perform), the increase in visits is even smaller. (Of course, this may vary from site to site.) This means that Google Preview is highly unlikely to affect your analytics data in SiteCatalyst and other tools in any significant way. In fact, the traffic bump, though interesting, is largely irrelevant.
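To make the page views versus visits point concrete, here is a small worked sketch. Every number in it is hypothetical and chosen only for illustration; none of it comes from our customers' data. It simply shows that when bot visits contain more page views than human visits, the relative bump in visits ends up smaller than the relative bump in page views.

```python
# All figures below are made up for illustration; they are not measured values.
human_visits = 100_000
human_page_views_per_visit = 4

bot_visits = 200
bot_page_views_per_visit = 15  # assumed: bot visits are "deeper" than human ones

human_page_views = human_visits * human_page_views_per_visit
bot_page_views = bot_visits * bot_page_views_per_visit

# Relative increase caused by the bot, measured against human-only traffic.
page_view_increase = bot_page_views / human_page_views
visit_increase = bot_visits / human_visits

print(f"Page view increase: {page_view_increase:.2%}")
print(f"Visit increase:     {visit_increase:.2%}")
```

With these assumed inputs, the page view increase stays under 1% and the visit increase is smaller still, which matches the pattern described above.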
Here’s an example that I can share, because it comes from the data for this very blog. Take a look at our Page Views for 1–22 November 2010. (FYI, Google Preview dropped on/around 8 Nov 2010.)
Nothing wild at all, although we know Google Preview has been hitting these blogs because we have seen it both in the Browsers report and in the raw data for the blog.
Here is what the Browsers report looks like for Safari 3.1 (which is one way to detect the Google Preview bot reported in SiteCatalyst and Discover):
Wow! A huge spike in traffic for Safari 3.1 (looking at the Page Views metric) beginning on exactly 8 Nov 2010 (the day of the Google Preview launch). But as you can see on the Y-axis, even at its high point, Google Preview accounts for less than 1% of overall site traffic.
Another observation is that the Google Preview spider is a good netizen inasmuch as it identifies itself clearly, using a user-agent string like the following:
Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
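Because the bot announces itself, spotting it in your own logs or raw data can be as simple as a substring check on the user-agent. The sketch below is one minimal way to do that in Python. The function name `is_google_preview` and the choice of a case-insensitive match are my own assumptions, not anything built into SiteCatalyst.

```python
# Token taken from the user-agent string shown above.
GOOGLE_PREVIEW_TOKEN = "Google Web Preview"


def is_google_preview(user_agent: str) -> bool:
    """Return True if the user-agent string identifies the Google Preview bot.

    Hypothetical helper: a plain, case-insensitive substring match.
    """
    return GOOGLE_PREVIEW_TOKEN.lower() in user_agent.lower()


preview_ua = (
    "Mozilla/5.0 (en-us) AppleWebKit/525.13 "
    "(KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13"
)
# Illustrative Safari 3.1 user-agent without the Google Web Preview token.
human_ua = (
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_2; en-us) "
    "AppleWebKit/525.13 (KHTML, like Gecko) Version/3.1 Safari/525.13"
)

print(is_google_preview(preview_ua))
print(is_google_preview(human_ua))
```

Matching on the token rather than on "Safari 3.1" avoids flagging the (admittedly few) real people still browsing with that version.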
A third important fact has to do with our approach to sudden changes like this one, especially during the last two months of each year. You may be wondering why we can’t just exclude the Page Views coming from Google Web Preview, across the board, or throw together a setting in the Admin Console to do this. It’s a completely reasonable question. This is, of course, the most important time of year for online retailers, many of whom are making business-critical decisions based on data they are getting in real-time from the suite. It isn’t a good time for us to make significant changes to our platform or to our production user interfaces, because any unexpected changes or issues introduced now could prove tremendously detrimental to our customers. We refuse to let that happen. This is why we (like most enterprise SaaS offerings) do not risk a bad experience by introducing new functionality within the product (even if it has been through a rigorous QA process) during the final six weeks of the calendar year.
So, What Should I Do About It?
Over the past couple of weeks, I have spent a significant amount of time discussing this with members of our customer community, Product Managers, and other analysts. The answer that many suggested to the question posed above (so, what should I do about it?) was often this: nothing. As described above, most users will see a minimal, barely detectable spike in traffic that will not throw off conversion rates, bounce rates, or general trends in traffic. However, we certainly agree with the idea that, all things being equal, it would be better to have only human data present when doing analysis. So, for those of you who want to remove Google Preview data from your report suites, here are a few things to do.
You can begin to figure out how Google Preview is affecting your data within SiteCatalyst or Discover. In SiteCatalyst, go to the Visitor Profile > Technology > Browsers report for November 2010. Change the metric you are viewing to Visits or Visitors (or Page Views, if you have Discover or a custom event representing Page Views). Now switch to trended view, and change the items displayed to include Safari 3.1. Take a look at the trend for this browser during November 2010, but more importantly, look at the percentage of total traffic as displayed in this report. In the overwhelming majority of cases, you will find that it isn’t much at all.
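If you would rather run the same check against your own server logs or a raw data export, the idea carries over directly: count hits per day, count how many carry the Google Web Preview token, and look at the ratio. The following is a rough sketch under assumptions of my own. The `(day, user_agent)` input shape and the function name `preview_share_by_day` are hypothetical, and the sample rows are invented.

```python
from collections import defaultdict


def preview_share_by_day(hits):
    """Return {day: fraction of that day's hits made by Google Preview}.

    `hits` is an iterable of (day, user_agent) pairs, one pair per page view.
    This input format is an assumption made for the example.
    """
    totals = defaultdict(int)
    preview = defaultdict(int)
    for day, user_agent in hits:
        totals[day] += 1
        if "google web preview" in user_agent.lower():
            preview[day] += 1
    return {day: preview[day] / totals[day] for day in totals}


# Invented sample data: one page view per row.
sample_hits = [
    ("2010-11-07", "Mozilla/5.0 (Windows; U) Firefox/3.6"),
    ("2010-11-07", "Mozilla/5.0 (Macintosh; U) Safari/533.18"),
    ("2010-11-08", "Mozilla/5.0 (Windows; U) Firefox/3.6"),
    ("2010-11-08", "Mozilla/5.0 (en-us) AppleWebKit/525.13 "
                   "(KHTML, like Gecko; Google Web Preview) "
                   "Version/3.1 Safari/525.13"),
    ("2010-11-08", "Mozilla/5.0 (Macintosh; U) Safari/533.18"),
    ("2010-11-08", "Mozilla/5.0 (Windows; U) MSIE 8.0"),
]

shares = preview_share_by_day(sample_hits)
for day in sorted(shares):
    print(day, f"{shares[day]:.1%}")
```

The tiny sample exaggerates the share, of course. On real traffic, the point of the exercise is to see whether the daily percentage is large enough to matter for your analysis.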
However, if you are still not comfortable with what you see, I am pleased to report that a VISTA solution is available (for post-collection data alteration/manipulation) to exclude all traffic from Google Preview. You can have this VISTA rule applied to your report suites by contacting your Account Management team. Again, we do not expect that everyone will want or need to apply this VISTA rule, but we are making it available to those who are especially concerned by Google Preview after checking its effects on their data.
On that note, one final thing: We are concurrently pursuing a long-term solution to these questions, and any other questions about spider and bot filtering in the Online Marketing Suite. A comprehensive solution is on our product roadmap, currently under development, and you can follow its progress in the Idea Exchange by clicking here. Yet again, you voted, and we listened.
As always, if you have any questions about anything in this post, or about anything else related to the Adobe Online Marketing Suite, please leave a comment here or contact me on Twitter and I’ll do my best to get you the information that you need.