It’s been a very long time since I last blogged about the new features in a SiteCatalyst release, but today I’m excited to pick up where I left off and share some of my favorite elements of the brand-spanking-new SiteCatalyst 15.3. We hope these features improve the clarity of your data, enabling more effective and accurate reporting and analysis and, ultimately, better decision making.

If you are already using Site­Cat­a­lyst 15, you do not need to do any­thing to upgrade; now that it has been released, Site­Cat­a­lyst 15.3 will auto­mat­i­cally be avail­able to you. And even if you are not using Site­Cat­a­lyst 15, we’ve got some great updates in here for you as well. So here we go…

Bot Fil­ter­ing and Reporting

If you have a web site, and it is live on the Inter­net, you almost cer­tainly have spi­ders and bots stop by from time to time, doing things like index­ing your site in search engines or check­ing the per­for­mance of your site. Even though they rarely rep­re­sent a sig­nif­i­cant amount of traf­fic, you might not want them to be recorded along­side your vis­i­tor data in Site­Cat­a­lyst and Dis­cover. In Site­Cat­a­lyst 15.3, we have made it easy for you to exclude com­monly known spi­ders and bots (as defined by the IAB) as well as any cus­tom bots/spiders that you might be using for any reason.

You can enable bot fil­ter­ing by going to the Admin Con­sole, select­ing the desired report suite(s), and then choos­ing Edit Set­tings > Gen­eral > Bot Rules. Fil­ter­ing is not enabled by default for new report suites, and we are not enabling it auto­mat­i­cally for exist­ing report suites at the time of the release, so you still have total con­trol over whether/how you use bot fil­ter­ing in Site­Cat­a­lyst. How­ever, begin­ning today (April 26) you can choose to enable this fea­ture for any of your report suites in Site­Cat­a­lyst 15.3.

The inter­face allows admin users to define their own rules based on user-agent string, IP address, or IP range. To exclude the IAB list of known spi­ders and bots, sim­ply check the box and click “Save.” Traf­fic from these agents will be col­lected in Site­Cat­a­lyst, but will be sec­tioned off from your other data so that it does not impact data from your “real” users. It will not be included in your traf­fic or con­ver­sion data.
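To make the three rule types concrete, here is a minimal Python sketch of how a hit might be checked against custom rules keyed on user-agent string, IP address, or IP range. It is an illustration only, not how SiteCatalyst implements bot filtering: the `BotRule` class, the `classify_hit` function, the rule names, and the choice of case-insensitive substring matching on the user agent are all assumptions made for this example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BotRule:
    """Hypothetical custom bot rule (not an Adobe data structure)."""
    name: str   # label for the bot
    kind: str   # "user_agent", "ip", or "ip_range"
    value: str  # substring, single address, or CIDR block


def rule_matches(rule: BotRule, user_agent: str, ip: str) -> bool:
    """Return True if the hit satisfies this one rule."""
    if rule.kind == "user_agent":
        # Assumption: case-insensitive substring match on the user agent.
        return rule.value.lower() in user_agent.lower()
    if rule.kind == "ip":
        return ipaddress.ip_address(ip) == ipaddress.ip_address(rule.value)
    if rule.kind == "ip_range":
        return ipaddress.ip_address(ip) in ipaddress.ip_network(rule.value)
    raise ValueError(f"unknown rule kind: {rule.kind}")


def classify_hit(user_agent: str, ip: str, rules: list) -> Optional[str]:
    """Return the name of the first matching rule, or None for ordinary traffic."""
    for rule in rules:
        if rule_matches(rule, user_agent, ip):
            return rule.name
    return None


# Made-up rules using documentation-reserved IP ranges.
rules = [
    BotRule("Internal monitor", "user_agent", "acme-uptime-check"),
    BotRule("QA crawler", "ip", "203.0.113.7"),
    BotRule("Load-test farm", "ip_range", "198.51.100.0/24"),
]

print(classify_hit("Mozilla/5.0 acme-uptime-check/1.0", "192.0.2.1", rules))
print(classify_hit("Mozilla/5.0", "198.51.100.42", rules))
print(classify_hit("Mozilla/5.0", "192.0.2.1", rules))
```

Hits that match a rule would be routed to the bot reports, while hits that return `None` would stay in your regular traffic and conversion data.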

You can run the Site Metrics > Bots report to see which spiders and bots have visited your site during a given time period. The Site Metrics > Bot Pages report displays the pages that these agents have visited, giving you additional insight into the “indexability” of your site.

Bot fil­ter­ing is being done “pre-VISTA,” mean­ing that VISTA rules will not run on data col­lected from bots that you are exclud­ing using this new fea­ture. In most cases that’s okay, because you want bots to be excluded from the rest of your data set entirely any­way. Some of you have a VISTA rule in place to exclude bots, sim­i­lar to what this new capa­bil­ity allows. This VISTA rule will con­tinue to oper­ate nor­mally. How­ever, in some cases this VISTA rule may not be nec­es­sary any­more, and we rec­om­mend work­ing with your Adobe Account Man­ager to under­stand the next steps for you and your bot han­dling VISTA rules. Note that if you use both a VISTA rule and this new Site­Cat­a­lyst fea­ture to exclude bots, your VISTA rule will likely han­dle less traf­fic from bots than it has in the past, and so you may see a drop in traf­fic to your “bot exclu­sions” report suite(s).

One final note: bots excluded using this new fea­ture will not be included in any Data Feeds that you are receiv­ing, which means that the Data Feed traf­fic lev­els will remain con­sis­tent with the stan­dard Site­Cat­a­lyst reports.

Change to Unique Value Man­age­ment Algorithm

In the past, Site­Cat­a­lyst has allowed you to report on the first 500,000 unique val­ues per month in each report. The remain­ing unique val­ues passed in dur­ing the month are lumped together in reports as a sin­gle line item called “Uniques Exceeded.” The net result is that pop­u­lar (i.e. high-traffic) val­ues that are passed in late in the month are not vis­i­ble as indi­vid­ual line items in reports. They are buried in “Uniques Exceeded.”
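The effect of that first-come, first-served limit is easy to see in a toy Python sketch. This models only the historical behavior described above, with a tiny limit standing in for 500,000; the function name and the sample values are made up for illustration, and it says nothing about how SiteCatalyst stores data internally.

```python
from collections import Counter

UNIQUES_EXCEEDED = "Uniques Exceeded"


def tally_with_first_n_limit(values, limit):
    """Count hits per value, tracking only the first `limit` distinct values.

    Any value first seen after the limit is reached is lumped into a
    single overflow line item, no matter how popular it becomes.
    """
    counts = Counter()
    overflow = 0
    for value in values:
        if value in counts or len(counts) < limit:
            counts[value] += 1
        else:
            overflow += 1
    report = dict(counts)
    if overflow:
        report[UNIQUES_EXCEEDED] = overflow
    return report


# "launch-page" arrives late in the month but gets the most traffic.
month = ["home", "search", "home", "launch-page", "launch-page", "launch-page"]
print(tally_with_first_n_limit(month, limit=2))
# {'home': 2, 'search': 1, 'Uniques Exceeded': 3}
```

Here the most popular value never appears as its own line item because it showed up after the limit was reached.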

Begin­ning on April 26th we are imple­ment­ing a more sophis­ti­cated means for man­ag­ing high car­di­nal­ity reports. There are a num­ber of improve­ments in this new algo­rithm, but most impor­tantly the algo­rithm allows the most pop­u­lar val­ues to show up in reports as indi­vid­ual line items regard­less of whether they occur at the begin­ning or the end of the month! This change impacts all ver­sions of SiteCatalyst.

My col­league, Matt Free­stone, has pub­lished an out­stand­ing post that deals with this topic in far more depth, so check it out in the com­ing days.

Case-Insensitive Props

Historically, all custom traffic variables (props) have been treated as case-sensitive. For example, if “value”, “VALUE” and “Value” are passed into a prop, they are considered three distinct line items, and their metrics are aggregated separately in the associated Custom Traffic reports. To segment on these values, you need to create three separate segmentation criteria. Custom Conversion variables, or eVars, on the other hand, have always been treated as case-insensitive: their case is ignored. If “value”, “VALUE” and “Value” are passed into an eVar, they are considered one line item in reports, and their metrics are aggregated together in the associated Custom Conversion reports. Only one segment criterion is required.

To resolve this incon­sis­tency between props and eVars, begin­ning April 26 all traf­fic vari­ables (props, page, chan­nel, server, cus­tom links, exit links and down­load links) for new report suites will be treated as case-insensitive. This change applies to all ver­sions of Site­Cat­a­lyst. Using the exam­ple above, Site­Cat­a­lyst reports will show either “value”, “VALUE” or “Value” (usu­ally the first one that was passed in dur­ing the month), and the met­rics for all three will be aggre­gated together, just like eVars. In the future we may extend this func­tion­al­ity to traf­fic vari­ables in exist­ing report suites, but for now, it only applies to new report suites. This ensures that your future data will match your his­tor­i­cal data in exist­ing suites. (NOTE: Data Ware­house will always use the low­er­case ver­sion of the traf­fic vari­able. In Data Feeds, the “post” col­umn will con­tain the low­er­case version.)
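As a rough Python illustration of the case-insensitive behavior, the sketch below sums a metric across spellings that differ only by case and labels the result with the first spelling passed in. The function name and the sample hits are hypothetical, and it assumes the displayed spelling is always the first one seen, whereas the reports only "usually" work that way.

```python
def aggregate_case_insensitive(hits):
    """Sum page views per value, ignoring case.

    `hits` is an iterable of (value, page_views) pairs. The returned dict
    is keyed by the first spelling seen for each case-insensitive value.
    """
    display = {}  # lowercase key -> spelling first passed in
    totals = {}   # lowercase key -> summed page views
    for value, page_views in hits:
        key = value.lower()
        display.setdefault(key, value)
        totals[key] = totals.get(key, 0) + page_views
    return {display[key]: total for key, total in totals.items()}


hits = [("Value", 3), ("VALUE", 2), ("value", 5), ("other", 1)]
print(aggregate_case_insensitive(hits))
# {'Value': 10, 'other': 1}
```

Under the old case-sensitive treatment, the same hits would have produced three separate line items with 3, 2, and 5 page views.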

You may also want to read this post by Matt Free­stone where he dis­cusses the ben­e­fits of case insen­si­tiv­ity as related to man­ag­ing high car­di­nal­ity reports. (In case you’re really curi­ous about case sen­si­tiv­ity for props, you can read more about this request in the Idea Exchange.)

Miss­ing Reports Restored

When SiteCatalyst 15 was first released in April 2011, a number of reports that you had seen in SiteCatalyst 14 and earlier versions, such as Pages Not Found, Customer Loyalty, and Pathfinder, were not available. We are pleased to announce that the majority of these reports are back! Here is the list of reports that are being re-introduced with SiteCatalyst 15.3:

  • Pathfinder
  • Full Paths
  • Path Length
  • Orig­i­nal Entry Page
  • Days Before First Purchase
  • All Search Page Ranking
  • Pages Not Found

You might be think­ing, “I had these reports when I used Site­Cat­a­lyst 14, and I haven’t removed Site­Cat­a­lyst code from my site/app since upgrad­ing, so will I have all of my his­tor­i­cal data avail­able to me in these reports now that they are in Site­Cat­a­lyst 15?” The answer in most cases is yes. You should see data in these reports going back to the date of your Site­Cat­a­lyst 15 upgrade and beyond.

Data Col­lec­tion Improvements

New ver­sions of App­Mea­sure­ment for JavaScript and App­Mea­sure­ment for Flash are also avail­able, as well as a brand-new App­Mea­sure­ment library for Xbox, but Bret Gun­der­sen has addressed these in a sep­a­rate post as we resume our habit of blog­ging each time we do an App­Mea­sure­ment release.

The major improve­ment in our JavaScript code (ver­sion H.24.4) is that it now accounts for Google Chrome Pre-render, which can load your web pages before the user actu­ally hits your site. Most ana­lysts and mar­keters don’t want Site­Cat­a­lyst to count a Page View or instan­ti­ate a visit when this occurs, because at this point the user has not actu­ally decided to click through to your site. Code ver­sion H.24.4 com­pen­sates for this behav­ior by wait­ing until the user is actu­ally on your site to fire off a bea­con and begin record­ing data. We’re rec­om­mend­ing that most com­pa­nies con­sider upgrad­ing to this code ver­sion to take advan­tage of this improve­ment. Because this change occurs in the base code (the s_code.js file), your on-page imple­men­ta­tion typ­i­cally will not require an update for this release.

Begin­ning on April 26, code ver­sion H.24.4 will be avail­able for down­load in the Admin Con­sole in both Site­Cat­a­lyst 14 and Site­Cat­a­lyst 15.3.

Data Feed

This one is near and dear to my heart, as some­one who has spent a lot of time work­ing with Data Feeds both here at Adobe and as a Site­Cat­a­lyst cus­tomer. Begin­ning on April 26, a new lookup file called “column_headers.tsv” will be included in the files deliv­ered as part of all raw click­stream data feeds. This new lookup file con­tains a sin­gle row com­pris­ing the list of col­umn names for the data found in hit_data.tsv. For me, and per­haps for some­one at your com­pany, this means no more build­ing my 300-column ‘cre­ate’ state­ments by hand. Gen­er­ally, this should make ETL processes eas­ier and enable you to bet­ter under­stand what you are see­ing in Data Feeds.
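For instance, a short Python script along these lines could turn the single header row into a table definition. This is a sketch under several assumptions: the three column names are a small illustrative sample rather than the full list, the `hit_data` table name is a placeholder, and every column is typed as `TEXT` because the lookup file carries names only, not data types.

```python
import csv
import io


def build_create_statement(headers_tsv, table="hit_data"):
    """Build a CREATE TABLE statement from a one-row, tab-separated header file.

    Assumption: every column is declared TEXT; assign real types yourself.
    """
    reader = csv.reader(io.StringIO(headers_tsv), delimiter="\t")
    columns = next(reader)  # the file holds a single row of column names
    column_defs = ",\n".join(f'    "{name}" TEXT' for name in columns)
    return f'CREATE TABLE "{table}" (\n{column_defs}\n);'


# Stand-in for the contents of column_headers.tsv (sample columns only).
sample_headers = "hit_time_gmt\tpage_url\tuser_agent\n"
print(build_create_statement(sample_headers))
```

In practice you would read the text from the delivered file, for example with `open("column_headers.tsv", encoding="utf-8").read()`, and pass it to the same function.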


Each of the topics discussed above is covered in SiteCatalyst 15.3 documentation, release notes, and Knowledge Base articles, and you will find more detail in those locations. We’re really pleased with some of these features and we believe they will serve you well. As always, please leave comments with any questions, concerns, or requests. Additionally, you can always find me on Twitter at @benjamingaines. We hope you enjoy these new features, and happy analyzing!