One of the most talked-about changes from SiteCatalyst 14 to SiteCatalyst 15 is the way data is processed and how long it takes until activity on the site is visible in the reporting, the so-called “latency”.
SiteCatalyst collects data from web sites or apps. It then processes the data and creates a multitude of reports, all readily available in the SiteCatalyst user interface. Data flows in and gets aggregated so it makes sense.
Our engineering teams have made significant changes in SiteCatalyst 15, meaning SiteCatalyst 14 and SiteCatalyst 15 process data in entirely different ways.
Let’s look at two myths that I hear a lot:
“SiteCatalyst 14 is real-time”
That is part of the truth, but not the complete picture.
Data processing in SiteCatalyst follows the principle “process as soon as you can”.
When a visitor calls up a web page, her browser sends a tracking request. SiteCatalyst will almost immediately count this as a Page View. The visit, however, is counted once it’s finished, meaning after 30 minutes of inactivity.
As a result, some of the reports in SiteCatalyst are practically real-time, the most popular example being the Site Content > Pages report. Custom Traffic reports also display real-time data for the Page Views metric.
Other reports are at least 30 minutes behind, because the system will not process visit-based metrics before the visit has actually timed out. The same goes for some of the conversion reports and metrics.
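To make the difference between the two metrics concrete, here is a minimal sketch of the timing described above. It is only an illustration, not how SiteCatalyst is implemented; the function name `visit_counted_at` and the sample timestamps are made up for this example.

```python
from datetime import datetime, timedelta

# Inactivity window after which a visit is considered finished (from the text).
VISIT_TIMEOUT = timedelta(minutes=30)

def visit_counted_at(hit_times):
    """Earliest moment a visit consisting of these hits can be counted.

    Hypothetical helper: the visit is finished once 30 minutes have
    passed since its last hit.
    """
    return max(hit_times) + VISIT_TIMEOUT

# A visitor views two pages, ten minutes apart.
hits = [datetime(2012, 1, 1, 11, 0), datetime(2012, 1, 1, 11, 10)]

# Page Views are countable almost immediately, at the time of each hit.
page_views_visible = hits

# The Visit is only countable once the visitor has been idle for 30 minutes.
print(visit_counted_at(hits))  # 2012-01-01 11:40:00
```

In this example the second Page View shows up at around 11:10, while the Visit it belongs to cannot be counted before 11:40.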
“SiteCatalyst 15 has a latency of 2 hours”
That is not accurate.
SiteCatalyst 15 processes data in batches. Each batch usually contains 60 minutes of collected data. As an example, all tracking data that arrives in SiteCatalyst between 11am and 11:59:59am is part of the same batch.
At 12:00, SiteCatalyst starts collecting data into a new batch. At the same time, the “11am batch” goes into processing. Processing of a 1-hour batch currently takes around 30 minutes.
So, when do we see data?
If a tracking request came in at 11:00:01am, it’ll be part of the “11am batch”, which will finish processing at around 12:30. The data will therefore be visible just under 90 minutes after it happened.
If, on the other hand, the tracking request came in at 11:59:59am, it’ll also be part of the “11am batch” and therefore also visible at 12:30, a mere 30 minutes after it happened.
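The two examples can be reproduced with a few lines of Python. This is a sketch under the assumptions stated in this post: batches are aligned to the clock (the function below counts them from midnight), and processing takes a fixed 30 minutes. The name `visible_at` and its parameters are invented for illustration.

```python
from datetime import datetime, timedelta

def visible_at(hit, batch_minutes=60, processing_minutes=30):
    """Approximate time at which a hit becomes visible in reporting.

    Assumes batches are aligned to the clock, counted from midnight,
    and that processing takes a fixed amount of time.
    """
    midnight = hit.replace(hour=0, minute=0, second=0, microsecond=0)
    batch = timedelta(minutes=batch_minutes)
    # Number of complete batches that closed before this hit arrived.
    batches_elapsed = (hit - midnight) // batch
    # The hit's own batch closes at the next batch boundary.
    batch_end = midnight + (batches_elapsed + 1) * batch
    return batch_end + timedelta(minutes=processing_minutes)

early = datetime(2012, 1, 1, 11, 0, 1)
late = datetime(2012, 1, 1, 11, 59, 59)

print(visible_at(early) - early)  # 1:29:59
print(visible_at(late) - late)    # 0:30:01
```

Both hits become visible at 12:30; only the time they had to wait differs.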
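As a result of the batch approach, latency follows a saw-tooth pattern: it is close to 90 minutes for hits that arrive at the start of a batch, then falls steadily to about 30 minutes for hits that arrive just before the batch closes.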
Not all data is processed in 60-minute batches. The batch size can be changed to 30 minutes for critical report suites (2 per customer).
Because a 30-minute batch can be processed in roughly 15 minutes, the maximum time for a hit to appear in SiteCatalyst is 45 minutes, and the shortest time is 15 minutes, following the same saw-tooth pattern as above.
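The saw-tooth shape itself is simple arithmetic: a hit waits for the rest of its batch, plus the processing time. The sketch below tabulates that for both batch sizes, using the approximate processing times quoted in this post; `sawtooth_latency` is a hypothetical helper, not part of any SiteCatalyst API.

```python
def sawtooth_latency(batch_minutes, processing_minutes, step=10):
    """Latency in minutes for hits arriving at regular offsets into a batch.

    A hit that arrives `offset` minutes into the batch waits for the
    remainder of the batch and then for processing to finish.
    """
    return [
        (batch_minutes - offset) + processing_minutes
        for offset in range(0, batch_minutes, step)
    ]

# 60-minute batch, roughly 30 minutes of processing.
print(sawtooth_latency(60, 30))  # [90, 80, 70, 60, 50, 40]

# 30-minute batch, roughly 15 minutes of processing.
print(sawtooth_latency(30, 15))  # [45, 35, 25]
```

At the very end of a batch the wait approaches the processing time alone, which gives the 30-minute and 15-minute lower bounds.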
What if you are following web activity for a new release or after having sent a newsletter? What if your editorial team needs to know right now how that new article about that politician performs?
SiteCatalyst 15 now has “Current Data” reports, which work much like SiteCatalyst 14 did: each report in SiteCatalyst 15 has a “Current Data” counterpart that provides the same latency as SiteCatalyst 14. Those reports can be used to check data for the current day, or for the current day and the day before.
If you need Current Data reports, go to Admin > User Management and add a user to the “Current Data” group. That user will now be able to see those reports.