What Are Small Elements? [Analysis with Insight]
One question that beginner Adobe Insight analysts frequently ask is “What are Small Elements?” Actually, even advanced analysts ask this from time to time, and it’s a good question with a relatively simple answer.
Uniques Exceeded: The first 500,000 elements in a month
If you’ve used Adobe SiteCatalyst, you’re likely familiar with the “Uniques Exceeded” element in your reports. To return reports as quickly as possible in SiteCatalyst, we limit reports to 500,000 unique elements (i.e. line items) per report/variable per month.
Very granular data — like User IDs, timestamps, page names for very granular site architectures, etc. — can sometimes yield more than 500,000 unique elements in a given month. Anything beyond the 500,000th element is included within one element called “Uniques Exceeded.”
Data related to these elements is still available via Data Warehouse exports or ASI slots specific to the element of interest.
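The "first come, first served" behavior described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how SiteCatalyst is implemented: the function name `first_seen_report` and the sample data are hypothetical, and the limit is shrunk from 500,000 to 2 so the effect is visible.

```python
from collections import Counter

def first_seen_report(values, limit, overflow_label="Uniques Exceeded"):
    """Toy model: the first `limit` distinct values seen get their own line item."""
    tracked = set()
    counts = Counter()
    for value in values:
        if value in tracked or len(tracked) < limit:
            # Either already a line item, or there is still room for a new one.
            tracked.add(value)
            counts[value] += 1
        else:
            # Limit reached: every other value is lumped into one bucket.
            counts[overflow_label] += 1
    return counts

# Hypothetical page names in the order they were collected.
hits = ["home", "cart", "home", "search", "search", "search", "help"]
report = first_seen_report(hits, limit=2)
print(dict(report))
```

Note that "search" is the most frequent value in this sample, yet it lands in the overflow bucket simply because it arrived after the limit was reached.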
The Difference: Insight keeps the most significant elements
In Insight, “Small Elements” serve the same purpose and are similar to SiteCatalyst’s “Uniques Exceeded,” with one major difference.
In SiteCatalyst, the first 500,000 elements collected in a month each get their own line item in reports, and the remainder fall into Uniques Exceeded. In Insight, the most significant elements (as determined when data is processed into a dataset via log processing) each get their own element in the dimension, up to the dimension’s configured element limit. The rest fall into “Small Elements.”
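To make the contrast concrete, here is a minimal Python sketch of the "keep the most significant" approach. It assumes that significance simply means row count, which is an illustration only; the function name `top_n_dimension` and the sample data are hypothetical, and the limit is set to 2 rather than a realistic configured value.

```python
from collections import Counter

def top_n_dimension(values, limit, overflow_label="Small Elements"):
    """Toy model: the `limit` most frequent values keep their own element."""
    totals = Counter(values)
    # Assumption: "most significant" is approximated by the highest row count.
    kept = {value for value, _ in totals.most_common(limit)}
    dimension = Counter()
    for value, count in totals.items():
        dimension[value if value in kept else overflow_label] += count
    return dimension

# Hypothetical page names from a processed log.
hits = ["home", "cart", "home", "search", "search", "search", "help"]
dimension = top_n_dimension(hits, limit=2)
print(dict(dimension))
```

Here "search" and "home" keep their own elements because they are the largest, regardless of when they first appeared, and the two single-hit values are grouped together.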
In his 2004 Wired magazine article and subsequent book, Chris Anderson popularized the idea of the “Long Tail,” illustrating the many additional elements at the far end of a power law graph (Pareto distribution) and stressing their potential relative significance.
So in Insight, the “Small Elements” element reflects the “smaller” elements for that dimension in the dataset, as of the initial log processing for the current dataset. In other words, Small Elements is the “Long Tail”: all of the elements that are individually less significant.
How to use Small Elements
Two Insight Dimension types come into play with Small Elements: Simple Dimensions and Denormal Dimensions. Small Elements behaves differently in each, and those differences matter when you decide how to use it.
With a Simple Dimension, elements are always limited to the maximum number of unique elements the dimension is configured to contain. (Increasing that number has hard drive implications on your Insight Servers and query time implications for users in Insight.) Everything above that configured limit falls into Small Elements until the next reprocessing. At that point, an element that has become significant enough to rank among the top elements gains its own unique element within the configured limit.
In addition, with Simple Dimensions, Segment Exports from Insight (flat file exports of the data in the dataset) will always represent the field for a Small Element with the text “Small Elements.” Just like with SiteCatalyst and its “Uniques Exceeded”, “Small Elements” is the actual text stored in the dimension value for that row of data.
With Denormal Dimensions, the elements displayed in a data visualization in Insight are still limited to 1,023 (the 1,024th element is reserved for the “Small Elements” element, when needed). If Small Elements are at play, you’ll still see that element.
However, the actual value for each row is stored within the dimension. As a result, if you mask within the data visualization and make a selection in your workspace, the list of top elements is re-evaluated: the top elements related to your selection fill the 1,023-element limit, and the remainder now fall into the “Small Elements” element.
With a Denormal Dimension and the appropriate masking in analysis, “Small Elements” is a living, breathing element, always capable of showing you the real top elements related to your analysis (selections, filters, etc.) and casting the rest into “Small Elements” on the fly.
For example, I can mask into the dimension and mask to elements where there was “At Least One” of some countable. When other selections are made in my workspace (for example, time frames, groups of Visitors, etc.), the relevant elements are then displayed within the table.
(Note to Insight analysts: If you use this method, you’ll also want to turn on “Dynamic Selection” in the “Mask” menu for this visualization, to automatically update the elements in the visualization when you re-run the query or adjust the query via a selection adjustment or some other method.)
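The on-the-fly re-evaluation described for Denormal Dimensions can be modeled roughly as "filter the rows by the current selection, then rank again." The Python sketch below is an illustration under assumptions, not Insight's query engine: `visible_elements`, the `page` and `month` fields, and the sample rows are all hypothetical, and a limit of 2 stands in for 1,023.

```python
from collections import Counter

def visible_elements(rows, selection, limit, overflow_label="Small Elements"):
    """Toy model: re-rank elements using only rows that match the selection."""
    totals = Counter(row["page"] for row in rows if selection(row))
    kept = dict(totals.most_common(limit))
    # Whatever does not fit within the limit is grouped into one element.
    overflow = sum(totals.values()) - sum(kept.values())
    if overflow:
        kept[overflow_label] = overflow
    return kept

# Hypothetical rows; each one retains its real value, as in a Denormal Dimension.
rows = [
    {"page": "home", "month": "jan"},
    {"page": "home", "month": "jan"},
    {"page": "home", "month": "jan"},
    {"page": "cart", "month": "jan"},
    {"page": "cart", "month": "jan"},
    {"page": "help", "month": "jan"},
    {"page": "help", "month": "feb"},
    {"page": "help", "month": "feb"},
    {"page": "search", "month": "feb"},
]

everything = visible_elements(rows, lambda row: True, limit=2)
february = visible_elements(rows, lambda row: row["month"] == "feb", limit=2)
print(everything)
print(february)
```

With no selection, "search" is hidden inside Small Elements. Once the selection narrows the rows to February, the ranking is recomputed and "search" appears as its own element.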
Furthermore, when you perform a Segment Export (flat file export of the data in a dataset), the actual value from the datasource, preserved uniquely in the Denormal Dimension, is available for the export. You won’t see the element “Small Elements” in a Segment Export for a Denormal Dimension — unless that was actually the value of that row in the datasource.
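The difference between the two export behaviors can be pictured with a short Python sketch. It is a simplified model with hypothetical names (`export_rows`, `kept`) and made-up data; a real Segment Export is configured in Insight, not written this way.

```python
def export_rows(raw_values, kept, denormal, overflow_label="Small Elements"):
    """Toy model of the field written to a flat-file export for each row."""
    if denormal:
        # Denormal Dimension: the original value is preserved for every row.
        return list(raw_values)
    # Simple Dimension: rows outside the kept elements carry the label text.
    return [v if v in kept else overflow_label for v in raw_values]

raw = ["home", "help", "search", "home"]  # hypothetical datasource values
kept = {"home", "search"}                 # elements within the configured limit

simple_export = export_rows(raw, kept, denormal=False)
denormal_export = export_rows(raw, kept, denormal=True)
print(simple_export)
print(denormal_export)
```

In the Simple case the "help" row is exported as the literal text "Small Elements", while the Denormal case returns the row's real value.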
Denormal Dimensions have a higher “cost” in your dataset (DPU hard drive and memory, client query time, etc.), so talk with your Adobe Consulting team about your planned use of Denormals. We’re used to helping clients weigh the relative value of one Denormal vs. another. And we’re used to working with clients to reconfigure a dataset to remove one Denormal when they’re done with it and use it for another “denormal” purpose.
Recast it how you want
Don’t forget that with Insight, your raw logs remain untouched when they’re used in log processing to create a dataset. You can always reprocess your dataset with conditions in place to limit what values make it into elements of a dimension, to build your dimension to include the elements that matter the most to your current analysis. That’s part of the value of Insight.
So analyze it one way today, reprocess your dataset tonight, and look at a whole new list of elements tomorrow depending on your analysis needs.
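As a rough illustration of reprocessing with conditions, the Python sketch below builds a dimension twice from the same untouched raw log: once with no condition, and once limited to product pages. The function `process_dataset`, the URL paths, and the choice to leave non-matching values out of the dimension entirely are assumptions made for this example; a limit of 2 stands in for a real configured limit.

```python
from collections import Counter

def process_dataset(raw_log, limit, condition=None, overflow_label="Small Elements"):
    """Toy model: build a dimension from raw log values, optionally filtered."""
    # Assumption: values that fail the condition are left out of the dimension.
    rows = [v for v in raw_log if condition is None or condition(v)]
    dimension = dict(Counter(rows).most_common(limit))
    overflow = len(rows) - sum(dimension.values())
    if overflow:
        dimension[overflow_label] = overflow
    return dimension

# Hypothetical raw log; it is only read, never modified.
raw_log = (["/home"] * 5 + ["/blog/a"] * 4 + ["/products/x"] * 3
           + ["/products/y"] * 2 + ["/products/z"])

today = process_dataset(raw_log, limit=2)
tomorrow = process_dataset(raw_log, limit=2,
                           condition=lambda v: v.startswith("/products/"))
print(today)
print(tomorrow)
```

The first pass surfaces the site-wide top pages, and the second pass spends the same two slots on the product pages that were previously buried in Small Elements.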
Small Elements is your friend for the dataset efficiency it provides, and it can be your friend in accomplishing deep analysis quickly and accurately if configured and used correctly.
Have a question about anything related to Adobe Insight? Do you have any tips or best practices related to Adobe Insight you want to share? If so, please leave a comment here or send me an e-mail at mhalbrook [at] omniture [dot] com and I will do my best to answer it here on the blog so everyone can learn. (If you prefer, I won’t use your name or company name.) You can also follow me on Twitter @MichaelHalbrook.