One ques­tion that begin­ner Adobe Insight ana­lysts fre­quently ask is “What are Small Ele­ments?” Actu­ally, even advanced ana­lysts ask this from time to time, and it’s a good ques­tion with a rel­a­tively sim­ple answer.

Uniques Exceeded: The first 500,000 ele­ments in a month

If you’ve used Adobe Site­Cat­a­lyst, you’re likely famil­iar with the “Uniques Exceeded” ele­ment in your reports. To return reports as quickly as pos­si­ble in Site­Cat­a­lyst, we limit reports to 500,000 unique ele­ments (i.e. line items) per report/variable per month.

Very gran­u­lar data — like User IDs, time­stamps, page names for very gran­u­lar site archi­tec­tures, etc. — can some­times yield more than 500,000 unique ele­ments in a given month. Any­thing beyond the 500,000th ele­ment is included within one ele­ment called “Uniques Exceeded.”
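As a rough illustration of that "first come, first served" behavior, here is a minimal Python sketch. It is not SiteCatalyst's actual implementation; the function name `bucket_first_seen` and the tiny limit of 2 (standing in for 500,000) are assumptions made purely for the example.

```python
from collections import Counter

UNIQUES_EXCEEDED = "Uniques Exceeded"

def bucket_first_seen(values, limit):
    """Give the first `limit` distinct values their own line item,
    in arrival order, and lump every later new value into one bucket."""
    seen = set()
    counts = Counter()
    for value in values:
        if value in seen:
            counts[value] += 1          # already has its own line item
        elif len(seen) < limit:
            seen.add(value)             # still room for a new line item
            counts[value] += 1
        else:
            counts[UNIQUES_EXCEEDED] += 1  # limit reached: lump it in
    return counts

# Hypothetical hits for one month, with a limit of 2 instead of 500,000.
hits = ["home", "cart", "home", "user-123", "user-456", "cart"]
report = bucket_first_seen(hits, limit=2)
print(dict(report))
```

In this sketch, "home" and "cart" arrive first and keep their own line items, while the two user IDs that arrive after the limit is reached are counted together.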

Uniques Exceeded in SiteCatalyst

Data related to these ele­ments is still avail­able via Data Ware­house exports or ASI slots spe­cific to the ele­ment of interest.

The Dif­fer­ence: Small Ele­ments are the first sig­nif­i­cant ele­ments

In Insight, “Small Ele­ments” serve the same pur­pose and are sim­i­lar to SiteCatalyst’s “Uniques Exceeded,” with one major difference.

In SiteCatalyst, the first 500,000 elements collected in a month each get their own line item in reports, and the remainder fall into Uniques Exceeded. In Insight, the most significant elements (as determined when data is processed into a dataset via log processing) each get their own element in the dimension, up to the dimension's configured element limit. The rest fall into “Small Elements.”
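To make the contrast concrete, here is a minimal Python sketch of "keep the most significant, lump the rest." It assumes significance is measured by simple row count, which this post does not specify, and the function name `bucket_most_significant` is hypothetical rather than anything in Insight.

```python
from collections import Counter

SMALL_ELEMENTS = "Small Elements"

def bucket_most_significant(values, limit):
    """Keep the `limit` most frequent values as their own elements
    and fold everything else into a single "Small Elements" element."""
    counts = Counter(values)
    top = dict(counts.most_common(limit))
    small_total = sum(counts.values()) - sum(top.values())
    if small_total:
        top[SMALL_ELEMENTS] = small_total
    return top

# "user-123" arrives first, but arrival order does not matter here.
hits = ["user-123", "home", "cart", "home", "user-456", "cart", "home"]
dimension = bucket_most_significant(hits, limit=2)
print(dimension)
```

Here "home" and "cart" earn their own elements because they occur most often, and the two one-off user IDs end up in Small Elements regardless of when they were first seen.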

Small Elements in a Table Visualization in Insight

In his 2004 Wired Magazine article and subsequent book, Chris Anderson popularized the idea of the “Long Tail”, illustrating the many low-frequency elements in the tail of a power law graph / Pareto distribution and stressing their potential relative significance.

So in Insight, the “Small Elements” element reflects the “smaller” elements for that dimension in the dataset, as of the initial log processing for the current dataset. In other words, Small Elements is the “Long Tail”: all of the elements that are individually less significant.

Small Elements in Insight - The Long Tail

How to use Small Elements

There are two Insight Dimen­sion types that come into play related to Small Ele­ments: Sim­ple Dimen­sions and Denor­mal Dimen­sions. When con­sid­er­ing how to use Small Ele­ments, the dif­fer­ences in their behav­ior for Sim­ple Dimen­sions vs. Denor­mal Dimen­sions are impor­tant to consider.

With a Simple Dimension, elements are always limited to the maximum number of unique elements the dimension is configured to contain. (Increasing that number has hard drive implications on your Insight Servers and query time implications for Insight users.) Everything beyond the configured limit falls into Small Elements until the next reprocessing. At that point, a value that has become significant enough to rank among the top elements gains its own element within the configured limit.

In addi­tion, with Sim­ple Dimen­sions, Seg­ment Exports from Insight (flat file exports of the data in the dataset) will always rep­re­sent the field for a Small Ele­ment with the text “Small Ele­ments.” Just like with Site­Cat­a­lyst and its “Uniques Exceeded”, “Small Ele­ments” is the actual text stored in the dimen­sion value for that row of data.

With Denor­mal Dimen­sions, the ele­ments dis­played in a data visu­al­iza­tion in Insight will still be lim­ited to 1,023 (the 1,024th ele­ment is reserved for the “Small Ele­ments” ele­ment, when needed.) If Small Ele­ments are at play, you’ll still see that element.

However, the actual value for each row is still stored within the dimension. As a result, if you mask within the data visualization and make a selection in your workspace, the list of top elements is re-evaluated: the top elements for your selection fill the 1,023-element limit, and the remainder (related to your selection) fall into the “Small Elements” element.

With a Denor­mal Dimen­sion and the appro­pri­ate mask­ing in analy­sis, “Small Ele­ments” is a liv­ing, breath­ing ele­ment, always capa­ble of show­ing you the real top ele­ments related to your analy­sis (selec­tions, fil­ters, etc.) and cast­ing the rest into “Small Ele­ments” on the fly.
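A minimal Python sketch of that on-the-fly re-evaluation follows. It only models the idea described above and is not how Insight is implemented: the `visible_elements` function, the row layout, and the limit of 1 (standing in for 1,023) are all assumptions for illustration.

```python
from collections import Counter

SMALL_ELEMENTS = "Small Elements"

def visible_elements(rows, selection, limit):
    """Rank only the rows matching `selection`, show the top `limit`
    values, and fold the remainder into "Small Elements"."""
    counts = Counter(row["value"] for row in rows if selection(row))
    ranked = counts.most_common()
    table = dict(ranked[:limit])
    remainder = sum(n for _, n in ranked[limit:])
    if remainder:
        table[SMALL_ELEMENTS] = remainder
    return table

# Every row keeps its real value, as a Denormal Dimension does.
rows = [
    {"value": "home", "week": 1},
    {"value": "home", "week": 1},
    {"value": "cart", "week": 1},
    {"value": "promo", "week": 2},
    {"value": "promo", "week": 2},
    {"value": "home", "week": 2},
]

all_time = visible_elements(rows, lambda r: True, limit=1)
week_two = visible_elements(rows, lambda r: r["week"] == 2, limit=1)
print(all_time)
print(week_two)
```

With no selection, "promo" is hidden inside Small Elements. Once the selection narrows to week 2, the ranking is recomputed and "promo" surfaces as the top element.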

For exam­ple, I can mask into the dimen­sion and mask to ele­ments where there was “At Least One” of some count­able. When other selec­tions are made in my work­space (for exam­ple, time frames, groups of Vis­i­tors, etc.), the rel­e­vant ele­ments are then dis­played within the table.

Masking into Small Elements in a Denormal Dimension

(Note to Insight ana­lysts: If you use this method, you’ll also want to turn on “Dynamic Selec­tion” in the “Mask” menu for this visu­al­iza­tion, to auto­mat­i­cally update the ele­ments in the visu­al­iza­tion when you re-run the query or adjust the query via a selec­tion adjust­ment or some other method.)

Fur­ther­more, when you per­form a Seg­ment Export (flat file export of the data in a dataset), the actual value from the data­source, pre­served uniquely in the Denor­mal Dimen­sion, is avail­able for the export. You won’t see the ele­ment “Small Ele­ments” in a Seg­ment Export for a Denor­mal Dimen­sion — unless that was actu­ally the value of that row in the datasource.
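The difference between the two export behaviors can be sketched in a few lines of Python. This is an illustration only, not Insight's export format: the function names `export_simple` and `export_denormal` and the sample values are hypothetical.

```python
SMALL_ELEMENTS = "Small Elements"

def export_simple(source_values, kept_elements):
    """Simple Dimension: a value outside the kept elements is stored
    as the literal text "Small Elements", so that is what exports."""
    return [v if v in kept_elements else SMALL_ELEMENTS for v in source_values]

def export_denormal(source_values):
    """Denormal Dimension: each row's real value is preserved,
    so the export returns it unchanged."""
    return list(source_values)

source_values = ["home", "user-123", "home", "user-456"]
kept = {"home"}  # assumed set of elements within the configured limit

simple_export = export_simple(source_values, kept)
denormal_export = export_denormal(source_values)
print(simple_export)
print(denormal_export)
```

In the Simple case the two user IDs are no longer recoverable from the export. In the Denormal case they come through exactly as they appeared in the datasource.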

Denor­mal Dimen­sions have a higher “cost” in your dataset (DPU hard drive and mem­ory, client query time, etc.), so talk with your Adobe Con­sult­ing team about your planned use of Denor­mals. We’re used to help­ing clients weigh the rel­a­tive value of one Denor­mal vs. another. And we’re used to work­ing with clients to recon­fig­ure a dataset to remove one Denor­mal when they’re done with it and use it for another “denor­mal” purpose.

Recast it how you want

Don’t for­get that with Insight, your raw logs remain untouched when they’re used in log pro­cess­ing to cre­ate a dataset. You can always reprocess your dataset with con­di­tions in place to limit what val­ues make it into ele­ments of a dimen­sion, to build your dimen­sion to include the ele­ments that mat­ter the most to your cur­rent analy­sis. That’s part of the value of Insight.
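Here is a minimal Python sketch of rebuilding a dimension from the same untouched raw data under a different condition. The `build_dimension` function and its `condition` parameter are hypothetical, and the sketch assumes that values failing the condition land in Small Elements rather than being dropped, which this post does not spell out.

```python
from collections import Counter

SMALL_ELEMENTS = "Small Elements"

def build_dimension(raw_values, limit, condition=lambda value: True):
    """Build a dimension from raw values: only values passing
    `condition` may become elements, the top `limit` of those are
    kept, and every other row is counted under "Small Elements"."""
    counts = Counter(raw_values)
    eligible = Counter({v: n for v, n in counts.items() if condition(v)})
    dimension = dict(eligible.most_common(limit))
    remainder = sum(counts.values()) - sum(dimension.values())
    if remainder:
        dimension[SMALL_ELEMENTS] = remainder
    return dimension

# The raw log is only read, never modified, by either processing run.
raw_log = ["/blog/a", "/blog/a", "/blog/a", "/shop/x", "/shop/x", "/shop/y"]

today = build_dimension(raw_log, limit=1)
tomorrow = build_dimension(
    raw_log, limit=1, condition=lambda v: v.startswith("/shop/")
)
print(today)
print(tomorrow)
```

The same raw log yields a blog-focused dimension in the first run and a shop-focused one in the second, simply by changing the condition applied during processing.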

So ana­lyze it one way today, reprocess your dataset tonight, and look at a whole new list of ele­ments tomor­row depend­ing on your analy­sis needs.

Small Ele­ments is your friend in the dataset effi­ciency it gives — and can be your friend in accom­plish­ing deep analy­sis very quickly and accu­rately if con­fig­ured and used correctly.

Have a ques­tion about any­thing related to Adobe Insight? Do you have any tips or best prac­tices related to Adobe Insight you want to share? If so, please leave a com­ment here or send me an e-mail at mhal­brook [at] omni­ture [dot] com and I will do my best to answer it here on the blog so every­one can learn. (If you pre­fer, I won’t use your name or com­pany name.) You can also fol­low me on Twit­ter @Michael­Hal­brook.


Michael Halbrook

Thanks for the feedback, Joe & Craig. Craig, a quick clarification & reaction to your comment: You are correct that the first 1023 ordinal elements appear consistently, by default, even in a Denormal Dimension visualization. However, as noted above, in a Denormal Dimension, masking into "At Least One" of a relevant countable pulls new elements into the visualization specific to the workspace selection. If you suddenly started a huge new campaign that created a new data value (per your case in your comment), selecting into the recent timeframe and masking to "At Least One -> Session/Visit", or simply masking for that element in the Denormal table, would show that element outside of Small Elements. Also, note the reflections on the costs of the Denormal (DPU hard drive, memory, and client query response) in the post. Definitely configurable, though, and that's part of the point! Thanks again for the feedback, guys.

Craig Ketner

Michael, great post. The product release of Denormal dimensions in version 5.0 was a great feature to drill into Small Elements. My understanding of Small Elements is that it takes the first 1023 (by default) ordinal elements and maintains that same list, no matter how much traffic comes to the other Small Elements. So if we suddenly started a new huge campaign that created a new data value, it would never be seen outside of Small Elements without either running a new Transformation or drilling into Small Elements if it happens to be a Denormal dimension. Also, you should mention that the 1024 dimension limit is configurable, but at a cost if it is made too large.

Joe Corbett

Excellent information and graphics, Michael! This would make a great addition to the documentation and training materials for Adobe’s Insight and SiteCatalyst products. My clients have often struggled with this topic (and I've struggled to explain it) and you nailed it here.