One ques­tion that begin­ner Adobe Insight ana­lysts fre­quently ask is “What are Small Ele­ments?” Actu­ally, even advanced ana­lysts ask this from time to time, and it’s a good ques­tion with a rel­a­tively sim­ple answer.

Uniques Exceeded: The first 500,000 ele­ments in a month

If you’ve used Adobe Site­Cat­a­lyst, you’re likely famil­iar with the “Uniques Exceeded” ele­ment in your reports. To return reports as quickly as pos­si­ble in Site­Cat­a­lyst, we limit reports to 500,000 unique ele­ments (i.e. line items) per report/variable per month.

Very gran­u­lar data — like User IDs, time­stamps, page names for very gran­u­lar site archi­tec­tures, etc. — can some­times yield more than 500,000 unique ele­ments in a given month. Any­thing beyond the 500,000th ele­ment is included within one ele­ment called “Uniques Exceeded.”

Uniques Exceeded in SiteCatalyst

Data related to these ele­ments is still avail­able via Data Ware­house exports or ASI slots spe­cific to the ele­ment of interest.
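To make the "first come, first served" behavior concrete, here is a minimal Python sketch. It is only an illustrative model, not SiteCatalyst code: the function name `bucket_first_seen` and the tiny limit of 2 (standing in for 500,000) are assumptions made for the example.

```python
# Illustrative model only; this is not how SiteCatalyst is implemented.
# The article's limit is 500,000 per month; a limit of 2 keeps the demo readable.
from collections import Counter


def bucket_first_seen(values, limit, overflow_label="Uniques Exceeded"):
    """Give the first `limit` distinct values their own line item.

    Every value that first appears after the limit is reached is counted
    under a single overflow label instead.
    """
    seen = set()
    counts = Counter()
    for value in values:
        if value in seen:
            counts[value] += 1
        elif len(seen) < limit:
            seen.add(value)
            counts[value] += 1
        else:
            counts[overflow_label] += 1
    return counts


hits = ["a", "b", "a", "c", "d", "d", "d", "b"]
report = bucket_first_seen(hits, limit=2)
print(dict(report))
```

In this toy run, "d" is the most frequent value but never gets its own line item, because it arrived after the limit was already reached.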

The Dif­fer­ence: Small Ele­ments are the first sig­nif­i­cant ele­ments

In Insight, “Small Ele­ments” serve the same pur­pose and are sim­i­lar to SiteCatalyst’s “Uniques Exceeded,” with one major difference.

In SiteCatalyst, the first 500,000 elements collected in a month each get their own line item in reports, and the remainder fall into Uniques Exceeded. In Insight, the most significant elements (as determined when data is processed into a dataset via log processing) each get their own element in the dimension, up to the dimension's configured element limit. The rest fall into “Small Elements.”

Small Elements in a Table Visualization in Insight
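The "most significant elements win" idea can be sketched in a few lines of Python. This is a hedged illustration, not Insight's log processing: the function name `bucket_top_n` is hypothetical, and "significance" is assumed here to mean a simple row count, which the article does not specify.

```python
# Illustrative model only; Insight's real log processing is not shown here.
# Assumption: "significance" is approximated by how often a value occurs.
from collections import Counter


def bucket_top_n(values, limit, overflow_label="Small Elements"):
    """Keep the `limit` most frequent values; fold the rest into one element."""
    totals = Counter(values)
    top = dict(totals.most_common(limit))
    overflow = sum(totals.values()) - sum(top.values())
    if overflow:
        top[overflow_label] = overflow
    return top


hits = ["a", "b", "a", "c", "d", "d", "d", "b", "d", "a"]
dimension = bucket_top_n(hits, limit=2)
print(dimension)
```

Here "d" earns its own element even though it first appears late in the data, because ranking depends on significance rather than arrival order.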

In his 2004 Wired Magazine article and subsequent book, Chris Anderson popularized the idea of the “Long Tail,” illustrating the many additional elements at the far end of a power law graph / Pareto distribution and stressing their potential relative significance.

So in Insight, the “Small Elements” element reflects the “smaller” elements for that dimension in the dataset, as of the initial log processing for the current dataset. In other words, Small Elements is the “Long Tail”: all of the elements that are individually less significant.

Small Elements in Insight - The Long Tail

How to use Small Elements

There are two Insight Dimen­sion types that come into play related to Small Ele­ments: Sim­ple Dimen­sions and Denor­mal Dimen­sions. When con­sid­er­ing how to use Small Ele­ments, the dif­fer­ences in their behav­ior for Sim­ple Dimen­sions vs. Denor­mal Dimen­sions are impor­tant to consider.

With a Simple Dimension, elements will always be limited to the maximum number of unique elements the dimension is configured to contain. (Increasing that number has hard drive implications on your Insight Servers and query time implications for users in Insight.) Everything above the configured limit will always fall into Small Elements until the next reprocessing. At that point, a value that has become significant enough to rank among the top elements gains its own element within the configured limit.

In addi­tion, with Sim­ple Dimen­sions, Seg­ment Exports from Insight (flat file exports of the data in the dataset) will always rep­re­sent the field for a Small Ele­ment with the text “Small Ele­ments.” Just like with Site­Cat­a­lyst and its “Uniques Exceeded”, “Small Ele­ments” is the actual text stored in the dimen­sion value for that row of data.
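A small Python sketch may help show why the export can only contain the literal label. It is a toy model under stated assumptions: `build_simple_dimension` and `segment_export` are hypothetical names, and the real storage format of a Simple Dimension is not described in this post.

```python
# Toy model only; function names and storage layout are assumptions.
SMALL = "Small Elements"


def build_simple_dimension(raw_values, top_elements):
    """Store each row's element; values outside `top_elements` become the label."""
    return [value if value in top_elements else SMALL for value in raw_values]


def segment_export(stored_values):
    """A flat-file export can only emit what the dimension actually stores."""
    return "\n".join(stored_values)


raw = ["home", "cart", "rare-page-1", "home", "rare-page-2"]
stored = build_simple_dimension(raw, top_elements={"home", "cart"})
export = segment_export(stored)
print(export)
```

Once the original value has been replaced at processing time, nothing downstream of the dimension can recover it without reprocessing from the raw logs.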

With Denormal Dimensions, the elements displayed in a data visualization in Insight will still be limited to 1,023 (the 1,024th element is reserved for the “Small Elements” element, when needed). If Small Elements are at play, you’ll still see that element.

However, the actual value for each row is still stored within the dimension. As a result, if you mask within the data visualization and make a selection in your workspace, the list of top elements is re-evaluated: the top elements for your selection fill the 1,023-element limit, and the remainder (related to your selection) fall into the “Small Elements” element.

With a Denor­mal Dimen­sion and the appro­pri­ate mask­ing in analy­sis, “Small Ele­ments” is a liv­ing, breath­ing ele­ment, always capa­ble of show­ing you the real top ele­ments related to your analy­sis (selec­tions, fil­ters, etc.) and cast­ing the rest into “Small Ele­ments” on the fly.

For exam­ple, I can mask into the dimen­sion and mask to ele­ments where there was “At Least One” of some count­able. When other selec­tions are made in my work­space (for exam­ple, time frames, groups of Vis­i­tors, etc.), the rel­e­vant ele­ments are then dis­played within the table.

Masking into Small Elements in a Denormal Dimension

(Note to Insight ana­lysts: If you use this method, you’ll also want to turn on “Dynamic Selec­tion” in the “Mask” menu for this visu­al­iza­tion, to auto­mat­i­cally update the ele­ments in the visu­al­iza­tion when you re-run the query or adjust the query via a selec­tion adjust­ment or some other method.)
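The re-evaluation described above can be modeled roughly as "rank the raw values of whatever rows are currently selected." The Python sketch below is an assumption-laden illustration, not Insight's query engine: `visible_elements`, the `week` field, and the limit of 2 (standing in for 1,023) are all invented for the example.

```python
# Illustrative model only; names, fields, and the tiny limit are assumptions.
from collections import Counter

SMALL = "Small Elements"


def visible_elements(rows, limit, selection=lambda row: True):
    """Rank raw values among the selected rows; fold the remainder into SMALL."""
    totals = Counter(row["value"] for row in rows if selection(row))
    top = dict(totals.most_common(limit))
    rest = sum(totals.values()) - sum(top.values())
    if rest:
        top[SMALL] = rest
    return top


rows = (
    [{"value": "old-campaign", "week": 1}] * 6
    + [{"value": "evergreen", "week": 1}] * 3
    + [{"value": "evergreen", "week": 2}] * 2
    + [{"value": "new-campaign", "week": 2}] * 3
    + [{"value": "misc", "week": 2}] * 1
)

# No selection: the new campaign is hidden inside Small Elements.
everything = visible_elements(rows, limit=2)
# Selecting the recent time frame re-ranks the elements for that selection.
recent = visible_elements(rows, limit=2, selection=lambda r: r["week"] == 2)
print(everything)
print(recent)
```

Because each row keeps its raw value, the same data yields a different top list for each selection; in this toy data, "new-campaign" surfaces once the selection narrows to the recent week.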

Fur­ther­more, when you per­form a Seg­ment Export (flat file export of the data in a dataset), the actual value from the data­source, pre­served uniquely in the Denor­mal Dimen­sion, is avail­able for the export. You won’t see the ele­ment “Small Ele­ments” in a Seg­ment Export for a Denor­mal Dimen­sion — unless that was actu­ally the value of that row in the datasource.

Denor­mal Dimen­sions have a higher “cost” in your dataset (DPU hard drive and mem­ory, client query time, etc.), so talk with your Adobe Con­sult­ing team about your planned use of Denor­mals. We’re used to help­ing clients weigh the rel­a­tive value of one Denor­mal vs. another. And we’re used to work­ing with clients to recon­fig­ure a dataset to remove one Denor­mal when they’re done with it and use it for another “denor­mal” purpose.


Recast it how you want

Don’t for­get that with Insight, your raw logs remain untouched when they’re used in log pro­cess­ing to cre­ate a dataset. You can always reprocess your dataset with con­di­tions in place to limit what val­ues make it into ele­ments of a dimen­sion, to build your dimen­sion to include the ele­ments that mat­ter the most to your cur­rent analy­sis. That’s part of the value of Insight.

So ana­lyze it one way today, reprocess your dataset tonight, and look at a whole new list of ele­ments tomor­row depend­ing on your analy­sis needs.

Small Ele­ments is your friend in the dataset effi­ciency it gives — and can be your friend in accom­plish­ing deep analy­sis very quickly and accu­rately if con­fig­ured and used correctly.

Have a ques­tion about any­thing related to Adobe Insight? Do you have any tips or best prac­tices related to Adobe Insight you want to share? If so, please leave a com­ment here or send me an e-mail at mhal­brook [at] omni­ture [dot] com and I will do my best to answer it here on the blog so every­one can learn. (If you pre­fer, I won’t use your name or com­pany name.) You can also fol­low me on Twit­ter @Michael­Hal­brook.


  • Joe Cor­bett

    Excel­lent infor­ma­tion and graph­ics, Michael! This would make a great addi­tion to the doc­u­men­ta­tion and train­ing mate­ri­als for Adobe’s Insight and Site­Cat­a­lyst prod­ucts. My clients have often strug­gled with this topic (and I’ve strug­gled to explain it) and you nailed it here.

  • http://www.lendingtree.com Craig Ket­ner

    Michael, great post. The product release of Denormal dimensions in version 5.0 was a great feature to drill into Small Elements. My understanding of Small Elements is that it takes the first 1023 (by default) ordinal elements and maintains that same list, no matter how much traffic comes to the other Small Elements. So if we suddenly started a new huge campaign that created a new data value, it would never be seen outside of Small Elements without either running a new Transformation or drilling into Small Elements if it happens to be a Denormal dimension. Also, you should mention that the 1024 dimension limit is configurable, but at a cost if it is made too large.

  • http://blogs.omniture.com/author/mhalbrook Michael Hal­brook

    Thanks for the feed­back, Joe & Craig.

    Craig, a quick clarification & reaction to your comment: You are correct that the first 1023 ordinal elements appear consistently, by default, even in a Denormal Dimension visualization. However, as noted above, in a Denormal dimension, masking into “At Least One” of a relevant countable pulls new elements into the visualization specific to the workspace selection. If you suddenly started a huge new campaign that created a new data value (per your case in your comment), selecting into the recent timeframe and masking to “At Least One -> Session/Visit”, or simply masking for that element in the Denormal table, would show that element outside of Small Elements.

    Also, note the reflec­tions on the costs of the Denor­mal (DPU hard drive, mem­ory, and client query response) in the post. Def­i­nitely con­fig­urable, though, and that’s part of the point!

    Thanks again for the feed­back, guys.