In many areas of modern science, one of the biggest obstacles in any experiment or analysis has traditionally been obtaining a data set large enough to meet the sample-size requirements of most statistical procedures. In recent years, and especially in the world of digital marketing, the opposite is often true. Many clients are inundated with data: so much, in fact, that it can be difficult to know what to look for or where to start. Given the computational costs associated with truly large data sets (gigabytes upon gigabytes and terabytes upon terabytes), we want to be strategic in the way we examine the data.

If you’re not entirely sure what to look for, a great place to begin is with an Association & Affinity analysis. This is an extremely flexible type of analysis that uses basic data mining techniques to establish relationships between site metrics. Traditional marketing pros might know it better as “market basket” analysis, a term that comes from the simple example of buying items at a grocery store. With transaction-level data, we can determine which items are most likely to be purchased together. For example, if we know that 71% of transactions that include milk also include bread, then we can start to make some tactical decisions about when we put those two items on sale, where we physically place the items in the store, etc.
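To make the milk-and-bread figure concrete, here is a minimal Python sketch of how such a percentage could be computed from transaction-level data. The transactions and the `confidence` helper are hypothetical and exist only for illustration; they are not taken from any real data set or tool.

```python
# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread", "jam"},
    {"milk", "bread", "butter"},
]


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0  # the antecedent never occurs, so there is nothing to measure
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)


milk_to_bread = confidence(transactions, "milk", "bread")
print(f"{milk_to_bread:.0%} of milk transactions also include bread")
```

In this toy data, four transactions include milk and three of those also include bread, so the figure comes out at 75%.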

Association Analysis finds underlying structures and relationships in large data sets.

A client contacted Adobe Consulting to ask for help in determining whether a customer’s origin could help predict the location to which they were traveling. Additionally, the client was interested in knowing whether there were any seasonal or brand factors to take into consideration. What we found was striking, surprising, and significant.

With the help of the Insight Consulting team, we looked at several years of transaction-level data, which amounted to a little over 10 million transactions. Needless to say, this is more than Excel is able to handle, and a perfect opportunity to use some simple but very useful data mining techniques. One of the great qualities of Insight is its ability to handle large data sets, so we used Insight to do all the number crunching. We took several variables into consideration for our analysis:

  • Origin – the place a customer was physically located when they made the rental reservation
  • Destination – the place a customer picked up the car for the rental reservation
  • Brand – this particular travel agency was a conglomerate of several brands, which functioned separately, but were still part of the same company
  • Date – the day, month, and year for the time of travel

We incorporated three distinct statistical measures of association: support, confidence, and lift ratio. Using these measures together with the variables above, we created a dashboard that allowed analysts on the client side to interact with and understand the analysis. The end result is a set of tables and heat maps that identify the origin-destination pairs that are most likely to occur together, as seen below.
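For readers unfamiliar with the three measures, the following Python sketch shows one common way to compute them for origin-destination pairs. It assumes the standard textbook definitions: support is the share of all transactions containing the pair, confidence is the share of an origin's transactions that go to the destination, and lift is confidence divided by the destination's overall share. The reservations and the `association_measures` function are hypothetical; this does not reproduce the calculations performed in Insight.

```python
from collections import Counter

# Hypothetical (origin, destination) reservations, for illustration only.
reservations = [
    ("UT", "UT"), ("UT", "UT"), ("UT", "NV"), ("UT", "CA"),
    ("CA", "CA"), ("CA", "CA"), ("CA", "NV"), ("NV", "NV"),
]


def association_measures(pairs):
    """Return support, confidence, and lift for every observed pair."""
    n = len(pairs)
    pair_counts = Counter(pairs)
    origin_counts = Counter(origin for origin, _ in pairs)
    dest_counts = Counter(dest for _, dest in pairs)

    measures = {}
    for (origin, dest), count in pair_counts.items():
        support = count / n                          # P(origin and destination)
        confidence = count / origin_counts[origin]   # P(destination | origin)
        lift = confidence / (dest_counts[dest] / n)  # confidence / P(destination)
        measures[(origin, dest)] = {
            "support": support,
            "confidence": confidence,
            "lift": lift,
        }
    return measures


measures = association_measures(reservations)
for pair, m in sorted(measures.items(), key=lambda kv: -kv[1]["lift"]):
    print(pair, {name: round(value, 3) for name, value in m.items()})
```

A lift above 1 suggests the pair occurs together more often than it would if origin and destination were independent; a lift below 1 suggests the opposite. A heat map of lift values by origin and destination is one way to surface the strongest associations.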

Example of a custom dashboard put together for a client.  The heat map is a visual tool for determining the strongest origin-destination associations.

Through the use of the dashboard, we identified some insights that were very significant to the client. For example:

  • We found that the vast majority of transactions occurred within state, meaning that most customers were staying local for their travel needs. This came as a big shock.
  • The different brands had very distinct behaviors. Some brands tended to attract customers who were more likely to travel outside the state, while other brands were used more frequently by customers looking to travel within their own state.
  • Seasonality had a significant effect on the choices people were making regarding their travel locations. Overall, spring and summer months tended to see a more diverse range of destinations, while fall and winter months saw customers staying closer to home.
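Comparisons like these amount to segmenting the transactions and measuring the share that stay in state. The Python sketch below illustrates the idea under stated assumptions: the records, the brand names, and the month ranges in `season_of` are all hypothetical, and the numbers are invented for illustration rather than drawn from the client's data.

```python
from collections import defaultdict

# Hypothetical records: (brand, month, origin_state, destination_state).
records = [
    ("BrandA", 1, "UT", "NV"),
    ("BrandA", 7, "UT", "CA"),
    ("BrandA", 7, "UT", "UT"),
    ("BrandB", 1, "CA", "CA"),
    ("BrandB", 12, "CA", "CA"),
    ("BrandB", 6, "CA", "NV"),
]


def season_of(month):
    # Assumption: March through August counts as spring/summer.
    return "spring/summer" if 3 <= month <= 8 else "fall/winter"


def in_state_share(records, key):
    """Share of in-state trips within each segment defined by `key`."""
    totals = defaultdict(int)
    in_state = defaultdict(int)
    for record in records:
        segment = key(record)
        totals[segment] += 1
        if record[2] == record[3]:  # origin state equals destination state
            in_state[segment] += 1
    return {segment: in_state[segment] / totals[segment] for segment in totals}


by_brand = in_state_share(records, key=lambda r: r[0])
by_season = in_state_share(records, key=lambda r: season_of(r[1]))
print(by_brand)
print(by_season)
```

The same `key` argument could combine brand and season, or swap state for a finer geography, which is the kind of drill-down a dashboard makes available without rerunning the analysis by hand.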

Of course, there are literally thousands of individual origin-destination pairs when we consider the different granularities, seasons, and brands. Delivering the analysis via a dashboard ensured that the client would be able to dive deeper into any and all pairs at the granularity that fit the needs of the specific business units that would be consuming the analysis.

With web analytics data, there is almost no limit to the areas where an Association & Affinity analysis may apply. Contact an Adobe representative to explore the details of this type of analysis for your organization.

 
