Generate Rockstar AEM Logs Metrics with R Programming Utility

Posted on Friday, May 5, 2017 By

Today’s Tips & Tricks guest post comes from Atish Jain, a Senior AEM Developer at SapientRazorfish (Publicis.Sapient) with over seven years of experience working with CMS-based applications. Atish was a semi-finalist in this year’s AEM Rockstar competition.

The tool I’m sharing is an R-based utility that finds gaps between the assets uploaded and the renditions generated for them. It can help verify the success of bulk migrations, longevity tests, and upload activities, and it supports comparative analysis in your AEM instance.

For those unfamiliar with R programming, it is a free open-source language and environment used for data manipulation, calculations, statistical computing, and graphical techniques useful to statisticians, analysts, data miners, and other researchers.  To learn more about R, visit

The utility supports trend analysis of uploads versus workflow completion, which matters because systems tend to slow down over time. The stats can be analyzed to report missing asset renditions and to spot degradation in AEM server performance under continuous load. It works on the log files produced under the crx-quickstart folder of AEM, so there is no direct performance impact on the AEM instance. Reports can also be generated from historical log files to produce comparative results.

The utility helps you:

  • Get the exact count of missing asset renditions, along with the upload path of each asset whose rendition is missing.
  • Conduct trend analysis of uploaded assets versus renditions generated. The pace of rendition generation can be measured to estimate activity timings and to identify degradation factors.
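To make the "pace of rendition generation" idea concrete, here is a minimal sketch in Java (the language AEM itself runs on) rather than R. It assumes the error.log timestamp layout dd.MM.yyyy HH:mm:ss.SSS; the class and method names are hypothetical and are not part of the utility.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class RenditionLagSketch {
    // Assumed timestamp layout of an AEM error.log line, e.g. "05.05.2017 16:52:01.250"
    static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("dd.MM.yyyy HH:mm:ss.SSS");

    // Seconds elapsed between the upload trigger and the rendition log line
    static double lagSeconds(String uploadedAt, String renditionAt) {
        LocalDateTime start = LocalDateTime.parse(uploadedAt, LOG_TIME);
        LocalDateTime end = LocalDateTime.parse(renditionAt, LOG_TIME);
        return Duration.between(start, end).toMillis() / 1000.0;
    }
}
```

Collecting this lag for every asset over a long upload run gives the trend line: a lag that keeps growing points to degradation under load.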

The AEM logs are powerful and can be transformed into vital statistics. This utility, based on the R programming language, uses that power to generate metrics.

Here is how the utility works:

Step 1: Parse the error.log file(s) and keep only the log lines that start with a date and time (A).

Step 2: Parse A to find the log lines for the upload trigger (B).

Step 3: Parse A to find the log lines for the last rendition (C).

Step 4: Merge B and C to create the reports.
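The four steps can be sketched compactly in Java. This is an illustration only, not the utility itself: the class name, the EXECUTE_START marker, and the sample line layout are assumptions, and the asset path is taken to be the text between /content/dam/ and /jcr:content.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

class RenditionGapSketch {
    // Step 1: a usable log line starts with "dd.MM.yyyy HH:mm:ss"
    static final Pattern STARTS_WITH_DATE_TIME =
            Pattern.compile("^\\d{2}\\.\\d{2}\\.\\d{4} \\d{2}:\\d{2}:\\d{2}");
    // Assumed markers for the upload trigger and the last rendition
    static final String UPLOAD_MARKER = "EXECUTE_START";
    static final String RENDITION_MARKER = "/jcr:content/renditions/cq5dam.web.1280.1280.jpeg";

    // Returns the asset path in a line, or null when the line has none
    static String assetPath(String line) {
        int start = line.indexOf("/content/dam/");
        if (start < 0) return null;
        int end = line.indexOf("/jcr:content", start);
        return end < 0 ? null : line.substring(start, end);
    }

    // Returns the uploaded asset paths that never logged the rendition
    static List<String> uploadsWithoutRendition(List<String> logLines) {
        Set<String> uploaded = new LinkedHashSet<>();   // B
        Set<String> rendered = new HashSet<>();         // C
        for (String line : logLines) {
            if (!STARTS_WITH_DATE_TIME.matcher(line).find()) continue; // Step 1
            String path = assetPath(line);
            if (path == null) continue;
            if (line.contains(UPLOAD_MARKER)) {
                uploaded.add(path);                     // Step 2
            } else if (line.contains(RENDITION_MARKER)) {
                rendered.add(path);                     // Step 3
            }
        }
        List<String> missing = new ArrayList<>(uploaded); // Step 4: B without a match in C
        missing.removeAll(rendered);
        return missing;
    }
}
```

Stack-trace lines and other continuation lines carry no timestamp, so Step 1 drops them before any matching happens.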

The concept detailed above can be enhanced into a more exhaustive application that can create extensive reports from AEM logs.

For example, the utility can be extended to generate more detailed graphical reports via the graphical plugin API available for R.

If you have any questions, you can contact Atish. All opinions expressed by Atish Jain are his own and not Adobe’s.


# Configuration: adjust these paths to your environment.
QUICKSTART_LOGS_DIR <- "D:/Atish/aem-rock/output/logs"
OUTPUT_DIR <- "D:/Atish/aem-rock/results/day1"
print_renditions_gap_flag <- TRUE
upload_report_print_flag <- TRUE
RENDITION_LOG_TXT_PATTERN <- "jcr:content/renditions/cq5dam.web.1280.1280.jpeg"
# Assumption: this pattern is not defined in the listing; the value follows
# the EXECUTE_START filter described in the comments further down.
UPLOAD_TRIGGER_TXT_PATTERN <- "EXECUTE_START"
ERROR_LOGS_FILE_PATTERN <- "error\\.log\\.\\d\\d\\d\\d*"
LOG_COLUMNS <- c("date", "time", "level", "type", "class", "logtext1", "logtext2", "logtext3", "assetPath")

# Lists the rotated error logs plus the current error.log in logs_dir.
error_log_files_list <- function(logs_dir) {
  error_file_list <- list.files(logs_dir, pattern = ERROR_LOGS_FILE_PATTERN, full.names = TRUE)
  unlist(list(error_file_list, list.files(logs_dir, pattern = "error\\.log$", full.names = TRUE)))
}

# Builds one row per uploaded asset with its upload and rendition timestamps,
# and counts the uploads that have no matching rendition line.
upload_report_calculation <- function(error_file_list) {
  dataset_upload <- NULL
  renditions_gap <- 0
  for (file in error_file_list) {
    print(paste("Analysing log file : ", file, sep = ""))
    error_log_full_dataset <- read.table(file, header = FALSE, quote = "", fill = TRUE,
                                         col.names = LOG_COLUMNS, stringsAsFactors = FALSE)
    # Keep only the rows whose first field is a date (dd.mm.yyyy)
    error_log_subsetdate_dataset <- subset(error_log_full_dataset, grepl("\\d\\d.\\d\\d.\\d\\d\\d\\d", date))
    write.csv(file = "dataset.csv", x = error_log_subsetdate_dataset)
    # Keep the rows that contain EXECUTE_START and /content/dam/:
    # these are the asset upload trigger log lines.
    upload_trigger <- subset(error_log_subsetdate_dataset, grepl(UPLOAD_TRIGGER_TXT_PATTERN, logtext2))
    upload_trigger <- subset(upload_trigger, grepl("/content/dam/", class))
    # Keep the rows that contain jcr:content/renditions/cq5dam.web.1280.1280.jpeg
    rendition_generation <- subset(error_log_subsetdate_dataset, grepl(RENDITION_LOG_TXT_PATTERN, assetPath))
    # Concatenate the date and time columns of both subsets
    upload_trigger$datetime <- paste(as.Date(upload_trigger$date, format = '%d.%m.%Y'), upload_trigger$time, sep = " ")
    rendition_generation$datetime <- paste(as.Date(rendition_generation$date, format = '%d.%m.%Y'), rendition_generation$time, sep = " ")
    renditions_gap <- renditions_gap + (nrow(upload_trigger) - nrow(rendition_generation))
    # Prepare the upload trigger data frame
    upload_trigger_df <- data.frame(sub('.*:', '', sub('/jcr.*', '', upload_trigger$class)), upload_trigger$datetime, stringsAsFactors = FALSE)
    colnames(upload_trigger_df) <- c("assetPath", "upload_trigger.datetime")
    write.csv(file = "upload_trigger_df.csv", x = upload_trigger_df)
    # Prepare the renditions generation data frame (strip the 49-character rendition suffix)
    rendition_gen_df <- data.frame(gsub('.{49}$', '', rendition_generation$assetPath), rendition_generation$datetime, stringsAsFactors = FALSE)
    colnames(rendition_gen_df) <- c("assetPath", "rendition_generation.datetime")
    write.csv(file = "rendition_gen_df.csv", x = rendition_gen_df)
    # Left join: uploads without a rendition keep NA in the rendition columns
    dataset_x <- merge(upload_trigger_df, rendition_gen_df, 'assetPath', all.x = TRUE)
    # Time lag in seconds; the datetime strings built above are in yyyy-mm-dd order
    dataset_x$timeDiff <- as.numeric(difftime(
      as.POSIXct(dataset_x$rendition_generation.datetime, format = "%Y-%m-%d %H:%M:%OS"),
      as.POSIXct(dataset_x$upload_trigger.datetime, format = "%Y-%m-%d %H:%M:%OS"),
      units = "secs"))
    dataset_upload <- rbind(dataset_upload, dataset_x)
  }
  list(dataset = dataset_upload, renditions_gap = renditions_gap)
}

print_renditions_gap_report <- function(renditions_gap, print_renditions_gap_flag) {
  if (print_renditions_gap_flag) {
    print(paste("Renditions gap vs uploaded assets: ", renditions_gap, sep = ""))
  }
}

upload_report_print <- function(dataset_upload, upload_report_print_flag) {
  if (!upload_report_print_flag || is.null(dataset_upload)) return(invisible(NULL))
  setwd(OUTPUT_DIR)
  # Reconstructed from a garbled line: rows with any NA are uploads without a rendition
  row.has.na <- apply(dataset_upload, 1, function(x) { any(is.na(x)) })
  uploadAsset <- dataset_upload[!row.has.na, ]
  write.csv(file = "uploadAsset.csv", x = uploadAsset)
  missingRenditions <- dataset_upload[row.has.na, ]
  write.csv(file = "missingRenditions.csv", x = missingRenditions)
  # Write the plot straight to a PDF so the script also runs without a screen device
  pdf(file = "TimeDiffReport.pdf")
  barplot(as.matrix(uploadAsset$timeDiff), main = "Time-Diff Report", xlab = "AssetsUploaded", ylab = "timeLag(sec)", beside = TRUE, col = rainbow(1))
  dev.off()
}

# Functions Execution
error_file_list <- error_log_files_list(QUICKSTART_LOGS_DIR)
report <- upload_report_calculation(error_file_list)
print_renditions_gap_report(report$renditions_gap, print_renditions_gap_flag)
upload_report_print(report$dataset, upload_report_print_flag)


4:52 PM Permalink

Sling Pipes – A Rockstar Way to Deal with JCR

Posted on Monday, May 1, 2017 By

Today’s post is by guest writer Rima Mittal, who was invited to compete for the title of AEM Rockstar at the 2017 Adobe Summit. Along with the other finalists, we invited Rima to contribute a blog and video to our series, Rockstar Tips & Tricks. At the AEM Rockstar Session, Rima spoke on Sling Pipes – A Rockstar Way to deal with JCR. 

Rima Mittal is an Adobe Certified Lead AEM Developer and Consultant. She has extensive experience working on Java and AEM and has done multiple POCs on integrating AEM with external third-party systems. A strong believer in the importance of communities and knowledge sharing in the world of software development, she has been a speaker at various developer conferences like AEMHub 2015 and adaptTo() 2016. 

Ever encountered a situation where code changes were introduced after the client started authoring and some pages had to be re-authored? Ever spent time writing code just to modify a few hundred pages that were already authored, or to remove a component from hundreds of authored pages? Have you struggled to modify content already in the repository? Need a script to change existing production content? Sling Pipes to the rescue.

Sling Pipes

Sling Pipes is a tool for doing extract – transform – load operations through a resource tree configuration. This tiny toolset provides the ability to do such transformations with proven and reusable blocks, called pipes, streaming resources from one to the other.
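As a plain-Java analogy for "streaming resources from one to the other" (this is not the Sling Pipes API, and every name below is hypothetical), each block can be pictured as a function from a stream of resource paths to another stream, with blocks chained one after the other:

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipeChainSketch {
    // Hypothetical block: lets through only the paths under the given root
    static Function<Stream<String>, Stream<String>> under(String root) {
        return in -> in.filter(path -> path.startsWith(root));
    }

    // Hypothetical block: maps each path to one of its children
    static Function<Stream<String>, Stream<String>> child(String name) {
        return in -> in.map(path -> path + "/" + name);
    }

    // Feeds the input paths through the (possibly chained) block
    static List<String> run(List<String> input, Function<Stream<String>, Stream<String>> pipe) {
        return pipe.apply(input.stream()).collect(Collectors.toList());
    }
}
```

Chaining under("/content/site") with child("jcr:content") via andThen turns a list of page paths into the jcr:content nodes of the pages under that root, which is the kind of selection a content-fixing script usually starts with.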

A pipe is a JCR node with:

  • sling:resourceType property – Must be a pipe type registered by the plumber
  • name property – Used in bindings as an id
  • path property – Defines pipe’s input
  • expr property – Expression through which the pipe will execute
  • additionalBinding node – Node you can add to set “global” bindings (property=value) in pipe execution
  • additionalScripts – Multivalue property to declare scripts that can be reused in expressions
  • conf child node – Contains additional configuration of the pipe

Registered Pipes



Sling Pipes Demo

Here is a demo video with more on how to use and execute sling pipes in AEM.


More details can be found in the official documentation at

For any questions or comments, I can be reached on Twitter at @rimamittal or on LinkedIn at

12:11 PM Permalink

AEM Multi-Site Tips & Tricks Preview – IMMERSE 2017

Posted on Monday, April 24, 2017 By

We are excited to share another post in the AEM Rockstar Tips & Tricks Guest Blog series! This Tips & Tricks preview is from Brett Birschbach, an AEM Certified Architect and AEM Rockstar semi-finalist who will be a session presenter at Adobe’s global virtual developer conference, IMMERSE May 15th – 19th 2017. Brett will detail his Tips & Tricks in a follow-up post after IMMERSE. Brett’s session will be on Tuesday, May 16, 1:15-2:15 PM Central (11:15-12:15 PDT).

Brett is the Adobe Marketing Cloud Solutions Architect for HS2 Solutions, a digital transformation company based in Chicago. He is a hands-on problem solver with experience leading large multinational, multi-site platform projects.  Brett led the development of the new Shared Component Properties feature in the open source ACS AEM Commons library. For more from Brett, visit his Github: HitmanInWis.

Multi-site platforms are the norm in a mature Adobe Experience Manager installation. However, most implementations, in true Agile fashion, start as a single site and then expand to support multiple. Suppose your client is the NFL and it wants to put all 32 teams on the same AEM platform. We all know that launching 32 team sites on day one is A Bad Idea, so likely you are going to start with just one site as a proof of concept.  However, Agile tends to tempt a lot of us in this situation into thinking that “I’ll just code for this one site now, and worry about multi-site support and code structures on the second site when I actually need them.”  YAGNI, right?  Except…you ARE going to need it.  Pretending you are building a single site when you know the platform is going to support multiple, thinking only in the present instead of taking a step back and getting the full picture, is a great way to paint yourself into a corner.


Man painted into corner

Image by Ali Moussa, HS2 Solutions

Coding a multi-site platform beginning with single-site patterns, we are faced with technical debt in making the transition to multi-site – technical debt that often never gets fully paid.  Let’s be real, clients want to see sites #2 through #32 launched as quickly as possible after site #1 is launched.  After all, that’s the vision you cast them – that they would easily be able to create and manage all their sites on a single platform using the same components and authoring techniques as the first site.  To the client, having a single working site seems like it should account for 80% of the total work, so the rest of the sites should be a snap.  You know as well as I do that the urgency to turn out these sites means that much of the technical debt accrued early in the project will be worked around and put off until resolution on a proverbial someday, never to be repaid.  Bad decisions made when creating the first site often linger on for as long as the platform lives.


But it doesn’t have to be this way.  Structuring your code base for multi-site success isn’t that hard to do, it’s just easier not to do it (which is why we end up in this situation).  But what if you had a step-by-step guide?  What if you could leverage the experience of peers who have already done it well?  This stuff isn’t rocket science, and the principles don’t change much from project to project.  No sense wasting your time trying to come up with all the techniques on your own.  Wouldn’t you rather be doing the fun stuff like banging out a sweet, interactive Schedule component for authors to drop onto those 32 NFL team sites?  Of course, you would!  That’s why I encourage you to attend the “Multi-Sites: Setting your Codebase Up for Success” session at Adobe IMMERSE 2017.


Why should you attend?

  • Adobe IMMERSE is the premier AEM developer conference of the year, focusing on the technical audience.  If you’re not already registered, sign up using the link above!
  • I’ll be covering Basic, Advanced, and Overachiever techniques, 15 in all, using the NFL example above (modified to HFL, to avoid any grumbling from the lawyers).
  • The techniques come from a successful, international, multi-site platform implementation hosting a dozen brand and corporate sites, so they are tried and proven.
  • Every example (yes, every last one) will be demonstrated by real code that you can download, look at, and play with in order to gain a true understanding that only code can give.

Disclaimer: Being a Green Bay native (and therefore huge Packers aficionado) I *may* take the liberty to take a few jabs at our archrivals in Chicago…don’t take it personally 🙂




2:53 PM Permalink

Be an #AEMRockstar: Use AEM DataLayer

Posted on Tuesday, April 18, 2017 By

Today’s guest post features the winner of our first AEM Rockstar Competition, Dan Klco of Perficient. Dan rocked Summit attendees with his presentation, so we asked him to share his DataLayer demo. In the coming weeks, we’ll share more AEM Rockstar posts and preview one semi-finalist’s IMMERSE presentation. Stay tuned!

Dan is an experienced Adobe Digital Marketing Technical Lead, Solution Architect, Consultant and Advisor. Over his career, he has come to be viewed as a valued thought leader in the industry, with solid skills in leading teams to implement successful digital marketing programs in the Adobe ecosystem. Dan is also a PMC Member of the Apache Sling project, which is the basis for Adobe’s Experience Manager product and gives him unique insight into the AEM platform.

During the AEM Rockstar session at Adobe Summit, I had a chance to talk about Digital Marketing DataLayers in AEM. I discussed how this important design pattern can help simplify Adobe Experience Manager and Adobe Marketing Cloud integrations, and introduced AEM DataLayer, a new Open Source library for creating DataLayers in AEM.

I was thrilled to be awarded first prize for my presentation and would like to share some more information with you about how to use the AEM DataLayer library on your project.

Step 1: Identifying Data to Track

In my talk, I discussed analyzing designs in the discovery phase to identify what data you might need to capture for Digital Marketing. It is important to focus on the "might" rather than the "need" to ensure that your DataLayer will not require significant changes during the course of the implementation.

Given the page above, you may want to track some of the following information as an example:


  • Page URL (track time: On Load)
  • Page Path (track time: On Load)
  • Site Section (track time: On Load)
  • Page Title, for example "About Us" (track time: On Load)
  • Video Play
  • Video Complete

Step 2: Configure AEM DataLayer

The AEM DataLayer is available as a downloadable AEM Package and is easy to install and incorporate into your project. Click here to watch my Spark video demo.

The steps to install and configure the AEM DataLayer are:

  1. Install the AEM DataLayer package
  2. Setup a Cloud Configuration for AEM DataLayer
  3. Add the Cloud Configuration on your site

Step 3: Add Your Custom DataLayer Code

To create your own DataLayer code, create a simple Bundle project and add the dependencies:













You can then create Sling Model classes implementing the ComponentDataElement interface:


@Model(adaptables = Resource.class, resourceType = { "myapp/components/myresource" }, adapters = ComponentDataElement.class)
public class CustomDataElement implements ComponentDataElement {
    // updateDataLayer(DataLayer dataLayer) goes here
}

For every class, you will need to specify the annotation parameters “resourceType” and “adapters”. You can specify any number of resource types, and when AEM encounters a resource of the type specified, it will call your Sling Model.
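The resource-type matching can be pictured with a small plain-Java analogy. This is not how Sling Models or the AEM DataLayer are implemented, and the class and method names are hypothetical; it only shows the idea of looking up a handler by resource type and letting it contribute to the data layer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

class ResourceTypeDispatchSketch {
    // resource type -> handler(resource properties, data layer)
    private final Map<String, BiConsumer<Map<String, Object>, Map<String, Object>>> handlers =
            new HashMap<>();

    void register(String resourceType,
                  BiConsumer<Map<String, Object>, Map<String, Object>> handler) {
        handlers.put(resourceType, handler);
    }

    // Calls the handler registered for the resource type; returns false if none matches
    boolean visit(String resourceType, Map<String, Object> properties,
                  Map<String, Object> dataLayer) {
        BiConsumer<Map<String, Object>, Map<String, Object>> handler = handlers.get(resourceType);
        if (handler == null) {
            return false;
        }
        handler.accept(properties, dataLayer);
        return true;
    }
}
```

Resources whose type has no registered handler are simply skipped, which is why you only write classes for the components you scoped in Step 1.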

The WeRetail Reference project contains a number of example ComponentDataElement classes you can use as a base for your custom implementations. For example, if you wanted to track the video displayed in every instance of the video component you scoped in Step 1, you could create a class like the one below:

/*
 *  Copyright 2017 - Perficient
 *
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 *  Unless required by applicable law or agreed to in writing, software
 *  distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
 */
package com.perficient.aem.weretail.datalayer;

import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ValueMap;
import org.apache.sling.models.annotations.Model;

import com.perficient.aem.datalayer.api.Component;
import com.perficient.aem.datalayer.api.ComponentDataElement;
import com.perficient.aem.datalayer.api.DataLayer;

/**
 * Adds in the video details for the AEM Mobile Video component into the
 * AEMDataLayer
 *
 * @author danklco
 */
@Model(adaptables = Resource.class, resourceType = {
        "mobileapps/components/mobilevideo" }, adapters = ComponentDataElement.class)
public class MobileVideoComponent implements ComponentDataElement {

    private final Resource resource;

    public MobileVideoComponent(Resource resource) {
        this.resource = resource;
    }

    @Override
    public void updateDataLayer(DataLayer dataLayer) {
        Component component = new Component();
        ValueMap properties = resource.getValueMap();
        // Record the video file referenced by this component instance
        component.addAttribute("video", properties.get("fileReference", String.class));
        // The call that attaches the component to dataLayer did not survive in this listing.
    }
}


Further Support

If you have any questions or comments about the AEM DataLayer, please open an issue on GitHub or message me on Twitter at @klcodanr. As the library matures, I will be building out more documentation and use cases on GitHub so please keep tuned!


2:26 PM Permalink

Dan Klco: #AEMRockstar 2017

Posted on Tuesday, March 28, 2017 By

If you haven’t heard already, Adobe Summit NA 2017 is in the books! As a result, we have the inaugural #AEMRockstar winner: Dan Klco from Perficient Digital.

The recorded session should be on the Summit online site for those that attended, with the presentations available for download as a PDF. Each finalist will also share their presentation in an upcoming guest post.

Final Results:

  1. Dan Klco, Perficient Digital
  2. Ruben Reusser,, Inc.
  3. Rima Mittal, Sapient
  4. Martin Fitch, Kaiser Permanente
    (tied) Robert Langridge, Dixons Carphone

Looking forward to next year.

1:29 PM Permalink