by Ursula Johnson

Created

May 5, 2017

Today’s Tips & Tricks guest post comes from Atish Jain, who is a Senior AEM Developer at SapientRazorfish (Publicis.Sapient) with over seven years of experience working with CMS based applications. Atish was a semi-finalist in this year’s AEM Rockstar competition. 

The tool I’m sharing is an R programming based utility to find gaps in renditions versus assets uploaded. This can be helpful in asserting the Bulk Migration, Longevity Tests success, Upload Activities, and Comparative Analysis in your AEM instance.

For those unfamiliar with R programming, it is a free open-source language and environment used for data manipulation, calculations, statistical computing, and graphical techniques useful to statisticians, analysts, data miners, and other researchers.  To learn more about R, visit r-project.org.

Trend analysis for upload vs. workflow completion and systems experience an increase in slowness with time. The stats can be analyzed to find missing assets reports and degradation in AEM server performance under continuous load. It works on the logs that are produced under crx-quickstart folder of AEM. Hence, there is no direct performance impact on the AEM instance. Also, reports can be generated over historical log files to produce and find comparative results, and do an analysis.

The utility helps you:

  • Analyze the exact count of missing Assets Renditions with the upload path that has been missed.
  • Conduct trend analysis for uploaded assets versus renditions generation. The pace of renditions generation can be calibrated for better insights for estimating activity timings and degradation factors.

The AEM logs are powerful and transformable to produce vital statistics. This utility, based on R programming language utilizes this power and generates metrics.

Here is how the utility works:

Step1: parses error.log(s) to subset log lines with date time – A

Step2: parses A to find log lines for upload trigger –B

Step3: parses A to find log lines for last rendition – C

Step4: merges B & C to create reports.

The concept detailed above can be enhanced into a more exhaustive application that can create extensive reports from AEM logs.

For example, the utility can be extended to generate more detailed graphical reports via the graphical plugin API available for R.

If you have any questions, you can contact Atish at ajain216@sapient.com. All opinions expressed by Atish Jain and are his own and not Adobe’s.


 

# R SCRIPT TO FIND ASSETS UPLOADED AND ANALYSE CORRESPONDING THE RENDITIONS GENERATION COUNT
#USER INPUTS
QUICKSTART_LOGS_DIR <- "D:/Atish/aem-rock/output/logs"
OUTPUT_DIR <- "D:/Atish/aem-rock/results/day1"
print_renditions_gap_flag <- TRUE
upload_report_print_flag <- TRUE
UPLOAD_TRIGGER_TXT_PATTERN <- "*EXECUTE_START*"
RENDITION_LOG_TXT_PATTERN<- "jcr:content/renditions/cq5dam.web.1280.1280.jpeg"
 
# DO NOT CHANGE THIS LINE
ERROR_LOGS_FILE_PATTERN <- "error\\.log\\.\\d\\d\\d\\d*"
 
renditions_gap <- 0
 
#LIST ALL ERROR LOG FILES UNDER crx-quickstart
error_log_files_list <- function(QUICKSTART_LOGS_DIR) {
setwd(QUICKSTART_LOGS_DIR)
error_file_list <- list.files(pattern = ERROR_LOGS_FILE_PATTERN)
error_file_list <- unlist(list(error_file_list,list.files(pattern = "error.log$")))
error_file_list
}


upload_report_calculation <- function(logs_dir){
setwd(logs_dir)
dataset_upload <-NULL
dataset_workflowstart <-NULL
error_log_combined <- NULL
 
for (file in error_file_list){
print(paste("Analysing log file : ", file, sep=""))
error_log_full_dataset <- NULL
error_log_subsetdate_dataset <- NULL
dataset_x <- NULL
error_log_full_dataset <- read.table(file, header=FALSE, quote="", fill=TRUE)
colnames(error_log_full_dataset) <- c("date", "time", "level", "type", "class" , "logtext1", "logtext2", "logtext3", "assetPath")
 
#Filter rows which contains date only and assign it to error_log_subsetdate_dataset
error_log_subsetdate_dataset <- subset(error_log_full_dataset, grepl("\\d\\d.\\d\\d.\\d\\d\\d\\d", date))
write.csv(file="dataset.csv", x=error_log_subsetdate_dataset)
 
#Filter rows which contains *EXECUTE_START* and */content/dam/*
#Refine the above dataframe to contain only asset upload trigger log.
upload_trigger <- subset(error_log_subsetdate_dataset, grepl(UPLOAD_TRIGGER_TXT_PATTERN, logtext2))
upload_trigger <- subset(upload_trigger, grepl("*/content/dam/*", class))
 
#Filter rows which contains string:jcr:content/renditions/cq5dam.web.1280.1280.jpeg.
rendition_generation <- subset(error_log_subsetdate_dataset, grepl(RENDITION_LOG_TXT_PATTERN, assetPath))
 
#concatenate the data and time columns of subset data frames
upload_trigger$datetime <- paste(as.Date(upload_trigger$date,format='%d.%m.%Y'), upload_trigger$time, sep=" ")
rendition_generation$datetime <- paste(as.Date(rendition_generation$date,format='%d.%m.%Y'), rendition_generation$time, sep=" ")
 
renditions_gap <- renditions_gap + (nrow(upload_trigger) - nrow(rendition_generation))
upload_trigger_df <- data.frame(sub('.*:','',sub('/jcr.*', '', upload_trigger$class)), upload_trigger$datetime)
colnames(upload_trigger_df) <- c("assetPath","upload_trigger.datetime")
write.csv(file="upload_trigger_df.csv", x=upload_trigger_df)
 
#Prepare renditions generation dataframe
rendition_gen_df <- data.frame(gsub('.{49}$', '', rendition_generation$assetPath), rendition_generation$datetime)
colnames(rendition_gen_df) <- c("assetPath","rendition_generation.datetime")
write.csv(file="rendition_gen_df.csv", x=rendition_gen_df)
dataset_x <- merge(upload_trigger_df,rendition_gen_df,'assetPath',all.x=TRUE)
 
#Create a new data frame with assetPath, upload, rendition generation timings
dataset_x$timeDiff <- as.POSIXlt(dataset_x$rendition_generation.datetime, "%d-%m-%Y %H:%M:%S") - as.POSIXlt(dataset_x$upload_trigger.datetime, "%d-%m-%Y %H:%M:%S")
 
filename <- paste(file, ".csv", sep="")
dataset_upload <- rbind(dataset_upload,dataset_x) 
}
 
return(dataset_upload)
}
 
print_rendtions_gap_report <- function(renditions_gap, print_renditions_gap_flag) {
if(print_renditions_gap_flag){
temp_var <- paste("Renditions gap vs uploaded assets: ", renditions_gap, sep="")
 
print(temp_var)
setwd(OUTPUT_DIR)
write(temp_var,file="Rsummary.txt",append=FALSE)
}
}

upload_report_print <- function(dataset_upload,upload_report_print_flag){
if(upload_report_print_flag){
setwd(OUTPUT_DIR)
row.has.na <- apply(dataset_upload, 1, function(x){any(is.na(x))})
uploadAsset <- dataset_upload[!row.has.na,]
write.csv(file="uploadAsset.csv", x=uploadAsset)
 
missingRenditions <- dataset_upload[row.has.na,]
write.csv(file="missingRenditions.csv", x=missingRenditions)
x11()
barplot(as.matrix(uploadAsset$timeDiff), main="Time-Diff Report", xlab="AssetsUploaded", ylab= "timeLag(sec)", beside=TRUE, col=rainbow(1))
dev.copy2pdf(file = "TimeDiffReport.pdf")
}
}
 
# Functions Execution
setwd(QUICKSTART_LOGS_DIR)
error_file_list <- error_log_files_list(QUICKSTART_LOGS_DIR)
dataset_upload <- upload_report_calculation(QUICKSTART_LOGS_DIR)
print_rendtions_gap_report(renditions_gap, print_renditions_gap_flag)
upload_report_print(dataset_upload,upload_report_print_flag)