AEM 6 – migrating to FileDataStore from embedded segmentstore binaries

Posted on Wednesday, March 16, 2016 By

Have you ever come into a situation where you’re up and running on AEM 6 using the default setup with the segment store containing everything – content & large binaries – and you wish you would have thought to split them out into a FileDataStore (or S3) after the fact? Most of the time you probably have architected the solution this way from the start. However, there are probably times you want to eek out every bit of performance optimizations that you can after initial rollout.

Enter oak-upgrade (and its Adobe sibling, crx2oak)


As most should know already, the default Oak architecture you get with AEM 6 stores the binary content (files like images, videos, pdfs, etc) within the repository tar files. Look in crx-quickstart/repository/segmentstore for examples or read up on the structure here: or check out Francesco Mari’s analysis here:

Now for most Sites use cases where your authors create alot of content and occasionally upload a pdf or image, this setup is just fine. However, if you want the best performance and the ability to share your data store with other instances, then you should go with an external data store like FileDataStore which we’ll focus on here.

Getting Started

disclaimer: not responsible for any damage you cause to your instances. Please test thoroughly on non-production instances before considering attempting this on a working production instance.

Before we start, you should update your Oak version on AEM 6 to the latest Adobe supplied version. As of this writing, oak-upgrade (1.6) needed either 1.0.12+ (AEM 6.0) or 1.1.7+ (AEM6.1) to migrate segment format V_11 repositories. This is important.

You’re going to need alot of disk capacity during the migration as the process creates a new datastore directory with all the data as well as a new repository directory and segment store. You are doubling your disk needs temporarily as the end result is basically two copies of your repo.

Depending on the size of your repo, the migration may take a considerable amount of time. In addition to the binary data being copied, some of the Oak indexes will most likely need to be re-indexed (automatically).

The Steps

Backup all data before starting.

  1. Download the latest version of oak-upgrade from Apache. I built my version (1.6-SNAPSHOT) from source using maven.
  2. Create a parallel directory to house your new repository directory (eg. “repository_new”)
  3. Copy your oak-upgrade jar to the base install directory – parallel to the main aem quickstart jar.
  4. Run this command:
    java -jar oak-upgrade.jar –copy-binaries –src-datastore=crx-quickstart/repository/segmentstore –datastore=crx-quickstart/datastore crx-quickstart/repository crx-quickstart/repository_new –mmap[your paths may vary]
  5. The output of this command should be bunch of RepositorySidegrade INFO messages and a new repo directory with a segmentstore with tar files and a datastore directory with a bunch of two character long directories.
  6. Rename the existing repository directory to “repository.migrated” (or something else).
  7. Rename the newly populated repository to “repository” from whatever you set in step 4.
    Note: I realize that you can point AEM to whatever repository.home you wish via OSGi configuration, but this just seems easier (and faster).
  8. Create a text file named org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.cfg in crx-quickstart/install. Add the following content to the file:

    (or whatever you set your path to)

  9. Create a text file named org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg in crx-quickstart/install. Add the following content to the file (Note: this is also where you can change the repository.home path (not shown here)):
  10. Start up AEM like usual and monitor error.log for any anomalies.


I ran the migration on a pretty vanilla AEM 6 (Oak 1.0.27) instance (with OOTB content) and the whole process went very quickly. The oak-upgrade command took less than 30 seconds. Startup took around 2 minutes to complete and I have yet to find anything wrong with the new system.


9:27 AM Permalink

MongoDB + AEM Production Servers

Posted on Tuesday, September 16, 2014 By


I response to my previous post, here is the basics of a sharded cluster MongoDB setup. As you can see, if you were to place each element on its own server you’d end up with alot of servers.

Sharded Mongo

After a bit of research into how each mongo element is used and its performance characteristics, I arrived at the following sleek(er) configuration of servers. This is arrangement is symmetric to allow for cross data center architectures.



This post was heavily influenced by a concise Will Burton post:

12:48 PM Permalink

AEM6 + Mongo Minimal Setup

Posted on Tuesday, July 22, 2014 By

With the release of AEM 6.0, clustering has been rethought in the form of Apache Jackrabbit “Oak” and the introduction to the concept of microkernels. The initial microkernel that supports clustering uses MongoDB (v. 2.6+) for persistence. This post will provide a real working example of a minimal authoring cluster using a fault-tolerant MongoDB microkernel.

If I receive enough demand, I can also post a production quality MongoDB setup.

What you’ll need:

  • MongoDB installed
  • AEM 6.0 quickstart jar
  • Oracle JRE 1.7

The examples provided were tested using an OS X host but should be compatible on Windows as well as other Unix hosts.

Prep Work

  1. Install the MongoDB binaries
    (I used homebrew: sudo brew install mongodb)
  2. Install the Oracle JDK/JRE v 1.7 (if you haven’t already)
  3. Create a directory for housing your mongo data. I used “aem6_cluster”.
  4. Inside that directory, create these directories: node0,node1,arb0
    The two node directories will be the data directories for two members of a replica set. The arb directory is the data directory for an arbiter.
  5. From the main cluster directory, create another folder named “author”.
  6. Copy your aem 6 quickstart jar and license file to that directory. (Rename the quickstart to this if needed: aem6-author-p4502.jar)

MongoDB Logo

MongoDB setup

  1. From the main cluster directory run this mongo command (from terminal or command prompt):
     mongod --dbpath node0/ --replSet aem6
  2. Enter the mongo util in another terminal window:
  3. Initialize the replica set named in step 7:
     rs.initiate(rsconf = {_id: "aem6", members: [{_id: 0, host: "<hostname>:27017"}]})
  4. After the command is issued, it should return a json response with “ok”: 1. Keep this terminal window open.
  5. From the main cluster directory run this mongo command (from a new terminal or command prompt):
     mongod --dbpath node1/ --replSet aem6 --port 27018
  6. Back in the mongo util window that we kept open, issue this command to add the new node to the replica set:
  7. Once again, if the command executes correctly you should get a json response with “ok” : 1. Keep this window open (again).
  8. From the main cluster directory run this mongo command (from a new terminal or command prompt):
     mongod --port 27019 --dbpath arb0 --replSet aem6
  9. Back in the mongo util window that we kept open, issue this command to add this new node as an arbiter:

At this point we have a clean mongo replica set with 3 members. There should be a PRIMARY and SECONDARY with an aribter node to break up any ties for voting who is the PRIMARY. The two nodes provide minimal redundancy. If one goes down, the other takes over. If both go down, you’re up a creek.

AEM Logo

AEM 6 startup

From the author directory, issue this command:

 java -XX:MaxPermSize=512M -mx4g -jar aem6-author-p4502.jar -r crx3,crx3mongo -Doak.mongo.uri=mongodb://<hostname>:27017,<hostname>:27018

AEM should install and replicate automatically to the mongo nodes. Installation on pretty quick SSD-based mac takes less that 5 minutes (results may vary).

You can now repeat this process on any number of AEM instances as long as they are running on separate ports. Each AEM instance will be active+active in the cluster.

Note the run modes: crx,crx3mongo. The JVM settings can be left out – I just like to dedicate a bit more RAM.

Mongo Notes

To test out the fault tolerance, just ctl+c the mongo instance and watch the AEM driver switch to the elected PRIMARY.

Mongo replica sets must have an odd number of members. An arbiter is not required, but it allows you to have nicely divisible replica sets.

1:00 PM Permalink

Static multi-image component with image preview

Posted on Wednesday, November 27, 2013 By

I recently got a requirement in which I had to build a component in which the user can Drag and Drop 2 main images (e.g.: hero) and 3 sub images ( icons,buttons,logos etc). The main requirement was being able to preview these images while the author is creating content for that page.

I got it working after few trials and I am sharing the same.

Environment: CQ561

1. Create a single image component extending the OOTB (xtype=’html5smartimage’) image component. This should be fairly simple and if you need any information on this leave a comment and I can help.

Screen Shot 2013-11-27 at 4.37.12 PM


2. Now create another tab in your component structure by copy/pasting your current working image cq:Panel. For E.g.: in my case I copied /apps/blog/components/content/hero/dialog/items/items/imagetab1 and pasted it under /apps/blog/components/content/hero/dialog/items/items/ to create /apps/blog/components/content/hero/dialog/items/items/imagetab2

Screen Shot 2013-11-27 at 4.42.23 PM


3. Next to avoid confusion I also renamed the image cq:Widget at /apps/blog/components/content/hero/dialog/items/items/imagetab2/items/image to /apps/blog/components/content/hero/dialog/items/items/imagetab2/items/image4

Screen Shot 2013-11-27 at 4.44.59 PM


4. Now we will have to update the properties under image4 cq:Widget (/apps/blog/components/content/hero/dialog/items/items/imagetab2/items/image4) to make this work.


table { }td { padding-top: 1px; padding-right: 1px; padding-left: 1px; color: black; font-size: 12pt; font-weight: 400; font-style: normal; text-decoration: none; font-family: Calibri,sans-serif; vertical-align: bottom; border: medium none; white-space: nowrap; }

Property Name New Value
cropParameter ./image4/imagecrop
fileNameParameter ./image4/fileName
fileReferenceParameter ./image4/fielReference
name ./image4/image4file
rotateParameter ./image4/imageRotate

Screen Shot 2013-11-27 at 5.12.56 PM


5. Now when you go back to your content page you should have 2 different tabs and both having the image drag and drop functionality. You can extend this concept for more images as long as you make sure the properties are updated correctly for the image cq:Widget.

Screen Shot 2013-11-27 at 5.19.22 PM                                            Screen Shot 2013-11-27 at 5.19.11 PM

6:28 PM Permalink

Installing AEM from command line without sample content (Geometrixx)

Posted on Wednesday, November 27, 2013 By

When you install a new instance of Adobe Experience Manager a bunch of content is provided as Sample content. By sample content I am referring to the Geometrixx sites (Geometrixx, Geometrixx Mobile, Geometrixx Outdoors and Geometrixx Outdoors Mobile) that are provided as samples and references. They can be removed by uninstalling packages available in AEM. It is also a recommendation to Uninstall example content and users as part of the Adobe Experience Manager 5.6.1 Security Checklist specially in your Production environment.

The good part is that now we also have a  run mode option for installing or not installing the sample content while installing CQ from command line. These modes allow you to control the use of sample content. The sample content is defined before the quickstart is built and can include packages, configurations, etc:

Installation run modes are provided out-of-the-box:

  • author
  • publish


  • samplecontent
  • nosamplecontent


Environment in which I have successfully used it had:

  • Oracle Linux
  • java version “1.6.0_65”
  • 64 bit VM


These are two pairs of mutually exclusive run modes; for example, you can:

  • define either author or publish, not both at the same time
  • combine author with either samplecontent or nosamplecontent (but not both)


Command line installation:


  • You will need to configure your -XX:MaxPermSize and Xmx based on 32bit or 64bit VM and how much memory you can use.
  • I have renamed my CQ 561 jar to cq56-author-p4502.jar for author instance and cq56-publish-p4503.jar for publish instance.

Author instance installation command:

java -server -XX:MaxPermSize=256m -Xmx1024M -jar cq56-author-p4502.jar -r author, nosamplecontent

Publish instance installation command:

java -server -XX:MaxPermSize=256m -Xmx1024M -jar cq56-publish-p4503.jar -r publish, nosamplecontent

Once CQ gets installed you verify that none of the Geometrixx related content should have got installed.

5:21 PM Permalink