Have you ever come into a situation where you’re up and running on AEM 6 using the default setup with the segment store containing everything – content & large binaries – and you wish you would have thought to split them out into a FileDataStore (or S3) after the fact? Most of the time you probably have architected the solution this way from the start. However, there are probably times you want to eek out every bit of performance optimizations that you can after initial rollout.
Enter oak-upgrade (and its Adobe sibling, crx2oak)
As most should know already, the default Oak architecture you get with AEM 6 stores the binary content (files like images, videos, pdfs, etc) within the repository tar files. Look in crx-quickstart/repository/segmentstore for examples or read up on the structure here: http://jackrabbit.apache.org/oak/docs/nodestore/segment/tar.html or check out Francesco Mari’s analysis here: http://francescomari.github.io/2015/05/04/looking-into-tar-files-in-oak.html.
Now for most Sites use cases where your authors create alot of content and occasionally upload a pdf or image, this setup is just fine. However, if you want the best performance and the ability to share your data store with other instances, then you should go with an external data store like FileDataStore which we’ll focus on here.
disclaimer: not responsible for any damage you cause to your instances. Please test thoroughly on non-production instances before considering attempting this on a working production instance.
Before we start, you should update your Oak version on AEM 6 to the latest Adobe supplied version. As of this writing, oak-upgrade (1.6) needed either 1.0.12+ (AEM 6.0) or 1.1.7+ (AEM6.1) to migrate segment format V_11 repositories. This is important.
You’re going to need alot of disk capacity during the migration as the process creates a new datastore directory with all the data as well as a new repository directory and segment store. You are doubling your disk needs temporarily as the end result is basically two copies of your repo.
Depending on the size of your repo, the migration may take a considerable amount of time. In addition to the binary data being copied, some of the Oak indexes will most likely need to be re-indexed (automatically).
Backup all data before starting.
- Download the latest version of oak-upgrade from Apache. I built my version (1.6-SNAPSHOT) from source using maven.
- Create a parallel directory to house your new repository directory (eg. “repository_new”)
- Copy your oak-upgrade jar to the base install directory – parallel to the main aem quickstart jar.
- Run this command:
java -jar oak-upgrade.jar –copy-binaries –src-datastore=crx-quickstart/repository/segmentstore –datastore=crx-quickstart/datastore crx-quickstart/repository crx-quickstart/repository_new –mmap[your paths may vary]
- The output of this command should be bunch of RepositorySidegrade INFO messages and a new repo directory with a segmentstore with tar files and a datastore directory with a bunch of two character long directories.
- Rename the existing repository directory to “repository.migrated” (or something else).
- Rename the newly populated repository to “repository” from whatever you set in step 4.
Note: I realize that you can point AEM to whatever repository.home you wish via OSGi configuration, but this just seems easier (and faster).
- Create a text file named org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.cfg in crx-quickstart/install. Add the following content to the file:
(or whatever you set your path to)
- Create a text file named org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg in crx-quickstart/install. Add the following content to the file (Note: this is also where you can change the repository.home path (not shown here)):
- Start up AEM like usual and monitor error.log for any anomalies.
I ran the migration on a pretty vanilla AEM 6 (Oak 1.0.27) instance (with OOTB content) and the whole process went very quickly. The oak-upgrade command took less than 30 seconds. Startup took around 2 minutes to complete and I have yet to find anything wrong with the new system.