Posts in Category "PDFG Generator"

Using LiveCycle to programmatically split the PDF documents – Part 1

- Khushwant Singh, Content and Community Lead @ Adobe

A discussion on the Adobe forums indicates that many LiveCycle users are trying to figure out how to split a PDF file programmatically. Adobe LiveCycle provides a simple way to do this with the LiveCycle Assembler service. You can split PDF files by bookmark or by page number.

To split PDF documents, you need:

  • A DDX document (for more information about DDX, see the Adobe LiveCycle DDX reference guide)
  • Source documents
  • Access to a running instance of Adobe LiveCycle

You can write a custom DDX document suited to your requirements. Some of the most commonly requested DDX documents are:

DDX for splitting PDF document using bookmarks

In the following sample DDX, the LiveCycle Assembler service generates a separate document for each level 1 bookmark in the source document (doc1.pdf in this example). The Assembler service names each document by concatenating the following items:

  • A string specified by the prefix attribute
  • A 6-digit sequence number (This number could be used to re-create the original order of the pages after the document is disassembled.)
  • The bookmark title
  • The filename extension .pdf

<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDFsFromBookmarks prefix="stmt">
<PDF source="doc1.pdf"/>
</PDFsFromBookmarks>
</DDX>
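
The 6-digit sequence number in each generated name can be used to put the split documents back into their original order. The following is a minimal sketch of that idea, not part of the LiveCycle client API. BookmarkSplitNames and its methods are hypothetical names, and the code assumes that the sequence number is the first run of six digits after the prefix. Check the names your server actually returns before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the LiveCycle client libraries.
final class BookmarkSplitNames {

    private static final Pattern SIX_DIGITS = Pattern.compile("\\d{6}");

    // Assumption: the sequence number is the first run of six digits after the prefix.
    static int sequenceOf(String prefix, String name) {
        if (!name.startsWith(prefix)) {
            throw new IllegalArgumentException("Name does not start with prefix: " + name);
        }
        Matcher m = SIX_DIGITS.matcher(name.substring(prefix.length()));
        if (!m.find()) {
            throw new IllegalArgumentException("No 6-digit sequence number in: " + name);
        }
        return Integer.parseInt(m.group());
    }

    // Returns the names sorted by their embedded sequence number.
    static List<String> inOriginalOrder(String prefix, Iterable<String> names) {
        List<String> sorted = new ArrayList<>();
        for (String n : names) {
            sorted.add(n);
        }
        sorted.sort(Comparator.comparingInt(n -> sequenceOf(prefix, n)));
        return sorted;
    }

    public static void main(String[] args) {
        // Illustrative names only; real names come from the Assembler result map.
        List<String> names = Arrays.asList(
                "stmt000003Summary.pdf", "stmt000001Overview.pdf", "stmt000002Details.pdf");
        System.out.println(inOriginalOrder("stmt", names));
    }
}
```

Sorting on the number rather than on the whole name keeps the order stable even when bookmark titles would sort differently.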

DDX for splitting PDF document using page numbers

In this sample DDX, the LiveCycle Assembler service generates one document for each listed page number of the source document. The Assembler service names each document according to the result attribute specified in the DDX.

<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDF result="Final.pdf">
<PDF source="PDF1.pdf" pages="1"/>
</PDF>
<PDF result="Final2.pdf">
<PDF source="PDF1.pdf" pages="2"/>
</PDF>
</DDX>

DDX for splitting PDF document using the page range

In this sample DDX, the LiveCycle Assembler service generates a document from the specified range of pages. The Assembler service names the document according to the result attribute specified in the DDX.

<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDF result="Final.pdf">
<PDF source="PDF1.pdf" pages="1-5"/>
</PDF>
</DDX>
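
When a document has many pages, writing one result element per page or per range by hand gets tedious. The sketch below generates a DDX of the same shape as the page-number and page-range samples. PageRangeDdx and its parameters are hypothetical. The code assumes that the caller already knows the page count and that the file names contain no characters that need XML escaping.

```java
// Hypothetical helper that writes a page-range DDX; names are illustrative only.
final class PageRangeDdx {

    // Builds one <PDF result=...> element for every block of pagesPerFile pages.
    static String build(String source, int pageCount, int pagesPerFile, String resultPrefix) {
        if (pageCount < 1 || pagesPerFile < 1) {
            throw new IllegalArgumentException("pageCount and pagesPerFile must be positive");
        }
        StringBuilder ddx = new StringBuilder();
        ddx.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        ddx.append("<DDX xmlns=\"http://ns.adobe.com/DDX/1.0/\">\n");
        int part = 1;
        for (int first = 1; first <= pageCount; first += pagesPerFile, part++) {
            int last = Math.min(first + pagesPerFile - 1, pageCount);
            // A single page is written as "3", a range as "1-5".
            String pages = (first == last) ? String.valueOf(first) : first + "-" + last;
            ddx.append("<PDF result=\"").append(resultPrefix).append(part).append(".pdf\">\n");
            ddx.append("<PDF source=\"").append(source)
               .append("\" pages=\"").append(pages).append("\"/>\n");
            ddx.append("</PDF>\n");
        }
        ddx.append("</DDX>\n");
        return ddx.toString();
    }

    public static void main(String[] args) {
        // 12 pages in blocks of 5 gives three results: pages 1-5, 6-10 and 11-12.
        System.out.print(build("PDF1.pdf", 12, 5, "Final"));
    }
}
```

Calling build with pagesPerFile set to 1 produces the one-document-per-page form.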

DDX for splitting PDF documents using page range from different PDF documents and creating a single resultant PDF document

In the following sample DDX, the LiveCycle Assembler service extracts the specified page ranges from multiple documents and generates a single output document.

<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/">
<PDF result="Final.pdf">
<PDF source="PDF1.pdf" pages="1-3"/>
<PDF source="PDF2.pdf" pages="4-5"/>
</PDF>
</DDX>

Sample program to split a PDF document
Let us write a simple Java program to split a PDF document into multiple documents. To download the resources used in this sample program, click here.

Complete the following steps:

  1. Create a new file and add the following code to the file:
    <DDX xmlns="http://ns.adobe.com/DDX/1.0/">
    <PDFsFromBookmarks prefix="Readme">
    <PDF source="AssemblerResultPDF.pdf"/>
    </PDFsFromBookmarks>
    </DDX>

    For this example, save the XML file as shell_disassemble.xml.
  2. Create a new Java project and add shell_disassemble.xml to the project.
  3. Add the following libraries to your project. These libraries are required to invoke the Assembler service in SOAP mode:
    • adobe-assembler-client.jar
    • adobe-livecycle-client.jar
    • adobe-usermanager-client.jar
    • adobe-utilities.jar
    • jbossall-client.jar (use a different JAR file if LiveCycle ES is not deployed on JBoss)
    • activation.jar
    • axis.jar
    • commons-codec-..jar
    • commons-collections-..jar
    • commons-discovery.jar
    • commons-logging.jar
    • dom-xml-apis-.jar
    • jaxen-.-beta-jar
    • jaxrpc.jar
    • log4j.jar
    • mail.jar
    • saaj.jar
    • wsdl4j.jar
    • xalan.jar
    • xbean.jar
    • xercesImpl.jar
  4. Create a new class named DisassemblePDFSOAP.
  5. Add the source PDF file to the project. I have used AssemblerResultPDF.pdf.
  6. Add the following code to the class:
    import com.adobe.livecycle.assembler.client.*;
    import java.util.*;
    import java.io.*;
    import com.adobe.idp.Document;
    import com.adobe.idp.dsc.clientsdk.ServiceClientFactory;
    import com.adobe.idp.dsc.clientsdk.ServiceClientFactoryProperties;

    public class DisassemblePDFSOAP {
        public static void main(String[] args) {
            try {
                // Set connection properties required to invoke LiveCycle ES2
                Properties connectionProps = new Properties();
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_DEFAULT_SOAP_ENDPOINT, "http://10.40.18.95:8080");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_TRANSPORT_PROTOCOL, ServiceClientFactoryProperties.DSC_SOAP_PROTOCOL);
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_SERVER_TYPE, "JBoss");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_USERNAME, "administrator");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_PASSWORD, "password");

                // Create a ServiceClientFactory instance
                ServiceClientFactory myFactory = ServiceClientFactory.createInstance(connectionProps);

                // Create an AssemblerServiceClient object
                AssemblerServiceClient assemblerClient = new AssemblerServiceClient(myFactory);

                // Create a Document object based on the DDX file
                FileInputStream myDDXFile = new FileInputStream("E:\\workspace\\disassemble\\src\\shell_disassemble.xml");
                Document myDDX = new Document(myDDXFile);

                // Create a Map object to store the PDF source document
                Map inputs = new HashMap();
                FileInputStream mySourceMap = new FileInputStream("E:\\workspace\\backup\\disassemble\\src\\AssemblerResultPDF.pdf");

                // Create a Document object based on the source PDF file
                Document myPDFSource = new Document(mySourceMap);

                // Place the source document into the Map object;
                // the key matches the source attribute used in the DDX
                inputs.put("AssemblerResultPDF.pdf", myPDFSource);

                // Create an AssemblerOptionSpec object
                AssemblerOptionSpec assemblerSpec = new AssemblerOptionSpec();
                assemblerSpec.setFailOnError(false);

                // Submit the job to the Assembler service
                AssemblerResult jobResult = assemblerClient.invokeDDX(myDDX, inputs, assemblerSpec);
                Map allDocs = jobResult.getDocuments();

                // Iterate through the Map object and save every result PDF document
                int index = 0;
                for (Iterator i = allDocs.entrySet().iterator(); i.hasNext();) {
                    Map.Entry e = (Map.Entry) i.next();
                    Object o = e.getValue();

                    // Save only the entries that are Document objects
                    if (o instanceof Document) {
                        Document outDoc = (Document) o;
                        File myOutFile = new File("E:\\disassemble\\SplitPDF" + index + ".pdf");
                        outDoc.copyToFile(myOutFile);
                        index++;
                    }
                }

                if (index > 0)
                    System.out.println("The PDF document was disassembled into " + index + " PDF documents.");
                else
                    System.out.println("The PDF document was not disassembled.");

            } catch (Exception e) {
                System.out.println("Error OCCURRED: " + e.getMessage());
                e.printStackTrace();
            }
        }
    }

  7. Modify the locations mentioned in the sample code to match the file paths on your machine.
  8. Run the code.
  9. The code splits the file into multiple PDF documents based on the bookmarks or the page numbers specified in the DDX.
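
In the sample program, the key used in the input map ("AssemblerResultPDF.pdf") is the same string as the source attribute in the DDX. A mismatch between the two is an easy mistake to make when you adapt the paths, so a quick check before submitting the job can help. The following is a minimal sketch that uses only the standard Java XML parser. DdxSourceCheck and its methods are hypothetical names, and the check only looks at literal source attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical pre-flight check; not part of the LiveCycle client libraries.
final class DdxSourceCheck {

    // Collects the value of every source attribute found in the DDX.
    static Set<String> sourceNames(String ddxXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            org.w3c.dom.Document dom = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(ddxXml.getBytes(StandardCharsets.UTF_8)));
            NodeList all = dom.getElementsByTagName("*");
            Set<String> names = new LinkedHashSet<>();
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                if (e.hasAttribute("source")) {
                    names.add(e.getAttribute("source"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse DDX", e);
        }
    }

    // Returns the source names that have no matching key in the input map.
    static Set<String> missingInputs(String ddxXml, Set<String> inputKeys) {
        Set<String> missing = sourceNames(ddxXml);
        missing.removeAll(inputKeys);
        return missing;
    }

    public static void main(String[] args) {
        String ddx = "<DDX xmlns=\"http://ns.adobe.com/DDX/1.0/\">"
                + "<PDFsFromBookmarks prefix=\"Readme\">"
                + "<PDF source=\"AssemblerResultPDF.pdf\"/>"
                + "</PDFsFromBookmarks></DDX>";
        // An empty result means every source in the DDX has an input document.
        System.out.println(missingInputs(ddx, Collections.singleton("AssemblerResultPDF.pdf")));
    }
}
```

In the sample program you could pass inputs.keySet() as the second argument just before the call to invokeDDX.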

This is the first post in a series about programmatically splitting PDF documents. In this post, I have shared sample code to split a PDF document using bookmarks. In the follow-up posts, I will include sample code to split PDF documents using:

  • Page numbers
  • Page range
  • Pages from different PDF documents and generate a single output document

Debugging LiveCycle – Working with logs (Part 2)

- Ankush Kumar, Lead Software Engineer @ Adobe

In Debugging LiveCycle – Working with logs (Part 1), we covered how to handle logs at the application server level. In this post, we cover a few areas where you can fine-tune logging in the applications themselves.

LCM Logs

As you might have noticed, LCM (LiveCycle Configuration Manager) logs are found at <LiveCycle Installation Location>/configurationManager/log. The default logging level is INFO. Logging is governed by a properties file inside adobe-lcm.jar: \com\adobe\livecycle\lcm\logging\log.properties.

Using this property file, you can:

  • Change the logging level
  • Define the log file location and file name
  • Define the rotation policy

If you want to keep this properties file in a more convenient location, modify <LiveCycle Installation Location>/configurationManager/bin/ConfigurationManager.bat and specify the following system property:

-Djava.util.logging.config.file=<path to file>
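
Because the override uses the standard java.util.logging.config.file property, the file presumably follows the usual java.util.logging property syntax. The sketch below loads a small configuration from memory and reads back what the JDK understood, which is a quick way to try out a level or rotation setting before editing the real file. The keys shown are standard JDK keys and are an assumption here; they are not copied from the log.properties file inside adobe-lcm.jar. LcmLoggingConfigSketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.Logger;

// Hypothetical sketch; the property values are examples, not LiveCycle defaults.
final class LcmLoggingConfigSketch {

    static final String SAMPLE =
            ".level=FINE\n"                                            // logging level
            + "java.util.logging.FileHandler.pattern=%h/lcm%g.log\n"   // file location and name
            + "java.util.logging.FileHandler.limit=1000000\n"          // rotate after about 1 MB
            + "java.util.logging.FileHandler.count=5\n";               // keep five files

    // Loads the given properties into the JDK LogManager and returns the root logger level.
    static Level rootLevelAfterLoading(String properties) {
        try {
            LogManager.getLogManager().readConfiguration(
                    new ByteArrayInputStream(properties.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Logger.getLogger("").getLevel();
    }

    public static void main(String[] args) {
        System.out.println("Root level: " + rootLevelAfterLoading(SAMPLE));
        System.out.println("Files kept: "
                + LogManager.getLogManager().getProperty("java.util.logging.FileHandler.count"));
    }
}
```

No handler is registered in this sketch, so running it does not create any log files.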

Generating ORB Trace

While working with natives such as XMLForms, you can sometimes run into issues where an application terminates abnormally. The following parameters generate extra trace information for debugging such issues.

Pass them as arguments to the native application:

-ORBtraceLevel 25 -ORBtraceThreadId 1 -ORBtraceInvocations 1 -ORBtraceInvocationReturns 1 -ORBtraceTime 1 -ORBtraceFile <Path to log file>

Also, when debugging an issue related to native applications, you can see in the System Out logs which natives are invoked and that a large IOR is passed to them as input. This IOR can be analyzed with any of the readily available IOR parsers, which a web search will turn up. This can be a first step toward debugging a native-related problem.

Variable Logging

To help you understand and debug an orchestration, LiveCycle offers an excellent process debugging feature. Using Workbench, you can trace every step of a process and see the exact value each variable holds. For more information, see this blog post:

http://blogs.adobe.com/shwetank/2011/11/21/process-recording-feature-of-livecycle-workbench/

Sometimes, however, this is difficult because of environment constraints and performance overhead. In that case, you may want to introduce a step that logs the current state of all variables, either to the System Out log or to a log of your choice.

You can do this with the Variable Logger service, which you add while designing the orchestration. Each time the orchestration runs, the values of the variables are logged when that step is executed.

Other Application Logging Locations

Content Services and CMSA Logs

Content Services and CMSA logs are created in the working directory of the application server.

LiveCycle Installer Logs

Installer logs can be found in the following two locations:

  • <LiveCycle Installation Home>
  • <LiveCycle Installation Home>/logs

Service Pack Logs

Service pack logs can be found at <LiveCycle Installation Home>/patch/<Patch Name>/log.

CRX and Correspondence Management Logs

From ES3 onwards, you will find CRX and CM logs at <CRX Repository Directory>/logs. (More on this will be covered in the next part of this series.)

PDFG Configuration Logs

  • PDFG System Readiness Testing Logs:  <LiveCycle Installation Home>/pdfg_srt/reports
  • PDFG Config Logs: <LiveCycle Installation Home>/logs

LiveCycle PDF Generator – Tips and Tricks

- Saurabh Kumar Singh, Computer Scientist at Adobe

Following are a few tips and workarounds for LiveCycle PDFG. Please note that the workarounds marked as unsupported are not officially supported by Adobe.

  • [Unsupported] On UNIX servers, customers can use 64-bit OpenOffice for OpenOffice-based conversions. The obvious benefit is improved performance. To achieve this, point JAVA_HOME_32 to a 64-bit version of Java. The same can be done on Windows too, but you may observe immediate conversion failures for other native file formats.
  • [Unsupported] Any file that can be opened by Acrobat (such as a text file) can be converted to PDF using LiveCycle PDF Generator. You just need to add the file extension (for example, txt for text files) to the comma-separated list in the XPS to PDF file-type setting.
  • A user or administrator can jump directly to the PDF Generator UI by opening http(s)://<server-name>:<port>/pdfgui. This saves a couple of clicks on the way to the PDF Generator user interface.

Watch this space for many more upcoming tips and tricks.


LiveCycle PDF Generator – HTML to PDF

- Saurabh Kumar Singh, Computer Scientist at Adobe

LiveCycle PDF Generator supports HTML to PDF conversion. The HTML input can be provided in any of the following forms:

  • Submit an HTML file to be converted to PDF.
  • Provide the http(s) URL of the HTML page to be converted to PDF.
  • Submit a ZIP file containing an entire website (the ZIP should contain index.html at the top level) to create a PDF.
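
For the ZIP option, the one structural requirement stated above is that index.html sits at the top level of the archive. The following sketch builds such an archive in memory with java.util.zip and verifies the entry name before submission. SiteZipSketch and its methods are hypothetical names, and the file contents are placeholders.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; not part of the LiveCycle client libraries.
final class SiteZipSketch {

    // Zips the given files (entry name -> content) and insists on a top-level index.html.
    static byte[] zip(Map<String, String> files) {
        if (!files.containsKey("index.html")) {
            throw new IllegalArgumentException("index.html must be at the top level of the ZIP");
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> f : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(f.getKey()));
                zip.write(f.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns true if the archive has an entry named exactly "index.html".
    static boolean hasTopLevelIndex(byte[] zipBytes) {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                if (e.getName().equals("index.html")) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> site = new LinkedHashMap<>();
        site.put("index.html", "<html><body><a href=\"pages/about.html\">About</a></body></html>");
        site.put("pages/about.html", "<html><body>About</body></html>");
        System.out.println("Top-level index.html present: " + hasTopLevelIndex(zip(site)));
    }
}
```

A common mistake is zipping the parent folder, which puts index.html one level down (for example site/index.html); the check above would reject that layout.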

When submitting an HTML input, the user can specify a variety of options, such as:

  • The level to which spidering is performed
  • Whether to get the entire site or not
  • Stay on the same path (in terms of URL) while fetching the HTML documents
  • Stay on the same server. This is useful when you have specified a spidering level greater than 1 but do not want PDFs created from linked HTML documents that are hosted on a different server.
  • PDF page size and margin options
  • Add bookmarks
  • Enable tagging
  • Set initial view settings, such as which page to show when the PDF is opened

LiveCycle ES2 PDF Generator and later lets you specify Adobe Acrobat Professional as the fallback for creating PDF files. A downside of this fallback is that Acrobat-based conversion is single-threaded, whereas LiveCycle PDFG-based conversions are multi-threaded. Also, Acrobat Professional does not honor the options mentioned above. This facility is available only on Windows. A LiveCycle administrator can also configure the Generate PDF service to always prefer the Acrobat route. To do this, navigate to Home > Services > Applications and Services > Service Management > Configure GeneratePDFService and set the “Use Acrobat WebCapture (Windows Only)” option to True.

What’s New in ES3

The HTML to PDF engine creates high-fidelity PDF documents. The time taken to create the best-quality PDF document may seem long to some users. For some users, a lower-quality PDF is acceptable if the conversion is faster.

LiveCycle ES3 introduces a new conversion engine to address this and provide a quick turnaround time for conversions. This engine is available on all platforms supported by LiveCycle. Moreover, it honors all the conversion options mentioned above. As a bonus, you can specify header and footer text to be placed in the generated PDF document. This engine acts as the fallback for the high-quality HTML to PDF engine on UNIX machines. To set this new engine as the preferred route, navigate to Home > Services > Applications and Services > Service Management > Configure GeneratePDFService and set the “Use ICEBrowser based Html to PDF” option to True.

Content Services: Dependencies for indexing different types of content

Indexing of content in LiveCycle Content Services 9 depends on different LiveCycle ES2 components and services. Here are a few important prerequisites:

  • Indexing of PDF files (except for dynamic PDF forms) requires the Assembler service, which is part of all LiveCycle ES2 installations.
  • Indexing of dynamic PDF files requires LiveCycle Output 9. If Output is not installed, the FormDataIntegration service, available on all LiveCycle ES2 installations, is used instead. However, in such cases, for dynamic PDFs created in Acrobat, only the form data is indexed. The form design is left unindexed.
  • Indexing of Microsoft Word 2007/2010 files (.docx) requires PDF Generator 9 (the GeneratePDF service).

Additionally, files protected by LiveCycle Rights Management 9 are not indexed.


Generating a PDF from any application that supports printing

Adobe LiveCycle PDF Generator ES Update 1 (8.2) introduced a new feature called the PDF Generator ES IPP Client, which allows you to generate a PDF from any application that supports printing. The feature is essentially a print driver that prints to PDF Generator ES. After the print driver is installed on a user’s computer, “Adobe LiveCycle PDF Generator ES” appears in the user’s list of available printers. Printing to that printer from any application sends the document (in PostScript format) to PDF Generator ES. LiveCycle PDF Generator ES then converts the PostScript file to PDF and sends the PDF file to the user as an attachment to an email message.

Note: The PDF Generator ES IPP Client is only supported on 32-bit versions of Windows XP, Windows 2000, Windows Server 2003, and Windows Vista.

Here are the steps required to get this feature working:
1. Install and configure LiveCycle PDF Generator ES.
2. Log into LiveCycle Administration Console, click Services > Applications and Services > Service Management, and find provider.email_sendmail_service. Click the service name and ensure that the Configuration tab is filled correctly. This is where you specify the information that LiveCycle uses to send the email messages.
3. Ensure that your users are configured with a valid email address in the LiveCycle database and assign the PDFGUserPermission to each user. (See Managing Users and Groups and Managing Roles in the LiveCycle User Management Help.)
4. Install and configure the print driver on your users’ computers. For instructions on installing the print driver, see “Installing the IPP client” in your LiveCycle Installing and Deploying guide (such as Installing and Deploying LiveCycle ES for JBoss).
