pdf2Data - extract information from invoices and templates

Using pdf2Data takes very little code. Everything required is shown below:

// build a new Pdf2DataExtractor based on a template
Pdf2DataExtractor extractor = new Pdf2DataExtractor(template);
// sampleFile: the file you wish to process
// targetPDF: the path where the annotated PDF is stored (for visual inspection)
// targetXML: the path where the extracted data is stored (in XML format)
extractor.parsePdf(sampleFile, targetPDF, targetXML);
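
After parsePdf has run, the file at targetXML can be checked from code as well as by eye. The layout of that XML is not described here, so the following is only a minimal sketch that assumes nothing about element names: it walks whatever document it finds and lists each leaf element with its text. The class name XmlLeafDump and the method leaves are hypothetical helpers for illustration and are not part of pdf2Data; only standard JDK XML classes are used.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of pdf2Data: lists the leaf elements of an XML file.
class XmlLeafDump {

    // Returns one "name = text" entry for every element that has no child elements.
    static List<String> leaves(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        List<String> out = new ArrayList<>();
        collect(doc.getDocumentElement(), out);
        return out;
    }

    // Depth-first walk in document order; only leaf elements are recorded.
    private static void collect(Element element, List<String> out) {
        NodeList children = element.getChildNodes();
        boolean hasChildElement = false;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect((Element) child, out);
            }
        }
        if (!hasChildElement) {
            out.add(element.getTagName() + " = " + element.getTextContent().trim());
        }
    }

    // Usage: java XmlLeafDump <path given as targetXML>
    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            leaves(in).forEach(System.out::println);
        }
    }
}
```

Because the walk is generic, the same helper keeps working if the template changes which fields are extracted; a real integration would more likely look up specific element names once the output format is known.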

The only part that requires manual intervention is defining a template: a PDF that contains the rules for how text should be extracted from all similar PDFs. A template can be defined in Adobe Reader using comments, or through the online demo, which is located here: DEMO URL.