How do I add XMP metadata to each page of an existing PDF?

Tags: metadata, XMP, PdfStamper, iText 5

So far all the examples seem to involve adding metadata while creating a new Document. I want to take an existing PDF and add a GUID to each page's XMP using the stamper.

Posted on StackOverflow on Feb 10, 2015 by Rahul

The most common use of XMP in the context of PDF is when you add an XMP stream for the whole document that is referred to from the root dictionary of the PDF (aka the Catalog).

However, if you consult the PDF specification, you will notice that XMP can be attached to many other objects inside a PDF, the page being one of them: /Metadata is an optional key in the page dictionary, and its value is a reference to an XMP stream.

If you used iText to create a PDF document from scratch, you wouldn't find a dedicated method for adding this metadata, but you could use the generic addPageDictEntry() method that is available in PdfWriter. You would pass PdfName.METADATA as the key and, as the value, a reference to a stream that has already been added to the PdfWriter.

Your question isn't about creating a PDF from scratch, but about modifying an existing PDF. In that case, you also need the page dictionary. It is very easy to obtain these dictionaries:

PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfDictionary page;
for (int p = 1; p <= n; p++) {
    page = reader.getPageN(p);
    // do stuff with the page dictionary
}

This snippet was taken from the Rotate90Degrees example. In that example, we look at the /Rotate entry, which is a number:

PdfNumber rotate = page.getAsNumber(PdfName.ROTATE);

You need to look for the /Metadata entry, which is a stream:

PRStream stream = (PRStream) page.getAsStream(PdfName.METADATA);

This stream may be null. In that case, you need to add a /Metadata entry, as shown in the AddXmpToPage example:

// We create some XMP bytes
ByteArrayOutputStream baos = new ByteArrayOutputStream();
XmpWriter xmp = new XmpWriter(baos, new PdfDictionary());
xmp.close();
// We add the XMP bytes to the writer
PdfIndirectObject ref = stamper.getWriter().addToBody(new PdfStream(baos.toByteArray()));
// We add a reference to the XMP bytes to the page dictionary
page.put(PdfName.METADATA, ref.getIndirectReference());

If there is an XMP stream, you want to keep it and add something to it.

This is how you get the XMP bytes:

byte[] xmpBytes = PdfReader.getStreamBytes(stream);

You perform your XML magic on these bytes, resulting in a new byte[] named newXmpBytes. You replace the original bytes with these new bytes like this:

stream.setData(newXmpBytes);
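The "XML magic" step is left open above. Here is a minimal sketch of one way to do it with the JDK's own DOM and Transformer APIs (no iText involved). It assumes that the GUID is stored as a dc:identifier property inside a new rdf:Description block. The class name XmpGuidSketch, the method name addGuid, and the choice of dc:identifier are illustrative assumptions, not part of the original examples. The sketch does not check whether an identifier is already present.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.UUID;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: not part of iText or of the original examples.
final class XmpGuidSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Parses an XMP packet, appends an rdf:Description holding the GUID
    // as dc:identifier (an assumed property choice), and serializes it again.
    static byte[] addGuid(byte[] xmpBytes, UUID guid) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmpBytes));

            // XMP properties live inside the rdf:RDF element.
            NodeList rdfNodes = doc.getElementsByTagNameNS(RDF_NS, "RDF");
            if (rdfNodes.getLength() == 0) {
                throw new IllegalArgumentException("No rdf:RDF element in XMP packet");
            }
            Element rdf = (Element) rdfNodes.item(0);

            // New description block with an explicit dc namespace declaration.
            Element description = doc.createElementNS(RDF_NS, "rdf:Description");
            description.setAttributeNS(RDF_NS, "rdf:about", "");
            description.setAttributeNS(XMLNS_NS, "xmlns:dc", DC_NS);
            Element identifier = doc.createElementNS(DC_NS, "dc:identifier");
            identifier.setTextContent("urn:uuid:" + guid);
            description.appendChild(identifier);
            rdf.appendChild(description);

            // Serialize without an XML declaration; the xpacket processing
            // instructions are part of the DOM and are written out again.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (ParserConfigurationException | SAXException
                | IOException | TransformerException e) {
            throw new IllegalStateException("Could not update XMP packet", e);
        }
    }
}
```

With such a helper, newXmpBytes could be obtained as XmpGuidSketch.addGuid(xmpBytes, UUID.randomUUID()) before calling stream.setData().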

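All these operations are performed on the document that is held in memory by the PdfReader object. To persist the changes, you write the document out with a PdfStamper. Note that the snippet that adds a missing /Metadata entry calls stamper.getWriter(), so in that case the stamper has to be created before you loop over the pages and closed afterwards: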

PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.close();
reader.close();