How to delete attachments in PDF using iText?

Tags: embedded files, file attachment annotation, annotations, remove attachments, attachments, iText 7

public void addAttachments(String src, String dest, String[] attachments)
    throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    for (int i = 0; i < attachments.length; i++) {
        addAttachment(stamper.getWriter(), new File(attachments[i]));
    }
    stamper.close();
}
protected void addAttachment(PdfWriter writer, File src) throws IOException {
    PdfFileSpecification fs = PdfFileSpecification.fileEmbedded(
        writer, src.getAbsolutePath(), src.getName(), null);
    writer.addFileAttachment(
        src.getName().substring(0, src.getName().indexOf('.')), fs);
}
However, I want to write another program that deletes the embedded files. How can I do this?

Posted on StackOverflow on Oct 30, 2014 by brian

Let me start by rewriting your code to add an embedded file.

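public void manipulatePdf(String src, String dest) throws Exception {
  PdfDocument pdfDoc = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
  // Embed the bytes of "Some test" as test.txt, with the description "some test file"
  PdfFileSpec spec = PdfFileSpec.createEmbeddedFileSpec(
      pdfDoc, "Some test".getBytes(), "some test file", "test.txt", null, null, null, true);
  // Register the file specification in the EmbeddedFiles name tree under the key "some_test"
  pdfDoc.addFileAttachment("some_test", spec);
  pdfDoc.close();
}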

You can find the full code sample here: AddEmbeddedFile

Now when we look at the Attachments panel of the resulting PDF, we see an attachment test.txt with description "some test file":

Attachment shown in Adobe Acrobat

After you have added this file, you now want to remove it. To do this, please use RUPS and take a look inside:

Attachment shown in iText RUPS

This gives us a hint on where to find the embedded file. Take a look at the code of the RemoveEmbeddedFile example to see how we navigate through the object-oriented file format that PDF is:

public void manipulatePdf(String src, String dest) throws Exception {
    PdfDocument pdfDoc = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
    // Walk from the catalog to the Names array of the EmbeddedFiles name tree
    PdfDictionary root = pdfDoc.getCatalog().getPdfObject();
    PdfDictionary names = root.getAsDictionary(PdfName.Names);
    PdfDictionary embeddedFiles = names.getAsDictionary(PdfName.EmbeddedFiles);
    PdfArray namesArray = embeddedFiles.getAsArray(PdfName.Names);
    // Remove the first name/value pair: the name, then the file specification reference
    namesArray.remove(0);
    namesArray.remove(0);
    pdfDoc.close();
}

As you can see, we start at the root of the document (also known as the catalog) and walk via Names and EmbeddedFiles to the Names array. This array stores name/value pairs one after the other. Because I know that the embedded file I want to remove is the first pair in the array, I remove the element at index 0 twice. The first call removes the name (the key some_test that was passed to addFileAttachment); the second removes the reference to the file specification, which has by then shifted to index 0. The attachment is now gone:

Attachment is gone in Adobe Acrobat

As there was only one embedded file in my example, I now see an empty array when I look inside the PDF:

Empty array in iText RUPS
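
Removing index 0 twice only works when you know that the attachment is the first pair in the array. The sketch below shows one way to remove a pair by its name instead. To stay runnable without iText, it models the Names array as a plain java.util.List laid out as [name, value, name, value, ...]; the class NameTreePairs and the method removePair are hypothetical names, and the strings such as "12 0 R" merely stand in for indirect references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: a plain-Java model of the flat name/value array,
// not part of the iText API.
final class NameTreePairs {

    /**
     * Removes the (name, value) pair whose name equals key from a flat list
     * laid out as [name0, value0, name1, value1, ...].
     * Returns true if a pair was found and removed.
     */
    static boolean removePair(List<Object> flat, String key) {
        // Names sit at even indexes; each value directly follows its name.
        for (int i = 0; i + 1 < flat.size(); i += 2) {
            if (key.equals(flat.get(i))) {
                flat.remove(i); // removes the name; the value shifts down to index i
                flat.remove(i); // removes the value
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Object> names = new ArrayList<>(
                Arrays.asList("a_first", "10 0 R", "some_test", "12 0 R"));
        System.out.println(removePair(names, "some_test")); // prints true
        System.out.println(names);                          // prints [a_first, 10 0 R]
    }
}
```

With iText, the same loop would presumably run over the PdfArray and compare each even-indexed entry with the target name. Keep in mind that larger name trees can spread their entries over Kids nodes instead of a single Names array, which this sketch does not handle.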

If you want to remove all the embedded files at once, the code is even easier. That is shown in the RemoveEmbeddedFiles example:

public void manipulatePdf(String src, String dest) throws Exception {
    PdfDocument pdfDoc = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
    PdfDictionary root = pdfDoc.getCatalog().getPdfObject();
    PdfDictionary names = root.getAsDictionary(PdfName.Names);
    names.remove(PdfName.EmbeddedFiles);
    pdfDoc.close();
}

This time we don't look at the contents of the EmbeddedFiles dictionary at all. We remove the whole EmbeddedFiles entry, so the Names dictionary no longer contains it:

No embedded files in iText RUPS
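
Both removal examples assume that the catalog has a Names dictionary with an EmbeddedFiles entry. In iText 7, getAsDictionary returns null when the entry is missing, so on a PDF without attachments the chained calls would end in a NullPointerException. The following sketch illustrates the guard you might add. It is only a model: the catalog is represented by nested java.util.Map instances so the snippet runs without iText, and the class EmbeddedFilesGuard and its method are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: models the catalog as nested maps, not the iText API.
final class EmbeddedFilesGuard {

    /**
     * Removes the "EmbeddedFiles" entry from the "Names" map of a catalog-like map.
     * Returns true only if an entry was actually removed.
     */
    static boolean removeEmbeddedFiles(Map<String, Object> catalog) {
        Object names = catalog.get("Names");
        if (!(names instanceof Map)) {
            return false; // no Names dictionary, so there is nothing to remove
        }
        Map<?, ?> namesMap = (Map<?, ?>) names;
        return namesMap.remove("EmbeddedFiles") != null;
    }

    public static void main(String[] args) {
        Map<String, Object> names = new HashMap<>();
        names.put("EmbeddedFiles", new HashMap<String, Object>());
        Map<String, Object> catalog = new HashMap<>();
        catalog.put("Names", names);

        System.out.println(removeEmbeddedFiles(catalog));         // prints true
        System.out.println(removeEmbeddedFiles(catalog));         // prints false
        System.out.println(removeEmbeddedFiles(new HashMap<>())); // prints false
    }
}
```

In iText terms, the instanceof check corresponds to testing names != null after root.getAsDictionary(PdfName.Names) and before calling names.remove(PdfName.EmbeddedFiles).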

Click this link if you want to see how to answer this question in iText 5.