Which PDF filters are used to encode data?

Tags: filters, compression, iText 7

I am currently using the iTextSharp library to generate PDF files. These files will then be handled by a separate PDF file processor, which is limited in the PDF filters it can use to decode the data in a file. Which PDF filters does iTextSharp use to encode the data? I need to know so that I can be sure the processor is able to decode the files properly.

Posted on StackOverflow on Jun 17, 2015 by Midhun

iText for C# supports the filters that are defined in the PDF specification. Content streams (e.g. for pages) are compressed with /FlateDecode, which is what every other PDF producer uses by default, because that's the standard compression for PDF.
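To make the /FlateDecode point concrete: the filter is zlib/deflate compression, so any consumer with a standard inflate routine can read such a stream. The sketch below is only an illustration in Python (not iText code); the `content` bytes are a made-up page content stream, and the round trip shows what a decoder has to do.

```python
import zlib

# A tiny, hypothetical page content stream (PDF drawing operators).
content = b"BT /F1 12 Tf 72 720 Td (Hello, filters) Tj ET\n" * 20

# /FlateDecode data is zlib-wrapped deflate, which is what zlib.compress emits.
encoded = zlib.compress(content)

# A consumer that supports /FlateDecode reverses it with a standard inflate.
decoded = zlib.decompress(encoded)

print(len(content), "bytes raw ->", len(encoded), "bytes encoded")
```

If your processor can inflate zlib data, it should be able to handle the content streams described above.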

Image streams use other filters when applicable, for instance: JPEG images are stored using /DCTDecode, JBIG2 images are stored using /JBIG2Decode, CCITT images are stored using /CCITTFaxDecode, and so on.

It is hard to believe that there would be PDF software that doesn't support these filters. Maybe there is some very old software that doesn't support /JPXDecode (introduced in PDF 1.5; used whenever you add JPEG 2000 images). However, that shouldn't be a problem as long as you don't add any .jpx or .j2k images. Likewise, /DCTDecode isn't used if you don't add any .jpg file, and so on.
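If you want to verify which filters actually end up in a file you generated, you can scan the raw bytes for /Filter entries. The following is a rough sketch in Python, not an iText feature: `count_filters` and the `sample` bytes are hypothetical, and the regular expression is a heuristic that ignores name escapes, can be fooled by binary stream data, and cannot see dictionaries that are themselves stored inside compressed object streams.

```python
import re
from collections import Counter

# Matches "/Filter /Name" and "/Filter [/A /B]" entries in raw PDF bytes.
FILTER_ENTRY = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
NAME = re.compile(rb"/([A-Za-z0-9]+)")

def count_filters(pdf_bytes: bytes) -> Counter:
    """Count how often each filter name appears in /Filter entries (heuristic)."""
    counts = Counter()
    for entry in FILTER_ENTRY.findall(pdf_bytes):
        for name in NAME.findall(entry):
            counts[name.decode("ascii")] += 1
    return counts

# Hand-written stand-in for a PDF file; replace with open(path, "rb").read().
sample = (
    b"1 0 obj << /Length 42 /Filter /FlateDecode >> stream ... endstream endobj\n"
    b"2 0 obj << /Subtype /Image /Filter/DCTDecode >> stream ... endstream endobj\n"
    b"3 0 obj << /Filter [/ASCII85Decode /FlateDecode] >> stream ... endstream endobj\n"
)

for name, count in sorted(count_filters(sample).items()):
    print(name, count)
```

Comparing the names this reports against the list of filters your processor supports is a quick sanity check before you hand the files over.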

Another thing we've noticed is that some legacy software supports neither compressed cross-reference tables nor objects stored in a stream. Both features were introduced in PDF 1.5 (2003). That's why iText for C# doesn't compress xref tables or store objects in compressed object streams unless you intentionally instruct it to do so.
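A quick way to check whether a given file relies on these PDF 1.5 features is to look for the dictionary types that mark them: /Type /XRef for a cross-reference stream and /Type /ObjStm for an object stream. The snippet below is a minimal sketch in Python under that assumption; `uses_pdf15_compression` and the two sample byte strings are hypothetical, and a byte-level search is only a heuristic, not a real PDF parser.

```python
import re

# Dictionary types that mark the PDF 1.5 structures discussed above.
XREF_STREAM = re.compile(rb"/Type\s*/XRef\b")
OBJECT_STREAM = re.compile(rb"/Type\s*/ObjStm\b")

def uses_pdf15_compression(pdf_bytes: bytes) -> dict:
    """Report whether the bytes contain xref-stream or object-stream markers."""
    return {
        "xref_stream": bool(XREF_STREAM.search(pdf_bytes)),
        "object_stream": bool(OBJECT_STREAM.search(pdf_bytes)),
    }

# Stand-in for a file with a classic cross-reference table and trailer.
classic = b"xref\n0 1\n0000000000 65535 f \ntrailer << /Size 1 >>\nstartxref\n9\n%%EOF"

# Stand-in for a file written with compressed xref and object streams.
compact = (
    b"5 0 obj << /Type /ObjStm /N 3 /First 18 >> stream ... endstream endobj\n"
    b"6 0 obj << /Type/XRef /W [1 2 1] /Size 7 >> stream ... endstream endobj\n"
)

print("classic:", uses_pdf15_compression(classic))
print("compact:", uses_pdf15_compression(compact))
```

A file produced with the default settings described above would be expected to look like the `classic` case.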
