How can I read Roman page numbers?

Tags: page numbers, page labels, inspect PDF, iText 5

In Adobe Reader, the first pages of an ebook can have page numbers in Roman numerals. I would like to read these page numbers (instead of the page index) with iText, but I don't know which properties (labels or annotations) I should use.

Posted on StackOverflow on Mar 20, 2015 by T N

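The Roman numerals that Adobe Reader shows are page labels, and iText reads them with the PdfPageLabels class. You are looking for the PageLabelExample. In this example, we have a PDF, page_labels.pdf, whose pages are labeled like this: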

Pages with page labels

In the listPageLabels() method, we create a txt file with all the page labels:

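import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import com.itextpdf.text.pdf.PdfPageLabels;
import com.itextpdf.text.pdf.PdfReader;

public void listPageLabels(String src, String dest) throws IOException {
    // The output is a plain text file, not a PDF
    PdfReader reader = new PdfReader(src);
    try (PrintStream out = new PrintStream(new FileOutputStream(dest))) {
        // labels[i] is the label of page i + 1; null if the PDF defines no page labels
        String[] labels = PdfPageLabels.getPageLabels(reader);
        if (labels != null) {
            for (String label : labels) {
                out.println(label);
            }
        }
    } finally {
        reader.close();
    }
}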

The result looks like this:

A
B
1
2
3
Movies-4
Movies-5
Movies-6
Movies-7
Movies-8
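To see why a listing can mix letters, plain numbers and prefixed numbers, it may help to sketch how PDF page labels are defined. The PDF specification describes them as ranges: each range starts at a zero-based page index and has a numbering style (decimal, Roman or letters), an optional prefix and a start value. The following is a minimal plain-Java sketch of that idea, not iText code. The class PageLabelSketch, its methods and the ranges in main are assumptions made for illustration; the ranges were chosen to reproduce the listing above and were not read from page_labels.pdf.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only: none of these names belong to the iText API.
class PageLabelSketch {

    /** One labelling range, starting at a zero-based page index. */
    static final class Range {
        final int firstPageIndex; // zero-based index of the first page in the range
        final char style;         // 'D' decimal, 'R'/'r' Roman, 'A'/'a' letters, other: none
        final String prefix;      // text placed before the numeric part
        final int start;          // value of the numeric part on the first page

        Range(int firstPageIndex, char style, String prefix, int start) {
            this.firstPageIndex = firstPageIndex;
            this.style = style;
            this.prefix = prefix;
            this.start = start;
        }
    }

    /** Converts a positive integer to a Roman numeral. */
    static String toRoman(int n, boolean upper) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return upper ? sb.toString() : sb.toString().toLowerCase(Locale.ROOT);
    }

    /** Letter style: A to Z, then AA to ZZ, then AAA to ZZZ, and so on. */
    static String toLetters(int n, boolean upper) {
        char letter = (char) ((upper ? 'A' : 'a') + (n - 1) % 26);
        int repeats = (n - 1) / 26 + 1;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeats; i++) {
            sb.append(letter);
        }
        return sb.toString();
    }

    /** Builds the label for one page from its range and numeric value. */
    static String format(Range r, int number) {
        String numeric;
        switch (r.style) {
            case 'D': numeric = Integer.toString(number); break;
            case 'R': numeric = toRoman(number, true); break;
            case 'r': numeric = toRoman(number, false); break;
            case 'A': numeric = toLetters(number, true); break;
            case 'a': numeric = toLetters(number, false); break;
            default:  numeric = ""; // no style: the label is the prefix alone
        }
        return r.prefix + numeric;
    }

    /** Returns one label per page; ranges must be sorted by firstPageIndex. */
    static List<String> labels(int pageCount, List<Range> ranges) {
        List<String> out = new ArrayList<>();
        int current = -1;
        for (int page = 0; page < pageCount; page++) {
            // Advance to the last range that starts at or before this page
            while (current + 1 < ranges.size()
                    && ranges.get(current + 1).firstPageIndex <= page) {
                current++;
            }
            if (current < 0) {
                out.add(Integer.toString(page + 1)); // no range applies yet
            } else {
                Range r = ranges.get(current);
                out.add(format(r, r.start + (page - r.firstPageIndex)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical ranges chosen to reproduce the listing above
        List<Range> ranges = Arrays.asList(
                new Range(0, 'A', "", 1),
                new Range(2, 'D', "", 1),
                new Range(5, 'D', "Movies-", 4));
        for (String label : labels(10, ranges)) {
            System.out.println(label);
        }
        // A front-matter range in lowercase Roman numerals
        System.out.println(labels(4, Arrays.asList(new Range(0, 'r', "", 1))));
    }
}
```

An ebook whose front matter is numbered i, ii, iii, iv would simply carry a range with the lowercase Roman style, which is what the last line of main illustrates.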

If you want an iTextSharp example, take a look at this method:

/**
 * Reads the page labels from an existing PDF
 * @param src the existing PDF
 */
public string ListPageLabels(byte[] src) {
    StringBuilder sb = new StringBuilder();
    PdfReader reader = new PdfReader(src);
    try {
        // GetPageLabels returns null when the PDF defines no page labels
        String[] labels = PdfPageLabels.GetPageLabels(reader);
        if (labels != null) {
            foreach (string label in labels) {
                sb.AppendLine(label);
            }
        }
    } finally {
        reader.Close();
    }
    return sb.ToString();
}
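
Since the original question was about showing the label instead of the page index, a small lookup helper on top of the array returned by getPageLabels() might be useful. The sketch below is plain Java and assumes that the array has one entry per page, with entry 0 belonging to page 1, and that it is null when the PDF defines no page labels. The class PageLabelLookup, both method names and the sample labels are hypothetical.

```java
// Illustrative helper: these names are not part of iText.
class PageLabelLookup {

    /**
     * Returns the label of a 1-based page number. Falls back to the
     * plain page number when no labels are available (labels == null).
     */
    static String labelForPage(String[] labels, int pageNumber) {
        if (labels == null) {
            return Integer.toString(pageNumber);
        }
        if (pageNumber < 1 || pageNumber > labels.length) {
            throw new IllegalArgumentException("No such page: " + pageNumber);
        }
        return labels[pageNumber - 1];
    }

    /**
     * Returns the 1-based page number of the first page carrying the
     * given label, or -1 if no page has that label.
     */
    static int pageForLabel(String[] labels, String label) {
        if (labels == null) {
            return -1;
        }
        for (int i = 0; i < labels.length; i++) {
            if (labels[i].equals(label)) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical labels, as an ebook with Roman front matter might have
        String[] labels = {"i", "ii", "iii", "1", "2"};
        System.out.println(labelForPage(labels, 3));
        System.out.println(pageForLabel(labels, "1"));
    }
}
```

In real code, the labels array would come from PdfPageLabels.getPageLabels(reader) rather than from a literal.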