How to create Persian content in PDF?

Tags: RTL, Persian, Arabic, languages, fonts, iText 7, pdfCalligraph

I have a problem displaying Unicode characters in a PDF file. I found a solution, but it is not very practical for me. It looks like this:

document.add(new Paragraph("Unicode: \u0418", new Font(bfComic, 12)));
I want to retrieve data from a database and show it to the user. My text is in Arabic and sometimes in Farsi. What solution do you suggest?

Posted on StackOverflow on Nov 8, 2014 by gjmkdyttyhujk

You are experiencing different problems:

Encoding of the data:

You need to make sure that you retrieve data from the database using the correct encoding. For instance:

String name1 = new String(rs.getBytes("given_name"), "UTF-8");

I use "UTF-8" in this example because my database stores names with special characters as UTF-8. If I retrieved the field like this instead, I would risk those characters being displayed as gibberish:

String name2 = rs.getString("given_name");
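
To see why the charset matters, here is a minimal sketch that uses only the JDK. It simulates the bytes of a database column with a hard-coded UTF-8 byte array (no real database is involved), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    /** Decodes raw column bytes with an explicit charset. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "Lawrence" in Arabic, written with escapes so the source file encoding does not matter
        String original = "\u0644\u0648\u0631\u0646\u0633";
        // Stand-in for rs.getBytes("given_name") on a column stored as UTF-8
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        String good = decode(raw, StandardCharsets.UTF_8);
        String bad = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(good.equals(original)); // true
        System.out.println(bad.equals(original));  // false
        // Each Arabic letter takes two bytes in UTF-8, so the wrong charset yields twice as many chars
        System.out.println(good.length() + " vs " + bad.length()); // 5 vs 10
    }
}
```

Decoding the same bytes with the wrong charset does not throw an exception. It silently produces different characters, which is why the problem often only becomes visible in the PDF.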

Encoding of the font:

Please use the IDENTITY_H encoding. In iText 7 it looks like this (the final argument, true, tells iText to embed the font):

PdfFont font = PdfFontFactory.createFont(fontPath, PdfEncodings.IDENTITY_H, true);

Writing from right to left / making ligatures

Although your code will work to show a single character, it won't work to show a sentence correctly.

Suppose that name1 is the Arabic version of the name "Lawrence of Arabia" and that we want to write this name to a PDF. This is done three times in the following screenshot:

[Screenshot: Arabic text]

The first line is wrong, because the characters are in the wrong order. They are written from left to right whereas they should be written from right to left. Even if the encoding is correct, you're rendering the text incorrectly.
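
The point about ordering can be checked without any PDF library. The following is a rough sketch, assuming only the JDK's java.text.Bidi class; the class and method names are hypothetical. It shows that a Java String keeps Arabic text in logical (reading) order and that the direction has to be resolved by whatever renders the text.

```java
import java.text.Bidi;

class RunDirectionDemo {
    /** Describes the direction the Unicode bidirectional algorithm resolves for a text. */
    static String direction(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isMixed()) {
            return "mixed";
        }
        return bidi.isRightToLeft() ? "right-to-left" : "left-to-right";
    }

    public static void main(String[] args) {
        String arabic = "\u0644\u0648\u0631\u0646\u0633"; // "Lawrence" in Arabic

        System.out.println(direction("Lawrence"));         // left-to-right
        System.out.println(direction(arabic));             // right-to-left
        System.out.println(direction("Wrong: " + arabic)); // mixed

        // The string is stored in reading order: index 0 is the first letter read (lam)
        System.out.println(arabic.charAt(0) == '\u0644');  // true
    }
}
```

A renderer that simply places the characters of the string one after another from left to right produces exactly the kind of reversed output seen in the first line.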

The second line is also wrong. The characters are now in the correct order, but no ligatures are made: ل followed by و should be combined into a single glyph: لو
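
As a rough illustration of what shaping changes, Unicode still contains legacy "presentation form" code points for shaped Arabic glyphs. The sketch below uses only the JDK's java.text.Normalizer to map such shaped code points back to the plain letters they stand for. The lam-alef ligature is an example chosen here because it has its own code point, and the class and method names are hypothetical. A layout engine normally does this shaping through the font rather than through these code points, so treat this as a way to inspect the idea, not as a way to produce PDF output.

```java
import java.text.Normalizer;

class LigatureDemo {
    /** Maps shaped presentation-form glyphs back to the plain letters they stand for. */
    static String toPlainLetters(String shaped) {
        return Normalizer.normalize(shaped, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // U+FEFB: the lam-alef ligature, one code point standing for two letters
        String lamAlef = toPlainLetters("\uFEFB");
        System.out.println(lamAlef.equals("\u0644\u0627")); // true

        // U+FEDF (lam, initial form) + U+FEEE (waw, final form): the joined shapes of the pair
        String lamWaw = toPlainLetters("\uFEDF\uFEEE");
        System.out.println(lamWaw.equals("\u0644\u0648")); // true
    }
}
```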

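The third line is correct. You can achieve this in iText 7 by creating a Style object with the base direction set to BaseDirection.RIGHT_TO_LEFT and by setting the font script on the paragraph. Note that iText 7 needs the pdfCalligraph add-on to do this. For instance: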

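// bf is a PdfFont created with IDENTITY_H; MOVIE and MOVIE_WITH_SPACES are String constants from the full example
Style arabic = new Style().setTextAlignment(TextAlignment.RIGHT)
        .setBaseDirection(BaseDirection.RIGHT_TO_LEFT)
        .setFontSize(20).setFont(bf);
document.add(new Paragraph("Wrong: " + MOVIE_WITH_SPACES).addStyle(arabic));
// Explicitly mark the text as Arabic script
document.add(new Paragraph(MOVIE).setFontScript(Character.UnicodeScript.ARABIC).addStyle(arabic));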

Now the text will be rendered correctly.

This is explained in chapter 11 of my book. You can find a full example here: Ligatures2.

Click this link if you want to see how to answer this question in iText 5.