Chapter 1: Introducing basic building blocks

Tags: .NET

When I created iText back in the year 2000, I tried to solve two very specific problems.

  1. In the nineties, most PDF documents were created manually on the Desktop using tools like Adobe Illustrator or Acrobat Distiller. I needed to serve PDFs automatically in unattended mode, either in a batch process, or (preferably) on the fly, served to the browser by a web application. These PDF documents couldn’t be produced manually due to the volume (the high number of pages and files) and because the content wasn’t available in advance (it needed to be calculated, based on user input and/or real-time results from database queries). In 1998, I wrote a first PDF library that solved this problem. I deployed this library on a web server and it created thousands of PDF documents in a server-side Java application.

  2. I soon faced a second problem: a developer couldn’t use my first PDF library without consulting the PDF specification. As it turned out, I was the only member in my team who understood my code. This also meant that I was the only person who could fix my code if something broke. That’s not a healthy situation in a software project. I solved this problem by rewriting my first PDF library from scratch, ensuring that knowing the PDF specification inside-out became optional instead of mandatory. I achieved this by introducing the concept of a Document to which a developer can add Paragraph, List, Image, Chunk, and other high-level objects. By combining these intuitive building blocks, a developer could easily create a PDF document programmatically. The code to create a PDF document became easier to read and easier to understand, especially by people who didn’t write that code.

This is a jump-start tutorial to iText 7. We won’t go into much detail, but let’s start with some examples that involve some of these basic building blocks.

Introducing iText's basic building blocks

Many programming tutorials start with a Hello World example. This tutorial isn’t any different.

This is the HelloWorld example for iText 7:

  1. var writer = new PdfWriter(dest);
  2. var pdf = new PdfDocument(writer);
  3. var document = new Document(pdf);
  4. document.Add(new Paragraph("Hello World!"));
  5. document.Close();

Let’s examine this example line by line:

  1. We create a PdfWriter instance. PdfWriter is an object that can write a PDF file. It doesn’t know much about the actual content of the PDF document it is writing. The PdfWriter doesn’t know what the document is about, it just writes different file parts and different objects that make up a valid document once the file structure is completed. In this case, we pass a String parameter, named dest, that contains a path to a file, for instance results/chapter01/hello_world.pdf. The constructor also accepts an Stream as parameter. For instance: if we wanted to write a web application, we could have written to the HttpResponse.OutputStream; if we wanted to create a PDF document in memory, we could have used a MemoryStream; and so on.

  2. The PdfWriter knows what to write because it listens to a PdfDocument. The PdfDocument manages the content that is added, distributes that content over different pages, and keeps track of whatever information is relevant for that content. In chapter 7, we’ll discover that there are various flavors of PdfDocument classes a PdfWriter can listen to.

  3. Once we’ve created a PdfWriter and a PdfDocument, we’re done with all the low-level, PDF-specific code. We create a Document that takes the PdfDocument as parameter. Now that we have the document object, we can forget that we’re creating PDF.

  4. We create a Paragraph containing the text "Hello World" and we add that paragraph to the document object.

  5. We close the document. Our PDF has been created.

Figure 1.1 shows what the resulting PDF document looks like:

Figure 1.1: Hello World example
Figure 1.1: Hello World example

Let's add some complexity. Let's pick a different font and let's organize some text as a list; see Figure 1.2.

Figure 1.2: List example
Figure 1.2: List example

The RickAstley example shows how this is done:

  1. var writer = new PdfWriter(dest);
  2. var pdf = new PdfDocument(writer);
  3. var document = new Document(pdf);
  4. // Create a PdfFont
  5. var font = PdfFontFactory.CreateFont(FontConstants.TIMES_ROMAN);
  6. // Add a Paragraph
  7. document.Add(new Paragraph("iText is:").SetFont(font));
  8. // Create a List
  9. List list = new List()
  10. .SetSymbolIndent(12)
  11. .SetListSymbol("u2022")
  12. .SetFont(font);
  13. // Add ListItem objects
  14. list.Add(new ListItem("Never gonna give you up"))
  15. .Add(new ListItem("Never gonna let you down"))
  16. .Add(new ListItem("Never gonna run around and desert you"))
  17. .Add(new ListItem("Never gonna make you cry"))
  18. .Add(new ListItem("Never gonna say goodbye"))
  19. .Add(new ListItem("Never gonna tell a lie and hurt you"));
  20. // Add the list
  21. document.Add(list);
  22. document.Close();

Lines 1 to 3 and line 22 are identical to the Hello World example, but now we add more than just a Paragraph. iText always uses Helvetica as the default font for text content. If you want to change this, you have to create a PdfFont instance. You can do this by obtaining a font from the PdfFontFactory (line 5). We use this font object to change the font of a Paragraph (line 7) and a List (line 9). This List is a bullet list (line 11) and the list items are indented by 12 user units (line 10). We add six ListItem objects (line 14-19) and add the list to the document.

This is fun, isn’t it? Let’s introduce some images. Figure 1.3 shows what happens if we add an Image of a fox and a dog to a Paragraph.

Figure 1.3: Image example
Figure 1.3: Image example

If we remove the boiler-plate code from the QuickBrownFox example, the following lines remain:

  1. var fox = new Image(ImageDataFactory.Create(FOX));
  2. var dog = new Image(ImageDataFactory.Create(DOG));
  3. var p = new Paragraph("The quick brown ")
  4. .Add(fox)
  5. .Add(" jumps over the lazy ")
  6. .Add(dog);
  7. document.Add(p);

We pass a path to an image to an ImageDataFactory that will return an object that can be used to create an iText Image object. The task of the ImageDataFactory is to detect the type of image that is passed (jpg, png, gif. bmp,…) and to process it so that it can be used in a PDF. In this case, we are adding the images so that they are part of a Paragraph. They replace the words “fox” and “dog”.

Publishing a database

Many developers use iText to publish the results of a database query to a PDF document. Suppose that we have a database containing all the states of the United States of America and that we want to create a PDF that lists these states and some information about each one in a table as shown in Figure 1.4.

Figure 1.4: Table example
Figure 1.4: Table example

Using a real database would probably increase the complexity of these simple examples, so we'll use a CSV file instead: united_states.csv (see figure 1.5).

Figure 1.5: United States CSV file
Figure 1.5: United States CSV file

If you take a closer look at the boilerplate code in the UnitedStates example, you'll discover that we've made a small change to the line that creates the Document (line 4). We've added an extra parameter that defines the size of the pages in the Document. The default page size is A4 and by default that page is used in portrait. In this example, we also use A4, but we rotate the page (PageSize.A4.rotate()) so that it is used in landscape as shown in Figure 1.4. We've also changed the margins (line 4). By default iText uses a margin of 36 user units (half an inch). We change all margins to 20 user units (this term will be explained further down in the text).

  1. var writer = new PdfWriter(dest);
  2. var pdf = new PdfDocument(writer);
  3. var document = new Document(pdf, PageSize.A4.Rotate());
  4. document.SetMargins(20, 20, 20, 20);
  5. var font = PdfFontFactory.CreateFont(FontConstants.HELVETICA);
  6. var bold = PdfFontFactory.CreateFont(FontConstants.HELVETICA_BOLD);
  7. var table = new Table(new float[]{4, 1, 3, 4, 3, 3, 3, 3, 1});
  8. table.SetWidthPercent(100);
  9. StreamReader sr = File.OpenText(DATA);
  10. var line = sr.readLine();
  11. process(table, line, bold, true);
  12. while ((line = sr.readLine()) != null) {
  13. process(table, line, font, false);
  14. }
  15. sr.Close();
  16. document.Add(table);
  17. document.Close();

In this example, we read this CSV file line by line, and we put all the data that is present in the CSV file in a Table object.

We start by creating two PdfFont objects of the same family: Helvetica regular (line 5) and Helvetica bold (line 6). We create a Table object for which we define nine columns by defining a float array with nine elements (line 7). Each float defines the relative width of a column. The first column is four times as wide as the second column; the third column is three times as wide as the second column; and so on. We also define the width of the table relative to the available width of the page (line 8). In this case, the table will use 100% of the width of the page, minus the page margins.

We then start to read the CSV file whose path is stored in the DATA constant (line 9). We read the first line to obtain the column headers (line 10) and we process that line (line 11). We wrote a process() method that adds the line to the table using a specific font and defining whether or not line contains the contents of a header row (lines 11 and 13).

  1. public void process(Table table, String line, PdfFont font, boolean isHeader) {
  2. var tokenizer = new StringTokenizer(line, ";");
  3. while (tokenizer.HasMoreTokens()) {
  4. if (isHeader) {
  5. table.AddHeaderCell(
  6. new Cell().Add(
  7. new Paragraph(tokenizer.NextToken()).SetFont(font)));
  8. } else {
  9. table.AddCell(
  10. new Cell().Add(
  11. new Paragraph(tokenizer.NextToken()).SetFont(font)));
  12. }
  13. }
  14. }

We use a StringTokenizer to loop over all the fields that are stored in each line of our CSV file. We create a Paragraph in a specific font. We add that Paragraph to a new Cell object. Depending on whether or not we’re dealing with a header row, we add this Cell to the table as a header cell (line 7) or as an ordinary cell (line 11).

After we’ve processed the header row (line 11), we loop over the rest of the lines (line 12) and we process the rest of the rows (line 13). As you can tell from Figure 1.4, the table doesn’t fit on a single page. There’s no need to worry about that: iText will create as many new pages as necessary until the complete table was rendered. iText will also repeat the header row because we added the cells of that row using the AddHeaderCell() method instead of using the AddCell() method.

Once we’ve finished reading the data (line 14), we add the table to the document (line 16) and we close it (line 17). We have successfully published our CSV file as a PDF.

That was easy. With only a limited number of lines of code, we have already created quite a nice table in PDF.

Summary

With just a few examples, we have already seen a glimpse of the power of iText. We discovered that it’s very easy to create a document programmatically. In this first chapter, we’ve discussed high-level objects such as Paragraph, List, Image, Table and Cell, which are iText’s basic building blocks.

However, it’s sometimes necessary to create a PDF with a lower-level syntax. iText makes this possible through its low-level API. We’ll take a look at some examples that use these low-level methods in chapter 2.