When we talk about iText 7's basic building blocks, we refer to all classes that implement the
IElement interface. iText 7 was originally written in Java and then ported to C#. Because of our experience with both programming languages, we've adopted the convenient habit, typical for C# developers, of starting every interface name with the letter I.
Figure 0.1 shows an overview of the relationship between
IElement and some other interfaces.
At the top of the hierarchy, we find the
IPropertyContainer interface. This interface defines methods to set, get, and delete properties. It has two direct subinterfaces: IElement and IRenderer. The
IElement interface will be implemented by objects such as
Table. These are the objects that we'll add to a document, either directly or indirectly. The
IRenderer interface will be implemented by objects such as
TableRenderer. These renderers are used internally by iText, but we can subclass them if we want to tweak the way an object is rendered.
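The idea of a container that lets you set, get, and delete properties can be pictured with a small plain-Java sketch. This is not the iText implementation: the `PropertySketch` class, its method names, and the integer keys `FONT_SIZE` and `BOLD` are assumptions made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a property container; NOT the iText implementation.
class PropertySketch {
    static final int FONT_SIZE = 1;   // made-up property keys for this sketch
    static final int BOLD = 2;

    private final Map<Integer, Object> properties = new HashMap<>();

    // Stores a value under the given key, replacing any earlier value.
    void setProperty(int key, Object value) { properties.put(key, value); }

    boolean hasProperty(int key) { return properties.containsKey(key); }

    // Returns the stored value cast to the type the caller expects, or null if absent.
    @SuppressWarnings("unchecked")
    <T> T getProperty(int key) { return (T) properties.get(key); }

    void deleteProperty(int key) { properties.remove(key); }

    public static void main(String[] args) {
        PropertySketch element = new PropertySketch();
        element.setProperty(FONT_SIZE, 12f);
        Float size = element.getProperty(FONT_SIZE);
        System.out.println("font size: " + size);
        element.deleteProperty(FONT_SIZE);
        System.out.println("still set: " + element.hasProperty(FONT_SIZE));
    }
}
```

Keying properties by number rather than by dedicated fields keeps the container generic: elements and renderers can share one storage mechanism.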
The IElement interface has two subinterfaces of its own. The
ILeafElement interface will be implemented by building blocks that can't contain any other elements. For instance: you can add a
Text or an
Image element to a
Paragraph object, but you can't add any object to a
Text or an Image element. Text and
Image implement the
ILeafElement interface to reflect this. Finally, there's the
ILargeElement interface that allows you to render an object before you've finished adding all the content. It's implemented by the
Table class, which means that you can add a table to a document before you've finished adding all the
Cell objects. This reduces memory use: any table content that has already been rendered can be flushed from memory before the table is complete.
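The interface relationships described above can be sketched in plain Java. This is a simplified stand-in, not the iText source: the interface names follow the text, but the `SketchText` and `SketchTable` classes, the `canContain` helper, and the row-flushing logic are hypothetical. They only illustrate the idea of leaf elements and of flushing a large element before it is complete.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the hierarchy; these are NOT the real iText types.
class HierarchySketch {
    // Returns true if the element may hold child elements (it is not a leaf).
    static boolean canContain(IElement element) {
        return !(element instanceof ILeafElement);
    }

    public static void main(String[] args) {
        SketchTable table = new SketchTable();
        table.addRow("row 1");
        table.addRow("row 2");
        table.flush();              // rows rendered so far leave memory
        table.addRow("row 3");
        table.complete();           // renders whatever is still pending
        System.out.println(table.renderedRows() + " rows rendered, "
                + table.rowsInMemory() + " still in memory");
        System.out.println("Text can contain children: " + canContain(new SketchText()));
    }
}

interface IPropertyContainer { }
interface IElement extends IPropertyContainer { }
interface IRenderer extends IPropertyContainer { }
interface ILeafElement extends IElement { }
interface ILargeElement extends IElement {
    void flush();
    void complete();
}

// A leaf: it cannot contain other elements.
class SketchText implements ILeafElement { }

// A large element: content can be rendered and released in batches.
class SketchTable implements ILargeElement {
    private final List<String> pending = new ArrayList<>();
    private int rendered = 0;

    void addRow(String row) { pending.add(row); }

    @Override public void flush() {
        rendered += pending.size();   // pretend the pending rows were drawn
        pending.clear();              // and release them from memory
    }

    @Override public void complete() { flush(); }

    int renderedRows() { return rendered; }
    int rowsInMemory() { return pending.size(); }
}
```

In this model, at most the rows added since the last `flush()` call are held in memory, which is the point of rendering a table incrementally.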
The IPropertyContainer interface is implemented by the abstract
ElementPropertyContainer class. This class has three subclasses; see figure 0.2. The
Style class is a container for all kinds of style attributes such as margins, paddings and rotation. It inherits style values such as widths, heights, colors, borders and alignments from the abstract ElementPropertyContainer class. The
RootElement class defines methods to add content, using either an
add() method or a
showTextAligned() method. The
Document object will add this content to a page. The
Canvas object doesn't know the concept of a page. It acts as a bridge between the high-level layout API and the low-level kernel API.
Figure 0.3 gives us an overview of the AbstractElement class hierarchy.
All classes derived from the
AbstractElement class implement the IElement interface. Some of them, such as Text, Image, and
Link, also implement the
ILeafElement interface. The
ILargeElement interface is only implemented by the
Table class.
The basic building blocks make it very easy for you to create tagged PDF. Tagged PDF is a requirement for PDF/A level A conformance (PDF/A is a standard for the long-term preservation of documents) and for PDF/UA, an accessibility standard. A properly tagged PDF includes semantic information about all the relevant content.
An ordinary PDF can show a human reader content that is organized as a table. This table is rendered using a bunch of text snippets and lines. To a machine, the table isn't more than that: text positioned at arbitrary places, lines drawn at arbitrary places. A sighted person can detect rows and columns and understand which rows are header or footer rows and which rows are body rows. There is no simple way for a machine to do this. When a machine detects a text snippet, it doesn't know if that text snippet is part of a paragraph, part of a title, part of a cell, or part of something else. When a PDF is tagged, it contains a structure tree that allows a machine to understand the structure of the content. Some text will be marked as part of a cell in a header row, other text will be marked as the caption of the table. All real content will be tagged. Other content, such as lines between rows and columns, running headers, and page numbers, will be marked as artifacts.
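To make the difference concrete, here is a small plain-Java sketch of the idea. It doesn't use iText or the actual PDF structure tree format: the `Snippet` class, the `realContent` helper, and the sample data are hypothetical, and the role strings are just example labels. The sketch only shows how tags let a program separate real content from artifacts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of tagged page content; NOT the iText API
// and not the way a PDF stores its structure tree.
class TaggingSketch {
    // Keeps only the snippets tagged as real content, dropping artifacts.
    static List<Snippet> realContent(List<Snippet> page) {
        List<Snippet> result = new ArrayList<>();
        for (Snippet s : page) {
            if (!s.isArtifact()) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Snippet> page = new ArrayList<>();
        page.add(new Snippet("Sales per region", "Caption"));
        page.add(new Snippet("Region", "TH"));     // header cell
        page.add(new Snippet("Europe", "TD"));     // body cell
        page.add(new Snippet("Page 1", null));     // running page number: an artifact
        for (Snippet s : realContent(page)) {
            System.out.println(s.role + ": " + s.text);
        }
    }
}

class Snippet {
    final String text;
    final String role;   // null means the snippet is an artifact

    Snippet(String text, String role) {
        this.text = text;
        this.role = role;
    }

    boolean isArtifact() { return role == null; }
}
```

Without the roles, the four snippets would be indistinguishable pieces of text; with them, a program can tell a caption from a cell and skip the page number.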
In iText, we have introduced the
IAccessibleElement interface. It is implemented by all the basic building blocks that contain real content, such as Table, List, and
LineSeparator. If we define a
PdfDocument as a tagged PDF using the
setTagged() method, iText will create a structure tree so that a
Table is properly tagged as a table, a
List properly tagged as a list, and so on. There is no real content in a
Tab or an
AreaBreak, which is why these classes don't implement that interface. It's just white space; a tab and an area break don't even need to be marked as an artifact.
In this tutorial, we won't create tagged PDF; iText will just render the content to the document using the appropriate
IRenderer implementation. Figure 0.4 shows an overview of the renderer classes.
When you compare figure 0.4 with 0.3, you'll discover that each
AbstractElement and each
RootElement has its corresponding renderer. We won't discuss figure 0.4 in much detail. The concept of renderers will become clear the moment we start making some examples.
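Until those examples arrive, the following plain-Java sketch may help to picture the pairing of elements and renderers. It is a deliberately simplified model rather than iText code: the `SketchParagraph`, `SketchParagraphRenderer`, and `UpperCaseRenderer` classes and their `render` method are hypothetical, and the real IRenderer interface is considerably richer.

```java
import java.util.Locale;

// Simplified model of the element/renderer pairing; these are NOT the real iText classes.
class RendererSketch {
    public static void main(String[] args) {
        SketchParagraph p = new SketchParagraph("Hello");
        System.out.println(p.getRenderer().render());   // default renderer
        p.setRenderer(new UpperCaseRenderer(p));        // tweak how the element is drawn
        System.out.println(p.getRenderer().render());
    }
}

class SketchParagraph {
    final String text;
    private SketchParagraphRenderer renderer;

    SketchParagraph(String text) { this.text = text; }

    // Each element creates its corresponding renderer unless a custom one was set.
    SketchParagraphRenderer getRenderer() {
        if (renderer == null) {
            renderer = new SketchParagraphRenderer(this);
        }
        return renderer;
    }

    void setRenderer(SketchParagraphRenderer custom) { this.renderer = custom; }
}

class SketchParagraphRenderer {
    protected final SketchParagraph element;

    SketchParagraphRenderer(SketchParagraph element) { this.element = element; }

    String render() { return "[paragraph] " + element.text; }
}

// A subclass that changes the way the same element is rendered.
class UpperCaseRenderer extends SketchParagraphRenderer {
    UpperCaseRenderer(SketchParagraph element) { super(element); }

    @Override
    String render() { return super.render().toUpperCase(Locale.ROOT); }
}
```

The element itself stays unchanged; only the object responsible for drawing it is swapped, which is the kind of tweak that subclassing a renderer is meant for.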