Chapter 2: Defining styles with CSS


In the previous chapter, we looked at different snippets of Java code.

In this chapter, we'll use the same snippet for every example:

public void createPdf(String src, String dest) throws IOException {
    HtmlConverter.convertToPdf(new File(src), new File(dest));
}

Instead of looking at different snippets of Java code, we'll look at different snippets of HTML and CSS.

Old-fashioned HTML

In example C02E01_NoCss, we apply styles such as italics and a smaller font size with tags like <i> and <font>; see 1_no_css.html:

<html>
    <head>
        <title>Colossal</title>
        <meta name="description" content="Gloria is an out-of-work party girl ..." />
    </head>
    <body>
        <img src="img/colossal.jpg" width="120px" align="right" />
        <h1>Colossal (2016)</h1>
        <i>Directed by Nacho Vigalondo</i>
        <div>
        Gloria is an out-of-work party girl forced to leave her life in New York City,
        and move  back home. When reports surface that a giant creature is
        destroying Seoul, she gradually comes to the realization that she is
        somehow connected to this phenomenon.
        </div>
        <font size="-1">Read more about this movie on
        <a href="www.imdb.com/title/tt4680182">IMDB</a></font>
    </body>
</html>

The HTML page and the PDF created from it are shown in figure 2.1.

Figure 2.1: HTML page without CSS

We call this old-fashioned (some would call it bad taste) because nowadays HTML is only used to define the content and its structure. All styles, such as widths, heights, the choice of font, font size, font color, and font weight, are defined using Cascading Style Sheets (CSS). This is a much more elegant approach, as it creates a clear separation between the content and its presentation.

Let's rewrite our HTML page, and let's introduce some CSS.

Inline CSS

When opened in a browser, there is no visible difference between the HTML file from the previous example, and 2_inline_css.html:

<html>
    <head>
        <title>Colossal</title>
        <meta name="description" content="Gloria is an out-of-work party girl ..." />
    </head>
    <body>
        <img src="img/colossal.jpg" style="width: 120px;float: right" />
        <h1>Colossal (2016)</h1>
        <div style="font-style: italic">Directed by Nacho Vigalondo</div>
        <div>
        Gloria is an out-of-work party girl forced to leave her life in New York City,
        and move  back home. When reports surface that a giant creature is
        destroying Seoul, she gradually comes to the realization that she is
        somehow connected to this phenomenon.
        </div>
        <div style="font-size: 0.8em">Read more about this movie on
        <a href="www.imdb.com/title/tt4680182">IMDB</a></div>
    </body>
</html>

The width of the image and its position are now defined in a style attribute. So is the font style used for the director, as well as the font size for the IMDB link. As you can tell from figure 2.2, there is no difference between the output of this example and the previous one.

Figure 2.2: HTML page with inline CSS

Instead of using the style attribute, we can also use the id or the class attribute. Both are supported by pdfHTML.

What's the difference between id and class?

  • An id is unique: each element can have only one id; each page can have only one element with that id.
  • A class is not unique: you can use the same class on multiple elements; you can use multiple classes on the same element.

In the next example, we'll define some classes, and we'll put them in the header section of the HTML page.

Internal CSS

In the 3_header_css.html HTML file, we bundle all the styles in a <style> section in the header of the HTML file.

<html>
    <head>
        <title>Colossal</title>
        <meta name="description" content="Gloria is an out-of-work party girl ..." />
        <style>
            .poster { width: 120px;float: right; }
            .director { font-style: italic; }
            .description { font-family: serif; }
            .imdb { font-size: 0.8em; }
            a { color: red; }
        </style>
    </head>
    <body>
        <img src="img/colossal.jpg" class="poster" />
        <h1>Colossal (2016)</h1>
        <div class="director">Directed by Nacho Vigalondo</div>
        <div class="description">
        Gloria is an out-of-work party girl forced to leave her life in New York City,
        and move  back home. When reports surface that a giant creature is
        destroying Seoul, she gradually comes to the realization that she is
        somehow connected to this phenomenon.
        </div>
        <div class="imdb">Read more about this movie on
        <a href="www.imdb.com/title/tt4680182">IMDB</a></div>
    </body>
</html>

We made a couple of changes when compared to the previous two examples. We changed the color of the links to red (instead of using the default color, which is blue), and we picked a serif font for the description (instead of using the default font).

When you don't define a font in an HTML file, most browsers will show the document in a serif font. Historically, the default font used by iText has always been Helvetica, which is a sans-serif font. This explains the difference between the original HTML files and the resulting PDFs in the examples so far.

You could define the font-family as serif at the level of the body element to have a better match between the HTML and the PDF. We chose not to do this in this simple example; as a result, we can clearly see the difference between the serif and the sans-serif fonts in figure 2.3:

Figure 2.3: HTML page with CSS in the header section

The title, the line with the name of the director, and the IMDB paragraph are still in Helvetica, but the description is rendered in a serif font.

Putting CSS in the header is fine, but as soon as more styles are involved, it might be better to put the CSS in a separate file, external to the HTML file.

External CSS

In 4_external_css.html, we don't have a <style> block in the header section. Instead we refer to a style sheet named movie.css using the <link> tag.

<html>
    <head>
        <title>Colossal</title>
        <meta name="description" content="Gloria is an out-of-work party girl ..." />
        <link rel="stylesheet" type="text/css" href="css/movie.css">
    </head>
    <body>
        <img src="img/colossal.jpg" class="poster" />
        <h1>Colossal (2016)</h1>
        <div class="director">Directed by Nacho Vigalondo</div>
        <div class="description">
        Gloria is an out-of-work party girl forced to leave her life in New York City,
        and move  back home. When reports surface that a giant creature is
        destroying Seoul, she gradually comes to the realization that she is
        somehow connected to this phenomenon.
        </div>
        <div class="imdb">Read more about this movie on
        <a href="www.imdb.com/title/tt4680182">IMDB</a></div>
    </body>
</html>

The external CSS file, movie.css, looks like this:

.poster {
    width: 120px;
    float: right;
}
.director {
    font-style: italic;
}
.description {
    font-family: serif;
}
.imdb {
    font-size: 0.8em;
}
a {
    color: green;
}

Apart from the fact that we changed the color of the content of the <a> tag to green, there isn't much difference between figure 2.4 and figure 2.3.

Figure 2.4: HTML page with external CSS

You can mix and match inline CSS, CSS in the header, and external CSS. You can even define a style in one place that is overridden in another place.

When CSS is defined in different places, which CSS takes precedence?

CSS stands for Cascading Style Sheets, and the styles defined on different levels "cascade" into a new, virtual style sheet, combining the styles by the following rules:

  1. First there is the style sheet used by the browser. This style sheet is used in the absence of more specific styles. The pdfHTML add-on ships with its own default style sheet, which explains, for instance, why the content of an <a> tag is rendered in blue.
  2. The style sheet used by the browser can be overruled by an external style sheet.
  3. All the previously defined styles can be overruled by an internal style sheet that is present in the header section of the HTML file. Strictly speaking, for selectors of equal specificity the rule that appears last wins, so this holds when the <style> block follows the <link> tag.
  4. Inline styles (inside an HTML element) have top priority. When a style is defined in a style attribute, it overrules the previously defined styles.
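The four rules above can be approximated in a few lines of plain Java. This is only a simplified sketch and not how pdfHTML resolves styles internally: the class name CascadeSketch and the method resolve are hypothetical, each layer is reduced to a map of declarations for a single <a> element, and selector specificity and !important are deliberately ignored.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the cascade for one element: layers are applied in
// order of increasing priority, and a later layer overwrites an earlier one.
class CascadeSketch {

    // Merges the declarations of all layers; the last layer that sets a
    // property determines its final value.
    static Map<String, String> resolve(List<Map<String, String>> layers) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            result.putAll(layer);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = Map.of("color", "blue");   // default style sheet
        Map<String, String> external = Map.of("color", "green");  // css/movie.css
        Map<String, String> internal = Map.of("color", "red");    // <style> block in the header
        Map<String, String> noInline = Map.of();                  // no style attribute
        Map<String, String> inline = Map.of("color", "black");    // hypothetical style attribute

        // Without an inline style, the internal style sheet wins.
        System.out.println(
            resolve(List.of(defaults, external, internal, noInline)).get("color")); // red
        // An inline style overrules everything defined before it.
        System.out.println(
            resolve(List.of(defaults, external, internal, inline)).get("color"));   // black
    }
}
```

The order of the list passed to resolve is the whole point of the sketch: swapping two layers changes which declaration survives.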

In these examples, we positioned the image by creating a class named poster. We used CSS to define the width, and to position the image to the right with float. In the next example, we are going to use absolute positions.

Using absolute positioning

In posters.html, we use a distance from the top and the left side of the page to define an absolute position.

<html>
    <head><title>SXSW movies</title></head>
    <body>
        <img src="img/68_kill.jpg"
            style="position: absolute; top: 5; left: 5; width: 60;">
        <img src="img/a_bad_idea_gone_wrong.jpg"
            style="position: absolute; top: 5; left: 70; width: 60;">
        <img src="img/a_critically_endangered_species.jpg"
            style="position: absolute; top: 5; left: 135; width: 60;">
        <img src="img/big_sick.jpg"
            style="position: absolute; top: 5; left: 200; width: 60;">
        <img src="img/california_dreams.jpg"
            style="position: absolute; top: 5; left: 265; width: 60;">
        <img src="img/colossal.jpg"
            style="position: absolute; top: 5; left: 330; width: 60;">
        <img src="img/daraju.jpg"
            style="position: absolute; top: 5; left: 395; width: 60;">
        <img src="img/drib.jpg"
            style="position: absolute; top: 5; left: 460; width: 60;">
        <img src="img/hot_summer_nights.jpg"
            style="position: absolute; top: 5; left: 525; width: 60;">
        <img src="img/unrest.jpg"
            style="position: absolute; top: 5; left: 590; width: 60;">
        <img src="img/hounds_of_love.jpg"
            style="position: absolute; top: 120; left: 5; width: 60;">
        <img src="img/lane1974.jpg"
            style="position: absolute; top: 120; left: 70; width: 60;">
        <img src="img/madre.jpg"
            style="position: absolute; top: 120; left: 135; width: 60;">
        <img src="img/mfa.jpg"
            style="position: absolute; top: 120; left: 200; width: 60;">
        <img src="img/mr_roosevelt.jpg"
            style="position: absolute; top: 120; left: 265; width: 60;">
        <img src="img/nobody_speak.jpg"
            style="position: absolute; top: 120; left: 330; width: 60;">
        <img src="img/prevenge.jpg"
            style="position: absolute; top: 120; left: 395; width: 60;">
        <img src="img/the_archer.jpg"
            style="position: absolute; top: 120; left: 460; width: 60;">
        <img src="img/the_most_hated_woman_in_america.jpg"
            style="position: absolute; top: 120; left: 525; width: 60;">
        <img src="img/this_is_your_death.jpg"
            style="position: absolute; top: 120; left: 590; width: 60;">
    </body>
</html>

Figure 2.5 shows the result.

Figure 2.5: HTML page with images at absolute positions
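The twenty <img> tags in posters.html follow a regular grid: ten posters per row, a horizontal step of 65, a vertical step of 115, and an offset of 5. As an illustration only, the following sketch generates such tags in plain Java; the class PosterGrid and its method imgTag are hypothetical helpers, not part of pdfHTML, and the example files were not necessarily produced this way.

```java
// Generates <img> tags at absolute positions on a regular grid, using the
// same numbers as posters.html.
class PosterGrid {
    static final int COLUMNS = 10;    // posters per row
    static final int OFFSET = 5;      // distance from the top and left edge
    static final int LEFT_STEP = 65;  // horizontal distance between posters
    static final int TOP_STEP = 115;  // vertical distance between rows
    static final int WIDTH = 60;      // width of each poster

    // Builds the tag for the poster at the given zero-based index.
    static String imgTag(String file, int index) {
        int left = OFFSET + (index % COLUMNS) * LEFT_STEP;
        int top = OFFSET + (index / COLUMNS) * TOP_STEP;
        return "<img src=\"img/" + file + "\" style=\"position: absolute; top: "
            + top + "; left: " + left + "; width: " + WIDTH + ";\">";
    }

    public static void main(String[] args) {
        System.out.println(imgTag("68_kill.jpg", 0));         // top: 5; left: 5
        System.out.println(imgTag("unrest.jpg", 9));          // top: 5; left: 590
        System.out.println(imgTag("hounds_of_love.jpg", 10)); // top: 120; left: 5
    }
}
```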

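Be careful when you use this functionality. A position that works for an HTML page won't always fit a PDF page. You will have to take that into account, either when you design your HTML, or when you define the default page size of your PDF document. Note also that the CSS specification requires a unit on non-zero lengths (for example, top: 5px); the unitless values in this example depend on lenient parsing.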
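To get a feeling for whether a position fits, you can do the arithmetic up front. The sketch below rests on several assumptions that are not stated in this chapter: that the unitless values are treated as CSS pixels, that one pixel corresponds to 0.75 pt (the CSS definition of 96 px per inch), that page margins can be ignored, and that the page is 595 pt wide (A4) or 420 pt wide (A5). The class FitCheck and its method are hypothetical.

```java
// Rough check whether an absolutely positioned box stays within the page width.
class FitCheck {
    static final double PT_PER_PX = 0.75; // 96 CSS pixels per inch, 72 points per inch

    // Returns true when the right edge of the box, converted to points,
    // does not exceed the page width. Page margins are ignored.
    static boolean fitsHorizontally(double leftPx, double widthPx, double pageWidthPt) {
        return (leftPx + widthPx) * PT_PER_PX <= pageWidthPt;
    }

    public static void main(String[] args) {
        // Rightmost poster in posters.html: left 590, width 60.
        System.out.println(fitsHorizontally(590, 60, 595)); // true: 487.5 pt on an A4 page
        System.out.println(fitsHorizontally(590, 60, 420)); // false: too wide for an A5 page
    }
}
```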

pdfHTML also supports at-rules.

Adding "Page X of Y" using an @page rule

In movies.html, we have a list of 20 movies that are presented using movie.css for the styles. Additionally, we defined a CSS at-rule in the header of the page.

<html>
    <head>
        <title>Movies</title>
        <meta name="description" content="Selection of movies screened at SXSW 2017" />
        <link rel="stylesheet" type="text/css" href="css/movie.css">
        <style>
            @page {
                @bottom-right {
                    content: "Page " counter(page) " of " counter(pages);
                }
            }
        </style>
    </head>
    <body>
        <div style="width: 320pt;">
            <img src="img/68_kill.jpg" class="poster" />
            <h1>68 Kill (2017)</h1>
            <div class="director">Directed by Trent Haaga</div>
            <div class="description">
                 A punk-rock after hours about femininity,
                 masculinity and the theft of $68,000.
            </div>
            <div class="imdb">Read more about this movie on
            <a href="www.imdb.com/title/tt5189894">IMDB</a></div>
            <hr>
        </div>
        <div style="width: 320pt;">
            <img src="img/a_bad_idea_gone_wrong.jpg" class="poster" />
            <h1>A Bad Idea Gone Wrong (2017)</h1>
            <div class="director">Directed by Jason Headley</div>
            <div class="description">
                Two would-be thieves forge a surprising relationship with an
                unexpected housesitter when they accidentally trap themselves in
                a house they just broke into.
            </div>
            <div class="imdb">Read more about this movie on
            <a href="www.imdb.com/title/tt5212918">IMDB</a></div>
            <hr>
        </div>
        ...
    </body>
</html>

With @page, we define a footer, positioned @bottom-right, whose content is composed as "Page X of Y", where X is the current page number and Y is the total number of pages. In figure 2.6, we see page 1 of 6 in the bottom-right corner of the page.

Figure 2.6: HTML page with "page X of Y" footers
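The content declaration concatenates string literals with the values of two counters, and it is evaluated once for every page. The following plain Java sketch only mimics that evaluation for a six-page document; the class FooterSketch is hypothetical and has nothing to do with the way pdfHTML implements counters.

```java
// Mimics content: "Page " counter(page) " of " counter(pages);
class FooterSketch {

    // Concatenates the literals with the two counter values.
    static String footer(int page, int pages) {
        return "Page " + page + " of " + pages;
    }

    public static void main(String[] args) {
        int pages = 6; // total number of pages, as in figure 2.6
        for (int page = 1; page <= pages; page++) {
            System.out.println(footer(page, pages)); // first line: Page 1 of 6
        }
    }
}
```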

Now suppose that we want every movie to start on a new page. In that case, we'll introduce a page break.

Adding page breaks

We made only a single change in movies2.html: we added page-break-after: always to the top-level <div> of every movie but the last:

<html>
    <head>
        <title>Movies</title>
        <meta name="description" content="Selection of SXSW movies" />
        <link rel="stylesheet" type="text/css" href="css/movie.css">
        <style>
            @page {
                @bottom-right {
                    content: "Page " counter(page) " of " counter(pages);
                }
            }
        </style>
    </head>
    <body>
        <div style="page-break-after: always; width: 320pt;">
            <img src="img/68_kill.jpg" class="poster" />
            <h1>68 Kill (2017)</h1>
            <div class="director">Directed by Trent Haaga</div>
            <div class="description">
                 A punk-rock after hours about femininity,
                 masculinity and the theft of $68,000.
            </div>
            <div class="imdb">Read more about this movie on
            <a href="www.imdb.com/title/tt5189894">IMDB</a></div>
            <hr>
        </div>
        ...
    </body>
</html>

The movie overview is now a document with 20 pages, one page per movie, as shown in figure 2.7.

Figure 2.7: HTML page with page breaks
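If the HTML is generated rather than written by hand, the "every movie but the last" rule is easy to get wrong. As a sketch under that assumption, the following plain Java code wraps a list of HTML fragments in <div> elements and adds the page break to all but the final one. The class PageBreakSketch and its method wrap are hypothetical; the fragments stand in for the poster, title, and description of each movie.

```java
import java.util.List;

// Wraps each movie fragment in a <div> and forces a page break after
// every movie except the last one.
class PageBreakSketch {

    static String wrap(List<String> movieFragments) {
        StringBuilder html = new StringBuilder();
        for (int i = 0; i < movieFragments.size(); i++) {
            boolean last = (i == movieFragments.size() - 1);
            // The last <div> gets no page break, so no empty page is added.
            String style = last
                ? "width: 320pt;"
                : "page-break-after: always; width: 320pt;";
            html.append("<div style=\"").append(style).append("\">\n")
                .append(movieFragments.get(i)).append("\n")
                .append("</div>\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.print(wrap(List.of(
            "<h1>68 Kill (2017)</h1>",
            "<h1>A Bad Idea Gone Wrong (2017)</h1>")));
    }
}
```

The resulting string would go inside the <body> of a page such as movies2.html before it is passed to the converter.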

There's much more that we could do with CSS, but let's recap what we've done so far.

Summary

In this chapter, we've learned about inline CSS, CSS in the header, and external CSS. In the last example, we even combined all three: inline CSS to force a page break, internal CSS in the header to add a footer, and external CSS for fonts, colors, and so on. In the next chapter, we're going to use some more CSS, more specifically in the context of media queries.