How to convert an HTML table to PDF?

Category: 
Tags: XML WorkerXHTMLtablesCSSiText 5

I have an HTML file that looks like this when I render it in a browser:

HTML input

When I export the String with this HTML to PDF using a Paragraph object, I get the following output:

HTML output

This is not what I want. I want the PDF to look like the HTML in the browser.

Posted on StackOverflow on Jul 2, 2014 by user3152748

Please take a look at the examples ParseHtmlTable1 and ParseHtmlTable2. They create the following PDFs: html_table_1.pdf and html_table_2.pdf.

The table is created like this:

StringBuilder sb = new StringBuilder();
sb.append("<table border=\"2\">");
sb.append("<tr>");
sb.append("<th>Sr. No.</th>");
sb.append("<th>Text Data</th>");
sb.append("<th>Number Data</th>");
sb.append("</tr>");
for (int i = 0; i < 10; ) {
    i++;
    sb.append("<tr>");
    sb.append("<td>");
    sb.append(i);
    sb.append("</td>");
    sb.append("<td>This is text data ");
    sb.append(i);
    sb.append("</td>");
    sb.append("<td>");
    sb.append(i);
    sb.append("</td>");
    sb.append("</tr>");
}
sb.append("</table>");

I've taken the liberty to define the CSS like this:

tr { text-align: center; }
th { background-color: lightgreen; padding: 3px; }
td {background-color: lightblue;  padding: 3px; }

Now we parse the CSS and the HTML like this:

CSSResolver cssResolver = new StyleAttrCSSResolver();
CssFile cssFile = XMLWorkerHelper.getCSS(new ByteArrayInputStream(CSS.getBytes()));
cssResolver.addCss(cssFile);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline pdf = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new ByteArrayInputStream(sb.toString().getBytes()));

Now the elements list contains a single element: your table:

return (PdfPTable)elements.get(0);

You can add this table to your PDF document. This is what the result looks like:

Table presented in PDF
Table presented in PDF