Parsing XML and XHTML

There are a lot of questions about HTMLWorker on Stack Overflow. Many of these questions remain unanswered because HTMLWorker was abandoned in favor of XML Worker in 2012. HTMLWorker was initially meant as a parser for a small selection of HTML tags. People started using it as if it were a full-blown HTML to PDF converter and then complained that HTMLWorker doesn't support CSS parsing. The HTMLWorker code grew organically to the point where it was no longer maintainable.

We started another project, called XML Worker, which can be used to convert XHTML to PDF. It is not a URL-to-PDF converter in the sense that it won't "print your web site to PDF". An HTML file can contain content at the end of the file that belongs at the start of the rendered document, and one would expect that start to be the first page. That isn't possible with iText: iText flushes finished pages to the OutputStream as soon as possible, and there is no way to return to a previous page to add the extra content.

XML Worker is meant to create simple reports using an easy language such as HTML (and some CSS). It won't resolve ASP pages, nor execute JavaScript. It will only deal with finished XHTML.
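Because XML Worker only deals with finished XHTML, it can help to check that the markup is well-formed XML before trying to convert it. The following is a minimal sketch of such a pre-check using only the XML parser that ships with the JDK. It is not part of XML Worker, and the class name XhtmlCheck and the method isWellFormed are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper (not an iText class): checks well-formedness only.
final class XhtmlCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();

            // Answer every external entity request (such as a DTD named in a
            // DOCTYPE) with empty content, so nothing is fetched from the network.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader("")));

            // Rethrow errors instead of letting the parser print them to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });

            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Unclosed tags, bad nesting, and similar problems end up here.
            return false;
        }
    }
}
```

Calling XhtmlCheck.isWellFormed with a typical HTML fragment that contains an unclosed `<br>` returns false, whereas the same fragment written as `<br/>` passes. The check says nothing about which tags or CSS properties are supported; it only catches markup that is not XML at all.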


I want to use iText to convert a series of HTML files to PDF. For instance, I have these files: page1.html, page2.html, page3.html,... Now I want to create a single PDF file, where page1.html is the first page, page2.html is the second page, and so on...
I have a problem with PDF fonts. I am generating PDF from HTML, and that worked fine on my local machine, which runs Windows. But now that I have deployed my application on a Linux server, my Cyrillic text is displayed as question marks.
I want to generate PDF from an ASPX page using a CSS file. How can I do this using iTextSharp? I've downloaded itextsharp-all-5.5.7, but which of all the DLLs must I include in my ASP.NET C# project?
I want to convert HTML to PDF, but I don't know where to start.
Could anybody explain to me why it is so complicated to create a PDF document from an XML file?
I am trying to convert HTML to PDF. I found that iTextSharp does not support CSS well. In fact, I think HTMLWorker does not support it at all.
I'm facing a problem when trying to export a Vietnamese document as PDF using iText.
I would like to add a CSS file while generating PDF from an HTML file.
I am parsing an HTML string using iTextSharp XMLWorker. The parsing works fine, but it is really slow: it takes around 2 seconds to parse the HTML.
I am using XML Worker to parse an HTML string into a PDF Document, and cannot find a way to control the line spacing of the PDF being generated.