More and more email messages are formatted as HTML these days. To me, email is a plain-text medium (with attachments), and I don't use a WebBrowser or Mozilla object as the message viewer in my email client. So I wrote a quick and dirty HTML parser instead. It seems good enough for parsing email messages, but it needs some work before it can serve as a more general-purpose HTML parser.
TDocument, TNode, TElement, TAttr, etc. implement DOM Level 2 Core.
THtmlParser produces a TDocument from an HTML string:
function parseString(const HtmlStr: TDomString): TDocument;
THtmlParser uses THtmlReader, an event-driven, SAX-like interface.
To convert a DOM tree to plain text or HTML, use TTextFormatter or THtmlFormatter, respectively.
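The parse-then-format flow described above might look roughly like the sketch below. Only THtmlParser, parseString, TDocument, TDomString and TTextFormatter come from this page. The unit names (DomCore, HtmlParser, Formatter), the parameterless constructors, the getText method and the rule that the caller frees the document are all assumptions made for illustration, so check them against the actual sources.

```pascal
program HtmlToText;

{$APPTYPE CONSOLE}

uses
  // Assumed unit names; adjust them to the modules shipped in the archive.
  DomCore, HtmlParser, Formatter;

// Hypothetical helper: parses an HTML string and returns its plain-text form.
function HtmlToPlainText(const HtmlStr: TDomString): TDomString;
var
  Parser: THtmlParser;
  Doc: TDocument;
  TextFormatter: TTextFormatter;
begin
  // Step 1: build the DOM tree from the HTML string.
  Parser := THtmlParser.Create;
  try
    Doc := Parser.parseString(HtmlStr);
  finally
    Parser.Free;
  end;

  // Step 2: walk the DOM tree and render it as plain text.
  try
    TextFormatter := TTextFormatter.Create;
    try
      // getText is an assumed method name, not confirmed by this page.
      Result := TextFormatter.getText(Doc);
    finally
      TextFormatter.Free;
    end;
  finally
    // Assumption: the caller owns the document returned by parseString.
    Doc.Free;
  end;
end;

begin
  WriteLn(HtmlToPlainText('<p>Hello, <b>world</b>!</p>'));
end.
```

Swapping TTextFormatter for THtmlFormatter in step 2 should give HTML output instead, assuming both formatters expose the same kind of method.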
The parser is implemented as several modules:
You can download the project modules as a zip archive. This is a small project, so only the latest (always stable ;-) version is available.