Html2txt

From unkrig.de
Revision as of 21:19, 9 May 2015 by Aunkrig (talk | contribs)

A tool to convert HTML documents into plain text.

Html2txt is written in Java; it is available as a command-line tool and as an Apache Ant task.

Some HTML elements are converted into "markup" characters. For example,

This is a <var>variable</var>

is converted into

This is a <variable>

Other elements are simply ignored because they cannot reasonably be converted into text.
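The <var> example above can be illustrated with a minimal Java sketch. This is not Html2txt's actual implementation; the class name VarMarkupSketch and the method toText are hypothetical, and the regex-based approach is an assumption chosen for brevity. It handles only unnested <var> elements, drops every other tag, and does not decode character entities.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical helper, not part of Html2txt.
public class VarMarkupSketch {

    // First alternative: a <var>...</var> element without nested tags (group 1 is its text).
    // Second alternative: any other tag, which is dropped.
    private static final Pattern TAGS =
        Pattern.compile("(?i)<var>([^<]*)</var>|<[^>]*>");

    public static String toText(String html) {
        Matcher m = TAGS.matcher(html);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Wrap the text of a <var> element in angle brackets; drop all other tags.
            String replacement = m.group(1) != null ? "<" + m.group(1) + ">" : "";
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toText("This is a <var>variable</var>"));
    }
}
```

Both cases are handled in a single pass so that the angle brackets produced for a <var> element are not themselves stripped as if they were a tag.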

For a complete description of the supported HTML inline elements, see here.

For a complete description of the supported HTML block elements, see here.