Html2txt: Difference between revisions

From unkrig.de
(Created page with "A tool to convert HTML documents into plain text. Html2txt is written in Java; it is available as a command line tool and as an APACHE ANT task. Some HTML elements are conve...")
 
No edit summary
Line 14:


For a complete description of the supported HTML inline elements, see
<span class="plainlinks">[http://html2txt.unkrig.de/javadoc/src-html/de/unkrig/html2txt/Html2Txt.html#line.1269 here]</span>.
<span class="plainlinks">[http://html2txt.unkrig.de/javadoc/de/unkrig/html2txt/Html2Txt.html#ALL_INLINE_ELEMENTS here]</span>.
 
For a complete description of the supported HTML block elements, see
<span class="plainlinks">[http://html2txt.unkrig.de/javadoc/de/unkrig/html2txt/Html2Txt.html#ALL_BLOCK_ELEMENTS here]</span>.

Revision as of 21:19, 9 May 2015

A tool to convert HTML documents into plain text.

Html2txt is written in Java; it is available as a command-line tool and as an Apache Ant task.

Some HTML elements are converted into "markup" characters. For example,

This is a <var>variable</var>

converts into

This is a <variable>

Other elements are simply ignored because they cannot reasonably be converted into text.
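The conversion described above can be illustrated with a small standalone sketch. It does not use Html2txt's own classes or reproduce its implementation; the class name `VarMarkupSketch` and the method `convert` are hypothetical, and the regular-expression approach is only an assumption chosen for brevity. Which elements the real tool ignores is documented in its Javadoc; for illustration, this sketch simply drops every tag other than `<var>`.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; NOT part of Html2txt and not its algorithm.
class VarMarkupSketch {

    // Any opening or closing tag except <var> and </var>.
    private static final Pattern OTHER_TAG =
        Pattern.compile("</?(?!var\\b)[a-z][^>]*>", Pattern.CASE_INSENSITIVE);

    // A <var> element whose content contains no nested tags.
    private static final Pattern VAR_ELEMENT =
        Pattern.compile("<var>([^<]*)</var>", Pattern.CASE_INSENSITIVE);

    static String convert(String html) {
        // Step 1: drop every tag other than <var>, keeping the enclosed text.
        String withoutOtherTags = OTHER_TAG.matcher(html).replaceAll("");
        // Step 2: rewrite <var>name</var> as the markup <name>.
        // This runs second so that the new angle brackets are not stripped.
        return VAR_ELEMENT.matcher(withoutOtherTags).replaceAll("<$1>");
    }

    public static void main(String[] args) {
        // Prints: This is a <variable>
        System.out.println(convert("This is a <var>variable</var>"));
    }
}
```

A real HTML-to-text converter would parse the document properly instead of relying on regular expressions, which cannot handle nested or malformed markup; the sketch only mirrors the input and output shown in the example.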

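For a complete description of the supported HTML inline elements, see the Javadoc for ALL_INLINE_ELEMENTS (http://html2txt.unkrig.de/javadoc/de/unkrig/html2txt/Html2Txt.html#ALL_INLINE_ELEMENTS).

For a complete description of the supported HTML block elements, see the Javadoc for ALL_BLOCK_ELEMENTS (http://html2txt.unkrig.de/javadoc/de/unkrig/html2txt/Html2Txt.html#ALL_BLOCK_ELEMENTS).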