Talk:Comparison of HTML parsers
Please check/review column definitions
- The software, an "HTML parser"... Is a DOM with a LoadHTML method an "HTML parser"!? Some standalone software only transforms HTML; other software "enables" programmers to traverse all nodes, etc. What is the software taxonomy here??
- Implementation language(s)
- OK, but this should not be confused with a "driver/bridge for a binary implementation".
- Latest date
- Latest release date of significant changes in the implementation source code.
- HTML Parsing
- Common sense says that all "HTML parsers" would have YES under "HTML Parsing"... So this is the same problem as with the "Parser" column: does a DOMDocument class with a LoadHTML method count as "enabling" programmers to do "HTML parsing"!?
- Clean HTML
- Sanitizes HTML code (generates a standards-compatible web page, reduces spam, etc.) and cleans it (strips out surplus presentational tags, removes XSS code, etc.).
- Update HTML
- Updates HTML4.X to XHTML or to HTML5, converting deprecated tags (e.g. CENTER) to valid ones (e.g. DIV with style="text-align:center;").
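
To make the "Update HTML" definition concrete, here is a minimal sketch of the CENTER to DIV conversion, written in Python with only the standard library's `html.parser`. The names `CenterToDiv` and `update_html` are hypothetical and chosen just for this illustration; none of the compared parsers necessarily works this way. It is only meant to show what a "YES" in that column could mean. Note that this sketch drops comments and doctype declarations, and any attributes on the CENTER tag itself.

```python
from html.parser import HTMLParser


class CenterToDiv(HTMLParser):
    """Re-emit the input, replacing <center> with a styled <div> (hypothetical helper)."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <CENTER> and <center> both match.
        if tag == "center":
            self.out.append('<div style="text-align:center;">')
        else:
            # Copy any other start tag through exactly as it was written.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</div>" if tag == "center" else f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def update_html(source):
    """Return `source` with deprecated CENTER elements rewritten as DIVs."""
    parser = CenterToDiv()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


print(update_html("<p><CENTER>Hello</CENTER></p>"))
# prints: <p><div style="text-align:center;">Hello</div></p>
```

A tool that only does this kind of rewrite is arguably a "transformer" rather than a general parser exposing nodes to the programmer, which is the taxonomy question raised above.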