Class HtmlDocumentBuilder

java.lang.Object
javax.xml.parsers.DocumentBuilder
nu.validator.htmlparser.dom.HtmlDocumentBuilder

public class HtmlDocumentBuilder extends DocumentBuilder
This class implements an HTML5 parser that exposes data through the DOM interface.

By default, when constructed without arguments, this parser coerces XML 1.0-incompatible infosets into XML 1.0-compatible infosets. This corresponds to ALTER_INFOSET as the general XML violation policy. To make the parser fully support non-conforming HTML per the HTML 5 spec, at the cost of potentially violating the SAX2 API contract, set the general XML violation policy to ALLOW; this setting does not work with a standard DOM implementation. To treat XML 1.0 infoset violations as fatal, set the general XML violation policy to FATAL.

The doctype is not represented in the tree.

The document mode is stored on the document node as user data: a DocumentMode object under the key nu.validator.document-mode.

The form pointer is also stored as user data with the key nu.validator.form-pointer.
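
The user-data keys above can be read back through the standard DOM Level 3 Node.getUserData method. The following is a minimal sketch, not part of this class: the UserDataLookup class and its userDataOr helper are hypothetical names, and the document is a stand-in created with the JAXP DocumentBuilderFactory so that the example runs without the parser on the classpath. In real use the document would come from HtmlDocumentBuilder, and the stored value would be a DocumentMode object rather than the placeholder string used here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class UserDataLookup {
    // Keys documented for HtmlDocumentBuilder.
    static final String DOCUMENT_MODE_KEY = "nu.validator.document-mode";
    static final String FORM_POINTER_KEY = "nu.validator.form-pointer";

    /** Returns the user data stored under key on node, or fallback if none is set. */
    static Object userDataOr(Node node, String key, Object fallback) {
        Object value = node.getUserData(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in document (assumption): a real one would be returned by
        // HtmlDocumentBuilder, which sets the user data itself.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        // Simulate what the parser would store; the string is only a placeholder
        // for the DocumentMode object.
        doc.setUserData(DOCUMENT_MODE_KEY, "placeholder-mode", null);

        System.out.println(userDataOr(doc, DOCUMENT_MODE_KEY, "unknown"));
        System.out.println(userDataOr(doc, FORM_POINTER_KEY, "none"));
    }
}
```

Returning a fallback keeps callers from having to null-check when a key is absent, as it is for any document that was not produced by this builder.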

Author:
hsivonen
  • Constructor Details

    • HtmlDocumentBuilder

      public HtmlDocumentBuilder(DOMImplementation implementation, XmlViolationPolicy xmlPolicy)
      Instantiates the document builder with a specific DOM implementation and XML violation policy.
      Parameters:
      implementation - the DOM implementation
      xmlPolicy - the XML violation policy
    • HtmlDocumentBuilder

      public HtmlDocumentBuilder(DOMImplementation implementation)
      Instantiates the document builder with a specific DOM implementation and the infoset-altering XML violation policy.
      Parameters:
      implementation - the DOM implementation
    • HtmlDocumentBuilder

      public HtmlDocumentBuilder()
      Instantiates the document builder with the JAXP DOM implementation and the infoset-altering XML violation policy.
    • HtmlDocumentBuilder

      public HtmlDocumentBuilder(XmlViolationPolicy xmlPolicy)
      Instantiates the document builder with the JAXP DOM implementation and a specific XML violation policy.
      Parameters:
      xmlPolicy - the XML violation policy
  • Method Details