module Kramdown::Parser::Html::Parser

Contains the parsing methods. This module can be mixed into any parser to get HTML parsing functionality. The including class only needs to provide two instance variables: @stack, for storing the needed state, and @src (an instance of StringScanner), for the actual parsing.

Public Instance Methods

handle_html_start_tag(line = nil) { |el, closed, handle_body| ... }

Process the HTML start tag that has already been scanned/checked via @src.

Performs the common processing steps and then yields to the caller for further processing. The block receives three parameters: the created element; a flag that is true if the HTML element is already closed, i.e., has no body; and a flag that specifies whether the body and the end tag still need to be handled when closed is false.
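
The three block parameters can be illustrated with a minimal, self-contained sketch. It does not use kramdown itself: SIMPLE_TAG_RE, VOID_ELEMENTS, sketch_start_tag and classify are hypothetical stand-ins, and the element is reduced to its tag name. It only mirrors the yield protocol described above.

```ruby
require 'strscan'

# Hypothetical, simplified stand-ins (NOT kramdown's HTML_TAG_RE or
# HTML_ELEMENTS_WITHOUT_BODY): group 1 = tag name, group 3 = "/" if self-closed.
SIMPLE_TAG_RE = /<([A-Za-z][\w-]*)((?:\s+[^<>\/]*)?)\s*(\/)?>/
VOID_ELEMENTS = %w[br hr img input].freeze

# Mirrors the yield protocol: yields (name, closed, handle_body).
# Expects that src has just matched SIMPLE_TAG_RE.
def sketch_start_tag(src)
  name = src[1].downcase
  closed = !src[3].nil? || VOID_ELEMENTS.include?(name)
  if name == 'script' || name == 'style'
    # The documented method consumes the raw body at this point;
    # this sketch only signals that the caller must not parse it.
    yield(name, false, false)
  else
    yield(name, closed, true)
  end
end

# Example caller: decides what to do based on the two flags.
def classify(html)
  src = StringScanner.new(html)
  return nil unless src.scan(SIMPLE_TAG_RE)
  sketch_start_tag(src) do |name, closed, handle_body|
    if !handle_body
      [name, :body_already_handled]
    elsif closed
      [name, :no_body]
    else
      [name, :parse_body]
    end
  end
end
```

A caller would typically recurse into the element's body only in the :parse_body case.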

    # File lib/kramdown/parser/html.rb
    def handle_html_start_tag(line = nil) # :yields: el, closed, handle_body
      name = @src[1]
      name.downcase! if HTML_ELEMENT[name.downcase]
      closed = !@src[4].nil?
      attrs = parse_html_attributes(@src[2], line, HTML_ELEMENT[name])

      el = Element.new(:html_element, name, attrs, category: :block)
      el.options[:location] = line if line
      @tree.children << el

      if !closed && HTML_ELEMENTS_WITHOUT_BODY.include?(el.value)
        closed = true
      end
      if name == 'script' || name == 'style'
        handle_raw_html_tag(name)
        yield(el, false, false)
      else
        yield(el, closed, true)
      end
    end

handle_raw_html_tag(name)

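Handle the raw HTML tag at the current position. Everything up to the matching end tag is added as raw text to the last child of the current tree. If no end tag is found, the rest of the source is consumed as raw text and a warning is emitted.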
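
The scanning technique can be sketched in isolation with a plain StringScanner: a zero-width lookahead stops the scanner just before the end tag, so the raw body and the end tag are consumed separately. The helper read_raw_body is hypothetical and returns its result instead of adding text to a tree.

```ruby
require 'strscan'

# Hypothetical helper: reads the raw body of a <script>/<style>-like element.
# Returns [raw_text, found_end_tag].
def read_raw_body(src, name)
  end_tag = /<\/#{Regexp.escape(name)}\s*>/mi
  # The lookahead matches with zero width, so scan_until returns only the body
  # and leaves the scan pointer directly in front of the end tag.
  if (raw = src.scan_until(/(?=#{end_tag})/mi))
    src.scan(end_tag) # consume the end tag itself
    [raw, true]
  else
    # No end tag: take everything that is left and move to the end.
    raw = src.rest
    src.terminate
    [raw, false]
  end
end
```

For example, on the input `var a = '<b>';</SCRIPT >rest` with name 'script', the body is returned unchanged and the scanner is left at `rest`.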

    # File lib/kramdown/parser/html.rb
    def handle_raw_html_tag(name)
      curpos = @src.pos
      if @src.scan_until(/(?=<\/#{name}\s*>)/mi)
        add_text(extract_string(curpos...@src.pos, @src), @tree.children.last, :raw)
        @src.scan(HTML_TAG_CLOSE_RE)
      else
        add_text(@src.rest, @tree.children.last, :raw)
        @src.terminate
        warning("Found no end tag for '#{name}' - auto-closing it")
      end
    end

parse_html_attributes(str, line = nil, in_html_tag = true)

Parses the given string for HTML attributes and returns the resulting hash.

If the optional line parameter is supplied, it is used in warning messages.

If the optional in_html_tag parameter is set to false, attribute names are not converted to lowercase.
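
The behavior described above (lowercasing of names, empty string for valueless attributes, later duplicates overwriting earlier ones) can be sketched without kramdown. SIMPLE_ATTR_RE is a hypothetical, simplified pattern and not kramdown's HTML_ATTRIBUTE_RE; sketch_parse_attributes is likewise hypothetical and returns the warnings instead of logging them.

```ruby
# Hypothetical, simplified attribute pattern (NOT kramdown's HTML_ATTRIBUTE_RE).
# Groups: 1 = name, 2 = unquoted value, 3 = quote character, 4 = quoted value.
SIMPLE_ATTR_RE = /\s*([\w:-]+)(?:\s*=\s*(?:([^\s"'<>=]+)|(["'])(.*?)\3))?/m

# Returns [attrs_hash, warnings]. Attribute names are lowercased only when
# in_html_tag is true; a duplicate name overwrites the earlier value.
def sketch_parse_attributes(str, line = nil, in_html_tag = true)
  attrs = {}
  warnings = []
  str.scan(SIMPLE_ATTR_RE).each do |attr, val, _sep, quoted_val|
    attr = attr.downcase if in_html_tag
    if attrs.key?(attr)
      warnings << "Duplicate HTML attribute '#{attr}' on line #{line || '?'}"
    end
    attrs[attr] = val || quoted_val || ""
  end
  [attrs, warnings]
end
```

With the input `CLASS="a" id=main disabled class='b'`, the two class spellings collapse into one key when in_html_tag is true, and stay separate keys when it is false.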

    # File lib/kramdown/parser/html.rb
    def parse_html_attributes(str, line = nil, in_html_tag = true)
      attrs = {}
      str.scan(HTML_ATTRIBUTE_RE).each do |attr, val, _sep, quoted_val|
        attr.downcase! if in_html_tag
        if attrs.key?(attr)
          warning("Duplicate HTML attribute '#{attr}' on line #{line || '?'} - overwriting previous one")
        end
        attrs[attr] = val || quoted_val || ""
      end
      attrs
    end

parse_raw_html(el, &block)

Parse raw HTML from the current source position, storing the found elements in el. Parsing continues until one of the following criteria is fulfilled:

  • The end of the document is reached.

  • The matching end tag for the element el is found (only used if el is an HTML element).

When an HTML start tag is found, processing is deferred to handle_html_start_tag, providing the block given to this method.
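
The overall loop can be sketched with a plain StringScanner and simplified patterns. Everything in this sketch is hypothetical (Node, the *_RE constants, parse_into); attributes are ignored, adjacent text nodes are not merged, and plain recursion stands in for the @stack handling and the caller-supplied block of the documented method.

```ruby
require 'strscan'

Node = Struct.new(:type, :value, :children)

# Hypothetical, simplified patterns (NOT kramdown's constants).
COMMENT_RE = /<!--.*?-->/m
PI_RE      = /<\?.*?\?>/m
CDATA_RE   = /<!\[CDATA\[(.*?)\]\]>/m
OPEN_RE    = /<([A-Za-z][\w-]*)\s*(\/)?>/
CLOSE_RE   = /<\/([A-Za-z][\w-]*)\s*>/

# Parses from the current position into el until the matching end tag
# for el or the end of the input is reached.
def parse_into(src, el)
  until src.eos?
    text = src.scan_until(/(?=<)/)
    if text.nil?
      # No further markup: the remainder is plain text.
      el.children << Node.new(:text, src.rest, [])
      src.terminate
      break
    end
    el.children << Node.new(:text, text, []) unless text.empty?
    if (m = src.scan(COMMENT_RE))
      el.children << Node.new(:comment, m, [])
    elsif (m = src.scan(PI_RE))
      el.children << Node.new(:pi, m, [])
    elsif src.scan(CDATA_RE)
      el.children << Node.new(:text, src[1], [])
    elsif src.scan(OPEN_RE)
      closed = !src[2].nil?
      child = Node.new(:element, src[1].downcase, [])
      el.children << child
      parse_into(src, child) unless closed
    elsif src.scan(CLOSE_RE)
      break if el.value == src[1].downcase
      # Stray closing tag: keep it as literal text.
      el.children << Node.new(:text, src.matched, [])
    else
      # A lone "<" that starts nothing recognizable.
      el.children << Node.new(:text, src.getch, [])
    end
  end
  el
end
```

Parsing `a<p>b<!-- c --></p></x>d` into a root node with this sketch yields a text node, a p element holding a text node and a comment, the stray `</x>` as text, and a final text node.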

    # File lib/kramdown/parser/html.rb
    def parse_raw_html(el, &block)
      @stack.push(@tree)
      @tree = el

      done = false
      while !@src.eos? && !done
        if (result = @src.scan_until(HTML_RAW_START))
          add_text(result, @tree, :text)
          line = @src.current_line_number
          if (result = @src.scan(HTML_COMMENT_RE))
            @tree.children << Element.new(:xml_comment, result, nil, category: :block, location: line)
          elsif (result = @src.scan(HTML_INSTRUCTION_RE))
            @tree.children << Element.new(:xml_pi, result, nil, category: :block, location: line)
          elsif @src.scan(HTML_CDATA_RE)
            @tree.children << Element.new(:text, @src[1], nil, cdata: true, location: line)
          elsif @src.scan(HTML_TAG_RE)
            if method(:handle_html_start_tag).arity.abs >= 1
              handle_html_start_tag(line, &block)
            else
              handle_html_start_tag(&block) # DEPRECATED: method needs to accept line number in 2.0
            end
          elsif @src.scan(HTML_TAG_CLOSE_RE)
            if @tree.value == (HTML_ELEMENT[@tree.value] ? @src[1].downcase : @src[1])
              done = true
            else
              add_text(@src.matched, @tree, :text)
              warning("Found invalidly used HTML closing tag for '#{@src[1]}' on " \
                      "line #{line} - ignoring it")
            end
          else
            add_text(@src.getch, @tree, :text)
          end
        else
          add_text(@src.rest, @tree, :text)
          @src.terminate
          if @tree.type == :html_element
            warning("Found no end tag for '#{@tree.value}' on line " \
                    "#{@tree.options[:location]} - auto-closing it")
          end
          done = true
        end
      end

      @tree = @stack.pop
    end