module Kramdown::Parser::Html::Parser
Contains the parsing methods. This module can be mixed into any parser to get HTML parsing functionality. The including class only has to provide two instance variables: @stack, for storing the needed state, and @src (an instance of StringScanner), for the actual parsing.
Public Instance Methods
Process the HTML start tag that has already been scanned/checked via @src.
Does the common processing steps and then yields to the caller for further
processing. The first block parameter is the created element; the second
parameter is true if the HTML element is already closed, i.e. contains no
body; the third parameter specifies whether the body, and the end tag,
need to be handled in case closed=false.
# File lib/kramdown/parser/html.rb, line 77
def handle_html_start_tag(line = nil) # :yields: el, closed, handle_body
  name = @src[1].downcase
  closed = !@src[4].nil?
  attrs = Utils::OrderedHash.new
  @src[2].scan(HTML_ATTRIBUTE_RE).each {|attr,sep,val| attrs[attr.downcase] = val || ""}

  el = Element.new(:html_element, name, attrs, :category => :block)
  el.options[:location] = line if line
  @tree.children << el

  if !closed && HTML_ELEMENTS_WITHOUT_BODY.include?(el.value)
    warning("The HTML tag '#{el.value}' on line #{line} cannot have any content - auto-closing it")
    closed = true
  end
  if name == 'script' || name == 'style'
    handle_raw_html_tag(name)
    yield(el, false, false)
  else
    yield(el, closed, true)
  end
end
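The common processing steps (lower-casing the tag name, collecting attributes into a hash, and auto-closing elements that cannot have a body) can be sketched with the standard library alone. This is a minimal illustration, not kramdown's code: TAG_RE, ATTR_RE, VOID_ELEMENTS and scan_start_tag are hypothetical stand-ins for HTML_TAG_RE, HTML_ATTRIBUTE_RE, HTML_ELEMENTS_WITHOUT_BODY and the method above, and the patterns are deliberately simpler than the real ones.

```ruby
require 'strscan'

# Hypothetical, simplified patterns; kramdown's own constants are more thorough.
TAG_RE  = /<([a-zA-Z][\w-]*)((?:\s+[\w-]+(?:\s*=\s*(?:"[^"]*"|'[^']*'))?)*)\s*(\/)?>/
ATTR_RE = /([\w-]+)(?:\s*=\s*(["'])(.*?)\2)?/m

# Hypothetical short list standing in for HTML_ELEMENTS_WITHOUT_BODY.
VOID_ELEMENTS = %w[br hr img input].freeze

# Scans one start tag at the current scanner position.
# Returns [name, attrs, closed], or nil if no start tag begins there.
def scan_start_tag(src)
  return nil unless src.scan(TAG_RE)

  name   = src[1].downcase
  closed = !src[3].nil?            # true for the self-closing form <tag />
  attrs  = {}
  # Attribute names are lower-cased; attributes without a value map to "".
  src[2].scan(ATTR_RE) { |attr, _quote, val| attrs[attr.downcase] = val || "" }

  # Elements that cannot have content are treated as closed.
  closed = true if !closed && VOID_ELEMENTS.include?(name)
  [name, attrs, closed]
end

name, attrs, closed = scan_start_tag(StringScanner.new(%q{<P Class="note" hidden>}))
puts "#{name}: class=#{attrs['class']}, closed=#{closed}"
```

The sketch returns the values instead of yielding them, which keeps it short; the real method yields so that the caller decides how the body is parsed.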
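Handle the raw HTML tag at the current position. Everything up to the matching end tag is added as raw text to the last element; if no end tag is found, the rest of the source is consumed and the tag is auto-closed with a warning.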
# File lib/kramdown/parser/html.rb, line 100
def handle_raw_html_tag(name)
  curpos = @src.pos
  if @src.scan_until(/(?=<\/#{name}\s*>)/mi)
    add_text(extract_string(curpos...@src.pos, @src), @tree.children.last, :raw)
    @src.scan(HTML_TAG_CLOSE_RE)
  else
    add_text(@src.rest, @tree.children.last, :raw)
    @src.terminate
    warning("Found no end tag for '#{name}' - auto-closing it")
  end
end
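The lookahead trick used here, scanning up to but not past the end tag, can be tried in isolation. The following is a small sketch using only StringScanner; read_raw_body is a hypothetical helper, and it returns the text rather than adding it to a tree.

```ruby
require 'strscan'

# Reads the raw body of a <script>/<style>-like element.
# Returns [raw_text, found_end_tag] and consumes the end tag when present.
def read_raw_body(src, name)
  end_tag = /<\/#{Regexp.escape(name)}\s*>/i
  # The zero-width lookahead stops the scanner just before the end tag,
  # so scan_until returns only the text in front of it.
  if (raw = src.scan_until(/(?=#{end_tag.source})/mi))
    src.scan(end_tag)              # now consume the end tag itself
    [raw, true]
  else
    raw = src.rest                 # no end tag: take everything that is left
    src.terminate
    [raw, false]
  end
end

src = StringScanner.new("if (a < b) { run(); }</SCRIPT >after")
raw, found = read_raw_body(src, "script")
puts raw
puts found
puts src.rest
```

Because the body is never tokenized, characters such as `<` inside script or style content do not start a new element.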
Parse raw HTML from the current source position, storing the found elements
in el. Parsing continues until one of the following criteria is fulfilled:
- The end of the document is reached.
- The matching end tag for the element el is found (only used if el is an HTML element).
When an HTML start tag is found, processing is deferred to handle_html_start_tag, providing the block given to this method.
# File lib/kramdown/parser/html.rb, line 123
def parse_raw_html(el, &block)
  @stack.push(@tree)
  @tree = el

  done = false
  while !@src.eos? && !done
    if result = @src.scan_until(HTML_RAW_START)
      add_text(result, @tree, :text)
      line = @src.current_line_number
      if result = @src.scan(HTML_COMMENT_RE)
        @tree.children << Element.new(:xml_comment, result, nil, :category => :block, :location => line)
      elsif result = @src.scan(HTML_INSTRUCTION_RE)
        @tree.children << Element.new(:xml_pi, result, nil, :category => :block, :location => line)
      elsif @src.scan(HTML_TAG_RE)
        if method(:handle_html_start_tag).arity == 1
          handle_html_start_tag(line, &block)
        else
          handle_html_start_tag(&block) # DEPRECATED: method needs to accept line number in 2.0
        end
      elsif @src.scan(HTML_TAG_CLOSE_RE)
        if @tree.value == @src[1].downcase
          done = true
        else
          warning("Found invalidly used HTML closing tag for '#{@src[1].downcase}' on line #{line} - ignoring it")
        end
      else
        add_text(@src.getch, @tree, :text)
      end
    else
      add_text(@src.rest, @tree, :text)
      @src.terminate
      warning("Found no end tag for '#{@tree.value}' on line #{@tree.options[:location]} - auto-closing it") if @tree.type == :html_element
      done = true
    end
  end

  @tree = @stack.pop
end
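The role of @stack and @tree in this loop may be easier to see in a stripped-down form. The sketch below is an assumption-laden miniature, not kramdown's implementation: MiniRawParser, Node and its three patterns are hypothetical, it only understands attribute-less start tags, it skips comments and processing instructions, and it recurses directly where the real method defers to handle_html_start_tag and the caller's block.

```ruby
require 'strscan'

# Hypothetical minimal tree node (kramdown uses Kramdown::Element).
Node = Struct.new(:type, :value, :children)

class MiniRawParser
  RAW_START = /(?=<[\/a-zA-Z])/        # stand-in for HTML_RAW_START
  OPEN_TAG  = /<([a-zA-Z][\w-]*)\s*>/  # attribute-less start tags only
  CLOSE_TAG = /<\/([a-zA-Z][\w-]*)\s*>/

  attr_reader :warnings

  def initialize(source)
    @src = StringScanner.new(source)
    @stack = []
    @tree = nil
    @warnings = []
  end

  def parse
    root = Node.new(:root, nil, [])
    parse_into(root)
    root
  end

  private

  # Same shape as parse_raw_html: save the current tree on the stack,
  # make el the current tree, parse, then restore the previous tree.
  def parse_into(el)
    @stack.push(@tree)
    @tree = el
    done = false
    while !@src.eos? && !done
      if (text = @src.scan_until(RAW_START))
        add_text(text)
        if @src.scan(OPEN_TAG)
          child = Node.new(:element, @src[1].downcase, [])
          @tree.children << child
          parse_into(child)
        elsif @src.scan(CLOSE_TAG)
          if @tree.value == @src[1].downcase
            done = true
          else
            @warnings << "stray closing tag '#{@src[1].downcase}'"
          end
        else
          add_text(@src.getch)  # a '<' that starts nothing we recognize
        end
      else
        add_text(@src.rest)
        @src.terminate
        @warnings << "no end tag for '#{@tree.value}'" if @tree.type == :element
        done = true
      end
    end
    @tree = @stack.pop
  end

  # Appends text to the current tree, merging with a preceding text node.
  def add_text(text)
    return if text.empty?
    last = @tree.children.last
    if last && last.type == :text
      last.value << text
    else
      @tree.children << Node.new(:text, text.dup, [])
    end
  end
end

root = MiniRawParser.new("a<p>b<em>c</em></p>d").parse
puts root.children.map(&:type).inspect
```

Pushing the previous tree before descending and popping it afterwards is what lets a single @tree variable always point at the element currently being filled.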