Class JsoupBasedHtmlParser
java.lang.Object
org.apache.jmeter.protocol.http.parser.HTMLParser
org.apache.jmeter.protocol.http.parser.JsoupBasedHtmlParser
Parser based on JSOUP
- Since:
- 2.10
TODO: Factor out the code this parser shares with LagartoBasedHtmlParser.
-
Field Summary
Fields inherited from class org.apache.jmeter.protocol.http.parser.HTMLParser
ATT_BACKGROUND, ATT_CODE, ATT_CODEBASE, ATT_DATA, ATT_HREF, ATT_IS_IMAGE, ATT_REL, ATT_SRC, ATT_STYLE, ATT_TYPE, DEFAULT_PARSER, IE_UA, IE_UA_PATTERN, PARSER_CLASSNAME, STYLESHEET, TAG_APPLET, TAG_BASE, TAG_BGSOUND, TAG_BODY, TAG_EMBED, TAG_FRAME, TAG_IFRAME, TAG_IMAGE, TAG_INPUT, TAG_LINK, TAG_OBJECT, TAG_SCRIPT
-
Constructor Summary
Constructors
JsoupBasedHtmlParser()
-
Method Summary
Modifier and Type, Method, Description
Iterator<URL> getEmbeddedResourceURLs(String userAgent, byte[] html, URL baseUrl, URLCollection coll, String encoding)
Get the URLs for all the resources that a browser would automatically download following the download of the HTML content, that is: images, stylesheets, JavaScript files, applets, etc.
protected boolean isReusable()
Parsers should override this method if the parser class is reusable, in which case the class will be cached for the next getParser() call.
Methods inherited from class org.apache.jmeter.protocol.http.parser.HTMLParser
extractIEVersion, getEmbeddedResourceURLs, getEmbeddedResourceURLs, getParser, getParser, isEnableConditionalComments
-
Constructor Details
-
JsoupBasedHtmlParser
public JsoupBasedHtmlParser()
-
-
Method Details
-
getEmbeddedResourceURLs
public Iterator<URL> getEmbeddedResourceURLs(String userAgent, byte[] html, URL baseUrl, URLCollection coll, String encoding) throws HTMLParseException
Description copied from class: HTMLParser
Get the URLs for all the resources that a browser would automatically download following the download of the HTML content, that is: images, stylesheets, JavaScript files, applets, etc. All URLs should be added to the Collection.
Malformed URLs can be reported to the caller by having the Iterator return the corresponding URL String. Overall problems parsing the HTML should be reported by throwing an HTMLParseException.
N.B. The Iterator returns URLs, but the Collection will contain objects of class URLString.
- Specified by:
getEmbeddedResourceURLs in class HTMLParser
- Parameters:
userAgent - User Agent
html - HTML code
baseUrl - Base URL from which the HTML code was obtained
coll - URLCollection
encoding - Charset
- Returns:
- an Iterator for the resource URLs
- Throws:
HTMLParseException - when parsing the HTML fails
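The contract above (resolve each embedded reference against the base URL, and hand malformed references back as Strings) can be illustrated with a small, JDK-only sketch. It is not JMeter's implementation: ResourceUrlResolver and resolveAll are hypothetical names, the base is passed as a java.net.URI rather than a URL for simplicity, and the raw references are assumed to have already been pulled out of the HTML by some parser.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of JMeter.
class ResourceUrlResolver {

    /**
     * Resolves each raw reference (for example the value of a src or href
     * attribute) against the base URI. A reference that resolves is added
     * as a java.net.URL; one that cannot be parsed or resolved is added
     * unchanged as a String, mirroring the "report malformed URLs as
     * Strings" behaviour described above.
     */
    static List<Object> resolveAll(URI base, List<String> rawRefs) {
        List<Object> out = new ArrayList<>();
        for (String ref : rawRefs) {
            try {
                URI resolved = base.resolve(new URI(ref.trim()));
                out.add(resolved.toURL());
            } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
                // Keep the original text so the caller can see what failed.
                out.add(ref);
            }
        }
        return out;
    }
}
```

Called with the base http://example.com/dir/page.html and the references "img/logo.png" and "bad url.png", this sketch yields a URL for the first entry and the original String for the second, because a raw space is not legal in a URI.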
-
isReusable
protected boolean isReusable()
Description copied from class: HTMLParser
Parsers should override this method if the parser class is reusable, in which case the class will be cached for the next getParser() call.
- Overrides:
isReusable in class HTMLParser
- Returns:
- true if the Parser is reusable
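The caching behaviour described for isReusable() can be sketched roughly as follows. This is an assumption about the general pattern, not JMeter's actual getParser() code: SketchParser, ReusableSketchParser, OneShotSketchParser and SketchParserCache are hypothetical stand-ins, and the factory argument replaces whatever class-loading mechanism the real lookup uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for HTMLParser; not the JMeter class.
abstract class SketchParser {
    /** Default: a parser instance is not shared between calls. */
    protected boolean isReusable() {
        return false;
    }
}

// A parser that holds no per-call state and therefore opts in to caching.
class ReusableSketchParser extends SketchParser {
    @Override
    protected boolean isReusable() {
        return true;
    }
}

// A parser that keeps the default and is created afresh on every lookup.
class OneShotSketchParser extends SketchParser {
}

// Hypothetical cache keyed by parser name.
class SketchParserCache {
    private static final Map<String, SketchParser> CACHE = new ConcurrentHashMap<>();

    static SketchParser getParser(String name, Supplier<SketchParser> factory) {
        SketchParser cached = CACHE.get(name);
        if (cached != null) {
            return cached;
        }
        SketchParser fresh = factory.get();
        if (fresh.isReusable()) {
            // Only reusable parsers are stored; if another thread stored
            // one first, return that instance instead.
            SketchParser prior = CACHE.putIfAbsent(name, fresh);
            if (prior != null) {
                return prior;
            }
        }
        return fresh;
    }
}
```

In this sketch, two lookups of a reusable parser under the same name return the same instance, while a non-reusable parser is constructed again on each call.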
-