Class JsoupBasedHtmlParser
- java.lang.Object
  - org.apache.jmeter.protocol.http.parser.HTMLParser
    - org.apache.jmeter.protocol.http.parser.JsoupBasedHtmlParser
public class JsoupBasedHtmlParser extends HTMLParser
Parser based on JSOUP.
- Since:
- 2.10
TODO: Factor out common code between LagartoBasedHtmlParser and this one (adapter pattern).
Field Summary
Fields inherited from class org.apache.jmeter.protocol.http.parser.HTMLParser
ATT_BACKGROUND, ATT_CODE, ATT_CODEBASE, ATT_DATA, ATT_HREF, ATT_IS_IMAGE, ATT_REL, ATT_SRC, ATT_STYLE, ATT_TYPE, DEFAULT_PARSER, IE_UA, IE_UA_PATTERN, PARSER_CLASSNAME, STYLESHEET, TAG_APPLET, TAG_BASE, TAG_BGSOUND, TAG_BODY, TAG_EMBED, TAG_FRAME, TAG_IFRAME, TAG_IMAGE, TAG_INPUT, TAG_LINK, TAG_OBJECT, TAG_SCRIPT
Constructor Summary
Constructors:
- JsoupBasedHtmlParser()
Method Summary
Methods (modifier and type, method, description):
- Iterator<URL> getEmbeddedResourceURLs(String userAgent, byte[] html, URL baseUrl, URLCollection coll, String encoding)
  Get the URLs for all the resources that a browser would automatically download following the download of the HTML content, that is: images, stylesheets, javascript files, applets, etc.
- protected boolean isReusable()
  Parsers should override this method if the parser class is reusable, in which case the class will be cached for the next getParser() call.
Methods inherited from class org.apache.jmeter.protocol.http.parser.HTMLParser
extractIEVersion, getEmbeddedResourceURLs, getEmbeddedResourceURLs, getParser, getParser, isEnableConditionalComments
Method Detail
getEmbeddedResourceURLs
public Iterator<URL> getEmbeddedResourceURLs(String userAgent, byte[] html, URL baseUrl, URLCollection coll, String encoding) throws HTMLParseException
Description copied from class: HTMLParser
Get the URLs for all the resources that a browser would automatically download following the download of the HTML content, that is: images, stylesheets, javascript files, applets, etc. All URLs should be added to the Collection.
Malformed URLs can be reported to the caller by having the Iterator return the corresponding URL String. Overall problems parsing the HTML should be reported by throwing an HTMLParseException.
N.B. The Iterator returns URLs, but the Collection will contain objects of class URLString.
- Specified by:
getEmbeddedResourceURLs in class HTMLParser
- Parameters:
userAgent - User Agent
html - HTML code
baseUrl - Base URL from which the HTML code was obtained
coll - URLCollection
encoding - Charset
- Returns:
- an Iterator for the resource URLs
- Throws:
HTMLParseException - when parsing the html fails
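
The contract above (decode the bytes with the given charset, find embedded resource references, resolve them against the base URL, add them to the collection, return an iterator) can be illustrated with a minimal JDK-only sketch. This is not the JSOUP-based implementation: EmbeddedUrlSketch, collectResourceUrls and demo are hypothetical names, a plain Collection<URL> stands in for URLCollection, a regex over double-quoted src attributes stands in for a real HTML parser, and references that cannot be resolved are simply skipped rather than reported.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of JMeter or JSOUP.
class EmbeddedUrlSketch {
    // Assumption: only double-quoted src="..." attributes are considered.
    private static final Pattern SRC =
            Pattern.compile("\\bsrc\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Decodes the HTML, resolves each src reference against baseUrl,
    // adds the result to coll, and returns an iterator over coll.
    static Iterator<URL> collectResourceUrls(byte[] html, URL baseUrl,
                                             Collection<URL> coll, String encoding) {
        String text = new String(html, Charset.forName(encoding));
        Matcher m = SRC.matcher(text);
        while (m.find()) {
            try {
                coll.add(baseUrl.toURI().resolve(m.group(1)).toURL());
            } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
                // Simplification: unresolvable references are skipped in this sketch.
            }
        }
        return coll.iterator();
    }

    // Small driver that returns the resolved URLs as strings.
    static List<String> demo() {
        URL base;
        try {
            base = URI.create("http://example.com/dir/page.html").toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
        byte[] html = "<img src=\"img/logo.png\"><script src=\"/js/app.js\"></script>"
                .getBytes(StandardCharsets.UTF_8);
        Iterator<URL> it = collectResourceUrls(html, base, new ArrayList<>(), "UTF-8");
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next().toString());
        }
        return out;
    }
}
```

In the sketch, the relative reference img/logo.png is resolved against the directory of the base URL, while /js/app.js is resolved against the host root, which is the same resolution a browser would apply before downloading the resources.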
isReusable
protected boolean isReusable()
Description copied from class: HTMLParser
Parsers should override this method if the parser class is reusable, in which case the class will be cached for the next getParser() call.
- Overrides:
isReusable in class HTMLParser
- Returns:
- true if the Parser is reusable
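
The caching behaviour that isReusable() controls can be pictured with a small stand-alone sketch. Everything here is hypothetical (SketchParser, ReusableSketchParser, OneShotSketchParser, and the Supplier-based getParser signature); it only assumes the rule stated above, that a parser reporting itself as reusable is kept for the next lookup, and does not reproduce how HTMLParser.getParser() is actually written.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical base class standing in for a parser with a reuse flag.
abstract class SketchParser {
    // Cache of reusable parser instances, keyed by an arbitrary name.
    private static final Map<String, SketchParser> CACHE = new ConcurrentHashMap<>();

    // Default: not reusable, so a new instance is created on every lookup.
    protected boolean isReusable() {
        return false;
    }

    // Returns a cached instance if one exists; otherwise creates one and
    // caches it only when the instance reports itself as reusable.
    static SketchParser getParser(String key, Supplier<SketchParser> factory) {
        SketchParser cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        SketchParser fresh = factory.get();
        if (fresh.isReusable()) {
            SketchParser prior = CACHE.putIfAbsent(key, fresh);
            return prior != null ? prior : fresh;
        }
        return fresh;
    }
}

// Opts in to caching by overriding isReusable().
class ReusableSketchParser extends SketchParser {
    @Override
    protected boolean isReusable() {
        return true;
    }
}

// Keeps the default, so it is never cached.
class OneShotSketchParser extends SketchParser {
}
```

With this sketch, two lookups for a reusable parser under the same key return the same object, while two lookups for a non-reusable parser return distinct objects. A parser should only opt in if it holds no per-document state between calls.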