pyzor.digest
Handle digesting the messages.
-
class pyzor.digest.DataDigester(msg, spec=None)
Bases: object
The major workhorse class.
-
atomic_num_lines = 4
-
digest
-
classmethod digest_payloads(msg)
-
email_ptrn = re.compile('\\S+@\\S+')
-
handle_atomic(lines)
We digest everything.
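A minimal sketch of how handle_atomic might relate to atomic_num_lines. It assumes that a message with at most atomic_num_lines usable lines is hashed in full with SHA-1 and that trailing whitespace is dropped. The function names, the threshold comparison, and the hash choice are illustrative assumptions, not taken from this page.

```python
import hashlib

ATOMIC_NUM_LINES = 4  # mirrors the documented class attribute


def is_atomic(lines):
    # Assumption: short messages are digested whole.
    return len(lines) <= ATOMIC_NUM_LINES


def digest_atomic_sketch(lines):
    """Hash every line (as bytes), ignoring trailing whitespace."""
    digest = hashlib.sha1()
    for line in lines:
        digest.update(line.rstrip())
    return digest.hexdigest()
```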
-
handle_line(line)
-
handle_pieced(lines, spec)
Digest stuff according to the spec.
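The format of spec is not described here. The sketch below assumes it is a sequence of (offset, length) pairs, where offset is a percentage into the message and length is the number of consecutive lines to hash from that point. Both the pair layout and the example spec value are assumptions for illustration.

```python
import hashlib


def digest_pieced_sketch(lines, spec):
    """Hash only the slices of `lines` (bytes) selected by `spec`."""
    digest = hashlib.sha1()
    for offset, length in spec:
        # Assumption: offset is a percentage of the total line count.
        start = offset * len(lines) // 100
        # Slicing silently stops at the end of the list.
        for line in lines[start:start + length]:
            digest.update(line.rstrip())
    return digest.hexdigest()


# Hypothetical spec: 3 lines starting at 20%, 3 lines starting at 60%.
example_spec = [(20, 3), (60, 3)]
example_lines = [("line%d" % i).encode("utf8") for i in range(10)]
example_value = digest_pieced_sketch(example_lines, example_spec)
```

With ten lines, this spec selects lines 2 to 4 and 6 to 8 (zero-based), so edits outside those windows leave the digest unchanged.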
-
longstr_ptrn = re.compile('\\S{10,}')
-
min_line_length = 8
-
classmethod normalize(s)
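A sketch of what normalization could look like using the patterns documented on this class (email_ptrn, url_ptrn, longstr_ptrn, ws_ptrn) and unwanted_txt_repl. The order in which the patterns are applied and the function name are assumptions. Whitespace is removed last because the other patterns rely on whitespace to delimit tokens.

```python
import re

# Patterns copied from the documented class attributes.
EMAIL_PTRN = re.compile(r'\S+@\S+')
URL_PTRN = re.compile(r'[a-z]+:\S+', re.IGNORECASE)
LONGSTR_PTRN = re.compile(r'\S{10,}')
WS_PTRN = re.compile(r'\s')
UNWANTED_TXT_REPL = ''


def normalize_sketch(s):
    """Drop long tokens, email addresses, URLs, then all whitespace."""
    # Assumption: this order of application.
    for ptrn in (LONGSTR_PTRN, EMAIL_PTRN, URL_PTRN):
        s = ptrn.sub(UNWANTED_TXT_REPL, s)
    return WS_PTRN.sub('', s).strip()


print(normalize_sketch("Mail a@b.io or visit ftp:x now"))  # Mailorvisitnow
```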
-
static normalize_html_part(s)
-
classmethod should_handle_line(s)
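One plausible reading, given min_line_length = 8, is that a normalized line is only digested when it is at least that long. The sketch below encodes that assumption; the function name and the exact comparison are hypothetical.

```python
MIN_LINE_LENGTH = 8  # mirrors the documented class attribute


def should_handle_line_sketch(s):
    # Assumption: lines shorter than the minimum are skipped.
    return len(s) >= MIN_LINE_LENGTH
```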
-
unwanted_txt_repl = ''
-
url_ptrn = re.compile('[a-z]+:\\S+', re.IGNORECASE)
-
value
-
ws_ptrn = re.compile('\\s')
-
class pyzor.digest.HTMLStripper(collector)
Bases: html.parser.HTMLParser
Strip all tags from the HTML.
-
handle_data(data)
Keep track of the data.
-
handle_endtag(tag)
-
handle_starttag(tag, attrs)
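A minimal sketch of a tag-stripping parser built on html.parser.HTMLParser. It assumes collector is a list that receives each non-empty text fragment. The class name is hypothetical, and whatever the real class does in handle_starttag and handle_endtag is not reproduced here.

```python
from html.parser import HTMLParser


class TagStripperSketch(HTMLParser):
    """Collect text content and discard all tags."""

    def __init__(self, collector):
        super().__init__()
        # Assumption: collector is a list-like object supporting append().
        self.collector = collector

    def handle_data(self, data):
        data = data.strip()
        if data:
            self.collector.append(data)


collected = []
parser = TagStripperSketch(collected)
parser.feed("<p>Hello <b>world</b></p>")
parser.close()
text = " ".join(collected)
```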
-
class pyzor.digest.PrintingDataDigester(msg, spec=None)
Bases: pyzor.digest.DataDigester
Extends DataDigester: prints out what we’re digesting.
-
digest
-
handle_line(line)
-
value
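The pattern described for PrintingDataDigester, a subclass that prints each line before digesting it, can be sketched with a simplified stand-in base class. Both classes below are hypothetical and take a list of byte lines rather than a message object. They only illustrate overriding handle_line while leaving the resulting digest unchanged.

```python
import hashlib


class BaseDigesterSketch:
    """Simplified stand-in for a digester: hashes a list of byte lines."""

    def __init__(self, lines):
        self.digest = hashlib.sha1()
        for line in lines:
            self.handle_line(line)
        self.value = self.digest.hexdigest()

    def handle_line(self, line):
        self.digest.update(line.rstrip())


class PrintingDigesterSketch(BaseDigesterSketch):
    """Same digest as the base class, but echoes every line first."""

    def handle_line(self, line):
        print(line.decode("utf8", "ignore"))
        super().handle_line(line)
```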