Extending the parser
Modules such as page3 extend the CSS 2.1 parser to add support for CSS 3 syntax. They do so by subclassing css21.CSS21Parser and overriding or extending some of its methods. In fact, the parser is made of methods in a class (rather than a set of functions) solely to enable this kind of subclassing.
tinycss is designed to let you keep parser subclasses outside of tinycss, without monkey-patching. If, however, the syntax you added is for a W3C specification, consider including your subclass in a new tinycss module and sending a pull request: see Hacking tinycss.
Example: star hack
The star hack uses invalid declarations that are only parsed by some versions of Internet Explorer. By default, tinycss ignores invalid declarations and logs an error.
>>> from tinycss.css21 import CSS21Parser
>>> css = '#elem { width: [W3C Model Width]; *width: [BorderBox Model]; }'
>>> stylesheet = CSS21Parser().parse_stylesheet(css)
>>> stylesheet.errors
[ParseError('Parse error at 1:35, expected a property name, got DELIM',)]
>>> [decl.name for decl in stylesheet.rules[0].declarations]
['width']
If, for example, a minifier based on tinycss wants to support the star hack, it can do so by extending the parser:
>>> class CSSStarHackParser(CSS21Parser):
...     def parse_declaration(self, tokens):
...         has_star_hack = (tokens[0].type == 'DELIM' and tokens[0].value == '*')
...         if has_star_hack:
...             tokens = tokens[1:]
...         declaration = super(CSSStarHackParser, self).parse_declaration(tokens)
...         declaration.has_star_hack = has_star_hack
...         return declaration
...
>>> stylesheet = CSSStarHackParser().parse_stylesheet(css)
>>> stylesheet.errors
[]
>>> [(d.name, d.has_star_hack) for d in stylesheet.rules[0].declarations]
[('width', False), ('width', True)]
This class extends the parse_declaration() method. It removes any * delimiter Token at the start of a declaration, and adds a has_star_hack boolean attribute on parsed Declaration objects: True if a * was removed, False for “normal” declarations.
Parser methods
In addition to methods of the user API (see Parsing a stylesheet), here are the methods of the CSS 2.1 parser that can be overridden or extended:
- CSS21Parser.parse_rules(tokens, context)
Parse a sequence of rules (rulesets and at-rules).
- Parameters:
tokens – An iterable of tokens.
context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
- Returns:
A tuple of a list of parsed rules and a list of ParseError.
- CSS21Parser.read_at_rule(at_keyword_token, tokens)
Read an at-rule from a token stream.
- Parameters:
at_keyword_token – The ATKEYWORD token that starts this at-rule. You may have read it already to distinguish the rule from a ruleset.
tokens – An iterator of subsequent tokens. Will be consumed just enough for one at-rule.
- Returns:
An unparsed AtRule.
- Raises:
ParseError if the head is invalid for the core grammar. The body is not validated. See AtRule.
- CSS21Parser.parse_at_rule(rule, previous_rules, errors, context)
Parse an at-rule.
Subclasses that override this method must use super() and pass on its return value for at-rules they do not know.
In CSS 2.1, this method handles @charset, @import, @media and @page rules.
- Parameters:
rule – An unparsed AtRule.
previous_rules – The list of at-rules and rulesets that have been parsed so far in this context. This list can be used to decide if the current rule is valid. (For example, @import rules are only allowed before anything but a @charset rule.)
context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
- Raises:
ParseError if the rule is invalid.
- Returns:
A parsed at-rule.
- CSS21Parser.parse_media(tokens)
For CSS 2.1, parse a list of media types.
A Media Queries implementation is expected to override this.
- Parameters:
tokens – A list of tokens.
- Raises:
ParseError on invalid media types/queries.
- Returns:
For CSS 2.1, a list of media types as strings.
- CSS21Parser.parse_page_selector(tokens)
Parse an @page selector.
- Parameters:
tokens – An iterable of tokens, typically from the head attribute of an unparsed AtRule.
- Returns:
A page selector. For CSS 2.1, this is 'first', 'left', 'right' or None.
- Raises:
ParseError on invalid selectors.
- CSS21Parser.parse_declarations_and_at_rules(tokens, context)
Parse a mixed list of declarations and at-rules, as found e.g. in the body of an @page rule.
Note that to add supported at-rules inside @page, CSSPage3Parser extends parse_at_rule(), not this method.
- Parameters:
tokens – An iterable of tokens, typically from the body attribute of an unparsed AtRule.
context – An at-keyword such as '@page'. (Most at-rules are only allowed in some contexts.)
- Returns:
A tuple of: a list of Declaration, a list of parsed at-rules (empty for CSS 2.1), and a list of ParseError.
- CSS21Parser.parse_ruleset(first_token, tokens)
Parse a ruleset: a selector followed by a declaration block.
- Parameters:
first_token – The first token of the ruleset (probably of the selector). You may have read it already to distinguish the rule from an at-rule.
tokens – An iterator of subsequent tokens. Will be consumed just enough for one ruleset.
- Returns:
A tuple of a RuleSet and an error list. The errors are recovered ParseError in declarations. (Parsing continues from the next declaration on such errors.)
- Raises:
ParseError if the selector is invalid for the core grammar. Note that a selector can be valid for the core grammar but not for CSS 2.1 or another level.
- CSS21Parser.parse_declaration_list(tokens)
Parse a ;-separated declaration list.
You may want to use parse_declarations_and_at_rules() (or some other method that uses parse_declaration() directly) instead if the same context can contain more than just declarations.
- Parameters:
tokens – An iterable of tokens. Should stop at (before) the end of the block, as marked by }.
- Returns:
A tuple of the list of valid Declaration and a list of ParseError.
- CSS21Parser.parse_declaration(tokens)
Parse a single declaration.
- Parameters:
tokens – An iterable of at least one token. Should stop at (before) the end of the declaration, as marked by a ; or }. Empty declarations (ie. consecutive ; with only white space in-between) should be skipped earlier and not passed to this method.
- Returns:
A Declaration.
- Raises:
ParseError if the tokens do not match the ‘declaration’ production of the core grammar.
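The delegation rule stated for parse_at_rule() (handle the at-rules you know, use super() for the rest) can be sketched as follows. To keep the sketch runnable without tinycss installed, AtRule, ParseError and BaseParser are simplified stand-ins rather than the tinycss classes: BaseParser plays the role of CSS21Parser and is assumed to reject every at-rule. ThemeParser and the @x-theme rule are hypothetical names invented for illustration.

```python
from collections import namedtuple

# Simplified stand-ins so the sketch runs on its own; not the tinycss classes.
AtRule = namedtuple('AtRule', 'at_keyword head body line column')


class ParseError(ValueError):
    """Stand-in for tinycss.parsing.ParseError."""


class BaseParser(object):
    """Plays the role of CSS21Parser; assumed to reject unknown at-rules."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        raise ParseError('unknown at-rule in %s context: %s'
                         % (context, rule.at_keyword))


class ThemeParser(BaseParser):
    """Hypothetical subclass adding a made-up @x-theme rule."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule.at_keyword == '@x-theme':
            # Use the context to reject the rule where it is not allowed.
            if context != 'stylesheet':
                raise ParseError('@x-theme not allowed in ' + context)
            # A real subclass would parse rule.head and rule.body here.
            return ('x-theme', rule.head)
        # Not ours: delegate with super() and pass on its return value.
        return super(ThemeParser, self).parse_at_rule(
            rule, previous_rules, errors, context)


parser = ThemeParser()
known = AtRule('@x-theme', ['dark'], None, 1, 1)
result = parser.parse_at_rule(known, [], [], 'stylesheet')

try:
    parser.parse_at_rule(AtRule('@unknown', [], None, 2, 1), [], [], 'stylesheet')
    unknown_rejected = False
except ParseError:
    unknown_rejected = True
```

Because the subclass only intercepts the at-keyword it knows, several such subclasses can be stacked, each one falling through to the next via super().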
Unparsed at-rules
- class tinycss.css21.AtRule(at_keyword, head, body, line, column)
An unparsed at-rule.
- at_keyword
The normalized (lower-case) at-keyword as a string. Eg: '@page'
- head
The part of the at-rule between the at-keyword and the { marking the body, or the ; marking the end of an at-rule without a body. A TokenList.
- body
The content of the body between { and } as a TokenList, or None if there is no body (ie. if the rule ends with ;).
The head was validated against the core grammar but not the body, as the body might contain declarations. In case of an error in a declaration, parsing should continue from the next declaration. The whole rule should not be ignored as it would be for an error in the head.
These at-rules are expected to be parsed further before reaching the user API.
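As a rough illustration of how code that parses an unparsed at-rule further might tell a missing body from a present one, consider the sketch below. The namedtuple is a stand-in with the same fields as the AtRule constructor above, plain strings stand in for tokens, and check_body is a hypothetical helper that is not part of tinycss.

```python
from collections import namedtuple

# Stand-in with the same fields as tinycss.css21.AtRule (not the real class).
AtRule = namedtuple('AtRule', 'at_keyword head body line column')


def check_body(rule, needs_body):
    """Hypothetical helper: return an error message, or None if the rule is fine."""
    if needs_body and rule.body is None:
        return '%s rule at %d:%d: missing block' % (
            rule.at_keyword, rule.line, rule.column)
    if not needs_body and rule.body is not None:
        return "%s rule at %d:%d: expected ';', got a block" % (
            rule.at_keyword, rule.line, rule.column)
    return None


# '@import "print.css";' ends with ';', so it has a head but no body.
import_rule = AtRule('@import', ['"print.css"'], None, 1, 1)
# '@media print { ... }' has a body; here it is empty, which is not the same as None.
media_rule = AtRule('@media', ['print'], [], 2, 1)

import_ok = check_body(import_rule, needs_body=False)  # None: no body expected
media_ok = check_body(media_rule, needs_body=True)     # None: body is present
```

Testing body against None, rather than for truthiness, keeps an empty block distinct from a rule that ends with a semicolon.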
Parsing helper functions
The tinycss.parsing module contains helper functions for parsing tokens into a more structured form:
- tinycss.parsing.strip_whitespace(tokens)
Remove whitespace at the beginning and end of a token list.
Whitespace tokens in-between other tokens in the list are preserved.
- Parameters:
tokens – A list of Token or ContainerToken.
- Returns:
A new sub-sequence of the list.
- tinycss.parsing.split_on_comma(tokens)
Split a list of tokens on commas, ie , DELIM tokens.
Only “top-level” comma tokens are splitting points, not commas inside a function or other ContainerToken.
- Parameters:
tokens – An iterable of Token or ContainerToken.
- Returns:
A list of lists of tokens.
- tinycss.parsing.validate_value(tokens)
Validate a property value.
- Parameters:
tokens – An iterable of tokens.
- Raises:
ParseError if there is any invalid token for the ‘value’ production of the core grammar.
- tinycss.parsing.validate_block(tokens, context)
- Raises:
ParseError if there is any invalid token for the ‘block’ production of the core grammar.
- Parameters:
tokens – An iterable of tokens.
context – A string for the ‘unexpected in …’ message.
- tinycss.parsing.validate_any(token, context)
- Raises:
ParseError if this is an invalid token for the ‘any’ production of the core grammar.
- Parameters:
token – A single token.
context – A string for the ‘unexpected in …’ message.
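The behaviour described for strip_whitespace() and split_on_comma() can be sketched in plain Python. This is a re-implementation for illustration only, not the tinycss source. Token is a stand-in namedtuple, and the sketch assumes that whitespace tokens have type 'S' and that a function's arguments live inside a single container token, so nested commas never appear at the top level.

```python
from collections import namedtuple

# Stand-in token; real tinycss tokens carry more attributes.
Token = namedtuple('Token', 'type value')


def strip_whitespace(tokens):
    """Return a new list without leading and trailing whitespace tokens."""
    start, end = 0, len(tokens)
    while start < end and tokens[start].type == 'S':
        start += 1
    while end > start and tokens[end - 1].type == 'S':
        end -= 1
    return tokens[start:end]


def split_on_comma(tokens):
    """Split on top-level ',' DELIM tokens; this sketch always returns at least one part."""
    parts = [[]]
    for token in tokens:
        if token.type == 'DELIM' and token.value == ',':
            parts.append([])  # a top-level comma starts a new part
        else:
            parts[-1].append(token)
    return parts


space = Token('S', ' ')
comma = Token('DELIM', ',')
# Roughly the tokens of ' rgb(0, 0, 0), red ': the commas inside rgb() are
# hidden in the single FUNCTION token, so only one comma is top-level.
rgb = Token('FUNCTION', 'rgb(0, 0, 0)')
red = Token('IDENT', 'red')
tokens = [space, rgb, comma, space, red, space]

stripped = strip_whitespace(tokens)  # [rgb, comma, space, red]
parts = split_on_comma(stripped)     # [[rgb], [space, red]]
```

Note that the second part still starts with a whitespace token: splitting does not strip, so callers typically apply strip_whitespace() to each part afterwards.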