Extending the parser

Modules such as page3 extend the CSS 2.1 parser to add support for CSS 3 syntax. They do so by sub-classing css21.CSS21Parser and overriding or extending some of its methods. In fact, the parser is made of methods in a class (rather than a set of functions) solely to enable this kind of sub-classing.

tinycss is designed so that you can keep parser subclasses outside of tinycss, without monkey-patching. If, however, the syntax you added is for a W3C specification, consider including your subclass in a new tinycss module and sending a pull request: see Hacking tinycss.

Example: star hack

The star hack uses invalid declarations that are only parsed by some versions of Internet Explorer. By default, tinycss ignores invalid declarations and logs an error.

>>> from tinycss.css21 import CSS21Parser
>>> css = '#elem { width: [W3C Model Width]; *width: [BorderBox Model]; }'
>>> stylesheet = CSS21Parser().parse_stylesheet(css)
>>> stylesheet.errors
[ParseError('Parse error at 1:35, expected a property name, got DELIM',)]
>>> [decl.name for decl in stylesheet.rules[0].declarations]
['width']

If, for example, a minifier based on tinycss wants to support the star hack, it can do so by extending the parser:

>>> class CSSStarHackParser(CSS21Parser):
...     def parse_declaration(self, tokens):
...         has_star_hack = (tokens[0].type == 'DELIM' and tokens[0].value == '*')
...         if has_star_hack:
...             tokens = tokens[1:]
...         declaration = super(CSSStarHackParser, self).parse_declaration(tokens)
...         declaration.has_star_hack = has_star_hack
...         return declaration
...
>>> stylesheet = CSSStarHackParser().parse_stylesheet(css)
>>> stylesheet.errors
[]
>>> [(d.name, d.has_star_hack) for d in stylesheet.rules[0].declarations]
[('width', False), ('width', True)]

This class extends the parse_declaration() method. It removes any * delimiter Token at the start of a declaration, and adds a has_star_hack boolean attribute to parsed Declaration objects: True if a * was removed, False for “normal” declarations.

Parser methods

In addition to the methods of the user API (see Parsing a stylesheet), here are the methods of the CSS 2.1 parser that can be overridden or extended:

CSS21Parser.parse_rules(tokens, context)

Parse a sequence of rules (rulesets and at-rules).

Parameters:
  • tokens – An iterable of tokens.

  • context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)

Returns:

A tuple of a list of parsed rules and a list of ParseError.

CSS21Parser.read_at_rule(at_keyword_token, tokens)

Read an at-rule from a token stream.

Parameters:
  • at_keyword_token – The ATKEYWORD token that starts this at-rule. You may have read it already to distinguish the rule from a ruleset.

  • tokens – An iterator of subsequent tokens. Will be consumed just enough for one at-rule.

Returns:

An unparsed AtRule.

Raises:

ParseError if the head is invalid for the core grammar. The body is not validated. See AtRule.

CSS21Parser.parse_at_rule(rule, previous_rules, errors, context)

Parse an at-rule.

Subclasses that override this method must use super() and pass its return value for at-rules they do not know.

In CSS 2.1, this method handles @charset, @import, @media and @page rules.

Parameters:
  • rule – An unparsed AtRule.

  • previous_rules – The list of at-rules and rulesets that have been parsed so far in this context. This list can be used to decide if the current rule is valid. (For example, @import rules are only allowed before anything but a @charset rule.)

  • context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)

Raises:

ParseError if the rule is invalid.

Returns:

A parsed at-rule
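The rule about using super() for unknown at-rules can be illustrated without tinycss itself. The following is a minimal sketch built on stand-ins: AtRuleSketch, ParseErrorSketch, BaseParserSketch and FontFaceParserSketch are all hypothetical names, the return values are placeholder tuples, and the assumption that the base class raises an error for at-rules it does not know is made for illustration only.

```python
from collections import namedtuple

# Hypothetical stand-in for an unparsed AtRule (only three of its attributes).
AtRuleSketch = namedtuple('AtRuleSketch', ['at_keyword', 'head', 'body'])


class ParseErrorSketch(ValueError):
    """Hypothetical stand-in for the parser's ParseError."""


class BaseParserSketch(object):
    """Stand-in for a base parser that only knows @media."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule.at_keyword == '@media':
            return ('media', rule.head)
        # Assumption: unknown at-rules are rejected by the base class.
        raise ParseErrorSketch(
            'unknown at-rule in %s context: %s' % (context, rule.at_keyword))


class FontFaceParserSketch(BaseParserSketch):
    """Adds one at-rule and delegates everything else to the base class."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule.at_keyword == '@font-face':
            if context != 'stylesheet':
                raise ParseErrorSketch(
                    '@font-face not allowed in %s' % context)
            return ('font-face', rule.body)
        # Not one of ours: call super() and pass its return value along.
        return super(FontFaceParserSketch, self).parse_at_rule(
            rule, previous_rules, errors, context)


parser = FontFaceParserSketch()
own = parser.parse_at_rule(
    AtRuleSketch('@font-face', [], ['body']), [], [], 'stylesheet')
inherited = parser.parse_at_rule(
    AtRuleSketch('@media', ['print'], []), [], [], 'stylesheet')
print(own, inherited)
```

Because the subclass returns whatever super() returns, several such subclasses can be stacked and each one only needs to recognize its own at-keywords.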

CSS21Parser.parse_media(tokens)

For CSS 2.1, parse a list of media types.

Media Queries are expected to override this.

Parameters:

tokens – A list of tokens

Raises:

ParseError on invalid media types/queries

Returns:

For CSS 2.1, a list of media types as strings

CSS21Parser.parse_page_selector(tokens)

Parse an @page selector.

Parameters:

tokens – An iterable of tokens, typically from the head attribute of an unparsed AtRule.

Returns:

A page selector. For CSS 2.1, this is 'first', 'left', 'right' or None.

Raises:

ParseError on invalid selectors

CSS21Parser.parse_declarations_and_at_rules(tokens, context)

Parse a mixed list of declarations and at-rules, as found e.g. in the body of an @page rule.

Note that to add supported at-rules inside @page, CSSPage3Parser extends parse_at_rule(), not this method.

Parameters:
  • tokens – An iterable of tokens, typically from the body attribute of an unparsed AtRule.

  • context – An at-keyword such as '@page'. (Most at-rules are only allowed in some contexts.)

Returns:

A tuple of:

CSS21Parser.parse_ruleset(first_token, tokens)

Parse a ruleset: a selector followed by a declaration block.

Parameters:
  • first_token – The first token of the ruleset (probably of the selector). You may have read it already to distinguish the rule from an at-rule.

  • tokens – an iterator of subsequent tokens. Will be consumed just enough for one ruleset.

Returns:

a tuple of a RuleSet and an error list. The errors are recovered ParseError in declarations. (Parsing continues from the next declaration on such errors.)

Raises:

ParseError if the selector is invalid for the core grammar. Note that a selector can be valid for the core grammar but not for CSS 2.1 or another level.

CSS21Parser.parse_declaration_list(tokens)

Parse a ; separated declaration list.

If the same context can contain more than just declarations, you may want to use parse_declarations_and_at_rules() (or some other method that uses parse_declaration() directly) instead.

Parameters:

tokens – an iterable of tokens. Should stop at (before) the end of the block, as marked by }.

Returns:

a tuple of the list of valid Declaration and a list of ParseError

CSS21Parser.parse_declaration(tokens)

Parse a single declaration.

Parameters:

tokens – an iterable of at least one token. Should stop at (before) the end of the declaration, as marked by a ; or }. Empty declarations (i.e. consecutive ; with only white space in-between) should be skipped earlier and not passed to this method.

Returns:

a Declaration

Raises:

ParseError if the tokens do not match the ‘declaration’ production of the core grammar.

CSS21Parser.parse_value_priority(tokens)

Separate any !important marker at the end of a property value.

Parameters:

tokens – A list of tokens for the property value.

Returns:

A tuple of the actual property value (a list of tokens) and the priority.
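To make the idea of separating the !important marker concrete, here is a simplified sketch that does not use tinycss. Token is a hypothetical namedtuple with only type and value fields; the token types 'IDENT', 'DELIM' and 'S' (whitespace) and the use of 'important' or None as the priority are assumptions made for this illustration, and error handling is omitted.

```python
from collections import namedtuple

# Hypothetical stand-in for parser tokens: only the two fields used here.
Token = namedtuple('Token', ['type', 'value'])


def split_priority_sketch(tokens):
    """Return (value_tokens, priority); priority is 'important' or None."""
    value = list(tokens)
    i = len(value) - 1
    # The marker must end with the identifier "important" (case-insensitive).
    if (i >= 0 and value[i].type == 'IDENT'
            and value[i].value.lower() == 'important'):
        i -= 1
        # Skip whitespace between '!' and 'important'.
        while i >= 0 and value[i].type == 'S':
            i -= 1
        if i >= 0 and value[i].type == 'DELIM' and value[i].value == '!':
            # Also drop whitespace just before the '!'.
            while i > 0 and value[i - 1].type == 'S':
                i -= 1
            return value[:i], 'important'
    # No marker found: the value is returned unchanged.
    return value, None


tokens = [
    Token('IDENT', 'red'),
    Token('S', ' '),
    Token('DELIM', '!'),
    Token('IDENT', 'important'),
]
value, priority = split_priority_sketch(tokens)
print([t.value for t in value], priority)
```

Walking the list from the end keeps the check cheap, since the marker can only appear as the last tokens of the value.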

Unparsed at-rules

class tinycss.css21.AtRule(at_keyword, head, body, line, column)

An unparsed at-rule.

at_keyword

The normalized (lower-case) at-keyword as a string, e.g. '@page'.

head

The part of the at-rule between the at-keyword and the { marking the body, or the ; marking the end of an at-rule without a body. A TokenList.

body

The content of the body between { and } as a TokenList, or None if there is no body (i.e. if the rule ends with ;).

The head was validated against the core grammar but not the body, as the body might contain declarations. In case of an error in a declaration, parsing should continue from the next declaration. The whole rule should not be ignored as it would be for an error in the head.

These at-rules are expected to be parsed further before reaching the user API.

Parsing helper functions

The tinycss.parsing module contains helper functions for parsing tokens into a more structured form:

tinycss.parsing.strip_whitespace(tokens)

Remove whitespace at the beginning and end of a token list.

Whitespace tokens in-between other tokens in the list are preserved.

Parameters:

tokens – A list of Token or ContainerToken.

Returns:

A new sub-sequence of the list.
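The behaviour described for strip_whitespace() can be sketched in a few lines. This is a simplified stand-in, not the library's implementation: Token is a hypothetical namedtuple, and whitespace tokens are assumed to have the type 'S'.

```python
from collections import namedtuple

# Hypothetical stand-in for parser tokens: only the two fields used here.
Token = namedtuple('Token', ['type', 'value'])


def strip_whitespace_sketch(tokens):
    """Return a new list without leading and trailing whitespace tokens."""
    start = 0
    end = len(tokens)
    # Skip whitespace tokens at the beginning.
    while start < end and tokens[start].type == 'S':
        start += 1
    # Skip whitespace tokens at the end.
    while end > start and tokens[end - 1].type == 'S':
        end -= 1
    # Slicing copies, so the input list is left untouched.
    return tokens[start:end]


tokens = [
    Token('S', ' '),
    Token('IDENT', 'red'),
    Token('S', ' '),
    Token('IDENT', 'blue'),
    Token('S', ' '),
]
stripped = strip_whitespace_sketch(tokens)
print([t.value for t in stripped])
```

The whitespace token between the two identifiers survives, which matters for values where white space is a separator.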

tinycss.parsing.split_on_comma(tokens)

Split a list of tokens on commas, i.e. DELIM tokens with the value ','.

Only “top-level” comma tokens are splitting points, not commas inside a function or other ContainerToken.

Parameters:

tokens – An iterable of Token or ContainerToken.

Returns:

A list of lists of tokens
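Why only “top-level” commas act as splitting points follows from how nested tokens are stored. The sketch below is a simplified stand-in rather than the library's code: Token and ContainerTokenSketch are hypothetical namedtuples, and the assumption is that a container keeps its nested tokens in a content attribute, so a flat loop over the outer list never sees them.

```python
from collections import namedtuple

# Hypothetical stand-ins: a plain token and a container holding nested tokens.
Token = namedtuple('Token', ['type', 'value'])
ContainerTokenSketch = namedtuple('ContainerTokenSketch', ['type', 'content'])


def split_on_comma_sketch(tokens):
    """Split a flat token sequence on top-level ',' DELIM tokens."""
    parts = []
    current = []
    for token in tokens:
        if token.type == 'DELIM' and token.value == ',':
            # A top-level comma closes the current part.
            parts.append(current)
            current = []
        else:
            # Containers are appended whole; their inner commas are not seen.
            current.append(token)
    parts.append(current)
    return parts


# Roughly the value "rgb(0, 0), red" as a token list.
function = ContainerTokenSketch('FUNCTION', [
    Token('NUMBER', '0'),
    Token('DELIM', ','),
    Token('NUMBER', '0'),
])
tokens = [function, Token('DELIM', ','), Token('S', ' '), Token('IDENT', 'red')]
parts = split_on_comma_sketch(tokens)
print(len(parts))
```

Each part may still start or end with whitespace tokens, so a caller would typically strip those afterwards.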

tinycss.parsing.validate_value(tokens)

Validate a property value.

Parameters:

tokens – an iterable of tokens

Raises:

ParseError if there is any invalid token for the ‘value’ production of the core grammar.

tinycss.parsing.validate_block(tokens, context)
Raises:

ParseError if there is any invalid token for the ‘block’ production of the core grammar.

Parameters:
  • tokens – an iterable of tokens

  • context – a string for the ‘unexpected in …’ message

tinycss.parsing.validate_any(token, context)
Raises:

ParseError if this is an invalid token for the ‘any’ production of the core grammar.

Parameters:
  • token – a single token

  • context – a string for the ‘unexpected in …’ message