Revision | $Id: //open/dev/farrago/src/org/eigenbase/relopt/volcano/package.html#1 $ |
---|---|
Copyright | Copyright (C) 2005-2009 The Eigenbase Project
Copyright (C) 2003-2009 SQLstream, Inc. Copyright (C) 2009-2009 LucidEra, Inc. |
Author | Julian Hyde, Stephan Zuercher |
A planner (also known as an optimizer) finds the most efficient implementation of a {@link org.eigenbase.rel.RelNode relational expression}.
Interface {@link org.eigenbase.relopt.RelOptPlanner} defines a planner, and class {@link org.eigenbase.relopt.volcano.VolcanoPlanner} is an implementation which uses a dynamic programming technique. It is based upon the Volcano optimizer [1].
Interface {@link org.eigenbase.relopt.RelOptCost} defines a cost model; class {@link
org.eigenbase.relopt.volcano.VolcanoCost} is the implementation for a VolcanoPlanner
.
A {@link org.eigenbase.relopt.volcano.RelSet} is a set of equivalent relational expressions.
They are equivalent because they will produce the same result for any set of
input data. It is an equivalence class: two expressions are in the same set if
and only if they are in the same RelSet
.
One of the unique features of the optimizer is that expressions can take on a variety of physical traits. Each relational expression has a set of traits. Each trait is described by an implementation of {@link org.eigenbase.relopt.RelTraitDef}. Manifestations of the trait implement {@link org.eigenbase.relopt.RelTrait}. The most common example of a trait is calling convention: the protocol used to receive and transmit data. {@link org.eigenbase.relopt.CallingConventionTraitDef} defines the trait and {@link org.eigenbase.relopt.CallingConvention} enumerates the protocols. Every relational expression has a single calling convention by which it returns its results. Some examples:
names
array would
generate the following code:String[] names; for (int i = 0; i < names.length; i++) { String name = names[i]; // ... code for consuming relational expression ... }
New traits are added to the planner in one of two ways:
AbstractRelNode
constructors
if most relational expressions use a single manifestation of the trait.setRoot(RelNode)
call.The second trait extension mechanism requires that implementations of {@link
org.eigenbase.rel.AbstractRelNode#clone()} must not assume the type and quantity of traits in
their trait set. In either case, the new RelTraitDef
implementation must be
{@link org.eigenbase.relopt.volcano.VolcanoPlanner#addRelTraitDef(org.eigenbase.relopt.RelTraitDef)}
registered with the planner.
A {@link org.eigenbase.relopt.volcano.RelSubset} is a subset of a RelSet
containing expressions which are equivalent and which have the same
CallingConvention
. Like RelSet
, it is an equivalence class.
org.eigenbase.rel
defines {@link
org.eigenbase.rel.RelNode relational expressions}.openjava.ptree
defines an object model for Java expressions....
Sets merge when the result of a rule already exists in another set. This implies that all of the expressions are equivalent. The RelSets are merged, and so are the contained RelSubsets.
Expression registration.
Algorithm
To optimize a relational expression R:
1. Register R.
2. Create rule-calls for all applicable rules.
3. Rank the rule calls by importance.
4. Call the most important rule
5. Repeat.
Importance. A rule-call is important if it is likely to produce better implementation of a relexp on the plan's critical path. Hence (a) it produces a member of an important RelSubset, (b) its children are cheap.
Conversion. Conversions are difficult because we have to work backwards from the goal.
Rule triggering
The rules are:
PushFilterThroughProjectRule
. Operands:Filter Project
CombineProjectsRule
. Operands:Project Project
A rule can be triggered by a change to any of its operands. Consider the rule to combine two filters into one. It would have operands [Filter [Filter]]. If I register a new Filter, it will trigger the rule in 2 places. Consider:
Project (deptno) [exp 1, subset A] Filter (gender='F') [exp 2, subset B] Project (deptno, gender, empno) [exp 3, subset C] Project (deptno, gender, empno, salary) [exp 4, subset D] TableScan (emp) [exp 0, subset X]
Apply PushFilterThroughProjectRule
to [exp 2, exp 3]:
Project (deptno) [exp 1, subset A] Project (deptno, gender, empno) [exp 5, subset B] Filter (gender='F') [exp 6, subset E] Project (deptno, gender, empno, salary) [exp 4, subset D] TableScan (emp) [exp 0, subset X]
Two new expressions are created. Expression 5 is in subset B (because it is equivalent to expression 2), and expression 6 is in a new equivalence class, subset E.
The products of a applying a rule can trigger a cascade of rules. Even in this simple system (2 rules and 4 initial expressions), two more rules are triggered:
CombineProjectsRule
(exp 1, exp 5),
which createsProject (deptno) [exp 7, subset A] Filter (gender='F') [exp 6, subset E] Project (deptno, gender, empno, salary) [exp 4, subset D] TableScan (emp) [exp 0, subset X]
PushFilterThroughProjectRule
(exp
6, exp 4), which createsProject (deptno) [exp 1, subset A] Project (deptno, gender, empno) [exp 5, subset B] Project (deptno, gender, empno, salary) [exp 8, subset E] Filter (gender='F') [exp 9, subset F] TableScan (emp) [exp 0, subset X]
Each rule application adds additional members to existing subsets. The
non-singleton subsets are now A {1, 7}, B {2, 5} and E {6, 8}, and new
combinations are possible. For example, CombineProjectsRule
(exp 7,
exp 8) further reduces the depth of the tree to:
Project (deptno) [exp 10, subset A] Filter (gender='F') [exp 9, subset F] TableScan (emp) [exp 0, subset X]
Todo: show how rules can cause subsets to merge.
Conclusion: