Mondrian is an OLAP server implemented in Java.

Introduction

See architecture.

Components

Query transformer

See {@link mondrian.olap.Parser}.

Metadata

The metadata (the dimensional model) is represented as an XML file. It is loaded into memory the first time you reference a dimensional model. You can modify the model at runtime by creating instances of classes such as {@link mondrian.rolap.RolapHierarchy}.

Calculation layer

todo: See {@link mondrian.olap.Query} and {@link mondrian.olap.Result}.

todo: The package {@link mondrian.rolap} is the one and only implementation of the API. The DriverManager (class {@link mondrian.olap.DriverManager}) acts as a class factory.

todo: How members are calculated...

todo: How aggregations are batched...

todo: MDX functions. See user-defined functions.

Aggregation manager

Aggregations are based upon the relational model: as far as the aggregation manager is concerned, there is no relationship between the columns city and state. This means that all roll-ups are the same: you just drop a column. Consider the three roll-ups possible by dropping a column from the aggregation {gender, city, state}: dropping gender is equivalent to removing the [Gender] dimension; dropping city is equivalent to rolling up to a higher level in the [Geography] hierarchy; and dropping state is not even allowed in the dimensional model (you cannot ask about products sold in all cities called 'Portland', regardless of state). This approach will also allow us to implement 'drill anywhere'.
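The 'drop a column' roll-up described above can be sketched as follows. This is a minimal illustration, not Mondrian code: the class RollUpSketch, the method dropColumn, and the sample cell values are all hypothetical, and an aggregation is assumed to be a plain map from column values to a summed measure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; RollUpSketch is not part of the Mondrian API.
class RollUpSketch {
    /**
     * Rolls up an aggregation by dropping one column and summing the cells
     * whose remaining column values coincide.
     */
    static Map<List<String>, Double> dropColumn(
            List<String> columns, Map<List<String>, Double> cells, String column) {
        int index = columns.indexOf(column);
        if (index < 0) {
            throw new IllegalArgumentException("No such column: " + column);
        }
        Map<List<String>, Double> result = new LinkedHashMap<>();
        for (Map.Entry<List<String>, Double> cell : cells.entrySet()) {
            List<String> key = new ArrayList<>(cell.getKey());
            key.remove(index); // removes the value at that position
            result.merge(key, cell.getValue(), Double::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> columns = List.of("gender", "city", "state");
        Map<List<String>, Double> cells = new LinkedHashMap<>();
        // Made-up unit sales, for illustration only.
        cells.put(List.of("M", "Portland", "OR"), 10.0);
        cells.put(List.of("F", "Portland", "OR"), 12.0);
        cells.put(List.of("M", "Salem", "OR"), 5.0);
        cells.put(List.of("M", "Portland", "ME"), 3.0);

        // Equivalent to removing the [Gender] dimension.
        System.out.println(dropColumn(columns, cells, "gender"));
        // Equivalent to rolling up from city to state.
        System.out.println(dropColumn(columns, cells, "city"));
        // Relationally valid, but it merges Portland, OR with Portland, ME,
        // which has no counterpart in the dimensional model.
        System.out.println(dropColumn(columns, cells, "state"));
    }
}
```

All three calls run the same code path, which is the point of the relational view: the aggregation manager does not need to know which columns belong to the same hierarchy.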

An aggregation is defined by a search condition, for example, {state in ('CA', 'OR', 'WA'), city = any, gender = 'M', measure = 'Unit sales'}. The 'any' value is important: if we had asked for a specific set of cities, we would not later be able to roll up by dropping the city column.
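To make the role of 'any' concrete, here is a small sketch of a search condition that records, per column, either a set of allowed values or 'any'. The class SearchConditionSketch and its methods are hypothetical names chosen for this example and do not correspond to Mondrian classes; the measure is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; these names are not Mondrian classes.
class SearchConditionSketch {
    // Column name -> allowed values; Optional.empty() stands for "any".
    private final Map<String, Optional<Set<String>>> constraints = new LinkedHashMap<>();

    /** Restricts a column to a specific set of values. */
    SearchConditionSketch in(String column, String... values) {
        constraints.put(column, Optional.of(Set.of(values)));
        return this;
    }

    /** Leaves a column unconstrained, so every value of it is loaded. */
    SearchConditionSketch any(String column) {
        constraints.put(column, Optional.empty());
        return this;
    }

    /**
     * A column can be dropped only if it is part of the condition and was
     * loaded with "any"; otherwise the total would silently omit values.
     */
    boolean canRollUpByDropping(String column) {
        Optional<Set<String>> constraint = constraints.get(column);
        return constraint != null && !constraint.isPresent();
    }

    public static void main(String[] args) {
        SearchConditionSketch condition = new SearchConditionSketch()
                .in("state", "CA", "OR", "WA")
                .any("city")
                .in("gender", "M");
        System.out.println(condition.canRollUpByDropping("city"));  // true
        System.out.println(condition.canRollUpByDropping("state")); // false
    }
}
```

Dropping state from this aggregation would yield a total over only three states, which is why the sketch refuses it.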

The caching strategy is to throw out the aggregation with the lowest benefit/cost ratio. The 'benefit' of an item is the effort it took to produce (effort which it is saving future queries) multiplied by its 'usefulness', which declines exponentially while the item goes unused. The 'cost' of an item is its size.
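One possible reading of this strategy is sketched below: the item chosen for eviction is the one whose benefit is smallest relative to its size. Everything here is an assumption made for illustration, including the class and field names, the units, and the decay model (usefulness halves every HALF_LIFE_SECONDS of idle time); it is not the actual Mondrian implementation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; not the actual Mondrian cache.
class CacheEvictionSketch {
    static final class Item {
        final String name;
        final double effortMillis; // effort it took to produce the item
        final double sizeBytes;    // the item's 'cost'
        final double idleSeconds;  // time since the item was last used

        Item(String name, double effortMillis, double sizeBytes, double idleSeconds) {
            this.name = name;
            this.effortMillis = effortMillis;
            this.sizeBytes = sizeBytes;
            this.idleSeconds = idleSeconds;
        }
    }

    // Assumed decay rate: usefulness halves every ten idle minutes.
    static final double HALF_LIFE_SECONDS = 600.0;

    /** Exponentially declining usefulness, 1.0 for an item just used. */
    static double usefulness(Item item) {
        return Math.pow(0.5, item.idleSeconds / HALF_LIFE_SECONDS);
    }

    /** Benefit = effort to produce, weighted by current usefulness. */
    static double benefit(Item item) {
        return item.effortMillis * usefulness(item);
    }

    /** Benefit per unit of cost (size); low values are evicted first. */
    static double benefitPerCost(Item item) {
        return benefit(item) / item.sizeBytes;
    }

    /** Picks the item to throw out, or empty if the cache is empty. */
    static Optional<Item> chooseVictim(List<Item> items) {
        return items.stream()
                .min(Comparator.comparingDouble(CacheEvictionSketch::benefitPerCost));
    }

    public static void main(String[] args) {
        List<Item> cache = List.of(
                new Item("A", 1000.0, 100.0, 0.0),   // expensive, small, fresh
                new Item("B", 1000.0, 100.0, 600.0), // same, but idle for one half-life
                new Item("C", 100.0, 1000.0, 0.0));  // cheap to rebuild, large
        chooseVictim(cache).ifPresent(victim -> System.out.println(victim.name));
    }
}
```

With these sample values the large, cheap-to-rebuild item is evicted first, and a stale item loses out to an otherwise identical fresh one.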

