Regression
The Regression panel lets you test, tweak and observe how different
algorithms perform regression on samples living in an N-dimensional
space: "the canvas".
Regression is usually performed from N-1 dimensions to a single
Output dimension (usually the last dimension of the data).
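To make the N-1 to 1 mapping concrete, here is a minimal sketch in Python with NumPy. It is not the code used by the panel: the function names (split_inputs_output, fit_linear, predict_linear) are hypothetical, and ordinary least squares stands in for whichever algorithm is actually selected.

```python
import numpy as np

def split_inputs_output(samples, output_dim=-1):
    """Split an (n_samples, N) array into N-1 input columns and one output column."""
    samples = np.asarray(samples, dtype=float)
    output_dim = output_dim % samples.shape[1]  # normalise negative indices
    y = samples[:, output_dim]
    X = np.delete(samples, output_dim, axis=1)
    return X, y

def fit_linear(X, y):
    """Ordinary least squares with a bias term; the bias is the last weight."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    """Evaluate the fitted linear model on new inputs."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([X, np.ones(len(X))]) @ w

# Three-dimensional samples: two inputs, and the last column is the output.
# The output was generated as 1 + 2*x0 - x1, so the fit should recover that.
samples = np.array([
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 3.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 2.0],
])
X, y = split_inputs_output(samples)
w = fit_linear(X, y)
```

Passing a different output_dim to split_inputs_output picks another column as the regression target, which mirrors choosing an output dimension other than the last one.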
The canvas will display the results of the regression in multiple
layers, which can be changed using the display options. These are:
- Samples: the original sample data; colors indicate class
labels, which the regressor ignores
- Learned Model: the regression model obtained by the algorithm
  - 2D data: a continuous curve displaying the learned function
  - nD data: only the deviation for the known samples (error) is
    shown in the vertical dimension (output dimension)
- Model Info: additional information from the algorithm
(Gaussian position and shape, support vectors, etc.)
- Density Map: (for 2D canvas only) confidence of the algorithm
at each position in space (for algorithms that support it)
For multi-dimensional data, it is not possible to display the actual
learned function, so only the errors for the known samples are
shown. To get an idea of the quality of the regression, use the
Compare button and run a cross-validation analysis.
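As an illustration of what a cross-validation analysis measures, the following Python sketch computes the mean squared error on held-out folds. It is an assumption-laden example, not a description of the Compare dialog: k_fold_mse, fit_ls and predict_ls are hypothetical names, and a least-squares model is used only as a placeholder regressor.

```python
import numpy as np

def k_fold_mse(X, y, fit, predict, k=5, seed=0):
    """Return the held-out mean squared error for each of k folds."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(order, k)
    errors = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, then test on the i-th.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        residuals = predict(model, X[test_idx]) - y[test_idx]
        errors.append(float(np.mean(residuals ** 2)))
    return errors

def fit_ls(X, y):
    """Placeholder regressor: least squares with a bias term."""
    A = np.column_stack([X, np.ones(len(X))])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict_ls(w, X):
    """Evaluate the placeholder regressor."""
    return np.column_stack([X, np.ones(len(X))]) @ w

# Synthetic, noise-free 4-dimensional data: three inputs and one output.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3
fold_errors = k_fold_mse(X, y, fit_ls, predict_ls, k=5)
```

Because every fold is scored on samples the model did not see during training, the per-fold errors give a fairer picture of regression quality than the deviations on the training samples alone.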
In Practice
The easiest way to perform regression is to:
- Draw some samples (left-click) in a curve from left to right
- Click on "Regress"
This should train the algorithm and draw the regression curve that
has been learned.
Options and Commands
The interface for regression (the right-hand side of the Algorithm
Options dialog) provides the following commands:
- Regress: perform the regression using the currently selected
algorithm and options
- Clear: clear the current regression model (does NOT clear the
data)
- Compare: add the current algorithm and options to the Compare
dialog for batch comparisons
and the following options:
- Regression Dimension: defines the dimension to be used as
output for the regression (by default the last dimension)
- Train / Test ratio: the ratio of samples in the canvas to be
used for training
- Input Dimensions: determines the dimensions that should be
used for regression (unselected dimensions, as well as the
output dimension, will be ignored)
- Manual Selection: manually select the training samples
(overrides the Train / Test ratio option)
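The way these options interact can be sketched in Python as follows. This is a hypothetical reading of the descriptions above, not the application's implementation: select_training_data and its parameter names are invented for illustration, and the random shuffling used to apply the ratio is an assumption.

```python
import numpy as np

def select_training_data(samples, output_dim=-1, input_dims=None,
                         train_ratio=0.5, manual_train_idx=None, seed=0):
    """Split samples into (train_X, train_y) and (test_X, test_y)."""
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    output_dim = output_dim % d  # by default, the last dimension

    # Input Dimensions: unselected dimensions and the output are ignored.
    if input_dims is None:
        input_dims = [i for i in range(d) if i != output_dim]
    else:
        input_dims = [i for i in input_dims if i != output_dim]

    # Manual Selection overrides the Train / Test ratio.
    if manual_train_idx is not None:
        train_idx = np.asarray(manual_train_idx, dtype=int)
    else:
        n_train = int(round(train_ratio * n))
        # Assumption: training samples are drawn at random.
        train_idx = np.random.default_rng(seed).permutation(n)[:n_train]
    test_idx = np.setdiff1d(np.arange(n), train_idx)

    X, y = samples[:, input_dims], samples[:, output_dim]
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])

# Ten 4-dimensional samples; dimension 1 is deselected, 70% used for training.
samples = np.arange(40, dtype=float).reshape(10, 4)
(train_X, train_y), (test_X, test_y) = select_training_data(
    samples, input_dims=[0, 2, 3], train_ratio=0.7)
```

In this sketch the output dimension is dropped from the inputs even if it is listed in input_dims, and supplying manual_train_idx makes train_ratio irrelevant.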
All other options are algorithm-dependent and should be described in
the help menu of the algorithm itself.