Classification
The Classification panel lets you test, tweak and observe how
different algorithms classify samples in an N-dimensional space:
"the canvas".
Classification can be Binary or Multi-Class, depending on whether
the canvas currently contains more than 2 classes of samples (shown
in different colors) and whether the selected algorithm supports more
than 2 classes.
The canvas displays the results of the classification in multiple
layers, which can be changed using the display options. The layers
are:
- Samples: the original sample data, colors indicate class
labels
- Learned Model: the classified labels obtained by the algorithm
- Model Info: additional information from the algorithm
(Gaussian position and shape, support vectors, etc.)
- Density Map: (for 2D canvas only) classification result for
each coordinate in space
In binary classification, red indicates the positive class (by
default class #1) and white indicates the negative class. Varying
degrees of blackness indicate uncertainty (for algorithms that do
not have hard class transitions).
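The idea behind the Density Map layer can be sketched as follows: a soft classifier is evaluated at every cell of a 2D grid, and each cell gets a positive-class probability plus an uncertainty value. This is only an illustration, not the program's actual rendering code. The names `score` and `density_map`, the logistic stand-in model and the uncertainty formula are all assumptions made for this example.

```python
import math

def score(x, y):
    """Hypothetical soft classifier: probability of the positive class.

    A logistic function of (x - y), used only as a stand-in for a
    trained model with a soft boundary along the line x = y.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * (x - y)))

def density_map(score_fn, width, height):
    """Evaluate score_fn at the centre of each cell of a width x height grid
    covering the unit square. Returns rows of (probability, uncertainty)."""
    rows = []
    for j in range(height):
        y = (j + 0.5) / height
        row = []
        for i in range(width):
            x = (i + 0.5) / width
            p = score_fn(x, y)
            # Assumed uncertainty measure: 0 when p is 0 or 1, 1 when p is 0.5.
            uncertainty = 1.0 - abs(2.0 * p - 1.0)
            row.append((p, uncertainty))
        rows.append(row)
    return rows

grid = density_map(score, 4, 4)
```

Cells far from the boundary get a probability close to 0 or 1 and low uncertainty, while cells on the line x = y get a probability of 0.5 and maximal uncertainty, which would correspond to the darker regions described above.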
In Practice
The easiest way to perform classification is to:
- Draw some samples (left-click: class 1, right-click: class 0)
- Click on "Classify"
This should train the algorithm and start painting the canvas with
the results of the classification.
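What happens behind these two steps can be sketched as a train-then-predict loop. The example below uses a nearest-centroid classifier purely as a simple stand-in; it is not one of the program's algorithms as implemented, and the function names `train_nearest_centroid` and `classify` are hypothetical.

```python
def train_nearest_centroid(samples, labels):
    """Return {label: centroid}, where each centroid is the mean of the
    training samples carrying that label."""
    sums, counts = {}, {}
    for point, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for d, value in enumerate(point):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(model, point):
    """Return the label whose centroid is closest to point
    (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(point, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

# Drawn samples: two of class 0 (right-click) and two of class 1 (left-click).
samples = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
labels = [0, 0, 1, 1]

model = train_nearest_centroid(samples, labels)   # the "Classify" step
predicted = classify(model, (0.7, 0.7))           # label for a new canvas point
```

Painting the canvas then amounts to calling `classify` for each sample (Learned Model layer) or for each coordinate (Density Map layer).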
Options and Commands
The interface for classification (the right-hand side of the
Algorithm Options dialog) provides the following commands:
- Classify: perform the classification using the currently
selected algorithm and options
- Clear: clear the current classifier model (does NOT clear the
data)
- Show ROC: display the Receiver Operating Characteristic curve
for the current binary classification
- Compare: add the current algorithm and options to the Compare
dialog for batch comparisons
and the following options:
- Positive Class: (currently unused) defines the class to be
used as the positive class (by default class #1)
- Train / Test ratio: the fraction of the samples in the canvas to
be used for training (the remaining samples are used for testing)
- Input Dimensions: determines the dimensions that should be
used for classification (unselected dimensions will be ignored)
- Manual Selection: manually select the training samples
(overrides the Train/Test ratio option)
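The effect of the Train / Test ratio and Input Dimensions options can be sketched as below. This is a minimal illustration under stated assumptions: the helper names `select_dimensions` and `train_test_split` are hypothetical, and choosing the training samples by a seeded random shuffle is an assumption, since the selection strategy is not described here.

```python
import random

def select_dimensions(samples, dims):
    """Keep only the selected input dimensions of each sample;
    unselected dimensions are ignored."""
    return [tuple(point[d] for d in dims) for point in samples]

def train_test_split(samples, labels, train_ratio, seed=0):
    """Split samples into a training and a testing part.

    train_ratio is the fraction of samples used for training. The random
    shuffle is an assumption made for this sketch.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = int(train_ratio * len(indices))
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test

# Eight 3-dimensional samples; only dimensions 0 and 2 are selected.
samples = [(i * 0.1, 5.0, 1.0 - i * 0.1) for i in range(8)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

reduced = select_dimensions(samples, [0, 2])
train, test = train_test_split(reduced, labels, train_ratio=0.75)
```

With Manual Selection, the shuffle would be replaced by the user's own choice of training indices.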
All other options are algorithm-dependent and should be described in
the help menu of the algorithm itself.
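For reference, the curve displayed by the Show ROC command can be computed from classifier scores as sketched below. This is a generic textbook construction, not the program's own code; the function names `roc_points` and `auc` and the example scores are made up for illustration.

```python
def roc_points(scores, labels):
    """Return ROC points as (false positive rate, true positive rate).

    scores: higher means more likely positive.
    labels: 1 for the positive class, 0 for the negative class.
    """
    positives = sum(1 for label in labels if label == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC needs at least one sample of each class")

    # Lower the decision threshold step by step, from the highest score down.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]   # made-up classifier outputs
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc(points)
```

A curve that hugs the top-left corner (area close to 1) indicates good separation of the two classes, while a curve along the diagonal (area close to 0.5) indicates chance-level performance.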