Linear Projections
One of the most straightforward solutions for classification is to
project the data linearly and to perform Naive Bayes classification on
the projected data. Here we present several ways of projecting the
data linearly onto one or two dimensions.
Principal Component Analysis (PCA)
PCA seeks the directions of maximum variance (the eigenvectors of the
data covariance matrix) and projects the data onto the first Principal
Component, i.e. the eigenvector with the largest eigenvalue. No
distinction is made between samples belonging to different classes.
More information on Wikipedia.
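As a concrete illustration (a minimal NumPy sketch, not the tool's own code), the projection can be written as follows, where X is an (n_samples, n_features) array:

    import numpy as np

    def pca_project(X, n_components=1):
        # Center the data, then diagonalize its covariance matrix.
        Xc = X - X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
        # eigh returns eigenvalues in ascending order; keep the eigenvectors
        # with the largest eigenvalues (the principal components).
        components = eigvecs[:, ::-1][:, :n_components]
        return Xc @ components  # shape (n_samples, n_components)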
Linear Discriminant Analysis (LDA)
LDA models each class of samples separately and finds the direction
that maximizes the distance between the two distributions. In its basic
form, LDA models each class as a Gaussian distribution of equal
variance. More information on Wikipedia.
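A minimal sketch of this direction under the equal-variance Gaussian model, assuming X0 and X1 are NumPy arrays holding the samples of each class:

    import numpy as np

    def lda_direction(X0, X1):
        m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
        # Single covariance estimate shared by both classes.
        pooled = np.cov(np.vstack([X0 - m0, X1 - m1]), rowvar=False)
        # Direction that best separates the two class means.
        w = np.linalg.solve(pooled, m1 - m0)
        return w / np.linalg.norm(w)

The data is then projected onto this direction with X @ w.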
Fisher Linear Discriminant
Fisher-LDA extends LDA by modeling each class as a Gaussian
distribution with its own variance (instead of a single variance
common to both distributions, as in standard LDA). If the two classes
have similar distributions, there will be no visible difference between
LDA and Fisher-LDA. More information on Wikipedia.
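A sketch of the same computation keeping a separate scatter matrix per class, under the same assumptions as the LDA sketch above:

    import numpy as np

    def fisher_direction(X0, X1):
        m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
        # Each class keeps its own scatter matrix instead of a pooled one.
        S0 = np.cov(X0, rowvar=False) * (len(X0) - 1)
        S1 = np.cov(X1, rowvar=False) * (len(X1) - 1)
        w = np.linalg.solve(S0 + S1, m1 - m0)
        return w / np.linalg.norm(w)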
Independent Component Analysis (ICA)
ICA looks for the directions that maximize the statistical independence
of the projected data. While in the previous cases the components are
always orthogonal, ICA can project the data along non-orthogonal
directions. More information on Wikipedia.
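A minimal sketch using scikit-learn's FastICA (the tool's own ICA implementation may differ):

    from sklearn.decomposition import FastICA

    def ica_project(X, n_components=2):
        # FastICA finds (possibly non-orthogonal) directions whose
        # projections are maximally statistically independent.
        return FastICA(n_components=n_components, random_state=0).fit_transform(X)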
Kernel Parameters
More information on Wikipedia.
- Kernel Type:
- Linear: linear kernel
- Polynomial: polynomial kernel
- RBF: radial basis function (Gaussian) kernel
- Kernel Width: inverse variance of the kernel function; determines the radius of influence of each sample (RBF + Poly; see the sketch after this list)
- Degree: degree of the polynomial (Poly)
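One common parameterization of these kernels is sketched below; the exact formulas used by the tool may differ (e.g. in the constant term of the polynomial kernel):

    import numpy as np

    def linear_kernel(x, y):
        return np.dot(x, y)

    def poly_kernel(x, y, gamma=1.0, degree=3):
        # gamma plays the role of the Kernel Width parameter above.
        return (gamma * np.dot(x, y) + 1.0) ** degree

    def rbf_kernel(x, y, gamma=1.0):
        # gamma is an inverse variance: larger values shrink the
        # radius of influence of each sample.
        return np.exp(-gamma * np.sum((x - y) ** 2))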
Naive Bayes
Regardless of the method used for projection (if any), the data is
separated into positive and negative classes, and the probability of a
sample belonging to each class is computed separately. The response of
the classifier in this implementation follows a Maximum A Posteriori
(MAP) decision rule. More information on Wikipedia.
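A minimal sketch of the MAP decision under a Gaussian Naive Bayes model, assuming X_pos and X_neg are NumPy arrays holding the training samples of the two classes:

    import numpy as np

    def nb_map_decision(x, X_pos, X_neg, prior_pos=0.5):
        def log_likelihood(x, X):
            # One Gaussian per feature: the 'naive' independence assumption.
            mu, var = X.mean(axis=0), X.var(axis=0) + 1e-9
            return np.sum(-0.5 * np.log(2 * np.pi * var)
                          - (x - mu) ** 2 / (2 * var))
        # MAP rule: pick the class maximizing prior * likelihood.
        score_pos = np.log(prior_pos) + log_likelihood(x, X_pos)
        score_neg = np.log(1 - prior_pos) + log_likelihood(x, X_neg)
        return 1 if score_pos > score_neg else 0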
The interface presents two buttons: one visualizes the projection of
the data into component space, and the other switches the data
displayed in the canvas between the source samples and the projected
samples.