Class IBk

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Cloneable, UpdateableClassifier, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

    public class IBk
    extends Classifier
    implements OptionHandler, UpdateableClassifier, WeightedInstancesHandler, TechnicalInformationHandler, AdditionalMeasureProducer
    K-nearest neighbours classifier. Can select an appropriate value of K based on cross-validation, and can also weight neighbours by their distance.

    For more information, see

    D. Aha, D. Kibler (1991). Instance-based learning algorithms. Machine Learning. 6:37-66.

    BibTeX:

     @article{Aha1991,
        author = {D. Aha and D. Kibler},
        journal = {Machine Learning},
        pages = {37-66},
        title = {Instance-based learning algorithms},
        volume = {6},
        year = {1991}
     }
     

    Valid options are:

     -I
      Weight neighbours by the inverse of their distance
      (use when k > 1)
     -F
      Weight neighbours by 1 - their distance
      (use when k > 1)
     -K <number of neighbors>
      Number of nearest neighbours (k) used in classification.
      (Default = 1)
     -E
      Minimise mean squared error rather than mean absolute
      error when using -X option with numeric prediction.
     -W <window size>
      Maximum number of training instances maintained.
      Training instances are dropped FIFO. (Default = no window)
     -X
      Select the number of nearest neighbours between 1
      and the k value specified using hold-one-out evaluation
      on the training data (use when k > 1)
     -A
      The nearest neighbour search algorithm to use (default: weka.core.neighboursearch.LinearNNSearch).
     
    Version:
    $Revision: 10069 $
    Author:
    Stuart Inglis (singlis@cs.waikato.ac.nz), Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
    See Also:
    Serialized Form
    • Field Detail

      • WEIGHT_INVERSE

        public static final int WEIGHT_INVERSE
        weight by 1/distance.
        See Also:
        Constant Field Values
      • WEIGHT_SIMILARITY

        public static final int WEIGHT_SIMILARITY
        weight by 1-distance.
        See Also:
        Constant Field Values
      • TAGS_WEIGHTING

        public static final Tag[] TAGS_WEIGHTING
        possible instance weighting methods.
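        The two weighting formulas above (1/distance and 1-distance) can be illustrated with a minimal stand-alone sketch. The enum, the method name, and the small offset that guards against division by zero are assumptions made for illustration; this is not the class's actual code.

```java
// Stand-alone sketch; the enum and method names are hypothetical, not Weka's API.
final class DistanceWeightingSketch {

    enum Weighting { NONE, INVERSE, SIMILARITY }

    // Assumption: a small offset keeps 1/distance finite when a neighbour
    // coincides with the query (distance 0).
    static final double EPSILON = 1e-3;

    /** Returns the vote weight of one neighbour at the given distance. */
    static double weight(Weighting method, double distance) {
        switch (method) {
            case INVERSE:
                return 1.0 / (distance + EPSILON);
            case SIMILARITY:
                // Assumption: distances are scaled to [0, 1], so weights stay non-negative.
                return 1.0 - distance;
            default:
                return 1.0; // no weighting: every neighbour counts equally
        }
    }

    public static void main(String[] args) {
        double[] distances = {0.0, 0.25, 0.5, 1.0};
        for (Weighting method : Weighting.values()) {
            StringBuilder line = new StringBuilder(method + ":");
            for (double d : distances) {
                line.append(String.format(" %.3f", weight(method, d)));
            }
            System.out.println(line);
        }
    }
}
```

        Both schemes give closer neighbours a larger say in the vote; inverse weighting emphasises very close neighbours far more strongly than 1-distance does.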
    • Constructor Detail

      • IBk

        public IBk(int k)
        IBk classifier. Simple instance-based learner that predicts the class of a test instance from the classes of its k nearest training instances.
        Parameters:
        k - the number of nearest neighbors to use for prediction
      • IBk

        public IBk()
        IB1 classifier. Instance-based learner. Predicts the class of the single nearest training instance for each test instance.
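        As a rough illustration of what the two constructors configure, the following stand-alone sketch predicts a class by majority vote among the k nearest training points. It assumes purely numeric features and Euclidean distance, and every name in it is hypothetical; it is not the class's implementation.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-alone sketch; not Weka's IBk implementation.
final class KnnSketch {

    /** Euclidean distance between two numeric feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Predicts the majority class among the k training points nearest to the query. */
    static int classify(double[][] train, int[] labels, int numClasses, double[] query, int k) {
        Integer[] order = new Integer[train.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort training indices by their distance to the query.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> distance(train[i], query)));

        int[] votes = new int[numClasses];
        for (int i = 0; i < Math.min(k, order.length); i++) {
            votes[labels[order[i]]]++;
        }
        // Pick the class with the most votes; ties go to the lower class index.
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] train = {{0.0, 0.0}, {0.1, 0.2}, {1.0, 1.0}, {0.9, 1.1}, {1.2, 0.8}};
        int[] labels = {0, 0, 1, 1, 1};
        double[] query = {0.2, 0.1};
        System.out.println("k=1 -> class " + classify(train, labels, 2, query, 1));
        System.out.println("k=5 -> class " + classify(train, labels, 2, query, 5));
    }
}
```

        With k = 1 the query takes the class of its single nearest point (class 0 here); with k = 5 every training point votes and the larger class wins (class 1), which shows why the choice of k matters.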
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing the classifier.
        Returns:
        a description suitable for displaying in the explorer/experimenter gui
      • getTechnicalInformation

        public TechnicalInformation getTechnicalInformation()
        Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
        Specified by:
        getTechnicalInformation in interface TechnicalInformationHandler
        Returns:
        the technical information about this class
      • KNNTipText

        public java.lang.String KNNTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setKNN

        public void setKNN(int k)
        Set the number of neighbours the learner is to use.
        Parameters:
        k - the number of neighbours.
      • getKNN

        public int getKNN()
        Gets the number of neighbours the learner will use.
        Returns:
        the number of neighbours.
      • windowSizeTipText

        public java.lang.String windowSizeTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getWindowSize

        public int getWindowSize()
        Gets the maximum number of instances allowed in the training pool. Adding instances beyond this limit removes the oldest instances first. A value of 0 signifies no limit to the number of training instances.
        Returns:
        the maximum number of training instances kept (0 for no limit).
      • setWindowSize

        public void setWindowSize(int newWindowSize)
        Sets the maximum number of instances allowed in the training pool. Adding instances beyond this limit removes the oldest instances first. A value of 0 signifies no limit to the number of training instances.
        Parameters:
        newWindowSize - the maximum number of training instances to keep (0 for no limit).
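        The window behaviour described here (a bounded pool from which the oldest instances are dropped first, with 0 meaning no limit) can be sketched as follows. The class and its methods are hypothetical and only model the pool, not the classifier.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-alone sketch of a FIFO training window; not Weka code.
final class WindowSketch<T> {

    private final Deque<T> pool = new ArrayDeque<>();
    private final int windowSize; // 0 means "no limit"

    WindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds an instance and drops the oldest ones while the pool exceeds the window. */
    void add(T instance) {
        pool.addLast(instance);
        while (windowSize > 0 && pool.size() > windowSize) {
            pool.removeFirst();
        }
    }

    int size() {
        return pool.size();
    }

    /** Returns the oldest instance still in the pool, or null if the pool is empty. */
    T oldest() {
        return pool.peekFirst();
    }

    public static void main(String[] args) {
        WindowSketch<Integer> window = new WindowSketch<>(3);
        for (int i = 1; i <= 5; i++) {
            window.add(i);
        }
        System.out.println("size=" + window.size() + ", oldest=" + window.oldest());
    }
}
```

        After adding five instances to a window of three, only the three most recent remain, so the oldest surviving instance is the third one added.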
      • distanceWeightingTipText

        public java.lang.String distanceWeightingTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getDistanceWeighting

        public SelectedTag getDistanceWeighting()
        Gets the distance weighting method used. Will be one of WEIGHT_NONE, WEIGHT_INVERSE, or WEIGHT_SIMILARITY.
        Returns:
        the distance weighting method used.
      • setDistanceWeighting

        public void setDistanceWeighting(SelectedTag newMethod)
        Sets the distance weighting method used. Values other than WEIGHT_NONE, WEIGHT_INVERSE, or WEIGHT_SIMILARITY will be ignored.
        Parameters:
        newMethod - the distance weighting method to use
      • meanSquaredTipText

        public java.lang.String meanSquaredTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getMeanSquared

        public boolean getMeanSquared()
        Gets whether the mean squared error is used rather than mean absolute error when doing cross-validation.
        Returns:
        true if mean squared error is used, false if mean absolute error is used.
      • setMeanSquared

        public void setMeanSquared(boolean newMeanSquared)
        Sets whether the mean squared error is used rather than mean absolute error when doing cross-validation.
        Parameters:
        newMeanSquared - true to use mean squared error, false to use mean absolute error.
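        The choice between the two error measures can change which candidate looks best during numeric prediction. The following stand-alone sketch uses made-up predictions and hypothetical helper names to show one case where the two measures disagree.

```java
// Hypothetical stand-alone sketch comparing the two error measures; not Weka code.
final class ErrorMeasureSketch {

    /** Mean of |actual - predicted|. */
    static double meanAbsoluteError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    /** Mean of (actual - predicted)^2; large individual errors are penalised more heavily. */
    static double meanSquaredError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        double[] actual = {10, 10, 10, 10};
        double[] candidateA = {11, 9, 11, 9};   // always off by 1
        double[] candidateB = {10, 10, 10, 13}; // exact three times, off by 3 once
        System.out.println("MAE: A=" + meanAbsoluteError(actual, candidateA)
                + " B=" + meanAbsoluteError(actual, candidateB));
        System.out.println("MSE: A=" + meanSquaredError(actual, candidateA)
                + " B=" + meanSquaredError(actual, candidateB));
    }
}
```

        In this made-up data, mean absolute error prefers candidate B (0.75 against 1.0), while mean squared error prefers candidate A (1.0 against 2.25) because B's single large miss is squared.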
      • crossValidateTipText

        public java.lang.String crossValidateTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getCrossValidate

        public boolean getCrossValidate()
        Gets whether hold-one-out cross-validation will be used to select the best k value.
        Returns:
        true if cross-validation will be used.
      • setCrossValidate

        public void setCrossValidate(boolean newCrossValidate)
        Sets whether hold-one-out cross-validation will be used to select the best k value.
        Parameters:
        newCrossValidate - true if cross-validation should be used.
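        Hold-one-out selection of k can be sketched as follows: each training instance is held out in turn, classified from the remaining instances for every k from 1 up to the configured maximum, and the k with the fewest errors is kept. The sketch assumes a single numeric feature, absolute difference as the distance, and ties resolved in favour of the smaller k; all names are hypothetical and this is not the class's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-alone sketch of hold-one-out selection of k; not Weka code.
final class HoldOneOutSketch {

    /** errors[k] = number of held-out instances misclassified when using k neighbours. */
    static int[] holdOneOutErrors(double[] x, int[] labels, int numClasses, int maxK) {
        int n = x.length;
        int limit = Math.max(1, Math.min(maxK, n - 1)); // cannot use more neighbours than exist
        int[] errors = new int[limit + 1];
        for (int held = 0; held < n; held++) {
            final double target = x[held];
            List<Integer> others = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if (i != held) {
                    others.add(i);
                }
            }
            // Sort the remaining instances once by distance to the held-out instance.
            others.sort(Comparator.comparingDouble((Integer i) -> Math.abs(x[i] - target)));
            int[] votes = new int[numClasses];
            for (int k = 1; k <= limit && k <= others.size(); k++) {
                votes[labels[others.get(k - 1)]]++; // add the k-th neighbour's vote
                if (argMax(votes) != labels[held]) {
                    errors[k]++;
                }
            }
        }
        return errors;
    }

    /** Returns the k in 1..maxK with the fewest hold-one-out errors (smallest k on ties). */
    static int selectK(double[] x, int[] labels, int numClasses, int maxK) {
        int[] errors = holdOneOutErrors(x, labels, numClasses, maxK);
        int bestK = 1;
        for (int k = 2; k < errors.length; k++) {
            if (errors[k] < errors[bestK]) {
                bestK = k;
            }
        }
        return bestK;
    }

    private static int argMax(int[] votes) {
        int best = 0;
        for (int c = 1; c < votes.length; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two clusters, each containing one instance with a "noisy" label.
        double[] x = {1.0, 1.2, 1.4, 1.6, 1.3, 3.0, 3.2, 3.4, 3.6, 3.3};
        int[] labels = {0, 0, 0, 0, 1, 1, 1, 1, 1, 0};
        System.out.println("selected k = " + selectK(x, labels, 2, 3));
    }
}
```

        In this made-up data set the noisy labels mislead the single nearest neighbour, so k = 3 produces fewer hold-one-out errors than k = 1 and is the value selected.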
      • nearestNeighbourSearchAlgorithmTipText

        public java.lang.String nearestNeighbourSearchAlgorithmTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getNearestNeighbourSearchAlgorithm

        public NearestNeighbourSearch getNearestNeighbourSearchAlgorithm()
        Returns the current nearestNeighbourSearch algorithm in use.
        Returns:
        the NearestNeighbourSearch algorithm currently in use.
      • setNearestNeighbourSearchAlgorithm

        public void setNearestNeighbourSearchAlgorithm(NearestNeighbourSearch nearestNeighbourSearchAlgorithm)
        Sets the nearestNeighbourSearch algorithm to be used for finding nearest neighbour(s).
        Parameters:
        nearestNeighbourSearchAlgorithm - the NearestNeighbourSearch algorithm to use.
      • getNumTraining

        public int getNumTraining()
        Get the number of training instances the classifier is currently using.
        Returns:
        the number of training instances the classifier is currently using
      • buildClassifier

        public void buildClassifier(Instances instances)
                             throws java.lang.Exception
        Generates the classifier.
        Specified by:
        buildClassifier in class Classifier
        Parameters:
        instances - set of instances serving as training data
        Throws:
        java.lang.Exception - if the classifier has not been generated successfully
      • updateClassifier

        public void updateClassifier(Instance instance)
                              throws java.lang.Exception
        Adds the supplied instance to the training set.
        Specified by:
        updateClassifier in interface UpdateableClassifier
        Parameters:
        instance - the instance to add
        Throws:
        java.lang.Exception - if instance could not be incorporated successfully
      • distributionForInstance

        public double[] distributionForInstance(Instance instance)
                                         throws java.lang.Exception
        Calculates the class membership probabilities for the given test instance.
        Overrides:
        distributionForInstance in class Classifier
        Parameters:
        instance - the instance to be classified
        Returns:
        predicted class probability distribution
        Throws:
        java.lang.Exception - if an error occurred during the prediction
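        One simple way to turn the neighbours found for a test instance into class membership probabilities is to sum each neighbour's weight into its class and normalise. The sketch below does exactly that; the names are hypothetical, the uniform fallback for a zero total weight is an assumption, and the class's actual computation may differ in detail.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch of a neighbour-vote class distribution; not Weka code.
final class DistributionSketch {

    /**
     * Sums each neighbour's weight into its class and normalises so the result sums to 1.
     * Assumption: if the total weight is zero, a uniform distribution is returned.
     */
    static double[] distribution(int[] neighbourLabels, double[] weights, int numClasses) {
        double[] dist = new double[numClasses];
        double total = 0.0;
        for (int i = 0; i < neighbourLabels.length; i++) {
            dist[neighbourLabels[i]] += weights[i];
            total += weights[i];
        }
        for (int c = 0; c < numClasses; c++) {
            dist[c] = total > 0.0 ? dist[c] / total : 1.0 / numClasses;
        }
        return dist;
    }

    public static void main(String[] args) {
        int[] neighbourLabels = {0, 1, 1, 2};
        double[] equalWeights = {1.0, 1.0, 1.0, 1.0};
        System.out.println(Arrays.toString(distribution(neighbourLabels, equalWeights, 3)));
    }
}
```

        With four equally weighted neighbours of classes 0, 1, 1 and 2, the resulting distribution is 0.25, 0.5 and 0.25; passing distance-based weights instead of equal ones shifts probability towards the classes of the closer neighbours.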
      • listOptions

        public java.util.Enumeration listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface OptionHandler
        Overrides:
        listOptions in class Classifier
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a given list of options.

        Valid options are:

         -I
          Weight neighbours by the inverse of their distance
          (use when k > 1)
         -F
          Weight neighbours by 1 - their distance
          (use when k > 1)
         -K <number of neighbors>
          Number of nearest neighbours (k) used in classification.
          (Default = 1)
         -E
          Minimise mean squared error rather than mean absolute
          error when using -X option with numeric prediction.
         -W <window size>
          Maximum number of training instances maintained.
          Training instances are dropped FIFO. (Default = no window)
         -X
          Select the number of nearest neighbours between 1
          and the k value specified using hold-one-out evaluation
          on the training data (use when k > 1)
         -A
          The nearest neighbour search algorithm to use (default: weka.core.neighboursearch.LinearNNSearch).
         
        Specified by:
        setOptions in interface OptionHandler
        Overrides:
        setOptions in class Classifier
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
      • getOptions

        public java.lang.String[] getOptions()
        Gets the current settings of IBk.
        Specified by:
        getOptions in interface OptionHandler
        Overrides:
        getOptions in class Classifier
        Returns:
        an array of strings suitable for passing to setOptions()
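        The contract between getOptions() and setOptions() is a round trip: the array returned by one can be fed back into the other to reproduce the same settings. The sketch below models that contract with a hypothetical settings holder that understands only the -K, -W, -I, -F, -E and -X flags listed above (it omits -A); it is not the class's option-handling code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical settings holder illustrating the option round trip; not Weka code.
final class OptionsSketch {

    int k = 1;             // -K
    int windowSize = 0;    // -W (0 = no window)
    boolean inverse;       // -I
    boolean similarity;    // -F
    boolean meanSquared;   // -E
    boolean crossValidate; // -X

    /** Resets to defaults, then applies the given flags. */
    void setOptions(String[] options) {
        k = 1;
        windowSize = 0;
        inverse = similarity = meanSquared = crossValidate = false;
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-K": k = Integer.parseInt(options[++i]); break;
                case "-W": windowSize = Integer.parseInt(options[++i]); break;
                case "-I": inverse = true; break;
                case "-F": similarity = true; break;
                case "-E": meanSquared = true; break;
                case "-X": crossValidate = true; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
    }

    /** Returns the current settings in a form that setOptions() accepts. */
    String[] getOptions() {
        List<String> out = new ArrayList<>();
        out.add("-K");
        out.add(Integer.toString(k));
        out.add("-W");
        out.add(Integer.toString(windowSize));
        if (inverse) out.add("-I");
        if (similarity) out.add("-F");
        if (meanSquared) out.add("-E");
        if (crossValidate) out.add("-X");
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        OptionsSketch first = new OptionsSketch();
        first.setOptions(new String[] {"-K", "5", "-X", "-I"});
        OptionsSketch second = new OptionsSketch();
        second.setOptions(first.getOptions()); // round trip
        System.out.println(Arrays.toString(second.getOptions()));
    }
}
```

        Resetting to defaults at the start of setOptions() is a design choice in this sketch: it makes the outcome depend only on the array passed in, which is what lets the round trip reproduce the settings exactly.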
      • enumerateMeasures

        public java.util.Enumeration enumerateMeasures()
        Returns an enumeration of the additional measure names produced by the neighbour search algorithm, plus the chosen K if cross-validation is enabled.
        Specified by:
        enumerateMeasures in interface AdditionalMeasureProducer
        Returns:
        an enumeration of the measure names
      • getMeasure

        public double getMeasure(java.lang.String additionalMeasureName)
        Returns the value of the named measure. Measures come from the neighbour search algorithm; the chosen K is also available if cross-validation is enabled.
        Specified by:
        getMeasure in interface AdditionalMeasureProducer
        Parameters:
        additionalMeasureName - the name of the measure to query for its value
        Returns:
        the value of the named measure
        Throws:
        java.lang.IllegalArgumentException - if the named measure is not supported
      • toString

        public java.lang.String toString()
        Returns a description of this classifier.
        Overrides:
        toString in class java.lang.Object
        Returns:
        a description of this classifier as a string.
      • pruneToK

        public Instances pruneToK(Instances neighbours,
                                  double[] distances,
                                  int k)
        Prunes the list to contain the k nearest neighbors. If there are multiple neighbors at the k'th distance, all will be kept.
        Parameters:
        neighbours - the neighbour instances.
        distances - the distances of the neighbours from target instance.
        k - the number of neighbors to keep.
        Returns:
        the pruned neighbours.
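        The tie rule described above (keep the k nearest neighbours, plus any further neighbours at exactly the k'th distance) can be sketched on a plain array of distances. The simplified signature, the assumption that the distances are already sorted in ascending order, and the assumption that k is at least 1 are all choices made for this illustration.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch of tie-preserving pruning; not Weka's pruneToK.
final class PruneToKSketch {

    /**
     * Keeps the first k distances plus any following ones equal to the k'th distance.
     * Assumptions: sortedDistances is in ascending order and k >= 1.
     */
    static double[] pruneToK(double[] sortedDistances, int k) {
        if (sortedDistances.length <= k) {
            return sortedDistances.clone(); // nothing to prune
        }
        double kthDistance = sortedDistances[k - 1];
        int keep = k;
        // Extend the cut-off while later neighbours tie with the k'th one.
        while (keep < sortedDistances.length && sortedDistances[keep] == kthDistance) {
            keep++;
        }
        return Arrays.copyOf(sortedDistances, keep);
    }

    public static void main(String[] args) {
        double[] distances = {0.1, 0.2, 0.2, 0.2, 0.5};
        System.out.println(Arrays.toString(pruneToK(distances, 2)));
    }
}
```

        With k = 2 the example keeps four distances, because the third and fourth neighbours tie with the second; the result can therefore contain more than k neighbours.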
      • main

        public static void main(java.lang.String[] argv)
        Main method for testing this class.
        Parameters:
        argv - should contain command line options (see setOptions)