Package weka.clusterers
Class sIB
java.lang.Object
  weka.clusterers.AbstractClusterer
    weka.clusterers.RandomizableClusterer
      weka.clusterers.sIB

All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Clusterer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
public class sIB extends RandomizableClusterer implements TechnicalInformationHandler
Cluster data using the sequential information bottleneck algorithm.
Note: only the hard clustering scheme is supported. sIB assigns each instance to the cluster that has the minimum cost/distance to the instance. The trade-off parameter beta is set to infinity, so 1/beta is zero.
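The hard assignment described in the note can be sketched as follows. This is a minimal, self-contained illustration and not the code of this class: the class name `SibCostSketch`, its method names, and the toy numbers are hypothetical. The cost is assumed to take the form (p(x) + p(t)) * JS(p(y|x), p(y|t)), with the Jensen-Shannon weights proportional to p(x) and p(t); this page does not state the cost function that sIB actually uses.

```java
public class SibCostSketch {

    /** Kullback-Leibler divergence KL(p || q) in nats; terms with p[i] == 0 contribute 0. */
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                sum += p[i] * Math.log(p[i] / q[i]);
            }
        }
        return sum;
    }

    /** Weighted Jensen-Shannon divergence; the weights w1 and w2 must sum to 1. */
    static double js(double[] p, double w1, double[] q, double w2) {
        double[] mix = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            mix[i] = w1 * p[i] + w2 * q[i];
        }
        return w1 * kl(p, mix) + w2 * kl(q, mix);
    }

    /** Assumed cost of merging instance x into cluster t (an assumption, see the lead-in). */
    static double cost(double px, double[] pyGivenX, double pt, double[] pyGivenT) {
        double total = px + pt;
        return total * js(pyGivenX, px / total, pyGivenT, pt / total);
    }

    /** Hard assignment: index of the cluster with the smallest cost for the instance. */
    static int assign(double px, double[] pyGivenX, double[] pt, double[][] pyGivenT) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int t = 0; t < pt.length; t++) {
            double c = cost(px, pyGivenX, pt[t], pyGivenT[t]);
            if (c < bestCost) {
                bestCost = c;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] instance = {0.8, 0.2};                     // p(y|x), toy values
        double[] clusterPriors = {0.5, 0.5};                // p(t), toy values
        double[][] clusterDists = {{0.1, 0.9}, {0.7, 0.3}}; // p(y|t), toy values
        System.out.println(assign(0.1, instance, clusterPriors, clusterDists));
    }
}
```

Because the assignment is hard, only the index of the cheapest cluster is kept; no membership probabilities are produced.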
For more information, see:
Noam Slonim, Nir Friedman, Naftali Tishby: Unsupervised document classification using sequential information maximization. In: Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval, 129-136, 2002.
BibTeX:
@inproceedings{Slonim2002,
   author = {Noam Slonim and Nir Friedman and Naftali Tishby},
   booktitle = {Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval},
   pages = {129-136},
   title = {Unsupervised document classification using sequential information maximization},
   year = {2002}
}
Valid options are:
-I <num> maximum number of iterations (default 100).
-M <num> minimum number of changes in a single iteration (default 0).
-N <num> number of clusters. (default 2).
-R <num> number of restarts. (default 5).
-U set not to normalize the data (default true).
-V set to output debug info (default false).
-S <num> Random number seed. (default 1)
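To make the option list concrete, here is a minimal sketch of how an option array of this shape could be read, using the numeric defaults listed above. The class `SibOptionsSketch` and its methods are hypothetical and are not Weka's own option parser; the `-U` and `-V` flags are only checked for presence, since the list does not spell out what their absence means.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SibOptionsSketch {

    /** Flags that take a numeric argument, with the defaults from the option list. */
    static Map<String, Integer> numericDefaults() {
        Map<String, Integer> defaults = new LinkedHashMap<>();
        defaults.put("-I", 100); // maximum number of iterations
        defaults.put("-M", 0);   // minimum number of changes in a single iteration
        defaults.put("-N", 2);   // number of clusters
        defaults.put("-R", 5);   // number of restarts
        defaults.put("-S", 1);   // random number seed
        return defaults;
    }

    /** Hypothetical reader: starts from the defaults and overrides them with values found in options. */
    static Map<String, Integer> readNumeric(String[] options) {
        Map<String, Integer> settings = numericDefaults();
        for (int i = 0; i < options.length; i++) {
            if (settings.containsKey(options[i])) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("Missing value for " + options[i]);
                }
                settings.put(options[i], Integer.parseInt(options[i + 1]));
                i++; // skip the value that was just consumed
            }
        }
        return settings;
    }

    /** Returns true if a value-less flag such as -U or -V appears in options. */
    static boolean hasFlag(String[] options, String flag) {
        for (String option : options) {
            if (option.equals(flag)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] options = {"-N", "3", "-R", "10", "-V"};
        System.out.println(readNumeric(options));
        System.out.println("debug flag present: " + hasFlag(options, "-V"));
    }
}
```

Options that are not given keep their defaults, so an array such as `{"-N", "3"}` changes only the number of clusters.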
- Version:
- $Revision: 5538 $
- Author:
- Noam Slonim, Anna Huang
- See Also:
- Serialized Form
-
-
Constructor Summary
sIB()
-
Method Summary
Modifier and Type, Method
    Description
void buildClusterer(Instances data)
    Generates a clusterer.
int clusterInstance(Instance instance)
    Clusters a given instance. This method, defined in the Clusterer interface, does nothing but return the cluster assigned to the instance.
java.lang.String debugTipText()
    Returns the tip text for this property.
Capabilities getCapabilities()
    Returns default capabilities of the clusterer.
boolean getDebug()
    Get debug mode.
int getMaxIterations()
    Get the max number of iterations.
int getMinChange()
    Get the minimum number of changes.
boolean getNotUnifyNorm()
    Get whether to normalize instances to unify prior probability before building the clusterer.
int getNumClusters()
    Get the number of clusters.
int getNumRestarts()
    Get the number of restarts.
java.lang.String[] getOptions()
    Gets the current settings.
java.lang.String getRevision()
    Returns the revision string.
TechnicalInformation getTechnicalInformation()
    Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
java.lang.String globalInfo()
    Returns a string describing this clusterer.
java.util.Enumeration listOptions()
    Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
java.lang.String maxIterationsTipText()
    Returns the tip text for this property.
java.lang.String minChangeTipText()
    Returns the tip text for this property.
java.lang.String notUnifyNormTipText()
    Returns the tip text for this property.
int numberOfClusters()
    Get the number of clusters.
java.lang.String numClustersTipText()
    Returns the tip text for this property.
java.lang.String numRestartsTipText()
    Returns the tip text for this property.
void setDebug(boolean v)
    Set debug mode (verbose output).
void setMaxIterations(int i)
    Set the max number of iterations.
void setMinChange(int m)
    Set the minimum number of changes.
void setNotUnifyNorm(boolean b)
    Set whether to normalize instances to unify prior probability before building the clusterer.
void setNumClusters(int n)
    Set the number of clusters.
void setNumRestarts(int i)
    Set the number of restarts.
void setOptions(java.lang.String[] options)
    Parses a given list of options.
java.lang.String toString()
-
Methods inherited from class weka.clusterers.RandomizableClusterer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.clusterers.AbstractClusterer
distributionForInstance, forName, makeCopies, makeCopy
-
Method Detail
-
buildClusterer
public void buildClusterer(Instances data) throws java.lang.Exception
Generates a clusterer.
- Specified by:
buildClusterer in interface Clusterer
- Specified by:
buildClusterer in class AbstractClusterer
- Parameters:
data - the training instances
- Throws:
java.lang.Exception - if something goes wrong
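The iteration, change-threshold, and restart settings suggest a control flow along the following lines. This is a hedged sketch only, not the body of buildClusterer: the class `SibLoopSketch`, the `Sweep` and `Score` interfaces, and the toy nearest-mean sweep are hypothetical. It assumes that a run stops once a pass changes at most the minimum number of instances, that the number of restarts is the number of independent runs, and that the best-scoring partition is kept.

```java
import java.util.Arrays;
import java.util.Random;

public class SibLoopSketch {

    /** One pass over all instances; returns how many instances changed cluster. */
    interface Sweep { int run(int[] assignment, Random rnd); }

    /** Quality of a partition; higher is better. */
    interface Score { double of(int[] assignment); }

    /** Runs numRestarts independent runs and returns the best partition (null if numRestarts is 0). */
    static int[] cluster(int numInstances, int numClusters, int maxIterations,
                         int minChange, int numRestarts, long seed,
                         Sweep sweep, Score score) {
        Random rnd = new Random(seed);
        int[] best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int r = 0; r < numRestarts; r++) {
            // Start each run from a fresh random partition.
            int[] assignment = new int[numInstances];
            for (int i = 0; i < numInstances; i++) {
                assignment[i] = rnd.nextInt(numClusters);
            }
            for (int it = 0; it < maxIterations; it++) {
                int changes = sweep.run(assignment, rnd);
                if (changes <= minChange) {
                    break; // assumed stopping rule: too few changes in this pass
                }
            }
            double s = score.of(assignment);
            if (best == null || s > bestScore) {
                bestScore = s;
                best = assignment;
            }
        }
        return best;
    }

    /** Mean of the points currently in cluster t; NaN if the cluster is empty. */
    static double mean(double[] pts, int[] a, int t) {
        double sum = 0.0;
        int n = 0;
        for (int j = 0; j < a.length; j++) {
            if (a[j] == t) { sum += pts[j]; n++; }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    public static void main(String[] args) {
        final double[] pts = {0.0, 0.1, 5.0, 5.1};
        final int k = 2;
        // Toy sweep (not the sIB cost): move each point to the cluster with the nearest mean.
        Sweep nearestMean = (a, rnd) -> {
            int changes = 0;
            for (int i = 0; i < a.length; i++) {
                int target = a[i];
                double bestDist = Double.POSITIVE_INFINITY;
                for (int t = 0; t < k; t++) {
                    double d = Math.abs(pts[i] - mean(pts, a, t));
                    if (d < bestDist) { bestDist = d; target = t; } // NaN from an empty cluster never wins
                }
                if (target != a[i]) { a[i] = target; changes++; }
            }
            return changes;
        };
        // Toy score: negative total distance of the points to their cluster means.
        Score negativeSpread = a -> {
            double spread = 0.0;
            for (int i = 0; i < a.length; i++) {
                spread += Math.abs(pts[i] - mean(pts, a, a[i]));
            }
            return -spread;
        };
        int[] result = cluster(pts.length, k, 100, 0, 5, 1L, nearestMean, negativeSpread);
        System.out.println(Arrays.toString(result));
    }
}
```

Under these assumptions, a minimum-change value of 0 means a run ends only when a full pass moves no instance or the iteration limit is reached.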
-
clusterInstance
public int clusterInstance(Instance instance) throws java.lang.Exception
Clusters a given instance. This method, defined in the Clusterer interface, does nothing but return the cluster assigned to the instance.
- Specified by:
clusterInstance in interface Clusterer
- Overrides:
clusterInstance in class AbstractClusterer
- Parameters:
instance - the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an integer
- Throws:
java.lang.Exception - if instance could not be clustered successfully
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-I <num> maximum number of iterations (default 100).
-M <num> minimum number of changes in a single iteration (default 0).
-N <num> number of clusters. (default 2).
-R <num> number of restarts. (default 5).
-U set not to normalize the data (default true).
-V set to output debug info (default false).
-S <num> Random number seed. (default 1)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class RandomizableClusterer
- Parameters:
options - the list of options as an array of strings
- Throws:
java.lang.Exception - if an option is not supported
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class RandomizableClusterer
- Returns:
- an enumeration of all the available options.
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class RandomizableClusterer
- Returns:
- an array of strings suitable for passing to setOptions()
-
debugTipText
public java.lang.String debugTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setDebug
public void setDebug(boolean v)
Set debug mode (verbose output).
- Parameters:
v - true for verbose output
-
getDebug
public boolean getDebug()
Get debug mode.
- Returns:
- true if debug mode is set
-
maxIterationsTipText
public java.lang.String maxIterationsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setMaxIterations
public void setMaxIterations(int i)
Set the max number of iterations.
- Parameters:
i - max number of iterations
-
getMaxIterations
public int getMaxIterations()
Get the max number of iterations.
- Returns:
- max number of iterations
-
minChangeTipText
public java.lang.String minChangeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setMinChange
public void setMinChange(int m)
Set the minimum number of changes.
- Parameters:
m - the minimum number of changes
-
getMinChange
public int getMinChange()
Get the minimum number of changes.
- Returns:
- the minimum number of changes
-
numClustersTipText
public java.lang.String numClustersTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setNumClusters
public void setNumClusters(int n)
Set the number of clusters.
- Parameters:
n - number of clusters
-
getNumClusters
public int getNumClusters()
Get the number of clusters.
- Returns:
- the number of clusters
-
numberOfClusters
public int numberOfClusters()
Get the number of clusters.
- Specified by:
numberOfClusters in interface Clusterer
- Specified by:
numberOfClusters in class AbstractClusterer
- Returns:
- the number of clusters
-
numRestartsTipText
public java.lang.String numRestartsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setNumRestarts
public void setNumRestarts(int i)
Set the number of restarts.
- Parameters:
i - number of restarts
-
getNumRestarts
public int getNumRestarts()
Get the number of restarts.
- Returns:
- number of restarts
-
notUnifyNormTipText
public java.lang.String notUnifyNormTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setNotUnifyNorm
public void setNotUnifyNorm(boolean b)
Set whether to normalize instances to unify prior probability before building the clusterer.
- Parameters:
b - true to normalize, otherwise false
-
getNotUnifyNorm
public boolean getNotUnifyNorm()
Get whether to normalize instances to unify prior probability before building the clusterer.
- Returns:
- true if set to normalize, false otherwise
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing this clusterer.
- Returns:
- a description of the clusterer suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the clusterer.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Specified by:
getCapabilities in interface Clusterer
- Overrides:
getCapabilities in class AbstractClusterer
- Returns:
- the capabilities of this clusterer
- See Also:
Capabilities
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class AbstractClusterer
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
-
-