Class SubsetSizeForwardSelection
- java.lang.Object
  - weka.attributeSelection.ASSearch
    - weka.attributeSelection.SubsetSizeForwardSelection
- All Implemented Interfaces:
- java.io.Serializable, OptionHandler, RevisionHandler
public class SubsetSizeForwardSelection extends ASSearch implements OptionHandler
SubsetSizeForwardSelection:
Extension of LinearForwardSelection. The search performs an internal cross-validation (the seed and number of folds can be specified). A LinearForwardSelection is performed on each fold to determine the optimal subset size (using the given SubsetSizeEvaluator). Finally, a LinearForwardSelection up to the optimal subset size is performed on the whole data set.
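The procedure described above can be sketched in plain Java without Weka. This is a toy under stated assumptions, not the Weka implementation: plain greedy forward selection stands in for LinearForwardSelection, the names SubsetSizeSketch, forwardSelect, bestSubsetSize, search and TOY_MERIT are hypothetical, the merit function stands in for the subset-size evaluator applied to one fold, and summing the per-fold merits to pick the size is an assumption about how fold results are combined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

public class SubsetSizeSketch {

    // Toy merit (an assumption for illustration): attributes 0 and 2 help,
    // every other attribute costs a little. The fold argument is ignored here;
    // a real evaluator would score the subset on that fold's data.
    public static final ToDoubleBiFunction<List<Integer>, Integer> TOY_MERIT = (subset, fold) -> {
        double merit = 0.0;
        for (int a : subset) {
            merit += (a == 0 || a == 2) ? 1.0 : -0.1;
        }
        return merit;
    };

    // Plain greedy forward selection up to maxSize attributes.
    // If meritBySize is non-null, meritBySize[s - 1] receives the merit of the
    // subset of size s found along the way.
    public static List<Integer> forwardSelect(int numAttrs, int maxSize, int fold,
            ToDoubleBiFunction<List<Integer>, Integer> merit, double[] meritBySize) {
        List<Integer> chosen = new ArrayList<>();
        int limit = Math.min(maxSize, numAttrs);
        for (int size = 1; size <= limit; size++) {
            int bestAttr = -1;
            double bestMerit = Double.NEGATIVE_INFINITY;
            for (int a = 0; a < numAttrs; a++) {
                if (chosen.contains(a)) {
                    continue;
                }
                chosen.add(a);                     // try the candidate
                double m = merit.applyAsDouble(chosen, fold);
                chosen.remove(chosen.size() - 1);  // undo (removes by index)
                if (m > bestMerit) {
                    bestMerit = m;
                    bestAttr = a;
                }
            }
            chosen.add(bestAttr);
            if (meritBySize != null) {
                meritBySize[size - 1] = bestMerit;
            }
        }
        return chosen;
    }

    // Step 1: run forward selection on every fold and accumulate merit per size.
    // Step 2: return the subset size with the highest accumulated merit.
    public static int bestSubsetSize(int numAttrs, int folds,
            ToDoubleBiFunction<List<Integer>, Integer> merit) {
        double[] total = new double[numAttrs];
        for (int fold = 0; fold < folds; fold++) {
            double[] meritBySize = new double[numAttrs];
            forwardSelect(numAttrs, numAttrs, fold, merit, meritBySize);
            for (int s = 0; s < numAttrs; s++) {
                total[s] += meritBySize[s];
            }
        }
        int best = 0;
        for (int s = 1; s < numAttrs; s++) {
            if (total[s] > total[best]) {
                best = s;
            }
        }
        return best + 1;
    }

    // Step 3: forward selection on the whole data, stopped at the chosen size.
    // The fold value -1 is a placeholder meaning "whole data" in this sketch.
    public static List<Integer> search(int numAttrs, int folds,
            ToDoubleBiFunction<List<Integer>, Integer> merit) {
        int size = bestSubsetSize(numAttrs, folds, merit);
        return forwardSelect(numAttrs, size, -1, merit, null);
    }

    public static void main(String[] args) {
        System.out.println(bestSubsetSize(5, 5, TOY_MERIT)); // prints 2
        System.out.println(search(5, 5, TOY_MERIT));         // prints [0, 2]
    }
}
```

With the toy merit, adding a third attribute lowers the score on every fold, so the sketch settles on size 2 and the final pass stops after two attributes.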
For more information see:
Martin Guetlein (2006). Large Scale Attribute Selection Using Wrappers. Freiburg, Germany.
Valid options are:
-I Perform initial ranking to select the top-ranked attributes.
-K <num> Number of top-ranked attributes that are taken into account by the search.
-T <0 = fixed-set | 1 = fixed-width> Type of Linear Forward Selection (default = 0).
-S <num> Size of lookup cache for evaluated subsets. Expressed as a multiple of the number of attributes in the data set. (default = 1)
-E <subset evaluator> Subset evaluator used for subset-size determination. Place any evaluator options last on the command line, following a "--" (e.g. -- -M).
-F <num> Number of cross validation folds for subset size determination (default = 5).
-R <num> Seed for cross validation subset size determination. (default = 1)
-Z verbose on/off
Options specific to evaluator weka.attributeSelection.ClassifierSubsetEval:
-B <classifier> class name of the classifier to use for accuracy estimation. Place any classifier options LAST on the command line following a "--". eg.: -B weka.classifiers.bayes.NaiveBayes ... -- -K (default: weka.classifiers.rules.ZeroR)
-T Use the training data to estimate accuracy.
-H <filename> Name of the hold out/test set to estimate accuracy on.
Options specific to scheme weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
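As an illustration of how the options above fit together, the following sketch assembles an option array in plain Java. Only flags listed above are used; the class name SearchOptionsSketch, the helper buildOptions, and the chosen values (for example 50 for -K) are hypothetical. The sketch assumes that everything after "--" is handed to the evaluator named by -E, as the option descriptions suggest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SearchOptionsSketch {

    // Builds an option array from the documented flags. Evaluator options,
    // if any, are placed last, after a "--" separator.
    public static String[] buildOptions(int numUsedAttributes, int folds, int seed,
            String sizeEvaluator, String... evaluatorOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-I");                                  // perform initial ranking
        opts.add("-K");
        opts.add(Integer.toString(numUsedAttributes));   // top-ranked attributes used
        opts.add("-T");
        opts.add("0");                                   // 0 = fixed-set (the default)
        opts.add("-S");
        opts.add("1");                                   // lookup cache multiplier
        opts.add("-E");
        opts.add(sizeEvaluator);                         // subset-size evaluator class
        opts.add("-F");
        opts.add(Integer.toString(folds));               // cross validation folds
        opts.add("-R");
        opts.add(Integer.toString(seed));                // cross validation seed
        if (evaluatorOptions.length > 0) {
            opts.add("--");
            opts.addAll(Arrays.asList(evaluatorOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(50, 5, 1,
                "weka.attributeSelection.ClassifierSubsetEval",
                "-B", "weka.classifiers.bayes.NaiveBayes");
        System.out.println(String.join(" ", options));
    }
}
```

An array of this shape is what setOptions(java.lang.String[] options) expects; the sketch stops short of calling Weka so that it runs with the JDK alone.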
- Version:
- $Revision: 11198 $
- Author:
- Martin Guetlein (martin.guetlein@gmail.com)
- See Also:
- Serialized Form
-
-
Constructor Summary
SubsetSizeForwardSelection()
Constructor.
-
Method Summary
Modifier and Type, Method, and Description
int getLookupCacheSize()
Return the maximum size of the evaluated subset cache (expressed as a multiplier for the number of attributes in a data set).
int getNumSubsetSizeCVFolds()
Get the number of cross validation folds for subset size determination (default = 5).
int getNumUsedAttributes()
Get the number of top-ranked attributes that are taken into account by the search process.
java.lang.String[] getOptions()
Gets the current settings of SubsetSizeForwardSelection.
boolean getPerformRanking()
Get whether an initial ranking is performed to select the top-ranked attributes.
java.lang.String getRevision()
Returns the revision string.
int getSeed()
Get the seed for cross validation subset size determination.
ASEvaluation getSubsetSizeEvaluator()
Get the subset evaluator used for subset size determination.
TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
SelectedTag getType()
Get the Linear Forward Selection type.
boolean getVerbose()
Get whether output is to be verbose.
java.lang.String globalInfo()
Returns a string describing this search method.
java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
java.lang.String lookupCacheSizeTipText()
Returns the tip text for this property.
java.lang.String numSubsetSizeCVFoldsTipText()
Returns the tip text for this property.
java.lang.String numUsedAttributesTipText()
Returns the tip text for this property.
java.lang.String performRankingTipText()
Returns the tip text for this property.
int[] search(ASEvaluation ASEval, Instances data)
Searches the attribute subset space by subset size forward selection.
java.lang.String seedTipText()
Returns the tip text for this property.
void setLookupCacheSize(int size)
Set the maximum size of the evaluated subset cache (hashtable).
void setNumSubsetSizeCVFolds(int f)
Set the number of cross validation folds for subset size determination (default = 5).
void setNumUsedAttributes(int k)
Set the number of top-ranked attributes that are taken into account by the search process.
void setOptions(java.lang.String[] options)
Parses a given list of options.
void setPerformRanking(boolean b)
Set whether an initial ranking is performed to select the top-ranked attributes.
void setSeed(int s)
Set the seed for cross validation subset size determination.
void setSubsetSizeEvaluator(ASEvaluation eval)
Set the subset evaluator to use for subset size determination.
void setType(SelectedTag t)
Set the Linear Forward Selection type.
void setVerbose(boolean b)
Set whether verbose output should be generated.
java.lang.String subsetSizeEvaluatorTipText()
Returns the tip text for this property.
java.lang.String toString()
Returns a description of the search as a String.
java.lang.String typeTipText()
Returns the tip text for this property.
java.lang.String verboseTipText()
Returns the tip text for this property.
Methods inherited from class weka.attributeSelection.ASSearch
forName, makeCopies
-
-
-
-
Field Detail
-
TAGS_TYPE
public static final Tag[] TAGS_TYPE
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing this search method.
- Returns:
- a description of the search method suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Returns:
- the technical information about this class
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-I Perform initial ranking to select top-ranked attributes.
-K <num> Number of top-ranked attributes that are taken into account.
-T <0 = fixed-set | 1 = fixed-width> Type of Linear Forward Selection (default = 0).
-S <num> Size of lookup cache for evaluated subsets. Expressed as a multiple of the number of attributes in the data set (default = 1).
-E <subset evaluator> Class name of the subset evaluator to use for subset size determination (default = null, in which case the same subset evaluator as for ranking and final forward selection is used). Place any evaluator options LAST on the command line following a "--", e.g. -E weka.attributeSelection.ClassifierSubsetEval ... -- -M
-F <num> Number of cross validation folds for subset size determination (default = 5).
-R <num> Seed for cross validation subset size determination (default = 1).
-Z verbose on/off.
- Specified by:
- setOptions in interface OptionHandler
- Parameters:
- options - the list of options as an array of strings
- Throws:
- java.lang.Exception - if an option is not supported
-
setLookupCacheSize
public void setLookupCacheSize(int size)
Set the maximum size of the evaluated subset cache (hashtable). This is expressed as a multiplier for the number of attributes in the data set (default = 1).
- Parameters:
- size - the maximum size of the hashtable
-
getLookupCacheSize
public int getLookupCacheSize()
Return the maximum size of the evaluated subset cache (expressed as a multiplier for the number of attributes in a data set).
- Returns:
- the maximum size of the hashtable.
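To make the "multiplier" wording concrete, here is a minimal sketch of a bounded cache whose capacity is the lookup cache size times the number of attributes. It is an assumption-laden illustration, not Weka's cache: the class SubsetMeritCache and the helper key are hypothetical, subsets are keyed by java.util.BitSet, and evicting the oldest entry first is an assumed policy that the documentation above does not specify.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class SubsetMeritCache extends LinkedHashMap<BitSet, Double> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    // Capacity = multiplier * number of attributes, e.g. 1 * 3 = 3 entries.
    public SubsetMeritCache(int lookupCacheSize, int numAttributes) {
        super(16, 0.75f, false); // false = keep insertion order
        this.maxEntries = lookupCacheSize * numAttributes;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the
    // oldest entry (assumed eviction policy for this sketch).
    @Override
    protected boolean removeEldestEntry(Map.Entry<BitSet, Double> eldest) {
        return size() > maxEntries;
    }

    // Encodes a set of attribute indices as a BitSet usable as a map key.
    public static BitSet key(int... attributeIndices) {
        BitSet bits = new BitSet();
        for (int i : attributeIndices) {
            bits.set(i);
        }
        return bits;
    }

    public static void main(String[] args) {
        SubsetMeritCache cache = new SubsetMeritCache(1, 3);
        cache.put(key(0), 0.1);
        cache.put(key(1), 0.2);
        cache.put(key(2), 0.3);
        cache.put(key(0, 1), 0.4); // fourth entry evicts the first one
        System.out.println(cache.size());              // prints 3
        System.out.println(cache.containsKey(key(0))); // prints false
    }
}
```

A larger multiplier trades memory for fewer repeated evaluations of the same subset, which is the usual reason to cache merits during a wrapper search.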
-
lookupCacheSizeTipText
public java.lang.String lookupCacheSizeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
performRankingTipText
public java.lang.String performRankingTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setPerformRanking
public void setPerformRanking(boolean b)
Set whether an initial ranking is performed to select the top-ranked attributes.
- Parameters:
- b - true if initial ranking should be performed
-
getPerformRanking
public boolean getPerformRanking()
Get whether an initial ranking is performed to select the top-ranked attributes.
- Returns:
- true if initial ranking should be performed
-
numUsedAttributesTipText
public java.lang.String numUsedAttributesTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumUsedAttributes
public void setNumUsedAttributes(int k) throws java.lang.Exception
Set the number of top-ranked attributes that are taken into account by the search process.
- Parameters:
- k - the number of attributes
- Throws:
- java.lang.Exception - if k is less than 2
-
getNumUsedAttributes
public int getNumUsedAttributes()
Get the number of top-ranked attributes that are taken into account by the search process.
- Returns:
- the number of top-ranked attributes that are taken into account
-
typeTipText
public java.lang.String typeTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setType
public void setType(SelectedTag t)
Set the Linear Forward Selection type.
- Parameters:
- t - the Linear Forward Selection type
-
getType
public SelectedTag getType()
Get the Linear Forward Selection type.
- Returns:
- the Linear Forward Selection type
-
subsetSizeEvaluatorTipText
public java.lang.String subsetSizeEvaluatorTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setSubsetSizeEvaluator
public void setSubsetSizeEvaluator(ASEvaluation eval) throws java.lang.Exception
Set the subset evaluator to use for subset size determination.
- Parameters:
- eval - the subset evaluator to use for subset size determination
- Throws:
- java.lang.Exception
-
getSubsetSizeEvaluator
public ASEvaluation getSubsetSizeEvaluator()
Get the subset evaluator used for subset size determination.
- Returns:
- the evaluator used for subset size determination.
-
numSubsetSizeCVFoldsTipText
public java.lang.String numSubsetSizeCVFoldsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumSubsetSizeCVFolds
public void setNumSubsetSizeCVFolds(int f)
Set the number of cross validation folds for subset size determination (default = 5).
- Parameters:
- f - number of folds
-
getNumSubsetSizeCVFolds
public int getNumSubsetSizeCVFolds()
Get the number of cross validation folds for subset size determination (default = 5).
- Returns:
- number of folds
-
seedTipText
public java.lang.String seedTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setSeed
public void setSeed(int s)
Set the seed for cross validation subset size determination (default = 1).
- Parameters:
- s - seed
-
getSeed
public int getSeed()
Get the seed for cross validation subset size determination (default = 1).
- Returns:
- seed
-
verboseTipText
public java.lang.String verboseTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setVerbose
public void setVerbose(boolean b)
Set whether verbose output should be generated.
- Parameters:
- b - true if output is to be verbose
-
getVerbose
public boolean getVerbose()
Get whether output is to be verbose.
- Returns:
- true if output will be verbose
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of SubsetSizeForwardSelection.
- Specified by:
- getOptions in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions()
-
toString
public java.lang.String toString()
Returns a description of the search as a String.
- Overrides:
- toString in class java.lang.Object
- Returns:
- a description of the search
-
search
public int[] search(ASEvaluation ASEval, Instances data) throws java.lang.Exception
Searches the attribute subset space by subset size forward selection
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class ASSearch
- Returns:
- the revision
-
-