NonSparseToSparse |
An instance filter that converts all incoming instances into sparse format.
|
Normalize |
An instance filter that normalizes instances, considering only numeric attributes and ignoring the class index.
|
Randomize |
Randomly shuffles the order of instances passed through it.
|
RemoveFolds |
This filter takes a dataset and outputs a specified fold for cross-validation.
|
RemoveFrequentValues |
Determines which values (frequent or infrequent) of a nominal attribute are retained and filters the instances accordingly.
|
RemoveMisclassified |
A filter that removes instances which are incorrectly classified.
|
RemovePercentage |
A filter that removes a given percentage of a dataset.
|
RemoveRange |
A filter that removes a given range of instances of a dataset.
|
RemoveWithValues |
Filters instances according to the value of an attribute.
|
Resample |
Produces a random subsample of a dataset using either sampling with replacement or without replacement.
|
ReservoirSample |
Produces a random subsample of a dataset using the reservoir sampling Algorithm "R" by Vitter.
|
SparseToNonSparse |
An instance filter that converts all incoming sparse instances into non-sparse format.
|
SubsetByExpression |
Filters instances according to a user-specified expression.
Grammar:
  boolexpr_list ::= boolexpr_list boolexpr_part | boolexpr_part;
  boolexpr_part ::= boolexpr:e {: parser.setResult(e); :} ;
  boolexpr ::=    BOOLEAN
                | true
                | false
                | expr < expr
                | expr <= expr
                | expr > expr
                | expr >= expr
                | expr = expr
                | ( boolexpr )
                | not boolexpr
                | boolexpr and boolexpr
                | boolexpr or boolexpr
                | ATTRIBUTE is STRING
                ;
  expr ::=        NUMBER
                | ATTRIBUTE
                | ( expr )
                | opexpr
                | funcexpr
                ;
  opexpr ::=      expr + expr
                | expr - expr
                | expr * expr
                | expr / expr
                ;
  funcexpr ::=    abs ( expr )
                | sqrt ( expr )
                | log ( expr )
                | exp ( expr )
                | sin ( expr )
                | cos ( expr )
                | tan ( expr )
                | rint ( expr )
                | floor ( expr )
                | pow ( expr for base , expr for exponent )
                | ceil ( expr )
                ;
Notes:
  - NUMBER
    any integer or floating point number
    (but not in scientific notation!)
  - STRING
    any string surrounded by single quotes;
    the string may not contain a single quote though.
  - ATTRIBUTE
    the following placeholders are recognized for
    attribute values:
    - CLASS for the class value in case a class attribute is set.
    - ATTxyz with xyz a number from 1 to the number of attributes in the
      dataset, representing the value of the indexed attribute.
Examples:
  - extracting only mammals and birds from the 'zoo' UCI dataset:
      (CLASS is 'mammal') or (CLASS is 'bird')
  - extracting only animals with at least 2 legs from the 'zoo' UCI dataset:
      (ATT14 >= 2)
  - extracting only instances with non-missing 'wage-increase-second-year'
    from the 'labor' UCI dataset:
      not ismissing(ATT3)
|
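To make the SubsetByExpression semantics above concrete, here is a minimal Python sketch that evaluates such expressions against plain tuples. It is an illustration only and not the filter's actual implementation: the names `to_python` and `subset`, the toy rows, and the use of `None` for a missing value are all assumptions made for this example. It rewrites each token into Python (`ATTn` becomes a 1-based row lookup, `CLASS` the class value, `=` and `is` become `==`) and evaluates the result, so it is not suitable for untrusted input, and it does not define how comparisons behave on missing values.

```python
import math
import re

# Tokens: quoted strings, numbers (no scientific notation), identifiers,
# two-character comparators, then single-character symbols.
TOKEN = re.compile(r"\s*('[^']*'|\d+\.?\d*|\.\d+|[A-Za-z_]\w*|<=|>=|[<>=()+\-*/,])")

# Functions named in the grammar, plus ismissing from the examples.
FUNCS = {name: getattr(math, name) for name in
         ("sqrt", "log", "exp", "sin", "cos", "tan", "floor", "ceil", "pow")}
FUNCS["abs"] = abs
FUNCS["rint"] = round                        # rounds to the nearest integer
FUNCS["ismissing"] = lambda v: v is None     # assumption: None marks a missing value

KEYWORDS = {"=": "==", "is": "==", "true": "True", "false": "False",
            "CLASS": "row[class_index]"}


def to_python(expr):
    """Rewrite a filter expression into an equivalent Python expression string."""
    expr = expr.strip()
    out, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tok, pos = match.group(1), match.end()
        if tok in KEYWORDS:
            out.append(KEYWORDS[tok])
        elif re.fullmatch(r"ATT\d+", tok):
            out.append(f"row[{int(tok[3:]) - 1}]")   # ATT indices are 1-based
        else:
            out.append(tok)
    return " ".join(out)


def subset(rows, expr, class_index=None):
    """Return the rows for which the expression evaluates to true."""
    code = compile(to_python(expr), "<expr>", "eval")
    names = dict(FUNCS, class_index=class_index)
    return [row for row in rows
            if eval(code, {"__builtins__": {}}, dict(names, row=row))]


# Hypothetical toy rows (legs, some numeric value, class); not the real 'zoo' data.
rows = [(4, None, "mammal"), (2, 1.5, "bird"), (0, 3.0, "fish")]
mammals_and_birds = subset(rows, "(CLASS is 'mammal') or (CLASS is 'bird')", class_index=2)
non_missing = subset(rows, "not ismissing(ATT2)", class_index=2)
```

With these toy rows, the first call keeps the mammal and bird rows and the second keeps the two rows whose second attribute is present, mirroring the three documented examples on a smaller scale.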