Package edu.msu.cme.rdp.classifier
Class Classifier
java.lang.Object
edu.msu.cme.rdp.classifier.Classifier
Classifies query sequences.
-
Field Summary
static final int MAX_SEQ_LEN
static final int MIN_BOOTSTRSP_WORDS
static final int MIN_GOOD_WORDS
static final int MIN_SEQ_LEN
    The minimum number of bases per sequence.
-
Method Summary
void addConfidence(HierarchyTree node, HashMap map)
    Increases the count of the RankAssignment in the map if it matches the node or any ancestor of the node.
classify(ClassifierSequence seq, int min_bootstrap_words)
    Takes a query sequence and returns the classification result.
ClassificationResult classify(edu.msu.cme.rdp.readseq.readers.Sequence seq)
    Takes a query sequence and returns the classification result.
-
Field Details
-
MIN_SEQ_LEN
public static final int MIN_SEQ_LEN
The minimum number of bases per sequence. Initially set to 200.
-
MAX_SEQ_LEN
public static final int MAX_SEQ_LEN
-
MIN_GOOD_WORDS
public static final int MIN_GOOD_WORDS
-
MIN_BOOTSTRSP_WORDS
public static final int MIN_BOOTSTRSP_WORDS
-
-
Method Details
-
getTrainRank
-
classify
public ClassificationResult classify(edu.msu.cme.rdp.readseq.readers.Sequence seq) throws IOException
Takes a query sequence and returns the classification result. Each query sequence is first assigned to a genus node using all of its words. One-eighth of the overlapping words in the query are then chosen at random to calculate the joint probability, once per bootstrap trial. The number of trials in which a genus is selected, out of the total number of bootstrap trials, is used as an estimate of confidence in the assignment to that genus.
Throws:
ShortSequenceException - if the sequence length is less than the minimum sequence length.
IOException
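The bootstrap procedure described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the library's implementation: the class name BootstrapConfidenceSketch, the word size of 8, the 100 trials, sampling with replacement, and the shared-word score that stands in for the joint probability calculation are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Illustrative sketch only; none of these names come from edu.msu.cme.rdp.classifier.
class BootstrapConfidenceSketch {
    static final int WORD_SIZE = 8;    // assumed word length
    static final int NUM_TRIALS = 100; // assumed number of bootstrap trials

    /** Returns every overlapping word of length WORD_SIZE in the sequence. */
    static List<String> overlappingWords(String seq) {
        List<String> words = new ArrayList<>();
        for (int i = 0; i + WORD_SIZE <= seq.length(); i++) {
            words.add(seq.substring(i, i + WORD_SIZE));
        }
        return words;
    }

    /** Toy stand-in for the trained model: the genus whose reference contains the most words wins. */
    static String bestGenus(List<String> words, Map<String, String> references) {
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, String> ref : references.entrySet()) {
            int score = 0;
            for (String w : words) {
                if (ref.getValue().contains(w)) score++;
            }
            if (score > bestScore) {
                bestScore = score;
                best = ref.getKey();
            }
        }
        return best;
    }

    /** Fraction of bootstrap trials that pick the same genus as the assignment made with all words. */
    static double confidence(String query, Map<String, String> references, long seed) {
        List<String> words = overlappingWords(query);
        if (words.isEmpty() || references.isEmpty()) {
            throw new IllegalArgumentException("need at least one word and one reference");
        }
        String assigned = bestGenus(words, references);   // step 1: use all words
        int sampleSize = Math.max(1, words.size() / 8);   // step 2: one-eighth of the words
        Random rng = new Random(seed);
        int hits = 0;
        for (int t = 0; t < NUM_TRIALS; t++) {
            List<String> sample = new ArrayList<>();
            for (int i = 0; i < sampleSize; i++) {
                sample.add(words.get(rng.nextInt(words.size()))); // assumed: with replacement
            }
            if (assigned.equals(bestGenus(sample, references))) hits++;
        }
        return (double) hits / NUM_TRIALS;
    }

    public static void main(String[] args) {
        Map<String, String> refs = new LinkedHashMap<>();
        refs.put("GenusA", "ACGTACGTTGCAGGCTAACGTTAGCCGATCGATTACG");
        refs.put("GenusB", "TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT");
        System.out.println(confidence("ACGTACGTTGCAGGCTAACGTTAGCCGATCGATTACG", refs, 42L));
    }
}
```

A fixed seed keeps the sketch reproducible; a query whose words all occur in one reference and in no other is selected in every trial, so its estimated confidence is the maximum.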
-
classify
-
classify
Takes a query sequence and returns the classification result. Each query sequence is first assigned to a genus node using all of its words. One-eighth of the overlapping words in the query are then chosen at random to calculate the joint probability, once per bootstrap trial. The number of trials in which a genus is selected, out of the total number of bootstrap trials, is used as an estimate of confidence in the assignment to that genus.
Throws:
ShortSequenceException - if the sequence length is less than the minimum sequence length.
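The length check behind this exception can be pictured as a simple guard. The sketch below is hypothetical: MinLengthGuardSketch, TooShortException, and requireMinLength are names invented for the example, the threshold of 200 is the initial value given for MIN_SEQ_LEN under Field Details, and counting every character of the string as a base is an assumption.

```java
// Illustrative sketch only; the library's own ShortSequenceException is not reproduced here.
class MinLengthGuardSketch {
    static final int MIN_SEQ_LEN = 200; // initial value stated under Field Details

    /** Local stand-in for the exception thrown when a query is too short. */
    static final class TooShortException extends RuntimeException {
        TooShortException(String message) {
            super(message);
        }
    }

    /** Throws if the sequence has fewer than MIN_SEQ_LEN characters; otherwise does nothing. */
    static void requireMinLength(String bases) {
        if (bases.length() < MIN_SEQ_LEN) {
            throw new TooShortException(
                "sequence has " + bases.length() + " bases, minimum is " + MIN_SEQ_LEN);
        }
    }

    public static void main(String[] args) {
        try {
            requireMinLength("ACGT");
        } catch (TooShortException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```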
-
addConfidence
Increases the count of the RankAssignment in the map if it matches the node or any ancestor of the node.
Parameters:
node -
map -
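One way to picture this update is a walk from the node up to the root, incrementing a counter for each entry already present in the map. The sketch below is an assumption-laden illustration: Node is a hypothetical stand-in for HierarchyTree, and a plain Map from node to count stands in for the map of RankAssignment entries, whose actual key and value types are not documented here.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; Node is a hypothetical stand-in for HierarchyTree.
class AncestorCountSketch {
    static final class Node {
        final String name;
        final Node parent; // null for the root

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Adds one to the count of the node and of each ancestor, but only for entries already in the map. */
    static void addConfidence(Node node, Map<Node, Integer> counts) {
        for (Node n = node; n != null; n = n.parent) {
            counts.computeIfPresent(n, (key, value) -> value + 1);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("Root", null);
        Node family = new Node("Family", root);
        Node genus = new Node("Genus", family);

        Map<Node, Integer> counts = new HashMap<>();
        counts.put(root, 0);
        counts.put(family, 0);
        counts.put(genus, 0);

        addConfidence(genus, counts); // one bootstrap trial that selected this genus
        System.out.println(counts.get(genus) + " " + counts.get(family) + " " + counts.get(root));
    }
}
```

Called once per bootstrap trial with the genus selected in that trial, a counter of this kind accumulates, for every rank on the path to the root, how often the trials agreed with that rank.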
-