Package com.sun.speech.freetts
Interface Tokenizer
All Known Implementing Classes: TokenizerImpl
public interface Tokenizer
Chops a string or text file into Token instances.
-
Method Summary
Modifier and Type | Method | Description
String | getErrorDescription() | If hasErrors returns true, returns a description of the error encountered.
Token | getNextToken() | Returns the next token.
boolean | hasErrors() | Returns true if there were errors while reading tokens.
boolean | hasMoreTokens() | Returns true if there are more tokens, false otherwise.
boolean | isBreak() | Determines if the current token should start a new sentence.
void | setInputReader(Reader reader) | Sets the input reader.
void | setInputText(String textToTokenize) | Sets the text to be tokenized by this tokenizer.
void | setPostpunctuationSymbols(String symbols) | Sets the postpunctuation symbols of this Tokenizer to the given symbols.
void | setPrepunctuationSymbols(String symbols) | Sets the prepunctuation symbols of this Tokenizer to the given symbols.
void | setSingleCharSymbols(String symbols) | Sets the single character symbols of this Tokenizer to the given symbols.
void | setWhitespaceSymbols(String symbols) | Sets the whitespace symbols of this Tokenizer to the given symbols.
-
Method Details
-
setInputText
void setInputText(String textToTokenize)
Sets the text to be tokenized by this tokenizer.
Parameters:
textToTokenize - the text to tokenize
-
setInputReader
void setInputReader(Reader reader)
Sets the input reader.
Parameters:
reader - the input source
-
getNextToken
Token getNextToken()
Returns the next token.
Returns:
the next token if it exists; otherwise null
-
hasMoreTokens
boolean hasMoreTokens()
Returns true if there are more tokens, false otherwise.
Returns:
true if there are more tokens; otherwise false
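The hasMoreTokens / getNextToken pair suggests the usual drain loop. The sketch below illustrates that pattern only. TokenSource and whitespaceSource are hypothetical stand-ins written for this example, not part of FreeTTS, and the stand-in returns String where the real getNextToken returns a Token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class TokenLoopSketch {

    /** Hypothetical stand-in exposing only the two documented iteration methods. */
    public interface TokenSource {
        boolean hasMoreTokens();
        String getNextToken(); // assumption: the real interface returns a Token here
    }

    /** Toy source that splits on runs of whitespace; this is not TokenizerImpl. */
    public static TokenSource whitespaceSource(String text) {
        String trimmed = text.trim();
        List<String> words = trimmed.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(trimmed.split("\\s+"));
        final Iterator<String> it = words.iterator();
        return new TokenSource() {
            @Override public boolean hasMoreTokens() { return it.hasNext(); }
            @Override public String getNextToken() { return it.hasNext() ? it.next() : null; }
        };
    }

    /** Drains a source with the hasMoreTokens / getNextToken pattern. */
    public static List<String> drain(TokenSource source) {
        List<String> out = new ArrayList<>();
        while (source.hasMoreTokens()) {
            String token = source.getNextToken();
            if (token == null) {
                break; // defensive: getNextToken is documented to return null when nothing is left
            }
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(whitespaceSource("hello brave new world")));
    }
}
```

The null check inside the loop is redundant for this toy source, but it mirrors the documented contract that getNextToken returns null when no token exists.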
-
hasErrors
boolean hasErrors()
Returns true if there were errors while reading tokens.
Returns:
true if there were errors; otherwise false
-
getErrorDescription
String getErrorDescription()
If hasErrors returns true, returns a description of the error encountered. Otherwise returns null.
Returns:
a description of the last error that occurred, or null if there were no errors
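Because errors are reported through hasErrors and getErrorDescription rather than through exceptions, a caller is expected to check them explicitly. The following is a minimal sketch of that check. ErrorReporting, ReaderBackedSource, readFrom, and failingReader are hypothetical names invented for this example. The toy source reads its Reader eagerly, which is an assumption and may not match how a real implementation consumes input.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class TokenizerErrorSketch {

    /** Hypothetical stand-in mirroring the two documented error-reporting methods. */
    public interface ErrorReporting {
        boolean hasErrors();
        String getErrorDescription();
    }

    /** Toy reader-backed source that records an IOException instead of throwing it. */
    public static class ReaderBackedSource implements ErrorReporting {
        private final StringBuilder text = new StringBuilder();
        private String errorDescription; // stays null while no error has occurred

        /** Reads everything from the reader (eager reading is an assumption of this sketch). */
        public void readFrom(Reader reader) {
            try {
                int c;
                while ((c = reader.read()) != -1) {
                    text.append((char) c);
                }
            } catch (IOException e) {
                errorDescription = String.valueOf(e.getMessage());
            }
        }

        @Override public boolean hasErrors() { return errorDescription != null; }
        @Override public String getErrorDescription() { return errorDescription; }
    }

    /** Checks hasErrors first, then consults getErrorDescription only on failure. */
    public static String report(ErrorReporting source) {
        return source.hasErrors() ? "failed: " + source.getErrorDescription() : "ok";
    }

    /** A Reader whose every read fails, used to exercise the error path. */
    public static Reader failingReader() {
        return new Reader() {
            @Override public int read(char[] buf, int off, int len) throws IOException {
                throw new IOException("simulated read failure");
            }
            @Override public void close() { }
        };
    }

    public static void main(String[] args) {
        ReaderBackedSource good = new ReaderBackedSource();
        good.readFrom(new StringReader("hello world"));
        System.out.println(report(good));

        ReaderBackedSource bad = new ReaderBackedSource();
        bad.readFrom(failingReader());
        System.out.println(report(bad));
    }
}
```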
-
setWhitespaceSymbols
void setWhitespaceSymbols(String symbols)
Sets the whitespace symbols of this Tokenizer to the given symbols.
Parameters:
symbols - the whitespace symbols
-
setSingleCharSymbols
void setSingleCharSymbols(String symbols)
Sets the single character symbols of this Tokenizer to the given symbols.
Parameters:
symbols - the single character symbols
-
setPrepunctuationSymbols
void setPrepunctuationSymbols(String symbols)
Sets the prepunctuation symbols of this Tokenizer to the given symbols.
Parameters:
symbols - the prepunctuation symbols
-
setPostpunctuationSymbols
void setPostpunctuationSymbols(String symbols)
Sets the postpunctuation symbols of this Tokenizer to the given symbols.
Parameters:
symbols - the postpunctuation symbols
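Each symbol setter takes a String that acts as a set of characters. As a rough illustration of how prepunctuation and postpunctuation sets could be applied, the sketch below peels leading and trailing punctuation off a raw chunk of text. This is an assumption about the general idea, not the algorithm used by TokenizerImpl. The class name, the split helper, and the symbol strings in main are all hypothetical.

```java
public class PunctuationSplitSketch {

    /**
     * Splits a raw chunk into { prepunctuation, word, postpunctuation }.
     * Each symbol string is treated as a set of single characters.
     */
    public static String[] split(String raw, String prepunctuation, String postpunctuation) {
        int start = 0;
        // Consume leading characters that belong to the prepunctuation set.
        while (start < raw.length() && prepunctuation.indexOf(raw.charAt(start)) >= 0) {
            start++;
        }
        int end = raw.length();
        // Consume trailing characters that belong to the postpunctuation set.
        while (end > start && postpunctuation.indexOf(raw.charAt(end - 1)) >= 0) {
            end--;
        }
        return new String[] {
            raw.substring(0, start),
            raw.substring(start, end),
            raw.substring(end)
        };
    }

    public static void main(String[] args) {
        // Illustrative symbol sets chosen for this example only.
        String[] parts = split("\"Hello,\"", "\"'(", "\"'.,!?)");
        System.out.println("pre=" + parts[0] + " word=" + parts[1] + " post=" + parts[2]);
    }
}
```

A chunk made up entirely of prepunctuation characters ends up wholly in the first slot, because the leading scan runs before the trailing one.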
-
isBreak
boolean isBreak()
Determines if the current token should start a new sentence.
Returns:
true if a new sentence should be started
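One way a caller might use isBreak is to group tokens into sentences while draining the tokenizer. The sketch below assumes isBreak is consulted immediately after getNextToken and refers to the token just returned. BreakAware and ToySource are hypothetical stand-ins for this example. The toy rule, a break after a token ending in '.', '!' or '?', is an assumption and not the rule FreeTTS applies.

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceBreakSketch {

    /** Hypothetical stand-in with the three documented methods this example needs. */
    public interface BreakAware {
        boolean hasMoreTokens();
        String getNextToken(); // assumption: the real interface returns a Token
        boolean isBreak();
    }

    /** Toy source: whitespace-separated words, sentence break after '.', '!' or '?'. */
    public static class ToySource implements BreakAware {
        private final String[] words;
        private int next = 0;

        public ToySource(String text) {
            String trimmed = text.trim();
            words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        @Override public boolean hasMoreTokens() { return next < words.length; }

        @Override public String getNextToken() {
            return hasMoreTokens() ? words[next++] : null;
        }

        /** True if the token returned last follows a token that ended a sentence. */
        @Override public boolean isBreak() {
            if (next < 2) {
                return false; // no previous token to end a sentence
            }
            String previous = words[next - 2];
            char last = previous.charAt(previous.length() - 1);
            return last == '.' || last == '!' || last == '?';
        }
    }

    /** Starts a new group whenever isBreak reports that the current token opens a sentence. */
    public static List<List<String>> groupIntoSentences(BreakAware source) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        while (source.hasMoreTokens()) {
            String token = source.getNextToken();
            if (source.isBreak() && !current.isEmpty()) {
                sentences.add(current);
                current = new ArrayList<>();
            }
            current.add(token);
        }
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        System.out.println(groupIntoSentences(new ToySource("Hello there. How are you? Fine.")));
    }
}
```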
-