Class AbstractWordTokenizer

java.lang.Object
com.swabunga.spell.event.AbstractWordTokenizer
All Implemented Interfaces:
WordTokenizer
Direct Known Subclasses:
FileWordTokenizer, StringWordTokenizer

public abstract class AbstractWordTokenizer extends Object implements WordTokenizer
This class tokenizes an input string.

It also allows the string to be mutated. Once spell checking is complete, the resulting text is available by calling getFinalText.
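
The iteration contract described here (ask whether more words exist, fetch the next word, then query its position) can be illustrated with a minimal sketch. SimpleTokenizerSketch is a hypothetical stand-in written for this example and is not part of the library: it treats any run of letters as a word, assumes the end index is exclusive, and throws IllegalStateException where the real class documents WordNotFoundException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not the library's implementation.
final class SimpleTokenizerSketch {
    private final String text;
    private int cursor = 0;       // where the search for the next word begins
    private int wordStart = -1;   // start of the current word, -1 until nextWord() is called
    private int wordEnd = -1;     // end of the current word (exclusive; an assumption)
    private int wordCount = 0;    // number of words returned so far

    SimpleTokenizerSketch(String text) {
        this.text = text;
    }

    // Index of the first letter at or after 'from', or -1 if none remains.
    private int findStart(int from) {
        for (int i = from; i < text.length(); i++) {
            if (Character.isLetter(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    boolean hasMoreWords() {
        return findStart(cursor) >= 0;
    }

    String nextWord() {
        int start = findStart(cursor);
        if (start < 0) {
            throw new IllegalStateException("no more words");
        }
        int end = start;
        while (end < text.length() && Character.isLetter(text.charAt(end))) {
            end++;
        }
        wordStart = start;
        wordEnd = end;
        cursor = end;
        wordCount++;
        return text.substring(start, end);
    }

    int getCurrentWordPosition() {
        requireCurrentWord();
        return wordStart;
    }

    int getCurrentWordEnd() {
        requireCurrentWord();
        return wordEnd;
    }

    int getCurrentWordCount() {
        return wordCount;
    }

    private void requireCurrentWord() {
        if (wordStart < 0) {
            throw new IllegalStateException("current word has not yet been set");
        }
    }

    // Walks the text and records each word as "word@start-end".
    static List<String> describe(String text) {
        SimpleTokenizerSketch tokenizer = new SimpleTokenizerSketch(text);
        List<String> out = new ArrayList<>();
        while (tokenizer.hasMoreWords()) {
            String word = tokenizer.nextWord();
            out.add(word + "@" + tokenizer.getCurrentWordPosition()
                    + "-" + tokenizer.getCurrentWordEnd());
        }
        return out;
    }
}
```

Calling the position accessors before the first nextWord() fails in the sketch, mirroring the documented behavior when the current word has not yet been set.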

Author:
Jason Height (jheight@chariot.net.au), Anthony Roy (ajr@antroy.co.uk)
  • Field Details

    • currentWord

      protected Word currentWord
      The word being analyzed
    • finder

      protected WordFinder finder
      The word finder used to filter out words that are not pertinent to spell checking
    • sentenceIterator

      protected BreakIterator sentenceIterator
      An iterator to work through the sentence
    • wordCount

      protected int wordCount
      The cumulative count of words that have been processed
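
The sentenceIterator field is a java.text.BreakIterator. As a rough sketch of how such an iterator can locate sentence starts, the example below uses the standard BreakIterator.getSentenceInstance factory. The class SentenceStartSketch and its method names are hypothetical, and whether this class configures or queries its iterator in the same way is an assumption.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of the documented class.
final class SentenceStartSketch {

    // Returns the start offset of every sentence in the text.
    static List<Integer> sentenceStarts(String text) {
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.US);
        sentences.setText(text);
        List<Integer> starts = new ArrayList<>();
        int start = sentences.first();
        for (int end = sentences.next();
                end != BreakIterator.DONE;
                start = end, end = sentences.next()) {
            starts.add(start);
        }
        return starts;
    }

    // True if a word beginning at wordStart sits on a sentence boundary.
    static boolean startsSentence(String text, int wordStart) {
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.US);
        sentences.setText(text);
        return sentences.isBoundary(wordStart);
    }
}
```

For the text "First one. Second one." the sketch reports sentence starts at offsets 0 and 11, so a word starting at offset 11 would count as starting a sentence while a word at offset 6 would not.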
  • Constructor Details

    • AbstractWordTokenizer

      public AbstractWordTokenizer(String text)
      Creates a new AbstractWordTokenizer object.
      Parameters:
      text - the text to process.
    • AbstractWordTokenizer

      public AbstractWordTokenizer(WordFinder wf)
      Creates a new AbstractWordTokenizer object.
      Parameters:
      wf - the custom WordFinder to use in searching for words.
  • Method Details

    • getCurrentWordCount

      public int getCurrentWordCount()
      Returns the current number of words that have been processed
      Specified by:
      getCurrentWordCount in interface WordTokenizer
      Returns:
      number of words so far iterated.
    • getCurrentWordEnd

      public int getCurrentWordEnd()
      Returns the end of the current word in the text
      Specified by:
      getCurrentWordEnd in interface WordTokenizer
      Returns:
      index in string of the end of the current word.
      Throws:
      WordNotFoundException - current word has not yet been set.
    • getCurrentWordPosition

      public int getCurrentWordPosition()
      Returns the index of the start of the current word in the text
      Specified by:
      getCurrentWordPosition in interface WordTokenizer
      Returns:
      index in string of the start of the current word.
      Throws:
      WordNotFoundException - current word has not yet been set.
    • hasMoreWords

      public boolean hasMoreWords()
      Returns true if there are more words that can be processed in the string
      Specified by:
      hasMoreWords in interface WordTokenizer
      Returns:
      true if there are further words in the text.
    • nextWord

      public String nextWord()
      Searches for the next word in the text and returns that word.
      Specified by:
      nextWord in interface WordTokenizer
      Returns:
      the string representing the current word.
      Throws:
      WordNotFoundException - search string contains no more words.
    • replaceWord

      public abstract void replaceWord(String newWord)
      Replaces the current word token
      Specified by:
      replaceWord in interface WordTokenizer
      Parameters:
      newWord - replacement word.
      Throws:
      WordNotFoundException - current word has not yet been set.
    • getContext

      public String getContext()
      Returns the current text that is being tokenized (includes any changes that have been made)
      Specified by:
      getContext in interface WordTokenizer
      Returns:
      the text being tokenized.
    • isNewSentence

      public boolean isNewSentence()
      Returns true if the current word is at the start of a sentence
      Specified by:
      isNewSentence in interface WordTokenizer
      Returns:
      true if the current word starts a sentence.
      Throws:
      WordNotFoundException - current word has not yet been set.
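
Because replaceWord is abstract, each subclass decides how the underlying text is updated. The following sketch shows one way a String-backed implementation might behave: it replaces the current word in a StringBuilder and moves the end of the current word so that iteration resumes after the replacement, even when the new word has a different length. ReplaceWordSketch is hypothetical, uses IllegalStateException in place of WordNotFoundException, and does not claim to reproduce what FileWordTokenizer or StringWordTokenizer actually do.

```java
// Hypothetical stand-in for illustration only; not the library's implementation.
final class ReplaceWordSketch {
    private final StringBuilder text;
    private int wordStart = -1;  // start of the current word, -1 until nextWord() is called
    private int wordEnd = -1;    // end of the current word (exclusive)

    ReplaceWordSketch(String text) {
        this.text = new StringBuilder(text);
    }

    // Index of the first letter at or after 'from', or -1 if none remains.
    private int findStart(int from) {
        for (int i = from; i < text.length(); i++) {
            if (Character.isLetter(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    boolean hasMoreWords() {
        return findStart(Math.max(wordEnd, 0)) >= 0;
    }

    String nextWord() {
        int start = findStart(Math.max(wordEnd, 0));
        if (start < 0) {
            throw new IllegalStateException("no more words");
        }
        int end = start;
        while (end < text.length() && Character.isLetter(text.charAt(end))) {
            end++;
        }
        wordStart = start;
        wordEnd = end;
        return text.substring(start, end);
    }

    // Replaces the current word and keeps the cursor just past the new word.
    void replaceWord(String newWord) {
        if (wordStart < 0) {
            throw new IllegalStateException("current word has not yet been set");
        }
        text.replace(wordStart, wordEnd, newWord);
        wordEnd = wordStart + newWord.length();
    }

    // The text being tokenized, including any replacements made so far.
    String getContext() {
        return text.toString();
    }

    // Replaces every occurrence of the word 'wrong' with 'right'.
    static String correct(String input, String wrong, String right) {
        ReplaceWordSketch tokenizer = new ReplaceWordSketch(input);
        while (tokenizer.hasMoreWords()) {
            if (tokenizer.nextWord().equals(wrong)) {
                tokenizer.replaceWord(right);
            }
        }
        return tokenizer.getContext();
    }
}
```

Adjusting the end offset after each replacement is the key design point: without it, a longer or shorter replacement would cause the next search to start inside the new word or skip past the following one.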