Class GoTerms

java.lang.Object
org.snpeff.geneOntology.GoTerms
All Implemented Interfaces:
Serializable, Iterable<GoTerm>

public class GoTerms extends Object implements Iterable<GoTerm>, Serializable
A collection of GO terms
Author:
Pablo Cingolani
  • Field Details

    • debug

      public static boolean debug
    • verbose

      public static boolean verbose
  • Constructor Details

    • GoTerms

      public GoTerms()
      Default constructor
    • GoTerms

      public GoTerms(String oboFile, String nameSpace, String interestingGenesFile, String geneAssocFile, boolean removeObsolete, boolean useGeneId)
      Constructor
      Parameters:
      oboFile - Path to the OBO description file
      nameSpace - Can be null for "all namespaces"
      interestingGenesFile - Path to a file containing a list of 'interesting' genes (one gene name per line)
      geneAssocFile - A file containing lines like: "GOterm \t gene_product_id \t gene_name \n"
  • Method Details

    • add

      public GoTerm add(GoTerm goTerm)
      Add a GOTerm (if not already in this GOTerms). WARNING: Creates 'fake' symbolNames based on symbolIds. This method is used mostly for testing / debugging.
    • addInterestingSymbol

      public void addInterestingSymbol(String symbolId, int rank, HashSet<String> noGoTermFound)
      Add a symbol as an 'interesting' symbol (to every corresponding GOTerm in this set)
      Parameters:
      symbolId - Symbol's ID
      rank - Symbol's rank
      noGoTermFound - The symbol is added to this set if no GOTerms are associated with it
    • addSymbolId

      public boolean addSymbolId(GoTerm goTerm, String symbolId)
      Add a symbolId (as well as all needed mappings)
      Parameters:
      goTerm - GOTerm to which the symbol is added
      symbolId - Symbol's ID
      Returns:
      true if OK, false on error (GOTerm not found)
    • addSymbolsFromChilds

      public void addSymbolsFromChilds()
      Use symbols from children in the DAG. For every GOTerm, each child's symbols are added to the term, so that the root term contains every symbol and every interestingSymbol.
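The propagation described for addSymbolsFromChilds can be illustrated with a minimal, self-contained sketch. It does not use the GoTerm class: `Node`, `propagate` and the accessions below are hypothetical names introduced only for this example, and the library's actual traversal may differ.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class PropagateSymbolsSketch {
    /** Hypothetical stand-in for a GO term: an accession, its symbols, and its children. */
    static class Node {
        final String acc;
        final Set<String> symbols = new HashSet<>();
        final List<Node> children = new ArrayList<>();

        Node(String acc, String... symbols) {
            this.acc = acc;
            for (String s : symbols) this.symbols.add(s);
        }
    }

    /** Adds every descendant's symbols to 'node'; 'done' keeps shared children from being processed twice. */
    static Set<String> propagate(Node node, Set<Node> done) {
        if (done.add(node)) {
            for (Node child : node.children) node.symbols.addAll(propagate(child, done));
        }
        return node.symbols;
    }

    public static void main(String[] args) {
        Node leaf = new Node("GO:C", "g3");
        Node a = new Node("GO:A", "g1");
        Node b = new Node("GO:B", "g2");
        Node root = new Node("GO:ROOT");
        a.children.add(leaf);
        b.children.add(leaf); // shared child: the structure is a DAG, not a tree
        root.children.add(a);
        root.children.add(b);
        propagate(root, new HashSet<>());
        System.out.println(new TreeSet<>(root.symbols)); // prints [g1, g2, g3]
    }
}
```

The same pass would be run for interesting symbols as well, since the description states that the root ends up with every symbol and every interestingSymbol.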
    • allSymbols

      public Set<String> allSymbols()
      Create a set with all the symbols
    • checkInterestingSymbolIds

      public void checkInterestingSymbolIds(Set<String> interestingSymbolIds)
      Checks that every symbolId is in the set (as 'interesting' symbols). Throws an exception on error.
      Parameters:
      interestingSymbolIds - A set of interesting symbols
    • disjointSet

      public GoTerm disjointSet(List<GoTerm> goTermList, int activeSets)
      Produce a GOTerm based on a list of GOTerms and a 'mask'
      Parameters:
      goTermList - A list of GOTerms
      activeSets - An integer (binary mask) that specifies whether each set in the list should be taken into account or not. The operation performed is: Intersection{ GOTerms where mask_bit == 1 } - Union{ GOTerms where mask_bit == 0 }, where the minus sign '-' is a 'set minus' operation. This operation is done for both sets in GOTerm (i.e. symbolIds and interestingSymbolIds)
      Returns:
      A GOTerm
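The mask operation described for disjointSet can be sketched on plain string sets. This is an illustration only: `DisjointSetSketch` and `disjoint` are hypothetical names, and two points are assumptions not stated in the documentation, namely that bit i of the mask corresponds to element i of the list, and that a mask with no active bit yields an empty set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class DisjointSetSketch {
    /** Intersection{ sets with mask bit 1 } minus Union{ sets with mask bit 0 }. */
    static Set<String> disjoint(List<Set<String>> sets, int activeSets) {
        Set<String> intersection = null;
        Set<String> union = new HashSet<>();
        for (int i = 0; i < sets.size(); i++) {
            // Assumption: bit i of the mask refers to sets.get(i)
            boolean active = ((activeSets >> i) & 1) == 1;
            if (active) {
                if (intersection == null) intersection = new HashSet<>(sets.get(i));
                else intersection.retainAll(sets.get(i));
            } else {
                union.addAll(sets.get(i));
            }
        }
        // Assumption: with no active set, the result is empty
        if (intersection == null) return new HashSet<>();
        intersection.removeAll(union); // the 'set minus' step
        return intersection;
    }

    public static void main(String[] args) {
        List<Set<String>> sets = new ArrayList<>();
        sets.add(new HashSet<>(Arrays.asList("g1", "g2", "g3")));
        sets.add(new HashSet<>(Arrays.asList("g2", "g3", "g4")));
        sets.add(new HashSet<>(Arrays.asList("g3", "g5")));
        // Sets 0 and 1 active, set 2 inactive: {g2, g3} minus {g3, g5}
        System.out.println(new TreeSet<>(disjoint(sets, 0b011))); // prints [g2]
    }
}
```

According to the description, the same operation is applied twice per call: once to the symbolIds of the listed GOTerms and once to their interestingSymbolIds.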
    • getGoTerm

      public GoTerm getGoTerm(String goTermAcc)
    • getGoTermsByGoTermAcc

      public HashMap<String,GoTerm> getGoTermsByGoTermAcc()
    • getGoTermsBySymbolId

      public HashMap<String,Set<GoTerm>> getGoTermsBySymbolId()
    • getGoTermsBySymbolId

      public Set<GoTerm> getGoTermsBySymbolId(String symbolId)
    • getInterestingSymbolIdsSet

      public HashSet<String> getInterestingSymbolIdsSet()
    • getInterestingSymbolIdsSize

      public int getInterestingSymbolIdsSize()
    • getLabel

      public String getLabel()
    • getMaxRank

      public int getMaxRank()
    • getNameSpace

      public String getNameSpace()
    • getRank

      public int getRank(String symbolId)
      Get symbol's rank
      Parameters:
      symbolId - Symbol's ID
      Returns:
      The symbol's rank
    • getRankSymbolId

      public HashMap<String,Integer> getRankSymbolId()
    • iterator

      public Iterator<GoTerm> iterator()
      Iterate through each GOterm in this GOTerms
      Specified by:
      iterator in interface Iterable<GoTerm>
    • keySet

      public Set<String> keySet()
    • levels

      public int levels()
      Calculate each node's level (in DAG)
      Returns:
      maximum level
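One possible way to compute levels in a DAG is sketched below. The documentation does not define "level", so this example makes its own assumptions: root nodes are at level 0, and a node's level is the length of the longest path from any root. `LevelsSketch`, `levels` and the map-based graph are hypothetical and unrelated to the library's internals.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LevelsSketch {
    /** children.get(n) lists the children of node n. Returns each reachable node's level. */
    static Map<String, Integer> levels(Map<String, List<String>> children, Collection<String> roots) {
        Map<String, Integer> level = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String r : roots) {
            level.put(r, 0); // assumption: roots are level 0
            queue.add(r);
        }
        while (!queue.isEmpty()) {
            String n = queue.poll();
            int candidate = level.get(n) + 1;
            for (String c : children.getOrDefault(n, Collections.<String>emptyList())) {
                // Keep the deepest level seen so far; terminates because a DAG has no cycles
                if (candidate > level.getOrDefault(c, -1)) {
                    level.put(c, candidate);
                    queue.add(c);
                }
            }
        }
        return level;
    }

    public static void main(String[] args) {
        Map<String, List<String>> children = new HashMap<>();
        children.put("root", Arrays.asList("a", "b"));
        children.put("a", Arrays.asList("b"));
        Map<String, Integer> level = levels(children, Arrays.asList("root"));
        System.out.println(Collections.max(level.values())); // prints 2
    }
}
```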
    • listTopTerms

      public List<GoTerm> listTopTerms(int numberToSelect)
      Select a number of GOTerms
      Parameters:
      numberToSelect - Number of GOTerms to select
      Returns:
      List of the selected GOTerms
    • numberOfInterestingSymbols

      public int numberOfInterestingSymbols()
      Calculate how many interesting symbol-IDs there are in all these GOTerms
      Returns:
      Number of interesting symbols
    • numberOfNodes

      public int numberOfNodes()
      Number of nodes in this DAG
      Returns:
      Number of nodes
    • numberOfNodesWithOneInterestingSymbol

      public int numberOfNodesWithOneInterestingSymbol()
      Calculate the number of nodes that have at least one interesting symbol
      Returns:
      Number of nodes with at least one interesting symbol
    • numberOfNodesWithOneSymbol

      public int numberOfNodesWithOneSymbol()
      Calculate the number of nodes that have at least one annotated symbol
      Returns:
      Number of nodes with at least one annotated symbol
    • numberOfSymbols

      public int numberOfSymbols()
      Calculate how many symbol-IDs there are in all these GOTerms
      Returns:
      Number of symbols
    • readGeneAssocFile

      public void readGeneAssocFile(String goGenesFile, boolean useGeneId)
      Reads a file containing every gene (names and IDs) associated with GO terms
      Parameters:
      goGenesFile - A file containing gene associations to GO terms
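A minimal sketch of parsing one line of such a file, using the "GOterm \t gene_product_id \t gene_name" layout given for the constructor's geneAssocFile parameter. `GeneAssocLineSketch` and `parse` are hypothetical names. The meaning of useGeneId is not documented here; the sketch assumes it selects the gene_product_id column instead of the gene_name column as the symbol ID.

```java
class GeneAssocLineSketch {
    /** Returns {goTermAcc, symbolId}, or null if the line has fewer than three tab-separated fields. */
    static String[] parse(String line, boolean useGeneId) {
        String[] fields = line.split("\t", -1);
        if (fields.length < 3) return null;
        // Assumption: useGeneId picks column 2 (gene_product_id), otherwise column 3 (gene_name)
        String symbolId = useGeneId ? fields[1].trim() : fields[2].trim();
        return new String[] { fields[0].trim(), symbolId };
    }

    public static void main(String[] args) {
        String line = "GO:0005739\tP12345\tABC1"; // made-up example record
        System.out.println(parse(line, true)[1]);  // prints P12345
        System.out.println(parse(line, false)[1]); // prints ABC1
    }
}
```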
    • readInterestingSymbolIdsFile

      public void readInterestingSymbolIdsFile(String fileName)
      Reads a file with a list of 'interesting' genes (one per line)
      Parameters:
      fileName - Can be "-" for no file
    • readOboFile

      public void readOboFile(String oboFile, boolean removeObsolete)
      Read an OBO file
      Parameters:
      oboFile - Path to the OBO file
    • removeGOTerm

      public void removeGOTerm(String goTermAcc)
      Remove a GOTerm
    • resetInterestingSymbolIds

      public void resetInterestingSymbolIds()
      Reset every 'interesting' symbolId (on every single GOTerm in this GOTerms)
    • rootNodes

      public Set<GoTerm> rootNodes()
    • saveGseaGeneSets

      public void saveGseaGeneSets(String fileName)
      Save a gene sets file for GSEA analysis.
      Format specification: http://www.broad.mit.edu/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29
      Parameters:
      fileName - Output file name
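In the GMT format referenced above, each line describes one gene set: the set name, a description, and then the genes, all separated by tabs. The sketch below formats a single line; `GmtLineSketch` and `gmtLine` are hypothetical names, and which GoTerm fields the library writes into the name and description columns is not specified here.

```java
import java.util.Arrays;
import java.util.Collection;

class GmtLineSketch {
    /** Builds one GMT record: name TAB description TAB gene1 TAB gene2 ... */
    static String gmtLine(String name, String description, Collection<String> genes) {
        StringBuilder sb = new StringBuilder(name).append('\t').append(description);
        for (String gene : genes) sb.append('\t').append(gene);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up gene names, for illustration only
        System.out.println(gmtLine("GO:0000001", "mitochondrion inheritance", Arrays.asList("geneA", "geneB")));
    }
}
```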
    • setLabel

      public void setLabel(String label)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • values

      public Collection<GoTerm> values()