Class AtomCache

java.lang.Object
org.biojava.nbio.structure.align.util.AtomCache

public class AtomCache extends Object
A utility class that provides easy access to Structure objects. If a script frequently reuses the same PDB structures, the AtomCache keeps an in-memory cache of the files for quicker access. The cache is a soft cache: it will not cause out-of-memory exceptions, because the Java virtual machine can garbage-collect the cached data when it needs to free up space. The AtomCache is thread-safe.
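The soft-cache idea described above can be illustrated with a minimal sketch built on java.lang.ref.SoftReference. This is not the AtomCache implementation: the class SoftCacheSketch, its get and size methods, and the loader callback are hypothetical names chosen for illustration, and the sketch assumes that loading the same key twice under contention is acceptable.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical illustration of a thread-safe soft cache; not BioJava code. */
final class SoftCacheSketch<K, V> {

    // Values are held through SoftReferences, so the JVM may clear them
    // under memory pressure rather than running out of heap.
    private final ConcurrentMap<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    /** Returns the cached value, or loads and caches it if absent or already collected. */
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Never cached, or reclaimed by the garbage collector: reload.
            // Two threads may both reload the same key; the last write wins.
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Number of keys currently tracked (some referents may already be cleared). */
    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, String> cache = new SoftCacheSketch<>();
        // The loader stands in for an expensive file download and parse.
        String first = cache.get("4HHB", id -> "parsed " + id);
        String second = cache.get("4HHB", id -> "parsed " + id);
        System.out.println("same instance reused: " + (first == second));
    }
}
```

While a caller still holds a strong reference to a value, the entry cannot be cleared; once no strong references remain, the entry becomes eligible for collection and the next lookup reloads it.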
Since:
3.0
Author:
Andreas Prlic, Spencer Bliven, Peter Rose
  • Constructor Details

    • AtomCache

      public AtomCache()
      Default AtomCache constructor. Usually stores files in a temp directory, but this can be overridden by setting the PDB_DIR variable at runtime.
    • AtomCache

      public AtomCache(String pdbFilePath)
      Creates an instance of an AtomCache that is pointed to a particular path in the file system. It will use the same value for pdbFilePath and cachePath.
      Parameters:
      pdbFilePath - a directory in the file system to use as a location to cache files.
    • AtomCache

      public AtomCache(String pdbFilePath, String cachePath)
      Creates an instance of an AtomCache that is pointed to a particular path in the file system.
      Parameters:
      pdbFilePath - a directory in the file system to use as a location to cache files.
      cachePath - a directory in the file system in which utility data, such as domain definitions, is cached.
    • AtomCache

      public AtomCache(UserConfiguration config)
      Creates a new AtomCache object based on the provided UserConfiguration.
      Parameters:
      config - the UserConfiguration to use for this cache.
  • Method Details

    • getAtoms

      public Atom[] getAtoms(String name) throws IOException, StructureException
      Returns the CA atoms for the provided name. See getStructure(String) for supported naming conventions.

      This method only works with protein chains. Use getRepresentativeAtoms(String) for a more general solution.

      Parameters:
      name - the structure name; see getStructure(String) for supported naming conventions.
      Returns:
      an array of Atoms.
      Throws:
      IOException
      StructureException
    • getAtoms

      public Atom[] getAtoms(StructureIdentifier name) throws IOException, StructureException
      Throws:
      IOException
      StructureException
    • getRepresentativeAtoms

      public Atom[] getRepresentativeAtoms(String name) throws IOException, StructureException
      Returns the representative atoms for the provided name. See getStructure(String) for supported naming conventions.
      Parameters:
      name - the structure name; see getStructure(String) for supported naming conventions.
      Returns:
      an array of Atoms.
      Throws:
      IOException
      StructureException
    • getRepresentativeAtoms

      public Atom[] getRepresentativeAtoms(StructureIdentifier name) throws IOException, StructureException
      Throws:
      IOException
      StructureException
    • getBiologicalAssembly

      public Structure getBiologicalAssembly(String pdbId, int bioAssemblyId, boolean multiModel) throws StructureException, IOException
      Returns the biological assembly for a given PDB ID and bioAssemblyId, by building the assembly from the biounit annotations found in Structure.getPDBHeader().

      Note that the number of available biological unit files varies. Many entries don't have a biological assembly specified (e.g. NMR structures), many entries have only one biological assembly (bioAssemblyId=1), and some structures have multiple biological assemblies.

      Parameters:
      pdbId - the PDB ID
      bioAssemblyId - the 1-based index of the biological assembly (0 gets the asymmetric unit)
      multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be like the original, with added chains that have renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
      Returns:
      a structure object
      Throws:
      IOException
      StructureException - if bioAssemblyId < 0 or if other problems occur while loading the structure
      Since:
      3.2
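The two output layouts selected by multiModel can be pictured with a small naming sketch. It only models chain identifiers as strings and does not call BioJava. The class AssemblyNamingSketch and both methods are hypothetical, and two details are assumptions not stated in this documentation: the ordering of the renamed chains, and that each model keeps the original chain IDs when multiModel is true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the chain naming described for multiModel; not BioJava code. */
final class AssemblyNamingSketch {

    /** multiModel=false: one flat chain list with IDs of the form originalAsymId_transformId. */
    static List<String> flatChainIds(List<String> asymIds, List<String> transformIds) {
        List<String> renamed = new ArrayList<>();
        for (String transformId : transformIds) {
            for (String asymId : asymIds) {
                renamed.add(asymId + "_" + transformId);
            }
        }
        return renamed;
    }

    /** multiModel=true: one model per transformId (assumed here to keep the original IDs). */
    static List<List<String>> modelChainIds(List<String> asymIds, List<String> transformIds) {
        List<List<String>> models = new ArrayList<>();
        for (int i = 0; i < transformIds.size(); i++) {
            models.add(new ArrayList<>(asymIds));
        }
        return models;
    }

    public static void main(String[] args) {
        List<String> asymIds = Arrays.asList("A", "B");
        List<String> transformIds = Arrays.asList("1", "2");
        System.out.println(flatChainIds(asymIds, transformIds));
        System.out.println(modelChainIds(asymIds, transformIds));
    }
}
```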
    • getBiologicalAssembly

      public Structure getBiologicalAssembly(PdbId pdbId, int bioAssemblyId, boolean multiModel) throws StructureException, IOException
      Returns the biological assembly for a given PDB ID and bioAssemblyId, by building the assembly from the biounit annotations found in Structure.getPDBHeader().

      Note that the number of available biological unit files varies. Many entries don't have a biological assembly specified (e.g. NMR structures), many entries have only one biological assembly (bioAssemblyId=1), and some structures have multiple biological assemblies.

      Parameters:
      pdbId - the PDB ID
      bioAssemblyId - the 1-based index of the biological assembly (0 gets the asymmetric unit)
      multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be like the original, with added chains that have renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
      Returns:
      a structure object
      Throws:
      IOException
      StructureException - if bioAssemblyId < 0 or if other problems occur while loading the structure
      Since:
      6.0.0
    • getBiologicalAssembly

      public Structure getBiologicalAssembly(String pdbId, boolean multiModel) throws StructureException, IOException
      Returns the default biological unit (bioassemblyId=1, known in PDB as pdb1.gz). If it is not available, the asymmetric unit will be returned, e.g. for NMR structures.

      Biological assemblies can also be accessed using getStructure("BIO:[pdbId]")

      Parameters:
      pdbId - the PDB id
      multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be like the original, with added chains that have renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
      Returns:
      a structure object
      Throws:
      IOException
      StructureException
      Since:
      4.2
    • getBiologicalAssemblies

      public List<Structure> getBiologicalAssemblies(String pdbId, boolean multiModel) throws StructureException, IOException
      Returns all biological assemblies for the given PDB ID.
      Parameters:
      pdbId - the PDB ID
      multiModel - if true, the output Structure will be a multi-model one with one transformId per model; if false, the output Structure will be like the original, with added chains that have renamed asymIds (in the form originalAsymId_transformId and originalAuthId_transformId).
      Returns:
      a list of Structure objects, one per biological assembly
      Throws:
      StructureException
      IOException
      Since:
      5.0
    • getCachePath

      public String getCachePath()
      Returns the path that contains the caching file for utility data, such as domain definitions.
      Returns:
    • getFileParsingParams

      public FileParsingParameters getFileParsingParams()
    • getPath

      public String getPath()
      Get the path that is used to cache PDB files.
      Returns:
      path to a directory
    • getStructure

      public Structure getStructure(String name) throws IOException, StructureException
      Request a Structure based on a name.
                      Formal specification for how to specify the name:
      
                      name     := pdbID
                                     | pdbID '.' chainID
                                     | pdbID '.' range
                                     | scopID
                      range         := '('? range (',' range)? ')'?
                                     | chainID
                                     | chainID '_' resNum '-' resNum
                      pdbID         := [1-9][a-zA-Z0-9]{3}
                                     | PDB_[a-zA-Z0-9]{8}
                      chainID       := [a-zA-Z0-9]
                      scopID        := 'd' pdbID [a-z_][0-9_]
                      resNum        := [-+]?[0-9]+[A-Za-z]?
      
      
                      Example structures:
                      1TIM                 #whole structure
                      4HHB.C               #single chain
                      4GCR.A_1-83          #one domain, by residue number
                      3AA0.A,B             #two chains treated as one structure
                      PDB_00001TIM         #whole structure (extended format)
                      PDB_00004HHB.C       #single chain (extended format)
                      PDB_00004GCR.A_1-83  #one domain, by residue number (extended format)
                      PDB_00003AA0.A,B     #two chains treated as one structure (extended format)
                      d2bq6a1              #scop domain
       
      With the additional set of rules:
      • If only a PDB code is provided, the whole structure will be returned, including ligands, but only the first model (for NMR).
      • Chain IDs are case sensitive, PDB ids are not. To specify a particular chain write as: 4hhb.A or 4HHB.A
      • To specify a SCOP domain write a scopId, e.g. d2bq6a1. Some flexibility can be allowed in SCOP domain names, see setStrictSCOP(boolean).
      • URLs are accepted as well

      Note that this method should not be used in StructureIdentifier implementations to avoid circular calls.

      Parameters:
      name - the structure name, written according to the specification above
      Returns:
      a Structure object, or null if name appears improperly formatted (e.g. too short)
      Throws:
      IOException - The PDB file cannot be cached due to IO errors
      StructureException - The name appeared valid but did not correspond to a structure. Also thrown by some submethods upon errors, e.g. for poorly formatted subranges.
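The name grammar given for getStructure(String) can be transcribed into regular expressions to see which branch a given name falls into. The following is a sketch only, not the parser BioJava uses: the class StructureNameSketch, the Kind enum, and classify are hypothetical names, and URLs (which the method also accepts) are deliberately reported as unrecognized here.

```java
import java.util.regex.Pattern;

/** Hypothetical classifier for the name grammar above; not BioJava's own parser. */
final class StructureNameSketch {

    enum Kind { WHOLE_STRUCTURE, CHAIN_OR_RANGE, SCOP_DOMAIN, UNRECOGNIZED }

    // Building blocks transcribed from the grammar.
    private static final String PDB_ID = "(?:[1-9][a-zA-Z0-9]{3}|PDB_[a-zA-Z0-9]{8})";
    private static final String CHAIN_ID = "[a-zA-Z0-9]";
    private static final String RES_NUM = "[-+]?[0-9]+[A-Za-z]?";
    // A single range: a chain, optionally restricted to resNum-resNum.
    private static final String ONE_RANGE =
            CHAIN_ID + "(?:_" + RES_NUM + "-" + RES_NUM + ")?";
    // One or more comma-separated ranges, optionally wrapped in parentheses.
    private static final String RANGES =
            "\\(?" + ONE_RANGE + "(?:," + ONE_RANGE + ")*\\)?";

    private static final Pattern WHOLE = Pattern.compile(PDB_ID);
    private static final Pattern SUBSET = Pattern.compile(PDB_ID + "\\." + RANGES);
    private static final Pattern SCOP = Pattern.compile("d" + PDB_ID + "[a-z_][0-9_]");

    /** Reports which grammar branch the whole name matches, if any. */
    static Kind classify(String name) {
        if (SCOP.matcher(name).matches()) return Kind.SCOP_DOMAIN;
        if (WHOLE.matcher(name).matches()) return Kind.WHOLE_STRUCTURE;
        if (SUBSET.matcher(name).matches()) return Kind.CHAIN_OR_RANGE;
        return Kind.UNRECOGNIZED;
    }

    public static void main(String[] args) {
        String[] examples = {
            "1TIM", "4HHB.C", "4GCR.A_1-83", "3AA0.A,B",
            "PDB_00001TIM", "PDB_00004GCR.A_1-83", "d2bq6a1"
        };
        for (String example : examples) {
            System.out.println(example + " -> " + classify(example));
        }
    }
}
```

The sketch checks syntax only. Whether a syntactically valid name corresponds to an existing entry can only be determined by actually loading the structure.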
    • getStructure

      public Structure getStructure(StructureIdentifier strucId) throws IOException, StructureException
      Get the structure corresponding to the given StructureIdentifier. Equivalent to calling StructureIdentifier.loadStructure(AtomCache) followed by StructureIdentifier.reduce(Structure).

      Note that this method should not be used in StructureIdentifier implementations to avoid circular calls.

      Parameters:
      strucId -
      Returns:
      Throws:
      IOException
      StructureException
    • getStructureForDomain

      public Structure getStructureForDomain(ScopDomain domain) throws IOException, StructureException
      Returns the representation of a ScopDomain as a BioJava Structure object.
      Parameters:
      domain - a SCOP domain
      Returns:
      a Structure object
      Throws:
      IOException
      StructureException
    • getStructureForDomain

      public Structure getStructureForDomain(ScopDomain domain, ScopDatabase scopDatabase) throws IOException, StructureException
      Returns the representation of a ScopDomain as a BioJava Structure object.
      Parameters:
      domain - a SCOP domain
      scopDatabase - A ScopDatabase to use
      Returns:
      a Structure object
      Throws:
      IOException
      StructureException
    • getStructureForDomain

      public Structure getStructureForDomain(ScopDomain domain, ScopDatabase scopDatabase, boolean strictLigandHandling) throws IOException, StructureException
      Returns the representation of a ScopDomain as a BioJava Structure object.
      Parameters:
      domain - a SCOP domain
      scopDatabase - A ScopDatabase to use
      strictLigandHandling - If set to false, hetero-atoms are included if and only if they belong to a chain to which the SCOP domain belongs; if set to true, hetero-atoms are included if and only if they are strictly within the definition (residue numbers) of the SCOP domain
      Returns:
      a Structure object
      Throws:
      IOException
      StructureException
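The effect of strictLigandHandling can be sketched as a simple inclusion test. This is one reading of the parameter description, not BioJava's code: the class LigandHandlingSketch, the Range type, and includeHetero are hypothetical, and residue numbers are simplified to plain integers without insertion codes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the strictLigandHandling rule; not BioJava code. */
final class LigandHandlingSketch {

    /** One segment of a domain definition: a chain plus inclusive residue bounds. */
    static final class Range {
        final String chainId;
        final int start;
        final int end;

        Range(String chainId, int start, int end) {
            this.chainId = chainId;
            this.start = start;
            this.end = end;
        }
    }

    /** Decides whether a hetero-atom group at (chainId, resNum) is kept for the domain. */
    static boolean includeHetero(List<Range> domain, String chainId, int resNum,
                                 boolean strictLigandHandling) {
        for (Range range : domain) {
            if (!range.chainId.equals(chainId)) {
                continue; // different chain: never included by this segment
            }
            if (!strictLigandHandling) {
                return true; // lenient: sharing a chain with the domain is enough
            }
            if (resNum >= range.start && resNum <= range.end) {
                return true; // strict: must fall inside the residue numbers
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Range> domain = Arrays.asList(new Range("A", 1, 83));
        // A ligand on chain A at residue 200 lies outside the domain's residue range.
        System.out.println("lenient: " + includeHetero(domain, "A", 200, false));
        System.out.println("strict:  " + includeHetero(domain, "A", 200, true));
    }
}
```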
    • getStructureForDomain

      public Structure getStructureForDomain(String scopId) throws IOException, StructureException
      Returns the representation of a ScopDomain as a BioJava Structure object.
      Parameters:
      scopId - a SCOP Id
      Returns:
      a Structure object
      Throws:
      IOException
      StructureException
    • getStructureForDomain

      public Structure getStructureForDomain(String scopId, ScopDatabase scopDatabase) throws IOException, StructureException
      Returns the representation of a ScopDomain as a BioJava Structure object.
      Parameters:
      scopId - a SCOP Id
      scopDatabase - A ScopDatabase to use
      Returns:
      a Structure object
      Throws:
      IOException
      StructureException
    • setCachePath

      public void setCachePath(String cachePath)
      Sets the location at which utility data should be cached.
      Parameters:
      cachePath -
    • setFileParsingParams

      public void setFileParsingParams(FileParsingParameters params)
    • setObsoleteBehavior

      public void setObsoleteBehavior(LocalPDBDirectory.ObsoleteBehavior behavior)
      [Optional] This method changes the behavior when obsolete entries are requested. Current behaviors are:
      • THROW_EXCEPTION Throw a StructureException (the default)
      • FETCH_OBSOLETE Load the requested ID from the PDB's obsolete repository
      • FETCH_CURRENT Load the most recent version of the requested structure

      This setting may be silently ignored by implementations which do not have access to the server to determine whether an entry is obsolete, such as when isAutoFetch() is false. Note that an obsolete entry may still be returned even if this is FETCH_CURRENT, provided the entry is found locally.
      Parameters:
      behavior - the ObsoleteBehavior to apply when obsolete entries are requested
      Since:
      4.0.0
      See Also:
      • setFetchCurrent(boolean)
    • getObsoleteBehavior

      public LocalPDBDirectory.ObsoleteBehavior getObsoleteBehavior()
      Returns how this instance deals with obsolete entries. Note that this setting may be ignored by some implementations or in some situations, such as when isAutoFetch() is false.

      For most implementations, the default value is THROW_EXCEPTION.

      Returns:
      The ObsoleteBehavior
      Since:
      4.0.0
    • getFetchBehavior

      public LocalPDBDirectory.FetchBehavior getFetchBehavior()
      Get the behavior for fetching files from the server
      Returns:
    • setFetchBehavior

      public void setFetchBehavior(LocalPDBDirectory.FetchBehavior fetchBehavior)
      Set the behavior for fetching files from the server
      Parameters:
      fetchBehavior -
    • setPath

      public void setPath(String path)
      Set the path that is used to cache PDB files.
      Parameters:
      path - the path to a directory
    • getFiletype

      public StructureFiletype getFiletype()
      Returns the currently active file type that will be parsed.
      Returns:
      a StructureFiletype
    • setFiletype

      public void setFiletype(StructureFiletype filetype)
      Set the file type that will be parsed.
      Parameters:
      filetype - a StructureFiletype
    • getStructureForCathDomain

      public Structure getStructureForCathDomain(StructureName structureName) throws IOException, StructureException
      Returns a Structure corresponding to the CATH identifier supplied in structureName, using the CathDatabase at CathFactory.getCathDatabase().
      Throws:
      IOException
      StructureException
    • getStructureForCathDomain

      public Structure getStructureForCathDomain(StructureName structureName, CathDatabase cathInstall) throws IOException, StructureException
      Returns a Structure corresponding to the CATH identifier supplied in structureName, using the specified CathDatabase.
      Throws:
      IOException
      StructureException
    • flagLoading

      protected void flagLoading(PdbId pdbId)
    • flagLoadingFinished

      protected void flagLoadingFinished(PdbId pdbId)
    • getStructureForPdbId

      public Structure getStructureForPdbId(String id) throws IOException, StructureException
      Loads a structure directly by PDB ID.
      Parameters:
      id - the PDB ID
      Returns:
      Throws:
      IOException
      StructureException
    • getStructureForPdbId

      public Structure getStructureForPdbId(PdbId pdbId) throws IOException
      Loads a structure directly by PDB ID
      Parameters:
      pdbId -
      Returns:
      Throws:
      IOException
      StructureException
    • loadStructureFromMmtfByPdbId

      protected Structure loadStructureFromMmtfByPdbId(String pdbId) throws IOException
      Throws:
      IOException
    • loadStructureFromMmtfByPdbId

      protected Structure loadStructureFromMmtfByPdbId(PdbId pdbId) throws IOException
      Loads a Structure from MMTF, either from the Web or from the local file system.
      Parameters:
      pdbId - the input PDB id
      Returns:
      the Structure object of the parsed structure
      Throws:
      IOException - error reading from Web or file system
    • loadStructureFromCifByPdbId

      protected Structure loadStructureFromCifByPdbId(String pdbId) throws IOException
      Throws:
      IOException
    • loadStructureFromCifByPdbId

      protected Structure loadStructureFromCifByPdbId(PdbId pdbId) throws IOException
      Throws:
      IOException
    • loadStructureFromBcifByPdbId

      protected Structure loadStructureFromBcifByPdbId(String pdbId) throws IOException
      Throws:
      IOException
    • loadStructureFromBcifByPdbId

      protected Structure loadStructureFromBcifByPdbId(PdbId pdbId) throws IOException
      Throws:
      IOException
    • loadStructureFromPdbByPdbId

      protected Structure loadStructureFromPdbByPdbId(String pdbId) throws IOException
      Throws:
      IOException
    • loadStructureFromPdbByPdbId

      protected Structure loadStructureFromPdbByPdbId(PdbId pdbId) throws IOException
      Throws:
      IOException