Class BAMFileReader

All Implemented Interfaces:
SamReader.PrimitiveSamReader

public class BAMFileReader extends SamReader.ReaderImplementation
Class for reading and querying BAM files.
  • Method Details

    • enableIndexCaching

      protected void enableIndexCaching(boolean enabled)
      If true, uses the caching version of the index reader.
      Parameters:
      enabled - true to use the caching version of the reader.
    • enableIndexMemoryMapping

      protected void enableIndexMemoryMapping(boolean enabled)
      If false, disables the use of memory mapping for accessing index files (the default behavior is to use memory mapping). Regular I/O is slower but more scalable when accessing large numbers of BAM files sequentially.
      Parameters:
      enabled - True to use memory mapping, false to use regular I/O.
    • type

      public SamReader.Type type()
    • hasIndex

      public boolean hasIndex()
      Returns:
      true if this is a BAM file and it has an index
    • getIndex

      public BAMIndex getIndex()
      Retrieves the index for the given file type. Ensure that the index is of the specified type.
      Returns:
      An index of the given type.
    • getIndexType

      public SamIndexes getIndexType()
      Return the type of the BAM index, BAI or CSI.
      Returns:
      SamIndexes.BAI, SamIndexes.CSI, or null
    • setEagerDecode

      public void setEagerDecode(boolean desired)
    • close

      public void close()
    • getFileHeader

      public SAMFileHeader getFileHeader()
    • getValidationStringency

      public ValidationStringency getValidationStringency()
    • getIterator

      public CloseableIterator<SAMRecord> getIterator()
      Prepare to iterate through the SAMRecords in file order. Only a single iterator on a BAM file can be extant at a time. If getIterator() or a query method has been called once, that iterator must be closed before getIterator() can be called again. A somewhat peculiar aspect of this method is that if the file is not seekable, a second call to getIterator() begins its iteration where the last one left off. That is the best that can be done in that situation.
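      The single-iterator rule described above can be illustrated with a minimal, library-free sketch. The class SingleIteratorSource, its RecordIterator, and the choice of IllegalStateException are hypothetical names and decisions made for illustration only; they are not part of this class, and the sketch does not claim to reproduce its internals.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a reader that allows only one open iterator at a time. */
class SingleIteratorSource {
    private final List<String> records;
    private boolean iteratorOpen = false;

    SingleIteratorSource(List<String> records) {
        this.records = records;
    }

    /** Hands out an iterator, refusing if the previous one has not been closed. */
    RecordIterator iterator() {
        if (iteratorOpen) {
            // Assumption: the sketch signals the violation with IllegalStateException.
            throw new IllegalStateException("Close the previous iterator first");
        }
        iteratorOpen = true;
        return new RecordIterator();
    }

    /** Closeable iterator; closing it releases the source for the next caller. */
    class RecordIterator implements Iterator<String>, AutoCloseable {
        private final Iterator<String> delegate = records.iterator();

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public String next() {
            return delegate.next();
        }

        @Override
        public void close() {
            iteratorOpen = false;
        }
    }
}
```

      Wrapping each iterator in a try-with-resources statement is one way to make sure it is closed before the next iterator or query is requested.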
    • getIterator

      public CloseableIterator<SAMRecord> getIterator(SAMFileSpan chunks)
    • getFilePointerSpanningReads

      public SAMFileSpan getFilePointerSpanningReads()
      Gets an unbounded pointer to the first record in the BAM file. Because the reader doesn't necessarily know when the file ends, the rightmost bound of the file pointer will not end exactly where the file ends. However, the rightmost bound is guaranteed to be after the last read in the file.
      Returns:
      An unbounded pointer to the first record in the BAM file.
    • query

      public CloseableIterator<SAMRecord> query(QueryInterval[] intervals, boolean contained)
      Prepare to iterate through the SAMRecords that match any of the given intervals. Only a single iterator on a BAM file can be extant at a time. The previous one must be closed before calling any of the methods that return an iterator. Note that an unmapped SAMRecord may still have a reference name and an alignment start for sorting purposes (typically this is the coordinate of its mate), and will be found by this method if the coordinate matches one of the specified intervals. Note that this method is not necessarily efficient in terms of disk I/O. The index does not have perfect resolution, so some SAMRecords may be read and then discarded because they do not match any of the specified intervals.
      Parameters:
      intervals - list of intervals to be queried. Must be optimized, that is, sorted with overlapping intervals merged (see QueryInterval.optimizeIntervals).
      contained - If true, the alignments for the SAMRecords must be completely contained in one of the specified intervals. If false, the SAMRecords need only overlap one of the intervals.
      Returns:
      Iterator for the matching SAMRecords
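      The difference between the two values of contained can be sketched with plain integer coordinates. This is an illustrative sketch only: the class IntervalMatchSketch and its methods are hypothetical, coordinates are assumed to be 1-based and inclusive, and any special cases handled by the real query (such as open-ended intervals) are ignored.

```java
/** Hypothetical helper contrasting "contained" and "overlapping" matches. */
class IntervalMatchSketch {

    /** True if the alignment [alnStart, alnEnd] shares at least one base with [qStart, qEnd]. */
    static boolean overlaps(int alnStart, int alnEnd, int qStart, int qEnd) {
        return alnStart <= qEnd && alnEnd >= qStart;
    }

    /** True if the alignment [alnStart, alnEnd] lies entirely inside [qStart, qEnd]. */
    static boolean containedIn(int alnStart, int alnEnd, int qStart, int qEnd) {
        return alnStart >= qStart && alnEnd <= qEnd;
    }

    /** Applies the stricter test when contained is true, the looser one otherwise. */
    static boolean matches(int alnStart, int alnEnd, int qStart, int qEnd, boolean contained) {
        return contained
                ? containedIn(alnStart, alnEnd, qStart, qEnd)
                : overlaps(alnStart, alnEnd, qStart, qEnd);
    }
}
```

      Under these assumptions, an alignment spanning 100 to 150 matches the query interval 120 to 200 when contained is false, but not when contained is true.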
    • queryAlignmentStart

      public CloseableIterator<SAMRecord> queryAlignmentStart(String sequence, int start)
      Prepare to iterate through the SAMRecords with the given alignment start. Only a single iterator on a BAM file can be extant at a time. The previous one must be closed before calling any of the methods that return an iterator. Note that an unmapped SAMRecord may still have a reference name and an alignment start for sorting purposes (typically this is the coordinate of its mate), and will be found by this method if the coordinate matches the specified alignment start. Note that this method is not necessarily efficient in terms of disk I/O. The index does not have perfect resolution, so some SAMRecords may be read and then discarded because they do not match the specified alignment start.
      Parameters:
      sequence - Reference sequence sought.
      start - Alignment start sought.
      Returns:
      Iterator for the matching SAMRecords.
    • queryUnmapped

      public CloseableIterator<SAMRecord> queryUnmapped()
      Prepare to iterate through the SAMRecords that are unmapped and do not have a reference name or alignment start. Only a single iterator on a BAM file can be extant at a time. The previous one must be closed before calling any of the methods that return an iterator.
      Returns:
      Iterator for the matching SAMRecords.
    • readHeader

      protected static SAMFileHeader readHeader(BinaryCodec stream, ValidationStringency validationStringency, String source) throws IOException
      Reads the header of a BAM file from a stream.
      Parameters:
      stream - A BinaryCodec to read the header from
      validationStringency - Determines how stringent to be when validating the SAM header
      source - Identifies the input being read. Note that this is used only for reporting errors.
      Throws:
      IOException
    • getFileSpan

      public static BAMFileSpan getFileSpan(QueryInterval[] intervals, BAMIndex fileIndex)
      Use the index to determine the chunk boundaries for the required intervals.
      Parameters:
      intervals - the intervals to restrict reads to
      fileIndex - the BAM index to use
      Returns:
      file pointer pairs corresponding to chunk boundaries
    • createIndexIterator

      public CloseableIterator<SAMRecord> createIndexIterator(QueryInterval[] intervals, boolean contained, long[] filePointers)
      Prepare to iterate through SAMRecords that match the given intervals.
      Parameters:
      intervals - the intervals to restrict reads to
      contained - if true, return records that are strictly contained in the intervals, otherwise return records that overlap
      filePointers - file pointer pairs corresponding to chunk boundaries for the intervals
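      As a rough illustration of what "file pointer pairs" means, the sketch below groups a flat long[] into (start, end) chunks. It assumes the array is laid out as consecutive start and end values, which the description above suggests but does not spell out; ChunkPairsSketch and its Chunk type are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that views a flat pointer array as (start, end) pairs. */
class ChunkPairsSketch {

    /** One chunk boundary pair; both values are opaque file pointers here. */
    static final class Chunk {
        final long start;
        final long end;

        Chunk(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Groups consecutive array elements into chunks; rejects arrays with an odd length. */
    static List<Chunk> toChunks(long[] filePointers) {
        if (filePointers.length % 2 != 0) {
            throw new IllegalArgumentException("Expected an even number of file pointers");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (int i = 0; i < filePointers.length; i += 2) {
            chunks.add(new Chunk(filePointers[i], filePointers[i + 1]));
        }
        return chunks;
    }
}
```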
    • getVirtualFilePointer

      public long getVirtualFilePointer()
      Returns:
      a virtual file pointer for the underlying compressed stream.
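      For readers unfamiliar with virtual file pointers, the sketch below shows the BGZF convention from the SAM/BAM specification: the upper 48 bits hold the byte offset of a compressed block in the file, and the lower 16 bits hold the offset within that block once uncompressed. It is assumed here that the value returned by this method follows that convention; VirtualFilePointerSketch and its method names are hypothetical and are not part of this class.

```java
/** Hypothetical helper for packing and unpacking BGZF-style virtual file pointers. */
class VirtualFilePointerSketch {
    private static final int SHIFT = 16;
    private static final long OFFSET_MASK = 0xFFFFL;
    private static final long MAX_BLOCK_ADDRESS = (1L << 48) - 1;

    /** Packs a compressed block address and an in-block offset into one long. */
    static long make(long blockAddress, int blockOffset) {
        if (blockAddress < 0 || blockAddress > MAX_BLOCK_ADDRESS) {
            throw new IllegalArgumentException("Block address must fit in 48 bits");
        }
        if (blockOffset < 0 || blockOffset > OFFSET_MASK) {
            throw new IllegalArgumentException("Block offset must fit in 16 bits");
        }
        return (blockAddress << SHIFT) | blockOffset;
    }

    /** Extracts the byte offset of the compressed block within the file. */
    static long blockAddress(long virtualFilePointer) {
        return virtualFilePointer >>> SHIFT;
    }

    /** Extracts the offset within the uncompressed contents of the block. */
    static int blockOffset(long virtualFilePointer) {
        return (int) (virtualFilePointer & OFFSET_MASK);
    }
}
```

      Because the block address occupies the high bits, comparing two such pointers as unsigned numbers orders them by position in the file.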