Class SAMSequenceDictionary

java.lang.Object
htsjdk.samtools.SAMSequenceDictionary
All Implemented Interfaces:
HtsHeader, Serializable

public class SAMSequenceDictionary extends Object implements HtsHeader, Serializable
Collection of SAMSequenceRecords.
  • Field Details

    • serialVersionUID

      public static final long serialVersionUID
    • DEFAULT_DICTIONARY_EQUAL_TAG

      public static final List<String> DEFAULT_DICTIONARY_EQUAL_TAG
  • Constructor Details

    • SAMSequenceDictionary

      public SAMSequenceDictionary()
    • SAMSequenceDictionary

      public SAMSequenceDictionary(List<SAMSequenceRecord> list)
  • Method Details

    • getSequences

      public List<SAMSequenceRecord> getSequences()
    • getSequence

      public SAMSequenceRecord getSequence(String name)
    • setSequences

      public void setSequences(List<SAMSequenceRecord> list)
      Replaces the existing list of SAMSequenceRecords with the given list and resets the aliases.
      Parameters:
      list - This value is copied and validated.
    • addSequence

      public void addSequence(SAMSequenceRecord sequenceRecord)
    • getSequence

      public SAMSequenceRecord getSequence(int sequenceIndex)
      Returns:
      The SAMSequenceRecord with the given index, or null if index is out of range.
    • getSequenceIndex

      public int getSequenceIndex(String sequenceName)
      Returns:
      The index for the given sequence name, or -1 if the name is not found.
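      The two return conventions above (null for an out-of-range index, -1 for an unknown name) can be illustrated with a minimal stand-alone sketch. LookupSketch is a hypothetical stand-in that stores plain names, not the htsjdk class, and it makes no claim about how htsjdk implements the lookup.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of htsjdk.
class LookupSketch {
    private final List<String> names;

    LookupSketch(List<String> names) {
        this.names = names;
    }

    /** Returns the name at the given index, or null if the index is out of range. */
    String getSequence(int index) {
        return (index < 0 || index >= names.size()) ? null : names.get(index);
    }

    /** Returns the index of the given name, or -1 if the name is not found. */
    int getSequenceIndex(String name) {
        return names.indexOf(name); // List.indexOf returns -1 when absent
    }

    public static void main(String[] args) {
        LookupSketch dict = new LookupSketch(Arrays.asList("chr1", "chr2"));
        System.out.println(dict.getSequence(5));           // null
        System.out.println(dict.getSequenceIndex("chrX")); // -1
    }
}
```

      Callers of the real methods would check for null or -1 in the same way before using the result.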
    • size

      public int size()
      Returns:
      number of SAMSequenceRecord(s) in this dictionary
    • getReferenceLength

      public long getReferenceLength()
      Returns:
      The sum of the lengths of the sequences in this dictionary
    • isEmpty

      public boolean isEmpty()
      Returns:
      true if the dictionary is empty
    • assertSameDictionary

      public void assertSameDictionary(SAMSequenceDictionary that)
      A non-comprehensive equals(Object)-style assertion: each SAMSequenceRecord in this dictionary is compared, in order, with its counterpart in the target dictionary using SAMSequenceRecord.isSameSequence(SAMSequenceRecord) rather than SAMSequenceRecord.equals(Object). Aliases are ignored.
      Throws:
      AssertionError - When the dictionaries are not the same, with some human-readable information as to why
    • isSameDictionary

      public boolean isSameDictionary(SAMSequenceDictionary that)
      A non-comprehensive equals(Object)-style validation: each SAMSequenceRecord in this dictionary is compared, in order, with its counterpart in the target dictionary using SAMSequenceRecord.isSameSequence(SAMSequenceRecord) rather than SAMSequenceRecord.equals(Object).
      Parameters:
      that - SAMSequenceDictionary to compare against
      Returns:
      true if the dictionaries are the same, false otherwise
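      The pairwise, in-order comparison described for isSameDictionary can be sketched without htsjdk. In this sketch Seq is a hypothetical stand-in record, and its isSameSequence compares only name and length, which is an assumption made for illustration; the real SAMSequenceRecord.isSameSequence may consider more fields.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of htsjdk.
class SameDictionarySketch {
    /** Simplified sequence record: a name and a length. */
    static final class Seq {
        final String name;
        final int length;

        Seq(String name, int length) {
            this.name = name;
            this.length = length;
        }

        // Assumed relaxed comparison: name and length only.
        boolean isSameSequence(Seq other) {
            return other != null && name.equals(other.name) && length == other.length;
        }
    }

    /** Compares the two lists element by element, in order. */
    static boolean isSameDictionary(List<Seq> a, List<Seq> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).isSameSequence(b.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Seq> d1 = Arrays.asList(new Seq("chr1", 1000), new Seq("chr2", 500));
        List<Seq> d2 = Arrays.asList(new Seq("chr1", 1000), new Seq("chr2", 500));
        List<Seq> swapped = Arrays.asList(new Seq("chr2", 500), new Seq("chr1", 1000));
        System.out.println(isSameDictionary(d1, d2));      // true
        System.out.println(isSameDictionary(d1, swapped)); // false
    }
}
```

      Because the comparison is positional, two dictionaries holding the same sequences in a different order are not considered the same.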
    • equals

      public boolean equals(Object o)
      Returns true if the two dictionaries are the same.

      NOTE: Aliases are NOT considered, but alternative sequence names (AN tag) ARE.

      Overrides:
      equals in class Object
    • addSequenceAlias

      public SAMSequenceRecord addSequenceAlias(String originalName, String altName)
      Adds an alias to a SAMSequenceRecord. This can be used to provide alternate names for a given contig, e.g. 1,chr1,chr01,01,CM000663,NC_000001.10 or MT,chrM.

      NOTE: this method does not add the alias to the alternative sequence name tag (AN) in the SAMSequenceRecord. If you would like to add it to the AN tag, use addAlternativeSequenceName(String, String) instead.

      Parameters:
      originalName - existing contig name
      altName - new contig name
      Returns:
      the contig associated with 'originalName/altName'
    • addAlternativeSequenceName

      public SAMSequenceRecord addAlternativeSequenceName(String originalName, String altName)
      Adds an alternative sequence name (AN tag) to a SAMSequenceRecord and also registers it as an alias for retrieving the contig (as with addSequenceAlias(String, String)).

      This can be used to provide alternate names for a given contig, e.g. 1,chr1,chr01,01,CM000663 or MT,chrM.

      Parameters:
      originalName - existing contig name
      altName - new contig name
      Returns:
      the contig associated with 'originalName/altName', with the AN tag including the altName
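      The difference between addSequenceAlias and addAlternativeSequenceName can be sketched as follows. AliasSketch and Seq are hypothetical stand-ins, the alternativeNames list merely models the AN tag, and rejecting an unknown originalName with an IllegalArgumentException is an assumption of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of htsjdk.
class AliasSketch {
    /** Simplified sequence record. */
    static final class Seq {
        final String name;
        final List<String> alternativeNames = new ArrayList<>(); // models the AN tag

        Seq(String name) {
            this.name = name;
        }
    }

    // Maps primary names and aliases to the same record.
    private final Map<String, Seq> byName = new HashMap<>();

    void addSequence(Seq seq) {
        byName.put(seq.name, seq);
    }

    Seq getSequence(String name) {
        return byName.get(name);
    }

    /** Registers a lookup-only alias; the record itself is not modified. */
    Seq addSequenceAlias(String originalName, String altName) {
        Seq seq = byName.get(originalName);
        if (seq == null) {
            // Assumption of this sketch: unknown names are rejected.
            throw new IllegalArgumentException("Unknown sequence: " + originalName);
        }
        byName.put(altName, seq);
        return seq;
    }

    /** Registers the alias and also records it on the sequence (the AN tag). */
    Seq addAlternativeSequenceName(String originalName, String altName) {
        Seq seq = addSequenceAlias(originalName, altName);
        seq.alternativeNames.add(altName);
        return seq;
    }

    public static void main(String[] args) {
        AliasSketch dict = new AliasSketch();
        dict.addSequence(new Seq("1"));
        dict.addSequenceAlias("1", "chr1");
        dict.addAlternativeSequenceName("1", "CM000663");
        System.out.println(dict.getSequence("chr1").name);          // 1
        System.out.println(dict.getSequence("1").alternativeNames); // [CM000663]
    }
}
```

      Both names resolve to the same record, but only the name added through addAlternativeSequenceName is stored on the record itself.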
    • md5

      public String md5()
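      Returns an MD5 sum for this dictionary. The checksum is recomputed each time this method is called. Each sequence contributes its MD5 if available, otherwise its name followed by its length:
       md5( (seq1.md5_if_available) + ' ' + (seq2.name + seq2.length) + ' ' + ... )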
       
      Returns:
      an MD5 checksum for this dictionary, or the empty string if it is empty
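      A minimal sketch of the formula above, using only the JDK. DictionaryMd5Sketch and Seq are hypothetical stand-ins, and the UTF-8 encoding and zero-padded lowercase hex output are assumptions of this sketch, so its result is not guaranteed to match htsjdk byte for byte.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of htsjdk.
class DictionaryMd5Sketch {
    /** Simplified sequence record; md5 may be null when unavailable. */
    static final class Seq {
        final String name;
        final int length;
        final String md5;

        Seq(String name, int length, String md5) {
            this.name = name;
            this.length = length;
            this.md5 = md5;
        }
    }

    /** Returns a 32-character hex MD5, or the empty string for an empty list. */
    static String md5(List<Seq> sequences) {
        if (sequences.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < sequences.size(); i++) {
            if (i > 0) {
                sb.append(' '); // entries are separated by a single space
            }
            Seq s = sequences.get(i);
            // Use the sequence MD5 if available, otherwise name followed by length.
            sb.append(s.md5 != null ? s.md5 : s.name + s.length);
        }
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(sb.toString().getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        List<Seq> dict = Arrays.asList(new Seq("chr1", 1000, null), new Seq("chr2", 500, null));
        System.out.println(md5(dict).length());               // 32
        System.out.println(md5(new ArrayList<Seq>()).isEmpty()); // true
    }
}
```

      Since the checksum is recomputed on every call, callers that need it repeatedly may prefer to keep the returned string.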
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • mergeDictionaries

      public static SAMSequenceDictionary mergeDictionaries(SAMSequenceDictionary dict1, SAMSequenceDictionary dict2, List<String> tagsToMatch)
      Merges the tags of two dictionaries into one, focusing on the tags rather than the sequences. Both dictionaries must contain the same SAMSequenceRecords in the same order. For each sequence index, the new sequence receives the union of the tags from both input sequences. When a tag is present in both with different values, a warning is generated and the value from dict1 is used, unless the tag is listed in tagsToMatch, in which case an IllegalArgumentException is thrown. tagsToMatch must include LN and the MD5 tag (M5).
      Parameters:
      dict1 - first dictionary
      dict2 - second dictionary
      tagsToMatch - list of tags that must be equal if present in both sequences. Must contain LN and the MD5 tag (M5)
      Returns:
      dictionary consisting of the same sequences as the two inputs with the merged values of tags.
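      The per-sequence tag-merging rule described for mergeDictionaries can be sketched with plain maps. MergeTagsSketch, mergeTags and the tags helper are hypothetical names, each map stands in for the tags of one sequence, and the warning is simply printed to standard error; none of this is taken from the htsjdk implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of htsjdk.
class MergeTagsSketch {
    /** Merges the tags of two matching sequences; the first value wins on conflict. */
    static Map<String, String> mergeTags(Map<String, String> tags1,
                                         Map<String, String> tags2,
                                         List<String> tagsToMatch) {
        Map<String, String> merged = new LinkedHashMap<>(tags1);
        for (Map.Entry<String, String> entry : tags2.entrySet()) {
            String tag = entry.getKey();
            String value1 = tags1.get(tag);
            if (value1 == null) {
                merged.put(tag, entry.getValue()); // only in tags2: add it
            } else if (!value1.equals(entry.getValue())) {
                if (tagsToMatch.contains(tag)) {
                    throw new IllegalArgumentException("Mismatching values for tag " + tag);
                }
                // Non-critical mismatch: warn and keep the value from tags1.
                System.err.println("Warning: mismatching values for tag " + tag);
            }
        }
        return merged;
    }

    /** Builds an insertion-ordered map from alternating keys and values. */
    static Map<String, String> tags(String... keysAndValues) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
            map.put(keysAndValues[i], keysAndValues[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> seq1 = tags("LN", "1000", "SP", "human");
        Map<String, String> seq2 = tags("LN", "1000", "SP", "Homo sapiens", "AS", "GRCh38");
        // M5 is the MD5 tag of a SAM @SQ line; it is absent from both maps here.
        System.out.println(mergeTags(seq1, seq2, Arrays.asList("LN", "M5")));
        // prints {LN=1000, SP=human, AS=GRCh38} (plus a warning on stderr for SP)
    }
}
```

      If the two LN values differed in this example, the call would throw an IllegalArgumentException instead of returning a merged map.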