Class SAMRecordDuplicateComparator

java.lang.Object
htsjdk.samtools.SAMRecordDuplicateComparator
All Implemented Interfaces:
SAMRecordComparator, Serializable, Comparator<SAMRecord>

public class SAMRecordDuplicateComparator extends Object implements SAMRecordComparator, Serializable
Compares records according to whether they should be considered PCR duplicates (see MarkDuplicates). This comparator provides three orderings: compare, duplicateSetCompare, and fileOrderCompare. Supply the header when constructing this comparator if the library should be the major sort key. The records being compared must also have non-null SAMFileHeaders.
  • Field Details

  • Constructor Details

    • SAMRecordDuplicateComparator

      public SAMRecordDuplicateComparator()
    • SAMRecordDuplicateComparator

      @Deprecated public SAMRecordDuplicateComparator(List<SAMFileHeader> headers)
      Deprecated.
    • SAMRecordDuplicateComparator

      public SAMRecordDuplicateComparator(SAMFileHeader header)
  • Method Details

    • setScoringStrategy

      public void setScoringStrategy(DuplicateScoringStrategy.ScoringStrategy scoringStrategy)
    • compare

      public int compare(SAMRecord samRecord1, SAMRecord samRecord2)
      The most stringent comparison. Two records are compared first on whether they are duplicates of each other, and then on which should be prioritized as the most "representative". Typically, the representative is the one record within a set of duplicates that is *not* marked as a duplicate in the SAM file. Records are compared by file order, then by duplicate scoring strategy, then by read name. If both reads are paired and both ends are mapped, the first end is always preferred over the second end. This is needed to properly choose the first end for optical duplicate identification when, for example, both ends are mapped to the same position.
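      The layering described above (file order, then score, then read name, then first end) can be sketched as a chain of tie-breakers. This is a minimal illustration only, not the htsjdk implementation: the Rec class, its fields, and the rule that a higher score sorts first are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class LayeredCompareSketch {
    /** Hypothetical stand-in for a SAM record; not an htsjdk type. */
    static final class Rec {
        final int position;        // stand-in for the whole file-order key
        final int score;           // stand-in for the duplicate score
        final String name;         // read name
        final boolean firstOfPair; // true if this is the first end

        Rec(int position, int score, String name, boolean firstOfPair) {
            this.position = position;
            this.score = score;
            this.name = name;
            this.firstOfPair = firstOfPair;
        }
    }

    /** Coarsest ordering: only the (simplified) file-order key. */
    static int fileOrderCompare(Rec a, Rec b) {
        return Integer.compare(a.position, b.position);
    }

    /** Most stringent ordering: each later key only breaks ties. */
    static int compare(Rec a, Rec b) {
        int cmp = fileOrderCompare(a, b);
        // Assumption: the record with the higher score sorts first.
        if (cmp == 0) cmp = Integer.compare(b.score, a.score);
        if (cmp == 0) cmp = a.name.compareTo(b.name);
        // Prefer the first end over the second end.
        if (cmp == 0) cmp = Boolean.compare(b.firstOfPair, a.firstOfPair);
        return cmp;
    }

    /** Sorts three sample records and returns their names in order. */
    static List<String> demoOrder() {
        List<Rec> recs = new ArrayList<>();
        recs.add(new Rec(100, 30, "r2", true));
        recs.add(new Rec(100, 50, "r1", true));
        recs.add(new Rec(50, 10, "r3", true));
        recs.sort(LayeredCompareSketch::compare);
        List<String> names = new ArrayList<>();
        for (Rec r : recs) names.add(r.name);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demoOrder()); // prints [r3, r1, r2]
    }
}
```

      In this sketch r3 sorts first on position alone, while r1 and r2 tie on position and are separated by score.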
      Specified by:
      compare in interface Comparator<SAMRecord>
    • duplicateSetCompare

      public int duplicateSetCompare(SAMRecord samRecord1, SAMRecord samRecord2)
      Less stringent than compare: two records compare as equal when their ordering within their duplicate set would be arbitrary. The major difference between this and fileOrderCompare is how the orientation byte is compared. Here we want: F == FR, F == FF, R == RF, R == RR.
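      The orientation equivalences listed above can be illustrated with a small sketch. It is not the htsjdk code: the Orientation enum and method names are hypothetical (the library works with an orientation byte), and the choice to compare two paired orientations exactly is an assumption, since the description only states how a fragment relates to a pair.

```java
final class OrientationCollapseSketch {
    /** Hypothetical orientation values: F and R are fragments, the rest are pairs. */
    enum Orientation { F, R, FF, FR, RF, RR }

    static boolean isFragment(Orientation o) {
        return o == Orientation.F || o == Orientation.R;
    }

    /** Strand of the first end, taken from the first letter of the name. */
    static char firstEndStrand(Orientation o) {
        return o.name().charAt(0);
    }

    /**
     * True when the two orientations count as equal for duplicate-set purposes:
     * a fragment matches any orientation whose first end is on the same strand
     * (F == FR, F == FF, R == RF, R == RR).
     */
    static boolean sameForDuplicateSet(Orientation a, Orientation b) {
        if (isFragment(a) || isFragment(b)) {
            return firstEndStrand(a) == firstEndStrand(b);
        }
        // Assumption: two paired orientations must match exactly.
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameForDuplicateSet(Orientation.F, Orientation.FR)); // prints true
        System.out.println(sameForDuplicateSet(Orientation.F, Orientation.RF)); // prints false
    }
}
```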
    • fileOrderCompare

      public int fileOrderCompare(SAMRecord samRecord1, SAMRecord samRecord2)
      Less stringent than duplicateSetCompare: two records compare as equal when their ordering in a sorted SAM file would be arbitrary.
      Specified by:
      fileOrderCompare in interface SAMRecordComparator
      Returns:
      negative if samRecord1 < samRecord2, 0 if equal, else positive