Package htsjdk.samtools
Class SAMRecordDuplicateComparator
java.lang.Object
htsjdk.samtools.SAMRecordDuplicateComparator
- All Implemented Interfaces:
SAMRecordComparator, Serializable, Comparator<SAMRecord>
public class SAMRecordDuplicateComparator
extends Object
implements SAMRecordComparator, Serializable
Compares records based on whether they should be considered PCR duplicates (see MarkDuplicates).
This comparator provides three orderings: compare, duplicateSetCompare, and fileOrderCompare.
To use the library as the major sort key, specify the headers when constructing this comparator.
The records being compared must also have non-null SAMFileHeaders.
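As a rough illustration of what "library as the major sort key" means, here is a minimal sketch that does not use htsjdk at all. The class LibraryKeySketch, its nested Rec type, and the useLibrary flag are hypothetical names invented for this example; the real comparator resolves the library from the SAM headers and compares more fields than are shown here.

```java
// Hypothetical stand-in types for illustration only; none of this is htsjdk API.
final class LibraryKeySketch {
    static final class Rec {
        final String library;      // assumed: library name already resolved from the header
        final int referenceIndex;
        final int start;

        Rec(String library, int referenceIndex, int start) {
            this.library = library;
            this.referenceIndex = referenceIndex;
            this.start = start;
        }
    }

    // Compares by reference and start position. When useLibrary is true,
    // the library name is compared first, making it the major sort key.
    static int compare(Rec a, Rec b, boolean useLibrary) {
        int cmp = 0;
        if (useLibrary) {
            cmp = a.library.compareTo(b.library);
        }
        if (cmp == 0) {
            cmp = Integer.compare(a.referenceIndex, b.referenceIndex);
        }
        if (cmp == 0) {
            cmp = Integer.compare(a.start, b.start);
        }
        return cmp;
    }
}
```

With the library enabled, two records at the same position but from different libraries no longer compare as equal, so they cannot fall into the same duplicate set.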
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
int | compare(SAMRecord samRecord1, SAMRecord samRecord2) | Most stringent comparison.
int | duplicateSetCompare(SAMRecord samRecord1, SAMRecord samRecord2) | Less stringent than compare: two records compare as equal when their ordering within their duplicate set would be arbitrary.
int | fileOrderCompare(SAMRecord samRecord1, SAMRecord samRecord2) | Less stringent than duplicateSetCompare: two records compare as equal when their ordering in a sorted SAM file would be arbitrary.
void | setScoringStrategy(DuplicateScoringStrategy.ScoringStrategy scoringStrategy) |
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.util.Comparator
equals, reversed, thenComparing, thenComparing, thenComparing, thenComparingDouble, thenComparingInt, thenComparingLong
-
Field Details
-
UNKNOWN_LIBRARY_STRING
-
Constructor Details
-
SAMRecordDuplicateComparator
public SAMRecordDuplicateComparator()
-
SAMRecordDuplicateComparator
Deprecated.
-
SAMRecordDuplicateComparator
-
-
Method Details
-
setScoringStrategy
-
compare
Most stringent comparison. Two records are compared first on whether they are duplicates of each other, and then on which should be prioritized as the most "representative". Typically, the representative is the record in a set of duplicates that is *not* marked as a duplicate in the SAM file. Records are compared by file order, then by duplicate scoring strategy, then by read name. If both reads are paired and both ends are mapped, the first end is always preferred over the second end. This is needed to choose the first end correctly for optical duplicate identification when, for example, both ends are mapped to the same position.
- Specified by:
compare in interface Comparator<SAMRecord>
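The tie-break chain described for compare can be sketched without htsjdk as follows. Everything here is an assumption made for illustration: StringentCompareSketch and Rec are hypothetical, a single start field stands in for the whole file-order key, a higher score is taken to mean "more representative", and the first-end preference is applied as the last tie-breaker. The library's actual ordering of these steps may differ.

```java
// Hypothetical sketch of a tie-break chain; not the htsjdk implementation.
final class StringentCompareSketch {
    static final class Rec {
        final int start;            // stands in for the whole file-order key
        final int score;            // assumed: higher means more representative
        final String readName;
        final boolean firstOfPair;

        Rec(int start, int score, String readName, boolean firstOfPair) {
            this.start = start;
            this.score = score;
            this.readName = readName;
            this.firstOfPair = firstOfPair;
        }
    }

    static int compare(Rec a, Rec b) {
        // 1. File order decides first.
        int cmp = Integer.compare(a.start, b.start);
        // 2. Within the same file-order key, the higher score sorts earlier.
        if (cmp == 0) {
            cmp = Integer.compare(b.score, a.score);
        }
        // 3. Read name breaks remaining ties.
        if (cmp == 0) {
            cmp = a.readName.compareTo(b.readName);
        }
        // 4. The first end of a pair sorts before the second end.
        if (cmp == 0) {
            cmp = Boolean.compare(b.firstOfPair, a.firstOfPair);
        }
        return cmp;
    }
}
```

Each later step only runs when every earlier step returned 0, which is why this ordering is stricter than the ones it builds on.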
-
duplicateSetCompare
Less stringent than compare: two records compare as equal when their ordering within their duplicate set would be arbitrary. The major difference between this method and fileOrderCompare is how the orientation byte is compared. Here we want: F == FR, F == FF, R == RF, R == RR.
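One way to read the orientation equivalences listed for duplicateSetCompare (F == FR, F == FF, R == RF, R == RR) is that a pair orientation is reduced to the strand of its first read before comparing. The sketch below models that reading with a hypothetical Orientation enum and helper methods; htsjdk stores orientation as a byte, and its actual logic is not reproduced here.

```java
// Hypothetical model of the orientation equivalences; not htsjdk code.
final class OrientationSketch {
    enum Orientation { F, R, FF, FR, RF, RR }

    // Reduces an orientation to the strand of its first (or only) read.
    static Orientation collapse(Orientation o) {
        switch (o) {
            case F:
            case FF:
            case FR:
                return Orientation.F;
            default:
                return Orientation.R;
        }
    }

    // Exact comparison: every orientation value is distinct.
    static int exactCompare(Orientation a, Orientation b) {
        return a.compareTo(b);
    }

    // Collapsed comparison: F, FF and FR are equal; R, RF and RR are equal.
    static int collapsedCompare(Orientation a, Orientation b) {
        return collapse(a).compareTo(collapse(b));
    }
}
```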
fileOrderCompare
Less stringent than duplicateSetCompare: two records compare as equal when their ordering in a sorted SAM file would be arbitrary.
- Specified by:
fileOrderCompare in interface SAMRecordComparator
- Returns:
- negative if samRecord1 < samRecord2, 0 if equal, else positive
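Because only the sign of the result is meaningful, callers should test for 0 rather than for a specific value. As a generic illustration, the sketch below sorts with any Comparator and then splits the result into runs of elements that compare as equal. EqualRunsSketch and groupEqual are hypothetical names, and this is not how htsjdk or MarkDuplicates groups records.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; works with any Comparator.
final class EqualRunsSketch {
    // Sorts a copy of the input, then splits it into runs of elements
    // for which the comparator returns 0.
    static <T> List<List<T>> groupEqual(List<T> items, Comparator<? super T> cmp) {
        List<T> sorted = new ArrayList<>(items);
        sorted.sort(cmp);
        List<List<T>> groups = new ArrayList<>();
        for (T item : sorted) {
            // Start a new group when the item differs from the head of the current group.
            boolean startsNewGroup = groups.isEmpty()
                    || cmp.compare(groups.get(groups.size() - 1).get(0), item) != 0;
            if (startsNewGroup) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(item);
        }
        return groups;
    }
}
```

A less stringent comparator returns 0 more often and therefore yields fewer, larger groups.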
-