Class ExtractIlluminaBarcodes


@DocumentedFeature public class ExtractIlluminaBarcodes extends ExtractBarcodesProgram
Determine the barcode for each read in an Illumina lane. For each tile, a file is written to the basecalls directory of the form s_<lane>_<tile>_barcode.txt. The output file contains one line for each read in the tile, aligned with the regular basecall output. Each line contains the following tab-separated columns:
- read subsequence at barcode position
- Y or N indicating whether there was a barcode match
- matched barcode sequence (empty if the read did not match any of the barcodes). If there is no match but the read is close to the threshold for calling a match, the barcode that would have been matched is output in lower case
- distance to best matching barcode, "mismatches" (*)
- distance to second-best matching barcode, "mismatchesToSecondBest" (*)
NOTE (*): Because of an optimization, the reported mismatches and mismatchesToSecondBest values may be inaccurate, provided the conclusion (match vs. no-match) is not affected. For example, reported mismatches and mismatchesToSecondBest may be smaller than their true values if mismatches is truly larger than MAX_MISMATCHES. Also, mismatchesToSecondBest might be smaller than its true value if its true value is greater than mismatches + MIN_MISMATCH_DELTA.
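The column layout described above can be read back with a few lines of Java. The following is a minimal sketch, not part of Picard: the class name BarcodeCallLine, its field names, and the isNearMiss helper are hypothetical, and it assumes each line holds exactly the five tab-separated columns in the order listed in the class description.

```java
/** Hypothetical holder for one line of a per-tile barcode output file. */
final class BarcodeCallLine {
    final String readSubsequence;      // read bases at the barcode position
    final boolean matched;             // true when the second column is "Y"
    final String barcode;              // empty if no match; lower case for a near miss
    final int mismatches;              // may be inexact, see the NOTE in the class description
    final int mismatchesToSecondBest;  // may be inexact, see the NOTE in the class description

    private BarcodeCallLine(String readSubsequence, boolean matched, String barcode,
                            int mismatches, int mismatchesToSecondBest) {
        this.readSubsequence = readSubsequence;
        this.matched = matched;
        this.barcode = barcode;
        this.mismatches = mismatches;
        this.mismatchesToSecondBest = mismatchesToSecondBest;
    }

    /** Parses one line; assumes exactly five tab-separated columns (an assumption of this sketch). */
    static BarcodeCallLine parse(String line) {
        // A limit of -1 keeps empty fields, such as an empty barcode column.
        String[] fields = line.split("\t", -1);
        if (fields.length != 5) {
            throw new IllegalArgumentException("Expected 5 columns but found " + fields.length);
        }
        return new BarcodeCallLine(
                fields[0],
                "Y".equals(fields[1]),
                fields[2],
                Integer.parseInt(fields[3]),
                Integer.parseInt(fields[4]));
    }

    /** True when there was no match but a would-be barcode was still reported. */
    boolean isNearMiss() {
        return !matched && !barcode.isEmpty();
    }
}
```

Because of the caveat in the NOTE, code that consumes these lines should probably rely on the Y/N column for the match decision and treat the two distance columns as approximate.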
  • Field Details

    • BARCODE_FILE

      @Argument(doc="Tab-delimited file of barcode sequences, barcode name and, optionally, library name. Barcodes must be unique and all the same length. Column headers must be \'barcode_sequence\' (or \'barcode_sequence_1\'), \'barcode_sequence_2\' (optional), \'barcode_name\', and \'library_name\'.", mutex="BARCODE") public File BARCODE_FILE
    • BARCODE

      @Argument(doc="Barcode sequence. These must be unique, and all the same length. This cannot be used with reads that have more than one barcode; use BARCODE_FILE in that case. ", mutex="BARCODE_FILE") public List<String> BARCODE
    • NUM_PROCESSORS

      @Argument(doc="Run this many PerTileBarcodeExtractors in parallel. If NUM_PROCESSORS = 0, number of cores is automatically set to the number of cores available on the machine. If NUM_PROCESSORS < 0 then the number of cores used will be the number available on the machine less NUM_PROCESSORS.") public int NUM_PROCESSORS
    • OUTPUT_DIR

      @Argument(doc="Where to write _barcode.txt files. By default, these are written to BASECALLS_DIR.", optional=true) public File OUTPUT_DIR
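The NUM_PROCESSORS rule in the field list above can be illustrated with a small helper. This is a sketch under stated assumptions rather than the tool's own code: the class and method names are hypothetical, a negative value is read as "available cores reduced by the magnitude of NUM_PROCESSORS", and the clamp to a minimum of one worker is an addition of this sketch that the documentation does not mention.

```java
/** Hypothetical helper showing one reading of the NUM_PROCESSORS rule. */
final class ProcessorCountSketch {

    /**
     * @param requested the NUM_PROCESSORS value supplied by the user
     * @param available the number of cores on the machine
     * @return the number of extractors to run in parallel
     */
    static int resolve(int requested, int available) {
        if (requested == 0) {
            // Zero means "use every available core".
            return available;
        }
        if (requested < 0) {
            // Negative means "leave that many cores free".
            // The lower bound of 1 is an assumption of this sketch.
            return Math.max(1, available + requested);
        }
        // Positive values are taken as given.
        return requested;
    }

    /** Convenience overload that queries the current machine. */
    static int resolve(int requested) {
        return resolve(requested, Runtime.getRuntime().availableProcessors());
    }
}
```

Under this reading, resolve(-2, 8) returns 6 and resolve(0, 8) returns 8.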
  • Constructor Details

    • ExtractIlluminaBarcodes

      public ExtractIlluminaBarcodes()
  • Method Details

    • doWork

      protected int doWork()
      Description copied from class: CommandLineProgram
      Do the work after the command line has been parsed. RuntimeExceptions may be thrown by this method and are reported appropriately.
      Specified by:
      doWork in class CommandLineProgram
      Returns:
      program exit status.
    • customCommandLineValidation

      protected String[] customCommandLineValidation()
      Description copied from class: ExtractBarcodesProgram
      Parses all barcodes from input files and validates that all barcodes are the same length and unique.
      Overrides:
      customCommandLineValidation in class ExtractBarcodesProgram
      Returns:
      null if the command line is valid. If the command line is invalid, returns an array of error messages to be written to the appropriate place.
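The return contract of customCommandLineValidation (null when valid, an array of messages otherwise) combined with the "same length and unique" rule can be sketched as follows. This is an illustration only: the class name, method name, and message wording are hypothetical, the real method also parses the barcode inputs, and comparing barcodes as exact, case-sensitive strings is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-alone version of the barcode checks described above. */
final class BarcodeValidationSketch {

    /**
     * @param barcodes barcode sequences, for example the values passed as BARCODE
     * @return null if every barcode is unique and has the same length,
     *         otherwise an array with one message per problem found
     */
    static String[] validate(List<String> barcodes) {
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        // The first barcode sets the length that all others must share.
        int expectedLength = barcodes.isEmpty() ? 0 : barcodes.get(0).length();

        for (String barcode : barcodes) {
            if (barcode.length() != expectedLength) {
                errors.add("Barcode " + barcode + " has length " + barcode.length()
                        + " but " + expectedLength + " was expected.");
            }
            // Set.add returns false when the element is already present.
            if (!seen.add(barcode)) {
                errors.add("Barcode " + barcode + " is specified more than once.");
            }
        }
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }
}
```

Returning null rather than an empty array for the valid case mirrors the contract stated under Returns.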