Package htsjdk.variant.vcf
Class VCFEncoder
java.lang.Object
htsjdk.variant.vcf.VCFEncoder
Functions specific to encoding VCF records.
-
Field Summary
Fields
Modifier and Type | Field | Description
static final Charset | VCF_CHARSET | The encoding used for VCF files: ISO-8859-1.
-
Constructor Summary
Constructors
Constructor | Description
VCFEncoder(VCFHeader header, boolean allowMissingFieldsInHeader, boolean outputTrailingFormatFields) | Prepare a VCFEncoder that will encode records appropriate to the given VCF header, optionally allowing missing fields in the header.
-
Method Summary
Modifier and Type | Method | Description
void | addGenotypeData(VariantContext vc, Map<Allele, String> alleleMap, List<String> genotypeFormatKeys, StringBuilder builder) |
Map<Allele, String> | buildAlleleStrings(VariantContext vc) | Returns a Map containing Allele -> String (allele position) for all Alleles in vc (as well as NO_CALL), e.g. A,T,TC -> { A:0, T:1, TC:2, NO_CALL:EMPTY_ALLELE }. This may be efficient when looking up values for many genotypes per VC.
String | encode(VariantContext context) | Encodes a VariantContext as a VCF line. Depending on the use case, it may be more efficient to call write(Appendable, VariantContext) directly instead of creating an intermediate string.
static String | encodeGtField(VariantContext vc, Genotype g) | Easy way to generate the GT field for a Genotype.
static String | formatVCFDouble(double d) | Takes a double value and pretty prints it to a String for display.
void | setAllowMissingFieldsInHeader(boolean allow) | Deprecated. Since 10/24/13; use the constructor.
void | setVCFHeader(VCFHeader header) | Deprecated. Since 10/24/13; use the constructor.
void | write(Appendable vcfOutput, VariantContext context) | Encodes the VariantContext context as VCF and writes it directly to an Appendable. This may be more efficient than calling encode(VariantContext) and then writing the result, since it avoids creating an intermediate string.
static void | writeGtField(Map<Allele, String> alleleMap, Appendable vcfoutput, Genotype g) | Writes the encoded GT field for a Genotype.
-
Field Details
-
VCF_CHARSET
The encoding used for VCF files: ISO-8859-1. When writing VCF4.3 is implemented, this should change to UTF-8.
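As an illustration of what an ISO-8859-1 output encoding implies for text fields, here is a minimal sketch using only the Java standard library. The class `Latin1Sketch` and its methods are hypothetical and are not part of htsjdk; the constant simply mirrors the documented value of VCF_CHARSET.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Latin1Sketch {
    // Assumption: same charset as the documented VCF_CHARSET (ISO-8859-1).
    static final Charset CHARSET = StandardCharsets.ISO_8859_1;

    /** Encodes text to bytes; ISO-8859-1 uses exactly one byte per character. */
    static byte[] toBytes(String text) {
        // String.getBytes(Charset) substitutes the charset's replacement byte
        // for any character that ISO-8859-1 cannot represent.
        return text.getBytes(CHARSET);
    }

    /** Returns true if every character of the text exists in ISO-8859-1. */
    static boolean isRepresentable(String text) {
        return CHARSET.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String latin = "caf\u00e9";   // U+00E9 is inside ISO-8859-1
        String greek = "\u03b1";      // U+03B1 is outside ISO-8859-1
        System.out.println(isRepresentable(latin));
        System.out.println(isRepresentable(greek));
        System.out.println(toBytes(latin).length);
    }
}
```

Checking text with something like `isRepresentable` before writing may help catch characters that would otherwise be silently replaced.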
-
-
Constructor Details
-
VCFEncoder
public VCFEncoder(VCFHeader header, boolean allowMissingFieldsInHeader, boolean outputTrailingFormatFields)
Prepare a VCFEncoder that will encode records appropriate to the given VCF header, optionally allowing missing fields in the header.
-
-
Method Details
-
setVCFHeader
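public void setVCFHeader(VCFHeader header)
Deprecated. Since 10/24/13; use the constructor.
-
setAllowMissingFieldsInHeader
public void setAllowMissingFieldsInHeader(boolean allow)
Deprecated. Since 10/24/13; use the constructor.
-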
encode
public String encode(VariantContext context)
Encodes a VariantContext as a VCF line. Depending on the use case, it may be more efficient to call write(Appendable, VariantContext) directly instead of creating an intermediate string.
Returns:
the VCF line
-
write
public void write(Appendable vcfOutput, VariantContext context) throws IOException
Encodes the VariantContext context as VCF and writes it directly to an Appendable. This may be more efficient than calling encode(VariantContext) and then writing the result, since it avoids creating an intermediate string.
Parameters:
vcfOutput - the Appendable to write to
context - the variant
Throws:
IOException
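The trade-off between the two entry points can be sketched with plain strings. This is not the htsjdk implementation: `AppendableSketch`, `writeLine`, and `encodeLine` are hypothetical names used only to show why writing into an Appendable avoids an intermediate String.

```java
import java.io.IOException;
import java.io.StringWriter;

final class AppendableSketch {
    /** Streams tab-separated fields straight into the given Appendable. */
    static void writeLine(Appendable out, String... fields) throws IOException {
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                out.append('\t');
            }
            out.append(fields[i]);
        }
    }

    /** Builds the same line as an intermediate String by delegating to writeLine. */
    static String encodeLine(String... fields) {
        StringBuilder sb = new StringBuilder();
        try {
            writeLine(sb, fields);
        } catch (IOException e) {
            // StringBuilder never throws here; the catch only satisfies the Appendable signature.
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Any Appendable works as a sink: a Writer, a StringBuilder, a PrintStream.
        StringWriter sink = new StringWriter();
        writeLine(sink, "chr1", "100", ".", "A", "T");
        System.out.println(sink.toString().equals(encodeLine("chr1", "100", ".", "A", "T")));
    }
}
```

When the destination is already a Writer or stream, the `writeLine` style skips one String allocation and copy per record; the `encodeLine` style is more convenient when the caller needs the text itself.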
-
formatVCFDouble
public static String formatVCFDouble(double d)
Takes a double value and pretty prints it to a String for display:
- Large doubles get %.2f style formatting.
- Doubles < 1/10 but > 1/100 get %.3f style formatting.
- Doubles < 1/100 get %.3e style formatting.
Parameters:
d - the double value to format
Returns:
the formatted String
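The three formatting tiers described above can be sketched with String.format. This is an illustration of the documented rules, not the library's code: the class `VcfDoubleSketch` is hypothetical, the cut-off of 1.0 for "large" values is an assumption (the description does not say where values between 1/10 and 1 fall), and zero or negative inputs are not given any special treatment here.

```java
import java.util.Locale;

final class VcfDoubleSketch {
    /** Picks a format pattern by magnitude, following the documented tiers. */
    static String format(double d) {
        final String pattern;
        if (d >= 1.0) {
            pattern = "%.2f";   // assumption: "large" means at least 1
        } else if (d >= 0.01) {
            pattern = "%.3f";   // covers the documented 1/100 .. 1/10 range
        } else {
            pattern = "%.3e";   // small values switch to scientific notation
        }
        // A fixed locale keeps '.' as the decimal separator on every machine.
        return String.format(Locale.US, pattern, d);
    }

    public static void main(String[] args) {
        System.out.println(format(12.3456));   // 12.35
        System.out.println(format(0.05));      // 0.050
        System.out.println(format(0.000123));  // 1.230e-04
    }
}
```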
-
addGenotypeData
public void addGenotypeData(VariantContext vc, Map<Allele, String> alleleMap, List<String> genotypeFormatKeys, StringBuilder builder)
-
writeGtField
public static void writeGtField(Map<Allele, String> alleleMap, Appendable vcfoutput, Genotype g) throws IOException
Writes the encoded GT field for a Genotype.
Parameters:
alleleMap - a mapping of Allele -> GT allele value (from buildAlleleStrings(VariantContext))
vcfoutput - the Appendable to write to, to avoid inefficiency due to string copying
g - the genotype to encode
Throws:
IOException - if appending fails with an IOException
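To make the role of the allele map concrete, here is a minimal sketch that writes a GT-style value from a precomputed lookup table. It uses plain strings in place of Allele and Genotype, and `GtFieldSketch.writeGt` is a hypothetical helper, not the htsjdk method. The separators follow the VCF convention of "/" for unphased and "|" for phased genotypes.

```java
import java.io.IOException;
import java.util.List;
import java.util.Map;

final class GtFieldSketch {
    /**
     * Appends the encoded alleles of one genotype, separated by '/' or '|'.
     * alleleMap maps an allele (here a plain String) to its GT value.
     */
    static void writeGt(Map<String, String> alleleMap, Appendable out,
                        List<String> calledAlleles, boolean phased) throws IOException {
        char separator = phased ? '|' : '/';
        for (int i = 0; i < calledAlleles.size(); i++) {
            if (i > 0) {
                out.append(separator);
            }
            String allele = calledAlleles.get(i);
            String encoded = alleleMap.get(allele);
            if (encoded == null) {
                throw new IllegalArgumentException("Allele not in map: " + allele);
            }
            out.append(encoded);
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> alleleMap = Map.of("A", "0", "T", "1", "TC", "2");
        StringBuilder sb = new StringBuilder();
        writeGt(alleleMap, sb, List.of("A", "TC"), false);
        System.out.println(sb);   // 0/2
    }
}
```

Because the map is passed in, it can be built once per variant and reused for every sample's genotype.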
-
encodeGtField
public static String encodeGtField(VariantContext vc, Genotype g)
Easy way to generate the GT field for a Genotype. This will be less efficient than using writeGtField(Map, Appendable, Genotype).
Parameters:
vc - a VariantContext which must contain g, or the results are likely to be incorrect
g - a Genotype in vc
Returns:
a String containing the encoding of the GT field of g
-
buildAlleleStrings
public Map<Allele, String> buildAlleleStrings(VariantContext vc)
Returns a Map containing Allele -> String (allele position) for all Alleles in vc (as well as NO_CALL).
Example: A,T,TC -> { A:0, T:1, TC:2, NO_CALL:EMPTY_ALLELE }
This may be efficient when looking up values for many genotypes per VC.
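The mapping in the example above can be sketched with plain strings standing in for Allele objects. The class `AlleleMapSketch` is hypothetical, and the value "." for EMPTY_ALLELE is an assumption based on the usual VCF missing-value symbol.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AlleleMapSketch {
    static final String NO_CALL = "NO_CALL";   // hypothetical stand-in for the no-call allele
    static final String EMPTY_ALLELE = ".";    // assumption: VCF missing-value symbol

    /** Maps each allele to its position in the list, plus NO_CALL to the empty symbol. */
    static Map<String, String> buildAlleleStrings(List<String> alleles) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put(NO_CALL, EMPTY_ALLELE);
        for (int i = 0; i < alleles.size(); i++) {
            map.put(alleles.get(i), Integer.toString(i));
        }
        return map;
    }

    public static void main(String[] args) {
        // Built once per variant, then reused for every genotype lookup.
        Map<String, String> map = buildAlleleStrings(List.of("A", "T", "TC"));
        System.out.println(map.get("TC"));        // 2
        System.out.println(map.get(NO_CALL));     // .
    }
}
```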
-