VCFHeaderReader uses a try/catch to fall back to BCF decoding. When the header really is in VCF format but is invalid for an unrelated reason, this produces a misleading error message and stack trace.
For example, in the session below the first exception (`Your input file has a malformed header: Count < 0 for fixed size VCF header field BAD_PS`) should have been thrown rather than logged as a warning, and the second exception (`Input stream does not contain a BCF encoded file; BCF magic header info not found, at record 0 with position 0`) should not have occurred at all.
```
scala> val variants = sc.loadVariants("truth_small_variants.variants.adam")
warning: while trying to read VCF header from file received exception: htsjdk.tribble.TribbleException$InvalidHeader: Your input file has a malformed header: Count < 0 for fixed size VCF header field BAD_PS
htsjdk.tribble.TribbleException: Input stream does not contain a BCF encoded file; BCF magic header info not found, at record 0 with position 0:
  at htsjdk.variant.bcf2.BCF2Codec.error(BCF2Codec.java:478)
  at htsjdk.variant.bcf2.BCF2Codec.readHeader(BCF2Codec.java:149)
  at org.seqdoop.hadoop_bam.util.VCFHeaderReader.readHeaderFrom(VCFHeaderReader.java:67)
  at org.bdgenomics.adam.rdd.ADAMContext.org$bdgenomics$adam$rdd$ADAMContext$$readVcfHeader(ADAMContext.scala:228)
  at org.bdgenomics.adam.rdd.ADAMContext$$anonfun$loadHeaderLines$1.apply(ADAMContext.scala:234)
  at org.bdgenomics.adam.rdd.ADAMContext$$anonfun$loadHeaderLines$1.apply(ADAMContext.scala:234)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
  at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
  at org.bdgenomics.adam.rdd.ADAMContext.loadHeaderLines(ADAMContext.scala:234)
  at org.bdgenomics.adam.rdd.ADAMContext.loadParquetVariants(ADAMContext.scala:1175)
  at org.bdgenomics.adam.rdd.ADAMContext$$anonfun$loadVariants$1.apply(ADAMContext.scala:1733)
  at org.bdgenomics.adam.rdd.ADAMContext$$anonfun$loadVariants$1.apply(ADAMContext.scala:1728)
  at scala.Option.fold(Option.scala:158)
  at org.apache.spark.rdd.Timer.time(Timer.scala:48)
  at org.bdgenomics.adam.rdd.ADAMContext.loadVariants(ADAMContext.scala:1726)
  ... 50 elided
```
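One possible way to avoid the misleading fallback is to decide between VCF and BCF by peeking at the first bytes of the stream, then run only the matching codec and let its exception propagate. The sketch below is an illustration of that idea, not the actual VCFHeaderReader code: `HeaderFormatSniffer`, `Format`, `wrap`, and `sniff` are hypothetical names, it uses only the JDK, and it assumes the stream has already been decompressed (BCF files are normally BGZF-compressed, so real code would have to handle that first).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of Hadoop-BAM or htsjdk. */
final class HeaderFormatSniffer {
    enum Format { VCF, BCF, UNKNOWN }

    // A BCF2 stream starts with the ASCII bytes "BCF" followed by version bytes.
    private static final byte[] BCF_MAGIC = "BCF".getBytes(StandardCharsets.US_ASCII);
    // A VCF header starts with a "##fileformat=VCF..." line.
    private static final byte[] VCF_PREFIX =
        "##fileformat=VCF".getBytes(StandardCharsets.US_ASCII);
    private static final int PEEK_SIZE = VCF_PREFIX.length;

    /** Wraps a stream so that up to PEEK_SIZE bytes can be pushed back after peeking. */
    static PushbackInputStream wrap(InputStream in) {
        return new PushbackInputStream(in, PEEK_SIZE);
    }

    /** Peeks at the leading bytes and pushes them back, leaving the stream unconsumed. */
    static Format sniff(PushbackInputStream in) throws IOException {
        byte[] buf = new byte[PEEK_SIZE];
        int n = 0;
        while (n < buf.length) {
            int r = in.read(buf, n, buf.length - n);
            if (r < 0) break; // end of stream before PEEK_SIZE bytes
            n += r;
        }
        if (n > 0) in.unread(buf, 0, n);
        if (startsWith(buf, n, BCF_MAGIC)) return Format.BCF;
        if (startsWith(buf, n, VCF_PREFIX)) return Format.VCF;
        return Format.UNKNOWN;
    }

    /** Convenience overload for in-memory data. */
    static Format sniff(byte[] data) {
        try {
            return sniff(wrap(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean startsWith(byte[] buf, int len, byte[] prefix) {
        if (len < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (buf[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] vcf = "##fileformat=VCFv4.2\n".getBytes(StandardCharsets.US_ASCII);
        byte[] bcf = {'B', 'C', 'F', 2, 2};
        System.out.println(sniff(vcf));
        System.out.println(sniff(bcf));
    }
}
```

With the format known up front, a malformed VCF header such as the `BAD_PS` case above would surface as the original `InvalidHeader` exception, because the BCF codec would never be tried on a stream that starts with `##fileformat=VCF`.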