Skip to content
Prev 4288 / 21307 Next

[Bioc-devel] Fwd: [BioC] Problem reading VCF file using readVcf from package VariantAnnotation

Hi Vince, Kasper,

cc'ing Herve and Marc.

I think we have a couple of things going on so I wanted to clarify. The 
'genome' argument to readVcf() is assigned to the GRanges in rowData 
with the "genome<-" setter. This is where .normargGenome() gets called.

setReplaceMethod("genome", "Seqinfo",
     function(x, value)
     {
         x at genome <- .normargGenome(value, seqnames(x))
         x
     }
)

If the 'genome' replacement value is named, the name(s) must match the 
seqnames, not the build. So we aren't talking about matching compatible 
builds,

fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
vcf <- readVcf(fl, c("b37"="hg19"))  ## this is wrong
vcf <- readVcf(fl, c("hg19"="hg19")) ## also wrong

Instead the name must be the seqname, the value is the build,

vcf <- readVcf(fl, c("20"="hg19"))  ## correct
vcf <- readVcf(fl, "hg19")  ## also correct

This requirement for 'genome' is not well documented on ?readVcf or 
?Seqinfo. We can fix that.

The second thing is the issue of a flexible mapping between seqinfo 
metadata for different institutions. Herve and Marc have worked on this 
in AnnotationDbi. They can explain more about the 'SeqnameStyle' and how 
it might be used more widely.


Val
On 04/25/2013 06:54 AM, Kasper Daniel Hansen wrote: