Hi,
Assume that we have two variants from two samples at the same locus,
stored in a 'VRanges' or 'VCF' object:
library(VariantAnnotation)
vr = VRanges("1", IRanges(c(10, 10), width = 1),
ref = c("C", "C"), alt = c("A", "G"),
sampleNames = c("S1", "S2"))
vcf = as(vr, "VCF")
If we convert the VCF back to a VRanges, we now get each variant in each
sample:
vr2 = as(vcf, "VRanges")
length(vr) ## 2
length(vr2) ## 4
It seems that the VCF object does not store the information of the
'sampleNames' in the first conversion.
Best wishes
Julian
[Bioc-devel] VariantAnnotation: Same locus, multiple samples
2 messages · Julian Gehring, Michael Lawrence
The two data structures do not encode the same information. Coercion to VCF forms a rectangular matrix: position+alt by sample. There is no standard way to encode that a given cell in that matrix is absent, so coercion to VRanges simply maps each cell to an element. One could imagine using the "." missing data marker for every geno field to flag an absent cell, but that would be making too many assumptions: I'm not sure a missing value means the same thing as an element not existing in a VRanges.

On Fri, Dec 5, 2014 at 1:18 AM, Julian Gehring <julian.gehring at embl.de> wrote:
Hi,
Assume that we have two variants from two samples at the same locus,
stored in a 'VRanges' or 'VCF' object:
library(VariantAnnotation)
vr = VRanges("1", IRanges(c(10, 10), width = 1),
ref = c("C", "C"), alt = c("A", "G"),
sampleNames = c("S1", "S2"))
vcf = as(vr, "VCF")
If we convert the VCF back to a VRanges, we now get each variant in each
sample:
vr2 = as(vcf, "VRanges")
length(vr) ## 2
length(vr2) ## 4
It seems that the VCF object does not store the information of the
'sampleNames' in the first conversion.
Best wishes
Julian
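
To make the "rectangular matrix" point from the reply concrete, here is a minimal sketch in base R. It deliberately does not use VariantAnnotation and is not how the package implements the coercion. The names `long`, `variant_id`, `cells` and `recovered` are made up for illustration. It only mimics the shape of the data: two observations in long form become a 2 x 2 grid of position+alt by sample, and expanding every cell of that grid gives four elements.

```r
# Long form: one row per observed (variant, sample) pair,
# analogous to the original VRanges with 2 elements.
long <- data.frame(
  pos    = c(10, 10),
  ref    = c("C", "C"),
  alt    = c("A", "G"),
  sample = c("S1", "S2"),
  stringsAsFactors = FALSE
)

# A label for each distinct position+alt combination (hypothetical format).
variant_id <- paste0(long$pos, ":", long$ref, ">", long$alt)

# Wide form: only the row labels (position+alt) and column labels (sample)
# survive, so every row/column combination is a cell. Expanding all cells
# is analogous to coercing the VCF back to a VRanges.
cells <- expand.grid(
  variant = unique(variant_id),
  sample  = unique(long$sample),
  stringsAsFactors = FALSE
)

nrow(long)   # 2 observed pairs
nrow(cells)  # 4 cells in the 2 x 2 grid

# If the original pairs are still at hand, the extra cells can be dropped
# by matching against them (an assumption: the grid itself cannot tell
# which cells were real).
keep <- paste(cells$variant, cells$sample) %in% paste(variant_id, long$sample)
recovered <- cells[keep, ]
nrow(recovered)  # 2
```

The last step only works because `long` was kept around. Nothing in the grid records which cells were originally present, which is the information lost in the round trip.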
_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel