[Bioc-devel] VRanges with multiple samples

hi Michael, thanks for sharing your opinion, comments below,
On 01/28/2015 06:22 PM, Michael Lawrence wrote:
[...]
I'm concerned about scalability with multi-sample VCFs when adding
annotations. What you propose, using Rle-like vectors to store
identical values from different samples together, sounds good to me, and
I'm also in favor of keeping data structures as simple as possible.
For the time being I'll use 'VRanges' just as they are now and
explore how bad it gets when scaling up in samples and annotations,
to justify doing something about it along the lines you suggest.
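
To make the Rle idea concrete, here is a minimal sketch using base R's
rle() as a stand-in for the Rle class that Bioconductor provides in
S4Vectors. The data are hypothetical: it assumes an annotation column
whose value depends only on the variant, and that rows for the same
variant in different samples are stored next to each other.

```r
# Hypothetical data: 4 variants, each observed in 3 samples, stored
# variant by variant so that rows for the same variant are adjacent.
n_samples <- 3
gene <- rep(c("geneA", "geneB", "geneC", "geneD"), each = n_samples)

# Run-length encode the annotation column.
enc <- rle(gene)

n_plain <- length(gene)        # values held by the plain vector
n_runs  <- length(enc$values)  # runs held by the encoded form

# Decoding restores the original vector exactly.
round_trip_ok <- identical(inverse.rle(enc), gene)
```

Under these assumptions the encoded form holds one run per variant
rather than one value per variant and sample, so the saving grows with
the number of samples.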

[...]
I see your point that splitting a VRanges could be motivated by
something other than sample and, as you suggest, 'split()' does the work
very fast. Actually, by passing the result to the VRangesList
constructor I get what I was looking for:

do.call("VRangesList", split(vr, sampleNames(vr)))
VRangesList of length 3
names(3): sample1 sample2 sample3


I realize now, though, that the Rle-like strategy you propose would
then not be usable when splitting by sample.
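
A small illustration of why, as I understand it, again with base R's
rle() standing in for Rle and with made-up data: the runs exist only
because identical values from different samples sit side by side, and
splitting by sample pulls them apart.

```r
# Hypothetical layout: 4 variants x 3 samples, stored variant by variant.
sample_id <- rep(c("sample1", "sample2", "sample3"), times = 4)
annot     <- rep(c("geneA", "geneB", "geneC", "geneD"), each = 3)

# Combined vector: one run per variant.
runs_combined <- length(rle(annot)$values)

# Split the annotation by sample and count the runs in each piece.
by_sample        <- split(annot, sample_id)
values_per_piece <- lengths(by_sample)
runs_per_piece   <- sapply(by_sample, function(x) length(rle(x)$values))
```

In this toy case each per-sample piece has as many runs as values, so
the encoding no longer saves anything once the data are split.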

cheers,

robert.