[Bioc-devel] restrictToSNV for VCF
On 03/20/2014 05:20 PM, Hervé Pagès wrote:
[...]
Following that logic, names(se1) should also probably return colnames(se1).
H.
On Wed, Mar 19, 2014 at 1:07 PM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
On Wed, Mar 19, 2014 at 4:00 PM, Michael Lawrence <lawrence.michael at gene.com> wrote:
It would be nice to have functions like isSNV, isIndel, isDeletion, etc that at least provide precise definitions of the terminology. I've added these, but they're designed only for VRanges. Should work for ExpandedVCF. Also, it would be nice if restrictToSNV just assumed that alt(x) must be something with nchar() support (with special handling for any List), so that the 'character' vector of alt,VRanges would work immediately. Basically restrictToSNV should just be x[isSNV(x)]. Is there even a use-case for the restrictToSNV abstraction if we did that?
For a VCF instance it would be x[isSNV(x),], and indeed I think that would be sufficient. I like the idea of having this family of predicates for variant classes to allow such selections.
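To make the x[isSNV(x)] idea concrete, here is a minimal base R sketch of such a predicate family on plain character vectors, which is the case where nchar() works directly. The function names with the "Sketch" suffix are hypothetical, and the purely length-based definitions of insertion and deletion are assumptions made for illustration; they are not the VariantAnnotation implementations.

```r
# Hypothetical, length-based predicates on character vectors of REF and ALT
# alleles (one ALT per variant). These definitions are illustrative
# assumptions, not the ones used by VariantAnnotation.
isSNVSketch <- function(ref, alt) nchar(ref) == 1L & nchar(alt) == 1L
isInsertionSketch <- function(ref, alt) nchar(alt) > nchar(ref)
isDeletionSketch <- function(ref, alt) nchar(ref) > nchar(alt)
isIndelSketch <- function(ref, alt) {
  isInsertionSketch(ref, alt) | isDeletionSketch(ref, alt)
}

# Toy data: one row per variant, one ALT allele each.
variants <- data.frame(
  ref = c("G", "AA", "T", "G"),
  alt = c("A", "TT", "G", "TTC"),
  stringsAsFactors = FALSE
)

# "restrictToSNV" then reduces to logical subsetting with the predicate.
snvs <- variants[isSNVSketch(variants$ref, variants$alt), ]
indels <- variants[isIndelSketch(variants$ref, variants$alt), ]
```

With predicates of this shape, the restriction step is just row subsetting, which is the point made above about whether a separate restrictToSNV abstraction is needed.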
Michael
On Tue, Mar 18, 2014 at 10:36 AM, Valerie Obenchain <vobencha at fhcrc.org> wrote:
Hi,
I've added a restrictToSNV() function to VariantAnnotation (1.9.46). The return value is a subset VCF object containing SNVs only. The function operates on CollapsedVCF or ExpandedVCF, and the alt(x) value must be nucleotides (i.e., no structural variants).
A variant is considered a SNV if the nucleotide sequences in both ref(x) and alt(x) are of length 1. I have a question about how variants with multiple 'ALT' values should be handled.
Should we consider row 4 a SNV? One 'ALT' is length 1, the other is not.
ALT <- DNAStringSetList("A", c("TT"), c("G", "A"), c("TT", "C"))
REF <- DNAStringSet(c("G", c("AA"), "T", "G"))
DataFrame(REF, ALT)
DataFrame with 4 rows and 2 columns
             REF                ALT
  <DNAStringSet> <DNAStringSetList>
1              G                  A
2             AA                 TT
3              T                G,A
4              G               TT,C
Thanks. Valerie
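The row 4 question comes down to which rule is applied when a variant has several ALT alleles. The base R sketch below mirrors the example above with an ordinary list in place of a DNAStringSetList and compares three candidate rules: require every ALT to be length 1, require at least one, or expand to one row per ALT first. All variable names are hypothetical, and none of the rules is claimed to be what restrictToSNV() actually does.

```r
# Same four variants as in the example above, with plain base R types
# standing in for DNAStringSet / DNAStringSetList (an assumption made so
# the sketch runs without Bioconductor).
ref <- c("G", "AA", "T", "G")
alt <- list("A", "TT", c("G", "A"), c("TT", "C"))

refIsSingle <- nchar(ref) == 1L

# Rule 1 (strict): every ALT allele must have length 1.
allAltSingle <- vapply(alt, function(a) all(nchar(a) == 1L), logical(1))
strictSNV <- refIsSingle & allAltSingle

# Rule 2 (lenient): at least one ALT allele has length 1.
anyAltSingle <- vapply(alt, function(a) any(nchar(a) == 1L), logical(1))
lenientSNV <- refIsSingle & anyAltSingle

# Rule 3 (expand first): one row per REF/ALT pair, then test each pair,
# so the multi-ALT ambiguity does not arise.
expandedRef <- rep(ref, lengths(alt))
expandedAlt <- unlist(alt)
expandedSNV <- nchar(expandedRef) == 1L & nchar(expandedAlt) == 1L

print(data.frame(ref, strictSNV, lenientSNV))
```

Under the strict rule row 4 is dropped, under the lenient rule it is kept, and after expansion only its G to C allele is kept.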
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319