[Bioc-devel] seqnames of SNPlocs.*
Hi Peter, Yes, as Vince said, the chromosome names are those used by dbSNP. For whatever reason, dbSNP, which is part of NCBI, felt the need to use a different naming convention than the rest of NCBI :-/
On 06/17/2014 07:57 PM, Peter Hickey wrote:
Thanks for the explanation, Vincent. GenomeInfoDb has NCBI and UCSC support, but doesn't seem to support the dbSNP format. Perhaps this should be added?
The seqlevelsStyle() setter first requires that the seqlevels() setter
works on a SNPlocs object, which itself requires that the seqinfo()
setter works. Unfortunately, it doesn't at the moment:
> library(SNPlocs.Hsapiens.dbSNP.20120608)
> snps <- SNPlocs.Hsapiens.dbSNP.20120608
> seqlevels(snps) <- sub("^ch", "chr", seqlevels(snps))
Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘seqinfo<-’ for
  signature ‘"SNPlocs"’
Something I'm adding to my list.
In the meantime you can do the renaming on the GRanges objects
you extract with 'getSNPlocs(..., as.GRanges=TRUE)' or with
'rsidsToGRanges(...)'. Maybe it's not very convenient to have to do
this each time you extract SNPs into a GRanges object, but OTOH it's
really easy these days now that we have seqlevelsStyle().
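As a rough sketch of that workaround (not taken from the SNPlocs package itself): the toy GRanges below stands in for what 'getSNPlocs(..., as.GRanges=TRUE)' would return, and its positions are made up for illustration. It uses the same sub() call as above, since that only needs the seqlevels() setter, which does work on GRanges objects.

```r
library(GenomicRanges)  # provides GRanges(), IRanges() and seqlevels() via its dependencies

# Hypothetical stand-in for a GRanges extracted from a SNPlocs package;
# the seqnames follow the dbSNP "ch" convention, the positions are invented.
gr <- GRanges(seqnames = c("ch1", "ch2", "ch22"),
              ranges = IRanges(start = c(100, 200, 300), width = 1))

# Rename the sequence levels in place: dbSNP-style "ch" prefix -> "chr" prefix.
seqlevels(gr) <- sub("^ch", "chr", seqlevels(gr))

seqlevels(gr)
```

Note that the pattern "^ch" also matches names that already start with "chr", so this should only be applied once to freshly extracted objects.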
Hope this helps.
Cheers,
H.
seqlevelsStyle(seqnames(SNPlocs.Hsapiens.dbSNP.20120608))
Error in .guessSpeciesStyle(seqnames) :
  The style does not have a compatible entry for the species supported by Seqname. Please see genomeStyles() for supported species/style

On 18/06/2014, at 12:40 PM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
it is the convention used in dbSNP, just propagated directly. indeed one typically has to relabel, but there is seqnamesStyle infrastructure in GenomeInfoDb that may help.

On Tue, Jun 17, 2014 at 8:17 PM, Peter Hickey <hickey at wehi.edu.au> wrote:

Is there a reason why the seqnames of SNPlocs.Hsapiens.dbSNP.20120608 (and possibly the other SNPlocs.*) use the prefix "ch" instead of "chr"? E.g. "ch1" instead of "chr1". It doesn't seem to fit with any standard way of naming chromosomes and means that these need to be renamed to use with most other Bioconductor data sources.

Thanks,
Pete

--------------------------------
Peter Hickey,
PhD Student/Research Assistant,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Ph: +613 9345 2324
hickey at wehi.edu.au
http://www.wehi.edu.au
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319