
[Bioc-devel] [devteam-bioc] A more flexible GenomeInfoDb::mapSeqlevels(): used supported info but don't break with new organisms/toy examples

Hi Leo,

To summarize, these are the concerns raised in your previous email:

*a) mapSeqlevels() should return the current style along with other mapped styles?*
(Look at extendedMapSeqlevels().)

I am looking into this. I have a few thoughts and questions for you.

The main idea of mapSeqlevels() is to map the current seqlevelStyle to 
other seqlevelStyles.

"I was also expecting mapSeqlevels() to return the same input if the
specified 'style' was the same as the one currently being used."


Returning the existing seqlevelStyle along with the mapped seqlevelStyle
sounds a little redundant, given that the user is already supplying it when
calling mapSeqlevels() and so already knows it.

"For example, if I was working with Homo sapiens NCBI style and attempted
to map to NCBI, I was expecting the same output as the input."


Why would you want to map it back to NCBI? Please explain your use case
more concretely.

I'd be happy to make changes to mapSeqlevels() and look at your function
more deeply once I understand the use case better. Thanks for clarifying this.
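
To check that I am reading the request correctly, here is a minimal base-R sketch of the behaviour I think you are describing. The lookup table toy_styles and the function mapWithIdentity() are made up for illustration and are not part of GenomeInfoDb; only three sequences are included.

```r
# Toy lookup table: one row per sequence, one column per naming style.
# Hypothetical data, not the table that GenomeInfoDb ships.
toy_styles <- data.frame(
  NCBI = c("1", "2", "X"),
  UCSC = c("chr1", "chr2", "chrX"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map 'seqnames' to 'style'. Names that are already
# in the requested style come back unchanged; unknown names stay NA.
mapWithIdentity <- function(seqnames, style, table = toy_styles) {
  stopifnot(style %in% colnames(table))
  mapped <- rep(NA_character_, length(seqnames))
  for (from in colnames(table)) {
    idx <- match(seqnames, table[[from]])      # row of each name in this style
    hit <- !is.na(idx) & is.na(mapped)         # found here and not yet mapped
    mapped[hit] <- table[[style]][idx[hit]]
  }
  names(mapped) <- seqnames
  mapped
}

mapWithIdentity(c("1", "2", "X"), "NCBI")  # same style: input comes back
mapWithIdentity(c("1", "2", "X"), "UCSC")  # different style: names are mapped
```

If this is the behaviour you need, the open question is still when a caller would request the style the names are already in.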

*b) Do you think it would be helpful to you and other developers if I export
.guessSpeciesStyle() and .supportedSeqnameMappings() from GenomeInfoDb?*

*c) .guessSpeciesStyle('2') should return all possible hits for "seqnames"*

This was a great find! I have fixed it in version 1.3.2. Thanks for
figuring this out!
Does the following style of output make sense to you? If multiple hits
are found, they are returned in "species" and "style" respectively.

 > .guessSpeciesStyle(c(paste0("chr",1:10)))
$species
[1] "Homo sapiens" "Mus musculus"

$style
[1] "UCSC" "UCSC"

 > .guessSpeciesStyle(c(paste0("chr",1:22)))
$species
[1] "Homo sapiens"

$style
[1] "UCSC"

 > .guessSpeciesStyle("chr2")
$species
[1] "Homo sapiens" "Mus musculus"

$style
[1] "UCSC" "UCSC"

 > .guessSpeciesStyle("2")
$species
 [1] "Arabidopsis thaliana"    "Arabidopsis thaliana"    "Cyanidioschyzon merolae"
 [4] "Homo sapiens"            "Mus musculus"            "Oryza sativa"
 [7] "Oryza sativa"            "Populus trichocarpa"     "Zea mays"
[10] "Zea mays"

$style
 [1] "NCBI"   "TAIR10" "NCBI"   "NCBI"   "NCBI"   "NCBI"   "MSU6"   "JGI2"   "NCBI"   "AGPvF"
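
For anyone who wants to see the idea without the package internals, the "return every hit" behaviour can be sketched in base R as below. This is an illustration only: toy_supported is a made-up four-entry table, guessAllHits() is a hypothetical name, and the tie rule (keep every entry that matches the largest number of input names) is an assumption rather than a description of the actual .guessSpeciesStyle() code.

```r
# Toy table of supported seqnames per (species, style) pair.
# Hypothetical data for illustration only.
toy_supported <- list(
  list(species = "Homo sapiens", style = "UCSC",
       seqnames = paste0("chr", c(1:22, "X", "Y"))),
  list(species = "Mus musculus", style = "UCSC",
       seqnames = paste0("chr", c(1:19, "X", "Y"))),
  list(species = "Homo sapiens", style = "NCBI",
       seqnames = c(1:22, "X", "Y")),
  list(species = "Mus musculus", style = "NCBI",
       seqnames = c(1:19, "X", "Y"))
)

# Hypothetical helper: return every (species, style) pair that ties for the
# largest number of matched seqnames, or NULL if nothing matches at all.
guessAllHits <- function(seqnames, supported = toy_supported) {
  n_hits <- vapply(supported,
                   function(entry) sum(seqnames %in% entry$seqnames),
                   integer(1))
  if (max(n_hits) == 0L) return(NULL)
  best <- which(n_hits == max(n_hits))
  list(
    species = vapply(supported[best], function(e) e$species, character(1)),
    style   = vapply(supported[best], function(e) e$style, character(1))
  )
}

guessAllHits(paste0("chr", 1:10))  # ambiguous: human and mouse both match
guessAllHits(paste0("chr", 1:22))  # only human has chr20-chr22 in this table
guessAllHits("2")                  # ambiguous again, this time in NCBI style
```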

*seqlevelsStyle() also returns multiple styles now (if it can't guess the
correct one!)*

 > seqnames <- c(paste0("chr",1:22))
 > seqlevelsStyle(seqnames)
[1] "UCSC"

 > seqnames <- "2"
 > seqlevelsStyle(seqnames)
warning! Multiple seqnameStyles found.
[1] "NCBI"   "TAIR10" "MSU6"   "JGI2"   "AGPvF"
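
The step from the per-species hits to the style vector above is essentially a de-duplication plus a warning. A rough base-R sketch of that step follows; reportStyles() is a hypothetical helper written for this email, not the code inside seqlevelsStyle(), and the input vector is simply the "style" element printed earlier for "2".

```r
# Hypothetical helper: collapse per-species style hits to the distinct
# styles, warning when more than one style remains.
reportStyles <- function(hit_styles) {
  styles <- unique(hit_styles)
  if (length(styles) > 1L) {
    warning("Multiple seqnameStyles found.")
  }
  styles
}

# The "style" element shown earlier for .guessSpeciesStyle("2").
hits <- c("NCBI", "TAIR10", "NCBI", "NCBI", "NCBI",
          "NCBI", "MSU6", "JGI2", "NCBI", "AGPvF")

styles <- reportStyles(hits)  # emits the warning, keeps first-seen order
styles
```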

Thanks again for your feedback,
Sonali.

On 10/21/2014 8:47 PM, Maintainer wrote: