hi,
i collaborate in maintaining the packages GSVA and GSVAdata and i have a
question about the function mapIdentifiers() from the GSEABase package,
which i'm going to illustrate through an example.
1. let's first build an ExpressionSet object whose annotation slot
points to the human organism-level annotation package
org.Hs.eg.db:
library(Biobase)
library(org.Hs.eg.db)
mapped_genes <- mappedkeys(org.Hs.egSYMBOL)
exp <- matrix(rnorm(1000), nrow=100,
              dimnames=list(mapped_genes[1:100],
                            paste("sample", 1:10, sep="")))
eset <- new("ExpressionSet", exprs=exp, annotation="org.Hs.eg.db")
eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 100 features, 10 samples
element names: exprs
protocolData: none
phenoData: none
featureData: none
experimentData: use 'experimentData(object)'
Annotation: org.Hs.eg.db
2. now i'm going to load the Broad gene sets, stored as a
GeneSetCollection object in the experiment data package GSVAdata:
library(GSVAdata)
data(c2BroadSets)
c2BroadSets
GeneSetCollection
names: NAKAMURA_CANCER_MICROENVIRONMENT_UP,
NAKAMURA_CANCER_MICROENVIRONMENT_DN, ...,
ST_PHOSPHOINOSITIDE_3_KINASE_PATHWAY (3272 total)
unique identifiers: 5167, 100288400, ..., 57191 (29340 total)
types in collection:
geneIdType: EntrezIdentifier (1 total)
collectionType: BroadCollection (1 total)
3. finally, i'd like to obtain a new GeneSetCollection object whose
identifiers have been mapped from the class of identifiers used in the
GeneSetCollection to the one used in the ExpressionSet object.
in this case both objects actually work with the same class of
identifiers (Entrez), so in fact i don't need to do that, but this
operation forms part of a piece of code in the package GSVA which i'd
like to work regardless of the kind of annotation package referred to
in the ExpressionSet object. i had expected that the function
mapIdentifiers() would have some kind of idempotent behavior, but i get
the following error:
gsc <- mapIdentifiers(c2BroadSets,
AnnotationIdentifier(annotation(eset)))
Error in GeneSetCollection(lapply(what, mapIdentifiers, to, ..., verbose
= verbose)) :
error in evaluating the argument 'object' in selecting a method for
function 'GeneSetCollection': Error in get(mapName, envir = pkgEnv,
inherits = FALSE) :
object 'org.Hs.egENTREZID' not found
which does not occur if the feature names and annotation of the
ExpressionSet correspond to a classical affy chip (e.g. "hgu95av2").
i built the object c2BroadSets in the experiment data package GSVAdata
by importing the entire xml file from the Broad sets, so i guess it is
also possible that i did something wrong when i built this
'c2BroadSets' object and that there's no problem, bug or missing
feature in mapIdentifiers().
i look forward to your diagnosis and suggestions in any of these
possible directions.
thanks,
robert.
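The idempotent behaviour asked for above could be approximated on the caller's side with a small guard. What follows is only a sketch in base R, not the GSEABase implementation: needsMapping() is a hypothetical helper, and it assumes that every annotation package named like "org.Xx.eg" or "org.Xx.eg.db" is keyed by Entrez identifiers.

```r
# Hypothetical helper (not part of GSEABase): decide whether identifiers
# of class 'idType' must be translated for 'annotationPkg'.
needsMapping <- function(idType, annotationPkg) {
  # Assumption: organism-level packages such as "org.Hs.eg.db" use
  # Entrez gene identifiers as their primary keys.
  entrezBased <- grepl("^org\\.[A-Za-z]+\\.eg(\\.db)?$", annotationPkg)
  # No mapping is needed only when both sides are already Entrez.
  !(idType == "EntrezIdentifier" && entrezBased)
}

needsMapping("EntrezIdentifier", "org.Hs.eg.db")  # FALSE: same id class
needsMapping("EntrezIdentifier", "hgu95av2")      # TRUE: chip probe ids
```

In calling code, mapIdentifiers() would then be invoked only when the guard returns TRUE, with the first argument taken from the identifier type of the gene sets and the second from annotation(eset).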
[Bioc-devel] idempotent identifier mapping with GSEABase::mapIdentifiers()
4 messages · Vincent Carey, Martin Morgan, Robert Castelo
On 02/27/2012 04:45 AM, Vincent Carey wrote:
I have run into a very similar situation. Ultimately a uniformization of the annotation API will be called for. I wonder if a global short-term fixup would get you through this situation?
org.Hs.egENTREZID = new.env(hash=TRUE)
k = mappedkeys(org.Hs.egENSEMBL) # or any other good source of all keys
for (i in 1:length(k)) assign(k[i], k[i], org.Hs.egENTREZID)
get("1000", org.Hs.egENTREZID)
[1] "1000"
The issue seems to be in GSEABase:::.mapIdentifiers_selectMaps, where org packages are handled specially, but apparently not in a general enough way; I'll look into this.
Martin
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
thanks Vincent, what you suggest fixes the situation temporarily and hopefully, as Martin said in his previous message, this can have a more generic solution.
your suggestion makes me think that, in fact, it could be of general interest to add a *ENTREZID (identity) map to every entrez-based organism-level annotation package. i think this could be useful in every situation in which one would like to programmatically retrieve the entrez id of a feature using any annotation package, without knowing whether the feature is already an entrez id.
robert.
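The identity-map idea can be illustrated without any annotation package loaded. This is a minimal sketch in base R: the keys are three Entrez ids that appear in this thread and stand in for something like mappedkeys(org.Hs.egENSEMBL), and both entrezIdentity and toEntrez() are hypothetical names, not objects provided by org.Hs.eg.db or GSEABase.

```r
# Stand-in for the full key set of an organism-level package.
keys <- c("1000", "5167", "57191")

# Identity map built in one step instead of an explicit loop:
# every key is stored in the environment under its own name.
entrezIdentity <- list2env(as.list(setNames(keys, keys)))

# Hypothetical lookup helper: returns the Entrez id for known keys
# and NA for anything that is not in the map.
toEntrez <- function(ids, map) {
  vapply(ids, function(id) {
    if (exists(id, envir = map, inherits = FALSE)) {
      get(id, envir = map, inherits = FALSE)
    } else {
      NA_character_
    }
  }, character(1), USE.NAMES = FALSE)
}

toEntrez(c("1000", "42"), entrezIdentity)
# [1] "1000" NA
```

With such a map, the same lookup code works whether or not the features are already Entrez ids; ids missing from the package simply come back as NA.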