[Bioc-devel] coerce ExpressionSet to SummarizedExperiment

16 messages · Levi Waldron, Martin Morgan, Michael Lawrence +3 more


Levi Waldron

Sun, Sep 10, 2017 5:38 PM #

I just dug up this old thread because I realized we still don't have a
coercion method as(sample.ExpressionSet, "SummarizedExperiment"). Since we
do have SummarizedExperiment(sample.ExpressionSet), could the coercion
method also be added easily?

> library(Biobase)
> library(SummarizedExperiment)
> example("ExpressionSet")
> SummarizedExperiment(sample.ExpressionSet)
class: SummarizedExperiment
dim: 500 26
metadata(0):
assays(1): ''
rownames(500): AFFX-MurIL2_at AFFX-MurIL10_at ... 31738_at 31739_at
rowData names(0):
colnames(26): A B ... Y Z
colData names(0):
> as(sample.ExpressionSet, "SummarizedExperiment")
Error in as(sample.ExpressionSet, "SummarizedExperiment") :
  no method or default for coercing “ExpressionSet” to “SummarizedExperiment”

> sessionInfo()
R version 3.4.0 RC (2017-04-20 r72569)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] SummarizedExperiment_1.7.5 DelayedArray_0.3.16        matrixStats_0.52.2
[4] GenomicRanges_1.29.6       GenomeInfoDb_1.13.4        IRanges_2.11.7
[7] S4Vectors_0.15.5           Biobase_2.37.2             BiocGenerics_0.23.0

loaded via a namespace (and not attached):
 [1] lattice_0.20-35         bitops_1.0-6            grid_3.4.0
 [4] zlibbioc_1.23.0         XVector_0.17.0          Matrix_1.2-11
 [7] tools_3.4.0             RCurl_1.95-4.8          compiler_3.4.0
[10] GenomeInfoDbData_0.99.1

On Mon, Sep 22, 2014 at 1:54 AM, Hervé Pagès <hpages at fhcrc.org> wrote:

Hi,

On 09/20/2014 11:14 AM, Martin Morgan wrote:

On 09/20/2014 10:43 AM, Sean Davis wrote:

Hi, Vince.

Looks like a good start.  I'd probably pull all the assays from
ExpressionSet into SummarizedExperiment as the default, avoiding data
coercion methods that are unnecessarily lossy.  Also, as it stands, the
assayname argument is not used anyway?

I think there will be some resistance to uniting the 'Biobase' and
'IRanges' realms under 'GenomicRanges';

This coercion method could be defined (1) in Biobase (where
ExpressionSet is defined), (2) in GenomicRanges (where
SummarizedExperiment is defined), or (3) in a package that
depends on Biobase and GenomicRanges.

Since it's probably undesirable to make Biobase depend on GenomicRanges
or vice-versa, we would need to use Suggests for (1) or (2). That
means we would get a note like this at installation time:

 ** preparing package for lazy loading
 in method for ‘coerce’ with signature ‘"ExpressionSet","SummarizedExperiment"’:
 no definition for class “SummarizedExperiment”

Not very clean but it works.

(3) is a cleaner solution but then the coercion method would
not necessarily be available to the user when s/he needs it (unless
s/he knows what extra package to load). The obvious advantage of
putting the method in Biobase is that if a user has an ExpressionSet,
then s/he necessarily has Biobase attached and the method is already
in her/his search path.

Another solution would be (4) to move SummarizedExperiment somewhere
else. That would be in a package that depends on GenomicRanges and
Biobase, and the coercion method would be defined there.
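
To make the options above concrete, here is a minimal sketch of how a coercion method of this kind is registered with setAs(). The classes ToyExpressionSet and ToySummarizedExperiment are hypothetical stand-ins invented for this illustration, not the real Biobase or SummarizedExperiment classes. The only point is that as() has nothing to dispatch to until the code containing the setAs() call has been loaded, which is why the choice of package matters.

```r
library(methods)

# Hypothetical toy classes (assumptions for illustration only).
setClass("ToyExpressionSet",
         representation(exprs = "matrix", pheno = "data.frame"))
setClass("ToySummarizedExperiment",
         representation(assays = "list", colData = "data.frame"))

x <- new("ToyExpressionSet",
         exprs = matrix(1:6, nrow = 3,
                        dimnames = list(paste0("g", 1:3), c("A", "B"))),
         pheno = data.frame(sex = c("F", "M"), row.names = c("A", "B")))

# Before any method is registered, as() fails; capture the message.
before <- tryCatch(as(x, "ToySummarizedExperiment"),
                   error = function(e) conditionMessage(e))

# The method itself is a single setAs() call. Which package should
# contain this call is the question discussed in options (1) to (4).
setAs("ToyExpressionSet", "ToySummarizedExperiment", function(from) {
  new("ToySummarizedExperiment",
      assays = list(exprs = from@exprs),
      colData = from@pheno)
})

# After registration the same call succeeds.
after <- as(x, "ToySummarizedExperiment")
```

With option (3), the equivalent of the setAs() call would only run once the extra package is loaded, so a user holding an ExpressionSet would see the "before" behaviour until then.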

H.


considerable effort has gone in to making a rational hierarchy of
package dependencies [perhaps Herve will point to some of his ASCII
art on the subject].

I have some recollection of (recent) discussion related to this topic in
the DESeq2 realm, but am drawing a blank; presumably Michael or Wolfgang
or ... will chime in.

Martin

Sean


On Sat, Sep 20, 2014 at 10:38 AM, Vincent Carey <stvjc at channing.harvard.edu> wrote:

do we have a facility for this?

if not, we have

https://github.com/vjcitn/biocMultiAssay/blob/master/R/exs2se.R

https://github.com/vjcitn/biocMultiAssay/blob/master/man/coerce-methods.Rd


it occurred to me that we might want something like this in
GenomicRanges
(that's where SummarizedExperiment is managed, right?) and I will add it
if there are no objections

the arguments are currently

      assayname = "exprs",    # for naming SimpleList element
      fngetter =
            function(z) rownames(exprs(z)),   # extract usable feature names
      annDbGetter =
           function(z) {
               clnanno = sub(".db", "", annotation(z))
               stopifnot(require(paste0(annotation(z), ".db"), character.only=TRUE))
               get(paste0(annotation(z), ".db"))  # obtain resource for mapping feature names to coordinates
               },
      probekeytype = "PROBEID",   # chipDb field to use
      duphandler = function(z) {    # action to take to process duplicated features
           if (any(isd <- duplicated(z[,"PROBEID"])))
               return(z[!isd,,drop=FALSE])
           z
           },
      signIsStrand = TRUE,   # verify that signs of addresses define strand
      ucsdChrnames = TRUE    # prefix 'chr' to chromosome token


_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Levi Waldron
http://www.waldronlab.org
Assistant Professor of Biostatistics     CUNY School of Public Health
US: +1 646-364-9616                      Skype: levi.waldron


Martin Morgan

Mon, Sep 11, 2017 3:58 AM #

On 09/10/2017 08:38 PM, Levi Waldron wrote:

try as(sample.ExpressionSet, "RangedSummarizedExperiment"); see 
?makeSummarizedExperimentFromExpressionSet
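
A short sketch of the suggestion above, assuming Biobase and SummarizedExperiment are installed and using the sample.ExpressionSet dataset from Biobase. It shows the coercion and the explicit function call side by side; the help page named above describes how features are mapped to ranges.

```r
library(Biobase)
library(SummarizedExperiment)

data(sample.ExpressionSet)  # example dataset shipped with Biobase

# Coercion to the ranged class, as suggested in the reply.
rse <- as(sample.ExpressionSet, "RangedSummarizedExperiment")

# The explicit function named in the help-page reference.
rse2 <- makeSummarizedExperimentFromExpressionSet(sample.ExpressionSet)

class(rse)
ncol(rse)
colnames(colData(rse))  # sample annotation carried over from pData
```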


Levi Waldron

Mon, Sep 11, 2017 7:15 AM #

Thanks Martin! I see the RangedSummarizedExperiment coercion method works
when there are no mappable ranges (for example curatedMetagenomicData
ExpressionSet objects), although the rowRanges is a GRangesList of empty
elements. It might be worth also having a SummarizedExperiment coercion
method if it's not a problematic or big job. And now I suppose I can ask
the question I *really* wanted to know, which is why can't I coerce an
object that extends eSet? I can still use the SummarizedExperiment()
constructor, but for example:

> library(metagenomeSeq)
> data(mouseData)
> class(mouseData)
[1] "MRexperiment"
attr(,"package")
[1] "metagenomeSeq"
> is(mouseData, "ExpressionSet")
[1] FALSE
> is(mouseData, "eSet")
[1] TRUE
> SummarizedExperiment(mouseData)
class: SummarizedExperiment
dim: 10172 139
metadata(0):
assays(1): ''
rownames(10172): Prevotellaceae:1 Lachnospiraceae:1 ... Bryantella:103
  Parabacteroides:956
rowData names(0):
colnames(139): PM1:20080107 PM1:20080108 ... PM9:20080225 PM9:20080303
colData names(0):
> as(mouseData, "RangedSummarizedExperiment")
Error in as(mouseData, "RangedSummarizedExperiment") :
  no method or default for coercing “MRexperiment” to “RangedSummarizedExperiment”
> as(mouseData, "SummarizedExperiment")
Error in as(mouseData, "SummarizedExperiment") :
  no method or default for coercing “MRexperiment” to “SummarizedExperiment”
> as(mouseData, "ExpressionSet")
Error in updateOldESet(from, "ExpressionSet") :
  no slot of name "pData" for this object of class "AnnotatedDataFrame"
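
One plausible reading of the first two errors is that the coercion method is registered for ExpressionSet, whereas MRexperiment extends eSet directly, so it has no method to inherit. as() selects coerce methods using inheritance on the class being converted from, so a method registered on a sibling class is not found, while one registered on the common parent is. The sketch below illustrates this with hypothetical toy classes (ToyESet, ToyExprSet, ToyMRexp, ToySE); none of them are real Bioconductor classes.

```r
library(methods)

# Hypothetical hierarchy: a virtual parent with two sibling subclasses.
setClass("ToyESet", representation("VIRTUAL", data = "numeric"))
setClass("ToyExprSet", contains = "ToyESet")
setClass("ToyMRexp", contains = "ToyESet")
setClass("ToySE", representation(values = "numeric"))

# A coercion registered only for one sibling ...
setAs("ToyExprSet", "ToySE", function(from) new("ToySE", values = from@data))

mr <- new("ToyMRexp", data = c(1, 2, 3))

# ... is not available to the other sibling.
sibling_result <- tryCatch(as(mr, "ToySE"), error = function(e) "no method")

# Registering the coercion on the parent makes it available to every
# subclass, because dispatch inherits on the 'from' class.
setAs("ToyESet", "ToySE", function(from) new("ToySE", values = from@data))
parent_result <- as(mr, "ToySE")
```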





Ludwig Geistlinger

Mon, Sep 11, 2017 8:56 AM #

I guess we discussed this with Davide Risso @Bioc2017 in the
MultiAssayExperiment workshop.

SummarizedExperiment(mouseData)

puts the eSet (rather counterintuitively) into `assays` of
`SummarizedExperiment`; it does not really coerce it to
SummarizedExperiment, e.g. `pData` and `fData` are not transferred
to colData and rowData, respectively.

While I can understand that this is by design of `SummarizedExperiment`, I
really wonder whether there are use cases where somebody would want to put
an `ExpressionSet` in `assays` of `SummarizedExperiment` rather than
coerce it.

Furthermore, if you would indeed like to have several `ExpressionSet`s in
a `SummarizedExperiment`, haven't you already arrived at a scenario where
use of `MultiAssayExperiment` is indicated?
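
For comparison, a minimal sketch of what transferring the annotation by hand could look like, assuming Biobase and SummarizedExperiment are installed. The small ExpressionSet is built from made-up values (the names m, pd, fd and the columns group and symbol are invented for this example), and this is an illustration of the idea rather than what any existing coercion method does internally.

```r
library(Biobase)
library(SummarizedExperiment)

# Made-up data for illustration.
m <- matrix(c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5), nrow = 3,
            dimnames = list(paste0("g", 1:3), c("s1", "s2")))
pd <- data.frame(group = c("a", "b"), row.names = colnames(m))
fd <- data.frame(symbol = c("X", "Y", "Z"), row.names = rownames(m))

eset <- ExpressionSet(assayData = m,
                      phenoData = AnnotatedDataFrame(pd),
                      featureData = AnnotatedDataFrame(fd))

# Pass the pieces explicitly: expression matrix to assays,
# fData to rowData, pData to colData.
se <- SummarizedExperiment(
  assays  = list(exprs = exprs(eset)),
  rowData = DataFrame(fData(eset)),
  colData = DataFrame(pData(eset))
)

rowData(se)
colData(se)
```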


Dr. Ludwig Geistlinger
eMail: Ludwig.Geistlinger at bio.ifi.lmu.de

Levi Waldron

Mon, Sep 11, 2017 9:26 AM #

On Mon, Sep 11, 2017 at 11:56 AM, Ludwig Geistlinger <Ludwig.Geistlinger at bio.ifi.lmu.de> wrote:

Right, I had forgotten about that - this isn't a coercion but a
construction, which should be obvious from the use of a constructor
function. This behavior is intuitive if you remember that
SummarizedExperiment(assays, ...) is a constructor that accepts as assays
any object or list of objects supporting square bracket matrix-like
subsetting. Sorry for my brain hiccup there.

I think the behavior of the constructor SummarizedExperiment() here is
correct and expected; the issue is that we're actually looking for
coercion methods.

Michael Lawrence

Mon, Sep 11, 2017 9:58 AM #

It's probably good keeping coercion and construction distinct,
although we have violated that recently with GRanges(). It now
attempts to coerce its first argument to a GRanges. Don't want to
derail the discussion, but it's another data point.

Michael
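
To illustrate the distinction being drawn, here is a toy sketch of a constructor that falls back to coercion when given a single argument. The Interval class, its "start-end" string format and the constructor are all hypothetical and invented for this example; it is not how GRanges() is implemented, only the general pattern of a constructor delegating to as().

```r
library(methods)

# Hypothetical class for illustration.
setClass("Interval", representation(start = "numeric", end = "numeric"))

# Coercion: defines how a single "start-end" string becomes an Interval.
setAs("character", "Interval", function(from) {
  parts <- as.numeric(strsplit(from, "-", fixed = TRUE)[[1]])
  new("Interval", start = parts[1], end = parts[2])
})

# Constructor: builds from components, but delegates to the coercion
# when only one argument is supplied.
Interval <- function(x, end) {
  if (missing(end)) return(as(x, "Interval"))
  new("Interval", start = x, end = end)
}

built   <- Interval(10, 20)    # construction from components
coerced <- Interval("10-20")   # constructor delegating to as()
```

Keeping the two paths separate, as the message above suggests, would mean dropping the one-argument branch and asking callers to write as("10-20", "Interval") themselves.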


Hervé Pagès

Mon, Sep 11, 2017 11:02 AM #

Hi,

I added coercion from ExpressionSet to SummarizedExperiment in
SummarizedExperiment 1.7.6.

The current behavior of the SummarizedExperiment() constructor
when called on an ExpressionSet object doesn't make much sense to
me. I'd rather have it consistent with what the coercion does.
Will fix it.

Cheers,
H.
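
A quick way to try the new coercion, assuming an installation with SummarizedExperiment at version 1.7.6 or later and the sample.ExpressionSet dataset from Biobase:

```r
library(Biobase)
library(SummarizedExperiment)

data(sample.ExpressionSet)

# The coercion was added in SummarizedExperiment 1.7.6 (per the message above).
stopifnot(packageVersion("SummarizedExperiment") >= "1.7.6")

se <- as(sample.ExpressionSet, "SummarizedExperiment")

class(se)
assayNames(se)
colnames(colData(se))
```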

On 09/11/2017 09:58 AM, Michael Lawrence wrote:

It's probably good keeping coercion and construction distinct,
although we have violated that recently with GRanges(). It now
attempts to coerce its first argument to a GRanges. Don't want to
derail the discussion, but it's another data point.

Michael

On Mon, Sep 11, 2017 at 9:26 AM, Levi Waldron
<lwaldron.research at gmail.com> wrote:

On Mon, Sep 11, 2017 at 11:56 AM, Ludwig Geistlinger <
Ludwig.Geistlinger at bio.ifi.lmu.de> wrote:

I guess we discussed this with Davide Risso @Bioc2017 in the
MultiAssayExperiment workshop.

SummarizedExperiment(mouseData)

puts the eSet (rather counterintuitively) into `assays` of
`SummarizedExperiment`, it does not really coerce it to
SummarizedExperiment, eg. `fData` and `pData` are not accordingly
transferred to colData and rowData.

Right, I had forgotten about that - this isn't a coercion but a
construction, which should be obvious from the use of a constructor
function. This behavior is intuitive if you remember that
SummarizedExperiment(assays, ...) is a constructor that accepts as assays
any object or list of objects supporting square bracket matrix-like
subsetting. Sorry for my brain hiccup there.

While I can understand that this is by design of `SummarizedExperiment`, I
really wonder whether there are use cases where somebody would like to put
an `ExpressionSet` in `assays` of `SummarizedExperiment`, and not rather
would like to coerce it that way.

Furthermore, if you would indeed like to have several `ExpressionSet`s in
a `SummarizedExperiment`, haven't you already arrived at a scenario where
use of `MultiAssayExperiment` is indicated?

I think the behavior of the constructor SummarizedExperiment() here is
correct and expected, the issue here is that we're actually looking for
coercion methods.

         [[alternative HTML version deleted]]

_______________________________________________
Bioc-devel at r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=heIkyXEvaJsN7zonHbzq6mmEiCHM2Ke_vMIGj7UTjx4&s=y4N3gOdF0D5AcrbAJZR_6Ne8WxR26lIdwbIK_L1KwrA&e=

Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

Levi Waldron

Mon, Sep 11, 2017 11:56 AM #

On Mon, Sep 11, 2017 at 2:02 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

Thank you Hervé!

Thank you, again.

A couple more questions while I'm at it, that may expose the limitations in
my understanding of inheritance and project history... 1) Why have some
developers chosen to extend eSet instead of ExpressionSet (definition
<https://github.com/Bioconductor/Biobase/blob/536f137165ca08b3be22819e51e055b3e7afe86d/R/DataClasses.R#L166>),
and 2) why are these coercion methods developed for ExpressionSet rather
than eSet? Wouldn't an eSet coercion method be preferable because it would
cover ExpressionSet as well as all the classes that extend eSet?
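
To make question 2 concrete, here is a minimal sketch of how a coercion method defined on a parent class is picked up by its subclasses. It uses only the methods package; the Toy* classes are hypothetical stand-ins invented for illustration, not the real Biobase or SummarizedExperiment classes, so it shows the S4 dispatch mechanism only and says nothing about how the real methods are implemented.

```r
library(methods)

# Hypothetical toy classes: a virtual parent with two concrete subclasses.
setClass("ToyESet", representation("VIRTUAL", assays = "list"))
setClass("ToyExpressionSet", contains = "ToyESet")
setClass("ToyNChannelSet", contains = "ToyESet")

# Hypothetical coercion target.
setClass("ToySE", representation(assays = "list"))

# A single coercion method, defined once on the virtual parent class.
setAs("ToyESet", "ToySE", function(from) new("ToySE", assays = from@assays))

es <- new("ToyExpressionSet", assays = list(exprs = matrix(1:6, 2)))
nc <- new("ToyNChannelSet",
          assays = list(G = matrix(1:6, 2), R = matrix(6:1, 2)))

# as() searches for coerce methods with inheritance on the 'from' class,
# so both subclasses use the method defined for ToyESet.
se1 <- as(es, "ToySE")
se2 <- as(nc, "ToySE")
names(se2@assays)
```

Had the method been registered for ToyExpressionSet instead, the as(nc, "ToySE") call would fail with a "no method or default for coercing" error, since ToyNChannelSet does not inherit from ToyExpressionSet.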

Ludwig Geistlinger

Mon, Sep 11, 2017 12:31 PM #

Concerning 1) Why have some developers chosen to extend eSet instead of
ExpressionSet:

As far as I understand it, ExpressionSet was thought to exclusively
represent a microarray experiment (MIAME = Minimum Information About a
Microarray Experiment).

Thus, back in the days when more and more people started using RNA-seq and
there was no SummarizedExperiment, developers extended eSet with e.g.
assayData slots called `counts` instead of `exprs` to represent RNA-seq
data.

Dr. Ludwig Geistlinger
eMail: Ludwig.Geistlinger at bio.ifi.lmu.de

Kasper Daniel Hansen

Mon, Sep 11, 2017 12:39 PM #

An ExpressionSet is an eSet that is guaranteed to have an "exprs" assay.
That makes no sense, for example, for methylation, where we have (say)
Green/Red assays or Meth/Unmeth assays (or transformations of these).
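
A rough sketch of that guarantee, using hypothetical toy classes (ToyESet, ToyExpressionSet, ToyMethylSet are made up for illustration and are not the Biobase definitions): the subclass adds a validity check that insists on an "exprs" element, so data with only Meth/Unmeth assays has to extend the parent class instead.

```r
library(methods)

# Hypothetical toy parent class holding a list of assays.
setClass("ToyESet", representation("VIRTUAL", assays = "list"))

# Toy subclass whose validity method requires an assay named "exprs".
setClass("ToyExpressionSet", contains = "ToyESet",
         validity = function(object) {
             if ("exprs" %in% names(object@assays)) {
                 TRUE
             } else {
                 "assays must contain an element named 'exprs'"
             }
         })

# Toy subclass for methylation-like data, with no such requirement.
setClass("ToyMethylSet", contains = "ToyESet")

ok <- new("ToyExpressionSet", assays = list(exprs = matrix(0, 2, 2)))

meth <- list(Meth = matrix(0, 2, 2), Unmeth = matrix(0, 2, 2))

# Construction fails the validity check because there is no "exprs" assay.
msg <- tryCatch(new("ToyExpressionSet", assays = meth),
                error = function(e) conditionMessage(e))

# Extending the parent class directly works for the same data.
ms <- new("ToyMethylSet", assays = meth)
```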

Best,
Kasper


Levi Waldron

Mon, Sep 11, 2017 4:50 PM #

Thanks Ludwig and Kasper. This old presentation from Martin also helped me
a lot:

https://www.bioconductor.org/packages/devel/bioc/vignettes/Biobase/inst/doc/BiobaseDevelopment.pdf

But I still wonder, why provide the coercion for ExpressionSet, if
providing it for eSet would work not only for ExpressionSet but for
everything else derived from eSet? The coercion function seems to work fine
on the eSet-derived NChannelSet-class {the assays=as.list(assayData(from))
 seems to work regardless of the storage mode}:

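> library(Biobase)
> library(SummarizedExperiment)
> example("NChannelSet-class", echo=FALSE)
> class(obj)
[1] "NChannelSet"
attr(,"package")
[1] "Biobase"
> is(obj, "eSet")
[1] TRUE
> storageMode(obj)
[1] "lockedEnvironment"
> makeSummarizedExperimentFromExpressionSet(obj)
class: RangedSummarizedExperiment
dim: 10 3
metadata(3): experimentData annotation protocolData
assays(2): G R
rownames(10): 1 2 ... 9 10
rowData names(0):
colnames(3): A B C
colData names(3): ChannelRData ChannelGData ChannelRAndG
> as(obj, "RangedSummarizedExperiment")
Error in as(obj, "RangedSummarizedExperiment") :
  no method or default for coercing ‘NChannelSet’ to
‘RangedSummarizedExperiment’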

Hervé Pagès

Mon, Sep 11, 2017 5:09 PM #

I don't know the reasons behind this choice, I didn't implement
these methods. It would make sense to have these coercions defined
for the eSet,SummarizedExperiment and eSet,RangedSummarizedExperiment
signatures if they only access the eSet part of the object.
I'll look into this.

H.

Hervé Pagès

Kasper Daniel Hansen

Mon, Sep 11, 2017 6:06 PM #

I see how an eSet maps to a SummarizedExperiment; one can look at the
assays in the object.  I agree with Levi/Herve that this is the natural
(correct) choice.  It is less clear how you get the ranges for a
RangedSummarizedExperiment without making assumptions.

On Mon, Sep 11, 2017 at 8:09 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

I don't know the reasons behind this choice, I didn't implement
these methods. It would make sense to have these coercions defined
for the eSet,SummarizedExperiment and eSet,RangedSummarizedExperiment
signatures if they only access the eSet part of the object.
I'll look into this.

H.

On 09/11/2017 04:50 PM, Levi Waldron wrote:

Thanks Ludwig and Kasper. This old presentation from Martin also helped me
a lot:

https://www.bioconductor.org/packages/devel/bioc/vignettes/Biobase/inst/doc/BiobaseDevelopment.pdf

But I still wonder, why provide the coercion for ExpressionSet, if
providing it for eSet would work not only for ExpressionSet but for
everything else derived from eSet? The coercion function seems to work fine
on the eSet-derived NChannelSet-class {the assays=as.list(assayData(from))
 seems to work regardless of the storage mode}:

> library(Biobase)
> library(SummarizedExperiment)
> example("NChannelSet-class", echo=FALSE)
> class(obj)
[1] "NChannelSet"
attr(,"package")
[1] "Biobase"
> is(obj, "eSet")
[1] TRUE
> storageMode(obj)
[1] "lockedEnvironment"
> makeSummarizedExperimentFromExpressionSet(obj)
class: RangedSummarizedExperiment
dim: 10 3
metadata(3): experimentData annotation protocolData
assays(2): G R
rownames(10): 1 2 ... 9 10
rowData names(0):
colnames(3): A B C
colData names(3): ChannelRData ChannelGData ChannelRAndG
> as(obj, "RangedSummarizedExperiment")
Error in as(obj, "RangedSummarizedExperiment") :
  no method or default for coercing ‘NChannelSet’ to
‘RangedSummarizedExperiment’


1 day later

Ludwig Geistlinger

Wed, Sep 13, 2017 2:54 PM #

Coercing vice versa, i.e. from SummarizedExperiment to ExpressionSet,
which is defined in

SummarizedExperiment/R/makeSummarizedExperimentFromExpressionSet.R

as follows:

setAs("SummarizedExperiment", "ExpressionSet", function(from)
    as(as(from, "RangedSummarizedExperiment"), "ExpressionSet")
)

also seems to be a bit problematic, as it makes you lose your rowData/fData.



Here is an example:

## Constructing the SE similar to examples of ?SummarizedExperiment

row.names=LETTERS[1:6])


## some rowData with simulated gene IDs

1:200))

colData=colData, rowData=rowData)

# this is how it looks

DataFrame with 200 rows and 1 column
     EntrezID
    <integer>
1         289
2         476
3         608
4         998
5         684
...       ...
196       331
197       590
198       445
199        95
200       129

(why did I actually lose the rownames g1-g200 here?)


## Coercing to ExpressionSet makes me lose the rowData/fData

data frame with 0 columns and 200 rows


## So where is the problem?
## Apparently in the coercion
##    from SummarizedExperiment to RangedSummarizedExperiment

DataFrame with 200 rows and 0 columns

Hervé Pagès

Wed, Sep 13, 2017 5:59 PM #

Hi Ludwig,

Excellent catch! Thanks for the report.

This should be fixed in SummarizedExperiment release (1.6.4) and devel
(1.7.7).

Cheers,
H.

Hervé Pagès

Hervé Pagès

Wed, Sep 13, 2017 6:19 PM #

One more thing. See below...

On 09/13/2017 02:54 PM, Ludwig Geistlinger wrote:

Your rownames were moved to the names of the object:

 > head(names(se))
[1] "g1" "g2" "g3" "g4" "g5" "g6"

The rowData() accessor (like the mcols() accessor, note that rowData()
is just an alias for mcols) does not restore them by default, unless
you use 'use.names=TRUE'.

 > rowData(se, use.names=TRUE)
DataFrame with 200 rows and 1 column
       EntrezID
      <integer>
g1         616
g2          45
g3         944
g4         632
g5         270
...        ...
g196       827
g197       943
g198       291
g199       432
g200       106

All Vector derivatives do that (e.g. GRanges), not just
SummarizedExperiment.

The reason for this design is that the rownames of a DataFrame must be
unique (this is a base R requirement). By moving them from the DataFrame
containing the metadata columns to the names of the object, Vector
derivatives can be subsetted in a way that repeats some of their
elements. If the rownames were on the DataFrame containing the
metadata columns, these subsetting operations wouldn't be
possible.
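
For what it's worth, the base R behavior behind this can be seen without any Bioconductor package. The following is only a sketch with a plain data.frame and a plain named vector standing in for the metadata columns and the object names (the variable names and the three EntrezID values are illustrative assumptions):

```r
# A data.frame standing in for the metadata columns, with rownames attached.
df <- data.frame(EntrezID = c(616L, 45L, 944L),
                 row.names = c("g1", "g2", "g3"))

# Repeating a row forces base R to make the rownames unique again.
rownames(df[c(1, 1, 2), , drop = FALSE])
# [1] "g1"   "g1.1" "g2"

# Assigning duplicated rownames directly is an error.
dup_err <- suppressWarnings(
    tryCatch({ rownames(df) <- c("g1", "g1", "g2"); NULL },
             error = function(e) conditionMessage(e))
)

# Names carried by the object itself (here a named vector) may repeat freely.
x <- c(g1 = 616L, g2 = 45L, g3 = 944L)
names(x[c(1, 1, 2)])
# [1] "g1" "g1" "g2"
```

So with the names kept on the object rather than on the metadata columns, a subset that repeats elements keeps the original names instead of having them altered or rejected.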

Hope this makes sense,
H.

Hervé Pagès