[Bioc-devel] New SE or new assay in SE? - Bioc-devel

Tue, Jan 28, 2020 1:37 AM #

Dear all,

Assume we have a SummarizedExperiment object `se` that contains raw count data, and a method `doProcess` that processes the data to produce a matrix of identical dimensions (for example log-transformation, normalisation, imputation, ...). What are the opinions in favour of or against the following two options?

- `doProcess(se)` returns a new SE object 
- `doProcess(se)` adds a new assay to se

If you are interested in the broader context of this question, see https://github.com/waldronlab/MultiAssayExperiment/issues/266

Thank you in advance for your input.

Laurent

Hervé Pagès

Wed, Jan 29, 2020 8:29 AM #

On 1/28/20 01:37, Laurent Gatto wrote:

Aren't these the same?

SE objects are not reference objects, i.e. they follow R's standard 
copy-on-change semantics. This means that they never get modified **in 
place** (aka they're not "mutable"). So 'doProcess(se)' will always 
return a new object, whatever you do inside the function, that is, even 
if the function modifies 'se' internally, e.g. with something like:

   assay(se, "new_assay") <- new_assay

Note that the assay() setter itself, like all setters, also produces a new 
object. R actually evaluates the following code

   assay(se, "new_assay") <- new_assay

as if it were

   se <- `assay<-`(se, "new_assay", value=new_assay)

As you can see the previous `se` is replaced with the new one which 
gives the **illusion** of in-place replacement but it's not.
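
The point above can be checked with a small sketch. It assumes the SummarizedExperiment package is installed; the toy count matrix and the body of `doProcess` are made up for illustration and are not taken from any existing package.

```r
library(SummarizedExperiment)

# Toy data (hypothetical): 3 features x 2 samples
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(paste0("g", 1:3), c("s1", "s2")))
se <- SummarizedExperiment(assays = list(counts = counts))

# Hypothetical doProcess(): modifies its local copy of 'se' and returns it
doProcess <- function(se) {
  assay(se, "logcounts") <- log2(assay(se, "counts") + 1)
  se
}

se2 <- doProcess(se)

assayNames(se)   # the caller's object is untouched
assayNames(se2)  # the returned object carries the extra assay

# The same result, calling the replacement function explicitly
se3 <- `assay<-`(se, "logcounts", value = log2(assay(se, "counts") + 1))
identical(assayNames(se2), assayNames(se3))
```

The original `se` only changes if the caller reassigns it, e.g. `se <- doProcess(se)`.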

Hope this helps,
H.

_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

Hervé Pagès

Wed, Jan 29, 2020 8:47 AM #

Just after I pressed the "Send" button I realized that by returning a 
new SE object you probably meant returning an SE object with only the 
new assay in it. I would favor the other option i.e. 'doProcess(se)' 
adds a new assay to 'se'. I think that's what most workflows based on SE 
objects do.

This doesn't mean that you can't provide a lower-level function that 
returns the transformed data in a "naked" matrix (i.e. not wrapped 
inside an SE). This lets the (more advanced) user decide what they want 
to do with it, e.g. they can add it to the original SE:

     assay(se, "normalized") <- normalized_data

or wrap it in its own new SE:

     normalized <- SummarizedExperiment(list(normalized=normalized_data))
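
One way to lay out the two levels is sketched below, assuming the SummarizedExperiment package is installed. The names `logTransform`, `pseudocount`, `from` and `to` are hypothetical and only serve the illustration; the log-transformation stands in for whatever processing `doProcess` performs.

```r
library(SummarizedExperiment)

# Hypothetical low-level function: plain matrix in, plain matrix out
logTransform <- function(x, pseudocount = 1) {
  log2(x + pseudocount)
}

# Hypothetical SE-level wrapper: adds the result as a new assay
doProcess <- function(se, from = "counts", to = "logcounts") {
  assay(se, to) <- logTransform(assay(se, from))
  se
}

# Toy data (made up for the example)
counts <- matrix(c(0, 1, 3, 7), nrow = 2,
                 dimnames = list(c("g1", "g2"), c("s1", "s2")))
se <- SummarizedExperiment(assays = list(counts = counts))

# Typical workflow: keep everything in one SE
se <- doProcess(se)

# Advanced use: get the naked matrix and wrap it in its own SE
normalized_data <- logTransform(assay(se, "counts"))
standalone <- SummarizedExperiment(list(normalized = normalized_data))
```

With this split, the wrapper stays a thin layer over the matrix-level function, so both usages share one implementation.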

H.

On 1/29/20 08:29, Pages, Herve wrote:
