Yes, if a formal extension is warranted. The metadata slot could also be
used.
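
A minimal sketch of the metadata() route, assuming the current SummarizedExperiment package; the toy counts, the "batch" and "aligned" columns, and the "quality" list entry are invented for illustration:

```r
library(SummarizedExperiment)

# Toy data: 4 genes x 3 samples (all values are made up).
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), c("A", "B", "C")))
se <- SummarizedExperiment(
    assays = list(counts = counts),
    colData = DataFrame(batch = c("b1", "b1", "b2"),
                        row.names = colnames(counts)))

# Hypothetical sample-quality table, kept apart from the experimental info.
quality <- DataFrame(aligned = c(0.91, 0.88, 0.95),
                     row.names = colnames(se))
metadata(se)$quality <- quality

# metadata() is a plain list, so its contents are carried along unchanged
# when samples are subset; the quality table keeps all of its rows.
sub <- se[, 1:2]
nrow(metadata(sub)$quality)
```

The trade-off is that anything stored in metadata() has to be kept in sync with the samples by hand.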
On Thu, Jun 18, 2015 at 2:59 PM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
I think the cleaner solution for Davide (if he insists on having
objects; I decided against it in minfi) is to extend the class to allow
this.
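
One way such an extension might look, as a rough sketch only: the class name "QCExperiment", the "quality" slot, and the sampleQuality() accessor are hypothetical names, not part of minfi or any Bioconductor package.

```r
library(SummarizedExperiment)

# Hypothetical subclass with an extra per-sample quality table.
setClass("QCExperiment",
         contains = "SummarizedExperiment",
         slots = c(quality = "DataFrame"),
         validity = function(object) {
             if (nrow(object@quality) != ncol(object))
                 return("'quality' must have one row per sample")
             TRUE
         })

# Plain accessor function (hypothetical name).
sampleQuality <- function(x) x@quality

# Toy data: 4 genes x 3 samples (all values are made up).
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), c("A", "B", "C")))
se <- SummarizedExperiment(
    assays = list(counts = counts),
    colData = DataFrame(batch = c("b1", "b1", "b2"),
                        row.names = colnames(counts)))

# Build the subclass instance from the parent object plus the new slot.
qc <- new("QCExperiment", se,
          quality = DataFrame(aligned = c(0.91, 0.88, 0.95),
                              row.names = colnames(se)))
sampleQuality(qc)
```

A real extension would also need its own subsetting method so that the extra slot stays aligned with the samples; this sketch does not provide one.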
Kasper
On Thu, Jun 18, 2015 at 12:25 AM, Ryan <rct at thompsonclan.org> wrote:
Oh wow, I didn't know you could put a DataFrame into a single column of
another DataFrame. That actually solves a problem for me too (I don't
intend to expose nested DataFrames to the users though).
On 6/17/15 7:23 PM, Martin Morgan wrote:
On 06/17/2015 11:41 AM, davide risso wrote:
Dear list,
I'm creating an R package to store RNA-seq data of a somewhat large project
in which I'm involved.
One of the initial goals is to compare different pre-processing pipelines,
hence I have multiple expression matrices corresponding to the same samples.
The SummarizedExperiment class seems a good candidate, since I have
multiple expression matrices with the same rowData and colData information.
I have several sample-specific variables that I want to store with the
object, namely, experimental information (e.g., batch, date,
condition, ...) and sample quality (e.g., proportion of aligned reads,
total duplicate reads, etc.).
Of course, I can always create one big data frame concatenating the two
(experimental info + sample quality), but it seems that, both conceptually
and practically, it might be useful to have two separate data frames.
Since this seems a reasonably standard type of information that
one would want to carry along, I was wondering if it would be possible /
useful to allow the user to have multiple data.frames in the colData
Actually, colData() is a DataFrame, and a DataFrame column can itself be a
DataFrame. So after
example(SummarizedExperiment)
we could make some faux sample quality data
quality = DataFrame(x=1:6, y=6:1, row.names=colnames(se1))
add this as a column in the colData()
colData(se1)$quality = quality
(or create the SummarizedExperiment from a similar DataFrame up-front)
and manage our grouped data
colData(se1)
DataFrame with 6 rows and 2 columns
    Treatment     quality
  <character> <DataFrame>
A        ChIP    ########
B       Input    ########
C        ChIP    ########
D       Input    ########
E        ChIP    ########
F       Input    ########
colData(se1[,1:2])$quality
DataFrame with 2 rows and 2 columns
          x         y
  <integer> <integer>
A         1         6
B         2         5
I'm not sure that this is any less confusing to the end user than
to manage a DataFrameList(), but it does not require any new features.
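
For readers who want to try this without the objects created by example(), here is a self-contained sketch of the same idea. It assumes the current SummarizedExperiment package; the toy counts and the object name "se" are invented for illustration:

```r
library(SummarizedExperiment)

# Toy data: 4 genes x 6 samples (all values are made up).
counts <- matrix(1:24, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), LETTERS[1:6]))
se <- SummarizedExperiment(
    assays = list(counts = counts),
    colData = DataFrame(Treatment = rep(c("ChIP", "Input"), 3),
                        row.names = LETTERS[1:6]))

# Nest the faux quality table as a single column of colData().
quality <- DataFrame(x = 1:6, y = 6:1, row.names = colnames(se))
colData(se)$quality <- quality

# Subsetting samples subsets the nested table along with everything else.
sub <- se[, c("B", "D")]
colData(sub)$quality
```

Because the nested table lives inside colData(), it follows the samples through subsetting with no extra bookkeeping.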
Martin
of SummarizedExperiment.
Best,
Davide