
[Bioc-devel] eSet for aCGH data

5 messages · Vosse, S.J., Sean Davis, Vincent Carey

#
Vosse, S.J. wrote:
If I were going to do this, I would think about separate subclasses for 
raw data, normalized data, and segmented data.  If you think about 
methods for each class, you will probably want them to behave 
differently based on whether they are operating on raw, normalized, or 
segmented data.  Having separate subclasses makes this pretty 
straightforward.  Just a thought, though.

Sean
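A minimal sketch of the separate-subclass idea, using only base R's methods package rather than Biobase. The class names (cghData, cghRaw, cghNormalized, cghSegmented) and the generics (nClones, yLabel) are hypothetical and chosen purely for illustration; a real design would presumably extend eSet instead of holding a bare matrix.

```r
library(methods)

# Hypothetical class names, for illustration only (not from any package).
# A virtual parent holds what every stage has in common.
setClass("cghData", representation("VIRTUAL", values = "matrix"))
setClass("cghRaw", contains = "cghData")
setClass("cghNormalized", contains = "cghData")
setClass("cghSegmented", contains = "cghData")

# Shared behaviour is written once, against the virtual parent.
setGeneric("nClones", function(object) standardGeneric("nClones"))
setMethod("nClones", "cghData", function(object) nrow(object@values))

# Stage-specific behaviour is selected by S4 dispatch on the subclass.
setGeneric("yLabel", function(object) standardGeneric("yLabel"))
setMethod("yLabel", "cghRaw", function(object) "intensity")
setMethod("yLabel", "cghNormalized", function(object) "log2 ratio")
setMethod("yLabel", "cghSegmented", function(object) "segment mean")

m <- matrix(c(0.1, -0.4, 0.8, 0.0), nrow = 2,
            dimnames = list(c("clone1", "clone2"), c("s1", "s2")))
norm <- new("cghNormalized", values = m)
seg <- new("cghSegmented", values = m)
```

With this layout, yLabel(norm) and yLabel(seg) reach different methods, while nClones() is inherited unchanged by every stage.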
#
You might have a look at the Neve2006 package, which is only in the devel
experiment data branch (currently labeled 2.1 on the web site).

I did not push this package into release because of a lack of consensus
on the preferred representation of aCGH data.  We have a number of
packages, like aCGH, DNAcopy and snapCGH, that use their own representations.

cghSet and cghExSet are defined in Neve2006.  cghExSet confronts the
problem of managing expression and CGH data obtained on the same
samples.

I question whether you should have a class that manages
raw, normalized and segmented data together.  We have used stagewise
representations in the expression domain, with containers like oligoBatch
for the raw intensities and ExpressionSet devoted to expression level
quantifications that will be analyzed downstream.  Binding data together
at various levels of processing may have some benefits, but also many
costs if the data are voluminous.

For what you mention, it seems that the normalized and/or called data and
the regions data would be in the assayData and featureData eSet slots, respectively.

Once we get some consensus among multiple developers/users in place, it
is likely that a central eSet derivative class devoted to aCGH data would
be defined in Biobase (or some relevant, primarily core-maintained package)
for interested developers to use.

We can start a wiki page on the developer's wiki devoted to this topic
if there is sufficient interest.
The information transmitted in this electronic communication...{{dropped}}
#
I guess it would indeed be easiest to define a separate eSet subclass
for each of the different 'stages' of aCGH data, namely raw, normalized,
segmented, called and regions. For this, the ExpressionSet class can be
used with virtually no changes, except perhaps for some methods. An
alternative would be a single class with a slot that defines the type of
data contained in the class, and thus the way methods behave.
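The single-class alternative could be sketched as follows, again with base R's methods package only. The class name cghStaged, the slot name stage and the generic stageLabel are hypothetical; the list of stages is taken from the paragraph above.

```r
library(methods)

# The five stages named in the message above.
stages <- c("raw", "normalized", "segmented", "called", "regions")

# Hypothetical single container: one class, with a slot recording the stage.
setClass("cghStaged",
         representation(values = "matrix", stage = "character"),
         validity = function(object) {
           if (length(object@stage) != 1L || !(object@stage %in% stages))
             return(paste("stage must be one of:",
                          paste(stages, collapse = ", ")))
           TRUE
         })

# One method for the whole class; behaviour branches on the slot value
# rather than on S4 dispatch.
setGeneric("stageLabel", function(object) standardGeneric("stageLabel"))
setMethod("stageLabel", "cghStaged", function(object) {
  switch(object@stage,
         raw = "intensity",
         normalized = "log2 ratio",
         segmented = "segment mean",
         called = "call",
         regions = "region call")
})

m <- matrix(c(0.1, -0.4, 0.8, 0.0), nrow = 2,
            dimnames = list(c("clone1", "clone2"), c("s1", "s2")))
segmented <- new("cghStaged", values = m, stage = "segmented")

# An unknown stage is rejected by the validity function.
bad <- try(new("cghStaged", values = m, stage = "smoothed"), silent = TRUE)
```

The trade-off is that every method has to branch on the stage slot itself, whereas separate subclasses leave that choice to method dispatch.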

Vincent, how does your cghSet class differ from the ExpressionSet class,
and why?

Sjoerd

-----Original message-----
From: Vincent Carey 525-2265 [mailto:stvjc at channing.harvard.edu]
Sent: Thursday, October 04, 2007 16:14
To: Vosse, S.J.
CC: bioc-devel at stat.math.ethz.ch
Subject: Re: [Bioc-devel] eSet for aCGH data
#
For the "how", it would be best to see the code in Neve2006/R.  Briefly,
cghSet contains eSet.  There is a method "logRatios" that just grabs
the exprs element of the assayData.  The package data component neveCGHmatch
is an instance of cghSet, and it has a man page.  The vignette gives
some indications of how to work with the featureData component of that
structure.

The real reason for Neve2006 is to define cghExSet, which contains eSet
but adds the slots cghAssays (an AssayData instance) and cloneMeta
(an AnnotatedDataFrame instance).  The purpose of cghExSet is to have
a container for the Neve 2006 data that combine expression and aCGH
data on the same samples.  There have been some recommendations for
improvements from Martin Morgan that are awaiting implementation.  I
am taking a very conservative (with respect to programming effort)
approach to this development because I am not a direct user of CGH
data and I have no real use cases.
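To make the "same samples" idea concrete, here is a rough stand-in written with base R's methods package only. It is not the Neve2006 definition and does not use Biobase: the class name pairedSketch and the slots exprs and samples are hypothetical, plain matrices and data frames replace AssayData and AnnotatedDataFrame, and only the slot names cghAssays and cloneMeta echo the description above.

```r
library(methods)

# Illustrative stand-in only: NOT the Neve2006 code, and no Biobase classes.
setClass("pairedSketch",
         representation(exprs = "matrix",          # expression, features x samples
                        cghAssays = "matrix",      # aCGH log ratios, clones x samples
                        cloneMeta = "data.frame",  # one row per clone
                        samples = "data.frame"),   # one row per sample
         validity = function(object) {
           msg <- character()
           if (!identical(colnames(object@exprs), colnames(object@cghAssays)))
             msg <- c(msg, "exprs and cghAssays must share sample columns")
           if (!identical(rownames(object@samples), colnames(object@exprs)))
             msg <- c(msg, "rows of samples must match assay columns")
           if (!identical(rownames(object@cloneMeta), rownames(object@cghAssays)))
             msg <- c(msg, "rows of cloneMeta must match cghAssays rows")
           if (length(msg)) msg else TRUE
         })

ids <- c("s1", "s2")
clones <- c("c1", "c2", "c3")
ex <- matrix(c(7.1, 6.3, 8.0, 5.9), nrow = 2,
             dimnames = list(c("g1", "g2"), ids))
cgh <- matrix(c(0.2, -0.5, 0.0, 0.9, 0.1, -0.1), nrow = 3,
              dimnames = list(clones, ids))

paired <- new("pairedSketch", exprs = ex, cghAssays = cgh,
              cloneMeta = data.frame(chrom = c(1, 1, 2), row.names = clones),
              samples = data.frame(subtype = c("A", "B"), row.names = ids))

# Assays measured on different samples are rejected at construction time.
cghOther <- cgh
colnames(cghOther) <- c("s1", "s3")
mismatch <- try(new("pairedSketch", exprs = ex, cghAssays = cghOther,
                    cloneMeta = data.frame(chrom = c(1, 1, 2), row.names = clones),
                    samples = data.frame(subtype = c("A", "B"), row.names = ids)),
                silent = TRUE)
```

The validity function is what ties the sample annotation to both assays, so a mismatch is caught when the object is built rather than during downstream analysis.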

For the "why", I would say that we should extend eSet, not ExpressionSet,
to represent these data, which are conceptually distinct from expression
measures.  But the details of the design for cghSet need to address use
cases.  Presumably these will go beyond what is in the Neve2006 vignette,
and if they involve a series of classes like cghRawBatch, cghNorm and
cghSeg, for example, the stuff in Neve2006 may be irrelevant.  cghSet as
I defined it could be discarded, or it could be regarded as a suitable
container for "cooked" aCGH results, one which uses eSet infrastructure
appropriately.  On my superficial review of aCGH-related software
in Bioconductor, there was nothing that used S4 to couple the sample
information closely to the assay data results.  I feel that whatever
we do should allow this coupling at the earliest possible stage.