[Bioc-devel] eSet for aCGH data
5 messages · Vosse, S.J., Sean Davis, Vincent Carey
Vosse, S.J. wrote:
Dear all, first let me thank Martin Morgan and James MacDonald for their answers to my question about the exprSet class on the Bioconductor mailing list. They have been very helpful. I am thinking of adapting or extending the eSet class, or probably ExpressionSet, to contain aCGH data for our package CGHcall. My question is whether a similar class for aCGH data already exists, or whether anyone has been working on one or has thoughts on the subject. The class would be the same as ExpressionSet, except that it would need slots for raw data, normalized data, segmented data, called data and regions data (http://la-press.com/cr_data/files/f_CIN-3-Wiel-et-al_96.pdf).
If I were going to do this, I would think about separate subclasses for raw data, normalized data, and segmented data. If you think about methods for each class, you will probably want them to behave differently based on whether they are operating on raw, normalized, or segmented data. Having separate subclasses makes this pretty straightforward. Just a thought, though. Sean
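The idea of one subclass per processing stage, with methods that dispatch on the stage, can be sketched in plain S4 using only the methods package. This is an illustration under stated assumptions, not code from any Bioconductor package: the class names cghBase, cghRaw, cghNorm and cghSeg and the generic describeStage are hypothetical, and a real design would presumably extend eSet rather than the stand-in base class used here.

```r
library(methods)

# Hypothetical stand-in for an eSet-like base class: a single matrix
# with clones in rows and samples in columns.
setClass("cghBase", representation("VIRTUAL", values = "matrix"))

# One subclass per processing stage (names are illustrative only).
setClass("cghRaw",  contains = "cghBase")
setClass("cghNorm", contains = "cghBase")
setClass("cghSeg",  contains = "cghBase")

# A generic whose behaviour depends on the stage of the data.
setGeneric("describeStage", function(object) standardGeneric("describeStage"))
setMethod("describeStage", "cghRaw",  function(object) "raw intensities")
setMethod("describeStage", "cghNorm", function(object) "normalized log ratios")
setMethod("describeStage", "cghSeg",  function(object) "segmented log ratios")

# Toy values, made up for the example.
m <- matrix(c(0.1, -0.2, 0.4, 0.0), nrow = 2,
            dimnames = list(c("clone1", "clone2"), c("s1", "s2")))
raw <- new("cghRaw", values = m)
seg <- new("cghSeg", values = m)

describeStage(raw)  # dispatches to the cghRaw method
describeStage(seg)  # dispatches to the cghSeg method
```

Because dispatch is on the class, a function that only makes sense for segmented data can simply have no method for cghRaw, and calling it on raw data fails early.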
Vincent Carey wrote:
You might have a look at the Neve2006 package, which is only in the devel experiment data branch (currently labeled 2.1 on the web site). I did not push this package into release because of a lack of consensus on the preferred representation of aCGH data. We have a number of packages, such as aCGH, DNAcopy and snapCGH, that use their own representations. cghSet and cghExSet are defined in Neve2006; cghExSet confronts the problem of managing expression and CGH data obtained on the same samples.
I question whether you should have a class that manages raw, normalized and segmented data together. We have used stagewise representations in the expression domain, with containers like oligoBatch for the raw intensities and ExpressionSet devoted to the expression level quantifications that will be analyzed downstream. Binding data together at various levels of processing may have some benefits, but also many costs if the data are voluminous. For what you mention, it seems that the normalized and/or called data and the regions data would be in the AssayData and featureData eSet slots respectively.
Once we get some consensus among multiple developers/users in place, it is likely that a central eSet derivative class devoted to aCGH data would be defined in Biobase (or some relevant primarily core-maintained package) for interested developers to use. We can start a wiki page on the developer's wiki devoted to this topic if there is sufficient interest.
_______________________________________________ Bioc-devel at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
I guess it would indeed be easiest to define a separate eSet subclass for each of the 'stages' of aCGH data: raw, normalized, segmented, called and regions. For this, the ExpressionSet class can be used with virtually no changes, except perhaps for some methods. Alternatively, a single class could be used, with a slot that defines the type of data it contains and thus the way methods behave. Vincent, how does your cghSet class differ from the ExpressionSet class, and why? Sjoerd
-----Original message----- From: Vincent Carey 525-2265 [mailto:stvjc at channing.harvard.edu] Sent: Thursday, October 04, 2007 16:14 To: Vosse, S.J. CC: bioc-devel at stat.math.ethz.ch Subject: Re: [Bioc-devel] eSet for aCGH data
Vincent Carey wrote:
For the "how", it would be best to see the code in Neve2006/R. Briefly, cghSet contains eSet. There is a method "logRatios" that just grabs the exprs element of the assayData. The package data component neveCGHmatch is an instance of cghSet, and it has a man page. The vignette gives some indications of how to work with the featureData component of that structure.
The real reason for Neve2006 is to define cghExSet, which contains eSet but adds the slots cghAssays (an AssayData instance) and cloneMeta (an AnnotatedDataFrame instance). The purpose of cghExSet is to have a container for the Neve 2006 data, which combine expression and aCGH data on the same samples. There have been some recommendations for improvements from Martin Morgan that are awaiting implementation. I am taking a very conservative (with respect to programming effort) approach to this development because I am not a direct user of CGH data and I have no real use cases.
For the "why", I would say that we should extend eSet, not ExpressionSet, to represent these data, which are conceptually distinct from expression measures. But the details of the design for cghSet need to address use cases. Presumably these will go beyond what is in the Neve2006 vignette and, if they involve a series of classes like cghRawBatch, cghNorm and cghSeg, for example, the stuff in Neve2006 may be irrelevant. cghSet as I defined it could be discarded, or it could be regarded as a suitable container for "cooked" aCGH results, which uses eSet infrastructure appropriately.
On my superficial review of aCGH-related software in Bioconductor, there was nothing that used S4 to couple the sample information closely to the assay data results. I feel that whatever we do should allow this coupling at the earliest possible stage.
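The structure described in this message can be sketched roughly as follows. This is not the Neve2006 source: to keep the example runnable without Biobase, a hypothetical, much-simplified stand-in called miniESet replaces eSet, with a plain list in place of AssayData and data frames in place of AnnotatedDataFrame. Only the names cghSet, cghExSet, logRatios, cghAssays and cloneMeta come from the message; everything else, including the toy data, is an assumption made for illustration.

```r
library(methods)

# Hypothetical stand-in for Biobase's eSet. The validity check couples the
# sample annotation to the assay data, as argued for in the message.
setClass("miniESet",
         representation("VIRTUAL",
                        assayData   = "list",
                        phenoData   = "data.frame",
                        featureData = "data.frame"),
         validity = function(object) {
             ex <- object@assayData[["exprs"]]
             if (is.null(ex))
                 return("assayData needs an 'exprs' element")
             if (!identical(colnames(ex), rownames(object@phenoData)))
                 return("sample names differ between assayData and phenoData")
             TRUE
         })

# cghSet adds no slots; logRatios just grabs the exprs element.
setClass("cghSet", contains = "miniESet")
setGeneric("logRatios", function(object) standardGeneric("logRatios"))
setMethod("logRatios", "cghSet",
          function(object) object@assayData[["exprs"]])

# cghExSet holds expression data in assayData and adds two slots for CGH.
setClass("cghExSet", contains = "miniESet",
         representation = representation(cghAssays = "list",
                                         cloneMeta = "data.frame"))

# Toy data, made up for the example.
samples <- c("s1", "s2")
lr <- matrix(c(0.3, -0.1, 0.0, 0.8), nrow = 2,
             dimnames = list(c("cloneA", "cloneB"), samples))
pd <- data.frame(group = c("a", "b"), row.names = samples)
fd <- data.frame(chrom = c(1L, 8L), row.names = rownames(lr))

cgh <- new("cghSet", assayData = list(exprs = lr),
           phenoData = pd, featureData = fd)
logRatios(cgh)

# Mismatched sample names are rejected when the object is created.
bad <- try(new("cghSet", assayData = list(exprs = lr),
               phenoData = data.frame(group = c("a", "b"),
                                      row.names = c("x", "y")),
               featureData = fd),
           silent = TRUE)
```

With Biobase itself, the corresponding definitions would use contains = "eSet" and the real AssayData and AnnotatedDataFrame classes; the point here is only the shape of the classes and the early coupling of samples to assay values.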