[Bioc-devel] RFC: eSet with two color data
Hi,

Many people (and packages) rely on the ratios from 2-color arrays (with a reference design, for example). In those cases, using the M and A values would be reasonable. Adding one or two columns to phenoData describing the sample on each color would probably be enough. This is what we generally deal with, and the extension would be minor. I don't know whether it would be worth considering an extension for this specific use case. For the general case, each color would be dealt with individually using single-channel analyses, and I agree that the first option would be better.

Francois
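For readers less familiar with the M and A values mentioned above, here is a minimal sketch of the conventional definitions, M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2. It is written in plain Python rather than against any Bioconductor class, and the function names `to_ma` and `from_ma` as well as the intensities are made up for illustration. It also shows that M and A together can be inverted to recover both channel intensities, which M alone cannot.

```python
import math

def to_ma(red, green):
    """Convert two positive, linear-scale channel intensities to (M, A)."""
    log_r, log_g = math.log2(red), math.log2(green)
    m = log_r - log_g          # log-ratio of the two channels
    a = (log_r + log_g) / 2    # average log-intensity
    return m, a

def from_ma(m, a):
    """Invert to_ma: recover (red, green) from (M, A)."""
    return 2 ** (a + m / 2), 2 ** (a - m / 2)

# Hypothetical intensities for one feature on one array.
m, a = to_ma(red=800.0, green=200.0)   # a four-fold ratio, so M is about 2
red, green = from_ma(m, a)             # round trip back to the intensities
```

Keeping only M discards A, so spots with the same ratio but very different overall brightness become indistinguishable.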
On Wed, 2007-03-21 at 10:13 -0500, Kevin R. Coombes wrote:
Hi,

Having thought about this several times, I keep coming back to [A], for exactly the reasons you point out. It has the decided advantage of generalizing to any number of colors (including 1!) -- which actually suggests that ExpressionSet might be modified to include your required columns. One might, however, prefer "Label" instead of "dye" to allow for somewhat more generality.

Best,
Kevin

Wolfgang Huber wrote:
Dear all,

I hope that this question is not too tedious for those who have already thought hard about it, but I am not aware of a consensus or of good documentation in Biobase on this topic: How can we best represent preprocessed, normalised data from a set of two- (or n-) colour arrays in an eSet-like structure? I would like to keep the intensity information of each channel, and not reduce to M-values, since that loses information.

I see two options:

A) in an ExpressionSet derivative called e.g. "ExpressionSetWithColors" with ncol = n times the number of arrays, and with mandatory phenoData columns named e.g. "arrayID" and "dye".

B) in an eSet derivative with ncol = the number of arrays, and n congruent matrices in the assayData slot.

Currently I prefer A, because
- most of the infrastructure is already there and the additional work is little
- in B, the interpretation of the phenoData columns gets mushy because some columns will refer to the arrays, others to one particular sample of the n hybridised to each array, and we need additional infrastructure to resolve that.

Is there anything that someone can point out that I am not aware of?

Also (different topic): do we already have an ontology in place somewhere for control features (e.g. empty features, features measuring a known spike-in ratio)?

Best wishes
Wolfgang

------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
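To make the difference between the two layouts concrete, here is a small, language-neutral sketch in plain Python; it is not Biobase code. The toy intensities, the dye labels "Cy3" and "Cy5", and the helper `to_option_a` are all assumptions made for illustration. Option B is modelled as one features-by-arrays matrix per channel, and the helper flattens it into the option A form: a single matrix with one column per (array, dye) pair, plus per-column metadata carrying "arrayID" and "dye".

```python
# Hypothetical toy data: 2 arrays, 2 channels, 3 features.
arrays = ["array1", "array2"]

# Option B: n congruent matrices (features x arrays), one per channel.
option_b = {
    "Cy3": [[110.0, 120.0], [210.0, 220.0], [310.0, 320.0]],
    "Cy5": [[130.0, 140.0], [230.0, 240.0], [330.0, 340.0]],
}

def to_option_a(channels, arrays):
    """Flatten per-channel matrices into one wide matrix plus column metadata."""
    # One metadata row per column of the wide matrix (the phenoData analogue).
    pheno = [{"arrayID": arr, "dye": dye} for arr in arrays for dye in channels]
    n_features = len(next(iter(channels.values())))
    exprs = [
        [channels[col["dye"]][i][arrays.index(col["arrayID"])] for col in pheno]
        for i in range(n_features)
    ]
    return exprs, pheno

exprs, pheno = to_option_a(option_b, arrays)
# exprs has 3 rows and 4 columns (n = 2 channels times 2 arrays).
```

In the flattened form every column corresponds to exactly one hybridised sample, so a sample-level covariate can be added to `pheno` without ambiguity. In the per-channel form a single metadata row per array would have to describe n samples at once.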
_______________________________________________ Bioc-devel at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel