Patrick,
There are indeed always several ways to address needs, and my comment
is mostly pointing at the fact that creating yet another slot is not
necessary, since one can currently store such data in phenoData (in
a column named... say "scan_date").
I would in fact call it overbuilding to add a new (and exclusive)
slot when improving the existing infrastructure could perfectly well
answer the needs. Today it's "scanDates", and next it could be
"scannerModel", or "scanningSoftwareVersion".
I have been a little unclear (even to myself) in my comment about using
"[", so here are more details. *If* the extract operator were made to
evaluate expressions the way the function subset() does, or in fact if
a subset method were implemented for eSet objects, storing all the
information in phenoData would make things like this possible:
# silly example: only get the control data scanned in the future:
eset[, scan_date > date() & treatment == "control"]
# same with subset:
subset(eset, , scan_date > date() & treatment == "control")
# a little longer to write
eset[, scanDates(eset) > date() & pData(eset)$treatment == "control"]
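For what it is worth, the subset()-style evaluation can be sketched in plain R. This is only an illustration under stated assumptions: a matrix and a data.frame stand in for the eSet and its pData, the scan dates are assumed to be stored as Date values, and select_samples() is a hypothetical helper, not something Biobase provides.

```r
# Hypothetical stand-ins for an eSet: an expression matrix plus a per-sample
# data.frame playing the role of pData(eset).
exprs_mat <- matrix(seq_len(12), nrow = 3,
                    dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))
pheno <- data.frame(
  scan_date = as.Date(c("2003-08-29", "2004-01-23", "2004-02-10", "2003-09-01")),
  treatment = c("control", "treated", "control", "control"),
  row.names = colnames(exprs_mat),
  stringsAsFactors = FALSE
)

# select_samples() is a hypothetical helper (not a Biobase function).
# It captures the condition unevaluated and evaluates it with the phenotypic
# columns in scope, the way subset() does for data frames.
select_samples <- function(mat, pheno, condition) {
  cond <- substitute(condition)
  keep <- eval(cond, envir = pheno, enclos = parent.frame())
  stopifnot(is.logical(keep), length(keep) == ncol(mat))
  keep <- keep & !is.na(keep)  # treat NA as "do not select"
  list(exprs = mat[, keep, drop = FALSE],
       pheno = pheno[keep, , drop = FALSE])
}

# Control samples scanned before an arbitrary cutoff date.
cutoff <- as.Date("2003-12-31")
res <- select_samples(exprs_mat, pheno,
                      scan_date < cutoff & treatment == "control")
colnames(res$exprs)
```

Because the condition is evaluated inside the phenotypic table, every per-sample variable stored there is usable in the same expression, without a dedicated accessor for each one.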
If for some reason a distinction between phenoData and
like-phenoData-but-can't-be-the-same is needed, please do consider the
creation of an AnnotatedDataFrame that contains all of them.
L.
Patrick Aboyoun wrote:
Laurent,
We had some immediate need for scan date information and rather
than overbuild a system for managing metadata that we may or may
not need, we opted to start simply and then build up as
appropriate. There have been some internal discussions about
managing other metadata along with scan dates, but nothing else has
bubbled to the top yet. Your thoughts and design can help speed up
this process. The class versioning system in Biobase supports
iterative development and we can make further changes once we lock
a design in place. One editorial comment I have is that lots of
designs are possible for a given need and, for example, the current
class properly subsets the scanDates information using "[" despite
not being stored in the phenoData (AnnotatedDataFrame) slot.
Cheers,
Patrick
Quoting Laurent Gautier <laurent at cbs.dtu.dk>:
Hi Patrick,
The scan dates are indeed useful information, and it is nice to
have them offered at the parsing stage.
However, my first comment would be: does it justify a new slot in eSet?
I have been storing scan dates for quite some time now, but opted for
having them in the phenoData as it made more sense to me, both from an
implementation standpoint and from a practical standpoint (as standard
extraction of an eset-subset on columns with the "[" operator works).
If having something specific for scan dates is really wanted,
would it make sense to have that by extending AnnotatedDataFrame?
In my opinion, the stage at which the data are extracted (in that
case when parsing the files coming out of the image analysis) should
not dictate where the data are stored.
In fact, it might make for a nice(r) workflow if the function
reading raw array data could return an eSet-inheriting instance and a
phenoData with information such as dates and file names. I am working
on a workflow that in fact gets much more data from the header (I
suppose that I'll contribute it when I have enough time to wrap it up).
Just a few thoughts,
L.
Patrick Aboyoun wrote:
Dear Bioconductor developers,
The Biocore group has just committed a change to the BioC 2.5
code line (Biobase version 2.5.3) to support the use of
microarray scan date in statistical analyses by adding a
scanDates slot to Biobase's eSet class. This information can be
retrieved and set using the new scanDates and scanDates<-
functions, respectively. The scanDates slot is designed to hold a
character vector of length = # of samples, with one character
element for each sample. (See help(scanDates) for more
information.)
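Since the slot holds character strings, date arithmetic requires parsing them first. A minimal sketch follows, assuming the strings use the month/day/two-digit-year format seen in the demonstration output in this message and that UTC is an acceptable time zone. Both are assumptions; other platforms or files may use a different format. The vector below is built by hand rather than read from an eSet.

```r
# Named character vector, one element per sample, as the scanDates slot is
# described; the values are copied from the demonstration output.
scan_dates <- c(binary.cel = "01/23/04 14:30:57",
                text.cel   = "08/29/03 15:12:30")

# Parse into date-times. The format string and the UTC time zone are
# assumptions, not something the scanDates slot guarantees.
parsed <- as.POSIXct(strptime(scan_dates,
                              format = "%m/%d/%y %H:%M:%S", tz = "UTC"))
names(parsed) <- names(scan_dates)

# Samples in chronological scan order, and the calendar day of each scan
# (usable, for instance, as a batch covariate).
names(parsed)[order(parsed)]
scan_day <- as.Date(parsed)
```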
In this first round of check-ins we have added affy support of
this new slot to functions like ReadAffy and we will be working
towards adding this information to other microarray platforms as
well.
This change involved bumping the eSet version number from 1.1.0
to 1.2.0 in the Biobase class definition. In order to minimize
the impact of this change, the Biobase methods support both the
current eSet version 1.2.0 as well as old 1.1.0 serialized
objects, so updateObject does not need to be run on
eSet-derived objects prior to use in other functions. We have
also tested and version-bumped (and patched where needed) the
following packages that create eSet-derived classes to minimize
any package build issues: ACME, beadarray, beadarraySNP,
cellHTS2, CGHbase, codelink, crlmm, GeneRegionScan, GGBase,
maDB, oligoClasses, ontoTools, puma, rMAT, SNPchip, and spkTools.
Below is a demonstration of the new functionality. If you
encounter any issues related to this change, please e-mail this
list so the community can monitor the change.
- The Biocore Team
suppressMessages(library(affy))
example(ReadAffy)
RdAffy> if(require(affydata)){
RdAffy+ celpath <- system.file("celfiles", package="affydata")
RdAffy+ fns <- list.celfiles(path=celpath,full.names=TRUE)
RdAffy+ cat("Reading files:\n",paste(fns,collapse="\n"),"\n")
RdAffy+ ##read a binary celfile
RdAffy+ abatch <- ReadAffy(filenames=fns[1])
RdAffy+ ##read a text celfile
RdAffy+ abatch <- ReadAffy(filenames=fns[2])
RdAffy+ ##read all files in that dir
RdAffy+ abatch <- ReadAffy(celfile.path=celpath)
RdAffy+ }
Loading required package: affydata
Reading files:
/Library/Frameworks/R.framework/Versions/2.10/Resources/library/affydata/celfiles/binary.cel
/Library/Frameworks/R.framework/Versions/2.10/Resources/library/affydata/celfiles/text.cel
scanDates(abatch)
binary.cel text.cel
"01/23/04 14:30:57" "08/29/03 15:12:30"
sessionInfo()
R version 2.10.0 Under development (unstable) (2009-06-12 r48755)
i386-apple-darwin9.6.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] affydata_1.11.6 affy_1.23.2 Biobase_2.5.3
loaded via a namespace (and not attached):
[1] affyio_1.13.3 preprocessCore_1.7.4 tools_2.10.0