[Bioc-devel] read.AnnotatedDataFrame
Hi,

just to add to Florian's comment about the user interface: I think the AnnotatedDataFrame and new eSet classes are beautiful and elegant, and much better than what we had. Yet I now find it quite complex and unintuitive to construct an AnnotatedDataFrame or an ExpressionSet from scratch. IMHO, anything that makes it simple to convert a plain data frame or Excel table into a valid AnnotatedDataFrame will make many users happy.

Best wishes
Wolfgang
Florian Hahne wrote:
Hi Seth,

internal representation is one part of the story, and I agree that row names are the way to go here. Another point, however, is how the user gets the information into R. At some point we need to match sample names and the sample metadata, and IMO this should already happen at the level of the text file. The closest to the row names idea is probably to take the first column in the file as the sample identifier, but this imposes a pretty strict layout on the files (maybe for some users the first column is already the row numbering...).

As far as I understand the current implementation, the default is to take the first column, and you can pass row.names=x to read.AnnotatedDataFrame, but there is also this additional sampleNames parameter, and I find this pretty confusing. So currently you can do almost everything with the function, which is good in one sense but on the other hand might cause mix-ups and confusion for the user. If the mapping is already clear at the level of the text file, we can sit back and tell people to check their files if something isn't showing up as they expect, but currently you can do pretty stupid stuff just by setting a wrong argument without even realizing it.

I had the impression at the Bressanone courses that for the average user the biggest obstacle is getting all the necessary data from files somewhere on the hard disk into R, and that it is important to provide a straightforward default way of doing that.

Best,
Florian
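
For illustration, the "first column holds the sample identifier" layout discussed above might look like the following sketch. The file contents, column names and variable names are made up for this example and are not from the thread. The base R part runs on its own; the Biobase part only runs if the package is installed, and it assumes the AnnotatedDataFrame() constructor and the row.names argument of read.AnnotatedDataFrame behave as in current Biobase releases.

```r
# Write a small sample-annotation table to a temporary tab-delimited file.
# All names and values here are hypothetical, for illustration only.
pheno_file <- tempfile(fileext = ".txt")
pheno <- data.frame(
  sample    = c("A1", "A2", "B1"),
  treatment = c("control", "control", "drug"),
  batch     = c(1L, 2L, 1L),
  stringsAsFactors = FALSE
)
write.table(pheno, pheno_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Base R: use the first column of the file as row names (sample identifiers).
pd <- read.table(pheno_file, header = TRUE, sep = "\t",
                 row.names = 1, stringsAsFactors = FALSE)

# With Biobase available, either wrap the data frame or read the file directly.
if (requireNamespace("Biobase", quietly = TRUE)) {
  adf  <- Biobase::AnnotatedDataFrame(data = pd)
  adf2 <- Biobase::read.AnnotatedDataFrame(pheno_file, row.names = 1)
  print(Biobase::sampleNames(adf))
}
```

With this layout the mapping between samples and metadata is fixed in the text file itself, so a mismatch can be traced back to the file rather than to a function argument.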