[Bioc-devel] rownames in SummarizedExperiments

On 04/06/2014 04:21 PM, Michael Lawrence wrote:
> Empirically, the row names can be duplicated, but the column names cannot.

The lack of constraint on row names is enabled by the rowData GenomicRanges, 
while the constraint on column names is introduced by the (rownames of the) 
colData DataFrame. So the lack of symmetry in the class leads to lack of 
symmetry for dimnames. The use of GenomicRanges for rows has been the subject of 
previous discussion.

It wouldn't be inconceivable to impose a constraint against duplicate row names in 
SummarizedExperiment and set use.names=TRUE by default, or to redefine mcols(se) 
to default to use.names=!any(duplicated(rownames(se))). There would be performance 
consequences (how much?) and an mcols inconsistency. I think this is part of the 
same discussion as

   https://stat.ethz.ch/pipermail/bioc-devel/2014-March/005409.html

which I have not yet followed through on.

Syntax-wise, there is also

   mcols(se)[rownames(se) == "gene_D", "yellowness"]

This is more efficient (and more error prone) than either use.names or Michael's 
suggestion.
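
One consequence of duplicated row names can be sketched with a plain base R 
matrix standing in for a SummarizedExperiment (an assumption made so the 
example runs without Bioconductor; the gene names and values are made up). 
Character indexing picks up only the first matching row, while the logical 
comparison on rownames() returns every match:

```r
# Toy stand-in for the row annotations of a SummarizedExperiment:
# a base R matrix whose row names contain a duplicate ("gene_D").
# All names and values here are hypothetical.
m <- matrix(1:8, nrow = 4,
            dimnames = list(c("gene_A", "gene_D", "gene_B", "gene_D"),
                            c("yellowness", "blueness")))

# Character indexing matches only the FIRST row named "gene_D".
by_name <- m["gene_D", "yellowness"]

# Logical indexing on the row names returns EVERY row named "gene_D".
by_logical <- m[rownames(m) == "gene_D", "yellowness"]

print(by_name)
print(by_logical)
```

Whether silently getting one row or several is the desired behaviour depends 
on the use case, which is part of why the choice of default matters.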

Martin