[Bioc-devel] Error when index duplicate rows in SummarizedExperiment -- is this a bug?

1 message · Elizabeth Purdom

Hello,

I want to be able to index duplicate rows of an assay of a SummarizedExperiment (think bootstrapping), something like this:

assay(se[c(1,1,2,2),])

However, this gives me an error when the assay contains a data.frame rather than a DataFrame:

# > assay(se[c(1,1,2,2),]) #throws error
# Error in `.rowNamesDF<-`(x, value = value) :
#   duplicate 'row.names' are not allowed
# In addition: Warning message:
# non-unique values when setting 'row.names': 'A1', 'A2'

Here's a simple example:

library(SummarizedExperiment)
test <- data.frame(matrix(rnorm(100), ncol = 5))
row.names(test) <- paste0("A", 1:nrow(test))
se <- SummarizedExperiment(test)

I can pull duplicate rows of the original data.frame:

test[c(1,1,2,2),] # works

I can also index duplicate rows of the SummarizedExperiment:

se[c(1,1,2,2),] #works

But I can't then call `assay` on that object with the duplicated rows:

assay(se[c(1,1,2,2),]) #throws error

# > assay(se[c(1,1,2,2),]) 
# Error in `.rowNamesDF<-`(x, value = value) :
#   duplicate 'row.names' are not allowed
# In addition: Warning message:
# non-unique values when setting 'row.names': 'A1', 'A2'
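
The error message itself can be reproduced in base R without SummarizedExperiment. The following is only a sketch of what might be going on, under the assumption (not confirmed here) that `assay` tries to put the duplicated row names of the subsetted object back onto the data.frame. Subsetting a data.frame with repeated indices quietly makes the row names unique (A1, A1.1, A2, A2.1), whereas assigning duplicated row names directly is refused:

```r
# Base R only: no Bioconductor packages are needed for this illustration.
test <- data.frame(matrix(rnorm(100), ncol = 5))
row.names(test) <- paste0("A", 1:nrow(test))

# Subsetting with repeated indices works; the row names are made unique.
sub <- test[c(1, 1, 2, 2), ]
sub_names <- row.names(sub)

# Assigning duplicated row names directly is rejected by data.frame.
# The accompanying warning is suppressed so that only the error is captured.
dup_result <- tryCatch(
  suppressWarnings({
    row.names(sub) <- c("A1", "A1", "A2", "A2")
    "assigned"
  }),
  error = function(e) conditionMessage(e)
)

print(sub_names)
print(dup_result)
```

If that assumption holds, it would explain why the matrix and DataFrame cases behave differently, since neither of those insists on unique row names.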

Of course, I can do

assay(se)[c(1,1,2,2),]

because the underlying data.frame can be indexed that way, but then I am not indexing the corresponding `rowData`, which is my goal in indexing `se` directly, rather than the `assay`.

On the other hand, I don't get this problem if the input object is a DataFrame or matrix:

se<-SummarizedExperiment(DataFrame(test))
assay(se[c(1,1,2,2),]) #now it works

se<-SummarizedExperiment(data.matrix(test))
assay(se[c(1,1,2,2),]) #now it works
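
As a possible workaround in the meantime, here is a minimal sketch that converts the data.frame to a matrix before constructing the object, so that `se` can be indexed directly and the `rowData` follows along. The assay name `counts` and the `id` column are made up for illustration and are not part of my real data:

```r
library(SummarizedExperiment)

test <- data.frame(matrix(rnorm(100), ncol = 5))
row.names(test) <- paste0("A", 1:nrow(test))

# Store the assay as a matrix instead of a data.frame.
# "counts" and "id" are hypothetical names used only for this example.
se <- SummarizedExperiment(
  assays = list(counts = data.matrix(test)),
  rowData = DataFrame(id = row.names(test))
)

# Index the whole object so the assay and rowData are resampled together.
boot <- se[c(1, 1, 2, 2), ]
boot_assay <- assay(boot)
boot_ids <- rowData(boot)$id

print(dim(boot_assay))
print(boot_ids)
```

This only sidesteps the problem for data that can sensibly be coerced to a matrix; it does not address the data.frame case itself.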

This seems like a bug, but I thought I'd check here. It seems, at a minimum, unfortunate that you can call `se[c(1,1,2,2),]` but not `assay(se[c(1,1,2,2),])`, especially given that the underlying `data.frame` allows this call.

Thanks,
Elizabeth Purdom