[Bioc-devel] GenomicRanges::assays
The problem is that the dimnames are stored in only one location, and this is not on the assays. When you ask for the assays, the dimnames are added, triggering a full copy of the data. If the dimnames are not of interest, then use

  assays(BS, withDimnames=FALSE)

This is not really ideal, so I'll give some thought to a better implementation.

Martin
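The copy described above can be illustrated with a small base-R sketch. It does not use the actual SummarizedExperiment internals: the matrix `m`, the helper `with_names()` and the sample names are hypothetical stand-ins, and the size is scaled down from the 28M x 7 case. It only shows that attaching dimnames to a matrix on the way out of a function produces a second, separate matrix.

```r
# Stand-in for one assay, stored without dimnames (scaled down from 28M rows).
m <- matrix(0, nrow = 1e5, ncol = 7)

# Hypothetical helper mimicking an accessor that decorates its result with
# dimnames. Modifying 'x' inside the function duplicates the matrix; the
# caller's copy is left untouched.
with_names <- function(x, rn, cn) {
  dimnames(x) <- list(rn, cn)
  x
}

rn <- as.character(seq_len(nrow(m)))
cn <- paste0("sample", seq_len(ncol(m)))

named <- with_names(m, rn, cn)

is.null(dimnames(m))          # the stored matrix still has no dimnames
identical(unname(named), m)   # same values, held in a separate object
```

For a matrix on the GB scale, that duplicate is presumably where most of the time goes, which is why skipping the dimnames with withDimnames=FALSE avoids the cost.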
----- Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
Note the final "s" in assays. It is super slow. This is a BSseq object with 28M rows and 7 columns, which means there are two assays, M and Cov, each 28M x 7 (which is pretty big, on the GB scale). As far as I understand, these two commands retrieve the same data:

system.time({BS@assays$field("data")})
   user  system elapsed
      0       0       0
system.time({assays(BS)})
   user  system elapsed
 19.677  10.436  30.114

Follow-up questions:

1) It seems that all assays are stored in a SimpleList inside a reference class. If I only want to replace one of the assays, like

  assay(Object, "NAME") <- value

does this mean that all assays are being copied? Is this different from, say, eSet, where each assay is a matrix in an environment?

2) I think we need a convenience function for the assay names of a SummarizedExperiment. (This is how I saw the issue above; I was using names(assays(Object)).)

Kasper

_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel