Message-ID: <966846501.652474.1373409249953.JavaMail.root@fhcrc.org>
Date: 2013-07-09T22:34:09Z
From: Martin Morgan
Subject: [Bioc-devel] GenomicRanges::assays
In-Reply-To: <CAC2h7uvT0+eg020yY8W7hR5dchSg7iQNt7CpZ-zNVrReGO35kw@mail.gmail.com>

The problem is that the dimnames are stored in only one place, and that place is not the assay matrices themselves. When you ask for the assays, the dimnames are added to each matrix, which triggers a full copy of the data. If the dimnames are not of interest, you can skip that step with

  assays(BS, withDimnames=FALSE)

This is not really ideal, so I'll give some thought to a better implementation.
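
To illustrate why adding dimnames is costly, here is a minimal base R sketch. It does not use GenomicRanges or BSseq; the matrix m and the helper add_dimnames() are hypothetical stand-ins for one assay and for the step that attaches dimnames on access. The actual accessor may work differently, so treat this only as a demonstration of R's copy-on-modify behaviour.

```r
# Base R only; 'm' stands in for a single assay stored without dimnames.
m <- matrix(0, nrow = 1e6, ncol = 7)

# Hypothetical helper mimicking "add the dimnames when the assay is requested".
# Assigning dimnames to the argument forces R to duplicate the matrix,
# because the caller's 'm' must stay unchanged.
add_dimnames <- function(x, rn, cn) {
  dimnames(x) <- list(rn, cn)
  x
}

rn <- as.character(seq_len(nrow(m)))
cn <- paste0("sample", seq_len(ncol(m)))

# The time reported here is dominated by copying the matrix data.
system.time(m2 <- add_dimnames(m, rn, cn))

# The original is untouched; the result is a separate object with dimnames.
is.null(dimnames(m))
identical(dim(m2), dim(m))
```

With two assays of 28M x 7 each, the same duplication happens once per assay, which would be consistent with the timings reported below.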

Martin
----- Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
> Note the final "s" in assays.  It is super slow.  This is a BSseq object
> with 28M rows and 7 columns, which means there are two assays, M and Cov,
> each being 28M x 7 (which is pretty big, on the GB scale).
> 
> These two commands retrieve the same data as far as I understand.
> 
> > system.time({BS@assays$field("data")})
>    user  system elapsed
>       0       0       0
> > system.time({assays(BS)})
>    user  system elapsed
>  19.677  10.436  30.114
> 
> Follow-up questions:
> 
> 1) It seems that all assays are stored in a SimpleList inside a reference
> class.  If I only want to replace one of the assays, like
>   assay(Object, "NAME") <- value
> does this mean that all assays are being copied?  Is this different from
> say eSet where each assay is a matrix in an environment?
> 
> 2) I think we need a convenience function for the assay names of a
> SummarizedExperiment.  (This is how I saw the issue above, I was using
> names(assays(Object)))
> 
> Kasper
> 
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel