[Bioc-devel] strange behavior on memory usage
Hi Vince, et al. It seems to me the problem is bigger than just fixing the "show" method and caching (duplicating) e.g. the dimension information in extra slots. I am a bit worried that if "getExpData" is such a memory hog, the whole eSet class becomes much less useful, and people might be tempted to revert to using simple matrices for performance-critical computations. Is there a better way to do this that avoids such overhead with "getExpData" in the first place? (I guess we might need somebody who understands the memory management in R and perhaps can even write some of the necessary infrastructure in C.)
i agree with you that the utility of eSet has to be compared to simple matrices, and should not be too far behind (some loss is permissible given the added value). i have not had the chance to do such comparisons but eSet has been rewritten and committed just last night. i have only rewritten it a priori, no profiling yet. i am trying to get some time to do some experiments but have none yet.
What I don't understand in Benilton's email (one of the many things) is
this: "ps: i just noticed that using dim(exprs(x)) in show() reduces the
memory usage from 6GB to 3.5GB... ", but the implementation of exprs() is
setMethod("exprs", "eSet",
    function(object) getExpData(object, "exprs")
)
i only preserved getExpData as legacy, it is deprecated. we now have assayData accessor which returns whatever is provided (list or environment) and exprs() which takes the named element of the assayData with name "exprs"
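a minimal sketch of that arrangement, for illustration only. DemoSet and listOrEnvironment are hypothetical names made up here, not the committed eSet code; the point is just that one accessor hands back the container as provided and exprs() picks the element named "exprs" out of it, whether the container is a list or an environment.

```r
library(methods)

# Hypothetical stand-in for eSet; NOT the actual Biobase class definition.
setClassUnion("listOrEnvironment", c("list", "environment"))
setClass("DemoSet", representation(assayData = "listOrEnvironment"))

# assayData() returns whatever container was provided (list or environment).
setGeneric("assayData", function(object) standardGeneric("assayData"))
setMethod("assayData", "DemoSet", function(object) object@assayData)

# exprs() takes the element named "exprs" from that container.
# `[[` with a character name works for both lists and environments.
setGeneric("exprs", function(object) standardGeneric("exprs"))
setMethod("exprs", "DemoSet", function(object) assayData(object)[["exprs"]])

m <- matrix(rnorm(20), nrow = 5)

# Same data, once held in a list and once in an environment.
asList <- new("DemoSet", assayData = list(exprs = m))
env <- new.env()
assign("exprs", m, envir = env)
asEnv <- new("DemoSet", assayData = env)

dim(exprs(asList))
dim(exprs(asEnv))
```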
i.e. it just calls getExpData:
setMethod("getExpData", c("eSet", "character"),
    function(object, name) {
        object@eList[[name]]
    })
this should only be called if someone actually wants the assayData; if you are just describing the thing, use managed metadata. now if we see that these accessors are painful to use, then i agree i have to write some other code. i just have no data.
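one way to start collecting such data, sketched under assumptions: lookup() is a hypothetical helper that mirrors the list lookup quoted above, the matrix size is arbitrary, and tracemem() is only available when R was built with memory profiling (hence the capabilities() check). it reports when the traced matrix gets duplicated, so one can see whether a read-only call such as dim() on the accessor result copies anything, as opposed to a later modification of the returned value.

```r
# Hypothetical helper mirroring the list-based lookup; not Biobase code.
lookup <- function(store, name) store[[name]]

# Arbitrary example data standing in for an expression matrix.
store <- list(exprs = matrix(0, nrow = 1000, ncol = 100))

# tracemem() needs an R build with memory profiling enabled.
traced <- capabilities("profmem")
if (traced) invisible(tracemem(store$exprs))

# Read-only use of the accessor result.
d <- dim(lookup(store, "exprs"))

# Modifying the returned value makes R duplicate the matrix
# (copy-on-modify), which tracemem reports when tracing is on.
x <- lookup(store, "exprs")
x[1, 1] <- 1

if (traced) untracemem(store$exprs)
```

the stored matrix is left unchanged by the assignment to x; only the copy is modified.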