[Bioc-devel] Tip of the day: unlist(..., use.names=FALSE) often saves lots of memory
Hi, I just want to share a seldom-used feature of unlist(): calling it with 'use.names=FALSE' often saves a lot of memory. By default, the names of the list are expanded so that every element of the result gets its own name, and that names vector can consume much more memory than the actual data. So, unless you really need the 'names' attribute, please consider using unlist(..., use.names=FALSE) in your package(s). It is also faster.

A common example using an AffyBatch object:
affyBatch
AffyBatch object
size of arrays=1164x1164 features (7 kb)
cdf=HG-U133_Plus_2 (54675 affyids)
number of samples=1
number of genes=54675
annotation=hgu133plus2
notes=
pmIndex <- indexProbes(affyBatch[,1], "pm")
object.size(pmIndex)
[1] 6572776
cells <- unlist(pmIndex)
object.size(cells)
[1] 29018704
cells2 <- unlist(pmIndex, use.names=FALSE)
object.size(cells2)
[1] 2417056
# The names consume 92% of the memory
object.size(cells2)/object.size(cells)
[1] 0.08329304

It is much cheaper to pass around 'cells2' than 'cells'.

/Henrik
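
For anyone without an AffyBatch at hand, the same effect can probably be reproduced with a plain named list. This is only a minimal sketch: the probeset names and the list dimensions are made up for illustration and are not taken from the session above, so the sizes and the ratio you see will differ from the AffyBatch numbers.

```r
# Toy stand-in for 'pmIndex' (hypothetical data, not from an AffyBatch):
# 1000 named list elements, each holding 11 integer cell indices.
ids <- sprintf("probeset_%04d_at", 1:1000)
idx <- split(seq_len(11000), rep(ids, each = 11))

# Default: every element of the result gets its own name
# ("probeset_0001_at1", "probeset_0001_at2", ...).
withNames <- unlist(idx)

# Without names: a plain integer vector.
noNames <- unlist(idx, use.names = FALSE)

# The values are the same either way; only the 'names' attribute differs.
stopifnot(identical(unname(withNames), noNames))
stopifnot(is.null(names(noNames)))

# Compare memory footprints (exact numbers depend on R version and platform).
print(object.size(withNames), units = "Kb")
print(object.size(noNames), units = "Kb")
as.numeric(object.size(noNames)) / as.numeric(object.size(withNames))
```

If the names are needed later, they can still be looked up from names(idx), so dropping them in the unlisted vector loses nothing that cannot be recovered.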