Using plyr::dply more (memory) efficiently?

Matt Dowle · 2010-04-29T15:46:17Z

"Steve Lianoglou" <mailinglist.honeypot at gmail.com> wrote in message 
news:t2ybbdc7ed01004290812n433515b5vb15b49c170f5a353 at mail.gmail.com...
> Thanks for directing me to the data.table package. I read through some
> of the vignettes, and it looks quite nice.
>
> While your sample code would provide answer if I wanted to just
> compute some summary statistic/function of groups of my data.frame
> (using `by=symbol`), what's the best way to produces several pieces of
> info per subset.
>
> Fo

Yes, that's it.

Yes, you got it.  Rather than a data.table at the end though, just return a 
list; it's faster.
Shorter vectors will still be recycled to match any longer ones.
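
As a rough illustration of returning a list from j and relying on recycling, here is a minimal sketch. The toy values are made up; only the column names (symbol, counts, exon.width) come from the thread. It sticks to recycling a length-1 value, since how other lengths are recycled may depend on the data.table version.

```r
library(data.table)

# Hypothetical toy data; values are invented for illustration only.
summaries <- data.table(
    symbol     = c("A", "A", "B", "B", "B"),
    counts     = c(1L, 2L, 3L, 4L, 5L),
    exon.width = c(10L, 20L, 30L, 40L, 50L)
)

# j returns a plain list rather than a data.table.
# 'total' has length 1 per group and is recycled to match 'width',
# which has one element per row of the group.
res <- summaries[, list(total = sum(counts), width = exon.width), by = symbol]
print(res)
```

The result has one row per input row, with the group total repeated alongside each width.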

Or just this:

summaries[, list(
    counts = sum(counts),
    width = sum(exon.width),
    cplx = # .. result of complex things
), by=symbol]
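
The snippet above leaves the cplx value as a placeholder, so it will not run as written. A runnable sketch might look like the following, assuming made-up data and a purely hypothetical stand-in expression for the "complex things":

```r
library(data.table)

# Hypothetical toy data; only the column names come from the thread.
summaries <- data.table(
    symbol     = c("A", "A", "B", "B", "B"),
    counts     = c(1L, 2L, 3L, 4L, 5L),
    exon.width = c(10L, 20L, 30L, 40L, 50L)
)

# One row per symbol, several summary columns computed in a single pass.
res <- summaries[, list(
    counts = sum(counts),
    width  = sum(exon.width),
    # Stand-in for the elided complex computation (an assumption, not the
    # original poster's calculation).
    cplx   = max(counts) / sum(exon.width)
), by = symbol]
print(res)
```

Each element of the list becomes a column of the result, so several pieces of information per subset come back together.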


Sounds like it's working, but could you give us an idea whether it is quick 
and memory efficient?
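
One rough way to get such numbers is base R's system.time() and gc(). The sketch below uses simulated data of an arbitrary size (an assumption, not the poster's data set), so the figures it prints only show the method and say nothing about the real workload.

```r
library(data.table)

# Simulated data: sizes and distributions are arbitrary assumptions.
set.seed(1)
n <- 1e5
summaries <- data.table(
    symbol     = sample(sprintf("g%04d", 1:1000), n, replace = TRUE),
    counts     = rpois(n, 5),
    exon.width = sample(50:500, n, replace = TRUE)
)

# Reset the "max used" memory counters, then time the grouped summary.
invisible(gc(reset = TRUE))
timing <- system.time(
    res <- summaries[, list(
        counts = sum(counts),
        width  = sum(exon.width)
    ), by = symbol]
)

print(timing)                          # user / system / elapsed seconds
print(gc())                            # "max used" reflects peak since reset
print(object.size(res), units = "Kb")  # size of the result itself
```

Running the same measurement around the plyr version would give a like-for-like comparison.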
