Message-ID: <E0ADE6E0-F4E8-42D0-84D5-9A1D26D43A09@googlemail.com>
Date: 2012-12-25T16:34:42Z
From: Martin Batholdy
Subject: aggregate / collapse big data frame efficiently

Hi,


I need to aggregate the rows of a data.frame by computing the mean over all rows that share the same level of one factor variable.

Here is some sample code:


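# first column is the grouping factor, the rest are numeric
x <- data.frame(group = rep(letters, 2), a = rnorm(52), b = rnorm(52), c = rnorm(52))

# aggregate only the numeric columns; taking the mean of the grouping column itself gives NA plus warnings
aggregate(x[, -1], list(group = x$group), mean)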


My problem is that the actual data set is much bigger (120 rows and approximately 100,000 columns), and it takes a very long time (at some point I just stopped it).

Is there anything that can be done to make the aggregate routine more efficient?
Or is there a different approach that would work faster?
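
One direction that may be worth trying, shown only as a minimal sketch on toy data: keep the numeric columns in a matrix and use base R's rowsum(), which sums rows per group, then divide by the group sizes to get means. The names g, m, sums, counts and means are made up for illustration, and this has not been timed on data of the real size.

```r
# Sketch: per-group column means via rowsum() on a numeric matrix.
set.seed(1)
g <- rep(letters, 2)                     # grouping variable: 52 values, 26 levels
m <- matrix(rnorm(52 * 3), nrow = 52)    # numeric columns only; stand-in for the real data

# rowsum() adds up the rows belonging to each group;
# the result has one row per group, ordered by sorted group label.
sums <- rowsum(m, g)

# Group sizes, aligned to the row order of 'sums'.
counts <- as.vector(table(g)[rownames(sums)])

# 'counts' is recycled down each column, so row i is divided by counts[i].
means <- sums / counts

# Cross-check against aggregate() on this small example.
ref <- aggregate(as.data.frame(m), list(g), mean)
stopifnot(isTRUE(all.equal(unname(as.matrix(ref[, -1])), unname(means))))
```

If the real data is a data.frame, as.matrix() on its numeric columns would be needed first, which assumes all of those columns are numeric.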


Thanks for any suggestions!