Apply functions along "layers" of a data matrix

4 messages · saschaview at gmail.com, Dennis Murphy, Paul Hiemstra +1 more

#
Hello

How can I apply functions along "layers" of a data matrix?

Example:

daf <- data.frame(
   'id' = rep(1:5, 3),
   matrix(1:60, nrow=15, dimnames=list( NULL, paste('v', 1:4, sep='') )),
   rep = rep(1:3, each=5)
)

The data frame "daf" contains 3 repetitions/layers (rep) of 4 variables 
of 5 persons (id). For some reason, I want to calculate various 
statistics (e.g., mean, median) *along* the repetitions. The "mean" 
calculation, for example, would produce the means of daf[1, 'v1'] 
*along* the 3 repetitions:

(daf[1, 'v1'] + daf[6, 'v1'] + daf[11, 'v1']) / 3

That is to say, each of the calculations would result in a data frame 
with 4 variables (and the id) of the 5 persons:

   id v1 v2 v3 v4
1  1  6 21 36 51
2  2  7 22 37 52
3  3  8 23 38 53
4  4  9 24 39 54
5  5 10 25 40 55

Currently, I do this in a loop, but I was wondering whether there is a
quick and resource-friendly way to achieve this?

Thanks
*S*
#
Hi:

Here are two ways to do it; further solutions can be found in the doBy
and data.table packages, among others.

library('plyr')
ddply(daf, .(id), colwise(mean, c('v1', 'v2', 'v3', 'v4')))

aggregate(cbind(v1, v2, v3, v4) ~ id, data = daf, FUN = mean)

# Result of each:
  id v1 v2 v3 v4
1  1  6 21 36 51
2  2  7 22 37 52
3  3  8 23 38 53
4  4  9 24 39 54
5  5 10 25 40 55
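
Since the question also mentions the median, the same grouping pattern can be reused for more than one statistic. The following is only a minimal sketch in base R, assuming `daf` is built exactly as in the original post; the names `vars`, `stats` and `res` are made up for illustration.

```r
# Rebuild the example data from the original post
daf <- data.frame(
  id = rep(1:5, 3),
  matrix(1:60, nrow = 15, dimnames = list(NULL, paste('v', 1:4, sep = ''))),
  rep = rep(1:3, each = 5)
)

vars  <- paste('v', 1:4, sep = '')           # columns to summarise
stats <- list(mean = mean, median = median)  # statistics to compute

# One aggregated data frame per statistic, grouped by id (i.e. across rep)
res <- lapply(stats, function(f) {
  aggregate(daf[vars], by = list(id = daf$id), FUN = f)
})

res$mean    # means across the three repetitions
res$median  # medians across the three repetitions
```

Each element of `res` has one row per id and the columns id, v1, ..., v4, so further statistics can be added by extending `stats`.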

Dennis
On Fri, Nov 18, 2011 at 5:05 AM, <saschaview at gmail.com> wrote:
#
On 11/18/2011 01:05 PM, saschaview at gmail.com wrote:
Hi,

This seems like a job for plyr!

library(plyr)
# group by id (not rep) so that the mean runs across the three repetitions
ddply(daf, .(id), summarise, mn = mean(v1))

hope this helps,
Paul
#
On Nov 18, 2011, at 8:05 AM, <saschaview at gmail.com> wrote:

            
I see you have gotten a couple of plyr solutions, but this is really
easy in base R:

 > aggregate(daf[-c(1,6)], list(daf$id), mean)
   Group.1 v1 v2 v3 v4
1       1  6 21 36 51
2       2  7 22 37 52
3       3  8 23 38 53
4       4  9 24 39 54
5       5 10 25 40 55

You read this as "use the mean function within categories defined by
the 'id' index to aggregate all columns except the first and sixth
columns".
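
When the data are complete and balanced (every id appears exactly once in every rep, as in the example), the "layers" from the question can also be taken literally by reshaping the values into a 3-d array and applying the function over it. This is a sketch under that assumption, not a general replacement for aggregate(); the names `arr`, `layer_means` and `result` are made up for illustration.

```r
# Rebuild the example data from the original post
daf <- data.frame(
  id = rep(1:5, 3),
  matrix(1:60, nrow = 15, dimnames = list(NULL, paste('v', 1:4, sep = ''))),
  rep = rep(1:3, each = 5)
)

vars <- paste('v', 1:4, sep = '')

# Sort by rep, then id, so that each block of 5 rows is one layer
daf <- daf[order(daf$rep, daf$id), ]

# Dimensions: id x rep x variable (the matrix is filled column by column)
arr <- array(
  as.matrix(daf[vars]),
  dim = c(5, 3, 4),
  dimnames = list(id = as.character(1:5), rep = as.character(1:3), var = vars)
)

# Keep dimensions 1 (id) and 3 (variable); the function runs along rep
layer_means <- apply(arr, c(1, 3), mean)

result <- data.frame(id = as.integer(rownames(layer_means)), layer_means)
result
```

Replacing `mean` with `median` (or any other function that returns a single value) in the apply() call gives the corresponding statistic along the repetitions.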
David Winsemius, MD
West Hartford, CT