Calculate mean/var by ID

AFAIK, tapply() only summarises one variable at a time (apart from the
grouping variable), and here you need summaries of both value and Seg.
It might be better to use split() here:

    df <- data.frame(ID = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                     value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                     Seg = c(2, 2, 2, 4, 4, 1, 1, 1, 1) )

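    # Split the data frame into a list with one data frame per ID
    df.s <- split( df, df$ID )

    # Summarise each piece; sapply() returns one column per ID
    out <- sapply( df.s, function(m){
                     c( mu=mean(m$value), var=var(m$value),
                        min=min(m$Seg), max=max(m$Seg) ) })
    out <- t(out)   # transpose so that each row is one ID
    out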
              mu      var min max
    111 4.333333 4.333333   2   2
    138 6.000000 4.666667   1   1
    178 5.000000 8.000000   4   4

You could also have used range() here instead of calculating min and max
separately, but range() returns an unnamed vector of length two, so naming
the resulting columns becomes a bit tricky.
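
In case it helps, here is a minimal sketch of the range() variant. It is
only one possible way to do it: the names r and out2 are made up for this
illustration, and the data frame is repeated so the snippet runs on its own.
Writing c(range = range(m$Seg)) directly would give the columns range1 and
range2, which is why the names are attached explicitly first.

```r
df <- data.frame(ID = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg = c(2, 2, 2, 4, 4, 1, 1, 1, 1))

out2 <- t(sapply(split(df, df$ID), function(m) {
  r <- range(m$Seg)              # unnamed vector: c(minimum, maximum)
  names(r) <- c("min", "max")    # attach the names explicitly
  c(mu = mean(m$value), var = var(m$value), r)   # c() keeps the names of r
}))
out2
```

This should give the same four columns (mu, var, min, max) as the version
with separate min() and max() calls.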

Regards, Adai

PS: If you do a dput() on a subset of the data, you can get a simple 
reproducible example that other R users can easily read in.
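
To illustrate the dput() suggestion, here is a rough sketch using the toy
data frame from above as a stand-in for your real data. The name sub and
the choice of the first three rows are arbitrary.

```r
df <- data.frame(ID = c(111, 111, 111, 178, 178, 138, 138, 138, 138),
                 value = c(5, 6, 2, 7, 3, 3, 8, 7, 6),
                 Seg = c(2, 2, 2, 4, 4, 1, 1, 1, 1))

sub <- head(df, 3)   # a small subset is enough for an example
dput(sub)            # prints R code that recreates 'sub'
```

The printed structure(...) call can be pasted into an email, and a reader
can assign it to a variable to get the same data frame back.
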
Julia Liu wrote: