
sorting a data.frame by mean values of grouped data

3 messages · sjbarry, Mark Kimpel

#
Hi,

I have what I think is a fairly straightforward problem. I've looked 
through the FAQs and mailing lists but have been unable to identify a 
solution, probably because I don't understand the language well enough.

I have a set of data d, with 3 columns as shown below.
I want to sort the data by Group, then by mean(Value by Label). I know 
that this can be done for one level, say Label, using factor(), but I 
cannot see how to extend that. I have included the code to create the 
data.frame below and would greatly appreciate a solution or a link to a 
similar problem that has already been solved on the mailing list.
   Value Label Group
      19   Big     A
      29   Big     A
      39   Big     A
      55 Small     D
      33 Small     D
      11 Small     D
      55 Small     D
      66 Small     D
      11 Small     D
       2   Big     C
       3   Big     C
       3   Big     C
       3   Big     C
       3   Big     C
       3   Big     C
       3   Big     C
       3 Small     B
       2 Small     B
       5 Small     B
       6 Small     B
       5 Small     B
       6 Small     B

Value <- c(19,29,39,55,33,11,55,66,11,2,3,3,3,3,3,3,3,2,5,6,5,6)
Group <- c("A","A","A","D","D","D","D","D","D",
    "C","C","C","C","C","C","C","B","B","B","B","B","B")
Label <- c("Big","Big","Big",
    "Small","Small","Small","Small","Small","Small",
    "Big","Big","Big","Big","Big","Big","Big",
    "Small","Small","Small","Small","Small","Small")
# data.frame() keeps Value numeric; cbind() would coerce every column to character
d <- data.frame(Value, Label, Group)
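
For reference, the one-level case mentioned above (sorting by the mean of Value per Label via factor()) might look roughly like the sketch below. It assumes Value is held as a number in a data frame built with data.frame(), and that the level with the largest mean should come first; the names d1, label.means and d1.sorted are made up for illustration.

```r
Value <- c(19,29,39,55,33,11,55,66,11,2,3,3,3,3,3,3,3,2,5,6,5,6)
Label <- c(rep("Big", 3), rep("Small", 6), rep("Big", 7), rep("Small", 6))
d1 <- data.frame(Value, Label)   # hypothetical two-column copy of the data

# Mean of Value for each Label
label.means <- tapply(d1$Value, d1$Label, mean)

# Re-level Label so that the level with the largest mean comes first
d1$Label <- factor(d1$Label,
                   levels = names(sort(label.means, decreasing = TRUE)))

# order() on a factor follows the level order; tied rows keep their
# original relative order
d1.sorted <- d1[order(d1$Label), ]
```

The idea is that the sort order lives in the factor levels, which is why it is not obvious how to extend it to a second grouping column.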


Thanks

Stephen Barry
#
Stephen,

I am sure someone will have a more elegant solution, but the following 
works. Mark

# Split the data frame into one piece per Group
d.lst <- split(x = d, f = as.factor(d$Group), drop = FALSE)
# Mean of Value per group; go through as.character() in case Value
# is stored as a factor or character
d.lst.mn <- sapply(d.lst, FUN = function(x) {
  mean(as.numeric(as.character(x$Value)))
})
# Put the group means in decreasing order
o <- order(d.lst.mn, decreasing = TRUE)
d.lst.mn <- d.lst.mn[o]

# Stack the rows of each group in that order
e <- NULL
for (i in seq_along(d.lst.mn)) {
  e <- rbind(e, d[as.character(d$Group) == names(d.lst.mn)[i], ])
}
e
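
A loop-free variant of the same idea (order the rows by the mean Value of their Group, largest first) could be sketched with ave() and order(). This is only an illustration, not the code posted above: it assumes Value is numeric, and the names d2, grp.mean and e2 are hypothetical.

```r
Value <- c(19,29,39,55,33,11,55,66,11,2,3,3,3,3,3,3,3,2,5,6,5,6)
Group <- c(rep("A", 3), rep("D", 6), rep("C", 7), rep("B", 6))
Label <- c(rep("Big", 3), rep("Small", 6), rep("Big", 7), rep("Small", 6))
d2 <- data.frame(Value, Label, Group)

# ave() returns, for every row, the mean Value of that row's Group
grp.mean <- ave(d2$Value, d2$Group, FUN = mean)

# Sort by descending group mean; Group is a tie-breaker so that two
# groups with equal means are not interleaved. Rows within a group
# keep their original order.
e2 <- d2[order(-grp.mean, d2$Group), ]
```

With the data from the original post this should put the groups in the order D, A, B, C, the same order the loop produces.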

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

#
Thanks Mark,

I see that I made an error in my original request for help. I got my 
labels and groups mixed up (see below). Nonetheless, your code has been 
a good pointer in the direction of a solution. I'll post it up when I 
have it working.

Thanks again,
Stephen Barry.

I should have written it more like this:
Given data.frame:
   Value Group Label
      55     D Small
      33     D Small
      11     D Small
      55     D Small
      66     D Small
      11     D Small
      19     A   Big
      29     A   Big
      39     A   Big
       3     B Small
       2     B Small
       5     B Small
       6     B Small
       5     B Small
       6     B Small
       2     C   Big
       3     C   Big
       3     C   Big
       3     C   Big
       3     C   Big
       3     C   Big
       3     C   Big
end up with:
   Value Group Label
    19     A   Big
    29     A   Big
    39     A   Big
     2     C   Big
     3     C   Big
     3     C   Big
     3     C   Big
     3     C   Big
     3     C   Big
     3     C   Big
    55     D Small
    33     D Small
    11     D Small
    55     D Small
    66     D Small
    11     D Small
     3     B Small
     2     B Small
     5     B Small
     6     B Small
     5     B Small
     6     B Small
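
One possible way to get from the first table to the second is sketched below. It is an illustration only, under the assumption that the intended rule is "sort by Label, then by the mean of Value within each Group, largest mean first"; the names d3, grp.mean and d3.sorted are hypothetical.

```r
# The "given" data.frame, in the row order shown above
Value <- c(55,33,11,55,66,11,19,29,39,3,2,5,6,5,6,2,3,3,3,3,3,3)
Group <- c(rep("D", 6), rep("A", 3), rep("B", 6), rep("C", 7))
Label <- c(rep("Small", 6), rep("Big", 3), rep("Small", 6), rep("Big", 7))
d3 <- data.frame(Value, Group, Label)

# Mean Value of each row's Group, repeated for every row of that group
grp.mean <- ave(d3$Value, d3$Group, FUN = mean)

# Primary key: Label (alphabetical, so "Big" before "Small").
# Secondary key: descending group mean. Group breaks any ties between
# groups with equal means; rows within a group keep their original order.
d3.sorted <- d3[order(d3$Label, -grp.mean, d3$Group), ]
d3.sorted
```

order() accepts several keys and resolves ties in the first with the second, which is what allows the two-level sort without reshaping the data.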

