Howto sort dataframe columns by colMeans

3 messages · Lynn Osburn, jim holtman, Liaw, Andy

#
I read from an external data source containing several columns.  Each column
represents the values of one metric.  The columns are time series data.

I want to sort the resulting dataframe such that the column with the largest
mean is the leftmost column, descending in colMean values to the right.

I see many solutions for sorting rows based on some column characteristic,
but haven't found any discussion of sorting columns based on column
characteristics.

viz. input data looks like this:
time     met-a    met-b    met-c
00:00    42       18       99
00:05    88       16       67
00:10    80       27       84

desired output:
time     met-c    met-a    met-b
00:00    99       42       18
00:05    67       88       16
00:10    84       80       27

Thanks,
-Lynn
#
Here is one way of doing it by 'skipping' the first column, which is a
factor and holds your 'time' values:
+ 00:00    42         18          99
+ 00:05    88         16          67
+ 00:10    80         27          84"), header=TRUE)
time met.c met.a met.b
1 00:00    99    42    18
2 00:05    67    88    16
3 00:10    84    80    27
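
The opening lines of the session above are missing from the archive, so here is a minimal sketch of the same idea rather than the code as originally posted. The names `x`, `m`, `idx` and `x.sorted` are assumptions made for illustration. Note that read.table() turns "met-a" into the syntactic name "met.a", which matches the printed output above.

```r
# Hypothetical reconstruction: the variable names are assumed, not from the post.
x <- read.table(textConnection("time met-a met-b met-c
00:00    42         18          99
00:05    88         16          67
00:10    80         27          84"), header = TRUE)

# Column means of everything except the first ('time') column.
m <- colMeans(x[-1])

# Positions of the metric columns, largest mean first.
# Adding 1 shifts them back to positions in the full data frame.
idx <- order(m, decreasing = TRUE) + 1

# Keep 'time' in front, followed by the metrics in descending order of mean.
x.sorted <- x[c(1, idx)]
print(x.sorted)
```

With the sample data the means are 70 for met.a, about 20.3 for met.b and about 83.3 for met.c, so the columns come out as time, met.c, met.a, met.b.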

        
On 9/4/07, Lynn Osburn <lynn.osburn at lewan.com> wrote:
#
Something like the following may do what you want:

R> mydata.sorted <- mydata[c(1, 1 + order(colMeans(mydata[-1]), decreasing=TRUE))]
R> mydata.sorted
   time met.c met.a met.b
1 00:00    99    42    18
2 00:05    67    88    16
3 00:10    84    80    27

(Note that I'm assuming that your first variable in the data frame is
not one of the things you want to include in your sorting.)
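
If the columns to leave alone are not always in position 1, one possible variation is to pick the sortable columns by type instead of by position. This is only a sketch under that assumption: the helper name `sort_cols_by_mean` is hypothetical, and `mydata` is rebuilt here from the sample in the original question.

```r
# Hypothetical helper, not from the thread: keeps non-numeric columns in front
# and orders the numeric columns by descending mean.
sort_cols_by_mean <- function(df) {
  # TRUE for each column that holds numeric data.
  is_num <- vapply(df, is.numeric, logical(1))
  # Means of the numeric columns only, ignoring missing values.
  m <- colMeans(df[is_num], na.rm = TRUE)
  # Non-numeric columns in their original order, then numeric ones by mean.
  df[c(names(df)[!is_num], names(m)[order(m, decreasing = TRUE)])]
}

# Sample data from the question, with syntactic column names.
mydata <- data.frame(
  time  = c("00:00", "00:05", "00:10"),
  met.a = c(42, 88, 80),
  met.b = c(18, 16, 27),
  met.c = c(99, 67, 84)
)

mydata.sorted <- sort_cols_by_mean(mydata)
print(mydata.sorted)
```

Selecting by type avoids the hard-coded `1` and `-1`, at the cost of treating every numeric column as a metric, which may not be what you want if the data also carries numeric identifiers.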

Andy

