
Sorting dataframe by number of occurrences of factor

4 messages · adigs, Petr Savicky, Sharma D +1 more

#
Apologies for what's probably quite a simple question, but I'm having some
problems sorting a data frame by the number of occurrences of each level of a
factor.

df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'))

I want to sort the data frame so that the values of df$name that occur most
often are at the bottom, i.e. in the order:

attributes(sort(summary(df$name)))$name = "e" "f" "g" "c" "d" "a" "b":
e f g c d a b 
1 1 1 3 3 5 6 
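
A side note for anyone re-running this today: since R 4.0.0, data.frame() no longer converts strings to factors by default, so summary(df$name) would describe a character vector instead of counting levels. A minimal sketch of the same count using table(), which works for both factor and character columns (the names counts and level_order are only illustrative):

```r
# Same data as in the question; base R only.
df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'))

# table() counts occurrences of each distinct name;
# sort() then puts the rarest names first and the most common last.
counts <- sort(table(df$name))

# The names of the sorted table give the desired level order.
level_order <- names(counts)
print(counts)
```

Tied counts should stay in alphabetical order here, because table() lists the names alphabetically and the sort keeps ties in place.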

So the desired result is:

 id name
  8    e
 17    f
 20    g
  4    c
 11    c
 19    c
  6    d
  9    d
 10    d
  1    a
  5    a
 12    a
 14    a
 15    a
  2    b
  3    b
  7    b
 13    b
 16    b
 18    b


Any suggestions would be greatly appreciated.

Thanks.
--
View this message in context: http://r.789695.n4.nabble.com/Sorting-dataframe-by-number-of-occurrences-of-factor-tp3485443p3485443.html
Sent from the R help mailing list archive at Nabble.com.
#
On Fri, Apr 29, 2011 at 11:17:58PM -0700, adigs wrote:
Hi.

Try the following

  freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum)
  df[order(freq, df$name), ]
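
For readers unfamiliar with ave(): the first line gives every row the size of its name group, and order() then sorts by that size, with the name itself as a tie-breaker. A small sketch that cross-checks the frequency vector against a plain table() lookup (counts, freq_lookup and sorted are illustrative names, not part of the suggestion above):

```r
df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'))

# Each row gets the number of rows sharing its name (sum of 1s per group).
freq <- ave(rep(1, times = nrow(df)), df$name, FUN = sum)

# Alternative route to the same numbers: count each name once,
# then look the count up for every row by name.
counts <- table(df$name)
freq_lookup <- as.vector(counts[as.character(df$name)])
stopifnot(all(freq == freq_lookup))

# Sort by frequency first, then by name so equal-frequency names stay grouped.
sorted <- df[order(freq, df$name), ]
print(sorted, row.names = FALSE)
```

With the data from the question, this should reproduce the desired row order (ids 8, 17, 20, then the c, d, a and b blocks).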

Hope this helps.

Petr Savicky.
#
df<-data.frame(id=c(1:20),name=c('a','b','b','c','a','d','b','e','d','d','c','a','b','a','a','b','f','b','c','g')) 
freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum) 
rowSums(table(df$name,freq))
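
As far as I can tell, the last line above is a way of displaying the count for each name: every name falls into exactly one freq column of the cross-table, so each row sum is simply that name's count. A short sketch checking this against table(df$name) (cross and per_name are illustrative names):

```r
df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'))
freq <- ave(rep(1, times = nrow(df)), df$name, FUN = sum)

# Rows are names, columns are the distinct frequencies (1, 3, 5, 6).
cross <- table(df$name, freq)
print(cross)

# Each row has a single non-zero cell, so the row sum is the name's count.
per_name <- rowSums(cross)
print(per_name)

# Same numbers as counting the names directly.
stopifnot(all(as.vector(per_name) == as.vector(table(df$name))))
```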
#
To the first two lines of your solution,

df<-data.frame(id=c(1:20),name=c('a','b','b','c','a','d','b','e',
'd','d','c','a','b','a','a','b','f','b','c','g'))
freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum) 

I would add:

df[ sort.list(freq), ]
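
One caveat, offered as a sketch rather than a correction: sort.list(freq) orders by frequency only, and as far as I know the default method leaves ties in their original row order. Names that occur equally often (c and d here) can therefore end up interleaved, whereas adding df$name as a second key keeps them grouped. The names by_freq and by_freq_name are illustrative:

```r
df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'))
freq <- ave(rep(1, times = nrow(df)), df$name, FUN = sum)

# Frequency only: ties keep their original row order.
by_freq <- df[sort.list(freq), ]

# Frequency, then name: equal-frequency names are grouped together.
by_freq_name <- df[order(freq, df$name), ]

# Rows 4 to 9 are the names that occur three times (c and d).
print(as.character(by_freq$name[4:9]))
print(as.character(by_freq_name$name[4:9]))
```

If the grouping in the original question matters, the two-key order() call seems the safer choice.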