Mode (statistics) in R?
on 01/26/2009 07:28 AM Jason Rupert wrote:
Hopefully this is a pretty simple question:

Is there a function in R that calculates the "mode" of a sample? That is, I would like to be able to determine the value that occurs the most frequently in a data set.

I tried the default R "mode" function, but it appears to provide a storage type or something else.

I tried the RSeek and some R documentation that I downloaded, but nothing seems to mention calculating the "mode".

Thanks again.
It depends upon the type of data you are dealing with.

If it is discrete, you can use table() to calculate frequencies and then take the max:

set.seed(1)
tl <- table(sample(letters, 100, replace = TRUE))

> tl

a b c d e f g h i j k l m n o p q r s t u v w x y z
2 3 3 3 2 4 6 1 6 5 6 4 7 2 2 2 5 4 5 3 8 4 5 4 3 1

> tl[which.max(tl)]
u
8

Alternatively, if the data is continuous, then you will need to look at some form of density estimation. There have been various discussions over the years on how to go about doing this, but a simplistic approach would be:

set.seed(1)
x <- rnorm(100)
dx <- density(x)

> dx$x[which.max(dx$y)]
[1] 0.3294585

# Review plot
plot(dx)
abline(v = dx$x[which.max(dx$y)])

See ?table, ?which.max and ?density

HTH,

Marc Schwartz
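
As a follow-up, the two approaches in this reply could be wrapped into small helper functions along the lines below. This is only a sketch: the names discrete_modes and density_mode are hypothetical and are not part of base R or of the original reply. One thing worth keeping in mind is that which.max() returns only the first maximum, so tied values are silently dropped; the discrete helper here returns every tied value instead. The continuous estimate depends on the bandwidth that density() picks by default, so it should be treated as approximate.

```r
# Hypothetical helper (not in base R): all values tied for the highest count.
# The result is a character vector, because it is taken from the table names.
discrete_modes <- function(x) {
  counts <- table(x)
  names(counts)[as.vector(counts) == max(counts)]
}

# Hypothetical helper (not in base R): location of the peak of a kernel
# density estimate. Extra arguments (e.g. bw, adjust) are passed to density().
density_mode <- function(x, ...) {
  dx <- density(x, ...)
  dx$x[which.max(dx$y)]
}

# Usage on small, deterministic inputs
discrete_modes(c("a", "b", "b", "c", "c"))   # "b" "c" (a tie)
density_mode(c(rep(0, 50), rep(10, 5)))      # a value near 0
```

If a numeric result is needed from discrete_modes(), the returned names can be converted with as.numeric().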