
Statistical mode

Thank you, Kevin, for the feedback.
Absolutely. The help page for statmode() says it is for discrete data, and 
points to density() for continuous data.
Try statmode(iris,TRUE). It points out that petal lengths 1.4 and 1.5 are 
equally common in the data. I decided to make all=FALSE the default 
behavior, but I'd be equally happy with all=TRUE as the default.
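
To make the all=TRUE/FALSE distinction concrete, here is a minimal sketch of a discrete mode for a single vector. It is not the statmode() source; the name mode_sketch() and its internals are hypothetical, chosen only to illustrate how ties can be reported or collapsed to the first value found.

```r
# Hypothetical helper for illustration only, not the statmode() source.
# Returns the most frequent value(s) of a discrete vector.
mode_sketch <- function(x, all = FALSE) {
  x <- x[!is.na(x)]                    # ignore missing values
  if (length(x) == 0) return(x)        # nothing to count
  ux <- unique(x)                      # candidates, in order of appearance
  counts <- tabulate(match(x, ux))     # frequency of each candidate
  modes <- ux[counts == max(counts)]   # every value tied for the top count
  if (all) modes else modes[1]         # all ties, or just the first one
}

mode_sketch(iris$Petal.Length)         # a single value
mode_sketch(iris$Petal.Length, TRUE)   # every tied value
```

Because the result is a subset of the input, integers stay integers and factors or strings keep their type.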

As for the barley data, statmode(barley,TRUE) is just the honest answer. 
The yield is continuous, so the discrete mode is not of interest, and the 
factor levels are all equally common, as you point out.
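
For a continuous variable such as yield, a kernel density estimate is the usual route to a mode, which is why the help page points to density(). The sketch below uses simulated data rather than barley so that it is self-contained; the variable names are made up for illustration, and the result depends on the default bandwidth.

```r
# Sketch: estimating the mode of continuous data with density().
# Simulated data stand in for a continuous variable such as yield.
set.seed(1)
x <- rnorm(1000, mean = 10, sd = 1)

d <- density(x)                    # kernel density estimate, default bandwidth
cont_mode <- d$x[which.max(d$y)]   # grid point where the estimate peaks
cont_mode
```
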
The describe() function is a verbose summary, usually of a data frame. The 
statmode() function is the discrete mode, usually of a vector. 
Importantly, describe(faithful$waiting) points out the mean, median and 
range, but not the mode.

---

Allow me to include two more valid comments, from Sarah Goslee and David 
Winsemius, respectively:
I think core R should come with a basic function to get the mode of a 
discrete vector. One option would be to lift mfv() into the 'stats' 
package, but something like statmode() could also cover factors and 
strings. Might as well provide all=TRUE/FALSE functionality, too, and 
retain integers as integers.

It's common to find rudimentary basic functionality in the 'stats' 
package, and dedicated packages for more details; time series models and 
robust statistics come to mind. The 'modeest' package is impressive 
indeed.
Yes it is, only less cumbersome. Much like sd(Vec) is less cumbersome than 
sqrt(var(Vec)). Moreover, I find it confusing to see the count as well,

   table(volcano)[which.max(table(volcano))]
   # 110
   # 177

although this can be debated. Finally, I think the examples

   statmode(mtcars)
   statmode(mtcars, TRUE)

demonstrate practical functionality beyond 
table(Vec)[which.max(table(Vec))].
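
For comparison, here is roughly what it takes to get the value alone out of the table() idiom, and to apply the same idea to each column of a data frame. This is only a sketch in base R; first_mode() is a hypothetical helper and makes no claim about what statmode(mtcars) returns.

```r
# Sketch: extracting just the modal value from the table() idiom.
tab <- table(volcano)
tab[which.max(tab)]                         # value together with its count
value_only <- names(tab)[which.max(tab)]    # value alone, but as a character string
as.numeric(names(tab)[tab == max(tab)])     # all tied values, converted back to numeric

# Hypothetical helper: first mode of a vector, keeping the original type.
first_mode <- function(v) {
  uv <- unique(v)
  uv[which.max(tabulate(match(v, uv)))]
}
col_modes <- sapply(mtcars, first_mode)     # one mode per column
col_modes
```

The round trip through names() and as.numeric() is the part that a dedicated function would hide.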

The mean, median, and mode are often mentioned together as fundamental 
descriptive statistics, and I just find it odd that statmode() is not 
already in core R. Sure, we could get by without the sd() function in core 
R, but why should we?

All the best,

Arni