density > 1?

5 messages · Johannes Elias, Bill Venables, Eik Vettorazzi +2 more

#
Dear R-Gurus,

I wonder why 'density' values as shown in hist or plot(density(x)) are
sometimes over 1. How can that be?

Example
The resulting plot shows density values below 1 on the y-axis. However,
shows density values over 1.

How to interpret density values over 1?

Greetings,

Johannes
#
Because densities are not probabilities.  It is the area under the density curve that represents probability.

Example: the chi-squared density with 1 degree of freedom has a singularity at zero and is unbounded.  The area under the curve, however, is still 1.
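
A quick way to check this in R is sketched below. It only uses `dchisq` and `integrate` from the stats package; the evaluation points and the variable names `x` and `area` are arbitrary choices for illustration, not part of the original message.

```r
# Chi-squared density with 1 degree of freedom, evaluated ever closer to zero
x <- c(1e-1, 1e-2, 1e-4, 1e-6)
dchisq(x, df = 1)   # the values keep growing as x approaches 0

# Numerical integration over (0, Inf): the area is still (approximately) 1
area <- integrate(dchisq, lower = 0, upper = Inf, df = 1)
area$value
```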

(This is a distressingly common misconception.  It is really not an R issue but a distribution theory issue.)

Bill Venables
#
Hi Johannes,
It's more a statistical issue. In short: densities are not probabilities!
With a continuous random variable, probability statements are typically
about intervals, not about points.
A density only has to integrate to 1 and be non-negative,
nothing else.
Consider the uniform (0, 0.5) distribution, where the density is f(x) = 2
for all 0 <= x <= 0.5. This is a perfectly valid probability density whose
non-zero values are all > 1.
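
For what it's worth, the uniform (0, 0.5) case can be checked directly in R. This is only a small sketch using `dunif`, `punif` and `integrate`; the points 0.1, 0.25, 0.4 and the interval (0.1, 0.3) are arbitrary picks for illustration.

```r
# Density of the uniform distribution on (0, 0.5): constant height 2
dunif(c(0.1, 0.25, 0.4), min = 0, max = 0.5)

# Probability of an interval inside the support = height * width = 2 * 0.2
p_interval <- punif(0.3, min = 0, max = 0.5) - punif(0.1, min = 0, max = 0.5)
p_interval

# Total area under the density
total <- integrate(dunif, lower = 0, upper = 0.5, min = 0, max = 0.5)$value
total
```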

hth.

#
This comes up every now and again. The real question is: Why do people
believe that densities should be probabilities? They're not, they denote
(differential) probability per unit on the x axis, and the denominator
can be small. The density _integrates_ to 1, so if e.g. it is
concentrated on (0, 0.5) it has to be at least 2 somewhere.
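
The "probability per unit on the x axis" reading can be illustrated numerically. The sketch below assumes a normal distribution with sd = 0.1 (the one used elsewhere in this thread) and some arbitrary interval widths; `sd0`, `widths`, `prob` and `ratio` are just illustrative names.

```r
# Normal distribution with sd = 0.1
sd0 <- 0.1
widths <- c(0.05, 0.01, 0.001)

# Probability of landing in (0, w): always a number between 0 and 1
prob <- pnorm(widths, sd = sd0) - pnorm(0, sd = sd0)

# Probability divided by the interval width w (a small denominator)
ratio <- prob / widths

# As w shrinks, the ratio approaches the density at 0, which is well above 1
ratio
dnorm(0, sd = sd0)
```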

  
    
#
On Mon, 2009-03-02 at 13:27 +0100, Johannes Elias wrote:
Johannes,

Well, density is not the same as probability.

In a histogram on the density scale, the area of each bar equals the probability.

In your example:

set.seed(123)
hist(rnorm(1000,sd=.1),freq=FALSE)

The interval from -0.05 to 0 has density=4, but the probability of a number
falling in this interval is 4*.05=.2

In fact:

set.seed(123)
hist(rnorm(1000,sd=.1),freq=FALSE)$density
[1] 0.09999998 0.28000000 0.94000000 1.98000000 2.60000000 4.00000000
[7] 4.04000000 2.92000000 1.66000000 0.92000000 0.44000000 0.10000000
[13] 0.02000000

set.seed(123)
sum(hist(rnorm(1000,sd=.1),freq=FALSE)$density*.05)
[1] 1


So the probabilities (density times bin width) sum to 1, but the densities themselves sum to 20.
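
The same check can be written without hard-coding the bin width, and it carries over to plot(density(x)) as well. This is only a sketch: `h`, `bin_prob`, `d` and `kde_area` are illustrative names, and the trapezoidal rule is just one simple way to approximate the area under the kernel density estimate.

```r
set.seed(123)
x <- rnorm(1000, sd = 0.1)

# Histogram on the density scale, computed without drawing it
h <- hist(x, plot = FALSE)
bin_prob <- h$density * diff(h$breaks)   # area of each bar = bin probability
sum(h$density)    # sum of bar heights: can be far above 1
sum(bin_prob)     # sum of bar areas: 1

# Kernel density estimate, as drawn by plot(density(x))
d <- density(x)
# Trapezoidal rule for the area under the estimated curve
kde_area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
max(d$y)          # peak height is above 1
kde_area          # area is close to 1
```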