Skip to content

density() returns a density function that does not add up to 1

6 messages · Jonathan Li, A.J. Rossini, Brian Ripley +1 more

#
Dear R users,

I ran into this curious problem:
[1] 2.517502

Admittedly the method of computing the mass under the density curve at
line 3 is crude.
But 2.5 is pretty far from 1, the value it should be.

I tried a few other dataset and got similar result. Am I missing
something obvious?
Or is the return of density() not supposed to be normalized?

Thanks in advance!
Jonathan
#
jonathan> Dear R users,
    jonathan> I ran into this curious problem:

    >> d <- rnorm(100)
    >> d.density <- density(d)
    >> sum( d.density$x * d.density$y)
    jonathan> [1] 2.517502

I get 6.5

    jonathan> Admittedly the method of computing the mass under the density curve at
    jonathan> line 3 is crude.

It's actually wrong.   You aren't computing an integral, you need the
the approximate area under d.density$y[i], NOT the value of the X
axis. 

    jonathan> But 2.5 is pretty far from 1, the value it should be.

nope.

    jonathan> I tried a few other dataset and got similar result. Am I missing
    jonathan> something obvious?
    jonathan> Or is the return of density() not supposed to be normalized?

You just need to integrate properly.  You are attempting Riemann
integration, but need to change things a bit.

best,
-tony
#
jonathan> Dear R users,
    jonathan> I ran into this curious problem:

    >> d <- rnorm(100)
    >> d.density <- density(d)
    >> sum( d.density$x * d.density$y)
    jonathan> [1] 2.517502

whoops, you probably want something like:
[1] 1.000950

instead, for a crude integration (and yes, it throws a warning).
#
Thank you for the message. It helps a lot. I see where my method was
wrong.
Now another case:
[1] 2.856715

If you add more extreme tails, the thing gets worse:
[1] 13.90902

I am not sure what is going on now?
Thanks again!
Jonathan
"A.J. Rossini" wrote:

  
    
#
On Fri, 30 Aug 2002, Jonathan Li wrote:

            
(or, I did something silly)
(and got a nonsensical answer)
I think you intended

sum( unique(diff(d.density$x)) * d.density$y )

Your sum is 512x the estimate of E(X).
#
Jonathan Li <jonqli at labs.agilent.com> writes:
You're trying to calculate and then integrate a very spiky density.
Try plotting it and then increase the number of points used to
approximate it:
[1] 1.000014
Warning message: 
longer object length
        is not a multiple of shorter object length in: diff(ddd$x) * ddd$y