density plot of simulated exponential distributed data

There is an extensive statistical literature on how to correct for boundary bias in kernel density estimates.  See, for example, 

An Improved Estimator of the Density Function at the Boundary
S. Zhang, R. J. Karunamuni and M. C. Jones
Journal of the American Statistical Association
Vol. 94, No. 448 (Dec., 1999), pp. 1231-1241

and the references therein.  However, a very simple (though certainly not optimal) fix is to reflect the data about the origin before fitting the density:

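y <- rexp(100)
yy <- c(y, -y)                   # reflect the sample about 0
dens <- density(yy)
keep <- dens$x >= 0              # keep only the nonnegative half of the grid
dens$y <- 2 * dens$y[keep]       # double so the estimate integrates to about 1 on [0, Inf)
dens$x <- dens$x[keep]
plot(dens, col = "red")          # reflected estimate
lines(density(y), col = "gray")  # naive estimate, biased downward near 0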
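
As a rough check of the reflection fix above, the sketch below compares the reflected and naive estimates with the true Exp(1) density at the boundary, and confirms that the folded estimate still integrates to about 1. The seed and the helper name reflect_density are assumptions made for illustration; they are not part of the original suggestion.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
y <- rexp(100)

# Hypothetical helper: reflect the sample about `lower`, estimate the
# density of the combined sample, then fold the estimate back.
reflect_density <- function(x, lower = 0, ...) {
  d <- density(c(x, 2 * lower - x), ...)
  keep <- d$x >= lower
  d$y <- 2 * d$y[keep]
  d$x <- d$x[keep]
  d
}

refl  <- reflect_density(y)
naive <- density(y)

# Estimated density at the boundary x = 0; the true value is dexp(0) = 1.
# rule = 2 uses the nearest grid value, since 0 lies just below the
# first retained grid point of the reflected estimate.
at0 <- c(reflected = approx(refl$x, refl$y, xout = 0, rule = 2)$y,
         naive     = approx(naive$x, naive$y, xout = 0)$y,
         true      = dexp(0))
print(at0)

# Area under the folded estimate by the trapezoid rule; should be near 1.
area <- sum(diff(refl$x) * (head(refl$y, -1) + tail(refl$y, -1)) / 2)
print(area)
```

The naive estimate spreads mass below zero, so it should come out well under the true value at the boundary, while the reflected estimate recovers most of that mass.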

_______________________
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Greg Snow
Sent: Wednesday, April 27, 2011 1:55 PM
To: Juanjuan Chai; r-help at r-project.org
Subject: Re: [R] density plot of simulated exponential distributed data

You might want to use the logspline package instead of the density function; it allows you to specify bounds on the distribution.
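
A minimal sketch of the logspline suggestion, assuming the logspline package from CRAN is installed. The seed, sample size, and evaluation points are illustrative choices, not from the thread.

```r
library(logspline)   # CRAN package, assumed to be installed

set.seed(1)          # arbitrary seed, only for reproducibility
y <- rexp(100)

# lbound = 0 tells logspline that the support starts at 0
fit <- logspline(y, lbound = 0)

plot(fit)                                  # fitted density
curve(dexp(x), add = TRUE, col = "gray")   # true Exp(1) density for comparison

# Evaluate the fitted density at a few points
dvals <- dlogspline(c(0.5, 1, 2), fit)
print(dvals)
```

Because the lower bound is built into the fit, no reflection or rescaling step is needed afterwards.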
