Hi all,
I tried to plot the density curve using the data from simulation. I am
sure that the data should be exponentially distributed, but the plot
of density curve always starts from (0,0) which is not the case for
exponential distribution. Is there any way around this, to keep the
curve dropping at 0?
Thanks.
The following are the codes I tested:
data <- vector()
rate <- 3
i <- 1
for(i in 1:1000){
r <- runif(1)
data[i] <- log(1-r)/(-rate)
i <- i+1
}
plot(density(data))
-JJ
density plot of simulated exponential distributed data
5 messages · Juanjuan Chai, Dennis Murphy, Greg Snow +1 more
Hi: Try this (and note the use of vectorization rather than a loop): rate <- 3 dta <- -log(1 - runif(1000))/rate hist(dta, nclass = 30, probability = TRUE) x <- c(0.001, seq(0, 3, by = 0.01)) lines(x, dexp(x, rate = 3)) This is the difference in timings between the vectorized and iterative methods of generating the samples:
system.time(replicate(1000, -log(1 - runif(1000))/rate))
user system elapsed 0.10 0.00 0.09
system.time(replicate(1000, { dt <- numeric(1000)
+ i <- 1
+ for(i in 1:1000){
+ r <- runif(1)
+ dt[i] <- log(1-r)/(-rate)
+ i <- i+1
+ } }))
user system elapsed
9.35 0.00 9.40
Vectorization is usually your friend in R, and it pays to use it when
available. All of the d*, p*, q* and r* functions, where * denotes the
suffix for a distribution, are vectorized, as are most of the
functions in base R. A happy by-product is that it also makes for more
easily readable code.
HTH,
Dennis
On Tue, Apr 26, 2011 at 3:19 PM, Juanjuan Chai <chaij at umail.iu.edu> wrote:
Hi all,
I tried to plot the density curve using the data from simulation. I am sure
that the data should be exponentially distributed, but the plot of density
curve always starts from (0,0) which is not the case for exponential
distribution. Is there any way around this, to keep the curve dropping at 0?
Thanks.
The following are the codes I tested:
data <- vector()
rate <- 3
i <- 1
for(i in 1:1000){
? ? ? ?r <- runif(1)
? ? ? ?data[i] <- log(1-r)/(-rate)
? ? ? ?i <- i+1
}
plot(density(data))
-JJ
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
You might want to use the logspline package instead of the density function, it allows you to specify bounds on a distribution.
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Juanjuan Chai
> Sent: Tuesday, April 26, 2011 4:19 PM
> To: r-help at r-project.org
> Subject: [R] density plot of simulated exponential distributed data
>
> Hi all,
>
> I tried to plot the density curve using the data from simulation. I am
> sure that the data should be exponentially distributed, but the plot
> of density curve always starts from (0,0) which is not the case for
> exponential distribution. Is there any way around this, to keep the
> curve dropping at 0?
>
> Thanks.
>
> The following are the codes I tested:
>
> data <- vector()
> rate <- 3
> i <- 1
> for(i in 1:1000){
> r <- runif(1)
> data[i] <- log(1-r)/(-rate)
> i <- i+1
> }
> plot(density(data))
>
> -JJ
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
There is an extensive statistical literature on how to correct for boundary bias in kernel density estimates. See, for example, An Improved Estimator of the Density Function at the Boundary S. Zhang, R. J. Karunamuni and M. C. Jones Journal of the American Statistical Association Vol. 94, No. 448 (Dec., 1999), pp. 1231-1241 And the references therein. However, a very simple -- certainly not optimal -- fix is to reflect the data about the origin before fitting the density: y <- rexp(100) yy <- c(y, -y) dens <- density(yy) dens$y <- 2*dens$y[dens$x >=0] dens$x <- dens$x[dens$x >= 0] plot(dens,col="red") lines(density(y),col="gray") _______________________ Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Greg Snow Sent: Wednesday, April 27, 2011 1:55 PM To: Juanjuan Chai; r-help at r-project.org Subject: Re: [R] density plot of simulated exponential distributed data You might want to use the logspline package instead of the density function, it allows you to specify bounds on a distribution.
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Juanjuan Chai
> Sent: Tuesday, April 26, 2011 4:19 PM
> To: r-help at r-project.org
> Subject: [R] density plot of simulated exponential distributed data
>
> Hi all,
>
> I tried to plot the density curve using the data from simulation. I am
> sure that the data should be exponentially distributed, but the plot
> of density curve always starts from (0,0) which is not the case for
> exponential distribution. Is there any way around this, to keep the
> curve dropping at 0?
>
> Thanks.
>
> The following are the codes I tested:
>
> data <- vector()
> rate <- 3
> i <- 1
> for(i in 1:1000){
> r <- runif(1)
> data[i] <- log(1-r)/(-rate)
> i <- i+1
> }
> plot(density(data))
>
> -JJ
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
I tried logspline function using a lower bound 0 for my data, it works like a charm. When the I changed the xlim only positive part, the vertical line was also gone. That's exactly what I expected. Thanks. -JJ
Greg Snow wrote:
You might want to use the logspline package instead of the density function, it allows you to specify bounds on a distribution.