Hello,

I would like some details and explanations about the results I get. I start with a uniform sample between -1 and 1, and then plot its density. My problem is that the range of the density is much wider than I expected:

samp <- runif(10000,-1,1)
plot(density(samp))

Instead of varying between -1 and 1, the density varies between approximately -1.5 and 1.5.
Could someone explain to me what is happening? Maybe some arguments for density estimation need to be set?

Waiting for an answer,
Thanks in advance,
Isabelle.

Isabelle Zabalza-Mezghani, PhD
IFP - Research Reservoir Engineer
density ranges for uniform law
4 messages · ZABALZA-MEZGHANI Isabelle, Ott Toomet, Roger Bivand +1 more
Hi,

| From: ZABALZA-MEZGHANI Isabelle <Isabelle.ZABALZA-MEZGHANI at ifp.fr>
| Date: Tue, 8 Apr 2003 10:21:45 +0200
| Hello,
|
| I would like some details and explanations about the results I get.
| I start with a uniform sample between -1 and 1, and then plot its
| density.
| My problem is that the range of the density is much wider than I expected:
|
| samp <- runif(10000,-1,1)
| plot(density(samp))
|
| Instead of varying between -1 and 1, the density varies between approximately
| -1.5 and 1.5

With the default bandwidth, the density is positive on the interval of about (-1.3, 1.3). Its value is around 0.5. I guess you should try changing the bandwidth. Try
plot(density(samp, bw=0.1))
lines(density(samp, bw=0.03), col=2)
lines(density(samp, bw=0.01), col=3)
best wishes,
Ott

| Could someone explain to me what is happening? Maybe some arguments for
| density estimation need to be set?
|
| Isabelle.
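A note on where the extra width comes from: by default density() chooses the bandwidth with bw.nrd0() and evaluates the estimate on a grid that reaches cut = 3 bandwidths beyond the range of the data. A minimal sketch to check this (the seed is arbitrary and only there for reproducibility):

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
samp <- runif(10000, -1, 1)

d <- density(samp)               # defaults: bw = "nrd0", cut = 3
d$bw                             # bandwidth actually used
range(d$x)                       # grid on which the estimate is evaluated

# The grid should run from min(samp) - 3*bw to max(samp) + 3*bw
expected <- c(min(samp) - 3 * d$bw, max(samp) + 3 * d$bw)
expected
```

For a sample of this size the bandwidth should come out near 0.08, so the plotted curve extends to roughly -1.25 and 1.25, which matches the interval mentioned above.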
On Tue, 8 Apr 2003, ZABALZA-MEZGHANI Isabelle wrote:
| Hello,
|
| I would like some details and explanations about the results I get.
| I start with a uniform sample between -1 and 1, and then plot its
| density.
| My problem is that the range of the density is much wider than I expected:
|
| samp <- runif(10000,-1,1)
| plot(density(samp))
|
| Instead of varying between -1 and 1, the density varies between approximately
| -1.5 and 1.5.
| Could someone explain to me what is happening? Maybe some arguments for
| density estimation need to be set?
Try:
samp <- runif(10000,-1,1)
range(samp)
[1] -0.9995812 0.9996801 (for this samp)
plot(density(samp), ylim=c(0,0.6))
abline(v=c(-1,1))
lines(density(samp, cut=0), col="green")
lines(density(samp, from=-1, to=1), col="red")
So you can add arguments to density() - see help(density) - but they will not affect the fact that for the chosen bandwidth and kernel, the kernel will extend outside the data range. Does:
samp1 <- runif(10000,-1.5,1.5)
plot(density(samp1, from=-1, to=1), ylim=c(0,0.6))
abline(v=c(-1,1))

"look" "better"?

Roger
| Waiting for an answer,
| Thanks in advance,
| Isabelle.
|
| Isabelle Zabalza-Mezghani, PhD
| IFP - Research Reservoir Engineer
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Breiviksveien 40, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no
On 08-Apr-03 ZABALZA-MEZGHANI Isabelle wrote:
| samp <- runif(10000,-1,1)
| plot(density(samp))
|
| Instead of varying between -1 and 1, the density varies between approximately
| -1.5 and 1.5.
| Could someone explain to me what is happening? Maybe some arguments for
| density estimation need to be set?
density() computes a kernel-density estimate of the density, i.e.
it replaces each observation by a distribution ("kernel") which is
spread out over a certain width on either side of it, and sums these
contributions. Therefore, observations near the ends of the range [-1,1]
are replaced by distributions which extend beyond the range, with the
result you have seen.
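The description above can be sketched by hand. This is only an illustration, assuming a Gaussian kernel (the default in density()) and the default bandwidth rule bw.nrd0(); the helper name kde is made up for this sketch, and density() itself uses a faster binned approximation:

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
samp <- runif(10000, -1, 1)
h <- bw.nrd0(samp)               # same default bandwidth rule as density()

# Kernel density estimate at the points x0: the average of Gaussian
# kernels, one centred on each observation, each with standard deviation h.
kde <- function(x0, x, h) {
  sapply(x0, function(p) mean(dnorm(p, mean = x, sd = h)))
}

vals <- kde(c(0, 1, 1.1), samp, h)
vals
```

The value at 0 should be close to 0.5, the value at 1 about half of that, and the value at 1.1 small but still positive, because kernels centred on observations near 1 spill over the end of the range.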
There are options to density() which can limit the estimated density
to the range [-1,1]: try
plot(density(samp,from=-1,to=1))
or
plot(density(samp,cut=0))
(which both seem to give the same result), though you may not think that
the result looks satisfactory at the ends.
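On why the ends may look unsatisfactory: at -1 and 1 roughly half of the mass of the nearby kernels falls outside [-1,1], so the truncated estimate drops to about 0.25 there instead of 0.5. One common workaround, which is not proposed in this thread and is sketched here only under the assumption that the true end-points -1 and 1 are known, is to reflect the sample about each end-point and rescale:

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
samp <- runif(10000, -1, 1)

d0 <- density(samp, from = -1, to = 1)
ends0 <- d0$y[c(1, length(d0$y))]   # plain estimate at -1 and at 1
ends0

# Reflection sketch: mirror the data about -1 and about 1, estimate on
# [-1, 1], then multiply by 3 because the augmented sample is 3 times as large.
h    <- bw.nrd0(samp)               # keep the bandwidth of the original sample
refl <- c(samp, -2 - samp, 2 - samp)
d1   <- density(refl, bw = h, from = -1, to = 1)
d1$y <- 3 * d1$y
ends1 <- d1$y[c(1, length(d1$y))]   # reflected estimate at -1 and at 1
ends1
```

Overlaying the two with plot(d1, ylim=c(0,0.6)) followed by lines(d0, col="red") should show the reflected estimate staying near 0.5 up to the ends.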
Ideally, for this sort of problem it should be possible to make the width
of the kernel depend on the position (rank) of the observation it is
applied to -- for a uniform distribution in particular the variance
of an order statistic is strongly dependent on its rank (the median
over [-1,1] has variance 1/(n+2), the min or the max has variance
4n/((n+2)*(n+1)^2) approx = 4/(n^2) for a sample of n). If you know
that a sample is from a uniform distribution, the end-points are
very precisely estimated from the extremes of the sample, and a
fixed-width kernel-density estimate will not do justice to this.
I don't know whether this is possible directly with current R functions
(though one can always write one which does it).
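The two variance formulas quoted above can be checked by simulation. A minimal sketch, where n, nsim and the seed are arbitrary choices made for this illustration (n is odd so that the sample median is a single order statistic):

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n    <- 11                       # sample size, odd
nsim <- 20000                    # number of simulated samples

# One simulated sample per row
sims <- matrix(runif(n * nsim, -1, 1), nrow = nsim)
med  <- apply(sims, 1, median)
mx   <- apply(sims, 1, max)

# Simulated variance next to the formula
c(simulated = var(med), formula = 1 / (n + 2))
c(simulated = var(mx),  formula = 4 * n / ((n + 2) * (n + 1)^2))
```

The maximum should show a clearly smaller variance than the median, and the gap widens quickly as n grows, which is the point made above about the end-points being estimated very precisely.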
I hope this helps,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 08-Apr-03 Time: 10:40:50
------------------------------ XFMail ------------------------------