On Tue, 22 Feb 2005 13:43:47 -0700, Tony Plate
<tplate@blackmesacapital.com> wrote :
Is it a bug that quantile() can return a lower sample quantile for a higher
probability?
##### quantile returns decreasing results with increasing probs (data at
0% 20% 40% 60% 80%
-0.0014141174 -0.0009041968 -0.0009041968 -0.0007315023 -0.0005746115
100%
0.2905596324
##### the 40% quantile has a lower value than the 20% quantile
diff(quantile(x2, (0:5)/5))
20% 40% 60% 80% 100%
5.099206e-04 -1.084202e-19 1.726945e-04 1.568908e-04 2.911342e-01
This only happens for type=7:
for (type in 1:9) cat(type, any(diff(quantile(x2, (0:5)/5,
type=type))<0), "\n")
1 FALSE
2 FALSE
3 FALSE
4 FALSE
5 FALSE
6 FALSE
7 TRUE
8 FALSE
9 FALSE
I know this is at the limits of machine precision, but it still seems
wrong. Curiously, S-PLUS 6.2 produces exactly the same numerical result on
my machine (according to the R quantile documentation, the S-PLUS
calculations correspond to type=7).
I'd say it's not a bug in that rounding error is something you should
expect in a calculation like this, but it does look wrong. And it
isn't only restricted to type 7. If you make a vector of copies of
that bad value, you'll see it in more cases:
x <- rep(-0.00090419678460984, 602)
for (type in 1:9) cat(type, any(diff(quantile(x, (0:5)/5,
+ type=type))<0), "\n")
1 FALSE
2 FALSE
3 FALSE
4 FALSE
5 TRUE
6 TRUE
7 TRUE
8 FALSE
9 TRUE
(at least on Windows). What's happening is that R is doing linear
interpolation between two equal values, and not getting the same value
back, because of rounding.
The offending line appears to be this one:
qs[i] <- ifelse(h == 0, qs[i], (1 - h) * qs[i] + h * x[hi[i]])
The equivalent calculation in the approx function (which doesn't
appear to have this problem) is
qs[i] + (x[hi[i]] - qs[i]) * h
Can anyone think of why this would not be better? (The same sort of
calculation shows up again later in quantile().)