Skip to content
Prev 177210 / 398506 Next

p-values from bootstrap - what am I not understanding?

Some useful comments have already been made. I would like to comment on
the two definitions of the p-value under (4) -- since I thought exactly
about this issue a while back. Maybe this will be useful ...

Suppose the distribution of a test statistic Z under H0 is given by f(Z)
and that the distribution is not symmetric. A bootstrap distribution
(under H0) may in fact look like that. We can define the critical values
for alpha = .05 in two different ways: 

(1) We find a single value of z so that the sum of the lower and upper
tail is equal to .05 (red shaded region, corresponding to z = 9.51 and z
= -9.51). Since the distribution is very skewed, there is no area in the
lower tail, so that the red-shaded region in the upper tail is actually
equal to .05.

(2) We put .025 in the lower and .025 in the upper tail (blue shaded
regions, corresponding to z = -3.82 and z = 11.53 in the figure).

The first definition treats deviations on both ends of the sampling
distribution equally. The second definition acknowledges the fact that
deviations of a certain magnitude are more likely to occur in the upper
than in the lower tail. Hence, the critical value on the upper tail is
more extreme than on the lower tail. This is, in fact, how critical
values are typically defined for non-symmetric distributions.

Now given some data, we observe the value z of the test statistic. How
should we now calculate the two-sided p-value? 


option 1: p = prob(|Z| > |z|)
-----------------------------

If z =  11.53, then p = .025 (reject H0)
If z = -11.53, then p = .025 (reject H0)
If z =   9.51, then p = .050 (reject H0)
If z =  -9.51, then p = .050 (reject H0)
If z =  -3.82, then p = .303 (do not reject H0)
If z =      1, then p = .779

Note that one could never actually get a p-value below .05 if z is
negative.


option 2: p = 2*prob(Z > z) if z is positive or 2*prob(Z < z) if z is
negative
---------------------------------------------------------------------

If z =  11.53, then p = .050 (reject H0)
If z = -11.53, then p = .000 (reject H0)
If z =   9.51, then p = .100 (do not reject H0)
If z =  -9.51, then p = .000 (reject H0)
If z =  -3.82, then p = .050 (reject H0)
If z =      1, then p = 1.07 !!!

For z = 9.51, we should not reject H0 according to the second definition
of the critical values. On the other hand, for z = -3.82, we should
reject H0 according to the second definition. So option 2 is more in
line how we typically define critical values in non-symmetric
distributions. However, note that the p value can be above 1 according
to the second definition!

######################################################################

alpha		<- .05
shape		<- 4
scale		<- 2

xs		<- seq(0, 30, .1)
#ys		<- dchisq(xs, df=4)
ys		<- dgamma(xs, shape=shape, scale=scale)

offset	<- xs[which( ys == max(ys) )]
xs		<- xs - offset

par(mar=c(5,4,2,2))

plot(xs, ys, type="l", lwd=2, xlim=c(-15,25), xlab="Z", ylab="f(Z)")
abline(v = 0)

########################################################################
#####

### some p-value calculations

round( pgamma(3.82+offset, shape=shape, scale=scale, lower.tail=F) + 
 pgamma(-3.82+offset, shape=shape, scale=scale, lower.tail=T) , 3 )
round( pgamma(0+offset, shape=shape, scale=scale, lower.tail=F) + 
 pgamma(-0+offset, shape=shape, scale=scale, lower.tail=T) , 3 )
round( pgamma(1+offset, shape=shape, scale=scale, lower.tail=F) + 
 pgamma(-1+offset, shape=shape, scale=scale, lower.tail=T) , 3 )
2*round( pgamma(1+offset, shape=shape, scale=scale, lower.tail=F) , 3 )

########################################################################
#####

upp.crit	<- qgamma(alpha, shape=shape, scale=scale, lower.tail=F)
x.r		<- seq(upp.crit, max(xs)+offset, length = 100)
y.r		<- c( dgamma(x.r, shape=shape, scale=scale), 0, 0)
x.r		<- c(x.r, max(xs), upp.crit) - offset
polygon(x.r, y.r, density = 50, col="red", angle=135)

text(upp.crit-offset, .03, paste("z = ", round(upp.crit-offset,2)),
pos=4, offset=0, col="red")
text(-1*(upp.crit-offset), .03, paste("z = ",
-1*round(upp.crit-offset,2)), pos=2, offset=0, col="red")

low.crit	<- qgamma(alpha/2, shape=shape, scale=scale,
lower.tail=T)
x.l		<- seq(min(xs), low.crit, length = 100)
y.l		<- c( dgamma(x.l, shape=shape, scale=scale) , 0, 0)
x.l		<- c(x.l, low.crit, min(xs)) - offset
polygon(x.l, y.l, density = 50, col="blue")

text(low.crit-offset, .05, paste("z = ", round(low.crit-offset,2)),
pos=2, offset=0, col="blue")

upp.crit	<- qgamma(alpha/2, shape=shape, scale=scale,
lower.tail=F)
x.r		<- seq(upp.crit, max(xs)+offset, length = 100)
y.r		<- c( dgamma(x.r, shape=shape, scale=scale), 0, 0)
x.r		<- c(x.r, max(xs), upp.crit) - offset
polygon(x.r, y.r, density = 50, col="blue")

text(upp.crit-offset, .05, paste("z = ", round(upp.crit-offset,2)),
pos=4, offset=0, col="blue")

abline(h = 0)

######################################################################

I hope this helps!

Best,