
A question about the hypergeometric distribution and phyper()

On 10 Sep 2008, at 15:19, michael watson (IAH-C) wrote:

            
Actually, if you look at ?phyper, you'll see that this should be

phyper(18, 164, 6187-164, 249, lower.tail=FALSE)
[1] 2.775819e-05

if you want to calculate Pr(X >= 19) = Pr(X > 18). Similarly, for
Pr(X >= 4) = Pr(X > 3):
phyper(3, 12, 6187-12, 249, lower.tail=FALSE)
[1] 0.0009816739
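
As a sanity check on the off-by-one, here is a minimal sketch that compares the upper tail from phyper() with an explicit sum of dhyper() over 19 and above. The variable names (m, n, k, p_tail, p_sum) are just labels chosen for this illustration:

```r
m <- 164          # marked items in the population
n <- 6187 - 164   # unmarked items in the population
k <- 249          # sample size

# lower.tail = FALSE gives Pr(X > 18), which equals Pr(X >= 19)
p_tail <- phyper(18, m, n, k, lower.tail = FALSE)

# The same probability as an explicit sum over the point masses
p_sum <- sum(dhyper(19:min(m, k), m, n, k))

print(all.equal(p_tail, p_sum))
```

Passing 19 instead of 18 as the first argument would give Pr(X > 19) and silently drop the point mass at 19.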

You'll probably still find this counterintuitive, of course.
I think it's simply because the hypergeometric distribution becomes
very skewed and non-normal for expected values < 1 (the expectations
should be roughly 6.6 in the first case and 0.5 in the second).
Perhaps it helps to visualize the two distributions?

# Point probabilities for 0..20 hits in a sample of 249, one row per case
M <- rbind(dhyper(0:20, 164, 6187-164, 249),
           dhyper(0:20, 12, 6187-12, 249))
rownames(M) <- c("164 out of 6187", "12 out of 6187")
colnames(M) <- 0:20
barplot(M, beside=TRUE, legend.text=TRUE)
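
The expectations quoted above can be checked in the same way. This is only a sketch: it assumes the standard mean of the hypergeometric distribution, k * m / N, and the names N, k, m, expected and expected_pmf are labels made up for the example:

```r
N <- 6187         # population size
k <- 249          # sample size
m <- c(164, 12)   # marked items in the first and second case

# Closed-form mean of the hypergeometric distribution
expected <- k * m / N

# The same mean computed directly from the point probabilities
expected_pmf <- sapply(m, function(mi) sum(0:k * dhyper(0:k, mi, N - mi, k)))

print(round(expected, 2))
print(all.equal(expected, expected_pmf))
```

The second expectation is below 1, which is the regime where most of the probability mass sits on 0 and 1 and the distribution is strongly right-skewed.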


Best regards,
Stefan Evert

[ stefan.evert at uos.de | http://purl.org/stefan.evert ]