fisher.test - can I use non-integer expected values?

Peter Dalgaard · 2013-12-11T12:33:06Z

On 11 Dec 2013, at 06:37, Peter Langfelder wrote:
>>
>> Expected values are needed to test a null hypothesis against observed
>> counts, but if total observed counts are 20 for 3 categories, then a null
>> hypothesis of a random effect would use expected values = 6.67 in each of
>> the 3 categories (20/3).
>>
>> Yes, fisher.test is for count data and so is chisq.test, but chisq.test
>> allows 6.67 to be input as expected values in each of 3 categories, while
>

A couple of additional notes: 

(a) If you think you can feed expected values like 6.67 to chisq.test anywhere, I think you are doing it wrong. It might give you an answer, but not likely a correct one.
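
To make (a) concrete, here is a minimal sketch of how a null hypothesis is normally handed to chisq.test(): the observed counts go in x and the null probabilities go in p, and the function works out the expected counts (20/3 each) by itself. The variable names obs and res are only illustrative.

```r
obs <- c(1, 10, 9)

# Pass null *probabilities* via p; do not pass expected counts as data.
res <- chisq.test(obs, p = rep(1/3, 3))

res$expected   # expected counts derived from p: 20/3 in each category
res$statistic  # Pearson X-squared for the observed counts
res$p.value    # asymptotic p-value on 2 degrees of freedom
```

For these counts the statistic is 7.3; the asymptotic p-value is only an approximation, which is what the remaining notes address.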

(b) There is an exact test for equidistribution or goodness of fit in general, but that is not what fisher.test does. You can "cheat" and get an approximation by claiming that you are comparing your data to a much larger set of equidistributed data, e.g.:

fisher.test(cbind(c(1, 10, 9), c(10000, 10000, 10000)))

	Fisher's Exact Test for Count Data

data:  cbind(c(1, 10, 9), c(10000, 10000, 10000))
p-value = 0.01465
alternative hypothesis: two.sided

(c) It's not massively hard to enumerate the 231 ways of splitting 20 items into 3 groups and use their multinomial probabilities to compute the exact p-value directly:

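# Multinomial probability of every split (i, j, 20 - i - j) under equal cell probabilities
tab <- outer(0:20, 0:20,
             Vectorize(function(i, j)
               if (i + j <= 20)
                 dmultinom(c(i, j, 20 - i - j), prob = c(1, 1, 1)/3)
               else 0
             ))
# Probability of the observed counts
pp <- dmultinom(c(1, 10, 9), prob = c(1, 1, 1)/3)
# p-value: total probability of all splits no more likely than the observed one
sum(tab[tab <= pp])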

## [1] 0.01468422
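
The calculation in (c) can be wrapped in a small function, sketched below for the 3-category case only. The name exact_multinom3 and the rel_tol argument are hypothetical, not part of any package; the relative tolerance is an assumption added so that splits tied with the observed one are not dropped by floating-point rounding.

```r
# Hypothetical helper: exact goodness-of-fit p-value for 3 categories,
# ordering the possible splits by their multinomial point probability.
exact_multinom3 <- function(x, prob = rep(1/3, 3), rel_tol = 1e-7) {
  stopifnot(length(x) == 3, length(prob) == 3)
  n <- sum(x)
  # All (i, j) with i + j <= n; the third count is n - i - j.
  grid <- expand.grid(i = 0:n, j = 0:n)
  grid <- grid[grid$i + grid$j <= n, ]
  probs <- mapply(function(i, j) dmultinom(c(i, j, n - i - j), prob = prob),
                  grid$i, grid$j)
  p_obs <- dmultinom(x, prob = prob)
  list(n_configs = nrow(grid),   # number of possible splits
       total     = sum(probs),   # sanity check: should be 1
       p.value   = sum(probs[probs <= p_obs * (1 + rel_tol)]))
}

res <- exact_multinom3(c(1, 10, 9))
res$n_configs  # choose(22, 2) = 231
res$p.value    # agrees with the 0.01468422 above
```

Enumerating all splits is cheap here, but the number of configurations grows quickly with more categories or larger totals, so this is a sketch for small problems only.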

(d) Another option is to use the simulate.p.value option to chisq.test():

chisq.test(c(1, 10, 9), simulate.p.value = TRUE, B = 10000)

	Chi-squared test for given probabilities with simulated p-value (based
	on 10000 replicates)

data:  c(1, 10, 9)
X-squared = 7.3, df = NA, p-value = 0.0252

(The p-values _will_ differ because chi-square critical regions are slightly different from those based on the point probabilities.)
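
The difference between the two critical regions can be checked by full enumeration. The following is a rough sketch, not the code used for the output above: it computes the exact p-value twice, once ordering the splits by point probability and once by the chi-square statistic. All names (grid, counts, pr, stat, tol) are illustrative, and the tolerance is an assumption to guard against rounding in ties.

```r
n <- 20
obs <- c(1, 10, 9)
prob <- rep(1/3, 3)

# Enumerate every split (i, j, n - i - j) of n items over 3 categories.
grid <- expand.grid(i = 0:n, j = 0:n)
grid <- grid[grid$i + grid$j <= n, ]
counts <- cbind(grid$i, grid$j, n - grid$i - grid$j)

# Multinomial probability and chi-square statistic of each split.
pr <- apply(counts, 1, dmultinom, prob = prob)
chisq_stat <- function(x) sum((x - n * prob)^2 / (n * prob))
stat <- apply(counts, 1, chisq_stat)

tol <- 1e-7
# Region 1: splits no more probable than the observed one.
p_point <- sum(pr[pr <= dmultinom(obs, prob = prob) * (1 + tol)])
# Region 2: splits with a chi-square statistic at least as large.
p_chisq <- sum(pr[stat >= chisq_stat(obs) - tol])
c(point_probability = p_point, chi_square = p_chisq)
```

For these data the point-probability ordering gives about 0.0147 and the chi-square ordering about 0.0254, the latter being the quantity that the simulated p-value of 0.0252 estimates.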

Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
