goodness of fit between two samples of size N (discrete variable)

Sun, Apr 12, 2009 12:09 PM

Hello list:

I generate by simulation (using different procedures) two sample vectors of size N, each corresponding to a discrete variable, and I want to test whether these samples can be considered as having the same probability distribution (which is unknown). What is the best test for that?
I've read that the Kolmogorov-Smirnov and Anderson-Darling tests are restricted to continuous data (http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf), while chi-square can handle discrete data, but how do I test (in R) equivalence of distribution in 2 samples using it? Are there better tests than those I mentioned?

Thanks and regards,
jlrp

David Winsemius

Sun, Apr 12, 2009 4:45 PM

On Apr 12, 2009, at 3:09 PM, jose romero wrote:

The question of whether two discrete samples are independent,
conditional on their joint marginals, is generally handled with a
chi-square test. The distribution of the test statistic is only
approximately chi-square, but it seems close enough that most people
will accept it. This is not a test of "equivalence". Ricci deals with
the cases where one sample is fitted to a theoretical distribution.
You do not seem to have that situation.

?chisq.test
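
As a rough sketch of how that might look for two simulated samples: the names x, y and N, the seed, and the Poisson draws below are made-up stand-ins, not the original poster's data. The two vectors are cross-tabulated into a 2 x K table (sample by observed value) and passed to chisq.test. The simulate.p.value argument is one option when some expected counts are small.

```r
# Hypothetical data: two samples of a discrete variable, each of size N.
# The Poisson draws are only an assumption for illustration.
set.seed(1)
N <- 500
x <- rpois(N, lambda = 3)
y <- rpois(N, lambda = 3)

# Cross-tabulate sample membership against the observed values,
# giving a 2 x K contingency table of counts.
values <- c(x, y)
group  <- factor(rep(c("x", "y"), each = N))
tab    <- table(group, values)

# Asymptotic chi-square test on the table. R may warn that the
# approximation is doubtful if some expected counts are small.
res <- chisq.test(tab)

# Same statistic, with the p-value estimated by Monte Carlo simulation
# conditional on the table margins.
res_mc <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)

print(tab)
print(res)
print(res_mc)
```

A large p-value here only means the data give no evidence of a difference. As noted above, it does not demonstrate equivalence.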

I find myself wondering to what purpose you are seeking these answers.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT