
How Can I Concatenate Every Row in a Data Frame with Every Other Row?

On Sat, Mar 21, 2009 at 12:01 PM, I wrote:
I thank David Winsemius, Duncan Murdoch, and Jim Holtman for their helpful
replies.  Jim wrote
This work is for a client whose son was accused of cheating on a multiple
choice exam.  One can investigate this matter statistically by computing
the number of matching answers to questions on the exam between all pairs
of students.  Of course, under the null hypothesis of no cheating, the number
of matching answers has a certain distribution, which allows one to reject
the null hypothesis if the number of matching answers is unduly large for a
particular pair.  (The distribution is generally conditioned on the average
number of correct answers in a given pair, because the more correct answers
a pair has, the more matches can be expected under the null hypothesis.)
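
For readers who want to try the pairwise count described above, here is a
minimal sketch in R.  It is not the code from the replies in this thread.
The data frame `answers`, its contents, and the helper `count_matches` are
all made up for illustration, and the sketch assumes one row per student,
one column per question, and no missing answers.  It only counts matches;
it does not model the null distribution.

```r
# Hypothetical answer sheet: one row per student, one column per question.
answers <- data.frame(
  q1 = c("A", "A", "B", "A"),
  q2 = c("C", "C", "C", "D"),
  q3 = c("B", "B", "A", "B"),
  q4 = c("D", "A", "D", "D"),
  row.names = c("s1", "s2", "s3", "s4"),
  stringsAsFactors = FALSE
)

# Count matching answers for every unordered pair of students.
# (Hypothetical helper; assumes no NA answers.)
count_matches <- function(df) {
  m <- as.matrix(df)            # character matrix, rows = students
  pairs <- combn(nrow(m), 2)    # 2 x choose(n, 2) matrix of row indices
  data.frame(
    student1 = rownames(m)[pairs[1, ]],
    student2 = rownames(m)[pairs[2, ]],
    matches  = apply(pairs, 2, function(p) sum(m[p[1], ] == m[p[2], ])),
    stringsAsFactors = FALSE
  )
}

result <- count_matches(answers)
print(result)
```

Using `combn` visits each pair once, so n students give n(n-1)/2 rows
rather than n^2.  An unusually large value in `matches` would then be
compared against the distribution expected under the null hypothesis.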

Wesolowsky (2000) discusses some of the statistical and ethical aspects of
this exercise.

Don Macnaughton


REFERENCE

Wesolowsky, G. O. 2000. Detecting excessive similarity in answers on
multiple choice exams.  _Journal of Applied Statistics,_ 27, 909-921.