large data set implies rejection of null?
Although I said I would not reply anymore, I did think of one example of what I thought was a perfectly well-controlled experiment that I did. I don't remember it very well and don't have a reprint (!), but here is the citation: Baron, J. (1974). Facilitation of perception by spelling constraints. Canadian Journal of Psychology, 28, 37-50.

In one condition, subjects got practice with AB CD. In another, they got AB CD AD CB. But the frequency of AB CD is the same in both conditions, so frequency of presentation was perfectly controlled. The former condition was superior in perception, thus showing that subjects could use information about sequential constraints. (Or something like this. I might be misremembering.)

Even a tiny effect would have been theoretically interesting. This involved many thousands of observations per subject, as I recall. Even with a tiny effect with millions of observations, I cannot think of an alternative explanation of a significant result (except the usual, that it was chance and would not replicate). There was no issue of sampling because everything was counterbalanced.

I have done many other experiments that I think were well controlled, but nothing as simple as this one. I am not yet convinced that null hypotheses are never true. They seem to be true quite often in my lab. :(

Jon
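[Editor's note: the point about a tiny effect being detectable with millions of observations can be checked with base R's power.t.test. The numbers here (a 0.01 SD effect, two million observations per condition) are illustrative choices, not from the study cited above.]

    # Power of a two-sample t-test for a tiny but nonzero effect at very
    # large n: with enough observations, even a 0.01 SD difference is
    # detected almost surely at the 0.05 level.
    pw <- power.t.test(n = 2e6,        # observations per group (hypothetical)
                       delta = 0.01,   # true mean difference, in SD units
                       sd = 1,
                       sig.level = 0.05)
    pw$power  # essentially 1
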
On 11/27/10 14:22, Daniel Ezra Johnson wrote:
On 11/24/10 07:59, Rolf Turner wrote:
It is well known amongst statisticians that having a large enough data set will result in the rejection of *any* null hypothesis, i.e. will result in a small p-value.
This seems to be a well-accepted guideline, probably because in the social sciences, usually, none of the predictors truly has an effect size of zero. However, unless I am misunderstanding it, the statement appears to me to be more generally false. For example, when the population difference of means actually equals zero, in a t-test, very large sample sizes do not lead to small p-values.

set.seed(1)
n <- 1000000  # 10^6
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0, 1)
t.test(dat.1, dat.2, var.equal = TRUE)  # p = 0.60

set.seed(1)
n <- 10000000  # 10^7
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0, 1)
t.test(dat.1, dat.2, var.equal = TRUE)  # p = 0.48

set.seed(1)
n <- 100000000  # 10^8
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0, 1)
t.test(dat.1, dat.2, var.equal = TRUE)  # p = 0.80

Such results - where the null hypothesis is NOT rejected - would presumably also occur in any experimental situation where the null hypothesis was literally true, regardless of the size of the data set. No?

Daniel
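[Editor's note: the converse of Daniel's simulation, which is what the "well known amongst statisticians" claim actually describes, can be shown the same way. Here a tiny true difference of 0.02 SD (an illustrative value, not from the thread) is injected, and a large n drives the p-value far below 0.05.]

    # Same setup as above, but with a tiny *nonzero* true effect:
    # the large sample size now does produce a small p-value.
    set.seed(1)
    n <- 1000000  # 10^6
    dat.1 <- rnorm(n/2, 0, 1)
    dat.2 <- rnorm(n/2, 0.02, 1)  # true mean difference of 0.02 SD
    t.test(dat.1, dat.2, var.equal = TRUE)$p.value  # far below 0.05

So the guideline is best read as: when the null is false, however slightly, a large enough n will reject it; when the null is exactly true, no sample size will (beyond the alpha-level false-positive rate).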
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron Editor: Judgment and Decision Making (http://journal.sjdm.org)