large data set implies rejection of null?
On 11/24/10 07:59, Rolf Turner wrote:
It is well known amongst statisticians that having a large enough data set will result in the rejection of *any* null hypothesis, i.e. will result in a small p-value.
This seems to be a well-accepted guideline, probably because in the social sciences, usually, none of the predictors truly has an effect size of exactly zero. However, unless I am misunderstanding it, the statement appears to me to be more generally false. For example, when the population difference of means actually equals zero, a t-test does not produce small p-values even at very large sample sizes:

set.seed(1)
n <- 1000000    # 10^6
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0, 1)
t.test(dat.1, dat.2, var.equal = TRUE)    # p = 0.60

set.seed(1)
n <- 10000000   # 10^7
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0, 1)
t.test(dat.1, dat.2, var.equal = TRUE)    # p = 0.48

set.seed(1)
n <- 100000000  # 10^8
dat.1 <- rnorm(n/2, 0, 1)
dat.2 <- rnorm(n/2, 0, 1)
t.test(dat.1, dat.2, var.equal = TRUE)    # p = 0.80

Such results - where the null hypothesis is NOT rejected - would presumably also occur in any experimental situation where the null hypothesis was literally true, regardless of the size of the data set. No?

Daniel
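A more direct way to see the same point (a sketch, not part of the original examples): when the null is exactly true, the p-value of the t-test is uniformly distributed on [0, 1], so the rejection rate stays at alpha no matter how large n gets. Repeating the test many times under the null makes this visible:

```r
# Sketch: under an exactly true null, p-values are uniform,
# so the fraction below 0.05 is ~0.05 regardless of sample size.
# (n.sims and n are arbitrary choices for illustration.)
set.seed(2)
n.sims <- 2000
n <- 500   # per-group sample size; making this larger changes nothing
p.vals <- replicate(n.sims, {
  t.test(rnorm(n), rnorm(n), var.equal = TRUE)$p.value
})
rej.rate <- mean(p.vals < 0.05)
rej.rate   # close to 0.05, nowhere near 1
```

Increasing n only increases power against alternatives with a nonzero effect; it does nothing when the effect is exactly zero.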