
standard error and statistical significance in lmer versus lm

Dear Peter,

thanks for your detailed answer. Let me try to summarize your point, or 
at least my understanding of it: the lm and lmer models give different 
results for the significance of the genotype:week term because the 
latter take into account the variance explained by the probesets, 
while the former do not.

I must say that I agree with you on this point. It simply had not 
occurred to me earlier how to include the probeset information in the 
lm model.

Now I have followed your suggestions and I have fitted the following lm 
model on a subsample of the whole dataset:

set.seed(123)
numProbesets <- length(unique(dataset$probesets))
# Draw a random sample of about 5% of all probesets. Because replace = TRUE,
# a probeset can be drawn more than once, so slightly fewer than 5% may be kept.
toKeep <- sample(1:numProbesets, size = floor(numProbesets / 20), replace = TRUE)
probesetsToKeep <- unique(dataset$probesets)[toKeep]
subdata <- dataset[dataset$probesets %in% probesetsToKeep, ]
lmModel.probeset2.full <- lm(values ~ week * genotype + week * probesets,
                             data = subdata)

The genotype:week term is now significant:
Anova Table (Type III tests)

Response: values
                Sum Sq    Df    F value    Pr(>F)
(Intercept)      1478     1 30817.8390 < 2.2e-16 ***
week                0     1     3.2522  0.071344 .
genotype          265     1  5518.9498 < 2.2e-16 ***
probesets      101240   877  2407.8307 < 2.2e-16 ***
week:genotype       0     1     9.0273  0.002663 **
week:probesets    127   877     3.0093 < 2.2e-16 ***
Residuals         884 18436
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Note that genotype:week is not significant if your original suggestion 
is implemented, i.e., values ~ week * genotype + week:probesets.
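
For anyone who wants to see the contrast between the two formulas without 
the original data, here is a minimal sketch on simulated data. Everything 
in it is an assumption made for illustration: the data frame sim, the 
effect sizes, the number of probesets and the object names fitFull and 
fitSlopesOnly are not taken from the real dataset. It only shows that 
leaving out the probeset main effect pushes the between-probeset 
differences into the residuals, which inflates the residual standard 
error and with it the standard error of the week:genotype coefficient.

```r
# Simulated stand-in for the real data; all names and numbers are illustrative.
set.seed(1)
sim <- expand.grid(
  probesets = factor(paste0("p", 1:20)),
  genotype  = factor(c("wt", "mut"), levels = c("wt", "mut")),
  week      = c(1, 2, 4, 8),
  rep       = 1:3
)
probeLevel <- rnorm(20, mean = 8, sd = 2)    # per-probeset baseline expression
probeSlope <- rnorm(20, mean = 0, sd = 0.1)  # per-probeset change per week
idx <- as.integer(sim$probesets)
sim$values <- probeLevel[idx] +
  probeSlope[idx] * sim$week +
  0.5  * (sim$genotype == "mut") +
  0.02 * sim$week * (sim$genotype == "mut") +
  rnorm(nrow(sim), sd = 0.2)

# Probeset main effect and probeset-specific week slopes
fitFull <- lm(values ~ week * genotype + week * probesets, data = sim)
# Probeset-specific week slopes only, no probeset main effect
fitSlopesOnly <- lm(values ~ week * genotype + week:probesets, data = sim)

# Residual standard error of each fit
c(full = sigma(fitFull), slopesOnly = sigma(fitSlopesOnly))

# Estimate, standard error, t value and p-value of the interaction term
rbind(
  full       = summary(fitFull)$coefficients["week:genotypemut", ],
  slopesOnly = summary(fitSlopesOnly)$coefficients["week:genotypemut", ]
)
```

The point estimate of week:genotype should be essentially the same in both 
fits, since the simulated design is balanced; what changes is the 
residual variance against which it is tested.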

All in all, these results reinforce the idea that modelling the probeset 
information is necessary to capture the variance structure of the data 
and to obtain better estimates of standard errors and p-values. 
Consequently, it seems that you are right when you say that the problem 
is not whether to use lm or lmer, but whether the probeset information 
is included in the model or not.

Thanks a lot for your contribution. Things seem much clearer to me now.

Regards,

Vincenzo
On 1/2/2015 10:23 PM, Peter Claussen wrote: