Dear Paul,

Thanks for the interesting example. In this case we do know the true ID effects (b), so we can plot the estimated ID effects against the true ones:
table(dat5$ID) -> tab5 # nr of obs per ID
plot( b[1:89], ranef(m2)$ID[,1],
      xlab="true ID effect", ylab="estimated ID effect",
      pch=ifelse(tab5==2,16,1), cex=2,
      main="m2 of dat5" )
abline(a=0,b=1,lty=4)
legend("top",pch=c(16,1), legend=c("two obs","one obs"), pt.cex=2, ncol=2 )
This confirms that the estimated random effects for ID deviate more from their true values when only 1 data point is available than when 2 data points are available. With more data points per ID it becomes easier to separate the ID (b) and residual (err) random effects. With a single data point for some IDs, the two are harder to separate, and some of the err variance is now attributed to ID. Thus with the incomplete data in dat5, the variance between IDs is overestimated (estimate 1.24, true 1.00), as illustrated in the plot. Correspondingly, the err variance is underestimated (estimate 0.66, true 1.00).

HTH!

With kind regards, Hugo Quené
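
P.S. A minimal sketch, in base R, of why the predicted ID effects should scatter more around their true values for IDs with one observation than for IDs with two. It is not part of the analysis of dat5: it assumes a plain random-intercept model y = mu + b + err, treats mu and both variances as known, and sets both variances to their true value of 1.00. The names var_b, var_err, n_obs, shrink and pred_err_var are made up for this illustration.

```r
# Sketch: precision of a predicted ID effect in y = mu + b + err,
# with mu and both variances treated as known (an assumption; in m2
# they are estimated from the data).
var_b   <- 1                     # true between-ID variance
var_err <- 1                     # true residual variance
n_obs   <- c(one = 1, two = 2)   # observations per ID

# Shrinkage factor: the share of an ID's mean deviation from mu
# that is attributed to the ID effect b
shrink <- var_b / (var_b + var_err / n_obs)

# Variance of (predicted b - true b) for that number of observations
pred_err_var <- var_b * (1 - shrink)

print(round(rbind(shrink, pred_err_var), 3))
```

Under these assumptions the prediction error variance is 0.5 for an ID with one observation and about 0.333 for an ID with two, which matches the wider scatter of the open symbols in the plot.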
Prof.dr. Hugo Quené | hoogleraar Kwantitatieve Methoden | onderwijsdirecteur Undergraduate School | Dept Talen Literatuur en Communicatie | Utrecht inst of Linguistics OTS | Universiteit Utrecht | Trans 10 | kamer 1.43 | 3512 JK Utrecht | The Netherlands | +31 30 253 6070 | H.Quene at uu.nl | www.uu.nl/gw/medewerkers/HQuene | www.hugoquene.nl | uu.academia.edu/HugoQuene