odd behavior of summary()$r.squared
J.R. Lockwood wrote:
I may be missing something obvious here, but consider the following simple
dataset simulating repeated measures on 5 individuals with pretty strong
between-individual variance.
set.seed(1003)
n<-5
v<-rep(1:n,each=2)
d<-data.frame(factor(v),v+rnorm(2*n))
names(d)<-c("id","y")
Now consider the following two linear models that provide identical fitted
values, residuals, and estimated residual variance:
m1<-lm(y~id,data=d)
m2<-lm(y~id-1,data=d)
print(max(abs(fitted(m1)-fitted(m2))))
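The line above checks only the fitted values. A minimal sketch of the same check for the residuals and the estimated residual standard deviation follows. It rebuilds d, m1 and m2 so that it runs on its own, and the names res_diff and sigma_diff are introduced here purely for illustration. Both differences should be zero up to floating-point rounding.

```r
# Rebuild the data and the two models from the post
set.seed(1003)
n <- 5
v <- rep(1:n, each = 2)
d <- data.frame(id = factor(v), y = v + rnorm(2 * n))
m1 <- lm(y ~ id, data = d)      # intercept plus treatment contrasts
m2 <- lm(y ~ id - 1, data = d)  # one coefficient per individual, no intercept

# Largest absolute difference between the two sets of residuals
res_diff <- max(abs(residuals(m1) - residuals(m2)))
# Difference between the estimated residual standard deviations
sigma_diff <- abs(summary(m1)$sigma - summary(m2)$sigma)

print(res_diff)
print(sigma_diff)
```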
The r-squared reported by summary(m1) appears to be correct in that it is
equal to the squared correlation between the fitted and observed values:
print(summary(m1)$r.squared - cor(fitted(m1),d$y)^2)
However, the same is not true of m2.
print(summary(m2)$r.squared - cor(fitted(m2),d$y)^2)
R.version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    9.0
year     2004
month    04
day      12
language R
I think what you're trying to do is better accomplished by looking at the anova table of the two results:
a1 <- anova(m1)
a2 <- anova(m2)
r2.1 <- a1[1, 2]/sum(a1[, 2])
r2.2 <- a2[1, 2]/sum(a2[, 2])
summary(m1)$r.squared - r2.1
summary(m2)$r.squared - r2.2
The result you used above using "cor" still adjusts your data for the grand mean, which m2 doesn't fit.
HTH,
--sundar
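To make the grand-mean point concrete, here is a minimal sketch comparing the two definitions of R-squared directly. It rebuilds d, m1 and m2 so that it runs on its own. The names rss, r2_centered and r2_uncentered are introduced here for illustration and do not appear in the thread. The sketch relies on summary.lm taking the total sum of squares about zero, rather than about the mean of y, when the model formula has no intercept term.

```r
# Rebuild the data and the two models from the original post
set.seed(1003)
n <- 5
v <- rep(1:n, each = 2)
d <- data.frame(id = factor(v), y = v + rnorm(2 * n))
m1 <- lm(y ~ id, data = d)
m2 <- lm(y ~ id - 1, data = d)

# The residual sum of squares is the same for both fits
rss <- sum(residuals(m2)^2)

# Centered R-squared: total sum of squares taken about the mean of y
r2_centered <- 1 - rss / sum((d$y - mean(d$y))^2)
# Uncentered R-squared: total sum of squares taken about zero
r2_uncentered <- 1 - rss / sum(d$y^2)

# summary(m1) matches the centered version
print(all.equal(summary(m1)$r.squared, r2_centered))
# summary(m2) matches the uncentered version
print(all.equal(summary(m2)$r.squared, r2_uncentered))
# cor() centers both of its arguments, so it reproduces the centered version
print(all.equal(cor(fitted(m2), d$y)^2, r2_centered))
```

The two models give the same fit, so the discrepancy in the original post comes entirely from which total sum of squares sits in the denominator.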