Problem with fitted function

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20111029/03ca16d9/attachment.pl>
You're email is a bit confusing. Let's keep this simple and draw with a distinction between lm and lmer in terms of what is a fitted value.

In least squares regression, your model is

Y = XB + e

Where Y is your outcome, X is the model matrix for the fixed effects, B is the vector of coefficients, and e is the error term. 

And the fitted values are then

\hat{Y} = X\hat{B}

Where \hat{B} is the vector of estimated fixed effects.

In mixed models, your model is

Y = XB + Zu + e

Where Y is your outcome, X is the model matrix for the fixed effects, B is the vector of coefficients, Z is the model matrix for the random effects, u are the BLUP, and e is the error term.

Now, the fitted function in lmer yields

\hat{Y} = X\hat{B} + Z\hat{u}

Where \hat{B} is the vector of estimated fixed effects and now \hat{u} are the predictions of the random effects.
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-
bounces at r-project.org] On Behalf Of Ira Sharenow
Sent: Saturday, October 29, 2011 7:39 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Problem with fitted function

I am new to mixed effects and library(lme4). But I have spent quite a bit of
time trying to understand Prof. Bates's online book.

I have data from the California Department of Education. I have Academic
Performance Index (API) scores for 677 schools and I have data from 2002
through 2009. The Years are coded 0:7.

API response variable, 200 <=API <= 1000

Years, 0:7

CDS, ID numbers, coded as factors

AVGED, average parental education 1 <=AVGED<=5

For each year and for each school, I would like to predict API scores. I
cannot figure out a model that works. My main concern is with fitted().

For each year, I did standard lm() analyses. The data is not perfectly normal
but various adjustments made little difference for my purposes. I am satisfied
with my lm() analyses.

Specifically, I am attempting to predict the API score for a school, AHS.

In order to get started I tried,

model4 = lmer(API ~ 1 + (1 | Years) + (1 | CDS), data = AVGEDAPI0209,
na.action = na.omit)

fixef(model4)
(Intercept)

   701

$Years

   0          1          2        3       4    5     6      7

-55      -30       -18        5     12   17    29   40

$AHS # actually hundreds of numbers were produced and I picked out the AHS ID
and the associated value

124

So I performed a simple calculation

# intercept + AHS effect + Years effect

expAHSScoreBase = numeric(8)

expAHSScoreBase =  701 + 124 + c(-55, -30,-18,5,

   12, 17,29, 40)

expAHSScoreBase
[1] 770   795   807   831   838  843   854   866

I then computed the predicted scores.

fittedAHSScore = fitted(model4)[AVGEDAPI0209$CDS == AHSCDS]

fittedAHSScore
[1] 770   795   807   831   838   798   933   761

The first five scores match, but the last three differ by a significant
amount. Can someone please offer some advice? Why are those final three
numbers so different? And my real goal is to predict API scores from AVGED.

I then tried the model I think I want,

model5 = lmer(API ~ 1 + AVGED + (1 | Years) + (1 | CDS), data = AVGEDAPI0209,
na.action = na.omit)

Unfortunately, when I tried to fit, I again obtained unexpected results

fittedAHSScore5 = fitted(model5)[AVGEDAPI0209$CDS == AHSCDS]

fittedAHSScore5
[1] 768 794 806 832 837 804 931 763

However when I added up the pieces, intercept + (AVGED parameter * actual
value) + Years effect + CDS effect

I think I got what I wanted.

expAHSScoreBase5 = 574 + AVGED5 + 69 + Years5

expAHSScoreBase5
[1] 768   794   806   832   837   842   855   868

By doing separate lm() regressions for each year, I obtained the following
predicted scores,

    800    813   821   838  842    844    854   868

Finally, I tried the following models and in each case the fitted values
produced the same pattern of values.

model6 = lmer(API ~ 1 + Years + (1 + Years| CDS), data = AVGEDAPI0209,
na.action = na.omit, REML = 0)

model6AVGED = lmer(API ~ 1 + AVGED + Years + (1 + Years| CDS), data =
AVGEDAPI0209, na.action = na.omit, REML = 0)

model7 = lmer(API ~ 1 + Years + (1 | CDS) + (-1 + Years| CDS), data =
AVGEDAPI0209, na.action = na.omit, REML = 0)

model7AVGED = lmer(API ~ 1 + AVGED + Years + (1 | CDS) + (-1 + Years| CDS),
data = AVGEDAPI0209, na.action = na.omit, REML = 0)

Thanks.

Ira Sharenow

	[[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models