understanding log-likelihood/model fit
I prefer smaller data sets so I can
more readily look at all components;
here is my modification of the modeling
(along with a random seed so we
all get the same data)
You can change the parameter values
for big data sets.
Nsubj <- 10
Ngrp <- 2
NsubjRep <- 5
set.seed(123)
test1s <- data.frame(subject = rep(seq(Nsubj * Ngrp), each = NsubjRep),
response=500+c(rep(-100,Nsubj * NsubjRep),rep(100,Nsubj * NsubjRep))+rnorm(Nsubj * Ngrp * NsubjRep, 0, 10),
fixed=(rep(c("A","B"),each=Nsubj * NsubjRep)))
null1 <- lmer(response~(1|subject),test1s)
fixed1 <- lmer(response~fixed+(1|subject),test1s)
summary(null1)
summary(fixed1)
quartz() ## substitute your favourite plot device
par(mfrow = c(2, 1))
hist(ranef(null1)[[1]][,1])
hist(ranef(fixed1)[[1]][,1])
quartz() ## substitute your favourite plot device
par(mfrow = c(2, 1))
qqnorm(ranef(null1)[[1]][,1])
qqline(ranef(null1)[[1]][,1])
qqnorm(ranef(fixed1)[[1]][,1])
qqline(ranef(fixed1)[[1]][,1])
### -----
Look at the histogram of BLUPs for model null1.
The strong bimodal structure is a clue that
there may be an underlying fixed effect
(and in fact we know that this is the case,
the beauty of simulations).
The QQ plot of the null1 model also
hints that subjects are not just some
selection from a normal distribution.
The fixed effect (see summary(fixed1)
output) shows a huge t value and the
random effect estimates are near zero
(you'll see this message during the
fitting process too)
Warning message:
In .local(x, ..., value) :
Estimated variance for factor ?subject? is effectively zero
All of this suggests that a fixed effect (covariate 'fixed')
is appropriate to explain the structure
in the data, and subject need not be modeled
as a random effect. A linear model without
random effects would suffice here.
Now to run John Maindonald's model
with subjects being drawn from a
normal distribution:
### -----
set.seed(123)
test2s <- data.frame(subject = rep(seq(Nsubj * Ngrp), each = NsubjRep),
response=500 + rep(rnorm(Nsubj * Ngrp), each = NsubjRep) + rnorm(Nsubj * Ngrp * NsubjRep, 0, 10),
fixed=(rep(c("A","B"),each=Nsubj * NsubjRep)))
null2 <- lmer(response~(1|subject),test2s)
fixed2 <- lmer(response~fixed+(1|subject),test2s)
summary(null2)
summary(fixed2)
quartz() ## substitute your favourite plot device
par(mfrow = c(2, 1))
hist(ranef(null2)[[1]][,1])
hist(ranef(fixed2)[[1]][,1])
quartz() ## substitute your favourite plot device
par(mfrow = c(2, 1))
qqnorm(ranef(null2)[[1]][,1])
qqline(ranef(null2)[[1]][,1])
qqnorm(ranef(fixed2)[[1]][,1])
qqline(ranef(fixed2)[[1]][,1])
### -----
Now the histogram of null2 BLUPs looks
like a unimodal density, and the
QQ plot suggests BLUPs are closer
to normal. This is much more in line
with the situation that random effects
components were designed to handle.
The fixed effect in fixed2 (see
summary(fixed2) output) does not
contribute to the model fit, as indicated
by the small t-value, and the random effect
estimates are not all small numbers near
zero. A random effects model with subject
as a random effect is appropriate
here.
So this is a partial look into the
difference between fixed and random
effects.
A covariate that is more 'degenerate',
i.e. takes on a few distinct values, so
has a 'spiky' distribution, is often better
modeled as a fixed effect.
When a covariate really can be imagined
to be a draw from some large non-degenerate
or non-spiky distribution, and you are
not really interested in the individual
values it can assume, but you want to
'model out' its effect (so its effect doesn't
cause some other covariate estimate to be
dragged off course) you can save
model degrees of freedom by modeling
that covariate as a random effect.
Comparing likelihoods and other
statistics derived therefrom will
guide you in the right direction,
as these simulations illustrate.
The likelihoods do 'depend' on
the structure in the data.
Statistics work!
HTH
Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: smckinney +at+ bccrc +dot+ ca
tel: 604-675-8000 x7561
BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3
Canada
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org on behalf of John Maindonald
Sent: Tue 8/19/2008 5:25 PM
To: Daniel Ezra Johnson
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] understanding log-likelihood/model fit
On 20 Aug 2008, at 10:19 AM, Daniel Ezra Johnson wrote:
Thanks for your reply.
What you observe has everything to do with 'the extreme non-normality of the random effects in "null" as opposed to "fixed"'.
I thought so too, but it's not correct. The log-likelihood of the mixed-effects model does not appear to depend on how nearly normal, or not, the random effect BLUPs are.
Let's see your example. If you change the data, however, you will change the log-likelihood. John Maindonald email: john.maindonald at anu.edu.au phone : +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for Mathematics & Its Applications, Room 1194, John Dedman Mathematical Sciences Building (Building 27) Australian National University, Canberra ACT 0200.
That was what led to my original question. When we have a random effect (subject) and a between-group fixed effect (an "outer" effect), logically they are competing to account for the same variance. The mixed model fitting appears to "prefer" the model with the fixed effect, even though the model with no fixed effect appears to fit the data equally well (judging by residuals and the fact that the BLUPs are exactly what they 'should be', no shrinkage here). I am comfortable with this result, indeed my work depends on it, but I want to understand better why it comes out this way. This has nothing to do with practical research design questions. D
The distribution at the subject level is nowhere near normal.
The fixed effects account for a rather severe departure from
normality.
For this reason, the between subjects components of variance
estimate is radically different between the two models. Also the
subjects random effects are radically different. Try
znull <- ranef(null, drop=TRUE)
zfix <- ranef(fixed, drop=TRUE)
plot(zfix$subject ~ znull$subject)
The residuals for the two models are simiiar, but not at all the
same.
The residuals from the null model are surprisingly close to normal.
[Actually, a criticism of the Pinheiro and Bates book is that it
relies
far too heavily on plots of residuals for its diagnostic checking.]
Think about how the results might be used. The analyses that you
give seem to imply that the existence of the two groups is hidden
from you. You want to generalize to other subjects (the usual
situation), so that fitting subject as a fixed effect would defeat
the
whole aim of the enterprise. So, unless you learn of the existence
of the two groups from initial exploratory analysis of the data (you
darn well should), you have to do the null analysis.
At this late stage you might examine the random subject effects
and belatedly conclude that you have been laboring under a gross
misapprehension. Or you might charge blindly ahead and use the
model for predictive purposes. In the latter case, the accuracy will
be terrible (high SE), but, hey, you can make predictions on what to
expect from another subject or group of subjects.
There are strong reasons for trying to ensure that models (1) are
somewhat near correct, and (2) that they model aspects of the data
that reflect the intended use of the results.
Now try
test2 <- data.frame(subject=rep(1:200,each=500),
response=500+rep(rnorm(200), each=500)+rnorm(100000,0,10),
fixed=(rep(c("A","B"),each=50000)))
lmer(response~ (1|subject), data=test2)
lmer(response~ fixed+(1|subject), data=test2)
Wrestling with such simulated data is a good way to gain
understanding and tune one's intuitive processes.
By the way, I believe you should be fitting by maximum likelihood
(ML), for comparing models with different fixed effects.
John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 20 Aug 2008, at 6:47 AM, Daniel Ezra Johnson wrote:
Dear All,
I'm sure this is a simple question, but I haven't been able to
find an
answer to it that I can understand. I'd like an answer that is
pitched
somewhere between the full mathematics on the one hand, and an
oversimplified overview on the other.
One way of putting the question is, what is the difference, really,
between a fixed and a random effect (as they are fit by lmer)?
Another way of putting it involves the following example.
Suppose we have observations of a response from many subjects.
The overall average response is 500.
The subjects fall into two groups.
Half have an effect of +100 and half an effect of -100.
test1 <- data.frame(subject=rep(1:200,each=500),
response=500+c(rep(-100,50000),rep(100,50000))
+rnorm(100000,0,10),
fixed=(rep(c("A","B"),each=50000)))
The following model treats subject as a random effect:
null <- lmer(response~(1|subject),test1)
The following model keeps the subject effect and adds the fixed effect.
fixed <- lmer(response~fixed+(1|subject),test1)
null
Linear mixed model fit by REML
Formula: response ~ (1 | subject)
Data: test1
AIC BIC logLik deviance REMLdev
746923 746951 -373458 746923 746917
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 10041.81 100.209
Residual 100.46 10.023
Number of obs: 100000, groups: subject, 200
Fixed effects:
Estimate Std. Error t value
(Intercept) 500.000 7.086 70.56
fixed
Linear mixed model fit by REML
Formula: response ~ fixed + (1 | subject)
Data: test1
AIC BIC logLik deviance REMLdev
743977 744015 -371984 743960 743969
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 0.016654 0.12905
Residual 99.642120 9.98209
Number of obs: 100000, groups: subject, 200
Fixed effects:
Estimate Std. Error t value
(Intercept) 400.11806 0.04647 8610
fixedB 199.87485 0.06572 3041
The result is what one would expect, intuitively.
In the model "null" there is a large subject variance.
In the model "fixed" there is virtually no subject variance.
In both models the residuals are the same.
The logLik of the model with the fixed effect is closer to zero
(by about
1500).
Therefore, we say the model with the fixed effect fits better.
This makes sense. Instead of 100 subject effects near +100 and 100
near -100, we have virtually no subject effects and the fixed effect
accounts for all the between-subject variance.
The question: why? Why does the model with the fixed effect fit
better?
Why does the smaller (zero) random effect plus the fixed effect
translate into an improvement in log-likelihood?
It's not anything to do with the residuals. The models make the same
predictions:
fitted(null)[c(1:5,50001:50005)]
[1] 400.2807 400.2807 400.2807 400.2807 400.2807 600.2013 600.2013 600.2013 [9] 600.2013 600.2013
fitted(fixed)[c(1:5,50001:50005)]
[1] 400.1282 400.1282 400.1282 400.1282 400.1282 599.9839 599.9839 599.9839 [9] 599.9839 599.9839 And I don't think it has anything to do with the extreme non- normality of the random effects in "null" as opposed to "fixed". So what's the difference? What, in terms of model fitting, makes it preferable to account for the between-subject variation with a fixed effect (as in "fixed") rather than with a random effect (as in "null")? Thanks for your help, Daniel
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models