-----Oorspronkelijk bericht-----
Van: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] Namens Peter Francis
Verzonden: dinsdag 3 augustus 2010 10:53
Aan: R-mixed models mailing list
Onderwerp: [R-sig-ME] Multi-level qualitative (fixed-effects) factors
Sorry to everyone,
I guess i should rephrase this completely as i think i am not
getting across what i want to ( my fault, i apologise for the spam!).
I am trying to discover which traits correlate with
threatened or not (binary response- i do have different
levele of threat but am unsure how to model continuos
response variables in GLMM - quasi poisson?)
I shall work through a quick example and would appreciate it
if you could tell me if my conclusions are justified.
Lets just say for ease i am have two models - i can send
str(traits) etc if required -
model1 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | a/b) + habit AIC BIC
1406 1436 -696.9 1394
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 6.9892e-01 8.3602e-01
order (Intercept) 4.2292e-14 2.0565e-07
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.04803 0.19174 -0.250 0.80219
habit2 1.10627 0.41607 2.659 0.00784 **
habit3 0.92578 0.78141 1.185 0.23611
habit4 0.14383 0.38477 0.374 0.70856
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This shows that with habit the model has a AIC value of 1406
and habit2 has the significant effect on the threat status of
a species?
Now model 2
model2 <-lmer(threatornot~1+(1|a/b) + habit +
breedingsystem, family=binomial)
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | a/b) + habit + breedingsystem
AIC BIC logLik deviance
1408 1449 -696.2 1392
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 0.67836 0.82363
order (Intercept) 0.00000 0.00000
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.17398 0.39138 0.444 0.65665
habit2 1.15687 0.41943 2.758 0.00581 **
habit3 0.92017 0.78158 1.177 0.23907
habit4 -0.01673 0.43150 -0.039 0.96907
breedingsystem2 -0.18615 0.37849 -0.492 0.62285
breedingsystem3 -0.42513 0.40652 -1.046 0.29567
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This shows that adding breeding system to the model doesn't
fit the data any better (the AIC is higher), therefore
breedingsystem has no significant effect on the threat status
of a species
Under this simple example i can conclude that the habit of a
species is significant in determining how threatened the
species is and in particular species from habit2 are more
threatened than those not from habit2.
Are these inferences justified or am i missing something?
Thanks
On 3 Aug 2010, at 08:50, Andrew Dolman wrote:
I don't get it. How can you fit the model with just 1 of three levels
of factor "habitat" and have the same number of observations as when
you run the model with all three? (It must have at least 2 levels to
fit anyway) Also, in the first example you have 4 levels of habitat.
Are they different levels of habitat resolution? e.g.
Aquatic - non aquatic
Aquatic - Terrestrial - Epiphytic
Aquatic - Terrestrial - Epiphytic - Up elephant's noses
Please read the posting guide and include proper examples of what you
are doing and what the data look like.
andydolman at gmail.com
On 3 August 2010 08:48, Peter Francis <peterfrancis at me.com> wrote:
Hi David and Ben, thanks for your help -
I was worried this would make little sense!
I have set out my candidate models
A+B+C+D
B+C+D
A+C+D
etc etc
And am running through them in lmer. Factor A for instance
is Habit, which takes 3 forms - aquatic, terrestrial or epiphyte.
When i run the model with A as a factor i get the breakdown
of the individual levels habitat 1, habitat 2 and habitat 3
and a corresponding AIC score. However if i just run it with
habitat 3 - aquatic - i get a lower AIC score, therefore the
model fits the data better?
I am unsure how to, without splitting my factors into their
constituent levels at the beginning - A1+A2+A3 + B1 + B2 etc,
arrive at the model with the lowest AIC?
Thanks
Peter
On 3 Aug 2010, at 00:16, David Duffy wrote:
On Mon, 2 Aug 2010, Peter Francis wrote:
I have many multi level factors i.e habit - aquatic,
terrestrial, epiphyte etc
I ran the model with habit as a factor
model111 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | order/family) + habit
AIC BIC logLik deviance
1406 1436 -696.9 1394
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 6.9892e-01 8.3602e-01
order (Intercept) 4.2292e-14 2.0565e-07
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.04803 0.19174 -0.250 0.80219
habit2 1.10627 0.41607 2.659 0.00784 **
habit3 0.92578 0.78141 1.185 0.23611
habit4 0.14383 0.38477 0.374 0.70856
---
Which had a AIC of 1406
I then re-ran the model with only aquatic and got a lower
AIC value - which i guess is to be expected as aquatic is
highly significant and aquatic species are more prone to
threat ( my response).
model112 <-lmer(threatornot~1+(1|a/b) + aquatic, family=binomial)
model112
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | order/family) + aquatic
AIC BIC logLik deviance
1395 1415 -693.4 1387
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 0.60007 0.77464
order (Intercept) 0.00000 0.00000
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.1572 0.1827 0.860 0.389613
aquatic -0.6683 0.1737 -3.847 0.000119 ***
My question is - when i developed the candidate models i
thought using multilevel factors would be OK and i would be
able to tease out the individual levels. If i split the
factors into levels from the beginning then i am left with a
huge amount of candidate models? This would not be a problem
in stepwise regression as i could just remove the habit with
the least significant P Value.
If i remove habits i "feel" are unimportant from the
beginning i feel i would be limiting the model too much.
I don't understand at all, I'm afraid. Is aquatic the same
as habit=2, or something? If so, there is something funny
about the model fits.
If family and order are "nuisance" variables, then a
stepwise approach is quite reasonable (if you are someone who
thinks stepwise regression is reasonable, of course ;)).
Just 2c, David Duffy.
--
| David Duffy (MBBS PhD)
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax:
| Epidemiology Unit, Queensland Institute of Medical
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG