-----Oorspronkelijk bericht-----
Van: Peter Francis [mailto:peterfrancis at me.com]
Verzonden: dinsdag 3 augustus 2010 11:27
Aan: ONKELINX, Thierry
Onderwerp: Re: [R-sig-ME] Multi-level qualitative
(fixed-effects) factors
Dear Thierry,
I thought that maybe the case! Thank you very much.
I am relatively new too this field, can a Tukey test be
implemented in R and at what point should i do it? Is there a
sample workflow?
i want to be able conclude for example that - species from
habitat 1 are more threatened than those from habitat 2 etc.
I have multiple factors with multiple levels so by the end i
would like a profile of what a threatened species "looks"
like compared to a non threatened.
I am sorry for the questions and appreciate any advice given.
Peter
On 3 Aug 2010, at 10:18, ONKELINX, Thierry wrote:
Dear Peter,
Your inference in not correct. The significance you get from
habitat2 is only the comparison between habitat2 and the
reference level (habitat1?). You can conclude that only
habitat2 is significantly different from habitat1 and habitat
3 vs 1 and habitat 4 vs 1 is not significant. You cannot
compare 2 vs 3, 2 vs 4 and 3 vs 4. That would require
multiple comparisons (cfr Tukey).
Best regards,
Thierry
--------------------------------------------------------------
----------
----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium
Research Institute for Nature and Forest team Biometrics &
Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may
be no more than asking him to perform a post-mortem
examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an
answer does not ensure that a reasonable answer can be
extracted from a given body of data.
~ John Tukey
-----Oorspronkelijk bericht-----
Van: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] Namens
Verzonden: dinsdag 3 augustus 2010 10:53
Aan: R-mixed models mailing list
Onderwerp: [R-sig-ME] Multi-level qualitative
Sorry to everyone,
I guess i should rephrase this completely as i think i am
across what i want to ( my fault, i apologise for the spam!).
I am trying to discover which traits correlate with
(binary response- i do have different levele of threat but
how to model continuos response variables in GLMM - quasi poisson?)
I shall work through a quick example and would appreciate it if you
could tell me if my conclusions are justified.
Lets just say for ease i am have two models - i can send
str(traits) etc if required -
model1 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | a/b) + habit AIC BIC
1406 1436 -696.9 1394
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 6.9892e-01 8.3602e-01
order (Intercept) 4.2292e-14 2.0565e-07
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.04803 0.19174 -0.250 0.80219
habit2 1.10627 0.41607 2.659 0.00784 **
habit3 0.92578 0.78141 1.185 0.23611
habit4 0.14383 0.38477 0.374 0.70856
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This shows that with habit the model has a AIC value of 1406 and
habit2 has the significant effect on the threat status of a species?
Now model 2
model2 <-lmer(threatornot~1+(1|a/b) + habit +
breedingsystem, family=binomial)
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | a/b) + habit + breedingsystem AIC
BIC logLik deviance
1408 1449 -696.2 1392
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 0.67836 0.82363
order (Intercept) 0.00000 0.00000
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.17398 0.39138 0.444 0.65665
habit2 1.15687 0.41943 2.758 0.00581 **
habit3 0.92017 0.78158 1.177 0.23907
habit4 -0.01673 0.43150 -0.039 0.96907
breedingsystem2 -0.18615 0.37849 -0.492 0.62285
breedingsystem3 -0.42513 0.40652 -1.046 0.29567
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This shows that adding breeding system to the model doesn't fit the
data any better (the AIC is higher), therefore
significant effect on the threat status of a species
Under this simple example i can conclude that the habit of
is significant in determining how threatened the species is and in
particular species from habit2 are more threatened than
habit2.
Are these inferences justified or am i missing something?
Thanks
On 3 Aug 2010, at 08:50, Andrew Dolman wrote:
I don't get it. How can you fit the model with just 1 of
of factor "habitat" and have the same number of
you run the model with all three? (It must have at least 2
fit anyway) Also, in the first example you have 4 levels of habitat.
Are they different levels of habitat resolution? e.g.
Aquatic - non aquatic
Aquatic - Terrestrial - Epiphytic
Aquatic - Terrestrial - Epiphytic - Up elephant's noses
Please read the posting guide and include proper examples
are doing and what the data look like.
andydolman at gmail.com
On 3 August 2010 08:48, Peter Francis <peterfrancis at me.com> wrote:
Hi David and Ben, thanks for your help -
I was worried this would make little sense!
I have set out my candidate models
A+B+C+D
B+C+D
A+C+D
etc etc
And am running through them in lmer. Factor A for instance
is Habit, which takes 3 forms - aquatic, terrestrial or epiphyte.
When i run the model with A as a factor i get the breakdown
of the individual levels habitat 1, habitat 2 and habitat 3 and a
corresponding AIC score. However if i just run it with habitat 3 -
aquatic - i get a lower AIC score, therefore the model fits
I am unsure how to, without splitting my factors into their
constituent levels at the beginning - A1+A2+A3 + B1 + B2
at the model with the lowest AIC?
Thanks
Peter
On 3 Aug 2010, at 00:16, David Duffy wrote:
On Mon, 2 Aug 2010, Peter Francis wrote:
I have many multi level factors i.e habit - aquatic,
terrestrial, epiphyte etc
I ran the model with habit as a factor
model111 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | order/family) + habit AIC BIC
logLik deviance
1406 1436 -696.9 1394
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 6.9892e-01 8.3602e-01
order (Intercept) 4.2292e-14 2.0565e-07
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.04803 0.19174 -0.250 0.80219
habit2 1.10627 0.41607 2.659 0.00784 **
habit3 0.92578 0.78141 1.185 0.23611
habit4 0.14383 0.38477 0.374 0.70856
---
Which had a AIC of 1406
I then re-ran the model with only aquatic and got a lower
AIC value - which i guess is to be expected as aquatic is highly
significant and aquatic species are more prone to threat ( my
response).
model112 <-lmer(threatornot~1+(1|a/b) + aquatic,
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | order/family) + aquatic AIC BIC
logLik deviance
1395 1415 -693.4 1387
Random effects:
Groups Name Variance Std.Dev.
family:order (Intercept) 0.60007 0.77464
order (Intercept) 0.00000 0.00000
Number of obs: 1116, groups: family:order, 43; order, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.1572 0.1827 0.860 0.389613
aquatic -0.6683 0.1737 -3.847 0.000119 ***
My question is - when i developed the candidate models i
thought using multilevel factors would be OK and i would be able to
tease out the individual levels. If i split the factors into levels
from the beginning then i am left with a huge amount of candidate
models? This would not be a problem in stepwise regression
just remove the habit with the least significant P Value.
If i remove habits i "feel" are unimportant from the
beginning i feel i would be limiting the model too much.
I don't understand at all, I'm afraid. Is aquatic the same
as habit=2, or something? If so, there is something funny
If family and order are "nuisance" variables, then a
stepwise approach is quite reasonable (if you are someone
stepwise regression is reasonable, of course ;)).
Just 2c, David Duffy.
--
| David Duffy (MBBS PhD)
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax:
| Epidemiology Unit, Queensland Institute of Medical
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG