Multi-level qualitative (fixed-effects) factors
I don't get it. How can you fit the model with just one of the three levels of the factor "habitat" and have the same number of observations as when you run the model with all three? (It must have at least 2 levels to fit anyway.)

Also, in the first example you have 4 levels of habitat. Are they different levels of habitat resolution? e.g.

Aquatic - non aquatic
Aquatic - Terrestrial - Epiphytic
Aquatic - Terrestrial - Epiphytic - Up elephant's noses

Please read the posting guide and include proper examples of what you are doing and what the data look like.

andydolman at gmail.com
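One possible explanation for the identical number of observations, sketched in R under assumptions: if `aquatic` is a 0/1 indicator column rather than a subset of the data, every row is kept in both fits. The toy data frame and its column names below are hypothetical, since the real data were not posted.

```r
# Hypothetical toy data; the real data set was not posted.
d <- data.frame(
  habit = factor(c("aquatic", "terrestrial", "epiphyte",
                   "aquatic", "terrestrial", "epiphyte"))
)

# A 0/1 indicator derived from the factor keeps every row.
d$aquatic <- as.integer(d$habit == "aquatic")

# Levels of the factor (alphabetical by default; the first is the baseline).
levels(d$habit)

# Cross-tabulation showing how the indicator maps onto the factor levels.
table(d$habit, d$aquatic)

# Treatment-contrast design matrix: an intercept plus one column per
# non-baseline level, so a k-level factor uses k - 1 coefficients.
X <- model.matrix(~ habit, data = d)
colnames(X)
nrow(X) == nrow(d)
```

A cross-tabulation like this would also show directly which numbered level (habit2, habit3, ...) corresponds to which habitat.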
On 3 August 2010 08:48, Peter Francis <peterfrancis at me.com> wrote:
Hi David and Ben,

Thanks for your help - I was worried this would make little sense!

I have set out my candidate models:

A+B+C+D
B+C+D
A+C+D
etc.

and am running through them in lmer. Factor A, for instance, is habit, which takes 3 forms: aquatic, terrestrial or epiphyte. When I run the model with A as a factor I get the breakdown of the individual levels (habitat 1, habitat 2 and habitat 3) and a corresponding AIC score. However, if I just run it with habitat 3 (aquatic) I get a lower AIC score; does that mean the model fits the data better?

I am unsure how to arrive at the model with the lowest AIC without splitting my factors into their constituent levels at the beginning (A1 + A2 + A3 + B1 + B2 etc.).

Thanks,
Peter

On 3 Aug 2010, at 00:16, David Duffy wrote:
On Mon, 2 Aug 2010, Peter Francis wrote:

I have many multi-level factors, e.g. habit: aquatic, terrestrial, epiphyte etc. I ran the model with habit as a factor:
model111 <-lmer(threatornot~1+(1|a/b) + habit, family=binomial)
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | order/family) + habit
  AIC  BIC logLik deviance
 1406 1436 -696.9     1394
Random effects:
 Groups       Name        Variance   Std.Dev.
 family:order (Intercept) 6.9892e-01 8.3602e-01
 order        (Intercept) 4.2292e-14 2.0565e-07
Number of obs: 1116, groups: family:order, 43; order, 9

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.04803    0.19174  -0.250  0.80219
habit2       1.10627    0.41607   2.659  0.00784 **
habit3       0.92578    0.78141   1.185  0.23611
habit4       0.14383    0.38477   0.374  0.70856
---

This had an AIC of 1406. I then re-ran the model with only aquatic and got a lower AIC value, which I guess is to be expected, as aquatic is highly significant and aquatic species are more prone to threat (my response).
model112 <-lmer(threatornot~1+(1|a/b) + aquatic, family=binomial)
model112
Generalized linear mixed model fit by the Laplace approximation
Formula: threatornot ~ 1 + (1 | order/family) + aquatic
  AIC  BIC logLik deviance
 1395 1415 -693.4     1387
Random effects:
 Groups       Name        Variance Std.Dev.
 family:order (Intercept) 0.60007  0.77464
 order        (Intercept) 0.00000  0.00000
Number of obs: 1116, groups: family:order, 43; order, 9

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.1572     0.1827   0.860 0.389613
aquatic      -0.6683     0.1737  -3.847 0.000119 ***
My question is this: when I developed the candidate models I thought using multi-level factors would be OK and that I would be able to tease out the individual levels. If I split the factors into levels from the beginning, I am left with a huge number of candidate models. This would not be a problem in stepwise regression, as I could just remove the habit with the least significant p-value. If I remove habits I "feel" are unimportant from the beginning, I feel I would be limiting the model too much. I hope this makes sense!
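The comparison described above, a full multi-level factor against a single collapsed indicator, can be sketched in R as follows. This is only an illustration under loud assumptions: the data are simulated, the level "other" is a hypothetical fourth habitat, and plain `glm` is used so the example is self-contained, which means the random effects `(1 | order/family)` from the original models are left out.

```r
set.seed(1)
n <- 400

# Simulated, hypothetical data: four habitat levels, binary response.
habit <- factor(sample(c("terrestrial", "aquatic", "epiphyte", "other"),
                       n, replace = TRUE))

# Assumption for the simulation: only the aquatic level shifts the log-odds.
eta <- ifelse(habit == "aquatic", 1, 0)
threat <- rbinom(n, 1, plogis(eta))
d <- data.frame(threat, habit)

# Collapse the factor: aquatic versus everything else.
d$aquatic <- as.integer(d$habit == "aquatic")

full      <- glm(threat ~ habit,   family = binomial, data = d)
collapsed <- glm(threat ~ aquatic, family = binomial, data = d)

# Both fits use the same rows, so their AIC values are comparable.
AIC(full, collapsed)

# The collapsed model is nested in the full one (it constrains the
# non-aquatic levels to be equal), so a likelihood-ratio test applies.
anova(collapsed, full, test = "Chisq")
```

The collapsed model spends one coefficient where the full factor spends three, so it can have a lower AIC whenever the extra levels add little to the likelihood. Splitting every factor into all of its possible groupings remains a modelling choice that multiplies the number of candidate models.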
I don't understand at all, I'm afraid. Is aquatic the same as habit=2, or something? If so, there is something funny about the model fits.

If family and order are "nuisance" variables, then a stepwise approach is quite reasonable (if you are someone who thinks stepwise regression is reasonable, of course ;)).

Just 2c, David Duffy.

--
| David Duffy (MBBS PhD)
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101
| Epidemiology Unit, Queensland Institute of Medical Research
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models