Dear list,
I am unsure how to structure my model. I have tried something that seems to make sense, but I am not sure I am interpreting it correctly.
I have a continuous response variable: the observed quantity of evolutionary history (EH).
I also have a number of species with a hierarchical structure (Genus, Family, etc.).
My research question is: do certain families have significantly higher (or lower) EH values than the others?
Reproducible example:
example <- structure(list(Family = structure(c(2L, 1L, 1L, 5L, 7L, 7L, 3L,
4L, 6L, 6L, 1L, 3L), .Label = c("Araceae", "Asphodelaceae", "Bromeliaceae",
"Cyperaceae", "Orchidaceae", "Poaceae", "Zingiberaceae"), class = "factor"),
Genus = structure(c(3L, 4L, 4L, 1L, 6L, 6L, 2L, 5L, 8L, 9L,
4L, 7L), .Label = c("Acianthera", "Aechmea", "Aloe", "Anthurium",
"Bulbostylis", "Hedychium", "Lindmania", "Psathyrostachys",
"Sesleria"), class = "factor"), Species = structure(c(9L,
1L, 10L, 11L, 7L, 4L, 3L, 5L, 8L, 2L, 6L, 12L), .Label = c("bonplandii",
"coerulans", "cymosopaniculata", "elatum", "emmerichiae",
"gehrigeri", "glabrum", "juncea", "pubescens", "sagittatum",
"scalpricaulis", "sessilis"), class = "factor"), EH = c(8.746525,
24.462699, 33.03942, 32.719489, 13.598201, 13.598201, 13.164928,
9.339228, 9.69705, 13.478372, 37.497137, 59.562911)), .Names = c("Family",
"Genus", "Species", "EH"), class = "data.frame", row.names = c(NA,
-12L))
#My model
test <- lm(EH~Family, data = example)
# In this small example no families are significant, but if one were, would that mean it has significantly more EH than the others?
Thanks
Chris
Advice in model construction
2 messages · Chris Mcowen, Weidong Gu
1 day later
Hi Chris,
A linear regression model with a categorical predictor is equivalent to an ANOVA. In lm, each coefficient compares one factor level with the reference level (by default the first level of Family, here Araceae). A significant coefficient therefore means that family differs significantly from the reference family, not from all the others, and it indicates that the Family samples did not all come from the same population. If you want pairwise comparisons, you need a post-hoc testing method (see ?TukeyHSD).
HTH,
Weidong Gu
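A minimal sketch of the points in this reply, using the Family and EH columns from the original post (rebuilt here in compact form so the snippet runs on its own). The object names fit_aov, tukey and test_sum are illustrative only. The last step, sum-to-zero contrasts, is an assumption about what "higher than the others" might mean (each family against the unweighted mean of the family means); it was not suggested in the thread.

```r
# Data from the original post (Family and EH only)
example <- data.frame(
  Family = factor(c("Asphodelaceae", "Araceae", "Araceae", "Orchidaceae",
                    "Zingiberaceae", "Zingiberaceae", "Bromeliaceae",
                    "Cyperaceae", "Poaceae", "Poaceae", "Araceae",
                    "Bromeliaceae")),
  EH = c(8.746525, 24.462699, 33.03942, 32.719489, 13.598201, 13.598201,
         13.164928, 9.339228, 9.69705, 13.478372, 37.497137, 59.562911)
)

test <- lm(EH ~ Family, data = example)

# Overall F-test: is there any evidence that the family means differ?
anova(test)

# The intercept is the mean of the reference level (the first factor level);
# every other coefficient is that family's mean minus the reference mean.
levels(example$Family)[1]
summary(test)$coefficients

# All pairwise family comparisons with Tukey's adjustment
fit_aov <- aov(EH ~ Family, data = example)
tukey <- TukeyHSD(fit_aov, "Family")
tukey

# Assumption: compare each family with the unweighted mean of family means
# by using sum-to-zero contrasts. The last level gets no coefficient of its own.
test_sum <- lm(EH ~ Family, data = example,
               contrasts = list(Family = "contr.sum"))
summary(test_sum)$coefficients
```

With only 12 observations across 7 families (several with a single species), these tests have very little power, so the output is mainly useful for seeing which comparison each row of the table represents.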
On Wed, Oct 5, 2011 at 1:55 PM, Chris Mcowen <chrismcowen at gmail.com> wrote:
> [original message quoted in full; snipped]