random factor and error messages in the model fitting
3 messages · Alex Fajardo, Martin Henry H. Stevens, John Maindonald
Hi Alex,
On Apr 2, 2008, at 10:17 AM, Alex Fajardo wrote:
Dear r-sig-mixed-models list members,

I am new to the world of mixed-effects models and am learning from Faraway's and Pinheiro & Bates' books, and also from this list. I have three very straightforward questions, but first a brief summary of my analysis objectives.

I am trying to analyze my data with mixed-effects models in which the fixed factor is the altitudinal transect (4 transects, where T4 is treeline, and T1, T2 and T3 are below treeline). In each altitudinal transect I collected tissue samples from trees of different age classes (a categorical variable with 4 levels: I, II, III and IV). The main objective is to compare the specific leaf area (SLA) of the treeline trees with that of trees in the lower-elevation transects, taking into account the age class of each tree. The data set is very unbalanced, and after reading similar papers I concluded that age class should be treated as a random factor nested in transects.

*Question 1*: am I correct to treat age class as a random factor nested within transect? If so, how should the model be coded? My suggestion is:
sla.termas = lmer(SLA ~ 1 + Transect + (1|Transect:Age),
                  data=Treeline[Site=="TermasChillan",], na.action=na.omit)
or
sla.termas = lmer(SLA ~ 1 + Transect + (Transect|Age),
                  data=Treeline[Site=="TermasChillan",], na.action=na.omit)
but I am not sure.
I would assume that Age is NOT nested; if it is, you are saying that the effect of age could depend entirely on which transect you look at. Rather, I assume different ages simply have different responses, i.e., (1|Age). However, I would think that a fixed-effects model is just as useful:

lm( SLA ~ Transect + Age + Age:Transect)

I am not sure why these altitudes or age classes are considered a random draw from a large number of such classes that you know little about. They seem entirely repeated, and usefully so.

My two cents,
Hank
In my learning process I followed the examples in Faraway's book and, just for learning purposes, fitted my model the way he does (treating both factors as random). For some variables I got the following warning message:

sla.termas = lmer(SLA ~ 1 + (1|Transect) + (1|Transect:Age),
                  data=Treeline[Site=="TermasChillan",], na.action=na.omit)
*Warning message: In .local(x, ..., value) : Estimated variance-covariance for factor 'Age' is singular*

This is not good when I try to test the significance of the variation among ages by ANOVA, since I get Pr(>Chisq)=1; something must be wrong. This happens with some variables but not with all of them, which seems strange.

*Question 2*: any idea why this happens? What am I doing wrong?

When the model does run (without the warning, and only for some other variables) and I compare it with a reduced version (without the nested random factor, e.g., age class), I run anova and get a p-value. As suggested by Faraway, I should also compute a p-value by simulating the LRT 1000 times (less conservative, cf. Faraway), and most of the time I get the following message:
lrstat = numeric(1000)
for(i in 1:1000){
  rSLA = unlist(simulate(sla.termas2))
  nmod = lmer(rSLA~1+(1|Transect),
              data=Treeline[Site=="TermasChillan",], na.action=na.omit)
  amod = lmer(rSLA~1+(1|Transect)+(1|Transect:Age),
              data=Treeline[Site=="TermasChillan",], na.action=na.omit)
  lrstat[i] = 2*(logLik(amod)-logLik(nmod))
}
*Error in model.frame.default(data = Treeline[Site == "TermasChillan", :
  variable lengths differ (found for 'Transect')*
*Question 3*: any idea why this happens? What am I doing wrong? I made an artificial balanced data set to see whether the imbalance was responsible for this message, but it was not.

I am new to the world of mixed-effects models but I want to learn; your comments and advice will be greatly appreciated.

Cheers,
--
Alex Fajardo, PhD
Investigador Asociado
Centro de Investigación en Ecosistemas de la Patagonia
Bilbao 449. Coyhaique, CHILE
Telefonos: 56-67-244503; (56) 8-4506354
Fax: 56-67-244501
alex.fajardo at ciep.cl
Dr. Hank Stevens, Associate Professor 338 Pearson Hall Botany Department Miami University Oxford, OH 45056 Office: (513) 529-4206 Lab: (513) 529-4262 FAX: (513) 529-4243 http://www.cas.muohio.edu/~stevenmh/ http://www.cas.muohio.edu/ecology http://www.muohio.edu/botany/ "If the stars should appear one night in a thousand years, how would men believe and adore." -Ralph Waldo Emerson, writer and philosopher (1803-1882)
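One plausible reading of the "variable lengths differ" error in Question 3, offered here as a hedged sketch rather than a confirmed diagnosis: simulate() returns one value per row actually used in the fit, so if na.action=na.omit dropped any rows, the free-standing vector rSLA is shorter than the columns of the data frame passed as data=. The example below reproduces that mismatch on made-up data. The data frame `toy` and its values are assumptions for illustration, and lm() is swapped in for lmer() so the example runs with base R only; the length check happens in model.frame() either way.

```r
# Hypothetical toy data standing in for the Treeline subset; two SLA values are missing.
set.seed(42)
toy <- data.frame(
  SLA      = c(rnorm(14, mean = 10), NA, NA),
  Transect = factor(rep(c("T1", "T2", "T3", "T4"), times = 4))
)

# na.omit drops the two incomplete rows before fitting.
fit <- lm(SLA ~ Transect, data = toy, na.action = na.omit)

# simulate() returns one value per row used in the fit, not per row of `toy`.
rSLA   <- unlist(simulate(fit))
n_sim  <- length(rSLA)  # rows used in the fit
n_data <- nrow(toy)     # rows in the data frame

# Refitting with the free-standing rSLA against the full data frame mixes the two lengths.
bad <- try(lm(rSLA ~ Transect, data = toy, na.action = na.omit), silent = TRUE)
bad_failed <- inherits(bad, "try-error")

# One way around it: keep only the complete rows and store the simulated
# response as a column of that same data frame.
toy_cc <- toy[complete.cases(toy[, c("SLA", "Transect")]), ]
toy_cc$rSLA <- unlist(simulate(fit))
good <- lm(rSLA ~ Transect, data = toy_cc)
```

Applied to the loop in the question, the same idea would mean building the complete-case subset once, outside the loop, and assigning each simulated response into it before refitting. Current versions of lme4 also provide refit(), which refits a model to a new response vector.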
I'd expect that you want to generalize to a different choice of
transects within each altitude and age-class. Were there multiple
transects for each altitude and age-class combination? If not, the
factor Transect is trying to do two things at once -- account for the
fixed effect of altitude, and account for the random effect of transect.
Two analyses are possible with the data that you seem to have:
A) lm( SLA ~ Transect + Age + Age:Transect)
The inferences generalize to a different choice of tissue samples
within those same Age and Transect combinations. When a prediction is
made, you have to say which Age and Transect combination you have in
mind, and inferences apply to the particular Transects that were taken.
B)
lmer(SLA ~ Age + Transect + (1|Transect:Age),
data=Treeline[Site=="TermasChillan",], na.action=na.omit)
This treats variation between Age:Transect combinations as the
relevant measure of error, hoping that this will be much the same as
the error that you'd get from different transects within Altitude:Age
combinations. If there is an Altitude:Age interaction, it may
overestimate the error.
[If you do happen to have multiple transects for each Age:Transect
combination, you'd want something like:
lmer(SLA ~ Age*Altitude+ (1|transect/Age),
data=Treeline[Site=="TermasChillan",], na.action=na.omit)
(the error term needs to identify individual transect*Age
combinations) ]
NB also, you might want to try a non-linear term in Age in the fixed
part of the model.
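
As a rough illustration of analysis A, the sketch below fits the fixed-effects model to made-up unbalanced data, tests the Age:Transect interaction, and makes a prediction for one named Age and Transect combination. The data frame `toy`, its cell sizes, and the simulated SLA values are all assumptions for illustration; none of them come from the Treeline data.

```r
# Hypothetical unbalanced data: 4 transects x 4 age classes with unequal cell sizes.
set.seed(1)
cells <- expand.grid(Transect = paste0("T", 1:4),
                     Age      = c("I", "II", "III", "IV"))
n_per_cell <- rep(c(2, 3, 4, 5), length.out = nrow(cells))
toy <- cells[rep(seq_len(nrow(cells)), times = n_per_cell), ]
toy$SLA <- 10 + 0.5 * as.integer(toy$Transect) + rnorm(nrow(toy))

# Analysis A: both factors fixed, with and without their interaction.
fit_int <- lm(SLA ~ Transect + Age + Age:Transect, data = toy)
fit_add <- lm(SLA ~ Transect + Age, data = toy)

# F test for the Age:Transect interaction.
int_test <- anova(fit_add, fit_int)

# A prediction has to name a specific Age and Transect combination,
# here age class II at the treeline transect T4.
new_tree <- data.frame(
  Transect = factor("T4", levels = levels(toy$Transect)),
  Age      = factor("II", levels = levels(toy$Age))
)
pred <- predict(fit_int, newdata = new_tree, interval = "confidence")
```

The interval in `pred` describes new tissue samples from that same combination in those same transects, which is the scope of inference described for analysis A.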
John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 3 Apr 2008, at 4:02 AM, Hank Stevens wrote:
[...]