random factor and error messages in the model fitting

Hi Alex,

On Apr 2, 2008, at 10:17 AM, Alex Fajardo wrote:

Dear r-sig-mixed-models webmail list members,

I am new to the mixed-effects models world and I am learning from
Faraway's and Pinheiro & Bates' books and also from this list.
I have three very straightforward questions, but first a brief summary
of my analysis objectives. I am trying to analyze my data with
mixed-effects models where the fixed factor is represented by
altitudinal transects (4 transects, where T4 is treeline, and T1, T2
and T3 are below treeline). In each altitudinal transect I collected
tissue samples from trees of different age classes (a categorical
variable with 4 levels: I, II, III, and IV), all with the main
objective of comparing specific leaf area (SLA) of the treeline trees
with that of the lower-elevation transects while taking into account
the age class of the tree being considered. The data set is very
unbalanced, and from reading similar papers I concluded that age class
should be considered a random factor nested in transects.

*Question 1*: am I correct in considering age-class a random factor
nested within transect? If so, how should the model be coded?
My suggestion is:

sla.termas = lmer(SLA ~ 1 + Transect + (1|Transect:Age),
    data=Treeline[Site=="TermasChillan",], na.action=na.omit)

or

sla.termas = lmer(SLA ~ 1 + Transect + (Transect|Age),
    data=Treeline[Site=="TermasChillan",], na.action=na.omit)

but I am not sure.

I would assume that Age is NOT nested; if it is, you are saying that
the effect of age could depend entirely on which transect you look at.
Rather, I assume different ages simply have different responses, i.e.,
(1|Age).

However, I would think that a fixed effect model is just as useful.
lm( SLA ~ Transect + Age + Age:Transect)
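
A minimal sketch of that fixed-effects fit, run on simulated data because the real Treeline data are not shown in this thread. The data frame d, its size, and the effect sizes are all invented for illustration:

```r
# Simulated stand-in for Treeline[Site=="TermasChillan",] (all values hypothetical)
set.seed(1)
d <- expand.grid(Transect = factor(paste0("T", 1:4)),
                 Age = factor(c("I", "II", "III", "IV")),
                 rep = 1:5)
# By assumption SLA rises with transect number and has no true Age effect
d$SLA <- 100 + 5 * as.integer(d$Transect) + rnorm(nrow(d), sd = 8)

fit <- lm(SLA ~ Transect + Age + Age:Transect, data = d)
anova(fit)     # sequential F tests: Transect, Age, then the interaction
```

With unbalanced data, as in the real study, the sequential tests depend on the order of the terms, so the order in the formula should be chosen deliberately.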

I am not sure why these altitudes or age classes would be considered a
random draw from a large number of such classes that you know little
about. They seem entirely repeated, and usefully so.

My two cents,
Hank

In my learning process I followed the examples in Faraway's book and,
just for learning purposes, fitted my model the way he does
(considering both factors random). For some variables I got the
following warning message:

sla.termas = lmer(SLA ~ 1 + (1|Transect) + (1|Transect:Age),
data=Treeline[Site=="TermasChillan",], na.action=na.omit)
*Warning message: In .local(x, ..., value) :
 Estimated variance-covariance for factor 'Age' is singular*

This is a problem when I try to test the significance of the variation
among ages by ANOVA, since I get Pr(>Chisq)=1; something must be wrong.
This happens with some variables but not with all of them: strange?

*Question 2*: any idea why this happens? What am I doing wrong?
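
One reading of the warning, offered as an assumption rather than something confirmed in this thread: the estimated variance for the Transect:Age term is zero (on the boundary), so the models with and without that term have the same likelihood, the likelihood-ratio statistic is 0, and Pr(>Chisq) comes out as 1. A minimal sketch on simulated data follows. The data frame d and all numbers are invented, and isSingular() comes from current lme4 releases, not the 2008 version used above:

```r
library(lme4)

# Simulated data with Transect variation but NO Transect:Age variation
set.seed(2)
d <- expand.grid(Transect = factor(paste0("T", 1:4)),
                 Age = factor(c("I", "II", "III", "IV")),
                 rep = 1:5)
d$SLA <- 100 + rnorm(4, sd = 6)[d$Transect] + rnorm(nrow(d), sd = 8)

nmod <- lmer(SLA ~ 1 + (1 | Transect), data = d, REML = FALSE)
amod <- lmer(SLA ~ 1 + (1 | Transect) + (1 | Transect:Age),
             data = d, REML = FALSE)

isSingular(amod)   # TRUE when a variance estimate sits on the boundary
VarCorr(amod)      # look for a standard deviation of (near) zero
anova(nmod, amod)  # if the extra variance is 0, Chisq is ~0 and the p-value ~1
```

Whether a given response variable gives a boundary estimate depends on the data, which would be consistent with the warning appearing for some variables and not others.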

When the model does run (without the warning, and only for some other
variables) and I compare it with a reduced version (without the nested
random factor, i.e., age-class), I run anova and get a p-value. As
suggested by Faraway, I should also compute a p-value by simulating
the LRT 1000 times (less conservative, cf. Faraway), and most of the
time I get the following message:

lrstat = numeric(1000)
for(i in 1:1000){
  rSLA = unlist(simulate(sla.termas2))
  nmod = lmer(rSLA ~ 1 + (1|Transect),
              data=Treeline[Site=="TermasChillan",], na.action=na.omit)
  amod = lmer(rSLA ~ 1 + (1|Transect) + (1|Transect:Age),
              data=Treeline[Site=="TermasChillan",], na.action=na.omit)
  lrstat[i] = 2*(logLik(amod)-logLik(nmod))
}
*Error in model.frame.default(data = Treeline[Site == "TermasChillan",  :
  variable lengths differ (found for 'Transect')*

*Question 3*: any idea why this happens? What am I doing wrong? I made
an artificial balanced data set to see whether the unbalanced design
was responsible for this message, but it was not.
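
One plausible cause of the error, stated as an assumption since the thread does not settle it: rSLA lives in the workspace rather than in the data frame, and its length (the rows actually used to fit sla.termas2, after NA removal) need not equal the number of rows of Treeline[Site=="TermasChillan",]. The sketch below sidesteps this by dropping incomplete rows once and storing each simulated response as a column of the same data frame. The data frame d and its values are simulated, simulating from the null model is assumed to match Faraway's procedure, and ML fits (REML = FALSE) are a choice made here:

```r
library(lme4)

# Simulated stand-in for the real data, with two missing responses
set.seed(42)
d <- expand.grid(Transect = factor(paste0("T", 1:4)),
                 Age = factor(c("I", "II", "III", "IV")),
                 rep = 1:6)
d$SLA <- 100 + rnorm(4, sd = 6)[d$Transect] + rnorm(nrow(d), sd = 5)
d$SLA[c(3, 17)] <- NA

# Drop incomplete rows ONCE so every later vector has the same length
d <- d[complete.cases(d[, c("SLA", "Transect", "Age")]), ]

nmod <- lmer(SLA ~ 1 + (1 | Transect), data = d, REML = FALSE)
amod <- lmer(SLA ~ 1 + (1 | Transect) + (1 | Transect:Age),
             data = d, REML = FALSE)
obs <- as.numeric(2 * (logLik(amod) - logLik(nmod)))

nsim <- 100                      # use 1000 for a real analysis
lrstat <- numeric(nsim)
for (i in seq_len(nsim)) {
  d$rSLA <- simulate(nmod)[[1]]  # response simulated under the null model
  n0 <- lmer(rSLA ~ 1 + (1 | Transect), data = d, REML = FALSE)
  a0 <- lmer(rSLA ~ 1 + (1 | Transect) + (1 | Transect:Age),
             data = d, REML = FALSE)
  lrstat[i] <- as.numeric(2 * (logLik(a0) - logLik(n0)))
}
pval <- mean(lrstat >= obs)      # bootstrap p-value for the Transect:Age term
pval
```

Many of the simulated fits may report a singular (boundary) fit; that is expected when the null model has no Transect:Age variance and does not stop the loop.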

I am new to the mixed-effects models world but I want to learn; your
comments and advice will be greatly appreciated. Cheers,

--
Alex Fajardo, PhD
Investigador Asociado
Centro de Investigación en Ecosistemas de la Patagonia
Bilbao 449. Coyhaique, CHILE
Telefonos: 56-67-244503; (56) 8-4506354
Fax: 56-67-244501
alex.fajardo at ciep.cl

Dr. Hank Stevens, Associate Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.cas.muohio.edu/ecology
http://www.muohio.edu/botany/

"If the stars should appear one night in a thousand years, how would men
believe and adore." -Ralph Waldo Emerson, writer and philosopher  
(1803-1882)

I'd expect that you want to generalize to a different choice of
transects within each altitude and age-class. Were there multiple
transects for each altitude and age-class combination? If not, the
factor Transect is trying to do two things at once -- account for the
fixed effect of altitude, and account for the random effect of transect.

Two analyses are possible with the data that you seem to have:

A)  lm( SLA ~ Transect + Age + Age:Transect)

The inferences generalize to a different choice of tissue samples  
within those same Age and Transect combinations.  When a prediction is  
made, you have to say which Age and Transect combination you have in  
mind, and inferences apply to the particular Transects that were taken.

B)

lmer(SLA ~ Age + Transect + (1|Transect:Age),
          data=Treeline[Site=="TermasChillan",], na.action=na.omit)

This treats variation between Age:Transect combinations as the
relevant measure of error, hoping that this will be much the same as
the error that you'd get from different transects within Altitude:Age
combinations. If there is an Altitude:Age interaction, it may
overestimate the error.
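
A minimal sketch of analysis B on simulated data (the data frame d and every number in it are invented for illustration), showing where the Transect:Age variance that serves as the error term appears in the fit:

```r
library(lme4)

set.seed(3)
d <- expand.grid(Transect = factor(paste0("T", 1:4)),
                 Age = factor(c("I", "II", "III", "IV")),
                 rep = 1:5)
# One random deviation per Transect:Age cell, plus within-cell noise
cell <- interaction(d$Transect, d$Age)
d$SLA <- 100 + 4 * as.integer(d$Transect) +
  rnorm(nlevels(cell), sd = 5)[cell] + rnorm(nrow(d), sd = 6)

fitB <- lmer(SLA ~ Age + Transect + (1 | Transect:Age), data = d)
VarCorr(fitB)            # Transect:Age and residual standard deviations
coef(summary(fitB))      # fixed-effect estimates with their standard errors
```

The standard errors of the Transect and Age effects here reflect the cell-to-cell variation, not just the tree-to-tree variation within a cell, which is the point of treating Transect:Age as the error term.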

[If you do happen to have multiple transects for each Age:Transect  
combination, you'd want something like:

lmer(SLA ~ Age*Altitude+ (1|transect/Age),
          data=Treeline[Site=="TermasChillan",], na.action=na.omit)

(the error term needs to identify individual transect*Age  
combinations) ]

NB also, you might want to try a non-linear term in Age in the fixed  
part of the model.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models