
LME & data with complicated random & correlational structures

Have you received any replies to this post?  I haven't seen any, so I 
will attempt a few comments.  First, I'm overwhelmed by the details in 
your discussion.  I suggest that for each of your questions, you try to 
think of an extremely simple example that would test it, as suggested 
in the Posting Guide (www.R-project.org/posting-guide.html).  If you 
have problems with that, please submit a question focused on that one 
thing.

	  Have you looked at Pinheiro and Bates (2000) Mixed-Effects Models 
in S and S-PLUS (Springer)?  If not, I suggest you take a hard look at 
this book.  I was unable to get anything sensible from "lme" until I 
started working through it.  It is the primary reference for "lme", and 
for me at least it was essential to understanding what I needed to do 
to get anything useful from "lme".  You don't need to follow all the 
math in this book to get something useful out of it, because it 
includes many worked examples that should go a long way toward 
answering the questions you might have about "lme".

	  1.  "If I set something in my model formula as a fixed effect, then 
it does not make sense to set it as a random effect as well? ..."

 > Temperature ~ Stimulus-1, random=Subj|Subj/Epoch/Stimulus

	  I perceive several problems here.  Have you tried the following:

 > Temperature ~ Stimulus-1, random=~1|Subj/Epoch

 From your description, I'm guessing that Stimulus is a factor with 
levels like ("Baseline", "A", "B", "Recovery").  My way of thinking 
about these things is to try to write this as an algebraic model with 
parameters to estimate.  "Temperature~Stimulus-1", ignoring the "random" 
argument for the moment, could be written as follows:

(1) Temperature = b["Baseline"]*I(Stimulus=="Baseline") + 
b["A"]*I(Stimulus=="A") + b["B"]*I(Stimulus=="B") +
b["Recovery"]*I(Stimulus=="Recovery"),

	  where the 4 "b" parameters are to be estimated by iteratively 
reweighted generalized least squares to maximize an appropriate 
likelihood function [and where I(...) is the indicator function, equal 
to 1 if (...) is TRUE and 0 otherwise].
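	  You can see those indicator columns directly in R: with the 
intercept dropped, the design matrix for "~Stimulus-1" is exactly the 
four indicators in (1).  A minimal sketch, with a made-up Stimulus 
vector (the level names are assumed from your description):

```r
## Made-up Stimulus factor; level names assumed from the description
Stimulus <- factor(c("Baseline", "A", "B", "B", "Recovery"),
                   levels = c("Baseline", "A", "B", "Recovery"))
## Dropping the intercept ("-1") makes each column of the design
## matrix the indicator I(Stimulus == level):
model.matrix(~ Stimulus - 1)
```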

	  Now consider "random=~1|Subj"; ignore "Epoch" for the moment.  This 
adds one "random coefficient" to the model for each Subj.  If you have 
2 subjects, the model becomes something like the following:

(2) Temperature = b["Baseline"]*I(Stimulus=="Baseline") + 
b["A"]*I(Stimulus=="A") + b["B"]*I(Stimulus=="B") +
b["Recovery"]*I(Stimulus=="Recovery") + b.subj[1]*I(Subj==1)
+ b.subj[2]*I(Subj==2).

	  However, we do NOT initially estimate the random coefficients 
b.subj[1] and b.subj[2].  Rather, we assume these coefficients "b.subj" 
are normally distributed with mean 0 and a variance "var.subj", and we 
want to estimate "var.subj".  (We may later estimate the b.subj's 
conditioned on the estimate of "var.subj", but that's a separate issue.) 
  We estimate "var.subj" using iteratively reweighted generalized least 
squares, which roughly speaking "uncorrelates" or "whitens" the 
correlated residuals from model (1) and then minimizes the sum of 
squares of those "whitened" residuals.  More detail is provided in 
Pinheiro and Bates.
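	  As a concrete sketch (all numbers invented for illustration), you 
can simulate data with a per-subject shift and check that "lme" 
recovers something like "var.subj"; the variable names are assumed from 
your description:

```r
library(nlme)
set.seed(1)
## 6 simulated subjects, 8 observations each (invented data)
d <- data.frame(Subj = factor(rep(1:6, each = 8)),
                Stimulus = factor(rep(c("Baseline", "A", "B",
                                        "Recovery"), 12)))
b.subj <- rnorm(6, sd = 0.5)     # the "random coefficients"
d$Temperature <- 37 + b.subj[d$Subj] + rnorm(nrow(d), sd = 0.2)
fit <- lme(Temperature ~ Stimulus - 1, random = ~1 | Subj, data = d)
VarCorr(fit)   # StdDev of (Intercept) estimates sqrt(var.subj)
ranef(fit)     # the b.subj's, estimated after the variances
```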

	  Now consider "random=~1|Subj/Epoch".  This adds another random 
coefficient for each (Subj, Epoch) combination; these are also assumed 
to be normally distributed with mean 0 and variance "var.s.epoch".  We 
then want to estimate the 4 fixed-effect coefficients and the two 
variance parameters simultaneously, using the same approach just 
outlined.
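	  The nested version is the same idea with one more grouping level. 
A sketch with invented data and variances (again, names assumed from 
your description):

```r
library(nlme)
set.seed(2)
## 6 subjects x 3 epochs x 4 stimuli x 4 replicates (invented data)
d <- expand.grid(Rep = 1:4,
                 Stimulus = c("Baseline", "A", "B", "Recovery"),
                 Epoch = factor(1:3),
                 Subj = factor(1:6))
u.subj  <- rnorm(6,  sd = 0.5)   # one coefficient per Subj
u.epoch <- rnorm(18, sd = 0.3)   # one per (Subj, Epoch) combination
cell <- (as.integer(d$Subj) - 1) * 3 + as.integer(d$Epoch)
d$Temperature <- 37 + u.subj[d$Subj] + u.epoch[cell] +
  rnorm(nrow(d), sd = 0.2)
fit <- lme(Temperature ~ Stimulus - 1, random = ~1 | Subj/Epoch,
           data = d)
VarCorr(fit)   # rows for Subj, Epoch %in% Subj, and the residual
```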

	  2.  "Is it possible to take a piecewise approach wrt the variance 
using lme(), such as modeling the variability of each subject first ... 
When I try to set up the correlation structure, I run out of memory 
fast."

	  You've hit on a great idea here:  Consider the data for each subject 
separately.  Plot it, think about the similarities and differences, and 
condense the data into, e.g., 1 or a few numbers per subject / epoch 
combination.  Then use "lme" on this condensed data set.  I'd start 
simple and add complexity later.  I have on occasion tried to fit the 
most complicated model I could think of, only to find that I could have 
gotten most of the information from condensing it grossly and fitting 
much simpler models, and I wasted lots of time and effort trying to work 
with all the data at once.  I suggest you start by aggregating it to the 
grossest level you think might answer your research questions.  After 
you have sensible answers at that level, if you still have time and 
energy for this project, I might then try aggregating the data into 
more summaries with fewer observations in each.
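	  For instance, with long per-subject series (the data frame below is 
simulated as a stand-in for yours), "aggregate" condenses each 
Subj/Epoch/Stimulus cell to one mean before "lme" ever sees the data:

```r
library(nlme)
set.seed(3)
## 50 time points per cell: 3600 raw observations (invented data)
d <- expand.grid(t = 1:50,
                 Stimulus = c("Baseline", "A", "B", "Recovery"),
                 Epoch = factor(1:3),
                 Subj = factor(1:6))
d$Temperature <- 37 + rnorm(nrow(d), sd = 0.3)
## Condense: 3600 raw observations -> 72 cell means
agg <- aggregate(Temperature ~ Subj + Epoch + Stimulus, data = d,
                 FUN = mean)
fit <- lme(Temperature ~ Stimulus - 1, random = ~1 | Subj/Epoch,
           data = agg)
```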

	  3.  "Is there a way to get rid of the serial dependency BEFORE 
running the model with lme(), such as initiating a corStruct before 
placing it in the model?"

	  Answer:  Yes, and the primary way to do this would be to aggregate, 
like I just suggested.  If you still want to do more with the 
correlation structure, study Pinheiro and Bates and try something with a 
moderately condensed version of what you have.
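	  If aggregation alone isn't enough, "lme" can model the serial 
dependence for you through its "correlation" argument, rather than you 
pre-whitening by hand.  A sketch with simulated within-subject AR(1) 
noise (corAR1 is just one choice of corStruct; all data invented):

```r
library(nlme)
set.seed(4)
## 40 time points per subject, 6 subjects (invented data)
d <- expand.grid(t = 1:40, Subj = factor(1:6))
d$Stimulus <- factor(rep(rep(c("Baseline", "A", "B", "Recovery"),
                             each = 10), 6))
## AR(1) noise within each subject, plus a per-subject shift
d$Temperature <- 37 + rnorm(6, sd = 0.4)[d$Subj] +
  unlist(lapply(1:6, function(i)
    as.numeric(arima.sim(list(ar = 0.6), 40, sd = 0.2))))
fit <- lme(Temperature ~ Stimulus - 1, random = ~1 | Subj,
           correlation = corAR1(form = ~ t | Subj), data = d)
coef(fit$modelStruct$corStruct, unconstrained = FALSE)  # estimated Phi
```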

	  Hope this helps.
	  spencer graves
Keith Chamberlain wrote: