Fitting a ``trivial'' model with lmer().

2 messages · Rolf Turner, Doran, Harold

Original

Rolf Turner

Tue, Jul 29, 2008 3:48 PM #

I sent the message, to be found below, to Doug Bates just now.
Since he is flat out preparing a number of workshops, he suggested
that I post to this SIG to see if I could get some enlightenment.

Ergo, here goes:

I wrote (to the general R-help list; 28 July 2008):

	<snip>

and Doug Bates responded:

I tried the above formulation and it (as I posted yesterday) gives
a warning about a false convergence message from nlminb.  I think
there are indeed singularity problems, or rather problems with a
certain indeterminateness.

I think that the forgoing formulation translates into ``mathematical
expressions'' as

	y_ij = mu + alpha_i + s_j + e_ij

(with treatment ``contrasts'' giving the constraint that alpha_1 = 0)
where alpha_i corresponds to the i-th time, s_j corresponds to the j-th
student, and the e_ij are (Gaussian) errors with mean 0 and a covariance
structure

	Cov(e_ij, e_{i',j'}) = 0 if j != j'
			     = gamma_{i,i'} if j = j'

Translated into my ``trivial model'' thinking, this can be expressed as

	Y_j = MU + s_j + E_j

where the Y_j and E_j are 6-dimensional vectors and the addition in the
forgoing follows the S/R convention whereby the scalar s_j is added to
each component of the vectors.  And of course MU is the 6 dimensional
vector of mean values.

The vectors Y_j are independent N(MU,Sigma) where Sigma = Gamma +  
sigma^2
where Gamma = [gamma_ij] and sigma^2 is the variance of the  
(independent,
mean 0) s_j.  I believe that sigma^2 is the residual variance in the
lmer() formulation of the model.

The covariance matrix Sigma is well-determined/estimable but Gamma  
and sigma^2
are not --- we can ``sensibly'' have Gamma = Sigma - sigma^2 for any  
value
of sigma^2 such that Sigma - sigma^2 is positive definite. In general  
there
will be a wide range of such values of sigma^2.

It seems to me that the s_j are redundant in the model; we just need:

	y_ij = mu + alpha_i + e_ij

Is there a way to formulate the foregoing model in lmer()?  Effectively,
from one p.o.v., one wants to constrain the sigma^2 variance  
component to
be 0.

Of course one doesn't need lmer() to fit such a model.  I just want to
get some understanding of the modelling syntax in lmer() and perhaps
be able to fit more complicated models which are related to the  
foregoing
``trivial'' one.

Thanks.

	cheers,

		Rolf Turner

######################################################################
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

Wed, Jul 30, 2008 7:45 AM #

Can you post str(Rolf's_data)? If I understand correctly, y is a
temporally ordered vector of the measurements on each student, correct?
I further assume that tstnum is a factor that associates each element in
the vector y with a point in time. So, your data are something like:

Student	y	tstnum
1		y1	1
1		y2	2
1		y3	3
2		y1	1
2		y2	2
2		y3	3
...

If this is your data structure and your lmer syntax is:

lmer(y ~ tstnum + (0 + tstnum|stdnt), data=schooldat)

Then you will get the marginal effects for each time point as (mu +
alpha_i) and the variance/covariance matrix of the ranodm effects will
be very large and probably unstable. I don't know what you mean by the
"most general covariance matrix possible". As someone who models
longitudinal educational data on a daily basis, I would approach this
problem differently. 

Again, I don't think I understand your data structure entirely correct,
but assume the data structure I pose above is what you have. I would
first start with a model where tstnum is an integer and not a factor in
the data frame. Because it seems you want the marginal effects for time,
I might do something like:

lmer(y ~ factor(tstnum) + (tstnum|stdnt), data=schooldat)

In this formulation, treating time in this manner accounts for the
serial correlation between scores and reduces the number of elements in
the variance/covariance matrix.

Now, I experimented quickly with the egsingle data set that is in
library(mlmRev). Here is how my thinking plays out:

Linear mixed-effects model fit by REML 
Formula: math ~ factor(year) + (year | schoolid) 
   Data: egsingle 
   AIC   BIC logLik MLdeviance REMLdeviance
 20546 20608 -10264      20503        20528
Random effects:
 Groups   Name        Variance  Std.Dev. Corr  
 schoolid (Intercept) 0.1933133 0.439674       
          year        0.0098065 0.099028 0.555 
 Residual             0.9664253 0.983069       
number of obs: 7230, groups: schoolid, 60

Fixed effects:
                 Estimate Std. Error t value
(Intercept)      -3.20081    0.10074  -31.77
factor(year)-1.5  1.18359    0.09304   12.72
factor(year)-0.5  2.19556    0.09552   22.99
factor(year)0.5   2.88863    0.09991   28.91
factor(year)1.5   3.48153    0.10660   32.66
factor(year)2.5   4.29051    0.11451   37.47

Correlation of Fixed Effects:
            (Intr) f()-1. f()-0. f()0.5 f()1.5
fctr(y)-1.5 -0.831                            
fctr(y)-0.5 -0.816  0.914                     
fctr(yr)0.5 -0.785  0.893  0.927              
fctr(yr)1.5 -0.739  0.853  0.903  0.933       
fctr(yr)2.5 -0.691  0.809  0.872  0.915  0.932

Warning messages:
1: In .local(x, ..., value) :
  Estimated variance-covariance for factor 'schoolid' is singular

2: In .local(x, ..., value) :
  nlminb returned message false convergence (8)

The number of elements in the variance/covariance matrix in fm2 is large
and this model also has a singularity problem, similar to what you have
encountered. Let me stop here and let you add info or experiment with
what I pose.

_______________________________________________
R-sig-mixed-models at r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models