Skip to content

Replicating SAS results in R with unstructured covariance matrix

3 messages · Viechtbauer Wolfgang (STAT), Tony K.-T.

#
It's been a while since I used SAS for mixed-effects model, but I think the model:

PROC MIXED data=Comb method=reml;
  CLASS FAC1 TIME SUBJECT FAC2;
  MODEL RESPONSE = BIAS FAC2 FAC1 TIME TIME*FAC1 / ddfm=Residual solution;
  REPEATED TIME / subject=SUBJECT type=un;
RUN;

can be fit with gls() using:

gls(RESPONSE ~ BIAS + factor(FAC2) + factor(TIME)* factor(FAC1), correlation = corSymm(form = ~ TIME | SUBJECT), weights = varIdent(form = ~ 1 | TIME), data=Comb)

TIME needs to be integer valued. SUBJECT should be a factor. ddfm=Residual implies that the fixed effects are tested with t-tests using df = n-rank(X) (where X is the model matrix), while gls() uses the normal distribution to compute those p-values. But this is a minor issue, except in those pesky borderline cases, but then again neither that t-distribution nor that normal distribution is exactly right and one should withhold judgement anyway.

It would be nice if you could provide feedback whether the syntax above did the trick.

Best,
Wolfgang

--   
Wolfgang Viechtbauer, Ph.D., Statistician   
Department of Psychiatry and Psychology   
School for Mental Health and Neuroscience   
Faculty of Health, Medicine, and Life Sciences   
Maastricht University, P.O. Box 616 (VIJV1)   
6200 MD Maastricht, The Netherlands   
+31 (43) 388-4170 | http://www.wvbauer.com
#
For corSymm, 'form' needs to be:

"a one sided formula of the form ~ t, or ~ t | g, specifying a time covariate t and, optionally, a grouping factor g. A covariate for this correlation structure must be integer valued. When a grouping factor is present in form, the correlation structure is assumed to apply only to observations within the same grouping level; observations with different grouping levels are assumed to be uncorrelated. Defaults to ~ 1, which corresponds to using the order of the observations in the data as a covariate, and no groups."

So, yes, with corSymm(form = ~ 1 | SUBJECT), gls() will assume that the first measurement for a particular subject corresponds to time = 1. If that's not the case (and that can of course happen if rows with missing data have been deleted), then things get misalligned. Then you need the time covariate, so that things can be matched up properly.

Best,
Wolfgang