Does corSymm() require balanced data?

Mon, Mar 15, 2021 10:37 AM

Dear Joe,

At the risk of revealing something that could be misused (because I agree with Thierry that you are pushing things by trying to fit this model with these data), you can get the model to converge by switching to a different optimizer (i.e., BFGS):

fit <- lme(opp ~ time * ccog, random = ~ 1 | id, correlation = corSymm(form = ~ 1 | id), data = dat, control = list(opt = "optim"))

Whether this converges to the global maximum I have not attempted to check.

Maybe this is still useful to know because it might allow you to make a more informed decision about the use of a simpler model. For example:

fit2 <- lme(opp ~ time*ccog, random = ~ 1 | id, correlation = corAR1(form = ~ time), data = dat, control=list(opt="optim"))
anova(fit, fit2)

shows that the corSymm() model does not fit significantly better than the AR1 model.

Best,
Wolfgang

-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ben Bolker
Sent: Monday, 15 March, 2021 18:05
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Does corSymm() require balanced data?

On 3/15/21 10:56 AM, Tip But wrote:

Dear Thierry,

Thank you so much for your insightful comments. May I follow up on them
inline below?


***"You have too few subjects with 4 observations. Either drop those fourth
observations."

Does the above mean that for an unstructured residual correlation
matrix, each distinct number of measurements (e.g., 3 times, 4 times, etc.)
must occur in a roughly equal number of subjects (e.g., 9 subjects with
3 times, 7 subjects with 4 times)?

 Balance is probably less important than the total number with 4
observations.  If you had 100 subjects with 3 times and 20 subjects with
4 times you'd probably be fine.
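
A minimal sketch of how one might check this pattern before fitting, using a made-up data frame `toy` (hypothetical, not the poster's data) that mirrors the 100/20 example above. It tabulates how many subjects have each number of observations and lists which of the pairwise correlations in a 4-occasion unstructured matrix involve the fourth occasion, since only subjects measured 4 times carry information about those.

```r
# Hypothetical toy layout: 20 subjects measured 4 times, 100 measured 3 times
n_per_id <- c(rep(4, 20), rep(3, 100))
toy <- data.frame(id = rep(seq_along(n_per_id), times = n_per_id))

# Number of observations per subject, then number of subjects per count
n_obs   <- as.vector(table(toy$id))
pattern <- table(n_obs)
print(pattern)

# All distinct pairs of occasions in a 4 x 4 correlation matrix
pairs <- combn(4, 2)

# Pairs that include occasion 4 rely only on the subjects with 4 observations
involves_4th <- apply(pairs, 2, function(p) 4 %in% p)
print(pairs[, involves_4th, drop = FALSE])
```

With the poster's data the same two `table()` calls on `dat$id` would show how few subjects support the correlations with the fourth occasion.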

***"Or use a different correlation structure. E.g. an AR1:

fit_alt <- lme(opp ~ time * ccog, random = ~1 | id,
   correlation = corAR1(form = ~ time), data = dat)
"

In your above R code, is it necessary to use `corAR1(form = ~ time)`?

It seems `corAR1(form = ~1 | id)` gives the same result?

  I believe that form = ~1|id uses the order of the observations within
each id in the data set as the time index, while form = ~ time takes its
grouping variable from the random effect, so these should indeed be
equivalent as long as the rows are sorted by time within id (I think the
documentation should state this, but I haven't checked).
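
One way to look at this without fitting a model is to initialize the two correlation structures on a small data frame and compare the matrices they imply. The sketch below is only an illustration under stated assumptions: `toy` is a made-up data frame, rows are sorted by time within id, time is coded as consecutive integers, the AR1 parameter 0.5 is arbitrary, and the grouping is written explicitly as `~ time | id` because there is no random-effects formula to inherit it from outside `lme()`.

```r
library(nlme)

# Hypothetical toy data: subject 1 has 4 occasions, subject 2 has 3,
# sorted by time within id, time coded 0, 1, 2, ...
toy <- data.frame(id = rep(1:2, times = c(4, 3)),
                  time = c(0:3, 0:2))

# AR1 indexed by row position within id versus by the time covariate
ar_pos  <- Initialize(corAR1(0.5, form = ~ 1 | id), data = toy)
ar_time <- Initialize(corAR1(0.5, form = ~ time | id), data = toy)

# One correlation matrix per subject for each specification
m_pos  <- corMatrix(ar_pos)
m_time <- corMatrix(ar_time)

# Compare the two lists of matrices
print(all.equal(m_pos, m_time))
```

If the rows were not sorted by time, or if some occasions were skipped, the two specifications would no longer be expected to agree.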

  If you **really** want an answer you can tell R to return it anyway:
use control=lmeControl(returnObject=TRUE), but I wouldn't trust it.

  It's hard to find another mixed-model package in R that can handle
this case (unstructured correlation, homogeneous variance).

On Mon, Mar 15, 2021 at 2:37 AM Thierry Onkelinx <thierry.onkelinx at inbo.be> wrote:

Dear Joe,

You have too few subjects with 4 observations. Either drop those fourth
observations or use a different correlation structure, e.g. an AR1:

fit <- lme(
   opp ~ time * ccog, random = ~1 | id,
   correlation = corSymm(), data = dat, subset = time < 3
)

fit_alt <- lme(
   opp ~ time * ccog, random = ~1 | id,
   correlation = corAR1(form = ~ time), data = dat
)

Best regards,


ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

On Mon, 15 Mar 2021 at 03:27, Tip But <fswfswt at gmail.com> wrote:

Dear Members,

In my longitudinal data below, the first couple of subjects were measured
4 times but the rest of the subjects were measured 3 times (see data below).

We intend to use an unstructured residual correlation matrix in
`nlme::lme()`. But our model fails to converge.

Question: Given our data is unbalanced with respect to our grouping
variable (i.e., `id`), can we use `corSymm()`? And if we do, what would
be the dimensions of the resultant unstructured residual correlation matrix
for our data; a 3x3 or a 4x4 matrix?

Thank you for your expertise,
Joe

# Data and R Code
dat <- read.csv("https://raw.githubusercontent.com/hkil/m/master/un.csv")

library(nlme)

fit <- lme(opp ~ time * ccog, random = ~ 1 | id,
           correlation = corSymm(form = ~ 1 | id),
           data = dat)

Error:
   nlminb problem, convergence error code = 1
   message = false convergence (8)
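
On the dimension question, a minimal sketch that needs no model fit: initialize a `corSymm()` structure on a small made-up data frame (`toy`, hypothetical, not un.csv) with one subject measured 4 times and one measured 3 times, then inspect the per-subject matrices. The six starting correlations are arbitrary illustrative values, one per pair of the 4 occasions.

```r
library(nlme)

# Hypothetical toy data: subject 1 has 4 rows, subject 2 has 3 rows
toy <- data.frame(id = rep(1:2, times = c(4, 3)),
                  time = c(0:3, 0:2))

# 4 occasions imply 4 * 3 / 2 = 6 correlation parameters
us <- corSymm(value = c(0.2, 0.1, -0.1, 0, 0.2, 0), form = ~ 1 | id)
us <- Initialize(us, data = toy)

# corMatrix() returns one correlation matrix per subject
mats <- corMatrix(us)
print(lapply(mats, dim))
```

In this sketch the parameter count is set by the longest series (4 occasions), while each subject's own matrix is sized by that subject's number of rows.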
