I?ve encountered a strange situation where the results of a gls model with a corStruct correlation structure are affected by whether Initialize() is explicitly called before the gls() model is executed.
This seems to be restricted to unbalanced data sets. I have pasted a test case using the Orthodont data set and corCompSymm below, although I first noticed it while running a custom corStruct class so it would appear to be a more general problem than this test case as you might expect. I have not tested this in lme().
I?m aware that calling Initialize() outside of gls or lme is generally discouraged but was surprised that it changed the analysis results. You could imagine calling it in an interactive modeling session to view the correlation structure without knowing that you might have changed future results. I would have expected the results to be identical.
I?d also like to know - which is the ?correct? result or are they considered practically equivalent, although the differences seem quite large for the latter.
Just to the clear, this is not about the relative merits of balanced vs unbalanced data. Its about the veracity of the results.
Thanks,
Joe.
R version 3.6.0 (Linux CentOS 7)
library(nlme)
packageVersion('nlme')
[1] ?3.1.147?
#?????????????????????
# the values are fixed for all models
rho <- 0.3
fixform <- distance ~ age + factor(Sex)
# remove the grouping and other classes to keep it simple
# full balanced data set
Obal<-as.data.frame(Orthodont)
class(Obal)
[1] "data.frame"
# create unbalanced Orthodont data set with no singletons
Ounbal <-as.data.frame(Orthodont[c(3,4,5, 7,8, 11, 12, 15, 16, 19, 20:28,31,32,35,36,39,
#----
# test with the full balanced data set
# pre-calling Initialize doesn't seem to make a difference
csB <- corCompSymm(value = rho, form = ~ 1 | Subject)
csB<-Initialize(csB, data=Obal)
print(mod<-gls(fixform, correlation=csB, data=Obal))
Generalized least squares fit by REML
Model: fixform
Data: Obal
Log-restricted-likelihood: -218.7563
Coefficients:
(Intercept) age SexFemale
17.7067130 0.6601852 -2.3210227
Correlation Structure: Compound symmetry
Formula: ~1 | Subject
Parameter estimate(s):
Rho
0.6144908
Degrees of freedom: 108 total; 105 residual
Residual standard error: 2.305696
# compare with direct call
print(mod2<-gls(fixform, correlation=corCompSymm(value = rho, form = ~ 1 | Subject), data=Obal) )
Generalized least squares fit by REML
Model: fixform
Data: Obal
Log-restricted-likelihood: -218.7563
Coefficients:
(Intercept) age SexFemale
17.7067130 0.6601852 -2.3210227
Correlation Structure: Compound symmetry
Formula: ~1 | Subject
Parameter estimate(s):
Rho
0.6144908
Degrees of freedom: 108 total; 105 residual
Residual standard error: 2.305696