Dear R users,
We are working on a data set in which we have measured repeatedly a
physiological response variable (y)
every 20 min for 12 h (time variable; 'x') in subjects ('id') beloning to
one of five groups ('group'; 'A' to 'E'). Data are located at:
https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0
We are interested to model if the response in y differences with time
(i.e. 'x') for the two groups. Thus:
require(nlme)
m1<-lme(y~group*x+group*I(x^2),random=~x|id,data=data.df,na.action=na.omit)
But because data are collected repeatedly over short time intervals for
each subject, it seemed prudent to consider an autoregressive covariance
structure. Thus:
m2<-update(m1,~.,corr=corCAR1(form=~x|id))
AIC values indicate the latter (i.e. m2) as more appropriate:
anova(m1,m2)
# Model df AIC BIC logLik Test L.Ratio
p-value
#m1 1 19 2155.996 2260.767 -1058.9981
#m2 2 20 2021.944 2132.229 -990.9718 1 vs 2 136.0525 <.0001
Fixed effects and test statistics differ between models. A look at
marginal ANOVA tables suggest inference might differ somewhat between
models:
anova.lme(m1,type="m")
# numDF denDF F-value p-value
#(Intercept) 1 1789 63384.80 <.0001
#group 4 45 1.29 0.2893
#x 1 1789 0.05 0.8226
#I(x^2) 1 1789 4.02 0.0451
#group:x 4 1789 2.61 0.0341
#group:I(x^2) 4 1789 4.37 0.0016
anova.lme(m2,type="m")
# numDF denDF F-value p-value
#(Intercept) 1 1789 59395.79 <.0001
#group 4 45 1.33 0.2725
#x 1 1789 0.04 0.8379
#I(x^2) 1 1789 2.28 0.1312
#group:x 4 1789 2.09 0.0802
#group:I(x^2) 4 1789 2.81 0.0244
Now, this is all well. But: my colleagues have been running the same data
set using PROC MIXED in SAS and come up with substantially different
results when comparing SAS default covariance structure (variance
components) and AR1. Specifically, there is virtually no change in either
test statistics or fitted values when using AR1 instead of Variance
Components in SAS, which fits the observation that AIC values (in SAS)
indicate both covariance structures fit data equally well.
This is not very satisfactory to me, and I would be interesting to know
what is happening here. Realizing
this might not be the correct forum for this question, I would like to ask
you all if anyone would have any
input as to what is going on here, e.g. am I setting up my model
erroneously, etc.?
N.b. I have no desire to replicate SAS results, but I would most certainly
be interested to know what could possibly explain such a large discrepancy
between the two platforms. Any suggestions greatly welcomed.
(Data are located at:
https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0)
With all best wishes,
Andreas
[[alternative HTML version deleted]]