Skip to content

MCMCglmm_2.13

4 messages · Malcolm Fairbrother, Jarrod Hadfield

#
Dear Jarrod,

Can I please just confirm what you meant here by "covariance structures for the random effects"? If I understand correctly, this technique might be very useful for quite a lot of things--though I'm also trying to think it through.

Specifically, the nlme package for example allows for error structures like AR1, etc., in the *residuals* of a mixed model. Are you saying that MCMCglmm's "ginverse" function allows for similar sorts of structure to be modelled in the *random effects*? Consider:

library(nlme); library(MCMCglmm)

# models with uncorrelated random effects (16 Rats observed on 11 occasions):
m1 <- lme(weight ~ Time, random = ~ 1 | Rat, BodyWeight)
m2 <- MCMCglmm(weight ~ Time, random = ~ Rat, data=BodyWeight, verbose=F) # m1 and m2 are the same

# same model as the above, but allowing for AR1 residuals errors:
cs1AR1 <- corAR1(0.8, form = ~ Time | Rat)
cs1AR1 <- Initialize(cs1AR1, BodyWeight)
corMatrix(cs1AR1)[[1]] # shows the correlation structure for each Rat (all 16 are the same, since each Rat is measured on the same occasions)
m3 <- lme(weight ~ Time, random = ~ Time | Rat, correlation = cs1AR1, BodyWeight)

# now, for an illustrative experiment, treat *Time* as the random grouping factor, instead of Rat:
m4 <- lme(weight ~ Diet, random = ~ 1 | Time, BodyWeight)
m5 <- MCMCglmm(weight ~ Diet, random=~Time, data=BodyWeight, verbose=F, nitt=53000, pr=T)
# estimates for the variance of the random intercepts is somewhat different, but otherwise m4 and m5 yield similar results

# now, create a "ginverse" matrix, and incorporate it in a call to MCMCglmm:
mat <- as(corMatrix(cs1AR1)[[1]], "sparseMatrix")
rownames(mat) <- unique(BodyWeight$Time)
m6 <- MCMCglmm(weight ~ Diet, random=~Time, ginverse=list(Time=mat), data=BodyWeight, verbose=F, nitt=53000, pr=T)

The variance for the random effects for Time is now substantially reduced. Does this mean that the "estimates" for the U_j terms has taken into account their correlation, whereas m5 treated them as independent? (Maybe I should have inverted "mat" before including it in the call?)

Do you see what I'm getting at? Maybe I've not understood, but this seemed to raise the possibility of allowing for random effects to be structured in many interesting ways (serially in time, spatially, etc.), which I don't think other packages can handle?

Many thanks for any clarification, as always,
Malcolm
#
Hi Malcolm,

I don't think it's as fancy as you think/hope. The covariance  
structure has to be seperable into a matrix of prameters (V) and a  
fixed matrix (A), whereas in many spatial/time-series models A is also  
a function of some parameter(s). It is flexible in the sense that A  
can be any positive-definite matrix.

I've edited your code below to get MCMCglmm to fit the AR1 model (with  
fixed phi).

Cheers,

Jarrod
On 27 Jul 2011, at 19:18, Malcolm Fairbrother wrote:

            
# The models below (m4,m5 & m6) are not ideal, so I'll ft a MCMCglmm  
model that is close to m3:

phi<-0.756197

# get phi from m3 - tried to extract this number from m3 but gave up  
looking!?

csEAR1 <- corAR1(phi, form = ~ Time | Rat)
csEAR1 <- Initialize(cs1AR1, BodyWeight)

mat <- as(corMatrix(csEAR1)[[1]], "sparseMatrix")

# generate AR1 covariance matrix

# IMPORTANT: the inverse needs to be passed, rather than the actual  
covariance matrix. This is preferable because the inverse is often  
sparser, and MCMCglmm may not "know" this from the matrix and a brute  
force solve. Your example is a good one, because the inverse of an AR1  
covaraince matrix  (when time points are ordered) only has non-zero  
elements along the diagonal and sub-diagonals.  However, inverting mat  
gives a dense matrix because of numerical inaccuracies:

length(solve(mat)@x)

# gives 121 non-zero elements, but should really be :

dim(mat)[1]+2*(dim(mat)[1]-1)

# 31 non-zero elements. So:

smat<-solve(mat)
smat<-band(smat,c(-1,1), c(1,-1))

# also, it makes more sense for this to define the autocovariance  
structure within each Rat (rather than across Rats) as in m3.  Since  
the deisgn is balanced this is easy to form:

invC<-kronecker(diag(nlevels(BodyWeight$Rat)), smat)
invC<-as(invC,  "CsparseMatrix")

# create a new variable with a unique name for each datum (and have  
the rownames in invC correspond)

rownames(invC) <- 1:dim(BodyWeight)[1]
BodyWeight$timeRat<-as.factor(1:dim(BodyWeight)[1])

# Then fit

prior=list(R=list(V=1, nu=0.02), G=list(G1=list(V=diag(2), nu=2,  
alpha.mu=c(0,0), alpha.V=diag(2)*1000), G2=list(V=1, nu=1, alpha.mu=0,  
alpha.V=1000)))

m3.mcmc <- MCMCglmm(weight ~ Time, random=~us(1+Time):Rat+timeRat,  
ginverse=list(timeRat=invC), data=BodyWeight, verbose=F, nitt=53000,  
pr=T, prior=prior)

# This is almost identical to m3 except the AR1 process is for the  
random effects rather than the residuals. In m3 the covariance  
structure within each Rat is equal to vr*mat where vr is the residual  
variance. In  MCMCglmm  it is  equal to   va*mat +vb*I  where va is  
the variance associated with the timeRat term and vb is the variance  
associated with the units term (residual variance).

# Nevertheless, the lme point estimates correspond very closely to the  
posterior modes for all parameters:

f1<-summary(m3)$coef$fixed[1]
f2<-summary(m3)$coef$fixed[2]
v1<-as.numeric(VarCorr(summary(m3))[1])
v2<-as.numeric(VarCorr(summary(m3))[2])
c12<-as.numeric(VarCorr(summary(m3))[,3][2])*sqrt(v1*v2)
vr<-as.numeric(VarCorr(summary(m3))[3])


par(mfrow=c(1,2))
hist(m3.mcmc$Sol[,1], breaks=30)
abline(v=f1, col="red")
hist(m3.mcmc$Sol[,2], breaks=30)
abline(v=f2, col="red")

par(mfrow=c(2,2))
hist(m3.mcmc$VCV[,1], breaks=30)
abline(v=v1, col="red")
hist(m3.mcmc$VCV[,4], breaks=30)
abline(v=v2, col="red")
hist(m3.mcmc$VCV[,2], breaks=30)
abline(v=c12, col="red")
hist(rowSums(m3.mcmc$VCV[,5:6]), breaks=30)
abline(v=vr, col="red")

  
    
#
Jarrod,

This is indeed very interesting, and plenty fancy enough. What I'm particularly intrigued by is the possibility of fitting models with *both* time-series and spatial correlation structures. The code below represents my attempt to do that. Does this make sense?


library(nlme); library(MCMCglmm)
cs1AR1 <- corAR1(0.2, form = ~ Time | Rat) # the value chosen here doesn't seem to affect the estimates much
cs1AR1 <- Initialize(cs1AR1, BodyWeight)
corMatrix(cs1AR1)[[1]] # shows the correlation structure for each Rat (all 16 are the same, since each Rat is measured on the same occasions)

# a revised version of what you proposed (which, if I've got this right, proves about 25% faster for this dataset, and is simpler to set up):

smat <- Matrix(band(solve(corMatrix(cs1AR1)[[1]]), -1, 1), sparse=T)
rownames(smat) <- unique(BodyWeight$Time)
BodyWeight$Time2 <- BodyWeight$Time # distinguishing between two different "Time" variables seems to prevent R from crashing when calling MCMCglmm...

prior=list(R=list(V=1, nu=0.02), G=list(G1=list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*1000), G2=list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))

MC0 <- MCMCglmm(weight ~ as.numeric(Time), random= ~ us(1+Time):Rat + Time, data=BodyWeight, prior=prior)
MC1 <- MCMCglmm(weight ~ as.numeric(Time), random= ~ us(1+Time):Rat + Time2, ginverse=list(Time2=smat), data=BodyWeight, prior=prior)
# this approach takes advantage of the fact that the correlation structure is the same for each Rat
# it generates the same results as your m3.mcmc, though it reallocates some of the variance from your "timeRat" level to the units level
# but I think that's not a problem, because the units/residual variance can still be identified using va*mat+vb*I
# make sense?

# anyway, here's the (maybe) interesting bit, combining time-series and spatial autocorrelation:

BodyWeight$x <- sample(1:length(unique(BodyWeight$Rat)))[BodyWeight$Rat]
BodyWeight$y <- sample(1:length(unique(BodyWeight$Rat)))[BodyWeight$Rat]
plot(unique(BodyWeight$x), unique(BodyWeight$y)) # a map

cE <- corExp(form = ~ x + y | Time)
cE <- Initialize(cE, BodyWeight)
corMatrix(cE)[[1]] # a spatial correlation structure (albeit for data without any real spatial pattern)

smat2 <- Matrix(solve(corMatrix(cE)[[1]]), sparse=T) # not actually sparse, unfortunately
rownames(smat2) <- unique(BodyWeight$Rat)
BodyWeight$Rat2 <- BodyWeight$Rat # again, this seems necessary to prevent R from crashing when calling MCMCglmm

MC2 <- MCMCglmm(weight ~ as.numeric(Time), random= ~ us(1+Time):Rat + Time2, ginverse=list(Time2=smat, Rat2=smat2), data=BodyWeight, prior=prior)
#
Hi Malcolm,

You can combine variance structures as you have done, but given some  
of the parameters of the variance structure (e.g. phi) have to be  
known, I think you're better off fitting these models in lme or ASReml  
where they are estimated.

The first Rat example using Time2 is not equivalent to the model I set  
up (or your lme model m3). In the original models there are  
covariances between observations within Rats, whereas in your  
specification you have covariances between observations within AND  
between Rats (i.e. You have COV(u_{11},u_{12}) =  COV(u_{11},u_{22}),  
where the first subscript is Rat and the second Time2, whereas before  
it COV(u_{11},u_{22})=0).

I'll take a look into the problem you were getting when time was used  
as a fixed covariate and a factor associated with a ginverse.

Cheers,

Jarrod





Quoting Malcolm Fairbrother <m.fairbrother at bristol.ac.uk> on Thu, 28  
Jul 2011 22:17:46 +0100: