Bivariate MCMCglmm with repeated measures - R-SIG-mixed-models

Mon, Aug 11, 2014 10:34 AM #

Hi

I running a bivariate GLMM, where both of my response variables have repeated measures. A dummy data set would look like this:

Individual Presence Stage
1 0 1
1 1 1
1 1 2
1 1 2
2 1 1
2 0 1
2 0 1
2 1 2
2 0 2

There are a series of individuals. For each individual we have measures presence/absence repeatedly during life stage 1 and then again repeatedly during life stage 2.

For a straight forward bivariate model, with a single measure per individual (or repeated measures during only one life stage), the data could be set up like this:

Individual Presence stage 1 Presence stage 2

However because I have repeated measures for both stages I am struggling to find out how to code the data so MCMCglmm can run.

Does anyone have any experience with this kind of data structure? I can?t find anything on the R list.

Many Thanks

Sam

Dr Samantha Patrick
Research Fellow
Biosciences QU116
Francis Close Hall Campus
University of Gloucestershire
Cheltenham, GL50 4AZ, UK

Research Associate: OxNav, University of Oxford

******From 1st August - 14th November 2014 I will be
based in Montr?al, which is 5 hours behind GMT ******

Tel: 07740 472 719
Skype: sammy_patrick
https://sites.google.com/site/samanthacpatrick/

-
?In the top 5 in the Green League Table; committed to sustainability?
This email is confidential to the intended recipient. If you have received it in error please notify the sender and delete it from your computer.
The University of Gloucestershire is a company limited by guarantee registered in England and Wales. Registered number: 06023243. Registered office: The Park, Cheltenham, GL50 2RH
Please consider the environment before printing this email.
-

Jarrod Hadfield

Mon, Aug 11, 2014 11:47 AM #

Hi Sam,

One option would be

random = ~us(stage):Individual, rcov=~units

where the random term is a 2x2 covariance matrix (between individual  
variances for each stage and the covariance between them). There is  
only a single residual variance in my model - but this is OK, with  
binary data it can't be estimated so there is no point trying to  
estimates separate residual variances for each stage. You will need to  
fix the residual variance at something though (I use 1).

If you only have Individual level covariates (i.e. no  
observation-level covariates) then you could group your binary  
responses into a binomial response and fit the model

random=NULL, rcov = ~us(stage):units

This will give (nearly) the same answers as the first model if you  
rescale the (co)variances as described in the CourseNotes.  It will be  
much faster too.

You might also want to consider models that deal with temporal  
autocorrelation, but these are not implemented in MCMCglmm.

Cheers,

Jarrod
Quoting "PATRICK, Samantha" <spatrick at glos.ac.uk> on Mon, 11 Aug 2014  
17:34:05 +0000:

Hi

I running a bivariate GLMM, where both of my response variables have  
repeated measures.  A dummy data set would look like this:

Individual     Presence    Stage
1                      0                       1
1                      1                        1
1                      1                        2
1                      1                        2
2                     1                         1
2                     0                        1
2                     0                        1
2                     1                         2
2                     0                        2

There are a series of individuals.  For each individual we have  
measures presence/absence repeatedly during life stage 1 and then  
again repeatedly during life stage 2.

For a straight forward bivariate model, with a single measure per   
individual (or repeated measures during only one life stage), the  
data could be set up like this:

Individual      Presence stage 1          Presence stage 2

However because I have repeated measures for both stages I am  
struggling to find out how to code the data so MCMCglmm can run.

Does anyone have any experience with this kind of data structure?  I  
can?t find anything on the R list.

Many Thanks

Sam

Dr Samantha Patrick
Research Fellow
Biosciences QU116
Francis Close Hall Campus
University of Gloucestershire
Cheltenham, GL50 4AZ, UK

Research Associate: OxNav, University of Oxford

******From 1st August - 14th November 2014 I will be
based in Montr?al, which is 5 hours behind GMT  ******

Tel: 07740 472 719
Skype: sammy_patrick
https://sites.google.com/site/samanthacpatrick/

-
?In the top 5 in the Green League Table; committed to sustainability?
This email is confidential to the intended recipient. If you have  
received it in error please notify the sender and delete it from  
your computer.
The University of Gloucestershire is a company limited by guarantee  
registered in England and Wales.  Registered number: 06023243.   
Registered office: The Park, Cheltenham, GL50 2RH
Please consider the environment before printing this email.
-

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

PATRICK, Samantha

Wed, Aug 13, 2014 6:53 AM #

Hi Jarrod

Thanks for getting back to me.  By simplifying the data, I left out the fact that there is a temporal pattern (Time).  There is an interaction between Time * Stage on Presence.  During stage one, Presence increases linearly with Time; during stage two Presence decreases linearly with time.   Stage one and two are mutually exclusive: Time 1-5 is always stage 1 and Time 6-10 always stage 2.   This was why I was originally to fit a bivariate model, so I could calculate the covariance between the slopes.

Following the method you suggested  I have been able to progress towards this.  The following model, if I understand correctly, fits an intercept (stage; two level factor) and a two population slopes (One in stage 1 and one in stage 2) and then a random intercept and slope during stage one and another random intercept and slope during stage two.  The 4x4  covariance matrix estimates the covariance between both intercepts and both slopes.
### this prior is not optimised in anyway - it is just a starting point
prior2.1 <- list(G = list(G1 = list(V=diag(4), nu=2, alpha.mu=c(0,0,0,0), alpha.V=diag(4)*1000)),
                          R = list(V=1, fix=1))

## full model
model2.1 <- MCMCglmm(Presence~ stage +  at.level(stage,1):Time + at.level(stage,2):Time ,

             random = ~us(at.level(stage ,1)+ at.level(stage ,1):Time  +
                                 at.level(stage ,2) +at.level(stage ,2):Time):Individual,

             rcov = ~units, family = "categorical",

              data = Data2, prior = prior2.1, verbose = FALSE, pr=T)

I then extended this model to allow the covariance structure to vary between the sexes:

##prior
prior2.2<- list(G = list(G1 = list(V=diag(8), nu=2, alpha.mu=c(0,0,0,0,0,0,0,0), alpha.V=diag(8)*1000)),
                          R = list(V=1, fix=1))

## full model
model2.2<- MCMCglmm(Presence~ stage +  at.level(stage,1):Time + at.level(stage,2):Time  ,
                     random = ~us(at.level(stage,1):at.level(Sex,1)+
                                    at.level(stage,1):at.level(Sex,2)+
                                    at.level(stage,1):at.level(Sex,1):Time +
                                    at.level(stage,1):at.level(Sex,2):Time  +
                                    at.level(stage,2):at.level(Sex,1)+
                                    at.level(stage,2):at.level(Sex,2)+
                                    at.level(stage,2):at.level(Sex,1):Time  +
                                    at.level(stage,2):at.level(Sex,2):Time ):Individual,
                     rcov = ~units, family = "categorical",
                     data = Data2, prior = prior2.2, verbose = FALSE, pr=T)

So I am left with one question: In essence the data lends itself to a piecewise regression, such that the end point of the slope in stage one is the start point for stage two.  Is it possible to fit this at the individual level? By this I mean that the random slope for individual 1 at stage two begins where and the slope for stage one ends. I have struggled to find anyone doing this.

I have included the full models and simulated data below (runs in first model only) in case anyone else is working on this kind of problem.

Thanks

Sam


### Data


Time
        Presence        Stage    Individual
1       1       1       1
2       1       1       1
3       0       1       1
4       1       1       1
5       1       1       1
6       0       2       1
7       0       2       1
8       1       2       1
9       0       2       1
10      1       2       1
1       0       1       2
2       1       1       2
3       0       1       2
4       1       1       2
5       1       1       2
6       1       2       2
7       0       2       2
8       0       2       2
9       0       2       2
10      1       2       2
1       0       1       3
2       1       1       3
3       1       1       3
4       1       1       3
5       1       1       3
6       0       2       3
7       0       2       3
8       1       2       3
9       1       2       3
10      1       2       3
1       0       1       4
2       1       1       4
3       1       1       4
4       1       1       4
5       1       1       4
6       1       2       4
7       1       2       4
8       0       2       4
9       0       2       4
10      0       2       4
1       1       1       5
2       0       1       5
3       0       1       5
4       1       1       5
5       1       1       5
6       1       2       5
7       1       2       5
8       1       2       5
9       1       2       5
10      0       2       5


Dr Samantha Patrick
Research Fellow
Biosciences QU116
Francis Close Hall Campus
University of Gloucestershire
Cheltenham, GL50 4AZ, UK

Research Associate: OxNav, University of Oxford

******From 1st August - 14th November 2014 I will be
based in Montr?al, which is 5 hours behind GMT  ******

Tel: 07740 472 719
Skype: sammy_patrick
https://sites.google.com/site/samanthacpatrick/

From: Jarrod Hadfield<mailto:j.hadfield at ed.ac.uk>
Sent: ?Monday?, ?11? ?August? ?2014 ?14?:?47
To: Samantha Patrick<mailto:spatrick at glos.ac.uk>
Cc: r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>

Hi Sam,

One option would be

random = ~us(stage):Individual, rcov=~units

where the random term is a 2x2 covariance matrix (between individual
variances for each stage and the covariance between them). There is
only a single residual variance in my model - but this is OK, with
binary data it can't be estimated so there is no point trying to
estimates separate residual variances for each stage. You will need to
fix the residual variance at something though (I use 1).

If you only have Individual level covariates (i.e. no
observation-level covariates) then you could group your binary
responses into a binomial response and fit the model

random=NULL, rcov = ~us(stage):units

This will give (nearly) the same answers as the first model if you
rescale the (co)variances as described in the CourseNotes.  It will be
much faster too.

You might also want to consider models that deal with temporal
autocorrelation, but these are not implemented in MCMCglmm.

Cheers,

Jarrod
Quoting "PATRICK, Samantha" <spatrick at glos.ac.uk> on Mon, 11 Aug 2014
17:34:05 +0000:

Hi

I running a bivariate GLMM, where both of my response variables have
repeated measures.  A dummy data set would look like this:

Individual     Presence    Stage
1                      0                       1
1                      1                        1
1                      1                        2
1                      1                        2
2                     1                         1
2                     0                        1
2                     0                        1
2                     1                         2
2                     0                        2

There are a series of individuals.  For each individual we have
measures presence/absence repeatedly during life stage 1 and then
again repeatedly during life stage 2.

For a straight forward bivariate model, with a single measure per
individual (or repeated measures during only one life stage), the
data could be set up like this:

Individual      Presence stage 1          Presence stage 2

However because I have repeated measures for both stages I am
struggling to find out how to code the data so MCMCglmm can run.

Does anyone have any experience with this kind of data structure?  I
can?t find anything on the R list.

Many Thanks

Sam

Dr Samantha Patrick
Research Fellow
Biosciences QU116
Francis Close Hall Campus
University of Gloucestershire
Cheltenham, GL50 4AZ, UK

Research Associate: OxNav, University of Oxford

******From 1st August - 14th November 2014 I will be
based in Montr?al, which is 5 hours behind GMT  ******

Tel: 07740 472 719
Skype: sammy_patrick
https://sites.google.com/site/samanthacpatrick/

-
?In the top 5 in the Green League Table; committed to sustainability?
This email is confidential to the intended recipient. If you have
received it in error please notify the sender and delete it from
your computer.
The University of Gloucestershire is a company limited by guarantee
registered in England and Wales.  Registered number: 06023243.
Registered office: The Park, Cheltenham, GL50 2RH
Please consider the environment before printing this email.
-

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


-
?In the top 5 in the Green League Table; committed to sustainability?
This email is confidential to the intended recipient. If you have received it in error please notify the sender and delete it from your computer.
The University of Gloucestershire is a company limited by guarantee registered in England and Wales.  Registered number: 06023243.  Registered office: The Park, Cheltenham, GL50 2RH
Please consider the environment before printing this email.
-

Jarrod Hadfield

Wed, Aug 13, 2014 7:09 AM #

Hi,

To do piecewise random regression forcing the two regression to `join'  
you could centre your covariate (Time) at the breakpoint. Then use:

random = ~us(1 + at.level(stage,1):Time+at.level(stage,2):Time):Individual

The first variance is the variance in the new intercepts (the value at  
the breakpoint) and the other two variances are the variances in  
slopes that emanate from either side of the breakpoint.

Bear in mind you will need a lot of data to get precise estimates from  
such a complex model, particularly if you want to break it up by sex  
too.

Cheers,

Jarrod




Quoting "PATRICK, Samantha" <spatrick at glos.ac.uk> on Wed, 13 Aug 2014  
13:53:58 +0000:

Hi Jarrod

Thanks for getting back to me.  By simplifying the data, I left out  
the fact that there is a temporal pattern (Time).  There is an  
interaction between Time * Stage on Presence.  During stage one,  
Presence increases linearly with Time; during stage two Presence  
decreases linearly with time.   Stage one and two are mutually  
exclusive: Time 1-5 is always stage 1 and Time 6-10 always stage 2.   
 This was why I was originally to fit a bivariate model, so I could  
calculate the covariance between the slopes.

Following the method you suggested  I have been able to progress  
towards this.  The following model, if I understand correctly, fits  
an intercept (stage; two level factor) and a two population slopes  
(One in stage 1 and one in stage 2) and then a random intercept and  
slope during stage one and another random intercept and slope during  
stage two.  The 4x4  covariance matrix estimates the covariance  
between both intercepts and both slopes.
### this prior is not optimised in anyway - it is just a starting point
prior2.1 <- list(G = list(G1 = list(V=diag(4), nu=2,  
alpha.mu=c(0,0,0,0), alpha.V=diag(4)*1000)),
                          R = list(V=1, fix=1))

## full model
model2.1 <- MCMCglmm(Presence~ stage +  at.level(stage,1):Time +  
at.level(stage,2):Time ,

             random = ~us(at.level(stage ,1)+ at.level(stage ,1):Time  +
                                 at.level(stage ,2) +at.level(stage  
,2):Time):Individual,

             rcov = ~units, family = "categorical",

              data = Data2, prior = prior2.1, verbose = FALSE, pr=T)

I then extended this model to allow the covariance structure to vary  
between the sexes:

##prior
prior2.2<- list(G = list(G1 = list(V=diag(8), nu=2,  
alpha.mu=c(0,0,0,0,0,0,0,0), alpha.V=diag(8)*1000)),
                          R = list(V=1, fix=1))

## full model
model2.2<- MCMCglmm(Presence~ stage +  at.level(stage,1):Time +  
at.level(stage,2):Time  ,
                     random = ~us(at.level(stage,1):at.level(Sex,1)+
                                    at.level(stage,1):at.level(Sex,2)+
                                    at.level(stage,1):at.level(Sex,1):Time +
                                    at.level(stage,1):at.level(Sex,2):Time  +
                                    at.level(stage,2):at.level(Sex,1)+
                                    at.level(stage,2):at.level(Sex,2)+
                                    at.level(stage,2):at.level(Sex,1):Time  +
                                     
at.level(stage,2):at.level(Sex,2):Time ):Individual,
                     rcov = ~units, family = "categorical",
                     data = Data2, prior = prior2.2, verbose = FALSE, pr=T)

So I am left with one question: In essence the data lends itself to  
a piecewise regression, such that the end point of the slope in  
stage one is the start point for stage two.  Is it possible to fit  
this at the individual level? By this I mean that the random slope  
for individual 1 at stage two begins where and the slope for stage  
one ends. I have struggled to find anyone doing this.

I have included the full models and simulated data below (runs in  
first model only) in case anyone else is working on this kind of  
problem.

Thanks

Sam


### Data


Time
        Presence        Stage    Individual
1       1       1       1
2       1       1       1
3       0       1       1
4       1       1       1
5       1       1       1
6       0       2       1
7       0       2       1
8       1       2       1
9       0       2       1
10      1       2       1
1       0       1       2
2       1       1       2
3       0       1       2
4       1       1       2
5       1       1       2
6       1       2       2
7       0       2       2
8       0       2       2
9       0       2       2
10      1       2       2
1       0       1       3
2       1       1       3
3       1       1       3
4       1       1       3
5       1       1       3
6       0       2       3
7       0       2       3
8       1       2       3
9       1       2       3
10      1       2       3
1       0       1       4
2       1       1       4
3       1       1       4
4       1       1       4
5       1       1       4
6       1       2       4
7       1       2       4
8       0       2       4
9       0       2       4
10      0       2       4
1       1       1       5
2       0       1       5
3       0       1       5
4       1       1       5
5       1       1       5
6       1       2       5
7       1       2       5
8       1       2       5
9       1       2       5
10      0       2       5


Dr Samantha Patrick
Research Fellow
Biosciences QU116
Francis Close Hall Campus
University of Gloucestershire
Cheltenham, GL50 4AZ, UK

Research Associate: OxNav, University of Oxford

******From 1st August - 14th November 2014 I will be
based in Montr?al, which is 5 hours behind GMT  ******

Tel: 07740 472 719
Skype: sammy_patrick
https://sites.google.com/site/samanthacpatrick/

From: Jarrod Hadfield<mailto:j.hadfield at ed.ac.uk>
Sent: ?Monday?, ?11? ?August? ?2014 ?14?:?47
To: Samantha Patrick<mailto:spatrick at glos.ac.uk>
Cc: r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>

Hi Sam,

One option would be

random = ~us(stage):Individual, rcov=~units

where the random term is a 2x2 covariance matrix (between individual
variances for each stage and the covariance between them). There is
only a single residual variance in my model - but this is OK, with
binary data it can't be estimated so there is no point trying to
estimates separate residual variances for each stage. You will need to
fix the residual variance at something though (I use 1).

If you only have Individual level covariates (i.e. no
observation-level covariates) then you could group your binary
responses into a binomial response and fit the model

random=NULL, rcov = ~us(stage):units

This will give (nearly) the same answers as the first model if you
rescale the (co)variances as described in the CourseNotes.  It will be
much faster too.

You might also want to consider models that deal with temporal
autocorrelation, but these are not implemented in MCMCglmm.

Cheers,

Jarrod
Quoting "PATRICK, Samantha" <spatrick at glos.ac.uk> on Mon, 11 Aug 2014
17:34:05 +0000:

Hi

I running a bivariate GLMM, where both of my response variables have
repeated measures.  A dummy data set would look like this:

Individual     Presence    Stage
1                      0                       1
1                      1                        1
1                      1                        2
1                      1                        2
2                     1                         1
2                     0                        1
2                     0                        1
2                     1                         2
2                     0                        2

There are a series of individuals.  For each individual we have
measures presence/absence repeatedly during life stage 1 and then
again repeatedly during life stage 2.

For a straight forward bivariate model, with a single measure per
individual (or repeated measures during only one life stage), the
data could be set up like this:

Individual      Presence stage 1          Presence stage 2

However because I have repeated measures for both stages I am
struggling to find out how to code the data so MCMCglmm can run.

Does anyone have any experience with this kind of data structure?  I
can?t find anything on the R list.

Many Thanks

Sam

Dr Samantha Patrick
Research Fellow
Biosciences QU116
Francis Close Hall Campus
University of Gloucestershire
Cheltenham, GL50 4AZ, UK

Research Associate: OxNav, University of Oxford

******From 1st August - 14th November 2014 I will be
based in Montr?al, which is 5 hours behind GMT  ******

Tel: 07740 472 719
Skype: sammy_patrick
https://sites.google.com/site/samanthacpatrick/

-
?In the top 5 in the Green League Table; committed to sustainability?
This email is confidential to the intended recipient. If you have
received it in error please notify the sender and delete it from
your computer.
The University of Gloucestershire is a company limited by guarantee
registered in England and Wales.  Registered number: 06023243.
Registered office: The Park, Cheltenham, GL50 2RH
Please consider the environment before printing this email.
-

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.