An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20110107/2c927cf7/attachment.pl>
Mixed-model-binary logistic model with dependence between individual repeated measures
10 messages · Anna Ekman, Martin Maechler, Ben Pelzer +4 more
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 11-01-07 06:59 AM, Anna Ekman wrote:
Hi, I am a novice R user and do not know how to properly mail to this list. I apologies if I do it in the wrong way. I want to analyze my data using a random intercept (later extended also to random slope) logistic model for a binary outcome (later extended to a ordinal outcome). This I have been able to do in SAS if assuming the repeated measurements within an individual to be independent, but I want to be able to choose different covariance structures for the individual measurements. This I cannot do directly in either SAS or STATA, and therefore now turn to R. How can I do this in R? Anna
I'm surprised that you can't do this in SAS (PROC MIXED, NLMIXED, or GLIMMIX?) or Stata <http://www.gllamm.org/>, but: if you want to do it in R, your choices are glmmPQL in the MASS package or possibly one of the generalized estimating equation packages (geese, geepack?) I would recommend the following references for getting started: Zuur et al Mixed models (Springer) Pinheiro and Bates 2000 (Springer), especially the material on temporal autocorrelation models Extending to a ordinal outcome with temporal autocorrelation could be tricky ... -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0nNJsACgkQc5UpGjwzenMymACfbqx+gtOjyhoX9hHpPyO/vbVg NXUAniLvM+8voyzKA6axKLyJLclqeBhY =wBwK -----END PGP SIGNATURE-----
Ben Bolker, thank you for your suggestions. Yes, it is suprising that I in SAS and STATA have to assume independence between the measurements within an individual. I do not want to assume that. In addition I would like to be able to chose other distributions than the normal for my random effect, which is not possible in SAS (proc NLMIXED). The generalized estimating equation packages are probably not an option as I do not whant marginal models. I will look at the references you suggested. Thank you. /Anna ___________________________________ Anna Ekman (Grimby-Ekman) PhLic Statistics, PhD (Dr Med Sci) Occupational and Environmental Medicine Sahlgrenska Academy and University Hospital Box 414, SE - 405 30 Goteborg, Sweden Phone +46 (0)31 786 31 23 www.amm.se Akademistatistik inom EpiStat Sahlgrenska akademin vid G?teborgs universitet www.sahlgrenska.gu.se/forskning/epistat/ ________________________________________ Fr?n: Ben Bolker [bbolker at gmail.com] Skickat: den 7 januari 2011 16:43 Till: Anna Ekman Kopia: r-sig-mixed-models at r-project.org ?mne: Re: [R-sig-ME] Mixed-model-binary logistic model with dependence between individual repeated measures -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 11-01-07 06:59 AM, Anna Ekman wrote:
Hi, I am a novice R user and do not know how to properly mail to this list. I apologies if I do it in the wrong way. I want to analyze my data using a random intercept (later extended also to random slope) logistic model for a binary outcome (later extended to a ordinal outcome). This I have been able to do in SAS if assuming the repeated measurements within an individual to be independent, but I want to be able to choose different covariance structures for the individual measurements. This I cannot do directly in either SAS or STATA, and therefore now turn to R. How can I do this in R? Anna
I'm surprised that you can't do this in SAS (PROC MIXED, NLMIXED, or GLIMMIX?) or Stata <http://www.gllamm.org/>, but: if you want to do it in R, your choices are glmmPQL in the MASS package or possibly one of the generalized estimating equation packages (geese, geepack?) I would recommend the following references for getting started: Zuur et al Mixed models (Springer) Pinheiro and Bates 2000 (Springer), especially the material on temporal autocorrelation models Extending to a ordinal outcome with temporal autocorrelation could be tricky ... -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0nNJsACgkQc5UpGjwzenMymACfbqx+gtOjyhoX9hHpPyO/vbVg NXUAniLvM+8voyzKA6axKLyJLclqeBhY =wBwK -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 11-01-07 11:35 AM, Anna Ekman wrote:
Ben Bolker, thank you for your suggestions. Yes, it is suprising that I in SAS and STATA have to assume independence between the measurements within an individual.
It's fundamentally a bit hard to specify correlation among individuals in a non-normal model. One option is to go completely to the marginal specification (which you said you don't want to do); probably the most sensible statistical formulation is (fixed effects) eta0 = X*beta (random effects) eta1 ~ MVN(mu=X*beta,Sigma=(something sensible such as AR(1) within individuals)) y ~ Bernoulli(eta1) i.e., a hierarchical model with a multivariate normal correlated distribution at the 'lower level', with a level of Bernoulli variation on top of that. The correlation parameters of eta1 will not correspond to the actual correlations among the measurements (which will be smaller due to the extra variation coming from the Bernoulli sampling) I do not
want to assume that. In addition I would like to be able to chose other distributions than the normal for my random effect, which is not possible in SAS (proc NLMIXED).
It's not possible in R either as far as I know. The generalized estimating
equation packages are probably not an option as I do not whant marginal models. I will look at the references you suggested. Thank you. /Anna
If you want a non-marginal model with non-normal random effects and
within-individual correlation structures other than compound symmetry
(i.e. simple block structures), you are probably going to have to
construct your own solution with WinBUGS or AD Model Builder or ... ? If
you're lucky, MCMCglmm may be able to do what you want -- I would check
it out. (Molenbergh and Verbeke's book on longitudinal models describes
approaches for non-normal random effects, but in the context of LMMs
(i.e. normally distributed errors) -- they may have done something to
extend this stuff to GLMMs more recently. It's possible that someone
out there has done what you want and encapsulated it into a canned
package, but I doubt it.
cheers
Ben Bolker
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk0nRBsACgkQc5UpGjwzenPr6wCfb7wixKYRGfglmge7Ejg0+i26
k5QAoJpuc+QiWC8iaMXC6aDnW75GfrYE
=292C
-----END PGP SIGNATURE-----
Ben Bolker <bbolker at gmail.com>
on Fri, 07 Jan 2011 11:49:31 -0500 writes:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> On 11-01-07 11:35 AM, Anna Ekman wrote:
>> Ben Bolker, thank you for your suggestions.
>>
>> Yes, it is suprising that I in SAS and STATA have to assume
>> independence between the measurements within an individual.
> It's fundamentally a bit hard to specify correlation among individuals
> in a non-normal model. One option is to go completely to the marginal
> specification (which you said you don't want to do); probably the most
> sensible statistical formulation is
> (fixed effects) eta0 = X*beta
> (random effects) eta1 ~ MVN(mu=X*beta,Sigma=(something sensible such
> as AR(1) within individuals))
> y ~ Bernoulli(eta1)
Interesting... {I've been "taught" in the past that correlation
specification for non-normal, i.e. GLME models,
would not make sense / be possible,
something you do not seem to support ...
}
Does the above mean {slight changes}
(fixed effects) eta0 = X*beta
(random effects) eta1 ~ MVN(0, Sigma=(something sensible such
as AR(1) within individuals))
(Y | X,eta1) ~ Bernoulli( logit(eta0 + eta1) )
??
> i.e., a hierarchical model with a multivariate normal correlated
> distribution at the 'lower level', with a level of Bernoulli variation
> on top of that.
> The correlation parameters of eta1 will not correspond to the actual
> correlations among the measurements (which will be smaller due to the
> extra variation coming from the Bernoulli sampling)
> I do not
>> want to assume that. In addition I would like to be able to chose
>> other distributions than the normal for my random effect, which is
>> not possible in SAS (proc NLMIXED).
> It's not possible in R either as far as I know.
> The generalized estimating
>> equation packages are probably not an option as I do not whant
>> marginal models. I will look at the references you suggested. Thank
>> you. /Anna
>>
> If you want a non-marginal model with non-normal random effects and
> within-individual correlation structures other than compound symmetry
> (i.e. simple block structures), you are probably going to have to
> construct your own solution with WinBUGS or AD Model Builder or ... ? If
> you're lucky, MCMCglmm may be able to do what you want -- I would check
> it out. (Molenbergh and Verbeke's book on longitudinal models describes
> approaches for non-normal random effects, but in the context of LMMs
> (i.e. normally distributed errors) -- they may have done something to
> extend this stuff to GLMMs more recently. It's possible that someone
> out there has done what you want and encapsulated it into a canned
> package, but I doubt it.
> cheers
> Ben Bolker
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> iEYEARECAAYFAk0nRBsACgkQc5UpGjwzenPr6wCfb7wixKYRGfglmge7Ejg0+i26
> k5QAoJpuc+QiWC8iaMXC6aDnW75GfrYE
> =292C
> -----END PGP SIGNATURE-----
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20110107/bee658c3/attachment.pl>
On Fri, 2011-01-07 at 11:49 -0500, Ben Bolker wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
<snip />
I do not
want to assume that. In addition I would like to be able to chose other distributions than the normal for my random effect, which is not possible in SAS (proc NLMIXED).
It's not possible in R either as far as I know.
I was reading the article in the latest issue of the R Journal on the hglm package, and although I was only giving it a cursory scan over lunch it looks like it might be able to fit the sort of model implied here; random effects distributed as a member of the exponential family. G
The generalized estimating
equation packages are probably not an option as I do not whant marginal models. I will look at the references you suggested. Thank you. /Anna
If you want a non-marginal model with non-normal random effects and
within-individual correlation structures other than compound symmetry
(i.e. simple block structures), you are probably going to have to
construct your own solution with WinBUGS or AD Model Builder or ... ? If
you're lucky, MCMCglmm may be able to do what you want -- I would check
it out. (Molenbergh and Verbeke's book on longitudinal models describes
approaches for non-normal random effects, but in the context of LMMs
(i.e. normally distributed errors) -- they may have done something to
extend this stuff to GLMMs more recently. It's possible that someone
out there has done what you want and encapsulated it into a canned
package, but I doubt it.
cheers
Ben Bolker
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk0nRBsACgkQc5UpGjwzenPr6wCfb7wixKYRGfglmge7Ejg0+i26
k5QAoJpuc+QiWC8iaMXC6aDnW75GfrYE
=292C
-----END PGP SIGNATURE-----
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Hi All, I think one issue is going to be constraining the within-individual residual structure to a correlation matrix, since the diagonal elements of the covariance matrix cannot be estimated from binary data. A residual structure of the form rcov=~cor(time):units does this in MCMCglmm, but I would not advocate it if there are many time points because there will be N*(N-1)/2 parameters, rather than 1 parameter with an appropriate AR structure. N is the number of time points. Cheers, Jarrod Quoting Gavin Simpson <gavin.simpson at ucl.ac.uk>:
On Fri, 2011-01-07 at 11:49 -0500, Ben Bolker wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
<snip />
I do not
want to assume that. In addition I would like to be able to chose other distributions than the normal for my random effect, which is not possible in SAS (proc NLMIXED).
It's not possible in R either as far as I know.
I was reading the article in the latest issue of the R Journal on the hglm package, and although I was only giving it a cursory scan over lunch it looks like it might be able to fit the sort of model implied here; random effects distributed as a member of the exponential family. G
The generalized estimating
equation packages are probably not an option as I do not whant marginal models. I will look at the references you suggested. Thank you. /Anna
If you want a non-marginal model with non-normal random effects and
within-individual correlation structures other than compound symmetry
(i.e. simple block structures), you are probably going to have to
construct your own solution with WinBUGS or AD Model Builder or ... ? If
you're lucky, MCMCglmm may be able to do what you want -- I would check
it out. (Molenbergh and Verbeke's book on longitudinal models describes
approaches for non-normal random effects, but in the context of LMMs
(i.e. normally distributed errors) -- they may have done something to
extend this stuff to GLMMs more recently. It's possible that someone
out there has done what you want and encapsulated it into a canned
package, but I doubt it.
cheers
Ben Bolker
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk0nRBsACgkQc5UpGjwzenPr6wCfb7wixKYRGfglmge7Ejg0+i26
k5QAoJpuc+QiWC8iaMXC6aDnW75GfrYE
=292C
-----END PGP SIGNATURE-----
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 11-01-07 12:48 PM, Martin Maechler wrote:
Ben Bolker <bbolker at gmail.com>
on Fri, 07 Jan 2011 11:49:31 -0500 writes:
On 11-01-07 11:35 AM, Anna Ekman wrote:
>> Ben Bolker, thank you for your suggestions.
>>
>> Yes, it is suprising that I in SAS and STATA have to assume
>> independence between the measurements within an individual.
It's fundamentally a bit hard to specify correlation among individuals in a non-normal model. One option is to go completely to the marginal specification (which you said you don't want to do); probably the most sensible statistical formulation is (fixed effects) eta0 = X*beta (random effects) eta1 ~ MVN(mu=X*beta,Sigma=(something sensible such as AR(1) within individuals)) y ~ Bernoulli(eta1)
Interesting... {I've been "taught" in the past that correlation
specification for non-normal, i.e. GLME models,
would not make sense / be possible,
something you do not seem to support ...
}
Hmm. I agree that if you just specify the correlation structure of the residuals (e.g. by using a corStruct in glmmPQL), it's not clear what the underlying statistical model would be, or whether/when it would make sense.
Does the above mean {slight changes}
(fixed effects) eta0 = X*beta (random effects) eta1 ~ MVN(0, Sigma=(something sensible such as AR(1) within individuals))
(Y | X,eta1) ~ Bernoulli( logit(eta0 + eta1) )
Yes, that's more or less what I meant, although I forgot the inverse link function entirely and you put the link function (logit) where I think you wanted the inverse link function (logistic) ...
??
i.e., a hierarchical model with a multivariate normal correlated distribution at the 'lower level', with a level of Bernoulli variation on top of that.
[snip] -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0njsIACgkQc5UpGjwzenP3qQCePoKzRPzFcNhBUmBUO/ZnZWxp RE4An1QaoLlzrvFecMkYjnYb54dp7nzC =jt4M -----END PGP SIGNATURE-----
On Fri, 7 Jan 2011, Martin Maechler wrote:
Ben Bolker <bbolker at gmail.com>
on Fri, 07 Jan 2011 11:49:31 -0500 writes:
> -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1
> On 11-01-07 11:35 AM, Anna Ekman wrote:
>> Ben Bolker, thank you for your suggestions. >> >> Yes, it is suprising that I in SAS and STATA have to assume >> independence between the measurements within an individual.
> It's fundamentally a bit hard to specify correlation among individuals > in a non-normal model. One option is to go completely to the marginal > specification (which you said you don't want to do); probably the most > sensible statistical formulation is
> (fixed effects) eta0 = X*beta > (random effects) eta1 ~ MVN(mu=X*beta,Sigma=(something sensible such > as AR(1) within individuals)) > y ~ Bernoulli(eta1)
Interesting... {I've been "taught" in the past that correlation
specification for non-normal, i.e. GLME models,
would not make sense / be possible,
something you do not seem to support ...
}
Does the above mean {slight changes}
(fixed effects) eta0 = X*beta
(random effects) eta1 ~ MVN(0, Sigma=(something sensible such
as AR(1) within individuals))
(Y | X,eta1) ~ Bernoulli( logit(eta0 + eta1) )
With the probit link, such dichotomous and ordinal variable mixed models have a long history in genetics and psychometrics. In the latter case, factor analysis and path analysis of tetrachoric/polychoric correlations is completely equivalent to the probit-normal, although GLS/WLS was often used for computational reasons. We used to do all this in LISREL. For the case of varying numbers of observations per individual (and other irregular data types), you can use the "multiple groups" approach, where you specify a covariance matrix of the right size for each pattern of data, and constrain the correlations equal in the different groups. Since the main interest is in the correlations between latent variables, all hypotheses and estimates are usually framed at that "level" of the model. In the genetic situation, for example, we might estimate the heritability of a dichotomous trait based on family data under a polygenic model as being 1/2 the sibling tetrachoric correlation. Model criticism is done by comparing predicted risk to different degrees of relations of an affected individual, or set of affected relatives. Practically, this was used for genetic counselling etc. In the current era of genome wide association studies, a key question is the "missing heritability", ie amount of familial aggregation of diseases unexplained by gene variants with detectable effect: the case control studies have N=30000. Some of the arguments hinge on what kind of link function is used in the theoretical model. Sorry, I couldn't resist ;)
| David Duffy (MBBS PhD) ,-_|\ | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v