Skip to content

Mixed-model-binary logistic model with dependence between individual repeated measures

10 messages · Anna Ekman, Martin Maechler, Ben Pelzer +4 more

#
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 11-01-07 06:59 AM, Anna Ekman wrote:
I'm surprised that you can't do this in SAS (PROC MIXED, NLMIXED, or
GLIMMIX?) or Stata <http://www.gllamm.org/>, but: if you want to do it
in R, your choices are glmmPQL in the MASS package or possibly one of
the generalized estimating equation packages (geese, geepack?)  I would
recommend the following references for getting started:

  Zuur et al Mixed models (Springer)
  Pinheiro and Bates 2000 (Springer), especially the material on
temporal autocorrelation models

  Extending to a ordinal outcome with temporal autocorrelation could be
tricky ...




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk0nNJsACgkQc5UpGjwzenMymACfbqx+gtOjyhoX9hHpPyO/vbVg
NXUAniLvM+8voyzKA6axKLyJLclqeBhY
=wBwK
-----END PGP SIGNATURE-----
#
Ben Bolker, thank you for your suggestions.

Yes, it is suprising that I in SAS and STATA have to assume independence between the measurements within an individual. I do not want to assume that. In addition I would like to be able to chose other distributions than the normal for my random effect, which is not possible in SAS (proc NLMIXED).
The generalized estimating equation packages are probably not an option as I do not whant marginal models.
I will look at the references you suggested.
Thank you.
/Anna


___________________________________
Anna Ekman  (Grimby-Ekman)
PhLic Statistics, PhD (Dr Med Sci)
Occupational and Environmental Medicine
Sahlgrenska Academy and University Hospital
Box 414, SE - 405 30 Goteborg, Sweden
Phone +46 (0)31 786 31 23
www.amm.se

Akademistatistik inom EpiStat
Sahlgrenska akademin vid G?teborgs universitet
www.sahlgrenska.gu.se/forskning/epistat/

________________________________________
Fr?n: Ben Bolker [bbolker at gmail.com]
Skickat: den 7 januari 2011 16:43
Till: Anna Ekman
Kopia: r-sig-mixed-models at r-project.org
?mne: Re: [R-sig-ME] Mixed-model-binary logistic model with dependence between individual repeated measures

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 11-01-07 06:59 AM, Anna Ekman wrote:
I'm surprised that you can't do this in SAS (PROC MIXED, NLMIXED, or
GLIMMIX?) or Stata <http://www.gllamm.org/>, but: if you want to do it
in R, your choices are glmmPQL in the MASS package or possibly one of
the generalized estimating equation packages (geese, geepack?)  I would
recommend the following references for getting started:

  Zuur et al Mixed models (Springer)
  Pinheiro and Bates 2000 (Springer), especially the material on
temporal autocorrelation models

  Extending to a ordinal outcome with temporal autocorrelation could be
tricky ...




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk0nNJsACgkQc5UpGjwzenMymACfbqx+gtOjyhoX9hHpPyO/vbVg
NXUAniLvM+8voyzKA6axKLyJLclqeBhY
=wBwK
-----END PGP SIGNATURE-----
#
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 11-01-07 11:35 AM, Anna Ekman wrote:
It's fundamentally a bit hard to specify correlation among individuals
in a non-normal model. One option is to go completely to the marginal
specification (which you said you don't want to do); probably the most
sensible statistical formulation is

  (fixed effects)  eta0 = X*beta
  (random effects) eta1 ~ MVN(mu=X*beta,Sigma=(something sensible such
as AR(1) within individuals))
   y ~ Bernoulli(eta1)

 i.e., a hierarchical model with a multivariate normal correlated
distribution at the 'lower level', with a level of Bernoulli variation
on top of that.

  The correlation parameters of eta1 will not correspond to the actual
correlations among the measurements (which will be smaller due to the
extra variation coming from the Bernoulli sampling)

 I do not
It's not possible in R either as far as I know.


The generalized estimating
If you want a non-marginal model with non-normal random effects and
within-individual correlation structures other than compound symmetry
(i.e. simple block structures), you are probably going to have to
construct your own solution with WinBUGS or AD Model Builder or ... ? If
you're lucky, MCMCglmm may be able to do what you want -- I would check
it out. (Molenbergh and Verbeke's book on longitudinal models describes
approaches for non-normal random effects, but in the context of LMMs
(i.e. normally distributed errors) -- they may have done something to
extend this stuff to GLMMs more recently.  It's possible that someone
out there has done what you want and encapsulated it into a canned
package, but I doubt it.

   cheers
    Ben Bolker
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk0nRBsACgkQc5UpGjwzenPr6wCfb7wixKYRGfglmge7Ejg0+i26
k5QAoJpuc+QiWC8iaMXC6aDnW75GfrYE
=292C
-----END PGP SIGNATURE-----
#
> -----BEGIN PGP SIGNED MESSAGE-----
    > Hash: SHA1
> On 11-01-07 11:35 AM, Anna Ekman wrote:
>> Ben Bolker, thank you for your suggestions.
    >> 
    >> Yes, it is suprising that I in SAS and STATA have to assume
    >> independence between the measurements within an individual.

    > It's fundamentally a bit hard to specify correlation among individuals
    > in a non-normal model. One option is to go completely to the marginal
    > specification (which you said you don't want to do); probably the most
    > sensible statistical formulation is

    > (fixed effects)  eta0 = X*beta
    > (random effects) eta1 ~ MVN(mu=X*beta,Sigma=(something sensible such
    > as AR(1) within individuals))
    > y ~ Bernoulli(eta1)

Interesting... {I've been "taught" in the past that  correlation
                specification for non-normal, i.e. GLME models,
		would not make sense /  be possible,
		something you do not seem to support ...
}

Does the above mean {slight changes}

 (fixed effects)  eta0 = X*beta
 (random effects) eta1 ~ MVN(0, Sigma=(something sensible such
				       as AR(1) within individuals))

  (Y | X,eta1)  ~ Bernoulli( logit(eta0 + eta1) )


??

    > i.e., a hierarchical model with a multivariate normal correlated
    > distribution at the 'lower level', with a level of Bernoulli variation
    > on top of that.

    > The correlation parameters of eta1 will not correspond to the actual
    > correlations among the measurements (which will be smaller due to the
    > extra variation coming from the Bernoulli sampling)

    > I do not
    >> want to assume that. In addition I would like to be able to chose
    >> other distributions than the normal for my random effect, which is
    >> not possible in SAS (proc NLMIXED). 

    > It's not possible in R either as far as I know.


    > The generalized estimating
    >> equation packages are probably not an option as I do not whant
    >> marginal models. I will look at the references you suggested. Thank
    >> you. /Anna
    >> 

    > If you want a non-marginal model with non-normal random effects and
    > within-individual correlation structures other than compound symmetry
    > (i.e. simple block structures), you are probably going to have to
    > construct your own solution with WinBUGS or AD Model Builder or ... ? If
    > you're lucky, MCMCglmm may be able to do what you want -- I would check
    > it out. (Molenbergh and Verbeke's book on longitudinal models describes
    > approaches for non-normal random effects, but in the context of LMMs
    > (i.e. normally distributed errors) -- they may have done something to
    > extend this stuff to GLMMs more recently.  It's possible that someone
    > out there has done what you want and encapsulated it into a canned
    > package, but I doubt it.

    > cheers
    > Ben Bolker
    > -----BEGIN PGP SIGNATURE-----
    > Version: GnuPG v1.4.10 (GNU/Linux)
    > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

    > iEYEARECAAYFAk0nRBsACgkQc5UpGjwzenPr6wCfb7wixKYRGfglmge7Ejg0+i26
    > k5QAoJpuc+QiWC8iaMXC6aDnW75GfrYE
    > =292C
    > -----END PGP SIGNATURE-----

    > _______________________________________________
    > R-sig-mixed-models at r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
On Fri, 2011-01-07 at 11:49 -0500, Ben Bolker wrote:
<snip />
I was reading the article in the latest issue of the R Journal on the
hglm package, and although I was only giving it a cursory scan over
lunch it looks like it might be able to fit the sort of model implied
here; random effects distributed as a member of the exponential family.

G

  
    
#
Hi All,

I think one issue is going to be constraining the within-individual  
residual structure to a correlation matrix, since the diagonal  
elements of the covariance matrix cannot be estimated from binary  
data. A residual structure of the form rcov=~cor(time):units  does  
this in MCMCglmm, but I would not advocate it if there are many time  
points because there will be N*(N-1)/2 parameters, rather than 1  
parameter with an appropriate AR structure. N is the number of time  
points.

Cheers,

Jarrod



Quoting Gavin Simpson <gavin.simpson at ucl.ac.uk>:

  
    
#
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 11-01-07 12:48 PM, Martin Maechler wrote:
Hmm.  I agree that if you just specify the correlation structure of
the residuals (e.g. by using a corStruct in glmmPQL), it's not clear
what the underlying statistical model would be, or whether/when it would
make sense.
Yes, that's more or less what I meant, although I forgot the inverse
link function entirely and you put the link function (logit) where I
think you wanted the inverse link function (logistic) ...
[snip]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk0njsIACgkQc5UpGjwzenP3qQCePoKzRPzFcNhBUmBUO/ZnZWxp
RE4An1QaoLlzrvFecMkYjnYb54dp7nzC
=jt4M
-----END PGP SIGNATURE-----
#
On Fri, 7 Jan 2011, Martin Maechler wrote:

            
With the probit link, such dichotomous and ordinal variable mixed models 
have a long history in genetics and psychometrics.  In the latter case, 
factor analysis and path analysis of tetrachoric/polychoric correlations 
is completely equivalent to the probit-normal, although GLS/WLS was often 
used for computational reasons.  We used to do all this in LISREL.  For 
the case of varying numbers of observations per individual (and other 
irregular data types), you can use the "multiple groups" approach, where 
you specify a covariance matrix of the right size for each pattern of 
data, and constrain the correlations equal in the different groups. 
Since the main interest is in the correlations between latent variables, 
all hypotheses and estimates are usually framed at that "level" of the 
model.

In the genetic situation, for example, we might estimate the heritability 
of a dichotomous trait based on family data under a polygenic model as 
being 1/2 the sibling tetrachoric correlation.  Model criticism is done by 
comparing predicted risk to different degrees of relations of an affected 
individual, or set of affected relatives.  Practically, this was used for 
genetic counselling etc.  In the current era of genome wide association 
studies, a key question is the "missing heritability", ie amount of 
familial aggregation of diseases unexplained by gene variants with 
detectable effect: the case control studies have N=30000.  Some of the 
arguments hinge on what kind of link function is used in the theoretical 
model.

Sorry, I couldn't resist ;)