Skip to content

Request for help using a generalized linear mixed model in correlated data

2 messages · Maria Eugenia Castellanos, David Duffy

#
Dear R-sig-mixed-models group,

Happy new year and hope you are fine. I will appreciate a lot your help in
this analysis in which I have spent several weeks.

I am an epidemiologist, trying to estimate risk differences between the
risk of tuberculosis (TB) in  2 groups, contacts of TB cases and contacts
of community controls.

My sample is 1043 contacts of controls and 1002 contacts of cases.

The variable index_contact1  (exposure) indicates if an observation comes
from a contact of a TB case or it is a contact of a community control
(coded 1 and 0 respectively). I want to adjust by age  (enr_age, continuous
variable), sex (enr_sex coded 2=women, 1=men) and HIV status (hiv,
1=infected, 0=non infected).

My binary outcome is called ?tst? and it is coded as 1 (infected) or 0 (not
infected).

One of the reviewers asked me to account for the clustering of these
contacts. So I have a variable called enr_lnkid that indicates the ID of a
TB case (n=122) or a community control (n=124).

All my variables are coded as factors, except for tst and for enr_age,
which they are coded as numeric. I had to write tst as a numeric variable
in order to work with the regression models.

First, I used a GEE model and it worked:

allnoc <- gee(tst ~ index_contact1 + enr_age + enr_sex + hiv, id=netid,
family=binomial ('identity'), corstr = "exchangeable", data = c3)

summary(allnoc)

GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA

gee S-function, version 4.13 modified 98/01/27 (1998)



Model:

Link:                      Identity

 Variance to Mean Relation: Binomial

 Correlation Structure:     Exchangeable



Call:

gee(formula = tst ~ index_contact1 + enr_age + enr_sex + hiv,

    id = netid, data = c3, family = binomial("identity"), corstr =
"exchangeable")



Summary of Residuals:

       Min         1Q     Median         3Q        Max

-0.9229960 -0.4073128 -0.2338467  0.5127307  0.8636212





Coefficients:

                  Estimate   Naive S.E.    Naive z  Robust S.E.   Robust z

(Intercept)      0.1991535 0.0254609144  7.8219323 0.0254041176  7.8394200

index_contact11  0.1600779 0.0208047552  7.6942957 0.0210721716  7.5966513

enr_age          0.0086733 0.0007726673 11.2251418 0.0009238111  9.3886077

enr_sex2        -0.1061413 0.0208703279 -5.0857503 0.0206749937 -5.1337997

hiv1            -0.0393979 0.0414949448 -0.9494627 0.0418152072 -0.9421908



Estimated Scale Parameter:  1.002437

Number of Iterations:  3



Working Correlation

     [,1]

[1,]    1
The result was 0.16 (16%), which is very close to the crude risk difference
(approx.. 14%).

I am trying to replicate my findings, using a generalized linear mixed
model, so I use this code:

  Model1 <- glmer(tst ~  index_contact1 + enr_age + enr_sex + hiv +  (1 |
enr_lnkid), family=binomial, data = c3, control = glmerControl(optimizer =
"bobyqa"), nAGQ=0)



And it works



Generalized linear mixed model fit by maximum likelihood (Adaptive
Gauss-Hermite Quadrature, nAGQ = 0) ['glmerMod']

 Family: binomial  ( logit )

Formula: tst ~ index_contact1 + enr_age + enr_sex + hiv + (1 | enr_lnkid)

   Data: c3

      AIC       BIC    logLik  deviance  df.resid

 2584.565  2618.304 -1286.283  2572.565      2039

Random effects:

 Groups    Name        Std.Dev.

 enr_lnkid (Intercept) 0.7106

Number of obs: 2045, groups:  enr_lnkid, 246

Fixed Effects:

    (Intercept)  index_contact11          enr_age         enr_sex2
        hiv1

       -1.39891          0.75106          0.04386         -0.48146
    -0.35127





But my results come as link=logit whereas I need to have link=identity to
get risk differences.





I tried then this, using Poisson as the family binomial did not accept
link=identity:



#identity

  Model2 <- glmer(tst ~  index_contact1 + enr_age + enr_sex + hiv + (1 |
enr_lnkid), family=poisson(link=identity), data = c3, control =
glmerControl(optimizer = "bobyqa"), nAGQ=0)





But I get this error:



Error in (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L,
maxit = 100L,  :

  (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate





I want to ask two things:



   1. Is there a way that I can convert the results for model 1 from
   link=logit to link=identity?
   2. Or how I can solve these error with the Poisson family?



Thanks for the help you can provide me, thank you!

Maria Eugenia Castellanos





Global Health Institute

College of Public Health

University of Georgia

Athens, GA, USA
#
Hi Maria.

The nature of the GEE (ie marginal) model means that it should agree with the "naive" model ignoring clustering.

One way you can use your logistic-normal GLMM is to predict risk for individuals with comparable covariate values using 
predict(mod, type="response"), and calculate the resulting risk difference. Or take your log odds ratio and apply it to a given base rate - 
the hypothesis testing done using the logistic link is correctly allowing for the clustering.  

Recall that the different links will entail different distributions for the cluster means and correlation - ie a logistic link might be more appropriate for the biology generating your data.See also the zoo of alternatives for glmmTMB (beta, beta-binomial, negative binomial etc) that you could send another few weeks on. 

Cheers, David Duffy.