Precision about the glmer model for Bernoulli variables
Thanks for all these details about correlation. Just to be clear about the nomenclature: when a model is called "A-B", A stands for the cumulative distribution function used as the link between pi and the linear predictor including the random effect(s) (or its reciprocal), and B stands for the distribution of the random effect(s). Is this right?

If this is right, then for the "double binomial" it would mean that the link function is a step function, and not a strictly increasing monotone function. Since it would then not be a bijection between R and [0,1], doesn't it introduce indeterminacy and strong constraints on the possible values of pi? I guess I am misinterpreting what "double binomial" means.

Sorry for these naive questions, I am discovering the field...

Best regards,
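(On the bijection point: the usual logit link is indeed a strictly increasing bijection between R and the open interval (0,1), not the closed [0,1]. A quick numerical check, written in Python for illustration with scipy's expit/logit pair:)

```python
import numpy as np
from scipy.special import expit, logit

x = np.linspace(-10, 10, 1001)
p = expit(x)                       # inverse logit maps R into (0, 1)

assert np.all(np.diff(p) > 0)      # strictly increasing on this grid
assert np.allclose(logit(p), x)    # logit recovers x, so the map is invertible
print(p.min(), p.max())            # both strictly inside (0, 1)
```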
On Tue, Apr 21, 2020 at 07:59:48AM +0000, John Maindonald wrote:
Actually, with a suitable parameterization, the betabinomial allows small negative correlations. See: Prentice, R. L. (1986), "Binary Regression Using an Extended Beta-Binomial Distribution, With Discussion of Correlation Induced by Covariate Measurement Errors", Journal of the American Statistical Association, Vol. 81, No. 394, pp. 321-327.

A feature of the betabinomial (BB), which ought to be better advertised than it is in most accounts that I have seen, is that, for positive correlation rho, it sets a strict lower limit of pi(1-pi)(1+rho) on the variance of the estimate of the proportion pi. This is in marked contrast to the variance assumptions that define quasibinomial errors.

The gamlss package implements the double binomial (as well as the logistic normal and the BB), but as far as I can tell, not in a multi-level model context. The double binomial does allow a negative correlation.

John Maindonald
email: john.maindonald at anu.edu.au

On 21/04/2020, at 19:18, David Duffy <David.Duffy at qimrberghofer.edu.au> wrote:

You'd know that the beta-binomial constrains the range of the correlation coefficient to be positive, but the logit-normal and probit-normal allow a negative correlation, with a lower bound constrained by N in the case of regular-sized clusters. For your example of 2x2 tables, the correlation can go from -1 to 1. This is the usual genetics-type setup, where you model the pairwise correlations for all pairs of observations in the dataset versus their (pairwise) genetic relatedness. In multi-trait models, the correlations between different phenotypes (say asthma and diabetes) in the same individual can be negative, so you can estimate negative variance components. The pedigreemm package extends lme4 to allow GLMMs for pedigree data (see the mastitis example for a binary trait).

Cheers, David Duffy.
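(The variance inflation John describes for the exchangeable, positive-rho case is easy to see by simulation. A minimal sketch in Python; the Beta(a, b) mixing parameters and cluster size n are arbitrary choices for illustration, and Prentice's extended parameterization allowing negative rho is not shown:)

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, n = 2.0, 3.0, 10            # Beta(a, b) mixing distribution, clusters of size n
pi = a / (a + b)                  # marginal success probability
rho = 1.0 / (a + b + 1.0)         # intra-cluster correlation of the beta-binomial

# Draw beta-binomial counts: p_i ~ Beta(a, b), then Y_i ~ Binomial(n, p_i)
p = rng.beta(a, b, size=200_000)
y = rng.binomial(n, p)

# The beta-binomial variance is inflated over the plain binomial by 1 + (n-1)*rho
var_bb = n * pi * (1 - pi) * (1 + (n - 1) * rho)
var_binom = n * pi * (1 - pi)
print(y.var(), var_bb, var_binom)  # empirical variance tracks var_bb, not var_binom
```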
________________________________________
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Vaida, Florin <fvaida at health.ucsd.edu>
Sent: Tuesday, 21 April 2020 7:05:40 AM
To: Emmanuel Curis
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Precision about the glmer model for Bernoulli variables

Hi Emmanuel,

That's a good question. My guess is that the correlation is non-negative in general, but I wasn't able to prove it theoretically even in the simplest case, where Y1, Y2 ~ Bernoulli(u) are independent conditionally on u, and u ~ Normal(0, 1). I am curious if someone has a solution. We can't go too far down this route in this forum, since Doug wants to keep it applied.

Florin

On Apr 20, 2020, at 12:32 PM, Emmanuel Curis <emmanuel.curis at parisdescartes.fr> wrote:

Hi Florin,

Thanks for the answer, the clarification about p(i,j), and the reference. A last question, which I forgot in my message: is the obtained correlation also always positive, as in the linear case? Or may some negative correlation appear, depending on the values of pi(i,j) and pi(i,j')?

Best regards,

On Mon, Apr 20, 2020 at 03:27:39PM +0000, Vaida, Florin wrote:

Hi Emmanuel,

Your reasoning is correct.

As a quibble, outside a simple repeated-measures experiment setup, p(i,j) *does* depend on j. For example, if observations are collected over time, generally there is a time effect; if they are repeated measures with different experimental conditions, p(i,j) will depend on the condition j, etc.

There is almost certainly no closed-form solution for the covariance under the logit. I am not sure about the probit (my guess is not). There will be some Laplace approximations available, a la Breslow and Clayton 1993.
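(Florin's simplest case is easy to probe by simulation. A minimal Monte Carlo sketch in Python, assuming a logit link to map u into a probability, since Bernoulli(u) needs a value in (0,1); this is not a proof, but the sample correlation comes out clearly positive:)

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)
N = 1_000_000

# Shared random effect u ~ Normal(0, 1); conditionally on u,
# Y1 and Y2 are independent Bernoulli with success probability expit(u)
u = rng.normal(0.0, 1.0, N)
p = expit(u)                      # inverse logit keeps the probability in (0, 1)
y1 = rng.binomial(1, p)
y2 = rng.binomial(1, p)

r = np.corrcoef(y1, y2)[0, 1]
print(r)                          # clearly positive
```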
I'd be curious if these formulas/approximations were developed somewhere; I'd be surprised if they weren't.

Florin

> On Apr 20, 2020, at 12:48 AM, Emmanuel Curis <emmanuel.curis at parisdescartes.fr> wrote:
>
> Hello everyone,
>
> I hope you're all doing fine in these difficult times.
>
> I tried to understand in detail the exact model used when using glmer
> for a Bernoulli experiment, by comparison with the linear mixed
> effects model, and especially how it introduces correlations between
> observations of a given group. I think I finally got it, but could
> you check that what I write below is correct and that I'm not missing
> something?
>
> I use a very simple case with only a single random effect, and no
> fixed effects, because I guess that adding fixed effects or other
> random effects does not change the idea, it "just" makes the formulas
> more complex. I denote by i the random effect level, say "patient",
> and by j the observation for this patient.
>
> In the linear model, we have Y(i,j) = β0 + Z(i) + epsilon(i,j), with
> Z(i) and epsilon(i,j) random variables having probability densities,
> independent of each other, and each iid.
>
> Hence, cov( Y(i,j), Y(i,j') ) = Var( Z(i) ): the model introduces a
> positive correlation between observations of the same patient.
>
> In the Bernoulli model, Y(i,j) ~ B( pi(i,j) ) and pi(i,j) = f( Z(i) ),
> f being the inverse link function, typically the reciprocal of the
> logit. So we have
>
> cov( Y(i,j), Y(i,j') ) = E( Y(i,j) Y(i,j') ) - E( Y(i,j) ) E( Y(i,j') )
>                        = Pr( Y(i,j) = 1 and Y(i,j') = 1 ) - pi(i,j) pi(i,j')
>
> Since in practice pi(i,j) does not depend on j, pi(i,j) = pi(i,j').
>
> Pr( Y(i,j) = 1 and Y(i,j') = 1 ) =
>   integral over R of Pr( Y(i,j) = 1 and Y(i,j') = 1 | Z(i) = z ) p( Z(i) = z ) dz
>
> Then, we assume that, conditionally on Z(i), the Y(i,j) are independent, is
> this right? This is the equivalent of "the epsilon(i,j) are
> independent"? I assume this hypothesis is also used for computing the
> likelihood? If not, what is the model for the joint probability?
>
> In that case,
>
> Pr( Y(i,j) = 1 and Y(i,j') = 1 ) =
>   integral over R of f(z) f(z) p( Z(i) = z ) dz
>
> and since pi(i,j) = integral over R of f(z) p( Z(i) = z ) dz, we have
>
> cov( Y(i,j), Y(i,j') ) =
>   integral over R of f²(z) p( Z(i) = z ) dz -
>   ( integral over R of f(z) p( Z(i) = z ) dz )²
>
> which in general has no reason to be zero, hence the two observations
> are correlated. Is this correct?
>
> Is there any way to have a closed form of the covariance, for the usual
> choices of f (let's say, logit or probit) and of the distribution of Z
> (let's say, Gaussian)?
>
> Thanks a lot for reading, and for your answers,
>
> --
> Emmanuel CURIS
> emmanuel.curis at parisdescartes.fr
>
> Page WWW: http://emmanuel.curis.online.fr/index.html
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
Emmanuel CURIS
emmanuel.curis at parisdescartes.fr
Page WWW: http://emmanuel.curis.online.fr/index.html
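(As far as I know there is indeed no closed form for these logit-normal moments, but the covariance derived above is a one-dimensional integral that is easy to evaluate numerically. Note also that, written as E[f²(Z)] - (E[f(Z)])², it is exactly the variance of f(Z), so in this exchangeable single-random-effect case it cannot be negative. A sketch in Python with scipy, matching the notation of the derivation:)

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import expit
from scipy.stats import norm

sigma = 1.0                              # sd of the random effect, Z ~ N(0, sigma^2)
f = expit                                # inverse logit, the usual f

# E[f(Z)] and E[f(Z)^2] as one-dimensional integrals over R
Ef, _  = quad(lambda z: f(z) * norm.pdf(z, 0, sigma), -np.inf, np.inf)
Ef2, _ = quad(lambda z: f(z)**2 * norm.pdf(z, 0, sigma), -np.inf, np.inf)

cov = Ef2 - Ef**2                        # the covariance derived above, = Var(f(Z)) >= 0
rho = cov / (Ef * (1 - Ef))              # within-patient correlation, Var(Y) = pi(1-pi)
print(Ef, cov, rho)
```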
Emmanuel CURIS
emmanuel.curis at parisdescartes.fr
Page WWW: http://emmanuel.curis.online.fr/index.html