Distribution family for non-negative lower and upper bound values
Alain, Steven,

Thanks to you both for the lead and for the papers. A few clarifications:

*1. SP>> Consider a mixed effects variant of the beta regression model, as
discussed in the papers below.*

I assume, then, that you agree with the rescaling approach? I should have
mentioned that I will be comparing several models. The ideal package would
be gamm4; however, it doesn't fit the betar family. (The gamm package does,
but comparing models is compromised.)

*2. AZ>> You could try a beta distribution, which can be used when your data
is between x1 and x2.*

I'm not sure I understand 'when your data is between x1 and x2'. What do x1
and x2 refer to? In any case, as recommended in your book (beginners to
GAMM), the gamm4 package is ideal when comparing models (I have 500 models
to compare). It doesn't fit the beta family, though. Is there a workaround?

*3. AZ>> All in all this sounds like an MCMC job. I haven't tried
SabreR...maybe it can do a beta distribution.*

I haven't tried either of these before.

Kind regards,
Gitu
On Sat, Dec 26, 2015 at 1:57 AM, Steven J. Pierce <pierces1 at msu.edu> wrote:
Gitu,
Consider a mixed effects variant of the beta regression model, as
discussed in the papers below.
Smithson, M., & Verkuilen, J. (2006). A better lemon squeezer? Maximum
likelihood regression with beta-distributed dependent variables.
Psychological Methods, 11(1), 54-71. doi:10.1037/1082-989X.11.1.54
Zimprich, D. (2010). Modeling change in skewed variables using mixed beta
regression models. Research in Human Development, 7(1), 9-26.
doi:10.1080/15427600903578136
Steven J. Pierce, Ph.D.
Associate Director
Center for Statistical Training & Consulting (CSTAT)
Michigan State University
-----Original Message-----
From: Gitu wa Mbui [mailto:gitumbui at gmail.com]
Sent: Wednesday, December 23, 2015 8:54 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Distribution family for non-negative lower and upper
bound values
I am running generalized additive mixed models on two response variables
separately. Values in response 1 are non-negative and bounded between 1 and
2, while values in response 2 are also non-negative and bounded between 1
and 3.
In choosing the distribution for response 1, I have subtracted 1 (to
rescale to between 0-1) and logit transformed before fitting the models
with gaussian family.
As for response 2 (non-negative values between 1-3), I have divided the
values by 3 so as to rescale to between 0-1, before logit transforming and
fitting with gaussian family.
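For illustration only, here is a minimal Python sketch of the rescaling
arithmetic described above. The data values and the function names (rescale,
squeeze, logit) are hypothetical and not part of any package. It assumes the
bounds 1 and 3 for response 2, and it uses the adjustment from Smithson &
Verkuilen (2006) to keep exact 0s and 1s off the boundary before the logit.
Note that dividing values on a 1-3 scale by 3 gives a range of 1/3 to 1,
whereas subtracting the lower bound and dividing by the width gives 0 to 1.

```python
import math

def rescale(y, lower, upper):
    """Map y from [lower, upper] onto [0, 1] (min-max rescaling)."""
    return (y - lower) / (upper - lower)

def squeeze(p, n):
    """Smithson & Verkuilen (2006) adjustment: moves exact 0s and 1s
    strictly inside (0, 1), where n is the sample size."""
    return (p * (n - 1) + 0.5) / n

def logit(p):
    """Log-odds transform; only defined for 0 < p < 1."""
    return math.log(p / (1 - p))

# Hypothetical response-2 values on the 1-3 scale (made up for this sketch)
y2 = [1.0, 1.5, 2.0, 2.5, 3.0]
n = len(y2)

divided_by_3 = [y / 3 for y in y2]            # spans 1/3 to 1, not 0 to 1
rescaled = [rescale(y, 1, 3) for y in y2]     # spans 0 to 1, endpoints included
squeezed = [squeeze(p, n) for p in rescaled]  # strictly inside (0, 1)
logits = [logit(p) for p in squeezed]         # finite for every observation

print(min(divided_by_3), max(divided_by_3))
print(min(rescaled), max(rescaled))
print(min(squeezed), max(squeezed))
```

Without the squeeze step, any observation sitting exactly on a bound would
give an infinite logit, so some adjustment of this kind is needed whenever
the bounds are actually attained in the data.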
Does this sound like a good approach? If not, what are the alternatives,
considering that:
- responses 1 & 2 are not proportions
- I am using the lme4-based version (gamm4), which is limited in the number
of families that can be fit
- histograms of both responses are pretty flat (not skewed, and they don't
look anywhere near a normal distribution)
~ Gitu