Regression model with proportional dependent variable

On Mon, 11 Apr 2011, ty ty wrote:

Beta regression is one possibility to model proportions in the open unit 
interval (0, 1). It is available in R in the package "betareg":

   http://CRAN.R-project.org/package=betareg
   http://www.jstatsoft.org/v34/i02/
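
As a minimal sketch of what such a fit looks like (the simulated data, the
variable names y and x, and the parameter values are assumptions made up
for illustration, not part of the original question):

```r
library(betareg)

set.seed(42)

# Simulated proportions strictly inside (0, 1); purely illustrative data.
n   <- 200
x   <- runif(n)
mu  <- plogis(-1 + 2 * x)   # mean on the logit scale
phi <- 20                   # precision parameter
y   <- rbeta(n, shape1 = mu * phi, shape2 = (1 - mu) * phi)
d   <- data.frame(y = y, x = x)

# Beta regression with the default logit link for the mean.
fit <- betareg(y ~ x, data = d)
summary(fit)
```

The coefficients are reported on the logit scale for the mean, together
with an estimate of the precision parameter phi.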

If 0 and 1 can occur, some authors have suggested rescaling the response 
so that the boundary values are avoided; the paper linked above gives an 
example. If, however, there are many 0s and/or 1s, a hurdle or 
inflation-type approach may be more appropriate. One such approach is 
implemented in the "gamlss" package:

   http://CRAN.R-project.org/package=gamlss
   http://www.jstatsoft.org/v23/i07/
   http://www.gamlss.org/
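
For the rescaling route mentioned above, one commonly used transformation
(the one discussed in the betareg paper) is (y * (n - 1) + 0.5) / n, where
n is the sample size. A small sketch, in which the helper name
squeeze_unit is hypothetical and the data are made up:

```r
# Hypothetical helper: pull values from [0, 1] into the open interval (0, 1).
squeeze_unit <- function(y) {
  n <- length(y)
  (y * (n - 1) + 0.5) / n
}

y      <- c(0, 0.25, 0.5, 1)   # toy response containing both boundaries
y_star <- squeeze_unit(y)
y_star
```

The transformed response can then be passed to betareg() in place of y.
Note that the amount of shrinkage depends on n, so this is a pragmatic
fix rather than a model for the boundary values.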

The hurdle approach can be implemented using separate building blocks.
First, a binary regression model captures whether the dependent 
variable is greater than 0 (i.e., crosses the hurdle): glm(I(y > 0) ~ ...,
family = binomial). Second, a beta regression is fitted only to the 
observations in (0, 1) that crossed the hurdle: betareg(y ~ ..., subset = 
y > 0). A recent technical report introduces such a family of models along 
with many further techniques (specialized residuals and regression 
diagnostics) that are not yet available in R:

   http://arxiv.org/abs/1103.2372
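
The two building blocks described above could be put together roughly as
follows. This is only a sketch under assumptions: the data are simulated
with exact zeros but no ones, the single regressor x is made up, and the
way the two fits are combined into an overall mean is one simple option,
not the method of the technical report.

```r
library(betareg)

set.seed(1)

# Simulated data (illustrative only): exact zeros plus values in (0, 1).
n       <- 500
x       <- rnorm(n)
crossed <- rbinom(n, size = 1, prob = plogis(0.5 + x))
mu      <- plogis(-0.3 + 0.8 * x)
phi     <- 10
y       <- ifelse(crossed == 1, rbeta(n, mu * phi, (1 - mu) * phi), 0)
d       <- data.frame(y = y, x = x)

# Part 1: binary model for crossing the hurdle (y > 0).
m_bin <- glm(I(y > 0) ~ x, family = binomial, data = d)

# Part 2: beta regression for the observations that crossed the hurdle.
# If exact ones also occurred, the subset would need to be y > 0 & y < 1.
m_beta <- betareg(y ~ x, data = d, subset = y > 0)

# Combined expected value: P(y > 0) * E(y | y > 0).
p_pos  <- predict(m_bin,  newdata = d, type = "response")
mu_pos <- predict(m_beta, newdata = d, type = "response")
ey     <- p_pos * mu_pos
head(ey)
```

Because the two parts share no parameters, they can be fitted and
inspected separately with the usual summary() and diagnostic tools.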

Best,
Z