semicontinuous variables: what likelihoods are available?
Many thanks George! Attached figures. I will look into your suggestion. Thank you! Aurelie
On 2013-03-18, at 9:00 PM, George Wang wrote:
Hi Aurelie,

I am probably seeking reassurance from the gurus more than answering your question, as I'm also working with a data set in a similar situation. (I am looking at the area of leaf consumed by insect herbivores.)

Can you use a delta-distribution approach for your data? That is, run binomial models on the presence-absence data of your response variable, and log-normal models on the positive (non-zero) portion of your continuous data. I know this approach is fairly common for linear models, e.g. http://r-project.markmail.org/search/?q=delta%20Tweedie#query:delta%20Tweedie+page:1+mid:gnzpixld5zkl5sig+state:results and I imagine it's equally applicable to (G)LMMs. I'll let the more knowledgeable members of this list correct me if not.

I didn't see any attachment in your last message, so I don't know how your data are distributed, but this approach seemed to work well for my data (with ~70% zeros).

HTH,
George

On Mon, Mar 18, 2013 at 4:25 PM, Aurelie Cosandey Godin <GodinA at dal.ca> wrote:

Thank you Ben and others,

Apologies for not being very precise! My response variable is measured both in weight (kg) and in counts, and is very zero-inflated, i.e., 91% of my data are zeros. I previously ran models on the count data using a suite of likelihoods: two-part zero-inflated Poisson and two-part zero-inflated negative binomial. The latter were the best. Now I would like to run the same models with my response variable in kg, but I don't know how to model my positive data (truncated, or just positive weight data?). See the attached figure for the distribution of my weight data.

Many thanks in advance!!
Aurelie

On 2013-03-18, at 4:30 PM, Ben Bolker wrote:
Aurelie Cosandey Godin <GodinA at ...> writes: [snip]
I need to run spatio-temporal models for a semicontinuous response variable (weight in kg). I am not familiar with the semicontinuous likelihood functions available in R and was wondering if some of you could point me in the right direction for information.
Can you say any more about exactly what a semicontinuous response variable is? Poking around (e.g. <http://lpsolve.sourceforge.net/4.0/semi-cont.htm>) doesn't make it entirely clear. Are these data that are:

  truncated, i.e. values <= a lower threshold are absent from the data set;
  censored, i.e. values <= a lower threshold are recorded as "less than threshold";
  positive, i.e. values <0 don't even exist?

Are the data non-negative (i.e. >=0) or are they positive (>0)?

The simplest of these cases is positive data, which you can model fairly easily by log transformation (i.e. assume a lognormal distribution), or with slightly more difficulty using a Gamma distribution ... if you have censored or truncated data, or data that include zeros, it gets a little harder ...
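
As a rough illustration of the two-part (delta) idea discussed in this thread, the sketch below fits an intercept-only delta-lognormal model: a Bernoulli probability that an observation is non-zero, plus a lognormal distribution for the strictly positive values. It is only a sketch under stated assumptions. It uses Python's standard library rather than R, it ignores covariates, random effects and any spatio-temporal structure, and the function names and simulated data (9% non-zero values, arbitrary log-scale mean 0.5 and SD 1.0) are hypothetical, not taken from anyone's data set.

```python
import math
import random
import statistics


def fit_delta_lognormal(y):
    """Intercept-only two-part fit for non-negative data with exact zeros.

    Part 1: Bernoulli probability that an observation is non-zero.
    Part 2: lognormal distribution for the strictly positive values.
    """
    positives = [v for v in y if v > 0]
    if len(positives) < 2:
        raise ValueError("need at least two positive observations")
    p_pos = len(positives) / len(y)      # MLE of P(Y > 0)
    logs = [math.log(v) for v in positives]
    mu = statistics.fmean(logs)          # MLE of the log-scale mean
    sigma = statistics.pstdev(logs)      # MLE of the log-scale SD
    return p_pos, mu, sigma


def delta_lognormal_mean(p_pos, mu, sigma):
    """Overall mean: P(Y > 0) * E[Y | Y > 0]."""
    return p_pos * math.exp(mu + 0.5 * sigma ** 2)


def log_likelihood(y, p_pos, mu, sigma):
    """Sum of the Bernoulli and lognormal log-likelihood contributions."""
    ll = 0.0
    for v in y:
        if v == 0:
            ll += math.log(1.0 - p_pos)
        else:
            z = (math.log(v) - mu) / sigma
            ll += (math.log(p_pos) - math.log(v * sigma)
                   - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z)
    return ll


# Hypothetical simulated data: about 91% zeros, lognormal positives.
random.seed(1)
y_sim = [random.lognormvariate(0.5, 1.0) if random.random() < 0.09 else 0.0
         for _ in range(5000)]

p_hat, mu_hat, sigma_hat = fit_delta_lognormal(y_sim)
print("P(Y > 0):", p_hat)
print("log-scale mean and SD:", mu_hat, sigma_hat)
print("overall mean:", delta_lognormal_mean(p_hat, mu_hat, sigma_hat))
```

Because the two parts share no parameters in this simple form, the log-likelihood splits into a Bernoulli term and a lognormal term, which is why the presence-absence model and the positive-values model can be fitted separately.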
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models