Interval censored Data in survreg() with zero values!
--begin included -----
My endogenous variable is not a time-dependent variable but a percentage, which is naturally censored in the interval [0,100]. Unfortunately many data points are exactly 0 or 100. The rest of the data is asymmetrically distributed. So I would like to apply a two-limit tobit, regressing the percentage (endogenous variable) on several explanatory variables.
--- end included ----

Censoring is a limit in the observation process: right censored at 100 means that "the true y value is > 100, but we did not observe the exact value". You have binomial data with 0 <= y <= 100; that bound is a property of the variable itself, not a constraint on the observation process. You should be using glm with a binomial family.

Terry T
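For readers following along, here is a minimal sketch of what the glm suggestion could look like. It assumes each percentage comes from a known number of trials, which the thread does not confirm; the names trials, successes and x, the simulated data, and the coefficient 0.5 are all made up for illustration.

```r
set.seed(1)
n <- 200
x <- rnorm(n)

# ASSUMPTION: every percentage is based on 20 underlying yes/no trials
trials <- rep(20, n)
successes <- rbinom(n, size = trials, prob = plogis(0.5 * x))
pct <- 100 * successes / trials   # the percentage as it would be reported

# Binomial GLM on (successes, failures); exact 0s and 100s need no special handling
fit <- glm(cbind(successes, trials - successes) ~ x, family = binomial)
coef(fit)
```

If no trial counts sit behind the percentages, this formulation does not apply directly.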
Sorry for being so cumbersome, but I don't see why my data should be binomial rather than censored.

The classical Tobit example (left censored at zero) is household expenditure on durable goods, which naturally has a high peak at zero because not every item is bought in every period. As expenditure can't be negative, the variable is left censored.

In my case, the (observed) percentage of A must be between 0 and 100. We suppose that each individual has a specific unobservable tendency (y*) to do A. If the tendency to do A is very low (y*<=0), we observe that she does not do A (y=0); if the tendency is very high (y*>=100), we observe that she does only A (y=100); if the tendency is intermediate (0<y*<100), we observe that she does some A (y=y*, 0<y<100).

I don't see a binomial distribution in the data. I don't see where there is a Bernoulli trial in the data, as y can take more than two values and is even a continuous variable.

As said before, I'm not much of a statistician, so I would be grateful if you could enlighten me.

Geraldine
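The latent-variable model described above can be sketched with survreg() from the survival package. This is only an illustration under stated assumptions: the data are simulated, the names x, ystar, lower and upper and the coefficients 50, 40 and 30 are invented, and a Gaussian error is assumed. A Gaussian distribution also allows observations at exactly zero, which the default Weibull distribution does not.

```r
library(survival)

set.seed(1)
n <- 500
x <- rnorm(n)

# Latent tendency y* (hypothetical coefficients, Gaussian error assumed)
ystar <- 50 + 40 * x + rnorm(n, sd = 30)

# Observed percentage: y = 0 if y* <= 0, y = 100 if y* >= 100, else y = y*
y <- pmin(pmax(ystar, 0), 100)

# Surv(type = "interval2") encoding:
#   lower = NA  -> left censored at upper (here 0)
#   upper = NA  -> right censored at lower (here 100)
#   lower == upper -> exact observation
lower <- ifelse(y <= 0, NA, y)
upper <- ifelse(y >= 100, NA, y)

fit <- survreg(Surv(lower, upper, type = "interval2") ~ x, dist = "gaussian")
summary(fit)
```

Whether this or the binomial formulation is appropriate depends on how the percentages actually arise, which is the open question in this thread.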