"prediction intervals for glm"
On Thu, 1 May 2003, Fredrik Lundgren wrote:
I wouldn't know anything about the theoretical problems with glm and a binary outcome, but there is a "prediction interval" in predict.glm of S-Plus (version 6.02 or thereabouts). I have failed to source it into R (and I do have difficulties with the higher forms of matrix manipulation). In the medical field where I'm active, I think it has high value to generate "prediction intervals" for risk and benefit calculations for individual patients. If a prediction interval is theoretically fishy or unsound, maybe some bootstrap approach could do the trick?
It's more than fishy ... it uses the normal approximation on the link scale (as I recall), which is very unlikely to be valid except for the gaussian family. Indeed for 0/1 data the interval will have coverage 0, exactly. I don't see how a bootstrap would help either: the issue is to combine the (reasonably well-known) uncertainty in the prediction of the mean with the variability in the observation. That would be easy to do by simulation, but not by re-sampling. (Or did you think all simulation-based inference was "some bootstrap approach"?) However, you are not going to be able to summarize that predictive distribution as an *interval* for 0/1 data.
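A minimal sketch of the simulation described above, assuming a logistic model and a normal approximation for the uncertainty of the linear predictor. The data, the covariate `x`, the new patient at `x = 1` and all object names are hypothetical and not from the thread. It draws the mean first, then a new 0/1 observation given that mean, and reports the predictive distribution as two probabilities rather than as an interval.

```r
set.seed(1)

# Hypothetical data: binary outcome y and one covariate x (illustration only)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

# New patient at x = 1: fitted linear predictor and its standard error
new <- data.frame(x = 1)
pr <- predict(fit, newdata = new, type = "link", se.fit = TRUE)

# Step 1: uncertainty in the mean, ASSUMED approximately normal on the link scale
nsim <- 10000
eta <- rnorm(nsim, mean = pr$fit, sd = pr$se.fit)
p <- plogis(eta)

# Step 2: variability of a new 0/1 observation given each simulated mean
ynew <- rbinom(nsim, size = 1, prob = p)

# The predictive distribution lives on {0, 1}: summarize it as probabilities
pred_dist <- table(factor(ynew, levels = c(0, 1))) / nsim

# An interval for the risk p is still meaningful, but it is not an interval for ynew
ci_p <- quantile(p, c(0.025, 0.975))
```

Here `ci_p` lies strictly between 0 and 1, so it can never contain a simulated `ynew`, which is the "coverage 0" point: the interval describes the patient's risk, while `pred_dist` describes the patient's outcome.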
Sincerely
Fredrik Lundgren

----- Original Message -----
From: "Peter Dalgaard BSA" <p.dalgaard at biostat.ku.dk>
To: "Spencer Graves" <spencer.graves at pdf.com>
Cc: "Fredrik Lundgren" <fredrik.lundgren at norrkoping.mail.telia.com>; <R-help at stat.math.ethz.ch>
Sent: Tuesday, April 29, 2003 4:48 PM
Subject: Re: [R] "prediction intervals for glm"
Spencer Graves <spencer.graves at pdf.com> writes:
"?predict.glm" produced something in my copy of R 1.6.2 under Windows 2000.
.. but probably not what Fredrik wanted. Prediction intervals (i.e. intervals with 95% probability of catching a new observation) are somewhat tricky even to define for glms. For Normal responses you have the formula yhat +- qt(.975, df) * sqrt(s^2 + se(yhat)^2); for other continuous responses that would become (approximately!) the error distribution convolved with a Gaussian density; for discrete responses - say 0/1 - I wouldn't know what to do.
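A small sketch of the Normal-response formula, assuming an ordinary linear model fitted with lm (equivalent to a gaussian glm with identity link). The simulated data and object names are hypothetical. It computes yhat +- qt(.975, df) * sqrt(s^2 + se(yhat)^2) by hand from the pieces returned by predict with se.fit = TRUE, and also asks predict.lm for the same interval via interval = "prediction" so the two can be compared.

```r
set.seed(2)

# Hypothetical data with Normal errors (illustration only)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 1)
fit <- lm(y ~ x)

new <- data.frame(x = 5)
pr <- predict(fit, newdata = new, se.fit = TRUE)

yhat <- as.numeric(pr$fit)   # predicted mean at the new point
se   <- as.numeric(pr$se.fit) # standard error of the predicted mean
s    <- pr$residual.scale     # residual standard deviation
df   <- pr$df                 # residual degrees of freedom

# Half-width combines residual variance with uncertainty in the mean
half <- qt(0.975, df) * sqrt(s^2 + se^2)
by_hand <- c(lwr = yhat - half, upr = yhat + half)

# Same interval from predict.lm
built_in <- predict(fit, newdata = new, interval = "prediction", level = 0.95)
```

Comparing `by_hand` with the `lwr` and `upr` columns of `built_in` should show agreement up to floating-point error.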
Fredrik Lundgren wrote:
Where can I find prediction intervals for glm in R?
--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595