gaussian family change suggestion
On Tue, 11 Apr 2006, Prof Brian Ripley wrote:
On Tue, 11 Apr 2006, Simon Wood wrote:
Hi,
Currently the `gaussian' family's initialization code signals an error if
any response data are zero or negative and a log link is used. Given that
zero or negative response data are perfectly legitimate under the GLM
fitted using `gaussian("log")', this seems a bit unsatisfactory. Might
it be worth changing it?
Well, that's not really true: it says it is unable to find starting values, and asks you to provide some. I don't really see why your choices are satisfactory: what happens if all the data are negative for example? (You get +Inf in the log case, I believe. The algorithm will probably get stuck from a finite starting point, but it will produce a mysterious error from an infinite one.) We could try even harder, but code that is almost never used tends to get broken whilst no one is testing it. So if you want to pursue it I think we need a comprehensive test suite.
I have actually been working with log-link glms quite a bit recently and was planning to change the defaults. -thomas
The current offending code from `gaussian' is:
initialize = expression({
n <- rep.int(1, nobs)
if (is.null(etastart) && is.null(start) && is.null(mustart) &&
((family$link == "inverse" && any(y == 0)) ||
(family$link == "log" && any(y <= 0)))) stop(
"cannot find valid starting values: please specify some")
mustart <- y
})
A possible replacement would be ...
initialize = expression({
n <- rep.int(1, nobs)
if (is.null(etastart) && is.null(start) && is.null(mustart) &&
((family$link == "inverse" && all(y == 0)) ||
(family$link == "log" && all(y <= 0)))) stop(
"cannot find valid starting values: please specify some")
mustart <- y
if (family$link=="log") {
iy <- y<=0
if (sum(iy)) mustart[iy] <- min(y[!iy])*.5
} else if (family$link=="inverse") {
iy <- y==0
if (sum(iy)) mustart[iy] <- min(abs(y[!iy]))*.5
}
})
best,
Simon
- Simon Wood, Mathematical Sciences, University of Bath, Bath BA2 7AY - +44 (0)1225 386603 www.maths.bath.ac.uk/~sw283/
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Thomas Lumley Assoc. Professor, Biostatistics tlumley at u.washington.edu University of Washington, Seattle