Is there any reason that rstandard.glm doesn't have a "pearson" option?
And if not, can it be added?
Background: I'm currently teaching an undergrad/grad-service course from
Agresti's "Introduction to Categorical Data Analysis (2nd edn)" and
deviance residuals are not used in the text. For now I'll just provide
the students with a simple function to use, but I prefer to use R's
native capabilities whenever possible.
I think something along the following lines should do it:
rstandard.glm <-
function(model,
infl=influence(model, do.coef=FALSE),
type=c("deviance", "pearson"), ...)
{
res <- switch(type, pearson = infl$pear.res, infl$dev.res)
res <- res/sqrt(1-infl$hat)
res[is.infinite(res)] <- NaN
res
}
Standardized Pearson residuals
6 messages · Brett Presnell, Peter Dalgaard, Jari Oksanen
On Mar 14, 2011, at 22:25 , Brett Presnell wrote:
Is there any reason that rstandard.glm doesn't have a "pearson" option? And if not, can it be added?
Probably... I have been wondering about that too. I'm even puzzled why it isn't the default. Deviance residuals don't have quite the properties that one might expect, e.g. in this situation, the absolute residuals sum pairwise to zero, so you'd expect that the standardized residuals be identical in absolute value
y <- 1:4 r <- c(0,0,1,1) c <- c(0,1,0,1) rstandard(glm(y~r+c,poisson))
1 2 3 4 -0.2901432 0.2767287 0.2784603 -0.2839995 in comparison,
i <- influence(glm(y~r+c,poisson)) i$pear.res/sqrt(1-i$hat)
1 2 3 4 -0.2817181 0.2817181 0.2817181 -0.2817181 The only thing is that I'm always wary of tampering with this stuff, for fear of finding out the hard way why thing are the way they are....
Background: I'm currently teaching an undergrad/grad-service course from Agresti's "Introduction to Categorical Data Analysis (2nd edn)" and deviance residuals are not used in the text. For now I'll just provide the students with a simple function to use, but I prefer to use R's native capabilities whenever possible.
Incidentally, chisq.test will have a stdres component in 2.13.0 for much the same reason.
I think something along the following lines should do it:
rstandard.glm <-
function(model,
infl=influence(model, do.coef=FALSE),
type=c("deviance", "pearson"), ...)
{
res <- switch(type, pearson = infl$pear.res, infl$dev.res)
res <- res/sqrt(1-infl$hat)
res[is.infinite(res)] <- NaN
res
}
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Thanks Peter. I have just a couple of minor comments, and another possible feature request, although it's one that I don't think will be implemented. peter dalgaard <pdalgd at gmail.com> writes:
On Mar 14, 2011, at 22:25 , Brett Presnell wrote:
Is there any reason that rstandard.glm doesn't have a "pearson" option? And if not, can it be added?
Probably... I have been wondering about that too. I'm even puzzled why it isn't the default. Deviance residuals don't have quite the properties that one might expect, e.g. in this situation, the absolute residuals sum pairwise to zero, so you'd expect that the standardized residuals be identical in absolute value
y <- 1:4 r <- c(0,0,1,1) c <- c(0,1,0,1) rstandard(glm(y~r+c,poisson))
1 2 3 4 -0.2901432 0.2767287 0.2784603 -0.2839995 in comparison,
i <- influence(glm(y~r+c,poisson)) i$pear.res/sqrt(1-i$hat)
1 2 3 4 -0.2817181 0.2817181 0.2817181 -0.2817181 The only thing is that I'm always wary of tampering with this stuff, for fear of finding out the hard way why thing are the way they are....
I'm sure that's wise, but it would be nice to get it in as an option, even if it's not the default
Background: I'm currently teaching an undergrad/grad-service course from Agresti's "Introduction to Categorical Data Analysis (2nd edn)" and deviance residuals are not used in the text. For now I'll just provide the students with a simple function to use, but I prefer to use R's native capabilities whenever possible.
Incidentally, chisq.test will have a stdres component in 2.13.0 for much the same reason.
Thank you. That's one more thing I won't have to provide code for
anymore. Coincidentally, Agresti mentioned this to me a week or two ago
as something that he felt was missing, so that's at least two people who
will be happy to see this added.
It would also be nice for teaching purposes if glm or summary.glm had a
"pearsonchisq" component and a corresponding extractor function, but I
can imagine that there might be arguments against it that haven't
occured to me. Plus, I doubt that anyone wants to touch glm unless it's
to repair a bug. If I'm wrong about all that though, ...
BTW, as I go along I'm trying to collect a lot of the datasets from the
examples and exercises in the text into an R package ("icda"). It's far
from complete and what is there needed tidying up, but I hope to
eventually to round it into shape and put it on CRAN, assuming that
Agresti approves and that there are no copyright issues.
I think something along the following lines should do it:
rstandard.glm <-
function(model,
infl=influence(model, do.coef=FALSE),
type=c("deviance", "pearson"), ...)
{
type <- match.arg(type)
res <- switch(type, pearson = infl$pear.res, infl$dev.res)
res <- res/sqrt(1-infl$hat)
res[is.infinite(res)] <- NaN
res
}
On Mar 15, 2011, at 04:40 , Brett Presnell wrote:
Background: I'm currently teaching an undergrad/grad-service course from Agresti's "Introduction to Categorical Data Analysis (2nd edn)" and deviance residuals are not used in the text. For now I'll just provide the students with a simple function to use, but I prefer to use R's native capabilities whenever possible.
Incidentally, chisq.test will have a stdres component in 2.13.0 for much the same reason.
Thank you. That's one more thing I won't have to provide code for anymore. Coincidentally, Agresti mentioned this to me a week or two ago as something that he felt was missing, so that's at least two people who will be happy to see this added.
And of course, I was teaching a course based on Agresti & Franklin: "Statistics, The Art and Science of Learning from Data", when I realized that R was missing standardized residuals.
It would also be nice for teaching purposes if glm or summary.glm had a "pearsonchisq" component and a corresponding extractor function, but I can imagine that there might be arguments against it that haven't occured to me. Plus, I doubt that anyone wants to touch glm unless it's to repair a bug. If I'm wrong about all that though, ...
Hmm, how would that work? If there was one, I'd worry that people would start subtracting them which is usually not the right thing to do. I do miss having a test on the residual deviance occasionally (even though it is only sometimes meaningful), having to fit a saturated model explicitly can be a bit silly. E.g. in this case (homogeneity of birth rates):
anova(glm(births~month,poisson,data=bb), test="Chisq")
...
Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL 11 225.98
month 11 225.98 0 0.00 < 2.2e-16 ***
anova(glm(births~1,poisson,data=bb), test="Chisq")
...
Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL 11 225.98
Notice that the latter version gives me the correct deviance but no p-value.
A better support for generic score tests could be desirable too. I suspect that this would actually be the Pearson Chi-square in the interesting cases.
Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
On 15/03/11 13:17 PM, "peter dalgaard" <pdalgd at gmail.com> wrote:
On Mar 15, 2011, at 04:40 , Brett Presnell wrote:
Background: I'm currently teaching an undergrad/grad-service course from Agresti's "Introduction to Categorical Data Analysis (2nd edn)" and deviance residuals are not used in the text. For now I'll just provide the students with a simple function to use, but I prefer to use R's native capabilities whenever possible.
Incidentally, chisq.test will have a stdres component in 2.13.0 for much the same reason.
Thank you. That's one more thing I won't have to provide code for anymore. Coincidentally, Agresti mentioned this to me a week or two ago as something that he felt was missing, so that's at least two people who will be happy to see this added.
And of course, I was teaching a course based on Agresti & Franklin: "Statistics, The Art and Science of Learning from Data", when I realized that R was missing standardized residuals.
So nobody uses McCullagh & Nelder: "Generalized Linear Models" in teaching, since they don't realize that R is missing Anscombe residuals, too? Cheers, Jari Oksanen
On Mar 15, 2011, at 14:22 , Jari Oksanen wrote:
On 15/03/11 13:17 PM, "peter dalgaard" <pdalgd at gmail.com> wrote:
On Mar 15, 2011, at 04:40 , Brett Presnell wrote:
Background: I'm currently teaching an undergrad/grad-service course from Agresti's "Introduction to Categorical Data Analysis (2nd edn)" and deviance residuals are not used in the text. For now I'll just provide the students with a simple function to use, but I prefer to use R's native capabilities whenever possible.
Incidentally, chisq.test will have a stdres component in 2.13.0 for much the same reason.
Thank you. That's one more thing I won't have to provide code for anymore. Coincidentally, Agresti mentioned this to me a week or two ago as something that he felt was missing, so that's at least two people who will be happy to see this added.
And of course, I was teaching a course based on Agresti & Franklin: "Statistics, The Art and Science of Learning from Data", when I realized that R was missing standardized residuals.
So nobody uses McCullagh & Nelder: "Generalized Linear Models" in teaching, since they don't realize that R is missing Anscombe residuals, too? Cheers, Jari Oksanen
Well, if you can read the book, you can probably write the code... The other books are for beginners who may need the convenience (and persuasion power) of standard software.
Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com