plot.lm: "Cook's distance" label can overplot point labels

10 messages · John Maindonald, John Fox, Brian Ripley +2 more

#
The following code demonstrates an annoyance with plot.lm():

library(DAAGxtras)
x11(width=3.75, height=4)
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)

OR try the following
xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)

The "Cook's distance" text overplots the label for the point with the  
smallest residual.  This is an issue when the size of the plot is much  
less than the default, and the pointsize is not reduced proportionately.


I suggest the following:
       xx <- hii
       xx[xx >= 1] <- NA
## Insert new code
       fracht <- (1.25*par()$cin[2])/par()$pin[2]
       ylim[1] <- ylim[1] - diff(ylim)*max(0, fracht-0.04)
## End insert new code
       plot(xx, rsp, xlim = c(0, max(xx, na.rm = TRUE)),
            ylim = ylim, main = main, xlab = "Leverage",
            ylab = ylab5, type = "n", ...)

Then, about 15 lines further down, replace
         legend("bottomleft", legend = "Cook's distance",
                lty = 2, col = 2, bty = "n")

by
         legend("bottomleft", legend = "Cook's distance",
                lty = 2, col = 2, bty = "n", y.intersp=0.5)

If this second change is not made, then one wants
       fracht <- (1.5*par()$cin[2])/par()$pin[2]
I prefer the "Cook's distance" text to be a bit closer to the x-axis,  
as it separates it more clearly from any point labels.
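
A minimal standalone sketch of the adjustment suggested above, for trying the arithmetic outside plot.lm(). The helper name extend_ylim and its arguments are hypothetical and not part of R; the 0.04 slack presumably reflects the 4% padding that the default axis style already adds at each end. pdf(NULL) is used only so that par() can be queried without opening a window.

```r
# Hypothetical helper (not part of plot.lm): lower ylim[1] by the fraction of
# the plot height that one legend line needs, beyond the assumed 4% padding.
extend_ylim <- function(ylim, cin_height, pin_height,
                        lines = 1.25, slack = 0.04) {
  fracht <- (lines * cin_height) / pin_height   # legend height / plot height
  ylim[1] <- ylim[1] - diff(ylim) * max(0, fracht - slack)
  ylim
}

pdf(NULL, width = 3.75, height = 4)  # null device, same size as in the report
cin <- par("cin")[2]                 # character height in inches
pin <- par("pin")[2]                 # plot region height in inches
ylim_new <- extend_ylim(c(-2, 2), cin, pin)
invisible(dev.off())
print(ylim_new)
```

On a large device the fraction falls below the slack and the limits are returned unchanged, so the default appearance is not affected.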

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
#
Dear John,

It occurs to me that the title above the graph, "Residuals vs. Leverage," is
entirely redundant since the x-axis is labelled "Leverage" and the y-axis
"Studentized residuals." Why not use the title above the graph for "Cook's
distance contours"?

Regards,
 John
#
Dear John -
The title above the graph is also redundant for the first of the  
plots; do we want to be totally consistent?  I am not sure.

It occurs to me that the text "Cook's distance", as well as the  
contours, might be in red.
Regards
John.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 18/02/2009, at 12:27 PM, John Fox wrote:

            
#
Dear John,
Why not? "A foolish consistency is the hobgoblin of little minds," but maybe
this isn't a foolish consistency.
That would provide a nice visual cue (for those who aren't colour blind).

Best,
 John
#
On Wed, 18 Feb 2009, John Fox wrote:

            
Or using a black-and-white device.  We have not hitherto assumed a 
colour device in 'stats' graphics, and given how often they are 
printed I don't think we want to start.

As so often, it seems that what looks good is in the eye of the 
beholder.  If the two of you can agree on something that you both see 
is a definite improvement, please provide a patch and examples to try 
to persuade everyone else.  (As a Wishlist item on R-bugs, so it gets 
recorded.)

  
    
#
Actually, the contours and the smooth are currently printed with  
col=2.  This prints satisfactorily in grayscale.    Colours ("orange"  
and "darkred" as well as col=2) are also used in termplot.

Does the stricture against "colour" extend to grayscale?  Does it  
apply to lines as well as text?
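
One rough way to check how a colour might come out in grayscale is to compute its luma. This is a sketch only: the Rec. 601 weights are an assumed approximation, and actual printers and devices may convert differently. The colours are named explicitly because the default palette() may differ between R versions.

```r
# Colours 1-3 of the classic palette, named explicitly (an assumption here).
cols <- c(black = "black", red = "red", green3 = "green3")
rgb_vals <- col2rgb(cols)   # 3 x n matrix: rows red, green, blue on 0-255

# Rec. 601 luma weights, scaled to [0, 1]; 0 is black, 1 is white.
luma <- drop(c(0.299, 0.587, 0.114) %*% rgb_vals) / 255
gray_equiv <- gray(luma)    # grey shades of matching lightness
print(round(luma, 2))
print(gray_equiv)
```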

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 19/02/2009, at 5:58 PM, Prof Brian Ripley wrote:

            
#
Dear John and Brian,

My point about colour-blindness was partly tongue-in-cheek, but I think that
it's a bad choice to have the second and third colours in the default
palette as red and green.

Regards,
 John
#
On Thu, 19 Feb 2009, John Fox wrote:

            
Looking at the standard palette with dichromat::dichromat() it seems that it depends on which flavour of red-green anomaly you have. For deuteranopia the red and green are quite close. For protanopia they are pretty distinct and the confusion is between colours 3 and 7 (yellow vs green) and between 4 and 6 (blue and magenta).

I agree that the standard palette isn't ideal, though.
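
A sketch of the kind of check described above, assuming the contributed dichromat package is installed and that the "standard palette" is the classic eight-colour one written out below by name. The simulated colours are printed as hex strings to compare by eye or to plot as swatches.

```r
# Classic eight-colour palette, written out by name (an assumption here,
# since palette() defaults may differ between R versions).
classic <- c("black", "red", "green3", "blue",
             "cyan", "magenta", "yellow", "gray")

if (requireNamespace("dichromat", quietly = TRUE)) {
  sim <- data.frame(
    index  = seq_along(classic),
    colour = classic,
    deutan = dichromat::dichromat(classic, type = "deutan"),
    protan = dichromat::dichromat(classic, type = "protan"),
    stringsAsFactors = FALSE
  )
  print(sim)
} else {
  message("Package 'dichromat' is not installed; skipping the simulation.")
}
```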

      -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
#
Dear Thomas,

Though far from an expert on the matter, it's my understanding that
red-green confusion is the most common form of colour-blindness. I guess
that the best way to put it is that it would be desirable to choose colours
for the standard palette that minimize the probability of perceptual
problems.

Regards,
 John
#
At 13:05 20/02/2009, John Fox wrote:

            
I wonder whether there are two separate issues: what is the best 
standard palette and whether mainstream plots should use colour to 
carry essential information. For instance there seems little problem 
in biplot having red arrows and black points because the colour is redundant.

I am not an expert on colour vision either but there certainly are 
people who report difficulty at scientific meetings with interpreting 
the slides.
Michael Dewey
http://www.aghmed.fsnet.co.uk