hatvalues?

I am struggling a bit with the function 'hatvalues'. I would like a little more understanding than treating it as a black box and using the values. I looked at the Fortran source and it is quite opaque to me, so I am asking for some help in understanding the theory. First, I take the simplest case of a single explanatory variable. For this I turn to John Fox's book, "Applied Regression Analysis and Generalized Linear Models", p. 245, and generate this 'R' code:
# remove the NA's
This gives me an array of values, the largest of which is
[1]  21  52  17  93  30  62 158 113 175 131 182  29 106 125 123 146  91  99

So the largest "hatvalue" is
[1] 0.1041207

This doesn't match the 0.714 value that is reported in the book, but I will probably take that up with the author later.
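
For reference, a minimal sketch of the single-predictor calculation, assuming the standard leverage formula h_i = 1/n + (x_i - mean(x))^2 / sum((x_j - mean(x))^2). The data below are simulated purely for illustration (the names x, y and h_manual are made up here and are not the variables from the book), so the numbers will not reproduce the values above.

```r
# Simulated data: an assumption for illustration only, not the book's data set
set.seed(42)
x <- rnorm(25, mean = 60, sd = 10)
y <- 5 + 0.9 * x + rnorm(25)

# Leverage for simple regression: 1/n plus the squared deviation of x_i
# from the mean, scaled by the total sum of squared deviations
n <- length(x)
h_manual <- 1 / n + (x - mean(x))^2 / sum((x - mean(x))^2)

# Index and size of the largest leverage
which.max(h_manual)
max(h_manual)

# With an intercept and one slope the leverages should sum to 2
sum(h_manual)
```

The observation furthest from mean(x) gets the largest value, and every value lies between 1/n and 1.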

Then I use more of 'R' and I get

fit <- lm(weight ~ repwt)
hr <- hatvalues(fit)
hr[21]
       21 
0.1041207 

So this matches, which is reassuring. My question is this: given the QR decomposition and the residuals derived from it, what is a simple matrix formula for the hatvalues?
residuals = y - Hy = (I - H)y
or
H = -(residuals/y - I)
This generates a matrix, but I cannot see any relationship between this "hat-matrix" and the return from "hatvalues".
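
One relation that could be checked here, offered as a sketch rather than a statement of what the Fortran code does: assuming the usual definition H = X (X'X)^{-1} X', the residuals are (I - H) y, and the values returned by 'hatvalues' would be the diagonal of H. If X = QR with Q having orthonormal columns, then H = Q Q', so each hat value is the sum of squares of one row of Q. The data below are simulated, and the names X, H, Q and h_qr are illustrative only.

```r
# Simulated data: an assumption for illustration only
set.seed(1)
x <- rnorm(20)
y <- 2 + 3 * x + rnorm(20)
fit <- lm(y ~ x)

# Design matrix (intercept column plus x)
X <- model.matrix(fit)

# Hat matrix from its textbook definition
H <- X %*% solve(crossprod(X)) %*% t(X)

# Same diagonal via the thin QR decomposition: H = Q Q'
Q <- qr.Q(qr(X))
h_qr <- rowSums(Q^2)

# Each comparison should return TRUE
all.equal(unname(diag(H)), unname(hatvalues(fit)))
all.equal(h_qr, unname(hatvalues(fit)))
all.equal(as.vector((diag(nrow(X)) - H) %*% y), unname(residuals(fit)))
```

Note that H cannot be recovered by dividing the residuals by y element by element, since H is an n x n matrix acting on the whole vector y.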

Comments?

Thank you.

Kevin