Formula for whether hat value is influential?

Dear Gavin and Paul,

(k + 1)/n is the average hatvalue, where k is the number of regressors (not
counting the constant) and n is the number of observations. The 2(k + 1)/n
rule comes from results in Belsley, Kuh, and Welsch (1980), Regression
Diagnostics, concerning the distribution of the hatvalues when n is large
relative to k + 1 and X is multivariate normal. For smaller n, this rule
tends to nominate too many points, which suggests the rule 3(k + 1)/n
instead; I think that is also due to Belsley et al.
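
A minimal sketch of how these cutoffs might be applied, in plain Python. It
assumes a simple regression with a constant and one regressor (k = 1), where
the hatvalues have the closed form h_i = 1/n + (x_i - mean(x))^2 / Sxx. The
function names and the data are made up for illustration.

```python
def hat_values_simple(x):
    """Hat values for least-squares regression on a constant and one regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    # Sum of squared deviations of x from its mean.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]


def flag_noteworthy(h, k, multiplier=2):
    """Indices of hat values above multiplier * (k + 1) / n."""
    n = len(h)
    cutoff = multiplier * (k + 1) / n
    return [i for i, hi in enumerate(h) if hi > cutoff]


# Hypothetical data: nine evenly spaced x values and one far from the rest.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
k = 1  # one regressor, plus the constant
h = hat_values_simple(x)

print("average hat value:", sum(h) / len(h))      # equals (k + 1) / n
print("above 2(k + 1)/n:", flag_noteworthy(h, k, 2))
print("above 3(k + 1)/n:", flag_noteworthy(h, k, 3))
```

The hatvalues sum to k + 1, so their average is (k + 1)/n regardless of the
data. With more than one regressor the closed form above no longer applies,
and the hatvalues would be taken from the diagonal of the hat matrix.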

I'd prefer to call such hatvalues "noteworthy" rather than "influential,"
since hatvalues measure "leverage" on the least-squares fit and not
influence (on the coefficients).

Finally, I think that it's a better idea to examine diagnostics like
hatvalues graphically rather than paying too much attention to numerical
cutoffs.

Regards,
 John

--------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox