Robust M-Estimator Comparison

Johannes:

WARNING: I'm no expert. Caveat emptor!

There is a huge literature on robust estimation, as you'll find when you
Google it. One natural place to start might be the relevant sections of
V&R's MASS (Venables & Ripley, Modern Applied Statistics with S) and the
references therein. An old classic, which may not, however, still be in
print, is Hoaglin, Mosteller, Tukey: Understanding Robust and Exploratory
Data Analysis (the robust estimation chapter).

It is not clear to me that robust estimation will solve your problems with
lots of one-sided outliers -- sounds like a skewed distribution in there
somewhere. 

One thing to be careful about: there's "Robustness of efficiency" and
"Outlier resistance." The first is about maintaining estimation efficiency
in the face of "contamination" by a usually small percentage of "outliers"
(whatever THEY are); the second is about maintaining estimation accuracy in
the face of a possibly large proportion of outliers. The classic example of
the latter for estimating location is the median; an M-estimator (e.g.
iterated biweight) is an exemplar of the former. As V&R and others make
clear, these are not mutually exclusive, but they do tend to pull in
somewhat different ways.
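
To make the distinction concrete, here is a rough Python sketch (not from V&R
or any package) comparing the mean, the median, and an iterated biweight
location estimate on made-up data with two one-sided outliers. The function
name biweight_location, the tuning constant c = 4.685, the 1.4826 * MAD scale
held fixed across iterations, and the sample values are all my own
assumptions for illustration.

```python
import statistics


def biweight_location(data, c=4.685, tol=1e-8, max_iter=100):
    """Iteratively reweighted Tukey biweight estimate of location.

    Assumptions (illustrative only): the scale is the normalized MAD,
    computed once from the median and held fixed; c is a tuning constant.
    """
    values = list(data)
    center = statistics.median(values)
    mad = statistics.median([abs(x - center) for x in values])
    scale = 1.4826 * mad  # MAD rescaled to match the SD under normality
    if scale == 0:
        return center  # more than half the data are identical

    for _ in range(max_iter):
        weights = []
        for x in values:
            u = (x - center) / (c * scale)
            # Biweight: smooth downweighting, zero weight beyond c * scale
            weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
        total = sum(weights)
        if total == 0:
            return center
        new_center = sum(w * x for w, x in zip(weights, values)) / total
        if abs(new_center - center) < tol:
            return new_center
        center = new_center
    return center


# Hypothetical sample: eight values near 10 plus two one-sided outliers
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 50.0, 60.0]

mean_est = statistics.mean(sample)
median_est = statistics.median(sample)
biweight_est = biweight_location(sample)

print("mean    ", mean_est)
print("median  ", median_est)
print("biweight", biweight_est)
```

On data like this the mean gets dragged toward the outliers, while the median
and the biweight both stay near the bulk of the data. The biweight gives the
two far points zero weight and otherwise behaves like a weighted mean of the
rest, which is where its efficiency comes from.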

Robust estimation seems to have lost its cachet these days, maybe because it
seems to be difficult to do in the nonlinear models that arise out of the
complex covariance structures people want to use these days (e.g., mixed
models; Empirical Bayes). I continue to find it an essential tool in any
routine regression work that I do, however. Seems more in keeping with
entropy.

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
