Outlier removal techniques

On Thu, 9 Feb 2012, mails wrote:

            
Those more expert than I will certainly provide answers. What I do with
new data is create box-and-whisker plots (I use the lattice package), which
flag as outliers those data lying more than 1.5 times the interquartile
range below the first quartile or above the third quartile.
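
A minimal sketch of that box-plot rule in R, for illustration only. The data
are made up (they are not from this thread), and the variable names are
arbitrary. boxplot.stats() from base R applies the 1.5 x IQR fences and is,
as far as I know, what lattice's bwplot() uses by default to decide which
points to draw beyond the whiskers.

```r
# Made-up data: 50 well-behaved values plus two suspicious ones (assumption)
set.seed(42)
x <- c(rnorm(50, mean = 10, sd = 2), 30, -15)

# boxplot.stats() returns the points lying more than coef * IQR
# beyond the hinges (roughly the first and third quartiles)
out <- boxplot.stats(x, coef = 1.5)$out
print(out)

# Draw the same thing with lattice, if it is installed
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::bwplot(~ x))
}
```

This only flags candidates for a closer look; whether to drop them is a
separate decision.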

   No one but you can answer your question on when an outlier is an outlier.
It depends on your data set and the context of the data. For example, a
water chemistry value that far exceeds a regulatory threshold might be
meaningful in the context of a one-off excursion (in which case it's not an
outlier but a real data point) or it might result from a handling,
instrumentation, or analytical error (in which case toss it as an outlier).

Rich