how to identify the outliers

2 messages · Rado Bonk, Christian Hennig

#
Hello R-users,

Is there a more sophisticated way to identify the outliers in a
dataset than spotting them in a boxplot? I want to exclude them from
further analysis, and I am interested in their positions in my data
vector.

Rado
#
Dear Rado,

I do not know what your data look like, but in general you can use robust
Mahalanobis distances. That is, compute a robust mean and covariance matrix
with cov.rob (method="mcd") in the lqs library, and pass these as center and
cov to the function mahalanobis. As a cutoff value you can take a large
quantile (say 0.999) of the chi^2 distribution with p (the number of your
variables) degrees of freedom. For details see Rousseeuw & van Driessen,
cited on the help page of cov.rob.
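
The steps above could be sketched roughly as follows. This is only a minimal illustration under some assumptions: the data matrix x is simulated here (100 clean points plus 5 planted outliers) and the names rob, d2, cutoff, out_idx and x_clean are made up for the example. In current R versions cov.rob is found in the MASS package, into which lqs was merged. Note that mahalanobis returns squared distances, which is the scale the chi^2 quantile refers to.

```r
library(MASS)  # cov.rob is provided by MASS in current R (formerly package lqs)

set.seed(1)    # cov.rob uses random subsampling, so fix the seed

# Hypothetical data: 100 bivariate standard normal points plus 5 planted outliers
x <- rbind(matrix(rnorm(200), ncol = 2),
           matrix(rnorm(10, mean = 8), ncol = 2))

# Robust centre and covariance matrix via the MCD estimator
rob <- cov.rob(x, method = "mcd")

# Squared robust Mahalanobis distance of every row
d2 <- mahalanobis(x, center = rob$center, cov = rob$cov)

# Cutoff: 0.999 quantile of chi^2 with p = number of variables degrees of freedom
cutoff <- qchisq(0.999, df = ncol(x))

out_idx <- which(d2 > cutoff)                # positions of the flagged rows
x_clean <- x[d2 <= cutoff, , drop = FALSE]   # data with flagged rows excluded
```

Here out_idx gives the row positions of the suspected outliers, and x_clean is what would go into the further analysis. The 0.999 level is just the value suggested above; a smaller quantile flags more points.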

Christian
On Tue, 26 Nov 2002, Rado Bonk wrote: