
need help

Weiwei Shi wrote:
Hi Weiwei,

I think your method of defining a central value for the bulk of the
values and then setting a criterion for outliers is valid (or at
least as valid as many other ways of defining outliers). However, here
is a different method: sort the vector of values, then look for a
"gap" whose size is a specified multiple (gap.prop) of the mean
difference between the smaller values. The function returns the first
value after the "gap" (this is easily changed to return all the values
after it). To account for vectors that have negative values, the
minimum value is subtracted when calculating "newx" and then added back
to the result. For your data, a gap.prop of 20 works, but the default
value of 10 doesn't. It also won't work where large values are typical
and small ones are the outliers (although it will still indicate where
the "gap" is).

Jim
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: find.first.gap.R
Url: https://stat.ethz.ch/pipermail/r-help/attachments/20050813/99cdabfe/find.first.gap.pl
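
The attachment itself is not reproduced above. As a rough illustration
of the approach described in the message, here is a minimal sketch in R.
It is not the attached code: the function name find_first_gap, the
"greater than" comparison, and the NA return value are assumptions made
for this example. Because it works only on differences between sorted
values, which do not change when the vector is shifted, it skips the
step of subtracting the minimum.

```r
# Hypothetical reconstruction for illustration; NOT the attached find.first.gap code.
find_first_gap <- function(x, gap.prop = 10) {
  sx <- sort(x)                  # ascending order; sort() drops NAs by default
  d <- diff(sx)                  # d[i] = sx[i + 1] - sx[i]
  if (length(d) < 2) return(NA_real_)
  for (i in 2:length(d)) {
    # mean difference among the values below the candidate gap
    base <- mean(d[1:(i - 1)])
    # assumption: a "gap" is a difference greater than gap.prop times that mean
    if (base > 0 && d[i] > gap.prop * base) {
      return(sx[i + 1])          # first value after the gap
    }
  }
  NA_real_                       # no gap found
}

find_first_gap(c(3, 1, 2, 5, 4, 100))   # returns 100
```

As the message notes, the result depends on the threshold: a small
gap.prop can flag an ordinary difference early in the sorted vector,
when the running mean is based on only a few values, so it may need to
be raised for a given data set.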