How to remove multiple outliers
Did you read the documentation at ?outlier? It clearly states that the function finds the single (possibly repeated) value with the largest distance from the mean. Here that is only 10099. You could apply the function more than once, or write your own outlier removal script using whatever criterion you want to define outliers, but the function is doing exactly what it claims to do.

On another note, why complicate things? Just use the rm.outlier() function from the same package rather than doing it (inefficiently) the way you currently are.

Note also that outlier(x, logical = TRUE) returns a logical vector, which can be used for direct subsetting; that there is no need to test booleans with == TRUE (it is an identity transform on the set of booleans); and that the arr.ind = TRUE argument isn't needed here. None of these makes much of a difference for this problem, but they are points of good practice.

Michael
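To make the suggestions above concrete, here is a minimal sketch using the x1 vector from the original post. The result names (x1_a to x1_d) are made up for illustration, and the median/MAD rule with a cutoff of 3 is only one possible "own criterion", an assumption on my part and not something the outliers package prescribes.

```r
library(outliers)  # provides outlier() and rm.outlier()

x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10,
        19, 18, 17, 10099, 10099, 10098)

# Direct logical subsetting: no which(), no == TRUE, no arr.ind
x1_a <- x1[!outlier(x1, logical = TRUE)]

# The same thing in one call; both copies of 10099 are dropped
x1_b <- rm.outlier(x1)
length(x1_b)  # 17 values remain, 10098 is still among them

# Applying it a second time removes the next most extreme value, 10098
x1_c <- rm.outlier(rm.outlier(x1))

# An illustrative criterion of your own (assumption: 3 MADs from the median).
# Unlike the mean, the median and MAD are barely affected by the huge values,
# so all three are flagged in a single pass.
is_out <- abs(x1 - median(x1)) > 3 * mad(x1)
x1_d <- x1[!is_out]
```

Keep in mind that outlier() always picks some value, so repeating rm.outlier() blindly will eventually start removing ordinary observations (a third pass here would drop 19). If you go the repeated route you need a stopping rule of your own, which is one reason an explicit criterion like the one in the last step may be easier to reason about.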
On Thu, Oct 20, 2011 at 8:11 AM, aajit75 <aajit75 at yahoo.co.in> wrote:
Hi All,

I am working on a dataset in which some of the variables have more than one observation that is an outlier. I am using the sample script below:

    library(outliers)
    x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10, 19, 18, 17, 10099, 10099, 10098)
    outlier_tf1 = outlier(x1, logical=TRUE)
    find_outlier1 = which(outlier_tf1==TRUE, arr.ind=TRUE)
    beh_input_ro1 = x1[-find_outlier1]

It removes only the most extreme outliers, not all of them. In this example it removes only 10099 and 10099, and not 10098.

Thanks in advance for the help.

-Ajit

--
View this message in context: http://r.789695.n4.nabble.com/How-to-remove-multiple-outliers-tp3921689p3921689.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.