outlier
On Tue, 17 Jun 2003, kan Liu wrote:
I want to calculate the R-squared between two variables. Can you advise me how to identify and remove the outliers before performing the R-squared calculation?
Easy: you don't. It makes no sense to consider R^2 after arbitrary outlier removal: if I remove all but two points I get R^2 = 1!

R^2 is normally used to measure the success of a multiple regression, but as you mention two variables, did you just mean the Pearson product-moment correlation? It makes more sense to use a robust measure of correlation, as in cov.rob (package lqs), or even the Spearman or Kendall measures (cor.test in package ctest).

If you intended to do this for a multiple regression, you need to do some sort of robust regression and use a robust measure of fit.
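To make the two points above concrete, here is a minimal sketch in plain Python (not R, and not the code behind cov.rob or cor.test). The helper names `pearson`, `ranks` and `spearman` and the toy data are assumptions made up for illustration. It shows that any two points with distinct x and y values give R^2 = 1, and that a rank-based (Spearman) correlation is unaffected by the size of a single extreme value, where the Pearson correlation is not.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with v[order[i]].
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: y increases with x, but the last value is extreme.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]

r2_all = pearson(x, y) ** 2            # pulled well below 1 by the last point
r2_two = pearson(x[:2], y[:2]) ** 2    # only two points kept: equals 1
rho = spearman(x, y)                   # uses order only, so the 100 is harmless

print(r2_all, r2_two, rho)
```

With only two points kept, the fitted line passes through both, so the R^2 value says nothing about the original data. The rank-based measure uses all the observations and needs no decision about which ones to discard.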
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595