Message-ID: <413AC584-2ECE-4478-8DCB-3A8F6C528989@ucla.edu>
Date: 2013-07-19T22:14:57Z
From: Noah Silverman
Subject: Identify Leverage Points

Hello,

I'm working on some fairly standard regression models (linear, logistic, and Poisson).  Unfortunately, the data is rather messy.

A visual inspection, using either a histogram or a density plot, indicates some significant outliers.  Summary statistics of the data indicate the same thing.

If I fit a linear regression in R using the "lm" command, I can then plot the model to look at residuals, etc.

I'm interested in re-fitting the model with N% of the highest-leverage points removed.  (It's a large data set, and I want to fit "most" of the data.)

Is there a computational way to get the leverage for each data point?  That way I can subset the data, skipping the N% of points with the highest leverage.
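
To make the question concrete, here is a rough sketch of the workflow I have in mind. It assumes that base R's hatvalues() (from the stats package) is the right tool for per-observation leverage, which I have not confirmed is the recommended approach. The data frame "dat", its columns "x" and "y", and the choice of N = 5 are all made up for illustration.

```r
# Simulated stand-in data; `dat`, `x`, and `y` are hypothetical names.
set.seed(1)
n <- 200
dat <- data.frame(x = rnorm(n))
dat$y <- 1 + 2 * dat$x + rnorm(n)

# Initial fit on all of the data.
fit <- lm(y ~ x, data = dat)

# Leverage: the diagonal of the hat matrix, one value per observation.
lev <- hatvalues(fit)

# Drop the N% of points with the highest leverage (N = 5 is arbitrary).
N <- 5
cutoff <- quantile(lev, probs = 1 - N / 100)
keep <- lev <= cutoff

# Re-fit on the remaining rows.
refit <- lm(y ~ x, data = dat[keep, ])
```

As far as I can tell, hatvalues() also accepts glm fits, so the same pattern might carry over to the logistic and Poisson models. One caveat: if rows are dropped for missing values during fitting, "lev" would be shorter than nrow(dat), and the subsetting step would need to be adjusted.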


Thanks!


--
Noah Silverman, M.S., C.Phil
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095