[RsR] confidence intervals for lmRob

2 messages · Stefan Herzog, Matias Salibian-Barrera

#
Hi,


I looked around, but couldn't find anything (and that's why I hope  
this is not an unnecessary, lazy newbie question):

1) How do I compute confidence intervals for lmRob regression  
(package "robust")?

2) If this method is not yet implemented, would it make sense to  
bootstrap lmRob and derive the CI, say using a percentile t method?


Thanks!


Cheers, Stefan


-------------------------------------------------------------
Stefan Herzog, M. Sc.
Center for Cognitive and Decision Sciences

Department of Psychology
University of Basel
Missionsstrasse 64A
4055 Basel
Switzerland

+41 61 267 06 15
stefan.herzog at unibas.ch
http://www.psycho.unibas.ch/herzog/
#
Hello Stefan,

The summary() method for lmRob objects in package robust returns 
estimated standard errors that are valid when the error distribution is 
symmetric. If you either (a) don't have outliers in your data; or (b) 
have atypical observations or model departures that are symmetric around 
the regression line, then these estimated SDs could be used to construct 
confidence intervals for each regression coefficient of the form 
estimate +/- qnorm(1 - alpha/2) * SD.

However, in package robustbase, the summary() method for lmrob objects 
returns estimated SDs that are valid in more general cases (e.g. 
asymmetric outliers), so they are in principle more reliable than the 
ones above. They are based on Croux, C., Dhaene, G., Hoorelbeke, D. 
(2003), "Robust Standard Errors for Robust Estimators" (available 
online from Christophe Croux's website). See help(lmrob) for more details.

These estimated SDs in package robustbase can also be used to construct 
confidence intervals for each regression coefficient of the form 
estimate +/- qnorm(1 - alpha/2) * SD under weaker (more general) 
assumptions than above.
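
As a rough illustration of the interval described above, here is a minimal sketch using lmrob() from robustbase. The stackloss data set (shipped with base R) and the 95% level are only assumptions chosen for the example; the standard errors are read from the coefficient table returned by summary().

```r
library(robustbase)

set.seed(1)  # lmrob() uses random subsampling for its initial estimate

# Example data only: stackloss ships with base R (package datasets)
fit <- lmrob(stack.loss ~ ., data = stackloss)

# Coefficient table from summary(): estimates and their estimated SDs
cf  <- summary(fit)$coefficients
est <- cf[, "Estimate"]
se  <- cf[, "Std. Error"]

# Normal-approximation interval: estimate +/- qnorm(1 - alpha/2) * SD
alpha <- 0.05
z     <- qnorm(1 - alpha / 2)
ci    <- cbind(lower = est - z * se, upper = est + z * se)
ci
```

The same arithmetic should carry over to a fit from lmRob() in package robust, provided the estimates and standard errors are taken from its own summary() output, and with the caveat about symmetric errors mentioned earlier.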

Moreover, note that bootstrapping robust estimators is not 
straightforward. Although MM estimators are smooth enough to allow the 
bootstrap to be consistent, their high computational complexity together 
with the potentially large number of outliers present in the bootstrap 
samples makes direct bootstrapping of robust estimators not a good idea 
in general. There are some alternatives in the literature. See, for 
example: Salibian-Barrera, M. and Zamar, R.H. (2002). Bootstrapping 
robust estimates of regression. The Annals of Statistics, 30, 556-582.
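
To make the point about outliers in bootstrap samples concrete, here is a small base-R sketch. The sample size n = 20 and the k = 6 outliers are hypothetical numbers picked only for illustration. It computes the chance that a single resample consists of at least 50% outliers, a level of contamination that a high-breakdown estimator cannot be expected to withstand.

```r
# Hypothetical setting: n observations, k of which are outliers
n <- 20
k <- 6   # 30% contamination in the original sample

# Resampling with replacement draws each point independently, so the number
# of outliers in one bootstrap sample is Binomial(n, k / n).
# Exact probability that outliers make up at least half of a resample:
p_exact <- pbinom(ceiling(n / 2) - 1, size = n, prob = k / n,
                  lower.tail = FALSE)

# Monte Carlo check of the same quantity
set.seed(1)
is_outlier <- rep(c(TRUE, FALSE), times = c(k, n - k))
B     <- 10000
n_out <- replicate(B, sum(sample(is_outlier, n, replace = TRUE)))
p_sim <- mean(n_out >= n / 2)

c(exact = p_exact, simulated = p_sim)
```

Even when this probability is small for one resample, a bootstrap run uses hundreds or thousands of resamples, so some heavily contaminated ones are likely to turn up and can distort the tails of the bootstrap distribution.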

This fast and robust bootstrap can also be applied to obtain consistent 
p-values for nested tests of hypotheses for linear regression models 
based on robust estimators (Salibian-Barrera, M. (2005). Estimating the 
p-values of robust tests for the linear model. Journal of Statistical 
Planning and Inference, 128, 241-257).

The fast and robust bootstrap has also been applied successfully to 
several other models (e.g. Salibian-Barrera, M., Van Aelst, S. and 
Willems, G. (2006). PCA based on multivariate MM-estimators with fast 
and robust bootstrap. Journal of the American Statistical Association, 
101, 1198-1211; Roelant, E., Van Aelst, S., and Croux, C. (2008), 
"Multivariate Generalized S-estimators," Journal of Multivariate 
Analysis, to appear; Van Aelst, S., and Willems, G. (2005), 
"Multivariate Regression S-Estimators for Robust Estimation and 
Inference," Statistica Sinica, 15, 981-1001), and also to model 
selection for linear regression (Salibian-Barrera, M., Van Aelst, S. 
(2008), "Robust Model Selection Using Fast and Robust Bootstrap," 
Computational Statistics and Data Analysis, 52, 5121-5135).

I have some R plug-in code for the robustbase package that implements 
this fast and robust bootstrap based on lmrob in package robustbase. If 
you're interested, let me know and I will dig it out for you.

Hope this helps. I'll be happy to help if you have any further questions.

Best,

Matias