Message-ID: <5FEC4F8459B55448AB1D8E8D37D2B238011DDF97@ukaprdembx02.rd.astrazeneca.net>
Date: 2007-08-29T10:35:47Z
From: Southworth, Harry
Subject: [RsR] Distribution of robust distances
In-Reply-To: <007c01c7e9bd$be0fdd70$0132a8c0@KILER>

That's a wonderfully informative response. Thank you!

I had been doing some testing with cov.rob from MASS as you guessed. I'm intending to move to covRob with estim="mcd" in the S+Robust library. According to the documentation for that function, it gives the FAST MCD estimate. I now appear to be reproducing the results of Hardin & Rocke.
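
For anyone following along in R rather than S-PLUS, the recipe from the reply quoted below might be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the robustbase package is installed, the simulated data and the names n, p, x, fit, d2_raw and d2_rew are made up for the example, and the chi-square row is the usual naive reference, not the Hardin & Rocke approximation.

```r
# Sketch only: assumes robustbase is installed; data are simulated for illustration.
library(robustbase)

set.seed(1)
n <- 200
p <- 3
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# MCD fit with the finite-sample correction switched off, as in the reply.
fit <- covMcd(x, use.correction = FALSE)

# mahalanobis() returns SQUARED distances.
# Raw estimates (the ones Hardin & Rocke's results refer to, per the reply):
d2_raw <- mahalanobis(x, center = fit$raw.center, cov = fit$raw.cov)
# Reweighted estimates (what cov.rob()/cov.mcd() return, per the reply):
d2_rew <- mahalanobis(x, center = fit$center, cov = fit$cov)

# Compare upper empirical quantiles with the naive chi-square reference.
probs <- c(0.95, 0.975, 0.99)
round(rbind(raw        = quantile(d2_raw, probs),
            reweighted = quantile(d2_rew, probs),
            chisq      = qchisq(probs, df = p)), 2)
```

Putting the raw and reweighted distances side by side makes it easier to see which set of estimates a given implementation is actually reporting.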

Again, many thanks.
Harry

____________________________________________

Harry Southworth,
FE2 A/2 Parklands,
Alderley Park,
Cheshire, SK10 4TG, UK
Tel.: +44 1625 518327
Mobile: 07867 676073
____________________________________________



> -----Original Message-----
> From: Valentin Todorov [mailto:valentin.todorov at chello.at]
> Sent: 28 August 2007 22:53
> To: Southworth, Harry; R-SIG-robust at stat.math.ethz.ch
> Cc: kwright at eskimo.com; jo.hardin at pomona.edu; dmrocke at ucdavis.edu
> Subject: Re: [RsR] Distribution of robust distances
> 
> 
> Dear Harry,
> 
> Thank you very much for this question, which raises an important issue
> that comes up now and then in different forms but almost always means
> "Why do different MCD implementations give different results?". And the
> answer is almost always "Because of the different consistency and
> small-sample correction factors used".
> 
> There are several problems in your case (and Kevin Wright's):
> 
> 1. I assume you are using cov.rob() or cov.mcd() from MASS to compute the
> MCD estimator (as seen in Kevin's code). These functions return the
> reweighted MCD covariance matrix, while the results in the paper of Hardin
> and Rocke are for the raw MCD. There is an MCD program on Jo Hardin's web
> page, a straightforward implementation of FAST-MCD in R without
> partitioning, nesting, reweighting or correction, which they used for
> their computations. With this program you could reproduce the results,
> but unfortunately it is very slow compared to the implementations in
> native Fortran or C code (like those in MASS and rrcov/robustbase). Here
> is the link:
> 
> http://pages.pomona.edu/~jsh04747/Research/mcd.est.r
> 
> 2. If you use covMcd{robustbase} or CovMcd{rrcov} instead, you can:
> (i) switch off the correction factors and (ii) take the raw estimates.
> 
> 3. You do not need to estimate c (page 19) since it is already applied in
> covMcd(): the covariance matrix was divided by c, i.e. you are
> multiplying the distances by this factor twice.
> 
> In summary: using the raw estimates from covMcd() called with
> use.correction=FALSE and setting c=1 in Kevin's code will reproduce the
> results.
> 
> 
> 
> library(robustbase)  # provides covMcd()
>
> covResult <- covMcd(x, use.correction=FALSE)
> T <- covResult$raw.center  # raw MCD location estimate
> C <- covResult$raw.cov     # raw MCD covariance estimate
>
> c <- 1
> .
> m <- .
> .
> 
> I'll try to find the code of the simulations I did some time ago and will
> post it in the next few days.
> 
> Hope this helps,
> Best regards,
> Valentin
> 
> ----- Original Message ----- 
> From: "Southworth, Harry" <Harry.Southworth at astrazeneca.com>
> To: <R-SIG-robust at stat.math.ethz.ch>
> Sent: Tuesday, August 28, 2007 4:21 PM
> Subject: [RsR] Distribution of robust distances
> 
> 
> > Hello.
> >
> > Has anyone implemented anything to compute quantiles of the distribution
> > of robust distances following Hardin & Rocke
> > (http://dmrocke.ucdavis.edu/papers/HardinRocke2005.pdf)?
> >
> > I've got a function to do it, but I can't reproduce the results of
> > Hardin & Rocke because my function is returning values that are too high.
> > Searching the R help archive, I found a message from 2004 describing the
> > same problem (http://tolstoy.newcastle.edu.au/R/help/04/05/1296.html).
> > The code in that message is essentially similar to mine (except that he
> > uses a 1 - h/n that I think should be h/n).
> >
> > I'd be grateful for any pointers.
> >
> > Thanks,
> > Harry
> >
> > _______________________________________________
> > R-SIG-Robust at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-robust
> > 
> 
>