[RsR] covrob --- some OOP-comments
Hi Heinrich,
I thought you might be interested in taking a look at the cov and
covRob functions in the (Insightful*) Robust Library. I am in the
process of updating these so that they work under both R and S-Plus.
You can find a snapshot of the project here.
www.stats.ox.ac.uk/~konis/robust/robust.tar.gz
It's not in a very user-friendly form yet, but it should work on
32-bit Linux (that's what I have been using anyway) if you do the
following.
% unpack robust.tar.gz
% cd robust/src
% make -f Makefile
% cd ..
% R
<...>
> library(lattice)
> source("build.q")
> runif(1)
> # make a covRob object
> rob <- covRob(woodmod.dat)
> # make a cov object
> cls <- cov(woodmod.dat)
> # try the print, summary and plot methods
> # fit both classical and robust at the same time
> fm <- fit.models(list(Robust = "covRob", Classical = "cov"),
+                  data = woodmod.dat)
> # try the print, summary and plot methods on these
You may also want to take a look at how the control parameters can be
passed to covRob. It uses a method quite similar to the one you
describe.
Hope this is helpful.
Kjell
*Note that this code has not been released under the *GPL yet. It
should be soon but they've been telling me that for a while now. BTW
the license is in license.txt and claims to be an open source license
but I don't actually want to read it.
On 26 Mar 2006, at 22:34, Heinrich Fritz wrote:
Dear Peter, Valentin and Martin,

Thanks for your comments on the package covrob! Your suggestions are
very constructive, and we have already updated the implementation
according to your remarks. The new version is now online at
http://www.statistik.tuwien.ac.at/rsr/groups/mva.html
with the following changes:

(*) The class has been renamed (from covStruct) to cov. There is also a
covR class available (derived from cov). At the moment it does not
include any other slots.

(*) S3 - S4 mix: This problem has been resolved. The following
functions are still available for the (S4) cov object:
    plot      <taking one or two objects>
    print
    summary

(*) I have improved the argument passing. The reason I chose the ...
implementation was the following: some cov estimation methods (covOGK)
do not take a "control" argument for passing input arguments. The
covrob wrapper should work with as many existing cov methods as
possible (without changing those methods!), and in this case passing
arguments is only possible with the ... syntax. However, if a cov
method (like covMest or covMcd) takes a control argument for passing
other arguments, this control structure will be passed on anyway. I
have now explicitly added the control argument to covrob.

My suggestion for passing arguments via a control structure is to use a
simple list. In my opinion it is not really comfortable for users to
instantiate a new class for (almost) every single call of covrob. The
only disadvantage of a simple list is that, since no slots (and slot
data types) are defined, the user could pass anything via the control
structure. This has to be caught anyway, because it is always possible
to call a function with invalid arguments, e.g.
    cov (testdata, alpha = "asd")

So the solution now implemented works the following way:
    covrob (testdata, method = "<estimator>",
            control = list (argument1 = .., argument2 = ..))
or this would work too:
    covrob (testdata, method = "<estimator>",
            argument1 = .., argument2 = ..)
The problem with the second version is that there may be naming
problems (as mentioned by XX), e.g. a cov estimator that takes an
argument called "method". But this form is only there to ensure that
covrob works with existing cov estimators which do not take a control
structure (e.g. because the author does not know anything about the
covrob package).

(*) Further, I have implemented a function for applying the control
object to the input arguments:
    function (x, a1 = <default 4 a1>, a2 = <default 4 a2>,
              a3 = <default 4 a3>, <further arguments>, control)
    {
        if (!missing (control))
        {
            if (!is.null(control$a1)) a1 = control$a1
            if (!is.null(control$a2)) a2 = control$a2
            if (!is.null(control$a3)) a3 = control$a3
            # other arguments..
        }
        ....
    }
would then be
    function (x, a1 = <default 4 a1>, a2 = <default 4 a2>,
              a3 = <default 4 a3>, <further arguments>, control)
    {
        if (!missing (control))
            ParseControlStructure (control)
        ....
    }
This was possible thanks to the (very excellent) code from Peter:
    eval.parent(substitute(object@distance <- distanceValues))
I don't know whether this is the common way of passing arguments by
reference, but it is very powerful! The advantage of this approach is
that the control structure does not need to contain all arguments which
can be passed to the function, only those which should really be set.
An example of the function ParseControlStructure can be found in the
corresponding help file.

(*) Accessor functions
I have implemented several accessor functions:
    cov
    cor       returning (first time: calculating) the correlation matrix
    center
    method    returning the name of the estimator
    details   returning the whole output object of the cov estimation
              method
    datadim
    mah       Mahalanobis distances
    mah.wt    only calculating the mah weights (as Martin Maechler
              proposed)
These are generic functions. I hope this is the way it was intended!

(*) Mahalanobis distances. They are now taken from the output object of
the cov method (if available).

(*)
* from Martin Maechler: *
> 1. In general I think we should have a bit less "optional" parts;
> in particular, top of page 4, I think one could require 'method'.
Again: we wanted to be compatible with as many (existing) covariance
estimation methods as possible. I don't believe that it is very
expensive to check whether a method string has been returned or not.

(*)
> '3.1 arguments of covrob()' :
> x: should be a numeric matrix *or a data frame*
This was already implemented. It was just not described correctly in
the pdf.

(*)
* from Valentin Todorov *
> - What happens with the user of, for example covMcd() when it begins
> to return an S4 class "Mcd" instead of the current S3 "mcd". Of course
> these that just use print/plot/summary will not notice the change,
> but what about these that use the returned object within their
> programs? This is actually a general question on compatibility.
This was exactly the reason why we chose the wrapping solution. It
should work (and generate S4 output) using the available functions,
because we assumed that it is impossible to change the output of
functions which are already published and in use by others. Code that
relies on these functions to produce the output described in the help
pages would not work anymore, and everybody using these functions would
have to change their code. So I think the wrapping version is the
solution with the fewest compatibility problems.

(*) I've removed the classical estimation. Instead I've implemented an
estimator "cov.classic".

best regards,
Heinrich
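
For readers following the control-list discussion in the message above,
here is a minimal sketch of how such a helper might look. It is not the
covrob implementation: parse_control and toy_estimator are hypothetical
names made up for illustration, and the sketch uses assign() into the
caller's frame rather than the eval.parent(substitute(...)) idiom quoted
in the message. The allowed argument is likewise an assumption, added to
show one way of catching misspelled control entries.

```r
# Hypothetical helper (not the covrob code): copy every non-NULL named
# entry of `control` into the calling function's frame, so that it
# overrides the caller's default argument values.
parse_control <- function(control, allowed = names(control)) {
  stopifnot(is.list(control))
  # Reject entries the caller does not know about.
  unknown <- setdiff(names(control), allowed)
  if (length(unknown) > 0)
    stop("unknown control entries: ", paste(unknown, collapse = ", "))
  caller <- parent.frame()  # frame of the function that called us
  for (nm in names(control)) {
    if (!is.null(control[[nm]]))
      assign(nm, control[[nm]], envir = caller)
  }
  invisible(NULL)
}

# Hypothetical estimator with two tuning arguments and an optional
# control list; it only reports the settings it ends up using.
toy_estimator <- function(x, alpha = 0.5, maxit = 100, control) {
  if (!missing(control))
    parse_control(control, allowed = c("alpha", "maxit"))
  list(alpha = alpha, maxit = maxit)
}

# Only alpha is set through the control list; maxit keeps its default.
res <- toy_estimator(1:10, control = list(alpha = 0.75))
```

As in the message, the control list only needs to name the arguments
that should actually change; everything else keeps its default.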
_______________________________________________
R-SIG-Robust at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-robust