Hi,
again, I agree with Werner: robustbase (robust-base, robust.base) is much easier to remember than robstats (robstat) and more appropriate than robusta.
Alfio
[RsR] Package title
5 messages · Alfio Marazzi, Martin Maechler, Eva Cantoni
3 days later
"Alfio" == Alfio Marazzi <Alfio.Marazzi at chuv.ch>
on Sat, 10 Dec 2005 19:32:59 +0100 writes:
Alfio> Hi,
Alfio> again, I agree with Werner: robustbase (robust-base, robust.base) is much
Alfio> easier to remember than robstats (robstat) and more appropriate than robusta.
and we haven't heard anything further.
I'd like to reach a conclusion.
On some forums, the last answer is taken to be okay when nobody
replies anymore. I doubt though that this would be the case
here :-)
In my (biased) view I think we have the following highest
ranking contenders ("-" and "." are not to be used in package names):
1 robustbase
2 robustats
3 robustat
4 robusta
with some google-statistics arguments against #4.
Unless there are urgent new contenders,
can I collect votes, please?
To make it interesting, everyone would have 3 votes that can of
course be partitioned as 3 or 2+1 or 1+1+1.
Ideally, you send them privately to me {where you have to trust
my honesty ;-)}, or then (mis)use the list.
In order to easily collect the votes, a simple e-mail with only
one line such as
Alfio 3 0 0 0
( "Alfio gives all his 3 votes to 'robustbase'")
or
Martin 0 1 0 2
( "Martin gives 1 to 'robustats' and 2 to 'robusta'")
would make things quite efficient for me
since I could use read.table() and rowSums() and colSums() to
check and summarize.
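The one-line ballot format described above can be tabulated roughly as follows. This is only a minimal sketch, not the script actually used for the vote: the two ballots are the illustrative ones from this message, the object names `ballots`, `candidates` and `votes` are made up for the example, and the `text =` argument of read.table() assumes a reasonably recent version of R.

```r
## Hypothetical ballots in the proposed one-line format:
## voter name, then the votes for the four candidate names in list order.
ballots <- "Alfio  3 0 0 0
Martin 0 1 0 2"

candidates <- c("robustbase", "robustats", "robustat", "robusta")

## One row per voter, one column per candidate name
votes <- read.table(text = ballots,
                    col.names = c("voter", candidates),
                    row.names = "voter")

## Check: every voter must have distributed exactly 3 votes
stopifnot(rowSums(votes) == 3)

## Summary: total number of votes per candidate name
colSums(votes)
```

rowSums() serves as the validity check per ballot, and colSums() gives the tally per name.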
Unless we get new contenders (i.e. names for the above list)
within the next 12 hours, I'll be collecting votes till
Sunday, Dec.18, midnight MET (= UTC+1).
Martin
1 day later
I didn't get strong statements for yet another name, so the vote "is it" ...
Well, it seems pretty clear at the moment, but then, many more people could still vote...
A simple R script (and a cron job) auto-produces the webpage
http://stat.ethz.ch/~maechler/R-sig-robust-vote.html
from the data file {that I update manually}.
I know from politics that it is strictly forbidden to publish polls when the vote is still happening {because of the well-known "the winner takes it all" effect}, but then we want to have some fun, too... ;-)
Martin
"MM" == Martin Maechler <maechler at stat.math.ethz.ch>
on Wed, 14 Dec 2005 16:12:33 +0100 writes:
.........
MM> I'd like to reach a conclusion.
MM> On some forums, the last answer is taken to be okay when nobody
MM> replies anymore. I doubt though that this would be the case
MM> here :-)
MM> In my (biased) view I think we have the following highest
MM> ranking contenders ("-" and "." are not to be used in package names):
MM> 1 robustbase
MM> 2 robustats
MM> 3 robustat
MM> 4 robusta
MM> with some google-statistics arguments against #4.
MM> Unless there are urgent new contenders,
MM> can I collect votes, please?
MM> To make it interesting, everyone would have 3 votes that can of
MM> course be partitioned as 3 or 2+1 or 1+1+1.
MM> Ideally, you send them privately to me {where you have to trust
MM> my honesty ;-)}, or then (mis)use the list.
MM> In order to easily collect the votes, a simple e-mail with only
MM> one line such as
MM> Alfio 3 0 0 0
MM> ( "Alfio gives all his 3 votes to 'robustbase'")
MM> or
MM> Martin 0 1 0 2
MM> ( "Martin gives 1 to 'robustats' and 2 to 'robusta'")
MM> would make things quite efficient for me
MM> since I could use read.table() and rowSums() and colSums() to
MM> check and summarize.
MM> Unless we get new contenders (i.e. names for the above list)
MM> within the next 12 hours, I'll be collecting votes till
MM> Sunday, Dec.18, midnight MET (= UTC+1).
4 days later
As most of you have probably seen in the meantime,
the name 'robustbase' won by a large margin:
robustbase 45
robustats 9
robusta 5
robustat 1
Thanks to all 20 voters! As a politically active Swiss, I'm
used to voting differently than the majority ;-)
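As a quick consistency check of the reported result: with 20 voters and 3 votes each, the tally should add up to 60 votes. A small sketch (the names `tally`, `n_voters` and `votes_per_voter` are only for this illustration):

```r
## Final tally as reported above
tally <- c(robustbase = 45, robustats = 9, robusta = 5, robustat = 1)

n_voters        <- 20
votes_per_voter <- 3

## All available votes are accounted for
stopifnot(sum(tally) == n_voters * votes_per_voter)

## Share of the votes per name, in percent
round(100 * tally / sum(tally), 1)
```

With these numbers, 'robustbase' received 45 of the 60 votes, i.e. 75%.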
I plan to make the `always current' state of the (source)
package available (by "svn" or "subversion", but also https) at
the same URL as other R packages already are; I will announce
it here, when it's ready. Probably, it will also make sense to submit it
to CRAN even very early and unfinished, simply so that
Windows users who can't build R packages from source can
easily install the package.
Now we can get to work on it, i.e. putting functionality there.
We might want to really consider Andreas Ruckstuhl's posted
private package (on Dec 7) and his question
ARu> Talking about names: how should we call functions which do
ARu> i.e. robust fitting of a glm:
ARu>
ARu> rfglm (Robust Fitting of GLM)
ARu> rglm
ARu> robglm
DATA SETS
---------
Of course, since this package is somewhat focused on the
Maronna-Martin-Yohai book, we should eventually get their
datasets in there, and Ricardo and Victor agreed in Treviso to
provide them eventually.
OTOH, it's quite useful to have data sets available from the
beginning in order to write examples and tests using those data.
I've already asked some individuals about this, but do ask here
in public for useful / sensible / well known and non-large data sets
to be also part of the package; such that the examples (on each
help page!) could make use of those data sets.
I've got already what I call 'Animals2' which is another version
of the "brain vs body weight" data, namely the union of the two MASS
data sets 'Animals' and 'mammals'.
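To illustrate what "the union of the two MASS data sets" could mean in R, here is a naive sketch. It is an assumption for illustration only and not necessarily how 'Animals2' was actually built: it merges by row name, so animals that are spelled differently in the two data sets would end up as separate rows. The names `only_in_mammals` and `animals_union` are made up for the example.

```r
## Sketch only: NOT necessarily the construction used for 'Animals2'
data(Animals, package = "MASS")
data(mammals, package = "MASS")

## Both data frames have the same two columns (body and brain weight)
stopifnot(identical(names(Animals), names(mammals)))

## Naive union by row name: keep all of 'Animals' and append those
## rows of 'mammals' whose row names do not occur in 'Animals'
only_in_mammals <- setdiff(rownames(mammals), rownames(Animals))
animals_union   <- rbind(Animals, mammals[only_in_mammals, ])

dim(animals_union)
```

A careful version would also have to reconcile differing spellings of the same animal and check that the overlapping rows agree.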
Of course, Rousseeuw & Leroy (1990) contains a few dozen more
data sets, some of which would be interesting.
Several of them are currently already in Valentin's 'rrcov'
package, and -- if Valentin agrees -- I would propose to just
"mirror them" in the new 'robustbase' package. Eventually, they
could be removed from 'rrcov', namely once rrcov
"Depends" on 'robustbase' (i.e. loads or attaches robustbase
when rrcov itself is loaded).
Note BTW that the stackloss data *is* already in the core
package 'datasets' (and I don't understand why at least three
other CRAN packages have *also* provided the stackloss data,
just with slightly different variable names.. ; well, one of
them is package 'MPV' providing all data sets from the book 'Montgomery,
Peck & Vining').
Also, we really don't need data sets that are already in
"standard" or "recommended" R packages; i.e., notably we could
well make use of all those you see from
data(package = "datasets") # standard
data(package = "MASS") # recommended
you can always use datasets from other packages by
data(<name>, package = <packagename>)
e.g. data(mammals, package = "MASS")
Further note: Apart from univariate data, unless there are very
good reasons, we'd only want data frames, not matrices or
single vectors, since the latter can always easily be
extracted from the data frames.
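As a small illustration of that last point, using the stackloss data mentioned above (the names `X` and `y` are only for this sketch):

```r
data(stackloss, package = "datasets")

## The data set is stored as a data frame ...
stopifnot(is.data.frame(stackloss))

## ... from which a numeric matrix of the explanatory variables
X <- data.matrix(stackloss[, c("Air.Flow", "Water.Temp", "Acid.Conc.")])

## ... or a single vector (here the response) is easily extracted
y <- stackloss[["stack.loss"]]

str(X)
str(y)
```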
Now if you'd consider "donating" data sets to the 'robustbase'
package, please send me two files,
1) a table (*.tab, *.txt or *.csv) file, typically; or a binary *.rda
file, see the manual "Writing R Extensions", section 'Package
subdirectories'
2) a *.Rd file as produced from prompt(<your_dataframe>)
and edited __by you__ where you have filled in the relevant
information about the data.
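For contributors unsure how to produce these two files, a minimal sketch follows. The data frame `mydata` and its values are purely hypothetical stand-ins for a real data set, and the files are written to a temporary directory only for the sake of the example.

```r
## Hypothetical small data frame standing in for a donated data set
mydata <- data.frame(x = c(1.2, 3.4, 5.6),
                     y = c(10, 20, 300))

out_dir <- tempdir()   # assumption: write to a temporary directory

## 1) the data, as a plain-text table ...
tab_file <- file.path(out_dir, "mydata.tab")
write.table(mydata, file = tab_file)

##    ... or, alternatively, as a binary *.rda file
rda_file <- file.path(out_dir, "mydata.rda")
save(mydata, file = rda_file)

## 2) a skeleton help file, to be completed by hand afterwards
rd_file <- file.path(out_dir, "mydata.Rd")
prompt(mydata, filename = rd_file)
```

prompt() only writes a template; the description, the meaning of the variables and the source of the data still have to be filled in by the contributor.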
Martin Maechler.
"MM" == Martin Maechler <maechler at stat.math.ethz.ch>
on Thu, 15 Dec 2005 17:52:46 +0100 writes:
MM> I didn't get strong statements for yet another name, so the vote
MM> "is it" ... Well, it seems pretty clear at the moment, but
MM> then, many more people could still vote...
MM> A simple R script (and a cron job) auto-produces the
MM> webpage
MM> http://stat.ethz.ch/~maechler/R-sig-robust-vote.html
MM> from the data file {that I update manually}.
MM> I know from politics that it is strictly forbidden to publish
MM> polls when the vote is still happening {because of the
MM> well-known "the winner takes it all" effect},
MM> but then we want to have some fun, too... ;-)
ARu> Talking about names: how should we call functions which do
ARu> i.e. robust fitting of a glm:
ARu>
ARu> rfglm (Robust Fitting of GLM)
ARu> rglm
ARu> robglm
I have no preference, but consistency with the robust version of lm is probably needed.
Happy Christmas / Season's greetings to you all!
Eva
Dr Eva Cantoni                  phone  : (+41) 22 379 8240
Econométrie - Univ. Genève      fax    : (+41) 22 379 8299
40, Bd du Pont d'Arve           e-mail : Eva.Cantoni at metri.unige.ch
CH-1211 Genève 4                http://www.unige.ch/ses/metri/cantoni