Sorry for leaving this one in my mail box for so long, but - well, I
suppose you know what I mean.
(I'm shifting it over to r-devel, so I'll include all your original
text)
Kurt Hornik <hornik@ci.tuwien.ac.at> writes:
Well, ctest is not making progress as quickly as I wanted it ...
Anyway, here are a few questions/remarks.
* I am still a bit confused about what binom.test() does. For which
test are the p-values computed? In theory, ``the'' test to use would be
the optimal unbiased one for one-parameter exponential families, which I
think is not used ... also, this would be a randomized test, how is the
p-value for such a test defined? I should really appreciate someone
helping me out on this.
I really don't think we should do randomized tests, except possibly as
an option. Different people getting different p-values for the same
data??
If one must do it, I conjecture that one could get a "p-value" by
looking at x + runif(1,-.5,.5) and linearly interpolating between - er,
draw a picture of the density of the modified x and think about it...
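
One way to read that conjecture, as a rough sketch only: smear the point
mass at x uniformly over [x - 0.5, x + 0.5] and take the upper-tail
probability of the smeared variable at y = x + u. The function name
randomized_binom_p and the restriction to the one-sided alternative
p > p0 are my assumptions for illustration, not anything in ctest.

```r
# Upper-tail "p-value" for H0: p = p0 against p > p0, based on
# y = x + u with u ~ U(-0.5, 0.5).  (Hypothetical helper, one-sided only.)
randomized_binom_p <- function(x, n, p0, u = runif(1, -0.5, 0.5)) {
  above <- pbinom(x, n, p0, lower.tail = FALSE)  # P(X > x)
  at    <- dbinom(x, n, p0)                      # P(X = x)
  # Fraction of the smeared mass at x that lies above y is (0.5 - u),
  # so the tail probability is linear in u between the two usual values.
  above + (0.5 - u) * at
}

# The two endpoints recover the ordinary tail probabilities:
p_conservative <- randomized_binom_p(7, 10, 0.5, u = -0.5)  # P(X >= 7)
p_strict       <- randomized_binom_p(7, 10, 0.5, u =  0.5)  # P(X >  7)
```

With u left at its default, two people do get different answers for the
same data, which is exactly the objection above.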
* I would still like to come up with rather general functions for
location.test()
scale.test()
and perhaps some more ... Some time ago, Peter and Tony Rossini and I
had quite a vivid exchange of emails on this, but I seem to have lost
our final findings (in case we ever got this far).
Anyway, any input on this would be great. Remember, the basic idea was
to have a unified approach to e.g. several non-parametric tests for a
difference in location or scale (rather than having mood.test,
ansaribradley.test, vanderwaerden.test, ...). However, as PD pointed
out, it would be a bad idea to use this general scheme for tests which
don't really fit in there (such as the Wilcoxon tests). In fact, I seem
to remember that one issue was whether there are any tests which really
are location or scale tests ...
I think the main point was that they're not *median* tests
(irrespective of what the SAS output says!). "Location test" is
probably OK. My basic worry was the risk of losing the simplicity of
having well known standard tests called simply t.test(),
wilcoxon.test(), in favour of a perhaps unnecessarily abstract
taxonomy. Of course there's always the possibility to do things like
spearman.test <- function(...) cor.test(..., method="spearman")
etc.
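
A small check of the wrapper idea, assuming cor.test() accepts
method="spearman" as it does in current R. The data are made up purely
for illustration.

```r
# Thin wrapper: a well-known name that forwards to the general function.
spearman.test <- function(...) cor.test(..., method = "spearman")

# Made-up data without ties, so the exact test is used without warnings.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 1, 4, 3, 6, 5)

res <- spearman.test(x, y)
rho <- unname(res$estimate)   # Spearman's rank correlation
```

The wrapper adds no logic of its own, so the simple names can coexist
with whatever general scheme sits underneath.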
* Speaking of the Wilcoxon tests, I still need to add exact computations
for the small-sample cases. Does anyone have code or algorithms for
doing that?
Signed rank is trivial, you just generate the 2^k different sign
patterns and look at the distribution of the sums. Even in interpreted
code, this can be done for k up to 16 or so, at which point the
difference from the approximation is immaterial. The bit patterns are
simply all binary numbers between 0 and 2^k-1.
The two-sample case is a bit more unwieldy...
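
For the signed-rank case, the enumeration could be sketched roughly as
follows. The function name signed_rank_exact is hypothetical, and the
sketch assumes no ties and drops zero differences; it is meant to show
the bit-pattern idea, not to be the code that goes into ctest.

```r
# Exact two-sided signed-rank p-value by brute-force enumeration.
# Assumes no ties among |x|; zeros are dropped.  (Hypothetical helper.)
signed_rank_exact <- function(x) {
  x <- x[x != 0]
  k <- length(x)
  r <- rank(abs(x))

  # All binary numbers 0 .. 2^k - 1; bit j says whether rank j counts
  # as "positive" in that sign pattern.
  patterns <- 0:(2^k - 1)
  bits <- vapply(seq_len(k),
                 function(j) (patterns %/% 2^(j - 1)) %% 2,
                 numeric(2^k))           # 2^k rows, k columns

  sums <- as.vector(bits %*% r)          # statistic under each pattern
  v <- sum(r[x > 0])                     # observed sum of positive ranks

  p_lower <- mean(sums <= v)
  p_upper <- mean(sums >= v)
  list(statistic = v, p.value = min(1, 2 * min(p_lower, p_upper)))
}

# Made-up sample with k = 7, i.e. 128 sign patterns.
x <- c(1.2, -0.5, 2.3, 0.7, -1.9, 3.1, 0.4)
res <- signed_rank_exact(x)
```

The matrix of bits has 2^k rows, so k = 16 means 65536 rows, which is
still manageable in interpreted code.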
* I also mentioned some time ago that I'd like to make Fisher's test
available for tables larger than 2 by 2. There is an implementation
(FEXACT) of the Mehta and Patel algorithm available via APSTAT (I
think). However, when I last used it (for an association analysis of a
gene with 12 alleles) it could not deal with the ``large'' 12 by 2
table. (More precisely, it can deal with it after enlarging some size
parameters in the sources and recompiling, but that's not the smart
way of doing things ...) Again, does anyone have a suggestion what to do
here? (FEXACT has a ``mixed'' method of dealing with larger tables, but
it seems stupid to have an R function which may produce a message like
``no, I need more memory ... please try to change param XYZ and then
recompile''.)
If one can precompute the size of the array, one can usually allocate
it in R and pass it as a parameter instead.
As you know, *my* main desire for ctest is to allow model formula
specifications for all of the common tests (for consistency), and in
the slightly longer run also to include trend tests and stratification.
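
To make the formula idea concrete, here is a minimal sketch of a formula
front end for a two-sample test. The name location_test_formula is
hypothetical, and using wilcox.test() underneath is just an assumption
for the example; any two-sample test taking (x, y) would slot in the
same way.

```r
# Formula front end: response ~ group, with exactly two groups.
# (Hypothetical helper; the underlying test is an arbitrary choice.)
location_test_formula <- function(formula, data) {
  mf <- model.frame(formula, data)       # response first, then grouping
  groups <- split(mf[[1]], mf[[2]])      # one vector per group level
  stopifnot(length(groups) == 2)
  wilcox.test(groups[[1]], groups[[2]])
}

# Made-up data without ties.
d <- data.frame(y = c(1.1, 2.4, 3.2, 4.8, 5.5, 6.9, 7.3, 8.6),
                g = rep(c("a", "b"), each = 4))
res_formula <- location_test_formula(y ~ g, data = d)
```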
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._