Kolmogorov-Smirnov: calculate p value given as input the test statistic
3 messages · Rani Elkon, Rui Barradas, Brian Ripley
Hello,
You can compute the p-value from the test statistic if you know the
sample sizes. R calls functions written in C for the various cases;
for the two-sample case, this is the code (edited):
n.x <- 100  # length of 1st sample
n.y <- 100  # length of 2nd sample
STATISTIC <- 1.23
PVAL <- 1 - .C("psmirnov2x",
               p = as.double(STATISTIC),
               as.integer(n.x),
               as.integer(n.y))$p
PVAL <- min(1.0, max(0.0, PVAL))
For the other cases check the source, file stats/ks.test.R.
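If calling the internal C routine is not an option, the large-sample (asymptotic) two-sided p-value can be sketched in plain R from the limiting Kolmogorov distribution. This is only the asymptotic approximation, not the exact computation done by psmirnov2x, and the function name ks2_asymptotic_p is made up for this illustration:

```r
# Asymptotic two-sided p-value for the two-sample KS statistic D.
# Uses the limiting series 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lambda^2)
# with lambda = sqrt(n.x * n.y / (n.x + n.y)) * D.
# ks2_asymptotic_p is a hypothetical helper, not part of R.
ks2_asymptotic_p <- function(D, n.x, n.y, k.max = 100L) {
  stopifnot(length(D) == 1, D >= 0, D <= 1, n.x > 0, n.y > 0)
  n <- n.x * n.y / (n.x + n.y)
  lambda <- sqrt(n) * D
  # For very small lambda the series converges slowly, and the
  # p-value is 1 for all practical purposes.
  if (lambda < 0.2) return(1)
  k <- seq_len(k.max)
  p <- 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * lambda^2))
  min(1, max(0, p))
}

ks2_asymptotic_p(0.2, 100, 100)  # roughly 0.0366
```

For small samples ks.test() uses an exact p-value by default, so this approximation will not match it exactly; it should agree with ks.test(x, y, exact = FALSE).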
As for the second question, I believe the answer is no; you must provide
at least one sample and a CDF. Something like:
x <- rnorm(100)
f <- ecdf(rnorm(100))
ks.test(x, f)
Hope this helps,
Rui Barradas
On 03-03-2013 09:58, Rani Elkon wrote:
Dear all, I calculate the test statistic for the KS test outside R, and wish to use R only to calculate the corresponding p-value. Is there a way for doing this? (as far as I see, ks.test() requires raw data as input). Alternatively, is there a way to provide the ks.test() the two CDFs (two samples test) rather than the (x, y) data vectors? Thanks in advance, Rani
On 03/03/2013 09:58, Rani Elkon wrote:
Dear all, I calculate the test statistic for the KS test outside R, and wish to use R only to calculate the corresponding p-value.
There is no public way to do this in R. But you can read the code of ks.test, see how it does it, and extract the code you need. Note that ks.test covers several cases and hence has several branches of code to compute p-values. Also (and this is one good reason why there is no public interface), the internal code differs by version of R (so another answer I have just seen is wrong for pre-3.0.0).
Is there a way for doing this? (as far as I see, ks.test() requires raw data as input). Alternatively, is there a way to provide the ks.test() the two CDFs (two samples test) rather than the (x, y) data vectors?
Yes, because if you have the CDF you can recover the sorted data vector which is all you need.
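A minimal sketch of that recovery, assuming each CDF is available as an ecdf object and the sample sizes are known. The helper name recover_sample is invented for this example:

```r
# Rebuild the sorted sample from an empirical CDF.
# recover_sample is a hypothetical helper, not part of R.
recover_sample <- function(Fn, n) {
  v <- knots(Fn)                           # distinct sorted data values
  counts <- round(diff(c(0, Fn(v))) * n)   # jump heights give multiplicities
  rep(v, counts)
}

set.seed(1)
x <- rnorm(50)
y <- rnorm(60, mean = 0.5)
Fx <- ecdf(x)
Fy <- ecdf(y)

x2 <- recover_sample(Fx, length(x))
y2 <- recover_sample(Fy, length(y))
ks.test(x2, y2)  # same result as ks.test(x, y)
```

The jump of the step function at each knot is count/n, so multiplying by n and rounding recovers how often each value occurred, which also handles ties.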
Thanks in advance, Rani
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595