Crosstable-like analysis (ks test) of dataframe

3 messages · Johannes Radinger, Rui Barradas

#
Hi,

I have a data frame with multiple (approx. 20) columns, each containing
a vector of values drawn from a different distribution.
Now I'd like to create a cross table
in which I compare the distribution of each vector (df column) with
every other one. For the comparison I want to use ks.test().
The result should have the column names of the input data frame as its
row and column names, and the cells should be populated with
the p-value of the ks.test for each pairwise comparison.

My data.frame looks like:
df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))

And the test for one single case is:
ks <- ks.test(df$X,df$Z)

where the p value is:
ks$p.value

How can I automate this pairwise analysis?
Any suggestions? I guess this is quite a common analysis (probably also
with other tests).

cheers,
Johannes
#
Hello,

Try the following.


# Wrapper around ks.test() that returns only the p-value.
f <- function(x, y, ...,
         alternative = c("two.sided", "less", "greater"), exact = NULL){
     alternative <- match.arg(alternative)  # default to "two.sided"
     #w <- getOption("warn")
     #options(warn = -1)  # ignore warnings
     p <- ks.test(x, y, ..., alternative = alternative, exact = exact)$p.value
     #options(warn = w)
     p
}

n <- 1e1
dat <- data.frame(X=rnorm(n), Y=runif(n), Z=rchisq(n, df=3))

apply(dat, 2, function(x) apply(dat, 2, function(y) f(x, y)))
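
The nested apply() call runs every test twice (X vs. Y and Y vs. X) and also compares each column with itself. A possible variant, as a minimal sketch only: the helper name pairwise_p and its test argument are hypothetical and do not come from this thread. It assumes the test is symmetric in its two arguments (true for the default two-sided ks.test) and returns its result with a p.value element.

```r
# Hypothetical helper (not from the thread): matrix of pairwise p-values,
# running the test once per unordered pair of columns.
pairwise_p <- function(df, test = ks.test, ...) {
    nms <- names(df)
    k <- length(nms)
    # k x k result; the diagonal stays NA (no self-comparison is run)
    res <- matrix(NA_real_, k, k, dimnames = list(nms, nms))
    if (k < 2) return(res)
    pairs <- combn(k, 2)   # 2 x choose(k, 2) matrix of column indices
    for (j in seq_len(ncol(pairs))) {
        a <- pairs[1, j]
        b <- pairs[2, j]
        p <- test(df[[a]], df[[b]], ...)$p.value
        res[a, b] <- p
        res[b, a] <- p     # mirroring assumes a symmetric test
    }
    res
}

set.seed(1)  # arbitrary seed, only to make the example reproducible
dat <- data.frame(X = rnorm(50), Y = runif(50), Z = rchisq(50, df = 3))
pmat <- pairwise_p(dat)
round(pmat, 4)
```

Any other two-sample test that returns a p.value element could presumably be swapped in, e.g. pairwise_p(dat, test = wilcox.test). With about 20 columns this gives 190 tests, so adjusting the p-values for multiple comparisons may be worth considering, e.g. p.adjust(pmat[upper.tri(pmat)], method = "holm").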

Hope this helps,

Rui Barradas
On 28-09-2012 11:10, Johannes Radinger wrote:
#
Thank you Rui!

That works just as I wanted... :)

/Johannes
On Fri, Sep 28, 2012 at 12:30 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote: