Ripley's K-function and CSR/CSRT test for Very Large Dataset

2 messages · Massimiliano Tripoli, Barry Rowlingson

#
Hi,
I am trying to apply the CSR and CSRT tests in a case-control context.
The control data contain ~400,000 points. I also have about 500 sets of
case data, ranging in size from 100 to 4000 events, and for each set I
want to test for interaction between the cases and the controls.

I am using Ripley's K-function and the test statistic
D(h)/sqrt(var(D)) for each distance h in the range.

The problem is that I should use Monte Carlo hypothesis testing to
obtain the p-value, but with control data of this size the
computational cost is prohibitive.

Are there other tests, less computationally expensive, for verifying
the null hypothesis with Ripley's K-function on very large
datasets?

Regards

Massimiliano Ruocco
PhD Student
Department of Computer and Information Science (IDI), NTNU
Office: 260 IT-bygget
Phone:(+47) 735 94168
Email:ruocco at idi.ntnu.no
Website: http://www.idi.ntnu.no/~ruocco/
#
On Thu, May 26, 2011 at 9:21 AM, ruocco <ruocco at idi.ntnu.no> wrote:
Any reason why you can't randomly sample from your control points,
and do the analysis with 4,000 control points instead of 400,000?
Repeat that a few times to get an idea of the sensitivity and that's
job done.

K-function tests for 500 cases/4000 controls should be doable on a
PC these days.

Barry
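
The subsampling suggestion above could be sketched roughly as follows, in Python with NumPy. This is only an illustration under stated assumptions, not the method either poster used: the function names are hypothetical, the K estimate has no edge correction, the study region is assumed to have a known area, and var(D) is approximated by the spread of D(h) under random relabelling rather than by an analytical formula. The standardised differences are summed over the distances to give a single one-sided statistic.

```python
import numpy as np

def pairwise_distances(pts):
    """n x n Euclidean distance matrix; the diagonal is set to inf to drop self-pairs."""
    dx = pts[:, 0][:, None] - pts[:, 0][None, :]
    dy = pts[:, 1][:, None] - pts[:, 1][None, :]
    dist = np.hypot(dx, dy)
    np.fill_diagonal(dist, np.inf)
    return dist

def k_differences(dist, labels, radii, area):
    """D(h) = K_cases(h) - K_controls(h) for each labelling (row) and each radius.

    Uses the naive estimate K(h) = area * (#ordered pairs within h) / (n * (n - 1)),
    with no edge correction (an assumption of this sketch).
    """
    z1 = labels.astype(float)        # 1 where a point is labelled "case"
    z0 = 1.0 - z1                    # 1 where a point is labelled "control"
    n1, n0 = z1.sum(axis=1), z0.sum(axis=1)
    out = np.empty((labels.shape[0], len(radii)))
    for k, h in enumerate(radii):
        close = (dist <= h).astype(float)
        pairs1 = ((z1 @ close) * z1).sum(axis=1)   # case-case pairs within h
        pairs0 = ((z0 @ close) * z0).sum(axis=1)   # control-control pairs within h
        out[:, k] = area * (pairs1 / (n1 * (n1 - 1)) - pairs0 / (n0 * (n0 - 1)))
    return out

def random_labelling_test(cases, controls, radii, area, n_sub=4000, n_sim=99, seed=0):
    """Subsample the controls, then run a Monte Carlo random-labelling test."""
    rng = np.random.default_rng(seed)
    if len(controls) > n_sub:
        keep = rng.choice(len(controls), size=n_sub, replace=False)
        controls = controls[keep]
    pts = np.vstack([cases, controls])
    observed = np.arange(len(pts)) < len(cases)
    # Row 0 is the observed labelling; the remaining rows are random relabellings.
    labels = np.array([observed] + [rng.permutation(observed) for _ in range(n_sim)])
    d = k_differences(pairwise_distances(pts), labels, radii, area)
    sd = d[1:].std(axis=0, ddof=1)   # Monte Carlo stand-in for sqrt(var(D))
    sd[sd == 0] = 1.0
    t = (d / sd).sum(axis=1)
    p_value = (1 + np.sum(t[1:] >= t[0])) / (n_sim + 1)
    return d[0], p_value

# Synthetic illustration only: clustered "cases", uniform "controls" on the unit square.
demo_rng = np.random.default_rng(1)
demo_cases = 0.4 + 0.1 * demo_rng.random((50, 2))
demo_controls = demo_rng.random((2000, 2))
d_obs, p = random_labelling_test(demo_cases, demo_controls, radii=[0.02, 0.05],
                                 area=1.0, n_sub=300, n_sim=99, seed=0)
print(d_obs, p)
```

Calling the function several times with different values of `seed` draws a different control subsample each time, which gives a feel for how sensitive the p-value is to the subsample. The dense distance matrix is fine for a few thousand points but would not scale to the full 400,000 controls.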