Hi,

I am trying to apply the CSR and CSRT tests in a case-control context. The control data contain ~400,000 points. I then have a set of case data (around 500; I want to verify, for each one, the interaction between the case and control data), and the size of these data varies from 100 to 4,000 events. I am using Ripley's K-function and the test statistic D(h)/sqrt(var(D)) for each distance h in the range. The problem is that I should use Monte Carlo hypothesis testing to obtain the p-value, but with control data of this size the computational cost is prohibitive. Are there other, less computationally expensive tests of the null hypothesis that apply Ripley's K-function to very large datasets?

Regards,
Massimiliano Ruocco
PhD Student
Department of Computer and Information Science (IDI), NTNU
Office: 260 IT-bygget
Phone: (+47) 735 94168
Email: ruocco at idi.ntnu.no
Website: http://www.idi.ntnu.no/~ruocco/
Ripley's K-function and CSR/CSRT test for Very Large Dataset
2 messages · Massimiliano Tripoli, Barry Rowlingson
On Thu, May 26, 2011 at 9:21 AM, ruocco <ruocco at idi.ntnu.no> wrote:
Hi, I am trying to apply the CSR and CSRT tests in a case-control context. The control data contain ~400,000 points. I then have a set of case data (around 500; I want to verify, for each one, the interaction between the case and control data), and the size of these data varies from 100 to 4,000 events. I am using Ripley's K-function and the test statistic D(h)/sqrt(var(D)) for each distance h in the range. The problem is that I should use Monte Carlo hypothesis testing to obtain the p-value, but with control data of this size the computational cost is prohibitive. Are there other, less computationally expensive tests of the null hypothesis that apply Ripley's K-function to very large datasets?
Is there any reason why you can't randomly sample from your control points and do the analysis with 4,000 control points instead of 400,000? Repeat that a few times to get an idea of the sensitivity and that's the job done. K-function tests for 500 cases/4,000 controls should be doable on a PC these days.

Barry
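For readers who want to try the subsampling suggestion, here is a minimal sketch in plain Python, not code from either poster. It assumes points are (x, y) tuples, uses a naive K estimate with no edge correction, takes D(h) to be the case K minus the control K, estimates var(D(h)) from the random relabellings themselves, and sums D(h)/sqrt(var(D(h))) over h for a one-sided test. All function names are hypothetical, and n_sim must be at least 2.

```python
import bisect
import math
import random
import statistics


def subsample_controls(controls, n_sub, seed=None):
    """Draw n_sub control points uniformly at random, without replacement."""
    return random.Random(seed).sample(controls, n_sub)


def close_pairs(points, hs):
    """List (i, j, k) for each pair i < j whose distance is <= max(hs).

    hs must be sorted ascending; k indexes the smallest threshold in hs
    that is >= the distance between points i and j.
    """
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            k = bisect.bisect_left(hs, math.dist(points[i], points[j]))
            if k < len(hs):
                pairs.append((i, j, k))
    return pairs


def k_difference(pairs, is_case, n_bins, area):
    """D(h) = K_case(h) - K_control(h), naive estimate (no edge correction)."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    case_counts = [0] * n_bins
    ctrl_counts = [0] * n_bins
    for i, j, k in pairs:
        if is_case[i] and is_case[j]:
            case_counts[k] += 1
        elif not is_case[i] and not is_case[j]:
            ctrl_counts[k] += 1
    d, cum_case, cum_ctrl = [], 0, 0
    for k in range(n_bins):
        cum_case += case_counts[k]
        cum_ctrl += ctrl_counts[k]
        # Factor 2 converts unordered pair counts to ordered pair counts.
        k_case = area * 2 * cum_case / (n_case * (n_case - 1))
        k_ctrl = area * 2 * cum_ctrl / (n_ctrl * (n_ctrl - 1))
        d.append(k_case - k_ctrl)
    return d


def random_labelling_test(cases, controls, distances, area, n_sim=99, seed=None):
    """Return (observed D(h) per sorted distance, one-sided Monte Carlo p-value)."""
    rng = random.Random(seed)
    hs = sorted(distances)
    points = list(cases) + list(controls)
    # Distances are computed once; each simulation only permutes the labels.
    pairs = close_pairs(points, hs)
    labels = [True] * len(cases) + [False] * len(controls)
    observed = k_difference(pairs, labels, len(hs), area)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(labels)
        sims.append(k_difference(pairs, labels, len(hs), area))
    # Standard deviation of D(h) under random labelling (assumption: taken
    # from the simulations); fall back to 1.0 if a bin shows no variation.
    sds = [statistics.stdev(col) or 1.0 for col in zip(*sims)]

    def summary(d):
        return sum(x / s for x, s in zip(d, sds))

    obs_stat = summary(observed)
    n_extreme = sum(summary(d) >= obs_stat for d in sims)
    return observed, (1 + n_extreme) / (n_sim + 1)
```

Calling `subsample_controls` with several different seeds and passing each subsample to `random_labelling_test` gives one p-value per subsample, which is one way to gauge the sensitivity mentioned above. The all-pairs loop is quadratic in the number of pooled points, so a spatial index or a dedicated point-pattern package would be the more practical choice at full scale.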