dclf.test output. - R-SIG-Geo

Wed, Jul 27, 2016 10:48 AM

Hi all,
I have some marked spatial points and I am trying to assess the relative association between different types of points using the Diggle-Cressie-Loosmore-Ford (DCLF) test of CSR.
My observations fall into 4 categories (A, B, C, D) and I am testing 3 of them (A, B, C) against the fourth (D); the output is provided below. From my knowledge of the sampling area, categories "D" and "B" tend to occur all across it.
What I am trying to show is that categories "A" and "C" tend to be clustered around "D". But the u values I am getting are all positive, and the p-values are all 0.01. However, the u values that dclf.test returns for A-D and C-D are about 2.4 to 2.8 times as large as the one for B-D.
My question is: how do I interpret these values? Do they still show clustering of A and C relative to D? If yes, how do I interpret the output of dclf.test between B and D?
Thanks,
GAB



Diggle-Cressie-Loosmore-Ford test of CSR
        Monte Carlo test based on 99 simulations
        Summary function: Kcross["A", "D"](r)
        Reference function: theoretical
        Alternative: two.sided
        Interval of distance values: [0, 1.05769125]
        Test statistic: Integral of squared absolute deviation
        Deviation = observed minus theoretical

data:  Data.ppp
u = 54.931, rank = 1, p-value = 0.01

Diggle-Cressie-Loosmore-Ford test of CSR
        Monte Carlo test based on 99 simulations
        Summary function: Kcross["B", "D"](r)
        Reference function: theoretical
        Alternative: two.sided
        Interval of distance values: [0, 1.05769125]
        Test statistic: Integral of squared absolute deviation
        Deviation = observed minus theoretical

data:  Data.ppp
u = 19.315, rank = 1, p-value = 0.01

Diggle-Cressie-Loosmore-Ford test of CSR
        Monte Carlo test based on 99 simulations
        Summary function: Kcross["C", "D"](r)
        Reference function: theoretical
        Alternative: two.sided
        Interval of distance values: [0, 1.05769125]
        Test statistic: Integral of squared absolute deviation
        Deviation = observed minus theoretical

data:  Data.ppp
u = 46.829, rank = 1, p-value = 0.01
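
A note on reading the output above. The statistic u is an integral of squared deviations, so it is never negative and its sign says nothing about whether the deviation is clustering or avoidance. The Monte Carlo p-value is rank / (nsim + 1), so with 99 simulations the smallest attainable value is 1/100 = 0.01, which is exactly what rank = 1 gives. The base-R sketch below illustrates both points on made-up curves. The function name dclf_sketch and the toy data are hypothetical, and the code is only an approximation of the idea, not spatstat's implementation.

```r
# Minimal sketch of a DCLF-style Monte Carlo test on a grid of distances r.
# obs:  observed summary function (e.g. Kcross) evaluated on the grid
# sims: matrix with one column per simulated pattern, same grid
# theo: theoretical values on the grid (pi * r^2 for K under CSR)
# dr:   grid spacing, used for a Riemann-sum approximation of the integral
dclf_sketch <- function(obs, sims, theo, dr) {
  u_obs  <- sum((obs - theo)^2) * dr
  u_sims <- colSums((sims - theo)^2) * dr
  rnk    <- 1 + sum(u_sims >= u_obs)   # rank 1 means the data are the most extreme
  list(u = u_obs, rank = rnk, p.value = rnk / (ncol(sims) + 1))
}

set.seed(1)
dr   <- 0.01
r    <- seq(0, 1, by = dr)
theo <- pi * r^2

# Toy "simulations": the theoretical curve plus small noise (made-up data).
sims <- replicate(99, theo + rnorm(length(r), sd = 0.05))

# One toy observed curve lying above the theory, one lying below it.
res_above <- dclf_sketch(theo + 0.5 * r, sims, theo, dr)
res_below <- dclf_sketch(theo - 0.5 * r, sims, theo, dr)

print(res_above)
print(res_below)
```

Both toy curves give the same u, rank and p-value, although one lies above the theoretical curve and the other below it. The size of u therefore measures how far the observed function is from the reference, not in which direction.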






Rolf Turner

Wed, Jul 27, 2016 2:47 PM

I gather that your problem is that you expect to reject the null 
hypothesis of "no clustering" for A vs. D and for C vs. D, but *not* to 
reject it for B vs. D.

I *think* that your problem might be the fact that you are using a 
two-sided test, which gives, roughly speaking, a test of "no 
association" rather than a test of "no clustering".  It could be the 
case that points of types B and D tend to *avoid* each other, so you get 
"significant" association between B and D, although the B points do the 
opposite of clustering around D points.

It's hard to tell for sure without a *reproducible example* (!!!).  We 
don't have access to Data.ppp.

Try using alternative="greater" in your call to dclf.test() and see if 
the results are more in keeping with your expectations.

cheers,

Rolf Turner

Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
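
The difference between the two-sided and the one-sided test described in the reply above can be sketched in base R. According to the spatstat documentation, alternative="greater" counts only positive deviations (observed above theoretical) and alternative="less" only negative ones. The truncation with pmax/pmin below is an illustrative reading of that rule, and the function name dclf_stat and the toy curve are hypothetical.

```r
# Sketch: DCLF-type statistic for each alternative, on a grid with spacing dr.
dclf_stat <- function(obs, theo, dr,
                      alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  dev <- obs - theo                      # deviation = observed minus theoretical
  dev <- switch(alternative,
                two.sided = dev,           # both directions count
                greater   = pmax(dev, 0),  # only observed > theoretical counts
                less      = pmin(dev, 0))  # only observed < theoretical counts
  sum(dev^2) * dr                        # Riemann-sum approximation of the integral
}

dr   <- 0.01
r    <- seq(0, 1, by = dr)
theo <- pi * r^2

# Toy curve lying below the theory, as if the two types avoided each other.
obs_avoid <- theo - 0.3 * r

u_two     <- dclf_stat(obs_avoid, theo, dr, "two.sided")
u_greater <- dclf_stat(obs_avoid, theo, dr, "greater")
u_less    <- dclf_stat(obs_avoid, theo, dr, "less")

print(c(two.sided = u_two, greater = u_greater, less = u_less))
```

For a curve that lies entirely below the theoretical one, the two-sided statistic is large while the "greater" statistic is zero. A pattern of avoidance can therefore be "significant" in the two-sided test and yet give no evidence at all in the one-sided test for clustering.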


Guy Bayegnak

Wed, Jul 27, 2016 3:59 PM

Thanks for your response, Rolf.
You summarized it correctly. However, B and D do not necessarily avoid each other. They can, and in fact do, occur next to each other at times just by coincidence, simply because both categories tend to occur all over the place, whereas I think A and C are influenced by D. I added alternative="greater" but I still get the same results.
A sample of my data is provided below (I have more than 800 points).

   Longitude   Latitude   Type
1  -113.1923   51.02913   C
2  -113.2013   52.83306   A
3  -113.6834   51.06585   A
4  -113.0295   50.97140   C
5  -113.2366   50.96440   A
6  -113.5849   51.37568   A
7  -113.6877   51.09027   D
8  -113.5371   51.82780   D

I used the following code and got the results provided earlier:


dclf.test(Data.ppp, Kcross, i = "A", j = "D", alternative = "greater", correction = "border")
dclf.test(Data.ppp, Kcross, i = "B", j = "D", alternative = "greater", correction = "border")
dclf.test(Data.ppp, Kcross, i = "C", j = "D", alternative = "greater", correction = "border")




Thanks,
GAB
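
Since the thread asks for a reproducible example, here is a minimal sketch of how a multitype point pattern could be built from a table like the one above. It uses only the eight sample rows from the post. The object names pts, win and sample.ppp are hypothetical, and the window is an assumption (a slightly padded bounding box), because the real sampling area is not given in the thread. The coordinates are used as posted, in degrees, so distances are in degrees as well.

```r
library(spatstat)

# The eight sample rows from the post (the full data set has more than 800 points).
pts <- data.frame(
  Longitude = c(-113.1923, -113.2013, -113.6834, -113.0295,
                -113.2366, -113.5849, -113.6877, -113.5371),
  Latitude  = c(51.02913, 52.83306, 51.06585, 50.97140,
                50.96440, 51.37568, 51.09027, 51.82780),
  Type      = c("C", "A", "A", "C", "A", "A", "D", "D")
)

# ASSUMPTION: the observation window is a padded bounding box of the points.
# The true sampling area should be used instead when it is known.
win <- owin(xrange = range(pts$Longitude) + c(-0.1, 0.1),
            yrange = range(pts$Latitude)  + c(-0.1, 0.1))

# Marks must be a factor for the pattern to be treated as multitype,
# which is what Kcross() needs.
sample.ppp <- ppp(pts$Longitude, pts$Latitude, window = win,
                  marks = factor(pts$Type))

print(sample.ppp)
print(table(marks(sample.ppp)))
```

Eight points are far too few for dclf.test to say anything useful. The sketch only shows the shape of the object that the calls above expect.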


Rolf Turner

Wed, Jul 27, 2016 4:49 PM

(1) It would then appear to be the case that points of types B and D 
*do* tend to cluster together despite your expectations.

(2) What is the appearance of an envelope plot just for the Kcross 
function between B and D?

(3) If these ideas don't clear up the problem, perhaps you could make 
the data set available to me, off-list, and I could have a go at 
exploring the pattern and see if I can understand what's going on.

(4) It is always possible that there is something that I haven't 
properly comprehended in respect of these issues.  In particular I now 
feel a little bit nervous about the fact that as it stands your test is 
based on simulations of patterns that are CSRI (completely spatially 
random with independence of types).  It might be the case that this is 
inappropriate.  I'll have to think about this a bit more.

(5) I am a bit puzzled by the fact that you get "the same results" when 
you use alternative="greater".  Generally a one-sided test should yield
a smaller p-value than a two-sided test when the data are "pointing in 
the direction of the alternative hypothesis".

E.g.:

library(spatstat)  # provides the ants data, envelope(), Kcross() and dclf.test()
set.seed(42)
E <- envelope(ants, fun = Kcross, i = "Cataglyphis", j = "Messor",
              savepatterns = TRUE)
dclf.test(E)                           # gives a p-value of 0.7
dclf.test(E, alternative = "greater")  # gives a p-value of 0.23

cheers,

Rolf

Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
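
Points (2) and (4) of the message above can be sketched as follows. Data.ppp is not available, so the built-in ants data stand in for it; in the original problem the calls would use Data.ppp with i = "B" and j = "D". The first envelope uses the default null (CSRI). The second uses random labelling, simulated with rlabel(), as one possible alternative null. That choice is an assumption made here for illustration and is not something proposed in the thread.

```r
library(spatstat)

set.seed(42)

# (2) Envelope of Kcross under the default null of CSRI.
E_csri <- envelope(ants, Kcross, i = "Cataglyphis", j = "Messor",
                   nsim = 39, savefuns = TRUE, verbose = FALSE)
plot(E_csri, main = "Kcross envelope, CSRI null")

# (4) ASSUMPTION: random labelling as an alternative null. The locations are
#     held fixed and the type labels are permuted among them in each simulation.
E_rl <- envelope(ants, Kcross, i = "Cataglyphis", j = "Messor",
                 nsim = 39, simulate = expression(rlabel(ants)),
                 savefuns = TRUE, verbose = FALSE)
plot(E_rl, main = "Kcross envelope, random-labelling null")
```

If the observed curve runs above the envelope over a range of distances, that suggests clustering of the two types at those distances; a curve below the envelope suggests avoidance. An envelope computed with savefuns = TRUE or savepatterns = TRUE can also be passed directly to dclf.test(), as in the example in the message above.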
