Create distance 'neighborhood' (zone of indifference) when clustering binary data

Hi Roger,

thanks so much for your suggestion. I am running the code on a large
dataset and have been reading up on "sparse matrices"; however, I haven't
come across anything in the R or GIS literature that explains in simple
terms how "going out to a sparse matrix with both, then back again"
achieves the goals I set.
Please do read the code. There is an aggregate method for nb objects in 
spdep, but no aggregate method for listw objects. Consequently, to 
aggregate them, I converted each to a sparse matrix format, added, and 
then converted back to listw format. My mistake was to think that trimming 
the >100m neighbours back to 0 would work - it didn't always do so. Again, 
read the code very carefully, this isn't about words, it's about the 
representation of objects in code.
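
As a rough illustration of the representation described above, the same "build two weight sets, add them" idea can be written with ordinary dense matrices in base R. This is only a sketch under assumptions: the coordinates are simulated, the 300 m outer cutoff is a hypothetical stand-in for maxD, and dense matrices replace the sparse ones used with spdep, so it would not scale to a large dataset.

```r
# Toy coordinates in metres (simulated, for illustration only)
set.seed(1)
coords <- matrix(runif(40, 0, 500), ncol = 2)
d <- as.matrix(dist(coords))                 # pairwise Euclidean distances

# Core neighbours: binary weight 1 for pairs within 100 m (self excluded)
core <- (d > 0 & d <= 100) * 1

# Outer ring: pairs beyond 100 m, up to a hypothetical 300 m cutoff
ring <- d > 100 & d <= 300
decay <- matrix(0, nrow(d), ncol(d))
# Inverse distance measured from the edge of the core, capped at 1 so that
# no decayed weight exceeds a core weight
decay[ring] <- 1 / pmax(d[ring] - 100, 1)

# The two neighbour sets are disjoint, so adding never counts a pair twice
w <- core + decay
```

Because the core and ring sets do not overlap, each pair receives its weight from exactly one of the two matrices, which is what makes the simple addition safe.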
Could you please point me to a reference that has employed a similar
approach, or perhaps explain yourself how doing this creates a region
outside of the 'core' neighborhood in which points gradually have less
influence with distance? Thanks.

No, no idea, your problem.

Roger
Alberto

On Wed, Nov 23, 2011 at 12:55 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:

On Wed, 23 Nov 2011, Roger Bivand wrote:

 On Wed, 23 Nov 2011, Alberto Gallano wrote:
 Hi Roger,
thanks for your reply. The data I posted is only a subset of my real data,
in which there are some points less than 100 meters apart.

To clarify what I want:

1) I would like to create neighbourhoods of 100 meters within which all
points are neighbors. I think my code will do that with the real dataset if
I set the upper bound of dnearneigh to 100.

2) I want points outside of this 100 meter neighborhood to still be
neighbors, *but* with decreasing weight by distance. I believe this is
referred to as inverse distance weighting. I may want to accentuate this by
using squared inverse distance weighting.

The question is how can I accomplish task 2? Here is my attempt:

Try going out to a sparse matrix with both, then back again:

library(spdep)
set.seed(1)
coords <- matrix(runif(1000, 0, 2000), ncol=2)
nb1 <- dnearneigh(coords, 0, 100)
nb1
k1 <- knn2nb(knearneigh(coords, k=1))
maxD <- max(unlist(nbdists(k1, coords)))
nb2 <- dnearneigh(coords, 100, maxD)
set.ZeroPolicyOption(TRUE)
lw1 <- nb2listw(nb1, style="B")
mat1 <- as(as_dgRMatrix_listw(lw1), "CsparseMatrix")
dnb2 <- nbdists(nb2, coords)
idw2 <- lapply(dnb2, function(x) 1/(x-100))
# to avoid an abrupt drop, I suggest subtracting 100m

This wasn't a good idea, as points with (x-100) < 1 end up with weights >
1. It would need trapping to reduce them to 1:

idw2 <- lapply(dnb2, function(x) {x100 <- x-100; x100 <- ifelse (x100 < 1,
1, x100); 1/x100})
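
To see why the trap matters, the two formulas can be compared on a few made-up distances just beyond the 100 m core. This is a minimal base R sketch; the distances are hypothetical, and pmax() is used here as a shorter equivalent of the ifelse() above.

```r
x <- c(100.2, 100.5, 101, 105, 150, 300)   # hypothetical distances in metres
untrapped <- 1 / (x - 100)                  # exceeds 1 when x - 100 < 1
trapped   <- 1 / pmax(x - 100, 1)           # never exceeds the core weight of 1
print(cbind(x, untrapped, trapped))
```

The first two rows show untrapped weights well above 1, while the trapped column stays at 1 until a point is more than 1 m past the core boundary.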

Roger

all(unlist(sapply(idw2, function(x) is.finite(x))))
lw2 <- nb2listw(nb2, glist=idw2, style="B")
mat2 <- as(as_dgRMatrix_listw(lw2), "CsparseMatrix")
mat12 <- mat1 + mat2
image(mat1)
image(mat2)
image(mat12)
summary(rowSums(mat12))
summary(colSums(mat12))
lw12 <- mat2listw(mat12, style="B")
lw12
table(card(lw12$neighbours))

Hope this helps,

Roger

coords <- cbind(dat$x, dat$y)
k1 <- knn2nb(knearneigh(coords))
maxD <- max(unlist(nbdists(k1, coords)))
datnb <- dnearneigh(coords, 0, maxD)

# general weights - inverse distance squared
dlist <- nbdists(datnb, coords)
idlist <- lapply(dlist, function(x) (1/x)^2)

datlistw.id2 <- nb2listw(datnb, glist=idlist, style="B",
zero.policy=TRUE)

joincount.test(as.factor(dat$present), datlistw.id2, zero.policy=TRUE,
 alternative="greater", spChk=NULL, adjust.n=TRUE)

Does this make sense? It seems so to me. (There is a warning given, but I
think that is because my example has so few points).

thank you,

Alberto

On Mon, Nov 21, 2011 at 3:29 AM, Roger Bivand <Roger.Bivand at nhh.no>
wrote:

 On Sun, 20 Nov 2011, Alberto Gallano wrote:
 Hi, I'm trying to do join count analysis on binary data, but I'd like to
create neighborhoods of 'indifference' of 100 meters around each spatial
point.

I've been using dnearneigh to create a neighbor list and knearneigh to
calculate the maximum of the nearest neighbor distances for the upper bound
argument to dnearneigh (so that no observations become islands).

My question is, how do I create neighborhoods based on distance? Should I
just input a number into the upper bound argument of dnearneigh, without
first using knearneigh? If so, what number would correspond to 100 meters
(Euclidean)? (btw, the coordinates are already projected).

If you mean that you define neighbours as points j within 100m of point i,
then:

datnb <- dnearneigh(coords, 0, 100)

will do this if your coordinates are measured in metres. This doesn't work
for your example, because the closest points are almost 1200m apart, so no
point has any neighbours for this definition. However, your mentioning
neighborhoods of 'indifference' makes me uncertain that this is what you
mean. Do you mean placing a buffer around each point before finding
neighbours?
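
The point about the sample data can be checked without spdep. The sketch below uses base R's dist() on the 20 sample coordinates posted in this thread to find each point's nearest-neighbour distance; the variable names nn, min_nn and max_nn are illustrative only.

```r
# x/y coordinates copied from the sample data in this thread (metres, projected)
x <- c(332940, 316301, 312714, 306008, 312248, 329276, 329663, 341535,
       314761, 332898, 332957, 328332, 312462, 330063, 317808, 336216,
       333763, 315049, 333855, 324406)
y <- c(4305226, 4303010, 4316685, 4309006, 4319255, 4311208, 4316837,
       4306055, 4301051, 4300625, 4330342, 4303420, 4308292, 4307181,
       4292904, 4304336, 4313750, 4297998, 4314941, 4315051)

d <- as.matrix(dist(cbind(x, y)))   # pairwise Euclidean distances
diag(d) <- Inf                      # ignore each point's distance to itself
nn <- apply(d, 1, min)              # nearest-neighbour distance per point
min_nn <- min(nn)                   # distance between the closest pair
max_nn <- max(nn)                   # largest nearest-neighbour distance
print(c(min_nn = min_nn, max_nn = max_nn))
print(sum(d <= 100))                # ordered pairs within 100 m
```

Here max_nn plays the role of maxD in the original code: it is the largest of the nearest-neighbour distances, so using it as the upper bound gives every point at least one neighbour.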

Roger

 Here is a small subsample of my data and analysis. Thanks,
Alberto

# ==========================
# data
dat <- structure(list(present = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L), x = c(332940L, 316301L,
312714L, 306008L, 312248L, 329276L, 329663L, 341535L, 314761L,
332898L, 332957L, 328332L, 312462L, 330063L, 317808L, 336216L,
333763L, 315049L, 333855L, 324406L), y = c(4305226L, 4303010L,
4316685L, 4309006L, 4319255L, 4311208L, 4316837L, 4306055L, 4301051L,
4300625L, 4330342L, 4303420L, 4308292L, 4307181L, 4292904L, 4304336L,
4313750L, 4297998L, 4314941L, 4315051L)), .Names = c("present",
"x", "y"), class = "data.frame", row.names = c(1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L))

library(spdep)
coords <- cbind(dat$x, dat$y)
k1 <- knn2nb(knearneigh(coords))
maxD <- max(unlist(nbdists(k1, coords))) # upper bound distance between neighbors
datnb <- dnearneigh(coords, 0, maxD)
summary(datnb)
print(is.symmetric.nb(datnb))

datlistw <- nb2listw(datnb, glist=NULL, style="B", zero.policy=TRUE)

# join count
joincount.test(as.factor(dat$present), datlistw, zero.policy=TRUE,
 alternative="greater", spChk=NULL, adjust.n=TRUE)
# =========================

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

 --
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

