Create distance 'neighborhood' (zone of indifference) when clustering binary data
On Sun, 27 Nov 2011, Alberto Gallano wrote:
Hi Roger, thanks so much for your suggestion. I am running the code on a large dataset and have been reading up on "sparse matrices". However, I haven't come across anything in the R or GIS literature that explains in simple terms how "going out to a sparse matrix with both, then back again" achieves the goals I set.
Please do read the code. There is an aggregate method for nb objects in spdep, but no aggregate method for listw objects. Consequently, to aggregate them, I converted each to a sparse matrix format, added, and then converted back to listw format. My mistake was to think that trimming the >100m neighbours back to 0 would work - it didn't always do so. Again, read the code very carefully, this isn't about words, it's about the representation of objects in code.
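As a rough illustration of what adding the two weights matrices achieves, here is a minimal base-R sketch. It is not the spdep code from this thread: dense matrices stand in for the sparse ones, the four coordinates and the outer cut-off `maxD` are made up, and the ring is taken to start strictly beyond 100 m (an assumption of this sketch).

```r
# Illustrative sketch only: dense base-R matrices stand in for the sparse
# matrices used in the thread; the coordinates and maxD are hypothetical.
coords <- rbind(c(0, 0), c(60, 0), c(0, 250), c(400, 0))
d <- as.matrix(dist(coords))          # pairwise Euclidean distances (metres)
maxD <- 450                           # hypothetical outer cut-off

# Core neighbourhood: binary weight 1 for every pair with 0 < d <= 100
w_core <- (d > 0 & d <= 100) * 1

# Outer ring: inverse distance measured from the edge of the core,
# with (d - 100) trapped at 1 so that no weight exceeds 1
in_ring <- d > 100 & d <= maxD
w_ring <- matrix(0, nrow(d), ncol(d))
w_ring[in_ring] <- 1 / pmax(d[in_ring] - 100, 1)

# In this sketch a pair is in the core or in the ring, never both,
# so adding the matrices merges the two sets of links into one
w_all <- w_core + w_ring
round(w_all, 4)
```

Points 1 and 2 (60 m apart) get weight 1, points 1 and 3 (250 m apart) get 1/150, and points 3 and 4 (beyond `maxD`) get 0, which is the "core plus decaying ring" structure the combined listw object is meant to represent.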
Could you please point me to a reference that has employed a similar approach, or perhaps explain yourself how doing this creates a region outside of the 'core' neighborhood in which points gradually have less influence with distance? Thanks.
No, no idea, your problem. Roger
Alberto
On Wed, Nov 23, 2011 at 12:55 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Wed, 23 Nov 2011, Roger Bivand wrote:
On Wed, 23 Nov 2011, Alberto Gallano wrote:
Hi Roger,
thanks for your reply. The data I posted is only a subset of my real data, in which there are some points less than 100 meters apart. To clarify what I want:
1) I would like to create neighbourhoods of 100 meters within which all points are neighbors. I think my code will do that with the real dataset if I set the upper bounds of dnearneigh to 100.
2) I want points outside of this 100 meter neighborhood to still be neighbors, *but* with decreasing weight by distance. I believe this is referred to as inverse distance weighting. I may want to accentuate this by using squared inverse distance weighting.
The question is how can I accomplish task 2? Here is my attempt:
Try going out to a sparse matrix with both, then back again:
library(spdep)
set.seed(1)
coords <- matrix(runif(1000, 0, 2000), ncol=2)
nb1 <- dnearneigh(coords, 0, 100)
nb1
k1 <- knn2nb(knearneigh(coords, k=1))
maxD <- max(unlist(nbdists(k1, coords)))
nb2 <- dnearneigh(coords, 100, maxD)
set.ZeroPolicyOption(TRUE)
lw1 <- nb2listw(nb1, style="B")
mat1 <- as(as_dgRMatrix_listw(lw1), "CsparseMatrix")
dnb2 <- nbdists(nb2, coords)
idw2 <- lapply(dnb2, function(x) 1/(x-100))
# to avoid an abrupt drop, I suggest subtracting 100m
This wasn't a good idea, as points with (x-100) < 1 end up with weights > 1. It would need trapping to reduce them to 1:
idw2 <- lapply(dnb2, function(x) {x100 <- x-100; x100 <- ifelse(x100 < 1, 1, x100); 1/x100})
Roger
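To make the effect of the trap concrete, here is a small base-R sketch with made-up neighbour distances (the vector `x` and the names `naive` and `capped` are hypothetical). It uses `pmax()` in place of `ifelse()`, which should give the same result for these inputs.

```r
# Hypothetical neighbour distances in metres, all beyond the 100 m core
x <- c(100.25, 101, 150, 600)

# Untrapped weights: a point just outside the core gets a weight above 1
naive <- 1 / (x - 100)

# Trapped weights: (x - 100) is raised to at least 1, so weights stay <= 1
capped <- 1 / pmax(x - 100, 1)

rbind(x = x, naive = naive, capped = capped)
```

Only the first distance is affected: its untrapped weight is 4, while the trapped weight is 1, matching the weight given to neighbours inside the core.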
all(unlist(sapply(idw2, function(x) is.finite(x))))
lw2 <- nb2listw(nb2, glist=idw2, style="B")
mat2 <- as(as_dgRMatrix_listw(lw2), "CsparseMatrix")
mat12 <- mat1 + mat2
image(mat1)
image(mat2)
image(mat12)
summary(rowSums(mat12))
summary(colSums(mat12))
lw12 <- mat2listw(mat12, style="B")
lw12
table(card(lw12$neighbours))
Hope this helps,
Roger
coords <- cbind(dat$x, dat$y)
k1 <- knn2nb(knearneigh(coords))
maxD <- max(unlist(nbdists(k1, coords)))
datnb <- dnearneigh(coords, 0, maxD)
# general weights - inverse distance squared
dlist <- nbdists(datnb, coords)
idlist <- lapply(dlist, function(x) (1/x)^2)
datlistw.id2 <- nb2listw(datnb, glist=idlist, style="B", zero.policy=TRUE)
joincount.test(as.factor(dat$present), datlistw.id2, zero.policy=TRUE, alternative="greater", spChk=NULL, adjust.n=TRUE)
Does this make sense? It seems so to me. (There is a warning given, but I think that is because my example has so few points).
thank you,
Alberto
On Mon, Nov 21, 2011 at 3:29 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Sun, 20 Nov 2011, Alberto Gallano wrote:
Hi, I'm trying to do join count analysis on binary data, but I'd like to create neighborhoods of 'indifference' of 100 meters around each spatial point. I've been using dnearneigh to create a neighbor list and knearneigh to calculate the maximum of the nearest neighbor distances for the upper bound argument to dnearneigh (so that no observations become islands). My question is, how do I create neighborhoods based on distance? Should I just input a number into the upper bound argument of dnearneigh, without first using knearneigh? If so, what number would correspond to 100 meters (Euclidean)? (btw, the coordinates are already projected).
If you mean that you define neighbours as points j within 100m of point i, then:
datnb <- dnearneigh(coords, 0, 100)
will do this if your coordinates are measured in metres. This doesn't work for your example, because the closest points are almost 1200m apart, so no point has any neighbours for this definition. However, your mentioning neighborhoods of 'indifference' makes me uncertain that this is what you mean. Do you mean placing a buffer around each point before finding neighbours?
Roger
Here is a small subsample of my data and analysis. Thanks,
Alberto
# ==========================
# data
dat <- structure(list(present = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L), x = c(332940L, 316301L,
312714L, 306008L, 312248L, 329276L, 329663L, 341535L, 314761L,
332898L, 332957L, 328332L, 312462L, 330063L, 317808L, 336216L,
333763L, 315049L, 333855L, 324406L), y = c(4305226L, 4303010L,
4316685L, 4309006L, 4319255L, 4311208L, 4316837L, 4306055L, 4301051L,
4300625L, 4330342L, 4303420L, 4308292L, 4307181L, 4292904L, 4304336L,
4313750L, 4297998L, 4314941L, 4315051L)), .Names = c("present",
"x", "y"), class = "data.frame", row.names = c(1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L))
library(spdep)
coords <- cbind(dat$x, dat$y)
k1 <- knn2nb(knearneigh(coords))
maxD <- max(unlist(nbdists(k1, coords))) # upper bound distance btw neighbors
datnb <- dnearneigh(coords, 0, maxD)
summary(datnb)
print(is.symmetric.nb(datnb))
datlistw <- nb2listw(datnb, glist=NULL, style="B", zero.policy=TRUE)
# join count
joincount.test(as.factor(dat$present), datlistw, zero.policy=TRUE,
alternative="greater", spChk=NULL, adjust.n=TRUE)
# =========================
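The remark elsewhere in this thread that a 100 m cut-off leaves every point in this sample without neighbours can be checked without spdep. The sketch below uses only base R. The x and y values are copied from the sample data, and the names `xy`, `d` and `min_d` are introduced here purely for illustration.

```r
# Coordinates copied from the sample data (projected, assumed to be metres)
x <- c(332940, 316301, 312714, 306008, 312248, 329276, 329663, 341535,
       314761, 332898, 332957, 328332, 312462, 330063, 317808, 336216,
       333763, 315049, 333855, 324406)
y <- c(4305226, 4303010, 4316685, 4309006, 4319255, 4311208, 4316837,
       4306055, 4301051, 4300625, 4330342, 4303420, 4308292, 4307181,
       4292904, 4304336, 4313750, 4297998, 4314941, 4315051)
xy <- cbind(x, y)

d <- dist(xy)        # all pairwise Euclidean distances
min_d <- min(d)      # smallest distance between any two points
min_d                # a little under 1200 m for this sample
any(d <= 100)        # FALSE: no pair lies within a 100 m cut-off
```

With a larger dataset the same check gives a quick idea of whether a fixed distance band will leave some points without neighbours.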
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
--
Roger Bivand Department of Economics, NHH Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no