How to find all first order neighbors of a collection of points

Dear Benjamin,

I'm not sure how you define "first order neighbors" for a point. The
first thing that comes to my mind is to use their corresponding Voronoi
polygons and define neighborhood from there. Following your code:

Thanks, the main source of confusion is that "first order neighbors" are
not defined. A k=1 nearest neighbour could qualify (as below), as could
k=6, Voronoi neighbours, sphere of influence neighbours, etc. So reading
vignette("nb") would be a starting point.

Also note that Voronoi and other graph-based neighbours should only use
planar coordinates. This includes dismo::voronoi(), which uses
deldir::deldir(), just like spdep::tri2nb(). Triangulation can lead to
spurious neighbours on the convex hull.

v <- dismo::voronoi(coords)
par(mfrow = c(1, 2), xaxt = "n", yaxt = "n", mgp = c(0, 0, 0))
plot(coords, type = "n", xlab = NA, ylab = NA)
plot(v, add = TRUE)
text(x = coords[, 1], y = coords[, 2], labels = voter.subset$Voter.ID)
plot(coords, type = "n", xlab = NA, ylab = NA)
plot(poly2nb(v), coords, add = TRUE, col = "gray")
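
The remark about planar coordinates can be illustrated with a minimal
sketch. It assumes the sf package is available, uses a small hypothetical
duplicate-free sample of points, and picks UTM zone 18N (EPSG:32618) as a
plausible projection for this area. None of these choices come from the
thread; they are only there to make the example runnable.

```r
library(sf)     # projection of point coordinates
library(spdep)  # tri2nb()

# Hypothetical, duplicate-free sample of longitude/latitude pairs (decimal degrees)
lonlat <- data.frame(
  lon = c(-75.52187, -75.56373, -75.39587, -75.42248, -75.56257, -75.50877),
  lat = c( 40.62320,  40.55216,  40.55416,  40.64326,  40.67375,  40.53129)
)

# Declare the points as WGS84 (EPSG:4326), then project to planar metres.
# Assumption: UTM zone 18N (EPSG:32618) is a reasonable choice for this area.
pts <- st_as_sf(lonlat, coords = c("lon", "lat"), crs = 4326)
xy  <- st_coordinates(st_transform(pts, 32618))

# Delaunay triangulation neighbours computed on the projected coordinates
tri_nb <- tri2nb(xy)
```

Printing tri_nb, or calling plot(tri_nb, xy), shows the resulting links.
Long links along the convex hull may still be spurious and need checking.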

?acu.-

On 07/12/2018 09:00 PM, Benjamin Lieberman wrote:
Hi all,

Currently, I am working with U.S. voter data. Below, I included a brief 
example of the structure of the data with some reproducible code. My 
data set consists of roughly 233,000 (233k) entries, each specifying a 
voter and their particular latitude/longitude pair.

Using individual voter data is highly dangerous, and must in every case be
subject to the strictest privacy rules. Voter data do not in essence
have position: the only valid voting data with position are those of the
voting station/precinct, and those data are aggregated to preserve
anonymity.

Why do voter data not have position? Which location should you use:
residence, workplace, or something else? What are these locations
proxying? Nothing valid can be drawn from "just voter data". You can draw
conclusions from carefully constructed stratified exit polls, but there
the key confounders (gender, age, ethnicity, social class, etc.) are
handled by design. Why should voting decisions be influenced by proximity
(they are not)? The missing element here is looking carefully at relevant
covariates at more aggregated levels (in the US, typically zoning, which
controls positional segregation by social class, etc.).

I have been using the spdep package with the hope of creating a CAR 
model. To begin the analysis, we need to find all first order neighbors 
of every point in the data.

While spdep has fantastic commands for finding k nearest neighbors 
(knearneigh), and a useful command for finding lag of order 3 or more 
(nblag), I have yet to find a method which is suitable for our purposes 
(lag = 1, or lag = 2). Additionally, I looked into altering the nblag 
command to accommodate maxlag = 1 or maxlag = 2, but the command relies 
on an nb format, which is problematic as we are looking for the 
underlying neighborhood structure.

Much work has been done with polygons, or with data which is already
in "nb" format, but after reading the literature, it seems that
polygons are not appropriate, nor are distance based neighbor
techniques, due to density fluctuations over the area of interest.

Below is some reproducible code I wrote. I would like to note that I am 
currently working in R 1.1.453 on a MacBook.

You mean RStudio; there is no such version of R.

# Create a data frame of 10 voters, picked at random
voter.1 = c(1, -75.52187, 40.62320)
voter.2 = c(2,-75.56373, 40.55216)
voter.3 = c(3,-75.39587, 40.55416)
voter.4 = c(4,-75.42248, 40.64326)
voter.5 = c(5,-75.56654, 40.54948)
voter.6 = c(6,-75.56257, 40.67375)
voter.7 = c(7, -75.51888, 40.59715)
voter.8 = c(8, -75.59879, 40.60014)
voter.9 = c(9, -75.59879, 40.60014)
voter.10 = c(10, -75.50877, 40.53129)

These are in geographical coordinates.

# Bind the vectors together
voter.subset = rbind(voter.1, voter.2, voter.3, voter.4, voter.5, voter.6, voter.7, voter.8, voter.9, voter.10)

# Rename the columns
colnames(voter.subset) = c("Voter.ID", "Longitude", "Latitude")

# Change the class from a matrix to a data frame
voter.subset = as.data.frame(voter.subset)

# Load in the required packages
library(spdep)
library(sp)

# Set the coordinates
coordinates(voter.subset) = c("Longitude", "Latitude")
coords = coordinates(voter.subset)

# Jitter to ensure no duplicate points
coords = jitter(coords, factor = 1)

jitter() does not respect geographical coordinates (decimal degree metric).

# Find the first nearest neighbor of each point
one.nn = knearneigh(coords, k=1)

See the help page (hint: longlat=TRUE to use Great Circle distances, much
slower than planar).

# Convert the first nearest neighbor to format "nb"
one.nn_nb = knn2nb(one.nn, sym = FALSE)
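
The longlat hint and the question about lags 1 and 2 can be combined in a
small sketch. It uses a hypothetical duplicate-free sample of points, so no
jitter is needed, and sets sym = TRUE so that the neighbour relation is
symmetric. Both choices are assumptions for illustration, not something the
thread prescribes. As far as I know, nblag() requires maxlag of at least 2
and returns the input neighbours as the first element of its result, so
lag 1 is simply the nb object itself.

```r
library(spdep)

# Hypothetical, duplicate-free longitude/latitude matrix (decimal degrees)
lonlat <- cbind(
  lon = c(-75.52187, -75.56373, -75.39587, -75.42248, -75.56257, -75.50877),
  lat = c( 40.62320,  40.55216,  40.55416,  40.64326,  40.67375,  40.53129)
)

# k = 1 nearest neighbour, measured with Great Circle distances
knn1 <- knearneigh(lonlat, k = 1, longlat = TRUE)

# Convert to class "nb"; sym = TRUE forces a symmetric neighbour relation
nb1 <- knn2nb(knn1, sym = TRUE)

# Higher order lags: element 1 is nb1 itself, element 2 the second order lag
lags <- nblag(nb1, maxlag = 2)
```

Here lags[[1]] holds the first order neighbours and lags[[2]] the second
order ones. Which definition of "first order" is appropriate remains a
modelling decision.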

Thank you in advance for any help you may offer, and for taking the 
time to read this. I have consulted Applied Spatial Data Analysis with 
R (Bivand, Pebesma, Gomez-Rubio), as well as other Sig-Geo threads, the 
spdep documentation, and the nb vignette (Bivand, April 3, 2018) from 
earlier this year.

Warmest,
Ben
--
Benjamin Lieberman
Muhlenberg College 2019
Mobile: 301.299.8928

	[[alternative HTML version deleted]]
Plain text only, please.

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
