Skip to content

Finding the county shapefile polygon closest to a long-lat position

5 messages · BRWIN338 at aol.com, Pilar Tugores Ferra, Paul Hiemstra +2 more

#
Hi,

If you have your data loaded into R as spatial objects, you can find in 
which polygons the points fall using the overlay() function from the 
sp-pacakge. See the documentation there. I'm not sure what to do if the 
point falls outside a polygon and you need the nearest.

cheers,
Paul

BRWIN338 at aol.com schreef:

  
    
#
On Thu, 21 May 2009, Pilar Tugores Ferra wrote:

            
Thanks, Pilar!

The only possible weakness in this might be that the input points and 
polygon boundaries are in geographical, not projected (planar) 
coordinates, so if there are differences in distances on the plane and 
great circle distances, they might lead to the "outside" points being 
assigned to the wrong county.

Depending on how many there are, you might consider two alternatives, one 
visual inspection in Google Earth or similar (export the "outside" points 
with IDs as one KML, and the county boundaries you are using with 
writeOGR() in rgdal (or use those built into GE)); the other would be to 
use the closeness to the label points of the counties of the "outside" 
points - coordinates() of the SpatialPolygons* object retrieves these, so 
looping over the "outside" points in spDistsN1() in sp with longlat=TRUE 
would give the great circle distances.

Of course, the underlying issue is why they are outside - is this a datum 
shift problem (counties in NAD27 and points in NAD83/WGS84)? If you fixed 
that, they would match better.

Hope this helps,

Roger

  
    
#
Hi Joe

Like Roger says you should first look at the possible datum problem. The 
example below shows one method to find the closest polygon, and to add that 
to the overlay. Notice however that if you change the projection and/or 
ellipsoid the closest points (end of lines) are different. Closest line in 
this case is calculated by:
abs((x2-x1)*(y1-y0)-(x1-x0)*(y2-y1))/sqrt(((x2-x1)^2+(y2-y1)^2))^2)

where x0, and y0 is the point, while x1, y1, and x2, y2 are the line ends.

To find the two closest points, spDistsN1 is used. 

Try to play with un-commenting polys <- spTransform(polys, CRS("+proj=moll 
+ellps=WGS84"))
and changing ll.logical=FALSE/TRUE (FALSE = Euclidean, TRUE = Great Circle 
distance)

One of the things you will see is that the closest points are different when 
you calculate the distance in km (ll.logical=TRUE) combined with 
proj4string=CRS("+proj=moll +ellps=WGS84")

I hope this gives some insight in the problem. 

The code can also be downloaded from http://open.uib.no/R/mindist.R

I do not guarantee the code is bug free.

You could try to replace polys with your polygon, and ssample with your 
points, and see if it makes sense.

Best wishes
Torleif



#################

library(rgdal)

#Make data
set.seed(7)
grd <- GridTopology(c(1,1), c(1,1), c(2,2))
polys <- as.SpatialPolygons.GridTopology(grd)
proj4string(polys) <- CRS(" +proj=longlat +ellps=WGS84" )

## Try uncommenting this line, and see what are the closest points in the last 
figure.
## In this case the same closest polygons are found, but not the same two 
closest points
#polys <- spTransform(polys, CRS("+proj=moll +ellps=sphere"))

ssample <- spsample(polys,n=10,type="random")

#displace points
slot(ssample, 'coords') <- slot(ssample, 'coords')-mean( 
slot(ssample, 'coords'))*.25


#overlay
overl <- overlay(polys, ssample)

#Show overlay
overl

#Which points are outside?
wh <- which(is.na(overl))

#Make plots
plot(polys, xlim=(c(bbox(polys)[1,])-c(1, -1)), 
      ylim=(c(bbox(polys)[2,])-c(1, -1)))
for (i in 1:4) points(slot(slot((slot(polys, 'polygons'))[[i]], 'Polygons')
[[1]], 'coords')[,1], slot(slot((slot(polys, 'polygons'))[[i]], 'Polygons')
[[1]], 'coords')[,2], col = i)
points(ssample)
for (i in 1:length(wh)) {
points(slot(ssample, 'coords')[wh[i],1], slot(ssample, 'coords')[wh[i],2], 
col=(i+1))
}


# Number of polygons to sapply over (or number of polygons). 
# With more polygons you could also try to only scan the lines around 
# the "4" polygons which center is closest to the outlying point(?)
nit <- dim(coordinates(polys))[1]

#Find the closest polygon, and paste the data in overl
overl.na <- sapply(1:length(wh), function (x) {
  x0 <- slot(ssample, 'coords')[wh[x],1]
  y0 <- slot(ssample, 'coords')[wh[x],2]

mn <- sapply(1:nit, function(y) {
    cc <- slot(slot((slot(polys, 'polygons'))[[y]], 'Polygons')
[[1]], 'coords')
    x1 <- cc[1:(dim(cc)[1]-1),1]
    x2 <- cc[2:dim(cc)[1],1]
    y1 <- cc[1:(dim(cc)[1]-1),2]
    y2 <- cc[2:dim(cc)[1],2]
    #calculate distance
    mn <- 
mean(abs((x2-x1)*(y1-y0)-(x1-x0)*(y2-y1))/sqrt(((x2-x1)^2+(y2-y1)^2)))
    return(mn)
  })

  wm <- which.min(mn)
  return(list(wm, x0, y0))
})

##Can leave out. This section is just for plotting lines
#lines. Shortest distance
lin  <- sapply(2:(length(wh)+1), function (x) {
    p1 <- matrix(unlist(overl.na[-1,(x-1)]),,2)
    pol.pos <- unlist(overl.na[1,])
    pol1 <- slot(slot((slot(polys, 'polygons'))[[pol.pos[x-1]]], 'Polygons')
[[1]], 'coords')[-1,]
    dst <- spDistsN1(pol1, p1, longlat=FALSE)
    wh.dst <- which.min(dst)
    ln <- Line(matrix( c(p1[,1], pol1[wh.dst, 1], p1[,2], pol1[wh.dst, 2]), 
2 ))
    return(ln)
  })

#lines, second shortest distance
lin2  <- sapply(2:(length(wh)+1), function (x) {
    p1 <- matrix(unlist(overl.na[-1,(x-1)]),,2)
    pol.pos <- unlist(overl.na[1,])
    pol1 <- slot(slot((slot(polys, 'polygons'))[[pol.pos[x-1]]], 'Polygons')
[[1]], 'coords')[-1,]
    dst <- spDistsN1(pol1, p1, longlat=FALSE)
    wh.dst <- which.min(dst)
    dst[wh.dst] <- NA
    wh.dst2 <- which.min(dst)
    ln2 <- Line(matrix( c(p1[,1], pol1[wh.dst2, 1], p1[,2], pol1[wh.dst2, 2]), 
2 ))
    return(ln2)
  })

#Convert to spatial lines
sl1 <- Lines(lin)
spl1 <- SpatialLines(list(sl1), proj4string = CRS(proj4string(polys)))
sl2 <- Lines(lin2)
spl2 <- SpatialLines(list(sl2), proj4string = CRS(proj4string(polys)))


##
#Add the new values to the overlay
overl[wh] <- unlist(overl.na[1,])

#Show overlay 
overl

#Add polygon
new.points <- SpatialPointsDataFrame(ssample, data.frame(polyid=overl))

polys <- SpatialPolygonsDataFrame(polys, 
data.frame(ID=c("p1", "p2", "p3", "p4")))

mypalette <- "red"
require(lattice)
trellis.par.set(sp.theme(regions = list(col = mypalette)))
spplot(new.points, 
      sp.layout = list(list("sp.polygons", polys, fill = 
c("grey10", "grey40", "grey60", "grey90")),
			list("sp.lines", spl1),
			list("sp.lines", spl2)), 
      xlim=(c(bbox(polys)[1,])-c(mean( slot(ssample, 'coords'))*.5, -mean( 
slot(ssample, 'coords'))*.5)), 
      ylim=(c(bbox(polys)[2,])-c(mean( slot(ssample, 'coords'))*.5, -mean( 
slot(ssample, 'coords'))*.5)))




##########
On Friday 22 May 2009 10:44:38 am Roger Bivand wrote: