gUnaryUnion Not Dissolving Correctly
On Thu, 5 Nov 2015, Roger Bivand wrote:
On Wed, 4 Nov 2015, Matt Strimas-Mackey wrote:
The original study area shapefile is a boundary of the Indonesia half of New Guinea. The file as well as the code to construct the hexagonal grids are here: https://www.dropbox.com/sh/ff8v08p3ambqcbs/AAAPBlGP4fthdmZhrto7oIuCa?dl=0 Since it's a large area, generating the grid takes a long time, so I've also included code for a small subset of the original shapefile--one small offshore island.
An equivalent scaling fix is to change the coordinate units from metres to
kilometres:
library(rgdal)
study_area2 <- spTransform(study_area, CRS("+proj=aea +lat_1=-7.42
+lat_2=-0.62 +lat_0=-9.119 +lon_0=128.91 +x_0=0 y_0=0 +datum=WGS84
+units=km +no_defs +ellps=WGS84 +towgs84=0,0,0"))
size <- sqrt(2 * 1e2 / sqrt(3)) # reduce to suit
set.seed(1)
hex_points <- spsample(study_area2[2,], type="hexagonal", cellsize=size)
hex_grid <- HexPoints2SpatialPolygons(hex_points)
hex_union <- gUnaryUnion(hex_grid)
plot(hex_grid, col='lightgrey')
plot(hex_union, border='orange', lwd=3, add=T)
An advantage of these scaling issues being made visible is that we see that
computers really do not operate in continuous space, and that computational
geometry actually matters, I suppose.
I'll check the underlying code generating the hexagons to see why the nodes
(where hexagon boundaries meet) appear to slide apart at the edge of machine
precision.
A potentially robust approach at default scale uses the dx= argument to
HexPoints2SpatialPolygons():
hex_grid <- HexPoints2SpatialPolygons(hex_points, dx=size)
Setting:
set_RGEOS_polyThreshold(1e-2) # for example
set_RGEOS_warnSlivers(TRUE)
shows the remaining slivers, and:
set_RGEOS_dropSlivers(TRUE)
drops them. It is still possible that an inward dangle will get through
(zero area line in from boundary point, but part of the single boundary).
Setting dx= to the same value as cellsize= in the spsample call seems to
be crucial, avoiding an approximation marked in the code as:
# EJP; changed:
# how to figure out dx from a grid? THK suggested:
#dx <- hexGrid$x[2] - hexGrid$x[1]
# and the following will also not always work:
in sp:::genPolyList(), so there was a warning there against taking the
default.
Roger
Roger
Finally, some more odd behaviour. I noticed each time I run this code,
the dissolve mistakes change, i.e. different boundaries are
erroneously kept. However, using set.seed() makes the errors the same
each time for a given seed, and changing the seed yields a different
set of errors. Example in the code in the dropbox link and copied
here:
library(sp)
library(raster)
library(rgeos)
# just a subset of full shapefile
set.seed(1)
size <- sqrt(2 * 1e8 / sqrt(3))
study_area <- shapefile('papua.shp')
hex_points <- spsample(study_area[2,], type="hexagonal", cellsize=size)
hex_grid <- HexPoints2SpatialPolygons(hex_points)
hex_union <- gUnaryUnion(hex_grid)
plot(hex_grid, col='lightgrey')
plot(hex_union, border='orange', lwd=3, add=T)
# results chage according to seed
set.seed(100)
size <- sqrt(2 * 1e8 / sqrt(3))
study_area <- shapefile('papua.shp')
hex_points <- spsample(study_area[2,], type="hexagonal", cellsize=size)
hex_grid <- HexPoints2SpatialPolygons(hex_points)
hex_union <- gUnaryUnion(hex_grid)
plot(hex_grid, col='lightgrey')
plot(hex_union, border='orange', lwd=3, add=T)
On Wed, Nov 4, 2015 at 8:57 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Wed, 4 Nov 2015, Matt Strimas-Mackey wrote:
Thanks, lots of useful info here. I've never seen the setScale() function; I don't think it's mentioned in the gUnaryUnion help. This saves me a lot of headache! For what it's worth, the invalid geometry is an artifact of the reproducible example I created. The original hexagonal grid is produced with g <- spsample(study_area, type="hexagonal", cellsize=size) hex_grid <- HexPoints2SpatialPolygons(g)
OK, thanks - this is useful. Could you please make available the study area object in some way - so that we can re-create g and see how HexPoints2SpatialPolygons() creates the artefact Mike spotted (although this looks like numeric fuzz - 'changing "1287248.96712942" to "1287248.96712943"' is a change in the 15th digit, which is on the precision edge of the "double" storage mode. If we can revisit functions creating SpatialPolygons objects to ensure that they are GEOS-compatible for the default scale of 1e+8, we'll be more secure. Roger
And this object passes gIsValid() and clgeo_GeometryReport() without any problems, yet still has the dissolving issue. Regardless, all is solved with setScale(). Thanks! M On Wed, Nov 4, 2015 at 6:30 AM, Michael Sumner <mdsumner at gmail.com> wrote:
On Thu, 5 Nov 2015 at 01:10 Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Wed, 4 Nov 2015, Michael Sumner wrote:
Thanks for all this detail Roger, is there a way to "re-build" a spatial object so that the given scale setting is applied? Are there any general rounding or "orthogonalize" functions in the Spatial suite?
No, not really. In this case, the very detailed coordinate measurements may have made things worse, or possibly using Polygon rather than Polygons objects, or not building the Polygons object with a proper comment attribute, I don't know. rgeos::gIsValid(spoly) is FALSE, and cleangeo::clgeo_GeometryReport(clgeo_CollectionReport(clgeo_Clean(spoly))) says the same (based on rgeos). I suggest working with Emmanuel Blondel (cleangeo maintainer) to extend cleangeo. GEOS is looking at allowing users to manipulate precision models, not just scale, but I'm uncertain about that. Running spoly into GRASS and back out (GRASS builds topology on import) shows a different error, the object seems to be problematic.
It can be fixed by changing "1287248.96712942" to "1287248.96712943", so really the creator should not have had those values on input. There's no easy way out without topology. This brought out some interesting issues for work I've been doing using the "near-Delaunay triangulation" in RTriangle, and that requires a normalized set of vertices (no duplicate vertices) which on its own presents interesting problems. I have a related issue where a parallel (latitude) line needs many vertices to look "smooth" on a polar projection, but when building a mesh with triangles it's really best to allow relatively coarse segmented boundaries rather than have many elements at parallels. The Triangle library does not consider these hexagon coordinates to be duplicates, so there are two vertical segments between the two bottom polys at points(coordinates(as(as(spoly, "SpatialLines"), "SpatialPoints"))[c(3, 4), ] ) Thanks for the reminder about cleangeo, I'll have closer look. cheers, Mike.
Roger
Cheers, Mike. On Wed, 4 Nov 2015 at 18:16 Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Tue, 3 Nov 2015, Matt Strimas-Mackey wrote:
I'm working with a regular hexagonal grid stored as SPDF. At
some
point I subset this SPDF, then want to combine all adjacent
hexagons
together so that each contiguous set of hexagons is a single
polygon.
I'm doing this last step using gUnaryUnion (or
gUnionCascaded, not
clear what the different is actually). The problem is that
in some
cases boundaries between clearly adjacent polygons are not
dissolved.
Example:
## Create three adjacent hexagons
library(sp)
library(rgeos)
p1 <- Polygon(cbind(
c(1276503.26781119, 1281876.11747031, 1287248.96712942,
1287248.96712942, 1281876.11747031, 1276503.26781119,
1276503.26781119),
c(204391.40834643, 207493.42454344, 204391.40834643,
198187.37595242,
195085.35975541, 198187.37595242, 204391.40834643)))
p2 <- Polygon(cbind(
c(1287248.96712943, 1292621.81678854, 1297994.66644766,
1297994.66644766, 1292621.81678854, 1287248.96712943,
1287248.96712943),
c(204391.40834643, 207493.42454344, 204391.40834643,
198187.37595242,
195085.35975541, 198187.37595242, 204391.40834643)))
p3 <- Polygon(cbind(
c(1281876.11747031, 1287248.96712943, 1292621.81678854,
1292621.81678854, 1287248.96712943, 1281876.11747031,
1281876.11747031),
c(213697.45693745, 216799.47313446, 213697.45693745,
207493.42454344,
204391.40834643, 207493.42454344, 213697.45693745)))
spoly <- SpatialPolygons(list(Polygons(list(p1, p2, p3),
's1')))
plot(gUnaryUnion(spoly))
No, this is just numerical fuzz: plot(spoly) plot(gUnaryUnion(spoly), border="green", lty=3, lwd=2, add=TRUE) oS <- getScale() # default 1e+8 setScale(1e+4) plot(gUnaryUnion(spoly), border="orange", lwd=2, add=TRUE) setScale(oS) JTS, GEOS, and consequently rgeos by default shift all coordinates to an integer grid after multiplying by a scale factor (finding integer matches is much easier than real matches). If the scaling is too detailed (in some cases), the operations do not give the expected outcomes. There is work in progress in GEOS and JTS to provide other scaling options and models, and to permit iteration over scaling values until a "clean" result is obtained (for some meanings of clean). gUnionCascaded() was the only possible function for GEOS < 3.3.0, from GEOS 3.3.0 gUnaryUnion() is available and the prefered and more efficient route. This is explained on the help page. Hope this clarifies, Roger
Note that p2 and p3 are dissolved together, but p1 is separate. The shared edge of p1 and p2 is: p1: [2,] 1281876 207493.4 [3,] 1287249 204391.4 p2: [5,] 1287249 204391.4 [6,] 1281876 207493.4 So, exactly the same apart from the order. I originally thought this difference in order might be the problem, but this doesn't seem to be an issue with in this example, where order is also flipped: sss <- rasterToPolygons(raster(nrow=2, ncol=2, xmn=0, xmx=2, ymn=0, ymx=2, vals=1:4)) lapply(sss at polygons, function(x) slot(x, 'Polygons')[[1]]@coords) plot(sss) plot(gUnaryUnion(sss)) Session Info: R version 3.2.2 (2015-08-14) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.10.5 (Yosemite) Message when rgeos is loaded: rgeos version: 0.3-14, (SVN revision 511) GEOS runtime version: 3.5.0-CAPI-1.9.0 r0 Linking to sp version: 1.2-0 Polygon checking: TRUE Any help on how to get these polygons to dissolve is appreciated. M
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
-- Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 91 00 e-mail: Roger.Bivand at nhh.no
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
-- Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 91 00 e-mail: Roger.Bivand at nhh.no
-- Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 91 00 e-mail: Roger.Bivand at nhh.no
Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 91 00 e-mail: Roger.Bivand at nhh.no