Exporting Multipolygon geojson with rgdal::writeOGR
Hi Andy,
On Mon, 26 Oct 2015, Andy Teucher wrote:
Hi Roger, Sorry I haven't been much help here - it's not for lack of trying, just lack of skills. I've been poring through OGR_write.cpp and trying to figure out language and the flow of things... a few thoughts occurred to me (which are probably obvious to you):
Your example was very helpful, the problem was that when writeOGR() and the underlying compiled code was written, the model used was that of the ESRI Shapefile driver, rather than OGC SFS, which was first published in 2004. Most older formats (and their drivers in OGR) use a OGR geometry factory function to organize polygons on reading - but this is not present in shapelib (bundled with the R maptools package), so when designing polygon classes for sp, and writing rgdal::writeOGR() about 10 years ago, this was not known, and was not revisited subsequently, until you happily spotted the bug. GEOS in rgeos uses OGC SFS definitions (the ones you are using below), but the SpatialPolygons/Polygons/Polygon structure was already established by then, and a fix (using comment attributes in Polygons objects) was used, rather than a major change to Polygons objects (in 2010). The revision to rgdal::writeOGR() now branches on the input objects - if they have comments (applied in code, not user-facing), these are trusted, and the MultiPolygon objects constructed containing one or more Polygon objects with one exterior ring and zero or more interior rings on that basis. However, if the input objects do not have comment attributes, OGRGeometryFactory::organizePolygons() (C++) is used to create properly organized MultiPolygon objects for export. This means that the pre-OGC SFS drivers continue to work, but that the OGC SFS drivers, like GeoJSON, now also work correctly when the second ring in an object is not assumed to be an interior ring (rgdal < 1.1-1). A test set for all the possible cases has been introduced. The revision is now in source package form on CRAN: https://cran.r-project.org/web/packages/rgdal/index.html and a Windows binary package will be available later on Tuesday I hope. The OSX revision may take a little longer, but we are grateful to have what we have anyway. Thanks for a clear and very helpful bug report. Best wishes, Roger
1) Is this bug because holes are standalone polygons in sp objects,
while they are members of polygon objects in geojson, wkt, etc?
2) On a related note, I notice that a single polygon with a hole is
identified as a multipolygon containing a single polygon with a hole,
when using writeOGR and one of the affected drivers. Practically I
don't think this matters too much, but it's not quite right.
3) In lines 116-160 the polygon/multipolygon and line/multiline
determination is made for the entire layer. Can/should this be made
for each feature in the layer, as you could have a layer with a
mixture of single polygons and multipolygons?
4) I feel like there ought to be one more layer of looping through the
polyons. e.g.:
loop over each feature;
determine whether each feature is a single polygon (length 1 or
all rings but one are holes), or a multipolygon:
if a polygon loop over and add ring(s);
if a multipolygon loop over polygons:
for each polygon add ring(s) and somehow assign holes to
parent polygons
Again, apologies if I'm being totally obvious (or way off base).
Thanks for all your work on these packages, and on this issue.
Andy
On Thu, Oct 22, 2015 at 11:07 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Fri, 23 Oct 2015, Andy Teucher wrote:
On my Mac, I am running rgdal 1.0-7 and GDAL 1.11.3 sessionInfo()
OK, thanks. The problem also affects the SQLite driver (the CRAN binaries do not have this driver). Could someone please check the PostGIS driver - my guess is that all OGC SFS compliant drivers may be affected. Could somebody also please check whether this is a recently introduced issue or not? For example somebody still on rgdal < 1.0? src/OGR_write.cpp was changed 2015-08-21, but the last previous change was 2015-06-11, then 2015-05-31, 2014-08-17 ... from the ChangeLog visible on the package CRAN page. Roger
R version 3.2.2 (2015-08-14) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.11 (El Capitan) locale: [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rgdal_1.0-7 sp_1.2-1 loaded via a namespace (and not attached): [1] tools_3.2.2 grid_3.2.2 lattice_0.20-33 Andy On Thu, Oct 22, 2015 at 11:28 AM, Andy Teucher <andy.teucher at gmail.com> wrote:
One more test: WKT has the same issue:
library(rgdal)
js <- '{
"type": "MultiPolygon",
"coordinates": [[[[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0],
[102.0, 2.0]]], [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0],
[100.0, 0.0]]]]
} '
spdf <- readOGR(js, layer='OGRGeoJSON', verbose=FALSE)
writeOGR(spdf, dsn = "test.csv", layer = "test", driver="CSV",
layer_options = "GEOMETRY=AS_WKT")
cat(readLines("test.csv"), sep = "\n")
WKT,FID,
"MULTIPOLYGON (((102 2,102 3,103 3,103 2,102 2),(100 0,100 1,101 1,101
0,100 0)))",0
It should be:
WKT,FID,
"MULTIPOLYGON (((102 2,102 3,103 3,103 2,102 2)),((100 0,100 1,101
1,101 0,100 0)))",0
Andy
On Thu, Oct 22, 2015 at 10:36 AM, Andy Teucher <andy.teucher at gmail.com>
wrote:
Thanks very much Roger. I've replicated this on my work machine (Windows) and on my Mac at home. KML and GML are both affected. ESRI shapefile works fine and, though I'm not very familiar with the format, 'MapInfo File' works too (loading the file in qgis shows a multipart polygon with no geometry errors). sessionInfo() for my Windows machine (using rgdal 1.0-7 with included gdal drivers): R version 3.2.2 (2015-08-14) Platform: i386-w64-mingw32/i386 (32-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 locale: [1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252 [4] LC_NUMERIC=C LC_TIME=English_Canada.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rgdal_1.0-7 sp_1.2-0 loaded via a namespace (and not attached): [1] tools_3.2.2 grid_3.2.2 lattice_0.20-33 I'll send the details for my Mac environment this evening. I'll dig into the source code as soon as I can - C is a completely foreign language to me, so I can't promise anything fast! Thanks again, Andy On Thu, Oct 22, 2015 at 12:27 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Thu, 22 Oct 2015, Andy Teucher wrote:
I?m finding that writeOGR isn?t exporting multipolygons properly using the GeoJSON driver. I have a simple test case (borrowed from here: http://gis.stackexchange.com/questions/137977/writeogr-alters-multipolygon-holes) with a geojson string with one multipolygon containing two polygons. I use readOGR to create a SpatialPolygonsDataFrame out of it, then write it with writeOGR: library(rgdal) js <- '{ "type": "MultiPolygon", "coordinates": [[[[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0], [102.0, 2.0]]], [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]]] } ' spdf <- readOGR(js, layer='OGRGeoJSON', verbose=FALSE) # Create a SpatialPoygonsDataFrame temp <- tempfile() writeOGR(spdf, dsn = temp, layer = "", driver="GeoJSON") cat(readLines(temp)) # Output: { "type": "FeatureCollection", "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } }, "features": [ { "type": "Feature", "id": 0, "properties": { "FID": 0 }, "geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ 102.0, 2.0 ], [ 102.0, 3.0 ], [ 103.0, 3.0 ], [ 103.0, 2.0 ], [ 102.0, 2.0 ] ], [ [ 100.0, 0.0 ], [ 100.0, 1.0 ], [ 101.0, 1.0 ], [ 101.0, 0.0 ], [ 100.0, 0.0 ] ] ] ] } } ] }
I can replicate this with GDAL 2.0.1 and rgdal 1.0-7 - reading temp in gives the orphaned hole. I'm surprised that passing the js object to readOGR() doesn't fail, but that isn't the source of the problem. spdf appears to be structured correctly. Please provide your GDAL and rgdal versions, and OS details from sessionInfo(). We are completely relying on the GDAL drivers here - we can't cherry pick for particular drivers (historically excepting ESRI Shapefile), so your debugging will need to examine writeOGR(), the C code it calls, and the interactions between the rgdal C code and called OGR functions. Please check which other drivers are affected (I think KML is, is GML?). Can you place debugging Rprintf(...); in the rgdal C code to see where this is coming from? I can start looking at this sometime next week, so please make a start yourself; contributions from others are very welcome. Roger
If you look closely at the output, you can see that the 'coordinates' array now contains a single polygon array with two coordinate arrays: the boundary, and a second one which is now treated as a hole of the first (orphaned as it is outside the bounds of the polygon). The original 'coordinates' array consists of two polygon arrays, each consisting of a single coordinate array which defining a polygon (with no holes), which is correct according the GeoJSON spec: http://geojson.org/geojson-spec.html#polygon. I'm always hesitant to call things a bug, but this doesn't appear to happen using ogr2ogr on the command line: writeOGR(spdf, ".", "test", driver = "ESRI Shapefile") # Write a shapefile to convert using ogr2ogr system("ogr2ogr -f GeoJSON test_from_shp.geojson test.shp") cat(readLines("test_from_shp.geojson")) # Output: { "type": "FeatureCollection", "features": [ { "type": "Feature", "properties": { "FID": 0 }, "geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ 102.0, 2.0 ], [ 102.0, 3.0 ], [ 103.0, 3.0 ], [ 103.0, 2.0 ], [ 102.0, 2.0 ] ] ], [ [ [ 100.0, 0.0 ], [ 100.0, 1.0 ], [ 101.0, 1.0 ], [ 101.0, 0.0 ], [ 100.0, 0.0 ] ] ] ] } } ] } The resulting output is correct. Thanks in advance for any help on this. Andy
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
-- Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 91 00 e-mail: Roger.Bivand at nhh.no
-- Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 91 00 e-mail: Roger.Bivand at nhh.no
Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 91 00 e-mail: Roger.Bivand at nhh.no