Hi Barry,

Thanks very much for the tip. I will try and see if I can implement that using system calls from my R script, and will report back.

Thanks,
Lyndon

On Fri, Dec 14, 2012 at 7:04 PM, Barry Rowlingson <b.rowlingson at lancaster.ac.uk> wrote:
On Fri, Dec 14, 2012 at 8:16 PM, Lyndon Estes <lestes at princeton.edu> wrote:
Given the type of work we are asking of people, we are getting many "unclean" polygons (self-intersects, overlaps, non-noding intersections, etc). I have been writing in various patches to deal with these as I have been going, but I turn now for some advice on cleaning up operations. An example taken from the workflow illustrates a typical problem I encounter that I am looking for an efficient way to handle. This example is followed by my two questions.
So, here's my first question. Is there some other way of cleaning up self-intersections that preserves the area better than the examples using gBuffer and gSimplify, and which might be more efficient and provide a result that more closely approximates the original map than rasterizing and polygonizing? E.g., is it possible to explode polygons at the point(s) of self-intersection, in order to create multiple valid polygons?

My second question: am I barking up the wrong tree here? Should I preferentially use something like GRASS's v.clean before I even read the data into R? I haven't until now because 1) I don't know GRASS very well, and 2) I want to minimize the number of different routines/software in the workflow (which currently involves python, postgis/postgres, R, openlayers).
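To make the "explode at the point of self-intersection" idea concrete, here is a minimal sketch in plain Python (standard library only). It is an illustration under stated assumptions, not what rgeos, GRASS or pprepair actually do: the function names `signed_area`, `segment_intersection` and `split_at_first_crossing` and the bow-tie coordinates are made up for this example, and the code only handles a ring with a single proper crossing between two non-adjacent edges.

```python
def signed_area(ring):
    """Shoelace formula for an open ring (first vertex not repeated at the end)."""
    total = 0.0
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return 0.5 * total


def segment_intersection(p1, p2, p3, p4):
    """Return the point where segments p1-p2 and p3-p4 properly cross, else None."""
    d1 = (p2[0] - p1[0], p2[1] - p1[1])
    d2 = (p4[0] - p3[0], p4[1] - p3[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if denom == 0:  # parallel or collinear: not treated as a crossing here
        return None
    t = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / denom
    u = ((p3[0] - p1[0]) * d1[1] - (p3[1] - p1[1]) * d1[0]) / denom
    if 0 < t < 1 and 0 < u < 1:  # strictly interior to both segments
        return (p1[0] + t * d1[0], p1[1] + t * d1[1])
    return None


def split_at_first_crossing(ring):
    """Split an open ring into two rings at its first self-intersection.

    Returns [ring] unchanged if no pair of non-adjacent edges crosses.
    """
    n = len(ring)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two edges share vertex 0, so they are adjacent
            p = segment_intersection(ring[i], ring[(i + 1) % n],
                                     ring[j], ring[(j + 1) % n])
            if p is not None:
                loop_a = [p] + ring[i + 1:j + 1]          # crossing -> v[i+1..j]
                loop_b = ring[:i + 1] + [p] + ring[j + 1:]  # v[0..i] -> crossing -> rest
                return [loop_a, loop_b]
    return [ring]


# Hypothetical bow-tie polygon: edges (0,0)-(2,2) and (2,0)-(0,2) cross at (1,1).
bowtie = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (0.0, 2.0)]
parts = split_at_first_crossing(bowtie)

print(signed_area(bowtie))                    # 0.0: the two lobes cancel
print([abs(signed_area(p)) for p in parts])   # [1.0, 1.0]
```

The two lobes of the bow tie have opposite winding, so the shoelace area of the invalid ring is zero, while the two rings produced by the split each keep their own area. Real digitized polygons can cross themselves many times, touch at vertices, or contain collinear overlaps, none of which this sketch attempts to handle.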
The most promising thing I've seen for cleaning up polygon topology is pprepair: https://github.com/tudelft-gist/pprepair which won a best paper prize at OSGIS-UK this year. It's not simple to run properly, which is why I say most promising, and not best.

If run on your extreme self-intersection 'bow tie' polygon, it produces a shapefile with a feature for each part of the bow tie. I can't currently find a way of tying those two features back to the source bow-tie feature, since the attributes aren't propagating properly. Also, it seems to merge two of the large features in your examples.

It's a standalone C++ program that needs the CGAL library to work, and there's no R interface. It operates on Shapefiles (or possibly any OGR source?).

Overall, cleaning up messy digitizing is a hard problem. Obviously overlaps between polygons are wrong, and pprepair does a great job of assigning them to one or other of the polygons, but gaps are trickier to assign: they might be real gaps (like a river between regions) or they could be digitizing errors. It's possible that some buffering could help here, before doing pprepair.

Barry
Lyndon Estes
Associate Research Scholar
Princeton University
+1-609-258-8308 (o)
+1-609-258-2799 (f)
+1-202-431-0496 (m)
lestes at princeton.edu
www.princeton.edu/~lestes