Skip to content

Cleaning small spatial polygons

9 messages · Fred, Eduardo Diez, Roger Bivand

#
Is there any way of doing this or should i forget it and go on using GRASS
through rgrass7?

Thanks

2015-10-19 15:03 GMT-03:00 Eduardo Diez <eduardodiez at gmx.com>:

  
  
#
On Wed, 4 Nov 2015, Eduardo Diez wrote:

            
I suggest working with Emmanuel Blondel (cleangeo maintainer) to extend 
cleangeo (also suggested in an earlier thread today, with apologies to 
Emmanuel for picking on him!). That package already uses rgeos, and is a 
logical place to put GRASS v.clean-like functionality (maybe even 
encapsulating using GRASs via rgrass7 and a throwaway location).

At the moment it is messy, though some rgeos internals do do something 
like this, but it isn't exposed to users.

Roger

  
    
#
I tried to do that with a lattice map, and figured out a way to remove
those small polygons within a large area. But don't know whether it suits
your needs for raster.

I read the digital map with rgdal package.

Using Australia local government area as an example.

#remove the small polygons within the large LGA area

aus_lga = readOGR(dsn = ".", "LGA11aAust") #LGA
nsw_lga = aus_lga[aus_lga$STATE_CODE == 1, ]
N = length(nsw_lga)
for (i in 1:N) {
  pol = nsw_lga at polygons[[i]]@Polygons
  index = 1:length(pol)
  area = sapply(pol, function(x) x at area)
  n_remove = index[area!=max(area)]
  nsw_lga at polygons[[i]]@Polygons[n_remove] <- NULL
}

It basically extracts the area of the polygons within the S4 objects, and
set the spatial object of small polygons within a large area to NULL.

Regards

Fred
On Thu, Nov 5, 2015 at 5:34 AM, Eduardo Diez <eduardodiez at gmx.com> wrote:

            

  
  
#
On Wed, 4 Nov 2015, Roger Bivand wrote:

            
A possibility is to use:

set_RGEOS_polyThreshold(1e-2) # for example
set_RGEOS_warnSlivers(TRUE)

shows the remaining slivers, and:

set_RGEOS_dropSlivers(TRUE)

drops them when using for example:
A buffer of zero width should not have side-effects, but your mileage may 
vary. It will only remove slivers, not dangles. I haven't tried it when it 
also finds a Polygons object under the threshold, which was your initial 
problem.

Roger

  
    
1 day later
#
Roger,
Following this last approach strange things happen: the sliver polygons
that are fully inside another one get "merged" with the larger; the ones
that have no contact with the exterior but share boundaries with two or
more polygons get deleted and a hole left behind; the ones in contact with
the exterior get deleted.
This is what i tried. As the polygons were projected i used values
 in squared meters for the set_RGEOS_polyThreshold function. I think values
 of around 1e-2 are more appropriate for Spatial* objects in GCS.
There were 50 or more warnings (use warnings() to see the first 50)
Warning messages:
1: In gBuffer(pol2, byid = T, width = 0) : 1: Polygon object 0 area 726.084
2: In gBuffer(pol2, byid = T, width = 0) : 2: Polygon object 1 area 20.169
3: In gBuffer(pol2, byid = T, width = 0) : 3: Polygon object 2 area 20.169
4: In gBuffer(pol2, byid = T, width = 0) : 4: Polygon object 3 area 20.169
5: In gBuffer(pol2, byid = T, width = 0) : 5: Polygon object 4 area 20.169
6: In gBuffer(pol2, byid = T, width = 0) : 6: Polygon object 5 area 20.169
7: In gBuffer(pol2, byid = T, width = 0) : 7: Polygon object 6 area 80.676
8: In gBuffer(pol2, byid = T, width = 0) : 8: Polygon object 7 area 20.169
9: In gBuffer(pol2, byid = T, width = 0) : 9: Polygon object 8 area 20.169
10: In gBuffer(pol2, byid = T, width = 0) : 10: Polygon object 9 area 20.169
11: In gBuffer(pol2, byid = T, width = 0) : 11: Polygon object 10 area
2581.63
12: In gBuffer(pol2, byid = T, width = 0) : 12: Polygon object 11 area
40.338
13: In gBuffer(pol2, byid = T, width = 0) : 13: Polygon object 12 area
20.169
14: In gBuffer(pol2, byid = T, width = 0) : 14: Polygon object 14 area 1432
15: In gBuffer(pol2, byid = T, width = 0) : 15: Polygon object 15 area
40.338
16: In gBuffer(pol2, byid = T, width = 0) : 16: Polygon object 16 area
20.169
17: In gBuffer(pol2, byid = T, width = 0) : 17: Polygon object 17 area
20.169
18: In gBuffer(pol2, byid = T, width = 0) : 18: Polygon object 18 area
20.169
19: In gBuffer(pol2, byid = T, width = 0) : 19: Polygon object 19 area
80.676
20: In gBuffer(pol2, byid = T, width = 0) : 20: Polygon object 21 area
20.169
21: In gBuffer(pol2, byid = T, width = 0) : 21: Polygon object 22 area
1230.31
22: In gBuffer(pol2, byid = T, width = 0) : 22: Polygon object 23 area
20.169
23: In gBuffer(pol2, byid = T, width = 0) : 23: Polygon object 25 area
20.169
24: In gBuffer(pol2, byid = T, width = 0) : 24: Polygon object 26 area
20.169
25: In gBuffer(pol2, byid = T, width = 0) : 25: Polygon object 27 area
564.732
26: In gBuffer(pol2, byid = T, width = 0) : 26: Polygon object 28 area
40.338
27: In gBuffer(pol2, byid = T, width = 0) : 27: Polygon object 30 area
504.225
28: In gBuffer(pol2, byid = T, width = 0) : 28: Polygon object 31 area
605.07
29: In gBuffer(pol2, byid = T, width = 0) : 29: Polygon object 33 area
20.169
30: In gBuffer(pol2, byid = T, width = 0) : 30: Polygon object 34 area
20.169
31: In gBuffer(pol2, byid = T, width = 0) : 31: Polygon object 35 area
1210.14
32: In gBuffer(pol2, byid = T, width = 0) : 32: Polygon object 36 area
20.169
33: In gBuffer(pol2, byid = T, width = 0) : 33: Polygon object 37 area
685.746
34: In gBuffer(pol2, byid = T, width = 0) : 34: Polygon object 38 area
40.338
35: In gBuffer(pol2, byid = T, width = 0) : 35: Polygon object 39 area
20.169
36: In gBuffer(pol2, byid = T, width = 0) : 36: Polygon object 40 area
20.169
37: In gBuffer(pol2, byid = T, width = 0) : 37: Polygon object 41 area
80.676
38: In gBuffer(pol2, byid = T, width = 0) : 38: Polygon object 42 area
20.169
39: In gBuffer(pol2, byid = T, width = 0) : 39: Polygon object 43 area
625.239
40: In gBuffer(pol2, byid = T, width = 0) : 40: Polygon object 44 area
20.169
41: In gBuffer(pol2, byid = T, width = 0) : 41: Polygon object 46 area
262.197
42: In gBuffer(pol2, byid = T, width = 0) : 42: Polygon object 47 area
2500.96
43: In gBuffer(pol2, byid = T, width = 0) : 43: Polygon object 48 area
20.169
44: In gBuffer(pol2, byid = T, width = 0) : 44: Polygon object 49 area
40.338
45: In gBuffer(pol2, byid = T, width = 0) : 45: Polygon object 53 area
80.676
46: In gBuffer(pol2, byid = T, width = 0) : 46: Polygon object 56 area
1290.82
47: In gBuffer(pol2, byid = T, width = 0) : 47: Polygon object 57 area
2722.82
48: In gBuffer(pol2, byid = T, width = 0) : 48: Polygon object 59 area
2218.59
49: In gBuffer(pol2, byid = T, width = 0) : 49: Polygon object 60 area
40.338
50: In gBuffer(pol2, byid = T, width = 0) : 50: Polygon object 61 area
121.014
Thanks, and sorry if i can't be of much help in making suggestions as i
don't know how could this be done


2015-11-05 8:05 GMT-03:00 Roger Bivand <Roger.Bivand at nhh.no>:

  
  
#
On Sat, 7 Nov 2015, Eduardo Diez wrote:

            
It is important to actually find out what GRASS v.clean intends to do. I 
think that it is meant to remove precision and digitizing artefacts, 
really.
No, the use of 1e-2 was for projected coordinates, and slivers were very, 
very thin polygons (slivers) created by coordinate precision issues (with 
areas o ~ 1e-5 m2).

So the underlying question is why you want to remove non-sliver geometries 
(that is geometries that do create visual clutter, but which are not 
caused by coordinate precision issues). In fact, you are probably actually 
trying to aggregate Polygons objects - those which have data attributes, 
with neighbouring but larger polygons. You then need to choose (the 
algorithms could, but you need to specify) which Polygons objects are to 
be aggregated first, and then run gUnaryUnion() to merge the geometries, 
then aggregate the attribute data, and finally remove any slivers that may 
be present.

You also seem to have many geometries of 20.169 m2 in area, or multiples 
of this, presumably created by rasterToPolygons(). They are likely 
isolated single pixels of another value in raster 1. Could you use a focal 
method in raster to "smooth" them out first? What are the statistical 
properties of the "smoothed" patches - they will no longer be as 
homogeneous as they were before - does this matter?

Roger

  
    
#
On Sat, 7 Nov 2015, Roger Bivand wrote:

            
The manual says:

"tool=rmarea

The rmarea tool removes all areas <= thresh. The longest boundary with an 
adjacent area is removed or all boundaries if there is no adjacent area. 
Area categories are not combined when a small area is merged with a larger 
area."

So to replicate v.clean tool=rmarea, you'd need to find the proportions of 
boundary lengths with neighbouring polygons in order to choose the larger 
polygon with which to merge. This cannot be done in R anyway, as the 
vector model is not topological. GRASS uses a topological model, so can 
find the boundary lengths between neighbouring polygons. Note that the 
operation overwrites the attribute values of the smaller, merged polygon 
with those of the larger.

So a workflow that would emulate this in rasterToPolygons() might be first 
to detect all patches, and for patches of fewer than 15 cells, change 
their values (categorical) to those of the majority of neighbours. Trying 
a 3x3 filter would correspond to only 9 cells, so ~ 1800 m2, and would 
also dilate or erode boundaries between larger patches, so it would be 
quite tedious and would potentially affect your inferences. You could also 
do this in GRASS, possibly using r.clump, r.mfilter and the r.li.* suite 
to find the < threshold patches. Did you use raster::clump() to generate 
raster1?

Roger

  
    
#
Ok, so i'm starting to see where the issues are. In the first place i
didn't fully understood what a sliver polygon was. My objective is to get
rid of polygons that didn't met my needs in terms of ability to manage them
once the analysis is complete, not the ones caused by coordinate precision
issues.

In this context get rid means merging them with the polygons with which
they share the longest boundary (basically removing the shared boundary
that separates them). That is regardless of whether they have the same
attributes or not (i don't care of the attribute because i won't be able to
handle it anyway).

The area covered by the resulting object should be the same of the original
one as no polygons get deleted, only merged with the larger ones.

The presence of single cell polygons is something with which i don't have
much problem because they all disappeared after running the GRASS tool and
is something to be expected from my workflow. I wouldn't be that
comfortable smoothing the raster because i would be messing with the
results.

One question that comes to my head is what would happen in the cases where
i have a polygon that is below the threshold, but is inside another polygon
below the threshold and these 2 inside a larger one above the threshold.
For example: with a threshold of 3000 sq m i may have a 1500 sq m polygon
inside a 2000 sq m polygon, and those inside a 4000 sq m polygon. From
experience i know that GRASS handles this cases quite well.

Thanks

2015-11-07 12:09 GMT-03:00 Roger Bivand <Roger.Bivand at nhh.no>:

  
  
#
On Sat, 7 Nov 2015, Eduardo Diez wrote:

            
Try it and see - as far as I know, the threshold in rgeos is absolute and 
is meant for real slivers. My guess is that v.clean in GRASS is not doing 
this by design, and that the results are coincidental - it is designed to 
clean non-topological vectors.

Roger