How to retrieve the interior boundary (points) of a polygon sf object
Dear Xiang, thanks for your kind words. 1. The underlying assumption behind st_duplicated is that the polygons used to construct the multipolygon share some vertices in the interior part of the multipolygon. Your explanation and understandings are correct. If you want to check the implementation, see here: https://github.com/luukvdmeer/sfnetworks/blob/c6eb10a05768f570e94981e826de8ee3f7fb7fce/R/utils.R#L21 2. Yes, you are correct, you can simply use st_cast(..., "POINT"). I'm not sure whether one of the two approaches is much faster than the other, but I guess it doesn't matter here. Kind regards Andrea
On 8/8/2025 8:39 AM, Xiang Ye wrote:
Dear Andrea,
Thank you for your detailed response. I followed your instructions
carefully, and was able to get the expected results.
St_duplicated() is cool!
I have two following questions.
1.
Why st_duplicated() works?
My guess is that for a typical (multi)polygon sf object, all vertices
on the interior borders will be adopted twice to form polygons.
Technically, this makes them appear twice in the geometry column of
the data set. Meanwhile, vertices on the exterior borders only appear
once. When checked by st_duplicated(), all the second appearance of
the interior-border vertices will be marked true, and thus they are
extracted.
Is my understanding correct?
2.
Can we simplify a sentence?
In the script, there is a sentence:
pts_sfc <- st_multipoint(pts_matrix[, c(1, 2)]) |> st_sfc(crs =
st_crs(polygon)) |> st_cast("POINT")
Is this sentence equal to the following one?
pts_sfc_2 <- st_cast(polygon, 'point')
I experimented by myself, and find identical(pts_sfc,
pts_sfc_2)==FALSE. However, identical(st_duplicated(pts_sfc),
st_duplicated(pts_sfc_2))==TRUE.
Again, thank you so much for your awesome solution!
Xiang
------------------------------------------------------------------------
*From:* Andrea Gilardi - Unimib <andrea.gilardi at unimib.it>
*Sent:* Thursday, August 7, 2025 01:56
*To:* Xiang Ye <yexiangmaup at outlook.com>
*Cc:* r-sig-geo at r-project.org <r-sig-geo at r-project.org>
*Subject:* Re: [R-sig-Geo] How to retrieve the interior boundary
(points) of a polygon sf object
Hi Xiang! I would use the following approach:
library(tidyverse)
library(sf)
library(spData)
# You need the development version of sfnetworks:
# remotes::install_github("luukvdmeer/sfnetworks")
library(sfnetworks) # Exports st_duplicated
polygon = st_geometry(us_states)
pts_matrix <- st_coordinates(polygon)
pts_sfc <- st_multipoint(pts_matrix[, c(1, 2)]) |> st_sfc(crs =
st_crs(polygon)) |> st_cast("POINT")
pts_dup <- st_duplicated(pts_sfc)
pts_interior <- pts_sfc[pts_dup]
plot(polygon, reset = FALSE)
points(pts_interior, cex = 1, pch = 16, col = 2)
You can also check the output here:
https://gist.github.com/agila5/2b5d9b7e1453c4d3f292e7aae5fdb79e
Hope that's helpful.
Andrea
On 8/6/2025 7:27 PM, Dexter Locke wrote:
library(tidyverse) library(sf) library(spData) polygon=st_geometry(us_states) plot(polygon) polygon |> st_as_sf() |> summarise() |> plot() On Wed, Aug 6, 2025 at 1:25?PM Xiang Ye <yexiangmaup at outlook.com> wrote:
Dear community, I am trapped by a seemingly easy task. I would like to write a short R script to retrieve the interior boundary (points) of a polygon sf
object,
like the state boundary of the Contiguous United States, but
without the
outline. I tried several methods, with none fulfilling my objective. Below
is the
code, with my observations in the comments: # Objective: Retrieve the internal boundary of a polygon data set library(tidyverse) library(sf) library(spData) polygon=st_geometry(us_states) plot(polygon) # Method 1: Use st_difference() st_union(polygon) -> border plot(border) st_difference(polygon, border) -> interior plot(interior) # You can tell that the result is weird. # Method 2: Retrieve vertices as points, then use st_difference() st_cast(polygon, 'POINT') -> points_polygon points_polygon # 3585 features st_cast(border, 'POINT') -> points_border points_border # 1275 features st_difference(points_polygon, points_border) -> points_interior #
About 2
minutes to execute points_interior # 4.57 million features, about 2 GB. why? # Methods 3: Use st_disjoint instead of st_difference() polygon[border, op=st_disjoint] # Empty set points_polygon[border, op=st_disjoint] -> points_interior2 points_interior2 # 132 features; plot() it and doesn't look right points_polygon[points_border, op=st_disjoint] -> points_interior3 points_interior3 # 3585 features, meaning no points have been taken out # Methods 4: Retrieve vertices as coordinates, converting them into tibbles, then anti-join the border coordinates st_coordinates(polygon) %>% as_tibble -> coords_polygon coords_polygon# A tibble: 3,585 ? 5 st_coordinates(border) %>% as_tibble -> coords_border coords_border # A tibble: 1,275 ? 5 anti_join(coords_polygon, coords_border) -> coords_interior coords_interior # A tibble: 3,585 ? 5, meaning no points have been
taken
out, and implying no coordinates in coords_border coincides with the coordinates in coords_polygon. This is just not reasonable. Any comments on why I fail and/or how to do it correctly will be
helpful.
Thank you in advance! ?? YE, Xiang THINKING SPATIALLY<http://www.linkedin.com/in/spatialyexiang>. Ph.D. in Spatial Statistics ????????? [[alternative HTML version deleted]]
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
??????? [[alternative HTML version deleted]]
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo