
Using duckdb spatial module from R (with sf)?

Thanks Dewey, good call on st_aswkb. Yup, I was puzzled as to what that
internal format is too! According to the docs (
https://github.com/duckdblabs/duckdb_spatial#multi-tiered-geometry-type-system
)
it is basically double-aligned WKB, and the developers may eventually look
into enforcing the format to be properly compatible with PostGIS (which may
be useful for the PostGIS scanner extension).

So maybe one day this coercion won't be necessary.

The docs detail which operations currently run as 'native' threaded duckdb
and which don't:
https://github.com/duckdblabs/duckdb_spatial#supported-functions
Their roadmap might be of interest to users on this SIG as well (spatial
indexing is still on there
<https://github.com/duckdblabs/duckdb_spatial/issues/7>, but that's way
over my head).

So far this works pretty well for me! Using this trivial example of a
spatial polygon filter, I'm able to run a query against a GBIF parquet
snapshot (2000+ parquet partitions, about 175 GB compressed) on an S3
bucket from duckdb, without downloading anything, in about 17 minutes.
(Doing approximately the same filtering using only the bounding box of the
polygon, i.e. in pure duckdb without the spatial extension, takes about 8
minutes, so the additional overhead of casting the lat/long columns to
geometry and filtering with the polygon really isn't that bad!) Note that
RAM use is minimal, as expected.
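
For anyone who wants to try the same comparison, here is a rough sketch of
the two filters as SQL strings, built in Python only so the pieces are easy
to check. It is not the exact query I ran. The S3 path, the polygon, and the
column names (decimallongitude, decimallatitude) are placeholder
assumptions, and the helper names are made up for illustration. The SQL
functions ST_Point, ST_Within and ST_GeomFromText are from the spatial
extension, which has to be loaded first (INSTALL spatial; LOAD spatial;).

```python
# Sketch only: this builds the SQL text and does not execute it.
# The path, polygon and column names are hypothetical placeholders.

def polygon_wkt(ring):
    """Format a ring of (lon, lat) pairs as a WKT POLYGON string."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT rings must be closed
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({coords}))"


def bbox(ring):
    """Return (xmin, ymin, xmax, ymax) for a ring of (lon, lat) pairs."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_query(source, ring, lon="decimallongitude", lat="decimallatitude"):
    """Bounding-box filter: plain duckdb, no spatial extension needed."""
    xmin, ymin, xmax, ymax = bbox(ring)
    return (
        f"SELECT * FROM read_parquet('{source}') "
        f"WHERE {lon} BETWEEN {xmin} AND {xmax} "
        f"AND {lat} BETWEEN {ymin} AND {ymax}"
    )


def polygon_query(source, ring, lon="decimallongitude", lat="decimallatitude"):
    """Polygon filter: cast lon/lat to a point and test it against the polygon."""
    wkt = polygon_wkt(ring)
    return (
        f"SELECT * FROM read_parquet('{source}') "
        f"WHERE ST_Within(ST_Point({lon}, {lat}), ST_GeomFromText('{wkt}'))"
    )


# Hypothetical square polygon and placeholder bucket path.
ring = [(-123.0, 37.0), (-122.0, 37.0), (-122.0, 38.0), (-123.0, 38.0)]
source = "s3://my-bucket/occurrence.parquet/*"
print(bbox_query(source, ring))
print(polygon_query(source, ring))
```

Either string can then be handed to duckdb from whichever client you use
(from R, for instance, via DBI::dbGetQuery() on a duckdb connection).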

Cheers,

Carl
---
Carl Boettiger
http://carlboettiger.info/



On Tue, Aug 15, 2023 at 5:58 PM Dewey Dunnington <dewey at dunnington.ca>
wrote: