Skip to content

Reading tables without geometry from gdb?

7 messages · obrl soil, Edzer Pebesma, Michael Treglia

#
Hi All,

I have a file geodatabase with non-spatial tables that can be joined to
other spatial objects. Is there a best/easiest way to import the
non-spatial tables in R these days?

I've seen some solutions here:
https://gis.stackexchange.com/questions/184013/read-a-table-from-an-esri-file-geodatabase-gdb-using-r
but figured I'd see if I'm missing anything within sf.

If it helps, the dataset I'm looking at is here:
https://data.cityofnewyork.us/Environment/NYC-Urban-Tree-Canopy-Assessment-Metrics-2010/hnxz-kkn5

When I try sf::st_read, as follows, I get the subsequent error:
sf::st_read(dsn=fgdb,table="LandCover_Metrics_NYC_Community_Districts_Version10C")

Error in st_sf(x, ..., agr = agr, sf_column_name = sf_column_name) :
  no simple features geometry column present.

Thanks all - best regards,
Mike T
#
Hi Mike,

Here's a demo workflow that uses package sf as well as a standalone
install of GDAL (I use the OSGeo4W package version, but any install
that has the OpenFileGDB driver will do). Ogr2ogr can write to csv so
its easy enough to pull out the non-spatial tables that way.

I downloaded your data (fair warning to others, 140MB - not that you'd
know from the metadata!) and unzipped it as 'NYCTrees.gdb'. Now:

options(stringsAsFactors = FALSE)
library(sf) # for spatial
library(readr) # for csv
library(dplyr)  # for join

setwd('C:/DATA')

# somewhere to dump outputs
dir.create(file.path(getwd(), 'tree_tables'))

# whats in the gdb?
gdb_contents <- st_layers(dsn = file.path(getwd(), 'NYCTrees.gdb'),
do_count = TRUE)

# this is easier to read than the list above:
gdb_neat_deets <- data.frame('Name' = gdb_contents[['name']],
                             'Geomtype' = unlist(gdb_contents[['geomtype']]),
                             'features' = gdb_contents[['features']],
                             'fields' = gdb_contents[['fields']])

# so the non-spatial tables have geometry == NA, lets get their names
nyctrees_nonspatial <-
as.list(gdb_neat_deets[is.na(gdb_neat_deets$Geomtype), 'Name'])
# names attrib is useful here
names(nyctrees_nonspatial) <-
gdb_neat_deets[is.na(gdb_neat_deets$Geomtype), 'Name']

# pull all the non-spatial layers out to csv using GDAL:
lapply(seq_along(nyctrees_nonspatial), function(x) {
  system2('C:/OSGeo4W64/bin/ogr2ogr.exe',
          args = c(# output format
                   '-f', 'csv',
                   # overwrite
                   '-overwrite',
                   # destination file
                   file.path(getwd(),
                             'tree_tables',
                             paste0(names(nyctrees_nonspatial[x]), '.csv')),
                   # src file
                   file.path(getwd(), 'NYCTrees.gdb'), nyctrees_nonspatial[x]))
})

# this is p quick, but one of those files has 800k rows so...not that quick!

# are they there?
tree_csvs <- list.files(file.path(getwd(), 'tree_tables'),
                        pattern = '\\.csv$',
                        full.names = TRUE)

# awesome! lets read a small one in and try a spatial join
LandCover_Metrics_PROW_Borough <- read_csv(tree_csvs[7])

# this spatial layer appears to match
PROW_Borough_sf <- st_read(dsn   = file.path(getwd(), 'NYCTrees.gdb'),
                           layer = 'PROW_Borough')

# only 5 geometries, but they're massive multipolygons so this is how
you view attributes without crashing Rstudio:
View(st_set_geometry(PROW_Borough_sf, NULL))

# join on BoroName ought to work
test_join <- left_join(PROW_Borough_sf,
LandCover_Metrics_PROW_Borough, by = 'BoroName')

# seems legit. A (terrible) plot:
plot(select(test_join, Grass_P), border = NA)

et voila

cheers
@obrl_soil
On Sun, Feb 11, 2018 at 2:22 PM, Michael Treglia <mtreglia at gmail.com> wrote:
#
On 02/11/2018 05:22 AM, Michael Treglia wrote:
I didn't even get that far. I tried:
st_read(dsn="NYC_20Urban_20Tree_20Canopy_20Assessment_20Metrics_202010.zip",table="LandCover_Metrics_NYC_Community_Districts_Version10C")
Cannot open data source
/tmp/NYC_20Urban_20Tree_20Canopy_20Assessment_20Metrics_202010.zip
Error in CPL_read_ogr(dsn, layer, as.character(options), quiet, type,  :
  Open failed.

but I'm not sure about the dsn argument; in your example, it is not
clear what fgdb stands for.

I tried unzipping the file in a directory, and use the directory name
for dsn, but without success. Would appreciate pointers to what I'm
doing wrong.

  
    
#
The download wasn?t zipped correctly I think ? after unzipping, add
extension .gdb to the folder name.

@obrl_soil

On Sun, Feb 11, 2018 at 10:47 PM, Edzer Pebesma
<edzer.pebesma at uni-muenster.de> wrote:
#
Bingo! Thanks,
On 02/11/2018 01:52 PM, obrl soil wrote:

  
    
#
sf::st_read now returns a data.frame if no geometry column is found,
rather than raising an error:

https://github.com/r-spatial/sf/commit/77a31f2989a1d217438fb629ac34d7ef31baa9c2

it does emit a warning in that case, now, since an object of a different
type than expected is returned:
"LandCover_Metrics_NYC_Community_Districts_Version10C")
Reading layer `LandCover_Metrics_NYC_Community_Districts_Version10C'
from data source `/tmp/fgdb.gdb' using driver `OpenFileGDB'
Warning message:
  no simple feature geometries present: returning a data.frame
TC_ID  Total_A   Can_A   Grass_A   Soil_A Water_A  Build_A  Road_A
Paved_A
1     1 41641392 4935664 3736731.7 21923.00  783819 13813225 8179456
10170573
2     2 37739955 4440133  632087.5 12255.75  343714 17140788 7380846
7790131
   Perv_A Imperv_A    Can_P  Grass_P     Soil_P  Water_P  Build_P   Road_P
1 9478138 32163254 11.85278 8.973599 0.05264713 1.882307 33.17186 19.64261
2 5428190 32311765 11.76507 1.674850 0.03247420 0.910743 45.41815 19.55711
   Paved_P   Perv_P Imperv_P BoroCD
1 24.42419 22.76134 77.23866    101
2 20.64160 14.38314 85.61686    102
On 02/11/2018 05:22 AM, Michael Treglia wrote:

  
    
#
This is great - thanks so much Edzer! I'm sure this will come in handy for
others too.

And thanks to @obrl_soil for the clear and thorough response too.

Best,
Mike

On Sun, Feb 11, 2018 at 8:27 AM, Edzer Pebesma <
edzer.pebesma at uni-muenster.de> wrote: