Skip to content
Prev 26485 / 29559 Next

Issues with a GDB file in R?

On Thu, 3 May 2018, Roger Bivand wrote:

            
Combining R and GRASS seems to work, but no guarantees.

library(sp)
library(rgdal)
Gi <- GDALinfo("g250_clc12_V18_5.tif")
makeSG <- function(x) {
   stopifnot(class(x) == "GDALobj")
   p4 <- attr(x, "projection")
   gt <- GridTopology(c(x[4]+(x[6]/2), x[5]+(x[7]/2)), c(x[6], x[7]),
     c(x[2], x[1]))
   SpatialGrid(gt, CRS(p4))
}
SG <- makeSG(Gi)
nuts_ll <- readOGR("NUTS_RG_01M_2013_4326_LEVL_3.shp")
nuts_laea <- spTransform(nuts_ll, CRS(attr(Gi, "projection")))
library(rgrass7)
td <- tempdir()
iG <- initGRASS("/home/rsb/topics/grass/g740/grass-7.4.0", td, SG)
# your GRASS will be where you installed it
writeVECT(nuts_laea, "nuts", v.in.ogr_flags="o")
execGRASS("r.in.gdal", input="g250_clc12_V18_5.tif", output="corine",
   flags="o")
execGRASS("v.to.rast", input="nuts", output="nuts3", use="cat",
   label_column="FID")
execGRASS("r.cross", input="nuts3,corine", output="cross_nuts3")
r_stats0 <- execGRASS("r.stats", input="cross_nuts3", flags="a",
   intern=TRUE)
r_stats1 <- gsub("\\*", "NA", r_stats0)
con_stats <- textConnection(r_stats1)
stats <- read.table(con_stats, header=FALSE, col.names=c("cross_cat",
   "area"), colClasses=c("integer", "numeric"))
close(con_stats)
r_cats0 <- execGRASS("r.category", map="cross_nuts3", intern=TRUE)
r_cats1 <- gsub(";", "", r_cats0)
r_cats2 <- gsub("\t", " ", r_cats1)
r_cats3 <- gsub("no data", "no_data", r_cats2)
r_cats4 <- gsub("category ", "", r_cats3)
r_cats4[1] <- paste0(r_cats4[1], "NA NA")
r_cats_split <- strsplit(r_cats4, " ")
cats <- data.frame(cross_cat=as.integer(sapply(r_cats_split, "[", 1)),
   nuts=sapply(r_cats_split, "[", 2),
   corine=as.integer(sapply(r_cats_split, "[", 3)))
catstats <- merge(cats, stats, by="cross_cat", all=TRUE)
agg_areas <- tapply(catstats$area, list(catstats$nuts, catstats$corine),
   sum)
library(foreign)
corine_labels <- read.dbf("g250_clc12_V18_5.tif.vat.dbf", as.is=TRUE)
o <- match(colnames(agg_areas), as.character(corine_labels$Value))
colnames(agg_areas) <- corine_labels$LABEL3[o]
agg_areas_df <- as.data.frame(agg_areas)
agg_areas_df1 <- agg_areas_df[-which(!(row.names(agg_areas_df) %in%
   as.character(nuts_ll$FID))),] # dropping "NA"      "no_data"

This should be ready to merge with the NUTS3 boundaries, if needed.

agg_areas_df1$FID <- row.names(agg_areas_df1)
nuts_corine <- merge(nuts_laea, agg_areas_df1, by="FID")

For the vector parts you could use sf and the provisional rgrass7sf on 
github, but that wouldn't yet let you construct a skeleton SpatialGrid to 
define the GRASS location. Using GRASS for the heavy lifting (the raster 
is 51000 by 35000), and avoiding vector for overlay, this doesn't need 
much memory (GRASS handles rasters by row). The GRASS temporary location 
only takes 130MB of disk space. You could go for the 100m raster 
resolution, but I doubt that the outcome would vary much - anyone like to 
try?

If the sub-polygons by NUTS and corine categories are actually needed, the 
output of r.cross could be passed to r.to.vect:

execGRASS("r.to.vect", input="cross_nuts3", output="cross_nuts3",
   type="area")

but this is more demanding in memory terms.

Interesting case, and it does show that combining GIS and R delivers the 
goods - SAGA would probably work equivalently.

Roger