Skip to content

Match polygon and dataframe IDs after raster::extract

4 messages · Loïc Dutrieux, Edzer Pebesma

#
Hi all,

I'm trying to look at correlation between two raster layers, for 
different polygons. So I use raster::extract to get all the raster 
values for every polygon, do the calculation and feed the output back to 
a SpatialPolygonDataFrame.
I got it working, but I have a doubt regarding the order of the rows; 
and it doesn't look like I can use match.ID = TRUE.

See the example below.

library(raster)
library(dplyr)

# Create brick with 2 layers
b <- brick(ncol=36, nrow=18, nl=2)
b[[1]] <- rnorm(ncell(b))
b[[2]] <- rnorm(ncell(b))

# Create sp
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
cds3 <- rbind(c(-10,20), c(50,60), c(70,-10))
polys <- spPolygons(cds1, cds2, cds3)

# Extract all values
df0 <- extract(b, polys, df = TRUE)

# Compute correlation betwen the two layers for every polygons (sorry 
for the pipe)
df1 <- group_by(df0, ID) %>%
   summarise(cor = cor(layer.1, layer.2)) %>%
   data.frame()

# Attach to df to spdf
spdf <- SpatialPointsDataFrame(polys, df1)



How do I know for sure that the order of the rows in the dataframe did 
not get mixed up? Can I just assume that they will remain in the same order?

The dataframe returned by extract has an ID column, but the IDs do not 
correspond to the polygons IDs. For instance if I remove the second 
polygon (which has the ID "2"), the IDs in the df extracted are still 1 
and 2 (instead of 1 and 3).

# Quick function to get polygons IDs
getPolyID <- function(x) {
   sapply(x at polygons, function(x) {x at ID} )
}


getPolyID(polys)
# [1] "1" "2" "3"
getPolyID(polys[-2])
# [1] "1" "3"

unique(extract(b, polys[-2], df = TRUE)$ID)
# [1] 1 2


Any suggestions?

Thanks,
Lo?c
#
On 19/11/15 14:37, Lo?c Dutrieux wrote:
If it does reshuffle and you didn't specify match.ID, it will warn you:
[,1] [,2]
a    1    3
b    2    4
a
b 2
a 3
coordinates a
a      (1, 3) 3
b      (2, 4) 2
Warning message:
In SpatialPointsDataFrame(pts, df) :
  forming a SpatialPointsDataFrame based on maching IDs, not on record
order. Use match.ID = FALSE to match on record order
coordinates a
b      (1, 3) 2
a      (2, 4) 3

Using match.ID = FALSE ensures the input order of coordinates and points
is kept.

  
    
#
Thanks Edzer,

But then, how safe is it to match attributes and geometries based on the 
record order? Is there any chance that records get shuffled during the 
dataframe manipulation steps, or the raster::extract?

Also one note, SpatialPointsDataFrame() gives a warning, but 
SpatialPolygonsDataFrame() doesn't. (I made a mistake in my initial 
example, I want to create a SpatialPolygonsDataFrame, not a 
SpatialPointsDataFrame).

Cheers,
Lo?c
On 11/19/2015 03:08 PM, Edzer Pebesma wrote:
#
On 19/11/15 16:36, Lo?c Dutrieux wrote:
Can you think of a reason why the author of the code would do that?

To be sure I would read the docs and then try to understand

library(raster)
getMethod("extract", c("Raster", "SpatialPolygons"))
as documented, SpatialPolygonsDataFrame has argument match.ID default to
TRUE.