Skip to content

spsample and NA

8 messages · Roger Bivand, Edzer Pebesma, Agustin Lobo

#
It is often the case, at least with raster (grid) objects,
that you are not interested on a part of the raster.
For example because it's outside a non.rectangular region of study
and/or because some types of terrain (some classes) are outside
the scope of the study. In any case, those parts can be
represented by NA.

Unfortunately, while some functions like table() would not
include the NA in the statistics, spsample includes
the NA as just another class, which implies that a fraction
of the resulting coordinates will lay on the NA
part of the raster. This makes it difficult, to get N valid (i.e., not 
on NA parts) points using spsample. Is there any way around this?
For example, if I want 10 points on non-NA cells of
raster X, I would like

spsample(X, 10, "stratified")

to provide 10 positions on non-NA cells.

At the moment, I get:

x <- GDAL.open("C:/ALOBO/dipu2006/STLL_VEG2006/ESTRAT3_OBAC.tif")
ESTRAT3OBAC <- asSGDF_GROD(x, output.dim=c(100,100))
GDAL.close(x)

ESTRAT3OBAC at data[ESTRAT3OBAC at data==55537]<- NA

X <- ESTRAT3OBAC
X$cut <- as.ordered(cut(X$band1, c(0,10,100),include.lowest=TRUE))
table(X$cut)

   [0,10] (10,100]
       83     1961

summary(X$cut)
   [0,10] (10,100]     NA's
       83     1961     7956
table(X$cut)

   [0,10] (10,100]
       83     1961

Xspsample <- spsample(X,10,"stratified")
Xo <- overlay(X,Xspsample)
summary(Xo)
Object of class SpatialPointsDataFrame
Coordinates:
          min       max
x1  410995.6  414954.4
x2 4608002.1 4610566.0
Is projected: TRUE
proj4string : [ +proj=utm +zone=31 +ellps=intl +units=m +no_defs]
Number of points: 9
Data attributes:
      band1             cut
  Min.   :   67   [0,10]  :0
  1st Qu.:   97   (10,100]:3
  Median :  159   NA's    :6
  Mean   :24745
  3rd Qu.:55537
  Max.   :55537

That is, I get 6 positions on NA values of X

Thanks!

Agus
#
The test with meuse.grid works:

#copied from file:SpatialGridDataFrame-class.html
coordinates(meuse.grid) = c("x", "y") # promote toSpatialPointsDataFrame
gridded(meuse.grid) <- TRUE # promote to SpatialPixelsDataFrame
x = as(meuse.grid, "SpatialGridDataFrame") # creates the full grid
#Now me:
 > unique(x at data$soil)
[1] NA  1  2  3

 > table(x at data$soil)
    1    2    3
1665 1084  354

test.stsamp <- spsample(x, n = 10, "stratified")
o <- overlay(x, test.stsamp)
 > o
           coordinates part.a part.b      dist soil ffreq
5556 (179156, 330917)      0      1 0.1030270    1     1
6670 (180039, 330336)      0      1 0.0703333    1     2
2240 (180677, 332606)      1      0 0.0975136    1     1


But don't know how to make that my raster behave in the same way. What
does it mean "cast to SpatialPixels first"? I thought that as
X comes from an imported raster it was an sp object already.

Also, there are several components in the x coming from meuse.grid:
str(x)
Formal class 'SpatialGridDataFrame' [package "sp"] with 6 slots
   ..@ data       :'data.frame': 8112 obs. of  5 variables:
   .. ..$ part.a: num [1:8112] NA NA NA NA NA NA NA NA NA NA ...
   .. ..$ part.b: num [1:8112] NA NA NA NA NA NA NA NA NA NA ...
   .. ..$ dist  : num [1:8112] NA NA NA NA NA NA NA NA NA NA ...
   .. ..$ soil  : num [1:8112] NA NA NA NA NA NA NA NA NA NA ...
   .. ..$ ffreq : Factor w/ 3 levels "1","2","3": NA NA NA NA NA NA NA 
NA NA NA ...

how does spsample know which layer has to be used for the stratification?

Thanks!

Agus

Roger Bivand escribi?:

  
    
3 days later
#
Not only to avoid NAs, but also to get approximately right sample size.

Stratified sampling does not take place based on strata defined by 
attributes (map layers), but rather by dividing the area into square 
blocks (strata) and randomly sample (n=1) from each block. That is why 
the resulting sample doesn't have exactly the prescribed/requested size 
(although one could iterate). See also Ripley 1981 Spatial Statistics 
for explanation.

Best wishes,
--
Edzer
Roger Bivand wrote:
#
I understand, but how do I convert between one type and the other?
I've tried:

x <- GDAL.open("C:/ALOBO/dipu2006/STLL_VEG2006/ESTRAT3_OBAC.tif")
ESTRAT3OBAC <- asSGDF_GROD(x, output.dim=c(118,189))
GDAL.close(x)
ESTRAT3OBAC at data[ESTRAT3OBAC at data==55537]<- NA
X <- ESTRAT3OBAC[seq(1,118,by=10),seq(1,189,by=10)] #small example
 > summary(X)
Object of class SpatialGridDataFrame
Coordinates:
         min       max
x  410649.7  415399.7
y 4607910.5 4610910.5
Is projected: TRUE
proj4string : [ +proj=utm +zone=31 +ellps=intl +units=m +no_defs]
Number of points: 2
Grid attributes:
   cellcentre.offset cellsize cells.dim
x          410774.7      250        19
y         4608035.5      250        12
Data attributes:
      band1
  Min.   :  4.0
  1st Qu.: 67.0
  Median :156.0
  Mean   :125.8
  3rd Qu.:163.0
  Max.   :237.0
  NA's   :119.0

 > Y <- SpatialPixels(X)
 > summary(Y)
Object of class SpatialPixels
Coordinates:
         min       max
x  410649.7  415399.7
y 4607910.5 4610910.5
Is projected: TRUE
proj4string : [ +proj=utm +zone=31 +ellps=intl +units=m +no_defs]
Number of points: 2

So X and Y are not the same at all.

Thanks for your help.

Agus


Edzer J. Pebesma escribi?:

  
    
#
additional to the

X = as(X, "SpatialPixelsDataFrame")

etc. there's a helper function that toggles between them; try:

fullgrid(X) = TRUE
class(X)
fullgrid(X) = FALSE
class(X)
--
Edzer
Agustin Lobo wrote:
#
Great, it works now, thanks a lot.
This is how I do it:

#Using a raster map to stratify a sample
#Non-interesting classes set as "NA"

 > library(rgdal)
 > library(sp)
 > x <- GDAL.open("C:/ALOBO/dipu2006/STLL_VEG2006/ESTRAT3_OBAC.tif")
 > ESTRAT3OBAC <- asSGDF_GROD(x, output.dim=c(118,189))
 > GDAL.close(x)
Closing GDAL dataset handle 0x0171c668...  destroyed ... done.
 > ESTRAT3OBAC at data[ESTRAT3OBAC at data==55537]<- NA
 > X <- ESTRAT3OBAC[seq(1,118,by=10),seq(1,189,by=10)] #small example
 > class(X)
[1] "SpatialGridDataFrame"
attr(,"package")
[1] "sp"
 > fullgrid(X)=F
 > class(X)
[1] "SpatialPixelsDataFrame"
attr(,"package")
[1] "sp"
 > test.stsamp <- spsample(X, n = 10, "stratified")
 > test.stsamp
SpatialPoints:
             x1      x2
  [1,] 411328.3 4608544
  [2,] 412220.4 4608288
  [3,] 412856.2 4608528
  [4,] 413195.9 4608455
  [5,] 414354.4 4608723
  [6,] 414804.9 4608653
  [7,] 412880.7 4609431
  [8,] 413441.1 4608874
  [9,] 414629.3 4609258
[10,] 414903.8 4609542
[11,] 412511.0 4609699
[12,] 413290.9 4610094
Coordinate Reference System (CRS) arguments: +proj=utm +zone=31
+ellps=intl +units=m +no_defs
 > o <- overlay(X, test.stsamp)
 > o
  [1] 71 89 77 79 67 85 26 64 41 31 18 14


Agus

Edzer J. Pebesma escribi?: