
Dealing with lakes and NA in Moran/SpatialPolygonsDataFrame

5 messages · Matthieu Stigler, Roger Bivand

#
Hi

I'm puzzled about how to deal with NAs and lakes in SpatialPolygonsDataFrame
objects...

My problems (maybe I should post them as separate messages?):
-spplot does not seem to treat NA values specially...
-the neighbors graph seems to make links even to lakes/holes
-moran.test seems to work with NA values... but what about the
neighbors/weights used as input? Are the results biased?
-localmoran does not seem to work with NA

I illustrate each of them below.


The example is based on the Syracuse data from ASDAR (data from the website
http://www.asdar-book.org/bundles/lat1_bundle.zip):
#NA PROBLEM
setwd("H:/Documents/Stats/Book Bivand/Chap 9")
library(sp)
library(rgdal)
NY8 <- readOGR(".", "NY8_utm18")
library(spdep)
Syracuse <- NY8[NY8$AREANAME == "Syracuse city",]
Syracuse2<-Syracuse
Syracuse2$POP8[43]<-NA
spplot(Syracuse2, zcol="POP8")
#it appears as white, similar to other colors which are nevertheless
#true values and not NA!

LAKE problem:
#check if lakes:
sapply(sapply(slot(Syracuse, "polygons"),function(x) slot(x,
"Polygons")), function(x) slot(x, "hole"))
#btw, it is pretty complicated, are there some more user-friendly
#wrappers for that? kind of isHole, getHole?
slot(slot(slot(Syracuse2, "polygons")[[43]],"Polygons")[[1]], "hole")<-TRUE
plot(poly2nb(Syracuse2), coordinates(Syracuse2))

Here it seems that it did not take the hole into account and still
computes neighbors... right?

And if so, does it affect the results of Moran tests?

Thanks a lot!
Matthieu Stigler
#
Hi

I have a second problem with NAs in a SpatialPolygonsDataFrame: it seems
that residuals.spautolm does not work with NA values, returning NULL.

This adds to the questions I asked yesterday about dealing with NAs, and
I would be very thankful if anyone could give me any advice!

Thanks a lot!!

Matthieu Stigler



2009/9/3 Matthieu Stigler <matthieu.stigler at gmail.com>:
SPAUTOLM residuals problem:

library(rgdal)  # provides readOGR()
library(spdep)
NY8 <- readOGR(".", "NY8_utm18")
Syracuse <- NY8[NY8$AREANAME == "Syracuse city",]
Syracuse2<-Syracuse
Syracuse2$POP8[43]<-NA
spplot(Syracuse2, zcol="POP8")

NY8$Z[43]<-NA
NY8$PEXPOSURE[43]<-NA
NY8$PCTAGE65P[43]<-NA
NY8$PCTOWNHOME[43]<-NA

NY_nb <- read.gal("NY_nb.gal", region.id=row.names(as(NY8, "data.frame")))
NYlistw<-nb2listw(NY_nb, style = "B")
nysar<-spautolm(Z~PEXPOSURE+PCTAGE65P+PCTOWNHOME , data=NY8, listw=NYlistw)
summary(nysar)
residuals(nysar)
1 day later
#
On Thu, 3 Sep 2009, Matthieu Stigler wrote:

            
NAs are coded "transparent", which looks the same as white when the background is 
white, and white is used in the default divergent palette. If you do:

library(lattice)
trellis.device(bg="grey")
spplot(Syracuse2, zcol="POP8")

(the wrong way to change the background, but OK to show the point) or

trellis.par.set(sp.theme())
spplot(Syracuse2, zcol="POP8")

where white is not in the palette, the NA polygon becomes distinguishable.
Holes are easy to fall into, so no wrappers.
poly2nb() uses all the geometries present. The preferred choice is to 
subset the geometries to keep only the ones the user requires, so:


not_holes <- !sapply(sapply(slot(Syracuse2, "polygons"),function(x)
   slot(x, "Polygons")), function(x) slot(x, "hole"))
nb <- poly2nb(Syracuse2[not_holes,])
plot(nb, coordinates(Syracuse2)[not_holes,])

I guess this clarifies things. In principle, a single Polygon object in a 
Polygons object will not be a hole anyway (it is an external ring), so 
your example is rather artificial. Subsetting the geometries with "[" is 
to be preferred.

Hope this helps,

Roger

I guess the same applies to your followup - but spautolm() ought not to 
permit computation on missing data - I'll check this.
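
A minimal sketch of what "the same applies" could look like for the followup: drop the incomplete observations and subset the neighbour list to match before building weights. The 5 x 5 rook lattice from cell2nb() and the toy data frame are hypothetical stand-ins for NY_nb and NY8, and only two of the model variables are mimicked.

```r
library(spdep)

# Hypothetical stand-ins: a 5 x 5 rook lattice instead of NY_nb,
# and random toy data instead of the NY8 attributes
nb_all <- cell2nb(5, 5, type = "rook")
set.seed(1)
dat <- data.frame(Z = rnorm(25), PEXPOSURE = rnorm(25))
dat$Z[13] <- NA   # one observation with a missing response

# Keep only rows that are complete in the model variables
ok <- complete.cases(dat[, c("Z", "PEXPOSURE")])
dat_ok <- dat[ok, ]

# Subset the neighbour list with the same logical vector, so that
# data rows and neighbour entries stay aligned
nb_ok <- subset(nb_all, ok)

# zero.policy = TRUE because subsetting can leave regions without neighbours
lw_ok <- nb2listw(nb_ok, style = "B", zero.policy = TRUE)
```

The model would then be fitted with data = dat_ok and listw = lw_ok, so that no missing values reach the fitting function.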

  
    
1 day later
#
2009/9/5 Roger Bivand <Roger.Bivand at nhh.no>:
Thanks a lot for your help!

For plotting NAs... well, changing bg is a little bit radical, as
everything is then different... the problem actually arose because I
used heat.col(), which looks really pretty but uses white... isn't there a
function in trellis to change only the NAs and not everything else?
Thanks!
Oh, so for the lake, I should rather use ringDir?

And when my dataset contains lakes and NAs, should I do the same as
above for the NAs as well and then subset? Will Moran values be affected by
that?

Thanks a lot!
#
On Mon, 7 Sep 2009, Matthieu Stigler wrote:

            
Not that I am aware of. If you want this level of control, use the at= and 
col.regions= arguments, and assign an improbable value to the NAs, or use 
base graphics and use hatching for the NAs. I'm travelling and do not have 
Deepayan Sarkar's excellent lattice book with me, perhaps you could check 
what he says?
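
A minimal sketch of the at=/col.regions= suggestion, assuming a toy 5 x 5 grid rather than the Syracuse data. The sentinel value, the padding, the extra column POP_plot and the grey colour are all illustrative choices, not anything prescribed by sp or lattice.

```r
library(sp)

# Hypothetical toy data: a 5 x 5 grid of square polygons with one NA
grd <- GridTopology(cellcentre.offset = c(0.5, 0.5), cellsize = c(1, 1),
                    cells.dim = c(5, 5))
polys <- as.SpatialPolygons.GridTopology(grd)
set.seed(1)
x <- SpatialPolygonsDataFrame(polys, data.frame(POP = runif(25, 100, 500)),
                              match.ID = FALSE)
x$POP[13] <- NA

# Replace NA with an improbable value well below the observed range
rng <- range(x$POP, na.rm = TRUE)
pad <- 0.01 * diff(rng)
sentinel <- rng[1] - diff(rng)
x$POP_plot <- ifelse(is.na(x$POP), sentinel, x$POP)

# First interval catches only the sentinel; the rest cover the real data
n_cols <- 10
brks <- c(sentinel - pad,
          seq(rng[1] - pad, rng[2] + pad, length.out = n_cols + 1))
cols <- c("grey60", heat.colors(n_cols))

p <- spplot(x, zcol = "POP_plot", at = brks, col.regions = cols)
print(p)
```

The colour key will also show the sentinel interval, so it may need relabelling or suppressing in a real figure.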
Please do not treat the hole= or ringDir= slots as topologically checked. See 
checkPolygonsHoles() in maptools for details. In many representations, the 
entity (here Polygons object) has to have one or more external rings that 
are not holes, so your lake may be turned from hole to non-hole by 
software. If there is "nothing" there, subset it out on an attribute, for 
example a factor describing land cover:

x1 <- x0[x0$landCover != "lake",]

then subset on the NAs. Using the geometry characteristics of entities 
whose topologies have not been built or checked is not robust (may work 
with some data sets, but not with others). A lake isn't a hole, it is an 
entity that is a lake, or it is not an entity at all (subsetted out). In 
this case you need to check your visualization representations.

Confusion with Moran's I will result if your entities are confused, so 
sort them out first.
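
The order of operations described above can be sketched as follows, assuming a toy 5 x 5 grid in place of x0. The landCover and POP columns and their values are hypothetical.

```r
library(sp)
library(spdep)

# Hypothetical toy data: a 5 x 5 grid of square polygons standing in for x0
grd <- GridTopology(cellcentre.offset = c(0.5, 0.5), cellsize = c(1, 1),
                    cells.dim = c(5, 5))
polys <- as.SpatialPolygons.GridTopology(grd)
set.seed(1)
df <- data.frame(landCover = rep("land", 25), POP = rnorm(25),
                 stringsAsFactors = FALSE)
x0 <- SpatialPolygonsDataFrame(polys, df, match.ID = FALSE)
x0$landCover[13] <- "lake"   # one polygon is a lake
x0$POP[7] <- NA              # one land polygon has a missing value

# 1. subset out the lake on its attribute, 2. subset on the NAs
x1 <- x0[x0$landCover != "lake", ]
x2 <- x1[!is.na(x1$POP), ]

# 3. build neighbours and weights only from the entities that remain
nb <- poly2nb(x2)
lw <- nb2listw(nb, style = "W")

# 4. Moran's I on data and weights that refer to the same entities
mt <- moran.test(x2$POP, lw)
print(mt)
```

Because the neighbours are built after subsetting, no links to the lake or to the NA polygon enter the weights.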

Hope this helps,

Roger