Simulating spatially autocorrelated data
On Wed, 7 Sep 2011, Terry Griffin wrote:
Patrick,

Specification of the spatial weights matrix (W) is important, and in general the connectedness of W influences the estimation and inference of the model. When you say that you do not know the "true rho", I suspect you are saying that you do not know the true underlying spatial structure of the data, and thus the appropriate specification of the spatial weights matrix. One tool in the spdep package that may be helpful to you is the sp.correlogram function for spatial correlograms; other techniques have been used, including semivariograms. I would be interested in what others have to say regarding determining the optimal level of connectedness of W.

Two classic references regarding connectedness of W are:

Florax, R.J.G.M. and Rey, S. 1995. The Impacts of Misspecified Spatial Interaction in Linear Regression Models. In Anselin L, Florax R J G M (eds) New directions in spatial econometrics. Berlin, Springer: 111-135.

Bell, K.P. and Bockstael, N.E. 2000. Applying the Generalized-Moment Estimation Approach to Spatial Problems Involving Micro level Data. The Review of Economics and Statistics, February 2000, 82 (1): 72-82.
And a newer one:

Smith, T. E. (2009), Estimation Bias in Spatial Models with Strongly Connected Weight Matrices. Geographical Analysis, 41: 307-332. doi: 10.1111/j.1538-4632.2009.00758.x

Sparse connections are present in the spatial weights, but the implied spatial process is dense, because the inverse of (I - \rho W) is dense. If the weights are strongly connected, the assumed process probably implies "oversmoothing".

Roger
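The point about a sparse W implying a dense process can be checked on a toy case. The following is a minimal sketch in base R, not taken from the thread: the ring layout, n = 10 and rho = 0.5 are illustrative assumptions chosen only to keep the matrices small.

```r
# Toy check: a sparse W still gives a dense (I - rho W)^{-1}.
# n, rho and the ring layout are assumptions made for illustration.
n <- 10
rho <- 0.5

# Ring lattice: each site is linked only to its two immediate neighbours.
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  left  <- if (i == 1) n else i - 1
  right <- if (i == n) 1 else i + 1
  W[i, c(left, right)] <- 0.5   # row-standardised: each row sums to 1
}

A <- solve(diag(n) - rho * W)   # (I - rho W)^{-1}

nonzero_W <- sum(W != 0)           # 2 entries per row, 20 in total
nonzero_A <- sum(abs(A) > 1e-12)   # all 100 entries are non-zero
print(c(W = nonzero_W, inverse = nonzero_A))
```

Every site ends up influencing every other site through chains of neighbours, which is why the inverse fills in even though W has only two non-zero entries per row.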
Terry Griffin, Ph.D.
Associate Professor - Economics
University of Arkansas - Division of Agriculture
501.249.6360 (SMS)
tgriffin at uaex.edu

----- Original Message -----
From: "Patrick Downey" <PDowney at urban.org>
To: "Roger Bivand" <Roger.Bivand at nhh.no>
Cc: r-sig-geo at stat.math.ethz.ch
Sent: Tuesday, September 6, 2011 1:02:04 PM
Subject: Re: [R-sig-Geo] Simulating spatially autocorrelated data

Hi Roger and Terry,

Thank you very much for your help and directing me towards Roger's spdep package, which of course had everything I needed. I've now worked through this code and done some additional simulations.

I have one remaining question. You say "the larger the distance threshold, the less well the spatial process is captured." I was wondering if you could further provide some information on this, either by explaining or referencing a document or webpage with explanation. Decreasing the distance threshold, as you suggest, radically alters the results and I'm looking for some guidance on how to select the appropriate distance threshold when I don't know the true rho (that is, with non-simulated data).

Thanks,
Mitch

-----Original Message-----
From: Roger Bivand [mailto:Roger.Bivand at nhh.no]
Sent: Thursday, September 01, 2011 2:20 PM
To: Downey, Patrick
Cc: r-sig-geo at stat.math.ethz.ch
Subject: Re: [R-sig-Geo] Simulating spatially autocorrelated data

On Thu, 1 Sep 2011, Downey, Patrick wrote:
Hello all,

I'm trying to simulate a spatially autocorrelated random variable, and I cannot figure out what the problem is. All I want is a simple spatial lag model where

Y = rho*W*Y + e

where e is a vector of iid normal random variables, rho is the autocorrelation, W is a row-normalized distance matrix (a spatial weights matrix), and Y is the random variable.

I thought the following program should do it, but it's not working. At the end of the program, I calculate Moran's I, and it is not even close to rejecting the null hypothesis of no spatial autocorrelation, even when rho is very high (for example, below, rho is 0.95).

Can someone please identify what the problem is and offer some guidance on how to fix it?
PS - I apologize in advance, but I am not familiar with R's spatial
packages. I've done very little spatial analysis in R, so if there's a
package that can already do this, please recommend.
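For reference, the spatial lag model in the question can be solved for Y, which is the step the program's solve() call relies on. This is a sketch under the assumption that I - rho*W is invertible (which holds for a row-normalized W with |rho| < 1); sigma^2 is notation introduced here for the variance of e, equal to 1 in the program.

```latex
\begin{align}
Y &= \rho W Y + e \\
(I - \rho W)\, Y &= e \\
Y &= (I - \rho W)^{-1} e \\
\operatorname{Var}(Y) &= \sigma^2 \, (I - \rho W)^{-1} \left[ (I - \rho W)^{-1} \right]^{\top}
\end{align}
```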
BEGIN PROGRAM:
install.packages("fields");library(fields)
install.packages("ape");library(ape)
N <- 200
rho <- 0.95
x.coord <- runif(N,0,100)
y.coord <- runif(N,0,100)
points <- cbind(x.coord,y.coord)
e <- rnorm(N,0,1)
dist.nonnorm <- rdist(points,points) # Matrix of Euclidean distances
dist <- dist.nonnorm/rowSums(dist.nonnorm) # Row normalizing the distance matrix
diag(dist) <- 0 # Ensuring that the main diagonal is exactly 0
I think that you are using the distances as weights, not inverse distances, which seems more sensible.
I <- diag(N) # Identity matrix (not Moran's I)
inv <- solve(I-rho*dist) # Inverting (I - rho*W)
y <- as.vector(inv %*% e) # Generating data that is supposed to be spatially autocorrelated
Moran.I(y,dist) # Does not reject null hypothesis of no spatial autocorrelation
As Terry Griffin says, you can use spdep for this:

library(spdep)
rho <- 0.95
N <- 200
x.coord <- runif(N,0,100)
y.coord <- runif(N,0,100)
points <- cbind(x.coord,y.coord)
e <- rnorm(N,0,1)
dnb <- dnearneigh(points, 0, 150)
dsts <- nbdists(dnb, points)
idw <- lapply(dsts, function(x) 1/x)
lw <- nb2listw(dnb, glist=idw, style="W")
inv <- invIrW(lw, rho)
y <- inv %*% e
moran.test(y, lw)

to reproduce your analysis with IDW, here without:

lw <- nb2listw(dnb, glist=dsts, style="W")
inv <- invIrW(lw, rho)
y <- inv %*% e
moran.test(y, lw) # no autocorrelation

and here with a less inclusive distance threshold:

dnb <- dnearneigh(points, 0, 15)
dsts <- nbdists(dnb, points)
idw <- lapply(dsts, function(x) 1/x)
lw <- nb2listw(dnb, glist=idw, style="W")
inv <- invIrW(lw, rho)
y <- inv %*% e
moran.test(y, lw)

the larger the distance threshold, the less well the spatial process is captured, alternatively use idw <- lapply(dsts, function(x) 1/(x^2)), for example, to attenuate the weights more sharply.

Hope this clarifies,

Roger
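The effect of the distance threshold can be looked at side by side by wrapping the steps above in a loop. What follows is a rough sketch, not part of the original reply: the helper idw_listw, the seed, and the intermediate cutoff of 50 are assumptions added for illustration, and zero.policy = TRUE is set only as a guard in case a point has no neighbour within the smallest cutoff.

```r
library(spdep)

set.seed(1)   # hypothetical seed, only so the sketch is repeatable
N <- 200
rho <- 0.95
pts <- cbind(runif(N, 0, 100), runif(N, 0, 100))
e <- rnorm(N, 0, 1)

# Hypothetical helper: row-standardised inverse-distance weights for a cutoff.
# power = 2 gives the more sharply attenuated 1/(x^2) weights.
idw_listw <- function(pts, cutoff, power = 1) {
  nb <- dnearneigh(pts, 0, cutoff)
  w <- lapply(nbdists(nb, pts), function(x) 1 / x^power)
  nb2listw(nb, glist = w, style = "W", zero.policy = TRUE)
}

cutoffs <- c(15, 50, 150)
res <- sapply(cutoffs, function(d) {
  lw <- idw_listw(pts, d)
  y <- as.vector(invIrW(lw, rho) %*% e)          # same e for every cutoff
  mt <- moran.test(y, lw, zero.policy = TRUE)
  c(cutoff  = d,
    mean_nb = mean(card(lw$neighbours)),         # average number of neighbours
    moran_I = as.numeric(mt$estimate[1]),
    p_value = as.numeric(mt$p.value))
})
print(round(res, 4))
```

Each column tests y against the same weights that generated it, as in the code above. With the cutoff at 150 every point is a neighbour of every other point in the 100 by 100 square, so the weights are as strongly connected as they can be; the Moran's I values themselves depend on the random draw, so none are quoted here.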
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
-- Roger Bivand Department of Economics, NHH Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no