Sam,
Indeed perplexing. I've tried to slim things down to:
library(spdep)
nsamp <- 100
rho <- 0.4
nb7rt <- cell2nb(7, 7, torus=TRUE)
set.seed(20060918)
e <- matrix(rnorm(nsamp*length(nb7rt)), nrow=length(nb7rt))
listw <- nb2listw(nb7rt)
invmat <- invIrW(listw, rho=rho)
y <- invmat %*% e
lag_res0 <- lapply(1:nsamp, function(i)
lagsarlm(y[,i] ~ 1, listw=listw))
err_res0 <- lapply(1:nsamp, function(i)
errorsarlm(y[,i] ~ 1, listw=listw))
summary(do.call("rbind", lapply(lag_res0, coefficients)))
summary(do.call("rbind", lapply(err_res0, coefficients)))
to just take the null models in both cases, where lambda == rho by
definition, but the intercept term differs (I'm not sure how to
insert autocorrelation into the intercept in the lag model), which
is probably a source of error in the simulation.
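For what it is worth, one way to give the intercept the same
treatment in the lag DGP is to pass the whole systematic part
through the spatial multiplier too; a minimal sketch reusing the
objects above (alpha is just an arbitrary value I picked):

```r
# lag DGP with an intercept: y = (I - rho*W)^{-1} (alpha*1 + e),
# so the intercept is filtered by the same multiplier as the
# disturbances
alpha <- 2
y_a <- invmat %*% (alpha + e)  # alpha recycles over the matrix
lag_resA <- lapply(1:nsamp, function(i)
    lagsarlm(y_a[,i] ~ 1, listw=listw))
summary(do.call("rbind", lapply(lag_resA, coefficients)))
```

The fitted intercepts should then scatter around alpha itself
rather than alpha/(1 - rho).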
I also had a look at Haining 1990 and the use of the Cholesky lower
triangle of the SAR (simultaneous autoregressive) covariance
matrix, and tried:
W <- listw2mat(listw)
Id <- diag(length(nb7rt))
sarcov <- solve(t(Id - rho*W) %*% (Id - rho*W))
sarcovL <- chol((sarcov + t(sarcov))/2)
y <- t(sarcovL) %*% e # Kaluzny et al. 1996, p. 160
lag_res1 <- lapply(1:nsamp, function(i)
lagsarlm(y[,i] ~ 1, listw=listw))
err_res1 <- lapply(1:nsamp, function(i)
errorsarlm(y[,i] ~ 1, listw=listw))
summary(do.call("rbind", lapply(lag_res1, coefficients)))
summary(do.call("rbind", lapply(err_res1, coefficients)))
which is similar to, but not the same as, just premultiplying by
(I - rho W)^{-1}, and where I'm very unsure how to simulate for the
lag model correctly. The error model simulation is, however,
perhaps best done this way, isn't it?
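Reusing the objects above, the two transforms can be compared
directly; they imply the same covariance but are different matrices,
which is why Cholesky draws suit the error model but not the lag
model's mean structure:

```r
# both transforms reproduce the SAR covariance ...
range(t(sarcovL) %*% sarcovL - sarcov)  # essentially zero
range(invmat %*% t(invmat) - sarcov)    # essentially zero
# ... but the matrices themselves differ
range(t(sarcovL) - invmat)
```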
Running this shows that none of them retrieves the input
coefficient value exactly. The weights here are about as
well-behaved as possible (all observations have four neighbours on
the torus grid), and things perhaps get worse with less regular
weights.
Simulating for the error model is discussed in the literature, but
the lag model is something of an oddity, certainly worth sorting
out.
Best wishes,
Roger
On Sat, 16 Sep 2006, Sam Field wrote:
Roger and Terry,
Thanks to both of you for your reply. Yes, sigma^2 enters into the
generation of e. The systematic component of the model,
X_mat%*%parms, has a variance of around 10^2. e is typically
around 4^2, or:
e <- rnorm(n, mean=0, sd=4)
I went ahead and copied the code below (it isn't too long) in case
others are interested in this exercise. The car package is only
needed for the n.bins function in the last lines of the code;
otherwise it is optional. I also tried the invIrM function
suggested by Terry. It seems to produce the same results that I
get. It appears in the code below. Any help/suggestions would be
appreciated. I am stuck.
#creating data
library(spdep)
library(car)
#sample size
n <- 100
#coordinates
x_coord <- runif(n,0,10)
y_coord <- runif(n,0,10)
## w matrix and row normalized
# pairwise Euclidean distances between the points
dist_mat <- as.matrix(dist(cbind(x_coord, y_coord)))
#defines neighborhood distance
neighbor_dist <- 3
# binary weights: 1 within neighbor_dist, 0 otherwise, zero diagonal
w_dummy <- (dist_mat <= neighbor_dist) * 1
diag(w_dummy) <- 0
# divide by row sums (assumes every point has at least one neighbour)
row_sum <- rowSums(w_dummy)
w <- sweep(w_dummy, 1, row_sum, "/")
neighbors <- dnearneigh(cbind(x_coord,y_coord),0,neighbor_dist)
nb_list <- nb2listw(neighbors)
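As a quick sanity check on the hand-rolled weights, w should
coincide with the matrix spdep builds from the same neighbour
definition (again assuming no point is isolated, since its row sum
would otherwise be zero):

```r
# listw2mat() converts the listw object back to a dense matrix;
# attributes such as dimnames will differ, hence
# check.attributes=FALSE
all.equal(w, listw2mat(nb_list), check.attributes=FALSE)
```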
#create some independent variables
X1 <- rnorm(n,4,2)
X2 <- rnorm(n,2,2)
X3 <- rnorm(n,0,1)
X4 <- rnorm(n,13,3)
X5 <- rnorm(n,21,.5)
X_mat <- cbind(rep(1,n),X1,X2,X3,X4,X5)
# population parameters
parms <- c(12.4,2,-4,5,1,-.5)
rho <- .3
#begin iterations
n_iter <- 100 #draw 100 samples of size n
estimates <- matrix(nrow=n_iter,ncol=7)
estimates2 <- matrix(nrow=n_iter,ncol=7)
estimates3 <- matrix(nrow=n_iter,ncol=7)
for(iter in 1:n_iter)
{
e <- rnorm(n,mean=0,sd=4)
# drawing samples
y_lag <- solve(diag(n) - rho*w) %*% X_mat %*% parms +
solve(diag(n) - rho*w) %*% e
y_error <- X_mat %*% parms + solve(diag(n) - rho*w) %*% e
y_error2 <- X_mat %*% parms + invIrM(neighbors, rho) %*% e
lag_model <- lagsarlm(y_lag ~ X1 + X2 + X3 + X4 + X5,
listw=nb_list, type='lag')
err_model <- errorsarlm(y_error ~ X1 + X2 + X3 + X4 + X5,
listw=nb_list)
err_model2 <- errorsarlm(y_error2 ~ X1 + X2 + X3 + X4 + X5,
listw=nb_list)
# retrieve parameter estimates
estimates[iter,] <- coefficients(lag_model)
estimates2[iter,] <- coefficients(err_model)
estimates3[iter,] <- coefficients(err_model2)
}
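To see the bias at a glance, the column means can be compared with
the targets; as far as I can see, coef.sarlm returns the spatial
coefficient first, followed by the regression coefficients, so:

```r
# Monte Carlo bias per coefficient (spatial parameter first, then
# the intercept and slopes)
true_vals <- c(rho, parms)
round(colMeans(estimates)  - true_vals, 3)  # lag model
round(colMeans(estimates2) - true_vals, 3)  # error model
round(colMeans(estimates3) - true_vals, 3)  # error model via invIrM
```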
mean(estimates[,7])
hist(estimates[,7],nclass=n.bins(estimates[,7]),probability=TRUE)
lines(density(estimates[,7]),lwd=2)
mean(estimates2[,7])