
(no subject)

8 messages · Sam Field, Roger Bivand

#
List,

I am looking for ways of estimating spatial autoregression models that adjust 
for a known source of heteroskedasticity, and the Waller and Gotway (2004) text 
outlines how this can be done in the case of the SAR model.  If I work at it, I 
think I can implement this myself in R, but I wanted to see if anybody else had 
done it. It seems like a pretty straightforward generalization of the SAR model 
and would make a very helpful addition to the spatial regression tools in 
spdep - especially given the effects of heteroskedasticity on the consistency of 
the SAR parameters! 

Sam
#
On Tue, 21 Aug 2007, Sam Field wrote:

            
?spautolm

The examples reproduce the results in Waller & Gotway, perhaps apart from 
a flattish function to optimise in the weighted CAR case. spautolm() now 
provides weighted or unweighted SAR, CAR, and SMA. Sparse matrix methods 
are available for SAR and CAR: for SAR when the spatial weights are 
symmetric or similar to symmetric, and for CAR always, since CAR weights 
have to be symmetric.

Roger

  
    
#
Thanks Roger!

Sorry about omitting the subject line.  I have been working with errorsarlm() - 
did not know about spautolm().  Do you know if something analogous is 
possible in the case of the spatial lag model,

Y = rho W Y + X B + e ?

I was going to start looking into it.

thanks!


Sam




Quoting Roger Bivand <Roger.Bivand at nhh.no>:

  
    
#
On Tue, 21 Aug 2007, Sam Field wrote:

            
I have not looked at it, but because it is a weird animal, I don't think 
it will be too easy to provide a theoretical foundation for it. The 
heteroskedasticity is in the error term, but the autoregressive part 
isn't. I don't think there are any examples anywhere, either.

It ought to be possible, though.

Roger

  
    
7 days later
#
Roger,

One possibility in this limited case might be to replicate the aggregate 
level cases based on their respective weights (since they are integers, 
i.e. within unit sample sizes), then run a spatial lag model.  This 
would be equivalent to recreating the individual level data from the 
aggregate data (excluding measures that vary within the aggregate 
units).  This would obviously inflate the sample size, and one would 
have to correct for this somehow in the variance-covariance matrix of 
the parameter estimates. 

You would have to do the same for your nb object as well of course.  I 
have looked into this by creating a list of neighbor ids from the 
original nb object, but nb2listw() requires an nb object not a list so I 
am stuck.

The other problem would be that you would end up with a potentially 
large data set. In my case, 13,000 - maybe more than spautolm() could 
handle?  Maybe this whole idea is flawed.
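
A minimal base R sketch of the replication idea described above, working on a dense neighbour matrix rather than an nb object. The toy data, the object names (`agg`, `size`, `idx`, `W_big`) and the choice that replicates of the same unit are not neighbours of each other are all assumptions made for illustration; nothing here is taken from spdep itself.

```r
# Toy aggregate data: 4 units on a line, with integer within-unit sample sizes.
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)                       # binary, symmetric contiguity matrix
size <- c(3L, 1L, 2L, 4L)           # hypothetical integer weights
agg <- data.frame(id = 1:4,
                  y = c(1.2, 0.7, 1.9, 1.1),
                  x = c(0.5, 0.1, 0.9, 0.4))

# Replicate each unit 'size' times; idx maps pseudo-individuals to units.
idx <- rep(seq_len(nrow(agg)), times = size)
ind <- agg[idx, ]

# Expand the neighbour matrix in the same way. Because W has a zero
# diagonal, replicates of one unit are NOT neighbours of each other
# (an assumption of this sketch, not a recommendation).
W_big <- W[idx, idx]
W_big <- W_big / rowSums(W_big)     # row-standardise

dim(W_big)
```

If a listw object is needed afterwards, spdep's mat2listw() converts a weights matrix of this kind, which may be one way around the nb2listw() obstacle. The sample-size inflation mentioned above is not corrected here.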


Thanks again for your input! The results change quite a bit with the 
weighted SAR models. 


Sam
Roger Bivand wrote:

  
    
#
Sam,
On Wed, 29 Aug 2007, Sam Field wrote:

            
You could fake it with nb2blocknb, but that was not written for this case. 
It was written for the case where the individual level variables were 
observed, but with no address or coordinates, just a postal code. Here the 
LHS and RHS would both be replicated, which doesn't seem desirable.
One interesting conclusion that I've reached is that while the spdep code 
in spautolm() replicates Waller and Gotway for unweighted and weighted SAR 
and CAR, S-Plus SpatialStats fails on the weighted CAR. The reason seems 
to be that W+G did the same as spautolm() (in SAS?) - find the spatial 
autoregressive coefficient first (optimise in one dimension), then use GLS 
to find the regression coefficients. But S+ seems to try to optimise all 
the coefficients at once, and gets bitten by the fact that
(I - \rho W) %*% diag(wts) in their case is not symmetric (W has to be 
symmetric, and the wts have to "balance" - see Cressie etc.). Now I'm not 
sure that S+ is right here. If not, then the lag model can be given 
weights too, by simply passing them to the auxiliary regressions used to 
set up the framework for optimisation. The analytical covariance matrix of 
the coefficients remains a problem, though. We'd need to use some other 
mechanism to get there for the eigen method, though the LR tests used for 
sparse methods would be, I think, OK. I've also been playing with sampling 
from a fitted model, to generate synthetic "standard errors", like 
mcmcsamp() in lme4, but I don't know if it is sensible, or how well it 
would scale to many observations.
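
The two-step procedure described above (a one-dimensional search for the spatial coefficient, then GLS for the regression coefficients) can be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the spautolm() code: the ring-lattice weights, the simulated data, the names `fit_given` and `negll`, and the assumed error covariance sigma^2 (I - lambda W)^{-1} diag(1/w) (I - lambda W')^{-1} are all made up for the example.

```r
set.seed(1)
n <- 50
# Ring lattice: each unit neighbours the one before and after it,
# row-standardised (hypothetical weights, chosen for simplicity).
W <- matrix(0, n, n)
for (i in 1:n) {
  W[i, c(if (i == 1) n else i - 1, if (i == n) 1 else i + 1)] <- 0.5
}
w <- sample(1:20, n, replace = TRUE)        # hypothetical known case weights
X <- cbind(1, rnorm(n))
lambda_true <- 0.5
beta_true <- c(1, 2)
u <- rnorm(n, sd = 1 / sqrt(w))             # heteroskedastic innovations
y <- drop(X %*% beta_true + solve(diag(n) - lambda_true * W, u))

# Step 2, for a given lambda: weighted least squares on the filtered data.
fit_given <- function(lambda) {
  A <- diag(n) - lambda * W
  lm.wfit(A %*% X, drop(A %*% y), w)
}

# Step 1: negative concentrated log-likelihood in lambda (constants dropped).
# The weights enter through the weighted sum of squared errors; their
# log-determinant term does not depend on lambda and is omitted.
negll <- function(lambda) {
  A <- diag(n) - lambda * W
  sse <- sum(w * fit_given(lambda)$residuals^2)
  -(as.numeric(determinant(A)$modulus) - n / 2 * log(sse / n))
}

opt <- optimize(negll, interval = c(-0.9, 0.9))
lambda_hat <- opt$minimum
beta_hat <- fit_given(lambda_hat)$coefficients
```

Dense matrices and determinant() are used only to keep the sketch short; they would not scale to many observations.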

So I am thinking about how lagsarlm() could get weights, but it won't 
happen too fast, maybe.

Best wishes,

Roger

  
    
#
Roger,  

Don't the weights play a role in the optimization to find lambda? Certainly the 
location of lambda is influenced by the weights employed by W+G. Wouldn't they 
also influence the location of rho in the spatial lag model?

A while ago I wrote some SAS code to fit a spatial lag model and calculate the 
variance-covariance matrix of the parameters, so I am somewhat familiar with 
the two step procedure in that case. I have never really messed with the 
spatial error model. 

I am actually pretty happy being confined to a weighted spatial error model for 
the moment, since it seems to me that spillover effects from X%*%beta 
can always be accommodated by including W%*%X%*%tau terms in a spatial error 
model (though one must assume that the influence of spatially lagged covariates 
stops at first-order neighbors?).  I wondered if including W%*%X%*%tau in a 
spatial error model would lead to inconsistent parameter estimates, since 
rho*W%*%X is also in the model.  Some quick simulation suggested that the 
consistency of the parameter estimates was not affected.

I am not sure that a true spatial lag process is theoretically compelling in my 
case anyway, now that I think about it. In fact, the idea that the spatial 
correlation among the residuals is due to a "pure" contagion process (as 
represented by the rho W y term) would seem pretty rare in the case of most 
complex phenomena - which is why the type="mixed" option in lagsarlm() is so 
useful!  

I wonder if 

Y = X beta + W X tau + e,  with  e = rho W e + u

isn't a sensible alternative to the spatial lag model when a contagion process 
is not theoretically plausible but spillover effects of neighboring 
covariates are.  


thanks again for your amazing support of the spdep package.  

cheers,

Sam

Quoting Roger Bivand <Roger.Bivand at nhh.no>:

  
    
#
On Wed, 29 Aug 2007, Sam Field wrote:

            
Yes, both in the sum of squared errors, and in the spautolm() 
implementation as a separate term in the log likelihood, rather than in a 
combined Jacobian. In the spatial lag case, the auxiliary regressions 
would be weighted, so the sum of squared errors term would be affected, 
but I'm unsure about the extra term in the log likelihood.
The error model is, as you say, doing:

(I - rho W) y = (I - rho W) X beta + u

y = rho W y + X beta - rho W X beta + u

so

y = rho W y + (X beta + W X tau) - rho W (X beta + W X tau) + u

might become unstable if any of the X's are highly autocorrelated, leading 
to aliasing (columns would drop out of the QR solution). However, in the 
lag IV fitting method, W X (and maybe W W X too) are used as instruments 
for W y, so it is worth exploring.
If you explore this, it would be interesting to hear how you get on.
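
The aliasing concern above can be seen in its most extreme form with the intercept: for row-standardised weights, W applied to a constant column returns the same constant column, so the lagged intercept is exactly collinear with the intercept and drops out of the QR solution. The following base R sketch shows this with simulated data; the ring-lattice W, the variable names and the coefficient values are assumptions for illustration only.

```r
set.seed(42)
n <- 30
# Row-standardised ring lattice (hypothetical weights).
W <- matrix(0, n, n)
for (i in 1:n) {
  W[i, c(if (i == 1) n else i - 1, if (i == n) 1 else i + 1)] <- 0.5
}

x <- rnorm(n)
X <- cbind(const = 1, x = x)
WX <- W %*% X                        # spatially lagged covariates
colnames(WX) <- c("W.const", "W.x")

y <- 1 + 2 * x + 0.5 * WX[, "W.x"] + rnorm(n)

# W.const equals const exactly, so lm() reports NA for its coefficient.
fit <- lm(y ~ 0 + X + WX)
coef(fit)
fit$rank                             # one less than the number of columns
```

A strongly autocorrelated covariate gives a milder version of the same problem: W x is then close to x, and the fit becomes ill-conditioned rather than exactly rank-deficient.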

Best wishes,

Roger