(no subject) - R-SIG-Geo | R Mailing Lists

Tue, Aug 21, 2007 1:18 PM #

List,

I am looking for ways of estimating spatial autoregression models that adjust 
for a known source of heteroskedasticity. The Waller and Gotway (2004) text 
outlines how this can be done in the case of the SAR model.  If I work at it, I 
think I can implement this myself in R, but I wanted to see if anybody else had 
done it. It seems like a pretty straightforward generalization of the SAR model 
and would make a very helpful addition to the spatial regression tools in 
spdep - especially given the effects of heteroskedasticity on the consistency of 
the SAR parameters! 

Sam

********Note the new contact information*******

Samuel H. Field, Ph.D. 
Senior Research Investigator
CHERP/Division of Internal Medicine - University of Pennsylvania
Philadelphia VA Medical Center
3900 Woodland Ave (9 East)
Philadelphia, PA 19104
(215) 823-5800 EXT. 6155 (Office)
(215) 823-6330 (Fax)

Roger Bivand

Tue, Aug 21, 2007 1:54 PM #

On Tue, 21 Aug 2007, Sam Field wrote:

?spautolm

The examples reproduce the results in Waller & Gotway, perhaps apart from 
a flattish function to optimise in the weighted CAR case. spautolm() now 
provides weighted or unweighted SAR, CAR, and SMA. Sparse matrix methods 
are available for SAR and CAR; for SAR, when the spatial weights are 
symmetric or similar to symmetric (CAR weights have to be symmetric).

Roger

Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

Sam Field

Tue, Aug 21, 2007 2:27 PM #

Thanks Roger!

Sorry about omitting the subject line.  I have been working with errorsarlm() - 
did not know about spautolm().  Do you know if there is something analogous 
possible in the case of the spatial lag model,

Y = pWY + XB + e ?

I was going to start looking into it.

thanks!


Sam

Roger Bivand

Wed, Aug 22, 2007 12:02 AM #

On Tue, 21 Aug 2007, Sam Field wrote:

I have not looked at it, but because it is a weird animal, I don't think 
it will be too easy to provide a theoretical foundation for it. The 
heteroskedasticity is in the error term, but the autoregressive part 
isn't. I don't think there are any examples anywhere, either.

It ought to be possible, though.

Roger


Sam Field

Wed, Aug 29, 2007 9:55 AM #

Roger,

One possibility in this limited case might be to replicate the 
aggregate-level cases based on their respective weights (since they are 
integers, i.e. within-unit sample sizes), then run a spatial lag model.  This 
would be equivalent to recreating the individual-level data from the 
aggregate data (excluding measures that vary within the aggregate 
units).  This would obviously inflate the sample size, and one would 
have to correct for this somehow in the variance-covariance matrix of 
the parameter estimates. 

You would have to do the same for your nb object as well of course.  I 
have looked into this by creating a list of neighbor ids from the 
original nb object, but nb2listw() requires an nb object not a list so I 
am stuck.
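
A minimal sketch of the replication idea, written in plain Python rather than R and not using spdep. It assumes the neighbour structure is held as a list of 0-based index lists (one per aggregate unit) and that `counts` holds the integer within-unit sample sizes; the function name `replicate_neighbours` and the `link_within` switch are hypothetical. Whether replicates of the same unit should count as neighbours of each other is a modelling choice, so it is left as an option here.

```python
def replicate_neighbours(nb, counts, link_within=True):
    """Expand an aggregate-level neighbour list to replicated cases.

    nb          -- list of lists; nb[i] holds the 0-based indices of unit i's neighbours
    counts      -- integer replication count (within-unit sample size) per unit
    link_within -- if True, replicates of the same unit are neighbours of each other
    Returns (unit_of, nb_rep): the source unit of each replicated case and
    the neighbour list of the replicated data set.
    """
    # Positions of each unit's replicates in the expanded data set.
    start, blocks = 0, []
    for c in counts:
        blocks.append(list(range(start, start + c)))
        start += c

    unit_of = [i for i, block in enumerate(blocks) for _ in block]
    nb_rep = []
    for i, block in enumerate(blocks):
        # Each replicate of unit i is linked to every replicate of each neighbouring unit.
        across = [k for j in nb[i] for k in blocks[j]]
        for k in block:
            within = [m for m in block if m != k] if link_within else []
            nb_rep.append(sorted(within + across))
    return unit_of, nb_rep


# Three units on a line (0 - 1 - 2) with within-unit sample sizes 2, 1 and 3.
nb = [[1], [0, 2], [1]]
counts = [2, 1, 3]
unit_of, nb_rep = replicate_neighbours(nb, counts)
print(unit_of)
print(nb_rep)
```

The expanded list has one entry per replicated case, so its length equals the sum of the counts; with 13,000 cases the dense equivalent of this structure would be large, which is the scaling worry raised in the next paragraph.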

The other problem would be that you would end up with a potentially 
large data set. In my case, 13,000 - maybe more than spautolm() could 
handle?  Maybe this whole idea is flawed.


Thanks again for your input! The results change quite a bit with the 
weighted SAR models. 


Sam


Roger Bivand

Wed, Aug 29, 2007 2:55 PM #

Sam,

On Wed, 29 Aug 2007, Sam Field wrote:

You could fake it with nb2blocknb, but that was not written for this case. 
It was written for the case where the individual-level variables were 
observed, but there was no address or coordinates, just a postal code. Here 
the LHS and RHS would be replicated, which doesn't seem desirable.

One interesting conclusion that I've reached is that while the spdep code 
in spautolm() replicates Waller and Gotway for unweighted and weighted SAR 
and CAR, S-Plus SpatialStats fails on the weighted CAR. The reason seems 
to be that W+G did the same as spautolm() (in SAS?) - find the spatial 
autoregressive coefficient first (optimise in one dimension), then use GLS 
to find the regression coefficients. But S+ seems to try to optimise all 
the coefficients at once, and gets bitten by the fact that
(I - \rho W) %*% diag(wts) in their case is not symmetric (W has to be 
symmetric, and the wts have to "balance" - see Cressie etc.). Now I'm not 
sure that S+ is right here. If not, then the lag model can be given 
weights too, by simply passing them to the auxiliary regressions used to 
set up the framework for optimisation. The analytical covariance matrix of 
the coefficients remains a problem, though. We'd need to use some other 
mechanism to get there for the eigen method, though the LR tests used for 
sparse methods would be, I think, OK. I've also been playing with sampling 
from a fitted model, to generate synthetic "standard errors", like 
mcmcsamp() in lme4, but I don't know if it is sensible, or how well it 
would scale to many observations.
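
The two-step procedure described above (a one-dimensional search for the spatial coefficient, then GLS for the regression coefficients) can be sketched as follows. This is an illustration in Python with NumPy, not the spautolm() code. It assumes a weighted SAR error model in which the innovation variance of unit i is sigma^2 / wts[i]; the function names, the grid search, the ring-shaped weights matrix and the simulated data are all hypothetical choices made for the example.

```python
import numpy as np


def concentrated_loglik(lam, y, X, W, wts):
    """Log-likelihood of a weighted SAR error model at a fixed lambda,
    with beta and sigma^2 replaced by their GLS estimates."""
    n = len(y)
    A = np.eye(n) - lam * W                      # spatial filter (I - lambda W)
    sw = np.sqrt(wts)
    y_f = sw * (A @ y)                           # filtered, weighted response
    X_f = sw[:, None] * (A @ X)                  # filtered, weighted design matrix
    beta = np.linalg.lstsq(X_f, y_f, rcond=None)[0]
    resid = y_f - X_f @ beta
    s2 = resid @ resid / n
    _, logdet = np.linalg.slogdet(A)             # log-Jacobian log|I - lambda W|
    ll = (logdet + 0.5 * np.sum(np.log(wts))     # weights enter as a separate term
          - 0.5 * n * np.log(2 * np.pi * s2) - 0.5 * n)
    return ll, beta, s2


def fit_weighted_sar(y, X, W, wts, grid=np.linspace(-0.9, 0.9, 181)):
    """Step 1: one-dimensional search over lambda. Step 2: GLS at the optimum."""
    lls = [concentrated_loglik(lam, y, X, W, wts)[0] for lam in grid]
    lam_hat = grid[int(np.argmax(lls))]
    ll, beta, s2 = concentrated_loglik(lam_hat, y, X, W, wts)
    return lam_hat, beta, s2, ll


# Simulated example: 100 units on a ring, row-standardised weights.
rng = np.random.default_rng(1)
n = 100
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
wts = rng.integers(1, 21, size=n).astype(float)  # e.g. within-unit sample sizes
X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = rng.normal(size=n) / np.sqrt(wts)          # heteroskedastic innovations
y = X @ np.array([1.0, 2.0]) + np.linalg.solve(np.eye(n) - 0.5 * W, eps)

lam_hat, beta_hat, s2_hat, ll_hat = fit_weighted_sar(y, X, W, wts)
print(lam_hat, beta_hat, s2_hat)
```

Because the regression coefficients are profiled out at each trial value of lambda, the optimiser only ever sees one parameter. A grid is used here for transparency; a proper line search would be the natural replacement.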

So I am thinking about how lagsarlm() could get weights, but it won't 
happen too fast, maybe.

Best wishes,

Roger



Sam Field

Wed, Aug 29, 2007 7:04 PM #

Roger,  

The reason seems

Don't the weights play a role in the optimization to find lambda? Certainly the 
location of lambda is influenced by the weights employed by W+G. Wouldn't they 
also influence the location of rho in the spatial lag model?

A while ago I wrote some SAS code to fit a spatial lag model and calculate the 
variance-covariance matrix of the parameters, so I am somewhat familiar with 
the two-step procedure in that case. I have never really messed with the 
spatial error model. 

I am actually pretty happy being confined to a weighted spatial error model for 
the moment, since it would seem to me that spillover effects from the X%*%beta 
can always be accommodated by including W%*%X%*%tau terms in a spatial error 
model (though one must assume that the influence of spatially lagged covariates 
stops at first-order neighbors?).  I wondered if including W%*%X%*%tau in a 
spatial error model would lead to inconsistent parameter estimates, since 
rho*W%*%X is also in the model.  Some quick simulation suggested that the 
consistency of the parameter estimates was not affected.

I am not sure that a true spatial lag process is theoretically compelling in my 
case anyway, now that I think about it. In fact, the idea that the spatial 
correlation among the residuals is due to a "pure" contagion process (as 
represented by the pWy term) would seem pretty rare in the case of most complex 
phenomena - which is why the type="mixed" option in lagsarlm() is so useful!  

I wonder if 

Y = Xbeta + WXtau + pWe + u

isn't a sensible alternative to the spatial lag model when a contagion process 
is not theoretically plausible but spillover effects of neighboring 
covariates are.  


thanks again for your amazing support of the spdep package.  

cheers,

Sam

Roger Bivand

Wed, Aug 29, 2007 11:28 PM #

On Wed, 29 Aug 2007, Sam Field wrote:

Yes, both in the sum of squared errors, and in the spautolm() 
implementation as a separate term in the log likelihood, rather than in a 
combined Jacobian. In the spatial lag case, the auxiliary regressions 
would be weighted, so the sum of squared errors term would be affected, 
but I'm unsure about the extra term in the log likelihood.

The error model is, as you say, doing:

(I - rho W) y = (I - rho W) X beta + u

y = rho W y + X beta - rho W X beta + u

so

y = rho W y + (X beta + W X tau) - rho W (X beta + W X tau) + u

might become unstable if any of the X's are highly autocorrelated, leading 
to aliasing (columns would drop out of the QR solution). However, in the 
lag IV fitting method, W X (and maybe W W X too) are used as instruments 
for W y, so it is worth exploring.
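
Both points can be checked numerically. The sketch below uses Python with NumPy rather than spdep and assumes hypothetical row-standardised ring weights. It takes the extreme case of a perfectly autocorrelated column, the intercept, for which W 1 = 1, so the lagged intercept is aliased with the intercept itself. It also confirms that the filtered and expanded forms of the error model imply the same disturbance u.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Row-standardised ring weights: each unit has two neighbours with weight 0.5.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

ones = np.ones(n)
x = rng.normal(size=n)

# Design matrix with lagged covariates: [1, x, W1, Wx].
Z = np.column_stack([ones, x, W @ ones, W @ x])
rank = np.linalg.matrix_rank(Z)
print(Z.shape[1], rank)  # number of columns versus numerical rank

# Identity check: (I - rho W) y = (I - rho W) X beta + u  versus
#                 y = rho W y + X beta - rho W X beta + u
rho, beta = 0.4, np.array([1.0, 2.0])
X = np.column_stack([ones, x])
y = rng.normal(size=n)
A = np.eye(n) - rho * W
u_filtered = A @ y - A @ X @ beta
u_expanded = y - (rho * W @ y + X @ beta - rho * W @ X @ beta)
print(np.allclose(u_filtered, u_expanded))
```

A smoothly varying covariate would not be exactly aliased in this way, but W x would then be close to x and the design matrix correspondingly ill-conditioned.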

If you explore this, it would be interesting to hear how you get on.

Best wishes,

Roger
