bivariate spatial correlation in R

Roger,

Thank you for your response.

I recognize the data is not ideal and the analysis has limitations because
of the lack of information on population displacements that might have
occurred over the years. Nonetheless, there are plenty of data + literature
showing how the spatial distribution of income classes and land use
patterns is fairly stable over time in this city, particularly for a short
timescales like in this analysis. Having said this, I believe these two
questions (1) what socioeconomic classes have gained more accessibility?
and (2) ?were wealthier areas in 2010 able to attract more changes to
accessibility?? in the end ask the same thing but with different phrasings,
though your phrasing (2) is more precise/correct.

On the more technical discussion, I see your point that spatial AND
temporal correlation in my data would make permutation bootstrap
inappropriate to generate significance levels, thus making bivariate
Moran?s I biased. Thank you very much for the clarifications! This has been
very helpful and I will have a closer look at which spatial regression
models are more appropriate for my analysis.

On a side note, do you think the function to calculate bivariate Moran?s I
is correct?  And could it be incorporated in the next version of spdep? If
so, please give credit to Rog?rio Barbosa, the researcher who proposed the
code in Stack Overflow.
Perhaps, but SO is usually ephemeral (nobody takes responsibility for 
documenting code and fixing bugs found later). I don't see any tests, 
documentation or accommodation of what spdep expects for edge cases - the 
function as it stands would need a lot of work to protect users from 
obvious blunders. There are no references to literature, nor to proven 
test cases which this implementation should match. We have an 
implementation of Lee (2001), but this is not the same, right? Which 
article gives the formal statistical development of the bivariate local 
Moran's I? Do we know that conditional simulation (permutation bootstrap) 
is valid in some cases, if so which? Is there a development of parametric 
bootstrap?

Roger
best,

Rafael HM Pereira
http://urbandemographics.blogspot.com

On Mon, Jul 31, 2017 at 10:52 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:

Rafael,

I'm sorry, but there is no way you can logically "analyze who benefits the
recent changes in the transport system in terms of access to jobs" from the
data you have.

Even if you had aggregate household income data for 2014 and 2017 (not for
2010 only), you would not know whether wealthier families had not dispaced
poorer families as accessibility improved. You need individual data, either
survey or register, preferably panel, to show that changes in accessibility
change the economic welfare of households controlling for movement of
households. The timestamps on the data make any attempt to do this very
risky; the real findings from a hypothetical surevey-based panel might be
completely different, especially if poorer households were displaced (also
indirectly, through rising house prices driven by improved accessibility).
Gauging the welfare effects of transport investments is very hard to
instrument.

The closest I could get was to map deciles of the change in access (more
negatives than positives) and compare the aspatial income distributions:

library(spdep)
library(rgdal)
map <- readOGR(dsn=".", layer="test_map")
library(classInt)
cI <- classIntervals(map$diffaccess, n=10, style="quantile")
library(RColorBrewer)
ybrpal <- brewer.pal(6, "YlOrBr")
fC <- findColours(cI, ybrpal)
qtm(map, fill="diffaccess", fill.breaks=cI$brks, format="Europe2")
map$faccess <- factor(findInterval(map$diffaccess, cI$brks,
  all.inside=TRUE), labels=names(attr(fC, "table")))
qtm(map, fill="diffaccess", fill.breaks=cI$brks, format="Europe2")
acc_income <- split(map$income, map$faccess)
do.call("rbind", lapply(acc_income, summary))
dens <- lapply(acc_income, density)
plot(1, ylab="", xlab="", type="n", xlim=c(-2000, 11000), ylim=c(0,
  0.002))
for (i in seq(along=dens)) lines(dens[[i]], col=i)
legend("topright", legend=names(dens), col=1:length(dens), lty=1, bty="n")

These density curves really do not suggest any clear relationship, other
than that some areas with increased accessibility had higher incomes in
2010.

You can examine the reverse relationship - were aggregate areas that were
more wealthy in 2010 able to attract more changes to accessibility? The
answer seems to be yes, they were able to do this:

nb <- poly2nb(map)
lw <- nb2listw(nb, style = "W", zero.policy = T)
lm.morantest(lm(diffaccess ~ I(income/1000), map), lw)
# SLX model
summary(lmSLX(diffaccess ~ I(income/1000), map, lw))
lm.morantest(lmSLX(diffaccess ~ I(income/1000), map, lw), lw)
# Spatial Durbin error model - SDEM
obj <- errorsarlm(diffaccess ~ I(income/1000), map, lw, etype="emixed")
summary(impacts(obj))
summary(impacts(lmSLX(diffaccess ~ I(income/1000), map, lw)))
LR.sarlm(lmSLX(diffaccess ~ I(income/1000), map, lw), obj)

It would be possible to run lm.morantest.sad() on the output of the SDEM
model taking global spatial autocorrelation into account. If you need that,
follow up in this thread.

Bivariate Moran's I should not be used in this case, but could be used in
other cases - use in change over time is troubling because randomisation
will not be a good guide as t=1 and t=2 are subject to temporal as well as
spatial autocorrelation, so you cannot use permutation bootstrap to find a
usable measure of significance.

Hope this clarifies, and thanks for the code.

Roger

On Sun, 30 Jul 2017, Rafael Pereira wrote:

Roger,
Population and income data are single point in time and come from the 2010
Census.

Accessibility variables in 2014 and 2017 show the proportion of jobs
accessible by public transport under 60 minutes. The variable diffaccess
shows the difference between these two. It's in percentage points
(access2017 - access2014)

best,

Rafael H M Pereira
urbandemographics.blogspot.com

On Sun, Jul 30, 2017 at 7:41 AM, Roger Bivand <Roger.Bivand at nhh.no>
wrote:

Thanks, I'll get back when able, offline now. What are the units of
observation, and are aggregate household incomes observed only once?

Roger

Roger Bivand
Norwegian School of Economics
Bergen, Norway

Fra: Rafael Pereira
Sendt: s?ndag 30. juli, 00.39
Emne: Re: [R-sig-Geo] bivariate spatial correlation in R
Kopi: Rog?rio Barbosa, r-sig-geo at r-project.org

Hi all, here is a reproducible example to calculate in R bivariate
Moran's
I and LISA clusters. This example is based on a this answer provided in
SO*
and it uses a toy model of my data. The R script and the shape file with
the data are available on this link. https://gist.github.com/
rafapereirabr/5348193abf779625f5e8c5090776a228 What this example does is
to estimate the spatial association between household income per capita
and
the gains in accessibility to jobs. The aim is to analyze who benefits
the
recent changes in the transport system in terms of access to jobs. So the
idea is not to find causal relationships, but spatial association between
areas of high/low income who had high/low gains in accessibility. The
variables in the data show info on the proportion of jobs accessible in
both years 2014 and 2017 (access2014, access2017) and the difference
between the two years in percentage points (diffaccess). Roger, I know
you
have shown to be a bit sceptical about this application of bivariate
Moran's I. Do you still think a spatial regression would be more
appropriate? Also, I would be glad to hear if others have comments on the
code. This function is not implemented in any package so it would be
great
to have some feedback. Rafael H M Pereira urbandemographics.blogspot.com
* https://stackoverflow.com/questions/45177590/map-of-
bivariate-spatial-correlation-in-r-bivariate-lisa On Wed, Jul 26, 2017
at
11:07 AM, Roger Bivand wrote: > On Wed, 26 Jul 2017, Rafael Pereira
wrote:

Roger, >> >> This example was provided only for the sake or making the

code easily >> reproducible for others and I'm more interested in how
the
bi-variate >> Moran >> could be implemented in R, but your comments are
very much welcomed and >> I've made changes to the question. >> >> My
actual case study looks at bi-variate spatial correlation between (a) >>
average household income per capita and (b) proportion of jobs in the
city

that are accessible under 60 minutes by transit. I don't think I could

use >> rates in this case but I will normalize the variables using >>
scale(data$variable). >> > > Please provide a reproducible example,
either
with a link to a data > subset, or using a builtin data set. My guess is
that you do not need > bi-variate spatial correlation at all, but rather
a
spatial regression. > > The "causal" variable would then the the
proportion
of jobs accessible > within 60 minutes by transit, though this is
extremely
blunt, and lots of > other covariates (demography, etc.) impact average
household income per > capita (per block/tract?). Since there are many
missing variables in your > specification, any spatial correlation would
be
most closely associated > with them (demography, housing costs,
education,
etc.), and the choice of > units of measurement would dominate the
outcome.

This is also why bi-variate spatial correlation is seldom a good idea,

I > believe. It can be done, but most likely shouldn't, unless it can
be >
motivated properly. > > By the way, the weighted and FDR-corrected SAD
local Moran's I p-values of > the black/white ratio for Oregon (your toy
example) did deliver the goods - > if you zoom in in mapview::mapview,
you
can see that it detects a rate > hotspot between the rivers. > > Roger >

best, >> >> Rafael H M Pereira >> >> On Mon, Jul 24, 2017 at 7:56 PM,

Roger Bivand >> wrote: >> >> On Mon, 24 Jul 2017, Rafael Pereira
wrote: >>>

Hi all, >>> >>>> >>>> I would like to ask whether some you conducted

bi-variate spatial >>>> correlation in R. >>>> >>>> I know the
bi-variate
Moran's I is not implemented in the spdep library. >>>> I left a question
on SO but also wanted to hear if anyone if the >>>> mainlist >>>> have
come
across this. >>>> https://stackoverflow.com/questions/45177590/map-of-
bivariat >>>> e-spatial-correlation-in-r-bivariate-lisa >>>> >>>> I also
know Roger Bivand has implemented the L index proposed by Lee >>>> (2001)

in spdep, but I'm not I'm not sure whether the L local correlation
coefficients can be interpreted the same way as the local Moran's I
coefficients. I couldn't find any reference commenting on this issue.

I >>>> would very much appreciate your thoughts this. >>>> >>>> >>>
In the
SO question, and in the follow-up, your presumably throw-away >>> example
makes fundamental mistakes. The code in spdep by Virgilio >>> G?mez-Rubio
is for uni- and bivariate L, and produces point values of >>> local >>>
L.
This isn't the main problem, which is rather that you are not taking >>>
account of the underlying population counts, nor shrinking any estimates

of >>> significance to accommodate population sizes. Population sizes

vary from >>> 0 >>> to 11858, with the lower quartile at 3164 and upper
5698: >>> plot(ecdf(oregon.tract$pop2000)). Should you be comparing
rates
in >>> stead? >>> These are also compositional variables (sum to pop2000,
or 1 if rates) >>> with >>> the other missing components. You would
probably be better served by >>> tools >>> examining spatial segregation,
such as for example the seg package. >>> >>> The 0 count populations
cause
problems for an unofficial alternative, the >>> black/white ratio: >>>

oregon.tract1 0,] >>> oregon.tract1$rat >> nb >> lw >> >>> which should
still be adjusted by weighting: >>> >>> lm0 >> >>> I'm not advising this,
but running localmoran.sad on this model output >>> yields SAD p-values <
0.05 after FDR correction only in contiguous tracts >>> on the Washington
state line in Portland between the Columbia and >>> Williamette rivers.
So
do look at the variables you are using before >>> rushing into things.

Hope this clarifies, >>> >>> Roger >>> >>> >>> best, >>>> >>>> Rafael

HM Pereira >>>> http://urbandemographics.blogspot.com >>>> >>>>
[[alternative HTML version deleted]] >>>> >>>>
_______________________________________________ >>>> R-sig-Geo mailing
list >>>> R-sig-Geo at r-project.org >>>> https://stat.ethz.ch/mailman/
listinfo/r-sig-geo >>>> >>>> >>>> -- >>> Roger Bivand >>> Department of
Economics, Norwegian School of Economics, >>> Helleveien 30, N-5045
Bergen,
Norway. >>> voice: +47 55 95 93 55 <+47%2055%2095%2093%2055>; e-mail:
Roger.Bivand at nhh.no >>> Editor-in-Chief of The R Journal,
https://journal.r-project.org/ >>> index.html >>>
http://orcid.org/0000-0003-2392-6140 >>> https://scholar.google.no/
citations?user=AWeghB0AAAAJ&hl=en >>> >> >> [[alternative HTML version
deleted]] >> >> _______________________________________________ >>
R-sig-Geo mailing list >> R-sig-Geo at r-project.org >>
https://stat.ethz.ch/mailman/listinfo/r-sig-geo >> > > -- > Roger Bivand

Department of Economics, Norwegian School of Economics, > Helleveien 30,

N-5045 Bergen, Norway. > voice: +47 55 95 93 55
<+47%2055%2095%2093%2055>;
e-mail: Roger.Bivand at nhh.no > Editor-in-Chief of The R Journal,
https://journal.r-project.org/index.html > http://orcid.org/0000-0003-
2392-6140 > https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

[[alternative HTML version deleted]] ______________________________
_________________
R-sig-Geo mailing list R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

bivariate spatial correlation in R

Thread (3 messages)