Skip to content

Create a Spatial Weight Matrix based on road distance

24 messages · Rolf Turner, Adrian Baddeley, Juan Pablo Carranza +4 more

#
Dear community,

Is there any way to create a spatial weight matrix based on road distance?
I am trying to use the road distance between two points instead of
euclidean distance.

I've seen that there is a package named osrm. Can anyone give some advice?

Thank you in advance.

Regards,
#
On 21/06/19 12:26 PM, Rolando Valdez wrote:

            
I don't know anything about "osrm".  Calculating "road distances" can be 
done in the spatstat package reasonably easily, if you take the trouble 
to represent your collection of roads as a "linnet" object.

Given that you have done so, suppose that your linnet object is "L" and 
that you have vectors "x" and "y" specifying the points on L (i.e. on 
your roads) between which you want to know the distances.

Do:

     X    <- lpp(data.frame(x=x,y=y),L)
     dMat <- pairdist(X)

The object "dMat" is a (symmetric) square matrix; dMat[i,j] is the 
distance between point i and point j.  (Of course the diagonal entries 
are all 0.)

If your collection of roads is specified by means of a shapefile, 
vignette("shapefiles") will tell you how to turn this collection into a 
"psp" ("planar segment pattern") object; the function (method) 
as.linnet.psp() can then be used to turn the "psp" object into a 
"linnet" object.

HTH

cheers,

Rolf Turner
#
Rather than converting an object of class 'SpatialLines' or 'SpatialLinesDataFrame' to the spatstat class 'psp' and then converting it to the spatstat class 'linnet', it is safer and more efficient to convert the SpatialLines* object directly to class linnet using as.linnet.SpatialLinesDataFrame() from the package 'maptools'.


Prof Adrian Baddeley DSc FAA

John Curtin Distinguished Professor

Department of Mathematics and Statistics

Curtin University, Perth, Western Australia
#
Dear Rolando,

The advantage of using Open Street Maps engine is that you can give the
travel option. This means you can select whether are you traveling by bike,
car or walking. The previous approach didn't consider this topic. For this,
you should have added a vector layer depending on your position of the
street you can take (or not). Open Street Maps project allow you two
options: a service in which you give your current position, the position
you want to reach and your transport method (giving you back the fastest
route). Or the option to download the engine/algorithm compile by yourself
(if I am not wrong is made in C or python) and then you can make your own
calculation at your own computer. For the first option, the package OSRM is
an interface in which send a request to the OSM web page and wait for an
answer. With this method, you can send a couple of request at the same time
but no to many (you should read the manual for this). Of course, also will
depend on whether the OSM server is down or not (or busy).

I have to say that I used some years ago this app and nowadays I know that
for some cities OSM has more streets reported than the same google maps.
Also is an open project and they let you download their data for free,
contrary to what google maps do.

All the best,


Andres

El vie., 21 jun. 2019 a las 4:30, Adrian Baddeley (<
adrian.baddeley at curtin.edu.au>) escribi?:

  
    
1 day later
#
Thank you for your answer.

I have a shapefile with, say, counties, and I got another shapefile with
the roads. ?What if a county does not intersect any road?

El jue., 20 de jun. de 2019 a la(s) 19:08, Rolf Turner (
r.turner at auckland.ac.nz) escribi?:

  
    
#
On 23/06/19 1:17 PM, Rolando Valdez wrote:

            
I am sorry, but it is not at all clear to me just what the problem is. 
How do the counties come into the picture?  You said you wanted to get 
the road distance between points on the roads.  What have the counties 
got to do with this?

Can you perhaps provide a reproducible example?

cheers,

Rolf
#
I am sorry, I was not clear enough. My goal is to calculate a spatial
weight matrix (nxn) across counties but, instead of euclidean distance, to
use road distance.

El s?b., 22 de jun. de 2019 a la(s) 19:28, Rolf Turner (
r.turner at auckland.ac.nz) escribi?:

  
    
#
On 23/06/19 2:38 PM, Rolando Valdez wrote:

            
I'm afraid I still don't understand.  To put it mildly. You presumably 
have a clear idea of what you are trying to, but those of us who are not 
involved in your research have no such idea.  We (or at least I) haven't 
a clue as to what you are talking about.

What do you mean by "spatial weight"?  What are these weights used for? 
What is n?  How are the counties involved?  Is n the number of counties? 
Are you interested in the road distance (minimum road distance?) between 
pairs of counties?

Please explain *clearly* and do not expect those who are trying to help 
you to be mind-readers!!!

cheers,

Rolf
#
Sorry again.

A Spatial Weight Matrix (swm) is an object used in spatial econometrics to
characterize the spatial structure among territories. It is an element nxn
where n is the number of territorial units (counties, districts, states,
cities, regions) in the sample and it could be based on contiguity or
distance. Usually, you can create a swm based on distance using
'dnearneigh' from spdep and then convert to a listw through 'nb2listw'. The
problem is that the matrix that you generate trough 'dnearneigh'computes
the euclidean distance among centroids of polygons. This is where I spot my
issue, I need to compute the swm using the road distance instead of
euclidean distance computed through 'dnearneigh'. I do have a shapefile
with poligons (counties) and another shapefile with lines (roads).


El s?b., 22 de jun. de 2019 a la(s) 20:00, Rolf Turner (
r.turner at auckland.ac.nz) escribi?:

  
    
#
On 23/06/19 3:30 PM, Rolando Valdez wrote:
OK.  It's getting a *bit* clearer ....  You are interested in "road 
distances" between counties.  I'm still not entirely sure what this 
means.  Is it the *minimum* distance by road from one county to another? 
In which case, if two counties are contiguous (adjacent) and there is a 
road crossing the border between the two, is the distance between the 
counties equal to zero?  (This doesn't seem like it would be 
satisfactory ....)

If this is not the case, then what *is* the case?  Perhaps you want 
distances between the *centroids* of the counties.  What then do you 
mean by road distance when the centroids do not lie on a road?

You apparently need to deal with counties in which there are no roads at 
all.  To handle this you have to define what *you* mean by the distance 
by road from county A to county B when there are no roads at all in 
county B.  Perhaps infinity would be the appropriate distance, but *I* 
don't know; you have to make the call.

Previously you indicated that you needed to know (pairwise) road 
distances between specified points in a given set, and I showed you how 
to obtain those using pairdist(), from spatstat.  Now it seems that you 
want something rather different, and it's still not clear what.

You need to get *your* thoughts clear; make some definitions and 
specifications, and decide what you really want or need.

It seems that you are expecting R to magically do your thinking for you; 
it won't!

cheers,

Rolf
#
Hi all!
I don't have the specific answer, but I don't appreciate academic bullying
either. So... here is a way you could take. Go to qgis, calculate road
distances by network analysis and then add it to your dataframe. Use this
new variable to weigth the matrix.
I hope this serves, and I hope Rolf solves what ever is bothering *him*.
Cheers

Juan Pablo Carranza
Mgter. en Administraci?n P?blica
Lic. en Econom?a

El dom., 23 de jun. de 2019 1:16 a. m., Rolf Turner <r.turner at auckland.ac.nz>
escribi?:

  
  
#
I apologize for the lack of clarity.

Let me try again:

The SWM captures the spatial structure among territories. In the case of a
matrix based on distance, you define a distance-threshold, say 50 km, and
every territory under that distance is considered as neighbor, in the
matrix, those territories considered neighbors take the value 1, and 0
otherwise (territories beyond 50 km). This is what 'dnearneigh' function
does.

Then, I want to define a distance-threshold, say 50 km by road (not
euclidean) and every territory under that distance (by road) be considered
as neighbor.

El s?b., 22 de jun. de 2019 a la(s) 21:15, Rolf Turner (
r.turner at auckland.ac.nz) escribi?:
one road, however it's not a big deal. If I define a distance of 50 km, it
doesn't matter how many times two counties are connected, I just need that
they are at 50 km trough, at least, one road.
This is a big challenge, I'm still working on it.
If two counties are not connected through a road, they could not be
neighbors. In this case, it would correspond to a value 0 in the matrix.
research fields.
I got it.

  
    
#
I don't know how exactly works that, but I will try.

Thank you.

El s?b., 22 de jun. de 2019 a la(s) 22:17, Juan Pablo Carranza (
carranzajuanp at gmail.com) escribi?:

  
    
#
Rolando, thanks for clarify.
"gdistance" may give you what you are looking for. Another option is the
package "osrm", for calculate distances using Open Street Maps routes.
Once you calculate a dataframe with the distances of every pair of
locations, you could make ones and ceros with a simple ifelse. Then you
could standarize by row. That will be your lw matrix.
Here is some help:

https://www.google.com/url?sa=t&source=web&rct=j&url=https://cran.r-project.org/web/packages/gdistance/vignettes/gdistance1.pdf&ved=2ahUKEwim8u3P-f7iAhV5GLkGHbUKBCUQFjAAegQIAxAB&usg=AOvVaw2EjdceAFDJB9Qet-EBEDAn&cshid=1561271237676


Juan Pablo Carranza
Mgter. en Administraci?n P?blica
Lic. en Econom?a

El dom., 23 de jun. de 2019 3:02 a. m., Rolando Valdez <rvaldezr at gmail.com>
escribi?:

  
  
#
On 23/06/19 6:01 PM, Rolando Valdez wrote:

            
You still have not defined what you mean by *distance* between 
territories (regions, counties). Distance between *points* is well 
defined; distance between territories is not.  You have to specify what 
you mean by such a distance.  This could be the minimum distance between 
points in the regions (which is not of course a metric), distance 
between centroids of the territories, Hausdorff distance, or something 
else.  This applies whether you are talking about the distance between 
points being Euclidean distance or road distance or some other metric. 
Thresholding that distance (e.g. at 50 km.) is then a trivial matter.

I have tried my best to get you to clarify what you mean, and my efforts 
seem to be in vain.  Since Juan Pablo thinks that I am "bullying you" 
(which mystifies me completely) I guess I'll give up.  And to respond to 
Juan Pablo's hope, nothing whatever is "bothering" me.

cheers,

Rolf
#
Comments inline below:
On Sun, 23 Jun 2019, Rolando Valdez wrote:

            
So the main goal is to use distance by road between territories to 
construct a neighbour object, with a maximum distance threshold.

Firstly, why do you suppose that this will represent your a priori 
knowledge better than contiguity, which is the most obvious measure of 
neighbourhood for data with areal support?

If there are major mountain chains or water bodies impeding contact across 
a boundary between contiguous territories, could you not just edit out 
those graph edges (spdep::edit.nb() for example)? If there are many, a 
programmatic approach may be needed.

The inherent difficulty is that distances to territories in practice mean 
that you have to change support from area to point, and that needs 
thinking through. A territory centroid may, for example, not lie on a 
transport link. If you have finer scale population or production data, you 
might use weighted centroids, but this involves further steps, and using 
this kind of data may add endogeneity, as the variables used to define the 
neighbour object may enter your model elsewhere.

So:

I) explain why you cannot simply use contiguities (using accessibility as 
a covariate if important);

II) if some contiguities given territory boundaries should be removed 
because there are good reasons (mountain range, ...) for no contact, 
provide a reproducible example so that you can be assisted in programming 
the criteria for thinning the contiguity graph; and

III) if you really want to go with road distances, provide a reproducible 
example for first finding the representative point for each territory and 
then giving the road network and territory boundaries. With the linear 
network and the change-of-support points, it is not unlikely that the 
spatstat linnet route will be easiest to implement. I have used raster 
cost distances in GRASS for things like this (see the Snow example in ch. 
4 in ASDAR); gdistance is slower in that setting.

Hope this helps,

Roger

PS. It would be really helpful to use affiliations, as gmail addresses 
tell us nothing about what might be expected of background knowledge. This 
posting feels like someone looking for a response to a reviewer of an 
article submitted to a journal, where the reviewer has little idea of the 
underlying challenges involved in making such a revision. Not knowing 
this, and the lack of a reproducible example leaves helpers pretty much in 
the dark. Using existing or toy data sets lets you post without showing 
too much of the detail in your specific problem, but in this thread there 
has been too little information for reasonable traction.

  
    
#
I mean the distance between centroids of polygons that represent counties.

Thank you for your time.

El dom., 23 de jun. de 2019 a la(s) 01:17, Rolf Turner (
r.turner at auckland.ac.nz) escribi?:

  
    
#
El dom., 23 de jun. de 2019 a la(s) 03:00, Roger Bivand (Roger.Bivand at nhh.no)
escribi?:
Yes, just like that.
I'm running the same specification with 8 distance-threshold to assess the
impact of the distance on a key dependent variable.
Yes, there are many cases like this you mentioned.
I have redefine the shapefile to urban areas, these polygons are smaller
than counties, thus, their centroids would lie on the roads more
confidently.

I am not using variables to define the neighbor object.
Because the distance is an explanatory component itself.
I've seen the example you suggest, I cannot figure out how to fit to my case
.
You are absolutely right. I am trying to follow a suggestion from a
referee, who suggests to switch from the euclidean distance to a road
distance. This is the reason I posted my doubt, I had no clue even where to
start.

Now, I have task to do.

Thanks a lot for your help.

  
    
#
Dear Andres,

I could follow an example provided with the package, it was a little bit
simple, however, I got this message:
The OSRM server returned an error:
Error: The public OSRM API does not allow results with a number of
durations
higher than 10000. Ask for fewer durations or use your own server and set
its
--max-table-size option.

My sample size is 2,457

El vie., 21 de jun. de 2019 a la(s) 02:53, Andres Diaz Loaiza (
madiazl at gmail.com) escribi?:

  
    
#
On Mon, 24 Jun 2019, Rolando Valdez wrote:

            
Were you asking for the whole nx(n-1)/2 set of durations in one run? You 
see that the API limits the number of interactions (possibly by time and 
by unique IP number). So you need to reduce the number of queries. If you 
are going to impose a distance threshold anyway, you can do that using the 
Great Circle (not Euclidean) distances between your geographical 
coordinates and dnearneigh(), and step through the nb list object with 
src= being the data frame of the observation coordinates, and dest= the 
data frame of the neighbours' coordinates.

library(sf)
nc <- st_read(system.file("shapes/sids.shp", package="spData")[1], 
quiet=TRUE)
st_crs(nc) <- "+proj=longlat +datum=NAD27"
nc1 <- st_transform(nc, 32019)
nc2 <- st_centroid(nc1, of_largest_polygon=TRUE)
nc3 <- st_transform(nc2, 4326)
crds <- st_coordinates(st_geometry(nc3))
df <- data.frame(id=nc3$FIPSNO, long=crds[,1], lat=crds[,2])
library(spdep)
nb <- dnearneigh(crds, 0, 50, longlat=TRUE)
library(osrm)
res <- vector(mode="list", length=nrow(df))
for (i in seq(along=res)) res[[i]] <- osrmTable(src=df[i,],
  dst=df[nb[[i]],], measure = "duration")
res1 <- lapply(res, function(x) 10/x$duration)
lw <- nb2listw(nb, glist=res1, style="B")

gives general spatial weights based on 10/# minutes travel time between 
county centroids (centroids calculated from projected coordinates), for 
county centroids closer than 50 km measured by Great Circle.

In your case, you may need to split the for() loop into portions of 
cumsum(card(nb)) of less than the limit. If durations are symmetric, 
you could also halve the query count by taking only neighbour ids > i, 
but you'd have to fold them back afterwards. You also wanted to categorise 
the durations into 0,1, which you could do with lapply() instead of using 
inverse durations.

Hope this helps,

Roger

  
    
#
Dear Rolando,


As previous members of the list asked you, it will be nice if you can
provide a reproducible example. Something like Roger has just done (with
command lines), but including your original data. In that way is always
easier to provide help.

 With the command: osrmTable, you are going to obtain the travel time
between points of your domain. In this case, I assume are the centroids of
your polygons. But no distances (what you were talking about). Since the
OSRM is only an interface, it only allows you to perform some few
calculations (since the algorithm to perform these calculations is on the
same web page). That is probably the reason why you are having this error
(I already warn you about this). If you have massive calculations, then I
will recommend you to download the app directly and run it on your
computer.

To get the distance in the manual says you should put as an argument the
distance, something like:

dists <- osrmTable(loc = muns13, measure = "distance")

You have to think this app is an emulator of google maps, that can give you
the shortest path and travel times according to different travel options,
although their primary goal is to cartography all
roads/ways/channels/railways etc., and make them freely available. please
visit
http://project-osrm.org/  (for the engine) and
https://www.openstreetmap.org/
<https://www.openstreetmap.org/#map=12/52.1710/5.2956> for the cartography.

 Andres D.
TU-Delft
Associate Researcher

El lun., 24 jun. 2019 a las 9:58, Roger Bivand (<Roger.Bivand at nhh.no>)
escribi?:

  
    
2 days later
#
Dear Prof. Roger, many thanks for your help, I am working on.

I have a doubt: Which is the differece between Great Circle and Euclidean
Distance?

Thanks

El lun., 24 de jun. de 2019 a la(s) 00:58, Roger Bivand (Roger.Bivand at nhh.no)
escribi?:

  
    
#
Hi, Rolando:

A Great Circle distance is the distance measured with un-projected 
coordinates, or what we normally called the longitude and latitude 
coordinates (measured in decimal degrees). These coordinates are 
essentially Spherical coordinates recorded on the surface of a Sphere 
(or sphere-like object, the Spheroid/Ellipsoid, which is basically our 
Earth). When measuring distance using these coordinates, you must use 
the Great Circle (the Circle that centered on the center of the Earth) 
that crosses the two points and calculate the distance on the Great 
Circle, which is calculated as (assuming the Earth is a perfect Sphere):

[ (angle between the two lines from the center of the Earth to the two 
points on the surface of the Earth)/360 ] * 2*Pai* (Radius of the Earth)

If, however, the coordinates are projected to a planar system (a 
Cartesian coordinate system), we can then use the normal Euclidean 
distance measure to calculate the distance between two points recorded 
in planar coordinate system (such as the UTM, State Plane used in the 
US, and many others), which is calculated simply as:

sqrt((difference between the xs)^2 + (difference between the ys)^2).

Hope this helps.

Best,

Danlin
On 6/27/2019 2:44 AM, Rolando Valdez wrote:
#
I see. It is very interesting.

Thank you for your answer.

El El jue, 27 de junio de 2019 a la(s) 3:08, Danlin Yu <
yud at mail.montclair.edu> escribi?:
Rol~