I use the following code from maptools and spdep to create a neighbors list
of class nb from a shapefile:
NMmap <- read.shape("CountyLevelData.shp")
NMpolylist <- Map2poly(NMmap,region.id=NULL,quiet=TRUE)
NMnb <- poly2nb(NMpolylist,queen=FALSE)
print(is.symmetric.nb(NMnb))
I can then use write.nb.gal to output the neighbors list to a txt file in
GAL format:
write.nb.gal(NMnb,file="E://zweightsGAL_county.txt",oldstyle=FALSE,shpfile=NULL,ind=NULL)
I understand how to read the GAL format. What is the origin of the id's
listed in the txt file (in this case the rweightsGAL_county.txt file). They
look like they are the values from the FID field + 1 in the original shape
file.
I want to use the information from the GAL format .txt file to create a
spatial weights matrix file of the form I can use in ArcGIS on the same
shapefile the data came from in the first place. (For instance, compute
Moran's I using spatial statistics extension in ArcGIS using the binary
weights connectivity matrix.) How do I know which field from the shapefile
is being used in the construction of the neighbors list for the class nb
and then being written to the txt file using write.nb.gal?
Larry Layne
ljlayne at unm.edu
polygon identifiers and nb classes
8 messages · Larry Layne, Roger Bivand
On Thu, 13 Apr 2006, Larry Layne wrote:
I use the following code from maptools and spdep to create a neighbors list
of class nb from a shapefile:
NMmap <- read.shape("CountyLevelData.shp")
NMpolylist <- Map2poly(NMmap,region.id=NULL,quiet=TRUE)
NMnb <- poly2nb(NMpolylist,queen=FALSE)
print(is.symmetric.nb(NMnb))
I can then use write.nb.gal to output the neighbors list to a txt file in
GAL format:
write.nb.gal(NMnb,file="E://zweightsGAL_county.txt",oldstyle=FALSE,shpfile=NULL,ind=NULL)
I understand how to read the GAL format. What is the origin of the id's
listed in the txt file (in this case the rweightsGAL_county.txt file). They
look like they are the values from the FID field + 1 in the original shape
file.
Yes, if no row.names= argument is given to poly2nb(), this is what they will default to. You can choose your own row.names= in poly2nb(), and setting oldstyle=FALSE, it should work. You'll need to edit the first line, the ArcGIS format is not GAL as far as I understand. The col.gal.nb object in data(columbus) has region.id attributes set in this way:
summary(col.gal.nb)
Neighbour list object: Number of regions: 49 Number of nonzero links: 230 Percentage nonzero weights: 9.579342 Average number of links: 4.693878 Link number distribution: 2 3 4 5 6 7 8 9 10 7 7 13 4 9 6 1 1 1 7 least connected regions: 1005 1008 1045 1047 1049 1048 1015 with 2 links 1 most connected region: 1017 with 10 links
tmpfl <- tempfile()
write.nb.gal(col.gal.nb, tmpfl, oldstyle=FALSE)
system(paste("cat", tmpfl))
0 49 unknown unknown 1005 2 1001 1006 1001 3 1005 1006 1002 ... so then you'd need to change the header to name the DBF field with the IDs correctly. For cross-checking in GeoDa, please set the shapefile= and ind= arguments to the appropriate values. A revision of write.nb.gal() is obviously possible here, I would value feedback. Please also crosscheck whether the variance of Moran's I agrees with moran.test here in the randomisation=TRUE row-standardised weights case - it is known not to agree for fixed range and inverse distance weights in ArcGIS, the results here (and in Stata) are consistent, but those in ArcGIS are not. It will be interesting to hear what you find! Roger
I want to use the information from the GAL format .txt file to create a spatial weights matrix file of the form I can use in ArcGIS on the same shapefile the data came from in the first place. (For instance, compute Moran's I using spatial statistics extension in ArcGIS using the binary weights connectivity matrix.) How do I know which field from the shapefile is being used in the construction of the neighbors list for the class nb and then being written to the txt file using write.nb.gal? Larry Layne ljlayne at unm.edu
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no
Yes, if no row.names= argument is given to poly2nb(), this is what they will default to.
FID + 1 is what I assume you to mean.
You can choose your own row.names= in poly2nb(),
Which I did here: NMnb <- poly2nb(NMpolylist,row.names="OBJECTID",queen=FALSE) Now I know for sure which values are being used as polygon ids to define the neighborhood list. (FYI, the shapefiles originally came from ArcGIS geodatabase polygon feature classes and is the reason the field OBJECTID exists in the shapefile DBF.)
and setting oldstyle=FALSE, it should work.
...oldstyle=FALSE in write.nb.gal. Actually this doesn't matter. oldstyle can be either TRUE or FALSE - see below.
You'll need to edit the first line, the ArcGIS format is not GAL as far as I understand.
More than just the first line. As I understand the spatial weights matrix file structure for ArcGIS there needs to be a complete rearrangement of the GAL file. For instance, given the first 5 lines of a GAL file: 0 49 unknown unknown 1 4 8 9 23 41 2 4 4 27 30 43 the information needs to be rearranged for an ArcGIS spatial weights file as follows for a binary weights connectivity matrix (the first line of the GAL file is discarded): OBJECTID 1 8 1 1 9 1 1 23 1 1 41 1 2 4 1 2 27 1 2 30 1 2 43 1 The first line in the file is the name of the field in the DBF file used as the polygon label ids. (I am using the work "line" to imply record, row, or observation - take your pick.) This was specified above in poly2nb(). Following the field name in the first line, each line in the text file defines a weight for pairwise connections between polygons. Because polygon #1 shares a boundary with 4 other polygons, the weights defining each of these connections requires 4 lines in the ArcGIS spatial weights matrix file. The same is true for polygon #2, etc. This is what I understand the structure of the file to be for an ArcGIS spatial weights matrix file. I convert the GAL structure to the ArcGIS spatial weights matrix file using a SAS program. Any number of programming languages could be used to do the same thing.
For cross-checking in GeoDa, please set the shapefile= and ind= arguments to the appropriate values.
I haven't used GeoDa to date. Maybe I should in the future? Maybe.
A revision of write.nb.gal() is obviously possible here, I would value feedback.
Once I figured out how to read the GAL structure I could convert it to any other structure I wanted using SAS program code. For convenience, it might be nice to have a function like write.nb.gal() but more general, like write.nb() and allow the user to specify a parameter defining the structure of the output. For instance, certainly one of the parameter settings would be to write a GAL structure to a text file. Another might be to write one for ArcGIS like that shown above. (This would also require defining the weights in the output file, such as style = W, B, C, U, S.) I have a bunch of SAS programs that use a neighbors structure like the following (using the same poly ids for the GAL example from above): 1 8 9 23 41 2 4 27 30 43 Structures for other popular spatial stats software? Might be useful to different users.
Please also crosscheck whether the variance of Moran's I agrees with moran.test here in the randomisation=TRUE row-standardised weights case - it is known not to agree for fixed range and inverse distance weights in ArcGIS, the results here (and in Stata) are consistent, but those in ArcGIS are not. It will be interesting to hear what you find!
I plan to compare all of my results from ArcGIS spatial statistics extension to those from spdep as it would be convenient to do a few more things in ArcGIS than in spdep, and I would like to have some confidence in the ArcGIS results. Generally, my experience with statistics anything in ArcGIS is that its output always needs to be verified using some other established software because ArcGIS is notorious for giving different results. The one good thing about most of the spatial statistics extension tools is that I can look at the algorithms they are implementing as most of these tools are included as Python scripts. Anyway, I would be glad to send you those comparisons right now but cannot because these particular tools in the spatial statistics extension have suddenly decided to not work on the computer I have the data on! What else could a person expect out of such a high-quality software product? I'll send the ArcGIS comparison results when I can. Again, thanks for all your help. Larry
On Thu, 13 Apr 2006, Larry Layne wrote:
Yes, if no row.names= argument is given to poly2nb(), this is what they will default to.
FID + 1 is what I assume you to mean.
You can choose your own row.names= in poly2nb(),
Which I did here: NMnb <- poly2nb(NMpolylist,row.names="OBJECTID",queen=FALSE) Now I know for sure which values are being used as polygon ids to define the neighborhood list. (FYI, the shapefiles originally came from ArcGIS geodatabase polygon feature classes and is the reason the field OBJECTID exists in the shapefile DBF.)
and setting oldstyle=FALSE, it should work.
...oldstyle=FALSE in write.nb.gal. Actually this doesn't matter. oldstyle can be either TRUE or FALSE - see below.
You'll need to edit the first line, the ArcGIS format is not GAL as far as I understand.
More than just the first line. As I understand the spatial weights matrix file structure for ArcGIS there needs to be a complete rearrangement of the GAL file. For instance, given the first 5 lines of a GAL file: 0 49 unknown unknown 1 4 8 9 23 41 2 4 4 27 30 43 the information needs to be rearranged for an ArcGIS spatial weights file as follows for a binary weights connectivity matrix (the first line of the GAL file is discarded):
Right. Please look at write.sn2gwt() instead - it will need modification to handle the IDs. I'll start looking at it - listw2sn() does carry through a region.id attribute, so changing the output function to substitute the IDs for the indices shouldn't be hard. Roger
OBJECTID 1 8 1 1 9 1 1 23 1 1 41 1 2 4 1 2 27 1 2 30 1 2 43 1 The first line in the file is the name of the field in the DBF file used as the polygon label ids. (I am using the work "line" to imply record, row, or observation - take your pick.) This was specified above in poly2nb(). Following the field name in the first line, each line in the text file defines a weight for pairwise connections between polygons. Because polygon #1 shares a boundary with 4 other polygons, the weights defining each of these connections requires 4 lines in the ArcGIS spatial weights matrix file. The same is true for polygon #2, etc. This is what I understand the structure of the file to be for an ArcGIS spatial weights matrix file. I convert the GAL structure to the ArcGIS spatial weights matrix file using a SAS program. Any number of programming languages could be used to do the same thing.
For cross-checking in GeoDa, please set the shapefile= and ind= arguments to the appropriate values.
I haven't used GeoDa to date. Maybe I should in the future? Maybe.
A revision of write.nb.gal() is obviously possible here, I would value feedback.
Once I figured out how to read the GAL structure I could convert it to any other structure I wanted using SAS program code. For convenience, it might be nice to have a function like write.nb.gal() but more general, like write.nb() and allow the user to specify a parameter defining the structure of the output. For instance, certainly one of the parameter settings would be to write a GAL structure to a text file. Another might be to write one for ArcGIS like that shown above. (This would also require defining the weights in the output file, such as style = W, B, C, U, S.) I have a bunch of SAS programs that use a neighbors structure like the following (using the same poly ids for the GAL example from above): 1 8 9 23 41 2 4 27 30 43 Structures for other popular spatial stats software? Might be useful to different users.
Please also crosscheck whether the variance of Moran's I agrees with moran.test here in the randomisation=TRUE row-standardised weights case - it is known not to agree for fixed range and inverse distance weights in ArcGIS, the results here (and in Stata) are consistent, but those in ArcGIS are not. It will be interesting to hear what you find!
I plan to compare all of my results from ArcGIS spatial statistics extension to those from spdep as it would be convenient to do a few more things in ArcGIS than in spdep, and I would like to have some confidence in the ArcGIS results. Generally, my experience with statistics anything in ArcGIS is that its output always needs to be verified using some other established software because ArcGIS is notorious for giving different results. The one good thing about most of the spatial statistics extension tools is that I can look at the algorithms they are implementing as most of these tools are included as Python scripts. Anyway, I would be glad to send you those comparisons right now but cannot because these particular tools in the spatial statistics extension have suddenly decided to not work on the computer I have the data on! What else could a person expect out of such a high-quality software product? I'll send the ArcGIS comparison results when I can. Again, thanks for all your help. Larry
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no
Right. Please look at write.sn2gwt()...
It is also probably a good idea to start using spdep 0.3-22 to take advantage of some of the new functions, which I have done. Larry
On Thu, 13 Apr 2006, Larry Layne wrote:
Right. Please look at write.sn2gwt()...
Here is a first cut, which seems to emit a suitable file. If you can try
it and/or send back an Arc command line say for COLUMBUS.SHP CRIME and
NEIGNO as index in GeoDa, I would be grateful:
write.sn2Arc <- function(sn, file, field=NULL) {
if(!inherits(sn, "spatial.neighbour"))
stop("not a spatial.neighbour object")
if (is.null(field)) stop("field must be given")
n <- attr(sn, "n")
if (n < 1) stop("non-positive number of entities")
nms <- as.character(attr(sn, "region.id"))
sn[,1] <- nms[sn[,1]]
sn[,2] <- nms[sn[,2]]
con <- file(file, open="w")
writeLines(field, con)
write.table(as.data.frame(sn), file=con, append=TRUE,
row.names=FALSE, col.names=FALSE, quote=FALSE)
close(con)
}
I think it may fall over if the region.id values include white space or
are non-unique.
sn is a spatial neighbours object as returned by listw2sn(), so in your
case it would be:
write.sn2Arc(listw2sn(nb2listw(my_nb, style="B")), filename, fieldname)
Roger
It is also probably a good idea to start using spdep 0.3-22 to take advantage of some of the new functions, which I have done. Larry
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no
4 days later
Hello Roger,
Here is a first cut, which seems to emit a suitable file. If you can try it and/or send back an Arc command line say for COLUMBUS.SHP CRIME and NEIGNO as index in GeoDa, I would be grateful:
I'm not exactly sure what "as index in GeoDa" means but I have successfully
tested your code on 2 different data sets: my own and the columbus.shp
shapefile. Here is the full code to create the text file of spatial weights
in the form that ArcGIS wants them. The columbus shapefile is used as the
example file here:
library(maptools)
library(maps)
library(spdep)
setwd("E:\\ShapeFiles")
getwd()
COLUMBUSmap <- read.shape("columbus.shp")
COLUMBUSpolylist <- Map2poly(COLUMBUSmap,region.id=NULL,quiet=TRUE)
COLUMBUSnb <- poly2nb(COLUMBUSpolylist,row.names="POLYID",queen=FALSE)
print(is.symmetric.nb(COLUMBUSnb))
write.sn2Arc(listw2sn(nb2listw(COLUMBUSnb,style="B")),file="E://Shapefiles//zweights_test.txt",field="POLYID")
The tests were successful using style="B" or style="W".
Following are 2 Arc command lines, one for CRIME and the other for NEIGNO:
SpatialAutocorrelation columbus CRIME false "Get Spatial Weights From File"
"Euclidean Distance" None 0 E:\Shapefiles\zweights_test.txt 0 0
SpatialAutocorrelation columbus NEIGNO false "Get Spatial Weights From
File" "Euclidean Distance" None 0 E:\Shapefiles\zweights_test.txt 0 0
Thanks for the effort you put into this.
Larry
On Wed, 19 Apr 2006, Larry Layne wrote:
Hello Roger,
Here is a first cut, which seems to emit a suitable file. If you can try it and/or send back an Arc command line say for COLUMBUS.SHP CRIME and NEIGNO as index in GeoDa, I would be grateful:
I'm not exactly sure what "as index in GeoDa" means but I have successfully
tested your code on 2 different data sets: my own and the columbus.shp
shapefile. Here is the full code to create the text file of spatial weights
in the form that ArcGIS wants them. The columbus shapefile is used as the
example file here:
library(maptools)
library(maps)
library(spdep)
setwd("E:\\ShapeFiles")
getwd()
COLUMBUSmap <- read.shape("columbus.shp")
COLUMBUSpolylist <- Map2poly(COLUMBUSmap,region.id=NULL,quiet=TRUE)
COLUMBUSnb <- poly2nb(COLUMBUSpolylist,row.names="POLYID",queen=FALSE)
print(is.symmetric.nb(COLUMBUSnb))
write.sn2Arc(listw2sn(nb2listw(COLUMBUSnb,style="B")),file="E://Shapefiles//zweights_test.txt",field="POLYID")
The tests were successful using style="B" or style="W".
Following are 2 Arc command lines, one for CRIME and the other for NEIGNO:
SpatialAutocorrelation columbus CRIME false "Get Spatial Weights From File"
"Euclidean Distance" None 0 E:\Shapefiles\zweights_test.txt 0 0
SpatialAutocorrelation columbus NEIGNO false "Get Spatial Weights From
File" "Euclidean Distance" None 0 E:\Shapefiles\zweights_test.txt 0 0
Thanks, useful! The function will be in the next spdep release. Roger
Thanks for the effort you put into this. Larry
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no