Skip to content

polygon identifiers and nb classes

8 messages · Larry Layne, Roger Bivand

#
I use the following code from maptools and spdep to create a neighbors list 
of class nb from a shapefile:

NMmap <- read.shape("CountyLevelData.shp")
NMpolylist <- Map2poly(NMmap,region.id=NULL,quiet=TRUE)
NMnb <- poly2nb(NMpolylist,queen=FALSE)
print(is.symmetric.nb(NMnb))

I can then use write.nb.gal to output the neighbors list to a txt file in 
GAL format:

write.nb.gal(NMnb,file="E://zweightsGAL_county.txt",oldstyle=FALSE,shpfile=NULL,ind=NULL)

I understand how to read the GAL format. What is the origin of the id's 
listed in the txt file (in this case the rweightsGAL_county.txt file). They 
look like they are the values from the FID field + 1 in the original shape 
file.

I want to use the information from the GAL format .txt file to create a 
spatial weights matrix file of the form I can use in ArcGIS on the same 
shapefile the data came from in the first place. (For instance, compute 
Moran's I using spatial statistics extension in ArcGIS using the binary 
weights connectivity matrix.) How do I know which field from the shapefile 
is being used in the construction of the neighbors list for the class nb 
and then being written to the txt file using write.nb.gal?

Larry Layne
ljlayne at unm.edu
#
On Thu, 13 Apr 2006, Larry Layne wrote:

            
Yes, if no row.names= argument is given to poly2nb(), this is what they 
will default to.

You can choose your own row.names= in poly2nb(), and setting 
oldstyle=FALSE, it should work. You'll need to edit the first line, the 
ArcGIS format is not GAL as far as I understand.

The col.gal.nb object in data(columbus) has region.id attributes set in 
this way:
Neighbour list object:
Number of regions: 49 
Number of nonzero links: 230 
Percentage nonzero weights: 9.579342 
Average number of links: 4.693878 
Link number distribution:

 2  3  4  5  6  7  8  9 10 
 7  7 13  4  9  6  1  1  1 
7 least connected regions:
1005 1008 1045 1047 1049 1048 1015 with 2 links
1 most connected region:
1017 with 10 links
0 49 unknown unknown
1005 2
1001 1006
1001 3
1005 1006 1002
...

so then you'd need to change the header to name the DBF field with the IDs 
correctly. For cross-checking in GeoDa, please set the shapefile= and ind= 
arguments to the appropriate values.

A revision of write.nb.gal() is obviously possible here, I 
would value feedback. Please also crosscheck whether the variance of 
Moran's I agrees with moran.test here in the randomisation=TRUE 
row-standardised weights case - it is known not to agree for fixed range 
and inverse distance weights in ArcGIS, the results here (and in Stata) 
are consistent, but those in ArcGIS are not.

It will be interesting to hear what you find!

Roger

  
    
#
FID + 1 is what I assume you to mean.
Which I did here:

NMnb <- poly2nb(NMpolylist,row.names="OBJECTID",queen=FALSE)

Now I know for sure which values are being used as polygon ids to define 
the neighborhood list. (FYI, the shapefiles originally came from ArcGIS 
geodatabase polygon feature classes and is the reason the field OBJECTID 
exists in the shapefile DBF.)
...oldstyle=FALSE in write.nb.gal. Actually this doesn't matter. oldstyle 
can be either TRUE or FALSE - see below.
More than just the first line. As I understand the spatial weights matrix 
file structure for ArcGIS there needs to be a complete rearrangement of the 
GAL file. For instance, given the first 5 lines of a GAL file:

0 49 unknown unknown
1 4
8 9 23 41
2 4
4 27 30 43

the information needs to be rearranged for an ArcGIS spatial weights file 
as follows for a binary weights connectivity matrix (the first line of the 
GAL file is discarded):

OBJECTID
1 8  1
1 9  1
1 23 1
1 41 1
2 4  1
2 27 1
2 30 1
2 43 1

The first line in the file is the name of the field in the DBF file used as 
the polygon label ids. (I am using the work "line" to imply record, row, or 
observation - take your pick.) This was specified above in poly2nb(). 
Following the field name in the first line, each line in the text file 
defines a weight for pairwise connections between polygons. Because polygon 
#1 shares a boundary with 4 other polygons, the weights defining each of 
these connections requires 4 lines in the ArcGIS spatial weights matrix 
file. The same is true for polygon #2, etc. This is what I understand the 
structure of the file to be for an ArcGIS spatial weights matrix file. I 
convert the GAL structure to the ArcGIS spatial weights matrix file using a 
SAS program. Any number of programming languages could be used to do the 
same thing.
I haven't used GeoDa to date. Maybe I should in the future? Maybe.
Once I figured out how to read the GAL structure I could convert it to any 
other structure I wanted using SAS program code. For convenience, it might 
be nice to have a function like write.nb.gal() but more general, like 
write.nb() and allow the user to specify a parameter defining the structure 
of the output. For instance, certainly one of the parameter settings would 
be to write a GAL structure to a text file. Another might be to write one 
for ArcGIS like that shown above. (This would also require defining the 
weights in the output file, such as style = W, B, C, U, S.) I have a bunch 
of SAS programs that use a neighbors structure like the following (using 
the same poly ids for the GAL example from above):

1 8 9 23 41
2 4 27 30 43

Structures for other popular spatial stats software? Might be useful to 
different users.
I plan to compare all of my results from ArcGIS spatial statistics 
extension to those from spdep as it would be convenient to do a few more 
things in ArcGIS than in spdep, and I would like to have some confidence in 
the ArcGIS results. Generally, my experience with statistics anything in 
ArcGIS is that its output always needs to be verified using some other 
established software because ArcGIS is notorious for giving different 
results. The one good thing about most of the spatial statistics extension 
tools is that I can look at the algorithms they are implementing as most of 
these tools are included as Python scripts. Anyway, I would be glad to send 
you those comparisons right now but cannot because these particular tools 
in the spatial statistics extension have suddenly decided to not work on 
the computer I have the data on! What else could a person expect out of 
such a high-quality software product? I'll send the ArcGIS comparison 
results when I can.

Again, thanks for all your help.

Larry
#
On Thu, 13 Apr 2006, Larry Layne wrote:

            
Right. Please look at write.sn2gwt() instead - it will need modification
to handle the IDs. I'll start looking at it - listw2sn() does carry 
through a region.id attribute, so changing the output function to 
substitute the IDs for the indices shouldn't be hard.

Roger

  
    
#
It is also probably a good idea to start using spdep 0.3-22 to take 
advantage of some of the new functions, which I have done.

Larry
#
On Thu, 13 Apr 2006, Larry Layne wrote:

            
Here is a first cut, which seems to emit a suitable file. If you can try 
it and/or send back an Arc command line say for COLUMBUS.SHP CRIME and 
NEIGNO as index in GeoDa, I would be grateful:

write.sn2Arc <- function(sn, file, field=NULL) {
	if(!inherits(sn, "spatial.neighbour")) 
	    stop("not a spatial.neighbour object")
	if (is.null(field)) stop("field must be given")
	n <- attr(sn, "n")
	if (n < 1) stop("non-positive number of entities")
	nms <- as.character(attr(sn, "region.id"))
	sn[,1] <- nms[sn[,1]]
	sn[,2] <- nms[sn[,2]]
	con <- file(file, open="w")
	writeLines(field, con)
	write.table(as.data.frame(sn), file=con, append=TRUE,
		row.names=FALSE, col.names=FALSE, quote=FALSE)
	close(con)
}

I think it may fall over if the region.id values include white space or 
are non-unique.

sn is a spatial neighbours object as returned by listw2sn(), so in your 
case it would be:

write.sn2Arc(listw2sn(nb2listw(my_nb, style="B")), filename, fieldname)

Roger

  
    
4 days later
#
Hello Roger,
I'm not exactly sure what "as index in GeoDa" means but I have successfully 
tested your code on 2 different data sets: my own and the columbus.shp 
shapefile. Here is the full code to create the text file of spatial weights 
in the form that ArcGIS wants them. The columbus shapefile is used as the 
example file here:

library(maptools)
library(maps)
library(spdep)

setwd("E:\\ShapeFiles")
getwd()

COLUMBUSmap <- read.shape("columbus.shp")
COLUMBUSpolylist <- Map2poly(COLUMBUSmap,region.id=NULL,quiet=TRUE)
COLUMBUSnb <- poly2nb(COLUMBUSpolylist,row.names="POLYID",queen=FALSE)
print(is.symmetric.nb(COLUMBUSnb))

write.sn2Arc(listw2sn(nb2listw(COLUMBUSnb,style="B")),file="E://Shapefiles//zweights_test.txt",field="POLYID")

The tests were successful using style="B" or style="W".
Following are 2 Arc command lines, one for CRIME and the other for NEIGNO:

SpatialAutocorrelation columbus CRIME false "Get Spatial Weights From File" 
"Euclidean Distance" None 0 E:\Shapefiles\zweights_test.txt 0 0

SpatialAutocorrelation columbus NEIGNO false "Get Spatial Weights From 
File" "Euclidean Distance" None 0 E:\Shapefiles\zweights_test.txt 0 0

Thanks for the effort you put into this.

Larry
#
On Wed, 19 Apr 2006, Larry Layne wrote:

            
Thanks, useful! The function will be in the next spdep release.

Roger