Considers offset term in binary spatial logit/probit model

6 messages · seyatranspo at yahoo.co.jp, James Rooney, Roger Bivand

1 day later
#
On Fri, 4 Jul 2014, seyatranspo at yahoo.co.jp wrote:

            
I suggest sending a copy of this question to the authors of the package 
with a motivating example. I do note, however, that using the formula 
interface to sarprobit() permits for example:

sarprobit.fit <- sarprobit(y ~ X[,3] + offset(1*X[,2]), W, ndraw=1000,
   thinning=1, prior=NULL)

using the example on the function help page. See ?formula for the offset() 
term.
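
As a rough illustration of what offset() does in a formula, here is a minimal sketch that uses base R's glm() with a probit link instead of sarprobit(), on simulated data. The variable names (x1, x2, y) and the data are made up for this example; whether sarprobit() treats the offset in exactly the same way is best confirmed with the package authors.

```r
# Simulated data, purely for illustration (not from the sarprobit help page).
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- as.integer(x1 + x2 + rnorm(n) > 0)

# offset(1 * x2) enters the linear predictor with its coefficient fixed at 1,
# so no coefficient is estimated for x2.
fit <- glm(y ~ x1 + offset(1 * x2), family = binomial(link = "probit"))

# Only the intercept and x1 appear among the estimated coefficients.
print(coef(fit))
```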

Hope this helps,

Roger

  
    
#
Hi all,

I am using writeOGR to save ESRI shapefiles after a ton of processing. The typical warning message I get is as follows:
Warning message:
In writeOGR(ED, "/Users/jrooney/Documents/My Projects/ALS Spatial - Advanced/Irish Data/Processed Data/",  :
  Field names abbreviated for ESRI Shapefile driver

Now it has been doing this all along. But lately it has changed. For example - I frequently use the column id "GEOGID" as I've inherited it somewhere along the line from some dataset I got somewhere.
Until recently it was happily saving this column name in its entirety. Lately, for some mysterious reason, it is not saving it fully and only saving 5 letters - "GEOGI". This, as you can imagine, is causing me troubles.

Anyhow - why are field names so short for ESRI shapefiles and why does it sometimes accept 6 letter fields and other times not? It would be great if it would allow longer field names, as 5 letters is not a lot when you have over 100 variables - it gets confusing even when you use codes.

Many thanks,
James
#
On Sun, 6 Jul 2014, James Rooney wrote:

            
It would have helped if you gave the output of sessionInfo() - including 
the version of rgdal, and the startup messages on loading rgdal, as rgdal 
version, underlying GDAL version, and platform may make a difference. In 
addition, it would help to know how you installed rgdal.

The warning and field-name abbreviation occur in R/ogr_write.R after line 
116, and have been present since mid-November 2011. This is based on:

http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrsf_frmts/shape/drv_shapefile.html

after line 120, describing what GDAL does to field names, and trying to 
pre-empt this in R. As the OGR driver only supports names <= 10 characters 
long, the R code uses up to two passes of abbreviate() to shorten field 
names in data to be exported, first trying to get to minlength=7 
characters, and if this is insufficient, to minlength=5. Your case 
suggests that you now have so many similar field names over 10 characters 
long that abbreviate() no longer succeeds with a wish of minlength=7, so 
goes to minlength=5 - this would explain the change in behaviour.
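
The two-pass logic described above can be sketched roughly as follows. This is a hypothetical re-creation using only base R, not the code in R/ogr_write.R; the function name two_pass_abbreviate() and the example field names are invented for illustration. Because abbreviate() is applied to the whole vector of names, a fall-back to minlength=5 would also shorten a 6-character name such as "GEOGID".

```r
# Hypothetical sketch of the two-pass shortening described in the text;
# the real rgdal code may differ in detail.
two_pass_abbreviate <- function(nms, limit = 10L) {
  # Nothing to do if every name already fits the DBF limit.
  if (all(nchar(nms) <= limit)) return(nms)
  # First pass: ask for at least 7 characters; abbreviate() may return
  # longer strings when that is needed to keep the names unique.
  out <- abbreviate(nms, minlength = 7L, named = FALSE)
  # Second pass: if some name is still too long, retry on the original
  # names with a more aggressive target of 5 characters.
  if (any(nchar(out) > limit)) {
    out <- abbreviate(nms, minlength = 5L, named = FALSE)
  }
  out
}

flds  <- c("GEOGID", "population_density", "household_income")
short <- two_pass_abbreviate(flds)
print(data.frame(original = flds, written = short))
```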

Testing each name separately rather than using abbreviate() on the vector 
of names would incur the further cost of checking for uniqueness - the 
function preserves uniqueness when strict=FALSE.

The reason for the difficulty is that the underlying DBF format cannot be 
relied on to support field names > 10 characters long, so the OGR driver, 
and writeOGR(), are obliged to protect users from arbitrary shortening, 
which could lead to multiple fields having the same name. Some 
applications may permit longer names, but others do not, and those are the 
ones that set the limit.

The only reliable resolution is to give your own variable/field names that 
are all <= 10 characters long if you use the ESRI Shapefile driver for 
exporting objects. You can handle this in scripts yourself by looking at 
nchar(names(<obj>)), and abbreviating uniquely any that are longer.
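
A minimal sketch of such a check, using only base R: the helper name fix_field_names() and the example names are hypothetical, and the function works on a plain character vector so that it can be applied to names(<obj>) before calling writeOGR(). Only names longer than 10 characters are touched, and the function stops rather than guessing if the result is not unique or is still too long.

```r
# Hypothetical helper: shorten only the names that exceed the DBF limit,
# leaving names that already fit (such as "GEOGID") untouched.
fix_field_names <- function(nms, max_len = 10L) {
  too_long <- nchar(nms) > max_len
  out <- nms
  out[too_long] <- abbreviate(nms[too_long], minlength = max_len,
                              named = FALSE)
  # abbreviate() keeps the shortened names distinct from each other, but
  # they could still clash with an untouched short name, or stay too long.
  if (anyDuplicated(out) > 0L || any(nchar(out) > max_len)) {
    stop("could not shorten names uniquely; rename those fields by hand")
  }
  out
}

nms <- c("GEOGID", "population_density", "household_income", "AREA")
new_nms <- fix_field_names(nms)
print(data.frame(original = nms, shortened = new_nms))
```

Assigning the result back with names(<obj>) <- fix_field_names(names(<obj>)) before export keeps the choice of names in your own hands instead of leaving it to the driver.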

Note that there are also substantial encoding challenges in the DBF format, 
although I don't think that this is affecting nchar() here - keeping 
to ASCII (always single byte) in field names with an archaic format like 
DBF may be judicious.
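
To illustrate the single-byte point, here is a small sketch in base R. The example name "Straße" and the helper is_ascii() are made up for this illustration; the sketch only shows that character count and byte count can differ for non-ASCII names, and offers one possible way of flagging such names before export.

```r
# A field name containing one non-ASCII character (hypothetical example).
nm <- enc2utf8("Stra\u00dfe")

# Characters and bytes differ once a name leaves ASCII.
print(nchar(nm, type = "chars"))
print(nchar(nm, type = "bytes"))

# Hypothetical helper: TRUE for names made up of ASCII characters only.
is_ascii <- function(x) {
  vapply(x, function(s) all(utf8ToInt(enc2utf8(s)) < 128L),
         logical(1), USE.NAMES = FALSE)
}

print(is_ascii(c("GEOGID", nm)))
```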

Hope this clarifies,

Roger

  
    
#
Hi Roger,

Many thanks for your answer - very enlightening.
Apologies, I didn't realise the sessionInfo() output and other details would help.
As it happens your explanation makes perfect sense - it is only after I started using some longer and similar field names that the behaviour changed. Knowing how it works now, thanks to your explanation, I should be able to avoid that. I will make changes and see if it works. If it doesn't work I'll post back with all the version info etc.

Many thanks,
James