readOGR and nonASCII character

5 messages · Roger Bivand, Agustin Lobo

#
Hi list,

Is there any way to get readOGR() to correctly read
non-ascii character strings from the dbf file? I've
checked and my dbf file correctly displays
names with accents, but once read into R
accents are substituted by wrong symbols.

Agus
6 days later
#
On Wed, 11 Jul 2007, Agustin Lobo wrote:

            
We are dependent on what GDAL/OGR gives us here. Please try the equivalent 
function in maptools for your shapefile, and see whether the read.dbf() in 
foreign does any better. I'm assuming that you know the locale settings of 
your platform, and of the originating platform from sessionInfo()?

Roger
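
A minimal sketch of the kind of check suggested above: report the session locale, then test which source encoding makes a byte string decode cleanly. The byte string is a made-up example (not taken from the shapefile in this thread), and the list of candidate encodings is an assumption; a dbf written by another program may use a different code page.

```r
# Locale the current R session assumes for character data
Sys.getlocale("LC_CTYPE")

# Hypothetical example: the bytes a Latin-1/CP1252 writer stores for "Plaça"
# (0xE7 is "ç"); these are not read from the thread's shapefile.
raw_name <- rawToChar(as.raw(c(0x50, 0x6c, 0x61, 0xe7, 0x61)))

# Try each candidate source encoding; iconv() returns NA when the bytes
# are not valid in the encoding named by 'from'.
candidates <- c("latin1", "UTF-8")
decoded <- sapply(candidates,
                  function(enc) iconv(raw_name, from = enc, to = "UTF-8"))
decoded
```

A candidate that yields NA can be ruled out; one that yields readable text is only a plausible guess, since several single-byte encodings accept any byte.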

  
    
5 days later
#
Roger,

read.shapefile() and read.dbf() also yield wrong symbols for the non-ASCII 
characters in the input file.
Exporting to CSV and reading it in with read.csv(filename,sep=";") works 
fine (yes, it's odd: Excel uses ";" instead of "," as the CSV separator in 
the Spanish locale, because "," is used as the decimal separator).

My sessionInfo() output is:

 > sessionInfo()
R version 2.5.0 (2007-04-23)
i386-pc-mingw32

locale:
LC_COLLATE=Spanish_Spain.1252;LC_CTYPE=Spanish_Spain.1252;LC_MONETARY=Spanish_Spain.1252;LC_NUMERIC=C;LC_TIME=Spanish_Spain.1252

attached base packages:
[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"

other attached packages:
   spatstat       mgcv shapefiles   maptools    foreign      rgdal         sp
   "1.11-7"   "1.3-23"      "0.6"   "0.6-13"   "0.8-20"   "0.5-13"   "0.9-14"

Agus

Roger Bivand wrote:

  
    
#
On Mon, 23 Jul 2007, Agustin Lobo wrote:

            
(See read.csv2())

So a possible work-around is to use CSV or text files to transfer the 
affected character values. There are lots of possible difficulties when 
the data are generated by one program making some assumptions and then 
read by a different program with different assumptions. read.table() and 
friends do have an encoding= argument to declare a known encoding for 
character strings. There is a short note in the R Data Import/Export manual, 
and much more about locales in the R Installation and Administration manual.

Roger
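
A minimal sketch of the CSV route with an explicit encoding, assuming the export is Latin-1/CP1252 and semicolon-separated. The file contents and the `fincas` name are invented for illustration; only read.csv2() and the encoding= argument come from the message above.

```r
# Build a small semicolon-separated file in Latin-1 bytes. The names are
# made-up stand-ins for an Excel export, not data from this thread.
csv_path <- tempfile(fileext = ".csv")
con <- file(csv_path, open = "wb")
writeBin(c(charToRaw("ID;NOM_FINCA\n1;Pla"), as.raw(0xe7),
           charToRaw("a\n2;Oliv"), as.raw(0xe9), charToRaw("\n")), con)
close(con)

# read.csv2() defaults to sep = ";" and dec = ","; encoding = "latin1"
# marks the strings as Latin-1 without converting the bytes.
fincas <- read.csv2(csv_path, encoding = "latin1", stringsAsFactors = FALSE)

Encoding(fincas$NOM_FINCA)                      # how R has marked each string
fincas$NOM_FINCA <- enc2utf8(fincas$NOM_FINCA)  # optional: convert to UTF-8
```

Marking rather than converting keeps the bytes untouched, so a wrong guess can be corrected later by resetting Encoding() before the enc2utf8() step.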

  
    
#
This is what I do as a work-around:

ActiG <- readOGR("C:/ALOBO/dipu2006/espaisoberts_P037/GARRAF/GFO_EOA_TOTS_v4",layer="GFO_EOA_Acti_v4")
delme2 <- read.csv("C:/ALOBO/dipu2006/espaisoberts_P037/GARRAF/GFO_EOA_TOTS_v4/GFO_EOA_Acti_v4.csv",sep=";")

ActiG@data$NOM_FINCA <- delme2$NOM_FINCA

but I wanted to highlight the problem with non-ASCII characters. Also,
note that writeOGR() works fine. After this, the dbf written by

writeOGR(pepa,layer="ActiG",driver="ESRI Shapefile",dsn="C:/ALOBO/dipu2006/espaisoberts_P037/GARRAF/GFO_EOA_TOTS_v4")

has accents, ?, and all the Catalan subtleties!

Thanks for your patience

Agus
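
The column replacement in this work-around relies on the CSV rows being in the same order as the shapefile's attribute table. A hedged sketch of a safer variant, matching on a shared key instead: the `ID` column and both data frames are hypothetical stand-ins, not objects from this thread.

```r
# Hypothetical stand-ins: the attribute table as read from the shapefile
# (names garbled) and the CSV export (names correct), deliberately in a
# different row order.
shp_data <- data.frame(ID = c(3, 1, 2),
                       NOM_FINCA = c("?", "?", "?"),
                       stringsAsFactors = FALSE)
csv_data <- data.frame(ID = c(1, 2, 3),
                       NOM_FINCA = c("Can Grau", "Pla\u00e7a", "Oliv\u00e9"),
                       stringsAsFactors = FALSE)

# For each shapefile row, find the position of the same ID in the CSV
idx <- match(shp_data$ID, csv_data$ID)
stopifnot(!anyNA(idx))  # every shapefile row must have a CSV counterpart

# Copy the correctly encoded names across in shapefile row order
shp_data$NOM_FINCA <- csv_data$NOM_FINCA[idx]
```

With a real SpatialPolygonsDataFrame the same assignment would target the `@data` slot, as in the work-around above.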


Roger Bivand wrote: