changing names with different character sets

5 messages · Erin Hodgess, Brian Ripley, Jeff Newmiller +1 more

#
Dear R People:

I'm trying to replicate something that I saw on an R blog.

The first step is to load in the .rda file, which is fine.

However, some of the names of the columns in the data frame have
special characters, accents, and such.

How do I get around this on a basic keyboard, please?

Thanks,
Erin
#
On 19/02/2012 07:30, Erin Hodgess wrote:
Most of the world think characters with accents are normal, not special. 
  The difference for R is going to be whether they are alphanumeric or not.
Copy-and-paste from names(dataframe) may work.  But without an example 
or knowing your OS or your locale (but I remember you are in the US) it 
is hard to tell.

The main issue is that what R regards as a valid name aka symbol depends 
on the locale, and so strictly in a US locale no non-ASCII characters 
are valid in names.  In practice US locales tend to be set up either for 
a Western European character set (Latin-1, cp1252) or so that all 
alphanumeric Unicode characters in a human language are regarded as 
alphanumeric.
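
A minimal sketch of how one might check this in a given session. The data frame `d` and its column name are made up for illustration, and the name is written with \u escapes so the script itself stays pure ASCII. What `Sys.getlocale()` and `make.names()` return will differ between platforms and locales.

```r
# Which character set is this session using? The value is platform-specific.
Sys.getlocale("LC_CTYPE")

# Illustrative data frame with one non-ASCII column name.
d <- data.frame(x = 1:2)
names(d) <- "Bl\u00e5b\u00e6rgr\u00f8d"

# A name held in a character string does not have to be a valid symbol,
# so [[ ]] works whatever the locale thinks of the characters.
nm <- names(d)[1]
col <- d[[nm]]

# make.names() returns a syntactically valid version of the name for the
# current locale; if it differs from nm, nm is not usable as a bare symbol here.
identical(make.names(nm), nm)
```

Taking the name from `names(d)` this way is the programmatic equivalent of copy-and-paste, so nothing has to be typed on the keyboard.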

  
    
#
Refer to the columns by their position (numerically).
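
For example (a small sketch; the data frame `d` and the replacement names "apple" and "berry" are invented for illustration):

```r
# Illustrative data frame whose column names contain non-ASCII characters,
# written with \u escapes so they can be typed on any keyboard.
d <- data.frame(a = 1:2, b = 3:4)
names(d) <- c("\u00c6blefl\u00f8de", "Bl\u00e5b\u00e6rgr\u00f8d")

first  <- d[[1]]   # first column, selected by position
second <- d[, 2]   # second column, selected by position

# Optionally rename by position so later code can use plain ASCII names.
names(d)[1:2] <- c("apple", "berry")
```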
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.
#
On Feb 19, 2012, at 08:49 , Prof Brian Ripley wrote:

            
You could consider a strategy like this:
  Æblefløde Blåbærgrød
1         1          3
2         2          4
[1] "Æblefløde"  "Blåbærgrød"
[1] "AEbleflode"  "Blabaergrod"
  AEbleflode Blabaergrod
1          1           3
2          2           4

(If the characters don't display correctly to begin with, you may need to figure out the appropriate from= argument to iconv() as well.)
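
The commands that produced the output above are not shown in this archive. One way to get a similar result, as a sketch only: the original column names below are an assumption inferred from the transliterated names (AEbleflode, Blabaergrod), and from = "UTF-8" assumes the names are stored as UTF-8. The exact result of //TRANSLIT depends on the platform's iconv.

```r
# Illustrative data frame; the non-ASCII names are assumed, not taken
# verbatim from the original message.
d <- data.frame(a = 1:2, b = 3:4)
names(d) <- c("\u00c6blefl\u00f8de", "Bl\u00e5b\u00e6rgr\u00f8d")

# Ask iconv() to transliterate to plain ASCII.
ascii <- iconv(names(d), from = "UTF-8", to = "ASCII//TRANSLIT")
ascii

# Only overwrite the names if every conversion succeeded (no NA results).
if (!anyNA(ascii)) names(d) <- ascii
names(d)
```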

  
    
#
On 19/02/2012 12:43, peter dalgaard wrote:
And for some languages transliteration does not work (and it is not 
supported at all under some versions of iconv).
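
A defensive sketch for that case. `ascii_names()` is a hypothetical helper, not part of R: it falls back to positional names (V1, V2, ...) whenever transliteration fails or is unsupported, and assumes the input is UTF-8 unless told otherwise.

```r
# Hypothetical helper: transliterate names to ASCII where possible,
# otherwise fall back to positional names.
ascii_names <- function(x, from = "UTF-8") {
  # Some iconv builds reject //TRANSLIT outright; treat that as total failure.
  out <- tryCatch(
    iconv(x, from = from, to = "ASCII//TRANSLIT"),
    error = function(e) rep(NA_character_, length(x))
  )
  # NA means the conversion failed for that element.
  bad <- is.na(out) | !nzchar(out)
  out[bad] <- paste0("V", which(bad))
  # Make the results syntactically valid and unique.
  make.unique(make.names(out))
}

res <- ascii_names(c("\u00c6ble", "abc"))
res
```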

We are all guessing, but the comment about a 'basic' keyboard suggested 
to me that the column names were used in some script.  If so, getting R 
to work with the original names may be the simplest alternative.