R 1.8.1 - 1.9.0 incompatibility: Underscore in syntactically valid names

3 messages · Michael A. Miller, Peter Dalgaard

#
Dear R-gang,

I have a question about handling underscores in names in R 1.8.1
and 1.9.0.  I recently installed 1.9.0 on a machine and found
that many codes no longer work as a result of the changed
behavior in make.names.

I have numerous data files that have dashes, periods and
underscores in the header row.  I've got numerous R codes that
read those files with read.table and read.csv and then use the
names expecting the underscores and dashes to be changed to
periods in the column names.  Since this no longer happens in
1.9.0, lots of codes are failing.  One way to work around this is
to repair a lot of codes to take this backward incompatibility
into account (both R scripts and data sources, so that is a
significant project, here at least).  Or I can stay with 1.8.1,
but that is just a sure route to eventual incompatibility with
something else, so I'll gradually start migrating code over and
adding version checks.

Can anyone suggest a way to maintain compatibility between R
versions?  I suppose I could write a wrapper around make.names to
replace _ with ., but I'm not sure what that might break in 1.9.0
that now expects the new behavior.  I'd appreciate any
suggestions.

Mike
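
Editor's sketch of the wrapper idea raised above, offered as one possible approach rather than a tested fix: rather than overriding make.names itself, apply a small helper to the column names after reading. The helper name legacyNames is hypothetical, and the sketch assumes read.table's check.names argument is available in both R versions. It maps underscores and dashes to periods before calling make.names, so the result should match the pre-1.9.0 names without changing make.names for other code.

```r
# Hypothetical helper (not part of R): turn underscores and dashes into
# periods first, then let make.names() handle anything else.
# The helper's own name avoids "_" so that R 1.8.1 can still parse it.
legacyNames <- function(x) {
  make.names(gsub("[_-]", ".", x))
}

# Header fields similar to the ones described in the message.
raw <- c("a", "b", "x", "some_factor", "dose-level")
fixed <- legacyNames(raw)
print(fixed)

# Assumed usage with a data file (not run here):
#   df <- read.table("example.dat", header = TRUE, check.names = FALSE)
#   names(df) <- legacyNames(names(df))
```

Because the renaming happens in the script that reads the data, code elsewhere that expects the 1.9.0 behavior of make.names is left alone.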
#
mmiller3 at iupui.edu (Michael A. Miller) writes:
Gah! I could swear we discussed that particular issue leading up to
1.9.x and had plans for a compatibility option.

You might file a bug report at least for the docs, since the example
is clearly wrong...
3 days later
#
    > Gah! I could swear we discussed that particular issue
    > leading up to 1.9.x and had plans for a compatibility
    > option.

    > You might file a bug report at least for the docs, since
    > the example is clearly wrong...

Done.  

I tried to write some version dependencies into a sample code,
but I'm stumped by the fact that _ is not allowed before 1.9.0.
For example, suppose I have a data file, example.dat, like this: 

a b x   some_factor
1 1 0.4 orange
2 1 0.3 blue
1 1 0.2 dog
2 1 0.1 orange
1 2 0.4 blue
2 2 0.3 dog
1 2 0.2 orange
2 2 0.1 blue

To read and use this in a version independent way, I've tried this:

df <- read.table('example.dat',header=T)
if ( version['minor'] == "9.0" ) {
  plot(x ~ some_factor, data=df)
} else {
  plot(x ~ some.factor, data=df)
}


This fails in R 1.8.1, because some_factor throws a syntax error:

  > df <- read.table('example.dat',header=T)
  > if ( version['minor'] == "9.0" ) {
  +   plot(x ~ some_factor, data=df)
  Error: syntax error
  > 

Ick.  Is there a known idiom for handling this sort of version
dependency in R?  I'm going to avoid R 1.9.x for now, and I
encourage any authors of contributed packages to do their best
to maintain backwards compatibility for those of us who cannot
make the switch quickly.

Mike
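
Editor's sketch of one possible idiom for the question above, under stated assumptions. The version check fails in 1.8.1 because R parses the whole if/else expression before evaluating it, so the branch containing some_factor must be syntactically valid even when it is never run. One way around this is to keep the underscore inside a string and build the formula at run time. The sketch also tests the behavior of make.names directly instead of comparing version strings. The small data frame stands in for example.dat, and the variable names fac and fml are illustrative.

```r
# Ask make.names() what this version of R does with the header field.
# The underscore appears only inside a string, so older parsers accept it.
fac <- make.names("some_factor")   # "some_factor" in 1.9.0 and later

# Stand-in for the data read from example.dat.
df <- data.frame(x = c(0.4, 0.3, 0.2, 0.1),
                 f = factor(c("orange", "blue", "dog", "orange")))
names(df)[2] <- fac

# Build the formula from text so no version-specific name is typed in code.
fml <- as.formula(paste("x ~", fac))
mf <- model.frame(fml, data = df)
print(names(mf))

# Assumed usage (not run here): plot(fml, data = df)
```

The same script should then run unchanged on either side of the 1.9.0 change, since the column name is computed rather than written out.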