RFC: type conversion in read.table

Prof Brian Ripley wrote:
Yes, definitely.  It also fits very well into the formal class idiom. 
Couple of suggestions below.

I think the most flexible way to get what you want is something like the
following.

The natural default for the colClasses argument is still the name of a
class, but of a "virtual" class in green book terminology.

I've been playing around with some data-frame related software mostly as
tests for the methods code (in SLanguage/SModels in the Omegahat tree).

The class used there for this purpose is called "dataVariable", meaning
anything that can conceptually be a variable in a data frame.  Actual
classes for variables extend this class, maybe trivially, maybe by some
method.

What's needed for the default here is essentially a method to coerce
class "character" to "dataVariable" (or whatever name one wants to
use).  When we are really using formal methods, this would be specified
by a call to setAs (green book, p. 307).  Then in effect
  data[[i]] <- as(data[[i]], colClasses[i])
applies in the default case as well.
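
To make that concrete, here is a minimal sketch of how the default case might look with formal methods.  The definition of "dataVariable" below is an assumption for illustration (it is not the Omegahat code), and using type.convert() as the default coercion is likewise only one plausible choice.

```r
library(methods)

# Virtual class named after the one in this message. This definition is a
# stand-in for illustration, not the class from the Omegahat tree.
setClass("dataVariable", representation("VIRTUAL"))

# Assumed default coercion: let type.convert() choose among logical,
# integer, numeric, and character for each column of strings.
setAs("character", "dataVariable",
      function(from) type.convert(from, as.is = TRUE))

# Columns as they might look after the fields have been read as strings.
data <- list(x = c("1", "2", "3"), y = c("a", "b", "c"))
colClasses <- c("dataVariable", "dataVariable")

# The same loop serves the default and any user-supplied class name.
for (i in seq_along(data)) {
  data[[i]] <- as(data[[i]], colClasses[i])
}
```

The point of the design is that the loop needs no special case for "no class given": the default is just another class name with a coerce method behind it.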

Users could specialize the default by overriding the setAs, but a
better way would be to define a new virtual class, with its own method
for coercion.  Users would then have essentially unlimited flexibility,
by supplying the name of that class in the colClasses argument.

As a default, seems fine.  When the user supplies a class, this implies
an as() method, which can then decide what to do in case of
problems: error, NA, or whatever.
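
As a rough illustration of a user-supplied class choosing its own policy, the sketch below defines two virtual classes whose coerce methods treat bad fields differently.  The class names "strictNumber" and "lenientNumber" are hypothetical, made up for this example.

```r
library(methods)

# Hypothetical user classes. Both are virtual and exist only to name a coercion.
setClass("strictNumber", representation("VIRTUAL"))
setClass("lenientNumber", representation("VIRTUAL"))

# Policy 1: reject the column if any non-missing field fails to parse.
setAs("character", "strictNumber", function(from) {
  out <- suppressWarnings(as.numeric(from))
  bad <- is.na(out) & !is.na(from)
  if (any(bad)) {
    stop("cannot convert to a number: ", paste(from[bad], collapse = ", "))
  }
  out
})

# Policy 2: turn unparseable fields into NA and carry on.
setAs("character", "lenientNumber", function(from) {
  suppressWarnings(as.numeric(from))
})

column <- c("1.5", "oops", "3")
lenient <- as(column, "lenientNumber")
strict <- tryCatch(as(column, "strictNumber"),
                   error = function(e) conditionMessage(e))
```

Either name could then be given in colClasses, and the reading code itself would not need to know which policy is in force.
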
John