Regarding this change:
CHANGES IN R 3.1.0:
NEW FEATURES:
* type.convert() (and hence by default read.table()) returns a
character vector or factor when representing a numeric input as a
double would lose accuracy. Similarly for complex inputs.
If a file contains numeric data with unrepresentable numbers of
decimal places that are intended to be read as numeric, specify
colClasses in read.table() to be "numeric".
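To make the quoted workaround concrete, here is a minimal sketch of forcing numeric parsing with colClasses; the temporary file and its contents are stand-ins for real data:

```r
## Hypothetical one-column file whose value has 18 decimal digits.
tf <- tempfile()
writeLines(c("x", "0.012345678901234567"), tf)
## colClasses = "numeric" forces numeric conversion even though a double
## cannot represent all 18 digits; type.convert() is bypassed for this column.
d <- read.table(tf, header = TRUE, colClasses = "numeric")
str(d)
unlink(tf)
```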
How do I get the old behavior where type.convert() automatically
converts to numeric if suitable, regardless of whether or not the
string has more than 17 digits of accuracy?
Sure, I could first pass every single column of data through a kludgy
checking function like my.can.be.numeric() below, and then set
colClasses to "numeric" or not based on that, but is there a better
way?
my.can.be.numeric <- function(xx) {
    ## Suppress the "NAs introduced by coercion" warning during the test.
    old.warn <- options(warn = -1)
    on.exit(options(old.warn))
    ## TRUE only if every element coerces cleanly to numeric.
    all(!is.na(as.numeric(xx)))
}
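For what it's worth, the kludge wired end to end might look like the sketch below (file, column names, and contents are all hypothetical; the helper is redefined locally so the snippet stands alone): read everything as character first, classify each column with the helper, then re-read with the derived colClasses.

```r
## Self-contained restatement of the helper: one TRUE/FALSE per column.
my.can.be.numeric <- function(xx) all(!is.na(suppressWarnings(as.numeric(xx))))

## Hypothetical two-column tab-separated file: one numeric, one not.
tf <- tempfile()
writeLines(c("a\tb", "0.012345678901234567\tfoo"), tf)

## Pass 1: read all columns as character so nothing is converted yet.
raw <- read.table(tf, header = TRUE, sep = "\t", colClasses = "character")
## Decide a class for each column based on the helper.
classes <- ifelse(vapply(raw, my.can.be.numeric, logical(1)),
                  "numeric", "character")
## Pass 2: re-read with the derived colClasses.
dat <- read.table(tf, header = TRUE, sep = "\t", colClasses = classes)
str(dat)
unlink(tf)
```

The obvious cost is reading the file twice, which is why a way to restore the old type.convert() behavior directly would be preferable.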
Example of the changed behavior in R 3.1.0 vs. earlier versions, both
with options("digits"=10) set:
# R version 3.1.0 Patched (2014-04-15 r65398) -- "Spring Dance"
# Platform: x86_64-unknown-linux-gnu/x86_64 (64-bit)
type.convert(paste("0.", paste(rep(0:9,3)[seq_len(17)],collapse=""), sep=""), as.is=TRUE)
[1] 0.01234568
type.convert(paste("0.", paste(rep(0:9,3)[seq_len(18)],collapse=""), sep=""), as.is=TRUE)
[1] "0.012345678901234567"
# R version 3.0.2 Patched (2013-10-23 r64103) -- "Frisbee Sailing"
# Platform: x86_64-unknown-linux-gnu/x86_64 (64-bit)
type.convert(paste("0.", paste(rep(0:9,3)[seq_len(17)],collapse=""), sep=""), as.is=TRUE)
[1] 0.01234568
type.convert(paste("0.", paste(rep(0:9,3)[seq_len(18)],collapse=""), sep=""), as.is=TRUE)
[1] 0.01234568
Andrew Piskorski <atp at piskorski.com>