disabling NA token as na.string in read.table
Vadim Ogranovich <vograno at arbitrade.com> writes:
Dear R-Users, I have a csv file that contains NA tokens, and these tokens are perfectly good values that must not be converted to NA by read.table(). I tried to prevent the conversion by specifying the na.strings argument, but this seems only to add to the list of NA strings rather than replace it.
system("cat foo")
1 foo
2 NA
read.table("foo", na.strings="foo")
  V1 V2
1  1 NA
2  2 NA
This is R 1.6.0 on Linux.
What did I do wrong?
Hmm, this looks like a bit of a bug. read.table() ends up calling type.convert() with its default na.strings of "NA". Now, if "NA" had been in na.strings for read.table(), scan() would already have turned it into <NA> by that point, so I suspect you might have preferred na.strings=character(0) in that call, but that has the side effect of turning the real NA into a factor level:
x <- c(NA,"NA","foo")
type.convert(x)
[1] <NA> <NA> foo
Levels: foo
type.convert(x,na.strings=character(0))
[1] <NA> NA foo
Levels: NA foo NA
dput(type.convert(x,na.strings=character(0)))
structure(c(3, 1, 2), .Label = c("NA", "foo", NA), class = "factor")
I.e., it looks like the internals of type.convert need some fixing up.
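In the meantime, one possible workaround is to keep type.convert() out of the picture altogether. The sketch below assumes that your version of read.table() accepts the colClasses argument and that reading every column as character is acceptable for your data. The temporary file and the names tmp and d are made up for illustration; the file contents mirror the "foo" file from the question.

```r
# Hypothetical stand-in for the two-row file "foo" from the question.
tmp <- tempfile()
writeLines(c("1 foo", "2 NA"), tmp)

# Read every column as character, so no automatic type conversion is applied,
# and pass an empty na.strings so that no token is treated as missing.
d <- read.table(tmp, colClasses = "character", na.strings = character(0))

# Convert the columns that really are numeric by hand.
d$V1 <- as.integer(d$V1)

# The "NA" token should now be an ordinary string rather than a missing value.
print(d)
print(is.na(d$V2))

unlink(tmp)
```

The cost is that each non-character column has to be converted explicitly afterwards, which may be tedious for wide files.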
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907