disabling NA token as na.string in read.table

2 messages · Vadim Ogranovich, Peter Dalgaard

#
Dear R-Users,

I have a csv file that contains NA tokens, and these tokens are perfectly good
values that must not be converted to NA by read.table(). I tried to
prevent the conversion by specifying the na.strings argument, but this seems
only to add to the list of NA strings rather than replace it.

> system("cat foo")
1 foo
2 NA
> read.table("foo", na.strings="foo")
  V1 V2
1  1 NA
2  2 NA


This is R 1.6.0 on Linux.

What did I do wrong?

Thanks, Vadim

#
Vadim Ogranovich <vograno at arbitrade.com> writes:
Hmm, this looks like a bit of a bug. read.table() ends up calling
type.convert() with its default na.strings of "NA". If "NA" had been in
na.strings for read.table(), scan() would already have turned it
into <NA> by that point. I suspect you would have preferred
na.strings=character(0), but that has the side effect of turning the
real NA into a factor level:
[1] <NA> <NA> foo
Levels: foo
[1] <NA> NA   foo
Levels: NA foo NA
structure(c(3, 1, 2), .Label = c("NA", "foo", NA), class = "factor")

I.e. it looks like the internals of type.convert need some fixing up.
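
A possible way to sidestep the conversion, sketched for a current R release rather than 1.6.0 (where the behaviour may differ): declare the column as character via colClasses so that type.convert() is not applied to it, and pass na.strings=character(0) so that scan() has no NA tokens to match. The text= argument and the two-line input below are assumptions used only to make the sketch self-contained; they stand in for the file "foo" from the original post.

```r
# Hypothetical two-line input mirroring the file "foo" from the post.
input <- c("1 foo", "2 NA")

# Read the second column as plain character and declare no NA strings,
# so the literal token "NA" is kept as an ordinary string.
df <- read.table(text = input,
                 colClasses = c("integer", "character"),
                 na.strings = character(0))

print(df)
print(is.na(df$V2))  # shows whether any value was turned into a missing value
```

The trade-off is that column types have to be stated up front, so automatic type detection is given up for the columns listed in colClasses.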