
type.convert (PR#6781)

3 messages · J. R. M. Hosking, Brian Ripley

#
Full_Name: J. R. M. Hosking
Version: 1.9.0
OS: Windows 2000
Submission from: (NULL) (129.34.20.23)


Two problems, perhaps related:

(1) na.strings is not honored when x is non-numeric and as.is=T

  > type.convert( c("abc","-"), as.is=T, na.strings="-" )
  [1] "abc" "-"  

... unless x consists only of NAs

  > type.convert( c("abc","-"), as.is=T, na.strings=c("-","abc") )
  [1] NA NA

But with x numeric or as.is FALSE (or omitted), it works as advertised:

  > type.convert( c("abc","-"), na.strings="-" )
  [1] abc  <NA>
  Levels: abc
  > type.convert( c("6","-"), na.strings="-" )
  [1]  6 NA


(2) When na.strings is omitted, blank strings in nonnumeric vectors are not
converted into NAs (regardless of the value of as.is).

  > type.convert(c("6",""," "))     # OK: gives 6 NA NA
  [1]  6 NA NA

  > type.convert(c("A",""," "))   # gives a factor with 3 levels and no NAs
  [1] A    
  Levels:    A

  > type.convert(c("A",""," "),as.is=T)  # gives a char vector with no NAs
  [1] "A" ""  " "


Rider: it would be nice if type.convert had a strip.white argument, so that
  type.convert(c(" 6"," -"),na.strings="-",strip.white=T) 
would return a numeric vector. Stripping leading and trailing blanks can be
time-consuming, and could presumably be done more quickly by an .Internal
function such as the one called by type.convert.


(R 1.9.0, Windows binary from CRAN)
1 day later
#
On Fri, 16 Apr 2004 hosking@watson.ibm.com wrote:
[...]
It can be done very rapidly by sub(), for example.

Given that type.convert is documented as a helper function for read.table, 
we will not be encumbering it with features read.table does not need.  It 
would be helpful if you did not encumber bug reports with other issues, 
too.
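
Editorial sketch of the sub() approach mentioned in the reply: strip leading and trailing blanks first, then call type.convert. The helper name strip_convert is hypothetical and not part of R, and as.is is passed explicitly so the result does not depend on its default. This is an illustration under those assumptions, not the code used by read.table.

```r
# Hypothetical helper (not part of R): trim blanks, then convert.
strip_convert <- function(x, na.strings = "NA", as.is = TRUE) {
  x <- sub("^ +", "", x)   # drop leading blanks
  x <- sub(" +$", "", x)   # drop trailing blanks
  type.convert(x, na.strings = na.strings, as.is = as.is)
}

# The example from the rider: after trimming, "-" matches na.strings.
res <- strip_convert(c(" 6", " -"), na.strings = "-")
print(res)
```

Doing the trimming outside type.convert keeps the function itself limited to what read.table needs, which is the position taken in the reply.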
#
On Fri, 16 Apr 2004 hosking@watson.ibm.com wrote:

            
[...]
That is documented to be different (a logical vector).
[...]
The point is that if no conversion takes place, no checking for na.strings 
took place.  That _was_ intentional, as it is not needed by read.table.
I've added it.
[...]
This is wholly intentional, and is now documented.  It is nothing to do 
with whether `na.strings is omitted', though.
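
Editorial sketch for point (2): since blank strings in nonnumeric vectors are intentionally not treated as missing, one option is to name them in na.strings. This assumes a version of R in which na.strings is checked for character results (the check described above as added); it makes no claim about the behaviour of R 1.9.0 itself.

```r
x <- c("A", "", " ")

# Name the blank strings explicitly instead of relying on the default.
res <- type.convert(x, na.strings = c("NA", "", " "), as.is = TRUE)
print(res)
print(is.na(res))
```

The matching against na.strings is on the exact string, so "" and " " each need their own entry.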