Message-ID: <Pine.LNX.4.44.0404181714470.22842-100000@gannet.stats>
Date: 2004-04-18T18:31:16Z
From: Brian Ripley
Subject: type.convert (PR#6781)
In-Reply-To: <20040416163133.1085710495@slim.kubism.ku.dk>

On Fri, 16 Apr 2004 hosking@watson.ibm.com wrote:

> Full_Name: J. R. M. Hosking
> Version: 1.9.0
> OS: Windows 2000
> Submission from: (NULL) (129.34.20.23)
> 
> 
> Two problems, perhaps related:
> 
> (1) na.strings is not honored when x is non-numeric and as.is=T
> 
>   > type.convert( c("abc","-"), as.is=T, na.strings="-" )
>   [1] "abc" "-"  
> 
> ... unless x consists only of NAs
> 
>   > type.convert( c("abc","-"), as.is=T, na.strings=c("-","abc") )
>   [1] NA NA

That case is documented to be different: a vector consisting entirely of
NAs is returned as a logical vector.

> But with x numeric or as.is FALSE (or omitted), it works as advertised:
> 
>   > type.convert( c("abc","-"), na.strings="-" )
>   [1] abc  <NA>
>   Levels: abc
>   > type.convert( c("6","-"), na.strings="-" )
>   [1]  6 NA

The point is that if no conversion took place, no checking for na.strings 
took place either.  That _was_ intentional, as it is not needed by 
read.table.  I've now added the check.
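
For illustration only, the effect of such a check on an unconverted
character vector can be sketched in plain R.  The helper name
mark_na_strings is hypothetical and is not part of R; this sketch assumes
the check amounts to replacing exact matches of na.strings by NA, and it
is not the code actually added to type.convert.

```r
# Hypothetical helper (not part of R): apply na.strings to a character
# vector that is returned without conversion.
mark_na_strings <- function(x, na.strings = "NA") {
  # Elements that exactly match one of na.strings become NA;
  # everything else is left untouched.
  x[x %in% na.strings] <- NA_character_
  x
}

mark_na_strings(c("abc", "-"), na.strings = "-")
# [1] "abc" NA
```

Matching is exact here, so a string such as " -" with a leading space
would not be treated as missing.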


> (2) When na.strings is omitted, blank strings in nonnumeric vectors are not
> converted into NAs (regardless of the value of as.is).

This is wholly intentional, and is now documented.  It is nothing to do 
with whether `na.strings is omitted', though.

>   > type.convert(c("6",""," "))     # OK: gives 6 NA NA
>   [1]  6 NA NA
> 
>   > type.convert(c("A",""," "))   # gives a factor with 3 levels and no NAs
>   [1] A    
>   Levels:    A
> 
>   > type.convert(c("A",""," "),as.is=T)  # gives a char vector with no NAs
>   [1] "A" ""  " "

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595