Message-ID: <46B8E8CE.1060509@stats.uwo.ca>
Date: 2007-08-07T21:49:02Z
From: Duncan Murdoch
Subject: Embedded nuls in strings
In-Reply-To: <46B8DEF0.3030602@fhcrc.org>
On 07/08/2007 5:06 PM, Herve Pages wrote:
> Hi,
>
> ?rawToChar
> 'rawToChar' converts raw bytes either to a single character string
> or a character vector of single bytes. (Note that a single
> character string could contain embedded nuls.)
>
> Allowing embedded nuls in a string might be an interesting experiment, but it
> seems to cause trouble for most of the string manipulation functions.
>
> A string with an embedded 0:
>
> raw0 <- as.raw(c(65:68, 0 , 70))
> string0 <- rawToChar(raw0)
>
>> string0
> [1] "ABCD\0F"
>
> nchar() should return 6:
>> nchar(string0)
> [1] 4
You don't say which R version you are using. The default counting type in
nchar() recently changed from "bytes" (for which 6 is correct) to "chars"
(for which 4 is correct).
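
A minimal sketch of how to avoid depending on the default: pass type
explicitly to nchar(), and count bytes on the raw vector itself with
length(). The variable names x and y are illustrative only, and enc2utf8()
is used on the assumption that the byte count should refer to UTF-8.

```r
# State the counting type explicitly instead of relying on the default.
x <- "ABCDEF"
bytes_x <- nchar(x, type = "bytes")   # number of bytes in the string
chars_x <- nchar(x, type = "chars")   # number of characters in the string

# The two types differ for multibyte characters: e-acute is one
# character but two bytes when encoded as UTF-8.
y <- enc2utf8("\u00e9")
bytes_y <- nchar(y, type = "bytes")
chars_y <- nchar(y, type = "chars")

# For raw data, length() counts every byte, including the nul.
raw0 <- as.raw(c(65:68, 0, 70))
n_raw <- length(raw0)
```

Counting on the raw vector sidesteps the question of how string functions
treat an embedded nul.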
Duncan Murdoch