random output with sub(fixed = TRUE)
On Wed, 21 Dec 2005, Roger D. Peng wrote:
Well, who am I to break this long-standing ritual? :) Interestingly, while the printed output looks wrong, I get
v <- paste(0:10, "asdf", sep = ".")
a <- sub(".asdf", "", v, fixed = TRUE)
b <- as.character(0:10)
identical(a, b)
[1] TRUE
identical is wrong! R character strings have a true length and a C-style length: print() prints all the characters, even those after embedded nuls. identical uses
    if(strcmp(CHAR(STRING_ELT(x, i)), CHAR(STRING_ELT(y, i))) != 0)
which is C-style.
The issue is character.c:1015, whose nr gets trashed: note the first answer in the vector is correct. So easy to fix. This code has been as it currently is for years, so I don't think this is at all related to the release of 2.2.1.
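The difference between the two notions of length can be sketched in plain C. This is only an illustration under stated assumptions, not R's internal code: the `counted_string` type, the two comparison functions, and the sample values `trashed` and `clean` are hypothetical names introduced here. The bytes in `trashed` are copied from the second element of the reported output, with a true length of 6 (the length of "1.asdf").

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an R character string:
   a byte buffer plus its true length (not R's actual representation). */
typedef struct {
    const char *bytes;  /* may contain embedded nul bytes */
    size_t len;         /* true length, counting bytes after any embedded nul */
} counted_string;

/* Second element of the reported output, and the value that was intended. */
const counted_string trashed = { "1\0st\0\0", 6 };
const counted_string clean   = { "1", 1 };

/* C-style comparison: strcmp() stops at the first nul byte,
   so anything stored after it is never examined. */
int equal_c_style(counted_string x, counted_string y)
{
    return strcmp(x.bytes, y.bytes) == 0;
}

/* Length-aware comparison: the true lengths must match,
   and then every byte up to that length is compared. */
int equal_full_length(counted_string x, counted_string y)
{
    return x.len == y.len && memcmp(x.bytes, y.bytes, x.len) == 0;
}
```

Called from a main() function, equal_c_style(trashed, clean) returns 1 while equal_full_length(trashed, clean) returns 0. That is the same mismatch as identical() reporting TRUE for strings that print() shows to be different.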
Peter Dalgaard wrote:
"Roger D. Peng" <rpeng at jhsph.edu> writes:
I've noticed what I think is curious behavior in using 'sub(fixed = TRUE)' and
was wondering if my expectation is incorrect. Here is one example:
v <- paste(0:10, "asdf", sep = ".")
sub(".asdf", "", v, fixed = TRUE)
The results I get are
sub(".asdf", "", v, fixed = TRUE)
 [1] "0"              "1\0st\0\0"      "2\0<af>\001\0\0" "3\0<af>\001\0\0"
 [5] "4\0mes\0"       "5\0<ba>\001\0\0" "6\0\0\0\0\0"     "7\0\0\0m\0"
 [9] "8\0\0\0t\0"     "9\0<fe>\0\0\0"  "10\0\0\0\0\0"
I expected "0" in the first entry and everything else would be unchanged. Your results may vary since every time I run 'sub()' in this way, I get a slightly different answer in entries 2 through 11. As it turns out, 'gsub(fixed = TRUE)' gives me the answer I *actually* wanted, which was to replace the string in every entry. But I still think the behavior of 'sub(fixed = TRUE)' is a bit odd.
version
         _
platform x86_64-unknown-linux-gnu
arch     x86_64
os       linux-gnu
system   x86_64, linux-gnu
status
major    2
minor    2.1
year     2005
month    12
day      20
svn rev  36812
language R
Argh... year 2005 month 12 day 21 and something like this gets discovered. It's a ritual, I tell ya, a ritual! If you look at the output and terminate all strings at the embedded \0, it looks much more sensible, so it should be fairly easy to spot the cause of this bug...
-- Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595