Bug in rank with utf8?
2015-08-13 8:39 GMT-05:00 Hadley Wickham <h.wickham at gmail.com>:
x <- "\u0663"
y <- 3
x == y # FALSE
rank(c(x, y)) # c(1.5, 1.5)
Also interesting, and confusing to me:
x == y
[1] FALSE
x > y
[1] FALSE
x < y
[1] FALSE
With some slight changes:
x <- "\u0663"
y <- "3"
xy <- c(x,y)
rank(xy);
[1] 1.5 1.5
Sys.getlocale();
[1] "LC_CTYPE=en_US.UTF8;LC_NUMERIC=C;LC_TIME=en_US.UTF8;LC_COLLATE=en_US.UTF8;LC_MONETARY=en_US.UTF8;LC_MESSAGES=en_US.UTF8;LC_PAPER=en_US.UTF8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF8;LC_IDENTIFICATION=C"
Sys.setlocale(category="LC_COLLATE", locale="C");
[1] "C"
rank(xy);
[1] 2 1
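To make the comparison above easier to repeat, here is a minimal sketch that ranks the same vector under a chosen LC_COLLATE setting and then restores the previous one. The helper name rank_in_collate is made up for this illustration and is not part of R. My understanding, which I have not checked against the R sources for every build, is that == on character vectors tests whether the strings are identical, while <, > and rank() go through the locale's collation, which would explain how all three comparisons can be FALSE while rank() still reports a tie. The results under a non-C locale depend on the platform and on whether R was built with ICU, so no output is claimed for that case.

```r
x <- "\u0663"   # ARABIC-INDIC DIGIT THREE
y <- "3"        # ASCII digit three (a numeric 3 is coerced to "3" when compared with a string)
xy <- c(x, y)

# Hypothetical helper: rank a vector under a given LC_COLLATE setting,
# restoring the previous setting when the function exits.
rank_in_collate <- function(v, collate) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", collate)
  rank(v)
}

# The two strings are different, so equality is FALSE in any locale.
print(x == y)

# Rank under the session's current collation (reported above as 1.5 1.5
# for en_US.UTF8; other platforms may differ).
print(rank(xy))

# Rank under the C collation (reported above as 2 1).
print(rank_in_collate(xy, "C"))
```

If the goal is an ordering that does not depend on the session locale, order(xy, method = "radix") is documented to sort character vectors as in the C locale, which may be a simpler route than switching LC_COLLATE.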
John McKown