readLines interaction with gsub different in R-dev
I was told to re-raise this issue with R-dev. In both R-dev and R-3.4.3, the documentation under ?gsub says:
replacement ... For perl = TRUE only, it can also contain "\U" or "\L" to convert the rest of the replacement to upper or lower case and "\E" to end case conversion.
However, the following code gives different results in the two versions:
tempf <- tempfile()
writeLines(enc2utf8("author: Amélie"), con = tempf, useBytes = TRUE)
entry <- readLines(tempf, encoding = "UTF-8")
gsub("(\\w)", "\\U\\1", entry, perl = TRUE)
"AUTHOR: AM?LIE" # R-3.4.3
"A" # R-dev
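
For comparison, here is a minimal sketch of the documented "\U" and "\L" behaviour on an ASCII-only string. The input string and the variable names are made up for illustration. It only shows what ?gsub describes and says nothing about how R-dev treats this case.

```r
# Hypothetical ASCII-only input, as a control for the report above.
ascii_entry <- "author: amelie"

# "\\U" converts the rest of the replacement to upper case, so every
# matched word character is replaced by its upper-case form.
upper_all <- gsub("(\\w)", "\\U\\1", ascii_entry, perl = TRUE)

# "\\U" on the first letter and "\\L" on the remainder capitalises each word.
title_case <- gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", ascii_entry, perl = TRUE)

print(upper_all)
print(title_case)
```

If the ASCII case behaves as documented while the UTF-8 case does not, that would point at the handling of non-ASCII input rather than at the case conversion itself.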
Best,
Hugh Parsonage.