Dear R-List,
I'm trying to read a UTF-8-encoded text file, which works fine under the following configuration:
#####################################################################
### CONFIG 1
sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
running under Windows Server 2008.
### RESULT:
read.csv2("example.utf", fileEncoding="UTF-8")
VARIABLE LABEL ORDER_IN_PROFILE
1 A Umlauts:??? 45
2 B Umlauts:???? 35
#####################################################################
The exact same command executed under R-2.14.0 (running under Windows
7) gives a different output:
#####################################################################
### CONFIG 2
sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.14.0
If I specify "encoding" instead of "fileEncoding", non-ASCII characters are
displayed fine, but apparently the leading "UTF-8 bytes" (the byte order mark,
U+FEFF) are not stripped and end up in the first column name:
### RESULT:
read.csv2("example.utf", encoding="UTF-8")
X.U.FEFF.VARIABLE LABEL ORDER_IN_PROFILE
1 A Umlauts:??? 45
2 B Umlauts:???? 35
######################################################################
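One manual workaround would presumably be to strip the byte order mark before parsing, although a solution through read.csv2 itself would be cleaner. The following is only a sketch under assumptions: the file written here is a hypothetical stand-in for example.utf (same header, placeholder labels instead of the real umlaut strings), and the approach reads the raw bytes, drops a leading EF BB BF if present, and hands the remaining text to read.csv2 through a text connection.

```r
# Hypothetical stand-in for example.utf: a semicolon-separated file that
# starts with a UTF-8 byte order mark (bytes EF BB BF). The labels are
# placeholders, not the contents of the real file.
path <- tempfile()
bom  <- as.raw(c(0xEF, 0xBB, 0xBF))
body <- charToRaw("VARIABLE;LABEL;ORDER_IN_PROFILE\nA;Label A;45\nB;Label B;35\n")
writeBin(c(bom, body), path)

# Read the file as raw bytes and drop a leading BOM if there is one.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
if (length(bytes) >= 3 && identical(bytes[1:3], bom)) {
  bytes <- bytes[-(1:3)]
}

# Turn the bytes back into a string and declare it as UTF-8
# (this marks the encoding, it does not convert anything).
txt <- rawToChar(bytes)
Encoding(txt) <- "UTF-8"

# Parse the cleaned text with read.csv2 via a text connection.
tc  <- textConnection(txt)
dat <- read.csv2(tc, stringsAsFactors = FALSE)
close(tc)

names(dat)
```

With the BOM removed up front, the first column name should come out as VARIABLE rather than X.U.FEFF.VARIABLE. Whether the umlauts in the real file then display correctly in the German_Germany.1252 locale is not something this sketch verifies.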
Any hints on what I could do to get the results from config 1 under
config 2?
Many thanks in advance,
Christian