
Encoding errors in Rd files

On 12-07-25 3:24 AM, steven mosher wrote:
The issue is almost certainly with accented characters in the text.  For 
example, the 5th letter in Rivière is an e with a grave accent.  It is 
displayed as "h" in your error message, because R either was not told 
the encoding used in the source file, or was told one that turned out 
to be incorrect.

You need to change the source file so that it is stored in the UTF-8 
encoding.  That means you should read the file into an editor that 
displays it correctly (and that's sometimes hard when you don't know the 
original encoding; you may need to do some manual editing), then save it 
again, specifying that it should be saved using the UTF-8 encoding.  How 
you do that depends on your editor.
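
If you would rather do the conversion from R than from an editor, 
something like the following sketch may work.  It assumes the original 
encoding really is latin1, which is only a guess, and the temporary 
files are made up for illustration.  Check the result by eye before 
overwriting a real file.

```r
# Hypothetical stand-in for the real Rd source: the bytes of "Rivière"
# as they would be stored in latin1 (0xe8 is e with a grave accent).
src <- tempfile(fileext = ".Rd")
writeBin(as.raw(c(0x52, 0x69, 0x76, 0x69, 0xe8, 0x72, 0x65, 0x0a)), src)

# Read the bytes as they are; R does not know the encoding at this point.
raw_lines <- readLines(src, warn = FALSE)

# Convert under the ASSUMPTION that the file was latin1.
utf8_lines <- iconv(raw_lines, from = "latin1", to = "UTF-8")

# Write the converted text out byte-for-byte, so no further
# re-encoding happens on the way to disk.
dest <- tempfile(fileext = ".Rd")
writeLines(utf8_lines, dest, useBytes = TRUE)
```

If the guess about the original encoding is wrong, iconv() will 
usually still produce something, but the accented letters will come out 
as the wrong characters, so do look at the output.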

Then when you tell R that it is encoded in UTF-8, R will read it 
properly and won't complain.
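
As far as I know, there are two usual ways to tell R: put 
\encoding{UTF-8} in the Rd file itself, or put an Encoding: UTF-8 field 
in the package's DESCRIPTION file.  Here is a small sketch of the 
first, using a made-up Rd file written to a temporary location:

```r
# Hypothetical minimal Rd file that declares its own encoding.
rd_lines <- c(
  "\\name{riviere}",
  "\\alias{riviere}",
  "\\encoding{UTF-8}",
  "\\title{Rivi\u00e8re data}",
  "\\description{Flow measurements for the Rivi\u00e8re station.}"
)

# Write the text as UTF-8 bytes, regardless of the current locale.
rd_file <- tempfile(fileext = ".Rd")
writeLines(enc2utf8(rd_lines), rd_file, useBytes = TRUE)

# parse_Rd() picks up the \encoding declaration when reading the file.
rd <- tools::parse_Rd(rd_file)
class(rd)
```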

The tools::showNonASCIIfile() function can help to find characters that 
may need fixing.  R can recognize when things are not ASCII (those bytes 
have the high bit set), but it will be up to you to figure out what 
encoding was actually used.  For French, latin1 is a good guess but it 
is not necessarily right.
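
A minimal sketch of how that search might look.  The file below is a 
made-up two-line example with one latin1 byte in it, and has_high_bit 
is just a name chosen for illustration:

```r
# Hypothetical file: line 1 contains a latin1 e-grave (byte 0xe8),
# line 2 is plain ASCII.
f <- tempfile(fileext = ".Rd")
writeBin(c(charToRaw("\\title{Rivi"), as.raw(0xe8), charToRaw("re}\n"),
           charToRaw("\\name{river}\n")), f)

# Print the lines that contain non-ASCII bytes, with their line numbers.
tools::showNonASCIIfile(f)

# The same test done by hand: a byte above 127 has the high bit set.
lines <- readLines(f, warn = FALSE)
has_high_bit <- vapply(
  lines,
  function(l) any(as.integer(charToRaw(l)) > 127L),
  logical(1),
  USE.NAMES = FALSE
)
which(has_high_bit)  # line numbers that may need fixing
```

Neither approach tells you which encoding produced those bytes; it only 
narrows down where to look.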

Duncan Murdoch