Submitting an updated package version to CRAN (Warning: non-ASCII characters)

2 messages · Luck Buttered, Brian Ripley

Dear all:

I am updating an R package that I submitted to CRAN last year and have
two questions on which I would be grateful for any input:

1) In the updated version of the package, I am adding a second example
dataset. This example dataset is a subset of a public database that
contains thousands of names. Upon running devtools::check(), I get only
one warning ("Warning: found non-ASCII strings").

The warning seems to come from special characters in some of the
names. Ideally the names should not be altered, so I am unsure
what approach to take. Should I simply include a note in my CRAN submission
indicating that the non-ASCII characters are meaningfully inherent to the
example data? Or, should I convert the names to ASCII characters (if that
is easily possible for so many cases), and indicate to users that names
have been altered (special characters removed)?

2) I have never submitted an updated version of a package to CRAN. I am
considering following a similar process to what I did to submit my original
version of the package to CRAN. That is, using devtools::release() and
including a note in a file called cran-comments.md to indicate that this is
not an original version submission, but rather, an updated version
submission. I found this advice on Hadley Wickham's site (
http://r-pkgs.had.co.nz/release.html), but could not determine whether it
applies to version update submissions as well.

Thank you for sharing any advice!
1 day later
On 21/05/2016 21:25, Luck Buttered wrote:
You should follow the advice of the manual:
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues

There is not enough detail here to know what you currently do (let
alone what you should do), but that message indicates that the encoding
of non-ASCII strings (what you call 'special characters') has not been
declared (and to be portable they should be in UTF-8).
There is a list for discussing package preparation, r-package-devel.
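
As an illustration of the encoding advice above, here is a minimal sketch in base R of how one might locate the non-ASCII strings in a character column and make sure they are converted to, and marked as, UTF-8 before the dataset is saved. The data frame names_df, its column name, and the helper has_non_ascii() are hypothetical stand-ins and not part of the package discussed in this thread. Whether this step alone satisfies the check depends on how the real data were read in, so treat it as a starting point rather than a definitive fix.

```r
# Hypothetical example data standing in for the real dataset of names.
# The \u escapes yield strings that R marks as UTF-8.
names_df <- data.frame(
  name = c("Zoe", "Zo\u00eb", "Jos\u00e9", "Smith"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a string contains any byte above 0x7F,
# i.e. any character outside the ASCII range.
has_non_ascii <- function(x) {
  vapply(
    x,
    function(s) any(as.integer(charToRaw(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

non_ascii <- has_non_ascii(names_df$name)

# Inspect the affected entries and the encoding R has recorded for each string.
print(names_df$name[non_ascii])
print(Encoding(names_df$name))

# Convert to UTF-8 (a no-op for strings already marked UTF-8) so that the
# declared encoding is explicit before saving the data into the package.
names_df$name <- enc2utf8(names_df$name)
```

After this step the dataset can be saved again, for example with save() into the package's data/ directory, and devtools::check() re-run to see whether the message changes. The names themselves are left unaltered; only their declared encoding is made explicit.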