Hello everybody,

I'm testing my package with the devtools::check() function and got a warning about non-ASCII strings being found.

These characters are in a data frame and, since they are names of institutions used to filter databases, it makes no sense to translate them.

Is there any way to make the check accept these characters? They are in latin1 encoding.

Thanks in advance!

--
*Igor Laltuf Marques*
Economist (UFF)
Master in urban and regional planning (IPPUR-UFRJ)
Researcher at ETTERN and CiDMob
https://igorlaltuf.github.io/
[R-pkg-devel] Non-ASCII and CRAN Checks
4 messages · Igor L, Neal Fultz, Hadley Wickham +1 more
This happened to me this summer when working on the recent US census; I came up with two possible solutions:

1. Re-encode the column to UTF-8. Example:

    Encoding(puertoricocounty20$NAME) <- "latin1"
    puertoricocounty20$NAME <- iconv(puertoricocounty20$NAME, "latin1", "UTF-8")

2. Use gsub to replace all n-tildes with regular n's.

- Neal
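For anyone who wants to try both options without the census data, here is a minimal sketch on a made-up one-element vector. The names x, x_utf8 and x_ascii and the sample value are assumptions for illustration only; the string is built from raw bytes so the snippet does not depend on how this message itself is encoded.

```r
# Hypothetical sample: "Pe\u00f1uelas" stored as latin1 bytes (0xF1 is n-tilde in latin1).
x <- rawToChar(as.raw(c(0x50, 0x65, 0xF1, 0x75, 0x65, 0x6C, 0x61, 0x73)))

# Option 1: declare the current encoding, then convert to UTF-8.
Encoding(x) <- "latin1"
x_utf8 <- iconv(x, from = "latin1", to = "UTF-8")
Encoding(x_utf8)    # "UTF-8"
validUTF8(x_utf8)   # TRUE

# Option 2: replace n-tilde with a plain n, leaving a pure ASCII string.
# The pattern is written as a \u escape so this source stays ASCII too.
x_ascii <- gsub("\u00f1", "n", x_utf8, fixed = TRUE)
x_ascii             # "Penuelas"
```

Option 1 keeps the original spelling, so it is probably the better fit when the strings are used as lookup keys; option 2 changes the data.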
On Mon, Sep 19, 2022 at 12:53 PM Igor L <igorlaltuf at gmail.com> wrote:
______________________________________________ R-package-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
1 day later
In my experience this NOTE does not interfere with CRAN submission, and you can ignore it.

Hadley
http://hadley.nz
Leaving data in the wrong encoding is leaving a bug around, waiting to surface.

Is the data correctly encoded as Latin1 (ISO 8859-1), Windows 8-bit (codepage 1252, also sometimes referred to as Latin1), or some Unicode encoding (likely UTF-8)?

Mapping characters between the traditional 8-bit character sets is not much of an issue (they do very little interpretation), but going in and out of Unicode with incorrectly encoded data can leave non-characters (a concept that does not exist in 8-bit character sets) in your data, and these bite you much later when other systems require the data to be valid Unicode.

I also saw some extremely odd behaviour from R around the beginning of the year when some Unicode accented characters got into variable names, and data frame access got quite weird.

Greg
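One way to approach the "which encoding is it really?" question is to inspect the bytes before converting anything. The following is only a sketch under assumptions: the sample bytes and the names bytes, s, in_c1_range and s_utf8 are made up for illustration, and it checks just two things, namely whether the bytes already form valid UTF-8 and whether any byte falls in 0x80-0x9F, the range where ISO 8859-1 and codepage 1252 disagree.

```r
# Hypothetical input: bytes of unknown encoding ("S\u00e3o" if read as latin1 or CP1252).
bytes <- as.raw(c(0x53, 0xE3, 0x6F))
s <- rawToChar(bytes)

# Step 1: is it already valid UTF-8? Here 0xE3 would have to start a
# multi-byte sequence, but 0x6F is not a continuation byte.
validUTF8(s)        # FALSE

# Step 2: do any bytes fall in 0x80-0x9F? Those are control codes in
# ISO 8859-1 but printable characters in codepage 1252.
b <- as.integer(bytes)
in_c1_range <- any(b >= 0x80 & b <= 0x9F)
in_c1_range         # FALSE, so both 8-bit readings agree for this sample

# Step 3: convert explicitly from the encoding you settled on.
s_utf8 <- iconv(s, from = "latin1", to = "UTF-8")
validUTF8(s_utf8)   # TRUE
utf8ToInt(s_utf8)   # 83 227 111
```

If step 2 had returned TRUE, the data would more likely be codepage 1252 than true ISO 8859-1, and the from argument in step 3 would need to reflect that.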
On Wed, 21 Sept 2022 at 10:04, Hadley Wickham <h.wickham at gmail.com> wrote: