translating HTML character entities to accented characters
Thanks, David. I need an all-R solution for this, because the author.csv file is exported from a database that enforces the HTML encoding, and the import into R may have to be repeated several times as the database is updated.
-Michael
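An all-R approach of the kind asked for here could be sketched with a small lookup table in base R. This is only a sketch under stated assumptions: it assumes the export uses named entities such as &egrave; (the archived examples appear already rendered, so the exact form is not visible), and `html_entities` and `decode_entities` are hypothetical names, not functions from any package. The table covers only the characters seen in the examples in this thread.

```r
# Hypothetical lookup table: named HTML entity -> accented character.
# Assumption: the database writes named entities; only the ones needed
# for the examples in this thread are listed, so extend it as required.
html_entities <- c(
  "&egrave;" = "\u00e8",  # e with grave accent
  "&eacute;" = "\u00e9",  # e with acute accent
  "&uuml;"   = "\u00fc",  # u with umlaut
  "&ocirc;"  = "\u00f4",  # o with circumflex
  "&auml;"   = "\u00e4",  # a with umlaut
  "&szlig;"  = "\u00df",  # sharp s
  "&amp;"    = "&"        # kept last so decoded text is not decoded twice
)

# Hypothetical helper: replace each entity in a character vector.
# fixed = TRUE treats both pattern and replacement as literal text.
decode_entities <- function(x, table = html_entities) {
  for (ent in names(table)) {
    x <- gsub(ent, table[[ent]], x, fixed = TRUE)
  }
  x
}

# Example input written with the assumed named entities
lname <- c("Fr&egrave;re de Montizon", "Lumi&egrave;re",
           "Ni&eacute;pce", "S&uuml;ssmilch")
decode_entities(lname)
```

The result is marked as UTF-8; if latin1 is required, something like `iconv(decode_entities(lname), from = "UTF-8", to = "latin1")` should convert it. Because the function works on a plain character vector, it can be rerun on each fresh import of the file.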
On 8/10/2012 12:40 PM, David L Carlson wrote:
It's not quite an R solution, but I just pasted your examples into a script window in R and saved it as chars.html. Then I opened it in Firefox and pasted the results here (with returns inserted to match your original).
grep("&", author$lname, value=TRUE)
[1] "Frère de Montizon" "Lumière"
[3] "Lumière"           "Niépce"
[5] "Süssmilch"         "Schüpbach"
grep("&", author$birthplace, value=TRUE)
[1] "Marbach, Württemberg"
[2] "Côte-d'Or"
[3] "Chalon-sur-Saône, Saône-et-Loire"
[4] "Groß Särchen, Germany"
apropos("HTML")
For a CSV file you would want to preserve the lines by adding <br> to the end of each line first.
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
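The file-preparation step David describes (append <br> to each line, save as chars.html) might look like the sketch below. The input strings are illustrative and assume named entities; for a real file the lines would come from something like `readLines("author.csv")`. The output goes to a temporary directory here purely so the example is self-contained.

```r
# Illustrative lines, as they might appear in the exported CSV
# (assumption: entities are written in named form)
x <- c("Marbach, W&uuml;rttemberg", "C&ocirc;te-d'Or")

# Append <br> so the browser keeps one record per displayed line
html_lines <- paste0(x, "<br>")

# Write the file; tempdir() is used only to keep the example self-contained
out <- file.path(tempdir(), "chars.html")
writeLines(html_lines, out)

# browseURL(out)  # would open the file in the default browser
```

The decoded text still has to be copied back out of the browser by hand, which is why this route does not meet the all-R requirement raised in the reply.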
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Friendly
Sent: Friday, August 10, 2012 11:15 AM
To: R-help
Subject: [R] translating HTML character entities to accented characters

I've imported a .csv file where character strings that contained accented characters were written as HTML character entities. Is there a function that works on a vector to translate them back to accented (latin1) characters? Some examples:
> grep("&", author$lname, value=TRUE)
[1] "Frère de Montizon" "Lumière"
[3] "Lumière"           "Niépce"
[5] "Süssmilch"         "Schüpbach"
> grep("&", author$birthplace, value=TRUE)
[1] "Marbach, Württemberg"
[2] "Côte-d'Or"
[3] "Chalon-sur-Saône, Saône-et-Loire"
[4] "Groß Särchen, Germany"
> apropos("HTML")
thx,
-Michael
--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web: http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.