unicode&pdf font problem RESOLVED
I have many German umlauts in my data sets and encode them as UTF-8. When plotting to PDF, I found that "CP1257" is a good choice for outputting umlauts. I have no experience with "CP1250", but maybe this small hint helps:

pdf(file=paste(sharepath, "/filename.pdf", sep=""), 9, 6, pointsize = 11, family = "Helvetica", encoding = "CP1257")

*S*
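A minimal sketch of the call above with named arguments, assuming a session in which the UTF-8 escapes below are valid. The variable `out_dir` stands in for `sharepath` and points at a temporary directory, and the label text is made up for illustration:

```r
# Hypothetical output location; the original used a variable called sharepath.
out_dir  <- tempdir()
out_file <- file.path(out_dir, "filename.pdf")

# The positional 9 and 6 in the original call are the width and height in inches.
pdf(file = out_file, width = 9, height = 6, pointsize = 11,
    family = "Helvetica", encoding = "CP1257")
plot(1, type = "n")
# German umlauts and sharp s, written as escapes so the script stays ASCII.
text(1, 1, "\u00e4 \u00f6 \u00fc \u00df")
dev.off()
```

Opening the resulting file in a PDF viewer is the only reliable check that the glyphs come out as intended on a given machine.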
On 11-01-13 16:17, tdenes at cogpsyphy.hu wrote:
Date: Thu, 13 Jan 2011 16:17:04 +0100 (CET)
From: tdenes at cogpsyphy.hu
To: David Winsemius <dwinsemius at comcast.net>
Cc: r-help at r-project.org
Subject: Re: [R] unicode&pdf font problem RESOLVED

Dear David,

Thank you for your efforts. Inspired by your remarks, I started a new google-search and found this:
http://stackoverflow.com/questions/3434349/sweave-not-printing-localized-characters

SO HERE COMES THE SOLUTION (it works on both OSs):

pdf.options(encoding = "CP1250")
pdf()
plot(1,type="n")
text(1,1,"\U0171")
dev.off()

CP1250 should work for all Central-European languages:
http://en.wikipedia.org/wiki/Windows-1250

Thank you again,
Denes
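Since pdf.options() changes the default for every later pdf() call in the session, it may be worth restoring the previous value afterwards. A sketch of that pattern, assuming a temporary output file (the name `out_file` is only illustrative) and the four Hungarian double-acute letters as test text:

```r
# pdf.options() returns the previous settings (invisibly) when it changes one.
old <- pdf.options(encoding = "CP1250")

out_file <- tempfile(fileext = ".pdf")  # hypothetical output file
pdf(out_file)
plot(1, type = "n")
# o and u with double acute, lower and upper case.
text(1, 1, "\u0151 \u0171 \u0150 \u0170")
dev.off()

# Put the previous default encoding back.
pdf.options(encoding = old$encoding)
```

Passing encoding = "CP1250" directly to pdf() should have the same effect for a single file without touching the session defaults.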
On Jan 13, 2011, at 7:01 AM, tdenes at cogpsyphy.hu wrote:
Hi! Sorry for the missing specs, here they are:
version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          12.1
year           2010
month          12
day            16
svn rev        53855
language       R
version.string R version 2.12.1 (2010-12-16)

OS: Windows 7 (English version, 32 bit)
You are after what Adobe calls: udblacute; 0171. It is recognized in the list of adobe glyphs:
str(tools::Adobe_glyphs[371, ])
'data.frame': 1 obs. of 2 variables:
$ adobe : chr "udblacute"
$ unicode: chr "0171"
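The row index 371 may not be the same in every R version, so looking the glyph up by its code point is a more portable variant of the same check. A small sketch, assuming only that tools::Adobe_glyphs has the two columns shown above:

```r
glyphs <- tools::Adobe_glyphs

# Select by code point rather than by row position.
hits <- glyphs[glyphs$unicode == "0171", ]
print(hits)
```

More than one Adobe name can map to the same code point, so the result may have several rows.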
Consulted the help pages
points {graphics}
postscript {grDevices}
pdf {grDevices}
charsets {tools}
postscriptFonts {grDevices}
I have tried a variety of the pdfFonts installed on my Mac without
success. You can perhaps make a list of fonts on your machines with
names(pdfFonts()). Perhaps the range of fonts and the glyphs they
contain is different on your machines. I consistently get warning
messages saying there is a conversion failure:
pdf("trial.pdf", family="Helvetica")
# also tried with font="Helvetica" but I think that is erroneous
plot(1,type="n")
text(1,1,"print \U0170\U0171")
Warning messages:
1: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
2: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b0>
3: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
4: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b1>
5: In text.default(1, 1, "print Űű") :
  font metrics unknown for Unicode character U+0170
6: In text.default(1, 1, "print Űű") :
  font metrics unknown for Unicode character U+0171
7: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
8: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b0>
9: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <c5>
10: In text.default(1, 1, "print Űű") :
  conversion failure on 'print Űű' in 'mbcsToSbcs': dot substituted for <b1>

And this is despite my system saying that \U0170 and \U0171 are present in the Helvetica font. I also tried family=URWHelvetica and family=NimbusSan and a bunch of others without success, but my last best hope after reading the material in help(postscript) in the "Families" section had been NimbusSan. There is also information on that page regarding encodings that appears to be very machine specific.
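The <c5>, <b0> and <b1> in these warnings are the UTF-8 bytes of the two characters, which the device could not map into its single-byte encoding. A short sketch that makes the bytes visible, assuming the local iconv supports the name "CP1250" (this can vary by platform):

```r
x <- enc2utf8("\u0170\u0171")

# UTF-8 bytes: c5 b0 for U+0170 and c5 b1 for U+0171,
# the same bytes that the warnings report.
utf8_bytes <- charToRaw(x)
print(utf8_bytes)

# In a single-byte encoding that contains both letters, each becomes one byte.
cp1250_bytes <- iconv(x, from = "UTF-8", to = "CP1250", toRaw = TRUE)[[1]]
print(cp1250_bytes)
```

An encoding that lacks these letters has no single byte to offer for them, which is consistent with the "dot substituted" messages.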
Note that \U0171 != ??. See
http://www.fileformat.info/info/unicode/char/171/index.htm

Anyway, I have no problem with ű (~u") and other special Hungarian characters in my R-Gui. It is correctly displayed in the console, in plots, etc. The problem is with the pdf conversion. The same holds for my Ubuntu Hardy Heron system*, with exactly the same error messages as reported in an earlier thread:
http://www.mail-archive.com/r-help at r-project.org/msg89792.html

As far as I know, Hershey fonts do not contain \U0171.

Regards,
Denes

* The specs of Ubuntu:
version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          12.0
year           2010
month          10
day            15
svn rev        53317
language       R
version.string R version 2.12.0 (2010-10-15)
On Jan 12, 2011, at 11:11 PM, tdenes at cogpsyphy.hu wrote:
Dear List,
I would like to print a plot into pdf. The problem is that the character
\U0171 is replaced by a simple 'u' (i.e. without accents) in the pdf file.
Example:
# this works fine
plot(1,type="n")
text(1,1,"print \U0171")
# this fails
pdf("trial.pdf")
plot(1,type="n")
text(1,1,"print \U0171")
dev.off()
Have you tried:
pdf("trial.pdf")
plot(1,type="n")
text(1,1,"print ??")
dev.off()
Your default screen fonts may not be the same as your default pdf
fonts. A lot depends on system specifics, none of which have you
provided.
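One way to see what the pdf device will use by default, as opposed to the screen device, is to inspect its options and registered font families. A brief sketch using only grDevices functions; the values printed depend on the machine:

```r
opts <- pdf.options()

# Default font family and encoding used by pdf() in this session.
print(opts$family)
print(opts$encoding)

# Font families currently registered for the pdf device.
print(names(pdfFonts()))
```

Including this output in a question, together with the output of version, covers most of the system specifics asked for here.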
I found an earlier post at http://www.mail-archive.com/r-help at r-project.org/msg65541.html, but it is too hard to understand at my R-level. Any help is appreciated.
David Winsemius, MD West Hartford, CT
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Sascha Vieweg, saschaview at gmail.com