Using unicode from C interface of R
4 messages · Duncan Murdoch, Brian Ripley, Sandip Nandi
On 14-01-21 5:41 PM, Sandip Nandi wrote:
Hi, I am using the C interface of R. If a Unicode string is read, in what format can I pass it back to R? I was trying the following:
    tpStr = (char *)val;
    SET_STRING_ELT(innerList, 0, mkChar(tpStr));
It does not work. If I pass it back to R as RAW, what package is there to read it? I mean a package for interpreting RAW data.
There are a number of encodings for Unicode. Most Unix systems use UTF-8, Windows uses UTF-16 for some things, etc. If your string is known to be in UTF-8, that's easiest: just use mkCharCE instead of mkChar, as described in Writing R Extensions. If it is in UTF-16 you might have more trouble because of possible embedded 0 bytes. Translate to UTF-8 first using C facilities like WideCharToMultiByte.
Duncan Murdoch
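The embedded-zero problem can be seen without R at all. The sketch below is plain C with no R headers; the byte arrays and `c_string_length` are illustrative names, not part of any API. It encodes the two characters "Hé" both ways and measures each buffer as a NUL-terminated `char *`, which is how `mkChar` and `mkCharCE` find the end of their argument.

```c
#include <stddef.h>
#include <string.h>

/* The same two characters, "H" and "é" (U+00E9), in two encodings. */
const unsigned char utf8_bytes[]    = { 0x48, 0xC3, 0xA9, 0x00 };
const unsigned char utf16le_bytes[] = { 0x48, 0x00, 0xE9, 0x00, 0x00, 0x00 };

/* Length as seen by any function that takes a NUL-terminated char *:
   counting stops at the first zero byte. */
size_t c_string_length(const unsigned char *bytes)
{
    return strlen((const char *)bytes);
}
```

Here `c_string_length(utf8_bytes)` is 3, the whole string, while `c_string_length(utf16le_bytes)` is 1, because the high byte of "H" in UTF-16LE is already zero. For input that is known to be UTF-8, the call from the original question would become `SET_STRING_ELT(innerList, 0, mkCharCE(tpStr, CE_UTF8));`.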
On 22/01/2014 00:08, Duncan Murdoch wrote:
[...] If it is in UTF-16 you might have more trouble because of possible embedded 0 bytes. Translate to UTF-8 first using C facilities like WideCharToMultiByte.
That function is Windows-only (and 'wide char' differs by platform, including whether it is known to be any Unicode encoding at all). All platforms have Riconv: see 'Writing R Extensions'. C11 has other ways to do this, but they are not widely implemented.
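A rough sketch of the conversion step follows. It is written against the system `iconv` from `<iconv.h>` so that it can be compiled outside R; that substitution is an assumption made for illustration. Inside a package one would include `R_ext/Riconv.h` and call `Riconv_open`, `Riconv` and `Riconv_close`, which follow the same calling pattern (with `Riconv` taking a `const char **` input pointer). The function name `utf16le_to_utf8`, the choice of UTF-16LE as the source encoding, and the caller-supplied output buffer are hypothetical.

```c
#include <iconv.h>
#include <stddef.h>

/* Convert in_len bytes of UTF-16LE into NUL-terminated UTF-8 in out.
   Returns the number of UTF-8 bytes written (excluding the terminator),
   or -1 if the converter is unavailable, the input is invalid or
   incomplete, or out_cap is too small. */
long utf16le_to_utf8(const char *in, size_t in_len, char *out, size_t out_cap)
{
    if (out_cap == 0)
        return -1;

    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");  /* (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;         /* POSIX iconv wants char **, not const */
    char *dst = out;
    size_t src_left = in_len;       /* explicit length: zero bytes are data */
    size_t dst_left = out_cap - 1;  /* keep one byte for the terminator */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *dst = '\0';
    return (long)(dst - out);
}
```

Because the input length is passed explicitly, zero bytes inside the UTF-16 data are treated as data rather than as a terminator. The resulting buffer is ordinary NUL-terminated UTF-8 and could then be handed to `mkCharCE(out, CE_UTF8)`.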
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595