blank space escape sequence in R?

11 messages · Mark Heckmann, Duncan Murdoch, Jan van der Laan +3 more

Mon, Apr 25, 2011 6:01 AM #

Is there a blank space escape sequence in R, i.e. something like \sp etc. to produce a blank space?

TIA
Mark
Mark Heckmann
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com

Mon, Apr 25, 2011 6:05 AM #

On 25/04/2011 9:01 AM, Mark Heckmann wrote:

You need to give some context.  A blank in a character vector will be 
printed as a blank, so you are probably talking about something else, 
but what?

Duncan Murdoch

Mon, Apr 25, 2011 6:13 AM #

An embedded text with an undefined character set was scrubbed.
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110425/336a6b29/attachment.pl>

Mon, Apr 25, 2011 6:28 AM #

On 25/04/2011 9:13 AM, Mark Heckmann wrote:

I don't think R has anything like that built in.   You'll need to attach 
a class to your vector of strings, and write a print method for it that 
does the substitution before printing.
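
A minimal sketch of that approach, for illustration only. The token "\sp" is taken from the original question as a hypothetical placeholder, and the names blank_token, as_spaced and the class "spaced" are assumptions, not anything built into R:

```r
# Hypothetical placeholder token: the three characters \, s, p.
blank_token <- "\\sp"

# Tag a character vector with a class so that print() dispatches to our method.
as_spaced <- function(x) {
  structure(x, class = "spaced")
}

# Print method: substitute a real space for the token, then print plain strings.
print.spaced <- function(x, ...) {
  shown <- gsub(blank_token, " ", unclass(x), fixed = TRUE)
  print(shown, ...)
  invisible(x)
}

s <- as_spaced(c("foo\\spbar", "no token here"))
print(s)   # the token is shown as a blank; the stored strings are unchanged
```

Only the printed form changes here; the underlying vector still contains the token, so comparisons and nchar() see the original characters.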

Duncan Murdoch

Jan van der Laan

Mon, Apr 25, 2011 7:35 AM #

There exists a non-breaking space:

http://en.wikipedia.org/wiki/Non-breaking_space

Perhaps you could use this. In R on Linux under gnome-terminal I can 
enter it with CTRL+SHIFT+U00A0. This seems to work: it prints as a 
space, but is not equal to ' '. I don't know if there are any 
difficulties using, for example, utf8 encoding in source files (which 
you'll probably need).
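
A small sketch of the "prints as a space, but is not equal to ' '" behaviour. It writes the character as the Unicode escape "\u00A0" so that no special keyboard input or source-file encoding is needed; the variable names are illustrative:

```r
nbsp  <- "\u00A0"   # NO-BREAK SPACE, written as a Unicode escape
space <- " "        # ordinary space, U+0020

same_char  <- nbsp == space        # FALSE: two different characters
code_nbsp  <- utf8ToInt(nbsp)      # 160, i.e. 0xA0
code_space <- utf8ToInt(space)     # 32, i.e. 0x20

x <- paste0("abc", nbsp, "def")
has_plain_space <- grepl(" ", x, fixed = TRUE)   # FALSE: x has no ordinary space

print(c(same_char, has_plain_space))
print(c(code_nbsp, code_space))
```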

Jan

On 04/25/2011 03:28 PM, Duncan Murdoch wrote:

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Mike Miller

Mon, Apr 25, 2011 8:26 AM #

On Mon, 25 Apr 2011, Mark Heckmann wrote:

Is it possible to use \x20 or some similar way to evoke the hexadecimal 
ascii form of blank?  That works in perl as does \040 for the octal form.

Mike

Mon, Apr 25, 2011 10:24 AM #

You can embed hex escapes in strings (except \x00). The value(s) that
you embed will depend on the character encoding used on your platform. If
this is UTF-8, or some other ASCII-compatible encoding, \x20 will work:

[1] "foo bar"

For other locales, you might try charToRaw(" ") to see the binary (hex)
representation for the space character on your platform, and substitute
this sequence instead.
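
A short sketch of the escapes discussed above, assuming an ASCII-compatible encoding such as UTF-8. The octal form asked about earlier in the thread is included as well; the variable names are illustrative:

```r
hex_str <- "foo\x20bar"   # hex escape for the space character
oct_str <- "foo\040bar"   # octal escape for the same byte

print(hex_str)            # [1] "foo bar"
print(oct_str)            # [1] "foo bar"

# Byte value of a space on this platform (20 in ASCII-compatible encodings).
space_raw <- charToRaw(" ")
print(space_raw)
```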

On Mon, 2011-04-25 at 15:01 +0200, Mark Heckmann wrote:

Mon, Apr 25, 2011 10:42 AM #

I may have misread your original email. Whether you use a hex escape or
a space character, the resulting string in memory is identical:

[1] TRUE

But, if you were to read a file containing the six characters "a\x20b"
(say with readLines), then the six characters would be read into
memory, and printed like this:

"a\\x20b"

That is, not with a space character substituted for \x20. So, now I'm
not sure this is a solution.
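
The difference can be sketched as follows. The temporary file and the variable names are illustrative, and the final gsub() step is only one possible way to do the substitution explicitly:

```r
# In source code the parser converts the escape, so both literals are one string.
same_in_memory <- identical("a\x20b", "a b")
print(same_in_memory)

# A file holding the six literal characters a \ x 2 0 b is a different matter.
tmp <- tempfile()
writeLines("a\\x20b", tmp)    # "\\" in source is a single backslash in the file
from_file <- readLines(tmp)
unlink(tmp)

n_chars <- nchar(from_file)   # 6: no escape processing happens on read
print(from_file)              # printed with the backslash doubled

# If a space is wanted, the substitution has to be done explicitly.
converted <- gsub("\\x20", " ", from_file, fixed = TRUE)
print(converted)
```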

On Mon, 2011-04-25 at 12:24 -0500, Matt Shotwell wrote:

Petr Savicky

Mon, Apr 25, 2011 10:55 AM #

On Mon, Apr 25, 2011 at 04:37:15PM +0200, Jan van der Laan wrote:

This character may be specified as "\u00A0".

  a <- "abc\u00A0def"
  a
  [1] "abc def"
  
The UTF-8 representation of the resulting string is

  charToRaw(a)
  [1] 61 62 63 c2 a0 64 65 66

Using the Unicode package, the string may be analyzed as follows:

  library(Unicode)
  u_char_inspect(as.u_char_seq(a, ""))

      Code                 Name Char
  1 U+0061 LATIN SMALL LETTER A    a
  2 U+0062 LATIN SMALL LETTER B    b
  3 U+0063 LATIN SMALL LETTER C    c
  4 U+00A0       NO-BREAK SPACE     
  5 U+0064 LATIN SMALL LETTER D    d
  6 U+0065 LATIN SMALL LETTER E    e
  7 U+0066 LATIN SMALL LETTER F    f

Hope this helps.

Petr Savicky.

Mon, Apr 25, 2011 11:09 AM #

An embedded text with an undefined character set was scrubbed.
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110425/233f7cde/attachment.pl>