regular expression in gsub() for strings with leading backslash

6 messages · Duncan Murdoch, Mike Miller, Miao

#
On 29/04/2011 7:41 PM, Miao wrote:
If those are R strings, none of them contains a backslash.  When R prints 
a string, a backslash is always shown as \\.

\x introduces a hexadecimal escape for a character; the next two 
characters are the hex digits.  So your first string contains a single 
character \xa0, the third one contains \xab, and so on.

The \023 is an octal escape for a single character (decimal 19).
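
The points above can be checked directly. This is only a minimal sketch; 
the variable names are made up for illustration, and byte-level functions 
are used so the result does not depend on the locale.

```r
# "\xa0" is one byte (0xa0), not a backslash followed by "xa0".
one_byte <- "\xa0"
hex_len  <- nchar(one_byte, type = "bytes")
hex_raw  <- charToRaw(one_byte)

# "\023" is an octal escape: a single byte whose value is octal 23.
octal_value <- as.integer(charToRaw("\023"))

# A real backslash has to be typed as "\\"; it is still one character.
bs     <- "\\"
bs_len <- nchar(bs)

print(bs)      # print() shows the escaped form
cat(bs, "\n")  # cat() shows the actual content
```

Comparing the print() and cat() output is usually the quickest way to see 
whether a string really contains a backslash.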

Duncan Murdoch
#
On 29/04/2011 9:34 PM, Miao wrote:
I don't know.  This might work:

gsub("[\x01-\x1f\x7f-\xff]", "", x)

(i.e. the range from character 1 to character 31, and 127 to 255) but I 
don't know if our regular expression matcher will accept those characters.
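
A hedged sketch of the same idea follows. It is not the exact call above: 
the pattern uses regex-level escapes (\\x01 rather than \x01), and 
perl = TRUE and useBytes = TRUE are added as assumptions so that the 
matching is done byte by byte whatever the locale. The test string is 
invented for illustration.

```r
# Bytes: "a", 0x01, "b", 0xa0, "c", 0x13 (octal 023), "d", 0x7f
x <- rawToChar(as.raw(c(0x61, 0x01, 0x62, 0xa0, 0x63, 0x13, 0x64, 0x7f)))

# Remove bytes 1-31 and 127-255, treating the input as raw bytes.
cleaned <- gsub("[\\x01-\\x1f\\x7f-\\xff]", "", x,
                perl = TRUE, useBytes = TRUE)

print(cleaned)
```

Only the printable ASCII characters of x should remain in cleaned.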

Duncan Murdoch
#
On Fri, 29 Apr 2011, Duncan Murdoch wrote:

            
If we were dealing with strings that start with a backslash, I guess this 
would blank them out:

gsub("^\\\\.*", "", txt)

R would display a double backslash, but I believe that represents a single 
backslash.  So if the string were saved using write.table, say, only a 
single backslash would be stored.

[1] "\\This is a string."
[1] "This is a string."
[1] "\\This is a string."
[1] ""
[1] ""               "Another string" ""
$ cat a.txt
\This is a string.
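
The input lines for the output above are not shown. The following is only 
a guess at commands that give comparable results; the vector contents, the 
temporary file, and the write.table arguments are all assumptions.

```r
txt <- "\\This is a string."           # one leading backslash
print(txt)                              # displayed in escaped form

stripped <- sub("^\\\\", "", txt)       # drop only the leading backslash
emptied  <- gsub("^\\\\.*", "", txt)    # drop the whole string

# Hypothetical vector: two elements start with a backslash, one does not.
vec     <- c("\\First", "Another string", "\\Third")
vec_out <- gsub("^\\\\.*", "", vec)

# Save without quoting, then read the file back to see what was stored.
path <- tempfile(fileext = ".txt")
write.table(txt, file = path, quote = FALSE,
            row.names = FALSE, col.names = FALSE)
on_disk <- readLines(path)
cat(on_disk, "\n")                      # a single backslash on disk
```

Reading the file back with readLines() returns the same one-backslash 
string, which matches what cat shows in the shell.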

Apparently this is not what the OP really wanted.  The OP probably wanted 
to remove characters that were not from the regular ASCII set.


Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota