
on gsub (simple, but not to me!) syntax

5 messages · Ottorino-Luca Pantani, Duncan Murdoch, David Winsemius +1 more

#
Dear R users,
my problem today stems from my ignorance of regular expressions,
something I only recently discovered.

Consider the following

foo <-
c("V_7_101110_V",  "V_7_101110_V",  "V_9_101110_V",  "V_9_101110_V",
"V_9_s101110_V",  "V_9_101110_V",  "V_9_101110_V",  "V_11_101110_V",
"V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V",
"V_17_101110_V", "V_17_101110_V")

what I'm trying to obtain is to add a zero in front of numbers below 10,
as in

c("V_07_101110_V",  "V_07_101110_V",  "V_09_101110_V",  "V_09_101110_V",
"V_09_101110_V",  "V_09_101110_V",  "V_09_101110_V",  "V_11_101110_V",
"V_11_101110_V", "V_11_101110_V", "V_11_101110_V", "V_11_101110_V",
"V_17_101110_V", "V_17_101110_V")


I'm able to do this in an Emacs buffer with query-replace-regexp

C-M-%
search for
V_\(.\)_
and substitute with
V_0\1_

but I have no idea how to do it with gsub in R,
and the help page is quite complicated to understand
(at least to me, at this moment in time)

I can search the vector through
grep("V_._",  foo)

but I always get an error with
gsub('V_\(.\)_', 'V_0\1_', foo)


or I don't get what I'm looking for with
gsub('V_._', 'V_0._', foo)
gsub('V_._', 'V_0\1_', foo)

Thanks in advance
#
On 11/16/2009 8:21 AM, Ottorino-Luca Pantani wrote:
You were close.  First, gsub by default doesn't need escapes before the 
parens.  (There are lots of different conventions for regular 
expressions, unfortunately.)  So the Emacs regular expression V_\(.\)_ 
is entered as "V_(.)_" in the default version of gsub().  Second, to 
enter a backslash into a string, you need to escape it.  So the 
replacement pattern V_0\1_ is entered as "V_0\\1_".  So

gsub("V_(.)_", "V_0\\1_", foo)

should give you what you want.

Duncan Murdoch
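
The two points in the reply above can be tried out with a short sketch. It uses a shortened stand-in for the vector from the question; the names foo_small and fixed are made up for this illustration. As background, the original attempt 'V_\(.\)_' fails before gsub() even runs, because "\(" is not a recognized escape in an R string literal.

```r
# Shortened stand-in for the vector in the question (hypothetical name)
foo_small <- c("V_7_101110_V", "V_9_s101110_V", "V_11_101110_V")

# The replacement is typed with a doubled backslash in R source,
# but the string that gsub() receives contains a single backslash
n_chars <- nchar("V_0\\1_")   # counts V _ 0 \ 1 _
cat("V_0\\1_", "\n")          # displays the string as gsub() sees it

# Unescaped parentheses form the group; "\\1" refers back to it
fixed <- gsub("V_(.)_", "V_0\\1_", foo_small)
print(fixed)
```

The two-digit entry is left alone because "V_(.)_" requires exactly one character between the underscores.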
#
On Nov 16, 2009, at 8:21 AM, Ottorino-Luca Pantani wrote:

            
Any of these (the need to double the "\\" in the back-reference
seems to be the main issue):
 > gsub("_([[:digit:]])_", "_0\\1_", foo)
 [1] "V_07_101110_V"  "V_07_101110_V"  "V_09_101110_V"  "V_09_101110_V"  "V_09_s101110_V"
 [6] "V_09_101110_V"  "V_09_101110_V"  "V_11_101110_V"  "V_11_101110_V"  "V_11_101110_V"
[11] "V_11_101110_V"  "V_11_101110_V"  "V_17_101110_V"  "V_17_101110_V"

 > gsub("_(\\d)_", "_0\\1_", foo)
 [1] "V_07_101110_V"  "V_07_101110_V"  "V_09_101110_V"  "V_09_101110_V"  "V_09_s101110_V"
 [6] "V_09_101110_V"  "V_09_101110_V"  "V_11_101110_V"  "V_11_101110_V"  "V_11_101110_V"
[11] "V_11_101110_V"  "V_11_101110_V"  "V_17_101110_V"  "V_17_101110_V"

 > gsub("V_(.)_", "V_0\\1_", foo)
 [1] "V_07_101110_V"  "V_07_101110_V"  "V_09_101110_V"  "V_09_101110_V"  "V_09_s101110_V"
 [6] "V_09_101110_V"  "V_09_101110_V"  "V_11_101110_V"  "V_11_101110_V"  "V_11_101110_V"
[11] "V_11_101110_V"  "V_11_101110_V"  "V_17_101110_V"  "V_17_101110_V"
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
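
One difference between the patterns in this thread may be worth seeing on its own: "." matches any single character, while [[:digit:]] matches only a digit. The sketch below uses made-up inputs (the vector x and the "V_a_" entry are hypothetical and do not come from the original data) to show where the two choices diverge.

```r
# Hypothetical inputs: a single-digit field, a non-digit field,
# and a two-digit field
x <- c("V_7_101110_V", "V_a_101110_V", "V_11_101110_V")

# "." accepts any one character between the underscores
any_char <- gsub("V_(.)_", "V_0\\1_", x)

# [[:digit:]] accepts only a single digit there
digit_only <- gsub("V_([[:digit:]])_", "V_0\\1_", x)

print(any_char)    # the "a" entry gets a zero as well
print(digit_only)  # the "a" entry is left unchanged
```

For data known to contain only numeric fields, both patterns give the same result; the digit class is simply the stricter choice.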
#
Duncan Murdoch wrote:
Actually, judging from the form of the input, sub() is more appropriate,
since only the first match in each string needs replacing, though the
performance gain seems negligible (~3%).

vQ
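
A small sketch of the sub() versus gsub() distinction, again on a shortened stand-in for the data (foo_small and two_fields are hypothetical names, and the two_fields string is invented to force a second match). It makes no claim about timing; it only shows when the two functions return different results.

```r
# Shortened stand-in for the vector in the question (hypothetical name)
foo_small <- c("V_7_101110_V", "V_9_s101110_V", "V_11_101110_V")

# Each string has at most one match, so both functions agree here
with_sub  <- sub("V_(.)_", "V_0\\1_", foo_small)
with_gsub <- gsub("V_(.)_", "V_0\\1_", foo_small)
same_result <- identical(with_sub, with_gsub)
print(same_result)

# Invented string with two matching fields: now the results differ
two_fields <- "V_7_V_9_"
first_only <- sub("V_(.)_", "V_0\\1_", two_fields)   # replaces the first match only
all_found  <- gsub("V_(.)_", "V_0\\1_", two_fields)  # replaces every match
print(first_only)
print(all_found)
```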
#
Duncan Murdoch wrote:
I suspected it had something to do with the double escape...

Thanks to you all.
R is wonderful software and R-help is always a great place to visit!!!
8rino