regular expression

4 messages · Fred G, Gabor Grothendieck, David Winsemius +1 more

#
On Wed, Feb 29, 2012 at 2:24 PM, Fred G <bayespokerguy at gmail.com> wrote:
This extracts the number that precedes each semicolon, i.e. the Surv(months) value:

# sample data
Lines <- c("98-610: Cell type: S; Surv(months): 6; STATUS(0=alive, 1=dead): 1",
"99-625: Cell type: S; Surv(months): 21; STATUS(0=alive, 1=dead): 1")

library(gsubfn)
strapply(Lines, "(\\d+);", as.numeric, simplify = TRUE)


# We can also get all numeric fields in case that is of interest:

strapply(Lines, "\\d+", as.numeric, simplify = rbind)
#
On Feb 29, 2012, at 2:24 PM, Fred G wrote:

            
Modified to be correct R code. Please emulate my example in the future.

inp <- c("98-610: Cell type: S; Surv(months): 6; STATUS(0=alive, 1=dead): 1",
"99-625: Cell type: S; Surv(months): 21; STATUS(0=alive, 1=dead): 1")

You can use either regex methods (noting that the "?" is necessary to
defeat the default greedy nature of regex matching):


 > sub( ";.+$", "", sub("^.+?;", "", inp) )
[1] " Surv(months): 6"  " Surv(months): 21"
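
To see why the "?" matters, here is a small sketch comparing the greedy and lazy forms of the inner pattern on the first input line. The names x, greedy and lazy are made up for this illustration and are not part of the code above.

```r
# Illustration only: x is a hypothetical copy of the first input line
x <- "98-610: Cell type: S; Surv(months): 6; STATUS(0=alive, 1=dead): 1"

# Greedy: ".+" runs to the LAST ";" so the Surv field is removed as well
greedy <- sub("^.+;", "", x)

# Lazy: ".+?" stops at the FIRST ";" so only the leading field is removed
lazy <- sub("^.+?;", "", x)

print(greedy)
print(lazy)
```

Only the lazy version leaves the Surv(months) field in place for the outer sub() call to isolate.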

... or you can read these as lines and pass the results to read.table
with sep=";".

 > read.table(text=inp, sep=";", stringsAsFactors=FALSE)[ ,2]
[1] " Surv(months): 6"  " Surv(months): 21"
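
If the numeric value is what is wanted rather than the whole field, one possible follow-up in base R is sketched below. It assumes the label always ends with a colon; fields and months are hypothetical names standing in for the character vector returned above.

```r
# Illustration only: fields stands in for the character vector returned above
fields <- c(" Surv(months): 6", " Surv(months): 21")

# Drop everything up to and including the colon, then convert to numeric
# (as.numeric ignores the leading space that is left behind)
months <- as.numeric(sub("^.*:", "", fields))

print(months)
```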
Please learn to post in plain text.