Message-ID: <971536df0903131313p3dd43119u996158ba512b4bbe@mail.gmail.com>
Date: 2009-03-13T20:13:19Z
From: Gabor Grothendieck
Subject: search for string insider a string
In-Reply-To: <3303FA84CE4F7244B27BE264EC4AE2A70CEF674E@panemail.panagora.com>

That might be done by splitting each string into four parts: the portion
before dtest, the portion from dtest up to but not including the number,
the number itself, and the rest. The s <- line splits each string and
returns a list; the next line binds the list into a character matrix,
replacing NULL list items (strings with no match) with NA:

> library(gsubfn)
> # x from prior post
> s <- strapply(x, "(.*)(dtest[^0-9]*)([0-9][0-9.]*)(.*)", c)
> # lapply always returns a list, so rbind gets one row per string
> do.call(rbind, lapply(s, function(x) if (is.null(x)) NA else x))
     [,1] [,2]             [,3]   [,4]
[1,] NA   NA               NA     NA
[2,] "bc" "dtestblabla"    "2.1"  "bla"
[3,] "c"  "dtestblablabla" "3.88" "blabla"
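
For comparison, here is a minimal base R sketch of the same four-way split that does not need gsubfn. It assumes the same x and the same pattern as above; the names pat, m and parts are only illustrative. regexec() returns the full match plus one entry per parenthesized group, and regmatches() gives character(0) for strings with no match, so those rows are filled with NA by hand:

```r
# Base R sketch (no gsubfn): split each string into four parts.
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")
pat <- "(.*)(dtest[^0-9]*)([0-9][0-9.]*)(.*)"

# One list element per string; character(0) where the pattern does not match.
m <- regmatches(x, regexec(pat, x))

# Drop the full match (first entry), keep the four groups, pad non-matches with NA.
parts <- do.call(rbind, lapply(m, function(p) {
  if (length(p)) p[-1] else rep(NA_character_, 4)
}))
parts
```

This should give the same kind of 3 x 4 character matrix as the strapply version, with an all-NA first row.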


On Fri, Mar 13, 2009 at 3:10 PM, Tan, Richard <RTan at panagora.com> wrote:
> That works. I want the position just for the purpose of my later manual check. Thanks a lot Gabor.
>
> -----Original Message-----
> From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
> Sent: Friday, March 13, 2009 2:18 PM
> To: Tan, Richard
> Cc: r-help at r-project.org
> Subject: Re: [R] search for string insider a string
>
> Try this. We use regexpr to get the positions and strapply puts the values in list s. The unlist statement converts NULL to NA and simplifies the list, s, to a numeric vector. For more info on strapply see http://gsubfn.googlecode.com
>
> library(gsubfn)  # strapply
>
> x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")
>
> dtest.info <- cbind(pos = regexpr("dtest", x),
>   value = { s <- strapply(x, "dtest[^0-9]*([0-9][0-9.]*)", as.numeric)
>             unlist(ifelse(sapply(s, length), s, NA))
> })
>
> > # the above may be sufficient but
> > # if it's important to NA out rows with no match add
> > dtest.info[dtest.info[,1] < 0,] <- NA
> > dtest.info
>      pos value
> [1,]  NA    NA
> [2,]   3  2.10
> [3,]   2  3.88
>
> Why do you want the position? Is there a further transformation needed?
> What is it? There may be even easier approaches to the entire problem.
>
> On Fri, Mar 13, 2009 at 12:25 PM, Tan, Richard <RTan at panagora.com> wrote:
>> Hi, sorry if this is too stupid a question, but how do I do a string
>> search in R:
>>
>> I have a dataframe A with A$test like:
>>
>> test1
>> bcdtestblabla2.1bla
>> cdtestblablabla3.88blabla
>>
>> and I want to search for a substring that starts with 'dtest' and ends
>> with a number, and return the location of that substring and the number,
>> so the end result would be:
>>
>> NA    NA
>> 3    2.1
>> 2    3.88
>>
>> I find grep can probably do this but I am new to the function so would
>> like a good example.
>>
>> Thanks,
>> Richard
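
For the original question quoted above (the position of "dtest" plus the number that follows it), a base R version might look like the sketch below. It is an alternative to the strapply approach, not a replacement for it. The vector x is copied from the thread; the names m and value are illustrative. Note that a string containing "dtest" but no following number would get a position and an NA value, which may or may not be what is wanted:

```r
# Base R sketch: position of "dtest" and the first number after it.
x <- c("test1", "bcdtestblabla2.1bla", "cdtestblablabla3.88blabla")

# regexpr() returns -1 where "dtest" is absent; turn those into NA.
pos <- regexpr("dtest", x)
pos[pos < 0] <- NA

# regexec() keeps the parenthesized group; element 2 is the captured number.
m <- regmatches(x, regexec("dtest[^0-9]*([0-9][0-9.]*)", x))
value <- vapply(m, function(p) if (length(p)) as.numeric(p[2]) else NA_real_,
                numeric(1))

# as.integer() drops the match attributes that regexpr() attaches.
dtest.info <- cbind(pos = as.integer(pos), value = value)
dtest.info
```

If A is the data frame from the question, x would be A$test (wrapped in as.character() if the column is a factor).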