Message-ID: <4969BAAF.9040108@idi.ntnu.no>
Date: 2009-01-11T09:23:59Z
From: Wacek Kusnierczyk
Subject: Extracting File Basename without Extension
In-Reply-To: <971536df0901091455x76b28a8icac00798707895f0@mail.gmail.com>

Gabor Grothendieck wrote:
> On Fri, Jan 9, 2009 at 4:28 PM, Wacek Kusnierczyk
> <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>   
>> Rau, Roland wrote:
>>     
>>> P.S. Any suggestions how to become more proficient with regular
>>> expressions? The O'Reilly book ("Mastering...")? Whenever I tried
>>> anything more complicated than basic usage (things like ^ $ * . ) in R,
>>> I was way faster to write a new function (like above) instead of finding
>>> a regex solution.
>>>
>>>       
>> the book you mention is good.
>> you may also consider http://www.regular-expressions.info/
>>
>> regexes are usually well explained with lots of examples in perl books.
>>
>>     
>>> By the way: it might be still possible to *write* regular expressions,
>>> but what about code re-use? Are there people who can easily *read*
>>> complicated regular expressions?
>>>
>>>       
>> in some cases it is possible to write regular expressions in a way that
>> facilitates reading them by a human.  in perl, for example, you can use
>> so-called readable regexes:
>>
>> /
>>   (.+)    # match and remember at least one arbitrary character
>>   [.]     # match a dot
>>   [^.]+   # match at least one non-dot character
>>   $       # end of string anchor
>> /x;
>>
>> you can also use within regex comments:
>>
>> /(.+)(?# one or more chars)[.](?# a dot)[^.]+(?# one or more non-dots)$(?# end of string)/
>>
>>
>> nothing of the sorts in r, however.
>>     
>
> Supports that if you begin the regular expression with (?x) and
> use perl = TRUE.  See ?regexp
>   

cool, i see ?xism is supported.  so the above can be written in r as:

names = c("foo.bar", ".zee")
sub("(?x)  # allow embedded comments
     (.+)  # match and remember at least one arbitrary character
     [.]   # match a dot
     [^.]+ # match at least one non-dot character
     $     # end of string anchor",
        "\\1", names, perl=TRUE)
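
for what it's worth, here is a minimal sketch of what the call above should give, next to the same pattern written compactly.  the third test name and the variable names pattern, verbose and compact are my own additions for illustration, not anything from the earlier posts:

```r
# sketch only: the extra test name and the variable names are illustrative
names = c("foo.bar", ".zee", "archive.tar.gz")

# commented pattern; (?x) requires perl=TRUE
pattern = "(?x)   # allow whitespace and embedded comments
           (.+)   # match and remember at least one arbitrary character
           [.]    # match a dot
           [^.]+  # match at least one non-dot character
           $      # end of string anchor"

verbose = sub(pattern, "\\1", names, perl=TRUE)

# the same pattern without comments, using the default regex engine
compact = sub("(.+)[.][^.]+$", "\\1", names)

print(verbose)
# [1] "foo"         ".zee"        "archive.tar"

# both forms should agree
stopifnot(identical(verbose, compact))
```

note that ".zee" is left untouched, since (.+) needs at least one character before the dot, and that only the last extension is stripped from "archive.tar.gz".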

is this what you wanted, roland?

vQ