Message-ID: <AFFA4A28-67EC-4008-955A-1477F1E479C5@me.com>
Date: 2014-08-15T16:41:42Z
From: Marc Schwartz
Subject: regex pattern assistance
In-Reply-To: <1408119535.2063.16.camel@tom-laptop>

On Aug 15, 2014, at 11:18 AM, Tom Wright <tom at maladmin.com> wrote:

> Hi,
> Can anyone please assist.
> 
> given the string 
> 
>> x<-"/mnt/AO/AO Data/S01-012/120824/"
> 
> I would like to extract "S01-012"
> 
> require(stringr)
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(.+)\\/+")
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/(\\w+)\\/+")
> 
> both nearly work. I expected I would use something like:
>> str_match(x,"\\/mnt\\/AO\\/AO Data\\/([\\w -]+)\\/+")
> 
> but I don't seem able to get the square bracket grouping to work
> correctly. Can someone please show me where I am going wrong?
> 
> Thanks,
> Tom


Is the desired substring always in the same relative position in the path?

If so:

> strsplit(x, "/")
[[1]]
[1] ""        "mnt"     "AO"      "AO Data" "S01-012" "120824" 

> unlist(strsplit(x, "/"))[5]
[1] "S01-012"
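
As a small sketch of the same idea wrapped in a function (the name path_component and its argument n are made up for illustration and do not come from any package), the code below splits the path on "/" and picks one piece by position. The leading "/" yields an empty first element, which is why the target sits at index 5 rather than 4.

```r
# Hypothetical helper: return the n-th "/"-separated piece of a path.
# The leading "/" produces an empty first element, so "mnt" is piece 2.
path_component <- function(path, n) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  if (n > length(parts)) return(NA_character_)  # position does not exist
  parts[n]
}

x <- "/mnt/AO/AO Data/S01-012/120824/"
path_component(x, 5)
```

This only helps if the depth of the directory tree is fixed; if it varies, a pattern-based approach is probably safer.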



Alternatively, again, presuming the same position:

> gsub("/mnt/AO/AO Data/([^/]+)/.+", "\\1", x)
[1] "S01-012"
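
One thing to watch with the gsub() approach: when the pattern does not match, gsub() returns the input unchanged rather than signalling a failure. A possible sketch that returns NA instead, using base R's regexec() and regmatches(), is below. The helper name extract_id and its default pattern are assumptions for illustration only.

```r
# Hypothetical helper: return the first capture group, or NA if no match.
extract_id <- function(path, pattern = "^/mnt/AO/AO Data/([^/]+)/") {
  # regmatches() yields c(full_match, group1, ...) or character(0) on no match
  m <- regmatches(path, regexec(pattern, path))[[1]]
  if (length(m) < 2) NA_character_ else m[2]
}

x <- "/mnt/AO/AO Data/S01-012/120824/"
extract_id(x)                    # the captured directory name
extract_id("/some/other/path/")  # no match, so NA
```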


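You don't need any of the double backslashes before the slashes in your regex above. The '/' character is not a special regex character, so it needs no escaping. The '\' character is special, and that is the one that has to be escaped (doubled inside an R string) when you want it taken literally.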
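
To illustrate the escaping point, here is a short sketch comparing the escaped and unescaped forms, plus a bracket expression that uses the POSIX class [:alnum:] in place of \w. As a likely explanation for the bracket problem (an assumption on my part, not something tested in this thread): with R's default regex engine, escapes such as \w are generally not interpreted inside [ ], so "[\\w -]" would match a backslash, a "w", a space, or a hyphen rather than word characters.

```r
x <- "/mnt/AO/AO Data/S01-012/120824/"

# Escaped and unescaped "/" give the same result here.
escaped   <- sub("\\/mnt\\/AO\\/AO Data\\/([^/]+)\\/.+", "\\1", x)
unescaped <- sub("/mnt/AO/AO Data/([^/]+)/.+", "\\1", x)
identical(escaped, unescaped)

# Bracket expression with a POSIX class; the hyphen goes last so it is literal.
bracketed <- sub("/mnt/AO/AO Data/([[:alnum:] -]+)/.+", "\\1", x)
bracketed
```

The "[^/]+" form is the least restrictive of these, since it accepts anything up to the next slash; the [:alnum:] version is stricter about which characters may appear in the directory name.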

Regards,

Marc Schwartz