Message-ID: <9DAC4AC6-43E3-4A2F-B80C-39D620C7B1D8@comcast.net>
Date: 2012-07-08T01:45:58Z
From: David Winsemius
Subject: Splitting a character vector.
In-Reply-To: <A5CD20EB2C2.000006ACjrkrideau@inbox.com>

On Jul 7, 2012, at 5:37 PM, John Kane wrote:

> I am lousy at simple regex and I have not found a solution to a
> simple problem.
>
> I have a vector with some character values that I want to split.
> Sample data
> dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)",
>          "ALP (max jack)")
>
> Desired result
> dd2 <- data.frame(xx = c("XXY", "XXY", "CAMP", "ALP"),
>                   yy = c("mat harry", "jim bob", "joe blow", "max jack"))

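# xx: delete everything from the space before "(" to the end of the string
# yy: keep only the third capture group, the text between the parentheses
data.frame(xx = sub("(\\s\\(.+$)", "", dd1),
           yy = sub("(.+)(\\s\\()(.+)(\\)$)", "\\3", dd1))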
     xx        yy
1  XXY mat harry
2  XXY   jim bob
3 CAMP  joe blow
4  ALP  max jack


>
> I thought I should be able to split the characters with strsplit but
> either I am misunderstanding the function or don't know how to escape
> a "(" properly in an effort to at least get "XXY" "(mat harry)"
>
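
On the strsplit() attempt: "(" is a regex metacharacter, so inside an R
string it has to be written as "\\(", or the pattern can be passed with
fixed = TRUE so that it is taken literally. What follows is only a minimal
sketch, assuming the same dd1 as above; the names parts, parts_fixed,
keep_paren and dd2_alt are illustrative and not part of the original
question.

```r
dd1 <- c("XXY (mat harry)", "XXY (jim bob)", "CAMP (joe blow)",
         "ALP (max jack)")

# Split on a space followed by a literal "(": the regex \( is typed "\\(".
# The delimiter itself is consumed, so the trailing ")" is left behind.
parts <- strsplit(dd1, " \\(")

# Same split with no regex at all: fixed = TRUE matches " (" literally.
parts_fixed <- strsplit(dd1, " (", fixed = TRUE)

# To keep the "(" as in "XXY" "(mat harry)", split on the space only and
# use a Perl lookahead to require that a "(" comes next.
keep_paren <- strsplit(dd1, " (?=\\()", perl = TRUE)

# Assemble the two columns, dropping the trailing ")" from the second piece.
dd2_alt <- data.frame(xx = sapply(parts, `[`, 1),
                      yy = sub("\\)$", "", sapply(parts, `[`, 2)))
dd2_alt
```

Either route should give the same two columns as the sub() version; which
one reads better is mostly a matter of taste.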


David Winsemius, MD
Heritage Laboratories
West Hartford, CT