
splitting a string by space except when contained within quotes

5 messages · downtowater, William Dunlap, Gabor Grothendieck +1 more

#
I've been trying to split a space delimited string with double-quotes in R
for some time but without success. An example of a string is as follows:

rainfall snowfall "Channel storage" "Rivulet storage"

It's important for us because these are column headings that must match the
subsequent data. 

Here is some code I've been trying:

str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
regex <- "[^\\s\"']+|\"([^\"]*)\""
split <- strsplit(str, regex, perl=T)

what I would like is

[1] "rainfall" "snowfall" "Channel storage" "Rivulet storage"

but what I get is:

[1] ""  " " " " " "

The vector is the right length (which is encouraging) but of course the
strings are empty or contain a single space. Any suggestions?

Thanks in advance!



--
View this message in context: http://r.789695.n4.nabble.com/splitting-a-string-by-space-except-when-contained-within-quotes-tp4651286.html
Sent from the R help mailing list archive at Nabble.com.
#
Try using scan(quote='"', ...), as in the following:
  > str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
  > scan(text=str, what="", quote='"', quiet=TRUE)
  [1] "rainfall"        "snowfall"        "Channel storage" "Rivulet storage"
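
If the regular expression from the original post is to be kept, note that it describes the tokens themselves, while strsplit() throws away whatever its pattern matches. That would explain the empty string and single spaces in the result. A minimal sketch, assuming the same pattern is used to extract matches rather than to split on them (the variable name tokens is only illustrative):

```r
str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
regex <- "[^\\s\"']+|\"([^\"]*)\""

# Extract the pieces that match the pattern instead of splitting on them
tokens <- regmatches(str, gregexpr(regex, str, perl = TRUE))[[1]]

# Quoted matches still carry their enclosing double quotes; strip them
tokens <- gsub("^\"|\"$", "", tokens)
tokens
```

scan() remains the shorter route; this only shows where the strsplit() attempt went wrong.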

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On Thu, Nov 29, 2012 at 9:43 AM, downtowater <downtowater at yahoo.ca> wrote:
Try this:
Read 4 items
[1] "rainfall"        "snowfall"        "Channel storage" "Rivulet storage"
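
The command that produced this output is not preserved in the archive. As an assumption only, and not necessarily what was posted, a scan() call that relies on its default quote handling prints the same "Read 4 items" line before the result:

```r
str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'

# With whitespace as the separator, scan() treats both ' and " as quote
# characters by default, so no quote= argument is needed here.
# quiet is left at its default (FALSE), hence the "Read 4 items" message.
scan(text = str, what = "")
```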
email: ggrothendieck at gmail.com
#
Hi,

Maybe this helps:
str1 <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
res<-unlist(strsplit(gsub("[\"]","",str1)," "))
# the extra "" in the first paste() leaves a trailing space on the third element
res1<-c(res[1],res[2],paste(res[3],res[4],""),paste(res[5],res[6],collapse=""))
res1
#[1] "rainfall"         "snowfall"         "Channel storage " "Rivulet storage"
A.K.





______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Hi,

You could also do this:
res<-unlist(strsplit(str,"[\"]"))
res1<-res[res!=" "]
res2<-c(unlist(strsplit(res1[grepl("\\s+$",res1)]," ")),res1[!grepl("\\s+$",res1)])
res2
#[1] "rainfall"        "snowfall"        "Channel storage" "Rivulet storage"
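
Note that the code above places the unquoted names ahead of the quoted ones, which matches the example only because the quoted headings come last. A rough sketch of an order-preserving variant, assuming balanced double quotes and R 3.2.0 or later for trimws() (split_quoted and mixed are hypothetical names, not from the thread):

```r
# split_quoted() is a hypothetical helper name used only for illustration
split_quoted <- function(x) {
  # After splitting on the quote character, pieces at odd positions lie
  # outside quotes and pieces at even positions lie inside them
  pieces <- strsplit(x, "\"", fixed = TRUE)[[1]]
  out <- lapply(seq_along(pieces), function(i) {
    if (i %% 2 == 0) {
      pieces[i]                                  # quoted: keep as one token
    } else {
      strsplit(trimws(pieces[i]), "\\s+")[[1]]   # unquoted: split on whitespace
    }
  })
  unlist(out)
}

# A reordered header, to show that token order is kept
mixed <- 'rainfall "Channel storage" snowfall "Rivulet storage"'
tokens <- split_quoted(mixed)
tokens
```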
A.K.



