Hi there,
I have a problem about lapply, strsplit, and accessing list elements,
which I don't understand or cannot solve:
I have e.g. a character vector with three elements:
x = c("349/077,349/074,349/100,349/117",
"340/384.2,340/513,367/139,455/128,D13/168",
"600/437,128/903,128/904")
The task I want to perform, is to generate a list, comprising the
portion in front of the "/" of each element of x:
neededResult = list(c("349","349", "349", "349"),
c("340", "340", "367", "455", "D13"),
c("600", "128", "128") )
I figured out that for a single element of x the following works
unlist( lapply( strsplit( unlist( strsplit(x[1], "\\,") ), "/"), "[", 1) )
but due to "unlist" it doesn't provide the required result if extended
to all elements of x
unlist(lapply(strsplit( unlist( lapply(x, strsplit, "\\,")), "/"), "[",))
Can someone help me get the needed result?
Thanks and regards,
Dirk
lapply, strsplit, and list elements
6 messages · Dick Harray, William Dunlap, Henrique Dallazuanna +2 more
-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Dick Harray
Sent: Friday, February 04, 2011 7:37 AM
To: r-help at r-project.org
Subject: [R] lapply, strsplit, and list elements
Try the following, which first splits each string by commas
(returning a list), then removes the first slash and everything
after it (using lapply to maintain the list structure).
> gotResult <- lapply(strsplit(x, ","), function(xi) gsub("/.*", "", xi))
> identical(gotResult, neededResult)
[1] TRUE
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
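For readers who want to stay closer to the original strsplit / "[" attempt, here is a minimal sketch of the same idea without gsub. The names pieces and firstField are made up for this illustration; the key point is that the outer lapply() keeps one list element per string, so no unlist() is needed.

```r
x <- c("349/077,349/074,349/100,349/117",
       "340/384.2,340/513,367/139,455/128,D13/168",
       "600/437,128/903,128/904")

# Split each string on commas; strsplit() returns one list element per string.
pieces <- strsplit(x, ",", fixed = TRUE)

# Within each list element, split every piece on "/" and keep the first field.
# vapply() returns a plain character vector, so the outer list structure
# produced by lapply() is preserved.
firstField <- lapply(pieces, function(xi) {
  vapply(strsplit(xi, "/", fixed = TRUE), `[`, character(1), 1)
})

str(firstField)
```

The inner vapply() does the job that unlist() did in the single-element version, but only within one element of x at a time.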
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Try this:

> x <- c("349/077,349/074,349/100,349/117",
+        "340/384.2,340/513,367/139,455/128,D13/168",
+        "600/437,128/903,128/904")
>
> library(gsubfn)
> out <- strapply(x, '([0-9]+)(?=/)')
> out
[[1]]
[1] "349" "349" "349" "349"

[[2]]
[1] "340" "340" "367" "455" "13"

[[3]]
[1] "600" "128" "128"

The strapply looks for the pattern and returns every match it finds.
The pattern in this case is 1 or more digits that are followed by a /,
but the slash is not included in the matched portion (a positive
look ahead).

If you need more than digits you can modify the pattern to whatever
matches before the /.

Hope this helps,
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
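Note that the digits-only pattern above returns "13" rather than "D13" for the last entry of the second string. As one possible way to "modify the pattern", here is a sketch in base R (regmatches and gregexpr with perl = TRUE, not gsubfn) that takes every run of characters other than "," and "/" that is directly followed by a slash. The pattern and the name prefixes are assumptions chosen for this example, not something proposed in the thread.

```r
x <- c("349/077,349/074,349/100,349/117",
       "340/384.2,340/513,367/139,455/128,D13/168",
       "600/437,128/903,128/904")

# "[^,/]+" matches a run of characters that are neither a comma nor a slash;
# "(?=/)" is a positive look-ahead requiring that a slash follows, without
# including the slash in the match. Look-ahead needs perl = TRUE.
matches  <- gregexpr("[^,/]+(?=/)", x, perl = TRUE)

# regmatches() extracts the matched substrings, one list element per string.
prefixes <- regmatches(x, matches)

str(prefixes)
```

Because the character class excludes only the two separators, a prefix such as "D13" is kept whole.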
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110204/8671fceb/attachment.pl>
Also this similar approach with a slight simplification of the regular
expression:

strapply(x, '([0-9]+)/')

or to convert the numbers to numeric at the same time:

strapply(x, '([0-9]+)/', as.numeric)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
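One caveat on numeric conversion, sketched here in base R rather than with strapply: if the whole prefix is taken (as in the strsplit/gsub approach earlier in the thread) and then passed to as.numeric, a non-numeric prefix such as "D13" becomes NA, whereas a digits-only pattern picks up just the 13. The names prefixes and asNumbers are made up for this illustration.

```r
x <- c("349/077,349/074,349/100,349/117",
       "340/384.2,340/513,367/139,455/128,D13/168",
       "600/437,128/903,128/904")

# Take everything before the first slash of each comma-separated piece.
prefixes <- lapply(strsplit(x, ",", fixed = TRUE),
                   function(xi) sub("/.*", "", xi))

# as.numeric() cannot parse "D13", so that entry becomes NA.
# The coercion warning is silenced here only to keep the example quiet.
asNumbers <- lapply(prefixes, function(p) suppressWarnings(as.numeric(p)))

str(asNumbers)
```

Whether NA or 13 is the right answer for "D13" depends on what the codes mean, so it is worth checking the data before converting.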
Darn, good catch. I fell victim to overthinking the problem. I think I
was thinking more of:

'[0-9]+(?=/)'

which uses the whole match (then I switched my thinking and captured the
number, but did not simplify the other part).

Yours is the best.
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111