The single grep regex solutions offered to Ivan's problem were fine,
but do not readily generalize to the conjunction of multiple (>2, say)
regex patterns that can appear anywhere in a string and in any order.
However, note that this can easily be done using the Perl zero-width
lookahead construct, "(?=...)".
e.g.
test <- c("xyCz", "xAyCz", "xAyBzC", "xCByAz", "xACyB", "BAyyC", "CBxBAy")
## to search for strings containing "A", "B", & "C" in any order
grep("(?=.*A)(?=.*B)(?=.*C)", test, perl = TRUE)
[1] 3 4 5 6 7
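The same pattern can be assembled programmatically for any number of
conjuncts. A minimal sketch, assuming the patterns contain no regex
metacharacters that would need escaping; all_of() is a hypothetical
helper name, not a base R function:

```r
# Hypothetical helper: one zero-width lookahead per pattern, concatenated.
all_of <- function(patterns) {
  paste0("(?=.*", patterns, ")", collapse = "")
}

test <- c("xyCz", "xAyCz", "xAyBzC", "xCByAz", "xACyB", "BAyyC", "CBxBAy")
idx <- grep(all_of(c("A", "B", "C")), test, perl = TRUE)

# Ivan's file names from the original question
mydata <- c("SSFA-ConfoMap_GuineaPigs_NMPfilled.csv",
            "SSFA-ConfoMap_Lithics_NMPfilled.csv",
            "SSFA-ConfoMap_Sheeps_NMPfilled.csv",
            "SSFA-Toothfrax_GuineaPigs.xlsx",
            "SSFA-Toothfrax_Lithics.xlsx",
            "SSFA-Toothfrax_Sheeps.xlsx")
# Keep only the names containing both "ConfoMap" and "GuineaPigs"
hits <- grep(all_of(c("ConfoMap", "GuineaPigs")), mydata,
             perl = TRUE, value = TRUE)
```

Here idx should reproduce the result above, and hits should hold the
one file name that contains both parts.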
Note that this matches one or more instances of each pattern.
If one wants exactly one instance of each conjunct, then
something like this should do:
lookfor <- c("A","B","C")
notme <- paste0("[^",lookfor,"]*")
z <- paste0("(?=", notme, lookfor, notme, "$)",collapse = "")
grep(z, test, perl = TRUE)
[1] 3 4 5 6
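One caveat: because grep() may start a match at any position, these
lookaheads can succeed on a suffix of the string, so a string with an
extra leading "A" may still be reported. A hedged sketch of an anchored
variant; the leading "^" and the strings in extra are my additions for
illustration, not part of the code above:

```r
lookfor <- c("A", "B", "C")
notme <- paste0("[^", lookfor, "]*")
z <- paste0("(?=", notme, lookfor, notme, "$)", collapse = "")

# Assumption: prepending "^" forces every lookahead to start at the
# beginning of the string, so the whole string is inspected.
z_anchored <- paste0("^", z)

# Made-up strings: the first contains two "A"s, the second exactly one
# of each letter.
extra <- c("AxABC", "xAyBzC")
unanchored <- grepl(z, extra, perl = TRUE)
anchored <- grepl(z_anchored, extra, perl = TRUE)
```

On the test vector above both versions give the same answer; the
difference only shows up for strings like the first element of extra.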
Cheers,
Bert
On Wed, Aug 19, 2020 at 11:38 PM Ivan Calandra <calandra at rgzm.de> wrote:
Thank you all for all the very helpful answers!
Best,
Ivan
--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
On 20/08/2020 3:28, Richard O'Keefe wrote:
> There are & and | operators in the R language.
> There is an | operator in regular expressions.
> There is NOT any & operator in regular expressions.
> grep("ConfoMap&GuineaPigs", mydata, value=TRUE)
> looks for elements of mydata containing the literal
> string 'ConfoMap&GuineaPigs'.
>
> > foo <- c("a","b","cab","back")
> > foo[grepl("a",foo) & grepl("b",foo)]
> [1] "cab"  "back"
>
> grepl returns a TRUE/FALSE vector.
>
> On Thu, 20 Aug 2020 at 02:53, Ivan Calandra <calandra at rgzm.de> wrote:
>
>     Dear useRs,
>
>     I feel really stupid, but I cannot understand why "&" doesn't work
>     as I expect, while "|" does.
>
>     I have the following vector:
>     mydata <- c("SSFA-ConfoMap_GuineaPigs_NMPfilled.csv",
>     "SSFA-ConfoMap_Lithics_NMPfilled.csv",
>     "SSFA-ConfoMap_Sheeps_NMPfilled.csv",
>     "SSFA-Toothfrax_GuineaPigs.xlsx",
>     "SSFA-Toothfrax_Lithics.xlsx", "SSFA-Toothfrax_Sheeps.xlsx")
>     and I want to find the values that include both "ConfoMap" and
>     "GuineaPigs".
>
>     If I do:
>     grep("ConfoMap&GuineaPigs", mydata, value=TRUE)
>     it returns an empty vector, character(0).
>
>     But if I do:
>     grep("ConfoMap|GuineaPigs", mydata, value=TRUE)
>     it returns all the elements that include either "ConfoMap" or
>     "GuineaPigs", as I would expect.
>
>     So what is wrong with my "&" construct? How can I return the
>     values that include both parts?
>
>     Thank you for your help!
>     Ivan
>
>     ______________________________________________
>     R-help at r-project.org mailing list --