R grep question
FWIW: I think Jim makes an excellent point -- regexes really aren't the right tool for this sort of thing (imho); matching is. Note also that if one is willing to live with a logical response (better, again imho), then the ifelse() can of course be dispensed with:
CRC$MMR.gene <- CRC$gene.all %in% match_strings
CRC$MMR.gene
[1]  TRUE FALSE  TRUE FALSE
Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
On Thu, May 27, 2021 at 8:35 PM Jim Lemon <drjimlemon at gmail.com> wrote:
Hi Kai,
You may find %in% easier than grep when multiple matches are needed:
match_strings<-c("MLH1","MSH2")
CRC<-data.frame(gene.all=c("MLH1","MSL1","MSH2","MCC3"))
CRC$MMR.gene<-ifelse(CRC$gene.all %in% match_strings,"Yes","No")
Composing your match strings before applying %in% may be more flexible
if you have more than one selection to make.
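One difference worth keeping in mind when choosing between the two approaches: %in% compares whole elements, while grepl() looks for the pattern anywhere inside each element. A minimal sketch, assuming a hypothetical compound entry ("MLH1;PMS2") that is not in the original data:

```r
# Hypothetical data: the example above plus one made-up compound entry
match_strings <- c("MLH1", "MSH2")
genes <- c("MLH1", "MSL1", "MSH2", "MCC3", "MLH1;PMS2")

# %in% is TRUE only when the whole element equals one of the match strings
exact <- genes %in% match_strings

# grepl() is TRUE when the pattern occurs anywhere within the element
partial <- grepl(paste(match_strings, collapse = "|"), genes)

# Side-by-side view; the two columns differ only on the compound entry
print(data.frame(genes, exact, partial))
```

If the column always holds a single gene symbol per row, the two give the same answer and %in% is the simpler choice.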
On Fri, May 28, 2021 at 1:57 AM Marc Schwartz via R-help
<r-help at r-project.org> wrote:
Hi,
A quick clarification:
The regular expression is one quoted character string with the | inside
the quotes, not a separate quoted string on either side of the | operator:
"MLH1|MSH2"
not:
"MLH1"|"MSH2"
The | is treated as a special character within the regular expression.
See ?regex.
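To make the distinction concrete, here is a small sketch using the gene names from this thread. It catches the error with tryCatch() so the script keeps running; the variable names (res, pattern, hits) are just illustrative.

```r
genes <- c("MLH1", "MSL1", "MSH2", "MCC3")

# Outside the quotes, | is R's logical OR, which rejects character operands;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch("MLH1" | "MSH2", error = function(e) conditionMessage(e))
print(res)

# Inside the quotes, | is regex alternation: match "MLH1" or "MSH2"
print(grepl("MLH1|MSH2", genes))

# The same pattern can be built from a character vector
pattern <- paste(c("MLH1", "MSH2"), collapse = "|")
hits <- grepl(pattern, genes)
```

Building the pattern with paste(..., collapse = "|") is convenient when the list of strings is long, but any regex metacharacters in the strings would then be interpreted as such.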
grep(), when value = FALSE, returns the index of the match within the
source vector, while when value = TRUE, returns the found character
entries themselves.
Thus, you need to be sure that your ifelse() incantation is matching the
correct values.
In the case of grepl(), it returns TRUE or FALSE, as Rui noted, thus:
CRC$MMR.gene <- ifelse(grepl("MLH1|MSH2",CRC$gene.all), "Yes", "No")
should work.
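The three return shapes described above can be compared directly. This is only a sketch on the four-element example vector from this thread, with illustrative variable names:

```r
genes <- c("MLH1", "MSL1", "MSH2", "MCC3")

idx  <- grep("MLH1|MSH2", genes)                # positions of the matches
vals <- grep("MLH1|MSH2", genes, value = TRUE)  # the matching elements
flag <- grepl("MLH1|MSH2", genes)               # one TRUE/FALSE per element

# Only the grepl() result has the same length as the input, which is
# what ifelse() needs to produce one "Yes"/"No" per row
print(length(idx))
print(length(flag))

mmr <- ifelse(flag, "Yes", "No")
print(mmr)
```

Because grep() returns only the matches, its result is generally shorter than the column being assigned to, which is why it does not fit as the test argument of ifelse() here.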
Regards,
Marc Schwartz
Kai Yang via R-help wrote on 5/27/21 11:23 AM:
Hi Rui, thank you for your suggestion. But when I try the solution, I get the message below:
Error in "MLH1" | "MSH2" : operations are possible only for numeric, logical or complex types
Does it mean grepl cannot work on a character field?
Thanks,
Kai
On Thursday, May 27, 2021, 01:37:58 AM PDT, Rui Barradas <ruipbarradas at sapo.pt> wrote:
Hello,
ifelse needs a logical condition, not the value. Try grepl.
CRC$MMR.gene <- ifelse(grepl("MLH1"|"MSH2",CRC$gene.all), "Yes", "No")
Hope this helps,
Rui Barradas
At 05:29 on 27/05/21, Kai Yang via R-help wrote:
Hi List, I wrote the code to create a new variable:
CRC$MMR.gene<-ifelse(grep("MLH1"|"MSH2",CRC$gene.all,value=T),"Yes","No")
I need to create an MMR.gene column in the CRC data frame: if the gene.all column
contains MLH1 or MSH2, then MMR.gene = Yes; if not, MMR.gene = No.
But the code doesn't work for me. Can anyone tell me how to fix the
code?
Thank you, Kai
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.