Selecting columns whose names contain "mutated" except when they also contain "non" or "un"

Here is a method that uses negative lookbehind:
tmp <- c('mutation','nonmutated','unmutated','verymutated','other')
grep("(?<!un)(?<!non)muta", tmp, perl=TRUE)
[1] 1 4

It looks for "muta" that is not immediately preceded by "un" or "non" (but
it would match "unusually mutated", since the "un" is not immediately
before the "muta").
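
To tie this back to the column-summing step, here is a minimal sketch. The data frame below is a made-up stand-in for KRASyn (its column names and values are assumptions, not the real data). It also shows a stricter alternative that drops any name containing "non" or "un" anywhere, which may be closer to what the subject line asks for.

```r
# Hypothetical stand-in for KRASyn; names and values are invented.
KRASyn <- data.frame(
  mutation    = c(1, 0, 2),
  nonmutated  = c(5, 5, 5),
  unmutated   = c(7, 7, 7),
  verymutated = c(0, 3, 1),
  other       = c(9, 9, 9)
)

# Indices of columns containing "muta" that is not immediately
# preceded by "un" or "non" (lookbehind needs perl = TRUE).
keep <- grep("(?<!un)(?<!non)muta", names(KRASyn), perl = TRUE)

# Stricter alternative: require "muta" and reject "non" or "un"
# anywhere in the name, using two plain grepl() calls.
keep2 <- grepl("muta", names(KRASyn)) & !grepl("non|un", names(KRASyn))

# Add the selected columns together row by row.
KRASyn$Mutant_comb <- rowSums(KRASyn[keep])

names(KRASyn)[keep]
```

Note that the stricter version would also reject a name such as "mutated_count", because it contains "un", so which one to use depends on the actual column names.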

Hope this helps,

Hello All,

I started out a while ago trying to select columns in a data frame whose names contain some variation of the word "mutant" using code like:

names(KRASyn)[grep("muta", names(KRASyn))]

The idea then would be to add together the various columns using code like:

KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])

What I discovered though, is that this selects columns like "nonmutated" and "unmutated" as well as columns like "mutated", "mutation", and "mutational".

So I'd like to know how to select columns that have some variation of the word "mutant" without the "non" or the "un". I've been looking around for an example of how to do that but haven't found anything yet.

Can anyone show me how to select the columns I need?

Thanks,

Paul


Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com
