Selecting columns whose names contain "mutated" except when they also contain "non" or "un"
Here is a method that uses negative lookbehind:
tmp <- c('mutation','nonmutated','unmutated','verymutated','other')
grep("(?<!un)(?<!non)muta", tmp, perl=TRUE)
[1] 1 4
It looks for "muta" that is not immediately preceded by "un" or "non" (but it would match "unusually mutated", since the "un" is not immediately before the "muta").
Hope this helps,
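Applied to the original question, the same pattern can pick the columns before summing them. This is only a sketch: the small data frame below is a made-up stand-in for KRASyn, and its column names and values are assumptions for illustration.

```r
# Hypothetical stand-in for KRASyn; column names and values are invented.
KRASyn <- data.frame(
  mutated    = c(1, 0, 1),
  mutation   = c(0, 0, 1),
  nonmutated = c(0, 1, 0),
  unmutated  = c(1, 1, 0),
  other      = c(5, 5, 5)
)

# Indices of names containing "muta" not immediately preceded by "un" or "non".
# perl = TRUE is needed because lookbehind is a Perl-style regex feature.
keep <- grep("(?<!un)(?<!non)muta", names(KRASyn), perl = TRUE)
names(KRASyn)[keep]

# Row-wise sum over only the selected columns.
KRASyn$Mutant_comb <- rowSums(KRASyn[keep])
```

With this toy data, names(KRASyn)[keep] gives "mutated" and "mutation", and Mutant_comb is 1, 0, 2. Computing keep before adding the new column keeps the selection independent of the result column.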
On Mon, Apr 23, 2012 at 10:10 AM, Paul Miller <pjmiller_57 at yahoo.com> wrote:
Hello All,
Started out a while ago trying to select columns in a dataframe whose names contain some variation of the word "mutant" using code like:
names(KRASyn)[grep("muta", names(KRASyn))]
The idea then would be to add together the various columns using code like:
KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])
What I discovered, though, is that this selects columns like "nonmutated" and "unmutated" as well as columns like "mutated", "mutation", and "mutational".
So I'd like to know how to select columns that have some variation of the word "mutant" without the "non" or the "un". I've been looking around for an example of how to do that but haven't found anything yet.
Can anyone show me how to select the columns I need?
Thanks,
Paul
Gregory (Greg) L. Snow Ph.D. 538280 at gmail.com