calculations on columns with partially matching names
On Jan 3, 2010, at 6:09 PM, Jim Bouldin wrote:
Is there a command for partial matching of character strings? Specifically, I'd like to be able to calculate the mean of the values in any columns of a data frame or matrix whose column names partly match. For example, columns labeled "mpw06a" and "mpw06b" match on the first five characters; their mean would be taken, whereas any columns beginning with anything other than "mpw06" would be excluded.
?grep
?"["
> tdf <- data.frame(mpw06a=rnorm(10), mpw06b=rnorm(10), abc=rnorm(10))
> lapply(tdf[ , grep("^mpw06", names(tdf)) ], mean)
$mpw06a
[1] -0.1825447
$mpw06b
[1] -0.2386772
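Since the question mentions both data frames and matrices, here is a rough sketch that should handle either, assuming the matching columns are numeric. The helper name prefix_means and the example data are made up for illustration. The "^" anchors the pattern at the start of the name, and drop = FALSE keeps the two-dimensional shape even when only one column matches.

```r
# Hypothetical helper: column means for columns whose names begin with `prefix`.
# Assumes `prefix` contains no regex metacharacters and the matching columns are numeric.
prefix_means <- function(x, prefix) {
  hits <- grepl(paste0("^", prefix), colnames(x))  # TRUE where the name starts with prefix
  colMeans(x[, hits, drop = FALSE])                # drop = FALSE: stay 2-d with a single match
}

set.seed(1)  # arbitrary seed so the example is repeatable
tdf <- data.frame(mpw06a = rnorm(10), mpw06b = rnorm(10), abc = rnorm(10))

prefix_means(tdf, "mpw06")             # data frame input
prefix_means(as.matrix(tdf), "mpw06")  # matrix input

# If a single row-wise mean across the matching columns is wanted instead:
rowMeans(tdf[, grepl("^mpw06", colnames(tdf)), drop = FALSE])
```

Whether the per-column or the row-wise mean is the one wanted depends on what "their mean" is meant to be; both are shown because the question can be read either way.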
I need to compare every pair of columns in the frame, and in some cases, possibly three at a time.
?combn
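A minimal sketch of how combn might be used here, assuming the comparison can be written as a function of two (or three) column names. The mean difference is only a placeholder for whatever comparison is actually needed, and the object names col_pairs and pair_diff are hypothetical.

```r
set.seed(1)  # arbitrary seed so the example is repeatable
tdf <- data.frame(mpw06a = rnorm(10), mpw06b = rnorm(10), abc = rnorm(10))

# Every pair of column names; each column of the result is one pair
col_pairs <- combn(names(tdf), 2)

# Apply a function to each pair via the FUN argument.
# Placeholder comparison: mean of the element-wise difference.
pair_diff <- combn(names(tdf), 2,
                   FUN = function(nm) mean(tdf[[nm[1]]] - tdf[[nm[2]]]))
names(pair_diff) <- apply(col_pairs, 2, paste, collapse = " vs ")
pair_diff

# Three at a time: set m = 3
combn(names(tdf), 3)
```

Working with column names rather than the columns themselves keeps the labels available for the output.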
Thanks in advance for any ideas.
Jim Bouldin Research Ecologist Department of Plant Sciences, UC Davis Davis CA, 95616 530-554-1740
David Winsemius, MD Heritage Laboratories West Hartford, CT