On Jan 17, 2021, at 3:48 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
There are literally tons of ways to do this sort of thing in R.
In base R, see ?tapply and friends, especially ?ave and ?by, which may be close to what you want.
But there is a whole parallel universe -- the so-called "tidyverse" set of packages -- that many folks prefer.
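As one possible base R sketch (using ?aggregate, a close relative of the functions above), assuming a small made-up data frame df whose column names A, E, X1 and X7 follow the question quoted below; the data, the helper name stats and the output file name are illustrative only:

```r
# Made-up example data; only the column names come from the question.
set.seed(1)
df <- data.frame(
  A  = factor(rep(c("A1", "A2"), each = 6)),
  E  = factor(rep(c("E1", "E2", "E3"), times = 4)),
  X1 = rnorm(12),
  X7 = rnorm(12)
)

# Hypothetical helper returning the three statistics as a named vector.
stats <- function(x) c(mean = mean(x), min = min(x), max = max(x))

# One row per A/E combination; X1 and X7 each come back as a 3-column matrix.
res <- aggregate(cbind(X1, X7) ~ A + E, data = df, FUN = stats)

# Flatten the matrix columns into ordinary columns (X1.mean, X1.min, ...).
res <- do.call(data.frame, res)
res <- res[order(res$A, res$E), ]

write.csv(res, "summary_by_A_E.csv", row.names = FALSE)
```

Further numeric columns (e.g. X10) can be added inside cbind(), and further grouping factors on the right-hand side of the formula.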
This link takes you down that rabbit hole: https://dplyr.tidyverse.org/
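A minimal dplyr sketch of the same kind of summary, assuming dplyr 1.0.0 or later (for across()) and a made-up data frame df with the column names used in the question quoted below; the data and the output file name are illustrative only:

```r
library(dplyr)

# Made-up example data; only the column names come from the question.
set.seed(1)
df <- data.frame(
  A  = factor(rep(c("A1", "A2"), each = 6)),
  E  = factor(rep(c("E1", "E2", "E3"), times = 4)),
  X1 = rnorm(12),
  X7 = rnorm(12)
)

# Group by the two factors, then apply each function to each chosen column.
# The result columns are named X1_mean, X1_min, X1_max, X7_mean, ...
res <- df %>%
  group_by(A, E) %>%
  summarise(
    across(c(X1, X7), list(mean = mean, min = min, max = max)),
    .groups = "drop"
  )

write.csv(res, "summary_by_A_E.csv", row.names = FALSE)
```

Changing the columns inside across() or the factors inside group_by() covers other combinations.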
There are still others (e.g. the data.table package). You should expect to invest a little time in learning whichever you choose. You may wish to also search a bit for tutorials on your choice -- there are many good ones out there.
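For comparison, a rough data.table sketch under the same assumptions (a made-up data frame df with columns A, E, X1 and X7; the data and the output file name are illustrative only):

```r
library(data.table)

# Made-up example data; only the column names come from the question.
set.seed(1)
df <- data.frame(
  A  = factor(rep(c("A1", "A2"), each = 6)),
  E  = factor(rep(c("E1", "E2", "E3"), times = 4)),
  X1 = rnorm(12),
  X7 = rnorm(12)
)
dt <- as.data.table(df)

# For each A/E group, compute the three statistics for every column in
# .SDcols; unlist() builds names such as X1.mean, and as.list() turns the
# named vector into one column per statistic.
res <- dt[,
  as.list(unlist(lapply(.SD, function(x) c(mean = mean(x), min = min(x), max = max(x))))),
  keyby = .(A, E),
  .SDcols = c("X1", "X7")
]

write.csv(res, "summary_by_A_E.csv", row.names = FALSE)
```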
Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
On Sun, Jan 17, 2021 at 12:18 PM Bernard McGarvey <mcgarvey.bernard at comcast.net> wrote:
I have a data frame that consists of several factor columns, say A, B, C, D, and E, and several columns containing numerical data, say X1, X2, ..., X10. I would like to compute summary statistics of some of the numerical columns, grouped by some of the factor columns. For example,
Calculate the mean, min, and max of variables X1, X7, and X10 by factors A and E. The results should look like the table below:
Factor A  Factor E  mean(X1)  min(X1)  max(X1)  mean(X7)  min(X7)  max(X7)  mean(X10)  min(X10)  max(X10)
A1        E1
A1        E2
A1        E3
A2        E1
A2        E2
A2        E3
I would like the results to be returned to a data frame or other object that I can write out using the write.csv function. I have looked at the summarize and numSummary functions but they do not appear to be flexible enough to do the above.
Any help would be appreciated,
Thanks
Bernard McGarvey
Director, Fort Myers Beach Lions Foundation, Inc.
Retired (Lilly Engineering Fellow).