row-wise dataframe calculation
Dear Ken,
At 05:51 PM 13/09/2001 -0400, kestickler at netscape.net wrote:
Hi,
I have a data frame such as:
       Exp1  Exp2  Exp3
name1  12.6  78.0  45.6
name2  11.9  19.0  21.0
name3  10.0  14.0  17.0
...
...
...
Real datasets might be quite large: 20,000 rows by 100 columns.
I want to calculate metrics such as the variance *row-wise*: var for
name1, var for name2, var for name3, etc.
Can someone kindly guide me on how best to code this?
The size of the dataset may prove to be a problem, but in principle this kind of calculation can be done with the apply function: apply(df, 1, var), where df is the data frame containing your data.
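
A minimal sketch of the apply() approach, using a small made-up data frame that mirrors the layout in the question (the object names row_var, m and row_var_fast are illustrative only). It assumes every column is numeric, since apply() converts the data frame to a matrix before working on the rows. For a large data set, the same sample variance can also be computed with the vectorised rowMeans() and rowSums(), which avoids calling var() once per row:

```r
# Made-up example data with the same shape as in the question
df <- data.frame(
  Exp1 = c(12.6, 11.9, 10.0),
  Exp2 = c(78.0, 19.0, 14.0),
  Exp3 = c(45.6, 21.0, 17.0),
  row.names = c("name1", "name2", "name3")
)

# apply() coerces df to a matrix and calls var() on each row (MARGIN = 1);
# the result is a numeric vector named by the row names
row_var <- apply(df, 1, var)

# Vectorised alternative: sample variance from squared deviations
# about the row means (recycling subtracts each row's mean from that row)
m <- as.matrix(df)
row_var_fast <- rowSums((m - rowMeans(m))^2) / (ncol(m) - 1)

print(row_var)
print(all.equal(row_var, row_var_fast))
```

Whether the vectorised form is worth the extra line depends on the size of the data; for a few thousand rows apply() is usually adequate.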
Also, once such a metric has been calculated for each row, how best to store the results such that when (for instance) the results are sorted, I can access the row names along with the (ordered) variance value?
You can simply create a new variable in the data frame, e.g., df$var <- apply(df, 1, var). When you sort a variable in a data frame, usually the row names don't show in the result. But something like the following should work (again, if the size of the problem isn't too large for your resources):

    names(df$var) <- rownames(df)
    sort(df$var)

I hope that this helps,
John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
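
The storing and sorting step can be sketched as follows, again on a small made-up data frame (the names row_var, sorted_var and df_sorted are illustrative, not from the original message). This is one possible variant rather than the only way to do it: it relies on apply() returning a vector already named by the row names, and it uses order() to sort the whole data frame so that the row names stay attached to the values.

```r
# Made-up example data with the same shape as in the question
df <- data.frame(
  Exp1 = c(12.6, 11.9, 10.0),
  Exp2 = c(78.0, 19.0, 14.0),
  Exp3 = c(45.6, 21.0, 17.0),
  row.names = c("name1", "name2", "name3")
)

# Compute the variances before adding the new column, so that only
# the Exp columns enter the calculation
row_var <- apply(df, 1, var)

# sort() keeps the names, so each variance stays paired with its row name
sorted_var <- sort(row_var)

# Store the result as a column, then reorder the rows by it;
# indexing with order() preserves the row names
df$var <- row_var
df_sorted <- df[order(df$var), ]

print(sorted_var)
print(df_sorted)
```

Note that calling apply(df, 1, var) again after df$var has been added would include the new column in the calculation, so it is safer to compute on the original columns only.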