summing F stats and permutation
Jari,

Thanks. This is helpful. I knew that RDA obtained predicted abundances for each species, but I didn't know that the function rda could be used to calculate F statistics for each predictor for each species. That does seem like a promising way to approach the problem. So is the code you've sent me showing how to extract and sum the numerator and denominator SS and then calculate the ratio from those sums and the dfs? Or is it showing how to obtain Fs for each species and then sum them?

Incidentally, I was initially very skeptical of the sumF approach until I started comparing sumF results to perMANOVA results (using PC-Ord) on several different datasets of mine and got strikingly similar results between the two techniques. It seems to go against what a lot of statisticians are saying about the unique value of multivariate stats. I don't have a satisfactory explanation for why it seems to work with my data.

Steve

J. Stephen Brewer
Professor
Department of Biology
PO Box 1848
University of Mississippi
University, Mississippi 38677-1848
Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077

On 11/29/12 9:43 AM, "Jari Oksanen" <jari.oksanen at oulu.fi> wrote:
Steve,
This is R, so it is not about whether this can be done, but how this can
be done. Unfortunately, doing exactly this requires some fluency in R.
Doing something similar is very simple.
The description of your problem sounds very much like the description of
the permutation test in redundancy analysis (RDA). The difference is that
in RDA you sum up numerators and denominators before getting the ratio,
but in your model you sum up the ratios. So in the RDA test you have
(num_1 + num_2 + ... + num_p)/(den_1 + den_2 + ... + den_p), and in your
description you have num_1/den_1 + num_2/den_2 + ... + num_p/den_p. The
former test is canned, for instance, in the vegan package, but the latter
you must develop yourself (and the former method of summing variances
instead of their ratios feels sounder). It would not take too many lines
of code to write your test, though. Please note that RDA works by fitting
linear models for each species independently so that you can get all
needed statistics from a fitted RDA in the vegan package (function rda).
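
To make the distinction concrete, here is a minimal sketch in base R (no vegan needed) that computes both statistics from the same per-species sums of squares. The simulated data and the names trt, Y, F_rda and F_sum are illustrative assumptions, not part of the original code, and the design is assumed to be a simple one-way layout.

```r
## Hypothetical data: 12 plots, 3 treatments, 4 species
set.seed(1)
trt <- gl(3, 4)
Y <- matrix(rpois(12 * 4, lambda = 5), nrow = 12,
            dimnames = list(NULL, paste0("sp", 1:4)))

## lm() with a matrix response fits one linear model per column (species)
fit <- lm(Y ~ trt)
num <- colSums(scale(fitted(fit), scale = FALSE)^2)  # model SS per species
den <- colSums(resid(fit)^2)                         # residual SS per species
df1 <- nlevels(trt) - 1
df2 <- nrow(Y) - df1 - 1

## RDA-type statistic: sum numerators and denominators, then take the ratio
F_rda <- (sum(num) / df1) / (sum(den) / df2)
## sumF: take the ratio for each species, then sum the ratios
F_sum <- sum((num / df1) / (den / df2))

c(ratio_of_sums = F_rda, sum_of_ratios = F_sum)
```

The two numbers generally differ: in the ratio of sums, species with large variances dominate, whereas in the sum of ratios every species gets equal weight.
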
The following function extracts F-values by species from a fitted
vegan::rda() result object:
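spF <-
    function (object)
{
    ## works only on a fitted rda() result from the vegan package
    inherits(object, "rda") || stop("needs an rda() result object")
    ## df1: rank of the constraints; df2: residual degrees of freedom
    df1 <- object$CCA$qrank
    df2 <- nrow(object$CA$Xbar) - df1 - 1
    ## per-species sums of squares of fitted (CCA) and residual (CA) values
    num <- colSums(predict(object, type="working", model="CCA")^2)
    den <- colSums(predict(object, type="working", model="CA")^2)
    ## per-species F statistics
    (num/df1)/(den/df2)
}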
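
Once per-species F values are available, the sumF permutation test could be sketched roughly as follows. This is only a sketch under stated assumptions: it uses base R instead of vegan so that it runs on its own, it assumes a one-way design with freely exchangeable plots, and the names species_F and sumF_test and the simulated data are hypothetical.

```r
## Per-species F statistics for a one-way design (hypothetical helper)
species_F <- function(Y, group) {
    Y <- as.matrix(Y)
    group <- as.factor(group)
    fit <- lm(Y ~ group)            # one linear model per species
    df1 <- nlevels(group) - 1
    df2 <- nrow(Y) - df1 - 1
    num <- colSums(scale(fitted(fit), scale = FALSE)^2)
    den <- colSums(resid(fit)^2)
    (num / df1) / (den / df2)
}

## Permutation test of the summed F (hypothetical helper)
sumF_test <- function(Y, group, nperm = 999) {
    Fsp <- species_F(Y, group)
    obs <- sum(Fsp)
    ## shuffle treatment labels; whole rows of Y stay intact, so the
    ## correlations among species are preserved in every permutation
    perm <- replicate(nperm, sum(species_F(Y, sample(group))))
    list(sumF = obs, species = Fsp,
         p.value = (sum(perm >= obs) + 1) / (nperm + 1))
}

## Simulated example: 15 plots, 3 treatments, 6 species
set.seed(42)
group <- gl(3, 5)
Y <- matrix(rnorm(15 * 6), nrow = 15,
            dimnames = list(NULL, paste0("sp", 1:6)))
Y[, 1] <- Y[, 1] + 2 * as.integer(group)   # only sp1 responds to treatment
res <- sumF_test(Y, group)
res$sumF
res$p.value
```

The species element holds the individual F values, which is what would be read as each species' contribution to the treatment differences. More complex designs (blocks, repeated measures) would need restricted permutations, which this sketch does not attempt.
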
HTH, Jari Oksanen
________________________________________
From: r-sig-ecology-bounces at r-project.org [r-sig-ecology-bounces at r-project.org] on behalf of Steve Brewer [jbrewer at olemiss.edu]
Sent: 29 November 2012 16:42
To: r-sig-ecology at r-project.org
Subject: [R-sig-eco] summing F stats and permutation

Dear Colleagues,

I'm wondering if anyone in this group has developed code for doing a sumF test for examining community responses in an experiment. For those not familiar, sumF is a simple univariate alternative to MANOVA and perMANOVA, wherein univariate ANOVAs and their associated F statistics are calculated for each species' abundance and then the F statistic for each effect is summed over all species. The significance of the resulting summed F statistic is then evaluated using random permutation. The summed F statistic is interpreted as an overall community response to the treatments, whereas the F statistic for each species provides a measure of the contribution that species makes to treatment differences.

I could envision a variety of ways in which this could be done in R, but I'm not adept enough in R to figure out how to do it myself. One possibility might involve using permute or shuffle to get the randomized data matrices, but it is not clear to me how one could simultaneously calculate the ANOVAs for all species and sum the resulting F statistics for each random permutation. There is no reason why traditional F statistics would have to be used. Pseudo-F statistics based on distances for each species' abundance could be calculated instead and then summed across species.

PLEASE NOTE THAT I AM ALREADY AWARE OF THE OBJECTIONS TO THIS APPROACH TO COMMUNITY ANALYSIS. Nevertheless, I am interested in pursuing this using R, if possible. Any suggestions are welcomed.

Thanks,
Steve

J. Stephen Brewer

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology