Extract residuals from adonis function in vegan package

Dear Alicia Valdés,

On 18/03/2014, at 13:53, Alicia Valdés wrote:
My problem is that I cannot figure out how to get residual values from the
adonis model.

You cannot get residuals from the output of adonis(). 

We could change the function so that this is possible, but the current function does not return the information needed for getting residuals. Nor would they be residuals in the traditional meaning of the word, as we are dealing with dissimilarities or distances, and these cannot be negative. We have to discuss this with the vegan developers.

Cheers, Jari Oksanen
Dear Alicia and Jari,

just a thought:
Couldn't capscale or betadisper be used for this?
 - To obtain the distances to the group centroid?

But then: how to convert this from distances to abundances?

Eduard Szöcs
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Eduard Szöcs
Quantitative Landscape Ecology
Institute for Environmental Sciences
University Koblenz-Landau
Fortstrasse 7
76829 Landau
Germany
http://www.uni-koblenz-landau.de/campus-landau/faculty7/environmental-sciences/landscape-ecology/Staff/eduardszoecs

You can *almost* do this with capscale(), but not quite: for semimetric dissimilarities the capscale() results are not identical to those of adonis() (they are identical for metric distances). The capscale() function also has fitted() and residuals() methods that both return dissimilarities. Now it also depends on what you mean by "residuals". The capscale() interpretation, and the one I had in mind, is that

1) adonis(fitted(adonis(y ~ model)) ~ model) should give distances where the fitted part is similar to that of the original adonis(y ~ model) and the residual variation is null, and

2) adonis(residuals(adonis(y ~ model)) ~ model) should give distances where the fit is null and the residual variation is similar to that of the original adonis(y ~ model).

It would be possible to develop such functions, but not with the current adonis() output. You can approximate both of these with capscale() and its fitted() and residuals() methods, but not exactly. 
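
A minimal sketch of the capscale() approximation described above. It assumes the dune and dune.env example data shipped with vegan and uses Management as a stand-in predictor; neither comes from the original question. Both methods return dissimilarities rather than signed residuals.

```r
library(vegan)

data(dune)      # example community data from vegan (an assumption, not Alicia's data)
data(dune.env)  # matching site descriptors

# Distance-based constrained ordination on Bray-Curtis dissimilarities
mod <- capscale(dune ~ Management, data = dune.env, distance = "bray")

fit_d <- fitted(mod)     # dissimilarities reproduced by the constraints
res_d <- residuals(mod)  # dissimilarities left unexplained by the constraints

# Both are 'dist' objects and therefore non-negative
class(fit_d)
range(res_d)
```

Because Bray-Curtis is semimetric, these values only approximate what a hypothetical residuals() method for adonis() would return.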

The ecodist package of Sarah Goslee takes a different approach, and could return something usable (but I do not know that package very well).

What is really needed depends on what you mean by "residuals". Should they be dissimilarities (which cannot be negative) or straightforward residuals (which have an average of zero, so that some of them are negative)?

Cheers, Jari Oksanen
Hi all,
I would like to first use adonis to remove the region
effect, that is, to fit a model like

Jari and Eduard, please correct me if I'm wrong.
It's just a quick thought: couldn't you, Alicia, use Condition() to
remove the effect of region from your analysis? Something along the lines of:

capscale(communitydistances ~ environmentalvar+Condition(region))
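
A runnable sketch of the same idea, assuming vegan's dune example data: here Management merely plays the role of "region" and A1 the role of the environmental variable, so the names are illustrative only.

```r
library(vegan)

data(dune)
data(dune.env)

# Partial constrained ordination: the effect of A1 is assessed after the
# variation attributable to Management (standing in for region) is removed
mod <- capscale(dune ~ A1 + Condition(Management),
                data = dune.env, distance = "bray")
mod

# Permutation test of the constrained (A1) part, given the conditioned factor
anova(mod, permutations = 199)
```

The printed summary splits the total variation into conditional, constrained and unconstrained parts, which shows how much is attributed to the conditioned factor before the constraint is fitted.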

Indicator species analysis could be done for each region separately.

In addition, you could use simper() to analyse groups in your
data, which should be compatible with capscale if you use beta-w there
(as the presence-absence analogue of Bray-Curtis distances).

Cheers,
 arne

Hi and many thanks for your replies,

I have had a look at capscale() and betadisper(), but as you said, this could only provide "residuals" in the form of dissimilarities, and what I actually would like to have are true residuals, negative and positive.

I have also looked into ecodist, but I did not find anything that could help. 

Now you really have to tell us what you mean by "true" residuals. From the point of view of adonis(), "true residuals" are dissimilarities, but it seems that you want something completely different. What might that be? Your concept of "residual" seems to be different from mine, so please explain what you want to have.

If you want to have residuals of raw data, you can use rda() and its residuals() method. However, if you intend to use these in subsequent analysis, please note that at least the labdsv functions seem to assume that the data are non-negative, but they do not check this.
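
A minimal sketch of the rda() route, again assuming vegan's dune data with Management standing in for region (both are illustrative choices, not Alicia's variables):

```r
library(vegan)

data(dune)
data(dune.env)

# Fit the "region only" model on the raw community data
mod <- rda(dune ~ Management, data = dune.env)

# Residuals of the raw data: a sites x species matrix with the same shape
# as the input, containing both negative and positive values
res <- residuals(mod)

dim(res)
any(res < 0)
```

The negative entries are the reason the non-negativity assumption of downstream functions matters here.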

Cheers, Jari Oksanen
On Tue, Mar 18, 2014 at 5:02 PM, Alicia Valdés wrote:
...
However, what I attempt to do is to perform an indicator species analysis
(ISA) with these residuals. I want to see if I can find species which are
indicators for different environmental conditions, but first I would like
to remove the differences in species composition due to the study region
...

Isn't an indicator species analysis *based* on differences in species
composition? How could one "remove the difference in species
composition due to region" and then still hope to identify indicator
species for the regions that are "equalized"?

Cheers,
Ivailo
"The cure for boredom is curiosity. There is no cure for curiosity." --
Dorothy Parker
On Tue, Mar 18, 2014 at 5:02 PM, Alicia Valdés wrote:
...
However, what I attempt to do is to perform an indicator species analysis
(ISA) with these residuals. I want to see if I can find species which are
indicators for different environmental conditions, but first I would like
to remove the differences in species composition due to the study region
(which in fact accounts for a big part of the differences in species
composition). I am using the packages indicspecies and labdsv for ISA but
in neither case did I find a way of including this as, for example, a
block variable; that is why I attempt to get the residuals.
Alicia,

following up on my previous comment, I think you might use the regions
as a typology on which to base your ISA. So you'll get the
characteristic species for each region and there is no need to
"account" for differences in species composition among regions
(moreover, I fail to understand why one might need to do so).

If you take a look at the help page for multipatt() in indicspecies,
you'll see that you need a community data table (your presence/absence
matrix) and a site classification (your "regions" if these are not
further classified into meaningful clusters; although you didn't
provide more details on how the forest patches relate to the regions
you wrote about) to run the analysis.
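
A sketch of such a call, assuming the indicspecies, permute and vegan packages are installed. The dune data and the Management factor are stand-ins for the presence/absence matrix and the site classification; they are not from the original question.

```r
library(indicspecies)  # provides multipatt()
library(vegan)         # used here only for the dune example data

data(dune)
data(dune.env)

# Presence/absence version of the community table
pa <- (dune > 0) * 1

set.seed(1)  # permutation p-values vary between runs without a seed
isa <- multipatt(pa, dune.env$Management, func = "IndVal.g",
                 control = permute::how(nperm = 199))

summary(isa)
```

The second argument is where a classification into management types, age groups or regions would go.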

HTH,
Ivailo
"The cure for boredom is curiosity. There is no cure for curiosity." --
Dorothy Parker
Alicia

One more thought. I wonder if part of the problem is that you're
attempting to use ISA to do something it was not designed to deal with.
Sometimes that can work and result in a clever new approach, but in this
case, I don't see how it can work. As I recall, ISA is done by taking the
product of relative abundance and relative frequency. Therefore, as
mentioned by Jari, it can only be used with positive values of abundance
and frequency. Yes, it is true that you can transform the residual
abundances to make them all positive. That seems perfectly reasonable if
you then intend to use the residual abundances (and the residual
abundances only) to examine associations with environmental groups. What
is puzzling me is how one calculates a relative frequency using residual
abundances or how one calculates a "residual frequency." Is it your
intention to calculate residual frequencies? It's not clear to me how that
could be done. I suppose you could leave out the relative frequency when
doing the ISA, but that leads me to the following question: Can you tell
us why indicator species analysis is a better approach for determining
indication of particular environmental conditions than are species scores
generated from a capscale ordination?  Given the flexibility of capscale
to deal with categorical predictors, it has never been clear to me what
advantages ISA has over species scores in interpreting environmental
associations (other than generating a Monte Carlo-derived significance
value). If there is not much difference, then I would do a capscale
analysis using region as a Condition factor (as suggested by Arne) and
then examine the species scores.
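
To make the "relative abundance times relative frequency" product concrete, here is a base-R sketch on hypothetical toy data. It follows the commonly cited Dufrêne and Legendre form of the indicator value as an assumption; it is not necessarily the exact computation used by indicspecies or labdsv.

```r
# Hypothetical toy data: 6 sites in two groups, abundances of 2 species
abund <- matrix(c(4, 6, 5, 0, 1, 0,    # sp1: mostly in group "a"
                  1, 0, 2, 3, 0, 4),   # sp2: present in both groups
                ncol = 2, dimnames = list(NULL, c("sp1", "sp2")))
group <- factor(c("a", "a", "a", "b", "b", "b"))

# Mean abundance of each species within each group (groups x species)
mean_ab <- apply(abund, 2, function(x) tapply(x, group, mean))

# Relative frequency: proportion of sites in each group where the species occurs
freq <- apply(abund > 0, 2, function(x) tapply(x, group, mean))

# Relative abundance: each group's share of the species' summed group means
rel_ab <- sweep(mean_ab, 2, colSums(mean_ab), "/")

# Indicator value in percent
indval <- 100 * rel_ab * freq
round(indval, 1)
```

Both ingredients presuppose non-negative input: with signed residuals, abund > 0 no longer means "present" and the shares in rel_ab are no longer proportions, which is the difficulty raised above.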

Steve

J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077

Thanks for so many thoughts!

Arne: I could use Condition() for that, but something like
capscale(communitydistances ~ environmentalvar+Condition(region))
would not do what I need, because I need to fit a model with only
"region", extract the residuals, and then do another analysis (ISA) with
these residuals. However, if this kind of formulation is correct, it could
also be useful for seeing the effects of environmental factors while
controlling for the effect of region.

"Indicator species analysis could be done for each region separately": yes,
but I would like to try both approaches, ISA for all regions (but removing
the region effect) and ISA for each region separately, and compare the
results.

Jari: OK, I guess that even if they could be calculated, residuals from the
adonis() point of view are not suitable for me, as they are
dissimilarities. Sorry if I was confusing; what I need are residuals of the
raw data, as you say. So I guess the residuals() method of rda() should work.
Yes, I know that ISA needs non-negative values; I was intending to
transform these residuals to make them all positive.

Pierre: I checked the manyglm() function in the mvabund package and I think
it could be useful too; you can indeed get residuals from a multivariate GLM
with family=binomial.

Ivailo: I think you may have misunderstood my approach. I want to perform an
ISA based on differences in species composition, but I want to focus on the
part of these differences which is not caused by the region studied. I want
to identify indicator species for different environmental conditions and
characteristics of forest patches (like landscape management type, forest
age and others), but I am not interested in finding the characteristic
species for the different regions. To make it clearer: I have data from
different regions, and within each region there is a series of forest
patches which vary in management type, age and other factors. So the
cluster I am using in multipatt() is a classification into management
types, age groups, etc. As I said before, I am not interested in using
"regions" for site classification. I hope this makes the analysis easier to
understand now!

So my main questions now are, for this kind of purpose: 1) what is your
opinion on using in the ISA a community data table which contains not the
actual presence/absence, but residuals from a previous model with a region
effect? and 2) which kind of residuals do you find more appropriate to use
here (rda residuals or multivariate GLM residuals)?

Cheers,

Alicia
