My apologies for sending my reply to Jari Oksanen only.
See it below, please.
Vit
Dear Jari Oksanen,
Thank you for your suggestions!
You really have to define what is a good fit. Having a large change in
probability is different from explaining a large part of the variation of
observations. You may have a cute, strong response and still large
residuals.
In the language of linear models, you may have a steep regression slope and
still a large residual variance. Your question sounded like the 'slope'
component, but most statistics deal with the 'residual' component, either
in absolute terms ('residuals') or in relative terms
('residuals'/'original').
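The distinction between the 'slope' and the 'residual' component can be
illustrated with a small simulation. This is only a sketch on made-up data
(the variable names and noise levels are assumptions, not part of the
original discussion): two responses share the same true slope but differ
in residual variance.

```r
# Hypothetical simulated data: same true slope, different residual variance
set.seed(1)
x <- seq(0, 10, length.out = 50)
y_tight <- 2 * x + rnorm(50, sd = 1)    # steep slope, small residuals
y_noisy <- 2 * x + rnorm(50, sd = 15)   # same slope, large residuals

fit_tight <- lm(y_tight ~ x)
fit_noisy <- lm(y_noisy ~ x)

# 'Slope' component: estimated regression coefficient of x
coef(fit_tight)["x"]
coef(fit_noisy)["x"]

# 'Residual' component in absolute terms: residual standard deviation
sigma(fit_tight)
sigma(fit_noisy)

# 'Residual' component in relative terms: R2 = 1 - residual/original
summary(fit_tight)$r.squared
summary(fit_noisy)$r.squared
```

Both fits estimate the same underlying slope, but only the first one
predicts the observations well.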
It was not clear, but I definitely meant good fit in terms of residuals:
the species I am interested in are those whose abundance (or probability
of presence) changes significantly within the two-dimensional ordination
space and can be predicted well (with a reasonable R2) using the two
ordination axes.
For example, both ubiquitous species and species with their optimum
around the middle of the gradient are gathered in the middle of the
ordination diagram. I would like to eliminate the ubiquitous ones. Now I
see that it is a more complex issue, as species may be predicted well
using the ordination axes but still differ in niche breadth.
It is not in the Canoco proper (or the code that Cajo ter Braak wrote on
the basis of Mark Hill's original), but in its support and wrapper
functions. These supports were originally a separate program (CanoDraw),
but are now bundled together. Here the criterion was different: it was the
species response to the constraints and not to the ordination axes. These
may be very, very different. Further, constraining uses a linear model, so
you will clearly use here linear goodness of fit to the original
environmental variables instead of non-linear response to the ordination
axes.
Ok, thank you.
Actually, when I asked for the first time, I expected there to be some
common practice for calculating the species fit in ordination. Most
recently I read in a figure legend, without any other explanation, that:
"The species displayed ... have a more than average fit and occur five or
more times in the data".
(ter Braak CJF, Schaffers AP, 2004: Co-correspondence analysis: A new
ordination method to relate two community compositions. Ecology 85(3):
834-846.)
I really don't know what fit they meant. Maybe I should ask them.
Function goodness() for cca/rda finds the Canoco-like statistics either as
residual distances or as proportion explained. You may have to set
choices, and set summarize = TRUE. See ?goodness.cca. The text() and
points() functions for most ordination objects have a 'select' argument in
vegan so that you can pick the cases you want to display.
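A minimal sketch of this suggestion, assuming the dune data shipped with
vegan and an arbitrary pair of constraints chosen only for illustration.
The selection rule (above-average fit, five or more occurrences) is one
possible reading of the figure legend quoted earlier, not a confirmed
description of what those authors computed.

```r
library(vegan)
data(dune)
data(dune.env)

# Constrained ordination; the constraints are picked only for illustration
ord <- cca(dune ~ A1 + Moisture, data = dune.env)

# Cumulative proportion of each species' variation explained by the first
# two constrained axes (summarize = TRUE keeps only the accumulated total)
gof <- goodness(ord, display = "species", choices = 1:2,
                model = "CCA", summarize = TRUE)

# Assumed rule: above-average fit and at least five occurrences
keep <- gof > mean(gof) & colSums(dune > 0) >= 5

# Plot an empty frame, then label only the selected species
plot(ord, display = "species", type = "n")
text(ord, display = "species", select = keep)
```

Note that this measures the fit to the constrained axes, which, as said
above, may differ a lot from the response to the ordination axes as such.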
In principle, you can use envfit() for species, but it implies a linear
species response to the gradients. Function ordisurf() fits a smooth
surface, and if you set knots = 2, it will fit a quadratic surface, and
with family = binomial it will fit something similar to a Gaussian species
response on the ordination space (with family = quasipoisson it would be
the Gaussian response, but with binary responses you must use
family = binomial). The function returns an object of mgcv:::gam and you
can use all mgcv:::gam methods for further analysis of the results.
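The two approaches could be sketched as follows. This assumes the dune
data from vegan, an unconstrained correspondence analysis as the example
ordination, and one arbitrarily chosen species ("Lolipere"); none of these
choices come from the original discussion. The binomial fit may emit
warnings because cca results carry non-constant site weights.

```r
library(vegan)
data(dune)

# Unconstrained ordination, used only as an example configuration
ord <- cca(dune)

# Presence/absence of one species; the species is picked arbitrarily
pres <- as.numeric(dune[, "Lolipere"] > 0)

# knots = 2 requests a quadratic trend surface; binomial for 0/1 data
surf <- ordisurf(ord, pres, knots = 2, family = binomial, plot = FALSE)

# The result inherits from class "gam", so mgcv methods apply
summary(surf)

# Linear alternative: fit all species as vectors on the first two axes
fit_lin <- envfit(ord, dune, choices = 1:2)
fit_lin$vectors$r   # squared correlation coefficient for each species
```

The summary of the surface reports the deviance explained, which could
serve as a per-species measure of fit if the loop is run over all species.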
It seems to me that the solution is to use the ordisurf method with your
suggestions when the species response is expected to be unimodal, and
otherwise (if a linear response is expected) the envfit method would do
the job.
One last question: should I care about the arch effect when estimating the
species fit (when expecting a unimodal species response)? In
co-correspondence analysis the arch effect is common.
Thank you very much!
Yours,
Vit