An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20100603/ba13cf3e/attachment.pl>
randtest.enfa {adehabitat}
10 messages · Clément Calenge, Consuelo Hermosilla, Mathieu Basille
I have another question (oops)!! When you do the randomisation test for ENFA and plot the result (with scatter() function), what's the meaning of the arrow value?
It is not possible to use the function scatter to plot the result of a randomization test. I do not understand your question here... Best regards, Cl?ment Calenge
Cl?ment CALENGE Cellule d'appui ? l'analyse de donn?es Office national de la chasse et de la faune sauvage Saint Benoist - 78610 Auffargis tel. (33) 01.30.46.54.14
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20100604/9bf991da/attachment.pl>
3 days later
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20100607/4ba22c83/attachment.pl>
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20100608/e5ea9dd9/attachment.pl>
Consuelo, Here are some elements that you might find useful.
I had already read the paper. I understood the idea of the scatter plot, that the arrows were a projection of the marginality and specialization but I didn't understand where the length of arrows came from. I was somehow lost thinking how their length represent the marginality and specialization. Now you tell me that they "correspond to the scores of the environmental variables on the axes of the ENFA". Hmmm.... Are these scores given somewhere in the output of enfa?
Yes they are. Look at the $co element of the enfa object. Following the
example in ?enfa:
> enfa1$co
Mar Spe1
forets 0.7178829 0.29304902
hydro 0.0977121 -0.94018385
routes -0.5394778 0.15001749
artif 0.4290224 -0.08758622
It gives you the coordinates of the arrow. The starting point of the
vector is the average availability which is centred on zero.
You see, the main issue for me is that if I put that graph in a manuscript, I'd like to be able to tell, the arrow length represent XX times the marginality and XX times the specialization of each variable or something like that. In order to give an idea of the real value of marginality and specialization. Does it make any sense? Maybe it doesn't make any sense and it's not necessary, what do you think? I hope you understand me this time... otherwise, I'll give up!!!
You should interpret the scatter plot qualitatively. E.g. in the example of ?enfa, the marginality is very strong on "forets" (forest) with a shift towards high values, on "routes" (roads) with a shift on low values, and to a lesser extent on "artif" (artificial areas) with a shift on high values. On the other hand, the specialization is very pronounced on "hydro" only, with a restriction around the mean (you have to look at the specific distributions on this particular variable to say more about it).
Regarding the global specialization issue. Not sure about something again. You told me that there's not global specialization, but Hirzel et al (2002), described it. Why you say "it cannot be measured globally over the ecological space"? Is it wrong if I use the global specialization formula of Hirzel et al (square root of the sum of the eigeinvalues divided by the number of variables) to estimate this global specialization (sensu Hirzel et al)? Can I do that?
You can indeed compute such an index. But now, how would you interpret it? The only case it could be useful is the comparison of two niches in the same environment, i.e. same availability. Two study sites cannot be compared on such basis. So that it would give you an idea of a global specialization that is totally context dependent. This is why it is not, from my point of view, very useful.
And regarding the tolerance, just to be sure we're talking about the same, is it the inverse of the specialization, right? if I calculate it, could I use it to estimate the specialization? Actually, I have tried to calculate the tolerance, but I think I have used the wrong "wei". Can you please confirm I'm using the correct terms?
It is conceptually the inverse of the specialization: the specialization is a ratio of variances (available over used) while the tolerance is the sum of used variances over all variables (which is similar to divide it by an available variance of 1). You might have a look at ?niche.test for a global measure of tolerance.
You told me to use this formula: sum(dudi.pca(tab, row.w=wei,
scan=FALSE)$eig); in which "tab" is the dataframe containing the values of
environmental variables and "wei" is vector describing the utilization
weight of each pixel. Using the manual example for enfa:
data(lynxjura)
map <- lynxjura$map
tmp <- lynxjura$locs[,4]!="D"
locs <- lynxjura$locs[tmp, c("X","Y")]
dataenfa1 <- data2enfa(map, locs)
enfa1 <- enfa(pc, dataenfa1$pr,+ scannf = FALSE)
For me, in this case, "tab" would be kasc2df(map) and "wei" would be
dataenfa1$pr
so, the tolerance would be: sum(dudi.pca(kasc2df(map), row.w=dataenfa1$pr,
scan=FALSE)$eig)
Is this correct? I got this huge value (2355). It's kind of high, isn't it?
Maybe try: sum(dudi.pca(kasc2df(map), row.w = dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig) which gives you 5 only. The weights should sum to 1 (i.e. they are proportions). But then, how would you interpret this? This is the same as for the "global specialization".
And about the specialization axis you keep in enfa (or madifa or gnesfa or any other one of these analysis), is there a way to get the % variation explained by each eigenvalue? I mean, a referee can ask that, right? I have tried to figure it out, but without any luck... I have chosen only one eigenvalue of specialization, but I wasn't able to calculate how much variation it accounted for.
There is no way to give a percentage of variation for a specialization axis, by nature: the eigenvalue of each axis gives you a ratio of variance. E.g. an eigenvalue of 12 tells you that the available variance on this axis is 12 times higher than the used variance. It is up to you to interpret this (is it a strong specialization or not).
I hope you don't mind all these question... I need to be sure of what I'm doing if I want to publish these results...
No problem about it. The ENFA, however, should be seen as a very elegant way of exploring the data, at the manner of a PCA (and the interpretation is in many ways similar, when you take into account the specificity of the analysis). Hope this helps a bit, Mathieu.
Thanks!!!
Consuelo
-------------
Consuelo Hermosilla
PhD student
Departamento de Ecolog?a y Biolog?a Animal
Departamento de Bioqu?mica, Gen?tica e Inmunolog?a, ?rea de Gen?tica
Facultad de Ciencias del Mar
Campus de As Lagoas-Marcosende
Universidad de Vigo
36310 Vigo
SPAIN
Mobile: +34 692 633 298
oooO
( ) Oooo
( ( )
_) ) /
(_/
Stop Gaza Massacre
2010/6/7 Cl?ment Calenge <clement.calenge at gmail.com>
On 06/04/2010 09:40 AM, Consuelo Hermosilla wrote: Hummm, true, I got confused! Sorry!! I meant scatter.enfa. What I don't understand is the length of the arrows. The grid is d=2, different from the grid set with the biplot of marginality and specialization (d=0.5). In that case, the length make senses to me, but I don't understand it in the scatter.enfa. They don't have the same lenght /value as they have in the biplot. Do you understand me? I do not understand you... The function scatter.enfa draws the biplot associated with the results of the ENFA. The following paper explains this graph in detail: Basille, M., Calenge, C., Marboutin, E., Andersen, R. and Gaillard, J.M. 2008. Assessing habitat selection using multivariate statistics: some refinements of the ecological niche factor analysis. Ecol. model. 211: 233--240.
I am not sure what you mean by "biplot of marginality and specialization". There is only one way to draw a biplot with the ENFA: it is provided by scatter.enfa (see the paper cited above). So I do not understand how the result of scatter.enfa could be inconsistent with the biplot, since the result of scatter.enfa *is* the biplot. Best, Cl?ment Calenge -- Cl?ment CALENGE Cellule d'appui ? l'analyse de donn?es Office national de la chasse et de la faune sauvage Saint Benoist - 78610 Auffargis tel. (33) 01.30.46.54.14
[[alternative HTML version deleted]] ------------------------------------------------------------------------
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
~$ whoami Mathieu Basille, Post-Doc ~$ locate Laboratoire d'?cologie Comportementale et de Conservation de la Faune + Centre d'?tude de la For?t D?partement de Biologie Universit? Laval, Qu?bec ~$ info http://ase-research.org/basille ~$ fortune ``If you can't win by reason, go for volume.'' Calvin, by Bill Watterson.
sum(dudi.pca(kasc2df(map), row.w = dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig) which gives you 5 only. The weights should sum to 1 (i.e. they are proportions). But then, how would you interpret this? This is the same as for the "global specialization".
Yes, but be careful: kasc2df returns a list with one component "tab" (the component of interest) and one component "index" (the component allowing to rebuild the original kasc), so that the correct code is: sum(dudi.pca(kasc2df(map)$tab, row.w = dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig) Best, Cl?ment Calenge
On 06/09/2010 09:05 AM, Cl?ment Calenge wrote:
sum(dudi.pca(kasc2df(map), row.w = dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig) which gives you 5 only. The weights should sum to 1 (i.e. they are proportions). But then, how would you interpret this? This is the same as for the "global specialization".
Yes, but be careful: kasc2df returns a list with one component "tab" (the component of interest) and one component "index" (the component allowing to rebuild the original kasc), so that the correct code is: sum(dudi.pca(kasc2df(map)$tab, row.w = dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig)
I was a bit too hasty for this reply. Actually the code is incorrect. Consider the following code: pc <- dudi.pca(kasc2df(map)$tab, row.w = dataenfa1$pr/sum(dataenfa1$pr)) This code performs a *centered and scaled* PCA of the environmental variables: this analysis first centers and scales the table containing the value of the environmental variables. Therefore, for each variable of pc$tab, the mean *weighted by the utilization weights* is equal to zero and the variance *weighted by the utilization weights* is equal to 1. Therefore, the sum of the eigenvalues of the PCA on this transformed table is equal to the number of environmental variables. This code is incorrect (thanks Mathieu for noting the inconsistency). First, to calculate the global tolerance, you would need to calculate: pc <- dudi.pca(kasc2df(map)$tab, scan=FALSE) This preliminary analysis is performed just to center/scale the variables with uniform weighting. This allows to compare the different variables in pc$tab (they all have the same average and the same scale). *Then*, calculate the weighted and *unscaled* PCA of the table pc$tab: sum(dudi.pca(pc$tab, scale=FALSE, row.w = dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig) This gives the tolerance of the species on the area, Sorry for the confusion, HTH, Cl?ment Calenge
Cl?ment CALENGE Cellule d'appui ? l'analyse de donn?es Office national de la chasse et de la faune sauvage Saint Benoist - 78610 Auffargis tel. (33) 01.30.46.54.14
4 days later
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20100613/a225fbf3/attachment.pl>
1 day later
Consuelo,
Again, about the tolerance, using the code you told me: pc3 <- dudi.pca(kasc2df(map)$tab, scan=FALSE) sum(dudi.pca(pc3$tab, scale=FALSE, row.w = dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig) I got* 3.129783*, is this correct?
Yes it is. You can double check it with niche.test:
bla <- niche.test(map, lynxjura$locs[tmp, c("X", "Y")])
bla$obs
0.1450481 3.1297832
The first value is the marginality (same as in enfa1), the second one is
the tolerance.
But I don't understand, shouldn't tolerance have a value ranging from 0 to 1? I thougth that was the idea, since specialization varies from 1 to infinite, right? [i.e. Reutter et al 2003***]
Not necessarily. But I guess you will find different definitions of (global) tolerance and (global) specialization according to different sources. In any cases, I would not recommend the use of a *global* tolerance/specialization index, unless it is used to compare: - two species in the same study area - the same species in two periods, given that the environment does not change.
In the other hand, I have estimated the global specializations using Hirzel et al (2002) formula, like this: S <- (sqrt(sum(enfa1$s)))/1.96 Does it seem correct?
According to Hirzel et al. (2002)'s definition, it should be divided by the number of specialization axes. It is just the mean eigenvalue of specialization, which can be seen as a way to define a global specialization. Now, as I said before, it is difficult to use it as is.
I got this value S = 0.996322 (<1 though), and then, my tolerance (1/S) was 1.003692 (>1). It bothers me that we don't have the same values! grrr
My guess: the S = 1/T relation only hold on one dimension, but not in the N-dimensions ecological space. Could be wrong, though. Anyway, Hirzel et al. (2002)'s definition of specialization cannot not be used in this context.
What do you think? About my other question about the scatter plot of enfa, the scores and so on... I think I haven't been able to explain myself... I have uploaded a picture http://img808.imageshack.us/i/scatter.jpg/. I hope you can see what I mean there.
Now I see. The d=1 that you have in the upper-right corner just holds for the projection of pixels (simplified with the use of MCPs). Try the following: > scatter(enfa1, pts = TRUE) > summary(enfa1$li) which should help. The arrows are just proportional to their real values, in order to adequately fit in the graph. Regards, Mathieu.
Thanks again guys!!!
Consuelo
*** Reutter, B.A, V. Helfer, A.H. Hirzel1 & P. Vogel. 2003. Modelling
habitat-suitability using museum collections: an example with three
sympatric Apodemus species from the Alps. Journal of Biogeography, 30,
581?590
-------------
Consuelo Hermosilla
PhD student
Departamento de Ecolog?a y Biolog?a Animal
Departamento de Bioqu?mica, Gen?tica e Inmunolog?a, ?rea de Gen?tica
Facultad de Ciencias del Mar
Campus de As Lagoas-Marcosende
Universidad de Vigo
36310 Vigo
SPAIN
Mobile: +34 692 633 298
oooO
( ) Oooo
( ( )
_) ) /
(_/
Stop Gaza Massacre
2010/6/9 Cl?ment Calenge <clement.calenge at gmail.com
<mailto:clement.calenge at gmail.com>>
On 06/09/2010 09:05 AM, Cl?ment Calenge wrote:
sum(dudi.pca(kasc2df(map), row.w =
dataenfa1$pr/sum(dataenfa1$pr),
scan=FALSE)$eig)
which gives you 5 only. The weights should sum to 1 (i.e.
they are
proportions). But then, how would you interpret this? This
is the same
as for the "global specialization".
Yes, but be careful: kasc2df returns a list with one component
"tab" (the component of interest) and one component "index" (the
component allowing to rebuild the original kasc), so that the
correct code is:
sum(dudi.pca(kasc2df(map)$tab, row.w =
dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig)
I was a bit too hasty for this reply. Actually the code is
incorrect. Consider the following code:
pc <- dudi.pca(kasc2df(map)$tab, row.w = dataenfa1$pr/sum(dataenfa1$pr))
This code performs a *centered and scaled* PCA of the environmental
variables: this analysis first centers and scales the table
containing the value of the environmental variables. Therefore, for
each variable of pc$tab, the mean *weighted by the utilization
weights* is equal to zero and the variance *weighted by the
utilization weights* is equal to 1. Therefore, the sum of the
eigenvalues of the PCA on this transformed table is equal to the
number of environmental variables. This code is incorrect (thanks
Mathieu for noting the inconsistency). First, to calculate the
global tolerance, you would need to calculate:
pc <- dudi.pca(kasc2df(map)$tab, scan=FALSE)
This preliminary analysis is performed just to center/scale the
variables with uniform weighting. This allows to compare the
different variables in pc$tab (they all have the same average and
the same scale). *Then*, calculate the weighted and *unscaled* PCA
of the table pc$tab:
sum(dudi.pca(pc$tab, scale=FALSE, row.w =
dataenfa1$pr/sum(dataenfa1$pr), scan=FALSE)$eig)
This gives the tolerance of the species on the area,
Sorry for the confusion,
HTH,
Cl?ment Calenge
--
Cl?ment CALENGE
Cellule d'appui ? l'analyse de donn?es
Office national de la chasse et de la faune sauvage
Saint Benoist - 78610 Auffargis
tel. (33) 01.30.46.54.14
~$ whoami Mathieu Basille, Post-Doc ~$ locate Laboratoire d'?cologie Comportementale et de Conservation de la Faune + Centre d'?tude de la For?t D?partement de Biologie Universit? Laval, Qu?bec ~$ info http://ase-research.org/basille ~$ fortune ``If you can't win by reason, go for volume.'' Calvin, by Bill Watterson.