pca or nmds (with which normalization and distance ) for abundance data ?
9 messages · claire della vedova, stephen sefick, Alan Haynes +2 more
On Thu 13 Dec 2012 09:24:41 AM CST, claire della vedova wrote:
Dear all,

I'm a biostatistician working for a French institute involved in environmental risk assessment, and I need help understanding the results I obtained from several ordination analyses.

I have a dataset of 25 sites. For these 25 sites I have abundance data for 38 species and measurements of 5 environmental variables.

Here is an extract of my abundance data for the first 5 sites:

Anguinidae.ditylenchus  Aphelenchidae  Aphelenchoididae  Aporcelaimidae
                    12             18               184               0
                     0             14               154               0
                    45              0               101               6
                    20              0               148               0
                     0              0               118               0

Here are the environmental data for the first 5 sites:

ExtPond   moist    Corg    pH    DV50
  0.946   9.086   4.269  5.24  171.33
  0.682  27.139  23.813  3.82   75.45
  2.480  14.322   7.191  4.48  230.90
  3.069  18.380  11.404  3.58  211.19
  2.615  16.693   7.128  4.12  224.45

My aim is to study how the distribution of species is linked with the environmental data.

First, I did a PCA (with the vegan library) using a Hellinger transformation, with commands like this:

    acp1 <- rda(decostand(myDataSpec[, c(25:62)], "hellinger"))
Is the Hellinger transform done on relative proportions?
The first axis represents 19.5% and the second 16.3%. A colleague of mine said that this is not so bad for abundance data, but it seems quite poor to me. What do you think?
You could use something like the broken stick model, or other criteria, to assess how many axes are necessary, but two axes explaining <40% of the variation seems low.
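The broken-stick comparison suggested here can be sketched in base R. This is only an illustration under assumptions, not the analysis run in this thread: `abund` holds just the five-site extract from the original post, `broken_stick()` is a hypothetical helper name, and `prcomp()` on Hellinger-transformed data stands in for the `rda()` call. vegan also provides a `bstick()` function for the same purpose.

```r
# Five-site extract from the original post (rows = sites, columns = taxa).
abund <- matrix(c(12, 18, 184, 0,
                   0, 14, 154, 0,
                  45,  0, 101, 6,
                  20,  0, 148, 0,
                   0,  0, 118, 0),
                nrow = 5, byrow = TRUE)

# Hellinger transformation: square root of row-wise proportions.
hel <- sqrt(abund / rowSums(abund))

# Unscaled PCA; the squared standard deviations are the axis variances.
pca <- prcomp(hel, center = TRUE, scale. = FALSE)
eig <- pca$sdev^2
eig <- eig[eig > 1e-12]          # keep axes with non-zero variance
observed <- eig / sum(eig)       # proportion of variance per axis

# Broken-stick expectation for axis k out of p: (1/p) * sum_{i=k}^{p} 1/i.
broken_stick <- function(p) {
  sapply(seq_len(p), function(k) sum(1 / (k:p)) / p)
}
expected <- broken_stick(length(observed))

# Axes whose observed share exceeds the broken-stick share are the
# candidates for interpretation.
data.frame(axis = seq_along(observed), observed, expected,
           keep = observed > expected)
```

With the full 25 x 38 table the same comparison would be made on the eigenvalues of the actual ordination rather than on this toy extract.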
Then I fitted environmental vectors with the envfit function (of the vegan library), with commands like this:

    physCInd.fit3 <- envfit(acp1, MyDataEnv[, c(13,18,20,21,23)], permut=4999, na.rm=T)

It appeared that the pH variable is significantly linked with the ordination, and the p-value of ExtPond is 0.1.

Next I did an RDA, which is not significant.

To finish, I did two NMDS. For the first one I used the Hellinger normalization and the Bray-Curtis distance. The stress value obtained is 0.22, the non-metric fit R² is 0.952 and the linear fit R² is 0.777. When I fitted the environmental vectors, ExtPond was correlated with the ordination (p-value = 0.02) and the p-value of pH was 0.23.

But then I read in "Numerical Ecology", page 449, that it is better to standardize the data by dividing each value by the maximum abundance for the species and then use the Kulczynski distance. The stress value was 0.23, the non-metric fit R² was 0.948 and the linear fit R² was 0.69. These values are a little worse than those of the first NMDS, but the stressplot seems more homogeneous to me.

Nevertheless, the results I obtained are very different. When I fitted the environmental data, ExtPond was not correlated with this ordination (p-value = 0.82) and the p-value of pH was 0.06. And obviously ExtPond is the most important variable for us ;-)

With all these results I'm quite confused, and I don't know what to think. So if someone can help me, I would appreciate it very much. All comments are welcome.

To summarize, my questions are:

a) Which ordination method would be better for my data: PCA, knowing that the represented inertia is 35.62%, or NMDS with a stress value of about 0.22?
My opinion is that PCA on Hellinger-transformed relative proportions "means" more than an NMDS.
b) If NMDS is more suitable, which one is better: with Hellinger normalization and Bray-Curtis distance, or with the normalization recommended by Legendre and Legendre and Kulczynski distance?
It sounds like the normalization you are referring to is relative proportion, which is s_i/sum(s), where s is the vector of taxon abundances at a site.
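The two standardizations mentioned in this exchange divide by different things: the relative proportion divides each value by its site total, whereas the standardization described in the original question divides each value by the maximum abundance of its species. A minimal base-R sketch of both, using only the five-site extract from the original post (the object names are hypothetical; in vegan the corresponding `decostand()` methods are `"total"` and `"max"`):

```r
# Five-site extract from the original post (rows = sites, columns = taxa).
abund <- matrix(c(12, 18, 184, 0,
                   0, 14, 154, 0,
                  45,  0, 101, 6,
                  20,  0, 148, 0,
                   0,  0, 118, 0),
                nrow = 5, byrow = TRUE,
                dimnames = list(NULL, c("Anguinidae.ditylenchus", "Aphelenchidae",
                                        "Aphelenchoididae", "Aporcelaimidae")))

# Relative proportions: divide each value by its site (row) total.
rel_prop <- abund / rowSums(abund)

# Species-maximum standardization: divide each value by the largest
# abundance recorded for that species (column). This assumes every species
# occurs at least once; an all-zero column would give NaN.
sp_max <- sweep(abund, 2, apply(abund, 2, max), "/")

round(rel_prop, 3)   # each row sums to 1
round(sp_max, 3)     # each column has a maximum of 1
```

The first form puts sites on a common footing, the second puts species on a common footing, so the two can lead to quite different dissimilarities.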
c) Is there another method to apply? I'm going to try co-inertia analysis with the ade4 package.
I am reading about co-inertia analysis now as it may be useful for some of the things that I am planning on doing. This method looks promising. You are going to have to decide on what type of ordination to use with COIA... HTH, Stephen
Thanks in advance. Cheers. Claire Della Vedova
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology -- Stephen Sefick ************************************************** Auburn University Biological Sciences 331 Funchess Hall Auburn, Alabama 36849 ************************************************** sas0025 at auburn.edu http://www.auburn.edu/~sas0025 ************************************************** Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis "A big computer, a complex algorithm and a long time does not equal science." -Robert Gentleman
Hello Folks,

Kruskal's "rule of thumb" really is a rule of thumb. That is, it is intended as a rough guideline. In that sense, it is no different from Clarke's rules. However, I wouldn't judge usability simply by stress: solutions with very low stress can be useless and solutions with fairly high stress can be usable. Stress reflects many things, but a large part of it behaves like a signal/noise ratio. The signal is more difficult to detect with high noise, but if you detect the signal, the amount of noise does not matter. I have quite often seen pretty usable solutions with stress around or above 0.20 (20%), at least when using external explanatory variables.

There are limits, though. If you trace single runs, you may see that random starting configurations typically start with stress 0.4 (40%) or a bit higher. If you cannot improve from that, the solution is probably pretty useless (and in metaMDS you will probably have no convergent solutions). However, instead of discarding the results, you may first try stricter convergence criteria for monoMDS (if you use monoMDS). See its help pages (the next version of vegan will have a stricter limit for the "scale factor of gradient", sfgrmin).

There is also a limit for low stress. In fact, the current vegan warns of too low stress (Kruskal's "perfect" fit). This is usually a symptom of insufficient data (too many dimensions for too few points, dissimilarities found from too few variables).

In my opinion, ecologists are often too obsessed with goodness-of-fit values. This is true in general, but also very manifest with multivariate methods. I do think that if you, say, in PCA or RDA "explain" something like >50%, there is something suspect in your analysis. Typical reasons are insufficient data (too few rows or columns) or not really multivariate data. Sometimes there are some very dominant species (high variance), so that the analysis needs to care about only a couple of species, and that is an easy task. If you transform your data so that high abundances are squashed down and variances equalized, or even made equal, the data become more multivariate (= all species count). Typically this means that a lower proportion of variance is "explained", but often the results are more interpretable.

This also happens when you change models: unscaled PCA/RDA using variances "explains" much of the variance, scaled PCA/RDA using correlations "explains" much less, and CA/CCA studying deviations from expectations "explains" the least. Typically the usability and interpretability of the results improve as "explanatory power" decreases. The same also often holds for NMDS: Euclidean distances often give lower stress and poorer results than dissimilarities that treat all species more equally.

Not really R, but perhaps I'm forgiven (this time),

Cheers, Jari Oksanen
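The point about dominant species can be illustrated with a small simulation in base R. Everything here is an assumption made for illustration (simulated Poisson counts, one taxon with mean 1000 and nine with mean 5, the hypothetical helper `pc1_share()`); it is not the data discussed in this thread.

```r
set.seed(1)
n_sites <- 25

# One dominant taxon (mean 1000) plus nine rarer taxa (mean 5 each).
counts <- cbind(rpois(n_sites, 1000),
                matrix(rpois(n_sites * 9, 5), nrow = n_sites))

# Share of the total variance carried by the first principal component
# of an unscaled (covariance-based) PCA.
pc1_share <- function(x) {
  v <- prcomp(x, center = TRUE, scale. = FALSE)$sdev^2
  v[1] / sum(v)
}

raw_share <- pc1_share(counts)                          # untransformed counts
hel_share <- pc1_share(sqrt(counts / rowSums(counts)))  # Hellinger-transformed

c(raw = raw_share, hellinger = hel_share)
```

On the raw counts the first axis mostly tracks the single high-variance taxon, so it "explains" a large share; after the transformation the variance is spread over the other taxa and the share of the first axis drops, although no structure has been lost.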
From: r-sig-ecology-bounces at r-project.org [r-sig-ecology-bounces at r-project.org] on behalf of Alan Haynes [aghaynes at gmail.com]
Sent: 14 December 2012 09:53
To: sas0025 at auburn.edu
Cc: claire della vedova; r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] pca or nmds (with which normalization and distance ) for abundance data ?
Hi Claire,

I'm not sure if it helps, and it would be interesting to hear other list readers' views on the subject, but McCune and Grace, the authors of PCOrd and "Analysis of Ecological Communities", have a couple of rules of thumb for NMDS stress. They use Kruskal stress*100, while I believe monoMDS (and thus metaMDS) uses simple Kruskal stress (the values in brackets below are thus the values vegan could report).

"Kruskal's rules of thumb"
  2.5 (0.025) = excellent
  5   (0.05)  = good
  10  (0.1)   = fair
  20  (0.2)   = poor

"Clarke's rules of thumb"
  <5    (<0.05)     excellent, cannot be misinterpreted, but incredibly rare in practice
  5-10  (0.05-0.1)  good, no real risk of false inference
  10-20 (0.1-0.2)   can be usable, but upper values could be misleading; plot details should not be used
  >20   (>0.2)      plots likely to be dangerous to interpret
  >~35  (>~0.35)    samples are more or less randomly placed, with little regard for ranking

Correspondingly, McCune and Grace would probably err on the side of caution, as 0.22 is getting into the poor-fit, dangerous-to-interpret area.

It would be interesting to hear other NMDS users' views on this: what stress do you consider too high, when does an ordination become (essentially) useless, etc.

HTH

Cheers, Alan
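The scale conversion and Clarke's bands quoted in this message can be written as a small base-R helper. This is a sketch only: `clarke_band()` is a hypothetical name, the band labels are shortened from the quoted wording, and assigning a value that falls exactly on a boundary to the upper band is an arbitrary choice, since the quoted rules do not settle it.

```r
# Classify NMDS stress using Clarke's rule-of-thumb bands as quoted above.
# `percent = TRUE` means the input is on the stress * 100 scale (PC-ORD
# style); otherwise the 0-1 scale reported by vegan is assumed.
clarke_band <- function(stress, percent = FALSE) {
  if (percent) stress <- stress / 100
  cut(stress,
      breaks = c(0, 0.05, 0.10, 0.20, Inf),
      labels = c("excellent", "good", "usable", "dangerous to interpret"),
      right = FALSE,          # intervals are [lower, upper)
      include.lowest = TRUE)
}

clarke_band(c(0.04, 0.08, 0.15, 0.22))
clarke_band(22, percent = TRUE)   # same band as 0.22 on the 0-1 scale
```

As the rest of the thread stresses, such bands are rough guidelines and not a test of whether an ordination is usable.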
Claire, here are some small comments.
On 13/12/2012, at 17:24 PM, claire della vedova wrote:
Dear all, a) Which ordination method would be better for my data : PCA knowing that the represented inertia is 35.62% or NMDS with a stress value about 0.22?
These numbers cannot be used to say which of these methods is better. You need other criteria. Some people may have strong opinions on the choice here, but these opinions cannot be based on these numbers -- they are based on something else (I do have such an opinion, but I abstain from expressing my opinion).
b) If NMDS is more suitable, which one is better: with Hellinger normalization and Bray-Curtis distance, or with the normalization recommended by Legendre and Legendre and Kulczynski distance?
Hellinger transformation was suggested for Euclidean metric, and normally it is used in PCA/RDA (which are based on Euclidean metric although they do not explicitly calculate Euclidean distances). I haven't heard of any advantages of Hellinger transformation with Bray-Curtis dissimilarity. I suggest you don't use it with Bray-Curtis. I don't know if Kulczyński dissimilarity is any better than, say, Sørensen dissimilarity (and both seem to be difficult to spell), but certainly it belongs to the same group of usually well behaving dissimilarities as variants of Bray-Curtis or Jaccard.
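For readers who want to see how closely related these dissimilarities are, here is a minimal base-R sketch computed on the first two sites of the extract from the original post. The function names are hypothetical, and the Kulczynski formula shown is the quantitative form as I understand it to be defined in vegan's `vegdist()`; treat that as an assumption and check the package documentation.

```r
# First two sites from the abundance extract in the original post.
site1 <- c(12, 18, 184, 0)
site2 <- c(0, 14, 154, 0)

# Bray-Curtis: summed absolute differences over summed abundances.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Kulczynski: one minus the mean of the shared abundance expressed as a
# fraction of each site's total.
kulczynski <- function(x, y) {
  shared <- sum(pmin(x, y))
  1 - 0.5 * (shared / sum(x) + shared / sum(y))
}

# Sorensen: Bray-Curtis applied to presence/absence data.
sorensen <- function(x, y) bray_curtis(as.numeric(x > 0), as.numeric(y > 0))

c(bray       = bray_curtis(site1, site2),
  kulczynski = kulczynski(site1, site2),
  sorensen   = sorensen(site1, site2))
```

All three are built from the abundance shared between two sites relative to site totals, which is why they tend to behave similarly.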
c) Is there another method to apply? I'm going to try co-inertia analysis with the ade4 package.
Certainly there are a great many methods you can apply, but why? What are you trying to analyse? What are your questions? Cheers, Jari Oksanen
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland jari.oksanen at oulu.fi, Ph. +358 400 408593, http://cc.oulu.fi/~jarioksa
On Thu, 2012-12-13 at 14:03 -0600, Stephen Sefick wrote:
<snip />
My aim was to study how the distribution of species is linked with environmental data. Firstly, I did a PCA (with vegan library), using a Hellinger transformation, with commands like this : acp1<-rda(decostand(myDataSpec[,c(25:62)], "hellinger"))
Is the Hellinger transform done on relative proportions?
The transformation includes division by the row sum and hence conversion to proportions. As such it can be applied to count data or relative abundance data; with the latter the division by row sum has no effect and the transformation collapses to a simple square root transformation of the proportional abundance data. This is one of the reasons for the apparent contradictions over the utility of the chord distance in ecological and palaeoecological disciplines. In the latter we commonly use proportional data whilst count abundances are common in the former. Directly applying the chord distance to count abundances carries with it the baggage of the Euclidean distance (squared differences emphasise the big things). But chord distance applied to proportional data *is* the Hellinger distance, and hence palaeoecologists have found the chord distance a useful dissimilarity coefficient in their field. <snip />
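These properties can be checked numerically in base R. The sketch below uses the five-site extract from the original post; `hellinger()` and `chord()` are hypothetical helper names, and the third check uses the chord transformation in the row-normalisation sense (divide each row by its Euclidean length) applied to square-rooted data, which is one common way of stating the link to the Hellinger distance and is an assumption about the definition intended here.

```r
# Five-site extract from the original post (rows = sites, columns = taxa).
counts <- matrix(c(12, 18, 184, 0,
                    0, 14, 154, 0,
                   45,  0, 101, 6,
                   20,  0, 148, 0,
                    0,  0, 118, 0),
                 nrow = 5, byrow = TRUE)

hellinger <- function(x) sqrt(x / rowSums(x))   # sqrt of row proportions
chord     <- function(x) x / sqrt(rowSums(x^2)) # rows scaled to unit length

props <- counts / rowSums(counts)

# 1. Counts and proportions give the same Hellinger-transformed values.
all.equal(hellinger(counts), hellinger(props))

# 2. On proportions the transformation is just a square root.
all.equal(hellinger(props), sqrt(props))

# 3. Chord-transforming square-rooted counts gives the same values again.
all.equal(chord(sqrt(counts)), hellinger(counts))

# Euclidean distances between the transformed rows are Hellinger distances.
dist(hellinger(counts))
```

So the answer to the question in this exchange is that `decostand(x, "hellinger")` can be given raw counts; the conversion to proportions is part of the transformation.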
a) Which ordination method would be better for my data : PCA knowing that the represented inertia is 35.62% or NMDS with a stress value about 0.22?
My opinion is PCA on hellinger transformed relative proportions "means" more than an NMDS
?? NMDS with Hellinger distances could optimise a k-D PCA with Hellinger transform. Given that NMDS essentially subsumes PCA I'm not sure what you are getting at. G
On Fri 14 Dec 2012 05:08:32 AM CST, Gavin Simpson wrote:
a) Which ordination method would be better for my data : PCA knowing that the represented inertia is 35.62% or NMDS with a stress value about 0.22?
My opinion is PCA on hellinger transformed relative proportions "means" more than an NMDS
?? NMDS with Hellinger distances could optimise a k-D PCA with Hellinger transform.
Gavin, maybe I have spoken beyond my knowledge. My thought was that a PCA has a unique solution and is therefore "better" (as long as an appropriate distance is used that deals with the double zero problem effectively). I am sure that this is too simple for the reality of the situation. I don't know what a k-D PCA is. Would you mind explaining or directing me to some reading material?
Given that NMDS essentially subsumes PCA I'm not sure what you are getting at.
I don't understand. Would you mind explaining this? many thanks, Stephen
On Fri, 2012-12-14 at 06:22 -0600, Stephen Sefick wrote:
Gavin, maybe I have spoken beyond my knowledge. My thought was that a PCA has a unique solution and is therefore "better" (as long as an appropriate distance is used that deals with the double zero problem effectively). I am sure that this is too simple for the reality of the situation. I don't know what a k-D PCA is. Would you mind explaining or directing me to some reading material?
By k-D PCA I meant that in nMDS you need to state the dimensionality; in metaMDS() we start the process from a Principal Coordinates Analysis of the data (PCoA == PCA when Euclidean distances are used). I meant that nMDS for, say, 2-D solutions can optimise the configuration arising from the first two PCA axes. I don't see the unique solution of PCA as an implicit advantage of that method. It has a unique solution because the possible solutions are constrained by the approach: linear combinations of the variables which best approximate the Euclidean distances between samples. NMDS generalises this idea extensively into a problem of best preserving the mapping of the dissimilarities. As such it can do a better job of drawing the map, but that comes at a price. Again though, horses for courses.
Given that NMDS essentially subsumes PCA I'm not sure what you are getting at.
I don't understand. Would you mind explaining this? many thanks,
I meant it in the sense that PCA is a special case of Principal Coordinates and that nMDS generalises Principal Coordinates. I don't get the point of saying one method is "better" than any other. Each has its uses. I certainly don't think any one method "means" more than the other. G
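The statement that PCoA equals PCA when Euclidean distances are used can be checked with base R alone. In this sketch (an illustration on the five-site extract from the original post, with hypothetical object names), `prcomp()` plays the role of PCA and `cmdscale()` the role of Principal Coordinates; the two sets of site scores should agree up to the arbitrary sign of each axis.

```r
# Five-site extract from the original post (rows = sites, columns = taxa).
abund <- matrix(c(12, 18, 184, 0,
                   0, 14, 154, 0,
                  45,  0, 101, 6,
                  20,  0, 148, 0,
                   0,  0, 118, 0),
                nrow = 5, byrow = TRUE)

# Hellinger transformation, as used for the PCA in the original post.
hel <- sqrt(abund / rowSums(abund))

# PCA site scores on the first two axes.
pca_scores <- prcomp(hel, center = TRUE, scale. = FALSE)$x[, 1:2]

# Principal Coordinates (classical MDS) on Euclidean distances between
# the same transformed rows.
pcoa_scores <- cmdscale(dist(hel), k = 2)

# Axis signs are arbitrary, so compare absolute values.
all.equal(abs(unname(pca_scores)), abs(unname(pcoa_scores)))
```

A non-metric MDS started from such a configuration then only has to improve the rank-order fit of the distances, which is the sense in which it can "optimise" the PCA solution.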
On Fri 14 Dec 2012 06:51:56 AM CST, Gavin Simpson wrote:
I meant it in the sense that PCA is a special case of Principal Coordinates and that nMDS generalises Principal Coordinates. I don't get the point of saying one method is "better" than any other. Each has its uses. I certainly don't think any one method "means" more than the other.
Point taken. As always, it depends on the question that you are trying to answer. Thank you for the discussion and clarification.
Stephen Sefick