a possible hack for adding species scores?

4 messages · Dave Roberts, Eliot Miller

#
Hi all,

A few months ago I ran into a problem that others seem to have had. In
short, if you use the FD package to calculate a Gower distance matrix which
you then either feed back into vegan or use directly in FD, you lose the
"species scores" associated with the points. You can't see what explains the
distances among points. Examples of similar problems are here:
http://r.789695.n4.nabble.com/Ordination-Plotting-Warning-Species-scores-not-available-td4664025.html
https://www.mail-archive.com/r-help@r-project.org/msg153765.html

I am less concerned with the plotting of the results than the previous
posters have been. I am more concerned with simply seeing the table of
"loadings" (don't think that term is correct for the PCoA). Perhaps this is
an acceptable and widespread solution to the problem, and I simply haven't
stumbled on it yet. It takes advantage of the vegan function envfit(). Or
perhaps it is an incorrect solution. I'm wondering which of those two
things it is, and hoping for a bit of input. In my case, I am making a
distance matrix among foraging observations spread across a number of
species. I want to know each species' niche size using the FDis metric. I
made an example dataset and a fully worked example. The main
point here is that I want to know whether the object "fitted" at the end
correctly explains how things are "loaded". Any help is greatly
appreciated.

Thanks in advance!
Eliot

library(FD)

#these first few lines just build up an example dataset of foraging observations
set.seed(0)
maneuver <- sample(c(rep("sally",10),rep("probe",10),rep("glean",10),rep("hover",10)))

set.seed(0)
substrate <- sample(c(rep("leaf",10),rep("flower",10),rep("air",10),rep("branch",10)))

set.seed(0)
attack.height <- rnorm(40, mean=5)

set.seed(0)
relative.height <- rnorm(40, 0.5)
relative.height[relative.height < 0] <- 0
relative.height[relative.height > 1.1] <- 1.1

set.seed(0)
leafiness <- sample(c(0:5), size=40, replace=TRUE)
#convert this to an ordinal variable
leafiness <- ordered(leafiness)

set.seed(0)
distance <- sample(c(1:4), size=40, replace=TRUE)
#convert to ordinal
distance <- ordered(distance)

#here is the complete dataset of example foraging observations
to.ordinate <- data.frame(maneuver, substrate, attack.height,
relative.height, leafiness, distance)

#now make a second table that identifies to which species each observation belongs
road.map <- matrix(ncol=40, nrow=3, 0)

colnames(road.map) <- 1:40
row.names(road.map) <- c("species1","species2","species3")

#this for loop assigns each foraging observation to one of the species
set.seed(0)
for(i in seq_len(ncol(road.map)))
{
    species <- sample(row.names(road.map), 1)
    road.map[species,i] <- 1
}

#run the FD calculations
FDresults <- dbFD(x=to.ordinate, a=road.map, corr="lingoes",
calc.FRic=FALSE, calc.FGR=FALSE, calc.CWM=FALSE, calc.FDiv=FALSE,
print.pco=TRUE)

#use the vegan function envfit() to see what is driving each axis from the PCoA
fitted <- envfit(ord=FDresults$x.axes, env=to.ordinate, permutations=0,
choices=1:10)
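
For anyone following along, here is a minimal sketch of how the fitted table can be pulled out of an envfit object. It does not use the dbFD output above. As stand-ins it assumes a small made-up data frame (traits), a Gower distance from cluster::daisy() and a PCoA from cmdscale(), so every object name below is hypothetical.

```r
library(vegan)    # envfit(), scores()
library(cluster)  # daisy() for a Gower distance

set.seed(1)
# made-up stand-in data: two continuous traits and one factor
traits <- data.frame(
    attack.height = rnorm(40, mean = 5),
    relative.height = runif(40),
    substrate = factor(sample(rep(c("leaf", "flower", "air", "branch"), each = 10)))
)

# Gower distance followed by a PCoA, keeping three axes
gower <- daisy(traits, metric = "gower")
pcoa <- cmdscale(gower, k = 3)

# fit all variables onto the three axes, no permutation test
fit <- envfit(pcoa, traits, permutations = 0, choices = 1:3)

# continuous variables: arrow coordinates on each axis, and the squared
# correlation of each variable with the ordination
vec.scores <- scores(fit, display = "vectors")
vec.r2 <- fit$vectors$r

# factors: centroid of each factor level on each axis
fac.centroids <- scores(fit, display = "factors")

print(vec.scores)
print(vec.r2)
print(fac.centroids)
```

The vectors part is the closest thing to a table of "loadings" for the continuous variables, and the factor centroids play the same role for the categorical ones.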
1 day later
#
Hi Eliot,

    If you copy (or just rename) your FDresults$x.axes to 
FDresults$sites you can use all the vegan ordi__ functions on your results.

E.g.

FDresults$sites <- FDresults$x.axes
demo <- ordiplot(FDresults)
ordisurf(demo,to.ordinate$attack.height)

which is a little more informative than envfit unless you're convinced 
the response should be linear.
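
A small sketch of why the linearity assumption matters, using made-up points and a deliberately hump-shaped response rather than the FD results (pts and y are hypothetical). envfit fits a linear trend across the ordination, while ordisurf fits a smooth surface with a GAM.

```r
library(vegan)

set.seed(1)
# made-up two-dimensional "ordination" of 60 points
pts <- cbind(Dim1 = runif(60, -1, 1), Dim2 = runif(60, -1, 1))

# response peaks in the middle of the ordination and falls off in every direction
y <- 1 - pts[, "Dim1"]^2 - pts[, "Dim2"]^2 + rnorm(60, sd = 0.05)

# linear fit (envfit) versus smooth surface (ordisurf), no plotting
lin <- envfit(pts, data.frame(y = y), permutations = 0)
surf <- ordisurf(pts, y, plot = FALSE)

lin.r2 <- unname(lin$vectors$r)   # squared correlation of the linear fit
surf.r2 <- summary(surf)$r.sq     # adjusted r-squared of the GAM surface

print(c(envfit = lin.r2, ordisurf = surf.r2))
```

With a response like this one the vector fit explains very little, even though the variable is strongly structured in the ordination.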

In addition, if you want, you could add the species in

FDresults$species <- t(road.map)
demo <- ordiplot(FDresults)

to see species centroids.
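
If the centroid coordinates themselves are wanted, one possible approach (a sketch on made-up data, not the FD objects above) is to take the average of the observation scores within each species using vegan's wascores(). Here obs.scores and member are hypothetical stand-ins for the PCoA axes and for t(road.map).

```r
library(vegan)

set.seed(1)
# made-up ordination scores for 40 observations on two axes
obs.scores <- matrix(rnorm(80), ncol = 2,
                     dimnames = list(1:40, c("A1", "A2")))

# made-up membership matrix: observations in rows, species in columns
sp <- sample(rep(1:3, length.out = 40))
member <- matrix(0, nrow = 40, ncol = 3,
                 dimnames = list(1:40, c("species1", "species2", "species3")))
member[cbind(1:40, sp)] <- 1

# weighted average of the observation scores for each species
centroids <- wascores(obs.scores, member)

# with 0/1 weights this is just the per-species mean of the scores
check <- rowsum(obs.scores, sp) / as.vector(table(sp))

print(centroids)
```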

Dave
On 08/27/2014 09:24 AM, Eliot Miller wrote:

  
    
#
Hi Dave,

Thanks a million for the response! Your solution is good in some
respects--when applied to my real data it shows me why I should be cautious
of assuming any linearity in the response of my continuous variables.
However, it doesn't work for the factors (which makes sense). You can run,

ordisurf(demo, to.ordinate$leafiness)

to see what I mean. Would it be reasonable to use some combination of
ordisurf for the continuous variables, with the locations of the factors
determined via envfit?

Thanks again for the push in the right direction!

Cheers,
Eliot
On Wed, Aug 27, 2014 at 10:24 AM, Eliot Miller <eliotmiller at umsl.edu> wrote:

            

  
  
#
Hi Eliot,

    With respect to leafiness you're in a common problem area.  Ordinal 
variables are hard to handle from any approach.  One alternative (only 
undertaken with a well articulated rationale) is to promote them to 
interval variables,

ordisurf(demo, as.numeric(to.ordinate$leafiness))

If you think the scale is wrong you can try another, e.g.

0,1,4,9,16,25

Otherwise you can do logistic regression on the classes

ordisurf(demo, to.ordinate$leafiness==1, family=binomial)

but you have one fewer degree of freedom than expected which can be a 
problem on a small data set.  Alternatively, you can use envfit as you 
have been doing, but that also assumes they are categorical as opposed 
to ordered.
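
One detail worth checking when promoting an ordered factor is what as.numeric() actually returns. A short base R sketch on a made-up factor (leaf is hypothetical, with the same 0 to 5 labels as leafiness):

```r
# made-up ordered factor with labels 0 to 5
leaf <- ordered(c(0, 3, 5, 1, 2, 4))

# as.numeric() on a factor returns the level codes, which start at 1
codes <- as.numeric(leaf)                   # 1 4 6 2 3 5

# going through as.character() recovers the original labels
labels <- as.numeric(as.character(leaf))    # 0 3 5 1 2 4

# a squared scale built from the labels
squared <- labels^2                         # 0 9 25 1 4 16

# indicator for one class, as used in the binomial fit
is.one <- leaf == 1

print(data.frame(leaf, codes, labels, squared, is.one))
```

For an interval scale the shift by one makes no difference to the shape of a fitted surface, but it does matter for any nonlinear rescaling such as squaring.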

    Maybe someone else will weigh in with a better alternative for 
ordinal variables.

Dave
On 08/29/2014 11:05 AM, Eliot Miller wrote: