Problem in Search and Association Loop

2 messages · Alexandre F. Souza, Hanna Tuomisto

#
Dear friends,

I am writing a loop that compares the scientific plant names in one data frame
with those in a second data frame containing a smaller set of potentially
different plant names, then copies the associated abundances of the species
already present in the first data frame and adds the new species at the bottom.

However, I am receiving an error message saying that the factor levels are
different. Can someone give me a hint about what might be happening?

Here is the code.

The base data frame (base):
         family       genus        epithet abundance
1   Burseraceae   Coccoloba cf. cordifolia       224
2 Anacardiaceae    Tapirira     guianensis       112
3 Euphorbiaceae Pogonophora schomburgkiana       146
4 Anacardiaceae  Thyrsodium     spruceanum       115
5   Apocynaceae Himatanthus  phagedaenicus        47
6    Sapotaceae    Pouteria       coriacea        71
7 Malpighiaceae   Byrsonima        sericea        25
8  Polygonaceae   Coccoloba cf. cordifolia        29
9   Burseraceae     Protium   heptaphyllum        37

The new data frame (new.data):
         genus         sp.new abundance
1    Coccoloba cf. cordifolia        29
2      Protium   heptaphyllum        37
3    Bowdichia   virgilioides        15
4 Sclerolobium    densiflorum        15
5       Ocotea      glomerata        16

The Output table


output= matrix(nrow = 10, ncol = 3)
colnames(output) = c("genus", "epithet", "abundance")
output


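for (i in 1:nrow(base)){
  for (j in 1:nrow(new.data)){
    # match on both genus and epithet; && is the scalar "and" expected by if()
    if ((base[i,2] == new.data[j,1]) && (base[i,3] == new.data[j,2])){
      output[i,1] = base[i,2]
      output[i,2] = base[i,3]
      # column 3 of new.data holds the abundance (column 2 is the epithet)
      output[i,3] = new.data[j,3]
    }
  }
}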

This is just the beginning of the code I plan. I stopped due to the error
message.

Thank you very much in advance,

Alexandre
#
Hi Alexandre,

R has converted your strings into factors. This may have happened already when you read the data from a file, and in that context it can be avoided:
read.csv(... , stringsAsFactors=FALSE)
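
To see why the comparison fails, here is a minimal sketch using cut-down copies of the two data frames (three rows each, taken from the question). They are built with stringsAsFactors = TRUE to mimic what read.csv() did by default in R versions before 4.0.0. In R, == on two factors stops with an error when their level sets differ. If the data are already loaded, converting the factor columns to character is one way around it. The helper name to.character is made up for this illustration.

```r
base <- data.frame(
  genus     = c("Coccoloba", "Tapirira", "Protium"),
  epithet   = c("cf. cordifolia", "guianensis", "heptaphyllum"),
  abundance = c(224, 112, 37),
  stringsAsFactors = TRUE
)
new.data <- data.frame(
  genus     = c("Coccoloba", "Protium", "Bowdichia"),
  sp.new    = c("cf. cordifolia", "heptaphyllum", "virgilioides"),
  abundance = c(29, 37, 15),
  stringsAsFactors = TRUE
)

# The two genus factors have different level sets, so == raises an error;
# tryCatch() captures the message instead of stopping the script.
err <- tryCatch(base[1, "genus"] == new.data[1, "genus"],
                error = function(e) conditionMessage(e))

# Hypothetical helper: turn every factor column into a character column.
to.character <- function(df) {
  is.fac <- vapply(df, is.factor, logical(1))
  df[is.fac] <- lapply(df[is.fac], as.character)
  df
}

base     <- to.character(base)
new.data <- to.character(new.data)

# With character columns the comparison is a plain string comparison.
same <- base[1, "genus"] == new.data[1, "genus"]
```

After the conversion, the comparisons in the original loop should run without the factor-level error.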

You do not need a loop to combine the two data frames. First make sure you have the same column names in both:
colnames(new.data)[2] <- "epithet"

Then you can use rbind() and subsetting to combine the desired parts of the data frames:

output <- rbind(base[,-1], subset(new.data, !new.data[,"genus"] %in% base[,"genus"] | !new.data[,"epithet"] %in% base[,"epithet"]))
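
One caveat worth checking on your own data: when genus and epithet are tested with separate %in% calls, a new species is skipped if its genus and its epithet both occur somewhere in the base table, even in different rows. A sketch that matches on the combined name instead is given here. The data frames are cut-down copies of those in the question, the object names key.base, key.new and is.new are only illustrative, and the last row of new.data is invented purely to trigger the caveat.

```r
base <- data.frame(
  genus     = c("Coccoloba", "Tapirira", "Protium"),
  epithet   = c("cf. cordifolia", "guianensis", "heptaphyllum"),
  abundance = c(224, 112, 37),
  stringsAsFactors = FALSE
)
new.data <- data.frame(
  genus     = c("Coccoloba", "Protium", "Bowdichia", "Protium"),
  # Last row is invented (not in the original data): its genus and epithet
  # each occur in base, but never together in the same row.
  epithet   = c("cf. cordifolia", "heptaphyllum", "virgilioides", "guianensis"),
  abundance = c(29, 37, 15, 3),
  stringsAsFactors = FALSE
)

# Build one key per row from genus and epithet, then compare the keys.
key.base <- paste(base$genus, base$epithet)
key.new  <- paste(new.data$genus, new.data$epithet)
is.new   <- !(key.new %in% key.base)

# Keep all base rows and append only the species not yet present.
output <- rbind(base, new.data[is.new, ])
rownames(output) <- NULL
```

With these toy data both the Bowdichia row and the invented row are appended, whereas testing the two columns separately would append only the Bowdichia row.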

Good luck,
Hanna
On 5 Sep 2015, at 3:34 AM, Alexandre F. Souza wrote:

            
---
Hanna Tuomisto
Department of Biology
FI-20014 University of Turku, FINLAND

e-mail: hanna.tuomisto at utu.fi
phone: +358-2-3335634
http://www.utu.fi/en/units/sci/units/biology/Pages/home.aspx
http://www.utu.fi/amazon