splitting scientific names into genus, species, and sub
On 04-Nov-09 21:09:42, Mark W. Miller wrote:
I have a list of scientific names in a data set. I would like
to split the names into genus, species and subspecies.
Not all names include a subspecies. Could someone show me how
to do this?
My example code is:
a <- matrix(c('genusA speciesA', 10,
              'genusB speciesAA', 20,
              'genusC speciesAAA subspeciesA', 15,
              'genusC speciesAAA subspeciesB', 25), nrow=4, byrow=TRUE)
aa <- data.frame(a)
colnames(aa) <- c('species', 'counts')
aa
# The code returns
                        species counts
1               genusA speciesA     10
2              genusB speciesAA     20
3 genusC speciesAAA subspeciesA     15
4 genusC speciesAAA subspeciesB     25
# I would like there to be 4 columns as below
genus   species     subspecies     counts
genusA  speciesA    no.subspecies  10
genusB  speciesAA   no.subspecies  20
genusC  speciesAAA  subspeciesA    15
genusC  speciesAAA  subspeciesB    25
I have tried using 'strsplit', but cannot get the desired result.
Thank you for any help with this.
Mark Miller
Gainesville, Florida
The following seems to work for your example. However, others
can probably propose a less clumsy version (but at least this
one breaks it down into its elements):
a <- matrix(c('genusA speciesA', 10,
              'genusB speciesAA', 20,
              'genusC speciesAAA subspeciesA', 15,
              'genusC speciesAAA subspeciesB', 25), nrow=4, byrow=TRUE)
a
#      [,1]                            [,2]
# [1,] "genusA speciesA"               "10"
# [2,] "genusB speciesAA"              "20"
# [3,] "genusC speciesAAA subspeciesA" "15"
# [4,] "genusC speciesAAA subspeciesB" "25"
A <- NULL
for (i in 1:nrow(a)) {
  Names <- unlist(strsplit(a[i,1], "[ ]+"))
  if (length(Names) == 2) Names <- c(Names, "no.subspecies")
  A <- rbind(A, c(Names, a[i,2]))
}
colnames(A) <- c("Genus","Species","Subspecies","Count")
A <- as.data.frame(A)
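# Go through as.character() first: if as.data.frame() has turned Count
# into a factor, as.numeric() alone returns the factor level codes
# (1, 3, 2, 4 here) rather than the counts themselves.
A$Count <- as.numeric(as.character(A$Count))
A
#    Genus    Species    Subspecies Count
# 1 genusA   speciesA no.subspecies    10
# 2 genusB  speciesAA no.subspecies    20
# 3 genusC speciesAAA   subspeciesA    15
# 4 genusC speciesAAA   subspeciesB    25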
Hoping this helps!
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Nov-09 Time: 21:37:03
------------------------------ XFMail ------------------------------
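
As a possible follow-up to the "less clumsy version" mentioned above, here
is a minimal sketch that avoids growing the result with rbind() inside a
loop. It is not from the original posts. The helper name split_name is
hypothetical, and the sketch assumes every name has at least two words
(genus and species) and that any third word is the subspecies; extra
words beyond the third are ignored.

```r
a <- matrix(c('genusA speciesA', 10,
              'genusB speciesAA', 20,
              'genusC speciesAAA subspeciesA', 15,
              'genusC speciesAAA subspeciesB', 25), nrow=4, byrow=TRUE)

# Split all names at once on runs of spaces; gives a list with one
# character vector per row of 'a'.
parts <- strsplit(a[, 1], "[ ]+")

# Hypothetical helper: always return exactly three elements, filling in
# "no.subspecies" when a name has no third word.
split_name <- function(p) {
  c(p[1], p[2], if (length(p) >= 3) p[3] else "no.subspecies")
}

# vapply() returns a 3 x n character matrix; transpose it to n x 3.
m <- t(vapply(parts, split_name, character(3)))

# Build the data frame directly from character and numeric columns, so
# no factor-to-number conversion is needed for the counts.
A <- data.frame(Genus = m[, 1], Species = m[, 2], Subspecies = m[, 3],
                Count = as.numeric(a[, 2]), stringsAsFactors = FALSE)
A
```

Passing stringsAsFactors = FALSE keeps the name columns as plain character
strings; drop it if factors are preferred for later grouping.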