Skip to content

Multiple count if style "queries"

8 messages · jim holtman, David Winsemius, Phil Spector +1 more

#
Dear List,

I am looking to calculate two things from my data frame and was after some advice. For the example below i want to know.

1. How many unique Orders/Families and Genera there are per eco-name

2. How many incidences are there for each Order/Family and Genus there are per eco-region

I have 650 econame.

I.e for Biak-Numfoor rain forests there are 2 orders, 2 families and two genera. Also, Alismatales are represented once, Asparagales once etc etc.

Thanks for any advice.

Chris


ECO_NAME							Order		Family					Genus
Biak-Numfoor rain forests				Alismatales		Araceae				Homalomena
Biak-Numfoor rain forests				Asparagales		Orchidaceae			Bromheadia
Central Range montane rain forests	Alismatales		Araceae				Homalomena
Central Range montane rain forests	Asparagales		Lomandraceae		Cordyline
Central Range montane rain forests	Asparagales		Orchidaceae			Thelymitra
Central Range montane rain forests	Asparagales		Orchidaceae			Dendrobium
Central Range montane rain forests	Asparagales		Orchidaceae			Vanda
Central Range montane rain forests	Asparagales		Orchidaceae			Bulbophyllum
Central Range montane rain forests	Asparagales		Orchidaceae			Bulbophyllum
Central Range montane rain forests	Asparagales		Orchidaceae			Dendrobium
Central Range montane rain forests	Asparagales		Orchidaceae			Dendrobium
Central Range montane rain forests	Commelinales	Commelinaceae		Murdannia
Central Range montane rain forests	Poales			Centrolepidaceae	Centrolepis
Central Range montane rain forests	Poales			Cyperaceae			Machaerina
Central Range montane rain forests	Poales			Cyperaceae			Eleocharis
Central Range montane rain forests	Poales			Eriocaulaceae		Eriocaulon
Central Range montane rain forests	Poales			Poaceae			Schizostachyum
Central Range montane rain forests	Poales			Poaceae			Poa
Central Range montane rain forests	Zingiberales		Zingiberaceae		Alpinia
Central Range montane rain forests	Zingiberales		Zingiberaceae		Curcuma
Central Range montane rain forests	Zingiberales		Zingiberaceae		Amomum
Huon Peninsula montane rain forests	Asparagales		Orchidaceae			Taeniophyllum
Huon Peninsula montane rain forests	Asparagales		Orchidaceae			Corybas
Huon Peninsula montane rain forests	Asparagales		Orchidaceae			Thelymitra
Huon Peninsula montane rain forests	Asparagales		Orchidaceae			Glomera
Huon Peninsula montane rain forests	Poales			Centrolepidaceae	Centrolepis
Huon Peninsula montane rain forests	Poales			Poaceae			Poa
#
use the 'sqldf' package.  Also use 'dput' to include sample data since
it was impossible to use the data in the format you provided, so my
guess at a solution would be:

answer <- sqldf("
     select ECO_NAME
               , count(distinct Order) as Order
               , count(distinct Family) as Family
               , count(distinct Genus) as Genus
        from yourData
        group by ECO_NAME
")

Not sure exactly what you wanted in the second question; it would be
nice to include what some sample output would be so we have an idea of
what you are after.
On Fri, May 20, 2011 at 5:18 PM, Chris Mcowen <chrismcowen at gmail.com> wrote:

  
    
#
Sorry for not including the data, i did intend to. 

I tried your code but got this error:
For the second question i would want:

The number of times each Order/Family and Genus was represented in each eco_region.

I.E

Biak-Numfoor rain forests	
Order: Alismatales-1, Araceae-1
Family: Homalomena -1, Asparagales-1	
Genus: Orchidaceae-1, Bromheadia-1

So the number after each name is the count.

Thanks for the help.

Chris


data2 <- structure(list(ECO_NAME = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 3L, 3L), .Label = c("Biak-Numfoor rain forests", 
"Central Range montane rain forests", "Huon Peninsula montane rain forests"
), class = "factor"), Order = structure(c(1L, 2L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 
2L, 2L, 2L, 2L, 4L, 4L), .Label = c("Alismatales", "Asparagales", 
"Commelinales", "Poales", "Zingiberales"), class = "factor"), 
    Family = structure(c(1L, 7L, 1L, 6L, 7L, 7L, 7L, 7L, 7L, 
    7L, 7L, 3L, 2L, 4L, 4L, 5L, 8L, 8L, 9L, 9L, 9L, 7L, 7L, 7L, 
    7L, 2L, 8L), .Label = c("Araceae", "Centrolepidaceae", "Commelinaceae", 
    "Cyperaceae", "Eriocaulaceae", "Lomandraceae", "Orchidaceae", 
    "Poaceae", "Zingiberaceae"), class = "factor"), Genus = structure(c(13L, 
    3L, 13L, 6L, 19L, 9L, 20L, 4L, 4L, 9L, 9L, 15L, 5L, 14L, 
    10L, 11L, 17L, 16L, 1L, 8L, 2L, 18L, 7L, 19L, 12L, 5L, 16L
    ), .Label = c("Alpinia", "Amomum", "Bromheadia", "Bulbophyllum", 
    "Centrolepis", "Cordyline", "Corybas", "Curcuma", "Dendrobium", 
    "Eleocharis", "Eriocaulon", "Glomera", "Homalomena", "Machaerina", 
    "Murdannia", "Poa", "Schizostachyum", "Taeniophyllum", "Thelymitra", 
    "Vanda"), class = "factor")), .Names = c("ECO_NAME", "Order", 
"Family", "Genus"), class = "data.frame", row.names = c(NA, -27L
))
On 20 May 2011, at 22:33, jim holtman wrote:
use the 'sqldf' package.  Also use 'dput' to include sample data since
it was impossible to use the data in the format you provided, so my
guess at a solution would be:

answer <- sqldf("
    select ECO_NAME
              , count(distinct Order) as Order
              , count(distinct Family) as Family
              , count(distinct Genus) as Genus
       from yourData
       group by ECO_NAME
")

Not sure exactly what you wanted in the second question; it would be
nice to include what some sample output would be so we have an idea of
what you are after.
On Fri, May 20, 2011 at 5:18 PM, Chris Mcowen <chrismcowen at gmail.com> wrote:

  
    
#
On May 20, 2011, at 5:50 PM, Chris Mcowen wrote:

            
You included the data, just not in a format that anyone could  
reasonably be expected to edit so it can be used. Please read the  
Posting Guide.
#
Is the dput output included not usable either? I have tested it on my machine and it works fine.

Thanks
On 20 May 2011, at 22:57, David Winsemius wrote:

        
On May 20, 2011, at 5:50 PM, Chris Mcowen wrote:

            
You included the data, just not in a format that anyone could reasonably be expected to edit so it can be used. Please read the Posting Guide.
#
Chris -
    It's pretty easy to calculate these counts, the tricky part
is what kind of object to put them in.  Here's a way that returns
a list of lists:
list(Order=table(as.character(x$Order)),
                 Family=table(as.character(x$Family)),
                 Genus=table(as.character(x$Genus))))

So to reproduce your (poorly chosen) example:
$Order

Alismatales Asparagales
           1           1

$Family

     Araceae Orchidaceae
           1           1

$Genus

Bromheadia Homalomena
          1          1

or to see each piece individually:
Alismatales Asparagales
           1           1
Araceae Orchidaceae
           1           1
Bromheadia Homalomena
          1          1

Hope this helps.

 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu
On Fri, 20 May 2011, Chris Mcowen wrote:

            
#
Thanks Phil,

I will investigate how to put it into a exportable object.

Chris
On 20 May 2011, at 23:08, Phil Spector wrote:
Chris -
  It's pretty easy to calculate these counts, the tricky part
is what kind of object to put them in.  Here's a way that returns
a list of lists:
list(Order=table(as.character(x$Order)),
               Family=table(as.character(x$Family)),
               Genus=table(as.character(x$Genus))))

So to reproduce your (poorly chosen) example:
$Order

Alismatales Asparagales
         1           1

$Family

   Araceae Orchidaceae
         1           1

$Genus

Bromheadia Homalomena
        1          1

or to see each piece individually:
Alismatales Asparagales
         1           1
Araceae Orchidaceae
         1           1
Bromheadia Homalomena
        1          1

Hope this helps.

					- Phil Spector
					 Statistical Computing Facility
					 Department of Statistics
					 UC Berkeley
					 spector at stat.berkeley.edu
On Fri, 20 May 2011, Chris Mcowen wrote:

            
#
On May 20, 2011, at 6:01 PM, Chris Mcowen wrote:

            
It's fine. I assumed you were referring to the "data" you did include  
in the first posting with all the spaces and tabs and didn't scroll  
down.

  dat5 <-as.data.frame( with(data2,
                   table(ECO_NAME,Order,Family,Genus) ) )
  dat5[dat5$Freq >0, ]