Message-ID: <20080919082219.GA7719@localhost>
Date: 2008-09-19T08:22:19Z
From: Philipp Pagel
Subject: frequency table across multiple variables
In-Reply-To: <19567838.post@talk.nabble.com>

> I have a dataframe like this:
> 
> x1<-c(1,2,3,4,NA ,NA ,NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 4)
> x2<-c(2,3,4,3,4,3,4,2,2,3,4,NA,NA,NA,NA,4,3)
> x3<-c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,1,2)
> m<-data.frame(x1,x2,x3)
> 
> I would like to create a frequency table like this:
> 
>       x1  x2  x3
> NA
> 1
> 2
> 3
> 4
> 
> where the values in each cell would be the count of the value for that
> variable.
> How can I do this?

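The following will work IF all columns contain only positive integers. Note
that it counts the values 1, 2, ... only; the NAs are dropped by na.omit():
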
> apply(m, 2, function(x){tabulate(na.omit(x))})
     x1 x2 x3
[1,]  5  0  5
[2,]  3  3  5
[3,]  3  5  4
[4,]  3  5  3
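
The original question also asked for an NA row, which the call above leaves
out. One possible way to add it is to count the missing values separately and
bind them on top. This is only a sketch on the same example data; the name
freq is made up for illustration, and it assumes the apply() call returned a
matrix, as it does here:

```r
x1 <- c(1, 2, 3, 4, NA, NA, NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 4)
x2 <- c(2, 3, 4, 3, 4, 3, 4, 2, 2, 3, 4, NA, NA, NA, NA, 4, 3)
x3 <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 1, 2)
m <- data.frame(x1, x2, x3)

# Counts of the values 1, 2, ... per column (NAs dropped first)
counts <- apply(m, 2, function(x) tabulate(na.omit(x)))

# Number of NAs per column, placed as the first row
freq <- rbind(colSums(is.na(m)), counts)

# Label the rows with the values they count
rownames(freq) <- c("NA", seq_len(nrow(counts)))

print(freq)
```

The row labels make it easier to see which value each count belongs to.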

Please note that the result is a list rather than a matrix if the columns do
not all share the same largest value, because tabulate() then returns vectors
of different lengths:

> x1<-as.integer(c(1,2,3,4,NA ,NA ,NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 5))
> m<-data.frame(x1,x2,x3)
> apply(m, 2, function(x){tabulate(na.omit(x))})
$x1
[1] 5 3 3 2 1

$x2
[1] 0 3 5 5

$x3
[1] 5 5 4 3
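
If a matrix is wanted in this case as well, one option is to give tabulate()
the same number of bins for every column through its nbins argument. A minimal
sketch, where nb and freq are names chosen here for illustration:

```r
x1 <- as.integer(c(1, 2, 3, 4, NA, NA, NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 5))
x2 <- c(2, 3, 4, 3, 4, 3, 4, 2, 2, 3, 4, NA, NA, NA, NA, 4, 3)
x3 <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 1, 2)
m <- data.frame(x1, x2, x3)

# Largest value found in any column; used as the common number of bins
nb <- max(unlist(m), na.rm = TRUE)

# Every column now yields a vector of length nb, so sapply() can
# simplify the result to a matrix
freq <- sapply(m, function(x) tabulate(na.omit(x), nbins = nb))
rownames(freq) <- seq_len(nb)

print(freq)
```

Columns that never reach the largest value simply get a count of 0 in the
extra rows.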


cu
	Philipp


-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel