Message-ID: <9EA364B2FAFC264A86CED5A4404A19DB3F2BAD22@DB3PRD0311MB429.eurprd03.prod.outlook.com>
Date: 2013-01-17T07:43:38Z
From: Pancho Mulongeni
Subject: RESOLVED: Using table to get frequencies of several factors at once
In-Reply-To: <A4E5A0B016B8CB41A485FC629B633CED4A333618D1@GOLD.corp.lgc-group.com>

Thanks, 
I hereby declare this thread as resolved.

-----Original Message-----
From: S Ellison [mailto:S.Ellison at LGCGroup.com] 
Sent: Wednesday, January 16, 2013 4:27 PM
To: Pancho Mulongeni; R help
Subject: RE: Using table to get frequencies of several factors at once

You could use a variant of apply(), probably sapply().

For example:
d <- as.data.frame( matrix(sample(0:1, 200, replace=TRUE), ncol=5))

head(d)

sapply(d, table)
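
One caveat, sketched under assumptions: sapply(d, table) simplifies to a matrix only when every column contains the same set of values. If a column holds a single value (as in the last row of the output quoted below), the tables differ in length and sapply returns a list instead. A possible workaround is to fix the levels before tabulating. The seed, the all-zero V5 column, and the name freq are illustrative assumptions, not part of the original data:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
d <- as.data.frame(matrix(sample(0:1, 200, replace = TRUE), ncol = 5))
d$V5 <- 0L   # assumption: force one column to contain only the value 0

# Fix the levels so every table has a cell for both 0 and 1,
# then transpose so rows are variables and columns are values.
freq <- t(sapply(d, function(x) table(factor(x, levels = 0:1))))
freq
```

If only the count of 1s is needed, colSums(d == 1) should give it directly.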

S Ellison
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Pancho Mulongeni
> Sent: 11 January 2013 11:18
> To: R help
> Subject: [R] Using table to get frequencies of several factors at once
> 
> Hi, I have a dataframe with n columns, but I am only looking at five 
> of them. It has lots of rows, over 700.
> I would like to find frequencies for each of the numeric columns 
> (variables) using the table function. Is there a fast way to 
> produce a frequency table where the 5 rows represent the 5 numeric 
> variables and the columns refer to the values (levels) of the 
> respective numeric variables, which in this case are 0 and 1?
> The only way I have figured it out is via a for loop:
> m<-seq(218,222,1) #these are columns of the variables in the larger dataframe
> tm<-m[1:5] #I need this for the for loop
> l.tm<-length(tm)
> B<-matrix(nrow=l.tm,ncol=2)  #the matrix to hold the freqs
> for (p in 1:l.tm) {
>   var.num<-m[p]
>   B[p,]<-table(DATA[,var.num])
> }
> 
> > B
>      [,1] [,2]
> [1,]  697    9
> [2,]  512  194
> [3,]  604  102
> [4,]  700    6
> [5,]  706  706
> So the rows represent my five variables (columns) that occupy 
> columns 218 through 222 in the DATA dataframe.
> So the second column represents my frequencies of the value 
> 1, which is what I am interested in. The last row has a 
> double entry, because there was only one value, 0, with a 
> freq of 706 and so R duplicated in the two columns, but 
> that's ok, I can just ignore it. 
> 
> So is there a better way to do this? Is there a way to use 
> the so-called tapply function? I struggle to understand the 
> help doc for this function.
> 
> 
> Pancho Mulongeni
> Research Assistant
> PharmAccess Foundation
> 1 Fouché Street
> Windhoek West
> Windhoek
> Namibia
> 
> Tel: +264 61 419 000
> Fax: +264 61 419 001/2
> Mob: +264 81 4456 286
> 