Message-ID: <9EA364B2FAFC264A86CED5A4404A19DB3F2BAD22@DB3PRD0311MB429.eurprd03.prod.outlook.com>
Date: 2013-01-17T07:43:38Z
From: Pancho Mulongeni
Subject: RESOLVED: Using table to get frequencies of several factors at once
In-Reply-To: <A4E5A0B016B8CB41A485FC629B633CED4A333618D1@GOLD.corp.lgc-group.com>
Thanks,
I hereby declare this thread as resolved.
-----Original Message-----
From: S Ellison [mailto:S.Ellison at LGCGroup.com]
Sent: Wednesday, January 16, 2013 4:27 PM
To: Pancho Mulongeni; R help
Subject: RE: Using table to get frequencies of several factors at once
You could use a variant of apply(), probably sapply().
For example:
d <- as.data.frame( matrix(sample(0:1, 200, replace=TRUE), ncol=5))
head(d)
sapply(d, table)
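
To get the layout asked for in the original post (one row per variable, one column per value), the result can be transposed. This is only a sketch built on the simulated data above; the seed and the name freq are illustrative, and it assumes every column contains both 0 and 1, so that sapply() can simplify the result to a matrix.

```r
# Simulated 0/1 data, as in the example above (the seed is an arbitrary choice)
set.seed(1)
d <- as.data.frame(matrix(sample(0:1, 200, replace = TRUE), ncol = 5))

# sapply(d, table) puts the values (0, 1) in rows and the variables in columns;
# t() flips it so that each row is a variable and each column is a value
freq <- t(sapply(d, table))
freq
```

If a column holds only one distinct value, sapply() returns a list instead of a matrix, so this form is safe only when both values occur in every column.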
S Ellison
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Pancho Mulongeni
> Sent: 11 January 2013 11:18
> To: R help
> Subject: [R] Using table to get frequencies of several factors at once
>
> Hi, I have a dataframe with n columns, but I am only looking at five
> of them. And lots of rows, over 700.
> So I would like to find frequencies for each of the numeric columns
> (variables) using the table function. However, is there a fast way to
> produce a frequency table where the 5 rows represent the 5 numeric
> variables and the columns refer to the values (levels) of the
> respective numeric variables, which in this case are 0 and 1?
> The only way I have figured it out is via a for loop:
> m<-seq(218,222,1) #these are columns of the variables in the larger dataframe
> tm<-m[1:5] #I need this for the for loop
> l.tm<-length(tm)
> B<-matrix(nrow=l.tm,ncol=2) #the matrix to hold the freqs
> for (p in 1:l.tm) {
>   var.num<-m[p]
>   B[p,]<-table(DATA[,var.num])
> }
>
> > B
>      [,1] [,2]
> [1,]  697    9
> [2,]  512  194
> [3,]  604  102
> [4,]  700    6
> [5,]  706  706
> So the rows represent my five variables (columns) that occupy
> columns 218 through 222 in the DATA dataframe.
> So the second column represents my frequencies of the value
> 1, which is what I am interested in. The last row has a
> double entry, because there was only one value, 0, with a
> freq of 706 and so R duplicated in the two columns, but
> that's ok, I can just ignore it.
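
The doubled 706 in the last row is R's recycling rule at work: table() returns a single count for that column, and assigning a length-one vector to a two-slot row repeats it. One way to avoid this is to turn each column into a factor with both levels declared, so table() always returns two counts. A minimal sketch follows; the data frame dat and its column names are made up to stand in for columns 218 to 222 of DATA.

```r
# Hypothetical stand-in for the five 0/1 columns of DATA (706 rows each)
set.seed(42)
dat <- data.frame(
  a = rbinom(706, 1, 0.05),
  b = rbinom(706, 1, 0.30),
  c = rbinom(706, 1, 0.15),
  d = rbinom(706, 1, 0.02),
  e = rep(0, 706)            # a column that never takes the value 1
)

# Declaring levels 0 and 1 forces table() to report a zero count for a
# value that does not occur, instead of dropping it
counts <- t(sapply(dat, function(x) table(factor(x, levels = 0:1))))
counts
```

With the levels fixed, the row for the all-zero column should show 706 and 0 rather than a repeated 706.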
>
> So is there a better way to do this? Is there a way to use
> the so-called tapply function? I struggle to understand the
> help doc for this function.
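
On the tapply question: tapply() applies a function to one vector split by one or more grouping factors, so it fits once the columns are stacked into long form. The sketch below uses a small made-up data frame (the names d, long, v1 and v2 are illustrative only) and checks the result against table(), which gives the same counts more directly.

```r
# Small made-up example with two 0/1 variables
set.seed(7)
d <- data.frame(v1 = rbinom(50, 1, 0.3), v2 = rbinom(50, 1, 0.6))

# stack() produces a long data frame with columns 'values' and 'ind',
# where 'ind' records which original column each value came from
long <- stack(d)

# Count observations for every (variable, value) combination
by_tapply <- tapply(long$values, list(long$ind, long$values), length)

# The same cross-tabulation with table()
by_table <- table(long$ind, long$values)

by_tapply
by_table
```

One difference worth knowing: tapply() leaves NA in a cell with no observations, whereas table() reports 0, so table() is probably the simpler tool for frequency counts.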
>
>
> Pancho Mulongeni
> Research Assistant
> PharmAccess Foundation
> 1 Fouché Street
> Windhoek West
> Windhoek
> Namibia
> Tel: +264 61 419 000
> Fax: +264 61 419 001/2
> Mob: +264 81 4456 286
>