I have a very simple question about counting the number of factor levels
present in each subset of a data frame.
Given the following data frame:
foo <- data.frame(yr  = c(rep(1998,4), rep(1999,4), rep(2000,2)),
                  div = factor(c(rep(NA,4),"A","B","C","D","A","C")),
                  org = factor(c(1:4,1:4,1,2)))
I want to get two new variables. Object ndiv would give the number of
divisions by year:
1998 0
1999 3
2000 2
Object norgs would give the number of organizations by year:
1998 4
1999 4
2000 2
I figure xtabs should be able to do it, but I'm stuck without a for loop.
Any suggestions? -Andy
Summarizing factor data in table?
4 messages · Andy Bunn, Tony Plate, Gabor Grothendieck
Do you want to count the number of non-NA divisions and organizations in
the data for each year (where duplicates are counted as many times as
they appear)?
> tapply(!is.na(foo$div), foo$yr, sum)
1998 1999 2000
   0    4    2
> tapply(!is.na(foo$org), foo$yr, sum)
1998 1999 2000
   4    4    2
Or perhaps the number of unique non-NA divisions and organizations in
the data for each year?
> tapply(foo$div, foo$yr, function(x) length(na.omit(unique(x))))
1998 1999 2000
   0    4    2
> tapply(foo$org, foo$yr, function(x) length(na.omit(unique(x))))
1998 1999 2000
   4    4    2
(I don't understand where the "3" in your desired output comes from
though, which maybe indicates I completely misunderstand your request.)
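The two readings above give the same numbers for foo because no division or organization repeats within a year. A minimal sketch of where they differ, using a hypothetical data frame bar (not from the thread) in which division "A" appears twice in 1999:

```r
# Hypothetical variant of foo (an assumption for illustration only):
# division "A" occurs twice in 1999.
bar <- data.frame(
  yr  = c(rep(1998, 4), rep(1999, 4), rep(2000, 2)),
  div = factor(c(rep(NA, 4), "A", "A", "C", "D", "A", "C"))
)

# Reading 1: non-NA rows per year; a repeated division counts each time.
n_rows <- tapply(!is.na(bar$div), bar$yr, sum)

# Reading 2: distinct non-NA divisions per year.
n_unique <- tapply(bar$div, bar$yr, function(x) length(unique(na.omit(x))))

print(n_rows)    # 1998: 0, 1999: 4, 2000: 2
print(n_unique)  # 1998: 0, 1999: 3, 2000: 2
```

Which reading is wanted depends on whether a division listed on several rows in one year should count once or several times.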
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
The three was a typo, which I regret very much. I don't know why I didn't
think of tapply. I was obsessed with doing it as a table.
Thanks for your response, -Andy
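For the record, a table-based route also seems possible without a for loop. The following is only a sketch, not something proposed in the thread: it cross-tabulates year against the factor with table(), then counts the non-empty cells in each row. It relies on table() dropping NA values by default while keeping every year as a row, so 1998 still shows up with a count of zero divisions.

```r
foo <- data.frame(
  yr  = c(rep(1998, 4), rep(1999, 4), rep(2000, 2)),
  div = factor(c(rep(NA, 4), "A", "B", "C", "D", "A", "C")),
  org = factor(c(1:4, 1:4, 1, 2))
)

# Year-by-level contingency table; a cell > 0 means that level occurs
# in that year. rowSums() then counts the distinct levels per year.
ndiv  <- rowSums(table(foo$yr, foo$div) > 0)
norgs <- rowSums(table(foo$yr, foo$org) > 0)

print(ndiv)   # 1998: 0, 1999: 4, 2000: 2
print(norgs)  # 1998: 4, 1999: 4, 2000: 2
```

This gives the count of unique levels per year; the tapply version with sum is the one to use if repeated rows should be counted each time.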
-----Original Message-----
From: Tony Plate [mailto:tplate at acm.org]
Sent: Tuesday, April 26, 2005 2:00 PM
To: Andy Bunn
Cc: R-Help
Subject: Re: [R] Summarizing factor data in table?