counting frequencies across two columns - R-help

Sat, Oct 31, 2009 11:20 PM #

I've got a data frame describing comments on an electronic journal, 
wherein each row is a unique comment, like so:

  commentID  author articleID
1         1   smith         2
2         2   jones         3
3         3 andrews         2
4         4   jones         1
5         5 johnson         3
6         6   smith         2

I want to know the number of unique authors per article.  I can get a table 
of article frequencies with table(articleID), but I can't figure out how 
to count frequencies in a different column.  I'm sure there's an easy 
way, but I guess I'm too new at this to find it.  Thanks for your help!

Jason Priem
PhD student, School of Information and Library Science, University of 
North Carolina-Chapel Hill

milton ruser

Sat, Oct 31, 2009 11:52 PM #

[Reply body was sent as an attachment and scrubbed by the list archive: <https://stat.ethz.ch/pipermail/r-help/attachments/20091101/08469419/attachment-0001.pl>]

Patrick Connolly

Sat, Oct 31, 2009 11:59 PM #

On Sun, 01-Nov-2009 at 01:20AM -0500, Jason Priem wrote:

Let's call that dataframe x

I'm not clear what you require, but maybe it's this:

articleID andrews johnson jones smith
        1       0       0     1     0
        2       1       0     0     2
        3       0       1     1     0

Is that anything like what you're after?
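
A table like the one above can be produced by cross-tabulating the two columns. This is only a minimal sketch and not necessarily the call used in this reply: the data frame name x follows the suggestion above, the data are retyped from the original question, and the use of table() is an assumption.

```r
# Data retyped from the original question; 'x' is the name suggested above.
x <- data.frame(
  commentID = 1:6,
  author    = c("smith", "jones", "andrews", "jones", "johnson", "smith"),
  articleID = c(2, 3, 2, 1, 3, 2)
)

# Cross-tabulate articles (rows) against authors (columns):
# each cell is the number of comments by that author on that article.
tab <- with(x, table(articleID, author))
print(tab)
```

A cell greater than zero means that author commented on that article at least once, so counting the nonzero cells in each row gives the number of distinct authors per article.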

~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___    Patrick Connolly   
 {~._.~}                   Great minds discuss ideas    
 _( Y )_  	         Average minds discuss events 
(:_~*~_:)                  Small minds discuss people  
 (_)-(_)  	                      ..... Eleanor Roosevelt
	  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

David Winsemius

Sun, Nov 1, 2009 4:48 AM #

On Nov 1, 2009, at 1:59 AM, Patrick Connolly wrote:

You've had two guesses so far and my guess increments the count.

Were you attempting to specify this?

# Rebuild the example data frame from the original post
df1 <- read.table(textConnection("commentID  author articleID
1         1   smith         2
2         2   jones         3
3         3 andrews         2
4         4   jones         1
5         5 johnson         3
6         6   smith         2"), header=TRUE)

 > lapply(lapply(tapply(df1$author, df1$articleID, I), unique), length)
$`1`
[1] 1

$`2`
[1] 2

$`3`
[1] 2

Or delivered as a named vector (using Connolly's approach as an 
intermediate step):

 > apply( with(df1, table(articleID, author)), 1, function(x) sum(x>0) )
1 2 3
1 2 2
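
For comparison, the same per-article counts can also be obtained in a single step by counting the distinct authors within each article. This is a sketch of an alternative, not something taken from any reply shown in this thread; df1 is rebuilt here with data.frame() only so that the block runs on its own, and the name n_authors is made up for illustration.

```r
# Same data as df1 above, retyped so this block is self-contained.
df1 <- data.frame(
  commentID = 1:6,
  author    = c("smith", "jones", "andrews", "jones", "johnson", "smith"),
  articleID = c(2, 3, 2, 1, 3, 2)
)

# For each articleID, count the distinct authors who commented on it.
# 'n_authors' is a hypothetical name chosen for this example.
n_authors <- tapply(df1$author, df1$articleID,
                    function(a) length(unique(a)))
print(n_authors)
```

Because the function returns a single number per group, tapply() simplifies the result to a vector named by articleID instead of a list.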

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Jorge Ivan Velez

Sun, Nov 1, 2009 7:30 AM #

[Reply body was sent as an attachment and scrubbed by the list archive: <https://stat.ethz.ch/pipermail/r-help/attachments/20091101/77780b8b/attachment-0001.pl>]