I've got a data frame describing comments on an electronic journal, wherein each row is a unique comment, like so:

  commentID  author articleID
1         1   smith         2
2         2   jones         3
3         3 andrews         2
4         4   jones         1
5         5 johnson         3
6         6   smith         2

I want to know the number of unique authors per article. I can get a table of article frequencies with table(articleID), but I can't figure out how to count frequencies in a different column. I'm sure there's an easy way, but I guess I'm too new at this to find it.

Thanks for your help!

Jason Priem
PhD student, School of Information and Library Science, University of North Carolina-Chapel Hill
counting frequencies across two columns
5 messages · Jason Priem, milton ruser, Patrick Connolly +2 more
On Sun, 01-Nov-2009 at 01:20AM -0500, Jason Priem wrote:
I've got a data frame describing comments on an electronic journal, wherein each row is a unique comment, like so:

  commentID  author articleID
1         1   smith         2
2         2   jones         3
3         3 andrews         2
4         4   jones         1
5         5 johnson         3
6         6   smith         2
Let's call that dataframe x
I want to know the number of unique authors per article. I can get a table of article frequencies with table(articleID), but I can't figure out how to count frequencies in a different column. I'm sure there's an easy way, but I guess I'm too new at this to find it.
I'm not clear what you require, but maybe it's this:
with(x, table(articleID, author))
articleID andrews johnson jones smith
        1       0       0     1     0
        2       1       0     0     2
        3       0       1     1     0
Is that anything like what you're after?
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___ Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_ Average minds discuss events
(:_~*~_:) Small minds discuss people
(_)-(_) ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
On Nov 1, 2009, at 1:59 AM, Patrick Connolly wrote:
I'm not clear what you require, but maybe it's this:
with(x, table(articleID, author))
articleID andrews johnson jones smith
        1       0       0     1     0
        2       1       0     0     2
        3       0       1     1     0
Is that anything like what you're after?
You've had two guesses so far and my guess increments the count.
Were you attempting to specify this?
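# The header has three names but each row has four fields, so read.table
# treats the first field as row names.
df1 <- read.table(textConnection("commentID author articleID
1 1 smith 2
2 2 jones 3
3 3 andrews 2
4 4 jones 1
5 5 johnson 3
6 6 smith 2"), header = TRUE)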
> lapply(lapply(tapply(df1$author, df1$articleID, I), unique), length)
$`1`
[1] 1

$`2`
[1] 2

$`3`
[1] 2
Or delivered as a named vector (using Connolly's table as an intermediate step):
> apply( with(df1, table(articleID, author)), 1, function(x) sum(x>0) )
1 2 3
1 2 2
-- David Winsemius, MD Heritage Laboratories West Hartford, CT
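For readers who want the answers in this thread in a more compact form, here is a minimal sketch. It assumes the example data from the original question, rebuilt here with data.frame() so the block runs on its own. The names n_authors and n_authors_tab are made up for illustration and were not used by any poster. Both lines should give the same per-article counts as the replies above.

```r
# Rebuild the example data from the original question.
df1 <- data.frame(
  commentID = 1:6,
  author    = c("smith", "jones", "andrews", "jones", "johnson", "smith"),
  articleID = c(2, 3, 2, 1, 3, 2)
)

# Option 1: within each article, count the distinct authors directly.
n_authors <- tapply(df1$author, df1$articleID,
                    function(a) length(unique(a)))

# Option 2: cross-tabulate article by author, then count the non-zero
# cells in each row (one row per article).
n_authors_tab <- rowSums(table(df1$articleID, df1$author) > 0)

print(n_authors)
print(n_authors_tab)
```

Option 1 skips the intermediate table, which may matter when there are many distinct authors. Option 2 is a vectorised form of the apply(..., 1, function(x) sum(x > 0)) idea.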