
Sum over indexed value

4 messages · Gunadi, Stefan Petersson, Henrique Dallazuanna +1 more

#
I am sure this is easy but I am not finding a function to do this. 

I have two columns in a matrix. The first column contains multiple entries
of numbers from 1 to 100 (i.e. 10 ones, 8 twos etc.). The second column
contains unique numbers. I want to sum the numbers in column two based on
the indexed values in column one (e.g. sum of all values in column two
associated with the value 1 in column one). I would like two columns in
return - the indexed value in column one (i.e. this time no duplicates) and
the sum in column two. 

How do I do this?
#
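# Example data: x is the grouping index, y holds the values to sum
P <- data.frame(x = c(1, 1, 2, 3, 2, 1), y = rnorm(6))
# Sum y within each distinct value of x
tapply(P$y, P$x, sum)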
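
Note that tapply() returns a vector named by the index values, not the two columns asked for. A minimal sketch of one way to rebuild the two-column layout follows; the names res and out, and the set.seed() call, are illustrative additions and not part of the original reply.

```r
# Same toy data as above; set.seed() only makes the random values reproducible
set.seed(1)
P <- data.frame(x = c(1, 1, 2, 3, 2, 1), y = rnorm(6))

# tapply() returns the sums, named by the distinct values of x
res <- tapply(P$y, P$x, sum)

# Rebuild two columns: names(res) holds the index values as character strings
out <- data.frame(x = as.numeric(names(res)), sum = as.vector(res))
out
```

The index column comes back through as.numeric() because tapply() stores the group values as character names.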

regards,
 stefan
On Mon, Nov 16, 2009 at 09:49:17AM -0800, Gunadi wrote:
#
Try this:

with(DF, rowsum(Col2, Col1))
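
For a runnable version of the line above, here is a small sketch. DF, Col1 and Col2 are placeholder names, and the values are made up for illustration, so adjust them to the real data.

```r
# Hypothetical data: Col1 is the repeated index, Col2 holds the values to sum
DF <- data.frame(Col1 = c(1, 1, 2, 3, 2, 1),
                 Col2 = c(10, 20, 30, 40, 50, 60))

# rowsum() adds up Col2 within each distinct value of Col1
sums <- with(DF, rowsum(Col2, Col1))

# The result is a one-column matrix whose row names are the index values,
# so the two-column layout can be rebuilt from rownames() and the sums
out <- data.frame(Col1 = as.numeric(rownames(sums)), sum = as.vector(sums))
out
#   Col1 sum
# 1    1  90
# 2    2  80
# 3    3  40
```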
On Mon, Nov 16, 2009 at 3:49 PM, Gunadi <boydkramer at gmail.com> wrote:
#
Gunadi wrote:
Supposing you had the data:

  tstData <- data.frame( index = c(1,2,1,1,3,2), 
    value = c( 0, 4, 0, 0, 7, 4 ) )

You could use the by() function to divide the data.frame and sum the value
column:

  sums <- by( tstData, tstData[['index']], function( slice ){

    return( sum( slice[['value']] ) )

  })
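
If you stay with by(), the index values are still recoverable from the dimnames of the result. A minimal sketch, assuming the function returns a single number per slice as it does here (the name out is just for illustration):

```r
tstData <- data.frame(index = c(1, 2, 1, 1, 3, 2),
                      value = c(0, 4, 0, 0, 7, 4))

# One sum per distinct value of index
sums <- by(tstData, tstData[["index"]], function(slice) {
  sum(slice[["value"]])
})

# dimnames() holds the index values as strings; as.vector() drops the
# "by" attributes and leaves the plain sums
out <- data.frame(index = as.numeric(dimnames(sums)[[1]]),
                  sum = as.vector(sums))
out
#   index sum
# 1     1   0
# 2     2   8
# 3     3   7
```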

However, the object returned by by() does not cleanly show which values of
'index' generated each sum.  I would recommend the __ply() functions in
Hadley Wickham's plyr package.  Specifically ddply():

  require( plyr )

  sums <- ddply( tstData, 'index', function( slice ){

    return(
      data.frame( sum = sum( slice[['value']] ) )
    )
  })

  sums
   index sum
  1     1   0
  2     2   8
  3     3   7
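
If installing a package is not an option, base R's aggregate() gives a similar two-column data frame. This is an alternative not mentioned earlier in the thread, sketched on the same test data; the name out is illustrative.

```r
tstData <- data.frame(index = c(1, 2, 1, 1, 3, 2),
                      value = c(0, 4, 0, 0, 7, 4))

# Formula interface: sum 'value' within each distinct 'index'
out <- aggregate(value ~ index, data = tstData, FUN = sum)
out
#   index value
# 1     1     0
# 2     2     8
# 3     3     7
```

The summed column keeps the name 'value'; rename it with names(out)[2] <- "sum" if you want the same header as the ddply() result.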


Hope this helps!

-Charlie

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University