How to selectively sum rows [Beginner question]

5 messages · asindc, jim holtman, Aaron Siirila +1 more

#
Hi, I am new to R so I would appreciate any help. I have passenger flight
data between city pairs. The way I got the data, there are
multiple rows of data for each city pair; the number of passengers needs to
be summed to get a TOTAL annual passenger count for each city pair. 

So my question is: how do I create a new table (or data frame) that
selectively sums the passengers for each city pair?

My initial thought would be to iterate through each row with the following
logic:

1. If the ORIGIN_WAC and DEST_WAC pair are not in the new table, then add
them to the table
2. If the ORIGIN_WAC and DEST_WAC pair already exist, then sum the
passengers (and do not add a new row)
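
The two steps above can be sketched literally as a loop. This is only an illustration of the logic, not a recommended approach: the `flights` data frame and its values are made up for the example, and only the column names ORIGIN_WAC, DEST_WAC and PASSENGERS come from the question.

```r
# Hypothetical toy data; only the column names are taken from the question
flights <- data.frame(
  ORIGIN_WAC = c(91, 91, 778, 91, 778),
  DEST_WAC   = c(778, 778, 91, 778, 91),
  PASSENGERS = c(100, 50, 30, 20, 10)
)

# Start with an empty table that has the same columns as the input
totals <- flights[0, ]

for (i in seq_len(nrow(flights))) {
  current <- flights[i, ]
  # Look for this city pair in the table built so far
  hit <- which(totals$ORIGIN_WAC == current$ORIGIN_WAC &
               totals$DEST_WAC == current$DEST_WAC)
  if (length(hit) == 0) {
    # Step 1: pair not seen yet, so add it as a new row
    totals <- rbind(totals, current)
  } else {
    # Step 2: pair already present, so add to its running total
    totals$PASSENGERS[hit] <- totals$PASSENGERS[hit] + current$PASSENGERS
  }
}

print(totals)
```

Growing a data frame row by row like this tends to be slow on large inputs, so grouping functions are usually the more idiomatic choice in R.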

Is this logical? If so, I think I just need some help on syntax (or do I use
a script?). Thanks.

The first few rows of data look like this:



--
View this message in context: http://r.789695.n4.nabble.com/How-to-selectively-sum-rows-Beginner-question-tp3933512p3933512.html
Sent from the R help mailing list archive at Nabble.com.
#
It would be good to follow the posting guide and at least supply a
sample of the data.

Most likely 'tapply' is one way of doing it:

tapply(df$passenger, list(df$orig, df$dest), sum)
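
A minimal sketch of that call with concrete column names, assuming the data frame is called `df` and uses the PASSENGERS, ORIGIN_WAC and DEST_WAC columns from the sample rows posted later in this thread (the names `passenger`, `orig` and `dest` in the one-liner above are placeholders):

```r
# Ten sample rows from the thread, reduced to the three relevant columns
df <- data.frame(
  PASSENGERS = c(17266, 16934, 15470, 13997, 13738,
                 13682, 13187, 13051, 12761, 12419),
  ORIGIN_WAC = c(91, 91, 91, 778, 91, 91, 778, 91, 5, 778),
  DEST_WAC   = c(778, 778, 778, 91, 778, 778, 91, 778, 778, 91)
)

# Sum PASSENGERS within every ORIGIN_WAC x DEST_WAC combination.
# The result is a matrix: one row per origin, one column per destination,
# with NA where a combination never occurs in the data.
totals <- tapply(df$PASSENGERS, list(df$ORIGIN_WAC, df$DEST_WAC), sum)
print(totals)

# Look up a single city pair by its codes (dimnames are character strings)
totals["91", "778"]
```

Because the result is a cross-tabulated matrix rather than one row per pair, it can be sparse when most origin/destination combinations have no flights.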
On Mon, Oct 24, 2011 at 11:27 AM, asindc <siirilaa at eastwestcenter.org> wrote:

  
    
#
Sorry, I attempted to paste the sample data but it must have been stripped
out when I posted. It is hopefully now listed below.

tapply looks useful. I will check it out further.

Here's the sample data:
   PASSENGERS DISTANCE ORIGIN   ORIGIN_CITY_NAME ORIGIN_WAC DEST     DEST_CITY_NAME DEST_WAC YEAR
1       17266     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
2       16934     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
3       15470     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
4       13997     5995    ICN Seoul, South Korea        778  LAX    Los Angeles, CA       91 2010
5       13738     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
6       13682     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
7       13187     5995    ICN Seoul, South Korea        778  LAX    Los Angeles, CA       91 2010
8       13051     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
9       12761     1940    SPN         Saipan, TT          5  ICN Seoul, South Korea      778 2010
10      12419     5995    ICN Seoul, South Korea        778  LAX    Los Angeles, CA       91 2010

Thanks,
Aaron


#
See the count() function in the plyr package; it does fast summation.
Something like

library('plyr')
count(passengerData, c('ORIGIN_WAC', 'DEST_WAC'), 'npassengers')
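
As a hedged sketch of how this might look on the sample rows from this thread: the weight column there is named PASSENGERS, so 'npassengers' above is taken to be a placeholder. The example also shows base R's aggregate(), which is a swapped-in alternative rather than something suggested in the thread, so that it runs even where plyr is not installed.

```r
# Ten sample rows from the thread, reduced to the three relevant columns
passengerData <- data.frame(
  PASSENGERS = c(17266, 16934, 15470, 13997, 13738,
                 13682, 13187, 13051, 12761, 12419),
  ORIGIN_WAC = c(91, 91, 91, 778, 91, 91, 778, 91, 5, 778),
  DEST_WAC   = c(778, 778, 778, 91, 778, 778, 91, 778, 778, 91)
)

# Base R: one output row per ORIGIN_WAC/DEST_WAC pair, PASSENGERS summed
totals <- aggregate(PASSENGERS ~ ORIGIN_WAC + DEST_WAC,
                    data = passengerData, FUN = sum)
print(totals)

# plyr version, run only if the package is installed; count() puts the
# weighted sum in a column named "freq"
if (requireNamespace("plyr", quietly = TRUE)) {
  print(plyr::count(passengerData, c("ORIGIN_WAC", "DEST_WAC"), "PASSENGERS"))
}
```

Both forms return a data frame with one row per city pair that actually occurs, which matches the table described in the original question.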

HTH,
Dennis
#
The count() function in the plyr package works beautifully. Thanks to Jim,
Rainer and Dennis for your help. 

Best.
