Sorry, I attempted to paste the sample data but it must have been stripped
out when I posted. It is hopefully now listed below.
tapply looks useful. I will check it out further.
Here's the sample data:
flights[1:10,]
   PASSENGERS DISTANCE ORIGIN   ORIGIN_CITY_NAME ORIGIN_WAC DEST     DEST_CITY_NAME DEST_WAC YEAR
1       17266     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
2       16934     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
3       15470     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
4       13997     5995    ICN Seoul, South Korea        778  LAX    Los Angeles, CA       91 2010
5       13738     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
6       13682     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
7       13187     5995    ICN Seoul, South Korea        778  LAX    Los Angeles, CA       91 2010
8       13051     5995    LAX    Los Angeles, CA         91  ICN Seoul, South Korea      778 2010
9       12761     1940    SPN         Saipan, TT          5  ICN Seoul, South Korea      778 2010
10      12419     5995    ICN Seoul, South Korea        778  LAX    Los Angeles, CA       91 2010
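For reference, here is a minimal sketch of how the tapply suggestion
might look with the column names in the listing above. It assumes the
data frame is called flights, as in the listing, and retypes only the
three columns it needs; the name totals is just illustrative.

```r
# Sample rows retyped from the listing above (only the columns used here).
flights <- data.frame(
  PASSENGERS = c(17266, 16934, 15470, 13997, 13738,
                 13682, 13187, 13051, 12761, 12419),
  ORIGIN = c("LAX", "LAX", "LAX", "ICN", "LAX",
             "LAX", "ICN", "LAX", "SPN", "ICN"),
  DEST   = c("ICN", "ICN", "ICN", "LAX", "ICN",
             "ICN", "LAX", "ICN", "ICN", "LAX"),
  stringsAsFactors = FALSE
)

# Sum PASSENGERS for every ORIGIN/DEST combination.
# The result is a matrix with origins as rows and destinations as columns;
# combinations that never occur in the data come back as NA.
totals <- tapply(flights$PASSENGERS,
                 list(flights$ORIGIN, flights$DEST),
                 sum)
print(totals)
```

Note that this treats LAX to ICN and ICN to LAX as separate pairs,
which seems to match the way the rows are recorded.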
Thanks,
Aaron
-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com]
Sent: Monday, October 24, 2011 11:58 AM
To: asindc
Cc: r-help at r-project.org
Subject: Re: [R] How to selectively sum rows [Beginner question]
It would be good to follow the posting guide and at least supply a
sample of the data.
Most likely 'tapply' is one way of doing it:
tapply(df$passenger, list(df$orig, df$dest), sum)
On Mon, Oct 24, 2011 at 11:27 AM, asindc <siirilaa at eastwestcenter.org> wrote:
Hi, I am new to R so I would appreciate any help. I have some passenger
flight data between city pairs. The way I got the data, there are
multiple rows for each city pair; the number of passengers needs to be
summed to get a TOTAL annual passenger count for each city pair.
So my question is: how do I create a new table (or data frame) that
selectively sums the passengers for each city pair?
My initial thought would be to iterate through each row with the following
logic:
1. If the ORIGIN_WAC and DEST_WAC pair is not in the new table, then add
it to the table
2. If the ORIGIN_WAC and DEST_WAC pair already exists, then sum the
passengers (and do not add a new row)
Is this logical? If so, I think I just need some help on syntax (or do I
use a script?). Thanks.
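The row-by-row logic described above (add a pair the first time it is
seen, otherwise add to its running total) is roughly what base R's
aggregate() does in a single call, so an explicit loop may not be
needed. A minimal sketch, assuming the ten sample rows shown at the top
of this message and keeping only the columns involved; pair_totals is
just an illustrative name:

```r
# Sample rows retyped from the listing at the top of this message
# (only the columns used here).
flights <- data.frame(
  PASSENGERS = c(17266, 16934, 15470, 13997, 13738,
                 13682, 13187, 13051, 12761, 12419),
  ORIGIN_WAC = c(91, 91, 91, 778, 91, 91, 778, 91, 5, 778),
  DEST_WAC   = c(778, 778, 778, 91, 778, 778, 91, 778, 778, 91)
)

# One output row per ORIGIN_WAC/DEST_WAC pair that occurs in the data,
# with PASSENGERS summed within each pair.
pair_totals <- aggregate(PASSENGERS ~ ORIGIN_WAC + DEST_WAC,
                         data = flights, FUN = sum)
print(pair_totals)
```

Unlike tapply, which returns a matrix, this returns a data frame with
one row per pair, which is closer to the "new table" described in the
question.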
The first few rows of data look like this: