Bus stop sequence matching problem
Homework? The list has a no homework policy - but perhaps I'll be forgiven por
posting hints.
In general terms, this is how I appraoched the problem:
* Loop through the rows of stop_onoff - for (idx in ...someething...) {...
* For each row, find the first of "ref" in a suitably filtered subset of
stop_sequence, and keep track of these row numbers
* Update columns "on" and "off"
* Use cumsum to calculate the number of passengers on the bus
Note the loop. Someone cleverer than I might be able to vectorise that step,
but I couldn't see how.
By the way, if this is homework...
Are you sure you're desired_output is correct? I would expect someething like
seq ref on off load
1 10 A 5 0 5
2 20 B 0 0 5
3 30 C 0 0 5
4 40 D 0 2 3
5 50 B 10 2 11
6 60 A 0 6 5
Are you aware that you're "ref" ccolumns are factors and not characters? If
you use "stringsAsFactors = FALSE" or
stop_onoff <-
data.frame(ref=factor(c('A','D','B','A'), levels =
levels(stop_sequence$ref)),on=c(5,0,10,0),off=c(0,2,2,6))
it will simplify your'e analysis (or at least reduce some typing).
Type the following in an R console
?data.frame
?factor
and have a read.
Now, if this ain't homework, or you just want someone to do it for you, e-mail
me offline and I'll send you my appraoch. If it is homework, let me know - I'm
happy to help anyway, but I will be trying to help you solve this for
yourself.
Cheers,
DMcP
On Sat, 30 Aug 2014 12:46:17 +1200 Adam Lawrence <alaw005 at gmail.com> wrote
I am hoping someone can help me with a bus stop sequencing problem in R, where I need to match counts of people getting on and off a bus to the correct stop in the bus route stop sequence. I have tried looking online/forums for sequence matching but seems to refer to numeric sequences or DNA matching and over my head. I am after a simple example if anyone can please help. I have two data series as per below (from database), that I want to combine. In this example ?stop_sequence? includes the equence (seq) of
bus
stops and ?stop_onoff? is a count of people getting on and off at
certain
stops (there is no entry if noone gets on or off).
stop_sequence <- data.frame(seq=c(10,20,30,40,50,60),
ref=c('A','B','C','D','B','A'))
## seq ref
## 1 10 A
## 2 20 B
## 3 30 C
## 4 40 D
## 5 50 B
## 6 60 A
stop_onoff <-
data.frame(ref=c('A','D','B','A'),on=c(5,0,10,0),off=c(0,2,2,6))
## ref on off
## 1 A 5 0
## 2 D 0 2
## 3 B 10 2
## 4 A 0 6
I need to match the stop_onoff numbers in the right sto sequence, with the
correctly matched output as follows (load is a cumulative count of on and
off)
desired_output <- data.frame(seq=c(10,20,30,40,50,60),
ref=c('A','B','C','D','B','A'),
on=c(5,'-','-',0,10,0),off=c(0,'-','-',2,2,6), load=c(5,0,0,3,11,5))
## seq ref on off load
## 1 10 A 5 0 5
## 2 20 B - - 0
## 3 30 C - - 0
## 4 40 D 0 2 3
## 5 50 B 10 2 11
## 6 60 A 0 6 5
In this example the stop ?B? is matched to the second stop ?B? in
the stop
sequence and not the first because the onoff data is after stop ?D?.
Any guidance much appreciated.
Regards
Adam
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
____________________________________________________________ South Africas premier free email service - www.webmail.co.za Cheapest Insurance Quotes! https://www.outsurance.co.za/insurance-quote/personal/?source=msn&cr=Postit14_468x60_gif&cid=322