
Creating two new variables conditional on retaining values from previous rows

3 messages · pele.s at yahoo.com, Bert Gunter, Jim Lemon

#
Hello,

I am looking for an R solution that can efficiently produce the output shown below. I can produce this easily in SAS with a retain statement and a few lines of if-then-else logic, but I have not found anything similar in the R forum archives. Below is the logic I am trying to apply to produce the output table. Thanks for any help!

If the row is the first one encountered for an ID, then group = 1 and groupdate = date.
Else, if it is not the first row for the ID and (date - previous date > 10 or date - previous group date > 10), then group = previous group # + 1 and groupdate = date.
Else, if it is not the first row for the ID and (date - previous date <= 10 and date - previous group date <= 10), then group = previous group # and groupdate = previous group date.

Input:

ID  DATE        ITEM
1   1/1/2014    P1
1   1/15/2014   P2
1   1/20/2014   P3
1   1/22/2014   P4
1   3/10/2015   P5
2   1/13/2015   P1
2   1/20/2015   P2
2   1/28/2015   P3
2   2/28/2015   P4
2   3/20/2015   P5

Desired output:

ID  DATE        ITEM  GROUP  GROUPDATE
1   1/1/2014    P1    1      1/1/2014
1   1/15/2014   P2    2      1/15/2014
1   1/20/2014   P3    2      1/15/2014
1   1/22/2014   P4    2      1/15/2014
1   3/10/2015   P5    3      3/10/2015
2   1/13/2015   P1    1      1/13/2015
2   1/20/2015   P2    1      1/13/2015
2   1/28/2015   P3    2      1/28/2015
2   2/28/2015   P4    3      2/28/2015
2   3/20/2015   P5    4      3/20/2015

Thanks for any help!
#
I do not have the tenacity to decipher your logic, but I would suggest
that you go through an R tutorial or two instead of limiting yourself
to R-Help (not R forum?) archives. You probably are going about it
wrongly in R (I suspect you need indexing). In fact, I would guess
that you probably don't want to go about things this way at all in R,
but of course that would require my knowing the underlying problem you
are trying to solve, which I do not.  My point is that you need to
change your programming paradigm and learn R instead of trying to
overlay SAS paradigms. And yup, it requires time and effort to do
this.

Note that there are various "R for SAS programmers" tutorials out
there, which might be helpful, either for your specific query or more
generally.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Apr 19, 2016 at 3:59 PM, pele.s--- via R-help
<r-help at r-project.org> wrote:
#
Hi pele,
There are probably more elegant ways to do this using some function,
but this might help:

psdat<-read.table(text="ID DATE ITEM
 1   1/1/2014    P1
 1   1/15/2014   P2
 1   1/20/2014   P3
 1   1/22/2014   P4
 1   3/10/2015   P5
 2   1/13/2015   P1
 2   1/20/2015   P2
 2   1/28/2015   P3
 2   2/28/2015   P4
 2   3/20/2015   P5",
 header=TRUE)
psdat$DATE<-as.Date(as.character(psdat$DATE),"%m/%d/%Y")
psdat$GROUP<-1
psdat$GROUPDATE<-psdat$DATE[1]
# ID of the block of rows currently being processed
lastID<-psdat$ID[1]
for(case in 2:dim(psdat)[1]) {
 # start a new ID
 if(psdat$ID[case] != lastID) {
  lastID<-psdat$ID[case]
  psdat$GROUP[case]<-1
  psdat$GROUPDATE[case]<-psdat$DATE[case]
 } else {
  if((psdat$DATE[case] - psdat$DATE[case-1]) > 10 ||
   (psdat$DATE[case] - psdat$GROUPDATE[case-1]) > 10) {
   psdat$GROUP[case]<-psdat$GROUP[case-1]+1
   psdat$GROUPDATE[case]<-psdat$DATE[case]
  } else {
   psdat$GROUP[case]<-psdat$GROUP[case-1]
   psdat$GROUPDATE[case]<-psdat$GROUPDATE[case-1]
  }
 }
}
psdat
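
As one possible "more elegant" variant, here is a rough sketch that puts the same retain-style logic into a function and applies it to each ID separately, so the loop never has to track where an ID changes. The helper name assign_groups and its gap argument are made up for illustration; the sketch assumes the rows are already sorted by date within each ID, and it has only been checked against the ten rows posted above.

```r
# Hypothetical helper (name and 'gap' argument are illustrative, not from the thread).
# Assumes 'dates' is a Date vector sorted in ascending order for a single ID.
assign_groups <- function(dates, gap = 10) {
  n <- length(dates)
  group <- integer(n)
  groupdate <- dates          # default: each row starts its own group date
  group[1] <- 1L
  for (i in seq_len(n)[-1]) {
    since_prev  <- as.numeric(dates[i] - dates[i - 1])      # days since previous row
    since_group <- as.numeric(dates[i] - groupdate[i - 1])  # days since current group date
    if (since_prev > gap || since_group > gap) {
      group[i] <- group[i - 1] + 1L       # open a new group; groupdate[i] stays dates[i]
    } else {
      group[i] <- group[i - 1]            # stay in the group and retain its date
      groupdate[i] <- groupdate[i - 1]
    }
  }
  data.frame(GROUP = group, GROUPDATE = groupdate)
}

# Same example data as in the original post
psdat2 <- read.table(text = "ID DATE ITEM
1 1/1/2014 P1
1 1/15/2014 P2
1 1/20/2014 P3
1 1/22/2014 P4
1 3/10/2015 P5
2 1/13/2015 P1
2 1/20/2015 P2
2 1/28/2015 P3
2 2/28/2015 P4
2 3/20/2015 P5", header = TRUE, stringsAsFactors = FALSE)
psdat2$DATE <- as.Date(psdat2$DATE, "%m/%d/%Y")

# Split by ID, compute the two new columns per piece, and stack the pieces again
res <- do.call(rbind, lapply(split(psdat2, psdat2$ID),
                             function(d) cbind(d, assign_groups(d$DATE))))
rownames(res) <- NULL
res
```

Splitting by ID first means the "first row of an ID" case is simply the first element of each piece. For large data this is still a row-by-row loop, so it is not necessarily faster than the version above.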

Jim

On Wed, Apr 20, 2016 at 8:59 AM, pele.s--- via R-help
<r-help at r-project.org> wrote: