
My code is too "loopy"

5 messages · Dennis Murphy, Dimitri Liakhovitski

#
Hello!
I wrote a piece of code below that does the job but seems too "loopy" to me.
I was wondering if there is any way to make it more efficient/less "loopy"?
Thanks a lot for your hints!
Dimitri

### Creating example data set:

mygroups<-c(rep("group1", 8),rep("group2", 8))
myweeks<-seq(as.Date("2010-01-04"), length = 8, by = "week")
values.w<-c(0,10,15,20,0,0,0,10,100,200,0,0,300,200,0,0)
mydata<-data.frame(group=mygroups,mydates=myweeks,myvalue=values.w)
mydata$group<-as.factor(mydata$group)
str(mydata)
(mydata)

### Doing the following within each level of the factor "mydata$group":
### Create a new variable ("new.value") that equals:
### myvalue in the same week * 0.5 +
### myvalue 1 week ago  * 0.35
### myvalue 2 weeks ago * 0.15

groups<-levels(mydata$group)
(groups)

mydata[["new.value"]]<-mydata[["myvalue"]]*0.5

for(i in groups){   # looping through groups
  temp.data<-mydata[mydata$group %in% i,] # selecting values for one group
  temp.data[2,"new.value"]<-temp.data[["new.value"]][2]+temp.data[1,"myvalue"]*0.35
# 2nd new value
  for(myrow in 3:nrow(temp.data)){  # Starting in row 3 and looping through rows
    temp.data[myrow,"new.value"]<-temp.data[["new.value"]][myrow]+temp.data[(myrow-1),"myvalue"]*.35+temp.data[(myrow-2),"myvalue"]*.15
  }
  mydata[mydata$group %in% i,]<-temp.data
}
#
Hi:

I think the embed() function is your friend here. From its help page example
(this is the output of embed(1:10, 3)),
     [,1] [,2] [,3]
[1,]    3    2    1
[2,]    4    3    2
[3,]    5    4    3
[4,]    6    5    4
[5,]    7    6    5
[6,]    8    7    6
[7,]    9    8    7
[8,]   10    9    8
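
To make the row layout concrete, here is a minimal sketch on a toy vector (the vector and the names x, m, w and res are made up for illustration and are not from the thread). Each row of embed(x, 3) holds the current value followed by the two values before it, so multiplying by the weights gives one weighted sum per row:

```r
# Toy vector, chosen only for illustration
x <- c(10, 20, 30, 40, 50)

# Row t of embed(x, 3) is (x[t], x[t-1], x[t-2]) for t = 3, 4, 5,
# so the first two positions of x have no row of their own
m <- embed(x, 3)
print(m)

# Weights in the same order as the columns: current, 1 back, 2 back
w <- c(0.5, 0.35, 0.15)

# drop() turns the one-column matrix into a plain numeric vector
res <- drop(m %*% w)
print(res)
```

Note that the result has length(x) - 2 elements, which is why the first two rows of each group are missing from the ddply() output.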


Applying it to your test data,

# h() creates a weighted average of the observations in each row
h <- function(x) embed(x, 3) %*% c(0.5, 0.35, 0.15)
library(plyr)
ddply(mydata, "group", summarise, ma = h(myvalue))
    group     ma
1  group1  11.00
2  group1  16.75
3  group1   9.25
4  group1   3.00
5  group1   0.00
6  group1   5.00
7  group2  85.00
8  group2  30.00
9  group2 150.00
10 group2 205.00
11 group2 115.00
12 group2  30.00

Does that work for you? The rollapply() function in the zoo package
may also be applicable with a similar input function that computes a
weighted average.
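
A rough sketch of what the rollapply() version might look like, assuming the zoo package is installed. The vector is the group1 column from the example data; the names w and ma are illustrative only. Inside each window the values run from oldest to newest, so the weights are listed in the reverse order of the ones passed to h():

```r
library(zoo)  # assumption: zoo is installed

# group1 values from the example data set
x <- c(0, 10, 15, 20, 0, 0, 0, 10)

# Window contents are (2 weeks ago, 1 week ago, same week),
# hence the weights are reversed relative to h()
w <- c(0.15, 0.35, 0.5)

# align = "right" ties each window to its most recent observation
ma <- rollapply(x, width = 3, FUN = function(v) sum(v * w), align = "right")
print(ma)
```

Like the embed() approach, this returns two fewer values than the input unless the series is padded.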

HTH,
Dennis


On Mon, Apr 25, 2011 at 1:50 PM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:
#
Dennis, this is really great, thanks a lot!
Do you know how to keep the result from omitting the first 2 values?
I mean, it starts (within each group) with the 3rd row but omits the
first 2.
Dimitri
On Mon, Apr 25, 2011 at 5:31 PM, Dennis Murphy <djmuser at gmail.com> wrote:

  
    
#
I would probably still have to "manually" specify the values for row 1
and row 2 - and then loop through groups. Something like:

# same-week term for every row; for row 1 of each group this is already the final value
mydata[!is.na(mydata$myvalue),"new.value"]<-mydata[!is.na(mydata$myvalue),"myvalue"]*0.5
# h() is applied only to rows 3+ of each group
h <- function(x) embed(x, 3) %*% c(0.5, 0.35, 0.15)
for(i in groups){   # looping through groups
  temp.data<-mydata[mydata$group %in% i,] # selecting data for one group
  # 2nd weighted value is calculated "manually"
  temp.data[2,"new.value"]<-temp.data[["new.value"]][2]+temp.data[1,"myvalue"]*0.35
  # rows 3+ come from the embed()-based weighted sum (requires plyr to be loaded)
  temp.data[3:nrow(temp.data),"new.value"]<-ddply(temp.data, "group", summarise, ma = h(myvalue))[2]
  mydata[mydata$group %in% i,"new.value"]<-temp.data["new.value"]
}
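
One possible way to avoid treating rows 1 and 2 by hand, sketched in base R only. It assumes that a missing lag should count as zero, which is what the loop in the first message effectively does. The helper name h_pad is hypothetical; embed() and ave() are base R functions. Padding each group's values with two leading zeros makes embed() return one row per original observation:

```r
# Example data from the first message
mydata <- data.frame(
  group   = factor(c(rep("group1", 8), rep("group2", 8))),
  mydates = seq(as.Date("2010-01-04"), length.out = 8, by = "week"),
  myvalue = c(0, 10, 15, 20, 0, 0, 0, 10, 100, 200, 0, 0, 300, 200, 0, 0)
)

# Hypothetical helper: prepend two zeros so that rows 1 and 2 are kept;
# the result has the same length as x
h_pad <- function(x) drop(embed(c(0, 0, x), 3) %*% c(0.5, 0.35, 0.15))

# ave() applies h_pad within each level of group and
# returns the results in the original row order
mydata$new.value <- ave(mydata$myvalue, mydata$group, FUN = h_pad)
print(mydata)
```

Because the padding is done inside each group, values from group1 never leak into the first rows of group2.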

Dimitri

On Tue, Apr 26, 2011 at 10:18 AM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:

  
    
#
Hi:

One approach is to remove the top two observations from each group.
Here's one way:

ddply(mydata, .(group), function(d) tail(d, -2))

Now apply the previous procedure to this data subset.

HTH,
Dennis

On Tue, Apr 26, 2011 at 7:18 AM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote: