Skip to content

merging 2 frames while keeping all the entries from the "reference" frame

5 messages · Dimitri Liakhovitski, Gabor Grothendieck, Henrique Dallazuanna

#
Hello!
I have my data frame "mydata" (below) and data frame "reference" -
that contains all the dates I would like to be present in the final
data frame.
I am trying to merge them so that the the result data frame contains
all 8 dates in both subgroups (i.e., Group1 should have 8 rows and
Group2 too). But when I merge it it's not coming out this way. Any
hint would be greatly appreciated!
Dimitri

mydata<-data.frame(mydate=rep(seq(as.Date("2008-12-29"), length = 8,
by = "week"),2),
group=c(rep("Group1",8),rep("Group2",8)),values=rnorm(16,1,1))
(reference);(mydata)
set.seed(1234)
out<-sample(1:16,5,replace=F)
mydata<-mydata[-out,]; dim(mydata)
(mydata)

# "reference" contains the dates I want to be present in the final data frame:
reference<-data.frame(mydate=seq(as.Date("2008-12-29"), length = 8, by
= "week"))

# Merging:
new.data<-merge(mydata,reference,by="mydate",all.x=T,all.y=T)
new.data<-new.data[order(new.data$group,new.data$mydate),]
(new.data)
# my new.data contains only 7 rows in Group 1 and 4 rows in Group 2
#
To clarify just in case, here is the result I am trying to get:

mydate	group	values
12/29/2008	Group1	0.453466522
1/5/2009	Group1	NA
1/12/2009	Group1	0.416548943
1/19/2009	Group1	2.066275155
1/26/2009	Group1	2.037729638
2/2/2009	Group1	-0.598040483
2/9/2009	Group1	1.658999227
2/16/2009	Group1	-0.869325211
12/29/2008	Group2	NA
1/5/2009	Group2	NA
1/12/2009	Group2	NA
1/19/2009	Group2	0.375284194
1/26/2009	Group2	0.706785401
2/2/2009	Group2	NA
2/9/2009	Group2	2.104937151
2/16/2009	Group2	2.880393978



On Mon, Apr 4, 2011 at 1:09 PM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:

  
    
#
On Mon, Apr 4, 2011 at 1:09 PM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:
It might make more sense to put each group into its own column since
then the object is a multivariate time series:
Group1    Group2
2008-12-29 2.0266215        NA
2009-01-05        NA        NA
2009-01-12 1.0255344        NA
2009-01-19 1.3880938 0.8135788
2009-01-26 1.4380978 1.6068682
2009-02-02 1.1764965        NA
2009-02-09 1.1578531 1.4484447
2009-02-16 0.6673568 1.4760864

although if you really need to you could string them out like this:

library(reshape2)
melt(data.frame(time(z), coredata(z)), id = 1)
#
Try this:

 merge(mydata, cbind(reference, group = rep(unique(mydata$group), each
= nrow(reference))), all = TRUE)

On Mon, Apr 4, 2011 at 2:24 PM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:

  
    
1 day later
#
Thanks a lot - these solutions are much more elegant than my own:

new.data<-merge(mydata[mydata$group %in%
levels(mydata$group)[1],],reference,by="mydate",all.x=T,all.y=T)
new.data[["group"]][is.na(new.data[["group"]])]<-levels(mydata$group)[1]
new.data[["values"]][is.na(new.data[["values"]])]<-0

# Continue Merging - starting with Group2:
for(i in 2:nlevels(mydata$group)){  #i<-2
	temp<-merge(mydata[mydata$group %in%
levels(mydata$group)[i],],reference,by="mydate",all.x=T,all.y=T)
	temp[["group"]][is.na(temp[["group"]])]<-levels(mydata$group)[i]
	temp[["values"]][is.na(temp[["values"]])]<-0
	new.data<-rbind(new.data,temp)
}

Dimitri
On Mon, Apr 4, 2011 at 3:07 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote: