An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20081228/e80a2e48/attachment.pl>
Conditional operation on multiple columns from two data frames
5 messages · Tim Sippel, Peter Dalgaard, jim holtman
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20081228/c366fae1/attachment.pl>
The suggestion below was made. df1$Date <- as.Date(df1$Date) df2$Date <- as.Date(df2$Date) ifelse(df1$ID==df2$ID & df1$Date-df2$Date<0.5,df1$y-df2$y, NA) However, because my dataframe rows do not align, I need the conditionals to be tested on every combination of cells. I'm starting to think I need to use tapply? Tim
View this message in context: http://www.nabble.com/Conditional-operation-on-multiple-columns-from-two-data-frames-tp21189891p21197566.html Sent from the R help mailing list archive at Nabble.com.
tsippel wrote:
The suggestion below was made. df1$Date <- as.Date(df1$Date) df2$Date <- as.Date(df2$Date) ifelse(df1$ID==df2$ID & df1$Date-df2$Date<0.5,df1$y-df2$y, NA) However, because my dataframe rows do not align, I need the conditionals to be tested on every combination of cells. I'm starting to think I need to use tapply? Tim
I'd look at outer() or try a merge() before the operation to get things aligned. It is not clear to me what form of output you desire if you look at "every combination of cells".
O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
Merge will give you an output like below and you should be able to find the value you need:
merge(df1, df2, by='ID', all=TRUE)
ID Date.x y.x x.x Date.y y.y x.y 1 1 <NA> NA NA 2007-05-31 23:00:00 -20.82907 217.7022 2 1 <NA> NA NA 2007-06-01 11:00:00 -20.82500 217.6980 3 1 <NA> NA NA 2007-06-01 23:00:00 -20.82236 217.7012 4 2 2007-05-30 20:21:56 NA NA 2007-02-19 00:00:00 -37.40960 178.3060 5 2 2007-05-30 20:21:56 NA NA 2007-02-19 12:00:00 -37.37515 178.3360 6 2 2007-05-30 20:21:56 NA NA 2007-02-20 00:00:00 -37.27700 178.3780 7 2 2007-05-30 20:21:56 NA NA 2007-02-20 12:00:00 -37.17855 178.3845 8 2 2007-05-30 20:21:56 NA NA 2007-02-21 00:00:00 -37.07595 178.3890 9 2 2007-05-30 20:21:56 NA NA 2007-02-21 12:00:00 -36.91965 178.3965 10 2 2007-05-30 20:21:56 NA NA 2007-02-22 00:00:00 -36.74345 178.3910 11 2 2007-05-30 20:21:56 NA NA 2007-02-22 12:00:00 -36.49245 178.5090 12 2 2007-05-30 20:21:56 NA NA 2007-02-23 00:00:00 -36.24150 178.5750 13 2 2007-05-30 20:21:56 NA NA 2007-02-23 12:00:00 -35.95500 178.6280 14 2 2007-05-30 18:48:20 NA NA 2007-02-19 00:00:00 -37.40960 178.3060 15 2 2007-05-30 18:48:20 NA NA 2007-02-19 12:00:00 -37.37515 178.3360 16 2 2007-05-30 18:48:20 NA NA 2007-02-20 00:00:00 -37.27700 178.3780 17 2 2007-05-30 18:48:20 NA NA 2007-02-20 12:00:00 -37.17855 178.3845 18 2 2007-05-30 18:48:20 NA NA 2007-02-21 00:00:00 -37.07595 178.3890 19 2 2007-05-30 18:48:20 NA NA 2007-02-21 12:00:00 -36.91965 178.3965 20 2 2007-05-30 18:48:20 NA NA 2007-02-22 00:00:00 -36.74345 178.3910 21 2 2007-05-30 18:48:20 NA NA 2007-02-22 12:00:00 -36.49245 178.5090 22 2 2007-05-30 18:48:20 NA NA 2007-02-23 00:00:00 -36.24150 178.5750 23 2 2007-05-30 18:48:20 NA NA 2007-02-23 12:00:00 -35.95500 178.6280 24 2 2007-06-05 08:34:37 NA NA 2007-02-19 00:00:00 -37.40960 178.3060 25 2 2007-06-05 08:34:37 NA NA 2007-02-19 12:00:00 -37.37515 178.3360 26 2 2007-06-05 08:34:37 NA NA 2007-02-20 00:00:00 -37.27700 178.3780 27 2 2007-06-05 08:34:37 NA NA 2007-02-20 12:00:00 -37.17855 178.3845 28 2 2007-06-05 08:34:37 NA NA 2007-02-21 00:00:00 -37.07595 178.3890 29 2 2007-06-05 08:34:37 NA NA 2007-02-21 12:00:00 -36.91965 178.3965 30 2 2007-06-05 08:34:37 NA NA 2007-02-22 00:00:00 -36.74345 178.3910 31 2 2007-06-05 08:34:37 NA NA 2007-02-22 12:00:00 -36.49245 178.5090 32 2 2007-06-05 08:34:37 NA NA 2007-02-23 00:00:00 -36.24150 178.5750 33 2 2007-06-05 08:34:37 NA NA 2007-02-23 12:00:00 -35.95500 178.6280 34 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-19 00:00:00 -37.40960 178.3060 35 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-19 12:00:00 -37.37515 178.3360 36 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-20 00:00:00 -37.27700 178.3780 37 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-20 12:00:00 -37.17855 178.3845 38 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-21 00:00:00 -37.07595 178.3890 39 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-21 12:00:00 -36.91965 178.3965 40 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-22 00:00:00 -36.74345 178.3910 41 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-22 12:00:00 -36.49245 178.5090 42 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-23 00:00:00 -36.24150 178.5750 43 2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-23 12:00:00 -35.95500 178.6280 44 2 2007-06-02 17:36:20 NA NA 2007-02-19 00:00:00 -37.40960 178.3060 45 2 2007-06-02 17:36:20 NA NA 2007-02-19 12:00:00 -37.37515 178.3360 46 2 2007-06-02 17:36:20 NA NA 2007-02-20 00:00:00 -37.27700 178.3780 47 2 2007-06-02 17:36:20 NA NA 2007-02-20 12:00:00 -37.17855 178.3845 48 2 2007-06-02 17:36:20 NA NA 2007-02-21 00:00:00 -37.07595 178.3890 49 2 2007-06-02 17:36:20 NA NA 2007-02-21 12:00:00 -36.91965 178.3965 50 2 2007-06-02 17:36:20 NA NA 2007-02-22 00:00:00 -36.74345 178.3910 51 2 2007-06-02 17:36:20 NA NA 2007-02-22 12:00:00 -36.49245 178.5090 52 2 2007-06-02 17:36:20 NA NA 2007-02-23 00:00:00 -36.24150 178.5750 53 2 2007-06-02 17:36:20 NA NA 2007-02-23 12:00:00 -35.95500 178.6280 54 2 2007-06-03 20:31:35 NA NA 2007-02-19 00:00:00 -37.40960 178.3060 55 2 2007-06-03 20:31:35 NA NA 2007-02-19 12:00:00 -37.37515 178.3360 56 2 2007-06-03 20:31:35 NA NA 2007-02-20 00:00:00 -37.27700 178.3780 57 2 2007-06-03 20:31:35 NA NA 2007-02-20 12:00:00 -37.17855 178.3845 58 2 2007-06-03 20:31:35 NA NA 2007-02-21 00:00:00 -37.07595 178.3890 59 2 2007-06-03 20:31:35 NA NA 2007-02-21 12:00:00 -36.91965 178.3965 60 2 2007-06-03 20:31:35 NA NA 2007-02-22 00:00:00 -36.74345 178.3910 61 2 2007-06-03 20:31:35 NA NA 2007-02-22 12:00:00 -36.49245 178.5090 62 2 2007-06-03 20:31:35 NA NA 2007-02-23 00:00:00 -36.24150 178.5750 63 2 2007-06-03 20:31:35 NA NA 2007-02-23 12:00:00 -35.95500 178.6280 64 3 2007-03-23 06:33:26 NA NA <NA> NA NA 65 3 2007-03-23 17:02:21 NA NA <NA> NA NA 66 3 2007-02-19 00:00:00 -37.40822 178.3440 <NA> NA NA 67 3 2007-03-10 19:58:18 NA NA <NA> NA NA 68 3 2007-03-25 06:41:03 -31.18100 178.7700 <NA> NA NA 69 3 2007-03-24 04:27:44 -31.16400 178.7650 <NA> NA NA 70 3 2007-03-24 11:35:47 NA NA <NA> NA NA
On Sat, Dec 27, 2008 at 11:55 PM, Tim Sippel <tsippel at gmail.com> wrote:
Sorry...I was unaware I needed to 'deput' to deparse data examples into a
useful format. The data examples I provided initially are deparsed below:
df1<-
list(ID = c(2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3), Date = c("2007-05-30
18:48:20",
"2007-05-30 20:21:56", "2007-06-02 09:39:58", "2007-06-02 17:36:20",
"2007-06-03 20:31:35", "2007-06-05 08:34:37", "2007-02-19 00:00:00",
"2007-03-10 19:58:18", "2007-03-23 06:33:26", "2007-03-23 17:02:21",
"2007-03-24 04:27:44", "2007-03-24 11:35:47", "2007-03-25 06:41:03"
), y = c(NA, NA, -20.101, NA, NA, NA, -37.4082166666667, NA,
NA, NA, -31.164, NA, -31.181), x = c(NA, NA, -134.421, NA, NA,
NA, 178.34395, NA, NA, NA, 178.765, NA, 178.77))
df2<-
list(ID = c(1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Date = c("2007-05-31
23:00:00",
"2007-06-01 11:00:00", "2007-06-01 23:00:00", "2007-02-19 00:00:00",
"2007-02-19 12:00:00", "2007-02-20 00:00:00", "2007-02-20 12:00:00",
"2007-02-21 00:00:00", "2007-02-21 12:00:00", "2007-02-22 00:00:00",
"2007-02-22 12:00:00", "2007-02-23 00:00:00", "2007-02-23 12:00:00"
), y = c(-20.829075, -20.824995, -20.822355, -37.4096, -37.37515,
-37.277, -37.17855, -37.07595, -36.91965, -36.74345, -36.49245,
-36.2415, -35.955), x = c(217.7022, 217.698, 217.70115, 178.306,
178.336, 178.378, 178.3845, 178.389, 178.3965, 178.391, 178.509,
178.575, 178.628))
On Sun, Dec 28, 2008 at 4:36 PM, Tim Sippel <tsippel at gmail.com> wrote:
Hi-
I have two data frames for which I wish to conditionally subtract the
values of one dataframe from the other.
I want to subtract df1$x from df2$x when their id is equal and the
absolute value of difference between the dates is 12 hours or less. If
there is no match of equal id's and dates less than 12 hours apart I want it
to return "NA". Note that df1 has missing values in x and y columns, but
df2 does not have any missing values. I think I'm close with the code at
the end, but would appreciate a bit of help getting over this hurdle.
Sample dataframes follow:
*df1<-*
ID Date y x STM07.1 30-05-2007 18:48 STM07.1 30-05-2007 20:21
STM07.1 02-06-2007 09:39 -20.101 -134.421 STM07.1 02-06-2007 17:36
STM07.1 03-06-2007 20:31 STM07.1 05-06-2007 08:34 STM07.2 19-02-2007
00:00 -37.4082 178.34395 STM07.2 10-03-2007 19:58 STM07.2 23-03-2007
06:33 STM07.2 23-03-2007 17:02 STM07.2 24-03-2007 04:27 -31.164 178.765
STM07.2 24-03-2007 11:35 STM07.2 25-03-2007 06:41 -31.181 178.77 STM07.2 26-03-2007
05:11
**
*df2<-*
ID Date y x STM07.1 31/05/2007 23:00 -20.8291 217.7022 STM07.1 1/06/2007
11:00 -20.825 217.698 STM07.1 1/06/2007 23:00 -20.8224 217.70115 STM07.2 19/02/2007
0:00 -37.4096 178.306 STM07.2 19/02/2007 12:00 -37.3752 178.336 STM07.2 20/02/2007
0:00 -37.277 178.378 STM07.2 20/02/2007 12:00 -37.1786 178.3845 STM07.2 21/02/2007
0:00 -37.076 178.389 STM07.2 21/02/2007 12:00 -36.9197 178.3965 STM07.2 22/02/2007
0:00 -36.7435 178.391 STM07.2 22/02/2007 12:00 -36.4925 178.509 STM07.2 23/02/2007
0:00 -36.2415 178.575 STM07.2 23/02/2007 12:00 -35.955 178.628
d.x <- ifelse(c(test = df1$ID==df2$ID && abs(difftime(as.POSIXlt(df1$Date),
as.POSIXlt(df2$Date),
units="hours") < 12)), yes = delta.lon=df1$x-df2$x, no = "NA")
I'm using R 2.7.1.
Cheers,
Tim
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?