R-Help Community
I'm trying to combine two data.frames that each contain 10 columns, two of
which are common to both. Here are two small test datasets.
df1 <- data.frame(
  date = c("2021-1-1","2021-1-1","2021-1-1","2021-1-1","2021-1-1",
           "2021-1-2","2021-1-2","2021-1-3","2021-1-3","2021-1-3"),
  geo_hash = c("abc123","abc123","abc456","abc789","abc246","abc123",
               "asd123","abc789","abc890","abc123"),
  ad_id = c("a12345","b12345","a12345","a12345","c12345",
            "b12345","b12345","a12345","b12345","a12345"))
df2 <- data.frame(
  date = c("2021-1-1","2021-1-1","2021-1-2","2021-1-3","2021-1-3"),
  geo_hash = c("abc123","abc456","abc123","abc789","abc890"),
  event = c("shoting","ied","protest","riot","protest"))
I'm trying to combine them so that I get a single data.frame such as:
date      geo_hash  ad_id   event
1/1/2021  abc123    a12345  shoting
1/1/2021  abc123    b12345
1/1/2021  abc456    a12345  ied
1/1/2021  abc789    a12345
1/1/2021  abc246    c12345
Jeff
Combining data.frames
13 messages · Jeff Reichman, Tom Woolman, Jeff Newmiller +2 more
Have you looked at the merge function in base R? https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/merge
On 2022-03-19 21:15, Jeff Reichman wrote:
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Evening Tom,
Yes, I've been playing with the merge function, but I haven't been able to achieve what I need. It could well be the way to go, and the problem might just be my syntax.
You can also do "SQL-like" joins in the tidyverse with dplyr.
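For reference, a minimal sketch of what such a join might look like. It assumes the dplyr package is installed and uses only the first few rows of the test data from the original post. Choosing left_join() (keep every row of df1) is an assumption about the desired result, not something stated in the thread.

```r
# Abbreviated test data: first five rows of df1, first two rows of df2.
df1 <- data.frame(
  date     = rep("2021-1-1", 5),
  geo_hash = c("abc123", "abc123", "abc456", "abc789", "abc246"),
  ad_id    = c("a12345", "b12345", "a12345", "a12345", "c12345")
)
df2 <- data.frame(
  date     = c("2021-1-1", "2021-1-1"),
  geo_hash = c("abc123", "abc456"),
  event    = c("shoting", "ied")
)

# Guarded so the script still runs where dplyr is not installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  # SQL-style LEFT JOIN on both shared key columns; rows of df1 without
  # a match in df2 get NA in the event column.
  joined <- dplyr::left_join(df1, df2, by = c("date", "geo_hash"))
  print(joined)
}
```

dplyr also provides inner_join(), right_join() and full_join(), which take the same by argument.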
Yes, I'm reading that presently. The closest I've gotten has been
df3 <- merge(df1, df2, all = TRUE)
Tom,
Looks like I figured it out. It was a syntax issue: the wrong "all" argument (I think).
I'm trying hard to take tonight off and avoid booting up the laptop and launching R... :) but you need to merge by the primary key(s), i.e. the common columns (common IVs) shared between the two dataframes.
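A minimal sketch of merging on the shared key columns, using the test data from the original post. The choice of all.x = TRUE (keep every row of df1) is an assumption about the desired result; the thread itself only settles on the key columns.

```r
df1 <- data.frame(
  date = c("2021-1-1", "2021-1-1", "2021-1-1", "2021-1-1", "2021-1-1",
           "2021-1-2", "2021-1-2", "2021-1-3", "2021-1-3", "2021-1-3"),
  geo_hash = c("abc123", "abc123", "abc456", "abc789", "abc246", "abc123",
               "asd123", "abc789", "abc890", "abc123"),
  ad_id = c("a12345", "b12345", "a12345", "a12345", "c12345",
            "b12345", "b12345", "a12345", "b12345", "a12345")
)
df2 <- data.frame(
  date = c("2021-1-1", "2021-1-1", "2021-1-2", "2021-1-3", "2021-1-3"),
  geo_hash = c("abc123", "abc456", "abc123", "abc789", "abc890"),
  event = c("shoting", "ied", "protest", "riot", "protest")
)

# Join on both shared key columns. all.x = TRUE keeps every row of df1;
# event is NA where df2 has no matching date/geo_hash pair.
df3 <- merge(df1, df2, by = c("date", "geo_hash"), all.x = TRUE)
print(df3)
```

Every df1 row whose date/geo_hash pair appears in df2 receives that event, so both abc123 rows on 2021-1-1 end up with the same event value.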
Then show your code so we can focus on what you haven't yet figured out. Have you read the examples in the merge help page?
Sent from my phone. Please excuse my brevity.
by = c("date", "geo_hash" )
Jeff,
This seems to work:
df3 <- merge(df1, df2, all = TRUE)
When I use any of the by.x, by.y or all.x, all.y arguments I get really weird results. Simply using the code above appears to work thus far.
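One possible source of such results, offered only as a guess because the exact calls were not posted, is joining on just one of the two shared columns. The sketch below uses the first few rows of the test data; the single key by = "date" is an assumption about what might have been tried.

```r
# Abbreviated test data: first five rows of df1, first two rows of df2.
df1 <- data.frame(
  date     = rep("2021-1-1", 5),
  geo_hash = c("abc123", "abc123", "abc456", "abc789", "abc246"),
  ad_id    = c("a12345", "b12345", "a12345", "a12345", "c12345")
)
df2 <- data.frame(
  date     = c("2021-1-1", "2021-1-1"),
  geo_hash = c("abc123", "abc456"),
  event    = c("shoting", "ied")
)

# Joining on date alone: every df1 row pairs with every df2 row that has
# the same date, and the other shared column is split into
# geo_hash.x (from df1) and geo_hash.y (from df2).
by_date <- merge(df1, df2, by = "date")
print(names(by_date))
print(nrow(by_date))

# Joining on both shared columns keeps geo_hash as a single key column.
by_both <- merge(df1, df2, by = c("date", "geo_hash"))
print(names(by_both))
print(nrow(by_both))
```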
Ok this seems to work correctly
df1 <- data.frame(
  date = as.factor(c("2021-1-1","2021-1-1","2021-1-1","2021-1-1","2021-1-1",
                     "2021-1-2","2021-1-2","2021-1-3","2021-1-3","2021-1-3",
                     "2021-1-4")),
  geo_hash = as.factor(c("abc123","abc123","abc456","abc789","abc246","abc123",
                         "asd123","abc789","abc890","abc123","z12345")),
  ad_id = as.factor(c("a12345","b12345","a12345","a12345","c12345",
                      "b12345","b12345","a12345","b12345","a12345","a12345")))
df2 <- data.frame(
  date = as.factor(c("2021-1-1","2021-1-1","2021-1-2","2021-1-3","2021-1-3","2021-1-4")),
  geo_hash = as.factor(c("abc123","abc456","abc123","abc789","abc890","w12345")),
  event = as.factor(c("shoting","ied","protest","riot","protest","killing")))
df1
df2
#df3 <- merge(df1, df2, all = TRUE)
df3 <- merge(df1, df2, by = c("date", "geo_hash" ), all = TRUE)
df3
Merging by the common keys/column names is the default. The question is likely what to do with rows that don't match. That's determined by the 'all' settings, which the OP may already have figured out.
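To make the effect of the 'all' settings concrete, here is a small sketch using the extended test data posted earlier in the thread (entered as plain character columns rather than factors, which is a simplification on my part). It only compares row counts for the four combinations.

```r
# Extended test data from the thread: one unmatched row on each side.
df1 <- data.frame(
  date = c("2021-1-1", "2021-1-1", "2021-1-1", "2021-1-1", "2021-1-1",
           "2021-1-2", "2021-1-2", "2021-1-3", "2021-1-3", "2021-1-3",
           "2021-1-4"),
  geo_hash = c("abc123", "abc123", "abc456", "abc789", "abc246", "abc123",
               "asd123", "abc789", "abc890", "abc123", "z12345"),
  ad_id = c("a12345", "b12345", "a12345", "a12345", "c12345",
            "b12345", "b12345", "a12345", "b12345", "a12345", "a12345")
)
df2 <- data.frame(
  date = c("2021-1-1", "2021-1-1", "2021-1-2", "2021-1-3", "2021-1-3",
           "2021-1-4"),
  geo_hash = c("abc123", "abc456", "abc123", "abc789", "abc890", "w12345"),
  event = c("shoting", "ied", "protest", "riot", "protest", "killing")
)
keys <- c("date", "geo_hash")

inner <- merge(df1, df2, by = keys)                # matched rows only (default)
left  <- merge(df1, df2, by = keys, all.x = TRUE)  # plus unmatched df1 rows
right <- merge(df1, df2, by = keys, all.y = TRUE)  # plus unmatched df2 rows
full  <- merge(df1, df2, by = keys, all = TRUE)    # plus unmatched rows from both

print(c(inner = nrow(inner), left = nrow(left),
        right = nrow(right), full = nrow(full)))
```

Unmatched rows are filled with NA in the columns that come from the other data frame.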
Hello,
The two merges below give identical results.
Maybe there was something in your R session?
df3 <- merge(df1, df2, by = c("date", "geo_hash" ), all = TRUE)
df3b <- merge(df1, df2, all = TRUE)
identical(df3, df3b)
#[1] TRUE
Hope this helps,
Rui Barradas
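The identical results follow from merge()'s documented default, by = intersect(names(x), names(y)): when by is omitted, all column names shared by both data frames are used as keys. A small sketch with abbreviated test data (first five rows of df1, first two rows of df2):

```r
df1 <- data.frame(
  date     = rep("2021-1-1", 5),
  geo_hash = c("abc123", "abc123", "abc456", "abc789", "abc246"),
  ad_id    = c("a12345", "b12345", "a12345", "a12345", "c12345")
)
df2 <- data.frame(
  date     = c("2021-1-1", "2021-1-1"),
  geo_hash = c("abc123", "abc456"),
  event    = c("shoting", "ied")
)

# Column names present in both data frames; this is merge()'s default 'by'.
shared <- intersect(names(df1), names(df2))
print(shared)

# Spelling out the keys and relying on the default give the same result.
explicit <- merge(df1, df2, by = shared, all = TRUE)
implicit <- merge(df1, df2, all = TRUE)
print(identical(explicit, implicit))
```

The default only differs from an explicit by when the data frames share further column names that are not meant as keys; those columns would then silently become part of the key.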
At 02:05 on 20/03/2022, Jeff Reichman wrote: