Dear All,

I would like to merge two data sets, but I am doing something wrong.

The first data set contains two columns: 'species occurrence' in Germany (column 1) and 'species names' (column 2). The second contains the names of 'Red list species' (column 1) and 'species status' (column 2). I would like to match the Red list species against the species names from the first table and assign the species status.

I tried the merge function but got this error: "'by' must specify a uniquely valid column". I also tried the function left_join, without success.

The two data sets also differ in size: the first table has 7189 rows and the second only 426 rows, as there are not many Red list species.

I would appreciate your help.

Kind regards,
Sasha

Dr Sasha Kosanic
Ecology Lab (Biology Department)
Room M842
University of Konstanz
Universitätsstraße 10
D-78464 Konstanz
Phone: +49 7531 883321 & +49 (0)175 9172503
http://cms.uni-konstanz.de/vkleunen/
https://tinyurl.com/y8u5wyoj
https://tinyurl.com/cgec6tu
problems when merging two data sets
5 messages · sasa kosanic, Jeff Newmiller, Bert Gunter +2 more
There are many examples of how to do this properly on the web, and many ways you could have failed to follow those examples. You need to be much more specific (using actual R code) about what you did in order for us to help you get past your specific error. [1][2][3]

You will also avoid the what-we-see-is-different-than-what-you-saw problems with your email if you read the Posting Guide and ensure that your email client is configured to send plain text rather than HTML-format email to the mailing list.

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
[2] http://adv-r.had.co.nz/Reproducibility.html
[3] https://cran.r-project.org/web/packages/reprex/index.html (read the vignette)
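As a rough illustration of what such a reproducible example could look like for this question, here is a minimal sketch. The table names, column names and values are all invented stand-ins (the real ones were never posted), so it only shows the shape of a useful report: small data that others can recreate, plus the exact failing call.

```r
# Hypothetical stand-ins for the two tables described in the question.
# Every name and value here is invented for illustration.
occurrence <- data.frame(
  occurrence = c(12, 3, 7),
  species_name = c("sp_a", "sp_b", "sp_c"),
  stringsAsFactors = FALSE
)
red_list <- data.frame(
  red_list_species = "sp_b",
  status = "endangered",
  stringsAsFactors = FALSE
)

# Show the structure, then print code that recreates a small sample
# so that others can paste it straight into their own session.
str(occurrence)
dput(head(occurrence, 3))
dput(red_list)

# Quote the failing call together with its exact error message.
# "species_name" does not exist in red_list, so merge() stops.
res <- try(merge(occurrence, red_list, by = "species_name"), silent = TRUE)
cat(res)
```

Posting the `dput()` output and the failing call lets readers reproduce the error instead of guessing at the column names.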
On February 5, 2019 9:56:37 AM PST, sasa kosanic <sasa.kosanic at gmail.com> wrote:
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Sent from my phone. Please excuse my brevity.
Show us your code! (as the posting guide below requests. Please read the posting guide). Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Tue, Feb 5, 2019 at 10:04 AM sasa kosanic <sasa.kosanic at gmail.com> wrote:
I quite agree with Jeff Newmiller and Bert Gunter.
The error you get ("'by' must specify a uniquely valid column") is
very common when the function merge is misused. That said, merge is
the right choice here. Have you read the manual of the function, which
you can open with the command `?merge`? That is always a good start.
Here is what the function call looks like:
`merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by,
all = FALSE, all.x = all, all.y = all, sort = TRUE, suffixes =
c(".x",".y"), no.dups = TRUE, incomparables = NULL, ...)`
For your purposes, you probably need only four arguments:
`merge(x = dataset1, y = dataset2, by.x = "key1", by.y = "key2")`
In this example, key1 is the name of the column in dataset1 whose
values should match those of the column key2 in dataset2.
Again, read the manual to understand the other arguments. I would
especially advise you to look at the arguments suffixes, all.x and
all.y, which will help you do exactly what you want.
Cheers,
Francois COLLIN
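To make the `by.x`/`by.y` advice above concrete, here is a minimal sketch on toy data. The data frame names, column names and status values are assumptions made up for illustration, not the poster's real data. Setting `all.x = TRUE` keeps every row of the first table, which seems to be what the question asks for.

```r
# Toy data; all names and values are invented for illustration.
occurrence <- data.frame(
  occurrence = c(12, 3, 7, 5),
  species_name = c("sp_a", "sp_b", "sp_c", "sp_d"),
  stringsAsFactors = FALSE
)
red_list <- data.frame(
  red_list_species = c("sp_b", "sp_d"),
  status = c("endangered", "vulnerable"),
  stringsAsFactors = FALSE
)

# The key columns have different names, so name each one explicitly.
# all.x = TRUE keeps every occurrence row; species that are not on the
# red list get NA in the status column.
merged <- merge(
  x = occurrence, y = red_list,
  by.x = "species_name", by.y = "red_list_species",
  all.x = TRUE
)
print(merged)
```

The result has one row per row of `occurrence`, so the difference in size between the two tables is not a problem in itself. Without `all.x = TRUE`, only the species present in both tables would be kept.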
On 05/02/2019 19:49, Bert Gunter wrote:
Hi Sasha,

I'll take a wild guess that your column names have periods (.) replacing the spaces in the names you use:

species occurrence -> species.occurrence

The error message means that R can't find the variable name you have used in the "by" argument. The second wild guess is that your column names for the species names are different and you must use the "by.x" and "by.y" arguments instead of just "by".

Jim
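Both guesses are easy to check with `names()`. The sketch below assumes the first table was read from a CSV file whose header contains spaces. The header text and values are invented, since the real files were not posted.

```r
# Assumed input: a CSV header with spaces. By default read.csv() uses
# check.names = TRUE, which turns the spaces into periods.
csv_text <- "species occurrence,species names\n12,sp_a\n3,sp_b"
tab1 <- read.csv(text = csv_text, stringsAsFactors = FALSE)
names(tab1)

# Hypothetical second table with differently named columns.
tab2 <- data.frame(
  Red.list.species = "sp_b",
  species.status = "endangered",
  stringsAsFactors = FALSE
)

# Columns shared by both tables. The result is empty here, so the two
# key columns have to be named separately with by.x and by.y.
intersect(names(tab1), names(tab2))

merged <- merge(tab1, tab2,
                by.x = "species.names", by.y = "Red.list.species",
                all.x = TRUE)
print(merged)
```

Comparing the output of `names()` for both tables with the strings passed to `by`, `by.x` and `by.y` is usually the quickest way to find a misspelled or transformed column name.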
On Wed, Feb 6, 2019 at 5:04 AM sasa kosanic <sasa.kosanic at gmail.com> wrote: