problems when merging two data sets

5 messages · sasa kosanic, Jeff Newmiller, Bert Gunter +2 more

#
Dear All,

I would like to merge two data sets, but I am doing something wrong.
The first data set has two columns: 'species occurrence' in Germany (column 1)
and 'species names' (column 2).
The second has the names of 'Red list species' (column 1) and 'species
status' (column 2).
I would like to match the Red list species against the species names in the
first table and assign the species status.
I tried the merge function but got this error: "'by' must specify
a uniquely valid column".
I also tried the left_join function, without success.

The two data sets also differ in size: the first table has 7189 rows
and the second only 426 rows, as we do not have many Red list species.

I would appreciate your help.

Kind regards,
Sasha


Dr Sasha Kosanic
Ecology Lab (Biology Department)
Room M842
University of Konstanz
Universitätsstraße 10
D-78464 Konstanz
Phone: +49 7531 883321 & +49 (0)175 9172503

http://cms.uni-konstanz.de/vkleunen/
https://tinyurl.com/y8u5wyoj
https://tinyurl.com/cgec6tu
#
There are many examples of how to do this properly on the web, and many ways you could have failed to follow those examples. You need to be much more specific (using actual R code) about what you did in order for us to help you get past your specific error. [1][2][3]

You will also avoid the what-we-see-is-different-than-what-you-saw problems with your email if you read the Posting Guide and ensure that your email client is configured to send plain-text rather than HTML-format email to the mailing list.

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

[2] http://adv-r.had.co.nz/Reproducibility.html

[3] https://cran.r-project.org/web/packages/reprex/index.html (read the vignette)
On February 5, 2019 9:56:37 AM PST, sasa kosanic <sasa.kosanic at gmail.com> wrote:

  
    
#
Show us your code, as the posting guide below requests! Please read the
posting guide.


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Tue, Feb 5, 2019 at 10:04 AM sasa kosanic <sasa.kosanic at gmail.com> wrote:

            

  
  
#
I quite agree with Jeff Newmiller and Bert Gunter.

The error you get ("'by' must specify a uniquely valid column") is a 
very common one when the function merge is misused. That said, merge 
is the right choice here. Have you read the manual page for the 
function by running the command `?merge`? That is always a good start.

Here is what the function signature looks like:

`merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, 
all = FALSE, all.x = all, all.y = all, sort = TRUE, suffixes = 
c(".x",".y"), no.dups = TRUE, incomparables = NULL, ...)`

In your case, you probably need only four arguments:

`merge(x = dataset1, y = dataset2, by.x = "key1", by.y = "key2")`

In this example, key1 is the name of the column in dataset1 whose values 
should be matched, and key2 is the name of the corresponding column in 
dataset2.
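As a minimal sketch of that call, assuming two small made-up tables shaped like the ones described in the question (the data frames `occurrences` and `red_list`, their column names, and their values are all hypothetical stand-ins, not the poster's real data):

```r
# Hypothetical stand-ins for the two tables described in the question.
occurrences <- data.frame(
  occurrence   = c(12, 5, 30, 7),
  species_name = c("Abies alba", "Acer campestre",
                   "Adonis vernalis", "Ajuga reptans"),
  stringsAsFactors = FALSE
)
red_list <- data.frame(
  red_list_species = c("Adonis vernalis", "Abies alba"),
  status           = c("endangered", "vulnerable"),
  stringsAsFactors = FALSE
)

# by.x names the key column in x, by.y names the key column in y.
# With the default all = FALSE, only species present in both tables are kept.
matched <- merge(x = occurrences, y = red_list,
                 by.x = "species_name", by.y = "red_list_species")
print(matched)
```

The key column comes first in the result and takes its name from x, so here it is called species_name.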

Again, read the manual to understand the other arguments. I would 
especially advise you to look at suffixes, all.x and all.y, 
which will help you do exactly what you want.
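For instance, if the goal is to keep every row of the larger occurrence table and fill in the status only where a species is on the Red list, all.x = TRUE may be what is wanted. A small sketch, again with hypothetical data frames and column names:

```r
# Hypothetical data: three species, only one of them on the Red list.
occurrences <- data.frame(
  species_name = c("Abies alba", "Acer campestre", "Adonis vernalis"),
  occurrence   = c(12, 5, 30),
  stringsAsFactors = FALSE
)
red_list <- data.frame(
  red_list_species = "Adonis vernalis",
  status           = "endangered",
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of x (a "left join");
# species without a Red list entry get NA in the status column.
all_species <- merge(occurrences, red_list,
                     by.x = "species_name", by.y = "red_list_species",
                     all.x = TRUE)
print(all_species)
```

The suffixes argument only matters when the two tables share non-key column names, which is not the case in this sketch.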

Cheers,

Francois COLLIN
On 05/02/2019 19:49, Bert Gunter wrote:

  
  
#
Hi Sasha,
I'll take a wild guess that your column names have periods (.)
replacing the spaces in the names you use:

species occurrence -> species.occurrence

The error message means that R can't find the variable name you have
used in the "by" argument. The second wild guess is that your column
names for the species names are different and you must use the "by.x"
and "by.y" arguments instead of just "by".
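Both guesses can be checked by printing the column names before merging. The sketch below uses made-up data frames whose column names imitate those in the question; whether the real data look like this is an assumption:

```r
# Hypothetical data. data.frame() (like read.csv()) uses check.names = TRUE
# by default, which turns spaces in column names into periods.
occurrences <- data.frame(
  "species occurrence" = c(12, 5),
  "species names"      = c("Abies alba", "Adonis vernalis"),
  stringsAsFactors = FALSE
)
red_list <- data.frame(
  "Red list species" = "Adonis vernalis",
  "species status"   = "endangered",
  stringsAsFactors = FALSE
)

# Inspect the names R actually stored.
print(names(occurrences))
print(names(red_list))

# No column name is shared, so a plain 'by' cannot work here.
shared <- intersect(names(occurrences), names(red_list))

# Using the name with a space is expected to fail; capture the error
# instead of stopping the script.
err <- tryCatch(
  merge(occurrences, red_list, by = "species names"),
  error = function(e) e
)
print(conditionMessage(err))

# Using the stored names with by.x / by.y works.
merged <- merge(occurrences, red_list,
                by.x = "species.names", by.y = "Red.list.species")
print(merged)
```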

Jim
On Wed, Feb 6, 2019 at 5:04 AM sasa kosanic <sasa.kosanic at gmail.com> wrote: