Dear All,

I would like to merge two data sets, but I am doing something wrong.

The first data set has two columns: 'species occurrence' in Germany (column 1) and 'species names' (column 2). The second has 'Red list species' (column 1) and 'species status' (column 2). I would like to match the Red list species against the species names in the first table and assign each species its status.

I tried the merge function but got this error: "'by' must specify a uniquely valid column". I also tried the function left_join, without success.

The two data sets also differ in size: the first table has 7189 rows and the second only 426, as we do not have many Red list species.

I would appreciate your help.

Kind regards,
Sasha

Dr Sasha Kosanic
Ecology Lab (Biology Department)
Room M842
University of Konstanz
Universitätsstraße 10
D-78464 Konstanz
Phone: +49 7531 883321 & +49 (0)175 9172503

http://cms.uni-konstanz.de/vkleunen/
https://tinyurl.com/y8u5wyoj
https://tinyurl.com/cgec6tu

[[alternative HTML version deleted]]
There are many examples of how to do this properly on the web, and many ways you could have failed to follow those examples. You need to be much more specific (using actual R code) about what you did in order for us to help you get past your specific error. [1][2][3]

You will also avoid the what-we-see-is-different-from-what-you-saw problems with your email if you read the Posting Guide and ensure that your email client is configured to send plain-text rather than HTML-format email to the mailing list.

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
[2] http://adv-r.had.co.nz/Reproducibility.html
[3] https://cran.r-project.org/web/packages/reprex/index.html (read the vignette)

On February 5, 2019 9:56:37 AM PST, sasa kosanic <sasa.kosanic at gmail.com> wrote:
>I have tried with merge function but got this an error:" 'by' must specify
>a uniquely valid column"
>I also tried with the function left_join, however no success.
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.
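As an illustration of what "actual R code" in a question could look like, here is a minimal sketch. The data frame `occurrences`, its column names, and its values are made up for the example and are not the poster's data. It relies on base R's `dput()`, which prints code that recreates an object and so survives a plain-text email.

```r
# Hypothetical stand-in for the first table described in the question;
# the column names and values are assumptions, not the real data.
occurrences <- data.frame(
  occurrence = c(1, 1, 0),
  species    = c("Species A", "Species B", "Species C"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object, including column names
# and types, so readers can paste it straight into their own session.
dput(head(occurrences, 3))

# Round trip: evaluating the printed text rebuilds an equivalent data frame.
dput_text <- paste(capture.output(dput(occurrences)), collapse = "\n")
rebuilt   <- eval(parse(text = dput_text))
all.equal(rebuilt, occurrences)
```

Posting the `dput()` output of a few rows from each table, together with the exact `merge()` call that failed, would probably be enough for readers to reproduce the error.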
Show us your code! (as the posting guide below requests. Please read the posting guide).

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
I quite agree with Jeff Newmiller and Bert Gunter.

The error you get ("'by' must specify a uniquely valid column") is very common when the function merge is misused. That said, merge is the right choice here. Have you read the function's manual, which the command `?merge` opens? That is always a good start. Here is what the function call looks like:

`merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all, sort = TRUE, suffixes = c(".x",".y"), no.dups = TRUE, incomparables = NULL, ...)`

For your problem, you probably need only 4 arguments:

`merge(x = dataset1, y = dataset2, by.x = "key1", by.y = "key2")`

In this example, key1 is the name of the column in dataset1 whose values should match those of the column key2 in dataset2. Again, read the manual to understand the other arguments. I would especially advise you to look at the arguments suffixes, all.x and all.y, which will help you do exactly what you want.

Cheers,
Francois COLLIN

On 05/02/2019 19:49, Bert Gunter wrote:
> Show us your code! (as the posting guide below requests. Please read the
> posting guide).
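The four-argument call described above can be sketched on toy data. The table names (`occurrences`, `red_list`), column names, and values below are hypothetical, chosen only to mirror the layout described in the question. `all.x = TRUE` is added on the assumption that every occurrence row should be kept, including species that are not on the Red List.

```r
# Hypothetical data: names and values are assumptions for illustration only.
occurrences <- data.frame(
  occurrence   = c(1, 1, 0, 1),
  species_name = c("Species A", "Species B", "Species C", "Species D"),
  stringsAsFactors = FALSE
)
red_list <- data.frame(
  red_list_species = c("Species B", "Species D"),
  status           = c("endangered", "vulnerable"),
  stringsAsFactors = FALSE
)

# The key columns have different names in the two tables, so by.x and by.y
# name them separately. all.x = TRUE keeps every row of the first table;
# species without a Red List entry get NA in the status column.
merged <- merge(
  x = occurrences, y = red_list,
  by.x = "species_name", by.y = "red_list_species",
  all.x = TRUE
)
print(merged)
```

The differing row counts (7189 versus 426) are not a problem for this kind of join. If dplyr is preferred, `left_join(occurrences, red_list, by = c("species_name" = "red_list_species"))` should give the same matching, again under these assumed column names.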
Hi Sasha,

I'll take a wild guess that your column names have periods (.) replacing the spaces in the names you use:

species occurrence -> species.occurrence

The error message means that R can't find the variable name you used in the "by" argument. My second wild guess is that the columns holding the species names are named differently in the two tables, so you must use the "by.x" and "by.y" arguments instead of just "by".

Jim
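Both guesses can be checked in a few lines. The sketch below uses invented tables whose headers contain spaces, as the question suggests. The exact column names in the real data are unknown, so everything named here is an assumption.

```r
# Hypothetical tables. With check.names = TRUE (the default in data.frame()
# and read.csv()), headers containing spaces are rewritten with periods.
occurrences <- data.frame(
  "species occurrence" = c(1, 0),
  "species names"      = c("Species A", "Species B"),
  check.names = TRUE, stringsAsFactors = FALSE
)
red_list <- data.frame(
  "Red list species" = "Species B",
  "species status"   = "endangered",
  check.names = TRUE, stringsAsFactors = FALSE
)

# Step 1: look at the names R actually stored.
names(occurrences)
names(red_list)

# Step 2: asking for a name that is not in the table reproduces the error.
msg <- tryCatch(
  merge(occurrences, red_list, by = "species names"),
  error = function(e) conditionMessage(e)
)
msg

# Step 3: name the key column of each table explicitly.
fixed <- merge(occurrences, red_list,
               by.x = "species.names", by.y = "Red.list.species")
fixed
```

Running `names()` on both real tables first should show whether either guess applies before any change is made to the `merge()` call.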