g'day R friends,

can anyone please help me with a frustrating merge?

The number of rows of a resulting merge is the smaller of the 2 dataframes used as input. What am I doing wrong? I'm using 1.1.0 on redhat 6.2

thanks,
John Strumila

> xx[1:10,]
           datetime c
948992940 948992940 0
948993000 948993000 0
948993060 948993060 0
948993120 948993120 0
948993180 948993180 0
948993240 948993240 0
948993300 948993300 0
948993360 948993360 0
948993420 948993420 0
948993480 948993480 0
> yy[1:10,]
           datetime     c
948992940 948992940 16560
949317120 949317120    84
949327800 949327800    23
949330440 949330440     0
949331580 949331580    27
949332000 949332000    22
949348800 949348800     6
949351440 949351440    81
949351500 949351500   213
949351560 949351560    52
> zz<-merge(xx,yy,by="datetime")
> zz[1:10,]
    datetime c.x   c.y
1  948992940   0 16560
2  949317120   0    84
3  949327800   0    23
4  949330440   0     0
5  949331580   0    27
6  949332000   0    22
7  949348800   0     6
8  949351440   0    81
9  949351500   0   213
10 949351560   0    52
> nrow(xx)
[1] 9683
> nrow(yy)
[1] 984
> nrow(zz)
[1] 984
>

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Wed, 26 Jul 2000, Strumila, John wrote:

> > nrow(xx)
> [1] 9683
> > nrow(yy)
> [1] 984
> > nrow(zz)
> [1] 984

The problem is that merge only keeps cases which are in both xx and yy (the intersection). A workaround might be to use match instead:

> x <- data.frame(id=1:10,21:30)
> y <- data.frame(id=1:11,21:31)
> dim(x)
[1] 10  2
> dim(y)
[1] 11  2
> dim(merge(x,y,by="id"))
[1] 10  3
> y$x <- x[match(y$id,x$id),2]

or is there a hidden all, all.x, all.y in merge?

Peter

** To YOU I'm an atheist; to God, I'm the Loyal Opposition. Woody Allen **
P.Malewski                      Tel.: 0531 500965
Maschplatz 8                    Email: P.Malewski at tu-bs.de
************************38114 Braunschweig********************************
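Peter's match() workaround can be spelled out a little further. This is only a sketch on toy data, not John's xx and yy; the column names val and x_val are made up for illustration. It shows that match() returns, for each id in the larger table, the row position in the smaller one (NA where there is none), so the larger table keeps all of its rows.

```r
# Toy data (hypothetical names): y has one id (11) that x lacks.
x <- data.frame(id = 1:10, val = 21:30)
y <- data.frame(id = 1:11, val = 21:31)

# merge() keeps only the ids present in both tables.
inner <- merge(x, y, by = "id")
nrow(inner)            # 10

# match() gives the row of x for each y$id, or NA if the id is absent.
pos <- match(y$id, x$id)

# Index x by those positions; unmatched ids get NA instead of being dropped.
y$x_val <- x$val[pos]

nrow(y)                # 11, nothing was dropped
y[y$id == 11, ]        # x_val is NA for the unmatched id
```

The trade-off is that match() only picks up the first matching row per id, so it suits keys that are unique in the table being looked up.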
On Wed, 26 Jul 2000, Strumila, John wrote:

> g'day R friends,
>
> can anyone please help me with a frustrating merge?
>
> The number of rows of a resulting merge is the smaller of the 2 dataframes
> used as input. What am I doing wrong? I'm using 1.1.0 on redhat 6.2

What's the problem? From the help page:

     The rows in the two data frames that match on the specified
     columns are extracted, and joined together. If there is more than
     one match, all possible matches contribute one row each.

You have no duplicates, so you get only matches, and hence no more rows than the smaller of the two data frames.

merge does a database `join' operation, if that helps. S-PLUS has all.x and all.y to do things differently, but R does not (nor are they planned).

> thanks,
> John Strumila
>
> > xx[1:10,]
>            datetime c
> 948992940 948992940 0
> 948993000 948993000 0
> 948993060 948993060 0
> 948993120 948993120 0
> 948993180 948993180 0
> 948993240 948993240 0
> 948993300 948993300 0
> 948993360 948993360 0
> 948993420 948993420 0
> 948993480 948993480 0
> > yy[1:10,]
>            datetime     c
> 948992940 948992940 16560
> 949317120 949317120    84
> 949327800 949327800    23
> 949330440 949330440     0
> 949331580 949331580    27
> 949332000 949332000    22
> 949348800 949348800     6
> 949351440 949351440    81
> 949351500 949351500   213
> 949351560 949351560    52
> > zz<-merge(xx,yy,by="datetime")
> > zz[1:10,]
>     datetime c.x   c.y
> 1  948992940   0 16560
> 2  949317120   0    84
> 3  949327800   0    23
> 4  949330440   0     0
> 5  949331580   0    27
> 6  949332000   0    22
> 7  949348800   0     6
> 8  949351440   0    81
> 9  949351500   0   213
> 10 949351560   0    52
> > nrow(xx)
> [1] 9683
> > nrow(yy)
> [1] 984
> > nrow(zz)
> [1] 984

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
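The help-page wording about multiple matches is easier to see on a small case. The following is a sketch with made-up data frames a and b (hypothetical names and values): with duplicated keys every pairing contributes a row, and with unique keys the result cannot have more rows than the smaller input.

```r
# Toy data: key 2 occurs twice in each table; keys 1, 3 and 4 have no partner.
a <- data.frame(key = c(1, 2, 2, 3), left  = c("a1", "a2", "a3", "a4"))
b <- data.frame(key = c(2, 2, 4),    right = c("b1", "b2", "b3"))

# Each of the two key-2 rows in a pairs with each of the two in b.
j <- merge(a, b, by = "key")
nrow(j)                # 4

# With unique keys, only the common keys survive.
a_u <- a[!duplicated(a$key), ]   # keys 1, 2, 3
b_u <- b[!duplicated(b$key), ]   # keys 2, 4
j_u <- merge(a_u, b_u, by = "key")
nrow(j_u)              # 1, only key 2 is in both
nrow(j_u) <= min(nrow(a_u), nrow(b_u))
```

This matches John's case: his datetime values are unique in both tables, so 984 rows is the most the merge could return.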
thank you Peter and Brian,

yep, I was a turkey and misunderstood the doco. Peter's suggestion looks good. A pity about the lack of an 'all.x' though ...

thanks again,
John Strumila

-----Original Message-----
From: Peter Malewski [mailto:y0004379 at tu-bs.de]
Sent: Wednesday, 26 July 2000 19:13
To: Strumila, John
Cc: R-Help
Subject: Re: [R] merge aint merging

Oh sorry, they are there! I overlooked the line...

     by, by.x, by.y: specifications of the common columns. See Details.

then this might be the best way...

Sorry

On Wed, 26 Jul 2000, Peter Malewski wrote:

> or is there a hidden all, all.x, all.y in merge?

** To YOU I'm an atheist; to God, I'm the Loyal Opposition. Woody Allen **
P.Malewski                      Tel.: 0531 500965
Maschplatz 8                    Email: P.Malewski at tu-bs.de
************************38114 Braunschweig********************************
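For anyone reading this thread on a current R release rather than 1.1.0: merge() there does accept all, all.x and all.y, so the behaviour John was after can be written directly. A minimal sketch on toy data (the column names vx and vy are made up, and this says nothing about which release first had these arguments):

```r
# Toy data: x has ids 1..10, y has three ids, one of which (11) is not in x.
x <- data.frame(id = 1:10, vx = 21:30)
y <- data.frame(id = c(1L, 5L, 11L), vy = c(100, 500, 1100))

# Keep every row of x; vy is NA where y has no matching id.
left <- merge(x, y, by = "id", all.x = TRUE)
nrow(left)             # 10
sum(is.na(left$vy))    # 8

# Keep rows from both sides.
full <- merge(x, y, by = "id", all = TRUE)
nrow(full)             # 11
```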