Hello everyone,

My problem is better explained with an example:

> x=data.frame(a=1:4,b=1:4,c=rnorm(4))
> x
  a b          c
1 1 1 -0.8821089
2 2 2 -0.7082583
3 3 3 -0.5948835
4 4 4 -1.8571443
> y=data.frame(a=c(1,3),b=3,c=rnorm(2))
> y
  a b            c
1 1 3 -0.273155973
2 3 3  0.009517862

Now I want to merge x and y by columns a and b, hence creating a data.frame
with all a:b combinations observed in x and y. That's easily done with merge:

> merge(x,y,by=c("a","b"),all=T)
  a b        c.x          c.y
1 1 1 -0.8821089           NA
2 1 3         NA -0.273155973
3 2 2 -0.7082583           NA
4 3 3 -0.5948835  0.009517862
5 4 4 -1.8571443           NA

But rather than two c columns I would want the merge to:
- keep the value in x if there is no corresponding value in y
- keep the value in y if there is no corresponding value in x
- prefer the value in y when the a:b combination exists in both x and y

So basically I want my result to look like:

  a b          c
1 1 1 -0.8821089
2 1 3 -0.2731559
3 2 2 -0.7082583
4 3 3  0.0095178
5 4 4 -1.8571443

I can't find a combination of options for merge that does this. Is there
another function that would do that, or do I have to resort to some
post-processing after merge? It seems that it might be something like a
"right merge" for databases, but I don't know this world at all. I would be
happy to look into sqldf if that allows doing things like that.

Thanks in advance. Sincerely,

JiHO
---
http://maururu.net
Try this:

xy <- merge(x, y, by = c("a","b"), all = TRUE)
# when both c.x and c.y are present, take c.y (column 2), as requested;
# otherwise take whichever of the two is not NA
xy$c <- ifelse(rowSums(!is.na(.x <- xy[, c('c.x', 'c.y')])) > 1, .x[,2], rowSums(.x, na.rm = TRUE))
xy

On Thu, Sep 10, 2009 at 12:21 PM, JiHO <jo.lists@gmail.com> wrote:
> But rather than two c columns I would want the merge to:
> - keep the value in x if there is no corresponding value in y
> - keep the value in y if there is no corresponding value in x
> - prefer the value in y when the a:b combination exists in both x and y
>
> I can't find a combination of options for merge that does this. Is there
> another function that would do that, or do I have to resort to some
> post-processing after merge?
>
> JiHO
> ---
> http://maururu.net
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
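For readers following the thread, the same post-processing can also be written as a plain "take y unless it is missing" rule. This is only a sketch, not something posted in the thread: the data values are copied from the original question instead of being drawn with rnorm(), and it assumes that an NA in c.y always means "no matching row in y".

```r
# Fixed example data (values copied from the original post, no rnorm()).
x <- data.frame(a = 1:4, b = 1:4,
                c = c(-0.8821089, -0.7082583, -0.5948835, -1.8571443))
y <- data.frame(a = c(1, 3), b = 3, c = c(-0.273155973, 0.009517862))

# Full outer merge: the clashing column c becomes c.x and c.y.
xy <- merge(x, y, by = c("a", "b"), all = TRUE)

# Prefer c.y where it exists, otherwise fall back to c.x.
xy$c <- ifelse(is.na(xy$c.y), xy$c.x, xy$c.y)

# Keep only the key columns and the combined column.
xy <- xy[, c("a", "b", "c")]
xy
```

One caveat of this rule: if y holds a genuine NA for a key that also exists in x, the x value is used, which may or may not be what is wanted.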
No, you cannot. You may want to write a merge function with the special
capability, but there is no better way than the one suggested by Henrique.

On Sep 14, 12:18 pm, JiHO <jo.li... at gmail.com> wrote:
> On 2009-September-11, at 13:55, wrote:
>
> > > Maybe:
> > >
> > > do.call(rbind, lapply(with(xy <- rbind(x, y), split(xy, list(a, b), drop = TRUE)), tail, 1))
> >
> > On Fri, Sep 11, 2009 at 3:45 AM, jo <jo.li... at gmail.com> wrote:
> > Thanks for the post-processing ideas. But is there any way to do that
> > in one step?
>
> Thanks but by "in one step" I meant within the merge, not in one
> post-processing step ;)
>
> JiHO
> ---
> http://maururu.net
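As an illustration of what "a merge function with the special capability" could look like, here is a minimal sketch. The name merge_prefer_y is hypothetical (it is not in base R or any package I know of), and the sketch assumes the shared non-key columns are numeric or character, since ifelse() drops factor attributes.

```r
# Hypothetical helper, not part of base R: full outer merge on `by`,
# then collapse each pair of clashing columns, preferring the value from y.
merge_prefer_y <- function(x, y, by) {
  m <- merge(x, y, by = by, all = TRUE, suffixes = c(".x", ".y"))
  # Non-key columns present in both inputs are the ones merge() suffixes.
  shared <- setdiff(intersect(names(x), names(y)), by)
  for (col in shared) {
    cx <- m[[paste0(col, ".x")]]
    cy <- m[[paste0(col, ".y")]]
    # Use the y value unless it is NA, in which case fall back to x.
    m[[col]] <- ifelse(is.na(cy), cx, cy)
    m[[paste0(col, ".x")]] <- NULL
    m[[paste0(col, ".y")]] <- NULL
  }
  m
}

# Usage with the values from the original post.
x <- data.frame(a = 1:4, b = 1:4,
                c = c(-0.8821089, -0.7082583, -0.5948835, -1.8571443))
y <- data.frame(a = c(1, 3), b = 3, c = c(-0.273155973, 0.009517862))
res <- merge_prefer_y(x, y, by = c("a", "b"))
res
```

This is still merge() plus post-processing under the hood; wrapping it in a function only hides the second step from the caller.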