Hello everyone,
My problem is better explained with an example:
> x=data.frame(a=1:4,b=1:4,c=rnorm(4))
> x
  a b          c
1 1 1 -0.8821089
2 2 2 -0.7082583
3 3 3 -0.5948835
4 4 4 -1.8571443
> y=data.frame(a=c(1,3),b=3,c=rnorm(2))
> y
  a b            c
1 1 3 -0.273155973
2 3 3  0.009517862
Now I want to merge x and y by columns a and b, hence creating a
data.frame with all a:b combinations observed in x and y. That's
easily done with merge:
> merge(x,y,by=c("a","b"),all=T)
  a b        c.x          c.y
1 1 1 -0.8821089           NA
2 1 3         NA -0.273155973
3 2 2 -0.7082583           NA
4 3 3 -0.5948835  0.009517862
5 4 4 -1.8571443           NA
But rather than two c columns I would want the merge to:
- keep the value in x if there is no corresponding value in y
- keep the value in y if there is no corresponding value in x
- prefer the value in y when the a:b combination exists in both x and y
So basically I want my result to look like:
  a b          c
1 1 1 -0.8821089
2 1 3 -0.2731559
3 2 2 -0.7082583
4 3 3  0.0095178
5 4 4 -1.8571443
I can't find a combination of options for merge that does this. Is
there another function that would do that, or do I have to resort to
some post-processing after merge? It seems that it might be something
like a "right merge" for databases, but I don't know this world at
all. I would be happy to look into sqldf if that allows doing things
like that.
Thanks in advance. Sincerely,
JiHO
---
http://maururu.net
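Try this:
xy <- merge(x, y, by = c("a", "b"), all = TRUE)
# .x holds the two candidate columns; when both are present take c.y
# (column 2, the value from y), otherwise take whichever one is not NA
.x <- xy[, c("c.x", "c.y")]
xy$c <- ifelse(rowSums(!is.na(.x)) > 1, .x[, 2],
               rowSums(.x, na.rm = TRUE))
xy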
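
A shorter variant of this post-processing that follows the "prefer y" rule from the question might look like the sketch below. It assumes that c.y is NA only where the a:b combination is missing from y; a genuine NA stored in y would also fall back to x's value. The set.seed call is an addition for repeatability and is not part of the original example.

```r
# Example data as in the question (the seed is an assumption, added for repeatability)
set.seed(1)
x <- data.frame(a = 1:4, b = 1:4, c = rnorm(4))
y <- data.frame(a = c(1, 3), b = 3, c = rnorm(2))

# Full outer merge on the key columns; yields columns c.x and c.y
xy <- merge(x, y, by = c("a", "b"), all = TRUE)

# Take y's value where it exists, otherwise fall back to x's value
xy$c <- ifelse(is.na(xy$c.y), xy$c.x, xy$c.y)

# Keep only the keys and the combined column
xy <- xy[, c("a", "b", "c")]
xy
```

Note that ifelse drops attributes, so this form is best suited to plain numeric or character columns.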
On Thu, Sep 10, 2009 at 12:21 PM, JiHO <jo.lists@gmail.com> wrote:
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" W
No you cannot. You may want to write a merge function with the special
capability but there is no better way than the one suggested by Henrique.

On Sep 14, 12:18 pm, JiHO <jo.li... at gmail.com> wrote:
> On 2009-September-11, at 13:55, wrote:
>
> > Maybe:
> >
> > do.call(rbind, lapply(with(xy <- rbind(x, y), split(xy, list(a, b),
> > drop = TRUE)), tail, 1))
> >
> > On Fri, Sep 11, 2009 at 3:45 AM, jo <jo.li... at gmail.com> wrote:
> > > Thanks for the post-processing ideas. But is there any way to do that
> > > in one step?
>
> Thanks but by "in one step" I meant within the merge, not in one
> post-processing step ;)
>
> JiHO
> ---
> http://maururu.net
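
For what it's worth, a "merge function with the special capability" could be sketched roughly as below. The name merge_prefer_y is hypothetical (it is not part of base R or any package), and the sketch assumes the shared non-key columns are plain numeric or character vectors, and that an NA on the y side means "no value from y".

```r
# Hypothetical helper (not in base R): full outer merge on `by`, where values
# from `y` override values from `x` in every non-key column the two share.
merge_prefer_y <- function(x, y, by) {
  # Non-key columns present in both data frames; merge() suffixes these
  shared <- setdiff(intersect(names(x), names(y)), by)
  m <- merge(x, y, by = by, all = TRUE, suffixes = c(".x", ".y"))
  for (col in shared) {
    from_x <- m[[paste0(col, ".x")]]
    from_y <- m[[paste0(col, ".y")]]
    # Start from y's values and fill its gaps with x's values
    missing_in_y <- is.na(from_y)
    from_y[missing_in_y] <- from_x[missing_in_y]
    m[[col]] <- from_y
    # Drop the suffixed helper columns
    m[[paste0(col, ".x")]] <- NULL
    m[[paste0(col, ".y")]] <- NULL
  }
  m
}

# Usage with fixed values so the effect is easy to see
x <- data.frame(a = 1:4, b = 1:4, c = c(10, 20, 30, 40))
y <- data.frame(a = c(1, 3), b = 3, c = c(-1, -3))
res <- merge_prefer_y(x, y, by = c("a", "b"))
res
```

This is still merge plus post-processing, just wrapped in one call; as far as I know merge itself has no option for it.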