If I want to do a join based on *two* matching fields in two data frames, can merge() handle this? It appears to only handle a single matching column -- do I need to make a "metacolumn" or is there some way to do this? E.g.: Dataframe 1 contains columns A,B,C and Dataframe 2 contains A,B,D I want an output A,B,C,D which places C and D together if A and B match (otherwise, make two new rows, e.g. Ax,Bx,Cx,nodata and Ay,By,nodata,Dy) --j -- Jonathan A. Greenberg, PhD NRC Research Associate NASA Ames Research Center MS 242-4 Moffett Field, CA 94035-1000 Office: 650-604-5896 Cell: 415-794-5043 AIM: jgrn307 MSN: jgrn307 at hotmail.com
merge() can handle this if you specify the "by" parameters to the same
length vector for x and y. The parameters corresponds the columns you want
to filter out under certain conditions and add the values to the merging
data.frame.
For example(modified from No.1 example from ?merge):
authors <- data.frame(
surname = c("Tukey", "Venables",
"Tierney", "Ripley", "McNeil"),
nationality = c("US", "Australia", "US",
"UK", "Australia"),
deceased = c("yes", rep("no",
4)),year=c(2000,2001,2000,2001,2000))
books <- data.frame(
name = c("Tukey", "Venables", "Tierney",
"Ripley", "Ripley", "McNeil",
"R Core"),
title = c("Exploratory Data Analysis",
"Modern Applied Statistics ...",
"LISP-STAT",
"Spatial Statistics", "Stochastic
Simulation",
"Interactive Data Analysis",
"An Introduction to R"),
other.author = c(NA, "Ripley", NA, NA, NA, NA,
"Venables &
Smith"),year=c(1999,2001,2000,2001,2000,2001,2002))
#Compare the following results:
merge(authors,books,by.x="surname",by.y="name")
merge(authors,books,by.x=c("surname","year"),by.y=c("name","year"))
Hope this can be some help of you.
Xiaohui Chen
Dept. of Statistics
UBC, Canada
>From: Jonathan Greenberg <jgreenberg at arc.nasa.gov>
>To: R-help <r-help at stat.math.ethz.ch>
>Subject: [R] Question about merge()
>Date: Sat, 23 Sep 2006 15:08:22 -0700
>
>If I want to do a join based on *two* matching fields in two data frames,
>can merge() handle this? It appears to only handle a single matching
>column
>-- do I need to make a "metacolumn" or is there some way to do
this? E.g.:
>
>Dataframe 1 contains columns A,B,C and Dataframe 2 contains A,B,D
>
>I want an output A,B,C,D which places C and D together if A and B match
>(otherwise, make two new rows, e.g. Ax,Bx,Cx,nodata and Ay,By,nodata,Dy)
>
>--j
>
>--
>Jonathan A. Greenberg, PhD
>NRC Research Associate
>NASA Ames Research Center
>MS 242-4
>Moffett Field, CA 94035-1000
>Office: 650-604-5896
>Cell: 415-794-5043
>AIM: jgrn307
>MSN: jgrn307 at hotmail.com
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
_________________________________________________________________
Buy what you want when you want it on Sympatico / MSN Shopping
MERGE DATA FRAMES BY 2 OR MORE VARIABLES
#######################################
# MERGE 2 DATA FRAMES BASED ON #
# 2 OR MORE VARIABLES #
#######################################
data1<-data.frame(x.id1 = 1:10, x.id2 = (1:10) * 2, x = rnorm(length(1:10)));
data2<-data.frame(y.id1 = seq(1, 20, by = 2), y.id2 = seq(1, 20, by 2) * 2, y
= rnorm(length(1:10)));
merged<-merge(data1, data2, by.x = c("x.id1", "x.id2"),
by.y c("y.id1", "y.id2"), all = TRUE);
On 9/23/06, Jonathan Greenberg <jgreenberg at arc.nasa.gov>
wrote:> If I want to do a join based on *two* matching fields in two data frames,
> can merge() handle this? It appears to only handle a single matching
column
> -- do I need to make a "metacolumn" or is there some way to do
this? E.g.:
>
> Dataframe 1 contains columns A,B,C and Dataframe 2 contains A,B,D
>
> I want an output A,B,C,D which places C and D together if A and B match
> (otherwise, make two new rows, e.g. Ax,Bx,Cx,nodata and Ay,By,nodata,Dy)
>
> --j
>
> --
> Jonathan A. Greenberg, PhD
> NRC Research Associate
> NASA Ames Research Center
> MS 242-4
> Moffett Field, CA 94035-1000
> Office: 650-604-5896
> Cell: 415-794-5043
> AIM: jgrn307
> MSN: jgrn307 at hotmail.com
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Cincinnati Children Hospital Medical Center