schimpanski at gmx.de
2009-Sep-29 07:04 UTC
[R] Data Step: Outer Join Rows, Inner Join Columns
Hi, how can I do a data step in R which can be best described in SQL terminology as an outer join of rows (eg cases) and inner join of columns (variables)? In case of conflicting non-missing values the first dataset has higher always priority. Missing values should always be replaced by valid values (cf. coalesce function in SQL): INPUT DATASET 1: A B C 1 11 NA 31 2 12 22 32 3 NA 23 33 INPUT DATASET 2: C D E 2 NA 42 52 3 NA 43 53 4 34 44 54 OUTPUT DATASET: A B C D E 1 11 NA 31 NA NA 2 12 22 32 42 52 3 NA 23 33 43 53 4 NA NA 34 44 54 Thanks in advance, Will
Hi, if you like to use sql you can use the sqldf package. Another possibility is the basic function merge. HTH Christian> Hi, > > how can I do a data step in R which can be best described in SQL terminology as an outer join of rows (eg cases) and inner join of columns (variables)? In case of conflicting non-missing values the first dataset has higher always priority. Missing values should always be replaced by valid values (cf. coalesce function in SQL): > > INPUT DATASET 1: > > A B C > 1 11 NA 31 > 2 12 22 32 > 3 NA 23 33 > > INPUT DATASET 2: > > C D E > 2 NA 42 52 > 3 NA 43 53 > 4 34 44 54 > > OUTPUT DATASET: > A B C D E > 1 11 NA 31 NA NA > 2 12 22 32 42 52 > 3 NA 23 33 43 53 > 4 NA NA 34 44 54 > > Thanks in advance, > > Will > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >