Daniel Folkinshteyn
2007-Mar-25 01:47 UTC
[R] Concatenating data frames with partial overlap in variable names
Greetings to all.

I need to concatenate data frames that do not share all the same variable names; there is only a partial overlap in the variables. For example, if I have two data frames, a and b, that look like the following:

> a
  a b
1 1 4
2 2 5
3 3 6
4 4 7
5 5 8
> b
  c  a
1 1 10
2 2 11
3 3 12
4 4 13
5 5 14

I want to concatenate them by row, without any matching, so that the variables that are not available in all frames get NAs. The result should look like:

    a  b  c
1   1  4 NA
2   2  5 NA
3   3  6 NA
4   4  7 NA
5   5  8 NA
6  10 NA  1
7  11 NA  2
8  12 NA  3
9  13 NA  4
10 14 NA  5

rbind doesn't work, since it requires all variables to be matched between the two data frames. merge doesn't work, since it wants to /match/ by columns with the same name, and if matching by nothing, produces a Cartesian product.

Is there a neat trick for doing this simply, or am I stuck with comparing variable lists and generating NAs manually?

Would appreciate any help!
Daniel
Marc Schwartz
2007-Mar-25 02:00 UTC
[R] Concatenating data frames with partial overlap in variable names
On Sat, 2007-03-24 at 21:47 -0400, Daniel Folkinshteyn wrote:
> Greetings to all.
> I need to concatenate data frames that do not have all the same variable
> names, there is only a partial overlap in the variables. So, for
> example, if i have two data frames, a and b, that look like the following:
> > a
>   a b
> 1 1 4
> 2 2 5
> 3 3 6
> 4 4 7
> 5 5 8
> > b
>   c  a
> 1 1 10
> 2 2 11
> 3 3 12
> 4 4 13
> 5 5 14
>
> i want to concatenate them by row, without any matching, so that the
> variables that are not available in all frames get NAs. The result
> should look like:
>     a  b  c
> 1   1  4 NA
> 2   2  5 NA
> 3   3  6 NA
> 4   4  7 NA
> 5   5  8 NA
> 6  10 NA  1
> 7  11 NA  2
> 8  12 NA  3
> 9  13 NA  4
> 10 14 NA  5
>
> rbind doesn't work, since it requires all variables to be matched
> between the two data frames. merge doesn't work, since it wants to
> /match/ by columns with the same name, and if matching by nothing,
> produces a cartesian product.
>
> is there a neat trick for doing this simply, or am i stuck with
> comparing variable lists and generating NAs manually?
>
> would appreciate any help!
> Daniel

You can use merge():

> a
  a b
1 1 4
2 2 5
3 3 6
4 4 7
5 5 8
> b
  c  a
1 1 10
2 2 11
3 3 12
4 4 13
5 5 14

Use 'a' as the common 'by' column and specify 'all = TRUE' so that non-matching values of 'a' will be included in the result:

> merge(a, b, by = "a", all = TRUE)
    a  b  c
1   1  4 NA
2   2  5 NA
3   3  6 NA
4   4  7 NA
5   5  8 NA
6  10 NA  1
7  11 NA  2
8  12 NA  3
9  13 NA  4
10 14 NA  5

See ?merge for more information.

HTH,

Marc Schwartz
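For anyone reproducing the merge() suggestion, here is a minimal sketch that rebuilds the two example data frames and runs the call. The second half is an added illustration, not part of the original reply: the frame b2 is hypothetical and shows what happens when values of 'a' overlap between the two frames. In that case merge() matches rows on the key instead of stacking them, so the result has fewer rows than a plain row-wise concatenation would.

```r
# Rebuild the example data frames from the thread
a <- data.frame(a = 1:5, b = 4:8)
b <- data.frame(c = 1:5, a = 10:14)

# Full outer join on 'a'; no values of 'a' are shared, so every row is kept
res <- merge(a, b, by = "a", all = TRUE)
print(res)

# Hypothetical frame whose 'a' values (3:7) overlap with those in 'a' (1:5)
b2 <- data.frame(c = 1:5, a = 3:7)

# Rows with a = 3, 4, 5 are matched and combined rather than stacked
res2 <- merge(a, b2, by = "a", all = TRUE)
print(nrow(res2))  # 7 rows, not the 10 a row-wise concatenation would give
```

So merge() with all = TRUE behaves like a concatenation only while the key values are disjoint across the frames.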
hadley wickham
2007-Mar-25 03:11 UTC
[R] Concatenating data frames with partial overlap in variable names
On 3/24/07, Daniel Folkinshteyn <dfolkins at temple.edu> wrote:
> Greetings to all.
> I need to concatenate data frames that do not have all the same variable
> names, there is only a partial overlap in the variables. So, for
> example, if i have two data frames, a and b, that look like the following:

Have a look at rbind.fill in the reshape package.

Hadley
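With the reshape package installed, the call would presumably be rbind.fill(a, b). For readers without the package, the same idea can be sketched in base R. The helper name rbind_fill_sketch below is hypothetical and is not the package's implementation. It assumes every input is a data frame with at least one row and that shared columns have compatible types.

```r
# Example data frames from the thread
a <- data.frame(a = 1:5, b = 4:8)
b <- data.frame(c = 1:5, a = 10:14)

# Hypothetical helper: pad each frame with NA columns, then stack by row
rbind_fill_sketch <- function(...) {
  dfs <- list(...)
  # Union of all column names, in order of first appearance
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(df) {
    # Add each column this frame lacks, filled with NA
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- NA
    }
    # Put the columns in a common order so rbind() lines them up
    df[all_cols]
  })
  do.call(rbind, padded)
}

res <- rbind_fill_sketch(a, b)
print(res)
```

Unlike merge(), this never matches rows on shared values, so it still stacks every row when the two frames contain the same values of 'a'.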