In the case of 1:1 merging with distinct sets of non-ID variables in two or more datasets, would the following code, which doesn't need to form the larger merged data frame, be useful or faster? [A generalization of with() would make this even better. I've often wondered about the utility of a "merged environment".]

> set.seed(1)
> a <- data.frame(id=c(1:3, 5, 7), x1=runif(5))
> b <- data.frame(id=c(1:3, 4, 6), x2=runif(5))
> a
  id        x1
1  1 0.2655087
2  2 0.3721239
3  3 0.5728534
4  5 0.9082078
5  7 0.2016819
> b
  id         x2
1  1 0.89838968
2  2 0.94467527
3  3 0.66079779
4  4 0.62911404
5  6 0.06178627
>
> ida <- a$id; idb <- b$id
> ids <- sort(unique(c(ida, idb)))
> i <- match(ids, ida)
> j <- match(ids, idb)
> a[i,]$x1
[1] 0.2655087 0.3721239 0.5728534        NA 0.9082078        NA 0.2016819
> b[j,]$x2
[1] 0.89838968 0.94467527 0.66079779 0.62911404         NA 0.06178627         NA
>
> with(a[i,],
+      with(b[j,],
+           cbind(x1,x2)))
            x1         x2
[1,] 0.2655087 0.89838968
[2,] 0.3721239 0.94467527
[3,] 0.5728534 0.66079779
[4,]        NA 0.62911404
[5,] 0.9082078         NA
[6,]        NA 0.06178627
[7,] 0.2016819         NA

-- 
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
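
As a rough sketch of what a generalized with() over a "merged environment" might look like, the match() steps above can be wrapped in a small helper. The name with_aligned and its arguments are hypothetical (nothing like it exists in base R), and it assumes a 1:1 merge on a single ID column, with non-ID variable names that are distinct across the data frames.

```r
# Hypothetical helper, not part of base R: evaluate `expr` against the
# columns of several data frames aligned on a shared ID, without merge().
# Assumes IDs are unique within each data frame and that non-ID column
# names do not clash across data frames.
with_aligned <- function(dfs, expr, by = "id") {
  expr <- substitute(expr)
  # Sorted union of all IDs, as in the session above
  ids <- sort(unique(unlist(lapply(dfs, function(d) d[[by]]))))
  # Reorder each data frame's non-ID columns to follow `ids`;
  # an ID absent from a data frame yields NA in its columns
  cols <- lapply(dfs, function(d) {
    k <- match(ids, d[[by]])
    lapply(d[setdiff(names(d), by)], function(v) v[k])
  })
  # Flatten to one named list: the ID plus every aligned column
  env <- c(setNames(list(ids), by), unlist(unname(cols), recursive = FALSE))
  eval(expr, env, parent.frame())
}

set.seed(1)
a <- data.frame(id = c(1:3, 5, 7), x1 = runif(5))
b <- data.frame(id = c(1:3, 4, 6), x2 = runif(5))
res <- with_aligned(list(a, b), cbind(id, x1, x2))
```

Only the columns are reordered and placed in a list for eval(), so no merged data frame is built. Whether this is actually faster than merge() would need to be timed on realistic data.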