On Nov 10, 2009, at 12:36 PM, Chuck White wrote:
> df1 -- dataframe with column date and several other columns.
> #rows > 40k. Several of the dates are repeated.
> df2 -- dataframe with two columns, date and index. #rows ~130.
> This is really a map from date to index.
>
> I would like to create a column called index in df1 which has the
> corresponding index from df2.
>
> The following works:
> index <- NULL
> for(wk in df1$week){
> index <- c(index,df2$index[df2$week==wk])
> }
> and then add index to df1.
>
> Can you please suggest a better way of doing this? I didn't think
> merge was suitable for this...is it? THANKS.
I think merge should work. But if you really have looked at its
various arguments, tested reasonable examples, and are still convinced
it wouldn't, then see what you get with match():
> df1 <- data.frame(dt = Sys.Date() - sample(100:120, 30, replace = TRUE),
+                   1:30)
> df2 <- data.frame(dt2 = Sys.Date() - 100:120, index = LETTERS[1:21])
> df1$index <- df2[match(df1$dt, df2$dt2), "index"]
> df1
           dt X1.30 index
1  2009-07-30     1     D
2  2009-07-16     2     R
3  2009-07-23     3     K
4  2009-07-29     4     E
5  2009-07-15     5     S
6  2009-08-02     6     A
7  2009-07-18     7     P
8  2009-07-21     8     M
9  2009-07-27     9     G
10 2009-07-26    10     H
11 2009-07-31    11     C
12 2009-07-26    12     H
13 2009-07-18    13     P
14 2009-07-23    14     K
15 2009-07-21    15     M
16 2009-07-19    16     O
17 2009-07-14    17     T
18 2009-07-16    18     R
19 2009-07-15    19     S
20 2009-07-13    20     U
21 2009-07-28    21     F
22 2009-07-20    22     N
23 2009-07-24    23     J
24 2009-07-20    24     N
25 2009-07-16    25     R
26 2009-07-30    26     D
27 2009-07-14    27     T
28 2009-08-02    28     A
29 2009-07-19    29     O
30 2009-07-26    30     H
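To see what match() does with repeated and missing dates, here is a
minimal sketch on made-up data (lookup, dates, pos, idx and loop_idx are
hypothetical names, not your data frames). It also shows one way the
for() loop can differ: a date with no entry in the map contributes
nothing to c(), so the loop result comes out shorter than the input,
whereas match() keeps the length and puts NA in that slot.

```r
# Toy lookup table: a map from date to index (made-up values).
lookup <- data.frame(date  = as.Date("2009-07-01") + 0:2,
                     index = c("A", "B", "C"),
                     stringsAsFactors = FALSE)

# Dates to look up: one repeated, one absent from the lookup table.
dates <- as.Date(c("2009-07-02", "2009-07-02", "2009-07-10", "2009-07-01"))

# match() returns, for each date, the position of its first match in
# lookup$date, or NA when there is no match.
pos <- match(dates, lookup$date)
idx <- lookup$index[pos]          # same length as dates

# The loop approach, for comparison: unmatched dates are silently dropped.
loop_idx <- NULL
for (i in seq_along(dates)) {
  loop_idx <- c(loop_idx, lookup$index[lookup$date == dates[i]])
}

length(idx)       # 4
length(loop_idx)  # 3
```

If every date in df1 is guaranteed to appear in df2, the two approaches
agree; the match() version just avoids growing a vector one element at
a time.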
I tried merge(df1, df2, by.x=1, by.y=1) and got the same result, apart
from the row order: merge() sorts its output on the key column by
default.
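For the merge() route, a small sketch under some assumptions: the data
are made up, the column names dt, dt2 and value are only for
illustration, and I am assuming you want to keep rows of df1 whose date
is missing from df2 (hence all.x = TRUE). Re-ordering on a column that
records the original row order puts the rows back where they started.

```r
# Made-up data: df1 has repeated dates and one date absent from df2.
df1 <- data.frame(dt    = as.Date("2009-07-01") + c(2, 0, 2, 5),
                  value = 1:4)                 # value doubles as original row order
df2 <- data.frame(dt2   = as.Date("2009-07-01") + 0:2,
                  index = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# Left join: all.x = TRUE keeps df1 rows with no match (index becomes NA).
merged <- merge(df1, df2, by.x = "dt", by.y = "dt2", all.x = TRUE)

# Restore the original row order of df1.
merged <- merged[order(merged$value), ]

# Same answer as the match() approach.
via_match <- df2$index[match(df1$dt, df2$dt2)]
identical(merged$index, via_match)
```

Without all.x = TRUE, merge() drops the unmatched rows, which may or may
not be what you want.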
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT