thr3ads.net - R help - [R] merge data [Nov 2009]

Chuck White

2009-Nov-10 17:36 UTC

[R] merge data

df1 -- data frame with a date column and several other columns; more than 40k
rows. Several of the dates are repeated.
df2 -- data frame with two columns, date and index; about 130 rows. This is
really a map from date to index.

I would like to create a column called index in df1 which has the corresponding
index from df2.

The following works:
index <- NULL
for(wk in df1$week){
    index <- c(index,df2$index[df2$week==wk])
}
and then add index to df1.

Can you please suggest a better way of doing this? I didn't think merge was
suitable for this...is it? THANKS.

David Winsemius

2009-Nov-10 18:34 UTC

[R] merge data

On Nov 10, 2009, at 12:36 PM, Chuck White wrote:
> df1 -- dataframe with column date and several other columns. #rows  
> >40k  Several of the dates are repeated.
> df2 -- dataframe with two columns date and index. #rows ~130  This  
> is really a map from date to index.
>
> I would like to create a column called index in df1 which has the  
> corresponding index from df2.
>
> The following works:
> index <- NULL
> for(wk in df1$week){
>    index <- c(index,df2$index[df2$week==wk])
> }
> and then add index to df1.
>
> Can you please suggest a better way of doing this? I didn't think  
> merge was suitable for this...is it? THANKS.
I think merge should work, but if you really have looked at the  
various arguments, tested reasonable examples and are still convinced  
it wouldn't, then see what you get with:

 > df1 <- data.frame(dt = Sys.Date() - sample(100:120, 30, replace=TRUE), 1:30)
 > df2 <- data.frame(dt2 = Sys.Date() -100:120, index=LETTERS[1:21])

 > df1$index <- df2[ match(df1$dt,df2$dt2), "index"]
 > df1
            dt X1.30 index
1  2009-07-30     1     D
2  2009-07-16     2     R
3  2009-07-23     3     K
4  2009-07-29     4     E
5  2009-07-15     5     S
6  2009-08-02     6     A
7  2009-07-18     7     P
8  2009-07-21     8     M
9  2009-07-27     9     G
10 2009-07-26    10     H
11 2009-07-31    11     C
12 2009-07-26    12     H
13 2009-07-18    13     P
14 2009-07-23    14     K
15 2009-07-21    15     M
16 2009-07-19    16     O
17 2009-07-14    17     T
18 2009-07-16    18     R
19 2009-07-15    19     S
20 2009-07-13    20     U
21 2009-07-28    21     F
22 2009-07-20    22     N
23 2009-07-24    23     J
24 2009-07-20    24     N
25 2009-07-16    25     R
26 2009-07-30    26     D
27 2009-07-14    27     T
28 2009-08-02    28     A
29 2009-07-19    29     O
30 2009-07-26    30     H
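
One difference between the match() lookup and the loop in the question is worth a small sketch: when a key in df1 has no entry in df2, the loop silently appends nothing and the result ends up shorter than df1, whereas match() returns NA in that position. The toy data below is hypothetical; only the column names week and index are taken from the question.

```r
# Hypothetical toy data; week 9 deliberately has no entry in the map.
df1 <- data.frame(week = c(3, 1, 2, 1, 9), value = 1:5)
df2 <- data.frame(week = 1:3, index = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# Loop from the question: an unmatched week contributes a zero-length
# vector, so the result is shorter than df1 and no longer lines up.
loop_index <- NULL
for (wk in df1$week) {
  loop_index <- c(loop_index, df2$index[df2$week == wk])
}
length(loop_index)   # 4, one fewer than nrow(df1)

# Vectorised lookup: match() gives NA for the unmatched week,
# so the new column keeps one value per row of df1.
df1$index <- df2$index[match(df1$week, df2$week)]
df1$index            # "C" "A" "B" "A" NA
```

If every key is guaranteed to be present in df2, the two approaches return the same values; the match() version just avoids growing a vector one element at a time.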

I tried merge(df1, df2, by.x=1, by.y=1) and got the same result modulo  
the order of the output.
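
For anyone who wants the merge() result in the original row order, here is a minimal sketch under a few assumptions: the dates and the helper column row_id are made up for illustration, and all.x = TRUE is added so that rows of df1 without a match in df2 are kept with NA rather than dropped.

```r
# Hypothetical data; the last date in df1 has no entry in df2.
base <- as.Date("2009-11-10")
df1 <- data.frame(dt = base - c(102, 100, 101, 100, 150), x = 1:5)
df2 <- data.frame(dt2 = base - 100:102, index = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# merge() sorts on the key by default, so record the original row order
# first (row_id is an illustrative helper, not part of the thread).
df1$row_id <- seq_len(nrow(df1))

# Left join: all.x = TRUE keeps unmatched rows of df1 with index = NA.
merged <- merge(df1, df2, by.x = "dt", by.y = "dt2", all.x = TRUE)

# Restore the original order of df1.
merged <- merged[order(merged$row_id), ]
merged$index   # "C" "A" "B" "A" NA
```

With the order restored, the index column agrees with what df2$index[match(df1$dt, df2$dt2)] returns for the same data.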


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
