thr3ads.net - R help - [R] merging/intersecting 2 data frames [Jun 2010]


Erin Hodgess

2010-Jun-29 19:21 UTC

[R] merging/intersecting 2 data frames

Dear R People:

I have two data frames, a.df and b.df as seen here:
> a.df[1:10,]
        DATE GENDER PATIENT_ID AGE             SYNDROME
1  4/16/2009      F      23686  45         RASH ON BODY
2  4/16/2009      F      13840  35         CANT URINATE
3  4/16/2009      M      12895  30       BLURRED VISION
4  4/16/2009      M      18375  33       UNABLE TO VOID
5  4/16/2009      M       2237  44         SOB WEAKNESS
6  4/16/2009      F      21484  41 TOOTH PAINTOOTH PAIN
7  4/16/2009      M      10783  37          RT ARM PAIN
8  4/16/2009      M      12610  65        L FOOT INJURY
9  4/16/2009      F       3495  29 URINARY DIFFICULTIES
10 4/16/2009      F        351  36           PT STS MVA
> b.df[1:10,]
   DATE_OF_DEATH    ID
1      4/19/2009 21676
2      4/19/2009 13717
3      4/19/2009 20498
4      4/19/2009 14281
5      4/19/2009 38848
6      4/20/2009   331
7      4/20/2009  4084
8      4/20/2009 19616
9      4/20/2009 17965
10     4/20/2009 11863

a.df will always be larger than b.df.

I want to create a third data frame that is matched on PATIENT_ID from
a.df and ID from b.df.

If there is no match from a.df$PATIENT_ID to b.df$ID, then we omit the
row from the new data.frame.

If there is a match, we include the DATE_OF_DEATH column from b.df.

I've tried all kinds of tricks, but nothing works exactly as I wish.

Thanks in advance,
Sincerely,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com

jim holtman

2010-Jun-29 19:31 UTC


Use 'merge':

> a.df
        DATE GENDER PATIENT_ID AGE             SYNDROME
1  4/16/2009      F      23686  45         RASH ON BODY
2  4/16/2009      F      13840  35         CANT URINATE
3  4/16/2009      M      12895  30       BLURRED VISION
4  4/16/2009      M      18375  33       UNABLE TO VOID
5  4/16/2009      M       2237  44         SOB WEAKNESS
6  4/16/2009      F      21484  41 TOOTH PAINTOOTH PAIN
7  4/16/2009      M      10783  37          RT ARM PAIN
8  4/16/2009      M      12610  65        L FOOT INJURY
9  4/16/2009      F       3495  29 URINARY DIFFICULTIES
10 4/16/2009      F        351  36           PT STS MVA
> b.df
   DATE_OF_DEATH    ID
1      4/19/2009 23686
2      4/19/2009 13840
3      4/19/2009 12895
4      4/19/2009 18375
5      4/19/2009   351
6      4/20/2009  3495
7      4/20/2009  4084
8      4/20/2009 19616
9      4/20/2009 17965
10     4/20/2009 11863
> merge(a.df, b.df, by.x="PATIENT_ID", by.y="ID")
  PATIENT_ID      DATE GENDER AGE             SYNDROME DATE_OF_DEATH
1        351 4/16/2009      F  36           PT STS MVA     4/19/2009
2       3495 4/16/2009      F  29 URINARY DIFFICULTIES     4/20/2009
3      12895 4/16/2009      M  30       BLURRED VISION     4/19/2009
4      13840 4/16/2009      F  35         CANT URINATE     4/19/2009
5      18375 4/16/2009      M  33       UNABLE TO VOID     4/19/2009
6      23686 4/16/2009      F  45         RASH ON BODY     4/19/2009
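
For anyone who wants to run the example rather than read it, here is a minimal, self-contained sketch that rebuilds the two data frames from the listing above and repeats the merge. The values are copied from the listing. The intersect() check at the end is an addition for illustration, not part of the original reply, and the columns are assumed to be plain character and numeric vectors.

```r
# Rebuild the two example data frames from the listing above.
a.df <- data.frame(
  DATE       = rep("4/16/2009", 10),
  GENDER     = c("F", "F", "M", "M", "M", "F", "M", "M", "F", "F"),
  PATIENT_ID = c(23686, 13840, 12895, 18375, 2237, 21484, 10783, 12610, 3495, 351),
  AGE        = c(45, 35, 30, 33, 44, 41, 37, 65, 29, 36),
  SYNDROME   = c("RASH ON BODY", "CANT URINATE", "BLURRED VISION",
                 "UNABLE TO VOID", "SOB WEAKNESS", "TOOTH PAINTOOTH PAIN",
                 "RT ARM PAIN", "L FOOT INJURY", "URINARY DIFFICULTIES",
                 "PT STS MVA"),
  stringsAsFactors = FALSE
)
b.df <- data.frame(
  DATE_OF_DEATH = c(rep("4/19/2009", 5), rep("4/20/2009", 5)),
  ID            = c(23686, 13840, 12895, 18375, 351, 3495, 4084, 19616, 17965, 11863),
  stringsAsFactors = FALSE
)

# Inner join (the default): only rows whose PATIENT_ID also appears
# in b.df$ID are kept, and DATE_OF_DEATH is carried over from b.df.
c.df <- merge(a.df, b.df, by.x = "PATIENT_ID", by.y = "ID")

# Sanity check (added for illustration): the merged keys are exactly
# the IDs common to both data frames.
common <- intersect(a.df$PATIENT_ID, b.df$ID)
stopifnot(setequal(c.df$PATIENT_ID, common))

print(c.df)
```

Because merge() defaults to all = FALSE, rows without a match on either side are dropped, which is the behaviour asked for in the original question.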


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

Weidong Gu

2010-Jun-29 20:17 UTC


Erin,

See ?merge.

Try:
c.df <- merge(a.df, b.df, by.x="PATIENT_ID", by.y="ID")

Hope it helps.

Weidong

Greg Snow

2010-Jun-29 21:48 UTC


Use the merge function. Look at the by.x and by.y arguments, and also at the
all.x and all.y arguments and the suffixes argument. You may need to
delete some columns after the merge (or replace missing values in one column
with those in the same location from the next column; see the ifelse function).
So it may take a couple of steps, but that is probably the most straightforward
approach.
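
As a rough sketch of how those arguments fit together, the example below uses small made-up data frames (visits, deaths, and the LAST_DATE column are hypothetical names, not from the original post). Both frames carry a DATE column so that suffixes has something to disambiguate, and all.x = TRUE keeps unmatched rows so that ifelse has missing values to fill.

```r
# Hypothetical toy data: both frames have a column named DATE,
# so merge() must tell the two apart using suffixes.
visits <- data.frame(PATIENT_ID = c(351, 2237, 3495),
                     DATE = c("4/16/2009", "4/16/2009", "4/16/2009"),
                     stringsAsFactors = FALSE)
deaths <- data.frame(ID = c(351, 3495, 4084),
                     DATE = c("4/19/2009", "4/20/2009", "4/20/2009"),
                     stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of visits; rows with no match in
# deaths get NA in the columns that come from deaths.
left <- merge(visits, deaths, by.x = "PATIENT_ID", by.y = "ID",
              all.x = TRUE, suffixes = c(".visit", ".death"))

# Fill missing values in one column from the neighbouring column.
# LAST_DATE is an illustrative name: death date if known, else visit date.
left$LAST_DATE <- ifelse(is.na(left$DATE.death),
                         left$DATE.visit, left$DATE.death)

print(left)
```

Leaving all.x and all.y at their default of FALSE gives the plain inner join; setting all.y = TRUE instead would keep the unmatched ID 4084 from the second frame.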

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

