Mark Na
2006-Aug-07 20:05 UTC
[R] Retain only those records from a dataframe that exist in another dataframe
Dear R community,

I have two dataframes "first" and "second" which share a unique identifier.

I wish to make a new dataframe "third" retaining only the rows in "first" which also occur in "second".

I have tried using merge but can't seem to figure it out. Any ideas?

Thanks!

Mark
Peter Dalgaard
2006-Aug-07 20:14 UTC
"Mark Na" <mtb954 at gmail.com> writes:

> Dear R community,
>
> I have two dataframes "first" and "second" which share a unique identifier.
>
> I wish to make a new dataframe "third" retaining only the rows in
> "first" which also occur in "second".
>
> I have tried using merge but can't seem to figure it out. Any ideas?

Doesn't sound like a merge problem. Will this do it?

    first[first$ID %in% second$ID, ]

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
(p.dalgaard at biostat.ku.dk) Ph: (+45) 35327918, FAX: (+45) 35327907
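A minimal sketch of the suggestion in the reply above, run on toy data. The contents of "first" and "second" and the columns x and y are hypothetical, made up only for illustration; the identifier column is called ID as in the reply.

```r
# Hypothetical toy data; only the column name ID comes from the reply.
first  <- data.frame(ID = c(1, 2, 3, 4, 5), x = c("a", "b", "c", "d", "e"))
second <- data.frame(ID = c(2, 4, 6), y = c(10, 20, 30))

# Logical vector: TRUE for each row of 'first' whose ID also appears in 'second'.
keep <- first$ID %in% second$ID

# Row-subset 'first'; the empty slot after the comma keeps all of its columns.
third <- first[keep, ]
print(third)
```

Here "third" should contain only the rows of "first" with ID 2 and 4, and only the columns of "first". Nothing from "second" is carried over.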
Marc Schwartz (via MN)
2006-Aug-07 20:17 UTC
On Mon, 2006-08-07 at 14:05 -0600, Mark Na wrote:

> Dear R community,
>
> I have two dataframes "first" and "second" which share a unique identifier.
>
> I wish to make a new dataframe "third" retaining only the rows in
> "first" which also occur in "second".
>
> I have tried using merge but can't seem to figure it out. Any ideas?
>
> Thanks!
>
> Mark

Do you want to actually join (merge) matching rows from 'first' and 'second' into 'third', or just get a subset of the rows from 'first' where there is a matching UniqueID in 'second'?

In the first case:

    third <- merge(first, second, by = "UniqueID")

Note that the UniqueID column is quoted.

In the second case:

    third <- subset(first, UniqueID %in% second$UniqueID)

See ?merge, ?"%in%" and ?subset

HTH,

Marc Schwartz
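To make the difference between the two cases concrete, here is a small sketch on toy data. The data frames and the columns a and b are hypothetical; only the column name UniqueID and the two calls come from the message above.

```r
# Hypothetical toy data sharing the identifier column UniqueID.
first  <- data.frame(UniqueID = c(1, 2, 3, 4), a = c("p", "q", "r", "s"))
second <- data.frame(UniqueID = c(2, 4, 5), b = c(TRUE, FALSE, TRUE))

# Case 1: join. By default merge() keeps only matching rows and
# returns the columns of both data frames.
joined <- merge(first, second, by = "UniqueID")

# Case 2: filter. subset() keeps matching rows of 'first' and
# returns only the columns of 'first'.
filtered <- subset(first, UniqueID %in% second$UniqueID)

print(names(joined))
print(names(filtered))
```

Both results should hold the rows with UniqueID 2 and 4; they differ in that "joined" also carries column b from "second". If the identifier were repeated in "second", merge() would return one row per matching pair, while the %in% filter would still return each row of "first" at most once.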