Hi:
Here's one approach to the problem you posed (I don't know if it is what
you want for the problem you intend to use it on...)
df1 <- read.table(textConnection("
id sex age area
01 male adult LP
01 male adult LP
01 male adult LP
02 female subadult LP
02 female subadult LP
02 female subadult LP
02 female subadult LP
03 male subadult MR
03 male subadult MR
03 male subadult MR
03 male subadult MR"), stringsAsFactors = FALSE, header = TRUE)
closeAllConnections()
# read.table() reads 01 as the integer 1, so rebuild id as a character variable
df1$id <- rep(paste('0', 1:3, sep = ''), c(3, 4, 4))
df2 <- data.frame(id = paste('0', 1:6, sep = ''),
stringsAsFactors = FALSE)
# pick out the rows whose id matches one in df2$id
df <- df1[df1$id %in% df2$id, ]
df[!duplicated(df$id), ] # keep only the first row for each id
id sex age area
1 01 male adult LP
4 02 female subadult LP
8 03 male subadult MR
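If what you ultimately want is t2 with sex, age and area attached, one way to get there is to cut AB down to one row per id first and then merge. This is only a sketch: the toy AB and t2 below are stand-ins I made up from your example, and it assumes sex, age and area never change within an id, so keeping the first row per id loses nothing.

```r
# Toy stand-ins for the poster's data frames (values copied from the example;
# the real AB and t2 have many more columns).
AB <- data.frame(
  id   = rep(c("01", "02", "03"), c(3, 4, 4)),
  sex  = rep(c("male", "female", "male"), c(3, 4, 4)),
  age  = rep(c("adult", "subadult", "subadult"), c(3, 4, 4)),
  area = rep(c("LP", "LP", "MR"), c(3, 4, 4)),
  stringsAsFactors = FALSE
)
t2 <- data.frame(id = sprintf("%02d", 1:6), stringsAsFactors = FALSE)

# Keep the first row per id, and only the columns to carry over
lookup <- AB[!duplicated(AB$id), c("id", "sex", "age", "area")]

# all.x = TRUE keeps every row of t2; ids with no match in AB get NA
t3 <- merge(t2, lookup, by = "id", all.x = TRUE)
t3
```

Because lookup has at most one row per id, merge() can no longer multiply rows, so t3 has exactly as many rows as t2.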
I arranged the two data frames so that id was a character vector, in order
to support the leading 0.
I'm assuming your id variable is character - whichever class it is, make
sure it's consistent in both data frames. You can use str() to check the
types of your columns in each data frame.
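As a rough illustration of that check (the data frames a and b here are made-up examples, not your data), you could compare the classes and convert a factor id to character before matching:

```r
# Hypothetical data frames: id is a factor in one and character in the other
a <- data.frame(id = factor(c("01", "02", "03")))
b <- data.frame(id = c("01", "02", "03"), stringsAsFactors = FALSE)

str(a)  # shows id as a Factor
str(b)  # shows id as chr

# Convert the factor to character so both id columns have the same class
a$id <- as.character(a$id)
identical(class(a$id), class(b$id))

# If an id column was read in as a number (1, 2, 3), the leading zero
# can be restored with sprintf()
sprintf("%02d", as.integer(c(1, 2, 3)))
```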
HTH,
Dennis
On Wed, Feb 9, 2011 at 6:09 AM, Nathaniel <nathanielrayl@hotmail.com>
wrote:
>
> Hi R users,
>
> I am trying to extract some attributes (age, sex, area) from dataframe "AB"
> that has 101,269 observations of 28 variables to dataframe "t2" that has 47
> observations of 6 variables. They share a column called "id", which is a
> factor with 47 levels. I want to end up with a dataframe that has 47
> observations of 9 variables (the original 6 variables of t2, plus age, sex,
> and area). The issue I am having is that AB has multiple entries for each
> id, and so I can't use merge because there is more than one match, so all
> possible matches contribute one row each--i.e., this code gives me dataframe
> "t3" of 101,269 observations of 33 variables:
>
> >t3<-merge(t2,AB,by="id",all=FALSE)
>
> Dataframe AB (24 variables omitted from example dataframe):
>
> id sex age area
> 01 male adult LP
> 01 male adult LP
> 01 male adult LP
> ...
> 02 female subadult LP
> 02 female subadult LP
> 02 female subadult LP
> 02 female subadult LP
> ...
> 03 male subadult MR
> 03 male subadult MR
> 03 male subadult MR
> 03 male subadult MR
> ...
>
> Dataframe t2 (5 variables omitted from example dataframe):
>
> id
> 01
> 02
> 03
> 04
> 05
> 06
> ....
>
> This is the structure I want for dataframe t3 (5 variables omitted from
> example dataframe):
>
> id sex age area
> 01 male adult LP
> 02 female subadult LP
> 03 male subadult MR
> ...
>
> Hopefully this all makes sense and someone knows a solution. Thanks in
> advance for taking a look at my problem and helping out (I hope!).
>
> Nathaniel