Try this:
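# Left join so rows of big whose id is absent from small are kept;
# date_.x comes from big, date_.y from small.
merg <- merge(big, small, by = "id", all.x = TRUE)
f <- function(x) {
  # candidate dates from small that fall on or before the date in big
  ok <- x$date_.y[!is.na(x$date_.y) & x$date_.y <= x$date_.x]
  x <- x[1, ]
  x$date_.y <- if (length(ok)) max(ok) else as.Date(NA)
  x
}
# one group per distinct (id, date_.x) pair of big
do.call("rbind", by(merg, interaction(merg$id, merg$date_.x, drop = TRUE), f))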
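
If the merge gets too large, a row-by-row lookup is another option. What follows is only a minimal sketch under the conditions stated in the question (same id, small$date_ on or before big$date_, latest such date wins). It rebuilds the sample data from the post so it runs on its own; the helper name latest_match and the result column name match are made up for illustration.

```r
# Sample data as given in the original post
small <- data.frame(
  date_ = as.Date(c("2005-12-02", "2005-12-01", "2004-11-02",
                    "2002-10-05", "2000-12-15")),
  id = c("a", "a", "b", "c", "d")
)
big <- data.frame(
  date_ = as.Date(c("2005-12-31", "2005-12-31", "2004-12-31",
                    "2002-10-05", "2001-10-31", "1999-12-31")),
  id = c("a", "b", "c", "d", "e", "c")
)

# latest_match() is a hypothetical helper, not a base R function.
# For row i of big, return the latest date in small with the same id
# that is on or before big$date_[i], or NA if there is none.
latest_match <- function(i) {
  same_id <- as.character(small$id) == as.character(big$id[i])
  cand <- small$date_[same_id & small$date_ <= big$date_[i]]
  if (length(cand)) max(cand) else as.Date(NA)
}

# Collect the per-row results in a list and combine them with c(),
# which keeps the Date class (sapply() would drop it).
res <- lapply(seq_len(nrow(big)), latest_match)
big$match <- do.call("c", res)
big
```

This loops over the rows of big, so it is quadratic in the worst case, but it keeps every row of big in its original order and never builds the intermediate merged table.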
On 1/27/06, r user <ruser2006 at yahoo.com> wrote:
> I have two datasets, big and small.
>
> s_date<-c('2005-12-02', '2005-12-01',
> '2004-11-02','2002-10-05','2000-12-15')
> s_id<-c('a','a','b','c','d')
>
> b_date<-c('2005-12-31', '2005-12-31',
> '2004-12-31','2002-10-05','2001-10-31','1999-12-31')
> b_id<-c('a','b','c','d','e','c')
>
> small<-data.frame(date_=as.Date(s_date),id=s_id)
> big<-data.frame(date_=as.Date(b_date),id=b_id)
>
> For each row in "big", I want to look for a match in
> small where two conditions are met:
>
> a. big$id == small$id
> b. big$date_ >= small$date_
>
> If match is found, I wish to return the value of the
> date. If no match is found, I want NA.
>
> If more than one match is found, I wish to return the
> match where small$date_ is greatest.
>
> I'm thinking I might be able to do this using the
> match function, and by sorting the "small" dataset by
> date_ in descending order.
>
> However, I do not know how to make the match
> conditional on big$date_>=small$date_.
>
> Any help is appreciated.