Dear R-help,

I would like to know if there is a short solution in R for this merging problem...

Let's say I have a dataset A as:

TYPE DATE
A 2
A 5
A 20
B 10
B 2

(there can be duplicates for the same type and date)

and I have another dataset B as:

TYPE Special_Date
A 2
A 6
A 20
A 22
B 5
B 6

The question is: I would like to obtain the difference between the date of each observation in A and the closest special date in B with the same type. In case of ties I would take the latest date of the two.

For example, I would obtain here

TYPE DATE Difference
A 2 0=2-2
A 5 -1=5-6
A 20 0=20-20
B 10 +4=10-6
B 2 -3=2-5

Do you know how to (simply?) obtain this in R?

Many thanks!
Best Regards

Hello,
Try the following.
A <- read.table(text="
TYPE DATE
A 2
A 5
A 20
B 10
B 2
", header = TRUE)
B <- read.table(text="
TYPE Special_Date
A 2
A 6
A 20
A 22
B 5
B 6
", header = TRUE)
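# merge on the common column TYPE: pairs each row of A with every special date of its type
m <- merge(A, B)
result <- do.call(rbind, lapply(split(m, list(m$DATE, m$TYPE)), function(x) {
    a <- abs(x$DATE - x$Special_Date)
    # skip empty DATE/TYPE combinations; keep the row(s) at minimum distance (ties return both)
    if (nrow(x)) x[which(min(a) == a), ]
}))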
result$Difference <- result$DATE - result$Special_Date
result$Special_Date <- NULL
rownames(result) <- seq_len(nrow(result))
result
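The original question also asks that ties go to the latest special date, while the split/lapply code keeps every row at the minimum distance. A minimal sketch of one way to add that rule, assuming the same column names as in the example; nearest_diff() is a hypothetical helper name, not something from this thread or from any package:

```r
# Example data from the thread (character TYPE, so comparisons are plain string matches)
A <- data.frame(TYPE = c("A", "A", "A", "B", "B"),
                DATE = c(2, 5, 20, 10, 2),
                stringsAsFactors = FALSE)
B <- data.frame(TYPE = c("A", "A", "A", "A", "B", "B"),
                Special_Date = c(2, 6, 20, 22, 5, 6),
                stringsAsFactors = FALSE)

# Hypothetical helper: difference between one date and the closest
# special date of the same type, preferring the latest date on ties.
nearest_diff <- function(date, type, B) {
    s <- B$Special_Date[B$TYPE == type]    # special dates of the same type
    if (length(s) == 0) return(NA_real_)   # no special date for this type
    d <- abs(date - s)
    date - max(s[d == min(d)])             # on ties, keep the latest special date
}

# One call per row of A, so row order and duplicated rows are preserved
A$Difference <- mapply(nearest_diff, A$DATE, A$TYPE, MoreArgs = list(B = B))
A
```

With the example data this gives the differences 0, -1, 0, 4, -3 in the row order of A. A date of 4 with type A is equally far from 2 and 6, and the helper picks 6.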
Also, it's good practice to post data examples using dput(). For instance,
dput(A)
structure(list(TYPE = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), DATE = c(2L, 5L, 20L, 10L, 2L)), .Names = c("TYPE",
"DATE"), class = "data.frame", row.names = c(NA, -5L))
Now all we have to do is run the statement A <- structure(... etc...) to
have an exact copy of the data example.
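The round trip can be checked directly. A small sketch, assuming a data frame like the one above; the names txt and A_copy are only for illustration:

```r
# Small data frame standing in for the poster's data
A <- data.frame(TYPE = c("A", "A", "A", "B", "B"),
                DATE = c(2L, 5L, 20L, 10L, 2L),
                stringsAsFactors = FALSE)

# dput() prints R code that rebuilds the object; capture that text ...
txt <- paste(capture.output(dput(A)), collapse = "\n")

# ... and evaluate it, as a reader would after pasting it from an email
A_copy <- eval(parse(text = txt))

identical(A, A_copy)
```

The final comparison shows whether the pasted structure(...) call reproduces the data frame exactly, including column types.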
Anyway, your example with input and the wanted result was very welcome.
Hope this helps,
Rui Barradas
On 19-08-2012 11:10, Francesco wrote:
> Dear R-help
Hello,

On 19-08-2012 17:33, William Dunlap wrote:
> Did you omit
>    m <- merge(A, B)
> from your code?

Yes, completely forgot! It should be before the split/lapply.

Rui Barradas

> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>> Of Rui Barradas
>> Sent: Sunday, August 19, 2012 3:52 AM
>> To: Francesco
>> Cc: r-help
>> Subject: Re: [R] merging and obtaining the nearest value