thr3ads.net - R help - [R] lubridate and intervals [Aug 2011]

Justin Haynes

2011-Aug-30 18:15 UTC

[R] lubridate and intervals

Hiya,

maybe there is a native R function for this and if so please let me know!

I have 2 data.frames with start and end dates, they read in as strings and I
am converting to POSIXct.  How can I check for overlap?

The end result ideally will be a single data.frame containing all the
columns of the other two with rows where there were date overlaps.


df1<-data.frame(start=as.POSIXct(paste('2011-06-01 ',1:20,':00',sep='')),
                end=as.POSIXct(paste('2011-06-01 ',1:20,':30',sep='')))
df2<-data.frame(start=as.POSIXct(paste('2011-06-01 ',rep(seq(1,20,2),2),':',
                                       sample(1:19,20,replace=T),sep='')),
                end=as.POSIXct(paste('2011-06-01 ',rep(seq(1,20,2),2),':',
                                     sample(20:50,20),sep='')))

I tried:
library(lubridate)

df1$interval<-new_interval(df1$start,df1$end)
> df1$interval[1]
[1] 2011-06-01 01:00:00 -- 2011-06-01 01:30:00
> df2$start[1]
[1] "2011-06-01 01:17:00 PDT"

but

> df2$start[1] %in% df1$interval[1]
[1] FALSE

This must be fairly straightforward and I just don't know where to look!


Thanks,
Justin


jim holtman

2011-Aug-31 00:29 UTC


There are 10 overlaps in the data:
> df1<-data.frame(start=as.POSIXct(paste('2011-06-01 ',1:20,':00',sep='')),
+                 end=as.POSIXct(paste('2011-06-01 ',1:20,':30',sep='')))
> df2<-data.frame(start=as.POSIXct(paste('2011-06-01 ',rep(seq(1,20,2),2),':',
+                                        sample(1:19,20,replace=T),sep='')),
+                 end=as.POSIXct(paste('2011-06-01 ',rep(seq(1,20,2),2),':',
+                                      sample(20:50,20),sep='')))
>
> # create a matrix where the 'start' adds 1 to a count and the 'end' subtracts 1
> # the second column is the df# and the 4th is the row number of the data
>
> x <- rbind(
+     cbind(df1$start, 1, 1, seq(nrow(df1))),
+     cbind(df1$end, 1, -1, seq(nrow(df1))),
+     cbind(df2$start, 2, 1, seq(nrow(df2))),
+     cbind(df2$end, 2, -1, seq(nrow(df2)))
+     )
> # sort by time
> x <- x[order(x[,1]),]
> # add the queue count; this is the number of items in a queue which is
> # used to determine any overlaps if the queue is greater than one
> x <- cbind(x, count = cumsum(x[,3]))
> # split the data into groups; a new group starts after the count returns to 0
> indx <- split(seq(nrow(x)), cumsum(c(FALSE, head(x[, 'count'], -1) == 0)))
> # keep groups of length > 2; these are the overlaps
> indx <- indx[sapply(indx, length) > 2]
> # get unique df# and row indices
> lapply(indx, function(a){
+     unique(paste(x[a, 2], x[a, 4], sep = ' - '))
+ })
$`0`
[1] "1 - 1"  "2 - 11" "2 - 1"

$`2`
[1] "1 - 3"  "2 - 12" "2 - 2"

$`4`
[1] "1 - 5"  "2 - 13" "2 - 3"

$`6`
[1] "1 - 7"  "2 - 14" "2 - 4"

$`8`
[1] "1 - 9"  "2 - 15" "2 - 5"

$`10`
[1] "1 - 11" "2 - 16" "2 - 6"

$`12`
[1] "1 - 13" "2 - 17" "2 - 7"

$`14`
[1] "1 - 15" "2 - 8"  "2 - 18"

$`16`
[1] "1 - 17" "2 - 19" "2 - 9"

$`18`
[1] "1 - 19" "2 - 20" "2 - 10"
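
The original post asked for a single data.frame holding the columns of both inputs for every overlapping pair. The following is one possible base R sketch of that step, not part of the reply above. It swaps the running-count approach for a brute-force pairwise comparison, which is fine at this size (20 x 20 candidate pairs). The names `pairs`, `hit`, `overlaps` and the `start1`/`end1`/`start2`/`end2` column labels are made up for illustration, and `set.seed()`, `sprintf()` and `tz = 'UTC'` are additions so the example is reproducible.

```r
# Rebuild the example data reproducibly (set.seed and tz = 'UTC' are additions)
set.seed(1)
fmt <- '%Y-%m-%d %H:%M'
hrs <- rep(seq(1, 20, 2), 2)
df1 <- data.frame(
  start = as.POSIXct(sprintf('2011-06-01 %02d:00', 1:20), format = fmt, tz = 'UTC'),
  end   = as.POSIXct(sprintf('2011-06-01 %02d:30', 1:20), format = fmt, tz = 'UTC'))
df2 <- data.frame(
  start = as.POSIXct(sprintf('2011-06-01 %02d:%02d', hrs,
                             sample(1:19, 20, replace = TRUE)),
                     format = fmt, tz = 'UTC'),
  end   = as.POSIXct(sprintf('2011-06-01 %02d:%02d', hrs, sample(20:50, 20)),
                     format = fmt, tz = 'UTC'))

# Every pairing of a df1 row (i) with a df2 row (j)
pairs <- expand.grid(i = seq_len(nrow(df1)), j = seq_len(nrow(df2)))

# Two closed intervals overlap when each one starts no later than the other ends
hit <- df1$start[pairs$i] <= df2$end[pairs$j] &
       df2$start[pairs$j] <= df1$end[pairs$i]

# One row per overlapping pair, carrying the columns of both data.frames
overlaps <- cbind(
  setNames(df1[pairs$i[hit], ], c('start1', 'end1')),
  setNames(df2[pairs$j[hit], ], c('start2', 'end2')))
rownames(overlaps) <- NULL
```

With this example data, `overlaps` has 20 rows: each of the 10 odd-hour rows of df1 is paired with two rows of df2, which matches the 10 groups listed above. For large inputs the full cross product gets expensive, and the sorted running-count approach scales better.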



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
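
On the lubridate side of the question: `%in%` is base R's set-membership test (built on `match()`), so it compares values for equality and does not ask whether a time lies between an interval's endpoints. A minimal sketch of the containment and overlap tests follows, assuming a lubridate release that provides `interval()` (which replaced `new_interval()` in later versions), `%within%` and `int_overlaps()`. The objects `a`, `b` and `t0` are made-up examples, not data from the thread.

```r
library(lubridate)

# Two example intervals and one instant (all in UTC)
a  <- interval(ymd_hm('2011-06-01 01:00'), ymd_hm('2011-06-01 01:30'))
b  <- interval(ymd_hm('2011-06-01 01:17'), ymd_hm('2011-06-01 01:45'))
t0 <- ymd_hm('2011-06-01 01:17')

# Does the instant fall inside the interval?
inside <- t0 %within% a

# Do the two intervals share any time?
shared <- int_overlaps(a, b)
```

Both `inside` and `shared` should come out TRUE here. Both functions are vectorised, so `int_overlaps()` can also be applied to interval vectors built from the start and end columns of two data.frames.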
