Try this (it would have been easier if you had used 'dput' to post
your data):
x <- read.table('/temp/example.txt', skip = 1, as.is = TRUE)
# convert to POSIXct
x$beg <- as.POSIXct(paste(x$V4, x$V5))
x$end <- as.POSIXct(paste(x$V6, x$V7))
# determine breaks over midnight
x$over <- format(x$beg, "%d") != format(x$end, "%d")
x$V1 <- x$V4 <- x$V5 <- x$V6 <- x$V7 <- NULL  # remove extra columns
# put names on columns
names(x) <- c('phaseno', 'activity', 'phasetime',
'beg', 'end', 'over')
# extract records that extend over midnight
overSet <- subset(x, over)
normalSet <- subset(x, !over)
newSet <- do.call(rbind, lapply(seq_along(overSet$over), function(.row){
    # for each row, make two copies so you can change them individually
    .data <- overSet[c(.row, .row), ]  # two copies of the row
    # first copy ends at midnight of the day on which the phase ends
    .data$end[1] <- trunc(.data$end[1], units = 'days')
    .data$phasetime[1] <- as.numeric(.data$end[1]) - as.numeric(.data$beg[1])
    # second copy begins at that same midnight
    .data$beg[2] <- .data$end[1]
    .data$phasetime[2] <- as.numeric(.data$end[2]) - as.numeric(.data$beg[2])
    .data
}))
# combine the data and then sort by 'beg'
result <- rbind(normalSet, newSet)
result <- result[order(result$beg), ]
output:
    phaseno activity phasetime                 beg                 end  over
1         1        L     61033 2010-06-01 00:21:00 2010-06-01 17:18:13 FALSE
2         2        D      7907 2010-06-01 17:18:14 2010-06-01 19:30:01 FALSE
3         3        L       395 2010-06-01 19:30:02 2010-06-01 19:36:37 FALSE
4         4        D     15802 2010-06-01 19:36:38 2010-06-02 00:00:00  TRUE
4.1       4        D      2693 2010-06-02 00:00:00 2010-06-02 00:44:53  TRUE
5         5        W        40 2010-06-02 00:44:54 2010-06-02 00:45:34 FALSE
6         6        D      6425 2010-06-02 00:45:35 2010-06-02 02:32:40 FALSE
7         7        L       379 2010-06-02 02:32:41 2010-06-02 02:39:00 FALSE
8         8        D      1414 2010-06-02 02:39:01 2010-06-02 03:02:35 FALSE
9         9        W        73 2010-06-02 03:02:36 2010-06-02 03:03:49 FALSE
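
Once the rows are split at midnight, the per-day comparison you were after should be a simple tabulation. Here is a minimal sketch of that step. It assumes a data frame shaped like 'result' above; the four rows are copied from the output, the 'day' column and the names 'counts' and 'secs' are only for illustration, and tz = "UTC" is my assumption so the example runs the same everywhere.

```r
# stand-in for a few rows of 'result' (values copied from the output above)
result <- data.frame(
    phaseno   = c(3, 4, 4, 5),
    activity  = c("L", "D", "D", "W"),
    phasetime = c(395, 15802, 2693, 40),
    beg = as.POSIXct(c("2010-06-01 19:30:02", "2010-06-01 19:36:38",
                       "2010-06-02 00:00:00", "2010-06-02 00:44:54"),
                     tz = "UTC"),
    stringsAsFactors = FALSE
)

# calendar day on which each (already split) phase begins
result$day <- format(result$beg, "%Y-%m-%d")

# number of phases per day and activity
counts <- table(result$day, result$activity)

# total seconds per day and activity (NA where an activity did not occur)
secs <- tapply(result$phasetime, list(result$day, result$activity), sum)

print(counts)
print(secs)
```

Because phase 4 is now two rows, its 15802 seconds count toward 2010-06-01 and its 2693 seconds toward 2010-06-02, instead of the whole phase landing on one date.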
On Wed, Nov 16, 2011 at 2:41 PM, PEL <pierre-etienne.lessard.1 at ulaval.ca> wrote:
> Hello all,
>
> I have a data frame that looks like this:
>
> http://r.789695.n4.nabble.com/file/n4077622/Capture.png
>
> I would like to know if it's possible to split a single row into two rows
> when the time frame between "beg" and "end" overlaps midnight. I want to
> compare the frequency of each activity for each day so a row for a phase
> that overlaps on two dates unbalances the graphs I create with this data.
>
> Ex:
> From the original row:
>
> http://r.789695.n4.nabble.com/file/n4077622/Capture2.png
>
> Note: "phasetime" is only a difftime between "end" and "beg".
>       "phaseno" and "activity" should stay the same for the two new
>       lines.
>
> Here is a sample of my data that covers a few days:
> http://r.789695.n4.nabble.com/file/n4077622/example.txt example.txt
>
> Thank you to anyone who takes the time to read this and any idea will be
> welcome
>
> PEL
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Splitting-row-in-function-of-time-tp4077622p4077622.html
> Sent from the R help mailing list archive at Nabble.com.
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.