Stefan Uhmann
2010-Jul-23 10:02 UTC
[R] start and end times to yes/no in certain intervall
Hi List,

I have start and end times of events

structure(list(start = c("15:00", "15:00", "15:00", "11:00",
"14:00", "14:00", "15:00", "12:00", "12:00", "12:00", "12:00",
"12:00", "12:00", "12:00", "12:00", "12:00", "12:00", "12:00",
"12:00", "12:00"), end = c("16:00", "16:00", "16:00", "12:00",
"16:00", "15:00", "16:00", "13:00", "13:00", "13:00", "13:00",
"13:00", "13:00", "13:00", "13:00", "13:00", "13:00", "13:00",
"13:00", "13:00")), .Names = c("start", "end"), row.names = c(NA,
20L), class = "data.frame")

and I would like the data to look like this:

      t9   t10   t11   t12   t13   t14   t15   t16   t17
1  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
2  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
3  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
4  FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
5  FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
6  FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
7  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
8  FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
9  FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
10 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

This means that I get a TRUE for every hour in which the event was
taking place. A finishing time of 16:00 means that t16 is FALSE,
because the event had finished by 16:00; 16:15 as end time would
result in t16 being TRUE.
It would be nice if the function also added the variables needed
(t9 ..), depending on the times put in (no t9 if there is no event
starting before 10:00).

Thanks for any suggestion,
Stefan
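The rule described above can be sketched for a single event as follows. This is only an illustration under the assumption that hour slot h covers the half-open interval from h:00 up to (but not including) h+1:00; the helper names to_minutes() and covered_hours() are hypothetical and do not come from the thread.

```r
# Hypothetical helper: convert one "HH:MM" string to minutes since midnight.
to_minutes <- function(time) {
  parts <- strsplit(time, ":", fixed = TRUE)[[1]]
  as.integer(parts[1]) * 60L + as.integer(parts[2])
}

# Hypothetical helper: hour slots touched by one event.
# An event ending exactly at h:00 does not count for slot h.
covered_hours <- function(start, end) {
  s <- to_minutes(start)
  e <- to_minutes(end)
  if (e <= s) return(integer(0))   # empty or inverted interval: no slots
  first <- s %/% 60L               # slot containing the start
  last  <- (e - 1L) %/% 60L        # slot containing the last minute before the end
  seq.int(first, last)
}

covered_hours("15:00", "16:00")   # 15
covered_hours("15:00", "16:15")   # 15 16
```

Working in whole minutes avoids floating-point hour arithmetic, and subtracting one minute from the end time is what makes an end of 16:00 leave t16 unset.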
try this (with your data frame stored in x):

> char2hr <- function(time){
+     mat <- do.call(rbind, strsplit(time, ":"))
+     mode(mat) <- 'numeric'
+     mat %*% c(1, 1/60)  # convert to hours
+ }
> # convert to hours
> x.hr <- apply(x, 2, char2hr)
> # generate a set of sequences to set values
> x.seq <- apply(x.hr, 1, function(.hr) seq(.hr[1], .hr[2] - 1))
> # create output matrix
> result <- matrix(FALSE, nrow=nrow(x.hr), ncol=max(x.hr) + 1)
> colnames(result) <- sprintf("t%02d", seq(0, length=ncol(result)))
> # set the values
> for (i in seq_along(x.seq)){
+     result[i, x.seq[[i]] + 1] <- TRUE
+ }
> result
        t00   t01   t02   t03   t04   t05   t06   t07   t08   t09   t10   t11   t12   t13   t14   t15   t16
 [1,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [3,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
 [5,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE
 [6,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
 [8,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
 [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[10,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[11,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[12,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[13,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[14,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[15,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[16,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[17,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[18,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[19,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[20,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE

You can add limits for the range of columns that you want.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
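One caveat about the apply() step in the reply above: apply() simplifies its result, so x.seq is a list only when the events cover differing numbers of hours, as they do in the posted data. If every event covered the same number of hours (more than one), apply() would return a matrix and the loop would index it incorrectly. A minimal sketch that always yields a list is shown below; the two-row data frame is made up for illustration, and the use of Map() and drop() is a substitution, not part of the original reply.

```r
# Sample data for illustration only: both events last two hours,
# the case in which apply() would return a matrix rather than a list.
x <- data.frame(start = c("14:00", "10:00"),
                end   = c("16:00", "12:00"),
                stringsAsFactors = FALSE)

char2hr <- function(time) {
  mat <- do.call(rbind, strsplit(time, ":"))
  mode(mat) <- "numeric"
  drop(mat %*% c(1, 1/60))   # decimal hours as a plain vector
}

start.hr <- char2hr(x$start)
end.hr   <- char2hr(x$end)

# Map() always returns a list, one sequence of hours per event
x.seq <- Map(function(s, e) seq(s, e - 1), start.hr, end.hr)

result <- matrix(FALSE, nrow = nrow(x), ncol = max(end.hr) + 1)
colnames(result) <- sprintf("t%02d", seq(0, length.out = ncol(result)))
for (i in seq_along(x.seq)) {
  result[i, x.seq[[i]] + 1] <- TRUE
}
result[, c("t10", "t11", "t14", "t15")]
```

The sketch keeps the original's handling of times and is only exercised here with start and end times on the hour.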
Allan Engelhardt
2010-Jul-23 13:17 UTC
[R] start and end times to yes/no in certain intervall
I like loops for this kind of thing so here is one:

df <- structure(list(start = c("15:00", "15:00", "15:00", "11:00",
"14:00", "14:00", "15:00", "12:00", "12:00", "12:00", "12:00",
"12:00", "12:00", "12:00", "12:00", "12:00", "12:00", "12:00",
"12:00", "12:00"), end = c("16:00", "16:00", "16:00", "12:00",
"16:00", "15:00", "16:00", "13:00", "13:00", "13:00", "13:00",
"13:00", "13:00", "13:00", "13:00", "13:00", "13:00", "13:00",
"13:00", "13:00")), .Names = c("start", "end"), row.names = c(NA,
20L), class = "data.frame")

# Number of whole hours each event lasts
duration <- with(df, floor(
    as.numeric(
        difftime(strptime(end, format="%H:%M"),
                 strptime(start, format="%H:%M"),
                 units = "hours"))))
# Hour of day in which each event starts
start <- as.integer(substr(df$start, 1, 2))
# One row per event, one column per hour of the day
a <- matrix(FALSE, nrow = NROW(df), ncol = 24,
            dimnames = list(1:NROW(df), paste("t", 0:23, sep = "")))
for (i in 1:NROW(df)) {
    names <- paste("t", seq(start[i], by = 1L, length.out = duration[i]), sep = "")
    a[i, names] <- TRUE
}
r <- range(which(apply(a, 2, any)))
a <- a[, r[1]:r[2]]  # Drop columns we do not need
head(a)
#     t11   t12   t13   t14   t15
# 1 FALSE FALSE FALSE FALSE  TRUE
# 2 FALSE FALSE FALSE FALSE  TRUE
# 3 FALSE FALSE FALSE FALSE  TRUE
# 4  TRUE FALSE FALSE FALSE FALSE
# 5 FALSE FALSE FALSE  TRUE  TRUE
# 6 FALSE FALSE FALSE  TRUE FALSE

Hope this helps a little.

Allan
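The question also asks that an end time such as 16:15 set t16, and that only the needed columns be created. A loop-free sketch of one way to get both is given below. It assumes the half-open hour slots described in the question, works in minutes, and uses outer() for an interval-overlap test; the helper to_min() is hypothetical, and the data are the first six events from the question, shortened for brevity.

```r
# First six events from the question, shortened for brevity.
df <- data.frame(start = c("15:00", "15:00", "15:00", "11:00", "14:00", "14:00"),
                 end   = c("16:00", "16:00", "16:00", "12:00", "16:00", "15:00"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: "HH:MM" strings to minutes since midnight
to_min <- function(time) {
  parts <- do.call(rbind, strsplit(time, ":", fixed = TRUE))
  as.integer(parts[, 1]) * 60L + as.integer(parts[, 2])
}

s <- to_min(df$start)
e <- to_min(df$end)

# Only the hour slots between the earliest start and the latest end
hours <- seq.int(min(s) %/% 60L, (max(e) - 1L) %/% 60L)

# Event i occupies slot h if it starts before the slot ends
# and ends after the slot starts
a <- outer(s, hours, function(s, h) s < (h + 1L) * 60L) &
     outer(e, hours, function(e, h) e > h * 60L)
dimnames(a) <- list(seq_len(nrow(df)), paste0("t", hours))
a
```

For these six events the columns run from t11 to t15. Because the comparison is made in minutes, an end time of 16:15 would extend the range of slots to include t16 and mark it TRUE for that event.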