Hi,

I have recorded online/offline timestamps per user that look like this:

    username,online_time,offline_time
    a,2011-11-01 16:16:56.692572+01,2011-11-01 21:06:16.388903+01
    a,2011-11-01 21:07:14.204367+01,2011-11-01 21:34:21.47081+01
    a,2011-11-01 21:38:09.501356+01,2011-11-01 21:53:45.272321+01

For each user I want to get a probability distribution over the day, i.e. for each minute of a day I want the probability that the user is online.

I have come up with some helper functions that let me find the minute of the day and the duration of the online session:

    data <- read.table("availability.csv", header=TRUE, sep=",")

    # duration of each online session of one user, in minutes
    diff_online <- function(username)
    {
      rows <- which(data$username == username)
      user_on  <- strptime(data$online_time[rows],  format="%Y-%m-%d %H:%M:%S")
      user_off <- strptime(data$offline_time[rows], format="%Y-%m-%d %H:%M:%S")
      difftime(user_off, user_on, units="mins")
    }

    min.of.day <- function(dtstr) # minute of day
    {
      dt <- strptime(dtstr, format="%Y-%m-%d %H:%M:%S")
      h <- as.integer(strftime(dt, "%H"))
      m <- as.integer(strftime(dt, "%M"))
      s <- as.integer(strftime(dt, "%OS")) # seconds, currently unused
      h*60+m
    }

But there I am stuck. I thought of creating a factor of the minutes a user is online and using that to calculate a density, and had a couple of other ideas. But I strongly feel that there is some more straightforward solution available in R.

Thanks for any help,
wr
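To make the goal concrete, here is a rough sketch of the minute-counting idea, not a definitive solution. The function name online_prob is hypothetical. It assumes the timestamps can be treated as plain wall-clock times (the fractional seconds and the "+01" offset are ignored by the format string), that every session has both an online and an offline time, and that the observation window is every calendar day from the first login to the last logout.

```r
# Hypothetical helper: for one user, estimate for each minute of the day
# (0..1439) the fraction of observed days on which the user was online.
online_prob <- function(on, off, tz = "UTC") {
  fmt <- "%Y-%m-%d %H:%M:%S"
  # Trailing characters (fractional seconds, "+01") are ignored by strptime;
  # parsing as UTC keeps the arithmetic free of daylight-saving jumps.
  on  <- as.POSIXct(strptime(as.character(on),  fmt, tz = tz))
  off <- as.POSIXct(strptime(as.character(off), fmt, tz = tz))

  # Whole minutes since the epoch at which each session starts and ends
  m_on  <- floor(as.numeric(on)  / 60)
  m_off <- floor(as.numeric(off) / 60)

  # Every minute touched by any session, counted once even if sessions overlap
  mins <- unique(unlist(Map(seq, m_on, m_off)))

  # For each minute of the day, the number of days on which it was online
  counts <- tabulate(mins %% 1440 + 1, nbins = 1440)

  # Assumption: every calendar day between first login and last logout counts
  n_days <- floor(max(m_off) / 1440) - floor(min(m_on) / 1440) + 1
  counts / n_days
}
```

With the data frame from above, this could be applied per user with something like lapply(split(data, data$username), function(d) online_prob(d$online_time, d$offline_time)). Element i of each result corresponds to minute i-1 of the day. Note that these values are per-minute probabilities of being online, so they do not sum to 1 the way a density would.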