Hi,
I have recorded online/offline timestamps per user that look like this:
username,online_time,offline_time
a,2011-11-01 16:16:56.692572+01,2011-11-01 21:06:16.388903+01
a,2011-11-01 21:07:14.204367+01,2011-11-01 21:34:21.47081+01
a,2011-11-01 21:38:09.501356+01,2011-11-01 21:53:45.272321+01
For each user I want to get a probability distribution over the day, i.e.
for each minute of a day I want the probability that the user is online.
I have come up with some helper functions that let me find the minute of
the day and the duration of the online session:
data <- read.table("availability.csv", header=TRUE, sep=",",
                   stringsAsFactors=FALSE)
diff_online <- function(username) # session durations in minutes
{
  rows <- data$username == username
  # strptime ignores the trailing fractional seconds and UTC offset
  user_on <- strptime(data$online_time[rows], format="%Y-%m-%d %H:%M:%S")
  user_off <- strptime(data$offline_time[rows], format="%Y-%m-%d %H:%M:%S")
  difftime(user_off, user_on, units="mins")
}
min.of.day <- function(dtstr) # minute of day, 0..1439
{
  dt <- strptime(dtstr, format="%Y-%m-%d %H:%M:%S")
  dt$hour*60 + dt$min
}
But there I am stuck. I thought of creating a factor of the minutes a user
is online and using that to calculate a density, and had a couple of other
ideas. But I strongly feel that there is a more straightforward solution
available in R.
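To make the "count the minutes" idea concrete, here is a rough sketch of
what I have in mind, run on the three sample rows above with the fractional
seconds and offset dropped. The function name minute_prob and its n_days
argument are placeholders of mine, and it assumes that no session crosses
midnight and that n_days is the number of days the user was observed.

```r
# Hypothetical helper: for each minute of the day (0..1439), the fraction of
# observed days on which the user was online during that minute.
# Assumptions: sessions never cross midnight; n_days is supplied by the caller.
minute_prob <- function(on, off, n_days) {
  on  <- as.POSIXlt(on)
  off <- as.POSIXlt(off)
  start <- on$hour * 60 + on$min     # first minute of each session
  end   <- off$hour * 60 + off$min   # last minute of each session
  day   <- format(on, "%Y-%m-%d")
  # one row per (day, minute) in which the user was online; unique() makes
  # sure a minute touched by two sessions on the same day counts only once
  covered <- unique(do.call(rbind, lapply(seq_along(start), function(i) {
    data.frame(day = day[i], minute = start[i]:end[i])
  })))
  # minutes are 0-based, tabulate() bins are 1-based
  tabulate(covered$minute + 1L, nbins = 1440L) / n_days
}

fmt <- "%Y-%m-%d %H:%M:%S"
on  <- strptime(c("2011-11-01 16:16:56", "2011-11-01 21:07:14",
                  "2011-11-01 21:38:09"), fmt, tz = "UTC")
off <- strptime(c("2011-11-01 21:06:16", "2011-11-01 21:34:21",
                  "2011-11-01 21:53:45"), fmt, tz = "UTC")
p <- minute_prob(on, off, n_days = 1)
```

Here p[m + 1] would be the estimate for minute m of the day. This treats a
minute as "online" if any part of it falls inside a session, which may or
may not be the right convention.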
Thanks for any help,
wr