Dear Colleagues,

I have a data set that looks as below. I'd like to count the number of dates in a series of arbitrary ranges (breaks), i.e. not pre-defined breaks such as months, quarters or years. table(format()) produces ideally formatted output, but table() does not appear to accept arbitrary ranges. I also tried converting the dates to numeric and using histogram to try to get the data, but that doesn't work either. Cut appears to accept an arbitrary range, but I could only get it to produce NAs.

Any suggestions?
Yours, Simon Kiss

mydata<-list(x=seq(as.Date("2007-05-01"), as.Date("2009-09-10"),"days"), y=seq(as.Date("2007-06-16"), as.Date("2009-11-12"),"days"))
table(format(mydata[[1]], "%Y"))
t_1<-hist(as.numeric(mydata[[1]], breaks=c("14056", "14421")))$counts
cut(mydata[[1]], breaks=c(as.Date("2008-06-26"), ("2009=06-26")))

*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606

Hi:

Perhaps you were looking for something like this:

table(cut(mydata[[1]],
          breaks = seq(from = as.Date("2008-06-26"),
                       to = as.Date("2009-06-26"), by = 'month')))

2008-06-26 2008-07-26 2008-08-26 2008-09-26 2008-10-26 2008-11-26 2008-12-26
        30         31         31         30         31         30         31
2009-01-26 2009-02-26 2009-03-26 2009-04-26 2009-05-26
        31         28         31         30         31

Considering that your data range from May 2007 to mid-November 2009, one way to generate monthly tables (or any set of common breaks you want) for all components of the list is as follows:

f <- function(x) table(cut(x,
         breaks = seq(from = as.Date('2007-05-01'),
                      to = as.Date('2009-12-01'), by = 'month')))
lapply(mydata, f)

This is simply intended to get you started in case you wanted to map your problem across multiple list components.

HTH,
Dennis

On Mon, Jan 17, 2011 at 11:16 PM, Simon Kiss <simonjkiss@yahoo.ca> wrote:
> I'd like to count the number of dates in a series of arbitrary ranges
> (breaks) i.e. not pre-defined breaks such as months, quarters or years.
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
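
For unevenly spaced ranges, the same cut() approach appears to carry over when breaks is itself a vector of Dates rather than a regular seq(). The following is only a minimal sketch: the break dates in brks are made up for illustration and are not from the original post. Values that fall outside the outermost breaks come back as NA, so the breaks here are chosen to span the whole data range.

```r
# Same data as in the original post
mydata <- list(x = seq(as.Date("2007-05-01"), as.Date("2009-09-10"), "days"),
               y = seq(as.Date("2007-06-16"), as.Date("2009-11-12"), "days"))

# Hypothetical, unevenly spaced break points; every element is a Date
brks <- as.Date(c("2007-05-01", "2008-06-26", "2009-06-26", "2009-12-01"))

# cut() on Dates uses right = FALSE by default, so each interval includes
# its left break and excludes its right break
counts <- lapply(mydata, function(d) table(cut(d, breaks = brks)))

counts$x
counts$y
```

Each table is labelled by the left end of its interval, and the counts for x should add up to 864, the length of that series. If some counts are missing, a quick check is sum(is.na(cut(d, breaks = brks))), which shows how many dates fell outside the breaks.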

----------------------------------------
> From: simonjkiss at yahoo.ca
> Date: Tue, 18 Jan 2011 02:16:37 -0500
> To: r-help at r-project.org
> Subject: [R] Counting dates in arbitrary ranges
>
> [...]
> cut(mydata[[1]], breaks=c(as.Date("2008-06-26"), ("2009=06-26")))

Well, with POSIXct I guess you could do things like this (not sure about POSIXct vs Date, but maybe someone will comment). I tried to remove my typos, leaving in the informative error messages, but this is just a dump of what I tried. Note that you also left out an as.Date in the cut command. I would think the "rx" thing at the bottom would be of use to you:

> str(mydata)
List of 2
 $ x:Class 'Date'  num [1:864] 13634 13635 13636 13637 13638 ...
 $ y:Class 'Date'  num [1:881] 13680 13681 13682 13683 13684 ...
> z=as.POSIXct(mydata$x)
> str(z)
 POSIXct[1:864], format: "2007-04-30 19:00:00" "2007-05-01 19:00:00" ...
> w=(z<"2008-05-01 12:34:56")
> length(which(w==TRUE))
[1] 864
> w=(z<as.POSIXct("2008-05-01 12:34:56"))
> length(which(w==FALSE))
[1] 497
> d1=as.POSIXct("2008-05-01 12:34:56")
> str(d1)
 POSIXct[1:1], format: "2008-05-01 12:34:56"
> d2=as.POSIXct("2009-05-01 08:34:56")
> rx=d1:d2
> str(rx)
 int [1:31521601] 1209663296 1209663297 1209663298 1209663299 1209663300 1209663301 1209663302 1209663303 1209663304 1209663305 ...
> length(rx)
[1] 31521601
> as.POSIXct(rx[10])
Error in as.POSIXct.numeric(rx[10]) : 'origin' must be supplied
> as.POSIXct(rx[10],origin="1970-01-01")
[1] "2008-05-01 18:35:05 CDT"
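
On the POSIXct vs Date question raised above: the "2007-04-30 19:00:00" in the str(z) output suggests the conversion shifted each date into local time, and d1:d2 produced one integer per second with the class dropped. A sketch of the same counting idea that stays in class Date may avoid both issues. The endpoints d1 and d2 below are illustrative assumptions, not values anyone in the thread proposed.

```r
# The x series from the original post
x <- seq(as.Date("2007-05-01"), as.Date("2009-09-10"), "days")

# Hypothetical range endpoints, kept as Date so no time zone is involved
d1 <- as.Date("2008-05-01")
d2 <- as.Date("2009-05-01")

# Comparisons work directly on Date; sum() counts the TRUE values.
# The range includes d1 and excludes d2.
n_in_range <- sum(x >= d1 & x < d2)

# One element per day between the endpoints, still of class Date
# (d1:d2 would return plain numbers instead)
rx <- seq(d1, d2, by = "day")

n_in_range
class(rx)
```

For several ranges at once, table(cut(x, breaks = ...)) with a Date vector of breaks does the same job in one call; the comparison form is mainly handy for a single ad hoc range.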