Dear Colleagues,
I have a data set that looks as below. I'd like to count the number of dates
in a series of arbitrary ranges (breaks) i.e. not pre-defined breaks such as
months, quarters or years. table(format()) produces ideally formatted output,
but table() does not appear to accept arbitrary ranges.
I also tried converting the dates to numeric and using histogram to try to get
the data, but that doesn't work either. Cut appears to accept an arbitrary
range, but I could only get it to produce NAs.
Any suggestions? Yours, Simon Kiss
mydata <- list(x = seq(as.Date("2007-05-01"), as.Date("2009-09-10"), "days"),
               y = seq(as.Date("2007-06-16"), as.Date("2009-11-12"), "days"))
table(format(mydata[[1]], "%Y"))
t_1 <- hist(as.numeric(mydata[[1]], breaks=c("14056", "14421")))$counts
cut(mydata[[1]], breaks=c(as.Date("2008-06-26"), ("2009=06-26")))
*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606
Hi:
Perhaps you were looking for something like this:
table(cut(mydata[[1]], breaks = seq(from = as.Date("2008-06-26"),
                                    to = as.Date("2009-06-26"),
                                    by = 'month')))
2008-06-26 2008-07-26 2008-08-26 2008-09-26 2008-10-26 2008-11-26 2008-12-26
        30         31         31         30         31         30         31
2009-01-26 2009-02-26 2009-03-26 2009-04-26 2009-05-26
        31         28         31         30         31
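If the ranges really are arbitrary rather than monthly, the same idea seems to work with a hand-written vector of break dates. The following is only a minimal sketch: it rebuilds mydata from the original post, and the break dates and the name brks are made up for illustration. Every break has to be a Date, and any date that falls outside the outermost breaks comes back as NA.

```r
# Rebuild the example data from the original post
mydata <- list(x = seq(as.Date("2007-05-01"), as.Date("2009-09-10"), by = "day"),
               y = seq(as.Date("2007-06-16"), as.Date("2009-11-12"), by = "day"))

# Hypothetical, unevenly spaced break dates (all converted with as.Date)
brks <- as.Date(c("2007-05-01", "2008-06-26", "2009-06-26", "2009-12-31"))

# cut() on Dates uses left-closed intervals by default:
# [2007-05-01, 2008-06-26), [2008-06-26, 2009-06-26), [2009-06-26, 2009-12-31)
counts <- table(cut(mydata$x, breaks = brks))
counts
```

Each interval is labelled by its starting date. Because these breaks cover the whole of mydata$x, the counts should add up to length(mydata$x).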
Since your data run from May 2007 to mid-November 2009, one way to generate
monthly tables (or tables for any common set of breaks you want) for all
components of the list is as follows:
f <- function(x) table(cut(x, breaks = seq(from = as.Date('2007-05-01'),
                                           to = as.Date('2009-12-01'),
                                           by = 'month')))
lapply(mydata, f)
This is simply intended to get you started in case you wanted to map your
problem across multiple list components.
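The same mapping can probably be done with breaks that are not evenly spaced. This is a rough sketch under stated assumptions: the break dates in brks and the helper name count_in_ranges are hypothetical, chosen only so that the breaks cover both list components.

```r
# Rebuild the example data from the original post
mydata <- list(x = seq(as.Date("2007-05-01"), as.Date("2009-09-10"), by = "day"),
               y = seq(as.Date("2007-06-16"), as.Date("2009-11-12"), by = "day"))

# Hypothetical break dates spanning both components
brks <- as.Date(c("2007-05-01", "2008-06-26", "2009-06-26", "2009-12-31"))

# Hypothetical helper: counts per interval for one Date vector
count_in_ranges <- function(d, breaks) table(cut(d, breaks = breaks))

# sapply() simplifies the equal-length tables to a matrix:
# one row per interval (labelled by its start date), one column per component
res <- sapply(mydata, count_in_ranges, breaks = brks)
res
```

Using sapply() rather than lapply() is just a convenience here; since every component is cut with the same breaks, the results line up and can be compared side by side.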
HTH,
Dennis
On Mon, Jan 17, 2011 at 11:16 PM, Simon Kiss <simonjkiss@yahoo.ca> wrote:
> [...]
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
----------------------------------------
> From: simonjkiss at yahoo.ca
> Date: Tue, 18 Jan 2011 02:16:37 -0500
> To: r-help at r-project.org
> Subject: [R] Counting dates in arbitrary ranges
>
> [...]

Well, with POSIXct I guess you could do things like this (not sure about
POSIXct vs Date, but maybe someone would comment). I tried to remove my typos
(note that you left out an as.Date in the cut cmd too, etc.), leaving in
informative error messages, but this is just a dump of what I tried. I would
think the "rx" thing at the bottom would be of use to you.

> str(mydata)
List of 2
 $ x:Class 'Date'  num [1:864] 13634 13635 13636 13637 13638 ...
 $ y:Class 'Date'  num [1:881] 13680 13681 13682 13683 13684 ...
> z=as.POSIXct(mydata$x)
> str(z)
 POSIXct[1:864], format: "2007-04-30 19:00:00" "2007-05-01 19:00:00" ...
> w=(z<"2008-05-01 12:34:56")
> length(which(w==TRUE))
[1] 864
> w=(z<as.POSIXct("2008-05-01 12:34:56"))
> length(which(w==FALSE))
[1] 497
> d1=as.POSIXct("2008-05-01 12:34:56")
> str(d1)
 POSIXct[1:1], format: "2008-05-01 12:34:56"
> d2=as.POSIXct("2009-05-01 08:34:56")
> rx=d1:d2
> str(rx)
 int [1:31521601] 1209663296 1209663297 1209663298 1209663299 1209663300 1209663301 1209663302 1209663303 1209663304 1209663305 ...
> length(rx)
[1] 31521601
> as.POSIXct(rx[10])
Error in as.POSIXct.numeric(rx[10]) : 'origin' must be supplied
> as.POSIXct(rx[10],origin="1970-01-01")
[1] "2008-05-01 18:35:05 CDT"
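For what it's worth, the comparison idea seems to work on Date objects directly, which would sidestep the time-zone shift visible in the str(z) output. This is a rough sketch rather than a tested recipe for your real data: x is rebuilt from the original post, and d1 and d2 are illustrative endpoints (here, the two dates from the original cut() attempt).

```r
# First component of the example data from the original post
x <- seq(as.Date("2007-05-01"), as.Date("2009-09-10"), by = "day")

# Illustrative range endpoints, both proper Date objects
d1 <- as.Date("2008-06-26")
d2 <- as.Date("2009-06-26")

# Number of dates in the half-open range [d1, d2)
n_in_range <- sum(x >= d1 & x < d2)
n_in_range

# findInterval() handles several breaks at once on the underlying day numbers:
# 0 = before d1, 1 = in [d1, d2), 2 = on or after d2
bin <- findInterval(as.numeric(x), as.numeric(c(d1, d2)))
table(bin)
```

Unlike cut(), findInterval() does not return NA for values outside the breaks; it puts them in the first and last bins, so nothing is silently dropped from the table.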