davideps
2012-Apr-04 18:00 UTC
[R] crosstabs and histograms with flexible binning of dates
Hi,

First, thank you to Duncan Mackay for getting me started processing dates with R. Unfortunately, I need to do a little more than I initially expected. I have 5K lines of data that look like this:

ID    AREA       DATE
0001  Center     2010-10-15
0002  Center     2010-01-02
0003  NorthWest  2010-02-05
0004  SouthWest  2010-05-11

I would like to write a script that creates crosstabs like the one below, but that (1) could easily be used to create small multiples with lattice or ggplot2 and (2) provides flexible binning options, such as monthly bins starting from a specific day. Should I build the crosstab manually, or can I use a histogram function to generate it on the way to generating a graphic?

AREA       1/2010-3/2010  4/2010-6/2010  7/2010-9/2010  10/2010-12/2010
Center                 1              0              0                1
NorthWest              1              0              0                0
SouthWest              0              1              0                0

Below is my code to handle the arbitrary bins, but I'm guessing there are useful libraries and more elegant approaches. Any pointers would be appreciated.

library(foreign)  # provides read.dbf()

# LOAD FILE
#parcels=read.dbf()  # depending on source file
parcels=read.delim("~/Projects/GIS_DATA/Parcels_NSP_BlockGroup.txt")
attach(parcels)

# DEFINE BINNING
basedate=as.Date("2011/05/11")
currentdate=basedate
interval=3       # width of interval in months. 3 = quarterly
num_intervals=5  # how many intervals to include after basedate

for (i in seq_len(num_intervals)) {
  startdate=currentdate
  # Build a sequence of interval + 1 monthly dates and take the last one,
  # so that enddate falls exactly "interval" months after startdate.
  enddate=seq(startdate, by="month", length.out=interval+1)[interval+1]
  # crosstab construction of single column here
  # add column to final dataframe
  currentdate=enddate
}

Thank you,
-david
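
A minimal sketch of one more compact direction, assuming the DATE column parses with as.Date(): seq() can generate the bin boundaries, cut() can assign each date to a bin, and table() can build the crosstab in one step. The four sample rows stand in for the real file, and the 2010-01-01 base date and the names month_label, bin_labels, PERIOD, and counts_long are illustrative assumptions rather than part of the script above.

```r
# Sample rows from the post; the real data would come from read.delim().
parcels <- data.frame(
  ID   = c("0001", "0002", "0003", "0004"),
  AREA = c("Center", "Center", "NorthWest", "SouthWest"),
  DATE = as.Date(c("2010-10-15", "2010-01-02", "2010-02-05", "2010-05-11"))
)

basedate      <- as.Date("2010-01-01")  # assumed start of the first bin
interval      <- 3                      # bin width in months (3 = quarterly)
num_intervals <- 4                      # number of bins after basedate

# num_intervals + 1 break points delimit num_intervals bins.
breaks <- seq(basedate, by = paste(interval, "months"),
              length.out = num_intervals + 1)

# Label each bin by its first and last month, e.g. "1/2010-3/2010".
# as.integer() strips the leading zero from the month number.
month_label <- function(d) {
  paste0(as.integer(format(d, "%m")), "/", format(d, "%Y"))
}
bin_labels <- paste0(month_label(breaks[-length(breaks)]), "-",
                     month_label(breaks[-1] - 1))

# right = FALSE makes each bin [start, end); dates outside the range become NA.
parcels$PERIOD <- cut(parcels$DATE, breaks = breaks,
                      labels = bin_labels, right = FALSE)

# Rows are areas, columns are date bins (empty bins are kept as zero counts).
crosstab <- table(parcels$AREA, parcels$PERIOD)
print(crosstab)

# Long format (one row per AREA/PERIOD pair) for lattice or ggplot2 panels.
counts_long <- as.data.frame(crosstab)
names(counts_long) <- c("AREA", "PERIOD", "COUNT")
```

Changing basedate, interval, or num_intervals changes the binning without touching the rest. The long-format counts_long should be usable for small multiples, for example with facet_wrap(~ AREA) in ggplot2 or a formula such as COUNT ~ PERIOD | AREA in lattice. When basedate is not the first of a month, the labels here still show only month and year, so they would need adjusting to show exact boundaries.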