Hello, Thank you for your help/advice! The issue here is speed/efficiency. I can do what I want, but its really slow. The goal is to have the ability to do calculations on my data and have it adjusted for look-ahead. I see two ways to do this: (I'm open to more ideas. My terminology: Unadjusted = values not adjusted for look-ahead bias; adjusted = values adjusted for look-ahead bias.) 1) I could a) do calculations on unadjusted values then b) adjust the resulting values for look-ahead bias. Here is what I mean: a) I could say the following using time series of val1: [(val1 - val1 4 periods ago) / val1 4 periods ago] = resultval. ("Periods" correspond to the z.dates in my example below.) b) Then I would adjust the resultval for look-ahead based on val1's associated report date. Note: I don't think this will be the fastest. 2) I could do the same calculation [(val1 - val1 4 periods ago) / val1 4 periods ago] = resultval, but my calculation function would get the 'right' values that would have no look-ahead bias. I'm not sure how I would do this, but maybe a query starting with the date that I want, indexed to appropriate report date indexed to the correct value to return. But how do I do this in R? I think I would have to put this in our database and do a query. The data comes to me in RData format. I could put it all in our database via PpgSQL which we already use. Note: I think this will be fastest. Anyway, my first attempt at this was to solve part b of #1 above. Here is how my data looks and my first attempt at solving part b of idea #1 above. It only takes 0.14 seconds for my mock data, but that is way too slow. The major things slowing it down A) the loop, B) the merge statement. # mock data: this is how it comes to me (raw) # in practice I have over 10,000 columns # the starting 'periods' for my data z.dates c("2007-03-31","2007-06-30","2007-09-30","2007-12-31","2008-03-31","2008-06-30","2008-09-30","2008-12-31") nms = c("A","B","C","D") # these are the report dates that are the real days the data was available rd1 matrix(c("20070514","20070814","20071115","20080213","20080514","20080814","20081114","20090217", "20070410","20070709","20071009","20080109","20080407","20080708","20081007","20090112", "20070426","--","--","--","--","--","--","20090319", "--","--","--","--","--","--","--","--"), nrow=8,ncol=4) dimnames(rd1) = list(z.dates,nms) # this is the unadjusted raw data, that always has the same dimensions, rownames, and colnames as the report dates ua = matrix(c(640.35,636.16,655.91,657.41,682.06,702.90,736.15,667.65, 2625.050,2625.050,2645.000,2302.000,1972.000,1805.000,1547.000,1025.000, NaN, NaN,-98.426,190.304,180.894,183.220,172.520, 144.138, NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN), nrow=8,ncol=4) dimnames(ua) = list(z.dates,nms) ################################# change anything below. I can't change anything above this line. # My first attempt at this was to solve part b of #1 above. fix = function(x) { year = substring(x, 1, 4); mo = substring(x, 5, 6); day = substring(x, 7, 8); ifelse(year=="--", "NA", paste(year, mo, day, sep = "-")) } rd = apply(rd1, 2, fix) dimnames(rd) = dimnames(eps_rd) dt1 <- seq(from =as.Date(z.dates[1]), to = as.Date("2009-03-25"), by "day") dt = sapply(dt1, as.character) fin = dt ck_rows = length(dt) bad = character(0) start_t_all = Sys.time() for(cn in 1:ncol(ua)){ uac = ua[,cn] tkr = colnames(ua)[cn] rdc = rd[,cn] ua_rd = cbind(uac,rdc) colnames(ua_rd) = c(tkr,'rt_date') xx1 = merge(dt,ua_rd,by.x=1,by.y= 'rt_date',all.x=T) xx = as.character(xx1[,2]) values <- c(NA, xx[!is.na(xx)]) ind = cumsum(!is.na(xx)) + 1 y <- values[ind] if(ck_rows == length(y)){ fin = data.frame(fin,y) }else{ bad = c(bad,tkr) } } colnames(fin) = c('daily_dates',nms) # after this I would slice and dice the data into weekly, monthly, etc. periodicity as needed, but this leaves it in daily format which is as granular as I will get. print("over all time for loop") print(Sys.time()-start_t_all) Regards, Ben [[alternative HTML version deleted]]