I basically has a long data.frame a. but I only need three columns x,y. Let us say the index of row is t. I need to produce new column s_t as the linear regression coefficient of (x_(t-60),...x_(t-1)) on (y_(t-60),...,y_(t-1)). The data is about 140,000 rows. I wrote a simple code on this which is super slow, it takes more than 2 hours on a 2.8Ghz Intel Duo Core. My friend use SAS and his code needs only couple of minutes. I know there must be some more efficient way to write it. Can anyone help me on this? Here is the code. Also one line produce a complete NA temp$y and lm function failed on that. How to make it just produce a NA instead and keep runing? attach(return) betat=rep(NA,length(RET)) for (i in 61:length(RET)){cat(i," "); if (year[[i]]>=1995){ temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)]) betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]] #if (i%%100==0) cat(i," "); return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]] } }
Using runmean from caTools the first one below does it in under 1 second but will not handle NAs. The second one takes under 15 seconds and handles them by replacing them with linear approximations. Note that k must be odd. # 1 library(caTools) set.seed(1) system.time({ y <- rnorm(140001) x <- as.numeric(seq(y)) k <- 61 Mxy <- runmean(x * y, k) Mxx <- runmean(x * x, k) Mx <- runmean(x, k) My <- runmean(y, k) b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) a <- My - b * Mx }) # 2 library(caTools) library(zoo) set.seed(1) system.time({ y <- rnorm(140000) x <- as.numeric(seq(y)) x[100:200] <- NA x <- na.approx(zoo(x)) y <- zoo(y) k <- 60 Mxy <- runmean(x * y, k) Mxx <- runmean(x * x, k) Mx <- runmean(x, k) My <- runmean(y, k) b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) a <- My - b * Mx }) On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> wrote:> I basically has a long data.frame a. but I only need > three columns x,y. Let us say the index of row is t. > I need to produce new column s_t as the linear > regression coefficient of (x_(t-60),...x_(t-1)) on > (y_(t-60),...,y_(t-1)). The data is about 140,000 > rows. I wrote a simple code on this which is super > slow, it takes more than 2 hours on a 2.8Ghz Intel Duo > Core. My friend use SAS and his code needs only > couple of minutes. I know there must be some more > efficient way to write it. Can anyone help me on > this? Here is the code. > > Also one line produce a complete NA temp$y and lm > function failed on that. How to make it just produce > a NA instead and keep runing? > > attach(return) > betat=rep(NA,length(RET)) > for (i in 61:length(RET)){cat(i," "); > if (year[[i]]>=1995){ > > temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)]) > > betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]] > #if (i%%100==0) > cat(i," "); > > > return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]] > } > } > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >
Sorry to bother you guys again. This is great. But this is for 61 number and the second case will change 60 to 61. "run*" only accept odd number window. How to get around it with 60? Any suggestion? Thanks. --- Gabor Grothendieck <ggrothendieck at gmail.com> wrote:> Using runmean from caTools the first one below does > it in under 1 second but will not handle NAs. The > second one takes under 15 seconds and handles > them by replacing them with linear approximations. > Note that k must be odd. > > # 1 > > library(caTools) > set.seed(1) > system.time({ > y <- rnorm(140001) > x <- as.numeric(seq(y)) > k <- 61 > Mxy <- runmean(x * y, k) > Mxx <- runmean(x * x, k) > Mx <- runmean(x, k) > My <- runmean(y, k) > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > a <- My - b * Mx > }) > > # 2 > > library(caTools) > library(zoo) > set.seed(1) > system.time({ > y <- rnorm(140000) > x <- as.numeric(seq(y)) > x[100:200] <- NA > x <- na.approx(zoo(x)) > y <- zoo(y) > k <- 60 > Mxy <- runmean(x * y, k) > Mxx <- runmean(x * x, k) > Mx <- runmean(x, k) > My <- runmean(y, k) > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > a <- My - b * Mx > }) > > > On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> wrote: > > I basically has a long data.frame a. but I only > need > > three columns x,y. Let us say the index of row is > t. > > I need to produce new column s_t as the linear > > regression coefficient of (x_(t-60),...x_(t-1)) on > > (y_(t-60),...,y_(t-1)). The data is about 140,000 > > rows. I wrote a simple code on this which is > super > > slow, it takes more than 2 hours on a 2.8Ghz Intel > Duo > > Core. My friend use SAS and his code needs only > > couple of minutes. I know there must be some more > > efficient way to write it. Can anyone help me on > > this? Here is the code. > > > > Also one line produce a complete NA temp$y and lm > > function failed on that. How to make it just > produce > > a NA instead and keep runing? > > > > attach(return) > > betat=rep(NA,length(RET)) > > for (i in 61:length(RET)){cat(i," "); > > if (year[[i]]>=1995){ > > > > >temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])> > > > >betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]> > #if (i%%100==0) > > cat(i," "); > > > > > > >return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]> > } > > } > > > > ______________________________________________ > > R-help at stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > > >
Try runmean2 <- function(x, k) # k must be even (coredata(runmean(x, k-1)) * (k-1) + coredata(lag(x, -k/2, na.pad = TRUE)))/k Also, in your code use matrices or vectors instead of data frames to avoid any overhead in using data frames. On 5/2/06, Guojun Zhu <shmilylemon at yahoo.com> wrote:> Sorry to bother you guys again. This is great. But > this is for 61 number and the second case will change > 60 to 61. "run*" only accept odd number window. How > to get around it with 60? Any suggestion? Thanks. > > --- Gabor Grothendieck <ggrothendieck at gmail.com> > wrote: > > > Using runmean from caTools the first one below does > > it in under 1 second but will not handle NAs. The > > second one takes under 15 seconds and handles > > them by replacing them with linear approximations. > > Note that k must be odd. > > > > # 1 > > > > library(caTools) > > set.seed(1) > > system.time({ > > y <- rnorm(140001) > > x <- as.numeric(seq(y)) > > k <- 61 > > Mxy <- runmean(x * y, k) > > Mxx <- runmean(x * x, k) > > Mx <- runmean(x, k) > > My <- runmean(y, k) > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > a <- My - b * Mx > > }) > > > > # 2 > > > > library(caTools) > > library(zoo) > > set.seed(1) > > system.time({ > > y <- rnorm(140000) > > x <- as.numeric(seq(y)) > > x[100:200] <- NA > > x <- na.approx(zoo(x)) > > y <- zoo(y) > > k <- 60 > > Mxy <- runmean(x * y, k) > > Mxx <- runmean(x * x, k) > > Mx <- runmean(x, k) > > My <- runmean(y, k) > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > a <- My - b * Mx > > }) > > > > > > On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> wrote: > > > I basically has a long data.frame a. but I only > > need > > > three columns x,y. Let us say the index of row is > > t. > > > I need to produce new column s_t as the linear > > > regression coefficient of (x_(t-60),...x_(t-1)) on > > > (y_(t-60),...,y_(t-1)). The data is about 140,000 > > > rows. I wrote a simple code on this which is > > super > > > slow, it takes more than 2 hours on a 2.8Ghz Intel > > Duo > > > Core. My friend use SAS and his code needs only > > > couple of minutes. I know there must be some more > > > efficient way to write it. Can anyone help me on > > > this? Here is the code. > > > > > > Also one line produce a complete NA temp$y and lm > > > function failed on that. How to make it just > > produce > > > a NA instead and keep runing? > > > > > > attach(return) > > > betat=rep(NA,length(RET)) > > > for (i in 61:length(RET)){cat(i," "); > > > if (year[[i]]>=1995){ > > > > > > > > > temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)]) > > > > > > > > > betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]] > > > #if (i%%100==0) > > > cat(i," "); > > > > > > > > > > > > return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]] > > > } > > > } > > > > > > ______________________________________________ > > > R-help at stat.math.ethz.ch mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide! > > http://www.R-project.org/posting-guide.html > > > > > > > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com >
It does not work though. How is the lag work? How does the lag work? I read the help and do not quite understand. Here is a test> y[1] 1 2 3 4 5 6 7 8 9 10> coredata(lag(y,-1))[1] 1 2 3 4 5 6 7 8 9 10 attr(,"tsp") [1] 2 11 1 --- Gabor Grothendieck <ggrothendieck at gmail.com> wrote:> Try > > runmean2 <- function(x, k) # k must be even > (coredata(runmean(x, k-1)) * (k-1) + > coredata(lag(x, -k/2, na.pad = TRUE)))/k > > Also, in your code use matrices or vectors instead > of data frames > to avoid any overhead in using data frames. > > On 5/2/06, Guojun Zhu <shmilylemon at yahoo.com> wrote: > > Sorry to bother you guys again. This is great. > But > > this is for 61 number and the second case will > change > > 60 to 61. "run*" only accept odd number window. > How > > to get around it with 60? Any suggestion? Thanks. > > > > --- Gabor Grothendieck <ggrothendieck at gmail.com> > > wrote: > > > > > Using runmean from caTools the first one below > does > > > it in under 1 second but will not handle NAs. > The > > > second one takes under 15 seconds and handles > > > them by replacing them with linear > approximations. > > > Note that k must be odd. > > > > > > # 1 > > > > > > library(caTools) > > > set.seed(1) > > > system.time({ > > > y <- rnorm(140001) > > > x <- as.numeric(seq(y)) > > > k <- 61 > > > Mxy <- runmean(x * y, k) > > > Mxx <- runmean(x * x, k) > > > Mx <- runmean(x, k) > > > My <- runmean(y, k) > > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > > a <- My - b * Mx > > > }) > > > > > > # 2 > > > > > > library(caTools) > > > library(zoo) > > > set.seed(1) > > > system.time({ > > > y <- rnorm(140000) > > > x <- as.numeric(seq(y)) > > > x[100:200] <- NA > > > x <- na.approx(zoo(x)) > > > y <- zoo(y) > > > k <- 60 > > > Mxy <- runmean(x * y, k) > > > Mxx <- runmean(x * x, k) > > > Mx <- runmean(x, k) > > > My <- runmean(y, k) > > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > > a <- My - b * Mx > > > }) > > > > > > > > > On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> > wrote: > > > > I basically has a long data.frame a. but I > only > > > need > > > > three columns x,y. Let us say the index of row > is > > > t. > > > > I need to produce new column s_t as the linear > > > > regression coefficient of > (x_(t-60),...x_(t-1)) on > > > > (y_(t-60),...,y_(t-1)). The data is about > 140,000 > > > > rows. I wrote a simple code on this which is > > > super > > > > slow, it takes more than 2 hours on a 2.8Ghz > Intel > > > Duo > > > > Core. My friend use SAS and his code needs > only > > > > couple of minutes. I know there must be some > more > > > > efficient way to write it. Can anyone help me > on > > > > this? Here is the code. > > > > > > > > Also one line produce a complete NA temp$y and > lm > > > > function failed on that. How to make it just > > > produce > > > > a NA instead and keep runing? > > > > > > > > attach(return) > > > > betat=rep(NA,length(RET)) > > > > for (i in 61:length(RET)){cat(i," "); > > > > if (year[[i]]>=1995){ > > > > > > > > > > > > > >temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])> > > > > > > > > > > > > >betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]> > > > #if (i%%100==0) > > > > cat(i," "); > > > > > > > > > > > > > > > > > >return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]> > > > } > > > > } > > > > > > > > ______________________________________________ > > > > R-help at stat.math.ethz.ch mailing list > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > PLEASE do read the posting guide! > > > http://www.R-project.org/posting-guide.html > > > > > > > > > > > > > __________________________________________________ > > Do You Yahoo!?> protection around > > http://mail.yahoo.com > > >
I was assuming that this would be added to my example where the data is a zoo object so lag.zoo is being used. Try this:> library(zoo) > z <- zoo(11:15) > z1 2 3 4 5 11 12 13 14 15> lag(z,-1,na.pad=TRUE)1 2 3 4 5 NA 11 12 13 14> ?lag.zooOn 5/2/06, Guojun Zhu <shmilylemon at yahoo.com> wrote:> It does not work though. How is the lag work? How > does the lag work? I read the help and do not quite > understand. Here is a test > > > y > [1] 1 2 3 4 5 6 7 8 9 10 > > coredata(lag(y,-1)) > [1] 1 2 3 4 5 6 7 8 9 10 > attr(,"tsp") > [1] 2 11 1 > > --- Gabor Grothendieck <ggrothendieck at gmail.com> > wrote: > > > Try > > > > runmean2 <- function(x, k) # k must be even > > (coredata(runmean(x, k-1)) * (k-1) + > > coredata(lag(x, -k/2, na.pad = TRUE)))/k > > > > Also, in your code use matrices or vectors instead > > of data frames > > to avoid any overhead in using data frames. > > > > On 5/2/06, Guojun Zhu <shmilylemon at yahoo.com> wrote: > > > Sorry to bother you guys again. This is great. > > But > > > this is for 61 number and the second case will > > change > > > 60 to 61. "run*" only accept odd number window. > > How > > > to get around it with 60? Any suggestion? Thanks. > > > > > > --- Gabor Grothendieck <ggrothendieck at gmail.com> > > > wrote: > > > > > > > Using runmean from caTools the first one below > > does > > > > it in under 1 second but will not handle NAs. > > The > > > > second one takes under 15 seconds and handles > > > > them by replacing them with linear > > approximations. > > > > Note that k must be odd. > > > > > > > > # 1 > > > > > > > > library(caTools) > > > > set.seed(1) > > > > system.time({ > > > > y <- rnorm(140001) > > > > x <- as.numeric(seq(y)) > > > > k <- 61 > > > > Mxy <- runmean(x * y, k) > > > > Mxx <- runmean(x * x, k) > > > > Mx <- runmean(x, k) > > > > My <- runmean(y, k) > > > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > > > a <- My - b * Mx > > > > }) > > > > > > > > # 2 > > > > > > > > library(caTools) > > > > library(zoo) > > > > set.seed(1) > > > > system.time({ > > > > y <- rnorm(140000) > > > > x <- as.numeric(seq(y)) > > > > x[100:200] <- NA > > > > x <- na.approx(zoo(x)) > > > > y <- zoo(y) > > > > k <- 60 > > > > Mxy <- runmean(x * y, k) > > > > Mxx <- runmean(x * x, k) > > > > Mx <- runmean(x, k) > > > > My <- runmean(y, k) > > > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > > > a <- My - b * Mx > > > > }) > > > > > > > > > > > > On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> > > wrote: > > > > > I basically has a long data.frame a. but I > > only > > > > need > > > > > three columns x,y. Let us say the index of row > > is > > > > t. > > > > > I need to produce new column s_t as the linear > > > > > regression coefficient of > > (x_(t-60),...x_(t-1)) on > > > > > (y_(t-60),...,y_(t-1)). The data is about > > 140,000 > > > > > rows. I wrote a simple code on this which is > > > > super > > > > > slow, it takes more than 2 hours on a 2.8Ghz > > Intel > > > > Duo > > > > > Core. My friend use SAS and his code needs > > only > > > > > couple of minutes. I know there must be some > > more > > > > > efficient way to write it. Can anyone help me > > on > > > > > this? Here is the code. > > > > > > > > > > Also one line produce a complete NA temp$y and > > lm > > > > > function failed on that. How to make it just > > > > produce > > > > > a NA instead and keep runing? > > > > > > > > > > attach(return) > > > > > betat=rep(NA,length(RET)) > > > > > for (i in 61:length(RET)){cat(i," "); > > > > > if (year[[i]]>=1995){ > > > > > > > > > > > > > > > > > > > > temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)]) > > > > > > > > > > > > > > > > > > > > betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]] > > > > > #if (i%%100==0) > > > > > cat(i," "); > > > > > > > > > > > > > > > > > > > > > > > > > return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]] > > > > > } > > > > > } > > > > > > > > > > ______________________________________________ > > > > > R-help at stat.math.ethz.ch mailing list > > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > > PLEASE do read the posting guide! > > > > http://www.R-project.org/posting-guide.html > > > > > > > > > > > > > > > > > > __________________________________________________ > > > Do You Yahoo!? > > > Tired of spam? Yahoo! Mail has the best spam > > protection around > > > http://mail.yahoo.com > > > > > > > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com >
For my perticular purpose (run regression on trailing 60 number b on 60 a), I have writen a short function for it because no available function (runmean, rollmean) can take care of NA as I wanted. I basically want the regression ignore the NA row. I have read a little bit source code of "runmean" and developed my code. I basically put zero for every NA row, as well as keep tracking the real number of regression for the least square fit (dummy). The whole program takes less than a about 0.2 seconds for the whole 144,000 data. I was amazed when I think about the 2+ hours for my original code. My friend with SAS needs about 1 minutes, he used to beat me by 100 times. Now I beat him by 100 times! Here is the code. I am wondering if it will be useful for others. Actually I think this kind of calculation is fairly common at least in finance for things like "stock beta". Does it worht to wrap it and put it somewhere on web? Thanks. window.sum<-function(x,k){ na.row<-is.na(x);x[na.row]<-0; temp<-c(sum(x[1:k]),diff(x,k)); #print(temp) cumsum(temp) } ################################## # function to calculate linear regression of y on x for the lag k, # such as stock beta for the trailing k months. # It will return a list of two vectors with length as length(x)-k+1; # ( intercept,coefficency) lag.regression<-function(x,y,k){ l<-length(x); ########################### # correctly taken care of the NA value, we basically set all value in a row as 0 # if one of them is NA. By that we we find the correct answer for regression # with this row basically takes no effect. na.row<-is.na(x)|is.na(y); x[na.row]<-0; y[na.row]<-0; dummy<-rep(1,l); dummy[na.row]<-0; ############################### mxy<-window.sum(x*y,k);mx<-window.sum(x,k); mxx<-window.sum(x*x,k);my<-window.sum(y,k); num<-window.sum(dummy,k); b<-(num*mxy-mx*my)/(num*mxx-mx*mx); a<-(my-b*mx)/num; ######### #this take care of the case when all k ys or xs are 0 and yield a none sense result na.row1<-is.na(a)|is.na(b); a[na.row1]<-NA; b[na.row1]<-NA; ############### list(a,b) } x<-RET-riskfree; y<-sprtrn-riskfree; s<-lag.regression(y,x,60) --- Gabor Grothendieck <ggrothendieck at gmail.com> wrote:> Try > > runmean2 <- function(x, k) # k must be even > (coredata(runmean(x, k-1)) * (k-1) + > coredata(lag(x, -k/2, na.pad = TRUE)))/k > > Also, in your code use matrices or vectors instead > of data frames > to avoid any overhead in using data frames. > > On 5/2/06, Guojun Zhu <shmilylemon at yahoo.com> wrote: > > Sorry to bother you guys again. This is great. > But > > this is for 61 number and the second case will > change > > 60 to 61. "run*" only accept odd number window. > How > > to get around it with 60? Any suggestion? Thanks. > > > > --- Gabor Grothendieck <ggrothendieck at gmail.com> > > wrote: > > > > > Using runmean from caTools the first one below > does > > > it in under 1 second but will not handle NAs. > The > > > second one takes under 15 seconds and handles > > > them by replacing them with linear > approximations. > > > Note that k must be odd. > > > > > > # 1 > > > > > > library(caTools) > > > set.seed(1) > > > system.time({ > > > y <- rnorm(140001) > > > x <- as.numeric(seq(y)) > > > k <- 61 > > > Mxy <- runmean(x * y, k) > > > Mxx <- runmean(x * x, k) > > > Mx <- runmean(x, k) > > > My <- runmean(y, k) > > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > > a <- My - b * Mx > > > }) > > > > > > # 2 > > > > > > library(caTools) > > > library(zoo) > > > set.seed(1) > > > system.time({ > > > y <- rnorm(140000) > > > x <- as.numeric(seq(y)) > > > x[100:200] <- NA > > > x <- na.approx(zoo(x)) > > > y <- zoo(y) > > > k <- 60 > > > Mxy <- runmean(x * y, k) > > > Mxx <- runmean(x * x, k) > > > Mx <- runmean(x, k) > > > My <- runmean(y, k) > > > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx) > > > a <- My - b * Mx > > > }) > > > > > > > > > On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> > wrote: > > > > I basically has a long data.frame a. but I > only > > > need > > > > three columns x,y. Let us say the index of row > is > > > t. > > > > I need to produce new column s_t as the linear > > > > regression coefficient of > (x_(t-60),...x_(t-1)) on > > > > (y_(t-60),...,y_(t-1)). The data is about > 140,000 > > > > rows. I wrote a simple code on this which is > > > super > > > > slow, it takes more than 2 hours on a 2.8Ghz > Intel > > > Duo > > > > Core. My friend use SAS and his code needs > only > > > > couple of minutes. I know there must be some > more > > > > efficient way to write it. Can anyone help me > on > > > > this? Here is the code. > > > > > > > > Also one line produce a complete NA temp$y and > lm > > > > function failed on that. How to make it just > > > produce > > > > a NA instead and keep runing? > > > > > > > > attach(return) > > > > betat=rep(NA,length(RET)) > > > > for (i in 61:length(RET)){cat(i," "); > > > > if (year[[i]]>=1995){ > > > > > > > > > > > > > >temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])> > > > > > > > > > > > > >betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]> > > > #if (i%%100==0) > > > > cat(i," "); > > > > > > > > > > > > > > > > > >return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]> > > > } > > > > } > > > > > > > > ______________________________________________ > > > > R-help at stat.math.ethz.ch mailing list > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > PLEASE do read the posting guide! > > > http://www.R-project.org/posting-guide.html > > > > > > > > > > > > > __________________________________________________ > > Do You Yahoo!?> protection around > > http://mail.yahoo.com > > >