nooldor
2013-Nov-29 22:22 UTC
[R] Multiple regressions with changing dependent variable and time span
Hi all! I am just starting my adventure with R, so excuse me naive questions. My data look like that: <http://r.789695.n4.nabble.com/file/n4681391/data_descr_img.jpg> I have 3 independent variables (F.1, F.2 and F.3) and 334 other variables (r.1, r.2, ... r.334) - each one of these will be dependent variable in my regression. Total span of the time is 154 observations. But I would like to have rolling window regression with length of 31 observations. I would like to run script like that: summary(lm(r.1~F.1+F.2+F.3, data=data)) vif(lm(r.1~F.1+F.2+F.3, data=data)) But for each of 334 (r.1 to r.334) dependent variables separately and with rolling-window of the length 31obs. Id est: summary(lm(r.1~F.1+F.2+F.3, data=data)) would be run 123 (154 total obs - 31. for the first regression) times for rolling-fixed period of 31 obs. The next regression would be: summary(lm(r.2~F.1+F.2+F.3, data=data)) also 123 times ... and so on till summary(lm(r.334~F.1+F.2+F.3, data=data)) It means it would be 123 x 334 regressions (=41082 regressions) I would like to save results (summary + vif test) of all those 41082 regressions in one read-user-friendly file like this given by e.g command capture.output() Could you help with it? Regards, T.S. [[alternative HTML version deleted]]
arun
2013-Nov-30 06:42 UTC
[R] Multiple regressions with changing dependent variable and time span
Hi,
The link seems to be not working.? From the description, it looks like:
set.seed(432)
dat1 <- as.data.frame(matrix(sample(200,154*337,replace=TRUE),ncol=337))
?colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:334,sep="."))
lst1 <- lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
?lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} )
library(zoo)
res <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) coef(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
row.names(res) <- rep(paste("r",1:334,sep="."),each=123)
?dim(res)
#[1] 41082???? 4
coef(lm(r.1~F.1+F.2+F.3,data=dat1[1:32,]) )
#(Intercept)???????? F.1???????? F.2???????? F.3
#109.9168150? -0.1705361? -0.1028231?? 0.2027911
coef(lm(r.1~F.1+F.2+F.3,data=dat1[2:33,]) )
#(Intercept)???????? F.1???????? F.2???????? F.3
#119.3718949? -0.1660709? -0.2059830?? 0.1338608
res[1:2,]
#??? (Intercept)??????? F.1??????? F.2?????? F.3
#r.1??? 109.9168 -0.1705361 -0.1028231 0.2027911
#r.1??? 119.3719 -0.1660709 -0.2059830 0.1338608
A.K.
On Friday, November 29, 2013 6:43 PM, nooldor <nooldor at gmail.com>
wrote:
Hi all!
I am just starting my adventure with R, so excuse me naive questions.
My data look like that:
<http://r.789695.n4.nabble.com/file/n4681391/data_descr_img.jpg>
I have 3 independent variables (F.1, F.2 and F.3) and 334 other variables
(r.1, r.2, ... r.334) - each one of these will be dependent variable in my
regression.
Total span of the time is 154 observations. But I would like to have rolling
window regression with length of 31 observations.
I would like to run script like that:
summary(lm(r.1~F.1+F.2+F.3, data=data))
vif(lm(r.1~F.1+F.2+F.3, data=data))
But for each of 334 (r.1 to r.334) dependent variables separately and with
rolling-window of the length 31obs.
Id est:
summary(lm(r.1~F.1+F.2+F.3, data=data)) would be run 123 (154 total obs -
31. for the first regression) times for rolling-fixed period of 31 obs.
The next regression would be:
summary(lm(r.2~F.1+F.2+F.3, data=data)) also 123 times ... and so on till
summary(lm(r.334~F.1+F.2+F.3, data=data))
It means it would be 123 x 334 regressions (=41082 regressions)
I would like to save results (summary + vif test) of all those 41082
regressions in one read-user-friendly file like this given by e.g command
capture.output()
Could you help with it?
Regards,
T.S.
??? [[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
arun
2013-Nov-30 20:12 UTC
[R] Multiple regressions with changing dependent variable and time span
Hi,
I was able to read the file after saving it as .csv.? It seems to work without
any errors.
dat1<-read.csv("Book2.csv", header=T)
###same as previous
lst1 <- lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} )
?sapply(lst2,function(x) sum(!!rowSums(is.na(x))))
library(zoo)
res1 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); c(coef(l1),
pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } else
rep(NA,9)},by.column=FALSE,align="right")))
row.names(res1) <- rep(paste("r",1:334,sep="."),each=123)
?dim(res1)
#[1] 41082???? 9
#vif
?library(car)
res2 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); vif(l1) }
else rep(NA,3)},by.column=FALSE,align="right")))
row.names(res2) <- rep(paste("r",1:334,sep="."),each=123)
dim(res2)
#[1] 41082???? 3
#DW statistic:
?lst3 <- lapply(lst2,function(x) rollapply(x,width=32,FUN=function(z) {z1
<- as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1
<-lm(r~F.1+F.2+F.3,data=z1); durbinWatsonTest(l1) } else
rep(NA,4)},by.column=FALSE,align="right"))
?res3 <- do.call(rbind,lapply(lst3,function(x) x[,-4]))
row.names(res3) <- rep(paste("r",1:334,sep="."),each=123)
?dim(res3)
#[1] 41082???? 3
##ncvTest()
f4 <- function(meanmod, dta, varmod) {
assign(".dta", dta, envir=.GlobalEnv)
assign(".meanmod", meanmod, envir=.GlobalEnv)
m1 <- lm(.meanmod, .dta)
ans <- ncvTest(m1, varmod)
remove(".dta", envir=.GlobalEnv)
remove(".meanmod", envir=.GlobalEnv)
ans
}
?lst4 <- lapply(lst2,function(x) rollapply(x,width=32,FUN=function(z) {z1
<- as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1 <-f4(r~.,z1) }
else NA},by.column=FALSE,align="right"))
names(lst4) <- paste("r",1:334,sep=".")
length(lst4)
#[1] 334
###jarque.bera.test
library(tseries)
res5 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); resid <-
residuals(l1); unlist(jarque.bera.test(resid)[1:3]) } else
rep(NA,3)},by.column=FALSE,align="right")))
?dim(res5)
#[1] 41082???? 3
A.K.
On Saturday, November 30, 2013 1:44 PM, nooldor <nooldor at gmail.com>
wrote:
here is in .xlsx should be easy to open and eventually find&replace commas
according to you excel settings (or maybe it will do it automatically)
On 30 November 2013 19:15, arun <smartpink111 at yahoo.com> wrote:
I tried that, but:>
>
>
>dat1<-read.table("Book2.csv", head=T, sep=";",
dec=",")
>> str(dat1)
>'data.frame':??? 154 obs. of? 1 variable:
>
>Then I changed to:
>dat1<-read.table("Book2.csv", head=T, sep="\t",
dec=",")
>> str(dat1)
>'data.frame':??? 154 obs. of? 661 variables:
>Both of them are wrong as the number of variables should be 337.
>A.K.
>
>
>
>
>
>
>
>On Saturday, November 30, 2013 12:53 PM, nooldor <nooldor at
gmail.com> wrote:
>
>Thank you,
>
>I got your reply. I am just testing your script. I will let you know how is
it soon.
>
>.csv could be problematic as commas are used as dec separator (Eastern
Europe excel settings) ... I read it in R with this:
>dat1<-read.table("Book2.csv", head=T, sep=";",
dec=",")
>
>Thank you very much !!!
>
>T.S.
>
>
>
>
>On 30 November 2013 18:39, arun <smartpink111 at yahoo.com> wrote:
>
>I couldn't read the "Book.csv" as the format is completely
messed up.? Anyway, I hope the solution works on your dataset.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>On Saturday, November 30, 2013 10:34 AM, nooldor <nooldor at
gmail.com> wrote:
>>
>>
>>ok.
>>
>>
>>> dat1<-read.table("Book2.csv", head=T,
sep=";", dec=",") > colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:2,sep="."))
> lst1 <- lapply(paste("r",1:2,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x])) > lst2 <- lapply(lst1,function(x)
{colnames(x)[4] <-"r";x} ) > sum(!!rowSums(is.na(lst2[[1]])))
[1] 57 > #[1] 40 > sapply(lst2,function(x) sum(!!rowSums(is.na(x)))) [1]
57??0 > #[1] 40 46
>>in att you have the data file
>>
>>
>>
>>
>>
>>
>>On 30 November 2013 16:22, arun <smartpink111 at yahoo.com> wrote:
>>
>>Hi,
>>>The first point is not that clear.
>>>
>>>Could you show the expected results in this case?
>>>
>>>set.seed(432)
>>>dat1 <-
as.data.frame(matrix(sample(c(1:10,NA),154*5,replace=TRUE),ncol=5))
>>>?colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:2,sep="."))
>>>lst1 <-
lapply(paste("r",1:2,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>>
>>>
>>>?lst2 <- lapply(lst1,function(x) {colnames(x)[4]
<-"r";x} )
>>>?sum(!!rowSums(is.na(lst2[[1]])))
>>>#[1] 40
>>>?sapply(lst2,function(x) sum(!!rowSums(is.na(x))))
>>>#[1] 40 46
>>>
>>>
>>>A.K.
>>>
>>>
>>>
>>>On Saturday, November 30, 2013 10:09 AM, nooldor <nooldor at
gmail.com> wrote:
>>>
>>>Hi,
>>>
>>>Thanks for reply!
>>>
>>>
>>>Three things:
>>>1.
>>>I did not write that some of the data has more then 31 NA in the
column and then it is not possible to run lm()
>>>
>>>Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok,
...) :??0 (non-NA) casesIn this case program should return "NA" symbol
and go further, in the case when length of the observations is shorter then 31
program should always return "NA" but go further .
>>>
>>>
>>>
>>>2. in your result matrix there are only 4 columns (for estimates of
the coefficients), is it possible to put there 4 more columns with p-values and
one column with R squared
>>>
>>>
>>>3. basic statistical test for the regressions:
>>>
>>>inflation factors can be captured by:
>>>res2 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z)
>>>? vif(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>
>>>and DW statistic:
>>>res3 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z)
>>>? durbinWatsonTest(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>
>>>
>>>3a)is that right?
>>>
>>>3b) how to do and have in user-friendly form durbinWatsonTest for
more then 1 lag?
>>>
>>>3c) how to apply: jarque.bera.test from library(tseries) and ncvTest
from library(car) ???
>>>
>>>
>>>
>>>
>>>
>>>
>>>Pozdrowienia,
>>>
>>>Tomasz Schabek
>>>
>>>
>>>On 30 November 2013 07:42, arun <smartpink111 at yahoo.com>
wrote:
>>>
>>>Hi,
>>>>The link seems to be not working.? From the description, it
looks like:
>>>>set.seed(432)
>>>>dat1 <-
as.data.frame(matrix(sample(200,154*337,replace=TRUE),ncol=337))
>>>>?colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:334,sep="."))
>>>>lst1 <-
lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>>>
>>>>?lst2 <- lapply(lst1,function(x) {colnames(x)[4]
<-"r";x} )
>>>>library(zoo)
>>>>
>>>>res <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) coef(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>>
>>>>row.names(res) <-
rep(paste("r",1:334,sep="."),each=123)
>>>>?dim(res)
>>>>#[1] 41082???? 4
>>>>
>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[1:32,]) )
>>>>#(Intercept)???????? F.1???????? F.2???????? F.3
>>>>#109.9168150? -0.1705361? -0.1028231?? 0.2027911
>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[2:33,]) )
>>>>#(Intercept)???????? F.1???????? F.2???????? F.3
>>>>#119.3718949? -0.1660709? -0.2059830?? 0.1338608
>>>>res[1:2,]
>>>>#??? (Intercept)??????? F.1??????? F.2?????? F.3
>>>>#r.1??? 109.9168 -0.1705361 -0.1028231 0.2027911
>>>>#r.1??? 119.3719 -0.1660709 -0.2059830 0.1338608
>>>>
>>>>A.K.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>On Friday, November 29, 2013 6:43 PM, nooldor <nooldor at
gmail.com> wrote:
>>>>Hi all!
>>>>
>>>>
>>>>I am just starting my adventure with R, so excuse me naive
questions.
>>>>
>>>>My data look like that:
>>>>
>>>><http://r.789695.n4.nabble.com/file/n4681391/data_descr_img.jpg>
>>>>
>>>>I have 3 independent variables (F.1, F.2 and F.3) and 334 other
variables
>>>>(r.1, r.2, ... r.334) - each one of these will be dependent
variable in my
>>>>regression.
>>>>
>>>>Total span of the time is 154 observations. But I would like to
have rolling
>>>>window regression with length of 31 observations.
>>>>
>>>>I would like to run script like that:
>>>>
>>>>summary(lm(r.1~F.1+F.2+F.3, data=data))
>>>>vif(lm(r.1~F.1+F.2+F.3, data=data))
>>>>
>>>>But for each of 334 (r.1 to r.334) dependent variables
separately and with
>>>>rolling-window of the length 31obs.
>>>>
>>>>Id est:
>>>>summary(lm(r.1~F.1+F.2+F.3, data=data)) would be run 123 (154
total obs -
>>>>31. for the first regression) times for rolling-fixed period of
31 obs.
>>>>
>>>>The next regression would be:
>>>>summary(lm(r.2~F.1+F.2+F.3, data=data)) also 123 times ... and
so on till
>>>>summary(lm(r.334~F.1+F.2+F.3, data=data))
>>>>
>>>>It means it would be 123 x 334 regressions (=41082 regressions)
>>>>
>>>>I would like to save results (summary + vif test) of all those
41082
>>>>regressions in one read-user-friendly file like this given by
e.g command
>>>>capture.output()
>>>>
>>>>Could you help with it?
>>>>
>>>>Regards,
>>>>
>>>>T.S.
>>>>
>>>>??? [[alternative HTML version deleted]]
>>>>
>>>>______________________________________________
>>>>R-help at r-project.org mailing list
>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>>>>and provide commented, minimal, self-contained, reproducible
code.
>>>>
>>>>
>>>
>>
>
arun
2013-Dec-02 02:16 UTC
[R] Multiple regressions with changing dependent variable and time span
Hi,
Try:
dat1 <- read.table(text="A??? B???? C???? D
r.1? x1??? x2?? x3
r.1? x4??? x5??? x6
r.2? x7??? x8??? x9
r.2? x10? x11 x12
r.3? x13? x14 x15
r.3? x16? x17 x18",header=TRUE,stringsAsFactors=FALSE)
?dat2 <- do.call(cbind,split(dat1,dat1$A))
colnames(dat2) <- gsub(".*\\.","",colnames(dat2))
A.K.
On Sunday, December 1, 2013 6:32 PM, nooldor <nooldor at gmail.com> wrote:
Hi,
could you also tell me how to reshape the res1 matrix like that:
[now]
A??? B???? C???? D
r.1? x1??? x2?? x3
r.1? x4 ?? x5??? x6
r.2? x7 ?? x8??? x9r.2? x10? x11 x12r.3? x13? x14 x15r.3? x16? x17 x18
[after]:
A??? B???? C???? D????? A??? B???? C???? D???? ? A??? B???? C???? D
r.1? x1??? x2?? x3? ? ? r.2? x7 ?? x8??? x9????? r.3? x13? x14 x15
r.1? x4 ?? x5?? x6? ? ? r.2? x10? x11 x12????? r.3? x16? x17 x18
big thanks!
On 30 November 2013 23:28, arun <smartpink111 at yahoo.com> wrote:
Hi,>No problem.
>
>In that case, each column will be a list.? For example if I take the first
element of `lst2`
>dW1 <- rollapply(lst2[[1]],width=32,FUN=function(z) {z1 <-
as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1
<-lm(r~F.1+F.2+F.3,data=z1); durbinWatsonTest(l1,max.lag=3) } else
rep(NA,4)},by.column=FALSE,align="right")
>
>?tail(dW1[,1],1)
>#[[1]]
>#[1] -0.3602936? 0.1975667 -0.1740797
>
>
>You can store it by:
>resdW1 <- do.call(cbind,lapply(seq_len(ncol(dW1)),function(i)
do.call(rbind,dW1[,i]))[1:3])
>
>
>Similarly, for more than one elements (using a subset of lst2- as it takes
time)
>
>
>lst3 <- lapply(lst2[1:2],function(x) rollapply(x,width=32,FUN=function(z)
{z1 <- as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1
<-lm(r~F.1+F.2+F.3,data=z1); durbinWatsonTest(l1,max.lag=3) } else
rep(NA,4)},by.column=FALSE,align="right"))
>
>lst3New <- lapply(lst3,function(x)
do.call(cbind,lapply(seq_len(ncol(x)),function(i) do.call(rbind,x[,i]))[1:3]))
>
>lst3New <- lapply(lst3New, function(x) {colnames(x) <-
paste0(rep(c("r","dw","p"),each=3),1:3); x})
>
>A.K.
>
>
>On Saturday, November 30, 2013 5:03 PM, nooldor <nooldor at gmail.com>
wrote:
>
>Hey!
>
>
>Yes,
>only the D-W test takes so much time, did not check it yet
>
>I checked results (estimates) with manually run regressions (in excel) and
they are correct.
>
>
>I only change the "width" to 31 and "each=123" to 124,
cause it should be ((154-31)+1) x 334 = 41416 matrix
>
>
>with the lag in D-W test I was wondering how to have table when I use
durbinWatsonTest(l1,3) - with three lags instead of default 1.
>
>but I can manage it - just need to learn about functions used by you.
>
>
>Any way: BIG THANK to you!
>
>
>Best wishes,
>T.S.
>
>
>
>
>
>On 30 November 2013 21:12, arun <smartpink111 at yahoo.com> wrote:
>
>Hi,
>>
>>I was able to read the file after saving it as .csv.? It seems to work
without any errors.
>>
>>dat1<-read.csv("Book2.csv", header=T)
>>###same as previous
>>
>>
>>lst1 <-
lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x}
)
>>?sapply(lst2,function(x) sum(!!rowSums(is.na(x))))
>>library(zoo)
>>
>>res1 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); c(coef(l1),
pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } else
rep(NA,9)},by.column=FALSE,align="right")))
>>row.names(res1) <-
rep(paste("r",1:334,sep="."),each=123)
>>?dim(res1)
>>#[1] 41082???? 9
>>
>>#vif
>>?library(car)
>>
>>res2 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); vif(l1) }
else rep(NA,3)},by.column=FALSE,align="right")))
>>row.names(res2) <-
rep(paste("r",1:334,sep="."),each=123)
>>dim(res2)
>>#[1] 41082???? 3
>>
>>#DW statistic:
>>?lst3 <- lapply(lst2,function(x) rollapply(x,width=32,FUN=function(z)
{z1 <- as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1
<-lm(r~F.1+F.2+F.3,data=z1); durbinWatsonTest(l1) } else
rep(NA,4)},by.column=FALSE,align="right"))
>>?res3 <- do.call(rbind,lapply(lst3,function(x) x[,-4]))
>>row.names(res3) <-
rep(paste("r",1:334,sep="."),each=123)
>>?dim(res3)
>>#[1] 41082???? 3
>>##ncvTest()
>>f4 <- function(meanmod, dta, varmod) {
>>assign(".dta", dta, envir=.GlobalEnv)
>>assign(".meanmod", meanmod, envir=.GlobalEnv)
>>m1 <- lm(.meanmod, .dta)
>>ans <- ncvTest(m1, varmod)
>>remove(".dta", envir=.GlobalEnv)
>>remove(".meanmod", envir=.GlobalEnv)
>>ans
>>}
>>
>>?lst4 <- lapply(lst2,function(x) rollapply(x,width=32,FUN=function(z)
{z1 <- as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1 <-f4(r~.,z1) }
else NA},by.column=FALSE,align="right"))
>>names(lst4) <- paste("r",1:334,sep=".")
>>length(lst4)
>>#[1] 334
>>
>>
>>###jarque.bera.test
>>library(tseries)
>>res5 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); resid <-
residuals(l1); unlist(jarque.bera.test(resid)[1:3]) } else
rep(NA,3)},by.column=FALSE,align="right")))
>>?dim(res5)
>>#[1] 41082???? 3
>>
>>A.K.
>>
>>
>>
>>
>>
>>
>>
>>
>>On Saturday, November 30, 2013 1:44 PM, nooldor <nooldor at
gmail.com> wrote:
>>
>>here is in .xlsx should be easy to open and eventually find&replace
commas according to you excel settings (or maybe it will do it automatically)
>>
>>
>>
>>
>>
>>
>>On 30 November 2013 19:15, arun <smartpink111 at yahoo.com> wrote:
>>
>>I tried that, but:
>>>
>>>
>>>
>>>dat1<-read.table("Book2.csv", head=T,
sep=";", dec=",")
>>>> str(dat1)
>>>'data.frame':??? 154 obs. of? 1 variable:
>>>
>>>Then I changed to:
>>>dat1<-read.table("Book2.csv", head=T,
sep="\t", dec=",")
>>>> str(dat1)
>>>'data.frame':??? 154 obs. of? 661 variables:
>>>Both of them are wrong as the number of variables should be 337.
>>>A.K.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>On Saturday, November 30, 2013 12:53 PM, nooldor <nooldor at
gmail.com> wrote:
>>>
>>>Thank you,
>>>
>>>I got your reply. I am just testing your script. I will let you know
how is it soon.
>>>
>>>.csv could be problematic as commas are used as dec separator
(Eastern Europe excel settings) ... I read it in R with this:
>>>dat1<-read.table("Book2.csv", head=T,
sep=";", dec=",")
>>>
>>>Thank you very much !!!
>>>
>>>T.S.
>>>
>>>
>>>
>>>
>>>On 30 November 2013 18:39, arun <smartpink111 at yahoo.com>
wrote:
>>>
>>>I couldn't read the "Book.csv" as the format is
completely messed up.? Anyway, I hope the solution works on your dataset.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>On Saturday, November 30, 2013 10:34 AM, nooldor <nooldor at
gmail.com> wrote:
>>>>
>>>>
>>>>ok.
>>>>
>>>>
>>>>> dat1<-read.table("Book2.csv", head=T,
sep=";", dec=",") > colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:2,sep="."))
> lst1 <- lapply(paste("r",1:2,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x])) > lst2 <- lapply(lst1,function(x)
{colnames(x)[4] <-"r";x} ) > sum(!!rowSums(is.na(lst2[[1]])))
[1] 57 > #[1] 40 > sapply(lst2,function(x) sum(!!rowSums(is.na(x)))) [1]
57??0 > #[1] 40 46
>>>>in att you have the data file
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>On 30 November 2013 16:22, arun <smartpink111 at
yahoo.com> wrote:
>>>>
>>>>Hi,
>>>>>The first point is not that clear.
>>>>>
>>>>>Could you show the expected results in this case?
>>>>>
>>>>>set.seed(432)
>>>>>dat1 <-
as.data.frame(matrix(sample(c(1:10,NA),154*5,replace=TRUE),ncol=5))
>>>>>?colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:2,sep="."))
>>>>>lst1 <-
lapply(paste("r",1:2,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>>>>
>>>>>
>>>>>?lst2 <- lapply(lst1,function(x) {colnames(x)[4]
<-"r";x} )
>>>>>?sum(!!rowSums(is.na(lst2[[1]])))
>>>>>#[1] 40
>>>>>?sapply(lst2,function(x) sum(!!rowSums(is.na(x))))
>>>>>#[1] 40 46
>>>>>
>>>>>
>>>>>A.K.
>>>>>
>>>>>
>>>>>
>>>>>On Saturday, November 30, 2013 10:09 AM, nooldor <nooldor
at gmail.com> wrote:
>>>>>
>>>>>Hi,
>>>>>
>>>>>Thanks for reply!
>>>>>
>>>>>
>>>>>Three things:
>>>>>1.
>>>>>I did not write that some of the data has more then 31 NA in
the column and then it is not possible to run lm()
>>>>>
>>>>>Error in lm.fit(x, y, offset = offset, singular.ok =
singular.ok, ...) :??0 (non-NA) casesIn this case program should return
"NA" symbol and go further, in the case when length of the
observations is shorter then 31 program should always return "NA" but
go further .
>>>>>
>>>>>
>>>>>
>>>>>2. in your result matrix there are only 4 columns (for
estimates of the coefficients), is it possible to put there 4 more columns with
p-values and one column with R squared
>>>>>
>>>>>
>>>>>3. basic statistical test for the regressions:
>>>>>
>>>>>inflation factors can be captured by:
>>>>>res2 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z)
>>>>>? vif(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>>>
>>>>>and DW statistic:
>>>>>res3 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z)
>>>>>? durbinWatsonTest(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>>>
>>>>>
>>>>>3a)is that right?
>>>>>
>>>>>3b) how to do and have in user-friendly form
durbinWatsonTest for more then 1 lag?
>>>>>
>>>>>3c) how to apply: jarque.bera.test from library(tseries) and
ncvTest from library(car) ???
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>Pozdrowienia,
>>>>>
>>>>>Tomasz Schabek
>>>>>
>>>>>
>>>>>On 30 November 2013 07:42, arun <smartpink111 at
yahoo.com> wrote:
>>>>>
>>>>>Hi,
>>>>>>The link seems to be not working.? From the description,
it looks like:
>>>>>>set.seed(432)
>>>>>>dat1 <-
as.data.frame(matrix(sample(200,154*337,replace=TRUE),ncol=337))
>>>>>>?colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:334,sep="."))
>>>>>>lst1 <-
lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>>>>>
>>>>>>?lst2 <- lapply(lst1,function(x) {colnames(x)[4]
<-"r";x} )
>>>>>>library(zoo)
>>>>>>
>>>>>>res <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) coef(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>>>>
>>>>>>row.names(res) <-
rep(paste("r",1:334,sep="."),each=123)
>>>>>>?dim(res)
>>>>>>#[1] 41082???? 4
>>>>>>
>>>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[1:32,]) )
>>>>>>#(Intercept)???????? F.1???????? F.2???????? F.3
>>>>>>#109.9168150? -0.1705361? -0.1028231?? 0.2027911
>>>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[2:33,]) )
>>>>>>#(Intercept)???????? F.1???????? F.2???????? F.3
>>>>>>#119.3718949? -0.1660709? -0.2059830?? 0.1338608
>>>>>>res[1:2,]
>>>>>>#??? (Intercept)??????? F.1??????? F.2?????? F.3
>>>>>>#r.1??? 109.9168 -0.1705361 -0.1028231 0.2027911
>>>>>>#r.1??? 119.3719 -0.1660709 -0.2059830 0.1338608
>>>>>>
>>>>>>A.K.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>On Friday, November 29, 2013 6:43 PM, nooldor
<nooldor at gmail.com> wrote:
>>>>>>Hi all!
>>>>>>
>>>>>>
>>>>>>I am just starting my adventure with R, so excuse me
naive questions.
>>>>>>
>>>>>>My data look like that:
>>>>>>
>>>>>><http://r.789695.n4.nabble.com/file/n4681391/data_descr_img.jpg>
>>>>>>
>>>>>>I have 3 independent variables (F.1, F.2 and F.3) and
334 other variables
>>>>>>(r.1, r.2, ... r.334) - each one of these will be
dependent variable in my
>>>>>>regression.
>>>>>>
>>>>>>Total span of the time is 154 observations. But I would
like to have rolling
>>>>>>window regression with length of 31 observations.
>>>>>>
>>>>>>I would like to run script like that:
>>>>>>
>>>>>>summary(lm(r.1~F.1+F.2+F.3, data=data))
>>>>>>vif(lm(r.1~F.1+F.2+F.3, data=data))
>>>>>>
>>>>>>But for each of 334 (r.1 to r.334) dependent variables
separately and with
>>>>>>rolling-window of the length 31obs.
>>>>>>
>>>>>>Id est:
>>>>>>summary(lm(r.1~F.1+F.2+F.3, data=data)) would be run 123
(154 total obs -
>>>>>>31. for the first regression) times for rolling-fixed
period of 31 obs.
>>>>>>
>>>>>>The next regression would be:
>>>>>>summary(lm(r.2~F.1+F.2+F.3, data=data)) also 123 times
... and so on till
>>>>>>summary(lm(r.334~F.1+F.2+F.3, data=data))
>>>>>>
>>>>>>It means it would be 123 x 334 regressions (=41082
regressions)
>>>>>>
>>>>>>I would like to save results (summary + vif test) of all
those 41082
>>>>>>regressions in one read-user-friendly file like this
given by e.g command
>>>>>>capture.output()
>>>>>>
>>>>>>Could you help with it?
>>>>>>
>>>>>>Regards,
>>>>>>
>>>>>>T.S.
>>>>>>
>>>>>>??? [[alternative HTML version deleted]]
>>>>>>
>>>>>>______________________________________________
>>>>>>R-help at r-project.org mailing list
>>>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>>>>>>and provide commented, minimal, self-contained,
reproducible code.
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
arun
2013-Dec-02 03:14 UTC
[R] Multiple regressions with changing dependent variable and time span
Hi,
I guess you wanted something like this:
res2 <- do.call(cbind,lapply(lst2,
?function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1);
c(coef(l1), pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) }
else rep(NA,9)},by.column=FALSE,align="right")))
dim(res2)
#[1]? 123 3006
?334*9
#[1] 3006
lst2New <- lapply(split(seq(3006),((seq(3006)-1)%%9)+1),function(x) {x1 <-
res2[,x]; colnames(x1) <- paste(colnames(x1),1:334,sep="_");x1})
sapply(lst2New,ncol)
#? 1?? 2?? 3?? 4?? 5?? 6?? 7?? 8?? 9
#334 334 334 334 334 334 334 334 334
lst2New[[1]][1:4,1:4]
#???? (Intercept)_1 (Intercept)_2 (Intercept)_3 (Intercept)_4
#[1,]??????????? NA??? -0.3295765??? -0.7025259??????????? NA
#[2,]??????????? NA???? 0.1217360??? -1.5221660??????????? NA
#[3,]??????????? NA???? 0.3192466??? -1.3586341??????????? NA
#[4,]??????????? NA???? 0.2618476??? -0.9698798??????????? NA
A.K.
On Sunday, December 1, 2013 9:33 PM, nooldor <nooldor at gmail.com> wrote:
Hi,
actually, do you remember yesterday code you build for me?
it was:
dat1<-read.csv("Book2.csv", header=T)>###same as previous
>
>
>lst1 <- lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} )
>?sapply(lst2,function(x) sum(!!rowSums(is.na(x))))
>library(zoo)
>res1 <- do.call(rbind,lapply(lst2,
>function(x) rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); c(coef(l1),
pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } else
rep(NA,9)},by.column=FALSE,align="right")))>row.names(res1) <-
rep(paste("r",1:334,sep="."),each=123)
>?dim(res1)
>#[1] 41082???? 9
(boo2.xls is? attached in this e-mail previously, just need to open it in excel
and save as csv)
then we have result res1 as matrix 41082 x 9
(this 41082 = 123observations in time x 334objects)
now I need to separate this matrix res1 into 9 different matrices each one
containing 123 obs x 334 variables
maybe reshape() would be useful ...
... let me know if I described it clearly
On 2 December 2013 03:17, arun <smartpink111 at yahoo.com> wrote:
#or>
>data.frame(split(dat1,dat1$A))
>
>
>
>
>
>On Sunday, December 1, 2013 9:16 PM, arun <smartpink111 at yahoo.com>
wrote:
>Hi,
>Try:
>dat1 <- read.table(text="A??? B???? C???? D
>r.1? x1??? x2?? x3
>r.1? x4??? x5??? x6
>r.2? x7??? x8??? x9
>r.2? x10? x11 x12
>r.3? x13? x14 x15
>r.3? x16? x17 x18",header=TRUE,stringsAsFactors=FALSE)
>
>?dat2 <- do.call(cbind,split(dat1,dat1$A))
>colnames(dat2) <- gsub(".*\\.","",colnames(dat2))
>A.K.
>
>
>
>
>
>
>
>On Sunday, December 1, 2013 6:32 PM, nooldor <nooldor at gmail.com>
wrote:
>
>Hi,
>
>could you also tell me how to reshape the res1 matrix like that:
>
>[now]
>
>A??? B???? C???? D
>r.1? x1??? x2?? x3
>r.1? x4 ?? x5??? x6
>r.2? x7 ?? x8??? x9r.2? x10? x11 x12r.3? x13? x14 x15r.3? x16? x17 x18
>
>
>[after]:
>A??? B???? C???? D????? A??? B???? C???? D???? ? A??? B???? C???? D
>r.1? x1??? x2?? x3? ? ? r.2? x7 ?? x8??? x9????? r.3? x13? x14 x15
>r.1? x4 ?? x5?? x6? ? ? r.2? x10? x11 x12????? r.3? x16? x17 x18
>
>
>
>big thanks!
>
>
>
>
>
>
>
>On 30 November 2013 23:28, arun <smartpink111 at yahoo.com> wrote:
>
>Hi,
>>No problem.
>>
>>In that case, each column will be a list.? For example if I take the
first element of `lst2`
>>dW1 <- rollapply(lst2[[1]],width=32,FUN=function(z) {z1 <-
as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1
<-lm(r~F.1+F.2+F.3,data=z1); durbinWatsonTest(l1,max.lag=3) } else
rep(NA,4)},by.column=FALSE,align="right")
>>
>>?tail(dW1[,1],1)
>>#[[1]]
>>#[1] -0.3602936? 0.1975667 -0.1740797
>>
>>
>>You can store it by:
>>resdW1 <- do.call(cbind,lapply(seq_len(ncol(dW1)),function(i)
do.call(rbind,dW1[,i]))[1:3])
>>
>>
>>Similarly, for more than one elements (using a subset of lst2- as it
takes time)
>>
>>
>>lst3 <- lapply(lst2[1:2],function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1);
durbinWatsonTest(l1,max.lag=3) } else
rep(NA,4)},by.column=FALSE,align="right"))
>>
>>lst3New <- lapply(lst3,function(x)
do.call(cbind,lapply(seq_len(ncol(x)),function(i) do.call(rbind,x[,i]))[1:3]))
>>
>>lst3New <- lapply(lst3New, function(x) {colnames(x) <-
paste0(rep(c("r","dw","p"),each=3),1:3); x})
>>
>>A.K.
>>
>>
>>On Saturday, November 30, 2013 5:03 PM, nooldor <nooldor at
gmail.com> wrote:
>>
>>Hey!
>>
>>
>>Yes,
>>only the D-W test takes so much time, did not check it yet
>>
>>I checked results (estimates) with manually run regressions (in excel)
and they are correct.
>>
>>
>>I only change the "width" to 31 and "each=123" to
124, cause it should be ((154-31)+1) x 334 = 41416 matrix
>>
>>
>>with the lag in D-W test I was wondering how to have table when I use
durbinWatsonTest(l1,3) - with three lags instead of default 1.
>>
>>but I can manage it - just need to learn about functions used by you.
>>
>>
>>Any way: BIG THANK to you!
>>
>>
>>Best wishes,
>>T.S.
>>
>>
>>
>>
>>
>>On 30 November 2013 21:12, arun <smartpink111 at yahoo.com> wrote:
>>
>>Hi,
>>>
>>>I was able to read the file after saving it as .csv.? It seems to
work without any errors.
>>>
>>>dat1<-read.csv("Book2.csv", header=T)
>>>###same as previous
>>>
>>>
>>>lst1 <-
lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>>lst2 <- lapply(lst1,function(x) {colnames(x)[4]
<-"r";x} )
>>>?sapply(lst2,function(x) sum(!!rowSums(is.na(x))))
>>>library(zoo)
>>>
>>>res1 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); c(coef(l1),
pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } else
rep(NA,9)},by.column=FALSE,align="right")))
>>>row.names(res1) <-
rep(paste("r",1:334,sep="."),each=123)
>>>?dim(res1)
>>>#[1] 41082???? 9
>>>
>>>#vif
>>>?library(car)
>>>
>>>res2 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); vif(l1) }
else rep(NA,3)},by.column=FALSE,align="right")))
>>>row.names(res2) <-
rep(paste("r",1:334,sep="."),each=123)
>>>dim(res2)
>>>#[1] 41082???? 3
>>>
>>>#DW statistic:
>>>?lst3 <- lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1);
durbinWatsonTest(l1) } else rep(NA,4)},by.column=FALSE,align="right"))
>>>?res3 <- do.call(rbind,lapply(lst3,function(x) x[,-4]))
>>>row.names(res3) <-
rep(paste("r",1:334,sep="."),each=123)
>>>?dim(res3)
>>>#[1] 41082???? 3
>>>##ncvTest()
>>>f4 <- function(meanmod, dta, varmod) {
>>>assign(".dta", dta, envir=.GlobalEnv)
>>>assign(".meanmod", meanmod, envir=.GlobalEnv)
>>>m1 <- lm(.meanmod, .dta)
>>>ans <- ncvTest(m1, varmod)
>>>remove(".dta", envir=.GlobalEnv)
>>>remove(".meanmod", envir=.GlobalEnv)
>>>ans
>>>}
>>>
>>>?lst4 <- lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-f4(r~.,z1) } else
NA},by.column=FALSE,align="right"))
>>>names(lst4) <- paste("r",1:334,sep=".")
>>>length(lst4)
>>>#[1] 334
>>>
>>>
>>>###jarque.bera.test
>>>library(tseries)
>>>res5 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z);
if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); resid <-
residuals(l1); unlist(jarque.bera.test(resid)[1:3]) } else
rep(NA,3)},by.column=FALSE,align="right")))
>>>?dim(res5)
>>>#[1] 41082???? 3
>>>
>>>A.K.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>On Saturday, November 30, 2013 1:44 PM, nooldor <nooldor at
gmail.com> wrote:
>>>
>>>here is in .xlsx should be easy to open and eventually
find&replace commas according to you excel settings (or maybe it will do it
automatically)
>>>
>>>
>>>
>>>
>>>
>>>
>>>On 30 November 2013 19:15, arun <smartpink111 at yahoo.com>
wrote:
>>>
>>>I tried that, but:
>>>>
>>>>
>>>>
>>>>dat1<-read.table("Book2.csv", head=T,
sep=";", dec=",")
>>>>> str(dat1)
>>>>'data.frame':??? 154 obs. of? 1 variable:
>>>>
>>>>Then I changed to:
>>>>dat1<-read.table("Book2.csv", head=T,
sep="\t", dec=",")
>>>>> str(dat1)
>>>>'data.frame':??? 154 obs. of? 661 variables:
>>>>Both of them are wrong as the number of variables should be 337.
>>>>A.K.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>On Saturday, November 30, 2013 12:53 PM, nooldor <nooldor at
gmail.com> wrote:
>>>>
>>>>Thank you,
>>>>
>>>>I got your reply. I am just testing your script. I will let you
know how is it soon.
>>>>
>>>>.csv could be problematic as commas are used as dec separator
(Eastern Europe excel settings) ... I read it in R with this:
>>>>dat1<-read.table("Book2.csv", head=T,
sep=";", dec=",")
>>>>
>>>>Thank you very much !!!
>>>>
>>>>T.S.
>>>>
>>>>
>>>>
>>>>
>>>>On 30 November 2013 18:39, arun <smartpink111 at
yahoo.com> wrote:
>>>>
>>>>I couldn't read the "Book.csv" as the format is
completely messed up.? Anyway, I hope the solution works on your dataset.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>On Saturday, November 30, 2013 10:34 AM, nooldor <nooldor
at gmail.com> wrote:
>>>>>
>>>>>
>>>>>ok.
>>>>>
>>>>>
>>>>>> dat1<-read.table("Book2.csv", head=T,
sep=";", dec=",") > colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:2,sep="."))
> lst1 <- lapply(paste("r",1:2,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x])) > lst2 <- lapply(lst1,function(x)
{colnames(x)[4] <-"r";x} ) > sum(!!rowSums(is.na(lst2[[1]])))
[1] 57 > #[1] 40 > sapply(lst2,function(x) sum(!!rowSums(is.na(x)))) [1]
57??0 > #[1] 40 46
>>>>>in att you have the data file
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>On 30 November 2013 16:22, arun <smartpink111 at
yahoo.com> wrote:
>>>>>
>>>>>Hi,
>>>>>>The first point is not that clear.
>>>>>>
>>>>>>Could you show the expected results in this case?
>>>>>>
>>>>>>set.seed(432)
>>>>>>dat1 <-
as.data.frame(matrix(sample(c(1:10,NA),154*5,replace=TRUE),ncol=5))
>>>>>>?colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:2,sep="."))
>>>>>>lst1 <-
lapply(paste("r",1:2,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>>>>>
>>>>>>
>>>>>>?lst2 <- lapply(lst1,function(x) {colnames(x)[4]
<-"r";x} )
>>>>>>?sum(!!rowSums(is.na(lst2[[1]])))
>>>>>>#[1] 40
>>>>>>?sapply(lst2,function(x) sum(!!rowSums(is.na(x))))
>>>>>>#[1] 40 46
>>>>>>
>>>>>>
>>>>>>A.K.
>>>>>>
>>>>>>
>>>>>>
>>>>>>On Saturday, November 30, 2013 10:09 AM, nooldor
<nooldor at gmail.com> wrote:
>>>>>>
>>>>>>Hi,
>>>>>>
>>>>>>Thanks for reply!
>>>>>>
>>>>>>
>>>>>>Three things:
>>>>>>1.
>>>>>>I did not write that some of the data has more then 31
NA in the column and then it is not possible to run lm()
>>>>>>
>>>>>>Error in lm.fit(x, y, offset = offset, singular.ok =
singular.ok, ...) :??0 (non-NA) casesIn this case program should return
"NA" symbol and go further, in the case when length of the
observations is shorter then 31 program should always return "NA" but
go further .
>>>>>>
>>>>>>
>>>>>>
>>>>>>2. in your result matrix there are only 4 columns (for
estimates of the coefficients), is it possible to put there 4 more columns with
p-values and one column with R squared
>>>>>>
>>>>>>
>>>>>>3. basic statistical test for the regressions:
>>>>>>
>>>>>>inflation factors can be captured by:
>>>>>>res2 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z)
>>>>>>? vif(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>>>>
>>>>>>and DW statistic:
>>>>>>res3 <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z)
>>>>>>? durbinWatsonTest(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>>>>
>>>>>>
>>>>>>3a)is that right?
>>>>>>
>>>>>>3b) how to do and have in user-friendly form
durbinWatsonTest for more then 1 lag?
>>>>>>
>>>>>>3c) how to apply: jarque.bera.test from library(tseries)
and ncvTest from library(car) ???
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>Pozdrowienia,
>>>>>>
>>>>>>Tomasz Schabek
>>>>>>
>>>>>>
>>>>>>On 30 November 2013 07:42, arun <smartpink111 at
yahoo.com> wrote:
>>>>>>
>>>>>>Hi,
>>>>>>>The link seems to be not working.? From the
description, it looks like:
>>>>>>>set.seed(432)
>>>>>>>dat1 <-
as.data.frame(matrix(sample(200,154*337,replace=TRUE),ncol=337))
>>>>>>>?colnames(dat1) <-
c(paste("F",1:3,sep="."),paste("r",1:334,sep="."))
>>>>>>>lst1 <-
lapply(paste("r",1:334,sep="."),function(x)
cbind(dat1[,c(1:3)],dat1[x]))
>>>>>>>
>>>>>>>?lst2 <- lapply(lst1,function(x) {colnames(x)[4]
<-"r";x} )
>>>>>>>library(zoo)
>>>>>>>
>>>>>>>res <- do.call(rbind,lapply(lst2,function(x)
rollapply(x,width=32,FUN=function(z) coef(lm(r~
F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right")))
>>>>>>>
>>>>>>>row.names(res) <-
rep(paste("r",1:334,sep="."),each=123)
>>>>>>>?dim(res)
>>>>>>>#[1] 41082???? 4
>>>>>>>
>>>>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[1:32,]) )
>>>>>>>#(Intercept)???????? F.1???????? F.2???????? F.3
>>>>>>>#109.9168150? -0.1705361? -0.1028231?? 0.2027911
>>>>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[2:33,]) )
>>>>>>>#(Intercept)???????? F.1???????? F.2???????? F.3
>>>>>>>#119.3718949? -0.1660709? -0.2059830?? 0.1338608
>>>>>>>res[1:2,]
>>>>>>>#??? (Intercept)??????? F.1??????? F.2?????? F.3
>>>>>>>#r.1??? 109.9168 -0.1705361 -0.1028231 0.2027911
>>>>>>>#r.1??? 119.3719 -0.1660709 -0.2059830 0.1338608
>>>>>>>
>>>>>>>A.K.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On Friday, November 29, 2013 6:43 PM, nooldor
<nooldor at gmail.com> wrote:
>>>>>>>Hi all!
>>>>>>>
>>>>>>>
>>>>>>>I am just starting my adventure with R, so excuse me
naive questions.
>>>>>>>
>>>>>>>My data look like that:
>>>>>>>
>>>>>>><http://r.789695.n4.nabble.com/file/n4681391/data_descr_img.jpg>
>>>>>>>
>>>>>>>I have 3 independent variables (F.1, F.2 and F.3)
and 334 other variables
>>>>>>>(r.1, r.2, ... r.334) - each one of these will be
dependent variable in my
>>>>>>>regression.
>>>>>>>
>>>>>>>Total span of the time is 154 observations. But I
would like to have rolling
>>>>>>>window regression with length of 31 observations.
>>>>>>>
>>>>>>>I would like to run script like that:
>>>>>>>
>>>>>>>summary(lm(r.1~F.1+F.2+F.3, data=data))
>>>>>>>vif(lm(r.1~F.1+F.2+F.3, data=data))
>>>>>>>
>>>>>>>But for each of 334 (r.1 to r.334) dependent
variables separately and with
>>>>>>>rolling-window of the length 31obs.
>>>>>>>
>>>>>>>Id est:
>>>>>>>summary(lm(r.1~F.1+F.2+F.3, data=data)) would be run
123 (154 total obs -
>>>>>>>31. for the first regression) times for
rolling-fixed period of 31 obs.
>>>>>>>
>>>>>>>The next regression would be:
>>>>>>>summary(lm(r.2~F.1+F.2+F.3, data=data)) also 123
times ... and so on till
>>>>>>>summary(lm(r.334~F.1+F.2+F.3, data=data))
>>>>>>>
>>>>>>>It means it would be 123 x 334 regressions (=41082
regressions)
>>>>>>>
>>>>>>>I would like to save results (summary + vif test) of
all those 41082
>>>>>>>regressions in one read-user-friendly file like this
given by e.g command
>>>>>>>capture.output()
>>>>>>>
>>>>>>>Could you help with it?
>>>>>>>
>>>>>>>Regards,
>>>>>>>
>>>>>>>T.S.
>>>>>>>
>>>>>>>??? [[alternative HTML version deleted]]
>>>>>>>
>>>>>>>______________________________________________
>>>>>>>R-help at r-project.org mailing list
>>>>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>>>>>>>and provide commented, minimal, self-contained,
reproducible code.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
nooldor
2013-Dec-02 03:28 UTC
[R] Multiple regressions with changing dependent variable and time span
Yes. Thank you once again. With this:> res2 <- do.call(cbind,lapply(lst2, > function(x) > rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z); > if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); > c(coef(l1), pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } > else rep(NA,9)},by.column=FALSE,align="right"))) > dim(res2) > #[1] 123 3006 > 334*9 > #[1] 3006 >I can delete unnecessary columns and have only those I need. BIG thank! Regards, T.S. On 2 December 2013 04:14, arun <smartpink111@yahoo.com> wrote:> Hi, > I guess you wanted something like this: > > res2 <- do.call(cbind,lapply(lst2, > function(x) > rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z); > if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); > c(coef(l1), pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } > else rep(NA,9)},by.column=FALSE,align="right"))) > dim(res2) > #[1] 123 3006 > 334*9 > #[1] 3006 > > lst2New <- lapply(split(seq(3006),((seq(3006)-1)%%9)+1),function(x) {x1 <- > res2[,x]; colnames(x1) <- paste(colnames(x1),1:334,sep="_");x1}) > sapply(lst2New,ncol) > # 1 2 3 4 5 6 7 8 9 > #334 334 334 334 334 334 334 334 334 > lst2New[[1]][1:4,1:4] > # (Intercept)_1 (Intercept)_2 (Intercept)_3 (Intercept)_4 > #[1,] NA -0.3295765 -0.7025259 NA > #[2,] NA 0.1217360 -1.5221660 NA > #[3,] NA 0.3192466 -1.3586341 NA > #[4,] NA 0.2618476 -0.9698798 NA > > > A.K. > > > > > > > > > > > > > > > On Sunday, December 1, 2013 9:33 PM, nooldor <nooldor@gmail.com> wrote: > > Hi, > > actually, do you remember yesterday code you build for me? > it was: > > dat1<-read.csv("Book2.csv", header=T) > >###same as previous > > > > > >lst1 <- lapply(paste("r",1:334,sep="."),function(x) > cbind(dat1[,c(1:3)],dat1[x])) > >lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} ) > > sapply(lst2,function(x) sum(!!rowSums(is.na(x)))) > >library(zoo) > >res1 <- do.call(rbind,lapply(lst2, > >function(x) rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z); > if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); > c(coef(l1), > pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } else > rep(NA,9)},by.column=FALSE,align="right"))) > >row.names(res1) <- rep(paste("r",1:334,sep="."),each=123) > > dim(res1) > >#[1] 41082 9 > > (boo2.xls is attached in this e-mail previously, just need to open it in > excel and save as csv) > then we have result res1 as matrix 41082 x 9 > (this 41082 = 123observations in time x 334objects) > now I need to separate this matrix res1 into 9 different matrices each one > containing 123 obs x 334 variables > > > > maybe reshape() would be useful ... > > ... let me know if I described it clearly > > > > > > On 2 December 2013 03:17, arun <smartpink111@yahoo.com> wrote: > > #or > > > >data.frame(split(dat1,dat1$A)) > > > > > > > > > > > >On Sunday, December 1, 2013 9:16 PM, arun <smartpink111@yahoo.com> wrote: > >Hi, > >Try: > >dat1 <- read.table(text="A B C D > >r.1 x1 x2 x3 > >r.1 x4 x5 x6 > >r.2 x7 x8 x9 > >r.2 x10 x11 x12 > >r.3 x13 x14 x15 > >r.3 x16 x17 x18",header=TRUE,stringsAsFactors=FALSE) > > > > dat2 <- do.call(cbind,split(dat1,dat1$A)) > >colnames(dat2) <- gsub(".*\\.","",colnames(dat2)) > >A.K. > > > > > > > > > > > > > > > >On Sunday, December 1, 2013 6:32 PM, nooldor <nooldor@gmail.com> wrote: > > > >Hi, > > > >could you also tell me how to reshape the res1 matrix like that: > > > >[now] > > > >A B C D > >r.1 x1 x2 x3 > >r.1 x4 x5 x6 > >r.2 x7 x8 x9r.2 x10 x11 x12r.3 x13 x14 x15r.3 x16 x17 x18 > > > > > >[after]: > >A B C D A B C D A B C D > >r.1 x1 x2 x3 r.2 x7 x8 x9 r.3 x13 x14 x15 > >r.1 x4 x5 x6 r.2 x10 x11 x12 r.3 x16 x17 x18 > > > > > > > >big thanks! > > > > > > > > > > > > > > > >On 30 November 2013 23:28, arun <smartpink111@yahoo.com> wrote: > > > >Hi, > >>No problem. > >> > >>In that case, each column will be a list. For example if I take the > first element of `lst2` > >>dW1 <- rollapply(lst2[[1]],width=32,FUN=function(z) {z1 <- > as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1 > <-lm(r~F.1+F.2+F.3,data=z1); durbinWatsonTest(l1,max.lag=3) } else > rep(NA,4)},by.column=FALSE,align="right") > >> > >> tail(dW1[,1],1) > >>#[[1]] > >>#[1] -0.3602936 0.1975667 -0.1740797 > >> > >> > >>You can store it by: > >>resdW1 <- do.call(cbind,lapply(seq_len(ncol(dW1)),function(i) > do.call(rbind,dW1[,i]))[1:3]) > >> > >> > >>Similarly, for more than one elements (using a subset of lst2- as it > takes time) > >> > >> > >>lst3 <- lapply(lst2[1:2],function(x) > rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z); > if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); > durbinWatsonTest(l1,max.lag=3) } else > rep(NA,4)},by.column=FALSE,align="right")) > >> > >>lst3New <- lapply(lst3,function(x) > do.call(cbind,lapply(seq_len(ncol(x)),function(i) > do.call(rbind,x[,i]))[1:3])) > >> > >>lst3New <- lapply(lst3New, function(x) {colnames(x) <- > paste0(rep(c("r","dw","p"),each=3),1:3); x}) > >> > >>A.K. > >> > >> > >>On Saturday, November 30, 2013 5:03 PM, nooldor <nooldor@gmail.com> > wrote: > >> > >>Hey! > >> > >> > >>Yes, > >>only the D-W test takes so much time, did not check it yet > >> > >>I checked results (estimates) with manually run regressions (in excel) > and they are correct. > >> > >> > >>I only change the "width" to 31 and "each=123" to 124, cause it should > be ((154-31)+1) x 334 = 41416 matrix > >> > >> > >>with the lag in D-W test I was wondering how to have table when I use > durbinWatsonTest(l1,3) - with three lags instead of default 1. > >> > >>but I can manage it - just need to learn about functions used by you. > >> > >> > >>Any way: BIG THANK to you! > >> > >> > >>Best wishes, > >>T.S. > >> > >> > >> > >> > >> > >>On 30 November 2013 21:12, arun <smartpink111@yahoo.com> wrote: > >> > >>Hi, > >>> > >>>I was able to read the file after saving it as .csv. It seems to work > without any errors. > >>> > >>>dat1<-read.csv("Book2.csv", header=T) > >>>###same as previous > >>> > >>> > >>>lst1 <- lapply(paste("r",1:334,sep="."),function(x) > cbind(dat1[,c(1:3)],dat1[x])) > >>>lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} ) > >>> sapply(lst2,function(x) sum(!!rowSums(is.na(x)))) > >>>library(zoo) > >>> > >>>res1 <- do.call(rbind,lapply(lst2,function(x) > rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z); > if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); > c(coef(l1), pval=summary(l1)$coef[,4], rsquare=summary(l1)$r.squared) } > else rep(NA,9)},by.column=FALSE,align="right"))) > >>>row.names(res1) <- rep(paste("r",1:334,sep="."),each=123) > >>> dim(res1) > >>>#[1] 41082 9 > >>> > >>>#vif > >>> library(car) > >>> > >>>res2 <- do.call(rbind,lapply(lst2,function(x) > rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z); > if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); vif(l1) } > else rep(NA,3)},by.column=FALSE,align="right"))) > >>>row.names(res2) <- rep(paste("r",1:334,sep="."),each=123) > >>>dim(res2) > >>>#[1] 41082 3 > >>> > >>>#DW statistic: > >>> lst3 <- lapply(lst2,function(x) rollapply(x,width=32,FUN=function(z) > {z1 <- as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1 > <-lm(r~F.1+F.2+F.3,data=z1); durbinWatsonTest(l1) } else > rep(NA,4)},by.column=FALSE,align="right")) > >>> res3 <- do.call(rbind,lapply(lst3,function(x) x[,-4])) > >>>row.names(res3) <- rep(paste("r",1:334,sep="."),each=123) > >>> dim(res3) > >>>#[1] 41082 3 > >>>##ncvTest() > >>>f4 <- function(meanmod, dta, varmod) { > >>>assign(".dta", dta, envir=.GlobalEnv) > >>>assign(".meanmod", meanmod, envir=.GlobalEnv) > >>>m1 <- lm(.meanmod, .dta) > >>>ans <- ncvTest(m1, varmod) > >>>remove(".dta", envir=.GlobalEnv) > >>>remove(".meanmod", envir=.GlobalEnv) > >>>ans > >>>} > >>> > >>> lst4 <- lapply(lst2,function(x) rollapply(x,width=32,FUN=function(z) > {z1 <- as.data.frame(z); if(!sum(!!rowSums(is.na(z1)))) {l1 <-f4(r~.,z1) > } else NA},by.column=FALSE,align="right")) > >>>names(lst4) <- paste("r",1:334,sep=".") > >>>length(lst4) > >>>#[1] 334 > >>> > >>> > >>>###jarque.bera.test > >>>library(tseries) > >>>res5 <- do.call(rbind,lapply(lst2,function(x) > rollapply(x,width=32,FUN=function(z) {z1 <- as.data.frame(z); > if(!sum(!!rowSums(is.na(z1)))) {l1 <-lm(r~F.1+F.2+F.3,data=z1); resid <- > residuals(l1); unlist(jarque.bera.test(resid)[1:3]) } else > rep(NA,3)},by.column=FALSE,align="right"))) > >>> dim(res5) > >>>#[1] 41082 3 > >>> > >>>A.K. > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>>On Saturday, November 30, 2013 1:44 PM, nooldor <nooldor@gmail.com> > wrote: > >>> > >>>here is in .xlsx should be easy to open and eventually find&replace > commas according to you excel settings (or maybe it will do it > automatically) > >>> > >>> > >>> > >>> > >>> > >>> > >>>On 30 November 2013 19:15, arun <smartpink111@yahoo.com> wrote: > >>> > >>>I tried that, but: > >>>> > >>>> > >>>> > >>>>dat1<-read.table("Book2.csv", head=T, sep=";", dec=",") > >>>>> str(dat1) > >>>>'data.frame': 154 obs. of 1 variable: > >>>> > >>>>Then I changed to: > >>>>dat1<-read.table("Book2.csv", head=T, sep="\t", dec=",") > >>>>> str(dat1) > >>>>'data.frame': 154 obs. of 661 variables: > >>>>Both of them are wrong as the number of variables should be 337. > >>>>A.K. > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>>On Saturday, November 30, 2013 12:53 PM, nooldor <nooldor@gmail.com> > wrote: > >>>> > >>>>Thank you, > >>>> > >>>>I got your reply. I am just testing your script. I will let you know > how is it soon. > >>>> > >>>>.csv could be problematic as commas are used as dec separator (Eastern > Europe excel settings) ... I read it in R with this: > >>>>dat1<-read.table("Book2.csv", head=T, sep=";", dec=",") > >>>> > >>>>Thank you very much !!! > >>>> > >>>>T.S. > >>>> > >>>> > >>>> > >>>> > >>>>On 30 November 2013 18:39, arun <smartpink111@yahoo.com> wrote: > >>>> > >>>>I couldn't read the "Book.csv" as the format is completely messed up. > Anyway, I hope the solution works on your dataset. > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>On Saturday, November 30, 2013 10:34 AM, nooldor <nooldor@gmail.com> > wrote: > >>>>> > >>>>> > >>>>>ok. > >>>>> > >>>>> > >>>>>> dat1<-read.table("Book2.csv", head=T, sep=";", dec=",") > > colnames(dat1) <- c(paste("F",1:3,sep="."),paste("r",1:2,sep=".")) > lst1 > <- lapply(paste("r",1:2,sep="."),function(x) cbind(dat1[,c(1:3)],dat1[x])) > > lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} ) > > sum(!!rowSums(is.na(lst2[[1]]))) [1] 57 > #[1] 40 > > sapply(lst2,function(x) sum(!!rowSums(is.na(x)))) [1] 57 0 > #[1] 40 46 > >>>>>in att you have the data file > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>On 30 November 2013 16:22, arun <smartpink111@yahoo.com> wrote: > >>>>> > >>>>>Hi, > >>>>>>The first point is not that clear. > >>>>>> > >>>>>>Could you show the expected results in this case? > >>>>>> > >>>>>>set.seed(432) > >>>>>>dat1 <- > as.data.frame(matrix(sample(c(1:10,NA),154*5,replace=TRUE),ncol=5)) > >>>>>> colnames(dat1) <- c(paste("F",1:3,sep="."),paste("r",1:2,sep=".")) > >>>>>>lst1 <- lapply(paste("r",1:2,sep="."),function(x) > cbind(dat1[,c(1:3)],dat1[x])) > >>>>>> > >>>>>> > >>>>>> lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} ) > >>>>>> sum(!!rowSums(is.na(lst2[[1]]))) > >>>>>>#[1] 40 > >>>>>> sapply(lst2,function(x) sum(!!rowSums(is.na(x)))) > >>>>>>#[1] 40 46 > >>>>>> > >>>>>> > >>>>>>A.K. > >>>>>> > >>>>>> > >>>>>> > >>>>>>On Saturday, November 30, 2013 10:09 AM, nooldor <nooldor@gmail.com> > wrote: > >>>>>> > >>>>>>Hi, > >>>>>> > >>>>>>Thanks for reply! > >>>>>> > >>>>>> > >>>>>>Three things: > >>>>>>1. > >>>>>>I did not write that some of the data has more then 31 NA in the > column and then it is not possible to run lm() > >>>>>> > >>>>>>Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, > ...) : 0 (non-NA) casesIn this case program should return "NA" symbol and > go further, in the case when length of the observations is shorter then 31 > program should always return "NA" but go further . > >>>>>> > >>>>>> > >>>>>> > >>>>>>2. in your result matrix there are only 4 columns (for estimates of > the coefficients), is it possible to put there 4 more columns with p-values > and one column with R squared > >>>>>> > >>>>>> > >>>>>>3. basic statistical test for the regressions: > >>>>>> > >>>>>>inflation factors can be captured by: > >>>>>>res2 <- do.call(rbind,lapply(lst2,function(x) > rollapply(x,width=32,FUN=function(z) > >>>>>> vif(lm(r~ > F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right"))) > >>>>>> > >>>>>>and DW statistic: > >>>>>>res3 <- do.call(rbind,lapply(lst2,function(x) > rollapply(x,width=32,FUN=function(z) > >>>>>> durbinWatsonTest(lm(r~ > F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right"))) > >>>>>> > >>>>>> > >>>>>>3a)is that right? > >>>>>> > >>>>>>3b) how to do and have in user-friendly form durbinWatsonTest for > more then 1 lag? > >>>>>> > >>>>>>3c) how to apply: jarque.bera.test from library(tseries) and ncvTest > from library(car) ??? > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>>Pozdrowienia, > >>>>>> > >>>>>>Tomasz Schabek > >>>>>> > >>>>>> > >>>>>>On 30 November 2013 07:42, arun <smartpink111@yahoo.com> wrote: > >>>>>> > >>>>>>Hi, > >>>>>>>The link seems to be not working. From the description, it looks > like: > >>>>>>>set.seed(432) > >>>>>>>dat1 <- > as.data.frame(matrix(sample(200,154*337,replace=TRUE),ncol=337)) > >>>>>>> colnames(dat1) <- > c(paste("F",1:3,sep="."),paste("r",1:334,sep=".")) > >>>>>>>lst1 <- lapply(paste("r",1:334,sep="."),function(x) > cbind(dat1[,c(1:3)],dat1[x])) > >>>>>>> > >>>>>>> lst2 <- lapply(lst1,function(x) {colnames(x)[4] <-"r";x} ) > >>>>>>>library(zoo) > >>>>>>> > >>>>>>>res <- do.call(rbind,lapply(lst2,function(x) > rollapply(x,width=32,FUN=function(z) coef(lm(r~ > F.1+F.2+F.3,data=as.data.frame(z))),by.column=FALSE,align="right"))) > >>>>>>> > >>>>>>>row.names(res) <- rep(paste("r",1:334,sep="."),each=123) > >>>>>>> dim(res) > >>>>>>>#[1] 41082 4 > >>>>>>> > >>>>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[1:32,]) ) > >>>>>>>#(Intercept) F.1 F.2 F.3 > >>>>>>>#109.9168150 -0.1705361 -0.1028231 0.2027911 > >>>>>>>coef(lm(r.1~F.1+F.2+F.3,data=dat1[2:33,]) ) > >>>>>>>#(Intercept) F.1 F.2 F.3 > >>>>>>>#119.3718949 -0.1660709 -0.2059830 0.1338608 > >>>>>>>res[1:2,] > >>>>>>># (Intercept) F.1 F.2 F.3 > >>>>>>>#r.1 109.9168 -0.1705361 -0.1028231 0.2027911 > >>>>>>>#r.1 119.3719 -0.1660709 -0.2059830 0.1338608 > >>>>>>> > >>>>>>>A.K. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>On Friday, November 29, 2013 6:43 PM, nooldor <nooldor@gmail.com> > wrote: > >>>>>>>Hi all! > >>>>>>> > >>>>>>> > >>>>>>>I am just starting my adventure with R, so excuse me naive > questions. > >>>>>>> > >>>>>>>My data look like that: > >>>>>>> > >>>>>>><http://r.789695.n4.nabble.com/file/n4681391/data_descr_img.jpg> > >>>>>>> > >>>>>>>I have 3 independent variables (F.1, F.2 and F.3) and 334 other > variables > >>>>>>>(r.1, r.2, ... r.334) - each one of these will be dependent > variable in my > >>>>>>>regression. > >>>>>>> > >>>>>>>Total span of the time is 154 observations. But I would like to > have rolling > >>>>>>>window regression with length of 31 observations. > >>>>>>> > >>>>>>>I would like to run script like that: > >>>>>>> > >>>>>>>summary(lm(r.1~F.1+F.2+F.3, data=data)) > >>>>>>>vif(lm(r.1~F.1+F.2+F.3, data=data)) > >>>>>>> > >>>>>>>But for each of 334 (r.1 to r.334) dependent variables separately > and with > >>>>>>>rolling-window of the length 31obs. > >>>>>>> > >>>>>>>Id est: > >>>>>>>summary(lm(r.1~F.1+F.2+F.3, data=data)) would be run 123 (154 total > obs - > >>>>>>>31. for the first regression) times for rolling-fixed period of 31 > obs. > >>>>>>> > >>>>>>>The next regression would be: > >>>>>>>summary(lm(r.2~F.1+F.2+F.3, data=data)) also 123 times ... and so > on till > >>>>>>>summary(lm(r.334~F.1+F.2+F.3, data=data)) > >>>>>>> > >>>>>>>It means it would be 123 x 334 regressions (=41082 regressions) > >>>>>>> > >>>>>>>I would like to save results (summary + vif test) of all those 41082 > >>>>>>>regressions in one read-user-friendly file like this given by e.g > command > >>>>>>>capture.output() > >>>>>>> > >>>>>>>Could you help with it? > >>>>>>> > >>>>>>>Regards, > >>>>>>> > >>>>>>>T.S. > >>>>>>> > >>>>>>> [[alternative HTML version deleted]] > >>>>>>> > >>>>>>>______________________________________________ > >>>>>>>R-help@r-project.org mailing list > >>>>>>>https://stat.ethz.ch/mailman/listinfo/r-help > >>>>>>>PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > >>>>>>>and provide commented, minimal, self-contained, reproducible code. > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>> > >>> > >> > > >[[alternative HTML version deleted]]