Hi, I'm quite new to R (1 month full time use so far). I have to run loop
regressions VERY often in my work, so I would appreciate some new
methodology that I'm not considering.
#---------------------------------------------------------------------------------------------
y<-matrix(rnorm(100),ncol=10,nrow=10)
x<-matrix(rnorm(50),ncol=5,nrow=10)
#Suppose I want to run the specification y=A+Bx+error, for each and every
#y[,n] onto each and every x[,n].
#So with:
ncol(y);ncol(x)
#I should end up with 10*5=50 regressions in total.
#I know how to do this fine:
MISC1<-0
for(i in 1:ncol(y)){
for(j in 1:ncol(x)){
reg<-lm(y[,i]~x[,j])
MISC1<-cbind(MISC1,coef(reg)) #for coefficients
}}
coef<-matrix(MISC1[,-1],ncol=50)
coef[,1];coef(lm(y[,1]~x[,1])) #test passed
ncol(coef) #as desired, 50 regressions.
#---------------------------------------------------------------------------------------------
Now for my question: Are there easier or better methods of doing this? I know
of a lapply method, but the only lapply way I know of for lm(..) is
basically doing a lapply inside of a lapply, meaning it's exactly the same
as the double loop above... I'm looking to escape from loops.
Also, if any of you could share your top R tips that you've learned over the
years, I'd really appreciate it. Tiny things like learning that array() and
matrix() can have a 3rd dimension, learning of strsplit, etc.. have helped
me immeasurably. (Note that I'm also googling for this stuff! I'm doing R 14
hours a day!).
Thanks.
--
View this message in context:
http://r.789695.n4.nabble.com/Other-ways-to-lm-regression-non-loop-tp4234487p4234487.html
Sent from the R help mailing list archive at Nabble.com.
You can get the OLS coefficients with basic matrix operations as well (
https://files.nyu.edu/mrg217/public/ols_matrix.pdf) and thereby avoid one
of the loops. I do not know how efficient this is, but I have attached an
example you can paste below your code. Here, each pass of the loop uses one
x column as the right-hand-side variable for all y columns at once.
The coefficients match, but they are in a different order.
#----------------------------------------------------------------------------------------------------------------------------
#Original code here....
#Creates an array of ones (the intercept column)
m1=array(1,nrow(x))
MISC2<-0
for(j in 1:ncol(x)){
mX=cbind(m1,x[,j])
#OLS: solve (X'X) b = X'y for all 10 columns of y at once
reg2<-solve(crossprod(mX),crossprod(mX,y))
MISC2<-cbind(MISC2,reg2) #for coefficients
}
coef<-matrix(MISC2[,-1],ncol=50)
coef[,1];coef(lm(y[,1]~x[,1])) #test passed
ncol(coef) #as desired, 50 regressions.
MISC1
MISC2
#-------------------------------------------------------------------------------------------------------------------------------------
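A rough sketch that goes one step further than the matrix approach above, in case it is useful. For a simple regression with an intercept, the slope is cov(x,y)/var(x) and the intercept is mean(y) - slope*mean(x), so all 50 fits can be computed without any explicit loop. The names `slopes` and `intercepts` and the call to set.seed() are my own additions, not part of the code above, and this shortcut only applies to the one-regressor case.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible
y <- matrix(rnorm(100), ncol = 10, nrow = 10)
x <- matrix(rnorm(50),  ncol = 5,  nrow = 10)

# Slope of y[, i] on x[, j] is cov(x[, j], y[, i]) / var(x[, j]).
# cov(x, y) is a 5 x 10 matrix; dividing by a length-5 vector recycles
# down the rows, so row j is divided by var(x[, j]).
slopes <- cov(x, y) / apply(x, 2, var)

# Intercept is mean(y[, i]) - slope * mean(x[, j]).
intercepts <- matrix(colMeans(y), nrow = ncol(x), ncol = ncol(y), byrow = TRUE) -
  slopes * colMeans(x)

# Spot check one pair against lm()
i <- 3; j <- 2
rbind(closed_form = c(intercepts[j, i], slopes[j, i]),
      lm          = coef(lm(y[, i] ~ x[, j])))
```

Both results are 5 x 10 matrices, with rows indexing the x columns and columns indexing the y columns, so `slopes[j, i]` belongs to the regression of y[,i] on x[,j].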
2011/12/26 iliketurtles <isaacm200@gmail.com>
Hello, iliketurtles (?),

for whatever strange reasons you want to regress all y-columns on all
x-columns, maybe

reg <- apply( x, 2, function( xx) lm( y ~ xx))
do.call( "cbind", lapply( reg, coef))

does what you want. (To understand what the code above does, check the
documentation for lm(): "If response is a matrix a linear model is fitted
separately by least-squares to each column of the matrix.")

Hth -- Gerrit
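For anyone who wants the coefficients from Gerrit's suggestion in an easier-to-index form, here is a minimal sketch under my own assumptions. It uses lapply() over column indices instead of apply(), and the names `fits`, `coefs` and `coef_loop_order` as well as the "y1", "x1" style labels are mine. It stores everything in a 3-dimensional array and then shows how to recover the column order of the original double loop.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible
y <- matrix(rnorm(100), ncol = 10, nrow = 10)
x <- matrix(rnorm(50),  ncol = 5,  nrow = 10)

# One lm() call per x column; each call fits all 10 y columns at once.
fits <- lapply(seq_len(ncol(x)), function(j) lm(y ~ x[, j]))

# Each coef() is a 2 x 10 matrix (rows: intercept, slope; columns: y columns).
# Stack them into a 2 x 10 x 5 array indexed [coefficient, y column, x column].
coefs <- array(unlist(lapply(fits, coef)),
               dim = c(2, ncol(y), ncol(x)),
               dimnames = list(c("intercept", "slope"),
                               paste0("y", seq_len(ncol(y))),
                               paste0("x", seq_len(ncol(x)))))

coefs[, "y3", "x2"]  # same numbers as coef(lm(y[, 3] ~ x[, 2]))

# Reorder to the column order of the original double loop (x varies fastest)
coef_loop_order <- matrix(aperm(coefs, c(1, 3, 2)), nrow = 2)
```

Indexing by name avoids having to work out which of the 50 columns belongs to which (y, x) pair.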
Dear anonymous:

1. You may be more likely to get "useful tips" on this list if you sign
with your real name. It's friendlier.

2. If you are using R "14 hours/day," get and read a good R book. The
CRAN site or Amazon lists many; choose one or more that suits your
needs.

3. Read the R Help files carefully. ?lm tells you that you do not need
a loop to fit many y's simultaneously:
"If response is a matrix a linear model is fitted separately by
least-squares to each column of the matrix."

4. Loops are not necessarily so terrible. "apply" type functions are
basically loops, also; their chief advantage is often just code
readability, not efficiency.

5. For tasks such as yours, ?update methods are typically useful.

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
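To illustrate Bert's point 5, here is one way update() might be used for this task. This is only a sketch under my own assumptions: the regressors are put in a data frame `xdf` (whose default column names are V1 to V5), and the names `base_fit` and `fits` are mine, not anything from the posts above.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible
y   <- matrix(rnorm(100), ncol = 10, nrow = 10)
xdf <- as.data.frame(matrix(rnorm(50), ncol = 5, nrow = 10))  # columns V1..V5

# Fit once against the first regressor; y is a matrix, so this is 10 regressions.
base_fit <- lm(y ~ V1, data = xdf)

# Refit with each regressor in turn by swapping the right-hand side.
fits <- lapply(names(xdf), function(v) {
  update(base_fit, as.formula(paste(". ~", v)))
})
names(fits) <- names(xdf)

coef(fits[["V2"]])[, 3]  # intercept and slope for y[, 3] regressed on V2
```

As point 4 says, lapply() here is still a loop over the five regressors; the gain is mainly that the model specification is written once and only the part that changes is spelled out.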
Thanks for the advice everyone. All very helpful.

@Bert Added my information to signature, thanks.

-----
----
Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context:
http://r.789695.n4.nabble.com/Other-ways-to-lm-regression-non-loop-tp4234487p4235654.html