thr3ads.net - R help - [R] Looping an lapply linear regression function [Sep 2013]

arun

2013-Sep-05 16:49 UTC

[R] Looping an lapply linear regression function

Hi,
Maybe this helps:
set.seed(28)
dat1 <- setNames(as.data.frame(matrix(sample(1:40, 10*5, replace=TRUE), ncol=5)), letters[1:5])
indx <- as.data.frame(combn(names(dat1), 2), stringsAsFactors=FALSE)
res <- t(sapply(indx, function(x) {
  x1 <- cbind(dat1[x[1]], dat1[x[2]])
  summary(lm(x1[,1] ~ x1[,2]))$coef[,4]
}))
rownames(res) <- apply(indx, 2, paste, collapse="_")
colnames(res)[2] <- "Coef1"
head(res, 3)
#    (Intercept)     Coef1
#a_b  0.39862676 0.8365606
#a_c  0.02427885 0.6094141
#a_d  0.37521423 0.7578723


#permutation
indx2 <- expand.grid(names(dat1), names(dat1), stringsAsFactors=FALSE)
#or
indx2 <- expand.grid(rep(list(names(dat1)), 2), stringsAsFactors=FALSE)
indx2New <- indx2[indx2[,1] != indx2[,2],]
res2 <- t(sapply(seq_len(nrow(indx2New)), function(i) {
  x1 <- indx2New[i,]
  x2 <- cbind(dat1[x1[,1]], dat1[x1[,2]])
  summary(lm(x2[,1] ~ x2[,2]))$coef[,4]
}))
row.names(res2) <- apply(indx2New, 1, paste, collapse="_")
colnames(res2) <- colnames(res)

A.K.


Hi everyone,

First off, I'd just like to say thanks for everyone's contributions.
Up until now, I've never had to post, as I've always found the answers
by trawling through the database. I've finally managed to stump
myself, and although I'm sure the answer to my problem is fairly simple
for someone out there, I have spent the whole day in front of
my computer struggling. I know I'll probably get an absolute ribbing
for making a basic mistake, or not understanding something fully, but
I'm blind to the mistake now after looking at it for so long.

What I'm looking to do is produce a matrix ([28,28]) of
p-values from running linear regressions of 28 variables
against each other (e.g. a~b, a~c, a~d, ..., b~a, b~c, etc.), if that makes
sense. I've managed to get this to work if I just input each variable
by hand, but this isn't going to help when I have to make 20 matrices.

My script is as follows:


for (j in 1:28)
{
  ## This section works perfectly if I don't try to loop it. I know
  ## this won't work at the moment, because I haven't designated what j is,
  ## but I'm showing it to highlight what I'm attempting to do.

  models <- lapply(varlist, function(x) {
    lm(substitute(ANS ~ i, list(i = as.name(x))), data = con.i)
  })

  abc <- lapply(models, function(f) summary(f)$coefficients[,4])

  abc <- do.call(rbind, abc)
}

I get the following error when I try to loop it:

Error in model.frame.default(formula = substitute(j ~ i, list(i = as.name(x))),  :
  variable lengths differ (found for 'ANS')   ## ANS being my first variable

All variables are of the same length, with 21 recordings for each.


If anyone can suggest a method of looping, or another means
of producing 'models' for each of my 28 variables without having to do
it by hand, that would be fantastic.

Thanks in advance!!
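
A likely cause of the error above: substitute() only replaces i, so inside the loop the formula stays j ~ <variable>, and j is then looked up as the length-1 loop counter rather than as a column of con.i. The sketch below builds each formula from column names with reformulate() instead. The data frame is a made-up stand-in (21 rows, four column names borrowed from the thread, filled with random numbers), not the poster's data, and the names vars and pmat are illustrative.

```r
# Stand-in data: 21 observations of 4 variables (random values, for illustration only)
set.seed(1)
con.i <- as.data.frame(matrix(rnorm(21 * 4), ncol = 4))
names(con.i) <- c("ANS", "GDP", "Trac", "Vehi")
vars <- names(con.i)

# Square matrix of slope p-values: rows = response, columns = predictor
pmat <- matrix(NA_real_, nrow = length(vars), ncol = length(vars),
               dimnames = list(vars, vars))

for (resp in vars) {
  for (pred in setdiff(vars, resp)) {
    # reformulate() builds e.g. ANS ~ GDP from the two character strings
    fit <- lm(reformulate(pred, response = resp), data = con.i)
    # Row 2 is the slope, column 4 is Pr(>|t|)
    pmat[resp, pred] <- summary(fit)$coefficients[2, 4]
  }
}
pmat
```

The diagonal is left as NA because a variable is never regressed on itself. Wrapping the two loops in a function of the data frame would make it easy to repeat for each of the 20 matrices.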

Flavio Barros

2013-Sep-05 19:41 UTC

[R] Looping an lapply linear regression function

Hello Arun. Can you provide some data? To help you better, I will need a
complete reproducible example, OK?



arun

2013-Sep-06 16:03 UTC

[R] Looping an lapply linear regression function

Hi,
Using the example dataset (Test_data.csv):
dat1 <- read.csv("Test_data.csv", header=TRUE, sep="\t", row.names=1)
indx2 <- expand.grid(names(dat1), names(dat1), stringsAsFactors=FALSE)
indx2New <- indx2[indx2[,1] != indx2[,2],]
res2 <- t(sapply(seq_len(nrow(indx2New)), function(i) {
  x1 <- indx2New[i,]
  x2 <- cbind(dat1[x1[,1]], dat1[x1[,2]])
  summary(lm(x2[,1] ~ x2[,2]))$coef[,4]
}))
dat2 <- cbind(indx2New, value=res2[,2])
library(reshape2)
res2New <- dcast(dat2, Var1~Var2, value.var="value")
row.names(res2New) <- res2New[,1]
res2New <- as.matrix(res2New[,-1])
dim(res2New)
#[1] 28 28
head(res2New, 3)
#            AgriEmi   AgriMach  AgriValAd     AgrVaGDP       AIL     ALAre
#AgriEmi          NA 0.23401895 0.45697412 4.644877e-01 0.6398030 0.4039855
#AgriMach  0.2340189         NA 0.01449519 4.922558e-06 0.3890046 0.9279044
#AgriValAd 0.4569741 0.01449519         NA 5.135269e-02 0.5325943 0.4872555
#              ALPer          ANS     AraLa  AraLaPer    CombusRen      ForArea
#AgriEmi   0.4039855 2.507257e-01 0.2303275 0.2303275 0.9438409125 0.0004473563
#AgriMach  0.9279044 6.072123e-05 0.3154370 0.3154370 0.0040254771 0.2590309747
#AgriValAd 0.4872555 2.060412e-01 0.8449600 0.8449600 0.0008077264 0.5152352072
#             ForArePer  ForProTon ForProTonSKm      ForRen          GDP
#AgriEmi   0.0004473563 0.01714768 0.0007089448 0.900222038 0.6022470671
#AgriMach  0.2590309748 0.20170800 0.2305335762 0.005584703 0.4199684378
#AgriValAd 0.5152352071 0.80983446 0.4368256400 0.208975126 0.0003534226
#                   GEF GroAgriProVal PermaCrop  RoadDens   RoadTot  RurPopGro
#AgriEmi   0.0008580856    0.01078593 0.6863110 0.6398030 0.6398030 0.40734903
#AgriMach  0.1315182244    0.14074612 0.2530378 0.3064186 0.3064186 0.33705434
#AgriValAd 0.7520803684    0.31556633 0.1151395 0.4374599 0.4374599 0.04837586
#          RurPopPerc    TerrPA         Trac      Vehi WaterWith
#AgriEmi    0.4835676 0.4504239 2.279566e-01 0.6398030 0.3056195
#AgriMach   0.6401556 0.1707857 4.730759e-33 0.3064186 0.9502553
#AgriValAd  0.2383507 0.0223124 1.513169e-02 0.1251843 0.3307148


#or
res3 <- xtabs(value~Var1+Var2, data=dat2) #here the diagonals are "0"s
attr(res3, "class") <- NULL
attr(res3, "call") <- NULL
names(dimnames(res3)) <- NULL

#You can change it in the first solution also.
res2New <- dcast(dat2, Var1~Var2, value.var="value", fill=0)
row.names(res2New) <- res2New[,1]
res2New <- as.matrix(res2New[,-1])
identical(res2New, res3)
#[1] TRUE
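
The printed rows above look symmetric (for example, AgriEmi vs AgriMach prints as about 0.234 in both directions). That is expected: with a single predictor, the t-test on the slope is equivalent to the test of the Pearson correlation, so swapping response and predictor leaves the p-value unchanged. A small check on simulated data, where the vectors x and y are invented purely for illustration:

```r
set.seed(42)
x <- rnorm(21)
y <- 0.5 * x + rnorm(21)

p_yx  <- summary(lm(y ~ x))$coefficients[2, 4]  # slope p-value, y regressed on x
p_xy  <- summary(lm(x ~ y))$coefficients[2, 4]  # slope p-value, x regressed on y
p_cor <- cor.test(x, y)$p.value                 # Pearson correlation test

# Both comparisons should agree up to floating-point tolerance
all.equal(p_yx, p_xy)
all.equal(p_yx, p_cor)
```

If only these slope p-values are needed, computing one triangle of the matrix and mirroring it would halve the number of models to fit.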

A.K.




Arun,

That does exactly what I wanted to do, but how would I
reshape the result into a matrix where the independent variable is on the x and
the dependent on y (or vice versa), rather than a 736 x 2 matrix?



     V1   V2   V3   V4   V5  ...  Vn
V1   -
V2        -
V3             -
V4                  -
V5                       -
Vn                                -
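
One base-R way to get the square layout sketched above is to index a pre-allocated matrix with a two-column character matrix of (row name, column name) pairs. The sketch below uses a small random data frame as a stand-in; pair_df, pvals and pmat are illustrative names, and putting the response in the rows is an arbitrary choice.

```r
# Stand-in data: 10 observations of 5 variables named a..e (random values)
set.seed(28)
dat1 <- setNames(as.data.frame(matrix(rnorm(10 * 5), ncol = 5)), letters[1:5])
vars <- names(dat1)

# All ordered pairs of distinct variables: Var1 = response, Var2 = predictor
pair_df <- expand.grid(Var1 = vars, Var2 = vars, stringsAsFactors = FALSE)
pair_df <- pair_df[pair_df$Var1 != pair_df$Var2, ]

# Slope p-value for each pair
pvals <- mapply(function(resp, pred) {
  summary(lm(dat1[[resp]] ~ dat1[[pred]]))$coefficients[2, 4]
}, pair_df$Var1, pair_df$Var2)

# Fill the matrix by (row name, column name); the diagonal stays NA
pmat <- matrix(NA_real_, length(vars), length(vars), dimnames = list(vars, vars))
pmat[cbind(pair_df$Var1, pair_df$Var2)] <- pvals
pmat
```

Transposing with t(pmat) gives the other orientation (predictor in rows, response in columns).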


