Hi,

Maybe this helps:

set.seed(28)
dat1 <- setNames(as.data.frame(matrix(sample(1:40, 10*5, replace=TRUE), ncol=5)), letters[1:5])
indx <- as.data.frame(combn(names(dat1), 2), stringsAsFactors=FALSE)
res <- t(sapply(indx, function(x) {
  x1 <- cbind(dat1[x[1]], dat1[x[2]])
  summary(lm(x1[,1] ~ x1[,2]))$coef[,4]
}))
rownames(res) <- apply(indx, 2, paste, collapse="_")
colnames(res)[2] <- "Coef1"
head(res, 3)
#    (Intercept)     Coef1
#a_b  0.39862676 0.8365606
#a_c  0.02427885 0.6094141
#a_d  0.37521423 0.7578723

#permutation
indx2 <- expand.grid(names(dat1), names(dat1), stringsAsFactors=FALSE)
#or
indx2 <- expand.grid(rep(list(names(dat1)), 2), stringsAsFactors=FALSE)
indx2New <- indx2[indx2[,1] != indx2[,2], ]
res2 <- t(sapply(seq_len(nrow(indx2New)), function(i) {
  x1 <- indx2New[i, ]
  x2 <- cbind(dat1[x1[,1]], dat1[x1[,2]])
  summary(lm(x2[,1] ~ x2[,2]))$coef[,4]
}))
row.names(res2) <- apply(indx2New, 1, paste, collapse="_")
colnames(res2) <- colnames(res)

A.K.


Hi everyone,

First off, I'd just like to say thanks for everyone's contributions. Up until now, I've never had to post as I've always found the answers from trawling through the database. I've finally managed to stump myself, and although for someone out there I'm sure the answer to my problem is fairly simple, I have spent the whole day in front of my computer struggling. I know I'll probably get an absolute ribbing for making a basic mistake, or not understanding something fully, but I'm blind to the mistake now after looking at it for so long.

What I'm looking to do is formulate a matrix ([28,28]) of p-values produced from running linear regressions of 28 variables against themselves (e.g. a~b, a~c, a~d ... b~a, b~c etc.), if that makes sense. I've managed to get this to work if I just input each variable by hand, but this isn't going to help when I have to make 20 matrices.

My script is as follows:

for (j in [1:28])
{
  ##This section works perfectly if I don't try to loop it. I know this won't work at the moment, because I haven't designated what j is, but I'm showing it to highlight what I'm attempting to do.

  models <- lapply(varlist, function(x) {
    lm(substitute(ANS ~ i, list(i = as.name(x))), data = con.i)
  })

  abc <- lapply(models, function(f) summary(f)$coefficients[,4])

  abc <- do.call(rbind, abc)
}

I get the following error when I try to loop it:

Error in model.frame.default(formula = substitute(j ~ i, list(i = as.name(x))),  :
  variable lengths differ (found for 'ANS')

## ANS being my first variable

All variables are of the same length, with 21 recordings for each.

If anyone can suggest a method of looping, or another means of producing 'models' for each of my 28 variables without having to do it by hand, that would be fantastic.

Thanks in advance!!
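
A note on the error quoted above: the reported formula is `j ~ i`, so the likely cause is that `j` is still the loop counter (a single number) rather than a column of `con.i`, and `lm()` then compares a length-1 object against a 21-row column. One way around this is to build each formula from the variable names. The sketch below uses `reformulate()` from base R; the data frame is a made-up stand-in (21 rows, four columns borrowed from the names in this thread), not the poster's actual data.

```r
# Hypothetical stand-in for the poster's data: 21 rows, 4 numeric columns.
set.seed(1)
con.i <- as.data.frame(matrix(rnorm(21 * 4), ncol = 4,
                              dimnames = list(NULL, c("ANS", "AIL", "GDP", "GEF"))))
varlist <- names(con.i)

# For each response variable, regress it on every other variable in turn
# and keep the p-value of the slope (row 2, column 4 of the coefficient table).
pvals <- lapply(varlist, function(resp) {
  preds <- setdiff(varlist, resp)
  sapply(preds, function(pred) {
    fit <- lm(reformulate(pred, response = resp), data = con.i)
    summary(fit)$coefficients[2, 4]
  })
})
names(pvals) <- varlist

# One named vector of slope p-values per response variable.
pvals[["ANS"]]
```

Because the formula is assembled from character strings, no explicit `for` loop over `j` is needed, and the same code should work unchanged for 28 columns, provided the column names are syntactically valid R names.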
Hello Arun.

Can you provide some data? To help you better I will need a complete reproducible example, OK?

On Thu, Sep 5, 2013 at 1:49 PM, arun <smartpink111@yahoo.com> wrote:
> HI,
> May be this helps:
> [...]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi,

Using the example dataset (Test_data.csv):

dat1 <- read.csv("Test_data.csv", header=TRUE, sep="\t", row.names=1)
indx2 <- expand.grid(names(dat1), names(dat1), stringsAsFactors=FALSE)
indx2New <- indx2[indx2[,1] != indx2[,2], ]
res2 <- t(sapply(seq_len(nrow(indx2New)), function(i) {
  x1 <- indx2New[i, ]
  x2 <- cbind(dat1[x1[,1]], dat1[x1[,2]])
  summary(lm(x2[,1] ~ x2[,2]))$coef[,4]
}))
dat2 <- cbind(indx2New, value=res2[,2])
library(reshape2)
res2New <- dcast(dat2, Var1 ~ Var2, value.var="value")
row.names(res2New) <- res2New[,1]
res2New <- as.matrix(res2New[,-1])
dim(res2New)
#[1] 28 28
head(res2New, 3)
#            AgriEmi   AgriMach  AgriValAd     AgrVaGDP       AIL     ALAre
#AgriEmi          NA 0.23401895 0.45697412 4.644877e-01 0.6398030 0.4039855
#AgriMach  0.2340189         NA 0.01449519 4.922558e-06 0.3890046 0.9279044
#AgriValAd 0.4569741 0.01449519         NA 5.135269e-02 0.5325943 0.4872555
#              ALPer          ANS     AraLa  AraLaPer    CombusRen      ForArea
#AgriEmi   0.4039855 2.507257e-01 0.2303275 0.2303275 0.9438409125 0.0004473563
#AgriMach  0.9279044 6.072123e-05 0.3154370 0.3154370 0.0040254771 0.2590309747
#AgriValAd 0.4872555 2.060412e-01 0.8449600 0.8449600 0.0008077264 0.5152352072
#             ForArePer  ForProTon ForProTonSKm      ForRen          GDP
#AgriEmi   0.0004473563 0.01714768 0.0007089448 0.900222038 0.6022470671
#AgriMach  0.2590309748 0.20170800 0.2305335762 0.005584703 0.4199684378
#AgriValAd 0.5152352071 0.80983446 0.4368256400 0.208975126 0.0003534226
#                   GEF GroAgriProVal PermaCrop  RoadDens   RoadTot  RurPopGro
#AgriEmi   0.0008580856    0.01078593 0.6863110 0.6398030 0.6398030 0.40734903
#AgriMach  0.1315182244    0.14074612 0.2530378 0.3064186 0.3064186 0.33705434
#AgriValAd 0.7520803684    0.31556633 0.1151395 0.4374599 0.4374599 0.04837586
#          RurPopPerc    TerrPA         Trac      Vehi WaterWith
#AgriEmi    0.4835676 0.4504239 2.279566e-01 0.6398030 0.3056195
#AgriMach   0.6401556 0.1707857 4.730759e-33 0.3064186 0.9502553
#AgriValAd  0.2383507 0.0223124 1.513169e-02 0.1251843 0.3307148

#or
res3 <- xtabs(value ~ Var1 + Var2, data=dat2) #here the diagonals are "0"s
attr(res3, "class") <- NULL
attr(res3, "call") <- NULL
names(dimnames(res3)) <- NULL

#You can change it in the first solution also.
res2New <- dcast(dat2, Var1 ~ Var2, value.var="value", fill=0)
row.names(res2New) <- res2New[,1]
res2New <- as.matrix(res2New[,-1])
identical(res2New, res3)
#[1] TRUE

A.K.


Arun,

That does exactly what I wanted to do, but how would I manipulate it into a matrix where the independent variable is on the x and the dependent on the y, or vice versa, rather than a 736, 2 matrix?

     V1  V2  V3  V4  V5 ... Vn
V1   -
V2       -
V3           -
V4               -
V5                   -
Vn                          -

----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc:
Sent: Thursday, September 5, 2013 12:49 PM
Subject: Re: Looping an lapply linear regression function
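
For the square layout asked about in this thread (dependent variable on one axis, independent variable on the other), the matrix can also be filled directly, without `expand.grid()` and `dcast()`. This is only a sketch on simulated data: `dat1` here is a made-up five-column data frame and `pmat` is a hypothetical name, neither taken from Test_data.csv.

```r
# Simulated stand-in data: 10 rows, 5 numeric columns named a..e.
set.seed(28)
dat1 <- setNames(as.data.frame(matrix(rnorm(10 * 5), ncol = 5)), letters[1:5])
vars <- names(dat1)

# Rows = dependent variable, columns = independent variable.
# The diagonal (a variable regressed on itself) is left as NA.
pmat <- matrix(NA_real_, nrow = length(vars), ncol = length(vars),
               dimnames = list(vars, vars))
for (y in vars) {
  for (x in setdiff(vars, y)) {
    fit <- lm(dat1[[y]] ~ dat1[[x]])
    pmat[y, x] <- summary(fit)$coefficients[2, 4]  # p-value of the slope
  }
}

# With a single predictor, the slope t-test is the same test as the
# Pearson correlation test, so y ~ x and x ~ y give the same p-value.
all.equal(pmat, t(pmat))
all.equal(pmat["a", "b"], cor.test(dat1$a, dat1$b)$p.value)
```

This also explains why the `head(res2New, 3)` output earlier in the thread looks symmetric (for example AgriEmi/AgriMach and AgriMach/AgriEmi are both about 0.234). If only the slope p-values are needed, computing one triangle of the matrix and mirroring it would halve the number of models to fit.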