Dimitri Liakhovitski
2013-Feb-12 21:33 UTC
[R] grabbing from elements of a list without a loop
Hello!

# I have a list with several data frames:
mylist<-list(data.frame(a=1:2,b=2:3),
             data.frame(a=3:4,b=5:6),data.frame(a=7:8,b=9:10))
(mylist)

# I want to grab only one specific column from each list element
neededcolumns<-c(1,2,0)  # number of the column I need from each element of the list

# Below, I am doing it using a loop:
newlist<-NULL
for(i in 1:length(mylist) ) {
  newlist[[i]]<-mylist[[i]] [neededcolumns[i]]
}
newlist<-do.call(cbind,newlist)
(newlist)

I was wondering if there is any way to avoid the loop above and make it faster.
In reality, I have a much longer list, each of my data frames is much larger, and I have to do it MANY-MANY times.
Thanks a lot!

Dimitri Liakhovitski
gfk.com <http://marketfusionanalytics.com/>
Hi,

mapply(`[`,mylist,list(1,2,0),SIMPLIFY=FALSE)
#[[1]]
#  a
#1 1
#2 2
#
#[[2]]
#  b
#1 5
#2 6
#
#[[3]]
#data frame with 0 columns and 2 rows

A.K.
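A minimal sketch of how the mapply() result in the reply above could be bound back into a single data frame, as the loop in the question does. The variable names picked and combined are made up for illustration, and dropping the empty pieces explicitly before cbind() is an added precaution, not something the reply does. Map() is base R's wrapper for mapply(..., SIMPLIFY = FALSE).

```r
mylist <- list(data.frame(a = 1:2, b = 2:3),
               data.frame(a = 3:4, b = 5:6),
               data.frame(a = 7:8, b = 9:10))
neededcolumns <- c(1, 2, 0)

# Map(f, x, y) calls f(x[[k]], y[[k]]) for each k and always returns a list.
# Here f is `[`, so each call is mylist[[k]][neededcolumns[k]].
picked <- Map(`[`, mylist, neededcolumns)

# An index of 0 gives a data frame with no columns; drop those pieces
# explicitly (an assumption about what "0" should mean) before binding.
picked <- Filter(function(d) ncol(d) > 0, picked)

# Bind the remaining one-column data frames side by side.
combined <- do.call(cbind, picked)
```

This still visits every list element once, so it is a tidier way to write the loop rather than a way to skip it.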
The answer is essentially no: a loop is required (as best I can see), although it can take the form of an apply-type call instead. Also, your code will fail with a 0 index. Something like this should work:

newlist <- lapply(1:3,function(i)if(!neededcolumns[i])NULL else mylist[[c(i,neededcolumns[i])]])

## note the use of [[c(i,j)]] form for selecting columns as an element from a list of lists

## Note that your cbind call produces a matrix, not a list.

-- Bert

You might wish to check the parallel package, as this looks like the sort of thing parallelization could be profitably used for; but I have no experience to offer beyond that suggestion.

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
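The lapply() call in Bert's reply can be spelled out as below. This is only a sketch: seq_along() replaces the hard-coded 1:3, and the step that drops the NULL placeholders before cbind() is an assumption about what the final result should look like. The names cols and result are illustrative.

```r
mylist <- list(data.frame(a = 1:2, b = 2:3),
               data.frame(a = 3:4, b = 5:6),
               data.frame(a = 7:8, b = 9:10))
neededcolumns <- c(1, 2, 0)

# mylist[[c(i, j)]] is recursive indexing: it is the same as
# mylist[[i]][[j]], i.e. column j of data frame i as a plain vector.
stopifnot(identical(mylist[[c(2, 2)]], mylist[[2]][[2]]))

# One element per data frame; NULL where no column is wanted (index 0).
cols <- lapply(seq_along(mylist), function(i) {
  j <- neededcolumns[i]
  if (j == 0) NULL else mylist[[c(i, j)]]
})

# Drop the NULL placeholders, then bind the remaining vectors column-wise.
cols <- Filter(Negate(is.null), cols)
result <- do.call(cbind, cols)
```

Because the extracted pieces are plain vectors here, cbind() returns a matrix; wrapping the list in data.frame() instead would keep a data frame.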
One could try using unlist(mylist, recursive=FALSE) to make a list of all the columns of the data.frames in mylist and subscripting from that. E.g., here is some barely tested code:

> nCols <- vapply(mylist, ncol, 0L)
> neededcolumns<-c(1,2,0) # I assume 0 means no column wanted from 3rd df in list
> i <- neededcolumns + c(0, cumsum(nCols[-length(nCols)]))
> i <- i[neededcolumns>0]
> data.frame(unlist(mylist, recursive=FALSE)[i])
  a b
1 1 5
2 2 6

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
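For repeated use, the steps in Bill's reply could be wrapped in a small function. The sketch below assumes, as the reply does, that 0 means "no column from this data frame"; the function name grab_columns and its argument names are hypothetical.

```r
# Hypothetical helper; the body follows the steps in the reply above.
grab_columns <- function(dfs, needed) {
  n_cols <- vapply(dfs, ncol, 0L)
  # Number of columns that precede each data frame once all columns
  # are flattened into a single list.
  offsets <- c(0L, cumsum(n_cols[-length(n_cols)]))
  # Position of each wanted column in the flattened list, skipping zeros.
  idx <- (needed + offsets)[needed > 0]
  all_cols <- unlist(dfs, recursive = FALSE)
  data.frame(all_cols[idx])
}

mylist <- list(data.frame(a = 1:2, b = 2:3),
               data.frame(a = 3:4, b = 5:6),
               data.frame(a = 7:8, b = 9:10))
res <- grab_columns(mylist, c(1, 2, 0))
```

The index arithmetic is vectorized, so the only per-element work left is the ncol() call inside vapply() and the flattening done by unlist(). Whether that is faster than the loop on large inputs would need to be timed on real data.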