Dimitri Liakhovitski
2013-Feb-12 21:33 UTC
[R] grabbing from elements of a list without a loop
Hello!
# I have a list with several data frames:
mylist<-list(data.frame(a=1:2,b=2:3),
data.frame(a=3:4,b=5:6),data.frame(a=7:8,b=9:10))
(mylist)
# I want to grab only one specific column from each list element
neededcolumns<-c(1,2,0) # number of the column I need from each element of the list
# Below, I am doing it using a loop:
newlist<-NULL
for(i in 1:length(mylist) ) {
newlist[[i]]<-mylist[[i]] [neededcolumns[i]]
}
newlist<-do.call(cbind,newlist)
(newlist)
I was wondering if there is any way to avoid the loop above and make it
faster.
In reality, I have a much longer list, each of my data frames is much
larger and I have to do it MANY-MANY times.
Thanks a lot!
Dimitri Liakhovitski
gfk.com <http://marketfusionanalytics.com/>
Hi,
mapply(`[`,mylist,list(1,2,0),SIMPLIFY=FALSE)
#[[1]]
#  a
#1 1
#2 2
#[[2]]
#  b
#1 5
#2 6
#[[3]]
#data frame with 0 columns and 2 rows
A.K.
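To get from the mapply result to the single bound object that the original loop produced, something like the following minimal sketch may help. It uses Map, which is mapply with SIMPLIFY=FALSE, and drops the entries whose requested column is 0 before binding. The name `picked` and the reading of 0 as "no column wanted" are assumptions made here for illustration, not part of the reply above.

```r
mylist <- list(data.frame(a = 1:2, b = 2:3),
               data.frame(a = 3:4, b = 5:6),
               data.frame(a = 7:8, b = 9:10))
neededcolumns <- c(1, 2, 0)

# One single-column data frame per list element (zero columns where the index is 0)
picked <- Map(`[`, mylist, neededcolumns)

# Assumption: 0 means "skip this element", so drop those before binding
newdf <- do.call(cbind, picked[neededcolumns > 0])
print(newdf)
```

This still iterates over the list internally, so it is a tidier spelling of the loop rather than a guaranteed speed-up.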
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
The answer is essentially no, a loop is required (as best I can see),
although it can be in the form of an apply type call instead. Also,
your code will fail with a 0 index. Something like this should work:

newlist <- lapply(1:3, function(i) if(!neededcolumns[i]) NULL else
    mylist[[c(i, neededcolumns[i])]])

## note the use of [[c(i,j)]] form for selecting columns as an
## element from a list of lists

## Note that your cbind call produces a matrix, not a list.

-- Bert

You might wish to check the parallel package, as this looks like the
sort of thing parallelization could be profitably used for; but I
have no experience to offer beyond that suggestion.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
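As a rough illustration of the parallel suggestion, here is a minimal sketch using clusterMap from the parallel package. The helper name `pick_column` and the worker count of 2 are assumptions made for this example. Whether it is faster than a plain lapply depends on the data: for cheap column extraction, the cost of shipping data frames to the workers may well outweigh the gain.

```r
library(parallel)

mylist <- list(data.frame(a = 1:2, b = 2:3),
               data.frame(a = 3:4, b = 5:6),
               data.frame(a = 7:8, b = 9:10))
neededcolumns <- c(1, 2, 0)

# Hypothetical helper: return column j of df as a vector, or NULL when j is 0
pick_column <- function(df, j) if (j == 0) NULL else df[[j]]

cl <- makeCluster(2)  # assumed worker count; adjust to the machine
# clusterMap pairs each data frame with its column index, like mapply
picked <- clusterMap(cl, pick_column, mylist, neededcolumns)
stopCluster(cl)

str(picked)
```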
One could try using unlist(mylist, recursive=FALSE) to make a list of all the
columns of the data.frames in mylist and subscripting from that. E.g., here is
some barely tested code:
  > nCols <- vapply(mylist, ncol, 0L)
  > neededcolumns<-c(1,2,0) # I assume 0 means no column wanted from 3rd df in list
  > i <- neededcolumns + c(0, cumsum(nCols[-length(nCols)]))
  > i <- i[neededcolumns>0]
  > data.frame(unlist(mylist, recursive=FALSE)[i])
    a b
  1 1 5
  2 2 6
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
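The flatten-and-subscript idea can be wrapped in a small function for repeated use. The sketch below is only one way to package it: the function name `pick_columns` and the input checks are assumptions added here, and it keeps the convention that 0 means no column is wanted from that data frame.

```r
# Hypothetical helper wrapping the unlist(recursive = FALSE) approach
pick_columns <- function(dfs, cols) {
  stopifnot(length(dfs) == length(cols))
  n_cols <- vapply(dfs, ncol, 0L)
  stopifnot(all(cols >= 0), all(cols <= n_cols))
  # Position just before each data frame's first column in the flattened list
  offsets <- c(0L, cumsum(n_cols)[-length(n_cols)])
  # Global index of each wanted column; entries with cols == 0 are dropped
  idx <- (cols + offsets)[cols > 0]
  data.frame(unlist(dfs, recursive = FALSE)[idx])
}

mylist <- list(data.frame(a = 1:2, b = 2:3),
               data.frame(a = 3:4, b = 5:6),
               data.frame(a = 7:8, b = 9:10))
result <- pick_columns(mylist, c(1, 2, 0))
print(result)
```

With three two-column data frames the offsets are 0, 2 and 4, so the request c(1, 2, 0) maps to positions 1 and 4 of the flattened list. All subscripting happens in one vectorised step, although unlist itself still has to walk the whole list.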