Hi,

    n = 10000000
    L = list(a=integer(n), b=integer(n))

L[[2]][1:10] gives me the first 10 items of the 2nd vector in the list L. It works fine. However, it appears to copy the entire L[[2]] vector in memory first, before subsetting it. I understand that "[[" can't know that the only thing to be done with its result is [1:10], in which case a copy in memory of the entire vector L[[2]] would not be required; only a new vector of length 10 need be created. I see why [[ needs to make a copy in general.

L[[c(2,1)]] gives me the 1st item of the 2nd vector in the list L. It works fine, and does not appear to copy L[[2]] in memory first. It's much faster as n grows large.

But I need more than 1 element of the vector: L[[c(2,1:10)]] fails with "Error: recursive indexing failed at level 2".

Is there a way I can obtain the first 10 items of L[[2]] without a memory copy of L[[2]]?

Thanks!
Matthew

R 2.1.1
On Tue, 23 May 2006, Matthew Dowle wrote:

> But I need more than 1 element of the vector .... L[[c(2,1:10)]] fails
> with "Error: recursive indexing failed at level 2"

Note that [[ ]] is documented to only ever return one element, so this is invalid.

> Is there a way I can obtain the first 10 items of L[[2]] without a memory
> copy of L[[2]] ?

Use .Call

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
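The point about [[ ]] returning a single element can be illustrated with a small sketch. The short list below is a stand-in for the large one in the thread, and the exact error text may differ between R versions, so the sketch only checks that an error occurs.

```r
# Small list standing in for the large one in the thread (illustrative values).
L <- list(a = 1:5, b = 11:15)

# A vector index given to [[ is applied recursively, one level per element.
x <- L[[c(2, 1)]]   # descends to element 2, then takes its 1st item
y <- L[[2]][[1]]    # the same thing written as two steps

# A longer index keeps descending rather than selecting several items,
# so it cannot return ten elements. The message depends on the R version.
res <- tryCatch(L[[c(2, 1:10)]], error = function(e) e)

print(identical(x, y))
print(inherits(res, "error"))
```

In other words, L[[c(2,1:10)]] is read as eleven nested single-element extractions, not as "element 2, items 1 to 10".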
Thanks.

I looked some more and found that L$b[1:10] doesn't seem to copy L$b. If that's correct, why does L[[2]][1:10] copy L[[2]]?

> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: 23 May 2006 16:23
> To: Matthew Dowle
> Cc: 'r-help at stat.math.ethz.ch'
> Subject: Re: [R] Avoiding a memory copy by [[
>
> > Is there a way I can obtain the first 10 items of L[[2]] without a
> > memory copy of L[[2]] ?
>
> Use .Call
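One rough way to compare the two forms is to time them, on the assumption that a full copy of the vector would show up as extra elapsed time. This is only a sketch: the vector length is reduced from the thread's value so it runs quickly, the repetition count is arbitrary, and the result depends heavily on the R version in use.

```r
n <- 1e6   # the thread uses 10000000; smaller here so the sketch runs quickly
L <- list(a = integer(n), b = integer(n))

reps <- 200   # arbitrary repetition count, chosen only to make timings visible

# Extract the first 10 items of the 2nd element by position and by name.
t_bracket <- system.time(for (i in seq_len(reps)) via_bracket <- L[[2]][1:10])
t_dollar  <- system.time(for (i in seq_len(reps)) via_dollar  <- L$b[1:10])

# Elapsed seconds for each form; similar values suggest no extra copy is made.
print(c(bracket = t_bracket[["elapsed"]], dollar = t_dollar[["elapsed"]]))
```

Both forms return the same ten values; only the time taken is of interest here.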
On 5/23/06, Matthew Dowle <mdowle at concordiafunds.com> wrote:

> Is there a way I can obtain the first 10 items of L[[2]] without a memory
> copy of L[[2]] ?

I think environments will help you out here:

    n <- 10000000
    env <- new.env()
    env$a <- integer(n)
    env$b <- integer(n)
    env$a[1:10]

/Henrik

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
That development sounds excellent. I'm happy to help test it, just let me know. Until 2.4.0, then, I'll do something like the following, because I need to deal with integer locations in the list rather than names:

    eval(parse(text=paste("L$'",names(L)[2],"'[1:10]",sep="")))

This works well, but if there is an easier way until 2.4.0, please let me know. Thank you and Henrik for your replies.

> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: 23 May 2006 17:47
> To: Henrik Bengtsson
> Cc: Matthew Dowle; r-help at stat.math.ethz.ch
> Subject: Re: [R] Avoiding a memory copy by [[
>
> On Tue, 23 May 2006, Henrik Bengtsson wrote:
>
> > On 5/23/06, Matthew Dowle <mdowle at concordiafunds.com> wrote:
> >>
> >> Thanks.
> >>
> >> I looked some more and found that L$b[1:10] doesn't seem to copy L$b.
> >> If that's correct why does L[[2]][1:10] copy L[[2]] ?
> >
> > I forgot, this is probably what I was told in a discussion about
> > UseMethod("$") the other day: The "$" operator is very special. Its
> > second argument (the one after the operator) is not evaluated. For
> > "[[" it is. This is probably also why the solution with environments
> > works. I think someone with more knowledge about the R core has to
> > give you the details on this, and especially why "$" is special in the
> > first place (maybe because of the example you're giving).
> >
> > /Henrik
>
> That's not the reason here: the internal code for [[ duplicates for vector
> lists but not pairlists. That could be replaced by a NAMED optimization,
> although we would not do that until 2.4.0 (for which Thomas Lumley has
> written profiling code for memory use and duplication).
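The same L$name[1:10] expression can also be assembled as a call object instead of pasted text, which sidesteps quoting problems with unusual names. This is a sketch under assumptions: the small list and the variable names pos, expr and first10 are illustrative, and the thread does not confirm whether this form avoids the copy in any particular R version.

```r
# Small list standing in for the large one in the thread (illustrative values).
L <- list(a = 1:20, b = 101:120)
pos <- 2L   # integer location of the wanted element

# Build the call L$b[1:10] from the element's position, without parsing text.
# as.name() turns the element name into a symbol, as `$` expects.
expr <- call("[", call("$", quote(L), as.name(names(L)[pos])), 1:10)

# Evaluate the constructed call where L is visible.
first10 <- eval(expr)
print(first10)
```

Printing expr shows the constructed expression, which can be compared with the text produced by the paste() version.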