On Thu, 29 Jun 2006, Manoj wrote:
> Hello All,
> I am trying to work on writing the following piece of (pseudo)
> code in an optimal fashion:
>
> ----------------------------------------------------
> # Two data frames with some data
>
> a = data.frame(somedata)
> b = data.frame(somedata)
>
> for(i in 1:nrow(dt)) {
> # Merge dates for a given date into a new data frame
> c = merge(a[a$dt==dt[i],],b[b$dt == dt[i],], by=c(some column));
> }
Note that only the result of the last iteration of that loop is kept, since
each pass rebinds 'c' to a new merge() result.
What are you really trying to do, and why are you worrying about memory?
E.g. merge() in R-devel is a lot more efficient for some operations,
including perhaps your example.
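If the aim is one merged data frame covering all the dates, a single call
to merge() that also matches on the date column may be all that is needed.
A minimal sketch, assuming both data frames have a 'dt' column and share a
key column that I have called 'id' (the column names and toy data are my
assumptions, not taken from your post):

```r
# Toy data: 'dt' is the date column, 'id' a hypothetical shared key.
a <- data.frame(dt = as.Date(c("2006-06-01", "2006-06-01", "2006-06-02")),
                id = c(1, 2, 1),
                x  = c(10, 20, 30))
b <- data.frame(dt = as.Date(c("2006-06-01", "2006-06-02", "2006-06-02")),
                id = c(1, 1, 2),
                y  = c(0.1, 0.2, 0.3))

# Putting 'dt' in 'by' matches rows within each date, so no loop is needed.
merged <- merge(a, b, by = c("dt", "id"))
```

Whether this matches what you want depends on what the loop was meant to
produce, which is why I ask.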
> ----------------------------------------------------
>
>
> Now, my understanding is that the data frame c in the above code is
> malloc'ed in every count of the loop. Is that assumption correct?
No. Here 'c' is just a symbol, and assignment (please use <- in public
code, it is easier to read) binds the symbol to the data frame returned by
merge(). So the allocation (not 'malloc' necessarily) is going on inside
merge(). Also, 'c' is a system object, so you are confusing people by
using its name for your own object.
When you assign to 'c' you change the binding to a different already
allocated object. Eventually garbage collection will recover (to R) the
memory allocated to objects which are no longer bound to symbols.
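A small illustration of rebinding, using the names 'x' and 'y' (my own
choice, to avoid masking 'c'). This is only a sketch of the visible
behaviour, not an account of the internals:

```r
x <- data.frame(v = 1:3)   # 'x' is bound to a newly allocated data frame
y <- x                     # 'y' is bound to the same object; no copy is made here
x <- data.frame(v = 4:6)   # 'x' is rebound to another object; the first one
                           # survives because 'y' still refers to it
rm(y)                      # now no symbol refers to the first data frame
invisible(gc())            # a collection can reclaim that memory for R to reuse
```

Calling gc() by hand is for illustration only; R runs the collector itself
when it needs to.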
I am not aware of any account which describes in detail how R works at
this level, and end users do not need to know it. (It is also the case
that R maintains a number of illusions and internally may not do what it
appears to do.)
>
> Is the following attempt a better way of doing things?
>
> ----------------------------------------------------
> a = data.frame(somedata)
> b = data.frame(somedata)
>
> # Pre-allocate data frame c
>
> c = data.frame(for some size);
>
> for(i in 1:nrow(dt)) {
> # Merge dates for a given date into a new data frame
> # and copy the result into c
>
> copy(c, merge(a[a$dt==dt[i],],b[b$dt == dt[i],], by=c(some column)));
>
> }
> ----------------------------------------------------
>
> Now the question is: how can I copy the merged data into my
> pre-allocated data frame c? I tried rbind/cbind, but they are pretty
> fussy about having the right names and dimensions, so it fails.
>
> Any help would be greatly appreciated!
>
> Thanks.
>
> Manoj
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
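If you do need a separate merge per date, one common pattern is to store
each result in a list and combine the pieces once at the end, rather than
copying into a pre-allocated data frame. A minimal sketch under my own
assumptions: the names 'dates', 'pieces', 'result' and the key column 'id'
are hypothetical, and the toy data stand in for yours:

```r
a <- data.frame(dt = as.Date(c("2006-06-01", "2006-06-01", "2006-06-02")),
                id = c(1, 2, 1),
                x  = c(10, 20, 30))
b <- data.frame(dt = as.Date(c("2006-06-01", "2006-06-02", "2006-06-02")),
                id = c(1, 1, 2),
                y  = c(0.1, 0.2, 0.3))
dates <- sort(unique(c(a$dt, b$dt)))

# One list slot per date; each slot holds that date's merged rows.
pieces <- vector("list", length(dates))
for (i in seq_along(dates)) {
  pieces[[i]] <- merge(a[a$dt == dates[i], ], b[b$dt == dates[i], ],
                       by = c("dt", "id"))
}

# Every piece has the same columns, so rbind() can stack them in one call.
result <- do.call(rbind, pieces)
```

Because every piece comes from the same merge() call, the column names
agree and rbind() has nothing to complain about.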
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595