On 05/01/2013 11:20 AM, David Kulp wrote:
> I'm using refClass for a complex multi-directional tree structure with
> possibly 100,000s of nodes. The refClass design is very impressive and I'd
> love to use it, but I've found that refClass instances are very large and
> slow to create. For example, below are a RefClass and a normal S4 class.
> The RefClass requires about 4KB per instance vs 500B for the S4 class,
> based on adding the Ncells and Vcells of used memory reported by gc().
> And instantiation is more than twice as slow for a RefClass. (R 2.14.2)
>
> Anyone have thoughts on this and whether there's any hope for improving
> resources on either front?

Hi David -- not necessarily helpful, but in R creating a few large objects is
generally much cheaper than creating many small ones, so perhaps
re-conceptualize your data structure? As a rough analogy, instead of
constructing a graph as a large number of 'Node' instances each pointing to one
another, a graph could be represented as a data.frame containing columns of
'from' and 'to' indexes (an edge list; a few large objects) or as an adjacency
matrix. One would also implement creation and update of the few large objects
in an R-friendly (vectorized) way.
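
A minimal sketch of the edge-list idea, using only base R. The tree shape
(node i has parent i %/% 2), the 'nodes' table, and the helpers children_of()
and parent_of() are made up for illustration; they are not from any package.

```r
# Hypothetical tree with n nodes: node 1 is the root and node i (i >= 2)
# has parent i %/% 2. The whole structure is two integer columns.
n <- 100000L
edges <- data.frame(from = (2:n) %/% 2L, to = 2:n)

# Node attributes live in parallel columns, one row per node, instead of
# one object per node (fields 'a' and 'b' mirror the classes in the question).
nodes <- data.frame(id = seq_len(n), a = "foo", b = as.numeric(seq_len(n)),
                    stringsAsFactors = FALSE)

# Vectorized lookups: both helpers accept a vector of node ids.
children_of <- function(edges, id) edges$to[edges$from %in% id]
parent_of   <- function(edges, id) edges$from[match(id, edges$to)]

children_of(edges, 1L)        # 2 3
parent_of(edges, c(4L, 5L))   # 2 2

# Vectorized update: set 'b' to 0 for all children of the root in one step.
nodes$b[children_of(edges, 1L)] <- 0

# Optional one-time grouping for repeated child lookups by parent id.
kids <- split(edges$to, edges$from)
kids[["1"]]                   # 2 3

# Total size of the two large objects, for comparison with gc() figures.
object.size(edges) + object.size(nodes)
```

Creating 'edges' and 'nodes' this way is a handful of vector allocations
rather than 100,000 calls to a constructor, which is where most of the time
and memory in the per-node design goes.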

Perhaps there are existing packages that already model the data you're
interested in? If your multi-directional tree can be represented as a graph,
then perhaps the graph package

  http://bioconductor.org/packages/release/bioc/html/graph.html

including facilities in the Boost graph library (RBGL, on the Bioconductor web
site, too) or the igraph package can be put to use.

Martin
>
> I wonder what others are doing. I've been thinking about lightweight
> alternative implementations, but nothing particularly elegant has come to
> mind, yet!
>
> Thanks!
>
>
> simple <- setRefClass('simple',
>     fields = list(a = "character", b = "numeric"))
> gc()
> system.time(simple.list <- lapply(1:100000, function(i) {
>     simple$new(a = 'foo', b = i)
> }))
> gc()
>
> setClass('simple2', representation(a = "character", b = "numeric"))
> setMethod("initialize", "simple2", function(.Object, a, b) {
>     .Object@a <- a
>     .Object@b <- b
>     .Object
> })
>
> gc()
> system.time(simple2.list <- lapply(1:100000, function(i) {
>     new('simple2', a = 'foo', b = i)
> }))
> gc()
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793