John Tillinghast
2009-Nov-10 21:14 UTC
[R] standardGeneric seems slow; any way to get around it?
Hi, I'm running some routines with standard matrix operations like solve() and diag(). When I do a profile, the lead item under total time is standardGeneric(). Furthermore, solve() and diag() have much greater total time than self time. ??? I assume there is some time-consuming decision going on in the usual functions; is there any way to avoid that and go straight to the calculaions? Thanks John Tillinghast [[alternative HTML version deleted]]
Martin Morgan
2009-Nov-11 03:47 UTC
[R] standardGeneric seems slow; any way to get around it?
John Tillinghast wrote:> Hi, > I'm running some routines with standard matrix operations like solve() and > diag(). > When I do a profile, the lead item under total time is standardGeneric(). > Furthermore, solve() and diag() have much greater total time than self time. > ???Hi John -- are you interpreting the output of summaryRprof() correctly? Here's an example> f = function() { g(); h() } > g = function() { t0 = Sys.time(); while (Sys.time() - t0 < 5) {} } > h = function() { t0 = Sys.time(); while (Sys.time() - t0 < 2) {} }We'd expect all the time to be spent in 'g' and 'h'; 'f' is just a wrapper. So...> Rprof(); f(); Rprof(NULL); head(summaryRprof()$by.total, 8)total.time total.pct self.time self.pct "f" 6.98 100.0 0.00 0.0 "g" 4.98 71.3 0.06 0.9 "-" 4.36 62.5 0.18 2.6 "-.POSIXt" 4.24 60.7 0.06 0.9 "difftime" 4.00 57.3 0.22 3.2 "structure" 2.74 39.3 1.16 16.6 "match.arg" 2.04 29.2 0.20 2.9 "h" 2.00 28.7 0.02 0.3 This says 100% of the total time (about 7s) is spent in 'f' or functions called by f, and 0% (well, I guess that's an estimate) is actually spent in 'f' itself. Likewise, the total time in g or the functions it calls is about 5s, and h or the functions it calls about 2s. In all of these cases, little time is spent exactly _in_ the function, so self.time is small. Note also -, -.POSIX, difftime, structure etc; these functions are common to both g and h, and the time spent in them reflects the contributions from both g and h. So the small self.time of solve and diag just say that the code is moving quickly through these, and the large total time for standardGeneric likely means that it's at the top of your call stack, or that it's called from several different places in your code. Here's another example, after library(Matrix); example(solve)> Rprof(); for (i in 1:1000) solve(h8); Rprof(NULL);head(summaryRprof()$by.tot) total.time total.pct self.time self.pct "solve" 0.22 100.0 0.04 18.2 "standardGeneric" 0.22 100.0 0.04 18.2 "solve.default" 0.14 63.6 0.00 0.0 "diag" 0.04 18.2 0.02 9.1 "*" 0.02 9.1 0.02 9.1 ".Call" 0.02 9.1 0.02 9.1 'solve' and 'standardGeneric' are at the top of the call stack, so get ~100% of the total time. In this particular case they also account for a substantial portion of the self.time, and it (might) be worth asking whether there's a different algorithm that would reduce the number of calls to solve (e.g., by not iterating over 1:1000 ;). If you were being aggressive and really were calling solve repeatedly on the same type of object, you might> Rprof(); f = selectMethod(solve, class(h8)); for (i in 1:1000) f(h8);+ Rprof(NULL); head(summaryRprof()$by.total) total.time total.pct self.time self.pct "f" 0.18 100.0 0.04 22.2 "solve.default" 0.14 77.8 0.02 11.1 ".Call" 0.04 22.2 0.04 22.2 "nrow" 0.04 22.2 0.04 22.2 "diag" 0.04 22.2 0.00 0.0 "array" 0.02 11.1 0.02 11.1 Hope that sheds some light on your problem. Martin> I assume there is some time-consuming decision going on in the usual > functions; > is there any way to avoid that and go straight to the calculaions? > > Thanks > John Tillinghast > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Martin Morgan Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793