My surmise would be that you have not analyzed the situation correctly, and you
are making a false assumption about your code. Since you can't show the
code, it's pretty hard to figure out what that is. I think you're going
to have to produce a simple example that you can share that has the same
behavior. My guess is that you will answer your own question as you try to do
that.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of S?bastien Bihorel
Sent: Tuesday, February 22, 2011 12:16 PM
To: r-help at r-project.org
Subject: [R] Discrepancies in run times
Dear R-users,
I am in the process of creating new custom functions and am quite puzzled by
some discrepancies in execution time when I run some R scripts that call
those new functions. So here is the situation:
- let's assume I have created two custom functions, called myg and myf;
- myg is mostly a plotting function, which makes a heavy use of grid and
lattice functions;
- myf is a function that massages data, opens and closes graph devices, and
pass the data to myg:
* myf contains loops and sub-loops which subset the data in little pieces
necessary for plotting purposes;
* the most inner loop in myf contains two calls to myg, one in section A
of the code, one in section B of the code;
* Both sections could be turn on and off based upon an input of the myf
function;
* Both sections passes the same data to myg, except for some graph
settings;
* All graph devices open in section A are closed before section B starts;
and all graph devices open in section B are closed before the next iteration
of the inner loop.
Running a script passing a particular set of data to myf and turning on both
section A and B takes around 9 minutes (~3 combined minutes for Section A,
~6 combined minutes for Section B). The results of R CMD Rprof indicates
that most of the execution time is used by print (see extract below).
% total % self
total seconds self seconds name
99.4 545.84 0.0 0.00 "myf"
95.2 522.70 0.0 0.06 "myg"
90.7 498.20 0.0 0.02 "standardGeneric"
90.6 497.70 0.0 0.00 "print"
90.5 497.32 5.3 29.06 "printFunction"
90.5 497.32 0.0 0.00 "print.trellis"
62.7 344.58 62.6 343.96 "lattice.setStatus"
...
% self % total
self seconds total seconds name
62.6 343.96 62.7 344.58 "lattice.setStatus"
5.6 30.86 7.6 41.60 ".Call.graphics"
5.3 29.06 90.5 497.32 "printFunction"
3.5 19.12 3.7 20.22 ".Call"
2.3 12.52 2.3 12.52 "$"
1.3 7.18 1.9 10.42 "match"
1.3 6.98 1.3 6.98 "dev.off"
...
Running another script passing the same set of data to myf and turning
section A on and section B off takes around 3 minutes. The results of R CMD
Rprof also indicates that most of the execution time is used by print (see
extract below).
% total % self
total seconds self seconds name
98.1 177.16 0.0 0.00 "myf"
93.3 168.40 0.0 0.00 "myg"
85.0 153.50 0.1 0.10 "standardGeneric"
84.7 152.94 0.0 0.02 "print"
84.6 152.74 0.0 0.00 "print.trellis"
84.6 152.72 4.5 8.04 "printFunction"
51.3 92.66 51.3 92.58 "lattice.setStatus"
...
% self % total
self seconds total seconds name
51.3 92.58 51.3 92.66 "lattice.setStatus"
8.5 15.34 10.7 19.36 ".Call.graphics"
6.6 11.96 6.9 12.50 ".Call"
4.5 8.04 84.6 152.72 "printFunction"
3.4 6.14 3.4 6.14 "$"
2.1 3.72 3.0 5.40 "match"
0.8 1.46 0.8 1.46 "dev.off"
...
Running another script passing the same set of data to myf and turning
section A off and section B on takes around 3 minutes. The results of R CMD
Rprof also indicates that most of the execution time is used by print (see
extract below).
% total % self
total seconds self seconds name
98.1 175.00 0.0 0.00 "myf"
90.7 161.82 0.0 0.00 "myg"
86.8 154.90 0.0 0.06 "standardGeneric"
86.5 154.32 0.0 0.02 "print"
86.4 154.16 4.0 7.18 "printFunction"
86.4 154.16 0.0 0.00 "print.trellis"
52.6 93.76 52.5 93.62 "lattice.setStatus"
...
% self % total
self seconds total seconds name
52.5 93.62 52.6 93.76 "lattice.setStatus"
8.6 15.28 10.9 19.40 ".Call.graphics"
4.2 7.58 4.5 7.98 ".Call"
4.0 7.18 86.4 154.16 "printFunction"
3.1 5.56 3.1 5.56 "dev.off"
...
I cannot quite explain why the execution of print in section B would take
more time if section A is turn off. I sincerely apologize but, due to the
confidentiality of the actual script, I cannot provide the content of myf or
myg in my email. I realize this is not ideal for identifying the source of
this discrepancy, therefore any suggestion to approach this problem would be
very much welcome.
Sebastien
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
message may contain confidential information. If you are not the designated
recipient, please notify the sender immediately, and delete the original and any
copies. Any use of the message by you is prohibited.