A couple of days ago a few messages indicated that something changed in the
basic plot routine that made plot(*, type='l') slow for large data sets.
Some people even reported crashes for very large data sets. As far as I
remember, this was not reported as a formal bug.

I am still not sure if this is a bug, so I report my findings here. First
of all, I think I see a slowdown of the plot function, although I do not
have older versions of R installed, so I cannot do side-by-side
comparisons. Secondly, I noticed that the behavior of plot(*, type='l')
differs. Before R-2.1, the plotted lines would appear on the plot
gradually. Now, after the wait, the whole plot appears at once.

Here are my timing results. I am on Windows 2000, IBM IntelliStation with
Xeon 2.8GHz with 1GB of memory. I checked May-06 versions of R-patched
and R-devel built from sources. I ran the following simple test:

x <- rnorm(n)
date(); plot(x, type='l'); date()

Here are the timings:

    n    seconds
 5000       1
 6000       2
 7000       4
 8000       6
 9000       9
10000      13
12000      22
14000      36
20000      91

It looks like only type='l' and type='o' exhibit this behavior. All other
types produce plots in approximately 1 second. Also, the (long) wait and
plot at once behavior happens with the two types mentioned. All others
(except 'n' of course) produce gradually appearing plots.

Hope this helps,

Andy

__________________________________
Andy Jaworski
518-1-01
Process Laboratory
3M Corporate Research Laboratory
-----
E-mail: apjaworski at mmm.com
Tel: (651) 733-6092
Fax: (651) 736-3122
On Fri, 06 May 2005 15:19:01 -0500 apjaworski at mmm.com wrote:

> A couple of days ago a few messages indicated that something changed in the
> basic plot routine that made plot(*, type='l') slow for large data sets.
> Some people even reported crashes for very large data sets. As far as I
> remember, this was not reported as a formal bug.
>
> I am still not sure if this is a bug, so I report my findings here. First
> of all, I think I see a slowdown of the plot function, although I do not
> have older versions of R installed, so I cannot do side-by-side
> comparisons. Secondly, I noticed that the behavior of plot(*, type='l')
> differs. Before R-2.1, the plotted lines would appear on the plot
> gradually. Now, after the wait, the whole plot appears at once.
>
> Here are my timing results. I am on Windows 2000, IBM IntelliStation with
> Xeon 2.8GHz with 1GB of memory. I checked May-06 versions of R-patched
> and R-devel built from sources. I ran the following simple test:
>
> x <- rnorm(n)
> date(); plot(x, type='l'); date()
>
> Here are the timings:
>
>     n    seconds
>  5000       1
>  6000       2
>  7000       4
>  8000       6
>  9000       9
> 10000      13
> 12000      22
> 14000      36
> 20000      91
~~~~~~~~~~~~~~~~~~~~~~~

I have no such problem. My OS is Debian, with 256 MB of RAM and 2 GB of swap.

> x <- rnorm(200000)
> date(); plot(x, type='l'); date()
[1] "Sat May 7 08:49:18 2005"
[1] "Sat May 7 08:49:20 2005"
> version
         _
platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    2
minor    1.0
year     2005
month    04
day      18
language R

> It looks like only type='l' and type='o' exhibit this behavior. All other
> types produce plots in approximately 1 second. Also, the (long) wait and
> plot at once behavior happens with the two types mentioned. All others
> (except 'n' of course) produce gradually appearing plots.
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Plotting times depend on the graphics device. That is nowhere mentioned
here, which is unhelpful, and we have already seen a post saying it does
not happen on another unmentioned device (presumably X11).

Let us assume the unmentioned device was windows(), as that is the only
one I see any slowdown for. (Others like win.metafile are windows() under
the skin.)

On Fri, 6 May 2005 apjaworski at mmm.com wrote:

> A couple of days ago a few messages indicated that something changed in the
> basic plot routine that made plot(*, type='l') slow for large data sets.
> Some people even reported crashes for very large data sets. As far as I
> remember, this was not reported as a formal bug.

Well, _is_ there a bug in R (as distinct from in Windows graphics
internals)? I am almost certain there is not in R and this is a bug in
Windows.

> I am still not sure if this is a bug, so I report my findings here. First
> of all, I think I see a slowdown of the plot function, although I do not
> have older versions of R installed, so I cannot do side-by-side
> comparisons. Secondly, I noticed that the behavior of plot(*, type='l')
> differs. Before R-2.1, the plotted lines would appear on the plot
> gradually. Now, after the wait, the whole plot appears at once.
>
> Here are my timing results. I am on Windows 2000, IBM IntelliStation with
> Xeon 2.8GHz with 1GB of memory. I checked May-06 versions of R-patched
> and R-devel built from sources. I ran the following simple test:
>
> x <- rnorm(n)
> date(); plot(x, type='l'); date()

Oh, PLEASE, use system.time() to time things. Had you done so you might
have seen things like

> windows()
> n <- 10000
> system.time(plot(rnorm(n), type="l"))
[1]  0.03 13.11 13.21    NA    NA
> postscript()
> system.time(plot(rnorm(n), type="l"))
[1] 0.07 0.00 0.08   NA   NA
> dev.off()
> system.time(plot(rnorm(n), type="p"))
[1] 0.07 0.93 1.00   NA   NA

so the time is not being taken by R but by Windows.
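The comparison above can be reproduced on any platform by timing the same plot on an explicitly opened device. What follows is only a sketch under stated assumptions: `time_plot` and `pdf_dev` are hypothetical helper names, not part of R, and pdf() stands in for the device because it is available everywhere, whereas windows() exists only on Windows. The first three numbers reported by system.time() are user CPU, system CPU and elapsed seconds; in the windows() output above it is the second (system) figure that is large.

```r
# Hypothetical helper (not part of R): open a device, time one plot on it,
# and close the device again even if plotting fails.
time_plot <- function(open_device, n = 10000, type = "l") {
  x <- rnorm(n)
  open_device()
  on.exit(dev.off(), add = TRUE)
  system.time(plot(x, type = type))
}

# Hypothetical device opener: a PDF device writing to a temporary file.
pdf_dev <- function() pdf(file = tempfile(fileext = ".pdf"))

timings <- time_plot(pdf_dev)
print(timings)  # user, system and elapsed times in seconds
```

On Windows, passing `function() windows()` as `open_device` should time the screen device instead, which makes it possible to compare devices and plot types with the same call.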
I can tell you the reason: it is the support for mitred etc. line ends
introduced in R 2.0.0 and only supported in windows() from 2.1.0. This
has slowed solid lines down to the sort of times taken for dashed lines
previously.

Now, the best we can do to work around this is to follow what we did for
dashed lines, and not attempt to be accurate for very large numbers of
line segments. By plotting in bunches of 1000 lines I get

> system.time(plot(rnorm(n), type="l"))
[1] 0.03 0.36 0.42   NA   NA
> system.time(plot(rnorm(n), type="l", lty=3))
[1] 0.22 2.89 3.11   NA   NA

We have been here before, and as I recall this slowdown happens only in
NT-based versions of Windows which seem _de facto_ restricted to about
1000 line elements in a path: what we were not aware of was that it
happened for solid lines as well as dashed ones.

I've put the bunching into R-patched. It is very regrettable that this
sort of thing was not tested for during beta-testing.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
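The idea of plotting in bunches of 1000 lines can be illustrated at the R level. This is only a sketch and an assumption about the general technique: the change described above was made inside the windows() device in R-patched, not in user code, and `chunk_indices` and `lines_bunched` are hypothetical helper names. Consecutive bunches share one point so the drawn line has no gaps, although the join at each bunch boundary is drawn as two line ends rather than a true join.

```r
# Hypothetical helper: split point indices 1..n into runs of at most
# max_segments line segments. Consecutive runs share one endpoint.
chunk_indices <- function(n, max_segments = 1000) {
  if (n < 2) return(list(seq_len(n)))
  starts <- seq(1, n - 1, by = max_segments)
  lapply(starts, function(s) s:min(s + max_segments, n))
}

# Hypothetical helper: draw one long polyline as a series of shorter ones.
lines_bunched <- function(x, y, max_segments = 1000, ...) {
  for (idx in chunk_indices(length(x), max_segments)) {
    lines(x[idx], y[idx], ...)
  }
  invisible(NULL)
}

n <- 10000
y <- rnorm(n)
pdf(file = tempfile(fileext = ".pdf"))  # any device works; pdf() is portable
plot(seq_len(n), y, type = "n")         # set up axes without drawing the data
lines_bunched(seq_len(n), y)            # draws 10 polylines of <= 1000 segments
invisible(dev.off())
```

With n = 10000 the helper produces ten bunches, nine of 1000 segments and one of 999, which keeps each path at or under the roughly 1000-element limit mentioned above.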