Bryan W. Lewis
2010-Jul-09 04:49 UTC
[Rd] Suggestion for serialization performance improvement on Windows
Dear R developers,

The slow performance of serializing to a raw vector on Windows is an issue that has appeared in this list before. It appears to be due to the frequent use of realloc from the resize_buffer method in serialize.c.

I suggest a more granular, but still incremental, re-allocation of memory. For example change near the top of resize_buffer to:

R_size_t newsize = needed + 65536 - (needed % 65536);

or some other similar small multiple of a typical system page size.

I have found this to dramatically improve performance of serialization to raw vectors on Windows.

Best,

Bryan
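[Editorial illustration] The rounding proposed in the message above can be sketched in isolation. This is a minimal sketch and not code from serialize.c: the names CHUNK, round_up_chunk and count_resizes are hypothetical, and the "exact fit" baseline (capacity set to exactly the bytes needed) is an assumption made only for comparison. It counts how many times the capacity would have to change, which stands in for the number of realloc calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk size, taken from the 65536 in the suggestion above. */
#define CHUNK ((size_t) 65536)

/* Round `needed` up using the same arithmetic as the proposed line.
   Note that a value that is already an exact multiple of CHUNK is
   bumped up by one further full chunk. */
size_t round_up_chunk(size_t needed)
{
    return needed + CHUNK - (needed % CHUNK);
}

/* Count how often the capacity must change while a buffer grows to
   `total` bytes in appends of `step` bytes each.
   use_chunks != 0: new capacity is round_up_chunk(needed).
   use_chunks == 0: new capacity is exactly `needed` (assumed baseline,
                    not taken from serialize.c). */
size_t count_resizes(size_t total, size_t step, int use_chunks)
{
    size_t capacity = 0, used = 0, resizes = 0;

    assert(step > 0);
    while (used < total) {
        size_t needed = used + step;
        if (needed > capacity) {
            capacity = use_chunks ? round_up_chunk(needed) : needed;
            resizes++;
        }
        used = needed;
    }
    return resizes;
}
```

Under these assumptions, growing a buffer to 1 MiB in 1 KiB appends changes the capacity on every append with the exact-fit baseline, but only once per 64 KiB chunk with the rounded size. How much wall-clock time this saves depends on the platform's realloc, which is the point debated later in the thread.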
Henrik Bengtsson
2010-Jul-14 05:53 UTC
[Rd] Suggestion for serialization performance improvement on Windows
On Fri, Jul 9, 2010 at 6:49 AM, Bryan W. Lewis <bwaynelewis at gmail.com> wrote:
> Dear R developers,
>
> The slow performance of serializing to a raw vector on Windows is an
> issue that has appeared in this list before.

My guess is that you are referring to:

[Rd] serialize() to via temporary file is heaps faster than doing it directly (on Windows), 2008-07-24
http://tolstoy.newcastle.edu.au/R/e4/devel/08/07/2355.html

If so, that thread shows how unnecessarily slow (5 mins instead of 5 secs) it is on Windows.

> It appears to be due to
> the frequent use of realloc from the resize_buffer method in
> serialize.c.
>
> I suggest a more granular, but still incremental, re-allocation of
> memory. For example change near the top of resize_buffer to:
>
> R_size_t newsize = needed + 65536 - (needed % 65536);
>
> or some other similar small multiple of a typical system page size.
>
> I have found this to dramatically improve performance of serialization
> to raw vectors on Windows.

I second this update, which seems to make serialize(..., connection=NULL) useful on Windows.

Thanks,

Henrik

> Best,
>
> Bryan
Prof Brian Ripley
2010-Jul-20 10:45 UTC
[Rd] Suggestion for serialization performance improvement on Windows
On Fri, 9 Jul 2010, Bryan W. Lewis wrote:
> Dear R developers,
>
> The slow performance of serializing to a raw vector on Windows is an
> issue that has appeared in this list before. It appears to be due to

References?

> the frequent use of realloc from the resize_buffer method in
> serialize.c.
>
> I suggest a more granular, but still incremental, re-allocation of
> memory. For example change near the top of resize_buffer to:
>
> R_size_t newsize = needed + 65536 - (needed % 65536);
>
> or some other similar small multiple of a typical system page size.

for some definition of 'small multiple'

> I have found this to dramatically improve performance of serialization
> to raw vectors on Windows.

However, I didn't and you presented no evidence. On HB's 2008 example your idea achieved for me a speedup of about 3x. A much better speedup (15x) was achieved by switching serialize.c to use the alternative malloc used by memory.c, and using a much larger page size (e.g. 1Mb) was better still. But changing the re-allocation strategy resulted in a 150x speed up, to levels comparable to decent operating systems like Linux and Solaris with the existing code. (In case it matters, I was using x64 Windows 7.)

Ideally you would have
- given references for your claims
- given examples for why this was too slow for you
- specified an exact patch with performance comparisons for your examples
- given your credentials (see the comment about 'good manners' in the R posting guide). It is very likely that we would not have been able to use any patch you supplied without such credentials.

So please test R-devel, and if there is still a problem reply with all the details omitted here.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595