dteller@affinnova.com
2003-Aug-25 20:38 UTC
[Rd] R Memory Management Under Windows (PR#3980)
Full_Name: David Teller
Version: 1.7.1
OS: Windows XP
Submission from: (NULL) (12.110.141.194)

I've noticed several issues with R's memory management under Windows. It frequently runs out of memory, even when there is plenty of available memory on the system and even plenty of free process virtual address space. This tends to happen far more often when allocating a number of large objects early in the processing.

Investigating the malloc.c routines indicates that when expanding the heap to allocate a large object that will not fit, the Windows sbrk emulation routine calls an internal routine, findregion, with the object size (rounded up for overhead). This routine essentially searches the virtual address space for an unused place large enough to put the object, looking towards higher addresses until it finds a suitable location, always starting at the end of the last chunk allocated.

The issue is that any available space that is not large enough for the object in question is bypassed and can NEVER be used by the memory allocator. This can lead to a huge amount of address-space wastage if several large objects are allocated early in the process.

It seems that the search cannot simply be restarted at the beginning of memory each time, because the malloc implementation assumes that each chunk of memory allocated from the system will be at a higher address than the last.

I have several possible solutions to this and am thinking about implementing one of them:
1) Completely scrap the routines in malloc.c and replace them with a thin wrapper around the core Windows memory allocation routines.
2) Attempt to implement a Windows version of mmap() and enable that feature for large allocations.
3) Attempt to clean up the malloc routines so they are less picky about the address ordering of chunks allocated from the system.

Any feedback would be appreciated. I don't see any other bugs on this issue and don't want to duplicate efforts if someone else is already working on this.

 - Dave Teller
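For concreteness, a simplified sketch of the forward-only search described above, built on the Win32 VirtualQuery call. This is illustrative rather than the actual malloc.c code: the function name is borrowed, and the real routine must also round addresses up to the 64K allocation granularity.

    #include <windows.h>

    /* Sketch of the forward-only search: starting at start_addr, walk
       the address space with VirtualQuery and return the first free
       region of at least 'size' bytes. Any smaller free gap passed
       over lies below the next starting point and is never revisited,
       which is the wastage described above. */
    static void *find_region(void *start_addr, size_t size)
    {
        MEMORY_BASIC_INFORMATION info;
        char *addr = (char *) start_addr;

        while (VirtualQuery(addr, &info, sizeof(info)) == sizeof(info)) {
            if (info.State == MEM_FREE && info.RegionSize >= size)
                return info.BaseAddress;  /* big enough: use this region */
            /* in use, or free but too small: skip past it for good */
            addr = (char *) info.BaseAddress + info.RegionSize;
        }
        return NULL;  /* ran off the top of the user address space */
    }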
Why is this a bug report? It does not fit the definition of bugs in the FAQ. If you want to help, don't misuse R-bugs for questions/comments.

On Mon, 25 Aug 2003 dteller@affinnova.com wrote:

> Full_Name: David Teller
> Version: 1.7.1
> OS: Windows XP
> Submission from: (NULL) (12.110.141.194)
>
> I've noticed several issues with R's memory management under Windows.
> It frequently runs out of memory, even when there is plenty of
> available memory on the system and even plenty of free process virtual
> address space. This tends to happen far more often when allocating a
> number of large objects early in the processing.

But that is not a bug -- it is even documented as likely.

> Investigating the malloc.c routines indicates that when expanding the
> heap to allocate a large object that will not fit, the Windows sbrk
> emulation routine calls an internal routine, findregion, with the
> object size (rounded up for overhead). This routine essentially
> searches the virtual address space for an unused place large enough to
> put the object, looking towards higher addresses until it finds a
> suitable location, always starting at the end of the last chunk
> allocated.
>
> The issue is that any available space that is not large enough for the
> object in question is bypassed and can NEVER be used by the memory
> allocator. This can lead to a huge amount of address-space wastage if
> several large objects are allocated early in the process.
>
> It seems that the search cannot simply be restarted at the beginning of
> memory each time, because the malloc implementation assumes that each
> chunk of memory allocated from the system will be at a higher address
> than the last.
>
> I have several possible solutions to this and am thinking about
> implementing one of them:
> 1) Completely scrap the routines in malloc.c and replace them with a
> thin wrapper around the core Windows memory allocation routines.

We did this precisely because that approach (used prior to 1.2.0) was far too slow. We saw speed hits of at least 4x in that version until this malloc.c was used.

> 2) Attempt to implement a Windows version of mmap() and enable that
> feature for large allocations.
> 3) Attempt to clean up the malloc routines so they are less picky about
> the address ordering of chunks allocated from the system.
>
> Any feedback would be appreciated.
> I don't see any other bugs on this issue and don't want to duplicate
> efforts if someone else is already working on this.

I don't see any bugs at all. I suspect most people using large amounts of memory use a better OS anyway: if Windows provided an adequate malloc in the first place this would all be moot.

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
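For context on the rejected option 1, a thin wrapper over the Windows heap looks roughly like the following (a hypothetical sketch, not R's actual pre-1.2.0 code). Every allocation and release goes straight to a system heap call, which is presumably where the reported 4x slowdown came from:

    #include <windows.h>

    /* Illustrative thin wrapper (hypothetical names): each request is
       forwarded directly to the process heap, paying the full cost of
       a Windows heap call on every allocation and release. */
    void *thin_malloc(size_t size)
    {
        return HeapAlloc(GetProcessHeap(), 0, size);
    }

    void thin_free(void *ptr)
    {
        if (ptr) HeapFree(GetProcessHeap(), 0, ptr);
    }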
On Mon, 25 Aug 2003 20:29:10 +0200 (MET DST), dteller@affinnova.com wrote:

> I have several possible solutions to this and am thinking about
> implementing one of them:
> 1) Completely scrap the routines in malloc.c and replace them with a
> thin wrapper around the core Windows memory allocation routines.
> 2) Attempt to implement a Windows version of mmap() and enable that
> feature for large allocations.
> 3) Attempt to clean up the malloc routines so they are less picky about
> the address ordering of chunks allocated from the system.

Option 1 is currently available with the compile-time option LEA_MALLOC = NO in src/gnuwin32/MkRules.

I doubt if we would incorporate patches to do 2 or 3, because this sort of stuff is so hard to test and debug, but you're welcome to do it for your own use. If you do, I'd recommend talking to Doug Lea, the author of our current malloc, about incorporating your ideas into his code.

Duncan Murdoch
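On option 2, the usual Windows stand-in for anonymous mmap() is VirtualAlloc. A minimal sketch, assuming the malloc's mmap hooks could be pointed at wrappers like these (the names are illustrative and error handling is omitted):

    #include <windows.h>

    /* Minimal Win32 stand-ins for anonymous mmap()/munmap(), suitable
       for routing large allocations around the sbrk emulation. */
    static void *win32_mmap(size_t size)
    {
        /* Reserve and commit in one step; committed pages are
           zero-filled, matching anonymous mmap() semantics. */
        void *p = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
                               PAGE_READWRITE);
        return p ? p : (void *) -1;  /* mimic MAP_FAILED */
    }

    static int win32_munmap(void *ptr, size_t size)
    {
        (void) size;  /* MEM_RELEASE frees the whole reservation */
        return VirtualFree(ptr, 0, MEM_RELEASE) ? 0 : -1;
    }

Because each such mapping is an independent reservation, freeing one returns its address range to the OS, so the gap can be reused by a later allocation instead of being skipped forever by the forward-only search.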