On 12 August 2017 at 15:10, luke-tierney at uiowa.edu wrote:
| As the Python post points out, it is possible to use alternate malloc
| implementations, either rebuilding R to use them or using LD_PRELOAD.
| On Ubuntu for example, you can have R use jemalloc with
|
| sudo apt-get install libjemalloc1
| env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 R
|
| This does not seem to hold onto memory to the same degree, but I don't
| know about any other aspect of its performance.

Interesting.

I don't really know anything about malloc versus jemalloc internals, but I
can affirm that redis -- an in-memory database written in single-threaded C
for high performance -- has been using jemalloc in its Debian builds for
years, presumably by choice of the maintainer. (We are very happy users of
[a gently patched] redis at work; lots of writes; very good uptime.)

Since we have the ability to switch to jemalloc, we could design a test
bench and measure the impact.

Similarly, if someone cared, I could (presumably) alter the default R build
for Debian and Ubuntu to also switch to jemalloc.

Anybody feel like doing some empirics?

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
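When trying the LD_PRELOAD approach quoted above, it can help to confirm that the alternate allocator really was mapped into the R process, since a mistyped library path does not stop the program from starting. The following is a minimal Python sketch, assuming a Linux system where /proc/<pid>/maps is readable; the function names and the list of substrings are illustrative choices, not part of any tool mentioned in this thread.

```python
import os

# Substrings to look for in mapped library names. jemalloc and tcmalloc are
# the alternatives named in this thread; extend the tuple as needed.
ALLOCATOR_HINTS = ("jemalloc", "tcmalloc")


def mapped_allocators(maps_text, hints=ALLOCATOR_HINTS):
    """Return sorted paths of mapped files whose basename matches a hint."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        name = os.path.basename(path)
        if any(hint in name for hint in hints):
            found.add(path)
    return sorted(found)


def mapped_allocators_for_pid(pid="self"):
    """Read /proc/<pid>/maps (Linux only) and report matching libraries."""
    with open(f"/proc/{pid}/maps") as handle:
        return mapped_allocators(handle.read())
```

Calling mapped_allocators_for_pid() with the PID of a running R session (for example, the value of Sys.getpid() in that session) should list the jemalloc library if the preload took effect, and return an empty list otherwise.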
Very interesting information about switching from glibc malloc to jemalloc.

So I see the action plan as follows:

1. Set up a benchmark (need to think about the design)
2. Run it on an Ubuntu machine with the default glibc malloc
3. Run it with malloc_trim called via reg.finalizer()
4. Run it with jemalloc
5. Review the results and, if they look better than with glibc malloc,
   possibly consider switching the R builds to jemalloc on Debian and Ubuntu

I can't promise a timeline, but I will definitely try to investigate.

--
Regards
Dmitriy Selivanov
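Steps 2 and 4 of the plan above amount to running the same command with and without a preloaded allocator and recording time and peak memory. Here is a rough Python sketch of such a driver, assuming a POSIX system; the function names are made up for illustration, the benchmark command itself is left to the caller, and the malloc_trim variant (step 3) would need R-side code that is not shown here.

```python
import os
import subprocess
import time


def run_once(cmd, preload=None):
    """Run cmd once, optionally with LD_PRELOAD set, and report basic stats."""
    env = dict(os.environ)
    env.pop("LD_PRELOAD", None)  # make the default run a clean baseline
    if preload:
        env["LD_PRELOAD"] = preload
    start = time.perf_counter()
    proc = subprocess.Popen(cmd, env=env, stdout=subprocess.DEVNULL)
    # wait4 returns the resource usage of this specific child process.
    _, status, usage = os.wait4(proc.pid, 0)
    elapsed = time.perf_counter() - start
    if os.WIFEXITED(status):
        proc.returncode = os.WEXITSTATUS(status)
    else:
        proc.returncode = -os.WTERMSIG(status)
    return {
        "seconds": elapsed,
        "max_rss_kb": usage.ru_maxrss,  # kilobytes on Linux
        "exit_code": proc.returncode,
    }


def compare(cmd, allocators):
    """Run cmd under each allocator; allocators maps label -> library path or None."""
    results = {}
    for label, preload in allocators.items():
        if preload is not None and not os.path.exists(preload):
            continue  # library not installed here; skip rather than mis-measure
        results[label] = run_once(cmd, preload)
    return results
```

A call might look like compare(["Rscript", "bench.R"], {"glibc": None, "jemalloc": "/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"}), where bench.R is a placeholder for whatever benchmark script is chosen and the jemalloc path is the one quoted earlier in the thread. Peak RSS alone does not show whether memory is returned to the OS after a garbage collection, so a real benchmark would probably also sample memory over time.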
On 13 August 2017 at 15:15, Dmitriy Selivanov wrote:
| Very interesting information about switching glibc malloc to jemalloc.
|
| So I see action plan as following:
|
| 1. set up some benchmark (need to think about design)
| 2. Run it on ubuntu machine with default glibc malloc
| 3. Run it with malloc_trim passed with reg.finalizer()
| 4. Run it with jemalloc
| 5. Review results and if they will look better than with glibc malloc -
| possibly consider switch R builds to use jemalloc on Debian, Ubuntu
|
| Can't promise about timeline, but I will definitely try to investigate.

Thumbs up!

If you set up a (public) git repo I will try to help. I have access to boxen
running these OSs ranging from 2 GB RAM (old netbook) to 100+ GB RAM (at work).

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
On Saturday, August 12, 2017 5:36:36 PM EDT Dirk Eddelbuettel wrote:
> Similarly, if someone cared, I could (presumably) alter the default R build
> for Debian and Ubuntu to also switch to jemalloc.

Depending on how this turns out, Fedora, RHEL, and CentOS also have jemalloc
and tcmalloc. Meaning, if it's good on those two, it's good on Linux in
general. Basically, jemalloc is faster for many workloads, but it's harder to
spot problems with it. Glibc is better at spotting memory bugs but not as fast.

-Steve

> Anybody feel like doing some empirics?
>
> Dirk
I've created a repo with an initial investigation:
https://github.com/dselivanov/r-malloc/blob/master/README.md

At first glance, jemalloc, tcmalloc, and glibc with malloc_trim all seem to
work better than the default glibc malloc. Interestingly, glibc with
malloc_trim finishes the benchmark a bit faster than vanilla glibc (I've
checked several times, and the result is consistent). Another observation is
that with jemalloc, virtual memory grows much faster than with tcmalloc or
glibc malloc (this could be an issue for those who limit process memory with
`ulimit`).

--
Regards
Dmitriy Selivanov
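The remark about virtual memory and `ulimit` turns on the difference between a process's virtual size and its resident set: a limit such as `ulimit -v` caps address space, so a large VmSize can hit the limit even when VmRSS is modest. Below is a small Python sketch for reading both values, assuming Linux and its /proc/<pid>/status format; the function names are illustrative only.

```python
def parse_vm_kb(status_text, fields=("VmSize", "VmRSS")):
    """Extract selected memory fields (in kB) from /proc/<pid>/status text."""
    wanted = set(fields)
    values = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            # The value looks like "  123456 kB"; keep the integer part.
            values[key] = int(rest.split()[0])
    return values


def vm_kb_for_pid(pid="self"):
    """Read /proc/<pid>/status (Linux only) and return VmSize and VmRSS in kB."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_kb(handle.read())
```

Sampling vm_kb_for_pid() for the R process at intervals under each allocator would show how far VmSize runs ahead of VmRSS, which is the gap that matters for anyone running R under an address-space limit.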