Tomas Kalibera
2021-Mar-29 08:41 UTC
[Rd] custom allocators, Valgrind and uninitialized memory
Hi Andreas, On 3/26/21 8:48 PM, Andreas Kersting wrote:> Hi Dirk, > > Sure, let me try to explain: > > CRAN ran the tests of my packageusing R which was configured > --with-valgrind-instrumentation > 0. Valgrind reported many errors > related to the use of supposedly uninitialized memory and the CRAN > team asked me to tackle these. > > These errors are false positives, because I pass a custom allocator > to allocVector3() which hands out memory which is already > initialized. However, this memory is explicitly marked for Valgrind > as uninitialized by allocVector3(), and I do not initialize it > subsequently, so Valgrind complains. > > Now I am asking if it is correct that allocVector3() marks memory as > uninitialized/undefined, even if it comes from a custom allocator. > This is because allocVector3() cannot know if the memory might > already by initialized. I think the semantics of allocVector/allocVector3 should be the same regardless of whether custom allocators are used. The semantics of allocVector is to provide uninitialized memory (non-pointer types, Writing R Extensions 5.9.2). Therefore, it is the caller who needs to take care of initialization. This is also the semantics of "malloc" and Rallocators.h says "custom_alloc_t mem_alloc; /* malloc equivalent */". So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary. So technically speaking, the valgrind reports are not false alarms. I think your call sites should initialize. Best Tomas [[alternative HTML version deleted]]
Andreas Kersting
2021-Mar-29 20:18 UTC
[Rd] custom allocators, Valgrind and uninitialized memory
Hi Tomas, Thanks for sharing your view on this! I understand your point, but still I think that the current situation is somewhat unfortunate: I would argue that mmap() is a natural candidate to be used together with allocVector3(); it is even mentioned explicitly here: https://github.com/wch/r-source/blob/trunk/src/main/memory.c#L2575-L2576 However, when using a non-anonymous mapping, i.e. we want mmap() to initialize the memory e.g. from a file or a POSIX shared memory object, this means that we need to use MAP_FIXED in case we are obliged to initialize the memory AFTER allocVector3() returned it; at least I cannot think of a different way to achieve this. The use of MAP_FIXED - is discouraged (e.g. https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html) - requires two calls to mmap(): (1) to obtain the (anonymous) memory to be handed out by the custom allocater and (2) to actually map the file "over" the just allocated vector (using MAP_FIXED), which will overwrite the vector header; hence, we need to first back it up to later restore it I have implemented my function using MAP_FIXED here: https://github.com/gfkse/bettermc/commit/f34c4f4c45c9ab11abe9b9e9b8b48064f128d731#diff-7098a5dde34efab163bbef27fe32f95c29e76236649479985d09c70100e4c737R278-R323 This solution, to me, is much more complicated and hacky than my previous one, which assumed it is OK to hand out already initialized memory directly from allocVector3(). Regards, Andreas 2021-03-29 10:41 GMT+02:00 "Tomas Kalibera" <tomas.kalibera at gmail.com>:> Hi Andreas, > On 3/26/21 8:48 PM, Andreas Kersting wrote: >> Hi Dirk, > > Sure, let me try to explain: > > CRAN ran the tests of my package using R which was configured > --with-valgrind-instrumentation > 0. Valgrind reported many errors > related to the use of supposedly uninitialized memory and the CRAN > team asked me to tackle these. > > These errors are false positives, because I pass a custom allocator > to allocVector3() which hands out memory which is already > initialized. However, this memory is explicitly marked for Valgrind > as uninitialized by allocVector3(), and I do not initialize it > subsequently, so Valgrind complains. > > Now I am asking if it is correct that allocVector3() marks memory as > uninitialized/undefined, even if it comes from a custom allocator. > This is because allocVector3() cannot know if the memory might > already by initialized. > I think the semantics of allocVector/allocVector3 should be the same regardless of whether custom allocators are used. The semantics of allocVector is to provide uninitialized memory (non-pointer types, Writing R Extensions 5.9.2). Therefore, it is the caller who needs to take care of initialization. This is also the semantics of "malloc" and Rallocators.h says "custom_alloc_t mem_alloc; /* malloc equivalent */". > > So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary. > > So technically speaking, the valgrind reports are not false alarms. I think your call sites should initialize. > > Best > Tomas > > >