Henrik Bengtsson
2016-Dec-21 17:10 UTC
[Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
On Tue, Dec 20, 2016 at 7:39 AM, Karl Millar <kmillar at google.com> wrote:
> It's not always clear when it's safe to remove the DLL.
>
> The main problem that I'm aware of is that native objects with
> finalizers might still exist (created by R_RegisterCFinalizer etc).
> Even if there are no live references to such objects (which would be
> hard to verify), it still wouldn't be safe to unload the DLL until a
> full garbage collection has been done.
>
> If the DLL is unloaded, then the function pointer that was registered
> now becomes a pointer into the memory where the DLL was, leading to an
> almost certain crash when such objects get garbage collected.

Very good point.

Does base::gc() perform such a *full* garbage collection and thereby
trigger all remaining finalizers to be called? In other words, do you
think an explicit call to base::gc() prior to cleaning out left-over
DLLs (e.g. R.utils::gcDLLs()) would be sufficient?

/Henrik

>
> A better approach would be to just remove the limit on the number of
> DLLs, dynamically expanding the array if/when needed.
>
>
> On Tue, Dec 20, 2016 at 3:40 AM, Jeroen Ooms <jeroen.ooms at stat.ucla.edu> wrote:
>> On Tue, Dec 20, 2016 at 7:04 AM, Henrik Bengtsson
>> <henrik.bengtsson at gmail.com> wrote:
>>> One reason for hitting the MAX_NUM_DLLS (= 100) limit is that some
>>> packages don't unload their DLLs when they are being unloaded themselves.
>>
>> I am surprised by this. Why does R not do this automatically? What is
>> the case for keeping the DLL loaded after the package has been
>> unloaded? What happens if you reload another version of the same
>> package from a different library after unloading?
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
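The suggestion quoted above, dropping the fixed limit and expanding the array when needed, could look roughly like the following. This is only a minimal sketch under stated assumptions: `dll_entry`, `dll_table`, `dll_table_add` and `dll_table_free` are hypothetical names invented for illustration, not the actual structures or functions in Rdynload.c, and the starting capacity of 16 is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry in a table of loaded DLLs.
   The real structure in Rdynload.c holds more information. */
typedef struct {
    char *name;    /* heap-allocated copy of the DLL name */
    void *handle;  /* opaque handle, e.g. as returned by dlopen() */
} dll_entry;

typedef struct {
    dll_entry *items;  /* NULL until the first entry is added */
    size_t count;      /* entries in use */
    size_t capacity;   /* entries allocated */
} dll_table;

/* Append an entry, doubling the capacity whenever the table is full.
   Returns 0 on success, -1 if memory could not be allocated
   (in which case no entry is added). */
int dll_table_add(dll_table *t, const char *name, void *handle)
{
    assert(t != NULL && name != NULL);
    if (t->count == t->capacity) {
        size_t new_cap = t->capacity ? t->capacity * 2 : 16;
        dll_entry *p = realloc(t->items, new_cap * sizeof *p);
        if (p == NULL) return -1;
        t->items = p;
        t->capacity = new_cap;
    }
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy == NULL) return -1;
    memcpy(copy, name, len);
    t->items[t->count].name = copy;
    t->items[t->count].handle = handle;
    t->count++;
    return 0;
}

/* Release every name and the array itself, leaving an empty table. */
void dll_table_free(dll_table *t)
{
    for (size_t i = 0; i < t->count; i++) free(t->items[i].name);
    free(t->items);
    t->items = NULL;
    t->count = 0;
    t->capacity = 0;
}
```

A table initialised with `dll_table t = {0};` can then take any number of `dll_table_add()` calls. One thing such a change would have to watch for, and which this sketch ignores, is that `realloc` may move the array, so any code holding a pointer to an individual entry would need to be revisited.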
It does, but you'd still be relying on the R code ensuring that all of
these objects are dead prior to unloading the DLL, otherwise they'll
survive the GC. Maybe if the package counted how many such objects
exist, it could work out when it's safe to remove the DLL. I'm not
sure that it can be done automatically.

What could be done is to keep the DLL loaded, but remove it from R's
table of loaded DLLs. That way, there's no risk of dangling function
pointers and a new DLL of the same name could be loaded. You could
still run into issues though as some DLLs assume that the associated
namespace exists.

Currently what I do is to never unload DLLs. If I need to replace
one, then I just restart R. It's less convenient, but it's always
correct.

On Wed, Dec 21, 2016 at 9:10 AM, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
> Very good point.
>
> Does base::gc() perform such a *full* garbage collection and thereby
> trigger all remaining finalizers to be called? In other words, do you
> think an explicit call to base::gc() prior to cleaning out left-over
> DLLs (e.g. R.utils::gcDLLs()) would be sufficient?
>
> /Henrik
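The idea above of a package counting its own live native objects might be sketched as follows. All names here (`live_objects`, `pkg_object_created`, `pkg_object_finalized`, `pkg_safe_to_unload`) are hypothetical and are not part of R's API; the sketch assumes the package calls the first function wherever it creates an object with a registered C finalizer, and the second as the last step of that finalizer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-package bookkeeping: the number of native objects
   whose finalizers have been registered but have not yet run. */
static size_t live_objects = 0;

/* Call wherever the package creates a native object and registers
   a finalizer for it. */
void pkg_object_created(void)
{
    live_objects++;
}

/* Call as the last step of the object's finalizer. */
void pkg_object_finalized(void)
{
    assert(live_objects > 0);  /* more finalizations than creations is a bug */
    live_objects--;
}

/* True only when no object with a pending finalizer remains. */
bool pkg_safe_to_unload(void)
{
    return live_objects == 0;
}
```

Even under these assumptions the check is only a necessary condition: the count reaches zero while the last finalizer is still executing code inside the DLL, so any unload would have to wait until that call has returned, and the issue of DLLs that expect their namespace to exist is not addressed at all.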
Dirk Eddelbuettel
2016-Dec-21 17:58 UTC
[Rd] Request: Increasing MAX_NUM_DLLS in Rdynload.c
On 21 December 2016 at 09:42, Karl Millar via R-devel wrote:
| Currently what I do is to never unload DLLs. If I need to replace
| one, then I just restart R. It's less convenient, but it's always
| correct.

Same here. Ever since we built littler in 2006 (!!) I have been doing
tests at the command-line with fresh 'r' processes. No surprises, no
side effects.

Dirk

PS Spencer, if you are still reading, std::vector is described inter alia here

   http://en.cppreference.com/w/cpp/container/vector

My point of bringing it up was a deeper one because that (really widely
used) data structure grows as needed. No pointers, no malloc, no horror
stories you may have heard from C.

--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org