> Consider the case where a function creates and populates a Module, stuffs it > in an ExistingModuleProvider for the JIT, then returns the ModuleProvider, > dropping direct reference to the Module. (ModuleProvider takes ownership of > the Module.) I presume that your Python object is under the impression it > owns the Module; when that goes out of scope, its refcount goes to zero and > it invokes its dtor, disposing of the Module. D'oh— now the ModuleProvider > has a dangling pointer. :)Ah. Good one. Would the following fix it? 1) Have ModuleProvider maintain a reference to the Module it owns, so that the ref count is at least 1 at any time. This is easily done. The only thing left is when an MP goes away, the module's dtor will be called first, deleting the module, then the MP's dtor will be called, which will try to delete the same module again. 2a) So either we can prevent the actual destruction of modules that are attached to MPs, or 2b) Do not do anything in the dtors of MPs (while letting the dtor of modules do the work) Both options have the disadvantage of assuming the C/C++ implementation (like MP::dtor deletes only the module and nothing else).> The routine LLVMModuleRef > LLVMGetGlobalParent(LLVMValueRef Global); poses a related problem; in this > case, the returned reference is non-owning, so you must not dtor it from > Python.If I do this: m1 = Module.new() g1 = m1.add_global_variable(ty, "name") m2 = g1.module will the LLVMModuleRef pointer returned in the last call be the same as that of m1? If so probably we can get "g1.module" to return the original object itself.> The fix, of course, is providing a dispose routine and requiring the user to > call it, since you can't know what they've done with the pointer.It'd be much easier to use it without an explicit destruction call. I'd prefer to do it only if there's absolutely no other go. Regards, -Mahadevan.
On May 12, 2008, at 02:58, Mahadevan R wrote:>> Consider the case where a function creates and populates a Module, >> stuffs it in an ExistingModuleProvider for the JIT, then returns >> the ModuleProvider, dropping direct reference to the Module. >> (ModuleProvider takes ownership of the Module.) I presume that your >> Python object is under the impression it owns the Module; when that >> goes out of scope, its refcount goes to zero and it invokes its >> dtor, disposing of the Module. D'oh— now the ModuleProvider has a >> dangling pointer. :) > > Ah. Good one. Would the following fix it? > > 1) Have ModuleProvider maintain a reference to the Module it owns, > so that the ref count is at least 1 at any time. This is easily > done. > The only thing left is when an MP goes away, the module's dtor > will be called first, deleting the module, then the MP's dtor will > be called, which will try to delete the same module again. > > 2a) So either we can prevent the actual destruction of modules that > are attached to MPs, or > > 2b) Do not do anything in the dtors of MPs (while letting the dtor of > modules do the work)The ModuleProvider itself also needs to be destroyed or you have a memory leak.> Both options have the disadvantage of assuming the C/C++ > implementation (like MP::dtor deletes only the module and nothing > else). > >> The routine LLVMModuleRef LLVMGetGlobalParent(LLVMValueRef Global); >> poses a related problem; in this case, the returned reference is >> non-owning, so you must not dtor it from Python. > > If I do this: > > m1 = Module.new() > g1 = m1.add_global_variable(ty, "name") > m2 = g1.module > > will the LLVMModuleRef pointer returned in the last call be the > same as that of m1? If so probably we can get "g1.module" to > return the original object itself.It will, so that's a possibile.>> The fix, of course, is providing a dispose routine and requiring >> the user to call it, since you can't know what they've done with >> the pointer. > > It'd be much easier to use it without an explicit destruction call. > I'd prefer to do it only if there's absolutely no other go.It's certainly the easiest option. The thing to remember is that C++ won't play with your garbage collector at all. The only other option I can see is to 1. Add an 'owns pointer?' flag to each Python wrapper object. 2. Update that flag dynamically as the value is used, in accordance with the C++ ownership semantics. 3. Maintain a LLVMFooRef -> Python object table. So you would reset the 'owns pointer?' flag on m if the user calls ModuleProvider.new(m). The table is necessary, else mp = ModuleProvider.new(m2) [using your example fragment above] would trigger a double free by mp and m1. Python users are probably not performance sensitive to routing lookups through the dictionary, so this might be quite workable. But even with all that work, you'll still have the possibility of surprising results: def f(m): mp = ModuleProvider.new(m) # do something useful? m = Module.new("") f(m) print(m.name) # Segfault, heap corruption, etc. For Value and subclasses, you can perform the optimization that the Value is always owned by the Module and skip invoking all of this machinery. Likewise for Pass, if it ever gets bound. — Gordon
On Mon, May 12, 2008 at 7:55 PM, Gordon Henriksen <gordonhenriksen at mac.com> wrote:> On May 12, 2008, at 02:58, Mahadevan R wrote: > > >> Consider the case where a function creates and populates a Module, > >> stuffs it in an ExistingModuleProvider for the JIT, then returns > >> the ModuleProvider, dropping direct reference to the Module. > >> (ModuleProvider takes ownership of the Module.) I presume that your > >> Python object is under the impression it owns the Module; when that > >> goes out of scope, its refcount goes to zero and it invokes its > >> dtor, disposing of the Module. D'oh— now the ModuleProvider has a > >> dangling pointer. :) > > > > Ah. Good one. Would the following fix it? > > > > 1) Have ModuleProvider maintain a reference to the Module it owns, > > so that the ref count is at least 1 at any time. This is easily > > done. > > The only thing left is when an MP goes away, the module's dtor > > will be called first, deleting the module, then the MP's dtor will > > be called, which will try to delete the same module again. > > > > 2a) So either we can prevent the actual destruction of modules that > > are attached to MPs, or > > > > 2b) Do not do anything in the dtors of MPs (while letting the dtor of > > modules do the work) > > The ModuleProvider itself also needs to be destroyed or you have a > memory leak. > > > > Both options have the disadvantage of assuming the C/C++ > > implementation (like MP::dtor deletes only the module and nothing > > else). > > > >> The routine LLVMModuleRef LLVMGetGlobalParent(LLVMValueRef Global); > >> poses a related problem; in this case, the returned reference is > >> non-owning, so you must not dtor it from Python. > > > > If I do this: > > > > m1 = Module.new() > > g1 = m1.add_global_variable(ty, "name") > > m2 = g1.module > > > > will the LLVMModuleRef pointer returned in the last call be the > > same as that of m1? If so probably we can get "g1.module" to > > return the original object itself. > > It will, so that's a possibile. > > > >> The fix, of course, is providing a dispose routine and requiring > >> the user to call it, since you can't know what they've done with > >> the pointer. > > > > It'd be much easier to use it without an explicit destruction call. > > I'd prefer to do it only if there's absolutely no other go. > > > It's certainly the easiest option. The thing to remember is that C++ > won't play with your garbage collector at all. > > The only other option I can see is to > > 1. Add an 'owns pointer?' flag to each Python wrapper object. > 2. Update that flag dynamically as the value is used, in accordance > with the C++ ownership semantics. > 3. Maintain a LLVMFooRef -> Python object table. > > So you would reset the 'owns pointer?' flag on m if the user calls > ModuleProvider.new(m). > > The table is necessary, else mp = ModuleProvider.new(m2) [using your > example fragment above] would trigger a double free by mp and m1. > Python users are probably not performance sensitive to routing lookups > through the dictionary, so this might be quite workable. > > But even with all that work, you'll still have the possibility of > surprising results: > > def f(m): > mp = ModuleProvider.new(m) > # do something useful? > > m = Module.new("") > f(m) > print(m.name) # Segfault, heap corruption, etc.Riiight. Let's try again: 1) The MP dtor does a no-op (deletes self, but not the module it owns) 2) The module has a 'provider' attribute, initially None. 3) When a MP is created for a module, it sets module.provider to itself. 4) MP itself has a reference to the module it owns. Essentially, this creates a mutual reference between the two objects and both remain active untill *all* references to *both* the objects go away from the user code. It does appear that Python collects these properly. Regards, -Mahadevan.