thr3ads.net - llvm dev - [LLVMdev] Python bindings available. [May 2008]

If this information is useful, please help other people find it:
Share via:

Mahadevan R

2008-May-12 06:58 UTC

[LLVMdev] Python bindings available.

> Consider the case where a function creates and populates a Module, stuffs
it
> in an ExistingModuleProvider for the JIT, then returns the ModuleProvider,
> dropping direct reference to the Module. (ModuleProvider takes ownership of
> the Module.) I presume that your Python object is under the impression it
> owns the Module; when that goes out of scope, its refcount goes to zero and
> it invokes its dtor, disposing of the Module. D'oh— now the
ModuleProvider
> has a dangling pointer. :)
Ah. Good one. Would the following fix it?

1) Have ModuleProvider maintain a reference to the Module it owns,
     so that the ref count is at least 1 at any time. This is easily done.
     The only thing left is when an MP goes away, the module's dtor
     will be called first, deleting the module, then the MP's dtor will
     be called, which will try to delete the same module again.

2a) So either we can prevent the actual destruction of modules that
    are attached to MPs, or

2b) Do not do anything in the dtors of MPs (while letting the dtor of
    modules do the work)

Both options have the disadvantage of assuming the C/C++ implementation
(like MP::dtor deletes only the module and nothing else).
> The routine LLVMModuleRef
> LLVMGetGlobalParent(LLVMValueRef Global); poses a related problem; in this
> case, the returned reference is non-owning, so you must not dtor it from
> Python.
If I do this:

  m1 = Module.new()
  g1 = m1.add_global_variable(ty, "name")
  m2 = g1.module

will the LLVMModuleRef pointer returned in the last call be the
same as that of m1? If so probably we can get "g1.module" to
return the original object itself.
> The fix, of course, is providing a dispose routine and requiring the user
to
> call it, since you can't know what they've done with the pointer.
It'd be much easier to use it without an explicit destruction call.
I'd prefer to do it only if there's absolutely no other go.

Regards,
-Mahadevan.

Gordon Henriksen

2008-May-12 14:25 UTC

head link

[LLVMdev] Python bindings available.

On May 12, 2008, at 02:58, Mahadevan R wrote:
>> Consider the case where a function creates and populates a Module,  
>> stuffs it in an ExistingModuleProvider for the JIT, then returns  
>> the ModuleProvider, dropping direct reference to the Module.  
>> (ModuleProvider takes ownership of the Module.) I presume that your  
>> Python object is under the impression it owns the Module; when that  
>> goes out of scope, its refcount goes to zero and it invokes its  
>> dtor, disposing of the Module. D'oh— now the ModuleProvider has a  
>> dangling pointer. :)
>
> Ah. Good one. Would the following fix it?
>
> 1) Have ModuleProvider maintain a reference to the Module it owns,
>     so that the ref count is at least 1 at any time. This is easily  
> done.
>     The only thing left is when an MP goes away, the module's dtor
>     will be called first, deleting the module, then the MP's dtor will
>     be called, which will try to delete the same module again.
>
> 2a) So either we can prevent the actual destruction of modules that
>    are attached to MPs, or
>
> 2b) Do not do anything in the dtors of MPs (while letting the dtor of
>    modules do the work)
The ModuleProvider itself also needs to be destroyed or you have a  
memory leak.
> Both options have the disadvantage of assuming the C/C++  
> implementation (like MP::dtor deletes only the module and nothing  
> else).
>
>> The routine LLVMModuleRef LLVMGetGlobalParent(LLVMValueRef Global);  
>> poses a related problem; in this case, the returned reference is  
>> non-owning, so you must not dtor it from Python.
>
> If I do this:
>
>  m1 = Module.new()
>  g1 = m1.add_global_variable(ty, "name")
>  m2 = g1.module
>
> will the LLVMModuleRef pointer returned in the last call be the
> same as that of m1? If so probably we can get "g1.module" to
> return the original object itself.
It will, so that's a possibile.
>> The fix, of course, is providing a dispose routine and requiring  
>> the user to call it, since you can't know what they've done
with
>> the pointer.
>
> It'd be much easier to use it without an explicit destruction call.  
> I'd prefer to do it only if there's absolutely no other go.

It's certainly the easiest option. The thing to remember is that C++  
won't play with your garbage collector at all.

The only other option I can see is to

  1. Add an 'owns pointer?' flag to each Python wrapper object.
  2. Update that flag dynamically as the value is used, in accordance  
with the C++ ownership semantics.
  3. Maintain a LLVMFooRef -> Python object table.

So you would reset the 'owns pointer?' flag on m if the user calls  
ModuleProvider.new(m).

The table is necessary, else mp = ModuleProvider.new(m2) [using your  
example fragment above] would trigger a double free by mp and m1.  
Python users are probably not performance sensitive to routing lookups  
through the dictionary, so this might be quite workable.

But even with all that work, you'll still have the possibility of  
surprising results:

   def f(m):
     mp = ModuleProvider.new(m)
     # do something useful?

   m = Module.new("")
   f(m)
   print(m.name)  # Segfault, heap corruption, etc.

For Value and subclasses, you can perform the optimization that the  
Value is always owned by the Module and skip invoking all of this  
machinery. Likewise for Pass, if it ever gets bound.

— Gordon

Mahadevan R

2008-May-12 16:00 UTC

head link

[LLVMdev] Python bindings available.

On Mon, May 12, 2008 at 7:55 PM, Gordon Henriksen
<gordonhenriksen at mac.com> wrote:> On May 12, 2008, at 02:58, Mahadevan R wrote:
>
>  >> Consider the case where a function creates and populates a
Module,
>  >> stuffs it in an ExistingModuleProvider for the JIT, then returns
>  >> the ModuleProvider, dropping direct reference to the Module.
>  >> (ModuleProvider takes ownership of the Module.) I presume that
your
>  >> Python object is under the impression it owns the Module; when
that
>  >> goes out of scope, its refcount goes to zero and it invokes its
>  >> dtor, disposing of the Module. D'oh— now the ModuleProvider
has a
>  >> dangling pointer. :)
>  >
>  > Ah. Good one. Would the following fix it?
>  >
>  > 1) Have ModuleProvider maintain a reference to the Module it owns,
>  >     so that the ref count is at least 1 at any time. This is easily
>  > done.
>  >     The only thing left is when an MP goes away, the module's
dtor
>  >     will be called first, deleting the module, then the MP's dtor
will
>  >     be called, which will try to delete the same module again.
>  >
>  > 2a) So either we can prevent the actual destruction of modules that
>  >    are attached to MPs, or
>  >
>  > 2b) Do not do anything in the dtors of MPs (while letting the dtor of
>  >    modules do the work)
>
>  The ModuleProvider itself also needs to be destroyed or you have a
>  memory leak.
>
>
>  > Both options have the disadvantage of assuming the C/C++
>  > implementation (like MP::dtor deletes only the module and nothing
>  > else).
>  >
>  >> The routine LLVMModuleRef LLVMGetGlobalParent(LLVMValueRef
Global);
>  >> poses a related problem; in this case, the returned reference is
>  >> non-owning, so you must not dtor it from Python.
>  >
>  > If I do this:
>  >
>  >  m1 = Module.new()
>  >  g1 = m1.add_global_variable(ty, "name")
>  >  m2 = g1.module
>  >
>  > will the LLVMModuleRef pointer returned in the last call be the
>  > same as that of m1? If so probably we can get "g1.module"
to
>  > return the original object itself.
>
>  It will, so that's a possibile.
>
>
>  >> The fix, of course, is providing a dispose routine and requiring
>  >> the user to call it, since you can't know what they've
done with
>  >> the pointer.
>  >
>  > It'd be much easier to use it without an explicit destruction
call.
>  > I'd prefer to do it only if there's absolutely no other go.
>
>
>  It's certainly the easiest option. The thing to remember is that C++
>  won't play with your garbage collector at all.
>
>  The only other option I can see is to
>
>   1. Add an 'owns pointer?' flag to each Python wrapper object.
>   2. Update that flag dynamically as the value is used, in accordance
>  with the C++ ownership semantics.
>   3. Maintain a LLVMFooRef -> Python object table.
>
>  So you would reset the 'owns pointer?' flag on m if the user calls
>  ModuleProvider.new(m).
>
>  The table is necessary, else mp = ModuleProvider.new(m2) [using your
>  example fragment above] would trigger a double free by mp and m1.
>  Python users are probably not performance sensitive to routing lookups
>  through the dictionary, so this might be quite workable.
>
>  But even with all that work, you'll still have the possibility of
>  surprising results:
>
>    def f(m):
>      mp = ModuleProvider.new(m)
>      # do something useful?
>
>    m = Module.new("")
>    f(m)
>    print(m.name)  # Segfault, heap corruption, etc.
Riiight. Let's try again:

1) The MP dtor does a no-op (deletes self, but not the module it owns)

2) The module has a 'provider' attribute, initially None.

3) When a MP is created for a module, it sets module.provider to itself.

4) MP itself has a reference to the module it owns.

Essentially, this creates a mutual reference between the two objects
and both remain active untill *all* references to *both* the objects
go away from the user code.

It does appear that Python collects these properly.

Regards,
-Mahadevan.

Reasonably Related Threads

Search for more possibly parallel threads

llvm dev - May 2008 - [LLVMdev] Python bindings available.

[LLVMdev] Python bindings available.

[LLVMdev] Python bindings available.

[LLVMdev] Python bindings available.

Reasonably Related Threads