thr3ads.net - llvm dev - [llvm-dev] Should I add intrinsics to write my own automatic reference counting passes? [Nov 2020]

Ola Fosheim Grøstad via llvm-dev

2020-Nov-18 12:08 UTC

[llvm-dev] Should I add intrinsics to write my own automatic reference counting passes?

My experience with LLVM is limited, but I am trying to figure out how to
add optimizations for automatic reference counting. The GC documentation
mentions that patch-points could be useful, but it does not state how they
would be useful. If this is a FAQ, please let me know...

So this is my idea at this point:

The context is a C++-like language with an aggregate type that is always
reference counted. The type system differentiates between pointers to objects
that are shared between threads and pointers to objects that are not. I also
want a pass that turns a shared_ptr into a nonshared_ptr when that can be
proven safe.

So what I want to do is to wrap up all the relevant "events" as intrinsics
and run some simplification passes, then use the pointer capture/escape
analysis that LLVM has to turn shared_ptrs into nonshared_ptrs and to elide
nonatomic/atomic acquire/release pairs. So basically, the intrinsics will
also serve as the type annotations.

The compilation will then follow this pattern:
1. generate LLVM IR
2. simplification passes
3. pass for turning shared_ptr to nonshared_ptr
4. pass for eliding acquire/release
5. pass that replaces the custom intrinsics with function calls
6. full optimization passes
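
A minimal sketch of steps 3 and 4 on a toy instruction list, to make the pipeline above concrete. Everything here is hypothetical: the `Op` enum, the `Escape` marker (standing in for "LLVM's capture analysis says this pointer may reach another thread"), and the pass names are illustration only, not LLVM APIs. A real pass would walk call instructions and query capture tracking instead.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy IR: one entry per intrinsic call on a numbered pointer.
enum class Op { CastToShared, Escape, SharedAcquire, SharedRelease,
                NonsharedAcquire, NonsharedRelease };

struct Inst {
    Op op;
    int ptr;  // id of the pointer value the op applies to
};

using Function = std::vector<Inst>;

// Step 3 (sketch): a pointer that never escapes is treated as thread-local,
// so its cast to shared is dropped and its shared ops become nonshared ops.
void demoteUnescaped(Function &f) {
    std::set<int> escaped;
    for (const Inst &i : f)
        if (i.op == Op::Escape) escaped.insert(i.ptr);

    Function out;
    for (Inst i : f) {
        if (escaped.count(i.ptr) == 0) {
            if (i.op == Op::CastToShared) continue;
            if (i.op == Op::SharedAcquire) i.op = Op::NonsharedAcquire;
            if (i.op == Op::SharedRelease) i.op = Op::NonsharedRelease;
        }
        out.push_back(i);
    }
    f = out;
}

// True if `a` is an acquire and `b` the matching release on the same pointer.
static bool isMatchingPair(const Inst &a, const Inst &b) {
    if (a.ptr != b.ptr) return false;
    return (a.op == Op::NonsharedAcquire && b.op == Op::NonsharedRelease) ||
           (a.op == Op::SharedAcquire && b.op == Op::SharedRelease);
}

// Step 4 (sketch): drop an acquire that is immediately followed by its
// release. Only adjacent pairs are handled, which is deliberately conservative.
void elideAdjacentPairs(Function &f) {
    Function out;
    for (const Inst &i : f) {
        if (!out.empty() && isMatchingPair(out.back(), i))
            out.pop_back();
        else
            out.push_back(i);
    }
    f = out;
}

// Runs step 3 and then step 4; the passes only ever remove or rewrite ops.
Function runToyPipeline(Function f) {
    const auto before = f.size();
    demoteUnescaped(f);
    elideAdjacentPairs(f);
    assert(f.size() <= before);
    return f;
}
```

In this toy model, a pointer that is cast to shared, acquired, and released without ever escaping disappears entirely, while a pointer that escapes keeps its atomic acquire and release.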

I am thinking of having the following intrinsics:

ptr = cast_untyped_to_nonshared(ptr)    // e.g. used after allocation
ptr = cast_to_shared_irreversible(ptr)  // basically a gateway to other threads
nonshared_acquire(ptr)
nonshared_release(ptr)
shared_acquire(ptr)
shared_release(ptr)
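
A sketch of what the runtime functions behind these intrinsics might look like once step 5 has lowered them to ordinary calls. The `RcHeader` layout, the `bool` return value of the release functions, and the choice of memory orderings are all assumptions made for illustration; the intrinsic names are the ones listed above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical object header: a reference count plus a "shared" flag.
struct RcHeader {
    std::atomic<std::uint32_t> count{1};  // the creator holds one reference
    bool shared = false;
};

// Marks the object as visible to other threads; there is no way back.
inline RcHeader *cast_to_shared_irreversible(RcHeader *h) {
    h->shared = true;
    return h;
}

// Nonshared variants: a plain load and store, with no atomic
// read-modify-write, because only one thread can reach the object.
inline void nonshared_acquire(RcHeader *h) {
    h->count.store(h->count.load(std::memory_order_relaxed) + 1,
                   std::memory_order_relaxed);
}

// Returns true when the last reference was dropped.
inline bool nonshared_release(RcHeader *h) {
    const std::uint32_t old = h->count.load(std::memory_order_relaxed);
    assert(old > 0);
    h->count.store(old - 1, std::memory_order_relaxed);
    return old == 1;
}

// Shared variants: atomic read-modify-write operations.
inline void shared_acquire(RcHeader *h) {
    h->count.fetch_add(1, std::memory_order_relaxed);
}

// Returns true when the last reference was dropped.
inline bool shared_release(RcHeader *h) {
    const std::uint32_t old = h->count.fetch_sub(1, std::memory_order_acq_rel);
    assert(old > 0);
    return old == 1;
}
```

The cost difference between the two families is the motivation for the demotion pass: every shared_acquire/shared_release that can be proven thread-local trades an atomic read-modify-write for a plain load and store.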

I also want weak_ptr at a later stage, but I am leaving it out for now to
keep the complexity manageable.

Is this idea completely unreasonable?

Ola.

Florian Hahn via llvm-dev

2020-Nov-18 12:39 UTC

Hi,

LLVM has intrinsics for Objective-C ARC
(https://llvm.org/docs/LangRef.html#objective-c-arc-runtime-intrinsics) and I
think there are also some optimization passes for those intrinsics. Clang uses
them for Objective-C
(https://clang.llvm.org/docs/AutomaticReferenceCounting.html#background).

I am not really familiar with the details there, but it might be helpful to take
a look at the Objective-C support. I am not sure how much the intrinsics are
tied to Objective-C and it might even be possible to generalize them for other
use-cases. I’m CC’ing some people who are more familiar with the details.

Cheers,
Florian

John McCall via llvm-dev

2020-Nov-19 00:04 UTC

The main problem for this sort of optimization is that it is difficult 
to do on an IR like LLVM’s, where the semantic relationships between 
values that exist in the program have been lowered into a sequence of 
primitive operations with no remaining structural relationship.  What 
you really want is an IR that preserves those relationships of value 
definition and use, allowing you to say e.g. that one value is a copy of 
another, that ownership of a value is passed into something, that a 
value is used for a certain duration but then is no longer used, and so 
on.  With this sort of representation, the optimization turns into 
fairly straightforward value-forwarding and lifetime manipulations.  
Dealing with unrelated operations and retroactively attempting to infer 
relationships, as LLVM IR must, turns it into a fundamentally difficult 
analysis that often relies on semantic knowledge that isn’t expressed 
in the IR, so that you’re actually reasoning about what happens under 
a “well-behaved” frontend.

John.
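
As an illustration of the point about preserved relationships, here is a minimal sketch of a toy representation in which "copy of" and "end of lifetime" are explicit. It is not taken from the thread or from any existing compiler; `Node`, `Kind`, and `forwardCopies` are hypothetical names. With a copy standing for an acquire and a destroy for a release, removing a redundant pair becomes plain value forwarding: if the source of a copy is only destroyed afterwards, the copy can take over the source's reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ownership-aware toy IR.
enum class Kind { Copy, Use, Destroy };

struct Node {
    Kind kind;
    int value;   // value defined (Copy) or operated on (Use, Destroy)
    int source;  // for Copy: the value being copied; otherwise -1
};

using Body = std::vector<Node>;

// True if node `n` refers to value `v` in any position.
static bool mentions(const Node &n, int v) {
    return n.value == v || (n.kind == Kind::Copy && n.source == v);
}

// Forwards `src` into `dst` for every `dst = copy src` where the only later
// mention of `src` is a single destroy. The copy and that destroy are removed
// and later mentions of `dst` are renamed to `src`.
Body forwardCopies(Body body) {
    const std::size_t before = body.size();
    std::size_t i = 0;
    while (i < body.size()) {
        if (body[i].kind != Kind::Copy) { ++i; continue; }
        const int dst = body[i].value;
        const int src = body[i].source;

        std::size_t destroyAt = body.size();  // body.size() means "not found"
        bool forwardable = true;
        for (std::size_t j = i + 1; j < body.size(); ++j) {
            if (!mentions(body[j], src)) continue;
            if (body[j].kind == Kind::Destroy && destroyAt == body.size())
                destroyAt = j;
            else
                forwardable = false;  // src is still used after the copy
        }
        if (!forwardable || destroyAt == body.size()) { ++i; continue; }

        for (std::size_t j = i + 1; j < body.size(); ++j) {
            if (body[j].value == dst) body[j].value = src;
            if (body[j].kind == Kind::Copy && body[j].source == dst)
                body[j].source = src;
        }
        // Erase the later node first so the earlier index stays valid.
        body.erase(body.begin() + static_cast<std::ptrdiff_t>(destroyAt));
        body.erase(body.begin() + static_cast<std::ptrdiff_t>(i));
        // Do not advance: a different node now sits at index i.
    }
    assert(body.size() <= before);
    return body;
}
```

In this sketch the pass never has to guess which acquire belongs to which release, because the copy node names its source directly. On a flat sequence of unrelated acquire and release calls, that pairing would have to be inferred.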
