Ola Fosheim Grøstad via llvm-dev
2020-Nov-18 12:08 UTC
[llvm-dev] Should I add intrinsics to write my own automatic reference counting passes?
My experience with LLVM is limited, but I am trying to figure out how to add optimizations for automatic reference counting. The GC documentation mentions that patch-points could be useful, but it does not state how they would be useful. If this is a FAQ, please let me know...

So this is my idea at this point:

The context is a C++-like language with an aggregate type that is always reference counted. The type system differentiates between pointers to objects that are shared between threads and pointers to objects that are not. I also want a pass that turns a shared_ptr into a nonshared_ptr whenever that can be proven safe.

So what I want to do is to wrap up all the relevant "events" as intrinsics and run some simplification passes, then use LLVM's pointer capture/escape analysis to turn shared_ptrs into nonshared_ptrs and to elide nonatomic/atomic acquire/release. So basically, the intrinsics will also serve as the type annotations.

The compilation will then follow this pattern:

1. generate LLVM IR
2. simplification passes
3. pass for turning shared_ptr into nonshared_ptr
4. pass for eliding acquire/release
5. pass that replaces the custom intrinsics with function calls
6. full optimization passes

I am thinking of having the following intrinsics:

ptr = cast_untyped_to_nonshared(ptr) // e.g. used after allocation
ptr = cast_to_shared_irreversible(ptr) // basically a gateway to other threads
nonshared_acquire(ptr)
nonshared_release(ptr)
shared_acquire(ptr)
shared_release(ptr)

I also want weak_ptr at a later stage, but I am leaving it out for now to keep the complexity manageable.

Is this idea completely unreasonable?

Ola.
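As a rough illustration of steps 3 and 4 in the pattern above, here is a minimal sketch on a toy event stream. It does not use the LLVM API: the `Op` enum, the `Event` struct, the `EscapeToThread` marker (standing in for whatever a capture/escape analysis would report) and both functions are assumptions made up for this example.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-ins for the proposed intrinsics. None of these are LLVM APIs.
enum class Op {
    CastToShared,      // cast_to_shared_irreversible(ptr)
    EscapeToThread,    // hypothetical: ptr really becomes visible to another thread
    NonsharedAcquire,
    NonsharedRelease,
    SharedAcquire,
    SharedRelease,
    Use                // any other use of ptr that leaves the count alone
};

struct Event {
    Op op;
    int ptr;  // hypothetical id naming one reference-counted object
};

// Step 3 (toy version): an object that never escapes to another thread does
// not need the shared form, so drop its cast and demote its shared operations.
std::vector<Event> demoteUnescaped(const std::vector<Event>& events) {
    std::set<int> escaped;
    for (const Event& e : events)
        if (e.op == Op::EscapeToThread) escaped.insert(e.ptr);

    std::vector<Event> out;
    for (Event e : events) {
        if (escaped.count(e.ptr) == 0) {
            if (e.op == Op::CastToShared) continue;  // cast is no longer needed
            if (e.op == Op::SharedAcquire) e.op = Op::NonsharedAcquire;
            else if (e.op == Op::SharedRelease) e.op = Op::NonsharedRelease;
        }
        out.push_back(e);
    }
    return out;
}

// True when `r` is the release that undoes acquire `a` on the same object.
bool isMatchingPair(const Event& a, const Event& r) {
    if (a.ptr != r.ptr) return false;
    return (a.op == Op::NonsharedAcquire && r.op == Op::NonsharedRelease) ||
           (a.op == Op::SharedAcquire && r.op == Op::SharedRelease);
}

// Step 4 (toy version): elide an acquire that is directly followed by the
// matching release. Using the output as a stack also removes nested pairs.
// Anything in between (even a plain Use) conservatively blocks the elision.
std::vector<Event> elidePairs(const std::vector<Event>& events) {
    std::vector<Event> out;
    for (const Event& e : events) {
        if (!out.empty() && isMatchingPair(out.back(), e))
            out.pop_back();
        else
            out.push_back(e);
    }
    assert(out.size() <= events.size());
    return out;
}
```

Running `demoteUnescaped` before `elidePairs` mirrors the ordering in the list: demotion only changes which kind of acquire/release is emitted, while elision removes pairs of either kind. A real pass would have to reason about control flow and aliasing, which this linear toy model ignores.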
Florian Hahn via llvm-dev
2020-Nov-18 12:39 UTC
[llvm-dev] Should I add intrinsics to write my own automatic reference counting passes?
Hi,

> On Nov 18, 2020, at 12:08, Ola Fosheim Grøstad via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> [...]
>
> Is this idea completely unreasonable?

LLVM has intrinsics for Objective-C ARC (https://llvm.org/docs/LangRef.html#objective-c-arc-runtime-intrinsics) and I think there are also some optimization passes for those intrinsics. Clang uses those for Objective-C (https://clang.llvm.org/docs/AutomaticReferenceCounting.html#background). I am not really familiar with the details there, but it might be helpful to take a look at the Objective-C support. I am not sure how much the intrinsics are tied to Objective-C and it might even be possible to generalize them for other use cases.

I’m CC’ing some people who are more familiar with the details.

Cheers,
Florian
John McCall via llvm-dev
2020-Nov-19 00:04 UTC
[llvm-dev] Should I add intrinsics to write my own automatic reference counting passes?
On 18 Nov 2020, at 7:39, Florian Hahn wrote:

>> On Nov 18, 2020, at 12:08, Ola Fosheim Grøstad via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> [...]
>>
>> Is this idea completely unreasonable?

The main problem for this sort of optimization is that it is difficult to do on an IR like LLVM’s, where the semantic relationships between values that exist in the program have been lowered into a sequence of primitive operations with no remaining structural relationship.

What you really want is an IR that preserves those relationships of value definition and use, allowing you to say e.g. that one value is a copy of another, that ownership of a value is passed into something, that a value is used for a certain duration but then is no longer used, and so on. With this sort of representation, the optimization turns into fairly straightforward value-forwarding and lifetime manipulations.

Dealing with unrelated operations and retroactively attempting to infer relationships, as LLVM IR must, turns it into a fundamentally difficult analysis that often relies on semantic knowledge that isn’t expressed in the IR, so that you’re actually reasoning about what happens under a “well-behaved” frontend.

John.
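To make the value-forwarding point concrete, here is a toy sketch of what such a pass might look like once "is a copy of" and "is destroyed here" are explicit in the representation. The three instruction kinds, the `Inst` struct and `forwardCopies` are invented for illustration. They are not part of LLVM or of any existing IR, and the model assumes straight-line code in which every value id is defined once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ownership-aware instructions.
enum class Kind { Copy, Use, Destroy };

struct Inst {
    Kind kind;
    int value;   // Copy: the value being defined; Use/Destroy: the operand
    int source;  // Copy: the value being copied; otherwise unused (-1)
};

// Forward `b = copy a` when the only later mention of `a` is `destroy a`:
// the copy and that destroy cancel out, and `b` can simply be `a`.
std::vector<Inst> forwardCopies(std::vector<Inst> insts) {
    std::size_t i = 0;
    while (i < insts.size()) {
        if (insts[i].kind != Kind::Copy) { ++i; continue; }
        const int dst = insts[i].value;
        const int src = insts[i].source;

        // Look at every later mention of the copied-from value.
        std::size_t destroyAt = insts.size();
        bool onlyDestroyed = true;
        for (std::size_t j = i + 1; j < insts.size(); ++j) {
            const bool mentions =
                insts[j].value == src ||
                (insts[j].kind == Kind::Copy && insts[j].source == src);
            if (!mentions) continue;
            if (insts[j].kind == Kind::Destroy && destroyAt == insts.size())
                destroyAt = j;
            else
                onlyDestroyed = false;  // src is still needed in its own right
        }
        if (!onlyDestroyed || destroyAt == insts.size()) { ++i; continue; }

        // Rename dst to src in everything after the copy.
        for (std::size_t j = i + 1; j < insts.size(); ++j) {
            if (j == destroyAt) continue;
            if (insts[j].value == dst) insts[j].value = src;
            if (insts[j].kind == Kind::Copy && insts[j].source == dst)
                insts[j].source = src;
        }

        // Erase the later instruction first so the index of the copy stays valid.
        assert(destroyAt > i);
        insts.erase(insts.begin() + static_cast<std::ptrdiff_t>(destroyAt));
        insts.erase(insts.begin() + static_cast<std::ptrdiff_t>(i));
        // Do not advance i: a new instruction now sits at this position.
    }
    return insts;
}
```

In reference-counting terms, the copy corresponds to an acquire and the destroy to a release. Because the representation states directly which value is a copy of which, the pass never has to guess which acquire belongs to which release.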