Lang Hames via llvm-dev
2020-Feb-24 07:46 UTC
[llvm-dev] ORC JIT Weekly #6 -- General initializer support and JITLink optimizations
Hi All,

The general initializer support patch has landed (see 85fb997659b plus follow-up fixes).

Some quick background:

Until now ORC, like MCJIT, has handled static initializer discovery by searching for llvm.global_ctors and llvm.global_dtors arrays in the IR added to the JIT. This approach suffers from several drawbacks:
1) It provides no built-in support for other program representations: object files and custom program representations added to the JIT require manual intervention from the user to run their initializers.
2) It requires naming and promoting the linkage of initializer functions, since they have to be looked up by name to be run.
3) It doesn't handle platform-specific initializers, e.g. Objective-C registration, which are described by globals in specific sections.

The general initializer support patch changes how initialization is handled. All MaterializationUnits, regardless of what kind of program representation they wrap (IR, object files, ASTs, etc.), can now declare an optional "initializer symbol". Instances of the new Platform class (see include/llvm/ExecutionEngine/Orc/Core.h) are notified whenever MaterializationUnits are added to a JITDylib, and can record the presence of any declared initializers. By issuing lookups for initializers, the Platform can force their materialization and arrange for them to be run in a platform-specific way (see https://reviews.llvm.org/D74300 for more discussion on this).

This new system is flexible enough to permit two very different platform implementations for LLJIT that are already available in tree: GenericLLVMIRPlatform and MachOPlatform. The former essentially re-implements the existing llvm.global_ctors scanning scheme: it promotes functions that appear in the llvm.global_ctors array, then looks them up by name and executes them when requested. MachOPlatform, on the other hand, implements a scheme that mimics the behavior of the Darwin dynamic loader, dyld: by installing an ObjectLinkingLayer::Plugin, the MachOPlatform can scan all objects as they are materialized to discover known special sections (e.g. __mod_init_func, __objc_classlist, and __objc_selref), then handle them according to the usual platform rules (__mod_init_func pointers are executed, Objective-C classes and selectors are registered with the Objective-C runtime).

While this system is still very new, it is far enough along that the lli command line tool, when run with the -jit-mode=orc-lazy option, can now execute IR compiled from simple Objective-C and Swift programs on Darwin.

Also of interest this week: JITLink has a new "GOT and Stub bypass" optimization for x86-64. When linking position-independent code, JITLink must conservatively build global offset table entries and stubs to access/call external symbols that may be out of range of the JIT'd code. With this new optimization, these indirect accesses may be bypassed if the JIT'd code ends up being allocated within range of the target. Coupled with a slab allocator for your JIT, this optimization can eliminate a layer of indirection and may improve performance for some use cases. See 27a79b72162.

Just a heads up: I expect next week to be a quiet one, as I'm out on vacation from Wednesday.

-- Lang.
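To make the "within range" condition of the GOT and Stub bypass concrete, here is a minimal sketch. It assumes the usual x86-64 encoding, in which a PC-relative reference holds a signed 32-bit displacement measured from the end of the instruction. The names fitsInPCRel32, pcRel32Delta, chooseAccess and AccessKind are hypothetical and are not part of JITLink; the actual implementation is in 27a79b72162.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not JITLink's API: can a signed 32-bit PC-relative
// displacement reach TargetAddr from NextInstrAddr (the address of the byte
// following the referencing instruction)?
bool fitsInPCRel32(std::uint64_t NextInstrAddr, std::uint64_t TargetAddr) {
  if (TargetAddr >= NextInstrAddr)
    return TargetAddr - NextInstrAddr <= 0x7fffffffULL; // forward: <= INT32_MAX
  return NextInstrAddr - TargetAddr <= 0x80000000ULL;   // backward: >= INT32_MIN
}

// The displacement a direct reference would encode. Only valid in range.
std::int32_t pcRel32Delta(std::uint64_t NextInstrAddr, std::uint64_t TargetAddr) {
  assert(fitsInPCRel32(NextInstrAddr, TargetAddr) && "target out of range");
  if (TargetAddr >= NextInstrAddr)
    return static_cast<std::int32_t>(TargetAddr - NextInstrAddr);
  return static_cast<std::int32_t>(
      -static_cast<std::int64_t>(NextInstrAddr - TargetAddr));
}

enum class AccessKind { Direct, ViaGOTOrStub };

// In range: the indirection can be bypassed. Out of range: keep the
// conservatively built GOT entry or stub.
AccessKind chooseAccess(std::uint64_t NextInstrAddr, std::uint64_t TargetAddr) {
  return fitsInPCRel32(NextInstrAddr, TargetAddr) ? AccessKind::Direct
                                                  : AccessKind::ViaGOTOrStub;
}
```

This is also why a slab allocator helps: if JIT'd code and its targets are carved out of one contiguous region smaller than 2 GiB, the check above succeeds for every reference between them.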
Gaier, Bjoern via llvm-dev
2020-Feb-24 07:58 UTC
[llvm-dev] ORC JIT Weekly #6 -- General initializer support and JITLink optimizations
Hello Lang,

This sounds interesting, even though I’m not able to understand everything.
I wonder, will your changes be in the upcoming LLVM 10 release? Also, do your changes allow getting the actual addresses of the constructors and destructors, or are they fully managed by ORC? I hope this makes sense!

Kind greetings and wishing you a nice vacation,
Björn

Registered as a GmbH in the commercial register of Bad Homburg v.d.H., HRB 9816, VAT ID no. DE 114 165 789. Managing directors: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima, Junichi Tajika
Benoit Belley via llvm-dev
2020-Feb-24 13:10 UTC
[llvm-dev] ORC JIT Weekly #6 -- General initializer support and JITLink optimizations
Hi Lang,

I really like the direction that this is taking.

Our own ORC JIT workflow requires us to be able to JIT execute code found in llvm::ObjectFile's without having access to the LLVM IR. (In summary, we JIT compile a whole bunch of LLVM modules and only keep the object files around.) The ability to invoke the global constructor and destructor functions contained within those object files is paramount to us, as some of our object files might embed arbitrary fragments of LLVM IR generated by the clang backend.

Our internal solution is similar to the MachOPlatform one described below. We also have a similar solution for ELF object files (scanning the .init_array and .fini_array object sections) that we are using on both Linux and Windows x64 platforms. To be complete, the Windows 64 C++ ABI also requires us to intercept calls to atexit() to ensure that the destructors of global C++ objects are correctly invoked.

If there's any interest, we would be happy to contribute our code for this.

Cheers,
Benoit
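As an illustration of the ELF scheme mentioned above, here is a minimal sketch of what happens once the .init_array and .fini_array contents have been found. It is not the code referred to in this message. FnPtrSection, runInitArray, runFiniArray and the sample functions are hypothetical names, and the section contents are assumed to be already decoded into function pointers. The ordering follows the usual ELF convention: .init_array entries run front to back, .fini_array entries back to front.

```cpp
#include <cassert>
#include <string>
#include <vector>

using InitFn = void (*)();

// Hypothetical stand-in for the decoded contents of a section that holds an
// array of function pointers (.init_array or .fini_array).
struct FnPtrSection {
  std::vector<InitFn> Entries;
};

// Run initializers in the order they appear in the section.
void runInitArray(const FnPtrSection &S) {
  for (InitFn F : S.Entries) {
    assert(F && "null entry in initializer array");
    if (F)
      F();
  }
}

// Run finalizers in reverse order of appearance.
void runFiniArray(const FnPtrSection &S) {
  for (auto It = S.Entries.rbegin(); It != S.Entries.rend(); ++It) {
    assert(*It && "null entry in finalizer array");
    if (*It)
      (*It)();
  }
}

// Sample functions that record the order in which they were called.
std::vector<std::string> &eventLog() {
  static std::vector<std::string> Log;
  return Log;
}
void ctorA() { eventLog().push_back("ctorA"); }
void ctorB() { eventLog().push_back("ctorB"); }
void dtorA() { eventLog().push_back("dtorA"); }
void dtorB() { eventLog().push_back("dtorB"); }
```

Running runInitArray on {ctorA, ctorB} and then runFiniArray on {dtorA, dtorB} records ctorA, ctorB, dtorB, dtorA. A real scanner would also have to locate the sections in the object file and apply relocations before any of these pointers are usable.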
Lang Hames via llvm-dev
2020-Feb-25 00:34 UTC
[llvm-dev] ORC JIT Weekly #6 -- General initializer support and JITLink optimizations
Hi Bjoern,

> I wonder, will your changes be in the upcoming LLVM10 release?

It’s too late to cherry-pick these changes for LLVM 10, but they will be in LLVM 11.

> Also, do your changes allow to get the actual addresses of the constructor and destructors - or are they fully managed by the ORC? I hope this makes sense!

In general you will want to let ORC take care of running the initializers for you, since the problem is non-trivial, but if you want to get the raw initializer data it is possible to do so with some legwork.

Background: I simplified the description of what was included in the initializer patch to keep it short, but there are actually two major parts to the GenericIR and MachO init schemes that are now available in LLJIT. The first part is the Platform subclass (see lib/ExecutionEngine/Orc/LLJIT.cpp for the GenericLLVMIRPlatform class, and include/llvm/ExecutionEngine/Orc/MachOPlatform.h for the MachOPlatform class). The second part is the LLJIT::PlatformSupport subclass (both PlatformSupport subclasses are in lib/ExecutionEngine/Orc/LLJIT.cpp). The Platform subclass aggregates and exposes the raw initialization data, and the PlatformSupport subclass hides the handling of that initialization data behind a common interface for LLJIT: the LLJIT::PlatformSupport::initialize method.

If you want to get the raw data yourself, you could write your own PlatformSupport subclass to do so by following the examples that are available in-tree. However, note that your PlatformSupport class(es) will be platform-specific, as will the initializer data. In general you will have to deal with things other than constructors and destructors, e.g. Objective-C metadata on Darwin.

Regards,
Lang.
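The two-part split described in this message can be illustrated with a toy model. This is a sketch under simplifying assumptions and does not use the ORC classes: ToyMaterializationUnit, ToyPlatform and ToyPlatformSupport are hypothetical names, JITDylibs and symbols are plain strings, and "lookup" is a callback supplied by the caller. The real interfaces are in the files named above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in: a unit of code with an optional initializer symbol.
struct ToyMaterializationUnit {
  std::string Name;
  std::string InitSymbol; // empty means "declares no initializer"
};

// Part one: notified as units are added, aggregates the raw initializer data.
class ToyPlatform {
public:
  void notifyAdding(const std::string &Dylib, const ToyMaterializationUnit &MU) {
    if (!MU.InitSymbol.empty())
      Pending[Dylib].push_back(MU.InitSymbol);
  }

  // Exposes the raw data: returns and clears the pending initializer symbols.
  std::vector<std::string> takeInitSymbols(const std::string &Dylib) {
    auto It = Pending.find(Dylib);
    if (It == Pending.end())
      return {};
    std::vector<std::string> Result = std::move(It->second);
    Pending.erase(It);
    return Result;
  }

private:
  std::map<std::string, std::vector<std::string>> Pending;
};

// Part two: hides the handling of that data behind one initialize() call.
class ToyPlatformSupport {
public:
  using Lookup = std::function<std::function<void()>(const std::string &)>;

  ToyPlatformSupport(ToyPlatform &P, Lookup L) : P(P), LookupFn(std::move(L)) {
    assert(LookupFn && "a lookup callback is required");
  }

  // Looks up each recorded initializer (in the real system the lookup is what
  // forces materialization) and runs it. Returns how many initializers ran.
  std::size_t initialize(const std::string &Dylib) {
    std::size_t Count = 0;
    std::vector<std::string> Symbols = P.takeInitSymbols(Dylib);
    for (const std::string &Sym : Symbols) {
      if (std::function<void()> Fn = LookupFn(Sym)) {
        Fn();
        ++Count;
      }
    }
    return Count;
  }

private:
  ToyPlatform &P;
  Lookup LookupFn;
};
```

A client that wants the raw data would call the equivalent of takeInitSymbols directly instead of initialize, which is the role a custom PlatformSupport subclass would play.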
Lang Hames via llvm-dev
2020-Feb-25 00:57 UTC
[llvm-dev] ORC JIT Weekly #6 -- General initializer support and JITLink optimizations
Hi Benoit,

> I really like the direction that this is taking.

Thanks!

> Our internal solution is similar to the MachOPlatform one described below. We also have a similar solution for ELF object files (scanning the .init_array and .fini_array object sections) that we are using on both Linux and Windows x64 platforms. To be complete, the Windows 64 C++ ABI also requires us to intercept calls to atexit() to ensure that the destructors of global C++ objects are correctly invoked.

The GenericIR platform and MachO platform already interpose __cxa_atexit to ensure that static destructors are run when the JITDylib is closed (see the code scattered through lib/ExecutionEngine/Orc/LLJIT.cpp). We should include interposition of regular atexit calls too. I’m happy to review patches for that, or you can file a bug and assign it to me and I will try to get to it next week. Ideally I would like to get the in-tree support to a state where you could rely on it, rather than having to maintain a custom implementation.

> If there's any interest, we would be happy to contribute our code for this.

I am very interested, and I think many other members of the community would be too. :) Please let me know how I can help, and assign any relevant reviews to me. As noted, I’ll be away later this week, but I’ll be back Monday of next week and happy to answer questions.

Regards,
Lang.
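The idea behind interposing __cxa_atexit can be sketched as follows. This is a simplified model, not the code in LLJIT.cpp: ToyAtExitRegistry, Tracked and destroyTracked are hypothetical names, and one registry is assumed to exist per JITDylib. JIT'd calls to __cxa_atexit would be redirected to registerAtExit, so destructors are recorded rather than handed to the host process, and runAtExits is called when the JITDylib is closed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical per-JITDylib record of interposed at-exit registrations.
class ToyAtExitRegistry {
public:
  using DtorFn = void (*)(void *);

  // Mirrors the first two parameters of __cxa_atexit (function and argument).
  // Returns 0 on success, as __cxa_atexit does.
  int registerAtExit(DtorFn Fn, void *Arg) {
    assert(Fn && "null destructor function");
    Entries.emplace_back(Fn, Arg);
    return 0;
  }

  // Runs the recorded destructors in reverse order of registration. Each
  // entry is popped before it is called, so a destructor that registers
  // further entries is handled correctly.
  void runAtExits() {
    while (!Entries.empty()) {
      std::pair<DtorFn, void *> E = Entries.back();
      Entries.pop_back();
      E.first(E.second);
    }
  }

  std::size_t pending() const { return Entries.size(); }

private:
  std::vector<std::pair<DtorFn, void *>> Entries;
};

// Sample "object" whose destruction is recorded in a log.
struct Tracked {
  int Id;
  std::vector<int> *Log;
};

void destroyTracked(void *P) {
  Tracked *T = static_cast<Tracked *>(P);
  T->Log->push_back(T->Id);
}
```

Plain atexit takes a void (*)() with no argument, so interposing it in the same registry would need a second entry kind or a small adapter.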