Juergen Ributzka
2015-Jul-10 16:47 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
Sounds good. I will add that to the StackMap documentation when I update it for v2.

-Juergen

> On Jul 10, 2015, at 9:40 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> No, but I've noticed that it is true in practice, and so I think that we should say something about it one way or another. Especially since, in switching to a fixed-size record format, binary searching now becomes relatively easy/fast. Maybe it would be a useful guarantee?
>
> Thanks again,
> Hal
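Hal's remark about binary searching fixed-size records can be illustrated with a minimal sketch. It assumes the records are sorted by some key; the quoted text does not say which ordering property is meant, so the `Record` struct, its `Key` and `Payload` fields, and `findRecord` are all hypothetical and do not reproduce the StackMap v2 layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size record. The field names are illustrative only
// and are not the actual StackMap v2 record layout.
struct Record {
  uint64_t Key;     // assumed sort key, e.g. an offset or an ID
  uint32_t Payload; // stand-in for the rest of the record
};

// Look up the record with the given key in O(log n).
// Precondition (the guarantee under discussion): Records is sorted by Key.
// Returns nullptr when no record has that key.
const Record *findRecord(const std::vector<Record> &Records, uint64_t Key) {
  assert(std::is_sorted(
      Records.begin(), Records.end(),
      [](const Record &A, const Record &B) { return A.Key < B.Key; }));
  auto It = std::lower_bound(
      Records.begin(), Records.end(), Key,
      [](const Record &R, uint64_t K) { return R.Key < K; });
  if (It == Records.end() || It->Key != Key)
    return nullptr;
  return &*It;
}
```

Because every record has the same size, the same search works directly on a mapped section without first building an index, which is why a documented ordering guarantee would be useful to consumers.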
Marius Wachtler
2015-Jul-12 21:12 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
Hi,

Several weeks ago I submitted a patch, D9176, to extend stackmaps to support symbolic constants. I think this is a good time to clean up this patch by proposing to add another stackmap location type for symbolic constants to the new v2 StackMap format.

The idea is that this new type behaves like the ConstantIndex type, with the only difference that the constant value specified at the index will be zero when stored on disk and will get filled in by the runtime linker with the actual value of the symbol (the value returned from RTDyldMemoryManager::getSymbolAddress).

Please let me know what you think.

My use case is (this is a copy of the reply I just sent to the review request):

We (the Pyston project) use patchpoints to implement inline caches and for deoptimization when using the LLVM tier. For the deoptimization use case we add to the patchpoint live args all variables that we need to continue execution in a lower, generic tier (e.g. the interpreter). A lot of our generated IR values were direct inttoptr casts because we often generate instances of our objects outside of LLVM. For example, we may generate instances of Python objects when we set up the internal representation of a Python function, which we then share between the interpreter and the LLVM tier. That's why we had a lot of inttoptr casts in our generated IR; there are also additional args, like pointers to the AST nodes, which we will need for deopt.

Deoptimizations should happen only very rarely, which means that we don't want to actually load all the constants we specified as live values inside the patchpoint into registers/stack slots. Currently LLVM will put all arguments which are constants inside the stackmap constant table in order to not have to generate code in front of the patchpoint to put all these constant values into registers/stack slots. This is exactly how I would expect the behavior to be and how I need it.
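The constant-table behavior described above can be sketched as follows. This is a simplified illustration, not the real StackMap encoding: `LocKind`, `Location`, and `readLiveValue` are hypothetical names, and only two location kinds are modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified location kinds; not the real StackMap encoding.
enum class LocKind { Register, ConstantIndex };

struct Location {
  LocKind Kind;
  uint32_t Value; // register number, or index into the constant table
};

// Recover one live value when a deoptimization actually happens.
// Registers is a snapshot of the register file captured by the runtime.
uint64_t readLiveValue(const Location &Loc,
                       const std::vector<uint64_t> &Registers,
                       const std::vector<uint64_t> &ConstantTable) {
  switch (Loc.Kind) {
  case LocKind::Register:
    assert(Loc.Value < Registers.size());
    return Registers[Loc.Value];
  case LocKind::ConstantIndex:
    // No machine code ran before the patchpoint to materialize this value;
    // it exists only as an entry in the constant table.
    assert(Loc.Value < ConstantTable.size());
    return ConstantTable[Loc.Value];
  }
  return 0; // not reached for valid kinds
}
```

The cost of a constant live value is therefore paid only on the rare deoptimization path, when the runtime reads the table, rather than on every execution of the compiled code.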
But then I added a new feature: to speed up JIT time when we encounter the same function on the next application start, I implemented an object cache for the LLVM-generated code. This means I need to be able to relocate all these embedded pointers, because the memory layout will not be the same. I chose to solve this by emitting a special unique symbol name for all cases where I previously embedded the direct pointer value. These symbol names are deterministic: on the next startup, when encountering the same function, I can load it directly from the object cache and just have to return the real pointer values from the overridden RTDyldMemoryManager::getSymbolAddress() function.

The problem I encountered, and which this patch tries to solve, is that LLVM will currently emit code which loads all these symbolic constants into registers before the patchpoint. With this patch we will stop emitting these machine instructions and instead emit constant table entries inside the stackmap.

I hope this helps explain what I have done (even if my English isn't good). I have been using this solution successfully for several weeks now and it gave us a huge speedup.
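A minimal sketch of the proposed symbolic-constant resolution, written without LLVM so that it compiles on its own. The `SymbolicConstant` struct, the `SymbolResolver` callback, and `resolveSymbolicConstants` are assumptions made for illustration; in the proposal the lookup role is played by RTDyldMemoryManager::getSymbolAddress, and the actual on-disk encoding is whatever the patch and the v2 format end up defining.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical on-disk description of one symbolic constant: the table slot
// is stored as zero and is paired with the symbol name that will fill it.
struct SymbolicConstant {
  uint32_t TableIndex;
  std::string Symbol;
};

// Stand-in for the runtime's symbol lookup (an assumption for this sketch).
// A result of 0 is treated here as "symbol not found".
using SymbolResolver = std::function<uint64_t(const std::string &)>;

// Patch every symbolic slot of a freshly loaded constant table.
// Returns false if an index is out of range or a symbol does not resolve.
bool resolveSymbolicConstants(std::vector<uint64_t> &ConstantTable,
                              const std::vector<SymbolicConstant> &Symbols,
                              const SymbolResolver &Resolve) {
  assert(Resolve && "resolver must be callable");
  for (const SymbolicConstant &S : Symbols) {
    if (S.TableIndex >= ConstantTable.size())
      return false;
    uint64_t Addr = Resolve(S.Symbol);
    if (Addr == 0)
      return false;
    ConstantTable[S.TableIndex] = Addr; // slot was zero on disk
  }
  return true;
}
```

Since the symbol names are deterministic, a cached object file can be reloaded in a later process and the same names resolved to that process's pointer values, with no code before the patchpoint needing to load them.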
Philip Reames
2015-Jul-15 05:25 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
On 07/12/2015 02:12 PM, Marius Wachtler wrote:
> I submitted several weeks ago a patch D9176 to extend stackmaps to support symbolic constants. I think this is a good time to clean up this patch by proposing to add another stackmap location type for symbolic constants to the new v2 StackMap format.

Yep, great timing.

> [...]
> Hope this helps understanding what I have done (even if my english isn't good), I successfully use this solution now since several weeks and it gave us a huge speedup.

Thanks for the context. The explanation made this make a lot more sense to me than the original patch had.

I'm generally in support of this feature being added, subject to normal code review. If nothing else, we should think about the format requirements even if the implementation isn't quite in yet. It's definitely something which would be good to support at some point.
Philip Reames
2015-Jul-15 05:26 UTC
[LLVMdev] [RFC] New StackMap format proposal (StackMap v2)
+1, sounds entirely reasonable.

On 07/10/2015 09:47 AM, Juergen Ributzka wrote:
> Sounds good. I will add that to the StackMap documentation when I update it for v2.