Soham Sinha via llvm-dev
2018-May-07 18:06 UTC
[llvm-dev] How to add assembly instructions in CodeGen
Hello Dean,

I looked at the XRay instrumentation. That's a nice engineering effort. I am sure you had your own motivation for doing this in CodeGen, just as I do. I don't understand all of your code, but I get the idea that you are adjusting the alignment with explicit bytes and no-op instructions. My problem is closely related to yours: my stack pointer ($rsp) alignment breaks in printf.

Having said that, I am not sure whether I need the engineering effort that you have put in. I am trying to add function calls at some places in the machine code. I followed the X86_64 calling convention to do so: I save (push onto the stack) all the necessary registers (I also tried saving all 16 registers), fill in the 3 arguments in rdi, rsi and rdx, call the desired function, and then pop the registers. Mathematically, saving the 16 registers should not break the alignment of the stack pointer. But when I debug with gdb, I see that the alignment sometimes breaks during the push operations of the 16 registers, and it shows up as broken alignment in the printf function. I am very confused about what can go wrong here. This is why I was trying to rely on LLVM to maintain the alignment.

Interestingly, I check the alignment of the function at the start of runOnMachineFunction and again at the end (after my push, call and pop). The alignment stays the same, 4 (16 bytes). Therefore, I guess the BuildMI function doesn't maintain the alignment and doesn't even report the broken alignment through the alignment variable of MachineFunction. I access the alignment through the getAlignment function. I think BuildMI should either take care of the alignment or at least update the alignment value.

I am afraid that if I follow your path of instrumentation, I might ultimately face the same issue of not being able to maintain the alignment.
Your effort is quite similar to what I am trying to do, but I am doing it in the MachineFunctionPass itself.

It's non-trivial and tedious to change the internals of CodeGen, because the LLVM MC infrastructure is very much intertwined with the assembler. That makes compilation faster but instrumentation tougher. This is why I wrote a MachineFunctionPass, so that my instrumentation stays like a module. I add my MachineFunctionPass at the end of the addPreEmitPass phase of X86.

I wish LLVM provided more modular ways of instrumenting at this level, just as it does at the LLVM IR level.

Regards,
Soham Sinha
PhD Student, Department of Computer Science
Boston University

On Mon, May 7, 2018 at 1:20 AM, Dean Michael Berris <dean.berris at gmail.com> wrote:
> On Sun, May 6, 2018 at 7:26 AM Soham Sinha via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > Hello,
> >
> > I want to add assembly instructions at certain points in a function. This is X86 specific. So I am working in the lib/Target/X86 folder. I create a `MachineFunctionPass` in that folder. I register it in the X86TargetMachine.cpp in addPreEmitPass(). I use BuildMI to insert my own assembly instructions in the MachineFunctionPass. This works and my assembly instructions are inserted at desired places. However, this breaks the alignment. So when I run the generated code, I get segmentation fault (precisely in printf with XMM registers). Where should I add my pass?
>
> It sounds like you're running into stack alignment issues. If you're adding data to the stack, you may need to work a little harder with maintaining the state of the stack. This is not trivial to do especially if you're emitting the assembly by the time you're at a MachineFunctionPass (because register spilling and/or stack alignment information would have already been done by the time you're in machine instruction lowering). What you may need to do here is to either:
>
> - hook into the preamble and stack re-alignment code specifically in X86 that would look at information from your pass. This is not trivial and I don't recommend going down this path (I tried, but I lost the patience to do it properly).
>
> - when emitting the assembly instructions that involve pushing/popping from the stack, that you're keeping track of the alignment of the stack variables. This is what we do with XRay, when we're lowering the custom event sleds.
>
> - use pseudo-instructions and preserving those until lowering, where the lowering
>
> > My pass depends on the MachineBasicBlock information as well. Therefore, I cannot add my pass too early in LLVM IR. What is the proper pass to add my custom MachineFunctionPass? I tried addPreRegAlloc, but it failed due to insufficient register allocation error or something on that line.
> >
> > Can anybody please help me write a MachineFunctionPass where I can insert assembly instruction without breaking the alignment? I am doing this for X86_64.
>
> You can look at the XRay lowering for the PATCHABLE_EVENT_CALL lowering in X86AsmPrinter as a guide for the lowering, but you might also want to see how we're inserting these pseudo-instructions from the
>
> I don't remember having to specify where the pass is defined, since it's already in the assembly printing. So you might consider inserting these pseudo-instructions at the MachineFunctionPass, which gets lowered appropriately in the assembly printer. Unfortunately I don't think there's a generic way of doing this (yet) with the X86 back-end. There might be a good case for making this easier, but right now these kinds of things haven't been too important to fix yet.
>
> Hope this helps!
> --
> Dean
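The push arithmetic in the message above can be checked with a small model. The sketch below is not LLVM code: it treats rsp as a plain integer, and the names afterPushes, alignedForCall and paddingBeforeCall are made up for illustration. It assumes the System V x86-64 rule that rsp must be a multiple of 16 at a call instruction. Sixteen 8-byte pushes move rsp by 128 bytes, so they preserve whatever rsp mod 16 was at the insertion point; if rsp was already off by 8 there, the inserted call is off by 8 as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: the stack pointer is just an integer.
constexpr std::uint64_t kSlot = 8;         // one 64-bit push moves rsp by 8 bytes
constexpr std::uint64_t kStackAlign = 16;  // assumed SysV x86-64 alignment at a call

// Value of rsp after n pushes (the stack grows toward lower addresses).
constexpr std::uint64_t afterPushes(std::uint64_t rsp, unsigned n) {
  return rsp - kSlot * n;
}

// True if a call instruction executed with this rsp meets the alignment rule.
constexpr bool alignedForCall(std::uint64_t rsp) {
  return rsp % kStackAlign == 0;
}

// Extra bytes to subtract from rsp after n pushes so that the call is aligned.
constexpr std::uint64_t paddingBeforeCall(std::uint64_t rsp, unsigned n) {
  return afterPushes(rsp, n) % kStackAlign;
}

// 16 pushes keep an aligned rsp aligned ...
static_assert(alignedForCall(afterPushes(0x1000u, 16)), "aligned stays aligned");
// ... but they also keep a misaligned rsp misaligned.
static_assert(!alignedForCall(afterPushes(0x1008u, 16)), "misaligned stays misaligned");
// In that case 8 bytes of padding are needed before the call.
static_assert(paddingBeforeCall(0x1008u, 16) == 8, "pad by 8");
```

In this model the number of pushes matters only through its parity, and the starting value of rsp matters just as much. A pass that runs after frame lowering therefore has to know (or compute at run time) the stack pointer's state at each insertion point before it can decide whether padding is needed.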
Dean Michael Berris via llvm-dev
2018-May-08 00:38 UTC
[llvm-dev] How to add assembly instructions in CodeGen
On Tue, May 8, 2018 at 4:06 AM Soham Sinha <soham1 at bu.edu> wrote:
> Hello Dean,
>
> I looked at the XRay Instrumentation. That's a nice engineering effort. I am sure you had your motivation to do this in CodeGen just like I wanted to do. I don't understand all of your code but I get the idea that you are adjusting the alignment with explicit bytes and no-op instructions. My problem is also very much related to yours where my stack pointer ($rsp) alignment breaks in printf.
>
> Having said that, I am not sure whether I need the engineering effort that you have pursued. I am trying to add function calls in some places of the machine code. I followed X86_64 calling convention to do so. I saved (pushed into stack) all the necessary registers (also tried saving all the 16 registers) and then filled up 3 arguments in rdi, rsi, rdx and then call the desired function (and then pop the registers). Mathematically, saving the 16 register should not break the alignment of the stack pointer. But when I am trying to debug with gdb, I see that the alignment breaks sometimes during the push operations of 16 registers, and it comes as broken alignment in the printf function. I am very confused what can go wrong here. This is why I was trying to rely on LLVM to maintain the alignment.
>
> Interestingly, at the start of the runOnMachineFunction, I check the alignment of the function and also at the end of the runOnMachineFunction (after my push, call function and pop). The alignment stays same as 4 (16 bytes). Therefore, I guess, the BuildMI function doesn't maintain the alignment and doesn't even report the broken alignment through the alignment variable of MachineFunction. I access the alignment through the function, getAlignment. I think BuildMI should have cared about alignment or at least update the alignment value.

IIRC, getAlignment() tells you the function's *code* alignment, not whether the stack is aligned to a certain boundary at a given point.
I don't know whether that information is maintained per MachineBasicBlock, because the decision on whether to spill variables onto the stack is done on a per-function-call basis -- you may need to look at the way functions are lowered specifically in X86 to see the (complicated) logic to figure out whether/how to spill which registers onto the stack and how to lay out the stack.

To address this partially, we not only insert the custom event pseudo-instruction, but we dispatch to a trampoline that's defined in compiler-rt -- that code will maintain the stack alignment before making a function call. It saves all the relevant registers first, aligns the stack, then calls the function -- upon return we restore the registers from the stack. Essentially we're doing a context-switch, which might be what you're looking to do as well. That code is in compiler-rt hand-written as x86_64 assembly.

See https://github.com/llvm-mirror/compiler-rt/blob/master/lib/xray/xray_trampoline_x86_64.S#L224 for some inspiration.

The custom event instrumentation points just call into the trampoline, setting up the arguments on the spot. We've had to do some gymnastics to make that happen all the way up to the IR -- i.e. we insert the instrumentation as calls to LLVM intrinsics at the IR, and preserve those all the way down to the codegen. Doing it another way seemed much too hard, as you may be finding out. :(

> I am afraid if I follow your path of instrumentation, again I might ultimately face the same issue where I could not maintain the alignment. Your effort is quite similar to what I am trying to do, but I am just doing it in the MachineFunctionPass itself.
>
> It's very non-trivial and tedious to change the internals of CodeGen because the LLVM MC infrastructure is very much intertwined with the Assembler. That makes compilation faster but instrumentation tougher. This is why I wrote a MachineFunctionPass so that my instrumentation stays like a module.
> I add my MachineFunctionPass at the end of addPreEmitPass phase of X86.
>
> I wish LLVM provided more modular ways of instrumentation just like it provides similar instrumentation in the LLVM IR level.

I have the same wish -- it'd be great if we could move the XRay instrumentation to normal MachineFunctionPass implementations.

Just a thought -- have you considered using XRay instrumentation as a framework instead to accomplish what you're trying to do? I mean, instead of implementing your own pass?

--
Dean
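The "save the registers, align the stack, call, restore" sequence described in the message above can be modeled in the same spirit. The sketch below is not the XRay trampoline and makes no claim about how that file does it; it only illustrates one common way to realign when the incoming alignment is unknown: remember the original rsp, round it down to a multiple of 16, and put the original value back after the call. Realigned and realignForCall are hypothetical names, and rsp is again modeled as a plain integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of dynamic stack realignment around an inserted call.
// In assembly this corresponds roughly to the following, assuming rbx has
// already been saved by the instrumentation:
//   mov %rsp, %rbx     remember the original stack pointer
//   and $-16, %rsp     round down to a 16-byte boundary
//   call target
//   mov %rbx, %rsp     restore the original stack pointer
struct Realigned {
  std::uint64_t saved;    // value to put back into rsp after the call
  std::uint64_t aligned;  // value of rsp at the call instruction
};

constexpr Realigned realignForCall(std::uint64_t rsp) {
  // Clearing the low four bits rounds down, which is the safe direction on a
  // downward-growing stack: nothing above the old rsp is overwritten.
  return Realigned{rsp, rsp & ~std::uint64_t{15}};
}

static_assert(realignForCall(0x0F88u).aligned == 0x0F80u, "rounds down by 8");
static_assert(realignForCall(0x0F80u).aligned == 0x0F80u, "aligned rsp is unchanged");
static_assert(realignForCall(0x0F88u).saved == 0x0F88u, "original rsp is kept");
```

Compared with a fixed 8-byte adjustment, masking works for either incoming state, at the cost of tying up a register (or a stack slot) to hold the original rsp across the call.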
Soham Sinha via llvm-dev
2018-May-09 13:54 UTC
[llvm-dev] How to add assembly instructions in CodeGen
Hi Dean,

I looked at XRay. I had also been thinking along similar lines: adding the assembly instructions as auxiliary template code and jumping there. However, that may still misalign the stack. I have to think about it. But your XRay code does give me the courage to think about this seriously. Thank you for your help.

I also figured out that we can access certain CodeGen features right from the IR level, as you explained when describing your tussle of dealing with IR and CodeGen together. Hopefully I can work out a convenient way.

Regards,
Soham Sinha
PhD Student, Department of Computer Science
Boston University