K Jelesnianski via llvm-dev
2018-Jun-24 00:28 UTC
[llvm-dev] MachineFunction Instructions Pass using Segment Registers
Dear All,

Currently I am trying to inject custom x86-64 assembly into a function's entry basic block. More specifically, I am trying to build assembly in a machine function pass from scratch.

While the dumped machine function instruction info displays that %gs will be used, when I run objdump -d on my executable I see that %gs has been replaced by %ebp. Why is this happening?

I know it probably has something to do with me not specifying operands properly, but I cannot find enough documentation on this besides looking through code comments such as X86BaseInfo.cpp. I feel there isn't enough for me to be able to connect the dots.

Below I have sample code: %gs holds a base address to a memory location where I am trying to store information. I am trying to update the %gs register pointer location before saving more values, etc.

LLVM C++ code, machine function pass:

    MachineInstrBuilder sss = BuildMI(MBB, MBB.begin(), DL,
                                      TII->get(X86::SUB32ri), X86::GS)
                                  .addReg(X86::GS)
                                  .addImm(0x8);

Machine function pass dump:

    %gs = SUB32ri %gs, 8, implicit-def %eflags

objdump -d assembly from the executable:

    400510: 81 ed 04 00 00 00    sub $0x8,%ebp

TLDR: I am trying to create custom assembly via BuildMI() and manipulate segment registers via a MachineFunctionPass.

I have looked at LLVM's safestack implementation, but it takes a fairly complicated hybrid approach, combining an IR function pass with backend support. I would like to stay with a single machine function pass.

Believe me, I would do this at the IR level if I didn't need to specifically use the segment registers.

Thanks for the help in advance!

Sincerely,

Christopher Jelesnianski
Graduate Research Assistant
Virginia Tech
Craig Topper via llvm-dev
2018-Jun-24 00:36 UTC
[llvm-dev] MachineFunction Instructions Pass using Segment Registers
The SUB32ri instruction can't operate on segment registers. It operates on EAX/EBX/EDX/ECX/EBP, etc. When it gets encoded, only 3 or 4 bits of the register value make it into the binary encoding. Objdump just extracts those 3 or 4 bits back out and prints whichever of the EAX/EBX/EDX/ECX/EBP registers those bits correspond to.

~Craig
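To make the encoding point concrete, here is a minimal standalone sketch (not LLVM code) of how the ModRM byte for a register-direct `SUB r/m32, imm32` (opcode 0x81 with /5 in the reg field) is formed. The helper names `modrmDirect` and `gpr32NameFromRm` are hypothetical and exist only for this illustration. The register numbers are the standard x86 hardware encodings, in which GS and EBP both happen to be 5, which would explain why the `81 ed` bytes in the objdump output above decode as %ebp.

```cpp
#include <cassert>
#include <cstdint>

// Standard x86 hardware encodings. General-purpose and segment registers are
// numbered independently, so the same 3-bit number names a different register
// depending on which instruction consumes it.
enum Gpr32 : std::uint8_t { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };
enum SegReg : std::uint8_t { ES = 0, CS = 1, SS = 2, DS = 3, FS = 4, GS = 5 };

// Hypothetical helper: ModRM byte for a register-direct operand.
// Layout is mod (2 bits, 0b11 here) | reg (3 bits) | r/m (3 bits).
constexpr std::uint8_t modrmDirect(std::uint8_t regField, std::uint8_t rmField) {
  return static_cast<std::uint8_t>(0xC0 | ((regField & 7) << 3) | (rmField & 7));
}

// "SUB r/m32, imm32" is opcode 0x81 with opcode extension /5 in the reg field.
constexpr std::uint8_t kSubOpcode = 0x81;
constexpr std::uint8_t kSubExt = 5;

// Hypothetical helper: what a disassembler does with the r/m field of a
// register-direct ModRM byte for this instruction. It can only ever produce
// a general-purpose register name.
inline const char *gpr32NameFromRm(std::uint8_t modrm) {
  static const char *const names[8] = {"eax", "ecx", "edx", "ebx",
                                       "esp", "ebp", "esi", "edi"};
  assert((modrm >> 6) == 3 && "only register-direct ModRM is handled here");
  return names[modrm & 7];
}

// Feeding GS (encoding 5) into the r/m field yields the same byte as EBP.
static_assert(modrmDirect(kSubExt, GS) == 0xED, "GS lands on the EBP encoding");
static_assert(modrmDirect(kSubExt, GS) == modrmDirect(kSubExt, EBP), "indistinguishable");
```

Calling `gpr32NameFromRm(modrmDirect(kSubExt, GS))` returns "ebp": once the byte is emitted, nothing in it records that a segment register was requested.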
Craig Topper via llvm-dev
2018-Jun-24 00:45 UTC
[llvm-dev] MachineFunction Instructions Pass using Segment Registers
More specifically, there is no instruction that can add to or subtract from segment registers. They can only be updated by the mov segment register instructions, opcodes 0x8c and 0x8e in x86 assembly.

I suggest you write the text version of the assembly you want to generate and assemble it with llvm-mc. This will tell you if it's even valid. After that you can use -show-inst to print the names of the instructions that X86 uses, which you can give to BuildMI.

~Craig
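As a rough illustration of the two opcodes mentioned above, the sketch below builds the opcode and ModRM bytes for the register-direct forms: 0x8C moves a segment register into a general-purpose register, and 0x8E moves a general-purpose register into a segment register. This is a standalone sketch, not LLVM code. The function names are hypothetical, and any operand-size or REX prefix an assembler might add is deliberately left out, so llvm-mc remains the authority on the exact bytes and on the LLVM instruction names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Standard x86 hardware encodings for the registers used in this sketch.
enum Gpr : std::uint8_t { AX = 0, CX = 1, DX = 2, BX = 3, SP = 4, BP = 5, SI = 6, DI = 7 };
enum SegReg : std::uint8_t { ES = 0, CS = 1, SS = 2, DS = 3, FS = 4, GS = 5 };

// Register-direct ModRM byte: mod = 0b11, then the reg and r/m fields.
constexpr std::uint8_t modrmDirect(std::uint8_t regField, std::uint8_t rmField) {
  return static_cast<std::uint8_t>(0xC0 | ((regField & 7) << 3) | (rmField & 7));
}

// Hypothetical helper: 0x8C /r, "MOV r/m, Sreg". Reads a segment register.
// The segment register goes in the reg field, the destination in r/m.
inline std::array<std::uint8_t, 2> encodeMovFromSeg(SegReg src, Gpr dst) {
  return {{0x8C, modrmDirect(src, dst)}};
}

// Hypothetical helper: 0x8E /r, "MOV Sreg, r/m". Writes a segment register.
// CS cannot be loaded this way (the CPU raises an invalid-opcode fault).
inline std::array<std::uint8_t, 2> encodeMovToSeg(SegReg dst, Gpr src) {
  assert(dst != CS && "mov to %cs is not a valid instruction");
  return {{0x8E, modrmDirect(dst, src)}};
}
```

For example, `encodeMovToSeg(GS, AX)` yields the bytes 8e e8 and `encodeMovFromSeg(GS, AX)` yields 8c e8. In both cases the segment register occupies the reg field, which is the only place in these encodings where the number 5 means GS rather than a general-purpose register.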