John Criswell
2012-Oct-23 18:58 UTC
[LLVMdev] How to Find Instruction Encoding for a MachineInstr
Dear All,

I'm enhancing a MachineFunctionPass that enforces control-flow integrity. One of the things I want to do is to set the alignment of an instruction (by adding NOPs before it in the MachineBasicBlock or by emitting an alignment directive to the assembler) if it causes a specific sequence of bytes to be generated at a specific alignment. The goal is to ensure that sequences of bytes used to label valid targets of an indirect branch (e.g., a return instruction) do not appear at a given alignment anywhere in a program other than where I inserted them explicitly.

It looks like MachineInstr has a method for finding the length of the instruction's binary encoding, but I didn't see a method for finding the exact bytes that would be emitted from the MachineInstr. Is there a way to do this in the MachineFunctionPass/MachineInstr infrastructure, or do I need to use something like the MC classes?

Thanks in advance for any help provided.

-- John T.
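To make the goal above concrete, here is a minimal sketch of the check being described, assuming the encoded bytes were somehow available. It is not LLVM code: `Bytes`, `labelAppearsAligned`, and `nopsNeeded` are hypothetical names, and the use of the one-byte x86 NOP (0x90) as padding is an assumption for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Returns true if `label` occurs in `code` starting at an offset that is a
// multiple of `alignment`. Hypothetical helper, not part of LLVM.
bool labelAppearsAligned(const Bytes &code, const Bytes &label,
                         std::size_t alignment) {
  if (label.empty() || alignment == 0 || label.size() > code.size())
    return false;
  for (std::size_t off = 0; off + label.size() <= code.size();
       off += alignment) {
    if (std::equal(label.begin(), label.end(), code.begin() + off))
      return true;
  }
  return false;
}

// Returns the smallest number of one-byte NOPs (0x90, an assumption) to put
// in front of `instr` so that `prefix + NOPs + instr` contains no aligned
// copy of `label`, or -1 if no padding below `alignment` bytes works.
int nopsNeeded(const Bytes &prefix, const Bytes &instr, const Bytes &label,
               std::size_t alignment) {
  const uint8_t kNop = 0x90;
  for (std::size_t pad = 0; pad < alignment; ++pad) {
    Bytes candidate = prefix;
    candidate.insert(candidate.end(), pad, kNop);
    candidate.insert(candidate.end(), instr.begin(), instr.end());
    if (!labelAppearsAligned(candidate, label, alignment))
      return static_cast<int>(pad);
  }
  return -1;
}
```

The brute-force search rechecks the whole buffer for each padding amount, which also catches label bytes that straddle the boundary between the previous instruction and the new one. It presumes the exact bytes are known, which is the open question of this thread.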
Craig Topper
2012-Oct-24 00:19 UTC
[LLVMdev] How to Find Instruction Encoding for a MachineInstr
What function provides the encoding length? X86 in particular is so difficult to encode that only the old style JIT and the MC Code Emitter could possibly know how many bytes something takes.

-- ~Craig
John Criswell
2012-Oct-24 00:23 UTC
[LLVMdev] How to Find Instruction Encoding for a MachineInstr
On 10/23/12 7:19 PM, Craig Topper wrote:
> What function provides the encoding length? X86 in particular is so
> difficult to encode that only the old style JIT and the MC Code
> Emitter could possibly know how many bytes something takes.

The getSize() method of MCInstrDesc, which can be fetched from a MachineInstr using the getDesc() method:

http://llvm.org/doxygen/classllvm_1_1MCInstrDesc.html#ae8a17b854d9787d11797d9334a22647d

Does this method not work as advertised in Doxygen?

-- John T.
Joshua Cranmer
2012-Oct-24 01:22 UTC
[LLVMdev] How to Find Instruction Encoding for a MachineInstr
As I recall (I haven't worked this deep with MachineInstrs for close to a year), neither the length nor the exact bytes that will be emitted are necessarily knowable, since some of them depend on information not known until the final assembly emission pass. An example here is the x86 jmp instruction: the choice between short and near jumps (and hence 2 bytes or 5 bytes on x86-64) is not made until the actual conversion to MCInst and after applying all of the fixups, which only happens deep within the bowels of the AsmPrinter pass.

-- Joshua Cranmer
News submodule owner
DXR coauthor
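As an illustration of why the jmp size depends on layout, the sketch below encodes the 2-byte and 5-byte forms of an unconditional x86 jump from a displacement (opcode EB with an 8-bit displacement, opcode E9 with a 32-bit displacement). It is a toy model, not LLVM's encoder; `encodeJmp` is a hypothetical name, and the displacement is assumed to be measured from the end of the jump instruction.

```cpp
#include <cstdint>
#include <vector>

// Encodes an unconditional x86 jump for the given displacement, measured
// from the end of the jump instruction. Hypothetical helper, not LLVM code.
std::vector<uint8_t> encodeJmp(int32_t disp) {
  if (disp >= INT8_MIN && disp <= INT8_MAX) {
    // 2-byte form: opcode EB followed by an 8-bit signed displacement.
    return {0xEB, static_cast<uint8_t>(disp)};
  }
  // 5-byte form: opcode E9 followed by a 32-bit little-endian displacement.
  std::vector<uint8_t> out{0xE9};
  uint32_t u = static_cast<uint32_t>(disp);
  for (int i = 0; i < 4; ++i)
    out.push_back(static_cast<uint8_t>(u >> (8 * i)));
  return out;
}
```

Both the length and the bytes change with the distance to the target, and that distance is only fixed once every instruction between the jump and its target has a final size.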
Jim Grosbach
2012-Oct-24 18:52 UTC
[LLVMdev] How to Find Instruction Encoding for a MachineInstr
Right. See X86AsmBackend::mayNeedRelaxation() and friends for the gory details.

-jim
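The general idea of relaxation can be sketched as a fixed-point loop: start with every jump in its 2-byte form, lay out the code, grow any jump whose displacement does not fit in 8 bits to the 5-byte form, and repeat until nothing changes. The code below is a simplified toy model under those assumptions, not the actual MC implementation; `Item` and `relax` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Item {
  bool isJump;        // true for a jump, false for any other instruction
  std::size_t size;   // current encoded size in bytes
  std::size_t target; // index of the item jumped to (used by jumps only)
};

// Grows 2-byte jumps to 5-byte jumps until every displacement fits.
// Returns the byte offset of each item; the last entry is the total size.
std::vector<std::size_t> relax(std::vector<Item> &items) {
  std::vector<std::size_t> offset(items.size() + 1, 0);
  bool changed = true;
  while (changed) {
    changed = false;
    // Recompute the layout with the current sizes.
    for (std::size_t i = 0; i < items.size(); ++i)
      offset[i + 1] = offset[i] + items[i].size;
    for (std::size_t i = 0; i < items.size(); ++i) {
      if (!items[i].isJump || items[i].size != 2)
        continue;
      // The displacement is relative to the end of the jump instruction.
      std::int64_t disp =
          static_cast<std::int64_t>(offset[items[i].target]) -
          static_cast<std::int64_t>(offset[i + 1]);
      if (disp < INT8_MIN || disp > INT8_MAX) {
        items[i].size = 5; // does not fit in 8 bits: use the longer form
        changed = true;
      }
    }
  }
  return offset;
}
```

Because jumps only ever grow, the loop terminates. Growing one jump can push another jump's target out of range, so the size of a single instruction cannot be decided in isolation, which is why a per-MachineInstr answer is not available before this stage.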