Hi,

Intel's Xed can interpret "43 40 04 75" as "add al, 0x75", but LLVM's X86
disassembler considers this invalid code. I guess the reason is that LLVM
fails to recognize the REX prefix in this case.

Is this correct?

Thanks.
Jun
Ahmed Bougacha
2014-Dec-11 19:12 UTC
[LLVMdev] REX prefix is not handled properly for X86_64?
Hi Jun,

FWIW, I think LLVM is right to reject this. Per SDM 2.2.1, "Only one
REX prefix is allowed per instruction."
Here, 0x43 and 0x40 are both REX prefixes, so the encoding contradicts the
manual.

However, trunk llvm-mc is still able to disassemble the add, I guess
because it skips invalid bytes:

    <stdin>:1:1: warning: invalid instruction encoding
    0x43 0x40 0x04 0x75
    ^
    addb $117, %al ## encoding: [0x04,0x75]
    ## <MCInst #107 ADD8i8
    ## <MCOperand Imm:117>>

It would be trivial to change the disassembler to accept redundant REX
prefixes (see the attached patch; turning that into a loop would accept
more than two, but that would be even worse). Then you have to decide
which one to use: the first or the last. Currently, only the last REX
prefix is actually used for the following instruction; all the earlier
ones are discarded as invalid encodings.

Now, if LLVM rejected useless REX prefixes (e.g. "40 04 75"), that
would be a problem, but that case seems to work fine without any change.

So, to recap: to avoid the problem, I think you should change the way
you use the LLVM Disassembler API. When it is unable to disassemble a
byte, ignore that byte and try again at the next one. That is what most
linear disassemblers do, and it would correctly ignore the first REX
prefix here.

- Ahmed

On Thu, Dec 11, 2014 at 1:27 AM, Jun Koi <junkoi2004 at gmail.com> wrote:
> Hi,
>
> Intel's Xed can interpret "43 40 04 75" as "add al, 0x75", but LLVM's X86
> disassembler considers this invalid code. I guess the reason is that LLVM
> fails to recognize the REX prefix in this case.
>
> Is this correct?
>
> Thanks.
> Jun

-------------- next part --------------
A non-text attachment was scrubbed...
Name: x86_rex_redundant.patch
Type: application/octet-stream
Size: 595 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141211/15421a6c/attachment.obj>
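The recap above (when the decoder rejects a byte, skip it and retry at the next one) can be sketched roughly as follows. This is a toy model in Python, not LLVM's decoder and not its Disassembler API: `is_rex`, `decode_one`, and `linear_sweep` are hypothetical names made up for illustration, and the decoder only understands an optional single REX prefix followed by opcode 0x04 (ADD AL, imm8), which is just enough for the bytes discussed in this thread.

```python
# Toy model only: NOT LLVM's decoder and NOT its Disassembler API.
# All function names here are hypothetical, chosen for illustration.

def is_rex(byte):
    """In 64-bit mode, the bytes 0x40..0x4F are REX prefixes."""
    return 0x40 <= byte <= 0x4F


def decode_one(code, offset):
    """Try to decode one instruction starting at `offset`.

    Returns (length, text) on success, or (0, None) on failure.
    Only "[REX] 04 ib" (ADD AL, imm8) is recognized.
    """
    pos = offset
    if pos < len(code) and is_rex(code[pos]):
        pos += 1
        # Reject a second REX prefix, per the SDM 2.2.1 rule quoted above.
        if pos < len(code) and is_rex(code[pos]):
            return 0, None
    # Need the opcode byte plus one immediate byte.
    if pos + 1 < len(code) and code[pos] == 0x04:
        return pos + 2 - offset, "add al, 0x%02x" % code[pos + 1]
    return 0, None


def linear_sweep(code):
    """Decode front to back; on failure, skip one byte and try again."""
    results = []
    offset = 0
    while offset < len(code):
        length, text = decode_one(code, offset)
        if length == 0:
            # Undecodable byte: record it as skipped and move on by one.
            results.append((offset, 1, None))
            offset += 1
        else:
            results.append((offset, length, text))
            offset += length
    return results


if __name__ == "__main__":
    for off, length, text in linear_sweep(bytes([0x43, 0x40, 0x04, 0x75])):
        print(off, length, text if text else "<invalid byte, skipped>")
```

With the bytes from this thread, the sweep reports offset 0 (0x43) as a skipped byte and then decodes the remaining three bytes as "add al, 0x75", while "40 04 75" on its own decodes in one step. That matches the behaviour described in the message, under the assumptions of this toy decoder.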
On Fri, Dec 12, 2014 at 3:12 AM, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:
> So, to recap: to avoid the problem, I think you should change the way
> you use the LLVM Disassembler API. When it's unable to disassemble a
> byte, ignore it and try again at the next one. That's what most
> linear disassemblers do, and would correctly ignore the first REX
> prefix here.

Got it, thanks a lot!

Jun