On Wed, Dec 24, 2014 at 2:43 PM, Craig Topper <craig.topper at gmail.com> wrote:
> I believe this particular error is caused by this. That seems easy enough
> to just drop the bit. Do you have other non-mmx examples?
>
>   case TYPE_MM: \
>     if (index > 7) \
>       *valid = 0; \
>     return prefix##_MM0 + index;
>

yes, exactly this place. but the question is: how do we know when to drop
the REX.B?

i don't know any non-MMX examples. it seems only MMX related instructions
have this issue.

thanks,
Jun

> On Tue, Dec 23, 2014 at 10:17 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
>>
>> hi,
>>
>> i think the current X86 disassembler is quite broken and fails badly on
>> handling REX for x86_64 code.
>>
>> below are some examples:
>>
>> $ echo "0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>> -triple=x86_64
>> .text
>> por %mm3, %mm0
>>
>> $ echo "0x40,0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>> -triple=x86_64
>> .text
>> por %mm3, %mm0
>>
>> $ echo "0x41,0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>> -triple=x86_64
>> .text
>> <stdin>:1:1: warning: invalid instruction encoding
>> 0x41,0x0f,0xeb,0xc3
>> ^
>>
>>
>> the last example should also return "por %mm3, %mm0", but it fails to
>> understand the input.
>>
>> the reason stays with this line in X86DisassemblerDecoder.cpp:
>>
>> rm |= bFromREX(insn->rexPrefix) << 3;
>>
>> we can see that we take into account REX.B, but for "por" (0F EB), this
>> should be ignored.
>>
>> there are quite a lot of other instructions taking into account REX like
>> this, while according to the manual, REX should be ignored.
>>
>> i don't see any clean solution for this issue without some significant
>> changes into the way we decode ModRM & providing more information to .td
>> files.
>>
>> any idea?
>>
>> thanks.
>> Jun
>>
>
>
> --
> ~Craig
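
For readers following the byte sequences in the report, here is a minimal sketch of how the ModRM byte 0xc3 splits into fields and how REX.B ends up in bit 3 of the rm value. The struct and function names (ModRMFields, splitModRM, rexB, extendedRm) are hypothetical and chosen for illustration; they are not the names used in X86DisassemblerDecoder.cpp. Only the "rm |= ... << 3" step is taken from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration; not LLVM's actual API.
struct ModRMFields {
  unsigned mod; // bits 7..6
  unsigned reg; // bits 5..3
  unsigned rm;  // bits 2..0
};

// Split a ModRM byte into its three fields.
inline ModRMFields splitModRM(uint8_t modrm) {
  ModRMFields f;
  f.mod = (modrm >> 6) & 0x3;
  f.reg = (modrm >> 3) & 0x7;
  f.rm = modrm & 0x7;
  return f;
}

// REX prefixes occupy 0x40..0x4f; bit 0 of the prefix is REX.B.
inline unsigned rexB(uint8_t rex) {
  assert((rex & 0xf0) == 0x40 && "not a REX prefix");
  return rex & 0x1;
}

// Mirrors the quoted line "rm |= bFromREX(insn->rexPrefix) << 3":
// REX.B becomes bit 3 of the register number taken from ModRM.rm.
inline unsigned extendedRm(uint8_t rex, uint8_t modrm) {
  return splitModRM(modrm).rm | (rexB(rex) << 3);
}
```

With these definitions, extendedRm(0x40, 0xc3) is 3 while extendedRm(0x41, 0xc3) is 11. Since only MM0 through MM7 exist, an index of 11 is what trips the (index > 7) check in the TYPE_MM case and produces the "invalid instruction encoding" warning.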
Craig Topper
2014-Dec-24 06:59 UTC
[LLVMdev] X86 disassembler is quite broken on handling REX
Wouldn't changing

  case TYPE_MM: \
    if (index > 7) \
      *valid = 0; \
    return prefix##_MM0 + index;

to

  case TYPE_MM: \
    return prefix##_MM0 + (index & 0x7);

fix the issue for both REX.B and REX.R?
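
To make the difference between the two variants concrete, here is a small standalone sketch. The enum and the function names translateMMStrict and translateMMMasked are hypothetical stand-ins, not the macro-generated code in the decoder; the sketch only assumes that the index passed in is the 4-bit value formed from a 3-bit ModRM field plus a REX extension bit.

```cpp
#include <cassert>

// Hypothetical stand-in for the decoder's register enum: only MM0..MM7 exist.
enum MMReg { MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7 };

// Current behaviour as quoted in the thread: an index above 7
// (REX.B or REX.R set) marks the whole instruction as invalid.
inline int translateMMStrict(unsigned index, bool *valid) {
  assert(valid && "caller must supply a validity flag");
  if (index > 7)
    *valid = false;
  return MM0 + index;
}

// Proposed behaviour: drop the REX extension bit, so indices 8..15
// alias MM0..MM7 and the encoding is accepted.
inline int translateMMMasked(unsigned index) {
  return MM0 + (index & 0x7);
}
```

For the failing example (index 11 after REX.B is applied), the strict version clears the validity flag, while the masked version yields MM3, matching the expected "por %mm3, %mm0". Because the mask is applied to the final index, the same change would cover an index extended by REX.R as well.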
On Tue, Dec 23, 2014 at 10:54 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
>
>
>
> On Wed, Dec 24, 2014 at 2:43 PM, Craig Topper <craig.topper at gmail.com>
> wrote:
>
>> I believe this particular error is caused by this. That seems easy enough
>> to just drop the bit. Do you have other non-mmx examples?
>>
>>   case TYPE_MM: \
>>     if (index > 7) \
>>       *valid = 0; \
>>     return prefix##_MM0 + index;
>>
>
> yes, exactly this place. but the question is: how do we know when to drop
> the REX.B?
>
>
> i don't know any non-MMX examples. it seems only MMX related instructions
> have this issue.
>
> thanks,
> Jun
>
>
>
>
>>
>> On Tue, Dec 23, 2014 at 10:17 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
>>>
>>> hi,
>>>
>>> i think the current X86 disassembler is quite broken and fails badly on
>>> handling REX for x86_64 code.
>>>
>>> below are some examples:
>>>
>>> $ echo "0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>>> -triple=x86_64
>>> .text
>>> por %mm3, %mm0
>>>
>>> $ echo "0x40,0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>>> -triple=x86_64
>>> .text
>>> por %mm3, %mm0
>>>
>>> $ echo "0x41,0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>>> -triple=x86_64
>>> .text
>>> <stdin>:1:1: warning: invalid instruction encoding
>>> 0x41,0x0f,0xeb,0xc3
>>> ^
>>>
>>>
>>> the last example should also return "por %mm3, %mm0", but it fails to
>>> understand the input.
>>>
>>> the reason stays with this line in X86DisassemblerDecoder.cpp:
>>>
>>> rm |= bFromREX(insn->rexPrefix) << 3;
>>>
>>> we can see that we take into account REX.B, but for "por" (0F EB), this
>>> should be ignored.
>>>
>>> there are quite a lot of other instructions taking into account REX like
>>> this, while according to the manual, REX should be ignored.
>>>
>>> i don't see any clean solution for this issue without some significant
>>> changes into the way we decode ModRM & providing more information to .td
>>> files.
>>>
>>> any idea?
>>>
>>> thanks.
>>> Jun
>>>
>>
>>
>> --
>> ~Craig
>>
>
>
--
~Craig
On Wed, Dec 24, 2014 at 2:59 PM, Craig Topper <craig.topper at gmail.com> wrote:
> Wouldn't changing
>
>   case TYPE_MM: \
>     if (index > 7) \
>       *valid = 0; \
>     return prefix##_MM0 + index;
>
> to
>
>   case TYPE_MM: \
>     return prefix##_MM0 + (index & 0x7);
>
> Fix the issue for both rex.b and rex.r?
>

this sounds OK. but then there is no more (index > 7) check? is there any
case where that can be an issue?

thanks,
Jun

> On Tue, Dec 23, 2014 at 10:54 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
>>
>> On Wed, Dec 24, 2014 at 2:43 PM, Craig Topper <craig.topper at gmail.com>
>> wrote:
>>
>>> I believe this particular error is caused by this. That seems easy
>>> enough to just drop the bit. Do you have other non-mmx examples?
>>>
>>>   case TYPE_MM: \
>>>     if (index > 7) \
>>>       *valid = 0; \
>>>     return prefix##_MM0 + index;
>>>
>>
>> yes, exactly this place. but the question is: how do we know when to drop
>> the REX.B?
>>
>> i don't know any non-MMX examples. it seems only MMX related instructions
>> have this issue.
>>
>> thanks,
>> Jun
>>
>>> On Tue, Dec 23, 2014 at 10:17 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
>>>>
>>>> hi,
>>>>
>>>> i think the current X86 disassembler is quite broken and fails badly on
>>>> handling REX for x86_64 code.
>>>>
>>>> below are some examples:
>>>>
>>>> $ echo "0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>>>> -triple=x86_64
>>>> .text
>>>> por %mm3, %mm0
>>>>
>>>> $ echo "0x40,0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>>>> -triple=x86_64
>>>> .text
>>>> por %mm3, %mm0
>>>>
>>>> $ echo "0x41,0x0f,0xeb,0xc3"|./Release+Asserts/bin/llvm-mc -disassemble
>>>> -triple=x86_64
>>>> .text
>>>> <stdin>:1:1: warning: invalid instruction encoding
>>>> 0x41,0x0f,0xeb,0xc3
>>>> ^
>>>>
>>>>
>>>> the last example should also return "por %mm3, %mm0", but it fails to
>>>> understand the input.
>>>>
>>>> the reason stays with this line in X86DisassemblerDecoder.cpp:
>>>>
>>>> rm |= bFromREX(insn->rexPrefix) << 3;
>>>>
>>>> we can see that we take into account REX.B, but for "por" (0F EB), this
>>>> should be ignored.
>>>>
>>>> there are quite a lot of other instructions taking into account REX
>>>> like this, while according to the manual, REX should be ignored.
>>>>
>>>> i don't see any clean solution for this issue without some significant
>>>> changes into the way we decode ModRM & providing more information to .td
>>>> files.
>>>>
>>>> any idea?
>>>>
>>>> thanks.
>>>> Jun
>>>>
>>>
>>>
>>> --
>>> ~Craig
>>>
>>
>
> --
> ~Craig