thr3ads.net - llvm dev - [LLVMdev] Issue with instruction decoding / disassembly [Dec 2012]


Richard Osborne

2012-Dec-18 10:37 UTC

[LLVMdev] Issue with instruction decoding / disassembly

I'm currently trying to get llvm-mc --disassemble working for the XCore
backend. Until recently there was no instruction encoding / decoding
information on any of the XCore instructions, so I'm incrementally adding this
information at the same time as adding tests for the disassembler. However,
I've run into a problem and I'm not sure of the best way to solve it.
With some of the XCore's instruction formats, operands are not encoded into
bits individually. Instead they are combined into a single field using
arithmetic operations before being inserted in the instruction. For example:

ADD_3r is encoded as: 00010aaaaabbccdd

where:

aaaaa = op1[3..2] × 9 + op2[3..2] × 3 + op3[3..2]
bb = op1[1..0]
cc = op2[1..0]
dd = op3[1..0]

op1, op2 and op3 are all in the range 0-11, so each op[3..2] is in the range
0-2 and aaaaa is therefore in the range 0-26.
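
The packing above can be sketched in standalone C++. This is only an
illustration of the arithmetic as stated in this message, not code from the
XCore backend: the names Ops3, encode3R and decode3R are hypothetical, and the
fixed opcode bits (00010) are assumed to be OR'd in by the caller.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operand triple for a 3r-format instruction (each value 0..11).
struct Ops3 { unsigned op1, op2, op3; };

// Pack three operands into the low 11 bits (aaaaabbccdd) using the formula
// from the message. The opcode bits are not included here.
inline uint16_t encode3R(Ops3 o) {
  assert(o.op1 < 12 && o.op2 < 12 && o.op3 < 12);
  // High two bits of each operand are 0..2, so the combined value is 0..26.
  unsigned combined = (o.op1 >> 2) * 9 + (o.op2 >> 2) * 3 + (o.op3 >> 2);
  return static_cast<uint16_t>((combined << 6) | ((o.op1 & 3) << 4) |
                               ((o.op2 & 3) << 2) | (o.op3 & 3));
}

// Inverse of encode3R. Returns false when the combined field is 27..31,
// which cannot have been produced by the formula above.
inline bool decode3R(uint16_t insn, Ops3 &out) {
  unsigned combined = (insn >> 6) & 0x1f;
  if (combined >= 27)
    return false;
  out.op1 = ((combined / 9) << 2) | ((insn >> 4) & 3);
  out.op2 = (((combined / 3) % 3) << 2) | ((insn >> 2) & 3);
  out.op3 = ((combined % 3) << 2) | (insn & 3);
  return true;
}
```

Because the three high parts are mixed in base 3, no subset of the aaaaa bits
belongs to a single operand, which is why a custom decoder method is needed
rather than plain per-operand bit fields.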

I managed to get decoding of ADD_3r instructions to work by specifying the value
of the bits that are fixed in the instruction format and using a custom
DecoderMethod to handle the rest. The problem comes when I try and add the
INITSP_2r instruction.

INITSP_2r is encoded as: 00010aaaaab0ccdd

Again, operands are not individually encoded into bits but are instead
combined into a single field using arithmetic operations. Due to the way
aaaaa is derived, it is guaranteed to be at least 27. The value of these bits
is how the INITSP_2r and ADD_3r instructions should be distinguished. I tried to
handle INITSP_2r the same way as ADD_3r (specifying the value of the bits that
are fixed and using a custom DecoderMethod for the rest). With this change I
can disassemble INITSP_2r instructions, but it breaks the decoding of ADD_3r
instructions. Consider the following bit pattern:

0001000000000000

This is an ADD_3r instruction. Before adding INITSP_2r the autogenerated
decodeInstruction method would identify this as a possible ADD_3r instruction
and it would call the associated decoder method (Decode3RInstruction) which
returns Success. After adding INITSP_2r the autogenerated decodeInstruction
method identifies this as a possible INITSP_2r instruction and it calls the
associated decoder method (Decode2RInstruction) which returns Fail. At this
point I'd like decodeInstruction to carry on testing to see if it can be
decoded as an ADD_3r instruction but instead it stops looking at this point and
returns Fail.
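
The behaviour described above can be reproduced with a toy model. This is a
sketch under stated assumptions, not the code TableGen emits: Entry,
decodeFirstMatch, tableBefore and tableAfter are hypothetical names, the
decoder stand-ins only check the range of the combined field, and the ordering
(entry with more fixed bits first) is assumed.

```cpp
#include <cstdint>
#include <vector>

enum class Status { Fail, Success };
enum class Opcode { None, ADD_3r, INITSP_2r };

// One table entry: the fixed bits that select it, plus a custom decoder.
struct Entry {
  uint16_t mask, value;
  Opcode opcode;
  Status (*decoder)(uint16_t);
};

inline unsigned combinedField(uint16_t insn) { return (insn >> 6) & 0x1f; }

// Stand-ins for Decode3RInstruction / Decode2RInstruction: they only check
// whether the combined field is in the range each format can produce.
inline Status decode3R(uint16_t insn) {
  return combinedField(insn) < 27 ? Status::Success : Status::Fail;
}
inline Status decode2R(uint16_t insn) {
  return combinedField(insn) >= 27 ? Status::Success : Status::Fail;
}

// The first entry whose fixed bits match is chosen; if its decoder fails,
// the result is final and later entries are never tried.
inline Opcode decodeFirstMatch(const std::vector<Entry> &table, uint16_t insn) {
  for (const Entry &e : table)
    if ((insn & e.mask) == e.value)
      return e.decoder(insn) == Status::Success ? e.opcode : Opcode::None;
  return Opcode::None;
}

// Table with only ADD_3r: fixed bits are 00010 in bits 15..11.
inline std::vector<Entry> tableBefore() {
  return {Entry{0xF800, 0x1000, Opcode::ADD_3r, decode3R}};
}

// Table after adding INITSP_2r, which also fixes bit 4 to 0.
inline std::vector<Entry> tableAfter() {
  return {Entry{0xF810, 0x1000, Opcode::INITSP_2r, decode2R},
          Entry{0xF800, 0x1000, Opcode::ADD_3r, decode3R}};
}
```

In this model, decodeFirstMatch(tableBefore(), 0x1000) yields ADD_3r, while
decodeFirstMatch(tableAfter(), 0x1000) yields None, because the INITSP_2r
entry matches the fixed bits first and its decoder rejects the combined field.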

How should I deal with this situation? One idea I had (which I haven't tried
yet) is to move the troublesome instructions into a different decoding table by
setting the DecoderNamespace. This way in XCoreDisassembler::getInstruction() I
can call decodeInstruction() on the first decoder table (containing INITSP_2r)
and if this fails I can then call decodeInstruction() on the second decoder
table (containing ADD_3r). Is this an abuse of DecoderNamespaces? Is there a
better way of solving my problem?
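
For illustration, the two-table idea might look roughly like this. It is a
standalone sketch, not the generated decodeInstruction() or the real
XCoreDisassembler: decodeTable2R and decodeTable3R are hypothetical stand-ins
for calling decodeInstruction() on each namespace's table, and they check only
the fixed bits and the range of the combined field.

```cpp
#include <cstdint>

enum class Status { Fail, Success };
enum class Opcode { None, ADD_3r, INITSP_2r };

inline unsigned combinedField(uint16_t insn) { return (insn >> 6) & 0x1f; }

// Stand-in for decodeInstruction() on the table holding INITSP_2r.
// Fixed bits: 00010 in bits 15..11 and 0 in bit 4.
inline Status decodeTable2R(uint16_t insn, Opcode &out) {
  if ((insn & 0xF810) != 0x1000 || combinedField(insn) < 27)
    return Status::Fail;
  out = Opcode::INITSP_2r;
  return Status::Success;
}

// Stand-in for decodeInstruction() on the table holding ADD_3r.
// Fixed bits: 00010 in bits 15..11.
inline Status decodeTable3R(uint16_t insn, Opcode &out) {
  if ((insn & 0xF800) != 0x1000 || combinedField(insn) >= 27)
    return Status::Fail;
  out = Opcode::ADD_3r;
  return Status::Success;
}

// Try the first table; only if it fails, fall back to the second.
inline Status getInstruction(uint16_t insn, Opcode &out) {
  out = Opcode::None;
  if (decodeTable2R(insn, out) == Status::Success)
    return Status::Success;
  return decodeTable3R(insn, out);
}
```

With this arrangement the pattern 0001000000000000 fails in the first table
and is then accepted by the second as ADD_3r, so a Fail from one decoder no
longer ends the search.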

Thanks,

Richard

Jim Grosbach

2012-Dec-18 18:28 UTC


Owen,

As I recall, we had some similar issues with custom decoders needing to
cooperate on ARM. Do you remember the details?

-Jim


Owen Anderson

2012-Dec-18 18:44 UTC


We heavily refactored the ARM instruction definitions to get enough fixed bits
exposed for the autogenerated decoder to work.
If that's not possible, you can use the decoder hooks to override the opcode
of the instruction.  See, for example, DecodeVCVTD in ARMDisassembler.cpp.
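
Applied to the XCore case from the original question, an opcode-overriding
hook might look something like the sketch below. It is a standalone model and
not taken from ARMDisassembler.cpp: Inst is a minimal stand-in for an
instruction object with a settable opcode, decode3ROr2R is a hypothetical
name, and operand decoding is left out.

```cpp
#include <cstdint>

enum class Status { Fail, Success };
enum class Opcode { ADD_3r, INITSP_2r };

// Minimal stand-in for an instruction object whose opcode a hook can rewrite.
struct Inst {
  Opcode opcode;
  void setOpcode(Opcode op) { opcode = op; }
};

// Hypothetical decoder hook attached to ADD_3r, which the table selects on
// the fixed bits 00010 alone. If the combined field shows that the word is
// really a 2r encoding, the hook overrides the opcode instead of failing.
inline Status decode3ROr2R(Inst &inst, uint16_t insn) {
  unsigned combined = (insn >> 6) & 0x1f;
  if (combined < 27)
    return Status::Success;           // genuine 3r form; opcode left as is
  if (insn & 0x0010)
    return Status::Fail;              // 2r form requires bit 4 to be 0
  inst.setOpcode(Opcode::INITSP_2r);  // override the table's choice
  return Status::Success;
}
```

The trade-off is that only one of the two instructions is visible to the
autogenerated table, and the hook carries the knowledge of how to tell them
apart.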

--Owen

