Alvin Ye via llvm-dev
2021-Nov-11 20:17 UTC
[llvm-dev] Assembler Validates Instruction's Immediate Input
Hi,

I'm working on an issue where the Hexagon assembler assembles an incorrect instruction without emitting an error, whereas the GNU assembler emits an error in the same case.

llvm-mc:

    echo 'memh(r4+#3) = r2;memh(r4+#-5) = r3' | bin/llvm-mc --mcpu=hexagonv71 -filetype=obj | llvm-objdump -d -

    Disassembly of section .text:

    00000000 <.text>:
           0: 01 c2 44 a1 a144c201 { memh(r4+#2) = r2 }
           4: fd e3 44 a7 a744e3fd { memh(r4+#-6) = r3 }

GNU as:

    a.s:17: Error: low 1 bits of immediate -5 must be zero
    a.s:17: Error: invalid instruction `memh(r4+#-5) =r2'
    a.s:18: Error: low 1 bits of immediate -1 must be zero
    a.s:18: Error: invalid instruction `memh(r4+#-1) =r3'

The llvm-mc example above shows that the immediate values #3 and #-5 were silently changed to #2 and #-6. This happens because the encoding class specifies that the lowest bit of the immediate value be skipped, since this instruction always accesses 2-byte-aligned memory addresses.

The instruction is defined at lib/Target/Hexagon/HexagonDepInstrInfo.td:9634, where it has an encoding class of Enc_de0214, which is defined at lib/Target/Hexagon/HexagonDepInstrFormats.td:3103. Enc_de0214 specifies that the lowest bit of the immediate value be skipped during encoding.

This issue could be fixed in the Hexagon target-dependent portion, but I wonder whether some sort of "fact checking" for this kind of behavior could be added somewhere in TableGen, e.g. in TableGen/AsmMatcherEmitter.cpp. If such a check were implemented, it could benefit other targets as well.

Thanks,
Alvin
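To make the truncation concrete, here is a minimal sketch of what happens when an encoder drops the low bit of a scaled immediate, and of the kind of check that would reject the operand instead. It is not the LLVM or GNU implementation. The function names are hypothetical, and the 11-bit field width is an assumption used only for illustration. The error text imitates the GNU as message quoted above.

```python
from typing import Optional


def encode_scaled_imm(value: int, shift: int) -> int:
    """Drop the low `shift` bits, as an encoder that skips them would.

    Python's >> on ints is an arithmetic shift, so it rounds toward
    negative infinity.
    """
    return value >> shift


def decode_scaled_imm(field: int, shift: int) -> int:
    """Rebuild the immediate a disassembler would print from the field."""
    return field << shift


def is_shifted_int(value: int, bits: int, shift: int) -> bool:
    """True if `value` is a signed `bits`-bit integer scaled by 2**shift."""
    if value % (1 << shift) != 0:
        return False  # some of the low `shift` bits are set
    scaled = value >> shift
    return -(1 << (bits - 1)) <= scaled < (1 << (bits - 1))


def check_operand(value: int, bits: int, shift: int) -> Optional[str]:
    """Return an error message for an invalid immediate, or None if valid."""
    if value % (1 << shift) != 0:
        return f"low {shift} bits of immediate {value} must be zero"
    if not is_shifted_int(value, bits, shift):
        return f"immediate {value} out of range"
    return None


if __name__ == "__main__":
    BITS, SHIFT = 11, 1  # assumed field width; shift of 1 for halfword access
    for imm in (3, -5, 2, -6):
        silently_encoded = decode_scaled_imm(encode_scaled_imm(imm, SHIFT), SHIFT)
        print(imm, "->", silently_encoded, "|", check_operand(imm, BITS, SHIFT))
```

Without the check, #3 round-trips to #2 and #-5 to #-6, matching the disassembly above. With the check, both operands are rejected before encoding.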