Hello everyone,

At EuroLLVM I presented some testing work we have been doing on improving
correctness of the MC Layer for ARM. There seemed to be interest from the
community in seeing the results of this test suite.

Background
-----------
We are using a test suite, called MC Hammer, that compares MC with an ARM
in-house implementation of the same functionality. The test space for this
suite is very large ( O(10 trillion) points ) so we are concentrating on small
slices at a time. For further details you can check out the talk I did at
EuroLLVM last month:
http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf

Results
--------
The below results are:
- for ARM instructions (i.e. not Thumb instructions)
- for Cortex-A8 with VFPv3 and Advanced SIMDv1 extensions
- for the encode/decode loop, described on slide 11 of the talk
- for all instruction encodings with condition code AL (all 32-bit patterns
  with the top 4 bits set to 0xE) [1]
- all silent codegen bugs[2], that is bugs where:
    - The reference bitpattern is defined and predictable.
    - The generated bitpattern is defined.
    - The generated bitpattern differs from the reference bitpattern.
- for LLVM r156468 (updated at 2012-05-09 08:08:58 BST.)
- for LLVM built with no assertions[3]

Attached is a zipped copy of the full log from MC Hammer. Also attached is a
triaged summary of these logs, showing that there are five bugs found (summary
from log reproduced below [4]).

Currently we have patches for three of these (bugs 1-3) that are ready to go
upstream. We are actively working on a patch for one other (bug 5). We would,
of course, welcome a patch for bug 4 (I have put a proposal for the fix in the
attached report).

Unless there is a preference otherwise, the next set of results will be the
list of silent codegen bugs for the 0xF slice. I will then move to the list of
silent codegen bugs for the disassemble and assemble loops (slides 10 and 12
respectively) but, like MC Hammer's DJ, I can take requests.

Regards,
Richard Barton
ARM Ltd, Cambridge

===================

[1] The slice is run with only one condition code to reduce the size of the
logs, that is to say for all 32-bit patterns with the top 4 bits set to 0xE.
Running the whole space over all 32 bits hits the same bugs repeatedly, once
for each condition code. Condition code 0xF is for instructions that can only
be executed unconditionally. Notably, this includes most of the VFP
instructions.

[2] Other types of failure can be detected by MC Hammer. These results can be
published if there is interest in seeing them.

[3] LLVM built with assertions turned on hits SIGABRT on some bitpatterns.
When this occurs the rest of the slice is not run, so to cover the whole slice
we are using a version of LLVM with no assertions. One would expect an
assertion to catch a genuine codegen failure early, so hopefully bugs that
would have triggered an assertion will still show up with assertions off.

[4] (reproduced bug triage from the error output)

[bug 1]
echo 0xb0 0x00 0x80 0xe6 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

This bitpattern should decode to an unpredictable SEL r0, r0, r0.
MC is decoding this to an STR r0, [r0], r0, lsr #1 which it is incorrectly
diagnosing as unpredictable.

[bug 2]
echo 0x70 0x01 0x80 0xe6 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

This bitpattern should decode to an unpredictable SXTAB16 r0, r0, r0.
MC is decoding this to an STR r0, [r0], r0, ror #2 which it is incorrectly
diagnosing as unpredictable.

[bug 3]
echo 0x70 0x01 0xc0 0xe6 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

This bitpattern should decode to an unpredictable UXTAB16 r0, r0, r0.
MC is decoding this to an STRB r0, [r0], r0, ror #2 which it is incorrectly
diagnosing as unpredictable.
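
For anyone wanting to sanity-check bugs 1-3 by hand, here is a minimal Python
sketch (not part of MC Hammer or of MC). It assembles the little-endian bytes
given to llvm-mc into a 32-bit word and applies the top-level ARM encoding
split for bits [27:25] = 0b011: bit 4 clear is the register-offset load/store
space, bit 4 set is the media instruction space. The function names are
hypothetical, and the split is only applied as it holds for conditional
encodings such as the AL slice.

```python
def word_from_bytes(text):
    """Build a 32-bit word from four little-endian hex bytes, as fed to llvm-mc."""
    values = [int(tok, 16) for tok in text.split()]
    if len(values) != 4:
        raise ValueError("expected exactly four bytes")
    return values[0] | (values[1] << 8) | (values[2] << 16) | (values[3] << 24)


def classify(word):
    """Return (cond, group) using only bits [27:25] and bit 4 of the word."""
    cond = (word >> 28) & 0xF
    op1 = (word >> 25) & 0x7
    bit4 = (word >> 4) & 0x1
    if op1 != 0b011:
        group = "other"
    elif bit4:
        group = "media"  # SEL, SXTAB16, UXTAB16 live here
    else:
        group = "load/store (register offset)"  # STR/STRB with a shifted Rm
    return cond, group


if __name__ == "__main__":
    for label, text in [
        ("bug 1", "0xb0 0x00 0x80 0xe6"),
        ("bug 2", "0x70 0x01 0x80 0xe6"),
        ("bug 3", "0x70 0x01 0xc0 0xe6"),
    ]:
        word = word_from_bytes(text)
        cond, group = classify(word)
        print(f"{label}: 0x{word:08x} cond=0x{cond:x} group={group}")
```

All three patterns have bit 4 set, so they fall in the media space, which is
consistent with the reference decodings (SEL, SXTAB16, UXTAB16) rather than
the STR/STRB forms that MC reports.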

[bug 4]
echo 0x90 0x00 0xc0 0xe7 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble
echo 0x90 0x01 0xc0 0xe7 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

These bitpatterns decode to a BFI with an invalid mask operand, which is
unpredictable. The first example fails with an abort when assertions are
turned on, and otherwise creates the instruction BFI r0, r0, #32, #-32.
The second example does not abort and decodes to BFI r0, r0, #1, #2
(0xe7c20090).
The ARMARM could be clearer on this point, but the real UAL should be
BFI r0, r0, #lsbit, #(msbit+1-lsbit), which for the second example gives
BFI r0, r0, #3, #-2.
In my opinion, the root cause of the problem is that BFI MCInsts store the
mask as a 32-bit operand and convert to and from the msbit and lsbit fields
during encode, decode, assemble and disassemble. I think they should store the
msbit and lsbit fields as operands and compute the mask at the instruction
selection phase.

[bug 5]
echo 0x03 0x0b 0x80 0xec | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

The bitpattern is decoding as VSTMIA r0, {d0} when it should decode to
FSTMIAX r0, {d0}.
These instructions are a bit of a curiosity in that they are pre-ARMv6 (VFPv1)
instruction mnemonics which were not superseded by UAL-style V* mnemonics.
They still exist in VFPv4 but their use is deprecated. Any VSTMs with
odd-numbered imm8 fields (bottom 8 bits) are the old-style F* encodings, and
the encoding of the immediate is different.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ca8_ARM_enc_dec_alcond_diffsonly_raw.rpt.bz2
Type: application/octet-stream
Size: 1272798 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120510/40dc6f5e/attachment.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ca8_ARM_enc_dec_alcond_diffsonly.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120510/40dc6f5e/attachment.txt>
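
On bug 4, the following Python sketch illustrates how storing only a 32-bit
mask can lose the msbit and lsbit fields when msbit < lsbit. It is not taken
from the MC source. The mask construction, ~(msb_mask ^ lsb_mask), is an
assumption chosen because it reproduces both outputs reported for bug 4, and
all function names are hypothetical.

```python
def bfi_fields(word):
    """Extract (lsbit, msbit) from an ARM BFI encoding: bits [11:7] and [20:16]."""
    return (word >> 7) & 0x1F, (word >> 16) & 0x1F


def ual_operands(lsbit, msbit):
    """Operands as proposed in the report: #lsbit, #(msbit + 1 - lsbit)."""
    return lsbit, msbit + 1 - lsbit


def mask_from_fields(lsbit, msbit):
    """ASSUMED mask construction: clear the bits between lsbit and msbit."""
    msb_mask = (1 << (msbit + 1)) - 1
    lsb_mask = (1 << lsbit) - 1
    return ~(msb_mask ^ lsb_mask) & 0xFFFFFFFF


def operands_from_mask(mask):
    """Recover (lsb, width) from the mask alone, as a printer would have to."""
    cleared = ~mask & 0xFFFFFFFF
    # Count trailing zeros, treating an empty field as 32 trailing zeros.
    lsb = (cleared & -cleared).bit_length() - 1 if cleared else 32
    width = cleared.bit_length() - lsb
    return lsb, width


if __name__ == "__main__":
    for word in (0xE7C00090, 0xE7C00190):
        lsbit, msbit = bfi_fields(word)
        via_mask = operands_from_mask(mask_from_fields(lsbit, msbit))
        print(f"0x{word:08x}: fields={ual_operands(lsbit, msbit)} via mask={via_mask}")
```

Under that assumption, the first pattern (lsbit 1, msbit 0) round-trips
through the mask to #32, #-32 and the second (lsbit 3, msbit 0) to #1, #2,
whereas keeping the fields as operands gives #1, #0 and #3, #-2. A
well-formed field such as lsbit 4, msbit 7 survives the round trip unchanged.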