Hello everyone,

At EuroLLVM I presented some testing work we have been doing on improving the correctness of the MC layer for ARM. There seemed to be interest from the community in seeing the results of this test suite.

Background
-----------
We are using a test suite, called MC Hammer, that compares MC with an ARM in-house implementation of the same functionality. The test space for this suite is very large (O(10 trillion) points), so we are concentrating on small slices at a time. For further details you can check out the talk I gave at EuroLLVM last month:
http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf

Results
--------
The results below are:
- for Thumb instructions
- for Cortex-A8 with VFPv3 and Advanced SIMDv1 extensions
- for the encode/decode loop, described on slide 11 of the talk
- all silent codegen bugs[1], that is, bugs where:
    - The reference bitpattern is defined and predictable.
    - The generated bitpattern is defined.
    - The generated bitpattern differs from the reference bitpattern.
- for LLVM r157187 (updated at 2012-05-21 14:18:06 BST)
- for LLVM built with no assertions[2]

The full log from MC Hammer is very large (>16GB), so it is only theoretically available on request. Of more interest is the attached triaged summary of these logs, which shows that six bugs of this kind were found (summary from the log reproduced below [3]).

If anyone is interested in seeing the results for any particular slice of the test space then I will consider requests.

Regards,
Richard Barton
ARM Ltd, Cambridge

===================
[1] Other types of failure can be detected by MC Hammer. These results can be published if there is interest in seeing them.

[2] LLVM built with assertions turned on hits SIGABRT on some bitpatterns. When this occurs the rest of the slice is not run, so to cover the whole slice we are using a version of LLVM with no assertions.
One would expect an assertion to catch a genuine codegen failure early, so hopefully bugs that would have triggered an assertion will still show up with assertions turned off.

[3] (reproduced bug triage from the error output)

[bug 1] - Decode for T1 B<c> encoding does not sign extend the 8-bit immediate properly
reproduce with:
echo 0x40 0xd0 | .../llvm-mc -triple thumbv7 -show-encoding -disassemble -show-inst
The re-encoding has an incorrect offset when the 6th bit of the immediate is set. A look at the code shows that the decoder uses a C++ cast to signed integer rather than calling the SignExtend function.

[bug 2] - STRD with negative offset re-encodes to positive offset and disassembles with no offset
reproduce with:
echo 0x61 0xe9 0x00 0x00 | .../llvm-mc -triple thumbv7 -show-encoding -disassemble -show-inst
Any encoding with the U bit unset re-encodes to the equivalent instruction with the U bit set.

[bug 3] - FSTMX/FLDMX not supported
This is similar to the already reported bug in the ARM encoding slice. These are pre-UAL representations of VLDM/VSTM with bit 0 set to 1 (an odd first register in the list) that are still available in ARMv7.

[bug 4] - B.W immediate is not being encoded correctly. I1 and I2 are not being decoded correctly
reproduce with:
echo 0x00 0xf0 0x00 0xb0 | .../llvm-mc -triple thumbv7 -show-encoding -disassemble -show-inst
This is the T4 encoding of the wide branch instruction. The branch target is encoded as
SignExtend(S:(NOT(J1 EOR S)):(NOT(J2 EOR S)):imm10:imm11:0, 32)
The current implementation misses out the EOR-ing.
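
As a rough illustration of what bugs 1 and 4 are about, here is a minimal sketch of the offset decoding for the T1 B<c> and T4 B.W encodings, written from the formula above. It is not the LLVM code: the helper names (signExtend, decodeT1CondBranchOffset, decodeT4BranchOffset) are made up for this example, and the field positions are my reading of the Thumb encodings, with the two halfwords passed in already assembled from the little-endian bytes.

```cpp
#include <cstdint>

// Sign-extend the low `bits` bits of `value` to a signed 32-bit integer.
// Hypothetical helper for illustration only; assumes 1 <= bits <= 32.
int32_t signExtend(uint32_t value, unsigned bits) {
    const uint32_t signBit = 1u << (bits - 1);
    const uint32_t mask = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    value &= mask;
    // Flipping and then subtracting the sign bit propagates it upwards.
    return static_cast<int32_t>((value ^ signBit) - signBit);
}

// T1 B<c>: 1101 cond imm8. Offset is SignExtend(imm8:'0', 32), a 9-bit value.
int32_t decodeT1CondBranchOffset(uint16_t hw) {
    const uint32_t imm8 = hw & 0xFFu;
    return signExtend(imm8 << 1, 9);
}

// T4 B.W: hw1 = 11110 S imm10, hw2 = 10 J1 1 J2 imm11.
// Offset is SignExtend(S:I1:I2:imm10:imm11:'0', 32), a 25-bit value,
// with I1 = NOT(J1 EOR S) and I2 = NOT(J2 EOR S).
int32_t decodeT4BranchOffset(uint16_t hw1, uint16_t hw2) {
    const uint32_t S     = (hw1 >> 10) & 1u;
    const uint32_t imm10 = hw1 & 0x3FFu;
    const uint32_t J1    = (hw2 >> 13) & 1u;
    const uint32_t J2    = (hw2 >> 11) & 1u;
    const uint32_t imm11 = hw2 & 0x7FFu;

    const uint32_t I1 = ~(J1 ^ S) & 1u;
    const uint32_t I2 = ~(J2 ^ S) & 1u;

    const uint32_t raw = (S << 24) | (I1 << 23) | (I2 << 22) |
                         (imm10 << 12) | (imm11 << 1);
    return signExtend(raw, 25);
}
```

For the bug 4 reproducer (halfwords 0xf000 and 0xb000) this sketch gives an offset of 0x400000, whereas using J1 and J2 directly in place of I1 and I2 would give 0x800000. For the bug 1 reproducer (halfword 0xd040) it gives an offset of 128.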
[bug 5] - PLD with negative offset is re-encoded with a positive offset
reproduce with:
echo 0x1f 0xf8 0x01 0xf0 | .../llvm-mc -triple thumbv7 -show-encoding -disassemble -show-inst

[bug 6] - VST1.16 with alignment 16 is re-encoded as unaligned
reproduce with:
echo 0x80 0xf9 0x10 0x04 | ../build-none/bin/llvm-mc -triple thumbv7 -show-encoding -disassemble -show-inst

-------------- next part --------------
Name: ca8_Thumb_enc_dec_all_diffsonly.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120524/8bf1fe88/attachment.txt>