Haoran Xu via llvm-dev
2020-Oct-22 04:12 UTC
[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk
A further bisect using opt's -opt-bisect-limit option shows that the following pass is causing the issue:

> BISECT: running pass (39) Early CSE w/ MemorySSA on function (main)

Haoran Xu <haoranxu510 at gmail.com> wrote on Wed, Oct 21, 2020 at 9:00 PM:

> I did a simple bisect on clang versions, and it seems that clang 8.0.0
> compiles the code correctly, but clang 9.0.0 does not.
> https://godbolt.org/z/676Grr <- if you change the clang version to
> 8.0.0, you will see the expected output in the 'output' section.
> I don't have the ability to bisect on clang git history. I would greatly
> appreciate it if anyone is willing to do that.
>
> Thanks!
>
> Haoran Xu <haoranxu510 at gmail.com> wrote on Wed, Oct 21, 2020 at 8:47 PM:
>
>> Hello,
>>
>> I'm really amazed to find out that under -O3, a simple piece of C code
>> generated from a brainfxxk-to-C transpiler is miscompiled.
>> As one probably knows, the C code transpiled from brainfxxk only contains
>> 3 kinds of statements:
>>
>>> (1) ++(*ptr) / --(*ptr)
>>> (2) ++ptr / --ptr
>>> (3) while (*ptr) { ... }
>>
>> where ptr is a uint8_t*.
>> So it seems very clear to me that the code contains no undefined behavior
>> (the pointer is uint8_t* and unsigned integer overflow is not UB).
>>
>> After further investigation, it seems that clang compiled this loop:
>>
>>> while (*ptr) {
>>>     --(*ptr);
>>>     ++ptr;
>>>     ++(*ptr);
>>>     --ptr;
>>> }
>>
>> to an unconditional infinite loop under -O3, resulting in the bug. The
>> code snippet above seems completely benign to me.
>>
>> I attached the offending program. With
>>
>>> clang a.c -O0
>>
>> it worked fine (it should print out an ASCII-art picture of the Mandelbrot
>> fractal). However, with -O1 or -O3, it goes into an infinite loop (in the
>> code snippet above) after printing out a few characters.
>>
>> I also tried UndefinedBehaviorSanitizer. Strangely, when compiling using
>>
>>> clang a.c -O3 -fsanitize=undefined
>>
>> the code worked again, with no infinite loop, and no undefined behavior
>> reported.
>>
>> So it seems to me to be an LLVM optimizer bug. I would greatly appreciate
>> it if anyone is willing to investigate.
>>
>> Best,
>> Haoran
David Blaikie via llvm-dev
2020-Oct-22 05:17 UTC
[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk
Might be worth running the c source file through creduce or similar to narrow it down a bit that way too.
Haoran Xu via llvm-dev
2020-Oct-22 05:32 UTC
[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk
I was just able to determine the offending IR code before and after the transformation. I'm now almost certain it's a bug in LLVM.

Before the transformation, we have the following IR (I renamed all %xxx for brevity):

> %1 = load i8, i8* %0, align 1
> %2 = add i8 %1, -1
> store i8 %2, i8* %0, align 1

The above IR is inside a loop, so the value stored at %0 can be different on each iteration. The optimization pass changed the IR above to the following:

> store i8 %3, i8* %0, align 1

where %3 is defined by

> %4 = load i8, i8* %0, align 1
> %3 = add i8 %4, -1

in an earlier piece of IR. Apparently the pass treated %2 as equivalent to %3 and applied CSE, without realizing that the contents of the memory at %0 may have been changed by the loop in the meantime.