thr3ads.net - llvm dev - [llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk [Oct 2020]

If this information is useful, please help other people find it:
Share via:

Haoran Xu via llvm-dev

2020-Oct-23 23:25 UTC

[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk

Hi Alina,

Thanks for your reply. I don't have commit access (it's my first patch
to
LLVM), so please do it on my behalf.
It's kind of hard for me to compile LLVM on my local machine, so can you
just check that the failing code I attached in original email works (i.e.
does not dead-loop and prints out a fracture-like ASCII art) in the
original email after Nikita's diff?

Thanks!
Haoran


Alina Sbirlea <asbirlea at google.com> 于2020年10月23日周五 下午3:48写道：
> Hi Haoran,
>
> Yes, you're right, the instances you added in the patch can also pass
> along the AAQI param.
> Do you have commit access to check in this change or should I do it on
> your behalf?
>
> The fix Nikita proposed (https://reviews.llvm.org/D90066) should resolve
> the original issue in AA and BatchAA, but please let us know if you're
> still seeing it with this patch.
>
> Best,
> Alina
>
> On Fri, Oct 23, 2020 at 1:25 PM Haoran Xu <haoranxu510 at gmail.com>
wrote:
>
>> Hi Alina,
>>
>> Thanks for your reply! That makes sense to me.
>> Regarding the other point I mentioned (the missing AAQI parameters in
>> some of the function calls), do you think they make sense, or is it
>> intentional that you use the non-batchAA version there?
>> My current understanding is that changing them to use BatchAA might
>> expose more issues that triggers the `isValueEqualInPotentialCycles`
bug,
>> but if you have the bug fixed, make them use BatchAA as well could
probably
>> improve performance a bit.
>>
>> Best,
>> Haoran
>>
>> Alina Sbirlea <asbirlea at google.com> 于2020年10月23日周五 上午11:21写道：
>>
>>> Some clarifications to the description in
>>> https://reviews.llvm.org/D89991:
>>> The clearing of the cache is not supposed to be done in BatchAA.
The
>>> patch introducing BatchAA was not a refactoring patch, and it did
not
>>> accidentally miss the lines to clear those caches.
>>>
>>> Nikita flagged the issue and I looked into it. The condition
that's
>>> broken in BatchAA is the `isValueEqualInPotentialCycles`. This
condition
>>> can be satisfied in a query and a result cached, but the condition
broken
>>> in a subsequent query. Working out a fix is not straightforward.
>>> Yes, clearing the whole cache is a big hammer we can use, and, yes,
it
>>> will lose all the performance benefits.
>>>
>>> For visibility, I uploaded a WIP fix:
>>> https://reviews.llvm.org/D90062
>>> There are still problems I have yet to solve in the above patch,
hence
>>> the lack of reviewers. However, please feel free to test it and let
me know
>>> if it resolves your failure.
>>>
>>> Best,
>>> Alina
>>>
>>>
>>> On Thu, Oct 22, 2020 at 6:47 AM Haoran Xu <haoranxu510 at
gmail.com> wrote:
>>>
>>>> I don't know anything about LLVM internals, but solely from
the commit
>>>> message of your diff I think it's very likely.
>>>> So my current understanding is that the ``shrink_and_clear()``
is
>>>> required to produce correct aliasing result, but if we naively
do it for
>>>> AAQI (as my patch proposed), we probably lose most of the
performance
>>>> benefits of BatchedAA.
>>>>
>>>> However, this is only my understanding after spending 2 hours
reading
>>>> the code from zero background knowledge, so I guess I can't
say much about
>>>> it.
>>>>
>>>>
>>>> Nikita Popov <nikita.ppv at gmail.com> 于2020年10月22日周四
上午6:41写道：
>>>>
>>>>> I suspect that you found the same issue as
>>>>>
https://github.com/llvm/llvm-project/commit/ddd0f083184feb489684f87cf28d5eb10e0a244e.
>>>>> We're trying to find a solution to that, but it's
not entirely simple.
>>>>>
>>>>> Nikita
>>>>>
>>>>> On Thu, Oct 22, 2020 at 3:33 PM Haoran Xu via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> I managed to bisected out the diff introducing the bug:
>>>>>>
https://github.com/llvm/llvm-project/commit/bfc779e491099213d74c057b5727bde976c7ba02
>>>>>>
>>>>>> And I managed to figured out a fix: it seems (to me)
that the diff is
>>>>>> mostly a code refactoring diff, but the author
accidentally removed two
>>>>>> lines of functional logic, resulting in the bug.
>>>>>> After adding the two lines back, the miscompiled code
works fine now.
>>>>>> However, I'm not certain of the performance
implications of clearing the
>>>>>> whole AAQI.
>>>>>>
>>>>>>> diff --git
a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>> b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>> index 232132958f7..9a8ca46093d 100644
>>>>>>> --- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>> +++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>> @@ -836,6 +836,8 @@ AliasResult
BasicAAResult::alias(const
>>>>>>> MemoryLocation &LocA,
>>>>>>>    AliasResult Alias = aliasCheck(LocA.Ptr,
LocA.Size, LocA.AATags,
>>>>>>> LocB.Ptr,
>>>>>>>                                   LocB.Size,
LocB.AATags, AAQI);
>>>>>>>
>>>>>>> +  AAQI.AliasCache.shrink_and_clear();
>>>>>>> +  AAQI.IsCapturedCache.shrink_and_clear();
>>>>>>>    VisitedPhiBBs.clear();
>>>>>>>    return Alias;
>>>>>>>  }
>>>>>>>
>>>>>>
>>>>>> I also noticed that the author forgot to change a few
function calls
>>>>>> to use his new API, but I'm not sure if it is a
correctness issue or not.
>>>>>> At least it's not causing the brainfxxk code
miscompilation.
>>>>>>
>>>>>>> diff --git a/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>> b/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>> index 06a33f659c1..94a9d12eb17 100644
>>>>>>> --- a/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>> +++ b/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>> @@ -231,7 +231,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>> CallBase *Call,
>>>>>>>
>>>>>>>    // If Loc is a constant memory location, the
call definitely
>>>>>>> could not
>>>>>>>    // modify the memory location.
>>>>>>> -  if (isModSet(Result) &&
pointsToConstantMemory(Loc, /*OrLocal*/
>>>>>>> false))
>>>>>>> +  if (isModSet(Result) &&
pointsToConstantMemory(Loc, AAQI,
>>>>>>> /*OrLocal*/ false))
>>>>>>>      Result = clearMod(Result);
>>>>>>>
>>>>>>>    return Result;
>>>>>>> @@ -308,7 +308,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>> CallBase *Call1,
>>>>>>>
>>>>>>>        // ModRefC1 indicates what Call1 might do to
Call2ArgLoc, and
>>>>>>> we use
>>>>>>>        // above ArgMask to update dependence info.
>>>>>>> -      ModRefInfo ModRefC1 = getModRefInfo(Call1,
Call2ArgLoc);
>>>>>>> +      ModRefInfo ModRefC1 = getModRefInfo(Call1,
Call2ArgLoc, AAQI);
>>>>>>>        ArgMask = intersectModRef(ArgMask,
ModRefC1);
>>>>>>>
>>>>>>>        // Conservatively clear IsMustAlias unless
only MustAlias is
>>>>>>> found.
>>>>>>> @@ -349,7 +349,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>> CallBase *Call1,
>>>>>>>        // might Mod Call1ArgLoc, then we care about
either a Mod or
>>>>>>> a Ref by
>>>>>>>        // Call2. If Call1 might Ref, then we care
only about a Mod
>>>>>>> by Call2.
>>>>>>>        ModRefInfo ArgModRefC1 =
getArgModRefInfo(Call1, Call1ArgIdx);
>>>>>>> -      ModRefInfo ModRefC2 = getModRefInfo(Call2,
Call1ArgLoc);
>>>>>>> +      ModRefInfo ModRefC2 = getModRefInfo(Call2,
Call1ArgLoc, AAQI);
>>>>>>>        if ((isModSet(ArgModRefC1) &&
isModOrRefSet(ModRefC2)) ||
>>>>>>>            (isRefSet(ArgModRefC1) &&
isModSet(ModRefC2)))
>>>>>>>          R = intersectModRef(unionModRef(R,
ArgModRefC1), Result);
>>>>>>>
>>>>>>
>>>>>> I'll submit a patch for review, but I would really
appreciate any
>>>>>> comments here as well.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Best,
>>>>>> Haoran
>>>>>>
>>>>>> Haoran Xu <haoranxu510 at gmail.com>
于2020年10月22日周四 上午12:12写道：
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> I just tried creduce, but it generated a code with
completely
>>>>>>> undefined behavior (out of range accesses, etc).
>>>>>>> The "interesting" criteria I used is
"timeout in -O1 but finishes in
>>>>>>> 20s in -O0". However, it turns out that the
undefined behavior in
>>>>>>> creduce-generated code just makes the criteria
"happens to" pass.
>>>>>>>
>>>>>>> Best,
>>>>>>> Haoran
>>>>>>>
>>>>>>> Haoran Xu <haoranxu510 at gmail.com>
于2020年10月21日周三 下午10:32写道：
>>>>>>>
>>>>>>>> I was just able to determine the offending IR
code before and after
>>>>>>>> the transformation. I'm now almost certain
it's a bug in LLVM.
>>>>>>>>
>>>>>>>> Before transformation, we have the following IR
(I renamed all %xxx
>>>>>>>> for brevity):
>>>>>>>>
>>>>>>>>> %1 = load i8, i8* %0, align 1
>>>>>>>>> %2 = add i8 %1, -1
>>>>>>>>> store i8 %2, i8* %0, align 1
>>>>>>>>>
>>>>>>>> The above IR is inside a loop, so the value in
%0 can be different
>>>>>>>> in each run.
>>>>>>>>
>>>>>>>> The optimization pass changed the IR above to
the following:
>>>>>>>>
>>>>>>>>> store i8 %3, i8* %0, align 1
>>>>>>>>>
>>>>>>>> where %3 is defined by
>>>>>>>>
>>>>>>>>> %4 = load i8, i8* %0, align 1
>>>>>>>>> %3 = add i8 %4, -1
>>>>>>>>>
>>>>>>>> in an earlier piece of IR.
>>>>>>>>
>>>>>>>> Apparently the pass treated %3 the same thing
as %2 and it fired
>>>>>>>> CSE, without realizing that the content in %0
may have been changed by the
>>>>>>>> loop.
>>>>>>>>
>>>>>>>>
>>>>>>>> David Blaikie <dblaikie at gmail.com>
于2020年10月21日周三 下午10:18写道：
>>>>>>>>
>>>>>>>>> Might be worth running the c source file
through creduce or
>>>>>>>>> similar to narrow it down a bit that way
too.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 21, 2020 at 9:12 PM Haoran Xu
via llvm-dev <
>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>> A further bisect using opt's
-opt-bisect-limit option shows that
>>>>>>>>>> the following pass is causing the
issue:
>>>>>>>>>>
>>>>>>>>>>> BISECT: running pass (39) Early CSE
w/ MemorySSA on function
>>>>>>>>>>> (main)
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Haoran Xu <haoranxu510 at
gmail.com> 于2020年10月21日周三 下午9:00写道：
>>>>>>>>>>
>>>>>>>>>>> I did a simple bisect on clang
version, and it seems like clang
>>>>>>>>>>> 8.0.0 works correctly, but clang
9.0.0 failed to compile the code correctly.
>>>>>>>>>>> https://godbolt.org/z/676Grr  <-
if you change the clang
>>>>>>>>>>> version to 8.0.0, you will see the
expected output in 'output' section.
>>>>>>>>>>> I don't have the ability to
bisect on clang git history. I would
>>>>>>>>>>> greatly appreciate it if any one is
willing to do that.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Haoran Xu <haoranxu510 at
gmail.com> 于2020年10月21日周三 下午8:47写道：
>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm really amazed to find
out that under -O3, a simple piece of
>>>>>>>>>>>> C code generated from a
brainfxxk-to-C transpiler is miscompiled.
>>>>>>>>>>>> As one probably know, the C
code transpiled from brainfxxk only
>>>>>>>>>>>> contains 3 kind of statements:
>>>>>>>>>>>>
>>>>>>>>>>>>> (1) ++(*ptr) / --(*ptr)
>>>>>>>>>>>>> (2) ++ptr / --ptr
>>>>>>>>>>>>> (3) while (*ptr) { ... }
>>>>>>>>>>>>>
>>>>>>>>>>>> where ptr is a uint8_t*.
>>>>>>>>>>>> So it seems very clear to me
that the code contains no
>>>>>>>>>>>> undefined behavior (the pointer
is uint8_t* and unsigned integer overflow
>>>>>>>>>>>> is not UD).
>>>>>>>>>>>>
>>>>>>>>>>>> After further investigation, it
seems like clang compiled this
>>>>>>>>>>>> loop:
>>>>>>>>>>>>
>>>>>>>>>>>>> while (*ptr) {
>>>>>>>>>>>>>  --(*ptr);
>>>>>>>>>>>>>  ++ptr;
>>>>>>>>>>>>>  ++(*ptr);
>>>>>>>>>>>>>  --ptr;
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>  to an unconditional infinite
loop under -O3, resulting in the
>>>>>>>>>>>> bug. The code snippet above
seems completely benign to me.
>>>>>>>>>>>>
>>>>>>>>>>>> I attached the offending
program. With
>>>>>>>>>>>>
>>>>>>>>>>>>> clang a.c -O0
>>>>>>>>>>>>>
>>>>>>>>>>>> it worked fine (it should print
out an ASCII-art picture of
>>>>>>>>>>>> mandelbrot fracture). However,
with -O1 or -O3, it goes into a dead loop
>>>>>>>>>>>> (in the code snippet above)
after printing out a few characters.
>>>>>>>>>>>>
>>>>>>>>>>>> I also tried
UndefinedBehaviorSanitizer. Strangely, when
>>>>>>>>>>>> compiling using
>>>>>>>>>>>>
>>>>>>>>>>>>> clang a.c -O3 
-fsanitize=undefined
>>>>>>>>>>>>>
>>>>>>>>>>>> the code worked again, with no
infinite loop, and no undefined
>>>>>>>>>>>> behavior reported.
>>>>>>>>>>>>
>>>>>>>>>>>> So it seems to me a LLVM
optimizer bug. I would greatly
>>>>>>>>>>>> appreciate if any one is
willing to investigate.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Haoran
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>
>>>>>>>>>
_______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201023/44cee7e4/attachment.html>

Alina Sbirlea via llvm-dev

2020-Oct-24 00:15 UTC

head link

[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk

Hi Haoran,

I think you may have come across an issue that was fixed already. I cannot
reproduce the failure at ToT.
In godbolt it reproduces with the clang11 release, but not with trunk.

Best,
Alina


On Fri, Oct 23, 2020 at 4:25 PM Haoran Xu <haoranxu510 at gmail.com>
wrote:
> Hi Alina,
>
> Thanks for your reply. I don't have commit access (it's my first
patch to
> LLVM), so please do it on my behalf.
> It's kind of hard for me to compile LLVM on my local machine, so can
you
> just check that the failing code I attached in original email works (i.e.
> does not dead-loop and prints out a fracture-like ASCII art) in the
> original email after Nikita's diff?
>
> Thanks!
> Haoran
>
>
> Alina Sbirlea <asbirlea at google.com> 于2020年10月23日周五 下午3:48写道：
>
>> Hi Haoran,
>>
>> Yes, you're right, the instances you added in the patch can also
pass
>> along the AAQI param.
>> Do you have commit access to check in this change or should I do it on
>> your behalf?
>>
>> The fix Nikita proposed (https://reviews.llvm.org/D90066) should
resolve
>> the original issue in AA and BatchAA, but please let us know if
you're
>> still seeing it with this patch.
>>
>> Best,
>> Alina
>>
>> On Fri, Oct 23, 2020 at 1:25 PM Haoran Xu <haoranxu510 at
gmail.com> wrote:
>>
>>> Hi Alina,
>>>
>>> Thanks for your reply! That makes sense to me.
>>> Regarding the other point I mentioned (the missing AAQI parameters
in
>>> some of the function calls), do you think they make sense, or is it
>>> intentional that you use the non-batchAA version there?
>>> My current understanding is that changing them to use BatchAA might
>>> expose more issues that triggers the
`isValueEqualInPotentialCycles` bug,
>>> but if you have the bug fixed, make them use BatchAA as well could
probably
>>> improve performance a bit.
>>>
>>> Best,
>>> Haoran
>>>
>>> Alina Sbirlea <asbirlea at google.com> 于2020年10月23日周五
上午11:21写道：
>>>
>>>> Some clarifications to the description in
>>>> https://reviews.llvm.org/D89991:
>>>> The clearing of the cache is not supposed to be done in
BatchAA. The
>>>> patch introducing BatchAA was not a refactoring patch, and it
did not
>>>> accidentally miss the lines to clear those caches.
>>>>
>>>> Nikita flagged the issue and I looked into it. The condition
that's
>>>> broken in BatchAA is the `isValueEqualInPotentialCycles`. This
condition
>>>> can be satisfied in a query and a result cached, but the
condition broken
>>>> in a subsequent query. Working out a fix is not
straightforward.
>>>> Yes, clearing the whole cache is a big hammer we can use, and,
yes, it
>>>> will lose all the performance benefits.
>>>>
>>>> For visibility, I uploaded a WIP fix:
>>>> https://reviews.llvm.org/D90062
>>>> There are still problems I have yet to solve in the above
patch, hence
>>>> the lack of reviewers. However, please feel free to test it and
let me know
>>>> if it resolves your failure.
>>>>
>>>> Best,
>>>> Alina
>>>>
>>>>
>>>> On Thu, Oct 22, 2020 at 6:47 AM Haoran Xu <haoranxu510 at
gmail.com>
>>>> wrote:
>>>>
>>>>> I don't know anything about LLVM internals, but solely
from the commit
>>>>> message of your diff I think it's very likely.
>>>>> So my current understanding is that the
``shrink_and_clear()`` is
>>>>> required to produce correct aliasing result, but if we
naively do it for
>>>>> AAQI (as my patch proposed), we probably lose most of the
performance
>>>>> benefits of BatchedAA.
>>>>>
>>>>> However, this is only my understanding after spending 2
hours reading
>>>>> the code from zero background knowledge, so I guess I
can't say much about
>>>>> it.
>>>>>
>>>>>
>>>>> Nikita Popov <nikita.ppv at gmail.com> 于2020年10月22日周四
上午6:41写道：
>>>>>
>>>>>> I suspect that you found the same issue as
>>>>>>
https://github.com/llvm/llvm-project/commit/ddd0f083184feb489684f87cf28d5eb10e0a244e.
>>>>>> We're trying to find a solution to that, but
it's not entirely simple.
>>>>>>
>>>>>> Nikita
>>>>>>
>>>>>> On Thu, Oct 22, 2020 at 3:33 PM Haoran Xu via llvm-dev
<
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> I managed to bisected out the diff introducing the
bug:
>>>>>>>
https://github.com/llvm/llvm-project/commit/bfc779e491099213d74c057b5727bde976c7ba02
>>>>>>>
>>>>>>> And I managed to figured out a fix: it seems (to
me) that the diff
>>>>>>> is mostly a code refactoring diff, but the author
accidentally removed two
>>>>>>> lines of functional logic, resulting in the bug.
>>>>>>> After adding the two lines back, the miscompiled
code works fine
>>>>>>> now. However, I'm not certain of the
performance implications of clearing
>>>>>>> the whole AAQI.
>>>>>>>
>>>>>>>> diff --git
a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>> b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>> index 232132958f7..9a8ca46093d 100644
>>>>>>>> --- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>> +++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>> @@ -836,6 +836,8 @@ AliasResult
BasicAAResult::alias(const
>>>>>>>> MemoryLocation &LocA,
>>>>>>>>    AliasResult Alias = aliasCheck(LocA.Ptr,
LocA.Size, LocA.AATags,
>>>>>>>> LocB.Ptr,
>>>>>>>>                                   LocB.Size,
LocB.AATags, AAQI);
>>>>>>>>
>>>>>>>> +  AAQI.AliasCache.shrink_and_clear();
>>>>>>>> +  AAQI.IsCapturedCache.shrink_and_clear();
>>>>>>>>    VisitedPhiBBs.clear();
>>>>>>>>    return Alias;
>>>>>>>>  }
>>>>>>>>
>>>>>>>
>>>>>>> I also noticed that the author forgot to change a
few function calls
>>>>>>> to use his new API, but I'm not sure if it is a
correctness issue or not.
>>>>>>> At least it's not causing the brainfxxk code
miscompilation.
>>>>>>>
>>>>>>>> diff --git
a/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>> b/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>> index 06a33f659c1..94a9d12eb17 100644
>>>>>>>> --- a/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>> +++ b/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>> @@ -231,7 +231,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>>> CallBase *Call,
>>>>>>>>
>>>>>>>>    // If Loc is a constant memory location, the
call definitely
>>>>>>>> could not
>>>>>>>>    // modify the memory location.
>>>>>>>> -  if (isModSet(Result) &&
pointsToConstantMemory(Loc, /*OrLocal*/
>>>>>>>> false))
>>>>>>>> +  if (isModSet(Result) &&
pointsToConstantMemory(Loc, AAQI,
>>>>>>>> /*OrLocal*/ false))
>>>>>>>>      Result = clearMod(Result);
>>>>>>>>
>>>>>>>>    return Result;
>>>>>>>> @@ -308,7 +308,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>>> CallBase *Call1,
>>>>>>>>
>>>>>>>>        // ModRefC1 indicates what Call1 might
do to Call2ArgLoc,
>>>>>>>> and we use
>>>>>>>>        // above ArgMask to update dependence
info.
>>>>>>>> -      ModRefInfo ModRefC1 =
getModRefInfo(Call1, Call2ArgLoc);
>>>>>>>> +      ModRefInfo ModRefC1 =
getModRefInfo(Call1, Call2ArgLoc,
>>>>>>>> AAQI);
>>>>>>>>        ArgMask = intersectModRef(ArgMask,
ModRefC1);
>>>>>>>>
>>>>>>>>        // Conservatively clear IsMustAlias
unless only MustAlias is
>>>>>>>> found.
>>>>>>>> @@ -349,7 +349,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>>> CallBase *Call1,
>>>>>>>>        // might Mod Call1ArgLoc, then we care
about either a Mod or
>>>>>>>> a Ref by
>>>>>>>>        // Call2. If Call1 might Ref, then we
care only about a Mod
>>>>>>>> by Call2.
>>>>>>>>        ModRefInfo ArgModRefC1 =
getArgModRefInfo(Call1,
>>>>>>>> Call1ArgIdx);
>>>>>>>> -      ModRefInfo ModRefC2 =
getModRefInfo(Call2, Call1ArgLoc);
>>>>>>>> +      ModRefInfo ModRefC2 =
getModRefInfo(Call2, Call1ArgLoc,
>>>>>>>> AAQI);
>>>>>>>>        if ((isModSet(ArgModRefC1) &&
isModOrRefSet(ModRefC2)) ||
>>>>>>>>            (isRefSet(ArgModRefC1) &&
isModSet(ModRefC2)))
>>>>>>>>          R = intersectModRef(unionModRef(R,
ArgModRefC1), Result);
>>>>>>>>
>>>>>>>
>>>>>>> I'll submit a patch for review, but I would
really appreciate any
>>>>>>> comments here as well.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Best,
>>>>>>> Haoran
>>>>>>>
>>>>>>> Haoran Xu <haoranxu510 at gmail.com>
于2020年10月22日周四 上午12:12写道：
>>>>>>>
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> I just tried creduce, but it generated a code
with completely
>>>>>>>> undefined behavior (out of range accesses,
etc).
>>>>>>>> The "interesting" criteria I used is
"timeout in -O1 but finishes
>>>>>>>> in 20s in -O0". However, it turns out that
the undefined behavior in
>>>>>>>> creduce-generated code just makes the criteria
"happens to" pass.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Haoran
>>>>>>>>
>>>>>>>> Haoran Xu <haoranxu510 at gmail.com>
于2020年10月21日周三 下午10:32写道：
>>>>>>>>
>>>>>>>>> I was just able to determine the offending
IR code before and
>>>>>>>>> after the transformation. I'm now
almost certain it's a bug in LLVM.
>>>>>>>>>
>>>>>>>>> Before transformation, we have the
following IR (I renamed all
>>>>>>>>> %xxx for brevity):
>>>>>>>>>
>>>>>>>>>> %1 = load i8, i8* %0, align 1
>>>>>>>>>> %2 = add i8 %1, -1
>>>>>>>>>> store i8 %2, i8* %0, align 1
>>>>>>>>>>
>>>>>>>>> The above IR is inside a loop, so the value
in %0 can be different
>>>>>>>>> in each run.
>>>>>>>>>
>>>>>>>>> The optimization pass changed the IR above
to the following:
>>>>>>>>>
>>>>>>>>>> store i8 %3, i8* %0, align 1
>>>>>>>>>>
>>>>>>>>> where %3 is defined by
>>>>>>>>>
>>>>>>>>>> %4 = load i8, i8* %0, align 1
>>>>>>>>>> %3 = add i8 %4, -1
>>>>>>>>>>
>>>>>>>>> in an earlier piece of IR.
>>>>>>>>>
>>>>>>>>> Apparently the pass treated %3 the same
thing as %2 and it fired
>>>>>>>>> CSE, without realizing that the content in
%0 may have been changed by the
>>>>>>>>> loop.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> David Blaikie <dblaikie at gmail.com>
于2020年10月21日周三 下午10:18写道：
>>>>>>>>>
>>>>>>>>>> Might be worth running the c source
file through creduce or
>>>>>>>>>> similar to narrow it down a bit that
way too.
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 21, 2020 at 9:12 PM Haoran
Xu via llvm-dev <
>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> A further bisect using opt's
-opt-bisect-limit option shows
>>>>>>>>>>> that the following pass is causing
the issue:
>>>>>>>>>>>
>>>>>>>>>>>> BISECT: running pass (39) Early
CSE w/ MemorySSA on function
>>>>>>>>>>>> (main)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Haoran Xu <haoranxu510 at
gmail.com> 于2020年10月21日周三 下午9:00写道：
>>>>>>>>>>>
>>>>>>>>>>>> I did a simple bisect on clang
version, and it seems like clang
>>>>>>>>>>>> 8.0.0 works correctly, but
clang 9.0.0 failed to compile the code correctly.
>>>>>>>>>>>> https://godbolt.org/z/676Grr 
<- if you change the clang
>>>>>>>>>>>> version to 8.0.0, you will see
the expected output in 'output' section.
>>>>>>>>>>>> I don't have the ability to
bisect on clang git history. I
>>>>>>>>>>>> would greatly appreciate it if
any one is willing to do that.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Haoran Xu <haoranxu510 at
gmail.com> 于2020年10月21日周三 下午8:47写道：
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm really amazed to
find out that under -O3, a simple piece
>>>>>>>>>>>>> of C code generated from a
brainfxxk-to-C transpiler is miscompiled.
>>>>>>>>>>>>> As one probably know, the C
code transpiled from brainfxxk
>>>>>>>>>>>>> only contains 3 kind of
statements:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> (1) ++(*ptr) / --(*ptr)
>>>>>>>>>>>>>> (2) ++ptr / --ptr
>>>>>>>>>>>>>> (3) while (*ptr) { ...
}
>>>>>>>>>>>>>>
>>>>>>>>>>>>> where ptr is a uint8_t*.
>>>>>>>>>>>>> So it seems very clear to
me that the code contains no
>>>>>>>>>>>>> undefined behavior (the
pointer is uint8_t* and unsigned integer overflow
>>>>>>>>>>>>> is not UD).
>>>>>>>>>>>>>
>>>>>>>>>>>>> After further
investigation, it seems like clang compiled this
>>>>>>>>>>>>> loop:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> while (*ptr) {
>>>>>>>>>>>>>>  --(*ptr);
>>>>>>>>>>>>>>  ++ptr;
>>>>>>>>>>>>>>  ++(*ptr);
>>>>>>>>>>>>>>  --ptr;
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>  to an unconditional
infinite loop under -O3, resulting in the
>>>>>>>>>>>>> bug. The code snippet above
seems completely benign to me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I attached the offending
program. With
>>>>>>>>>>>>>
>>>>>>>>>>>>>> clang a.c -O0
>>>>>>>>>>>>>>
>>>>>>>>>>>>> it worked fine (it should
print out an ASCII-art picture of
>>>>>>>>>>>>> mandelbrot fracture).
However, with -O1 or -O3, it goes into a dead loop
>>>>>>>>>>>>> (in the code snippet above)
after printing out a few characters.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also tried
UndefinedBehaviorSanitizer. Strangely, when
>>>>>>>>>>>>> compiling using
>>>>>>>>>>>>>
>>>>>>>>>>>>>> clang a.c -O3 
-fsanitize=undefined
>>>>>>>>>>>>>>
>>>>>>>>>>>>> the code worked again, with
no infinite loop, and no undefined
>>>>>>>>>>>>> behavior reported.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So it seems to me a LLVM
optimizer bug. I would greatly
>>>>>>>>>>>>> appreciate if any one is
willing to investigate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Haoran
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>>>>>
_______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201023/e6214c16/attachment-0001.html>

Haoran Xu via llvm-dev

2020-Oct-24 00:20 UTC

head link

[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk

That's interesting. I didn't try trunk since I thought LLVM 11 is
already
the latest. Thanks again!

Alina Sbirlea <asbirlea at google.com> 于2020年10月23日周五 下午5:15写道：
> Hi Haoran,
>
> I think you may have come across an issue that was fixed already. I cannot
> reproduce the failure at ToT.
> In godbolt it reproduces with the clang11 release, but not with trunk.
>
> Best,
> Alina
>
>
> On Fri, Oct 23, 2020 at 4:25 PM Haoran Xu <haoranxu510 at gmail.com>
wrote:
>
>> Hi Alina,
>>
>> Thanks for your reply. I don't have commit access (it's my
first patch to
>> LLVM), so please do it on my behalf.
>> It's kind of hard for me to compile LLVM on my local machine, so
can you
>> just check that the failing code I attached in original email works
(i.e.
>> does not dead-loop and prints out a fracture-like ASCII art) in the
>> original email after Nikita's diff?
>>
>> Thanks!
>> Haoran
>>
>>
>> Alina Sbirlea <asbirlea at google.com> 于2020年10月23日周五 下午3:48写道：
>>
>>> Hi Haoran,
>>>
>>> Yes, you're right, the instances you added in the patch can
also pass
>>> along the AAQI param.
>>> Do you have commit access to check in this change or should I do it
on
>>> your behalf?
>>>
>>> The fix Nikita proposed (https://reviews.llvm.org/D90066) should
>>> resolve the original issue in AA and BatchAA, but please let us
know if
>>> you're still seeing it with this patch.
>>>
>>> Best,
>>> Alina
>>>
>>> On Fri, Oct 23, 2020 at 1:25 PM Haoran Xu <haoranxu510 at
gmail.com> wrote:
>>>
>>>> Hi Alina,
>>>>
>>>> Thanks for your reply! That makes sense to me.
>>>> Regarding the other point I mentioned (the missing AAQI
parameters in
>>>> some of the function calls), do you think they make sense, or
is it
>>>> intentional that you use the non-batchAA version there?
>>>> My current understanding is that changing them to use BatchAA
might
>>>> expose more issues that triggers the
`isValueEqualInPotentialCycles` bug,
>>>> but if you have the bug fixed, make them use BatchAA as well
could probably
>>>> improve performance a bit.
>>>>
>>>> Best,
>>>> Haoran
>>>>
>>>> Alina Sbirlea <asbirlea at google.com> 于2020年10月23日周五
上午11:21写道：
>>>>
>>>>> Some clarifications to the description in
>>>>> https://reviews.llvm.org/D89991:
>>>>> The clearing of the cache is not supposed to be done in
BatchAA. The
>>>>> patch introducing BatchAA was not a refactoring patch, and
it did not
>>>>> accidentally miss the lines to clear those caches.
>>>>>
>>>>> Nikita flagged the issue and I looked into it. The
condition that's
>>>>> broken in BatchAA is the `isValueEqualInPotentialCycles`.
This condition
>>>>> can be satisfied in a query and a result cached, but the
condition broken
>>>>> in a subsequent query. Working out a fix is not
straightforward.
>>>>> Yes, clearing the whole cache is a big hammer we can use,
and, yes, it
>>>>> will lose all the performance benefits.
>>>>>
>>>>> For visibility, I uploaded a WIP fix:
>>>>> https://reviews.llvm.org/D90062
>>>>> There are still problems I have yet to solve in the above
patch, hence
>>>>> the lack of reviewers. However, please feel free to test it
and let me know
>>>>> if it resolves your failure.
>>>>>
>>>>> Best,
>>>>> Alina
>>>>>
>>>>>
>>>>> On Thu, Oct 22, 2020 at 6:47 AM Haoran Xu <haoranxu510
at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I don't know anything about LLVM internals, but
solely from the
>>>>>> commit message of your diff I think it's very
likely.
>>>>>> So my current understanding is that the
``shrink_and_clear()`` is
>>>>>> required to produce correct aliasing result, but if we
naively do it for
>>>>>> AAQI (as my patch proposed), we probably lose most of
the performance
>>>>>> benefits of BatchedAA.
>>>>>>
>>>>>> However, this is only my understanding after spending 2
hours reading
>>>>>> the code from zero background knowledge, so I guess I
can't say much about
>>>>>> it.
>>>>>>
>>>>>>
>>>>>> Nikita Popov <nikita.ppv at gmail.com>
于2020年10月22日周四 上午6:41写道：
>>>>>>
>>>>>>> I suspect that you found the same issue as
>>>>>>>
https://github.com/llvm/llvm-project/commit/ddd0f083184feb489684f87cf28d5eb10e0a244e.
>>>>>>> We're trying to find a solution to that, but
it's not entirely simple.
>>>>>>>
>>>>>>> Nikita
>>>>>>>
>>>>>>> On Thu, Oct 22, 2020 at 3:33 PM Haoran Xu via
llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> I managed to bisected out the diff introducing
the bug:
>>>>>>>>
https://github.com/llvm/llvm-project/commit/bfc779e491099213d74c057b5727bde976c7ba02
>>>>>>>>
>>>>>>>> And I managed to figured out a fix: it seems
(to me) that the diff
>>>>>>>> is mostly a code refactoring diff, but the
author accidentally removed two
>>>>>>>> lines of functional logic, resulting in the
bug.
>>>>>>>> After adding the two lines back, the
miscompiled code works fine
>>>>>>>> now. However, I'm not certain of the
performance implications of clearing
>>>>>>>> the whole AAQI.
>>>>>>>>
>>>>>>>>> diff --git
a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>>> b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>>> index 232132958f7..9a8ca46093d 100644
>>>>>>>>> ---
a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>>> +++
b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
>>>>>>>>> @@ -836,6 +836,8 @@ AliasResult
BasicAAResult::alias(const
>>>>>>>>> MemoryLocation &LocA,
>>>>>>>>>    AliasResult Alias = aliasCheck(LocA.Ptr,
LocA.Size,
>>>>>>>>> LocA.AATags, LocB.Ptr,
>>>>>>>>>                                  
LocB.Size, LocB.AATags, AAQI);
>>>>>>>>>
>>>>>>>>> +  AAQI.AliasCache.shrink_and_clear();
>>>>>>>>> +  AAQI.IsCapturedCache.shrink_and_clear();
>>>>>>>>>    VisitedPhiBBs.clear();
>>>>>>>>>    return Alias;
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>
>>>>>>>> I also noticed that the author forgot to change
a few function
>>>>>>>> calls to use his new API, but I'm not sure
if it is a correctness issue or
>>>>>>>> not. At least it's not causing the
brainfxxk code miscompilation.
>>>>>>>>
>>>>>>>>> diff --git
a/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>>> b/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>>> index 06a33f659c1..94a9d12eb17 100644
>>>>>>>>> --- a/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>>> +++ b/llvm/lib/Analysis/AliasAnalysis.cpp
>>>>>>>>> @@ -231,7 +231,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>>>> CallBase *Call,
>>>>>>>>>
>>>>>>>>>    // If Loc is a constant memory location,
the call definitely
>>>>>>>>> could not
>>>>>>>>>    // modify the memory location.
>>>>>>>>> -  if (isModSet(Result) &&
pointsToConstantMemory(Loc, /*OrLocal*/
>>>>>>>>> false))
>>>>>>>>> +  if (isModSet(Result) &&
pointsToConstantMemory(Loc, AAQI,
>>>>>>>>> /*OrLocal*/ false))
>>>>>>>>>      Result = clearMod(Result);
>>>>>>>>>
>>>>>>>>>    return Result;
>>>>>>>>> @@ -308,7 +308,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>>>> CallBase *Call1,
>>>>>>>>>
>>>>>>>>>        // ModRefC1 indicates what Call1
might do to Call2ArgLoc,
>>>>>>>>> and we use
>>>>>>>>>        // above ArgMask to update
dependence info.
>>>>>>>>> -      ModRefInfo ModRefC1 =
getModRefInfo(Call1, Call2ArgLoc);
>>>>>>>>> +      ModRefInfo ModRefC1 =
getModRefInfo(Call1, Call2ArgLoc,
>>>>>>>>> AAQI);
>>>>>>>>>        ArgMask = intersectModRef(ArgMask,
ModRefC1);
>>>>>>>>>
>>>>>>>>>        // Conservatively clear IsMustAlias
unless only MustAlias
>>>>>>>>> is found.
>>>>>>>>> @@ -349,7 +349,7 @@ ModRefInfo
AAResults::getModRefInfo(const
>>>>>>>>> CallBase *Call1,
>>>>>>>>>        // might Mod Call1ArgLoc, then we
care about either a Mod
>>>>>>>>> or a Ref by
>>>>>>>>>        // Call2. If Call1 might Ref, then
we care only about a Mod
>>>>>>>>> by Call2.
>>>>>>>>>        ModRefInfo ArgModRefC1 =
getArgModRefInfo(Call1,
>>>>>>>>> Call1ArgIdx);
>>>>>>>>> -      ModRefInfo ModRefC2 =
getModRefInfo(Call2, Call1ArgLoc);
>>>>>>>>> +      ModRefInfo ModRefC2 =
getModRefInfo(Call2, Call1ArgLoc,
>>>>>>>>> AAQI);
>>>>>>>>>        if ((isModSet(ArgModRefC1)
&& isModOrRefSet(ModRefC2)) ||
>>>>>>>>>            (isRefSet(ArgModRefC1)
&& isModSet(ModRefC2)))
>>>>>>>>>          R = intersectModRef(unionModRef(R,
ArgModRefC1), Result);
>>>>>>>>>
>>>>>>>>
>>>>>>>> I'll submit a patch for review, but I would
really appreciate any
>>>>>>>> comments here as well.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Haoran
>>>>>>>>
>>>>>>>> Haoran Xu <haoranxu510 at gmail.com>
于2020年10月22日周四 上午12:12写道：
>>>>>>>>
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> I just tried creduce, but it generated a
code with completely
>>>>>>>>> undefined behavior (out of range accesses,
etc).
>>>>>>>>> The "interesting" criteria I used
is "timeout in -O1 but finishes
>>>>>>>>> in 20s in -O0". However, it turns out
that the undefined behavior in
>>>>>>>>> creduce-generated code just makes the
criteria "happens to" pass.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Haoran
>>>>>>>>>
>>>>>>>>> Haoran Xu <haoranxu510 at gmail.com>
于2020年10月21日周三 下午10:32写道：
>>>>>>>>>
>>>>>>>>>> I was just able to determine the
offending IR code before and
>>>>>>>>>> after the transformation. I'm now
almost certain it's a bug in LLVM.
>>>>>>>>>>
>>>>>>>>>> Before transformation, we have the
following IR (I renamed all
>>>>>>>>>> %xxx for brevity):
>>>>>>>>>>
>>>>>>>>>>> %1 = load i8, i8* %0, align 1
>>>>>>>>>>> %2 = add i8 %1, -1
>>>>>>>>>>> store i8 %2, i8* %0, align 1
>>>>>>>>>>>
>>>>>>>>>> The above IR is inside a loop, so the
value in %0 can be
>>>>>>>>>> different in each run.
>>>>>>>>>>
>>>>>>>>>> The optimization pass changed the IR
above to the following:
>>>>>>>>>>
>>>>>>>>>>> store i8 %3, i8* %0, align 1
>>>>>>>>>>>
>>>>>>>>>> where %3 is defined by
>>>>>>>>>>
>>>>>>>>>>> %4 = load i8, i8* %0, align 1
>>>>>>>>>>> %3 = add i8 %4, -1
>>>>>>>>>>>
>>>>>>>>>> in an earlier piece of IR.
>>>>>>>>>>
>>>>>>>>>> Apparently the pass treated %3 the same
thing as %2 and it fired
>>>>>>>>>> CSE, without realizing that the content
in %0 may have been changed by the
>>>>>>>>>> loop.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> David Blaikie <dblaikie at
gmail.com> 于2020年10月21日周三 下午10:18写道：
>>>>>>>>>>
>>>>>>>>>>> Might be worth running the c source
file through creduce or
>>>>>>>>>>> similar to narrow it down a bit
that way too.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 21, 2020 at 9:12 PM
Haoran Xu via llvm-dev <
>>>>>>>>>>> llvm-dev at lists.llvm.org>
wrote:
>>>>>>>>>>>
>>>>>>>>>>>> A further bisect using
opt's -opt-bisect-limit option shows
>>>>>>>>>>>> that the following pass is
causing the issue:
>>>>>>>>>>>>
>>>>>>>>>>>>> BISECT: running pass (39)
Early CSE w/ MemorySSA on function
>>>>>>>>>>>>> (main)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Haoran Xu <haoranxu510 at
gmail.com> 于2020年10月21日周三 下午9:00写道：
>>>>>>>>>>>>
>>>>>>>>>>>>> I did a simple bisect on
clang version, and it seems like
>>>>>>>>>>>>> clang 8.0.0 works
correctly, but clang 9.0.0 failed to compile the code
>>>>>>>>>>>>> correctly.
>>>>>>>>>>>>>
https://godbolt.org/z/676Grr  <- if you change the clang
>>>>>>>>>>>>> version to 8.0.0, you will
see the expected output in 'output' section.
>>>>>>>>>>>>> I don't have the
ability to bisect on clang git history. I
>>>>>>>>>>>>> would greatly appreciate it
if any one is willing to do that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Haoran Xu <haoranxu510
at gmail.com> 于2020年10月21日周三 下午8:47写道：
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm really amazed
to find out that under -O3, a simple piece
>>>>>>>>>>>>>> of C code generated
from a brainfxxk-to-C transpiler is miscompiled.
>>>>>>>>>>>>>> As one probably know,
the C code transpiled from brainfxxk
>>>>>>>>>>>>>> only contains 3 kind of
statements:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (1) ++(*ptr) /
--(*ptr)
>>>>>>>>>>>>>>> (2) ++ptr / --ptr
>>>>>>>>>>>>>>> (3) while (*ptr) {
... }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> where ptr is a
uint8_t*.
>>>>>>>>>>>>>> So it seems very clear
to me that the code contains no
>>>>>>>>>>>>>> undefined behavior (the
pointer is uint8_t* and unsigned integer overflow
>>>>>>>>>>>>>> is not UD).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After further
investigation, it seems like clang compiled
>>>>>>>>>>>>>> this loop:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> while (*ptr) {
>>>>>>>>>>>>>>>  --(*ptr);
>>>>>>>>>>>>>>>  ++ptr;
>>>>>>>>>>>>>>>  ++(*ptr);
>>>>>>>>>>>>>>>  --ptr;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  to an unconditional
infinite loop under -O3, resulting in
>>>>>>>>>>>>>> the bug. The code
snippet above seems completely benign to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I attached the
offending program. With
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> clang a.c -O0
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> it worked fine (it
should print out an ASCII-art picture of
>>>>>>>>>>>>>> mandelbrot fracture).
However, with -O1 or -O3, it goes into a dead loop
>>>>>>>>>>>>>> (in the code snippet
above) after printing out a few characters.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also tried
UndefinedBehaviorSanitizer. Strangely, when
>>>>>>>>>>>>>> compiling using
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> clang a.c -O3 
-fsanitize=undefined
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the code worked again,
with no infinite loop, and no
>>>>>>>>>>>>>> undefined behavior
reported.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So it seems to me a
LLVM optimizer bug. I would greatly
>>>>>>>>>>>>>> appreciate if any one
is willing to investigate.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Haoran
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>
>>>>>>>>>>>
_______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201023/176d0e3f/attachment.html>

llvm dev - Oct 2020 - clang10 mis-compiles simple C program transpiled from brainfxxk

[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk

[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk

[llvm-dev] clang10 mis-compiles simple C program transpiled from brainfxxk