thr3ads.net - search: "d28625"

Folding zext from i1 into PHI nodes with only zwo incoming values.

2017 Jan 29

3

Hi, AFAICT there are two places where zext instructions may get folded into PHI nodes. One is FoldPHIArgZextsIntoPHI and the other is the more generic FoldPHIArgOpIntoPHI. Now, the former only handles PHIs with more than 2 incoming values, while the latter only handles casts where the source type is legal. This means that for a PHI node with two incoming i8 values, both resulting from `zext i1...
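The gap described in this post can be sketched as a toy model. This is not LLVM's code: the struct, the function names, and the set of "legal" widths (8/16/32/64, as a typical data layout might declare) are all assumptions made for illustration. It only encodes the two conditions as the poster states them.

```cpp
#include <set>

// Toy model of the two gating conditions described in the post.
// Every name here is hypothetical; none of this is LLVM's actual code.
struct PhiOfZexts {
    unsigned numIncoming;  // number of incoming values on the PHI
    unsigned srcBits;      // bit width of the type being zero-extended
};

// Assumption: a target whose data layout lists 8/16/32/64 as native widths.
inline bool isLegalWidth(unsigned bits) {
    static const std::set<unsigned> legal = {8, 16, 32, 64};
    return legal.count(bits) != 0;
}

// Models the zext-specific fold, which per the post skips two-input PHIs.
inline bool zextFoldApplies(const PhiOfZexts &phi) {
    return phi.numIncoming > 2;
}

// Models the generic fold, which per the post requires a legal source type.
inline bool genericFoldApplies(const PhiOfZexts &phi) {
    return isLegalWidth(phi.srcBits);
}

// A PHI of zexts is folded if either of the two modeled paths accepts it.
inline bool anyFoldApplies(const PhiOfZexts &phi) {
    return zextFoldApplies(phi) || genericFoldApplies(phi);
}
```

Under this model, a PHI with two incoming values zero-extended from i1 is accepted by neither path, while three incoming i1 values or two incoming i8 values would be. That is the case the post asks about.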

Folding zext from i1 into PHI nodes with only zwo incoming values.

2017 Jan 30

3

...} The zext instructions should be folded into the phi, and then the new zext gets removed along with the icmp instruction at the end.

Björn

2017-01-30 20:20 GMT+01:00 Sanjay Patel <spatel at rotateright.com>:
> I'm looking at a similar problem in:
> https://reviews.llvm.org/D28625
>
> Does that patch make any difference on the cases that you are looking at?
>
> Instead of avoiding ShouldChangeType with zext as a special-case opcode,
> it might be better to treat i1 as a special-case type. There's no way to
> avoid i1 in IR, so we might as well allow tra...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 22

2

...017 8:00 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
Subject: Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

> Do you mean to remove the hack in InstCombiner::visitICmpInst()?

Yes. Although (this just came up in D28625 too) we might need to remove multiple versions of that in order to unlock optimization:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompares.cpp#L4338
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCasts.cpp#L470
https...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 23

2

...17 8:00 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
Subject: Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

> Do you mean to remove the hack in InstCombiner::visitICmpInst()?

Yes. Although (this just came up in D28625 too) we might need to remove multiple versions of that in order to unlock optimization:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompares.cpp#L4338
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCasts.cpp#L470
https...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 22

2

Hi Sanjay,

The benchmark source file:
http://www.llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Shootout/sieve.c?view=markup

Clang options used to produce the initial IR:
clang -DNDEBUG -O3 -DNDEBUG -mcpu=cortex-a53 -fomit-frame-pointer -O3 -DNDEBUG -w -Werror=date-time -c sieve.c -S -emit-llvm -mllvm -disable-llvm-optzns --target=aarch64-arm-linux

Opt options:
opt -O3

Folding zext from i1 into PHI nodes with only zwo incoming values.

2017 Jan 30

0

...d be folded into the phi, and then the new zext
> gets removed along with the icmp instruction at the end.
>
> Björn
>
> 2017-01-30 20:20 GMT+01:00 Sanjay Patel <spatel at rotateright.com>:
>
>> I'm looking at a similar problem in:
>> https://reviews.llvm.org/D28625
>>
>> Does that patch make any difference on the cases that you are looking at?
>>
>> Instead of avoiding ShouldChangeType with zext as a special-case opcode,
>> it might be better to treat i1 as a special-case type. There's no way to
>> avoid i1 in IR, so we...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 24

2

...> Subject: Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
>
>
>
> > Do you mean to remove the hack in InstCombiner::visitICmpInst()?
>
>
>
> Yes. Although (this just came up in D28625 too) we might need to remove multiple versions of that in order to unlock optimization:
>
> https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompares.cpp#L4338 <https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompa...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 24

3

...rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines
>>
>>
>>
>> > Do you mean to remove the hack in InstCombiner::visitICmpInst()?
>>
>>
>>
>> Yes. Although (this just came up in D28625 too) we might need to remove multiple versions of that in order to unlock optimization:
>>
>> https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompares.cpp#L4338 <https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstComb...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 24

3

...17 8:00 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
Subject: Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

> Do you mean to remove the hack in InstCombiner::visitICmpInst()?

Yes. Although (this just came up in D28625 too) we might need to remove multiple versions of that in order to unlock optimization:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCompares.cpp#L4338
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineCasts.cpp#L470
https...
