thr3ads.net - similar to: "[LLVMdev] Inferring dependencies in phi instructions"

Displaying 20 results from an estimated 12000 matches similar to: "[LLVMdev] Inferring dependencies in phi instructions"

[LLVMdev] Inferring dependencies in phi instructions

2015 Jun 29

On 6/29/15 5:16 AM, Evgeny Astigeevich wrote:
> Hi Anirudh,
>
> 'x' has a control dependency on 'y' because the value assigned to 'x' depends on a path selected. This dependency can be converted into a data dependency by means of a 'select' instruction because the control flow is simple.

Just an FYI, there is an optimization called
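The control-versus-data dependency point quoted above can be illustrated with a small sketch. This is plain Python, not LLVM code; the function names and the operands `a` and `b` are hypothetical and exist only for this illustration.

```python
def x_via_branch(y: bool, a: int, b: int) -> int:
    """Control dependency: which assignment to x runs depends on the branch on y."""
    if y:
        x = a
    else:
        x = b
    return x


def x_via_select(y: bool, a: int, b: int) -> int:
    """Data dependency: both operands already exist and y only picks one of them."""
    operands = (b, a)  # index 0 is the value for y false, index 1 the value for y true
    return operands[int(y)]
```

Both functions return the same value for every input. The second has no branch, which loosely mirrors how a 'select' takes the condition as an ordinary operand when the control flow is as simple as a two-way choice.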

[LLVMdev] Inferring dependencies in phi instructions

2015 Jun 29

On Jun 29, 2015 3:16 AM, "Evgeny Astigeevich" <evgeny.astigeevich at arm.com> wrote:
> Hi Anirudh,
>
> 'x' has a control dependency on 'y' because the value assigned to 'x' depends on a path selected. This dependency can be converted into a data dependency by means of a 'select' instruction because the control flow is

[LLVMdev] Inferring dependencies in phi instructions

2015 Jun 29

On Mon, Jun 29, 2015 at 10:16 AM, Evgeny Astigeevich <Evgeny.Astigeevich at arm.com> wrote:
> Hi Anirudh,
>
> I hope these lecture slides about SSA and the dominance frontier will help you with SSA and control flow analysis:
>
> http://www.seas.harvard.edu/courses/cs252/2011sp/slides/Lec04-SSA.pdf
>
> Unfortunately a use of

RFC: Stop using redundant PHI node entries for multi-edge predecessors

2017 May 01

Hi,

On Mon, May 1, 2017 at 8:47 AM, Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Today, the IR requires that if you have multiple edges from A to B (typically with a switch) any phi nodes in B must have an equal number of entries for A, but that all of them must have the same value.
>
>> This seems rather annoying....
>> 1) It

[LLVMdev] Program order in inst_iterator?

2015 Jun 16

On 6/16/15 1:09 AM, Nick Lewycky wrote:
> Anirudh Sivaraman wrote:
>> On Mon, Jun 15, 2015 at 10:50 AM, mats petersson <mats at planetcatfish.com> wrote:
>>> It will iterate over the instructions in the order that they are stored in the module/function/basicblock that they belong to. And that SHOULD, assuming llvm-dis does

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 23

Confirm there is no change in IR if the hack is disabled in the sources. David wrote that these instructions are created by SCEV. Are other targets affected by the changes, e.g. X86?

Kind regards,
Evgeny Astigeevich
Senior Compiler Engineer
Compilation Tools
ARM

From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Sunday, January 22, 2017 10:45 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 22

Thank you for information. I’ll build clang without the hack and re-run the benchmark tomorrow.

-Evgeny

From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Sunday, January 22, 2017 8:00 PM
To: Evgeny Astigeevich
Cc: llvm-dev; nd
Subject: Re: [InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

> Do you mean to

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 24

> On Jan 24, 2017, at 7:18 AM, Sanjay Patel <spatel at rotateright.com> wrote:
>
> On Mon, Jan 23, 2017 at 10:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>> On Jan 23, 2017, at 3:48 PM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 24

> On Jan 23, 2017, at 3:48 PM, Sanjay Patel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> All targets are likely affected in some way by the icmp+shl fold introduced with r292492. It's a basic pattern that occurs in lots of code. Did you see any perf wins on your targets with this commit?
>
> Sadly, it is also likely that many (all?) targets are negatively

[LLVMdev] Program order in inst_iterator?

2015 Jun 15

On Mon, Jun 15, 2015 at 10:50 AM, mats petersson <mats at planetcatfish.com> wrote:
> It will iterate over the instructions in the order that they are stored in the module/function/basicblock that they belong to. And that SHOULD, assuming llvm-dis does what it is expected to do, be the same order.

Thanks for the reply. What about instruction ordering across basic blocks?
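The stored-order behaviour described in the reply above can be pictured with a toy model. This is a sketch in plain Python, not the LLVM API; `function_layout` and `iterate_instructions` are hypothetical names, and the IR strings are made-up sample data.

```python
# Toy stand-in for a function: an ordered list of (block_name, instructions).
function_layout = [
    ("entry", ["%c = icmp eq i32 %y, 0", "br i1 %c, label %else, label %then"]),
    ("then", ["%a = add i32 %y, 1", "br label %merge"]),
    ("else", ["%b = mul i32 %y, 2", "br label %merge"]),
    ("merge", ["%x = phi i32 [ %a, %then ], [ %b, %else ]", "ret i32 %x"]),
]


def iterate_instructions(layout):
    """Yield (block, instruction) pairs in stored order: block by block, top to bottom."""
    for block_name, instructions in layout:
        for inst in instructions:
            yield block_name, inst


visited_blocks = [block for block, _ in iterate_instructions(function_layout)]
```

In this model the walk visits `then` before `else` simply because that is how the blocks are laid out. Across basic blocks, the stored order need not match the order of any particular execution: a single run takes only one of the two branches, yet the walk visits both.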

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 22

Hi Sanjay,

The benchmark source file: http://www.llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Shootout/sieve.c?view=markup

Clang options used to produce the initial IR:
clang -DNDEBUG -O3 -DNDEBUG -mcpu=cortex-a53 -fomit-frame-pointer -O3 -DNDEBUG -w -Werror=date-time -c sieve.c -S -emit-llvm -mllvm -disable-llvm-optzns --target=aarch64-arm-linux

Opt options:
opt -O3

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 24

Hi Sanjay, Thank you for your analysis. It’s interesting why the x86 machine is not affected. Maybe the x86 backend is smarter than the AArch64 backend, or it might be micro-architectural differences. I don’t mind to keep the changes on trunk. What I’d like to see is who will/should be involved in solving the issue. What kind of help/support is needed? Should we (ARM Compilation Tools) start

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 20

Hi, We found that today's 17.30%/11.37% performance regressions in LNT SingleSource/Benchmarks/Shootout/sieve on LNT-AArch64-A53-O3__clang_DEV__aarch64 and LNT-Thumb2v7-A15-O3__clang_DEV__thumbv7 (http://llvm.org/perf/db_default/v4/nts/daily_report/2017/1/20?filter-machine-regex=aarch64%7Carm%7Cthumb%7Cgreen) are caused by changes [rL292492] in InstCombine: https://reviews.llvm.org/D28406

[LLVMdev] GlobalsModRef (and thus LTO) is completely broken

2015 Jul 17

Before the fix, the compiler may simply return 'noalias' for cases it can not really prove to be noalias, but actually correct by luck (or even wrong noalias, but does not result in miscompile). It would be useful to find out the set of missed noalias queries from GlobalModRef with your benchmark and examine if there is some improvement can be done. David On Fri, Jul 17, 2015 at 6:32

[LLVMdev] GlobalsModRef (and thus LTO) is completely broken

2015 Jul 17

Can you say what Benchmark or give a test case so we understand the nature of the regression? As Gerolf said, that will be important to understand what is best to do.

On Fri, Jul 17, 2015, 06:43 Evgeny Astigeevich <Evgeny.Astigeevich at arm.com> wrote:
> Yes, the regression is stable. I double checked this. A full benchmark run consists of at least 10 sub-runs to validate the

[LLVMdev] Program order in inst_iterator?

2015 Jun 15

Does inst_iterator (http://llvm.org/docs/ProgrammersManual.html#iterating-over-the-instruction-in-a-function) guarantee that the iterated instructions are in program order: the order of instructions printed by llvm-dis? Thanks in advance, Anirudh

[LLVMdev] GlobalsModRef (and thus LTO) is completely broken

2015 Jul 17

Hey, thanks for benchmarking. How stable is the 2% regression? Michael ran some benchmarks with GlobalsModRef completely disabled and the only differences were in the noise. This was a complete spec2k6 run along with some others. Based on the number of benchmarks run there, I'm going to go ahead and submit these patches, but if you can clarify the impact here, we can look at potentially some

[LLVMdev] GlobalsModRef (and thus LTO) is completely broken

2015 Jul 21

Based on function names and structures, this is some version of GCC :) Any way you can post the entire .ll file? Because it's globalsmodref, it's hard to debug without the other functions, since it goes over all the functions to determine address takenness, etc :)

On Tue, Jul 21, 2015 at 3:23 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
> Hi Chandler,
>
> We

[LLVMdev] GlobalsModRef (and thus LTO) is completely broken

2015 Jul 17

On Fri, Jul 17, 2015 at 9:13 AM Evgeny Astigeevich <evgeny.astigeevich at arm.com> wrote:
> It’s Dhrystone.

Dhrystone has historically not been a good indicator of real-world performance fluctuations, especially at this small of a shift. I'd like to see if we see any fluctuation on larger and more realistic application benchmarks. One advantage of the flag being set is that we

[RFC] Enable Partial Inliner by default

2017 Nov 10

Hi Graham,

Thank you for offering help. I am trying to create a reproducer. The problem is that the crashes happen whilst LTO is used. One thing I am sure about IR is broken at compile time.

Thanks,
Evgeny

From: Graham Yiu <gyiu at ca.ibm.com>
Date: Friday, 10 November 2017 at 16:09
To: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>
Cc: "junbuml at codeaurora.org"
