thr3ads.net - similar to: "Question about LLVM region analysis"

Displaying 20 results from an estimated 2000 matches similar to: "Question about LLVM region analysis"

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 15

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

Codeprepare and independent blocks are introducing these loads and stores. These are prepasses that polly runs prior to building the dependence graph to transform scalar dependences into data dependences. Ether was working on eliminating the rewrite of scalar dependences. On Thu, Aug 15, 2013 at 5:32 AM, Star Tan <tanmx_star at yeah.net> wrote: > Hi all, > > I have investigated the

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 16

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

I do not think that running SROA before polly is a good idea: it would defeat the purpose of the code preparation passes that polly intentionally schedules for the data dependence analysis. If you remove the data references before polly runs, you would miss them in the dependence graph: that could lead to incorrect transforms. On Thu, Aug 15, 2013 at 7:28 PM, Star Tan <tanmx_star at

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 16

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

Hi Sebpop, Thanks for your explanation. I noticed that Polly would finally run the SROA pass to transform these load/store instructions into scalar operations. Is it possible to run such a pass before polly-dependence analysis? Star Tan At 2013-08-15 21:12:53,"Sebastian Pop" <sebpop at gmail.com> wrote: >Codeprepare and independent blocks are introducing these loads and

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 15

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

Hi all, I have investigated the 6X extra compile-time overhead when Polly compiles the simple nestedloop benchmark in LLVM-testsuite. (http://188.40.87.11:8000/db_default/v4/nts/31?compare_to=28&baseline=28). Preliminary results show that such compile-time overhead is resulted by the complicated polly-dependence analysis. However, the key seems to be the polly-prepare pass, which introduces

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 16

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

On 08/15/2013 03:32 AM, Star Tan wrote: > Hi all, Hi, I tried to reproduce your findings, but could not do so. > I have investigated the 6X extra compile-time overhead when Polly compiles the simple nestedloop benchmark in LLVM-testsuite. (http://188.40.87.11:8000/db_default/v4/nts/31?compare_to=28&baseline=28). Preliminary results show that such compile-time overhead is resulted by

[LLVMdev] Help with hazards

2011 Dec 14

[LLVMdev] Help with hazards

The scoreboard hazard detector that I've added for the PPC 440 is not detecting hazards as it should (which certainly could be my fault somehow, but...). For example, it will produce a schedule that looks like... SU(28): 0x127969b0: f64,ch = LFD 0x12793aa0, 0x1277b4f0, 0x127965b0<Mem:LD8[%scevgep100](tbaa=!"double")> [ORD=41] [ID=28] SU(46): 0x12796ab0: f64 = FADD 0x127969b0,

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

2012 Dec 10

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

Hello all, I wanted to get some feedback on this patch for ScalarEvolution. It addresses a performance problem I am seeing for simple benchmark. Starting with this C code: 01: signed char foo(void) 02: { 03: const int count = 8000; 04: signed char result = 0; 05: int j; 06: 07: for (j = 0; j < count; ++j) { 08: result += (result_t)(3); 09: } 10: 11: return result; 12: } I

Cleaning up ‘br i1 false’ cases in CodeGenPrepare

2018 Jun 29

Cleaning up ‘br i1 false’ cases in CodeGenPrepare

Hi, I have come across a couple of cases where the code generated after CodeGenPrepare pass has "br i1 false .." with both true and false conditions preserved and this propagates further and remains the same in the final assembly code/executable. In CodeGenPrepare::runOnFunction, ConstantFoldTerminator (which handles the br i1 false condition) is called only once and if after the

GlobalAddress lowering strategy

2018 May 16

GlobalAddress lowering strategy

I've been looking at GlobalAddress lowering strategy recently and was wondering if anyone had any ideas, insights, or experience to share. ## Background When lowering global address, which is typically done in FooTargetLowering::LowerGlobalAddress you have the option of folding in the global offset into the global address or else emitting the base globaladdress and a separate ADD node for the

[LLVMdev] [Polly] Assert in Scope construction

2013 Jul 03

[LLVMdev] [Polly] Assert in Scope construction

Should have changed the subject line... --- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Sergei Larin > Sent: Wednesday, July 03, 2013 12:29 PM > To: 'Tobias Grosser' > Cc: 'llvmdev'

[LLVMdev] [Polly] Assert in Scope construction

2013 Jul 05

[LLVMdev] [Polly] Assert in Scope construction

Hi Sergei, On Thu, Jul 4, 2013 at 1:36 AM, Sergei Larin <slarin at codeaurora.org> wrote: > Should have changed the subject line... > > --- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted > by > The Linux Foundation > > > > -----Original Message----- > > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at

An issue with "lifetime.start" and storing "undef"

2019 Sep 24

An issue with "lifetime.start" and storing "undef"

Hi All, In order to confine the scope of a local variable allocated using "alloca", we had written a pass named 'undeflocal'. This helps avoid pseudo phi nodes inserted because of the path (carrying an undefined value) starting from the "entry" block. This unwanted phi instruction was preventing some of the optimizations from getting triggered later on. For example,

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

[LLVMdev] loop vectorizer and storing to uniform addresses

I changed the input C to using a 64 bit type for the loop index (this eliminates 'sext' instructions in the IR) Here the IR produced with clang -O0 define float @foo(i64 %start, i64 %end, float* %A) #0 { entry: %start.addr = alloca i64, align 8 %end.addr = alloca i64, align 8 %A.addr = alloca float*, align 8 %sum = alloca [4 x float], align 16 %i = alloca i64, align 8

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

2012 May 04

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Hi Preston, On Fri, May 4, 2012 at 9:12 AM, Preston Briggs <preston.briggs at gmail.com> wrote: > > which produces > > %arrayidx24 = getelementptr inbounds [100 x [100 x i64]]* %A, i64 > %arrayidx21.sum, i64 %add1411, i64 %add > store i64 0, i64* %arrayidx24, align 8 > {{{(5 + ((3 + %n) * %n)),+,(2 * %n * %n)}<%for.cond1.preheader>,+,(4 *

[LLVMdev] Can LLVM vectorize <2 x i32> type

2015 Jun 26

[LLVMdev] Can LLVM vectorize <2 x i32> type

For example, I have the following IR code, for.cond.preheader: ; preds = %if.end18 %mul = mul i32 %12, %3 %cmp21128 = icmp sgt i32 %mul, 0 br i1 %cmp21128, label %for.body.preheader, label %return for.body.preheader: ; preds = %for.cond.preheader %19 = mul i32 %12, %3 %20 = add i32 %19, -1 %21 = zext i32 %20 to i64 %22 =

How to make ScalarEvolution recompute SCEV values?

2019 Oct 30

How to make ScalarEvolution recompute SCEV values?

Hello all, I’m pretty new to LLVM. I'm writing a pass for loop optimization. I clone and rearrange loops, setting the cloned loop as the original loop’s parent. This can be done multiple times, until there is no more work to do. The trouble is, after the first time I do this, the cloned loop's SCEVs become unknown types when they should be AddRecExpr. If I re-run the whole pass on the

[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?

2012 Dec 31

[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?

Dear all, In our compiler we use a modified version LLVM Polly, which is very sensitive to proper code generation. Among the number of limitations, the loop region (enclosed by phi node on induction variable and branch) is required to be free of additional memory-dependent branches. In other words, there must be no conditional "br" instructions below phi nodes. The problem we are facing

Making loop guards part of canonical loop structure

2019 May 30

Making loop guards part of canonical loop structure

On Hexagon, unguarded loops cannot be converted to hardware loops. If the loop's latch branch alone handles the iteration count, including the possibility of 0, then the loop cannot be converted to a hardware loop, because hardware loops must iterate at least once. If the entire loop is guarded against zero iteration count, we can put the loop setup in the preheader, since at that point the

[LLVMdev] Splitting a basic block, replacing it's terminator

2009 May 08

[LLVMdev] Splitting a basic block, replacing it's terminator

Hi, I want to insert a conditional branch in the middle of a basic block. To that end, I am doing these steps: (1) Split the basic block: bb->splitBasicBlock() (2) Remove the old terminator: succ->removePredecessor(bb) bb->getTerminator()->getParent() (3) Adding a new terminator: BranchInst::Create(ifTrue, ifFalse, cnd, "", bb); That seems to work, but later passes

[RFC] New pass: LoopExitValues

2015 Sep 10

[RFC] New pass: LoopExitValues

Which cases does this pass handle which aren't otherwise optimized out by passes like GlobalValueNumbering or DeadCodeElimination? Thanks, Jake VanAdrighem On Thu, Sep 10, 2015 at 2:35 PM, Steve King <steve at metrokings.com> wrote: > Hello LLVM, > It seems this thread has gone cold. Is there some low risk way for > the community to take the new pass for a test drive? >

similar to: Question about LLVM region analysis