thr3ads.net - similar to: "[LLVMdev] IR optimization pass ideas for backend porting before ISel"

Displaying 20 results from an estimated 200 matches similar to: "[LLVMdev] IR optimization pass ideas for backend porting before ISel"

[LLVMdev] Help with hazards

2011 Dec 14

[LLVMdev] Help with hazards

The scoreboard hazard detector that I've added for the PPC 440 is not detecting hazards as it should (which certainly could be my fault somehow, but...). For example, it will produce a schedule that looks like... SU(28): 0x127969b0: f64,ch = LFD 0x12793aa0, 0x1277b4f0, 0x127965b0<Mem:LD8[%scevgep100](tbaa=!"double")> [ORD=41] [ID=28] SU(46): 0x12796ab0: f64 = FADD 0x127969b0,

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

2013 Nov 01

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

In the case when coming from C it was probably the loop unroller and SLP vectorizer which vectorized the code. Potentially I could do the same in the IR. However, the loop body that is generated in the IR can get very large. Thus, the loop unroller will refuse to unroll the loop in a large number of (important) cases. Isn't there a way to convince the loop vectorizer that it should

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

2013 Nov 01

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

I am trying a setup where the one loop is rewritten as two loops. This avoids the 'rem' and 'div' instructions in the index calculation (which give the loop vectorizer a hard time). However, with this setup the loop vectorizer complains about a too small loop. LV: Checking a loop in "main" LV: Found a loop: L3 LV: Found a loop with a very small trip count. This loop

[LLVMdev] loop vectorizer issue

2013 Nov 03

[LLVMdev] loop vectorizer issue

Hello, I was trying to trace the Loop vectorizer of the LLVM, I wrote a simple loop with a clear dependency. But found that the debug shows that 'we can vectorize this loop' Here you are my loop with dependency: for(int k=20;k<50;k++) dataY[k] = dataY[k-1]; And the debug prints: LV: Checking a loop in "main" LV: Found a loop: for.body4 LV: Found an

[LLVMdev] loop vectorizer issue

2013 Nov 03

[LLVMdev] loop vectorizer issue

Notice that the code you provided, for globals and stack allocations, at least, is semantically equivalent to: int a = d[19]; for(int k = 20; k < 50; k++) dataY[k] = a; Like so, the load you see missing was redundant, probably hoisted by GVN/PRE and replaced with "%.pre". H. On Sun, Nov 3, 2013 at 11:26 AM, Sara Elshobaky <sara.elshobaky at gmail.com>wrote: >

[LLVMdev] loop vectorizer issue

2013 Nov 03

[LLVMdev] loop vectorizer issue

Hi Sarah, the loop vectorizer runs not on the C code but on LLVM IR this c code was lowered to. Before the loop vectorizer runs many other optimization change the shape of this IR. You can see in the LLVM IR you referenced below, a preceding LLVM IR transformation has change your loop from: > for(int k=20;k<50;k++) > dataY[k] = dataY[k-1]; to > int a = d[19]; >

[LLVMdev] loop vectorizer issue

2013 Nov 03

[LLVMdev] loop vectorizer issue

Actually what I meant in my original loop, that there is a dependency between every two consecutive iterations. So, how the loop vectorizer says 'we can vectorize this loop'? for(int k=20;k<50;k++) dataY[k] = dataY[k-1]; From: Henrique Santos [mailto:henrique.nazare.santos at gmail.com] Sent: Sunday, November 03, 2013 4:28 PM To: Sara Elshobaky Cc: <llvmdev at

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 15

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

Codeprepare and independent blocks are introducing these loads and stores. These are prepasses that polly runs prior to building the dependence graph to transform scalar dependences into data dependences. Ether was working on eliminating the rewrite of scalar dependences. On Thu, Aug 15, 2013 at 5:32 AM, Star Tan <tanmx_star at yeah.net> wrote: > Hi all, > > I have investigated the

[LLVMdev] Cast to SCEVAddRecExpr

2015 Mar 19

[LLVMdev] Cast to SCEVAddRecExpr

Hi Nick, Thanks for looking into it. I have tried that as well but it didn't worked. "AddExpr->getOperand(0))" node is: " (4 * (sext i32 {2,+,2}<%for.body4> to i64))<nsw>" When I cast this to "SCEVAddRecExpr" it returns NULL. Regards, Ashutosh -----Original Message----- From: Nick Lewycky [mailto:nicholas at mxc.ca] Sent: Thursday, March 19,

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 16

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

I do not think that running SROA before polly is a good idea: it would defeat the purpose of the code preparation passes that polly intentionally schedules for the data dependence analysis. If you remove the data references before polly runs, you would miss them in the dependence graph: that could lead to incorrect transforms. On Thu, Aug 15, 2013 at 7:28 PM, Star Tan <tanmx_star at

[LLVMdev] Cast to SCEVAddRecExpr

2015 Mar 19

[LLVMdev] Cast to SCEVAddRecExpr

Yes, I can get "SCEVAddRecExpr" from operands of "(sext i32 {2,+,2}<%for.body4> to i64)". So whenever SCEV cast to "SCEVAddRecExpr" fails, we have drill down for such patterns ? Is that the right way ? Regards, Ashutosh -----Original Message----- From: Nick Lewycky [mailto:nicholas at mxc.ca] Sent: Thursday, March 19, 2015 1:02 PM To: Nema, Ashutosh Cc:

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 16

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

Hi Sebpop, Thanks for your explanation. I noticed that Polly would finally run the SROA pass to transform these load/store instructions into scalar operations. Is it possible to run such a pass before polly-dependence analysis? Star Tan At 2013-08-15 21:12:53,"Sebastian Pop" <sebpop at gmail.com> wrote: >Codeprepare and independent blocks are introducing these loads and

[LLVMdev] Cast to SCEVAddRecExpr

2015 Mar 31

[LLVMdev] Cast to SCEVAddRecExpr

Sorry typo in test case, Please ignore previous mail. Consider below case: for (j=1; j < itr; j++) { - - - - for (i=1; i < itr; i++) { { temp= var[i << 1]; - - - - - } } In the above example, we are unable to get "SCEVAddRecExpr" for "var[i << 1]" Its "SCEVAddRecExpr" is computable in *Outer Loop* I

[LLVMdev] Cast to SCEVAddRecExpr

2015 Apr 01

[LLVMdev] Cast to SCEVAddRecExpr

Thanks Sanjoy. > To be pedantic, "var[i<<1]" is not an add recurrence, but "&var[i << > 1]" is an add recurrence. I'll assume that's that you meant. Yes, I meant the same. > I think that is because in C, multiplication is nsw but left shift is > not and so "i << 1" can legitimately sign-overflow but i * 2 cannot >

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 15

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

Hi all, I have investigated the 6X extra compile-time overhead when Polly compiles the simple nestedloop benchmark in LLVM-testsuite. (http://188.40.87.11:8000/db_default/v4/nts/31?compare_to=28&baseline=28). Preliminary results show that such compile-time overhead is resulted by the complicated polly-dependence analysis. However, the key seems to be the polly-prepare pass, which introduces

[LLVMdev] Cast to SCEVAddRecExpr

2015 Mar 19

[LLVMdev] Cast to SCEVAddRecExpr

Hi, I'm trying to cast one of the SCEV node to "SCEVAddRecExpr". Every time cast return NULL, and I'm unable to do this. SCEV Node: ((4 * (sext i32 {2,+,2}<%for.body4> to i64))<nsw> + %var)<nsw> Casting: const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(SCEVNode); 'var' is of type float pointer (float*). Without 'sext' it works, but

Delinearization validity checks in DependenceAnalysis

2019 May 22

Delinearization validity checks in DependenceAnalysis

Hello Yes, I agree that the SCEV cannot be simplified. Is my understanding correct that it is passed to a function like "isKnownNegative"? Which could still be able to prove is always true. The delinearisation may be valid, depending on exactly how you define delinearisation (under what conditions it should be giving results). It would be invalid for DA to return a dependency of [0

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 16

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

On 08/15/2013 03:32 AM, Star Tan wrote: > Hi all, Hi, I tried to reproduce your findings, but could not do so. > I have investigated the 6X extra compile-time overhead when Polly compiles the simple nestedloop benchmark in LLVM-testsuite. (http://188.40.87.11:8000/db_default/v4/nts/31?compare_to=28&baseline=28). Preliminary results show that such compile-time overhead is resulted by

Getting Error: channel 0: chan_shutdown_write: close() failed for fd5: Resource temporarily una

2001 Sep 26

Getting Error: channel 0: chan_shutdown_write: close() failed for fd5: Resource temporarily una

Hi there, I've installed the whole cygwin package on an NT4 machine. I've been trying to rsync from the NT box to a box running openbsd (an account on an isp). I'm using a test directory to see if things are working. At the end of transmission - I get an error - does anyone have any ideas as to what it means? (i.e should I talk to the ISP because its something that's a

Verify that we only get loop metadata on latches

2018 Jul 06

Verify that we only get loop metadata on latches

In https://bugs.llvm.org/show_bug.cgi?id=38011 (see also https://reviews.llvm.org/D48721) a problem was revealed related to llvm.loop metadata. The fault was that clang added the !llvm.loop metadata to branches outside of the loop (not only the loop latch). That was not handled properly by some opt passes (simplifying cfg) since it ended up merging branch instructions with different !llvm.loop

similar to: [LLVMdev] IR optimization pass ideas for backend porting before ISel