thr3ads.net - similar to: "[LLVMdev] sign extensions, SCEVs, and wrap flags"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] sign extensions, SCEVs, and wrap flags"

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

2012 Dec 10

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

Hello all, I wanted to get some feedback on this patch for ScalarEvolution. It addresses a performance problem I am seeing for simple benchmark. Starting with this C code: 01: signed char foo(void) 02: { 03: const int count = 8000; 04: signed char result = 0; 05: int j; 06: 07: for (j = 0; j < count; ++j) { 08: result += (result_t)(3); 09: } 10: 11: return result; 12: } I

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

[LLVMdev] loop vectorizer and storing to uniform addresses

I changed the input C to using a 64 bit type for the loop index (this eliminates 'sext' instructions in the IR) Here the IR produced with clang -O0 define float @foo(i64 %start, i64 %end, float* %A) #0 { entry: %start.addr = alloca i64, align 8 %end.addr = alloca i64, align 8 %A.addr = alloca float*, align 8 %sum = alloca [4 x float], align 16 %i = alloca i64, align 8

[LLVMdev] sign extensions, SCEVs, and wrap flags

2012 Sep 20

[LLVMdev] sign extensions, SCEVs, and wrap flags

Hi, > Sorry, I probably led you astray. No-self-wrap is useful for determining > trip count, but does not mean that sign/zero extension can be hoisted. > > But if you run your analysis after -indvars, the sign-extension should be > removed if possible. The algorithm walks the derived induction variables > specifically looking for add nsw/nuw and replacing

Question on induction variable simplification pass

2017 Apr 14

Question on induction variable simplification pass

Hi Sanjoy, I have attached the IR I got by compiling with -O2. This is just before we widen the IV. To get the backedge taken count info I ran indvars on it and then replaced zext with sext. I think regardless of where we decide to add this transformation in the pipeline, it should try to preserve as much information as it can. This means that we should generate sext for signed IVs and

[LLVMdev] sign extensions, SCEVs, and wrap flags

2012 Sep 20

[LLVMdev] sign extensions, SCEVs, and wrap flags

On Sep 19, 2012, at 2:18 PM, Preston Briggs <preston.briggs at gmail.com> wrote: > On Tue, Sep 18, 2012 at 10:59 PM, Andrew Trick <atrick at apple.com> wrote: > > On Sep 18, 2012, at 8:21 PM, Preston Briggs <preston.briggs at gmail.com> wrote: > >> Given the following SCEV, >> >> (sext i32 {2,+,1}<nw><%for.body> to i64) >>

[LLVMdev] sign extensions, SCEVs, and wrap flags

2012 Sep 20

[LLVMdev] sign extensions, SCEVs, and wrap flags

On Wed, Sep 19, 2012 at 6:30 PM, Andrew Trick <atrick at apple.com> wrote: > > On Sep 19, 2012, at 2:18 PM, Preston Briggs <preston.briggs at gmail.com> > wrote: > > On Tue, Sep 18, 2012 at 10:59 PM, Andrew Trick <atrick at apple.com> wrote: > >> >> On Sep 18, 2012, at 8:21 PM, Preston Briggs <preston.briggs at gmail.com> >> wrote:

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

2012 May 04

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Hi Preston, On Fri, May 4, 2012 at 9:12 AM, Preston Briggs <preston.briggs at gmail.com> wrote: > > which produces > > %arrayidx24 = getelementptr inbounds [100 x [100 x i64]]* %A, i64 > %arrayidx21.sum, i64 %add1411, i64 %add > store i64 0, i64* %arrayidx24, align 8 > {{{(5 + ((3 + %n) * %n)),+,(2 * %n * %n)}<%for.cond1.preheader>,+,(4 *

Question on induction variable simplification pass

2017 Apr 13

Question on induction variable simplification pass

Hi all, It looks like the induction variable simplification pass prefers doing a zero-extension to compute the wider trip count of loops when extending the IV. This can sometimes result in loss of information making ScalarEvolution's analysis conservative which can lead to missed performance opportunities. For example, consider this loopnest- int i, j; for(i=0; i< 40; i++) for(j=0;

[LLVMdev] sign extensions, SCEVs, and wrap flags

2012 Sep 19

[LLVMdev] sign extensions, SCEVs, and wrap flags

On Tue, Sep 18, 2012 at 10:59 PM, Andrew Trick <atrick at apple.com> wrote: > > On Sep 18, 2012, at 8:21 PM, Preston Briggs <preston.briggs at gmail.com> > wrote: > > Given the following SCEV, > > *(sext i32 {2,+,1}<nw><%for.body> to i64)* > > > from the following C source, > > *void strong3(int *A, int *B, int n) {* > * for (int i =

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

2012 May 04

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Is there any chance of replacing/extending the GEP instruction? As noted in the GEP FAQ, GEPs don't support variable-length arrays; when the front ends have to support VLAs, they linearize the subscript expressions, throwing away information. The FAQ suggests that folks interested in writing an analysis that understands array indices (I'm thinking of dependence analysis) should be

[LLVMdev] RFC: Loop versioning for LICM

2015 Feb 26

[LLVMdev] RFC: Loop versioning for LICM

Hi Ashutosh, Have you been following the recent Loop Access Analysis work? LAA was split out from the Loop Vectorizer that have been performing the kind of loop versioning that you describe. The main reason was to be able to share this functionality with other passes. Loop Access Analysis is an analysis pass that computes basic memory dependence and the runtime checks. The versioning decision

[LLVMdev] Loss of precision with very large branch weights

2015 Apr 24

[LLVMdev] Loss of precision with very large branch weights

In PR 22718, we are looking at issues with long running applications producing non-representative frequencies. For example, in these two loops: int g = 0; __attribute__((noinline)) void bar() { g++; } extern int printf(const char*, ...); int main() { int i, j, k; for (i = 0; i < 1000000; i++) bar(); printf ("g = %d\n", g); g = 0; for (i = 0; i < 500000; i++)

[LLVMdev] sign extensions, SCEVs, and wrap flags

2012 Sep 19

[LLVMdev] sign extensions, SCEVs, and wrap flags

On Sep 18, 2012, at 8:21 PM, Preston Briggs <preston.briggs at gmail.com> wrote: > Given the following SCEV, > > (sext i32 {2,+,1}<nw><%for.body> to i64) > > from the following C source, > > void strong3(int *A, int *B, int n) { > for (int i = 0; i < n; i++) { > A[i + 2] = i; > ... > } > } > > Since the No-Wrap flag is

[ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls

2017 Aug 08

[ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls

On 8/8/2017 1:37 PM, Friedman, Eli wrote: > On 8/8/2017 10:22 AM, Geoff Berry via llvm-dev wrote: >> Hi all, >> >> I'm looking into resolving a FIXME in the LoopDataPrefetch (and FalkorMarkStridedAccesses) pass by marking both of these passes as preserving the ScalarEvolution analysis. Unfortunately, when this change is made, LSR will generate different code. One of the

[LLVMdev] sign extensions, SCEVs, and wrap flags

2012 Sep 19

[LLVMdev] sign extensions, SCEVs, and wrap flags

Given the following SCEV, *(sext i32 {2,+,1}<nw><%for.body> to i64)* from the following C source, *void strong3(int *A, int *B, int n) {* * for (int i = 0; i < n; i++) {* * A[i + 2] = i;* * ...* * }* *}* Since the No-Wrap flag is set on the addrec, can't I safely rewrite it as *{2,+,1}<nw><%for.body>* If I can, why isn't the SCEV package

[LLVMdev] RFC: Loop versioning for LICM

2015 Mar 04

[LLVMdev] RFC: Loop versioning for LICM

> On Mar 3, 2015, at 1:29 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com <mailto:Ashutosh.Nema at amd.com>> wrote: > > Hi Adam, > > Thanks for looking into LoopVersioning work. > > I have gone through recent LoopAccessAnalysis changes and found some of the stuff > overlaps (i.e. runtime memory check, loop access analysis etc.). LoopVersioning can > use

[LLVMdev] RFC: Loop versioning for LICM

2015 Feb 26

[LLVMdev] RFC: Loop versioning for LICM

I like to propose a new loop multi versioning optimization for LICM. For now I kept this for LICM only, but it can be used in multiple places. The main motivation is to allow optimizations stuck because of memory alias dependencies. Most of the time when alias analysis is unsure about memory access and it says may-alias. This un surety from alias analysis restrict some of the memory based

[LLVMdev] IR optimization pass ideas for backend porting before ISel

2012 Jul 30

[LLVMdev] IR optimization pass ideas for backend porting before ISel

Hi LLVMers, I'm writing a LLVM backend for C*Core, an ISA derived from Motorola M*Core. I was wondering if someone wrote some IR level optimization passes for backend porting before ISel, such as an IR transformation from GEP to integer conversion/calculating instructions, and PHI combination. Here's the bubble sorting example. The IR codes below are changed by hand and I try to write

[ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls

2017 Aug 08

[ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls

Hi all, I'm looking into resolving a FIXME in the LoopDataPrefetch (and FalkorMarkStridedAccesses) pass by marking both of these passes as preserving the ScalarEvolution analysis. Unfortunately, when this change is made, LSR will generate different code. One of the root causes seems to be that SCEV will return different nsw/nuw flags for the same Value, depending on what order the

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

2012 Apr 23

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

Hi, When I write various test cases and explore how they're handled by the code in LoopDependenceAnalysis::analysePair, I'm surprised. This loop collects pairs of subscripts from the source and destination refs. * // Collect GEP operand pairs (FIXME: use GetGEPOperands from BasicAA), adding* * // trailing zeroes to the smaller GEP, if needed.* * GEPOpdsTy destOpds, srcOpds;* *

similar to: [LLVMdev] sign extensions, SCEVs, and wrap flags