thr3ads.net - similar to: "GEP transformation by InstCombiner"

Displaying 20 results from an estimated 10000 matches similar to: "GEP transformation by InstCombiner"

2018 Jan 15

GEP transformation by InstCombiner

I tried to retrieve anything from DataLayout. It contains pointer size, but how can I conclude that the GEP index can't be widened?

- Elena

From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Monday, January 15, 2018 20:34
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>; llvm-dev at lists.llvm.org; Sanjay Patel (spatel at rotateright.com) <spatel at

2018 Jan 16

GEP transformation by InstCombiner

> On 15 Jan 2018, at 18:21, Demikhovsky, Elena via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi all,
>
> I’m working on an out-of-tree target and encountered the following problem:
>
> InstCombiner “normalizes” GEPs and extends Index operand to the Pointer width.
> It works fine if you can convert pointer to integer for address calculation and I assume

2018 Jan 15

GEP transformation by InstCombiner

On 01/15/2018 12:59 PM, Demikhovsky, Elena wrote:
>
> I tried to retrieve anything from DataLayout. It contains pointer size, but how can I conclude that the GEP index can’t be widened?
>

I meant that we'd add a new field giving the preferred size for indexing arithmetic. On the other hand, in your case, and in general, would it make sense to prevent widening beyond the largest

2018 Jan 15

GEP transformation by InstCombiner

On 01/15/2018 12:21 PM, Demikhovsky, Elena wrote:
> Hi all,
>
> I’m working on an out-of-tree target and encountered the following problem:
>
> InstCombiner “normalizes” GEPs and extends Index operand to the Pointer width.
> It works fine if you can convert pointer to integer for address calculation and I assume that all registered targets do this.

2018 Jan 16

GEP transformation by InstCombiner

> Note that InstCombine is not the only place that tries to insert pointer-width GEPs in the optimization pipeline. I think that we’ve fixed all of them, but I can’t be entirely sure.

I'm going to upload a patch, but I'm fixing only the places that are covered by our test system. I'll add you as a reviewer and you are welcome to help me with fixing them all.

> We haven’t

2015 Mar 03

[LLVMdev] Extending Vector GEP - proposal

> This problem can be solved by sinking the broadcast instruction at codegen-prepare time.

I considered this option. We currently don’t have target-specific optimizations at codegen-prepare time. (Or am I wrong?) And it would be a very X86-directed optimization. Even the gather-scatter intrinsics are considered common to all targets. And the second reason why I’d prefer to generate a splat-GEP,

2017 Jun 25

AVX Scheduling and Parallelism

Hi Ahmed,

From what can be seen in the code snippet you provided, the reuse of XMM0 and XMM1 across loop-unroll instances does not inhibit instruction-level parallelism. Modern X86 processors use register renaming that can eliminate the dependencies in the instruction stream. In the example you provided, the processor should be able to identify the 2-vloads + vadd + vstore sequences as

2015 Mar 01

[LLVMdev] Extending Vector GEP - proposal

Hi,

According to the current GEP syntax, vector GEP requires that each index be a vector with the same number of elements.

    %A = getelementptr <4 x i8*> %ptrs, <4 x i64> %offsets

I propose to relax this requirement. Let each index be either a vector or a scalar. All vector indices must have the same number of elements. A scalar value will mean the splat vector value.

    %A =
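
The proposed rule (a scalar index behaves like a splat of that scalar) can be modelled in plain C. This is only a sketch of the per-lane address arithmetic, not LLVM code: the lane count `LANES` and the helper names `vector_gep` and `vector_gep_scalar` are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical lane count, matching <4 x i8*> above */

/* Models a vector GEP over i8 pointers with a vector of offsets:
   result[i] = ptrs[i] + offsets[i] for every lane. */
static void vector_gep(uint8_t *result[LANES], uint8_t *const ptrs[LANES],
                       const int64_t offsets[LANES]) {
    for (size_t i = 0; i < LANES; ++i) {
        assert(ptrs[i] != NULL); /* every lane needs a base pointer */
        result[i] = ptrs[i] + offsets[i];
    }
}

/* Models the proposed relaxation: a scalar offset is first splatted to
   all lanes and then handled exactly like a vector of offsets. */
static void vector_gep_scalar(uint8_t *result[LANES],
                              uint8_t *const ptrs[LANES], int64_t offset) {
    int64_t splat[LANES];
    for (size_t i = 0; i < LANES; ++i)
        splat[i] = offset;
    vector_gep(result, ptrs, splat);
}
```

Under this model, calling `vector_gep_scalar(out, ptrs, 3)` yields the same addresses as `vector_gep` with offsets `{3, 3, 3, 3}`.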

2017 Jun 25

AVX Scheduling and Parallelism

Hi, Zvi,

I agree. In the context of targeting the KNL, however, I'm a bit concerned about the addressing, and specifically, the size of the resulting encoding:

> vmovdqu32 zmm0, zmmword ptr [rax + c+401280] ;load b[401280] in zmm0
>
> vpaddd zmm1, zmm1, zmmword ptr [rax + b+401344] ; zmm1<-zmm1+b[401344]

The KNL can only

2016 Jul 26

Alias Analysis with inbound GEPs

> On Jul 25, 2016, at 10:16 AM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> From: "Elena via llvm-dev Demikhovsky" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Monday, July 25, 2016 9:45:55 AM
> Subject: [llvm-dev] Alias Analysis with inbound GEPs
>
> Hi,
>
> I’m

2016 Jul 26

Alias Analysis with inbound GEPs

----- Original Message -----
> From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> To: "Hal J. Finkel" <hfinkel at anl.gov>, "Eli Friedman" <eli.friedman at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Richard Smith" <richard-llvm at metafoo.co.uk>
> Sent: Tuesday, July 26,

2016 Jul 25

Alias Analysis with inbound GEPs

On Jul 25, 2016 6:10 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>
> On Mon, Jul 25, 2016 at 2:06 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:

2014 Dec 24

[LLVMdev] Indexed Load and Store Intrinsics - proposal

----- Original Message -----
> From: "Ayal Zaks" <ayal.zaks at intel.com>
> To: "Philip Reames" <listmail at philipreames.com>, dag at cray.com, "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> Cc: "Robert Khasanov" <robert.khasanov at intel.com>, llvmdev at cs.uiuc.edu
> Sent: Monday, December 22, 2014 8:05:43 AM

2014 Dec 24

[LLVMdev] Indexed Load and Store Intrinsics - proposal

----- Original Message -----
> From: "Xinmin Tian" <xinmin.tian at intel.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Ayal Zaks" <ayal.zaks at intel.com>
> Cc: dag at cray.com, "Robert Khasanov" <robert.khasanov at intel.com>, llvmdev at cs.uiuc.edu
> Sent: Tuesday, December 23, 2014 7:36:44 PM
> Subject: RE:

2016 Jul 25

Alias Analysis with inbound GEPs

I’m checking aliasing of two pointers:

    %GEP1 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 1, i64 %indvars.iv41, i64 %indvars.iv39
    %GEP2 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 16

The result I got is “PartialAlias” because the indices of the GEP1 are variable.

That seems like a bug. PartialAlias should only be returned when we can prove a partial

2014 Dec 21

[LLVMdev] Indexed Load and Store Intrinsics - proposal

On 12/18/2014 11:56 AM, dag at cray.com wrote:
> "Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:
>
>> Semantics:
>> For i=0,1,…,N-1: if (Mask[i]) {*(BaseAddr + VectorOfIndices[i]*Scale) = VectorValue[i];}
>> VectorValue: any float or integer vector type.
>> BaseAddr: a pointer; may be zero if full address is placed in the
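
The quoted store semantics can be written out as a scalar reference loop in C. This is a minimal sketch, assuming 32-bit integer elements and a byte-granular `Scale`; the function name `masked_scatter_i32` and its parameter layout are hypothetical and are not the signature of the proposed intrinsic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the quoted semantics for 32-bit values:
   for i = 0..n-1, if mask[i] is set, store value[i] at the byte
   address base + indices[i] * scale. Lanes with a clear mask bit
   leave memory untouched. */
static void masked_scatter_i32(uint8_t *base, const int64_t *indices,
                               int64_t scale, const int32_t *value,
                               const uint8_t *mask, size_t n) {
    assert(base != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (mask[i]) {
            /* memcpy keeps the store well-defined for any alignment */
            memcpy(base + indices[i] * scale, &value[i], sizeof value[i]);
        }
    }
}
```

For example, with indices `{0, 2, 5, 7}`, scale 4, and mask `{1, 0, 1, 1}`, only elements 0, 5, and 7 of an `int32_t` array are written; element 2 keeps its old value.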

2016 Jul 25

Alias Analysis with inbound GEPs

Hi,

I'm checking aliasing of two pointers:

    %GEP1 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 1, i64 %indvars.iv41, i64 %indvars.iv39
    %GEP2 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 16

The result I got is "PartialAlias" because the indices of the GEP1 are variable. Shouldn't the "inbounds" keyword mean that the access to

2016 Sep 25

RFC: New intrinsics masked.expandload and masked.compressstore

|Hi Elena,
|
|Technically speaking, this seems straightforward.
|
|I wonder, however, how target-independent this is in a practical
|sense; will there be an efficient lowering when targeting any other
|ISA? I don't want to get into the territory where, because the
|vectorizer is supposed to be architecture independent, we need to
|add target-independent intrinsics for all

2017 Aug 10

InstCombine GEP

Hi,

I have a question about the GEP transformation in the instruction combiner. Consider the test case below:

    struct ABC {
      int A;
      int B[100];
      struct XYZ {
        int X;
        int Y[100];
      } OBJ;
    };

    void Setup(struct ABC *);

    int foo(int offset) {
      struct ABC *Ptr = malloc(sizeof(struct ABC));
      Setup(Ptr);
      return Ptr->OBJ.X + Ptr->OBJ.Y[33];
    }

Generated IR for the test case:

    define i32 @foo(i32
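
Since the generated IR is cut off here, it may help to see which byte offsets the two field accesses in `foo` correspond to. The sketch below recomputes them with `offsetof`; the helper names `offset_obj_x` and `offset_obj_y` are hypothetical. On a target with 4-byte `int` and no extra padding, the values would be 404 for `OBJ.X` and 540 for `OBJ.Y[33]`.

```c
#include <assert.h>
#include <stddef.h>

struct ABC {
    int A;
    int B[100];
    struct XYZ {
        int X;
        int Y[100];
    } OBJ;
};

/* Byte offset of Ptr->OBJ.X from the start of struct ABC. */
static size_t offset_obj_x(void) {
    return offsetof(struct ABC, OBJ) + offsetof(struct XYZ, X);
}

/* Byte offset of Ptr->OBJ.Y[k] from the start of struct ABC. */
static size_t offset_obj_y(size_t k) {
    assert(k < 100); /* Y has 100 elements */
    return offsetof(struct ABC, OBJ) + offsetof(struct XYZ, Y)
           + k * sizeof(int);
}
```

These are the constant offsets a GEP into `struct ABC` with indices selecting `OBJ.X` or `OBJ.Y[33]` has to add to the base pointer.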

2016 Sep 26

RFC: New intrinsics masked.expandload and masked.compressstore

|How would this work in this case? The result would need to affect the
|legality and cost of the memory instruction. From your poster, it looks
|like we're talking about loops with constructs like this:
|
|for (i =0; i < N; i++) {
|  if (topVal > b[i]) {
|    *dst = a[i];
|    dst++;
|  }
|}
|
|is this loop vectorizable at all without these constructs?

Good
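
For reference, the quoted loop can be packaged as a self-contained C function. This is a scalar sketch of the compress-store pattern under discussion, not the intrinsic itself; the name `compress_store` and the returned element count are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar form of the quoted loop: every a[i] whose guard
   top_val > b[i] holds is written to the next free slot of dst,
   so selected elements end up packed contiguously.
   Returns the number of elements stored. */
static size_t compress_store(int *dst, const int *a, const int *b,
                             int top_val, size_t n) {
    assert(dst != NULL && a != NULL && b != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (top_val > b[i]) {
            dst[count] = a[i];
            ++count;
        }
    }
    return count;
}
```

The store position depends on how many earlier iterations passed the guard, which is the cross-iteration dependence that makes the loop awkward to vectorize with ordinary masked stores.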
