Displaying 20 results from an estimated 58 matches for "arrayidx4".
2011 Jul 27
2
[LLVMdev] Avoiding load narrowing in DAGCombiner
...t is known to be in the high-addressed 2 bytes of a word
on my little-endian target, I emit an LD4 from the word-aligned address
and an SRL 16 to shift the i16 into the LSbits of the register.
DAGCombine visit()s an ISD::SRL node and notices that it is
right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and
replaces it with an LD2 from %arrayidx+2.
Replaces
--------
0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]>
0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10]
0x17f7470: i32 = srl 0x17f7070, 0x17f94c0
With
----
0x17fceb0: i32,ch = load 0x17faa00,...
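The value equivalence behind the rewrite described in this snippet can be checked with a small sketch. This is only an illustration in Python, not DAGCombiner code: the helper names (`load_u32_le`, `load_u16_le`, `wide_load_then_shift`, `narrowed_load`) and the sample memory contents are assumptions made up for the example. It models a little-endian memory and compares a 4-byte load followed by a right shift of 16 with a 2-byte load from the address plus 2.

```python
import struct

def load_u32_le(mem: bytes, addr: int) -> int:
    """Model a 4-byte little-endian load (the LD4 in the snippet)."""
    return struct.unpack_from("<I", mem, addr)[0]

def load_u16_le(mem: bytes, addr: int) -> int:
    """Model a 2-byte little-endian load (the LD2 in the snippet)."""
    return struct.unpack_from("<H", mem, addr)[0]

def wide_load_then_shift(mem: bytes, addr: int) -> int:
    """LD4 from the word-aligned address, then SRL by 16."""
    return load_u32_le(mem, addr) >> 16

def narrowed_load(mem: bytes, addr: int) -> int:
    """LD2 from the address plus 2 bytes."""
    return load_u16_le(mem, addr + 2)

# Hypothetical memory contents, two 32-bit words.
mem = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])

for word_addr in (0, 4):
    assert wide_load_then_shift(mem, word_addr) == narrowed_load(mem, word_addr)
```

The sketch only shows that both forms produce the same value on a little-endian layout. It says nothing about whether the narrower load is legal or cheap on a given target, which is the question the thread raises.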
2020 Apr 04
4
Legality of transformation
...32 @main() local_unnamed_addr #0 {
entry:
  %A = alloca [2048 x i32], align 16
  %B = alloca [2048 x i32], align 16
  %"reg2mem alloca point" = bitcast i32 0 to i32
  %arrayidx3 = getelementptr inbounds [2048 x i32], [2048 x i32]* %A, i64 0, i64 1024
  %0 = load i32, i32* %arrayidx3, align 16
  %arrayidx4 = getelementptr inbounds [2048 x i32], [2048 x i32]* %B, i64 0, i64 1024
  %1 = load i32, i32* %arrayidx4, align 16
  %cmp5 = icmp eq i32 %0, %1
  %conv = zext i1 %cmp5 to i32
  %call = call i32 (i32, ...) bitcast (i32 (...)* @assert to i32 (i32, ...)*)(i32 %conv) #2
  ret i32 0
}
It is my understandin...
2014 Feb 19
2
[LLVMdev] better code for IV
...ext i32 %trunc to i64
%arrayidx = getelementptr inbounds float* %a, i64 %idxprom
%0 = load float* %arrayidx, align 4
%arrayidx2 = getelementptr inbounds float* %b, i64 %idxprom
%1 = load float* %arrayidx2, align 4
%add = fadd float %0, %1
%arrayidx4 = getelementptr inbounds float* %c, i64 %idxprom
store float %add, float* %arrayidx4, align 4
%L_inc_ind_var = add nuw nsw i64 %L_ind_var, 1
%L_cmp.to.max = icmp eq i64 %L_inc_ind_var, %iNumElements
%L_inc_tid = add nuw nsw i64 %L_tid, 1
br i1 %L_cm...
2011 Jul 27
0
[LLVMdev] Avoiding load narrowing in DAGCombiner
...he high-addressed 2 bytes of a word
> on my little-endian target, I emit an LD4 from the word-aligned address
> and an SRL 16 to shift the i16 into the LSbits of the register.
>
> DAGCombine visit()s an ISD::SRL node and notices that it is
> right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and
> replaces it with an LD2 from %arrayidx+2.
>
> Replaces
> --------
> 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]>
> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10]
> 0x17f7470: i32 = srl 0x17f7070, 0x17f94c0
>
> With...
2011 Jul 27
2
[LLVMdev] Avoiding load narrowing in DAGCombiner
...bytes of a word
>> on my little-endian target, I emit an LD4 from the word-aligned address
>> and an SRL 16 to shift the i16 into the LSbits of the register.
>>
>> DAGCombine visit()s an ISD::SRL node and notices that it is
>> right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and
>> replaces it with an LD2 from %arrayidx+2.
>>
>> Replaces
>> --------
>> 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]>
>> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10]
>> 0x17f7470: i32 = srl 0x17f7...
2013 Feb 07
1
[LLVMdev] alloca scalarization with dynamic indexing into vectors
...%arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1
%1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !9
%arrayidx3 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 1
store <2 x i32> %1, <2 x i32>* %arrayidx3, align 8, !tbaa !9
%arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2
%2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !9
%arrayidx5 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 2
store <2 x i32> %2, <2 x i32>* %arrayidx5, align 8, !tbaa !9
%idx.ext = ze...
2013 Feb 05
1
[LLVMdev] Vectorizing global struct pointers
On 5 February 2013 17:28, Nadav Rotem <nrotem at apple.com> wrote:
> We insert runtime overlap checks only for unidentified objects. The
> problem here is that the vectorizer thinks that A,B,C are all pointers to
> the same array, so it gives up. If A,B,C were different arrays then it
> could have used runtime checks.
>
Yes, that is exactly the code that creates the
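The idea of a runtime overlap check mentioned in the quoted reply can be sketched conceptually as follows. This is not the loop vectorizer's implementation: the function names (`ranges_overlap`, `pick_loop_version`) and the byte ranges are hypothetical, chosen only to show how a guard might select between a vectorized and a scalar version of a loop when the accessed objects cannot be identified statically.

```python
def ranges_overlap(a_start: int, a_len: int, b_start: int, b_len: int) -> bool:
    """True if [a_start, a_start + a_len) and [b_start, b_start + b_len) intersect."""
    return a_start < b_start + b_len and b_start < a_start + a_len

def pick_loop_version(store_range, load_ranges) -> str:
    """Pick 'vector' only when the stored range is disjoint from every loaded range."""
    s_start, s_len = store_range
    for l_start, l_len in load_ranges:
        if ranges_overlap(s_start, s_len, l_start, l_len):
            return "scalar"  # possible dependence, fall back to the original loop
    return "vector"

# Hypothetical (start, length) byte ranges for three distinct arrays.
disjoint = pick_loop_version((0, 64), [(64, 64), (128, 64)])
# Hypothetical case where the stored range intersects a loaded one.
aliased = pick_loop_version((0, 64), [(32, 64)])
```

With the assumed ranges, `disjoint` selects the vector version and `aliased` falls back to the scalar one.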
2011 Jul 27
0
[LLVMdev] Avoiding load narrowing in DAGCombiner
...>> on my little-endian target, I emit an LD4 from the word-aligned address
>>> and an SRL 16 to shift the i16 into the LSbits of the register.
>>>
>>> DAGCombine visit()s an ISD::SRL node and notices that it is
>>> right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and
>>> replaces it with an LD2 from %arrayidx+2.
>>>
>>> Replaces
>>> --------
>>> 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]>
>>> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10]
>>>...
2013 Nov 08
1
[LLVMdev] loop vectorizer and storing to uniform addresses
...%for.cond1
%5 = load i64* %i, align 8
%mul = mul nsw i64 %5, 4
%6 = load i64* %q, align 8
%add = add nsw i64 %mul, %6
%7 = load float** %A.addr, align 8
%arrayidx = getelementptr inbounds float* %7, i64 %add
%8 = load float* %arrayidx, align 4
%9 = load i64* %q, align 8
%arrayidx4 = getelementptr inbounds [4 x float]* %sum, i32 0, i64 %9
%10 = load float* %arrayidx4, align 4
%add5 = fadd float %10, %8
store float %add5, float* %arrayidx4, align 4
br label %for.inc
for.inc: ; preds = %for.body3
%11 = load i64* %q, align...
2013 Feb 05
3
[LLVMdev] Vectorizing global struct pointers
...initializer, align 8
...
%arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32 %idxprom
%0 = load i64* %arrayidx, align 8
%arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32 %idxprom
%1 = load i64* %arrayidx2, align 8
%mul = mul nsw i64 %1, %0
%arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32 %idxprom
store i64 %mul, i64* %arrayidx4, align 8
2018 Feb 27
0
Question about instcombine pass.
...---------------------
IR.(Excerpt)
----------------------------------
for.body: ; preds = %for.body, %entry
%store_forwarded = phi i32 [ %load_initial, %entry ], [ %mul10, %for.body ]
%indvars.iv = phi i64 [ 1, %entry ], [ %indvars.iv.next, %for.body ]
%arrayidx4 = getelementptr inbounds [10 x i32], [10 x i32]* @a, i64 0, i64 %indvars.iv
%0 = load i32, i32* %arrayidx4, align 4, !tbaa !2
%mul10 = mul nsw i32 %store_forwarded, %store_forwarded
%arrayidx12 = getelementptr inbounds [10 x i32], [10 x i32]* @X, i64 0, i64 %indvars.iv
store i32 %mul10, i32...
2012 Apr 23
0
[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
Hi,
When I write various test cases and explore how they're handled by the code
in LoopDependenceAnalysis::analysePair, I'm surprised. This loop collects
pairs of subscripts from the source and destination refs.
  // Collect GEP operand pairs (FIXME: use GetGEPOperands from BasicAA), adding
  // trailing zeroes to the smaller GEP, if needed.
  GEPOpdsTy destOpds, srcOpds;
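The pairing step described in the quoted comment (collect operand pairs, padding the shorter GEP with trailing zeroes) might look like the following sketch. It is a Python illustration only, not the code from `LoopDependenceAnalysis::analysePair`: the function name `pair_gep_operands` and the sample operand lists are assumptions, with operands modelled as plain Python values.

```python
from itertools import zip_longest

def pair_gep_operands(dest_opds, src_opds):
    """Pair subscripts position by position, padding the shorter list with 0."""
    return list(zip_longest(dest_opds, src_opds, fillvalue=0))

# Hypothetical operand lists: the destination GEP has one fewer index.
pairs = pair_gep_operands([0, "i"], [0, "j", 3])
```

Here `pairs` holds three subscript pairs, the last one pairing the padded zero with the source's extra index.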
2017 Dec 19
4
MemorySSA question
...arrayidx = getelementptr inbounds i32, i32* %b, i64 %indvars.iv35
; MemoryUse(3)
%2 = load i32, i32* %arrayidx, align 4, !tbaa !2
%arrayidx2 = getelementptr inbounds i32, i32* %c, i64 %indvars.iv35
; MemoryUse(3)
%3 = load i32, i32* %arrayidx2, align 4, !tbaa !2
%add = add nsw i32 %3, %2
%arrayidx4 = getelementptr inbounds i32, i32* %a, i64 %indvars.iv35
; 1 = MemoryDef(3)
store i32 %add, i32* %arrayidx4, align 4, !tbaa !2
%indvars.iv.next36 = add nuw nsw i64 %indvars.iv35, 5
%cmp = icmp slt i64 %indvars.iv.next36, %1
br i1 %cmp, label %for.body, label %for.end
for.end:...
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
...32* %a, align 4, !tbaa !1
> %arrayidx1 = getelementptr inbounds i32* %a, i32 1
> %1 = load i32* %arrayidx1, align 4, !tbaa !1
> %add = add nsw i32 %1, %0
> %arrayidx2 = getelementptr inbounds i32* %a, i32 2
> %2 = load i32* %arrayidx2, align 4, !tbaa !1
> %add3 = add nsw i32 %add, %2
> %arrayidx4 = getelementptr inbounds i32* %a, i32 3
> %3 = load i32* %arrayidx4, align 4, !tbaa !1
> %add5 = add nsw i32 %add3, %3
> %arrayidx6 = getelementptr inbounds i32* %a, i32 4
> %4 = load i32* %arrayidx6, align 4, !tbaa !1
> %add7 = add nsw i32 %add5, %4
> %arrayidx8 = getelementptr inboun...
2013 Feb 05
0
[LLVMdev] Vectorizing global struct pointers
...;
> %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32 %idxprom
> %0 = load i64* %arrayidx, align 8
> %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32 %idxprom
> %1 = load i64* %arrayidx2, align 8
> %mul = mul nsw i64 %1, %0
> %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32 %idxprom
> store i64 %mul, i64* %arrayidx4, align 8
2014 Sep 19
3
[LLVMdev] [Vectorization] Mis match in code generated
...%0 = load i32* %a, align 4, !tbaa !1
%arrayidx1 = getelementptr inbounds i32* %a, i32 1
%1 = load i32* %arrayidx1, align 4, !tbaa !1
%add = add nsw i32 %1, %0
%arrayidx2 = getelementptr inbounds i32* %a, i32 2
%2 = load i32* %arrayidx2, align 4, !tbaa !1
%add3 = add nsw i32 %add, %2
%arrayidx4 = getelementptr inbounds i32* %a, i32 3
%3 = load i32* %arrayidx4, align 4, !tbaa !1
%add5 = add nsw i32 %add3, %3
%arrayidx6 = getelementptr inbounds i32* %a, i32 4
%4 = load i32* %arrayidx6, align 4, !tbaa !1
%add7 = add nsw i32 %add5, %4
%arrayidx8 = getelementptr inbounds i32* %a, i...
2017 Sep 13
2
RFC phantom memory intrinsic
...= insertelement <4 x double> undef, double %0, i32 0
%add = add i64 %i, 1
%arrayidx1 = getelementptr inbounds double, double* %ptr, i64 %add
%1 = load double, double* %arrayidx1, align 8
%vecinit2 = insertelement <4 x double> %vecinit, double %1, i32 1
%add3 = add i64 %i, 2
%arrayidx4 = getelementptr inbounds double, double* %ptr, i64 %add3
%2 = load double, double* %arrayidx4, align 8
%vecinit5 = insertelement <4 x double> %vecinit2, double %2, i32 2
%add6 = add i64 %i, 3
%arrayidx7 = getelementptr inbounds double, double* %ptr, i64 %add6
%3 = load double, doubl...
2014 Sep 18
2
[LLVMdev] [Vectorization] Mis match in code generated
...{
entry:
  %0 = load i32* %a, align 4, !tbaa !1
  %arrayidx1 = getelementptr inbounds i32* %a, i32 1
  %1 = load i32* %arrayidx1, align 4, !tbaa !1
  %add = add nsw i32 %1, %0
  %arrayidx2 = getelementptr inbounds i32* %a, i32 2
  %2 = load i32* %arrayidx2, align 4, !tbaa !1
  %add3 = add nsw i32 %add, %2
  %arrayidx4 = getelementptr inbounds i32* %a, i32 3
  %3 = load i32* %arrayidx4, align 4, !tbaa !1
  %add5 = add nsw i32 %add3, %3
  %arrayidx6 = getelementptr inbounds i32* %a, i32 4
  %4 = load i32* %arrayidx6, align 4, !tbaa !1
  %add7 = add nsw i32 %add5, %4
  %arrayidx8 = getelementptr inbounds i32* %a, i32 5
  %...
2013 Feb 19
0
[LLVMdev] Pointer Context Metadata (was: Parallel Loop Metadata)
On 02/19/2013 05:51 PM, Hal Finkel wrote:
> Understood. If you have some time, it seems that there are several sub-tasks:
>
> - Update the language reference
Document the additional optional iteration id argument to
llvm.mem.parallel_loop_access? I'll do this.
> - Update the loop vectorizer (to update the metadata when it unrolls)
> - Update the regular unroller
2013 Feb 04
2
[LLVMdev] RFC: [PATCH] parallel loop metadata
...correct use of
+both ``llvm.loop.parallel`` and ``llvm.mem.parallel_loop_access``
+metadata types that refer to the same loop identifier metadata.
+
+.. code-block:: llvm
+
+ for.body:
+ ...
+ %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
+ ...
+ store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
+ ...
+ br i1 %exitcond, label %for.end, label %for.body, !llvm.loop.parallel !0
+
+ for.end:
+ ...
+ !0 = metadata !{ metadata !0 }
+
+It is also possible to have nested parallel loops. In that case the
+memory accesses refer to a list of loop...