Displaying 5 results from an estimated 5 matches for "xi32".
2014 Apr 22
2
[LLVMdev] where is F7 opcode for TEST instruction on X86?
...{
let Defs = [EFLAGS] in {
let isCommutable = 1 in {
def TEST8rr : BinOpRR_F<0x84, "test", Xi8 , X86testpat, MRMSrcReg>;
def TEST16rr : BinOpRR_F<0x84, "test", Xi16, X86testpat, MRMSrcReg>;
def TEST32rr : BinOpRR_F<0x84, "test", Xi32, X86testpat, MRMSrcReg>;
def TEST64rr : BinOpRR_F<0x84, "test", Xi64, X86testpat, MRMSrcReg>;
} // isCommutable
def TEST8rm : BinOpRM_F<0x84, "test", Xi8 , X86testpat>;
def TEST16rm : BinOpRM_F<0x84, "test", Xi16, X86testpat>...
2012 Feb 25
3
[LLVMdev] Missed optimization on array initialization
...9;
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define void @_Z3fooi(i32 %a) uwtable {
%ar = alloca [100 x i32], align 16
%1 = bitcast [100 x i32]* %ar to i8*
call void @llvm.memset.p0i8.i64(i8* %1, i8 0, i64 400, i32 16, i1 false)
%2 = getelementptr inbounds [100 x i32]* %ar, i64 0, i64 0
store i32 %a, i32* %2, align 16, !tbaa !0
%3 = icmp eq i32 %a, 0
br i1 %3, label %4, label %5
;...
2013 Nov 16
1
[LLVMdev] Limit loop vectorizer to SSE
The vectorizer will now emit
= load <8 x i32>, align #TargetAlignmentOfScalari32
where before it would emit
= load <8 x i32>
(which has the semantics of “= load <8 x i32>, align 0”, meaning the address is aligned to the target ABI alignment; see http://llvm.org/docs/LangRef.html#load-instruction).
When the backend generates code for the former it will emit an unaligned move:
= vmovups ...
whereas for the latter it will use an aligned move:
= vmovaps …
vmovups...
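The distinction drawn in this snippet can be illustrated with a minimal C sketch, which is not taken from the thread. It assumes a 4-byte alignment for a scalar i32 and a 32-byte size for an <8 x i32> vector; `is_aligned` and `buffer` are hypothetical names introduced only for illustration. It shows that an address can satisfy the scalar alignment without satisfying the vector alignment that an aligned move would require.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the address `p` is a multiple of `alignment`.
   `alignment` must be a power of two. */
static bool is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}

/* Hypothetical buffer, forced to 32-byte alignment (C11 _Alignas).
   32 bytes is the assumed size of an <8 x i32> vector. */
static _Alignas(32) int32_t buffer[16];

/* &buffer[0] and &buffer[8] sit on 32-byte boundaries.
   &buffer[1] is only 4-byte aligned (assumed scalar i32 alignment),
   so a vector load from it must use an unaligned move. */
```

Under these assumptions, `is_aligned(&buffer[1], 4)` holds while `is_aligned(&buffer[1], 32)` does not, which is the case where an aligned move would fault and an unaligned move is needed.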
2013 Nov 16
0
[LLVMdev] Limit loop vectorizer to SSE
I confirm that r194876 fixes the issue, i.e. the segfault no longer occurs.
My program still passed 16-byte-aligned pointers to the function
which the loop vectorizer processes successfully:
LV: Vector loop of width 8 costs: 1.
LV: Selecting VF = : 8.
LV: Found a vectorizable loop (8) in func_orig.ll
LV: Unroll Factor is 1
Since the program runs fine, it seems to be allowed for the CPU
to issue a vector
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
A fix for this is in r194876.
Thanks for reporting this!
On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote:
> Nadav,
>
> I believe aligned accesses to unaligned pointers are precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to