thr3ads.net - search: "conv6"

Displaying 7 results from an estimated 7 matches for "conv6".

[LLVMdev] How to vectorize a vector type cast?

2012 Feb 28

...p i8 %2 to float %vecinit3 = insertelement <4 x float> %vecinit, float %conv2, i32 1 %3 = extractelement <4 x i8> %0, i32 2 %conv4 = uitofp i8 %3 to float %vecinit5 = insertelement <4 x float> %vecinit3, float %conv4, i32 2 %4 = extractelement <4 x i8> %0, i32 3 %conv6 = uitofp i8 %4 to float %vecinit7 = insertelement <4 x float> %vecinit5, float %conv6, i32 3 ret <4 x float> %vecinit7 Which does the cast as a sequence of scalar operations, whereas it could be done as %1 = uitofp <4 x i8> %0 to <4 x float> ret <4 x float>...
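The difference between the two forms in this excerpt can be modelled outside the compiler. The following is a minimal Python sketch, not LLVM code: the function names `uitofp_scalarized` and `uitofp_vector` are hypothetical, and a `<4 x i8>` value is assumed to be represented as a list of four integers whose low 8 bits are read as unsigned.

```python
from typing import List

LANES = 4  # width of the <4 x i8> / <4 x float> vectors in the excerpt


def uitofp_scalarized(v: List[int]) -> List[float]:
    """Model the extractelement / uitofp / insertelement chain, one lane at a time."""
    out = [0.0] * LANES
    for lane in range(LANES):
        elem = v[lane] & 0xFF    # extractelement, low 8 bits read as unsigned
        conv = float(elem)       # uitofp i8 ... to float
        out[lane] = conv         # insertelement into the result vector
    return out


def uitofp_vector(v: List[int]) -> List[float]:
    """Model the single `uitofp <4 x i8> ... to <4 x float>` as one whole-vector step."""
    return [float(x & 0xFF) for x in v]
```

Both functions return the same four floats for any input, which is the sense in which the scalar sequence "could be done as" a single vector cast.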

[LLVMdev] Question on Machine Combiner Pass

2015 Feb 04

...flehner <ghoflehner at apple.com> Date: Thu Aug 7 21:40:58 2014 +0000 MachineCombiner Pass for selecting faster instruction sequence on AArch64 For this example code sequence: %mul = mul nuw nsw i32 %conv2, %conv %mul7 = mul nuw nsw i32 %conv6, %conv4 %add = add nuw nsw i32 %mul7, %mul ret i32 %add We generate the following assembly: mul w8, w0, w1 mul w9, w2, w3 add w0, w9, w8 ret Whereas I expected the MUL+ADD to be combined to MADD ot...
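As a rough illustration of why the poster expected a MADD here, the sketch below models the 32-bit arithmetic in Python. It is an assumption-laden model rather than compiler output: the function names are hypothetical, registers are modelled as unsigned 32-bit integers, and MADD is taken to compute `Wa + Wn * Wm` as on AArch64.

```python
MASK32 = 0xFFFFFFFF  # W registers hold 32 bits, so results wrap modulo 2**32


def mul_mul_add(w0: int, w1: int, w2: int, w3: int) -> int:
    """Model the sequence from the excerpt: mul, mul, add."""
    w8 = (w0 * w1) & MASK32   # mul w8, w0, w1
    w9 = (w2 * w3) & MASK32   # mul w9, w2, w3
    return (w9 + w8) & MASK32  # add w0, w9, w8


def mul_madd(w0: int, w1: int, w2: int, w3: int) -> int:
    """Model the combined form: one mul followed by one multiply-add."""
    w8 = (w0 * w1) & MASK32         # mul  w8, w0, w1
    return (w8 + w2 * w3) & MASK32  # madd w0, w2, w3, w8
```

The two functions agree on every input, including ones that overflow 32 bits, so the combined form saves one instruction without changing the result.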

Remove zext-unfolding from InstCombine

2016 Jul 27

...32 %conv, 1 %cmp = icmp ne i32 %and, 0 %conv1 = zext i1 %cmp to i32 %conv2 = sext i8 %b to i32 %cmp3 = icmp ne i32 %conv2, 0 %conv4 = zext i1 %cmp3 to i32 %and5 = and i32 %conv1, %conv4 %tobool = icmp ne i32 %and5, 0 %lnot = xor i1 %tobool, true %lnot.ext = zext i1 %lnot to i32 %conv6 = trunc i32 %lnot.ext to i8 ret i8 %conv6 } ``` For both functions, the `icmp` operations will be immediately followed by `zext` instructions, which will directly be optimized away by `transformZExtICmp()`, which is the reason why in the end we will only have one of the `icmp` instructions left....

Remove zext-unfolding from InstCombine

2016 Aug 04

...%conv1 = zext i1 %cmp to i32 > %conv2 = sext i8 %b to i32 > %cmp3 = icmp ne i32 %conv2, 0 > %conv4 = zext i1 %cmp3 to i32 > %and5 = and i32 %conv1, %conv4 > %tobool = icmp ne i32 %and5, 0 > %lnot = xor i1 %tobool, true > %lnot.ext = zext i1 %lnot to i32 > %conv6 = trunc i32 %lnot.ext to i8 > ret i8 %conv6 > } > ``` > > For both functions, the `icmp` operations will be immediately followed by `zext` instructions, which will directly be optimized away by `transformZExtICmp()`, which is the reason why in the end we will only have one of the...

Remove zext-unfolding from InstCombine

2016 Jul 21

Hi all, I have a question regarding a transformation that is carried out in InstCombine, which was introduced by r48715. It unfolds expressions of the form `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` to expose pairs of `zext(icmp)`. In a subsequent iteration these `zext(icmp)` pairs could then (possibly) be optimized by another optimization (which has already been there before...
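The unfolding described in this post preserves the computed value, which a small Python model can check exhaustively. This is only a sketch of the semantics, not InstCombine code: the helper names are hypothetical, the `icmp` results are modelled as Python booleans, and `zext i1 ... to i32` is modelled as mapping `False`/`True` to `0`/`1`.

```python
from itertools import product


def zext_i1(b: bool) -> int:
    """Model `zext i1 %b to i32`: false becomes 0, true becomes 1."""
    return 1 if b else 0


def folded(cmp_a: bool, cmp_b: bool) -> int:
    """Model zext(or(icmp, icmp)): OR the two i1 results, then extend once."""
    return zext_i1(cmp_a | cmp_b)


def unfolded(cmp_a: bool, cmp_b: bool) -> int:
    """Model or(zext(icmp), zext(icmp)): extend each i1 result, then OR as i32."""
    return zext_i1(cmp_a) | zext_i1(cmp_b)


def forms_agree() -> bool:
    """Check all four combinations of the two comparison results."""
    return all(folded(a, b) == unfolded(a, b)
               for a, b in product([False, True], repeat=2))
```

The unfolded form uses one more `zext`, so it only pays off when a later pass can simplify the exposed `zext(icmp)` pairs.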

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

2012 Dec 10

...; preds = %for.cond1.preheader, %for.body3 + %storemerge15 = phi i32 [ 0, %for.cond1.preheader ], [ %inc, %for.body3 ] + %1 = phi i32 [ %0, %for.cond1.preheader ], [ %add, %for.body3 ] + %conv2 = and i32 %1, 255 + %add = add nsw i32 %conv2, %storemerge15 + %conv6 = trunc i32 %add to i8 + %inc = add nsw i32 %storemerge15, 1 + %cmp2 = icmp slt i32 %inc, 10 + br i1 %cmp2, label %for.body3, label %for.inc7 + +for.inc7: ; preds = %for.body3 + %inc8 = add nsw i32 %storemerge7, 1 + %cmp = icmp slt i32 %inc8, 10 + br i1...

[LLVMdev] LiveIntervals analysis problem

2013 Feb 14

...incdec.ptr.i75 = getelementptr inbounds i16* %x, i32 3 %14 = load i16* %incdec.ptr.i75, align 2, !tbaa !5 %conv.1.i = zext i16 %14 to i32 %shl.1.i = shl nuw nsw i32 %conv.1.i, 8 %shr12.1.i = lshr i16 %14, 8 %conv5.1.i = zext i16 %shr12.1.i to i32 %or.1.i = or i32 %conv5.1.i, %shl.i74 %conv6.1.i = trunc i32 %or.1.i to i16 store i16 %conv6.1.i, i16* %incdec.ptr.i75, align 2, !tbaa !5 %incdec.ptr.1.i76 = getelementptr inbounds i16* %x, i32 4 %15 = load i16* %incdec.ptr.1.i76, align 2, !tbaa !5 %conv.2.i = zext i16 %15 to i32 %shl.2.i = shl nuw nsw i32 %conv.2.i, 8 %shr12.2.i...