search for: mul2

Displaying 17 results from an estimated 17 matches for "mul2".

2018 Dec 18
2
should we do this time-consuming transform in InstCombine?
...Tue, Dec 18, 2018 at 10:18 AM Zheng CZ Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi, Hi. > There is an opportunity in InstCombine for the following instruction pattern: > > %mul = mul nsw i32 %b, %a > %cmp = icmp sgt i32 %mul, -1 > %sub = sub i32 0, %a > %mul2 = mul nsw i32 %sub, %b > %cond = select i1 %cmp, i32 %mul, i32 %mul2 > > Source code for the above pattern: > return (a*b) >=0 ? (a*b) : -a*b; > > Currently, LLVM (-O3) cannot recognize this as abs(a*b). > > I initially thought we could do this in the InstCombine phase in opt. Belo...
2018 Dec 18
2
should we do this time-consuming transform in InstCombine?
Hi, There is an opportunity in InstCombine for the following instruction pattern: %mul = mul nsw i32 %b, %a %cmp = icmp sgt i32 %mul, -1 %sub = sub i32 0, %a %mul2 = mul nsw i32 %sub, %b %cond = select i1 %cmp, i32 %mul, i32 %mul2 Source code for the above pattern: return (a*b) >=0 ? (a*b) : -a*b; Currently, LLVM (-O3) cannot recognize this as abs(a*b). I initially thought we could do this in the InstCombine phase in opt. Below is what I think: %res = OP i32...
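The equivalence the post relies on can be checked at the source level. The following is a minimal sketch, not the proposed InstCombine transform itself: the function names `select_form`, `abs_form`, and `forms_agree` are hypothetical, and the comparison is restricted to operands whose product does not overflow, which is what the `nsw` flags in the IR assume.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Source-level form from the post: (a*b) >= 0 ? (a*b) : -a*b.
// Note that -a*b parses as (-a)*b, matching `sub i32 0, %a` followed by `mul`.
inline std::int32_t select_form(std::int32_t a, std::int32_t b) {
    return (a * b) >= 0 ? (a * b) : -a * b;
}

// The form the post would like the optimizer to recognize: abs(a*b).
inline std::int32_t abs_form(std::int32_t a, std::int32_t b) {
    return std::abs(a * b);
}

// Hypothetical helper: exhaustively compare both forms for |a|, |b| <= limit.
// The bound keeps every product inside int32_t, so no signed overflow occurs.
inline bool forms_agree(std::int32_t limit) {
    assert(limit >= 0 && limit <= 46340);  // 46340 * 46340 still fits in int32_t
    for (std::int32_t a = -limit; a <= limit; ++a) {
        for (std::int32_t b = -limit; b <= limit; ++b) {
            if (select_form(a, b) != abs_form(a, b)) {
                return false;
            }
        }
    }
    return true;
}
```

Calling `forms_agree(50)` compares the two forms on every operand pair in [-50, 50]. This only illustrates the identity; it says nothing about how expensive the pattern match would be inside the compiler, which is the question the thread raises.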
2015 Mar 25
0
[PATCH] nv50/ir: take postFactor into account when doing peephole optimizations
...Instruction *i, i->src(0).mod = Modifier(0); i->src(1).mod = Modifier(0); + i->postFactor = 0; i->setSrc(0, new_ImmediateValue(i->bb->getProgram(), res.data.u32)); i->setSrc(1, NULL); @@ -682,7 +685,7 @@ ConstantFolding::tryCollapseChainedMULs(Instruction *mul2, Instruction *insn; Instruction *mul1 = NULL; // mul1 before mul2 int e = 0; - float f = imm2.reg.data.f32; + float f = imm2.reg.data.f32 * exp2f(mul2->postFactor); ImmediateValue imm1; assert(mul2->op == OP_MUL && mul2->dType == TYPE_F32); @@ -782,9 +785,...
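The changed line in this patch multiplies the immediate by `exp2f(mul2->postFactor)`, which suggests that `postFactor` acts as a power-of-two exponent applied to the result of the multiply. A minimal sketch of that arithmetic, assuming this reading is right (the function names `effective_factor` and `check_effective_factor` are hypothetical and not part of nv50_ir):

```cpp
#include <cassert>
#include <cmath>

// Fold a power-of-two post-scale into an immediate multiplier:
// (x * imm) * 2^postFactor == x * (imm * 2^postFactor).
inline float effective_factor(float imm, int postFactor) {
    return imm * std::exp2(static_cast<float>(postFactor));
}

// Small self-check with values that are exactly representable in float.
inline void check_effective_factor() {
    assert(effective_factor(3.0f, 0) == 3.0f);   // no post-scale
    assert(effective_factor(3.0f, 2) == 12.0f);  // scaled by 4
    assert(effective_factor(3.0f, -1) == 1.5f);  // scaled by 1/2
}
```

Under this assumption, a peephole pass that reads the immediate without the scale would be working with the wrong constant, which appears to be what the patch guards against.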
2014 Jul 08
1
[PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses
...eephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index c162ac4..8d052c5 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -686,7 +686,7 @@ ConstantFolding::tryCollapseChainedMULs(Instruction *mul2, // b = mul a, imm // d = mul b, c -> d = mul_x_imm a, c int s2, t2; - insn = mul2->getDef(0)->uses.front()->getInsn(); + insn = (*mul2->getDef(0)->uses.begin())->getInsn(); if (!insn) return; mul1 = mul2; diff --git a/sr...
2010 Mar 03
5
[LLVMdev] folding x * 0 = 0
...f; return cos(0.5) * sin(0.5) * x; }; after compiling it with clang (cpp mode) and renaming _ZSt3sinf to sin and _ZSt3cosf to cos I get the following: define float @_Z3fooff(float %a, float %b) nounwind { entry: %mul = fmul float %a, %b ; <float> [#uses=1] %mul2 = fmul float %mul, 0.000000e+000 ; <float> [#uses=1] %mul6 = fmul float 0x3FDAED54A0000000, %mul2 ; <float> [#uses=1] ret float %mul6 } the sin and cos calls are folded, but not the mul by zero. Maybe this is missing in llvm::ConstantFoldInstOperands in ConstantFoldi...
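One likely reason the multiplication by zero survives is that, under IEEE 754 rules, `x * 0.0` is not always `+0.0`: a negative `x` yields `-0.0`, and an infinite or NaN `x` yields NaN. The sketch below illustrates this; the helper names are hypothetical, and it assumes the code is built without fast-math style flags that relax these rules.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Multiply by +0.0, as in: %mul2 = fmul float %mul, 0.0
inline float mul_by_zero(float x) { return x * 0.0f; }

// True only when the product is exactly +0.0, i.e. when replacing the
// multiplication with the constant 0.0 would not change the result.
inline bool folds_to_positive_zero(float x) {
    const float r = mul_by_zero(x);
    return r == 0.0f && !std::signbit(r);
}

inline void check_fold_cases() {
    const float inf = std::numeric_limits<float>::infinity();
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(folds_to_positive_zero(2.0f));    // finite positive input gives +0.0
    assert(!folds_to_positive_zero(-2.0f));  // negative input gives -0.0
    assert(!folds_to_positive_zero(inf));    // inf * 0 is NaN
    assert(!folds_to_positive_zero(nan));    // NaN propagates
}
```

A fold to a constant zero is therefore only safe when the optimizer is allowed to ignore NaNs, infinities, and the sign of zero.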
2019 Jul 23
2
[RFC] A new multidimensional array indexing intrinsic
...on-integer for illustration purposes: > > %arrayidx = call i64 @llvm.multidim.array.index.i64.p0f64.i64.i64.i64.i64 double* %A, i64 %str_1, i64 %idx_1, i64 %str_2, i64 %idx_2 > > According to the RFC, that would get lowered to this: > > %mul1 = mul nsw i64 %str_1, %idx_1 > %mul2 = mul nsw i64 %str_2, %idx_2 > %total = add nsw i64 %mul2, %mul1 > %arrayidx = getelementptr inbounds double, double* %A, i64 %total, !multidim !1 > > The problem I'm having is that the source element type in the GEP instruction in the lowering can only be inferred from the poi...
2017 Mar 15
2
Data structure improvement for the SLP vectorizer
...%element33 = getelementptr inbounds double, double* %data3, i32 3 %load30 = load double, double* %data3 %load31 = load double, double* %element31 %load32 = load double, double* %element32 %load33 = load double, double* %element33 %mul1 = fmul fast double %load20, %load10 %mul2 = fmul fast double %load21, %load11 %mul3 = fmul fast double %load22, %load10 %mul4 = fmul fast double %load23, %load11 %add1 = fadd fast double %load30, %mul1 %add2 = fadd fast double %load31, %mul2 %add3 = fadd fast double %load32, %mul3 %add4 = fadd fast double %load33,...
2010 Mar 01
2
[LLVMdev] constant folding for standard math functions
Hi! I'd like to replace all calls to standard math functions (e.g. sin(0.5)) by their result. What strategy do you recommend? Should I write a pass that does only this or should I copy and modify the SCCP pass? A problem with an extra pass could be that I need to alternate my pass and SCCP several times since the results of the math functions could be folded again. -Jochen
2010 Mar 01
0
[LLVMdev] constant folding for standard math functions
On Mar 1, 2010, at 9:44 AM, Jochen Wilhelmy wrote: > Hi! > > I'd like to replace all calls to standard math functions (e.g. sin(0.5)) by > their result. > What strategy do you recommend? > Should I write a pass that does only this or should I copy and > modify the SCCP pass? > > A problem with an extra pass could be that that I need to alternate > my pass and
2010 Mar 05
0
[LLVMdev] folding x * 0 = 0
...}; > > after compiling it with clang (cpp mode) and renaming _ZSt3sinf to sin > and _ZSt3cosf to cos I get the following: > > define float @_Z3fooff(float %a, float %b) nounwind { > entry: > %mul = fmul float %a, %b ;<float> [#uses=1] > %mul2 = fmul float %mul, 0.000000e+000 ;<float> [#uses=1] > %mul6 = fmul float 0x3FDAED54A0000000, %mul2 ;<float> [#uses=1] > ret float %mul6 > } > > the sin and cos calls are folded, but not the mul by zero. > Maybe this is missing in llvm::ConstantFo...
2019 Jul 22
1
[RFC] A new multidimensional array indexing intrinsic
...> ``` > > > %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2, %idx_2 > > > ``` > > > > > > is lowered to: > > > > > > ``` > > > %mul1 = mul nsw i64 %str_1, %idx_1 > > > %mul2 = mul nsw i64 %str_2, %idx_2 > > > %total = add nsw i64 %mul2, %mul1 > > > %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total, !multidim !1 > > > ``` > > > with guarantees that the first term in each multiplication is the stride > >...
2017 Mar 15
2
Data structure improvement for the SLP vectorizer
There was some discussion of this on the llvm-commits list, but I wanted to raise the topic for discussion here. The background of the -commits discussion was that r296863 added the ability to sort memory access when the SLP vectorizer reached a load (the SLP vectorizer starts at a store or some other sink, and tries to go up the tree vectorizing as it goes along - if the input is in a different
2019 Jul 21
6
[RFC] A new multidimensional array indexing intrinsic
...d by the lowered GEP is guaranteed to be in a canonical form which allows the analysis to infer stride and index sizes. A multidim index of the form: ``` %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2, %idx_2 ``` is lowered to: ``` %mul1 = mul nsw i64 %str_1, %idx_1 %mul2 = mul nsw i64 %str_2, %idx_2 %total = add nsw i64 %mul2, %mul1 %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total, !multidim !1 ``` with guarantees that the first term in each multiplication is the stride and the second term in each multiplication is the index. (What happens if intermedia...
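The lowering described here computes a linear element offset as the sum of stride-times-index products, then indexes the base pointer with it. A minimal C++ sketch of the same arithmetic for the two-dimensional case follows; `linear_offset` and `element_ptr` are hypothetical names for illustration and do not correspond to any LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the lowered IR: each product has the stride as its first operand
// and the index as its second, and the products are summed.
inline std::int64_t linear_offset(std::int64_t str_1, std::int64_t idx_1,
                                  std::int64_t str_2, std::int64_t idx_2) {
    const std::int64_t mul1 = str_1 * idx_1;
    const std::int64_t mul2 = str_2 * idx_2;
    return mul2 + mul1;  // %total = add nsw i64 %mul2, %mul1
}

// Counterpart of the getelementptr: advance the base pointer by the
// linear offset, measured in elements.
inline const std::int64_t* element_ptr(const std::int64_t* A,
                                       std::int64_t str_1, std::int64_t idx_1,
                                       std::int64_t str_2, std::int64_t idx_2) {
    return A + linear_offset(str_1, idx_1, str_2, idx_2);
}
```

For a row-major 3x4 array the strides would be 4 and 1, so element (2, 3) sits at offset 4*2 + 1*3 = 11. Keeping the stride in a fixed operand position is what would let an analysis recover the strides and indices from the lowered form.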
2019 Jul 22
2
[RFC] A new multidimensional array indexing intrinsic
...stride and index sizes. >> >> A multidim index of the form: >> ``` >> %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2, %idx_2 >> ``` >> >> is lowered to: >> >> ``` >> %mul1 = mul nsw i64 %str_1, %idx_1 >> %mul2 = mul nsw i64 %str_2, %idx_2 >> %total = add nsw i64 %mul2, %mul1 >> %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total, !multidim !1 >> ``` >> with guarantees that the first term in each multiplication is the stride >> and the second term in each multiplicat...
2019 Jul 22
2
[RFC] A new multidimensional array indexing intrinsic
...to infer stride and index sizes. > > A multidim index of the form: > ``` > %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2, %idx_2 > ``` > > is lowered to: > > ``` > %mul1 = mul nsw i64 %str_1, %idx_1 > %mul2 = mul nsw i64 %str_2, %idx_2 > %total = add nsw i64 %mul2, %mul1 > %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total, !multidim !1 > ``` > with guarantees that the first term in each multiplication is the stride > and the second term in each multiplicat...
2019 Jul 25
0
[RFC] A new multidimensional array indexing intrinsic
...> A multidim index of the form: >>> ``` >>> %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2, %idx_2 >>> ``` >>> >>> is lowered to: >>> >>> ``` >>> %mul1 = mul nsw i64 %str_1, %idx_1 >>> %mul2 = mul nsw i64 %str_2, %idx_2 >>> %total = add nsw i64 %mul2, %mul1 >>> %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total, !multidim !1 >>> ``` >>> with guarantees that the first term in each multiplication is the stride >>> and the second term...
2018 Sep 06
2
Replacing a function from one module into another one
Hi Philip, The error happens when the program finishes and it automatically calls the destructors, so it is not an error specifically inside my program. Here's the full code: #include "llvm/ExecutionEngine/ExecutionEngine.h" #include "llvm/ExecutionEngine/MCJIT.h" #include "llvm/IRReader/IRReader.h" #include "llvm/Support/TargetSelect.h" #include