similar to: [LLVMdev] constant folding for standard math functions

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] constant folding for standard math functions"

2010 Mar 01
0
[LLVMdev] constant folding for standard math functions
On Mar 1, 2010, at 9:44 AM, Jochen Wilhelmy wrote: > Hi! > > I'd like to replace all calls to standard math functions (e.g. sin(0.5)) by > their result. > What strategy do you recommend? > Should I write a pass that does only this or should I copy and > modify the SCCP pass? > > A problem with an extra pass could be that I need to alternate > my pass and
2010 Mar 03
5
[LLVMdev] folding x * 0 = 0
Hi! > sin/cos etc should already be handled by lib/Analysis/ConstantFolding.cpp. > Thanks for the hint and it works! Now I have a new Problem: I have this function: float foo(float a, float b) { float x = a * b * 0.0f; return cos(0.5) * sin(0.5) * x; }; after compiling it with clang (cpp mode) and renaming _ZSt3sinf to sin and _ZSt3cosf to cos I get the following: define
2010 Mar 05
0
[LLVMdev] folding x * 0 = 0
Hi Jochen, I just wanted to point out that if x = inf the result of x * 0 is an indeterminate form, so reducing it to zero would give the wrong result in that case. Thanks, Javier On 3/3/2010 8:56 AM, Jochen Wilhelmy wrote: > Hi! > > > >> sin/cos etc should already be handled by lib/Analysis/ConstantFolding.cpp. >> >> > Thanks for the hint and it
2014 Sep 19
2
[LLVMdev] More careful treatment of floating point exceptions
Hi Sanjay, Thanks, I saw this flag and it definitely should be considered, but it appeared to me to be a static characteristic of the target platform. I'm not sure how appropriate it would be to change its value from a front-end. It says "Has", while an optional flag would rather say "Uses", meaning that the implementation cares about floating point exceptions. Regards, Sergey
2017 Mar 15
2
Data structure improvement for the SLP vectorizer
Maybe it would be illustrative to give an IR example of the case I'm interested in. Consider define void @"julia_transform_bvn_derivs_hessian!"(double* %data, double* %data2, double *%data3, double *%out) { %element11 = getelementptr inbounds double, double* %data, i32 1 %load10 = load double, double* %data %load11 = load double, double* %element11 %element21 =
2013 Nov 11
2
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, LLVM community: I found that basicaa does not seem to report must-not-alias for __restrict__ arguments in C/C++. It only compares two pointers and the underlying objects they point to. I wonder how clang does alias analysis for the C/C++ keyword restrict. Let's assume we compile the following code: $cat myalias.cc float foo(float * __restrict__ v0, float * __restrict__ v1, float * __restrict__ v2, float *
2017 Mar 15
2
Data structure improvement for the SLP vectorizer
There was some discussion of this on the llvm-commits list, but I wanted to raise the topic for discussion here. The background of the -commits discussion was that r296863 added the ability to sort memory access when the SLP vectorizer reached a load (the SLP vectorizer starts at a store or some other sink, and tries to go up the tree vectorizing as it goes along - if the input is in a different
2014 Jul 08
1
[PATCH] nv50/ir: use unordered_set instead of list to keep our instructions in uses
This shortens runtime of piglit test fp-long-alu to ~22s No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 6 +++--- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 7 ++++--- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
2014 Apr 22
2
[LLVMdev] InstCombine strips the inBounds attribute in GetElementPtr ConstantExpr
I can't upload my program due to confidentiality, but the problem is obvious. At lib/Analysis/ConstantFolding.cpp:646 Constant *C = ConstantExpr::getGetElementPtr(Ops[0], NewIdxs); if (ConstantExpr *CE = dyn_cast<ConstantExpr>(C)) { if (Constant *Folded = ConstantFoldConstantExpression(CE, TD, TLI)) C = Folded; } The generated ConstantExpr C doesn't inherit the
2013 Nov 12
0
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, Your problem is that the function arguments, which are marked as noalias, are not being directly used as the base objects of the array accesses: > %v0.addr = alloca float*, align 8 > %v1.addr = alloca float*, align 8 > %v2.addr = alloca float*, align 8 > %t.addr = alloca float*, align 8 ... > store float* %v0, float** %v0.addr, align 8 > store float* %v1, float** %v1.addr,
2014 Jun 03
8
[PATCH v2 0/4] Constant folding of new Instructions
And another try for constant folding of Instructions for nvc0. Please Review this! Thanks, Tobias Klausmann Tobias Klausmann (4): nvc0/ir: clear subop when folding constant expressions nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions nvc0/ir: Handle OP_BFIND when folding constant expressions nvc0/ir: Handle OP_POPCNT when folding constant expressions
2014 May 29
4
Add constant folding for new opcodes
Hi, please review the following 4 patches: 1b1cfc6 nvc0/ir: Handle OP_BFIND when folding constant expressions d2d2727 nvc0/ir: Handle OP_POPCNT when folding constant expressions 86a1ee6 nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions 84563bf nvc0/ir: clear subop when folding constant expressions src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 39
2014 Nov 27
2
[LLVMdev] Fast-math flags in constant expressions
Hi, I'm wondering why lib/AsmParser/LLParser handles fast-math flags in the following IR: ... %val = fmul nnan double 1.0, 1.0 ... but doesn't allow any flags if "fmul" is inside "phi": ... %val = phi double [ fmul (double 1.0, double 1.0), %cond.true ], [ fmul (double 1.0, double 1.0), %cond.false ] ...
2014 Jun 03
6
[PATCH v3 0/4] Constant folding of new Instructions
Yet another try for constant folding of Instructions for nvc0. Please Review this again! (Hopefully the last time ;-) ) Tobias Klausmann (4): nvc0/ir: clear subop when folding constant expressions nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions nvc0/ir: Handle OP_BFIND when folding constant expressions nvc0/ir: Handle OP_POPCNT when folding constant
2019 May 03
2
[ConstantExpr] Adding folding tests
Hey everyone, I'd like to add some new constant foldings to ConstantExpr -- in particular ConstantExpr::get(...) and friends. But, I'm having trouble finding the correct place for adding IR tests in the /test directory. Any suggestions? Thanks, Cam
2014 Sep 25
2
[LLVMdev] More careful treatment of floating point exceptions
Hi again, It's partially done. My concern is that it won't be accepted as is because of adding the flag parameter in a lot of places. I'd like to show what it looks like (here, not on llvm-commit yet), maybe someone could suggest a better way. There are two sources of the flag: a field of TargetOptions and a function attribute. I had to add the latter one for the InstCombine pass. Still
2014 Dec 02
2
[LLVMdev] Fast-math flags in constant expressions
Out of curiosity, how would you envision fast-math flags interacting with constant expressions? Off the top of my head, I can’t think of any flags that would be relevant if the expression can just be constant-folded away at full precision anyways. > On Nov 28, 2014, at 4:56 AM, Sergey Dmitrouk <sdmitrouk at accesssoftek.com> wrote: > > Doesn't look like a bug, more like a
2018 Aug 21
4
different output with fast-math flag
This is of course not homework. I am trying to understand how fast math optimizations work in llvm. When I compared IR for both the programs, the only thing I have noticed is that fdiv and fmul are replaced with fdiv fast and fmul fast. Not sure what happens in fdiv fast and fmul fast. I feel that it's because d/max is a really small number and fast-math does not care about small numbers and consider
2014 May 29
2
[PATCH 2/4] nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 58092f4..93f7c2a 100644 ---
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
Previously we'd end up with an unnecessary mov for the third immediate value. total instructions in shared programs : 851881 -> 851864 (-0.00%) total gprs used in shared programs : 110295 -> 110295 (0.00%) total local used in shared programs : 1020 -> 1020 (0.00%) local gpr inst bytes helped 0 0 17 17