similar to: [LLVMdev] Instruction combiner multiplication canonicalization

Displaying 20 results from an estimated 500 matches similar to: "[LLVMdev] Instruction combiner multiplication canonicalization"

2014 Jul 01
2
[LLVMdev] Probable error in InstCombine
I've found what appears to be a bug in instcombine. Specifically, the transformation of -(X/C) to X/(-C) is invalid if C == INT_MIN. Specifically, if I have > define i32 @foo(i32 %x) #0 { > entry: > %div = sdiv i32 %x, -2147483648 > %sub = sub nsw i32 0, %div > ret i32 %sub > } then opt -instcombine will produce > define i32 @foo(i32 %x) #0 { > entry: > %sub
2008 May 17
0
[LLVMdev] More info, was Help needed after hiatus
On Sat, May 17, 2008 at 11:34 AM, Richard Pennington <rich at pennware.com> wrote: > If I run the optimizer (opt) on this code snippet with -std-compile-opts > the optimizer hangs. > > > ; ModuleID = 'test.ubc' > target datalayout = >
2019 Dec 31
3
Any significance for m_OneUse in (X / Y) / Z => X / (Y * Z) ??
Dear All, The InstCombine pass performs the following transformation. Z / (X / Y) => (Y * Z) / X This is performed only when operand Op1 ( (X/Y) in this case) has only one use in future. The code snippet is shown below. if (match(Op1, m_OneUse(m_FDiv(m_Value(X), m_Value(Y)))) && (!isa<Constant>(Y) || !isa<Constant>(Op0))) { // Z / (X / Y) => (Y *
2020 Jan 03
3
Any significance for m_OneUse in (X / Y) / Z => X / (Y * Z) ??
A couple more general comments: 1. There shouldn't be any correctness issues removing one-use checks (the transform should be safe independently of use-counts). 2. Ideally, you can remove a m_OneUse() from the code and run 'make check' or similar, and you will see a regression test failure because we have a 'negative' test to cover that pattern. That should make it clear that
2020 Jan 06
2
Any significance for m_OneUse in (X / Y) / Z => X / (Y * Z) ??
Is your case the case mentioned in the subject or a different case? ~Craig On Sun, Jan 5, 2020 at 8:11 PM raghesh <raghesh.a at gmail.com> wrote: > Thanks Sanjay for your comments. > > So, if we want to add a transformation avoiding the one-use check, which > one is the ideal pass? Shall we do it in -aggressive-instcombine? I came > to know that if the pattern search
2017 Sep 13
2
How to add optimizations to InstCombine correctly?
Hi, I am working on PR34474 and try to add a new optimization to InstCombine. Like in other parts of the visitMul function I add a Shl through the IR builder and create a new BinaryOp which I return from visitMul. If I understand correctly the new BinaryOp returned from visitMul should replace the original Instruction in the Worklist. However, I end up in an infinite loop and the Instruction
2017 Jul 13
2
failing to optimize boolean ops on cmps
This can't be an instsimplify though? The values we want in these cases do not exist already: %res = or i8 %b, %a %res = or i1 %cmp, %c On Thu, Jul 13, 2017 at 5:10 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > On Thu, Jul 13, 2017 at 2:12 PM, Sanjay Patel <spatel at rotateright.com> > wrote: > >> We have several optimizations in InstCombine
2016 Feb 25
1
Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 11:44 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> The only optimizations I can think of that are okay are algebraic >> simplifications that don't exploit no-overflow, inbounds or exact > > Why? Can you provide an example using nsw, inbounds, etc.? I think the same case as the general UB applies: void foo(int n) available_externally { if (n
2017 Sep 19
0
How to add optimizations to InstCombine correctly?
I am currently improving the D37896 to include the suggestions from Chad. However, running the lit checks for the x86 backend I observe some changes in the generated MC, e.g.: llvm/test/CodeGen/X86/lea-3.ll:13:10: error: expected string not found in input ; CHECK: leal ([[A0]],[[A0]],2), %eax ^ <stdin>:10:2: note: scanning from here orq %rdi, %rax ^ <stdin>:10:2:
2017 Sep 16
2
How to add optimizations to InstCombine correctly?
This conversation has (partially) moved on to D37896 now, but if possible I was hoping that we could perform this in DAGCombiner and remove the various target specific combines that we still have. At least ARM/AARCH64 and X86 have cases that can hopefully be generalised and removed, but there will probably be a few legality/perf issues that will occur. Simon. > On 14 Sep 2017, at 06:23,
2017 Sep 13
3
How to add optimizations to InstCombine correctly?
There is in fact a transform out there somewhere that reverses yours. define i64 @foo(i64 %a) { %b = shl i64 %a, 5 %c = add i64 %b, %a ret i64 %c } becomes define i64 @foo(i64 %a) { %c = mul i64 %a, 33 ret i64 %c } ~Craig On Wed, Sep 13, 2017 at 10:11 AM, Craig Topper <craig.topper at gmail.com> wrote: > Your code seems fine. InstCombine can infinite loop if some other
2017 Sep 14
3
How to add optimizations to InstCombine correctly?
Hi Craig, thanks for digging into this. So InstCombine is the wrong place for fixing PR34474. Can you give me a hint where such an optimization should go into CodeGen? I am not really familiar with stuff that happens after the MidLevel. Cheers, Michael Am 13.09.2017 um 19:21 schrieb Craig Topper: > And that is less instructions. So from InstCombine's perspective the > multiply is
2017 Jul 14
2
failing to optimize boolean ops on cmps
On 07/13/2017 06:40 PM, Daniel Berlin via llvm-dev wrote: > Ah yes, it can only return one instruction and you need two. > > NewGVN could still handle it if instsimplify was changed to return the > right binary operators because it has infrastructure that can handle > insertion. > (it would need a little work). > > > (At this point, InstCombine seems to have become
2015 Apr 22
4
[LLVMdev] Missed vectorization opportunities?
Hi, I am trying to understand the limitations of the current vectorizer, and came upon these test cases that fail to get vectorized. 1. loop1 below (note the increment by 2) fails to get vectorized because the access a[j+1] is considered to wrap around (the corresponding SCEV doesn't have nsw/nuw set) and hence isStridedPtr() in LoopAccessAnalysis.cpp return false. #define SIZE 100000 void
2010 Jan 05
0
[LLVMdev] [llvm-commits] [llvm] r92458 - in /llvm/trunk: lib/Target/README.txt lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/or.ll
Hi Bill- For what it's worth, a simple truth table proves Chris correct. Alastair On 5 Jan 2010, at 02:46, Bill Wendling wrote: > On Jan 3, 2010, at 10:04 PM, Chris Lattner wrote: > >> Author: lattner >> Date: Mon Jan 4 00:03:59 2010 >> New Revision: 92458 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=92458&view=rev >> Log: >>
2017 Sep 19
0
How to add optimizations to InstCombine correctly?
Hi Sanjay, thanks for enlighten me on terms of tests. I assume I have to run the test-suite benchmarks to check for regressions? Is there a guide to get the metrics from the benchmarks? Cheers, Michael BTW the beginner tag for bugs was really a good idea to get started with contributing to llvm. On Tue, Sep 19, 2017 at 3:58 PM +0200, "Sanjay Patel" <spatel at
2018 Dec 18
2
should we do this time-consuming transform in InstCombine?
Hi Roman, Thanks for your good idea. I think it can solve the abs issue very well. I can continue with my work now^-^. But if it is not abs and there is no select, %res = OP i32 %b, %a %sub = sub i32 0, %b %res2 = OP i32 %sub, %a theoretically, we can still do the following transform for the above pattern: %res2 = OP i32 %sub, %a ==> %res2 = sub i32 0, %res Not sure whether we can do it
2016 Feb 25
0
Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message ----- > From: "Sanjoy Das" <sanjoy at playingwithpointers.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Chandler Carruth" <chandlerc at google.com>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames" > <listmail at philipreames.com>, "Duncan P. N. Exon Smith"
2016 May 31
0
AMDGPUPromoteAlloca assume 3-dims enabled?
hi, list, I found AMDGPUPromoteAlloca calculates newly ptr as follows: std::tie(TCntY, TCntZ) = getLocalSizeYZ(Builder); Value *TIdX = getWorkitemID(Builder, 0); Value *TIdY = getWorkitemID(Builder, 1); Value *TIdZ = getWorkitemID(Builder, 2); Value *Tmp0 = Builder.CreateMul(TCntY, TCntZ, "", true, true); Tmp0 = Builder.CreateMul(Tmp0, TIdX); Value *Tmp1 =
2016 Feb 25
2
Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 11:18 PM, Hal Finkel <hfinkel at anl.gov> wrote: > It might be much more challenging, but let's try. This is not an > issue we need to fix by the end of the month, and the potential > optimization regressions are significant. Our deductions of > readonly/readnone/nocapture/etc. are really important for enabling > other optimizations. Given that all