thr3ads.net - search: "commutes"

Displaying 20 results from an estimated 274 matches for "commutes".

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

Hi all, The 213 variant of the FMA3 instructions is currently marked commutable (see X86InstrFMA.td). Is that safe? According to the ISA the FMA3 instructions aren't commutable for non-numeric results, so I'd have thought commuting this would only be valid in fast-math mode? For the curious, the reason that I'm asking is that we currently always select the 213 variant, but this

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

Hi Kay, My patch will partially address your bug. For now I'm just looking to switch the default FMA from vfmadd213xx to vfmadd231xx. That will cause the code in PR17229 to compile as desired, but would regress code like: double foo(double a, double b, double c) { return a * b + c; } Which will now require a vmovaps + vfmadd231. If this impacts real benchmarks we could add an

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

Hi Lang, Unfortunately, I don't have an answer on the commutability question, but I wanted to let you know that I filed a bug on this: http://llvm.org/bugs/show_bug.cgi?id=17229 This also shows a memory operand variant of the fma that you may want to consider in your patch and testcases. Thanks! On Thu, Dec 19, 2013 at 10:45 PM, Lang Hames <lhames at gmail.com> wrote: > Hi all,

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 23

Hi Elena, Thank you very much for looking into that. I'll go ahead and remove the isCommutable flag from those instructions, since it sounds like that's the right thing to do. I would still like to change the default from the 213 variant to 231 too, as this will reduce code-size for accumulator-style loops. I have at least one benchmark that shows significant speedups when this change

Change Bit Value To Text When Viewed

2006 Feb 27

Hey All, very simple question for you folks ;) I have a @commutes object in which each record has a bit value (0/1) denoting whether an accident occurred. What I want to do is display "Yes" or "No" when I view the listing of commutes. I can get it to display the bit value just fine with <%= commute.accident %> but how can I...

[LLVMdev] Unnecessary moves after sign-extension in 2-address target

2009 Apr 21

...otice this >> and >> transform it again? >> > > Yes, the later pass is the coalescer. It would be worthwhile to > understand why it is not coalescing the copies. > I discovered a curious phenomenon: The copies are necessary because TwoAddressInstructionPass commutes the second add. When I suppress the commute, the movs disappear and the code became optimal. It seems the two-address commuter is either buggy or inherently short-sighted/simple-minded and paints itself into a corner. How do you recommend I approach this problem? G

[X86] FMA transformation restrictions

2016 Sep 12

I noticed that the operand commuting code in X86InstrInfo.cpp treats scalar FMA intrinsics specially. It prevents operand commuting on these scalar instructions because the scalar FMA instructions preserve the upper bits of the vector. Presumably, the restrictions are there because commuting operands potentially changes the result upper bits. However, AFAIK the Intel and GNU FMA intrinsics

How to tell LLVM to treat Commutable library calls as such, for example multiplication?

2019 Jun 11

A few library calls are commutable by definition, for example multiplications. I defined them as LibCalls for my architecture. However, I found that arguments are always passed in the order they are generated by Clang thus missing possible optimisations. For example, the following IR code ; Function Attrs: minsize norecurse nounwind optsize readnone define dso_local i16 @multTest(i16 %a, i16

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 26

On Jun 25, 2009, at 6:06 PM, Evan Cheng wrote: > > On Jun 25, 2009, at 4:38 PM, David Goodwin wrote: > >> Using the Thumb-2 target we see that ORN ( a | ^b) and BIC (a & ^b) >> have similar patterns, as we would expect: >> >> defm t2BIC : T2I_bin_irs<"bic", BinOpFrag<(and node:$LHS, (not >> node:$RHS))>>; >> defm t2ORN :

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 26

On Jun 25, 2009, at 4:38 PM, David Goodwin wrote: > Using the Thumb-2 target we see that ORN ( a | ^b) and BIC (a & ^b) > have similar patterns, as we would expect: > > defm t2BIC : T2I_bin_irs<"bic", BinOpFrag<(and node:$LHS, (not node: > $RHS))>>; > defm t2ORN : T2I_bin_irs<"orn", BinOpFrag<(or node:$LHS, (not node: >

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 25

Using the Thumb-2 target we see that ORN ( a | ^b) and BIC (a & ^b) have similar patterns, as we would expect: defm t2BIC : T2I_bin_irs<"bic", BinOpFrag<(and node:$LHS, (not node: $RHS))>>; defm t2ORN : T2I_bin_irs<"orn", BinOpFrag<(or node:$LHS, (not node: $RHS))>>; Compiling the following three works as expected: %tmp1 = xor i32

[LLVMdev] "icmp eq", "icmp ne" not commuting operands on ARM

2009 Jun 26

NE and EQ comparisons should be able to commute their operands. But, for ARM at least, this does not seem to be happening. The first sequence below generates CMN (compare negated) but the second does not (complete test attached). These seem to map to ARMcmpNZ. Where would I look to see if that is marked commutative? %nb = sub i32 0, %b %tmp = icmp ne i32 %a, %nb %nb = sub

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Ivan, Sorry, no, I wasn't clear enough. Both "op dst_reg,immediate,src_reg" and "op dst_reg,src_reg,immediate" are allowed in the ALU ops. For most instructions these are two different things (e.g. sub a,5,b is obviously different from sub a,b,5), but for things like add they define the same thing. My problem is that LLVM won't allow immediates on the LHS of

[LLVMdev] Efficient Pattern matching in Instruction Combine

2014 Aug 07

Hi, All, Duncan, Rafael, David, Nick. This is regarding pattern matching in the InstructionCombine pass. We use the 'match' functions many times, but they don't do the pattern matching effectively. For example, take the patterns: (A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C and (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C. Both patterns above are the same, since ^ is commutative in Op0. But,
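The two rewrites quoted in this snippet differ only in the operand order of the first xor. As a quick sanity check of the algebra (a sketch only, not InstCombine code; all function names here are made up for illustration), the identity can be verified exhaustively over small bit-widths:

```python
from itertools import product


def lhs(a, b, c):
    # First pattern from the snippet: (A ^ B) | ((B ^ C) ^ A)
    return (a ^ b) | ((b ^ c) ^ a)


def lhs_commuted(a, b, c):
    # Second pattern, with the first xor commuted: (B ^ A) | ((B ^ C) ^ A)
    return (b ^ a) | ((b ^ c) ^ a)


def rhs(a, b, c):
    # Simplified form both patterns should fold to: (A ^ B) | C
    return (a ^ b) | c


def identity_holds(bits=3):
    # Exhaustively compare all three forms for every triple of `bits`-wide values.
    values = range(1 << bits)
    return all(
        lhs(a, b, c) == rhs(a, b, c) and lhs_commuted(a, b, c) == rhs(a, b, c)
        for a, b, c in product(values, repeat=3)
    )
```

Because the operators are bitwise, agreement on 1-bit values already implies agreement at any width; the wider check is just a convenience.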

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Dear all, I was wondering if it is possible in TableGen to either: 1. Selectively define an instruction depending on an SDNode's properties, e.g. if the SDNode is not commutative. 2. Override/ignore the TableGen error given when a commutative node has an immediate on the LHS. My case comes from trying to define a generic ALU operation multiclass for my target, which includes a

[LLVMdev] Intrinsic's "Commutative" property

2012 Jul 24

Hi, What does it mean when "Commutative" property is applied to an intrinsic with more than two arguments? For example, __builtin_ia32_dppd has this property. Thanks. -- Simon

[LLVMdev] Help me improve two-address code

2009 Apr 16

Evan Cheng wrote: > On Apr 16, 2009, at 3:17 PM, Greg McGary wrote: > >> Is there some optimizer knob I'm not turning properly? In more complex >> cases, GCC does poorly with two-address operand choices and so bloats >> the code with unnecessary register moves. I have high hopes LLVM >> can do better, so this result for a simple case is bothersome. >>

[LLVMdev] Unnecessary moves after sign-extension in 2-address target

2009 Apr 22

...> transform it again? >>> >> >> Yes, the later pass is the coalescer. It would be worthwhile to >> understand why it is not coalescing the copies. >> > > I discovered a curious phenomenon: > > The copies are necessary because TwoAddressInstructionPass commutes > the second add. When I suppress the commute, the movs disappear and > the code became optimal. It seems the two-address commuter is > either buggy > or inherently short-sighted/simple-minded and paints itself into a > corner. > > How do you recommend I approach this pro...

[LLVMdev] Lost instcombine opportunity: "or"s of "icmp"s (commutability)

2008 Oct 08

instcombine can handle certain orders of "icmp"s that are "or"ed together: x != 5 OR x > 10 OR x == 8 becomes.. x != 5 OR x == 8 becomes.. x != 5 However, a different ordering prevents the simplification: x == 8 OR x > 10 OR x != 5 becomes.. %or.eq8.gt10 OR x != 5 and that can't be simplified because we now have an "or" OR "icmp". What would I
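Both orderings quoted in this snippet describe the same predicate, which is why the second one is a missed simplification rather than a different result. A small Python check (purely illustrative; the function names are hypothetical and this says nothing about how instcombine itself works) confirms that either ordering is equivalent to `x != 5`:

```python
def first_ordering(x):
    # x != 5 OR x > 10 OR x == 8
    return (x != 5) or (x > 10) or (x == 8)


def second_ordering(x):
    # x == 8 OR x > 10 OR x != 5
    return (x == 8) or (x > 10) or (x != 5)


def simplified(x):
    # The fully simplified form named in the snippet
    return x != 5


def orderings_agree(lo=-64, hi=64):
    # Compare all three predicates over a range that covers 5, 8 and 10.
    return all(
        first_ordering(x) == second_ordering(x) == simplified(x)
        for x in range(lo, hi + 1)
    )
```

The only input for which all three return false is `x == 5`, since neither `x > 10` nor `x == 8` can hold there.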

[LLVMdev] [PATCH] Lost instcombine opportunity: "or"s of "icmp"s (commutability)

2008 Oct 08

Here's an initial stab, but I'm not too happy about temporarily adding a new instruction and then removing it, because returning it will have it added back in to replace other uses. I also added a couple of test cases that pass with the new InstructionCombining changes (the old code only passes one of the added tests). Also, this change exposes some simplification for
