thr3ads.net - search: "commute"

Displaying 20 results from an estimated 274 matches for "commute".

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

...umulator; yields: loop: vfmadd.213 y, x, acc vmovaps acc, x decl count jne loop instead of loop: vfmadd.231 acc, x, y decl count jne loop I have started writing a patch to generate the 231 variant by default, and I want to know whether I need to go to the trouble of adding custom commute logic. If these things aren't commutable then I don't need to worry at all. If they are commutable, but only in fast-math mode, then I can support that too. Thanks for the help! - Lang.

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

...> >> instead of >> >> loop: >> vfmadd.231 acc, x, y >> decl count >> jne loop >> >> I have started writing a patch to generate the 231 variant by default, >> and I want to know whether I need to go to the trouble of adding >> custom commute logic. If these things aren't commutable then I don't >> need to worry at all. If they are commutable, but only in fast-math >> mode, then I can support that too. >> >> Thanks for the help! >> >> - Lang. >> __________________________________________...

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

..., x > decl count > jne loop > > instead of > > loop: > vfmadd.231 acc, x, y > decl count > jne loop > > I have started writing a patch to generate the 231 variant by default, > and I want to know whether I need to go to the trouble of adding > custom commute logic. If these things aren't commutable then I don't > need to worry at all. If they are commutable, but only in fast-math > mode, then I can support that too. > > Thanks for the help! > > - Lang. > _______________________________________________ > LLVM Developers m...

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 23

...>>> loop: >>> vfmadd.231 acc, x, y >>> decl count >>> jne loop >>> >>> I have started writing a patch to generate the 231 variant by >>> default, and I want to know whether I need to go to the trouble of >>> adding custom commute logic. If these things aren't commutable then I >>> don't need to worry at all. If they are commutable, but only in >>> fast-math mode, then I can support that too. >>> >>> Thanks for the help! >>> >>> - Lang. >>> ______________...

Change Bit Value To Text When Viewed

2006 Feb 27

Hey All, very simple question for you folks ;) I have a @commutes object that for each record there is a bit value set (0/1) to denote if an accident occurred. What I want to do is have it display "Yes" or "No" when I view the listing of commutes. I can get it to display the bit value just fine with <%= commute.accident %> but how can I...

[LLVMdev] Unnecessary moves after sign-extension in 2-address target

2009 Apr 21

...otice this >> and >> transform it again? >> > > Yes, the later pass is the coalescer. It would be worthwhile to > understand why it is not coalescing the copies. > I discovered a curious phenomenon: The copies are necessary because TwoAddressInstructionPass commutes the second add. When I suppress the commute, the movs disappear and the code becomes optimal. It seems the two-address commuter is either buggy or inherently short-sighted/simple-minded and paints itself into a corner. How do you recommend I approach this problem? G

[X86] FMA transformation restrictions

2016 Sep 12

I noticed that the operand commuting code in X86InstrInfo.cpp treats scalar FMA intrinsics specially. It prevents operand commuting on these scalar instructions because the scalar FMA instructions preserve the upper bits of the vector. Presumably, the restrictions are there because commuting operands potentially changes the result upper bits. However, AFAIK the Intel and GNU FMA intrinsics

How to tell LLVM to treat Commutable library calls as such, for example multiplication?

2019 Jun 11

A few library calls are commutable by definition, for example multiplications. I defined them as LibCalls for my architecture. However, I found that arguments are always passed in the order they are generated by Clang thus missing possible optimisations. For example, the following IR code ; Function Attrs: minsize norecurse nounwind optsize readnone define dso_local i16 @multTest(i16 %a, i16

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 26

...> > eor r1, r1, #4294967295 ; and r0, r1, r0 >> >> On the surface it seems that the selector is not commuting the AND >> operands. I've attached the complete test files. I can take a look >> but I need a pointer to get started. > > No, isel is trying to commute the AND. See ARMGenDAGISel.inc (auto- > generated by tablegen): > > // Pattern: (and:i32 GPR:i32:$lhs, (xor:i32 t2_so_reg:i32:$rhs, > (imm:i32)<<P:Predicate_immAllOnes>>)) > // Emits: (t2BICrs:i32 GPR:i32:$lhs, t2_so_reg:i32:$rhs) > // Pattern complexity...

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 26

...and i32 %tmp, %a -- > > eor r1, r1, #4294967295 ; and r0, r1, r0 > > On the surface it seems that the selector is not commuting the AND > operands. I've attached the complete test files. I can take a look > but I need a pointer to get started. No, isel is trying to commute the AND. See ARMGenDAGISel.inc (auto- generated by tablegen): // Pattern: (and:i32 GPR:i32:$lhs, (xor:i32 t2_so_reg:i32:$rhs, (imm:i32)<<P:Predicate_immAllOnes>>)) // Emits: (t2BICrs:i32 GPR:i32:$lhs, t2_so_reg:i32:$rhs) // Pattern complexity = 19 cost = 1 size = 0...

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 25

Using the Thumb-2 target we see that ORN (a | ^b) and BIC (a & ^b) have similar patterns, as we would expect: defm t2BIC : T2I_bin_irs<"bic", BinOpFrag<(and node:$LHS, (not node:$RHS))>>; defm t2ORN : T2I_bin_irs<"orn", BinOpFrag<(or node:$LHS, (not node:$RHS))>>; Compiling the following three works as expected: %tmp1 = xor i32

[LLVMdev] "icmp eq", "icmp ne" not commuting operands on ARM

2009 Jun 26

NE and EQ comparisons should be able to commute their operands. But, for ARM at least, this does not seem to be happening. The first sequence below generates CMN (compare negated) but the second does not (complete test attached). These seem to map to ARMcmpNZ. Where would I look to see if that is marked commutative? %nb = sub i32 0...

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Ivan, Sorry, no, I wasn't clear enough. Both "op dst_reg,immediate,src_reg" and "op dst_reg,src_reg,immediate" are allowed in the ALU ops. For most instructions these are two different things - e.g. sub a,5,b is different from sub a,b,5 obviously - but for things like add they just define the same thing. My problem is that LLVM won't allow immediates on the LHS of

[LLVMdev] Efficient Pattern matching in Instruction Combine

2014 Aug 07

Hi, All, Duncan, Rafael, David, Nick. This is regarding pattern matching in the InstructionCombine pass. We use 'match' functions many times, but it doesn't do the pattern matching effectively. E.g., let's take the patterns: (A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C and (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C. Both patterns above are the same, since ^ is commutative in Op0. But,
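
The equivalence stated in this snippet can be checked by brute force. The sketch below is only an illustration in Python, not LLVM code: the function names (`lhs`, `lhs_commuted`, `rhs`, `all_agree`) are hypothetical, and it assumes that checking all 4-bit values is enough, which holds because the operators act on each bit independently.

```python
from itertools import product


def lhs(a, b, c):
    # First pattern from the snippet: (A ^ B) | ((B ^ C) ^ A)
    return (a ^ b) | ((b ^ c) ^ a)


def lhs_commuted(a, b, c):
    # Second pattern: same expression with the first xor's operands swapped.
    return (b ^ a) | ((b ^ c) ^ a)


def rhs(a, b, c):
    # Simplified form both patterns are claimed to reduce to: (A ^ B) | C
    return (a ^ b) | c


def all_agree(bits=4):
    # Exhaustively compare the three forms over every triple of small values.
    values = range(1 << bits)
    return all(
        lhs(a, b, c) == lhs_commuted(a, b, c) == rhs(a, b, c)
        for a, b, c in product(values, repeat=3)
    )
```

Calling `all_agree()` compares the three forms on every triple. A matcher that tries only one operand order would recognize `lhs` but miss `lhs_commuted`, even though the two always compute the same value.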

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Dear all, I was wondering if it is possible in TableGen to either: 1. Selectively define an instruction depending on an SDNode's properties, e.g. if the SDNode is not commutative. 2. Override/ignore the TableGen error given when a commutative node has an immediate on the LHS. My case comes from trying to define a generic ALU operation multiclass for my target, which includes a

[LLVMdev] Intrinsic's "Commutative" property

2012 Jul 24

Hi, What does it mean when "Commutative" property is applied to an intrinsic with more than two arguments? For example, __builtin_ia32_dppd has this property. Thanks. -- Simon

[LLVMdev] Help me improve two-address code

2009 Apr 16

Evan Cheng wrote: > On Apr 16, 2009, at 3:17 PM, Greg McGary wrote: > >> Is there some optimizer knob I'm not turning properly? In more complex >> cases, GCC does poorly with two-address operand choices and so bloats >> the code with unnecessary register moves. I have high hopes LLVM >> can do better, so this result for a simple case is bothersome. >>

[LLVMdev] Unnecessary moves after sign-extension in 2-address target

2009 Apr 22

...> transform it again? >>> >> >> Yes, the later pass is the coalescer. It would be worthwhile to >> understand why it is not coalescing the copies. >> > > I discovered a curious phenomenon: > > The copies are necessary because TwoAddressInstructionPass commutes > the second add. When I suppress the commute, the movs disappear and > the code becomes optimal. It seems the two-address commuter is > either buggy > or inherently short-sighted/simple-minded and paints itself into a > corner. > > How do you recommend I approach this pr...

[LLVMdev] Lost instcombine opportunity: "or"s of "icmp"s (commutability)

2008 Oct 08

instcombine can handle certain orders of "icmp"s that are "or"ed together: x != 5 OR x > 10 OR x == 8 becomes.. x != 5 OR x == 8 becomes.. x != 5 However, a different ordering prevents the simplification: x == 8 OR x > 10 OR x != 5 becomes.. %or.eq8.gt10 OR x != 5 and that can't be simplified because we now have an "or" OR "icmp". What would I
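
Both orderings in this snippet compute the same predicate, so the difference lies only in which one the optimizer manages to fold. A small Python sketch, with hypothetical function names and an arbitrarily chosen test range, confirms that each ordering agrees with the simplified `x != 5`:

```python
def order_one(x):
    # Ordering the snippet says is simplified: x != 5 OR x > 10 OR x == 8
    return (x != 5) or (x > 10) or (x == 8)


def order_two(x):
    # Ordering the snippet says is not simplified: x == 8 OR x > 10 OR x != 5
    return (x == 8) or (x > 10) or (x != 5)


def simplified(x):
    # Fully folded form: x > 10 and x == 8 both already imply x != 5.
    return x != 5


def orders_match(lo=-64, hi=64):
    # Compare all three predicates over a small range of integers.
    return all(
        order_one(x) == order_two(x) == simplified(x)
        for x in range(lo, hi)
    )
```

Since `or` is commutative and associative, `orders_match()` holds for any range. The lost opportunity described in the snippet is therefore a matter of how the combine visits the operands, not of semantics.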

[LLVMdev] [PATCH] Lost instcombine opportunity: "or"s of "icmp"s (commutability)

2008 Oct 08

...or" instructions. E.g., %or.eq.8.gt10 > originally had more than 1 use. > > Am I on the right track (or does LLVM already support this in another > optimization pass?) > > Ed > -- Ed -------------- next part -------------- A non-text attachment was scrubbed... Name: or.commute.patch Type: text/x-patch Size: 3090 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20081008/80069501/attachment.bin>
