thr3ads.net - search: "commuting"

Displaying 20 results from an estimated 274 matches for "commuting".

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

Hi all, The 213 variant of the FMA3 instructions is currently marked commutable (see X86InstrFMA.td). Is that safe? According to the ISA the FMA3 instructions aren't commutable for non-numeric results, so I'd have thought commuting this would only be valid in fast-math mode? For the curious, the reason that I'm asking is that we currently always select the 213 variant, but this introduces extra copies in accumulator-style loops. Something like: while (...) accumulator = x * y + accumulator; yields: loop: vfmadd...
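
The copy problem described in this snippet can be illustrated with a small register model. This is only a sketch under stated assumptions: the register names, the helper functions, and the choice of the 231 form as the alternative are illustrative and not taken from the thread, and Python's `*` and `+` are not fused, so only the operand placement is modelled, not the rounding behaviour.

```python
# Simplified model of two FMA3 operand orders. Each form overwrites its
# first operand (op1) with the result:
#   213: op1 = op2 * op1 + op3
#   231: op1 = op2 * op3 + op1
# Register names ("acc", "x", "y") are hypothetical.

def fma213(regs, op1, op2, op3):
    regs[op1] = regs[op2] * regs[op1] + regs[op3]

def fma231(regs, op1, op2, op3):
    regs[op1] = regs[op2] * regs[op3] + regs[op1]

def accumulate_213(pairs):
    """acc = x * y + acc using only the 213 form; returns (acc, copies)."""
    regs = {"acc": 0.0, "x": 0.0, "y": 0.0}
    copies = 0
    for x, y in pairs:
        regs["x"], regs["y"] = x, y
        fma213(regs, "x", "y", "acc")  # result lands in x, not in acc
        regs["acc"] = regs["x"]        # extra copy back into the accumulator
        copies += 1
    return regs["acc"], copies

def accumulate_231(pairs):
    """Same loop using the 231 form; the result lands directly in acc."""
    regs = {"acc": 0.0, "x": 0.0, "y": 0.0}
    for x, y in pairs:
        regs["x"], regs["y"] = x, y
        fma231(regs, "acc", "x", "y")  # no copy needed
    return regs["acc"], 0
```

In this model both loops compute the same sum, but the 213-only version needs one register copy per iteration because the destination is tied to a multiplicand rather than to the addend.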

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

...wrote: >> >> Hi all, >> >> The 213 variant of the FMA3 instructions is currently marked >> commutable (see X86InstrFMA.td). Is that safe? According to the ISA >> the FMA3 instructions aren't commutable for non-numeric results, so >> I'd have thought commuting this would only be valid in fast-math mode? >> >> For the curious, the reason that I'm asking is that we currently >> always select the 213 variant, but this introduces extra copies in >> accumulator-style loops. Something like: >> >> while (...) >>...

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

...mes <lhames at gmail.com> wrote: > Hi all, > > The 213 variant of the FMA3 instructions is currently marked > commutable (see X86InstrFMA.td). Is that safe? According to the ISA > the FMA3 instructions aren't commutable for non-numeric results, so > I'd have thought commuting this would only be valid in fast-math mode? > > For the curious, the reason that I'm asking is that we currently > always select the 213 variant, but this introduces extra copies in > accumulator-style loops. Something like: > > while (...) > accumulator = x * y + accu...

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 23

...> Hi all, >>> >>> The 213 variant of the FMA3 instructions is currently marked >>> commutable (see X86InstrFMA.td). Is that safe? According to the ISA >>> the FMA3 instructions aren't commutable for non-numeric results, so >>> I'd have thought commuting this would only be valid in fast-math mode? >>> >>> For the curious, the reason that I'm asking is that we currently >>> always select the 213 variant, but this introduces extra copies in >>> accumulator-style loops. Something like: >>> >>>...

Change Bit Value To Text When Viewed

2006 Feb 27

Hey All, very simple question for you folks ;) I have a @commutes object in which each record has a bit value (0/1) that denotes whether an accident occurred. What I want to do is display "Yes" or "No" when I view the listing of commutes. I can get it to display the bit value just fine with <%= commute.accident %> but how can I get it so that when a record with

[LLVMdev] Unnecessary moves after sign-extension in 2-address target

2009 Apr 21

Dan Gohman wrote: > On Apr 19, 2009, at 6:15 PM, Greg McGary wrote: > >> Because sextb_r and sextw_r have destination tied to source operands, >> TwoAddressInstructionPass thinks it needs a copy. However, since the >> sext kills its source, the copy is unnecessary. Why does this happen? >> Is TwoAddressInstructionPass relying on a later pass to notice this

[X86] FMA transformation restrictions

2016 Sep 12

I noticed that the operand commuting code in X86InstrInfo.cpp treats scalar FMA intrinsics specially. It prevents operand commuting on these scalar instructions because the scalar FMA instructions preserve the upper bits of the vector. Presumably, the restrictions are there because commuting operands potentially changes the result u...
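
The concern about preserved upper bits can be sketched with a toy lane model. This is a rough illustration, not the actual X86 semantics: the function name is hypothetical, the choice that the upper lanes pass through from the first operand is an assumption made for the example, and the arithmetic here is not fused.

```python
def scalar_fma_model(a, b, c):
    """Toy model of a scalar FMA on vector registers (lists of lanes).

    Lane 0 receives a[0] * b[0] + c[0]; all other lanes are passed through
    unchanged from the first operand `a` (an assumption for illustration).
    """
    return [a[0] * b[0] + c[0]] + list(a[1:])
```

With `a = [2.0, 10.0, 11.0, 12.0]`, `b = [3.0, 20.0, 21.0, 22.0]` and `c = [1.0, 30.0, 31.0, 32.0]`, swapping `a` and `b` leaves lane 0 unchanged but changes which upper lanes appear in the result, which is why commuting the multiplicands is not automatically safe in such a model.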

How to tell LLVM to treat Commutable library calls as such, for example multiplication?

2019 Jun 11

A few library calls are commutable by definition, for example multiplications. I defined them as LibCalls for my architecture. However, I found that arguments are always passed in the order they are generated by Clang thus missing possible optimisations. For example, the following IR code ; Function Attrs: minsize norecurse nounwind optsize readnone define dso_local i16 @multTest(i16 %a, i16

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 26

...%a, %tmp -- >> > bic r0, r0, r1 >> >> But this doesn't: >> >> %tmp = xor i32 %b, 4294967295 ; %tmp1 = and i32 %tmp, %a -- >> > eor r1, r1, #4294967295 ; and r0, r1, r0 >> >> On the surface it seems that the selector is not commuting the AND >> operands. I've attached the complete test files. I can take a look >> but I need a pointer to get started. > > No, isel is trying to commute the AND. See ARMGenDAGISel.inc (auto- > generated by tablegen): > > // Pattern: (and:i32 GPR:i32:$lhs, (xor...

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 26

...4294967295 ; %tmp1 = and i32 %a, %tmp -- > > bic r0, r0, r1 > > But this doesn't: > > %tmp = xor i32 %b, 4294967295 ; %tmp1 = and i32 %tmp, %a -- > > eor r1, r1, #4294967295 ; and r0, r1, r0 > > On the surface it seems that the selector is not commuting the AND > operands. I've attached the complete test files. I can take a look > but I need a pointer to get started. No, isel is trying to commute the AND. See ARMGenDAGISel.inc (auto- generated by tablegen): // Pattern: (and:i32 GPR:i32:$lhs, (xor:i32 t2_so_reg:i32:$rhs, (im...

[LLVMdev] bitwise AND selector node not commutative?

2009 Jun 25

...r0, r0, r1 %tmp = xor i32 %b, 4294967295 ; %tmp1 = and i32 %a, %tmp -- > bic r0, r0, r1 But this doesn't: %tmp = xor i32 %b, 4294967295 ; %tmp1 = and i32 %tmp, %a -- > eor r1, r1, #4294967295 ; and r0, r1, r0 On the surface it seems that the selector is not commuting the AND operands. I've attached the complete test files. I can take a look but I need a pointer to get started. David
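
The two IR sequences in this thread compute the same value, so both could in principle map to a single bit-clear. A minimal sketch of that equivalence, assuming 32-bit unsigned values; the function names are made up for the example and do not correspond to anything in LLVM:

```python
MASK = 0xFFFFFFFF  # model i32 values as unsigned 32-bit integers

def and_not_rhs(a, b):
    """%tmp = xor i32 %b, -1 ; %tmp1 = and i32 %a, %tmp"""
    tmp = b ^ MASK
    return a & tmp

def and_not_lhs(a, b):
    """%tmp = xor i32 %b, -1 ; %tmp1 = and i32 %tmp, %a (operands swapped)"""
    tmp = b ^ MASK
    return tmp & a

def bic(a, b):
    """Bit clear: a with every bit that is set in b cleared."""
    return a & ~b & MASK
```

Since AND is commutative, `and_not_rhs`, `and_not_lhs` and `bic` agree for every input; the question in the thread is only whether instruction selection recognises the swapped form.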

[LLVMdev] "icmp eq", "icmp ne" not commuting operands on ARM

2009 Jun 26

NE and EQ comparisons should be able to commute their operands. But, for ARM at least, this does not seem to be happening. The first sequence below generates CMN (compare negated) but the second does not (complete test attached). These seem to map to ARMcmpNZ. Where would I look to see if that is marked commutative? %nb = sub i32 0, %b %tmp = icmp ne i32 %a, %nb %nb = sub
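
The reason a compare-negated instruction can serve either operand order is that, for equality tests, comparing `a` against `-b` is the same as testing whether `a + b` is zero modulo 2^32. A small sketch of that reasoning, assuming 32-bit wraparound arithmetic; the helper names are illustrative only:

```python
BITS = 32
MASK = (1 << BITS) - 1

def neg(b):
    """%nb = sub i32 0, %b, with 32-bit wraparound."""
    return (-b) & MASK

def ne_direct(a, b):
    """icmp ne i32 %a, %nb"""
    return a != neg(b)

def ne_commuted(a, b):
    """icmp ne i32 %nb, %a (operands swapped)"""
    return neg(b) != a

def ne_via_add(a, b):
    """What a compare-negated computes for NE: is a + b nonzero mod 2**32?"""
    return ((a + b) & MASK) != 0
```

All three functions return the same answer for any pair of 32-bit inputs, so the operand order of the `icmp` should not matter for EQ and NE.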

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Ivan, Sorry, no, I wasn't clear enough. Both "op dst_reg,immediate,src_reg" and "op dst_reg,src_reg,immediate" are allowed in the ALU ops. For most instructions these are two different things - e.g. sub a,5,b is different from sub a,b,5 obviously - but for things like add they just define the same thing. My problem is that LLVM won't allow immediates on the LHS of

[LLVMdev] Efficient Pattern matching in Instruction Combine

2014 Aug 07

Hi, All, Duncan, Rafael, David, Nick. This is regarding pattern matching in the InstructionCombine pass. We use the 'match' functions many times, but they don't do the pattern matching effectively. E.g., let's take the patterns: (A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C Both of the patterns above are the same, since ^ is commutative in Op0. But,
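
The identity behind these two patterns can be checked by brute force. The sketch below is only a sanity check of the algebra, not of how InstCombine matches it; the function names are made up for the example.

```python
from itertools import product

def lhs(a, b, c):
    """(A ^ B) | ((B ^ C) ^ A)"""
    return (a ^ b) | ((b ^ c) ^ a)

def lhs_commuted(a, b, c):
    """(B ^ A) | ((B ^ C) ^ A), i.e. Op0 with its xor operands swapped"""
    return (b ^ a) | ((b ^ c) ^ a)

def rhs(a, b, c):
    """(A ^ B) | C"""
    return (a ^ b) | c

def identity_holds(bits=3):
    """Exhaustively compare all three forms over every `bits`-wide input."""
    values = range(1 << bits)
    return all(
        lhs(a, b, c) == lhs_commuted(a, b, c) == rhs(a, b, c)
        for a, b, c in product(values, repeat=3)
    )
```

Because every operation is bitwise, checking a few bits exhaustively is enough to cover wider integers: writing D = A ^ B, the left side is D | (D ^ C), which equals D | C bit by bit.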

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Dear all, I was wondering if it is possible in TableGen to either: 1. Selectively define an instruction depending on an SDNode's properties, e.g. if the SDNode is not commutative. 2. Override/ignore the TableGen error given when a commutative node has an immediate on the LHS. My case comes from trying to define a generic ALU operation multiclass for my target, which includes a

[LLVMdev] Intrinsic's "Commutative" property

2012 Jul 24

Hi, What does it mean when "Commutative" property is applied to an intrinsic with more than two arguments? For example, __builtin_ia32_dppd has this property. Thanks. -- Simon

[LLVMdev] Help me improve two-address code

2009 Apr 16

Evan Cheng wrote: > On Apr 16, 2009, at 3:17 PM, Greg McGary wrote: > >> Is there some optimizer knob I'm not turning properly? In more complex >> cases, GCC does poorly with two-address operand choices and so bloats >> the code with unnecessary register moves. I have high hopes LLVM >> can do better, so this result for a simple case is bothersome. >>

[LLVMdev] Unnecessary moves after sign-extension in 2-address target

2009 Apr 22

On Apr 21, 2009, at 4:02 PM, Greg McGary wrote: > Dan Gohman wrote: >> On Apr 19, 2009, at 6:15 PM, Greg McGary wrote: >> >>> Because sextb_r and sextw_r have destination tied to source >>> operands, >>> TwoAddressInstructionPass thinks it needs a copy. However, since >>> the >>> sext kills its source, the copy is unnecessary. Why

[LLVMdev] Lost instcombine opportunity: "or"s of "icmp"s (commutability)

2008 Oct 08

instcombine can handle certain orders of "icmp"s that are "or"ed together: x != 5 OR x > 10 OR x == 8 becomes.. x != 5 OR x == 8 becomes.. x != 5 However, a different ordering prevents the simplification: x == 8 OR x > 10 OR x != 5 becomes.. %or.eq8.gt10 OR x != 5 and that can't be simplified because we now have an "or" OR "icmp". What would I
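
Both orderings in this snippet describe the same predicate, which a quick brute-force check makes concrete. This is a sketch of the logic only; the function names are hypothetical and nothing here reflects how instcombine is implemented.

```python
def original(x):
    """x != 5 OR x > 10 OR x == 8"""
    return x != 5 or x > 10 or x == 8

def reordered(x):
    """x == 8 OR x > 10 OR x != 5 (the ordering that is not simplified)"""
    return x == 8 or x > 10 or x != 5

def simplified(x):
    """The fully simplified form: x != 5"""
    return x != 5

def all_agree(values):
    """True if all three predicates give the same answer on every value."""
    return all(original(x) == reordered(x) == simplified(x) for x in values)
```

Both `x > 10` and `x == 8` imply `x != 5`, so either ordering reduces to `x != 5`; only the grouping of the intermediate "or" differs.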

[LLVMdev] [PATCH] Lost instcombine opportunity: "or"s of "icmp"s (commutability)

2008 Oct 08

Here's an initial stab, but I'm not too happy about temporarily adding a new instruction and then removing it, because returning it will have it added back in to replace other uses. I also added a couple of test cases that pass with the new InstructionCombining changes (the old code only passes one of the added tests). Also, this change exposes some simplification for
