thr3ads.net - llvm dev - [llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication? [Jun 2019]

If this information is useful, please help other people find it:
Share via:

Joan Lluch via llvm-dev

2019-Jun-11 10:02 UTC

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

A few library calls are commutable by definition, for example multiplications. 

I defined them as LibCalls for my architecture. However, I found that arguments
are always passed in the order they are generated by Clang thus missing possible
optimisations. For example, the following IR code

; Function Attrs: minsize norecurse nounwind optsize readnone
define dso_local i16 @multTest(i16 %a, i16 %b) local_unnamed_addr #0 {
entry:
  %mul = mul i16 %b, %a
  ret i16 %mul
}

is lowered as it is to a  __mulhi3  library call with the arguments reversed as
shown (b,a). This is suboptimal because later on this requires an intermediate
register to swap them.

SelectionDAG has 9 nodes:
  t0: ch = EntryToken
      t4: i16,ch = CopyFromReg t0, Register:i16 %1
      t2: i16,ch = CopyFromReg t0, Register:i16 %0
    t5: i16 = mul t4, t2
  t7: ch,glue = CopyToReg t0, Register:i16 $r0, t5
  t8: ch = CPU74ISD::RET_FLAG t7, Register:i16 $r0, t7:1

 Given the commutative nature of multiplication, the arguments could be passed
in the same order they are received (a,b) so that they would not need to be
swapped before the __mulhi3 call

 What am I missing?, 

Thanks,

John Lluch.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190611/578ae2da/attachment.html>

Tim Northover via llvm-dev

2019-Jun-11 11:57 UTC

head link

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

Hi Joan,

On Tue, 11 Jun 2019 at 11:03, Joan Lluch via llvm-dev
<llvm-dev at lists.llvm.org> wrote:> I defined them as LibCalls for my architecture. However, I found that
arguments are always passed in the order they are generated by Clang thus
missing possible optimisations. For example, the following IR code
Have you tried to assess the potential benefit over a larger codebase?
It looks like it could be something that comes up in trivial tests,
but disappears in larger functions where the register allocator has
more freedom.
>  Given the commutative nature of multiplication, the arguments could be
passed in the same order they are received (a,b) so that they would not need to
be swapped before the __mulhi3 call
It would be pretty ugly to implement during instruction selection,
given how call lowering works in LLVM. The call's registers (if a call
even puts its arguments in registers) aren't decided until you get
into LowerCall in target-specific code. You'd probably need some kind
of target-hook to answer whether it was worthwhile.

If I had my heart set on this optimization, I'd probably implement it
as a MachineIR pass after register allocation that searches upwards
from any call to __mulhi3 etc and deletes compatible COPYs.

Cheers.

Tim.

Finkel, Hal J. via llvm-dev

2019-Jun-11 12:07 UTC

head link

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

On 6/11/19 5:02 AM, Joan Lluch via llvm-dev wrote:
A few library calls are commutable by definition, for example multiplications.

I defined them as LibCalls for my architecture. However, I found that arguments
are always passed in the order they are generated by Clang thus missing possible
optimisations. For example, the following IR code

; Function Attrs: minsize norecurse nounwind optsize readnone
define dso_local i16 @multTest(i16 %a, i16 %b) local_unnamed_addr #0 {
entry:
  %mul = mul i16 %b, %a
  ret i16 %mul
}

is lowered as it is to a  __mulhi3  library call with the arguments reversed as
shown (b,a). This is suboptimal because later on this requires an intermediate
register to swap them.

SelectionDAG has 9 nodes:
  t0: ch = EntryToken
      t4: i16,ch = CopyFromReg t0, Register:i16 %1
      t2: i16,ch = CopyFromReg t0, Register:i16 %0
    t5: i16 = mul t4, t2
  t7: ch,glue = CopyToReg t0, Register:i16 $r0, t5
  t8: ch = CPU74ISD::RET_FLAG t7, Register:i16 $r0, t7:1

 Given the commutative nature of multiplication, the arguments could be passed
in the same order they are received (a,b) so that they would not need to be
swapped before the __mulhi3 call

 What am I missing?,


I don't think that you're missing anything, but I don't think that
we have anything that currently allows for this in a generic way. You should be
able to account for this in your target's call lowering for functions that
you happen to know to be commutative, at least in cases where it's possible
to recognize the source of the arguments as coming from incoming function
arguments (this seems similar to how tail-call lowering works).

We do have the ability to mark instructions is commutative and the register
coalescer can use that capability. The problem is that, by the time we reach
that stage, the call arguments have already (likely) been lowered into copies to
physical registers.

 -Hal

Thanks,

John Lluch.



_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190611/b5997586/attachment.html>

Joan Lluch via llvm-dev

2019-Jun-12 18:40 UTC

head link

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

Hi Finkel and Tim

Thanks for your responses. 

About this topic, and partly as a matter of experimentation, I have implemented
a function that gets called for ‘commutable’ libcalls just at the beginning of
my LowerCall implementation. I conveniently called it
‘AnalizeCommutableLibCall’. This function performs a recursive search up the
operand tree starting from the libcall operands until it finds a ‘CopyFromReg’
or another LibCall operation action. The function determines the register number
encountered at the end of the search, either physical or virtual, for each of
the two commutable operands. In case the first operand got a bigger register
number than the second operand, then the operands are swapped. This apparently
simple thing seems to work in all cases because, ultimately, physical register
assignations go in ascending number for formal arguments and function returns.
By placing the (commutable) operands in ascending register numbers, even if
these register numbers were found way up on the operand tree, the compiler never
needs to perform any explicit or implicit register swaps, so this often results
in less register overhead and shorter code. Much like if the compiler already
knew that the call operands were ‘commutable’.

For larger codebases, certainly the compiler has some extra opportunities to
swap registers through intermediate instruction results, but the problem remains
that registers or intermediate results must still be swapped at some time if
Clang dictated so. This circumstance repeats again at every new libcall
instruction, regardless of the length of the function, so even if some calls may
not get affected, most apparently are.

For now I’m leaving this (initially experimental) function on my target
implementation, I can post the code in case there’s some interest, as I found
that it systematically produces shorter code and it can't produce any side
effects (all what it ultimately does is swapping commutative arguments).

John.

Seemingly Similar Threads

Search for more seemingly similar threads

llvm dev - Jun 2019 - How to tell LLVM to treat Commutable library calls as such, for example multiplication?

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

[llvm-dev] How to tell LLVM to treat Commutable library calls as such, for example multiplication?

Seemingly Similar Threads