thr3ads.net - llvm dev - [llvm-dev] TwoAddressInstructionPass::isProfitableToConv3Addr() [Sep 2015]

Matthias Braun via llvm-dev

2015-Sep-29 23:15 UTC

[llvm-dev] TwoAddressInstructionPass::isProfitableToConv3Addr()

A similar setting occurs with ARM Thumb code, which for many instructions has a
short 2-address encoding and a longer 3-address form. As far as I know this is
handled by selecting the 3-address form and rewriting it to the 2-address form
after register allocation where possible. See
lib/Target/ARM/Thumb2SizeReduction.cpp.
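
To make the late rewrite concrete, here is a toy sketch in Python. It is not the code in Thumb2SizeReduction.cpp; the names Inst, COMMUTATIVE and shrink_after_regalloc are hypothetical, and the only rule modelled is "the short form is usable when the allocated destination register equals a source register".

```python
from dataclasses import dataclass

# Assumption for this sketch: these opcodes may swap their two sources.
COMMUTATIVE = {"add", "and", "orr", "eor"}


@dataclass
class Inst:
    op: str
    dst: str
    src1: str
    src2: str
    narrow: bool = False  # True once marked for the short 2-address form


def shrink_after_regalloc(insts):
    """Mark instructions whose destination register equals a source register."""
    for inst in insts:
        if inst.dst == inst.src1:
            inst.narrow = True
        elif inst.dst == inst.src2 and inst.op in COMMUTATIVE:
            # Commute so that the operand sharing the register comes first.
            inst.src1, inst.src2 = inst.src2, inst.src1
            inst.narrow = True
    return insts


code = [
    Inst("add", "r0", "r0", "r1"),  # dst == src1: can shrink
    Inst("sub", "r2", "r3", "r2"),  # dst == src2, but sub is not commutative
    Inst("add", "r4", "r5", "r4"),  # commutable: shrink after swapping
    Inst("add", "r6", "r7", "r8"),  # regalloc picked a fresh register
]
shrink_after_regalloc(code)
print([inst.narrow for inst in code])
```

The last instruction shows the limitation: nothing can be shrunk unless the allocator happened to reuse a source register.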

- Matthias
> On Sep 29, 2015, at 2:22 PM, Quentin Colombet via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Hi Jonas,
>
>> On Sep 29, 2015, at 2:00 AM, Jonas Paulsson via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> I have cases of instruction pairs, where one is cheaper 2-address, and
>> the other 3-address. I would like to select the 2-addr instruction during
>> isel, but use the 3-addr instruction to avoid a copy if possible. I find
>> that TwoAddressInstructionPass::isProfitableToConv3Addr() is only checking
>> for the case of a physreg copy, and so leaves the majority of cases as
>> they are (2-address).
>>
>> I would like to say "If 3-addr version would avoid a copy, use it!".
>> Does anyone else have a similar situation?
>
> I think this is what it is supposed to do right now :). Though I reckon the
> test is probably over conservative in the sense that it returns true only if
> it can prove this is going to save a copy.
> 
>> 
>> To do this, one would need to check the kill-flag on the tied use
>> operand. If it is not killed, one can assume that the use and dst registers
>> overlap, and therefore the copy is needed for the two-address form. The
>> kill flags would however need to be recomputed by TwoAddr pass, since
>> LiveVariables clear them.
>>
>> An alternative approach might be to have something like
>> TII->handleMachineFunctionPostCoalescer() at the end of
>> RegisterCoalescer.cpp::runOnMachineFunction(). There, one could look for
>> instructions and query live intervals for overlap. This hook might also be
>> useful for other things, since this is the point just before
>> mi-sched/regalloc, where one could do things like estimate register
>> pressure.
>>
>> Any comments on this anyone?
>
> We could try to fix the check in two-address pass first. I believe a hook
> like you describe might be useful but this is yet another thing to teach the
> coalescer, which is already complex enough IMO. Moreover, I like the
> separation of concerns that 2- and 3-addr conversions are made within a
> dedicated pass.
> That being said, if getting the best code involves teaching the coalescer
> about this transformation, sure!
> 
> Cheers,
> Q.
> 
>> 
>> /Jonas Paulsson
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

Jonas Paulsson via llvm-dev

2015-Sep-30 09:15 UTC

[llvm-dev] TwoAddressInstructionPass::isProfitableToConv3Addr()

On 2015-09-30 01:15, Matthias Braun wrote:
> A similar setting occurs with ARM Thumb code which for many instructions
> has a short 2-address encoding and a longer 3 address form. As far as I know
> this is done by selecting the 3 address form and rewriting them to 2-address
> after register allocation where possible. See
> lib/Target/ARM/Thumb2SizeReduction.cpp.
>
> - Matthias

The late opportunistic conversion is simple, but it can only work in cases
where regalloc happens to put the new definition in the same register as
one of the source operands. If regalloc tries to do this as much as
possible, it might work most of the time, but I have no idea whether that
is the case. Some targets may want round-robin allocation, while others
would prefer reuse of registers. Steve says he gets most of it handled; how
about ARM Thumb? Is this with RAGreedy?

>> On Sep 29, 2015, at 2:22 PM, Quentin Colombet via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi Jonas,
>>
>>> On Sep 29, 2015, at 2:00 AM, Jonas Paulsson via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hi,
>>>
>>> I have cases of instruction pairs, where one is cheaper 2-address,
>>> and the other 3-address. I would like to select the 2-addr instruction
>>> during isel, but use the 3-addr instruction to avoid a copy if possible.
>>> I find that TwoAddressInstructionPass::isProfitableToConv3Addr() is only
>>> checking for the case of a physreg copy, and so leaves the majority of
>>> cases as they are (2-address).
>>>
>>> I would like to say "If 3-addr version would avoid a copy, use it!".
>>> Does anyone else have a similar situation?
>> I think this is what it is supposed to do right now :). Though I reckon
>> the test is probably over conservative in the sense that it returns true
>> only if it can prove this is going to save a copy.

Yes, it is very much overconservative, because it only checks the cases
of phys-reg copies around calls and returns, meaning that all other
cases are never transformed. I believe it should ask the target to convert
to 3-address in all cases where no source register is killed.
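
As a rough illustration of the rule I have in mind, here is a toy Python model. It is not LLVM code: tied_use_is_killed and prefer_three_address are made-up names, the block is a plain list of (dst, srcs) pairs over virtual registers, the tied operand is assumed to be srcs[0], and commutable operands are ignored.

```python
def tied_use_is_killed(block, index, live_out=frozenset()):
    """Return True if the tied source of block[index] has no later reader.

    Each entry of `block` is (dst, srcs); by convention in this sketch the
    tied operand is srcs[0]. Registers are virtual and defined only once.
    """
    tied = block[index][1][0]
    for _dst, srcs in block[index + 1:]:
        if tied in srcs:
            return False         # the value is still needed later
    return tied not in live_out  # or it is needed by a successor block


def prefer_three_address(block, index, live_out=frozenset()):
    # If the tied source stays live, the 2-address form needs a copy,
    # so this sketch assumes the 3-address form is the better choice.
    return not tied_use_is_killed(block, index, live_out)


block = [
    ("v2", ["v0", "v1"]),  # v0 is read again below: not killed here
    ("v3", ["v0", "v2"]),  # last reader of v0 in this block
]
print(prefer_three_address(block, 0))
print(prefer_three_address(block, 1))
print(prefer_three_address(block, 1, live_out={"v0"}))
```

The scan over later instructions stands in for the kill flag, which the real pass would have to get from recomputed liveness information.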

PPCVSXFMAMutate is indeed doing something towards the same goal. I
wonder if this pass could be removed or simplified if TwoAddress were
aware of kill flags and eliminated more copies?

>>
>>> To do this, one would need to check the kill-flag on the tied use
>>> operand. If it is not killed, one can assume that the use and dst
>>> registers overlap, and therefore the copy is needed for the two-address
>>> form. The kill flags would however need to be recomputed by TwoAddr pass,
>>> since LiveVariables clear them.
>>>
>>> An alternative approach might be to have something like
>>> TII->handleMachineFunctionPostCoalescer() at the end of
>>> RegisterCoalescer.cpp::runOnMachineFunction(). There, one could look for
>>> instructions and query live intervals for overlap. This hook might also
>>> be useful for other things, since this is the point just before
>>> mi-sched/regalloc, where one could do things like estimate register
>>> pressure.
>>>
>>> Any comments on this anyone?
>> We could try to fix the check in two-address pass first. I believe a
>> hook like you describe might be useful but this is yet another thing to
>> teach the coalescer, which is already complex enough IMO. Moreover, I like
>> the separation of concerns that 2- and 3-addr conversions are made within
>> a dedicated pass.
>> That being said, if getting the best code involves teaching the
>> coalescer about this transformation, sure!
>>
>> Cheers,
>> Q.
>>
>>> /Jonas Paulsson
>>>
/ Jonas

Hal Finkel via llvm-dev

2015-Oct-05 17:16 UTC

[llvm-dev] TwoAddressInstructionPass::isProfitableToConv3Addr()

----- Original Message -----
> From: "Jonas Paulsson" <paulsson at linux.vnet.ibm.com>
> To: "Matthias Braun" <mbraun at apple.com>, "Quentin Colombet"
> <qcolombet at apple.com>, "Steve King" <steve at metrokings.com>,
> hfinkel at anl.gov
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, September 30, 2015 4:15:25 AM
> Subject: Re: [llvm-dev] TwoAddressInstructionPass::isProfitableToConv3Addr()
> 
> On 2015-09-30 01:15, Matthias Braun wrote:
> 
> 
> A similar setting occurs with ARM Thumb code which for many
> instructions has a short 2-address encoding and a longer 3 address
> form. As far as I know this is done by selecting the 3 address form
> and rewriting them to 2-address after register allocation where
> possible. See lib/Target/ARM/Thumb2SizeReduction.cpp.
> 
> - Matthias
> 
> The late opportunistic conversion is simple, but can only work in
> cases where regalloc happens to put the new definition in the same
> register as one of the source operands. In case regalloc would try
> to do this as much as possible, this might work most of the time,
> however I have no idea if this is the case. Some targets may want
> round-robin allocation, while others would prefer reuse of
> registers. Steve says he gets most of it handled, how about ARM
> Thumb? Is this with RAGreedy?
> 
> 
> 
> 
> On Sep 29, 2015, at 2:22 PM, Quentin Colombet via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> Hi Jonas,
> 
> On Sep 29, 2015, at 2:00 AM, Jonas Paulsson via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> Hi,
> 
> I have cases of instruction pairs, where one is cheaper 2-address,
> and the other 3-address. I would like to select the 2-addr
> instruction during isel, but use the 3-addr instruction to avoid a
> copy if possible. I find that
> TwoAddressInstructionPass::isProfitableToConv3Addr() is only
> checking
> for the case of a physreg copy, and so leaves the majority of cases
> as they are (2-address).
> 
> I would like to say "If 3-addr version would avoid a copy, use it!".
> Does anyone else have a similar situation?
> 
> I think this is what it is supposed to do right now :). Though I reckon
> the test is probably over conservative in the sense that it returns true
> only if it can prove this is going to save a copy.
> 
> Yes, it is very much overconservative because it only checks the
> cases of phys-reg copies around calls and returns, meaning that all
> other cases are never transformed. I believe it should ask target to
> convert to 3-address in all cases no source register is killed.
> 
> PPCVSXFMAMutate is indeed doing something towards the same goal. I
> wonder if this pass could be removed/simplified if TwoAddress would
> be aware of kill flags and eliminate more copies?

PPCVSXFMAMutate runs in between MI scheduling and register allocation. In this
way, it only eliminates copies that actually remain after scheduling. TwoAddress
runs prior to MI scheduling.

If I have:

c1 = c
x = a*b + c <tied>
y = d*e + c1 <tied>

I might mutate the first instruction so that I have:

x = a*(b <tied>) + c
y = d*e + c <tied>

to eliminate the copy. However, in order to hide latency, the scheduler might
prefer to flip the two FMA instructions, which it now cannot do, if I've
mutated first, without re-introducing the copy I was trying to avoid.
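
The ordering constraint can be checked with a small toy model in Python. This is only a sketch under my own simplifications: order_is_legal and the tuple layout (name, reads, clobbered) are hypothetical, each FMA is assumed to overwrite its tied operand, and b is assumed dead after the first FMA.

```python
def order_is_legal(order):
    """Check that no instruction destroys a value a later one still reads.

    Each item is (name, reads, clobbered), where `clobbered` is the tied
    operand that the instruction overwrites (None for a plain copy).
    """
    for i, (_name, _reads, clobbered) in enumerate(order):
        for _later, reads, _c in order[i + 1:]:
            if clobbered in reads:
                return False  # a later instruction would read a dead value
    return True


# Before mutation: the copy c1 = c keeps both FMAs independent.
copy = ("c1", {"c"}, None)
x_orig = ("x", {"a", "b", "c"}, "c")     # x = a*b + c <tied>
y_orig = ("y", {"d", "e", "c1"}, "c1")   # y = d*e + c1 <tied>

# After mutation: no copy, but y now destroys the c that x reads.
x_mut = ("x", {"a", "b", "c"}, "b")      # x = a*(b <tied>) + c
y_mut = ("y", {"d", "e", "c"}, "c")      # y = d*e + c <tied>

print(order_is_legal([copy, x_orig, y_orig]))  # either order is fine
print(order_is_legal([copy, y_orig, x_orig]))
print(order_is_legal([x_mut, y_mut]))          # only this order survives
print(order_is_legal([y_mut, x_mut]))
```

Only the last ordering is rejected, which is the flip the scheduler might have wanted.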

 -Hal
> 
> 
> To do this, one would need to check the kill-flag on the tied use
> operand. If it is not killed, one can assume that the use and dst
> registers overlap, and therefore the copy is needed for the
> two-address form. The kill flags would however need to be recomputed
> by TwoAddr pass, since
> LiveVariables clear them.
> 
> An alternative approach might be to have something like
> TII->handleMachineFunctionPostCoalescer() at the end of
> RegisterCoalescer.cpp::runOnMachineFunction(). There, one could look
> for instructions and query live intervals for overlap. This hook
> might also be useful for other things, since this is the point just
> before mi-sched/regalloc, where one could do things like estimate
> register pressure.
> 
> Any comments on this anyone?
> 
> We could try to fix the check in two-address pass first. I believe a
> hook like you describe might be useful but this is yet another thing
> to teach the coalescer, which is already complex enough IMO. Moreover,
> I like the separation of concerns that 2- and 3-addr conversions are
> made within a dedicated pass.
> That being said, if getting the best code involves teaching the
> coalescer about this transformation, sure!
> 
> Cheers,
> Q.
> 
> 
> 
> 
> 
> /Jonas Paulsson
> 
> / Jonas
> 
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
