thr3ads.net - llvm dev - [llvm-dev] GlobalISel round table follow up: register bank select [Oct 2020]

If this information is useful, please help other people find it:
Share via:

Dominik Montada via llvm-dev

2020-Oct-09 07:07 UTC

[llvm-dev] GlobalISel round table follow up: register bank select

Hi Quentin,

Am 08.10.20 um 21:17 schrieb Quentin Colombet:> Hi Dominik,
>
>> On Oct 8, 2020, at 5:03 AM, Dominik Montada 
>> <dominik.montada at hightec-rt.com 
>> <mailto:dominik.montada at hightec-rt.com>> wrote:
>>
>> Hi Quentin,
>>
>> thanks for picking up the conversation!
>>
>> > I think we should step back and check what we want before
investing
>> any time in some rewrite.
>>
>> That is a very fair point and I might have been getting ahead of 
>> myself in my last email.
>> What I would like to see from RegBankSelect is to produce the mapping 
>> with the overall lowest cost. Keeping track of all different 
>> combinations of mappings will certainly be non-trivial however, so I 
>> wonder if there is a smart way to do this without spending too much 
>> compilation time.
>>
>> Ideally for instructions with no operands (like G_CONSTANT) it could 
>> also check whether a cross-bank copy is actually worth it or if it 
>> would be more beneficial to simply rematerialize the instruction on 
>> the required bank. For such instructions this information should 
>> already be available as part of the cost-modelling in RegBankSelect: 
>> we could simply compare the cost of a mapping on the required bank 
>> vs. the cost of a cross-bank copy.
>>
>> Would you see this as a valid direction for RegBankSelect?
>>
> Yes, I think this is a valid direction. I actually think we shouldn’t 
> restrict ourselves to instructions with no operands. We could always 
> duplicate next to the original instruction, i.e., we wouldn’t have any 
> issue materializing the arguments.Sure, I guess that would also work as long we don't have to introduce 
copies for the operands. Theoretically this could even try to duplicate 
the operands if overall it's still cheaper than a cross-bank copy but 
that is probably some pandoras box just waiting to be opened
:)> The one thing that needs work is the cost model.
What would you say does the cost model need in its current state?

By the way, does this have anything to do with what Matt was talking 
about during the round table? Unfortunately I don't remember exactly 
what his issues with RegBankSelect were but I'd be interested to know 
whether what we talked about would also benefit him.

I do remember that AMDGPU is apparently doing something very different 
compared to AArch64 for example and I'd be interested to know the 
reasoning behind the different approaches.

Cheers,

Dominik
>
> Cheers,
> -Quentin
>
>> Best regards,
>>
>> Dominik
>>
>>
>> Am 07.10.20 um 19:47 schrieb Quentin Colombet:
>>> Hi Dominik,
>>>
>>> Thanks for sending this!
>>>
>>>> On Oct 7, 2020, at 5:21 AM, Dominik Montada via llvm-dev 
>>>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> this is the second email for the round table follow-up, this
time
>>>> regarding the issues around the greedy RegBankSelect and 
>>>> alternative mappings.
>>>>
>>>> The issue I brought up was that because RegBankSelect goes 
>>>> top-down, it never looks at all available mappings for the
operands
>>>> when considering which of the mappings to apply to the current 
>>>> instruction. In our architecture we have one register bank 
>>>> dedicated to pointers and another one for anything else. We
often
>>>> see code where we have a G_PTR_ADD with a constant. Since the 
>>>> constant is not a pointer, we put it on the other register
bank. We
>>>> could put it on the address regbank and do provide alternative 
>>>> mappings for that, but since the greedy algorithm doesn't
actually
>>>> check the usage of the constant, it is always put on the other
bank.
>>>
>>> The intent behind the greedy algorithm was for the mapping to look 
>>> at X instructions ahead, depending on the optimization level, when 
>>> assigning one instruction.
>>> Right now the window is simply 1 instruction and the code is not 
>>> structured to allow to have more than one.
>>>
>>> As we gain more insights on what we want RegBankSelect to do, it 
>>> makes sense to redesign it.
>>>
>>>>
>>>> When RegBankSelect then sees the G_PTR_ADD and sees that one of
its
>>>> inputs is on the other register bank already, it then inserts a
>>>> costly cross-bank copy instead of checking if that operand has
any
>>>> alternative mappings which would make the overall mapping for
the
>>>> current instruction cheaper.
>>>
>>> The idea is when a mapping is done, that means it was the best one 
>>> at the time of the decision (greedy), thus we don’t challenge that.
>>> “Reverting” an already set mapping is not that simple because yes 
>>> for this particular use we are inserting costly cross copies, but 
>>> there are no guarantees that replacing the mapping of the
definition
>>> will not insert even more costly copies for the other uses.
>>>
>>> E.g., consider:
>>> ```
>>> A = def <— RBS starts here
>>> = useFP A
>>> = useInt A
>>> = useInt A
>>> ```
>>>
>>> Let’s assume that greedy works the way it is intended. I.e., it 
>>> assigns A to the int register bank because there are 2 such uses
vs.
>>> only 1 fp bank use:
>>> ```
>>> A<int> = def
>>> = useFP A<int> <— Next, RBS looks at this one
>>> = useInt A<int>
>>> = useInt A<int>
>>> ```
>>>
>>> Now the regbank for useFP is not right and has to be repaired.
Right
>>> now, we will insert the costly cross copy for that use:
>>> ```
>>> A<int> = def
>>> AFP<fp> = cross_copy A<int>
>>> = useFP AFP<fp>
>>> = useInt A<int>
>>> = useInt A<int>
>>> ```
>>>
>>> Now, if we were to change the definition of A to avoid this copy we
>>> would create two costly copy for the useInt. Actually, another 
>>> question is what would we do when we look at the first useInt? Is 
>>> this use allowed to change the definition of the instruction again?
>>> Do we duplicate the definition? Etc..
>>> ```
>>> A<fp> = def <— reassign
>>> = useFP AFP<fp>
>>> AInt1<int> = cross_copy A<fp>
>>> = useInt AInt1<int>
>>> AInt2<int> = cross_copy A<fp>
>>> = useInt AInt2<int>
>>> ```
>>>
>>> Bottom line, if we allow to modify the assignments of the 
>>> definition, the decision making is not local anymore and in 
>>> particular may require to add repairing code all over the place. As
>>> a result the cost model becomes much more complicated.
>>>
>>>>
>>>> Matt suggested that RegBankSelect should probably go bottom-up 
>>>> instead and I agree with him. I don't think there is a
particular
>>>> reason why RegBankSelect necessarily has to go top-down.
>>>
>>> The rationale for going top-down is that when you reach an 
>>> instruction, all the operands are assigned so you know what would
be
>>> the cost of repairing.
>>> You could said that the problem is the same for definitions when 
>>> going bottom-up. However, this is likely to be more problematic 
>>> because usually you have fewer definitions than arguments on each 
>>> individual instruction, therefore there is more guess work going on
>>> (e.g., top-down, you assume a cost for 1 definition, going
bottom-up
>>> you have to assume a cost for 2 arguments or more precisely you 
>>> would have to track a window of X instructions for 2 arguments 
>>> instead of 1 definition.)
>>>
>>>>
>>>> I'm not too familiar with the implementation of
RegBankSelect.
>>>> Would it be a big effort to make it work bottom-up instead?
I'm
>>>> guessing one of the biggest areas would be the check whether a 
>>>> cross-bank copy is needed as well as calculating the overall
cost
>>>> for alternative mappings as now all usages of the current 
>>>> instruction would have to be checked instead of the much more 
>>>> limited operands. How big of an impact would this have?
>>>
>>> It’s been a while since I looked at the implementation but I would 
>>> expect this to be significant.
>>>
>>> I think we should step back and check what we want before investing
>>> any time in some rewrite. For instance, I don’t see what bottom-up 
>>> fundamentally gives us. It seems like a workaround to me.
>>> That said, it would work either way!
>>>
>>> Cheers,
>>> -Quentin
>>>
>>>>
>>>> Cheers,
>>>>
>>>> Dominik
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>> -- 
>> ----------------------------------------------------------------------
>> Dominik Montada                   Email:dominik.montada at
hightec-rt.com
>> HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
>> Europaallee 19                    Fax:   +49-681-92613-26
>> D-66113 Saarbrücken               WWW:http://www.hightec-rt.com
>>
>> Managing Director: Vera Strothmann
>> Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222
>>
>> This e-mail may contain confidential and/or privileged information. If
>> you are not the intended recipient please notify the sender immediately
>> and destroy this e-mail. Any unauthorised copying, disclosure or
>> distribution of the material in this e-mail is strictly forbidden.
>> ---
>-- 
----------------------------------------------------------------------
Dominik Montada                   Email: dominik.montada at hightec-rt.com
HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
Europaallee 19                    Fax:   +49-681-92613-26
D-66113 Saarbrücken               WWW: http://www.hightec-rt.com

Managing Director: Vera Strothmann
Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222

This e-mail may contain confidential and/or privileged information. If
you are not the intended recipient please notify the sender immediately
and destroy this e-mail. Any unauthorised copying, disclosure or
distribution of the material in this e-mail is strictly forbidden.
---
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201009/f52d0042/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 6822 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201009/f52d0042/attachment-0001.bin>

Quentin Colombet via llvm-dev

2020-Oct-09 17:15 UTC

head link

[llvm-dev] GlobalISel round table follow up: register bank select

> On Oct 9, 2020, at 12:07 AM, Dominik Montada <dominik.montada at
hightec-rt.com> wrote:
> 
> Hi Quentin,
> 
> Am 08.10.20 um 21:17 schrieb Quentin Colombet:
>> Hi Dominik,
>> 
>>> On Oct 8, 2020, at 5:03 AM, Dominik Montada <dominik.montada at
hightec-rt.com <mailto:dominik.montada at hightec-rt.com>> wrote:
>>> 
>>> Hi Quentin,
>>> 
>>> thanks for picking up the conversation!
>>> 
>>> > I think we should step back and check what we want before
investing any time in some rewrite.
>>> 
>>> That is a very fair point and I might have been getting ahead of
myself in my last email.
>>> What I would like to see from RegBankSelect is to produce the
mapping with the overall lowest cost. Keeping track of all different
combinations of mappings will certainly be non-trivial however, so I wonder if
there is a smart way to do this without spending too much compilation time.
>>> 
>>> Ideally for instructions with no operands (like G_CONSTANT) it
could also check whether a cross-bank copy is actually worth it or if it would
be                   more beneficial to simply rematerialize the instruction on
the required bank. For such instructions this information should already be
available as part of the cost-modelling in RegBankSelect: we could simply
compare the cost of a mapping on the required bank vs. the cost of a cross-bank
copy.
>>> 
>>> Would you see this as a valid direction for RegBankSelect?
>>> 
>> Yes, I think this is a valid direction. I actually think we shouldn’t
restrict ourselves to instructions with no operands. We could always duplicate
next to the original instruction, i.e., we wouldn’t have any issue materializing
the arguments.
> Sure, I guess that would also work as long we don't have to introduce
copies for the operands. Theoretically this could even try to duplicate the
operands if overall it's still cheaper than a cross-bank copy but that is
probably some pandoras box just waiting to be opened :)
Hehe!
Ideally we would compute the costs globally and do the code transformation only
once. But I am guessing this is wishful thinking.

>> The one thing that needs work is the cost model.
> What would you say does the cost model need in its current state?
> 
> 
I don’t remember how everything works so take it with a grain of salt.
I’d say we would need to take into account how much it would cost to duplicate
the definitions and compute the repairing cost with those duplications.
The thing is it may become compile time intensive pretty quickly if we want to
consider all the possibilities.

As a starter we could focus on the best mapping for the current instruction and
try to match the desired regbanks from each operand and only evaluate that cost
against just plain repairing.

In any case, I think it would require a non-trivial amount of
work.> By the way, does this have anything to do with what Matt was talking about
during the round table? Unfortunately I don't remember exactly what his
issues with RegBankSelect were but I'd be interested to know whether what we
talked about would also benefit him.
> 
> 
Partially. IIRC Matt’s problem was also how we apply the mapping and how, right
now, we have to abuse the observer mechanism to do what we want.
> I do remember that AMDGPU is apparently doing something very different
compared to AArch64 for example and I'd be interested to know the reasoning
behind the different approaches.
> 
> Cheers,
> 
> Dominik
> 
>> 
>> Cheers,
>> -Quentin
>> 
>>> Best regards,
>>> 
>>> Dominik
>>> 
>>> 
>>> Am 07.10.20 um 19:47 schrieb Quentin Colombet:
>>>> Hi Dominik,
>>>> 
>>>> Thanks for sending this!
>>>> 
>>>>> On Oct 7, 2020, at 5:21 AM, Dominik Montada via llvm-dev
<llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>
wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> this is the second email for the round table follow-up,
this time regarding the issues around the greedy RegBankSelect and alternative
mappings.
>>>>> 
>>>>> The issue I brought up was that because RegBankSelect goes
top-down, it never looks at all available mappings for the operands when
considering which of the mappings to apply to the current instruction. In our
architecture we have one register bank dedicated to pointers and another one for
anything else. We often see code where we have a G_PTR_ADD with a constant.
Since the constant is not a pointer, we put it on the other register bank. We
could put it on the address regbank and do provide alternative mappings for
that, but since the greedy algorithm doesn't actually check the usage of the
constant, it is always put on the other bank.
>>>> 
>>>> The intent behind the greedy algorithm was for the mapping to
look at X instructions ahead, depending on the optimization level, when
assigning one instruction.
>>>> Right now the window is simply 1 instruction and the code is
not structured to allow to have more than one.
>>>> 
>>>> As we gain more insights on what we want RegBankSelect to do,
it makes sense to redesign it.
>>>> 
>>>>> 
>>>>> When RegBankSelect then sees the G_PTR_ADD and sees that
one of its inputs is on the other register bank already, it then inserts a
costly cross-bank copy instead of checking if that operand has any alternative
mappings which would make the overall mapping for the current instruction
cheaper.
>>>> 
>>>> The idea is when a mapping is done, that means it was the best
one at the time of the decision (greedy), thus we don’t challenge that.
>>>> “Reverting” an already set mapping is not that simple because
yes for this particular use we are inserting costly cross copies, but there are
no guarantees that replacing the mapping of the definition will not insert even
more costly copies for the other uses.
>>>> 
>>>> E.g., consider:
>>>> ```
>>>> A = def <— RBS starts here
>>>> = useFP A
>>>> = useInt A
>>>> = useInt A
>>>> ```
>>>> 
>>>> Let’s assume that greedy works the way it is intended. I.e., it
assigns A to the int register bank because there are 2 such uses vs. only 1 fp
bank use:
>>>> ```
>>>> A<int> = def
>>>> = useFP A<int> <— Next, RBS looks at this one
>>>> = useInt A<int>
>>>> = useInt A<int>
>>>> ```
>>>> 
>>>> Now the regbank for useFP is not right and has to be repaired.
Right now, we will insert the costly cross copy for that use:
>>>> ```
>>>> A<int> = def
>>>> AFP<fp> = cross_copy A<int>
>>>> = useFP AFP<fp>
>>>> = useInt A<int>
>>>> = useInt A<int>
>>>> ```
>>>> 
>>>> Now, if we were to change the definition of A to avoid this
copy we would create two costly copy for the useInt. Actually, another question
is what would we do when we look at the first useInt? Is this use allowed to
change the definition of the instruction again? Do we duplicate the definition?
Etc..
>>>> ```
>>>> A<fp> = def <— reassign
>>>> = useFP AFP<fp>
>>>> AInt1<int> = cross_copy A<fp>
>>>> = useInt AInt1<int>
>>>> AInt2<int> = cross_copy A<fp>
>>>> = useInt AInt2<int>
>>>> ```
>>>> 
>>>> Bottom line, if we allow to modify the assignments of the
definition, the decision making is not local anymore and in particular may
require to add repairing code all over the place. As a result the cost model
becomes much more complicated.
>>>> 
>>>>> 
>>>>> Matt suggested that RegBankSelect should probably go
bottom-up instead and I agree with him. I don't think there is a particular
reason why RegBankSelect necessarily has to go top-down.
>>>> 
>>>> The rationale for going top-down is that when you reach an
instruction, all the operands are assigned so you know what would be the cost of
repairing.
>>>> You could said that the problem is the same for definitions
when going bottom-up. However, this is likely to be more problematic because
usually you have fewer definitions than arguments on each individual
instruction, therefore there is more guess work going on (e.g., top-down, you
assume a cost for 1 definition, going bottom-up you have to assume a cost for 2
arguments or more precisely you would have to track a window of X instructions
for 2 arguments instead of 1 definition.)
>>>> 
>>>>> 
>>>>> I'm not too familiar with the implementation of
RegBankSelect. Would it be a big effort to make it work bottom-up instead?
I'm guessing one of the biggest areas would be the check whether a
cross-bank copy is needed as well as calculating the overall cost for
alternative mappings as now all usages of the current instruction would have to
be checked instead of the much more limited operands. How big of an impact would
this have?
>>>> 
>>>> It’s been a while since I looked at the implementation but I
would expect this to be significant.
>>>> 
>>>> I think we should step back and check what we want before
investing any time in some rewrite. For instance, I don’t see what bottom-up
fundamentally gives us. It seems like a workaround to me.
>>>> That said, it would work either way!
>>>> 
>>>> Cheers,
>>>> -Quentin
>>>> 
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> Dominik
>>>>> 
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>> 
>>> -- 
>>>
----------------------------------------------------------------------
>>> Dominik Montada                   Email: dominik.montada at
hightec-rt.com <mailto:dominik.montada at hightec-rt.com>
>>> HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
>>> Europaallee 19                    Fax:   +49-681-92613-26
>>> D-66113 Saarbrücken               WWW: http://www.hightec-rt.com
<http://www.hightec-rt.com/>
>>> 
>>> Managing Director: Vera Strothmann
>>> Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222
>>> 
>>> This e-mail may contain confidential and/or privileged information.
If
>>> you are not the intended recipient please notify the sender
immediately
>>> and destroy this e-mail. Any unauthorised copying, disclosure or
>>> distribution of the material in this e-mail is strictly forbidden.
>>> --- 
>> 
> -- 
> ----------------------------------------------------------------------
> Dominik Montada                   Email: dominik.montada at hightec-rt.com
<mailto:dominik.montada at hightec-rt.com>
> HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
> Europaallee 19                    Fax:   +49-681-92613-26
> D-66113 Saarbrücken               WWW: http://www.hightec-rt.com
<http://www.hightec-rt.com/>
> 
> Managing Director: Vera Strothmann
> Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222
> 
> This e-mail may contain confidential and/or privileged information. If
> you are not the intended recipient please notify the sender immediately
> and destroy this e-mail. Any unauthorised copying, disclosure or
> distribution of the material in this e-mail is strictly forbidden.
> --- 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201009/eb1f50cb/attachment.html>

Gabriel Hjort Åkerlund via llvm-dev

2020-Oct-13 15:44 UTC

head link

[llvm-dev] GlobalISel round table follow up: register bank select

Hi all,



I was pondering on how regbank-select could find the optimal assignment, and
came up with the following algorithm:



1.      Traverse top-down and compute cost of putting the given result in a
particular register bank. The cost equals the cost of the InstructionMapping
itself plus the cost of putting each operand in the register bank as required by
the operand’s ValueMapping. Hence the cost definition is recursive and increases
monotonically. If some operand does not have an InstructionMapping that puts it
in the required register bank, add such an option and compute the cost as before
plus the cost of converting the register bank. Hence such options would only be
computed on demand.
2.      Traverse bottom-up and mark the InstructionMapping that puts the result
in the appropriate register bank at lowest cost.
3.      Traverse top-down and select the register banks, also emitting repair
code if needed.



I started to implement this to see whether it would work, but I encountered
problems with AMDGPU in the sense that its RegisterBankInfo seems to assume that
the register banks are assigned progressively, meaning that when regbankselect
queries AMDGPU’s RegisterBankInfo for the possible InstructionMappings for a
given MachineInstr, then it only returns the smallest set necessary instead of
all possible combinations. Meaning, for a MachineInstr where its input operands
have not been assigned, it does not return the whole set of interesting
InstructionMappings which are needed for the algorithm above.



I might finish the implementation just for fun and learning, but I would
appreciate if anyone had some comments on whether this is a viable approach or
will lead to naught.



Cheers,

Gabriel



From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Quentin
Colombet via llvm-dev
Sent: den 9 oktober 2020 19:16
To: Dominik Montada <dominik.montada at hightec-rt.com>
Cc: llvm-de >> LLVM Developers' List <llvm-dev at
lists.llvm.org>
Subject: Re: [llvm-dev] GlobalISel round table follow up: register bank select









   On Oct 9, 2020, at 12:07 AM, Dominik Montada <dominik.montada at
hightec-rt.com<mailto:dominik.montada at hightec-rt.com>> wrote:



   Hi Quentin,

   Am 08.10.20 um 21:17 schrieb Quentin Colombet:

      Hi Dominik,





         On Oct 8, 2020, at 5:03 AM, Dominik Montada <dominik.montada at
hightec-rt.com<mailto:dominik.montada at hightec-rt.com>> wrote:



         Hi Quentin,

         thanks for picking up the conversation!

         > I think we should step back and check what we want before
investing any time in some rewrite.

         That is a very fair point and I might have been getting ahead of myself
in my last email.
         What I would like to see from RegBankSelect is to produce the mapping
with the overall lowest cost. Keeping track of all different combinations of
mappings will certainly be non-trivial however, so I wonder if there is a smart
way to do this without spending too much compilation time.

         Ideally for instructions with no operands (like G_CONSTANT) it could
also check whether a cross-bank copy is actually worth it or if it would be more
beneficial to simply rematerialize the instruction on the required bank. For
such instructions this information should already be available as part of the
cost-modelling in RegBankSelect: we could simply compare the cost of a mapping
on the required bank vs. the cost of a cross-bank copy.

         Would you see this as a valid direction for RegBankSelect?

      Yes, I think this is a valid direction. I actually think we shouldn’t
restrict ourselves to instructions with no operands. We could always duplicate
next to the original instruction, i.e., we wouldn’t have any issue materializing
the arguments.

   Sure, I guess that would also work as long we don't have to introduce
copies for the operands. Theoretically this could even try to duplicate the
operands if overall it's still cheaper than a cross-bank copy but that is
probably some pandoras box just waiting to be opened :)



   Hehe!

   Ideally we would compute the costs globally and do the code transformation
only once. But I am guessing this is wishful thinking.







      The one thing that needs work is the cost model.

   What would you say does the cost model need in its current state?





   I don’t remember how everything works so take it with a grain of salt.

   I’d say we would need to take into account how much it would cost to
duplicate the definitions and compute the repairing cost with those
duplications.

   The thing is it may become compile time intensive pretty quickly if we want
to consider all the possibilities.



   As a starter we could focus on the best mapping for the current instruction
and try to match the desired regbanks from each operand and only evaluate that
cost against just plain repairing.



   In any case, I think it would require a non-trivial amount of work.

   By the way, does this have anything to do with what Matt was talking about
during the round table? Unfortunately I don't remember exactly what his
issues with RegBankSelect were but I'd be interested to know whether what we
talked about would also benefit him.





   Partially. IIRC Matt’s problem was also how we apply the mapping and how,
right now, we have to abuse the observer mechanism to do what we want.




   I do remember that AMDGPU is apparently doing something very different
compared to AArch64 for example and I'd be interested to know the reasoning
behind the different approaches.

   Cheers,

   Dominik



      Cheers,

      -Quentin





         Best regards,

         Dominik



         Am 07.10.20 um 19:47 schrieb Quentin Colombet:

            Hi Dominik,



            Thanks for sending this!





               On Oct 7, 2020, at 5:21 AM, Dominik Montada via llvm-dev
<llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>>
wrote:



               Hi all,

               this is the second email for the round table follow-up, this time
regarding the issues around the greedy RegBankSelect and alternative mappings.

               The issue I brought up was that because RegBankSelect goes
top-down, it never looks at all available mappings for the operands when
considering which of the mappings to apply to the current instruction. In our
architecture we have one register bank dedicated to pointers and another one for
anything else. We often see code where we have a G_PTR_ADD with a constant.
Since the constant is not a pointer, we put it on the other register bank. We
could put it on the address regbank and do provide alternative mappings for
that, but since the greedy algorithm doesn't actually check the usage of the
constant, it is always put on the other bank.



            The intent behind the greedy algorithm was for the mapping to look
at X instructions ahead, depending on the optimization level, when assigning one
instruction.

            Right now the window is simply 1 instruction and the code is not
structured to allow to have more than one.



            As we gain more insights on what we want RegBankSelect to do, it
makes sense to redesign it.






               When RegBankSelect then sees the G_PTR_ADD and sees that one of
its inputs is on the other register bank already, it then inserts a costly
cross-bank copy instead of checking if that operand has any alternative mappings
which would make the overall mapping for the current instruction cheaper.



            The idea is when a mapping is done, that means it was the best one
at the time of the decision (greedy), thus we don’t challenge that.

            “Reverting” an already set mapping is not that simple because yes
for this particular use we are inserting costly cross copies, but there are no
guarantees that replacing the mapping of the definition will not insert even
more costly copies for the other uses.



            E.g., consider:

            ```

            A = def <— RBS starts here

            = useFP A

            = useInt A

            = useInt A

            ```



            Let’s assume that greedy works the way it is intended. I.e., it
assigns A to the int register bank because there are 2 such uses vs. only 1 fp
bank use:

            ```

            A<int> = def

            = useFP A<int> <— Next, RBS looks at this one

            = useInt A<int>

            = useInt A<int>

            ```



            Now the regbank for useFP is not right and has to be repaired. Right
now, we will insert the costly cross copy for that use:

            ```

            A<int> = def

            AFP<fp> = cross_copy A<int>

            = useFP AFP<fp>

            = useInt A<int>

            = useInt A<int>

            ```



            Now, if we were to change the definition of A to avoid this copy we
would create two costly copy for the useInt. Actually, another question is what
would we do when we look at the first useInt? Is this use allowed to change the
definition of the instruction again? Do we duplicate the definition? Etc..

            ```

            A<fp> = def <— reassign

            = useFP AFP<fp>

            AInt1<int> = cross_copy A<fp>

            = useInt AInt1<int>

            AInt2<int> = cross_copy A<fp>

            = useInt AInt2<int>

            ```



            Bottom line, if we allow to modify the assignments of the
definition, the decision making is not local anymore and in particular may
require to add repairing code all over the place. As a result the cost model
becomes much more complicated.




               Matt suggested that RegBankSelect should probably go bottom-up
instead and I agree with him. I don't think there is a particular reason why
RegBankSelect necessarily has to go top-down.



            The rationale for going top-down is that when you reach an
instruction, all the operands are assigned so you know what would be the cost of
repairing.

            You could said that the problem is the same for definitions when
going bottom-up. However, this is likely to be more problematic because usually
you have fewer definitions than arguments on each individual instruction,
therefore there is more guess work going on (e.g., top-down, you assume a cost
for 1 definition, going bottom-up you have to assume a cost for 2 arguments or
more precisely you would have to track a window of X instructions for 2
arguments instead of 1 definition.)






               I'm not too familiar with the implementation of
RegBankSelect. Would it be a big effort to make it work bottom-up instead?
I'm guessing one of the biggest areas would be the check whether a
cross-bank copy is needed as well as calculating the overall cost for
alternative mappings as now all usages of the current instruction would have to
be checked instead of the much more limited operands. How big of an impact would
this have?



            It’s been a while since I looked at the implementation but I would
expect this to be significant.



            I think we should step back and check what we want before investing
any time in some rewrite. For instance, I don’t see what bottom-up fundamentally
gives us. It seems like a workaround to me.

            That said, it would work either way!



            Cheers,

            -Quentin






               Cheers,

               Dominik

               _______________________________________________
               LLVM Developers mailing list
               llvm-dev at lists.llvm.org<mailto:llvm-dev at
lists.llvm.org>
               https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



         --
         ----------------------------------------------------------------------
         Dominik Montada                   Email: dominik.montada at
hightec-rt.com<mailto:dominik.montada at hightec-rt.com>
         HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
         Europaallee 19                    Fax:   +49-681-92613-26
         D-66113 Saarbrücken               WWW:
http://www.hightec-rt.com<https://protect2.fireeye.com/v1/url?k=10181272-4eb8bcbf-101852e9-86073b36ea28-3b408a428d93631b&q=1&e=a2c4ed0a-0e5c-4bef-9460-752197daba21&u=http%3A%2F%2Fwww.hightec-rt.com%2F>

         Managing Director: Vera Strothmann
         Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222

         This e-mail may contain confidential and/or privileged information. If
         you are not the intended recipient please notify the sender immediately
         and destroy this e-mail. Any unauthorised copying, disclosure or
         distribution of the material in this e-mail is strictly forbidden.
         ---



   --
   ----------------------------------------------------------------------
   Dominik Montada                   Email: dominik.montada at
hightec-rt.com<mailto:dominik.montada at hightec-rt.com>
   HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
   Europaallee 19                    Fax:   +49-681-92613-26
   D-66113 Saarbrücken               WWW:
http://www.hightec-rt.com<https://protect2.fireeye.com/v1/url?k=09ae2d90-570e835d-09ae6d0b-86073b36ea28-251999af69af6c7f&q=1&e=a2c4ed0a-0e5c-4bef-9460-752197daba21&u=http%3A%2F%2Fwww.hightec-rt.com%2F>

   Managing Director: Vera Strothmann
   Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222

   This e-mail may contain confidential and/or privileged information. If
   you are not the intended recipient please notify the sender immediately
   and destroy this e-mail. Any unauthorised copying, disclosure or
   distribution of the material in this e-mail is strictly forbidden.
   ---



-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201013/522195b6/attachment-0001.html>

llvm dev - Oct 2020 - GlobalISel round table follow up: register bank select

[llvm-dev] GlobalISel round table follow up: register bank select

[llvm-dev] GlobalISel round table follow up: register bank select

[llvm-dev] GlobalISel round table follow up: register bank select