thr3ads.net - llvm dev - [llvm-dev] [GlobalISel][MIPS] Legality and instruction combining [Sep 2018]

If this information is useful, please help other people find it:
Share via:

pavram via llvm-dev

2018-Sep-14 13:50 UTC

[llvm-dev] [GlobalISel][MIPS] Legality and instruction combining

Hi Daniel,

On 13.09.2018. 19:32, Daniel Sanders wrote:> Could you clarify what you mean here? The new legalizer info can 
> define this with:
>     getActionDefinitionsBuilder(G_SELECT).clampScalar(1, s32, s32)
> so I'm guessing you mean that code to mutate the G_SELECT is currently 
> missingYes, LegalizerHelper::widenScalar widens only TypeIdx==0, it doesn't do 
that for TypeIdx==1. Is it intentionally implemented this
way?>> b) Is the plan to sometimes let s1 as legal type and ignore it later?
> I'm not sure what you mean here
>For example lets look at AArch64 G_SELECT:
   getActionDefinitionsBuilder(G_SELECT)
       .legalFor({{s32, s1}, {s64, s1}, {p0, s1}})
       .clampScalar(0, s32, s64)
       .widenScalarToNextPow2(0);

In this case LLT of operand 1 (s1) in G_SELECT has size 1, and 
corresponding register class in selected instruction has size 32 (that 
is $src1 in AArch64::ANDSWri, it has GPR32 regsiter class).
For that reason s1 is practically ignored and silently converted to s32.
We could get similar result if we have:

   getActionDefinitionsBuilder(G_SELECT)
       .legalFor({{s32, s32}, {s64, s32}, {p0, s32}})
       .clampScalar(0, s32, s64)
       .clampScalar(1, s32, s32)
       .widenScalarToNextPow2(0);

And in this case sizes of LLTs and register classes would be the same.
Implementation for G_ICMP on AArch64 is very similar to second described 
option for G_SELECT.
Is there a reason for different implementation of G_SELECT and G_ICMP on 
AArch64? Are there some general ruses to determine which of the two 
given options is better in which case?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180914/3bf07fce/attachment.html>

Daniel Sanders via llvm-dev

2018-Sep-14 16:31 UTC

head link

[llvm-dev] [GlobalISel][MIPS] Legality and instruction combining

> On 14 Sep 2018, at 06:50, pavram <Petar.Avramovic at rt-rk.com>
wrote:
> 
> 
> 
> 
> Hi Daniel, 
> 
> On 13.09.2018. 19:32, Daniel Sanders wrote: 
>> Could you clarify what you mean here? The new legalizer info can define
this with:
>>     getActionDefinitionsBuilder(G_SELECT).clampScalar(1, s32, s32) 
>> so I'm guessing you mean that code to mutate the G_SELECT is
currently missing
> Yes, LegalizerHelper::widenScalar widens only TypeIdx==0, it doesn't do
that for TypeIdx==1. Is it intentionally implemented this way?
No, we just haven't needed to implement the TypeIdx==1 case so far. Up until
recently we didn't have a means to test things that weren't in use by a
target. We decided it was better for the compiler to fail than to submit
untested/untestable code so there's quite a few gaps like this.
>>> b) Is the plan to sometimes let s1 as legal type and ignore it
later?
>> I'm not sure what you mean here 
>> 
> For example lets look at AArch64 G_SELECT: 
>   getActionDefinitionsBuilder(G_SELECT) 
>       .legalFor({{s32, s1}, {s64, s1}, {p0, s1}}) 
>       .clampScalar(0, s32, s64) 
>       .widenScalarToNextPow2(0); 
> 
> In this case LLT of operand 1 (s1) in G_SELECT has size 1, and
corresponding register class in selected instruction has size 32 (that is $src1
in AArch64::ANDSWri, it has GPR32 regsiter class).
> For that reason s1 is practically ignored and silently converted to s32.
Oh, I see what you mean. This is also a viable option but it requires that bits
1-31 don't affect the results of legal operations no matter what value they
are. This makes little difference to something like G_ADD, but things like
G_LSHR (that one isn't really an issue for s1, but applies for s2-s30) have
to ensure they insert leading zeros even if the bits in the container are
1's. Of course, you don't have to declare the difficult operations to be
legal and can let them widenScalar up to a type that's easy to deal with.

For Mips, this approach is probably ok so long as G_TRUNC defines bits 1-31 and
any other s1-producing operations that are legal maintain those bits to keep
them at the value they would be if you had only just done the G_TRUNC. I expect
you only want G_TRUNC/G_ANYEXT/G_SEXT/G_ZEXT, G_ICMP/G_FCMP, and
G_OR/G_XOR/G_AND in which case it's easy to achieve this.
> We could get similar result if we have: 
> 
>   getActionDefinitionsBuilder(G_SELECT) 
>       .legalFor({{s32, s32}, {s64, s32}, {p0, s32}}) 
>       .clampScalar(0, s32, s64) 
>       .clampScalar(1, s32, s32) 
>       .widenScalarToNextPow2(0); 
> 
> And in this case sizes of LLTs and register classes would be the same. 
> Implementation for G_ICMP on AArch64 is very similar to second described
option for G_SELECT.
> Is there a reason for different implementation of G_SELECT and G_ICMP on
AArch64? Are there some general ruses to determine which of the two given
options is better in which case?
At the moment, it largely comes down to preference. I don't have any real
evidence to back this up but I expect the latter will be the better optimized in
the long run since maintaining the excess bits requires instructions and these
are explicit in the GMIR. This means we can optimize it and decline to do it
whenever it doesn't matter. For example:
	%2(s2) = G_ADD %0(s2), %1(s2)
	%4(s2) = G_ADD %2(s2), %3(s2)
	%5(s2) = G_LSHR %4(s2), i2 1
has to either preserve bits 2-31 between each operation just in case one that
cares about it comes along, or define bits 2-31 just before an operation that
cares about it even if they're already correct. This costs six and
instructions. Whereas:
	%3(s32) = G_AND %0(s2), i32 3
	%4(s32) = G_AND %1(s2), i32 3
	%5(s32) = G_AND %2(s2), i32 3
	%6(s32) = G_ADD %3(s32), %4(s32)
	%7(s32) = G_AND %6(s32), i32 3
	%8(s32) = G_ADD %5(s32), %7(s32)
	%9(s32) = G_AND %8(s32), i32 3
	%10(s32) = G_LSHR %9(s32), i2 1
	%11(s32) = G_AND %9(s32), i32 3
can optimize to:
	%6(s32) = G_ADD %0(s32), %1(s32)
	%8(s32) = G_ADD %2(s32), %6(s32)
	%10(s32) = G_LSHR %8(s32), i2 1
	%11(s32) = G_AND %9(s32), i32 1
which costs only one and instruction.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180914/27912afc/attachment.html>

Petar Avramovic via llvm-dev

2018-Sep-17 15:08 UTC

head link

[llvm-dev] [GlobalISel][MIPS] Legality and instruction combining

Thanks Daniel,

On 14.09.2018. 18:31, Daniel Sanders wrote:>
>> On 14 Sep 2018, at 06:50, pavram <Petar.Avramovic at rt-rk.com 
>> <mailto:Petar.Avramovic at rt-rk.com>> wrote:
>>
>> We could get similar result if we have:
>>
>>   getActionDefinitionsBuilder(G_SELECT)
>>       .legalFor({{s32, s32}, {s64, s32}, {p0, s32}})
>>       .clampScalar(0, s32, s64)
>>       .clampScalar(1, s32, s32)
>>       .widenScalarToNextPow2(0);
>>
>> And in this case sizes of LLTs and register classes would be the same.
>> Implementation for G_ICMP on AArch64 is very similar to second 
>> described option for G_SELECT.
>> Is there a reason for different implementation of G_SELECT and G_ICMP 
>> on AArch64? Are there some general ruses to determine which of the 
>> two given options is better in which case?
>
> At the moment, it largely comes down to preference. I don't have any 
> real evidence to back this up but I expect the latter will be the 
> better optimized in the long run since maintaining the excess bits 
> requires instructions and these are explicit in the GMIR. This means 
> we can optimize it and decline to do it whenever it doesn't matter.Then, for now, I will widen s1 to size of register class in instruction 
"to be selected".
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180917/792e0138/attachment.html>

llvm dev - Sep 2018 - [GlobalISel][MIPS] Legality and instruction combining

[llvm-dev] [GlobalISel][MIPS] Legality and instruction combining

[llvm-dev] [GlobalISel][MIPS] Legality and instruction combining

[llvm-dev] [GlobalISel][MIPS] Legality and instruction combining