thr3ads.net - llvm dev - [llvm-dev] getScalarizationOverhead() [Jan 2017]

Jonas Paulsson via llvm-dev

2017-Jan-20 14:30 UTC

[llvm-dev] getScalarizationOverhead()

On 2017-01-20 14:31, Hal Finkel wrote:
> On 01/20/2017 06:11 AM, Jonas Paulsson via llvm-dev wrote:
>> Hi,
>>
>> I wonder why getScalarizationOverhead() does not take into account 
>> the number of operands of the instruction? This should influence the 
>> number of extracts needed, so instead of
>>
>> Scalarization cost = NumEls * (insert + extract)
>>
>> it would be better to do
>>
>> Scalarization cost = NumEls * (insert + (extract * numOperands))
>
> I suspect this is an oversight (although we need to be a bit careful 
> here because if two operands are the same, which is not uncommon, we 
> don't want to double the cost).
>
>  -Hal
Do you in those cases of an identical operand want to count just a cost
of "1" for a register move, instead of the "extraction cost"?

/Jonas
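
As an illustration of the two formulas quoted above, here is a minimal C++ sketch. The names `ElementCosts`, `scalarizationCostCurrent` and `scalarizationCostProposed` are hypothetical and are not LLVM's TargetTransformInfo interface; the per-element insert and extract costs are placeholder inputs that a real target would supply.

```cpp
#include <cassert>

// Hypothetical per-element costs; a real target would provide these values.
struct ElementCosts {
  unsigned Insert;  // cost of inserting one scalar result into the vector
  unsigned Extract; // cost of extracting one scalar operand from a vector
};

// Formula as described in the thread: one extract per element,
// regardless of how many operands the instruction has.
unsigned scalarizationCostCurrent(unsigned NumEls, ElementCosts C) {
  assert(NumEls > 0 && "expected a non-empty vector");
  return NumEls * (C.Insert + C.Extract);
}

// Proposed formula: one extract per element for every operand.
unsigned scalarizationCostProposed(unsigned NumEls, unsigned NumOperands,
                                   ElementCosts C) {
  assert(NumEls > 0 && "expected a non-empty vector");
  return NumEls * (C.Insert + C.Extract * NumOperands);
}
```

With unit costs and a 4-element vector, the first function gives 4 * (1 + 1) and the second, for a two-operand instruction, gives 4 * (1 + 2). The two agree only when the instruction has a single operand.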

Hal Finkel via llvm-dev

2017-Jan-20 14:53 UTC

[llvm-dev] getScalarizationOverhead()

On 01/20/2017 08:30 AM, Jonas Paulsson wrote:
>
> On 2017-01-20 14:31, Hal Finkel wrote:
>>
>> On 01/20/2017 06:11 AM, Jonas Paulsson via llvm-dev wrote:
>>> Hi,
>>>
>>> I wonder why getScalarizationOverhead() does not take into account 
>>> the number of operands of the instruction? This should influence the
>>> number of extracts needed, so instead of
>>>
>>> Scalarization cost = NumEls * (insert + extract)
>>>
>>> it would be better to do
>>>
>>> Scalarization cost = NumEls * (insert + (extract * numOperands))
>>
>> I suspect this is an oversight (although we need to be a bit careful 
>> here because if two operands are the same, which is not uncommon, we 
>> don't want to double the cost).
>>
>>  -Hal
>
> Do you in those cases of an identical operand want to count just a
> cost of "1" for a register move, instead of the "extraction cost"?
There should be no cost to reusing the operand. (mul a, a) should only
extract a once; the fact that it is used twice should not increase the cost.

  -Hal
>
> /Jonas
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
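
The point about repeated operands can be sketched as follows: count extracts per distinct operand rather than per operand slot. This is only an illustration under stated assumptions. Operands are modelled as strings, and `countUniqueOperands` and `scalarizationCostUniqueOperands` are hypothetical helpers, not functions that exist in LLVM.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Number of distinct operands; (mul a, a) has two operand slots but one
// distinct operand. Operands are modelled as names for illustration only.
unsigned countUniqueOperands(const std::vector<std::string> &Operands) {
  std::set<std::string> Unique(Operands.begin(), Operands.end());
  return static_cast<unsigned>(Unique.size());
}

// Scalarization cost where each distinct operand is extracted once per
// element, so reusing an operand adds nothing to the cost.
unsigned scalarizationCostUniqueOperands(
    unsigned NumEls, const std::vector<std::string> &Operands,
    unsigned InsertCost, unsigned ExtractCost) {
  assert(NumEls > 0 && "expected a non-empty vector");
  return NumEls * (InsertCost + ExtractCost * countUniqueOperands(Operands));
}
```

Under this sketch, with unit costs and four elements, (mul a, a) is charged the same as a one-operand instruction, while (mul a, b) is charged for two extracts per element.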

Simon Pilgrim via llvm-dev

2017-Jan-22 16:07 UTC

[llvm-dev] getScalarizationOverhead()

> On 20 Jan 2017, at 14:53, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> 
> On 01/20/2017 08:30 AM, Jonas Paulsson wrote:
>> 
>> 
>> On 2017-01-20 14:31, Hal Finkel wrote:
>>> 
>>> On 01/20/2017 06:11 AM, Jonas Paulsson via llvm-dev wrote:
>>>> Hi,
>>>> 
>>>> I wonder why getScalarizationOverhead() does not take into account
>>>> the number of operands of the instruction? This should influence the
>>>> number of extracts needed, so instead of
>>>> 
>>>> Scalarization cost = NumEls * (insert + extract)
>>>> 
>>>> it would be better to do
>>>> 
>>>> Scalarization cost = NumEls * (insert + (extract * numOperands))
>>> 
>>> I suspect this is an oversight (although we need to be a bit careful
>>> here because if two operands are the same, which is not uncommon, we
>>> don't want to double the cost).
>>> 
>>> -Hal
>> 
>> Do you in those cases of an identical operand want to count just a
>> cost of "1" for a register move, instead of the "extraction cost"?
> 
> There should be no cost to reusing the operand. (mul a, a) should only
> extract a once; the fact that it is used twice should not increase the cost.
> 
> -Hal
There appears to be a similar issue within the x86 AVX1 cost tables for cases
where we have to split the 256-bit integer operations. Some binops add
1*extract_subvector + 1*insert_subvector to the 2*128-binop costs whilst others
don’t bother adding anything at all. We need to try harder to determine if we
should add 1 (duplicate input or constant folded extract) or 2 extracts to the
final cost.
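
The split-cost arithmetic described here can be sketched in C++ as below. This is an assumption-laden illustration, not the contents of the x86 cost tables: `SplitCosts`, `numExtractsNeeded` and `splitBinopCost` are hypothetical names, and the individual costs are placeholder inputs.

```cpp
#include <cassert>

// Hypothetical component costs for splitting one 256-bit integer binop
// into two 128-bit halves.
struct SplitCosts {
  unsigned HalfWidthBinop;   // cost of one 128-bit binop
  unsigned ExtractSubvector; // cost of one extract_subvector
  unsigned InsertSubvector;  // cost of one insert_subvector
};

// One extract suffices if both inputs are the same vector or if one
// extract folds away (for example, a constant operand); otherwise two.
unsigned numExtractsNeeded(bool DuplicateInput, bool OneExtractFolds) {
  return (DuplicateInput || OneExtractFolds) ? 1 : 2;
}

// Total cost: two half-width binops, the required extracts, and one
// insert to rebuild the 256-bit result.
unsigned splitBinopCost(SplitCosts C, unsigned NumExtracts) {
  assert((NumExtracts == 1 || NumExtracts == 2) &&
         "this sketch only models one or two extracts");
  return 2 * C.HalfWidthBinop + NumExtracts * C.ExtractSubvector +
         1 * C.InsertSubvector;
}
```

With unit costs, the duplicate-input case comes to 2 + 1 + 1 and the general case to 2 + 2 + 1, which is the one-extract difference the message refers to.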
