Displaying 3 results from an estimated 3 matches for "vectorcost".
2018 Jul 23
2
[LoopVectorizer] Improving the performance of dot product reduction loop
On 07/23/2018 06:23 PM, Hal Finkel via llvm-dev wrote:
>
> On 07/23/2018 05:22 PM, Craig Topper wrote:
>> Hello all,
>>
>> This code https://godbolt.org/g/tTyxpf is a dot product reduction
>> loop multiplying sign extended 16-bit values to produce a 32-bit
>> accumulated result. The x86 backend is currently not able to optimize
>> it as well as gcc and icc.
2018 Jun 01
2
[VPlan] about vectorization factor selection
....first / (float)Width;
}
for (unsigned i = 2; i <= MaxVF; i *= 2) {
  // Notice that the vector loop needs to be executed fewer times, so
  // we need to divide the cost of the vector loops by the width of
  // the vector elements.
  VectorizationCostTy C = expectedCost(i);
  float VectorCost = C.first / (float)i;
Cheers,
Shixiong (Jason) Xu
2018 Jul 24
2
[LoopVectorizer] Improving the performance of dot product reduction loop
...s
> the VF as 8 though VF16 has the same cost.
>
> LV: Vector loop of width 8 costs: 1.
> LV: Vector loop of width 16 costs: 1.
>
> It’s because of below check in LV:
>
> LoopVectorizationCostModel::selectVectorizationFactor() {
>   …
>   if (VectorCost < Cost) {
>     Cost = VectorCost;
>     Width = i;
>   }
>
> I don’t know the history behind this change but wondering why it’s
> like this, at least for “vectorizer-maximize-bandwidth” it should be
> (VectorCost <= Cost).
>
Ah, inter...
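
To make the tie-breaking question in the thread above concrete, here is a minimal standalone sketch, not the actual LoopVectorizer code. The names Candidate, exampleCosts, selectWidth and PreferWider are hypothetical, and the cost numbers are made up so that widths 8 and 16 both come out at a per-lane cost of 1, as in the quoted debug output. It only shows how the strict comparison (VectorCost < Cost) keeps the first width that reaches the minimum, while (VectorCost <= Cost) moves on to the widest width with the same cost.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the cost model's answer at one width:
// the total cost of one vector iteration at that width.
struct Candidate {
  unsigned Width;
  unsigned TotalCost;
};

// Made-up numbers: both widths end up with a per-lane cost of 1.
std::vector<Candidate> exampleCosts() {
  return {{8, 8}, {16, 16}};
}

// Pick a width by comparing per-lane cost against the best cost so far.
// PreferWider == false uses the strict check (VectorCost < Cost);
// PreferWider == true uses the relaxed check (VectorCost <= Cost).
unsigned selectWidth(float ScalarCost,
                     const std::vector<Candidate> &Candidates,
                     bool PreferWider) {
  float Cost = ScalarCost; // cost of staying scalar (width 1)
  unsigned Width = 1;
  for (const Candidate &C : Candidates) {
    // The vector loop runs fewer iterations, so normalize by the width.
    float VectorCost = C.TotalCost / (float)C.Width;
    bool Better = PreferWider ? (VectorCost <= Cost) : (VectorCost < Cost);
    if (Better) {
      Cost = VectorCost;
      Width = C.Width;
    }
  }
  return Width;
}
```

With an assumed scalar cost of 4, selectWidth(4.0f, exampleCosts(), false) stays at width 8 because width 16 is not strictly cheaper, and selectWidth(4.0f, exampleCosts(), true) picks width 16.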