thr3ads.net - llvm dev - [llvm-dev] Enable vectorizer-maximize-bandwidth by default? [May 2017]


Dehao Chen via llvm-dev

2017-May-18 22:30 UTC

[llvm-dev] Enable vectorizer-maximize-bandwidth by default?

Hi,

I'm proposing to turn vectorizer-maximize-bandwidth on by default for the loop
vectorizer, because it should generally help performance.

I've tested the performance impact on an Intel Sandy Bridge machine with the
SPEC CPU2006 benchmarks:

           Benchmark             Base:Reference   (1)
-------------------------------------------------------
spec/2006/fp/C++/444.namd                 26.84  -0.31%
spec/2006/fp/C++/447.dealII               46.19  +0.89%
spec/2006/fp/C++/450.soplex               42.92  -0.44%
spec/2006/fp/C++/453.povray               38.57  -2.25%
spec/2006/fp/C/433.milc                   24.54  -0.76%
spec/2006/fp/C/470.lbm                    41.08  +0.26%
spec/2006/fp/C/482.sphinx3                47.58  -0.99%
spec/2006/int/C++/471.omnetpp             22.06  +1.87%
spec/2006/int/C++/473.astar               22.65  -0.12%
spec/2006/int/C++/483.xalancbmk           33.69  +4.97%
spec/2006/int/C/400.perlbench             33.43  +1.70%
spec/2006/int/C/401.bzip2                 23.02  -0.19%
spec/2006/int/C/403.gcc                   32.57  -0.43%
spec/2006/int/C/429.mcf                   40.35  +0.27%
spec/2006/int/C/445.gobmk                 26.96  +0.06%
spec/2006/int/C/456.hmmer                  24.4  +0.19%
spec/2006/int/C/458.sjeng                 27.91  -0.08%
spec/2006/int/C/462.libquantum            57.47  -0.20%
spec/2006/int/C/464.h264ref               46.52  +1.35%

geometric mean                                   +0.29%

  Scores are benchmark specific.
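
For readers who want to check the summary row, here is a minimal sketch of how
a geometric mean over the per-benchmark changes can be computed. It assumes the
mean is taken over the ratios (1 + delta/100) of column (1); the reporting
tool's exact method is not stated in the thread, and the names
geomean_percent and kSpec2006Deltas are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of relative changes given in percent.
// Each change d is converted to a ratio (1 + d/100), the ratios are averaged
// in log space, and the result is converted back to a percentage.
// Assumption: this mirrors how the summary row was produced.
double geomean_percent(const std::vector<double>& deltas_percent) {
    assert(!deltas_percent.empty());
    double log_sum = 0.0;
    for (double d : deltas_percent) {
        log_sum += std::log1p(d / 100.0);
    }
    return std::expm1(log_sum / deltas_percent.size()) * 100.0;
}

// The 19 per-benchmark changes from column (1) of the table, in table order.
const std::vector<double> kSpec2006Deltas = {
    -0.31, +0.89, -0.44, -2.25, -0.76, +0.26, -0.99,   // fp
    +1.87, -0.12, +4.97,                               // int, C++
    +1.70, -0.19, -0.43, +0.27, +0.06, +0.19, -0.08,   // int, C
    -0.20, +1.35,
};
```

Under that assumption, geomean_percent(kSpec2006Deltas) comes out at roughly
+0.29%, in line with the summary row.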

We do see a regression on 453.povray, but it is due to secondary effects, as
all hot functions are identical. I've also tested the code size impact: it
does not change for the tested SPEC CPU benchmarks.

I've prepared https://reviews.llvm.org/D33341 to do this.

I would really appreciate it if the community could help test the performance
impact of this change on other architectures, so that we can decide whether
this should be target-dependent.

Any comments/concerns?

Thanks,
Dehao

陳韋任 via llvm-dev

2017-May-19 13:18 UTC


Besides speccpu, do any real-world applications benefit from this option?

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Homepage: https://people.cs.nctu.edu.tw/~chenwj

Matthew Simpson via llvm-dev

2017-May-19 15:56 UTC


This sounds good to me. Enabling this by default has been mentioned a few
times already. I've tested this feature in the past on AArch64 (Kryo and
Falkor) and found it to be beneficial for mixed-type loops. Thanks!
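
To illustrate what a mixed-type loop looks like, here is a small sketch (the
function name widen_add and the file name widen.cpp are hypothetical, not from
the thread). By default the loop vectorizer derives the vectorization factor
from the widest type in the loop; with vectorizer-maximize-bandwidth it may
consider larger factors based on the smallest type, subject to the cost model.
The comments show one way to compare the two choices with clang; the widths
actually picked depend on the target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A mixed-type loop: 8-bit loads are widened and added to 32-bit elements.
// With 128-bit vectors, sizing by the widest type (32 bits) suggests 4 lanes,
// while sizing by the smallest type (8 bits) suggests 16 lanes.
//
// One way to see what the vectorizer chose (remarks go to stderr):
//   clang++ -O2 -c widen.cpp -Rpass=loop-vectorize
//   clang++ -O2 -c widen.cpp -Rpass=loop-vectorize \
//       -mllvm -vectorizer-maximize-bandwidth
void widen_add(const std::uint8_t* src, std::uint32_t* dst, std::size_t n) {
    assert(n == 0 || (src != nullptr && dst != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += src[i];
    }
}
```

Comparing the reported vectorization width between the two builds is a quick
check of whether the option changes anything for a given loop and target.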


Dehao Chen via llvm-dev

2017-May-19 17:35 UTC


Yes, we do see performance benefits from this change on some Google-internal
benchmarks.

Dehao

On Fri, May 19, 2017 at 6:18 AM, 陳韋任 <chenwj.cs97g at g2.nctu.edu.tw>
wrote:
> Besides speccpu, any other real-world applications benefit from this
> option?

Adam Nemet via llvm-dev

2017-May-19 23:01 UTC


> On May 18, 2017, at 3:30 PM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> We do have regression on 453.povray, but it's due to secondary effects as
> all hot functions are the same. I've also tested the code size impact, it
> does not change for tested speccpu benchmarks.

Can you please describe the config for the runs (optimization level,
PGO/no-PGO, etc.)?

It would be good to provide analysis for the changes >1%. I.e. we need to
make sure that the improvements are not noise either ;).
>
> I've prepared https://reviews.llvm.org/D33341 to do this.
>
> I really appreciate if the community can help test the performance impact
> of this change on other architectures so that we can decide if this should
> go target-dependent.

I will run it on Cyclone/AArch64 next week.

Adam

Dehao Chen via llvm-dev

2017-May-22 16:57 UTC


On Fri, May 19, 2017 at 4:01 PM, Adam Nemet <anemet at apple.com> wrote:
> Can you please describe the config for the runs (optimization level,
> PGO/no-PGO, etc).
>
This is an O2 build without PGO.

>
> It would be good to provide analysis for the changes >1%. I.e. we need to
> make sure that the improvements are not noise either ;).
>
Good point. I just examined all benchmarks with >1% "improvement". It turns
out they are all noise: the hot functions (those with >1% of total cycles) are
all identical. So the conclusion is: this change does not affect speccpu2006
performance.

Thanks,
Dehao


Chandler Carruth via llvm-dev

2017-May-30 07:58 UTC


On Fri, May 19, 2017 at 4:01 PM Adam Nemet via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I will run it on Cyclone/AArch64 next week.
>
FYI, we're still waiting on these, Adam...