Renato Golin
2013-Jul-29 20:07 UTC
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
On 29 July 2013 20:39, Jim Grosbach <grosbach at apple.com> wrote:
> These results are really excellent. They’re on Intel, I assume, right?
> What do the ARM numbers look like? Before enabling by default, we should
> make sure that the results are comparable there as well.

Hi Jim,

I'll have a look.

--renato
Jim Grosbach
2013-Jul-29 20:07 UTC
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
Cool. Thanks!

-Jim
Renato Golin
2013-Jul-30 23:39 UTC
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
Nadav,

I ran some benchmarks and I'm seeing a 3-6% performance loss on a few of them when we use SLP at both O2 and O3 (with O3 showing the biggest differences).

Unfortunately, my benchmarking is not scientific, so I can't vouch for those numbers, nor will I have time to investigate more closely in the short term, but I wouldn't be surprised if this is the result of the extra shuffles we were seeing a few months back with BB-Vect. This means that we could maybe trim that off (later) by fixing two or three bugs and (fingers crossed) making those shuffles go away.

I'm trying to set up a task to compare the most important compile options (including all three vectorizers) at all optimization levels, but that's not going to happen any time soon, so don't take my word for it. If I'm the only one with bad numbers, I'm sure we can fix the issues later if you do introduce SLP into O3 now.

Though, I would like to wait a bit longer for O2 and Os, because I also can't vouch for its correctness, since we don't have buildbots with SLP on, nor on O2, and O2 is more or less the "fast, but still stable" state that people prefer for compiling production code.

Turn it on at O3, let's see how the bots behave, let's get a few data points and have a bit more information on its state.

cheers,
--renato
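The comparison described in this message (vectorizers toggled on and off at each optimization level) could be enumerated roughly as follows. This is only a minimal sketch, not anything posted in the thread: it assumes clang's `-fvectorize`/`-fno-vectorize` and `-fslp-vectorize`/`-fno-slp-vectorize` flag pairs, leaves out the BB vectorizer, and uses a hypothetical benchmark file `bench.c`. The helper names `build_matrix` and `percent_change` are made up for illustration.

```python
from itertools import product

# Optimization levels discussed in the thread.
OPT_LEVELS = ["-O2", "-O3", "-Os"]

# Vectorizer toggles as (enable, disable) clang flag pairs.
# Assumption: only the loop and SLP vectorizers are covered here.
VECTORIZERS = {
    "loop": ("-fvectorize", "-fno-vectorize"),
    "slp": ("-fslp-vectorize", "-fno-slp-vectorize"),
}


def build_matrix(source="bench.c", compiler="clang"):
    """Return one compile command per (opt level, on/off combination)."""
    commands = []
    names = list(VECTORIZERS)
    for level in OPT_LEVELS:
        # Every on/off combination of the listed vectorizers.
        for states in product([True, False], repeat=len(names)):
            flags = [VECTORIZERS[n][0 if on else 1] for n, on in zip(names, states)]
            commands.append([compiler, level, *flags, source, "-o", "bench"])
    return commands


def percent_change(baseline_seconds, candidate_seconds):
    """Positive result means the candidate ran slower than the baseline."""
    return (candidate_seconds - baseline_seconds) / baseline_seconds * 100.0


if __name__ == "__main__":
    # Print the command lines only; running and timing them is left out.
    for cmd in build_matrix():
        print(" ".join(cmd))
```

Timing each resulting binary several times and feeding the medians to `percent_change` would give numbers comparable to the 3-6% figure above, though the measurement method itself is not specified in the thread.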