Gao, Yunzhong
2013-Sep-20 21:58 UTC
[LLVMdev] Proposal to improve vzeroupper optimization strategy
Hi Eli,

Thanks for the feedback. Please see below.
- Gao.

From: Eli Friedman [mailto:eli.friedman at gmail.com]
Sent: Thursday, September 19, 2013 12:31 PM
To: Gao, Yunzhong
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal to improve vzeroupper optimization strategy

> This is essentially equivalent to "don't insert vzeroupper anywhere", as
> far as I can tell. (The case of SSE instructions without a v- prefixed
> equivalent is rare enough we can separate it from this discussion.)

So would you be interested in a patch that disables vzeroupper by default?

I implemented this possibly over-engineered solution in our local tree to work around some bad instruction selection issues in the LLVM backend. When benchmarking our game code, I noticed that legacy SSE instructions were sometimes selected even though an AVX equivalent existed, in which case the vzeroupper instruction was needed. It is also much easier to detect the presence of a vzeroupper instruction than to detect every individual legacy SSE instruction.

The instruction selection issues were later fixed in our tree (patches to be submitted later), at least for the handful of games I tested. So a simple change that just disables vzeroupper by default would be acceptable to us as well.

> The reason we need vzeroupper in the first place is because we can't assume
> other functions won't use legacy SSE instructions; for example, on most
> systems, calling sin() will use legacy SSE instructions. I mean, if you can
> make some unusual guarantee about your platform, it might make sense to
> disable vzeroupper generation in general, but it simply doesn't make sense
> on most platforms.

I am confused by this point. By "most systems," do you have in mind a platform where the sin() function was compiled by gcc but the application code was compiled by clang?

If the sin() function was compiled by clang for a platform that supports AVX instructions, I would not expect it to contain legacy SSE instructions. Is that not the case for your platform?

I just looked at the library code for our sin() function and I do not see any legacy SSE instructions (but due to license restrictions I cannot share our library code; sorry).

> If you want a mechanism to disable vzeroupper generation for particular
> function calls, that might make sense...
> -Eli
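The placement strategy under discussion can be illustrated with a toy model. The sketch below is not LLVM's pass: the `Inst` record, its `writes_ymm` flag, and the `insert_vzeroupper` function are hypothetical names introduced here, and the model assumes a single basic block and a callee that returns with clean upper YMM halves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Hypothetical instruction record used only for this illustration."""
    opcode: str               # e.g. "vaddps", "call", "vzeroupper"
    writes_ymm: bool = False  # True if it writes a 256-bit YMM register


def insert_vzeroupper(block):
    """Return a copy of `block` with a vzeroupper placed before every call
    that is reached while the upper YMM halves may be dirty.

    Assumption (not from the thread): the callee leaves the upper halves
    clean, so one vzeroupper covers consecutive calls.
    """
    out = []
    dirty = False
    for inst in block:
        if inst.opcode == "vzeroupper":
            dirty = False                   # already cleaned explicitly
        elif inst.opcode == "call" and dirty:
            out.append(Inst("vzeroupper"))  # clean state before leaving AVX code
            dirty = False
        elif inst.writes_ymm:
            dirty = True                    # a 256-bit write dirties the upper halves
        out.append(inst)
    return out
```

For a block consisting of a 256-bit `vaddps` followed by two calls, the model inserts a single vzeroupper before the first call. A block that never writes a YMM register is returned unchanged, which corresponds to the case where the instruction is pure overhead.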
Nadav Rotem
2013-Sep-20 22:15 UTC
[LLVMdev] Proposal to improve vzeroupper optimization strategy
Hi Gao,

Eli is right. In many cases the OS is not compiled with AVX support but the application is. In other words, AVX code calling non-AVX code is very common.

Thanks,
Nadav
Eli Friedman
2013-Sep-20 23:11 UTC
[LLVMdev] Proposal to improve vzeroupper optimization strategy
On Fri, Sep 20, 2013 at 2:58 PM, Gao, Yunzhong <yunzhong_gao at playstation.sony.com> wrote:

> So will you be interested in a patch that disables vzeroupper by default?

A patch which adds a switch/LLVM IR function attribute to disable vzeroupper would be fine. A patch that disables vzeroupper on your platform would be fine (assuming the target triple is distinguishable). Turning off vzeroupper by default on all platforms is not fine.

> I am confused by this point. By "most systems," do you have in mind a
> platform where the sin() function was compiled by gcc but the application
> codes were compiled by clang?

On, for example, OS X, AVX is not enabled by default, so the sin() function uses legacy SSE instructions. Users can still turn on AVX in their applications.

-Eli
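The three acceptable options named in this reply (a switch, a function attribute, a per-platform default keyed on the target triple) could be combined roughly as follows. This is only a sketch: the function name, the `no-vzeroupper` attribute string, the `exempt_os` set, and the precedence order are all assumptions made for illustration, not existing LLVM options or attributes.

```python
def should_insert_vzeroupper(triple, fn_attrs=(), cli_override=None,
                             exempt_os=frozenset()):
    """Decide whether to emit vzeroupper for one function.

    Hypothetical precedence (an assumption, not stated in the thread):
    explicit switch > per-function attribute > platform default > on.
    """
    if cli_override is not None:
        return cli_override          # an explicit switch wins outright
    if "no-vzeroupper" in fn_attrs:
        return False                 # hypothetical per-function opt-out
    parts = triple.split("-")        # arch-vendor-os[-environment]
    os_name = parts[2] if len(parts) > 2 else ""
    if os_name in exempt_os:
        return False                 # platform guarantees no legacy SSE callees
    return True                      # default stays on for every other platform
```

With an empty `exempt_os`, every triple keeps vzeroupper enabled, which matches the requirement that the default not change on other platforms.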
Sean Silva
2013-Sep-21 00:07 UTC
[LLVMdev] Proposal to improve vzeroupper optimization strategy
Is it realistic to worry about performance of vectorized code that does PIC calls into a non-vectorized sin() in libc? Maybe there's an example other than sin() that is more realistic?

-- Sean Silva