Robert Lougher
2014-Dec-05 16:02 UTC
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On 5 December 2014 at 06:49, Sean Silva <chisophugis at gmail.com> wrote:
> On Wed, Dec 3, 2014 at 4:23 AM, Robert Lougher <rob.lougher at gmail.com> wrote:
>> Hi,
>>
>> In feedback from game studios a common issue is the replacement of
>> loops with calls to memcpy/memset. These loops are often
>> hand-optimised, and highly-efficient and the developers strongly want
>> a way to control the compiler (i.e. leave my loop alone).
>
> Please provide examples of such "hand-optimised, and highly-efficient"
> routines and test cases (and execution conditions) that demonstrate a
> performance improvement.

This sounds like a cop-out, but we can't share customer code (even if we could get a small runnable example). But this is all getting beside the point. I discussed performance issues to try and justify why the user should have control. That was probably a mistake as it has subverted the conversation. The blunt fact is that game developers don't like their loops being replaced and they want user control. The real conversation I wanted was what form should this user control take. To be honest, I am surprised at the level of resistance to giving users *any* control over their codegen.
David Chisnall
2014-Dec-05 17:09 UTC
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On 5 Dec 2014, at 16:02, Robert Lougher <rob.lougher at gmail.com> wrote:
> The blunt fact is that game
> developers don't like their loops being replaced and they want user
> control. The real conversation I wanted was what form should this
> user control take.

This doesn't really make sense. They're writing code in a high(ish)-level language and passing it to a compiler. The processor doesn't understand C, so the compiler *must* replace it with something. Whether that's a call to a library routine, a scalar loop, a vector loop, or a completely unrolled sequence depends on heuristics in the compiler.

If they want full control of the machine instructions that are generated, then there is a mechanism for doing this: inline assembly.

The complaint can't be that it's not generating the same code, it is that the compiler is generating something with performance characteristics that are difficult to reason about from the input code. That's always a danger when you use an optimising compiler, but it looks like this case is a pretty extreme example.

David
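For readers following the thread, here is a minimal sketch of the kind of substitution under discussion. The function names (`fill_loop`, `fill_libcall`, `fills_agree`, `self_check_fills`) are made up for illustration and are not taken from any customer code or from LLVM. The first function is a plain byte-fill loop of the general shape a loop-idiom recognizer looks for; the second is the library call an optimiser may emit in its place. The two are semantically equivalent, which is why the disagreement in the thread is about performance characteristics and not about correctness.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: a byte-fill loop as a developer might write it. */
void fill_loop(unsigned char *dst, unsigned char value, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = value;
}

/* The library call an optimising compiler may substitute for the loop. */
void fill_libcall(unsigned char *dst, unsigned char value, size_t n) {
    memset(dst, value, n);
}

/* Returns 1 if both versions leave identical bytes behind for a fill of
   n bytes, 0 otherwise (including when n exceeds the scratch buffers). */
int fills_agree(size_t n) {
    unsigned char a[256], b[256];
    if (n > sizeof a)
        return 0;
    /* Pre-fill with a sentinel so writes past n would be detected. */
    memset(a, 0xAA, sizeof a);
    memset(b, 0xAA, sizeof b);
    fill_loop(a, 0x5C, n);
    fill_libcall(b, 0x5C, n);
    return memcmp(a, b, sizeof a) == 0;
}

/* Small self-check over a few sizes, including the empty fill. */
void self_check_fills(void) {
    assert(fills_agree(0));
    assert(fills_agree(1));
    assert(fills_agree(255));
    assert(fills_agree(256));
}
```

Whether a given compiler actually rewrites `fill_loop` into a `memset` call depends on the optimisation level and target; inspecting the generated assembly is the only way to be sure for a particular build.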
Robert Lougher
2014-Dec-05 17:22 UTC
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On 5 December 2014 at 17:09, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> On 5 Dec 2014, at 16:02, Robert Lougher <rob.lougher at gmail.com> wrote:
>
>> The blunt fact is that game
>> developers don't like their loops being replaced and they want user
>> control. The real conversation I wanted was what form should this
>> user control take.
>
> This doesn't really make sense. They're writing code in a high(ish)-level language and passing it to a compiler. The processor doesn't understand C, so the compiler *must* replace it with something. Whether that's a call to a library routine, a scalar loop, a vector loop, or a completely unrolled sequence depends on heuristics in the compiler.

Yes, but why isn't the user allowed any control over what the compiler does?

> If they want full control of the machine instructions that are generated, then there is a mechanism for doing this: inline assembly.

Of course they could do this, but this is like having to make your deli sandwich yourself instead of telling the guy to "hold the mayo".

Rob.

> The complaint can't be that it's not generating the same code, it is that the compiler is generating something with performance characteristics that are difficult to reason about from the input code. That's always a danger when you use an optimising compiler, but it looks like this case is a pretty extreme example.
>
> David
Philip Reames
2014-Dec-05 18:06 UTC
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On 12/05/2014 08:02 AM, Robert Lougher wrote:
> On 5 December 2014 at 06:49, Sean Silva <chisophugis at gmail.com> wrote:
>> On Wed, Dec 3, 2014 at 4:23 AM, Robert Lougher <rob.lougher at gmail.com> wrote:
>>> Hi,
>>>
>>> In feedback from game studios a common issue is the replacement of
>>> loops with calls to memcpy/memset. These loops are often
>>> hand-optimised, and highly-efficient and the developers strongly want
>>> a way to control the compiler (i.e. leave my loop alone).
>>
>> Please provide examples of such "hand-optimised, and highly-efficient"
>> routines and test cases (and execution conditions) that demonstrate a
>> performance improvement.
>
> This sounds like a cop-out, but we can't share customer code (even if
> we could get a small runnable example). But this is all getting
> beside the point. I discussed performance issues to try and justify
> why the user should have control. That was probably a mistake as it
> has subverted the conversation. The blunt fact is that game
> developers don't like their loops being replaced and they want user
> control. The real conversation I wanted was what form should this
> user control take. To be honest, I am surprised at the level of
> resistance to giving users *any* control over their codegen.

If you want to maintain a custom branch of clang with an additional option added, no one would object or care. If you were to submit a patch to add such a flag, it might even be accepted.

So far, the discussion has focused on what the compiler is doing wrong in this case. You have requested a workaround for what is clearly a compiler optimization bug. Before agreeing to support the workaround, considering how hard it would be to fix is clearly the right approach.

Having said all of that, the existing push/pop optimization scopes (a gcc extension) should either already work for what you're trying to do with the workaround or could be relatively easy to adapt. If there's an -OX setting that excludes the optimization you consider problematic, try:

    #pragma GCC optimize("-OX")

You could also try the clang::optnone function attribute.

Philip
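As a rough illustration of the per-function workaround suggested above, the sketch below marks one function with Clang's `optnone` attribute and brackets another with `#pragma clang optimize off` / `on`. Both are Clang features; the `#pragma GCC optimize` form mentioned in the message is a GCC extension and is not shown here. The macro `NO_OPT` and the function names are hypothetical, and everything is guarded on `__clang__` so the file still compiles elsewhere. Note that these switches disable most optimisation for the affected function, not just loop-idiom recognition, so this is a blunt instrument and not the fine-grained control requested in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* NO_OPT is a made-up helper macro for this sketch. On Clang it expands
   to the optnone function attribute; elsewhere it expands to nothing. */
#if defined(__clang__)
#define NO_OPT __attribute__((optnone))
#else
#define NO_OPT
#endif

/* Zeroing loop in a function that Clang is asked not to optimise. */
NO_OPT
void clear_optnone(unsigned char *dst, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = 0;
}

/* Same loop, but using a pragma region instead of an attribute. */
#if defined(__clang__)
#pragma clang optimize off
#endif
void clear_pragma(unsigned char *dst, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = 0;
}
#if defined(__clang__)
#pragma clang optimize on
#endif

/* Returns 1 if the first n bytes at p are all zero, 0 otherwise. */
int all_zero(const unsigned char *p, size_t n) {
    for (size_t i = 0; i < n; ++i)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Checks that both variants still do what the source says. */
void self_check_clears(void) {
    unsigned char a[32], b[32];
    for (size_t i = 0; i < sizeof a; ++i) {
        a[i] = (unsigned char)(i + 1);
        b[i] = (unsigned char)(i + 1);
    }
    clear_optnone(a, sizeof a);
    clear_pragma(b, sizeof b);
    assert(all_zero(a, sizeof a));
    assert(all_zero(b, sizeof b));
}
```

To see whether the loop survives, compare the assembly for these functions with and without the attribute at the optimisation level used for the real build.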
Sean Silva
2014-Dec-07 21:50 UTC
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On Fri, Dec 5, 2014 at 8:02 AM, Robert Lougher <rob.lougher at gmail.com> wrote:
> On 5 December 2014 at 06:49, Sean Silva <chisophugis at gmail.com> wrote:
>> On Wed, Dec 3, 2014 at 4:23 AM, Robert Lougher <rob.lougher at gmail.com> wrote:
>>> Hi,
>>>
>>> In feedback from game studios a common issue is the replacement of
>>> loops with calls to memcpy/memset. These loops are often
>>> hand-optimised, and highly-efficient and the developers strongly want
>>> a way to control the compiler (i.e. leave my loop alone).
>>
>> Please provide examples of such "hand-optimised, and highly-efficient"
>> routines and test cases (and execution conditions) that demonstrate a
>> performance improvement.
>
> This sounds like a cop-out, but we can't share customer code (even if
> we could get a small runnable example).

I doubt a reduced and sanitized example would violate any policy. If even a reduced and sanitized example violates your policy, then you may want to start an internal discussion regarding this policy because it is difficult to collaborate with such a policy. (it's not like you're being asked to post a reduced example of a trade-secret algorithm; it's just a memcpy loop).

> But this is all getting
> beside the point. I discussed performance issues to try and justify
> why the user should have control. That was probably a mistake as it
> has subverted the conversation. The blunt fact is that game
> developers don't like their loops being replaced and they want user
> control.

I'm not convinced. If the compiler produced code that was faster than what they wrote, they would not be complaining.

-- Sean Silva

> The real conversation I wanted was what form should this
> user control take. To be honest, I am surprised at the level of
> resistance to giving users *any* control over their codegen.