Our application is 32-bit big-endian ARM and we use -O3 with LTO. Clang turns certain zero-initializations of structures into calls to memset, and these calls are not lowered further to move instructions. Judging from perf reports, it may be beneficial to disable the optimization that introduces a function call to memset in certain hot paths.

I tried passing -fno-builtin, but that doesn't seem to help in my case; the code doesn't compile with -ffreestanding. Any suggestions on what I could try to avoid the calls to memset? It is possible to reorganize the code to avoid this, but I am looking for a more general solution.

I find that GCC has an option -fno-tree-loop-distribute-patterns that can be used to disable memcpy/memset synthesis. I wonder if there is something similar in llvm/clang.

Thanks,
Bharathi
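For readers following the thread, here is a minimal sketch of the kind of source the question is about. The structure and function names are hypothetical (the original post shows no code), and whether a given compiler emits a memset call for either function depends on the target, the object size, and the optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure, not taken from the original post. It is large
   enough that a compiler may prefer a library call over inline stores. */
struct packet_state {
    unsigned int  flags;
    unsigned int  counters[16];
    unsigned char payload[64];
};

/* Whole-object zero initialization. The compiler is free to implement
   this assignment as a call to memset. */
void reset_by_assignment(struct packet_state *s)
{
    assert(s != NULL);
    *s = (struct packet_state){0};
}

/* Explicit zeroing loop. Loop idiom recognition may rewrite a loop of
   this shape into a memset call as well. */
void reset_counters(struct packet_state *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < 16; ++i)
        s->counters[i] = 0;
}
```

Compiling such a file to assembly and searching the output for a memset call is one way to check which of the two forms is affected on a particular target.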
On 15 August 2017 at 19:38, bharathi seshadri via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I find that GCC has an option -fno-tree-loop-distribute-patterns that
> can be used to disable memcpy/memset synthesis. I wonder if there is
> something similar in llvm/clang.

I have no idea what that means, but we almost certainly don't have any option with similar semantics. Clang does not provide options to control specific optimization passes like that.

The best advice is to file a bug report about the situation you're seeing where a call to memset is bad for performance. There's clearly something going wrong with Clang's heuristics and the best solution is to fix that.

Cheers.

Tim.
On Tue, Aug 15, 2017 at 09:37:58PM -0700, Tim Northover via llvm-dev wrote:
> The best advice is to file a bug report about the situation you're
> seeing where a call to memset is bad for performance. There's clearly
> something going wrong with Clang's heuristics and the best solution is
> to fix that.

Do you mean clang or the backend? The discussion elsewhere about disabling memcpy intrinsic forming already makes me suspect that some targets don't handle those intrinsics well enough.

Joerg
Reid Kleckner via llvm-dev
2017-Aug-16 17:38 UTC
[llvm-dev] [cfe-dev] Disable memset synthesis
On Tue, Aug 15, 2017 at 9:37 PM, Tim Northover via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> On 15 August 2017 at 19:38, bharathi seshadri via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > I find that GCC has an option -fno-tree-loop-distribute-patterns that
> > can be used to disable memcpy/memset synthesis. I wonder if there is
> > something similar in llvm/clang.
>
> I have no idea what that means, but we almost certainly don't have any
> option with similar semantics. Clang does not provide options to
> control specific optimization passes like that.

I think Sony exposes an option to disable idiom recognition in the PS4 compiler.

This seems like one of those areas where users keep asking for something and we keep insisting that what they think they want isn't actually what they want, i.e. disabling idiom recognition blocks mid-level canonicalization and that leads to missing optimizations and bad performance, etc.

However, the user feedback has been persistent, and in the interests of not having to hear about it again, we might want to consider giving users the rope they need to hang themselves. It would let them work around real performance problems today rather than waiting for the next version of the compiler that will lower memset/memcpy/memcmp better.
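As an illustration of a source-level workaround available today, the sketch below zeroes a buffer through a volatile-qualified pointer. Volatile stores have to be performed as written, so the loop is not a candidate for being replaced by a memset call. The names hot_state, zero_words_volatile and hot_state_reset are hypothetical and do not come from the thread. This is not equivalent to a compiler option: it also blocks other optimizations on those stores, so it only makes sense in a few measured hot paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hot-path structure made of 32-bit words only. */
struct hot_state {
    uint32_t words[8];
};

/* Zero `count` 32-bit words starting at `dst`. Each store goes through a
   volatile lvalue, so the compiler must emit it individually rather
   than merging the loop into a library call. */
static inline void zero_words_volatile(uint32_t *dst, size_t count)
{
    volatile uint32_t *p = dst;
    for (size_t i = 0; i < count; ++i)
        p[i] = 0;
}

/* Reset the structure without giving the optimizer a memset idiom. */
void hot_state_reset(struct hot_state *s)
{
    assert(s != NULL);
    zero_words_volatile(s->words, sizeof s->words / sizeof s->words[0]);
}
```

The trade-off matches the concern raised above: the stores stay inline, but the compiler can no longer combine, widen, or eliminate them.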