On May 15, 2013, at 8:44 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Wed, May 15, 2013 at 8:28 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Wed, May 15, 2013 at 7:49 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Wed, May 15, 2013 at 8:31 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> Hi,
>
> LLVM classifies _Znwm as a builtin by default. After some discussion, the C++ core working group have decreed that that is not correct: calls to "operator new" *can* be optimized, but only if they come from new-expressions, and not if they come from explicit calls to ::operator new. We cannot work around this in the frontend by marking the call as 'nobuiltin' for two reasons:
>
> 1) The 'nobuiltin' attribute doesn't actually prevent the optimization (see recent patch on llvmcommits)
> 2) We can't block the optimization if the call happens through a function pointer, unless we also annotate all calls through function pointers as 'nobuiltin'
>
> How feasible would it be to make the 'builtin-ness' of _Znwm etc be opt-in rather than opt-out? Is there some other option we could pursue?
>
> Wow, this was spectacularly unclear, sorry about that. To avoid confusion, I'm suggesting that we add a 'builtin' attribute, and do not treat a call to _Znwm as a builtin call unless it has the attribute.

It's not clear to me that "builtin" is the right way to model this, but it definitely sounds like this should be an attribute on a call site (as opposed to on the function itself). What specific kinds of optimizations are we interested in doing to _Znwm calls?

-Chris
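For readers following along, the three call forms under discussion can be sketched at the source level as below. This is only an illustration: the function names and the AllocFn alias are hypothetical, and _Znwm is the Itanium-ABI mangled name of operator new(unsigned long), so it applies to targets where size_t is unsigned long.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Form 1: a new-expression. The compiler emits the call to operator new
// itself; per the thread, this is the form that may be optimized.
int *via_new_expression() {
    return new int(42);
}

// Form 2: an explicit call to the allocation function. Per the thread,
// this call must not be treated as an optimizable builtin.
void *via_explicit_call() {
    return ::operator new(sizeof(int));
}

// Form 3: the same allocation function reached through a function pointer.
// At the call site the callee is not syntactically known, which is why a
// per-call 'nobuiltin' annotation is awkward here.
using AllocFn = void *(*)(std::size_t);  // hypothetical alias for illustration

void *via_function_pointer() {
    AllocFn alloc = &::operator new;  // overload chosen by the target type
    return alloc(sizeof(int));
}
```

All three forms end up calling the same symbol; only the first originates from a new-expression, which is the distinction an opt-in attribute on the call site would need to record.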
On Wed, May 15, 2013 at 8:46 PM, Chris Lattner <clattner at apple.com> wrote:
> It's not clear to me that "builtin" is the right way to model this, but it
> definitely sounds like this should be an attribute on a call site (as
> opposed to on the function itself). What specific kinds of optimizations
> are we interested in doing to _Znwm calls?

Initially, I'm just concerned about keeping the optimizations we already perform, such as globalopt lowering a new/delete pair into a global, while disabling the non-conforming variations of those optimizations. But we're also permitted to merge multiple allocations into one if they have sufficiently similar lifetimes.
On May 15, 2013, at 8:50 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>> It's not clear to me that "builtin" is the right way to model this, but it definitely sounds like this should be an attribute on a call site (as opposed to on the function itself). What specific kinds of optimizations are we interested in doing to _Znwm calls?
>
> Initially, I'm just concerned about keeping the optimizations we already perform, such as globalopt lowering a new/delete pair into a global, while disabling the non-conforming variations of those optimizations. But we're also permitted to merge multiple allocations into one if they have sufficiently similar lifetimes.

So your proposal is for Clang to slap the attribute on explicit calls to ::operator new, but any other use of the symbol (e.g. from C code or something else weird) can be optimized? If so, using nobuiltin seems perfectly fine to me. It seems like there is no auto-upgrade required.

-Chris
On Wed, May 15, 2013 at 9:46 PM, Chris Lattner <clattner at apple.com> wrote:
> It's not clear to me that "builtin" is the right way to model this, but it
> definitely sounds like this should be an attribute on a call site (as
> opposed to on the function itself). What specific kinds of optimizations
> are we interested in doing to _Znwm calls?

You can see the paper to the C++ committee, but my primary goals are:

1) run SROA over heap memory
2) pool together heap allocations on the same CFG trace
3) promote (sufficiently small and lifetime bounded) heap allocations to stack allocations
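As a rough source-level analogy for goals 1 and 3 (not a description of any actual LLVM pass), the pair of functions below shows code as written and what an optimizer would effectively be allowed to turn it into. The Pair type and both function names are hypothetical.

```cpp
#include <cassert>

struct Pair {
    int a;
    int b;
};

// As written: a small object created by a new-expression whose lifetime
// ends before the function returns.
int sum_with_heap() {
    Pair *p = new Pair{3, 4};
    int total = p->a + p->b;
    delete p;
    return total;
}

// Roughly what goal 3 (heap-to-stack promotion) followed by goal 1 (SROA)
// would permit: the allocation is gone and the fields are plain scalars.
int sum_with_scalars() {
    int a = 3;
    int b = 4;
    return a + b;
}
```

The two functions compute the same result; the difference is that the first calls the allocation function and the second does not, which only matters if those calls are considered observable.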
On May 15, 2013, at 8:59 PM, Chandler Carruth <chandlerc at google.com> wrote:
> You can see the paper to the C++ committee, but my primary goals are:
>
> 1) run SROA over heap memory
> 2) pool together heap allocations on the same CFG trace
> 3) promote (sufficiently small and lifetime bounded) heap allocations to stack allocations

Ok, presumably also 4) know it returns non-aliased memory, and maybe other stuff in the future. Are you "happy" to special case operator new and new[], or should we design something more extravagant to handle other cases?

-Chris
On May 15, 2013, at 11:59 PM, Chandler Carruth <chandlerc at google.com> wrote:
> You can see the paper to the C++ committee, but my primary goals are:
>
> 1) run SROA over heap memory
> 2) pool together heap allocations on the same CFG trace
> 3) promote (sufficiently small and lifetime bounded) heap allocations to stack allocations

This appears to me to be an observable change (with custom new/delete). Will this optimization be limited to -std=c++1y?

Howard
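The observability concern can be made concrete with a replaced global operator new that counts its calls. The sketch below is illustrative only: g_new_calls, g_sink and the two helper functions are hypothetical names, and whether the new-expression case reports 0 or 1 depends on the compiler, optimization level and language mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static std::size_t g_new_calls = 0;  // hypothetical counter for illustration

// Replacement global allocation function that records every call.
void *operator new(std::size_t size) {
    ++g_new_calls;
    if (void *p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

void operator delete(void *p) noexcept { std::free(p); }
void operator delete(void *p, std::size_t) noexcept { std::free(p); }

// Allocation from a new-expression: if the compiler elides the pair,
// the counter never moves and this returns 0 instead of 1.
std::size_t calls_made_by_new_expression() {
    std::size_t before = g_new_calls;
    delete new int(7);
    return g_new_calls - before;
}

void *volatile g_sink = nullptr;  // lets the pointer escape

// Explicit call to the allocation function: per the thread, this call
// must actually happen, so the counter is expected to advance.
std::size_t calls_made_by_explicit_call() {
    std::size_t before = g_new_calls;
    void *p = ::operator new(sizeof(int));
    g_sink = p;
    std::size_t made = g_new_calls - before;
    ::operator delete(p);
    return made;
}
```

A program that inspects the counter can therefore tell whether the new-expression allocation was elided, which is the sense in which the change is observable with a custom new/delete.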