Joerg Sonnenberger via llvm-dev
2015-Nov-12 14:30 UTC
[llvm-dev] [RFC] A new intrinsic, `llvm.blackbox`, to explicitly prevent constprop, die, etc optimizations
On Wed, Nov 11, 2015 at 07:14:28PM -0800, Sean Silva via llvm-dev wrote:
> Can you show a real benchmark that users have tried to write where the call
> overhead of actually using an external function call is measurable?

This is the wrong question. The correct question is: what useful benchmark cannot trivially factor out the overhead of the external function call? Yes, in microbenchmarking the overhead can be measurable. But the point is that the overhead should be extremely predictable and stable. As such, it can easily be calibrated and subtracted from the cost of whatever you are really trying to measure. Given that the instrumentation in general has some latency, you won't get around calibration anyway.

Joerg
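[A minimal sketch of the calibration Joerg describes, in Rust since the thread's use case is Rust benchmarking. The `escape` and `measure` helpers are hypothetical stand-ins (not rustc or LLVM API); the point is only that a stable per-call overhead can be measured once with an empty body and subtracted from the real measurement.]

    use std::time::{Duration, Instant};

    // Hypothetical stand-in for an opaque call the optimizer cannot see through
    // (an external function, or the proposed intrinsic).
    #[inline(never)]
    fn escape<T>(x: T) -> T { x }

    fn measure<F: FnMut()>(iters: u32, mut body: F) -> Duration {
        let start = Instant::now();
        for _ in 0..iters {
            body();
        }
        start.elapsed()
    }

    fn main() {
        let iters: u32 = 1_000_000;
        // A runtime-dependent input so the "real" work cannot be constant-folded away.
        let n = std::env::args().len() as u64 * 1_000;

        // Baseline: nothing but the opaque-call overhead we want to factor out.
        let baseline = measure(iters, || { escape(0u64); });
        // Real benchmark: the work of interest plus that same, stable overhead.
        let measured = measure(iters, || { escape((0..n).sum::<u64>()); });

        // The predictable overhead cancels out of the difference.
        let work = measured.checked_sub(baseline).unwrap_or(Duration::ZERO);
        println!("estimated cost per iteration: {:?}", work / iters);
    }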
Richard Diamond via llvm-dev
2015-Nov-16 17:00 UTC
[llvm-dev] [RFC] A new intrinsic, `llvm.blackbox`, to explicitly prevent constprop, die, etc optimizations
Hey all,

I apologize for my delay in replying to you all (two tests last week, with three more coming this week).

I appreciate all of your input. Based on the discussion, I’ve refined the scope and purpose of llvm.blackbox, at least as it pertains to Rust’s desired use case. Previously I left the intrinsic only vaguely specified; based on the resulting comments, I’ve arrived at a more well-defined intrinsic.

Specifically:

- Calls to the intrinsic must not be removed;
- Calls may be duplicated;
- No assumptions about the return value may be made from its argument (including pointer arguments, meaning the returned pointer may alias anything for all alias queries);
- It must be assumed that every byte of the pointer element type of the argument will be read (if the argument is a pointer); and
- These rules must be maintained all the way to machine code emission, at which point the first rule is deliberately violated.

All other optimizations are fair game.

The above is a bit involved, to be sure, and seeing how this intrinsic isn’t critical, I’m fine with leaving it at the worst case (i.e. reads/writes memory plus other side effects) for now.

Why?

Alex summed it up best: “[..] it’s about *simulating the existence* of a “perfectly efficient” external world.” This intrinsic would serve as an aid for benchmarking, ensuring benchmark code is still relevant after optimizations are performed on it, and is an attempt to create a dedicated escape hatch to be used in place of the alternatives listed below.

Alternatives

In no particular order:

- Volatile stores

Not ideal for benchmarking (isn’t guaranteed to cache), but I nonetheless made an attempt to measure the effects on rustc’s set of benchmarks. However, I found an issue with rustc which blocks the measurement: https://github.com/rust-lang/rust/issues/29663.

- Inline asm which “uses” a pointer to the value

Rust’s current solution (sketched below). Needs stack space.

- Inline asm which returns the value

Won’t work for any type bigger than a register; at least not without creating a rustc intrinsic to make the asm operate piecewise on the type’s register-sized component types where needed. And how would rustc know exactly which types are register-sized or smaller? rustc mostly leaves such knowledge to LLVM. Good idea, but the required logistics would make it ugly.

- Mark test::black_box as noinline

Also not ideal because of the mandatory function call overhead.

- External function

Impossible for Rust: generics are monomorphised into the crate in which they are used (i.e. the resulting function in IR is never external to the module using it). Also, Rust doesn’t allow function overloading, so C++-style explicit specialization is out. This approach also suffers from the aforementioned call overhead.

Again, comments are welcome.

Richard Diamond
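[To make the benchmarking motivation concrete, here is a hedged sketch of the kind of Rust benchmark the escape hatch is meant to serve; `test::black_box` is the existing unstable function the proposed intrinsic would back, and the benchmark name and workload are made up. Without the two black_box calls, LLVM is free to precompute the sum or delete the loop body because its result is unused. Requires a nightly toolchain with the unstable test feature.]

    #![feature(test)]
    extern crate test;

    use test::{black_box, Bencher};

    #[bench]
    fn sum_1000(b: &mut Bencher) {
        let v: Vec<u64> = (0..1000).collect();
        b.iter(|| {
            // Hide the input so the sum cannot be precomputed at compile time,
            // and "use" the result so the loop body cannot be removed as dead.
            black_box(black_box(&v).iter().sum::<u64>())
        });
    }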
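[And a sketch of the “inline asm which uses a pointer to the value” alternative above, Rust’s current approach. This is not the rustc source, just an illustration written in today’s asm! syntax rather than the 2015 one: an empty asm block takes the value’s address as an input, so the optimizer must assume the pointed-to bytes are read and keep the value materialized.]

    use std::arch::asm;

    pub fn black_box_via_asm<T>(dummy: T) -> T {
        unsafe {
            // Take the value's address (forcing it into a stack slot) and feed it
            // to an empty asm block the optimizer must assume actually reads it.
            let ptr: *const T = &dummy;
            asm!("/* {0} */", in(reg) ptr, options(nostack, readonly));
        }
        dummy
    }

[The stack slot for `dummy` is exactly the cost Richard notes above, and it is what the proposed intrinsic would avoid.]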
James Molloy via llvm-dev
2015-Nov-16 18:03 UTC
[llvm-dev] [RFC] A new intrinsic, `llvm.blackbox`, to explicitly prevent constprop, die, etc optimizations
Hi Richard,

You don't appear to have addressed my suggestion not to require a perfect external world, but instead to measure the overhead of an imperfect one (by using an empty benchmark) and subtract that from the measured benchmark score.

Besides which, absolute benchmark results are more often than not totally useless - the really important part of benchmarking is relative differences. Certainly in my experience I've never needed to care about absolute numbers, and I wonder why you do.

Cheers,
James

On Mon, 16 Nov 2015 at 17:00, Richard Diamond via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [Richard's full proposal, quoted above; snipped]
Daniel Berlin via llvm-dev
2015-Nov-17 04:57 UTC
[llvm-dev] [RFC] A new intrinsic, `llvm.blackbox`, to explicitly prevent constprop, die, etc optimizations
"Not ideal for benchmarking (isn’t guaranteed to cache)," Could you clarify what you mean by this? On Mon, Nov 16, 2015 at 9:00 AM, Richard Diamond via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hey all, > > I apologize for my delay with my reply to you all (two tests last week, > with three more coming this week). > > I appreciate all of your inputs. Based on the discussion, I’ve refined the > scope and purpose of llvm.blackbox, at least as it pertains to Rust’s > desired use case. Previously, I left the intrinsic only vaguely specified, > and based on the resulting comments, I’ve arrived at a more well defined > intrinsic. > > Specifically: > > - Calls to the intrinsic must not be removed; > - Calls may be duplicated; > - No assumptions about the return value may be made from its argument > (including pointer arguments, meaning the returned pointer is a may alias > for all queries); > - It must be assumed that every byte of the pointer element type of > the argument will be read (if the argument is a pointer); and > - These rules must be maintained all the way to machine code emission, > at which point the first rule must be violated. > > All other optimizations are fair game. > > The above is a bit involved to be sure, and seeing how this intrinsic > isn’t critical, I’m fine with leaving it at the worse case (ie read/write > mem + other side effects) for now. > Why? > > Alex summed it up best: “[..] it’s about *simulating the existence* of a > “perfectly efficient” external world.” This intrinsic would serve as an aid > for benchmarking, ensuring benchmark code is still relevant after > optimizations are performed on it, and is an attempt to create a dedicated > escape hatch to be used in place of the alternatives I’ve listed below. > Alternatives > > In no particular order: > > - Volatile stores > > Not ideal for benchmarking (isn’t guaranteed to cache), nonetheless I made > an attempt to measure the effects on Rustc’s set of benchmarks. However, I > found an issue with rustc which blocks attempts to measure the effect: > https://github.com/rust-lang/rust/issues/29663. > > - Inline asm which “uses” a pointer to the value > > Rust’s current solution. Needs stack space. > > - Inline asm which returns the value > > Won’t work for any type which is bigger than a register; at least not > without creating a rustc intrinsic anyway to make the asm operate > piecewise on the register native component types of the type if need be. > And how is rustc to know exactly which are the register sized or smaller > types? rustc mostly leaves such knowledge to LLVM. > > Good idea, but the needed logistics would make it ugly. > > - Mark test::black_box as noinline > > Also not ideal because of the mandatory function call overhead. > > - External function > > Impossible for Rust; generics are monomorphised into the crate in which it > is used (ie the resulting function in IR won’t ever be external to the > module using it), thus making this an impossible solution for Rust. Also, > Rust doesn’t allow function overloading, so C++ style explicit > specialization is also out. Also suffers from the same aforementioned call > overhead. > > Again, comments are welcome. > Richard Diamond > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... 
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151116/81b89ada/attachment.html>