Jonas Wagner via llvm-dev
2016-Jan-19 17:40 UTC
[llvm-dev] Adding support for self-modifying branches to LLVM?
Hi,

I’m thinking about using LLVM to implement a limited form of self-modifying code. Before diving into that, I’d like to get some feedback from you all.

*The goal:* I’d like to add “optional” code to a program that I can enable at runtime and that has zero (i.e., as close to zero as I can get) overhead when not enabled.

*Existing solutions:* Currently, I can guard optional code using a branch, something like br i1 %cond, label %optional, label %skip, !prof !0. Branch weights ensure that the branch is predicted correctly. The overhead of this is not as low as I’d like, though, because the branch is still present in the code and because computing %cond also has some cost.

*The idea:* I’d like to have a branch that is the same as the example above, but that gets translated into a nop instruction. Preferably some unique nop that I can easily recognize in the binary, and that has the same size as an unconditional branch instruction. Then, I could use a framework such as DynInst to replace that nop with an unconditional branch instruction at run-time.

My questions to the community would be:

- Does the idea make sense, or am I missing a much simpler approach?
- What would be the easiest way to obtain the desired binary? Adding a new TerminatorInstruction sounds daunting; is there something simpler?

I also wonder whether I could even expect speedups from this. Are nop instructions actually cheaper than branches? Would modifying the binary at run-time play well enough with caches etc.? These are probably not questions for the LLVM mailing list, but if anybody has good answers they are welcome.

Looking forward to hearing your thoughts,
Jonas
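To make the nop-to-branch rewrite above concrete, here is a minimal offline sketch in Python. It assumes x86-64, where `jmp rel32` is opcode 0xE9 followed by a signed 32-bit little-endian displacement measured from the end of the instruction, and where `0F 1F 44 00 00` is a 5-byte nop of the same size. The function names are hypothetical, the "binary" is just a bytearray, and nothing here uses the DynInst API or deals with page protections, instruction-cache effects, or threads executing the code while it is patched.

```python
import struct

# 5-byte x86 nop (nopl 0x0(%rax,%rax,1)); same size as `jmp rel32`.
NOP5 = bytes.fromhex("0f1f440000")
JMP_REL32 = 0xE9  # opcode of the 5-byte near jump


def encode_jmp_rel32(site: int, target: int) -> bytes:
    """Encode `jmp target` placed at offset `site` (hypothetical helper).

    The displacement is relative to the end of the 5-byte instruction.
    """
    disp = target - (site + 5)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target out of range for rel32")
    return bytes([JMP_REL32]) + struct.pack("<i", disp)


def enable_site(code: bytearray, site: int, target: int) -> None:
    """Replace the nop at `site` with a jump to the optional code."""
    if bytes(code[site:site + 5]) != NOP5:
        raise ValueError("no patchable nop at this offset")
    code[site:site + 5] = encode_jmp_rel32(site, target)


def disable_site(code: bytearray, site: int) -> None:
    """Restore the nop so the optional code is skipped again."""
    if code[site] != JMP_REL32:
        raise ValueError("site is not currently a rel32 jump")
    code[site:site + 5] = NOP5


if __name__ == "__main__":
    # One patch site at offset 0, followed by a `ret` (0xC3) at offset 5.
    code = bytearray(NOP5 + b"\xc3")
    enable_site(code, 0, 5)
    print(code.hex())
    disable_site(code, 0)
    print(code.hex())
```

Because both encodings are exactly five bytes, the patch never shifts the surrounding code, which is why the original asks for a nop of the same size as the branch.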
Hal Finkel via llvm-dev
2016-Jan-20 04:21 UTC
[llvm-dev] Adding support for self-modifying branches to LLVM?
----- Original Message -----
> From: "Jonas Wagner via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Tuesday, January 19, 2016 11:40:16 AM
> Subject: [llvm-dev] Adding support for self-modifying branches to LLVM?
>
> My questions to the community would be:
>
> * Does the idea make sense, or am I missing a much simpler approach?
> * What would be the easiest way to obtain the desired binary? Adding a new TerminatorInstruction sounds daunting; is there something simpler?
>
> I also wonder whether I could even expect speedups from this. Are nop instructions actually cheaper than branches? Would modifying the binary at run-time play well enough with caches etc.? These are probably not questions for the LLVM mailing list, but if anybody has good answers they are welcome.

If you've not already, you'll want to look at this: http://llvm.org/docs/StackMaps.html (it does not quite do what you want, but it should give you some idea on how you might proceed).

 -Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
Sean Silva via llvm-dev
2016-Jan-20 05:04 UTC
[llvm-dev] Adding support for self-modifying branches to LLVM?
On Tue, Jan 19, 2016 at 9:40 AM, Jonas Wagner via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> *Existing solutions:* Currently, I can guard optional code using a branch, something like br i1 %cond, label %optional, label %skip, !prof !0. Branch weights ensure that the branch is predicted correctly. The overhead of this is not as low as I’d like,

How low would you like it? What use case is suffering for performance due to the branch?

Self-modifying code for truly zero-overhead (when not enabled) instrumentation is a real thing (look at e.g. the DTrace pid provider), but unless the number of instrumentation points is very large (100's of thousands? millions?) or not known beforehand (both are true for DTrace), the cost of a branch will be negligible.

AFAIK, the cost of a well-predicted, not-taken branch is the same as a nop on every x86 made in the last many years. See http://www.agner.org/optimize/instruction_tables.pdf

Generally speaking, a correctly-predicted not-taken branch is basically identical to a nop, and a correctly-predicted taken branch has an extra overhead similar to an "add" or other extremely cheap operation.

More concerning is that the condition that is branched on is probably some flag in memory somewhere and will require a memory operation to check it (but of course on a good out-of-order core with speculative execution this doesn't hold up anything but the retire queue).

-- Sean Silva

> though, because the branch is still present in the code and because computing %cond also has some cost.
Jonas Wagner via llvm-dev
2016-Jan-20 21:49 UTC
[llvm-dev] Adding support for self-modifying branches to LLVM?
Thanks for the information. This has been very useful! Patch points indeed *almost* do what I need. I will try to build a similar solution.

> Self-modifying code for truly zero-overhead (when not enabled) instrumentation is a real thing (look at e.g. the DTrace pid provider), but unless the number of instrumentation points is very large (100's of thousands? millions?) or not known beforehand (both are true for DTrace), the cost of a branch will be negligible.

In the use case that I have in mind, there are indeed a large number of instrumentation points. To give a concrete example of what I'd like to achieve, consider Clang's -fsanitize flag. These sanitizers add thousands of little independent bits of code to the program, e.g., memory access checks. This code is very useful, but it also slows the program down. I'd like to transform this code into zero-overhead instrumentation that I can enable selectively.

> AFAIK, the cost of a well-predicted, not-taken branch is the same as a nop on every x86 made in the last many years.

I'm still not 100% sure whether the `nop <-> br` conversion is the best approach. I've considered some alternatives:

- Branches where the condition is a flag in memory. My early experiments were too slow :(
- Conditional branches along with some other way to control the flags register. I'm afraid of the side effects this might have.

For now, transforming an `llvm.experimental.patchpoint` into an unconditional branch looks most promising. I just need to figure out how to trick LLVM into laying out basic blocks the right way (and not eliminating those that are unreachable due to not-yet-transformed branches).

Cheers,
Jonas
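As a rough illustration of enabling instrumentation selectively, the Python sketch below keeps a side table of patch sites and toggles only those of one kind. The `PatchSite` record, its `kind` labels, and `set_enabled` are hypothetical names for this example; a real tool would more likely derive site offsets from something like the stack map section that patch points emit, which is not parsed here. It assumes x86-64 with 5 reserved bytes per site, and it models the code as a bytearray rather than live process memory.

```python
import struct
from dataclasses import dataclass

# 5-byte x86 nop reserved at each site while the check is disabled.
NOP5 = bytes.fromhex("0f1f440000")


@dataclass(frozen=True)
class PatchSite:
    kind: str    # hypothetical label, e.g. which sanitizer owns the check
    offset: int  # offset of the 5 reserved bytes in the code buffer
    target: int  # offset of the optional block to jump to


def jmp_rel32(site: int, target: int) -> bytes:
    """x86 `jmp rel32`: 0xE9 plus displacement from the instruction's end."""
    return b"\xe9" + struct.pack("<i", target - (site + 5))


def set_enabled(code: bytearray, sites, kind: str, enabled: bool) -> int:
    """Toggle every site of `kind`; return how many sites were rewritten."""
    changed = 0
    for s in sites:
        if s.kind != kind:
            continue
        new = jmp_rel32(s.offset, s.target) if enabled else NOP5
        if bytes(code[s.offset:s.offset + 5]) != new:
            code[s.offset:s.offset + 5] = new
            changed += 1
    return changed


if __name__ == "__main__":
    # Two sites (offsets 0 and 8) with placeholder bytes around them.
    code = bytearray(NOP5 + b"\x90" * 3 + NOP5 + b"\xc3" * 8)
    sites = [PatchSite("asan", 0, 13), PatchSite("ubsan", 8, 17)]
    print(set_enabled(code, sites, "asan", True), code.hex())
```

Keeping an explicit table avoids having to recognize the nop pattern by scanning the binary, where the same five bytes could in principle also occur inside unrelated instructions or data.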
Philip Reames via llvm-dev
2016-Jan-21 21:33 UTC
[llvm-dev] Adding support for self-modifying branches to LLVM?
On 01/19/2016 09:04 PM, Sean Silva via llvm-dev wrote:

> AFAIK, the cost of a well-predicted, not-taken branch is the same as a nop on every x86 made in the last many years.
> See http://www.agner.org/optimize/instruction_tables.pdf
> Generally speaking, a correctly-predicted not-taken branch is basically identical to a nop, and a correctly-predicted taken branch has an extra overhead similar to an "add" or other extremely cheap operation.

Specifically on this point only: While absolutely true for most micro-benchmarks, this is less true at large scale. I've definitely seen removing a highly predictable branch (in many, many places, some of which are hot) improve performance by 5-10%. For instance, removing highly predictable branches is the primary motivation of implicit null checking (http://llvm.org/docs/FaultMaps.html). Where exactly the performance improvement comes from is hard to say, but, empirically, it does matter.

(Caveat to the above: I have not run an experiment that actually put in the same number of bytes in nops. It's possible the entire benefit I mentioned is code size related, but I doubt it given how many ticks a sample profiler will show on said branches.)

p.s. Sean mentions down-thread that most of the slowdown from checks is in the effect on the optimizer, not the direct impact of the instructions emitted. This is absolutely our experience as well. I don't intend for anything I said above to imply otherwise.

Philip