thr3ads.net - llvm dev - [llvm-dev] Adding support for self-modifying branches to LLVM? [Feb 2016]

Jonas Wagner via llvm-dev

2016-Feb-09 14:57 UTC

[llvm-dev] Adding support for self-modifying branches to LLVM?

Hi,

I'm coming back to this old thread with data about the performance of NOPs.
Recall that I was considering transforming NOP instructions into
branches and back, in order to dynamically enable code. One use case for
this was enabling/disabling individual sanitizer checks (ASan, UBSan) on
demand.
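
To make the NOP-to-branch idea concrete, here is a minimal sketch of the
byte-level patch. It assumes x86-64 encodings (the thread does not name a
target), and the helper names encode_jmp_rel32 and patch are hypothetical.
A 5-byte NOP leaves exactly enough room for a "jmp rel32", whose
displacement is measured from the end of the jump instruction.

```python
import struct

# Assumption: x86-64 encodings; the thread does not name a target.
NOP5 = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])  # 5-byte NOP
JMP_REL32_LEN = 5                             # opcode 0xE9 + 4-byte displacement


def encode_jmp_rel32(site_addr, target_addr):
    """Hypothetical helper: bytes of a `jmp rel32` located at site_addr."""
    # The displacement is relative to the address of the next instruction.
    disp = target_addr - (site_addr + JMP_REL32_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target out of rel32 range")
    return b"\xE9" + struct.pack("<i", disp)


def patch(code, offset, enable, base_addr, target_addr):
    """Return a copy of `code` with the 5-byte site at `offset` switched.

    enable=True writes a jump to target_addr, enable=False restores the NOP.
    """
    if enable:
        new = encode_jmp_rel32(base_addr + offset, target_addr)
    else:
        new = NOP5
    patched = bytearray(code)
    patched[offset:offset + JMP_REL32_LEN] = new
    return bytes(patched)
```

This only covers the byte arithmetic. Patching a running program would
also need writable code pages and some care around threads that may be
executing the patched site, neither of which is modeled here.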

I wrote a pass which takes an ASan-instrumented program, and replaces each
ASan check with an llvm.experimental.patchpoint intrinsic. This intrinsic
inserts a NOP of configurable size. It has otherwise no effect on the
program semantics. It does prevent some optimizations, presumably because
instructions cannot be moved across the patchpoint.

Some results:
- On SPEC, patchpoints introduce an overhead of ~25% compared to a version
where ASan checks are removed.
- This is almost half of the cost of the checks themselves.
- The results are similar for NOPs of size 1 and 5 bytes.
- Interestingly, the results are similar for NOPs of 0 bytes, too. These
are patchpoints that don't insert any code and only inhibit optimizations.
I've only tested this on one benchmark, though.

To summarize, only part of the cost of NOPs is due to executing them. Their
effect on optimizations is significant, too. I guess this would hold for
branches and sanitizer checks as well.

Best,
Jonas


On Thu, Jan 21, 2016 at 11:52 PM Jonas Wagner <jonas.wagner at epfl.ch>
wrote:
> Hello,
>
>>> There is some data on this, e.g., in “High System-Code Security with
>>> Low Overhead” <http://dslab.epfl.ch/proj/asap/#publications>. In this
>>> work we found that, for ASan as well as other instrumentation tools,
>>> most overhead comes from the checks. Especially for CPU-intensive
>>> applications, the cost of maintaining shadow memory is small.
>>
>> How did you measure this? If it was measured by removing the checks
>> before optimization happens, then what you may have been measuring is
>> not the execution overhead of the branches (which is what would be
>> eliminated by nop’ing them out) but the effect on the optimizer.
>
> Interesting. Indeed this was measured by removing some checks and then
> re-optimizing the program.
>
> I’m aware of some impact checks may have on optimization. For example,
> I’ve seen cases where much less inlining happens because functions with
> checks are larger. Do you know other concrete examples? This is definitely
> something I’ll have to be careful about. Philip Reames confirms this, too.
>
> On the other hand, we’ve also found that the benefit from removing a check
> is roughly proportional to the number of cycles spent executing that
> check’s instructions. Our model of this is not very precise, but it shows
> that the cost of executing the check’s instructions matters.
>
> I'll try to measure this, and will come back when I have data.
>
> Best,
> Jonas

Philip Reames via llvm-dev

2016-Feb-09 16:07 UTC

[llvm-dev] Adding support for self-modifying branches to LLVM?

On 02/09/2016 06:57 AM, Jonas Wagner wrote:
> Hi,
>
> I'm coming back to this old thread with data about the performance of 
> NOPs. Recalling that I was considering transforming NOP instructions 
> into branches and back, in order to dynamically enable code. One use 
> case for this was enabling/disabling individual sanitizer checks 
> (ASan, UBSan) on demand.
>
> I wrote a pass which takes an ASan-instrumented program, and replaces 
> each ASan check with an llvm.experimental.patchpoint intrinsic. This 
> intrinsic inserts a NOP of configurable size. It has otherwise no 
> effect on the program semantics. It does prevent some optimizations, 
> presumably because instructions cannot be moved across the patchpoint.
>
> Some results:
> - On SPEC, patchpoints introduce an overhead of ~25% compared to a 
> version where ASan checks are removed.
> - This is almost half of the cost of the checks themselves.
> - The results are similar for NOPs of size 1 and 5 bytes.
> - Interestingly, the results are similar for NOPs of 0 bytes, too. 
> These are patchpoints that don't insert any code and only inhibit 
> optimizations. I've only tested this on one benchmark, though.
>
> To summarize, only part of the cost of NOPs is due to executing them. 
> Their effect on optimizations is significant, too. I guess this would 
> hold for branches and sanitizer checks as well.

I don't think you can really draw strong conclusions from the 
experiments you described.  What you've ended up measuring is nearly the 
impact of not optimizing over patchpoints at the check locations.  This 
doesn't really tell you much about what a check (which is likely to 
inhibit optimization much less) costs over a nop at the same position.

One bit of data you could extract from the experiment as constructed 
would be the relative cost of extra nops.  You do mention that the 
results are similar for sizes 1-5 bytes, but similar is very vague in 
this context.  Are the results statistically indistinguishable? Or is 
there a noticeable but small slowdown that results?  (Numbers would be 
great here.)
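
One way to answer the "statistically indistinguishable?" question is a
permutation test on repeated timings of the two builds. The sketch below
is only an illustration of that idea, not something used in this thread:
the function name permutation_p_value and its parameters are made up
here, and the inputs would be lists of measured run times for, say, the
1-byte and 5-byte NOP builds.

```python
import random
import statistics


def permutation_p_value(a, b, rounds=10000, seed=0):
    """Two-sided permutation test on the difference of sample means.

    a, b: run times (e.g. seconds) from repeated runs of two builds.
    Returns an estimated p-value; a large value means the data give no
    evidence that the two builds differ.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        # Randomly reassign the measurements to the two groups.
        rng.shuffle(pooled)
        left = pooled[:len(a)]
        right = pooled[len(a):]
        diff = abs(statistics.mean(left) - statistics.mean(right))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)
```

A permutation test makes few assumptions about how the timings are
distributed, which suits benchmark data; any standard two-sample test
could be substituted.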

Sean Silva via llvm-dev

2016-Feb-09 22:22 UTC

[llvm-dev] Adding support for self-modifying branches to LLVM?

On Tue, Feb 9, 2016 at 8:07 AM, Philip Reames <listmail at philipreames.com>
wrote:
> I don't think you can really draw strong conclusions from the experiments
> you described.  What you've ended up measuring is nearly the impact of not
> optimizing over patchpoints at the check locations.  This doesn't really
> tell you much about what a check (which is likely to inhibit optimization
> much less) costs over a nop at the same position.
>
> One bit of data you could extract from the experiment as constructed would
> be the relative cost of extra nops.  You do mention that the results are
> similar for sizes 1-5 bytes, but similar is very vague in this context.
> Are the results statistically indistinguishable?  Or is there a noticeable
> but small slowdown that results?  (Numbers would be great here.)
>
In this same vein, try inserting 1, 2, 3, 4, 5, 6, ... nops and measure the
performance impact (the total size of nops is also interesting but is more
difficult to measure reliably). I've used this kind of technique
successfully in the past for e.g. measuring the cost of "stat" syscalls on
Windows. I call the technique "stuffing". Basically, make a plot of the
performance degradation as you insert more and more redundant stuff (e.g. 1
nop, 2 nops, 3 nops, etc.). If the result is a strong linear trend, then
you can pretty confidently extrapolate backward to the "0 nop" case to see
the overhead of inserting 1 nop.
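
The "stuffing" extrapolation can be sketched as a least-squares line fit
over (number of nops, run time) pairs. This is a rough illustration under
my own assumptions: the function names fit_line and r_squared are
hypothetical, and the measurements would come from your own benchmark
runs. The slope estimates the cost of each additional nop, the intercept
is the extrapolated "0 nop" run time, and R^2 indicates how strong the
linear trend is.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination; close to 1.0 means a strong linear trend."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

If R^2 is low, the trend is not linear and the extrapolated intercept
should not be trusted.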

-- Sean Silva

