On 04/20/2016 02:58 PM, Renato Golin via llvm-dev wrote:
> Hi Derek,
>
> I'm not an expert in any of these topics, but I'm excited that you
> guys are doing it. It seems like a missing piece that needs to be
> filled.
>
> Some comments inline...
>
> On 17 April 2016 at 22:46, Derek Bruening via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> We would prefer to trade off accuracy and build a less-accurate tool
>> below our overhead ceiling than to build a high-accuracy but slow tool.
>
> I agree with this strategy.
>
> As a first approach, make it as fast as you can, then later introduce
> more probes, maybe via some slider flag (like -ON) to consciously
> trade speed for accuracy.
>
>> Studying instruction cache behavior with compiler instrumentation
>> can be challenging, however, so we plan to at least initially focus
>> on data performance.
>
> I'm interested in how you're going to do this without kernel profiling
> probes, like perf.
>
> Or is the point here introducing syscalls in the right places instead
> of randomly profiling? Wouldn't that bias your results?
>
>> Many of our planned tools target specific performance issues with data
>> accesses. They employ the technique of *shadow memory* to store metadata
>> about application data references, using the compiler to instrument loads
>> and stores with code to update the shadow memory.
>
> Is it just counting the number of reads/writes? Or are you going to
> add how many of those accesses were hit by a cache miss?
>
>> *Cache fragmentation*: this tool gathers data structure field hotness
>> information, looking for data layout optimization opportunities by grouping
>> hot fields together to avoid data cache fragmentation. Future enhancements
>> may add field affinity information if it can be computed with low enough
>> overhead.
>
> Would also be good to have temporal information, so that you can
> correlate data accesses that occur, for example, inside the same loop /
> basic block, or in sequence in the common CFG flow. This could lead to
> changes in allocation patterns (heap, BSS).
>
>> *Working set measurement*: this tool measures the data working set size of
>> an application at each snapshot during execution. It can help to understand
>> phased behavior as well as providing basic direction for further effort by
>> the developer: e.g., knowing whether the working set is close to fitting in
>> current L3 caches or is many times larger can help determine where to spend
>> effort.
>
> This is interesting, but most useful when your dataset changes size
> over different runs. This is similar to running the program under perf
> for different workloads, and I'm not sure how you're going to get that
> in a single run. It also comes with the additional problem that cache
> sizes are not always advertised, so you might need an additional tool
> to guess the sizes based on increasing the size of data blocks and
> finding steps in the data access graph.
>
>> *Dead store detection*: this tool identifies dead stores (write-after-write
>> patterns with no intervening read) as well as redundant stores (writes of
>> the same value already in memory). Xref the DeadSpy paper from CGO 2012.
>
> This should probably be spotted by the compiler, so I guess it's a
> tool for compiler developers to spot missed optimisation opportunities
> in the back-end.

Not when the dead store happens in an external DSO where the compiler
can't detect it (the same applies for single references).

>> *Single-reference*: this tool identifies data cache lines brought in but
>> only read once. These could be candidates for non-temporal loads.
>
> That's nice and should be simple enough to get a report in the end.
> This also seems to be a hint to compiler developers rather than users.
>
> I think you guys have a nice set of tools to develop and I'm looking
> forward to working with them.
>
> cheers,
> --renato
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
On 20 April 2016 at 13:18, Yury Gribov <y.gribov at samsung.com> wrote:
> Not when dead store happens in an external DSO where compiler can't detect
> it (same applies for single references).

Do you mean the ones between the DSO and the instrumented code?
Because if it's just in the DSO itself, then the compiler could have
spotted it, too, when compiling the DSO.

I mean, of course there are cases like interprocedural dead stores
(calling a function that changes A while changing A right after), but
that, again, could be found at compilation time, given enough inlining
depth or IP analysis.

Also, if this is not something the compiler can fix, what is the point
of detecting dead stores? For all the non-trivial cases the compiler
can't spot, most will probably arise from special situations where the
compiler is changing the code to expose the issue, and thus the user
has little control over how to fix the underlying problem.

cheers,
--renato
On Wed, Apr 20, 2016 at 1:42 PM, Renato Golin via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> On 20 April 2016 at 13:18, Yury Gribov <y.gribov at samsung.com> wrote:
>> Not when dead store happens in an external DSO where compiler can't detect
>> it (same applies for single references).
>
> Do you mean the ones between the DSO and the instrumented code?
> Because if it's just in the DSO itself, then the compiler could have
> spotted it, too, when compiling the DSO.
>
> I mean, of course there are cases like interprocedural dead stores
> (call a function that changes A while changing A right after), but
> that again, could be found at compilation time, given enough inlining
> depth or IP analysis.

When I read the description, I assumed it would (mostly) be used to
detect those inter-procedural dead stores that the compiler can't see
(without LTO, at least).

The "external DSO" case also exists, but unless the DSO is also
instrumented, you'd get lots of false negatives (which aren't "a big
problem" with the sanitizers, but of course we want to minimize them;
you'd do that by also instrumenting the DSO).

> Also, if this is not something the compiler can fix, what is the point
> of detecting dead-stores? For all the non-trivial cases the compiler
> can't spot, most will probably arise from special situations where the
> compiler is changing the code to expose the issue, and thus, the user
> has little control over how to fix the underlying problem.

Same as with the other sanitizers: the compiler can't fix it, but you
(the programmer) can! :-)

I don't think the dead stores would mostly come from the compiler
changing code around. I think they'd most likely come from the other
example you mentioned, where you call a function which writes
somewhere, and then you write over it, with no intervening read. If
this happens a lot with a given function, maybe you want to write to
some parts of the structure conditionally.
Derek, Qin: Since this is mostly being researched as it is being
implemented (and in the public repo), how do you plan to coordinate
with the rest of the community? (Current status, what's "left" to get
a "useful" implementation, etc.)

About the working set tool: How are you thinking about doing the
snapshots? How do you plan to sync the several threads? Spawning an
external process/"thread" (kind of like LSan), or internally?

About the tools in general: Do you expect any of the currently planned
ones to be intrusive and require the user to change their code before
they can use the tool with good results?

Thank you,

 Filipe
On Wed, Apr 20, 2016 at 8:42 AM, Renato Golin <renato.golin at linaro.org> wrote:
> Also, if this is not something the compiler can fix, what is the point
> of detecting dead-stores? For all the non-trivial cases the compiler
> can't spot, most will probably arise from special situations where the
> compiler is changing the code to expose the issue, and thus, the user
> has little control over how to fix the underlying problem.

The case studies in the DeadSpy paper show how they, the users, fixed
the most frequent cases of dead stores in SPEC CPU (yes, in some cases
working around the compiler, but not always): "DeadSpy: A Tool to
Pinpoint Program Inefficiencies"
<https://www.researchgate.net/publication/241623127_DeadSpy_A_tool_to_pinpoint_program_inefficiencies>
by Chabbi et al., from CGO 2012.