thr3ads.net - llvm dev - [llvm-dev] Existing studies on the benefits of pointer analysis [Mar 2016]

If this information is useful, please help other people find it:
Share via:

Daniel Berlin via llvm-dev

2016-Mar-22 01:53 UTC

[llvm-dev] Existing studies on the benefits of pointer analysis

On Mon, Mar 21, 2016 at 6:28 PM, Philip Reames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On 03/21/2016 08:56 AM, Jia Chen via llvm-dev wrote:
>
> Hi Christian,
>
> Thank you so much for the reply! Please see my comments inline.
>
> On 03/21/2016 09:32 AM, Christian Convey wrote:
>
> Hi Jia,
>
> If one looks at existing research literatures, there are even more
>> algorithm to consider for doing pointer analysis.
>
>
> For at least some published AA algorithms, there may be some uncertainty
> about software patents and/or copyright.
>
> At one point I was interested in the status of this AA implementation
> <https://dl.acm.org/citation.cfm?id=2466483> by Lian Li et al.  IIRC,
> when I contacted Lian to ask if there was any chance of getting it into
> LLVM, IIRC she said that her employer wouldn't promise to relinquish
all
> possible claims it had on that algorithm's IP.  So unfortunately, at
least
> in the U.S., an algorithm being published in an academic journal
doesn't
> remove all legal risk associated with using it.
>
> This is news to me. Thanks for sharing it.
>
>
> Also, speaking from my own experience, even when an AA algorithm seems to
> be described in great detail in some piece of literature (e.g., a phd
> thesis), there can still be a lot of details which are glossed over, or
> which seem clear when reading the document but which get a lot more
> confusing when one tries to actually implement it.
>
> That can make implementing such an algorithm take far longer than one
> would expect based on just reading the document.  (It's also an
argument in
> favor of requiring academic papers which describe the behavior of a
> software implementation to actually include a working copy of the source
> code, IMHO.)
>
> My personal experience also coincides. And even if the paper does come
> with an artifact or source codes, they are usually proof-of-concept
> implementations with lots of important real-world corner cases ignored.
>
>
> So my question here is: what kind(s) of precision really justify the cost
>> and what kinds do not? Has anybody done any study in the past to
evaluate
>> what kinds of features in pointer analyses will benefit what kinds of
>> optimization passes?
>>
>
> At one point I discussed this with Daniel Berlin, but I'm having
trouble
> find a record of the conversation.  IIRC, he says that he once threw a huge
> amount of computing power at doing a *full* context-sensitive AA on some
> software, and the speedup he observed in the resulting program as
> underwhelming (10-15%?).
>
> I kind of expect that. As you mentioned later, most optimization passes
> work in a context-insensitive manner (i.e. they won't clone a function
and
> optimize differently on different clones). Context sensitivity on the
> pointer analysis side is probably not going to help a lot if the client
> cannot fully capitalize on it. In the settings of compiler optimization, my
> guess is that flow sensitivity, field sensitivity, heap model and external
> library annotations are the four aspects that are likely to matter.
>
> I did some preliminary experiments with licm on c programs over the last
> weekend. I chose licm because intuitively that's the optimization that
> could have the biggest performance impact. The result suggested that tbaa,
> cfl-aa, scev-aa and globals-aa yields very little additional benefits over
> basic-aa. Again, both the methodology and benchmark selection are very
> immature and the results need to be double-checked, but my hope is that by
> looking at how aa algorithms and their clients interact I may be able to
> get some hints on what kind of aa a compiler really wants.
>
> Just to chime in here, your results match my experience and expectations
> with LICM as well.  Between basic-aa, and TBAA (specifically LLVM's
> implementation thereof), I haven't seen a lot of cases where an
imprecision
> in the alias analysis prevents hoisting.
>
Yeah, at best, for LICM, it's just going to tell you the best place to
insert runtime checks.  LICM has a specific goal, and it's usually not AA
that prevents proving something loop invariant.  Most loads/stores are also
either trivially loop invariant, or impossible to prove loop invariant.

More general PRE and GVN-like opts (basically, load elimination, load
hoisting, dead store elimination, store sinking) are where i expect better
AA to help the most off the top of my head.  But to figure out the gains,
you'd really need to instrument at runtime to figure out what the
theoretical maximum gain is (IE when things are equivalent that it is not
proving equivalent).



>
> *However*, if you're interested in LICM specifically, I have
*definitely*
> seen cases where the precision of AliasSetTracker (our grouping of AA
> results to prevent O(n^2) queries) prevents hoisting in spurious cases.
> AST could use some serious attention, both from an engineering standpoint
> and from (possibly) a theoretically one.
>

You already know my view on this one: It's going to be remarkably hard to
make AST work the way folks want it and have it be incremental and
completely agnostic of anything but the AA API.

It's just really hard if not provably impossible to do this incrementally
and avoid duplicate work, and get precise results and ...


On the other hand, it's pretty easy if you basically say "i provide
list of
all pointers and statements i care about, you make me some sets", and you
let it figure out the answers upfront.

(it's also not clear to me why AST is the right abstraction for LICM to
work on top of, but that's neither here nor there :P)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160321/8079e29c/attachment.html>

Philip Reames via llvm-dev

2016-Mar-22 18:51 UTC

head link

[llvm-dev] Existing studies on the benefits of pointer analysis

On 03/21/2016 06:53 PM, Daniel Berlin wrote:>
>
> On Mon, Mar 21, 2016 at 6:28 PM, Philip Reames via llvm-dev 
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>
>
>
>     On 03/21/2016 08:56 AM, Jia Chen via llvm-dev wrote:
>>
>>
>>     I did some preliminary experiments with licm on c programs over
>>     the last weekend. I chose licm because intuitively that's the
>>     optimization that could have the biggest performance impact. The
>>     result suggested that tbaa, cfl-aa, scev-aa and globals-aa yields
>>     very little additional benefits over basic-aa. Again, both the
>>     methodology and benchmark selection are very immature and the
>>     results need to be double-checked, but my hope is that by looking
>>     at how aa algorithms and their clients interact I may be able to
>>     get some hints on what kind of aa a compiler really wants.
>     Just to chime in here, your results match my experience and
>     expectations with LICM as well.  Between basic-aa, and TBAA
>     (specifically LLVM's implementation thereof), I haven't seen a
lot
>     of cases where an imprecision in the alias analysis prevents hoisting.
>
>
> Yeah, at best, for LICM, it's just going to tell you the best place to 
> insert runtime checks.  LICM has a specific goal, and it's usually not 
> AA that prevents proving something loop invariant.  Most loads/stores 
> are also either trivially loop invariant, or impossible to prove loop 
> invariant.Side note: The key limiting factor for load hoisting is most often being 
able to prove dereferenceability.  I only mention that because the OP 
had asked for where other optimizer changes might help.>
>
>     *However*, if you're interested in LICM specifically, I have
>     *definitely* seen cases where the precision of AliasSetTracker
>     (our grouping of AA results to prevent O(n^2) queries) prevents
>     hoisting in spurious cases. AST could use some serious attention,
>     both from an engineering standpoint and from (possibly) a
>     theoretically one.
>
>
>
> You already know my view on this one: It's going to be remarkably hard 
> to make AST work the way folks want it and have it be incremental and 
> completely agnostic of anything but the AA API.
>
> It's just really hard if not provably impossible to do this 
> incrementally and avoid duplicate work, and get precise results and ...
>
>
> On the other hand, it's pretty easy if you basically say "i
provide
> list of all pointers and statements i care about, you make me some 
> sets", and you let it figure out the answers upfront.Er, this is actually fairly close to what the code does.  It just does 
it in a really oddly structured manner.  But yes, I agree, this code 
could be radically improved.>
> (it's also not clear to me why AST is the right abstraction for LICM 
> to work on top of, but that's neither here nor there :P)Out of curiosity, what would you suggest instead?

For context, I have a patch in my tree which falls back to O(n^2) 
queries for small loops.  We found this to be rather important for the 
optimization of IR derived from our Java frontend.

Philip
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/ca6f170e/attachment.html>

Jia Chen via llvm-dev

2016-Mar-22 21:09 UTC

head link

[llvm-dev] Existing studies on the benefits of pointer analysis

On 03/22/2016 01:51 PM, Philip Reames wrote:>
>
> On 03/21/2016 06:53 PM, Daniel Berlin wrote:
>>
>>
>>     *However*, if you're interested in LICM specifically, I have
>>     *definitely* seen cases where the precision of AliasSetTracker
>>     (our grouping of AA results to prevent O(n^2) queries) prevents
>>     hoisting in spurious cases. AST could use some serious attention,
>>     both from an engineering standpoint and from (possibly) a
>>     theoretically one.
>>
>>
>>
>> You already know my view on this one: It's going to be remarkably 
>> hard to make AST work the way folks want it and have it be 
>> incremental and completely agnostic of anything but the AA API.
>>
>> It's just really hard if not provably impossible to do this 
>> incrementally and avoid duplicate work, and get precise results and ...
>>
>>
>> On the other hand, it's pretty easy if you basically say "i
provide
>> list of all pointers and statements i care about, you make me some 
>> sets", and you let it figure out the answers upfront.
> Er, this is actually fairly close to what the code does.  It just does 
> it in a really oddly structured manner.  But yes, I agree, this code 
> could be radically improved.
>>
>> (it's also not clear to me why AST is the right abstraction for
LICM
>> to work on top of, but that's neither here nor there :P)
> Out of curiosity, what would you suggest instead?
>
> For context, I have a patch in my tree which falls back to O(n^2) 
> queries for small loops.  We found this to be rather important for the 
> optimization of IR derived from our Java frontend.
Also out of curiosity, why LLVM choose to expose pointer analysis 
information as alias-query APIs rather than APIs that let the client 
query points-to sets? This question has bothered me ever since I started 
to look into LLVM's alias analysis framework.

It seems to me that alias queries may yield less precise results than 
points-to queries. To put it in another way, it is easy to convert 
points-to information to aliasing information (just check for emptiness 
of points-to set intersection), but the reverse is much harder and may 
not be possible sometimes. The alias set tracker also introduce an 
additional source of imprecision: if a may alias b and b may alias c, it 
is not necessary that a may alias c but the AST would merge them all 
into one set.

It also doesn't seem like alias information is cheaper to compute (e.g. 
andersen) and is cheaper to query (e.g. the O(n^2) query problem).

-- 
Best Regards,

--
Jia Chen
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160322/9d629f0d/attachment.html>

Apparently Analagous Threads

Search for more maybe matching threads

llvm dev - Mar 2016 - Existing studies on the benefits of pointer analysis

[llvm-dev] Existing studies on the benefits of pointer analysis

[llvm-dev] Existing studies on the benefits of pointer analysis

[llvm-dev] Existing studies on the benefits of pointer analysis

Apparently Analagous Threads