thr3ads.net - llvm dev - [LLVMdev] Expressing ambiguous points-to info in AliasAnalysis::alias(...) results? [Jun 2015]

Christian Convey

2015-Jun-14 16:25 UTC

[LLVMdev] Expressing ambiguous points-to info in AliasAnalysis::alias(...) results?

Hi all,

I'm playing around with implementing an existing non-LLVM AA algorithm as
an LLVM AA pass.  I'm looking for suggestions for getting it to fit in
AliasAnalysis's worldview, so that it might eventually be a candidate for
inclusion in LLVM.

The algorithm maintains a may-point-to graph.  Unfortunately the algorithm
doesn't delete an "A-->B" edge when there's a strong update of "A" but the
value copied into "A" isn't a pointer.  So the interpretation of "A" having
only one outbound edge (to "B") is a little ambiguous.  It means "'A'
definitely points to 'B', or 'A' doesn't hold a valid pointer."

This makes it hard for the algorithm to ever return a MustAlias result.  If
the graph has just two edges, "A-->C" and "B-->C", then the most precise
answer it could give for "alias(A,B)" would be "MustAlias or NoAlias, I'm
not sure which".   AFAIK, with the current interface I'd have to return
"MayAlias" in that case, which is unsatisfying.
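
To make the two-edge case concrete, here is a minimal standalone sketch of
such a query.  It is not LLVM code and not the published algorithm: the
names SketchAliasResult, PointsToGraph, targetsOf and aliasQuery are
hypothetical, the MustOrNoAlias value is the proposed (not existing) fourth
answer, and the sketch assumes a field-insensitive graph in which two
vertices with no common target cannot alias.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical result type for this sketch only. LLVM's AliasResult has no
// MustOrNoAlias value; that is the extra answer discussed in this message.
enum class SketchAliasResult { NoAlias, MayAlias, MustAlias, MustOrNoAlias };

// May-point-to graph: each vertex maps to the set of vertices it may point to.
using PointsToGraph = std::map<std::string, std::set<std::string>>;

// Returns the targets recorded for V, or an empty set if V has no edges.
static const std::set<std::string> &targetsOf(const PointsToGraph &G,
                                              const std::string &V) {
  static const std::set<std::string> Empty;
  auto It = G.find(V);
  return It == G.end() ? Empty : It->second;
}

// Answers alias(A, B) from the graph alone. AllowAmbiguous selects between
// the proposed MustOrNoAlias answer and the MayAlias fallback.
SketchAliasResult aliasQuery(const PointsToGraph &G, const std::string &A,
                             const std::string &B, bool AllowAmbiguous) {
  const std::set<std::string> &TA = targetsOf(G, A);
  const std::set<std::string> &TB = targetsOf(G, B);

  // Assumption: no shared target (including "no edges at all") means the
  // two values cannot refer to the same memory.
  bool Overlap = false;
  for (const std::string &T : TA) {
    if (TB.count(T)) {
      Overlap = true;
      break;
    }
  }
  if (!Overlap)
    return SketchAliasResult::NoAlias;

  // One edge each, to the same target: either both still hold that pointer
  // (MustAlias) or a stale edge survived a non-pointer store (NoAlias).
  // MustAlias alone is never returned, which is the problem described above.
  if (TA.size() == 1 && TB.size() == 1)
    return AllowAmbiguous ? SketchAliasResult::MustOrNoAlias
                          : SketchAliasResult::MayAlias;

  return SketchAliasResult::MayAlias;
}
```

With edges "A-->C" and "B-->C", aliasQuery(G, "A", "B", false) falls back to
MayAlias, while passing true yields the hypothetical MustOrNoAlias.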

One solution would be for me to adapt the algorithm to remove this
ambiguity.  But if possible I'd like to keep the algorithm as close to the
published version as possible, so I'd rather find another solution.

Another approach is to add a value to the AliasResult enumeration,
indicating "MustAlias or NoAlias, I'm not sure which".  But I'm not sure if
any downstream analyses could make use of a result like that.

A third, even uglier solution would be to modify the
AliasAnalysis::alias(...) methods to let the caller indicate whether or not
the supplied Values can be assumed to actually contain pointers.  But this
strikes me as an unreasonable concession to this one AA algorithm's quirks.

Any suggestions for how to proceed?

Thanks, Christian

Daniel Berlin

2015-Jun-14 16:57 UTC

On Sun, Jun 14, 2015 at 9:25 AM, Christian Convey
<christian.convey at gmail.com> wrote:
> Hi all,
>
> I'm playing around with implementing an existing non-LLVM AA algorithm as an
> LLVM AA pass.  I'm looking for suggestions for getting it to fit in
> AliasAnalysis's worldview, so that it might eventually be a candidate for
> inclusion in LLVM.
>
> The algorithm maintains a may-point-to graph.  Unfortunately the algorithm
> doesn't delete an "A-->B" edge when there's a strong update of "A" but the
> value copied into "A" isn't a pointer.  So the interpretation of "A" having
> only one outbound edge (to "B") is a little ambiguous.  It means "'A'
> definitely points to 'B', or 'A' doesn't hold a valid pointer."

Define "valid pointer", please?
>
> This makes it hard for the algorithm to ever return a MustAlias result.  If
> the graph has just two edges, "A-->C" and "B-->C", then the most precise
> answer it could give for "alias(A,B)" would be "MustAlias or NoAlias, I'm
> not sure which".   AFAIK, with the current interface I'd have to return
> "MayAlias" in that case, which is unsatisfying.

So i'm trying to understand what harm would come from returning MustAlias
here?
If the value is undef, then we can pick one for which MustAlias is true.
If it's something else, i need to understand how you define "valid
pointer".

(I agree there are situations where this would be the wrong answer,
which is why i'm trying to understand what valid pointer means)
>
> One solution would be for me to adapt the algorithm to remove this
> ambiguity.  But if possible I'd like to keep the algorithm as close to the
> published version as possible, so I'd rather find another solution.

Why?
Published versions are often ... wrong, not well engineered, etc :)
>
> Another approach is to add a value to the AliasResult enumeration,
> indicating "MustAlias or NoAlias, I'm not sure which".  But I'm not sure if
> any downstream analyses could make use of a result like that.

Above, you say you want to not return MustAlias.
Here you say it's not clear that any downstream results could make use
of better info.

Before you go and try to figure out what should change, you really
need to actually determine whether the info you have is valuable.

I would do this by finding a pass you think you can improve with your
extra info, and seeing if it improves (add a temporary hack AA
function or something that gives info about this) by giving it must/no
vs may.

If something improves, great, we can figure out whether it's worth the
tradeoffs/etc and help you figure out what to do.
If nothing improves, it may not be worth you spending your time on it.

Christian Convey

2015-Jun-15 01:35 UTC

> > The algorithm maintains a may-point-to graph.  Unfortunately the algorithm
> > doesn't delete an "A-->B" edge when there's a strong update of "A" but the
> > value copied into "A" isn't a pointer.  So the interpretation of "A" having
> > only one outbound edge (to "B") is a little ambiguous.  It means "'A'
> > definitely points to 'B', or 'A' doesn't hold a valid pointer."
>
>
> Define "valid pointer", please?
>
Sorry, I can see how my phrasing raised a red flag.

The original version of the algorithm I'm looking at was designed to
analyze C source code, not LLVM IR.  I'm in the process of adapting its
dataflow equations for IR.

The algorithm assumes that a correct C program can't just compute pointer
values *ex nihilo*; they can only be obtained from certain syntactic
structures like variable declarations, calls to *malloc*, or pointer
literals.  The AA algorithm reckons that dereferencing a runtime value
obtained by some other mechanism is so likely to be a bug that it can
skip worrying about it.

The AA algorithm uses dataflow analysis to monitor the possible propagation
of those values through the program code, and it represents those flows by
updates to the may-point-to graph.  If at some code point CP, a
may-point-to graph vertex "B" has no outbound edges, that's equivalent to
saying that the AA has concluded the runtime memory modeled by "B" does not
contain any pointer that a correct program has any business trying to
dereference.

So to restate my point in the earlier email: if there's a strong-update of
the form "*A = 42;" (in C parlance), it would be nice to have this AA
algorithm remove any may-point-to graph edges originating at "A" at that
point.  But, for the sake of efficiency (in the author's judgment), such
assignments are simply ignored by the dataflow equations.  And so any
existing may-point-to edges originating at "A" are allowed to remain in
existence.
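
As a rough illustration of the difference, here is a standalone sketch of
the two store rules.  It is my own simplification rather than the paper's
dataflow equations: PointsToGraph, Store, applyStoreAsPublished and
applyStoreWithKill are hypothetical names, and I'm assuming that a strong
update with a pointer value replaces the destination's outbound edges.

```cpp
#include <map>
#include <set>
#include <string>

// May-point-to graph: each vertex maps to the set of vertices it may point to.
using PointsToGraph = std::map<std::string, std::set<std::string>>;

// Hypothetical description of a strong update to the memory modeled by Dest.
struct Store {
  std::string Dest;              // vertex being overwritten
  bool ValueIsPointer;           // false for something like "*A = 42;"
  std::set<std::string> Targets; // what the stored pointer may point to
};

// Behavior described above: non-pointer stores are ignored, so any edges
// already leaving Dest stay in the graph.
void applyStoreAsPublished(PointsToGraph &G, const Store &S) {
  if (!S.ValueIsPointer)
    return; // stale edges survive
  G[S.Dest] = S.Targets; // assumed strong update: replace outbound edges
}

// The "nice to have" variant: a non-pointer store also kills Dest's edges.
void applyStoreWithKill(PointsToGraph &G, const Store &S) {
  if (!S.ValueIsPointer) {
    G[S.Dest].clear(); // Dest no longer holds a dereferenceable pointer
    return;
  }
  G[S.Dest] = S.Targets;
}
```

Starting from a single edge "A-->B" and applying a non-pointer store to "A",
the first rule leaves the edge in place and the second removes it; that
leftover edge is the source of the "MustAlias or NoAlias" ambiguity.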


> > One solution would be for me to adapt the algorithm to remove this
> > ambiguity.  But if possible I'd like to keep the algorithm as close to the
> > published version as possible, so I'd rather find another solution.
>
> Why?
> Published versions are often ... wrong, not well engineered, etc :)
>
I'm not dead-set against modifying it, I'm just biased against doing it
without a good reason.  I'm relatively new to implementing AA algorithms,
and the author seems to have put a great deal of thought into this
algorithm.  So I'm trying to follow a policy of "If it's not broken, don't
fix it."  Also, the more I can remain faithful to the algorithm's original
writeup, the less I'm on the hook to write my own documentation for my
implementation.

> > Another approach is to add a value to the AliasResult enumeration,
> > indicating "MustAlias or NoAlias, I'm not sure which".  But I'm not sure if
> > any downstream analyses could make use of a result like that.
>
> Above, you say you want to not return MustAlias.
> Here you say it's not clear that any downstream results could make use
> of better info.
>
> Before you go and try to figure out what should change, you really
> need to actually determine whether the info you have is valuable.
>
> I would do this by finding a pass you think you can improve with your
> extra info, and seeing if it improves (add a temporary hack AA
> function or something that gives info about this) by giving it must/no
> vs may.
>
> If something improves, great, we can figure out whether it's worth the
> tradeoffs/etc and help you figure out what to do.
> If nothing improves, it may not be worth you spending your time on it.
>
Thanks, will do.   I appreciate the feedback!

- Christian