thr3ads.net - llvm dev - [llvm-dev] SCCP is not always correct in presence of undef (+ proposed fix) [Jan 2017]

Davide Italiano via llvm-dev

2017-Jan-02 22:48 UTC

[llvm-dev] SCCP is not always correct in presence of undef (+ proposed fix)

On Mon, Jan 2, 2017 at 2:24 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> Hi Davide,
>
> On Sat, Dec 31, 2016 at 4:19 AM, Davide Italiano <davide at freebsd.org> wrote:
>> Although originally I wasn't, I'm increasingly convinced this is the
>> right path forward (at least in the beginning), i.e. strip undef
>> handling entirely. I tried to integrate undef in the solver to see how
>> that worked and it seems the proposed lattice still has some issues.
>
> By "integrate undef" do you mean you implemented the lattice you
> mentioned in the original post?  If so, can you mention what issues
> you ran into?
>
Nothing in the testcases I used, but you still pointed out a case
where the lattice I proposed in the original mail wasn't necessarily
correct.

>> We have a hard time reasoning about simple things like what's the
>> lattice structure and what should be the invariants we
>> maintain/enforce at the end of each run (something that's very nice
>> about the original algorithm is that you know at the end of the
>> algorithm nothing should have value TOP otherwise you forgot to lower
>> something/have a bug, but we can't currently assert this because of
>> `undef`).
>
> That is a good point.  Can we get the same kind of checking by keeping
> track of if we've visited every non-dead instruction in the function
> at least once?
>
I think that could work.

>> I'm also becoming increasingly convinced that this problem is
>> something we (LLVM) created.
>> If optimizing `undef` (which is sometimes, if not often, a symptom of
>> undefined behaviour/bugs in your program) should come at the expense
>> of correctness or complicated logic/potentially unsound algorithms I
>> don't think we should go that way.
>
> I agree that undef has soundness issues, but I don't think those are
> relevant here.
>
> And with respect to `undef` and its implications on the correctness of
> the source program, there are two factors here:
>
> Using `undef` in a side effect (e.g. printf(undef)) is usually a sign
> of a buggy source program, especially if the source language is C or
> C++.
>
> The *presence* of `undef` is fine, and does not imply that the source
> program is incorrect.
>
Thanks for the clarification. That's why I said "sometimes, if not often"
(not "always"). Anyway, I get your point =)

>> What Danny proposed is, I think, very appealing, but I'm taking a more
>> extreme position here saying that we may consider not doing that in
>> the beginning. Today I ran SCCP over a bunch of codebases (completely
>> removing the undef optimization) and I found it never actually
>> resulting in a runtime performance hit. Of course this is not
>> representative at all, but still a datapoint.
>
> What is the contribution of SCCP itself to the (I presume internal)
> benchmarks?  If you disable SCCP completely, how much performance do
> you lose?
>
~1.5% runtime on a game scene (that's with both SCCP and IPSCCP
disabled; this is LTO, FWIW).
Note: I wish I could share numbers, but I'm not allowed.

>> I think we should take this opportunity as a way to see if we can make
>> things simpler/easier to consider correct instead of adding cases for
>> problems we discover.
>
> In theory I'm completely fine with not handling undef at all
> (i.e. your proposal) as a starting point.  If we see a need for it, we
> can always add it back later.
>
I have personally no rush to get this code in, so, while we're here,
we can just bite the bullet and fix this entirely.
Side note: I originally wanted to fix this for 4.0 so that I can avoid
maintaining a patch downstream. Turns out that the burden of keeping a
patch downstream is not terribly high, so this can probably wait (and
get more careful thoughts).
Side note 2: I think that even if we want to disable undef handling,
that needs to be supported by a set of benchmarks showing that we
don't lose too much. My original testing was, of course,
not representative of the entire world (I just reported it as a
data point).

> In practice, this is the kind of thing that tends to get added back
> poorly (because someone's benchmark will be faster by 1% with a "small
> fix" to SCCP).  Given that we're taking a step back and looking at the
> bigger picture now anyway, this could be an opportune moment to fix
> the underlying issue with regards to undef.
>
> Having said that, since you're doing the work I'm more than happy to
> let you make the call on this one.  I have the armchair, but you have
> the keyboard. :)
>
Your comments are of course very welcome (that goes without saying).
Are you happy with me experimenting with something similar to what
Danny proposed (a pre-solver computing the SCCs on the SSA graph)? At
this point this is my favourite solution because we can stick with the
default algorithm (which will keep me happier, as somebody else did the
hard work of proving it correct on my behalf).

> -- Sanjoy
>
>
> PS:
>
> I still don't have an answer to the very first question I asked:
>
> ```
>>> Looking at the original bug, it seems like a straightforward
>>> undef-propagation bug to me -- SCCP was folding "or undef, constant"
>>> to "undef", which is wrong.  Why is changing that not the fix?  That
>>> is, some variant of
>>
>>
>> You would still need to fix the iteration order of the resolver, or it will
>> make more wrong decisions.
>> As Davide discovered, there are bugs open with the same cause.
>
> Davide -- can you point me to those?
> ```

Sorry, I missed this one :(
I have another case (not reduced) where this falls apart. I think it's
the same issue as I locally patched with something very similar to
what you proposed in your original mail, so I'm tempted to claim it's
the same issue (or a slight modification of it).
That said, sure, I think we can probably get the patch you proposed
originally in and call it a night. But I'm still very nervous that
we're not sure of the correctness of the algorithm and I'm very afraid
it may fall apart in the future again.
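
As an illustration of the folding bug quoted in the PS above ("or undef, constant" to "undef"), here is a minimal sketch in Python rather than LLVM code. It assumes 8-bit integers and models `undef` as "any bit pattern"; the helper names `possible_results` and `can_fold_to_undef` are hypothetical and exist only for this example. The point it tries to show is that the result of the `or` can never have the constant's set bits cleared, so it is not an arbitrary value.

```python
import operator

def possible_results(op, constant, bits=8):
    """Every value op(undef, constant) can take when `undef` ranges
    over all bit patterns of the given width (a modelling assumption)."""
    mask = (1 << bits) - 1
    return {op(u, constant) & mask for u in range(1 << bits)}

def can_fold_to_undef(op, constant, bits=8):
    """Folding to `undef` is only sound if the result can be *any* value."""
    return len(possible_results(op, constant, bits)) == (1 << bits)

# `or undef, 0x0F`: the low four bits are always set, so only 16 of the
# 256 bit patterns are reachable and the result is not "any value".
or_results = possible_results(operator.or_, 0x0F)
print(len(or_results), can_fold_to_undef(operator.or_, 0x0F))

# `xor undef, 0x0F`: every bit pattern is reachable, so `undef` is fine.
print(can_fold_to_undef(operator.xor, 0x0F))
```

Under this model a sound fold for `or undef, C` has to pick a value from the reachable set (all-ones is always in it), which is a separate question from the visitation-order problem discussed in the rest of the thread.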

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

Sanjoy Das via llvm-dev

2017-Jan-02 23:26 UTC

[llvm-dev] SCCP is not always correct in presence of undef (+ proposed fix)

Hi Davide,

On Mon, Jan 2, 2017 at 2:48 PM, Davide Italiano <davide at freebsd.org> wrote:
> Your comments are of course very welcome (that goes without saying).
> Are you happy with me experimenting with something similar to what
> Danny proposed (a pre-solver computing the SCCs on the SSA graph?).

SGTM!

> At
> this point this is my favourite solution because we can stick with the
> default algorithm (which will keep me happier as somebody else did the
> hard work of proving correct on my behalf).

Sounds good!

-- Sanjoy

Davide Italiano via llvm-dev

2017-Jan-03 17:45 UTC

[llvm-dev] SCCP is not always correct in presence of undef (+ proposed fix)

On Mon, Jan 2, 2017 at 3:26 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> Hi Davide,
>
> On Mon, Jan 2, 2017 at 2:48 PM, Davide Italiano <davide at freebsd.org> wrote:
>> Your comments are of course very welcome (that goes without saying).
>> Are you happy with me experimenting with something similar to what
>> Danny proposed (a pre-solver computing the SCCs on the SSA graph?).
>
> SGTM!
>
>> At
>> this point this is my favourite solution because we can stick with the
>> default algorithm (which will keep me happier as somebody else did the
>> hard work of proving correct on my behalf).
>
> Sounds good!
>
> -- Sanjoy

Sending a real (updated) proposal to make sure we're all on the same
page (and that the idea makes sense), in case somebody not following the
earlier discussion wants to comment. I haven't checked whether LLVM
already has functions for the individual steps (feel free to suggest
some).

The algorithm (if I understood the idea correctly) is more or less the
following:
1) Build the def-use chain graph.
2) Compute the set of SCCs of the graph.
3) Compute the RPO of the DAG formed in 2) and visit the nodes (SCCs)
   in that order.
   Within each SCC, sort topologically and use RPO within the SCC (as a
   worklist) until a fixed point is reached (as we do here
   https://github.com/dcci/llvm/blob/master/lib/Transforms/Scalar/ConstantProp.cpp#L78
   just with a different visitation order). If something is found to
   fold to `undef`, we propagate the information through the DAG.
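
The three steps above can be sketched roughly as follows. This is a toy model in Python, not LLVM code: the def-use graph is a plain dictionary mapping each definition to its users, and `tarjan_sccs`, `scc_visit_order`, `solve` and the example graph are all hypothetical names made up for the illustration. It relies on the fact that Tarjan's algorithm emits SCCs in reverse topological order of the condensed DAG, so reversing that list gives a "definitions before users" order. As a simplification, nodes inside an SCC are iterated in arbitrary order rather than in RPO.

```python
from typing import Callable, Dict, List

def tarjan_sccs(graph: Dict[str, List[str]]) -> List[List[str]]:
    """SCCs of `graph` (def -> users), in reverse topological order."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    sccs: List[List[str]] = []

    def visit(v: str) -> None:  # recursive, so only suitable for small graphs
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC: pop it off
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

def scc_visit_order(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Topological order of the SCC DAG: definitions before their users."""
    return list(reversed(tarjan_sccs(graph)))

def solve(graph: Dict[str, List[str]], transfer: Callable, state: dict) -> dict:
    """Visit SCCs in order; iterate inside each SCC until a fixed point."""
    for scc in scc_visit_order(graph):
        changed = True
        while changed:
            changed = False
            for node in scc:
                new = transfer(node, state)
                if new != state.get(node):
                    state[node] = new
                    changed = True
    return state

# Toy def-use graph: `phi` and `add` form a loop-carried cycle.
example = {"a": ["phi"], "phi": ["add"], "add": ["phi", "use"], "use": []}
print(scc_visit_order(example))
```

Because each SCC is only started once everything feeding it has been settled, the fixed-point iteration is confined to the cycle itself instead of the whole function.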

I think this should work for SCCP; I'm still trying to convince myself
that it generalizes to IPSCCP. After the pre-solver has run there
shouldn't be any `undef` values left that can be propagated, and we can
run the solver on a lattice which doesn't know (and shouldn't know)
about undef.
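
For readers who want something concrete, a lattice that knows nothing about undef could look roughly like the classic three-level constant-propagation lattice below. This is a sketch in Python, not the LLVM data structure: `LatticeValue`, `TOP`, `BOTTOM`, `const`, `meet` and `assert_no_top` are names invented for the example. The last helper spells out the invariant mentioned earlier in the thread, namely that nothing reachable should still be TOP once the solver has finished.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

@dataclass(frozen=True)
class LatticeValue:
    kind: str                    # "top", "const" or "bottom"
    value: Optional[int] = None  # only meaningful when kind == "const"

TOP = LatticeValue("top")        # nothing known yet
BOTTOM = LatticeValue("bottom")  # overdefined: not a single constant

def const(v: int) -> LatticeValue:
    return LatticeValue("const", v)

def meet(a: LatticeValue, b: LatticeValue) -> LatticeValue:
    """Combine two facts about the same value; can only move downwards."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == b:
        return a
    return BOTTOM  # two different constants, or one side already BOTTOM

def assert_no_top(state: Dict[str, LatticeValue], reachable: Iterable[str]) -> None:
    """End-of-run check: every reachable value must have left TOP."""
    leftovers = sorted(n for n in reachable if state.get(n, TOP) == TOP)
    assert not leftovers, f"values never lowered from TOP: {leftovers}"

print(meet(const(3), const(3)), meet(const(3), const(4)))
```

With no undef state in the lattice, a value left at TOP after the run can only mean that something reachable was never visited or never lowered, which is what makes the assertion usable.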

Any comment/correction is appreciated, that goes without saying =)

P.S. (mainly curiosity). I wasn't able to find anything in the
literature describing an algorithm using this approach for CP (although
I haven't tried really hard). Nielson's book describes the approach of
using SCCs for a more intelligent worklist ordering for constraint
solvers in general (http://www.imm.dtu.dk/~hrni/PPA/slides6.pdf), and
http://homepages.dcc.ufmg.br/~fernando/publications/papers/CGO13_raphael.pdf
applies this to range analysis (maybe SCCP can somehow be seen as a
special case/simpler form of it).

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare
