Bjarke Roune
2015-Jul-01 00:27 UTC
[LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution
Hi Sanjoy, thanks for your thoughts on this. On Sat, Jun 27, 2015 at 12:16 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > First of all, going by the "poison causes UB only when observed", SCEV > does not do the right thing currently: [...] > > That seems like a bug? There's also bug 23527 for GEP. Sounds like theremight be more such bugs. One way to get what you want and also fix this existing bug(?) is to> make the no wrap flags part of a SCEV expression's identity (i.e. make > {A,+,B}<nuw> a different expression from {A,+,B}). If you start with > the precondition that every <nuw> [0] came from a <nuw> [1] present in > the IR, then you don't really have to worry about control dependence > -- you can always optimize under the assumption that the SCEV > expression does not unsign-overflow; if such an optimization changes > program behavior then that program has UB since it was data dependent > on poison [2].To make sure that I understand correctly, are you suggesting making the nuw (and similar) flags part of the key when looking up an SCEV in the UniqueSCEVs member variable on ScalarEvolution? That way, an instruction with nuw and one without will not map to the same SCEV unless the nuw can be inferred in some other way. Sounds good to me, I'm happy to go with either one of that or inferring UB from poison. Adding the flags to the key/identity does break the idea that mathematically equivalent expressions should cancel if subtracted. Andrew Trick wrote something about that previously: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141215/249390.html Andrew: I wonder what your view is on this? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150630/11ad37f8/attachment.html>
Sanjoy Das
2015-Jul-01 07:11 UTC
[LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution
> To make sure that I understand correctly, are you suggesting making > the nuw (and similar) flags part of the key when looking up an SCEV in > the UniqueSCEVs member variable on ScalarEvolution? That way, anYes.> instruction with nuw and one without will not map to the same SCEV > unless the nuw can be inferred in some other way. Sounds good to me,Yes.> I'm happy to go with either one of that or inferring UB from poison.Just to be clear, it is not clear to me which approach is better. I was only offering a potential alternative.> Adding the flags to the key/identity does break the idea that > mathematically equivalent expressions should cancel if > subtracted. Andrew Trick wrote something about that previously: > > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141215/249390.htmlIt depends on what sort of semantics you assign to an overflowing <nsw> scev expression. If such an overflowing expression is supposed to produce poison, then I think you can cancel "(a +nsw b) - (a + b)" since "poison - <anything>" == "poison" --> can be replaced with "0". Another way to look at this is that it should always be legal for the compiler to drop <nsw> to do a simplification. -- Sanjoy
Andrew Trick
2015-Jul-01 18:50 UTC
[LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution
> On Jun 30, 2015, at 7:27 PM, Bjarke Roune <broune at google.com> wrote: > > Hi Sanjoy, thanks for your thoughts on this. > > On Sat, Jun 27, 2015 at 12:16 AM, Sanjoy Das <sanjoy at playingwithpointers.com <mailto:sanjoy at playingwithpointers.com>> wrote: > First of all, going by the "poison causes UB only when observed", SCEV > does not do the right thing currently: [...] > > That seems like a bug? There's also bug 23527 for GEP. Sounds like there might be more such bugs. > > One way to get what you want and also fix this existing bug(?) is to > make the no wrap flags part of a SCEV expression's identity (i.e. make > {A,+,B}<nuw> a different expression from {A,+,B}). If you start with > the precondition that every <nuw> [0] came from a <nuw> [1] present in > the IR, then you don't really have to worry about control dependence > -- you can always optimize under the assumption that the SCEV > expression does not unsign-overflow; if such an optimization changes > program behavior then that program has UB since it was data dependent > on poison [2]. > > To make sure that I understand correctly, are you suggesting making the nuw (and similar) flags part of the key when looking up an SCEV in the UniqueSCEVs member variable on ScalarEvolution? That way, an instruction with nuw and one without will not map to the same SCEV unless the nuw can be inferred in some other way. Sounds good to me, I'm happy to go with either one of that or inferring UB from poison. > > Adding the flags to the key/identity does break the idea that mathematically equivalent expressions should cancel if subtracted. Andrew Trick wrote something about that previously: > > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141215/249390.html <http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141215/249390.html> > > Andrew: I wonder what your view is on this? >In that thread I was pointing out that with no-wrap flags as part of the expression identity, SCEV loses the ability to reason about or reduce expressions that have a mix of flags. Trivial example: {0,+,1}<nsw> - {1,+,1} We can no longer reduce this to constant 1. OTOH, I've never actually tried this. If the nsw key is limited to recurrences, it might not be terrible in practice. I'm guessing we would at least end up with redundant induction variables though. I won't rule out the idea of adding nsw to the SCEV key, but I'm afraid it will lead to more complexity in the long run. I think your approach to strengthening SCEV by using a poison value analysis is legitimate. I've considered this approach in the past. I do think that it adds complexity, but is probably the best way to reuse existing LSR. I can see a few alternatives to solving your problem: (1) Use poison value analysis to inform SCEV. SCEV now sees nsw flags on your recurrence and reduces the expression for you. LSR does the transformation you want. (2) Add nsw flags to the SCEV key so we don't need poison value analysis. I think this could be problematic, but if Sanjoy really wants to go this route and support the work I won't get in the way. (3) Write a *simple* IR-level analysis that refactors address expressions to hoist the loop invariant component, informed by the target's addressing mode. This could be a first step in replacing the LSR pass with something more efficient and predictable. This is the approach I would take if I were doing the work. But I can understand not wanting to design a new pass just to handle your case. Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150701/e943aefc/attachment.html>
Bjarke Roune
2015-Jul-01 21:34 UTC
[LLVMdev] Deriving undefined behavior from nsw/inbounds/poison for scalar evolution
Hi Andy, thanks for the feedback and background information. On Wed, Jul 1, 2015 at 11:50 AM, Andrew Trick <atrick at apple.com> wrote:> [...] >I think your approach to strengthening SCEV by using a poison value> analysis is legitimate. I've considered this approach in the past. I do > think that it adds complexity, but is probably the best way to reuse > existing LSR. I can see a few alternatives to solving your problem: > > (1) Use poison value analysis to inform SCEV. SCEV now sees nsw flags on > your recurrence and reduces the expression for you. LSR does the > transformation you want. > > I'm leaning towards going with this as it should help LSR in other casestoo and LSR already has all the machinery, it just needs a bit more information. (2) Add nsw flags to the SCEV key so we don't need poison value analysis. I> think this could be problematic, but if Sanjoy really wants to go this > route and support the work I won't get in the way. > > It seems that no one is significantly preferring this approach.> (3) Write a *simple* IR-level analysis that refactors address expressions > to hoist the loop invariant component, informed by the target's addressing > mode. This could be a first step in replacing the LSR pass with something > more efficient and predictable. This is the approach I would take if I were > doing the work. But I can understand not wanting to design a new pass just > to handle your case. > > I see how that would work for ptr[i+offset], reassociating it to(&ptr[offset])[i] and then hoisting &ptr[offset]. Here's another simple case that should introduce a 64 bit pointer induction variable for &p[2*i], while keeping i in 32 bits, but doesn't currently: for (int i = 0; i < N; ++i) p[2*i] = ...; It's the same issue, just with nsw on 2*i instead of on an addition. There may be some hoisting approach that works for this, and that also works if it is x*i instead of 2*i for some loop-invariant x, though I don't see it at the moment. Bjarke -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150701/e5b3c916/attachment.html>