Tim Northover via llvm-dev
2019-Sep-17 13:10 UTC
[llvm-dev] ScalarEvolution invariants around wrapping flags
Hi, I'm working on a bug somewhere between SCEV and IndVarSimplify, which tacks an unwanted "nuw" onto an "add i32 %whatever, -1" (which actually almost certainly will overflow), leading ultimately to an infinite loop. A fuller description and test-case is at the end for anyone interested. The issue seems to be with ScalarEvolution's attempts to cache SCEV objects, which don't include wrapping flags in the immutable state. That means that once ScalarEvolution has created an expression as "nuw" it will be that way forever, even if analysed from an entirely separate part of the function where that's not necessarily the case. Worse, callers can create an expression with specified flags through the public interface, which would pollute all analysis after that point with those unwanted flags. I don't know if any callers actually do this but I could see it being useful for looking at hypothetical cases. To some degree mutation of wrapping flags is part of the design, because SCEVAddRecExprs are explicitly const_casted to add flags in multiple places so that they can be found again later. But that might not be quite so harmful because they at least contain a Loop as a contextual cue that prevents some leaking. So, my real question is does anyone know what the contract with ScalarEvolution is? I see a few possibilities: 1. Callers are expected to not engage in speculation. ScalarEvolution itself must only create expressions it knows hold in all cases. This sounds too restrictive to me. 2. Speculation not allowed outside ScalarEvolution, but ScalarEvolution can speculate about a specific Loop. I think this entails making non-AddRec expressions immutable (with Flags included as part of the FoldingSetID) and ensuring that any modification of an AddRec is provable within its Loop. 3. Speculation is allowed, and achieved by making all expressions immutable. Sites currently const_casting AddRecExprs instead get a new one with the flags they want. Sites trying to help out other codepaths by cacheing this info are out of luck. 4. Speculation scenarios are allowed, and achieved by adding something like a "HypothesisToken" to SCEV objects, to keep them separate from each other and guaranteed properties. As in 2, maybe a Loop is enough if public users can't speculate. Any ideas gratefully received. Cheers. Tim. Description of bug: the loop gets duplicated, with one copy guarded by both "x == 0 && x != 0". This gets an "nuw" added to %dec (probably legitimately since it's never executed). That "nuw" gets picked up in the analysis of the other copy of the loop, which also gets "nuw". The loop quickly gets turned into an infinite one because it's now riddled with UB. See test.c, compiled for a 32-bit platform (I've checked i686 and ARM, mac & Linux) at -O3. Strangely I can't reproduce it in opt yet. I suspect multiple runs of the same pass are needed. -------------- next part -------------- A non-text attachment was scrubbed... Name: test.c Type: application/octet-stream Size: 2224 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190917/b03b7e14/attachment.obj>
Finkel, Hal J. via llvm-dev
2019-Sep-17 15:38 UTC
[llvm-dev] ScalarEvolution invariants around wrapping flags
On 9/17/19 8:10 AM, Tim Northover via llvm-dev wrote: Hi, I'm working on a bug somewhere between SCEV and IndVarSimplify, which tacks an unwanted "nuw" onto an "add i32 %whatever, -1" (which actually almost certainly will overflow), My understanding of the current contract is that this is a bug. We cannot add deduced flags to non-AddRec SCEVs precisely because they're cached across loops. AddRecs are different because they're tied to a particular loop. Long term, I think that it would be cleaner to rework this so that all of the SCEV's are immutable and include their flags. -Hal leading ultimately to an infinite loop. A fuller description and test-case is at the end for anyone interested. The issue seems to be with ScalarEvolution's attempts to cache SCEV objects, which don't include wrapping flags in the immutable state. That means that once ScalarEvolution has created an expression as "nuw" it will be that way forever, even if analysed from an entirely separate part of the function where that's not necessarily the case. Worse, callers can create an expression with specified flags through the public interface, which would pollute all analysis after that point with those unwanted flags. I don't know if any callers actually do this but I could see it being useful for looking at hypothetical cases. To some degree mutation of wrapping flags is part of the design, because SCEVAddRecExprs are explicitly const_casted to add flags in multiple places so that they can be found again later. But that might not be quite so harmful because they at least contain a Loop as a contextual cue that prevents some leaking. So, my real question is does anyone know what the contract with ScalarEvolution is? I see a few possibilities: 1. Callers are expected to not engage in speculation. ScalarEvolution itself must only create expressions it knows hold in all cases. This sounds too restrictive to me. 2. Speculation not allowed outside ScalarEvolution, but ScalarEvolution can speculate about a specific Loop. I think this entails making non-AddRec expressions immutable (with Flags included as part of the FoldingSetID) and ensuring that any modification of an AddRec is provable within its Loop. 3. Speculation is allowed, and achieved by making all expressions immutable. Sites currently const_casting AddRecExprs instead get a new one with the flags they want. Sites trying to help out other codepaths by cacheing this info are out of luck. 4. Speculation scenarios are allowed, and achieved by adding something like a "HypothesisToken" to SCEV objects, to keep them separate from each other and guaranteed properties. As in 2, maybe a Loop is enough if public users can't speculate. Any ideas gratefully received. Cheers. Tim. Description of bug: the loop gets duplicated, with one copy guarded by both "x == 0 && x != 0". This gets an "nuw" added to %dec (probably legitimately since it's never executed). That "nuw" gets picked up in the analysis of the other copy of the loop, which also gets "nuw". The loop quickly gets turned into an infinite one because it's now riddled with UB. See test.c, compiled for a 32-bit platform (I've checked i686 and ARM, mac & Linux) at -O3. Strangely I can't reproduce it in opt yet. I suspect multiple runs of the same pass are needed. _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev -- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190917/7b5f8f27/attachment.html>
Sanjoy Das via llvm-dev
2019-Sep-19 03:17 UTC
[llvm-dev] ScalarEvolution invariants around wrapping flags
> 1. Callers are expected to not engage in speculation. ScalarEvolution > itself must only create expressions it knows hold in all cases.This is correct. There is some more relevant text in ScalarEvolution::isSCEVExprNeverPoison. And you're right, this is quite restrictive.> Long term, I think that it would be cleaner to rework this so that all of the SCEV's are immutable and include their flags.I proposed this in 2016 but Andy had some concerns and I dropped it. See http://lists.llvm.org/pipermail/llvm-dev/2016-September/105108.html and http://lists.llvm.org/pipermail/llvm-dev/2017-August/116324.html +Andrew Trick -- Sanjoy> > -Hal > > > leading ultimately to an > infinite loop. A fuller description and test-case is at the end for > anyone interested. > > The issue seems to be with ScalarEvolution's attempts to cache SCEV > objects, which don't include wrapping flags in the immutable state. > That means that once ScalarEvolution has created an expression as > "nuw" it will be that way forever, even if analysed from an entirely > separate part of the function where that's not necessarily the case. > > Worse, callers can create an expression with specified flags through > the public interface, which would pollute all analysis after that > point with those unwanted flags. I don't know if any callers actually > do this but I could see it being useful for looking at hypothetical > cases. > > To some degree mutation of wrapping flags is part of the design, > because SCEVAddRecExprs are explicitly const_casted to add flags in > multiple places so that they can be found again later. But that might > not be quite so harmful because they at least contain a Loop as a > contextual cue that prevents some leaking. > > So, my real question is does anyone know what the contract with > ScalarEvolution is? I see a few possibilities: > > 1. Callers are expected to not engage in speculation. ScalarEvolution > itself must only create expressions it knows hold in all cases. This > sounds too restrictive to me. > 2. Speculation not allowed outside ScalarEvolution, but > ScalarEvolution can speculate about a specific Loop. I think this > entails making non-AddRec expressions immutable (with Flags included > as part of the FoldingSetID) and ensuring that any modification of an > AddRec is provable within its Loop. > 3. Speculation is allowed, and achieved by making all expressions > immutable. Sites currently const_casting AddRecExprs instead get a new > one with the flags they want. Sites trying to help out other codepaths > by cacheing this info are out of luck. > 4. Speculation scenarios are allowed, and achieved by adding something > like a "HypothesisToken" to SCEV objects, to keep them separate from > each other and guaranteed properties. As in 2, maybe a Loop is enough > if public users can't speculate. > > Any ideas gratefully received. > > Cheers. > > Tim. > > Description of bug: the loop gets duplicated, with one copy guarded by > both "x == 0 && x != 0". This gets an "nuw" added to %dec (probably > legitimately since it's never executed). That "nuw" gets picked up in > the analysis of the other copy of the loop, which also gets "nuw". The > loop quickly gets turned into an infinite one because it's now riddled > with UB. > > See test.c, compiled for a 32-bit platform (I've checked i686 and ARM, > mac & Linux) at -O3. Strangely I can't reproduce it in opt yet. I > suspect multiple runs of the same pass are needed. > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > -- > Hal Finkel > Lead, Compiler Technology and Programming Languages > Leadership Computing Facility > Argonne National Laboratory > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev