I'm also not fully happy with LLVM's behavior here. There is another
undefined case too, which is the minimum integer divided by -1. In
Julia I can get "random" answers by doing:

julia> sdiv_int(-9223372036854775808, -1)
87106304

julia> sdiv_int(-9223372036854775808, -1)
87108096

In other contexts where the arguments are not constant, this typically
gives an FPE trap. More than insisting on a particular behavior, I'd
like it to be consistent. I know the result is undefined, so LLVM's
behavior here is valid, but I find that to be an overly lawyerly
interpretation.

Presumably the optimizer benefits from taking advantage of the
undefined behavior, but to get a consistent result you need to check
for both zero and this case, which is an awful lot of checks. Yes they
will branch predict well, but this still can't be good, for code size
if nothing else. How much performance can you really get by constant
folding -9223372036854775808/-1 to an unspecified value?

On Sat, Apr 6, 2013 at 1:18 AM, Owen Anderson <resistor at mac.com> wrote:
>
> On Apr 5, 2013, at 8:02 PM, Cameron McInally <cameron.mcinally at nyu.edu>
> wrote:
>
>> I'm less concerned about "where" the trap happens as I am about "if" it
>> happens. For example, a Fortran program with division-by-zero is, by the
>> Standard, non-conforming. Pragmatically, not a Fortran program. Rather than
>> wrong answers, I would like to see a hard error indicating that a program is
>> non-conforming to the Standard.
>
> As I've pointed out, clang does provide such functionality as an opt-in
> feature through its -fsanitize options. A hypothetical Fortran frontend
> could do the same, and even make it an opt-out feature if it chose. I'm
> sorry if its implementation mechanism doesn't match exactly what you want it
> to be, but it's not like nobody else has thought about this problem. They
> have, and they've designed and shipped a solution!
>
> Side note: even if the -fsanitize option introduces a branch around the
> division (which I haven't verified), it's quite unlikely to cause a
> significant performance regression. The branch to the error block should be
> perfectly predicted on any CPU made in the last 25 years.
>
> --Owen
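
The two guards mentioned above (divisor of zero, and the minimum integer
divided by -1) can be sketched in C as follows. This is only an
illustration: the helper name checked_sdiv64 and its status-return
convention are assumptions made for this example, not anything provided
by LLVM or Julia.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an LLVM or Julia API): performs signed 64-bit
 * division only when the result is defined, and reports failure otherwise. */
bool checked_sdiv64(int64_t num, int64_t den, int64_t *out)
{
    assert(out != NULL);

    if (den == 0)
        return false;   /* division by zero */
    if (num == INT64_MIN && den == -1)
        return false;   /* true quotient is 2^63, which does not fit in int64_t */

    *out = num / den;   /* every remaining input pair has a defined result */
    return true;
}
```

Every division site that wants a consistent answer pays for both
comparisons, which is the code-size cost being objected to.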
On Saturday, April 6, 2013, Jeff Bezanson wrote:
>
> Presumably the optimizer benefits from taking advantage of the
> undefined behavior, but to get a consistent result you need to check
> for both zero and this case, which is an awful lot of checks. Yes they
> will branch predict well, but this still can't be good, for code size
> if nothing else. How much performance can you really get by constant
> folding -9223372036854775808/-1 to an unspecified value?
>

Constant folding undefined expressions is sort of silly, but I
appreciate that it makes undefined behavior problems in frontends
immediately apparent with trivial cases before they creep up on you in
more complicated optimized code. After all, even if the backend makes
practical concessions to trivial cases, the underlying semantic problem
is still there and will bite you eventually.

For high-level languages like Julia that want to provide efficiency but
also give defined behavior to all user-exposed cases, I think adding an
LLVM intrinsic to represent division that's defined to trap on division
by zero or overflow would be the best solution. Since the trap is a
side effect, it would stifle optimization of code that used the
intrinsic, but the intrinsic could still be lowered to a single
hardware instruction and avoid the branching and code bloat of manual
checks on hardware where division by zero natively traps.

-Joe
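
For comparison, here is a rough C sketch of the semantics such a
trapping division would have if it were written out by hand on a target
with no native trap. The function name trapping_sdiv64 is hypothetical
(no such LLVM intrinsic is implied to exist), and abort() is only a
stand-in for whatever trap mechanism a real lowering would choose.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name; this is not an existing LLVM intrinsic. It models
 * "division that is defined to trap" using explicit checks. */
int64_t trapping_sdiv64(int64_t num, int64_t den)
{
    /* Both undefined cases funnel into a single trap. The call to abort()
     * is a visible side effect, so the division cannot be treated as a
     * pure expression. */
    if (den == 0 || (num == INT64_MIN && den == -1))
        abort();

    return num / den;
}
```

The explicit branch here is the part that a single trapping hardware
divide would make unnecessary on targets where the hardware already
traps for these inputs.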
A division intrinsic with defined behavior on all arguments would be
awesome! Thanks for considering this.

On Sat, Apr 6, 2013 at 11:27 AM, Joe Groff <arcata at gmail.com> wrote:
> On Saturday, April 6, 2013, Jeff Bezanson wrote:
>>
>> Presumably the optimizer benefits from taking advantage of the
>> undefined behavior, but to get a consistent result you need to check
>> for both zero and this case, which is an awful lot of checks. Yes they
>> will branch predict well, but this still can't be good, for code size
>> if nothing else. How much performance can you really get by constant
>> folding -9223372036854775808/-1 to an unspecified value?
>
> Constant folding undefined expressions is sort of silly, but I appreciate
> that it makes undefined behavior problems in frontends immediately apparent
> with trivial cases before they creep up on you in more complicated optimized
> code. After all, even if the backend makes practical concessions to trivial
> cases, the underlying semantic problem is still there and will bite you
> eventually. For high-level languages like Julia that want to provide
> efficiency but also give defined behavior to all user-exposed cases, I think
> adding an LLVM intrinsic to represent division that's defined to trap on
> division by zero or overflow would be the best solution. Since the trap is a
> side effect, it would stifle optimization of code that used the intrinsic,
> but the intrinsic could still be lowered to a single hardware instruction
> and avoid the branching and code bloat of manual checks on hardware where
> division by zero natively traps.
>
> -Joe
Joe Groff <arcata at gmail.com> writes:

> Constant folding undefined expressions is sort of silly, but I
> appreciate that it makes undefined behavior problems in frontends
> immediately apparent with trivial cases before they creep up on you in
> more complicated optimized code. After all, even if the backend makes
> practical concessions to trivial cases, the underlying semantic
> problem is still there and will bite you eventually.

Not necessarily. It's not uncommon for Fortran programmers to do
something like this (C-tran pseudo code):

    if (error) then
      temp = x / 0 ; Force a trap to get into the debugger
    end if

The user is relying on trapping behavior to help debug code. There is
no bad behavior to bite one later. Optimizing traps away is unfriendly
in cases like this.

> For high-level languages like Julia that want to provide efficiency
> but also give defined behavior to all user-exposed cases, I think
> adding an LLVM intrinsic to represent division that's defined to trap
> on division by zero or overflow would be the best solution. Since the
> trap is a side effect, it would stifle optimization of code that used
> the intrinsic, but the intrinsic could still be lowered to a single
> hardware instruction and avoid the branching and code bloat of manual
> checks on hardware where division by zero natively traps.

What about the case where other optimization exposes a constant divide
by zero later on? There is no way for a frontend to know to emit the
intrinsic _a_priori_. We don't want to force all divides to use an
intrinsic because it will kill optimization. I think to handle the
general case we will need a flag to preserve traps.

-David
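
One way to picture a divide by zero that only becomes constant after
optimization is the C sketch below. The function names are invented for
this illustration, and whether a particular compiler actually inlines
and folds this code depends on its passes and flags; treat it as an
assumption-laden example, not a description of what LLVM does.

```c
/* Hypothetical example; none of these names come from the thread. */

/* At this point the frontend sees an ordinary division by a variable,
 * so it has no reason to emit a special trapping form. */
int per_item(int total, int count)
{
    return total / count;
}

/* If the optimizer inlines per_item() and propagates count == 0, the
 * division becomes a constant divide by zero that the frontend never saw.
 * Calling this function has undefined behavior in C; it is shown only to
 * illustrate the shape of the problem and should not be executed. */
int per_item_of_empty_batch(int total)
{
    int count = 0;
    return per_item(total, count);
}
```

Only per_item() with a nonzero count is safe to call; the second
function exists purely to show where the constant zero appears once the
call has been inlined.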