thr3ads.net - llvm dev - [LLVMdev] Integer divide by zero [Apr 2013]

If this information is useful, please help other people find it:
Share via:

Cameron McInally

2013-Apr-08 04:28 UTC

[LLVMdev] Integer divide by zero

Well put, Chandler!

On Sun, Apr 7, 2013 at 6:23 PM, Chandler Carruth <chandlerc at
google.com>wrote:
> I think this entire conversation is going a bit off the rails. Let's
try
> to stay focused on the specific request, and why there (may) be problems
> with it.
>
> On Sun, Apr 7, 2013 at 11:50 AM, Cameron McInally <
> cameron.mcinally at nyu.edu> wrote:
>
>> To be clear, you're asking to turn off a set of optimizations. That
>>> is, you're asking to make code in general run slower, so that
you can
>>> get a particular behavior on some CPUs in unusual cases.
>>>
>>
>> I respectfully disagree. I am asking for an *option* to turn off a set
of
>> optimizations; not turn off optimizations in general. I would like to
make
>> it easy for a compiler implementor to choose the desired behaviour. I
>> whole-heartedly believe that both behaviours (undefined and trap) have
>> merit.
>>
>
> I think you're both misconstruing what this would involve.
>
> You're actually asking for the formal model of the LLVM IR to be
> *parameterized*. In one mode, an instruction would produce undefined
> behavior on division, and in another mode it would produce a trap. Then you
> are asking for the optimizer stack to support either semantic model, and
> produce efficient code regardless.
>
> This is completely intractable for LLVM to support. It would make both the
> optimizers and the developers of LLVM crazy to have deep parameterization
> of the fundamental semantic model for integer division.
>
> The correct way to support *exactly* reproducing the architectural
> peculiarities of the x86-64 integer divide instruction is to add a
> target-specific intrinsic which does this. It will have defined behavior
> (of trapping in some cases) as you want, and you can emit this in your FE
> easily. However, even this has the risk of incurring a high maintenance
> burden. If you want much in the way of optimizations of this intrinsic,
> you'll have to go through the optimizer and teach each pass about your
> intrinsic. Some of these will be easy, but some will be hard and there will
> be a *lot* of them. =/
>
>
> Cameron, you (and others interested) will certainly need to provide all of
> the patches and work to support this if you think this is an important use
> case, as the existing developers have found other trade-offs and solutions.
> And even then, if it requires really substantial changes to the optimizer,
> I'm not sure it's worth pursuing this in LLVM. My primary concerns
are
> two-fold. First, I think that the amount of work required to recover the
> optimizations which could theoretically apply to both of these operations
> will be massive. Second, I fear that after having done this work, you will
> immediately find the need to remove some other undefined behavior from the
> IR which happens to have defined behavior on x86-64.
>
Alas, I must have been shortsighted. For my purposes, I had envisioned
using this target-specific intrinsic only when undefined behaviour was
imminent. That information is available before the IR and it would
work-around the constant folder. I did not anticipate needing optimizations
around that intrinsic, since it would ultimately trap.

Supporting the intrinsic as a proper alternative to the integer division
operator(s) sounds like a lot of work. I do not believe that the reward is
worth the effort, at least for my purposes. Others may feel different.

> Fundamentally, the idea of undefined behavior is at the core of the design
> of LLVM's optimizers. It is leveraged everywhere, and without it many
> algorithms that are fast would become slow, transformations that are cheap
> would become expensive, passes that operate locally would be forced to
> operate across ever growing scopes in order to be certain the optimizations
> applied in this specific case. Trying to remove undefined behavior from
> LLVM seems unlikely to be a productive pursuit.
>
Fair enough.

> More productive (IMO) is to emit explicit guards against the undefined
> behavior in your language, much as -fsanitize does for undefined behavior
> in C++. Then work to build a mode where a specific target can take
> advantage of target specific trapping behaviors to emit these guards more
> efficiently. This will allow LLVM's optimizers to continue to function
in
> the world they were designed for, and with a set of rules that we know how
> to build efficient optimizers around, and your source programs can operate
> in a world with checked behavior rather than undefined behavior. As a
> useful side-effect, you can defer the target-specific optimizations until
> you have benchmarks (internally is fine!) and can demonstrate the
> performance problems (if any).
>
Regrettably, this implementation does not suit my needs. The constant
folding would still occur and I would like to produce the actual division,
since the instruction is non-maskable on x86. Others may have a better use
for this implementation though, so I don't want to shoot the idea down for
everyone.

> Cameron, you may disagree, but honestly if you were to convince folks here
> I think it would have happened already. I'm not likely to continue the
> theoretical debate about whether LLVM's stance on UB (as I've
described
> above) is a "good" or "bad" stance. Not that I
wouldn't enjoy the debate
> (especially at a bar some time), but because I fear it isn't a
productive
> way to spend the time of folks on this list. So let's try to stick to
the
> technical discussion of strategies, costs, and tradeoffs.
>
Oh, no. Your analysis was thorough and I can sympathize with it. The seams
of C/C++ and the x86 architecture are foggy. I understand that my
interpretation of their interactions is not gospel.

Thanks again for the thoughtful reply!

-Cameron
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130408/b4aa72d9/attachment.html>

Chandler Carruth

2013-Apr-08 05:34 UTC

head link

[LLVMdev] Integer divide by zero

On Sun, Apr 7, 2013 at 9:28 PM, Cameron McInally
<cameron.mcinally at nyu.edu>wrote:
> More productive (IMO) is to emit explicit guards against the undefined
>> behavior in your language, much as -fsanitize does for undefined
behavior
>> in C++. Then work to build a mode where a specific target can take
>> advantage of target specific trapping behaviors to emit these guards
more
>> efficiently. This will allow LLVM's optimizers to continue to
function in
>> the world they were designed for, and with a set of rules that we know
how
>> to build efficient optimizers around, and your source programs can
operate
>> in a world with checked behavior rather than undefined behavior. As a
>> useful side-effect, you can defer the target-specific optimizations
until
>> you have benchmarks (internally is fine!) and can demonstrate the
>> performance problems (if any).
>>
>
> Regrettably, this implementation does not suit my needs.
>
I'm curious why you think so... I think it would very closely match your
needs:

> The constant folding would still occur
>
No? At least, I don't see why.

I don't know what your frontend is doing (but my impression is that it is
not Clang? If I'm wrong, let me know...) but the common idiom is to emit
nothing as a constant other than immediate values, and let LLVM's
optimizers figure it out. A consequence of this pattern combined with
inserting guards is that (at most) the optimizer will fold directly to the
trap. You should be able to control the exact structure in the FE though.

and I would like to produce the actual division, since the instruction
is> non-maskable on x86.
>
Yes, and this is the point I was driving at with the comments about a
target specific optimization. It is reasonable for the x86 backend to
recognize the pattern of testing for a zero divisor and trapping if it
occurs, and transform that into an unconditional divide relying on the
hardware trap instead. I think it is not unreasonable to have this strategy
result in one of two generated code patterns in the overwhelming majority
of cases:

1) An unconditional trap because the optimizer proved divide by zero
2) A direct divide instruction which relies on the x86 trapping behavior.

That said, it remains a fairly significant amount of work to arrive at this
place. Just trying to point out a reasonable approach to arriving there.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130407/a17842d2/attachment.html>

Cameron McInally

2013-Apr-08 15:45 UTC

head link

[LLVMdev] Integer divide by zero

On Mon, Apr 8, 2013 at 1:34 AM, Chandler Carruth <chandlerc at
google.com>wrote:
> On Sun, Apr 7, 2013 at 9:28 PM, Cameron McInally <cameron.mcinally at
nyu.edu
> > wrote:
>
>> More productive (IMO) is to emit explicit guards against the undefined
>>> behavior in your language, much as -fsanitize does for undefined
behavior
>>> in C++. Then work to build a mode where a specific target can take
>>> advantage of target specific trapping behaviors to emit these
guards more
>>> efficiently. This will allow LLVM's optimizers to continue to
function in
>>> the world they were designed for, and with a set of rules that we
know how
>>> to build efficient optimizers around, and your source programs can
operate
>>> in a world with checked behavior rather than undefined behavior. As
a
>>> useful side-effect, you can defer the target-specific optimizations
until
>>> you have benchmarks (internally is fine!) and can demonstrate the
>>> performance problems (if any).
>>>
>>
>> Regrettably, this implementation does not suit my needs.
>>
>
> I'm curious why you think so... I think it would very closely match
your
> needs:
>
It's my current understanding that the actual division would be constant
folded away and I would be left with only the guard. Later, the guard would
be proved always true. I could, of course, be mistaken.

I checked out Clang with -fsanitize=integer-divide-by-zero this weekend and
saw similar in the assembly. I also noticed that the division operands are
still around in the assembly. If it would be possible to recreate the div
instructions, I would be thrilled.


>
>
>> The constant folding would still occur
>>
>
> No? At least, I don't see why.
>
> I don't know what your frontend is doing (but my impression is that it
is
> not Clang? If I'm wrong, let me know...)
>
It is not Clang. Our compiler has proprietary frontends, optimizer,
vectorizer, and more. You may have noticed that I don't often share LLVM IR
on the list. Our IR has tons of proprietary changes to it. But, I digress.


> but the common idiom is to emit nothing as a constant other than immediate
> values, and let LLVM's optimizers figure it out.
>
Our pre-LLVM optimizer may be a culprit here. With the original test case I
provided, our inliner will inline both foo(...) and bar(...). In main(), we
end up calling CreateSDiv in the IRBuilder with two constants, i.e. (6/0).

Would Clang snap these constants into temporaries? Or, does Clang maintain
the original source structure? Sorry for these simple questions, I have no
insight to Clang at all.


> A consequence of this pattern combined with inserting guards is that (at
> most) the optimizer will fold directly to the trap. You should be able to
> control the exact structure in the FE though.
>
> and I would like to produce the actual division, since the instruction is
>> non-maskable on x86.
>>
>
> Yes, and this is the point I was driving at with the comments about a
> target specific optimization. It is reasonable for the x86 backend to
> recognize the pattern of testing for a zero divisor and trapping if it
> occurs, and transform that into an unconditional divide relying on the
> hardware trap instead. I think it is not unreasonable to have this strategy
> result in one of two generated code patterns in the overwhelming majority
> of cases:
>
I'm going to be nitpicky over this. It's really not my intention to be
unreasonable. I hope you can understand.

> 1) An unconditional trap because the optimizer proved divide by zero
>
This isn't desirable for me. I am afraid an unconditional trap in the
assembly will just confuse a user. I suspect that a user will trigger this
code and immediately report a compiler bug that we've inserted a rogue
trap.


> 2) A direct divide instruction which relies on the x86 trapping behavior.
>
This would be great. Would I be able to accurately recreate the actual
constant divide that was folded away? Originally, I had suspected that the
operands would be long gone or at least intractable. Again, this may sound
unreasonable, but I would like to keep the division intact so that a user
can see where their code went wrong in the assembly. Sorry in advance if
I'm misunderstanding.

Thanks again, Chandler.

-Cameron
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130408/5375d098/attachment.html>

dag at cray.com

2013-Apr-16 17:01 UTC

head link

[LLVMdev] Integer divide by zero

Chandler Carruth <chandlerc at google.com> writes:
> Yes, and this is the point I was driving at with the comments about a
> target specific optimization. It is reasonable for the x86 backend to
> recognize the pattern of testing for a zero divisor and trapping if it
> occurs, and transform that into an unconditional divide relying on the
> hardware trap instead. I think it is not unreasonable to have this
> strategy result in one of two generated code patterns in the
> overwhelming majority of cases:
>
> 1) An unconditional trap because the optimizer proved divide by zero
> 2) A direct divide instruction which relies on the x86 trapping
> behavior.
>
> That said, it remains a fairly significant amount of work to arrive at
> this place. Just trying to point out a reasonable approach to arriving
> there.
I agree that this is potentially a good approach.  I also agree that it
is a significant amount of work.

Given infinite resources, I think your solution makes the most sense.
Given limited resources, I'm not sure.  But Cameron has said he won't
push anything upstream that people aren't comfortable with.

                                  -David

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Apr 2013 - [LLVMdev] Integer divide by zero

[LLVMdev] Integer divide by zero

[LLVMdev] Integer divide by zero

[LLVMdev] Integer divide by zero

[LLVMdev] Integer divide by zero

Possibly Parallel Threads