Hi all,

I'm working on a language in which I would like all operations to be well defined (and efficient). I want to define a sqrt function in my language that returns NaN for arguments < 0, and NaN for a NaN argument. As far as I know, these semantics map nicely to the SQRTPS SSE instruction, which seems to return NaN on arguments < 0.

However, the LLVM lang ref states "llvm.sqrt has undefined behavior for negative numbers other than -0.0".

This means that, to avoid undefined behaviour, in general I will have to add a runtime branch to avoid passing values less than zero to llvm.sqrt(). This is unfortunate since I would like to avoid inefficient, unneeded branching.

I propose changing the llvm.sqrt() LLVM intrinsic to be well defined on all inputs, and be defined to return NaN on negative inputs.

Btw, I don't particularly care about errno or related, as my language is not C. I realise there is some kind of issue here to do with code reordering and errno, but it would be a pity if these problems slowed down sqrt code emission for all LLVM users.

What do people think?

Thanks,
Nick
On Thu, Oct 31, 2013 at 01:16:20PM +0000, Nicholas Chapman wrote:
> Hi all,
> I'm working on a language in which I would like all operations to be
> well defined. (and efficient)
> I want to define a sqrt function in my language, that will return NaN
> for arguments < 0, and NaN for a NaN argument.
> As far as I know, these semantics map nicely to the SQRTPS SSE
> instruction, which seems to return NaN on arguments < 0.
> However, the LLVM lang ref states "llvm.sqrt has undefined behavior for
> negative numbers other than -0.0".
>
> This means that, to avoid undefined behaviour, in general I will have to
> add a runtime branch to avoid passing values less than zero to llvm.sqrt().
> This is unfortunate since I would like to avoid inefficient, unneeded
> branching.
>
> I propose changing the llvm.sqrt() LLVM intrinsic to be well defined on
> all inputs, and be defined to return NaN on negative inputs.
>
> Btw, I don't particularly care about errno or related, as my language is
> not C. I realise there is some kind of issue here to do with code
> reordering and errno, but it would be a pity if these problems slowed
> down sqrt code emission for all LLVM users.
>
> What do people think?
>

My suggestion is to implement sqrt() like this:

    y = x >= 0.0f ? llvm.sqrt(x) : NaN;

If you are worried about performance on X86, you could have the frontend emit the llvm.sse_sqrt_ps intrinsic for sqrt(), or you could add a pattern to the X86 backend to select this sequence to a SQRTPS instruction.

-Tom

> Thanks,
> Nick
On Oct 31, 2013, at 6:16 AM, Nicholas Chapman <admin at indigorenderer.com> wrote:
>
> I propose changing the llvm.sqrt() LLVM intrinsic to be well defined on all inputs, and be defined to return NaN on negative inputs.

I strongly disagree with this proposal. The purpose of this general-purpose intrinsic is to expose sqrt functionality present on many of the architectures LLVM supports. If we defined its edge cases, we wouldn't be able to map it freely to target functionality on targets whose edge cases don't match that definition.

I'd recommend using (or adding, if it doesn't already exist) an X86-specific intrinsic to expose exactly the instruction you want.

-Owen
> I strongly disagree with this proposal. The purpose of this general
> purpose intrinsic is to expose sqrt functionality present on many of
> the architectures LLVM supports. If we defined its edge cases, we
> won't be able to map it to target functionality freely on targets whose
> edge cases don't match that definition.

I agree the targets should be the primary focus, but a cursory search failed to find one whose sqrt instruction(s) didn't produce NaN for negative values; it's pretty much the only sane choice. Do they exist (perhaps odd GPUs or something that always traps)?

If not, perhaps we could sensibly decouple the errno stuff from the actual value produced: make no guarantees about what happens to the environment but specify the result.

Cheers.

Tim.