thr3ads.net - llvm dev - [LLVMdev] Stange behavior in fp arithmetics on x86 (bug possibly) [Oct 2014]

Dmitry Borisenkov

2014-Oct-07 17:50 UTC

[LLVMdev] Stange behavior in fp arithmetics on x86 (bug possibly)

Hello everyone.

I'm not an expert in LLVM, x86, or the IEEE standard for floating-point
numbers, so any of my assumptions below may be wrong. If so, I would be
grateful if you could explain what is going wrong. But if my guesses are
correct, we may have a bug in FP arithmetic on x86.

I have the following IR:

  @g = constant i64 1

  define i32 @main() {
    %gval = load i64* @g
    %gvalfp = bitcast i64 %gval to double
    %fmul = fmul double %gvalfp, -5.000000e-01
    %fcmp = fcmp ueq double %fmul, -0.000000e+00
    %ret = select i1 %fcmp, i32 1, i32 0
    ret i32 %ret
  }

I expected the smallest positive denormalized double times -0.5 to be equal
to -0.0, so the correct exit code would be 1.
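
For readers who want to check this expectation, here is a minimal sketch that mirrors the IR step by step. It assumes the interpreter evaluates float arithmetic in IEEE-754 double precision with round-to-nearest-even (true of CPython on SSE2 and ARM64 hosts, not guaranteed everywhere). The exact product, -2^-1075, lies halfway between -0.0 and the smallest denormal, so the tie rounds to the even significand, which is zero. The variable names simply reuse the IR value names.

```python
import math
import struct

# Reinterpret the integer 1 as a double, mirroring `bitcast i64 %gval to double`.
gvalfp = struct.unpack("<d", struct.pack("<q", 1))[0]

# Multiply in double precision, mirroring `fmul double %gvalfp, -5.000000e-01`.
# Assumption: the host rounds this product directly to a 64-bit double.
fmul = gvalfp * -0.5

# `fcmp ueq` is true when the operands are equal or unordered (either is NaN).
fcmp = (fmul == -0.0) or math.isnan(fmul)

# Mirrors `select i1 %fcmp, i32 1, i32 0`.
exit_code = 1 if fcmp else 0
print(gvalfp, fmul, exit_code)
```

Under that assumption the sketch agrees with the exit code of 1 expected for pure double arithmetic.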

llvm-3.4.2 on x86 linux target produced the following assembly:

      .file "fpfail.ll"
      .section    .rodata.cst8,"aM",@progbits,8
      .align      8
.LCPI0_0:
      .quad -4620693217682128896    # double -0.5
.LCPI0_1:
      .quad -9223372036854775808    # double -0
      .text
      .globl      main
      .align      16, 0x90
      .type main,@function
main:                                   # @main
      .cfi_startproc
# BB#0:
      vmovsd      g, %xmm0
      vmulsd      .LCPI0_0, %xmm0, %xmm0
      vucomisd    .LCPI0_1, %xmm0
      sete  %al
      movzbl      %al, %eax
      ret
.Ltmp0:
      .size main, .Ltmp0-main
      .cfi_endproc

      .type g,@object               # @g
      .section    .rodata,"a",@progbits
      .globl      g
      .align      8
g:
      .quad 1                       # 0x1
      .size g, 8

      .section    ".note.GNU-stack","",@progbits

 

./llc -march=x86 fpfail.ll; g++ fpfail.s; ./a.out; echo $?

returns 1 as expected.

 

But llvm-3.5 (on the same target) lowers the same IR using x87
floating-point instructions, as follows:

      .text
      .file "fpfail.ll"
      .section    .rodata.cst4,"aM",@progbits,4
      .align      4
.LCPI0_0:
      .long 3204448256              # float -0.5
      .text
      .globl      main
      .align      16, 0x90
      .type main,@function
main:                                   # @main
      .cfi_startproc
# BB#0:
      fldl  g
      fmuls .LCPI0_0
      fldz
      fchs
      fxch  %st(1)
      fucompp
      fnstsw      %ax
                                        # kill: AX<def> AX<kill> EAX<def>
                                        # kill: AH<def> AH<kill> EAX<kill>
      sahf
      sete  %al
      movzbl      %al, %eax
      retl
.Ltmp0:
      .size main, .Ltmp0-main
      .cfi_endproc

      .type g,@object               # @g
      .section    .rodata,"a",@progbits
      .globl      g
      .align      8
g:
      .quad 1                       # 0x1
      .size g, 8

      .section    ".note.GNU-stack","",@progbits
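
The integer literals in the two constant pools can be decoded to confirm the values named in the assembler comments. The sketch below is only an illustration; the helper names `quad_to_double` and `long_to_float` are hypothetical and not part of any LLVM tool. It shows that the 3.5 listing stores -0.5 as a 4-byte float while the 3.4.2 listing stores it as an 8-byte double. Both encode exactly the same value, so the narrower constant is not what changes the result.

```python
import struct

def quad_to_double(value):
    """Reinterpret a signed 64-bit integer, as printed in a .quad directive, as a double."""
    return struct.unpack("<d", struct.pack("<q", value))[0]

def long_to_float(value):
    """Reinterpret an unsigned 32-bit integer, as printed in a .long directive, as a float."""
    return struct.unpack("<f", struct.pack("<I", value))[0]

# Constants from the llvm-3.4.2 listing.
lcpi0_0_v34 = quad_to_double(-4620693217682128896)
lcpi0_1_v34 = quad_to_double(-9223372036854775808)

# Constant from the llvm-3.5 listing.
lcpi0_0_v35 = long_to_float(3204448256)

# The global @g, whose bit pattern is the integer 1.
g_as_double = quad_to_double(1)

print(lcpi0_0_v34, lcpi0_1_v34, lcpi0_0_v35, g_as_double)
```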

 

First, it doesn't assemble with g++ (4.8):

fpfail.s:26: Error: invalid instruction suffix for `ret'

I downloaded the Intel manual and found no mention of a retl instruction,
so I manually replaced it with ret and reassembled:

g++ fpfail.s; ./a.out; echo $?

The exit code is 0. This is correct for Intel 80-bit floats but wrong for
doubles. Am I doing something wrong, is this actually a bug, or, even
worse, is this correct behavior?

 

--

Kind regards, Dmitry Borisenkov

 

 

Tim Northover

2014-Oct-07 18:26 UTC

Hi Dmitry,

On 7 October 2014 10:50, Dmitry Borisenkov <d.borisenkov at samsung.com> wrote:
> fpfail.s:26: Error: invalid instruction suffix for `ret'
>
> I downloaded Intel manual and haven’t found any mention of retl instruction,

"retl" is the AT&T syntax for the normal "ret" instruction in the
Intel manual, which makes it mostly undocumented.

> The exit code is 0. This is correct for Intel 80-bit floats but wrong for
> doubles. What am I do wrong or this is actually a bug or even worse –
> correct behavior?

I think the default CPU used by llc was changed between 3.4 and 3.5.
Before, we defaulted to the host's CPU (from memory), but now we pick
a lowest common denominator "generic", which doesn't support SSE.

When the IR comes from Clang, I believe we define the
"FLT_EVAL_METHOD" macro to be 2 in this case (see C99 5.2.4.2.2),
which signals that operations are performed at "long double" precision
and the outcome you see is permitted.
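
The difference between the two evaluation methods can be illustrated with exact rational arithmetic. This is a sketch under stated assumptions, not a model of the x87 unit: it assumes the x87 precision control is at its extended setting (64-bit significand) and that extended precision can represent magnitudes down to 2^-16445. Under those assumptions the exact product -2^-1075 fits in an x87 register without rounding, so it stays nonzero and the comparison with -0.0 fails. The names below are illustrative only.

```python
from fractions import Fraction

# Exact rational values of the operands (every finite double is a rational).
smallest_denormal = Fraction(5e-324)   # 2**-1074, the bit pattern of @g
minus_half = Fraction(-0.5)

# The mathematically exact product, before any rounding.
exact_product = smallest_denormal * minus_half

# Assumption: smallest positive subnormal of the x87 80-bit format.
X87_MIN_SUBNORMAL = Fraction(1, 2**16445)

# A power of two at or above that bound is exactly representable,
# so an x87 register would hold the product without rounding it to zero.
representable_in_x87 = abs(exact_product) >= X87_MIN_SUBNORMAL

# Extended-precision evaluation: the nonzero product is compared with -0.0.
x87_exit_code = 1 if exact_product == 0 else 0

# Double-precision evaluation: the host rounds the product to a double first.
# Assumption: the host multiplies in IEEE-754 double with round-to-nearest-even.
double_exit_code = 1 if (5e-324 * -0.5) == -0.0 else 0

print(exact_product == Fraction(-1, 2**1075), representable_in_x87)
print(x87_exit_code, double_exit_code)
```

The two exit codes differ only because of where rounding to double happens, which is consistent with the outcome being permitted when operations are evaluated at "long double" precision.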

So I *think* this is OK, unless I'm misunderstanding one of the specs
involved.

Cheers.

Tim.

Joerg Sonnenberger

2014-Oct-07 18:45 UTC

On Tue, Oct 07, 2014 at 09:50:37PM +0400, Dmitry Borisenkov wrote:
> I'm not an expert neither in llvm nor in x86 nor in IEEE standard for
> floating point numbers, thus any of my following assumptions maybe wrong. If
> so, I will be grateful if you clarify me what's goes wrong. But if my
> guesses are correct we possibly have a bug in fp arithmetics on x86.

Are you targeting the same backend? i386 (32-bit mode) uses FPU registers
for argument passing and return values, x86_64 / amd64 (64-bit mode) uses
SSE registers for float/double values and FPU registers for long double.
The error on retl makes me think the second example is compiled for
i386, while the first example looks more like x86_64.

Joerg

Dmitry Borisenkov

2014-Oct-08 08:36 UTC

Hi, Joerg

Both examples were compiled with ./llc -march=x86 -O3 fpfail.ll (i386).
I've double-checked it.

Kind regards, Dmitry Borisenkov


Stephen Checkoway

2014-Oct-10 06:48 UTC

On Oct 7, 2014, at 2:26 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> Hi Dmitry,
>
> On 7 October 2014 10:50, Dmitry Borisenkov <d.borisenkov at samsung.com> wrote:
>> fpfail.s:26: Error: invalid instruction suffix for `ret'
>>
>> I downloaded Intel manual and haven’t found any mention of retl instruction,
>
> "retl" is the AT&T syntax for the normal "ret" instruction in the
> Intel manual, which makes it mostly undocumented.

Are you sure about that? I don't recall ever seeing retl before. A while
back a reference for AT&T was mentioned and, as I recall, this was the best
anyone had <http://docs.oracle.com/cd/E19253-01/817-5477/817-5477.pdf>. It
contains no mention of retl.

This seems to be the commit that added support for it
<http://lists.cs.uiuc.edu/pipermail/llvm-branch-commits/2010-May/003229.html>.

I'm not sure I understand the distinction between retl/retq. x86 has 4
return instructions (cribbing from the Intel manual):

C3	RET		Near return
CB	RET		Far return
C2 iw	RET imm16	Near return + pop imm16 bytes
CA iw	RET imm16	Far return + pop imm16 bytes

(And I think that's been true since the 8086.)

Distinguishing between near and far (e.g., ret vs. lret in AT&T or retn vs.
retf with some other assemblers) makes sense, but what would an l or q suffix
denote?

But more to the point, even if there's a good reason to accept retl/retq as
input, is there any reason to emit it ever?

-- 
Stephen Checkoway
