thr3ads.net - llvm dev - [LLVMdev] TSVC/Equivalencing-dbl [Oct 2012]

Hal Finkel

2012-Oct-05 18:32 UTC

[LLVMdev] TSVC/Equivalencing-dbl

----- Original Message -----
> From: "Duncan Sands" <duncan.sands at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Friday, October 5, 2012 12:10:03 PM
> Subject: Re: TSVC/Equivalencing-dbl
> 
> Oops, I ran the testsuite wrong: read clang output for dragonegg
> output.
Okay, can you resummarize? Do you mean that?

gcc -O0:
S1421         0.00                 16000

gcc -O0 under valgrind:
S1421         0.00                 17208.404325315

clang:
S1421    0.00           17208.404325315

This is all on Darwin, right?

I would certainly tend to suspect an 80-bit-intermediate issue, but, both gcc
and clang give 16000 on PowerPC (which has no 80-bit). It could be a rounding
issue, but would Darwin really have a different default rounding mode?

The computation being performed here is [in s1421() in tsc.inc]:
                for (int i = 0; i < LEN/2; i++) {
                        b[i] = xx[i] + a[i];
                }
So *if* we're adding up the same numbers in the same order, the answer
should be the same everywhere ;) Can you put in some print statements and
confirm?

Thanks again,
Hal
> 
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Duncan Sands

2012-Oct-05 19:50 UTC

Hi Hal,

On 05/10/12 20:32, Hal Finkel wrote:
> ----- Original Message -----
>> From: "Duncan Sands" <duncan.sands at gmail.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: llvmdev at cs.uiuc.edu
>> Sent: Friday, October 5, 2012 12:10:03 PM
>> Subject: Re: TSVC/Equivalencing-dbl
>>
>> Oops, I ran the testsuite wrong: read clang output for dragonegg
>> output.
>
> Okay, can you resummarize? Do you mean that?
>
> gcc -O0:
> S1421         0.00                 16000
>
> gcc -O0 under valgrind:
> S1421         0.00                 17208.404325315
>
> clang:
> S1421    0.00           17208.404325315
Exactly.  For "clang" this is only when building like the testsuite does
(i.e. with link-time optimization + llc): if you directly do:
   clang tsc.c dummy.c -std=gnu99 -O3
then you get 16000.
>
> This is all on Darwin, right?
No, this is on x86-64 (ubuntu) linux.
>
> I would certainly tend to suspect an 80-bit-intermediate issue, but, both
> gcc and clang give 16000 on PowerPC (which has no 80-bit).
Not sure what you are saying here.  The issue is that x86 internally uses 80 bits
for the 64 bit (double) type, so as long as everything is in registers you get
lots more precision, but the moment you store to memory only 64 bits are stored.
The fact that gcc and clang give the same result on PowerPC confirms that it is
coming from x86 using an extra 16 bits of precision beyond what you would expect.

> It could be a rounding issue, but would Darwin really have a different default
> rounding mode?

As I'm seeing this on linux, I guess not :)
>
> The computation being performed here is [in s1421() in tsc.inc]:
>                  for (int i = 0; i < LEN/2; i++) {
>                          b[i] = xx[i] + a[i];
>                  }
> So *if* we're adding up the same numbers in the same order, the answer
> should be the same everywhere ;)
No, why would it be the same everywhere?  If the whole thing is done in
double registers, an x86 processor will maintain 80 bits of precision
even though these are 64 bit (double) types, while if things are loaded
and stored to memory at every step instead then only 64 bits will be used.
This can lead to very different results.

> Can you put in some print statements and confirm?

Not sure what you want me to confirm, but anyway I now have 1/2 an hour to
look into this some more :)

Ciao, Duncan.
>
> Thanks again,
> Hal
>
>>
>

Duncan Sands

2012-Oct-05 20:08 UTC

PS: Here's how I can reproduce with clang on linux:

   clang -S -o tsc.ll -O0 -flto -std=gnu99 tsc.c
   clang -S -o dummy.ll -O0 -flto -std=gnu99 dummy.c
   opt -std-compile-opts tsc.ll -S -o tsc.1.ll
   opt -std-compile-opts dummy.ll -S -o dummy.1.ll
   llvm-link tsc.1.ll dummy.1.ll -S -o total.ll
   opt -std-link-opts total.ll -S -o total.1.ll
   llc total.1.ll
   gcc -o z total.1.s

The program z shows the problem.  Note that it is essential to have clang use
-O0 (not -O3).

Ciao, Duncan.

Hal Finkel

2012-Oct-05 20:26 UTC

----- Original Message -----
> From: "Duncan Sands" <duncan.sands at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Friday, October 5, 2012 2:50:06 PM
> Subject: Re: TSVC/Equivalencing-dbl
> 
> Hi Hal,
> 
> On 05/10/12 20:32, Hal Finkel wrote:
> > ----- Original Message -----
> >> From: "Duncan Sands" <duncan.sands at gmail.com>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: llvmdev at cs.uiuc.edu
> >> Sent: Friday, October 5, 2012 12:10:03 PM
> >> Subject: Re: TSVC/Equivalencing-dbl
> >>
> >> Oops, I ran the testsuite wrong: read clang output for dragonegg
> >> output.
> >
> > Okay, can you resummarize? Do you mean that?
> >
> > gcc -O0:
> > S1421         0.00                 16000
> >
> > gcc -O0 under valgrind:
> > S1421         0.00                 17208.404325315
> >
> > clang:
> > S1421    0.00           17208.404325315
> 
> exactly.  For "clang" this is only when building like the testsuite
> does (i.e. with link-time optimization + llc): if you directly do:
>    clang tsc.c dummy.c -std=gnu99 -O3
> then you get 16000.
> 
> >
> > This is all on Darwin, right?
> 
> No, this is on x86-64 (ubuntu) linux.
OIC, interesting!
> 
> >
> > I would certainly tend to suspect an 80-bit-intermediate issue,
> > but, both gcc and clang give 16000 on PowerPC (which has no
> > 80-bit).
> 
> Not sure what you are saying here.  The issue is that x86 internally uses
> 80 bits for the 64 bit (double) type, so as long as everything is in
> registers you get lots more precision, but the moment you store to memory
> only 64 bits are stored. The fact that gcc and clang give the same result
> on PowerPC confirms that it is coming from x86 using an extra 16 bits of
> precision beyond what you would expect.
> 
> > It could be a rounding issue, but would Darwin really have a
> > different default rounding mode?
> 
> As I'm seeing this on linux, I guess not :)
> 
> >
> > The computation being performed here is [in s1421() in tsc.inc]:
> >                  for (int i = 0; i < LEN/2; i++) {
> >                          b[i] = xx[i] + a[i];
> >                  }
> 
> 
> > So *if* we're adding up the same numbers in the same order, the
> > answer should be the same everywhere ;)
> 
> No, why would it be the same everywhere?  If the whole thing is done in
> double registers, an x86 processor will maintain 80 bits of precision
> even though these are 64 bit (double) types, while if things are loaded
> and stored to memory at every step instead then only 64 bits will be used.
> This can lead to very different results.
Right.
> 
> > Can you put in some print statements and confirm?
> 
> Not sure what you want me to confirm, but anyway I now have 1/2 an hour to
> look into this some more :)
For test s1421, we have:
                for (int i = 0; i < LEN/2; i++) {
                        b[i] = xx[i] + a[i];
                }

In this case xx points to the second half of the b array, and a is initialized
to 1/(i+1)^2. The b array, however, does not seem to be explicitly initialized
for this test. When all of the tests are run in order, it is initialized for
the last test in the previous group, s353... so maybe I screwed this up in
breaking apart the tests.

Thanks again,
Hal
> 
> Ciao, Duncan.
> 
> >
> > Thanks again,
> > Hal
> >
> >>
> >
> 
> 
-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Rao, Shivarama

2012-Oct-12 15:22 UTC

Hi,

There was an out-of-bounds array access in the test S1421. This has been fixed
and uploaded to the TSVC site by the TSVC maintainers. With this fix and Hal's
fix for proper initialization of arrays in the broken tests, the test should
work fine now.

Regards,
Shivaram

