thr3ads.net - llvm dev - [llvm-dev] [FP] Constant folding math library functions [Apr 2019]

Kaylor, Andrew via llvm-dev

2019-Apr-16 20:47 UTC

[llvm-dev] [FP] Constant folding math library functions

Thanks, Hal.

I hear what you are saying about the accuracy. The problem, from my perspective,
is trying to explain to users what they are going to get. The constant folding
may be as accurate as the lib call would have been, but it isn't necessarily
value safe. I've been operating on the assumption that LLVM's FP
optimizations are value safe unless fast math flags are used. For the most part
that appears to be true. This case breaks my assumption.

I realize that any call to a library function puts claims of value safety on
shaky ground, but the standard I'm going for is that you'll get the same
bitwise results compiling at -O0 as you will at -O2 (for instance).
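
To make the bitwise standard concrete, here is a minimal C sketch of how two results could be compared bit for bit rather than with `==`. The helper names (`bits_of`, `double_from_bits`, `bitwise_equal`, `ulp_distance`) are illustrative and do not come from this thread, and the second bit pattern in the usage note is a made-up neighbour of the folded value, not a measured libm result.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 64-bit pattern of a double. memcpy avoids aliasing problems. */
uint64_t bits_of(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Build a double from a raw 64-bit pattern. */
double double_from_bits(uint64_t u) {
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* 1 if a and b are identical bit for bit, 0 otherwise.
   Unlike ==, this distinguishes 0.0 from -0.0. */
int bitwise_equal(double a, double b) {
    return bits_of(a) == bits_of(b);
}

/* Distance in units in the last place between two doubles.
   Assumption to keep the sketch short: neither value is NaN and both
   have the same sign, so the bit patterns are ordered like the values. */
uint64_t ulp_distance(double a, double b) {
    assert(a == a && b == b);          /* reject NaN */
    uint64_t ua = bits_of(a), ub = bits_of(b);
    assert((ua >> 63) == (ub >> 63));  /* same sign only */
    return ua > ub ? ua - ub : ub - ua;
}
```

For example, `ulp_distance(double_from_bits(0x3FE7CF009CE7F169ull), double_from_bits(0x3FE7CF009CE7F16Aull))` is 1: two such values would normally print the same with `%f` and still fail a bitwise comparison.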

That said, I agree that the difference between constant folding a library call
and substituting an approximate calculation is significant. Most users would
probably prefer to have this optimization enabled by default. It just leads to a
kind of murky answer to the question of whether or not we're value safe by
default for the users who do care about that.

I guess what I'm saying is that I do like the idea of a separate flag for
this, though as I recall we're running out of bits for fast math flags.
I'm also not sure whether it should be on by default. If we want to permit
this transformation by default, then it shouldn't be a fast math flag.
Probably an attribute on the call site is better? And in that case it feels like
we'd be circling back toward "nobuiltin" but can the front end
identify which call sites would need that?

-Andy

From: Finkel, Hal J. <hfinkel at anl.gov>
Sent: Tuesday, April 16, 2019 1:01 PM
To: llvm-dev <llvm-dev at lists.llvm.org>; Kaylor, Andrew
<andrew.kaylor at intel.com>
Subject: Re: [FP] Constant folding math library functions

Hi, Andy,

This is somewhat tricky. 'afn' is for approximate functions, to
"allow substitution of approximate calculations for functions", but in
this case, the answers aren't any more approximate than the original
function calls. Different, but likely no less accurate. This has long caused
these kinds of subtle differences when cross compiling, etc. but it's not
clear what the best thing to do actually is. Users often want the constant
folding, and I've certainly seen code where the performance depends
critically on it, and yet, the compiler will likely never be able to exactly
replicate the behavior of whatever libm implementation is used at runtime. Maybe
having a dedicated flag to disable just this behavior, aside from suggesting
that users use -fno-builtin=..., would be useful for users who depend on the
compiler not folding these kinds of expressions in ways that might differ from
their runtime libm behavior?

 -Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Kaylor, Andrew
via llvm-dev <llvm-dev at lists.llvm.org>
Sent: Tuesday, April 16, 2019 2:23 PM
To: llvm-dev
Subject: [llvm-dev] [FP] Constant folding math library functions


Hi everyone,

I noticed today that LLVM's constant folding of math library functions can
lead to minor differences in results. A colleague sent me the following test
case which demonstrates the issue:

#include <stdio.h>
#include <math.h>

typedef union {
  double d;
  unsigned long long i;
} my_dbl;

int main(void) {
  my_dbl res, x;
  x.i = 0x3feeb39556255de2ull;
  res.d = tanh(x.d);
  printf("tanh(%f) = %f = %016LX\n", x.d, res.d, res.i);
  return 0;
}

Compiling with "clang -O2 -g0 -emit-llvm" I get this:

define dso_local i32 @main() local_unnamed_addr #0 {
  %1 = tail call double @tanh(double 0x3FEEB39556255DE2) #2
  %2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([24 x i8], [24 x i8]* @.str, i64 0, i64 0),
                                        double 0x3FEEB39556255DE2, double 0x3FE7CF009CE7F169,
                                        i64 4604876745549017449)
  ret i32 0
}

We're still calling 'tanh' but all the values passed to printf are
constant folded. The constant folding is based on a call to tanh made by the
compiler. The problem with this is that if I am linking my program against a
different version of the math library than was used by the compiler I may get a
different result.

I can prevent this constant folding with either the 'nobuiltin' or
'strictfp' attribute. However, it seems to me like this optimization
should really be checking the 'afn' fast math flag.
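
Besides those attributes, a source-level way to keep a single call out of the folder's reach is to hide both the callee and the argument behind `volatile`. This is a common workaround rather than anything proposed in this thread; `call_unfolded` and the stand-in function `halve` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Type of a one-argument math function such as tanh. */
typedef double (*unary_fn)(double);

/* Call fn(x) through volatile objects. The compiler has to reload the
   pointer and the argument at run time, so it can neither recognise the
   callee as a known library function nor evaluate the call at compile time. */
double call_unfolded(unary_fn fn, double x) {
    assert(fn);                 /* a null callee would be a caller bug */
    unary_fn volatile fp = fn;  /* hides which function is being called */
    volatile double arg = x;    /* hides the constant argument */
    return fp(arg);
}

/* Hypothetical stand-in for a libm function, so that the sketch
   builds without linking the math library. */
double halve(double x) {
    return x * 0.5;
}
```

With `<math.h>` included, `call_unfolded(tanh, x.d)` would always reach the libm that the program is linked against, at the cost of blocking every other optimization on that call.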



Opinions?

Thanks,
Andy



Amara Emerson via llvm-dev

2019-Apr-16 21:18 UTC


[llvm-dev] [FP] Constant folding math library functions

> On Apr 16, 2019, at 1:47 PM, Kaylor, Andrew via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> I guess what I’m saying is that I do like the idea of a separate flag for
> this, though as I recall we’re running out of bits for fast math flags. I’m also
> not sure whether it should be on by default. If we want to permit this
> transformation by default, then it shouldn’t be a fast math flag. Probably an
> attribute on the call site is better? And in that case it feels like we’d be
> circling back toward “nobuiltin” but can the front end identify which call sites
> would need that?

Could this not be a function attribute if it’s intended to be consistent across
entire functions/programs?

I agree with Hal that afn doesn’t sound like the right approach, and in terms of
how the compiler actually treats these calls (I’m thinking about more than
constant folding here) it seems to me that this is the same as
-fno-builtin. For example, can the optimizer assume some properties about the
result value of a call if it knows some (partial) information about the
argument (e.g. sign)? If we prevent constant folding then that also precludes
this kind of optimization. Perhaps an umbrella flag like -fno-builtin-math-lib
that would turn on -fno-builtin for all of the libm functions?

Amara

Finkel, Hal J. via llvm-dev

2019-Apr-17 17:31 UTC


[llvm-dev] [FP] Constant folding math library functions

On 4/16/19 4:18 PM, Amara Emerson wrote:


> Could this not be a function attribute if it’s intended to be consistent across
> entire functions/programs?
>
> I agree with Hal that afn doesn’t sound like the right approach, and in terms of
> how the compiler actually treats these calls (I’m thinking about more than
> constant folding here) it seems to me that this is the same as
> -fno-builtin. For example, can the optimizer assume some properties about the
> result value of a call if it knows some (partial) information about the
> argument (e.g. sign)? If we prevent constant folding then that also precludes
> this kind of optimization. Perhaps an umbrella flag like -fno-builtin-math-lib
> that would turn on -fno-builtin for all of the libm functions?


I certainly think that the umbrella flag is a good approach. The tricky part is
implementing it so that Clang does not need to have a list of the relevant
functions that LLVM's optimizer knows about (thus, I don't think that we
can implement this just by having Clang add metadata to a predefined list of
math functions).

 -Hal




--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
