On Wed, Apr 24, 2013 at 5:26 PM, Chris Lattner <clattner at apple.com> wrote:

> On Apr 24, 2013, at 5:01 PM, Dan Gohman <dan433584 at gmail.com> wrote:
>
>> In the spirit of the (long-term) intent to migrate away from the
>> SelectionDAG framework, it is desirable to implement legalization passes
>> as discrete passes. Attached is a patch which implements the beginning
>> of a new type legalization pass, to help motivate discussion.
>
> This is a great discussion to have.
>
>> Is LLVM IR the right level for this?
>
> IMO, no, definitely not.
>
>> The main alternative approach that's been discussed is to do FastISel to
>> a target-independent opcode set on MachineInstrs, and then do
>> legalization and ultimately the last phase of instruction selection
>> proper after that. The most obvious advantage of using LLVM IR for
>> legalization is that it's (currently) more developer-friendly. The most
>> obvious advantage of using MachineInstrs is that they would make it
>> easier to do low-level manipulations. Also, doing legalization on
>> MachineInstrs would mean avoiding having LLVM-IR-level optimization
>> passes which lower the IR, which has historically been a design goal of
>> LLVM.
>
> I think that you (in the rest of your email) identify a number of
> specific problems with using LLVM IR for legalization. These are a lot
> of specific issues caused by the fact that LLVM IR is intentionally not
> trying to model machine issues. I'm sure you *could* try to make this
> work by introducing a bunch of new intrinsics into LLVM IR which would
> model the union of the selection dag ISD nodes along with the target
> specific X86ISD nodes. However, at this point, you have only modeled the
> operations and haven't modeled the proper type system.

I don't wish to argue about this, and am fine following your suggestion.
However, I would like to understand your reasons better.

I don't think the type system is really the issue. The only thing
SelectionDAG's type system has which LLVM IR's lacks, and which is useful
here, is "untyped", and that's a special-purpose thing that we can
probably handle in other ways.

You and others are right that there could be a fair number of new
intrinsics, especially considering all the X86ISD ones and all the rest.
Is this a significant concern for you? Targets already have large numbers
of target-specific intrinsics; would adding a relatively moderate number
of new intrinsics really be a problem?

There's also the problem of keeping callers and callees consistent, and
it's indeed quite a dickens, but it need not be a show-stopper.

> LLVM IR is just not the right level for this. You seem to think it is
> better than MachineInstrs because of developer friendliness, but it
> isn't clear to me that LLVM IR with the additions you're talking about
> would actually be friendly anymore :-)

As I see it, people working in codegen are going to have to deal with lots
of codegeny instructions regardless of whether we call them instructions
or intrinsics. Is it really better one way or the other?

> Personally, I think that the right representation for legalization is
> MachineInstrs supplemented with a type system that allows MVTs as well
> as register classes. If you are seriously interested in pushing forward
> on this, we should probably discuss it in person, or over beer at the
> next social or something.

Ok.

Dan
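A concrete, hypothetical sketch of the kind of IR-level expansion under
discussion: splitting an illegal i64 add into legal i32 halves, using the
existing llvm.uadd.with.overflow intrinsic to model the carry. This is
illustrative only; it is not the patch Dan attached, the helper name is
invented, and the surrounding pass boilerplate is omitted.

    // Hypothetical sketch, not the attached patch: expanding an illegal
    // i64 add into two legal i32 adds in an IR-level legalization pass,
    // using llvm.uadd.with.overflow to model the carry between halves.
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Intrinsics.h"
    #include "llvm/IR/Module.h"
    using namespace llvm;

    static void expandI64Add(BinaryOperator *Add) {
      IRBuilder<> B(Add);
      Type *I32 = B.getInt32Ty();
      Value *LHS = Add->getOperand(0), *RHS = Add->getOperand(1);

      // Split each i64 operand into low and high i32 halves.
      Value *LLo = B.CreateTrunc(LHS, I32, "lhs.lo");
      Value *LHi = B.CreateTrunc(B.CreateLShr(LHS, 32), I32, "lhs.hi");
      Value *RLo = B.CreateTrunc(RHS, I32, "rhs.lo");
      Value *RHi = B.CreateTrunc(B.CreateLShr(RHS, 32), I32, "rhs.hi");

      // Add the low halves; the overflow bit is the carry into the highs.
      Function *UAdd = Intrinsic::getDeclaration(
          Add->getModule(), Intrinsic::uadd_with_overflow, I32);
      Value *LoPair = B.CreateCall(UAdd, {LLo, RLo});
      Value *Lo = B.CreateExtractValue(LoPair, 0, "sum.lo");
      Value *Carry = B.CreateZExt(B.CreateExtractValue(LoPair, 1), I32);

      // Add the high halves plus the carry, then reassemble the i64.
      Value *Hi = B.CreateAdd(B.CreateAdd(LHi, RHi), Carry, "sum.hi");
      Value *Res =
          B.CreateOr(B.CreateShl(B.CreateZExt(Hi, Add->getType()), 32),
                     B.CreateZExt(Lo, Add->getType()));
      Add->replaceAllUsesWith(Res);
      Add->eraseFromParent();
    }

The point of contention in the thread is visible even in this small
example: every operation stays within LLVM IR's type system, but the pass
deliberately produces code shaped for a 32-bit target, which is exactly
the "lowering in IR" that Chris objects to.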
.....

> I don't wish to argue about this, and am fine following your suggestion.
> However, I would like to understand your reasons better.

What would be the plan as far as incrementally achieving this alternate
implementation?

Why has "avoiding having LLVM-IR-level optimization passes which lower the
IR" historically been a design goal of LLVM?
James Courtier-Dutton
2013-Apr-26 07:50 UTC
[LLVMdev] Proposal for new Legalization framework
I don't know if it helps at all, but another option might be some sort of
target CPU modelling. I know TableGen does a lot of this, but a decompiler
called Boomerang (http://boomerang.sourceforge.net/) has some interesting
infrastructure in the area of CPU modelling. It uses a CPU modelling
language from which it automatically generates source code that assembles
and disassembles instructions. It might not be suitable, but I think it
would be worth having a look at Boomerang to see whether its CPU modelling
could be used in LLVM.
On Apr 25, 2013, at 2:00 PM, Reed Kotler <rkotler at mips.com> wrote:

> .....
>
>> I don't wish to argue about this, and am fine following your
>> suggestion. However, I would like to understand your reasons better.
>
> What would be the plan as far as incrementally achieving this alternate
> implementation?
>
> Why has "avoiding having LLVM-IR-level optimization passes which lower
> the IR" historically been a design goal of LLVM?

We obviously need an incremental migration plan. That said, personally, I
would prefer to figure out what the right destination is before we start
trying to discuss how to get there. I don't think that any other approach
makes much sense.

-Chris
Dan, and anyone else interested...

I am not sure if this has been discussed before, but I do have a case when
the following logic fails to work:

lib/Analysis/ConstantFolding.cpp

  static Constant *ConstantFoldBinaryFP(double (*NativeFP)(double, double),
                                        double V, double W, Type *Ty) {
    sys::llvm_fenv_clearexcept();
    V = NativeFP(V, W);
    if (sys::llvm_fenv_testexcept()) {
      sys::llvm_fenv_clearexcept();
      return 0;
    }
    ...

This fragment seems to assume that host and target behave in exactly the
same way in regard to FP exception handling. In some ways I understand it,
but... on some cross-compilation platforms this might not always be true.
In the case of Hexagon, for example, our FP math handling is apparently
more precise than the "stock" one on an x86 host. A specific (but not the
best) example would be computing sqrtf(1.000001): the result is 1 with
FE_INEXACT set. My current Linux x86 host fails the inexact part,
resulting in wrong code being emitted.

Once again, my question is not about this specific example, but rather
about the assumption of identical behavior of completely different
systems. What if my target's "objective" is to exceed IEEE precision?
...and I happen to have a set of tests to verify that I do :-)

Thank you for any comment.

Sergei

---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation
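For anyone wanting to reproduce the host-side behavior Sergei describes,
here is a small standalone program (an illustration, not from his mail)
mirroring what the sys::llvm_fenv_* helpers do: clear the FP exception
flags, call the native math function, then test FE_INEXACT. Whether the
flag gets raised for a given input is precisely the host-dependent
behavior in question.

    // Standalone illustration of the host-dependent check that
    // ConstantFoldBinaryFP performs. Build without optimization (e.g.
    // -O0, ideally with -frounding-math) so the sqrt call is not folded
    // away by the compiler building this test itself.
    #include <cfenv>
    #include <cmath>
    #include <cstdio>

    int main() {
      std::feclearexcept(FE_ALL_EXCEPT); // cf. sys::llvm_fenv_clearexcept
      volatile float X = 1.000001f;      // volatile keeps the call live
      float R = std::sqrt(X);
      int Inexact = std::fetestexcept(FE_INEXACT); // cf. llvm_fenv_testexcept
      std::printf("sqrtf(%.7g) = %.9g, FE_INEXACT=%d\n",
                  (double)X, (double)R, Inexact != 0);
      return 0;
    }

If two hosts disagree on the FE_INEXACT result here, they will also
disagree about whether ConstantFoldBinaryFP folds the corresponding call,
which is the cross-compilation hazard Sergei is pointing at.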
On Apr 25, 2013, at 8:58 AM, Dan Gohman <dan433584 at gmail.com> wrote:

>> I think that you (in the rest of your email) identify a number of
>> specific problems with using LLVM IR for legalization. These are a lot
>> of specific issues caused by the fact that LLVM IR is intentionally not
>> trying to model machine issues. I'm sure you *could* try to make this
>> work by introducing a bunch of new intrinsics into LLVM IR which would
>> model the union of the selection dag ISD nodes along with the target
>> specific X86ISD nodes. However, at this point, you have only modeled
>> the operations and haven't modeled the proper type system.
>
> I don't wish to argue about this, and am fine following your suggestion.
> However, I would like to understand your reasons better.

Sure, I'm happy to explain. I apologize if I came across overly strong
about this. This is something that has come up many times before.

> I don't think the type system is really the issue. The only thing
> SelectionDAG's type system has which LLVM IR's lacks which is useful
> here is "untyped", and that's a special-purpose thing that we can
> probably handle in other ways.

That's definitely a fair criticism. In my (often crazy) mind, I'd like to
solve a few problems in SelectionDAG that are not just an aspect of the
DAG representation.

One specific problem area with SelectionDAG (ignoring the DAG) is that
various steps (legalization, isel, etc.) want to introduce target-specific
operations that *require* a specific register class. The only way to
model that in SelectionDAG is by picking an MVT that happens to align with
it and hoping that the right thing happens downstream. It would be much
better if SelectionDAG (and its replacement) could represent register
classes directly in its type system. However, this is a really really bad
idea for LLVM IR, for hopefully obvious reasons.

> You and others are right that there could be a fair number of new
> intrinsics, especially considering all the X86ISD ones and all the rest.
> Is this a significant concern for you?

No, I'm not specifically concerned with the number of intrinsics.

> Targets already have large numbers of target-specific intrinsics; would
> adding a relatively moderate number of new intrinsics really be a
> problem?

No.

> There's also the problem of keeping callers and callees consistent, and
> it's indeed quite a dickens, but it need not be a show-stopper.

I consider this to be one (really important!) example of an invariant that
would have to be violated to make this plan happen. I think that (in
order to make this really work) we'd have to add a non-SSA LLVM IR,
potentially multiple return results, subregs, etc. I think it is a really
bad idea to make LLVM IR more complicated and worse to work with for the
benefit of codegen.

>> LLVM IR is just not the right level for this. You seem to think it is
>> better than MachineInstrs because of developer friendliness, but it
>> isn't clear to me that LLVM IR with the additions you're talking about
>> would actually be friendly anymore :-)
>
> As I see it, people working in codegen are going to have to deal with
> lots of codegeny instructions regardless of whether we call them
> instructions or intrinsics. Is it really better one way or the other?

The number of intrinsics is not a strong concern for me.

-Chris
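Chris's closing suggestion, a type system that allows MVTs as well as
register classes, can be made concrete with a small sketch. No such type
existed in the tree at the time; the class below is purely illustrative of
the idea (which resurfaced later in GlobalISel's LLT and register banks).

    // Purely illustrative: a value type for a post-IR legalization
    // representation that can be either a machine value type or a
    // concrete register class, as Chris describes. Not an actual LLVM
    // class; the name is invented.
    namespace llvm { class TargetRegisterClass; }

    class LegalizerType {
      enum KindTy { IsVT, IsRegClass } Kind;
      union {
        unsigned VT;                          // an MVT::SimpleValueType
        const llvm::TargetRegisterClass *RC;  // a specific register class
      };

    public:
      static LegalizerType getVT(unsigned SimpleVT) {
        LegalizerType T;
        T.Kind = IsVT;
        T.VT = SimpleVT;
        return T;
      }
      static LegalizerType getRegClass(const llvm::TargetRegisterClass *C) {
        LegalizerType T;
        T.Kind = IsRegClass;
        T.RC = C;
        return T;
      }
      bool isVT() const { return Kind == IsVT; }
      bool isRegClass() const { return Kind == IsRegClass; }
    };

With something like this, a legalization step that emits a
target-specific operation could require a register class directly, rather
than picking an MVT that happens to map onto it and hoping the right thing
happens downstream.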
Hi Sergei,

The degree to which LLVM actually makes any guarantees about IEEE
arithmetic precision is ambiguous. LangRef, for one, doesn't even mention
it (it mentions formats, but nothing else). The de-facto way of
interpreting holes in LangRef is to consider how the IR is used by clang,
follow the path up into the C and/or C++ standards, and then work from
there. C describes a binding to IEC 60559, but it is optional, and clang
doesn't opt in. C++ doesn't even have the option. So from an official
perspective, it's not clear that you have any basis to complain ;-).

I mention all this not to dismiss your concern, but to put it in context.
Right or wrong, much of the C/C++ software world is not that keenly
concerned with these matters. This includes LLVM in some respects. The
folding of floating-point library routines which you point out in LLVM is
one example of this.

One idea for addressing this would be to teach LLVM's TargetLibraryInfo to
carry information about how precise the target's library functions are.
Then, you could either implement soft-float functions within LLVM itself
for the affected library functions, or you could disable folding for those
functions which are not precise enough on the host (in non-fast-math
mode).

Another idea for addressing this would be to convince the LLVM community
that LLVM shouldn't constant-fold floating-point library functions at all
(in non-fast-math mode). I think you could make a reasonable argument for
this. There are ways to do this without losing much optimization -- such
expressions are still constant, after all, so they can be hoisted out of
any loop. They could even be hoisted out to main if you want.

It's also worth noting that this problem predates the implementation of
fast-math mode in LLVM's optimizer. Now that fast-math mode is available,
it may be easier to convince people to make the non-fast-math mode more
conservative. I don't know that everyone will accept this, but it's worth
considering.

Dan

On Fri, Apr 26, 2013 at 12:44 PM, Sergei Larin <slarin at codeaurora.org>
wrote:

> Dan, and anyone else interested...
>
> I am not sure if this has been discussed before, but I do have a case
> when the following logic fails to work:
> [...]
> Once again, my question is not about this specific example, but rather
> about the assumption of identical behavior of completely different
> systems. What if my target's "objective" is to exceed IEEE precision?
> ...and I happen to have a set of tests to verify that I do :-)
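A minimal sketch of what Dan's first suggestion might look like at the
site Sergei quoted. The TargetLibraryInfo query is hypothetical -- no such
API exists -- and stands in for whatever per-function precision
information TargetLibraryInfo might be taught to carry.

    // Hypothetical sketch: gate host-side folding on a per-target
    // precision guarantee. TargetLibraryInfo::isFoldingExactEnough()
    // does NOT exist; it stands in for the information Dan proposes
    // TargetLibraryInfo could carry. The rest mirrors the fragment
    // Sergei quoted from lib/Analysis/ConstantFolding.cpp.
    static Constant *ConstantFoldBinaryFP(double (*NativeFP)(double, double),
                                          double V, double W, Type *Ty,
                                          const TargetLibraryInfo *TLI,
                                          LibFunc::Func F) {
      // If the target's library function is more precise than the
      // host's (in results or exception flags), refuse to fold and
      // leave the call for the target runtime to evaluate.
      if (TLI && !TLI->isFoldingExactEnough(F))  // hypothetical query
        return 0;

      sys::llvm_fenv_clearexcept();
      V = NativeFP(V, W);
      if (sys::llvm_fenv_testexcept()) {
        sys::llvm_fenv_clearexcept();
        return 0;
      }
      // ... construct the folded ConstantFP from V, with the same
      // float/double handling as the existing code.
      return ConstantFP::get(Ty->getContext(), APFloat(V));
    }

Dan's second suggestion is simpler still: make the early return
unconditional in non-fast-math mode, so library calls are never folded at
compile time.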