thr3ads.net - similar to: "[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86"

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 28

On Wed, Aug 28, 2013 at 2:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> I found that there is no diff in code generator for Ordered / Unordered FP compare instructions.
> FUCOMISS, FUCOMISD are generated in both cases.
Yes. That's how fcmp is defined in LangRef.
-Eli

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

Should I open a ticket for this? - Elena From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Wednesday, August 28, 2013 19:51 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Ordered / Unordered FP compare are not handled properly on X86 On Wed, Aug 28, 2013 at 2:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com<mailto:elena.demikhovsky at

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

On 29 August 2013 10:12, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions.
I think LLVM uses ordered/unordered compare to mean something different to what the x86 instructions do. For example, "not equal": fcmp une == unordered not
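A minimal C++ sketch of the distinction drawn in this reply, assuming the LangRef definitions: an ordered predicate yields false when either operand is NaN, and an unordered predicate yields true in that case. The helper names (`fcmp_oeq`, `fcmp_ueq`, `fcmp_one`, `fcmp_une`, `check_nan_cases`) are hypothetical and only model the IR predicates. They say nothing about which x86 instruction the backend selects, and they assume the code is built without `-ffast-math`.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar models of four LLVM fcmp predicates.
// "Unordered" here means that at least one operand is NaN.
bool fcmp_oeq(float a, float b) { return !std::isunordered(a, b) && a == b; }
bool fcmp_ueq(float a, float b) { return std::isunordered(a, b) || a == b; }
bool fcmp_one(float a, float b) { return !std::isunordered(a, b) && a != b; }
bool fcmp_une(float a, float b) { return std::isunordered(a, b) || a != b; }

// Checks the NaN cases, where ordered and unordered predicates disagree.
void check_nan_cases() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(!fcmp_oeq(nan, 1.0f));  // ordered: false if either operand is NaN
    assert(fcmp_ueq(nan, 1.0f));   // unordered: true if either operand is NaN
    assert(!fcmp_one(nan, 1.0f));
    assert(fcmp_une(nan, 1.0f));
}
```

In this model the ordered/unordered split only changes the result when a NaN is involved. Whether a floating-point exception is signalled for a quiet NaN is a separate question, and it is not captured by the sketch.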

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

On 29 Aug 2013, at 08:19, Tim Northover <t.p.northover at gmail.com> wrote:
> If so, a compare that used that instruction would have to become more like an "invoke" with a landingpad for the exception and so on, wouldn't it? The current fcmp can already distinguish between ordered and unordered, because ucomiss provides that information.
There are currently

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions. But during DAG selection we just lose this information and always generate unordered fcmp. I.e. in case of ordered fcmp the vcomiss should be generated, and in case of unordered - vucomiss. - Elena -----Original Message----- From: Dr D. Chisnall [mailto:dc552 at

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

On 29 August 2013 06:31, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> Should I open a ticket for this?
I think he was saying this is intended behaviour. Isn't the difference between ucomiss and comiss just whether an exception is raised for NaN? If so, a compare that used that instruction would have to become more like an "invoke" with a landingpad for the

Working on FP SCEV Analysis

2016 May 19

> One option would be to extend InductionDescriptor::isInductionPHI in the vectorizer to directly analyze the PHIs without SCEV support as Sanjoy suggested. I *think* that that could be sufficient to handle case B. I implemented this with FP SCEV and the code looks very structured, including SCEVExpander. Extending the existing structures without implementing FP SCEV will be problematic. And

Working on FP SCEV Analysis

2016 May 20

Hi Hideki, I like this summary overall, thanks. More below.
> On May 20, 2016, at 10:04 AM, Saito, Hideki <hideki.saito at intel.com> wrote:
> To the best of my experience, handling case B (secondary induction) is a must-have, and if I’m not mistaken, people aren’t opposed to that.
> For me, handling case A (primary induction) is “why not?”, but I certainly

Working on FP SCEV Analysis

2016 May 16

[+CC Andy] Hi Elena, I don't have any fundamental issues with teaching SCEV about floating point types, but given this will be a major change, I think a high level roadmap should be discussed on llvm-dev before we start reviewing and committing changes. Here are some issues that I think are worth discussing: - Core motivation: why do we even care about optimizing floating point

Working on FP SCEV Analysis

2016 May 20

To the best of my experience, handling case B (secondary induction) is a must-have, and if I’m not mistaken, people aren’t opposed to that. For me, handling case A (primary induction) is “why not?”, but I certainly admit that that can be very naïve thinking coming from a lack of good understanding of SCEV and its proper usage. Now, let’s assume we can postpone discussion about case A. What is the

Working on FP SCEV Analysis

2016 May 24

Adding support for FP inductions through isInductionPHI() is certainly possible, I have a relatively small local patch that does exactly that for simple fp add-recurrence cases, along with changes to the vectorizer to make it aware of FP inductions. It won't give you the powerful reasoning capabilities of SCEV, but for the B-like cases it should work. Amara On 20 May 2016 at 19:31, Adam
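As a rough illustration of what a "B-like" case might look like, the C++ sketch below shows a loop with an integer primary induction `i` and a secondary floating-point induction `x` that advances by a constant step, next to a closed form that computes each value from `i` directly. The function names are hypothetical, and the closed form is an assumption about how such a recurrence could be rewritten, not a description of the patch mentioned above. The two versions are only guaranteed to agree when every intermediate value is exactly representable; in general they can differ by rounding, which is one reason such rewrites depend on relaxed floating-point flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Secondary FP induction: x follows the add-recurrence {start, +, step}
// while the integer i drives the loop.
std::vector<float> fill_recurrence(float start, float step, std::size_t n) {
    std::vector<float> out(n);
    float x = start;
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = x;
        x += step;  // loop-carried floating-point addition
    }
    return out;
}

// Closed form: each element depends only on i, so there is no loop-carried
// FP dependence. Rounding may differ from the recurrence in general.
std::vector<float> fill_closed_form(float start, float step, std::size_t n) {
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = start + static_cast<float>(i) * step;
    return out;
}

// With start = 0 and step = 0.5 every value is exactly representable,
// so both forms produce identical results.
void check_exact_case() {
    assert(fill_recurrence(0.0f, 0.5f, 8) == fill_closed_form(0.0f, 0.5f, 8));
}
```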

Working on FP SCEV Analysis

2016 May 18

Demikhovsky, Elena wrote: > > Even then, I'd personally want to see further evidence of why the > correct solution is to model the floating point IV in SCEV rather than > find a more powerful way of converting the IV to an integer that models > > the non-integer values taken on by the IV. As an example, if the use > case is the following code with appropriate flags to

Working on FP SCEV Analysis

2016 May 18

On Tue, May 17, 2016 at 8:49 PM Owen Anderson <resistor at mac.com> wrote:
> On May 16, 2016, at 2:42 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > - Core motivation: why do we even care about optimizing floating point induction variables? What situations are they common in? Do programmers _expect_ compilers to optimize them

how to force llvm generate gather intrinsic

2016 Feb 26

If I'm understanding correctly, you're saying that vgather* is slow on all of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will not generate it for any of those machines. Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() && !hasAVX512()". It could break for some hypothetical future processor that manages to

how to force llvm generate gather intrinsic

2016 Feb 25

It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for AVX/2? Thanks.
Best, Zhi
On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com> wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:

how to force llvm generate gather intrinsic

2016 Feb 26

That makes great sense. It would be great if we had a profitability mode to see the necessity to use gathers. Or it would also be good if there were a compiler option for users to make LLVM generate the gather instructions whether it is faster or slower.
Best, Zhi
On Fri, Feb 26, 2016 at 12:49 PM, Sanjay Patel <spatel at rotateright.com> wrote:
> If I'm understanding

how to force llvm generate gather intrinsic

2016 Feb 25

Yes, masked load/store/gather/scatter are completed. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 01:20 To: Demikhovsky, Elena <elena.demikhovsky at intel.com> Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] how to

[LLVMdev] Adding masked vector load and store intrinsics

2014 Oct 27

we just follow a common recommendation to start with intrinsics: http://llvm.org/docs/ExtendingLLVM.html - Elena From: Owen Anderson [mailto:resistor at mac.com] Sent: Sunday, October 26, 2014 23:57 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu; dag at cray.com Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics What is the motivation for using intrinsics

[LLVMdev] Adding masked vector load and store intrinsics

2014 Oct 28

Many overloaded intrinsics may be replaced with instructions - fabs or fma or sqrt. Chandler will probably explain the criteria. What is the difference between fma and fadd? Or fptrunc and fabs? A new instruction like %a = loadm <4 x i32>* %addr, <4 x i32> %passthru, i32 4, <4 x i1> %mask is possible, but may not be very useful for most targets. So we start from intrinsics. -
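To make the operands of the hypothetical `loadm` above concrete, here is a scalar C++ sketch of a four-lane masked load. It assumes the semantics usually associated with masked loads: a lane whose mask bit is set reads from memory, and a lane whose mask bit is clear takes the corresponding `passthru` element without touching memory. The names `Vec4`, `Mask4`, `masked_load` and `check_masked_off_lane` are hypothetical, and the alignment operand (`i32 4`) is left out of the model.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::int32_t, 4>;
using Mask4 = std::array<bool, 4>;

// Scalar model of a 4-lane masked load: lane i is read from addr[i] only
// when mask[i] is set; otherwise it keeps passthru[i] and addr[i] is never
// dereferenced.
Vec4 masked_load(const std::int32_t* addr, const Vec4& passthru, const Mask4& mask) {
    Vec4 result = passthru;
    for (std::size_t i = 0; i < result.size(); ++i) {
        if (mask[i]) {
            result[i] = addr[i];
        }
    }
    return result;
}

// The buffer holds only three elements; the fourth lane is masked off,
// so the load stays within bounds.
void check_masked_off_lane() {
    const std::int32_t mem[3] = {1, 2, 3};
    const Vec4 got = masked_load(mem, Vec4{9, 9, 9, 9}, Mask4{true, true, true, false});
    assert((got == Vec4{1, 2, 3, 9}));
}
```

The point of the masked-off lane not being read is that the operation stays safe at the end of a buffer, which a plain vector load followed by a select would not guarantee.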

how to force llvm generate gather intrinsic

2016 Feb 26

No. Gather operation is slow on AVX2 processors. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 20:48 To: Sanjay Patel <spatel at rotateright.com> Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] how to force
