thr3ads.net - similar to: "Default FPENV state"

Displaying 20 results from an estimated 400 matches similar to: "Default FPENV state"

2017 Jul 20

[RFC] dereferenceable metadata

Indeed. But the problem here is that Dinar is trying to keep information after a load/store is removed by InstCombine. For example: v4sf v = {p[0], p[1], p[2], p[3]}; v4sf v2 = shuffle(v, 0, 0, 2, 2); Some pass comes in and removes the loads of p[3] and p[1]. Now you have smaller code, but you have lost the ability to use a vector load for all those values plus a shuffle. The code got scalarized because we lost the
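The lane bookkeeping in the example above can be sketched in a few lines of Python. This is only an illustration of why p[1] and p[3] look dead to an optimizer; `shuffle` and `live_lanes` are hypothetical helpers, not LLVM APIs, and the input values are made up.

```python
def shuffle(vec, mask):
    """Build a new vector by picking vec[i] for each index i in mask."""
    return [vec[i] for i in mask]


def live_lanes(mask, width):
    """Report, per source lane, whether the shuffle mask ever reads it."""
    used = set(mask)
    return [lane in used for lane in range(width)]


# Assumed sample memory contents standing in for p[0..3].
p = [10.0, 11.0, 12.0, 13.0]
v = [p[0], p[1], p[2], p[3]]
v2 = shuffle(v, [0, 0, 2, 2])

# Lanes 1 and 3 are never read, so the loads of p[1] and p[3] are dead.
# Once they are deleted, nothing records that p[0..3] was one contiguous,
# loadable block, which is the information the thread wants to preserve.
dead = [lane for lane, live in enumerate(live_lanes([0, 0, 2, 2], 4)) if not live]
```

Here `dead` holds the lane numbers whose loads a dead-code pass is free to drop.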

2017 Sep 26

RFC phantom memory intrinsic

On 09/26/2017 08:31 AM, Dinar Temirbulatov wrote:
> Hi Hal,
>> Are you primarily concerned with being able to widen loads later in the pipeline? Could we attach metadata to the remaining loads indicating that it would be legal to widen them?
> no, I don't have any concerns about intrinsic way of implementation,
> and intrinsic way looks safer for me since we somehow detach our

2017 Sep 26

RFC phantom memory intrinsic

Hi Hal,
> Are you primarily concerned with being able to widen loads later in the pipeline? Could we attach metadata to the remaining loads indicating that it would be legal to widen them?
No, I don't have any concerns about the intrinsic approach, and the intrinsic approach looks safer to me since we somehow detach our information about memory from the actual load instruction. I updated

2017 Sep 13

RFC phantom memory intrinsic

Hi Michael,
> I have a case where InstCombine removes a store and your approach would be
> valuable for me if the entire access to an aggregate could be restored.
Yes, no problem: we could add the aggregate pointer to this new intrinsic, and in my particular case I would ignore it. But I am now looking at the "speculation_marker" metadata and I am still not sure how to implement it

2017 Jul 18

[RFC] dereferenceable metadata

Hi, While working on PR21780, I used the "dereferenceable_or_null" metadata, and I now realize that it is not correct for my solution to use this metadata type, since it might point to an address that is not dereferenceable but null. I think that we need another new metadata type, something like "dereferenceable", with which we could annotate any load (not just pointer type like

2017 Sep 12

RFC phantom memory intrinsic

Hi, For the PR21780 solution, I plan to add new functionality to restore memory operations that were once deleted. In this particular case these are the load operations that were deleted by InstCombine. Please note that once a load is removed there is no way to restore it, and that prevents us from vectorizing the shuffle operation. There are probably more similar issues where this approach could

2017 Sep 26

RFC phantom memory intrinsic

On 09/13/2017 04:46 PM, Dinar Temirbulatov via llvm-dev wrote:
> Hi Michael,
>> I have a case where InstCombine removes a store and your approach would be
>> valuable for me if the entire access to an aggregate could be restored.
> Yes, no problem and we could add the aggregate pointer to this new
> intrinsic and in my particular case I should ignore it, but I am
> looking

2017 Sep 13

RFC phantom memory intrinsic

Hi Michael,
> Interesting approach but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? Only one offset does not seem enough to handle generic cases.
Yes, correct, this slightly modified example does not work.
#include <x86intrin.h>
__m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
  __m256d foo = (__m256d){ ptr[i], ptr[i+1],

2014 Dec 26

[LLVMdev] X86 disassembler & assembler mismatch

Hi,
some instructions mismatch between the assembler and the disassembler, as shown below. It seems this happens with all SSECC-related instructions?
Thanks,
Jun

$ echo "cmpps xmm1, xmm2, 23" | ./Release+Asserts/bin/llvm-mc -assemble -triple=x86_64 --output-asm-variant=1 -x86-asm-syntax=intel -show-encoding
.text
cmpps xmm1, xmm2, 23 # encoding: [0x0f,0xc2,0xca,0x17]
$
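As a rough aid to reading the encoding bytes quoted above, the sketch below splits [0x0f,0xc2,0xca,0x17] into its ModRM fields and immediate. `decode_cmpps` and `SSE_PREDICATES` are hypothetical names for illustration, not part of llvm-mc. The snippet is truncated before the disassembler output, so the masking at the end only illustrates what a 3-bit versus a 5-bit predicate field would keep of the immediate 23; it does not claim to reproduce what the tool printed.

```python
# Non-VEX CMPPS predicates selected by the low 3 bits of imm8.
SSE_PREDICATES = ["eq", "lt", "le", "unord", "neq", "nlt", "nle", "ord"]


def decode_cmpps(encoding):
    """Split the 4-byte register-register form 0F C2 /r ib into fields."""
    escape, opcode, modrm, imm8 = encoding
    assert (escape, opcode) == (0x0F, 0xC2), "not a CMPPS encoding"
    return {
        "mod": modrm >> 6,        # 0b11 means both operands are registers
        "reg": (modrm >> 3) & 7,  # destination register number (xmm1 here)
        "rm": modrm & 7,          # source register number (xmm2 here)
        "imm8": imm8,             # comparison predicate byte
    }


fields = decode_cmpps([0x0F, 0xC2, 0xCA, 0x17])

# A 3-bit field keeps only the low bits of 23; a 5-bit field keeps all of it.
low3 = fields["imm8"] & 0b111
low5 = fields["imm8"] & 0b11111
```

With these bytes, `low3` selects the "ord" entry of the table while `low5` is still 23, which is one plausible way an assembler and disassembler could disagree about the same instruction.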

2014 Dec 26

[LLVMdev] X86 disassembler & assembler mismatch

The IMM3/IMM5 come from here:
X86RecognizableInstr.cpp
943: TYPE("SSECC", TYPE_IMM3)
944: TYPE("AVXCC", TYPE_IMM5)

On Thu, Dec 25, 2014 at 8:22 PM, Jun Koi <junkoi2004 at gmail.com> wrote:
> On Fri, Dec 26, 2014 at 11:54 AM, Jun Koi <junkoi2004 at gmail.com> wrote:
>> hi,
>>
>> some instructions

2018 Oct 02

[FPEnv] FNEG instruction

On Tue, Oct 2, 2018 at 12:09 PM Kevin Neal <Kevin.Neal at sas.com> wrote:
> If we don’t have constrained intrinsics for some of the fp math
> instructions then aren’t we risking non-strict optimizations?

So far we've only added constrained FP intrinsics for operations that have side effects (i.e. can trap). The quiet-computational sign-bit operations are special. They never

2018 Aug 21

[FPEnv] FNEG instruction

Hey llvm-dev, Continuing a discussion from D50913... A group working on the FP rounding mode and trap-safety project have run into a situation where it would make sense to add a new FNEG instruction and decouple the existing FNEG<->FSUB transformations. The IEEE-754 Standard (Section 5.5.1) makes it clear that neg(x) and sub(-0.0,x) are two different operations. The former is a bitwise

2018 Sep 26

[FPEnv] FNEG instruction

Well, yes, they are different operations. And, yes, this needs to be corrected. This wasn’t my point. It’s a given. I was getting at the _declared_ absence of side effects and what promises we make to anyone using the new fneg instruction. Is this a promise we want to make?

From: Cameron McInally <cameron.mcinally at nyu.edu>
Sent: Wednesday, September 26, 2018 2:30 PM
To: Kevin Neal

2018 Sep 26

[FPEnv] FNEG instruction

Do we really want to have fneg be the only instruction with guaranteed no side effects? That just sounds like a gotcha waiting to happen. Or it could result in horrible code depending on the architecture. I’m still leaning towards having both an intrinsic and an instruction, and if they happen to have the same behavior then that’s fine. If fneg is to be a special instruction with extra promises

2018 Aug 29

[FPEnv] FNEG instruction

On Wed, 29 Aug 2018 at 07:51, Cameron McInally via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> The current thinking is that FNEG(X) and FSUB(-0.0, X) are not the same operation when X is a NaN or 0.

Do you mean denormals (when flushed) rather than 0 there? AFAIK it's OK for 0 itself.

> So, the xforms in question should only be valid under Fast-Math conditions.

We could

2018 Aug 29

[FPEnv] FNEG instruction

> On Aug 29, 2018, at 1:22 PM, Cameron McInally via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> FSUB(-0.0, NaN) = NaN
> FSUB(-0.0, -NaN) = NaN

Some specific architecture may define this, or APFloat might, but IEEE 754 does not interpret the sign of NaN except in four operations (copy, abs, negate, copysign), so it doesn’t say anything about these.
– Steve
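The distinction discussed in this thread can be sketched in Python, assuming Python floats are IEEE 754 binary64 values. `fneg` below models negate as a pure sign-bit flip, while `fsub_from_neg_zero` is an ordinary arithmetic subtraction; both names, and the `bits`/`from_bits` helpers, are hypothetical and only illustrate the two operations, not LLVM's implementation.

```python
import math
import struct

SIGN_BIT = 1 << 63


def bits(x):
    """Return the raw 64-bit pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(b):
    """Build a double from a raw 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def fneg(x):
    """Negate as a bitwise operation: flip the sign bit, touch nothing else."""
    return from_bits(bits(x) ^ SIGN_BIT)


def fsub_from_neg_zero(x):
    """Negate via arithmetic: -0.0 - x, subject to the usual NaN rules."""
    return -0.0 - x


# For zeros and ordinary values the two agree bit for bit
# (under the default round-to-nearest mode).
same_on_zero = all(
    bits(fneg(x)) == bits(fsub_from_neg_zero(x)) for x in (0.0, -0.0, 1.5)
)

# For a NaN input, fneg always flips the sign bit. The subtraction only
# promises to return some NaN; its sign is not specified by IEEE 754.
qnan = from_bits(0x7FF8000000000000)
fneg_flips_nan_sign = bits(fneg(qnan)) == bits(qnan) ^ SIGN_BIT
fsub_returns_nan = math.isnan(fsub_from_neg_zero(qnan))
```

The sketch deliberately makes no claim about the sign of the NaN produced by the subtraction, since that is the part the standard leaves open.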

2018 Aug 30

[FPEnv] FNEG instruction

On Thu, Aug 30, 2018 at 11:14 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> ...
> I don't think it matters for the question at hand, but I tested
> AArch64 too and it exhibits the behaviour you were describing. That
> is, we'd have problems if an fsub -0.0 was actually CodeGened like
> that (it's not, of course).

Great data point. So it's not just

2018 Sep 26

[FPEnv] FNEG instruction

I have no example side effects in hand. But LLVM targets a bunch of architectures, and who knows what the future holds. So it may be prudent to not promise too much so as to leave ourselves an escape hatch. Doesn’t LLVM target some chips that have floating point instruction sets that are not IEEE compliant? Can we be certain that no new LLVM target will ever have to jump through hoops to avoid

2018 Sep 26

[FPEnv] FNEG instruction

On Wed, Sep 26, 2018 at 9:32 AM Sanjay Patel <spatel at rotateright.com> wrote:
> On Tue, Sep 25, 2018 at 7:47 PM Cameron McInally <cameron.mcinally at nyu.edu> wrote:
>> This is the first time I'm looking at foldShuffledBinop(...), so maybe a
>> naive question, but why not do similar shuffle canonicalizations on unary
>> (or

2018 Sep 27

[FPEnv] FNEG instruction

Regarding non-IEEE targets: yes, we definitely support those, so we do have to be careful about not breaking them. I know because I have broken them. :) See the discussion and related links here: https://reviews.llvm.org/D19391 But having an exactly specified fneg op makes that easier, not harder, as I see it. Unfortunately, if a target doesn't support this op (always toggle the sign bit and
