similar to: Default FPENV state

Displaying 20 results from an estimated 400 matches similar to: "Default FPENV state"

2017 Jul 20
2
[RFC] dereferenceable metadata
Indeed. But the problem here is that Dinar is trying to keep information after a load/store is removed by instcombine For example: v4sf v = {p[0], p[1], p[2], p[3]}; v4sf v2 = shuffle(v, 0, 0, 2, 2); Some pass comes in and removes the p[3] and p[1]. Now you have smaller code, but lost the ability to use a vector load for all those values + shuffle. The code got scalarized because we lost the
2017 Sep 26
0
RFC phantom memory intrinsic
On 09/26/2017 08:31 AM, Dinar Temirbulatov wrote: > Hi Hal, >> Are you primarily concerned with being able to widen loads later in the pipeline? Could we attached metadata to the remaining loads indicating that it would be legal to widen them? > no, I don't have any concerns about intrinsic way of implementation, > and intrinsic way looks safer for me since we somehow detach our
2017 Sep 26
2
RFC phantom memory intrinsic
Hi Hal, >Are you primarily concerned with being able to widen loads later in the pipeline? Could we attached metadata to the remaining loads indicating that it would be legal to widen them? no, I don't have any concerns about intrinsic way of implementation, and intrinsic way looks safer for me since we somehow detach our information about memory from that actual load instruction. I updated
2017 Sep 13
2
RFC phantom memory intrinsic
Hi Michael, >I have a case where InstCombine removes a store and your approach would be >valuable for me if the entire access to an aggregate could be restored. Yes, no problem and we could add the aggregate pointer to this new intrinsic and in my particular case I should ignore it, but I am looking now at "speculation_marker" metadata and I am still not sure how to implement it
2017 Jul 18
2
[RFC] dereferenceable metadata
Hi, While working on PR21780, I used "dereferenceable_or_null" metadata and I realized now that it is not correct for my solution to use this metadata type since it might point to an address that it is not dereferenceable but null. I think that we need another new metadata type, something like "dereferenceable" with that we could annotate any load (not just pointer type like
2017 Sep 12
3
RFC phantom memory intrinsic
Hi, For PR21780 solution, I plan to add a new functionality to restore memory operations that was once deleted, in this particular case it is the load operations that were deleted by InstCombine, please note that once the load was removed there is no way to restore it back and that prevents us from vectorizing the shuffle operation. There are probably more similar issues where this approach could
2017 Sep 26
0
RFC phantom memory intrinsic
On 09/13/2017 04:46 PM, Dinar Temirbulatov via llvm-dev wrote: > Hi Michael, >> I have a case where InstCombine removes a store and your approach would be >> valuable for me if the entire access to an aggregate could be restored. > Yes, no problem and we could add the aggregate pointer to this new > intrinsic and in my particular case I should ignore it, but I am > looking
2017 Sep 13
2
RFC phantom memory intrinsic
Hi Michael, >Interesting approach but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? Only one offset does not seem enough to handle generic cases. Yes, correct, this a little bit changed example is not working. #include <x86intrin.h> __m256d vsht_d4_fold(const double* ptr, unsigned long long i) { __m256d foo = (__m256d){ ptr[i], ptr[i+1],
2014 Dec 26
2
[LLVMdev] X86 disassembler & assembler mismatch
hi, some instructions mismatch between assembler & disassembler, like below. it seems this happens with all SSECC related instructions? thanks, Jun $ echo "cmpps xmm1, xmm2, 23" | ./Release+Asserts/bin/llvm-mc -assemble -triple=x86_64 --output-asm-variant=1 -x86-asm-syntax=intel -show-encoding .text cmpps xmm1, xmm2, 23 # encoding: [0x0f,0xc2,0xca,0x17] $
2014 Dec 26
2
[LLVMdev] X86 disassembler & assembler mismatch
The IMM3/IMM5 come from here X86RecognizableInstr.cpp 943 TYPE("SSECC", TYPE_IMM3) 944: TYPE("AVXCC", TYPE_IMM5) On Thu, Dec 25, 2014 at 8:22 PM, Jun Koi <junkoi2004 at gmail.com> wrote: > > > On Fri, Dec 26, 2014 at 11:54 AM, Jun Koi <junkoi2004 at gmail.com> wrote: > >> hi, >> >> some instructions
2018 Oct 02
3
[FPEnv] FNEG instruction
On Tue, Oct 2, 2018 at 12:09 PM Kevin Neal <Kevin.Neal at sas.com> wrote: > If we don’t have constrained intrinsics for some of the fp math > instructions then aren’t we risking non-strict optimizations? > So far we've only added constrained FP intrinsics for operations that have side effects (i.e. can trap). The quiet-computational sign-bit operations are special. They never
2018 Aug 21
3
[FPEnv] FNEG instruction
Hey llvm-dev, Continuing a discussion from D50913... A group working on the FP rounding mode and trap-safety project have run into a situation where it would make sense to add a new FNEG instruction and decouple the existing FNEG<->FSUB transformations. The IEEE-754 Standard (Section 5.5.1) makes it clear that neg(x) and sub(-0.0,x) are two different operations. The former is a bitwise
2018 Sep 26
2
[FPEnv] FNEG instruction
Well, yes, they are different operations. And, yes, this needs to be corrected. This wasn’t my point. It’s a given. I was getting at the _declared_ absence of side effects and what promises we make to anyone using the new fneg instruction. Is this a promise we want to make? From: Cameron McInally <cameron.mcinally at nyu.edu> Sent: Wednesday, September 26, 2018 2:30 PM To: Kevin Neal
2018 Sep 26
2
[FPEnv] FNEG instruction
Do we really want to have fneg be the only instruction with guaranteed no side effects? That just sounds like a gotcha waiting to happen. Or it could result in horrible code depending on the architecture. I’m still leaning towards having both an intrinsic and an instruction, and if they happen to have the same behavior then that’s fine. If fneg is to be a special instruction with extra promises
2018 Aug 29
3
[FPEnv] FNEG instruction
On Wed, 29 Aug 2018 at 07:51, Cameron McInally via llvm-dev <llvm-dev at lists.llvm.org> wrote: > The current thinking is that FNEG(X) and FSUB(-0.0, X) are not the same operation when X is a NaN or 0. Do you mean denormals (when flushed) rather than 0 there? AFAIK it's OK for 0 itself. > So, the xforms in question should only be valid under Fast-Math conditions. We could
2018 Aug 29
2
[FPEnv] FNEG instruction
> On Aug 29, 2018, at 1:22 PM, Cameron McInally via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > FSUB(-0.0, NaN) = NaN > FSUB(-0.0, -NaN) = NaN Some specific architecture may define this, or APFloat might, but IEEE 754 does not interpret the sign of NaN except in four operations (copy, abs, negate, copysign), so it doesn’t say anything about these. – Steve --------------
2018 Aug 30
2
[FPEnv] FNEG instruction
On Thu, Aug 30, 2018 at 11:14 AM, Tim Northover <t.p.northover at gmail.com> wrote: > ... > I don't think it matters for the question at hand, but I tested > AArch64 too and it exhibits the behaviour you were describing. That > is, we'd have problems if an fsub -0.0 was actually CodeGened like > that (it's not, of course). Great data point. So it's not just
2018 Sep 26
2
[FPEnv] FNEG instruction
I have no example side effects in hand. But LLVM targets a bunch of architectures, and who knows what the future holds. So it may be prudent to not promise too much so as to leave ourselves an escape hatch. Doesn’t LLVM target some chips that have floating point instruction sets that are not IEEE compliant? Can we be certain that no new LLVM target will ever have to jump through hoops to avoid
2018 Sep 26
3
[FPEnv] FNEG instruction
On Wed, Sep 26, 2018 at 9:32 AM Sanjay Patel <spatel at rotateright.com> wrote: > > > On Tue, Sep 25, 2018 at 7:47 PM Cameron McInally <cameron.mcinally at nyu.edu> > wrote: > >> >> This is the first time I'm looking at foldShuffledBinop(...), so maybe a >> naive question, but why not do similar shuffle canonicalizations on unary >> (or
2018 Sep 27
2
[FPEnv] FNEG instruction
Regarding non-IEEE targets: yes, we definitely support those, so we do have to be careful about not breaking them. I know because I have broken them. :) See the discussion and related links here: https://reviews.llvm.org/D19391 But having an exactly specified fneg op makes that easier, not harder, as I see it. Unfortunately, if a target doesn't support this op (always toggle the sign bit and