similar to: [RFC] Making space for a flush-to-zero flag in FastMathFlags

Displaying 20 results from an estimated 1000 matches similar to: "[RFC] Making space for a flush-to-zero flag in FastMathFlags"

2019 Mar 18
2
[RFC] Making space for a flush-to-zero flag in FastMathFlags
On Sun, Mar 17, 2019 at 1:47 PM Craig Topper <craig.topper at gmail.com> wrote: > Can we move HasValueHandle out of the byte used for SubClassOptionalData and move it to the flags at the bottom of value by shrinking NumUserOperands to 27? I like this approach because it is less work for me. :) But I agree with Sanjay below that this only kicks the can slightly further down the road
2019 Mar 18
3
[RFC] Making space for a flush-to-zero flag in FastMathFlags
We knew the day when we needed another FMF bit was coming back in: https://reviews.llvm.org/D39304 ...it was just a question of 'when'. :) I'm guessing that an FTZ bit won't be the last new bit needed if we consider permutations between strict FP and fast-math. Even without that, denormals-as-zero (DAZ) might also be useful? So rather than continuing to carve these out bit-by-bit,
2012 Nov 16
2
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
Another round of improved patches, and a patch for documentation changes to LangRef. * Make comments more up to date * Use 'arcp' instead of 'ar' * Use logical || Still based off of r168110 On Nov 15, 2012, at 3:31 PM, Michael Ilseman <milseman at apple.com> wrote: > > On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote: >
2017 Oct 04
2
Trouble when suppressing a portion of fast-math-transformations
> It might be clearer, instead of using 'libm', to use something like 'trans' (for transcendental functions). That does seem clearer. ‘trans’ is definitely good with me. -Warren From: Hal Finkel [mailto:hfinkel at anl.gov] Sent: Tuesday, October 3, 2017 5:13 PM To: Ristow, Warren; Bruce Hoult Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] Trouble when suppressing a
2012 Nov 15
2
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
Though semantically equivalent in this case, however I think you should use logical ors here not bitwise. + bool any() { + return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros | + AllowReciprocal; + } Gripe: This pattern is probably super fast and has precedence… but the code is non-obvious: SubclassOptionalData = (SubclassOptionalData & ~BitToSet) | (B * BitToSet); This is
2012 Nov 16
0
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
Michael, Overall the code looks good. 80-cols: 2046 FMF.UnsafeAlgebra = 0 != (Record[OpNum] & (1 << bitc::FMF_UNSAFE_ALGEBRA)); 2047 FMF.NoNaNs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_NANS)); 2048 FMF.NoInfs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_INFS)); 2049 FMF.NoSignedZeros = 0 !=
2012 Nov 15
0
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote: > Though semantically equivalent in this case, however I think you should use logical ors here not bitwise. > > + bool any() { > + return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros | > + AllowReciprocal; > + } > Will do. > Gripe: This pattern is probably super fast and has
2012 Nov 15
0
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
Trying to apply patches.. What's your base revision? Joe On Nov 15, 2012, at 5:44 PM, Michael Ilseman <milseman at apple.com> wrote: > New patches with review feedback incorporated: > * Changed single letter flags to short abbreviations ('S' ==> 'nsz') > * Indentation fixes > * Comments don't state function names > >
2012 Nov 15
3
[LLVMdev] [PATCH] fast-math patches!
New patches with review feedback incorporated: * Changed single letter flags to short abbreviations ('S' ==> 'nsz') * Indentation fixes * Comments don't state function names -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Fast-math-flags-added-to-FPMathOperator.patch Type: application/octet-stream Size: 4937 bytes Desc: not
2017 Sep 30
3
Trouble when suppressing a portion of fast-math-transformations
Hi Hal, >> 4. To fix this, I think that additional fast-math-flags are likely >> needed in the IR. Instead of the following set: >> >> 'nnan' + 'ninf' + 'nsz' + 'arcp' + 'contract' >> >> something like this: >> >> 'reassoc' + 'libm' + 'nnan' + 'ninf' + 'nsz' +
2017 Oct 03
2
Trouble when suppressing a portion of fast-math-transformations
On 10/01/2017 06:05 PM, Sanjay Patel wrote: > Are we confident that we just need those 7 bits to represent all of > the relaxed FP states that we need/want to support? > > I'm asking because FMF in IR is currently mapped onto the > SubclassOptionalData of Value...and we have exactly 7 bits there. :) > > If we're redoing the definitions, I'm wondering if we can
2016 Nov 16
3
RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags
Hi, Thanks for the quick feedback. I see your points, but I have a few questions/comments. I'll start at the end of the previous post: > ... > I think these are valuable problems to solve, but you should tackle them piece by piece: > > 1) the clang part of overriding the individual FMF and emitting the right IR is the first thing to fix. > 2) the backend is still using the
2017 Oct 02
2
Trouble when suppressing a portion of fast-math-transformations
I'm not aware of any additional bits needed. But putting us right at the edge leaves me uncomfortable. So an implementation that isn't limited by the 7 bits in SubclassOptionalData seems sensible. Thanks, -Warren From: Sanjay Patel [mailto:spatel at rotateright.com] Sent: Monday, October 2, 2017 12:06 AM To: Ristow, Warren Cc: Hal Finkel; llvm-dev at lists.llvm.org Subject: Re:
2016 Nov 17
4
RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags
I don’t really like the idea of updating checks of UnsafeAlgebra() to depend on all of the other flags. It seems like it would be preferable to look at each optimization and figure out which flags it actually requires. I suspect that in many cases the “new” flag (i.e. allowing reassociation, etc.) will be what is actually needed anyway. I would be inclined to agree with Niolai’s suggestion of
2012 Nov 14
4
[LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level
On Nov 14, 2012, at 12:47 PM, Chris Lattner <clattner at apple.com> wrote: > > On Nov 14, 2012, at 12:28 PM, Michael Ilseman <milseman at apple.com> wrote: > >> I think I missed what problem we're trying to solve here. >> >> I'm looking at implementing the bitcode now. I have code to successfully read and write out the LLVM IR textual formal
2017 Jul 01
2
[PATCH 1/2] nv110/exa: Remove depbars
Removed explicit depar instructions as they're not used by the blob anymore. Signed-off-by: Aaryaman Vasishta <jem456.vasishta at gmail.com> --- src/shader/exac8nv110.fp | 5 ++--- src/shader/exac8nv110.fpc | 10 ++++------ src/shader/exacanv110.fp | 5 ++--- src/shader/exacanv110.fpc | 10 ++++------ src/shader/exacmnv110.fp | 5 ++--- src/shader/exacmnv110.fpc | 10 ++++------
2016 Oct 16
2
[PATCH] exa: add GM10x acceleration support
rendercheck -f a8r8g8b8 passes as much as on a GK208, and xv appears to work. Very lightly tested. Instead of sticking coordinates into pushbufs, the vertex shader is modified to read them from a constbuf, indexed by vertex id. This approach could be used for all nvc0 generations, but I didn't want to rock the boat. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Note: this
2017 Jun 10
2
[PATCH v3] nv110/exa: update sched codes
This patch adds proper delays to maxwell exa shaders. rendercheck tests seem consistent with/without this patch. I haven't extensively tested them though. Trello: https://trello.com/c/6LPB2EIS/174-update-maxwell-shaders-with-proper-delays Signed-off-by: Aaryaman Vasishta <jem456.vasishta at gmail.com> --- src/shader/exac8nv110.fp | 10 +++++----- src/shader/exac8nv110.fpc | 18
2017 Jun 03
2
[PATCH v2] nv110/exa: update sched codes
v2: Add missing delays This patch adds proper delays to maxwell exa shaders. rendercheck tests seem consistent with/without this patch. I haven't extensively tested them though. Trello: https://trello.com/c/6LPB2EIS/174-update-maxwell-shaders-with-proper-delays Signed-off-by: Aaryaman Vasishta <jem456.vasishta at gmail.com> --- src/shader/exac8nv110.fp | 10 +++++-----
2017 Jun 28
1
[PATCH v4] nv110/exa: update sched codes
Hi, On Wed, Jun 28, 2017 at 12:53 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: > BTW, you can drop those explicit "depbar" ops. I think they're only > needed when you're doing something weird with barriers. Blob doesn't > use them (anymore) > Gotcha. Should I remove them in the same patch or a different one? It seems like the depbar removal is