search for: 1.0f

Displaying 20 results from an estimated 91 matches for "1.0f".

Did you mean: 1.00
2008 Jun 17
2
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
Hi Nate! I don't see how that would work. Select doesn't work per element. Say we're trying to vectorize the following C++ code: if(v[0] < 0) v[0] += 1.0f; if(v[1] < 0) v[1] += 1.0f; if(v[2] < 0) v[2] += 1.0f; if(v[3] < 0) v[3] += 1.0f; With SSE assembly this would be as simple as: movaps xmm1, xmm0 // v in xmm0 cmpltps xmm1, zero // zero =
2011 Dec 28
3
[LLVMdev] InstCombine "pessimizes" trunc i8 to i1?
>> Hi! >> >> before InstCombine (llvm::createInstructionCombiningPass()) I have >> a trunc from i8 to i1 and then a select: >> >> %45 = load i8* @myGlobal, align 1 >> %tobool = trunc i8 %45 to i1 >> %cond = select i1 %tobool, float 1.000000e+00, float -1.000000e+00 >> >> after instCombine I have: >> >> %29 = load i8*
2011 Dec 29
0
[LLVMdev] InstCombine "pessimizes" trunc i8 to i1?
I think Chris is saying that the and is necessary because with your i1 trunc you're ignoring all of the high bits. The and implements that. If you don't want this behavior, don't generate the trunc in the first place and just compare the full width to zero. Reid On Wed, Dec 28, 2011 at 6:45 AM, Jochen Wilhelmy <j.wilhelmy at arcor.de>wrote: > > >> Hi! >
2008 Jun 16
0
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
On Jun 13, 2008, at 12:27 AM, Nicolas Capens wrote: > Hi all, > > When trying to generate a VFCmp instruction when UnsafeFPMath is set > to true I get an assert “Unexpected CondCode” on my x86 system. This > also happens with UnsafeFPMath set to false and using an unordered > compare. Could someone look into this? > > While I’m at it, is there any reason why only the
2007 Jun 12
0
[PATCH] Combined checkFTB and capDirection into one checkOrientation function.
--- include/cube.h | 18 +++------ plugins/cube.c | 120 +++++++++++++++++-------------------------------------- 2 files changed, 43 insertions(+), 95 deletions(-) diff --git a/include/cube.h b/include/cube.h index 0a87626..293bad1 100644 --- a/include/cube.h +++ b/include/cube.h @@ -87,16 +87,11 @@ typedef void (*CubePaintInsideProc) (CompScreen *s, CompOutput *output,
2009 Dec 07
2
[LLVMdev] a constant folding case causing different results w/ot optimization
Hi, The optimizer folds "(unsigned int) -1.0f" to 0. As a result, the following routine foo returns 0 with optimization, and returns -1 (0xffffffff) without optimization. unsigned int foo() { float myfloat = -1.0f; unsigned int myint = (unsigned int) myfloat; return myint; } INSTCOMBINE ITERATION #0 on IC: ConstFold to: i32 0 from: %conv = fptoui float
2002 Nov 26
1
sprintf crash (PR#2327)
>2) Under MacOS 10.2.2. Given the following dataset: > > > summary(missing.data) > Hosp..No. Category Offset Side > r060093: 4 F Post op 0.5 year :62 Min. :-5.8333 L: 63 > r023250: 3 E Pre op 0.5 year :54 1st Qu.:-0.4167 R:141 > r026316: 3 H Post op 1.5 years :44 Median : 7.7083 > r032583: 3 G Post op 1 year
2009 Jan 08
2
[LLVMdev] Loop elimination with floating point counter.
Hi Devang, Thanks. Yes, in the case variable i incremented by 1.0f is optimized. I don't know why... Anyway, I've filed this problem into bugzilla(Bug 3299) -- Syoyo On Fri, Jan 9, 2009 at 12:42 AM, Devang Patel <dpatel at apple.com> wrote: > > On Jan 8, 2009, at 4:36 AM, Syoyo Fujita wrote: > >> Hi LLVM-ers, >> >> I'd like to eliminate dead loop
2008 Jun 13
6
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
Hi all, When trying to generate a VFCmp instruction when UnsafeFPMath is set to true I get an assert "Unexpected CondCode" on my x86 system. This also happens with UnsafeFPMath set to false and using an unordered compare. Could someone look into this? While I'm at it, is there any reason why only the most significant bit of the return value of VFCmp is defined (according to
2011 Dec 28
0
[LLVMdev] InstCombine "pessimizes" trunc i8 to i1?
On Dec 27, 2011, at 5:09 AM, Jochen Wilhelmy wrote: > Hi! > > before InstCombine (llvm::createInstructionCombiningPass()) I have > a trunc from i8 to i1 and then a select: > > %45 = load i8* @myGlobal, align 1 > %tobool = trunc i8 %45 to i1 > %cond = select i1 %tobool, float 1.000000e+00, float -1.000000e+00 > > after instCombine I have: > > %29 = load i8*
2011 Dec 27
2
[LLVMdev] InstCombine "pessimizes" trunc i8 to i1?
Hi! before InstCombine (llvm::createInstructionCombiningPass()) I have a trunc from i8 to i1 and then a select: %45 = load i8* @myGlobal, align 1 %tobool = trunc i8 %45 to i1 %cond = select i1 %tobool, float 1.000000e+00, float -1.000000e+00 after instCombine I have: %29 = load i8* @myGlobal, align 1 %30 = and i8 %29, 1 %tobool = icmp ne i8 %30, 0 %cond = select i1 %tobool, float 1.000000e+00,
2009 Jan 08
3
[LLVMdev] Loop elimination with floating point counter.
Isn't "simplifying" the loop index to an integer also dependent on precision issues? The following loop is infinite: for (float i=0.0; i<...; i+=1.0) {} where ... is a large "integral" float literal, e.g. 2^40. At some point, adding 1.0 to the loop index would not cause any changes because of precision issues. However, if the float type is replaced by some
2009 Jan 08
0
[LLVMdev] Loop elimination with floating point counter.
It's because with 1.0f, the loop index is simplified into an integer. With 1.2f, it isn't. The loop deletion pass is dependent on the loop analyses being able to determine that the loop is finite, which they don't attempt to do for loops with floating point indices. Attempting to do so would require additional reasoning about floating point precision issues. --Owen On
2009 Jul 28
0
[PATCH 2/8] nv50: fix viewport transform
The translation also needs to be inverted, and in bypass mode the state tracker incorrectly assumes that Y = 0 = TOP, so we need inversion there to; NDC clipping has to be deactivated explicitly. --- src/gallium/drivers/nv50/nv50_state_validate.c | 31 +++++++++++++++-------- 1 files changed, 20 insertions(+), 11 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_state_validate.c
2008 Nov 26
2
[LLVMdev] clang and overloaded functions
In clang is it possible to declare C-language function prototypes for overloaded functions even though it is not C-language legal. It's because I see that GLSL provides something like this, and I'm wondering if clang does too and how to express it either through command line argument or language attributes, etc.? int f(int, int); float f(float, float); int g() { float x = f(1.0f,
2016 May 18
4
Working on FP SCEV Analysis
On Tue, May 17, 2016 at 8:49 PM Owen Anderson <resistor at mac.com> wrote: > > On May 16, 2016, at 2:42 PM, Sanjoy Das via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > - Core motivation: why do we even care about optimizing floating > point induction variables? What situations are they common in? Do > programmers _expect_ compilers to optimize them
2009 Jul 12
0
[PATCH 2/3] nv50: fix viewport transform
We need to invert the viewport translate/scale parameters when the state tracker thinks we have Y_0_TOP. If these cases, we have do to bypass mode by setting an identity viewport transform for x, z and inversion for y, or p.e. clear_with_quad won't work correctly. Clipping for xy in NDC space needs to be disabled then. --- src/gallium/drivers/nv50/nv50_context.h | 1 +
2009 Jan 08
0
[LLVMdev] Loop elimination with floating point counter.
On Jan 8, 2009, at 4:36 AM, Syoyo Fujita wrote: > Hi LLVM-ers, > > I'd like to eliminate dead loop with floating point counter using > LLVM, but the following loop wasn't optimized by opt. > > void > func() { > float i; > for (i = 0.0f; i < 1000.0f; i += 1.2f) { > } > } FWIW, LLVM optimizer can eliminate this loop if i is incremented by 1.0f
2017 Jun 11
0
[RFC 4/9] tgsi: populate precise
Only implemented for glsl->tgsi. Other converters just set precise to 0. Signed-off-by: Karol Herbst <karolherbst at gmail.com> --- src/gallium/auxiliary/tgsi/tgsi_build.c | 3 +++ src/gallium/auxiliary/tgsi/tgsi_ureg.c | 14 +++++++--- src/gallium/auxiliary/tgsi/tgsi_ureg.h | 20 +++++++++++--- src/gallium/auxiliary/util/u_simple_shaders.c | 2 +-
2019 Apr 04
3
EuroLLVM Numerics info
Hi Micheal, Thanks for the blog post. Just like to point out few things that I thought is related to FP Numerics. LLVM could do some additional transformation with "sqrt" and "division" under fast math on X86 like 1/sqrt(x)* 1/sqrt(x) to 1/x. These are long latency instructions and could get benefit if enabled under unsafe math. Also are we considering doing such FP