thr3ads.net - search: "uadd"

2016 May 09

2

x.with.overflow semantics question

...l transform define void @test1(i64 %a, i64 %b, i64* %res_i64, i1* %res_i1) { entry: %add = add i64 %b, %a %cmp = icmp ult i64 %add, %a store i1 %cmp, i1* %res_i1 store i64 %add, i64* %res_i64 ret void } to define void @test1(i64 %a, i64 %b, i64* %res_i64, i1* %res_i1) { entry: %uadd.overflow = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %b, i64 %a) %uadd = extractvalue { i64, i1 } %uadd.overflow, 0 %overflow = extractvalue { i64, i1 } %uadd.overflow, 1 store i1 %overflow, i1* %res_i1 store i64 %uadd, i64* %res_i64 ret void } Now if we _know_ that the ari...
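The rewrite in this snippet relies on the fact that an unsigned add wraps exactly when the sum ends up smaller than one of its operands. A minimal C sketch of that equivalence follows. The function names are made up for illustration, and `__builtin_add_overflow` is the GCC/Clang builtin, assumed here as a C-level analogue of `llvm.uadd.with.overflow`; it is not taken from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Form 1: plain add followed by an unsigned compare
 * (the "add + icmp ult" shape from the snippet). */
bool uadd_overflow_cmp(uint64_t a, uint64_t b, uint64_t *sum) {
    *sum = b + a;     /* unsigned add wraps modulo 2^64 */
    return *sum < a;  /* true exactly when the add wrapped */
}

/* Form 2: checked add through the GCC/Clang builtin (an assumption of
 * this sketch, standing in for the with.overflow intrinsic). */
bool uadd_overflow_builtin(uint64_t a, uint64_t b, uint64_t *sum) {
    return __builtin_add_overflow(b, a, sum);
}
```

Both functions store the wrapped sum and return the same flag for any pair of inputs, which is why the two IR forms can be substituted for one another.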

[LLVMdev] why llvm does not have uadd, iadd node

2015 Feb 17

5

Hi guys, I just noticed that LLVM has separate nodes for signed and unsigned operations (like udiv, sdiv), but why do ADD and SUB not have the counterparts sadd, uadd? Best, Kevin

Some llvm questions (for tgsi backend)

2016 Jan 11

4

...TBAA"} And the "tgsi" looks like this: .text .file "/home/hans/foo.cl" .globl test_kern test_kern: BGNSUB MOVis TEMP1x, 0 CAL _Z13get_global_idj SHLs TEMP1y, TEMP1x, 7 LOADiis TEMP1z, [4] UADDs TEMP1y, TEMP1z, TEMP1y SHLs TEMP1x, TEMP1x, 2 LOADiis TEMP1z, [0] UADDs TEMP1x, TEMP1z, TEMP1x LOADgis TEMP1x, [TEMP1x] INEGs TEMP1x, TEMP1x LOADgis TEMP1z, [TEMP1y] UADDs TEMP1x, TEMP1x, TEMP1z STOREgis [TEMP1y], TEMP1x...

[LLVMdev] `llvm.$op.with.overflow`, InstCombine and ScalarEvolution

2015 Mar 26

4

...inc, %loop ] %idx.inc = add i8 %idx, 1 %to.optimize = icmp ule i8 %idx, %sum call void @side_effect(i1 %to.optimize) %c = icmp ule i8 %idx.inc, %sum br i1 %c, label %loop, label %exit exit: ret void } ``` This happens because `-instcombine` does the following transform: ``` entry: %uadd = call { i8, i1 } @llvm.uadd.with.overflow.i8(i8 %x, i8 %y) %0 = extractvalue { i8, i1 } %uadd, 0 %e = extractvalue { i8, i1 } %uadd, 1 br i1 %e, label %exit, label %loop.preheader ``` and ScalarEvolution can no longer see through the `extractvalue` of the call to `llvm.uadd.with.overflow.i8...

Some llvm questions (for tgsi backend)

2016 Jan 12

1

....file "/home/hans/foo.cl" >> .globl test_kern >> test_kern: >> BGNSUB >> MOVis TEMP1x, 0 >> CAL _Z13get_global_idj >> SHLs TEMP1y, TEMP1x, 7 >> LOADiis TEMP1z, [4] >> UADDs TEMP1y, TEMP1z, TEMP1y >> SHLs TEMP1x, TEMP1x, 2 >> LOADiis TEMP1z, [0] >> UADDs TEMP1x, TEMP1z, TEMP1x >> LOADgis TEMP1x, [TEMP1x] >> INEGs TEMP1x, TEMP1x >> LOADgis TEMP1z, [TEMP1y] >> UADDs...

[Mesa-dev] llvm TGSI backend (WIP) questions

2015 Nov 18

1

...rom llvm trunk, not sure > what llvm version you are using). > > To use llc: > > llc -march=tgsi input.ll -o - > > > This will output TGSI. So after some bugfixing to fix a bunch of segfaults I get: $ bin/llc -march=tgsi ../test/CodeGen/AMDGPU/add.ll -o - # BB#0: UADDs TEMP0x, TEMP0x, 0 LOADgis TEMP1z, [TEMP1y] UADDs TEMP1y, TEMP1y, 4 LOADgis TEMP1y, [TEMP1y] UADDs TEMP1y, TEMP1z, TEMP1y STOREgis [TEMP1x], TEMP1y UADDs TEMP0x, TEMP0x, 0 RET ENDSUB and add.ll has: ;FUNC-LABEL: {{^}}test1: ;...

[LLVMdev] why llvm does not have uadd, iadd node

2015 Feb 17

2

...r <t.p.northover at gmail.com> wrote: > Hi Kevin, > > On 17 February 2015 at 10:41, kewuzhang <kewu.zhang at amd.com> wrote: >> I just noticed that the LLVM has some node for signed/unsigned type( like udiv, sdiv), but why the ADD, SUB do not have the counter part sadd, uadd? > > That's because in 2s complement arithmetic the bit pattern of the > result doesn't depend on whether the operation is signed (unlike > multiplication & division). > > Cheers. > > Tim.
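The point about two's complement can be checked directly: adding the same bit patterns gives the same result bits whether they are read as signed or unsigned, while division does not. The C sketch below is illustrative only. The helper names are hypothetical, and it assumes the usual two's complement conversion between `uint32_t` and `int32_t` (as GCC and Clang provide) and inputs whose signed sum does not overflow.

```c
#include <stdint.h>

/* Add, treating the operands as unsigned 32-bit values. */
uint32_t add_as_unsigned(uint32_t a, uint32_t b) {
    return a + b;
}

/* Add, treating the same bit patterns as signed 32-bit values.
 * Assumes two's complement conversion and no signed overflow. */
uint32_t add_as_signed(uint32_t a, uint32_t b) {
    return (uint32_t)((int32_t)a + (int32_t)b);
}

/* Divide, treating the operands as unsigned (like udiv). */
uint32_t div_as_unsigned(uint32_t a, uint32_t b) {
    return a / b;
}

/* Divide, treating the same bit patterns as signed (like sdiv). */
uint32_t div_as_signed(uint32_t a, uint32_t b) {
    return (uint32_t)((int32_t)a / (int32_t)b);
}
```

With `a = 0xFFFFFFFE` (which reads as -2 when signed) and `b = 3`, both additions produce the bit pattern `0x00000001`. Dividing `0xFFFFFFFE` by 2 gives `0x7FFFFFFF` unsigned but `0xFFFFFFFF` (that is, -1) signed, so the division needs two distinct operations.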

[hexagon][PowerPC] code regression (sub-optimal code) on LLVM 9 when generating hardware loops, and the "llvm.uadd" intrinsic.

2019 Jul 01

0

...lvm-dev-bounces at lists.llvm.org> On Behalf Of Joan Lluch via llvm-dev Sent: Sunday, June 30, 2019 2:04 PM To: llvm-dev <llvm-dev at lists.llvm.org> Subject: [EXT] [llvm-dev] [hexagon][PowerPC] code regression (sub-optimal code) on LLVM 9 when generating hardware loops, and the "llvm.uadd" intrinsic. Hi All, The following code : void hexagon2( int *a, int *res ) { int i = 100; while ( i-- ) { *res++ = *a++; } } gets compiled as a sub-optimal Software loop on LLVM 9.0 instead of a Hardware loop, whereas it was compiled as a Hardware Loop in LLVM 7.0. This is the...

Translating tests/trivial/compute.c gallium tests to opencl (input / help wanted)

2015 Dec 22

0

...D[0]\n" "DCL SV[1], BLOCK_SIZE[0]\n" "DCL SV[2], GRID_SIZE[0]\n" @@ -452,13 +451,15 @@ static void test_system_values(struct context *ctx) @@ -452,13 +451,15 @@ static void test_system_values(struct context *ctx) " UADD TEMP[0].xy, TEMP[0].xyxy, TEMP[0].zwzw\n" " UADD TEMP[0].x, TEMP[0].xxxx, TEMP[0].yyyy\n" " UMUL TEMP[0].x, TEMP[0], IMM[0]\n" - " STORE RES[0].xyzw, TEMP[0], SV[0]\n" + " LOAD TEMP[1].x, R...

x.with.overflow semantics question

2016 May 08

3

Hi Pete, > Or do you mean that the result of an add may not even be defined? In that case would reading it be considered UB in the case where the overflow bit was set? Yeah, this is the case I'm worried about: that for example sadd.with.overflow(INT_MAX, 1) might be designed to return { poison, true } instead of giving a useful result in the first element of the struct. John
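For comparison, the C-level checked-add builtin in GCC and Clang stores the wrapped two's complement result even when it reports overflow. The sketch below shows that behaviour for the `INT_MAX + 1` case; the struct and function names are illustrative. It describes the C builtin only, and does not by itself settle the question raised in the thread about what the LLVM intrinsic guarantees.

```c
#include <limits.h>
#include <stdbool.h>

/* Mirrors the { value, overflow-bit } pair that the intrinsic returns.
 * The type and function names are hypothetical. */
typedef struct {
    int value;
    bool overflow;
} sadd_result;

sadd_result checked_sadd(int a, int b) {
    sadd_result r;
    /* GCC/Clang builtin: stores the wrapped sum and returns true
     * if the mathematically exact sum did not fit in an int. */
    r.overflow = __builtin_add_overflow(a, b, &r.value);
    return r;
}
```

Calling `checked_sadd(INT_MAX, 1)` sets the overflow flag and leaves `INT_MIN` in `value`.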

[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)

2014 Oct 03

2

...| 8 ++ > lib/Target/R600/EvergreenInstructions.td | 3 + > lib/Target/R600/R600ISelLowering.cpp | 39 +++++++- > test/CodeGen/R600/add.ll | 154 +++++++++++++++++-------------- > test/CodeGen/R600/sub.ll | 18 ++-- > test/CodeGen/R600/uaddo.ll | 17 +++- > test/CodeGen/R600/usubo.ll | 23 ++++- > 9 files changed, 189 insertions(+), 81 deletions(-) > > diff --git a/lib/Target/R600/AMDGPUISelLowering.h b/lib/Target/R600/AMDGPUISelLowering.h > index 911576b..6eaf001 100644 > --- a/lib/Tar...

Some llvm questions (for tgsi backend)

2016 Jan 11

0

...like this: > > .text > .file "/home/hans/foo.cl" > .globl test_kern > test_kern: > BGNSUB > MOVis TEMP1x, 0 > CAL _Z13get_global_idj > SHLs TEMP1y, TEMP1x, 7 > LOADiis TEMP1z, [4] > UADDs TEMP1y, TEMP1z, TEMP1y > SHLs TEMP1x, TEMP1x, 2 > LOADiis TEMP1z, [0] > UADDs TEMP1x, TEMP1z, TEMP1x > LOADgis TEMP1x, [TEMP1x] > INEGs TEMP1x, TEMP1x > LOADgis TEMP1z, [TEMP1y] > UADDs TEMP1x, TEMP1x, TEMP1z >...

Some llvm questions (for tgsi backend)

2016 Jan 11

0

...ike this: > > .text > .file "/home/hans/foo.cl" > .globl test_kern > test_kern: > BGNSUB > MOVis TEMP1x, 0 > CAL _Z13get_global_idj > SHLs TEMP1y, TEMP1x, 7 > LOADiis TEMP1z, [4] > UADDs TEMP1y, TEMP1z, TEMP1y > SHLs TEMP1x, TEMP1x, 2 > LOADiis TEMP1z, [0] > UADDs TEMP1x, TEMP1z, TEMP1x > LOADgis TEMP1x, [TEMP1x] > INEGs TEMP1x, TEMP1x > LOADgis TEMP1z, [TEMP1y] > UADDs TEMP1x, TEMP1x, TEMP1z >...

[hexagon][PowerPC] code regression (sub-optimal code) on LLVM 9 when generating hardware loops, and the "llvm.uadd" intrinsic.

2019 Jun 30

6

...e use of Hardware loops. This is in my opinion a bad regression from some earlier version. This is not an isolated case; more cases of the same LLVM 9 'defect' are easy to find. I have investigated the issue and I identified the root cause of it, which is related to the initial use of the "llvm.uadd" intrinsic in LLVM 9.0 to increment the loop Induction Variable, instead of an "add" instruction as in LLVM 7.0. This is the while.body excerpt after "CodeGen Prepare" in LLVM 9.0 while.body: ; preds = %entry.old, %while.body %lsr.iv = phi i32 [...

multiprecision add/sub

2017 Feb 15

4

...carryin = carryout; z[1] = __builtin_addc(x[1], y[1], carryin, &carryout); carryin = carryout; z[2] = __builtin_addc(x[2], y[2], carryin, &carryout); carryin = carryout; z[3] = __builtin_addc(x[3], y[3], carryin, &carryout); } uses the LLVM intrinsic "llvm.uadd.with.overflow" and generates horrible code that doesn't use the "adc" x86 instruction. What is the current thinking on improving multiprecision arithmetic?
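The carry chain in this snippet can also be written in portable C without `__builtin_addc`. The sketch below is one way to do it, assuming four 64-bit limbs stored least significant first; the function name and limb layout are assumptions for illustration, and the sketch makes no claim about which instructions a given compiler emits for it.

```c
#include <stdint.h>

/* Adds two 4-limb numbers (least significant limb first) into z and
 * returns the final carry-out (0 or 1). Names are illustrative. */
uint64_t add_4limb(uint64_t z[4], const uint64_t x[4], const uint64_t y[4]) {
    uint64_t carry = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t s = x[i] + carry;   /* add the incoming carry first */
        uint64_t c1 = s < carry;     /* wrapped only if x[i] was all ones */
        uint64_t t = s + y[i];       /* then add the other limb */
        uint64_t c2 = t < s;         /* wrapped if the sum got smaller */
        z[i] = t;
        carry = c1 | c2;             /* at most one of the two can be set */
    }
    return carry;
}
```

Each step detects a wrap with the same "sum is smaller than an operand" comparison that `llvm.uadd.with.overflow` expresses, which is why the intrinsic shows up when this pattern is compiled.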

Dealing with opencl kernel parameters in nouveau now that RES support is gone

2016 Feb 22

2

...D[0] >> DCL TEMP[0], LOCAL >> DCL TEMP[1], LOCAL >> IMM UINT32 { 8, 0, 0, 0 } >> >> BGNSUB\n" >> UMUL TEMP[0], SV[0], IMM[0] >> LOAD TEMP[1].xy, RINPUT, TEMP[0] >> LOAD TEMP[0].x, RGLOBAL, TEMP[1].yyyy >> UADD TEMP[1].x, TEMP[0], -TEMP[1] >> STORE RGLOBAL.x, TEMP[1].yyyy, TEMP[1] >> RET >> ENDSUB >> >> >> Where by RINPUT and RGLOBAL get replaces by processing the >> code with cpp and the following defines: >> >> #define RGLOBAL...

Dealing with opencl kernel parameters in nouveau now that RES support is gone

2016 Feb 22

4

...OCAL >>>> IMM UINT32 { 8, 0, 0, 0 } >>>> >>>> BGNSUB\n" >>>> UMUL TEMP[0], SV[0], IMM[0] >>>> LOAD TEMP[1].xy, RINPUT, TEMP[0] >>>> LOAD TEMP[0].x, RGLOBAL, TEMP[1].yyyy >>>> UADD TEMP[1].x, TEMP[0], -TEMP[1] >>>> STORE RGLOBAL.x, TEMP[1].yyyy, TEMP[1] >>>> RET >>>> ENDSUB >>>> >>>> >>>> Where by RINPUT and RGLOBAL get replaces by processing the >>>> code with cpp and the...

llvm TGSI backend (WIP) questions

2015 Nov 13

6

Hi All, So as discussed I've started working on a TGSI backend for llvm to use as a way to get compute going on nouveau (and other GPUs). I'm still learning all the ins and outs of llvm so I do not have much to show yet. I've rebased Francisco's (curro's) latest version on top of llvm trunk, and added a commit on top to actually get it to build with the latest trunk. So

[RFC] Use of saturating intrinsics

2019 Oct 10

2

Hello all again, take 2. Over in D68651 I would like to make code that attempts to saturate a value (using higher bitwidth integers) use a saturating intrinsic instead. Something like this: https://godbolt.org/z/9knBnP As can be seen, the unsigned cases are already being matched to llvm.uadd.sat intrinsics. I am hoping to extend that to the signed cases. This has numerous benefits including simpler vectorization, cost-modelling and matching in the backend. The current forms of the saturating intrinsics extend into a higher type, which can be awkward to deal with in some cases (i64'...
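The "saturate using a higher bitwidth" pattern mentioned here looks roughly like the following in C. This is a sketch with made-up function names on 8-bit values, not the code from D68651 or the linked godbolt example: the sum is computed in a wider type and then clamped back to the narrow range.

```c
#include <stdint.h>

/* Unsigned saturating add, written with a wider intermediate type. */
uint8_t uadd_sat_wide(uint8_t a, uint8_t b) {
    uint16_t sum = (uint16_t)a + (uint16_t)b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* The same operation without widening: clamp if the add wrapped. */
uint8_t uadd_sat_narrow(uint8_t a, uint8_t b) {
    uint8_t sum = (uint8_t)(a + b);
    return sum < a ? UINT8_MAX : sum;
}

/* Signed saturating add with a wider intermediate type; this is the
 * shape of the signed case the post hopes to match as well. */
int8_t sadd_sat_wide(int8_t a, int8_t b) {
    int16_t sum = (int16_t)a + (int16_t)b;
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return (int8_t)sum;
}
```

The widening forms are easy to write but need an integer type twice as wide as the operands, which is what becomes awkward once the operands are already the widest native type.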

[LLVMdev] ASan and UBSan Test Failures

2013 Jan 05

2

...:: Float/cast-overflow.cpp UndefinedBehaviorSanitizer :: Integer/add-overflow.cpp UndefinedBehaviorSanitizer :: Integer/div-zero.cpp UndefinedBehaviorSanitizer :: Integer/no-recover.cpp UndefinedBehaviorSanitizer :: Integer/sub-overflow.cpp UndefinedBehaviorSanitizer :: Integer/uadd-overflow.cpp UndefinedBehaviorSanitizer :: Integer/usub-overflow.cpp UndefinedBehaviorSanitizer :: Misc/bool.cpp UndefinedBehaviorSanitizer :: Misc/enum.cpp UndefinedBehaviorSanitizer :: TypeCheck/misaligned.cpp UndefinedBehaviorSanitizer :: TypeCheck/null.cpp Expected Passes...
