search for: r0d

Displaying 7 results from an estimated 7 matches for "r0d".

2015 Nov 05
7
[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates
Hi All,
This series implements using double immediates in the nouveau codegen code. This turns the following (nvc0) code:
  1: mov u32 $r2 0x00000000 (8)
  2: mov u32 $r3 0x3fe00000 (8)
  3: add f64 $r0d $r0d $r2d (8)
Into:
  1: add f64 $r0d $r0d 0.500000 (8)
This has been tested with the 2 double shader tests which I just sent to the piglit list, on a gk208 (gk110 / SM35) card, and by checking the output of nouveau_compiler with both nvdisasm and envydis on gf100 / gk104 / gm107. Regard...
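The two 32-bit moves in the "before" listing load the low and high halves of an IEEE 754 double into the register pair $r2/$r3 (addressed together as $r2d). A minimal Python sketch of why that pair equals the 0.500000 immediate in the "after" listing; the helper names are hypothetical and are not part of Mesa or nouveau:

```python
import struct


def double_from_words(lo, hi):
    """Reassemble an IEEE 754 double from its low and high 32-bit words."""
    # "<II" packs two little-endian 32-bit words, low word first,
    # which matches the byte layout of a little-endian 64-bit double.
    return struct.unpack("<d", struct.pack("<II", lo, hi))[0]


def words_from_double(value):
    """Split a double into (low, high) 32-bit words."""
    lo, hi = struct.unpack("<II", struct.pack("<d", value))
    return lo, hi


# $r2 = 0x00000000 (low word), $r3 = 0x3fe00000 (high word)
print(double_from_words(0x00000000, 0x3FE00000))
print([hex(w) for w in words_from_double(0.5)])
```

Going the other way, words_from_double(0.5) gives back the two constants from the original mov instructions.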
2015 Feb 23
2
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
Does this give correct results for special floats (0, infs)? We tried to improve x86 rcp (for single floats) in llvmpipe with Newton-Raphson, but unfortunately it could not give correct results for these two cases (without even more additional code), so it all got disabled in the end (you can still see that code in the driver), since the problems are at least as bad as those due to bad
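The concern about special floats can be illustrated with a single Newton-Raphson refinement step for a reciprocal, x' = x * (2 - a * x). This is a minimal sketch, not the llvmpipe or nvc0 code; it assumes the initial hardware estimate is already exact for the special inputs (inf for an input of 0, and 0 for an input of inf):

```python
import math


def refine_rcp(a, x):
    """One Newton-Raphson step towards 1/a, starting from the estimate x."""
    return x * (2.0 - a * x)


# Ordinary input: a rough estimate of 1/4 moves closer to 0.25.
print(refine_rcp(4.0, 0.2))

# Special inputs: the product a * x is 0 * inf, which is NaN under
# IEEE 754, so the refinement destroys an estimate that was correct.
print(refine_rcp(0.0, math.inf))
print(refine_rcp(math.inf, 0.0))
```

Avoiding the NaN needs an extra check or select around the refinement step, which is the "even more additional code" the message refers to.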
2015 Nov 07
0
[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates
...de Goede <hdegoede at redhat.com> wrote:
> Hi All,
>
> This series implements using double immediates in the nouveau codegen code.
>
> This turns the following (nvc0) code:
>   1: mov u32 $r2 0x00000000 (8)
>   2: mov u32 $r3 0x3fe00000 (8)
>   3: add f64 $r0d $r0d $r2d (8)
>
> Into:
>   1: add f64 $r0d $r0d 0.500000 (8)
>
> This has been tested with the 2 double shader tests which I just sent to
> the piglit list. On a gk208 (gk110 / SM35) card, and by checking the output
> of nouveau_compiler with both nvdisasm and envydis on g...
2015 Nov 05
1
[PATCH envytools] envydis: gk110: Add support for dadd with an immediate src
...e piglit glsl-algebraic-double-add.shader_test.
This commit changes the output from:
  00000010: 001c0001 c38001ff $r0 $r0 $r0 $r0 0x3fe00 0x3fe00 0x3fe0000000000000 0x3fe00000 0x0 0x3 ??? [unknown: 00000000 c0800000] [unknown instruction]
Into:
  00000010: 001c0001 c38001ff add rn f64 $r0d $r0d 0x3fe0000000000000
The machine-code in question disassembles to the same using nvdisasm and works properly on an actual gpu.
Signed-off-by: Hans de Goede <hdegoede at redhat.com>
---
 envydis/gk110.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/envydis/gk110.c b/envydis/gk110....
2015 Dec 18
0
Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?
...TEMP[1].yyyy
  4: UADD TEMP[1].x, TEMP[0], -TEMP[1]
  5: STORE RES[32767].x, TEMP[1].yyyy, TEMP[1]
  6: RET
  7: ENDSUB
Which translates to:
SUB:0 ()
BB:0 (7 instructions) - df = { }
 -> BB:1 (cross)
  0: rdsv u32 $r0 sv[TID:0] (8)
  1: shl u32 $r2 $r0 0x00000003 (8)
  2: ld u64 $r0d c0[$r2+0x0] (8)
  3: ld u32 $r2 g[$r1+0x0] (8)
  4: add u32 $r0 $r2 neg $r0 (8)
  5: st u32 # g[$r1+0x0] $r0 (8)
  6: ret (8)
BB:1 (0 instructions) - idom = BB:0, df = { }
MAIN:-1 ()
BB:0 (0 instructions) - df = { }
Which is also using 32 bits loads from global memory and that works fine on m...
2015 Dec 16
4
Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?
I believe that your problem is this:
  /*01a0*/ LD R8, [R8]; /* 0x8000000000821c85 */
That needs to be LD.E (and your STs need to be ST.E). You're using a 32-bit gmem address, but you need to be using a 64-bit one. I believe the 32-bit ones work on Fermi, but afaik not on Kepler.
Cheers,
  -ilia
On Wed, Dec 16, 2015 at 12:06 PM, Hans de Goede
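The difference between a 32-bit and a 64-bit global memory address can be shown with plain integer arithmetic. This is only an illustrative sketch: the address below is hypothetical, and the code says nothing about how the hardware actually treats a plain LD, only about which bits a 32-bit register can carry:

```python
def low_32_bits(addr):
    """Return the part of an address that fits in one 32-bit register."""
    return addr & 0xFFFFFFFF


# Hypothetical buffer address that lies above the 4 GiB boundary.
buffer_addr = 0x0000000100204000

print(hex(buffer_addr))               # full 64-bit address
print(hex(low_32_bits(buffer_addr)))  # what a single 32-bit register holds
print(low_32_bits(buffer_addr) == buffer_addr)
```

An address below 4 GiB survives the truncation unchanged, which is one way a 32-bit access can appear to work in some setups and fail in others.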
2015 May 21
2
Fermi+ shader header docs
On Thu, May 21, 2015 at 10:05 AM, Robert Morell <rmorell at nvidia.com> wrote:
> Hi Ilia,
>
> On Sat, May 02, 2015 at 12:34:21PM -0400, Ilia Mirkin wrote:
>> Hi,
>>
>> As I'm looking to add some support to nouveau for features like atomic
>> counters and images, I'm running into some confusion about what the
>> first word of the shader header