thr3ads.net - search: "nvdisasm"

Displaying 20 results from an estimated 25 matches for "nvdisasm".

Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles

2015 May 18

...der_test It should be noted that other piglit tests don't exhibit this error, however they also tend to be simpler. One key difference is that they don't change the patch size in TCS. I'm including a link to a text file with the tessellation control and evaluation shaders (decoded with nvdisasm which you're hopefully more familiar with), along with the shader headers that we generate. FTR, this is how I feed the raw shader opcode bytes into nvdisasm: perl -ane 'foreach (@F) { print pack "I", hex($_) }' > tt; nvdisasm -b SM35 tt (for some reason it doesn't w...
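For readers without Perl at hand, the hex-to-binary step of the one-liner above can be sketched in Python. This is only an approximation of the quoted command: the function name `hex_words_to_bytes` is hypothetical, and it assumes the input is whitespace-separated 32-bit hex words, packed in native byte order as Perl's `pack "I"` does on common platforms.

```python
import struct


def hex_words_to_bytes(text: str) -> bytes:
    """Pack whitespace-separated hex words as native-endian unsigned 32-bit ints.

    Hypothetical helper mirroring: perl -ane 'foreach (@F) { print pack "I", hex($_) }'
    """
    return b"".join(struct.pack("=I", int(token, 16)) for token in text.split())


# Arbitrary placeholder words, not taken from any real shader.
blob = hex_words_to_bytes("00000001 0000ff00")
print(len(blob))  # two words, four bytes each
```

The resulting bytes would then be written to a file and passed to nvdisasm exactly as in the quoted command.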

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 15

..."step" in the nobody simulation, this is on a >> gk107 card. >> >> So that seems to be the real problem, since the >> error says "INVALID_OPCODE", I've put the tgsi code from nbody.c >> through "nouveau_compiler -a e4" and then run "nvdisasm -b SM30" >> on it, but the output looks ok. There is a 8 byte sequence which does >> not get decoded every 64 bytes but AFAIK that is the scheduling info, >> so that should be fine. >> >> One thing which does stand out is that this: >> >> 0: ld u32 %...

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 15

...VALID_OPCODE] and repeats that for every "step" in the nobody simulation, this is on a gk107 card. So that seems to be the real problem, since the error says "INVALID_OPCODE", I've put the tgsi code from nbody.c through "nouveau_compiler -a e4" and then run "nvdisasm -b SM30" on it, but the output looks ok. There is a 8 byte sequence which does not get decoded every 64 bytes but AFAIK that is the scheduling info, so that should be fine. One thing which does stand out is that this: 0: ld u32 %r219 c0[0x0000000000000000+0x0] (0) 1: ld u32 %r222 c0[0x...
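The layout described above (an 8-byte sequence every 64 bytes that is not decoded as an instruction) can be illustrated with a short Python sketch. It assumes, which the message does not state, that the scheduling bytes come first in each 64-byte group and that the remaining bytes are 8-byte instructions. The name `split_groups` is hypothetical.

```python
def split_groups(blob: bytes, group_size: int = 64, sched_size: int = 8):
    """Split a code blob into (scheduling_bytes, [instruction_bytes, ...]) groups.

    Assumption: each 64-byte group starts with 8 bytes of scheduling info,
    followed by 8-byte instructions. This is a sketch, not a verified format.
    """
    groups = []
    for offset in range(0, len(blob), group_size):
        group = blob[offset:offset + group_size]
        sched, body = group[:sched_size], group[sched_size:]
        insns = [body[i:i + 8] for i in range(0, len(body), 8)]
        groups.append((sched, insns))
    return groups


# Dummy data standing in for a real shader binary.
for sched, insns in split_groups(bytes(range(128))):
    print(sched.hex(), len(insns))
```

Under that assumption, each full group yields one scheduling word and seven instruction words.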

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 16

...ncluded a small bit of the program in my original mail > because I found the use of "MOV" instructions to load constants > suspicious, is that normal ? > > I've put a log with NV50_PROG_DEBUG=1 output here: > > https://fedorapeople.org/~jwrdegoede/nbody.log > > nvdisasm -b SM30 for the generated binary code is here: > > https://fedorapeople.org/~jwrdegoede/nbody.disasm > > There are already .tgsi, .hex and .bin files there if > you find those easier to use then the > NV50_PROG_DEBUG=1 output. > > >> >> On Tue, Dec 15, 2015 at 12...

Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles

2015 May 26

...her piglit tests don't exhibit this error, >> however they also tend to be simpler. One key difference is that they >> don't change the patch size in TCS. I'm including a link to a text >> file with the tessellation control and evaluation shaders (decoded >> with nvdisasm which you're hopefully more familiar with), along with >> the shader headers that we generate. >> >> FTR, this is how I feed the raw shader opcode bytes into nvdisasm: >> >> perl -ane 'foreach (@F) { print pack "I", hex($_) }' > tt; nvdisasm -b...

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 16

...uting into the ether? Sorry I only included a small bit of the program in my original mail because I found the use of "MOV" instructions to load constants suspicious, is that normal ? I've put a log with NV50_PROG_DEBUG=1 output here: https://fedorapeople.org/~jwrdegoede/nbody.log nvdisasm -b SM30 for the generated binary code is here: https://fedorapeople.org/~jwrdegoede/nbody.disasm There are already .tgsi, .hex and .bin files there if you find those easier to use then the NV50_PROG_DEBUG=1 output. > > On Tue, Dec 15, 2015 at 12:00 PM, Ilia Mirkin <imirkin at alum.mit....

What are the restrictions around loading indirect constbuf values

2015 Jun 25

Hello, We recently tracked down a bug on Tesla GPUs (i.e. G80-GT218) whereby it appears that instructions like 00000028: b5000409 08000780 add rn f32 $r2 $r2 neg c0[$a1] 00000040: b500060d 08004780 add rn f32 $r3 $r3 neg c0[$a1+0x4] or with nvdisasm: .headerflags @"EF_CUDA_SM12 EF_CUDA_PTX_SM(EF_CUDA_SM12)" /*0000*/ FADD R2, R2, -c[0x0][A1+0x0]; /* 0x08000780b5000409 */ /*0008*/ FADD R3, R3, -c[0x0][A1+0x1]; /* 0x08004780b500060d */ don't appear to execute properly. However just MOV...
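The two listings above show the same bits in different forms: the first prints each instruction as two 32-bit words, while nvdisasm prints one 64-bit literal. A minimal Python sketch of the relationship, using only the values quoted in the snippet (the helper name `combine_words` is hypothetical):

```python
def combine_words(first: int, second: int) -> int:
    """Join two 32-bit words into one 64-bit value, second word in the high half."""
    return (second << 32) | first


# Word pairs as printed in the first listing of the quoted message.
for first, second in [(0xB5000409, 0x08000780), (0xB500060D, 0x08004780)]:
    print(f"0x{combine_words(first, second):016x}")
# prints 0x08000780b5000409
# prints 0x08004780b500060d
```

Both results match the hex comments in the nvdisasm output, so the first printed word is the low half of the 64-bit encoding.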

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 16

...am in my original mail >> because I found the use of "MOV" instructions to load constants >> suspicious, is that normal ? >> >> I've put a log with NV50_PROG_DEBUG=1 output here: >> >> https://fedorapeople.org/~jwrdegoede/nbody.log >> >> nvdisasm -b SM30 for the generated binary code is here: >> >> https://fedorapeople.org/~jwrdegoede/nbody.disasm >> >> There are already .tgsi, .hex and .bin files there if >> you find those easier to use then the >> NV50_PROG_DEBUG=1 output. >> >> >>>...

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 18

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 15

...ts that for every "step" in the nobody simulation, this is on a > gk107 card. > > So that seems to be the real problem, since the > error says "INVALID_OPCODE", I've put the tgsi code from nbody.c > through "nouveau_compiler -a e4" and then run "nvdisasm -b SM30" > on it, but the output looks ok. There is a 8 byte sequence which does > not get decoded every 64 bytes but AFAIK that is the scheduling info, > so that should be fine. > > One thing which does stand out is that this: > > 0: ld u32 %r219 c0[0x0000000000000000+0...

Documentation request for MP warp error 0x10

2015 Oct 02

Hi Robert, Thanks for the quick response! That goes in line with my observations which is that these things happen when using an ATOM/RED instruction. I've checked and rechecked that I'm generating ops with identical bits as what the proprietary driver does, however (and nvdisasm prints identical output). Could you advise what the proper way of indicating that the memory is "global" to the op? I'm sure I'm just missing something simple. If you show me what to look for in SM35 I can probably find it on my own for SM20/SM30/SM50. In case you're interest...

[PATCH] nv50/ir/gk110: fix some instruction emission

2014 Mar 11

Information for this was gathered from nvdisasm. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- Entirely untested. Ben, do you think you'll be able to give this a shot? .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 33 +++++++++++----------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/src/galliu...

[PATCH envytools] envydis: gk110: Add support for dadd with an immediate src

2015 Nov 05

...000010: 001c0001 c38001ff $r0 $r0 $r0 $r0 0x3fe00 0x3fe00 0x3fe0000000000000 0x3fe00000 0x0 0x3 ??? [unknown: 00000000 c0800000] [unknown instruction] Into: 00000010: 001c0001 c38001ff add rn f64 $r0d $r0d 0x3fe0000000000000 The machine-code in question disassembles to the same using nvdisasm and works properly on an actual gpu. Signed-off-by: Hans de Goede <hdegoede at redhat.com> --- envydis/gk110.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/envydis/gk110.c b/envydis/gk110.c index 9af18e1..4790533 100644 --- a/envydis/gk110.c +++ b/envydis/gk110.c @@ -1274,6 +1274,7...

Documentation request for MP warp error 0x10

2015 Oct 26

...> Thanks for the quick response! That goes in line with my observations >> which is that these things happen when using an ATOM/RED instruction. >> I've checked and rechecked that I'm generating ops with identical bits >> as what the proprietary driver does, however (and nvdisasm prints >> identical output). Could you advise what the proper way of indicating >> that the memory is "global" to the op? I'm sure I'm just missing >> something simple. If you show me what to look for in SM35 I can >> probably find it on my own for SM20/SM3...

[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates

2015 Nov 05

...2: mov u32 $r3 0x3fe00000 (8) 3: add f64 $r0d $r0d $r2d (8) Into: 1: add f64 $r0d $r0d 0.500000 (8) This has been tested with the 2 double shader tests which I just send to the piglet list. On a gk208 (gk110 / SM35) card, and by checking the output of nouveau_compiler with both nvdisasm and envydis on gf100 / gk104 / gm107. Regards, Hans
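The link between the `0x3fe00000` word in the original sequence and the `0.500000` immediate in the folded instruction can be checked with a short Python sketch. It assumes the elided low word of the register pair is zero, which the truncated snippet does not show. Both helper names are hypothetical.

```python
import struct


def f64_from_bits(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def high_word(value: float) -> int:
    """Return the upper 32 bits of a double's IEEE 754 encoding."""
    return struct.unpack("<Q", struct.pack("<d", value))[0] >> 32


# High word 0x3fe00000 with an assumed all-zero low word.
print(f64_from_bits(0x3FE00000 << 32))  # 0.5
print(hex(high_word(0.5)))              # 0x3fe00000
```

A double whose low 32 bits are all zero, as here, is fully described by its high word.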

Documentation request for MP warp error 0x10

2015 Oct 26

...the quick response! That goes in line with my observations > >> which is that these things happen when using an ATOM/RED instruction. > >> I've checked and rechecked that I'm generating ops with identical bits > >> as what the proprietary driver does, however (and nvdisasm prints > >> identical output). Could you advise what the proper way of indicating > >> that the memory is "global" to the op? I'm sure I'm just missing > >> something simple. If you show me what to look for in SM35 I can > >> probably find it on...

[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates

2015 Nov 07

...> 3: add f64 $r0d $r0d $r2d (8) > > Into: > 1: add f64 $r0d $r0d 0.500000 (8) > > This has been tested with the 2 double shader tests which I just send to > the piglet list. On a gk208 (gk110 / SM35) card, and by checking the output > of nouveau_compiler with both nvdisasm and envydis on gf100 / gk104 / gm107. > > Regards, > > Hans

Question on IPA on GM107

2018 Nov 12

So I'm trying to track an special value in IPA instruction generation. https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp#L2561 Register on 0x14 (20) is set to some source on "insn->op == OP_PINTERP" I have found while emulation that such value can be set sometimes to FragCoord.w, I don't however know what that value is and

[PATCH] nv50/ir/gk110: fix some instruction emission

2014 Mar 11

On Tue, Mar 11, 2014 at 7:47 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote: > Information for this was gathered from nvdisasm. > > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> > --- > > Entirely untested. Ben, do you think you'll be able to give this a shot? I'll try and take a moment in the next couple of days to give it a go. > > .../drivers/nouveau/codegen/nv50_ir_emit_gk110...

Documentation request for MP warp error 0x10

2015 Oct 02

...ert, > > Thanks for the quick response! That goes in line with my observations > which is that these things happen when using an ATOM/RED instruction. > I've checked and rechecked that I'm generating ops with identical bits > as what the proprietary driver does, however (and nvdisasm prints > identical output). Could you advise what the proper way of indicating > that the memory is "global" to the op? I'm sure I'm just missing > something simple. If you show me what to look for in SM35 I can > probably find it on my own for SM20/SM30/SM50. Unfortu...
