Hans de Goede
2015-Dec-22 11:37 UTC
[Nouveau] Translating tests/trivial/compute.c gallium tests to opencl (input / help wanted)
Hi All,

I've been working on translating the tests/trivial/compute.c tests to OpenCL (for the buffer setup and kernel launch; I'm keeping the compute kernels in TGSI as an intermediate step).

I've got the test_input_global() test working, see:
https://fedorapeople.org/~jwrdegoede/compute-opencl-tgsi.c

Next I wanted to convert the test_system_values() test, and there I've gotten stuck. It uses a PIPE_BUFFER which it binds read-write as a resource, which OpenCL does not allow. The closest thing in OpenCL is a constant parameter. At first I tried binding this as a 2D image, but that leads to the GPU hanging; this is the code activated with "#define USE_IMAGE_FOR_BUF 1" in the above file.

So in a 2nd attempt I hacked clover to bind cl_mem objects passed in as constants as rw. This is the code used when USE_IMAGE_FOR_BUF is not defined, and it works.

In a 3rd attempt, because the hack is not workable in the long run, I tried to rewrite the test to use a global buffer (not sure this is the best approach, other ideas are welcome), leading to this diff:

--- a/src/gallium/tests/trivial/compute.c
+++ b/src/gallium/tests/trivial/compute.c
@@ -431,7 +431,6 @@ static void launch_grid(struct context *ctx, const uint *block_layout,
 static void test_system_values(struct context *ctx)
 {
         const char *src = "COMP\n"
-                "DCL RES[0], BUFFER, RAW, WR\n"
                 "DCL SV[0], BLOCK_ID[0]\n"
                 "DCL SV[1], BLOCK_SIZE[0]\n"
                 "DCL SV[2], GRID_SIZE[0]\n"
@@ -452,13 +451,15 @@ static void test_system_values(struct context *ctx)
                 " UADD TEMP[0].xy, TEMP[0].xyxy, TEMP[0].zwzw\n"
                 " UADD TEMP[0].x, TEMP[0].xxxx, TEMP[0].yyyy\n"
                 " UMUL TEMP[0].x, TEMP[0], IMM[0]\n"
-                " STORE RES[0].xyzw, TEMP[0], SV[0]\n"
+                " LOAD TEMP[1].x, RINPUT, IMM[2]\n"
+                " UADD TEMP[0].x, TEMP[0], TEMP[1]\n"
+                " STORE RGLOBAL.xyzw, TEMP[0], SV[0]\n"
                 " UADD TEMP[0].x, TEMP[0], IMM[1]\n"
-                " STORE RES[0].xyzw, TEMP[0], SV[1]\n"
+                " STORE RGLOBAL.xyzw, TEMP[0], SV[1]\n"
                 " UADD TEMP[0].x, TEMP[0], IMM[1]\n"
-                " STORE RES[0].xyzw, TEMP[0], SV[2]\n"
+                " STORE RGLOBAL.xyzw, TEMP[0], SV[2]\n"
                 " UADD TEMP[0].x, TEMP[0], IMM[1]\n"
-                " STORE RES[0].xyzw, TEMP[0], SV[3]\n"
+                " STORE RGLOBAL.xyzw, TEMP[0], SV[3]\n"
                 " RET\n"
                 "ENDSUB\n";
         void init(void *p, int s, int x, int y) {
@@ -485,16 +486,18 @@ static void test_system_values(struct context *ctx)
                         break;
                 }
         }
+        uint32_t input;
         printf("- %s\n", __func__);
-        init_prog(ctx, 0, 0, 0, src, NULL);
+        init_prog(ctx, 0, 0, 4, src, NULL);
         init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
                  76800, 0, init);
-        init_compute_resources(ctx, (int []) { 0, -1 });
-        launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, NULL);
+        init_globals(ctx, (int []){ 0, -1 },
+                     (uint32_t *[]){ &input });
+        launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, &input);
         check_tex(ctx, 0, expect, NULL);
-        destroy_compute_resources(ctx);
+        destroy_globals(ctx);
         destroy_tex(ctx);
         destroy_prog(ctx);
 }

This also behaves weirdly; the output is:

- test_system_values
(1, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(2, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(3, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(5, 0)[0]: got 0x1/0.000000, expected 0x3/0.000000
(6, 0)[0]: got 0x1/0.000000, expected 0x5/0.000000
(9, 0)[0]: got 0x1/0.000000, expected 0x4/0.000000
(13, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(14, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(15, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(17, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(18, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(19, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(21, 0)[0]: got 0x1/0.000000, expected 0x3/0.000000
(22, 0)[0]: got 0x1/0.000000, expected 0x5/0.000000
(25, 0)[0]: got 0x1/0.000000, expected 0x4/0.000000
(29, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(30, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(31, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(33, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(34, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(19200, 1): FAIL (10920)

So it seems that TGSI ops like this one:

+                " STORE RGLOBAL.xyzw, TEMP[0], SV[1]\n"

store the right value in RGLOBAL.x but not in RGLOBAL.yzw, for some reason?

I've tried changing this to:

+                " STORE RGLOBAL.xyzw, TEMP[0], SV[1].xxxx\n"

which of course is wrong, but it does lead to different errors. So it really is writing 4 floats (we already knew that, really, otherwise we would expect to see 0xdeadbeaf values), but for some reason the values it is storing in RGLOBAL.yzw are different from the ones stored by the old code in RES[0].yzw?

Regards,

Hans
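
A note on reading the failure log: the first number in each "(N, 0)" entry is a dword index into the output buffer, and it can be mapped back to the thread, the system value, and the vector component that wrote it. The following is only a minimal sketch and is not part of compute.c; the names decode_dword and struct slot are hypothetical. It assumes each thread writes four consecutive .xyzw vectors (SV[0] to SV[3]), i.e. 16 dwords per thread. That assumption is inferred from the four STORE ops and from the sizes in the diff (4*3*5 threads per block times 5*4*1 blocks is 1200 threads, and 76800 bytes / 1200 is 64 bytes per thread); the IMM[] values that set the actual stride are not shown in the diff.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout (inferred from the diff, not confirmed by it):
 * each thread issues four STORE .xyzw ops, one per SV[0]..SV[3],
 * so it owns 16 consecutive dwords (64 bytes) of the buffer. */
enum {
    COMPONENTS_PER_VEC = 4,
    VECS_PER_THREAD    = 4,
    DWORDS_PER_THREAD  = COMPONENTS_PER_VEC * VECS_PER_THREAD,
    /* block layout {4, 3, 5} times grid layout {5, 4, 1} */
    NUM_THREADS        = (4 * 3 * 5) * (5 * 4 * 1),
    NUM_DWORDS         = NUM_THREADS * DWORDS_PER_THREAD
};

/* Hypothetical helper type: where a given dword came from. */
struct slot {
    unsigned thread;    /* linear thread index */
    unsigned sv;        /* n of the SV[n] store that wrote this dword */
    unsigned component; /* 0..3 for x, y, z, w */
};

/* Map a dword index from the log to thread / SV / component. */
struct slot decode_dword(unsigned x)
{
    struct slot s;

    assert(x < NUM_DWORDS);
    s.thread    = x / DWORDS_PER_THREAD;
    s.sv        = (x % DWORDS_PER_THREAD) / COMPONENTS_PER_VEC;
    s.component = x % COMPONENTS_PER_VEC;
    return s;
}

/* Print one decoded entry in a form similar to the log. */
void print_slot(unsigned x)
{
    struct slot s = decode_dword(x);

    printf("(%u, 0): thread %u, SV[%u].%c\n",
           x, s.thread, s.sv, "xyzw"[s.component]);
}
```

Under this assumed layout, calling print_slot() on the indices listed in the log (1, 2, 3, 5, 6, 9, 13, ...) always yields a .y, .z, or .w component and never .x, which fits the observation that only RGLOBAL.x receives the expected value. NUM_DWORDS also works out to 19200, the width reported in the final FAIL line.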