Hans de Goede
2016-Feb-22 12:46 UTC
[Nouveau] Dealing with opencl kernel parameters in nouveau now that RES support is gone
Hi,

On 22-02-16 13:41, Samuel Pitoiset wrote:
> Hi there,
>
> On 02/22/2016 12:26 PM, Hans de Goede wrote:

<snip>

>> So back to the problem of getting OpenCL(ish) code to work again with
>> the recent mesa changes. For starters I would like to get:
>>
>> src/gallium/tests/trivial/compute.c and then the test with mask 8,
>> test_input_global() to work again, when that is working I should be
>> able to adjust my llvm work (and if necessary clover) to start to
>> work again.
>>
>> Currently the test_input_global() test uses the following bit of
>> TGSI code:
>>
>> COMP
>> DCL SV[0], THREAD_ID[0]
>> DCL TEMP[0], LOCAL
>> DCL TEMP[1], LOCAL
>> IMM UINT32 { 8, 0, 0, 0 }
>>
>>     BGNSUB
>>        UMUL TEMP[0], SV[0], IMM[0]
>>        LOAD TEMP[1].xy, RINPUT, TEMP[0]
>>        LOAD TEMP[0].x, RGLOBAL, TEMP[1].yyyy
>>        UADD TEMP[1].x, TEMP[0], -TEMP[1]
>>        STORE RGLOBAL.x, TEMP[1].yyyy, TEMP[1]
>>        RET
>>     ENDSUB
>>
>> Whereby RINPUT and RGLOBAL get replaced by processing the
>> code with cpp and the following defines:
>>
>> #define RGLOBAL RES[32767]
>> #define RLOCAL RES[32766]
>> #define RPRIVATE RES[32765]
>> #define RINPUT RES[32764]
>>
>> If I understand how memory is supposed to work, then I would need to
>> change the TGSI as follows:
>>
>> COMP
>> DCL SV[0], THREAD_ID[0]
>> DCL MEMORY[0]
>> DCL TEMP[0], LOCAL
>> DCL TEMP[1], LOCAL
>> IMM UINT32 { 8, 0, 0, 0 }
>>
>>     BGNSUB
>>        UMUL TEMP[0], SV[0], IMM[0]
>>        LOAD TEMP[1].xy, RINPUT, TEMP[0]
>>        LOAD TEMP[0].x, MEMORY[0], TEMP[1].yyyy
>>        UADD TEMP[1].x, TEMP[0], -TEMP[1]
>>        STORE MEMORY[0].x, TEMP[1].yyyy, TEMP[1]
>>        RET
>>     ENDSUB
>
> Nope, this won't work because RINPUT is RES[32764]. And you have to
> remove all occurrences of RES because it's no longer supported. In my
> opinion, using BUFFER[0] as a first step should work. Currently, only
> SHARED with MEMORY is supported.

Right, as I say below "This only solves the accessing of the global
memory, it does not solve getting to the kernel input parameters"

>> This assumes that, as discussed, declaring memory without a ", SHARED"
>> or other flag means the memory is global.
>>
>> So 2 questions:
>>
>> 1) Do the above changes for using the new MEMORY keyword look as intended
>> to you?
>>
>> 2) This only solves the accessing of the global memory, it does not solve
>> getting to the kernel input parameters, how would I deal with
>> those ?
>
> The input kernel parameters are directly passed through a call to
> pipe_context::launch_grid. You just have to fill the
> pipe_grid_info::input array with your parameters and they will be
> uploaded by nvXX_compute_upload_input().

Right, the uploading side I understand, the question is how to get to
them from the compute kernel's tgsi code ?

If I understand you correctly you are suggesting to use BUFFER[0] for this,
that is fine from a nouveau point-of-view, but might be a bit nouveau
centric way of looking at things, I think a better approach would be
a separate input register-file for this, as that will be more flexible
when people try to do opencl via clang->llvm->tgsi on other GPUs.

> I will have a look at the test_input_global().

Thanks!

Regards,

Hans
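For anyone following the quoted TGSI, here is a rough CPU-side model in C of what a single thread of the test kernel appears to do, based only on reading the listing above. The names emulate_thread and emulate_grid are made up for illustration and are not part of mesa. It assumes both the parameter block (RINPUT) and global memory (RGLOBAL, or a MEMORY declaration) are byte-addressed, and that each thread owns one 8-byte record in the parameter block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one kernel invocation (not mesa code).
 * input  : kernel parameter block, what RINPUT addressed
 * global : global memory, what RGLOBAL / MEMORY[0] addresses */
static void emulate_thread(uint32_t thread_id_x,
                           const uint8_t *input,
                           uint8_t *global)
{
    uint32_t param[2]; /* TEMP[1].xy */
    uint32_t word;     /* TEMP[0].x  */

    assert(input != NULL && global != NULL);

    /* UMUL TEMP[0], SV[0], IMM[0]: byte offset = thread id * 8 */
    uint32_t offset = thread_id_x * 8u;

    /* LOAD TEMP[1].xy, RINPUT, TEMP[0]: read a (value, address) pair */
    memcpy(param, input + offset, sizeof(param));

    /* LOAD TEMP[0].x, RGLOBAL, TEMP[1].yyyy: read the word at that address */
    memcpy(&word, global + param[1], sizeof(word));

    /* UADD TEMP[1].x, TEMP[0], -TEMP[1]: subtract the parameter value */
    word -= param[0];

    /* STORE RGLOBAL.x, TEMP[1].yyyy, TEMP[1]: write the result back */
    memcpy(global + param[1], &word, sizeof(word));
}

/* Run the model for thread ids 0 .. num_threads-1, one after another. */
static void emulate_grid(uint32_t num_threads,
                         const uint8_t *input,
                         uint8_t *global)
{
    for (uint32_t t = 0; t < num_threads; t++)
        emulate_thread(t, input, global);
}
```

The point of the model is that the kernel needs two distinct address spaces: the LOAD from RINPUT is the part that has no replacement yet, while the two RGLOBAL accesses are what MEMORY[0] is meant to cover.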
Samuel Pitoiset
2016-Feb-22 13:04 UTC
[Nouveau] Dealing with opencl kernel parameters in nouveau now that RES support is gone
On 02/22/2016 01:46 PM, Hans de Goede wrote:
> Hi,
>
> On 22-02-16 13:41, Samuel Pitoiset wrote:
>> Hi there,
>>
>> On 02/22/2016 12:26 PM, Hans de Goede wrote:
>
> <snip>
>
>> Nope, this won't work because RINPUT is RES[32764]. And you have to
>> remove all occurrences of RES because it's no longer supported. In my
>> opinion, using BUFFER[0] as a first step should work. Currently, only
>> SHARED with MEMORY is supported.
>
> Right, as I say below "This only solves the accessing of the global
> memory, it does not solve getting to the kernel input parameters"
>
>>> This assumes that, as discussed, declaring memory without a ", SHARED"
>>> or other flag means the memory is global.
>>>
>>> So 2 questions:
>>>
>>> 1) Do the above changes for using the new MEMORY keyword look as
>>> intended to you?
>>>
>>> 2) This only solves the accessing of the global memory, it does not
>>> solve getting to the kernel input parameters, how would I deal with
>>> those ?
>>
>> The input kernel parameters are directly passed through a call to
>> pipe_context::launch_grid. You just have to fill the
>> pipe_grid_info::input array with your parameters and they will be
>> uploaded by nvXX_compute_upload_input().
>
> Right, the uploading side I understand, the question is how to get to
> them from the compute kernel's tgsi code ?

Right, I wonder if there is already a DECL INPUT or something like that
for input parameters of shaders. Oh yeah, there is TGSI_FILE_INPUT,
maybe this is what you want?

> If I understand you correctly you are suggesting to use BUFFER[0] for this,
> that is fine from a nouveau point-of-view, but might be a bit nouveau
> centric way of looking at things, I think a better approach would be
> a separate input register-file for this, as that will be more flexible
> when people try to do opencl via clang->llvm->tgsi on other GPUs.
>
>> I will have a look at the test_input_global().
>
> Thanks!
>
> Regards,
>
> Hans

--
-Samuel
Hans de Goede
2016-Feb-22 13:08 UTC
[Nouveau] Dealing with opencl kernel parameters in nouveau now that RES support is gone
Hi,

On 22-02-16 14:04, Samuel Pitoiset wrote:
>
> On 02/22/2016 01:46 PM, Hans de Goede wrote:
>> Hi,
>>
>> On 22-02-16 13:41, Samuel Pitoiset wrote:
>>> Hi there,
>>>
>>> On 02/22/2016 12:26 PM, Hans de Goede wrote:
>>
>> <snip>
>>
>>> Nope, this won't work because RINPUT is RES[32764]. And you have to
>>> remove all occurrences of RES because it's no longer supported. In my
>>> opinion, using BUFFER[0] as a first step should work. Currently, only
>>> SHARED with MEMORY is supported.
>>
>> Right, as I say below "This only solves the accessing of the global
>> memory, it does not solve getting to the kernel input parameters"
>>
>>>> This assumes that, as discussed, declaring memory without a ", SHARED"
>>>> or other flag means the memory is global.
>>>>
>>>> So 2 questions:
>>>>
>>>> 1) Do the above changes for using the new MEMORY keyword look as
>>>> intended to you?
>>>>
>>>> 2) This only solves the accessing of the global memory, it does not
>>>> solve getting to the kernel input parameters, how would I deal with
>>>> those ?
>>>
>>> The input kernel parameters are directly passed through a call to
>>> pipe_context::launch_grid. You just have to fill the
>>> pipe_grid_info::input array with your parameters and they will be
>>> uploaded by nvXX_compute_upload_input().
>>
>> Right, the uploading side I understand, the question is how to get to
>> them from the compute kernel's tgsi code ?
>
> Right, I wonder if there is already a DECL INPUT or something like that
> for input parameters of shaders. Oh yeah, there is TGSI_FILE_INPUT,
> maybe this is what you want?

Yes that sounds right, so now "all" we need to do is make
nvXX_compute_upload_input() and TGSI_FILE_INPUT work together.

Regards,

Hans
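On the upload side discussed above, a minimal sketch in C of how the parameter block for the test kernel might be laid out before it is handed to launch_grid. Everything here is an assumption for illustration: struct grid_info_sketch is a cut-down stand-in and not the real struct pipe_grid_info, and thread_params, pack_params and describe_launch are hypothetical names. The only facts taken from the thread are that parameters go in through an input pointer and that the kernel strides through them 8 bytes per thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the launch descriptor. This is NOT the
 * real struct pipe_grid_info, only the fields this sketch needs. */
struct grid_info_sketch {
    const void *input;   /* kernel parameter block */
    unsigned block[3];   /* threads per block */
    unsigned grid[3];    /* blocks per grid */
};

/* One record per thread: 8 bytes, matching the stride the test
 * kernel computes with UMUL TEMP[0], SV[0], IMM[0] (IMM[0].x = 8). */
struct thread_params {
    uint32_t value;      /* read by the kernel as TEMP[1].x */
    uint32_t address;    /* read as TEMP[1].y, a byte offset into global memory */
};

/* Copy the records into a flat byte buffer.
 * Returns the number of bytes written, or 0 if dst is too small. */
static size_t pack_params(uint8_t *dst, size_t dst_size,
                          const struct thread_params *params, size_t count)
{
    size_t needed = count * sizeof(*params);

    assert(sizeof(*params) == 8); /* layout must match the kernel's stride */
    if (needed > dst_size)
        return 0;
    memcpy(dst, params, needed);
    return needed;
}

/* Describe a launch of `threads` threads in a single block, so that
 * THREAD_ID.x runs from 0 to threads-1. */
static struct grid_info_sketch describe_launch(const void *input,
                                               unsigned threads)
{
    struct grid_info_sketch info = {
        .input = input,
        .block = { threads, 1, 1 },
        .grid  = { 1, 1, 1 },
    };
    return info;
}
```

Whatever register file ends up exposing the parameters to TGSI (BUFFER[0], TGSI_FILE_INPUT, or something else), the kernel side only has to agree with this layout: thread N finds its record at byte offset N * 8.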