Hans de Goede
2016-Feb-18 14:45 UTC
[Nouveau] Dealing with opencl kernel parameters in nouveau now that RES support is gone
Hi Ilia and Samuel, I rebased my mesa git tree today (it was getting a bit stale) and after that src/gallium/tests/trivial/compute.c stopped working as well as any opencl programs with the kernel in TGSI (as clover on nouveau currently only supports having the kernel in TGSI). The problem is that RES no longer is a valid register-file name in TGSI, specifically this is caused by this commit: https://cgit.freedesktop.org/mesa/mesa/commit?id=8cc9a8aa2a97ca9e7a36a993954a3480d44c13d3 (no surprises there) I noticed that Samual adds TGSI_FILE_MEMORY support here: https://cgit.freedesktop.org/mesa/mesa/commit?id=a8328e3a50169c3c74656df7f63f56f061d9e751 But this does not seem to be hooked up yet for nouveau. So some questions: -The commit by Samual says: This introduces TGSI_FILE_MEMORY for shared, global and local memory. Only shared memory is currently supported. The commit introduces MEMORY[x] and MEMORY[x],SHARED so in reality it also introduces a second option next to shared, so what are we going to use plain MEMORY[x] for? I suggest using it for global memory but we need to be in agreement on this. -What about kernel input parameters, so far these have been using RES[32764] I must admit that I do not understand where the file_index of 32764 comes from (or where any of the file indexes come from for src/gallium/tests/trivial/compute.c ? I have the feeling that these are not used at all, and everything simply goes to a flat (virtual) memory space, with the params at address 0, correct ? This seems like a nvidia implementation detail though, and we should IMHO still implement a specific TGSI register file for accessing kernel input params, agreed ? If we can agree on the TGSI syntax for all this, then I'm pretty sure I can hack something together to get things to work with the latest master again, and then we can use this as a starting point for properly implementing this. Regards, Hans p.s. I must say I was somewhat unpleasantly surprised by the RES breakage, yes I know some changes were coming but still. In the future when my llvm->tgsi work is more stable and I'll be pushing it to llvm upstream we really cannot make breaking tgsi changes like this.
Ilia Mirkin
2016-Feb-18 16:39 UTC
[Nouveau] Dealing with opencl kernel parameters in nouveau now that RES support is gone
On Thu, Feb 18, 2016 at 9:45 AM, Hans de Goede <hdegoede at redhat.com> wrote:> But this does not seem to be hooked up yet for nouveau.Samuel has patches. See https://cgit.freedesktop.org/~hakzsam/mesa/log/?h=arb_compute_shader_v3> > So some questions: > -The commit by Samual says: > This introduces TGSI_FILE_MEMORY for shared, global and local memory. Only > shared memory is currently supported. > > The commit introduces MEMORY[x] and MEMORY[x],SHARED so in reality it also > introduces > a second option next to shared, so what are we going to use plain MEMORY[x] > for? > I suggest using it for global memory but we need to be in agreement on > this.That sounds fine to me. However what I had in mind was switching the SHARED field into a 2-bit field and making it 1 = SHARED 2 = GLOBAL 3 = LOCAL (since for OpenCL you also need to be able to address local or private memory). I sorta wanted Samuel to do it, but since I had no idea where you were at, or if you were even still working on this, I figured it should be fixed up by the first person who needed it.> > -What about kernel input parameters, so far these have been using RES[32764] > I must admit that I do not understand where the file_index of 32764 comes > from (or where any of the file indexes come from for > src/gallium/tests/trivial/compute.c ? > I have the feeling that these are not used at all, and everything simply > goes > to a flat (virtual) memory space, with the params at address 0, correct ?It was never particularly well-specified, which was one of the reasons it went away. It also didn't map nicely onto the OpenGL model. There is a remaining question of how to do addressing in memory... there's 40 bits of address space. Should these implicitly be U64 (dual-component in TGSI) addresses that are passed around? Not sure what the OpenCL position on all this is. Also it would be highly preferable to avoid using GLOBAL at all in the first place, and just use BUFFER[] or IMAGE[] or whatever. However after having a look at how OpenCL works, I don't think that's possible. But perhaps I missed something?> > This seems like a nvidia implementation detail though, and we should IMHO > still > implement a specific TGSI register file for accessing kernel input params, > agreed ?I believe that kernel input params should be passed in via user uniforms. i.e. CONST. It doesn't have to be a separate UBO, it can use the user buffer thing, which is basically what it is now.> > If we can agree on the TGSI syntax for all this, then I'm pretty sure I can > hack > something together to get things to work with the latest master again, and > then we > can use this as a starting point for properly implementing this. > > Regards, > > Hans > > > p.s. > > I must say I was somewhat unpleasantly surprised by the RES breakage, yes I > know > some changes were coming but still. In the future when my llvm->tgsi work is > more > stable and I'll be pushing it to llvm upstream we really cannot make > breaking tgsi > changes like this.I thought I had sufficiently explained to you that the resource thing was going away -- it was never spec'd, and there were no users. Perhaps I didn't make it clear enough though -- sorry. The reality is that TGSI is a living thing, and things change every so often. I don't think people would be comfortable with locking down TGSI in such a way. -ilia
Hans de Goede
2016-Feb-19 10:36 UTC
[Nouveau] Dealing with opencl kernel parameters in nouveau now that RES support is gone
Hi, On 18-02-16 17:39, Ilia Mirkin wrote:> On Thu, Feb 18, 2016 at 9:45 AM, Hans de Goede <hdegoede at redhat.com> wrote: >> But this does not seem to be hooked up yet for nouveau. > > Samuel has patches. See > https://cgit.freedesktop.org/~hakzsam/mesa/log/?h=arb_compute_shader_v3Cool, I will take a look at those.>> So some questions: >> -The commit by Samual says: >> This introduces TGSI_FILE_MEMORY for shared, global and local memory. Only >> shared memory is currently supported. >> >> The commit introduces MEMORY[x] and MEMORY[x],SHARED so in reality it also >> introduces >> a second option next to shared, so what are we going to use plain MEMORY[x] >> for? >> I suggest using it for global memory but we need to be in agreement on >> this. > > That sounds fine to me. However what I had in mind was switching the > SHARED field into a 2-bit field and making it > > 1 = SHARED > 2 = GLOBAL > 3 = LOCAL > > (since for OpenCL you also need to be able to address local or private > memory). I sorta wanted Samuel to do it, but since I had no idea where > you were at, or if you were even still working on this, I figured it > should be fixed up by the first person who needed it.Sounds good, only the naming is somewhat unfortunate since opencl uses different naming. I.e. it has no "shared" OpenCL has: -global: accessible by all worker-groups as well as by the CPU -const: read-only global -local: shared by worker-items in the same worker-group, not shared between worker-groups -private: accessible only to a single worker-item So how do these map to the TGSI: > 1 = SHARED > 2 = GLOBAL > 3 = LOCAL ?> >> >> -What about kernel input parameters, so far these have been using RES[32764] >> I must admit that I do not understand where the file_index of 32764 comes >> from (or where any of the file indexes come from for >> src/gallium/tests/trivial/compute.c ? >> I have the feeling that these are not used at all, and everything simply >> goes >> to a flat (virtual) memory space, with the params at address 0, correct ? > > It was never particularly well-specified, which was one of the reasons > it went away. It also didn't map nicely onto the OpenGL model. There > is a remaining question of how to do addressing in memory... there's > 40 bits of address space. Should these implicitly be U64 > (dual-component in TGSI) addresses that are passed around? Not sure > what the OpenCL position on all this is.So far I've been using U32 for addresses as that is what Francisco's original code was using. And this also is what things like the tgsi LOAD instruction take. If you're doing a LOAD on a 1d buffer then you will use TEMP[#].x to specify the index, and the way how this currently works with OpenCL is that clCreateBuffer() will return a cl_mem type which then gets passed into the kernel as input parameter and gets treated as a pointer by the compiler, so e.g. global mem gets treated as a single address space even if there are multiple global buffers and TEMP[#].x contains the value passed in via cl_mem as start offset for the buffer + the index into the buffer. So this means that currently we are limited to U32 since TEMP[#].x is only 32 bits wide. Internally 40 bits addresses can and should probably be used so that at least the different memory spaces each have the full 32 bits available. Note that we could fix this by adding some sort of LOAD64 opcode, which uses TEMP[#].x and TEMP[#].y as address for 1d buffers, I'm not sure how this would work for 3d buffers though. I foresee the llvm backend eventually getting a 64 bit mode where it will use 64 bits for all pointers and use something like a LOAD64 opcode to indicate that the indexes (which it effectively uses as addresses / pointers) are 64 bit wide.> Also it would be highly preferable to avoid using GLOBAL at all in the > first place, and just use BUFFER[] or IMAGE[] or whatever. However > after having a look at how OpenCL works, I don't think that's > possible. But perhaps I missed something?OpenCL is very C-like and as such uses a flat address space per memory space, getting around that is going to be very tricky, if possible at all.>> This seems like a nvidia implementation detail though, and we should IMHO >> still >> implement a specific TGSI register file for accessing kernel input params, >> agreed ? > > I believe that kernel input params should be passed in via user > uniforms. i.e. CONST. It doesn't have to be a separate UBO, it can use > the user buffer thing, which is basically what it is now. > >> >> If we can agree on the TGSI syntax for all this, then I'm pretty sure I can >> hack >> something together to get things to work with the latest master again, and >> then we >> can use this as a starting point for properly implementing this. >> >> Regards, >> >> Hans >> >> >> p.s. >> >> I must say I was somewhat unpleasantly surprised by the RES breakage, yes I >> know >> some changes were coming but still. In the future when my llvm->tgsi work is >> more >> stable and I'll be pushing it to llvm upstream we really cannot make >> breaking tgsi >> changes like this. > > I thought I had sufficiently explained to you that the resource thing > was going away -- it was never spec'd, and there were no users.Yes I was aware that changes were coming.> Perhaps I didn't make it clear enough though -- sorry. The reality is > that TGSI is a living thing, and things change every so often. I don't > think people would be comfortable with locking down TGSI in such a > way.Hmm, I do think that if we end up using a llvm-ir -> tgsi step in some cases that we do need to lock TGSI down. Taken the RES thing for example, it would have been pretty trivial actually to not nuke it, and instead just add the image support next to it. Note I'm fine with making changes while we are figuring this out, but once I start pushing the llvm bits to llvm upstream (which is still months away) we are going to need some sort of TGSI stability. Regards, Hans
Reasonably Related Threads
- Dealing with opencl kernel parameters in nouveau now that RES support is gone
- Dealing with opencl kernel parameters in nouveau now that RES support is gone
- Dealing with opencl kernel parameters in nouveau now that RES support is gone
- Dealing with opencl kernel parameters in nouveau now that RES support is gone
- Some llvm questions (for tgsi backend)