Ilia Mirkin
2018-Mar-02 04:47 UTC
[Nouveau] TLD instruction usage in non-linked sampler mode
Hello,

This question is in the context of the Tesla / Fermi generations, which have explicit bindings for textures / samplers. It might also apply to Kepler+; I'm not quite as sure there due to the bindless nature.

I've been trying to understand how the TLD operation works (which is used to implement texelFetch in GLSL). It does not appear that the op takes an explicit sampler id at all (unlike all the other texturing operations). In unlinked TSC mode (i.e. method 0x1234 == 0), my observation is that it desperately wants a valid sampler to be bound to sampler slot 0. Of course I don't think TLD actually needs anything from the sampler, which makes this all the more odd.

Is that a correct assessment of the operation of the TLD instruction? Is there any way to make it just not care about the sampler binding? Does the DirectX driver just always keep something bound to sampler slot 0? (And what happens on Kepler+? Does it always look at TSC ID = 0?)

(I kind of assume that all these problems go away in linked TSC mode, since it'd naturally just look up the TSC entry associated with the bound TIC.)

Thanks,

-ilia
Andy Ritger
2018-Mar-07 20:21 UTC
[Nouveau] TLD instruction usage in non-linked sampler mode
Hi Ilia.

It looks like there is a hardware bug on Fermi and Kepler where TLD unconditionally uses the sampler from slot 0. This is supposedly fixed in Maxwell. What I could find internally suggests the bug wasn't present on Tesla, but let me know if your observations contradict that.

The only workaround is to have a sampler defined and bound. Further, be careful to have reasonable state in the entry in slot #0: as I understand it, the one piece of sampler state that will influence TLD is sRGBConversion (bit 13).

I hope that helps,

- Andy
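For illustration, the workaround described above (always keep a valid sampler in slot 0 when a shader uses TLD, and watch the sRGB conversion bit) could be sketched roughly as follows. This is a minimal model under stated assumptions, not nouveau's actual code: `struct sampler_entry`, `struct stage_samplers`, `pick_slot0_sampler`, and `slot0_applies_srgb` are hypothetical names, and the thread only gives "bit 13" without saying which TSC word holds it, so `word0` is a placeholder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of per-stage sampler bindings; not nouveau's real structs. */
#define MAX_SAMPLERS 16

/* Bit 13 per the thread; which TSC word carries it is an assumption here. */
#define TSC_SRGB_CONVERSION (1u << 13)

struct sampler_entry {
    uint32_t word0; /* assumed word holding the sRGB conversion bit */
    bool valid;     /* true once the entry has been defined */
};

struct stage_samplers {
    const struct sampler_entry *slot[MAX_SAMPLERS];
};

/*
 * Decide what should be bound at sampler slot 0.
 * If the application already has a valid sampler there, keep it.
 * Otherwise, bind the fallback only when the shader uses TLD, since
 * TLD is the instruction that reads slot 0 regardless of the texture.
 */
static const struct sampler_entry *
pick_slot0_sampler(const struct stage_samplers *st, bool shader_uses_tld,
                   const struct sampler_entry *fallback)
{
    const struct sampler_entry *s = st->slot[0];

    if (s != NULL && s->valid)
        return s;
    return shader_uses_tld ? fallback : NULL;
}

/* Report whether the slot 0 entry would make TLD apply sRGB conversion. */
static bool
slot0_applies_srgb(const struct sampler_entry *s)
{
    return s != NULL && (s->word0 & TSC_SRGB_CONVERSION) != 0;
}
```

A driver following this idea would call something like `pick_slot0_sampler` at validation time and emit the bind for whatever it returns. Note that when an application sampler already occupies slot 0, its sRGB setting is the one TLD sees, which is the case `slot0_applies_srgb` is meant to flag.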
Ilia Mirkin
2018-Mar-07 21:20 UTC
[Nouveau] TLD instruction usage in non-linked sampler mode
Hi Andy,

Thanks for checking! I do see an issue on Tesla as well (at least G92, and I believe someone else reported it on a GT215 or GT218). However, I haven't confirmed that it's identical to the issue I see on Fermi with quite as much certainty as I have on a GF108. (For the G92, the texture buffer object test fails in the same way it does on Fermi, but there could be other reasons for that.)

To be clear, when I say "unbound" for Fermi / Tesla, I mean BIND_TSC(ACTIVE=0) (method 0x2400, bit 0 == 0). Kepler doesn't have that, so presumably you mean TLD will look at TSC ID == 0, irrespective of the TSC ID encoded into the texture handle.

The sRGB thing is ... rather annoying. Especially since texelFetch has some kind of extra-specially-odd rule around sRGB conversion. I don't remember what it is, but it feels like this would potentially make it impossible to follow. But I guess going back in time and fixing the hw is also difficult.

Can you confirm whether it will always look at slot 0, even in LINKED_TSC mode (i.e. method 0x1234 == 1)? We don't currently make use of that mode in nouveau, but I'm thinking about doing it on Tesla / Fermi so that we can get more than 16 textures per stage on there.

Thanks again for the info!

-ilia
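If Kepler's TLD really does consult TSC ID 0 regardless of the handle, one possible way to cope is to reserve entry 0 of the TSC table for a known-good default sampler and never hand it out. The sketch below shows only that allocation idea. The table size, `struct tsc_table`, and the `tsc_*` functions are hypothetical and do not reflect nouveau's actual allocator.

```c
#include <stdbool.h>

/* Hypothetical, deliberately small TSC table for illustration. */
#define TSC_ENTRIES 64

struct tsc_table {
    bool used[TSC_ENTRIES];
};

/* Mark every entry free except ID 0, which is reserved for a default sampler. */
static void
tsc_table_init(struct tsc_table *t)
{
    for (int i = 0; i < TSC_ENTRIES; i++)
        t->used[i] = false;
    t->used[0] = true;
}

/* Return the lowest free ID, never 0; return -1 when the table is full. */
static int
tsc_alloc(struct tsc_table *t)
{
    for (int i = 1; i < TSC_ENTRIES; i++) {
        if (!t->used[i]) {
            t->used[i] = true;
            return i;
        }
    }
    return -1;
}

/* Release an ID; the reserved entry 0 and out-of-range IDs are ignored. */
static void
tsc_free(struct tsc_table *t, int id)
{
    if (id > 0 && id < TSC_ENTRIES)
        t->used[id] = false;
}
```

With entry 0 reserved, the driver would upload a default sampler there once, so whatever TLD reads from ID 0 is in a well-defined state. This does not address the sRGB concern: a single shared entry cannot carry a per-texture conversion setting.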