Alexandre Courbot
2016-Mar-01 07:59 UTC
[Nouveau] [PATCH] fifo/gk104: kick channel upon removal
A channel may still be processed by the PBDMA even after removal, unless it
is properly kicked. Some chips are more sensitive to this than others, with
GM20B triggering the issue very easily (the PBDMA will try to fetch methods
from the previously-removed channel after a new one is added).

Make sure this cannot happen by kicking the channel right after it is
disabled, and before the new runlist is submitted.

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
Second attempt at this - the first version was reverted after causing a
regression:
https://lists.freedesktop.org/archives/nouveau/2015-August/021842.html

I am pretty confident this one will pass though: we are now kicking the
channel only if it has a chance of being scheduled at that time, and in a
path from which we *will* access the FIFO registers after the kick, so if
nothing complained until then, kicking the channel should be
inconsequential.

Eric, since you reported the regression on the first version, would you mind
trying this one and giving us your Tested-by? I have tested it on dGPU but
unfortunately could not set it up in an optimus-like manner, with power
management kicking in.

 drm/nouveau/nvkm/engine/fifo/gpfifogk104.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c b/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c
index 2e1df01bd928..8b4a5e01829c 100644
--- a/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c
+++ b/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c
@@ -154,6 +154,7 @@ gk104_fifo_gpfifo_fini(struct nvkm_fifo_chan *base)
 	if (!list_empty(&chan->head)) {
 		gk104_fifo_runlist_remove(fifo, chan);
 		nvkm_mask(device, 0x800004 + coff, 0x00000800, 0x00000800);
+		gk104_fifo_gpfifo_kick(chan);
 		gk104_fifo_runlist_commit(fifo, chan->engine);
 	}
-- 
2.7.2
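
For readers unfamiliar with the problem, the ordering described in the patch above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: every type and function here (`toy_chan`, `toy_pbdma`, `toy_kick`, `toy_chan_fini`, `toy_pbdma_is_stale`) is hypothetical and made up for illustration, none of it is nouveau code, and real hardware behaviour is considerably more involved. It only shows why dropping a channel from the runlist without kicking it may leave the PBDMA holding a stale reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and helpers are hypothetical, not nouveau's. */
struct toy_chan {
	bool on_runlist;	/* channel is listed in the software runlist */
	bool enabled;		/* channel enable bit */
};

struct toy_pbdma {
	const struct toy_chan *loaded;	/* channel the PBDMA still holds */
};

/* Models the PBDMA picking up a scheduled channel for processing. */
static void toy_pbdma_load(struct toy_pbdma *p, const struct toy_chan *c)
{
	assert(c->on_runlist);	/* only runlist channels can be scheduled */
	p->loaded = c;
}

/* Models the kick: make the PBDMA drop the channel if it holds it. */
static void toy_kick(struct toy_pbdma *p, const struct toy_chan *c)
{
	if (p->loaded == c)
		p->loaded = NULL;
}

/*
 * Removal in the order the patch describes: take the channel off the
 * runlist, disable it, kick it, and only then commit the new runlist.
 * The 'kick' flag lets the two behaviours be compared.
 */
static void toy_chan_fini(struct toy_pbdma *p, struct toy_chan *c, bool kick)
{
	if (!c->on_runlist)
		return;		/* never scheduled, nothing to kick */

	c->on_runlist = false;
	c->enabled = false;
	if (kick)
		toy_kick(p, c);
	/* the new runlist would be committed at this point */
}

/* True if the PBDMA still references a channel that left the runlist. */
static bool toy_pbdma_is_stale(const struct toy_pbdma *p)
{
	return p->loaded != NULL && !p->loaded->on_runlist;
}
```

In this model, calling `toy_chan_fini()` with `kick` set to false after `toy_pbdma_load()` leaves `toy_pbdma_is_stale()` true, while calling it with `kick` set to true clears the reference before the new runlist would be committed.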
Eric Biggers
2016-Mar-02 03:53 UTC
[Nouveau] [PATCH] fifo/gk104: kick channel upon removal
On Tue, Mar 01, 2016 at 04:59:05PM +0900, Alexandre Courbot wrote:
>
> Eric, since you reported the regression on the first version, would you mind trying
> this one and giving us your Tested-by? I have tested it on dGPU but unfortunately
> could not set it up in an optimus-like manner, with power management kicking in.
>

Thanks. I've tested Linux 4.5-rc6 with your patch applied, and I've yet to
encounter the crash after about half a dozen reboots. So it seems good so far,
but I'll keep running it and will let you know if I notice a problem.
Alexandre Courbot
2016-Mar-02 09:07 UTC
[Nouveau] [PATCH] fifo/gk104: kick channel upon removal
On Wed, Mar 2, 2016 at 12:53 PM, Eric Biggers <ebiggers3 at gmail.com> wrote:
> On Tue, Mar 01, 2016 at 04:59:05PM +0900, Alexandre Courbot wrote:
>>
>> Eric, since you reported the regression on the first version, would you mind trying
>> this one and giving us your Tested-by? I have tested it on dGPU but unfortunately
>> could not set it up in an optimus-like manner, with power management kicking in.
>>
>
> Thanks. I've tested Linux 4.5-rc6 with your patch applied, and I've yet to
> encounter the crash after about half a dozen reboots. So it seems good so far,
> but I'll keep running it and will let you know if I notice a problem.

Great news, this one was bothering me for some time. Would be grateful if
you could let us know whether things are still going well after further
testing.