Lukas Wunner
2018-Feb-13 11:52 UTC
[Nouveau] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers
On Tue, Feb 13, 2018 at 10:55:06AM +0000, Liviu Dudau wrote:> On Sun, Feb 11, 2018 at 10:38:28AM +0100, Lukas Wunner wrote: > > DRM drivers poll connectors in 10 sec intervals. The poll worker is > > stopped on ->runtime_suspend with cancel_delayed_work_sync(). However > > the poll worker invokes the DRM drivers' ->detect callbacks, which call > > pm_runtime_get_sync(). If the poll worker starts after runtime suspend > > has begun, pm_runtime_get_sync() will wait for runtime suspend to finish > > with the intention of runtime resuming the device afterwards. The result > > is a circular wait between poll worker and autosuspend worker. > > I think I understand the problem you are trying to solve, but I'm > struggling to understand where malidp makes any specific mistakes. First > of all, malidp is only a display engine, so there is no GPU attached to > it, but that is only a small clarification. Second, malidp doesn't use > directly any of the callbacks that you are referring to, it uses the > drm_cma_xxxx() API plus the generic drm_xxxx() call. So if there are any > issues there (as they might well be) I think they would apply to a lot > more drivers and the fix will involve more than just malidp, i915 and > msm.I don't know if malidp makes any specific mistakes and didn't mean to cast it in a negative light, sorry. So a lot of DRM drivers acquire a runtime PM ref in the connector ->detect hook because they can't probe connectors while runtime suspended. E.g. for a PCI device, probing might require mmio access, which isn't possible outside of power state D0. There are no ->detect hooks declared in drivers/gpu/drm/arm/, so it's unclear to me whether you're able to probe during runtime suspend. hdlcd_drv.c and malidp_drv.c both enable output polling. Output polling is only necessary if you don't get HPD interrupts. You're not disabling polling upon runtime suspend. Thus, if a runtime PM ref is acquired during polling (such as in a ->detect hook), the GPU will be runtime resumed every 10 secs. You may want to verify that's not the case. If you decide that you do want to stop polling during runtime suspend because it runtime resumes the GPU continuously, you'll need the helper introduced in this series. So by cc'ing you I just wanted to keep you in the loop about an issue that may potentially affect your driver. Let me know if there are any questions. Thanks, Lukas
Liviu Dudau
2018-Feb-13 15:46 UTC
[Nouveau] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers
On Tue, Feb 13, 2018 at 12:52:06PM +0100, Lukas Wunner wrote:> On Tue, Feb 13, 2018 at 10:55:06AM +0000, Liviu Dudau wrote: > > On Sun, Feb 11, 2018 at 10:38:28AM +0100, Lukas Wunner wrote: > > > DRM drivers poll connectors in 10 sec intervals. The poll worker is > > > stopped on ->runtime_suspend with cancel_delayed_work_sync(). However > > > the poll worker invokes the DRM drivers' ->detect callbacks, which call > > > pm_runtime_get_sync(). If the poll worker starts after runtime suspend > > > has begun, pm_runtime_get_sync() will wait for runtime suspend to finish > > > with the intention of runtime resuming the device afterwards. The result > > > is a circular wait between poll worker and autosuspend worker. > > > > I think I understand the problem you are trying to solve, but I'm > > struggling to understand where malidp makes any specific mistakes. First > > of all, malidp is only a display engine, so there is no GPU attached to > > it, but that is only a small clarification. Second, malidp doesn't use > > directly any of the callbacks that you are referring to, it uses the > > drm_cma_xxxx() API plus the generic drm_xxxx() call. So if there are any > > issues there (as they might well be) I think they would apply to a lot > > more drivers and the fix will involve more than just malidp, i915 and > > msm. > > I don't know if malidp makes any specific mistakes and didn't mean to > cast it in a negative light, sorry.I did not take what you've said as a negative thing, only wanted to understand how you came to your conclusions.> > So a lot of DRM drivers acquire a runtime PM ref in the connector ->detect > hook because they can't probe connectors while runtime suspended. > E.g. for a PCI device, probing might require mmio access, which isn't > possible outside of power state D0. There are no ->detect hooks declared > in drivers/gpu/drm/arm/, so it's unclear to me whether you're able to probe > during runtime suspend.That's because the drivers in drivers/gpu/drm/arm do not have connectors, they are only the CRTC part of the driver. Both hdlcd and mali-dp use the component framework to locate an encoder in device tree that will then provide the connectors.> > hdlcd_drv.c and malidp_drv.c both enable output polling. Output polling > is only necessary if you don't get HPD interrupts.That's right, hdlcd and mali-dp don't receive HPD interrupts because they don't have any. And because we don't know ahead of time which encoder/connector will be paired with the driver, we enable polling as a safe fallback.> > You're not disabling polling upon runtime suspend. Thus, if a runtime PM > ref is acquired during polling (such as in a ->detect hook), the GPU will > be runtime resumed every 10 secs. You may want to verify that's not the > case. If you decide that you do want to stop polling during runtime > suspend because it runtime resumes the GPU continuously, you'll need the > helper introduced in this series. So by cc'ing you I just wanted to keep > you in the loop about an issue that may potentially affect your driver.Again, we have no GPU linked to us and the polling during runtime suspend should be handled by the driver for the paired encoder, not by us. I understand the potential issue but I'm struggling to understand if it really applies to the drivers/gpu/drm/arm drivers other than in an abstract way. Best regards, Liviu> > Let me know if there are any questions. > > Thanks, > > Lukas-- ===================| I would like to | | fix the world, | | but they're not | | giving me the | \ source code! / --------------- ¯\_(ツ)_/¯
Lukas Wunner
2018-Feb-14 13:57 UTC
[Nouveau] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers
On Tue, Feb 13, 2018 at 03:46:08PM +0000, Liviu Dudau wrote:> On Tue, Feb 13, 2018 at 12:52:06PM +0100, Lukas Wunner wrote: > > On Tue, Feb 13, 2018 at 10:55:06AM +0000, Liviu Dudau wrote: > > > On Sun, Feb 11, 2018 at 10:38:28AM +0100, Lukas Wunner wrote: > > > > DRM drivers poll connectors in 10 sec intervals. The poll worker is > > > > stopped on ->runtime_suspend with cancel_delayed_work_sync(). However > > > > the poll worker invokes the DRM drivers' ->detect callbacks, which call > > > > pm_runtime_get_sync(). If the poll worker starts after runtime suspend > > > > has begun, pm_runtime_get_sync() will wait for runtime suspend to finish > > > > with the intention of runtime resuming the device afterwards. The result > > > > is a circular wait between poll worker and autosuspend worker. > > > > > > I think I understand the problem you are trying to solve, but I'm > > > struggling to understand where malidp makes any specific mistakes. First > > > of all, malidp is only a display engine, so there is no GPU attached to > > > it, but that is only a small clarification. Second, malidp doesn't use > > > directly any of the callbacks that you are referring to, it uses the > > > drm_cma_xxxx() API plus the generic drm_xxxx() call. So if there are any > > > issues there (as they might well be) I think they would apply to a lot > > > more drivers and the fix will involve more than just malidp, i915 and > > > msm.[snip]> > There are no ->detect hooks declared > > in drivers/gpu/drm/arm/, so it's unclear to me whether you're able to probe > > during runtime suspend. > > That's because the drivers in drivers/gpu/drm/arm do not have > connectors, they are only the CRTC part of the driver. Both hdlcd and > mali-dp use the component framework to locate an encoder in device tree > that will then provide the connectors. > > > > > hdlcd_drv.c and malidp_drv.c both enable output polling. Output polling > > is only necessary if you don't get HPD interrupts. > > That's right, hdlcd and mali-dp don't receive HPD interrupts because > they don't have any. And because we don't know ahead of time which > encoder/connector will be paired with the driver, we enable polling as a > safe fallback. >Looking e.g. at inno_hdmi.c (used by rk3036.dtsi), this calls drm_helper_hpd_irq_event() on receiving an HPD interrupt, and that function returns immediately if polling is not enabled. So you *have* to enable polling to receive HPD events. You seem to keep the crtc runtime active as long as it's bound to an encoder. If you do not ever intend to runtime suspend the crtc while an encoder is attached, you don't need to keep polling enabled during runtime suspend (because there's nothing to poll), but it shouldn't hurt either. If you would runtime suspend while an encoder is attached, then you would only runtime resume every 10 sec (upon polling) if the encoder was a child of the crtc and would support runtime suspend as well. That's because the PM core wakes the parent by default when a child runtime resumes. However in the DT's I've looked at, the encoder is never a child of the crtc and at least inno_hdmi.c doesn't use runtime suspend. So I think you're all green, I can't spot any grave issues here. Just be aware of the above-mentioned constraints. Thanks, Lukas
Seemingly Similar Threads
- [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers
- [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers
- [Intel-gfx] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers
- [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers
- [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers