Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 0/8] Fix connector probing deadlocks from RPM bugs
This is the next version of https://patchwork.freedesktop.org/series/46815/ With a lot more thought put into it so as to avoid the potential deadlock scenarios I missed. This also required fixing some bogus DRM helper usage. Try and deadlock me now, nouveau. I dare you!!! Lyude Paul (8): drm/nouveau: Fix bogus drm_kms_helper_poll_enable() placement drm/nouveau: Enable polling even if we have runtime PM drm/fb_helper: Introduce hotplug_suspend/resume() drm/nouveau: Fix deadlock with fb_helper using new helpers drm/nouveau: Use pm_runtime_get_noresume() in connector_detect() drm/nouveau: Respond to HPDs by probing one conn at a time drm/nouveau: Fix deadlocks in nouveau_connector_detect() drm/nouveau: Call pm_runtime_get_noresume() from hpd handlers drivers/gpu/drm/drm_fb_helper.c | 65 +++++++++++++++++++++ drivers/gpu/drm/nouveau/nouveau_connector.c | 60 +++++++++++++++---- drivers/gpu/drm/nouveau/nouveau_connector.h | 1 + drivers/gpu/drm/nouveau/nouveau_display.c | 7 ++- drivers/gpu/drm/nouveau/nouveau_drm.c | 16 +++-- drivers/gpu/drm/nouveau/nouveau_fbcon.c | 1 + include/drm/drm_fb_helper.h | 20 +++++++ 7 files changed, 154 insertions(+), 16 deletions(-) -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 1/8] drm/nouveau: Fix bogus drm_kms_helper_poll_enable() placement
Turns out this part is my fault for not noticing when reviewing 9a2eba337cace ("drm/nouveau: Fix drm poll_helper handling"). Currently we call drm_kms_helper_poll_enable() from nouveau_display_hpd_work(). This makes basically no sense however, because that means we're calling drm_kms_helper_poll_enable() every time we schedule the hotplug detection work. This is also against the advice mentioned in drm_kms_helper_poll_enable()'s documentation: Note that calls to enable and disable polling must be strictly ordered, which is automatically the case when they're only call from suspend/resume callbacks. Of course, hotplugs can't really be ordered. They could even happen immediately after we called drm_kms_helper_poll_disable() in nouveau_display_fini(), which can lead to all sorts of issues. Additionally; enabling polling /after/ we call drm_helper_hpd_irq_event() could also mean that we'd miss a hotplug event anyway, since drm_helper_hpd_irq_event() wouldn't bother trying to probe connectors so long as polling is disabled. So; simply move this back into nouveau_display_init() again. The race condition that both of these patches attempted to work around has already been fixed properly in d61a5c106351 ("drm/nouveau: Fix deadlock on runtime suspend") Fixes: 9a2eba337cace ("drm/nouveau: Fix drm poll_helper handling") Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: Lukas Wunner <lukas at wunner.de> Cc: Peter Ujfalusi <peter.ujfalusi at ti.com> Cc: stable at vger.kernel.org --- drivers/gpu/drm/nouveau/nouveau_display.c | 7 +++++-- drivers/gpu/drm/nouveau/nouveau_drm.c | 1 - 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index ec7861457b84..1d36ab5d4796 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -355,8 +355,6 @@ nouveau_display_hpd_work(struct work_struct *work) pm_runtime_get_sync(drm->dev->dev); drm_helper_hpd_irq_event(drm->dev); - /* enable polling for external displays */ - drm_kms_helper_poll_enable(drm->dev); pm_runtime_mark_last_busy(drm->dev->dev); pm_runtime_put_sync(drm->dev->dev); @@ -411,6 +409,11 @@ nouveau_display_init(struct drm_device *dev) if (ret) return ret; + /* enable connector detection and polling for connectors without HPD + * support + */ + drm_kms_helper_poll_enable(dev); + /* enable hotplug interrupts */ drm_connector_list_iter_begin(dev, &conn_iter); nouveau_for_each_non_mst_connector_iter(connector, &conn_iter) { diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index c7ec86d6c3c9..5fdc1fbe2ee5 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -835,7 +835,6 @@ nouveau_pmops_runtime_suspend(struct device *dev) return -EBUSY; } - drm_kms_helper_poll_disable(drm_dev); nouveau_switcheroo_optimus_dsm(); ret = nouveau_do_suspend(drm_dev, true); pci_save_state(pdev); -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 2/8] drm/nouveau: Enable polling even if we have runtime PM
Having runtime PM makes no difference on whether or not we want polling, and it's now safe to just enable polling unconditionally in drm_load() thanks to d61a5c106351 ("drm/nouveau: Fix deadlock on runtime suspend") Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: Lukas Wunner <lukas at wunner.de> Cc: Peter Ujfalusi <peter.ujfalusi at ti.com> Cc: stable at vger.kernel.org --- drivers/gpu/drm/nouveau/nouveau_drm.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index 5fdc1fbe2ee5..ee2546db09c9 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -592,10 +592,11 @@ nouveau_drm_load(struct drm_device *dev, unsigned long flags) pm_runtime_allow(dev->dev); pm_runtime_mark_last_busy(dev->dev); pm_runtime_put(dev->dev); - } else { - /* enable polling for external displays */ - drm_kms_helper_poll_enable(dev); } + + /* enable polling for connectors without hpd */ + drm_kms_helper_poll_enable(dev); + return 0; fail_dispinit: -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 3/8] drm/fb_helper: Introduce hotplug_suspend/resume()
I'm sure I don't need to tell you that fb_helper's locking is a mess. That being said; fb_helper's locking mess can seriously complicate the runtime suspend/resume operations of drivers because it can invoke atomic commits and connector probing from anywhere that calls drm_fb_helper_hotplug_event(). Since most drivers use drm_fb_helper_output_poll_changed() as their output_poll_changed handler, this can happen in every single context that can fire off a hotplug event. An example: [ 246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds. [ 246.673398] Not tainted 4.18.0-rc5Lyude-Test+ #2 [ 246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 246.676527] kworker/4:0 D 0 37 2 0x80000000 [ 246.677580] Workqueue: events output_poll_execute [drm_kms_helper] [ 246.678704] Call Trace: [ 246.679753] __schedule+0x322/0xaf0 [ 246.680916] schedule+0x33/0x90 [ 246.681924] schedule_preempt_disabled+0x15/0x20 [ 246.683023] __mutex_lock+0x569/0x9a0 [ 246.684035] ? kobject_uevent_env+0x117/0x7b0 [ 246.685132] ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] [ 246.686179] mutex_lock_nested+0x1b/0x20 [ 246.687278] ? mutex_lock_nested+0x1b/0x20 [ 246.688307] drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] [ 246.689420] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper] [ 246.690462] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper] [ 246.691570] output_poll_execute+0x198/0x1c0 [drm_kms_helper] [ 246.692611] process_one_work+0x231/0x620 [ 246.693725] worker_thread+0x214/0x3a0 [ 246.694756] kthread+0x12b/0x150 [ 246.695856] ? wq_pool_ids_show+0x140/0x140 [ 246.696888] ? kthread_create_worker_on_cpu+0x70/0x70 [ 246.697998] ret_from_fork+0x3a/0x50 [ 246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds. [ 246.700153] Not tainted 4.18.0-rc5Lyude-Test+ #2 [ 246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 246.702278] kworker/0:1 D 0 60 2 0x80000000 [ 246.703293] Workqueue: pm pm_runtime_work [ 246.704393] Call Trace: [ 246.705403] __schedule+0x322/0xaf0 [ 246.706439] ? wait_for_completion+0x104/0x190 [ 246.707393] schedule+0x33/0x90 [ 246.708375] schedule_timeout+0x3a5/0x590 [ 246.709289] ? mark_held_locks+0x58/0x80 [ 246.710208] ? _raw_spin_unlock_irq+0x2c/0x40 [ 246.711222] ? wait_for_completion+0x104/0x190 [ 246.712134] ? trace_hardirqs_on_caller+0xf4/0x190 [ 246.713094] ? wait_for_completion+0x104/0x190 [ 246.713964] wait_for_completion+0x12c/0x190 [ 246.714895] ? wake_up_q+0x80/0x80 [ 246.715727] ? get_work_pool+0x90/0x90 [ 246.716649] flush_work+0x1c9/0x280 [ 246.717483] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0 [ 246.718442] __cancel_work_timer+0x146/0x1d0 [ 246.719247] cancel_delayed_work_sync+0x13/0x20 [ 246.720043] drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper] [ 246.721123] nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau] [ 246.721897] pci_pm_runtime_suspend+0x6b/0x190 [ 246.722825] ? pci_has_legacy_pm_support+0x70/0x70 [ 246.723737] __rpm_callback+0x7a/0x1d0 [ 246.724721] ? pci_has_legacy_pm_support+0x70/0x70 [ 246.725607] rpm_callback+0x24/0x80 [ 246.726553] ? pci_has_legacy_pm_support+0x70/0x70 [ 246.727376] rpm_suspend+0x142/0x6b0 [ 246.728185] pm_runtime_work+0x97/0xc0 [ 246.728938] process_one_work+0x231/0x620 [ 246.729796] worker_thread+0x44/0x3a0 [ 246.730614] kthread+0x12b/0x150 [ 246.731395] ? wq_pool_ids_show+0x140/0x140 [ 246.732202] ? kthread_create_worker_on_cpu+0x70/0x70 [ 246.732878] ret_from_fork+0x3a/0x50 [ 246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds. [ 246.734587] Not tainted 4.18.0-rc5Lyude-Test+ #2 [ 246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 246.736113] kworker/4:2 D 0 422 2 0x80000080 [ 246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper] [ 246.737665] Call Trace: [ 246.738490] __schedule+0x322/0xaf0 [ 246.739250] schedule+0x33/0x90 [ 246.739908] rpm_resume+0x19c/0x850 [ 246.740750] ? finish_wait+0x90/0x90 [ 246.741541] __pm_runtime_resume+0x4e/0x90 [ 246.742370] nv50_disp_atomic_commit+0x31/0x210 [nouveau] [ 246.743124] drm_atomic_commit+0x4a/0x50 [drm] [ 246.743775] restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper] [ 246.744603] restore_fbdev_mode+0x31/0x140 [drm_kms_helper] [ 246.745373] drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper] [ 246.746220] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper] [ 246.746884] drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper] [ 246.747675] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper] [ 246.748544] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper] [ 246.749439] nv50_mstm_hotplug+0x15/0x20 [nouveau] [ 246.750111] drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper] [ 246.750764] drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper] [ 246.751602] drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper] [ 246.752314] process_one_work+0x231/0x620 [ 246.752979] worker_thread+0x44/0x3a0 [ 246.753838] kthread+0x12b/0x150 [ 246.754619] ? wq_pool_ids_show+0x140/0x140 [ 246.755386] ? kthread_create_worker_on_cpu+0x70/0x70 [ 246.756162] ret_from_fork+0x3a/0x50 [ 246.756847] Showing all locks held in the system: [ 246.758261] 3 locks held by kworker/4:0/37: [ 246.759016] #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620 [ 246.759856] #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620 [ 246.760670] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] [ 246.761516] 2 locks held by kworker/0:1/60: [ 246.762274] #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620 [ 246.762982] #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620 [ 246.763890] 1 lock held by khungtaskd/64: [ 246.764664] #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185 [ 246.765588] 5 locks held by kworker/4:2/422: [ 246.766440] #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620 [ 246.767390] #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620 [ 246.768154] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper] [ 246.768966] #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper] [ 246.769921] #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm] [ 246.770839] 1 lock held by dmesg/1038: [ 246.771739] 2 locks held by zsh/1172: [ 246.772650] #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 [ 246.773680] #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870 [ 246.775522] ============================================ Because of this, there's an unreasonable number of places that drm drivers would need to insert special handling to prevent trying to resume the device from all of these contexts that can deadlock. It's difficult even to try synchronizing with fb_helper in these contexts as well, since any of them could introduce a deadlock by waiting to acquire the top-level fb_helper mutex, while it's being held by another thread that might potentially call down to pm_runtime_get_sync(). Luckily-there's no actual reason we need to allow fb_helper to handle hotplugging at all when runtime suspending a device. If a hotplug happens during a runtime suspend operation, there's no reason the driver can't just re-enable fbcon's hotplug handling and bring it up to speed with hotplugging events it may have missed by calling drm_fb_helper_hotplug_event(). So, let's make this easy and just add helpers to handle disabling and enabling fb_helper connector probing() without having to potentially wait on fb_helper to finish it's work. This will let us fix the runtime suspend/resume deadlocks that we've been experiencing with nouveau, along with being able to fix some of the incorrect runtime PM core interaction that other DRM drivers currently perform to work around these issues. Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: stable at vger.kernel.org Cc: Lukas Wunner <lukas at wunner.de> Cc: Karol Herbst <karolherbst at gmail.com> --- drivers/gpu/drm/drm_fb_helper.c | 65 +++++++++++++++++++++++++++++++++ include/drm/drm_fb_helper.h | 20 ++++++++++ 2 files changed, 85 insertions(+) diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 2ee1eaa66188..28d59befbc92 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -2733,6 +2733,66 @@ int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel) } EXPORT_SYMBOL(drm_fb_helper_initial_config); +/** + * drm_fb_helper_resume_hotplug - Allow fb_helper to handle probing new + * connectors again, and bring it up to date + * with the current device's connector status + * fb_helper: driver-allocated fbdev helper, can be NULL + * + * Allow fb_helper to react to connector status changes again after having + * been disabled through drm_fb_helper_suspend_hotplug(). This also schedules + * a fb_helper hotplug event to bring fb_helper up to date with the current + * status of the DRM device's connectors. + */ +void +drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) +{ + fb_helper->hotplug_suspended = false; + drm_fb_helper_hotplug_event(fb_helper); +} +EXPORT_SYMBOL(drm_fb_helper_resume_hotplug); + +/** + * drm_fb_helper_suspend_hotplug - Attempt to temporarily inhibit fb_helper's + * ability to respond to connector changes + * without waiting + * @fb_helper: driver-allocated fbdev helper, can be NULL + * + * Temporarily prevent fb_helper from being able to handle connector changes, + * but only if it isn't busy doing something else. The connector probing + * routines in fb_helper can potentially grab any modesetting lock imaginable, + * which dramatically complicates the runtime suspend process. This helper can + * be used to simplify the process of runtime suspend by allowing the driver + * to disable unpredictable fb_helper operations as early as possible, without + * requiring that we try waiting on fb_helper (as this could lead to very + * difficult to solve deadlocks with runtime suspend code if fb_helper ends up + * needing to acquire a runtime PM reference). + * + * This call should be put at the very start of a driver's runtime suspend + * operation if desired. The driver must be responsible for re-enabling + * fb_helper hotplug handling when normal hotplug detection becomes available + * on the device again. This will usually happen if runtime suspend is + * aborted, or when the device is runtime resumed. + * + * Returns: true if hotplug handling was disabled, false if disabling hotplug + * handling would mean waiting on fb_helper. + */ +bool +drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper) +{ + int ret; + + ret = mutex_trylock(&fb_helper->lock); + if (!ret) + return false; + + fb_helper->hotplug_suspended = true; + mutex_unlock(&fb_helper->lock); + + return true; +} +EXPORT_SYMBOL(drm_fb_helper_suspend_hotplug); + /** * drm_fb_helper_hotplug_event - respond to a hotplug notification by * probing all the outputs attached to the fb @@ -2774,6 +2834,11 @@ int drm_fb_helper_hotplug_event(struct drm_fb_helper *fb_helper) return err; } + if (fb_helper->hotplug_suspended) { + mutex_unlock(&fb_helper->lock); + return err; + } + DRM_DEBUG_KMS("\n"); drm_setup_crtcs(fb_helper, fb_helper->fb->width, fb_helper->fb->height); diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h index b069433e7fc1..30881131075c 100644 --- a/include/drm/drm_fb_helper.h +++ b/include/drm/drm_fb_helper.h @@ -232,6 +232,14 @@ struct drm_fb_helper { * See also: @deferred_setup */ int preferred_bpp; + + /** + * @hotplug_suspended: + * + * Whether or not we can currently handle hotplug events, or if we + * need to wait for the DRM device to uninhibit us. + */ + bool hotplug_suspended; }; /** @@ -330,6 +338,10 @@ void drm_fb_helper_fbdev_teardown(struct drm_device *dev); void drm_fb_helper_lastclose(struct drm_device *dev); void drm_fb_helper_output_poll_changed(struct drm_device *dev); + +void drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper); +bool drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper); + #else static inline void drm_fb_helper_prepare(struct drm_device *dev, struct drm_fb_helper *helper, @@ -564,6 +576,14 @@ static inline void drm_fb_helper_output_poll_changed(struct drm_device *dev) { } +static inline void +drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) +{ +} +static inline bool +drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper) +{ +} #endif static inline int -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 4/8] drm/nouveau: Fix deadlock with fb_helper using new helpers
This removes the potential of deadlocking with fb_helper entirely by preventing it from handling hotplugs during the runtime suspend process as early as possible in the suspend process. If it turns out this is not possible, due to some fb_helper action having been queued up before we got a time to disable hotplugging, we simply return -EBUSY so that the runtime PM core attempts autosuspending the device again once fb_helper isn't doing anything. This fixes one of the issues causing deadlocks on runtime suspend/resume with nouveau on my P50. Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: stable at vger.kernel.org Cc: Lukas Wunner <lukas at wunner.de> Cc: Karol Herbst <karolherbst at gmail.com> --- drivers/gpu/drm/nouveau/nouveau_drm.c | 8 ++++++++ drivers/gpu/drm/nouveau/nouveau_fbcon.c | 1 + 2 files changed, 9 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index ee2546db09c9..d47cb5b2af98 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -836,6 +836,14 @@ nouveau_pmops_runtime_suspend(struct device *dev) return -EBUSY; } + /* There's no way for us to stop fb_helper work in reaction to + * hotplugs later in the RPM process. First off: we don't want to, + * fb_helper should be able to keep the GPU awake. Second off: it is + * capable of grabbing basically any lock in existence. + */ + if (!drm_fb_helper_suspend_hotplug(drm_dev->fb_helper)) + return -EBUSY; + nouveau_switcheroo_optimus_dsm(); ret = nouveau_do_suspend(drm_dev, true); pci_save_state(pdev); diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c index 85c1f10bc2b6..963ba630fd04 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c +++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c @@ -466,6 +466,7 @@ nouveau_fbcon_set_suspend_work(struct work_struct *work) console_unlock(); if (state == FBINFO_STATE_RUNNING) { + drm_fb_helper_resume_hotplug(drm->dev->fb_helper); pm_runtime_mark_last_busy(drm->dev->dev); pm_runtime_put_sync(drm->dev->dev); } -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 5/8] drm/nouveau: Use pm_runtime_get_noresume() in connector_detect()
It's true we can't resume the device from poll workers in nouveau_connector_detect(). We can however, prevent the autosuspend timer from elapsing immediately if it hasn't already without risking any sort of deadlock with the runtime suspend/resume operations. So do that instead of entirely avoiding grabbing a power reference. Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: stable at vger.kernel.org Cc: Lukas Wunner <lukas at wunner.de> Cc: Karol Herbst <karolherbst at gmail.com> --- drivers/gpu/drm/nouveau/nouveau_connector.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 2a45b4c2ceb0..010d6db14cba 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -572,12 +572,16 @@ nouveau_connector_detect(struct drm_connector *connector, bool force) nv_connector->edid = NULL; } - /* Outputs are only polled while runtime active, so acquiring a - * runtime PM ref here is unnecessary (and would deadlock upon - * runtime suspend because it waits for polling to finish). + /* Outputs are only polled while runtime active, so resuming the + * device here is unnecessary (and would deadlock upon runtime suspend + * because it waits for polling to finish). We do however, want to + * prevent the autosuspend timer from elapsing during this operation + * if possible. */ - if (!drm_kms_helper_is_poll_worker()) { - ret = pm_runtime_get_sync(connector->dev->dev); + if (drm_kms_helper_is_poll_worker()) { + pm_runtime_get_noresume(dev->dev); + } else { + ret = pm_runtime_get_sync(dev->dev); if (ret < 0 && ret != -EACCES) return conn_status; } @@ -655,10 +659,8 @@ nouveau_connector_detect(struct drm_connector *connector, bool force) out: - if (!drm_kms_helper_is_poll_worker()) { - pm_runtime_mark_last_busy(connector->dev->dev); - pm_runtime_put_autosuspend(connector->dev->dev); - } + pm_runtime_mark_last_busy(dev->dev); + pm_runtime_put_autosuspend(dev->dev); return conn_status; } -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 6/8] drm/nouveau: Respond to HPDs by probing one conn at a time
There isn't actually any reason we need to call drm_hpd_irq_event() from our hotplug handler, as we already know which connector the hotplug event was fired for. We're also going to need to avoid probing all connectors needlessly from hotplug handlers anyway so that we can track when nouveau_connector_detect() is being called from the context of it's connector's hotplug handler in order to fix the next deadlocking issue. This is (slightly) faster anyway! Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: stable at vger.kernel.org Cc: Lukas Wunner <lukas at wunner.de> Cc: Karol Herbst <karolherbst at gmail.com> --- drivers/gpu/drm/nouveau/nouveau_connector.c | 28 ++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 010d6db14cba..9714e09f17db 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -1114,6 +1114,32 @@ nouveau_connector_funcs_lvds = { .atomic_get_property = nouveau_conn_atomic_get_property, }; +static void +nouveau_connector_hotplug_probe(struct nouveau_connector *nv_conn) +{ + struct drm_modeset_acquire_ctx ctx; + struct drm_connector *conn = &nv_conn->base; + enum drm_connector_status old_status; + struct drm_device *dev = conn->dev; + bool changed; + + mutex_lock(&dev->mode_config.mutex); + + drm_modeset_acquire_init(&ctx, 0); + drm_modeset_lock(&dev->mode_config.connection_mutex, &ctx); + + old_status = conn->status; + conn->status = drm_helper_probe_detect(conn, &ctx, true); + changed = old_status != conn->status; + + drm_modeset_drop_locks(&ctx); + drm_modeset_acquire_fini(&ctx); + mutex_unlock(&dev->mode_config.mutex); + + if (changed) + drm_kms_helper_hotplug_event(dev); +} + static int nouveau_connector_hotplug(struct nvif_notify *notify) { @@ -1138,7 +1164,7 @@ nouveau_connector_hotplug(struct nvif_notify *notify) nv50_mstm_remove(nv_encoder->dp.mstm); } - drm_helper_hpd_irq_event(connector->dev); + nouveau_connector_hotplug_probe(nv_connector); } return NVIF_NOTIFY_KEEP; -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 7/8] drm/nouveau: Fix deadlocks in nouveau_connector_detect()
When we disable hotplugging on the GPU, we need to be able to synchronize with each connector's hotplug interrupt handler before the interrupt is finally disabled. This can be a problem however, since nouveau_connector_detect() currently grabs a runtime power reference when handling connector probing. This will deadlock the runtime suspend handler like so: [ 861.480896] INFO: task kworker/0:2:61 blocked for more than 120 seconds. [ 861.483290] Tainted: G O 4.18.0-rc6Lyude-Test+ #1 [ 861.485158] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 861.486332] kworker/0:2 D 0 61 2 0x80000000 [ 861.487044] Workqueue: events nouveau_display_hpd_work [nouveau] [ 861.487737] Call Trace: [ 861.488394] __schedule+0x322/0xaf0 [ 861.489070] schedule+0x33/0x90 [ 861.489744] rpm_resume+0x19c/0x850 [ 861.490392] ? finish_wait+0x90/0x90 [ 861.491068] __pm_runtime_resume+0x4e/0x90 [ 861.491753] nouveau_display_hpd_work+0x22/0x60 [nouveau] [ 861.492416] process_one_work+0x231/0x620 [ 861.493068] worker_thread+0x44/0x3a0 [ 861.493722] kthread+0x12b/0x150 [ 861.494342] ? wq_pool_ids_show+0x140/0x140 [ 861.494991] ? kthread_create_worker_on_cpu+0x70/0x70 [ 861.495648] ret_from_fork+0x3a/0x50 [ 861.496304] INFO: task kworker/6:2:320 blocked for more than 120 seconds. [ 861.496968] Tainted: G O 4.18.0-rc6Lyude-Test+ #1 [ 861.497654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 861.498341] kworker/6:2 D 0 320 2 0x80000080 [ 861.499045] Workqueue: pm pm_runtime_work [ 861.499739] Call Trace: [ 861.500428] __schedule+0x322/0xaf0 [ 861.501134] ? wait_for_completion+0x104/0x190 [ 861.501851] schedule+0x33/0x90 [ 861.502564] schedule_timeout+0x3a5/0x590 [ 861.503284] ? mark_held_locks+0x58/0x80 [ 861.503988] ? _raw_spin_unlock_irq+0x2c/0x40 [ 861.504710] ? wait_for_completion+0x104/0x190 [ 861.505417] ? trace_hardirqs_on_caller+0xf4/0x190 [ 861.506136] ? wait_for_completion+0x104/0x190 [ 861.506845] wait_for_completion+0x12c/0x190 [ 861.507555] ? wake_up_q+0x80/0x80 [ 861.508268] flush_work+0x1c9/0x280 [ 861.508990] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0 [ 861.509735] nvif_notify_put+0xb1/0xc0 [nouveau] [ 861.510482] nouveau_display_fini+0xbd/0x170 [nouveau] [ 861.511241] nouveau_display_suspend+0x67/0x120 [nouveau] [ 861.511969] nouveau_do_suspend+0x5e/0x2d0 [nouveau] [ 861.512715] nouveau_pmops_runtime_suspend+0x47/0xb0 [nouveau] [ 861.513435] pci_pm_runtime_suspend+0x6b/0x180 [ 861.514165] ? pci_has_legacy_pm_support+0x70/0x70 [ 861.514897] __rpm_callback+0x7a/0x1d0 [ 861.515618] ? pci_has_legacy_pm_support+0x70/0x70 [ 861.516313] rpm_callback+0x24/0x80 [ 861.517027] ? pci_has_legacy_pm_support+0x70/0x70 [ 861.517741] rpm_suspend+0x142/0x6b0 [ 861.518449] pm_runtime_work+0x97/0xc0 [ 861.519144] process_one_work+0x231/0x620 [ 861.519831] worker_thread+0x44/0x3a0 [ 861.520522] kthread+0x12b/0x150 [ 861.521220] ? wq_pool_ids_show+0x140/0x140 [ 861.521925] ? kthread_create_worker_on_cpu+0x70/0x70 [ 861.522622] ret_from_fork+0x3a/0x50 [ 861.523299] INFO: task kworker/6:0:1329 blocked for more than 120 seconds. [ 861.523977] Tainted: G O 4.18.0-rc6Lyude-Test+ #1 [ 861.524644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 861.525349] kworker/6:0 D 0 1329 2 0x80000000 [ 861.526073] Workqueue: events nvif_notify_work [nouveau] [ 861.526751] Call Trace: [ 861.527411] __schedule+0x322/0xaf0 [ 861.528089] schedule+0x33/0x90 [ 861.528758] rpm_resume+0x19c/0x850 [ 861.529399] ? finish_wait+0x90/0x90 [ 861.530073] __pm_runtime_resume+0x4e/0x90 [ 861.530798] nouveau_connector_detect+0x7e/0x510 [nouveau] [ 861.531459] ? ww_mutex_lock+0x47/0x80 [ 861.532097] ? ww_mutex_lock+0x47/0x80 [ 861.532819] ? drm_modeset_lock+0x88/0x130 [drm] [ 861.533481] drm_helper_probe_detect_ctx+0xa0/0x100 [drm_kms_helper] [ 861.534127] drm_helper_hpd_irq_event+0xa4/0x120 [drm_kms_helper] [ 861.534940] nouveau_connector_hotplug+0x98/0x120 [nouveau] [ 861.535556] nvif_notify_work+0x2d/0xb0 [nouveau] [ 861.536221] process_one_work+0x231/0x620 [ 861.536994] worker_thread+0x44/0x3a0 [ 861.537757] kthread+0x12b/0x150 [ 861.538463] ? wq_pool_ids_show+0x140/0x140 [ 861.539102] ? kthread_create_worker_on_cpu+0x70/0x70 [ 861.539815] ret_from_fork+0x3a/0x50 [ 861.540521] Showing all locks held in the system: [ 861.541696] 2 locks held by kworker/0:2/61: [ 861.542406] #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620 [ 861.543071] #1: 0000000076868126 ((work_completion)(&drm->hpd_work)){+.+.}, at: process_one_work+0x1b3/0x620 [ 861.543814] 1 lock held by khungtaskd/64: [ 861.544535] #0: 0000000059db4b53 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185 [ 861.545160] 3 locks held by kworker/6:2/320: [ 861.545896] #0: 00000000d9e1bc59 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620 [ 861.546702] #1: 00000000c9f92d84 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620 [ 861.547443] #2: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: nouveau_display_fini+0x96/0x170 [nouveau] [ 861.548146] 1 lock held by dmesg/983: [ 861.548889] 2 locks held by zsh/1250: [ 861.549605] #0: 00000000348e3cf6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 [ 861.550393] #1: 000000007009a7a8 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870 [ 861.551122] 6 locks held by kworker/6:0/1329: [ 861.551957] #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620 [ 861.552765] #1: 00000000ddb499ad ((work_completion)(¬ify->work)#2){+.+.}, at: process_one_work+0x1b3/0x620 [ 861.553582] #2: 000000006e013cbe (&dev->mode_config.mutex){+.+.}, at: drm_helper_hpd_irq_event+0x6c/0x120 [drm_kms_helper] [ 861.554357] #3: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: drm_helper_hpd_irq_event+0x78/0x120 [drm_kms_helper] [ 861.555227] #4: 0000000044f294d9 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x3d/0x100 [drm_kms_helper] [ 861.556133] #5: 00000000db193642 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x4b/0x130 [drm] [ 861.557864] ============================================ [ 861.559507] NMI backtrace for cpu 2 [ 861.560363] CPU: 2 PID: 64 Comm: khungtaskd Tainted: G O 4.18.0-rc6Lyude-Test+ #1 [ 861.561197] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018 [ 861.561948] Call Trace: [ 861.562757] dump_stack+0x8e/0xd3 [ 861.563516] nmi_cpu_backtrace.cold.3+0x14/0x5a [ 861.564269] ? lapic_can_unplug_cpu.cold.27+0x42/0x42 [ 861.565029] nmi_trigger_cpumask_backtrace+0xa1/0xae [ 861.565789] arch_trigger_cpumask_backtrace+0x19/0x20 [ 861.566558] watchdog+0x316/0x580 [ 861.567355] kthread+0x12b/0x150 [ 861.568114] ? reset_hung_task_detector+0x20/0x20 [ 861.568863] ? kthread_create_worker_on_cpu+0x70/0x70 [ 861.569598] ret_from_fork+0x3a/0x50 [ 861.570370] Sending NMI from CPU 2 to CPUs 0-1,3-7: [ 861.571426] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120 [ 861.571429] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120 [ 861.571432] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120 [ 861.571464] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120 [ 861.571467] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120 [ 861.571469] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120 [ 861.571472] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7f/0x120 [ 861.572428] Kernel panic - not syncing: hung_task: blocked tasks So: fix this with a new trick; store the current task_struct that's executing in the nouveau_connector structure, then avoid attempting to runtime resume the device when we know that we're just running from the context of our hotplug interrupt handler. Since hpd interrupts are only enabled while the device is runtime active, this should be totally safe. Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: stable at vger.kernel.org Cc: Lukas Wunner <lukas at wunner.de> Cc: Karol Herbst <karolherbst at gmail.com> --- drivers/gpu/drm/nouveau/nouveau_connector.c | 16 ++++++++++------ drivers/gpu/drm/nouveau/nouveau_connector.h | 1 + 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 9714e09f17db..8409c3f2c3a1 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -572,13 +572,14 @@ nouveau_connector_detect(struct drm_connector *connector, bool force) nv_connector->edid = NULL; } - /* Outputs are only polled while runtime active, so resuming the - * device here is unnecessary (and would deadlock upon runtime suspend - * because it waits for polling to finish). We do however, want to - * prevent the autosuspend timer from elapsing during this operation - * if possible. + /* Output polling and HPD only happens while we're runtime active, so + * resuming the device here is unnecessary (and would deadlock upon + * runtime suspend because it waits for polling to finish). We do + * however, want to prevent the autosuspend timer from elapsing during + * this operation if possible. */ - if (drm_kms_helper_is_poll_worker()) { + if (drm_kms_helper_is_poll_worker() || + nv_connector->hpd_task == current) { pm_runtime_get_noresume(dev->dev); } else { ret = pm_runtime_get_sync(dev->dev); @@ -1151,6 +1152,8 @@ nouveau_connector_hotplug(struct nvif_notify *notify) const char *name = connector->name; struct nouveau_encoder *nv_encoder; + nv_connector->hpd_task = current; + if (rep->mask & NVIF_NOTIFY_CONN_V0_IRQ) { NV_DEBUG(drm, "service %s\n", name); if ((nv_encoder = find_encoder(connector, DCB_OUTPUT_DP))) @@ -1167,6 +1170,7 @@ nouveau_connector_hotplug(struct nvif_notify *notify) nouveau_connector_hotplug_probe(nv_connector); } + nv_connector->hpd_task = NULL; return NVIF_NOTIFY_KEEP; } diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.h b/drivers/gpu/drm/nouveau/nouveau_connector.h index 2d9d35a146a4..1964e682ba13 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.h +++ b/drivers/gpu/drm/nouveau/nouveau_connector.h @@ -45,6 +45,7 @@ struct nouveau_connector { u8 *dcb; struct nvif_notify hpd; + struct task_struct *hpd_task; struct drm_dp_aux aux; -- 2.17.1
Lyude Paul
2018-Jul-31 00:39 UTC
[Nouveau] [PATCH v3 8/8] drm/nouveau: Call pm_runtime_get_noresume() from hpd handlers
We can't and don't need to try resuming the device from our hotplug handlers, but hotplug events are generally something we'd like to keep the device awake for whenever possible. So, grab a PM ref safely in our hotplug handlers using pm_runtime_get_noresume() and mark the device as busy once we're finished. Signed-off-by: Lyude Paul <lyude at redhat.com> Cc: stable at vger.kernel.org Cc: Lukas Wunner <lukas at wunner.de> Cc: Karol Herbst <karolherbst at gmail.com> --- drivers/gpu/drm/nouveau/nouveau_connector.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 8409c3f2c3a1..5a8e8c1ad647 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -1152,6 +1152,11 @@ nouveau_connector_hotplug(struct nvif_notify *notify) const char *name = connector->name; struct nouveau_encoder *nv_encoder; + /* Resuming the device here isn't possible; but the suspend PM ops + * will wait for us to finish our work before disabling us so this + * should be enough + */ + pm_runtime_get_noresume(drm->dev->dev); nv_connector->hpd_task = current; if (rep->mask & NVIF_NOTIFY_CONN_V0_IRQ) { @@ -1171,6 +1176,9 @@ nouveau_connector_hotplug(struct nvif_notify *notify) } nv_connector->hpd_task = NULL; + + pm_runtime_mark_last_busy(drm->dev->dev); + pm_runtime_put_autosuspend(drm->dev->dev); return NVIF_NOTIFY_KEEP; } -- 2.17.1
William Lewis
2018-Jul-31 14:14 UTC
[Nouveau] [PATCH v3 3/8] drm/fb_helper: Introduce hotplug_suspend/resume()
On 07/30/2018 07:39 PM, Lyude Paul wrote:> I'm sure I don't need to tell you that fb_helper's locking is a mess. > That being said; fb_helper's locking mess can seriously complicate the > runtime suspend/resume operations of drivers because it can invoke > atomic commits and connector probing from anywhere that calls > drm_fb_helper_hotplug_event(). Since most drivers use > drm_fb_helper_output_poll_changed() as their output_poll_changed > handler, this can happen in every single context that can fire off a > hotplug event. An example: > > [ 246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds. > [ 246.673398] Not tainted 4.18.0-rc5Lyude-Test+ #2 > [ 246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 246.676527] kworker/4:0 D 0 37 2 0x80000000 > [ 246.677580] Workqueue: events output_poll_execute [drm_kms_helper] > [ 246.678704] Call Trace: > [ 246.679753] __schedule+0x322/0xaf0 > [ 246.680916] schedule+0x33/0x90 > [ 246.681924] schedule_preempt_disabled+0x15/0x20 > [ 246.683023] __mutex_lock+0x569/0x9a0 > [ 246.684035] ? kobject_uevent_env+0x117/0x7b0 > [ 246.685132] ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] > [ 246.686179] mutex_lock_nested+0x1b/0x20 > [ 246.687278] ? mutex_lock_nested+0x1b/0x20 > [ 246.688307] drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] > [ 246.689420] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper] > [ 246.690462] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper] > [ 246.691570] output_poll_execute+0x198/0x1c0 [drm_kms_helper] > [ 246.692611] process_one_work+0x231/0x620 > [ 246.693725] worker_thread+0x214/0x3a0 > [ 246.694756] kthread+0x12b/0x150 > [ 246.695856] ? wq_pool_ids_show+0x140/0x140 > [ 246.696888] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 246.697998] ret_from_fork+0x3a/0x50 > [ 246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds. > [ 246.700153] Not tainted 4.18.0-rc5Lyude-Test+ #2 > [ 246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 246.702278] kworker/0:1 D 0 60 2 0x80000000 > [ 246.703293] Workqueue: pm pm_runtime_work > [ 246.704393] Call Trace: > [ 246.705403] __schedule+0x322/0xaf0 > [ 246.706439] ? wait_for_completion+0x104/0x190 > [ 246.707393] schedule+0x33/0x90 > [ 246.708375] schedule_timeout+0x3a5/0x590 > [ 246.709289] ? mark_held_locks+0x58/0x80 > [ 246.710208] ? _raw_spin_unlock_irq+0x2c/0x40 > [ 246.711222] ? wait_for_completion+0x104/0x190 > [ 246.712134] ? trace_hardirqs_on_caller+0xf4/0x190 > [ 246.713094] ? wait_for_completion+0x104/0x190 > [ 246.713964] wait_for_completion+0x12c/0x190 > [ 246.714895] ? wake_up_q+0x80/0x80 > [ 246.715727] ? get_work_pool+0x90/0x90 > [ 246.716649] flush_work+0x1c9/0x280 > [ 246.717483] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0 > [ 246.718442] __cancel_work_timer+0x146/0x1d0 > [ 246.719247] cancel_delayed_work_sync+0x13/0x20 > [ 246.720043] drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper] > [ 246.721123] nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau] > [ 246.721897] pci_pm_runtime_suspend+0x6b/0x190 > [ 246.722825] ? pci_has_legacy_pm_support+0x70/0x70 > [ 246.723737] __rpm_callback+0x7a/0x1d0 > [ 246.724721] ? pci_has_legacy_pm_support+0x70/0x70 > [ 246.725607] rpm_callback+0x24/0x80 > [ 246.726553] ? pci_has_legacy_pm_support+0x70/0x70 > [ 246.727376] rpm_suspend+0x142/0x6b0 > [ 246.728185] pm_runtime_work+0x97/0xc0 > [ 246.728938] process_one_work+0x231/0x620 > [ 246.729796] worker_thread+0x44/0x3a0 > [ 246.730614] kthread+0x12b/0x150 > [ 246.731395] ? wq_pool_ids_show+0x140/0x140 > [ 246.732202] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 246.732878] ret_from_fork+0x3a/0x50 > [ 246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds. > [ 246.734587] Not tainted 4.18.0-rc5Lyude-Test+ #2 > [ 246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 246.736113] kworker/4:2 D 0 422 2 0x80000080 > [ 246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper] > [ 246.737665] Call Trace: > [ 246.738490] __schedule+0x322/0xaf0 > [ 246.739250] schedule+0x33/0x90 > [ 246.739908] rpm_resume+0x19c/0x850 > [ 246.740750] ? finish_wait+0x90/0x90 > [ 246.741541] __pm_runtime_resume+0x4e/0x90 > [ 246.742370] nv50_disp_atomic_commit+0x31/0x210 [nouveau] > [ 246.743124] drm_atomic_commit+0x4a/0x50 [drm] > [ 246.743775] restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper] > [ 246.744603] restore_fbdev_mode+0x31/0x140 [drm_kms_helper] > [ 246.745373] drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper] > [ 246.746220] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper] > [ 246.746884] drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper] > [ 246.747675] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper] > [ 246.748544] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper] > [ 246.749439] nv50_mstm_hotplug+0x15/0x20 [nouveau] > [ 246.750111] drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper] > [ 246.750764] drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper] > [ 246.751602] drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper] > [ 246.752314] process_one_work+0x231/0x620 > [ 246.752979] worker_thread+0x44/0x3a0 > [ 246.753838] kthread+0x12b/0x150 > [ 246.754619] ? wq_pool_ids_show+0x140/0x140 > [ 246.755386] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 246.756162] ret_from_fork+0x3a/0x50 > [ 246.756847] > Showing all locks held in the system: > [ 246.758261] 3 locks held by kworker/4:0/37: > [ 246.759016] #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.759856] #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.760670] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] > [ 246.761516] 2 locks held by kworker/0:1/60: > [ 246.762274] #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.762982] #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.763890] 1 lock held by khungtaskd/64: > [ 246.764664] #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185 > [ 246.765588] 5 locks held by kworker/4:2/422: > [ 246.766440] #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.767390] #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.768154] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper] > [ 246.768966] #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper] > [ 246.769921] #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm] > [ 246.770839] 1 lock held by dmesg/1038: > [ 246.771739] 2 locks held by zsh/1172: > [ 246.772650] #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 > [ 246.773680] #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870 > > [ 246.775522] ============================================> > Because of this, there's an unreasonable number of places that drm > drivers would need to insert special handling to prevent trying to > resume the device from all of these contexts that can deadlock. It's > difficult even to try synchronizing with fb_helper in these contexts as > well, since any of them could introduce a deadlock by waiting to acquire > the top-level fb_helper mutex, while it's being held by another thread > that might potentially call down to pm_runtime_get_sync(). > > Luckily-there's no actual reason we need to allow fb_helper to handle > hotplugging at all when runtime suspending a device. If a hotplug > happens during a runtime suspend operation, there's no reason the driver > can't just re-enable fbcon's hotplug handling and bring it up to speed > with hotplugging events it may have missed by calling > drm_fb_helper_hotplug_event(). > > So, let's make this easy and just add helpers to handle disabling and > enabling fb_helper connector probing() without having to potentially > wait on fb_helper to finish it's work. This will let us fix the runtime > suspend/resume deadlocks that we've been experiencing with nouveau, > along with being able to fix some of the incorrect runtime PM core > interaction that other DRM drivers currently perform to work around > these issues. > > Signed-off-by: Lyude Paul <lyude at redhat.com> > Cc: stable at vger.kernel.org > Cc: Lukas Wunner <lukas at wunner.de> > Cc: Karol Herbst <karolherbst at gmail.com> > --- > drivers/gpu/drm/drm_fb_helper.c | 65 +++++++++++++++++++++++++++++++++ > include/drm/drm_fb_helper.h | 20 ++++++++++ > 2 files changed, 85 insertions(+) > > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c > index 2ee1eaa66188..28d59befbc92 100644 > --- a/drivers/gpu/drm/drm_fb_helper.c > +++ b/drivers/gpu/drm/drm_fb_helper.c > @@ -2733,6 +2733,66 @@ int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel) > } > EXPORT_SYMBOL(drm_fb_helper_initial_config); > > +/** > + * drm_fb_helper_resume_hotplug - Allow fb_helper to handle probing new > + * connectors again, and bring it up to date > + * with the current device's connector status > + * fb_helper: driver-allocated fbdev helper, can be NULL > + * > + * Allow fb_helper to react to connector status changes again after having > + * been disabled through drm_fb_helper_suspend_hotplug(). This also schedules > + * a fb_helper hotplug event to bring fb_helper up to date with the current > + * status of the DRM device's connectors. > + */ > +void > +drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) > +{ > + fb_helper->hotplug_suspended = false; > + drm_fb_helper_hotplug_event(fb_helper); > +} > +EXPORT_SYMBOL(drm_fb_helper_resume_hotplug); > + > +/** > + * drm_fb_helper_suspend_hotplug - Attempt to temporarily inhibit fb_helper's > + * ability to respond to connector changes > + * without waiting > + * @fb_helper: driver-allocated fbdev helper, can be NULLBut if it is...> + * > + * Temporarily prevent fb_helper from being able to handle connector changes, > + * but only if it isn't busy doing something else. The connector probing > + * routines in fb_helper can potentially grab any modesetting lock imaginable, > + * which dramatically complicates the runtime suspend process. This helper can > + * be used to simplify the process of runtime suspend by allowing the driver > + * to disable unpredictable fb_helper operations as early as possible, without > + * requiring that we try waiting on fb_helper (as this could lead to very > + * difficult to solve deadlocks with runtime suspend code if fb_helper ends up > + * needing to acquire a runtime PM reference). > + * > + * This call should be put at the very start of a driver's runtime suspend > + * operation if desired. The driver must be responsible for re-enabling > + * fb_helper hotplug handling when normal hotplug detection becomes available > + * on the device again. This will usually happen if runtime suspend is > + * aborted, or when the device is runtime resumed. > + * > + * Returns: true if hotplug handling was disabled, false if disabling hotplug > + * handling would mean waiting on fb_helper. > + */ > +bool > +drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper) > +{ > + int ret; > + > + ret = mutex_trylock(&fb_helper->lock);Kaboom!> + if (!ret) > + return false; > + > + fb_helper->hotplug_suspended = true; > + mutex_unlock(&fb_helper->lock); > + > + return true; > +} > +EXPORT_SYMBOL(drm_fb_helper_suspend_hotplug); > + > /** > * drm_fb_helper_hotplug_event - respond to a hotplug notification by > * probing all the outputs attached to the fb > @@ -2774,6 +2834,11 @@ int drm_fb_helper_hotplug_event(struct drm_fb_helper *fb_helper) > return err; > } > > + if (fb_helper->hotplug_suspended) { > + mutex_unlock(&fb_helper->lock); > + return err; > + } > + > DRM_DEBUG_KMS("\n"); > > drm_setup_crtcs(fb_helper, fb_helper->fb->width, fb_helper->fb->height); > diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h > index b069433e7fc1..30881131075c 100644 > --- a/include/drm/drm_fb_helper.h > +++ b/include/drm/drm_fb_helper.h > @@ -232,6 +232,14 @@ struct drm_fb_helper { > * See also: @deferred_setup > */ > int preferred_bpp; > + > + /** > + * @hotplug_suspended: > + * > + * Whether or not we can currently handle hotplug events, or if we > + * need to wait for the DRM device to uninhibit us. > + */ > + bool hotplug_suspended; > }; > > /** > @@ -330,6 +338,10 @@ void drm_fb_helper_fbdev_teardown(struct drm_device *dev); > > void drm_fb_helper_lastclose(struct drm_device *dev); > void drm_fb_helper_output_poll_changed(struct drm_device *dev); > + > +void drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper); > +bool drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper); > + > #else > static inline void drm_fb_helper_prepare(struct drm_device *dev, > struct drm_fb_helper *helper, > @@ -564,6 +576,14 @@ static inline void drm_fb_helper_output_poll_changed(struct drm_device *dev) > { > } > > +static inline void > +drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) > +{ > +} > +static inline bool > +drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper) > +{ > +} > #endif > > static inline int
kbuild test robot
2018-Aug-01 08:36 UTC
[Nouveau] [PATCH v3 3/8] drm/fb_helper: Introduce hotplug_suspend/resume()
Hi Lyude, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on linus/master] [also build test WARNING on v4.18-rc7] [cannot apply to next-20180731] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Lyude-Paul/Fix-connector-probing-deadlocks-from-RPM-bugs/20180801-153816 config: x86_64-randconfig-x014-201830 (attached as .config) compiler: gcc-7 (Debian 7.3.0-16) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All warnings (new ones prefixed by >>): In file included from drivers/gpu//drm/drm_crtc_helper.c:42:0: include/drm/drm_fb_helper.h: In function 'drm_fb_helper_suspend_hotplug':>> include/drm/drm_fb_helper.h:586:1: warning: no return statement in function returning non-void [-Wreturn-type]} ^ vim +586 include/drm/drm_fb_helper.h 578 579 static inline void 580 drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) 581 { 582 } 583 static inline bool 584 drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper) 585 { > 586 } 587 #endif 588 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation -------------- next part -------------- A non-text attachment was scrubbed... Name: .config.gz Type: application/gzip Size: 26586 bytes Desc: not available URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20180801/49773c21/attachment-0001.gz>
kbuild test robot
2018-Aug-01 09:53 UTC
[Nouveau] [PATCH v3 3/8] drm/fb_helper: Introduce hotplug_suspend/resume()
Hi Lyude, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on linus/master] [also build test WARNING on v4.18-rc7] [cannot apply to next-20180731] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Lyude-Paul/Fix-connector-probing-deadlocks-from-RPM-bugs/20180801-153816 reproduce: make htmldocs All warnings (new ones prefixed by >>): WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick (https://www.imagemagick.org) mm/mempool.c:228: warning: Function parameter or member 'pool' not described in 'mempool_init' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ibss' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.connect' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.keys' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ie' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ie_len' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.bssid' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.ssid' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.default_key' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.default_mgmt_key' not described in 'wireless_dev' include/net/cfg80211.h:4279: warning: Function parameter or member 'wext.prev_bssid_valid' not described in 'wireless_dev' include/net/mac80211.h:2282: warning: Function parameter or member 'radiotap_timestamp.units_pos' not described in 'ieee80211_hw' include/net/mac80211.h:2282: warning: Function parameter or member 'radiotap_timestamp.accuracy' not described in 'ieee80211_hw' include/net/mac80211.h:955: warning: Function parameter or member 'control.rates' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.rts_cts_rate_idx' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.use_rts' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.use_cts_prot' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.short_preamble' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.skip_table' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.jiffies' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.vif' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.hw_key' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.flags' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'control.enqueue_time' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'ack' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'ack.cookie' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.rates' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.ack_signal' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.ampdu_ack_len' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.ampdu_len' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.antenna' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.tx_time' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.is_valid_ack_signal' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'status.status_driver_data' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'driver_rates' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'pad' not described in 'ieee80211_tx_info' include/net/mac80211.h:955: warning: Function parameter or member 'rate_driver_data' not described in 'ieee80211_tx_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'rx_stats_avg' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'rx_stats_avg.signal' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'rx_stats_avg.chain_signal' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.filtered' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.retry_failed' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.retry_count' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.lost_packets' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.last_tdls_pkt_time' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.msdu_retries' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.msdu_failed' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.last_ack' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.last_ack_signal' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.ack_signal_filled' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'status_stats.avg_ack_signal' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.packets' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.bytes' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.last_rate' not described in 'sta_info' net/mac80211/sta_info.h:588: warning: Function parameter or member 'tx_stats.msdu' not described in 'sta_info' kernel/sched/fair.c:3760: warning: Function parameter or member 'flags' not described in 'attach_entity_load_avg' include/linux/device.h:93: warning: bad line: this bus. include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_excl.cb' not described in 'dma_buf' include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_excl.poll' not described in 'dma_buf' include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_excl.active' not described in 'dma_buf' include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_shared.cb' not described in 'dma_buf' include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_shared.poll' not described in 'dma_buf' include/linux/dma-buf.h:307: warning: Function parameter or member 'cb_shared.active' not described in 'dma_buf' include/linux/dma-fence-array.h:54: warning: Function parameter or member 'work' not described in 'dma_fence_array' include/linux/gpio/driver.h:142: warning: Function parameter or member 'request_key' not described in 'gpio_irq_chip' include/linux/iio/hw-consumer.h:1: warning: no structured comments found include/linux/device.h:94: warning: bad line: this bus. include/linux/input/sparse-keymap.h:46: warning: Function parameter or member 'sw' not described in 'key_entry' include/linux/regulator/driver.h:227: warning: Function parameter or member 'resume_early' not described in 'regulator_ops' drivers/regulator/core.c:4465: warning: Excess function parameter 'state' description in 'regulator_suspend_late' arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw0' not described in 'irb' arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw1' not described in 'irb' arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw2' not described in 'irb' arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.esw3' not described in 'irb' arch/s390/include/asm/cio.h:245: warning: Function parameter or member 'esw.eadm' not described in 'irb' drivers/usb/dwc3/gadget.c:510: warning: Excess function parameter 'dwc' description in 'dwc3_gadget_start_config' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_pin' not described in 'drm_driver' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_unpin' not described in 'drm_driver' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_res_obj' not described in 'drm_driver' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_get_sg_table' not described in 'drm_driver' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_import_sg_table' not described in 'drm_driver' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_vmap' not described in 'drm_driver' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_vunmap' not described in 'drm_driver' include/drm/drm_drv.h:610: warning: Function parameter or member 'gem_prime_mmap' not described in 'drm_driver'>> drivers/gpu/drm/drm_fb_helper.c:2750: warning: Function parameter or member 'fb_helper' not described in 'drm_fb_helper_resume_hotplug'drivers/gpu/drm/i915/i915_vma.h:48: warning: cannot understand function prototype: 'struct i915_vma ' drivers/gpu/drm/i915/i915_vma.h:1: warning: no structured comments found include/drm/tinydrm/tinydrm.h:34: warning: Function parameter or member 'fb_dirty' not described in 'tinydrm_device' drivers/gpu/drm/tinydrm/mipi-dbi.c:272: warning: Function parameter or member 'crtc_state' not described in 'mipi_dbi_enable_flush' drivers/gpu/drm/tinydrm/mipi-dbi.c:272: warning: Function parameter or member 'plane_state' not described in 'mipi_dbi_enable_flush' include/linux/skbuff.h:852: warning: Function parameter or member 'dev_scratch' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'ip_defrag_offset' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'skb_mstamp' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member '__cloned_offset' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'head_frag' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member '__pkt_type_offset' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'encapsulation' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'encap_hdr_csum' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'csum_valid' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'csum_complete_sw' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'csum_level' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'inner_protocol_type' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'remcsum_offload' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'offload_fwd_mark' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'offload_mr_fwd_mark' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'sender_cpu' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'reserved_tailroom' not described in 'sk_buff' include/linux/skbuff.h:852: warning: Function parameter or member 'inner_ipproto' not described in 'sk_buff' include/net/sock.h:234: warning: Function parameter or member 'skc_addrpair' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_portpair' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_ipv6only' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_net_refcnt' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_v6_daddr' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_v6_rcv_saddr' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_cookie' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_listener' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_tw_dr' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_rcv_wnd' not described in 'sock_common' include/net/sock.h:234: warning: Function parameter or member 'skc_tw_rcv_nxt' not described in 'sock_common' include/net/sock.h:493: warning: Function parameter or member 'sk_backlog.rmem_alloc' not described in 'sock' include/net/sock.h:493: warning: Function parameter or member 'sk_backlog.len' not described in 'sock' include/net/sock.h:493: warning: Function parameter or member 'sk_backlog.head' not described in 'sock' include/net/sock.h:493: warning: Function parameter or member 'sk_backlog.tail' not described in 'sock' include/net/sock.h:493: warning: Function parameter or member 'sk_wq_raw' not described in 'sock' include/net/sock.h:493: warning: Function parameter or member 'tcp_rtx_queue' not described in 'sock' include/net/sock.h:493: warning: Function parameter or member 'sk_route_forced_caps' not described in 'sock' include/net/sock.h:493: warning: Function parameter or member 'sk_validate_xmit_skb' not described in 'sock' include/linux/netdevice.h:1997: warning: Function parameter or member 'adj_list.upper' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'adj_list.lower' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'gso_partial_features' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'switchdev_ops' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'l3mdev_ops' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'xfrmdev_ops' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'tlsdev_ops' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'name_assign_type' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'ieee802154_ptr' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'mpls_ptr' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'xdp_prog' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'gro_flush_timeout' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'nf_hooks_ingress' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member '____cacheline_aligned_in_smp' not described in 'net_device' include/linux/netdevice.h:1997: warning: Function parameter or member 'qdisc_hash' not described in 'net_device' include/linux/phylink.h:56: warning: Function parameter or member '__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising' not described in 'phylink_link_state' include/linux/phylink.h:56: warning: Function parameter or member '__ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising' not described in 'phylink_link_state' sound/soc/soc-core.c:2787: warning: Excess function parameter 'legacy_dai_naming' description in 'snd_soc_register_dais' include/linux/rcupdate.h:571: ERROR: Unexpected indentation. include/linux/rcupdate.h:575: ERROR: Unexpected indentation. include/linux/rcupdate.h:579: WARNING: Block quote ends without a blank line; unexpected unindent. include/linux/rcupdate.h:581: WARNING: Block quote ends without a blank line; unexpected unindent. include/linux/rcupdate.h:581: WARNING: Inline literal start-string without end-string. lib/reed_solomon/reed_solomon.c:287: ERROR: Unknown target name: "gfp". include/linux/wait.h:110: WARNING: Block quote ends without a blank line; unexpected unindent. include/linux/wait.h:113: ERROR: Unexpected indentation. include/linux/wait.h:115: WARNING: Block quote ends without a blank line; unexpected unindent. kernel/time/hrtimer.c:1129: WARNING: Block quote ends without a blank line; unexpected unindent. kernel/signal.c:327: WARNING: Inline literal start-string without end-string. drivers/video/fbdev/core/modedb.c:647: WARNING: Inline strong start-string without end-string. drivers/video/fbdev/core/modedb.c:647: WARNING: Inline strong start-string without end-string. drivers/video/fbdev/core/modedb.c:647: WARNING: Inline strong start-string without end-string. drivers/video/fbdev/core/modedb.c:647: WARNING: Inline strong start-string without end-string. drivers/ata/libata-core.c:5946: ERROR: Unknown target name: "hw". drivers/message/fusion/mptbase.c:5054: WARNING: Definition list ends without a blank line; unexpected unindent. drivers/tty/serial/serial_core.c:1892: WARNING: Definition list ends without a blank line; unexpected unindent. include/linux/mtd/rawnand.h:1446: WARNING: Inline strong start-string without end-string. include/linux/mtd/rawnand.h:1448: WARNING: Inline strong start-string without end-string. include/linux/regulator/driver.h:279: ERROR: Unknown target name: "regulator_regmap_x_voltage". Documentation/driver-api/soundwire/locking.rst:50: ERROR: Inconsistent literal block quoting. Documentation/driver-api/soundwire/locking.rst:51: WARNING: Line block ends without a blank line. Documentation/driver-api/soundwire/locking.rst:55: WARNING: Inline substitution_reference start-string without end-string. Documentation/driver-api/soundwire/locking.rst:56: WARNING: Line block ends without a blank line. Documentation/driver-api/soundwire/stream.rst:177: WARNING: Explicit markup ends without a blank line; unexpected unindent. Documentation/driver-api/soundwire/stream.rst:203: WARNING: Explicit markup ends without a blank line; unexpected unindent. Documentation/driver-api/soundwire/stream.rst:248: WARNING: Explicit markup ends without a blank line; unexpected unindent. Documentation/driver-api/soundwire/stream.rst:277: WARNING: Explicit markup ends without a blank line; unexpected unindent. Documentation/driver-api/soundwire/stream.rst:304: WARNING: Explicit markup ends without a blank line; unexpected unindent. Documentation/driver-api/soundwire/stream.rst:328: WARNING: Explicit markup ends without a blank line; unexpected unindent. Documentation/driver-api/soundwire/stream.rst:352: WARNING: Explicit markup ends without a blank line; unexpected unindent. Documentation/driver-api/soundwire/stream.rst:364: WARNING: Explicit markup ends without a blank line; unexpected unindent. include/linux/spi/spi.h:373: ERROR: Unexpected indentation. block/bio.c:958: WARNING: Inline emphasis start-string without end-string. Documentation/gpu/drivers.rst:5: WARNING: toctree contains reference to nonexisting document u'gpu/v3d' Documentation/misc-devices/ibmvmc.rst:2: WARNING: Explicit markup ends without a blank line; unexpected unindent. net/core/dev.c:4650: ERROR: Unknown target name: "page_is". Documentation/networking/net_failover.rst:48: WARNING: Definition list ends without a blank line; unexpected unindent. Documentation/networking/net_failover.rst:50: ERROR: Unexpected indentation. vim +2750 drivers/gpu/drm/drm_fb_helper.c 2735 2736 /** 2737 * drm_fb_helper_resume_hotplug - Allow fb_helper to handle probing new 2738 * connectors again, and bring it up to date 2739 * with the current device's connector status 2740 * fb_helper: driver-allocated fbdev helper, can be NULL 2741 * 2742 * Allow fb_helper to react to connector status changes again after having 2743 * been disabled through drm_fb_helper_suspend_hotplug(). This also schedules 2744 * a fb_helper hotplug event to bring fb_helper up to date with the current 2745 * status of the DRM device's connectors. 2746 */ 2747 void 2748 drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) 2749 {> 2750 fb_helper->hotplug_suspended = false;2751 drm_fb_helper_hotplug_event(fb_helper); 2752 } 2753 EXPORT_SYMBOL(drm_fb_helper_resume_hotplug); 2754 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation -------------- next part -------------- A non-text attachment was scrubbed... Name: .config.gz Type: application/gzip Size: 6443 bytes Desc: not available URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20180801/6165d3e1/attachment-0001.gz>
Daniel Vetter
2018-Aug-06 08:43 UTC
[Nouveau] [PATCH v3 3/8] drm/fb_helper: Introduce hotplug_suspend/resume()
On Mon, Jul 30, 2018 at 08:39:48PM -0400, Lyude Paul wrote:> I'm sure I don't need to tell you that fb_helper's locking is a mess. > That being said; fb_helper's locking mess can seriously complicate the > runtime suspend/resume operations of drivers because it can invoke > atomic commits and connector probing from anywhere that calls > drm_fb_helper_hotplug_event(). Since most drivers use > drm_fb_helper_output_poll_changed() as their output_poll_changed > handler, this can happen in every single context that can fire off a > hotplug event. An example:Uh, I think this ever more looks like a serious rabbit hole between nouveau and rpm. It looks like nouveau tries to use rpm as the outermost lock, surrounding every other drm lock there is. That doesn't really work, and means you'll keep fighting drm core locking forever. All other rpm supporting drivers (i915, and the pile of armsoc ones) push their rpm_get/put dance way down into the low-level callbacks. That then allows you to fix all the trouble with probing, i2x, dp_aux and stuff abitrarily nesting (since rpm itself is refcounted). But I'm not really sure why nouveau is different here. -Daniel> > [ 246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds. > [ 246.673398] Not tainted 4.18.0-rc5Lyude-Test+ #2 > [ 246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 246.676527] kworker/4:0 D 0 37 2 0x80000000 > [ 246.677580] Workqueue: events output_poll_execute [drm_kms_helper] > [ 246.678704] Call Trace: > [ 246.679753] __schedule+0x322/0xaf0 > [ 246.680916] schedule+0x33/0x90 > [ 246.681924] schedule_preempt_disabled+0x15/0x20 > [ 246.683023] __mutex_lock+0x569/0x9a0 > [ 246.684035] ? kobject_uevent_env+0x117/0x7b0 > [ 246.685132] ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] > [ 246.686179] mutex_lock_nested+0x1b/0x20 > [ 246.687278] ? mutex_lock_nested+0x1b/0x20 > [ 246.688307] drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] > [ 246.689420] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper] > [ 246.690462] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper] > [ 246.691570] output_poll_execute+0x198/0x1c0 [drm_kms_helper] > [ 246.692611] process_one_work+0x231/0x620 > [ 246.693725] worker_thread+0x214/0x3a0 > [ 246.694756] kthread+0x12b/0x150 > [ 246.695856] ? wq_pool_ids_show+0x140/0x140 > [ 246.696888] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 246.697998] ret_from_fork+0x3a/0x50 > [ 246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds. > [ 246.700153] Not tainted 4.18.0-rc5Lyude-Test+ #2 > [ 246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 246.702278] kworker/0:1 D 0 60 2 0x80000000 > [ 246.703293] Workqueue: pm pm_runtime_work > [ 246.704393] Call Trace: > [ 246.705403] __schedule+0x322/0xaf0 > [ 246.706439] ? wait_for_completion+0x104/0x190 > [ 246.707393] schedule+0x33/0x90 > [ 246.708375] schedule_timeout+0x3a5/0x590 > [ 246.709289] ? mark_held_locks+0x58/0x80 > [ 246.710208] ? _raw_spin_unlock_irq+0x2c/0x40 > [ 246.711222] ? wait_for_completion+0x104/0x190 > [ 246.712134] ? trace_hardirqs_on_caller+0xf4/0x190 > [ 246.713094] ? wait_for_completion+0x104/0x190 > [ 246.713964] wait_for_completion+0x12c/0x190 > [ 246.714895] ? wake_up_q+0x80/0x80 > [ 246.715727] ? get_work_pool+0x90/0x90 > [ 246.716649] flush_work+0x1c9/0x280 > [ 246.717483] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0 > [ 246.718442] __cancel_work_timer+0x146/0x1d0 > [ 246.719247] cancel_delayed_work_sync+0x13/0x20 > [ 246.720043] drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper] > [ 246.721123] nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau] > [ 246.721897] pci_pm_runtime_suspend+0x6b/0x190 > [ 246.722825] ? pci_has_legacy_pm_support+0x70/0x70 > [ 246.723737] __rpm_callback+0x7a/0x1d0 > [ 246.724721] ? pci_has_legacy_pm_support+0x70/0x70 > [ 246.725607] rpm_callback+0x24/0x80 > [ 246.726553] ? pci_has_legacy_pm_support+0x70/0x70 > [ 246.727376] rpm_suspend+0x142/0x6b0 > [ 246.728185] pm_runtime_work+0x97/0xc0 > [ 246.728938] process_one_work+0x231/0x620 > [ 246.729796] worker_thread+0x44/0x3a0 > [ 246.730614] kthread+0x12b/0x150 > [ 246.731395] ? wq_pool_ids_show+0x140/0x140 > [ 246.732202] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 246.732878] ret_from_fork+0x3a/0x50 > [ 246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds. > [ 246.734587] Not tainted 4.18.0-rc5Lyude-Test+ #2 > [ 246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 246.736113] kworker/4:2 D 0 422 2 0x80000080 > [ 246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper] > [ 246.737665] Call Trace: > [ 246.738490] __schedule+0x322/0xaf0 > [ 246.739250] schedule+0x33/0x90 > [ 246.739908] rpm_resume+0x19c/0x850 > [ 246.740750] ? finish_wait+0x90/0x90 > [ 246.741541] __pm_runtime_resume+0x4e/0x90 > [ 246.742370] nv50_disp_atomic_commit+0x31/0x210 [nouveau] > [ 246.743124] drm_atomic_commit+0x4a/0x50 [drm] > [ 246.743775] restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper] > [ 246.744603] restore_fbdev_mode+0x31/0x140 [drm_kms_helper] > [ 246.745373] drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper] > [ 246.746220] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper] > [ 246.746884] drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper] > [ 246.747675] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper] > [ 246.748544] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper] > [ 246.749439] nv50_mstm_hotplug+0x15/0x20 [nouveau] > [ 246.750111] drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper] > [ 246.750764] drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper] > [ 246.751602] drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper] > [ 246.752314] process_one_work+0x231/0x620 > [ 246.752979] worker_thread+0x44/0x3a0 > [ 246.753838] kthread+0x12b/0x150 > [ 246.754619] ? wq_pool_ids_show+0x140/0x140 > [ 246.755386] ? kthread_create_worker_on_cpu+0x70/0x70 > [ 246.756162] ret_from_fork+0x3a/0x50 > [ 246.756847] > Showing all locks held in the system: > [ 246.758261] 3 locks held by kworker/4:0/37: > [ 246.759016] #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.759856] #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.760670] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper] > [ 246.761516] 2 locks held by kworker/0:1/60: > [ 246.762274] #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.762982] #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.763890] 1 lock held by khungtaskd/64: > [ 246.764664] #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185 > [ 246.765588] 5 locks held by kworker/4:2/422: > [ 246.766440] #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.767390] #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620 > [ 246.768154] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper] > [ 246.768966] #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper] > [ 246.769921] #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm] > [ 246.770839] 1 lock held by dmesg/1038: > [ 246.771739] 2 locks held by zsh/1172: > [ 246.772650] #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 > [ 246.773680] #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870 > > [ 246.775522] ============================================> > Because of this, there's an unreasonable number of places that drm > drivers would need to insert special handling to prevent trying to > resume the device from all of these contexts that can deadlock. It's > difficult even to try synchronizing with fb_helper in these contexts as > well, since any of them could introduce a deadlock by waiting to acquire > the top-level fb_helper mutex, while it's being held by another thread > that might potentially call down to pm_runtime_get_sync(). > > Luckily-there's no actual reason we need to allow fb_helper to handle > hotplugging at all when runtime suspending a device. If a hotplug > happens during a runtime suspend operation, there's no reason the driver > can't just re-enable fbcon's hotplug handling and bring it up to speed > with hotplugging events it may have missed by calling > drm_fb_helper_hotplug_event(). > > So, let's make this easy and just add helpers to handle disabling and > enabling fb_helper connector probing() without having to potentially > wait on fb_helper to finish it's work. This will let us fix the runtime > suspend/resume deadlocks that we've been experiencing with nouveau, > along with being able to fix some of the incorrect runtime PM core > interaction that other DRM drivers currently perform to work around > these issues. > > Signed-off-by: Lyude Paul <lyude at redhat.com> > Cc: stable at vger.kernel.org > Cc: Lukas Wunner <lukas at wunner.de> > Cc: Karol Herbst <karolherbst at gmail.com> > --- > drivers/gpu/drm/drm_fb_helper.c | 65 +++++++++++++++++++++++++++++++++ > include/drm/drm_fb_helper.h | 20 ++++++++++ > 2 files changed, 85 insertions(+) > > diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c > index 2ee1eaa66188..28d59befbc92 100644 > --- a/drivers/gpu/drm/drm_fb_helper.c > +++ b/drivers/gpu/drm/drm_fb_helper.c > @@ -2733,6 +2733,66 @@ int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel) > } > EXPORT_SYMBOL(drm_fb_helper_initial_config); > > +/** > + * drm_fb_helper_resume_hotplug - Allow fb_helper to handle probing new > + * connectors again, and bring it up to date > + * with the current device's connector status > + * fb_helper: driver-allocated fbdev helper, can be NULL > + * > + * Allow fb_helper to react to connector status changes again after having > + * been disabled through drm_fb_helper_suspend_hotplug(). This also schedules > + * a fb_helper hotplug event to bring fb_helper up to date with the current > + * status of the DRM device's connectors. > + */ > +void > +drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) > +{ > + fb_helper->hotplug_suspended = false; > + drm_fb_helper_hotplug_event(fb_helper); > +} > +EXPORT_SYMBOL(drm_fb_helper_resume_hotplug); > + > +/** > + * drm_fb_helper_suspend_hotplug - Attempt to temporarily inhibit fb_helper's > + * ability to respond to connector changes > + * without waiting > + * @fb_helper: driver-allocated fbdev helper, can be NULL > + * > + * Temporarily prevent fb_helper from being able to handle connector changes, > + * but only if it isn't busy doing something else. The connector probing > + * routines in fb_helper can potentially grab any modesetting lock imaginable, > + * which dramatically complicates the runtime suspend process. This helper can > + * be used to simplify the process of runtime suspend by allowing the driver > + * to disable unpredictable fb_helper operations as early as possible, without > + * requiring that we try waiting on fb_helper (as this could lead to very > + * difficult to solve deadlocks with runtime suspend code if fb_helper ends up > + * needing to acquire a runtime PM reference). > + * > + * This call should be put at the very start of a driver's runtime suspend > + * operation if desired. The driver must be responsible for re-enabling > + * fb_helper hotplug handling when normal hotplug detection becomes available > + * on the device again. This will usually happen if runtime suspend is > + * aborted, or when the device is runtime resumed. > + * > + * Returns: true if hotplug handling was disabled, false if disabling hotplug > + * handling would mean waiting on fb_helper. > + */ > +bool > +drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper) > +{ > + int ret; > + > + ret = mutex_trylock(&fb_helper->lock); > + if (!ret) > + return false; > + > + fb_helper->hotplug_suspended = true; > + mutex_unlock(&fb_helper->lock); > + > + return true; > +} > +EXPORT_SYMBOL(drm_fb_helper_suspend_hotplug); > + > /** > * drm_fb_helper_hotplug_event - respond to a hotplug notification by > * probing all the outputs attached to the fb > @@ -2774,6 +2834,11 @@ int drm_fb_helper_hotplug_event(struct drm_fb_helper *fb_helper) > return err; > } > > + if (fb_helper->hotplug_suspended) { > + mutex_unlock(&fb_helper->lock); > + return err; > + } > + > DRM_DEBUG_KMS("\n"); > > drm_setup_crtcs(fb_helper, fb_helper->fb->width, fb_helper->fb->height); > diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h > index b069433e7fc1..30881131075c 100644 > --- a/include/drm/drm_fb_helper.h > +++ b/include/drm/drm_fb_helper.h > @@ -232,6 +232,14 @@ struct drm_fb_helper { > * See also: @deferred_setup > */ > int preferred_bpp; > + > + /** > + * @hotplug_suspended: > + * > + * Whether or not we can currently handle hotplug events, or if we > + * need to wait for the DRM device to uninhibit us. > + */ > + bool hotplug_suspended; > }; > > /** > @@ -330,6 +338,10 @@ void drm_fb_helper_fbdev_teardown(struct drm_device *dev); > > void drm_fb_helper_lastclose(struct drm_device *dev); > void drm_fb_helper_output_poll_changed(struct drm_device *dev); > + > +void drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper); > +bool drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper); > + > #else > static inline void drm_fb_helper_prepare(struct drm_device *dev, > struct drm_fb_helper *helper, > @@ -564,6 +576,14 @@ static inline void drm_fb_helper_output_poll_changed(struct drm_device *dev) > { > } > > +static inline void > +drm_fb_helper_resume_hotplug(struct drm_fb_helper *fb_helper) > +{ > +} > +static inline bool > +drm_fb_helper_suspend_hotplug(struct drm_fb_helper *fb_helper) > +{ > +} > #endif > > static inline int > -- > 2.17.1 > > _______________________________________________ > dri-devel mailing list > dri-devel at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Possibly Parallel Threads
- [PATCH v7 3/5] drm/nouveau: Fix deadlock with fb_helper with async RPM requests
- [PATCH v3 0/8] Fix connector probing deadlocks from RPM bugs
- [PATCH v3 3/8] drm/fb_helper: Introduce hotplug_suspend/resume()
- [PATCH v3 3/8] drm/fb_helper: Introduce hotplug_suspend/resume()
- [PATCH v4 0/8] Fix connector probing deadlocks from RPM bugs