Lyude Paul
2019-Jul-18 01:42 UTC
[Nouveau] [PATCH 00/26] DP MST Refactors + debugging tools + suspend/resume reprobing
This is the large series for adding MST suspend/resume reprobing that I've been working on for quite a while now. In addition, I: - Refactored and cleaned up any code I ended up digging through in the process of understanding how some parts of these helpers worked. - Added some debugging tools along the way that I ended up needing to figure out some issues in my own code Note that there's still one important part of this process missing that's not included in this patch series: EDID reprobing, which I believe Stanislav Lisovskiy from Intel is currently working on. The main purpose of this series is to fix the issue of the in-memory topology state (e.g. connectors connected to an MST hub, branch devices, etc.) going out of sync if a topology connected to a connector is swapped out with a different topology while the system is resumed, or while the device housing said connector is in runtime suspend. As well, the debugging tools that are added in this include: - A limited debugging utility for dumping the list of topology references on an MST port or branch connector whose topology reference count has reached 0 - Sideband down request dumping, along with some basic selftests for testing our encoding/decoding functions Cc: Juston Li <juston.li at intel.com> Cc: Imre Deak <imre.deak at intel.com> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com> Cc: Harry Wentland <hwentlan at amd.com> Lyude Paul (26): drm/dp_mst: Move link address dumping into a function drm/dp_mst: Destroy mstbs from destroy_connector_work drm/dp_mst: Move test_calc_pbn_mode() into an actual selftest drm/print: Add drm_err_printer() drm/dp_mst: Add sideband down request tracing + selftests drm/dp_mst: Move PDT teardown for ports into destroy_connector_work drm/dp_mst: Get rid of list clear in drm_dp_finish_destroy_port() drm/dp_mst: Refactor drm_dp_send_enum_path_resources drm/dp_mst: Remove huge conditional in drm_dp_mst_handle_up_req() drm/dp_mst: Constify guid in drm_dp_get_mst_branch_by_guid() drm/dp_mst: Refactor drm_dp_mst_handle_up_req() drm/dp_mst: Refactor drm_dp_mst_handle_down_rep() drm/dp_mst: Destroy topology_mgr mutexes drm/dp_mst: Cleanup drm_dp_send_link_address() a bit drm/dp_mst: Refactor pdt setup/teardown, add more locking drm/dp_mst: Rename drm_dp_add_port and drm_dp_update_port drm/dp_mst: Remove lies in {up,down}_rep_recv documentation drm/dp_mst: Handle UP requests asynchronously drm/dp_mst: Protect drm_dp_mst_port members with connection_mutex drm/dp_mst: Don't forget to update port->input in drm_dp_mst_handle_conn_stat() drm/nouveau: Don't grab runtime PM refs for HPD IRQs drm/amdgpu: Iterate through DRM connectors correctly drm/amdgpu/dm: Resume short HPD IRQs before resuming MST topology drm/dp_mst: Add basic topology reprobing when resuming drm/dp_mst: Also print unhashed pointers for malloc/topology references drm/dp_mst: Add topology ref history tracking for debugging drivers/gpu/drm/Kconfig | 14 + .../gpu/drm/amd/amdgpu/amdgpu_connectors.c | 13 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_encoders.c | 40 +- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 5 +- drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 34 +- drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 34 +- drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 40 +- drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 34 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 41 +- .../drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c | 10 +- drivers/gpu/drm/drm_dp_mst_topology.c | 1592 +++++++++++++---- .../gpu/drm/drm_dp_mst_topology_internal.h | 24 + drivers/gpu/drm/drm_print.c | 6 + drivers/gpu/drm/i915/display/intel_dp.c | 3 +- drivers/gpu/drm/nouveau/dispnv50/disp.c | 6 +- drivers/gpu/drm/nouveau/nouveau_connector.c | 33 +- drivers/gpu/drm/selftests/Makefile | 2 +- .../gpu/drm/selftests/drm_modeset_selftests.h | 2 + .../drm/selftests/test-drm_dp_mst_helper.c | 248 +++ .../drm/selftests/test-drm_modeset_common.h | 2 + include/drm/drm_dp_mst_helper.h | 127 +- include/drm/drm_print.h | 17 + 24 files changed, 1846 insertions(+), 506 deletions(-) create mode 100644 drivers/gpu/drm/drm_dp_mst_topology_internal.h create mode 100644 drivers/gpu/drm/selftests/test-drm_dp_mst_helper.c -- 2.21.0
Lyude Paul
2019-Jul-18 01:42 UTC
[Nouveau] [PATCH 21/26] drm/nouveau: Don't grab runtime PM refs for HPD IRQs
In order for suspend/resume reprobing to work, we need to be able to perform sideband communications during suspend/resume, along with runtime PM suspend/resume. In order to do so, we also need to make sure that nouveau doesn't bother grabbing a runtime PM reference to do so, since otherwise we'll start deadlocking runtime PM again. Note that we weren't able to do this before, because of the DP MST helpers processing UP requests from topologies in the same context as drm_dp_mst_hpd_irq() which would have caused us to open ourselves up to receiving hotplug events and deadlocking with runtime suspend/resume. Now that those requests are handled asynchronously, this change should be completely safe. Cc: Juston Li <juston.li at intel.com> Cc: Imre Deak <imre.deak at intel.com> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com> Cc: Harry Wentland <hwentlan at amd.com> Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/nouveau_connector.c | 33 +++++++++++---------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 4116ee62adaf..e9e78696a728 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -1129,6 +1129,16 @@ nouveau_connector_hotplug(struct nvif_notify *notify) const char *name = connector->name; struct nouveau_encoder *nv_encoder; int ret; + bool plugged = (rep->mask != NVIF_NOTIFY_CONN_V0_UNPLUG); + + if (rep->mask & NVIF_NOTIFY_CONN_V0_IRQ) { + NV_DEBUG(drm, "service %s\n", name); + drm_dp_cec_irq(&nv_connector->aux); + if ((nv_encoder = find_encoder(connector, DCB_OUTPUT_DP))) + nv50_mstm_service(nv_encoder->dp.mstm); + + return NVIF_NOTIFY_KEEP; + } ret = pm_runtime_get(drm->dev->dev); if (ret == 0) { @@ -1149,25 +1159,16 @@ nouveau_connector_hotplug(struct nvif_notify *notify) return NVIF_NOTIFY_DROP; } - if (rep->mask & NVIF_NOTIFY_CONN_V0_IRQ) { - NV_DEBUG(drm, "service %s\n", name); - drm_dp_cec_irq(&nv_connector->aux); - if ((nv_encoder = find_encoder(connector, DCB_OUTPUT_DP))) - nv50_mstm_service(nv_encoder->dp.mstm); - } else { - bool plugged = (rep->mask != NVIF_NOTIFY_CONN_V0_UNPLUG); - + if (!plugged) + drm_dp_cec_unset_edid(&nv_connector->aux); + NV_DEBUG(drm, "%splugged %s\n", plugged ? "" : "un", name); + if ((nv_encoder = find_encoder(connector, DCB_OUTPUT_DP))) { if (!plugged) - drm_dp_cec_unset_edid(&nv_connector->aux); - NV_DEBUG(drm, "%splugged %s\n", plugged ? "" : "un", name); - if ((nv_encoder = find_encoder(connector, DCB_OUTPUT_DP))) { - if (!plugged) - nv50_mstm_remove(nv_encoder->dp.mstm); - } - - drm_helper_hpd_irq_event(connector->dev); + nv50_mstm_remove(nv_encoder->dp.mstm); } + drm_helper_hpd_irq_event(connector->dev); + pm_runtime_mark_last_busy(drm->dev->dev); pm_runtime_put_autosuspend(drm->dev->dev); return NVIF_NOTIFY_KEEP; -- 2.21.0
Lyude Paul
2019-Jul-18 01:42 UTC
[Nouveau] [PATCH 24/26] drm/dp_mst: Add basic topology reprobing when resuming
Finally! For a very long time, our MST helpers have had one very annoying issue: They don't know how to reprobe the topology state when coming out of suspend. This means that if a user has a machine connected to an MST topology and decides to suspend their machine, we lose all topology changes that happened during that period. That can be a big problem if the machine was connected to a different topology on the same port before resuming, as we won't bother reprobing any of the ports and likely cause the user's monitors not to come back up as expected. So, we start fixing this by teaching our MST helpers how to reprobe the link addresses of each connected topology when resuming. As it turns out, the behavior that we want here is identical to the behavior we want when initially probing a newly connected MST topology, with a couple of important differences: - We need to be more careful about handling the potential races between events from the MST hub that could change the topology state as we're performing the link address reprobe - We need to be more careful about handling unlikely state changes on ports - such as an input port turning into an output port, something that would be far more likely to happen in situations like the MST hub we're connected to being changed while we're suspend Both of which have been solved by previous commits. That leaves one requirement: - We need to prune any MST ports in our in-memory topology state that were present when suspending, but have not appeared in the post-resume link address response from their parent branch device Which we can now handle in this commit by modifying drm_dp_send_link_address(). We then introduce suspend/resume reprobing by introducing drm_dp_mst_topology_mgr_invalidate_mstb(), which we call in drm_dp_mst_topology_mgr_suspend() to traverse the in-memory topology state to indicate that each mstb needs it's link address resent and PBN resources reprobed. On resume, we start back up &mgr->work and have it reprobe the topology in the same way we would on a hotplug, removing any leftover ports that no longer appear in the topology state. Cc: Juston Li <juston.li at intel.com> Cc: Imre Deak <imre.deak at intel.com> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com> Cc: Harry Wentland <hwentlan at amd.com> Signed-off-by: Lyude Paul <lyude at redhat.com> --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- drivers/gpu/drm/drm_dp_mst_topology.c | 138 +++++++++++++----- drivers/gpu/drm/i915/display/intel_dp.c | 3 +- drivers/gpu/drm/nouveau/dispnv50/disp.c | 6 +- include/drm/drm_dp_mst_helper.h | 3 +- 5 files changed, 112 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index e33e080cf16d..2ebd4811ff40 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -829,7 +829,7 @@ static void s3_handle_mst(struct drm_device *dev, bool suspend) if (suspend) { drm_dp_mst_topology_mgr_suspend(mgr); } else { - ret = drm_dp_mst_topology_mgr_resume(mgr); + ret = drm_dp_mst_topology_mgr_resume(mgr, true); if (ret < 0) { drm_dp_mst_topology_mgr_set_mst(mgr, false); need_hotplug = true; diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 126db36c9337..320158970e25 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -1931,6 +1931,14 @@ drm_dp_mst_handle_link_address_port(struct drm_dp_mst_branch *mstb, goto fail_unlock; } + /* + * If this port wasn't just created, then we're reprobing because + * we're coming out of suspend. In this case, always resend the link + * address if there's an MSTB on this port + */ + if (!created && port->pdt == DP_PEER_DEVICE_MST_BRANCHING) + send_link_addr = true; + if (send_link_addr) { mutex_lock(&mgr->lock); if (port->mstb) { @@ -2443,7 +2451,8 @@ static void drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr, { struct drm_dp_sideband_msg_tx *txmsg; struct drm_dp_link_address_ack_reply *reply; - int i, len, ret; + struct drm_dp_mst_port *port, *tmp; + int i, len, ret, port_mask = 0; txmsg = kzalloc(sizeof(*txmsg), GFP_KERNEL); if (!txmsg) @@ -2473,9 +2482,28 @@ static void drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr, drm_dp_check_mstb_guid(mstb, reply->guid); - for (i = 0; i < reply->nports; i++) + for (i = 0; i < reply->nports; i++) { + port_mask |= BIT(reply->ports[i].port_number); drm_dp_mst_handle_link_address_port(mstb, mgr->dev, &reply->ports[i]); + } + + /* Prune any ports that are currently a part of mstb in our in-memory + * topology, but were not seen in this link address. Usually this + * means that they were removed while the topology was out of sync, + * e.g. during suspend/resume + */ + mutex_lock(&mgr->lock); + list_for_each_entry_safe(port, tmp, &mstb->ports, next) { + if (port_mask & BIT(port->port_num)) + continue; + + DRM_DEBUG_KMS("port %d was not in link address, removing\n", + port->port_num); + list_del(&port->next); + drm_dp_mst_topology_put_port(port); + } + mutex_unlock(&mgr->lock); drm_kms_helper_hotplug_event(mgr->dev); @@ -3072,6 +3100,23 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms } EXPORT_SYMBOL(drm_dp_mst_topology_mgr_set_mst); +static void +drm_dp_mst_topology_mgr_invalidate_mstb(struct drm_dp_mst_branch *mstb) +{ + struct drm_dp_mst_port *port; + + /* The link address will need to be re-sent on resume */ + mstb->link_address_sent = false; + + list_for_each_entry(port, &mstb->ports, next) { + /* The PBN for each port will also need to be re-probed */ + port->available_pbn = 0; + + if (port->mstb) + drm_dp_mst_topology_mgr_invalidate_mstb(port->mstb); + } +} + /** * drm_dp_mst_topology_mgr_suspend() - suspend the MST manager * @mgr: manager to suspend @@ -3088,60 +3133,85 @@ void drm_dp_mst_topology_mgr_suspend(struct drm_dp_mst_topology_mgr *mgr) flush_work(&mgr->up_req_work); flush_work(&mgr->work); flush_work(&mgr->destroy_connector_work); + + mutex_lock(&mgr->lock); + if (mgr->mst_state && mgr->mst_primary) + drm_dp_mst_topology_mgr_invalidate_mstb(mgr->mst_primary); + mutex_unlock(&mgr->lock); } EXPORT_SYMBOL(drm_dp_mst_topology_mgr_suspend); /** * drm_dp_mst_topology_mgr_resume() - resume the MST manager * @mgr: manager to resume + * @sync: whether or not to perform topology reprobing synchronously * * This will fetch DPCD and see if the device is still there, * if it is, it will rewrite the MSTM control bits, and return. * - * if the device fails this returns -1, and the driver should do + * If the device fails this returns -1, and the driver should do * a full MST reprobe, in case we were undocked. + * + * During system resume (where it is assumed that the driver will be calling + * drm_atomic_helper_resume()) this function should be called beforehand with + * @sync set to true. In contexts like runtime resume where the driver is not + * expected to be calling drm_atomic_helper_resume(), this function should be + * called with @sync set to false in order to avoid deadlocking. + * + * Returns: -1 if the MST topology was removed while we were suspended, 0 + * otherwise. */ -int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr) +int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr, + bool sync) { - int ret = 0; + int ret; + u8 guid[16]; mutex_lock(&mgr->lock); + if (!mgr->mst_primary) + goto out_fail; - if (mgr->mst_primary) { - int sret; - u8 guid[16]; + ret = drm_dp_dpcd_read(mgr->aux, DP_DPCD_REV, mgr->dpcd, + DP_RECEIVER_CAP_SIZE); + if (ret != DP_RECEIVER_CAP_SIZE) { + DRM_DEBUG_KMS("dpcd read failed - undocked during suspend?\n"); + goto out_fail; + } - sret = drm_dp_dpcd_read(mgr->aux, DP_DPCD_REV, mgr->dpcd, DP_RECEIVER_CAP_SIZE); - if (sret != DP_RECEIVER_CAP_SIZE) { - DRM_DEBUG_KMS("dpcd read failed - undocked during suspend?\n"); - ret = -1; - goto out_unlock; - } + ret = drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, + DP_MST_EN | + DP_UP_REQ_EN | + DP_UPSTREAM_IS_SRC); + if (ret < 0) { + DRM_DEBUG_KMS("mst write failed - undocked during suspend?\n"); + goto out_fail; + } - ret = drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, - DP_MST_EN | DP_UP_REQ_EN | DP_UPSTREAM_IS_SRC); - if (ret < 0) { - DRM_DEBUG_KMS("mst write failed - undocked during suspend?\n"); - ret = -1; - goto out_unlock; - } + /* Some hubs forget their guids after they resume */ + ret = drm_dp_dpcd_read(mgr->aux, DP_GUID, guid, 16); + if (ret != 16) { + DRM_DEBUG_KMS("dpcd read failed - undocked during suspend?\n"); + goto out_fail; + } + drm_dp_check_mstb_guid(mgr->mst_primary, guid); - /* Some hubs forget their guids after they resume */ - sret = drm_dp_dpcd_read(mgr->aux, DP_GUID, guid, 16); - if (sret != 16) { - DRM_DEBUG_KMS("dpcd read failed - undocked during suspend?\n"); - ret = -1; - goto out_unlock; - } - drm_dp_check_mstb_guid(mgr->mst_primary, guid); + /* For the final step of resuming the topology, we need to bring the + * state of our in-memory topology back into sync with reality. So, + * restart the probing process as if we're probing a new hub + */ + queue_work(system_long_wq, &mgr->work); + mutex_unlock(&mgr->lock); - ret = 0; - } else - ret = -1; + if (sync) { + DRM_DEBUG_KMS("Waiting for link probe work to finish re-syncing topology...\n"); + flush_work(&mgr->work); + } -out_unlock: + return 0; + +out_fail: mutex_unlock(&mgr->lock); - return ret; + return -1; } EXPORT_SYMBOL(drm_dp_mst_topology_mgr_resume); diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 0eb5d66f87a7..7190ff5c3649 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -7366,7 +7366,8 @@ void intel_dp_mst_resume(struct drm_i915_private *dev_priv) if (!intel_dp->can_mst) continue; - ret = drm_dp_mst_topology_mgr_resume(&intel_dp->mst_mgr); + ret = drm_dp_mst_topology_mgr_resume(&intel_dp->mst_mgr, + true); if (ret) { intel_dp->is_mst = false; drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr, diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c index 7ba373f493b2..c44b54eeddce 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -1300,14 +1300,14 @@ nv50_mstm_fini(struct nv50_mstm *mstm) } static void -nv50_mstm_init(struct nv50_mstm *mstm) +nv50_mstm_init(struct nv50_mstm *mstm, bool runtime) { int ret; if (!mstm || !mstm->mgr.mst_state) return; - ret = drm_dp_mst_topology_mgr_resume(&mstm->mgr); + ret = drm_dp_mst_topology_mgr_resume(&mstm->mgr, !runtime); if (ret == -1) { drm_dp_mst_topology_mgr_set_mst(&mstm->mgr, false); drm_kms_helper_hotplug_event(mstm->mgr.dev); @@ -2257,7 +2257,7 @@ nv50_display_init(struct drm_device *dev, bool resume, bool runtime) if (encoder->encoder_type != DRM_MODE_ENCODER_DPMST) { struct nouveau_encoder *nv_encoder nouveau_encoder(encoder); - nv50_mstm_init(nv_encoder->dp.mstm); + nv50_mstm_init(nv_encoder->dp.mstm, runtime); } } diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h index aed68d7e6492..eece28525d52 100644 --- a/include/drm/drm_dp_mst_helper.h +++ b/include/drm/drm_dp_mst_helper.h @@ -683,7 +683,8 @@ void drm_dp_mst_dump_topology(struct seq_file *m, void drm_dp_mst_topology_mgr_suspend(struct drm_dp_mst_topology_mgr *mgr); int __must_check -drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr); +drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr, + bool sync); struct drm_dp_mst_topology_state *drm_atomic_get_mst_topology_state(struct drm_atomic_state *state, struct drm_dp_mst_topology_mgr *mgr); int __must_check -- 2.21.0
Possibly Parallel Threads
- [PATCH v2 22/27] drm/nouveau: Don't grab runtime PM refs for HPD IRQs
- [PATCH v5 09/14] drm/nouveau: Don't grab runtime PM refs for HPD IRQs
- [PATCH AUTOSEL 5.4 146/350] drm/nouveau: Don't grab runtime PM refs for HPD IRQs
- [PATCH (repost) 4/5] drm/nouveau: add DisplayPort CEC-Tunneling-over-AUX support
- [PATCH 4/5] drm/nouveau: add DisplayPort CEC-Tunneling-over-AUX support