Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 00/11] drm/nouveau: Introduce CRC support for gf119+
Nvidia released some documentation on how CRC support works on their
GPUs, hooray!

So: this patch series implements said CRC support in nouveau, along
with adding some special debugfs interfaces for some relevant
igt-gpu-tools tests (already on the ML).

First - we add some new functionality to kthread_work in the kernel,
and then use this to add a new feature to DRM that Ville Syrjälä came
up with: vblank workers. Basically, this is just a generic DRM
interface that allows for scheduling high-priority workers that start
on a given vblank interrupt. Note that while we're currently only using
this in nouveau, Intel has plans to use this for i915 as well (hence
why they came up with it!).

And finally: in order to implement the last feature, we expose some new
functions in the kernel's kthread_worker infrastructure so that we can
de-complicate our implementation of this.

Anyway - welcome to the future! :)

Major changes since v6:
* Move vblank_work related functions into their own files
* Write documentation
* Simplify work flushing and cancellation by getting rid of seqcounts
  and ->pending

Major changes since v4:
* Remove the interfaces we tried adding to kthread_worker and use a
  wait queue + seqcount in order to implement flushing vblank workers.
* Rebase

Major changes since v3:
* Style fixes on nouveau patches from checkpatch, no functional changes
* Don't integrate so tightly with kthread_work (and use our own lock),
  instead introduce some new functions for doing simple async flushing
  and cancelling. I think this interface looks a lot more acceptable
  than what I was previously trying.
* Apply some changes requested by danvet

Major changes since v2:
* Use kthread_worker instead of kthreadd for vblank workers
* Don't check debugfs return values

Lyude Paul (11):
  drm/vblank: Register drmm cleanup action once per drm_vblank_crtc
  drm/vblank: Use spin_(un)lock_irq() in drm_crtc_vblank_off()
  drm/vblank: Add vblank works
  drm/nouveau/kms/nv50-: Unroll error cleanup in nv50_head_create()
  drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit
  drm/nouveau/kms/nv50-: Fix disabling dithering
  drm/nouveau/kms/nv50-: s/harm/armh/g
  drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom
  drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h
  drm/nouveau/kms/nv50-: Move hard-coded object handles into header
  drm/nouveau/kms/nvd9-: Add CRC support

 Documentation/gpu/drm-kms.rst               |  15 +
 drivers/gpu/drm/Makefile                    |   2 +-
 drivers/gpu/drm/drm_vblank.c                |  83 ++-
 drivers/gpu/drm/drm_vblank_internal.h       |  19 +
 drivers/gpu/drm/drm_vblank_work.c           | 259 +++++++
 drivers/gpu/drm/drm_vblank_work_internal.h  |  24 +
 drivers/gpu/drm/nouveau/dispnv04/crtc.c     |  25 +-
 drivers/gpu/drm/nouveau/dispnv50/Kbuild     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/atom.h     |  21 +
 drivers/gpu/drm/nouveau/dispnv50/core.h     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/core907d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/core917d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec37d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec57d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/crc.c      | 714 ++++++++++++++++++++
 drivers/gpu/drm/nouveau/dispnv50/crc.h      | 125 ++++
 drivers/gpu/drm/nouveau/dispnv50/crc907d.c  | 139 ++++
 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c  | 153 +++++
 drivers/gpu/drm/nouveau/dispnv50/disp.c     |  58 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.h     |  24 +
 drivers/gpu/drm/nouveau/dispnv50/handles.h  |  16 +
 drivers/gpu/drm/nouveau/dispnv50/head.c     | 134 +++-
 drivers/gpu/drm/nouveau/dispnv50/head.h     |  12 +-
 drivers/gpu/drm/nouveau/dispnv50/head907d.c |  14 +-
 drivers/gpu/drm/nouveau/dispnv50/headc37d.c |  27 +-
 drivers/gpu/drm/nouveau/dispnv50/headc57d.c |  20 +-
 drivers/gpu/drm/nouveau/dispnv50/wndw.c     |  15 +-
 drivers/gpu/drm/nouveau/nouveau_display.c   |  60 +-
 include/drm/drm_vblank.h                    |  20 +
 include/drm/drm_vblank_work.h               |  71 ++
 30 files changed, 1910 insertions(+), 160 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_vblank_internal.h
 create mode 100644 drivers/gpu/drm/drm_vblank_work.c
 create mode 100644 drivers/gpu/drm/drm_vblank_work_internal.h
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.h
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc907d.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/handles.h
 create mode 100644 include/drm/drm_vblank_work.h

-- 
2.26.2
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 01/11] drm/vblank: Register drmm cleanup action once per drm_vblank_crtc
Since we'll be allocating resources for kthread_create_worker() in the
next commit (which could fail and require us to clean up the mess),
let's simplify the cleanup process a bit by registering a
drm_vblank_init_release() action for each drm_vblank_crtc so they're
still cleaned up if we fail to initialize one of them.

Changes since v3:
* Use drmm_add_action_or_reset() - Daniel Vetter

Cc: Daniel Vetter <daniel at ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
Cc: dri-devel at lists.freedesktop.org
Cc: nouveau at lists.freedesktop.org
Signed-off-by: Lyude Paul <lyude at redhat.com>
---
 drivers/gpu/drm/drm_vblank.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 85e5f2db16085..ce5c1e1d29963 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -492,16 +492,12 @@ static void vblank_disable_fn(struct timer_list *t)
 
 static void drm_vblank_init_release(struct drm_device *dev, void *ptr)
 {
-	unsigned int pipe;
-
-	for (pipe = 0; pipe < dev->num_crtcs; pipe++) {
-		struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
+	struct drm_vblank_crtc *vblank = ptr;
 
-		drm_WARN_ON(dev, READ_ONCE(vblank->enabled) &&
-			    drm_core_check_feature(dev, DRIVER_MODESET));
+	drm_WARN_ON(dev, READ_ONCE(vblank->enabled) &&
+		    drm_core_check_feature(dev, DRIVER_MODESET));
 
-		del_timer_sync(&vblank->disable_timer);
-	}
+	del_timer_sync(&vblank->disable_timer);
 }
 
 /**
@@ -511,7 +507,7 @@ static void drm_vblank_init_release(struct drm_device *dev, void *ptr)
  *
  * This function initializes vblank support for @num_crtcs display pipelines.
  * Cleanup is handled automatically through a cleanup function added with
- * drmm_add_action().
+ * drmm_add_action_or_reset().
  *
  * Returns:
  * Zero on success or a negative error code on failure.
@@ -530,10 +526,6 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
 
 	dev->num_crtcs = num_crtcs;
 
-	ret = drmm_add_action(dev, drm_vblank_init_release, NULL);
-	if (ret)
-		return ret;
-
 	for (i = 0; i < num_crtcs; i++) {
 		struct drm_vblank_crtc *vblank = &dev->vblank[i];
 
@@ -542,6 +534,11 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
 		init_waitqueue_head(&vblank->queue);
 		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
 		seqlock_init(&vblank->seqlock);
+
+		ret = drmm_add_action_or_reset(dev, drm_vblank_init_release,
+					       vblank);
+		if (ret)
+			return ret;
 	}
 
 	return 0;
-- 
2.26.2
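The per-item cleanup registration in this patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct fake_vblank, struct action_list, add_action_or_reset() and the capacity field are hypothetical stand-ins and not the drmm API. The one property borrowed from drmm_add_action_or_reset() is that the release action runs immediately when registration fails, so an item that was initialized but could not be registered is still cleaned up.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

/* Hypothetical stand-in for a drm_vblank_crtc; only tracks its timer. */
struct fake_vblank {
	int timer_armed;
};

typedef void (*release_fn)(void *ptr);

/* Hypothetical managed-action list (not the real drmm implementation). */
struct action_list {
	release_fn fn[MAX_ACTIONS];
	void *ptr[MAX_ACTIONS];
	size_t count;
	size_t capacity;	/* lets the demo force a registration failure */
};

/* Per-item release action, analogous to drm_vblank_init_release(). */
static void fake_vblank_release(void *ptr)
{
	struct fake_vblank *vblank = ptr;

	vblank->timer_armed = 0;
}

/* Model of "add or reset": if registering fails, run the action now. */
static int add_action_or_reset(struct action_list *list, release_fn fn,
			       void *ptr)
{
	if (list->count >= list->capacity || list->count >= MAX_ACTIONS) {
		fn(ptr);
		return -1;
	}
	list->fn[list->count] = fn;
	list->ptr[list->count] = ptr;
	list->count++;
	return 0;
}

/* Run all registered actions in reverse order, as device teardown would. */
void release_all(struct action_list *list)
{
	while (list->count > 0) {
		list->count--;
		list->fn[list->count](list->ptr[list->count]);
	}
}

/* Initialize n items, registering one release action per item. */
int fake_vblank_init(struct action_list *list, struct fake_vblank *v,
		     size_t n)
{
	for (size_t i = 0; i < n; i++) {
		v[i].timer_armed = 1;
		if (add_action_or_reset(list, fake_vblank_release, &v[i]))
			return -1;
	}
	return 0;
}

/* Usage example: the third registration fails, nothing stays armed. */
void example_partial_failure(void)
{
	struct fake_vblank v[3] = { {0}, {0}, {0} };
	struct action_list list = { .capacity = 2 };

	assert(fake_vblank_init(&list, v, 3) == -1);
	assert(v[2].timer_armed == 0);	/* reset ran immediately */
	assert(v[0].timer_armed == 1 && v[1].timer_armed == 1);

	release_all(&list);		/* later teardown handles the rest */
	assert(v[0].timer_armed == 0 && v[1].timer_armed == 0);
}
```

In this model each item carries its own release action, so a failure part way through the loop leaves exactly the already-registered items for teardown to release.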
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 02/11] drm/vblank: Use spin_(un)lock_irq() in drm_crtc_vblank_off()
This got me confused for a bit while looking over this code: I had been
planning on adding some blocking function calls into this function, but
seeing the irqsave/irqrestore variants of spin_(un)lock() didn't make
it very clear whether or not that would actually be safe.

So I went ahead and reviewed every single driver in the kernel that
uses this function, and they all fall into three categories:

* Driver probe code
* ->atomic_disable() callbacks
* Legacy modesetting callbacks

All of these will be guaranteed to have IRQs enabled, which means it's
perfectly safe to block here. Just to make things a little less
confusing to others in the future, let's switch over to
spin_lock_irq()/spin_unlock_irq() to make that fact a little more
obvious.

Signed-off-by: Lyude Paul <lyude at redhat.com>
Cc: Daniel Vetter <daniel at ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
---
 drivers/gpu/drm/drm_vblank.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index ce5c1e1d29963..e895f5331fdb4 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -1283,13 +1283,12 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
 	struct drm_pending_vblank_event *e, *t;
 
 	ktime_t now;
-	unsigned long irqflags;
 	u64 seq;
 
 	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
 		return;
 
-	spin_lock_irqsave(&dev->event_lock, irqflags);
+	spin_lock_irq(&dev->event_lock);
 
 	spin_lock(&dev->vbl_lock);
 	drm_dbg_vbl(dev, "crtc %d, vblank enabled %d, inmodeset %d\n",
@@ -1325,7 +1324,7 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
 		drm_vblank_put(dev, pipe);
 		send_vblank_event(dev, e, seq, now);
 	}
-	spin_unlock_irqrestore(&dev->event_lock, irqflags);
+	spin_unlock_irq(&dev->event_lock);
 
 	/* Will be reset by the modeset helpers when re-enabling the crtc by
 	 * calling drm_calc_timestamping_constants(). */
-- 
2.26.2
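The reason the plain _irq variants are only appropriate when the caller is known to have IRQs enabled can be shown with a toy userspace model. Everything here is an assumption made for illustration: a single bool stands in for the CPU's interrupt-enable state, and the model_*() helpers are hypothetical, not the kernel's locking primitives. The sketch only captures the documented difference that the irqsave/irqrestore pair puts back the caller's state while the irq pair re-enables unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the local CPU's interrupt-enable flag. */
static bool irqs_enabled = true;

/* Model of spin_lock_irqsave(): remember the old state, then disable. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Model of spin_unlock_irqrestore(): put back what the caller had. */
static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Model of spin_lock_irq(): disable unconditionally. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

/* Model of spin_unlock_irq(): enable unconditionally. */
static void model_unlock_irq(void)
{
	irqs_enabled = true;
}

/* IRQ state after a lock/unlock pair using the irqsave variants. */
bool state_after_irqsave_pair(bool enabled_on_entry)
{
	bool flags;

	irqs_enabled = enabled_on_entry;
	flags = model_lock_irqsave();
	model_unlock_irqrestore(flags);
	return irqs_enabled;
}

/* IRQ state after a lock/unlock pair using the plain _irq variants. */
bool state_after_irq_pair(bool enabled_on_entry)
{
	irqs_enabled = enabled_on_entry;
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;
}

/* Usage example: the pairs only differ when IRQs were off on entry. */
void example_irq_variants(void)
{
	assert(state_after_irqsave_pair(true) == true);
	assert(state_after_irq_pair(true) == true);
	assert(state_after_irqsave_pair(false) == false);
	assert(state_after_irq_pair(false) == true);	/* state not preserved */
}
```

Since every caller found in the review runs with IRQs enabled, both pairs behave the same there, and the _irq form simply documents that expectation.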
Lyude Paul
[Nouveau] [RFC v7 03/11] drm/vblank: Add vblank works
Add some kind of vblank workers. The interface is similar to regular
delayed works, and is mostly based off kthread_work. It allows for
scheduling delayed works that execute once a particular vblank sequence
has passed. It also allows for accurate flushing of scheduled vblank
works - in that flushing waits for both the vblank sequence and job
execution to complete, or for the work to get cancelled - whichever
comes first.

Whatever hardware programming we do in the work must be fast (must at
least complete during the vblank or scanout period, sometimes during
the first few scanlines of the vblank). As such we use a high-priority
per-CRTC thread to accomplish this.

Changes since v6:
* Get rid of ->pending and seqcounts, and implement flushing through
  simpler means - danvet
* Get rid of work_lock, just use drm_device->event_lock
* Move drm_vblank_work item cleanup into drm_crtc_vblank_off() so that
  we ensure that all vblank work has finished before disabling vblanks
* Add checks into drm_crtc_vblank_reset() so we yell if it gets called
  while there's vblank workers active
* Grab event_lock in both drm_crtc_vblank_on()/drm_crtc_vblank_off(),
  the main reason for this is so that other threads calling
  drm_vblank_work_schedule() are blocked from attempting to schedule
  while we're in the middle of enabling/disabling vblanks.
* Move drm_handle_vblank_works() call below drm_handle_vblank_events()
* Simplify drm_vblank_work_cancel_sync()
* Fix drm_vblank_work_cancel_sync() documentation
* Move wake_up_all() calls out of spinlock where we can. The only one I
  left was the call to wake_up_all() in drm_handle_vblank_works() as
  this seemed like it made more sense just living in that function
  (which is all technically under lock)
* Move drm_vblank_work related functions into their own source files
* Add drm_vblank_internal.h so we can export some functions we don't
  want drivers using, but that we do need to use in drm_vblank_work.c
* Add a bunch of documentation

Changes since v4:
* Get rid of kthread interfaces we tried adding and move all of the
  locking into drm_vblank.c. For implementing drm_vblank_work_flush(),
  we now use a wait_queue and sequence counters in order to
  differentiate between multiple work item executions.
* Get rid of drm_vblank_work_cancel() - this would have been pretty
  difficult to actually reimplement and it occurred to me that neither
  nouveau nor i915 are even planning to use this function. Since
  there's also no async cancel function for most of the work interfaces
  in the kernel, it seems a bit unnecessary anyway.
* Get rid of to_drm_vblank_work() since we now are also able to just
  pass the struct drm_vblank_work to work item callbacks anyway

Changes since v3:
* Use our own spinlocks, don't integrate so tightly with kthread_works

Changes since v2:
* Use kthread_workers instead of reinventing the wheel.

Cc: Daniel Vetter <daniel at ffwll.ch>
Cc: Tejun Heo <tj at kernel.org>
Cc: dri-devel at lists.freedesktop.org
Cc: nouveau at lists.freedesktop.org
Co-developed-by: Ville Syrjälä
<ville.syrjala at linux.intel.com> Signed-off-by: Lyude Paul <lyude at redhat.com> --- Documentation/gpu/drm-kms.rst | 15 ++ drivers/gpu/drm/Makefile | 2 +- drivers/gpu/drm/drm_vblank.c | 55 +++-- drivers/gpu/drm/drm_vblank_internal.h | 19 ++ drivers/gpu/drm/drm_vblank_work.c | 259 +++++++++++++++++++++ drivers/gpu/drm/drm_vblank_work_internal.h | 24 ++ include/drm/drm_vblank.h | 20 ++ include/drm/drm_vblank_work.h | 71 ++++++ 8 files changed, 447 insertions(+), 18 deletions(-) create mode 100644 drivers/gpu/drm/drm_vblank_internal.h create mode 100644 drivers/gpu/drm/drm_vblank_work.c create mode 100644 drivers/gpu/drm/drm_vblank_work_internal.h create mode 100644 include/drm/drm_vblank_work.h diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst index 975cfeb8a3532..3c5ae4f6dfd23 100644 --- a/Documentation/gpu/drm-kms.rst +++ b/Documentation/gpu/drm-kms.rst @@ -543,3 +543,18 @@ Vertical Blanking and Interrupt Handling Functions Reference .. kernel-doc:: drivers/gpu/drm/drm_vblank.c :export: + +Vertical Blank Work +==================+ +.. kernel-doc:: drivers/gpu/drm/drm_vblank_work.c + :doc: vblank works + +Vertical Blank Work Functions Reference +--------------------------------------- + +.. kernel-doc:: include/drm/drm_vblank_work.h + :internal: + +.. 
kernel-doc:: drivers/gpu/drm/drm_vblank_work.c + :export: diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 2c0e5a7e59536..02ee5faf1a925 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -18,7 +18,7 @@ drm-y := drm_auth.o drm_cache.o \ drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \ drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \ drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \ - drm_managed.o + drm_managed.o drm_vblank_work.o drm-$(CONFIG_DRM_LEGACY) += drm_legacy_misc.o drm_bufs.o drm_context.o drm_dma.o drm_scatter.o drm_lock.o drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c index e895f5331fdb4..b353bc8328414 100644 --- a/drivers/gpu/drm/drm_vblank.c +++ b/drivers/gpu/drm/drm_vblank.c @@ -25,6 +25,7 @@ */ #include <linux/export.h> +#include <linux/kthread.h> #include <linux/moduleparam.h> #include <drm/drm_crtc.h> @@ -37,6 +38,8 @@ #include "drm_internal.h" #include "drm_trace.h" +#include "drm_vblank_internal.h" +#include "drm_vblank_work_internal.h" /** * DOC: vblank handling @@ -363,7 +366,7 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, store_vblank(dev, pipe, diff, t_vblank, cur_vblank); } -static u64 drm_vblank_count(struct drm_device *dev, unsigned int pipe) +u64 drm_vblank_count(struct drm_device *dev, unsigned int pipe) { struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; u64 count; @@ -497,6 +500,7 @@ static void drm_vblank_init_release(struct drm_device *dev, void *ptr) drm_WARN_ON(dev, READ_ONCE(vblank->enabled) && drm_core_check_feature(dev, DRIVER_MODESET)); + drm_vblank_destroy_worker(vblank); del_timer_sync(&vblank->disable_timer); } @@ -539,6 +543,10 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs) vblank); if (ret) return ret; + + ret = drm_vblank_worker_init(vblank); + if (ret) + return ret; } return 0; @@ -1135,7 +1143,7 @@ static int 
drm_vblank_enable(struct drm_device *dev, unsigned int pipe) return ret; } -static int drm_vblank_get(struct drm_device *dev, unsigned int pipe) +int drm_vblank_get(struct drm_device *dev, unsigned int pipe) { struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; unsigned long irqflags; @@ -1178,7 +1186,7 @@ int drm_crtc_vblank_get(struct drm_crtc *crtc) } EXPORT_SYMBOL(drm_crtc_vblank_get); -static void drm_vblank_put(struct drm_device *dev, unsigned int pipe) +void drm_vblank_put(struct drm_device *dev, unsigned int pipe) { struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; @@ -1281,13 +1289,16 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc) unsigned int pipe = drm_crtc_index(crtc); struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; struct drm_pending_vblank_event *e, *t; - ktime_t now; u64 seq; if (drm_WARN_ON(dev, pipe >= dev->num_crtcs)) return; + /* + * Grab event_lock early to prevent vblank work from being scheduled + * while we're in the middle of shutting down vblank interrupts + */ spin_lock_irq(&dev->event_lock); spin_lock(&dev->vbl_lock); @@ -1324,11 +1335,18 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc) drm_vblank_put(dev, pipe); send_vblank_event(dev, e, seq, now); } + + /* Cancel any leftover pending vblank work */ + drm_vblank_cancel_pending_works(vblank); + spin_unlock_irq(&dev->event_lock); /* Will be reset by the modeset helpers when re-enabling the crtc by * calling drm_calc_timestamping_constants(). 
*/ vblank->hwmode.crtc_clock = 0; + + /* Wait for any vblank work that's still executing to finish */ + drm_vblank_flush_worker(vblank); } EXPORT_SYMBOL(drm_crtc_vblank_off); @@ -1363,6 +1381,7 @@ void drm_crtc_vblank_reset(struct drm_crtc *crtc) spin_unlock_irqrestore(&dev->vbl_lock, irqflags); drm_WARN_ON(dev, !list_empty(&dev->vblank_event_list)); + drm_WARN_ON(dev, !list_empty(&vblank->pending_work)); } EXPORT_SYMBOL(drm_crtc_vblank_reset); @@ -1417,7 +1436,10 @@ void drm_crtc_vblank_on(struct drm_crtc *crtc) if (drm_WARN_ON(dev, pipe >= dev->num_crtcs)) return; - spin_lock_irqsave(&dev->vbl_lock, irqflags); + /* So vblank works can't be scheduled until we've finished */ + spin_lock_irqsave(&dev->event_lock, irqflags); + + spin_lock(&dev->vbl_lock); drm_dbg_vbl(dev, "crtc %d, vblank enabled %d, inmodeset %d\n", pipe, vblank->enabled, vblank->inmodeset); @@ -1435,7 +1457,9 @@ void drm_crtc_vblank_on(struct drm_crtc *crtc) */ if (atomic_read(&vblank->refcount) != 0 || drm_vblank_offdelay == 0) drm_WARN_ON(dev, drm_vblank_enable(dev, pipe)); - spin_unlock_irqrestore(&dev->vbl_lock, irqflags); + + spin_unlock(&dev->vbl_lock); + spin_unlock_irqrestore(&dev->event_lock, irqflags); } EXPORT_SYMBOL(drm_crtc_vblank_on); @@ -1589,11 +1613,6 @@ int drm_legacy_modeset_ctl_ioctl(struct drm_device *dev, void *data, return 0; } -static inline bool vblank_passed(u64 seq, u64 ref) -{ - return (seq - ref) <= (1 << 23); -} - static int drm_queue_vblank_event(struct drm_device *dev, unsigned int pipe, u64 req_seq, union drm_wait_vblank *vblwait, @@ -1650,7 +1669,7 @@ static int drm_queue_vblank_event(struct drm_device *dev, unsigned int pipe, trace_drm_vblank_event_queued(file_priv, pipe, req_seq); e->sequence = req_seq; - if (vblank_passed(seq, req_seq)) { + if (drm_vblank_passed(seq, req_seq)) { drm_vblank_put(dev, pipe); send_vblank_event(dev, e, seq, now); vblwait->reply.sequence = seq; @@ -1805,7 +1824,7 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void *data, } if 
((flags & _DRM_VBLANK_NEXTONMISS) && - vblank_passed(seq, req_seq)) { + drm_vblank_passed(seq, req_seq)) { req_seq = seq + 1; vblwait->request.type &= ~_DRM_VBLANK_NEXTONMISS; vblwait->request.sequence = req_seq; @@ -1824,7 +1843,7 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void *data, drm_dbg_core(dev, "waiting on vblank count %llu, crtc %u\n", req_seq, pipe); wait = wait_event_interruptible_timeout(vblank->queue, - vblank_passed(drm_vblank_count(dev, pipe), req_seq) || + drm_vblank_passed(drm_vblank_count(dev, pipe), req_seq) || !READ_ONCE(vblank->enabled), msecs_to_jiffies(3000)); @@ -1873,7 +1892,7 @@ static void drm_handle_vblank_events(struct drm_device *dev, unsigned int pipe) list_for_each_entry_safe(e, t, &dev->vblank_event_list, base.link) { if (e->pipe != pipe) continue; - if (!vblank_passed(seq, e->sequence)) + if (!drm_vblank_passed(seq, e->sequence)) continue; drm_dbg_core(dev, "vblank event on %llu, current %llu\n", @@ -1943,6 +1962,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) !atomic_read(&vblank->refcount)); drm_handle_vblank_events(dev, pipe); + drm_handle_vblank_works(vblank); spin_unlock_irqrestore(&dev->event_lock, irqflags); @@ -2096,7 +2116,7 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data, if (flags & DRM_CRTC_SEQUENCE_RELATIVE) req_seq += seq; - if ((flags & DRM_CRTC_SEQUENCE_NEXT_ON_MISS) && vblank_passed(seq, req_seq)) + if ((flags & DRM_CRTC_SEQUENCE_NEXT_ON_MISS) && drm_vblank_passed(seq, req_seq)) req_seq = seq + 1; e->pipe = pipe; @@ -2125,7 +2145,7 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data, e->sequence = req_seq; - if (vblank_passed(seq, req_seq)) { + if (drm_vblank_passed(seq, req_seq)) { drm_crtc_vblank_put(crtc); send_vblank_event(dev, e, seq, now); queue_seq->sequence = seq; @@ -2145,3 +2165,4 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data, kfree(e); return ret; } + diff --git a/drivers/gpu/drm/drm_vblank_internal.h 
b/drivers/gpu/drm/drm_vblank_internal.h new file mode 100644 index 0000000000000..217ae5442ddce --- /dev/null +++ b/drivers/gpu/drm/drm_vblank_internal.h @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: MIT + +#ifndef DRM_VBLANK_INTERNAL_H +#define DRM_VBLANK_INTERNAL_H + +#include <linux/types.h> + +#include <drm/drm_device.h> + +static inline bool drm_vblank_passed(u64 seq, u64 ref) +{ + return (seq - ref) <= (1 << 23); +} + +int drm_vblank_get(struct drm_device *dev, unsigned int pipe); +void drm_vblank_put(struct drm_device *dev, unsigned int pipe); +u64 drm_vblank_count(struct drm_device *dev, unsigned int pipe); + +#endif /* !DRM_VBLANK_INTERNAL_H */ diff --git a/drivers/gpu/drm/drm_vblank_work.c b/drivers/gpu/drm/drm_vblank_work.c new file mode 100644 index 0000000000000..0762ad34cdcc0 --- /dev/null +++ b/drivers/gpu/drm/drm_vblank_work.c @@ -0,0 +1,259 @@ +// SPDX-License-Identifier: MIT + +#include <uapi/linux/sched/types.h> + +#include <drm/drm_print.h> +#include <drm/drm_vblank.h> +#include <drm/drm_vblank_work.h> +#include <drm/drm_crtc.h> + +#include "drm_vblank_internal.h" +#include "drm_vblank_work_internal.h" + +/** + * DOC: vblank works + * + * Many DRM drivers need to program hardware in a time-sensitive manner, many + * times with a deadline of starting and finishing within a certain region of + * the scanout. Most of the time the safest way to accomplish this is to + * simply do said time-sensitive programming in the driver's IRQ handler, + * which allows drivers to avoid being preempted during these critical + * regions. Or even better, the hardware may even handle applying such + * time-critical programming independently of the CPU. + * + * While there's a decent amount of hardware that's designed so that the CPU + * doesn't need to be concerned with extremely time-sensitive programming, + * there's a few situations where it can't be helped. 
Some unforgiving + * hardware may require that certain time-sensitive programming be handled + * completely by the CPU, and said programming may even take too long to + * handle in an IRQ handler. Another such situation would be where the driver + * needs to perform a task that needs to complete within a specific scanout + * period, but might possibly block and thus cannot be handled in an IRQ + * context. Both of these situations can't be solved perfectly in Linux since + * we're not a realtime kernel, and thus the scheduler may cause us to miss + * our deadline if it decides to preempt us. But for some drivers, it's good + * enough if we can lower our chance of being preempted to an absolute + * minimum. + * + * This is where &drm_vblank_work comes in. &drm_vblank_work provides a simple + * generic delayed work implementation which delays work execution until a + * particular vblank has passed, and then executes the work at realtime + * priority. This provides the best possible chance at performing + * time-sensitive hardware programming on time, even when the system is under + * heavy load. &drm_vblank_work also supports rescheduling, so that self + * re-arming work items can be easily implemented. + */ + +void drm_handle_vblank_works(struct drm_vblank_crtc *vblank) +{ + struct drm_vblank_work *work, *next; + u64 count = atomic64_read(&vblank->count); + bool wake = false; + + assert_spin_locked(&vblank->dev->event_lock); + + list_for_each_entry_safe(work, next, &vblank->pending_work, node) { + if (!drm_vblank_passed(count, work->count)) + continue; + + list_del_init(&work->node); + drm_vblank_put(vblank->dev, vblank->pipe); + kthread_queue_work(vblank->worker, &work->base); + wake = true; + } + if (wake) + wake_up_all(&vblank->work_wait_queue); +} + +/* Handle cancelling any pending vblank work items and drop respective vblank + * references in response to vblank interrupts being disabled. 
+ */ +void drm_vblank_cancel_pending_works(struct drm_vblank_crtc *vblank) +{ + struct drm_vblank_work *work, *next; + + assert_spin_locked(&vblank->dev->event_lock); + + list_for_each_entry_safe(work, next, &vblank->pending_work, node) { + list_del_init(&work->node); + drm_vblank_put(vblank->dev, vblank->pipe); + } + + wake_up_all(&vblank->work_wait_queue); +} + +/** + * drm_vblank_work_schedule - schedule a vblank work + * @work: vblank work to schedule + * @count: target vblank count + * @nextonmiss: defer until the next vblank if target vblank was missed + * + * Schedule @work for execution once the crtc vblank count reaches @count. + * + * If the crtc vblank count has already reached @count and @nextonmiss is + * %false the work starts to execute immediately. + * + * If the crtc vblank count has already reached @count and @nextonmiss is + * %true the work is deferred until the next vblank (as if @count has been + * specified as crtc vblank count + 1). + * + * If @work is already scheduled, this function will reschedule said work + * using the new @count. + * + * Returns: + * 0 on success, error code on failure. 
+ */ +int drm_vblank_work_schedule(struct drm_vblank_work *work, + u64 count, bool nextonmiss) +{ + struct drm_vblank_crtc *vblank = work->vblank; + struct drm_device *dev = vblank->dev; + u64 cur_vbl; + unsigned long irqflags; + bool passed, rescheduling = false, wake = false; + int ret = 0; + + spin_lock_irqsave(&dev->event_lock, irqflags); + if (!vblank->worker || vblank->inmodeset || work->cancelling) + goto out; + + if (list_empty(&work->node)) { + ret = drm_vblank_get(dev, vblank->pipe); + if (ret < 0) + goto out; + } else if (work->count == count) { + /* Already scheduled w/ same vbl count */ + goto out; + } else { + rescheduling = true; + } + + work->count = count; + cur_vbl = drm_vblank_count(dev, vblank->pipe); + passed = drm_vblank_passed(cur_vbl, count); + if (passed) + DRM_DEV_ERROR(dev->dev, + "crtc %d vblank %llu already passed (current %llu)\n", + vblank->pipe, count, cur_vbl); + + if (!nextonmiss && passed) { + drm_vblank_put(dev, vblank->pipe); + kthread_queue_work(vblank->worker, &work->base); + + if (rescheduling) { + list_del_init(&work->node); + wake = true; + } + } else if (!rescheduling) { + list_add_tail(&work->node, &vblank->pending_work); + } + +out: + spin_unlock_irqrestore(&dev->event_lock, irqflags); + if (wake) + wake_up_all(&vblank->work_wait_queue); + return ret; +} +EXPORT_SYMBOL(drm_vblank_work_schedule); + +/** + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it to + * finish executing + * @work: vblank work to cancel + * + * Cancel an already scheduled vblank work and wait for its + * execution to finish. + * + * On return, @work is guaranteed to no longer be scheduled or running, even + * if it's self-arming. + * + * Returns: + * %True if the work was cancelled before it started to execute, %false + * otherwise. 
+ */ +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work) +{ + struct drm_vblank_crtc *vblank = work->vblank; + struct drm_device *dev = vblank->dev; + bool ret = false; + + spin_lock_irq(&dev->event_lock); + if (!list_empty(&work->node)) { + list_del_init(&work->node); + drm_vblank_put(vblank->dev, vblank->pipe); + ret = true; + } + + work->cancelling++; + spin_unlock_irq(&dev->event_lock); + + wake_up_all(&vblank->work_wait_queue); + + if (kthread_cancel_work_sync(&work->base)) + ret = true; + + spin_lock_irq(&dev->event_lock); + work->cancelling--; + spin_unlock_irq(&dev->event_lock); + + return ret; +} +EXPORT_SYMBOL(drm_vblank_work_cancel_sync); + +/** + * drm_vblank_work_flush - wait for a scheduled vblank work to finish + * executing + * @work: vblank work to flush + * + * Wait until @work has finished executing once. + */ +void drm_vblank_work_flush(struct drm_vblank_work *work) +{ + struct drm_vblank_crtc *vblank = work->vblank; + struct drm_device *dev = vblank->dev; + + spin_lock_irq(&dev->event_lock); + wait_event_lock_irq(vblank->work_wait_queue, list_empty(&work->node), + dev->event_lock); + spin_unlock_irq(&dev->event_lock); + + kthread_flush_work(&work->base); +} +EXPORT_SYMBOL(drm_vblank_work_flush); + +/** + * drm_vblank_work_init - initialize a vblank work item + * @work: vblank work item + * @crtc: CRTC whose vblank will trigger the work execution + * @func: work function to be executed + * + * Initialize a vblank work item for a specific crtc. 
+ */ +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc, + void (*func)(struct kthread_work *work)) +{ + kthread_init_work(&work->base, func); + INIT_LIST_HEAD(&work->node); + work->vblank = &crtc->dev->vblank[drm_crtc_index(crtc)]; +} +EXPORT_SYMBOL(drm_vblank_work_init); + +int drm_vblank_worker_init(struct drm_vblank_crtc *vblank) +{ + struct sched_param param = { + .sched_priority = MAX_RT_PRIO - 1, + }; + struct kthread_worker *worker; + + INIT_LIST_HEAD(&vblank->pending_work); + init_waitqueue_head(&vblank->work_wait_queue); + worker = kthread_create_worker(0, "card%d-crtc%d", + vblank->dev->primary->index, + vblank->pipe); + if (IS_ERR(worker)) + return PTR_ERR(worker); + + vblank->worker = worker; + + return sched_setscheduler(vblank->worker->task, SCHED_FIFO, &param); +} diff --git a/drivers/gpu/drm/drm_vblank_work_internal.h b/drivers/gpu/drm/drm_vblank_work_internal.h new file mode 100644 index 0000000000000..0a4abbc4ab295 --- /dev/null +++ b/drivers/gpu/drm/drm_vblank_work_internal.h @@ -0,0 +1,24 @@ +// SPDX-License-Identifier: MIT + +#ifndef _DRM_VBLANK_WORK_INTERNAL_H_ +#define _DRM_VBLANK_WORK_INTERNAL_H_ + +#include <drm/drm_vblank.h> + +int drm_vblank_worker_init(struct drm_vblank_crtc *vblank); +void drm_vblank_cancel_pending_works(struct drm_vblank_crtc *vblank); +void drm_handle_vblank_works(struct drm_vblank_crtc *vblank); + +static inline void drm_vblank_flush_worker(struct drm_vblank_crtc *vblank) +{ + if (vblank->worker) + kthread_flush_worker(vblank->worker); +} + +static inline void drm_vblank_destroy_worker(struct drm_vblank_crtc *vblank) +{ + if (vblank->worker) + kthread_destroy_worker(vblank->worker); +} + +#endif /* !_DRM_VBLANK_WORK_INTERNAL_H_ */ diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h index dd9f5b9e56e4e..dd125f8c766cf 100644 --- a/include/drm/drm_vblank.h +++ b/include/drm/drm_vblank.h @@ -27,12 +27,14 @@ #include <linux/seqlock.h> #include <linux/idr.h> #include <linux/poll.h> 
+#include <linux/kthread.h> #include <drm/drm_file.h> #include <drm/drm_modes.h> struct drm_device; struct drm_crtc; +struct drm_vblank_work; /** * struct drm_pending_vblank_event - pending vblank event tracking @@ -203,6 +205,24 @@ struct drm_vblank_crtc { * disabling functions multiple times. */ bool enabled; + + /** + * @worker: The &kthread_worker used for executing vblank works. + */ + struct kthread_worker *worker; + + /** + * @pending_work: A list of scheduled &drm_vblank_work items that are + * waiting for a future vblank. + */ + struct list_head pending_work; + + /** + * @work_wait_queue: The wait queue used for signaling that a + * &drm_vblank_work item has either finished executing, or was + * cancelled. + */ + wait_queue_head_t work_wait_queue; }; int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs); diff --git a/include/drm/drm_vblank_work.h b/include/drm/drm_vblank_work.h new file mode 100644 index 0000000000000..f0439c039f7ce --- /dev/null +++ b/include/drm/drm_vblank_work.h @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: MIT + +#ifndef _DRM_VBLANK_WORK_H_ +#define _DRM_VBLANK_WORK_H_ + +#include <linux/kthread.h> + +struct drm_crtc; + +/** + * struct drm_vblank_work - A delayed work item which delays until a target + * vblank passes, and then executes at realtime priority outside of IRQ + * context. + * + * See also: + * drm_vblank_work_schedule() + * drm_vblank_work_init() + * drm_vblank_work_cancel_sync() + * drm_vblank_work_flush() + */ +struct drm_vblank_work { + /** + * @base: The base &kthread_work item which will be executed by + * &drm_vblank_crtc.worker. Drivers should not interact with this + * directly, and instead rely on drm_vblank_work_init() to initialize + * this. + */ + struct kthread_work base; + + /** + * @vblank: A pointer to &drm_vblank_crtc this work item belongs to. + */ + struct drm_vblank_crtc *vblank; + + /** + * @count: The target vblank this work will execute on. 
Drivers should + * not modify this value directly, and instead use + * drm_vblank_work_schedule() + */ + u64 count; + + /** + * @cancelling: The number of drm_vblank_work_cancel_sync() calls that + * are currently running. A work item cannot be rescheduled until all + * calls have finished. + */ + int cancelling; + + /** + * @node: The position of this work item in + * &drm_vblank_crtc.pending_work. + */ + struct list_head node; +}; + +/** + * to_drm_vblank_work - Retrieve the respective &drm_vblank_work item from a + * &kthread_work + * @_work: The &kthread_work embedded inside a &drm_vblank_work + */ +#define to_drm_vblank_work(_work) \ + container_of((_work), struct drm_vblank_work, base) + +int drm_vblank_work_schedule(struct drm_vblank_work *work, + u64 count, bool nextonmiss); +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc, + void (*func)(struct kthread_work *work)); +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work); +void drm_vblank_work_flush(struct drm_vblank_work *work); + +#endif /* !_DRM_VBLANK_WORK_H_ */ -- 2.26.2
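For readers who want a feel for the scheduling behaviour without reading the kernel side, here is a rough userspace model of a vblank work queue. It is only a sketch: every name (model_crtc, model_work, model_schedule, model_handle_vblank) is made up for illustration, the "worker" is just a run counter, and the handling of nextonmiss is an assumption based on the parameter's name (run immediately if the target was missed and nextonmiss is false, otherwise defer to the next vblank). Counter wraparound is ignored.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PENDING 8

/* Hypothetical stand-in for struct drm_vblank_work. */
struct model_work {
	uint64_t target;	/* vblank count this work should run on */
	bool pending;		/* true while queued on the CRTC */
	int runs;		/* how many times the work "executed" */
};

/* Hypothetical stand-in for struct drm_vblank_crtc. */
struct model_crtc {
	uint64_t count;		/* current vblank count */
	struct model_work *pending[MODEL_MAX_PENDING];
};

/*
 * Returns 1 if the work was scheduled (or ran right away), 0 if it was
 * already pending, and -1 if the model's fixed-size queue is full.
 */
static int model_schedule(struct model_crtc *crtc, struct model_work *work,
			  uint64_t target, bool nextonmiss)
{
	bool passed = crtc->count >= target;
	size_t i;

	if (work->pending)
		return 0;

	if (passed && !nextonmiss) {
		/* Target already missed: execute immediately. */
		work->runs++;
		return 1;
	}
	if (passed)
		target = crtc->count + 1; /* defer to the next vblank */

	for (i = 0; i < MODEL_MAX_PENDING; i++) {
		if (!crtc->pending[i]) {
			crtc->pending[i] = work;
			work->target = target;
			work->pending = true;
			return 1;
		}
	}
	return -1;
}

/* Models one vblank interrupt: bump the count, run everything that is due. */
static void model_handle_vblank(struct model_crtc *crtc)
{
	size_t i;

	crtc->count++;
	for (i = 0; i < MODEL_MAX_PENDING; i++) {
		struct model_work *work = crtc->pending[i];

		if (work && crtc->count >= work->target) {
			crtc->pending[i] = NULL;
			work->pending = false;
			work->runs++;
		}
	}
}
```

In the real code the due work items are handed to the per-CRTC kthread_worker rather than run from the interrupt path, which is the whole point of the feature; the model collapses that step.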
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 04/11] drm/nouveau/kms/nv50-: Unroll error cleanup in nv50_head_create()
We'll be rolling back more things in this function, and the way it's structured is a bit confusing. So, let's clean this up a bit and just unroll in the event of failure. Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv50/head.c | 33 +++++++++++++++++-------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c index 8f6455697ba72..e29ea40e7c334 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.c +++ b/drivers/gpu/drm/nouveau/dispnv50/head.c @@ -507,20 +507,28 @@ nv50_head_create(struct drm_device *dev, int index) if (disp->disp->object.oclass < GV100_DISP) { ret = nv50_base_new(drm, head->base.index, &base); + if (ret) + goto fail_free; + ret = nv50_ovly_new(drm, head->base.index, &ovly); + if (ret) + goto fail_free; } else { ret = nv50_wndw_new(drm, DRM_PLANE_TYPE_PRIMARY, head->base.index * 2 + 0, &base); + if (ret) + goto fail_free; + ret = nv50_wndw_new(drm, DRM_PLANE_TYPE_OVERLAY, head->base.index * 2 + 1, &ovly); - } - if (ret == 0) - ret = nv50_curs_new(drm, head->base.index, &curs); - if (ret) { - kfree(head); - return ERR_PTR(ret); + if (ret) + goto fail_free; } + ret = nv50_curs_new(drm, head->base.index, &curs); + if (ret) + goto fail_free; + crtc = &head->base.base; drm_crtc_init_with_planes(dev, crtc, &base->plane, &curs->plane, &nv50_head_func, "head-%d", head->base.index); @@ -533,11 +541,16 @@ nv50_head_create(struct drm_device *dev, int index) if (head->func->olut_set) { ret = nv50_lut_init(disp, &drm->client.mmu, &head->olut); - if (ret) { - nv50_head_destroy(crtc); - return ERR_PTR(ret); - } + if (ret) + goto fail_crtc_cleanup; } return head; + +fail_crtc_cleanup: + drm_crtc_cleanup(crtc); +fail_free: + kfree(head); + + return ERR_PTR(ret); } -- 2.26.2
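The unrolled cleanup above is the usual goto-ladder idiom: each failure jumps to a label that undoes only what has been set up so far, in reverse order. A minimal userspace sketch of the same shape, using a hypothetical struct widget and a fail_at knob that exists purely to force the error paths:

```c
#include <stdlib.h>

/* Hypothetical object with two separately allocated members. */
struct widget {
	int *a;
	int *b;
};

/*
 * fail_at is an illustration-only knob: 1 makes the first allocation
 * "fail", 2 the second, 0 lets everything succeed.
 */
static struct widget *widget_create(int fail_at)
{
	struct widget *w = calloc(1, sizeof(*w));

	if (!w)
		return NULL;

	w->a = (fail_at == 1) ? NULL : malloc(sizeof(*w->a));
	if (!w->a)
		goto fail_free;

	w->b = (fail_at == 2) ? NULL : malloc(sizeof(*w->b));
	if (!w->b)
		goto fail_a;

	return w;

	/* Labels fall through, so later failures undo earlier steps too. */
fail_a:
	free(w->a);
fail_free:
	free(w);
	return NULL;
}

static void widget_destroy(struct widget *w)
{
	free(w->b);
	free(w->a);
	free(w);
}
```

Adding another setup step later only needs one more label at the top of the ladder, which is why this layout scales better than nested if (ret) blocks.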
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 05/11] drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit
Currently, we modify the depth value stored in the atomic state when performing a commit in order to work around the fact that we haven't implemented support for depths higher than 10 yet. This isn't idempotent though, as it will happen on every atomic commit where we modify the OR state even if the head's depth in the atomic state hasn't been modified. Normally this wouldn't matter, since we don't modify OR state outside of modesets, but since the CRC capture region is implemented as part of the OR state in hardware we'll want to make sure all commits modifying OR state are idempotent so as to avoid changing the depth unexpectedly. So, fix this by simply not writing the reduced depth value we come up with to the atomic state. Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv50/headc37d.c | 11 +++++++---- drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 11 +++++++---- 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c index 4a9a32b89f746..9ef3c603fc43e 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c @@ -27,17 +27,20 @@ static void headc37d_or(struct nv50_head *head, struct nv50_head_atom *asyh) { struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan; + u8 depth; u32 *push; + if ((push = evo_wait(core, 2))) { /*XXX: This is a dirty hack until OR depth handling is * improved later for deep colour etc. 
*/ switch (asyh->or.depth) { - case 6: asyh->or.depth = 5; break; - case 5: asyh->or.depth = 4; break; - case 2: asyh->or.depth = 1; break; - case 0: asyh->or.depth = 4; break; + case 6: depth = 5; break; + case 5: depth = 4; break; + case 2: depth = 1; break; + case 0: depth = 4; break; default: + depth = asyh->or.depth; WARN_ON(1); break; } diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c index 859131a8bc3c8..97141eb8e75ab 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c @@ -27,17 +27,20 @@ static void headc57d_or(struct nv50_head *head, struct nv50_head_atom *asyh) { struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan; + u8 depth; u32 *push; + if ((push = evo_wait(core, 2))) { /*XXX: This is a dirty hack until OR depth handling is * improved later for deep colour etc. */ switch (asyh->or.depth) { - case 6: asyh->or.depth = 5; break; - case 5: asyh->or.depth = 4; break; - case 2: asyh->or.depth = 1; break; - case 0: asyh->or.depth = 4; break; + case 6: depth = 5; break; + case 5: depth = 4; break; + case 2: depth = 1; break; + case 0: depth = 4; break; default: + depth = asyh->or.depth; WARN_ON(1); break; } -- 2.26.2
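To make the idempotency problem concrete, here is a small standalone sketch. The struct or_state type and both function names are hypothetical; only the depth mapping itself is taken from the diff above. The first function mirrors the patched behaviour (read the state, return the hardware value), the second mirrors the old behaviour (write the result back).

```c
#include <stdint.h>

/* Hypothetical stand-in for the "or" part of nv50_head_atom. */
struct or_state {
	uint8_t depth;
};

/*
 * Patched behaviour: a pure translation from the depth stored in the
 * state to the value programmed into hardware. The state is const, so
 * repeated commits always see the same input.
 */
static uint8_t or_hw_depth(const struct or_state *or_st)
{
	switch (or_st->depth) {
	case 6: return 5;
	case 5: return 4;
	case 2: return 1;
	case 0: return 4;
	default: return or_st->depth; /* the real code also WARNs here */
	}
}

/* Old behaviour, for comparison: the reduced value is written back. */
static uint8_t or_hw_depth_mutating(struct or_state *or_st)
{
	or_st->depth = or_hw_depth(or_st);
	return or_st->depth;
}
```

With a state depth of 6, calling the mutating variant twice programs 5 and then 4, because the second call starts from the already reduced value. The pure variant returns 5 both times and leaves the state at 6.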
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 06/11] drm/nouveau/kms/nv50-: Fix disabling dithering
While we expose the ability to turn off hardware dithering for nouveau, we actually make the mistake of turning it on anyway, due to dithering_depth containing a non-zero value if our dithering depth isn't also set to 6 bpc. So, fix it by never enabling dithering when it's disabled. Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv50/head.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c index e29ea40e7c334..72bc3bce396a7 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.c +++ b/drivers/gpu/drm/nouveau/dispnv50/head.c @@ -84,18 +84,20 @@ nv50_head_atomic_check_dither(struct nv50_head_atom *armh, { u32 mode = 0x00; - if (asyc->dither.mode == DITHERING_MODE_AUTO) { - if (asyh->base.depth > asyh->or.bpc * 3) - mode = DITHERING_MODE_DYNAMIC2X2; - } else { - mode = asyc->dither.mode; - } + if (asyc->dither.mode) { + if (asyc->dither.mode == DITHERING_MODE_AUTO) { + if (asyh->base.depth > asyh->or.bpc * 3) + mode = DITHERING_MODE_DYNAMIC2X2; + } else { + mode = asyc->dither.mode; + } - if (asyc->dither.depth == DITHERING_DEPTH_AUTO) { - if (asyh->or.bpc >= 8) - mode |= DITHERING_DEPTH_8BPC; - } else { - mode |= asyc->dither.depth; + if (asyc->dither.depth == DITHERING_DEPTH_AUTO) { + if (asyh->or.bpc >= 8) + mode |= DITHERING_DEPTH_8BPC; + } else { + mode |= asyc->dither.depth; + } } asyh->dither.enable = mode; -- 2.26.2
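The fixed logic can be read as a pure function of the requested mode/depth and the head's bit depths. The sketch below restates it in standalone form. The DITHER_* constants are illustrative values chosen for this example (the real DITHERING_* definitions live in nouveau's headers and may differ), and dither_enable() is a hypothetical name; the control flow follows the patched nv50_head_atomic_check_dither().

```c
#include <stdint.h>

/* Illustrative values only, not copied from the nouveau headers. */
enum {
	DITHER_MODE_OFF        = 0x00,
	DITHER_MODE_DYNAMIC2X2 = 0x11,
	DITHER_MODE_AUTO       = 0x40,
};

enum {
	DITHER_DEPTH_6BPC = 0x00,
	DITHER_DEPTH_8BPC = 0x02,
	DITHER_DEPTH_AUTO = 0x04,
};

/*
 * Returns the value that would be stored in asyh->dither.enable.
 * base_depth is the framebuffer depth in bits per pixel, or_bpc the
 * output's bits per component.
 */
static uint32_t dither_enable(uint32_t req_mode, uint32_t req_depth,
			      unsigned int base_depth, unsigned int or_bpc)
{
	uint32_t mode = 0;

	/* The fix: when dithering is off, never OR in a depth value. */
	if (req_mode == DITHER_MODE_OFF)
		return 0;

	if (req_mode == DITHER_MODE_AUTO) {
		if (base_depth > or_bpc * 3)
			mode = DITHER_MODE_DYNAMIC2X2;
	} else {
		mode = req_mode;
	}

	if (req_depth == DITHER_DEPTH_AUTO) {
		if (or_bpc >= 8)
			mode |= DITHER_DEPTH_8BPC;
	} else {
		mode |= req_depth;
	}

	return mode;
}
```

Before the patch, the depth bits were ORed in unconditionally, so a request for "off" with an 8 bpc depth still produced a non-zero enable value.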
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 07/11] drm/nouveau/kms/nv50-: s/harm/armh/g
We refer to the armed hardware assembly as armh elsewhere in nouveau, so fix the naming here to make it consistent. This patch contains no functional changes. Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv50/wndw.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c index 99b9b681736da..cfee61f14aa49 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c @@ -408,7 +408,7 @@ nv50_wndw_atomic_check(struct drm_plane *plane, struct drm_plane_state *state) struct nv50_wndw *wndw = nv50_wndw(plane); struct nv50_wndw_atom *armw = nv50_wndw_atom(wndw->plane.state); struct nv50_wndw_atom *asyw = nv50_wndw_atom(state); - struct nv50_head_atom *harm = NULL, *asyh = NULL; + struct nv50_head_atom *armh = NULL, *asyh = NULL; bool modeset = false; int ret; @@ -429,9 +429,9 @@ nv50_wndw_atomic_check(struct drm_plane *plane, struct drm_plane_state *state) /* Fetch assembly state for the head the window used to belong to. */ if (armw->state.crtc) { - harm = nv50_head_atom_get(asyw->state.state, armw->state.crtc); - if (IS_ERR(harm)) - return PTR_ERR(harm); + armh = nv50_head_atom_get(asyw->state.state, armw->state.crtc); + if (IS_ERR(armh)) + return PTR_ERR(armh); } /* LUT configuration can potentially cause the window to be disabled. */ @@ -455,8 +455,8 @@ nv50_wndw_atomic_check(struct drm_plane *plane, struct drm_plane_state *state) asyh->wndw.mask |= BIT(wndw->id); } else if (armw->visible) { - nv50_wndw_atomic_check_release(wndw, asyw, harm); - harm->wndw.mask &= ~BIT(wndw->id); + nv50_wndw_atomic_check_release(wndw, asyw, armh); + armh->wndw.mask &= ~BIT(wndw->id); } else { return 0; } -- 2.26.2
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 08/11] drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom
While we're not quite ready yet to add support for flexible wndw mappings, we are going to need to at least keep track of the static wndw mappings we're currently using in each head's atomic state. We'll likely use this in the future to implement real flexible window mapping, but the primary reason we'll need this is for CRC support. See: on nvidia hardware, each CRC entry in the CRC notifier dma context has a "tag". This tag corresponds to the nth update on a specific EVO/NvDisplay channel, which itself is referred to as the "controlling channel". For gf119+ this can be the core channel, ovly channel, or base channel. Since we don't expose CRC entry tags to userspace, we simply ignore this feature and always use the core channel as the controlling channel. Simple. Things get a little bit more complicated on gv100+ though. GV100+ only lets us set the controlling channel to a specific wndw channel, and that wndw must be owned by the head that we're grabbing CRCs from when we enable CRC generation. Thus, we always need to make sure that each atomic head state has at least one wndw that is mapped to the head, which will be used as the controlling channel. Note that since we don't have flexible wndw mappings yet, we don't expect to run into any scenarios yet where we'd have a head with no mapped wndws. When we do add support for flexible wndw mappings however, we'll need to make sure that we handle reprogramming CRC capture if our controlling wndw is moved to another head (and potentially reject the new head state entirely if we can't find another available wndw to replace it). With that being said, nouveau currently tracks wndw visibility on heads. It does not keep track of the actual ownership mappings, which are (currently) statically programmed. To fix this, we introduce another bitmask into nv50_head_atom.wndw to keep track of ownership separately from visibility. 
We then introduce a nv50_head callback to handle populating the wndw ownership map, and call it during the atomic check phase when core->assign_windows is set to true. Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv50/atom.h | 1 + drivers/gpu/drm/nouveau/dispnv50/disp.c | 16 ++++++++++++++++ drivers/gpu/drm/nouveau/dispnv50/head.h | 2 ++ drivers/gpu/drm/nouveau/dispnv50/headc37d.c | 10 ++++++++++ drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 2 ++ 5 files changed, 31 insertions(+) diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h index 24f7700768dab..62faaf60f47a5 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/atom.h +++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h @@ -18,6 +18,7 @@ struct nv50_head_atom { struct { u32 mask; + u32 owned; u32 olut; } wndw; diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c index d472942102f50..368069a5b181a 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -2287,12 +2287,28 @@ static int nv50_disp_atomic_check(struct drm_device *dev, struct drm_atomic_state *state) { struct nv50_atom *atom = nv50_atom(state); + struct nv50_core *core = nv50_disp(dev)->core; struct drm_connector_state *old_connector_state, *new_connector_state; struct drm_connector *connector; struct drm_crtc_state *new_crtc_state; struct drm_crtc *crtc; + struct nv50_head *head; + struct nv50_head_atom *asyh; int ret, i; + if (core->assign_windows && core->func->head->static_wndw_map) { + drm_for_each_crtc(crtc, dev) { + new_crtc_state = drm_atomic_get_crtc_state(state, + crtc); + if (IS_ERR(new_crtc_state)) + return PTR_ERR(new_crtc_state); + + head = nv50_head(crtc); + asyh = nv50_head_atom(new_crtc_state); + core->func->head->static_wndw_map(head, asyh); + } + } + /* We need to handle colour management on a per-plane basis. 
*/ for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) { if (new_crtc_state->color_mgmt_changed) { diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.h b/drivers/gpu/drm/nouveau/dispnv50/head.h index c32b27cdaefc9..c05bbba9e247c 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.h +++ b/drivers/gpu/drm/nouveau/dispnv50/head.h @@ -40,6 +40,7 @@ struct nv50_head_func { void (*dither)(struct nv50_head *, struct nv50_head_atom *); void (*procamp)(struct nv50_head *, struct nv50_head_atom *); void (*or)(struct nv50_head *, struct nv50_head_atom *); + void (*static_wndw_map)(struct nv50_head *, struct nv50_head_atom *); }; extern const struct nv50_head_func head507d; @@ -86,6 +87,7 @@ int headc37d_curs_format(struct nv50_head *, struct nv50_wndw_atom *, void headc37d_curs_set(struct nv50_head *, struct nv50_head_atom *); void headc37d_curs_clr(struct nv50_head *); void headc37d_dither(struct nv50_head *, struct nv50_head_atom *); +void headc37d_static_wndw_map(struct nv50_head *, struct nv50_head_atom *); extern const struct nv50_head_func headc57d; #endif diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c index 9ef3c603fc43e..c2619652ff2ee 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c @@ -204,6 +204,15 @@ headc37d_view(struct nv50_head *head, struct nv50_head_atom *asyh) } } +void +headc37d_static_wndw_map(struct nv50_head *head, struct nv50_head_atom *asyh) +{ + int i, end; + + for (i = head->base.index * 2, end = i + 2; i < end; i++) + asyh->wndw.owned |= BIT(i); +} + const struct nv50_head_func headc37d = { .view = headc37d_view, @@ -219,4 +228,5 @@ headc37d = { .dither = headc37d_dither, .procamp = headc37d_procamp, .or = headc37d_or, + .static_wndw_map = headc37d_static_wndw_map, }; diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c index 97141eb8e75ab..1c1887749f4c5 100644 --- 
a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c @@ -211,4 +211,6 @@ headc57d = { .dither = headc37d_dither, .procamp = headc57d_procamp, .or = headc57d_or, + /* TODO: flexible window mappings */ + .static_wndw_map = headc37d_static_wndw_map, }; -- 2.26.2
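As a quick illustration of what the static map produces: each head owns windows 2n and 2n+1, so the ownership mask is two adjacent bits. The sketch below reproduces the loop from headc37d_static_wndw_map() as a standalone function. The function names are hypothetical, and first_owned_wndw() is only an assumption about how a controlling wndw might later be picked from the mask; it is not taken from this patch.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/*
 * Ownership mask for a head under the static mapping. head_index must
 * be below 16 so both window bits fit in the 32-bit mask.
 */
static uint32_t static_wndw_owned(unsigned int head_index)
{
	uint32_t owned = 0;
	unsigned int i, end;

	for (i = head_index * 2, end = i + 2; i < end; i++)
		owned |= BIT(i);

	return owned;
}

/* Lowest-numbered owned window, or -1 if the head owns none. */
static int first_owned_wndw(uint32_t owned)
{
	int i;

	for (i = 0; i < 32; i++) {
		if (owned & BIT(i))
			return i;
	}
	return -1;
}
```

Head 0 gets mask 0x3, head 1 gets 0xc, and so on. An empty mask is the case the commit message warns about for future flexible mappings: there would be no wndw available to act as the controlling channel.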
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 09/11] drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h
In order to make sure that we flush disable updates at the right time when disabling CRCs, we'll need to be able to look at the outp state to see if we're changing it at the same time that we're disabling CRCs. So, expose the struct in disp.h. Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv50/disp.c | 18 ------------------ drivers/gpu/drm/nouveau/dispnv50/disp.h | 14 ++++++++++++++ 2 files changed, 14 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c index 368069a5b181a..090882794f7d6 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -57,24 +57,6 @@ #include <subdev/bios/dp.h> -/****************************************************************************** - * Atomic state - *****************************************************************************/ - -struct nv50_outp_atom { - struct list_head head; - - struct drm_encoder *encoder; - bool flush_disable; - - union nv50_outp_atom_mask { - struct { - bool ctrl:1; - }; - u8 mask; - } set, clr; -}; - /****************************************************************************** * EVO channel *****************************************************************************/ diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.h b/drivers/gpu/drm/nouveau/dispnv50/disp.h index 696e70a6b98b6..c7b72fa850995 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.h +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.h @@ -71,6 +71,20 @@ struct nv50_dmac { struct mutex lock; }; +struct nv50_outp_atom { + struct list_head head; + + struct drm_encoder *encoder; + bool flush_disable; + + union nv50_outp_atom_mask { + struct { + bool ctrl:1; + }; + u8 mask; + } set, clr; +}; + int nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp, const s32 *oclass, u8 head, void *data, u32 size, u64 syncbuf, struct nv50_dmac *dmac); -- 2.26.2
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 10/11] drm/nouveau/kms/nv50-: Move hard-coded object handles into header
While most of the functionality on Nvidia GPUs doesn't require using an explicit handle instead of the main VRAM handle + offset, there are a couple of places that do require explicit handles, such as CRC functionality. Since this means we're about to add another nouveau-chosen handle, let's just go ahead and move any hard-coded handles into a single header. This is just to keep things slightly organized, and to make it a little bit easier if we need to add more handles in the future. This patch should contain no functional changes. Changes since v3: * Correct SPDX license identifier (checkpatch) Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv50/disp.c | 7 +++++-- drivers/gpu/drm/nouveau/dispnv50/handles.h | 15 +++++++++++++++ drivers/gpu/drm/nouveau/dispnv50/wndw.c | 3 ++- 3 files changed, 22 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/dispnv50/handles.h diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c index 090882794f7d6..bf7ba1e1c0f74 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -26,6 +26,7 @@ #include "core.h" #include "head.h" #include "wndw.h" +#include "handles.h" #include <linux/dma-mapping.h> #include <linux/hdmi.h> @@ -154,7 +155,8 @@ nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp, if (!syncbuf) return 0; - ret = nvif_object_init(&dmac->base.user, 0xf0000000, NV_DMA_IN_MEMORY, + ret = nvif_object_init(&dmac->base.user, NV50_DISP_HANDLE_SYNCBUF, + NV_DMA_IN_MEMORY, &(struct nv_dma_v0) { .target = NV_DMA_V0_TARGET_VRAM, .access = NV_DMA_V0_ACCESS_RDWR, @@ -165,7 +167,8 @@ nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp, if (ret) return ret; - ret = nvif_object_init(&dmac->base.user, 0xf0000001, NV_DMA_IN_MEMORY, + ret = nvif_object_init(&dmac->base.user, NV50_DISP_HANDLE_VRAM, + NV_DMA_IN_MEMORY, &(struct nv_dma_v0) { .target = 
NV_DMA_V0_TARGET_VRAM, .access = NV_DMA_V0_ACCESS_RDWR, diff --git a/drivers/gpu/drm/nouveau/dispnv50/handles.h b/drivers/gpu/drm/nouveau/dispnv50/handles.h new file mode 100644 index 0000000000000..d1beeb9a444db --- /dev/null +++ b/drivers/gpu/drm/nouveau/dispnv50/handles.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: MIT +#ifndef __NV50_KMS_HANDLES_H__ +#define __NV50_KMS_HANDLES_H__ + +/* + * Various hard-coded object handles that nouveau uses. These are made-up by + * nouveau developers, not Nvidia. The only significance of the handles chosen + * is that they must all be unique. + */ +#define NV50_DISP_HANDLE_SYNCBUF 0xf0000000 +#define NV50_DISP_HANDLE_VRAM 0xf0000001 + +#define NV50_DISP_HANDLE_WNDW_CTX(kind) (0xfb000000 | kind) + +#endif /* !__NV50_KMS_HANDLES_H__ */ diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c index cfee61f14aa49..9d963ecdd34e8 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c @@ -21,6 +21,7 @@ */ #include "wndw.h" #include "wimm.h" +#include "handles.h" #include <nvif/class.h> #include <nvif/cl0002.h> @@ -59,7 +60,7 @@ nv50_wndw_ctxdma_new(struct nv50_wndw *wndw, struct drm_framebuffer *fb) int ret; nouveau_framebuffer_get_layout(fb, &unused, &kind); - handle = 0xfb000000 | kind; + handle = NV50_DISP_HANDLE_WNDW_CTX(kind); list_for_each_entry(ctxdma, &wndw->ctxdma.list, head) { if (ctxdma->object.handle == handle) -- 2.26.2
Lyude Paul
2020-Jun-24 23:03 UTC
[Nouveau] [RFC v7 11/11] drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the documentation generously provided to us by Nvidia: https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed through a single set of "outp" sources: outp-active/auto for a CRC of the scanout region, outp-complete for a CRC of both the scanout and blanking/sync region combined, and outp-inactive for a CRC of only the blanking/sync region. For each source, nouveau selects the appropriate tap point based on the output path in use. We also expose an "rg" source, which allows for capturing CRCs of the scanout raster before it's encoded into a video signal in the output path. This tap point is referred to as the raster generator. Note that while there are some other neat features that can be used with CRC capture on nvidia hardware, like capturing from two CRC sources simultaneously, I couldn't see any use case for them and did not implement them. Nvidia only allows for accessing CRCs through a shared DMA region that we program through the core EVO/NvDisplay channel which is referred to as the notifier context. The notifier context is limited to either 255 (for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and unfortunately the hardware simply drops CRCs and reports an overflow once all available entries in the notifier context are filled. Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit on how many CRCs can be captured, we work around this in nouveau by allocating two separate notifier contexts for each head instead of one. We schedule a vblank worker ahead of time so that once we start getting close to filling up all of the available entries in the notifier context, we can swap the currently used notifier context out with another pre-prepared notifier context in a manner similar to page flipping. 
Unfortunately, the hardware only allows us to do this by flushing two separate updates on the core channel: one to release the current notifier context handle, and one to program the next notifier context's handle. When the hardware processes the first update, the CRC for the current frame is lost. However, the second update can be flushed immediately without waiting for the first to complete so that CRC generation resumes on the next frame. According to Nvidia's hardware engineers, there isn't any cleaner way of flipping notifier contexts that would avoid this. Since using vblank workers to swap out the notifier context will ensure we can usually flush both updates to hardware within the timespan of a single frame, we can also ensure that there will only be exactly one frame lost between the first and second update being executed by the hardware. This gives us the guarantee that we're always correctly matching each CRC entry with its respective frame even after a context flip. And since IGT will retrieve the CRC entry for a frame by waiting until it receives a CRC for any subsequent frames, this doesn't cause an issue with any tests and is much simpler than trying to change the current DRM API to accommodate. In order to facilitate testing of correct handling of this limitation, we also expose a debugfs interface to manually control the threshold for when we start trying to flip the notifier context. We will use this in igt to trigger a context flip for testing purposes without needing to wait for the notifier to completely fill up. This threshold is reset to the default value set by nouveau after each capture, and is exposed in a separate folder within each CRTC's debugfs directory labelled "nv_crc". Changes since v1: * Forgot to finish saving crc.h before saving, whoops. This just adds some corrections to the empty function declarations that we use if CONFIG_DEBUG_FS isn't enabled. 
Changes since v2: * Don't check return code from debugfs_create_dir() or debugfs_create_file() - Greg K-H Changes since v3: (no functional changes) * Fix SPDX license identifiers (checkpatch) * s/uint32_t/u32/ (checkpatch) * Fix indenting in switch cases (checkpatch) Changes since v4: * Remove unneeded param changes with nv50_head_flush_clr/set * Rebase Changes since v5: * Remove set but unused variable (outp) in nv50_crc_atomic_check() - Kbuild bot Signed-off-by: Lyude Paul <lyude at redhat.com> --- drivers/gpu/drm/nouveau/dispnv04/crtc.c | 25 +- drivers/gpu/drm/nouveau/dispnv50/Kbuild | 4 + drivers/gpu/drm/nouveau/dispnv50/atom.h | 20 + drivers/gpu/drm/nouveau/dispnv50/core.h | 4 + drivers/gpu/drm/nouveau/dispnv50/core907d.c | 3 + drivers/gpu/drm/nouveau/dispnv50/core917d.c | 3 + drivers/gpu/drm/nouveau/dispnv50/corec37d.c | 3 + drivers/gpu/drm/nouveau/dispnv50/corec57d.c | 3 + drivers/gpu/drm/nouveau/dispnv50/crc.c | 714 ++++++++++++++++++++ drivers/gpu/drm/nouveau/dispnv50/crc.h | 125 ++++ drivers/gpu/drm/nouveau/dispnv50/crc907d.c | 139 ++++ drivers/gpu/drm/nouveau/dispnv50/crcc37d.c | 153 +++++ drivers/gpu/drm/nouveau/dispnv50/disp.c | 17 + drivers/gpu/drm/nouveau/dispnv50/disp.h | 10 + drivers/gpu/drm/nouveau/dispnv50/handles.h | 1 + drivers/gpu/drm/nouveau/dispnv50/head.c | 77 ++- drivers/gpu/drm/nouveau/dispnv50/head.h | 10 +- drivers/gpu/drm/nouveau/dispnv50/head907d.c | 14 +- drivers/gpu/drm/nouveau/dispnv50/headc37d.c | 6 +- drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 7 +- drivers/gpu/drm/nouveau/nouveau_display.c | 60 +- 21 files changed, 1328 insertions(+), 70 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.c create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.h create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc907d.c create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c index 640738f3196ce..cec65a12e8ec5 100644 --- 
a/drivers/gpu/drm/nouveau/dispnv04/crtc.c +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c @@ -44,6 +44,9 @@ #include <subdev/bios/pll.h> #include <subdev/clk.h> +#include <nvif/event.h> +#include <nvif/cl0046.h> + static int nv04_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y, struct drm_framebuffer *old_fb); @@ -756,6 +759,7 @@ static void nv_crtc_destroy(struct drm_crtc *crtc) nouveau_bo_unmap(nv_crtc->cursor.nvbo); nouveau_bo_unpin(nv_crtc->cursor.nvbo); nouveau_bo_ref(NULL, &nv_crtc->cursor.nvbo); + nvif_notify_fini(&nv_crtc->vblank); kfree(nv_crtc); } @@ -1297,9 +1301,19 @@ create_primary_plane(struct drm_device *dev) return primary; } +static int nv04_crtc_vblank_handler(struct nvif_notify *notify) +{ + struct nouveau_crtc *nv_crtc = + container_of(notify, struct nouveau_crtc, vblank); + + drm_crtc_handle_vblank(&nv_crtc->base); + return NVIF_NOTIFY_KEEP; +} + int nv04_crtc_create(struct drm_device *dev, int crtc_num) { + struct nouveau_display *disp = nouveau_display(dev); struct nouveau_crtc *nv_crtc; int ret; @@ -1337,5 +1351,14 @@ nv04_crtc_create(struct drm_device *dev, int crtc_num) nv04_cursor_init(nv_crtc); - return 0; + ret = nvif_notify_init(&disp->disp.object, nv04_crtc_vblank_handler, + false, NV04_DISP_NTFY_VBLANK, + &(struct nvif_notify_head_req_v0) { + .head = nv_crtc->index, + }, + sizeof(struct nvif_notify_head_req_v0), + sizeof(struct nvif_notify_head_rep_v0), + &nv_crtc->vblank); + + return ret; } diff --git a/drivers/gpu/drm/nouveau/dispnv50/Kbuild b/drivers/gpu/drm/nouveau/dispnv50/Kbuild index e0c435eae6646..6fdddb266fb1b 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/Kbuild +++ b/drivers/gpu/drm/nouveau/dispnv50/Kbuild @@ -10,6 +10,10 @@ nouveau-y += dispnv50/core917d.o nouveau-y += dispnv50/corec37d.o nouveau-y += dispnv50/corec57d.o +nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crc.o +nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crc907d.o +nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crcc37d.o + nouveau-y += dispnv50/dac507d.o nouveau-y += 
dispnv50/dac907d.o diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h index 62faaf60f47a5..3d82b3c67decc 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/atom.h +++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h @@ -2,6 +2,9 @@ #define __NV50_KMS_ATOM_H__ #define nv50_atom(p) container_of((p), struct nv50_atom, state) #include <drm/drm_atomic.h> +#include "crc.h" + +struct nouveau_encoder; struct nv50_atom { struct drm_atomic_state state; @@ -115,9 +118,12 @@ struct nv50_head_atom { u8 nhsync:1; u8 nvsync:1; u8 depth:4; + u8 crc_raster:2; u8 bpc; } or; + struct nv50_crc_atom crc; + /* Currently only used for MST */ struct { int pbn; @@ -135,6 +141,7 @@ struct nv50_head_atom { bool ovly:1; bool dither:1; bool procamp:1; + bool crc:1; bool or:1; }; u16 mask; @@ -150,6 +157,19 @@ nv50_head_atom_get(struct drm_atomic_state *state, struct drm_crtc *crtc) return nv50_head_atom(statec); } +static inline struct drm_encoder * +nv50_head_atom_get_encoder(struct nv50_head_atom *atom) +{ + struct drm_encoder *encoder = NULL; + + /* We only ever have a single encoder */ + drm_for_each_encoder_mask(encoder, atom->state.crtc->dev, + atom->state.encoder_mask) + break; + + return encoder; +} + #define nv50_wndw_atom(p) container_of((p), struct nv50_wndw_atom, state) struct nv50_wndw_atom { diff --git a/drivers/gpu/drm/nouveau/dispnv50/core.h b/drivers/gpu/drm/nouveau/dispnv50/core.h index 99157dc94d235..e021cb340569b 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/core.h +++ b/drivers/gpu/drm/nouveau/dispnv50/core.h @@ -2,6 +2,7 @@ #define __NV50_KMS_CORE_H__ #include "disp.h" #include "atom.h" +#include "crc.h" #include <nouveau_encoder.h> struct nv50_core { @@ -26,6 +27,9 @@ struct nv50_core_func { } wndw; const struct nv50_head_func *head; +#if IS_ENABLED(CONFIG_DEBUG_FS) + const struct nv50_crc_func *crc; +#endif const struct nv50_outp_func { void (*ctrl)(struct nv50_core *, int or, u32 ctrl, struct nv50_head_atom *); diff --git 
a/drivers/gpu/drm/nouveau/dispnv50/core907d.c b/drivers/gpu/drm/nouveau/dispnv50/core907d.c index 2716298326299..b17c03529c784 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/core907d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/core907d.c @@ -30,6 +30,9 @@ core907d = { .ntfy_wait_done = core507d_ntfy_wait_done, .update = core507d_update, .head = &head907d, +#if IS_ENABLED(CONFIG_DEBUG_FS) + .crc = &crc907d, +#endif .dac = &dac907d, .sor = &sor907d, }; diff --git a/drivers/gpu/drm/nouveau/dispnv50/core917d.c b/drivers/gpu/drm/nouveau/dispnv50/core917d.c index 5cc072d4c30fe..66846f3720805 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/core917d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/core917d.c @@ -30,6 +30,9 @@ core917d = { .ntfy_wait_done = core507d_ntfy_wait_done, .update = core507d_update, .head = &head917d, +#if IS_ENABLED(CONFIG_DEBUG_FS) + .crc = &crc907d, +#endif .dac = &dac907d, .sor = &sor907d, }; diff --git a/drivers/gpu/drm/nouveau/dispnv50/corec37d.c b/drivers/gpu/drm/nouveau/dispnv50/corec37d.c index e0c8811fb8e45..ec83189a1d481 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/corec37d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/corec37d.c @@ -142,6 +142,9 @@ corec37d = { .wndw.owner = corec37d_wndw_owner, .head = &headc37d, .sor = &sorc37d, +#if IS_ENABLED(CONFIG_DEBUG_FS) + .crc = &crcc37d, +#endif }; int diff --git a/drivers/gpu/drm/nouveau/dispnv50/corec57d.c b/drivers/gpu/drm/nouveau/dispnv50/corec57d.c index 10ba9e9e4ae6b..e1c11eba0ce17 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/corec57d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/corec57d.c @@ -52,6 +52,9 @@ corec57d = { .wndw.owner = corec37d_wndw_owner, .head = &headc57d, .sor = &sorc37d, +#if IS_ENABLED(CONFIG_DEBUG_FS) + .crc = &crcc37d, +#endif }; int diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.c b/drivers/gpu/drm/nouveau/dispnv50/crc.c new file mode 100644 index 0000000000000..0b18d9e3a2b96 --- /dev/null +++ b/drivers/gpu/drm/nouveau/dispnv50/crc.c @@ -0,0 +1,714 @@ +// SPDX-License-Identifier: 
MIT +#include <linux/string.h> +#include <drm/drm_crtc.h> +#include <drm/drm_atomic_helper.h> +#include <drm/drm_vblank.h> +#include <drm/drm_vblank_work.h> + +#include <nvif/class.h> +#include <nvif/cl0002.h> +#include <nvif/timer.h> + +#include "nouveau_drv.h" +#include "core.h" +#include "head.h" +#include "wndw.h" +#include "handles.h" +#include "crc.h" + +static const char * const nv50_crc_sources[] = { + [NV50_CRC_SOURCE_NONE] = "none", + [NV50_CRC_SOURCE_AUTO] = "auto", + [NV50_CRC_SOURCE_RG] = "rg", + [NV50_CRC_SOURCE_OUTP_ACTIVE] = "outp-active", + [NV50_CRC_SOURCE_OUTP_COMPLETE] = "outp-complete", + [NV50_CRC_SOURCE_OUTP_INACTIVE] = "outp-inactive", +}; + +static int nv50_crc_parse_source(const char *buf, enum nv50_crc_source *s) +{ + int i; + + if (!buf) { + *s = NV50_CRC_SOURCE_NONE; + return 0; + } + + i = match_string(nv50_crc_sources, ARRAY_SIZE(nv50_crc_sources), buf); + if (i < 0) + return i; + + *s = i; + return 0; +} + +int +nv50_crc_verify_source(struct drm_crtc *crtc, const char *source_name, + size_t *values_cnt) +{ + struct nouveau_drm *drm = nouveau_drm(crtc->dev); + enum nv50_crc_source source; + + if (nv50_crc_parse_source(source_name, &source) < 0) { + NV_DEBUG(drm, "unknown source %s\n", source_name); + return -EINVAL; + } + + *values_cnt = 1; + return 0; +} + +const char *const *nv50_crc_get_sources(struct drm_crtc *crtc, size_t *count) +{ + *count = ARRAY_SIZE(nv50_crc_sources); + return nv50_crc_sources; +} + +static void +nv50_crc_program_ctx(struct nv50_head *head, + struct nv50_crc_notifier_ctx *ctx) +{ + struct nv50_disp *disp = nv50_disp(head->base.base.dev); + struct nv50_core *core = disp->core; + u32 interlock[NV50_DISP_INTERLOCK__SIZE] = { 0 }; + + core->func->crc->set_ctx(head, ctx); + core->func->update(core, interlock, false); +} + +static void nv50_crc_ctx_flip_work(struct kthread_work *base) +{ + struct drm_vblank_work *work = to_drm_vblank_work(base); + struct nv50_crc *crc = container_of(work, struct nv50_crc, 
flip_work); + struct nv50_head *head = container_of(crc, struct nv50_head, crc); + struct drm_crtc *crtc = &head->base.base; + struct nv50_disp *disp = nv50_disp(crtc->dev); + u8 new_idx = crc->ctx_idx ^ 1; + + /* + * We don't want to accidentally wait for longer than the vblank, so + * try again for the next vblank if we don't grab the lock + */ + if (!mutex_trylock(&disp->mutex)) { + DRM_DEV_DEBUG_KMS(crtc->dev->dev, + "Lock contended, delaying CRC ctx flip for head-%d\n", + head->base.index); + drm_vblank_work_schedule(work, + drm_crtc_vblank_count(crtc) + 1, + true); + return; + } + + DRM_DEV_DEBUG_KMS(crtc->dev->dev, + "Flipping notifier ctx for head %d (%d -> %d)\n", + drm_crtc_index(crtc), crc->ctx_idx, new_idx); + + nv50_crc_program_ctx(head, NULL); + nv50_crc_program_ctx(head, &crc->ctx[new_idx]); + mutex_unlock(&disp->mutex); + + spin_lock_irq(&crc->lock); + crc->ctx_changed = true; + spin_unlock_irq(&crc->lock); +} + +static inline void nv50_crc_reset_ctx(struct nv50_crc_notifier_ctx *ctx) +{ + memset_io(ctx->mem.object.map.ptr, 0, ctx->mem.object.map.size); +} + +static void +nv50_crc_get_entries(struct nv50_head *head, + const struct nv50_crc_func *func, + enum nv50_crc_source source) +{ + struct drm_crtc *crtc = &head->base.base; + struct nv50_crc *crc = &head->crc; + u32 output_crc; + + while (crc->entry_idx < func->num_entries) { + /* + * While Nvidia's documentation says CRCs are written on each + * subsequent vblank after being enabled, in practice they + * aren't written immediately.
+ */ + output_crc = func->get_entry(head, &crc->ctx[crc->ctx_idx], + source, crc->entry_idx); + if (!output_crc) + return; + + drm_crtc_add_crc_entry(crtc, true, crc->frame, &output_crc); + crc->frame++; + crc->entry_idx++; + } +} + +void nv50_crc_handle_vblank(struct nv50_head *head) +{ + struct drm_crtc *crtc = &head->base.base; + struct nv50_crc *crc = &head->crc; + const struct nv50_crc_func *func = + nv50_disp(head->base.base.dev)->core->func->crc; + struct nv50_crc_notifier_ctx *ctx; + bool need_reschedule = false; + + if (!func) + return; + + /* + * We don't lose events if we aren't able to report CRCs until the + * next vblank, so only report CRCs if the locks we need aren't + * contended to prevent missing an actual vblank event + */ + if (!spin_trylock(&crc->lock)) + return; + + if (!crc->src) + goto out; + + ctx = &crc->ctx[crc->ctx_idx]; + if (crc->ctx_changed && func->ctx_finished(head, ctx)) { + nv50_crc_get_entries(head, func, crc->src); + + crc->ctx_idx ^= 1; + crc->entry_idx = 0; + crc->ctx_changed = false; + + /* + * Unfortunately when notifier contexts are changed during CRC + * capture, we will inevitably lose the CRC entry for the + * frame where the hardware actually latched onto the first + * UPDATE. According to Nvidia's hardware engineers, there's + * no workaround for this. + * + * Now, we could try to be smart here and calculate the number + * of missed CRCs based on audit timestamps, but those were + * removed starting with volta. Since we always flush our + * updates back-to-back without waiting, we'll just be + * optimistic and assume we always miss exactly one frame.
+ */ + DRM_DEV_DEBUG_KMS(head->base.base.dev->dev, + "Notifier ctx flip for head-%d finished, lost CRC for frame %llu\n", + head->base.index, crc->frame); + crc->frame++; + + nv50_crc_reset_ctx(ctx); + need_reschedule = true; + } + + nv50_crc_get_entries(head, func, crc->src); + + if (need_reschedule) + drm_vblank_work_schedule(&crc->flip_work, + drm_crtc_vblank_count(crtc) + + crc->flip_threshold + - crc->entry_idx, + true); + +out: + spin_unlock(&crc->lock); +} + +static void nv50_crc_wait_ctx_finished(struct nv50_head *head, + const struct nv50_crc_func *func, + struct nv50_crc_notifier_ctx *ctx) +{ + struct drm_device *dev = head->base.base.dev; + struct nouveau_drm *drm = nouveau_drm(dev); + s64 ret; + + ret = nvif_msec(&drm->client.device, 50, + if (func->ctx_finished(head, ctx)) break;); + if (ret == -ETIMEDOUT) + NV_ERROR(drm, + "CRC notifier ctx for head %d not finished after 50ms\n", + head->base.index); + else if (ret) + NV_ATOMIC(drm, + "CRC notifier ctx for head-%d finished after %lldns\n", + head->base.index, ret); +} + +void nv50_crc_atomic_stop_reporting(struct drm_atomic_state *state) +{ + struct drm_crtc_state *crtc_state; + struct drm_crtc *crtc; + int i; + + for_each_new_crtc_in_state(state, crtc, crtc_state, i) { + struct nv50_head *head = nv50_head(crtc); + struct nv50_head_atom *asyh = nv50_head_atom(crtc_state); + struct nv50_crc *crc = &head->crc; + + if (!asyh->clr.crc) + continue; + + spin_lock_irq(&crc->lock); + crc->src = NV50_CRC_SOURCE_NONE; + spin_unlock_irq(&crc->lock); + + drm_crtc_vblank_put(crtc); + drm_vblank_work_cancel_sync(&crc->flip_work); + + NV_ATOMIC(nouveau_drm(crtc->dev), + "CRC reporting on vblank for head-%d disabled\n", + head->base.index); + + /* CRC generation is still enabled in hw, we'll just report + * any remaining CRC entries ourselves after it gets disabled + * in hardware + */ + } +} + +void nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *state) +{ + const struct nv50_crc_func *func = +
nv50_disp(state->dev)->core->func->crc; + struct drm_crtc_state *new_crtc_state; + struct drm_crtc *crtc; + int i; + + for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) { + struct nv50_head *head = nv50_head(crtc); + struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state); + struct nv50_crc *crc = &head->crc; + struct nv50_crc_notifier_ctx *ctx = &crc->ctx[crc->ctx_idx]; + int i; + + if (asyh->clr.crc && asyh->crc.src) { + if (crc->ctx_changed) { + nv50_crc_wait_ctx_finished(head, func, ctx); + ctx = &crc->ctx[crc->ctx_idx ^ 1]; + } + nv50_crc_wait_ctx_finished(head, func, ctx); + } + + if (asyh->set.crc) { + crc->entry_idx = 0; + crc->ctx_changed = false; + for (i = 0; i < ARRAY_SIZE(crc->ctx); i++) + nv50_crc_reset_ctx(&crc->ctx[i]); + } + } +} + +void nv50_crc_atomic_start_reporting(struct drm_atomic_state *state) +{ + struct drm_crtc_state *crtc_state; + struct drm_crtc *crtc; + int i; + + for_each_new_crtc_in_state(state, crtc, crtc_state, i) { + struct nv50_head *head = nv50_head(crtc); + struct nv50_head_atom *asyh = nv50_head_atom(crtc_state); + struct nv50_crc *crc = &head->crc; + u64 vbl_count; + + if (!asyh->set.crc) + continue; + + drm_crtc_vblank_get(crtc); + + spin_lock_irq(&crc->lock); + vbl_count = drm_crtc_vblank_count(crtc); + crc->frame = vbl_count; + crc->src = asyh->crc.src; + drm_vblank_work_schedule(&crc->flip_work, + vbl_count + crc->flip_threshold, + true); + spin_unlock_irq(&crc->lock); + + NV_ATOMIC(nouveau_drm(crtc->dev), + "CRC reporting on vblank for head-%d enabled\n", + head->base.index); + } +} + +int nv50_crc_atomic_check(struct nv50_head *head, + struct nv50_head_atom *asyh, + struct nv50_head_atom *armh) +{ + struct drm_atomic_state *state = asyh->state.state; + struct drm_device *dev = head->base.base.dev; + struct nv50_atom *atom = nv50_atom(state); + struct nv50_disp *disp = nv50_disp(dev); + struct drm_encoder *encoder; + struct nv50_outp_atom *outp_atom; + bool changed = armh->crc.src != asyh->crc.src; + + if 
(!armh->crc.src && !asyh->crc.src) { + asyh->set.crc = false; + asyh->clr.crc = false; + return 0; + } + + /* While we don't care about entry tags, Volta+ hw always needs the + * controlling wndw channel programmed to a wndw that's owned by our + * head + */ + if (asyh->crc.src && disp->disp->object.oclass >= GV100_DISP && + !(BIT(asyh->crc.wndw) & asyh->wndw.owned)) { + if (!asyh->wndw.owned) { + /* TODO: once we support flexible channel ownership, + * we should write some code here to handle attempting + * to "steal" a plane: e.g. take a plane that is + * currently not-visible and owned by another head, + * and reassign it to this head. If we fail to do so, + * we should reject the mode outright as CRC capture + * then becomes impossible. + */ + NV_ATOMIC(nouveau_drm(dev), + "No available wndws for CRC readback\n"); + return -EINVAL; + } + asyh->crc.wndw = ffs(asyh->wndw.owned) - 1; + } + + if (drm_atomic_crtc_needs_modeset(&asyh->state) || changed || + armh->crc.wndw != asyh->crc.wndw) { + asyh->clr.crc = armh->crc.src && armh->state.active; + asyh->set.crc = asyh->crc.src && asyh->state.active; + if (changed) + asyh->set.or |= armh->or.crc_raster != + asyh->or.crc_raster; + + /* + * If we're reprogramming our OR, we need to flush the CRC + * disable first + */ + if (asyh->clr.crc) { + encoder = nv50_head_atom_get_encoder(armh); + + list_for_each_entry(outp_atom, &atom->outp, head) { + if (outp_atom->encoder == encoder) { + if (outp_atom->set.mask) + atom->flush_disable = true; + break; + } + } + } + } else { + asyh->set.crc = false; + asyh->clr.crc = false; + } + + return 0; +} + +static enum nv50_crc_source_type +nv50_crc_source_type(struct nouveau_encoder *outp, + enum nv50_crc_source source) +{ + struct dcb_output *dcbe = outp->dcb; + + switch (source) { + case NV50_CRC_SOURCE_NONE: return NV50_CRC_SOURCE_TYPE_NONE; + case NV50_CRC_SOURCE_RG: return NV50_CRC_SOURCE_TYPE_RG; + default: break; + } + + if (dcbe->location != DCB_LOC_ON_CHIP) + return
NV50_CRC_SOURCE_TYPE_PIOR; + + switch (dcbe->type) { + case DCB_OUTPUT_DP: return NV50_CRC_SOURCE_TYPE_SF; + case DCB_OUTPUT_ANALOG: return NV50_CRC_SOURCE_TYPE_DAC; + default: return NV50_CRC_SOURCE_TYPE_SOR; + } +} + +void nv50_crc_atomic_set(struct nv50_head *head, + struct nv50_head_atom *asyh) +{ + struct drm_crtc *crtc = &head->base.base; + struct drm_device *dev = crtc->dev; + struct nv50_crc *crc = &head->crc; + const struct nv50_crc_func *func = nv50_disp(dev)->core->func->crc; + struct nouveau_encoder *outp = + nv50_real_outp(nv50_head_atom_get_encoder(asyh)); + + func->set_src(head, outp->or, + nv50_crc_source_type(outp, asyh->crc.src), + &crc->ctx[crc->ctx_idx], asyh->crc.wndw); +} + +void nv50_crc_atomic_clr(struct nv50_head *head) +{ + const struct nv50_crc_func *func = + nv50_disp(head->base.base.dev)->core->func->crc; + + func->set_src(head, 0, NV50_CRC_SOURCE_TYPE_NONE, NULL, 0); +} + +#define NV50_CRC_RASTER_ACTIVE 0 +#define NV50_CRC_RASTER_COMPLETE 1 +#define NV50_CRC_RASTER_INACTIVE 2 + +static inline int +nv50_crc_raster_type(enum nv50_crc_source source) +{ + switch (source) { + case NV50_CRC_SOURCE_NONE: + case NV50_CRC_SOURCE_AUTO: + case NV50_CRC_SOURCE_RG: + case NV50_CRC_SOURCE_OUTP_ACTIVE: + return NV50_CRC_RASTER_ACTIVE; + case NV50_CRC_SOURCE_OUTP_COMPLETE: + return NV50_CRC_RASTER_COMPLETE; + case NV50_CRC_SOURCE_OUTP_INACTIVE: + return NV50_CRC_RASTER_INACTIVE; + } + + return 0; +} + +/* We handle mapping the memory for CRC notifiers ourselves, since each + * notifier needs its own handle + */ +static inline int +nv50_crc_ctx_init(struct nv50_head *head, struct nvif_mmu *mmu, + struct nv50_crc_notifier_ctx *ctx, size_t len, int idx) +{ + struct nv50_core *core = nv50_disp(head->base.base.dev)->core; + int ret; + + ret = nvif_mem_init_map(mmu, NVIF_MEM_VRAM, len, &ctx->mem); + if (ret) + return ret; + + ret = nvif_object_init(&core->chan.base.user, + NV50_DISP_HANDLE_CRC_CTX(head, idx), + NV_DMA_IN_MEMORY, + &(struct nv_dma_v0) { +
.target = NV_DMA_V0_TARGET_VRAM, + .access = NV_DMA_V0_ACCESS_RDWR, + .start = ctx->mem.addr, + .limit = ctx->mem.addr + + ctx->mem.size - 1, + }, sizeof(struct nv_dma_v0), + &ctx->ntfy); + if (ret) + goto fail_fini; + + return 0; + +fail_fini: + nvif_mem_fini(&ctx->mem); + return ret; +} + +static inline void +nv50_crc_ctx_fini(struct nv50_crc_notifier_ctx *ctx) +{ + nvif_object_fini(&ctx->ntfy); + nvif_mem_fini(&ctx->mem); +} + +int nv50_crc_set_source(struct drm_crtc *crtc, const char *source_str) +{ + struct drm_device *dev = crtc->dev; + struct drm_atomic_state *state; + struct drm_modeset_acquire_ctx ctx; + struct nv50_head *head = nv50_head(crtc); + struct nv50_crc *crc = &head->crc; + const struct nv50_crc_func *func = nv50_disp(dev)->core->func->crc; + struct nvif_mmu *mmu = &nouveau_drm(dev)->client.mmu; + struct nv50_head_atom *asyh; + struct drm_crtc_state *crtc_state; + enum nv50_crc_source source; + int ret = 0, ctx_flags = 0, i; + + ret = nv50_crc_parse_source(source_str, &source); + if (ret) + return ret; + + /* + * Since we don't want the user to accidentally interrupt us as we're + * disabling CRCs + */ + if (source) + ctx_flags |= DRM_MODESET_ACQUIRE_INTERRUPTIBLE; + drm_modeset_acquire_init(&ctx, ctx_flags); + + state = drm_atomic_state_alloc(dev); + if (!state) { + ret = -ENOMEM; + goto out_acquire_fini; + } + state->acquire_ctx = &ctx; + + if (source) { + for (i = 0; i < ARRAY_SIZE(head->crc.ctx); i++) { + ret = nv50_crc_ctx_init(head, mmu, &crc->ctx[i], + func->notifier_len, i); + if (ret) + goto out_ctx_fini; + } + } + +retry: + crtc_state = drm_atomic_get_crtc_state(state, &head->base.base); + if (IS_ERR(crtc_state)) { + ret = PTR_ERR(crtc_state); + if (ret == -EDEADLK) + goto deadlock; + else if (ret) + goto out_drop_locks; + } + asyh = nv50_head_atom(crtc_state); + asyh->crc.src = source; + asyh->or.crc_raster = nv50_crc_raster_type(source); + + ret = drm_atomic_commit(state); + if (ret == -EDEADLK) + goto deadlock; + else if (ret) + goto 
out_drop_locks; + + if (!source) { + /* + * If the user specified a custom flip threshold through + * debugfs, reset it + */ + crc->flip_threshold = func->flip_threshold; + } + +out_drop_locks: + drm_modeset_drop_locks(&ctx); +out_ctx_fini: + if (!source || ret) { + for (i = 0; i < ARRAY_SIZE(crc->ctx); i++) + nv50_crc_ctx_fini(&crc->ctx[i]); + } + drm_atomic_state_put(state); +out_acquire_fini: + drm_modeset_acquire_fini(&ctx); + return ret; + +deadlock: + drm_atomic_state_clear(state); + drm_modeset_backoff(&ctx); + goto retry; +} + +static int +nv50_crc_debugfs_flip_threshold_get(struct seq_file *m, void *data) +{ + struct nv50_head *head = m->private; + struct drm_crtc *crtc = &head->base.base; + struct nv50_crc *crc = &head->crc; + int ret; + + ret = drm_modeset_lock_single_interruptible(&crtc->mutex); + if (ret) + return ret; + + seq_printf(m, "%d\n", crc->flip_threshold); + + drm_modeset_unlock(&crtc->mutex); + return ret; +} + +static int +nv50_crc_debugfs_flip_threshold_open(struct inode *inode, struct file *file) +{ + return single_open(file, nv50_crc_debugfs_flip_threshold_get, + inode->i_private); +} + +static ssize_t +nv50_crc_debugfs_flip_threshold_set(struct file *file, + const char __user *ubuf, size_t len, + loff_t *offp) +{ + struct seq_file *m = file->private_data; + struct nv50_head *head = m->private; + struct nv50_head_atom *armh; + struct drm_crtc *crtc = &head->base.base; + struct nouveau_drm *drm = nouveau_drm(crtc->dev); + struct nv50_crc *crc = &head->crc; + const struct nv50_crc_func *func = + nv50_disp(crtc->dev)->core->func->crc; + int value, ret; + + ret = kstrtoint_from_user(ubuf, len, 10, &value); + if (ret) + return ret; + + if (value > func->flip_threshold) + return -EINVAL; + else if (value == -1) + value = func->flip_threshold; + else if (value < -1) + return -EINVAL; + + ret = drm_modeset_lock_single_interruptible(&crtc->mutex); + if (ret) + return ret; + + armh = nv50_head_atom(crtc->state); + if (armh->crc.src) { + ret =
-EBUSY; + goto out; + } + + NV_DEBUG(drm, + "Changing CRC flip threshold for next capture on head-%d to %d\n", + head->base.index, value); + crc->flip_threshold = value; + ret = len; + +out: + drm_modeset_unlock(&crtc->mutex); + return ret; +} + +static const struct file_operations nv50_crc_flip_threshold_fops = { + .owner = THIS_MODULE, + .open = nv50_crc_debugfs_flip_threshold_open, + .read = seq_read, + .write = nv50_crc_debugfs_flip_threshold_set, +}; + +int nv50_head_crc_late_register(struct nv50_head *head) +{ + struct drm_crtc *crtc = &head->base.base; + const struct nv50_crc_func *func = + nv50_disp(crtc->dev)->core->func->crc; + struct dentry *root; + + if (!func || !crtc->debugfs_entry) + return 0; + + root = debugfs_create_dir("nv_crc", crtc->debugfs_entry); + debugfs_create_file("flip_threshold", 0644, root, head, + &nv50_crc_flip_threshold_fops); + + return 0; +} + +static inline void +nv50_crc_init_head(struct nv50_disp *disp, const struct nv50_crc_func *func, + struct nv50_head *head) +{ + struct nv50_crc *crc = &head->crc; + + crc->flip_threshold = func->flip_threshold; + spin_lock_init(&crc->lock); + drm_vblank_work_init(&crc->flip_work, &head->base.base, + nv50_crc_ctx_flip_work); +} + +void nv50_crc_init(struct drm_device *dev) +{ + struct nv50_disp *disp = nv50_disp(dev); + struct drm_crtc *crtc; + const struct nv50_crc_func *func = disp->core->func->crc; + + if (!func) + return; + + drm_for_each_crtc(crtc, dev) + nv50_crc_init_head(disp, func, nv50_head(crtc)); +} diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.h b/drivers/gpu/drm/nouveau/dispnv50/crc.h new file mode 100644 index 0000000000000..2d588bb7f65a6 --- /dev/null +++ b/drivers/gpu/drm/nouveau/dispnv50/crc.h @@ -0,0 +1,125 @@ +// SPDX-License-Identifier: MIT +#ifndef __NV50_CRC_H__ +#define __NV50_CRC_H__ + +#include <linux/mutex.h> +#include <drm/drm_crtc.h> +#include <drm/drm_vblank_work.h> + +#include <nvif/mem.h> +#include <nvkm/subdev/bios.h> +#include "nouveau_encoder.h" + +struct
nv50_disp; +struct nv50_head; + +#if IS_ENABLED(CONFIG_DEBUG_FS) +enum nv50_crc_source { + NV50_CRC_SOURCE_NONE = 0, + NV50_CRC_SOURCE_AUTO, + NV50_CRC_SOURCE_RG, + NV50_CRC_SOURCE_OUTP_ACTIVE, + NV50_CRC_SOURCE_OUTP_COMPLETE, + NV50_CRC_SOURCE_OUTP_INACTIVE, +}; + +/* RG -> SF (DP only) + * -> SOR + * -> PIOR + * -> DAC + */ +enum nv50_crc_source_type { + NV50_CRC_SOURCE_TYPE_NONE = 0, + NV50_CRC_SOURCE_TYPE_SOR, + NV50_CRC_SOURCE_TYPE_PIOR, + NV50_CRC_SOURCE_TYPE_DAC, + NV50_CRC_SOURCE_TYPE_RG, + NV50_CRC_SOURCE_TYPE_SF, +}; + +struct nv50_crc_notifier_ctx { + struct nvif_mem mem; + struct nvif_object ntfy; +}; + +struct nv50_crc_atom { + enum nv50_crc_source src; + /* Only used for gv100+ */ + u8 wndw : 4; +}; + +struct nv50_crc_func { + void (*set_src)(struct nv50_head *, int or, enum nv50_crc_source_type, + struct nv50_crc_notifier_ctx *, u32 wndw); + void (*set_ctx)(struct nv50_head *, struct nv50_crc_notifier_ctx *); + u32 (*get_entry)(struct nv50_head *, struct nv50_crc_notifier_ctx *, + enum nv50_crc_source, int idx); + bool (*ctx_finished)(struct nv50_head *, + struct nv50_crc_notifier_ctx *); + short flip_threshold; + short num_entries; + size_t notifier_len; +}; + +struct nv50_crc { + spinlock_t lock; + struct nv50_crc_notifier_ctx ctx[2]; + struct drm_vblank_work flip_work; + enum nv50_crc_source src; + + u64 frame; + short entry_idx; + short flip_threshold; + u8 ctx_idx : 1; + bool ctx_changed : 1; +}; + +void nv50_crc_init(struct drm_device *dev); +int nv50_head_crc_late_register(struct nv50_head *); +void nv50_crc_handle_vblank(struct nv50_head *head); + +int nv50_crc_verify_source(struct drm_crtc *, const char *, size_t *); +const char *const *nv50_crc_get_sources(struct drm_crtc *, size_t *); +int nv50_crc_set_source(struct drm_crtc *, const char *); + +int nv50_crc_atomic_check(struct nv50_head *, struct nv50_head_atom *, + struct nv50_head_atom *); +void nv50_crc_atomic_stop_reporting(struct drm_atomic_state *); +void 
nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *); +void nv50_crc_atomic_start_reporting(struct drm_atomic_state *); +void nv50_crc_atomic_set(struct nv50_head *, struct nv50_head_atom *); +void nv50_crc_atomic_clr(struct nv50_head *); + +extern const struct nv50_crc_func crc907d; +extern const struct nv50_crc_func crcc37d; + +#else /* IS_ENABLED(CONFIG_DEBUG_FS) */ +struct nv50_crc {}; +struct nv50_crc_func {}; +struct nv50_crc_atom {}; + +#define nv50_crc_verify_source NULL +#define nv50_crc_get_sources NULL +#define nv50_crc_set_source NULL + +static inline void nv50_crc_init(struct drm_device *dev) {} +static inline int nv50_head_crc_late_register(struct nv50_head *head) { return 0; } +static inline void +nv50_crc_handle_vblank(struct nv50_head *head) {} + +static inline int +nv50_crc_atomic_check(struct nv50_head *head, struct nv50_head_atom *asyh, + struct nv50_head_atom *armh) { return 0; } +static inline void +nv50_crc_atomic_stop_reporting(struct drm_atomic_state *state) {} +static inline void +nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *state) {} +static inline void +nv50_crc_atomic_start_reporting(struct drm_atomic_state *state) {} +static inline void +nv50_crc_atomic_set(struct nv50_head *head, struct nv50_head_atom *asyh) {} +static inline void +nv50_crc_atomic_clr(struct nv50_head *head) {} + +#endif /* IS_ENABLED(CONFIG_DEBUG_FS) */ +#endif /* !__NV50_CRC_H__ */ diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc907d.c b/drivers/gpu/drm/nouveau/dispnv50/crc907d.c new file mode 100644 index 0000000000000..92e907de76454 --- /dev/null +++ b/drivers/gpu/drm/nouveau/dispnv50/crc907d.c @@ -0,0 +1,139 @@ +// SPDX-License-Identifier: MIT +#include <drm/drm_crtc.h> + +#include "crc.h" +#include "core.h" +#include "disp.h" +#include "head.h" + +#define CRC907D_MAX_ENTRIES 255 + +struct crc907d_notifier { + u32 status; + u32 :32; /* reserved */ + struct crc907d_entry { + u32 status; + u32 compositor_crc; + u32 output_crc[2]; + } entries[CRC907D_MAX_ENTRIES]; +} __packed; +
+static void +crc907d_set_src(struct nv50_head *head, int or, + enum nv50_crc_source_type source, + struct nv50_crc_notifier_ctx *ctx, u32 wndw) +{ + struct drm_crtc *crtc = &head->base.base; + struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan; + const u32 hoff = head->base.index * 0x300; + u32 *push; + u32 crc_args = 0xfff00000; + + switch (source) { + case NV50_CRC_SOURCE_TYPE_SOR: + crc_args |= (0x00000f0f + or * 16) << 8; + break; + case NV50_CRC_SOURCE_TYPE_PIOR: + crc_args |= (0x000000ff + or * 256) << 8; + break; + case NV50_CRC_SOURCE_TYPE_DAC: + crc_args |= (0x00000ff0 + or) << 8; + break; + case NV50_CRC_SOURCE_TYPE_RG: + crc_args |= (0x00000ff8 + drm_crtc_index(crtc)) << 8; + break; + case NV50_CRC_SOURCE_TYPE_SF: + crc_args |= (0x00000f8f + drm_crtc_index(crtc) * 16) << 8; + break; + case NV50_CRC_SOURCE_NONE: + crc_args |= 0x000fff00; + break; + } + + push = evo_wait(core, 4); + if (!push) + return; + + if (source) { + evo_mthd(push, 0x0438 + hoff, 1); + evo_data(push, ctx->ntfy.handle); + evo_mthd(push, 0x0430 + hoff, 1); + evo_data(push, crc_args); + } else { + evo_mthd(push, 0x0430 + hoff, 1); + evo_data(push, crc_args); + evo_mthd(push, 0x0438 + hoff, 1); + evo_data(push, 0); + } + evo_kick(push, core); +} + +static void crc907d_set_ctx(struct nv50_head *head, + struct nv50_crc_notifier_ctx *ctx) +{ + struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan; + u32 *push = evo_wait(core, 2); + + if (!push) + return; + + evo_mthd(push, 0x0438 + (head->base.index * 0x300), 1); + evo_data(push, ctx ? 
ctx->ntfy.handle : 0); + evo_kick(push, core); +} + +static u32 crc907d_get_entry(struct nv50_head *head, + struct nv50_crc_notifier_ctx *ctx, + enum nv50_crc_source source, int idx) +{ + struct crc907d_notifier __iomem *notifier = ctx->mem.object.map.ptr; + + return ioread32_native(&notifier->entries[idx].output_crc[0]); +} + +static bool crc907d_ctx_finished(struct nv50_head *head, + struct nv50_crc_notifier_ctx *ctx) +{ + struct nouveau_drm *drm = nouveau_drm(head->base.base.dev); + struct crc907d_notifier __iomem *notifier = ctx->mem.object.map.ptr; + const u32 status = ioread32_native(&notifier->status); + const u32 overflow = status & 0x0000003e; + + if (!(status & 0x00000001)) + return false; + + if (overflow) { + const char *engine = NULL; + + switch (overflow) { + case 0x00000004: engine = "DSI"; break; + case 0x00000008: engine = "Compositor"; break; + case 0x00000010: engine = "CRC output 1"; break; + case 0x00000020: engine = "CRC output 2"; break; + } + + if (engine) + NV_ERROR(drm, + "CRC notifier context for head %d overflowed on %s: %x\n", + head->base.index, engine, status); + else + NV_ERROR(drm, + "CRC notifier context for head %d overflowed: %x\n", + head->base.index, status); + } + + NV_DEBUG(drm, "Head %d CRC context status: %x\n", + head->base.index, status); + + return true; +} + +const struct nv50_crc_func crc907d = { + .set_src = crc907d_set_src, + .set_ctx = crc907d_set_ctx, + .get_entry = crc907d_get_entry, + .ctx_finished = crc907d_ctx_finished, + .flip_threshold = CRC907D_MAX_ENTRIES - 10, + .num_entries = CRC907D_MAX_ENTRIES, + .notifier_len = sizeof(struct crc907d_notifier), +}; diff --git a/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c b/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c new file mode 100644 index 0000000000000..940cefd5517d5 --- /dev/null +++ b/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c @@ -0,0 +1,153 @@ +// SPDX-License-Identifier: MIT +#include <drm/drm_crtc.h> + +#include "crc.h" +#include "core.h" +#include "disp.h" +#include
"head.h" + +#define CRCC37D_MAX_ENTRIES 2047 + +struct crcc37d_notifier { + u32 status; + + /* reserved */ + u32 :32; + u32 :32; + u32 :32; + u32 :32; + u32 :32; + u32 :32; + u32 :32; + + struct crcc37d_entry { + u32 status[2]; + u32 :32; /* reserved */ + u32 compositor_crc; + u32 rg_crc; + u32 output_crc[2]; + u32 :32; /* reserved */ + } entries[CRCC37D_MAX_ENTRIES]; +} __packed; + +static void +crcc37d_set_src(struct nv50_head *head, int or, + enum nv50_crc_source_type source, + struct nv50_crc_notifier_ctx *ctx, u32 wndw) +{ + struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan; + const u32 hoff = head->base.index * 0x400; + u32 *push; + u32 crc_args; + + switch (source) { + case NV50_CRC_SOURCE_TYPE_SOR: + crc_args = (0x00000050 + or) << 12; + break; + case NV50_CRC_SOURCE_TYPE_PIOR: + crc_args = (0x00000060 + or) << 12; + break; + case NV50_CRC_SOURCE_TYPE_SF: + crc_args = 0x00000030 << 12; + break; + default: + crc_args = 0; + break; + } + + push = evo_wait(core, 4); + if (!push) + return; + + if (source) { + evo_mthd(push, 0x2180 + hoff, 1); + evo_data(push, ctx->ntfy.handle); + evo_mthd(push, 0x2184 + hoff, 1); + evo_data(push, crc_args | wndw); + } else { + evo_mthd(push, 0x2184 + hoff, 1); + evo_data(push, 0); + evo_mthd(push, 0x2180 + hoff, 1); + evo_data(push, 0); + } + + evo_kick(push, core); +} + +static void crcc37d_set_ctx(struct nv50_head *head, + struct nv50_crc_notifier_ctx *ctx) +{ + struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan; + u32 *push = evo_wait(core, 2); + + if (!push) + return; + + evo_mthd(push, 0x2180 + (head->base.index * 0x400), 1); + evo_data(push, ctx ? 
ctx->ntfy.handle : 0); + evo_kick(push, core); +} + +static u32 crcc37d_get_entry(struct nv50_head *head, + struct nv50_crc_notifier_ctx *ctx, + enum nv50_crc_source source, int idx) +{ + struct crcc37d_notifier __iomem *notifier = ctx->mem.object.map.ptr; + struct crcc37d_entry __iomem *entry = &notifier->entries[idx]; + u32 __iomem *crc_addr; + + if (source == NV50_CRC_SOURCE_RG) + crc_addr = &entry->rg_crc; + else + crc_addr = &entry->output_crc[0]; + + return ioread32_native(crc_addr); +} + +static bool crcc37d_ctx_finished(struct nv50_head *head, + struct nv50_crc_notifier_ctx *ctx) +{ + struct nouveau_drm *drm = nouveau_drm(head->base.base.dev); + struct crcc37d_notifier __iomem *notifier = ctx->mem.object.map.ptr; + const u32 status = ioread32_native(&notifier->status); + const u32 overflow = status & 0x0000007e; + + if (!(status & 0x00000001)) + return false; + + if (overflow) { + const char *engine = NULL; + + switch (overflow) { + case 0x00000004: engine = "Front End"; break; + case 0x00000008: engine = "Compositor"; break; + case 0x00000010: engine = "RG"; break; + case 0x00000020: engine = "CRC output 1"; break; + case 0x00000040: engine = "CRC output 2"; break; + } + + if (engine) + NV_ERROR(drm, + "CRC notifier context for head %d overflowed on %s: %x\n", + head->base.index, engine, status); + else + NV_ERROR(drm, + "CRC notifier context for head %d overflowed: %x\n", + head->base.index, status); + } + + NV_DEBUG(drm, "Head %d CRC context status: %x\n", + head->base.index, status); + + return true; +} + +const struct nv50_crc_func crcc37d = { + .set_src = crcc37d_set_src, + .set_ctx = crcc37d_set_ctx, + .get_entry = crcc37d_get_entry, + .ctx_finished = crcc37d_ctx_finished, + .flip_threshold = CRCC37D_MAX_ENTRIES - 30, + .num_entries = CRCC37D_MAX_ENTRIES, + .notifier_len = sizeof(struct crcc37d_notifier), +}; diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c index bf7ba1e1c0f74..9cb06d6d6c3fb 100644 ---
a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -783,6 +783,19 @@ struct nv50_msto { bool disabled; }; +struct nouveau_encoder *nv50_real_outp(struct drm_encoder *encoder) +{ + struct nv50_msto *msto; + + if (encoder->encoder_type != DRM_MODE_ENCODER_DPMST) + return nouveau_encoder(encoder); + + msto = nv50_msto(encoder); + if (!msto->mstc) + return NULL; + return msto->mstc->mstm->outp; +} + static struct drm_dp_payload * nv50_msto_payload(struct nv50_msto *msto) { @@ -1932,6 +1945,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state) int i; NV_ATOMIC(drm, "commit %d %d\n", atom->lock_core, atom->flush_disable); + nv50_crc_atomic_stop_reporting(state); drm_atomic_helper_wait_for_fences(dev, state, false); drm_atomic_helper_wait_for_dependencies(state); drm_atomic_helper_update_legacy_modeset_state(dev, state); @@ -2002,6 +2016,8 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state) } } + nv50_crc_atomic_prepare_notifier_contexts(state); + /* Update output path(s). 
*/ list_for_each_entry_safe(outp, outt, &atom->outp, head) { const struct drm_encoder_helper_funcs *help; @@ -2115,6 +2131,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state) } } + nv50_crc_atomic_start_reporting(state); drm_atomic_helper_commit_hw_done(state); drm_atomic_helper_cleanup_planes(dev, state); drm_atomic_helper_commit_cleanup_done(state); diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.h b/drivers/gpu/drm/nouveau/dispnv50/disp.h index c7b72fa850995..1968c6921f9e7 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.h +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.h @@ -1,10 +1,12 @@ #ifndef __NV50_KMS_H__ #define __NV50_KMS_H__ +#include <linux/workqueue.h> #include <nvif/mem.h> #include "nouveau_display.h" struct nv50_msto; +struct nouveau_encoder; struct nv50_disp { struct nvif_disp *disp; @@ -90,6 +92,14 @@ int nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp, u64 syncbuf, struct nv50_dmac *dmac); void nv50_dmac_destroy(struct nv50_dmac *); +/* + * For normal encoders this just returns the encoder. For active MST encoders, + * this returns the real outp that's driving displays on the topology. + * Inactive MST encoders return NULL, since they would have no real outp to + * return anyway. 
+ */ +struct nouveau_encoder *nv50_real_outp(struct drm_encoder *encoder); + u32 *evo_wait(struct nv50_dmac *, int nr); void evo_kick(u32 *, struct nv50_dmac *); diff --git a/drivers/gpu/drm/nouveau/dispnv50/handles.h b/drivers/gpu/drm/nouveau/dispnv50/handles.h index d1beeb9a444db..27af7680294c6 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/handles.h +++ b/drivers/gpu/drm/nouveau/dispnv50/handles.h @@ -11,5 +11,6 @@ #define NV50_DISP_HANDLE_VRAM 0xf0000001 #define NV50_DISP_HANDLE_WNDW_CTX(kind) (0xfb000000 | kind) +#define NV50_DISP_HANDLE_CRC_CTX(head, i) (0xfc000000 | head->base.index << 1 | i) #endif /* !__NV50_KMS_HANDLES_H__ */ diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c index 72bc3bce396a7..eb905cdd54fe1 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.c +++ b/drivers/gpu/drm/nouveau/dispnv50/head.c @@ -24,13 +24,17 @@ #include "core.h" #include "curs.h" #include "ovly.h" +#include "crc.h" #include <nvif/class.h> +#include <nvif/event.h> +#include <nvif/cl0046.h> #include <drm/drm_atomic_helper.h> #include <drm/drm_crtc_helper.h> #include <drm/drm_vblank.h> #include "nouveau_connector.h" + void nv50_head_flush_clr(struct nv50_head *head, struct nv50_head_atom *asyh, bool flush) @@ -38,6 +42,7 @@ nv50_head_flush_clr(struct nv50_head *head, union nv50_head_atom_mask clr = { .mask = asyh->clr.mask & ~(flush ? 
0 : asyh->set.mask), }; + if (clr.crc) nv50_crc_atomic_clr(head); if (clr.olut) head->func->olut_clr(head); if (clr.core) head->func->core_clr(head); if (clr.curs) head->func->curs_clr(head); @@ -61,6 +66,7 @@ nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh) if (asyh->set.ovly ) head->func->ovly (head, asyh); if (asyh->set.dither ) head->func->dither (head, asyh); if (asyh->set.procamp) head->func->procamp (head, asyh); + if (asyh->set.crc ) nv50_crc_atomic_set (head, asyh); if (asyh->set.or ) head->func->or (head, asyh); } @@ -313,7 +319,7 @@ nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state) struct nouveau_conn_atom *asyc = NULL; struct drm_connector_state *conns; struct drm_connector *conn; - int i; + int i, ret; NV_ATOMIC(drm, "%s atomic_check %d\n", crtc->name, asyh->state.active); if (asyh->state.active) { @@ -408,6 +414,10 @@ nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state) asyh->set.curs = asyh->curs.visible; } + ret = nv50_crc_atomic_check(head, asyh, armh); + if (ret) + return ret; + if (asyh->clr.mask || asyh->set.mask) nv50_atom(asyh->state.state)->lock_core = true; return 0; @@ -446,6 +456,7 @@ nv50_head_atomic_duplicate_state(struct drm_crtc *crtc) asyh->ovly = armh->ovly; asyh->dither = armh->dither; asyh->procamp = armh->procamp; + asyh->crc = armh->crc; asyh->or = armh->or; asyh->dp = armh->dp; asyh->clr.mask = 0; @@ -467,10 +478,18 @@ nv50_head_reset(struct drm_crtc *crtc) __drm_atomic_helper_crtc_reset(crtc, &asyh->state); } +static int +nv50_head_late_register(struct drm_crtc *crtc) +{ + return nv50_head_crc_late_register(nv50_head(crtc)); +} + static void nv50_head_destroy(struct drm_crtc *crtc) { struct nv50_head *head = nv50_head(crtc); + + nvif_notify_fini(&head->base.vblank); nv50_lut_fini(&head->olut); drm_crtc_cleanup(crtc); kfree(head); @@ -488,8 +507,38 @@ nv50_head_func = { .enable_vblank = nouveau_display_vblank_enable, .disable_vblank = 
nouveau_display_vblank_disable, .get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp, + .late_register = nv50_head_late_register, +}; + +static const struct drm_crtc_funcs +nvd9_head_func = { + .reset = nv50_head_reset, + .gamma_set = drm_atomic_helper_legacy_gamma_set, + .destroy = nv50_head_destroy, + .set_config = drm_atomic_helper_set_config, + .page_flip = drm_atomic_helper_page_flip, + .atomic_duplicate_state = nv50_head_atomic_duplicate_state, + .atomic_destroy_state = nv50_head_atomic_destroy_state, + .enable_vblank = nouveau_display_vblank_enable, + .disable_vblank = nouveau_display_vblank_disable, + .get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp, + .verify_crc_source = nv50_crc_verify_source, + .get_crc_sources = nv50_crc_get_sources, + .set_crc_source = nv50_crc_set_source, + .late_register = nv50_head_late_register, }; +static int nv50_head_vblank_handler(struct nvif_notify *notify) +{ + struct nouveau_crtc *nv_crtc + container_of(notify, struct nouveau_crtc, vblank); + + if (drm_crtc_handle_vblank(&nv_crtc->base)) + nv50_crc_handle_vblank(nv50_head(&nv_crtc->base)); + + return NVIF_NOTIFY_KEEP; +} + struct nv50_head * nv50_head_create(struct drm_device *dev, int index) { @@ -497,7 +546,9 @@ nv50_head_create(struct drm_device *dev, int index) struct nv50_disp *disp = nv50_disp(dev); struct nv50_head *head; struct nv50_wndw *base, *ovly, *curs; + struct nouveau_crtc *nv_crtc; struct drm_crtc *crtc; + const struct drm_crtc_funcs *funcs; int ret; head = kzalloc(sizeof(*head), GFP_KERNEL); @@ -507,6 +558,11 @@ nv50_head_create(struct drm_device *dev, int index) head->func = disp->core->func->head; head->base.index = index; + if (disp->disp->object.oclass < GF110_DISP) + funcs = &nv50_head_func; + else + funcs = &nvd9_head_func; + if (disp->disp->object.oclass < GV100_DISP) { ret = nv50_base_new(drm, head->base.index, &base); if (ret) @@ -531,9 +587,10 @@ nv50_head_create(struct drm_device *dev, int index) if (ret) goto 
fail_free; - crtc = &head->base.base; + nv_crtc = &head->base; + crtc = &nv_crtc->base; drm_crtc_init_with_planes(dev, crtc, &base->plane, &curs->plane, - &nv50_head_func, "head-%d", head->base.index); + funcs, "head-%d", head->base.index); drm_crtc_helper_add(crtc, &nv50_head_help); /* Keep the legacy gamma size at 256 to avoid compatibility issues */ drm_mode_crtc_set_gamma_size(crtc, 256); @@ -547,8 +604,22 @@ nv50_head_create(struct drm_device *dev, int index) goto fail_crtc_cleanup; } + ret = nvif_notify_init(&disp->disp->object, nv50_head_vblank_handler, + false, NV04_DISP_NTFY_VBLANK, + &(struct nvif_notify_head_req_v0) { + .head = nv_crtc->index, + }, + sizeof(struct nvif_notify_head_req_v0), + sizeof(struct nvif_notify_head_rep_v0), + &nv_crtc->vblank); + if (ret) + goto fail_lut_fini; + return head; +fail_lut_fini: + if (head->func->olut_set) + nv50_lut_fini(&head->olut); fail_crtc_cleanup: drm_crtc_cleanup(crtc); fail_free: diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.h b/drivers/gpu/drm/nouveau/dispnv50/head.h index c05bbba9e247c..30501ad1824ec 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head.h +++ b/drivers/gpu/drm/nouveau/dispnv50/head.h @@ -1,22 +1,28 @@ #ifndef __NV50_KMS_HEAD_H__ #define __NV50_KMS_HEAD_H__ #define nv50_head(c) container_of((c), struct nv50_head, base.base) +#include <linux/workqueue.h> + #include "disp.h" #include "atom.h" +#include "crc.h" #include "lut.h" #include "nouveau_crtc.h" +#include "nouveau_encoder.h" struct nv50_head { const struct nv50_head_func *func; struct nouveau_crtc base; + struct nv50_crc crc; struct nv50_lut olut; struct nv50_msto *msto; }; struct nv50_head *nv50_head_create(struct drm_device *, int index); -void nv50_head_flush_set(struct nv50_head *, struct nv50_head_atom *); -void nv50_head_flush_clr(struct nv50_head *, struct nv50_head_atom *, bool y); +void nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh); +void nv50_head_flush_clr(struct nv50_head *head, + struct 
nv50_head_atom *asyh, bool flush); struct nv50_head_func { void (*view)(struct nv50_head *, struct nv50_head_atom *); diff --git a/drivers/gpu/drm/nouveau/dispnv50/head907d.c b/drivers/gpu/drm/nouveau/dispnv50/head907d.c index 3002ec23d7a6f..63a0b45d96d63 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/head907d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/head907d.c @@ -19,8 +19,15 @@ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. */ +#include <drm/drm_connector.h> +#include <drm/drm_mode_config.h> +#include <drm/drm_vblank.h> +#include "nouveau_drv.h" +#include "nouveau_bios.h" +#include "nouveau_connector.h" #include "head.h" #include "core.h" +#include "crc.h" void head907d_or(struct nv50_head *head, struct nv50_head_atom *asyh) @@ -29,9 +36,10 @@ head907d_or(struct nv50_head *head, struct nv50_head_atom *asyh) u32 *push; if ((push = evo_wait(core, 3))) { evo_mthd(push, 0x0404 + (head->base.index * 0x300), 2); - evo_data(push, 0x00000001 | asyh->or.depth << 6 | - asyh->or.nvsync << 4 | - asyh->or.nhsync << 3); + evo_data(push, asyh->or.depth << 6 | + asyh->or.nvsync << 4 | + asyh->or.nhsync << 3 | + asyh->or.crc_raster); evo_data(push, 0x31ec6000 | head->base.index << 25 | asyh->mode.interlace); evo_kick(push, core); diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c index c2619652ff2ee..35fcdf8825b5a 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c @@ -46,10 +46,10 @@ headc37d_or(struct nv50_head *head, struct nv50_head_atom *asyh) } evo_mthd(push, 0x2004 + (head->base.index * 0x400), 1); - evo_data(push, 0x00000001 | - asyh->or.depth << 4 | + evo_data(push, depth << 4 | asyh->or.nvsync << 3 | - asyh->or.nhsync << 2); + asyh->or.nhsync << 2 | + asyh->or.crc_raster); evo_kick(push, core); } } diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c index 
1c1887749f4c5..c7d04dd935fdf 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c +++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c @@ -46,10 +46,11 @@ headc57d_or(struct nv50_head *head, struct nv50_head_atom *asyh) } evo_mthd(push, 0x2004 + (head->base.index * 0x400), 1); - evo_data(push, 0xfc000001 | - asyh->or.depth << 4 | + evo_data(push, 0xfc000000 | + depth << 4 | asyh->or.nvsync << 3 | - asyh->or.nhsync << 2); + asyh->or.nhsync << 2 | + asyh->or.crc_raster); evo_kick(push, core); } } diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index 901ac55506d65..657554cf011ee 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -44,15 +44,7 @@ #include <nvif/class.h> #include <nvif/cl0046.h> #include <nvif/event.h> - -static int -nouveau_display_vblank_handler(struct nvif_notify *notify) -{ - struct nouveau_crtc *nv_crtc - container_of(notify, typeof(*nv_crtc), vblank); - drm_crtc_handle_vblank(&nv_crtc->base); - return NVIF_NOTIFY_KEEP; -} +#include <dispnv50/crc.h> int nouveau_display_vblank_enable(struct drm_crtc *crtc) @@ -136,50 +128,6 @@ nouveau_display_scanoutpos(struct drm_crtc *crtc, stime, etime); } -static void -nouveau_display_vblank_fini(struct drm_device *dev) -{ - struct drm_crtc *crtc; - - list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) { - struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc); - nvif_notify_fini(&nv_crtc->vblank); - } -} - -static int -nouveau_display_vblank_init(struct drm_device *dev) -{ - struct nouveau_display *disp = nouveau_display(dev); - struct drm_crtc *crtc; - int ret; - - list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) { - struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc); - ret = nvif_notify_init(&disp->disp.object, - nouveau_display_vblank_handler, false, - NV04_DISP_NTFY_VBLANK, - &(struct nvif_notify_head_req_v0) { - .head = nv_crtc->index, - }, - sizeof(struct nvif_notify_head_req_v0), - 
sizeof(struct nvif_notify_head_rep_v0), - &nv_crtc->vblank); - if (ret) { - nouveau_display_vblank_fini(dev); - return ret; - } - } - - ret = drm_vblank_init(dev, dev->mode_config.num_crtc); - if (ret) { - nouveau_display_vblank_fini(dev); - return ret; - } - - return 0; -} - static const struct drm_framebuffer_funcs nouveau_framebuffer_funcs = { .destroy = drm_gem_fb_destroy, .create_handle = drm_gem_fb_create_handle, @@ -705,9 +653,12 @@ nouveau_display_create(struct drm_device *dev) drm_mode_config_reset(dev); if (dev->mode_config.num_crtc) { - ret = nouveau_display_vblank_init(dev); + ret = drm_vblank_init(dev, dev->mode_config.num_crtc); if (ret) goto vblank_err; + + if (disp->disp.object.oclass >= NV50_DISP) + nv50_crc_init(dev); } INIT_WORK(&drm->hpd_work, nouveau_display_hpd_work); @@ -734,7 +685,6 @@ nouveau_display_destroy(struct drm_device *dev) #ifdef CONFIG_ACPI unregister_acpi_notifier(&nouveau_drm(dev)->acpi_nb); #endif - nouveau_display_vblank_fini(dev); drm_kms_helper_poll_fini(dev); drm_mode_config_cleanup(dev); -- 2.26.2
Daniel Vetter
2020-Jun-26 20:47 UTC
[Nouveau] [RFC v7 02/11] drm/vblank: Use spin_(un)lock_irq() in drm_crtc_vblank_off()
On Wed, Jun 24, 2020 at 07:03:09PM -0400, Lyude Paul wrote:
> This got me confused for a bit while looking over this code: I had been
> planning on adding some blocking function calls into this function, but
> seeing the irqsave/irqrestore variants of spin_(un)lock() didn't make it
> very clear whether or not that would actually be safe.
>
> So I went ahead and reviewed every single driver in the kernel that uses
> this function, and they all fall into three categories:
>
> * Driver probe code
> * ->atomic_disable() callbacks
> * Legacy modesetting callbacks
>
> All of these will be guaranteed to have IRQs enabled, which means it's
> perfectly safe to block here. Just to make things a little less
> confusing to others in the future, let's switch over to
> spin_lock_irq()/spin_unlock_irq() to make that fact a little more
> obvious.
>
> Signed-off-by: Lyude Paul <lyude at redhat.com>
> Cc: Daniel Vetter <daniel at ffwll.ch>
> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>

I think the patch is correct, but now we have a bit of an inconsistency, since all the other functions where the same applies still use _irqsave. I looked through the file and I think drm_vblank_get, drm_crtc_vblank_reset, drm_crtc_vblank_on, drm_legacy_vblank_post_modeset, drm_queue_vblank_event and drm_crtc_queue_sequence_ioctl are all candidates for the same cleanup. Maybe follow-up patches for less confusion?
On this:

Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>

> ---
>  drivers/gpu/drm/drm_vblank.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index ce5c1e1d29963..e895f5331fdb4 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -1283,13 +1283,12 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
>  	struct drm_pending_vblank_event *e, *t;
>
>  	ktime_t now;
> -	unsigned long irqflags;
>  	u64 seq;
>
>  	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
>  		return;
>
> -	spin_lock_irqsave(&dev->event_lock, irqflags);
> +	spin_lock_irq(&dev->event_lock);
>
>  	spin_lock(&dev->vbl_lock);
>  	drm_dbg_vbl(dev, "crtc %d, vblank enabled %d, inmodeset %d\n",
> @@ -1325,7 +1324,7 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
>  		drm_vblank_put(dev, pipe);
>  		send_vblank_event(dev, e, seq, now);
>  	}
> -	spin_unlock_irqrestore(&dev->event_lock, irqflags);
> +	spin_unlock_irq(&dev->event_lock);
>
>  	/* Will be reset by the modeset helpers when re-enabling the crtc by
>  	 * calling drm_calc_timestamping_constants(). */
> --
> 2.26.2
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
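
As an aside on the _irqsave vs. _irq distinction in the review above, here is a toy userspace model. It is not kernel code: the `irqs_enabled` flag and every `model_*` / `section_*` name are hypothetical stand-ins for the CPU interrupt state and the real spinlock primitives, and the actual locking is left out entirely. It only sketches why the plain _irq variants are fine when every caller is known to enter with IRQs enabled, and why they would clobber the caller's state otherwise.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's "local interrupts enabled" state. */
static bool irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the previous state, then disable. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models spin_unlock_irqrestore(): put back whatever was saved. */
static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Models spin_lock_irq(): disable unconditionally, nothing is saved. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

/* Models spin_unlock_irq(): enable unconditionally. */
static void model_unlock_irq(void)
{
	irqs_enabled = true;
}

/* A critical section written with the save/restore pair. */
static void section_with_irqsave(void)
{
	bool flags = model_lock_irqsave();

	/* ... work that must not be interrupted ... */
	model_unlock_irqrestore(flags);
}

/* The same critical section written with the unconditional pair. */
static void section_with_irq(void)
{
	model_lock_irq();
	/* ... work that must not be interrupted ... */
	model_unlock_irq();
}
```

In this model, calling `section_with_irq()` while `irqs_enabled` is already false leaves it true afterwards, whereas `section_with_irqsave()` leaves it false. That is the reason the commit message audits every caller before switching variants.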
On Wed, Jun 24, 2020 at 07:03:10PM -0400, Lyude Paul wrote:
> Add some kind of vblank workers. The interface is similar to regular
> delayed works, and is mostly based off kthread_work. It allows for
> scheduling delayed works that execute once a particular vblank sequence
> has passed. It also allows for accurate flushing of scheduled vblank
> works - in that flushing waits for both the vblank sequence and job
> execution to complete, or for the work to get cancelled - whichever
> comes first.
>
> Whatever hardware programming we do in the work must be fast (must at
> least complete during the vblank or scanout period, sometimes during the
> first few scanlines of the vblank). As such we use a high-priority
> per-CRTC thread to accomplish this.
>
> Changes since v6:
> * Get rid of ->pending and seqcounts, and implement flushing through
>   simpler means - danvet
> * Get rid of work_lock, just use drm_device->event_lock
> * Move drm_vblank_work item cleanup into drm_crtc_vblank_off() so that
>   we ensure that all vblank work has finished before disabling vblanks
> * Add checks into drm_crtc_vblank_reset() so we yell if it gets called
>   while there's vblank workers active
> * Grab event_lock in both drm_crtc_vblank_on()/drm_crtc_vblank_off(),
>   the main reason for this is so that other threads calling
>   drm_vblank_work_schedule() are blocked from attempting to schedule
>   while we're in the middle of enabling/disabling vblanks.
> * Move drm_handle_vblank_works() call below drm_handle_vblank_events()
> * Simplify drm_vblank_work_cancel_sync()
> * Fix drm_vblank_work_cancel_sync() documentation
> * Move wake_up_all() calls out of spinlock where we can. The only one I
>   left was the call to wake_up_all() in drm_vblank_handle_works() as
>   this seemed like it made more sense just living in that function
>   (which is all technically under lock)
> * Move drm_vblank_work related functions into their own source files
> * Add drm_vblank_internal.h so we can export some functions we don't
>   want drivers using, but that we do need to use in drm_vblank_work.c
> * Add a bunch of documentation
> Changes since v4:
> * Get rid of kthread interfaces we tried adding and move all of the
>   locking into drm_vblank.c. For implementing drm_vblank_work_flush(),
>   we now use a wait_queue and sequence counters in order to
>   differentiate between multiple work item executions.
> * Get rid of drm_vblank_work_cancel() - this would have been pretty
>   difficult to actually reimplement and it occurred to me that neither
>   nouveau or i915 are even planning to use this function. Since there's
>   also no async cancel function for most of the work interfaces in the
>   kernel, it seems a bit unnecessary anyway.
> * Get rid of to_drm_vblank_work() since we now are also able to just
>   pass the struct drm_vblank_work to work item callbacks anyway
> Changes since v3:
> * Use our own spinlocks, don't integrate so tightly with kthread_works
> Changes since v2:
> * Use kthread_workers instead of reinventing the wheel.
>
> Cc: Daniel Vetter <daniel at ffwll.ch>
> Cc: Tejun Heo <tj at kernel.org>
> Cc: dri-devel at lists.freedesktop.org
> Cc: nouveau at lists.freedesktop.org
> Co-developed-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Signed-off-by: Lyude Paul <lyude at redhat.com>

I found a bunch of tiny details below, but overall looks great and thanks for polishing the kerneldoc. With the details addressed one way or another:

Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>

But feel free to resend and poke me again if you want me to recheck the details that needed changing.
Cheers, Daniel

> ---
>  Documentation/gpu/drm-kms.rst | 15 ++
>  drivers/gpu/drm/Makefile | 2 +-
>  drivers/gpu/drm/drm_vblank.c | 55 +++--
>  drivers/gpu/drm/drm_vblank_internal.h | 19 ++
>  drivers/gpu/drm/drm_vblank_work.c | 259 +++++++++++++++++++++
>  drivers/gpu/drm/drm_vblank_work_internal.h | 24 ++
>  include/drm/drm_vblank.h | 20 ++
>  include/drm/drm_vblank_work.h | 71 ++++++
>  8 files changed, 447 insertions(+), 18 deletions(-)
>  create mode 100644 drivers/gpu/drm/drm_vblank_internal.h
>  create mode 100644 drivers/gpu/drm/drm_vblank_work.c
>  create mode 100644 drivers/gpu/drm/drm_vblank_work_internal.h
>  create mode 100644 include/drm/drm_vblank_work.h
>
> diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
> index 975cfeb8a3532..3c5ae4f6dfd23 100644
> --- a/Documentation/gpu/drm-kms.rst
> +++ b/Documentation/gpu/drm-kms.rst
> @@ -543,3 +543,18 @@ Vertical Blanking and Interrupt Handling Functions Reference
>
>  .. kernel-doc:: drivers/gpu/drm/drm_vblank.c
>     :export:
> +
> +Vertical Blank Work
> +===================
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_vblank_work.c
> +   :doc: vblank works
> +
> +Vertical Blank Work Functions Reference
> +---------------------------------------
> +
> +.. kernel-doc:: include/drm/drm_vblank_work.h
> +   :internal:
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_vblank_work.c
> +   :export:
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 2c0e5a7e59536..02ee5faf1a925 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -18,7 +18,7 @@ drm-y := drm_auth.o drm_cache.o \
>  		drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
>  		drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
>  		drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> -		drm_managed.o
> +		drm_managed.o drm_vblank_work.o
>
>  drm-$(CONFIG_DRM_LEGACY) += drm_legacy_misc.o drm_bufs.o drm_context.o drm_dma.o drm_scatter.o drm_lock.o
>  drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index e895f5331fdb4..b353bc8328414 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -25,6 +25,7 @@
>   */
>
>  #include <linux/export.h>
> +#include <linux/kthread.h>
>  #include <linux/moduleparam.h>
>
>  #include <drm/drm_crtc.h>
> @@ -37,6 +38,8 @@
>
>  #include "drm_internal.h"
>  #include "drm_trace.h"
> +#include "drm_vblank_internal.h"
> +#include "drm_vblank_work_internal.h"

Feels like mild overkill to have these files with 1-2 functions each; I'd stuff them all into drm_internal.h.
We do have other vblank stuff in there already.> > /** > * DOC: vblank handling > @@ -363,7 +366,7 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > store_vblank(dev, pipe, diff, t_vblank, cur_vblank); > } > > -static u64 drm_vblank_count(struct drm_device *dev, unsigned int pipe) > +u64 drm_vblank_count(struct drm_device *dev, unsigned int pipe) > { > struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; > u64 count; > @@ -497,6 +500,7 @@ static void drm_vblank_init_release(struct drm_device *dev, void *ptr) > drm_WARN_ON(dev, READ_ONCE(vblank->enabled) && > drm_core_check_feature(dev, DRIVER_MODESET)); > > + drm_vblank_destroy_worker(vblank); > del_timer_sync(&vblank->disable_timer); > } > > @@ -539,6 +543,10 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs) > vblank); > if (ret) > return ret; > + > + ret = drm_vblank_worker_init(vblank); > + if (ret) > + return ret; > } > > return 0; > @@ -1135,7 +1143,7 @@ static int drm_vblank_enable(struct drm_device *dev, unsigned int pipe) > return ret; > } > > -static int drm_vblank_get(struct drm_device *dev, unsigned int pipe) > +int drm_vblank_get(struct drm_device *dev, unsigned int pipe) > { > struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; > unsigned long irqflags; > @@ -1178,7 +1186,7 @@ int drm_crtc_vblank_get(struct drm_crtc *crtc) > } > EXPORT_SYMBOL(drm_crtc_vblank_get); > > -static void drm_vblank_put(struct drm_device *dev, unsigned int pipe) > +void drm_vblank_put(struct drm_device *dev, unsigned int pipe) > { > struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; > > @@ -1281,13 +1289,16 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc) > unsigned int pipe = drm_crtc_index(crtc); > struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; > struct drm_pending_vblank_event *e, *t; > - > ktime_t now; > u64 seq; > > if (drm_WARN_ON(dev, pipe >= dev->num_crtcs)) > return; > > + /* > + * Grab event_lock early to prevent vblank work from being scheduled > + * 
while we're in the middle of shutting down vblank interrupts > + */ > spin_lock_irq(&dev->event_lock); > > spin_lock(&dev->vbl_lock); > @@ -1324,11 +1335,18 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc) > drm_vblank_put(dev, pipe); > send_vblank_event(dev, e, seq, now); > } > + > + /* Cancel any leftover pending vblank work */ > + drm_vblank_cancel_pending_works(vblank); > + > spin_unlock_irq(&dev->event_lock); > > /* Will be reset by the modeset helpers when re-enabling the crtc by > * calling drm_calc_timestamping_constants(). */ > vblank->hwmode.crtc_clock = 0; > + > + /* Wait for any vblank work that's still executing to finish */ > + drm_vblank_flush_worker(vblank); > } > EXPORT_SYMBOL(drm_crtc_vblank_off); > > @@ -1363,6 +1381,7 @@ void drm_crtc_vblank_reset(struct drm_crtc *crtc) > spin_unlock_irqrestore(&dev->vbl_lock, irqflags); > > drm_WARN_ON(dev, !list_empty(&dev->vblank_event_list)); > + drm_WARN_ON(dev, !list_empty(&vblank->pending_work)); > } > EXPORT_SYMBOL(drm_crtc_vblank_reset); > > @@ -1417,7 +1436,10 @@ void drm_crtc_vblank_on(struct drm_crtc *crtc) > if (drm_WARN_ON(dev, pipe >= dev->num_crtcs)) > return; > > - spin_lock_irqsave(&dev->vbl_lock, irqflags); > + /* So vblank works can't be scheduled until we've finished */ > + spin_lock_irqsave(&dev->event_lock, irqflags);This smells fishy, why do we need this? drm_enable_vblank takes the ->vblank_time_lock spinlock, which is the first thing drm_handle_vblank takes, so there's absolute no way for a vblank event or worker to get ahead of this here. 
Except if I'm missing something this isn't needed.> + > + spin_lock(&dev->vbl_lock); > drm_dbg_vbl(dev, "crtc %d, vblank enabled %d, inmodeset %d\n", > pipe, vblank->enabled, vblank->inmodeset); > > @@ -1435,7 +1457,9 @@ void drm_crtc_vblank_on(struct drm_crtc *crtc) > */ > if (atomic_read(&vblank->refcount) != 0 || drm_vblank_offdelay == 0) > drm_WARN_ON(dev, drm_vblank_enable(dev, pipe)); > - spin_unlock_irqrestore(&dev->vbl_lock, irqflags); > + > + spin_unlock(&dev->vbl_lock); > + spin_unlock_irqrestore(&dev->event_lock, irqflags); > } > EXPORT_SYMBOL(drm_crtc_vblank_on); > > @@ -1589,11 +1613,6 @@ int drm_legacy_modeset_ctl_ioctl(struct drm_device *dev, void *data, > return 0; > } > > -static inline bool vblank_passed(u64 seq, u64 ref) > -{ > - return (seq - ref) <= (1 << 23); > -} > - > static int drm_queue_vblank_event(struct drm_device *dev, unsigned int pipe, > u64 req_seq, > union drm_wait_vblank *vblwait, > @@ -1650,7 +1669,7 @@ static int drm_queue_vblank_event(struct drm_device *dev, unsigned int pipe, > trace_drm_vblank_event_queued(file_priv, pipe, req_seq); > > e->sequence = req_seq; > - if (vblank_passed(seq, req_seq)) { > + if (drm_vblank_passed(seq, req_seq)) { > drm_vblank_put(dev, pipe); > send_vblank_event(dev, e, seq, now); > vblwait->reply.sequence = seq; > @@ -1805,7 +1824,7 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void *data, > } > > if ((flags & _DRM_VBLANK_NEXTONMISS) && > - vblank_passed(seq, req_seq)) { > + drm_vblank_passed(seq, req_seq)) { > req_seq = seq + 1; > vblwait->request.type &= ~_DRM_VBLANK_NEXTONMISS; > vblwait->request.sequence = req_seq; > @@ -1824,7 +1843,7 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void *data, > drm_dbg_core(dev, "waiting on vblank count %llu, crtc %u\n", > req_seq, pipe); > wait = wait_event_interruptible_timeout(vblank->queue, > - vblank_passed(drm_vblank_count(dev, pipe), req_seq) || > + drm_vblank_passed(drm_vblank_count(dev, pipe), req_seq) || > !READ_ONCE(vblank->enabled), > 
msecs_to_jiffies(3000)); > > @@ -1873,7 +1892,7 @@ static void drm_handle_vblank_events(struct drm_device *dev, unsigned int pipe) > list_for_each_entry_safe(e, t, &dev->vblank_event_list, base.link) { > if (e->pipe != pipe) > continue; > - if (!vblank_passed(seq, e->sequence)) > + if (!drm_vblank_passed(seq, e->sequence)) > continue; > > drm_dbg_core(dev, "vblank event on %llu, current %llu\n", > @@ -1943,6 +1962,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe) > !atomic_read(&vblank->refcount)); > > drm_handle_vblank_events(dev, pipe); > + drm_handle_vblank_works(vblank); > > spin_unlock_irqrestore(&dev->event_lock, irqflags); > > @@ -2096,7 +2116,7 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data, > if (flags & DRM_CRTC_SEQUENCE_RELATIVE) > req_seq += seq; > > - if ((flags & DRM_CRTC_SEQUENCE_NEXT_ON_MISS) && vblank_passed(seq, req_seq)) > + if ((flags & DRM_CRTC_SEQUENCE_NEXT_ON_MISS) && drm_vblank_passed(seq, req_seq)) > req_seq = seq + 1; > > e->pipe = pipe; > @@ -2125,7 +2145,7 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data, > > e->sequence = req_seq; > > - if (vblank_passed(seq, req_seq)) { > + if (drm_vblank_passed(seq, req_seq)) { > drm_crtc_vblank_put(crtc); > send_vblank_event(dev, e, seq, now); > queue_seq->sequence = seq; > @@ -2145,3 +2165,4 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data, > kfree(e); > return ret; > } > + > diff --git a/drivers/gpu/drm/drm_vblank_internal.h b/drivers/gpu/drm/drm_vblank_internal.h > new file mode 100644 > index 0000000000000..217ae5442ddce > --- /dev/null > +++ b/drivers/gpu/drm/drm_vblank_internal.h > @@ -0,0 +1,19 @@ > +// SPDX-License-Identifier: MIT > + > +#ifndef DRM_VBLANK_INTERNAL_H > +#define DRM_VBLANK_INTERNAL_H > + > +#include <linux/types.h> > + > +#include <drm/drm_device.h> > + > +static inline bool drm_vblank_passed(u64 seq, u64 ref) > +{ > + return (seq - ref) <= (1 << 23); > +} > + > +int 
drm_vblank_get(struct drm_device *dev, unsigned int pipe); > +void drm_vblank_put(struct drm_device *dev, unsigned int pipe); > +u64 drm_vblank_count(struct drm_device *dev, unsigned int pipe); > + > +#endif /* !DRM_VBLANK_INTERNAL_H */ > diff --git a/drivers/gpu/drm/drm_vblank_work.c b/drivers/gpu/drm/drm_vblank_work.c > new file mode 100644 > index 0000000000000..0762ad34cdcc0 > --- /dev/null > +++ b/drivers/gpu/drm/drm_vblank_work.c > @@ -0,0 +1,259 @@ > +// SPDX-License-Identifier: MIT > + > +#include <uapi/linux/sched/types.h> > + > +#include <drm/drm_print.h> > +#include <drm/drm_vblank.h> > +#include <drm/drm_vblank_work.h> > +#include <drm/drm_crtc.h> > + > +#include "drm_vblank_internal.h" > +#include "drm_vblank_work_internal.h" > + > +/** > + * DOC: vblank works > + * > + * Many DRM drivers need to program hardware in a time-sensitive manner, many > + * times with a deadline of starting and finishing within a certain region of > + * the scanout. Most of the time the safest way to accomplish this is to > + * simply do said time-sensitive programming in the driver's IRQ handler, > + * which allows drivers to avoid being preempted during these critical > + * regions. Or even better, the hardware may even handle applying such > + * time-critical programming independently of the CPU. > + * > + * While there's a decent amount of hardware that's designed so that the CPU > + * doesn't need to be concerned with extremely time-sensitive programming, > + * there's a few situations where it can't be helped. Some unforgiving > + * hardware may require that certain time-sensitive programming be handled > + * completely by the CPU, and said programming may even take too long to > + * handle in an IRQ handler. Another such situation would be where the driver > + * needs to perform a task that needs to complete within a specific scanout > + * period, but might possibly block and thus cannot be handled in an IRQ > + * context. 
Both of these situations can't be solved perfectly in Linux since > + * we're not a realtime kernel, and thus the scheduler may cause us to miss > + * our deadline if it decides to preempt us. But for some drivers, it's good > + * enough if we can lower our chance of being preempted to an absolute > + * minimum. > + * > + * This is where &drm_vblank_work comes in. &drm_vblank_work provides a simple > + * generic delayed work implementation which delays work execution until a > + * particular vblank has passed, and then executes the work at realtime > + * priority. This provides the best possible chance at performing > + * time-sensitive hardware programming on time, even when the system is under > + * heavy load. &drm_vblank_work also supports rescheduling, so that self > + * re-arming work items can be easily implemented. > + */ > + > +void drm_handle_vblank_works(struct drm_vblank_crtc *vblank) > +{ > + struct drm_vblank_work *work, *next; > + u64 count = atomic64_read(&vblank->count); > + bool wake = false; > + > + assert_spin_locked(&vblank->dev->event_lock); > + > + list_for_each_entry_safe(work, next, &vblank->pending_work, node) { > + if (!drm_vblank_passed(count, work->count)) > + continue; > + > + list_del_init(&work->node); > + drm_vblank_put(vblank->dev, vblank->pipe); > + kthread_queue_work(vblank->worker, &work->base); > + wake = true; > + } > + if (wake) > + wake_up_all(&vblank->work_wait_queue); > +} > + > +/* Handle cancelling any pending vblank work items and drop respective vblank > + * references in response to vblank interrupts being disabled. 
> + */
> +void drm_vblank_cancel_pending_works(struct drm_vblank_crtc *vblank)
> +{
> +	struct drm_vblank_work *work, *next;
> +
> +	assert_spin_locked(&vblank->dev->event_lock);
> +
> +	list_for_each_entry_safe(work, next, &vblank->pending_work, node) {
> +		list_del_init(&work->node);
> +		drm_vblank_put(vblank->dev, vblank->pipe);
> +	}
> +
> +	wake_up_all(&vblank->work_wait_queue);
> +}
> +
> +/**
> + * drm_vblank_work_schedule - schedule a vblank work
> + * @work: vblank work to schedule
> + * @count: target vblank count
> + * @nextonmiss: defer until the next vblank if target vblank was missed
> + *
> + * Schedule @work for execution once the crtc vblank count reaches @count.
> + *
> + * If the crtc vblank count has already reached @count and @nextonmiss is
> + * %false the work starts to execute immediately.
> + *
> + * If the crtc vblank count has already reached @count and @nextonmiss is
> + * %true the work is deferred until the next vblank (as if @count has been
> + * specified as crtc vblank count + 1).
> + *
> + * If @work is already scheduled, this function will reschedule said work
> + * using the new @count.

Maybe clarify here that "This can be used for self-rearming work items." or
something like that.

> + *
> + * Returns:
> + * 0 on success, error code on failure.
> + */
> +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> +			     u64 count, bool nextonmiss)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +	struct drm_device *dev = vblank->dev;
> +	u64 cur_vbl;
> +	unsigned long irqflags;
> +	bool passed, rescheduling = false, wake = false;
> +	int ret = 0;
> +
> +	spin_lock_irqsave(&dev->event_lock, irqflags);
> +	if (!vblank->worker || vblank->inmodeset || work->cancelling)

Oh nice catch with ->inmodeset, I totally missed checking re-arming vs
drm_crtc_vblank_off races. Only problem I'm seeing is that we're holding the
wrong spinlock: this needs to be checked under ->vbl_lock.
But ->cancelling needs the event_lock, so I think you need to split this check
into two, and grab the ->vbl_lock around the ->inmodeset check. The ->worker
check otoh looks fishy, that should never happen. If you feel like some
defensive programming then I think that should be an
if (WARN_ON(!vblank->worker)) return;

> +		goto out;
> +
> +	if (list_empty(&work->node)) {
> +		ret = drm_vblank_get(dev, vblank->pipe);

Ok that kills the idea of converting the _irqsave to _irq in drm_vblank_get.
I do wonder whether it wouldn't be nicer to have the vblank_get outside of the
spinlock, and unconditional - would allow you to drop the ->inmodeset check.
But the end result in code flow cleanliness is not any better, so not a good
idea I think.

> +		if (ret < 0)
> +			goto out;
> +	} else if (work->count == count) {
> +		/* Already scheduled w/ same vbl count */
> +		goto out;
> +	} else {
> +		rescheduling = true;
> +	}
> +
> +	work->count = count;
> +	cur_vbl = drm_vblank_count(dev, vblank->pipe);
> +	passed = drm_vblank_passed(cur_vbl, count);
> +	if (passed)
> +		DRM_DEV_ERROR(dev->dev,
> +			      "crtc %d vblank %llu already passed (current %llu)\n",
> +			      vblank->pipe, count, cur_vbl);

This is a bit loud, I think that should be debug output at most. You can't
really prevent races.
I do wonder though whether we should do something like: 1 indicates that the
work item has been scheduled, and 0 that it hasn't been scheduled (aside from
failure, which is negative).

> +
> +	if (!nextonmiss && passed) {
> +		drm_vblank_put(dev, vblank->pipe);
> +		kthread_queue_work(vblank->worker, &work->base);
> +
> +		if (rescheduling) {
> +			list_del_init(&work->node);
> +			wake = true;
> +		}
> +	} else if (!rescheduling) {
> +		list_add_tail(&work->node, &vblank->pending_work);
> +	}
> +
> +out:
> +	spin_unlock_irqrestore(&dev->event_lock, irqflags);
> +	if (wake)
> +		wake_up_all(&vblank->work_wait_queue);
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_vblank_work_schedule);

I think the above control flow is all correct, but this is the kind of stuff
that's prime material for some selftests. But we don't have enough ready-made
mocking I think, so not going to ask for that. Just an idea.

> +
> +/**
> + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it to
> + * finish executing
> + * @work: vblank work to cancel
> + *
> + * Cancel an already scheduled vblank work and wait for its
> + * execution to finish.
> + *
> + * On return, @work is guaranteed to no longer be scheduled or running, even
> + * if it's self-arming.
> + *
> + * Returns:
> + * %True if the work was cancelled before it started to execute, %false
> + * otherwise.
> + */
> +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +	struct drm_device *dev = vblank->dev;
> +	bool ret = false;
> +
> +	spin_lock_irq(&dev->event_lock);
> +	if (!list_empty(&work->node)) {
> +		list_del_init(&work->node);
> +		drm_vblank_put(vblank->dev, vblank->pipe);
> +		ret = true;
> +	}
> +
> +	work->cancelling++;
> +	spin_unlock_irq(&dev->event_lock);
> +
> +	wake_up_all(&vblank->work_wait_queue);
> +
> +	if (kthread_cancel_work_sync(&work->base))
> +		ret = true;
> +
> +	spin_lock_irq(&dev->event_lock);
> +	work->cancelling--;
> +	spin_unlock_irq(&dev->event_lock);

lgtm, everything looks ordered correctly to avoid a self-arming work escaping.

> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
> +
> +/**
> + * drm_vblank_work_flush - wait for a scheduled vblank work to finish
> + * executing
> + * @work: vblank work to flush
> + *
> + * Wait until @work has finished executing once.
> + */
> +void drm_vblank_work_flush(struct drm_vblank_work *work)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +	struct drm_device *dev = vblank->dev;
> +
> +	spin_lock_irq(&dev->event_lock);
> +	wait_event_lock_irq(vblank->work_wait_queue, list_empty(&work->node),
> +			    dev->event_lock);
> +	spin_unlock_irq(&dev->event_lock);
> +
> +	kthread_flush_work(&work->base);

So much less magic here, I like.

> +}
> +EXPORT_SYMBOL(drm_vblank_work_flush);
> +
> +/**
> + * drm_vblank_work_init - initialize a vblank work item
> + * @work: vblank work item
> + * @crtc: CRTC whose vblank will trigger the work execution
> + * @func: work function to be executed
> + *
> + * Initialize a vblank work item for a specific crtc.
> + */
> +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc,
> +			  void (*func)(struct kthread_work *work))
> +{
> +	kthread_init_work(&work->base, func);
> +	INIT_LIST_HEAD(&work->node);
> +	work->vblank = &crtc->dev->vblank[drm_crtc_index(crtc)];
> +}
> +EXPORT_SYMBOL(drm_vblank_work_init);
> +
> +int drm_vblank_worker_init(struct drm_vblank_crtc *vblank)
> +{
> +	struct sched_param param = {
> +		.sched_priority = MAX_RT_PRIO - 1,
> +	};
> +	struct kthread_worker *worker;
> +
> +	INIT_LIST_HEAD(&vblank->pending_work);
> +	init_waitqueue_head(&vblank->work_wait_queue);
> +	worker = kthread_create_worker(0, "card%d-crtc%d",
> +				       vblank->dev->primary->index,
> +				       vblank->pipe);
> +	if (IS_ERR(worker))
> +		return PTR_ERR(worker);
> +
> +	vblank->worker = worker;
> +
> +	return sched_setscheduler(vblank->worker->task, SCHED_FIFO, &param);
> +}
> diff --git a/drivers/gpu/drm/drm_vblank_work_internal.h b/drivers/gpu/drm/drm_vblank_work_internal.h
> new file mode 100644
> index 0000000000000..0a4abbc4ab295
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_vblank_work_internal.h
> @@ -0,0 +1,24 @@
> +// SPDX-License-Identifier: MIT
> +
> +#ifndef _DRM_VBLANK_WORK_INTERNAL_H_
> +#define _DRM_VBLANK_WORK_INTERNAL_H_
> +
> +#include <drm/drm_vblank.h>
> +
> +int drm_vblank_worker_init(struct drm_vblank_crtc *vblank);
> +void drm_vblank_cancel_pending_works(struct drm_vblank_crtc *vblank);
> +void drm_handle_vblank_works(struct drm_vblank_crtc *vblank);
> +
> +static inline void drm_vblank_flush_worker(struct drm_vblank_crtc *vblank)
> +{
> +	if (vblank->worker)

Is this check really required?
We should always have a worker I thought?

> +		kthread_flush_worker(vblank->worker);
> +}
> +
> +static inline void drm_vblank_destroy_worker(struct drm_vblank_crtc *vblank)
> +{
> +	if (vblank->worker)

Same here.

> +		kthread_destroy_worker(vblank->worker);
> +}
> +
> +#endif /* !_DRM_VBLANK_WORK_INTERNAL_H_ */
> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> index dd9f5b9e56e4e..dd125f8c766cf 100644
> --- a/include/drm/drm_vblank.h
> +++ b/include/drm/drm_vblank.h
> @@ -27,12 +27,14 @@
>  #include <linux/seqlock.h>
>  #include <linux/idr.h>
>  #include <linux/poll.h>
> +#include <linux/kthread.h>
>
>  #include <drm/drm_file.h>
>  #include <drm/drm_modes.h>
>
>  struct drm_device;
>  struct drm_crtc;
> +struct drm_vblank_work;
>
>  /**
>   * struct drm_pending_vblank_event - pending vblank event tracking
> @@ -203,6 +205,24 @@ struct drm_vblank_crtc {
>  	 * disabling functions multiple times.
>  	 */
>  	bool enabled;
> +
> +	/**
> +	 * @worker: The &kthread_worker used for executing vblank works.
> +	 */
> +	struct kthread_worker *worker;
> +
> +	/**
> +	 * @pending_work: A list of scheduled &drm_vblank_work items that are
> +	 * waiting for a future vblank.
> +	 */
> +	struct list_head pending_work;
> +
> +	/**
> +	 * @work_wait_queue: The wait queue used for signaling that a
> +	 * &drm_vblank_work item has either finished executing, or was
> +	 * cancelled.
> +	 */
> +	wait_queue_head_t work_wait_queue;
>  };
>
>  int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
> diff --git a/include/drm/drm_vblank_work.h b/include/drm/drm_vblank_work.h
> new file mode 100644
> index 0000000000000..f0439c039f7ce
> --- /dev/null
> +++ b/include/drm/drm_vblank_work.h
> @@ -0,0 +1,71 @@
> +// SPDX-License-Identifier: MIT
> +
> +#ifndef _DRM_VBLANK_WORK_H_
> +#define _DRM_VBLANK_WORK_H_
> +
> +#include <linux/kthread.h>
> +
> +struct drm_crtc;
> +
> +/**
> + * struct drm_vblank_work - A delayed work item which delays until a target
> + * vblank passes, and then executes at realtime priority outside of IRQ
> + * context.
> + *
> + * See also:
> + * drm_vblank_work_schedule()
> + * drm_vblank_work_init()
> + * drm_vblank_work_cancel_sync()
> + * drm_vblank_work_flush()
> + */
> +struct drm_vblank_work {
> +	/**
> +	 * @base: The base &kthread_work item which will be executed by
> +	 * &drm_vblank_crtc.worker. Drivers should not interact with this
> +	 * directly, and instead rely on drm_vblank_work_init() to initialize
> +	 * this.
> +	 */
> +	struct kthread_work base;
> +
> +	/**
> +	 * @vblank: A pointer to &drm_vblank_crtc this work item belongs to.
> +	 */
> +	struct drm_vblank_crtc *vblank;
> +
> +	/**
> +	 * @count: The target vblank this work will execute on. Drivers should
> +	 * not modify this value directly, and instead use
> +	 * drm_vblank_work_schedule()
> +	 */
> +	u64 count;
> +
> +	/**
> +	 * @cancelling: The number of drm_vblank_work_cancel_sync() calls that
> +	 * are currently running. A work item cannot be rescheduled until all
> +	 * calls have finished.
> +	 */
> +	int cancelling;
> +
> +	/**
> +	 * @node: The position of this work item in
> +	 * &drm_vblank_crtc.pending_work.
> +	 */
> +	struct list_head node;
> +};
> +
> +/**
> + * to_drm_vblank_work - Retrieve the respective &drm_vblank_work item from a
> + * &kthread_work
> + * @_work: The &kthread_work embedded inside a &drm_vblank_work
> + */
> +#define to_drm_vblank_work(_work) \
> +	container_of((_work), struct drm_vblank_work, base)
> +
> +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> +			     u64 count, bool nextonmiss);
> +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc,
> +			  void (*func)(struct kthread_work *work));
> +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
> +void drm_vblank_work_flush(struct drm_vblank_work *work);
> +
> +#endif /* !_DRM_VBLANK_WORK_H_ */
> --
> 2.26.2
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
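
For reference, the @count / @nextonmiss rules documented for
drm_vblank_work_schedule() in the quoted patch can be sketched in plain
userspace C. This is only a minimal sketch, not the kernel implementation:
vblank_passed_sketch(), decide_sketch() and enum sched_action are hypothetical
names made up for illustration, and the wraparound window of 1 << 23 is an
assumption of this sketch rather than something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these exist in the kernel. */
enum sched_action {
	RUN_NOW,	/* hand the work to the worker thread immediately */
	WAIT_VBLANK,	/* keep it on the pending list until a vblank fires */
};

/*
 * Wraparound-tolerant "has cur reached target?" comparison on unsigned
 * counters. The window size (1 << 23) is an assumption of this sketch.
 */
static bool vblank_passed_sketch(uint64_t cur, uint64_t target)
{
	return (cur - target) <= (UINT64_C(1) << 23);
}

/*
 * Mirrors the documented behaviour: a target that was already reached runs
 * immediately unless nextonmiss asks to defer it to the next vblank.
 */
static enum sched_action decide_sketch(uint64_t cur, uint64_t target,
					bool nextonmiss)
{
	if (vblank_passed_sketch(cur, target) && !nextonmiss)
		return RUN_NOW;
	return WAIT_VBLANK;
}
```

With nextonmiss set and the target already reached, the work stays pending and
is picked up from the next vblank interrupt, which is what the kernel-doc
describes as behaving "as if @count has been specified as crtc vblank count + 1".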