Thomas Hellström (Intel)
2023-Jan-19 07:42 UTC
[Nouveau] [PATCH drm-next 13/14] drm/nouveau: implement new VM_BIND UAPI
On 1/19/23 05:58, Matthew Brost wrote:
> On Thu, Jan 19, 2023 at 04:44:23AM +0100, Danilo Krummrich wrote:
>> On 1/18/23 21:37, Thomas Hellström (Intel) wrote:
>>> On 1/18/23 07:12, Danilo Krummrich wrote:
>>>> This commit provides the implementation for the new uapi motivated
>>>> by the Vulkan API. It allows user mode drivers (UMDs) to:
>>>>
>>>> 1) Initialize a GPU virtual address (VA) space via the new
>>>>    DRM_IOCTL_NOUVEAU_VM_INIT ioctl for UMDs to specify the portion
>>>>    of VA space managed by the kernel and userspace, respectively.
>>>>
>>>> 2) Allocate and free a VA space region as well as bind and unbind
>>>>    memory to the GPU's VA space via the new
>>>>    DRM_IOCTL_NOUVEAU_VM_BIND ioctl. UMDs can request the named
>>>>    operations to be processed either synchronously or
>>>>    asynchronously. It supports DRM syncobjs (incl. timelines) as
>>>>    synchronization mechanism. The management of the GPU VA mappings
>>>>    is implemented with the DRM GPU VA manager.
>>>>
>>>> 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
>>>>    The execution happens asynchronously. It supports DRM syncobjs
>>>>    (incl. timelines) as synchronization mechanism. DRM GEM object
>>>>    locking is handled with drm_exec.
>>>>
>>>> Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC use the
>>>> DRM GPU scheduler for the asynchronous paths.
>>>>
>>>> Signed-off-by: Danilo Krummrich <dakr at redhat.com>
>>>> ---
>>>>  Documentation/gpu/driver-uapi.rst       |   3 +
>>>>  drivers/gpu/drm/nouveau/Kbuild          |   2 +
>>>>  drivers/gpu/drm/nouveau/Kconfig         |   2 +
>>>>  drivers/gpu/drm/nouveau/nouveau_abi16.c |  16 +
>>>>  drivers/gpu/drm/nouveau/nouveau_abi16.h |   1 +
>>>>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  23 +-
>>>>  drivers/gpu/drm/nouveau/nouveau_drv.h   |   9 +-
>>>>  drivers/gpu/drm/nouveau/nouveau_exec.c  | 310 ++++++++++
>>>>  drivers/gpu/drm/nouveau/nouveau_exec.h  |  55 ++
>>>>  drivers/gpu/drm/nouveau/nouveau_sched.c | 780 ++++++++++++++++++++++++
>>>>  drivers/gpu/drm/nouveau/nouveau_sched.h |  98 +++
>>>>  11 files changed, 1295 insertions(+), 4 deletions(-)
>>>>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
>>>>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
>>>>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
>>>>  create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
>>> ...
>>>> +static struct dma_fence *
>>>> +nouveau_bind_job_run(struct nouveau_job *job)
>>>> +{
>>>> +	struct nouveau_bind_job *bind_job = to_nouveau_bind_job(job);
>>>> +	struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(job->cli);
>>>> +	struct bind_job_op *op;
>>>> +	int ret = 0;
>>>> +
>>> I was looking at how nouveau does the async binding compared to how
>>> xe does it.
>>> It looks to me that this function being a scheduler run_job callback
>>> is the main part of the VM_BIND dma-fence signalling critical
>>> section for the job's done_fence and, if so, needs to be annotated
>>> as such?
>> Yes, that's the case.
>>
>>> For example, nouveau_uvma_region_new allocates memory, which is not
>>> allowed if in a dma_fence signalling critical section, and the
>>> locking also looks suspicious?
>> Thanks for pointing this out, I missed that somehow.
>>
>> I will change it to pre-allocate new regions, mappings and page
>> tables within the job's submit() function.
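(Just to illustrate: a minimal sketch of what such a pre-allocating
submit() could look like, using the types from this patch.
bind_job_op_prealloc() is a hypothetical helper, not something the patch
provides.)

static int
nouveau_bind_job_submit(struct nouveau_job *job)
{
	struct nouveau_bind_job *bind_job = to_nouveau_bind_job(job);
	struct bind_job_op *op;
	int ret;

	/* Still plain process context: allocating regions, mappings and
	 * page table backing store may block and may fail here.
	 */
	list_for_each_op(op, &bind_job->ops) {
		ret = bind_job_op_prealloc(job, op); /* hypothetical */
		if (ret)
			return ret;
	}

	/* run_job() then only consumes the pre-allocated state and stays
	 * allocation-free in the dma-fence signalling critical section.
	 */
	return 0;
}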
>>
> Yeah, that's basically what we do in Xe: in the IOCTL step allocate
> all the backing store for new page tables and populate the new page
> tables (these are not yet visible in the page table structure), and in
> the last step, which is executed after all the dependencies are
> satisfied, program all the leaf entries, making the new binding
> visible.
>
> We screwed this up by deferring most of the IOCTL to a worker, but we
> will fix this one way or another soon - either get rid of the worker,
> or introduce a type of sync that is signaled after the worker and
> publish the dma-fence in the worker. I'd like to close on this one
> soon.
>
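(Schematically, that three-step split could look like the following;
struct bind_state and all helpers here are hypothetical, purely to
illustrate the ordering.)

/* 1) + 2) in the IOCTL: everything that can allocate or fail. */
static int vm_bind_ioctl_step(struct bind_state *b)
{
	int ret;

	ret = alloc_pt_backing(b);  /* backing store for new page tables */
	if (ret)
		return ret;

	populate_new_pts(b);        /* not yet visible to the GPU */
	return 0;
}

/* 3) after all dependencies have signaled: flip the leaf entries,
 * making the new binding visible. No allocations, cannot fail.
 */
static void vm_bind_run_step(struct bind_state *b)
{
	program_leaf_entries(b);
}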
>> For the ops structures the drm_gpuva_manager allocates for reporting
>> the split/merge steps back to the driver I have ideas to entirely
>> avoid allocations, which also is a good thing in respect of
>> Christian's feedback regarding the huge amount of mapping requests
>> some applications seem to generate.
>>
> It should be fine to have allocations to report the split/merge steps,
> as this happens before a dma-fence is published, but yeah, avoiding
> extra allocs where possible is always better.
>
> Also, BTW, great work on the drm_gpuva_manager. We will most likely
> pick it up in Xe rather than open coding all of this as we currently
> do. We should probably start the port soon so we can contribute to the
> implementation and get both of our drivers upstream sooner.
>
>> Regarding the locking, anything specific that makes it look
>> suspicious to you?
>>
> I haven't looked into this too much, but almost certainly Thomas is
> suggesting that if you allocate memory anywhere under the
> nouveau_uvmm_lock, then you can't take this lock in the run_job()
> callback, as that is in the dma-fencing path.

Yes, that was what looked suspicious to me, although I haven't looked
at the code in detail either to say for sure. But starting by
annotating this with dma_fence_[begin | end]_signalling() would help
find all issues with this.

FWIW, by coincidence I discussed drm-scheduler dma-fence annotation
with Daniel Vetter yesterday, and it appears he has a patch-set to
enable that, at least for drivers that want to opt in. We should
probably try to get that merged; then we'd be able to catch this type
of thing earlier.
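(For illustration, a minimal sketch of such an annotation.
dma_fence_begin_signalling()/dma_fence_end_signalling() are the
existing lockdep annotations from <linux/dma-fence.h>, while
nouveau_bind_job_process() is a hypothetical helper standing in for the
actual op processing.)

static struct dma_fence *
nouveau_bind_job_run(struct nouveau_job *job)
{
	struct dma_fence *fence;
	bool cookie;

	/* Everything between begin/end is treated as part of the
	 * dma-fence signalling critical section; lockdep will then
	 * complain about memory allocations and about locks that are
	 * also held around allocations elsewhere.
	 */
	cookie = dma_fence_begin_signalling();

	fence = nouveau_bind_job_process(job);	/* hypothetical */

	dma_fence_end_signalling(cookie);

	return fence;
}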
Thanks,
Thomas

> Matt
>
>>> Thanks,
>>>
>>> Thomas
>>>
>>>
>>>> +	nouveau_uvmm_lock(uvmm);
>>>> +	list_for_each_op(op, &bind_job->ops) {
>>>> +		switch (op->op) {
>>>> +		case OP_ALLOC: {
>>>> +			bool sparse = op->flags & DRM_NOUVEAU_VM_BIND_SPARSE;
>>>> +
>>>> +			ret = nouveau_uvma_region_new(uvmm,
>>>> +						      op->va.addr,
>>>> +						      op->va.range,
>>>> +						      sparse);
>>>> +			if (ret)
>>>> +				goto out_unlock;
>>>> +			break;
>>>> +		}
>>>> +		case OP_FREE:
>>>> +			ret = nouveau_uvma_region_destroy(uvmm,
>>>> +							  op->va.addr,
>>>> +							  op->va.range);
>>>> +			if (ret)
>>>> +				goto out_unlock;
>>>> +			break;
>>>> +		case OP_MAP:
>>>> +			ret = nouveau_uvmm_sm_map(uvmm,
>>>> +						  op->va.addr, op->va.range,
>>>> +						  op->gem.obj, op->gem.offset,
>>>> +						  op->flags & 0xff);
>>>> +			if (ret)
>>>> +				goto out_unlock;
>>>> +			break;
>>>> +		case OP_UNMAP:
>>>> +			ret = nouveau_uvmm_sm_unmap(uvmm,
>>>> +						    op->va.addr,
>>>> +						    op->va.range);
>>>> +			if (ret)
>>>> +				goto out_unlock;
>>>> +			break;
>>>> +		}
>>>> +	}
>>>> +
>>>> +out_unlock:
>>>> +	nouveau_uvmm_unlock(uvmm);
>>>> +	if (ret)
>>>> +		NV_PRINTK(err, job->cli, "bind job failed: %d\n", ret);
>>>> +	return ERR_PTR(ret);
>>>> +}
>>>> +
>>>> +static void
>>>> +nouveau_bind_job_free(struct nouveau_job *job)
>>>> +{
>>>> +	struct nouveau_bind_job *bind_job = to_nouveau_bind_job(job);
>>>> +	struct bind_job_op *op, *next;
>>>> +
>>>> +	list_for_each_op_safe(op, next, &bind_job->ops) {
>>>> +		struct drm_gem_object *obj = op->gem.obj;
>>>> +
>>>> +		if (obj)
>>>> +			drm_gem_object_put(obj);
>>>> +
>>>> +		list_del(&op->entry);
>>>> +		kfree(op);
>>>> +	}
>>>> +
>>>> +	nouveau_base_job_free(job);
>>>> +	kfree(bind_job);
>>>> +}
>>>> +
>>>> +static struct nouveau_job_ops nouveau_bind_job_ops = {
>>>> +	.submit = nouveau_bind_job_submit,
>>>> +	.run = nouveau_bind_job_run,
>>>> +	.free = nouveau_bind_job_free,
>>>> +};
>>>> +
>>>> +static int
>>>> +bind_job_op_from_uop(struct bind_job_op **pop,
>>>> +		     struct drm_nouveau_vm_bind_op *uop)
>>>> +{
>>>> +	struct bind_job_op *op;
>>>> +
>>>> +	op = *pop = kzalloc(sizeof(*op), GFP_KERNEL);
>>>> +	if (!op)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	op->op = uop->op;
>>>> +	op->flags = uop->flags;
>>>> +	op->va.addr = uop->addr;
>>>> +	op->va.range = uop->range;
>>>> +
>>>> +	if (op->op == DRM_NOUVEAU_VM_BIND_OP_MAP) {
>>>> +		op->gem.handle = uop->handle;
>>>> +		op->gem.offset = uop->bo_offset;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +bind_job_ops_free(struct list_head *ops)
>>>> +{
>>>> +	struct bind_job_op *op, *next;
>>>> +
>>>> +	list_for_each_op_safe(op, next, ops) {
>>>> +		list_del(&op->entry);
>>>> +		kfree(op);
>>>> +	}
>>>> +}
>>>> +
>>>> +int
>>>> +nouveau_bind_job_init(struct nouveau_bind_job **pjob,
>>>> +		      struct nouveau_exec_bind *bind)
>>>> +{
>>>> +	struct nouveau_bind_job *job;
>>>> +	struct bind_job_op *op;
>>>> +	int i, ret;
>>>> +
>>>> +	job = *pjob = kzalloc(sizeof(*job), GFP_KERNEL);
>>>> +	if (!job)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	INIT_LIST_HEAD(&job->ops);
>>>> +
>>>> +	for (i = 0; i < bind->op.count; i++) {
>>>> +		ret = bind_job_op_from_uop(&op, &bind->op.s[i]);
>>>> +		if (ret)
>>>> +			goto err_free;
>>>> +
>>>> +		list_add_tail(&op->entry, &job->ops);
>>>> +	}
>>>> +
>>>> +	job->base.sync = !(bind->flags & DRM_NOUVEAU_VM_BIND_RUN_ASYNC);
>>>> +	job->base.ops = &nouveau_bind_job_ops;
>>>> +
>>>> +	ret = nouveau_base_job_init(&job->base, &bind->base);
>>>> +	if (ret)
>>>> +		goto err_free;
>>>> +
>>>> +	return 0;
>>>> +
>>>> +err_free:
>>>> +	bind_job_ops_free(&job->ops);
>>>> +	kfree(job);
>>>> +	*pjob = NULL;
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static int
>>>> +sync_find_fence(struct nouveau_job *job,
>>>> +		struct drm_nouveau_sync *sync,
>>>> +		struct dma_fence **fence)
>>>> +{
>>>> +	u32 stype = sync->flags & DRM_NOUVEAU_SYNC_TYPE_MASK;
>>>> +	u64 point = 0;
>>>> +	int ret;
>>>> +
>>>> +	if (stype != DRM_NOUVEAU_SYNC_SYNCOBJ &&
>>>> +	    stype != DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ)
>>>> +		return -EOPNOTSUPP;
>>>> +
>>>> +	if (stype == DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ)
>>>> +		point = sync->timeline_value;
>>>> +
>>>> +	ret = drm_syncobj_find_fence(job->file_priv,
>>>> +				     sync->handle, point,
>>>> +				     sync->flags, fence);
>>>> +	if (ret)
>>>> +		return ret;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +exec_job_binds_wait(struct nouveau_job *job)
>>>> +{
>>>> +	struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
>>>> +	struct nouveau_cli *cli = exec_job->base.cli;
>>>> +	struct nouveau_sched_entity *bind_entity = &cli->sched_entity;
>>>> +	signed long ret;
>>>> +	int i;
>>>> +
>>>> +	for (i = 0; i < job->in_sync.count; i++) {
>>>> +		struct nouveau_job *it;
>>>> +		struct drm_nouveau_sync *sync = &job->in_sync.s[i];
>>>> +		struct dma_fence *fence;
>>>> +		bool found;
>>>> +
>>>> +		ret = sync_find_fence(job, sync, &fence);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +
>>>> +		mutex_lock(&bind_entity->job.mutex);
>>>> +		found = false;
>>>> +		list_for_each_entry(it, &bind_entity->job.list, head) {
>>>> +			if (fence == it->done_fence) {
>>>> +				found = true;
>>>> +				break;
>>>> +			}
>>>> +		}
>>>> +		mutex_unlock(&bind_entity->job.mutex);
>>>> +
>>>> +		/* If the fence is not from a VM_BIND job, don't wait for it. */
>>>> +		if (!found)
>>>> +			continue;
>>>> +
>>>> +		ret = dma_fence_wait_timeout(fence, true,
>>>> +					     msecs_to_jiffies(500));
>>>> +		if (ret < 0)
>>>> +			return ret;
>>>> +		else if (ret == 0)
>>>> +			return -ETIMEDOUT;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +int
>>>> +nouveau_exec_job_submit(struct nouveau_job *job)
>>>> +{
>>>> +	struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
>>>> +	struct nouveau_cli *cli = exec_job->base.cli;
>>>> +	struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(cli);
>>>> +	struct drm_exec *exec = &job->exec;
>>>> +	struct drm_gem_object *obj;
>>>> +	unsigned long index;
>>>> +	int ret;
>>>> +
>>>> +	ret = exec_job_binds_wait(job);
>>>> +	if (ret)
>>>> +		return ret;
>>>> +
>>>> +	nouveau_uvmm_lock(uvmm);
>>>> +	drm_exec_while_not_all_locked(exec) {
>>>> +		struct drm_gpuva *va;
>>>> +
>>>> +		drm_gpuva_for_each_va(va, &uvmm->umgr) {
>>>> +			ret = drm_exec_prepare_obj(exec, va->gem.obj, 1);
>>>> +			drm_exec_break_on_contention(exec);
>>>> +			if (ret)
>>>> +				return ret;
>>>> +		}
>>>> +	}
>>>> +	nouveau_uvmm_unlock(uvmm);
>>>> +
>>>> +	drm_exec_for_each_locked_object(exec, index, obj) {
>>>> +		struct dma_resv *resv = obj->resv;
>>>> +		struct nouveau_bo *nvbo = nouveau_gem_object(obj);
>>>> +
>>>> +		ret = nouveau_bo_validate(nvbo, true, false);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +
>>>> +		dma_resv_add_fence(resv, job->done_fence, DMA_RESV_USAGE_WRITE);
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static struct dma_fence *
>>>> +nouveau_exec_job_run(struct nouveau_job *job)
>>>> +{
>>>> +	struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
>>>> +	struct nouveau_fence *fence;
>>>> +	int i, ret;
>>>> +
>>>> +	ret = nouveau_dma_wait(job->chan, exec_job->push.count + 1, 16);
>>>> +	if (ret) {
>>>> +		NV_PRINTK(err, job->cli, "nv50cal_space: %d\n", ret);
>>>> +		return ERR_PTR(ret);
>>>> +	}
>>>> +
>>>> +	for (i = 0; i < exec_job->push.count; i++) {
>>>> +		nv50_dma_push(job->chan, exec_job->push.s[i].va,
>>>> +			      exec_job->push.s[i].va_len);
>>>> +	}
>>>> +
>>>> +	ret = nouveau_fence_new(job->chan, false, &fence);
>>>> +	if (ret) {
>>>> +		NV_PRINTK(err, job->cli, "error fencing pushbuf: %d\n", ret);
>>>> +		WIND_RING(job->chan);
>>>> +		return ERR_PTR(ret);
>>>> +	}
>>>> +
>>>> +	return &fence->base;
>>>> +}
>>>> +
>>>> +static void
>>>> +nouveau_exec_job_free(struct nouveau_job *job)
>>>> +{
>>>> +	struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
>>>> +
>>>> +	nouveau_base_job_free(job);
>>>> +
>>>> +	kfree(exec_job->push.s);
>>>> +	kfree(exec_job);
>>>> +}
>>>> +
>>>> +static struct nouveau_job_ops nouveau_exec_job_ops = {
>>>> +	.submit = nouveau_exec_job_submit,
>>>> +	.run = nouveau_exec_job_run,
>>>> +	.free = nouveau_exec_job_free,
>>>> +};
>>>> +
>>>> +int
>>>> +nouveau_exec_job_init(struct nouveau_exec_job **pjob,
>>>> +		      struct nouveau_exec *exec)
>>>> +{
>>>> +	struct nouveau_exec_job *job;
>>>> +	int ret;
>>>> +
>>>> +	job = *pjob = kzalloc(sizeof(*job), GFP_KERNEL);
>>>> +	if (!job)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	job->push.count = exec->push.count;
>>>> +	job->push.s = kmemdup(exec->push.s,
>>>> +			      sizeof(*exec->push.s) *
>>>> +			      exec->push.count,
>>>> +			      GFP_KERNEL);
>>>> +	if (!job->push.s) {
>>>> +		ret = -ENOMEM;
>>>> +		goto err_free_job;
>>>> +	}
>>>> +
>>>> +	job->base.ops = &nouveau_exec_job_ops;
>>>> +	ret = nouveau_base_job_init(&job->base, &exec->base);
>>>> +	if (ret)
>>>> +		goto err_free_pushs;
>>>> +
>>>> +	return 0;
>>>> +
>>>> +err_free_pushs:
>>>> +	kfree(job->push.s);
>>>> +err_free_job:
>>>> +	kfree(job);
>>>> +	*pjob = NULL;
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +void nouveau_job_fini(struct nouveau_job *job)
>>>> +{
>>>> +	dma_fence_put(job->done_fence);
>>>> +	drm_sched_job_cleanup(&job->base);
>>>> +	job->ops->free(job);
>>>> +}
>>>> +
>>>> +static int
>>>> +nouveau_job_add_deps(struct nouveau_job *job)
>>>> +{
>>>> +	struct dma_fence *in_fence = NULL;
>>>> +	int ret, i;
>>>> +
>>>> +	for (i = 0; i < job->in_sync.count; i++) {
>>>> +		struct drm_nouveau_sync *sync = &job->in_sync.s[i];
>>>> +
>>>> +		ret = sync_find_fence(job, sync, &in_fence);
>>>> +		if (ret) {
>>>> +			NV_PRINTK(warn, job->cli,
>>>> +				  "Failed to find syncobj (-> in): handle=%d\n",
>>>> +				  sync->handle);
>>>> +			return ret;
>>>> +		}
>>>> +
>>>> +		ret = drm_sched_job_add_dependency(&job->base, in_fence);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +nouveau_job_fence_attach(struct nouveau_job *job, struct dma_fence *fence)
>>>> +{
>>>> +	struct drm_syncobj *out_sync;
>>>> +	int i;
>>>> +
>>>> +	for (i = 0; i < job->out_sync.count; i++) {
>>>> +		struct drm_nouveau_sync *sync = &job->out_sync.s[i];
>>>> +		u32 stype = sync->flags & DRM_NOUVEAU_SYNC_TYPE_MASK;
>>>> +
>>>> +		if (stype != DRM_NOUVEAU_SYNC_SYNCOBJ &&
>>>> +		    stype != DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ)
>>>> +			return -EOPNOTSUPP;
>>>> +
>>>> +		out_sync = drm_syncobj_find(job->file_priv, sync->handle);
>>>> +		if (!out_sync) {
>>>> +			NV_PRINTK(warn, job->cli,
>>>> +				  "Failed to find syncobj (-> out): handle=%d\n",
>>>> +				  sync->handle);
>>>> +			return -ENOENT;
>>>> +		}
>>>> +
>>>> +		if (stype == DRM_NOUVEAU_SYNC_TIMELINE_SYNCOBJ) {
>>>> +			struct dma_fence_chain *chain;
>>>> +
>>>> +			chain = dma_fence_chain_alloc();
>>>> +			if (!chain) {
>>>> +				drm_syncobj_put(out_sync);
>>>> +				return -ENOMEM;
>>>> +			}
>>>> +
>>>> +			drm_syncobj_add_point(out_sync, chain, fence,
>>>> +					      sync->timeline_value);
>>>> +		} else {
>>>> +			drm_syncobj_replace_fence(out_sync, fence);
>>>> +		}
>>>> +
>>>> +		drm_syncobj_put(out_sync);
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static struct dma_fence *
>>>> +nouveau_job_run(struct nouveau_job *job)
>>>> +{
>>>> +	return job->ops->run(job);
>>>> +}
>>>> +
>>>> +static int
>>>> +nouveau_job_run_sync(struct nouveau_job *job)
>>>> +{
>>>> +	struct dma_fence *fence;
>>>> +	int ret;
>>>> +
>>>> +	fence = nouveau_job_run(job);
>>>> +	if (IS_ERR(fence)) {
>>>> +		return PTR_ERR(fence);
>>>> +	} else if (fence) {
>>>> +		ret = dma_fence_wait(fence, true);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +	}
>>>> +
>>>> +	dma_fence_signal(job->done_fence);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +int
>>>> +nouveau_job_submit(struct nouveau_job *job)
>>>> +{
>>>> +	struct nouveau_sched_entity *entity =
>>>> +		to_nouveau_sched_entity(job->base.entity);
>>>> +	int ret;
>>>> +
>>>> +	drm_exec_init(&job->exec, true);
>>>> +
>>>> +	ret = nouveau_job_add_deps(job);
>>>> +	if (ret)
>>>> +		goto out;
>>>> +
>>>> +	drm_sched_job_arm(&job->base);
>>>> +	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>> +
>>>> +	ret = nouveau_job_fence_attach(job, job->done_fence);
>>>> +	if (ret)
>>>> +		goto out;
>>>> +
>>>> +	if (job->ops->submit) {
>>>> +		ret = job->ops->submit(job);
>>>> +		if (ret)
>>>> +			goto out;
>>>> +	}
>>>> +
>>>> +	if (job->sync) {
>>>> +		drm_exec_fini(&job->exec);
>>>> +
>>>> +		/* We're requested to run a synchronous job, hence don't push
>>>> +		 * the job, bypassing the job scheduler, and execute the job's
>>>> +		 * run() function right away.
>>>> +		 *
>>>> +		 * As a consequence of bypassing the job scheduler we need to
>>>> +		 * handle fencing and job cleanup ourselves.
>>>> +		 */
>>>> +		ret = nouveau_job_run_sync(job);
>>>> +
>>>> +		/* If the job fails, the caller will do the cleanup for us. */
>>>> +		if (!ret)
>>>> +			nouveau_job_fini(job);
>>>> +
>>>> +		return ret;
>>>> +	} else {
>>>> +		mutex_lock(&entity->job.mutex);
>>>> +		drm_sched_entity_push_job(&job->base);
>>>> +		list_add_tail(&job->head, &entity->job.list);
>>>> +		mutex_unlock(&entity->job.mutex);
>>>> +	}
>>>> +
>>>> +out:
>>>> +	drm_exec_fini(&job->exec);
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static struct dma_fence *
>>>> +nouveau_sched_run_job(struct drm_sched_job *sched_job)
>>>> +{
>>>> +	struct nouveau_job *job = to_nouveau_job(sched_job);
>>>> +
>>>> +	return nouveau_job_run(job);
>>>> +}
>>>> +
>>>> +static enum drm_gpu_sched_stat
>>>> +nouveau_sched_timedout_job(struct drm_sched_job *sched_job)
>>>> +{
>>>> +	struct nouveau_job *job = to_nouveau_job(sched_job);
>>>> +	struct nouveau_channel *chan = job->chan;
>>>> +
>>>> +	if (unlikely(!atomic_read(&chan->killed)))
>>>> +		nouveau_channel_kill(chan);
>>>> +
>>>> +	NV_PRINTK(warn, job->cli, "job timeout, channel %d killed!\n",
>>>> +		  chan->chid);
>>>> +
>>>> +	nouveau_sched_entity_fini(job->entity);
>>>> +
>>>> +	return DRM_GPU_SCHED_STAT_ENODEV;
>>>> +}
>>>> +
>>>> +static void
>>>> +nouveau_sched_free_job(struct drm_sched_job *sched_job)
>>>> +{
>>>> +	struct nouveau_job *job = to_nouveau_job(sched_job);
>>>> +	struct nouveau_sched_entity *entity = job->entity;
>>>> +
>>>> +	mutex_lock(&entity->job.mutex);
>>>> +	list_del(&job->head);
>>>> +	mutex_unlock(&entity->job.mutex);
>>>> +
>>>> +	nouveau_job_fini(job);
>>>> +}
>>>> +
>>>> +int nouveau_sched_entity_init(struct nouveau_sched_entity *entity,
>>>> +			      struct drm_gpu_scheduler *sched)
>>>> +{
>>>> +	INIT_LIST_HEAD(&entity->job.list);
>>>> +	mutex_init(&entity->job.mutex);
>>>> +
>>>> +	return drm_sched_entity_init(&entity->base,
>>>> +				     DRM_SCHED_PRIORITY_NORMAL,
>>>> +				     &sched, 1, NULL);
>>>> +}
>>>> +
>>>> +void
>>>> +nouveau_sched_entity_fini(struct nouveau_sched_entity *entity)
>>>> +{
>>>> +	drm_sched_entity_destroy(&entity->base);
>>>> +}
>>>> +
>>>> +static const struct drm_sched_backend_ops nouveau_sched_ops = {
>>>> +	.run_job = nouveau_sched_run_job,
>>>> +	.timedout_job = nouveau_sched_timedout_job,
>>>> +	.free_job = nouveau_sched_free_job,
>>>> +};
>>>> +
>>>> +int nouveau_sched_init(struct drm_gpu_scheduler *sched,
>>>> +		       struct nouveau_drm *drm)
>>>> +{
>>>> +	long job_hang_limit =
>>>> +		msecs_to_jiffies(NOUVEAU_SCHED_JOB_TIMEOUT_MS);
>>>> +
>>>> +	return drm_sched_init(sched, &nouveau_sched_ops,
>>>> +			      NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
>>>> +			      NULL, NULL, "nouveau", drm->dev->dev);
>>>> +}
>>>> +
>>>> +void nouveau_sched_fini(struct drm_gpu_scheduler *sched)
>>>> +{
>>>> +	drm_sched_fini(sched);
>>>> +}
>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.h b/drivers/gpu/drm/nouveau/nouveau_sched.h
>>>> new file mode 100644
>>>> index 000000000000..7fc5b7eea810
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.h
>>>> @@ -0,0 +1,98 @@
>>>> +// SPDX-License-Identifier: MIT
>>>> +
>>>> +#ifndef NOUVEAU_SCHED_H
>>>> +#define NOUVEAU_SCHED_H
>>>> +
>>>> +#include <linux/types.h>
>>>> +
>>>> +#include <drm/drm_exec.h>
>>>> +#include <drm/gpu_scheduler.h>
>>>> +
>>>> +#include "nouveau_drv.h"
>>>> +#include "nouveau_exec.h"
>>>> +
>>>> +#define to_nouveau_job(sched_job)		\
>>>> +		container_of((sched_job), struct nouveau_job, base)
>>>> +
>>>> +#define to_nouveau_exec_job(job)		\
>>>> +		container_of((job), struct nouveau_exec_job, base)
>>>> +
>>>> +#define to_nouveau_bind_job(job)		\
>>>> +		container_of((job), struct nouveau_bind_job, base)
>>>> +
>>>> +struct nouveau_job {
>>>> +	struct drm_sched_job base;
>>>> +	struct list_head head;
>>>> +
>>>> +	struct nouveau_sched_entity *entity;
>>>> +
>>>> +	struct drm_file *file_priv;
>>>> +	struct nouveau_cli *cli;
>>>> +	struct nouveau_channel *chan;
>>>> +
>>>> +	struct drm_exec exec;
>>>> +	struct dma_fence *done_fence;
>>>> +
>>>> +	bool sync;
>>>> +
>>>> +	struct {
>>>> +		struct drm_nouveau_sync *s;
>>>> +		u32 count;
>>>> +	} in_sync;
>>>> +
>>>> +	struct {
>>>> +		struct drm_nouveau_sync *s;
>>>> +		u32 count;
>>>> +	} out_sync;
>>>> +
>>>> +	struct nouveau_job_ops {
>>>> +		int (*submit)(struct nouveau_job *);
>>>> +		struct dma_fence *(*run)(struct nouveau_job *);
>>>> +		void (*free)(struct nouveau_job *);
>>>> +	} *ops;
>>>> +};
>>>> +
>>>> +struct nouveau_exec_job {
>>>> +	struct nouveau_job base;
>>>> +
>>>> +	struct {
>>>> +		struct drm_nouveau_exec_push *s;
>>>> +		u32 count;
>>>> +	} push;
>>>> +};
>>>> +
>>>> +struct nouveau_bind_job {
>>>> +	struct nouveau_job base;
>>>> +
>>>> +	/* struct bind_job_op */
>>>> +	struct list_head ops;
>>>> +};
>>>> +
>>>> +int nouveau_bind_job_init(struct nouveau_bind_job **job,
>>>> +			  struct nouveau_exec_bind *bind);
>>>> +int nouveau_exec_job_init(struct nouveau_exec_job **job,
>>>> +			  struct nouveau_exec *exec);
>>>> +
>>>> +int nouveau_job_submit(struct nouveau_job *job);
>>>> +void nouveau_job_fini(struct nouveau_job *job);
>>>> +
>>>> +#define to_nouveau_sched_entity(entity)		\
>>>> +		container_of((entity), struct nouveau_sched_entity, base)
>>>> +
>>>> +struct nouveau_sched_entity {
>>>> +	struct drm_sched_entity base;
>>>> +	struct {
>>>> +		struct list_head list;
>>>> +		struct mutex mutex;
>>>> +	} job;
>>>> +};
>>>> +
>>>> +int nouveau_sched_entity_init(struct nouveau_sched_entity *entity,
>>>> +			      struct drm_gpu_scheduler *sched);
>>>> +void nouveau_sched_entity_fini(struct nouveau_sched_entity *entity);
>>>> +
>>>> +int nouveau_sched_init(struct drm_gpu_scheduler *sched,
>>>> +		       struct nouveau_drm *drm);
>>>> +void nouveau_sched_fini(struct drm_gpu_scheduler *sched);
>>>> +
>>>> +#endif