Pierre-Eric Pelloux-Prayer
2025-Jan-31 11:02 UTC
[PATCH v7 0/7] Improve gpu_scheduler trace events + uAPI
Hi,

The initial goal of this series was to improve the drm and amdgpu
trace events to expose more of the inner workings of the scheduler
and drivers to developers via tools.

The series has since narrowed its focus to gpu_scheduler only. The
changes around vblank events will be part of a different series, as
will the amdgpu ones.

Moreover, Sima suggested making some trace events stable uAPI, so
tools can rely on them long term.

The first patches extend and clean up the gpu scheduler events.
The last one adds a documentation entry in drm-uapi.rst.

Changes since v6:
* Addressed comments from Philipp, Tvrtko, Steven
* The commit storing drm_client_id in sched_fence adds an extra
  field which looks like a duplicate of the owner field. Currently
  amdgpu uses the owner field with some magic values, so we need
  two separate fields for now, but ultimately one could be removed.
  Similarly, storing the drm_client_id in the sched_entity is not
  really possible, as there's nothing preventing a driver from
  using a sched_entity with multiple drm_file instances.

Useful links:
- userspace tool using the updated events:
  https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37
- v6:
  https://lists.freedesktop.org/archives/dri-devel/2024-November/477644.html

Pierre-Eric Pelloux-Prayer (7):
  drm/debugfs: output client_id in in drm_clients_info
  drm/sched: store the drm client_id in drm_sched_fence
  drm/sched: add device name to the drm_sched_process_job event
  drm/sched: cleanup gpu_scheduler trace events
  drm/sched: trace dependencies for gpu jobs
  drm/sched: add the drm_client_id to the drm_sched_run/exec_job events
  drm/doc: document some tracepoints as uAPI

 Documentation/gpu/drm-uapi.rst               |  19 +++
 drivers/accel/amdxdna/aie2_ctx.c             |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c       |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c      |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h      |   3 +-
 drivers/gpu/drm/drm_debugfs.c                |  10 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |   2 +-
 drivers/gpu/drm/imagination/pvr_job.c        |   2 +-
 drivers/gpu/drm/imagination/pvr_queue.c      |   5 +-
 drivers/gpu/drm/imagination/pvr_queue.h      |   2 +-
 drivers/gpu/drm/lima/lima_gem.c              |   2 +-
 drivers/gpu/drm/lima/lima_sched.c            |   6 +-
 drivers/gpu/drm/lima/lima_sched.h            |   3 +-
 drivers/gpu/drm/msm/msm_gem_submit.c         |   8 +-
 drivers/gpu/drm/nouveau/nouveau_sched.c      |   3 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c      |   2 +-
 drivers/gpu/drm/panthor/panthor_drv.c        |   3 +-
 drivers/gpu/drm/panthor/panthor_mmu.c        |   2 +-
 drivers/gpu/drm/panthor/panthor_sched.c      |   5 +-
 drivers/gpu/drm/panthor/panthor_sched.h      |   3 +-
 .../gpu/drm/scheduler/gpu_scheduler_trace.h  | 123 ++++++++++++++----
 drivers/gpu/drm/scheduler/sched_entity.c     |   8 +-
 drivers/gpu/drm/scheduler/sched_fence.c      |   4 +-
 drivers/gpu/drm/scheduler/sched_main.c       |   8 +-
 drivers/gpu/drm/v3d/v3d_submit.c             |   2 +-
 drivers/gpu/drm/xe/xe_sched_job.c            |   3 +-
 include/drm/gpu_scheduler.h                  |  12 +-
 28 files changed, 192 insertions(+), 64 deletions(-)

-- 
2.47.1
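For a feel of what a consumer of these events looks like, below is a
minimal C sketch that enables the gpu_scheduler trace event group via
tracefs and streams the records a tool such as umr would parse. It
assumes tracefs is mounted at /sys/kernel/tracing and root privileges;
it is illustrative only and not part of the series.

/*
 * Minimal sketch: enable the gpu_scheduler trace event group and
 * stream the formatted records. Error handling is intentionally
 * minimal; a real tool would parse each record rather than print it.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	ssize_t n;
	int fd;

	/* Enable every event in the gpu_scheduler group. */
	fd = open("/sys/kernel/tracing/events/gpu_scheduler/enable", O_WRONLY);
	if (fd < 0)
		return 1;
	if (write(fd, "1", 1) != 1) {
		close(fd);
		return 1;
	}
	close(fd);

	/* Stream formatted trace records as they are emitted. */
	fd = open("/sys/kernel/tracing/trace_pipe", O_RDONLY);
	if (fd < 0)
		return 1;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		fwrite(buf, 1, n, stdout);
	close(fd);
	return 0;
}

The enable/trace_pipe layout is the same for any trace event system;
what the uAPI documentation in the last patch addresses is the
stability of the record contents a tool like this ends up parsing.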
Pierre-Eric Pelloux-Prayer
2025-Jan-31 11:03 UTC
[PATCH v7 2/7] drm/sched: store the drm client_id in drm_sched_fence
This will be used in a later commit to trace the drm client_id in
some of the gpu_scheduler trace events.

This requires changing all the users of drm_sched_job_init to
add an extra parameter.

The newly added drm_client_id field in the drm_sched_fence is a bit
of a duplicate of the owner one. One suggestion I received was to
merge those 2 fields - this can't be done right now as amdgpu uses
some special values (AMDGPU_FENCE_OWNER_*) that can't really be
translated into a client id. Christian is working on getting rid of
those; when it's done we should be able to squash owner/drm_client_id
together.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer at amd.com>
---
 drivers/accel/amdxdna/aie2_ctx.c             |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c       |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c      |  8 +++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h      |  3 ++-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  2 +-
 drivers/gpu/drm/imagination/pvr_job.c        |  2 +-
 drivers/gpu/drm/imagination/pvr_queue.c      |  5 +++--
 drivers/gpu/drm/imagination/pvr_queue.h      |  2 +-
 drivers/gpu/drm/lima/lima_gem.c              |  2 +-
 drivers/gpu/drm/lima/lima_sched.c            |  6 ++++--
 drivers/gpu/drm/lima/lima_sched.h            |  3 ++-
 drivers/gpu/drm/msm/msm_gem_submit.c         |  8 +++++---
 drivers/gpu/drm/nouveau/nouveau_sched.c      |  3 ++-
 drivers/gpu/drm/panfrost/panfrost_drv.c      |  2 +-
 drivers/gpu/drm/panthor/panthor_drv.c        |  3 ++-
 drivers/gpu/drm/panthor/panthor_mmu.c        |  2 +-
 drivers/gpu/drm/panthor/panthor_sched.c      |  5 +++--
 drivers/gpu/drm/panthor/panthor_sched.h      |  3 ++-
 drivers/gpu/drm/scheduler/sched_fence.c      |  4 +++-
 drivers/gpu/drm/scheduler/sched_main.c       |  6 ++++--
 drivers/gpu/drm/v3d/v3d_submit.c             |  2 +-
 drivers/gpu/drm/xe/xe_sched_job.c            |  3 ++-
 include/drm/gpu_scheduler.h                  | 12 ++++++++++--
 24 files changed, 61 insertions(+), 33 deletions(-)

diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
index 5f43db02b240..d33e62f1b127 100644
--- a/drivers/accel/amdxdna/aie2_ctx.c
+++ b/drivers/accel/amdxdna/aie2_ctx.c
@@ -819,7 +819,8 @@ int aie2_cmd_submit(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job *job,
 		goto up_sem;
 	}
 
-	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity, 1, hwctx);
+	ret = drm_sched_job_init(&job->base, &hwctx->priv->entity, 1, hwctx,
+				 hwctx->client->filp->client_id);
 	if (ret) {
 		XDNA_ERR(xdna, "DRM job init failed, ret %d", ret);
 		goto free_chain;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 2e5732dfd425..d480be803784 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -681,7 +681,7 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
 		goto err;
 	}
 
-	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job);
+	ret = amdgpu_job_alloc(adev, NULL, NULL, NULL, 1, &job, 0);
 	if (ret)
 		goto err;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 5cc5f59e3018..d115fa10f8ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -293,7 +293,8 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
 
 	for (i = 0; i < p->gang_size; ++i) {
 		ret = amdgpu_job_alloc(p->adev, vm, p->entities[i], vm,
-				       num_ibs[i], &p->jobs[i]);
+				       num_ibs[i], &p->jobs[i],
+				       p->filp->client_id);
 		if (ret)
 			goto free_all_kdata;
 		p->jobs[i]->enforce_isolation = p->adev->enforce_isolation[fpriv->xcp_id];
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 100f04475943..0a9a53f2e78a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -185,7 +185,8 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		     struct drm_sched_entity *entity, void *owner,
-		     unsigned int num_ibs, struct amdgpu_job **job)
+		     unsigned int num_ibs, struct amdgpu_job **job,
+		     u64 drm_client_id)
 {
 	if (num_ibs == 0)
 		return -EINVAL;
@@ -203,7 +204,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	if (!entity)
 		return 0;
 
-	return drm_sched_job_init(&(*job)->base, entity, 1, owner);
+	return drm_sched_job_init(&(*job)->base, entity, 1, owner,
+				  drm_client_id);
 }
 
 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
@@ -213,7 +215,7 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
 {
 	int r;
 
-	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job);
+	r = amdgpu_job_alloc(adev, NULL, entity, owner, 1, job, 0);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
index ce6b9ba967ff..5a8bc6342222 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
@@ -90,7 +90,8 @@ static inline struct amdgpu_ring *amdgpu_job_ring(struct amdgpu_job *job)
 
 int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		     struct drm_sched_entity *entity, void *owner,
-		     unsigned int num_ibs, struct amdgpu_job **job);
+		     unsigned int num_ibs, struct amdgpu_job **job,
+		     u64 drm_client_id);
 int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
 			     struct drm_sched_entity *entity, void *owner,
 			     size_t size, enum amdgpu_ib_pool_type pool_type,
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 3c0a5c3e0e3d..76c742328edb 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -534,7 +534,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 
 	ret = drm_sched_job_init(&submit->sched_job,
 				 &ctx->sched_entity[args->pipe],
-				 1, submit->ctx);
+				 1, submit->ctx, file->client_id);
 	if (ret)
 		goto err_submit_put;
 
diff --git a/drivers/gpu/drm/imagination/pvr_job.c b/drivers/gpu/drm/imagination/pvr_job.c
index 618503a212a7..64152b57e8b1 100644
--- a/drivers/gpu/drm/imagination/pvr_job.c
+++ b/drivers/gpu/drm/imagination/pvr_job.c
@@ -446,7 +446,7 @@ create_job(struct pvr_device *pvr_dev,
 	if (err)
 		goto err_put_job;
 
-	err = pvr_queue_job_init(job);
+	err = pvr_queue_job_init(job, pvr_file->file->client_id);
 	if (err)
 		goto err_put_job;
 
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index c4f08432882b..240d85e1006d 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -1059,6 +1059,7 @@ static int pvr_queue_cleanup_fw_context(struct pvr_queue *queue)
 /**
  * pvr_queue_job_init() - Initialize queue related fields in a pvr_job object.
  * @job: The job to initialize.
+ * @drm_client_id: drm_file.client_id submitting the job
  *
  * Bind the job to a queue and allocate memory to guarantee pvr_queue_job_arm()
  * and pvr_queue_job_push() can't fail. We also make sure the context type is
@@ -1068,7 +1069,7 @@ static int pvr_queue_cleanup_fw_context(struct pvr_queue *queue)
  * * 0 on success, or
  * * An error code if something failed.
  */
-int pvr_queue_job_init(struct pvr_job *job)
+int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id)
 {
 	/* Fragment jobs need at least one native fence wait on the geometry job fence. */
 	u32 min_native_dep_count = job->type == DRM_PVR_JOB_TYPE_FRAGMENT ? 1 : 0;
@@ -1085,7 +1086,7 @@ int pvr_queue_job_init(struct pvr_job *job)
 	if (!pvr_cccb_cmdseq_can_fit(&queue->cccb, job_cmds_size(job, min_native_dep_count)))
 		return -E2BIG;
 
-	err = drm_sched_job_init(&job->base, &queue->entity, 1, THIS_MODULE);
+	err = drm_sched_job_init(&job->base, &queue->entity, 1, THIS_MODULE, drm_client_id);
 	if (err)
 		return err;
 
diff --git a/drivers/gpu/drm/imagination/pvr_queue.h b/drivers/gpu/drm/imagination/pvr_queue.h
index e06ced69302f..0cfbe1cd311e 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.h
+++ b/drivers/gpu/drm/imagination/pvr_queue.h
@@ -139,7 +139,7 @@ struct pvr_queue {
 
 bool pvr_queue_fence_is_ufo_backed(struct dma_fence *f);
 
-int pvr_queue_job_init(struct pvr_job *job);
+int pvr_queue_job_init(struct pvr_job *job, u64 drm_client_id);
 
 void pvr_queue_job_cleanup(struct pvr_job *job);
 
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 9bb997dbb4b9..f46f961afc56 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -341,7 +341,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
 
 	err = lima_sched_task_init(
 		submit->task, submit->ctx->context + submit->pipe,
-		bos, submit->nr_bos, vm);
+		bos, submit->nr_bos, vm, file->client_id);
 	if (err)
 		goto err_out1;
 
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index b40c90e97d7e..ffc369dd4301 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -113,7 +113,8 @@ static inline struct lima_sched_pipe *to_lima_pipe(struct drm_gpu_scheduler *sch
 int lima_sched_task_init(struct lima_sched_task *task,
 			 struct lima_sched_context *context,
 			 struct lima_bo **bos, int num_bos,
-			 struct lima_vm *vm)
+			 struct lima_vm *vm,
+			 u64 drm_client_id)
 {
 	int err, i;
 
@@ -124,7 +125,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
 	for (i = 0; i < num_bos; i++)
 		drm_gem_object_get(&bos[i]->base.base);
 
-	err = drm_sched_job_init(&task->base, &context->base, 1, vm);
+	err = drm_sched_job_init(&task->base, &context->base, 1, vm,
+				 drm_client_id);
 	if (err) {
 		kfree(task->bos);
 		return err;
diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h
index 85b23ba901d5..1a08faf8a529 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -88,7 +88,8 @@ struct lima_sched_pipe {
 int lima_sched_task_init(struct lima_sched_task *task,
 			 struct lima_sched_context *context,
 			 struct lima_bo **bos, int num_bos,
-			 struct lima_vm *vm);
+			 struct lima_vm *vm,
+			 u64 drm_client_id);
 void lima_sched_task_fini(struct lima_sched_task *task);
 
 int lima_sched_context_init(struct lima_sched_pipe *pipe,
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index dee470403036..378119155673 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -30,7 +30,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 		struct msm_gpu *gpu,
 		struct msm_gpu_submitqueue *queue, uint32_t nr_bos,
-		uint32_t nr_cmds)
+		uint32_t nr_cmds, u64 drm_client_id)
 {
 	static atomic_t ident = ATOMIC_INIT(0);
 	struct msm_gem_submit *submit;
@@ -54,7 +54,8 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 		return ERR_PTR(ret);
 	}
 
-	ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue);
+	ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue,
+				 drm_client_id);
 	if (ret) {
 		kfree(submit->hw_fence);
 		kfree(submit);
@@ -693,7 +694,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 		}
 	}
 
-	submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds);
+	submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds,
+			       file->client_id);
 	if (IS_ERR(submit)) {
 		ret = PTR_ERR(submit);
 		goto out_post_unlock;
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 4412f2711fb5..ebc31adea39a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -87,7 +87,8 @@ nouveau_job_init(struct nouveau_job *job,
 	}
 
 	ret = drm_sched_job_init(&job->base, &sched->entity,
-				 args->credits, NULL);
+				 args->credits, NULL,
+				 job->file_priv->client_id);
 	if (ret)
 		goto err_free_chains;
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 0f3935556ac7..f4f20d11ecd9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -312,7 +312,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 
 	ret = drm_sched_job_init(&job->base,
 				 &file_priv->sched_entity[slot],
-				 1, NULL);
+				 1, NULL, file->client_id);
 	if (ret)
 		goto out_put_job;
 
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index d5dcd3d1b33a..bf8c92ebcc2d 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -988,7 +988,8 @@ static int panthor_ioctl_group_submit(struct drm_device *ddev, void *data,
 		const struct drm_panthor_queue_submit *qsubmit = &jobs_args[i];
 		struct drm_sched_job *job;
 
-		job = panthor_job_create(pfile, args->group_handle, qsubmit);
+		job = panthor_job_create(pfile, args->group_handle, qsubmit,
+					 file->client_id);
 		if (IS_ERR(job)) {
 			ret = PTR_ERR(job);
 			goto out_cleanup_submit_ctx;
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index c39e3eb1c15d..9c49b1aef759 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -2480,7 +2480,7 @@ panthor_vm_bind_job_create(struct drm_file *file,
 	kref_init(&job->refcount);
 	job->vm = panthor_vm_get(vm);
 
-	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm);
+	ret = drm_sched_job_init(&job->base, &vm->entity, 1, vm, file->client_id);
 	if (ret)
 		goto err_put_job;
 
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 77b184c3fb0c..b81c428bf66c 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -3672,7 +3672,8 @@ struct panthor_vm *panthor_job_vm(struct drm_sched_job *sched_job)
 struct drm_sched_job *
 panthor_job_create(struct panthor_file *pfile,
 		   u16 group_handle,
-		   const struct drm_panthor_queue_submit *qsubmit)
+		   const struct drm_panthor_queue_submit *qsubmit,
+		   u64 drm_client_id)
 {
 	struct panthor_group_pool *gpool = pfile->groups;
 	struct panthor_job *job;
@@ -3744,7 +3745,7 @@ panthor_job_create(struct panthor_file *pfile,
 
 	ret = drm_sched_job_init(&job->base,
 				 &job->group->queues[job->queue_idx]->entity,
-				 credits, job->group);
+				 credits, job->group, drm_client_id);
 	if (ret)
 		goto err_put_job;
 
diff --git a/drivers/gpu/drm/panthor/panthor_sched.h b/drivers/gpu/drm/panthor/panthor_sched.h
index 5ae6b4bde7c5..d80c040ebf13 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.h
+++ b/drivers/gpu/drm/panthor/panthor_sched.h
@@ -28,7 +28,8 @@ int panthor_group_get_state(struct panthor_file *pfile,
 struct drm_sched_job *
 panthor_job_create(struct panthor_file *pfile,
 		   u16 group_handle,
-		   const struct drm_panthor_queue_submit *qsubmit);
+		   const struct drm_panthor_queue_submit *qsubmit,
+		   u64 drm_client_id);
 struct drm_sched_job *panthor_job_get(struct drm_sched_job *job);
 struct panthor_vm *panthor_job_vm(struct drm_sched_job *sched_job);
 void panthor_job_put(struct drm_sched_job *job);
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 0f35f009b9d3..41c3d596603c 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -204,7 +204,8 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
 EXPORT_SYMBOL(to_drm_sched_fence);
 
 struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
-					      void *owner)
+					      void *owner,
+					      u64 drm_client_id)
 {
 	struct drm_sched_fence *fence = NULL;
 
@@ -213,6 +214,7 @@ struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
 		return NULL;
 
 	fence->owner = owner;
+	fence->drm_client_id = drm_client_id;
 	spin_lock_init(&fence->lock);
 
 	return fence;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 57da84908752..ad306d3d7282 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -776,6 +776,7 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
  * @credits: the number of credits this job contributes to the schedulers
  * credit limit
  * @owner: job owner for debugging
+ * @drm_client_id: drm_file.client_id of the owner
  *
  * Refer to drm_sched_entity_push_job() documentation
 * for locking considerations.
@@ -796,7 +797,8 @@ EXPORT_SYMBOL(drm_sched_resubmit_jobs);
 */
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
-		       u32 credits, void *owner)
+		       u32 credits, void *owner,
+		       uint64_t drm_client_id)
 {
 	if (!entity->rq) {
 		/* This will most likely be followed by missing frames
@@ -822,7 +824,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 	job->entity = entity;
 	job->credits = credits;
 
-	job->s_fence = drm_sched_fence_alloc(entity, owner);
+	job->s_fence = drm_sched_fence_alloc(entity, owner, drm_client_id);
 	if (!job->s_fence)
 		return -ENOMEM;
 
diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
index 4ff5de46fb22..5171ffe9012d 100644
--- a/drivers/gpu/drm/v3d/v3d_submit.c
+++ b/drivers/gpu/drm/v3d/v3d_submit.c
@@ -169,7 +169,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	job->file = file_priv;
 
 	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
-				 1, v3d_priv);
+				 1, v3d_priv, file_priv->client_id);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
index 1905ca590965..f4679cb9a56b 100644
--- a/drivers/gpu/drm/xe/xe_sched_job.c
+++ b/drivers/gpu/drm/xe/xe_sched_job.c
@@ -113,7 +113,8 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
 	kref_init(&job->refcount);
 	xe_exec_queue_get(job->q);
 
-	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL);
+	err = drm_sched_job_init(&job->drm, q->entity, 1, NULL,
+				 q->xef->drm->client_id);
 	if (err)
 		goto err_free;
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 95e17504e46a..7982ca8d7ca5 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -311,6 +311,13 @@ struct drm_sched_fence {
 	 * @owner: job owner for debugging
 	 */
 	void *owner;
+
+	/**
+	 * @drm_client_id:
+	 *
+	 * The client_id of the drm_file which owns the job.
+	 */
+	uint64_t drm_client_id;
 };
 
 struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
@@ -563,7 +570,8 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
-		       u32 credits, void *owner);
+		       u32 credits, void *owner,
+		       u64 drm_client_id);
 void drm_sched_job_arm(struct drm_sched_job *job);
 int drm_sched_job_add_dependency(struct drm_sched_job *job,
 				 struct dma_fence *fence);
@@ -624,7 +632,7 @@ bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 int drm_sched_entity_error(struct drm_sched_entity *entity);
 
 struct drm_sched_fence *drm_sched_fence_alloc(
-	struct drm_sched_entity *s_entity, void *owner);
+	struct drm_sched_entity *s_entity, void *owner, u64 drm_client_id);
 void drm_sched_fence_init(struct drm_sched_fence *fence,
 			  struct drm_sched_entity *entity);
 void drm_sched_fence_free(struct drm_sched_fence *fence);
-- 
2.47.1
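To connect this patch to the "later commit" mentioned above, here is
a minimal sketch of a tracepoint that reports the stored value. The
event name and field layout are illustrative assumptions, not the
events this series actually modifies, and the usual trace-header
boilerplate (TRACE_SYSTEM, define_trace.h inclusion) is omitted.

#include <linux/tracepoint.h>
#include <drm/gpu_scheduler.h>

TRACE_EVENT(drm_sched_job_client_id,
	    TP_PROTO(struct drm_sched_job *sched_job),
	    TP_ARGS(sched_job),
	    TP_STRUCT__entry(
			     __field(u64, id)
			     __field(u64, client_id)
			     ),
	    TP_fast_assign(
			   __entry->id = sched_job->id;
			   /* Read the value stored by drm_sched_fence_alloc(). */
			   __entry->client_id = sched_job->s_fence->drm_client_id;
			   ),
	    TP_printk("id=%llu client_id=%llu",
		      __entry->id, __entry->client_id)
);

Stashing the id in the sched fence rather than the entity is what
makes this lookup safe: as the cover letter notes, an entity may be
shared by several drm_file instances, while a fence belongs to one job.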
Christian König
2025-Jan-31 13:12 UTC
[PATCH v7 0/7] Improve gpu_scheduler trace events + uAPI
I can't see patch #1 in my inbox for some reason, but I already know
what it does from your repository.

Feel free to add Reviewed-by: Christian König <christian.koenig at amd.com>
to the entire series.

Regards,
Christian.