Marco Crivellari
2025-Oct-31 10:20 UTC
[PATCH 0/2] replaced old wq name, added WQ_PERCPU to alloc_workqueue
Hi,
=== Current situation: problems ===
Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.
This leads to different scenarios when a work item is scheduled on an
isolated CPU, depending on whether the "delay" value is 0 or greater than 0:
schedule_delayed_work(, 0);
This is handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
will move the timer to a housekeeping CPU and schedule the work there.
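The two cases above can be pictured with a small userspace model. This is only a sketch of the behaviour described in this cover letter, not kernel code: model_target_cpu(), struct model_system and the CPU numbers are hypothetical names introduced for illustration, and the real decision is made inside the workqueue and timer code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a nohz_full system; not a kernel structure. */
struct model_system {
	int housekeeping_cpu;	/* a CPU outside the isolated set */
};

/*
 * Model of where a per-cpu (!WQ_UNBOUND) delayed work item ends up when
 * it is scheduled from an isolated CPU, as described above:
 *  - delay == 0: queued directly on the local (isolated) CPU
 *  - delay  > 0: the timer is moved to a housekeeping CPU and the work
 *                is scheduled there
 */
int model_target_cpu(const struct model_system *sys,
		     int local_isolated_cpu, unsigned long delay)
{
	assert(sys != NULL);

	if (delay == 0)
		return local_isolated_cpu;

	return sys->housekeeping_cpu;
}
```

With CPU 3 isolated and CPU 0 as the housekeeping CPU, the model returns 3 for a delay of 0 and 0 for a delay of 1, which is the inconsistency this series is working towards removing.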
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
=== Recent changes to the WQ API ===
The following commits introduced the recent changes to the Workqueue API:
- commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and
system_dfl_wq")
- commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
The old workqueues will be removed in a future release cycle.
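As a rough aid for reading the conversions, the renames can be written as a lookup table. This is a sketch only: model_new_wq_name() is a hypothetical helper, not a kernel API, and the system_wq pairing is taken from the first commit named above rather than from this series, which only converts system_unbound_wq.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative lookup from an old system workqueue name to its
 * replacement. Returns NULL when the name has no listed replacement.
 */
const char *model_new_wq_name(const char *old_name)
{
	static const struct {
		const char *old_name;
		const char *new_name;
	} renames[] = {
		{ "system_wq",         "system_percpu_wq" },
		{ "system_unbound_wq", "system_dfl_wq" },
	};
	size_t i;

	assert(old_name != NULL);

	for (i = 0; i < sizeof(renames) / sizeof(renames[0]); i++) {
		if (strcmp(old_name, renames[i].old_name) == 0)
			return renames[i].new_name;
	}
	return NULL;
}
```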
=== Changes introduced by this series ===
1) [P 1] Replace uses of system_unbound_wq
system_unbound_wq is to be used when locality is not required.
Because of that, system_unbound_wq has been replaced with
system_dfl_wq, to make it clear it should be used if locality
is not important.
2) [P 2] WQ_PERCPU added to alloc_workqueue()
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
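The rule behind change 2) can be sketched as a flag check in userspace C. Everything here is an assumption made for illustration: the MODEL_ flag values are placeholders rather than the kernel's, model_wq_flags_explicit() is a hypothetical helper, and treating "both flags set" as invalid is this sketch's reading, not something the series states.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Placeholder flag bits for this model only; the real values live in
 * include/linux/workqueue.h and may differ.
 */
#define MODEL_WQ_UNBOUND	(1u << 0)
#define MODEL_WQ_PERCPU		(1u << 1)

/*
 * Returns true when the flags name exactly one of the two placement
 * policies, i.e. the caller states explicitly whether the workqueue is
 * per-cpu or unbound instead of relying on the implicit default.
 */
bool model_wq_flags_explicit(unsigned int flags)
{
	bool unbound = (flags & MODEL_WQ_UNBOUND) != 0;
	bool percpu = (flags & MODEL_WQ_PERCPU) != 0;

	return unbound != percpu;
}
```

Under this model, the old nouveau call sites that passed 0 as flags are not explicit, while the converted ones passing the per-cpu flag are.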
Thanks!
Marco Crivellari (2):
drm/nouveau: replace use of system_unbound_wq with system_dfl_wq
drm/nouveau: WQ_PERCPU added to alloc_workqueue users
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_sched.c | 3 ++-
3 files changed, 4 insertions(+), 3 deletions(-)
--
2.51.0
Marco Crivellari
2025-Oct-31 10:20 UTC
[PATCH 1/2] drm/nouveau: replace use of system_unbound_wq with system_dfl_wq
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Adding system_dfl_wq to encourage its use when unbound work should be used.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj at kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari at suse.com>
---
drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index e97e39abf3a2..50b7aade5f0a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -2474,7 +2474,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
pm_runtime_get_noresume(dev->dev);
if (nonblock)
- queue_work(system_unbound_wq, &state->commit_work);
+ queue_work(system_dfl_wq, &state->commit_work);
else
nv50_disp_atomic_commit_tail(state);
--
2.51.0
Marco Crivellari
2025-Oct-31 10:20 UTC
[PATCH 2/2] drm/nouveau: WQ_PERCPU added to alloc_workqueue users
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again uses
WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they're needed and
reducing noise when CPUs are isolated.
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn't explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj at kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari at suse.com>
---
drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_sched.c | 3 ++-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 1527b801f013..5a2970ef27d4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -631,7 +631,7 @@ nouveau_drm_device_init(struct nouveau_drm *drm)
struct drm_device *dev = drm->dev;
int ret;
- drm->sched_wq = alloc_workqueue("nouveau_sched_wq_shared", 0,
+ drm->sched_wq = alloc_workqueue("nouveau_sched_wq_shared", WQ_PERCPU,
WQ_MAX_ACTIVE);
if (!drm->sched_wq)
return -ENOMEM;
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index e60f7892f5ce..79cf157ab2a5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -416,7 +416,8 @@ nouveau_sched_init(struct nouveau_sched *sched, struct nouveau_drm *drm,
int ret;
if (!wq) {
- wq = alloc_workqueue("nouveau_sched_wq_%d", 0, WQ_MAX_ACTIVE,
+ wq = alloc_workqueue("nouveau_sched_wq_%d", WQ_PERCPU,
+ WQ_MAX_ACTIVE,
current->pid);
if (!wq)
return -ENOMEM;
--
2.51.0
Marco Crivellari
2025-Dec-02 13:20 UTC
[PATCH 0/2] replaced old wq name, added WQ_PERCPU to alloc_workqueue
Hi,
On Fri, Oct 31, 2025 at 11:20 AM Marco Crivellari <marco.crivellari at suse.com> wrote:
> Marco Crivellari (2):
> drm/nouveau: replace use of system_unbound_wq with system_dfl_wq
> drm/nouveau: WQ_PERCPU added to alloc_workqueue users
>
> drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 3 ++-
> 3 files changed, 4 insertions(+), 3 deletions(-)
Gentle ping.
Thanks!
--
Marco Crivellari
L3 Support Engineer, Technology & Product