thr3ads.net - Nouveau - [Nouveau] Proposal for a DRM-compliant GPU command scheduler [Sep 2011]

Shinpei KATO

2011-Sep-15 23:20 UTC

[Nouveau] Proposal for a DRM-compliant GPU command scheduler

Hi,

I am the main developer of the TimeGraph GPU command scheduler, which was
presented at XDC 2011 in Chicago a few days ago.
Please let me propose this approach to scheduling GPU-accelerated processes
with DRM.

This GPU scheduler helps to prioritize and isolate multiple
GPU-accelerated processes executing concurrently, in order to protect
important GPU workloads in multi-tasking environments. It is designed and
implemented at the DRM level as a "drm_sched" component. Each
architecture-dependent driver (nouveau, radeon, i915, etc.) is required to
call the scheduling functions provided by this drm_sched component
accordingly. Nothing needs to be changed in user-space runtimes.
The priorities and resource limits can be specified through
"/proc/driver/drm_sched/#PID/{sched_policy,resv_policy,priority,runtime,period}"
and/or "/etc/drm_sched.spec". Other user interfaces could also be possible.
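
As a rough illustration of the /proc interface named above, the following
Python sketch builds the per-process attribute paths and writes a value to
one of them. The attribute names come from this message; everything else is
an assumption made for illustration: that "#PID" stands for a directory named
after the process ID, that values are written as plain text, and the helper
names attribute_path and write_attribute. The demo writes into a temporary
directory rather than the real /proc tree.

```python
import os
import tempfile

# Attribute names as listed in the proposal. The accepted value formats are
# not specified there, so values are written as plain strings (assumption).
ATTRIBUTES = ("sched_policy", "resv_policy", "priority", "runtime", "period")

DEFAULT_ROOT = "/proc/driver/drm_sched"


def attribute_path(pid, name, root=DEFAULT_ROOT):
    """Return the per-process path for one scheduler attribute."""
    if name not in ATTRIBUTES:
        raise ValueError(f"unknown attribute: {name}")
    # Assumption: "#PID" in the proposal is a directory named after the PID.
    return os.path.join(root, str(pid), name)


def write_attribute(pid, name, value, root=DEFAULT_ROOT):
    """Write one attribute value and return the path that was written."""
    path = attribute_path(pid, name, root)
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return path


if __name__ == "__main__":
    # A temporary directory stands in for /proc/driver/drm_sched here.
    with tempfile.TemporaryDirectory() as fake_root:
        os.makedirs(os.path.join(fake_root, "1234"))
        print(write_attribute(1234, "priority", 10, root=fake_root))
```

On a system running the patched kernel, the same call with the default root
would target the real per-process entry, provided the value format matches
what the module expects.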
The impact of prioritization and isolation on protecting important
GPU-accelerated processes from competing GPU workloads is quite significant
(e.g., 3-D games can run at a 10x~ faster rate with our GPU scheduler
when a heavy workload is competing for the GPU). Performance interference
among GPU-accelerated processes can also be well controlled.
The GPU scheduler can be activated only when multiple processes use the GPU,
so *nothing* changes for a standalone execution.
I felt that the audience at XDC 2011 was pretty supportive of this idea.

TimeGraph is developed in a collaborative project with Carnegie Mellon
University, University of California Santa Cruz, and University of Tokyo.
The project website is: http://rtml.ece.cmu.edu/projects/timegraph/

The documentation about how it works is:
http://rtml.ece.cmu.edu/projects/timegraph/raw-attachment/wiki/documentation/drm_sched_rtgpu.pdf
More information is available at:
http://rtml.ece.cmu.edu/projects/timegraph/wiki/documentation

The instructions for installing our Nouveau-based prototype driver are at:
http://rtml.ece.cmu.edu/projects/timegraph/wiki/install
For convenience of development, the GPU scheduler is provided as an
independent kernel module (https://gitorious.org/rtgpu/timegraph), but it
can also be part of DRM.
You will also need the Nouveau-tree Linux kernel patched for drm_sched
(https://gitorious.org/rtgpu/linux-rtgpu). Please see the instructions above.
Only a few changes are applied to the current kernel code, as you can
quickly see at:
http://rtml.ece.cmu.edu/projects/timegraph/attachment/wiki/install/linux-rtgpu.patch

I would appreciate any comments and feedback from you.

Best Regards,
- Shinpei Kato

Daniel Vetter

2011-Sep-16 14:58 UTC

[Nouveau] Proposal for a DRM-compliant GPU command scheduler

Hi,

I didn't attend xdc in chicago, but I've read through your slides and
looked a bit at the code. We're planning to implement gpu scheduling for
intel gpus, too, so I'm pretty interested in this area. Comments:

- your code seems to be rather tightly integrated with how nvidia hw works.
  I'm not sure whether it's a good fit for other hw, especially if/when
  there's better support from the hw for context switching.

- by the looks of it, scheduling happens via: gpu completion irq handler
  -> realtime thread -> waking up of the blocked process that got put to
  sleep before command submission. There's also a fastpath that does not
  block the command submission if the scheduler allows the process to run
  immediately. I fear this has quite high overhead and the scheduler
  design doesn't seem to allow clever tricks like issuing gpu batchbuffers
  eagerly and patching up execution after the fact (if e.g. a previous
  batchbuffer used up too much time). Your argument that the scheduler
  completely disables itself is also a bit void - contemporary desktop and
  mobile systems always have multiple clients: A compositor and the
  clients, sometimes there's even an X process rearing its ugly head ;-)

- Imo your code needs quite some clean-up. I've noticed e.g. that it
  doesn't use the linux struct list_head functions. It also seems to
  re-implement a waitqueue/completions in a (racy) way.

- Imo the area that would most benefit from a shared gpu scheduler
  infrastructure is the userspace interface, so that a common set of tools
  can be used across different drivers. Your solution for configuring the
  scheduler seems to be to read a file in /etc from the kernel module
  which is ... a bit ugly, to say the least.
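
To make the submission path from the second point concrete, here is a toy
user-space model in Python. It is not TimeGraph's code: the class name
ToyGpuScheduler, the rule that a larger number means higher priority, and
the FIFO tie-break are all assumptions. It only mirrors the flow as read
from the code: a submission proceeds at once if the GPU is idle (the
fastpath), otherwise the submitter sleeps until a completion event wakes the
highest-priority waiter.

```python
import heapq
import itertools


class ToyGpuScheduler:
    """Toy model in which one process's commands occupy the GPU at a time."""

    def __init__(self):
        self.running = None             # pid whose commands occupy the GPU
        self.waiting = []               # min-heap of (-priority, seq, pid)
        self._seq = itertools.count()   # FIFO tie-break within a priority

    def submit(self, pid, priority):
        """Return True if the submission proceeds at once (fastpath),
        False if the submitting process would be put to sleep."""
        if self.running is None:
            self.running = pid
            return True
        # Assumption: a larger number means a higher priority.
        heapq.heappush(self.waiting, (-priority, next(self._seq), pid))
        return False

    def on_completion(self):
        """Model the completion interrupt: wake the highest-priority sleeper,
        mark it as running, and return its pid (None if nobody is waiting)."""
        if self.waiting:
            _, _, pid = heapq.heappop(self.waiting)
            self.running = pid
        else:
            self.running = None
        return self.running


if __name__ == "__main__":
    sched = ToyGpuScheduler()
    print(sched.submit(100, priority=1))   # GPU idle, so the fastpath is taken
    print(sched.submit(200, priority=5))   # GPU busy, so the submitter sleeps
    print(sched.submit(300, priority=9))   # GPU busy, so the submitter sleeps
    print(sched.on_completion())           # highest-priority waiter is woken
```

Every blocked submission in this model costs a sleep and a wake-up per
command group, which is the overhead in question; eager submission with
after-the-fact correction would not fit this structure without changes.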

In short it'd be awesome if you can help in creating a gpu scheduler
infrastructure for linux. But unfortunately your current code is imo
pretty far away from something that could be merged.

Yours, Daniel

-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48