thr3ads.net - Nouveau - [Nouveau] CUDA fixed VA allocations and sparse mappings [Jul 2015]

C Bergström

2015-Jul-08 00:07 UTC

[Nouveau] CUDA fixed VA allocations and sparse mappings

On Wed, Jul 8, 2015 at 6:58 AM, Ben Skeggs <skeggsb at gmail.com> wrote:
> On 8 July 2015 at 09:53, C Bergström <cbergstrom at pathscale.com> wrote:
>> regarding
>> --------
>> Fixed address allocations weren't going to be part of that, but I see
>> that it makes sense for a variety of use cases.  One question I have
>> here is how this is intended to work where the RM needs to make some
>> of these allocations itself (for graphics context mapping, etc), how
>> should potential conflicts with user mappings be handled?
>> --------
>> As an initial implementation you can probably assume that the GPU
>> offloading is in "exclusive" mode. Basically that the CUDA or OpenACC
>> code has full ownership of the card. The Tesla cards don't even have a
>> video out on them. To complicate this even more - some offloading code
>> has very long running kernels and even worse - may critically depend
>> on using the full available GPU RAM. (Large matrix sizes and soon big
>> Fortran arrays or complex data types)
> This doesn't change that, to set up the graphics engine, the driver
> needs to map various system-use data structures into the channel's
> address space *somewhere* :)

I'm not sure I follow exactly what you mean, but I think the answer is
- don't set up the graphics engine if you're in "compute" mode. Doing
that, iiuc, will at least provide a start to support for compute.
Anyone who argues that graphics+compute is critical to have working at
the same time is probably a 1%.

Ilia Mirkin

2015-Jul-08 00:08 UTC

On Tue, Jul 7, 2015 at 8:07 PM, C Bergström <cbergstrom at pathscale.com> wrote:
> On Wed, Jul 8, 2015 at 6:58 AM, Ben Skeggs <skeggsb at gmail.com> wrote:
>> On 8 July 2015 at 09:53, C Bergström <cbergstrom at pathscale.com> wrote:
>>> regarding
>>> --------
>>> Fixed address allocations weren't going to be part of that, but I see
>>> that it makes sense for a variety of use cases.  One question I have
>>> here is how this is intended to work where the RM needs to make some
>>> of these allocations itself (for graphics context mapping, etc), how
>>> should potential conflicts with user mappings be handled?
>>> --------
>>> As an initial implementation you can probably assume that the GPU
>>> offloading is in "exclusive" mode. Basically that the CUDA or OpenACC
>>> code has full ownership of the card. The Tesla cards don't even have a
>>> video out on them. To complicate this even more - some offloading code
>>> has very long running kernels and even worse - may critically depend
>>> on using the full available GPU RAM. (Large matrix sizes and soon big
>>> Fortran arrays or complex data types)
>> This doesn't change that, to set up the graphics engine, the driver
>> needs to map various system-use data structures into the channel's
>> address space *somewhere* :)
>
> I'm not sure I follow exactly what you mean, but I think the answer is
> - don't set up the graphics engine if you're in "compute" mode. Doing
> that, iiuc, will at least provide a start to support for compute.
> Anyone who argues that graphics+compute is critical to have working at
> the same time is probably a 1%.

On NVIDIA GPUs, compute _is_ part of the graphics engine... aka PGRAPH.

C Bergström

2015-Jul-08 00:11 UTC

On Wed, Jul 8, 2015 at 7:08 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> On Tue, Jul 7, 2015 at 8:07 PM, C Bergström <cbergstrom at pathscale.com> wrote:
>> On Wed, Jul 8, 2015 at 6:58 AM, Ben Skeggs <skeggsb at gmail.com> wrote:
>>> On 8 July 2015 at 09:53, C Bergström <cbergstrom at pathscale.com> wrote:
>>>> regarding
>>>> --------
>>>> Fixed address allocations weren't going to be part of that, but I see
>>>> that it makes sense for a variety of use cases.  One question I have
>>>> here is how this is intended to work where the RM needs to make some
>>>> of these allocations itself (for graphics context mapping, etc), how
>>>> should potential conflicts with user mappings be handled?
>>>> --------
>>>> As an initial implementation you can probably assume that the GPU
>>>> offloading is in "exclusive" mode. Basically that the CUDA or OpenACC
>>>> code has full ownership of the card. The Tesla cards don't even have a
>>>> video out on them. To complicate this even more - some offloading code
>>>> has very long running kernels and even worse - may critically depend
>>>> on using the full available GPU RAM. (Large matrix sizes and soon big
>>>> Fortran arrays or complex data types)
>>> This doesn't change that, to set up the graphics engine, the driver
>>> needs to map various system-use data structures into the channel's
>>> address space *somewhere* :)
>>
>> I'm not sure I follow exactly what you mean, but I think the answer is
>> - don't set up the graphics engine if you're in "compute" mode. Doing
>> that, iiuc, will at least provide a start to support for compute.
>> Anyone who argues that graphics+compute is critical to have working at
>> the same time is probably a 1%.
>
> On NVIDIA GPUs, compute _is_ part of the graphics engine... aka PGRAPH.

You can afaik set up PGRAPH without mapping memory for graphics. You
just init the engine and get out of the way.
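
The question that opens the thread, how fixed-address user mappings should coexist with mappings the driver (RM) makes for itself in the same channel address space, can be illustrated with a toy model. The sketch below is not Nouveau code and does not reflect any real driver or CUDA API; `VASpace`, `map_fixed`, and `map_anywhere` are hypothetical names, and the policy shown (reject a fixed request that overlaps any existing mapping) is only one possible answer, assumed here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VASpace:
    """Toy model of one channel's GPU virtual address space (hypothetical)."""
    size: int
    # Each entry is (start, end, owner), with end exclusive.
    mappings: list = field(default_factory=list)

    def _overlaps(self, start, end):
        # Two half-open ranges overlap when each starts before the other ends.
        return any(start < m_end and m_start < end
                   for m_start, m_end, _ in self.mappings)

    def map_fixed(self, start, length, owner):
        """Map [start, start + length) at a caller-chosen address, or fail."""
        end = start + length
        if length <= 0 or start < 0 or end > self.size:
            raise ValueError("range outside the address space")
        if self._overlaps(start, end):
            raise ValueError("range conflicts with an existing mapping")
        self.mappings.append((start, end, owner))
        return start

    def map_anywhere(self, length, owner):
        """First fit: place the mapping in the lowest gap that is big enough."""
        cursor = 0
        for m_start, m_end, _ in sorted(self.mappings):
            if m_start - cursor >= length:
                break
            cursor = max(cursor, m_end)
        return self.map_fixed(cursor, length, owner)


if __name__ == "__main__":
    va = VASpace(size=1 << 40)
    # The driver maps its own structures first, at an address it picks.
    driver_base = va.map_anywhere(0x10000, owner="driver")
    # A user request at a fixed address that is free succeeds.
    va.map_fixed(0x2000_0000, 0x1000, owner="user")
    # A user request that lands inside the driver's range is rejected.
    try:
        va.map_fixed(driver_base + 0x8000, 0x1000, owner="user")
    except ValueError as err:
        print("fixed mapping refused:", err)
```

Under this assumed policy, whoever maps first wins. A driver that wanted to guarantee room for its own structures could instead reserve a range up front, before any user mapping is accepted.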
