thr3ads.net - Nouveau - [PATCH v1 0/4] GPU Direct RDMA (P2P DMA) for Device Private Pages [Oct 2024]

Yonatan Maman

2024-Oct-16 15:16 UTC

[PATCH v1 0/4] GPU Direct RDMA (P2P DMA) for Device Private Pages

On 16/10/2024 7:23, Christoph Hellwig wrote:
> On Tue, Oct 15, 2024 at 06:23:44PM +0300, Yonatan Maman wrote:
>> From: Yonatan Maman <Ymaman at Nvidia.com>
>>
>> This patch series aims to enable Peer-to-Peer (P2P) DMA access in
>> GPU-centric applications that utilize RDMA and private device pages. This
>> enhancement is crucial for minimizing data transfer overhead by allowing
>> the GPU to directly expose device private page data to devices such as
>> NICs, eliminating the need to traverse system RAM, which is the native
>> method for exposing device private page data.
> 
> Please tone down your marketing language and explain your factual
> changes.  If you make performance claims back them by numbers.
> 
Got it, thanks! I'll fix that. Regarding performance, we're achieving
over 10x higher bandwidth and 10x lower latency using perftest-rdma,
especially with a high rate of GPU memory access.

Alistair Popple

2024-Oct-16 22:22 UTC

[PATCH v1 0/4] GPU Direct RDMA (P2P DMA) for Device Private Pages

Yonatan Maman <ymaman at nvidia.com> writes:
> On 16/10/2024 7:23, Christoph Hellwig wrote:
>> On Tue, Oct 15, 2024 at 06:23:44PM +0300, Yonatan Maman wrote:
>>> From: Yonatan Maman <Ymaman at Nvidia.com>
>>>
>>> This patch series aims to enable Peer-to-Peer (P2P) DMA access in
>>> GPU-centric applications that utilize RDMA and private device pages. This
>>> enhancement is crucial for minimizing data transfer overhead by allowing
>>> the GPU to directly expose device private page data to devices such as
>>> NICs, eliminating the need to traverse system RAM, which is the native
>>> method for exposing device private page data.
>> Please tone down your marketing language and explain your factual
>> changes.  If you make performance claims back them by numbers.
>> 
>
> Got it, thanks! I'll fix that. Regarding performance, we're achieving
> over 10x higher bandwidth and 10x lower latency using perftest-rdma,
> especially with a high rate of GPU memory access.

The performance claims still sound a bit vague. Please make sure you
include actual perftest-rdma performance numbers from before and after
applying this series when you repost.

Zhu Yanjun

2024-Oct-18 07:26 UTC

[PATCH v1 0/4] GPU Direct RDMA (P2P DMA) for Device Private Pages

On 2024/10/16 17:16, Yonatan Maman wrote:
>
> 
> On 16/10/2024 7:23, Christoph Hellwig wrote:
>> On Tue, Oct 15, 2024 at 06:23:44PM +0300, Yonatan Maman wrote:
>>> From: Yonatan Maman <Ymaman at Nvidia.com>
>>>
>>> This patch series aims to enable Peer-to-Peer (P2P) DMA access in
>>> GPU-centric applications that utilize RDMA and private device pages. This
>>> enhancement is crucial for minimizing data transfer overhead by allowing
>>> the GPU to directly expose device private page data to devices such as
>>> NICs, eliminating the need to traverse system RAM, which is the native
>>> method for exposing device private page data.
>>
>> Please tone down your marketing language and explain your factual
>> changes.  If you make performance claims back them by numbers.
>>
> 
> Got it, thanks! I'll fix that. Regarding performance, we're achieving
> over 10x higher bandwidth and 10x lower latency using perftest-rdma,
> especially with a high rate of GPU memory access.

If I understand this patch series correctly, it is based on ODP (On Demand
Paging). A non-ODP approach also exists. Judging from the following links,
that approach is implemented on efa, irdma and mlx5.
1. iRDMA
https://lore.kernel.org/all/20230217011425.498847-1-yanjun.zhu at intel.com/

2. efa
https://lore.kernel.org/lkml/20211007114018.GD2688930 at ziepe.ca/t/

3. mlx5
https://lore.kernel.org/all/1608067636-98073-5-git-send-email-jianxin.xiong at intel.com/

Since both methods are implemented on mlx5, have you compared
their test results on mlx5?

The most important results should be latency and bandwidth. Please let 
us know the test results.

Thanks a lot.
Zhu Yanjun
