Yonatan Maman
2024-Oct-16 15:16 UTC
[PATCH v1 0/4] GPU Direct RDMA (P2P DMA) for Device Private Pages
On 16/10/2024 7:23, Christoph Hellwig wrote:
> On Tue, Oct 15, 2024 at 06:23:44PM +0300, Yonatan Maman wrote:
>> From: Yonatan Maman <Ymaman at Nvidia.com>
>>
>> This patch series aims to enable Peer-to-Peer (P2P) DMA access in
>> GPU-centric applications that utilize RDMA and private device pages. This
>> enhancement is crucial for minimizing data transfer overhead by allowing
>> the GPU to directly expose device private page data to devices such as
>> NICs, eliminating the need to traverse system RAM, which is the native
>> method for exposing device private page data.
>
> Please tone down your marketing language and explain your factual
> changes. If you make performance claims back them by numbers.
>

Got it, thanks! I'll fix that. Regarding performance, we're achieving
over 10x higher bandwidth and 10x lower latency using perftest-rdma,
especially (with a high rate of GPU memory access).
Alistair Popple
2024-Oct-16 22:22 UTC
[PATCH v1 0/4] GPU Direct RDMA (P2P DMA) for Device Private Pages
Yonatan Maman <ymaman at nvidia.com> writes:

> On 16/10/2024 7:23, Christoph Hellwig wrote:
>> On Tue, Oct 15, 2024 at 06:23:44PM +0300, Yonatan Maman wrote:
>>> From: Yonatan Maman <Ymaman at Nvidia.com>
>>>
>>> This patch series aims to enable Peer-to-Peer (P2P) DMA access in
>>> GPU-centric applications that utilize RDMA and private device pages. This
>>> enhancement is crucial for minimizing data transfer overhead by allowing
>>> the GPU to directly expose device private page data to devices such as
>>> NICs, eliminating the need to traverse system RAM, which is the native
>>> method for exposing device private page data.
>> Please tone down your marketing language and explain your factual
>> changes. If you make performance claims back them by numbers.
>>
>
> Got it, thanks! I'll fix that. Regarding performance, we're achieving
> over 10x higher bandwidth and 10x lower latency using perftest-rdma,
> especially (with a high rate of GPU memory access).

The performance claims still sound a bit vague. Please make sure you
include actual perftest-rdma performance numbers from before and after
applying this series when you repost.
Zhu Yanjun
2024-Oct-18 07:26 UTC
[PATCH v1 0/4] GPU Direct RDMA (P2P DMA) for Device Private Pages
On 2024/10/16 17:16, Yonatan Maman wrote:
>
>
> On 16/10/2024 7:23, Christoph Hellwig wrote:
>> On Tue, Oct 15, 2024 at 06:23:44PM +0300, Yonatan Maman wrote:
>>> From: Yonatan Maman <Ymaman at Nvidia.com>
>>>
>>> This patch series aims to enable Peer-to-Peer (P2P) DMA access in
>>> GPU-centric applications that utilize RDMA and private device pages.
>>> This enhancement is crucial for minimizing data transfer overhead by
>>> allowing the GPU to directly expose device private page data to devices
>>> such as NICs, eliminating the need to traverse system RAM, which is the
>>> native method for exposing device private page data.
>>
>> Please tone down your marketing language and explain your factual
>> changes. If you make performance claims back them by numbers.
>>
>
> Got it, thanks! I'll fix that. Regarding performance, we're achieving
> over 10x higher bandwidth and 10x lower latency using perftest-rdma,
> especially (with a high rate of GPU memory access).

If I understand this patch series correctly, it is based on ODP (On
Demand Paging). A non-ODP approach also exists. According to the
following links, that approach is implemented on efa, irdma and mlx5.

1. iRDMA
https://lore.kernel.org/all/20230217011425.498847-1-yanjun.zhu at intel.com/

2. efa
https://lore.kernel.org/lkml/20211007114018.GD2688930 at ziepe.ca/t/

3. mlx5
https://lore.kernel.org/all/1608067636-98073-5-git-send-email-jianxin.xiong at intel.com/

Since both methods are implemented on mlx5, have you compared test
results for the two methods on mlx5? The most important results are
latency and bandwidth.

Please let us know the test results.

Thanks a lot.
Zhu Yanjun