thr3ads.net - Nouveau - [Nouveau] [PATCH 13/16] mm: support THP migration to device private memory [Jun 2020]

If this information is useful, please help other people find it:
Share via:

Yang Shi

2020-Jun-22 23:55 UTC

[Nouveau] [PATCH 13/16] mm: support THP migration to device private memory

On Mon, Jun 22, 2020 at 4:02 PM John Hubbard <jhubbard at nvidia.com>
wrote:>
> On 2020-06-22 15:33, Yang Shi wrote:
> > On Mon, Jun 22, 2020 at 3:30 PM Yang Shi <shy828301 at
gmail.com> wrote:
> >> On Mon, Jun 22, 2020 at 2:53 PM Zi Yan <ziy at nvidia.com>
wrote:
> >>> On 22 Jun 2020, at 17:31, Ralph Campbell wrote:
> >>>> On 6/22/20 1:10 PM, Zi Yan wrote:
> >>>>> On 22 Jun 2020, at 15:36, Ralph Campbell wrote:
> >>>>>> On 6/21/20 4:20 PM, Zi Yan wrote:
> >>>>>>> On 19 Jun 2020, at 17:56, Ralph Campbell
wrote:
> ...
> >>> Ying(cc?d) developed the code to swapout and swapin THP in one
piece: https://lore.kernel.org/linux-mm/20181207054122.27822-1-ying.huang at
intel.com/.
> >>> I am not sure whether the patchset makes into mainstream or
not. It could be a good technical reference
> >>> for swapping in device private pages, although swapping in
pages from disk and from device private
> >>> memory are two different scenarios.
> >>>
> >>> Since the device private memory swapin impacts core mm
performance, we might want to discuss your patches
> >>> with more people, like the ones from Ying?s patchset, in the
next version.
> >>
> >> I believe Ying will give you more insights about how THP swap
works.
> >>
> >> But, IMHO device memory migration (migrate to system memory) seems
> >> like THP CoW more than swap.
>
>
> A fine point: overall, the desired behavior is "migrate", not
CoW.
> That's important. Migrate means that you don't leave a page behind,
even
> a read-only one. And that's exactly how device private migration is
> specified.
>
> We should try to avoid any erosion of clarity here. Even if somehow
> (really?) the underlying implementation calls this THP CoW, the actual
> goal is to migrate pages over to the device (and back).
>
>
> >>
> >> When migrating in:
> >
> > Sorry for my fat finger, hit sent button inadvertently, let me finish
here.
> >
> > When migrating in:
> >
> >          - if THP is enabled: allocate THP, but need handle allocation
> > failure by falling back to base page
> >          - if THP is disabled: fallback to base page
> >
>
> OK, but *all* page entries (base and huge/large pages) need to be cleared,
> when migrating to device memory, unless I'm really confused here.
> So: not CoW.
I realized the comment caused more confusion. I apologize for the
confusion. Yes, the trigger condition for swap/migration and CoW are
definitely different. Here I mean the fault handling part of migrating
into system memory.

Swap-in just needs to handle the base page case since THP swapin is
not supported in upstream yet and the PMD is split in swap-out phase
(see shrink_page_list).

The patch adds THP migration support to device memory, but you need to
handle migrate in (back to system memory) case correctly. The fault
handling should look like THP CoW fault handling behavior (before
5.8):
    - if THP is enabled: allocate THP, fallback if allocation is failed
    - if THP is disabled: fallback to base page

Swap fault handling doesn't look like the above. So, I said it seems
like more THP CoW (fault handling part only before 5.8). I hope I
articulate my mind.

However, I didn't see such fallback is handled. It looks if THP
allocation is failed, it just returns SIGBUS; and no check about THP
status if I read the patches correctly. The THP might be disabled for
the specific vma or system wide before migrating from device memory
back to system memory.
>
> thanks,
> --
> John Hubbard
> NVIDIA

Ralph Campbell

2020-Jun-23 00:05 UTC

head link

[Nouveau] [PATCH 13/16] mm: support THP migration to device private memory

On 6/22/20 4:54 PM, Yang Shi wrote:> On Mon, Jun 22, 2020 at 4:02 PM John Hubbard <jhubbard at nvidia.com>
wrote:
>>
>> On 2020-06-22 15:33, Yang Shi wrote:
>>> On Mon, Jun 22, 2020 at 3:30 PM Yang Shi <shy828301 at
gmail.com> wrote:
>>>> On Mon, Jun 22, 2020 at 2:53 PM Zi Yan <ziy at
nvidia.com> wrote:
>>>>> On 22 Jun 2020, at 17:31, Ralph Campbell wrote:
>>>>>> On 6/22/20 1:10 PM, Zi Yan wrote:
>>>>>>> On 22 Jun 2020, at 15:36, Ralph Campbell wrote:
>>>>>>>> On 6/21/20 4:20 PM, Zi Yan wrote:
>>>>>>>>> On 19 Jun 2020, at 17:56, Ralph Campbell
wrote:
>> ...
>>>>> Ying(cc?d) developed the code to swapout and swapin THP in
one piece: https://lore.kernel.org/linux-mm/20181207054122.27822-1-ying.huang at
intel.com/.
>>>>> I am not sure whether the patchset makes into mainstream or
not. It could be a good technical reference
>>>>> for swapping in device private pages, although swapping in
pages from disk and from device private
>>>>> memory are two different scenarios.
>>>>>
>>>>> Since the device private memory swapin impacts core mm
performance, we might want to discuss your patches
>>>>> with more people, like the ones from Ying?s patchset, in
the next version.
>>>>
>>>> I believe Ying will give you more insights about how THP swap
works.
>>>>
>>>> But, IMHO device memory migration (migrate to system memory)
seems
>>>> like THP CoW more than swap.
>>
>>
>> A fine point: overall, the desired behavior is "migrate", not
CoW.
>> That's important. Migrate means that you don't leave a page
behind, even
>> a read-only one. And that's exactly how device private migration is
>> specified.
>>
>> We should try to avoid any erosion of clarity here. Even if somehow
>> (really?) the underlying implementation calls this THP CoW, the actual
>> goal is to migrate pages over to the device (and back).
>>
>>
>>>>
>>>> When migrating in:
>>>
>>> Sorry for my fat finger, hit sent button inadvertently, let me
finish here.
>>>
>>> When migrating in:
>>>
>>>           - if THP is enabled: allocate THP, but need handle
allocation
>>> failure by falling back to base page
>>>           - if THP is disabled: fallback to base page
>>>
>>
>> OK, but *all* page entries (base and huge/large pages) need to be
cleared,
>> when migrating to device memory, unless I'm really confused here.
>> So: not CoW.
> 
> I realized the comment caused more confusion. I apologize for the
> confusion. Yes, the trigger condition for swap/migration and CoW are
> definitely different. Here I mean the fault handling part of migrating
> into system memory.
> 
> Swap-in just needs to handle the base page case since THP swapin is
> not supported in upstream yet and the PMD is split in swap-out phase
> (see shrink_page_list).
> 
> The patch adds THP migration support to device memory, but you need to
> handle migrate in (back to system memory) case correctly. The fault
> handling should look like THP CoW fault handling behavior (before
> 5.8):
>      - if THP is enabled: allocate THP, fallback if allocation is failed
>      - if THP is disabled: fallback to base page
> 
> Swap fault handling doesn't look like the above. So, I said it seems
> like more THP CoW (fault handling part only before 5.8). I hope I
> articulate my mind.
> 
> However, I didn't see such fallback is handled. It looks if THP
> allocation is failed, it just returns SIGBUS; and no check about THP
> status if I read the patches correctly. The THP might be disabled for
> the specific vma or system wide before migrating from device memory
> back to system memory.
You are correct, the patch wasn't handling the fallback case.
I'll add that in the next version.
>>
>> thanks,
>> --
>> John Hubbard
>> NVIDIA

Huang, Ying

2020-Jun-23 02:51 UTC

head link

[Nouveau] [PATCH 13/16] mm: support THP migration to device private memory

Ralph Campbell <rcampbell at nvidia.com> writes:
> On 6/22/20 4:54 PM, Yang Shi wrote:
>> On Mon, Jun 22, 2020 at 4:02 PM John Hubbard <jhubbard at
nvidia.com> wrote:
>>>
>>> On 2020-06-22 15:33, Yang Shi wrote:
>>>> On Mon, Jun 22, 2020 at 3:30 PM Yang Shi <shy828301 at
gmail.com> wrote:
>>>>> On Mon, Jun 22, 2020 at 2:53 PM Zi Yan <ziy at
nvidia.com> wrote:
>>>>>> On 22 Jun 2020, at 17:31, Ralph Campbell wrote:
>>>>>>> On 6/22/20 1:10 PM, Zi Yan wrote:
>>>>>>>> On 22 Jun 2020, at 15:36, Ralph Campbell wrote:
>>>>>>>>> On 6/21/20 4:20 PM, Zi Yan wrote:
>>>>>>>>>> On 19 Jun 2020, at 17:56, Ralph
Campbell wrote:
>>> ...
>>>>>> Ying(cc?d) developed the code to swapout and swapin THP
in one piece: https://lore.kernel.org/linux-mm/20181207054122.27822-1-ying.huang
at intel.com/.
>>>>>> I am not sure whether the patchset makes into
mainstream or not. It could be a good technical reference
>>>>>> for swapping in device private pages, although swapping
in pages from disk and from device private
>>>>>> memory are two different scenarios.
>>>>>>
>>>>>> Since the device private memory swapin impacts core mm
performance, we might want to discuss your patches
>>>>>> with more people, like the ones from Ying?s patchset,
in the next version.
>>>>>
>>>>> I believe Ying will give you more insights about how THP
swap works.
>>>>>
>>>>> But, IMHO device memory migration (migrate to system
memory) seems
>>>>> like THP CoW more than swap.
>>>
>>>
>>> A fine point: overall, the desired behavior is "migrate",
not CoW.
>>> That's important. Migrate means that you don't leave a page
behind, even
>>> a read-only one. And that's exactly how device private
migration is
>>> specified.
>>>
>>> We should try to avoid any erosion of clarity here. Even if somehow
>>> (really?) the underlying implementation calls this THP CoW, the
actual
>>> goal is to migrate pages over to the device (and back).
>>>
>>>
>>>>>
>>>>> When migrating in:
>>>>
>>>> Sorry for my fat finger, hit sent button inadvertently, let me
finish here.
>>>>
>>>> When migrating in:
>>>>
>>>>           - if THP is enabled: allocate THP, but need handle
allocation
>>>> failure by falling back to base page
>>>>           - if THP is disabled: fallback to base page
>>>>
>>>
>>> OK, but *all* page entries (base and huge/large pages) need to be
cleared,
>>> when migrating to device memory, unless I'm really confused
here.
>>> So: not CoW.
>>
>> I realized the comment caused more confusion. I apologize for the
>> confusion. Yes, the trigger condition for swap/migration and CoW are
>> definitely different. Here I mean the fault handling part of migrating
>> into system memory.
>>
>> Swap-in just needs to handle the base page case since THP swapin is
>> not supported in upstream yet and the PMD is split in swap-out phase
>> (see shrink_page_list).
>>
>> The patch adds THP migration support to device memory, but you need to
>> handle migrate in (back to system memory) case correctly. The fault
>> handling should look like THP CoW fault handling behavior (before
>> 5.8):
>>      - if THP is enabled: allocate THP, fallback if allocation is
failed
>>      - if THP is disabled: fallback to base page
>>
>> Swap fault handling doesn't look like the above. So, I said it
seems
>> like more THP CoW (fault handling part only before 5.8). I hope I
>> articulate my mind.
>>
>> However, I didn't see such fallback is handled. It looks if THP
>> allocation is failed, it just returns SIGBUS; and no check about THP
>> status if I read the patches correctly. The THP might be disabled for
>> the specific vma or system wide before migrating from device memory
>> back to system memory.
>
> You are correct, the patch wasn't handling the fallback case.
> I'll add that in the next version.
For fallback, you need to split the THP in device firstly.  Because you
will migrate a base page instead a whole THP now.

Best Regards,
Huang, Ying
>>>
>>> thanks,
>>> --
>>> John Hubbard
>>> NVIDIA

Seemingly Similar Threads

Search for more possibly parallel threads

Nouveau - Jun 2020 - [PATCH 13/16] mm: support THP migration to device private memory

[Nouveau] [PATCH 13/16] mm: support THP migration to device private memory

[Nouveau] [PATCH 13/16] mm: support THP migration to device private memory

[Nouveau] [PATCH 13/16] mm: support THP migration to device private memory

Seemingly Similar Threads