John Hubbard
2020-Jun-01 05:26 UTC
[PATCH v2 0/2] vhost, docs: convert to pin_user_pages(), new "case 5"
This is based on Linux 5.7, plus one prerequisite patch:
"mm/gup: update pin_user_pages.rst for "case 3" (mmu notifiers)" [1]

Changes since v1: removed references to set_page_dirty*(), in response
to Souptick Joarder's review (thanks!).

Cover letter for v1, edited/updated slightly:

It recently became clear to me that there are some get_user_pages*()
callers that don't fit neatly into any of the four cases that are so
far listed in pin_user_pages.rst. vhost.c is one of those.

Add a Case 5 to the documentation, and refer to that when converting
vhost.c.

Thanks to Jan Kara for helping me (again) in understanding the
interaction between get_user_pages() and page writeback [2].

Note that I have only compile-tested the vhost.c patch, although that
does also include cross-compiling for a few other arches. Any run-time
testing would be greatly appreciated.

[1] https://lore.kernel.org/r/20200527194953.11130-1-jhubbard at nvidia.com
[2] https://lore.kernel.org/r/20200529070343.GL14550 at quack2.suse.cz

John Hubbard (2):
  docs: mm/gup: pin_user_pages.rst: add a "case 5"
  vhost: convert get_user_pages() --> pin_user_pages()

 Documentation/core-api/pin_user_pages.rst | 18 ++++++++++++++++++
 drivers/vhost/vhost.c                     |  5 ++---
 2 files changed, 20 insertions(+), 3 deletions(-)

base-commit: 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
-- 
2.26.2
John Hubbard
2020-Jun-01 05:26 UTC
[PATCH v2 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5"
There are four cases listed in pin_user_pages.rst. These are
intended to help developers figure out whether to use
get_user_pages*(), or pin_user_pages*(). However, the four cases
do not cover all the situations. For example, drivers/vhost/vhost.c
has a "pin, write to page, set page dirty, unpin" case.

Add a fifth case, to help explain that there is a general pattern
that requires pin_user_pages*() API calls.

Cc: Vlastimil Babka <vbabka at suse.cz>
Cc: Jan Kara <jack at suse.cz>
Cc: Jérôme Glisse <jglisse at redhat.com>
Cc: Dave Chinner <david at fromorbit.com>
Cc: Jonathan Corbet <corbet at lwn.net>
Cc: linux-doc at vger.kernel.org
Cc: linux-fsdevel at vger.kernel.org
Signed-off-by: John Hubbard <jhubbard at nvidia.com>
---
 Documentation/core-api/pin_user_pages.rst | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 4675b04e8829..6068266dd303 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -171,6 +171,24 @@ If only struct page data (as opposed to the actual memory contents that a page
 is tracking) is affected, then normal GUP calls are sufficient, and neither flag
 needs to be set.
 
+CASE 5: Pinning in order to write to the data within the page
+-------------------------------------------------------------
+Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
+write to a page's data, unpin" can cause a problem. Case 5 may be considered a
+superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
+other words, if the code is neither Case 1 nor Case 2, it may still require
+FOLL_PIN, for patterns like this:
+
+Correct (uses FOLL_PIN calls):
+    pin_user_pages()
+    write to the data within the pages
+    unpin_user_pages()
+
+INCORRECT (uses FOLL_GET calls):
+    get_user_pages()
+    write to the data within the pages
+    put_page()
+
 page_maybe_dma_pinned(): the whole point of pinning
 ==================================================
-- 
2.26.2
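
For readers following the thread, here is a minimal userspace sketch of
why the FOLL_GET pattern in Case 5 is a problem. It is a toy model, not
kernel code: struct toy_page and the toy_*() helpers are hypothetical
names made up for illustration. The only detail borrowed from the kernel
is the idea, as I understand it for 5.7, that a FOLL_PIN pin adds a large
bias (GUP_PIN_COUNTING_BIAS, 1024) to the page reference count, so that
page_maybe_dma_pinned() can tell pins apart from ordinary references.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one counter stands in for a page's reference count. */
#define TOY_PIN_BIAS 1024	/* assumed to mirror GUP_PIN_COUNTING_BIAS */

struct toy_page {
	int refcount;
};

/* FOLL_GET style: looks exactly like any other reference on the page. */
static inline void toy_get(struct toy_page *p)
{
	p->refcount += 1;
}

static inline void toy_put(struct toy_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

/* FOLL_PIN style: adds a large bias so the pin is visible to others. */
static inline void toy_pin(struct toy_page *p)
{
	p->refcount += TOY_PIN_BIAS;
}

static inline void toy_unpin(struct toy_page *p)
{
	assert(p->refcount >= TOY_PIN_BIAS);
	p->refcount -= TOY_PIN_BIAS;
}

/*
 * "Maybe" pinned: a page with TOY_PIN_BIAS ordinary references would
 * also report true, so false positives are possible by design.
 */
static inline bool toy_maybe_pinned(const struct toy_page *p)
{
	return p->refcount >= TOY_PIN_BIAS;
}
```

In this model, a writer that uses toy_get() before modifying the page is
invisible to anyone calling toy_maybe_pinned(), which is the situation the
"INCORRECT" pattern above describes. A writer that uses toy_pin() is
visible until the matching toy_unpin().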
John Hubbard
2020-Jun-01 05:26 UTC
[PATCH v2 2/2] vhost: convert get_user_pages() --> pin_user_pages()
This code was using get_user_pages*(), in approximately a "Case 5"
scenario (accessing the data within a page), using the categorization
from [1]. That means that it's time to convert the get_user_pages*() +
put_page() calls to pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small
part of fixing a long-standing disconnect between pinning pages, and
file systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

Cc: Michael S. Tsirkin <mst at redhat.com>
Cc: Jason Wang <jasowang at redhat.com>
Cc: kvm at vger.kernel.org
Cc: virtualization at lists.linux-foundation.org
Cc: netdev at vger.kernel.org
Signed-off-by: John Hubbard <jhubbard at nvidia.com>
---
 drivers/vhost/vhost.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 21a59b598ed8..596132a96cd5 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1762,15 +1762,14 @@ static int set_bit_to_user(int nr, void __user *addr)
 	int bit = nr + (log % PAGE_SIZE) * 8;
 	int r;
 
-	r = get_user_pages_fast(log, 1, FOLL_WRITE, &page);
+	r = pin_user_pages_fast(log, 1, FOLL_WRITE, &page);
 	if (r < 0)
 		return r;
 	BUG_ON(r != 1);
 	base = kmap_atomic(page);
 	set_bit(bit, base);
 	kunmap_atomic(base);
-	set_page_dirty_lock(page);
-	put_page(page);
+	unpin_user_pages_dirty_lock(&page, 1, true);
 	return 0;
 }
 
-- 
2.26.2
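
As a side note on the context line "int bit = nr + (log % PAGE_SIZE) * 8;"
in the hunk above: the page is mapped at its base, so the bit to set is
found by converting the byte offset of the user address within its page
into a bit offset and adding nr. A rough userspace sketch of that
arithmetic follows. The toy_*() names and the 4 KiB page size are
assumptions for illustration, and the byte-wise, non-atomic store is a
simplification of the kernel's set_bit() (it matches the bit layout only
on little-endian machines).

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Bit index, relative to the start of the page containing addr. */
static inline unsigned long toy_bit_in_page(unsigned long addr, int nr)
{
	return (unsigned long)nr + (addr % TOY_PAGE_SIZE) * 8;
}

/* Non-atomic stand-in for set_bit() on a buffer standing in for the page. */
static inline void toy_set_bit(unsigned char *page_base, unsigned long bit)
{
	assert(bit < TOY_PAGE_SIZE * 8);
	page_base[bit / 8] |= (unsigned char)(1u << (bit % 8));
}
```

For example, an address three bytes into a page with nr = 5 gives bit
index 3 * 8 + 5 = 29, which lands in byte 3 of the mapped page.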
Jan Kara
2020-Jun-01 11:34 UTC
[PATCH v2 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5"
On Sun 31-05-20 22:26:32, John Hubbard wrote:
> There are four cases listed in pin_user_pages.rst. These are
> intended to help developers figure out whether to use
> get_user_pages*(), or pin_user_pages*(). However, the four cases
> do not cover all the situations. For example, drivers/vhost/vhost.c
> has a "pin, write to page, set page dirty, unpin" case.
>
> Add a fifth case, to help explain that there is a general pattern
> that requires pin_user_pages*() API calls.
>
> Cc: Vlastimil Babka <vbabka at suse.cz>
> Cc: Jan Kara <jack at suse.cz>
> Cc: Jérôme Glisse <jglisse at redhat.com>
> Cc: Dave Chinner <david at fromorbit.com>
> Cc: Jonathan Corbet <corbet at lwn.net>
> Cc: linux-doc at vger.kernel.org
> Cc: linux-fsdevel at vger.kernel.org
> Signed-off-by: John Hubbard <jhubbard at nvidia.com>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack at suse.cz>

								Honza

> ---
>  Documentation/core-api/pin_user_pages.rst | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
> index 4675b04e8829..6068266dd303 100644
> --- a/Documentation/core-api/pin_user_pages.rst
> +++ b/Documentation/core-api/pin_user_pages.rst
> @@ -171,6 +171,24 @@ If only struct page data (as opposed to the actual memory contents that a page
>  is tracking) is affected, then normal GUP calls are sufficient, and neither flag
>  needs to be set.
>
> +CASE 5: Pinning in order to write to the data within the page
> +-------------------------------------------------------------
> +Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
> +write to a page's data, unpin" can cause a problem. Case 5 may be considered a
> +superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
> +other words, if the code is neither Case 1 nor Case 2, it may still require
> +FOLL_PIN, for patterns like this:
> +
> +Correct (uses FOLL_PIN calls):
> +    pin_user_pages()
> +    write to the data within the pages
> +    unpin_user_pages()
> +
> +INCORRECT (uses FOLL_GET calls):
> +    get_user_pages()
> +    write to the data within the pages
> +    put_page()
> +
>  page_maybe_dma_pinned(): the whole point of pinning
>  ==================================================
> --
> 2.26.2
>
-- 
Jan Kara <jack at suse.com>
SUSE Labs, CR