Linux-Fsdevel Archive on lore.kernel.org help / color / mirror / Atom feed
* [PATCH 0/2] vhost, docs: convert to pin_user_pages(), new "case 5" @ 2020-05-29 23:43 John Hubbard 2020-05-29 23:43 ` [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a " John Hubbard 2020-05-29 23:43 ` [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() John Hubbard 0 siblings, 2 replies; 9+ messages in thread From: John Hubbard @ 2020-05-29 23:43 UTC (permalink / raw) To: Andrew Morton Cc: Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Souptick Joarder, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, linux-mm, John Hubbard Hi, It recently became clear to me that there are some get_user_pages*() callers that don't fit neatly into any of the four cases that are so far listed in pin_user_pages.rst. vhost.c is one of those. Add a Case 5 to the documentation, and refer to that when converting vhost.c. Thanks to Jan Kara for helping me (again) in understanding the interaction between get_user_pages() and page writeback [1]. This is based on today's mmotm, which has a nearby patch to pin_user_pages.rst that rewords cases 3 and 4. Note that I have only compile-tested the vhost.c patch, although that does also include cross-compiling for a few other arches. Any run-time testing would be greatly appreciated. [1] https://lore.kernel.org/r/20200529070343.GL14550@quack2.suse.cz John Hubbard (2): docs: mm/gup: pin_user_pages.rst: add a "case 5" vhost: convert get_user_pages() --> pin_user_pages() Documentation/core-api/pin_user_pages.rst | 20 ++++++++++++++++++++ drivers/vhost/vhost.c | 5 ++--- 2 files changed, 22 insertions(+), 3 deletions(-) -- 2.26.2 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5" 2020-05-29 23:43 [PATCH 0/2] vhost, docs: convert to pin_user_pages(), new "case 5" John Hubbard @ 2020-05-29 23:43 ` John Hubbard 2020-05-31 7:11 ` Souptick Joarder 2020-06-12 19:24 ` Matthew Wilcox 2020-05-29 23:43 ` [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() John Hubbard 1 sibling, 2 replies; 9+ messages in thread From: John Hubbard @ 2020-05-29 23:43 UTC (permalink / raw) To: Andrew Morton Cc: Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Souptick Joarder, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, linux-mm, John Hubbard There are four cases listed in pin_user_pages.rst. These are intended to help developers figure out whether to use get_user_pages*(), or pin_user_pages*(). However, the four cases do not cover all the situations. For example, drivers/vhost/vhost.c has a "pin, write to page, set page dirty, unpin" case. Add a fifth case, to help explain that there is a general pattern that requires pin_user_pages*() API calls. Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: John Hubbard <jhubbard@nvidia.com> --- Documentation/core-api/pin_user_pages.rst | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst index 4675b04e8829..b9f2688a2c67 100644 --- a/Documentation/core-api/pin_user_pages.rst +++ b/Documentation/core-api/pin_user_pages.rst @@ -171,6 +171,26 @@ If only struct page data (as opposed to the actual memory contents that a page is tracking) is affected, then normal GUP calls are sufficient, and neither flag needs to be set. +CASE 5: Pinning in order to write to the data within the page +------------------------------------------------------------- +Even though neither DMA nor Direct IO is involved, just a simple case of "pin, +access page's data, unpin" can cause a problem. Case 5 may be considered a +superset of Case 1, plus Case 2, plus anything that invokes that pattern. In +other words, if the code is neither Case 1 nor Case 2, it may still require +FOLL_PIN, for patterns like this: + +Correct (uses FOLL_PIN calls): + pin_user_pages() + access the data within the pages + set_page_dirty_lock() + unpin_user_pages() + +INCORRECT (uses FOLL_GET calls): + get_user_pages() + access the data within the pages + set_page_dirty_lock() + put_page() + page_maybe_dma_pinned(): the whole point of pinning =================================================== -- 2.26.2 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5" 2020-05-29 23:43 ` [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a " John Hubbard @ 2020-05-31 7:11 ` Souptick Joarder 2020-06-01 5:11 ` John Hubbard 2020-06-12 19:24 ` Matthew Wilcox 1 sibling, 1 reply; 9+ messages in thread From: Souptick Joarder @ 2020-05-31 7:11 UTC (permalink / raw) To: John Hubbard Cc: Andrew Morton, Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, Linux-MM On Sat, May 30, 2020 at 5:13 AM John Hubbard <jhubbard@nvidia.com> wrote: > > There are four cases listed in pin_user_pages.rst. These are > intended to help developers figure out whether to use > get_user_pages*(), or pin_user_pages*(). However, the four cases > do not cover all the situations. For example, drivers/vhost/vhost.c > has a "pin, write to page, set page dirty, unpin" case. > > Add a fifth case, to help explain that there is a general pattern > that requires pin_user_pages*() API calls. > > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Jan Kara <jack@suse.cz> > Cc: Jérôme Glisse <jglisse@redhat.com> > Cc: Dave Chinner <david@fromorbit.com> > Cc: Jonathan Corbet <corbet@lwn.net> > Cc: linux-doc@vger.kernel.org > Cc: linux-fsdevel@vger.kernel.org > Signed-off-by: John Hubbard <jhubbard@nvidia.com> > --- > Documentation/core-api/pin_user_pages.rst | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst > index 4675b04e8829..b9f2688a2c67 100644 > --- a/Documentation/core-api/pin_user_pages.rst > +++ b/Documentation/core-api/pin_user_pages.rst > @@ -171,6 +171,26 @@ If only struct page data (as opposed to the actual memory contents that a page > is tracking) is affected, then normal GUP calls are sufficient, and neither flag > needs to be set. > > +CASE 5: Pinning in order to write to the data within the page > +------------------------------------------------------------- > +Even though neither DMA nor Direct IO is involved, just a simple case of "pin, > +access page's data, unpin" can cause a problem. Will it be, *"pin, access page's data, set page dirty, unpin" * ? Case 5 may be considered a > +superset of Case 1, plus Case 2, plus anything that invokes that pattern. In > +other words, if the code is neither Case 1 nor Case 2, it may still require > +FOLL_PIN, for patterns like this: > + > +Correct (uses FOLL_PIN calls): > + pin_user_pages() > + access the data within the pages > + set_page_dirty_lock() > + unpin_user_pages() > + > +INCORRECT (uses FOLL_GET calls): > + get_user_pages() > + access the data within the pages > + set_page_dirty_lock() > + put_page() > + > page_maybe_dma_pinned(): the whole point of pinning > =================================================== > > -- > 2.26.2 > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5" 2020-05-31 7:11 ` Souptick Joarder @ 2020-06-01 5:11 ` John Hubbard 0 siblings, 0 replies; 9+ messages in thread From: John Hubbard @ 2020-06-01 5:11 UTC (permalink / raw) To: Souptick Joarder Cc: Andrew Morton, Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, Linux-MM On 2020-05-31 00:11, Souptick Joarder wrote: ... >> diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst >> index 4675b04e8829..b9f2688a2c67 100644 >> --- a/Documentation/core-api/pin_user_pages.rst >> +++ b/Documentation/core-api/pin_user_pages.rst >> @@ -171,6 +171,26 @@ If only struct page data (as opposed to the actual memory contents that a page >> is tracking) is affected, then normal GUP calls are sufficient, and neither flag >> needs to be set. >> >> +CASE 5: Pinning in order to write to the data within the page >> +------------------------------------------------------------- >> +Even though neither DMA nor Direct IO is involved, just a simple case of "pin, >> +access page's data, unpin" can cause a problem. > > Will it be, *"pin, access page's data, set page dirty, unpin" * ? Well...the problem can show up with just accessing (writing) the data. But it is true that this statement is a little different from the patterns below, which is confusing. I'll delete set_page_dirty() from each of them, in order to avoid confusing things. (Although each is correct.) And I'll also change the above to "pin, write to a page's data, upin". set_page_dirty() interactions are really just extra credit here. :) And fully read-only situations won't cause a problem. > > Case 5 may be considered a >> +superset of Case 1, plus Case 2, plus anything that invokes that pattern. In >> +other words, if the code is neither Case 1 nor Case 2, it may still require >> +FOLL_PIN, for patterns like this: >> + >> +Correct (uses FOLL_PIN calls): >> + pin_user_pages() >> + access the data within the pages >> + set_page_dirty_lock() >> + unpin_user_pages() >> + >> +INCORRECT (uses FOLL_GET calls): >> + get_user_pages() >> + access the data within the pages >> + set_page_dirty_lock() >> + put_page() >> + I'll send a v2 shortly. thanks, -- John Hubbard NVIDIA ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5" 2020-05-29 23:43 ` [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a " John Hubbard 2020-05-31 7:11 ` Souptick Joarder @ 2020-06-12 19:24 ` Matthew Wilcox 2020-06-12 20:03 ` John Hubbard 1 sibling, 1 reply; 9+ messages in thread From: Matthew Wilcox @ 2020-06-12 19:24 UTC (permalink / raw) To: John Hubbard Cc: Andrew Morton, Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Souptick Joarder, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, linux-mm On Fri, May 29, 2020 at 04:43:08PM -0700, John Hubbard wrote: > +CASE 5: Pinning in order to write to the data within the page > +------------------------------------------------------------- > +Even though neither DMA nor Direct IO is involved, just a simple case of "pin, > +access page's data, unpin" can cause a problem. Case 5 may be considered a > +superset of Case 1, plus Case 2, plus anything that invokes that pattern. In > +other words, if the code is neither Case 1 nor Case 2, it may still require > +FOLL_PIN, for patterns like this: > + > +Correct (uses FOLL_PIN calls): > + pin_user_pages() > + access the data within the pages > + set_page_dirty_lock() > + unpin_user_pages() > + > +INCORRECT (uses FOLL_GET calls): > + get_user_pages() > + access the data within the pages > + set_page_dirty_lock() > + put_page() Why does this case need to pin? Why can't it just do ... get_user_pages() lock_page(page); ... modify the data ... set_page_dirty(page); unlock_page(page); ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5" 2020-06-12 19:24 ` Matthew Wilcox @ 2020-06-12 20:03 ` John Hubbard 0 siblings, 0 replies; 9+ messages in thread From: John Hubbard @ 2020-06-12 20:03 UTC (permalink / raw) To: Matthew Wilcox Cc: Andrew Morton, Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Souptick Joarder, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, linux-mm On 2020-06-12 12:24, Matthew Wilcox wrote: > On Fri, May 29, 2020 at 04:43:08PM -0700, John Hubbard wrote: >> +CASE 5: Pinning in order to write to the data within the page >> +------------------------------------------------------------- >> +Even though neither DMA nor Direct IO is involved, just a simple case of "pin, >> +access page's data, unpin" can cause a problem. Case 5 may be considered a >> +superset of Case 1, plus Case 2, plus anything that invokes that pattern. In >> +other words, if the code is neither Case 1 nor Case 2, it may still require >> +FOLL_PIN, for patterns like this: >> + >> +Correct (uses FOLL_PIN calls): >> + pin_user_pages() >> + access the data within the pages >> + set_page_dirty_lock() >> + unpin_user_pages() >> + >> +INCORRECT (uses FOLL_GET calls): >> + get_user_pages() >> + access the data within the pages >> + set_page_dirty_lock() >> + put_page() > > Why does this case need to pin? Why can't it just do ... > > get_user_pages() > lock_page(page); > ... modify the data ... > set_page_dirty(page); > unlock_page(page); > Yes, it could do that. And that would also make a good additional "correct" example. Especially for the case of just dealing with a single page, lock_page() has the benefit of completely fixing the problem *today*, without waiting for the pin_user_pages*() handling improvements to get implemented. And it's also another (probably better) way to fix the vhost.c problem, than commit 690623e1b496 ("vhost: convert get_user_pages() --> pin_user_pages()"). I'm inclined to leave vhost.c alone for now, unless someone really prefers it to be changed, but to update the Case 5 documentation with your point above. Sound about right? thanks, -- John Hubbard NVIDIA ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() 2020-05-29 23:43 [PATCH 0/2] vhost, docs: convert to pin_user_pages(), new "case 5" John Hubbard 2020-05-29 23:43 ` [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a " John Hubbard @ 2020-05-29 23:43 ` John Hubbard 2020-06-01 11:30 ` Jan Kara 2020-06-02 4:22 ` Michael S. Tsirkin 1 sibling, 2 replies; 9+ messages in thread From: John Hubbard @ 2020-05-29 23:43 UTC (permalink / raw) To: Andrew Morton Cc: Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Souptick Joarder, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, linux-mm, John Hubbard This code was using get_user_pages*(), in approximately a "Case 5" scenario (accessing the data within a page), using the categorization from [1]. That means that it's time to convert the get_user_pages*() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: netdev@vger.kernel.org Signed-off-by: John Hubbard <jhubbard@nvidia.com> --- drivers/vhost/vhost.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 21a59b598ed8..596132a96cd5 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -1762,15 +1762,14 @@ static int set_bit_to_user(int nr, void __user *addr) int bit = nr + (log % PAGE_SIZE) * 8; int r; - r = get_user_pages_fast(log, 1, FOLL_WRITE, &page); + r = pin_user_pages_fast(log, 1, FOLL_WRITE, &page); if (r < 0) return r; BUG_ON(r != 1); base = kmap_atomic(page); set_bit(bit, base); kunmap_atomic(base); - set_page_dirty_lock(page); - put_page(page); + unpin_user_pages_dirty_lock(&page, 1, true); return 0; } -- 2.26.2 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() 2020-05-29 23:43 ` [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() John Hubbard @ 2020-06-01 11:30 ` Jan Kara 2020-06-02 4:22 ` Michael S. Tsirkin 1 sibling, 0 replies; 9+ messages in thread From: Jan Kara @ 2020-06-01 11:30 UTC (permalink / raw) To: John Hubbard Cc: Andrew Morton, Michael S . Tsirkin, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Souptick Joarder, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, linux-mm On Fri 29-05-20 16:43:09, John Hubbard wrote: > This code was using get_user_pages*(), in approximately a "Case 5" > scenario (accessing the data within a page), using the categorization > from [1]. That means that it's time to convert the get_user_pages*() + > put_page() calls to pin_user_pages*() + unpin_user_pages() calls. > > There is some helpful background in [2]: basically, this is a small > part of fixing a long-standing disconnect between pinning pages, and > file systems' use of those pages. > > [1] Documentation/core-api/pin_user_pages.rst > > [2] "Explicit pinning of user-space pages": > https://lwn.net/Articles/807108/ > > Cc: Michael S. Tsirkin <mst@redhat.com> > Cc: Jason Wang <jasowang@redhat.com> > Cc: kvm@vger.kernel.org > Cc: virtualization@lists.linux-foundation.org > Cc: netdev@vger.kernel.org > Signed-off-by: John Hubbard <jhubbard@nvidia.com> Looks good to me. You can add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > drivers/vhost/vhost.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c > index 21a59b598ed8..596132a96cd5 100644 > --- a/drivers/vhost/vhost.c > +++ b/drivers/vhost/vhost.c > @@ -1762,15 +1762,14 @@ static int set_bit_to_user(int nr, void __user *addr) > int bit = nr + (log % PAGE_SIZE) * 8; > int r; > > - r = get_user_pages_fast(log, 1, FOLL_WRITE, &page); > + r = pin_user_pages_fast(log, 1, FOLL_WRITE, &page); > if (r < 0) > return r; > BUG_ON(r != 1); > base = kmap_atomic(page); > set_bit(bit, base); > kunmap_atomic(base); > - set_page_dirty_lock(page); > - put_page(page); > + unpin_user_pages_dirty_lock(&page, 1, true); > return 0; > } > > -- > 2.26.2 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() 2020-05-29 23:43 ` [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() John Hubbard 2020-06-01 11:30 ` Jan Kara @ 2020-06-02 4:22 ` Michael S. Tsirkin 1 sibling, 0 replies; 9+ messages in thread From: Michael S. Tsirkin @ 2020-06-02 4:22 UTC (permalink / raw) To: John Hubbard Cc: Andrew Morton, Jason Wang, Vlastimil Babka, Jérôme Glisse, Jan Kara, Dave Chinner, Souptick Joarder, Jonathan Corbet, linux-doc, linux-fsdevel, kvm, virtualization, netdev, LKML, linux-mm On Fri, May 29, 2020 at 04:43:09PM -0700, John Hubbard wrote: > This code was using get_user_pages*(), in approximately a "Case 5" > scenario (accessing the data within a page), using the categorization > from [1]. That means that it's time to convert the get_user_pages*() + > put_page() calls to pin_user_pages*() + unpin_user_pages() calls. > > There is some helpful background in [2]: basically, this is a small > part of fixing a long-standing disconnect between pinning pages, and > file systems' use of those pages. > > [1] Documentation/core-api/pin_user_pages.rst > > [2] "Explicit pinning of user-space pages": > https://lwn.net/Articles/807108/ > > Cc: Michael S. Tsirkin <mst@redhat.com> > Cc: Jason Wang <jasowang@redhat.com> > Cc: kvm@vger.kernel.org > Cc: virtualization@lists.linux-foundation.org > Cc: netdev@vger.kernel.org > Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> > --- > drivers/vhost/vhost.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c > index 21a59b598ed8..596132a96cd5 100644 > --- a/drivers/vhost/vhost.c > +++ b/drivers/vhost/vhost.c > @@ -1762,15 +1762,14 @@ static int set_bit_to_user(int nr, void __user *addr) > int bit = nr + (log % PAGE_SIZE) * 8; > int r; > > - r = get_user_pages_fast(log, 1, FOLL_WRITE, &page); > + r = pin_user_pages_fast(log, 1, FOLL_WRITE, &page); > if (r < 0) > return r; > BUG_ON(r != 1); > base = kmap_atomic(page); > set_bit(bit, base); > kunmap_atomic(base); > - set_page_dirty_lock(page); > - put_page(page); > + unpin_user_pages_dirty_lock(&page, 1, true); > return 0; > } > > -- > 2.26.2 ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2020-06-12 20:03 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-05-29 23:43 [PATCH 0/2] vhost, docs: convert to pin_user_pages(), new "case 5" John Hubbard 2020-05-29 23:43 ` [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a " John Hubbard 2020-05-31 7:11 ` Souptick Joarder 2020-06-01 5:11 ` John Hubbard 2020-06-12 19:24 ` Matthew Wilcox 2020-06-12 20:03 ` John Hubbard 2020-05-29 23:43 ` [PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages() John Hubbard 2020-06-01 11:30 ` Jan Kara 2020-06-02 4:22 ` Michael S. Tsirkin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).