LKML Archive on lore.kernel.org
* [PATCH 0/3] mm/hmm: fixes for device private page migration
@ 2019-07-17  0:14 Ralph Campbell
  2019-07-17  0:14 ` [PATCH 1/3] mm: document zone device struct page reserved fields Ralph Campbell
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ralph Campbell @ 2019-07-17  0:14 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, Ralph Campbell

Testing the latest linux git tree turned up a few bugs with page
migration to and from ZONE_DEVICE private and anonymous pages.
Hopefully this series also clarifies how a ZONE_DEVICE private struct
page reuses the mapping and index fields of the source anonymous page
mapping.

Patch #3 was sent earlier and this is v2 with an updated change log.
http://lkml.kernel.org/r/20190709223556.28908-1-rcampbell@nvidia.com

Ralph Campbell (3):
  mm: document zone device struct page reserved fields
  mm/hmm: fix ZONE_DEVICE anon page mapping reuse
  mm/hmm: Fix bad subpage pointer in try_to_unmap_one

 include/linux/mm_types.h | 9 ++++++++-
 kernel/memremap.c        | 4 ++++
 mm/rmap.c                | 1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] mm: document zone device struct page reserved fields
  2019-07-17  0:14 [PATCH 0/3] mm/hmm: fixes for device private page migration Ralph Campbell
@ 2019-07-17  0:14 ` Ralph Campbell
  2019-07-17  1:20   ` John Hubbard
  2019-07-17  0:14 ` [PATCH 2/3] mm/hmm: fix ZONE_DEVICE anon page mapping reuse Ralph Campbell
  2019-07-17  0:14 ` [PATCH 3/3] mm/hmm: Fix bad subpage pointer in try_to_unmap_one Ralph Campbell
  2 siblings, 1 reply; 12+ messages in thread
From: Ralph Campbell @ 2019-07-17  0:14 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Ralph Campbell, Matthew Wilcox, Vlastimil Babka,
	Christoph Lameter, Dave Hansen, Jérôme Glisse,
	Kirill A . Shutemov, Lai Jiangshan, Martin Schwidefsky,
	Pekka Enberg, Randy Dunlap, Andrey Ryabinin, Christoph Hellwig,
	Jason Gunthorpe, Andrew Morton, Linus Torvalds

Struct page for ZONE_DEVICE private pages uses the reserved fields when
anonymous pages are migrated to device private memory. This is so
the page->mapping and page->index fields are preserved and the page can
be migrated back to system memory.
Document this in comments so it is more clear.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/mm_types.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3a37a89eb7a7..d6ea74e20306 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -159,7 +159,14 @@ struct page {
 			/** @pgmap: Points to the hosting device page map. */
 			struct dev_pagemap *pgmap;
 			void *zone_device_data;
-			unsigned long _zd_pad_1;	/* uses mapping */
+			/*
+			 * The following fields are used to hold the source
+			 * page anonymous mapping information while it is
+			 * migrated to device memory. See migrate_page().
+			 */
+			unsigned long _zd_pad_1;	/* aliases mapping */
+			unsigned long _zd_pad_2;	/* aliases index */
+			unsigned long _zd_pad_3;	/* aliases private */
 		};
 
 		/** @rcu_head: You can use this to free a page by RCU. */
-- 
2.20.1



* [PATCH 2/3] mm/hmm: fix ZONE_DEVICE anon page mapping reuse
  2019-07-17  0:14 [PATCH 0/3] mm/hmm: fixes for device private page migration Ralph Campbell
  2019-07-17  0:14 ` [PATCH 1/3] mm: document zone device struct page reserved fields Ralph Campbell
@ 2019-07-17  0:14 ` Ralph Campbell
  2019-07-17  1:40   ` John Hubbard
  2019-07-17  0:14 ` [PATCH 3/3] mm/hmm: Fix bad subpage pointer in try_to_unmap_one Ralph Campbell
  2 siblings, 1 reply; 12+ messages in thread
From: Ralph Campbell @ 2019-07-17  0:14 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Ralph Campbell, stable, Christoph Hellwig,
	Dan Williams, Andrew Morton, Jason Gunthorpe, Logan Gunthorpe,
	Ira Weiny, Matthew Wilcox, Mel Gorman, Jan Kara,
	Kirill A. Shutemov, Michal Hocko, Andrea Arcangeli, Mike Kravetz,
	Jérôme Glisse

When a ZONE_DEVICE private page is freed, the page->mapping field may
still hold a stale value. If the page is then reused as an anonymous
page, that stale value can prevent the page from being inserted into the
CPU's anon rmap table.
For example, when migrating a pte_none() page to device memory:
  migrate_vma(ops, vma, start, end, src, dst, private)
    migrate_vma_collect()
      src[] = MIGRATE_PFN_MIGRATE
    migrate_vma_prepare()
      /* no page to lock or isolate so OK */
    migrate_vma_unmap()
      /* no page to unmap so OK */
    ops->alloc_and_copy()
      /* driver allocates ZONE_DEVICE page for dst[] */
    migrate_vma_pages()
      migrate_vma_insert_page()
        page_add_new_anon_rmap()
          __page_set_anon_rmap()
            /* This check sees the page's stale mapping field */
            if (PageAnon(page))
              return
            /* page->mapping is not updated */

The result is that the migration appears to succeed but a subsequent CPU
fault will be unable to migrate the page back to system memory or worse.

Clear the page->mapping field when freeing the ZONE_DEVICE page so stale
pointer data doesn't affect future page use.

Fixes: b7a523109fb5c9d2d6dd ("mm: don't clear ->mapping in hmm_devmem_free")
Cc: stable@vger.kernel.org
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
---
 kernel/memremap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index bea6f887adad..238ae5d0ae8a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -408,6 +408,10 @@ void __put_devmap_managed_page(struct page *page)
 
 		mem_cgroup_uncharge(page);
 
+		/* Clear anonymous page mapping to prevent stale pointers */
+		if (is_device_private_page(page))
+			page->mapping = NULL;
+
 		page->pgmap->ops->page_free(page);
 	} else if (!count)
 		__put_page(page);
-- 
2.20.1



* [PATCH 3/3] mm/hmm: Fix bad subpage pointer in try_to_unmap_one
  2019-07-17  0:14 [PATCH 0/3] mm/hmm: fixes for device private page migration Ralph Campbell
  2019-07-17  0:14 ` [PATCH 1/3] mm: document zone device struct page reserved fields Ralph Campbell
  2019-07-17  0:14 ` [PATCH 2/3] mm/hmm: fix ZONE_DEVICE anon page mapping reuse Ralph Campbell
@ 2019-07-17  0:14 ` Ralph Campbell
  2019-07-17  1:51   ` John Hubbard
  2 siblings, 1 reply; 12+ messages in thread
From: Ralph Campbell @ 2019-07-17  0:14 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Ralph Campbell, Jérôme Glisse,
	Kirill A. Shutemov, Mike Kravetz, Christoph Hellwig,
	Jason Gunthorpe, stable, Andrew Morton

When migrating an anonymous private page to a ZONE_DEVICE private page,
the source page->mapping and page->index fields are copied to the
destination ZONE_DEVICE struct page and the page_mapcount() is increased.
This is so rmap_walk() can be used to unmap and migrate the page back to
system memory. However, try_to_unmap_one() computes the subpage pointer
from a swap pte, which yields an invalid page pointer and results in a
kernel panic such as:

BUG: unable to handle page fault for address: ffffea1fffffffc8

Currently, only single pages can be migrated to device private memory so
no subpage computation is needed and it can be set to "page".

Fixes: a5430dda8a3a1c ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 mm/rmap.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/rmap.c b/mm/rmap.c
index e5dfe2ae6b0d..ec1af8b60423 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1476,6 +1476,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			 * No need to invalidate here it will synchronize on
 			 * against the special swap migration pte.
 			 */
+			subpage = page;
 			goto discard;
 		}
 
-- 
2.20.1



* Re: [PATCH 1/3] mm: document zone device struct page reserved fields
  2019-07-17  0:14 ` [PATCH 1/3] mm: document zone device struct page reserved fields Ralph Campbell
@ 2019-07-17  1:20   ` John Hubbard
  2019-07-17  4:22     ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: John Hubbard @ 2019-07-17  1:20 UTC (permalink / raw)
  To: Ralph Campbell, linux-mm
  Cc: linux-kernel, Matthew Wilcox, Vlastimil Babka, Christoph Lameter,
	Dave Hansen, Jérôme Glisse, Kirill A . Shutemov,
	Lai Jiangshan, Martin Schwidefsky, Pekka Enberg, Randy Dunlap,
	Andrey Ryabinin, Christoph Hellwig, Jason Gunthorpe,
	Andrew Morton, Linus Torvalds

On 7/16/19 5:14 PM, Ralph Campbell wrote:
> Struct page for ZONE_DEVICE private pages uses the reserved fields when
> anonymous pages are migrated to device private memory. This is so
> the page->mapping and page->index fields are preserved and the page can
> be migrated back to system memory.
> Document this in comments so it is more clear.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> Cc: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> ---
>  include/linux/mm_types.h | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3a37a89eb7a7..d6ea74e20306 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -159,7 +159,14 @@ struct page {
>  			/** @pgmap: Points to the hosting device page map. */
>  			struct dev_pagemap *pgmap;
>  			void *zone_device_data;
> -			unsigned long _zd_pad_1;	/* uses mapping */
> +			/*
> +			 * The following fields are used to hold the source
> +			 * page anonymous mapping information while it is
> +			 * migrated to device memory. See migrate_page().
> +			 */
> +			unsigned long _zd_pad_1;	/* aliases mapping */
> +			unsigned long _zd_pad_2;	/* aliases index */
> +			unsigned long _zd_pad_3;	/* aliases private */

Actually, I do think this helps. It's hard to document these fields, and
the ZONE_DEVICE pages have a really complicated situation during migration
to a device. 

Additionally, I'm not sure, but should we go even further, and do this on the 
other side of the alias:

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index d6ea74e20306..c5ce5989d8a8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -83,7 +83,12 @@ struct page {
                         * by the page owner.
                         */
                        struct list_head lru;
-                       /* See page-flags.h for PAGE_MAPPING_FLAGS */
+                       /*
+                        * See page-flags.h for PAGE_MAPPING_FLAGS.
+                        *
+                        * Also: the next three fields (mapping, index and
+                        * private) are all used by ZONE_DEVICE pages.
+                        */
                        struct address_space *mapping;
                        pgoff_t index;          /* Our offset within mapping. */
                        /**

?

Either way, you can add:

    Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
-- 
John Hubbard
NVIDIA


* Re: [PATCH 2/3] mm/hmm: fix ZONE_DEVICE anon page mapping reuse
  2019-07-17  0:14 ` [PATCH 2/3] mm/hmm: fix ZONE_DEVICE anon page mapping reuse Ralph Campbell
@ 2019-07-17  1:40   ` John Hubbard
  0 siblings, 0 replies; 12+ messages in thread
From: John Hubbard @ 2019-07-17  1:40 UTC (permalink / raw)
  To: Ralph Campbell, linux-mm
  Cc: linux-kernel, stable, Christoph Hellwig, Dan Williams,
	Andrew Morton, Jason Gunthorpe, Logan Gunthorpe, Ira Weiny,
	Matthew Wilcox, Mel Gorman, Jan Kara, Kirill A. Shutemov,
	Michal Hocko, Andrea Arcangeli, Mike Kravetz,
	Jérôme Glisse

On 7/16/19 5:14 PM, Ralph Campbell wrote:
> When a ZONE_DEVICE private page is freed, the page->mapping field may
> still hold a stale value. If the page is then reused as an anonymous
> page, that stale value can prevent the page from being inserted into the
> CPU's anon rmap table.
> For example, when migrating a pte_none() page to device memory:
>   migrate_vma(ops, vma, start, end, src, dst, private)
>     migrate_vma_collect()
>       src[] = MIGRATE_PFN_MIGRATE
>     migrate_vma_prepare()
>       /* no page to lock or isolate so OK */
>     migrate_vma_unmap()
>       /* no page to unmap so OK */
>     ops->alloc_and_copy()
>       /* driver allocates ZONE_DEVICE page for dst[] */
>     migrate_vma_pages()
>       migrate_vma_insert_page()
>         page_add_new_anon_rmap()
>           __page_set_anon_rmap()
>             /* This check sees the page's stale mapping field */
>             if (PageAnon(page))
>               return
>             /* page->mapping is not updated */
> 
> The result is that the migration appears to succeed but a subsequent CPU
> fault will be unable to migrate the page back to system memory or worse.
> 
> Clear the page->mapping field when freeing the ZONE_DEVICE page so stale
> pointer data doesn't affect future page use.
> 
> Fixes: b7a523109fb5c9d2d6dd ("mm: don't clear ->mapping in hmm_devmem_free")
> Cc: stable@vger.kernel.org
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Jan Kara <jack@suse.cz>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> ---
>  kernel/memremap.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/memremap.c b/kernel/memremap.c
> index bea6f887adad..238ae5d0ae8a 100644
> --- a/kernel/memremap.c
> +++ b/kernel/memremap.c
> @@ -408,6 +408,10 @@ void __put_devmap_managed_page(struct page *page)
>  
>  		mem_cgroup_uncharge(page);
>  
> +		/* Clear anonymous page mapping to prevent stale pointers */

This is sufficiently complex, that some concise form of the documentation
that you've put in the commit description, needs to also exist right here, as
a comment. 

How's this read:

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 238ae5d0ae8a..e52e9da5d0a7 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -408,7 +408,27 @@ void __put_devmap_managed_page(struct page *page)
 
                mem_cgroup_uncharge(page);
 
-               /* Clear anonymous page mapping to prevent stale pointers */
+               /*
+                * When a device_private page is freed, the page->mapping field
+                * may still contain a (stale) mapping value. For example, the
+                * lower bits of page->mapping may still identify the page as an
+                * anonymous page. Ultimately, this entire field is just stale
+                * and wrong, and it will cause errors if not cleared. One
+                * example is:
+                *
+                *  migrate_vma_pages()
+                *    migrate_vma_insert_page()
+                *      page_add_new_anon_rmap()
+                *        __page_set_anon_rmap()
+                *          ...checks page->mapping, via PageAnon(page) call,
+                *            and incorrectly concludes that the page is an
+                *            anonymous page. Therefore, it incorrectly,
+                *            silently fails to set up the new anon rmap.
+                *
+                * For other types of ZONE_DEVICE pages, migration is either
+                * handled differently or not done at all, so there is no need
+                * to clear page->mapping.
+                */
                if (is_device_private_page(page))
                        page->mapping = NULL;
 
?

thanks,
-- 
John Hubbard
NVIDIA


* Re: [PATCH 3/3] mm/hmm: Fix bad subpage pointer in try_to_unmap_one
  2019-07-17  0:14 ` [PATCH 3/3] mm/hmm: Fix bad subpage pointer in try_to_unmap_one Ralph Campbell
@ 2019-07-17  1:51   ` John Hubbard
  0 siblings, 0 replies; 12+ messages in thread
From: John Hubbard @ 2019-07-17  1:51 UTC (permalink / raw)
  To: Ralph Campbell, linux-mm
  Cc: linux-kernel, Jérôme Glisse, Kirill A. Shutemov,
	Mike Kravetz, Christoph Hellwig, Jason Gunthorpe, stable,
	Andrew Morton

On 7/16/19 5:14 PM, Ralph Campbell wrote:
> When migrating an anonymous private page to a ZONE_DEVICE private page,
> the source page->mapping and page->index fields are copied to the
> destination ZONE_DEVICE struct page and the page_mapcount() is increased.
> This is so rmap_walk() can be used to unmap and migrate the page back to
> system memory. However, try_to_unmap_one() computes the subpage pointer
> from a swap pte, which yields an invalid page pointer and results in a
> kernel panic such as:
> 
> BUG: unable to handle page fault for address: ffffea1fffffffc8
> 
> Currently, only single pages can be migrated to device private memory so
> no subpage computation is needed and it can be set to "page".
> 
> Fixes: a5430dda8a3a1c ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>  mm/rmap.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/rmap.c b/mm/rmap.c
> index e5dfe2ae6b0d..ec1af8b60423 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1476,6 +1476,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  			 * No need to invalidate here it will synchronize on
>  			 * against the special swap migration pte.
>  			 */
> +			subpage = page;
>  			goto discard;
>  		}

The problem is clear, but the solution still leaves the code ever so slightly
more confusing, and it was already pretty difficult to begin with.

I still hold out hope for some comment documentation at least, and maybe
even just removing the subpage variable (as Jerome mentioned, offline) as
well.

Jerome?


thanks,
-- 
John Hubbard
NVIDIA


* Re: [PATCH 1/3] mm: document zone device struct page reserved fields
  2019-07-17  1:20   ` John Hubbard
@ 2019-07-17  4:22     ` Christoph Hellwig
  2019-07-17  4:31       ` John Hubbard
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2019-07-17  4:22 UTC (permalink / raw)
  To: John Hubbard
  Cc: Ralph Campbell, linux-mm, linux-kernel, willy, Vlastimil Babka,
	Christoph Lameter, Dave Hansen, Jérôme Glisse,
	Kirill A . Shutemov, Lai Jiangshan, Martin Schwidefsky,
	Pekka Enberg, Randy Dunlap, Andrey Ryabinin, Christoph Hellwig,
	Jason Gunthorpe, Andrew Morton, Linus Torvalds

On Tue, Jul 16, 2019 at 06:20:23PM -0700, John Hubbard wrote:
> > -			unsigned long _zd_pad_1;	/* uses mapping */
> > +			/*
> > +			 * The following fields are used to hold the source
> > +			 * page anonymous mapping information while it is
> > +			 * migrated to device memory. See migrate_page().
> > +			 */
> > +			unsigned long _zd_pad_1;	/* aliases mapping */
> > +			unsigned long _zd_pad_2;	/* aliases index */
> > +			unsigned long _zd_pad_3;	/* aliases private */
> 
> Actually, I do think this helps. It's hard to document these fields, and
> the ZONE_DEVICE pages have a really complicated situation during migration
> to a device. 
> 
> Additionally, I'm not sure, but should we go even further, and do this on the 
> other side of the alias:

The _zd_pad_* fields obviously are NOT used anywhere in the source tree,
so these comments are very misleading.  If we still keep
using ->mapping, ->index and ->private we really should clean up the
definition of struct page to make that obvious instead of trying to
doctor around it using comments.


* Re: [PATCH 1/3] mm: document zone device struct page reserved fields
  2019-07-17  4:22     ` Christoph Hellwig
@ 2019-07-17  4:31       ` John Hubbard
  2019-07-17  4:38         ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: John Hubbard @ 2019-07-17  4:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ralph Campbell, linux-mm, linux-kernel, willy, Vlastimil Babka,
	Christoph Lameter, Dave Hansen, Jérôme Glisse,
	Kirill A . Shutemov, Lai Jiangshan, Martin Schwidefsky,
	Pekka Enberg, Randy Dunlap, Andrey Ryabinin, Jason Gunthorpe,
	Andrew Morton, Linus Torvalds

On 7/16/19 9:22 PM, Christoph Hellwig wrote:
> On Tue, Jul 16, 2019 at 06:20:23PM -0700, John Hubbard wrote:
>>> -			unsigned long _zd_pad_1;	/* uses mapping */
>>> +			/*
>>> +			 * The following fields are used to hold the source
>>> +			 * page anonymous mapping information while it is
>>> +			 * migrated to device memory. See migrate_page().
>>> +			 */
>>> +			unsigned long _zd_pad_1;	/* aliases mapping */
>>> +			unsigned long _zd_pad_2;	/* aliases index */
>>> +			unsigned long _zd_pad_3;	/* aliases private */
>>
>> Actually, I do think this helps. It's hard to document these fields, and
>> the ZONE_DEVICE pages have a really complicated situation during migration
>> to a device. 
>>
>> Additionally, I'm not sure, but should we go even further, and do this on the 
>> other side of the alias:
> 
> The _zd_pad_* fields obviously are NOT used anywhere in the source tree,
> so these comments are very misleading.  If we still keep
> using ->mapping, ->index and ->private we really should clean up the
> definition of struct page to make that obvious instead of trying to
> doctor around it using comments.
> 

OK, so just delete all the _zd_pad_* fields? Works for me. It's misleading to
call something padding if it's actually unavailable because it's used
on the other side of the union, so deleting would be even better than commenting.

In that case, it would still be nice to have this new snippet, right?:

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index d6ea74e20306..c5ce5989d8a8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -83,7 +83,12 @@ struct page {
                         * by the page owner.
                         */
                        struct list_head lru;
-                       /* See page-flags.h for PAGE_MAPPING_FLAGS */
+                       /*
+                        * See page-flags.h for PAGE_MAPPING_FLAGS.
+                        *
+                        * Also: the next three fields (mapping, index and
+                        * private) are all used by ZONE_DEVICE pages.
+                        */
                        struct address_space *mapping;
                        pgoff_t index;          /* Our offset within mapping. */
                        /**

thanks,
-- 
John Hubbard
NVIDIA


* Re: [PATCH 1/3] mm: document zone device struct page reserved fields
  2019-07-17  4:31       ` John Hubbard
@ 2019-07-17  4:38         ` Christoph Hellwig
  2019-07-17 17:50           ` Ralph Campbell
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2019-07-17  4:38 UTC (permalink / raw)
  To: John Hubbard
  Cc: Christoph Hellwig, Ralph Campbell, linux-mm, linux-kernel, willy,
	Vlastimil Babka, Christoph Lameter, Dave Hansen,
	Jérôme Glisse, Kirill A . Shutemov, Lai Jiangshan,
	Martin Schwidefsky, Pekka Enberg, Randy Dunlap, Andrey Ryabinin,
	Jason Gunthorpe, Andrew Morton, Linus Torvalds

On Tue, Jul 16, 2019 at 09:31:33PM -0700, John Hubbard wrote:
> OK, so just delete all the _zd_pad_* fields? Works for me. It's misleading to
> call something padding if it's actually unavailable because it's used
> on the other side of the union, so deleting would be even better than commenting.
> 
> In that case, it would still be nice to have this new snippet, right?:

I hope willy can chime in a bit on his thoughts about what the union in
struct page should look like.  The padding at the end of the sub-structs
certainly looks pointless, and other places don't use it either.  But if
we are using the other fields it almost seems to me like we only want to
union the lru field in the first sub-struct instead of overlaying most
of it.


* Re: [PATCH 1/3] mm: document zone device struct page reserved fields
  2019-07-17  4:38         ` Christoph Hellwig
@ 2019-07-17 17:50           ` Ralph Campbell
  0 siblings, 0 replies; 12+ messages in thread
From: Ralph Campbell @ 2019-07-17 17:50 UTC (permalink / raw)
  To: Christoph Hellwig, John Hubbard
  Cc: linux-mm, linux-kernel, willy, Vlastimil Babka,
	Christoph Lameter, Dave Hansen, Jérôme Glisse,
	Kirill A . Shutemov, Lai Jiangshan, Martin Schwidefsky,
	Pekka Enberg, Randy Dunlap, Andrey Ryabinin, Jason Gunthorpe,
	Andrew Morton, Linus Torvalds


On 7/16/19 9:38 PM, Christoph Hellwig wrote:
> On Tue, Jul 16, 2019 at 09:31:33PM -0700, John Hubbard wrote:
>> OK, so just delete all the _zd_pad_* fields? Works for me. It's misleading to
>> call something padding if it's actually unavailable because it's used
>> on the other side of the union, so deleting would be even better than commenting.
>>
>> In that case, it would still be nice to have this new snippet, right?:
> 
> I hope willy can chime in a bit on his thoughts about what the union in
> struct page should look like.  The padding at the end of the sub-structs
> certainly looks pointless, and other places don't use it either.  But if
> we are using the other fields it almost seems to me like we only want to
> union the lru field in the first sub-struct instead of overlaying most
> of it.
>

I like this approach.
I'll work on an updated patch that makes "struct list_head lru" part
of a union with the ZONE_DEVICE struct without the padding and update
the comments and change log.

I will also wait a day or two for others to add their comments.


* [PATCH 3/3] mm/hmm: Fix bad subpage pointer in try_to_unmap_one
  2019-07-19 19:06 [PATCH 0/3] mm/hmm: fixes for device private page migration Ralph Campbell
@ 2019-07-19 19:06 ` Ralph Campbell
  0 siblings, 0 replies; 12+ messages in thread
From: Ralph Campbell @ 2019-07-19 19:06 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, Ralph Campbell, Jérôme Glisse,
	Kirill A. Shutemov, Mike Kravetz, Christoph Hellwig,
	Jason Gunthorpe, John Hubbard, stable, Andrew Morton

When migrating an anonymous private page to a ZONE_DEVICE private page,
the source page->mapping and page->index fields are copied to the
destination ZONE_DEVICE struct page and the page_mapcount() is increased.
This is so rmap_walk() can be used to unmap and migrate the page back to
system memory. However, try_to_unmap_one() computes the subpage pointer
from a swap pte, which yields an invalid page pointer and results in a
kernel panic such as:

BUG: unable to handle page fault for address: ffffea1fffffffc8

Currently, only single pages can be migrated to device private memory so
no subpage computation is needed and it can be set to "page".

Fixes: a5430dda8a3a1c ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 mm/rmap.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/rmap.c b/mm/rmap.c
index e5dfe2ae6b0d..ec1af8b60423 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1476,6 +1476,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			 * No need to invalidate here it will synchronize on
 			 * against the special swap migration pte.
 			 */
+			subpage = page;
 			goto discard;
 		}
 
-- 
2.20.1



