LKML Archive on lore.kernel.org
* [PATCH 0/3] Cleanup and fixups for memory hotplug
@ 2021-08-21  9:42 Miaohe Lin
  2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Miaohe Lin @ 2021-08-21  9:42 UTC (permalink / raw)
  To: akpm
  Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm,
	linux-kernel, linmiaohe

Hi all,
This series contains a cleanup that uses a helper function to simplify
the code, plus fixes for two potential bugs. More details can be found
in the respective changelogs. Thanks!

Miaohe Lin (3):
  mm/memory_hotplug: use helper zone_is_zone_device() to simplify the
    code
  mm/memory_hotplug: fix potential permanent lru cache disable
  mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable

 mm/memory_hotplug.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

-- 
2.23.0



* [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
  2021-08-21  9:42 [PATCH 0/3] Cleanup and fixups for memory hotplug Miaohe Lin
@ 2021-08-21  9:42 ` Miaohe Lin
  2021-08-23  8:20   ` HORIGUCHI NAOYA(堀口 直也)
                     ` (2 more replies)
  2021-08-21  9:42 ` [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable Miaohe Lin
  2021-08-21  9:42 ` [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable Miaohe Lin
  2 siblings, 3 replies; 14+ messages in thread
From: Miaohe Lin @ 2021-08-21  9:42 UTC (permalink / raw)
  To: akpm
  Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm,
	linux-kernel, linmiaohe

Use the helper zone_is_zone_device() to simplify the code and remove an
explicit CONFIG_ZONE_DEVICE #ifdef block.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b287ff3d7229..d986d3791986 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -477,15 +477,13 @@ void __ref remove_pfn_range_from_zone(struct zone *zone,
 				 sizeof(struct page) * cur_nr_pages);
 	}
 
-#ifdef CONFIG_ZONE_DEVICE
 	/*
 	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
 	 * we will not try to shrink the zones - which is okay as
 	 * set_zone_contiguous() cannot deal with ZONE_DEVICE either way.
 	 */
-	if (zone_idx(zone) == ZONE_DEVICE)
+	if (zone_is_zone_device(zone))
 		return;
-#endif
 
 	clear_zone_contiguous(zone);
 
-- 
2.23.0
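
The pattern this patch relies on can be sketched as follows. This is a
simplified model that compiles in userspace, not the kernel source:
struct zone, enum zone_type and skip_zone_shrink() are hypothetical
stand-ins, and the helper body is an assumption about how such a wrapper
is usually written. The point is that the #ifdef lives once, in the
helper definition, so a caller like remove_pfn_range_from_zone() needs
none.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum zone_type {
	ZONE_NORMAL,
	ZONE_MOVABLE,
#ifdef CONFIG_ZONE_DEVICE
	ZONE_DEVICE,
#endif
	MAX_NR_ZONES
};

struct zone {
	enum zone_type idx;
};

static inline enum zone_type zone_idx(const struct zone *zone)
{
	return zone->idx;
}

/* The config dependency is confined to the helper. */
#ifdef CONFIG_ZONE_DEVICE
static inline bool zone_is_zone_device(const struct zone *zone)
{
	return zone_idx(zone) == ZONE_DEVICE;
}
#else
static inline bool zone_is_zone_device(const struct zone *zone)
{
	(void)zone;
	return false;	/* ZONE_DEVICE does not exist in this build */
}
#endif

/* Hypothetical caller: returns true when zone shrinking must be skipped. */
static bool skip_zone_shrink(const struct zone *zone)
{
	return zone_is_zone_device(zone);
}
```

When CONFIG_ZONE_DEVICE is undefined the helper folds to a constant
false, so the compiler can drop the early return without any #ifdef in
the caller.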



* [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
  2021-08-21  9:42 [PATCH 0/3] Cleanup and fixups for memory hotplug Miaohe Lin
  2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
@ 2021-08-21  9:42 ` Miaohe Lin
  2021-08-23  8:21   ` HORIGUCHI NAOYA(堀口 直也)
                     ` (2 more replies)
  2021-08-21  9:42 ` [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable Miaohe Lin
  2 siblings, 3 replies; 14+ messages in thread
From: Miaohe Lin @ 2021-08-21  9:42 UTC (permalink / raw)
  To: akpm
  Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm,
	linux-kernel, linmiaohe

If offline_pages() fails after lru_cache_disable(), the error path never
calls lru_cache_enable(), so the LRU cache stays disabled permanently.

Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d986d3791986..9fd0be32a281 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 failed_removal_pcplists_disabled:
+	lru_cache_enable();
 	zone_pcp_enable(zone);
 failed_removal:
 	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
-- 
2.23.0
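
The imbalance fixed here can be illustrated with a small userspace
model. Everything below is a sketch under assumptions: the four
disable/enable functions are mock counters that only borrow the kernel
names, offline_pages_sketch() is hypothetical, and the ordering of the
two disable calls is assumed rather than taken from this diff. It shows
why every exit taken after lru_cache_disable() has to pass through
lru_cache_enable().

```c
#include <errno.h>
#include <stdbool.h>

/* Mock nesting counters; the kernel keeps comparable state internally. */
static int lru_disable_count;
static int pcp_disable_count;

static void lru_cache_disable(void) { lru_disable_count++; }
static void lru_cache_enable(void)  { lru_disable_count--; }
static void zone_pcp_disable(void)  { pcp_disable_count++; }
static void zone_pcp_enable(void)   { pcp_disable_count--; }

/*
 * Hypothetical model of the offline flow. fail_late simulates a failure
 * that happens after both facilities have been disabled.
 */
static int offline_pages_sketch(bool fail_late)
{
	int ret;

	zone_pcp_disable();
	lru_cache_disable();

	if (fail_late) {
		ret = -EBUSY;
		goto failed_removal_pcplists_disabled;
	}

	/* Success path: undo both in reverse order. */
	lru_cache_enable();
	zone_pcp_enable();
	return 0;

failed_removal_pcplists_disabled:
	lru_cache_enable();	/* the call this patch adds to the error path */
	zone_pcp_enable();
	return ret;
}
```

Without the marked call, a late failure would leave lru_disable_count at
1 in this model, which corresponds to the permanently disabled state
described in the changelog.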



* [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
  2021-08-21  9:42 [PATCH 0/3] Cleanup and fixups for memory hotplug Miaohe Lin
  2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
  2021-08-21  9:42 ` [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable Miaohe Lin
@ 2021-08-21  9:42 ` Miaohe Lin
  2021-08-23  8:26   ` HORIGUCHI NAOYA(堀口 直也)
  2 siblings, 1 reply; 14+ messages in thread
From: Miaohe Lin @ 2021-08-21  9:42 UTC (permalink / raw)
  To: akpm
  Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm,
	linux-kernel, linmiaohe

HWPoisoned dirty swapcache pages are kept for killing owner processes.
We should not offline these pages or do_swap_page() would access the
offline pages and lead to bad ending.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9fd0be32a281..0488eed3327c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
 		 */
 		if (PageOffline(page) && page_count(page))
 			return -EBUSY;
+		/*
+		 * HWPoisoned dirty swapcache pages are definitely unmovable
+		 * because they are kept for killing owner processes.
+		 */
+		if (PageHWPoison(page) && PageSwapCache(page))
+			return -EBUSY;
 
 		if (!PageHuge(page))
 			continue;
-- 
2.23.0
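
The effect of the added check can be modelled in userspace as below.
This is only a sketch: struct mock_page and scan_range_sketch() are
hypothetical, and the real scan_movable_pages() does more (it also
reports the first movable pfn). Note that the condition as posted tests
HWPoison and SwapCache only; that a poisoned page still in the swap
cache is a dirty one is an assumption about how memory failure handling
treats clean swapcache pages, not something this diff checks.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the page state the scan looks at. */
struct mock_page {
	bool hwpoison;	/* models PageHWPoison() */
	bool swapcache;	/* models PageSwapCache() */
	bool offline;	/* models PageOffline() */
	int refcount;	/* models page_count() */
};

/* Returns -EBUSY as soon as one page in the range blocks offlining. */
static int scan_range_sketch(const struct mock_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		const struct mock_page *page = &pages[i];

		/* Pre-existing rule: referenced PageOffline pages block. */
		if (page->offline && page->refcount)
			return -EBUSY;

		/* Rule proposed by this patch. */
		if (page->hwpoison && page->swapcache)
			return -EBUSY;
	}
	return 0;
}
```

In this model a single poisoned swapcache page aborts the whole range,
which is the behaviour questioned later in the thread.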



* Re: [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
  2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
@ 2021-08-23  8:20   ` HORIGUCHI NAOYA(堀口 直也)
  2021-08-23  9:11   ` Oscar Salvador
  2021-08-23 12:14   ` David Hildenbrand
  2 siblings, 0 replies; 14+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-08-23  8:20 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Sat, Aug 21, 2021 at 05:42:44PM +0800, Miaohe Lin wrote:
> Use helper zone_is_zone_device() to simplify the code and remove some
> explicit CONFIG_ZONE_DEVICE codes.
> 
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>


* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
  2021-08-21  9:42 ` [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable Miaohe Lin
@ 2021-08-23  8:21   ` HORIGUCHI NAOYA(堀口 直也)
  2021-08-23  9:15   ` Oscar Salvador
  2021-08-23 12:15   ` David Hildenbrand
  2 siblings, 0 replies; 14+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-08-23  8:21 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Sat, Aug 21, 2021 at 05:42:45PM +0800, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
> 
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>


* Re: [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
  2021-08-21  9:42 ` [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable Miaohe Lin
@ 2021-08-23  8:26   ` HORIGUCHI NAOYA(堀口 直也)
  2021-08-23  9:14     ` Miaohe Lin
  0 siblings, 1 reply; 14+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-08-23  8:26 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Sat, Aug 21, 2021 at 05:42:46PM +0800, Miaohe Lin wrote:
> HWPoisoned dirty swapcache pages are kept for killing owner processes.
> We should not offline these pages or do_swap_page() would access the
> offline pages and lead to bad ending.
> 

Thank you for the report.  I'm not yet sure of the whole picture of this
issue.  do_swap_page() is expected to return VM_FAULT_HWPOISON when it is
called because of an access to the error page, so I wonder why this doesn't
work for your situation.  And what is the "bad ending" in the description?

I feel that aborting memory hotremove due to a hwpoisoned dirty swapcache
page might be too drastic, so I'd like to find another solution if there
is one.
# You may separate this patch from former two to make them merged to
# mainline soon.

Thanks,
Naoya Horiguchi
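
The expectation described in this reply can be modelled roughly as
below. It is a sketch under assumptions: struct swapin_page,
swap_fault_sketch() and the two fault codes are hypothetical, and the
real do_swap_page() does far more than this. It only shows the shape of
the argument: the swap-in path checks the poison flag on the page it
found and reports a hwpoison fault instead of mapping it.

```c
#include <stdbool.h>

/* Hypothetical fault codes; the kernel uses VM_FAULT_* bit flags. */
enum {
	FAULT_NONE_SKETCH = 0,
	FAULT_HWPOISON_SKETCH = 1,
};

/* Hypothetical view of the page found in the swap cache. */
struct swapin_page {
	bool hwpoison;
};

/* Models only the poison check on the swap-in path. */
static int swap_fault_sketch(const struct swapin_page *page)
{
	if (page->hwpoison)
		return FAULT_HWPOISON_SKETCH;	/* owner gets killed, page is not mapped */
	return FAULT_NONE_SKETCH;
}
```

The check itself reads the page's flags, so it presumes the struct page
is still valid at that point.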

> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/memory_hotplug.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 9fd0be32a281..0488eed3327c 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>  		 */
>  		if (PageOffline(page) && page_count(page))
>  			return -EBUSY;
> +		/*
> +		 * HWPoisoned dirty swapcache pages are definitely unmovable
> +		 * because they are kept for killing owner processes.
> +		 */
> +		if (PageHWPoison(page) && PageSwapCache(page))
> +			return -EBUSY;


* Re: [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
  2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
  2021-08-23  8:20   ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-08-23  9:11   ` Oscar Salvador
  2021-08-23 12:14   ` David Hildenbrand
  2 siblings, 0 replies; 14+ messages in thread
From: Oscar Salvador @ 2021-08-23  9:11 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021-08-21 11:42, Miaohe Lin wrote:
> Use helper zone_is_zone_device() to simplify the code and remove some
> explicit CONFIG_ZONE_DEVICE codes.
> 
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


* Re: [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
  2021-08-23  8:26   ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-08-23  9:14     ` Miaohe Lin
  2021-11-04 22:07       ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Miaohe Lin @ 2021-08-23  9:14 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021/8/23 16:26, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Sat, Aug 21, 2021 at 05:42:46PM +0800, Miaohe Lin wrote:
>> HWPoisoned dirty swapcache pages are kept for killing owner processes.
>> We should not offline these pages or do_swap_page() would access the
>> offline pages and lead to bad ending.
>>
> 
> Thank you for the report.  I'm not yet sure of the whole picture of this
> issue.  do_swap_page() is expected to return with fault VM_FAULT_HWPOISON
> when called via the access to the error page, so I wonder why this doesn't
> work for your situation.  And what is the "bad ending" in the description?
> 

IMO we might hot-remove the page while the swap cache still holds a reference
to it, so the struct page would be accessed after it has been offlined. The
struct page should be invalid in that case, which would make do_swap_page()
fragile. Or am I missing something?

> I feel that aborting memory hotremove due to a hwpoisoned dirty swapcache
> might be too hard, so I'd like to find another solution if we have.

If there is a better way, we can just drop this one.

Many thanks for your review and reply! :)

> # You may separate this patch from former two to make them merged to
> # mainline soon.
> 
> Thanks,
> Naoya Horiguchi
> 
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/memory_hotplug.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 9fd0be32a281..0488eed3327c 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		 */
>>  		if (PageOffline(page) && page_count(page))
>>  			return -EBUSY;
>> +		/*
>> +		 * HWPoisoned dirty swapcache pages are definitely unmovable
>> +		 * because they are kept for killing owner processes.
>> +		 */
>> +		if (PageHWPoison(page) && PageSwapCache(page))
>> +			return -EBUSY;



* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
  2021-08-21  9:42 ` [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable Miaohe Lin
  2021-08-23  8:21   ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-08-23  9:15   ` Oscar Salvador
  2021-08-23 11:13     ` Miaohe Lin
  2021-08-23 12:15   ` David Hildenbrand
  2 siblings, 1 reply; 14+ messages in thread
From: Oscar Salvador @ 2021-08-23  9:15 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021-08-21 11:42, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
> 
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

Should this go to stable?
If we fail to enable it again, we will bypass the pvec cache any time
we add a new page to the LRU, which might lead to a severe performance
regression.

> ---
>  mm/memory_hotplug.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d986d3791986..9fd0be32a281 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>  failed_removal_pcplists_disabled:
> +	lru_cache_enable();
>  	zone_pcp_enable(zone);
>  failed_removal:
>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",

-- 
Oscar Salvador
SUSE L3
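
The performance concern raised above can be made concrete with a toy
model. All names here are hypothetical (struct lru_batch,
lru_add_sketch(), flushes_for(), BATCH_SIZE), and the batch size of 15
is an assumption chosen for illustration. The sketch counts how often a
per-CPU batch has to be drained to the LRU: normally once per full
batch, but on every single add while the disable depth is nonzero.

```c
/* Assumed batch capacity, for illustration only. */
#define BATCH_SIZE 15

/* Hypothetical per-CPU batch of pages waiting to be moved to the LRU. */
struct lru_batch {
	int nr;		/* pages currently queued */
	int flushes;	/* number of times the batch was drained */
};

/* Mock of the disable depth left behind by an unbalanced disable call. */
static int lru_disable_depth;

/* Queue one page; drain when the batch is full or batching is disabled. */
static void lru_add_sketch(struct lru_batch *b)
{
	b->nr++;
	if (b->nr == BATCH_SIZE || lru_disable_depth > 0) {
		b->flushes++;
		b->nr = 0;
	}
}

/* Count the drains needed to add `pages` pages at a given disable depth. */
static int flushes_for(int pages, int disable_depth)
{
	struct lru_batch b = { 0, 0 };

	lru_disable_depth = disable_depth;
	for (int i = 0; i < pages; i++)
		lru_add_sketch(&b);
	return b.flushes;
}
```

In this model 30 adds cost 2 drains with batching enabled and 30 drains
with it stuck disabled; each drain in the kernel involves taking the LRU
lock, which is where the regression would come from.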


* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
  2021-08-23  9:15   ` Oscar Salvador
@ 2021-08-23 11:13     ` Miaohe Lin
  0 siblings, 0 replies; 14+ messages in thread
From: Miaohe Lin @ 2021-08-23 11:13 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: akpm, naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021/8/23 17:15, Oscar Salvador wrote:
> On 2021-08-21 11:42, Miaohe Lin wrote:
>> If offline_pages failed after lru_cache_disable(), it forgot to do
>> lru_cache_enable() in error path. So we would have lru cache disabled
>> permanently in this case.
>>
>> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> 

Many thanks for your review and reply. :)

> Should this go to stable?
> In case we fail to enable it again, we will bypass the pvec cache anytime we add a new page to the LRU which might lead to severe performance regression?
> 

I agree. I think this should go to stable too.

>> ---
>>  mm/memory_hotplug.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index d986d3791986..9fd0be32a281 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>>      undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>>      memory_notify(MEM_CANCEL_OFFLINE, &arg);
>>  failed_removal_pcplists_disabled:
>> +    lru_cache_enable();
>>      zone_pcp_enable(zone);
>>  failed_removal:
>>      pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
> 



* Re: [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
  2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
  2021-08-23  8:20   ` HORIGUCHI NAOYA(堀口 直也)
  2021-08-23  9:11   ` Oscar Salvador
@ 2021-08-23 12:14   ` David Hildenbrand
  2 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2021-08-23 12:14 UTC (permalink / raw)
  To: Miaohe Lin, akpm
  Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 21.08.21 11:42, Miaohe Lin wrote:
> Use helper zone_is_zone_device() to simplify the code and remove some
> explicit CONFIG_ZONE_DEVICE codes.
> 
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>   mm/memory_hotplug.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b287ff3d7229..d986d3791986 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -477,15 +477,13 @@ void __ref remove_pfn_range_from_zone(struct zone *zone,
>   				 sizeof(struct page) * cur_nr_pages);
>   	}
>   
> -#ifdef CONFIG_ZONE_DEVICE
>   	/*
>   	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
>   	 * we will not try to shrink the zones - which is okay as
>   	 * set_zone_contiguous() cannot deal with ZONE_DEVICE either way.
>   	 */
> -	if (zone_idx(zone) == ZONE_DEVICE)
> +	if (zone_is_zone_device(zone))
>   		return;
> -#endif
>   
>   	clear_zone_contiguous(zone);
>   
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
  2021-08-21  9:42 ` [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable Miaohe Lin
  2021-08-23  8:21   ` HORIGUCHI NAOYA(堀口 直也)
  2021-08-23  9:15   ` Oscar Salvador
@ 2021-08-23 12:15   ` David Hildenbrand
  2 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2021-08-23 12:15 UTC (permalink / raw)
  To: Miaohe Lin, akpm
  Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 21.08.21 11:42, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
> 
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>   mm/memory_hotplug.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d986d3791986..9fd0be32a281 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>   	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>   	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>   failed_removal_pcplists_disabled:
> +	lru_cache_enable();
>   	zone_pcp_enable(zone);
>   failed_removal:
>   	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

As mentioned, this should be backported to stable.

-- 
Thanks,

David / dhildenb



* Re: [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
  2021-08-23  9:14     ` Miaohe Lin
@ 2021-11-04 22:07       ` Andrew Morton
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2021-11-04 22:07 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: HORIGUCHI NAOYA, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Mon, 23 Aug 2021 17:14:29 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:

> On 2021/8/23 16:26, HORIGUCHI NAOYA(堀口 直也) wrote:
> > On Sat, Aug 21, 2021 at 05:42:46PM +0800, Miaohe Lin wrote:
> >> HWPoisoned dirty swapcache pages are kept for killing owner processes.
> >> We should not offline these pages or do_swap_page() would access the
> >> offline pages and lead to bad ending.
> >>
> > 
> > Thank you for the report.  I'm not yet sure of the whole picture of this
> > issue.  do_swap_page() is expected to return with fault VM_FAULT_HWPOISON
> > when called via the access to the error page, so I wonder why this doesn't
> > work for your situation.  And what is the "bad ending" in the description?
> > 
> 
> IMO we might hotremove the page while SwapCache still have ref to it. Thus the page
> struct would be accessed after offlined. The page struct should be invalid in this case
> and this would make do_swap_page fragile. Or am I miss something?
> 
> > I feel that aborting memory hotremove due to a hwpoisoned dirty swapcache
> > might be too hard, so I'd like to find another solution if we have.
> 
> If there is a better way, we can just drop this one.
> 
> Many thanks for your review and reply! :)
> 
> > # You may separate this patch from former two to make them merged to
> > # mainline soon.
>
> ...
>
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
> >>  		 */
> >>  		if (PageOffline(page) && page_count(page))
> >>  			return -EBUSY;
> >> +		/*
> >> +		 * HWPoisoned dirty swapcache pages are definitely unmovable
> >> +		 * because they are kept for killing owner processes.
> >> +		 */
> >> +		if (PageHWPoison(page) && PageSwapCache(page))
> >> +			return -EBUSY;
> 

I'll drop this.  Please resend something if you still believe that
changes are desirable.  


Thread overview: 14+ messages
2021-08-21  9:42 [PATCH 0/3] Cleanup and fixups for memory hotplug Miaohe Lin
2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
2021-08-23  8:20   ` HORIGUCHI NAOYA(堀口 直也)
2021-08-23  9:11   ` Oscar Salvador
2021-08-23 12:14   ` David Hildenbrand
2021-08-21  9:42 ` [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable Miaohe Lin
2021-08-23  8:21   ` HORIGUCHI NAOYA(堀口 直也)
2021-08-23  9:15   ` Oscar Salvador
2021-08-23 11:13     ` Miaohe Lin
2021-08-23 12:15   ` David Hildenbrand
2021-08-21  9:42 ` [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable Miaohe Lin
2021-08-23  8:26   ` HORIGUCHI NAOYA(堀口 直也)
2021-08-23  9:14     ` Miaohe Lin
2021-11-04 22:07       ` Andrew Morton
