LKML Archive on lore.kernel.org
* [PATCH v2] mm: thp: update split_queue_len correctly
@ 2021-11-23 19:09 Shakeel Butt
  2021-11-23 20:00 ` Yang Shi
  2021-11-24 20:12 ` Kirill A. Shutemov
  0 siblings, 2 replies; 6+ messages in thread
From: Shakeel Butt @ 2021-11-23 19:09 UTC
  To: David Hildenbrand, Kirill A . Shutemov, Yang Shi, Zi Yan, Matthew Wilcox
  Cc: Andrew Morton, linux-mm, linux-kernel, Shakeel Butt

Deferred THPs are split under memory pressure through the shrinker
callback, and splitting a THP during reclaim can fail for several
reasons: the THP cannot be locked, it is under writeback, or it has an
unexpected number of pins. Such pages are put back on the deferred split
list for later consideration. However, the kernel does not update the
deferred queue size when it puts back the pages whose split failed.
This patch fixes that.

Without this patch, split_queue_len can underflow. The shrinker will
always see THPs to split even when there are none, and will waste CPU
scanning the empty list.

Fixes: 364c1eebe453 ("mm: thp: extract split_queue_* into a struct")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
Changes since v1:
- updated commit message
- incorporated Yang Shi's suggestion

 mm/huge_memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e5483347291c..d393028681e2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2809,7 +2809,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 	unsigned long flags;
 	LIST_HEAD(list), *pos, *next;
 	struct page *page;
-	int split = 0;
+	unsigned long split = 0;
 
 #ifdef CONFIG_MEMCG
 	if (sc->memcg)
@@ -2847,6 +2847,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 
 	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
 	list_splice_tail(&list, &ds_queue->split_queue);
+	ds_queue->split_queue_len -= split;
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 
 	/*
-- 
2.34.0.rc2.393.gf8c9666880-goog


* Re: [PATCH v2] mm: thp: update split_queue_len correctly
  2021-11-23 19:09 [PATCH v2] mm: thp: update split_queue_len correctly Shakeel Butt
@ 2021-11-23 20:00 ` Yang Shi
  2021-11-24 20:12 ` Kirill A. Shutemov
  1 sibling, 0 replies; 6+ messages in thread
From: Yang Shi @ 2021-11-23 20:00 UTC
  To: Shakeel Butt
  Cc: David Hildenbrand, Kirill A . Shutemov, Zi Yan, Matthew Wilcox,
	Andrew Morton, Linux MM, Linux Kernel Mailing List

On Tue, Nov 23, 2021 at 11:09 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> The deferred THPs are split on memory pressure through shrinker
> callback and splitting of THP during reclaim can fail for several
> reasons like unable to lock the THP, under writeback or unexpected
> number of pins on the THP. Such pages are put back on the deferred split
> list for consideration later. However kernel does not update the
> deferred queue size on putting back the pages whose split was failed.
> This patch fixes that.
>
> Without this patch the split_queue_len can underflow. Shrinker will
> always get that there are some THPs to split even if there are not and
> waste some cpu to scan the empty list.
>
> Fixes: 364c1eebe453 ("mm: thp: extract split_queue_* into a struct")
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
> Changes since v1:
> - updated commit message
> - incorporated Yang Shi's suggestion

Reviewed-by: Yang Shi <shy828301@gmail.com>

>
>  mm/huge_memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e5483347291c..d393028681e2 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2809,7 +2809,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>         unsigned long flags;
>         LIST_HEAD(list), *pos, *next;
>         struct page *page;
> -       int split = 0;
> +       unsigned long split = 0;
>
>  #ifdef CONFIG_MEMCG
>         if (sc->memcg)
> @@ -2847,6 +2847,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>
>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>         list_splice_tail(&list, &ds_queue->split_queue);
> +       ds_queue->split_queue_len -= split;
>         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>
>         /*
> --
> 2.34.0.rc2.393.gf8c9666880-goog
>

* Re: [PATCH v2] mm: thp: update split_queue_len correctly
  2021-11-23 19:09 [PATCH v2] mm: thp: update split_queue_len correctly Shakeel Butt
  2021-11-23 20:00 ` Yang Shi
@ 2021-11-24 20:12 ` Kirill A. Shutemov
  2021-11-24 20:44   ` Shakeel Butt
  1 sibling, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2021-11-24 20:12 UTC
  To: Shakeel Butt
  Cc: David Hildenbrand, Kirill A . Shutemov, Yang Shi, Zi Yan,
	Matthew Wilcox, Andrew Morton, linux-mm, linux-kernel

On Tue, Nov 23, 2021 at 11:09:16AM -0800, Shakeel Butt wrote:
> The deferred THPs are split on memory pressure through shrinker
> callback and splitting of THP during reclaim can fail for several
> reasons like unable to lock the THP, under writeback or unexpected
> number of pins on the THP. Such pages are put back on the deferred split
> list for consideration later. However kernel does not update the
> deferred queue size on putting back the pages whose split was failed.
> This patch fixes that.

Hm. No. split_huge_page_to_list() updates the queue size on split success.

NAK.
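
The objection can be illustrated with a minimal userspace sketch. It is not kernel code: `ds_queue_model`, `model_split_success()` and `model_scan()` are hypothetical names that model only the counter, assuming (as the review states) that a successful split already decrements the queue length. With the patch's additional `split_queue_len -= split`, each successful split would be subtracted twice.

```c
#include <assert.h>

/* Hypothetical stand-in for the deferred split queue; only the counter is modeled. */
struct ds_queue_model {
	unsigned long split_queue_len;
};

/* Models a successful split, which (per the review) already decrements the counter. */
void model_split_success(struct ds_queue_model *q)
{
	q->split_queue_len--;
}

/*
 * Models the shrinker scan: nr_ok pages split successfully.
 * When extra_decrement is nonzero, the patch's additional
 * "split_queue_len -= split" is applied after the loop.
 * Returns the resulting queue length.
 */
unsigned long model_scan(struct ds_queue_model *q, unsigned long nr_ok,
			 int extra_decrement)
{
	unsigned long split = 0;
	unsigned long i;

	for (i = 0; i < nr_ok; i++) {
		model_split_success(q);
		split++;
	}

	if (extra_decrement)
		q->split_queue_len -= split;	/* second subtraction of the same pages */

	return q->split_queue_len;
}
```

Starting from a length of 3 with all three splits succeeding, `model_scan(&q, 3, 0)` leaves 0, while `model_scan(&q, 3, 1)` subtracts the same pages again and wraps the unsigned counter around.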

-- 
 Kirill A. Shutemov

* Re: [PATCH v2] mm: thp: update split_queue_len correctly
  2021-11-24 20:12 ` Kirill A. Shutemov
@ 2021-11-24 20:44   ` Shakeel Butt
  2021-11-24 21:17     ` Yang Shi
  0 siblings, 1 reply; 6+ messages in thread
From: Shakeel Butt @ 2021-11-24 20:44 UTC
  To: Kirill A. Shutemov
  Cc: David Hildenbrand, Kirill A . Shutemov, Yang Shi, Zi Yan,
	Matthew Wilcox, Andrew Morton, linux-mm, linux-kernel

On Wed, Nov 24, 2021 at 12:12 PM Kirill A. Shutemov
<kirill@shutemov.name> wrote:
>
> On Tue, Nov 23, 2021 at 11:09:16AM -0800, Shakeel Butt wrote:
> > The deferred THPs are split on memory pressure through shrinker
> > callback and splitting of THP during reclaim can fail for several
> > reasons like unable to lock the THP, under writeback or unexpected
> > number of pins on the THP. Such pages are put back on the deferred split
> > list for consideration later. However kernel does not update the
> > deferred queue size on putting back the pages whose split was failed.
> > This patch fixes that.
>
> Hm. No. split_huge_page_to_list() updates the queue size on split success.
>

Right. This is really convoluted. split_huge_page_to_list() is just
assuming that if the given page is on a deferred list then it must be
on the list returned by get_deferred_split_queue(page). The
interaction of move_charge and deferred split seems broken.
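
A minimal userspace sketch of the assumption in question, not kernel code: all names here (`queue_model`, `page_model`, `model_get_queue()`, `model_move_owner()`) are hypothetical. It assumes the queue is looked up from the page's current owner at dequeue time, and shows what would go wrong if the owner changed while the page stayed on the old list. Whether the kernel can actually reach this state is exactly what the thread goes on to discuss.

```c
#include <assert.h>

/* Hypothetical per-owner deferred split queue; only the counter is modeled. */
struct queue_model {
	unsigned long len;
};

/* Hypothetical page: its queue is derived from its current owner. */
struct page_model {
	struct queue_model *owner;
	int on_deferred_list;
};

/* Hypothetical analogue of looking up the queue from the page. */
struct queue_model *model_get_queue(struct page_model *p)
{
	return p->owner;
}

void model_enqueue(struct page_model *p)
{
	if (!p->on_deferred_list) {
		p->on_deferred_list = 1;
		model_get_queue(p)->len++;
	}
}

/* Dequeue trusts that the page sits on the queue returned by the lookup. */
void model_dequeue(struct page_model *p)
{
	if (p->on_deferred_list) {
		p->on_deferred_list = 0;
		model_get_queue(p)->len--;
	}
}

/* Assumed scenario: the owner changes without requeueing the page. */
void model_move_owner(struct page_model *p, struct queue_model *to)
{
	p->owner = to;
}
```

In this model, enqueueing on queue A, moving the owner to queue B, and then dequeueing leaves A's length at 1 and wraps B's length below zero, because the decrement lands on a queue the page was never counted in.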

Andrew, can you please drop this patch?

* Re: [PATCH v2] mm: thp: update split_queue_len correctly
  2021-11-24 20:44   ` Shakeel Butt
@ 2021-11-24 21:17     ` Yang Shi
  2021-11-24 21:19       ` Shakeel Butt
  0 siblings, 1 reply; 6+ messages in thread
From: Yang Shi @ 2021-11-24 21:17 UTC
  To: Shakeel Butt
  Cc: Kirill A. Shutemov, David Hildenbrand, Kirill A . Shutemov,
	Zi Yan, Matthew Wilcox, Andrew Morton, Linux MM,
	Linux Kernel Mailing List

On Wed, Nov 24, 2021 at 12:44 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Wed, Nov 24, 2021 at 12:12 PM Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> >
> > On Tue, Nov 23, 2021 at 11:09:16AM -0800, Shakeel Butt wrote:
> > > The deferred THPs are split on memory pressure through shrinker
> > > callback and splitting of THP during reclaim can fail for several
> > > reasons like unable to lock the THP, under writeback or unexpected
> > > number of pins on the THP. Such pages are put back on the deferred split
> > > list for consideration later. However kernel does not update the
> > > deferred queue size on putting back the pages whose split was failed.
> > > This patch fixes that.
> >
> > Hm. No. split_huge_page_to_list() updates the queue size on split success.
> >
>
> Right. This is really convoluted. split_huge_page_to_list() is just
> assuming that if the given page is on a deferred list then it must be
> on the list returned by get_deferred_split_queue(page). The
> interaction of move_charge and deferred split seems broken.

That is because the memcg code doesn't move the charge for a PTE-mapped THP
at all. See the following comment from mem_cgroup_move_charge_pte_range():

"We can have a part of the split pmd here. Moving it can be done but
it would be too convoluted so simply ignore such a partial THP and
keep it in original memcg. There should be somebody mapping the head."

BTW, did you run into any problem related to this?

>
> Andrew, can you please drop this patch?

* Re: [PATCH v2] mm: thp: update split_queue_len correctly
  2021-11-24 21:17     ` Yang Shi
@ 2021-11-24 21:19       ` Shakeel Butt
  0 siblings, 0 replies; 6+ messages in thread
From: Shakeel Butt @ 2021-11-24 21:19 UTC
  To: Yang Shi
  Cc: Kirill A. Shutemov, David Hildenbrand, Kirill A . Shutemov,
	Zi Yan, Matthew Wilcox, Andrew Morton, Linux MM,
	Linux Kernel Mailing List

On Wed, Nov 24, 2021 at 1:17 PM Yang Shi <shy828301@gmail.com> wrote:
>
> On Wed, Nov 24, 2021 at 12:44 PM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Wed, Nov 24, 2021 at 12:12 PM Kirill A. Shutemov
> > <kirill@shutemov.name> wrote:
> > >
> > > On Tue, Nov 23, 2021 at 11:09:16AM -0800, Shakeel Butt wrote:
> > > > The deferred THPs are split on memory pressure through shrinker
> > > > callback and splitting of THP during reclaim can fail for several
> > > > reasons like unable to lock the THP, under writeback or unexpected
> > > > number of pins on the THP. Such pages are put back on the deferred split
> > > > list for consideration later. However kernel does not update the
> > > > deferred queue size on putting back the pages whose split was failed.
> > > > This patch fixes that.
> > >
> > > Hm. No. split_huge_page_to_list() updates the queue size on split success.
> > >
> >
> > Right. This is really convoluted. split_huge_page_to_list() is just
> > assuming that if the given page is on a deferred list then it must be
> > on the list returned by get_deferred_split_queue(page). The
> > interaction of move_charge and deferred split seems broken.
>
> Because memcg code doesn't move charge for PTE mapped THP at all. See
> the below comment from mem_cgroup_move_charge_pte_range():
>
> "We can have a part of the split pmd here. Moving it can be done but
> it would be too convoluted so simply ignore such a partial THP and
> keep it in original memcg. There should be somebody mapping the head."
>
> BTW, did you run into any problem related to this?
>

No, just reading code to see if I can share code for the sync splitting of THPs.
