LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RFC PATCH] mm: thp: grab the lock before manipulation defer list
@ 2020-01-03 14:34 Wei Yang
2020-01-03 19:29 ` David Rientjes
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Wei Yang @ 2020-01-03 14:34 UTC (permalink / raw)
To: hannes, mhocko, vdavydov.dev, akpm
Cc: kirill.shutemov, cgroups, linux-mm, linux-kernel, yang.shi, Wei Yang
As all the other places, we grab the lock before manipulate the defer list.
Current implementation may face a race condition.
Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
---
I notice the difference during code reading and just confused about the
difference. No specific test is done since limited knowledge about cgroup.
Maybe I miss something important?
---
mm/memcontrol.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bc01423277c5..62b7ec34ef1a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ spin_lock(&from->deferred_split_queue.split_queue_lock);
if (compound && !list_empty(page_deferred_list(page))) {
- spin_lock(&from->deferred_split_queue.split_queue_lock);
list_del_init(page_deferred_list(page));
from->deferred_split_queue.split_queue_len--;
- spin_unlock(&from->deferred_split_queue.split_queue_lock);
}
+ spin_unlock(&from->deferred_split_queue.split_queue_lock);
#endif
/*
* It is safe to change page->mem_cgroup here because the page
@@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
page->mem_cgroup = to;
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ spin_lock(&to->deferred_split_queue.split_queue_lock);
if (compound && list_empty(page_deferred_list(page))) {
- spin_lock(&to->deferred_split_queue.split_queue_lock);
list_add_tail(page_deferred_list(page),
&to->deferred_split_queue.split_queue);
to->deferred_split_queue.split_queue_len++;
- spin_unlock(&to->deferred_split_queue.split_queue_lock);
}
+ spin_unlock(&to->deferred_split_queue.split_queue_lock);
#endif
spin_unlock_irqrestore(&from->move_lock, flags);
--
2.17.1
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
@ 2020-01-03 19:29 ` David Rientjes
2020-01-03 23:39 ` Wei Yang
2020-01-06 10:23 ` Michal Hocko
2020-01-06 16:18 ` Alexander Duyck
2 siblings, 1 reply; 19+ messages in thread
From: David Rientjes @ 2020-01-03 19:29 UTC (permalink / raw)
To: Wei Yang
Cc: hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov, cgroups,
linux-mm, linux-kernel, yang.shi
On Fri, 3 Jan 2020, Wei Yang wrote:
> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.
>
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
>
> Maybe I miss something important?
The check for !list_empty(page_deferred_list(page)) must certainly be
serialized with doing list_del_init(page_deferred_list(page)).
> ---
> mm/memcontrol.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
> }
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + spin_lock(&from->deferred_split_queue.split_queue_lock);
> if (compound && !list_empty(page_deferred_list(page))) {
> - spin_lock(&from->deferred_split_queue.split_queue_lock);
> list_del_init(page_deferred_list(page));
> from->deferred_split_queue.split_queue_len--;
> - spin_unlock(&from->deferred_split_queue.split_queue_lock);
> }
> + spin_unlock(&from->deferred_split_queue.split_queue_lock);
> #endif
> /*
> * It is safe to change page->mem_cgroup here because the page
> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
> page->mem_cgroup = to;
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + spin_lock(&to->deferred_split_queue.split_queue_lock);
> if (compound && list_empty(page_deferred_list(page))) {
> - spin_lock(&to->deferred_split_queue.split_queue_lock);
> list_add_tail(page_deferred_list(page),
> &to->deferred_split_queue.split_queue);
> to->deferred_split_queue.split_queue_len++;
> - spin_unlock(&to->deferred_split_queue.split_queue_lock);
> }
> + spin_unlock(&to->deferred_split_queue.split_queue_lock);
> #endif
>
> spin_unlock_irqrestore(&from->move_lock, flags);
> --
> 2.17.1
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-03 19:29 ` David Rientjes
@ 2020-01-03 23:39 ` Wei Yang
2020-01-04 0:44 ` David Rientjes
0 siblings, 1 reply; 19+ messages in thread
From: Wei Yang @ 2020-01-03 23:39 UTC (permalink / raw)
To: David Rientjes
Cc: Wei Yang, hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov,
cgroups, linux-mm, linux-kernel, yang.shi
On Fri, Jan 03, 2020 at 11:29:06AM -0800, David Rientjes wrote:
>On Fri, 3 Jan 2020, Wei Yang wrote:
>
>> As all the other places, we grab the lock before manipulate the defer list.
>> Current implementation may face a race condition.
>>
>> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>>
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>>
>> ---
>> I notice the difference during code reading and just confused about the
>> difference. No specific test is done since limited knowledge about cgroup.
>>
>> Maybe I miss something important?
>
>The check for !list_empty(page_deferred_list(page)) must certainly be
>serialized with doing list_del_init(page_deferred_list(page)).
>
Hi David
Would you mind giving more information? You mean list_empty and list_del_init
is atomic?
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-03 23:39 ` Wei Yang
@ 2020-01-04 0:44 ` David Rientjes
2020-01-06 1:20 ` Wei Yang
0 siblings, 1 reply; 19+ messages in thread
From: David Rientjes @ 2020-01-04 0:44 UTC (permalink / raw)
To: Wei Yang
Cc: hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov, cgroups,
linux-mm, linux-kernel, yang.shi
On Sat, 4 Jan 2020, Wei Yang wrote:
> On Fri, Jan 03, 2020 at 11:29:06AM -0800, David Rientjes wrote:
> >On Fri, 3 Jan 2020, Wei Yang wrote:
> >
> >> As all the other places, we grab the lock before manipulate the defer list.
> >> Current implementation may face a race condition.
> >>
> >> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
> >>
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >>
> >> ---
> >> I notice the difference during code reading and just confused about the
> >> difference. No specific test is done since limited knowledge about cgroup.
> >>
> >> Maybe I miss something important?
> >
> >The check for !list_empty(page_deferred_list(page)) must certainly be
> >serialized with doing list_del_init(page_deferred_list(page)).
> >
>
> Hi David
>
> Would you mind giving more information? You mean list_empty and list_del_init
> is atomic?
>
I mean your patch is obviously correct :) It should likely also have a
stable@vger.kernel.org # 5.4+
Acked-by: David Rientjes <rientjes@google.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-04 0:44 ` David Rientjes
@ 2020-01-06 1:20 ` Wei Yang
0 siblings, 0 replies; 19+ messages in thread
From: Wei Yang @ 2020-01-06 1:20 UTC (permalink / raw)
To: David Rientjes
Cc: Wei Yang, hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov,
cgroups, linux-mm, linux-kernel, yang.shi
On Fri, Jan 03, 2020 at 04:44:59PM -0800, David Rientjes wrote:
>On Sat, 4 Jan 2020, Wei Yang wrote:
>
>> On Fri, Jan 03, 2020 at 11:29:06AM -0800, David Rientjes wrote:
>> >On Fri, 3 Jan 2020, Wei Yang wrote:
>> >
>> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> Current implementation may face a race condition.
>> >>
>> >> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>> >>
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >>
>> >> ---
>> >> I notice the difference during code reading and just confused about the
>> >> difference. No specific test is done since limited knowledge about cgroup.
>> >>
>> >> Maybe I miss something important?
>> >
>> >The check for !list_empty(page_deferred_list(page)) must certainly be
>> >serialized with doing list_del_init(page_deferred_list(page)).
>> >
>>
>> Hi David
>>
>> Would you mind giving more information? You mean list_empty and list_del_init
>> is atomic?
>>
>
>I mean your patch is obviously correct :) It should likely also have a
>stable@vger.kernel.org # 5.4+
Ah, my poor English ;-)
>
>Acked-by: David Rientjes <rientjes@google.com>
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
2020-01-03 19:29 ` David Rientjes
@ 2020-01-06 10:23 ` Michal Hocko
2020-01-07 1:22 ` Wei Yang
2020-01-06 16:18 ` Alexander Duyck
2 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2020-01-06 10:23 UTC (permalink / raw)
To: Wei Yang
Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
linux-kernel, yang.shi
On Fri 03-01-20 22:34:07, Wei Yang wrote:
> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.
Please always make sure to describe the effect of the change. Why a racy
list_empty check matters?
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
>
> Maybe I miss something important?
> ---
> mm/memcontrol.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
> }
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + spin_lock(&from->deferred_split_queue.split_queue_lock);
> if (compound && !list_empty(page_deferred_list(page))) {
> - spin_lock(&from->deferred_split_queue.split_queue_lock);
> list_del_init(page_deferred_list(page));
> from->deferred_split_queue.split_queue_len--;
> - spin_unlock(&from->deferred_split_queue.split_queue_lock);
> }
> + spin_unlock(&from->deferred_split_queue.split_queue_lock);
> #endif
> /*
> * It is safe to change page->mem_cgroup here because the page
> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
> page->mem_cgroup = to;
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + spin_lock(&to->deferred_split_queue.split_queue_lock);
> if (compound && list_empty(page_deferred_list(page))) {
> - spin_lock(&to->deferred_split_queue.split_queue_lock);
> list_add_tail(page_deferred_list(page),
> &to->deferred_split_queue.split_queue);
> to->deferred_split_queue.split_queue_len++;
> - spin_unlock(&to->deferred_split_queue.split_queue_lock);
> }
> + spin_unlock(&to->deferred_split_queue.split_queue_lock);
> #endif
>
> spin_unlock_irqrestore(&from->move_lock, flags);
> --
> 2.17.1
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
2020-01-03 19:29 ` David Rientjes
2020-01-06 10:23 ` Michal Hocko
@ 2020-01-06 16:18 ` Alexander Duyck
2020-01-07 1:26 ` Wei Yang
2 siblings, 1 reply; 19+ messages in thread
From: Alexander Duyck @ 2020-01-06 16:18 UTC (permalink / raw)
To: Wei Yang
Cc: Johannes Weiner, Michal Hocko, vdavydov.dev, Andrew Morton,
Kirill A. Shutemov, cgroups, linux-mm, LKML, Yang Shi
On Fri, Jan 3, 2020 at 6:34 AM Wei Yang <richardw.yang@linux.intel.com> wrote:
>
> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.
>
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
>
> Maybe I miss something important?
> ---
> mm/memcontrol.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
> }
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + spin_lock(&from->deferred_split_queue.split_queue_lock);
> if (compound && !list_empty(page_deferred_list(page))) {
> - spin_lock(&from->deferred_split_queue.split_queue_lock);
> list_del_init(page_deferred_list(page));
> from->deferred_split_queue.split_queue_len--;
> - spin_unlock(&from->deferred_split_queue.split_queue_lock);
> }
> + spin_unlock(&from->deferred_split_queue.split_queue_lock);
> #endif
> /*
> * It is safe to change page->mem_cgroup here because the page
So I suspect the lock placement has to do with the compound boolean
value passed to the function.
One thing you might want to do is pull the "if (compound)" check out
and place it outside of the spinlock check. It would then simplify
this signficantly so it is something like
if (compound) {
spin_lock();
list = page_deferred_list(page);
if (!list_empty(list)) {
list_del_init(list);
from->..split_queue_len--;
}
spin_unlock();
}
Same for the block below. I would pull the check for compound outside
of the spinlock call since it is a value that shouldn't change and
would eliminate an unnecessary lock in the non-compound case.
> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
> page->mem_cgroup = to;
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + spin_lock(&to->deferred_split_queue.split_queue_lock);
> if (compound && list_empty(page_deferred_list(page))) {
> - spin_lock(&to->deferred_split_queue.split_queue_lock);
> list_add_tail(page_deferred_list(page),
> &to->deferred_split_queue.split_queue);
> to->deferred_split_queue.split_queue_len++;
> - spin_unlock(&to->deferred_split_queue.split_queue_lock);
> }
> + spin_unlock(&to->deferred_split_queue.split_queue_lock);
> #endif
>
> spin_unlock_irqrestore(&from->move_lock, flags);
> --
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-06 10:23 ` Michal Hocko
@ 2020-01-07 1:22 ` Wei Yang
2020-01-07 8:38 ` Michal Hocko
0 siblings, 1 reply; 19+ messages in thread
From: Wei Yang @ 2020-01-07 1:22 UTC (permalink / raw)
To: Michal Hocko
Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
linux-mm, linux-kernel, yang.shi
On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> As all the other places, we grab the lock before manipulate the defer list.
>> Current implementation may face a race condition.
>
>Please always make sure to describe the effect of the change. Why a racy
>list_empty check matters?
>
Hmm... access the list without proper lock leads to many bad behaviors.
For example, if we grab the lock after checking list_empty, the page may
already be removed from list in split_huge_page_list. And then list_del_init
would trigger bug.
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-06 16:18 ` Alexander Duyck
@ 2020-01-07 1:26 ` Wei Yang
2020-01-07 2:07 ` David Rientjes
0 siblings, 1 reply; 19+ messages in thread
From: Wei Yang @ 2020-01-07 1:26 UTC (permalink / raw)
To: Alexander Duyck
Cc: Wei Yang, Johannes Weiner, Michal Hocko, vdavydov.dev,
Andrew Morton, Kirill A. Shutemov, cgroups, linux-mm, LKML,
Yang Shi
On Mon, Jan 06, 2020 at 08:18:34AM -0800, Alexander Duyck wrote:
>On Fri, Jan 3, 2020 at 6:34 AM Wei Yang <richardw.yang@linux.intel.com> wrote:
>>
>> As all the other places, we grab the lock before manipulate the defer list.
>> Current implementation may face a race condition.
>>
>> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>>
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>>
>> ---
>> I notice the difference during code reading and just confused about the
>> difference. No specific test is done since limited knowledge about cgroup.
>>
>> Maybe I miss something important?
>> ---
>> mm/memcontrol.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index bc01423277c5..62b7ec34ef1a 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
>> }
>>
>> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> + spin_lock(&from->deferred_split_queue.split_queue_lock);
>> if (compound && !list_empty(page_deferred_list(page))) {
>> - spin_lock(&from->deferred_split_queue.split_queue_lock);
>> list_del_init(page_deferred_list(page));
>> from->deferred_split_queue.split_queue_len--;
>> - spin_unlock(&from->deferred_split_queue.split_queue_lock);
>> }
>> + spin_unlock(&from->deferred_split_queue.split_queue_lock);
>> #endif
>> /*
>> * It is safe to change page->mem_cgroup here because the page
>
>So I suspect the lock placement has to do with the compound boolean
>value passed to the function.
>
Hey, Alexander
Thanks for your comment.
>One thing you might want to do is pull the "if (compound)" check out
>and place it outside of the spinlock check. It would then simplify
>this signficantly so it is something like
>if (compound) {
> spin_lock();
> list = page_deferred_list(page);
> if (!list_empty(list)) {
> list_del_init(list);
> from->..split_queue_len--;
> }
> spin_unlock();
>}
>
>Same for the block below. I would pull the check for compound outside
>of the spinlock call since it is a value that shouldn't change and
>would eliminate an unnecessary lock in the non-compound case.
This is reasonable, if no objection from others, I would change this in v2.
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-07 1:26 ` Wei Yang
@ 2020-01-07 2:07 ` David Rientjes
2020-01-07 2:33 ` Wei Yang
0 siblings, 1 reply; 19+ messages in thread
From: David Rientjes @ 2020-01-07 2:07 UTC (permalink / raw)
To: Wei Yang
Cc: Alexander Duyck, Johannes Weiner, Michal Hocko, vdavydov.dev,
Andrew Morton, Kirill A. Shutemov, cgroups, linux-mm, LKML,
Yang Shi
On Tue, 7 Jan 2020, Wei Yang wrote:
> >One thing you might want to do is pull the "if (compound)" check out
> >and place it outside of the spinlock check. It would then simplify
> >this signficantly so it is something like
> >if (compound) {
> > spin_lock();
> > list = page_deferred_list(page);
> > if (!list_empty(list)) {
> > list_del_init(list);
> > from->..split_queue_len--;
> > }
> > spin_unlock();
> >}
> >
> >Same for the block below. I would pull the check for compound outside
> >of the spinlock call since it is a value that shouldn't change and
> >would eliminate an unnecessary lock in the non-compound case.
>
> This is reasonable, if no objection from others, I would change this in v2.
Looks fine to me; I don't see it as a necessary improvement but there's
also no reason to object to it. It's definitely a patch that is needed,
however, for the simple reason that with the existing code we can
manipulate the deferred split queue incorrectly so either way works for
me. Feel free to keep my acked-by.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-07 2:07 ` David Rientjes
@ 2020-01-07 2:33 ` Wei Yang
0 siblings, 0 replies; 19+ messages in thread
From: Wei Yang @ 2020-01-07 2:33 UTC (permalink / raw)
To: David Rientjes
Cc: Wei Yang, Alexander Duyck, Johannes Weiner, Michal Hocko,
vdavydov.dev, Andrew Morton, Kirill A. Shutemov, cgroups,
linux-mm, LKML, Yang Shi
On Mon, Jan 06, 2020 at 06:07:29PM -0800, David Rientjes wrote:
>On Tue, 7 Jan 2020, Wei Yang wrote:
>
>> >One thing you might want to do is pull the "if (compound)" check out
>> >and place it outside of the spinlock check. It would then simplify
>> >this signficantly so it is something like
>> >if (compound) {
>> > spin_lock();
>> > list = page_deferred_list(page);
>> > if (!list_empty(list)) {
>> > list_del_init(list);
>> > from->..split_queue_len--;
>> > }
>> > spin_unlock();
>> >}
>> >
>> >Same for the block below. I would pull the check for compound outside
>> >of the spinlock call since it is a value that shouldn't change and
>> >would eliminate an unnecessary lock in the non-compound case.
>>
>> This is reasonable, if no objection from others, I would change this in v2.
>
>Looks fine to me; I don't see it as a necessary improvement but there's
>also no reason to object to it. It's definitely a patch that is needed,
>however, for the simple reason that with the existing code we can
>manipulate the deferred split queue incorrectly so either way works for
>me. Feel free to keep my acked-by.
Ah, thanks David. You are so supportive.
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-07 1:22 ` Wei Yang
@ 2020-01-07 8:38 ` Michal Hocko
2020-01-08 0:35 ` Wei Yang
0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2020-01-07 8:38 UTC (permalink / raw)
To: Wei Yang
Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
linux-kernel, yang.shi
On Tue 07-01-20 09:22:41, Wei Yang wrote:
> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
> >> As all the other places, we grab the lock before manipulate the defer list.
> >> Current implementation may face a race condition.
> >
> >Please always make sure to describe the effect of the change. Why a racy
> >list_empty check matters?
> >
>
> Hmm... access the list without proper lock leads to many bad behaviors.
My point is that the changelog should describe that bad behavior.
> For example, if we grab the lock after checking list_empty, the page may
> already be removed from list in split_huge_page_list. And then list_del_init
> would trigger bug.
And how does list_empty check under the lock guarantee that the page is
on the deferred list?
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-07 8:38 ` Michal Hocko
@ 2020-01-08 0:35 ` Wei Yang
2020-01-08 9:40 ` Michal Hocko
0 siblings, 1 reply; 19+ messages in thread
From: Wei Yang @ 2020-01-08 0:35 UTC (permalink / raw)
To: Michal Hocko
Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
linux-mm, linux-kernel, yang.shi
On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> Current implementation may face a race condition.
>> >
>> >Please always make sure to describe the effect of the change. Why a racy
>> >list_empty check matters?
>> >
>>
>> Hmm... access the list without proper lock leads to many bad behaviors.
>
>My point is that the changelog should describe that bad behavior.
>
>> For example, if we grab the lock after checking list_empty, the page may
>> already be removed from list in split_huge_page_list. And then list_del_init
>> would trigger bug.
>
>And how does list_empty check under the lock guarantee that the page is
>on the deferred list?
Just one confusion, is this kind of description basic concept of concurrent
programming? How detail level we need to describe the effect?
To me, grab the lock before accessing the critical section is obvious.
list_empty and list_del should be the critical section. And the
lock should protect the whole critical section instead of part of it.
>--
>Michal Hocko
>SUSE Labs
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-08 0:35 ` Wei Yang
@ 2020-01-08 9:40 ` Michal Hocko
2020-01-09 2:03 ` Wei Yang
2020-01-09 3:18 ` Wei Yang
0 siblings, 2 replies; 19+ messages in thread
From: Michal Hocko @ 2020-01-08 9:40 UTC (permalink / raw)
To: Wei Yang
Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
linux-kernel, yang.shi
On Wed 08-01-20 08:35:43, Wei Yang wrote:
> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
> >> >> As all the other places, we grab the lock before manipulate the defer list.
> >> >> Current implementation may face a race condition.
> >> >
> >> >Please always make sure to describe the effect of the change. Why a racy
> >> >list_empty check matters?
> >> >
> >>
> >> Hmm... access the list without proper lock leads to many bad behaviors.
> >
> >My point is that the changelog should describe that bad behavior.
> >
> >> For example, if we grab the lock after checking list_empty, the page may
> >> already be removed from list in split_huge_page_list. And then list_del_init
> >> would trigger bug.
> >
> >And how does list_empty check under the lock guarantee that the page is
> >on the deferred list?
>
> Just one confusion, is this kind of description basic concept of concurrent
> programming? How detail level we need to describe the effect?
When I write changelogs for patches like this I usually describe, what
is the potential race - e.g.
CPU1 CPU2
path1 path2
check lock
operation2
unlock
lock
# check might not hold anymore
operation1
unlock
and what is the effect of the race - e.g. a crash, data corruption,
pointless attempt for operation1 which fails with user visible effect
etc.
This helps reviewers and everybody reading the code in the future to
understand the locking scheme.
> To me, grab the lock before accessing the critical section is obvious.
It might be obvious but in many cases it is useful to minimize the
locking and do a potentially race check before the lock is taken if the
resulting operation can handle that.
> list_empty and list_del should be the critical section. And the
> lock should protect the whole critical section instead of part of it.
I am not disputing that. What I am trying to say is that the changelog
should described the problem in the first place.
Moreover, look at the code you are trying to fix. Sure extending the
locking seem straightforward but does it result in a correct code
though? See my question in the previous email. How do we know that the
page is actually enqued in a non-empty list?
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-08 9:40 ` Michal Hocko
@ 2020-01-09 2:03 ` Wei Yang
2020-01-09 8:34 ` Michal Hocko
2020-01-09 3:18 ` Wei Yang
1 sibling, 1 reply; 19+ messages in thread
From: Wei Yang @ 2020-01-09 2:03 UTC (permalink / raw)
To: Michal Hocko
Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
linux-mm, linux-kernel, yang.shi
On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
>On Wed 08-01-20 08:35:43, Wei Yang wrote:
>> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> >> Current implementation may face a race condition.
>> >> >
>> >> >Please always make sure to describe the effect of the change. Why a racy
>> >> >list_empty check matters?
>> >> >
>> >>
>> >> Hmm... access the list without proper lock leads to many bad behaviors.
>> >
>> >My point is that the changelog should describe that bad behavior.
>> >
>> >> For example, if we grab the lock after checking list_empty, the page may
>> >> already be removed from list in split_huge_page_list. And then list_del_init
>> >> would trigger bug.
>> >
>> >And how does list_empty check under the lock guarantee that the page is
>> >on the deferred list?
>>
>> Just one confusion, is this kind of description basic concept of concurrent
>> programming? How detail level we need to describe the effect?
>
>When I write changelogs for patches like this I usually describe, what
>is the potential race - e.g.
> CPU1 CPU2
> path1 path2
> check lock
> operation2
> unlock
> lock
> # check might not hold anymore
> operation1
> unlock
>
Nice, I would prepare a changelog like this.
>and what is the effect of the race - e.g. a crash, data corruption,
>pointless attempt for operation1 which fails with user visible effect
>etc.
>This helps reviewers and everybody reading the code in the future to
>understand the locking scheme.
>
>> To me, grab the lock before accessing the critical section is obvious.
>
>It might be obvious but in many cases it is useful to minimize the
>locking and do a potentially race check before the lock is taken if the
>resulting operation can handle that.
>
>> list_empty and list_del should be the critical section. And the
>> lock should protect the whole critical section instead of part of it.
>
>I am not disputing that. What I am trying to say is that the changelog
>should described the problem in the first place.
>
>Moreover, look at the code you are trying to fix. Sure extending the
>locking seem straightforward but does it result in a correct code
>though? See my question in the previous email. How do we know that the
>page is actually enqued in a non-empty list?
I may not get your point for the last sentence.
The list_empty() doesn't check the queue is empty but check the list, here is
the page, is not enqueued into any list. Is this your concern?
>--
>Michal Hocko
>SUSE Labs
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-08 9:40 ` Michal Hocko
2020-01-09 2:03 ` Wei Yang
@ 2020-01-09 3:18 ` Wei Yang
2020-01-09 8:36 ` Michal Hocko
1 sibling, 1 reply; 19+ messages in thread
From: Wei Yang @ 2020-01-09 3:18 UTC (permalink / raw)
To: Michal Hocko
Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
linux-mm, linux-kernel, yang.shi
On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
>On Wed 08-01-20 08:35:43, Wei Yang wrote:
>> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> >> Current implementation may face a race condition.
>> >> >
>> >> >Please always make sure to describe the effect of the change. Why a racy
>> >> >list_empty check matters?
>> >> >
>> >>
>> >> Hmm... access the list without proper lock leads to many bad behaviors.
>> >
>> >My point is that the changelog should describe that bad behavior.
>> >
>> >> For example, if we grab the lock after checking list_empty, the page may
>> >> already be removed from list in split_huge_page_list. And then list_del_init
>> >> would trigger bug.
>> >
>> >And how does list_empty check under the lock guarantee that the page is
>> >on the deferred list?
>>
>> Just one confusion, is this kind of description basic concept of concurrent
>> programming? How detail level we need to describe the effect?
>
>When I write changelogs for patches like this I usually describe, what
>is the potential race - e.g.
> CPU1 CPU2
> path1 path2
> check lock
> operation2
> unlock
> lock
> # check might not hold anymore
> operation1
> unlock
>
>and what is the effect of the race - e.g. a crash, data corruption,
>pointless attempt for operation1 which fails with user visible effect
>etc.
Hi, Michal, here is my attempt for an example. Hope this one looks good to
you.
For example, the potential race would be:
CPU1 CPU2
mem_cgroup_move_account split_huge_page_to_list
!list_empty
lock
!list_empty
list_del
unlock
lock
# !list_empty might not hold anymore
list_del_init
unlock
When this sequence happens, the list_del_init() in
mem_cgroup_move_account() would crash since the page is already been
removed by list_del in split_huge_page_to_list().
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-09 2:03 ` Wei Yang
@ 2020-01-09 8:34 ` Michal Hocko
0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2020-01-09 8:34 UTC (permalink / raw)
To: Wei Yang
Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
linux-kernel, yang.shi
On Thu 09-01-20 10:03:19, Wei Yang wrote:
> On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
[...]
> >Moreover, look at the code you are trying to fix. Sure extending the
> >locking seem straightforward but does it result in a correct code
> >though? See my question in the previous email. How do we know that the
> >page is actually enqued in a non-empty list?
>
> I may not get your point for the last sentence.
>
> The list_empty() doesn't check the queue is empty but check the list, here is
> the page, is not enqueued into any list. Is this your concern?
My bad. For some reason I have misread the code and thought this was
get_deferred_split_queue rather than page_deferred_list. Sorry about the
confusion.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-09 3:18 ` Wei Yang
@ 2020-01-09 8:36 ` Michal Hocko
2020-01-09 8:52 ` Wei Yang
0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2020-01-09 8:36 UTC (permalink / raw)
To: Wei Yang
Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
linux-kernel, yang.shi
On Thu 09-01-20 11:18:21, Wei Yang wrote:
> On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
> >On Wed 08-01-20 08:35:43, Wei Yang wrote:
> >> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
> >> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
> >> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
> >> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
> >> >> >> As all the other places, we grab the lock before manipulate the defer list.
> >> >> >> Current implementation may face a race condition.
> >> >> >
> >> >> >Please always make sure to describe the effect of the change. Why a racy
> >> >> >list_empty check matters?
> >> >> >
> >> >>
> >> >> Hmm... access the list without proper lock leads to many bad behaviors.
> >> >
> >> >My point is that the changelog should describe that bad behavior.
> >> >
> >> >> For example, if we grab the lock after checking list_empty, the page may
> >> >> already be removed from list in split_huge_page_list. And then list_del_init
> >> >> would trigger bug.
> >> >
> >> >And how does list_empty check under the lock guarantee that the page is
> >> >on the deferred list?
> >>
> >> Just one confusion, is this kind of description basic concept of concurrent
> >> programming? How detail level we need to describe the effect?
> >
> >When I write changelogs for patches like this I usually describe, what
> >is the potential race - e.g.
> > CPU1 CPU2
> > path1 path2
> > check lock
> > operation2
> > unlock
> > lock
> > # check might not hold anymore
> > operation1
> > unlock
> >
> >and what is the effect of the race - e.g. a crash, data corruption,
> >pointless attempt for operation1 which fails with user visible effect
> >etc.
>
> Hi, Michal, here is my attempt for an example. Hope this one looks good to
> you.
>
>
> For example, the potential race would be:
>
> CPU1 CPU2
> mem_cgroup_move_account split_huge_page_to_list
> !list_empty
> lock
> !list_empty
> list_del
> unlock
> lock
> # !list_empty might not hold anymore
> list_del_init
> unlock
>
> When this sequence happens, the list_del_init() in
> mem_cgroup_move_account() would crash since the page is already been
> removed by list_del in split_huge_page_to_list().
Yes this looks much more informative. I would just add that this will
crash if CONFIG_DEBUG_LIST.
Thanks!
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
2020-01-09 8:36 ` Michal Hocko
@ 2020-01-09 8:52 ` Wei Yang
0 siblings, 0 replies; 19+ messages in thread
From: Wei Yang @ 2020-01-09 8:52 UTC (permalink / raw)
To: Michal Hocko
Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
linux-mm, linux-kernel, yang.shi
On Thu, Jan 09, 2020 at 09:36:41AM +0100, Michal Hocko wrote:
>On Thu 09-01-20 11:18:21, Wei Yang wrote:
>> On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
>> >On Wed 08-01-20 08:35:43, Wei Yang wrote:
>> >> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>> >> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> >> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> >> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> >> >> Current implementation may face a race condition.
>> >> >> >
>> >> >> >Please always make sure to describe the effect of the change. Why a racy
>> >> >> >list_empty check matters?
>> >> >> >
>> >> >>
>> >> >> Hmm... access the list without proper lock leads to many bad behaviors.
>> >> >
>> >> >My point is that the changelog should describe that bad behavior.
>> >> >
>> >> >> For example, if we grab the lock after checking list_empty, the page may
>> >> >> already be removed from list in split_huge_page_list. And then list_del_init
>> >> >> would trigger bug.
>> >> >
>> >> >And how does list_empty check under the lock guarantee that the page is
>> >> >on the deferred list?
>> >>
>> >> Just one confusion, is this kind of description basic concept of concurrent
>> >> programming? How detail level we need to describe the effect?
>> >
>> >When I write changelogs for patches like this I usually describe, what
>> >is the potential race - e.g.
>> > CPU1 CPU2
>> > path1 path2
>> > check lock
>> > operation2
>> > unlock
>> > lock
>> > # check might not hold anymore
>> > operation1
>> > unlock
>> >
>> >and what is the effect of the race - e.g. a crash, data corruption,
>> >pointless attempt for operation1 which fails with user visible effect
>> >etc.
>>
>> Hi, Michal, here is my attempt for an example. Hope this one looks good to
>> you.
>>
>>
>> For example, the potential race would be:
>>
>> CPU1 CPU2
>> mem_cgroup_move_account split_huge_page_to_list
>> !list_empty
>> lock
>> !list_empty
>> list_del
>> unlock
>> lock
>> # !list_empty might not hold anymore
>> list_del_init
>> unlock
>>
>> When this sequence happens, the list_del_init() in
>> mem_cgroup_move_account() would crash since the page is already been
>> removed by list_del in split_huge_page_to_list().
>
>Yes this looks much more informative. I would just add that this will
>crash if CONFIG_DEBUG_LIST.
>
>Thanks!
Glad you like it~
Will prepare v2 with your suggestion :-)
>--
>Michal Hocko
>SUSE Labs
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2020-01-09 8:52 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
2020-01-03 19:29 ` David Rientjes
2020-01-03 23:39 ` Wei Yang
2020-01-04 0:44 ` David Rientjes
2020-01-06 1:20 ` Wei Yang
2020-01-06 10:23 ` Michal Hocko
2020-01-07 1:22 ` Wei Yang
2020-01-07 8:38 ` Michal Hocko
2020-01-08 0:35 ` Wei Yang
2020-01-08 9:40 ` Michal Hocko
2020-01-09 2:03 ` Wei Yang
2020-01-09 8:34 ` Michal Hocko
2020-01-09 3:18 ` Wei Yang
2020-01-09 8:36 ` Michal Hocko
2020-01-09 8:52 ` Wei Yang
2020-01-06 16:18 ` Alexander Duyck
2020-01-07 1:26 ` Wei Yang
2020-01-07 2:07 ` David Rientjes
2020-01-07 2:33 ` Wei Yang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).