LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] Revert "kmemleak: allow to coexist with fault injection"
@ 2019-07-16 17:50 Yang Shi
  2019-07-16 18:23 ` Qian Cai
  2019-07-17  5:07 ` Michal Hocko
  0 siblings, 2 replies; 11+ messages in thread
From: Yang Shi @ 2019-07-16 17:50 UTC (permalink / raw)
  To: catalin.marinas, mhocko, dvyukov, rientjes, willy, cai, akpm
  Cc: yang.shi, linux-mm, linux-kernel

When running ltp's oom test with kmemleak enabled, the below warning was
triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
passed in:

WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
...
 kmemleak_alloc+0x4e/0xb0
 kmem_cache_alloc+0x2a7/0x3e0
 ? __kmalloc+0x1d6/0x470
 ? ___might_sleep+0x9c/0x170
 ? mempool_alloc+0x2b0/0x2b0
 mempool_alloc_slab+0x2d/0x40
 mempool_alloc+0x118/0x2b0
 ? __kasan_check_read+0x11/0x20
 ? mempool_resize+0x390/0x390
 ? lock_downgrade+0x3c0/0x3c0
 bio_alloc_bioset+0x19d/0x350
 ? __swap_duplicate+0x161/0x240
 ? bvec_alloc+0x1b0/0x1b0
 ? do_raw_spin_unlock+0xa8/0x140
 ? _raw_spin_unlock+0x27/0x40
 get_swap_bio+0x80/0x230
 ? __x64_sys_madvise+0x50/0x50
 ? end_swap_bio_read+0x310/0x310
 ? __kasan_check_read+0x11/0x20
 ? check_chain_key+0x24e/0x300
 ? bdev_write_page+0x55/0x130
 __swap_writepage+0x5ff/0xb20

The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak has
__GFP_NOFAIL set all the time due to commit
d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
with fault injection").  But, it doesn't make any sense to have
__GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same time.

According to the discussion on the mailing list, the commit should be
reverted for short term solution.  Catalin Marinas would follow up with a better
solution for longer term.

The failure rate of kmemleak metadata allocation may increase in some
circumstances, but this should be expected side effect.

Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 mm/kmemleak.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 9dd581d..884a5e3 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -114,7 +114,7 @@
 /* GFP bitmask for kmemleak internal allocations */
 #define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
 				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
-				 __GFP_NOWARN | __GFP_NOFAIL)
+				 __GFP_NOWARN)
 
 /* scanning area inside a memory block */
 struct kmemleak_scan_area {
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 17:50 [PATCH] Revert "kmemleak: allow to coexist with fault injection" Yang Shi
@ 2019-07-16 18:23 ` Qian Cai
  2019-07-16 19:01   ` Yang Shi
  2019-07-27 10:13   ` Catalin Marinas
  2019-07-17  5:07 ` Michal Hocko
  1 sibling, 2 replies; 11+ messages in thread
From: Qian Cai @ 2019-07-16 18:23 UTC (permalink / raw)
  To: Yang Shi, catalin.marinas, mhocko, dvyukov, rientjes, willy, akpm
  Cc: linux-mm, linux-kernel

On Wed, 2019-07-17 at 01:50 +0800, Yang Shi wrote:
> When running ltp's oom test with kmemleak enabled, the below warning was
> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> passed in:
> 
> WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608
> __alloc_pages_nodemask+0x1c31/0x1d50
> Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs
> virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring
> virtio libata
> CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-
> g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
> ...
>  kmemleak_alloc+0x4e/0xb0
>  kmem_cache_alloc+0x2a7/0x3e0
>  ? __kmalloc+0x1d6/0x470
>  ? ___might_sleep+0x9c/0x170
>  ? mempool_alloc+0x2b0/0x2b0
>  mempool_alloc_slab+0x2d/0x40
>  mempool_alloc+0x118/0x2b0
>  ? __kasan_check_read+0x11/0x20
>  ? mempool_resize+0x390/0x390
>  ? lock_downgrade+0x3c0/0x3c0
>  bio_alloc_bioset+0x19d/0x350
>  ? __swap_duplicate+0x161/0x240
>  ? bvec_alloc+0x1b0/0x1b0
>  ? do_raw_spin_unlock+0xa8/0x140
>  ? _raw_spin_unlock+0x27/0x40
>  get_swap_bio+0x80/0x230
>  ? __x64_sys_madvise+0x50/0x50
>  ? end_swap_bio_read+0x310/0x310
>  ? __kasan_check_read+0x11/0x20
>  ? check_chain_key+0x24e/0x300
>  ? bdev_write_page+0x55/0x130
>  __swap_writepage+0x5ff/0xb20
> 
> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak has
> __GFP_NOFAIL set all the time due to commit
> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> with fault injection").  But, it doesn't make any sense to have
> __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same time.
> 
> According to the discussion on the mailing list, the commit should be
> reverted for short term solution.  Catalin Marinas would follow up with a
> better
> solution for longer term.
> 
> The failure rate of kmemleak metadata allocation may increase in some
> circumstances, but this should be expected side effect.

As mentioned in anther thread, the situation for kmemleak under memory pressure
has already been unhealthy. I don't feel comfortable to make it even worse by
reverting this commit alone. This could potentially make kmemleak kill itself
easier and miss some more real memory leak later.

To make it really a short-term solution before the reverting, I think someone
needs to follow up with the mempool solution with tunable pool size mentioned
in,

https://lore.kernel.org/linux-mm/20190328145917.GC10283@arrakis.emea.arm.com/

I personally not very confident that Catalin will find some time soon to
implement embedding kmemleak metadata into the slab. Even he or someone does
eventually, it probably need quite some time to test and edge out many of corner
cases that kmemleak could have by its natural.

> 
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Qian Cai <cai@lca.pw>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> ---
>  mm/kmemleak.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 9dd581d..884a5e3 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -114,7 +114,7 @@
>  /* GFP bitmask for kmemleak internal allocations */
>  #define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) |
> \
>  				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
> -				 __GFP_NOWARN | __GFP_NOFAIL)
> +				 __GFP_NOWARN)
>  
>  /* scanning area inside a memory block */
>  struct kmemleak_scan_area {

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 18:23 ` Qian Cai
@ 2019-07-16 19:01   ` Yang Shi
  2019-07-16 19:21     ` Qian Cai
  2019-07-27 10:13   ` Catalin Marinas
  1 sibling, 1 reply; 11+ messages in thread
From: Yang Shi @ 2019-07-16 19:01 UTC (permalink / raw)
  To: Qian Cai, catalin.marinas, mhocko, dvyukov, rientjes, willy, akpm
  Cc: linux-mm, linux-kernel



On 7/16/19 11:23 AM, Qian Cai wrote:
> On Wed, 2019-07-17 at 01:50 +0800, Yang Shi wrote:
>> When running ltp's oom test with kmemleak enabled, the below warning was
>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>> passed in:
>>
>> WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608
>> __alloc_pages_nodemask+0x1c31/0x1d50
>> Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs
>> virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring
>> virtio libata
>> CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-
>> g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>> RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
>> ...
>>   kmemleak_alloc+0x4e/0xb0
>>   kmem_cache_alloc+0x2a7/0x3e0
>>   ? __kmalloc+0x1d6/0x470
>>   ? ___might_sleep+0x9c/0x170
>>   ? mempool_alloc+0x2b0/0x2b0
>>   mempool_alloc_slab+0x2d/0x40
>>   mempool_alloc+0x118/0x2b0
>>   ? __kasan_check_read+0x11/0x20
>>   ? mempool_resize+0x390/0x390
>>   ? lock_downgrade+0x3c0/0x3c0
>>   bio_alloc_bioset+0x19d/0x350
>>   ? __swap_duplicate+0x161/0x240
>>   ? bvec_alloc+0x1b0/0x1b0
>>   ? do_raw_spin_unlock+0xa8/0x140
>>   ? _raw_spin_unlock+0x27/0x40
>>   get_swap_bio+0x80/0x230
>>   ? __x64_sys_madvise+0x50/0x50
>>   ? end_swap_bio_read+0x310/0x310
>>   ? __kasan_check_read+0x11/0x20
>>   ? check_chain_key+0x24e/0x300
>>   ? bdev_write_page+0x55/0x130
>>   __swap_writepage+0x5ff/0xb20
>>
>> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak has
>> __GFP_NOFAIL set all the time due to commit
>> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
>> with fault injection").  But, it doesn't make any sense to have
>> __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same time.
>>
>> According to the discussion on the mailing list, the commit should be
>> reverted for short term solution.  Catalin Marinas would follow up with a
>> better
>> solution for longer term.
>>
>> The failure rate of kmemleak metadata allocation may increase in some
>> circumstances, but this should be expected side effect.
> As mentioned in anther thread, the situation for kmemleak under memory pressure
> has already been unhealthy. I don't feel comfortable to make it even worse by
> reverting this commit alone. This could potentially make kmemleak kill itself
> easier and miss some more real memory leak later.
>
> To make it really a short-term solution before the reverting, I think someone
> needs to follow up with the mempool solution with tunable pool size mentioned
> in,
>
> https://lore.kernel.org/linux-mm/20190328145917.GC10283@arrakis.emea.arm.com/
>
> I personally not very confident that Catalin will find some time soon to
> implement embedding kmemleak metadata into the slab. Even he or someone does
> eventually, it probably need quite some time to test and edge out many of corner
> cases that kmemleak could have by its natural.

Thanks for sharing some background. I didn't notice this topic had been 
discussed. I'm not sure if this revert would make things worse since I'm 
supposed real memory leak would be detected sooner before oom kicks in, 
and kmemleak is already broken with __GFP_NOFAIL.

It seems everyone agree __GFP_NPFAIL should be removed? Anyway, I would 
like leave the decision to Catalin.

>
>> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Dmitry Vyukov <dvyukov@google.com>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Qian Cai <cai@lca.pw>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>>   mm/kmemleak.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>> index 9dd581d..884a5e3 100644
>> --- a/mm/kmemleak.c
>> +++ b/mm/kmemleak.c
>> @@ -114,7 +114,7 @@
>>   /* GFP bitmask for kmemleak internal allocations */
>>   #define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) |
>> \
>>   				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
>> -				 __GFP_NOWARN | __GFP_NOFAIL)
>> +				 __GFP_NOWARN)
>>   
>>   /* scanning area inside a memory block */
>>   struct kmemleak_scan_area {


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 19:01   ` Yang Shi
@ 2019-07-16 19:21     ` Qian Cai
  2019-07-16 20:07       ` Michal Hocko
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-07-16 19:21 UTC (permalink / raw)
  To: Yang Shi, catalin.marinas, mhocko, dvyukov, rientjes, willy, akpm
  Cc: linux-mm, linux-kernel

On Tue, 2019-07-16 at 12:01 -0700, Yang Shi wrote:
> 
> On 7/16/19 11:23 AM, Qian Cai wrote:
> > On Wed, 2019-07-17 at 01:50 +0800, Yang Shi wrote:
> > > When running ltp's oom test with kmemleak enabled, the below warning was
> > > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> > > passed in:
> > > 
> > > WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608
> > > __alloc_pages_nodemask+0x1c31/0x1d50
> > > Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs
> > > virtio_net net_failover virtio_blk failover ata_generic virtio_pci
> > > virtio_ring
> > > virtio libata
> > > CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-
> > > g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > > RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
> > > ...
> > >   kmemleak_alloc+0x4e/0xb0
> > >   kmem_cache_alloc+0x2a7/0x3e0
> > >   ? __kmalloc+0x1d6/0x470
> > >   ? ___might_sleep+0x9c/0x170
> > >   ? mempool_alloc+0x2b0/0x2b0
> > >   mempool_alloc_slab+0x2d/0x40
> > >   mempool_alloc+0x118/0x2b0
> > >   ? __kasan_check_read+0x11/0x20
> > >   ? mempool_resize+0x390/0x390
> > >   ? lock_downgrade+0x3c0/0x3c0
> > >   bio_alloc_bioset+0x19d/0x350
> > >   ? __swap_duplicate+0x161/0x240
> > >   ? bvec_alloc+0x1b0/0x1b0
> > >   ? do_raw_spin_unlock+0xa8/0x140
> > >   ? _raw_spin_unlock+0x27/0x40
> > >   get_swap_bio+0x80/0x230
> > >   ? __x64_sys_madvise+0x50/0x50
> > >   ? end_swap_bio_read+0x310/0x310
> > >   ? __kasan_check_read+0x11/0x20
> > >   ? check_chain_key+0x24e/0x300
> > >   ? bdev_write_page+0x55/0x130
> > >   __swap_writepage+0x5ff/0xb20
> > > 
> > > The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak has
> > > __GFP_NOFAIL set all the time due to commit
> > > d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> > > with fault injection").  But, it doesn't make any sense to have
> > > __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same time.
> > > 
> > > According to the discussion on the mailing list, the commit should be
> > > reverted for short term solution.  Catalin Marinas would follow up with a
> > > better
> > > solution for longer term.
> > > 
> > > The failure rate of kmemleak metadata allocation may increase in some
> > > circumstances, but this should be expected side effect.
> > 
> > As mentioned in anther thread, the situation for kmemleak under memory
> > pressure
> > has already been unhealthy. I don't feel comfortable to make it even worse
> > by
> > reverting this commit alone. This could potentially make kmemleak kill
> > itself
> > easier and miss some more real memory leak later.
> > 
> > To make it really a short-term solution before the reverting, I think
> > someone
> > needs to follow up with the mempool solution with tunable pool size
> > mentioned
> > in,
> > 
> > https://lore.kernel.org/linux-mm/20190328145917.GC10283@arrakis.emea.arm.com
> > /
> > 
> > I personally not very confident that Catalin will find some time soon to
> > implement embedding kmemleak metadata into the slab. Even he or someone does
> > eventually, it probably need quite some time to test and edge out many of
> > corner
> > cases that kmemleak could have by its natural.
> 
> Thanks for sharing some background. I didn't notice this topic had been 
> discussed. I'm not sure if this revert would make things worse since I'm 
> supposed real memory leak would be detected sooner before oom kicks in, 
> and kmemleak is already broken with __GFP_NOFAIL.

Well, people could inject some memory pressure at the middle of a test run. OOM
does not necessarily mean kmemleak would always be disabled, as it sometimes
could survive if the memory is recovering fast enough.

Thanks to this commit, there are allocation with __GFP_DIRECT_RECLAIM that
succeeded would keep trying with __GFP_NOFAIL for kmemleak tracking object
allocations. Otherwise, one kmemleak object allocation failure would kill the
whole kmemleak.

> 
> It seems everyone agree __GFP_NPFAIL should be removed? Anyway, I would 
> like leave the decision to Catalin.
> 
> > 
> > > Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Michal Hocko <mhocko@suse.com>
> > > Cc: Dmitry Vyukov <dvyukov@google.com>
> > > Cc: David Rientjes <rientjes@google.com>
> > > Cc: Matthew Wilcox <willy@infradead.org>
> > > Cc: Qian Cai <cai@lca.pw>
> > > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> > > ---
> > >   mm/kmemleak.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > > index 9dd581d..884a5e3 100644
> > > --- a/mm/kmemleak.c
> > > +++ b/mm/kmemleak.c
> > > @@ -114,7 +114,7 @@
> > >   /* GFP bitmask for kmemleak internal allocations */
> > >   #define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL |
> > > GFP_ATOMIC)) |
> > > \
> > >   				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
> > > -				 __GFP_NOWARN | __GFP_NOFAIL)
> > > +				 __GFP_NOWARN)
> > >   
> > >   /* scanning area inside a memory block */
> > >   struct kmemleak_scan_area {
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 19:21     ` Qian Cai
@ 2019-07-16 20:07       ` Michal Hocko
  2019-07-16 20:28         ` Qian Cai
  0 siblings, 1 reply; 11+ messages in thread
From: Michal Hocko @ 2019-07-16 20:07 UTC (permalink / raw)
  To: Qian Cai
  Cc: Yang Shi, catalin.marinas, dvyukov, rientjes, willy, akpm,
	linux-mm, linux-kernel

On Tue 16-07-19 15:21:17, Qian Cai wrote:
[...]
> Thanks to this commit, there are allocation with __GFP_DIRECT_RECLAIM that
> succeeded would keep trying with __GFP_NOFAIL for kmemleak tracking object
> allocations.

Well, not really. Because low order allocations with
__GFP_DIRECT_RECLAIM basically never fail (they keep retrying) even
without GFP_NOFAIL because that flag is actually to guarantee no
failure. And for high order allocations the nofail mode is actively
harmful. It completely changes the behavior of a system. A light costly
order workload could put the system on knees and completely change the
behavior. I am not really convinced this is a good behavior of a
debugging feature TBH.

> Otherwise, one kmemleak object allocation failure would kill the
> whole kmemleak.

Which is not great but quite likely a better than an unpredictable MM
behavior caused by NOFAIL storms. Really, this NOFAIL patch is a
completely broken behavior. There shouldn't be much discussion about
reverting it. I would even argue it shouldn't have been merged in the
first place. It doesn't have any acks nor reviewed-bys while it abuses
__GFP_NOFAIL which is generally discouraged to be used.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 20:07       ` Michal Hocko
@ 2019-07-16 20:28         ` Qian Cai
  2019-07-17  5:35           ` Michal Hocko
  0 siblings, 1 reply; 11+ messages in thread
From: Qian Cai @ 2019-07-16 20:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Yang Shi, catalin.marinas, dvyukov, rientjes, willy, akpm,
	linux-mm, linux-kernel

On Tue, 2019-07-16 at 22:07 +0200, Michal Hocko wrote:
> On Tue 16-07-19 15:21:17, Qian Cai wrote:
> [...]
> > Thanks to this commit, there are allocation with __GFP_DIRECT_RECLAIM that
> > succeeded would keep trying with __GFP_NOFAIL for kmemleak tracking object
> > allocations.
> 
> Well, not really. Because low order allocations with
> __GFP_DIRECT_RECLAIM basically never fail (they keep retrying) even
> without GFP_NOFAIL because that flag is actually to guarantee no
> failure. And for high order allocations the nofail mode is actively
> harmful. It completely changes the behavior of a system. A light costly
> order workload could put the system on knees and completely change the
> behavior. I am not really convinced this is a good behavior of a
> debugging feature TBH.

While I agree your general observation about GFP_NOFAIL, I am afraid the
discussion here is about "struct kmemleak_object" slab cache from a single call
site create_object(). 

> 
> > Otherwise, one kmemleak object allocation failure would kill the
> > whole kmemleak.
> 
> Which is not great but quite likely a better than an unpredictable MM
> behavior caused by NOFAIL storms. Really, this NOFAIL patch is a
> completely broken behavior. There shouldn't be much discussion about
> reverting it. I would even argue it shouldn't have been merged in the
> first place. It doesn't have any acks nor reviewed-bys while it abuses
> __GFP_NOFAIL which is generally discouraged to be used.

Again, it seems you are talking about GFP_NOFAIL in general. I don't really see
much unpredictable MM behavior which would disrupt the testing or generate
false-positive bug reports when "struct kmemleak_object" allocations with
GFP_NOFAIL apart from some warnings. All I see is that kmemleak stay alive help
find real memory leaks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 17:50 [PATCH] Revert "kmemleak: allow to coexist with fault injection" Yang Shi
  2019-07-16 18:23 ` Qian Cai
@ 2019-07-17  5:07 ` Michal Hocko
  2019-07-17  5:09   ` Michal Hocko
  1 sibling, 1 reply; 11+ messages in thread
From: Michal Hocko @ 2019-07-17  5:07 UTC (permalink / raw)
  To: Yang Shi
  Cc: catalin.marinas, dvyukov, rientjes, willy, cai, akpm, linux-mm,
	linux-kernel

On Wed 17-07-19 01:50:31, Yang Shi wrote:
> When running ltp's oom test with kmemleak enabled, the below warning was
> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> passed in:
> 
> WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
> Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
> CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
> ...
>  kmemleak_alloc+0x4e/0xb0
>  kmem_cache_alloc+0x2a7/0x3e0
>  ? __kmalloc+0x1d6/0x470
>  ? ___might_sleep+0x9c/0x170
>  ? mempool_alloc+0x2b0/0x2b0
>  mempool_alloc_slab+0x2d/0x40
>  mempool_alloc+0x118/0x2b0
>  ? __kasan_check_read+0x11/0x20
>  ? mempool_resize+0x390/0x390
>  ? lock_downgrade+0x3c0/0x3c0
>  bio_alloc_bioset+0x19d/0x350
>  ? __swap_duplicate+0x161/0x240
>  ? bvec_alloc+0x1b0/0x1b0
>  ? do_raw_spin_unlock+0xa8/0x140
>  ? _raw_spin_unlock+0x27/0x40
>  get_swap_bio+0x80/0x230
>  ? __x64_sys_madvise+0x50/0x50
>  ? end_swap_bio_read+0x310/0x310
>  ? __kasan_check_read+0x11/0x20
>  ? check_chain_key+0x24e/0x300
>  ? bdev_write_page+0x55/0x130
>  __swap_writepage+0x5ff/0xb20
> 
> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak has
> __GFP_NOFAIL set all the time due to commit
> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> with fault injection").  But, it doesn't make any sense to have
> __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same time.
> 
> According to the discussion on the mailing list, the commit should be
> reverted for short term solution.  Catalin Marinas would follow up with a better
> solution for longer term.
> 
> The failure rate of kmemleak metadata allocation may increase in some
> circumstances, but this should be expected side effect.
> 
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Qian Cai <cai@lca.pw>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>

I forgot
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/kmemleak.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 9dd581d..884a5e3 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -114,7 +114,7 @@
>  /* GFP bitmask for kmemleak internal allocations */
>  #define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
>  				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
> -				 __GFP_NOWARN | __GFP_NOFAIL)
> +				 __GFP_NOWARN)
>  
>  /* scanning area inside a memory block */
>  struct kmemleak_scan_area {
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-17  5:07 ` Michal Hocko
@ 2019-07-17  5:09   ` Michal Hocko
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2019-07-17  5:09 UTC (permalink / raw)
  To: Yang Shi
  Cc: catalin.marinas, dvyukov, rientjes, willy, cai, akpm, linux-mm,
	linux-kernel

On Wed 17-07-19 07:07:11, Michal Hocko wrote:
> On Wed 17-07-19 01:50:31, Yang Shi wrote:
> > When running ltp's oom test with kmemleak enabled, the below warning was
> > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> > passed in:
> > 
> > WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
> > Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
> > CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
> > ...
> >  kmemleak_alloc+0x4e/0xb0
> >  kmem_cache_alloc+0x2a7/0x3e0
> >  ? __kmalloc+0x1d6/0x470
> >  ? ___might_sleep+0x9c/0x170
> >  ? mempool_alloc+0x2b0/0x2b0
> >  mempool_alloc_slab+0x2d/0x40
> >  mempool_alloc+0x118/0x2b0
> >  ? __kasan_check_read+0x11/0x20
> >  ? mempool_resize+0x390/0x390
> >  ? lock_downgrade+0x3c0/0x3c0
> >  bio_alloc_bioset+0x19d/0x350
> >  ? __swap_duplicate+0x161/0x240
> >  ? bvec_alloc+0x1b0/0x1b0
> >  ? do_raw_spin_unlock+0xa8/0x140
> >  ? _raw_spin_unlock+0x27/0x40
> >  get_swap_bio+0x80/0x230
> >  ? __x64_sys_madvise+0x50/0x50
> >  ? end_swap_bio_read+0x310/0x310
> >  ? __kasan_check_read+0x11/0x20
> >  ? check_chain_key+0x24e/0x300
> >  ? bdev_write_page+0x55/0x130
> >  __swap_writepage+0x5ff/0xb20
> > 
> > The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak has
> > __GFP_NOFAIL set all the time due to commit
> > d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> > with fault injection").  But, it doesn't make any sense to have
> > __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same time.
> > 
> > According to the discussion on the mailing list, the commit should be
> > reverted for short term solution.  Catalin Marinas would follow up with a better
> > solution for longer term.
> > 
> > The failure rate of kmemleak metadata allocation may increase in some
> > circumstances, but this should be expected side effect.
> > 
> > Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Matthew Wilcox <willy@infradead.org>
> > Cc: Qian Cai <cai@lca.pw>
> > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> 
> I forgot
> Acked-by: Michal Hocko <mhocko@suse.com>

Btw. If this leads to early allocation failures too often then
dropping __GFP_NORETRY should help for now until a better solution is
available. It could lead to OOM killer invocation which is probably
the reason why it has been added but probably better than completely
disabling kmemleak altogether. Up to Catalin I guess.
 
> > ---
> >  mm/kmemleak.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > index 9dd581d..884a5e3 100644
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -114,7 +114,7 @@
> >  /* GFP bitmask for kmemleak internal allocations */
> >  #define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
> >  				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
> > -				 __GFP_NOWARN | __GFP_NOFAIL)
> > +				 __GFP_NOWARN)
> >  
> >  /* scanning area inside a memory block */
> >  struct kmemleak_scan_area {
> > -- 
> > 1.8.3.1
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 20:28         ` Qian Cai
@ 2019-07-17  5:35           ` Michal Hocko
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2019-07-17  5:35 UTC (permalink / raw)
  To: Qian Cai
  Cc: Yang Shi, catalin.marinas, dvyukov, rientjes, willy, akpm,
	linux-mm, linux-kernel

On Tue 16-07-19 16:28:21, Qian Cai wrote:
> On Tue, 2019-07-16 at 22:07 +0200, Michal Hocko wrote:
> > On Tue 16-07-19 15:21:17, Qian Cai wrote:
> > [...]
> > > Thanks to this commit, there are allocation with __GFP_DIRECT_RECLAIM that
> > > succeeded would keep trying with __GFP_NOFAIL for kmemleak tracking object
> > > allocations.
> > 
> > Well, not really. Because low order allocations with
> > __GFP_DIRECT_RECLAIM basically never fail (they keep retrying) even
> > without GFP_NOFAIL because that flag is actually to guarantee no
> > failure. And for high order allocations the nofail mode is actively
> > harmful. It completely changes the behavior of a system. A light costly
> > order workload could put the system on knees and completely change the
> > behavior. I am not really convinced this is a good behavior of a
> > debugging feature TBH.
> 
> While I agree your general observation about GFP_NOFAIL, I am afraid the
> discussion here is about "struct kmemleak_object" slab cache from a single call
> site create_object(). 

OK, this makes it less harmfull because the order aspect doesn't really
apply here. But still stretches the NOFAIL semantic a lot. The kmemleak
essentially asks for NORETRY | NOFAIL which means no oom but retry for
ever semantic for sleeping allocations. This can still lead to
unexpected side effects. Just consider a call site that holds locks and
now cannot make any forward progress without anybody else hitting the
oom killer for example. As noted in other email, I would simply drop
NORETRY flag as well and live with the fact that the oom killer can be
invoked. It still wouldn't solve the NOWAIT contexts but those need a
proper solution anyway.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-16 18:23 ` Qian Cai
  2019-07-16 19:01   ` Yang Shi
@ 2019-07-27 10:13   ` Catalin Marinas
  2019-07-27 11:48     ` Qian Cai
  1 sibling, 1 reply; 11+ messages in thread
From: Catalin Marinas @ 2019-07-27 10:13 UTC (permalink / raw)
  To: Qian Cai
  Cc: Yang Shi, mhocko, dvyukov, rientjes, willy, akpm, linux-mm, linux-kernel

On Tue, Jul 16, 2019 at 02:23:30PM -0400, Qian Cai wrote:
> As mentioned in anther thread, the situation for kmemleak under memory pressure
> has already been unhealthy. I don't feel comfortable to make it even worse by
> reverting this commit alone. This could potentially make kmemleak kill itself
> easier and miss some more real memory leak later.
> 
> To make it really a short-term solution before the reverting, I think someone
> needs to follow up with the mempool solution with tunable pool size mentioned
> in,
> 
> https://lore.kernel.org/linux-mm/20190328145917.GC10283@arrakis.emea.arm.com/

Before my little bit of spare time disappears, let's add the tunable to
the mempool size so that I can repost the patch. Are you ok with a
kernel cmdline parameter or you'd rather change it at runtime? The
latter implies a minor extension to mempool to allow it to refill on
demand. I'd personally go for the former.

-- 
Catalin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"
  2019-07-27 10:13   ` Catalin Marinas
@ 2019-07-27 11:48     ` Qian Cai
  0 siblings, 0 replies; 11+ messages in thread
From: Qian Cai @ 2019-07-27 11:48 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Yang Shi, mhocko, Dmitry Vyukov, rientjes, willy, Andrew Morton,
	linux-mm, linux-kernel



> On Jul 27, 2019, at 6:13 AM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> On Tue, Jul 16, 2019 at 02:23:30PM -0400, Qian Cai wrote:
>> As mentioned in anther thread, the situation for kmemleak under memory pressure
>> has already been unhealthy. I don't feel comfortable to make it even worse by
>> reverting this commit alone. This could potentially make kmemleak kill itself
>> easier and miss some more real memory leak later.
>> 
>> To make it really a short-term solution before the reverting, I think someone
>> needs to follow up with the mempool solution with tunable pool size mentioned
>> in,
>> 
>> https://lore.kernel.org/linux-mm/20190328145917.GC10283@arrakis.emea.arm.com/
> 
> Before my little bit of spare time disappears, let's add the tunable to
> the mempool size so that I can repost the patch. Are you ok with a
> kernel cmdline parameter or you'd rather change it at runtime? The
> latter implies a minor extension to mempool to allow it to refill on
> demand. I'd personally go for the former.

Agreed. The cmdline is good enough.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-07-27 11:51 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-16 17:50 [PATCH] Revert "kmemleak: allow to coexist with fault injection" Yang Shi
2019-07-16 18:23 ` Qian Cai
2019-07-16 19:01   ` Yang Shi
2019-07-16 19:21     ` Qian Cai
2019-07-16 20:07       ` Michal Hocko
2019-07-16 20:28         ` Qian Cai
2019-07-17  5:35           ` Michal Hocko
2019-07-27 10:13   ` Catalin Marinas
2019-07-27 11:48     ` Qian Cai
2019-07-17  5:07 ` Michal Hocko
2019-07-17  5:09   ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).