* deadlock during writeback when using f2fs filesystem
@ 2018-06-01  9:32 Sahitya Tummala
  2018-06-01 10:26 ` Michal Hocko
  0 siblings, 1 reply; 4+ messages in thread
From: Sahitya Tummala @ 2018-06-01  9:32 UTC (permalink / raw)
  To: linux-f2fs-devel, linux-fsdevel, linux-kernel, Jaegeuk Kim, Chao Yu
  Cc: vinmenon, stummala

Hi,

We are observing a deadlock scenario during FS writeback under low-memory
conditions with the F2FS filesystem.

Here is the call stack for this scenario:

shrink_inactive_list()
shrink_node_memcg.isra.74()
shrink_node()
shrink_zones(inline)
do_try_to_free_pages(inline)
try_to_free_pages()
__perform_reclaim(inline)
__alloc_pages_direct_reclaim(inline)
__alloc_pages_slowpath(inline)
no_zone()
__alloc_pages(inline)
__alloc_pages_node(inline)
alloc_pages_node(inline)
__page_cache_alloc(inline)
pagecache_get_page()
find_or_create_page(inline)
grab_cache_page(inline)
f2fs_grab_cache_page(inline)
__get_node_page.part.32()
__get_node_page(inline)
get_node_page()
update_inode_page()
f2fs_write_inode()
write_inode(inline)
__writeback_single_inode()
writeback_sb_inodes()
__writeback_inodes_wb()
wb_writeback()
wb_do_writeback(inline)
wb_workfn()

The writeback thread enters the direct reclaim path because of low memory and
gets stuck in shrink_inactive_list(), which is in turn waiting for writeback of
the dirty pages on the inactive list.

Do you think we can use GFP_NOWAIT for the node mapping's gfp_mask, so that we
avoid the direct reclaim path in the writeback context? Since allocations may now
fail with this flag, do you see any risk or issue in using it with respect to
F2FS and writeback? I would appreciate your suggestions on this.

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 89c838b..d3daf3b 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -316,7 +316,7 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
 make_now:
        if (ino == F2FS_NODE_INO(sbi)) {
                inode->i_mapping->a_ops = &f2fs_node_aops;
-               mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
+               mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_NODE_MAPPING);
        } else if (ino == F2FS_META_INO(sbi)) {
                inode->i_mapping->a_ops = &f2fs_meta_aops;
                mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
index 58aecb6..bb985cd 100644
--- a/include/linux/f2fs_fs.h
+++ b/include/linux/f2fs_fs.h
@@ -47,6 +47,7 @@
 /* This flag is used by node and meta inodes, and by recovery */
 #define GFP_F2FS_ZERO          (GFP_NOFS | __GFP_ZERO)
 #define GFP_F2FS_HIGH_ZERO     (GFP_NOFS | __GFP_ZERO | __GFP_HIGHMEM)
+#define GFP_F2FS_NODE_MAPPING  (GFP_NOWAIT | __GFP_IO | __GFP_ZERO)
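
A minimal userspace sketch of what the proposed mask changes, assuming the
flag composition found in include/linux/gfp.h for kernels of this era
(GFP_NOFS = __GFP_RECLAIM | __GFP_IO, GFP_NOWAIT = __GFP_KSWAPD_RECLAIM).
The MODEL_* names and bit positions are made up for illustration and are not
the kernel's values. It only shows that the new mask drops the direct-reclaim
bit while still allowing kswapd to be woken:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Placeholder bit positions for this sketch only. The real values are
 * defined in include/linux/gfp.h and differ between kernel versions.
 */
#define MODEL_GFP_IO              (1u << 0)
#define MODEL_GFP_FS              (1u << 1)
#define MODEL_GFP_DIRECT_RECLAIM  (1u << 2)
#define MODEL_GFP_KSWAPD_RECLAIM  (1u << 3)
#define MODEL_GFP_ZERO            (1u << 4)

/* Assumed composition, mirroring the kernel's definitions. */
#define MODEL_GFP_RECLAIM \
	(MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_GFP_NOFS    (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_NOWAIT  (MODEL_GFP_KSWAPD_RECLAIM)

/* The existing and the proposed node-mapping masks from the diff. */
#define MODEL_GFP_F2FS_ZERO \
	(MODEL_GFP_NOFS | MODEL_GFP_ZERO)
#define MODEL_GFP_F2FS_NODE_MAPPING \
	(MODEL_GFP_NOWAIT | MODEL_GFP_IO | MODEL_GFP_ZERO)

/* True if an allocation with this mask may enter direct reclaim. */
static bool model_may_direct_reclaim(unsigned int gfp)
{
	return (gfp & MODEL_GFP_DIRECT_RECLAIM) != 0;
}

/* True if an allocation with this mask may wake kswapd. */
static bool model_may_wake_kswapd(unsigned int gfp)
{
	return (gfp & MODEL_GFP_KSWAPD_RECLAIM) != 0;
}

/* Usage example: compare the two masks. */
void model_check_proposal(void)
{
	assert(model_may_direct_reclaim(MODEL_GFP_F2FS_ZERO));
	assert(!model_may_direct_reclaim(MODEL_GFP_F2FS_NODE_MAPPING));
	assert(model_may_wake_kswapd(MODEL_GFP_F2FS_NODE_MAPPING));
}
```

With the direct-reclaim bit cleared, an allocation that cannot be satisfied
immediately fails instead of entering try_to_free_pages(), so callers on the
node page path would have to handle that failure.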

Thanks,
Sahitya.
-- 
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


* Re: deadlock during writeback when using f2fs filesystem
  2018-06-01  9:32 deadlock during writeback when using f2fs filesystem Sahitya Tummala
@ 2018-06-01 10:26 ` Michal Hocko
  2018-06-01 11:20   ` Sahitya Tummala
  0 siblings, 1 reply; 4+ messages in thread
From: Michal Hocko @ 2018-06-01 10:26 UTC (permalink / raw)
  To: Sahitya Tummala
  Cc: linux-f2fs-devel, linux-fsdevel, linux-kernel, Jaegeuk Kim,
	Chao Yu, vinmenon

On Fri 01-06-18 15:02:35, Sahitya Tummala wrote:
> Hi,
> 
> We are observing a deadlock scenario during FS writeback under low-memory
> conditions with the F2FS filesystem.
> 
> Here is the call stack for this scenario:
> 
> shrink_inactive_list()
> shrink_node_memcg.isra.74()
> shrink_node()
> shrink_zones(inline)
> do_try_to_free_pages(inline)
> try_to_free_pages()
> __perform_reclaim(inline)
> __alloc_pages_direct_reclaim(inline)
> __alloc_pages_slowpath(inline)
> no_zone()
> __alloc_pages(inline)
> __alloc_pages_node(inline)
> alloc_pages_node(inline)
> __page_cache_alloc(inline)
> pagecache_get_page()
> find_or_create_page(inline)
> grab_cache_page(inline)
> f2fs_grab_cache_page(inline)
> __get_node_page.part.32()
> __get_node_page(inline)
> get_node_page()
> update_inode_page()
> f2fs_write_inode()
> write_inode(inline)
> __writeback_single_inode()
> writeback_sb_inodes()
> __writeback_inodes_wb()
> wb_writeback()
> wb_do_writeback(inline)
> wb_workfn()
> 
> The writeback thread enters the direct reclaim path because of low memory and
> gets stuck in shrink_inactive_list(), which is in turn waiting for writeback of
> the dirty pages on the inactive list.

shrink_page_list waits for pages under writeback only when we are in memcg
reclaim. The above seems to be global reclaim, though. Moreover,
GFP_F2FS_ZERO is GFP_NOFS, so we are not waiting for writeback pages at
all. Are you sure the above is really a deadlock?
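
The argument above can be sketched as a small decision function. This is a
simplified model, not the code in mm/vmscan.c: the struct and function names
are hypothetical, and the real check in shrink_page_list involves further
conditions (page flags, kswapd context) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the reclaim context. */
struct model_scan_control {
	bool memcg_reclaim;  /* true for memcg reclaim, false for global */
	bool may_enter_fs;   /* true if the gfp mask includes __GFP_FS */
};

/*
 * Models the two conditions named in the reply: reclaim blocks on a page
 * under writeback only in memcg reclaim, and only if the allocation is
 * allowed to enter the filesystem. Otherwise the page is skipped.
 */
static bool model_waits_for_writeback(const struct model_scan_control *sc)
{
	assert(sc != NULL);
	return sc->memcg_reclaim && sc->may_enter_fs;
}
```

For the reported stack, both conditions are false under this model (global
reclaim, GFP_NOFS), so the allocation would skip such pages rather than
sleep on them.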

> Do you think we can use GFP_NOWAIT for the node mapping's gfp_mask, so that we
> avoid the direct reclaim path in the writeback context? Since allocations may now
> fail with this flag, do you see any risk or issue in using it with respect to
> F2FS and writeback? I would appreciate your suggestions on this.
> 
> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> index 89c838b..d3daf3b 100644
> --- a/fs/f2fs/inode.c
> +++ b/fs/f2fs/inode.c
> @@ -316,7 +316,7 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
>  make_now:
>         if (ino == F2FS_NODE_INO(sbi)) {
>                 inode->i_mapping->a_ops = &f2fs_node_aops;
> -               mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> +               mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_NODE_MAPPING);
>         } else if (ino == F2FS_META_INO(sbi)) {
>                 inode->i_mapping->a_ops = &f2fs_meta_aops;
>                 mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> index 58aecb6..bb985cd 100644
> --- a/include/linux/f2fs_fs.h
> +++ b/include/linux/f2fs_fs.h
> @@ -47,6 +47,7 @@
>  /* This flag is used by node and meta inodes, and by recovery */
>  #define GFP_F2FS_ZERO          (GFP_NOFS | __GFP_ZERO)
>  #define GFP_F2FS_HIGH_ZERO     (GFP_NOFS | __GFP_ZERO | __GFP_HIGHMEM)
> +#define GFP_F2FS_NODE_MAPPING  (GFP_NOWAIT | __GFP_IO | __GFP_ZERO)
> 
> Thanks,
> Sahitya.
> -- 
> --
> Sent by a consultant of the Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

-- 
Michal Hocko
SUSE Labs


* Re: deadlock during writeback when using f2fs filesystem
  2018-06-01 10:26 ` Michal Hocko
@ 2018-06-01 11:20   ` Sahitya Tummala
  2018-06-01 11:27     ` Michal Hocko
  0 siblings, 1 reply; 4+ messages in thread
From: Sahitya Tummala @ 2018-06-01 11:20 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-f2fs-devel, linux-fsdevel, linux-kernel, Jaegeuk Kim,
	Chao Yu, vinmenon

On Fri, Jun 01, 2018 at 12:26:09PM +0200, Michal Hocko wrote:
> On Fri 01-06-18 15:02:35, Sahitya Tummala wrote:
> > Hi,
> > 
> > We are observing a deadlock scenario during FS writeback under low-memory
> > conditions with the F2FS filesystem.
> > 
> > Here is the call stack for this scenario:
> > 
> > shrink_inactive_list()
> > shrink_node_memcg.isra.74()
> > shrink_node()
> > shrink_zones(inline)
> > do_try_to_free_pages(inline)
> > try_to_free_pages()
> > __perform_reclaim(inline)
> > __alloc_pages_direct_reclaim(inline)
> > __alloc_pages_slowpath(inline)
> > no_zone()
> > __alloc_pages(inline)
> > __alloc_pages_node(inline)
> > alloc_pages_node(inline)
> > __page_cache_alloc(inline)
> > pagecache_get_page()
> > find_or_create_page(inline)
> > grab_cache_page(inline)
> > f2fs_grab_cache_page(inline)
> > __get_node_page.part.32()
> > __get_node_page(inline)
> > get_node_page()
> > update_inode_page()
> > f2fs_write_inode()
> > write_inode(inline)
> > __writeback_single_inode()
> > writeback_sb_inodes()
> > __writeback_inodes_wb()
> > wb_writeback()
> > wb_do_writeback(inline)
> > wb_workfn()
> > 
> > The writeback thread enters the direct reclaim path because of low memory and
> > gets stuck in shrink_inactive_list(), which is in turn waiting for writeback of
> > the dirty pages on the inactive list.
> 
> shrink_page_list waits for pages under writeback only when we are in memcg
> reclaim. The above seems to be global reclaim, though. Moreover,
> GFP_F2FS_ZERO is GFP_NOFS, so we are not waiting for writeback pages at
> all. Are you sure the above is really a deadlock?
> 

Let me correct my statement. It could be more of a livelock scenario.

The direct reclaim path is not doing any writeback here, so GFP_NOFS doesn't
make any difference. In this case, direct reclaim has to reclaim ~32 pages,
which it picks from the tail of the list. All of those tail pages are dirty,
and since the direct reclaim path can't do any writeback, it just loops, picking
and skipping them.
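
The loop described above can be illustrated with a toy userspace model. It is
only a sketch under stated assumptions: the names are hypothetical, the list is
a plain array with index 0 standing in for the tail, a pass scans at most 32
pages, and the scanner is assumed to move past pages it has skipped. It is not
how mm/vmscan.c is structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SCAN_BATCH 32  /* the ~32 pages per pass from the message */

struct model_page {
	bool dirty;      /* cannot be freed without writeback */
	bool reclaimed;  /* already freed by an earlier pass */
};

/*
 * Try to reclaim `target` pages in at most `max_passes` passes.
 * Dirty pages are skipped because this context cannot write them back.
 * Returns the number of pages reclaimed.
 */
static size_t model_direct_reclaim(struct model_page *lru, size_t n,
				   size_t target, size_t max_passes)
{
	size_t reclaimed = 0;
	size_t cursor = 0;  /* index 0 models the tail of the list */

	assert(lru != NULL);

	for (size_t pass = 0; pass < max_passes && reclaimed < target; pass++) {
		size_t scanned = 0;

		while (cursor < n && scanned < MODEL_SCAN_BATCH &&
		       reclaimed < target) {
			struct model_page *p = &lru[cursor++];

			scanned++;
			if (p->dirty || p->reclaimed)
				continue;  /* pick it up, then skip it */
			p->reclaimed = true;
			reclaimed++;
		}
		if (cursor == n)
			cursor = 0;  /* whole list scanned: start over */
	}
	return reclaimed;
}
```

In this model the loop makes no progress only when every page on the list is
dirty. If clean pages exist beyond the dirty tail, a later pass reaches them,
so whether the real system livelocks depends on what else is on the list.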

> > Do you think we can use GFP_NOWAIT for the node mapping's gfp_mask, so that we
> > avoid the direct reclaim path in the writeback context? Since allocations may now
> > fail with this flag, do you see any risk or issue in using it with respect to
> > F2FS and writeback? I would appreciate your suggestions on this.
> > 
> > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > index 89c838b..d3daf3b 100644
> > --- a/fs/f2fs/inode.c
> > +++ b/fs/f2fs/inode.c
> > @@ -316,7 +316,7 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
> >  make_now:
> >         if (ino == F2FS_NODE_INO(sbi)) {
> >                 inode->i_mapping->a_ops = &f2fs_node_aops;
> > -               mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> > +               mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_NODE_MAPPING);
> >         } else if (ino == F2FS_META_INO(sbi)) {
> >                 inode->i_mapping->a_ops = &f2fs_meta_aops;
> >                 mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> > diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> > index 58aecb6..bb985cd 100644
> > --- a/include/linux/f2fs_fs.h
> > +++ b/include/linux/f2fs_fs.h
> > @@ -47,6 +47,7 @@
> >  /* This flag is used by node and meta inodes, and by recovery */
> >  #define GFP_F2FS_ZERO          (GFP_NOFS | __GFP_ZERO)
> >  #define GFP_F2FS_HIGH_ZERO     (GFP_NOFS | __GFP_ZERO | __GFP_HIGHMEM)
> > +#define GFP_F2FS_NODE_MAPPING  (GFP_NOWAIT | __GFP_IO | __GFP_ZERO)
> > 
> > Thanks,
> > Sahitya.
> > -- 
> > --
> > Sent by a consultant of the Qualcomm Innovation Center, Inc.
> > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


* Re: deadlock during writeback when using f2fs filesystem
  2018-06-01 11:20   ` Sahitya Tummala
@ 2018-06-01 11:27     ` Michal Hocko
  0 siblings, 0 replies; 4+ messages in thread
From: Michal Hocko @ 2018-06-01 11:27 UTC (permalink / raw)
  To: Sahitya Tummala
  Cc: linux-f2fs-devel, linux-fsdevel, linux-kernel, Jaegeuk Kim,
	Chao Yu, vinmenon

On Fri 01-06-18 16:50:50, Sahitya Tummala wrote:
> On Fri, Jun 01, 2018 at 12:26:09PM +0200, Michal Hocko wrote:
> > On Fri 01-06-18 15:02:35, Sahitya Tummala wrote:
> > > Hi,
> > > 
> > > We are observing a deadlock scenario during FS writeback under low-memory
> > > conditions with the F2FS filesystem.
> > > 
> > > Here is the call stack for this scenario:
> > > 
> > > shrink_inactive_list()
> > > shrink_node_memcg.isra.74()
> > > shrink_node()
> > > shrink_zones(inline)
> > > do_try_to_free_pages(inline)
> > > try_to_free_pages()
> > > __perform_reclaim(inline)
> > > __alloc_pages_direct_reclaim(inline)
> > > __alloc_pages_slowpath(inline)
> > > no_zone()
> > > __alloc_pages(inline)
> > > __alloc_pages_node(inline)
> > > alloc_pages_node(inline)
> > > __page_cache_alloc(inline)
> > > pagecache_get_page()
> > > find_or_create_page(inline)
> > > grab_cache_page(inline)
> > > f2fs_grab_cache_page(inline)
> > > __get_node_page.part.32()
> > > __get_node_page(inline)
> > > get_node_page()
> > > update_inode_page()
> > > f2fs_write_inode()
> > > write_inode(inline)
> > > __writeback_single_inode()
> > > writeback_sb_inodes()
> > > __writeback_inodes_wb()
> > > wb_writeback()
> > > wb_do_writeback(inline)
> > > wb_workfn()
> > > 
> > > The writeback thread enters the direct reclaim path because of low memory and
> > > gets stuck in shrink_inactive_list(), which is in turn waiting for writeback of
> > > the dirty pages on the inactive list.
> > 
> > shrink_page_list waits for pages under writeback only when we are in memcg
> > reclaim. The above seems to be global reclaim, though. Moreover,
> > GFP_F2FS_ZERO is GFP_NOFS, so we are not waiting for writeback pages at
> > all. Are you sure the above is really a deadlock?
> > 
> 
> Let me correct my statement. It could be more of a livelock scenario.
> 
> The direct reclaim path is not doing any writeback here, so GFP_NOFS doesn't
> make any difference. In this case, direct reclaim has to reclaim ~32 pages,
> which it picks from the tail of the list. All of those tail pages are dirty,
> and since the direct reclaim path can't do any writeback, it just loops, picking
> and skipping them.

But there are surely other pages on the LRU list, aren't there?
-- 
Michal Hocko
SUSE Labs

