Linux-Fsdevel Archive on lore.kernel.org
From: "zhangyi (F)" <yi.zhang@huawei.com>
To: <linux-ext4@vger.kernel.org>, <tytso@mit.edu>, <jack@suse.com>
Cc: <adilger.kernel@dilger.ca>, <zhangxiaoxu5@huawei.com>,
	<linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v3 0/5] ext4: fix inconsistency since async write metadata buffer error
Date: Mon, 13 Jul 2020 09:40:47 +0800
Message-ID: <4b8a3738-cf3a-a1fb-06d6-c14436cf2cf4@huawei.com>
In-Reply-To: <20200620025427.1756360-1-yi.zhang@huawei.com>

Hi Ted and Jan, what do you think about this solution?

Thanks,
Yi.

On 2020/6/20 10:54, zhangyi (F) wrote:
> Changes since v2:
>  - Christoph objected to adding a callback in the block layer that
>    would let ext4 handle write errors. So for simplicity, switch to
>    checking the bdev mapping->wb_err when ext4 gets journal write
>    access, as Jan suggested. Maybe we could implement the callback in
>    the future by introducing a special inode (e.g. a meta inode) for
>    ext4.
>  - Patch 1: Add mapping->wb_err check and invoke ext4_error_err() in
>    ext4_journal_get_write_access() if wb_err is different from the
>    original one saved at mount time.
>  - Patch 2-3: Remove the partial fixes <7963e5ac90125> and <9c83a923c67d>.
>  - Patch 4: Fix another inconsistency problem: we may bypass the
>    journal's checkpoint procedure if we free metadata buffers whose
>    async write-out failed.
>  - Patch 5: Just a cleanup patch.
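
The check described for Patch 1 can be modelled in user space roughly as
follows. This is only a sketch, not the kernel code: the structures, the
sequence counter standing in for mapping->wb_err, and every function name
here are hypothetical, and the real patch calls ext4_error_err() rather
than setting a flag.

```c
/* User-space model only; all names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the block device mapping's writeback error state. */
struct bdev_mapping {
    unsigned int wb_err_seq; /* bumped each time an async write fails */
    int wb_err;              /* most recent error, as a negative errno */
};

/* Stand-in for per-mount filesystem state. */
struct fs_state {
    struct bdev_mapping *mapping;
    unsigned int saved_seq;  /* snapshot of wb_err_seq taken at mount */
    bool aborted;
};

static void fs_mount(struct fs_state *fs, struct bdev_mapping *m)
{
    fs->mapping = m;
    fs->saved_seq = m->wb_err_seq;
    fs->aborted = false;
}

/* Simulated end-of-write path for a failed background write. */
static void record_async_write_error(struct bdev_mapping *m, int err)
{
    assert(err < 0);
    m->wb_err = err;
    m->wb_err_seq++;
}

/* Model of the check made when journal write access is requested. */
static int journal_get_write_access(struct fs_state *fs)
{
    if (fs->aborted)
        return -EROFS;
    if (fs->mapping->wb_err_seq != fs->saved_seq) {
        /* An async write failed since mount: report it and stop. */
        fs->saved_seq = fs->mapping->wb_err_seq;
        fs->aborted = true;
        return fs->mapping->wb_err;
    }
    return 0;
}
```

In this model, the first write access after a failed background write
returns the recorded error, and later calls are refused.
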
>    
> The above 5 patches are based on linux-5.8-rc1 and have been tested with
> xfstests, with no new failures.
> 
> Thanks,
> Yi.
> 
> -----------------------
> 
> Original background
> ===================
> 
> This patch set aims to fix the inconsistency problem that was discussed
> and partially fixed in [1].
> 
> The problem occurs on unstable storage with a flaky transport (e.g. an
> iSCSI transport may disconnect for a few seconds and reconnect because
> of a bad network environment). If an async metadata write fails in the
> background, the end-write routine in the block layer clears the buffer's
> uptodate flag, even though the data in that buffer is actually up to
> date. We may then read "old && inconsistent" metadata from the disk when
> we get the buffer later, because the uptodate flag was cleared and we do
> not check the write io error flag, or, even worse, because the buffer
> has been freed due to memory pressure.
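
The failure mode above can be illustrated with a toy user-space model.
It is a sketch under simplifying assumptions: one integer stands in for a
metadata block, and struct buffer and the three functions are
hypothetical names, not the kernel's buffer_head API.

```c
/* Toy model of a cached metadata buffer; names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct buffer {
    int data;            /* in-memory (newest) contents */
    bool uptodate;       /* cleared by a failed async write */
    bool write_io_error; /* set by a failed async write */
};

/* Completion of a background write; *disk is the on-disk copy. */
static void end_async_write(struct buffer *bh, int *disk, bool io_ok)
{
    if (io_ok) {
        *disk = bh->data;
        bh->write_io_error = false;
    } else {
        bh->uptodate = false;    /* although bh->data is still the newest */
        bh->write_io_error = true;
    }
}

/* Reader that trusts only the uptodate flag: re-reads stale disk data. */
static int get_buffer_naive(struct buffer *bh, const int *disk)
{
    if (!bh->uptodate) {
        bh->data = *disk;        /* overwrites newer in-memory contents */
        bh->uptodate = true;
    }
    return bh->data;
}

/* Reader that treats a write-error buffer as still holding valid data. */
static int get_buffer_checked(struct buffer *bh, const int *disk)
{
    if (!bh->uptodate && bh->write_io_error)
        bh->uptodate = true;     /* keep the in-memory copy */
    return get_buffer_naive(bh, disk);
}
```

The checked reader only helps while the buffer is still in memory; once
the buffer has been freed, neither reader can recover the newer contents.
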
> 
> Fortunately, if jbd2 does a checkpoint after the async IO error happens,
> the checkpoint routine will check the write_io_error flag and abort the
> journal if it detects an IO error. And in the journal recovery case, the
> recovery code invokes sync_blockdev() after recovery completes, which
> will also detect the IO error and refuse to mount the filesystem.
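
The checkpoint detection described here, and the gap that Patch 4 in the
changelog closes (a buffer freed before any checkpoint sees it), might be
modelled like this. It is a simplified user-space sketch; struct journal,
struct meta_buffer and the functions below are hypothetical and do not
mirror the jbd2 data structures.

```c
/* Simplified model; all names are hypothetical, not jbd2 APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct meta_buffer {
    bool write_io_error;     /* a background write of this buffer failed */
    bool on_checkpoint_list; /* still waiting to be checkpointed */
};

struct journal {
    bool aborted;
    int err;
};

static void journal_abort(struct journal *j, int err)
{
    assert(j != NULL);
    if (!j->aborted) {
        j->aborted = true;
        j->err = err;
    }
}

/* Checkpoint: inspect every buffer still on the checkpoint list. */
static int journal_checkpoint(struct journal *j, struct meta_buffer *bufs,
                              size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (bufs[i].on_checkpoint_list && bufs[i].write_io_error)
            journal_abort(j, -EIO);
        bufs[i].on_checkpoint_list = false;
    }
    return j->aborted ? j->err : 0;
}

/*
 * Memory-pressure path: the buffer leaves the checkpoint list before any
 * checkpoint runs. With abort_on_error false the error is silently lost;
 * with it true the journal is aborted at free time.
 */
static void try_to_free_buffer(struct journal *j, struct meta_buffer *b,
                               bool abort_on_error)
{
    if (abort_on_error && b->write_io_error)
        journal_abort(j, -EIO);
    b->on_checkpoint_list = false;
}
```

Without the check in the free path, a later checkpoint in this model
finds nothing wrong, which is the bypass that Patch 4 describes.
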
> 
> Current ext4 already deals with this problem in __ext4_get_inode_loc()
> and commit 7963e5ac90125 ("ext4: treat buffers with write errors as
> containing valid data"), but that is not enough.
> 
> [1] https://lore.kernel.org/linux-ext4/20190823030207.GC8130@mit.edu/
> 
> 
> zhangyi (F) (5):
>   ext4: abort the filesystem if failed to async write metadata buffer
>   ext4: remove ext4_buffer_uptodate()
>   ext4: remove write io error check before read inode block
>   jbd2: abort journal if free a async write error metadata buffer
>   jbd2: remove unused parameter in jbd2_journal_try_to_free_buffers()
> 
>  fs/ext4/ext4.h        | 16 +++-------------
>  fs/ext4/ext4_jbd2.c   | 25 +++++++++++++++++++++++++
>  fs/ext4/inode.c       | 15 +++------------
>  fs/ext4/super.c       | 23 ++++++++++++++++++++---
>  fs/jbd2/transaction.c | 20 ++++++++++++++------
>  include/linux/jbd2.h  |  2 +-
>  6 files changed, 66 insertions(+), 35 deletions(-)
> 


