LKML Archive on lore.kernel.org
From: Al Boldi <a1426z@gawab.com>
To: Theodore Tso <tytso@MIT.EDU>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFD] Incremental fsck
Date: Sun, 13 Jan 2008 14:05:42 +0300
Message-ID: <200801131405.42083.a1426z@gawab.com>
In-Reply-To: <20080112145140.GB6751@mit.edu>
Theodore Tso wrote:
> On Wed, Jan 09, 2008 at 02:52:14PM +0300, Al Boldi wrote:
> > Ok, but let's look at this a bit more opportunistically / optimistically.
> >
> > Even after a black-out shutdown, the corruption is pretty minimal, using
> > ext3fs at least.
>
> After an unclean shutdown, assuming you have decent hardware that
> doesn't lie about when blocks hit iron oxide, you shouldn't have any
> corruption at all. If you have crappy hardware, then all bets are off....

Maybe with barriers...

> > So let's take advantage of this fact and do an optimistic fsck, to
> > assure integrity per-dir, and assume no external corruption. Then
> > we release this checked dir to the wild (optionally ro), and check
> > the next. Once we find external inconsistencies we either fix them
> > unconditionally, based on some preconfigured actions, or present the
> > user with options.
>
> So what can you check? The *only* thing you can check is whether or
> not the directory syntax looks sane, whether the inode structure looks
> sane, and whether or not the blocks reported as belonging to an inode
> look sane.

Which would make this dir/area ready for read/write access.

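To make the scope of such a per-directory check concrete, here is a minimal sketch over a toy in-memory model. Nothing here is ext3 or e2fsck code: `Inode`, `check_dir_locally` and the dict-based layout are hypothetical, chosen only to show the three local checks listed in the quote above (directory syntax, inode structure, block ranges).

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    mode: str                 # "file" or "dir" in this toy model
    links: int                # link count as stored in the inode
    blocks: list = field(default_factory=list)


def check_dir_locally(entries, inodes, total_blocks):
    """Return the problems that are visible from one directory alone.

    entries: dict mapping entry name -> inode number (a single directory).
    inodes:  dict mapping inode number -> Inode.
    """
    problems = []
    for name, ino in entries.items():
        # Directory syntax: the entry name itself must be well formed.
        if not name or "/" in name or "\0" in name:
            problems.append(f"bad name {name!r}")
        inode = inodes.get(ino)
        if inode is None:
            problems.append(f"{name}: points at missing inode {ino}")
            continue
        # Inode structure: fields must hold plausible values.
        if inode.mode not in ("file", "dir") or inode.links < 1:
            problems.append(f"{name}: inode {ino} looks insane")
        # Block pointers: every block must lie inside the filesystem.
        for b in inode.blocks:
            if not 0 <= b < total_blocks:
                problems.append(f"{name}: block {b} out of range")
    return problems
```

Note what the sketch cannot do: it never compares `links` against the number of entries in other directories, and it never asks whether another inode claims the same block. Those are global properties.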
> What is very hard to check is whether or not the link count on the
> inode is correct. Suppose the link count is 1, but there are actually
> two directory entries pointing at it. Now when someone unlinks the
> file through one of the directory hard entries, the link count will go
> to zero, and the blocks will start to get reused, even though the
> inode is still accessible via another pathname. Oops. Data Loss.

We could buffer this, and only actually let the freed blocks be overwritten
once we are completely finished with the fsck.

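A rough sketch of that buffering idea, again on a toy model rather than any real filesystem code. `ToyFS`, `pending` and `finish_fsck` are made-up names, and the sketch assumes the directory tree is not being modified while the final pass runs, which is exactly the hard part raised in the quoted text.

```python
from collections import Counter


class ToyFS:
    def __init__(self):
        self.dirs = {}      # directory name -> {entry name: inode number}
        self.links = {}     # inode number -> stored link count
        self.blocks = {}    # inode number -> list of block numbers
        self.free = set()   # blocks that may be reused right now
        self.pending = {}   # inode number -> blocks held back until fsck ends
        self.fsck_done = False

    def unlink(self, d, name):
        ino = self.dirs[d].pop(name)
        self.links[ino] -= 1
        if self.links[ino] == 0:
            if self.fsck_done:
                # Link counts are trusted: release the blocks immediately.
                self.free.update(self.blocks.pop(ino))
                del self.links[ino]
            else:
                # Link count not yet verified: buffer the free instead.
                self.pending[ino] = self.blocks[ino]

    def finish_fsck(self):
        # Global pass: count directory entries per inode over ALL directories.
        seen = Counter(ino for d in self.dirs.values() for ino in d.values())
        for ino in list(self.links):
            if seen[ino] > 0:
                self.links[ino] = seen[ino]   # repair the stored count
            else:
                # Toy behaviour: unreferenced inodes are simply released.
                self.free.update(self.blocks.pop(ino))
                del self.links[ino]
        self.pending.clear()
        self.fsck_done = True
```

With an inode whose stored count is 1 but which has two directory entries, the first `unlink` only parks its blocks in `pending`; `finish_fsck` then sees the surviving entry, restores the count to 1, and nothing is reused while the file is still reachable.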
> This is why doing incremental, on-line fsck'ing is *hard*. You're not
> going to find this while doing each directory one at a time, and if
> the filesystem is changing out from under you, it gets worse. And
> it's not just the hard link count. There is a similar issue with the
> block allocation bitmap. Detecting the case where two files are
> simultaneously claiming the same block can't be done if you are doing
> it incrementally, and if
> the filesystem is changing out from under you, it's impossible, unless
> you also have the filesystem telling you every single change while it
> is happening, and you keep an insane amount of bookkeeping.
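The block-sharing case can be illustrated with a small sketch. `find_shared_blocks` is a hypothetical helper over a toy dict, not an e2fsck routine; it only shows that the check needs a map built from every inode at once.

```python
def find_shared_blocks(inode_blocks):
    """Find blocks claimed by more than one inode.

    inode_blocks: dict mapping inode number -> list of block numbers.
    Returns {block: sorted list of claiming inodes} for shared blocks only.
    """
    owners = {}
    for ino, blocks in inode_blocks.items():
        for b in blocks:
            owners.setdefault(b, []).append(ino)
    return {b: sorted(inos) for b, inos in owners.items() if len(inos) > 1}
```

Run on a single inode's block list the result is always empty, so checking files one at a time reports nothing; the conflict only shows up once all block lists are in the same map, and that map goes stale as soon as the filesystem allocates or frees a block underneath it.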

Ok, you have a point, so how about we change the implementation detail a bit,
from external fsck to internal fsck, leveraging the internal fs bookkeeping,
while allowing immediate but controlled read/write access.

Thanks for more thoughts!

--
Al