LKML Archive on lore.kernel.org
From: "Jörn Engel" <joern@lazybastard.org>
To: Juan Piernas Canovas <piernas@ditec.um.es>
Cc: Sorin Faibish <sfaibish@emc.com>,
kernel list <linux-kernel@vger.kernel.org>
Subject: Re: [ANNOUNCE] DualFS: File System with Meta-data and Data Separation
Date: Fri, 23 Feb 2007 13:26:45 +0000
Message-ID: <20070223132645.GB11653@lazybastard.org>
In-Reply-To: <Pine.LNX.4.61.0702222033460.19296@ditec.inf.um.es>
On Thu, 22 February 2007 20:57:12 +0100, Juan Piernas Canovas wrote:
>
> I do not agree with this picture, because it does not show that all the
> indirect blocks which point to a direct block are along with it in the
> same segment. That figure should look like:
>
> Segment 1: [some data] [ DA D1' D2' ] [more data]
> Segment 2: [some data] [ D0 D1' D2' ] [more data]
> Segment 3: [some data] [ DB D1 D2 ] [more data]
>
> where D0, DA, and DB are datablocks, D1 and D2 indirect blocks which
> point to the datablocks, and D1' and D2' obsolete copies of those
> indirect blocks. By using this figure, it is clear that if you need to
> move D0 to clean the segment 2, you will need only one free segment at
> most, and not more. You will get:
>
> Segment 1: [some data] [ DA D1' D2' ] [more data]
> Segment 2: [ free ]
> Segment 3: [some data] [ DB D1' D2' ] [more data]
> ......
> Segment n: [ D0 D1 D2 ] [ empty ]
>
> That is, D0 needs in the new segment the same space that it needs in the
> previous one.
>
> The differences are subtle but important.
Ah, now I see. Yes, that is deadlock-free. If you are not accounting
the bytes of used space but the number of used segments, and you count
each partially used segment the same as a 100% used segment, there is no
deadlock.
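A rough sketch of that accounting, to make the argument concrete. This is an illustrative model, not DualFS code: the function names, the one-segment cleaner reserve, and the segment size are all assumptions. It relies on the layout described above, where every data block sits in the same segment as the indirect blocks that point to it.

```python
# Illustrative model only; names and sizes are assumptions, not DualFS code.

def blocks_to_relocate(live_data_blocks, indirect_depth):
    """Blocks written when cleaning a segment: each live data block plus
    fresh copies of the indirect blocks that point to it."""
    return live_data_blocks * (1 + indirect_depth)


def free_segments_needed(live_data_blocks, indirect_depth, segment_blocks):
    """Free segments the cleaner needs to empty one victim segment."""
    # Layout assumption: each data block shares its segment with its
    # indirect blocks (current or obsolete), so the victim already holds
    # live_data_blocks * (1 + indirect_depth) blocks for them.
    occupied = live_data_blocks * (1 + indirect_depth)
    if occupied > segment_blocks:
        raise ValueError("layout assumption violated: victim cannot hold this")
    needed = blocks_to_relocate(live_data_blocks, indirect_depth)
    return -(-needed // segment_blocks)  # ceiling division


def writable_segments(total_segments, segments_in_use, reserve=1):
    """Segment-granular accounting: a partially used segment counts the
    same as a full one, and `reserve` segments are held back for the
    cleaner (the reserve of 1 is an assumption of this sketch)."""
    return max(total_segments - segments_in_use - reserve, 0)
```

Because the relocated blocks never exceed what the victim already held, `free_segments_needed` is at most 1 for any victim that satisfies the layout assumption, so a single reserved segment is enough to keep the cleaner going.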
Some people may consider this to be cheating, however. It will cause
more than 50% wasted space. All obsolete copies are garbage, after all.
With a maximum tree height of N, you can have up to (N-1) / N of your
filesystem occupied by garbage.
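To put numbers on that bound, here is a small sketch. It assumes the tree height N counts the data block itself, so that in the worst case each group of N blocks holds one live block and N-1 obsolete copies, as in [ DA D1' D2' ] above. The function names are made up for illustration.

```python
from fractions import Fraction


def worst_case_garbage_fraction(tree_height):
    """Upper bound (N-1)/N on the share of space held by obsolete copies,
    for a maximum tree height of N (data block level included)."""
    if tree_height < 1:
        raise ValueError("tree height must be at least 1")
    return Fraction(tree_height - 1, tree_height)


def worst_case_live_blocks(total_blocks, tree_height):
    """Blocks left for live content if every group of N blocks carries
    N-1 obsolete copies (rounded down)."""
    return total_blocks // tree_height
```

With N = 2 the bound is exactly one half; any deeper tree pushes it above 50%, for example 2/3 for N = 3.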
It also means that "df" will have unexpected output. You cannot
estimate how much data can fit into the filesystem, as that depends on
how much garbage you will accumulate in the segments. Admittedly this
is not a problem for DualFS, as the uncertainty only exists for
metadata, so "df" for DualFS still makes sense.
Another downside is that with large amounts of garbage between otherwise
useful data, your disk cache hit rate goes down. Read performance
suffers. But that may be a fair tradeoff and will only show up in
large metadata reads in the uncached (per Linux) case. Seems fair.
Quite interesting, actually. The costs of your design are disk space,
depending on the amount and depth of your metadata, and metadata read
performance. Disk space is cheap and metadata reads tend to be slow for
most filesystems, in comparison to data reads. In return you get faster
metadata writes and no journal overhead. I like the idea.
Jörn
--
All art is but imitation of nature.
-- Lucius Annaeus Seneca