LKML Archive on lore.kernel.org
From: Fengguang Wu <wfg@mail.ustc.edu.cn>
To: David Chinner <dgc@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michael Rubin <mrubin@google.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] Converting writeback linked lists to a tree based data structure
Date: Thu, 17 Jan 2008 11:16:00 +0800
Message-ID: <400539769.00869@ustc.edu.cn>
Message-ID: <E1JFLEW-0002oE-G1@localhost.localdomain>
In-Reply-To: <20080116223510.GY155407@sgi.com>

On Thu, Jan 17, 2008 at 09:35:10AM +1100, David Chinner wrote:
> On Wed, Jan 16, 2008 at 05:07:20PM +0800, Fengguang Wu wrote:
> > On Tue, Jan 15, 2008 at 09:51:49PM -0800, Andrew Morton wrote:
> > > > Then to do better ordering by adopting radix tree(or rbtree
> > > > if radix tree is not enough),
> > >
> > > ordering of what?
> >
> > Switch from time to location.
>
> Note that data writeback may be adversely affected by location
> based writeback rather than time based writeback - think of
> the effect of location based data writeback on an app that
> creates lots of short term (<30s) temp files and then removes
> them before they are written back.

A small (e.g. 5s) time window can still be enforced, but...

> Also, data writeback location cannot be easily derived from
> the inode number in pretty much all cases. "near" in terms
> of XFS means the same AG which means the data could be up to
> a TB away from the inode, and if you have >1TB filesystems
> using the default inode32 allocator, file data is *never*
> placed near the inode - the inodes are in the first TB of
> the filesystem, the data is rotored around the rest of the
> filesystem.
>
> And with delayed allocation, you don't know where the data is even
> going to be written ahead of the filesystem ->writepage call, so you
> can't do optimal location ordering for data in this case.

Agreed.

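The time-window remark above can be read as: only inodes that have been dirty for at least a few seconds become eligible, and the eligible ones are then ordered by location. The following is a minimal userspace sketch of that reading, not kernel code. `DirtyInode`, `eligible_by_location`, the `location` key and the `min_age` parameter are all hypothetical names chosen for illustration, and as noted above a usable location key may simply not exist for data under XFS or delayed allocation.

```python
from dataclasses import dataclass


@dataclass
class DirtyInode:
    ino: int           # inode number
    dirtied_at: float  # time (seconds) at which the inode was dirtied
    location: int      # hypothetical on-disk location key (an assumption)


def eligible_by_location(inodes, now, min_age=5.0):
    """Return inodes dirty for at least min_age seconds, sorted by location.

    Younger inodes are skipped, so a short-lived temp file that is removed
    within the window never reaches writeback in this model.
    """
    old_enough = [i for i in inodes if now - i.dirtied_at >= min_age]
    return sorted(old_enough, key=lambda i: i.location)
```

For example, with inodes dirtied at t=0, t=8 and t=2 and `now=10.0`, the one dirtied at t=8 is held back and the other two are returned in location order.
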
> Hmmmm - I'm wondering if we'd do better to split data writeback from
> inode writeback. i.e. we do two passes. The first pass writes all
> the data back in time order, the second pass writes all the inodes
> back in location order.
>
> Right now we interleave data and inode writeback, (i.e. we do data,
> inode, data, inode, data, inode, ....). I'd much prefer to see all
> data written out first, then the inodes. ->writepage often dirties
> the inode and hence if we need to do multiple do_writepages() calls
> on an inode to flush all the data (e.g. congestion, large amounts of
> data to be written, etc), we really shouldn't be calling
> write_inode() after every do_writepages() call. The inode
> should not be written until all the data is written....

That may be good for XFS. Another case is documented as follows:
"the write_inode() function of a typical fs will perform no I/O, but
will mark buffers in the blockdev mapping as dirty."
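
To make the two-pass proposal concrete, here is a small userspace model that compares it with the interleaved pattern described in the quoted text. It is only a sketch under simplifying assumptions: `DirtyInode`, `interleaved`, `two_pass`, the `chunk` limit and the `location` key are hypothetical, each `("data", ...)` entry stands in for one do_writepages() call, each `("inode", ...)` entry for one write_inode() call, and real writeback would requeue a partially written inode rather than loop on it.

```python
from dataclasses import dataclass


@dataclass
class DirtyInode:
    ino: int           # inode number
    dirtied_at: float  # time (seconds) at which the inode was dirtied
    location: int      # hypothetical on-disk location of the inode itself
    dirty_pages: int   # pages still waiting to be written


def interleaved(inodes, chunk):
    """Model of data, inode, data, inode, ...: an inode write after every data chunk."""
    log = []
    for inode in sorted(inodes, key=lambda i: i.dirtied_at):
        remaining = inode.dirty_pages
        while remaining > 0:
            n = min(chunk, remaining)
            log.append(("data", inode.ino, n))
            log.append(("inode", inode.ino))
            remaining -= n
    return log


def two_pass(inodes, chunk):
    """Pass 1: all data in time order. Pass 2: each inode once, in location order."""
    log = []
    for inode in sorted(inodes, key=lambda i: i.dirtied_at):
        remaining = inode.dirty_pages
        while remaining > 0:
            n = min(chunk, remaining)
            log.append(("data", inode.ino, n))
            remaining -= n
    for inode in sorted(inodes, key=lambda i: i.location):
        log.append(("inode", inode.ino))
    return log
```

In this model an inode with more dirty pages than `chunk` is written several times by `interleaved` but exactly once by `two_pass`, which is the saving the proposal is after.
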