From: David Chinner <dgc@sgi.com>
To: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: David Chinner <dgc@sgi.com>, Michael Rubin <mrubin@google.com>,
    a.p.zijlstra@chello.nl, akpm@linux-foundation.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] Converting writeback linked lists to a tree based data structure
Date: Sat, 19 Jan 2008 13:50:24 +1100
Message-ID: <20080119025024.GW155259@sgi.com>
In-Reply-To: <E1JFjyv-0001hU-FA@localhost.localdomain>

On Fri, Jan 18, 2008 at 01:41:33PM +0800, Fengguang Wu wrote:
> > That is, think of large file writes like process scheduler batch
> > jobs - bulk throughput is what matters, so the larger the time slice
> > you give them the higher the throughput.
> >
> > IMO, the sort of result we should be looking at is a
> > writeback design that results in cycling somewhat like:
> >
> >     slice 1: iterate over small files
> >     slice 2: flush large file 1
> >     slice 3: iterate over small files
> >     slice 4: flush large file 2
> >     ......
> >     slice n-1: flush large file N
> >     slice n: iterate over small files
> >     slice n+1: flush large file N+1
> >
> > So that we keep the disk busy with a relatively fair mix of
> > small and large I/Os while both are necessary.
>
> If we can sync fast enough, the lower layer would be able to merge
> those 4MB requests.

No, not necessarily - think of a stripe with a chunk size of 512k.
That 4MB will be split into 8x512k chunks and sent to different
devices (and hence elevator queues). The only way you get elevator
merging in this sort of config is that if you send multiple stripe
*width* sized amounts to the device in a very short time period.
I see quite a few filesystems with stripe widths in the tens of MB
range.....

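[Editorial illustration, not part of the original message.] The striping arithmetic above can be sketched roughly as follows. The sketch maps a contiguous write onto a round-robin (RAID-0 style) layout and then coalesces contiguous requests per device, the way an elevator might. The function names, the 8-device layout, and the simple chunk mapping are assumptions made for illustration; none of this is taken from the kernel's block layer or from any real volume manager.

```python
from collections import defaultdict

KIB = 1024
MIB = 1024 * KIB


def split_across_stripe(offset, length, chunk_size, n_devices):
    """Map a contiguous write onto per-device (dev_offset, length) requests.

    Assumes a simple round-robin layout: chunk i lives on device
    i % n_devices at device offset (i // n_devices) * chunk_size.
    """
    per_device = defaultdict(list)
    end = offset + length
    while offset < end:
        chunk_index = offset // chunk_size
        within = offset % chunk_size
        run = min(chunk_size - within, end - offset)
        device = chunk_index % n_devices
        dev_offset = (chunk_index // n_devices) * chunk_size + within
        per_device[device].append((dev_offset, run))
        offset += run
    return dict(per_device)


def merge_adjacent(requests):
    """Coalesce requests that are contiguous on a single device."""
    merged = []
    for start, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((start, length))
    return merged


# One 4 MiB write, 512 KiB chunks, 8 devices: each device receives a
# single 512 KiB request, so no per-device queue has anything to merge.
single = split_across_stripe(0, 4 * MIB, 512 * KIB, 8)

# Four stripe-width writes back to back (16 MiB): each device now sees
# four contiguous chunks, which can be merged into one 2 MiB request.
burst = split_across_stripe(0, 16 * MIB, 512 * KIB, 8)
burst_merged = {dev: merge_adjacent(reqs) for dev, reqs in burst.items()}
```

Under these assumptions, merging only becomes possible once several stripe widths' worth of data reach the devices close together, which is the point made in the paragraph above.
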
> > Put simply:
> >
> >     The higher the bandwidth of the device, the more frequently
> >     we need to be servicing the inodes with large amounts of
> >     dirty data to be written to maintain write throughput at a
> >     significant percentage of the device capability.
> >
> > The writeback algorithm needs to take this into account for it
> > to be able to scale effectively for high throughput devices.
>
> Slow queues go full first. Currently the writeback code will skip
> _and_ congestion_wait() for congested filesystems. The better policy
> is to congestion_wait() _after_ all other writable pages have been
> synced.

Agreed. The comments I've made are mainly concerned with getting
efficient flushing of a single device occurring. Interactions between
multiple devices are a separable issue....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
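
[Editorial illustration, not part of the original message.] The slice cycling quoted near the top of the message can be sketched as a generator that alternates a pass over small files with the flush of one large file. The function name and the representation of a slice as a (number, action) pair are hypothetical; this is only a sketch of the ordering, not the kernel's writeback code.

```python
def writeback_slices(large_files):
    """Yield (slice_number, action) pairs in the quoted order:
    a small-file pass, then one large-file flush, repeated."""
    slice_no = 1
    for name in large_files:
        yield slice_no, "iterate over small files"
        yield slice_no + 1, f"flush {name}"
        slice_no += 2


schedule = list(writeback_slices(["large file 1", "large file 2"]))
for number, action in schedule:
    print(f"slice {number}: {action}")
```

Each large file gets a whole slice to itself, which is what gives it the long, uninterrupted run of I/O that the batch-job analogy in the message argues for.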