LKML Archive on lore.kernel.org help / color / mirror / Atom feed
From: Fengguang Wu <wfg@mail.ustc.edu.cn> To: Michael Rubin <mrubin@google.com> Cc: a.p.zijlstra@chello.nl, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [patch] Converting writeback linked lists to a tree based data structure Date: Fri, 18 Jan 2008 12:56:09 +0800 [thread overview] Message-ID: <400632190.14601@ustc.edu.cn> (raw) Message-ID: <E1JFjGz-0001eU-3O@localhost.localdomain> (raw) In-Reply-To: <532480950801171307q4b540ewa3acb6bfbea5dbc8@mail.gmail.com> On Thu, Jan 17, 2008 at 01:07:05PM -0800, Michael Rubin wrote: > On Jan 17, 2008 1:41 AM, Fengguang Wu <wfg@mail.ustc.edu.cn> wrote: > > On Tue, Jan 15, 2008 at 12:09:21AM -0800, Michael Rubin wrote: > > The main benefit of rbtree is possibly better support of future policies. > > Can you demonstrate an example? > > These are ill-formed thoughts as of now on my end but the idea was > that keeping one tree sorted via a scheme might be simpler than > multiple list_heads. Suppose we want to grant longer expiration window for temp files, adding a new list named s_dirty_tmpfile would be a handy solution. So the question is: should we need more than 3 QoS classes? > > The most tricky writeback issues could be starvation prevention > > between > > > > - small/large files > > - new/old files > > - superblocks > > So I have written tests and believe I have covered these issues. If > you are concerned in specific on any and have a test case please let > me know. OK. > > Some kind of limit should be applied for each. They used to be: > > - requeue to s_more_io whenever MAX_WRITEBACK_PAGES is reached > > this preempts big files > > The patch uses th same limit. > > > - refill s_io iif it is drained > > this prevents promotion of big/old files > > Once a big file gets its first do_writepages it is moved behind the > other smaller files via i_flushed_when. And the same in reverse for > big vs old. You mean i_flush_gen? No, sync_sb_inodes() will abort on every MAX_WRITEBACK_PAGES, and s_flush_gen will be updated accordingly. Hence the sync will restart from big/old files. > > > - return from sync_sb_inodes() after one go of s_io > > I am not sure how this limit helps things out. Is this for superblock > starvation? Can you elaborate? We should have a way to go to next superblock even if new dirty inodes or pages are emerging fast in this superblock. Fill and drain s_io only once and then abort helps. s_io is a stable and bounded working set in one go of superblock. > > Michael, could you sort out and document the new starvation prevention schemes? > > The basic idea behind the writeback algorithm to handle starvation. > The over arching idea is that we want to preserve order of writeback > based on when an inode was dirtied and also preserve the dirtied_when > contents until the inode has been written back (partially or fully) > > Every sync_sb_inodes we find the least recent inodes dirtied. To deal > with large or small starvation we have a s_flush_gen for each > iteration of sync_sb_inodes every time we issue a writeback we mark > that the inode cannot be processed until the next s_flush_gen. This > way we don't process one get to the rest since we keep pushing them > into subsequent s_fush_gen's. > > Let me know if you want more detail or structured responses. > > > Introduce i_flush_gen to help restarting from the last inode? > > Well, it's not as simple as list_heads. Basically you make one list_head in each rbtree node. That list_head is recycled cyclic, and is an analog to the old fashioned s_dirty. We need to know 'where we are' and 'where it ends'. So an extra indicator must be introduced - i_flush_gen. It's awkward. We are simply repeating the aged list_heads' problem. > > > 2) Added an inode flag to allow inodes to be marked so that they > > > are never written back to disk. > > > > > > The motivation behind this change is several fold. The first is > > > to insure fairness in the writeback algorithm. The second is to > > > > What do you mean by fairness? > > So originally this comment was written when I was trying to fix a bug > in 2.6.23. The one where we were starving large files from being > flushed. There was a fairness issue where small files were being > flushed but the large ones were just ballooning in memory. In fact the bug is turned-around rather than fixed - now the small files could be starved. > > Why cannot I_WRITEBACK_NEVER be in a decoupled standalone patch? > > The WRITEBACK_NEVER could be in a previous patch to the rbtree. But > not a subsequent patch to the rbtree. The rbtree depends on the > WRITEBACK_NEVER patch otherwise we run in to problems in > generic_delete_inode. Now that you point it out I think I can split > this patch into two patches and make the WRITEBACK_NEVER in the first > one. OK.
prev parent reply other threads:[~2008-01-18 4:56 UTC|newest] Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top 2008-01-15 8:09 [patch] Converting writeback linked lists to a tree based data structure Michael Rubin 2008-01-15 8:46 ` Peter Zijlstra 2008-01-15 17:53 ` Michael Rubin [not found] ` <400452490.28636@ustc.edu.cn> 2008-01-16 3:01 ` Fengguang Wu 2008-01-16 3:44 ` Andrew Morton [not found] ` <400457571.32162@ustc.edu.cn> 2008-01-16 4:25 ` Fengguang Wu 2008-01-16 4:42 ` Andrew Morton [not found] ` <400459376.04290@ustc.edu.cn> 2008-01-16 4:55 ` Fengguang Wu 2008-01-16 5:51 ` Andrew Morton [not found] ` <400474447.19383@ustc.edu.cn> 2008-01-16 9:07 ` Fengguang Wu 2008-01-16 22:35 ` David Chinner [not found] ` <400539769.00869@ustc.edu.cn> 2008-01-17 3:16 ` Fengguang Wu 2008-01-17 5:21 ` David Chinner 2008-01-18 7:36 ` Mike Waychison 2008-01-16 7:55 ` David Chinner 2008-01-16 8:13 ` Andrew Morton [not found] ` <400488821.15609@ustc.edu.cn> 2008-01-16 13:06 ` Fengguang Wu 2008-01-16 18:55 ` Michael Rubin [not found] ` <400540692.29046@ustc.edu.cn> 2008-01-17 3:31 ` Fengguang Wu [not found] ` <400562938.07583@ustc.edu.cn> 2008-01-17 9:41 ` Fengguang Wu 2008-01-17 21:07 ` Michael Rubin 2008-01-18 5:01 ` David Chinner 2008-01-18 5:38 ` Michael Rubin 2008-01-18 8:54 ` David Chinner 2008-01-18 9:26 ` Michael Rubin [not found] ` <400634919.20750@ustc.edu.cn> 2008-01-18 5:41 ` Fengguang Wu 2008-01-19 2:50 ` David Chinner [not found] ` <400632190.14601@ustc.edu.cn> 2008-01-18 4:56 ` Fengguang Wu 2008-01-18 5:41 ` Andi Kleen [not found] ` <400644314.11994@ustc.edu.cn> 2008-01-18 6:01 ` Fengguang Wu 2008-01-18 7:48 ` Mike Waychison 2008-01-18 6:43 ` Michael Rubin [not found] ` <400651538.20437@ustc.edu.cn> 2008-01-18 9:32 ` Fengguang Wu -- strict thread matches above, loose matches on Subject: below -- 2007-12-13 0:32 Michael Rubin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=400632190.14601@ustc.edu.cn \ --to=wfg@mail.ustc.edu.cn \ --cc=a.p.zijlstra@chello.nl \ --cc=akpm@linux-foundation.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mrubin@google.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).