LKML Archive on lore.kernel.org
From: Josef Bacik <jbacik@fb.com>
To: <linux-fsdevel@vger.kernel.org>, <david@fromorbit.com>,
<viro@zeniv.linux.org.uk>, <jack@suse.cz>,
<linux-kernel@vger.kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Subject: [PATCH 1/9] writeback: plug writeback at a high level
Date: Tue, 10 Mar 2015 15:45:16 -0400
Message-ID: <1426016724-23912-2-git-send-email-jbacik@fb.com>
In-Reply-To: <1426016724-23912-1-git-send-email-jbacik@fb.com>
From: Dave Chinner <dchinner@redhat.com>
Doing writeback on lots of little files causes terrible IOPS storms
because of the per-mapping writeback plugging we do. This
essentially causes immediate dispatch of IO for each mapping,
regardless of the context in which writeback is occurring.
IOWs, running a concurrent write-lots-of-small-4k-files workload using fs_mark
on XFS results in a huge number of IOPS being issued for data
writes. Metadata writes are sorted and plugged at a high level by
XFS, so aggregate nicely into large IOs. However, data writeback IOs
are dispatched in individual 4k IOs, even when the blocks of two
consecutively written files are adjacent.
Test VM: 8p, 8GB RAM, 4xSSD in RAID0, 100TB sparse XFS filesystem,
metadata CRCs enabled.
Kernel: 3.10-rc5 + xfsdev + my 3.11 xfs queue (~70 patches)
Test:
$ ./fs_mark -D 10000 -S0 -n 10000 -s 4096 -L 120 \
	-d /mnt/scratch/0 -d /mnt/scratch/1 -d /mnt/scratch/2 \
	-d /mnt/scratch/3 -d /mnt/scratch/4 -d /mnt/scratch/5 \
	-d /mnt/scratch/6 -d /mnt/scratch/7
Result:
              wall      sys       create rate     Physical write IO
              time      CPU       (avg files/s)   IOPS      Bandwidth
              -----     -----     ------------    ------    ---------
unpatched     6m56s     15m47s    24,000+/-500    26,000    130MB/s
patched       5m06s     13m28s    32,800+/-600     1,500    180MB/s
improvement   -26.44%   -14.68%   +36.67%         -94.23%   +38.46%
If I use zero length files, this workload runs at about 500 IOPS, so
plugging drops the data IOs from roughly 25,500/s to 1,000/s.
3 lines of code, 35% better throughput for 15% less CPU.
The benefits of plugging at this layer are likely to be higher for
spinning media as the IO patterns for this workload are going to make a
much bigger difference on high IO latency devices.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/fs-writeback.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index e907052..a9ff2b7 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -659,7 +659,9 @@ static long writeback_sb_inodes(struct super_block *sb,
 	unsigned long start_time = jiffies;
 	long write_chunk;
 	long wrote = 0;  /* count both pages and inodes */
+	struct blk_plug plug;

+	blk_start_plug(&plug);
 	while (!list_empty(&wb->b_io)) {
 		struct inode *inode = wb_inode(wb->b_io.prev);

@@ -756,6 +758,7 @@ static long writeback_sb_inodes(struct super_block *sb,
 				break;
 		}
 	}
+	blk_finish_plug(&plug);
 	return wrote;
 }

--
1.9.3