LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Theodore Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: Paolo Valente <paolo.valente@linaro.org>,
	"Srivatsa S. Bhat" <srivatsa@csail.mit.edu>,
	linux-fsdevel@vger.kernel.org,
	linux-block <linux-block@vger.kernel.org>,
	linux-ext4@vger.kernel.org, cgroups@vger.kernel.org,
	kernel list <linux-kernel@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>,
	jmoyer@redhat.com, amakhalov@vmware.com, anishs@vmware.com,
	srivatsab@vmware.com
Subject: Re: CFQ idling kills I/O performance on ext4 with blkio cgroup controller
Date: Tue, 21 May 2019 12:31:08 -0400	[thread overview]
Message-ID: <20190521163108.GB2591@mit.edu> (raw)
In-Reply-To: <20190521091026.GA17019@quack2.suse.cz>

On Tue, May 21, 2019 at 11:10:26AM +0200, Jan Kara wrote:
> > [root@localhost tmp]# dd if=/dev/zero of=/root/test.img bs=512 count=10000 oflag=dsync
> 
> Yes and that's expected. It just shows how inefficient small synchronous IO
> is. Look, dd(1) writes 512-bytes. From FS point of view we have to write:
> full fs block with data (+4KB), inode to journal (+4KB), journal descriptor
> block (+4KB), journal superblock (+4KB), transaction commit block (+4KB) -
> so that's 20KB just from top of my head to write 512 bytes...

Well, it's not *that* bad.  With fdatasync(), we're only having to do
this worse case thing every 8 writes.  The other writes, we don't
actually need to do any file-system level block allocation, so it's
only a 512 byte write to the disk[1] seven out of eight writes.

That's also true for the slice_idle hit, of course, We only need to do
a jbd2 transaction when there is a block allocation, and that's only
going to happen one in eight writes.

       	   	      	     	     	   - Ted

[1] Of course, small synchronous writes to a HDD are *also* terrible
for performance, just from the HDD's perspective.  For a random write
workload, if you are using disks with a 4k physical sector size, it's
having to do a read/modify/write for each 512 byte write.  And HDD
vendors are talking about wanting to go to a 32k or 64k physical
sector size...  In this sequential write workload, you'll mostly be
shielded from this by the HDD's cache, but the fact that you have to
wait for the bits to hit the platter is always going to be painful.

  reply	other threads:[~2019-05-21 16:32 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-17 22:16 Srivatsa S. Bhat
2019-05-18 18:39 ` Paolo Valente
2019-05-18 19:28   ` Theodore Ts'o
2019-05-20  9:15     ` Jan Kara
2019-05-20 10:45       ` Paolo Valente
2019-05-21 16:48       ` Theodore Ts'o
2019-05-21 18:19         ` Josef Bacik
2019-05-21 19:10           ` Theodore Ts'o
2019-05-20 10:38     ` Paolo Valente
2019-05-21  7:38       ` Andrea Righi
2019-05-18 20:50   ` Srivatsa S. Bhat
2019-05-20 10:19     ` Paolo Valente
2019-05-20 22:45       ` Srivatsa S. Bhat
2019-05-21  6:23         ` Paolo Valente
2019-05-21  7:19           ` Srivatsa S. Bhat
2019-05-21  9:10           ` Jan Kara
2019-05-21 16:31             ` Theodore Ts'o [this message]
2019-05-21 11:25       ` Paolo Valente
2019-05-21 13:20         ` Paolo Valente
2019-05-21 16:21           ` Paolo Valente
2019-05-21 17:38             ` Paolo Valente
2019-05-21 22:51               ` Srivatsa S. Bhat
2019-05-22  8:05                 ` Paolo Valente
2019-05-22  9:02                   ` Srivatsa S. Bhat
2019-05-22  9:12                     ` Paolo Valente
2019-05-22 10:02                       ` Srivatsa S. Bhat
2019-05-22  9:09                   ` Paolo Valente
2019-05-22 10:01                     ` Srivatsa S. Bhat
2019-05-22 10:54                       ` Paolo Valente
2019-05-23  2:30                         ` Srivatsa S. Bhat
2019-05-23  9:19                           ` Paolo Valente
2019-05-23 17:22                             ` Paolo Valente
2019-05-23 23:43                               ` Srivatsa S. Bhat
2019-05-24  6:51                                 ` Paolo Valente
2019-05-24  7:56                                   ` Paolo Valente
2019-05-29  1:09                                   ` Srivatsa S. Bhat
2019-05-29  7:41                                     ` Paolo Valente
2019-05-30  8:29                                       ` Srivatsa S. Bhat
2019-05-30 10:45                                         ` Paolo Valente
2019-06-02  7:04                                           ` Srivatsa S. Bhat
2019-06-11 22:34                                             ` Srivatsa S. Bhat
2019-06-12 13:04                                               ` Jan Kara
2019-06-12 19:36                                                 ` Srivatsa S. Bhat
2019-06-13  6:02                                                   ` Greg Kroah-Hartman
2019-06-13 19:03                                                     ` Srivatsa S. Bhat
2019-06-13  8:20                                                   ` Jan Kara
2019-06-13 19:05                                                     ` Srivatsa S. Bhat
2019-06-13  8:37                                                   ` Jens Axboe
2019-06-13  5:46                                               ` Paolo Valente
2019-06-13 19:13                                                 ` Srivatsa S. Bhat
2019-05-23 23:32                           ` Srivatsa S. Bhat
2019-05-30  8:38                             ` Srivatsa S. Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190521163108.GB2591@mit.edu \
    --to=tytso@mit.edu \
    --cc=amakhalov@vmware.com \
    --cc=anishs@vmware.com \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=jack@suse.cz \
    --cc=jmoyer@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paolo.valente@linaro.org \
    --cc=srivatsa@csail.mit.edu \
    --cc=srivatsab@vmware.com \
    --subject='Re: CFQ idling kills I/O performance on ext4 with blkio cgroup controller' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).