LKML Archive on lore.kernel.org
From: Neil Brown <neilb@suse.de>
To: David Chinner <dgc@sgi.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>, Ric Wheeler <ric@emc.com>,
	device-mapper development <dm-devel@redhat.com>,
	Andi Kleen <andi@firstfloor.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [dm-devel] Re: [PATCH] Implement barrier support for single device DM devices
Date: Thu, 21 Feb 2008 14:39:36 +1100
Message-ID: <18364.62072.219900.681747@notabene.brown>
In-Reply-To: message from David Chinner on Tuesday February 19

On Tuesday February 19, dgc@sgi.com wrote:
> On Mon, Feb 18, 2008 at 04:24:27PM +0300, Michael Tokarev wrote:
> > First, I still don't understand why in God's sake barriers are "working"
> > while regular cache flushes are not.  Almost no consumer-grade hard drive
> > supports write barriers, but they all support regular cache flushes, and
> > the latter should be enough (while not the most speed-optimal) to ensure
> > data safety.  Why to require write cache disable (like in XFS FAQ) instead
> > of going the flush-cache-when-appropriate (as opposed to write-barrier-
> > when-appropriate) way?
> 
> Devil's advocate:
> 
> Why should we need to support multiple different block layer APIs
> to do the same thing? Surely any hardware that doesn't support barrier
> operations can emulate them with cache flushes when they receive a
> barrier I/O from the filesystem....

The simple answer to "why multiple APIs" is "different performance
trade-offs". 
If barriers are implemented at the end of the pipeline, they can
presumably be reasonably cheap.
If they have to be implemented at the top of the pipeline, thus
stalling the whole pipeline, they are likely to be more expensive.

A filesystem may be able to mitigate the expense if it knows something
about the purpose of the data.
e.g. ext3 in data=writeback mode could wait only for journal writes to
complete before submitting the (would-be) barrier write of the commit
block, and would not bother to wait for data writes.
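
A minimal sketch of that ext3 data=writeback idea, as a userspace model
rather than real ext3/jbd code.  The names write_kind, pending_write and
commit_may_proceed are hypothetical, made up for illustration: the commit
block is held back only by incomplete journal writes, and outstanding
data writes are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of an outstanding write. */
enum write_kind { WRITE_JOURNAL, WRITE_DATA };

struct pending_write {
	enum write_kind kind;
	bool completed;		/* has the device reported completion? */
};

/*
 * Return true when the commit block may be submitted.
 * Only journal writes are waited for; data writes are not,
 * which is the data=writeback trade-off described above.
 */
static bool commit_may_proceed(const struct pending_write *w, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (w[i].kind == WRITE_JOURNAL && !w[i].completed)
			return false;
	return true;
}
```

With one completed journal write and one incomplete data write pending,
commit_may_proceed() returns true; an incomplete journal write makes it
return false.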

However, consistent APIs are also a good thing.
I would easily accept an argument that a BIO_RW_BARRIER request must
*always* be correctly ordered around all other requests to the same
device.  If a layered device cannot get the service it requires from
lower level devices, it must do that flush/write/wait itself.
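
As a rough illustration of that fallback, here is a userspace model (not
block-layer code) of a layered device that passes the barrier down when
the lower device can honour it, and otherwise does flush/write/wait
itself.  struct lower_dev, enum op and layered_barrier_write are
hypothetical names, and what each step covers is my assumption: "flush"
means earlier writes are drained and made stable, "wait" means the
barrier write has completed before anything later is issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operations a layered device can issue to the one below. */
enum op { OP_BARRIER_WRITE, OP_FLUSH, OP_WRITE, OP_WAIT };

#define MAX_OPS 8

/* Hypothetical model of a lower-level device; not a kernel structure. */
struct lower_dev {
	bool supports_barrier;	/* can it order a barrier write itself? */
	enum op log[MAX_OPS];	/* operations issued, in order */
	size_t nops;
};

static void issue(struct lower_dev *d, enum op o)
{
	assert(d->nops < MAX_OPS);
	d->log[d->nops++] = o;
}

/*
 * Submit a barrier write through the layered device.  If the lower
 * device cannot provide ordering, emulate it with flush/write/wait.
 */
static void layered_barrier_write(struct lower_dev *d)
{
	if (d->supports_barrier) {
		issue(d, OP_BARRIER_WRITE);	/* cheap path: device orders it */
		return;
	}
	issue(d, OP_FLUSH);	/* earlier requests reach stable storage */
	issue(d, OP_WRITE);	/* the barrier payload as a plain write */
	issue(d, OP_WAIT);	/* nothing later is issued until it completes */
}
```

Either way the caller sees the same ordering guarantee; only the cost
differs.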

That should be paired with a way for the upper levels to find out how
efficient barriers are.  I guess the three levels of barrier
efficiency are:
  1/ handled above the elevator - least efficient
  2/ handled between elevator and device (by 'flush request') - medium
  3/ handled inside device (e.g. ordered SCSI request) - most efficient
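
One way the three levels might be exposed to upper layers, sketched in
plain C.  Nothing here is an existing kernel interface: enum
barrier_level, stacked_level and prefer_own_ordering are hypothetical
names.  The rule that a stacked device is only as efficient as its least
efficient component is my assumption, as is the policy that a filesystem
does its own ordering only at the least efficient level.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical efficiency levels; a higher value is more efficient. */
enum barrier_level {
	BARRIER_ABOVE_ELEVATOR = 1,	/* flush/write/wait done above the elevator */
	BARRIER_FLUSH_REQUEST  = 2,	/* 'flush request' between elevator and device */
	BARRIER_IN_DEVICE      = 3,	/* e.g. ordered SCSI request */
};

/*
 * Level reported by a stacked device: assumed to be the minimum over
 * its components.  With no components, report the least efficient level.
 */
static enum barrier_level stacked_level(const enum barrier_level *lower,
					size_t n)
{
	if (n == 0)
		return BARRIER_ABOVE_ELEVATOR;

	enum barrier_level worst = lower[0];
	for (size_t i = 1; i < n; i++)
		if (lower[i] < worst)
			worst = lower[i];
	return worst;
}

/*
 * A filesystem that knows the purpose of its data might choose to
 * order writes itself when barriers are at their most expensive.
 */
static bool prefer_own_ordering(enum barrier_level level)
{
	return level == BARRIER_ABOVE_ELEVATOR;
}
```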

NeilBrown

