From: Ric Wheeler <ric@emc.com>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Tejun Heo <htejun@gmail.com>, Robert Hancock <hancockr@shaw.ca>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ide@vger.kernel.org, edmudama@gmail.com,
	Nicolas.Mailhot@LaPoste.net, Jeff Garzik <jeff@garzik.org>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>, Mark Lord <mlord@pobox.com>,
	Dongjun Shin <d.j.shin@samsung.com>,
	Hannes Reinecke <hare@suse.de>
Subject: Re: libata FUA revisited
Date: Thu, 22 Feb 2007 17:40:21 -0500
Message-ID: <45DE1BD5.1000902@emc.com>
In-Reply-To: <20070221084613.GB3924@kernel.dk>

Jens Axboe wrote:
> On Wed, Feb 21 2007, Tejun Heo wrote:
>> [cc'ing Ric, Hannes and Dongjun, Hello.  Feel free to drag other people in.]
>>
>> Robert Hancock wrote:
>>> Jens Axboe wrote:
>>>> But we can't really change that, since you need the cache flushed before
>>>> issuing the FUA write. I've been advocating for an ordered bit for
>>>> years, so that we could just do:
>>>>
>>>> 3. w/FUA+ORDERED
>>>>
>>>> normal operation -> barrier issued -> write barrier FUA+ORDERED
>>>>  -> normal operation resumes
>>>>
>>>> So we don't have to serialize everything both at the block and device
>>>> level. I would have made FUA imply this already, but apparently it's not
>>>> what MS wanted FUA for, so... The current implementations take the FUA
>>>> bit (or WRITE FUA) as a hint to boost it to head of queue, so you are
>>>> almost certainly going to jump ahead of already queued writes. Which we
>>>> of course really do not want.
>> Yeah, I think if we have tagged write command and flush tagged (or
>> barrier tagged) things can be pretty efficient.  Again, I'm much more
>> comfortable with separate opcodes for those rather than bits changing
>> the behavior.
> 
> ORDERED+FUA NCQ would still be preferable to an NCQ enabled flush
> command, though.
> 
>> Another idea Dongjun talked about while drinking in LSF was ranged
>> flush.  Not as flexible/efficient as the previous option but much less
>> intrusive and should help quite a bit, I think.
> 
> But that requires extensive tracking, I'm not so sure the implementation
> of that for barriers would be very clean. It'd probably be good for
> fsync, though.
> 

If we could invent any mechanism, the nicest would seem to be
independent sequences of IO requests (say, with a distinct tag per
sequence) plus the ability to issue a per-sequence flush request.
That might tie into the QoS support, but it would still have issues
when you try to map it back up the stack through the journal and into
application-level promises of data integrity.
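
To make the idea concrete, here is a minimal userspace sketch of
per-sequence tagging and flushing. It is only a toy model, not an
existing kernel or ATA interface: struct dev_cache, submit_write(),
flush_seq() and pending() are hypothetical names made up for
illustration, and "durable" just stands in for "has reached stable
media".

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_REQS 16

/* Hypothetical model of one cached write, tagged with its sequence. */
struct req {
	int seq;       /* sequence tag (assumed, not a real ATA field) */
	bool durable;  /* true once flushed out of the volatile cache */
};

/* Hypothetical model of the drive's volatile write cache. */
struct dev_cache {
	struct req reqs[MAX_REQS];
	size_t n;
};

/* Queue a write in the cache; returns false if the cache is full. */
static bool submit_write(struct dev_cache *c, int seq)
{
	if (c->n == MAX_REQS)
		return false;
	c->reqs[c->n].seq = seq;
	c->reqs[c->n].durable = false;
	c->n++;
	return true;
}

/* Flush only the writes of one sequence; returns how many were flushed. */
static size_t flush_seq(struct dev_cache *c, int seq)
{
	size_t flushed = 0;

	for (size_t i = 0; i < c->n; i++) {
		if (c->reqs[i].seq == seq && !c->reqs[i].durable) {
			c->reqs[i].durable = true;
			flushed++;
		}
	}
	return flushed;
}

/* Count the writes of a sequence that are still only in the cache. */
static size_t pending(const struct dev_cache *c, int seq)
{
	size_t count = 0;

	for (size_t i = 0; i < c->n; i++)
		if (c->reqs[i].seq == seq && !c->reqs[i].durable)
			count++;
	return count;
}
```

The point of the model is that flushing one sequence leaves writes
from every other sequence untouched in the cache, so unrelated IO
streams do not pay for each other's flushes.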

For example, in data journal mode, we would probably need to flush not
only the transaction-level data, but also, before that, every data
sequence that had IOs in that transaction.
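
A rough sketch of that ordering, again as a toy userspace model under
stated assumptions: struct jtxn, jtxn_add_io(), jtxn_commit() and the
flush log are all hypothetical, and the journal is assumed to live in
its own sequence. The commit flushes every data sequence touched by
the transaction first and the journal sequence last.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEQS 8

/* Hypothetical transaction: remembers which data sequences it touched. */
struct jtxn {
	bool touched[MAX_SEQS];
};

/* Records the order in which per-sequence flushes were issued. */
struct flush_log {
	int order[MAX_SEQS + 1];
	size_t n;
};

/* Stand-in for issuing a per-sequence flush to the device. */
static void log_flush(struct flush_log *log, int seq)
{
	if (log->n < MAX_SEQS + 1)
		log->order[log->n++] = seq;
}

/* Note that the transaction has IO in the given data sequence. */
static void jtxn_add_io(struct jtxn *t, int seq)
{
	if (seq >= 0 && seq < MAX_SEQS)
		t->touched[seq] = true;
}

/*
 * Commit: flush all touched data sequences first, then the journal
 * sequence that holds the transaction-level data.
 */
static void jtxn_commit(const struct jtxn *t, int journal_seq,
			struct flush_log *log)
{
	for (int s = 0; s < MAX_SEQS; s++)
		if (t->touched[s] && s != journal_seq)
			log_flush(log, s);
	log_flush(log, journal_seq);
}
```

Even in this small model the transaction has to carry a set of
dependent sequences, which is where the bookkeeping starts to grow.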

Pretty rapidly, we start to get into the database notions of nested 
transactions and so on ;-)

ric

