From: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
To: jens.axboe@oracle.com, agk@redhat.com,
James.Bottomley@HansenPartnership.com
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
dm-devel@redhat.com, j-nomura@ce.jp.nec.com,
k-ueda@ct.jp.nec.com, stefan.bader@canonical.com,
akpm@linux-foundation.org
Subject: [PATCH 00/13] request-based dm-multipath
Date: Fri, 12 Sep 2008 10:38:14 -0400 (EDT)
Message-ID: <20080912.103814.74754581.k-ueda@ct.jp.nec.com>
Hi Jens, James and Alasdair,
This is a new version of request-based dm-multipath patches.
The patches are created on top of 2.6.27-rc6 + Alasdair's dm patches
for linux-next below:
dm-mpath-use-more-error-codes.patch
dm-mpath-remove-is_active-from-struct-dm_path.patch
Major changes from the previous version (*) are:
- Moved busy state information for the device/host from
q->queue_flags to q->backing_dev_info, since backing_dev_info
seems to be a more appropriate location. (PATCH 03)
Made the corresponding changes to the scsi driver. (PATCH 04)
- Added a queue flag to indicate whether the block device is
request stackable or not, so that request stacking drivers
can avoid stacking a request-based device on a bio-based device.
(PATCH 05)
- Fixed the problem that requests were not flushed on flush suspend.
(PATCH 10)
- Changed the queue initialization method for bio-based dm devices
from blk_alloc_queue() to blk_init_queue(). (PATCH 11)
- Changed the congestion check method in dm-multipath so that it
does not invoke __choose_pgpath(). (PATCH 13)
(*) http://lkml.org/lkml/2008/3/19/478
Some basic functional and performance testing was done with NEC iStorage
(active-active multipath), and no problems were found.
Please review and apply if there are no problems.
Summary of the patch-set:
01/13: block: add request data completion interface
02/13: block: add request submission interface
03/13: mm: export driver's busy state via backing_dev_info
04/13: scsi: export busy status
05/13: block: add a queue flag for request stacking support
06/13: dm: remove unused code (preparation for request-based dm)
07/13: dm: tidy local_init (preparation for request-based dm)
08/13: dm: prepare mempools on module init for request-based dm
09/13: dm: add target interface for request-based dm
10/13: dm: add core functions for request-based dm
11/13: dm: add a switch to enable request-based dm if target is ready
12/13: dm: reject requests violating limits for request-based dm
13/13: dm-mpath: convert to request-based from bio-based
A summary of the design of request-based dm-multipath is below.
BACKGROUND
==========
Currently, device-mapper (dm) is implemented as a stacking block device
at the bio level. This bio-based implementation has the following issue
with dm-multipath:
Because the hook for I/O mapping is above the block layer's
__make_request(), contiguous bios can be mapped to different underlying
devices, and these bios are then not merged into a single request.
This situation could arise with dynamic load balancing, though
it has not been implemented yet.
Therefore, I/O mapping after bio merging is needed for better
dynamic load balancing.
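
To illustrate the merging issue described above, here is a small user-space sketch (not kernel code and not part of the patch-set). The names `sim_bio`, `count_requests_bio_based()` and `count_requests_request_based()` are hypothetical, and the round-robin path choice is only an assumption standing in for a dynamic load balancer; the real block layer merge logic is considerably more involved.

```c
#include <stddef.h>

/* Hypothetical stand-in for a bio: a contiguous run of sectors. */
struct sim_bio {
    unsigned long sector;  /* first sector */
    unsigned long nr;      /* number of sectors */
};

/* In this model two bios can merge only if b starts where a ends. */
static int contiguous(const struct sim_bio *a, const struct sim_bio *b)
{
    return a->sector + a->nr == b->sector;
}

/*
 * bio-based model: a path is chosen per bio (round-robin, as an assumed
 * load-balancing policy) *before* merging, so a bio can only merge with
 * the previous bio that was sent down the same path.
 * Returns the number of requests reaching the underlying devices.
 */
size_t count_requests_bio_based(const struct sim_bio *bios, size_t n,
                                size_t nr_paths)
{
    size_t requests = 0;

    for (size_t p = 0; p < nr_paths; p++) {
        const struct sim_bio *prev = NULL;

        /* Bios p, p + nr_paths, p + 2*nr_paths, ... land on path p. */
        for (size_t i = p; i < n; i += nr_paths) {
            if (prev == NULL || !contiguous(prev, &bios[i]))
                requests++;
            prev = &bios[i];
        }
    }
    return requests;
}

/*
 * request-based model: bios are merged first, and a path is chosen
 * afterwards per merged request, so every contiguous run is one request.
 */
size_t count_requests_request_based(const struct sim_bio *bios, size_t n)
{
    size_t requests = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == 0 || !contiguous(&bios[i - 1], &bios[i]))
            requests++;
    }
    return requests;
}
```

In this model, four contiguous 8-sector bios spread over two paths reach the underlying devices as four separate requests in the bio-based case, but as a single request in the request-based case.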
The basic idea to resolve the issue is to move the multipathing layer
down below the I/O scheduler, and it was proposed by Mike Christie
as the block layer (request-based) multipath:
http://marc.info/?l=linux-scsi&m=115520444515914&w=2
Mike's patch added a new block layer device for multipath and didn't
have a dm interface. So I modified his patch so that it can be used
from dm. The result is request-based dm-multipath.
DESIGN OVERVIEW
===============
While dm and md currently stack block devices at the bio level,
request-based dm stacks at the request level and submits/completes
struct request instead of struct bio.
Overview of the request-based dm patch:
- Mapping is done in a unit of struct request, instead of struct bio
- Hook for I/O mapping is at q->request_fn() after merging and
sorting by I/O scheduler, instead of q->make_request_fn().
- Hook for I/O completion is at bio->bi_end_io() and rq->end_io(),
instead of only bio->bi_end_io()
              bio-based (current)    request-based (this patch)
------------------------------------------------------------------
submission    q->make_request_fn()   q->request_fn()
completion    bio->bi_end_io()       bio->bi_end_io(), rq->end_io()
- Whether the dm device is bio-based or request-based is determined
at table loading time.
- The user interface is kept the same (table/message/status/ioctl).
- Any bio-based device (like dm/md) can be stacked on a request-based
dm device.
A request-based dm device *cannot* be stacked on any bio-based device.
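
The stacking rule in the last item can be modelled as follows. This is only a user-space sketch under assumptions: `sim_queue`, its `request_stackable` field, and `sim_can_stack()` are hypothetical names chosen for illustration, loosely mirroring the queue flag from PATCH 05, and are not the identifiers used in the patches.

```c
#include <stdbool.h>

/* Hypothetical device types for the stacking (upper) device. */
enum sim_dev_type {
    SIM_BIO_BASED,
    SIM_REQUEST_BASED
};

/* Hypothetical view of the underlying (lower) device's queue. */
struct sim_queue {
    /* True if the lower device accepts whole requests from above. */
    bool request_stackable;
};

/*
 * Returns true if a device of type 'upper' may be stacked on 'lower'.
 * A bio-based device can sit on anything; a request-based device needs
 * a lower queue that is marked request stackable.
 */
bool sim_can_stack(enum sim_dev_type upper, const struct sim_queue *lower)
{
    if (upper == SIM_REQUEST_BASED)
        return lower->request_stackable;
    return true;
}
```

Under this model a bio-based lower device simply leaves the flag clear, so a request-based table load on top of it would be refused.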
Expected benefit:
- better load balancing
Additional explanations:
Why does request-based dm use bio->bi_end_io(), too?
Because:
- dm needs to keep not only the request but also bios of the request,
if dm target drivers want to retry or do something on the request.
For example, dm-multipath has to check errors and retry with other
paths if necessary before returning the I/O result to the upper layer.
- But rq->end_io() is called at the very late stage of completion
handling where all bios in the request have been completed and
the I/O results are already visible to the upper layer.
So request-based dm hooks bio->bi_end_io() and doesn't complete the bio
in error cases, and hands the error handling over to the rq->end_io() hook.
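
The two-stage completion described above can be sketched in user space roughly like this. It is a simplified model, not the code from the patches: `sim_held_bio`, `sim_request`, `sim_bio_end_io()`, `sim_rq_end_io()` and the `other_path_available` parameter are all hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-bio state as seen by the upper layer. */
struct sim_held_bio {
    bool visible;  /* result has been reported to the upper layer */
    int  result;   /* 0 on success, negative error code otherwise */
};

/* Hypothetical request carrying several bios. */
struct sim_request {
    struct sim_held_bio *bios;
    size_t nr_bios;
    int error;     /* first error recorded by the bio hook, 0 if none */
};

enum sim_action { SIM_DONE, SIM_REQUEUE, SIM_FAILED };

/*
 * Stage 1, modelled on the bio->bi_end_io() hook: on success the bio is
 * completed to the upper layer; on error the bio is held back and the
 * error is only recorded in the request.
 */
void sim_bio_end_io(struct sim_request *rq, size_t idx, int error)
{
    if (error) {
        if (!rq->error)
            rq->error = error;
        return;  /* withhold completion, upper layer sees nothing yet */
    }
    rq->bios[idx].visible = true;
    rq->bios[idx].result = 0;
}

/*
 * Stage 2, modelled on the rq->end_io() hook: decide what to do with
 * the bios that were held back in stage 1.
 */
enum sim_action sim_rq_end_io(struct sim_request *rq,
                              bool other_path_available)
{
    if (!rq->error)
        return SIM_DONE;

    if (other_path_available) {
        rq->error = 0;       /* held-back bios get retried elsewhere */
        return SIM_REQUEUE;
    }

    /* No path left: only now does the error become visible. */
    for (size_t i = 0; i < rq->nr_bios; i++) {
        if (!rq->bios[i].visible) {
            rq->bios[i].visible = true;
            rq->bios[i].result = rq->error;
        }
    }
    return SIM_FAILED;
}
```

The point of the model is that a failed bio stays invisible to the upper layer until the request-level hook has either requeued it or given up.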
Thanks,
Kiyoshi Ueda