From: Anthony Liguori <anthony@codemonkey.ws>
To: Ryo Tsuruta <ryov@valinux.co.jp>
Cc: linux-kernel@vger.kernel.org, dm-devel@redhat.com,
containers@lists.linux-foundation.org,
virtualization@lists.linux-foundation.org,
xen-devel@lists.xensource.com
Subject: Re: [PATCH 0/2] dm-band: The I/O bandwidth controller: Overview
Date: Wed, 23 Jan 2008 13:22:36 -0600
Message-ID: <479793FC.70701@codemonkey.ws>
In-Reply-To: <20080123.215350.193721890.ryov__34610.100350301$1201092994$gmane$org@valinux.co.jp>
Hi,
I believe this work is very important, especially in the context of
virtual machines. I think it would be more useful, though, if it were
implemented in the IO scheduler. Since we already support a notion of
IO priority, it seems reasonable to add a notion of an IO cap.
Regards,
Anthony Liguori
Ryo Tsuruta wrote:
> Hi everyone,
>
> I'm happy to announce that I've implemented a Block I/O bandwidth controller.
> The controller is designed to be of use in a cgroup or virtual machine
> environment. The current approach is that the controller is implemented as
> a device-mapper driver.
>
> What's dm-band all about?
> ========================
> Dm-band is an I/O bandwidth controller implemented as a device-mapper driver.
> Several jobs using the same physical device have to share the bandwidth of
> the device. Dm-band gives bandwidth to each job according to its weight,
> which can be set individually for each job.
>
> At this time, a job is a group of processes with the same pid or pgrp or uid.
> There is also a plan to make it support cgroup. A job can also be a virtual
> machine such as KVM or Xen.
>
>   +------+ +------+ +------+   +------+ +------+ +------+
>   |cgroup| |cgroup| | the  |   | pid  | | pid  | | the  |  jobs
>   |  A   | |  B   | |others|   |  X   | |  Y   | |others|
>   +--|---+ +--|---+ +--|---+   +--|---+ +--|---+ +--|---+
>   +--V----+---V---+----V---+   +--V----+---V---+----V---+
>   | group | group | default|   | group | group | default|  band groups
>   |       |       |  group |   |       |       |  group |
>   +-------+-------+--------+   +-------+-------+--------+
>   |         band1          |   |         band2          |  band devices
>   +-----------|------------+   +-----------|------------+
>   +-----------V--------------+-------------V------------+
>   |                          |                          |
>   |          sdb1            |           sdb2           |  physical devices
>   +--------------------------+--------------------------+
>
>
> How dm-band works.
> ========================
> Every band device has one band group, which by default is called the default
> group.
>
> Band devices can also have extra band groups in them. Each band group
> has a job to support and a weight. Dm-band gives tokens to each group
> in proportion to its weight.
>
> A group passes on I/O requests that its job issues to the underlying
> layer so long as it has tokens left, while requests are blocked
> if there aren't any tokens left in the group. One token is consumed each
> time the group passes on a request. Dm-band will refill groups with tokens
> once all of the groups that have requests on a given physical device have
> used up their tokens.
>
> With this approach, a job running on a band group with a large weight is
> guaranteed to be able to issue a proportionally large number of I/O requests.
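
The token scheme quoted above can be modelled in a few lines of shell.
This is only a rough sketch, not dm-band code: the function name
simulate, the two-group limit and the round-robin dispatch order are
assumptions made for illustration. It also assumes both groups always
have requests pending and that at least one weight is positive.

```shell
#!/bin/sh
# Toy model of the token scheme; NOT taken from the dm-band patch.
# usage: simulate WEIGHT_A WEIGHT_B TOTAL_REQUESTS
# Prints how many requests each group passed on, as "SENT_A SENT_B".
simulate() {
    weight_a=$1 weight_b=$2 total=$3
    tokens_a=0 tokens_b=0     # tokens currently held by each group
    sent_a=0 sent_b=0         # requests passed to the underlying layer
    turn=a                    # assumed round-robin order between groups
    n=0
    while [ "$n" -lt "$total" ]; do
        # Refill only once every group has used up its tokens.
        if [ "$tokens_a" -eq 0 ] && [ "$tokens_b" -eq 0 ]; then
            tokens_a=$weight_a
            tokens_b=$weight_b
        fi
        # A group without tokens is blocked, so the other group goes.
        if [ "$turn" = a ] && [ "$tokens_a" -eq 0 ]; then turn=b; fi
        if [ "$turn" = b ] && [ "$tokens_b" -eq 0 ]; then turn=a; fi
        # Passing on one request consumes one token.
        if [ "$turn" = a ]; then
            tokens_a=$((tokens_a - 1)); sent_a=$((sent_a + 1)); turn=b
        else
            tokens_b=$((tokens_b - 1)); sent_b=$((sent_b + 1)); turn=a
        fi
        n=$((n + 1))
    done
    echo "$sent_a $sent_b"
}

simulate 40 10 100   # prints "80 20"
```

With weights 40 and 10, every refill cycle hands out 50 tokens, so over
100 requests the split is 80 to 20 in this model, whatever order the
groups are served in within a cycle.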
>
>
> Getting started
> =============
> The following is a brief description of how to control the I/O bandwidth of
> disks. In this description, we'll take one disk with two partitions as an
> example target.
>
> You can also check the manual at Documentation/device-mapper/band.txt of the
> Linux kernel source tree for more information.
>
>
> Create and map band devices
> ---------------------------
> Create two band devices "band1" and "band2" and map them to "/dev/sda1"
> and "/dev/sda2" respectively.
>
> # echo "0 `blockdev --getsize /dev/sda1` band /dev/sda1 1" | dmsetup create band1
> # echo "0 `blockdev --getsize /dev/sda2` band /dev/sda2 1" | dmsetup create band2
>
> If the commands are successful then the device files "/dev/mapper/band1"
> and "/dev/mapper/band2" will have been created.
>
>
> Bandwidth control
> ----------------
> In this example weights of 40 and 10 will be assigned to "band1" and
> "band2" respectively. This is done using the following commands:
>
> # dmsetup message band1 0 weight 40
> # dmsetup message band2 0 weight 10
>
> After these commands, "band1" can use 80% --- 40/(40+10)*100 --- of the
> bandwidth of the physical disk "/dev/sda" while "band2" can use 20%.
>
>
> Additional bandwidth control
> ---------------------------
> In this example two extra band groups are created on "band1".
> The first group consists of all the processes with user-id 1000 and the
> second group consists of all the processes with user-id 2000. Their
> weights are 30 and 20 respectively.
>
> Firstly the band group type of "band1" is set to "user".
> Then, the user-id 1000 and 2000 groups are attached to "band1".
> Finally, weights are assigned to the user-id 1000 and 2000 groups.
>
> # dmsetup message band1 0 type user
> # dmsetup message band1 0 attach 1000
> # dmsetup message band1 0 attach 2000
> # dmsetup message band1 0 weight 1000:30
> # dmsetup message band1 0 weight 2000:20
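
When more than a couple of users are involved, the same sequence of
messages could be generated rather than typed. The helper below is
hypothetical (print_user_groups is not part of dmsetup or dm-band) and
is a dry run: it only prints the commands in the form quoted above and
never executes them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints dmsetup commands, runs nothing.
# usage: print_user_groups BAND_DEVICE UID:WEIGHT...
print_user_groups() {
    band=$1
    shift
    # Set the band group type first, as in the quoted example.
    echo "dmsetup message $band 0 type user"
    # Attach every user-id group (the part before the colon).
    for pair in "$@"; do
        echo "dmsetup message $band 0 attach ${pair%%:*}"
    done
    # Then assign the weights, passing UID:WEIGHT through unchanged.
    for pair in "$@"; do
        echo "dmsetup message $band 0 weight $pair"
    done
}

print_user_groups band1 1000:30 2000:20
```

Piping the output to a root shell would run the commands, but reviewing
the printed lines first is the point of keeping it a dry run.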
>
> Now the processes in the user-id 1000 group can use 30% ---
> 30/(30+20+40+10)*100 --- of the bandwidth of the physical disk.
>
>   Band Device    Band Group                      Weight
>    band1         user id 1000                      30
>    band1         user id 2000                      20
>    band1         default group (the other users)   40
>    band2         default group                     10
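
The percentages in the quoted examples all follow the same rule: a
group's share is its weight divided by the sum of the weights of every
group on the same physical disk. As a quick check of the table, here is
a small sketch; share_pct is a hypothetical helper, not a dmsetup or
dm-band command, and it rounds down because it uses integer arithmetic.

```shell
#!/bin/sh
# Hypothetical helper, not part of dmsetup or dm-band.
# usage: share_pct WEIGHT ALL_WEIGHTS...
# Prints WEIGHT as an integer percentage (rounded down) of the sum of
# ALL_WEIGHTS, which should list every group on the physical disk.
share_pct() {
    weight=$1
    shift
    sum=0
    for w in "$@"; do
        sum=$((sum + w))
    done
    echo $((weight * 100 / sum))
}

share_pct 30 30 20 40 10   # band1, user id 1000  -> 30
share_pct 20 30 20 40 10   # band1, user id 2000  -> 20
share_pct 40 30 20 40 10   # band1, default group -> 40
share_pct 10 30 20 40 10   # band2, default group -> 10
```

The earlier two-device example works the same way: share_pct 40 40 10
gives 80 and share_pct 10 40 10 gives 20.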
>
>
> Remove band devices
> -------------------
> Remove the band devices when they are no longer in use.
>
> # dmsetup remove band1
> # dmsetup remove band2
>
>
> TODO
> ========================
> - Cgroup support.
> - Control read and write requests separately.
> - Support WRITE_BARRIER.
> - Optimization.
> - More configuration tools. Or is the dmsetup command sufficient?
> - Other policies to schedule BIOs. Or is the weight policy sufficient?
>
> Thanks,
> Ryo Tsuruta