LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Anthony Liguori <anthony@codemonkey.ws>
To: Ryo Tsuruta <ryov@valinux.co.jp>
Cc: linux-kernel@vger.kernel.org, dm-devel@redhat.com,
	containers@lists.linux-foundation.org,
	virtualization@lists.linux-foundation.org,
	xen-devel@lists.xensource.com
Subject: Re: [PATCH 0/2] dm-band: The I/O bandwidth controller: Overview
Date: Wed, 23 Jan 2008 13:22:36 -0600	[thread overview]
Message-ID: <479793FC.70701@codemonkey.ws> (raw)
In-Reply-To: <20080123.215350.193721890.ryov__34610.100350301$1201092994$gmane$org@valinux.co.jp>

Hi,

I believe this work is very important especially in the context of 
virtual machines.  I think it would be more useful though implemented in 
the context of the IO scheduler.  Since we already support a notion of 
IO priority, it seems reasonable to add a notion of an IO cap.

Regards,

Anthony Liguori

Ryo Tsuruta wrote:
> Hi everyone,
> 
> I'm happy to announce that I've implemented a Block I/O bandwidth controller.
> The controller is designed to be of use in a cgroup or virtual machine
> environment. The current approach is that the controller is implemented as
> a device-mapper driver.
> 
> What's dm-band all about?
> ========================
> Dm-band is an I/O bandwidth controller implemented as a device-mapper driver.
> Several jobs using the same physical device have to share the bandwidth of
> the device. Dm-band gives bandwidth to each job according to its weight, 
> which each job can set its own value to.
> 
> At this time, a job is a group of processes with the same pid or pgrp or uid.
> There is also a plan to make it support cgroup. A job can also be a virtual
> machine such as KVM or Xen.
> 
>   +------+ +------+ +------+   +------+ +------+ +------+ 
>   |cgroup| |cgroup| | the  |   | pid  | | pid  | | the  |  jobs
>   |  A   | |  B   | |others|   |  X   | |  Y   | |others| 
>   +--|---+ +--|---+ +--|---+   +--|---+ +--|---+ +--|---+   
>   +--V----+---V---+----V---+   +--V----+---V---+----V---+   
>   | group | group | default|   | group | group | default|  band groups
>   |       |       |  group |   |       |       |  group | 
>   +-------+-------+--------+   +-------+-------+--------+
>   |         band1          |   |         band2          |  band devices
>   +-----------|------------+   +-----------|------------+
>   +-----------V--------------+-------------V------------+
>   |                          |                          |
>   |          sdb1            |           sdb2           |  physical devices
>   +--------------------------+--------------------------+
> 
> 
> How dm-band works.
> ========================
> Every band device has one band group, which by default is called the default
> group.
> 
> Band devices can also have extra band groups in them. Each band group
> has a job to support and a weight. Proportional to the weight, dm-band gives
> tokens to the group.
> 
> A group passes on I/O requests that its job issues to the underlying
> layer so long as it has tokens left, while requests are blocked
> if there aren't any tokens left in the group. One token is consumed each
> time the group passes on a request. Dm-band will refill groups with tokens
> once all of groups that have requests on a given physical device use up their
> tokens.
> 
> With this approach, a job running on a band group with large weight is
> guaranteed to be able to issue a large number of I/O requests.
> 
> 
> Getting started
> =============
> The following is a brief description how to control the I/O bandwidth of
> disks. In this description, we'll take one disk with two partitions as an
> example target.
> 
> You can also check the manual at Document/device-mapper/band.txt of the
> linux kernel source tree for more information.
> 
> 
> Create and map band devices
> ---------------------------
> Create two band devices "band1" and "band2" and map them to "/dev/sda1"
> and "/dev/sda2" respectively.
> 
>  # echo "0 `blockdev --getsize /dev/sda1` band /dev/sda1 1" | dmsetup create band1
>  # echo "0 `blockdev --getsize /dev/sda2` band /dev/sda2 1" | dmsetup create band2
> 
> If the commands are successful then the device files "/dev/mapper/band1"
> and "/dev/mapper/band2" will have been created.
> 
> 
> Bandwidth control
> ----------------
> In this example weights of 40 and 10 will be assigned to "band1" and
> "band2" respectively. This is done using the following commands:
> 
>  # dmsetup message band1 0 weight 40
>  # dmsetup message band2 0 weight 10
> 
> After these commands, "band1" can use 80% --- 40/(40+10)*100 --- of the
> bandwidth of the physical disk "/dev/sda" while "band2" can use 20%.
> 
> 
> Additional bandwidth control
> ---------------------------
> In this example two extra band groups are created on "band1".
> The first group consists of all the processes with user-id 1000 and the
> second group consists of all the processes with user-id 2000. Their
> weights are 30 and 20 respectively.
> 
> Firstly the band group type of "band1" is set to "user".
> Then, the user-id 1000 and 2000 groups are attached to "band1".
> Finally, weights are assigned to the user-id 1000 and 2000 groups.
> 
>  # dmsetup message band1 0 type user
>  # dmsetup message band1 0 attach 1000
>  # dmsetup message band1 0 attach 2000
>  # dmsetup message band1 0 weight 1000:30
>  # dmsetup message band1 0 weight 2000:20
> 
> Now the processes in the user-id 1000 group can use 30% ---
> 30/(30+20+40+10)*100 --- of the bandwidth of the physical disk.
> 
>  Band Device    Band Group                     Weight
>   band1         user id 1000                     30
>   band1         user id 2000                     20
>   band1         default group(the other users)   40
>   band2         default group                    10
> 
> 
> Remove band devices
> -------------------
> Remove the band devices when no longer used.
> 
>   # dmsetup remove band1
>   # dmsetup remove band2
> 
> 
> TODO
> ========================
>   - Cgroup support. 
>   - Control read and write requests separately.
>   - Support WRITE_BARRIER.
>   - Optimization.
>   - More configuration tools. Or is the dmsetup command sufficient?
>   - Other policies to schedule BIOs. Or is the weight policy sufficient?
> 
> Thanks,
> Ryo Tsuruta


       reply	other threads:[~2008-01-23 19:22 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20080123.215350.193721890.ryov__34610.100350301$1201092994$gmane$org@valinux.co.jp>
2008-01-23 19:22 ` Anthony Liguori [this message]
2008-01-24  8:11   ` Hirokazu Takahashi
2008-01-23 12:53 Ryo Tsuruta
2008-01-23 14:32 ` Peter Zijlstra
2008-01-23 17:25   ` Ryo Tsuruta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=479793FC.70701@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=containers@lists.linux-foundation.org \
    --cc=dm-devel@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ryov@valinux.co.jp \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xen-devel@lists.xensource.com \
    --subject='Re: [PATCH 0/2] dm-band: The I/O bandwidth controller: Overview' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).