From: "Satoshi UCHIDA" <s-uchida@ap.jp.nec.com>
To: "'Paul Menage'" <menage@google.com>,
<linux-kernel@vger.kernel.org>,
<containers@lists.linux-foundation.org>
Cc: <axboe@kernel.dk>, <tom-sugawara@ap.jp.nec.com>,
<m-takahashi@ex.jp.nec.com>
Subject: [RFC][v2][patch 0/12][CFQ-cgroup]Yet another I/O bandwidth controlling subsystem for CGroups based on CFQ
Date: Thu, 3 Apr 2008 16:09:12 +0900
Message-ID: <005d01c89559$9e538200$dafa8600$@jp.nec.com>
In-Reply-To: <6599ad830804021541s3c1e3197y77d87f63bf47e4b3@mail.gmail.com>
This version of the patchset renames the subsystem (from "cfq_cgroup"
to "cfq") and changes a check in the create function.
This patchset introduces "Yet Another" I/O bandwidth controlling
subsystem for cgroups based on CFQ (called 2-layer CFQ).
The idea of 2-layer CFQ is to build per-group fairness control on top
of the existing CFQ control.
We add a new data structure, called CFQ meta-data (cfqmd), on top of
cfqd in order to control I/O bandwidth for cgroups.
The CFQ meta-data controls the cfq_data instances with a service tree
(rb-tree) and the CFQ algorithm for synchronous I/O.
An active cfqd in turn controls its cfq queues with a service tree.
In other words, the CFQ meta-data controls the traditional CFQ data,
and the CFQ data itself runs conventionally.
cfqmd cfqmd (cfqmd = cfq meta-data)
| |
cfqc -- cfqd ----- cfqd (cfqd = cfq data,
| | cfqc = cfq cgroup data)
cfqc --[cfqd]----- cfqd
↑
conventional control.
This patchset is against 2.6.25-rc2-mm1.
Last week, we found a patchset from Vasily Tarasov (OpenVZ) that was
posted to LKML.
[RFC][PATCH 0/9] cgroups: block: cfq: I/O bandwidth controlling subsystem for CGroups based on CFQ
http://lwn.net/Articles/274652/
Our subsystem and Vasily's are similar in that both modify the CFQ
subsystem, but they differ in the layer at which the cgroup control is
implemented. Vasily's subsystem adds a new cgroup layer between cfqd
and cfqq, whereas our subsystem adds a new cgroup layer on top of
cfqd.
The differences between our implementation and OpenVZ's are:
* the top-layer algorithm is also based on a service tree, and
* the top-layer code lives in a separate file (block/cfq-cgroup.c).
We hope to discuss here not which implementation is better, but what
the best way is to implement I/O bandwidth control based on CFQ.
Please give us your comments, questions and suggestions.
Finally, we describe how to use our implementation.
* Preparation for using 2-layer CFQ
1. Apply this patchset to kernel 2.6.25-rc2-mm1.
2. Build the kernel with the CFQ-CGROUP option.
3. Reboot into the new kernel.
4. Mount the cfq cgroup special device on the device directory.
ex.
mkdir /dev/cgroup
mount -t cgroup -o cfq cfq /dev/cgroup
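Before creating groups it can be useful to confirm that the mount in
step 4 succeeded. The following is only a sketch: it assumes the
subsystem name "cfq" appears in the options column of /proc/mounts for
the mounted hierarchy, and both the function name and the MOUNTS_FILE
override are hypothetical helpers, not part of the patchset.

```shell
#!/bin/sh
# Sketch: print the mount point of a cgroup hierarchy that carries the
# "cfq" subsystem, and return non-zero if none is found.
# Assumption: the subsystem name is listed among the mount options.
# MOUNTS_FILE is a hypothetical override; the default is /proc/mounts.

cfq_cgroup_mountpoint() {
    mounts_file="${MOUNTS_FILE:-/proc/mounts}"
    # Columns: device mountpoint fstype options dump pass
    awk '$3 == "cgroup" {
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "cfq") { print $2; found = 1; exit }
         }
         END { exit !found }' "$mounts_file"
}
```

For example, "cfq_cgroup_mountpoint || echo 'cfq cgroup is not mounted'"
prints the warning only when no matching hierarchy is found.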
* Usage of grouping control.
- Create a new group
Make a new directory under /dev/cgroup.
For example, the following command creates a 'test1' group.
mkdir /dev/cgroup/test1
- Insert a task into a group
Write the process id (pid) to the "tasks" entry in the corresponding group.
For example, the following command puts the task with pid 1100 into the test1 group.
echo 1100 > /dev/cgroup/test1/tasks
Child tasks of this task are also inserted into the test1 group.
- Change the I/O priority of a group
Write the priority to the "cfq.ioprio" entry in the corresponding group.
For example, the following command sets the priority of the 'test1' group to 2.
echo 2 > /dev/cgroup/test1/cfq.ioprio
The I/O priority of a cgroup takes a value from 0 to 7, the same range
as the existing per-task CFQ priority.
- Change the I/O priority of a task
Use the existing "ionice" command.
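The grouping operations above can be wrapped in a few small shell
functions. This is a minimal sketch, not part of the patchset: the
function names and the CGROUP_ROOT override are hypothetical, and the
only checks performed are the 0 to 7 range stated above.

```shell
#!/bin/sh
# Sketch of helpers around the cgroup files described above.
# CGROUP_ROOT is a hypothetical override; the default matches the
# mount point used in the preparation steps.
CGROUP_ROOT="${CGROUP_ROOT:-/dev/cgroup}"

# Create a group, e.g.: cfq_group_create test1
cfq_group_create() {
    mkdir -p "$CGROUP_ROOT/$1"
}

# Set a group's I/O priority, e.g.: cfq_group_set_ioprio test1 2
# Values outside 0..7 are rejected before anything is written.
cfq_group_set_ioprio() {
    case "$2" in
        [0-7]) echo "$2" > "$CGROUP_ROOT/$1/cfq.ioprio" ;;
        *) echo "ioprio must be 0..7, got '$2'" >&2; return 1 ;;
    esac
}

# Move a task into a group, e.g.: cfq_group_add_task test1 1100
cfq_group_add_task() {
    echo "$2" > "$CGROUP_ROOT/$1/tasks"
}
```

With these, the three commands shown above become
"cfq_group_create test1", "cfq_group_add_task test1 1100" and
"cfq_group_set_ioprio test1 2".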
* Example
Two I/O loads (dd commands) are run under several conditions.
- When they are in the same group and have the same priority,
program
#!/bin/sh
echo $$ > /dev/cgroup/tasks
echo $$ > /dev/cgroup/test/tasks
ionice -c 2 -n 3 dd if=/internal/data1 of=/dev/null bs=1M count=1K &
ionice -c 2 -n 3 dd if=/internal/data2 of=/dev/null bs=1M count=1K &
echo $$ > /dev/cgroup/test2/tasks
echo $$ > /dev/cgroup/tasks
result
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 27.7676 s, 38.7 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 28.8482 s, 37.2 MB/s
These tasks were treated fairly, so they finished at about the same time.
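The MB/s figures printed by dd can be re-derived from the byte count
and the elapsed time, which makes it easier to compare runs. The
helper below is a hypothetical sketch; it assumes dd's decimal units
(1 MB = 10^6 bytes), consistent with the "1.1 GB" shown for 1073741824
bytes.

```shell
#!/bin/sh
# Sketch: recompute dd's reported throughput.
# usage: dd_throughput BYTES SECONDS
# Assumes decimal megabytes (10^6 bytes).
dd_throughput() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

dd_throughput 1073741824 27.7676   # prints 38.7
dd_throughput 1073741824 28.8482   # prints 37.2
```

Both values match the result above; the two tasks differ by about
1.5 MB/s.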
- When they are in the same group and have different priorities (0 and 7),
program
#!/bin/sh
echo $$ > /dev/cgroup/tasks
echo $$ > /dev/cgroup/test/tasks
ionice -c 2 -n 0 dd if=/internal/data1 of=/dev/null bs=1M count=1K &
ionice -c 2 -n 7 dd if=/internal/data2 of=/dev/null bs=1M count=1K &
echo $$ > /dev/cgroup/test2/tasks
echo $$ > /dev/cgroup/tasks
result
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 18.8373 s, 57.0 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 28.108 s, 38.2 MB/s
The first task (copying data1) had the higher priority, so it finished faster.
- When they are in different groups and have different priorities (0 and 7),
program
#!/bin/sh
echo $$ > /dev/cgroup/tasks
echo $$ > /dev/cgroup/test/tasks
ionice -c 2 -n 0 dd if=/internal/data1 of=/dev/null bs=1M count=1K &
echo $$ > /dev/cgroup/test2/tasks
ionice -c 2 -n 7 dd if=/internal/data2 of=/dev/null bs=1M count=1K &
echo $$ > /dev/cgroup/tasks
result
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 28.1661 s, 38.1 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 28.8486 s, 37.2 MB/s
The first task (copying data1) had the higher task priority, but both
finished at about the same time because their groups had the same priority.
- When they are in different groups with different group priorities (7 and 0)
and different task priorities (0 and 7),
program
#!/bin/sh
echo $$ > /dev/cgroup/tasks
echo 7 > /dev/cgroup/test/cfq.ioprio
echo $$ > /dev/cgroup/test/tasks
ionice -c 2 -n 0 dd if=/internal/data1 of=/dev/null bs=1M count=1K >& test1.log &
echo 0 > /dev/cgroup/test2/cfq.ioprio
echo $$ > /dev/cgroup/test2/tasks
ionice -c 2 -n 7 dd if=/internal/data2 of=/dev/null bs=1M count=1K >& test2.log &
echo $$ > /dev/cgroup/tasks
result
=== test1.log ===
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 27.3971 s, 39.2 MB/s
=== test2.log ===
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 17.3837 s, 61.8 MB/s
The first task (copying data1) had the higher task priority, but it
finished later because its group had the lower priority.
=====
Satoshi UCHIDA
NEC Corporation.