LKML Archive on lore.kernel.org
From: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
To: Vivek Goyal <vgoyal@redhat.com>, Jens Axboe <axboe@kernel.dk>
Cc: linux kernel mailing list <linux-kernel@vger.kernel.org>,
	Corrado Zoccolo <czoccolo@gmail.com>,
	Chad Talbott <ctalbott@google.com>,
	Nauman Rafique <nauman@google.com>,
	Divyesh Shah <dpshah@google.com>,
	jmoyer@redhat.com, Shaohua Li <shaohua.li@intel.com>
Subject: Re: [PATCH 3/6 v3] cfq-iosched: Introduce vdisktime and io weight for CFQ queue
Date: Mon, 24 Jan 2011 12:45:18 +0800
Message-ID: <4D3D03DE.20604@cn.fujitsu.com>
In-Reply-To: <20110120110952.GA16184@redhat.com>

Vivek Goyal wrote:
> On Thu, Jan 20, 2011 at 11:58:18AM +0800, Gui Jianfeng wrote:
>> Vivek Goyal wrote:
>>> On Mon, Dec 27, 2010 at 04:51:00PM +0800, Gui Jianfeng wrote:
>>>> Introduce vdisktime and io weight for CFQ queue scheduling. Currently, io priority
>>>> maps to a range [100,1000]. It also gets rid of the cfq_slice_offset() logic and makes
>>>> use of the same scheduling algorithm as the CFQ group does. This helps with CFQ queue
>>>> and group scheduling on the same service tree.
>>>>
>>>> Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
>>> [..]
>>>> @@ -1246,47 +1278,71 @@ static void cfq_service_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq,
>>>>  
>>>>  	service_tree = service_tree_for(cfqq->cfqg, cfqq_prio(cfqq),
>>>>  						cfqq_type(cfqq));
>>>> +	/*
>>>> +	 * For the time being, put the newly added CFQ queue at the end of the
>>>> +	 * service tree.
>>>> +	 */
>>>> +	if (RB_EMPTY_NODE(&cfqe->rb_node)) {
>>>> +		/*
>>>> +		 * If this CFQ queue moves to another group, the original
>>>> +		 * vdisktime makes no sense any more, reset the vdisktime
>>>> +		 * here.
>>>> +		 */
>>>> +		parent = rb_last(&service_tree->rb);
>>>> +		if (parent) {
>>>> +			u64 boost;
>>>> +			s64 __vdisktime;
>>>> +
>>>> +			__cfqe = rb_entry_entity(parent);
>>>> +			cfqe->vdisktime = __cfqe->vdisktime + CFQ_IDLE_DELAY;
>>>> +
>>>> +			/* Give some vdisktime boost according to its weight */
>>>> +			boost = cfq_get_boost(cfqd, cfqe);
>>>> +			__vdisktime = cfqe->vdisktime - boost;
>>>> +			if (__vdisktime > service_tree->min_vdisktime)
>>>> +				cfqe->vdisktime = __vdisktime;
>>>> +			else
>>>> +				cfqe->vdisktime = service_tree->min_vdisktime;
>>>> +		} else
>>>> +			cfqe->vdisktime = service_tree->min_vdisktime;
>>> Hi Gui,
>>>
>>> Is the above logic actually working? I suspect that most of the time all the
>>> new queues will end up getting the same vdisktime, namely st->min_vdisktime,
>>> and there will be no service differentiation across the various priorities.
>>>
>>> The reason is that on SSD, idling is disabled, so very few (or no) queues
>>> will consume their slice and follow the requeue path. So every queue will be
>>> new. Now you are doing the following.
>>>
>>> 	cfqe->vdisktime = vdisktime_of_parent + IDLE_DELAY - boost
>>>
>>> Assume vdisktime_of_parent = st->min_vdisktime; then (IDLE_DELAY - boost)
>>> is always going to be a negative number, and hence cfqe->vdisktime will
>>> default to st->min_vdisktime (IDLE_DELAY is 200, while boost should be a
>>> huge value due to the SERVICE_SHIFT scaling; see the worked illustration
>>> after the quoted exchange).
>> Vivek,
>>
>> Actually, I tested on a rotational disk with idling disabled, and I saw
>> service differentiation between two tasks with different ioprio.
>> I don't have an SSD on hand, but I'll get one and do more tests.
> 
> Ok, I am really not sure how it worked for you, because boost will turn
> out to be a huge number in comparison to IDLE_DELAY or the vdisktime of
> the parent. Some blktrace data, or an explanation of how it is working,
> would help in understanding it better.
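
To make the quoted point concrete, here is a rough standalone illustration of
the scale mismatch. cfq_get_boost() is introduced by this patch series and is
not quoted above, so the sketch below only models it on cfq_scale_slice(),
using the usual cfq-iosched constants (CFQ_SERVICE_SHIFT = 12, CFQ_IDLE_DELAY
= HZ/5 = 200 assuming HZ=1000, BLKIO_WEIGHT_DEFAULT = 500); treat the exact
numbers as assumptions, not as the patch's arithmetic:

#include <stdio.h>
#include <stdint.h>

#define CFQ_SERVICE_SHIFT	12
#define CFQ_IDLE_DELAY		200	/* HZ/5, assuming HZ=1000 */
#define BLKIO_WEIGHT_DEFAULT	500

/* hypothetical stand-in for cfq_get_boost(), modeled on cfq_scale_slice() */
static uint64_t boost_sketch(uint64_t base, unsigned int weight)
{
	return ((base << CFQ_SERVICE_SHIFT) * BLKIO_WEIGHT_DEFAULT) / weight;
}

int main(void)
{
	uint64_t parent_vdisktime = 0;		/* parent sits at st->min_vdisktime */
	uint64_t boost = boost_sketch(1, 1000);	/* smallest base, largest weight */
	int64_t vdisktime = (int64_t)(parent_vdisktime + CFQ_IDLE_DELAY) -
			    (int64_t)boost;

	/*
	 * boost = (1 << 12) * 500 / 1000 = 2048, which dwarfs CFQ_IDLE_DELAY
	 * (200), so vdisktime goes negative and the new queue is clamped back
	 * to st->min_vdisktime regardless of its weight.
	 */
	printf("boost=%llu idle_delay=%d vdisktime=%lld\n",
	       (unsigned long long)boost, CFQ_IDLE_DELAY, (long long)vdisktime);
	return 0;
}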

Vivek,

I think something must have been wrong with the configuration in my previous
tests. I tested again with the patch applied, and this time I don't see any
service differentiation.
But when I run the same test on a vanilla kernel, I don't see service
differentiation either. I dug into it: it seems that by setting slice_idle and
group_idle to zero I am manually disabling the idle window. In that case the
IO tasks become SYNC NOIDLE workloads and preempt each other, and the
preempting cfqq is always put at the front of the service tree. So we don't
see any service differentiation.

Actually, we don't want preemption to happen in this case, right? So I think
we should refine the preemption logic.
Thoughts?
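
For illustration, the ping-pong described above (cfq_should_preempt() letting
the two SYNC NOIDLE queues preempt each other, and cfq_preempt_queue()
re-adding the preempting queue at the front of the service tree) makes the
ioprio-derived weights irrelevant. A minimal, purely hypothetical userspace
model of that effect (not the cfq code itself):

#include <stdio.h>

int main(void)
{
	/* weights roughly per the patch's ioprio -> [100,1000] mapping */
	const int weight[2] = { 1000, 100 };	/* prio 0 vs prio 7 */
	long served[2] = { 0, 0 };
	int active = 0;

	/*
	 * Two continuously backlogged sync readers with idling disabled:
	 * every new request from the idle queue preempts the active one and
	 * jumps to the front of the tree, so dispatch simply alternates.
	 */
	for (long dispatch = 0; dispatch < 1000000; dispatch++) {
		served[active]++;
		active = !active;
	}

	/* both end up with ~50% of the dispatches despite the 10:1 weights */
	printf("weight %d served %ld, weight %d served %ld\n",
	       weight[0], served[0], weight[1], served[1]);
	return 0;
}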

Below is my fio job file:

[global]
directory=/hdb3
ioengine=sync
runtime=30
time_based=1
direct=1
bs=128K
size=64M
numjobs=1
exec_prerun='echo 3 > /proc/sys/vm/drop_caches'

[task-prio-0]
description=task-prio-0
rw=read
prio=0
filename=file1

[task-prio-7]
description=task-prio-7
rw=read
prio=7
filename=file2
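
To reproduce (assuming the job file is saved as, say, prio-test.fio, that
/hdb3 is a mount point on the disk under test, and that CFQ is the active
scheduler; the device name below is a placeholder):

# echo cfq > /sys/block/hdb/queue/scheduler
# echo 0 > /sys/block/hdb/queue/iosched/slice_idle
# echo 0 > /sys/block/hdb/queue/iosched/group_idle
# fio prio-test.fio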

Thanks,
Gui


Thread overview: 21+ messages
     [not found] <4D180ECE.4000305@cn.fujitsu.com>
2010-12-27  8:50 ` [PATCH 1/6 v3] cfq-iosched: Introduce cfq_entity " Gui Jianfeng
2010-12-27  8:50 ` [PATCH 2/6 v3] cfq-iosched: Introduce cfq_entity for CFQ group Gui Jianfeng
2011-01-18 21:20   ` Vivek Goyal
2011-01-19  1:25     ` Gui Jianfeng
2011-01-23  2:15     ` Gui Jianfeng
2010-12-27  8:51 ` [PATCH 3/6 v3] cfq-iosched: Introduce vdisktime and io weight for CFQ queue Gui Jianfeng
2010-12-28  2:50   ` Shaohua Li
2010-12-28  3:59     ` Gui Jianfeng
2010-12-28  6:03       ` Shaohua Li
2010-12-28  6:59         ` Gui Jianfeng
2011-01-19 22:58   ` Vivek Goyal
2011-01-20  3:58     ` Gui Jianfeng
2011-01-20 11:09       ` Vivek Goyal
2011-01-24  4:45         ` Gui Jianfeng [this message]
2011-01-24 18:51           ` Vivek Goyal
2010-12-27  8:51 ` [PATCH 4/6 v3] cfq-iosched: Extract some common code of service tree handling for CFQ queue and CFQ group Gui Jianfeng
2010-12-27  8:51 ` [PATCH 5/6 v3] cfq-iosched: CFQ group hierarchical scheduling and use_hierarchy interface Gui Jianfeng
2011-01-24 22:52   ` Vivek Goyal
2010-12-27  8:51 ` [PATCH 6/6 v3] blkio-cgroup: Document for blkio.use_hierarchy interface Gui Jianfeng
2011-01-18 20:27   ` Vivek Goyal
2011-01-19  1:20     ` Gui Jianfeng
