LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Nikanth Karthikesan <knikanth@suse.de>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: linux-kernel@vger.kernel.org, nickpiggin@yahoo.com.au
Subject: Re: [PATCH] [RFC] block/elevator: change nr_requests handling
Date: Wed, 30 Jan 2008 19:23:44 +0530	[thread overview]
Message-ID: <1201701224.4197.31.camel@nikanth-laptop.blr.novell.com> (raw)
In-Reply-To: <20080129181915.GD15220@kernel.dk>

On Tue, 2008-01-29 at 19:19 +0100, Jens Axboe wrote: 
> On Tue, Jan 29 2008, Nikanth Karthikesan wrote:
> > The /sys/block/whateverdisk/queue/nr_requests is used to limit the
> > number of requests allocated per queue. There can be atmost nr_requests
> > read requests and nr_requests write requests allocated.
> > 
> > But this can lead to starvation when there are many tasks doing heavy
> > IO. And the ioc_batching code lets those who had failed to allocate a
> > request once, to allocate upto 150% of nr_requests in next round. This
> > just makes the limit a little higher, when there are too many tasks
> > doing heavy IO. IO schedulers have no control over this to choose which
> > task should get the requests. During such situations, even ionice cannot
> > help as CFQ cannot serve a task which cannot allocate any requests.
> > Setting nr_requests to a higher value is like disabling this queue depth
> > check at all.
> > 
> > This patch defers the nr_requests check till dispatching the requests
> > instead of limiting while creating them. There can be more than
> > nr_requests allocated, but only upto nr_requests chosen by the io
> > scheduler will be dispatched at a time. This lets the io scheduler to
> > guarantee some sort of interactivity even when there are many batchers,
> > if it wants to.
> > 
> > This patch removes the ioc_batching code. Also the schedulers stop
> > dispatching when it chooses a Read request and Read queue is full and
> > vice versa. On top of this patch, the individual schedulers can be
> > changed to take advantage of knowing the no of read and write requests
> > that can be dispatched. Or atleast dispatch until both read and write
> > queues are full.
> 
> This is all a bit backwards, I think. The io schedulers CAN choose which
> processes get to allocate requests, through the ->may_queue() hook.
> 

Sorry If I am re-inventing the wheel.

At first this looked like under-utilization. Letting only batchers and
ELV_MQUEUE_MUST tasks to allocate upto 150% of nr_requests, which is in
someway equivalent of setting nr_requests at 150% and keeping 1/3 of it
reserved for batchers. But we are actually jamming the queue only when
required and allowing it to be just full otherwise! But this patch will
never over-fill the queue.

> I definitely think the batching logic could do with some improvements,
> so I'd encourage you to try and fix that instead. It'd be nice if it did
> honor the max number of requests limit. The current batching works well
> for allowing a process to queue some IO when it gets the allocation
> 'token', that should be retained.
> 

Another way to honor the nr_requests limit strictly would be to set it
at 66.6% of the real value, so that we never exceed the limit ;-)

But even with the token, if we reach 150% the task goes to
uninteruptible sleep. But with the patch, the task will not sleep in
get_request, unless memory allocation fails. Will this skew the
performance or atleast the performance numbers? 

> Did you do any performance numbers with your patch, btw?
> 

hmm.. The patch is not completely finished yet. Just posted early to
know if this was considered already. Also I do not have good
test-cases. 

A system was getting sporadic delays when copying huge files. And I
doubted the ioc_batching code, and started working on this patch. I have
made noop & cfq to fill both the read and write queues only now, the
patch I sent yesterday will stop dispatching if it cannot dispatch the
request it chooses, leading to under-utilization. Also I guess, the
schedulers are tweaked/optimized for the ioc_batching logic, which has
to be tweaked for this, to make a real comparison.

Do you still think that, this option is not worth trying? If so I will
try to find ways to improve the ioc_batching logic

Thanks
Nikanth Karthikesan


  reply	other threads:[~2008-01-30 13:48 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-29 17:29 Nikanth Karthikesan
2008-01-29 17:47 ` Nikanth Karthikesan
2008-01-29 18:19 ` Jens Axboe
2008-01-30 13:53   ` Nikanth Karthikesan [this message]
2008-01-30 13:56     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1201701224.4197.31.camel@nikanth-laptop.blr.novell.com \
    --to=knikanth@suse.de \
    --cc=jens.axboe@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    --subject='Re: [PATCH] [RFC] block/elevator: change nr_requests handling' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).