From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Barry Song <21cnbao@gmail.com>
Cc: "Wanghui (John)" <john.wanghui@huawei.com>,
	Intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	LKML <linux-kernel@vger.kernel.org>,
	Tvrtko Ursulin <tvrtko.ursulin@intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [RFC 1/8] sched: Add nice value change notifier
Date: Thu, 7 Oct 2021 10:09:51 +0100
Message-ID: <6818e34e-d41c-67b7-85dd-76d2e47bc078@linux.intel.com>
In-Reply-To: <382a4bd5-bb74-5928-be67-afbdc7aa3663@linux.intel.com>


On 07/10/2021 09:50, Tvrtko Ursulin wrote:
> 
> On 06/10/2021 21:21, Barry Song wrote:
>> On Thu, Oct 7, 2021 at 2:44 AM Tvrtko Ursulin
>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> On 06/10/2021 08:58, Barry Song wrote:
>>>> On Wed, Oct 6, 2021 at 5:15 PM Wanghui (John) 
>>>> <john.wanghui@huawei.com> wrote:
>>>>>
>>>>> Hi Tvrtko
>>>>>
>>>>> On 2021/10/4 22:36, Tvrtko Ursulin wrote:
>>>>>>     void set_user_nice(struct task_struct *p, long nice)
>>>>>>     {
>>>>>>         bool queued, running;
>>>>>> -     int old_prio;
>>>>>> +     int old_prio, ret;
>>>>>>         struct rq_flags rf;
>>>>>>         struct rq *rq;
>>>>>>
>>>>>> @@ -6915,6 +6947,9 @@ void set_user_nice(struct task_struct *p, long nice)
>>>>>>
>>>>>>     out_unlock:
>>>>>>         task_rq_unlock(rq, p, &rf);
>>>>>> +
>>>>>> +     ret = atomic_notifier_call_chain(&user_nice_notifier_list, nice, p);
>>>>>> +     WARN_ON_ONCE(ret != NOTIFY_DONE);
>>>>>>     }
>>>>> How about adding a new "io_nice" to task_struct, and moving the call
>>>>> chain to sched_setattr/getattr? There are two benefits:
>>>>
>>>> We already have an ionice for the block I/O scheduler. This new
>>>> io_nice can hardly be generic to all I/O. It seems the patchset is
>>>> trying to link a process's nice with the GPU's scheduler. To some
>>>> extent, that makes more sense than having a common ionice, because
>>>> we have a lot of I/O devices in the system and we don't know which
>>>> I/O the ionice of task_struct should be applied to.
>>>>
>>>> Maybe we could have an ionice dedicated to the GPU, just like the
>>>> ionice for the CFQ bio/request scheduler.
>>>
>>> Thought crossed my mind but I couldn't see the practicality of a 3rd
>>> nice concept. I mean even to start with I struggle a bit with the
>>> usefulness of existing ionice vs nice. Like coming up with practical
>>> examples of usecases where it makes sense to decouple the two 
>>> priorities.
>>>
>>> From a different angle I did think inheriting CPU nice makes sense for
>>> GPU workloads. This is because today, and more so in the future,
>>> computations on the same data set do flow from one to the other.
>>>
>>> Like maybe a simple example of batch image processing where CPU decodes,
>>> GPU does a transform and then CPU encodes. Or a different mix, doesn't
>>> really matter, since the main point is that it is one computing pipeline
>>> from the user's point of view.
>>>
>>
>> I am on it, but I also see two problems here:
>> 1. nice is not global in Linux. For example, if you have two cgroups
>> and cgroup A has more quota than cgroup B, tasks in B won't win even
>> if they have a lower nice. cgroups run a proportional-weight,
>> time-based division of CPU.
>>
>> 2. Historically, we had dynamic nice, which was adjusted based on the
>> average sleep/running time. Right now we don't have dynamic nice, but
>> virtual time still makes tasks which sleep more preempt other tasks
>> with the same nice or even a lower nice:
>> virtual time += physical time/weight by nice
>> So a static nice number doesn't always make sense to decide preemption.
>>
>> So it seems your patch only works in some simple situations, for
>> example no cgroups and tasks with similar sleep/running time.
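
To make the "virtual time += physical time/weight by nice" point above
concrete, here is a minimal userspace sketch, assuming the simplified
form delta_vruntime = delta_exec * 1024 / weight. The weights follow the
kernel's sched_prio_to_weight[] table; the names nice_to_weight and
vruntime_delta are made up for illustration, and the kernel's
inverse-weight arithmetic and cgroup weighting are left out.

```c
#include <stdint.h>

/* Weight of a nice-0 task; virtual time advances 1:1 with real time. */
#define NICE_0_WEIGHT 1024

/* Per-nice weights, index = nice + 20 (values as in sched_prio_to_weight[]). */
static const int nice_to_weight[40] = {
	/* -20 */ 88761, 71755, 56483, 46273, 36291,
	/* -15 */ 29154, 23254, 18705, 14949, 11916,
	/* -10 */  9548,  7620,  6100,  4904,  3906,
	/*  -5 */  3121,  2501,  1991,  1586,  1277,
	/*   0 */  1024,   820,   655,   526,   423,
	/*   5 */   335,   272,   215,   172,   137,
	/*  10 */   110,    87,    70,    56,    45,
	/*  15 */    36,    29,    23,    18,    15,
};

/*
 * Simplified virtual runtime increment for a task that ran for
 * delta_exec_ns at the given nice level. Out-of-range nice is clamped.
 */
static uint64_t vruntime_delta(uint64_t delta_exec_ns, int nice)
{
	if (nice < -20)
		nice = -20;
	if (nice > 19)
		nice = 19;

	return delta_exec_ns * NICE_0_WEIGHT / (uint64_t)nice_to_weight[nice + 20];
}
```

A task which sleeps a lot accumulates little virtual time whatever its
nice, so it can still preempt a busier task with the same or a lower
nice once it wakes up. That is why the static number alone does not
decide who runs.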
> 
> Yes, I broadly agree with your assessment. Although there are plans for 
> adding cgroup support to i915 scheduling, I doubt that control as fine 
> grained, or semantics as exact, as on the CPU side will happen.
> 
> Mostly because the drive seems to be for more micro-controller managed 
> scheduling which adds further challenges in connecting the two sides 
> together.
> 
> But when you say it is a problem, I would characterize it more as a 
> weakness, in terms of being only a subset of possible control. It is 
> still richer (better?) than what currently exists and, as demonstrated 
> with benchmarks in my cover letter, it can deliver improvements in user 
> experience. If in the mid-term future we can extend it with cgroup 
> support then the concept should still apply and get closer to how you 
> described nice works in the CPU world.
> 
> The main question in my mind is whether the idea of adding the 
> sched_attr/priority notifier to the kernel can be justified. Because, as 
> mentioned before, everything apart from adjusting currently running GPU 
> jobs could be done purely in userspace. Stack changes would be quite 
> extensive and all, but that is not usually a good enough reason to put 
> something in the kernel. That's why it is an RFC, an invitation to discuss.
> 
> Even ionice inherits from nice (see task_nice_ioprio()), so I think an 
> argument can be made for drivers as well.

Now that I have written this, I have had a bit of a light-bulb moment. 
If I abandon the idea of adjusting the priority of already submitted 
work items, then I can do much of what I want purely from within the 
confines of i915.

I simply add code to inherit the nice value of the current task on every 
new work item submission. This should probably deliver most of the 
benefit I measured.
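
As a rough sketch of that idea, not actual i915 code: the priority is
sampled from the submitter's nice once, at submission time, and never
revisited. The struct work_item, the helper names and the [-1023, 1023]
priority range are all assumptions for illustration. In the driver the
nice value would presumably come from task_nice(current) rather than
from a parameter.

```c
/* Nice range as seen by userspace. */
#define NICE_MIN (-20)
#define NICE_MAX 19

/* Hypothetical GPU priority range, assumed for this sketch only. */
#define GPU_PRIO_MIN (-1023)
#define GPU_PRIO_MAX 1023

/* Hypothetical work item; a real request carries far more state. */
struct work_item {
	int prio;
};

/* Map nice to a GPU priority: lower nice gives higher priority, 0 maps to 0. */
static int nice_to_gpu_prio(int nice)
{
	if (nice < NICE_MIN)
		nice = NICE_MIN;
	if (nice > NICE_MAX)
		nice = NICE_MAX;

	if (nice < 0)
		return -nice * GPU_PRIO_MAX / -NICE_MIN;

	return nice * GPU_PRIO_MIN / NICE_MAX;
}

/*
 * Inherit the priority when the work item is submitted. Items already
 * queued keep whatever priority they got when they were submitted.
 */
static void submit_work(struct work_item *w, int submitter_nice)
{
	w->prio = nice_to_gpu_prio(submitter_nice);
}
```

The trade-off is the one described above: a later renice only affects
work submitted afterwards, so no notifier in the scheduler core is
needed.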

Regards,

Tvrtko
