From: Dhaval Giani <dhaval@linux.vnet.ibm.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Sudhir Kumar <skumar@linux.vnet.ibm.com>,
	Balbir Singh <balbir@in.ibm.com>,
	Aneesh Kumar KV <aneesh.kumar@linux.vnet.ibm.com>,
	lkml <linux-kernel@vger.kernel.org>,
	vgoyal@redhat.com, serue@us.ibm.com, menage@google.com
Subject: Re: [RFC, PATCH 1/2] sched: change the fairness model of the CFS group scheduler
Date: Fri, 29 Feb 2008 14:34:00 +0530
Message-ID: <20080229090400.GD15328@linux.vnet.ibm.com>
In-Reply-To: <1204227768.6243.42.camel@lappy>

On Thu, Feb 28, 2008 at 08:42:48PM +0100, Peter Zijlstra wrote:
> 
> On Mon, 2008-02-25 at 19:46 +0530, Dhaval Giani wrote:
> > This patch allows tasks and groups to exist in the same cfs_rq. With this
> > change the CFS group scheduling follows a 1/(M+N) model from a 1/(1+N)
> > fairness model where M tasks and N groups exist at the cfs_rq level.
> 
> Good. You know I agree with this direction.
> 
> > Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
> > Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> > ---
> >  kernel/sched.c      |   46 +++++++++++++++++++++
> >  kernel/sched_fair.c |  113 +++++++++++++++++++++++++++++++++++++++++-----------
> >  2 files changed, 137 insertions(+), 22 deletions(-)
> > 
> > Index: linux-2.6.25-rc2/kernel/sched.c
> > ===================================================================
> > --- linux-2.6.25-rc2.orig/kernel/sched.c
> > +++ linux-2.6.25-rc2/kernel/sched.c
> > @@ -224,10 +224,13 @@ struct task_group {
> >  };
> >  
> >  #ifdef CONFIG_FAIR_GROUP_SCHED
> > +
> > +#ifdef CONFIG_USER_SCHED
> >  /* Default task group's sched entity on each cpu */
> >  static DEFINE_PER_CPU(struct sched_entity, init_sched_entity);
> >  /* Default task group's cfs_rq on each cpu */
> >  static DEFINE_PER_CPU(struct cfs_rq, init_cfs_rq) ____cacheline_aligned_in_smp;
> > +#endif
> >  
> >  static struct sched_entity *init_sched_entity_p[NR_CPUS];
> >  static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
> > @@ -7163,6 +7166,10 @@ static void init_tg_cfs_entry(struct rq 
> >  		list_add(&cfs_rq->leaf_cfs_rq_list, &rq->leaf_cfs_rq_list);
> >  
> >  	tg->se[cpu] = se;
> > +	/* se could be NULL for init_task_group */
> > +	if (!se)
> > +		return;
> > +
> >  	se->cfs_rq = &rq->cfs;
> >  	se->my_q = cfs_rq;
> >  	se->load.weight = tg->shares;
> > @@ -7217,11 +7224,46 @@ void __init sched_init(void)
> >  #ifdef CONFIG_FAIR_GROUP_SCHED
> >  		init_task_group.shares = init_task_group_load;
> >  		INIT_LIST_HEAD(&rq->leaf_cfs_rq_list);
> > +#ifdef CONFIG_CGROUP_SCHED
> > +		/*
> > +		 * How much cpu bandwidth does init_task_group get?
> > +		 *
> > +		 * In case of task-groups formed thr' the cgroup filesystem, it
> > +		 * gets 100% of the cpu resources in the system. This overall
> > +		 * system cpu resource is divided among the tasks of
> > +		 * init_task_group and its child task-groups in a fair manner,
> > +		 * based on each entity's (task or task-group's) weight
> > +		 * (se->load.weight).
> > +		 *
> > +		 * In other words, if init_task_group has 10 tasks (of weight
> > +		 * 1024) and two child groups A0 and A1 (of weight 1024 each),
> > +		 * then A0's share of the cpu resource is:
> > +		 *
> > +		 * 	A0's bandwidth = 1024 / (10*1024 + 1024 + 1024) = 8.33%
> > +		 *
> > +		 * We achieve this by letting init_task_group's tasks sit
> > +		 * directly in rq->cfs (i.e init_task_group->se[] = NULL).
> > +		 */
> > +		init_tg_cfs_entry(rq, &init_task_group, &rq->cfs, NULL, i, 1);
> > +		init_tg_rt_entry(rq, &init_task_group, &rq->rt, NULL, i, 1);
> 
> That seems to agree with that.
> 
> > +#elif defined CONFIG_USER_SCHED
> > +		/*
> > +		 * In case of task-groups formed thr' the user id of tasks,
> > +		 * init_task_group represents tasks belonging to root user.
> > +		 * Hence it forms a sibling of all subsequent groups formed.
> > +		 * In this case, init_task_group gets only a fraction of overall
> > +		 * system cpu resource, based on the weight assigned to root
> > +		 * user's cpu share (INIT_TASK_GROUP_LOAD). This is accomplished
> > +		 * by letting tasks of init_task_group sit in a separate cfs_rq
> > +		 * (init_cfs_rq) and having one entity represent this group of
> > +		 * tasks in rq->cfs (i.e init_task_group->se[] != NULL).
> > +		 */
> >  		init_tg_cfs_entry(rq, &init_task_group,
> >  				&per_cpu(init_cfs_rq, i),
> >  				&per_cpu(init_sched_entity, i), i, 1);
> 
> But I fail to parse this lengthy comment. What does it do:
> 
>     init_group
>    /     |    \
> uid-0 uid-1000 uid-n
> 
> or does it blend uid-0 into the init_group?
> 

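It blends uid-0 (root) into init_group: init_task_group itself represents
root's tasks, and it sits in rq->cfs as a sibling of the groups created for
the other uids.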
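
To make the difference between the two layouts concrete, here is a rough
sketch of the share arithmetic described in the patch comments. It is not
code from the patch: the helper names are made up for illustration, every
task is assumed to have weight 1024, and the weight given to root's group
is a parameter because the value of INIT_TASK_GROUP_LOAD is not shown here.

```c
#include <assert.h>

/* Weight of a single task or child group in the patch comment's example. */
#define EXAMPLE_WEIGHT 1024UL

/*
 * Percentage of CPU time an entity of weight `weight` gets on a runqueue
 * whose entities' weights sum to `total_weight` (own weight included).
 */
static double share_percent(unsigned long weight, unsigned long total_weight)
{
	assert(total_weight > 0 && weight <= total_weight);
	return 100.0 * (double)weight / (double)total_weight;
}

/*
 * cgroup layout: init_task_group's tasks sit directly in rq->cfs next to
 * the child group entities, so each child group competes with every task.
 */
static double cgroup_child_share(unsigned long ntasks, unsigned long ngroups)
{
	unsigned long total = (ntasks + ngroups) * EXAMPLE_WEIGHT;

	return share_percent(EXAMPLE_WEIGHT, total);
}

/*
 * user-id layout: rq->cfs holds one entity for root's group plus one per
 * other uid. Root's tasks only divide root's own slice among themselves
 * (assumed equal weights, so the slice is split evenly).
 */
static double user_root_task_share(unsigned long root_weight,
				   unsigned long nroot_tasks,
				   unsigned long nother_uids)
{
	unsigned long total = root_weight + nother_uids * EXAMPLE_WEIGHT;

	assert(nroot_tasks > 0);
	return share_percent(root_weight, total) / (double)nroot_tasks;
}
```

With the numbers from the patch comment, cgroup_child_share(10, 2) comes to
about 8.33%. In the user-id layout, with a hypothetical root weight of 1024,
10 root tasks and 2 other uids, user_root_task_share(1024, 10, 2) is about
3.33% per root task, since root's group as a whole gets only a third of the
cpu.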

<snip>

> > @@ -1100,6 +1127,27 @@ static void check_preempt_wakeup(struct 
> >  	if (!sched_feat(WAKEUP_PREEMPT))
> >  		return;
> >  
> > +	/*
> > +	 * preemption test can be made between sibling entities who are in the
> > +	 * same cfs_rq i.e who have a common parent. Walk up the hierarchy of
> > +	 * both tasks until we find their ancestors who are siblings of common
> > +	 * parent.
> > +	 */
> > +
> > +	/* First walk up until both entities are at same depth */
> > +	se_depth = depth_se(se);
> > +	pse_depth = depth_se(pse);
> > +
> > +	while (se_depth > pse_depth) {
> > +		se_depth--;
> > +		se = parent_entity(se);
> > +	}
> > +
> > +	while (pse_depth > se_depth) {
> > +		pse_depth--;
> > +		pse = parent_entity(pse);
> > +	}
> 
> Sad, but needed.. for now..
> 

Better ideas, if any, are welcome! I cannot think of anything right now :(
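
For reference, the walk being discussed can be sketched stand-alone as
below. This is only an illustration: struct entity is a hypothetical
stand-in for struct sched_entity carrying just a parent link, and the final
lockstep loop is my reading of the comment in the hunk ("until we find
their ancestors who are siblings"), since that part of the patch is not
quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_entity: only the parent link. */
struct entity {
	struct entity *parent;	/* NULL for entities queued at the top level */
};

/* Number of ancestors above e (0 for a top-level entity). */
static int entity_depth(const struct entity *e)
{
	int depth = 0;

	for (e = e->parent; e; e = e->parent)
		depth++;
	return depth;
}

/*
 * Replace *a and *b with the ancestors (possibly themselves) that share a
 * parent, i.e. that would be queued on the same runqueue.
 */
static void find_sibling_ancestors(struct entity **a, struct entity **b)
{
	int da, db;

	assert(*a != NULL && *b != NULL);
	da = entity_depth(*a);
	db = entity_depth(*b);

	/* First walk up until both entities are at the same depth. */
	while (da > db) {
		da--;
		*a = (*a)->parent;
	}
	while (db > da) {
		db--;
		*b = (*b)->parent;
	}

	/* Then climb in lockstep until the two share a parent. */
	while ((*a)->parent != (*b)->parent) {
		*a = (*a)->parent;
		*b = (*b)->parent;
	}
}
```

The cost is proportional to the depth of the deeper entity, and the depth
itself has to be recomputed by walking the parent chain on every call, which
is the part that makes this unattractive in the wakeup path.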

-- 
regards,
Dhaval