LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v5 0/2] A couple of uclamp fixes
@ 2021-08-05 10:21 Quentin Perret
  2021-08-05 10:21 ` [PATCH v5 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
  2021-08-05 10:21 ` [PATCH v5 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
  0 siblings, 2 replies; 5+ messages in thread
From: Quentin Perret @ 2021-08-05 10:21 UTC (permalink / raw)
  To: mingo, peterz, vincent.guittot, dietmar.eggemann, qais.yousef,
	rickyiu, wvw, patrick.bellasi
  Cc: linux-kernel, kernel-team

Hi all,

This is another round of uclamp fixes, previously posted here:

https://lore.kernel.org/lkml/20210719161656.3833943-1-qperret@google.com/

Changes since v4:
 - rebased on tip/sched/core
 - improved commit message in patch 01 (Dietmar)

Thanks!

Quentin Perret (2):
  sched: Fix UCLAMP_FLAG_IDLE setting
  sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS

 kernel/sched/core.c | 44 ++++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 12 deletions(-)

-- 
2.32.0.554.ge1b32706d8-goog


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v5 1/2] sched: Fix UCLAMP_FLAG_IDLE setting
  2021-08-05 10:21 [PATCH v5 0/2] A couple of uclamp fixes Quentin Perret
@ 2021-08-05 10:21 ` Quentin Perret
  2021-08-06 12:57   ` [tip: sched/core] " tip-bot2 for Quentin Perret
  2021-08-05 10:21 ` [PATCH v5 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
  1 sibling, 1 reply; 5+ messages in thread
From: Quentin Perret @ 2021-08-05 10:21 UTC (permalink / raw)
  To: mingo, peterz, vincent.guittot, dietmar.eggemann, qais.yousef,
	rickyiu, wvw, patrick.bellasi
  Cc: linux-kernel, kernel-team

The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
uclamp active task (that is, when buckets.tasks reaches 0 for all
buckets) to maintain the last uclamp.max and prevent blocked util from
suddenly becoming visible.

However, there is an asymmetry in how the flag is set and cleared which
can lead to having the flag set whilst there are active tasks on the rq.
Specifically, the flag is cleared in the uclamp_rq_inc() path, which is
called at enqueue time, but set in uclamp_rq_dec_id() which is called
both when dequeueing a task _and_ in the update_uclamp_active() path. As
a result, when both uclamp_rq_{dec,ind}_id() are called from
update_uclamp_active(), the flag ends up being set but not cleared,
hence leaving the runqueue in a broken state.

Fix this by clearing the flag in update_uclamp_active() as well.

Fixes: e496187da710 ("sched/uclamp: Enforce last task's UCLAMP_MAX")
Reported-by: Rick Yiu <rickyiu@google.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 kernel/sched/core.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 314f70db3e5c..df0480ad59b0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1621,6 +1621,23 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
 		uclamp_rq_dec_id(rq, p, clamp_id);
 }
 
+static inline void uclamp_rq_reinc_id(struct rq *rq, struct task_struct *p,
+				      enum uclamp_id clamp_id)
+{
+	if (!p->uclamp[clamp_id].active)
+		return;
+
+	uclamp_rq_dec_id(rq, p, clamp_id);
+	uclamp_rq_inc_id(rq, p, clamp_id);
+
+	/*
+	 * Make sure to clear the idle flag if we've transiently reached 0
+	 * active tasks on rq.
+	 */
+	if (clamp_id == UCLAMP_MAX && (rq->uclamp_flags & UCLAMP_FLAG_IDLE))
+		rq->uclamp_flags &= ~UCLAMP_FLAG_IDLE;
+}
+
 static inline void
 uclamp_update_active(struct task_struct *p)
 {
@@ -1644,12 +1661,8 @@ uclamp_update_active(struct task_struct *p)
 	 * affecting a valid clamp bucket, the next time it's enqueued,
 	 * it will already see the updated clamp bucket value.
 	 */
-	for_each_clamp_id(clamp_id) {
-		if (p->uclamp[clamp_id].active) {
-			uclamp_rq_dec_id(rq, p, clamp_id);
-			uclamp_rq_inc_id(rq, p, clamp_id);
-		}
-	}
+	for_each_clamp_id(clamp_id)
+		uclamp_rq_reinc_id(rq, p, clamp_id);
 
 	task_rq_unlock(rq, p, &rf);
 }
-- 
2.32.0.554.ge1b32706d8-goog


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v5 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-08-05 10:21 [PATCH v5 0/2] A couple of uclamp fixes Quentin Perret
  2021-08-05 10:21 ` [PATCH v5 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
@ 2021-08-05 10:21 ` Quentin Perret
  2021-08-06 12:57   ` [tip: sched/core] " tip-bot2 for Quentin Perret
  1 sibling, 1 reply; 5+ messages in thread
From: Quentin Perret @ 2021-08-05 10:21 UTC (permalink / raw)
  To: mingo, peterz, vincent.guittot, dietmar.eggemann, qais.yousef,
	rickyiu, wvw, patrick.bellasi
  Cc: linux-kernel, kernel-team

SCHED_FLAG_KEEP_PARAMS can be passed to sched_setattr to specify that
the call must not touch scheduling parameters (nice or priority). This
is particularly handy for uclamp when used in conjunction with
SCHED_FLAG_KEEP_POLICY as that allows to issue a syscall that only
impacts uclamp values.

However, sched_setattr always checks whether the priorities and nice
values passed in sched_attr are valid first, even if those never get
used down the line. This is useless at best since userspace can
trivially bypass this check to set the uclamp values by specifying low
priorities. However, it is cumbersome to do so as there is no single
expression of this that skips both RT and CFS checks at once. As such,
userspace needs to query the task policy first with e.g. sched_getattr
and then set sched_attr.sched_priority accordingly. This is racy and
slower than a single call.

As the priority and nice checks are useless when SCHED_FLAG_KEEP_PARAMS
is specified, simply inherit them in this case to match the policy
inheritance of SCHED_FLAG_KEEP_POLICY.

Reported-by: Wei Wang <wvw@google.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 kernel/sched/core.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index df0480ad59b0..433b4001e71e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7328,6 +7328,16 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
 	return -E2BIG;
 }
 
+static void get_params(struct task_struct *p, struct sched_attr *attr)
+{
+	if (task_has_dl_policy(p))
+		__getparam_dl(p, attr);
+	else if (task_has_rt_policy(p))
+		attr->sched_priority = p->rt_priority;
+	else
+		attr->sched_nice = task_nice(p);
+}
+
 /**
  * sys_sched_setscheduler - set/change the scheduler policy and RT priority
  * @pid: the pid in question.
@@ -7389,6 +7399,8 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
 	rcu_read_unlock();
 
 	if (likely(p)) {
+		if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
+			get_params(p, &attr);
 		retval = sched_setattr(p, &attr);
 		put_task_struct(p);
 	}
@@ -7537,12 +7549,7 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 	kattr.sched_policy = p->policy;
 	if (p->sched_reset_on_fork)
 		kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-	if (task_has_dl_policy(p))
-		__getparam_dl(p, &kattr);
-	else if (task_has_rt_policy(p))
-		kattr.sched_priority = p->rt_priority;
-	else
-		kattr.sched_nice = task_nice(p);
+	get_params(p, &kattr);
 	kattr.sched_flags &= SCHED_FLAG_ALL;
 
 #ifdef CONFIG_UCLAMP_TASK
-- 
2.32.0.554.ge1b32706d8-goog


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: sched/core] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-08-05 10:21 ` [PATCH v5 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
@ 2021-08-06 12:57   ` tip-bot2 for Quentin Perret
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Quentin Perret @ 2021-08-06 12:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Wei Wang, Quentin Perret, Peter Zijlstra (Intel),
	Dietmar Eggemann, Qais Yousef, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     f4dddf90d58d77b48492b775868af4041a217f4c
Gitweb:        https://git.kernel.org/tip/f4dddf90d58d77b48492b775868af4041a217f4c
Author:        Quentin Perret <qperret@google.com>
AuthorDate:    Thu, 05 Aug 2021 11:21:54 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 06 Aug 2021 14:25:25 +02:00

sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS

SCHED_FLAG_KEEP_PARAMS can be passed to sched_setattr to specify that
the call must not touch scheduling parameters (nice or priority). This
is particularly handy for uclamp when used in conjunction with
SCHED_FLAG_KEEP_POLICY as that allows to issue a syscall that only
impacts uclamp values.

However, sched_setattr always checks whether the priorities and nice
values passed in sched_attr are valid first, even if those never get
used down the line. This is useless at best since userspace can
trivially bypass this check to set the uclamp values by specifying low
priorities. However, it is cumbersome to do so as there is no single
expression of this that skips both RT and CFS checks at once. As such,
userspace needs to query the task policy first with e.g. sched_getattr
and then set sched_attr.sched_priority accordingly. This is racy and
slower than a single call.

As the priority and nice checks are useless when SCHED_FLAG_KEEP_PARAMS
is specified, simply inherit them in this case to match the policy
inheritance of SCHED_FLAG_KEEP_POLICY.

Reported-by: Wei Wang <wvw@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Link: https://lore.kernel.org/r/20210805102154.590709-3-qperret@google.com
---
 kernel/sched/core.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index df0480a..433b400 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7328,6 +7328,16 @@ err_size:
 	return -E2BIG;
 }
 
+static void get_params(struct task_struct *p, struct sched_attr *attr)
+{
+	if (task_has_dl_policy(p))
+		__getparam_dl(p, attr);
+	else if (task_has_rt_policy(p))
+		attr->sched_priority = p->rt_priority;
+	else
+		attr->sched_nice = task_nice(p);
+}
+
 /**
  * sys_sched_setscheduler - set/change the scheduler policy and RT priority
  * @pid: the pid in question.
@@ -7389,6 +7399,8 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
 	rcu_read_unlock();
 
 	if (likely(p)) {
+		if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
+			get_params(p, &attr);
 		retval = sched_setattr(p, &attr);
 		put_task_struct(p);
 	}
@@ -7537,12 +7549,7 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 	kattr.sched_policy = p->policy;
 	if (p->sched_reset_on_fork)
 		kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-	if (task_has_dl_policy(p))
-		__getparam_dl(p, &kattr);
-	else if (task_has_rt_policy(p))
-		kattr.sched_priority = p->rt_priority;
-	else
-		kattr.sched_nice = task_nice(p);
+	get_params(p, &kattr);
 	kattr.sched_flags &= SCHED_FLAG_ALL;
 
 #ifdef CONFIG_UCLAMP_TASK

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: sched/core] sched: Fix UCLAMP_FLAG_IDLE setting
  2021-08-05 10:21 ` [PATCH v5 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
@ 2021-08-06 12:57   ` tip-bot2 for Quentin Perret
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Quentin Perret @ 2021-08-06 12:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Rick Yiu, Quentin Perret, Peter Zijlstra (Intel),
	Qais Yousef, Dietmar Eggemann, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     ca4984a7dd863f3e1c0df775ae3e744bff24c303
Gitweb:        https://git.kernel.org/tip/ca4984a7dd863f3e1c0df775ae3e744bff24c303
Author:        Quentin Perret <qperret@google.com>
AuthorDate:    Thu, 05 Aug 2021 11:21:53 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 06 Aug 2021 14:25:25 +02:00

sched: Fix UCLAMP_FLAG_IDLE setting

The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
uclamp active task (that is, when buckets.tasks reaches 0 for all
buckets) to maintain the last uclamp.max and prevent blocked util from
suddenly becoming visible.

However, there is an asymmetry in how the flag is set and cleared which
can lead to having the flag set whilst there are active tasks on the rq.
Specifically, the flag is cleared in the uclamp_rq_inc() path, which is
called at enqueue time, but set in uclamp_rq_dec_id() which is called
both when dequeueing a task _and_ in the update_uclamp_active() path. As
a result, when both uclamp_rq_{dec,ind}_id() are called from
update_uclamp_active(), the flag ends up being set but not cleared,
hence leaving the runqueue in a broken state.

Fix this by clearing the flag in update_uclamp_active() as well.

Fixes: e496187da710 ("sched/uclamp: Enforce last task's UCLAMP_MAX")
Reported-by: Rick Yiu <rickyiu@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20210805102154.590709-2-qperret@google.com
---
 kernel/sched/core.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 314f70d..df0480a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1621,6 +1621,23 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
 		uclamp_rq_dec_id(rq, p, clamp_id);
 }
 
+static inline void uclamp_rq_reinc_id(struct rq *rq, struct task_struct *p,
+				      enum uclamp_id clamp_id)
+{
+	if (!p->uclamp[clamp_id].active)
+		return;
+
+	uclamp_rq_dec_id(rq, p, clamp_id);
+	uclamp_rq_inc_id(rq, p, clamp_id);
+
+	/*
+	 * Make sure to clear the idle flag if we've transiently reached 0
+	 * active tasks on rq.
+	 */
+	if (clamp_id == UCLAMP_MAX && (rq->uclamp_flags & UCLAMP_FLAG_IDLE))
+		rq->uclamp_flags &= ~UCLAMP_FLAG_IDLE;
+}
+
 static inline void
 uclamp_update_active(struct task_struct *p)
 {
@@ -1644,12 +1661,8 @@ uclamp_update_active(struct task_struct *p)
 	 * affecting a valid clamp bucket, the next time it's enqueued,
 	 * it will already see the updated clamp bucket value.
 	 */
-	for_each_clamp_id(clamp_id) {
-		if (p->uclamp[clamp_id].active) {
-			uclamp_rq_dec_id(rq, p, clamp_id);
-			uclamp_rq_inc_id(rq, p, clamp_id);
-		}
-	}
+	for_each_clamp_id(clamp_id)
+		uclamp_rq_reinc_id(rq, p, clamp_id);
 
 	task_rq_unlock(rq, p, &rf);
 }

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-08-06 12:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-05 10:21 [PATCH v5 0/2] A couple of uclamp fixes Quentin Perret
2021-08-05 10:21 ` [PATCH v5 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
2021-08-06 12:57   ` [tip: sched/core] " tip-bot2 for Quentin Perret
2021-08-05 10:21 ` [PATCH v5 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
2021-08-06 12:57   ` [tip: sched/core] " tip-bot2 for Quentin Perret

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).