LKML Archive on lore.kernel.org
From: Waiman Long <longman@redhat.com>
To: "Michal Koutný" <mkoutny@suse.com>, "Waiman Long" <llong@redhat.com>
Cc: Tejun Heo <tj@kernel.org>, Zefan Li <lizefan.x@bytedance.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Roman Gushchin <guro@fb.com>, Phil Auld <pauld@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Frederic Weisbecker <frederic@kernel.org>,
Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [PATCH v7 5/6] cgroup/cpuset: Update description of cpuset.cpus.partition in cgroup-v2.rst
Date: Wed, 13 Oct 2021 18:11:37 -0400
Message-ID: <306d7fca-ee8a-e5dc-973e-5255d73de71f@redhat.com>
In-Reply-To: <5eacfdcc-148b-b599-3111-4f2971e7ddc0@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 643 bytes --]
On 10/13/21 5:45 PM, Waiman Long wrote:
>
>
>>
>> In conclusion, it'd be good to have validity conditions separate from
>> transition conditions (since hotplug transition can't be rejected) and
>> perhaps treat administrative changes from an ancestor equally as a
>> hotplug.
>
> I am trying to make the result of changing "cpuset.cpus" as close to
> hotplug as possible but there are cases where the "cpuset.cpus" change
> is prohibited but hotplug can still happen to remove the cpu.
>
> Hope this will help to clarify the current design.
>
BTW, the attached file is the current draft of the cpuset.cpus.partition
documentation.
Cheers,
Longman
[-- Attachment #2: cpuset.cpus.partition.txt --]
[-- Type: text/plain, Size: 4889 bytes --]
cpuset.cpus.partition
A read-write single value file which exists on non-root
cpuset-enabled cgroups. This flag is owned by the parent cgroup
and is not delegatable.
It accepts only the following input values when written to.
==========  =====================================
"member"    Non-root member of a partition
"root"      Partition root
"isolated"  Partition root without load balancing
==========  =====================================
When set to be a partition root, the current cgroup is the
root of a new partition or scheduling domain that comprises
itself and all its descendants except those that are separate
partition roots themselves and their descendants. The root
cgroup is always a partition root.
When set to "isolated", the CPUs in that partition root will
be in an isolated state without any load balancing from the
scheduler. Tasks in such a partition must be explicitly bound
to each individual CPU.
"cpuset.cpus" must always be set up before a partition is
enabled. Unlike a "member", whose "cpuset.cpus.effective" can
contain CPUs not in "cpuset.cpus", this can never happen with a
valid partition root. In other words, "cpuset.cpus.effective"
is always a subset of "cpuset.cpus" for a valid partition root.
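The subset rule above can be checked from user space by comparing the
two files. The following is a minimal sketch, assuming both files use
the usual kernel CPU list format (for example "0-3,8"); the helper names
parse_cpu_list() and effective_within_cpus() are made up for this
illustration and are not part of any kernel interface.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file yields an empty set
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def effective_within_cpus(cpus_text, effective_text):
    """True if cpuset.cpus.effective is a subset of cpuset.cpus."""
    return parse_cpu_list(effective_text) <= parse_cpu_list(cpus_text)
```

For a valid partition root, effective_within_cpus() is expected to
return True. For a "member" it may return False.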
When a parent partition root cannot exclusively grant any of
the CPUs specified in "cpuset.cpus", "cpuset.cpus.effective"
becomes empty. If there are tasks in the partition root, the
partition root becomes invalid and "cpuset.cpus.effective"
is reset to that of the nearest non-empty ancestor.
Note that a task cannot be moved to a cgroup with empty
"cpuset.cpus.effective".
There are additional constraints on where a partition root can
be enabled ("root" or "isolated"). It can only be enabled in
a cgroup if all the following conditions are met.
1) The "cpuset.cpus" is non-empty and exclusive, i.e. its CPUs are
not shared by any of its siblings.
2) The parent cgroup is a valid partition root.
3) The "cpuset.cpus" is a subset of parent's "cpuset.cpus".
4) There are no child cgroups with cpuset enabled. This avoids
cpu migrations of multiple cgroups simultaneously which can
be problematic.
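The four enabling conditions above can be modelled as a simple check.
This is only a user-space sketch of the rules as written here, not the
kernel implementation. The Cpuset class, its fields and
enable_blockers() are hypothetical names, and every child in the model
is assumed to have cpuset enabled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)
class Cpuset:
    """Hypothetical in-memory model of one cpuset-enabled cgroup."""
    name: str
    cpus: Set[int] = field(default_factory=set)
    parent: Optional["Cpuset"] = None
    children: List["Cpuset"] = field(default_factory=list)
    valid_partition_root: bool = False

    def add_child(self, name, cpus):
        child = Cpuset(name, set(cpus), parent=self)
        self.children.append(child)
        return child


def enable_blockers(cg):
    """Return the numbers of the enabling conditions that cg violates."""
    if cg.parent is None:
        raise ValueError("cpuset.cpus.partition does not exist on the root cgroup")
    blockers = []
    siblings = [s for s in cg.parent.children if s is not cg]
    # 1) "cpuset.cpus" is non-empty and not shared with any sibling.
    if not cg.cpus or any(cg.cpus & s.cpus for s in siblings):
        blockers.append(1)
    # 2) The parent is a valid partition root.
    if not cg.parent.valid_partition_root:
        blockers.append(2)
    # 3) "cpuset.cpus" is a subset of the parent's "cpuset.cpus".
    if not cg.cpus <= cg.parent.cpus:
        blockers.append(3)
    # 4) No child cgroups with cpuset enabled.
    if cg.children:
        blockers.append(4)
    return blockers
```

An empty list means that, in this model, writing "root" or "isolated"
would be accepted.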
On read, the "cpuset.cpus.partition" file can show the following
values.
=========================  =====================================
"member"                   Non-root member of a partition
"root"                     Partition root
"isolated"                 Partition root without load balancing
"root invalid (<reason>)"  Invalid partition root
=========================  =====================================
In the case of an invalid partition root, a descriptive string on
why the partition is invalid is included within parentheses.
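A user space agent may want to split the value read from the file into
a state and an optional reason. The sketch below assumes exactly the
four forms listed in the table above. The function name
parse_partition_state() is hypothetical, and the exact wording of the
reason string is not specified here, so it is returned as opaque text.

```python
import re


def parse_partition_state(text):
    """Return (state, reason); reason is None unless the root is invalid."""
    value = text.strip()
    if value in ("member", "root", "isolated"):
        return value, None
    match = re.fullmatch(r"root invalid \((.*)\)", value)
    if match:
        return "root invalid", match.group(1)
    raise ValueError("unexpected cpuset.cpus.partition value: %r" % value)
```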
Once a cgroup becomes a partition root, changes to "cpuset.cpus" are
generally allowed as long as the cpu list is exclusive and is
a superset of children's cpu lists.
The constraints of a valid partition root are as follows:
1) "cpuset.cpus" is non-empty and exclusive.
2) The parent cgroup is a valid partition root.
3) "cpuset.cpus.effective" is a subset of "cpuset.cpus".
4) "cpuset.cpus.effective" is non-empty when there are tasks
in the partition.
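The validity constraints above differ from the enabling conditions in
items 3 and 4, which is what lets a hotplug event invalidate a
partition that was set up correctly. The following is an illustrative
sketch only; violated_constraints() and its parameters are hypothetical
names, and CPU lists are passed in as Python sets.

```python
def violated_constraints(cpus, sibling_cpus, parent_is_valid_root,
                         effective, has_tasks):
    """Return the numbers of the validity constraints that do not hold."""
    violated = []
    # 1) "cpuset.cpus" is non-empty and exclusive.
    if not cpus or any(cpus & other for other in sibling_cpus):
        violated.append(1)
    # 2) The parent cgroup is a valid partition root.
    if not parent_is_valid_root:
        violated.append(2)
    # 3) "cpuset.cpus.effective" is a subset of "cpuset.cpus".
    if not effective <= cpus:
        violated.append(3)
    # 4) "cpuset.cpus.effective" is non-empty when there are tasks.
    if has_tasks and not effective:
        violated.append(4)
    return violated
```

For example, a partition with "cpuset.cpus" of 2-3 whose CPUs are both
taken offline ends up with an empty effective set. With tasks present,
constraint 4 is violated. Without tasks, nothing is violated.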
Changes to "cpuset.cpus" or cpu hotplug may cause the state
of a valid partition root to become invalid when one or more
constraints of a valid partition root are violated. Therefore,
user space agents that manage partition roots should avoid
unnecessary changes to "cpuset.cpus" and always check the state
of "cpuset.cpus.partition" after making changes to make sure
that the partitions are functioning as expected.
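The "change, then check" advice above could look like the following in
a management agent. This is a sketch under the assumption that the
agent knows the state it expects to see afterwards. The function name
set_cpus_checked() and the directory argument are hypothetical.

```python
from pathlib import Path


def set_cpus_checked(cgroup_dir, cpus, expected_state):
    """Write cpuset.cpus, then verify cpuset.cpus.partition afterwards."""
    cgroup = Path(cgroup_dir)
    # On a real cgroup the write itself may fail with OSError if the
    # kernel rejects the new CPU list.
    (cgroup / "cpuset.cpus").write_text(cpus)
    state = (cgroup / "cpuset.cpus.partition").read_text().strip()
    if state != expected_state:
        raise RuntimeError(
            "partition state is %r, expected %r" % (state, expected_state))
    return state
```

Raising an error rather than returning a flag makes it harder for a
caller to ignore a partition that has silently become invalid.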
Changing a partition root to "member" is always allowed.
If there are child partition roots underneath it, however,
they will be forced back to "member" too and lose their
partitions. So care must be taken to check for this condition
before disabling a partition root.
Setting a cgroup to a valid partition root will take the CPUs
away from the effective CPUs of the parent partition.
A valid parent partition may distribute out all its CPUs to
its child partitions as long as it is not the root cgroup as
we need some house-keeping CPUs in the root cgroup.
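The CPU accounting described above amounts to set subtraction. The
sketch below is a simplified model with hypothetical names. It treats
an attempt to empty the root cgroup as an error, which is one way to
express the house-keeping rule and is not necessarily how the kernel
reports it.

```python
def remaining_effective(parent_effective, child_partition_cpus,
                        parent_is_root_cgroup):
    """CPUs left to the parent after valid child partitions take theirs."""
    taken = set().union(*child_partition_cpus)
    remaining = parent_effective - taken
    if parent_is_root_cgroup and not remaining:
        raise ValueError("the root cgroup must keep some house-keeping CPUs")
    return remaining
```

With a parent holding CPUs 0-7 and child partitions holding 2-3 and
4-5, the parent is left with CPUs 0, 1, 6 and 7.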
An invalid partition is not a real partition even though some
internal states may still be kept.
An invalid partition root can become a valid partition root
again once none of the constraints of a valid partition root
are violated.
Poll and inotify events are triggered whenever the state of
"cpuset.cpus.partition" changes. That includes changes caused by
writes to "cpuset.cpus.partition", cpu hotplug and other changes
that make the partition invalid. This will allow user space
agents to monitor unexpected changes to "cpuset.cpus.partition"
without the need to do continuous polling.
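A monitoring agent could wait for such an event as sketched below. The
sketch assumes the event is delivered as POLLPRI, as is commonly the
case for cgroup v2 files that support notification; this text does not
state which poll flag is used. The function name
wait_for_partition_change() is hypothetical.

```python
import os
import select


def wait_for_partition_change(path, timeout_ms=None):
    """Block until the file signals a change, then return its new value.

    Returns None if timeout_ms elapses without an event.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Read once first so that later events are relative to this state.
        os.read(fd, 4096)
        poller = select.poll()
        poller.register(fd, select.POLLPRI | select.POLLERR)
        if not poller.poll(timeout_ms):
            return None  # timed out, no state change was signalled
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 4096).decode().strip()
    finally:
        os.close(fd)
```

A typical caller would pass the path of the "cpuset.cpus.partition"
file of the cgroup it manages and call the function in a loop, handling
each returned value as the new state.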