* [PATCH v2] arm64: topology: Avoid checking numa mask for scheduler MC selection
@ 2018-06-06 16:38 Jeremy Linton
2018-06-06 17:18 ` Catalin Marinas
2018-06-07 14:25 ` Geert Uytterhoeven
0 siblings, 2 replies; 3+ messages in thread
From: Jeremy Linton @ 2018-06-06 16:38 UTC (permalink / raw)
To: Sudeep.Holla
Cc: Will.Deacon, Catalin.Marinas, Robin.Murphy, Morten.Rasmussen,
linux-arm-kernel, linux-kernel, geert, linux-acpi,
ard.biesheuvel, Jeremy Linton
The numa mask subset check can often lead to system hang or crash during
CPU hotplug and system suspend operation if NUMA is disabled. This is
mostly observed on HMP systems where the CPU compute capacities are
different and the CPUs end up in different scheduler domains. Since
cpumask_of_node is returned instead of core_sibling, the scheduler is
confused by incorrect cpumasks (e.g. one CPU in two different sched
domains at the same time) on CPU hotplug.
Let's disable the NUMA sibling checks for the time being, as
NUMA-in-socket machines have LLCs that will ensure that the scheduler
topology isn't "borken".
The NUMA check exists to ensure that if an LLC within a socket crosses
NUMA nodes/chiplets the scheduler domains remain consistent. This code will
likely have to be re-enabled in the near future once the NUMA mask story
is sorted. At the moment it's not necessary because the LLCs of
NUMA-in-socket machines are contained within the NUMA domains.
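To illustrate the difference, here is a minimal user-space sketch of the
mask selection before and after this change. It is not the kernel code:
plain 64-bit bitmasks stand in for cpumask_t, and mask_t, mask_subset(),
coregroup_old() and coregroup_new() are hypothetical names used only for
this illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for cpumask_t: bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/* Nonzero when every CPU in a is also in b (mirrors cpumask_subset()). */
static int mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/* Selection before the patch: start from the NUMA node mask. */
static mask_t coregroup_old(mask_t node, mask_t core, mask_t llc, int llc_id)
{
	mask_t m = node;

	if (mask_subset(core, m))
		m = core;	/* not NUMA in package, use the package siblings */
	if (llc_id != -1 && mask_subset(llc, m))
		m = llc;
	return m;
}

/* Selection after the patch: start from core_sibling, no NUMA check. */
static mask_t coregroup_new(mask_t core, mask_t llc, int llc_id)
{
	mask_t m = core;

	if (llc_id != -1 && mask_subset(llc, m))
		m = llc;
	return m;
}
```

As an assumed example, take core = 0x0f (CPUs 0-3) and a node mask of
0xf7 in which CPU 3 is missing, as might happen while that CPU is being
unplugged. With no LLC information (llc_id == -1), coregroup_old()
returns the node mask 0xf7 because core is no longer a subset of it,
while coregroup_new() returns 0x0f.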
Further, as a defensive mechanism during hotplug, let's ensure that the
LLC siblings are also masked, i.e. that each sibling's llc_siblings mask
is updated as well.
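The symmetric update can be sketched in user space as follows. This is
only an illustration under stated assumptions: struct topo, NR_CPUS_DEMO
and update_llc_siblings() are hypothetical names, and a 64-bit bitmask
stands in for cpumask_t.

```c
#include <stdint.h>

#define NR_CPUS_DEMO 8	/* hypothetical CPU count for the sketch */

/* Hypothetical, trimmed-down stand-in for the per-CPU topology entry. */
struct topo {
	int llc_id;
	uint64_t llc_siblings;	/* bit n set means CPU n shares the LLC */
};

/*
 * For every CPU sharing an LLC with cpuid, set the bit in both masks,
 * so the relationship is recorded on each side.
 */
static void update_llc_siblings(struct topo *t, unsigned int cpuid)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++) {
		if (t[cpuid].llc_id == t[cpu].llc_id) {
			t[cpuid].llc_siblings |= UINT64_C(1) << cpu;
			t[cpu].llc_siblings |= UINT64_C(1) << cpuid;
		}
	}
}
```

Without the second assignment, only the mask of the CPU being updated
would gain its siblings; the siblings' own masks would not include that
CPU until they were updated themselves.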
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
arch/arm64/kernel/topology.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 7415c166281f..f845a8617812 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -215,13 +215,8 @@ EXPORT_SYMBOL_GPL(cpu_topology);

 const struct cpumask *cpu_coregroup_mask(int cpu)
 {
-	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
+	const cpumask_t *core_mask = &cpu_topology[cpu].core_sibling;

-	/* Find the smaller of NUMA, core or LLC siblings */
-	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
-		/* not numa in package, lets use the package siblings */
-		core_mask = &cpu_topology[cpu].core_sibling;
-	}
 	if (cpu_topology[cpu].llc_id != -1) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_siblings, core_mask))
 			core_mask = &cpu_topology[cpu].llc_siblings;
@@ -239,8 +234,10 @@ static void update_siblings_masks(unsigned int cpuid)
 	for_each_possible_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];

-		if (cpuid_topo->llc_id == cpu_topo->llc_id)
+		if (cpuid_topo->llc_id == cpu_topo->llc_id) {
 			cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings);
+			cpumask_set_cpu(cpuid, &cpu_topo->llc_siblings);
+		}

 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
--
2.14.3
* Re: [PATCH v2] arm64: topology: Avoid checking numa mask for scheduler MC selection
2018-06-06 16:38 [PATCH v2] arm64: topology: Avoid checking numa mask for scheduler MC selection Jeremy Linton
@ 2018-06-06 17:18 ` Catalin Marinas
2018-06-07 14:25 ` Geert Uytterhoeven
1 sibling, 0 replies; 3+ messages in thread
From: Catalin Marinas @ 2018-06-06 17:18 UTC (permalink / raw)
To: Jeremy Linton
Cc: Sudeep.Holla, ard.biesheuvel, Will.Deacon, linux-kernel,
linux-acpi, geert, Robin.Murphy, Morten.Rasmussen,
linux-arm-kernel
On Wed, Jun 06, 2018 at 11:38:46AM -0500, Jeremy Linton wrote:
> [...]
Thanks for this. I've queued it for this merge window.
--
Catalin
* Re: [PATCH v2] arm64: topology: Avoid checking numa mask for scheduler MC selection
2018-06-06 16:38 [PATCH v2] arm64: topology: Avoid checking numa mask for scheduler MC selection Jeremy Linton
2018-06-06 17:18 ` Catalin Marinas
@ 2018-06-07 14:25 ` Geert Uytterhoeven
1 sibling, 0 replies; 3+ messages in thread
From: Geert Uytterhoeven @ 2018-06-07 14:25 UTC (permalink / raw)
To: Jeremy Linton
Cc: Sudeep Holla, Will Deacon, Catalin Marinas, Robin Murphy,
Morten.Rasmussen, Linux ARM, Linux Kernel Mailing List,
ACPI Devel Mailing List, Ard Biesheuvel, Linux-Renesas
Hi Jeremy,
On Wed, Jun 6, 2018 at 6:38 PM, Jeremy Linton <jeremy.linton@arm.com> wrote:
> The numa mask subset check can often lead to system hang or crash during
> CPU hotplug and system suspend operation if NUMA is disabled. This is
Also during boot, if CONFIG_ARM_PSCI_CHECKER=y.
> [...]
Thanks!
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds