LKML Archive on lore.kernel.org
* [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V 
@ 2019-05-29 21:13 Atish Patra
  2019-05-29 21:13 ` [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries Atish Patra
                   ` (7 more replies)
  0 siblings, 8 replies; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Albert Ou, Anup Patel, Catalin Marinas,
	David S. Miller, devicetree, Greg Kroah-Hartman, Ingo Molnar,
	Jeremy Linton, Linus Walleij, linux-riscv, Mark Rutland,
	Mauro Carvalho Chehab, Morten Rasmussen, Otto Sabart,
	Palmer Dabbelt, Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Sudeep Holla, Thomas Gleixner,
	Will Deacon, Russell King, linux-arm-kernel

The cpu-map DT entry used on ARM describes the CPU topology much better
than the other existing approaches. RISC-V can easily adopt this
binding to represent its own CPU topology. Thus, both the cpu-map DT
binding and the topology parsing code can be moved to a common location
so that RISC-V or any other architecture can leverage them.
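
As a rough illustration of what the shared parser does with a cpu-map,
here is a minimal userspace sketch, assuming a toy tree type in place of
the kernel's struct device_node. All toy_* names are hypothetical. It
mirrors the "clusterN"/"coreN" name walk of parse_cluster(), ignores
thread and socket nodes for brevity, and treats each leaf cluster as one
package.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CHILDREN 8
#define MAX_CPUS 16

/* Toy stand-in for a device tree node; not the kernel's struct device_node. */
struct toy_node {
	const char *name;
	int cpu;	/* logical CPU for a core node, -1 otherwise */
	const struct toy_node *child[MAX_CHILDREN];	/* NULL-terminated */
};

struct toy_topology {
	int package_id;
	int core_id;
};

/* Look up a direct child by exact name. */
static const struct toy_node *toy_child(const struct toy_node *n,
					const char *name)
{
	for (int i = 0; i < MAX_CHILDREN && n->child[i]; i++)
		if (strcmp(n->child[i]->name, name) == 0)
			return n->child[i];
	return NULL;
}

/*
 * Walk "clusterN" children first, then "coreN" children. N must be
 * sequential from 0, so each walk stops at the first missing name.
 * Returns 0 on success, -1 on an invalid layout.
 */
static int toy_parse_cluster(const struct toy_node *cluster, int depth,
			     struct toy_topology *topo, int *package_id)
{
	char name[16];
	int leaf = 1, core_id = 0;
	const struct toy_node *c;

	for (int i = 0; ; i++) {
		snprintf(name, sizeof(name), "cluster%d", i);
		c = toy_child(cluster, name);
		if (!c)
			break;
		leaf = 0;
		if (toy_parse_cluster(c, depth + 1, topo, package_id))
			return -1;
	}

	for (int i = 0; ; i++) {
		snprintf(name, sizeof(name), "core%d", i);
		c = toy_child(cluster, name);
		if (!c)
			break;
		/* Cores directly under cpu-map or beside child clusters are invalid. */
		if (depth == 0 || !leaf)
			return -1;
		assert(c->cpu >= 0 && c->cpu < MAX_CPUS);
		topo[c->cpu].package_id = *package_id;
		topo[c->cpu].core_id = core_id++;
	}

	/* Every leaf cluster closes one package. */
	if (leaf)
		(*package_id)++;
	return 0;
}
```

Calling toy_parse_cluster() on the cpu-map node with depth 0 and a
package counter starting at 0 fills one toy_topology entry per CPU. The
counter is passed in explicitly here, whereas the kernel code keeps it in
a static variable.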

The relevant discussion regarding unifying cpu topology can be found in
[1].

arch_topology seems to be the right place for the common code. I have
not introduced any significant functional changes in the moved code.
The only downside of this approach is that the capacity code will be
executed for RISC-V as well. However, it exits immediately when it
cannot find the appropriate DT node. If the overhead is considered too
high, we can always compile out the capacity-related functions under a
separate config option for the architectures that do not support them.
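
The early exit mentioned above can be pictured with a minimal userspace
sketch, assuming a toy node type that only records whether capacity
information is present. The toy_* names are hypothetical and this is not
the kernel's topology_parse_cpu_capacity(); it only shows why the cost on
a platform without capacity data is a single check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Toy stand-in for a cpu node: models only the capacity information. */
struct toy_cpu_node {
	bool has_capacity;	/* is capacity information present? */
	uint32_t capacity;	/* its value when present */
};

static uint32_t toy_raw_capacity[TOY_NR_CPUS];

/* Returns true if a capacity was recorded, false if the node has none. */
static bool toy_parse_cpu_capacity(const struct toy_cpu_node *node, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);

	/* Early exit: nothing to do when no capacity is described. */
	if (!node || !node->has_capacity)
		return false;

	toy_raw_capacity[cpu] = node->capacity;
	return true;
}
```

For a node without capacity information the function returns false and
leaves the table untouched.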

There was an opportunity to unify the topology data structure for ARM32,
which is done by patch 4/7. However, I refrained from making any other
changes as I am not well versed in the original intention of some
functions present in arch_topology.c. I hope this patch series can serve
as a baseline for such changes in the future.

The patches have been tested for RISC-V and compile tested for ARM64,
ARM32 & x86.

The socket change[2] is also now part of this series.

[1] https://lkml.org/lkml/2018/11/6/19
[2] https://lkml.org/lkml/2018/11/7/918

QEMU changes for RISC-V topology are available at

https://github.com/atishp04/qemu/tree/riscv_topology_dt

The HiFive Unleashed DT with a topology node is available at
https://github.com/atishp04/opensbi/tree/HiFive_unleashed_topology

It can be verified with OpenSBI using the following additional
compile-time option:

FW_PAYLOAD_FDT="unleashed_topology.dtb"

Changes from v5->v6
1. Added two more patches from Sudeep about maintainership of arch_topology.c
   and Kconfig update. 
2. Added Tested-by & Reviewed-by
3. Fixed a nit (reordering of variables)

Changes from v4->v5
1. Removed the arch_topology.h header inclusion from topology.c and
arch_topology.c. Added it in linux/topology.h.
2. core_id is set to -1 upon reset. Otherwise, the ARM topology store
function does not work.

Changes from v3->v4
1. Get rid of ARM32 specific information in topology structure.
2. Remove redundant functions from ARM32 and use common code instead. 

Changes from v2->v3
1. Cover letter update with experiment DT for topology changes.
2. Added the patch for [2].

Changes from v1->v2
1. ARM32 can now use the common code as well.

Atish Patra (4):
dt-binding: cpu-topology: Move cpu-map to a common binding.
cpu-topology: Move cpu topology code to common code.
arm: Use common cpu_topology structure and functions.
RISC-V: Parse cpu topology during boot.

Sudeep Holla (3):
Documentation: DT: arm: add support for sockets defining package
boundaries
base: arch_topology: update Kconfig help description
MAINTAINERS: Add an entry for generic architecture topology

.../topology.txt => cpu/cpu-topology.txt}     | 134 ++++++--
MAINTAINERS                                   |   7 +
arch/arm/include/asm/topology.h               |  20 --
arch/arm/kernel/topology.c                    |  60 +---
arch/arm64/include/asm/topology.h             |  23 --
arch/arm64/kernel/topology.c                  | 303 +-----------------
arch/riscv/Kconfig                            |   1 +
arch/riscv/kernel/smpboot.c                   |   3 +
drivers/base/Kconfig                          |   2 +-
drivers/base/arch_topology.c                  | 298 +++++++++++++++++
include/linux/arch_topology.h                 |  26 ++
include/linux/topology.h                      |   1 +
12 files changed, 452 insertions(+), 426 deletions(-)
rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%)

--
2.21.0



* [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
@ 2019-05-29 21:13 ` Atish Patra
  2019-05-29 23:39   ` Andrew F. Davis
  2019-05-29 21:13 ` [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding Atish Patra
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Rob Herring, Albert Ou, Anup Patel, Atish Patra,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

The current ARM DT topology description provides the operating system
with a topological view of the system that is based on leaf nodes
representing either cores or threads (in an SMT system) and a
hierarchical set of cluster nodes that creates a hierarchical topology
view of how those cores and threads are grouped.

However, this hierarchical representation of clusters does not allow
describing which topology level actually represents the physical package
or the socket boundary, which is a key piece of information for an
operating system to optimize resource allocation and scheduling.

Let's add a new "socket" node type in the cpu-map node to describe
it.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 .../devicetree/bindings/arm/topology.txt      | 52 ++++++++++++++-----
 1 file changed, 39 insertions(+), 13 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/arm/topology.txt
index b0d80c0fb265..3b8febb46dad 100644
--- a/Documentation/devicetree/bindings/arm/topology.txt
+++ b/Documentation/devicetree/bindings/arm/topology.txt
@@ -9,6 +9,7 @@ ARM topology binding description
 In an ARM system, the hierarchy of CPUs is defined through three entities that
 are used to describe the layout of physical CPUs in the system:
 
+- socket
 - cluster
 - core
 - thread
@@ -63,21 +64,23 @@ nodes are listed.
 
 	The cpu-map node's child nodes can be:
 
-	- one or more cluster nodes
+	- one or more cluster nodes or
+	- one or more socket nodes in a multi-socket system
 
 	Any other configuration is considered invalid.
 
-The cpu-map node can only contain three types of child nodes:
+The cpu-map node can only contain 4 types of child nodes:
 
+- socket node
 - cluster node
 - core node
 - thread node
 
 whose bindings are described in paragraph 3.
 
-The nodes describing the CPU topology (cluster/core/thread) can only
-be defined within the cpu-map node and every core/thread in the system
-must be defined within the topology.  Any other configuration is
+The nodes describing the CPU topology (socket/cluster/core/thread) can
+only be defined within the cpu-map node and every core/thread in the
+system must be defined within the topology.  Any other configuration is
 invalid and therefore must be ignored.
 
 ===========================================
@@ -85,26 +88,44 @@ invalid and therefore must be ignored.
 ===========================================
 
 cpu-map child nodes must follow a naming convention where the node name
-must be "clusterN", "coreN", "threadN" depending on the node type (ie
-cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes which
-are siblings within a single common parent node must be given a unique and
+must be "socketN", "clusterN", "coreN", "threadN" depending on the node type
+(ie socket/cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes
+which are siblings within a single common parent node must be given a unique and
 sequential N value, starting from 0).
 cpu-map child nodes which do not share a common parent node can have the same
 name (ie same number N as other cpu-map child nodes at different device tree
 levels) since name uniqueness will be guaranteed by the device tree hierarchy.
 
 ===========================================
-3 - cluster/core/thread node bindings
+3 - socket/cluster/core/thread node bindings
 ===========================================
 
-Bindings for cluster/cpu/thread nodes are defined as follows:
+Bindings for socket/cluster/cpu/thread nodes are defined as follows:
+
+- socket node
+
+	 Description: must be declared within a cpu-map node, one node
+		      per physical socket in the system. A system can
+		      contain a single or multiple physical sockets.
+		      The association of sockets and NUMA nodes is beyond
+		      the scope of this binding; please refer to [2] for
+		      NUMA bindings.
+
+	This node is optional for a single socket system.
+
+	The socket node name must be "socketN" as described in 2.1 above.
+	A socket node can not be a leaf node.
+
+	A socket node's child nodes must be one or more cluster nodes.
+
+	Any other configuration is considered invalid.
 
 - cluster node
 
 	 Description: must be declared within a cpu-map node, one node
 		      per cluster. A system can contain several layers of
-		      clustering and cluster nodes can be contained in parent
-		      cluster nodes.
+		      clustering within a single physical socket and cluster
+		      nodes can be contained in parent cluster nodes.
 
 	The cluster node name must be "clusterN" as described in 2.1 above.
 	A cluster node can not be a leaf node.
@@ -164,13 +185,15 @@ Bindings for cluster/cpu/thread nodes are defined as follows:
 4 - Example dts
 ===========================================
 
-Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters):
+Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters in a single
+physical socket):
 
 cpus {
 	#size-cells = <0>;
 	#address-cells = <2>;
 
 	cpu-map {
+		socket0 {
 			cluster0 {
 				cluster0 {
 					core0 {
@@ -253,6 +276,7 @@ cpus {
 				};
 			};
 		};
+	};
 
 	CPU0: cpu@0 {
 		device_type = "cpu";
@@ -473,3 +497,5 @@ cpus {
 ===============================================================================
 [1] ARM Linux kernel documentation
     Documentation/devicetree/bindings/arm/cpus.yaml
+[2] Devicetree NUMA binding description
+    Documentation/devicetree/bindings/numa.txt
-- 
2.21.0



* [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding.
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
  2019-05-29 21:13 ` [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries Atish Patra
@ 2019-05-29 21:13 ` Atish Patra
  2019-05-30 20:55   ` Jeremy Linton
  2019-05-29 21:13 ` [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code Atish Patra
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: Atish Patra, Sudeep Holla, Rob Herring, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, linux-arm-kernel

The cpu-map binding can be used to describe the CPU topology for both
RISC-V & ARM. It makes more sense to move the binding document
to a common place.

The relevant discussion can be found here.
https://lkml.org/lkml/2018/11/6/19

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 .../topology.txt => cpu/cpu-topology.txt}     | 82 +++++++++++++++----
 1 file changed, 66 insertions(+), 16 deletions(-)
 rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (86%)

diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
similarity index 86%
rename from Documentation/devicetree/bindings/arm/topology.txt
rename to Documentation/devicetree/bindings/cpu/cpu-topology.txt
index 3b8febb46dad..069addccab14 100644
--- a/Documentation/devicetree/bindings/arm/topology.txt
+++ b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
@@ -1,12 +1,12 @@
 ===========================================
-ARM topology binding description
+CPU topology binding description
 ===========================================
 
 ===========================================
 1 - Introduction
 ===========================================
 
-In an ARM system, the hierarchy of CPUs is defined through three entities that
+In an SMP system, the hierarchy of CPUs is defined through four entities that
 are used to describe the layout of physical CPUs in the system:
 
 - socket
@@ -14,9 +14,6 @@ are used to describe the layout of physical CPUs in the system:
 - core
 - thread
 
-The cpu nodes (bindings defined in [1]) represent the devices that
-correspond to physical CPUs and are to be mapped to the hierarchy levels.
-
 The bottom hierarchy level sits at core or thread level depending on whether
 symmetric multi-threading (SMT) is supported or not.
 
@@ -25,33 +22,31 @@ threads existing in the system and map to the hierarchy level "thread" above.
 In systems where SMT is not supported "cpu" nodes represent all cores present
 in the system and map to the hierarchy level "core" above.
 
-ARM topology bindings allow one to associate cpu nodes with hierarchical groups
+CPU topology bindings allow one to associate cpu nodes with hierarchical groups
 corresponding to the system hierarchy; syntactically they are defined as device
 tree nodes.
 
-The remainder of this document provides the topology bindings for ARM, based
-on the Devicetree Specification, available from:
+Currently, only ARM/RISC-V intend to use this cpu topology binding but it may be
+used for any other architecture as well.
 
-https://www.devicetree.org/specifications/
+The cpu nodes, as per bindings defined in [4], represent the devices that
+correspond to physical CPUs and are to be mapped to the hierarchy levels.
 
-If not stated otherwise, whenever a reference to a cpu node phandle is made its
-value must point to a cpu node compliant with the cpu node bindings as
-documented in [1].
 A topology description containing phandles to cpu nodes that are not compliant
-with bindings standardized in [1] is therefore considered invalid.
+with bindings standardized in [4] is therefore considered invalid.
 
 ===========================================
 2 - cpu-map node
 ===========================================
 
-The ARM CPU topology is defined within the cpu-map node, which is a direct
+The ARM/RISC-V CPU topology is defined within the cpu-map node, which is a direct
 child of the cpus node and provides a container where the actual topology
 nodes are listed.
 
 - cpu-map node
 
-	Usage: Optional - On ARM SMP systems provide CPUs topology to the OS.
-			  ARM uniprocessor systems do not require a topology
+	Usage: Optional - On SMP systems provide CPUs topology to the OS.
+			  Uniprocessor systems do not require a topology
 			  description and therefore should not define a
 			  cpu-map node.
 
@@ -494,8 +489,63 @@ cpus {
 	};
 };
 
+Example 3: HiFive Unleashed (RISC-V 64 bit, 4 core system)
+
+/ {
+	#address-cells = <2>;
+	#size-cells = <2>;
+	compatible = "sifive,fu540g", "sifive,fu500";
+	model = "sifive,hifive-unleashed-a00";
+
+	...
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		cpu-map {
+			cluster0 {
+				core0 {
+					cpu = <&CPU1>;
+				};
+				core1 {
+					cpu = <&CPU2>;
+				};
+				core2 {
+					cpu = <&CPU3>;
+				};
+				core3 {
+					cpu = <&CPU4>;
+				};
+			};
+		};
+
+		CPU1: cpu@1 {
+			device_type = "cpu";
+			compatible = "sifive,rocket0", "riscv";
+			reg = <0x1>;
+		};
+
+		CPU2: cpu@2 {
+			device_type = "cpu";
+			compatible = "sifive,rocket0", "riscv";
+			reg = <0x2>;
+		};
+		CPU3: cpu@3 {
+			device_type = "cpu";
+			compatible = "sifive,rocket0", "riscv";
+			reg = <0x3>;
+		};
+		CPU4: cpu@4 {
+			device_type = "cpu";
+			compatible = "sifive,rocket0", "riscv";
+			reg = <0x4>;
+		};
+	};
+};
 ===============================================================================
 [1] ARM Linux kernel documentation
     Documentation/devicetree/bindings/arm/cpus.yaml
 [2] Devicetree NUMA binding description
     Documentation/devicetree/bindings/numa.txt
+[3] RISC-V Linux kernel documentation
+    Documentation/devicetree/bindings/riscv/cpus.txt
+[4] https://www.devicetree.org/specifications/
-- 
2.21.0



* [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code.
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
  2019-05-29 21:13 ` [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries Atish Patra
  2019-05-29 21:13 ` [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding Atish Patra
@ 2019-05-29 21:13 ` Atish Patra
  2019-06-06 14:26   ` Atish Patra
  2019-06-11 15:55   ` Will Deacon
  2019-05-29 21:13 ` [PATCH v6 4/7] arm: Use common cpu_topology structure and functions Atish Patra
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel, Catalin Marinas, Will Deacon
  Cc: Atish Patra, Jeffrey Hugo, Sudeep Holla, Albert Ou, Anup Patel,
	David S. Miller, devicetree, Greg Kroah-Hartman, Ingo Molnar,
	Jeremy Linton, Linus Walleij, linux-riscv, Mark Rutland,
	Mauro Carvalho Chehab, Morten Rasmussen, Otto Sabart,
	Palmer Dabbelt, Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Russell King,
	linux-arm-kernel

Both RISC-V & ARM64 use the cpu-map device tree node to describe
their CPU topology. It's better to move the relevant code to
a common place instead of duplicating it.
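
To give a feel for the kind of logic being moved, here is a minimal
userspace model of the sibling-mask update, assuming plain 32-bit bitmaps
in place of cpumask_t and leaving out the LLC mask. The toy_* names are
hypothetical; the control flow follows update_siblings_masks().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Userspace model of the per-CPU topology record; masks are bitmaps. */
struct toy_cpu_topology {
	int core_id;
	int package_id;
	uint32_t thread_sibling;
	uint32_t core_sibling;
};

/*
 * Record CPU `cpuid` as a sibling of every CPU in `online` that shares
 * its package (core_sibling) and, within that, its core (thread_sibling).
 */
static void toy_update_siblings(struct toy_cpu_topology *topo,
				uint32_t online, unsigned int cpuid)
{
	assert(cpuid < TOY_NR_CPUS);

	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!(online & (UINT32_C(1) << cpu)))
			continue;
		if (topo[cpuid].package_id != topo[cpu].package_id)
			continue;

		/* Same package: mark each CPU in the other's core mask. */
		topo[cpu].core_sibling |= UINT32_C(1) << cpuid;
		topo[cpuid].core_sibling |= UINT32_C(1) << cpu;

		if (topo[cpuid].core_id != topo[cpu].core_id)
			continue;

		/* Same core as well: the two are threads of one core. */
		topo[cpu].thread_sibling |= UINT32_C(1) << cpuid;
		topo[cpuid].thread_sibling |= UINT32_C(1) << cpu;
	}
}
```

Calling toy_update_siblings() once per CPU as it comes online, with that
CPU already set in `online`, builds symmetric masks for all CPUs.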

To: Will Deacon <will.deacon@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
[Tested on QDF2400]
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
[Tested on Juno and other embedded platforms.]
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

---
Hi Will/Catalin,
Can we get an ack for this patch? We are hoping to get the entire series
merged in one go.
---
 arch/arm64/include/asm/topology.h |  23 ---
 arch/arm64/kernel/topology.c      | 303 +-----------------------------
 drivers/base/arch_topology.c      | 296 +++++++++++++++++++++++++++++
 include/linux/arch_topology.h     |  28 +++
 include/linux/topology.h          |   1 +
 5 files changed, 329 insertions(+), 322 deletions(-)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 0524f2438649..a4d945db95a2 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -4,29 +4,6 @@
 
 #include <linux/cpumask.h>
 
-struct cpu_topology {
-	int thread_id;
-	int core_id;
-	int package_id;
-	int llc_id;
-	cpumask_t thread_sibling;
-	cpumask_t core_sibling;
-	cpumask_t llc_sibling;
-};
-
-extern struct cpu_topology cpu_topology[NR_CPUS];
-
-#define topology_physical_package_id(cpu)	(cpu_topology[cpu].package_id)
-#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
-#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
-#define topology_sibling_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
-#define topology_llc_cpumask(cpu)	(&cpu_topology[cpu].llc_sibling)
-
-void init_cpu_topology(void);
-void store_cpu_topology(unsigned int cpuid);
-void remove_cpu_topology(unsigned int cpuid);
-const struct cpumask *cpu_coregroup_mask(int cpu);
-
 #ifdef CONFIG_NUMA
 
 struct pci_bus;
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 0825c4a856e3..6b95c91e7d67 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -14,250 +14,13 @@
 #include <linux/acpi.h>
 #include <linux/arch_topology.h>
 #include <linux/cacheinfo.h>
-#include <linux/cpu.h>
-#include <linux/cpumask.h>
 #include <linux/init.h>
 #include <linux/percpu.h>
-#include <linux/node.h>
-#include <linux/nodemask.h>
-#include <linux/of.h>
-#include <linux/sched.h>
-#include <linux/sched/topology.h>
-#include <linux/slab.h>
-#include <linux/smp.h>
-#include <linux/string.h>
 
 #include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/topology.h>
 
-static int __init get_cpu_for_node(struct device_node *node)
-{
-	struct device_node *cpu_node;
-	int cpu;
-
-	cpu_node = of_parse_phandle(node, "cpu", 0);
-	if (!cpu_node)
-		return -1;
-
-	cpu = of_cpu_node_to_id(cpu_node);
-	if (cpu >= 0)
-		topology_parse_cpu_capacity(cpu_node, cpu);
-	else
-		pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
-
-	of_node_put(cpu_node);
-	return cpu;
-}
-
-static int __init parse_core(struct device_node *core, int package_id,
-			     int core_id)
-{
-	char name[10];
-	bool leaf = true;
-	int i = 0;
-	int cpu;
-	struct device_node *t;
-
-	do {
-		snprintf(name, sizeof(name), "thread%d", i);
-		t = of_get_child_by_name(core, name);
-		if (t) {
-			leaf = false;
-			cpu = get_cpu_for_node(t);
-			if (cpu >= 0) {
-				cpu_topology[cpu].package_id = package_id;
-				cpu_topology[cpu].core_id = core_id;
-				cpu_topology[cpu].thread_id = i;
-			} else {
-				pr_err("%pOF: Can't get CPU for thread\n",
-				       t);
-				of_node_put(t);
-				return -EINVAL;
-			}
-			of_node_put(t);
-		}
-		i++;
-	} while (t);
-
-	cpu = get_cpu_for_node(core);
-	if (cpu >= 0) {
-		if (!leaf) {
-			pr_err("%pOF: Core has both threads and CPU\n",
-			       core);
-			return -EINVAL;
-		}
-
-		cpu_topology[cpu].package_id = package_id;
-		cpu_topology[cpu].core_id = core_id;
-	} else if (leaf) {
-		pr_err("%pOF: Can't get CPU for leaf core\n", core);
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int __init parse_cluster(struct device_node *cluster, int depth)
-{
-	char name[10];
-	bool leaf = true;
-	bool has_cores = false;
-	struct device_node *c;
-	static int package_id __initdata;
-	int core_id = 0;
-	int i, ret;
-
-	/*
-	 * First check for child clusters; we currently ignore any
-	 * information about the nesting of clusters and present the
-	 * scheduler with a flat list of them.
-	 */
-	i = 0;
-	do {
-		snprintf(name, sizeof(name), "cluster%d", i);
-		c = of_get_child_by_name(cluster, name);
-		if (c) {
-			leaf = false;
-			ret = parse_cluster(c, depth + 1);
-			of_node_put(c);
-			if (ret != 0)
-				return ret;
-		}
-		i++;
-	} while (c);
-
-	/* Now check for cores */
-	i = 0;
-	do {
-		snprintf(name, sizeof(name), "core%d", i);
-		c = of_get_child_by_name(cluster, name);
-		if (c) {
-			has_cores = true;
-
-			if (depth == 0) {
-				pr_err("%pOF: cpu-map children should be clusters\n",
-				       c);
-				of_node_put(c);
-				return -EINVAL;
-			}
-
-			if (leaf) {
-				ret = parse_core(c, package_id, core_id++);
-			} else {
-				pr_err("%pOF: Non-leaf cluster with core %s\n",
-				       cluster, name);
-				ret = -EINVAL;
-			}
-
-			of_node_put(c);
-			if (ret != 0)
-				return ret;
-		}
-		i++;
-	} while (c);
-
-	if (leaf && !has_cores)
-		pr_warn("%pOF: empty cluster\n", cluster);
-
-	if (leaf)
-		package_id++;
-
-	return 0;
-}
-
-static int __init parse_dt_topology(void)
-{
-	struct device_node *cn, *map;
-	int ret = 0;
-	int cpu;
-
-	cn = of_find_node_by_path("/cpus");
-	if (!cn) {
-		pr_err("No CPU information found in DT\n");
-		return 0;
-	}
-
-	/*
-	 * When topology is provided cpu-map is essentially a root
-	 * cluster with restricted subnodes.
-	 */
-	map = of_get_child_by_name(cn, "cpu-map");
-	if (!map)
-		goto out;
-
-	ret = parse_cluster(map, 0);
-	if (ret != 0)
-		goto out_map;
-
-	topology_normalize_cpu_scale();
-
-	/*
-	 * Check that all cores are in the topology; the SMP code will
-	 * only mark cores described in the DT as possible.
-	 */
-	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id == -1)
-			ret = -EINVAL;
-
-out_map:
-	of_node_put(map);
-out:
-	of_node_put(cn);
-	return ret;
-}
-
-/*
- * cpu topology table
- */
-struct cpu_topology cpu_topology[NR_CPUS];
-EXPORT_SYMBOL_GPL(cpu_topology);
-
-const struct cpumask *cpu_coregroup_mask(int cpu)
-{
-	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
-
-	/* Find the smaller of NUMA, core or LLC siblings */
-	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
-		/* not numa in package, lets use the package siblings */
-		core_mask = &cpu_topology[cpu].core_sibling;
-	}
-	if (cpu_topology[cpu].llc_id != -1) {
-		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
-			core_mask = &cpu_topology[cpu].llc_sibling;
-	}
-
-	return core_mask;
-}
-
-static void update_siblings_masks(unsigned int cpuid)
-{
-	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
-	int cpu;
-
-	/* update core and thread sibling masks */
-	for_each_online_cpu(cpu) {
-		cpu_topo = &cpu_topology[cpu];
-
-		if (cpuid_topo->llc_id == cpu_topo->llc_id) {
-			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
-			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
-		}
-
-		if (cpuid_topo->package_id != cpu_topo->package_id)
-			continue;
-
-		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
-		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
-		if (cpuid_topo->core_id != cpu_topo->core_id)
-			continue;
-
-		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
-		cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
-	}
-}
-
 void store_cpu_topology(unsigned int cpuid)
 {
 	struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
@@ -296,59 +59,19 @@ void store_cpu_topology(unsigned int cpuid)
 	update_siblings_masks(cpuid);
 }
 
-static void clear_cpu_topology(int cpu)
-{
-	struct cpu_topology *cpu_topo = &cpu_topology[cpu];
-
-	cpumask_clear(&cpu_topo->llc_sibling);
-	cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
-
-	cpumask_clear(&cpu_topo->core_sibling);
-	cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
-	cpumask_clear(&cpu_topo->thread_sibling);
-	cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
-}
-
-static void __init reset_cpu_topology(void)
-{
-	unsigned int cpu;
-
-	for_each_possible_cpu(cpu) {
-		struct cpu_topology *cpu_topo = &cpu_topology[cpu];
-
-		cpu_topo->thread_id = -1;
-		cpu_topo->core_id = 0;
-		cpu_topo->package_id = -1;
-		cpu_topo->llc_id = -1;
-
-		clear_cpu_topology(cpu);
-	}
-}
-
-void remove_cpu_topology(unsigned int cpu)
-{
-	int sibling;
-
-	for_each_cpu(sibling, topology_core_cpumask(cpu))
-		cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
-	for_each_cpu(sibling, topology_sibling_cpumask(cpu))
-		cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
-	for_each_cpu(sibling, topology_llc_cpumask(cpu))
-		cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
-
-	clear_cpu_topology(cpu);
-}
-
 #ifdef CONFIG_ACPI
 /*
  * Propagate the topology information of the processor_topology_node tree to the
  * cpu_topology array.
  */
-static int __init parse_acpi_topology(void)
+int __init parse_acpi_topology(void)
 {
 	bool is_threaded;
 	int cpu, topology_id;
 
+	if (acpi_disabled)
+		return 0;
+
 	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
 
 	for_each_possible_cpu(cpu) {
@@ -384,24 +107,6 @@ static int __init parse_acpi_topology(void)
 
 	return 0;
 }
-
-#else
-static inline int __init parse_acpi_topology(void)
-{
-	return -EINVAL;
-}
 #endif
 
-void __init init_cpu_topology(void)
-{
-	reset_cpu_topology();
 
-	/*
-	 * Discard anything that was parsed if we hit an error so we
-	 * don't use partial information.
-	 */
-	if (!acpi_disabled && parse_acpi_topology())
-		reset_cpu_topology();
-	else if (of_have_populated_dt() && parse_dt_topology())
-		reset_cpu_topology();
-}
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1739d7e1952a..5781bb4c457c 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -15,6 +15,11 @@
 #include <linux/string.h>
 #include <linux/sched/topology.h>
 #include <linux/cpuset.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
+#include <linux/smp.h>
 
 DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
 
@@ -244,3 +249,294 @@ static void parsing_done_workfn(struct work_struct *work)
 #else
 core_initcall(free_raw_capacity);
 #endif
+
+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+static int __init get_cpu_for_node(struct device_node *node)
+{
+	struct device_node *cpu_node;
+	int cpu;
+
+	cpu_node = of_parse_phandle(node, "cpu", 0);
+	if (!cpu_node)
+		return -1;
+
+	cpu = of_cpu_node_to_id(cpu_node);
+	if (cpu >= 0)
+		topology_parse_cpu_capacity(cpu_node, cpu);
+	else
+		pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
+
+	of_node_put(cpu_node);
+	return cpu;
+}
+
+static int __init parse_core(struct device_node *core, int package_id,
+			     int core_id)
+{
+	char name[10];
+	bool leaf = true;
+	int i = 0;
+	int cpu;
+	struct device_node *t;
+
+	do {
+		snprintf(name, sizeof(name), "thread%d", i);
+		t = of_get_child_by_name(core, name);
+		if (t) {
+			leaf = false;
+			cpu = get_cpu_for_node(t);
+			if (cpu >= 0) {
+				cpu_topology[cpu].package_id = package_id;
+				cpu_topology[cpu].core_id = core_id;
+				cpu_topology[cpu].thread_id = i;
+			} else {
+				pr_err("%pOF: Can't get CPU for thread\n",
+				       t);
+				of_node_put(t);
+				return -EINVAL;
+			}
+			of_node_put(t);
+		}
+		i++;
+	} while (t);
+
+	cpu = get_cpu_for_node(core);
+	if (cpu >= 0) {
+		if (!leaf) {
+			pr_err("%pOF: Core has both threads and CPU\n",
+			       core);
+			return -EINVAL;
+		}
+
+		cpu_topology[cpu].package_id = package_id;
+		cpu_topology[cpu].core_id = core_id;
+	} else if (leaf) {
+		pr_err("%pOF: Can't get CPU for leaf core\n", core);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int __init parse_cluster(struct device_node *cluster, int depth)
+{
+	char name[10];
+	bool leaf = true;
+	bool has_cores = false;
+	struct device_node *c;
+	static int package_id __initdata;
+	int core_id = 0;
+	int i, ret;
+
+	/*
+	 * First check for child clusters; we currently ignore any
+	 * information about the nesting of clusters and present the
+	 * scheduler with a flat list of them.
+	 */
+	i = 0;
+	do {
+		snprintf(name, sizeof(name), "cluster%d", i);
+		c = of_get_child_by_name(cluster, name);
+		if (c) {
+			leaf = false;
+			ret = parse_cluster(c, depth + 1);
+			of_node_put(c);
+			if (ret != 0)
+				return ret;
+		}
+		i++;
+	} while (c);
+
+	/* Now check for cores */
+	i = 0;
+	do {
+		snprintf(name, sizeof(name), "core%d", i);
+		c = of_get_child_by_name(cluster, name);
+		if (c) {
+			has_cores = true;
+
+			if (depth == 0) {
+				pr_err("%pOF: cpu-map children should be clusters\n",
+				       c);
+				of_node_put(c);
+				return -EINVAL;
+			}
+
+			if (leaf) {
+				ret = parse_core(c, package_id, core_id++);
+			} else {
+				pr_err("%pOF: Non-leaf cluster with core %s\n",
+				       cluster, name);
+				ret = -EINVAL;
+			}
+
+			of_node_put(c);
+			if (ret != 0)
+				return ret;
+		}
+		i++;
+	} while (c);
+
+	if (leaf && !has_cores)
+		pr_warn("%pOF: empty cluster\n", cluster);
+
+	if (leaf)
+		package_id++;
+
+	return 0;
+}
+
+static int __init parse_dt_topology(void)
+{
+	struct device_node *cn, *map;
+	int ret = 0;
+	int cpu;
+
+	cn = of_find_node_by_path("/cpus");
+	if (!cn) {
+		pr_err("No CPU information found in DT\n");
+		return 0;
+	}
+
+	/*
+	 * When topology is provided cpu-map is essentially a root
+	 * cluster with restricted subnodes.
+	 */
+	map = of_get_child_by_name(cn, "cpu-map");
+	if (!map)
+		goto out;
+
+	ret = parse_cluster(map, 0);
+	if (ret != 0)
+		goto out_map;
+
+	topology_normalize_cpu_scale();
+
+	/*
+	 * Check that all cores are in the topology; the SMP code will
+	 * only mark cores described in the DT as possible.
+	 */
+	for_each_possible_cpu(cpu)
+		if (cpu_topology[cpu].package_id == -1)
+			ret = -EINVAL;
+
+out_map:
+	of_node_put(map);
+out:
+	of_node_put(cn);
+	return ret;
+}
+
+/*
+ * cpu topology table
+ */
+struct cpu_topology cpu_topology[NR_CPUS];
+EXPORT_SYMBOL_GPL(cpu_topology);
+
+const struct cpumask *cpu_coregroup_mask(int cpu)
+{
+	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
+
+	/* Find the smaller of NUMA, core or LLC siblings */
+	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
+		/* not numa in package, lets use the package siblings */
+		core_mask = &cpu_topology[cpu].core_sibling;
+	}
+	if (cpu_topology[cpu].llc_id != -1) {
+		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
+			core_mask = &cpu_topology[cpu].llc_sibling;
+	}
+
+	return core_mask;
+}
+
+void update_siblings_masks(unsigned int cpuid)
+{
+	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
+	int cpu;
+
+	/* update core and thread sibling masks */
+	for_each_online_cpu(cpu) {
+		cpu_topo = &cpu_topology[cpu];
+
+		if (cpuid_topo->llc_id == cpu_topo->llc_id) {
+			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
+			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
+		}
+
+		if (cpuid_topo->package_id != cpu_topo->package_id)
+			continue;
+
+		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+		if (cpuid_topo->core_id != cpu_topo->core_id)
+			continue;
+
+		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
+		cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
+	}
+}
+
+static void clear_cpu_topology(int cpu)
+{
+	struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+
+	cpumask_clear(&cpu_topo->llc_sibling);
+	cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
+
+	cpumask_clear(&cpu_topo->core_sibling);
+	cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
+	cpumask_clear(&cpu_topo->thread_sibling);
+	cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
+}
+
+static void __init reset_cpu_topology(void)
+{
+	unsigned int cpu;
+
+	for_each_possible_cpu(cpu) {
+		struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+
+		cpu_topo->thread_id = -1;
+		cpu_topo->core_id = -1;
+		cpu_topo->package_id = -1;
+		cpu_topo->llc_id = -1;
+
+		clear_cpu_topology(cpu);
+	}
+}
+
+void remove_cpu_topology(unsigned int cpu)
+{
+	int sibling;
+
+	for_each_cpu(sibling, topology_core_cpumask(cpu))
+		cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
+	for_each_cpu(sibling, topology_sibling_cpumask(cpu))
+		cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
+	for_each_cpu(sibling, topology_llc_cpumask(cpu))
+		cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
+
+	clear_cpu_topology(cpu);
+}
+
+__weak int __init parse_acpi_topology(void)
+{
+	return 0;
+}
+
+void __init init_cpu_topology(void)
+{
+	reset_cpu_topology();
+
+	/*
+	 * Discard anything that was parsed if we hit an error so we
+	 * don't use partial information.
+	 */
+	if (parse_acpi_topology())
+		reset_cpu_topology();
+	else if (of_have_populated_dt() && parse_dt_topology())
+		reset_cpu_topology();
+}
+#endif
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index d9bdc1a7f4e7..d4e76e0a283f 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -33,4 +33,32 @@ unsigned long topology_get_freq_scale(int cpu)
 	return per_cpu(freq_scale, cpu);
 }
 
+struct cpu_topology {
+	int thread_id;
+	int core_id;
+	int package_id;
+	int llc_id;
+	cpumask_t thread_sibling;
+	cpumask_t core_sibling;
+	cpumask_t llc_sibling;
+};
+
+#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
+extern struct cpu_topology cpu_topology[NR_CPUS];
+
+#define topology_physical_package_id(cpu)	(cpu_topology[cpu].package_id)
+#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
+#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
+#define topology_sibling_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
+#define topology_llc_cpumask(cpu)	(&cpu_topology[cpu].llc_sibling)
+void init_cpu_topology(void);
+void store_cpu_topology(unsigned int cpuid);
+const struct cpumask *cpu_coregroup_mask(int cpu);
+#endif
+
+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+void update_siblings_masks(unsigned int cpu);
+#endif
+void remove_cpu_topology(unsigned int cpuid);
+
 #endif /* _LINUX_ARCH_TOPOLOGY_H_ */
diff --git a/include/linux/topology.h b/include/linux/topology.h
index cb0775e1ee4b..4b3755d65812 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -27,6 +27,7 @@
 #ifndef _LINUX_TOPOLOGY_H
 #define _LINUX_TOPOLOGY_H
 
+#include <linux/arch_topology.h>
 #include <linux/cpumask.h>
 #include <linux/bitops.h>
 #include <linux/mmzone.h>
-- 
2.21.0
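
For readers who want to experiment with the moved code outside the
kernel, here is a minimal userspace sketch of the sibling-mask update
that update_siblings_masks() performs above. It is an illustration under
stated assumptions, not the kernel implementation: each cpumask_t is
modeled as a uint32_t bitmap (so at most 32 CPUs), the online mask is
passed in explicitly, and struct topo_sketch, SKETCH_BIT() and
sketch_update_siblings() are hypothetical names that do not exist in the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit n of a 32-bit mask. */
#define SKETCH_BIT(n) ((uint32_t)1 << (n))

/* Userspace stand-in for struct cpu_topology; masks are plain bitmaps. */
struct topo_sketch {
	int core_id;
	int package_id;
	int llc_id;
	uint32_t thread_sibling;
	uint32_t core_sibling;
	uint32_t llc_sibling;
};

/*
 * Mirror of the loop in update_siblings_masks(): walk every online CPU
 * and set the mutual sibling bits level by level (LLC, package, core).
 */
static void sketch_update_siblings(struct topo_sketch *topo,
				   unsigned int nr_cpus, uint32_t online,
				   unsigned int cpuid)
{
	struct topo_sketch *self;
	unsigned int cpu;

	assert(nr_cpus <= 32 && cpuid < nr_cpus);
	self = &topo[cpuid];

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		struct topo_sketch *other = &topo[cpu];

		if (!(online & SKETCH_BIT(cpu)))
			continue;

		/* Same last-level cache id: mutual LLC siblings. */
		if (self->llc_id == other->llc_id) {
			self->llc_sibling |= SKETCH_BIT(cpu);
			other->llc_sibling |= SKETCH_BIT(cpuid);
		}

		/* Different package: nothing more to share. */
		if (self->package_id != other->package_id)
			continue;

		other->core_sibling |= SKETCH_BIT(cpuid);
		self->core_sibling |= SKETCH_BIT(cpu);

		/* Same package but different core: not SMT siblings. */
		if (self->core_id != other->core_id)
			continue;

		other->thread_sibling |= SKETCH_BIT(cpuid);
		self->thread_sibling |= SKETCH_BIT(cpu);
	}
}
```

Starting from masks that contain only the CPU itself (which is what
clear_cpu_topology() sets up) and bringing four CPUs in two packages
online one at a time, CPUs 0-1 end up with core_sibling 0x3 and CPUs 2-3
with 0xc. Because an llc_id of -1 compares equal across CPUs, all four
also become LLC siblings of each other in this sketch.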



* [PATCH v6 4/7] arm: Use common cpu_topology structure and functions.
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
                   ` (2 preceding siblings ...)
  2019-05-29 21:13 ` [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code Atish Patra
@ 2019-05-29 21:13 ` Atish Patra
  2019-06-06 14:25   ` Atish Patra
  2019-05-29 21:13 ` [PATCH v6 5/7] RISC-V: Parse cpu topology during boot Atish Patra
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel, Russell King
  Cc: Atish Patra, Sudeep Holla, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	linux-arm-kernel

Currently, ARM32 and ARM64 use different data structures to represent
their CPU topologies. Since we are moving the ARM64 topology to common
code to be used by other architectures, we can reuse that for ARM32 as
well.

Take this opportunity to remove the redundant functions from ARM32 and
reuse the common code instead.

To: Russell King <linux@armlinux.org.uk>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com> (on TC2)
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

---
Hi Russell,
Can we get an ACK for this patch? We are hoping that the entire
series can be merged in one go.
---
 arch/arm/include/asm/topology.h | 20 -----------
 arch/arm/kernel/topology.c      | 60 ++++-----------------------------
 drivers/base/arch_topology.c    |  4 ++-
 include/linux/arch_topology.h   |  6 ++--
 4 files changed, 11 insertions(+), 79 deletions(-)

diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 2a786f54d8b8..8a0fae94d45e 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -5,26 +5,6 @@
 #ifdef CONFIG_ARM_CPU_TOPOLOGY
 
 #include <linux/cpumask.h>
-
-struct cputopo_arm {
-	int thread_id;
-	int core_id;
-	int socket_id;
-	cpumask_t thread_sibling;
-	cpumask_t core_sibling;
-};
-
-extern struct cputopo_arm cpu_topology[NR_CPUS];
-
-#define topology_physical_package_id(cpu)	(cpu_topology[cpu].socket_id)
-#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
-#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
-#define topology_sibling_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
-
-void init_cpu_topology(void);
-void store_cpu_topology(unsigned int cpuid);
-const struct cpumask *cpu_coregroup_mask(int cpu);
-
 #include <linux/arch_topology.h>
 
 /* Replace task scheduler's default frequency-invariant accounting */
diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 60e375ce1ab2..238f1da0219c 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -177,17 +177,6 @@ static inline void parse_dt_topology(void) {}
 static inline void update_cpu_capacity(unsigned int cpuid) {}
 #endif
 
- /*
- * cpu topology table
- */
-struct cputopo_arm cpu_topology[NR_CPUS];
-EXPORT_SYMBOL_GPL(cpu_topology);
-
-const struct cpumask *cpu_coregroup_mask(int cpu)
-{
-	return &cpu_topology[cpu].core_sibling;
-}
-
 /*
  * The current assumption is that we can power gate each core independently.
  * This will be superseded by DT binding once available.
@@ -197,32 +186,6 @@ const struct cpumask *cpu_corepower_mask(int cpu)
 	return &cpu_topology[cpu].thread_sibling;
 }
 
-static void update_siblings_masks(unsigned int cpuid)
-{
-	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
-	int cpu;
-
-	/* update core and thread sibling masks */
-	for_each_possible_cpu(cpu) {
-		cpu_topo = &cpu_topology[cpu];
-
-		if (cpuid_topo->socket_id != cpu_topo->socket_id)
-			continue;
-
-		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
-		if (cpu != cpuid)
-			cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
-		if (cpuid_topo->core_id != cpu_topo->core_id)
-			continue;
-
-		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
-		if (cpu != cpuid)
-			cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
-	}
-	smp_wmb();
-}
-
 /*
  * store_cpu_topology is called at boot when only one cpu is running
  * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
@@ -230,7 +193,7 @@ static void update_siblings_masks(unsigned int cpuid)
  */
 void store_cpu_topology(unsigned int cpuid)
 {
-	struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
+	struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
 	unsigned int mpidr;
 
 	/* If the cpu topology has been already set, just return */
@@ -250,12 +213,12 @@ void store_cpu_topology(unsigned int cpuid)
 			/* core performance interdependency */
 			cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
 			cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
-			cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+			cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
 		} else {
 			/* largely independent cores */
 			cpuid_topo->thread_id = -1;
 			cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
-			cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+			cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
 		}
 	} else {
 		/*
@@ -265,7 +228,7 @@ void store_cpu_topology(unsigned int cpuid)
 		 */
 		cpuid_topo->thread_id = -1;
 		cpuid_topo->core_id = 0;
-		cpuid_topo->socket_id = -1;
+		cpuid_topo->package_id = -1;
 	}
 
 	update_siblings_masks(cpuid);
@@ -275,7 +238,7 @@ void store_cpu_topology(unsigned int cpuid)
 	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
 		cpuid, cpu_topology[cpuid].thread_id,
 		cpu_topology[cpuid].core_id,
-		cpu_topology[cpuid].socket_id, mpidr);
+		cpu_topology[cpuid].package_id, mpidr);
 }
 
 static inline int cpu_corepower_flags(void)
@@ -298,18 +261,7 @@ static struct sched_domain_topology_level arm_topology[] = {
  */
 void __init init_cpu_topology(void)
 {
-	unsigned int cpu;
-
-	/* init core mask and capacity */
-	for_each_possible_cpu(cpu) {
-		struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
-
-		cpu_topo->thread_id = -1;
-		cpu_topo->core_id =  -1;
-		cpu_topo->socket_id = -1;
-		cpumask_clear(&cpu_topo->core_sibling);
-		cpumask_clear(&cpu_topo->thread_sibling);
-	}
+	reset_cpu_topology();
 	smp_wmb();
 
 	parse_dt_topology();
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 5781bb4c457c..797e3cd71bea 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -426,6 +426,7 @@ static int __init parse_dt_topology(void)
 	of_node_put(cn);
 	return ret;
 }
+#endif
 
 /*
  * cpu topology table
@@ -491,7 +492,7 @@ static void clear_cpu_topology(int cpu)
 	cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
 }
 
-static void __init reset_cpu_topology(void)
+void __init reset_cpu_topology(void)
 {
 	unsigned int cpu;
 
@@ -526,6 +527,7 @@ __weak int __init parse_acpi_topology(void)
 	return 0;
 }
 
+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void __init init_cpu_topology(void)
 {
 	reset_cpu_topology();
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index d4e76e0a283f..d4311127970d 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -54,11 +54,9 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
 void init_cpu_topology(void);
 void store_cpu_topology(unsigned int cpuid);
 const struct cpumask *cpu_coregroup_mask(int cpu);
-#endif
-
-#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
 void update_siblings_masks(unsigned int cpu);
-#endif
 void remove_cpu_topology(unsigned int cpuid);
+void reset_cpu_topology(void);
+#endif
 
 #endif /* _LINUX_ARCH_TOPOLOGY_H_ */
-- 
2.21.0
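
As a reading aid for the store_cpu_topology() hunks above, here is a
hedged userspace sketch of how the three IDs are derived from an MPIDR
value on ARM32. The if/else structure and the uniprocessor fallback
values follow the diff context. The bit layout (SMP field in bits
[31:30], MT flag in bit 24, 8-bit affinity fields) and the outer SMP
check are assumptions that mirror my understanding of the arm32 MPIDR
macros and are not shown in this patch. All SK_* names, struct
ids_sketch and sketch_decode_mpidr() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MPIDR layout; not taken from this patch. */
#define SK_MPIDR_SMP_BITMASK	(UINT32_C(0x3) << 30)
#define SK_MPIDR_SMP_VALUE	(UINT32_C(0x2) << 30)
#define SK_MPIDR_MT_BITMASK	(UINT32_C(0x1) << 24)
#define SK_AFF(mpidr, level)	(((mpidr) >> (8 * (level))) & UINT32_C(0xff))

struct ids_sketch {
	int thread_id;
	int core_id;
	int package_id;
};

static struct ids_sketch sketch_decode_mpidr(uint32_t mpidr)
{
	struct ids_sketch ids;

	if ((mpidr & SK_MPIDR_SMP_BITMASK) == SK_MPIDR_SMP_VALUE) {
		if (mpidr & SK_MPIDR_MT_BITMASK) {
			/* core performance interdependency (SMT) */
			ids.thread_id = (int)SK_AFF(mpidr, 0);
			ids.core_id = (int)SK_AFF(mpidr, 1);
			ids.package_id = (int)SK_AFF(mpidr, 2);
		} else {
			/* largely independent cores */
			ids.thread_id = -1;
			ids.core_id = (int)SK_AFF(mpidr, 0);
			ids.package_id = (int)SK_AFF(mpidr, 1);
		}
	} else {
		/* uniprocessor fallback, as in the diff context */
		ids.thread_id = -1;
		ids.core_id = 0;
		ids.package_id = -1;
	}

	/* Each affinity field is 8 bits wide, so every ID stays below 256. */
	assert(ids.core_id < 256 && ids.package_id < 256);
	return ids;
}
```

Under these assumptions, an MPIDR of 0x80000102 decodes to core 2 in
package 1 with no thread ID. The only change the patch makes to this
logic is the rename of the socket_id field to package_id.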



* [PATCH v6 5/7] RISC-V: Parse cpu topology during boot.
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
                   ` (3 preceding siblings ...)
  2019-05-29 21:13 ` [PATCH v6 4/7] arm: Use common cpu_topology structure and functions Atish Patra
@ 2019-05-29 21:13 ` Atish Patra
  2019-06-07  5:00   ` Paul Walmsley
  2019-05-29 21:13 ` [PATCH v6 6/7] base: arch_topology: update Kconfig help description Atish Patra
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel, Russell King
  Cc: Atish Patra, Sudeep Holla, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	linux-arm-kernel

Currently, there is no topology defined for RISC-V.
Parse the cpu-map node from the device tree and set up the
CPU topology.

CPU topology after applying the patch:
$ cat /sys/devices/system/cpu/cpu2/topology/core_siblings_list
0-3
$ cat /sys/devices/system/cpu/cpu3/topology/core_siblings_list
0-3
$ cat /sys/devices/system/cpu/cpu3/topology/physical_package_id
0
$ cat /sys/devices/system/cpu/cpu3/topology/core_id
3

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
---
 arch/riscv/Kconfig          | 1 +
 arch/riscv/kernel/smpboot.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 0c4b12205632..2d8a16299a85 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -47,6 +47,7 @@ config RISCV
 	select PCI_MSI if PCI
 	select RISCV_TIMER
 	select GENERIC_IRQ_MULTI_HANDLER
+	select GENERIC_ARCH_TOPOLOGY if SMP
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_MMIOWB
 	select HAVE_EBPF_JIT if 64BIT
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 7a0b62252524..54f89d5b19ba 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -16,6 +16,7 @@
  * GNU General Public License for more details.
  */
 
+#include <linux/arch_topology.h>
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/kernel.h>
@@ -43,6 +44,7 @@ static DECLARE_COMPLETION(cpu_running);
 
 void __init smp_prepare_boot_cpu(void)
 {
+	init_cpu_topology();
 }
 
 void __init smp_prepare_cpus(unsigned int max_cpus)
@@ -146,6 +148,7 @@ asmlinkage void __init smp_callin(void)
 
 	trap_init();
 	notify_cpu_starting(smp_processor_id());
+	update_siblings_masks(smp_processor_id());
 	set_cpu_online(smp_processor_id(), 1);
 	/*
 	 * Remote TLB flushes are ignored while the CPU is offline, so emit
-- 
2.21.0
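
The sysfs checks in the commit message above can also be done from a
small userspace program. The following is only a sketch: it assumes the
integer attributes live under /sys/devices/system/cpu/cpuN/topology/ as
in the cat commands above, and sketch_topology_path() and
sketch_read_topology_int() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the sysfs path of one topology attribute for a CPU.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int sketch_topology_path(char *buf, size_t len, unsigned int cpu,
				const char *attr)
{
	int n = snprintf(buf, len,
			 "/sys/devices/system/cpu/cpu%u/topology/%s",
			 cpu, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read an integer attribute such as "core_id" or "physical_package_id".
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
static int sketch_read_topology_int(unsigned int cpu, const char *attr,
				    int *value)
{
	char path[128];
	FILE *f;
	int ok;

	if (sketch_topology_path(path, sizeof(path), cpu, attr))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling sketch_read_topology_int(3, "core_id", &v) on the system from
the commit message would be expected to store 3 in v, matching the cat
output. List-valued attributes such as core_siblings_list need string
parsing and are not covered by this sketch.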



* [PATCH v6 6/7] base: arch_topology: update Kconfig help description
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
                   ` (4 preceding siblings ...)
  2019-05-29 21:13 ` [PATCH v6 5/7] RISC-V: Parse cpu topology during boot Atish Patra
@ 2019-05-29 21:13 ` Atish Patra
  2019-05-29 21:13 ` [PATCH v6 7/7] MAINTAINERS: Add an entry for generic architecture topology Atish Patra
  2019-05-30 21:12 ` [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Jeremy Linton
  7 siblings, 0 replies; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Greg Kroah-Hartman, Albert Ou, Anup Patel,
	Atish Patra, Catalin Marinas, David S. Miller, devicetree,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

Commit 5d777b185f6d ("arch_topology: Make cpu_capacity sysfs node as read-only")
made cpu_capacity sysfs node read-only. Update the GENERIC_ARCH_TOPOLOGY
Kconfig help section to reflect the same.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/base/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index dc404492381d..28b92e3cc570 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -202,7 +202,7 @@ config GENERIC_ARCH_TOPOLOGY
 	help
 	  Enable support for architectures common topology code: e.g., parsing
 	  CPU capacity information from DT, usage of such information for
-	  appropriate scaling, sysfs interface for changing capacity values at
+	  appropriate scaling, sysfs interface for reading capacity values at
 	  runtime.
 
 endmenu
-- 
2.21.0



* [PATCH v6 7/7] MAINTAINERS: Add an entry for generic architecture topology
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
                   ` (5 preceding siblings ...)
  2019-05-29 21:13 ` [PATCH v6 6/7] base: arch_topology: update Kconfig help description Atish Patra
@ 2019-05-29 21:13 ` Atish Patra
  2019-05-30 21:12 ` [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Jeremy Linton
  7 siblings, 0 replies; 25+ messages in thread
From: Atish Patra @ 2019-05-29 21:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sudeep Holla, Will Deacon, Greg Kroah-Hartman, Juri Lelli,
	Albert Ou, Anup Patel, Atish Patra, Catalin Marinas,
	David S. Miller, devicetree, Ingo Molnar, Jeremy Linton,
	Linus Walleij, linux-riscv, Mark Rutland, Mauro Carvalho Chehab,
	Morten Rasmussen, Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Russell King,
	linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

arm and arm64 shared a lot of CPU topology related code. This was
consolidated under drivers/base/arch_topology.c by Juri. Now RISC-V
has also started sharing the same code, pulling more code from arm64
into arch_topology.c.

Since I was involved in the review from the beginning, I would like
to assume maintenance for the same.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 MAINTAINERS | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 429c6c624861..f0b72ed51e22 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6583,6 +6583,13 @@ W:	https://linuxtv.org
 S:	Maintained
 F:	drivers/media/radio/radio-gemtek*
 
+GENERIC ARCHITECTURE TOPOLOGY
+M:	Sudeep Holla <sudeep.holla@arm.com>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	drivers/base/arch_topology.c
+F:	include/linux/arch_topology.h
+
 GENERIC GPIO I2C DRIVER
 M:	Wolfram Sang <wsa+renesas@sang-engineering.com>
 S:	Supported
-- 
2.21.0



* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-29 21:13 ` [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries Atish Patra
@ 2019-05-29 23:39   ` Andrew F. Davis
  2019-05-30 11:51     ` Morten Rasmussen
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew F. Davis @ 2019-05-29 23:39 UTC (permalink / raw)
  To: Atish Patra, linux-kernel
  Cc: Sudeep Holla, Rob Herring, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, linux-arm-kernel

On 5/29/19 5:13 PM, Atish Patra wrote:
> From: Sudeep Holla <sudeep.holla@arm.com>
> 
> The current ARM DT topology description provides the operating system
> with a topological view of the system that is based on leaf nodes
> representing either cores or threads (in an SMT system) and a
> hierarchical set of cluster nodes that creates a hierarchical topology
> view of how those cores and threads are grouped.
> 
> However this hierarchical representation of clusters does not allow to
> describe what topology level actually represents the physical package or
> the socket boundary, which is a key piece of information to be used by
> an operating system to optimize resource allocation and scheduling.
> 

Are physical package descriptions really needed? What does "socket"
imply that a higher-layer "cluster" node grouping does not? It doesn't
imply a different NUMA distance, and the definition of "socket" is
already not well defined: is a dual-chiplet processor not just a fancy
dual "socket", and what about dual "sockets" on a server board "slotket"
card? Will we need new names for those too?

Andrew

> Lets add a new "socket" node type in the cpu-map node to describe the
> same.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> ---
>   .../devicetree/bindings/arm/topology.txt      | 52 ++++++++++++++-----
>   1 file changed, 39 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/arm/topology.txt
> index b0d80c0fb265..3b8febb46dad 100644
> --- a/Documentation/devicetree/bindings/arm/topology.txt
> +++ b/Documentation/devicetree/bindings/arm/topology.txt
> @@ -9,6 +9,7 @@ ARM topology binding description
>   In an ARM system, the hierarchy of CPUs is defined through three entities that
>   are used to describe the layout of physical CPUs in the system:
>   
> +- socket
>   - cluster
>   - core
>   - thread
> @@ -63,21 +64,23 @@ nodes are listed.
>   
>   	The cpu-map node's child nodes can be:
>   
> -	- one or more cluster nodes
> +	- one or more cluster nodes or
> +	- one or more socket nodes in a multi-socket system
>   
>   	Any other configuration is considered invalid.
>   
> -The cpu-map node can only contain three types of child nodes:
> +The cpu-map node can only contain 4 types of child nodes:
>   
> +- socket node
>   - cluster node
>   - core node
>   - thread node
>   
>   whose bindings are described in paragraph 3.
>   
> -The nodes describing the CPU topology (cluster/core/thread) can only
> -be defined within the cpu-map node and every core/thread in the system
> -must be defined within the topology.  Any other configuration is
> +The nodes describing the CPU topology (socket/cluster/core/thread) can
> +only be defined within the cpu-map node and every core/thread in the
> +system must be defined within the topology.  Any other configuration is
>   invalid and therefore must be ignored.
>   
>   ===========================================
> @@ -85,26 +88,44 @@ invalid and therefore must be ignored.
>   ===========================================
>   
>   cpu-map child nodes must follow a naming convention where the node name
> -must be "clusterN", "coreN", "threadN" depending on the node type (ie
> -cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes which
> -are siblings within a single common parent node must be given a unique and
> +must be "socketN", "clusterN", "coreN", "threadN" depending on the node type
> +(ie socket/cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes
> +which are siblings within a single common parent node must be given a unique and
>   sequential N value, starting from 0).
>   cpu-map child nodes which do not share a common parent node can have the same
>   name (ie same number N as other cpu-map child nodes at different device tree
>   levels) since name uniqueness will be guaranteed by the device tree hierarchy.
>   
>   ===========================================
> -3 - cluster/core/thread node bindings
> +3 - socket/cluster/core/thread node bindings
>   ===========================================
>   
> -Bindings for cluster/cpu/thread nodes are defined as follows:
> +Bindings for socket/cluster/cpu/thread nodes are defined as follows:
> +
> +- socket node
> +
> +	 Description: must be declared within a cpu-map node, one node
> +		      per physical socket in the system. A system can
> +		      contain single or multiple physical socket.
> +		      The association of sockets and NUMA nodes is beyond
> +		      the scope of this bindings, please refer [2] for
> +		      NUMA bindings.
> +
> +	This node is optional for a single socket system.
> +
> +	The socket node name must be "socketN" as described in 2.1 above.
> +	A socket node can not be a leaf node.
> +
> +	A socket node's child nodes must be one or more cluster nodes.
> +
> +	Any other configuration is considered invalid.
>   
>   - cluster node
>   
>   	 Description: must be declared within a cpu-map node, one node
>   		      per cluster. A system can contain several layers of
> -		      clustering and cluster nodes can be contained in parent
> -		      cluster nodes.
> +		      clustering within a single physical socket and cluster
> +		      nodes can be contained in parent cluster nodes.
>   
>   	The cluster node name must be "clusterN" as described in 2.1 above.
>   	A cluster node can not be a leaf node.
> @@ -164,13 +185,15 @@ Bindings for cluster/cpu/thread nodes are defined as follows:
>   4 - Example dts
>   ===========================================
>   
> -Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters):
> +Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters in a single
> +physical socket):
>   
>   cpus {
>   	#size-cells = <0>;
>   	#address-cells = <2>;
>   
>   	cpu-map {
> +		socket0 {
>   			cluster0 {
>   				cluster0 {
>   					core0 {
> @@ -253,6 +276,7 @@ cpus {
>   				};
>   			};
>   		};
> +	};
>   
>   	CPU0: cpu@0 {
>   		device_type = "cpu";
> @@ -473,3 +497,5 @@ cpus {
>   ===============================================================================
>   [1] ARM Linux kernel documentation
>       Documentation/devicetree/bindings/arm/cpus.yaml
> +[2] Devicetree NUMA binding description
> +    Documentation/devicetree/bindings/numa.txt
> 


* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-29 23:39   ` Andrew F. Davis
@ 2019-05-30 11:51     ` Morten Rasmussen
  2019-05-30 12:56       ` Andrew F. Davis
  2019-05-30 21:42       ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 25+ messages in thread
From: Morten Rasmussen @ 2019-05-30 11:51 UTC (permalink / raw)
  To: Andrew F. Davis
  Cc: Atish Patra, linux-kernel, Sudeep Holla, Rob Herring, Albert Ou,
	Anup Patel, Catalin Marinas, David S. Miller, devicetree,
	Greg Kroah-Hartman, Ingo Molnar, Jeremy Linton, Linus Walleij,
	linux-riscv, Mark Rutland, Mauro Carvalho Chehab, Otto Sabart,
	Palmer Dabbelt, Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, linux-arm-kernel

On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> On 5/29/19 5:13 PM, Atish Patra wrote:
> >From: Sudeep Holla <sudeep.holla@arm.com>
> >
> >The current ARM DT topology description provides the operating system
> >with a topological view of the system that is based on leaf nodes
> >representing either cores or threads (in an SMT system) and a
> >hierarchical set of cluster nodes that creates a hierarchical topology
> >view of how those cores and threads are grouped.
> >
> >However this hierarchical representation of clusters does not allow to
> >describe what topology level actually represents the physical package or
> >the socket boundary, which is a key piece of information to be used by
> >an operating system to optimize resource allocation and scheduling.
> >
> 
> Are physical package descriptions really needed? What does "socket" imply
> that a higher layer "cluster" node grouping does not? It doesn't imply a
> different NUMA distance and the definition of "socket" is already not well
> defined, is a dual chiplet processor not just a fancy dual "socket" or are
> dual "sockets" on a server board "slotket" card, will we need new names for
> those too..

Socket (or package) just implies what you suggest, a grouping of CPUs
based on the physical socket (or package). Some resources might be
associated with packages and more importantly socket information is
exposed to user-space. At the moment clusters are being exposed to
user-space as sockets which is less than ideal for some topologies.

At the moment user-space is only told about hw threads, cores, and
sockets. In the very near future it is going to be told about dies too
(look for Len Brown's multi-die patch set).

I don't see how we can provide correct information to user-space based
on the current information in DT. I'm not convinced it was a good idea
to expose this information to user-space to begin with but that is
another discussion.

Morten
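
To illustrate the point about clusters currently being exposed as
sockets: in parse_cluster() from patch 3/7, package_id is incremented
once per leaf cluster, regardless of nesting. The sketch below models
only that counting rule in userspace. struct cluster_sketch and
sketch_count_packages() are hypothetical names, and the core and thread
parsing is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a cpu-map cluster node. */
struct cluster_sketch {
	const struct cluster_sketch *children;	/* nested clusters, NULL for a leaf */
	size_t nr_children;
};

/*
 * Walk the cluster tree the way parse_cluster() recurses and return the
 * next unused package id. Every leaf cluster consumes exactly one id,
 * so nesting depth has no influence on the result.
 */
static int sketch_count_packages(const struct cluster_sketch *c,
				 int next_package_id)
{
	size_t i;

	assert(c != NULL);

	if (c->nr_children == 0)
		return next_package_id + 1;

	for (i = 0; i < c->nr_children; i++)
		next_package_id = sketch_count_packages(&c->children[i],
							next_package_id);

	return next_package_id;
}
```

For the "two clusters of clusters" layout from the binding's Example 1
(four leaf clusters in total), this rule yields four package IDs even
though the example describes a single physical socket.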


* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-30 11:51     ` Morten Rasmussen
@ 2019-05-30 12:56       ` Andrew F. Davis
  2019-05-30 13:12         ` Morten Rasmussen
  2019-05-31  9:41         ` Sudeep Holla
  2019-05-30 21:42       ` Russell King - ARM Linux admin
  1 sibling, 2 replies; 25+ messages in thread
From: Andrew F. Davis @ 2019-05-30 12:56 UTC (permalink / raw)
  To: Morten Rasmussen
  Cc: Atish Patra, linux-kernel, Sudeep Holla, Rob Herring, Albert Ou,
	Anup Patel, Catalin Marinas, David S. Miller, devicetree,
	Greg Kroah-Hartman, Ingo Molnar, Jeremy Linton, Linus Walleij,
	linux-riscv, Mark Rutland, Mauro Carvalho Chehab, Otto Sabart,
	Palmer Dabbelt, Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, linux-arm-kernel

On 5/30/19 7:51 AM, Morten Rasmussen wrote:
> On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
>> On 5/29/19 5:13 PM, Atish Patra wrote:
>>> From: Sudeep Holla <sudeep.holla@arm.com>
>>>
>>> The current ARM DT topology description provides the operating system
>>> with a topological view of the system that is based on leaf nodes
>>> representing either cores or threads (in an SMT system) and a
>>> hierarchical set of cluster nodes that creates a hierarchical topology
>>> view of how those cores and threads are grouped.
>>>
>>> However this hierarchical representation of clusters does not allow to
>>> describe what topology level actually represents the physical package or
>>> the socket boundary, which is a key piece of information to be used by
>>> an operating system to optimize resource allocation and scheduling.
>>>
>>
>> Are physical package descriptions really needed? What does "socket" imply
>> that a higher layer "cluster" node grouping does not? It doesn't imply a
>> different NUMA distance and the definition of "socket" is already not well
>> defined, is a dual chiplet processor not just a fancy dual "socket" or are
>> dual "sockets" on a server board "slotket" card, will we need new names for
>> those too..
> 
> Socket (or package) just implies what you suggest, a grouping of CPUs
> based on the physical socket (or package). Some resources might be
> associated with packages and more importantly socket information is
> exposed to user-space. At the moment clusters are being exposed to
> user-space as sockets which is less than ideal for some topologies.
> 

I see the benefit of reporting the physical layout and packaging
information to user-space for tracking reasons, but from a software
perspective this doesn't matter, and the resource partitioning should be
described elsewhere (NUMA nodes being the go-to example).

> At the moment user-space is only told about hw threads, cores, and
> sockets. In the very near future it is going to be told about dies too
> (look for Len Brown's multi-die patch set).
> 

Seems my hypothetical case is already in the works :(

> I don't see how we can provide correct information to user-space based
> on the current information in DT. I'm not convinced it was a good idea
> to expose this information to user-space to begin with but that is
> another discussion.
> 

Fair enough, it's a little late now to un-expose this info to userspace,
so we should at least present it correctly. My worry was this getting
out of hand with layering: for instance, what happens when we need to add
die nodes in between cluster and socket?
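
To make the layering concern concrete, here is a purely hypothetical
sketch. The "die0" node name is an assumption for illustration only; it is
not defined by the cpu-map binding or by this series, and the CPU0/CPU1
labels are placeholders for cpu nodes defined elsewhere:

```dts
/* Hypothetical only: "die" is not a node type defined by the binding. */
cpus {
	cpu-map {
		socket0 {
			die0 {	/* assumed extra level between socket and cluster */
				cluster0 {
					core0 {
						cpu = <&CPU0>;
					};
					core1 {
						cpu = <&CPU1>;
					};
				};
			};
		};
	};
};
```

Every such level would need its own node name in the binding and its own
handling in the parser, which is the kind of growth I would like to avoid.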

Andrew

> Morten
> 

* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-30 12:56       ` Andrew F. Davis
@ 2019-05-30 13:12         ` Morten Rasmussen
  2019-05-31  9:41         ` Sudeep Holla
  1 sibling, 0 replies; 25+ messages in thread
From: Morten Rasmussen @ 2019-05-30 13:12 UTC (permalink / raw)
  To: Andrew F. Davis
  Cc: Mark Rutland, Rafael J. Wysocki, Peter Zijlstra (Intel),
	Catalin Marinas, Linus Walleij, Palmer Dabbelt, Will Deacon,
	Atish Patra, Mauro Carvalho Chehab, linux-riscv, Ingo Molnar,
	Rob Herring, Anup Patel, Russell King, devicetree, Albert Ou,
	Rob Herring, Paul Walmsley, Thomas Gleixner, linux-arm-kernel,
	Greg Kroah-Hartman, linux-kernel, Jeremy Linton, Otto Sabart,
	Sudeep Holla, David S. Miller

On Thu, May 30, 2019 at 08:56:03AM -0400, Andrew F. Davis wrote:
> On 5/30/19 7:51 AM, Morten Rasmussen wrote:
> >On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> >>On 5/29/19 5:13 PM, Atish Patra wrote:
> >>>From: Sudeep Holla <sudeep.holla@arm.com>
> >>>
> >>>The current ARM DT topology description provides the operating system
> >>>with a topological view of the system that is based on leaf nodes
> >>>representing either cores or threads (in an SMT system) and a
> >>>hierarchical set of cluster nodes that creates a hierarchical topology
> >>>view of how those cores and threads are grouped.
> >>>
> >>>However this hierarchical representation of clusters does not allow to
> >>>describe what topology level actually represents the physical package or
> >>>the socket boundary, which is a key piece of information to be used by
> >>>an operating system to optimize resource allocation and scheduling.
> >>>
> >>
> >>Are physical package descriptions really needed? What does "socket" imply
> >>that a higher layer "cluster" node grouping does not? It doesn't imply a
> >>different NUMA distance and the definition of "socket" is already not well
> >>defined, is a dual chiplet processor not just a fancy dual "socket" or are
> >>dual "sockets" on a server board "slotket" card, will we need new names for
> >>those too..
> >
> >Socket (or package) just implies what you suggest, a grouping of CPUs
> >based on the physical socket (or package). Some resources might be
> >associated with packages and more importantly socket information is
> >exposed to user-space. At the moment clusters are being exposed to
> >user-space as sockets which is less than ideal for some topologies.
> >
> 
> I see the benefit of reporting the physical layout and packaging information
> to user-space for tracking reasons, but from software perspective this
> doesn't matter, and the resource partitioning should be described elsewhere
> (NUMA nodes being the go to example).

That would make defining a NUMA node mandatory even for non-NUMA
systems?

> >At the moment user-space is only told about hw threads, cores, and
> >sockets. In the very near future it is going to be told about dies too
> >(look for Len Brown's multi-die patch set).
> >
> 
> Seems my hypothetical case is already in the works :(

Indeed. IIUC, the reasoning behind it is related to actual multi-die
x86 packages and some RAPL stuff being per-die or per-core.

> 
> >I don't see how we can provide correct information to user-space based
> >on the current information in DT. I'm not convinced it was a good idea
> >to expose this information to user-space to begin with but that is
> >another discussion.
> >
> 
> Fair enough, it's a little late now to un-expose this info to userspace so
> we should at least present it correctly. My worry was this getting out of
> hand with layering, for instance what happens when we need to add die nodes
> in-between cluster and socket?

If we want the die mask to be correct for arm/arm64/riscv, we need die
information from somewhere. I'm not in favour of adding more topology
layers to the user-space visible topology description, but others might
have a valid reason, and if it is exposed I would prefer that we try to
expose the right information.

Btw, for packages, we already have that information in ACPI/PPTT so it
would be nice if we could have that for DT based systems too.

Morten

* Re: [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding.
  2019-05-29 21:13 ` [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding Atish Patra
@ 2019-05-30 20:55   ` Jeremy Linton
  2019-06-03  8:49     ` Atish Patra
  0 siblings, 1 reply; 25+ messages in thread
From: Jeremy Linton @ 2019-05-30 20:55 UTC (permalink / raw)
  To: Atish Patra, linux-kernel
  Cc: Sudeep Holla, Rob Herring, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Linus Walleij, linux-riscv, Mark Rutland,
	Mauro Carvalho Chehab, Morten Rasmussen, Otto Sabart,
	Palmer Dabbelt, Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, linux-arm-kernel

Hi,

On 5/29/19 4:13 PM, Atish Patra wrote:
> cpu-map binding can be used to described cpu topology for both
> RISC-V & ARM. It makes more sense to move the binding to document
> to a common place.
> 
> The relevant discussion can be found here.
> https://lkml.org/lkml/2018/11/6/19
> 
> Signed-off-by: Atish Patra <atish.patra@wdc.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> ---
>   .../topology.txt => cpu/cpu-topology.txt}     | 82 +++++++++++++++----
>   1 file changed, 66 insertions(+), 16 deletions(-)
>   rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (86%)
> 
> diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
> similarity index 86%
> rename from Documentation/devicetree/bindings/arm/topology.txt
> rename to Documentation/devicetree/bindings/cpu/cpu-topology.txt
> index 3b8febb46dad..069addccab14 100644
> --- a/Documentation/devicetree/bindings/arm/topology.txt
> +++ b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
> @@ -1,12 +1,12 @@
>   ===========================================
> -ARM topology binding description
> +CPU topology binding description
>   ===========================================
>   
>   ===========================================
>   1 - Introduction
>   ===========================================
>   
> -In an ARM system, the hierarchy of CPUs is defined through three entities that
> +In a SMP system, the hierarchy of CPUs is defined through three entities that
>   are used to describe the layout of physical CPUs in the system:
>   
>   - socket
> @@ -14,9 +14,6 @@ are used to describe the layout of physical CPUs in the system:
>   - core
>   - thread
>   
> -The cpu nodes (bindings defined in [1]) represent the devices that
> -correspond to physical CPUs and are to be mapped to the hierarchy levels.
> -
>   The bottom hierarchy level sits at core or thread level depending on whether
>   symmetric multi-threading (SMT) is supported or not.
>   
> @@ -25,33 +22,31 @@ threads existing in the system and map to the hierarchy level "thread" above.
>   In systems where SMT is not supported "cpu" nodes represent all cores present
>   in the system and map to the hierarchy level "core" above.
>   
> -ARM topology bindings allow one to associate cpu nodes with hierarchical groups
> +CPU topology bindings allow one to associate cpu nodes with hierarchical groups
>   corresponding to the system hierarchy; syntactically they are defined as device
>   tree nodes.
>   
> -The remainder of this document provides the topology bindings for ARM, based
> -on the Devicetree Specification, available from:
> +Currently, only ARM/RISC-V intend to use this cpu topology binding but it may be
> +used for any other architecture as well.
>   
> -https://www.devicetree.org/specifications/
> +The cpu nodes, as per bindings defined in [4], represent the devices that
> +correspond to physical CPUs and are to be mapped to the hierarchy levels.
>   
> -If not stated otherwise, whenever a reference to a cpu node phandle is made its
> -value must point to a cpu node compliant with the cpu node bindings as
> -documented in [1].
>   A topology description containing phandles to cpu nodes that are not compliant
> -with bindings standardized in [1] is therefore considered invalid.
> +with bindings standardized in [4] is therefore considered invalid.
>   
>   ===========================================
>   2 - cpu-map node
>   ===========================================
>   
> -The ARM CPU topology is defined within the cpu-map node, which is a direct
> +The ARM/RISC-V CPU topology is defined within the cpu-map node, which is a direct
>   child of the cpus node and provides a container where the actual topology
>   nodes are listed.
>   
>   - cpu-map node
>   
> -	Usage: Optional - On ARM SMP systems provide CPUs topology to the OS.
> -			  ARM uniprocessor systems do not require a topology
> +	Usage: Optional - On SMP systems provide CPUs topology to the OS.
> +			  Uniprocessor systems do not require a topology
>   			  description and therefore should not define a
>   			  cpu-map node.
>   
> @@ -494,8 +489,63 @@ cpus {
>   	};
>   };
>   
> +Example 3: HiFive Unleashed (RISC-V 64 bit, 4 core system)
> +
> +{
> +	#address-cells = <2>;
> +	#size-cells = <2>;
> +	compatible = "sifive,fu540g", "sifive,fu500";
> +	model = "sifive,hifive-unleashed-a00";
> +
> +	...
> +	cpus {
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +		cpu-map {
> +			cluster0 {
> +				core0 {
> +					cpu = <&CPU1>;
> +				};
> +				core1 {
> +					cpu = <&CPU2>;
> +				};
> +				core2 {
> +					cpu = <&CPU3>;
> +				};
> +				core3 {
> +					cpu = <&CPU4>;
> +				};
> +			};
> +		};


<nit picking>

While socket is optional, it's probably a good idea to include the node
in the example even if the result is the same. That is because, at least
on arm64, the DT clusters=sockets decision had performance implications
for larger systems.

Ensuring the socket information is correct is helpful by itself, as it
avoids having to explain why a single-socket machine is displaying some
other value in lscpu.
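
A minimal sketch of what that could look like, assuming the socket0 node
from patch 1/7 is simply wrapped around the existing cluster0. In this
sketch every core uses the "cpu" property and points at its own cpu node,
which is my reading of the intent rather than a copy of the hunk above:

```dts
cpu-map {
	socket0 {	/* single physical package */
		cluster0 {
			core0 {
				cpu = <&CPU1>;
			};
			core1 {
				cpu = <&CPU2>;
			};
			core2 {
				cpu = <&CPU3>;
			};
			core3 {
				cpu = <&CPU4>;
			};
		};
	};
};
```

With one socket node at the top, the four cores should end up in a single
package regardless of how clusters are interpreted.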



> +
> +		CPU1: cpu@1 {
> +			device_type = "cpu";
> +			compatible = "sifive,rocket0", "riscv";
> +			reg = <0x1>;
> +		};
> +
> +		CPU2: cpu@2 {
> +			device_type = "cpu";
> +			compatible = "sifive,rocket0", "riscv";
> +			reg = <0x2>;
> +		};
> +		CPU3: cpu@3 {
> +			device_type = "cpu";
> +			compatible = "sifive,rocket0", "riscv";
> +			reg = <0x3>;
> +		};
> +		CPU4: cpu@4 {
> +			device_type = "cpu";
> +			compatible = "sifive,rocket0", "riscv";
> +			reg = <0x4>;
> +		};
> +	};
> +};
>   ===============================================================================
>   [1] ARM Linux kernel documentation
>       Documentation/devicetree/bindings/arm/cpus.yaml
>   [2] Devicetree NUMA binding description
>       Documentation/devicetree/bindings/numa.txt
> +[3] RISC-V Linux kernel documentation
> +    Documentation/devicetree/bindings/riscv/cpus.txt
> +[4] https://www.devicetree.org/specifications/
> 


* Re: [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V
  2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
                   ` (6 preceding siblings ...)
  2019-05-29 21:13 ` [PATCH v6 7/7] MAINTAINERS: Add an entry for generic architecture topology Atish Patra
@ 2019-05-30 21:12 ` Jeremy Linton
  2019-06-03  8:50   ` Atish Patra
  7 siblings, 1 reply; 25+ messages in thread
From: Jeremy Linton @ 2019-05-30 21:12 UTC (permalink / raw)
  To: Atish Patra, linux-kernel
  Cc: Albert Ou, Anup Patel, Catalin Marinas, David S. Miller,
	devicetree, Greg Kroah-Hartman, Ingo Molnar, Linus Walleij,
	linux-riscv, Mark Rutland, Mauro Carvalho Chehab,
	Morten Rasmussen, Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Sudeep Holla, Thomas Gleixner,
	Will Deacon, Russell King, linux-arm-kernel

Hi,

On 5/29/19 4:13 PM, Atish Patra wrote:
> The cpu-map DT entry in ARM can describe the CPU topology in much better
> way compared to other existing approaches. RISC-V can easily adopt this
> binding to represent its own CPU topology. Thus, both cpu-map DT
> binding and topology parsing code can be moved to a common location so
> that RISC-V or any other architecture can leverage that.
> 
> The relevant discussion regarding unifying cpu topology can be found in
> [1].
> 
> arch_topology seems to be a perfect place to move the common code. I
> have not introduced any significant functional changes in the moved code.
> The only downside in this approach is that the capacity code will be
> executed for RISC-V as well. But, it will exit immediately after not
> able to find the appropriate DT node. If the overhead is considered too
> much, we can always compile out capacity related functions under a
> different config for the architectures that do not support them.
> 
> There was an opportunity to unify topology data structure for ARM32 done
> by patch 3/4. But, I refrained from making any other changes as I am not
> very well versed with original intention for some functions that
> are present in arch_topology.c. I hope this patch series can be served
> as a baseline for such changes in the future.
> 
> The patches have been tested for RISC-V and compile tested for ARM64,
> ARM32 & x86.
>

I applied these to 5.2-rc2, along with my PPTT/MT change, and verified the
system and scheduler topology, etc. on DAWN and ThunderX2 using ACPI on
arm64. They appear to be working correctly.

So, for the series:
Tested-by: Jeremy Linton <jeremy.linton@arm.com>

The code itself looks fine to me as well:

Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>

Thanks!

> The socket change[2] is also now part of this series.
> 
> [1] https://lkml.org/lkml/2018/11/6/19
> [2] https://lkml.org/lkml/2018/11/7/918
> 
> QEMU changes for RISC-V topology are available at
> 
> https://github.com/atishp04/qemu/tree/riscv_topology_dt
> 
> HiFive Unleashed DT with topology node is available here.
> https://github.com/atishp04/opensbi/tree/HiFive_unleashed_topology
> 
> It can be verified with OpenSBI with following additional compile time
> option.
> 
> FW_PAYLOAD_FDT="unleashed_topology.dtb"
> 
> Changes from v5->v6
> 1. Added two more patches from Sudeep about maintainership of arch_topology.c
>     and Kconfig update.
> 2. Added Tested-by & Reviewed-by
> 3. Fixed a nit (reordering of variables)
> 
> Changes from v4-v5
> 1. Removed the arch_topology.h header inclusion from topology.c and arch_topology.c
> file. Added it in linux/topology.h.
> 2. core_id is set to -1 upon reset. Otherwise, ARM topology store function does not
> work.
> 
> Changes from v3->v4
> 1. Get rid of ARM32 specific information in topology structure.
> 2. Remove redundant functions from ARM32 and use common code instead.
> 
> Changes from v2->v3
> 1. Cover letter update with experiment DT for topology changes.
> 2. Added the patch for [2].
> 
> Changes from v1->v2
> 1. ARM32 can now use the common code as well.
> 
> Atish Patra (4):
> dt-binding: cpu-topology: Move cpu-map to a common binding.
> cpu-topology: Move cpu topology code to common code.
> arm: Use common cpu_topology structure and functions.
> RISC-V: Parse cpu topology during boot.
> 
> Sudeep Holla (3):
> Documentation: DT: arm: add support for sockets defining package
> boundaries
> base: arch_topology: update Kconfig help description
> MAINTAINERS: Add an entry for generic architecture topology
> 
> .../topology.txt => cpu/cpu-topology.txt}     | 134 ++++++--
> MAINTAINERS                                   |   7 +
> arch/arm/include/asm/topology.h               |  20 --
> arch/arm/kernel/topology.c                    |  60 +---
> arch/arm64/include/asm/topology.h             |  23 --
> arch/arm64/kernel/topology.c                  | 303 +-----------------
> arch/riscv/Kconfig                            |   1 +
> arch/riscv/kernel/smpboot.c                   |   3 +
> drivers/base/Kconfig                          |   2 +-
> drivers/base/arch_topology.c                  | 298 +++++++++++++++++
> include/linux/arch_topology.h                 |  26 ++
> include/linux/topology.h                      |   1 +
> 12 files changed, 452 insertions(+), 426 deletions(-)
> rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%)
> 
> --
> 2.21.0
> 


* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-30 11:51     ` Morten Rasmussen
  2019-05-30 12:56       ` Andrew F. Davis
@ 2019-05-30 21:42       ` Russell King - ARM Linux admin
  2019-05-31  9:37         ` Sudeep Holla
  1 sibling, 1 reply; 25+ messages in thread
From: Russell King - ARM Linux admin @ 2019-05-30 21:42 UTC (permalink / raw)
  To: Morten Rasmussen
  Cc: Andrew F. Davis, Atish Patra, linux-kernel, Sudeep Holla,
	Rob Herring, Albert Ou, Anup Patel, Catalin Marinas,
	David S. Miller, devicetree, Greg Kroah-Hartman, Ingo Molnar,
	Jeremy Linton, Linus Walleij, linux-riscv, Mark Rutland,
	Mauro Carvalho Chehab, Otto Sabart, Palmer Dabbelt,
	Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	linux-arm-kernel

On Thu, May 30, 2019 at 12:51:03PM +0100, Morten Rasmussen wrote:
> On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> > On 5/29/19 5:13 PM, Atish Patra wrote:
> > >From: Sudeep Holla <sudeep.holla@arm.com>
> > >
> > >The current ARM DT topology description provides the operating system
> > >with a topological view of the system that is based on leaf nodes
> > >representing either cores or threads (in an SMT system) and a
> > >hierarchical set of cluster nodes that creates a hierarchical topology
> > >view of how those cores and threads are grouped.
> > >
> > >However this hierarchical representation of clusters does not allow to
> > >describe what topology level actually represents the physical package or
> > >the socket boundary, which is a key piece of information to be used by
> > >an operating system to optimize resource allocation and scheduling.
> > >
> > 
> > Are physical package descriptions really needed? What does "socket" imply
> > that a higher layer "cluster" node grouping does not? It doesn't imply a
> > different NUMA distance and the definition of "socket" is already not well
> > defined, is a dual chiplet processor not just a fancy dual "socket" or are
> > dual "sockets" on a server board "slotket" card, will we need new names for
> > those too..
> 
> Socket (or package) just implies what you suggest, a grouping of CPUs
> based on the physical socket (or package). Some resources might be
> associated with packages and more importantly socket information is
> exposed to user-space. At the moment clusters are being exposed to
> user-space as sockets which is less than ideal for some topologies.

Please point out a 32-bit ARM system that has multiple "socket"s.

As far as I'm aware, no 32-bit systems have socketed CPUs
(modern ARM CPUs are part of a larger SoC), and the CPUs are always
in one package.

Even the test systems I've seen do not have socketed CPUs.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-30 21:42       ` Russell King - ARM Linux admin
@ 2019-05-31  9:37         ` Sudeep Holla
  2019-05-31  9:54           ` Morten Rasmussen
  0 siblings, 1 reply; 25+ messages in thread
From: Sudeep Holla @ 2019-05-31  9:37 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Morten Rasmussen, Andrew F. Davis, Atish Patra, linux-kernel,
	Rob Herring, Albert Ou, Anup Patel, Catalin Marinas,
	David S. Miller, devicetree, Greg Kroah-Hartman, Ingo Molnar,
	Jeremy Linton, Linus Walleij, linux-riscv, Mark Rutland,
	Mauro Carvalho Chehab, Otto Sabart, Palmer Dabbelt,
	Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Sudeep Holla, linux-arm-kernel

On Thu, May 30, 2019 at 10:42:54PM +0100, Russell King - ARM Linux admin wrote:
> On Thu, May 30, 2019 at 12:51:03PM +0100, Morten Rasmussen wrote:
> > On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> > > On 5/29/19 5:13 PM, Atish Patra wrote:
> > > >From: Sudeep Holla <sudeep.holla@arm.com>
> > > >
> > > >The current ARM DT topology description provides the operating system
> > > >with a topological view of the system that is based on leaf nodes
> > > >representing either cores or threads (in an SMT system) and a
> > > >hierarchical set of cluster nodes that creates a hierarchical topology
> > > >view of how those cores and threads are grouped.
> > > >
> > > >However this hierarchical representation of clusters does not allow to
> > > >describe what topology level actually represents the physical package or
> > > >the socket boundary, which is a key piece of information to be used by
> > > >an operating system to optimize resource allocation and scheduling.
> > > >
> > >
> > > Are physical package descriptions really needed? What does "socket" imply
> > > that a higher layer "cluster" node grouping does not? It doesn't imply a
> > > different NUMA distance and the definition of "socket" is already not well
> > > defined, is a dual chiplet processor not just a fancy dual "socket" or are
> > > dual "sockets" on a server board "slotket" card, will we need new names for
> > > those too..
> >
> > Socket (or package) just implies what you suggest, a grouping of CPUs
> > based on the physical socket (or package). Some resources might be
> > associated with packages and more importantly socket information is
> > exposed to user-space. At the moment clusters are being exposed to
> > user-space as sockets which is less than ideal for some topologies.
>
> Please point out a 32-bit ARM system that has multiple "socket"s.
>
> As far as I'm aware, all 32-bit systems do not have socketed CPUs
> (modern ARM CPUs are part of a larger SoC), and the CPUs are always
> in one package.
>
> Even the test systems I've seen do not have socketed CPUs.
>

As far as we know, there are none. So we simply have to assume all
those systems are single-socket systems (IOW, all CPUs reside inside a
single SoC package).

--
Regards,
Sudeep

* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-30 12:56       ` Andrew F. Davis
  2019-05-30 13:12         ` Morten Rasmussen
@ 2019-05-31  9:41         ` Sudeep Holla
  1 sibling, 0 replies; 25+ messages in thread
From: Sudeep Holla @ 2019-05-31  9:41 UTC (permalink / raw)
  To: Andrew F. Davis
  Cc: Morten Rasmussen, Atish Patra, linux-kernel, Rob Herring,
	Albert Ou, Anup Patel, Catalin Marinas, David S. Miller,
	devicetree, Greg Kroah-Hartman, Ingo Molnar, Jeremy Linton,
	Linus Walleij, linux-riscv, Mark Rutland, Mauro Carvalho Chehab,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	Russell King, Sudeep Holla, linux-arm-kernel

On Thu, May 30, 2019 at 08:56:03AM -0400, Andrew F. Davis wrote:
> On 5/30/19 7:51 AM, Morten Rasmussen wrote:
> > On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> > > On 5/29/19 5:13 PM, Atish Patra wrote:
> > > > From: Sudeep Holla <sudeep.holla@arm.com>
> > > >
> > > > The current ARM DT topology description provides the operating system
> > > > with a topological view of the system that is based on leaf nodes
> > > > representing either cores or threads (in an SMT system) and a
> > > > hierarchical set of cluster nodes that creates a hierarchical topology
> > > > view of how those cores and threads are grouped.
> > > >
> > > > However this hierarchical representation of clusters does not allow to
> > > > describe what topology level actually represents the physical package or
> > > > the socket boundary, which is a key piece of information to be used by
> > > > an operating system to optimize resource allocation and scheduling.
> > > >
> > >
> > > Are physical package descriptions really needed? What does "socket" imply
> > > that a higher layer "cluster" node grouping does not? It doesn't imply a
> > > different NUMA distance and the definition of "socket" is already not well
> > > defined, is a dual chiplet processor not just a fancy dual "socket" or are
> > > dual "sockets" on a server board "slotket" card, will we need new names for
> > > those too..
> >
> > Socket (or package) just implies what you suggest, a grouping of CPUs
> > based on the physical socket (or package). Some resources might be
> > associated with packages and more importantly socket information is
> > exposed to user-space. At the moment clusters are being exposed to
> > user-space as sockets which is less than ideal for some topologies.
> >
>
> I see the benefit of reporting the physical layout and packaging information
> to user-space for tracking reasons, but from software perspective this
> doesn't matter, and the resource partitioning should be described elsewhere
> (NUMA nodes being the go to example).
>
> > At the moment user-space is only told about hw threads, cores, and
> > sockets. In the very near future it is going to be told about dies too
> > (look for Len Brown's multi-die patch set).
> >
>
> Seems my hypothetical case is already in the works :(
>
> > I don't see how we can provide correct information to user-space based
> > on the current information in DT. I'm not convinced it was a good idea
> > to expose this information to user-space to begin with but that is
> > another discussion.
> >
>
> Fair enough, it's a little late now to un-expose this info to userspace so
> we should at least present it correctly. My worry was this getting out of
> hand with layering, for instance what happens when we need to add die nodes
> in-between cluster and socket?
>

We may have to, if there's a similar requirement on ARM64 as the one
addressed by Len Brown's multi-die patch set. But for now, no one has
asked for it.

--
Regards,
Sudeep

* Re: [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries
  2019-05-31  9:37         ` Sudeep Holla
@ 2019-05-31  9:54           ` Morten Rasmussen
  0 siblings, 0 replies; 25+ messages in thread
From: Morten Rasmussen @ 2019-05-31  9:54 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Russell King - ARM Linux admin, Andrew F. Davis, Atish Patra,
	linux-kernel, Rob Herring, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Otto Sabart, Palmer Dabbelt,
	Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	linux-arm-kernel

On Fri, May 31, 2019 at 10:37:43AM +0100, Sudeep Holla wrote:
> On Thu, May 30, 2019 at 10:42:54PM +0100, Russell King - ARM Linux admin wrote:
> > On Thu, May 30, 2019 at 12:51:03PM +0100, Morten Rasmussen wrote:
> > > On Wed, May 29, 2019 at 07:39:17PM -0400, Andrew F. Davis wrote:
> > > > On 5/29/19 5:13 PM, Atish Patra wrote:
> > > > >From: Sudeep Holla <sudeep.holla@arm.com>
> > > > >
> > > > >The current ARM DT topology description provides the operating system
> > > > >with a topological view of the system that is based on leaf nodes
> > > > >representing either cores or threads (in an SMT system) and a
> > > > >hierarchical set of cluster nodes that creates a hierarchical topology
> > > > >view of how those cores and threads are grouped.
> > > > >
> > > > >However this hierarchical representation of clusters does not allow to
> > > > >describe what topology level actually represents the physical package or
> > > > >the socket boundary, which is a key piece of information to be used by
> > > > >an operating system to optimize resource allocation and scheduling.
> > > > >
> > > >
> > > > Are physical package descriptions really needed? What does "socket" imply
> > > > that a higher layer "cluster" node grouping does not? It doesn't imply a
> > > > different NUMA distance and the definition of "socket" is already not well
> > > > defined, is a dual chiplet processor not just a fancy dual "socket" or are
> > > > dual "sockets" on a server board "slotket" card, will we need new names for
> > > > those too..
> > >
> > > Socket (or package) just implies what you suggest, a grouping of CPUs
> > > based on the physical socket (or package). Some resources might be
> > > associated with packages and more importantly socket information is
> > > exposed to user-space. At the moment clusters are being exposed to
> > > user-space as sockets which is less than ideal for some topologies.
> >
> > Please point out a 32-bit ARM system that has multiple "socket"s.
> >
> > As far as I'm aware, all 32-bit systems do not have socketed CPUs
> > (modern ARM CPUs are part of a larger SoC), and the CPUs are always
> > in one package.
> >
> > Even the test systems I've seen do not have socketed CPUs.
> >
> 
> As far as we know, there are none. So we simply have to assume all
> those systems are single-socket systems (IOW all CPUs reside inside a
> single SoC package).

Right, but we don't make that assumption. Clusters are reported as
sockets/packages for arm, just like they are for arm64. My comment above
applied to what can be described using DT, not what systems actually
exist. We need to be able to describe packages for architectures where we
can't make assumptions.

arm example (ARM TC2):
root@morras01-tc2:~# lstopo
Machine (985MB)
  Package L#0
    Core L#0 + PU L#0 (P#0)
    Core L#1 + PU L#1 (P#1)
  Package L#1
    Core L#2 + PU L#2 (P#2)
    Core L#3 + PU L#3 (P#3)
    Core L#4 + PU L#4 (P#4)
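
The Package lines above reflect what the kernel exports through sysfs;
tools such as lstopo and lscpu build their view from files like
topology/physical_package_id. A minimal sketch of reading that value
directly, assuming the usual /sys/devices/system/cpu layout (the helper
name read_package_id and its root-directory parameter are illustrative,
not something from this thread):

```c
#include <stdio.h>

/*
 * Read the package id that the kernel exports for one CPU.
 * sysfs_cpu_root is normally "/sys/devices/system/cpu"; it is a
 * parameter only so the helper can be pointed at a test directory.
 * Returns the id, or -1 if the file is missing or cannot be parsed.
 * Note that the kernel itself may report -1 when the id is unknown.
 */
int read_package_id(const char *sysfs_cpu_root, int cpu)
{
	char path[256];
	FILE *f;
	int id = -1;

	snprintf(path, sizeof(path), "%s/cpu%d/topology/physical_package_id",
		 sysfs_cpu_root, cpu);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &id) != 1)
		id = -1;
	fclose(f);
	return id;
}
```

Calling read_package_id("/sys/devices/system/cpu", n) for each CPU on the
TC2 above would be expected to return two distinct ids, one per cluster,
which is the behaviour being discussed.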

Morten

* Re: [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding.
  2019-05-30 20:55   ` Jeremy Linton
@ 2019-06-03  8:49     ` Atish Patra
  2019-06-03  9:05       ` Sudeep Holla
  0 siblings, 1 reply; 25+ messages in thread
From: Atish Patra @ 2019-06-03  8:49 UTC (permalink / raw)
  To: Jeremy Linton, linux-kernel
  Cc: Mark Rutland, Rafael J. Wysocki, Peter Zijlstra (Intel),
	Catalin Marinas, Linus Walleij, Palmer Dabbelt, Will Deacon,
	Mauro Carvalho Chehab, linux-riscv, Morten Rasmussen,
	Rob Herring, Anup Patel, Russell King, Ingo Molnar, devicetree,
	Albert Ou, Rob Herring, Paul Walmsley, Thomas Gleixner,
	linux-arm-kernel, Greg Kroah-Hartman, Otto Sabart, Sudeep Holla,
	David S. Miller

On 5/30/19 1:55 PM, Jeremy Linton wrote:
> Hi,
> 
> On 5/29/19 4:13 PM, Atish Patra wrote:
>> cpu-map binding can be used to described cpu topology for both
>> RISC-V & ARM. It makes more sense to move the binding to document
>> to a common place.
>>
>> The relevant discussion can be found here.
>> https://lkml.org/lkml/2018/11/6/19
>>
>> Signed-off-by: Atish Patra <atish.patra@wdc.com>
>> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
>> Reviewed-by: Rob Herring <robh@kernel.org>
>> ---
>>    .../topology.txt => cpu/cpu-topology.txt}     | 82 +++++++++++++++----
>>    1 file changed, 66 insertions(+), 16 deletions(-)
>>    rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (86%)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
>> similarity index 86%
>> rename from Documentation/devicetree/bindings/arm/topology.txt
>> rename to Documentation/devicetree/bindings/cpu/cpu-topology.txt
>> index 3b8febb46dad..069addccab14 100644
>> --- a/Documentation/devicetree/bindings/arm/topology.txt
>> +++ b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
>> @@ -1,12 +1,12 @@
>>    ===========================================
>> -ARM topology binding description
>> +CPU topology binding description
>>    ===========================================
>>    
>>    ===========================================
>>    1 - Introduction
>>    ===========================================
>>    
>> -In an ARM system, the hierarchy of CPUs is defined through three entities that
>> +In a SMP system, the hierarchy of CPUs is defined through three entities that
>>    are used to describe the layout of physical CPUs in the system:
>>    
>>    - socket
>> @@ -14,9 +14,6 @@ are used to describe the layout of physical CPUs in the system:
>>    - core
>>    - thread
>>    
>> -The cpu nodes (bindings defined in [1]) represent the devices that
>> -correspond to physical CPUs and are to be mapped to the hierarchy levels.
>> -
>>    The bottom hierarchy level sits at core or thread level depending on whether
>>    symmetric multi-threading (SMT) is supported or not.
>>    
>> @@ -25,33 +22,31 @@ threads existing in the system and map to the hierarchy level "thread" above.
>>    In systems where SMT is not supported "cpu" nodes represent all cores present
>>    in the system and map to the hierarchy level "core" above.
>>    
>> -ARM topology bindings allow one to associate cpu nodes with hierarchical groups
>> +CPU topology bindings allow one to associate cpu nodes with hierarchical groups
>>    corresponding to the system hierarchy; syntactically they are defined as device
>>    tree nodes.
>>    
>> -The remainder of this document provides the topology bindings for ARM, based
>> -on the Devicetree Specification, available from:
>> +Currently, only ARM/RISC-V intend to use this cpu topology binding but it may be
>> +used for any other architecture as well.
>>    
>> -https://www.devicetree.org/specifications/
>> +The cpu nodes, as per bindings defined in [4], represent the devices that
>> +correspond to physical CPUs and are to be mapped to the hierarchy levels.
>>    
>> -If not stated otherwise, whenever a reference to a cpu node phandle is made its
>> -value must point to a cpu node compliant with the cpu node bindings as
>> -documented in [1].
>>    A topology description containing phandles to cpu nodes that are not compliant
>> -with bindings standardized in [1] is therefore considered invalid.
>> +with bindings standardized in [4] is therefore considered invalid.
>>    
>>    ===========================================
>>    2 - cpu-map node
>>    ===========================================
>>    
>> -The ARM CPU topology is defined within the cpu-map node, which is a direct
>> +The ARM/RISC-V CPU topology is defined within the cpu-map node, which is a direct
>>    child of the cpus node and provides a container where the actual topology
>>    nodes are listed.
>>    
>>    - cpu-map node
>>    
>> -	Usage: Optional - On ARM SMP systems provide CPUs topology to the OS.
>> -			  ARM uniprocessor systems do not require a topology
>> +	Usage: Optional - On SMP systems provide CPUs topology to the OS.
>> +			  Uniprocessor systems do not require a topology
>>    			  description and therefore should not define a
>>    			  cpu-map node.
>>    
>> @@ -494,8 +489,63 @@ cpus {
>>    	};
>>    };
>>    
>> +Example 3: HiFive Unleashed (RISC-V 64 bit, 4 core system)
>> +
>> +{
>> +	#address-cells = <2>;
>> +	#size-cells = <2>;
>> +	compatible = "sifive,fu540g", "sifive,fu500";
>> +	model = "sifive,hifive-unleashed-a00";
>> +
>> +	...
>> +	cpus {
>> +		#address-cells = <1>;
>> +		#size-cells = <0>;
>> +		cpu-map {
>> +			cluster0 {
>> +				core0 {
>> +					cpu = <&CPU1>;
>> +				};
>> +				core1 {
>> +					cpu = <&CPU2>;
>> +				};
>> +				core2 {
>> +					cpu0 = <&CPU2>;
>> +				};
>> +				core3 {
>> +					cpu0 = <&CPU3>;
>> +				};
>> +			};
>> +		};
> 
> 
> <nit picking>
> 
> While socket is optional, it's probably a good idea to include the node
> in the example even if the result is the same. That is because at least
> on arm64 the DT clusters=sockets decision had performance implications
> for larger systems.

Sure. I will update that.

> 
> Assuring the socket information is correct is helpful by itself to avoid
> having to explain why a single socket machine is displaying some other
> value in lscpu.
> 
Just for my understanding, can you give an example?

> 
> 
>> +
>> +		CPU1: cpu@1 {
>> +			device_type = "cpu";
>> +			compatible = "sifive,rocket0", "riscv";
>> +			reg = <0x1>;
>> +		}
>> +
>> +		CPU2: cpu@2 {
>> +			device_type = "cpu";
>> +			compatible = "sifive,rocket0", "riscv";
>> +			reg = <0x2>;
>> +		}
>> +		CPU3: cpu@3 {
>> +			device_type = "cpu";
>> +			compatible = "sifive,rocket0", "riscv";
>> +			reg = <0x3>;
>> +		}
>> +		CPU4: cpu@4 {
>> +			device_type = "cpu";
>> +			compatible = "sifive,rocket0", "riscv";
>> +			reg = <0x4>;
>> +		}
>> +	}
>> +};
>>    ===============================================================================
>>    [1] ARM Linux kernel documentation
>>        Documentation/devicetree/bindings/arm/cpus.yaml
>>    [2] Devicetree NUMA binding description
>>        Documentation/devicetree/bindings/numa.txt
>> +[3] RISC-V Linux kernel documentation
>> +    Documentation/devicetree/bindings/riscv/cpus.txt
>> +[4] https://www.devicetree.org/specifications/
>>
> 
> 


-- 
Regards,
Atish

* Re: [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V
  2019-05-30 21:12 ` [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Jeremy Linton
@ 2019-06-03  8:50   ` Atish Patra
  0 siblings, 0 replies; 25+ messages in thread
From: Atish Patra @ 2019-06-03  8:50 UTC (permalink / raw)
  To: Jeremy Linton, linux-kernel
  Cc: Albert Ou, Anup Patel, Catalin Marinas, David S. Miller,
	devicetree, Greg Kroah-Hartman, Ingo Molnar, Linus Walleij,
	linux-riscv, Mark Rutland, Mauro Carvalho Chehab,
	Morten Rasmussen, Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Sudeep Holla, Thomas Gleixner,
	Will Deacon, Russell King, linux-arm-kernel

On 5/30/19 2:12 PM, Jeremy Linton wrote:
> Hi,
> 
> On 5/29/19 4:13 PM, Atish Patra wrote:
>> The cpu-map DT entry in ARM can describe the CPU topology in much better
>> way compared to other existing approaches. RISC-V can easily adopt this
>> binding to represent its own CPU topology. Thus, both cpu-map DT
>> binding and topology parsing code can be moved to a common location so
>> that RISC-V or any other architecture can leverage that.
>>
>> The relevant discussion regarding unifying cpu topology can be found in
>> [1].
>>
>> arch_topology seems to be a perfect place to move the common code. I
>> have not introduced any significant functional changes in the moved code.
>> The only downside in this approach is that the capacity code will be
>> executed for RISC-V as well. But, it will exit immediately after not
>> able to find the appropriate DT node. If the overhead is considered too
>> much, we can always compile out capacity related functions under a
>> different config for the architectures that do not support them.
>>
>> There was an opportunity to unify topology data structure for ARM32 done
>> by patch 3/4. But, I refrained from making any other changes as I am not
>> very well versed with original intention for some functions that
>> are present in arch_topology.c. I hope this patch series can be served
>> as a baseline for such changes in the future.
>>
>> The patches have been tested for RISC-V and compile tested for ARM64,
>> ARM32 & x86.
>>
> 
> I applied these to 5.2rc2, along with my PPTT/MT change and verified the
> system & scheduler topology/etc on DAWN and ThunderX2 using ACPI on
> arm64. They appear to be working correctly.
> 
> so for the series,
> Tested-by: Jeremy Linton <jeremy.linton@arm.com>
> 
> The code itself looks fine to me as well:
> 
> Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
> 
Thank you for the review and for testing on an arm64 server.

> Thanks!
> 
>> The socket change[2] is also now part of this series.
>>
>> [1] https://lkml.org/lkml/2018/11/6/19
>> [2] https://lkml.org/lkml/2018/11/7/918
>>
>> QEMU changes for RISC-V topology are available at
>>
>> https://github.com/atishp04/qemu/tree/riscv_topology_dt
>>
>> HiFive Unleashed DT with topology node is available here.
>> https://github.com/atishp04/opensbi/tree/HiFive_unleashed_topology
>>
>> It can be verified with OpenSBI with following additional compile time
>> option.
>>
>> FW_PAYLOAD_FDT="unleashed_topology.dtb"
>>
>> Changes from v5->v6
>> 1. Added two more patches from Sudeep about maintainership of arch_topology.c
>>      and Kconfig update.
>> 2. Added Tested-by & Reviewed-by
>> 3. Fixed a nit (reordering of variables)
>>
>> Changes from v4-v5
>> 1. Removed the arch_topology.h header inclusion from topology.c and arch_topology.c
>> file. Added it in linux/topology.h.
>> 2. core_id is set to -1 upon reset. Otherwise, ARM topology store function does not
>> work.
>>
>> Changes from v3->v4
>> 1. Get rid of ARM32 specific information in topology structure.
>> 2. Remove redundant functions from ARM32 and use common code instead.
>>
>> Changes from v2->v3
>> 1. Cover letter update with experiment DT for topology changes.
>> 2. Added the patch for [2].
>>
>> Changes from v1->v2
>> 1. ARM32 can now use the common code as well.
>>
>> Atish Patra (4):
>> dt-binding: cpu-topology: Move cpu-map to a common binding.
>> cpu-topology: Move cpu topology code to common code.
>> arm: Use common cpu_topology structure and functions.
>> RISC-V: Parse cpu topology during boot.
>>
>> Sudeep Holla (3):
>> Documentation: DT: arm: add support for sockets defining package
>> boundaries
>> base: arch_topology: update Kconfig help description
>> MAINTAINERS: Add an entry for generic architecture topology
>>
>> .../topology.txt => cpu/cpu-topology.txt}     | 134 ++++++--
>> MAINTAINERS                                   |   7 +
>> arch/arm/include/asm/topology.h               |  20 --
>> arch/arm/kernel/topology.c                    |  60 +---
>> arch/arm64/include/asm/topology.h             |  23 --
>> arch/arm64/kernel/topology.c                  | 303 +-----------------
>> arch/riscv/Kconfig                            |   1 +
>> arch/riscv/kernel/smpboot.c                   |   3 +
>> drivers/base/Kconfig                          |   2 +-
>> drivers/base/arch_topology.c                  | 298 +++++++++++++++++
>> include/linux/arch_topology.h                 |  26 ++
>> include/linux/topology.h                      |   1 +
>> 12 files changed, 452 insertions(+), 426 deletions(-)
>> rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%)
>>
>> --
>> 2.21.0
>>
> 
> 


-- 
Regards,
Atish

* Re: [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding.
  2019-06-03  8:49     ` Atish Patra
@ 2019-06-03  9:05       ` Sudeep Holla
  0 siblings, 0 replies; 25+ messages in thread
From: Sudeep Holla @ 2019-06-03  9:05 UTC (permalink / raw)
  To: Atish Patra
  Cc: Jeremy Linton, linux-kernel, Mark Rutland, Rafael J. Wysocki,
	Peter Zijlstra (Intel),
	Catalin Marinas, Linus Walleij, Palmer Dabbelt, Will Deacon,
	Mauro Carvalho Chehab, linux-riscv, Morten Rasmussen,
	Rob Herring, Anup Patel, Russell King, Ingo Molnar, devicetree,
	Albert Ou, Rob Herring, Paul Walmsley, Thomas Gleixner,
	linux-arm-kernel, Greg Kroah-Hartman, Otto Sabart, Sudeep Holla,
	David S. Miller

On Mon, Jun 03, 2019 at 01:49:13AM -0700, Atish Patra wrote:
> On 5/30/19 1:55 PM, Jeremy Linton wrote:
> > Hi,
> >
> > On 5/29/19 4:13 PM, Atish Patra wrote:
> > > cpu-map binding can be used to described cpu topology for both
> > > RISC-V & ARM. It makes more sense to move the binding to document
> > > to a common place.
> > >
> > > The relevant discussion can be found here.
> > > https://lkml.org/lkml/2018/11/6/19
> > >
> > > Signed-off-by: Atish Patra <atish.patra@wdc.com>
> > > Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> > > Reviewed-by: Rob Herring <robh@kernel.org>
> > > ---
> > >    .../topology.txt => cpu/cpu-topology.txt}     | 82 +++++++++++++++----
> > >    1 file changed, 66 insertions(+), 16 deletions(-)
> > >    rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (86%)
> > >

[...]

> > <nit picking>
> >
> > While socket is optional, it's probably a good idea to include the node
> > in the example even if the result is the same. That is because at least
> > on arm64 the DT clusters=sockets decision had performance implications
> > for larger systems.
>
> Sure. I will update that.
> >
> > Assuring the socket information is correct is helpful by itself to avoid
> > having to explain why a single socket machine is displaying some other
> > value in lscpu.
> >
> Just for my understanding, can you give an example?
>

That's simple. Today any ARM{32,64} DT-based platform uses its cluster
id as the physical package id, which is exposed to userspace. Userspace
can/must interpret that as a multi-socket system. E.g. TC2/Juno, which
have 2 clusters, show up as 2-socket systems, which is wrong and needs
fixing. We have fixed it for ARM64 ACPI-based systems, but for DT (mostly
used in mobile/embedded) we need to make sure we don't break anything
else before we fix it.
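
To make the effect concrete, here is a rough user-space model of the
numbering, based on the "if (leaf) package_id++" step in the arm64
parse_cluster() code quoted elsewhere in this thread. It is only a
sketch: assign_package_ids() and its array-based input are hypothetical
stand-ins for walking cpu-map, and on 32-bit ARM the id is derived from
MPIDR rather than from cpu-map, although the outcome is the same.

```c
#include <stddef.h>

/*
 * Hypothetical user-space model, not kernel code: each entry of
 * cores_per_cluster[] stands for one leaf cluster node in cpu-map.
 * Fills package_of[cpu] for every core and returns the number of
 * packages reported, or -1 if package_of[] is too small.
 */
int assign_package_ids(const int *cores_per_cluster, size_t nr_clusters,
		       int *package_of, size_t max_cpus)
{
	int package_id = 0;
	size_t cpu = 0;

	for (size_t c = 0; c < nr_clusters; c++) {
		for (int i = 0; i < cores_per_cluster[c]; i++) {
			if (cpu >= max_cpus)
				return -1;	/* output array too small */
			package_of[cpu++] = package_id;
		}
		package_id++;	/* every leaf cluster becomes a new package */
	}
	return package_id;
}
```

With a TC2-like layout of two clusters holding 2 and 3 cores, this model
reports two packages even though all five cores sit in one SoC.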

--
Regards,
Sudeep

* Re: [PATCH v6 4/7] arm: Use common cpu_topology structure and functions.
  2019-05-29 21:13 ` [PATCH v6 4/7] arm: Use common cpu_topology structure and functions Atish Patra
@ 2019-06-06 14:25   ` Atish Patra
  0 siblings, 0 replies; 25+ messages in thread
From: Atish Patra @ 2019-06-06 14:25 UTC (permalink / raw)
  To: Russell King
  Cc: linux-kernel, Sudeep Holla, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	linux-arm-kernel

On 5/29/19 2:15 PM, Atish Patra wrote:
> Currently, ARM32 and ARM64 uses different data structures to represent
> their cpu topologies. Since, we are moving the ARM64 topology to common
> code to be used by other architectures, we can reuse that for ARM32 as
> well.
> 
> Take this opprtunity to remove the redundant functions from ARM32 and
> reuse the common code instead.
> 
> To: Russell King <linux@armlinux.org.uk>
> Signed-off-by: Atish Patra <atish.patra@wdc.com>
> Tested-by: Sudeep Holla <sudeep.holla@arm.com> (on TC2)
> Reviewed-by : Sudeep Holla <sudeep.holla@arm.com>
> 
> ---
> Hi Russell,
> Can we get a ACK for this patch ? We are hoping that the entire
> series can be merged at one go.
> ---
>   arch/arm/include/asm/topology.h | 20 -----------
>   arch/arm/kernel/topology.c      | 60 ++++-----------------------------
>   drivers/base/arch_topology.c    |  4 ++-
>   include/linux/arch_topology.h   |  6 ++--
>   4 files changed, 11 insertions(+), 79 deletions(-)
> 
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index 2a786f54d8b8..8a0fae94d45e 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -5,26 +5,6 @@
>   #ifdef CONFIG_ARM_CPU_TOPOLOGY
>   
>   #include <linux/cpumask.h>
> -
> -struct cputopo_arm {
> -	int thread_id;
> -	int core_id;
> -	int socket_id;
> -	cpumask_t thread_sibling;
> -	cpumask_t core_sibling;
> -};
> -
> -extern struct cputopo_arm cpu_topology[NR_CPUS];
> -
> -#define topology_physical_package_id(cpu)	(cpu_topology[cpu].socket_id)
> -#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
> -#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
> -#define topology_sibling_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
> -
> -void init_cpu_topology(void);
> -void store_cpu_topology(unsigned int cpuid);
> -const struct cpumask *cpu_coregroup_mask(int cpu);
> -
>   #include <linux/arch_topology.h>
>   
>   /* Replace task scheduler's default frequency-invariant accounting */
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 60e375ce1ab2..238f1da0219c 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -177,17 +177,6 @@ static inline void parse_dt_topology(void) {}
>   static inline void update_cpu_capacity(unsigned int cpuid) {}
>   #endif
>   
> - /*
> - * cpu topology table
> - */
> -struct cputopo_arm cpu_topology[NR_CPUS];
> -EXPORT_SYMBOL_GPL(cpu_topology);
> -
> -const struct cpumask *cpu_coregroup_mask(int cpu)
> -{
> -	return &cpu_topology[cpu].core_sibling;
> -}
> -
>   /*
>    * The current assumption is that we can power gate each core independently.
>    * This will be superseded by DT binding once available.
> @@ -197,32 +186,6 @@ const struct cpumask *cpu_corepower_mask(int cpu)
>   	return &cpu_topology[cpu].thread_sibling;
>   }
>   
> -static void update_siblings_masks(unsigned int cpuid)
> -{
> -	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> -	int cpu;
> -
> -	/* update core and thread sibling masks */
> -	for_each_possible_cpu(cpu) {
> -		cpu_topo = &cpu_topology[cpu];
> -
> -		if (cpuid_topo->socket_id != cpu_topo->socket_id)
> -			continue;
> -
> -		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> -		if (cpu != cpuid)
> -			cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
> -
> -		if (cpuid_topo->core_id != cpu_topo->core_id)
> -			continue;
> -
> -		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
> -		if (cpu != cpuid)
> -			cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
> -	}
> -	smp_wmb();
> -}
> -
>   /*
>    * store_cpu_topology is called at boot when only one cpu is running
>    * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
> @@ -230,7 +193,7 @@ static void update_siblings_masks(unsigned int cpuid)
>    */
>   void store_cpu_topology(unsigned int cpuid)
>   {
> -	struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
> +	struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
>   	unsigned int mpidr;
>   
>   	/* If the cpu topology has been already set, just return */
> @@ -250,12 +213,12 @@ void store_cpu_topology(unsigned int cpuid)
>   			/* core performance interdependency */
>   			cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>   			cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> -			cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
> +			cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
>   		} else {
>   			/* largely independent cores */
>   			cpuid_topo->thread_id = -1;
>   			cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> -			cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +			cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>   		}
>   	} else {
>   		/*
> @@ -265,7 +228,7 @@ void store_cpu_topology(unsigned int cpuid)
>   		 */
>   		cpuid_topo->thread_id = -1;
>   		cpuid_topo->core_id = 0;
> -		cpuid_topo->socket_id = -1;
> +		cpuid_topo->package_id = -1;
>   	}
>   
>   	update_siblings_masks(cpuid);
> @@ -275,7 +238,7 @@ void store_cpu_topology(unsigned int cpuid)
>   	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
>   		cpuid, cpu_topology[cpuid].thread_id,
>   		cpu_topology[cpuid].core_id,
> -		cpu_topology[cpuid].socket_id, mpidr);
> +		cpu_topology[cpuid].package_id, mpidr);
>   }
>   
>   static inline int cpu_corepower_flags(void)
> @@ -298,18 +261,7 @@ static struct sched_domain_topology_level arm_topology[] = {
>    */
>   void __init init_cpu_topology(void)
>   {
> -	unsigned int cpu;
> -
> -	/* init core mask and capacity */
> -	for_each_possible_cpu(cpu) {
> -		struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
> -
> -		cpu_topo->thread_id = -1;
> -		cpu_topo->core_id =  -1;
> -		cpu_topo->socket_id = -1;
> -		cpumask_clear(&cpu_topo->core_sibling);
> -		cpumask_clear(&cpu_topo->thread_sibling);
> -	}
> +	reset_cpu_topology();
>   	smp_wmb();
>   
>   	parse_dt_topology();
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 5781bb4c457c..797e3cd71bea 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -426,6 +426,7 @@ static int __init parse_dt_topology(void)
>   	of_node_put(cn);
>   	return ret;
>   }
> +#endif
>   
>   /*
>    * cpu topology table
> @@ -491,7 +492,7 @@ static void clear_cpu_topology(int cpu)
>   	cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
>   }
>   
> -static void __init reset_cpu_topology(void)
> +void __init reset_cpu_topology(void)
>   {
>   	unsigned int cpu;
>   
> @@ -526,6 +527,7 @@ __weak int __init parse_acpi_topology(void)
>   	return 0;
>   }
>   
> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>   void __init init_cpu_topology(void)
>   {
>   	reset_cpu_topology();
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index d4e76e0a283f..d4311127970d 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -54,11 +54,9 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
>   void init_cpu_topology(void);
>   void store_cpu_topology(unsigned int cpuid);
>   const struct cpumask *cpu_coregroup_mask(int cpu);
> -#endif
> -
> -#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>   void update_siblings_masks(unsigned int cpu);
> -#endif
>   void remove_cpu_topology(unsigned int cpuid);
> +void reset_cpu_topology(void);
> +#endif
>   
>   #endif /* _LINUX_ARCH_TOPOLOGY_H_ */
> 
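
For reference, the sibling-mask update that this patch drops from
arch/arm in favour of the common helper can be modelled in user space
roughly as follows. This is only a sketch under stated assumptions:
plain 32-bit masks stand in for cpumask_t (so at most 32 CPUs), struct
cpu_topo and update_siblings() are made-up names, the memory barrier is
omitted, and every entry is assumed to be initialised before the call.

```c
#include <stdint.h>

/* Simplified stand-in for the kernel's struct cpu_topology. */
struct cpu_topo {
	int core_id;
	int package_id;
	uint32_t thread_sibling;	/* bit N set: CPU N shares the core */
	uint32_t core_sibling;		/* bit N set: CPU N shares the package */
};

/* Record cpuid in the masks of every CPU that shares its package/core. */
void update_siblings(struct cpu_topo *topo, unsigned int nr_cpus,
		     unsigned int cpuid)
{
	struct cpu_topo *self = &topo[cpuid];

	for (unsigned int cpu = 0; cpu < nr_cpus; cpu++) {
		struct cpu_topo *other = &topo[cpu];

		/* Different package: not siblings at any level. */
		if (self->package_id != other->package_id)
			continue;
		other->core_sibling |= UINT32_C(1) << cpuid;
		self->core_sibling |= UINT32_C(1) << cpu;

		/* Same package but different core: no thread siblings. */
		if (self->core_id != other->core_id)
			continue;
		other->thread_sibling |= UINT32_C(1) << cpuid;
		self->thread_sibling |= UINT32_C(1) << cpu;
	}
}
```

The only field name that differs between the old ARM32 structure and the
common one in this logic is socket_id versus package_id, which is why the
rename in store_cpu_topology() is enough to reuse the shared code.
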
Hi Russell,
Can we get an ACK for ARM if you don't have any objection to the series?

-- 
Regards,
Atish

* Re: [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code.
  2019-05-29 21:13 ` [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code Atish Patra
@ 2019-06-06 14:26   ` Atish Patra
  2019-06-11 15:55   ` Will Deacon
  1 sibling, 0 replies; 25+ messages in thread
From: Atish Patra @ 2019-06-06 14:26 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: linux-kernel, Jeffrey Hugo, Sudeep Holla, Albert Ou, Anup Patel,
	David S. Miller, devicetree, Greg Kroah-Hartman, Ingo Molnar,
	Jeremy Linton, Linus Walleij, linux-riscv, Mark Rutland,
	Mauro Carvalho Chehab, Morten Rasmussen, Otto Sabart,
	Palmer Dabbelt, Paul Walmsley, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Russell King,
	linux-arm-kernel

On 5/29/19 2:15 PM, Atish Patra wrote:
> Both RISC-V & ARM64 are using cpu-map device tree to describe
> their cpu topology. It's better to move the relevant code to
> a common place instead of duplicate code.
> 
> To: Will Deacon <will.deacon@arm.com>
> To: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Atish Patra <atish.patra@wdc.com>
> [Tested on QDF2400]
> Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
> [Tested on Juno and other embedded platforms.]
> Tested-by: Sudeep Holla <sudeep.holla@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> 
> ---
> Hi Will/Catalin,
> Can we get ack for this patch ? We are hoping to get the entire series
> merged at one go.
> ---
>   arch/arm64/include/asm/topology.h |  23 ---
>   arch/arm64/kernel/topology.c      | 303 +-----------------------------
>   drivers/base/arch_topology.c      | 296 +++++++++++++++++++++++++++++
>   include/linux/arch_topology.h     |  28 +++
>   include/linux/topology.h          |   1 +
>   5 files changed, 329 insertions(+), 322 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index 0524f2438649..a4d945db95a2 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -4,29 +4,6 @@
>   
>   #include <linux/cpumask.h>
>   
> -struct cpu_topology {
> -	int thread_id;
> -	int core_id;
> -	int package_id;
> -	int llc_id;
> -	cpumask_t thread_sibling;
> -	cpumask_t core_sibling;
> -	cpumask_t llc_sibling;
> -};
> -
> -extern struct cpu_topology cpu_topology[NR_CPUS];
> -
> -#define topology_physical_package_id(cpu)	(cpu_topology[cpu].package_id)
> -#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
> -#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
> -#define topology_sibling_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
> -#define topology_llc_cpumask(cpu)	(&cpu_topology[cpu].llc_sibling)
> -
> -void init_cpu_topology(void);
> -void store_cpu_topology(unsigned int cpuid);
> -void remove_cpu_topology(unsigned int cpuid);
> -const struct cpumask *cpu_coregroup_mask(int cpu);
> -
>   #ifdef CONFIG_NUMA
>   
>   struct pci_bus;
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 0825c4a856e3..6b95c91e7d67 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -14,250 +14,13 @@
>   #include <linux/acpi.h>
>   #include <linux/arch_topology.h>
>   #include <linux/cacheinfo.h>
> -#include <linux/cpu.h>
> -#include <linux/cpumask.h>
>   #include <linux/init.h>
>   #include <linux/percpu.h>
> -#include <linux/node.h>
> -#include <linux/nodemask.h>
> -#include <linux/of.h>
> -#include <linux/sched.h>
> -#include <linux/sched/topology.h>
> -#include <linux/slab.h>
> -#include <linux/smp.h>
> -#include <linux/string.h>
>   
>   #include <asm/cpu.h>
>   #include <asm/cputype.h>
>   #include <asm/topology.h>
>   
> -static int __init get_cpu_for_node(struct device_node *node)
> -{
> -	struct device_node *cpu_node;
> -	int cpu;
> -
> -	cpu_node = of_parse_phandle(node, "cpu", 0);
> -	if (!cpu_node)
> -		return -1;
> -
> -	cpu = of_cpu_node_to_id(cpu_node);
> -	if (cpu >= 0)
> -		topology_parse_cpu_capacity(cpu_node, cpu);
> -	else
> -		pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
> -
> -	of_node_put(cpu_node);
> -	return cpu;
> -}
> -
> -static int __init parse_core(struct device_node *core, int package_id,
> -			     int core_id)
> -{
> -	char name[10];
> -	bool leaf = true;
> -	int i = 0;
> -	int cpu;
> -	struct device_node *t;
> -
> -	do {
> -		snprintf(name, sizeof(name), "thread%d", i);
> -		t = of_get_child_by_name(core, name);
> -		if (t) {
> -			leaf = false;
> -			cpu = get_cpu_for_node(t);
> -			if (cpu >= 0) {
> -				cpu_topology[cpu].package_id = package_id;
> -				cpu_topology[cpu].core_id = core_id;
> -				cpu_topology[cpu].thread_id = i;
> -			} else {
> -				pr_err("%pOF: Can't get CPU for thread\n",
> -				       t);
> -				of_node_put(t);
> -				return -EINVAL;
> -			}
> -			of_node_put(t);
> -		}
> -		i++;
> -	} while (t);
> -
> -	cpu = get_cpu_for_node(core);
> -	if (cpu >= 0) {
> -		if (!leaf) {
> -			pr_err("%pOF: Core has both threads and CPU\n",
> -			       core);
> -			return -EINVAL;
> -		}
> -
> -		cpu_topology[cpu].package_id = package_id;
> -		cpu_topology[cpu].core_id = core_id;
> -	} else if (leaf) {
> -		pr_err("%pOF: Can't get CPU for leaf core\n", core);
> -		return -EINVAL;
> -	}
> -
> -	return 0;
> -}
> -
> -static int __init parse_cluster(struct device_node *cluster, int depth)
> -{
> -	char name[10];
> -	bool leaf = true;
> -	bool has_cores = false;
> -	struct device_node *c;
> -	static int package_id __initdata;
> -	int core_id = 0;
> -	int i, ret;
> -
> -	/*
> -	 * First check for child clusters; we currently ignore any
> -	 * information about the nesting of clusters and present the
> -	 * scheduler with a flat list of them.
> -	 */
> -	i = 0;
> -	do {
> -		snprintf(name, sizeof(name), "cluster%d", i);
> -		c = of_get_child_by_name(cluster, name);
> -		if (c) {
> -			leaf = false;
> -			ret = parse_cluster(c, depth + 1);
> -			of_node_put(c);
> -			if (ret != 0)
> -				return ret;
> -		}
> -		i++;
> -	} while (c);
> -
> -	/* Now check for cores */
> -	i = 0;
> -	do {
> -		snprintf(name, sizeof(name), "core%d", i);
> -		c = of_get_child_by_name(cluster, name);
> -		if (c) {
> -			has_cores = true;
> -
> -			if (depth == 0) {
> -				pr_err("%pOF: cpu-map children should be clusters\n",
> -				       c);
> -				of_node_put(c);
> -				return -EINVAL;
> -			}
> -
> -			if (leaf) {
> -				ret = parse_core(c, package_id, core_id++);
> -			} else {
> -				pr_err("%pOF: Non-leaf cluster with core %s\n",
> -				       cluster, name);
> -				ret = -EINVAL;
> -			}
> -
> -			of_node_put(c);
> -			if (ret != 0)
> -				return ret;
> -		}
> -		i++;
> -	} while (c);
> -
> -	if (leaf && !has_cores)
> -		pr_warn("%pOF: empty cluster\n", cluster);
> -
> -	if (leaf)
> -		package_id++;
> -
> -	return 0;
> -}
> -
> -static int __init parse_dt_topology(void)
> -{
> -	struct device_node *cn, *map;
> -	int ret = 0;
> -	int cpu;
> -
> -	cn = of_find_node_by_path("/cpus");
> -	if (!cn) {
> -		pr_err("No CPU information found in DT\n");
> -		return 0;
> -	}
> -
> -	/*
> -	 * When topology is provided cpu-map is essentially a root
> -	 * cluster with restricted subnodes.
> -	 */
> -	map = of_get_child_by_name(cn, "cpu-map");
> -	if (!map)
> -		goto out;
> -
> -	ret = parse_cluster(map, 0);
> -	if (ret != 0)
> -		goto out_map;
> -
> -	topology_normalize_cpu_scale();
> -
> -	/*
> -	 * Check that all cores are in the topology; the SMP code will
> -	 * only mark cores described in the DT as possible.
> -	 */
> -	for_each_possible_cpu(cpu)
> -		if (cpu_topology[cpu].package_id == -1)
> -			ret = -EINVAL;
> -
> -out_map:
> -	of_node_put(map);
> -out:
> -	of_node_put(cn);
> -	return ret;
> -}
> -
> -/*
> - * cpu topology table
> - */
> -struct cpu_topology cpu_topology[NR_CPUS];
> -EXPORT_SYMBOL_GPL(cpu_topology);
> -
> -const struct cpumask *cpu_coregroup_mask(int cpu)
> -{
> -	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
> -
> -	/* Find the smaller of NUMA, core or LLC siblings */
> -	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
> -		/* not numa in package, lets use the package siblings */
> -		core_mask = &cpu_topology[cpu].core_sibling;
> -	}
> -	if (cpu_topology[cpu].llc_id != -1) {
> -		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
> -			core_mask = &cpu_topology[cpu].llc_sibling;
> -	}
> -
> -	return core_mask;
> -}
> -
> -static void update_siblings_masks(unsigned int cpuid)
> -{
> -	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> -	int cpu;
> -
> -	/* update core and thread sibling masks */
> -	for_each_online_cpu(cpu) {
> -		cpu_topo = &cpu_topology[cpu];
> -
> -		if (cpuid_topo->llc_id == cpu_topo->llc_id) {
> -			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
> -			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
> -		}
> -
> -		if (cpuid_topo->package_id != cpu_topo->package_id)
> -			continue;
> -
> -		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> -		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
> -
> -		if (cpuid_topo->core_id != cpu_topo->core_id)
> -			continue;
> -
> -		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
> -		cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
> -	}
> -}
> -
>   void store_cpu_topology(unsigned int cpuid)
>   {
>   	struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
> @@ -296,59 +59,19 @@ void store_cpu_topology(unsigned int cpuid)
>   	update_siblings_masks(cpuid);
>   }
>   
> -static void clear_cpu_topology(int cpu)
> -{
> -	struct cpu_topology *cpu_topo = &cpu_topology[cpu];
> -
> -	cpumask_clear(&cpu_topo->llc_sibling);
> -	cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
> -
> -	cpumask_clear(&cpu_topo->core_sibling);
> -	cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
> -	cpumask_clear(&cpu_topo->thread_sibling);
> -	cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
> -}
> -
> -static void __init reset_cpu_topology(void)
> -{
> -	unsigned int cpu;
> -
> -	for_each_possible_cpu(cpu) {
> -		struct cpu_topology *cpu_topo = &cpu_topology[cpu];
> -
> -		cpu_topo->thread_id = -1;
> -		cpu_topo->core_id = 0;
> -		cpu_topo->package_id = -1;
> -		cpu_topo->llc_id = -1;
> -
> -		clear_cpu_topology(cpu);
> -	}
> -}
> -
> -void remove_cpu_topology(unsigned int cpu)
> -{
> -	int sibling;
> -
> -	for_each_cpu(sibling, topology_core_cpumask(cpu))
> -		cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
> -	for_each_cpu(sibling, topology_sibling_cpumask(cpu))
> -		cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
> -	for_each_cpu(sibling, topology_llc_cpumask(cpu))
> -		cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
> -
> -	clear_cpu_topology(cpu);
> -}
> -
>   #ifdef CONFIG_ACPI
>   /*
>    * Propagate the topology information of the processor_topology_node tree to the
>    * cpu_topology array.
>    */
> -static int __init parse_acpi_topology(void)
> +int __init parse_acpi_topology(void)
>   {
>   	bool is_threaded;
>   	int cpu, topology_id;
>   
> +	if (acpi_disabled)
> +		return 0;
> +
>   	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
>   
>   	for_each_possible_cpu(cpu) {
> @@ -384,24 +107,6 @@ static int __init parse_acpi_topology(void)
>   
>   	return 0;
>   }
> -
> -#else
> -static inline int __init parse_acpi_topology(void)
> -{
> -	return -EINVAL;
> -}
>   #endif
>   
> -void __init init_cpu_topology(void)
> -{
> -	reset_cpu_topology();
>   
> -	/*
> -	 * Discard anything that was parsed if we hit an error so we
> -	 * don't use partial information.
> -	 */
> -	if (!acpi_disabled && parse_acpi_topology())
> -		reset_cpu_topology();
> -	else if (of_have_populated_dt() && parse_dt_topology())
> -		reset_cpu_topology();
> -}
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 1739d7e1952a..5781bb4c457c 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -15,6 +15,11 @@
>   #include <linux/string.h>
>   #include <linux/sched/topology.h>
>   #include <linux/cpuset.h>
> +#include <linux/cpumask.h>
> +#include <linux/init.h>
> +#include <linux/percpu.h>
> +#include <linux/sched.h>
> +#include <linux/smp.h>
>   
>   DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
>   
> @@ -244,3 +249,294 @@ static void parsing_done_workfn(struct work_struct *work)
>   #else
>   core_initcall(free_raw_capacity);
>   #endif
> +
> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> +static int __init get_cpu_for_node(struct device_node *node)
> +{
> +	struct device_node *cpu_node;
> +	int cpu;
> +
> +	cpu_node = of_parse_phandle(node, "cpu", 0);
> +	if (!cpu_node)
> +		return -1;
> +
> +	cpu = of_cpu_node_to_id(cpu_node);
> +	if (cpu >= 0)
> +		topology_parse_cpu_capacity(cpu_node, cpu);
> +	else
> +		pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
> +
> +	of_node_put(cpu_node);
> +	return cpu;
> +}
> +
> +static int __init parse_core(struct device_node *core, int package_id,
> +			     int core_id)
> +{
> +	char name[10];
> +	bool leaf = true;
> +	int i = 0;
> +	int cpu;
> +	struct device_node *t;
> +
> +	do {
> +		snprintf(name, sizeof(name), "thread%d", i);
> +		t = of_get_child_by_name(core, name);
> +		if (t) {
> +			leaf = false;
> +			cpu = get_cpu_for_node(t);
> +			if (cpu >= 0) {
> +				cpu_topology[cpu].package_id = package_id;
> +				cpu_topology[cpu].core_id = core_id;
> +				cpu_topology[cpu].thread_id = i;
> +			} else {
> +				pr_err("%pOF: Can't get CPU for thread\n",
> +				       t);
> +				of_node_put(t);
> +				return -EINVAL;
> +			}
> +			of_node_put(t);
> +		}
> +		i++;
> +	} while (t);
> +
> +	cpu = get_cpu_for_node(core);
> +	if (cpu >= 0) {
> +		if (!leaf) {
> +			pr_err("%pOF: Core has both threads and CPU\n",
> +			       core);
> +			return -EINVAL;
> +		}
> +
> +		cpu_topology[cpu].package_id = package_id;
> +		cpu_topology[cpu].core_id = core_id;
> +	} else if (leaf) {
> +		pr_err("%pOF: Can't get CPU for leaf core\n", core);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int __init parse_cluster(struct device_node *cluster, int depth)
> +{
> +	char name[10];
> +	bool leaf = true;
> +	bool has_cores = false;
> +	struct device_node *c;
> +	static int package_id __initdata;
> +	int core_id = 0;
> +	int i, ret;
> +
> +	/*
> +	 * First check for child clusters; we currently ignore any
> +	 * information about the nesting of clusters and present the
> +	 * scheduler with a flat list of them.
> +	 */
> +	i = 0;
> +	do {
> +		snprintf(name, sizeof(name), "cluster%d", i);
> +		c = of_get_child_by_name(cluster, name);
> +		if (c) {
> +			leaf = false;
> +			ret = parse_cluster(c, depth + 1);
> +			of_node_put(c);
> +			if (ret != 0)
> +				return ret;
> +		}
> +		i++;
> +	} while (c);
> +
> +	/* Now check for cores */
> +	i = 0;
> +	do {
> +		snprintf(name, sizeof(name), "core%d", i);
> +		c = of_get_child_by_name(cluster, name);
> +		if (c) {
> +			has_cores = true;
> +
> +			if (depth == 0) {
> +				pr_err("%pOF: cpu-map children should be clusters\n",
> +				       c);
> +				of_node_put(c);
> +				return -EINVAL;
> +			}
> +
> +			if (leaf) {
> +				ret = parse_core(c, package_id, core_id++);
> +			} else {
> +				pr_err("%pOF: Non-leaf cluster with core %s\n",
> +				       cluster, name);
> +				ret = -EINVAL;
> +			}
> +
> +			of_node_put(c);
> +			if (ret != 0)
> +				return ret;
> +		}
> +		i++;
> +	} while (c);
> +
> +	if (leaf && !has_cores)
> +		pr_warn("%pOF: empty cluster\n", cluster);
> +
> +	if (leaf)
> +		package_id++;
> +
> +	return 0;
> +}
> +
> +static int __init parse_dt_topology(void)
> +{
> +	struct device_node *cn, *map;
> +	int ret = 0;
> +	int cpu;
> +
> +	cn = of_find_node_by_path("/cpus");
> +	if (!cn) {
> +		pr_err("No CPU information found in DT\n");
> +		return 0;
> +	}
> +
> +	/*
> +	 * When topology is provided cpu-map is essentially a root
> +	 * cluster with restricted subnodes.
> +	 */
> +	map = of_get_child_by_name(cn, "cpu-map");
> +	if (!map)
> +		goto out;
> +
> +	ret = parse_cluster(map, 0);
> +	if (ret != 0)
> +		goto out_map;
> +
> +	topology_normalize_cpu_scale();
> +
> +	/*
> +	 * Check that all cores are in the topology; the SMP code will
> +	 * only mark cores described in the DT as possible.
> +	 */
> +	for_each_possible_cpu(cpu)
> +		if (cpu_topology[cpu].package_id == -1)
> +			ret = -EINVAL;
> +
> +out_map:
> +	of_node_put(map);
> +out:
> +	of_node_put(cn);
> +	return ret;
> +}
> +
> +/*
> + * cpu topology table
> + */
> +struct cpu_topology cpu_topology[NR_CPUS];
> +EXPORT_SYMBOL_GPL(cpu_topology);
> +
> +const struct cpumask *cpu_coregroup_mask(int cpu)
> +{
> +	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
> +
> +	/* Find the smaller of NUMA, core or LLC siblings */
> +	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
> +		/* not numa in package, lets use the package siblings */
> +		core_mask = &cpu_topology[cpu].core_sibling;
> +	}
> +	if (cpu_topology[cpu].llc_id != -1) {
> +		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
> +			core_mask = &cpu_topology[cpu].llc_sibling;
> +	}
> +
> +	return core_mask;
> +}
> +
> +void update_siblings_masks(unsigned int cpuid)
> +{
> +	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> +	int cpu;
> +
> +	/* update core and thread sibling masks */
> +	for_each_online_cpu(cpu) {
> +		cpu_topo = &cpu_topology[cpu];
> +
> +		if (cpuid_topo->llc_id == cpu_topo->llc_id) {
> +			cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
> +			cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
> +		}
> +
> +		if (cpuid_topo->package_id != cpu_topo->package_id)
> +			continue;
> +
> +		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> +		cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
> +
> +		if (cpuid_topo->core_id != cpu_topo->core_id)
> +			continue;
> +
> +		cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
> +		cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
> +	}
> +}
> +
> +static void clear_cpu_topology(int cpu)
> +{
> +	struct cpu_topology *cpu_topo = &cpu_topology[cpu];
> +
> +	cpumask_clear(&cpu_topo->llc_sibling);
> +	cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
> +
> +	cpumask_clear(&cpu_topo->core_sibling);
> +	cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
> +	cpumask_clear(&cpu_topo->thread_sibling);
> +	cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
> +}
> +
> +static void __init reset_cpu_topology(void)
> +{
> +	unsigned int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		struct cpu_topology *cpu_topo = &cpu_topology[cpu];
> +
> +		cpu_topo->thread_id = -1;
> +		cpu_topo->core_id = -1;
> +		cpu_topo->package_id = -1;
> +		cpu_topo->llc_id = -1;
> +
> +		clear_cpu_topology(cpu);
> +	}
> +}
> +
> +void remove_cpu_topology(unsigned int cpu)
> +{
> +	int sibling;
> +
> +	for_each_cpu(sibling, topology_core_cpumask(cpu))
> +		cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
> +	for_each_cpu(sibling, topology_sibling_cpumask(cpu))
> +		cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
> +	for_each_cpu(sibling, topology_llc_cpumask(cpu))
> +		cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
> +
> +	clear_cpu_topology(cpu);
> +}
> +
> +__weak int __init parse_acpi_topology(void)
> +{
> +	return 0;
> +}
> +
> +void __init init_cpu_topology(void)
> +{
> +	reset_cpu_topology();
> +
> +	/*
> +	 * Discard anything that was parsed if we hit an error so we
> +	 * don't use partial information.
> +	 */
> +	if (parse_acpi_topology())
> +		reset_cpu_topology();
> +	else if (of_have_populated_dt() && parse_dt_topology())
> +		reset_cpu_topology();
> +}
> +#endif
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index d9bdc1a7f4e7..d4e76e0a283f 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -33,4 +33,32 @@ unsigned long topology_get_freq_scale(int cpu)
>   	return per_cpu(freq_scale, cpu);
>   }
>   
> +struct cpu_topology {
> +	int thread_id;
> +	int core_id;
> +	int package_id;
> +	int llc_id;
> +	cpumask_t thread_sibling;
> +	cpumask_t core_sibling;
> +	cpumask_t llc_sibling;
> +};
> +
> +#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
> +extern struct cpu_topology cpu_topology[NR_CPUS];
> +
> +#define topology_physical_package_id(cpu)	(cpu_topology[cpu].package_id)
> +#define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
> +#define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
> +#define topology_sibling_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
> +#define topology_llc_cpumask(cpu)	(&cpu_topology[cpu].llc_sibling)
> +void init_cpu_topology(void);
> +void store_cpu_topology(unsigned int cpuid);
> +const struct cpumask *cpu_coregroup_mask(int cpu);
> +#endif
> +
> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> +void update_siblings_masks(unsigned int cpu);
> +#endif
> +void remove_cpu_topology(unsigned int cpuid);
> +
>   #endif /* _LINUX_ARCH_TOPOLOGY_H_ */
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index cb0775e1ee4b..4b3755d65812 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -27,6 +27,7 @@
>   #ifndef _LINUX_TOPOLOGY_H
>   #define _LINUX_TOPOLOGY_H
>   
> +#include <linux/arch_topology.h>
>   #include <linux/cpumask.h>
>   #include <linux/bitops.h>
>   #include <linux/mmzone.h>
> 

Hi Will/Catalin,
Can we get an ACK for ARM64 if you are okay with this series?

-- 
Regards,
Atish
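
For readers following the update_siblings_masks() hunk quoted above, here is a minimal user-space sketch of the same mask bookkeeping. It is not the kernel code: cpumask_t is replaced by a uint32_t bitmap, and every sketch_* name, SKETCH_NR_CPUS and the bring-up helper are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8 /* hypothetical CPU count, small enough for a uint32_t bitmap */

/* User-space stand-in for struct cpu_topology; cpumask_t becomes a bitmap. */
struct sketch_topology {
	int core_id;
	int package_id;
	int llc_id;
	uint32_t thread_sibling;
	uint32_t core_sibling;
	uint32_t llc_sibling;
};

static struct sketch_topology sketch_topo[SKETCH_NR_CPUS];
static uint32_t sketch_online; /* stand-in for the online CPU mask */

/* Mirrors reset_cpu_topology(): unknown ids, each CPU is its own only sibling. */
static void sketch_reset(void)
{
	sketch_online = 0;
	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		struct sketch_topology *t = &sketch_topo[cpu];

		t->core_id = t->package_id = t->llc_id = -1;
		t->thread_sibling = t->core_sibling = t->llc_sibling = 1u << cpu;
	}
}

/* Mirrors the loop in update_siblings_masks() over the online CPUs. */
static void sketch_update_siblings(unsigned int cpuid)
{
	struct sketch_topology *self;

	assert(cpuid < SKETCH_NR_CPUS);
	self = &sketch_topo[cpuid];

	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		struct sketch_topology *other = &sketch_topo[cpu];

		if (!(sketch_online & (1u << cpu)))
			continue;

		/* Same last-level cache id: LLC siblings (also true when both are -1). */
		if (self->llc_id == other->llc_id) {
			self->llc_sibling |= 1u << cpu;
			other->llc_sibling |= 1u << cpuid;
		}

		/* Different package: neither core nor thread siblings. */
		if (self->package_id != other->package_id)
			continue;
		self->core_sibling |= 1u << cpu;
		other->core_sibling |= 1u << cpuid;

		/* Same package but different core: not thread siblings. */
		if (self->core_id != other->core_id)
			continue;
		self->thread_sibling |= 1u << cpu;
		other->thread_sibling |= 1u << cpuid;
	}
}

/* Hypothetical helper: record ids, mark the CPU online, then update masks. */
static void sketch_bring_up(unsigned int cpu, int package_id, int core_id)
{
	assert(cpu < SKETCH_NR_CPUS);
	sketch_topo[cpu].package_id = package_id;
	sketch_topo[cpu].core_id = core_id;
	sketch_online |= 1u << cpu;
	sketch_update_siblings(cpu);
}
```

Under these assumptions, calling sketch_reset() and then sketch_bring_up() for CPUs 0 to 3 with package 0 and core ids 0 to 3 leaves every core_sibling bitmap at 0xF, while each thread_sibling bitmap keeps only the CPU's own bit.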


* Re: [PATCH v6 5/7] RISC-V: Parse cpu topology during boot.
  2019-05-29 21:13 ` [PATCH v6 5/7] RISC-V: Parse cpu topology during boot Atish Patra
@ 2019-06-07  5:00   ` Paul Walmsley
  0 siblings, 0 replies; 25+ messages in thread
From: Paul Walmsley @ 2019-06-07  5:00 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Russell King, Sudeep Holla, Albert Ou, Anup Patel,
	Catalin Marinas, David S. Miller, devicetree, Greg Kroah-Hartman,
	Ingo Molnar, Jeremy Linton, Linus Walleij, linux-riscv,
	Mark Rutland, Mauro Carvalho Chehab, Morten Rasmussen,
	Otto Sabart, Palmer Dabbelt, Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Will Deacon,
	linux-arm-kernel

On Wed, 29 May 2019, Atish Patra wrote:

> Currently, there is no topology defined for RISC-V.
> Parse the cpu-map node from the device tree and set up the
> CPU topology.
> 
> CPU topology after applying the patch:
> $ cat /sys/devices/system/cpu/cpu2/topology/core_siblings_list
> 0-3
> $ cat /sys/devices/system/cpu/cpu3/topology/core_siblings_list
> 0-3
> $ cat /sys/devices/system/cpu/cpu3/topology/physical_package_id
> 0
> $ cat /sys/devices/system/cpu/cpu3/topology/core_id
> 3
> 
> Signed-off-by: Atish Patra <atish.patra@wdc.com>
> Acked-by: Sudeep Holla <sudeep.holla@arm.com>

Looks reasonable to me.

Acked-by: Paul Walmsley <paul.walmsley@sifive.com>

We're assuming, on the RISC-V side, that these patches will go in via 
another tree.


- Paul
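
The sysfs files shown in the quoted commit message can also be checked from a small user-space program. The sketch below is only an illustration: the sketch_* function names are hypothetical, the parser handles just the "a-b" and "a,b" cpulist forms, and it is limited to CPUs 0 to 63 so the result fits in one uint64_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a cpulist such as "0-3" or "0,2-3" into a bitmap.
 * Returns 0 on success, -1 on malformed input or a CPU number above 63.
 */
static int sketch_parse_cpulist(const char *s, uint64_t *mask)
{
	assert(s && mask);
	*mask = 0;

	while (*s && *s != '\n') {
		char *end;
		long first = strtol(s, &end, 10);
		long last = first;

		if (end == s)
			return -1;
		if (*end == '-') {
			s = end + 1;
			last = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (first < 0 || last < first || last > 63)
			return -1;

		for (long cpu = first; cpu <= last; cpu++)
			*mask |= UINT64_C(1) << cpu;

		s = end;
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	return 0;
}

/* Read core_siblings_list for one CPU; returns -1 if the file is missing. */
static int sketch_read_core_siblings(int cpu, uint64_t *mask)
{
	char path[128], buf[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/topology/core_siblings_list",
		 cpu);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return sketch_parse_cpulist(buf, mask);
}
```

With the "0-3" value quoted above, sketch_parse_cpulist() yields the bitmap 0xF. sketch_read_core_siblings() simply returns -1 on a system where the topology directory is absent.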


* Re: [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code.
  2019-05-29 21:13 ` [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code Atish Patra
  2019-06-06 14:26   ` Atish Patra
@ 2019-06-11 15:55   ` Will Deacon
  1 sibling, 0 replies; 25+ messages in thread
From: Will Deacon @ 2019-06-11 15:55 UTC (permalink / raw)
  To: Atish Patra
  Cc: linux-kernel, Catalin Marinas, Jeffrey Hugo, Sudeep Holla,
	Albert Ou, Anup Patel, David S. Miller, devicetree,
	Greg Kroah-Hartman, Ingo Molnar, Jeremy Linton, Linus Walleij,
	linux-riscv, Mark Rutland, Mauro Carvalho Chehab,
	Morten Rasmussen, Otto Sabart, Palmer Dabbelt, Paul Walmsley,
	Peter Zijlstra (Intel),
	Rafael J. Wysocki, Rob Herring, Thomas Gleixner, Russell King,
	linux-arm-kernel

On Wed, May 29, 2019 at 02:13:36PM -0700, Atish Patra wrote:
> Both RISC-V and ARM64 use the cpu-map device tree node to describe
> their CPU topology. It is better to move the relevant code to
> a common place instead of duplicating it.
> 
> To: Will Deacon <will.deacon@arm.com>
> To: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Atish Patra <atish.patra@wdc.com>
> [Tested on QDF2400]
> Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
> [Tested on Juno and other embedded platforms.]
> Tested-by: Sudeep Holla <sudeep.holla@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> 
> ---
> Hi Will/Catalin,
> Can we get an ACK for this patch? We are hoping to get the entire series
> merged in one go.

If Sudeep is happy with it then that's good enough for me.

Acked-by: Will Deacon <will.deacon@arm.com>

Will


Thread overview: 25+ messages
2019-05-29 21:13 [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Atish Patra
2019-05-29 21:13 ` [PATCH v6 1/7] Documentation: DT: arm: add support for sockets defining package boundaries Atish Patra
2019-05-29 23:39   ` Andrew F. Davis
2019-05-30 11:51     ` Morten Rasmussen
2019-05-30 12:56       ` Andrew F. Davis
2019-05-30 13:12         ` Morten Rasmussen
2019-05-31  9:41         ` Sudeep Holla
2019-05-30 21:42       ` Russell King - ARM Linux admin
2019-05-31  9:37         ` Sudeep Holla
2019-05-31  9:54           ` Morten Rasmussen
2019-05-29 21:13 ` [PATCH v6 2/7] dt-binding: cpu-topology: Move cpu-map to a common binding Atish Patra
2019-05-30 20:55   ` Jeremy Linton
2019-06-03  8:49     ` Atish Patra
2019-06-03  9:05       ` Sudeep Holla
2019-05-29 21:13 ` [PATCH v6 3/7] cpu-topology: Move cpu topology code to common code Atish Patra
2019-06-06 14:26   ` Atish Patra
2019-06-11 15:55   ` Will Deacon
2019-05-29 21:13 ` [PATCH v6 4/7] arm: Use common cpu_topology structure and functions Atish Patra
2019-06-06 14:25   ` Atish Patra
2019-05-29 21:13 ` [PATCH v6 5/7] RISC-V: Parse cpu topology during boot Atish Patra
2019-06-07  5:00   ` Paul Walmsley
2019-05-29 21:13 ` [PATCH v6 6/7] base: arch_topology: update Kconfig help description Atish Patra
2019-05-29 21:13 ` [PATCH v6 7/7] MAINTAINERS: Add an entry for generic architecture topology Atish Patra
2019-05-30 21:12 ` [PATCH v6 0/7] Unify CPU topology across ARM & RISC-V Jeremy Linton
2019-06-03  8:50   ` Atish Patra
