LKML Archive on lore.kernel.org
* [PATCH v3 00/10] PECI device driver introduction
@ 2018-04-10 18:32 Jae Hyun Yoo
  2018-04-10 18:32 ` [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers Jae Hyun Yoo
                   ` (9 more replies)
  0 siblings, 10 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This patch set introduces the Platform Environment Control Interface (PECI)
bus device driver. PECI is a one-wire bus interface that provides a
communication channel from Intel processors and chipset components to
external monitoring or control devices. PECI is designed to support the
following sideband functions:

* Processor and DRAM thermal management
  - Processor fan speed control is managed by comparing Digital Thermal
    Sensor (DTS) thermal readings acquired via PECI against the
    processor-specific fan speed control reference point, or TCONTROL. Both
    TCONTROL and DTS thermal readings are accessible via the processor PECI
    client. These variables are referenced to a common temperature, the TCC
    activation point, and are both defined as negative offsets from that
    reference.
  - PECI based access to the processor package configuration space provides
    a means for Baseboard Management Controllers (BMC) or other platform
    management devices to actively manage the processor and memory power
    and thermal features.

* Platform Manageability
  - Platform manageability functions including thermal, power, and error
    monitoring. Note that platform 'power' management includes monitoring
    and control for both the processor and DRAM subsystem to assist with
    data center power limiting.
  - PECI allows read access to certain error registers in the processor MSR
    space and status monitoring registers in the PCI configuration space
    within the processor and downstream devices.
  - PECI permits writes to certain registers in the processor PCI
    configuration space.

* Processor Interface Tuning and Diagnostics
  - Processor interface tuning and diagnostics capabilities
    (Intel Interconnect BIST). The processor's Intel Interconnect Built-In
    Self Test (Intel IBIST) allows for in-field diagnostic capabilities in
    the Intel UPI and memory controller interfaces. PECI provides a port to
    execute these diagnostics via its PCI Configuration read and write
    capabilities.

* Failure Analysis
  - Output the state of the processor after a failure for analysis via
    Crashdump.
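
As a rough illustration of the fan speed control comparison described in
the first bullet above, the sketch below treats both the DTS reading and
TCONTROL as negative offsets from the TCC activation point, expressed in
millidegrees Celsius. The function names and the unit are hypothetical and
are not taken from this patch set.

```c
#include <stdbool.h>

/*
 * Both arguments are offsets below the TCC activation point in
 * millidegrees Celsius, so they are zero or negative.
 * All names here are hypothetical.
 */

/* Remaining margin before the DTS reading reaches TCONTROL. */
static int peci_example_margin_to_tcontrol(int dts_offset_mc,
					   int tcontrol_offset_mc)
{
	return tcontrol_offset_mc - dts_offset_mc;
}

/* More cooling is needed once the reading is at or above TCONTROL. */
static bool peci_example_needs_more_cooling(int dts_offset_mc,
					    int tcontrol_offset_mc)
{
	return dts_offset_mc >= tcontrol_offset_mc;
}
```

With TCONTROL at -10000 and a DTS reading of -25000, the margin is 15000
millidegrees and no extra cooling is requested.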

PECI uses a single wire for self-clocking and data transfer. The bus
requires no additional control lines. The physical layer is a self-clocked
one-wire bus that begins each bit with a driven, rising edge from an idle
level near zero volts. The duration of the signal driven high depends on
whether the bit value is a logic '0' or logic '1'. PECI also supports a
variable data transfer rate that is established with every message. In this
way, it is highly flexible even though the underlying logic is simple.
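
To make the pulse-width encoding concrete, here is a minimal sketch of how
a receiver could classify one bit from the measured high time. The 1/4 and
3/4 bit-time figures for logic '0' and logic '1' are assumed here for
illustration only; the PECI specification is the authority on the actual
timing. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Classify one bit from how long the line stayed high within one bit time.
 * Assumption: a logic '0' is driven high for about 1/4 of the bit time and
 * a logic '1' for about 3/4, so the midpoint is used as the threshold.
 */
static int peci_example_decode_bit(uint32_t high_ns, uint32_t bit_time_ns)
{
	return (uint64_t)high_ns * 2 > bit_time_ns ? 1 : 0;
}
```

Because the threshold is expressed relative to bit_time_ns, the same check
works for whatever transfer rate a given message establishes.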

The interface design was optimized for interfacing between an Intel
processor and chipset components in both single processor and multiple
processor environments. The single wire interface provides low board
routing overhead for the multiple load connections in the congested routing
area near the processor and chipset components. Bus speed, error checking,
and low protocol overhead provide adequate link bandwidth and reliability
to transfer critical device operating conditions and configuration
information.

This implementation provides the basic framework to add PECI extensions to
the Linux bus and device models. A hardware specific 'Adapter' driver can
be attached to the PECI bus to provide sideband functions described above.
It is also possible to access all devices on an adapter from userspace
through the /dev interface. A device specific 'Client' driver also can be
attached to the PECI bus so each processor client's features can be
supported by the 'Client' driver through an adapter connection in the bus.
This patch set includes an Aspeed 24xx/25xx PECI driver and PECI
cputemp/dimmtemp drivers as the first implementations of adapter and
client drivers on the PECI bus framework.

Please review.

Thanks,

-Jae

Changes from v2:
* Divided peci-hwmon driver into two drivers, peci-cputemp and
  peci-dimmtemp.
* Added generic dt binding documents for PECI bus, adapter and client.
* Removed in_atomic() call from the PECI core driver.
* Improved PECI commands masking logic.
* Added permission check logic for PECI ioctls.
* Removed unnecessary type casts.
* Fixed some invalid error return codes.
* Added the mark_updated() function to improve update interval checking
  logic.
* Fixed a bug in populated DIMM checking function.
* Fixed some typos, grammar and style issues in documents.
* Rewrote hwmon drivers to use devm_hwmon_device_register_with_info API.
* Made the peci_match_id() function static.
* Replaced a deprecated create_singlethread_workqueue() call with an
  alloc_ordered_workqueue() call.
* Reordered local variable definitions in reverse xmas tree order.
* Listed the client CPUs that can be supported by the peci-cputemp and
  peci-dimmtemp hwmon drivers.
* Added CPU generation detection logic which checks CPUID signature through
  PECI connection.
* Improved interrupt handling logic in the Aspeed PECI adapter driver.
* Fixed SPDX license identifier style in header files.
* Changed some macros in peci.h to static inline functions.
* Dropped sleepable context checking code in peci-core.
* Adjusted rt_mutex protection scope in peci-core.
* Moved adapter->xfer() checking code into peci_register_adapter().
* Improved PECI command retry checking logic.
* Changed ioctl base from 'P' to 0xb6 to avoid conflicts and updated
  ioctl-number.txt to reflect the ioctl number of the PECI subsystem.
* Added a comment to describe PECI retry action.
* Simplified return code handling of peci_ioctl_ping().
* Changed type of peci_ioctl_fn[] to static const.
* Fixed range checking code for valid PECI commands.
* Fixed the error return code on invalid PECI commands.
* Fixed incorrect definitions of PECI ioctl and its handling logic.

Changes from v1:
* Additionally implemented a core driver to support the PECI Linux bus
  driver model.
* Modified the Aspeed PECI driver to make it an adapter driver on the PECI
  bus.
* Modified the PECI hwmon driver to make it a client driver on the PECI
  bus.
* Simplified hwmon driver attribute labels and removed redundant strings.
* Removed core_nums from device tree setting of hwmon driver and modified
  core number detection logic to check the resolved_core register in client
  CPU's local PCI configuration area.
* Removed dimm_nums from device tree setting of hwmon driver and added
  populated DIMM detection logic to support dynamic creation.
* Removed indexing gap on core temperature and DIMM temperature attributes.
* Improved hwmon registration and dynamic attribute creation logic.
* Fixed structure definitions in the PECI uapi header to use __u8, __u16,
  etc.
* Modified wait_for_completion_interruptible_timeout error handling logic
  in Aspeed PECI driver to deliver errors correctly.
* Removed low-level xfer command from ioctl and kept only high-level PECI
  command suite as ioctls.
* Fixed I/O timeout logic in Aspeed PECI driver using ktime.
* Added a function into hwmon driver to simplify update delay checking.
* Added a function into hwmon driver to convert 10.6 to millidegree.
* Dropped non-standard attributes in hwmon driver.
* Fixed the OF table for hwmon so that it identifies the device as a PECI
  client of an Intel CPU target.
* Added a maintainer of the PECI subsystem to the MAINTAINERS file.
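
For reference, the 10.6 to millidegree conversion mentioned in the list
above can be sketched as follows, assuming the raw value is a 16-bit two's
complement number with 6 fractional bits (1/64 degree C per LSB). The
function name is hypothetical and the helper in the hwmon driver may
differ.

```c
#include <stdint.h>

/* Hypothetical helper: convert a 10.6 fixed-point value to millidegrees. */
static int32_t peci_example_10_6_to_millidegree(uint16_t raw)
{
	/* Sign-extend bit 15 without relying on implementation-defined casts. */
	int32_t val = ((int32_t)raw ^ 0x8000) - 0x8000;

	/* 64 LSBs per degree, 1000 millidegrees per degree. */
	return val * 1000 / 64;
}
```

For instance, a raw value of 0x0040 is exactly one degree, and 0xFFC0 is
exactly minus one degree.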

Fengguang Wu (1):
  drivers/peci: Add support for PECI bus driver core


Jae Hyun Yoo (10):
  Documentations: dt-bindings: Add documents of generic PECI bus,
    adapter and client drivers
  Documentations: ioctl: Add ioctl numbers for PECI subsystem
  drivers/peci: Add support for PECI bus driver core
  Documentations: dt-bindings: Add a document of PECI adapter driver for
    Aspeed AST24xx/25xx SoCs
  ARM: dts: aspeed: peci: Add PECI node
  drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  Documentation: dt-bindings: Add documents for PECI hwmon client
    drivers
  Documentation: hwmon: Add documents for PECI hwmon client drivers
  drivers/hwmon: Add PECI hwmon client drivers
  Add a maintainer for the PECI subsystem

 .../devicetree/bindings/hwmon/peci-cputemp.txt     |   24 +
 .../devicetree/bindings/hwmon/peci-dimmtemp.txt    |   25 +
 .../devicetree/bindings/peci/peci-adapter.txt      |   23 +
 .../devicetree/bindings/peci/peci-aspeed.txt       |   60 +
 .../devicetree/bindings/peci/peci-bus.txt          |   15 +
 .../devicetree/bindings/peci/peci-client.txt       |   25 +
 Documentation/hwmon/peci-cputemp                   |   88 ++
 Documentation/hwmon/peci-dimmtemp                  |   50 +
 Documentation/ioctl/ioctl-number.txt               |    2 +
 MAINTAINERS                                        |   10 +
 arch/arm/boot/dts/aspeed-g4.dtsi                   |   25 +
 arch/arm/boot/dts/aspeed-g5.dtsi                   |   25 +
 drivers/Kconfig                                    |    2 +
 drivers/Makefile                                   |    1 +
 drivers/hwmon/Kconfig                              |   28 +
 drivers/hwmon/Makefile                             |    2 +
 drivers/hwmon/peci-cputemp.c                       |  783 ++++++++++++
 drivers/hwmon/peci-dimmtemp.c                      |  432 +++++++
 drivers/peci/Kconfig                               |   45 +
 drivers/peci/Makefile                              |    9 +
 drivers/peci/peci-aspeed.c                         |  504 ++++++++
 drivers/peci/peci-core.c                           | 1291 ++++++++++++++++++++
 include/linux/peci.h                               |  107 ++
 include/uapi/linux/peci-ioctl.h                    |  200 +++
 24 files changed, 3776 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt
 create mode 100644 Documentation/hwmon/peci-cputemp
 create mode 100644 Documentation/hwmon/peci-dimmtemp
 create mode 100644 drivers/hwmon/peci-cputemp.c
 create mode 100644 drivers/hwmon/peci-dimmtemp.c
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/peci-aspeed.c
 create mode 100644 drivers/peci/peci-core.c
 create mode 100644 include/linux/peci.h
 create mode 100644 include/uapi/linux/peci-ioctl.h

-- 
2.16.2


* [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-11 11:52   ` Joel Stanley
  2018-04-16 17:59   ` Rob Herring
  2018-04-10 18:32 ` [PATCH v3 02/10] Documentations: ioctl: Add ioctl numbers for PECI subsystem Jae Hyun Yoo
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds device tree binding documents for the generic PECI bus,
adapter and client drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 .../devicetree/bindings/peci/peci-adapter.txt      | 23 ++++++++++++++++++++
 .../devicetree/bindings/peci/peci-bus.txt          | 15 +++++++++++++
 .../devicetree/bindings/peci/peci-client.txt       | 25 ++++++++++++++++++++++
 3 files changed, 63 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt

diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt b/Documentation/devicetree/bindings/peci/peci-adapter.txt
new file mode 100644
index 000000000000..9221374f6b11
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
@@ -0,0 +1,23 @@
+Generic device tree configuration for PECI adapters.
+
+Required properties:
+- compatible     : Should contain hardware specific definition strings that can
+		   match an adapter driver implementation.
+- reg            : Should contain PECI controller registers location and length.
+- #address-cells : Should be <1>.
+- #size-cells    : Should be <0>.
+
+Example:
+	peci: peci@10000000 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0x0 0x10000000 0x1000>;
+
+		peci0: peci-bus@0 {
+			compatible = "soc,soc-peci";
+			reg = <0x0 0x1000>;
+			#address-cells = <1>;
+			#size-cells = <0>;
+		};
+	};
diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt b/Documentation/devicetree/bindings/peci/peci-bus.txt
new file mode 100644
index 000000000000..90bcc791ccb0
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
@@ -0,0 +1,15 @@
+Generic device tree configuration for PECI buses.
+
+Required properties:
+- compatible     : Should be "simple-bus".
+- #address-cells : Should be <1>.
+- #size-cells    : Should be <1>.
+- ranges         : Should contain PECI controller registers ranges.
+
+Example:
+	peci: peci@10000000 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0x0 0x10000000 0x1000>;
+	};
diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt b/Documentation/devicetree/bindings/peci/peci-client.txt
new file mode 100644
index 000000000000..8e2bfd8532f6
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-client.txt
@@ -0,0 +1,25 @@
+Generic device tree configuration for PECI clients.
+
+Required properties:
+- compatible : Should contain target device specific definition strings that can
+	       match a client driver implementation.
+- reg        : Should contain the address of a client CPU. CPU client
+	       addresses start from 0x30 according to the PECI specification:
+	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
+
+Example:
+	peci-bus@0 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		< more properties >
+
+		function@cpu0 {
+			compatible = "device,function";
+			reg = <0x30>;
+		};
+
+		function@cpu1 {
+			compatible = "device,function";
+			reg = <0x31>;
+		};
+	};
-- 
2.16.2
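
The addressing rule in peci-client.txt above can be sketched in C as shown
below. The macro values mirror the 0x30..0x37 range quoted in the binding;
the EXAMPLE_ names and both helpers are hypothetical and only illustrate
how a 'reg' value maps to a CPU index. The value 8 for the offset limit is
an assumption derived from that range.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_PECI_BASE_ADDR  0x30 /* first CPU client address */
#define EXAMPLE_PECI_OFFSET_MAX 8    /* assumed from the 0x30..0x37 range */

/* True if addr falls inside the CPU client address range. */
static bool example_peci_addr_valid(uint8_t addr)
{
	return addr >= EXAMPLE_PECI_BASE_ADDR &&
	       addr < EXAMPLE_PECI_BASE_ADDR + EXAMPLE_PECI_OFFSET_MAX;
}

/* Returns the zero-based CPU index, or -1 for an out-of-range address. */
static int example_peci_cpu_index(uint8_t addr)
{
	return example_peci_addr_valid(addr) ?
	       addr - EXAMPLE_PECI_BASE_ADDR : -1;
}
```

In the binding example, reg = <0x30> corresponds to CPU index 0 and
reg = <0x31> to CPU index 1.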


* [PATCH v3 02/10] Documentations: ioctl: Add ioctl numbers for PECI subsystem
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
  2018-04-10 18:32 ` [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-10 18:32 ` [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core Jae Hyun Yoo
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit updates ioctl-number.txt to reflect the ioctl numbers used by
the PECI subsystem.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Haiyue Wang <haiyue.wang@linux.intel.com>
Cc: James Feist <james.feist@linux.intel.com>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: Vernon Mauery <vernon.mauery@linux.intel.com>
---
 Documentation/ioctl/ioctl-number.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 84bb74dcae12..4bc3a65d7204 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -323,6 +323,8 @@ Code  Seq#(hex)	Include File		Comments
 0xB3	00	linux/mmc/ioctl.h
 0xB4	00-0F	linux/gpio.h		<mailto:linux-gpio@vger.kernel.org>
 0xB5	00-0F	uapi/linux/rpmsg.h	<mailto:linux-remoteproc@vger.kernel.org>
+0xB6	00-0F	uapi/linux/peci-ioctl.h	PECI subsystem
+					<mailto:jae.hyun.yoo@linux.intel.com>
 0xC0	00-0F	linux/usb/iowarrior.h
 0xCA	00-0F	uapi/misc/cxl.h
 0xCA	10-2F	uapi/misc/ocxl.h
-- 
2.16.2
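
To show how the reserved code is used, the sketch below builds one ioctl
number with the standard _IOWR() macro from <linux/ioctl.h> and the 0xB6
code added above. The EXAMPLE_ names and the payload structure are
hypothetical; the real definitions are in uapi/linux/peci-ioctl.h, which
is added by a later patch in this series.

```c
#include <linux/ioctl.h>
#include <stdint.h>

#define EXAMPLE_PECI_IOC_BASE 0xb6

/* Hypothetical payload, not the layout used by peci-ioctl.h. */
struct example_peci_msg {
	uint8_t addr;
	uint8_t data[4];
};

/* Hypothetical command using sequence number 0x00 of the 00-0F range. */
#define EXAMPLE_PECI_IOC_CMD \
	_IOWR(EXAMPLE_PECI_IOC_BASE, 0x00, struct example_peci_msg)
```

The code, sequence number and payload size can be recovered from the
resulting value with _IOC_TYPE(), _IOC_NR() and _IOC_SIZE().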


* [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
  2018-04-10 18:32 ` [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers Jae Hyun Yoo
  2018-04-10 18:32 ` [PATCH v3 02/10] Documentations: ioctl: Add ioctl numbers for PECI subsystem Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-19 18:59   ` kbuild test robot
                     ` (2 more replies)
  2018-04-10 18:32 ` [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs Jae Hyun Yoo
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds the PECI bus core driver implementation to the Linux
driver framework.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
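The Assured Write frame check computed by peci_aw_fcs() in this patch is a
CRC-8 with polynomial 0x07, and __peci_xfer() inverts its most significant
bit (the '0x80 ^' term) before storing it. A standalone sketch of that
calculation follows, assuming the kernel table declared with
DECLARE_CRC8_TABLE() is populated MSB-first, which this excerpt does not
show. The example_ names are hypothetical, and the sketch makes no claim
about which message bytes the check covers.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_CRC8_POLY 0x07 /* same value as PECI_CRC8_POLYNOMIAL */

/* Bitwise CRC-8, MSB first, initial value 0. */
static uint8_t example_crc8(const uint8_t *data, size_t len)
{
	uint8_t crc = 0;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++) {
			if (crc & 0x80)
				crc = (uint8_t)((crc << 1) ^ EXAMPLE_CRC8_POLY);
			else
				crc = (uint8_t)(crc << 1);
		}
	}

	return crc;
}

/* Assured Write FCS: the CRC-8 with its most significant bit inverted. */
static uint8_t example_aw_fcs(const uint8_t *data, size_t len)
{
	return 0x80 ^ example_crc8(data, len);
}
```
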
 drivers/Kconfig                 |    2 +
 drivers/Makefile                |    1 +
 drivers/peci/Kconfig            |   17 +
 drivers/peci/Makefile           |    6 +
 drivers/peci/peci-core.c        | 1291 +++++++++++++++++++++++++++++++++++++++
 include/linux/peci.h            |  107 ++++
 include/uapi/linux/peci-ioctl.h |  200 ++++++
 7 files changed, 1624 insertions(+)
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/peci-core.c
 create mode 100644 include/linux/peci.h
 create mode 100644 include/uapi/linux/peci-ioctl.h
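
The retry decision made in __peci_xfer() in the diff below reduces to a
check on the upper nibble of the completion code. The standalone sketch
here copies the three constants from this patch; the enum and the function
name are hypothetical.

```c
#include <stdint.h>

#define DEV_PECI_CC_SUCCESS          0x40
#define DEV_PECI_CC_RETRY_CHECK_MASK 0xf0
#define DEV_PECI_CC_NEED_RETRY       0x80

enum example_cc_action {
	EXAMPLE_CC_DONE,  /* transfer completed */
	EXAMPLE_CC_RETRY, /* 0x8x: resend with the retry bit set */
	EXAMPLE_CC_FAIL,  /* anything else is reported as an I/O error */
};

static enum example_cc_action example_classify_cc(uint8_t cc)
{
	if (cc == DEV_PECI_CC_SUCCESS)
		return EXAMPLE_CC_DONE;

	if ((cc & DEV_PECI_CC_RETRY_CHECK_MASK) == DEV_PECI_CC_NEED_RETRY)
		return EXAMPLE_CC_RETRY;

	return EXAMPLE_CC_FAIL;
}
```

Under this check, DEV_PECI_CC_TIMEOUT (0x80), DEV_PECI_CC_OUT_OF_RESOURCE
(0x81) and DEV_PECI_CC_UNAVAIL_RESOURCE (0x82) all lead to a retry, while
DEV_PECI_CC_INVALID_REQ (0x90) fails immediately.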

diff --git a/drivers/Kconfig b/drivers/Kconfig
index 95b9ccc08165..8c44d9738377 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -217,4 +217,6 @@ source "drivers/siox/Kconfig"
 
 source "drivers/slimbus/Kconfig"
 
+source "drivers/peci/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 24cd47014657..250fe3d0fa7e 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -185,3 +185,4 @@ obj-$(CONFIG_TEE)		+= tee/
 obj-$(CONFIG_MULTIPLEXER)	+= mux/
 obj-$(CONFIG_UNISYS_VISORBUS)	+= visorbus/
 obj-$(CONFIG_SIOX)		+= siox/
+obj-$(CONFIG_PECI)		+= peci/
diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
new file mode 100644
index 000000000000..1fbc13f9e6c2
--- /dev/null
+++ b/drivers/peci/Kconfig
@@ -0,0 +1,17 @@
+#
+# Platform Environment Control Interface (PECI) subsystem configuration
+#
+
+menu "PECI support"
+
+config PECI
+	bool "PECI support"
+	select RT_MUTEXES
+	select CRC8
+	help
+	  The Platform Environment Control Interface (PECI) is a one-wire bus
+	  interface that provides a communication channel from Intel
+	  processors and chipset components to external monitoring or control
+	  devices.
+
+endmenu
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
new file mode 100644
index 000000000000..9e8615e0d3ff
--- /dev/null
+++ b/drivers/peci/Makefile
@@ -0,0 +1,6 @@
+#
+# Makefile for the PECI core and bus drivers.
+#
+
+# Core functionality
+obj-$(CONFIG_PECI)		+= peci-core.o
diff --git a/drivers/peci/peci-core.c b/drivers/peci/peci-core.c
new file mode 100644
index 000000000000..9b45869b7c39
--- /dev/null
+++ b/drivers/peci/peci-core.c
@@ -0,0 +1,1291 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/crc8.h>
+#include <linux/delay.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/peci.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+/* Device Specific Completion Code (CC) Definition */
+#define DEV_PECI_CC_SUCCESS          0x40
+#define DEV_PECI_CC_TIMEOUT          0x80
+#define DEV_PECI_CC_OUT_OF_RESOURCE  0x81
+#define DEV_PECI_CC_UNAVAIL_RESOURCE 0x82
+#define DEV_PECI_CC_INVALID_REQ      0x90
+
+/* Completion Code mask to check retry needs */
+#define DEV_PECI_CC_RETRY_CHECK_MASK 0xf0
+#define DEV_PECI_CC_NEED_RETRY       0x80
+
+/* Skylake EDS says to retry for 250ms */
+#define DEV_PECI_RETRY_TIME_MS     250
+#define DEV_PECI_RETRY_INTERVAL_MS 10
+#define DEV_PECI_RETRY_BIT         0x01
+
+#define GET_TEMP_WR_LEN   1
+#define GET_TEMP_RD_LEN   2
+#define GET_TEMP_PECI_CMD 0x01
+
+#define GET_DIB_WR_LEN   1
+#define GET_DIB_RD_LEN   8
+#define GET_DIB_PECI_CMD 0xf7
+
+#define RDPKGCFG_WRITE_LEN     5
+#define RDPKGCFG_READ_LEN_BASE 1
+#define RDPKGCFG_PECI_CMD      0xa1
+
+#define WRPKGCFG_WRITE_LEN_BASE 6
+#define WRPKGCFG_READ_LEN       1
+#define WRPKGCFG_PECI_CMD       0xa5
+
+#define RDIAMSR_WRITE_LEN 5
+#define RDIAMSR_READ_LEN  9
+#define RDIAMSR_PECI_CMD  0xb1
+
+#define WRIAMSR_PECI_CMD  0xb5
+
+#define RDPCICFG_WRITE_LEN 6
+#define RDPCICFG_READ_LEN  5
+#define RDPCICFG_PECI_CMD  0x61
+
+#define WRPCICFG_PECI_CMD  0x65
+
+#define RDPCICFGLOCAL_WRITE_LEN     5
+#define RDPCICFGLOCAL_READ_LEN_BASE 1
+#define RDPCICFGLOCAL_PECI_CMD      0xe1
+
+#define WRPCICFGLOCAL_WRITE_LEN_BASE 6
+#define WRPCICFGLOCAL_READ_LEN       1
+#define WRPCICFGLOCAL_PECI_CMD       0xe5
+
+/* Macro for getting minor revision number from DIB */
+#define GET_MINOR_REV_NUM(x) (((x) >> 8) & 0xF)
+
+/* CRC8 table for Assure Write Frame Check */
+#define PECI_CRC8_POLYNOMIAL 0x07
+DECLARE_CRC8_TABLE(peci_crc8_table);
+
+static struct device_type peci_adapter_type;
+static struct device_type peci_client_type;
+
+/* Max number of peci cdev */
+#define PECI_CDEV_MAX    16
+
+/* Max index of devices sharing the same client address */
+#define PECI_DEV_IDX_MAX 16
+
+static dev_t peci_devt;
+static bool is_registered;
+
+static DEFINE_MUTEX(core_lock);
+static DEFINE_IDR(peci_adapter_idr);
+
+static ssize_t name_show(struct device *dev,
+			 struct device_attribute *attr,
+			 char *buf)
+{
+	return sprintf(buf, "%s\n", dev->type == &peci_client_type ?
+		       to_peci_client(dev)->name : to_peci_adapter(dev)->name);
+}
+static DEVICE_ATTR_RO(name);
+
+static void peci_client_dev_release(struct device *dev)
+{
+	kfree(to_peci_client(dev));
+}
+
+static struct attribute *peci_device_attrs[] = {
+	&dev_attr_name.attr,
+	NULL
+};
+ATTRIBUTE_GROUPS(peci_device);
+
+static struct device_type peci_client_type = {
+	.groups		= peci_device_groups,
+	.release	= peci_client_dev_release,
+};
+
+static struct peci_client *peci_verify_client(struct device *dev)
+{
+	return (dev->type == &peci_client_type)
+			? to_peci_client(dev)
+			: NULL;
+}
+
+static void peci_adapter_dev_release(struct device *dev)
+{
+	/* do nothing */
+}
+
+static struct attribute *peci_adapter_attrs[] = {
+	&dev_attr_name.attr,
+	NULL
+};
+ATTRIBUTE_GROUPS(peci_adapter);
+
+static struct device_type peci_adapter_type = {
+	.groups		= peci_adapter_groups,
+	.release	= peci_adapter_dev_release,
+};
+
+static struct peci_adapter *peci_verify_adapter(struct device *dev)
+{
+	return (dev->type == &peci_adapter_type)
+			? to_peci_adapter(dev)
+			: NULL;
+}
+
+static struct peci_adapter *peci_get_adapter(int nr)
+{
+	struct peci_adapter *adapter;
+
+	mutex_lock(&core_lock);
+	adapter = idr_find(&peci_adapter_idr, nr);
+	if (!adapter)
+		goto out_unlock;
+
+	if (try_module_get(adapter->owner))
+		get_device(&adapter->dev);
+	else
+		adapter = NULL;
+
+out_unlock:
+	mutex_unlock(&core_lock);
+	return adapter;
+}
+
+static void peci_put_adapter(struct peci_adapter *adapter)
+{
+	if (!adapter)
+		return;
+
+	put_device(&adapter->dev);
+	module_put(adapter->owner);
+}
+
+static u8 peci_aw_fcs(u8 *data, int len)
+{
+	return crc8(peci_crc8_table, data, (size_t)len, 0);
+}
+
+static int __peci_xfer(struct peci_adapter *adapter, struct peci_xfer_msg *msg,
+		       bool do_retry, bool has_aw_fcs)
+{
+	ktime_t start, end;
+	s64 elapsed_ms;
+	int rc = 0;
+
+	/*
+	 * For some commands, the PECI originator may need to retry a command if
+	 * the processor PECI client responds with a 0x8x completion code. In
+	 * each instance, the processor PECI client may have started the
+	 * operation but not completed it yet. When the 'retry' bit is set, the
+	 * PECI client will ignore a new request if it exactly matches a
+	 * previous valid request.
+	 */
+
+	if (do_retry)
+		start = ktime_get();
+
+	do {
+		rc = adapter->xfer(adapter, msg);
+
+		if (!do_retry || rc)
+			break;
+
+		if (msg->rx_buf[0] == DEV_PECI_CC_SUCCESS)
+			break;
+
+		/* Retry is needed when completion code is 0x8x */
+		if ((msg->rx_buf[0] & DEV_PECI_CC_RETRY_CHECK_MASK) !=
+		    DEV_PECI_CC_NEED_RETRY) {
+			rc = -EIO;
+			break;
+		}
+
+		/* Set the retry bit to indicate a retry attempt */
+		msg->tx_buf[1] |= DEV_PECI_RETRY_BIT;
+
+		/* Recalculate the AW FCS if it has one */
+		if (has_aw_fcs)
+			msg->tx_buf[msg->tx_len - 1] = 0x80 ^
+						peci_aw_fcs((u8 *)msg,
+							    2 + msg->tx_len);
+
+		/*
+		 * Retry for at least 250ms before returning an error.
+		 * Retry interval guideline:
+		 *   No minimum < Retry Interval < No maximum
+		 *                (recommend 10ms)
+		 */
+		end = ktime_get();
+		elapsed_ms = ktime_to_ms(ktime_sub(end, start));
+		if (elapsed_ms >= DEV_PECI_RETRY_TIME_MS) {
+			dev_dbg(&adapter->dev, "Timeout retrying xfer!\n");
+			rc = -ETIMEDOUT;
+			break;
+		}
+
+		usleep_range(DEV_PECI_RETRY_INTERVAL_MS * 1000,
+			     (DEV_PECI_RETRY_INTERVAL_MS * 1000) + 1000);
+	} while (true);
+
+	if (rc)
+		dev_dbg(&adapter->dev, "xfer error, rc: %d\n", rc);
+
+	return rc;
+}
+
+static int peci_xfer(struct peci_adapter *adapter, struct peci_xfer_msg *msg)
+{
+	return __peci_xfer(adapter, msg, false, false);
+}
+
+static int peci_xfer_with_retries(struct peci_adapter *adapter,
+				  struct peci_xfer_msg *msg,
+				  bool has_aw_fcs)
+{
+	return __peci_xfer(adapter, msg, true, has_aw_fcs);
+}
+
+static int peci_scan_cmd_mask(struct peci_adapter *adapter)
+{
+	struct peci_xfer_msg msg;
+	int rc = 0;
+	u32 dib;
+
+	/* Update command mask just once */
+	if (adapter->cmd_mask & BIT(PECI_CMD_PING))
+		return 0;
+
+	msg.addr      = PECI_BASE_ADDR;
+	msg.tx_len    = GET_DIB_WR_LEN;
+	msg.rx_len    = GET_DIB_RD_LEN;
+	msg.tx_buf[0] = GET_DIB_PECI_CMD;
+
+	rc = peci_xfer(adapter, &msg);
+	if (rc)
+		return rc;
+
+	dib = msg.rx_buf[0] | (msg.rx_buf[1] << 8) |
+	      (msg.rx_buf[2] << 16) | (msg.rx_buf[3] << 24);
+
+	/* Check special case for Get DIB command */
+	if (dib == 0x00) {
+		dev_dbg(&adapter->dev, "DIB read as 0x00\n");
+		return -EIO;
+	}
+
+	/*
+	 * Setting up the supporting commands based on minor revision number.
+	 * See PECI Spec Table 3-1.
+	 */
+	switch (GET_MINOR_REV_NUM(dib)) {
+	case 6:
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_IA_MSR);
+		/* fallthrough */
+	case 5:
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG);
+		/* fallthrough */
+	case 4:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG);
+		/* fallthrough */
+	case 3:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG_LOCAL);
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG_LOCAL);
+		/* fallthrough */
+	case 2:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_IA_MSR);
+		/* fallthrough */
+	case 1:
+		adapter->cmd_mask |= BIT(PECI_CMD_RD_PKG_CFG);
+		adapter->cmd_mask |= BIT(PECI_CMD_WR_PKG_CFG);
+	}
+
+	adapter->cmd_mask |= BIT(PECI_CMD_GET_TEMP);
+	adapter->cmd_mask |= BIT(PECI_CMD_GET_DIB);
+	adapter->cmd_mask |= BIT(PECI_CMD_PING);
+
+	return rc;
+}
+
+static int peci_cmd_support(struct peci_adapter *adapter, enum peci_cmd cmd)
+{
+	if (!(adapter->cmd_mask & BIT(PECI_CMD_PING)) &&
+	    peci_scan_cmd_mask(adapter) < 0) {
+		dev_dbg(&adapter->dev, "Failed to scan command mask\n");
+		return -EIO;
+	}
+
+	if (!(adapter->cmd_mask & BIT(cmd))) {
+		dev_dbg(&adapter->dev, "Command %d is not supported\n", cmd);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int peci_ioctl_ping(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_ping_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+
+	msg.addr   = umsg->addr;
+	msg.tx_len = 0;
+	msg.rx_len = 0;
+
+	return peci_xfer(adapter, &msg);
+}
+
+static int peci_ioctl_get_dib(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_get_dib_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc;
+
+	msg.addr      = umsg->addr;
+	msg.tx_len    = GET_DIB_WR_LEN;
+	msg.rx_len    = GET_DIB_RD_LEN;
+	msg.tx_buf[0] = GET_DIB_PECI_CMD;
+
+	rc = peci_xfer(adapter, &msg);
+	if (rc)
+		return rc;
+
+	umsg->dib = msg.rx_buf[0] | (msg.rx_buf[1] << 8) |
+		    (msg.rx_buf[2] << 16) | ((u32)msg.rx_buf[3] << 24);
+
+	return 0;
+}
+
+static int peci_ioctl_get_temp(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_get_temp_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc;
+
+	msg.addr      = umsg->addr;
+	msg.tx_len    = GET_TEMP_WR_LEN;
+	msg.rx_len    = GET_TEMP_RD_LEN;
+	msg.tx_buf[0] = GET_TEMP_PECI_CMD;
+
+	rc = peci_xfer(adapter, &msg);
+	if (rc)
+		return rc;
+
+	umsg->temp_raw = msg.rx_buf[0] | (msg.rx_buf[1] << 8);
+
+	return 0;
+}
+
+static int peci_ioctl_rd_pkg_cfg(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_pkg_cfg_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0;
+
+	/* Per the PECI spec, the read length must be a byte, word, or dword */
+	if (umsg->rx_len != 1 && umsg->rx_len != 2 && umsg->rx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid read length, rx_len: %d\n",
+			umsg->rx_len);
+		return -EINVAL;
+	}
+
+	msg.addr = umsg->addr;
+	msg.tx_len = RDPKGCFG_WRITE_LEN;
+	/* read lengths of 1 and 2 result in an error, so only use 4 for now */
+	msg.rx_len = RDPKGCFG_READ_LEN_BASE + umsg->rx_len;
+	msg.tx_buf[0] = RDPKGCFG_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = umsg->index;            /* RdPkgConfig index */
+	msg.tx_buf[3] = (u8)umsg->param;        /* LSB - Config parameter */
+	msg.tx_buf[4] = (u8)(umsg->param >> 8); /* MSB - Config parameter */
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(umsg->pkg_config, &msg.rx_buf[1], umsg->rx_len);
+
+	return rc;
+}
+
+static int peci_ioctl_wr_pkg_cfg(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_wr_pkg_cfg_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0, i;
+
+	/* Per the PECI spec, the write length must be a dword */
+	if (umsg->tx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid write length, tx_len: %d\n",
+			umsg->tx_len);
+		return -EINVAL;
+	}
+
+	msg.addr = umsg->addr;
+	msg.tx_len = WRPKGCFG_WRITE_LEN_BASE + umsg->tx_len;
+	/* the response length is fixed for WrPkgConfig */
+	msg.rx_len = WRPKGCFG_READ_LEN;
+	msg.tx_buf[0] = WRPKGCFG_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = umsg->index;            /* WrPkgConfig index */
+	msg.tx_buf[3] = (u8)umsg->param;        /* LSB - Config parameter */
+	msg.tx_buf[4] = (u8)(umsg->param >> 8); /* MSB - Config parameter */
+	for (i = 0; i < umsg->tx_len; i++)
+		msg.tx_buf[5 + i] = (u8)(umsg->value >> (i << 3));
+
+	/* Add an Assure Write Frame Check Sequence byte */
+	msg.tx_buf[5 + i] = 0x80 ^
+			    peci_aw_fcs((u8 *)&msg, 8 + umsg->tx_len);
+
+	rc = peci_xfer_with_retries(adapter, &msg, true);
+
+	return rc;
+}
+
+static int peci_ioctl_rd_ia_msr(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_ia_msr_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0;
+
+	msg.addr = umsg->addr;
+	msg.tx_len = RDIAMSR_WRITE_LEN;
+	msg.rx_len = RDIAMSR_READ_LEN;
+	msg.tx_buf[0] = RDIAMSR_PECI_CMD;
+	msg.tx_buf[1] = 0x00;
+	msg.tx_buf[2] = umsg->thread_id;
+	msg.tx_buf[3] = (u8)umsg->address;
+	msg.tx_buf[4] = (u8)(umsg->address >> 8);
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(&umsg->value, &msg.rx_buf[1], sizeof(uint64_t));
+
+	return rc;
+}
+
+static int peci_ioctl_rd_pci_cfg(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_pci_cfg_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	u32 address;
+	int rc = 0;
+
+	address = umsg->reg;                  /* [11:0]  - Register */
+	address |= (u32)umsg->function << 12; /* [14:12] - Function */
+	address |= (u32)umsg->device << 15;   /* [19:15] - Device   */
+	address |= (u32)umsg->bus << 20;      /* [27:20] - Bus      */
+					      /* [31:28] - Reserved */
+	msg.addr = umsg->addr;
+	msg.tx_len = RDPCICFG_WRITE_LEN;
+	msg.rx_len = RDPCICFG_READ_LEN;
+	msg.tx_buf[0] = RDPCICFG_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = (u8)address;         /* LSB - PCI Config Address */
+	msg.tx_buf[3] = (u8)(address >> 8);  /* PCI Config Address */
+	msg.tx_buf[4] = (u8)(address >> 16); /* PCI Config Address */
+	msg.tx_buf[5] = (u8)(address >> 24); /* MSB - PCI Config Address */
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(umsg->pci_config, &msg.rx_buf[1], 4);
+
+	return rc;
+}
+
+static int peci_ioctl_rd_pci_cfg_local(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_rd_pci_cfg_local_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	u32 address;
+	int rc = 0;
+
+	/* Per the PECI spec, the read length must be a byte, word, or dword */
+	if (umsg->rx_len != 1 && umsg->rx_len != 2 && umsg->rx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid read length, rx_len: %d\n",
+			umsg->rx_len);
+		return -EINVAL;
+	}
+
+	address = umsg->reg;                  /* [11:0]  - Register */
+	address |= (u32)umsg->function << 12; /* [14:12] - Function */
+	address |= (u32)umsg->device << 15;   /* [19:15] - Device   */
+	address |= (u32)umsg->bus << 20;      /* [23:20] - Bus      */
+
+	msg.addr = umsg->addr;
+	msg.tx_len = RDPCICFGLOCAL_WRITE_LEN;
+	msg.rx_len = RDPCICFGLOCAL_READ_LEN_BASE + umsg->rx_len;
+	msg.tx_buf[0] = RDPCICFGLOCAL_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = (u8)address;       /* LSB - PCI Configuration Address */
+	msg.tx_buf[3] = (u8)(address >> 8);  /* PCI Configuration Address */
+	msg.tx_buf[4] = (u8)(address >> 16); /* PCI Configuration Address */
+
+	rc = peci_xfer_with_retries(adapter, &msg, false);
+	if (!rc)
+		memcpy(umsg->pci_config, &msg.rx_buf[1], umsg->rx_len);
+
+	return rc;
+}
+
+static int peci_ioctl_wr_pci_cfg_local(struct peci_adapter *adapter, void *vmsg)
+{
+	struct peci_wr_pci_cfg_local_msg *umsg = vmsg;
+	struct peci_xfer_msg msg;
+	int rc = 0, i;
+	u32 address;
+
+	/* Per the PECI spec, the write length must be a byte, word, or dword */
+	if (umsg->tx_len != 1 && umsg->tx_len != 2 && umsg->tx_len != 4) {
+		dev_dbg(&adapter->dev, "Invalid write length, tx_len: %d\n",
+			umsg->tx_len);
+		return -EINVAL;
+	}
+
+	address = umsg->reg;                  /* [11:0]  - Register */
+	address |= (u32)umsg->function << 12; /* [14:12] - Function */
+	address |= (u32)umsg->device << 15;   /* [19:15] - Device   */
+	address |= (u32)umsg->bus << 20;      /* [23:20] - Bus      */
+
+	msg.addr = umsg->addr;
+	msg.tx_len = WRPCICFGLOCAL_WRITE_LEN_BASE + umsg->tx_len;
+	msg.rx_len = WRPCICFGLOCAL_READ_LEN;
+	msg.tx_buf[0] = WRPCICFGLOCAL_PECI_CMD;
+	msg.tx_buf[1] = 0x00;         /* request byte for Host ID | Retry bit */
+				      /* Host ID is 0 for PECI 3.0 */
+	msg.tx_buf[2] = (u8)address;       /* LSB - PCI Configuration Address */
+	msg.tx_buf[3] = (u8)(address >> 8);  /* PCI Configuration Address */
+	msg.tx_buf[4] = (u8)(address >> 16); /* PCI Configuration Address */
+	for (i = 0; i < umsg->tx_len; i++)
+		msg.tx_buf[5 + i] = (u8)(umsg->value >> (i << 3));
+
+	/* Add an Assure Write Frame Check Sequence byte */
+	msg.tx_buf[5 + i] = 0x80 ^
+			    peci_aw_fcs((u8 *)&msg, 8 + umsg->tx_len);
+
+	rc = peci_xfer_with_retries(adapter, &msg, true);
+
+	return rc;
+}
+
+typedef int (*peci_ioctl_fn_type)(struct peci_adapter *, void *);
+
+static const peci_ioctl_fn_type peci_ioctl_fn[PECI_CMD_MAX] = {
+	NULL, /* Reserved */
+	peci_ioctl_ping,
+	peci_ioctl_get_dib,
+	peci_ioctl_get_temp,
+	peci_ioctl_rd_pkg_cfg,
+	peci_ioctl_wr_pkg_cfg,
+	peci_ioctl_rd_ia_msr,
+	NULL, /* Reserved */
+	peci_ioctl_rd_pci_cfg,
+	NULL, /* Reserved */
+	peci_ioctl_rd_pci_cfg_local,
+	peci_ioctl_wr_pci_cfg_local,
+};
+
+int peci_command(struct peci_adapter *adapter, enum peci_cmd cmd, void *vmsg)
+{
+	int rc = 0;
+
+	if (cmd >= PECI_CMD_MAX || cmd < PECI_CMD_XFER)
+		return -EINVAL;
+
+	dev_dbg(&adapter->dev, "%s, cmd=0x%02x\n", __func__, cmd);
+
+	if (!peci_ioctl_fn[cmd])
+		return -EINVAL;
+
+	rt_mutex_lock(&adapter->bus_lock);
+
+	rc = peci_cmd_support(adapter, cmd);
+	if (!rc)
+		rc = peci_ioctl_fn[cmd](adapter, vmsg);
+
+	rt_mutex_unlock(&adapter->bus_lock);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(peci_command);
+
+static long peci_ioctl(struct file *file, unsigned int iocmd, unsigned long arg)
+{
+	struct peci_adapter *adapter = file->private_data;
+	void __user *argp = (void __user *)arg;
+	unsigned int msg_len;
+	enum peci_cmd cmd;
+	int rc = 0;
+	u8 *msg;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	dev_dbg(&adapter->dev, "ioctl, cmd=0x%x, arg=0x%lx\n", iocmd, arg);
+
+	switch (iocmd) {
+	case PECI_IOC_PING:
+	case PECI_IOC_GET_DIB:
+	case PECI_IOC_GET_TEMP:
+	case PECI_IOC_RD_PKG_CFG:
+	case PECI_IOC_WR_PKG_CFG:
+	case PECI_IOC_RD_IA_MSR:
+	case PECI_IOC_RD_PCI_CFG:
+	case PECI_IOC_RD_PCI_CFG_LOCAL:
+	case PECI_IOC_WR_PCI_CFG_LOCAL:
+		cmd = _IOC_NR(iocmd);
+		msg_len = _IOC_SIZE(iocmd);
+		break;
+
+	default:
+		dev_dbg(&adapter->dev, "Invalid ioctl cmd : 0x%x\n", iocmd);
+		return -ENOTTY;
+	}
+
+	msg = memdup_user(argp, msg_len);
+	if (IS_ERR(msg))
+		return PTR_ERR(msg);
+
+	rc = peci_command(adapter, cmd, msg);
+
+	if (!rc && copy_to_user(argp, msg, msg_len))
+		rc = -EFAULT;
+
+	kfree(msg);
+	return (long)rc;
+}
+
+static int peci_open(struct inode *inode, struct file *file)
+{
+	unsigned int minor = iminor(inode);
+	struct peci_adapter *adapter;
+
+	adapter = peci_get_adapter(minor);
+	if (!adapter)
+		return -ENODEV;
+
+	file->private_data = adapter;
+
+	return 0;
+}
+
+static int peci_release(struct inode *inode, struct file *file)
+{
+	struct peci_adapter *adapter = file->private_data;
+
+	peci_put_adapter(adapter);
+	file->private_data = NULL;
+
+	return 0;
+}
+
+static const struct file_operations peci_fops = {
+	.owner          = THIS_MODULE,
+	.unlocked_ioctl = peci_ioctl,
+	.open           = peci_open,
+	.release        = peci_release,
+};
+
+static int peci_detect(struct peci_adapter *adapter, u8 addr)
+{
+	struct peci_ping_msg msg;
+
+	msg.addr = addr;
+
+	return peci_command(adapter, PECI_CMD_PING, &msg);
+}
+
+#if IS_ENABLED(CONFIG_OF)
+static const struct of_device_id *
+peci_of_match_device(const struct of_device_id *matches,
+		     struct peci_client *client)
+{
+	if (!(client && matches))
+		return NULL;
+
+	return of_match_device(matches, &client->dev);
+}
+#endif
+
+static const struct peci_device_id *
+peci_match_id(const struct peci_device_id *id, struct peci_client *client)
+{
+	if (!(id && client))
+		return NULL;
+
+	while (id->name[0]) {
+		if (strcmp(client->name, id->name) == 0)
+			return id;
+		id++;
+	}
+
+	return NULL;
+}
+
+static int peci_device_match(struct device *dev, struct device_driver *drv)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_driver *driver;
+
+	/* Attempt an OF style match */
+	if (peci_of_match_device(drv->of_match_table, client))
+		return 1;
+
+	driver = to_peci_driver(drv);
+
+	if (peci_match_id(driver->id_table, client))
+		return 1;
+
+	return 0;
+}
+
+static int peci_device_probe(struct device *dev)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_driver *driver;
+	int status = -EINVAL;
+
+	if (!client)
+		return 0;
+
+	driver = to_peci_driver(dev->driver);
+	if (!driver->id_table &&
+	    !peci_of_match_device(dev->driver->of_match_table, client))
+		return -ENODEV;
+
+	dev_dbg(dev, "%s: name:%s\n", __func__, client->name);
+	if (driver->probe)
+		status = driver->probe(client);
+
+	return status;
+}
+
+static int peci_device_remove(struct device *dev)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_driver *driver;
+	int status = 0;
+
+	if (!client || !dev->driver)
+		return 0;
+
+	driver = to_peci_driver(dev->driver);
+	if (driver->remove) {
+		dev_dbg(dev, "%s: name:%s\n", __func__, client->name);
+		status = driver->remove(client);
+	}
+
+	return status;
+}
+
+static void peci_device_shutdown(struct device *dev)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_driver *driver;
+
+	if (!client || !dev->driver)
+		return;
+
+	dev_dbg(dev, "%s: name:%s\n", __func__, client->name);
+
+	driver = to_peci_driver(dev->driver);
+	if (driver->shutdown)
+		driver->shutdown(client);
+}
+
+static struct bus_type peci_bus_type = {
+	.name		= "peci",
+	.match		= peci_device_match,
+	.probe		= peci_device_probe,
+	.remove		= peci_device_remove,
+	.shutdown	= peci_device_shutdown,
+};
+
+static void peci_unregister_device(struct peci_client *client)
+{
+	if (client->dev.of_node)
+		of_node_clear_flag(client->dev.of_node, OF_POPULATED);
+
+	device_unregister(&client->dev);
+}
+
+static int peci_check_addr_validity(u8 addr)
+{
+	if (addr < PECI_BASE_ADDR || addr > PECI_BASE_ADDR + PECI_OFFSET_MAX)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int peci_check_client_busy(struct device *dev, void *client_new_p)
+{
+	struct peci_client *client = peci_verify_client(dev);
+	struct peci_client *client_new = client_new_p;
+
+	if (client && client->addr == client_new->addr &&
+	    client->idx == client_new->idx)
+		return -EBUSY;
+
+	return 0;
+}
+
+static struct peci_client *peci_new_device(struct peci_adapter *adapter,
+					   struct peci_board_info const *info)
+{
+	struct peci_client *client;
+	int rc;
+
+	client = kzalloc(sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return NULL;
+
+	client->adapter = adapter;
+	client->addr = info->addr;
+	strlcpy(client->name, info->type, sizeof(client->name));
+
+	rc = peci_check_addr_validity(client->addr);
+	if (rc) {
+		dev_err(&adapter->dev, "Invalid PECI CPU address 0x%02x\n",
+			client->addr);
+		goto err_free_client_silent;
+	}
+
+	/* Check client's online status */
+	rc = peci_detect(adapter, client->addr);
+	if (rc)
+		goto err_free_client;
+
+	for (client->idx = 0; client->idx < PECI_DEV_IDX_MAX; client->idx++) {
+		rc = device_for_each_child(&adapter->dev, client,
+					   peci_check_client_busy);
+		if (!rc)
+			break;
+	}
+
+	if (rc || client->idx == PECI_DEV_IDX_MAX)
+		goto err_free_client;
+
+	client->dev.parent = &client->adapter->dev;
+	client->dev.bus = &peci_bus_type;
+	client->dev.type = &peci_client_type;
+	client->dev.of_node = info->of_node;
+	dev_set_name(&client->dev, "%d-%02x:%02x",
+		     adapter->nr, client->addr, client->idx);
+
+	rc = device_register(&client->dev);
+	if (rc)
+		goto err_free_client;
+
+	dev_dbg(&adapter->dev, "client [%s] registered with bus id %s\n",
+		client->name, dev_name(&client->dev));
+
+	return client;
+
+err_free_client:
+	dev_err(&adapter->dev,
+		"Failed to register peci client %s at 0x%02x (%d)\n",
+		client->name, client->addr, rc);
+err_free_client_silent:
+	kfree(client);
+	return NULL;
+}
+
+#if IS_ENABLED(CONFIG_OF)
+static struct peci_client *peci_of_register_device(struct peci_adapter *adapter,
+						   struct device_node *node)
+{
+	struct peci_board_info info = {};
+	struct peci_client *result;
+	const __be32 *addr_be;
+	u32 addr;
+	int len;
+
+	dev_dbg(&adapter->dev, "register %s\n", node->full_name);
+
+	if (of_modalias_node(node, info.type, sizeof(info.type)) < 0) {
+		dev_err(&adapter->dev, "modalias failure on %s\n",
+			node->full_name);
+		return ERR_PTR(-EINVAL);
+	}
+
+	addr_be = of_get_property(node, "reg", &len);
+	if (!addr_be || len < sizeof(*addr_be)) {
+		dev_err(&adapter->dev, "invalid reg on %s\n",
+			node->full_name);
+		return ERR_PTR(-EINVAL);
+	}
+
+	addr = be32_to_cpup(addr_be);
+
+	if (peci_check_addr_validity(addr)) {
+		dev_err(&adapter->dev, "invalid addr=%x on %s\n",
+			addr, node->full_name);
+		return ERR_PTR(-EINVAL);
+	}
+
+	info.addr = addr;
+	info.of_node = of_node_get(node);
+
+	result = peci_new_device(adapter, &info);
+	if (result)
+		return result;
+
+	of_node_put(node);
+	return ERR_PTR(-EINVAL);
+}
+
+static void peci_of_register_devices(struct peci_adapter *adapter)
+{
+	struct device_node *bus, *node;
+	struct peci_client *client;
+
+	/* Only register child devices if the adapter has a node pointer set */
+	if (!adapter->dev.of_node)
+		return;
+
+	bus = of_get_child_by_name(adapter->dev.of_node, "peci-bus");
+	if (!bus)
+		bus = of_node_get(adapter->dev.of_node);
+
+	for_each_available_child_of_node(bus, node) {
+		if (of_node_test_and_set_flag(node, OF_POPULATED))
+			continue;
+
+		client = peci_of_register_device(adapter, node);
+		if (IS_ERR(client)) {
+			dev_warn(&adapter->dev,
+				 "Failed to create PECI device for %s\n",
+				 node->full_name);
+			of_node_clear_flag(node, OF_POPULATED);
+		}
+	}
+
+	of_node_put(bus);
+}
+
+static int peci_of_match_node(struct device *dev, void *data)
+{
+	return dev->of_node == data;
+}
+
+/* must call put_device() when done with returned peci_client device */
+static struct peci_client *peci_of_find_device(struct device_node *node)
+{
+	struct peci_client *client;
+	struct device *dev;
+
+	dev = bus_find_device(&peci_bus_type, NULL, node, peci_of_match_node);
+	if (!dev)
+		return NULL;
+
+	client = peci_verify_client(dev);
+	if (!client)
+		put_device(dev);
+
+	return client;
+}
+
+/* must call put_device() when done with returned peci_adapter device */
+static struct peci_adapter *peci_of_find_adapter(struct device_node *node)
+{
+	struct peci_adapter *adapter;
+	struct device *dev;
+
+	dev = bus_find_device(&peci_bus_type, NULL, node, peci_of_match_node);
+	if (!dev)
+		return NULL;
+
+	adapter = peci_verify_adapter(dev);
+	if (!adapter)
+		put_device(dev);
+
+	return adapter;
+}
+#else
+static void peci_of_register_devices(struct peci_adapter *adapter) { }
+#endif /* CONFIG_OF */
+
+#if IS_ENABLED(CONFIG_OF_DYNAMIC)
+static int peci_of_notify(struct notifier_block *nb,
+			  unsigned long action,
+			  void *arg)
+{
+	struct of_reconfig_data *rd = arg;
+	struct peci_adapter *adapter;
+	struct peci_client *client;
+
+	switch (of_reconfig_get_state_change(action, rd)) {
+	case OF_RECONFIG_CHANGE_ADD:
+		adapter = peci_of_find_adapter(rd->dn->parent);
+		if (!adapter)
+			return NOTIFY_OK;	/* not for us */
+
+		if (of_node_test_and_set_flag(rd->dn, OF_POPULATED)) {
+			put_device(&adapter->dev);
+			return NOTIFY_OK;
+		}
+
+		client = peci_of_register_device(adapter, rd->dn);
+		put_device(&adapter->dev);
+
+		if (IS_ERR(client)) {
+			dev_err(&adapter->dev,
+				"failed to create client for '%s'\n",
+				rd->dn->full_name);
+			of_node_clear_flag(rd->dn, OF_POPULATED);
+			return notifier_from_errno(PTR_ERR(client));
+		}
+		break;
+	case OF_RECONFIG_CHANGE_REMOVE:
+		/* already depopulated? */
+		if (!of_node_check_flag(rd->dn, OF_POPULATED))
+			return NOTIFY_OK;
+
+		/* find our device by node */
+		client = peci_of_find_device(rd->dn);
+		if (!client)
+			return NOTIFY_OK;	/* no? not meant for us */
+
+		/* unregister takes one ref away */
+		peci_unregister_device(client);
+
+		/* and put the reference of the find */
+		put_device(&client->dev);
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block peci_of_notifier = {
+	.notifier_call = peci_of_notify,
+};
+#else
+extern struct notifier_block peci_of_notifier;
+#endif /* CONFIG_OF_DYNAMIC */
+
+static int peci_register_adapter(struct peci_adapter *adapter)
+{
+	int rc = -EINVAL;
+
+	/* Can't register until after driver model init */
+	if (WARN_ON(!is_registered))
+		goto err_free_idr;
+
+	if (WARN(!adapter->name[0], "peci adapter has no name\n"))
+		goto err_free_idr;
+
+	if (WARN(!adapter->xfer, "peci adapter has no xfer function\n"))
+		goto err_free_idr;
+
+	rt_mutex_init(&adapter->bus_lock);
+
+	dev_set_name(&adapter->dev, "peci%d", adapter->nr);
+	adapter->dev.bus = &peci_bus_type;
+	adapter->dev.type = &peci_adapter_type;
+	device_initialize(&adapter->dev);
+
+	/* cdev */
+	cdev_init(&adapter->cdev, &peci_fops);
+	adapter->cdev.owner = THIS_MODULE;
+	adapter->cdev.kobj.parent = &adapter->dev.kobj;
+	adapter->dev.devt = MKDEV(MAJOR(peci_devt), adapter->nr);
+	rc = cdev_add(&adapter->cdev, adapter->dev.devt, 1);
+	if (rc) {
+		pr_err("adapter '%s': can't add cdev (%d)\n",
+		       adapter->name, rc);
+		goto err_free_idr;
+	}
+	rc = device_add(&adapter->dev);
+	if (rc) {
+		pr_err("adapter '%s': can't add device (%d)\n",
+		       adapter->name, rc);
+		goto err_del_cdev;
+	}
+
+	dev_dbg(&adapter->dev, "adapter [%s] registered\n", adapter->name);
+
+	/* create pre-declared device nodes */
+	peci_of_register_devices(adapter);
+
+	return 0;
+
+err_del_cdev:
+	cdev_del(&adapter->cdev);
+err_free_idr:
+	mutex_lock(&core_lock);
+	idr_remove(&peci_adapter_idr, adapter->nr);
+	mutex_unlock(&core_lock);
+	return rc;
+}
+
+static int peci_add_numbered_adapter(struct peci_adapter *adapter)
+{
+	int id;
+
+	mutex_lock(&core_lock);
+	id = idr_alloc(&peci_adapter_idr, adapter,
+		       adapter->nr, adapter->nr + 1, GFP_KERNEL);
+	mutex_unlock(&core_lock);
+	if (WARN(id < 0, "couldn't get idr\n"))
+		return id == -ENOSPC ? -EBUSY : id;
+
+	return peci_register_adapter(adapter);
+}
+
+int peci_add_adapter(struct peci_adapter *adapter)
+{
+	struct device *dev = &adapter->dev;
+	int id;
+
+	if (dev->of_node) {
+		id = of_alias_get_id(dev->of_node, "peci");
+		if (id >= 0) {
+			adapter->nr = id;
+			return peci_add_numbered_adapter(adapter);
+		}
+	}
+
+	mutex_lock(&core_lock);
+	id = idr_alloc(&peci_adapter_idr, adapter, 0, 0, GFP_KERNEL);
+	mutex_unlock(&core_lock);
+	if (WARN(id < 0, "couldn't get idr\n"))
+		return id;
+
+	adapter->nr = id;
+
+	return peci_register_adapter(adapter);
+}
+EXPORT_SYMBOL_GPL(peci_add_adapter);
+
+static int peci_unregister_client(struct device *dev, void *dummy)
+{
+	struct peci_client *client = peci_verify_client(dev);
+
+	if (client)
+		peci_unregister_device(client);
+
+	return 0;
+}
+
+void peci_del_adapter(struct peci_adapter *adapter)
+{
+	struct peci_adapter *found;
+
+	/* First make sure that this adapter was ever added */
+	mutex_lock(&core_lock);
+	found = idr_find(&peci_adapter_idr, adapter->nr);
+	mutex_unlock(&core_lock);
+
+	if (found != adapter)
+		return;
+
+	/*
+	 * Detach any active clients. This can't fail, thus we do not
+	 * check the returned value.
+	 */
+	device_for_each_child(&adapter->dev, NULL, peci_unregister_client);
+
+	/* device name is gone after device_unregister */
+	dev_dbg(&adapter->dev, "adapter [%s] unregistered\n", adapter->name);
+
+	device_unregister(&adapter->dev);
+
+	/* free cdev */
+	cdev_del(&adapter->cdev);
+
+	/* free bus id */
+	mutex_lock(&core_lock);
+	idr_remove(&peci_adapter_idr, adapter->nr);
+	mutex_unlock(&core_lock);
+}
+EXPORT_SYMBOL_GPL(peci_del_adapter);
+
+/*
+ * A peci_driver is used with one or more peci_client (device) nodes to access
+ * peci clients, on a bus instance associated with some peci_adapter.
+ */
+int peci_register_driver(struct module *owner, struct peci_driver *driver)
+{
+	int rc;
+
+	/* Can't register until after driver model init */
+	if (WARN_ON(!is_registered))
+		return -EAGAIN;
+
+	/* add the driver to the list of peci drivers in the driver core */
+	driver->driver.owner = owner;
+	driver->driver.bus = &peci_bus_type;
+
+	/*
+	 * When registration returns, the driver core
+	 * will have called probe() for all matching-but-unbound devices.
+	 */
+	rc = driver_register(&driver->driver);
+	if (rc)
+		return rc;
+
+	pr_debug("driver [%s] registered\n", driver->driver.name);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(peci_register_driver);
+
+void peci_del_driver(struct peci_driver *driver)
+{
+	driver_unregister(&driver->driver);
+	pr_debug("driver [%s] unregistered\n", driver->driver.name);
+}
+EXPORT_SYMBOL_GPL(peci_del_driver);
+
+static int __init peci_init(void)
+{
+	int ret;
+
+	ret = bus_register(&peci_bus_type);
+	if (ret < 0) {
+		pr_err("peci: Failed to register PECI bus type!\n");
+		return ret;
+	}
+
+	ret = alloc_chrdev_region(&peci_devt, 0, PECI_CDEV_MAX, "peci");
+	if (ret < 0) {
+		pr_err("peci: Failed to allocate chr dev region!\n");
+		bus_unregister(&peci_bus_type);
+		return ret;
+	}
+
+	crc8_populate_msb(peci_crc8_table, PECI_CRC8_POLYNOMIAL);
+
+	if (IS_ENABLED(CONFIG_OF_DYNAMIC))
+		WARN_ON(of_reconfig_notifier_register(&peci_of_notifier));
+
+	is_registered = true;
+
+	return 0;
+}
+
+static void __exit peci_exit(void)
+{
+	if (IS_ENABLED(CONFIG_OF_DYNAMIC))
+		WARN_ON(of_reconfig_notifier_unregister(&peci_of_notifier));
+
+	unregister_chrdev_region(peci_devt, PECI_CDEV_MAX);
+	bus_unregister(&peci_bus_type);
+}
+
+postcore_initcall(peci_init);
+module_exit(peci_exit);
+
+MODULE_AUTHOR("Jason M Bills <jason.m.bills@linux.intel.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("PECI bus core module");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/peci.h b/include/linux/peci.h
new file mode 100644
index 000000000000..8730deb6673c
--- /dev/null
+++ b/include/linux/peci.h
@@ -0,0 +1,107 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018 Intel Corporation */
+
+#ifndef __LINUX_PECI_H
+#define __LINUX_PECI_H
+
+#include <linux/cdev.h>
+#include <linux/device.h>
+#include <linux/peci-ioctl.h>
+#include <linux/rtmutex.h>
+
+#define PECI_BUFFER_SIZE  32
+#define PECI_NAME_SIZE    32
+
+struct peci_xfer_msg {
+	u8	addr;
+	u8	tx_len;
+	u8	rx_len;
+	u8	tx_buf[PECI_BUFFER_SIZE];
+	u8	rx_buf[PECI_BUFFER_SIZE];
+} __attribute__((__packed__));
+
+struct peci_board_info {
+	char			type[PECI_NAME_SIZE];
+	u8			addr;	/* CPU client address */
+	struct device_node	*of_node;
+};
+
+struct peci_adapter {
+	struct module	*owner;
+	struct rt_mutex	bus_lock;
+	struct device	dev;
+	struct cdev	cdev;
+	int		nr;
+	char		name[PECI_NAME_SIZE];
+	int		(*xfer)(struct peci_adapter *adapter,
+				struct peci_xfer_msg *msg);
+	uint		cmd_mask;
+};
+
+static inline struct peci_adapter *to_peci_adapter(void *d)
+{
+	return container_of(d, struct peci_adapter, dev);
+}
+
+static inline void *peci_get_adapdata(const struct peci_adapter *adapter)
+{
+	return dev_get_drvdata(&adapter->dev);
+}
+
+static inline void peci_set_adapdata(struct peci_adapter *adapter, void *data)
+{
+	dev_set_drvdata(&adapter->dev, data);
+}
+
+struct peci_client {
+	struct device		dev;		/* the device structure */
+	struct peci_adapter	*adapter;	/* the adapter we sit on */
+	u8			addr;		/* CPU client address */
+	u8			idx;		/* device index */
+	char			name[PECI_NAME_SIZE];
+};
+
+static inline struct peci_client *to_peci_client(void *d)
+{
+	return container_of(d, struct peci_client, dev);
+}
+
+struct peci_device_id {
+	char		name[PECI_NAME_SIZE];
+	kernel_ulong_t	driver_data;	/* Data private to the driver */
+};
+
+struct peci_driver {
+	int				(*probe)(struct peci_client *client);
+	int				(*remove)(struct peci_client *client);
+	void				(*shutdown)(struct peci_client *client);
+	struct device_driver		driver;
+	const struct peci_device_id	*id_table;
+};
+
+static inline struct peci_driver *to_peci_driver(void *d)
+{
+	return container_of(d, struct peci_driver, driver);
+}
+
+/**
+ * module_peci_driver() - Helper macro for registering a modular PECI driver
+ * @__peci_driver: peci_driver struct
+ *
+ * Helper macro for PECI drivers which do not do anything special in module
+ * init/exit. This eliminates a lot of boilerplate. Each module may only
+ * use this macro once, and calling it replaces module_init() and module_exit()
+ */
+#define module_peci_driver(__peci_driver) \
+	module_driver(__peci_driver, peci_add_driver, peci_del_driver)
+
+/* use a define to avoid include chaining to get THIS_MODULE */
+#define peci_add_driver(driver) peci_register_driver(THIS_MODULE, driver)
+
+int  peci_register_driver(struct module *owner, struct peci_driver *drv);
+void peci_del_driver(struct peci_driver *driver);
+int  peci_add_adapter(struct peci_adapter *adapter);
+void peci_del_adapter(struct peci_adapter *adapter);
+int  peci_command(struct peci_adapter *adapter, enum peci_cmd cmd, void *vmsg);
+
+#endif /* __LINUX_PECI_H */
diff --git a/include/uapi/linux/peci-ioctl.h b/include/uapi/linux/peci-ioctl.h
new file mode 100644
index 000000000000..ec73847b9400
--- /dev/null
+++ b/include/uapi/linux/peci-ioctl.h
@@ -0,0 +1,200 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018 Intel Corporation */
+
+#ifndef __PECI_IOCTL_H
+#define __PECI_IOCTL_H
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/* Base Address of 48d */
+#define PECI_BASE_ADDR  0x30  /* The PECI client's default address of 0x30 */
+#define PECI_OFFSET_MAX 8     /* Max number of CPU clients */
+
+/* PCI Access */
+#define MAX_PCI_READ_LEN 24   /* Number of bytes of the PCI Space read */
+
+#define PCI_BUS0_CPU0      0x00
+#define PCI_BUS0_CPU1      0x80
+#define PCI_CPUBUSNO_BUS   0x00
+#define PCI_CPUBUSNO_DEV   0x08
+#define PCI_CPUBUSNO_FUNC  0x02
+#define PCI_CPUBUSNO       0xcc
+#define PCI_CPUBUSNO_1     0xd0
+#define PCI_CPUBUSNO_VALID 0xd4
+
+/* Package Identifier Read Parameter Value */
+#define PKG_ID_CPU_ID               0x0000  /* CPUID Info */
+#define PKG_ID_PLATFORM_ID          0x0001  /* Platform ID */
+#define PKG_ID_UNCORE_ID            0x0002  /* Uncore Device ID */
+#define PKG_ID_MAX_THREAD_ID        0x0003  /* Max Thread ID */
+#define PKG_ID_MICROCODE_REV        0x0004  /* CPU Microcode Update Revision */
+#define PKG_ID_MACHINE_CHECK_STATUS 0x0005  /* Machine Check Status */
+
+/* RdPkgConfig Index */
+#define MBX_INDEX_CPU_ID            0   /* Package Identifier Read */
+#define MBX_INDEX_VR_DEBUG          1   /* VR Debug */
+#define MBX_INDEX_PKG_TEMP_READ     2   /* Package Temperature Read */
+#define MBX_INDEX_ENERGY_COUNTER    3   /* Energy counter */
+#define MBX_INDEX_ENERGY_STATUS     4   /* DDR Energy Status */
+#define MBX_INDEX_WAKE_MODE_BIT     5   /* "Wake on PECI" Mode bit */
+#define MBX_INDEX_EPI               6   /* Efficient Performance Indication */
+#define MBX_INDEX_PKG_RAPL_PERF     8   /* Pkg RAPL Performance Status Read */
+#define MBX_INDEX_PER_CORE_DTS_TEMP 9   /* Per Core DTS Temperature Read */
+#define MBX_INDEX_DTS_MARGIN        10  /* DTS thermal margin */
+#define MBX_INDEX_SKT_PWR_THRTL_DUR 11  /* Socket Power Throttled Duration */
+#define MBX_INDEX_CFG_TDP_CONTROL   12  /* TDP Config Control */
+#define MBX_INDEX_CFG_TDP_LEVELS    13  /* TDP Config Levels */
+#define MBX_INDEX_DDR_DIMM_TEMP     14  /* DDR DIMM Temperature */
+#define MBX_INDEX_CFG_ICCMAX        15  /* Configurable ICCMAX */
+#define MBX_INDEX_TEMP_TARGET       16  /* Temperature Target Read */
+#define MBX_INDEX_CURR_CFG_LIMIT    17  /* Current Config Limit */
+#define MBX_INDEX_DIMM_TEMP_READ    20  /* Package Thermal Status Read */
+#define MBX_INDEX_DRAM_IMC_TMP_READ 22  /* DRAM IMC Temperature Read */
+#define MBX_INDEX_DDR_CH_THERM_STAT 23  /* DDR Channel Thermal Status */
+#define MBX_INDEX_PKG_POWER_LIMIT1  26  /* Package Power Limit1 */
+#define MBX_INDEX_PKG_POWER_LIMIT2  27  /* Package Power Limit2 */
+#define MBX_INDEX_TDP               28  /* Thermal design power minimum */
+#define MBX_INDEX_TDP_HIGH          29  /* Thermal design power maximum */
+#define MBX_INDEX_TDP_UNITS         30  /* Units for power/energy registers */
+#define MBX_INDEX_RUN_TIME          31  /* Accumulated Run Time */
+#define MBX_INDEX_CONSTRAINED_TIME  32  /* Thermally Constrained Time Read */
+#define MBX_INDEX_TURBO_RATIO       33  /* Turbo Activation Ratio */
+#define MBX_INDEX_DDR_RAPL_PL1      34  /* DDR RAPL PL1 */
+#define MBX_INDEX_DDR_PWR_INFO_HIGH 35  /* DRAM Power Info Read (high) */
+#define MBX_INDEX_DDR_PWR_INFO_LOW  36  /* DRAM Power Info Read (low) */
+#define MBX_INDEX_DDR_RAPL_PL2      37  /* DDR RAPL PL2 */
+#define MBX_INDEX_DDR_RAPL_STATUS   38  /* DDR RAPL Performance Status */
+#define MBX_INDEX_DDR_HOT_ABSOLUTE  43  /* DDR Hottest Dimm Absolute Temp */
+#define MBX_INDEX_DDR_HOT_RELATIVE  44  /* DDR Hottest Dimm Relative Temp */
+#define MBX_INDEX_DDR_THROTTLE_TIME 45  /* DDR Throttle Time */
+#define MBX_INDEX_DDR_THERM_STATUS  46  /* DDR Thermal Status */
+#define MBX_INDEX_TIME_AVG_TEMP     47  /* Package time-averaged temperature */
+#define MBX_INDEX_TURBO_RATIO_LIMIT 49  /* Turbo Ratio Limit Read */
+#define MBX_INDEX_HWP_AUTO_OOB      53  /* HWP Autonomous Out-of-band */
+#define MBX_INDEX_DDR_WARM_BUDGET   55  /* DDR Warm Power Budget */
+#define MBX_INDEX_DDR_HOT_BUDGET    56  /* DDR Hot Power Budget */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM3 57  /* Package/Psys Power Limit3 */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM1 58  /* Package/Psys Power Limit1 */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM2 59  /* Package/Psys Power Limit2 */
+#define MBX_INDEX_PKG_PSYS_PWR_LIM4 60  /* Package/Psys Power Limit4 */
+#define MBX_INDEX_PERF_LIMIT_REASON 65  /* Performance Limit Reasons */
+
+/* WrPkgConfig Index */
+#define MBX_INDEX_DIMM_AMBIENT 19
+#define MBX_INDEX_DIMM_TEMP    24
+
+enum peci_cmd {
+	PECI_CMD_XFER = 0,
+	PECI_CMD_PING,
+	PECI_CMD_GET_DIB,
+	PECI_CMD_GET_TEMP,
+	PECI_CMD_RD_PKG_CFG,
+	PECI_CMD_WR_PKG_CFG,
+	PECI_CMD_RD_IA_MSR,
+	PECI_CMD_WR_IA_MSR,
+	PECI_CMD_RD_PCI_CFG,
+	PECI_CMD_WR_PCI_CFG,
+	PECI_CMD_RD_PCI_CFG_LOCAL,
+	PECI_CMD_WR_PCI_CFG_LOCAL,
+	PECI_CMD_MAX
+};
+
+struct peci_ping_msg {
+	__u8 addr;
+} __attribute__((__packed__));
+
+struct peci_get_dib_msg {
+	__u8  addr;
+	__u32 dib;
+} __attribute__((__packed__));
+
+struct peci_get_temp_msg {
+	__u8  addr;
+	__s16 temp_raw;
+} __attribute__((__packed__));
+
+struct peci_rd_pkg_cfg_msg {
+	__u8  addr;
+	__u8  index;
+	__u16 param;
+	__u8  rx_len;
+	__u8  pkg_config[4];
+} __attribute__((__packed__));
+
+struct peci_wr_pkg_cfg_msg {
+	__u8  addr;
+	__u8  index;
+	__u16 param;
+	__u8  tx_len;
+	__u32 value;
+} __attribute__((__packed__));
+
+struct peci_rd_ia_msr_msg {
+	__u8  addr;
+	__u8  thread_id;
+	__u16 address;
+	__u64 value;
+} __attribute__((__packed__));
+
+struct peci_rd_pci_cfg_msg {
+	__u8  addr;
+	__u8  bus;
+	__u8  device;
+	__u8  function;
+	__u16 reg;
+	__u8  pci_config[4];
+} __attribute__((__packed__));
+
+struct peci_rd_pci_cfg_local_msg {
+	__u8  addr;
+	__u8  bus;
+	__u8  device;
+	__u8  function;
+	__u16 reg;
+	__u8  rx_len;
+	__u8  pci_config[4];
+} __attribute__((__packed__));
+
+struct peci_wr_pci_cfg_local_msg {
+	__u8  addr;
+	__u8  bus;
+	__u8  device;
+	__u8  function;
+	__u16 reg;
+	__u8  tx_len;
+	__u32 value;
+} __attribute__((__packed__));
+
+#define PECI_IOC_BASE  0xb6
+
+#define PECI_IOC_PING \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_PING, struct peci_ping_msg)
+
+#define PECI_IOC_GET_DIB \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_GET_DIB, struct peci_get_dib_msg)
+
+#define PECI_IOC_GET_TEMP \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_GET_TEMP, struct peci_get_temp_msg)
+
+#define PECI_IOC_RD_PKG_CFG \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_PKG_CFG, struct peci_rd_pkg_cfg_msg)
+
+#define PECI_IOC_WR_PKG_CFG \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_WR_PKG_CFG, struct peci_wr_pkg_cfg_msg)
+
+#define PECI_IOC_RD_IA_MSR \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_IA_MSR, struct peci_rd_ia_msr_msg)
+
+#define PECI_IOC_RD_PCI_CFG \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_PCI_CFG, struct peci_rd_pci_cfg_msg)
+
+#define PECI_IOC_RD_PCI_CFG_LOCAL \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_RD_PCI_CFG_LOCAL, \
+	      struct peci_rd_pci_cfg_local_msg)
+
+#define PECI_IOC_WR_PCI_CFG_LOCAL \
+	_IOWR(PECI_IOC_BASE, PECI_CMD_WR_PCI_CFG_LOCAL, \
+	      struct peci_wr_pci_cfg_local_msg)
+
+#endif /* __PECI_IOCTL_H */
-- 
2.16.2
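
For readers who want to see how the GetTemp ioctl defined in the header above might be driven from userspace, here is a minimal sketch. It mirrors struct peci_get_temp_msg and PECI_IOC_GET_TEMP locally instead of including the new header, so that it builds standalone. The helper names peci_get_temp() and peci_temp_raw_to_millideg() are hypothetical, and the 1/64 degree C scaling is an assumption based on the usual PECI GetTemp encoding; this patch does not state it.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Local mirror of struct peci_get_temp_msg from the patch above. */
struct peci_get_temp_msg {
	uint8_t addr;
	int16_t temp_raw;
} __attribute__((__packed__));

/* Packed layout: one address byte followed by a 16-bit reading. */
static_assert(sizeof(struct peci_get_temp_msg) == 3,
	      "peci_get_temp_msg must be 3 bytes");

#define PECI_IOC_BASE     0xb6
#define PECI_CMD_GET_TEMP 3	/* position of PECI_CMD_GET_TEMP in enum peci_cmd */
#define PECI_IOC_GET_TEMP \
	_IOWR(PECI_IOC_BASE, PECI_CMD_GET_TEMP, struct peci_get_temp_msg)

/*
 * Hypothetical helper: issue GetTemp on an already opened PECI device.
 * Returns 0 on success and -1 if the ioctl fails (errno is left set).
 */
static int peci_get_temp(int fd, uint8_t addr, int16_t *temp_raw)
{
	struct peci_get_temp_msg msg = { .addr = addr };

	if (ioctl(fd, PECI_IOC_GET_TEMP, &msg) < 0)
		return -1;

	*temp_raw = msg.temp_raw;
	return 0;
}

/*
 * Assumption: the raw value is a signed offset in units of 1/64 degree C
 * relative to the TCC activation temperature. Integer division truncates
 * toward zero.
 */
static int peci_temp_raw_to_millideg(int16_t temp_raw)
{
	return (int)temp_raw * 1000 / 64;
}
```

A caller would pass a file descriptor for an open PECI character device and a client address (0x30 for the first CPU is an assumption here). The converted result is an offset from the TCC activation point, so it is normally zero or negative.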

* [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
                   ` (2 preceding siblings ...)
  2018-04-10 18:32 ` [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-11 11:52   ` Joel Stanley
  2018-04-16 18:10   ` Rob Herring
  2018-04-10 18:32 ` [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node Jae Hyun Yoo
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds a dt-bindings document for the PECI adapter driver on
Aspeed AST24xx/25xx SoCs.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 .../devicetree/bindings/peci/peci-aspeed.txt       | 60 ++++++++++++++++++++++
 1 file changed, 60 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt

diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
new file mode 100644
index 000000000000..4598bb8c20fa
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
@@ -0,0 +1,60 @@
+Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
+
+Required properties:
+- compatible        : Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
+		      - aspeed,ast2400-peci: Aspeed AST2400 family PECI
+					     controller
+		      - aspeed,ast2500-peci: Aspeed AST2500 family PECI
+					     controller
+- reg               : Should contain PECI controller registers location and
+		      length.
+- #address-cells    : Should be <1>.
+- #size-cells       : Should be <0>.
+- interrupts        : Should contain PECI controller interrupt.
+- clocks            : Should contain clock source for PECI controller.
+		      Should reference clkin.
+- clock-frequency   : Should contain the operating frequency of the PECI
+		      controller in units of Hz.
+		      187500 ~ 24000000
+
+Optional properties:
+- msg-timing-nego   : Message timing negotiation period. This value will
+		      determine the period of message timing negotiation to be
+		      issued by the PECI controller. The unit of the programmed
+		      value is four times the PECI clock period.
+		      0 ~ 255 (default: 1)
+- addr-timing-nego  : Address timing negotiation period. This value will
+		      determine the period of address timing negotiation to be
+		      issued by the PECI controller. The unit of the programmed
+		      value is four times the PECI clock period.
+		      0 ~ 255 (default: 1)
+- rd-sampling-point : Read sampling point selection. The whole period of a bit
+		      time is divided into 16 time frames. This value selects
+		      the time frame in which the controller samples the PECI
+		      signal for data read back. Sampling in the middle of a
+		      bit time usually works best.
+		      0 ~ 15 (default: 8)
+- cmd-timeout-ms    : Command timeout in units of ms.
+		      1 ~ 60000 (default: 1000)
+
+Example:
+	peci: peci@1e78b000 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
+		#size-cells = <1>;
+		ranges = <0x0 0x1e78b000 0x60>;
+
+		peci0: peci-bus@0 {
+			compatible = "aspeed,ast2500-peci";
+			reg = <0x0 0x60>;
+			#address-cells = <1>;
+			#size-cells = <0>;
+			interrupts = <15>;
+			clocks = <&clk_clkin>;
+			clock-frequency = <24000000>;
+			msg-timing-nego = <1>;
+			addr-timing-nego = <1>;
+			rd-sampling-point = <8>;
+			cmd-timeout-ms = <1000>;
+		};
+	};
-- 
2.16.2
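
The clock-frequency range in the binding above (187500 ~ 24000000) is consistent with a 24 MHz input clock divided by a power of two between 2^0 and 2^7. The sketch below works through that arithmetic. It assumes a 3-bit divider exponent and a 24 MHz clkin, which matches the example node but is not spelled out in the binding text; the function names are hypothetical.

```c
#include <stdint.h>

#define PECI_CLK_DIV_MAX 7	/* assumed 3-bit divider exponent field */

/*
 * Return floor(log2(clkin_hz / clk_freq_hz)) clamped to 0..7, or -1 when
 * the requested frequency is zero. Each loop iteration halves the divisor,
 * so the loop counts how many times clkin can be halved.
 */
static int peci_clk_div_val(uint32_t clkin_hz, uint32_t clk_freq_hz)
{
	uint32_t divisor;
	int div_val = 0;

	if (clk_freq_hz == 0)
		return -1;

	divisor = clkin_hz / clk_freq_hz;
	while ((divisor >>= 1) && div_val < PECI_CLK_DIV_MAX)
		div_val++;

	return div_val;
}

/* Resulting bus frequency for a given divider exponent. */
static uint32_t peci_bus_hz(uint32_t clkin_hz, int div_val)
{
	return clkin_hz >> div_val;
}
```

With clkin at 24 MHz, a requested clock-frequency of 24000000 gives exponent 0 and 187500 gives exponent 7, which are the two ends of the documented range.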

* [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
                   ` (3 preceding siblings ...)
  2018-04-10 18:32 ` [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-11 11:52   ` Joel Stanley
  2018-04-10 18:32 ` [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx Jae Hyun Yoo
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds the PECI bus/adapter node for AST24xx/AST25xx to
aspeed-g4.dtsi and aspeed-g5.dtsi.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 arch/arm/boot/dts/aspeed-g4.dtsi | 25 +++++++++++++++++++++++++
 arch/arm/boot/dts/aspeed-g5.dtsi | 25 +++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g4.dtsi b/arch/arm/boot/dts/aspeed-g4.dtsi
index 518d2bc7c7fc..f7992eee4d1f 100644
--- a/arch/arm/boot/dts/aspeed-g4.dtsi
+++ b/arch/arm/boot/dts/aspeed-g4.dtsi
@@ -29,6 +29,7 @@
 		serial3 = &uart4;
 		serial4 = &uart5;
 		serial5 = &vuart;
+		peci0 = &peci0;
 	};
 
 	cpus {
@@ -270,6 +271,13 @@
 				};
 			};
 
+			peci: peci@1e78b000 {
+				compatible = "simple-bus";
+				#address-cells = <1>;
+				#size-cells = <1>;
+				ranges = <0x0 0x1e78b000 0x60>;
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
@@ -313,6 +321,23 @@
 	};
 };
 
+&peci {
+	peci0: peci-bus@0 {
+		compatible = "aspeed,ast2400-peci";
+		reg = <0x0 0x60>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		interrupts = <15>;
+		clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+		clock-frequency = <24000000>;
+		msg-timing-nego = <1>;
+		addr-timing-nego = <1>;
+		rd-sampling-point = <8>;
+		cmd-timeout-ms = <1000>;
+		status = "disabled";
+	};
+};
+
 &i2c {
 	i2c_ic: interrupt-controller@0 {
 		#interrupt-cells = <1>;
diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index f9917717dd08..278791dba8a0 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -29,6 +29,7 @@
 		serial3 = &uart4;
 		serial4 = &uart5;
 		serial5 = &vuart;
+		peci0 = &peci0;
 	};
 
 	cpus {
@@ -320,6 +321,13 @@
 				};
 			};
 
+			peci: peci@1e78b000 {
+				compatible = "simple-bus";
+				#address-cells = <1>;
+				#size-cells = <1>;
+				ranges = <0x0 0x1e78b000 0x60>;
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
@@ -363,6 +371,23 @@
 	};
 };
 
+&peci {
+	peci0: peci-bus@0 {
+		compatible = "aspeed,ast2500-peci";
+		reg = <0x0 0x60>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		interrupts = <15>;
+		clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+		clock-frequency = <24000000>;
+		msg-timing-nego = <1>;
+		addr-timing-nego = <1>;
+		rd-sampling-point = <8>;
+		cmd-timeout-ms = <1000>;
+		status = "disabled";
+	};
+};
+
 &i2c {
 	i2c_ic: interrupt-controller@0 {
 		#interrupt-cells = <1>;
-- 
2.16.2
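
In the nodes added above, peci-bus@0 uses reg = <0x0 0x60> underneath a simple-bus parent whose ranges property is <0x0 0x1e78b000 0x60>. As a rough illustration of how such a single-entry ranges mapping resolves a child address to a parent address, here is a small sketch. The struct and function names are hypothetical and this is not the kernel's OF address translation code.

```c
#include <stdbool.h>
#include <stdint.h>

/* One <child-address parent-address size> entry with one cell per field. */
struct dt_range {
	uint32_t child_base;
	uint32_t parent_base;
	uint32_t size;
};

/* The ranges entry used by the peci node in this patch. */
static const struct dt_range peci_ranges = { 0x0, 0x1e78b000, 0x60 };

/*
 * Translate a child bus address into the parent address space.
 * Returns false when the address lies outside the mapped window.
 */
static bool dt_translate(const struct dt_range *r, uint32_t child_addr,
			 uint32_t *parent_addr)
{
	if (child_addr < r->child_base ||
	    child_addr - r->child_base >= r->size)
		return false;

	*parent_addr = r->parent_base + (child_addr - r->child_base);
	return true;
}
```

Under this mapping the controller's first register (child address 0x0) lands at 0x1e78b000, and any offset at or beyond 0x60 falls outside the window.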

* [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
                   ` (4 preceding siblings ...)
  2018-04-10 18:32 ` [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-11 11:51   ` Joel Stanley
  2018-04-17 13:37   ` Robin Murphy
  2018-04-10 18:32 ` [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers Jae Hyun Yoo
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds a PECI adapter driver implementation for Aspeed
AST24xx/AST25xx.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Biils <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 drivers/peci/Kconfig       |  28 +++
 drivers/peci/Makefile      |   3 +
 drivers/peci/peci-aspeed.c | 504 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 535 insertions(+)
 create mode 100644 drivers/peci/peci-aspeed.c

diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index 1fbc13f9e6c2..0e33420365de 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -14,4 +14,32 @@ config PECI
 	  processors and chipset components to external monitoring or control
 	  devices.
 
+	  If you want PECI support, you should say Y here and also to the
+	  specific driver for your bus adapter(s) below.
+
+if PECI
+
+#
+# PECI hardware bus configuration
+#
+
+menu "PECI Hardware Bus support"
+
+config PECI_ASPEED
+	tristate "Aspeed AST24xx/AST25xx PECI support"
+	select REGMAP_MMIO
+	depends on OF
+	depends on ARCH_ASPEED || COMPILE_TEST
+	help
+	  Say Y here if you want support for the Platform Environment Control
+	  Interface (PECI) bus adapter driver on the Aspeed AST24XX and AST25XX
+	  SoCs.
+
+	  This support is also available as a module.  If so, the module
+	  will be called peci-aspeed.
+
+endmenu
+
+endif # PECI
+
 endmenu
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 9e8615e0d3ff..886285e69765 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -4,3 +4,6 @@
 
 # Core functionality
 obj-$(CONFIG_PECI)		+= peci-core.o
+
+# Hardware specific bus drivers
+obj-$(CONFIG_PECI_ASPEED)	+= peci-aspeed.o
diff --git a/drivers/peci/peci-aspeed.c b/drivers/peci/peci-aspeed.c
new file mode 100644
index 000000000000..be2a1f327eb1
--- /dev/null
+++ b/drivers/peci/peci-aspeed.c
@@ -0,0 +1,504 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2012-2017 ASPEED Technology Inc.
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/peci.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+
+#define DUMP_DEBUG 0
+
+/* Aspeed PECI Registers */
+#define AST_PECI_CTRL     0x00
+#define AST_PECI_TIMING   0x04
+#define AST_PECI_CMD      0x08
+#define AST_PECI_CMD_CTRL 0x0c
+#define AST_PECI_EXP_FCS  0x10
+#define AST_PECI_CAP_FCS  0x14
+#define AST_PECI_INT_CTRL 0x18
+#define AST_PECI_INT_STS  0x1c
+#define AST_PECI_W_DATA0  0x20
+#define AST_PECI_W_DATA1  0x24
+#define AST_PECI_W_DATA2  0x28
+#define AST_PECI_W_DATA3  0x2c
+#define AST_PECI_R_DATA0  0x30
+#define AST_PECI_R_DATA1  0x34
+#define AST_PECI_R_DATA2  0x38
+#define AST_PECI_R_DATA3  0x3c
+#define AST_PECI_W_DATA4  0x40
+#define AST_PECI_W_DATA5  0x44
+#define AST_PECI_W_DATA6  0x48
+#define AST_PECI_W_DATA7  0x4c
+#define AST_PECI_R_DATA4  0x50
+#define AST_PECI_R_DATA5  0x54
+#define AST_PECI_R_DATA6  0x58
+#define AST_PECI_R_DATA7  0x5c
+
+/* AST_PECI_CTRL - 0x00 : Control Register */
+#define PECI_CTRL_SAMPLING_MASK     GENMASK(19, 16)
+#define PECI_CTRL_SAMPLING(x)       (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
+#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)
+#define PECI_CTRL_READ_MODE_MASK    GENMASK(13, 12)
+#define PECI_CTRL_READ_MODE(x)      (((x) << 12) & PECI_CTRL_READ_MODE_MASK)
+#define PECI_CTRL_READ_MODE_GET(x)  (((x) & PECI_CTRL_READ_MODE_MASK) >> 12)
+#define PECI_CTRL_READ_MODE_COUNT   BIT(12)
+#define PECI_CTRL_READ_MODE_DBG     BIT(13)
+#define PECI_CTRL_CLK_SOURCE_MASK   BIT(11)
+#define PECI_CTRL_CLK_SOURCE(x)     (((x) << 11) & PECI_CTRL_CLK_SOURCE_MASK)
+#define PECI_CTRL_CLK_SOURCE_GET(x) (((x) & PECI_CTRL_CLK_SOURCE_MASK) >> 11)
+#define PECI_CTRL_CLK_DIV_MASK      GENMASK(10, 8)
+#define PECI_CTRL_CLK_DIV(x)        (((x) << 8) & PECI_CTRL_CLK_DIV_MASK)
+#define PECI_CTRL_CLK_DIV_GET(x)    (((x) & PECI_CTRL_CLK_DIV_MASK) >> 8)
+#define PECI_CTRL_INVERT_OUT        BIT(7)
+#define PECI_CTRL_INVERT_IN         BIT(6)
+#define PECI_CTRL_BUS_CONTENT_EN    BIT(5)
+#define PECI_CTRL_PECI_EN           BIT(4)
+#define PECI_CTRL_PECI_CLK_EN       BIT(0)
+
+/* AST_PECI_TIMING - 0x04 : Timing Negotiation Register */
+#define PECI_TIMING_MESSAGE_MASK   GENMASK(15, 8)
+#define PECI_TIMING_MESSAGE(x)     (((x) << 8) & PECI_TIMING_MESSAGE_MASK)
+#define PECI_TIMING_MESSAGE_GET(x) (((x) & PECI_TIMING_MESSAGE_MASK) >> 8)
+#define PECI_TIMING_ADDRESS_MASK   GENMASK(7, 0)
+#define PECI_TIMING_ADDRESS(x)     ((x) & PECI_TIMING_ADDRESS_MASK)
+#define PECI_TIMING_ADDRESS_GET(x) ((x) & PECI_TIMING_ADDRESS_MASK)
+
+/* AST_PECI_CMD - 0x08 : Command Register */
+#define PECI_CMD_PIN_MON    BIT(31)
+#define PECI_CMD_STS_MASK   GENMASK(27, 24)
+#define PECI_CMD_STS_GET(x) (((x) & PECI_CMD_STS_MASK) >> 24)
+#define PECI_CMD_FIRE       BIT(0)
+
+/* AST_PECI_CMD_CTRL - 0x0c : Read/Write Length Register */
+#define PECI_AW_FCS_EN       BIT(31)
+#define PECI_READ_LEN_MASK   GENMASK(23, 16)
+#define PECI_READ_LEN(x)     (((x) << 16) & PECI_READ_LEN_MASK)
+#define PECI_WRITE_LEN_MASK  GENMASK(15, 8)
+#define PECI_WRITE_LEN(x)    (((x) << 8) & PECI_WRITE_LEN_MASK)
+#define PECI_TAGET_ADDR_MASK GENMASK(7, 0)
+#define PECI_TAGET_ADDR(x)   ((x) & PECI_TAGET_ADDR_MASK)
+
+/* AST_PECI_EXP_FCS - 0x10 : Expected FCS Data Register */
+#define PECI_EXPECT_READ_FCS_MASK      GENMASK(23, 16)
+#define PECI_EXPECT_READ_FCS_GET(x)    (((x) & PECI_EXPECT_READ_FCS_MASK) >> 16)
+#define PECI_EXPECT_AW_FCS_AUTO_MASK   GENMASK(15, 8)
+#define PECI_EXPECT_AW_FCS_AUTO_GET(x) (((x) & PECI_EXPECT_AW_FCS_AUTO_MASK) \
+					>> 8)
+#define PECI_EXPECT_WRITE_FCS_MASK     GENMASK(7, 0)
+#define PECI_EXPECT_WRITE_FCS_GET(x)   ((x) & PECI_EXPECT_WRITE_FCS_MASK)
+
+/* AST_PECI_CAP_FCS - 0x14 : Captured FCS Data Register */
+#define PECI_CAPTURE_READ_FCS_MASK    GENMASK(23, 16)
+#define PECI_CAPTURE_READ_FCS_GET(x)  (((x) & PECI_CAPTURE_READ_FCS_MASK) >> 16)
+#define PECI_CAPTURE_WRITE_FCS_MASK   GENMASK(7, 0)
+#define PECI_CAPTURE_WRITE_FCS_GET(x) ((x) & PECI_CAPTURE_WRITE_FCS_MASK)
+
+/* AST_PECI_INT_CTRL/STS - 0x18/0x1c : Interrupt Register */
+#define PECI_INT_TIMING_RESULT_MASK GENMASK(31, 30)
+#define PECI_INT_TIMEOUT            BIT(4)
+#define PECI_INT_CONNECT            BIT(3)
+#define PECI_INT_W_FCS_BAD          BIT(2)
+#define PECI_INT_W_FCS_ABORT        BIT(1)
+#define PECI_INT_CMD_DONE           BIT(0)
+
+struct aspeed_peci {
+	struct peci_adapter	adaper;
+	struct device		*dev;
+	struct regmap		*regmap;
+	int			irq;
+	struct completion	xfer_complete;
+	u32			status;
+	u32			cmd_timeout_ms;
+};
+
+#define PECI_INT_MASK  (PECI_INT_TIMEOUT | PECI_INT_CONNECT | \
+			PECI_INT_W_FCS_BAD | PECI_INT_W_FCS_ABORT | \
+			PECI_INT_CMD_DONE)
+
+#define PECI_IDLE_CHECK_TIMEOUT_MS      50
+#define PECI_IDLE_CHECK_INTERVAL_MS     10
+
+#define PECI_RD_SAMPLING_POINT_DEFAULT  8
+#define PECI_RD_SAMPLING_POINT_MAX      15
+#define PECI_CLK_DIV_DEFAULT            0
+#define PECI_CLK_DIV_MAX                7
+#define PECI_MSG_TIMING_NEGO_DEFAULT    1
+#define PECI_MSG_TIMING_NEGO_MAX        255
+#define PECI_ADDR_TIMING_NEGO_DEFAULT   1
+#define PECI_ADDR_TIMING_NEGO_MAX       255
+#define PECI_CMD_TIMEOUT_MS_DEFAULT     1000
+#define PECI_CMD_TIMEOUT_MS_MAX         60000
+
+static int aspeed_peci_xfer_native(struct aspeed_peci *priv,
+				   struct peci_xfer_msg *msg)
+{
+	long err, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
+	u32 peci_head, peci_state, rx_data, cmd_sts;
+	ktime_t start, end;
+	s64 elapsed_ms;
+	int i, rc = 0;
+	uint reg;
+
+	start = ktime_get();
+
+	/* Check command sts and bus idle state */
+	while (!regmap_read(priv->regmap, AST_PECI_CMD, &cmd_sts) &&
+	       (cmd_sts & (PECI_CMD_STS_MASK | PECI_CMD_PIN_MON))) {
+		end = ktime_get();
+		elapsed_ms = ktime_to_ms(ktime_sub(end, start));
+		if (elapsed_ms >= PECI_IDLE_CHECK_TIMEOUT_MS) {
+			dev_dbg(priv->dev, "Timeout waiting for idle state!\n");
+			return -ETIMEDOUT;
+		}
+
+		usleep_range(PECI_IDLE_CHECK_INTERVAL_MS * 1000,
+			     (PECI_IDLE_CHECK_INTERVAL_MS * 1000) + 1000);
+	}
+
+	reinit_completion(&priv->xfer_complete);
+
+	peci_head = PECI_TAGET_ADDR(msg->addr) |
+				    PECI_WRITE_LEN(msg->tx_len) |
+				    PECI_READ_LEN(msg->rx_len);
+
+	rc = regmap_write(priv->regmap, AST_PECI_CMD_CTRL, peci_head);
+	if (rc)
+		return rc;
+
+	for (i = 0; i < msg->tx_len; i += 4) {
+		reg = i < 16 ? AST_PECI_W_DATA0 + i % 16 :
+			       AST_PECI_W_DATA4 + i % 16;
+		rc = regmap_write(priv->regmap, reg,
+				  (msg->tx_buf[i + 3] << 24) |
+				  (msg->tx_buf[i + 2] << 16) |
+				  (msg->tx_buf[i + 1] << 8) |
+				  msg->tx_buf[i + 0]);
+		if (rc)
+			return rc;
+	}
+
+	dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
+#if DUMP_DEBUG
+	print_hex_dump(KERN_DEBUG, "TX : ", DUMP_PREFIX_NONE, 16, 1,
+		       msg->tx_buf, msg->tx_len, true);
+#endif
+
+	rc = regmap_write(priv->regmap, AST_PECI_CMD, PECI_CMD_FIRE);
+	if (rc)
+		return rc;
+
+	err = wait_for_completion_interruptible_timeout(&priv->xfer_complete,
+							timeout);
+
+	dev_dbg(priv->dev, "INT_STS : 0x%08x\n", priv->status);
+	if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
+		dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
+			PECI_CMD_STS_GET(peci_state));
+	else
+		dev_dbg(priv->dev, "PECI_STATE : read error\n");
+
+	rc = regmap_write(priv->regmap, AST_PECI_CMD, 0);
+	if (rc)
+		return rc;
+
+	if (err <= 0 || !(priv->status & PECI_INT_CMD_DONE)) {
+		if (err < 0) { /* -ERESTARTSYS */
+			return (int)err;
+		} else if (err == 0) {
+			dev_dbg(priv->dev, "Timeout waiting for a response!\n");
+			return -ETIMEDOUT;
+		}
+
+		dev_dbg(priv->dev, "No valid response!\n");
+		return -EIO;
+	}
+
+	for (i = 0; i < msg->rx_len; i++) {
+		u8 byte_offset = i % 4;
+
+		if (byte_offset == 0) {
+			reg = i < 16 ? AST_PECI_R_DATA0 + i % 16 :
+				       AST_PECI_R_DATA4 + i % 16;
+			rc = regmap_read(priv->regmap, reg, &rx_data);
+			if (rc)
+				return rc;
+		}
+
+		msg->rx_buf[i] = (u8)(rx_data >> (byte_offset << 3));
+	}
+
+#if DUMP_DEBUG
+	print_hex_dump(KERN_DEBUG, "RX : ", DUMP_PREFIX_NONE, 16, 1,
+		       msg->rx_buf, msg->rx_len, true);
+#endif
+	if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
+		dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
+			PECI_CMD_STS_GET(peci_state));
+	else
+		dev_dbg(priv->dev, "PECI_STATE : read error\n");
+	dev_dbg(priv->dev, "------------------------\n");
+
+	return rc;
+}
+
+static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
+{
+	struct aspeed_peci *priv = arg;
+	u32 status_ack = 0;
+
+	if (regmap_read(priv->regmap, AST_PECI_INT_STS, &priv->status))
+		return IRQ_NONE;
+
+	/* Note that multiple interrupt bits can be set at the same time */
+	if (priv->status & PECI_INT_TIMEOUT) {
+		dev_dbg(priv->dev, "PECI_INT_TIMEOUT\n");
+		status_ack |= PECI_INT_TIMEOUT;
+	}
+
+	if (priv->status & PECI_INT_CONNECT) {
+		dev_dbg(priv->dev, "PECI_INT_CONNECT\n");
+		status_ack |= PECI_INT_CONNECT;
+	}
+
+	if (priv->status & PECI_INT_W_FCS_BAD) {
+		dev_dbg(priv->dev, "PECI_INT_W_FCS_BAD\n");
+		status_ack |= PECI_INT_W_FCS_BAD;
+	}
+
+	if (priv->status & PECI_INT_W_FCS_ABORT) {
+		dev_dbg(priv->dev, "PECI_INT_W_FCS_ABORT\n");
+		status_ack |= PECI_INT_W_FCS_ABORT;
+	}
+
+	/*
+	 * Every command should end with the PECI_INT_CMD_DONE bit set, even
+	 * in an error case.
+	 */
+	if (priv->status & PECI_INT_CMD_DONE) {
+		dev_dbg(priv->dev, "PECI_INT_CMD_DONE\n");
+		status_ack |= PECI_INT_CMD_DONE;
+		complete(&priv->xfer_complete);
+	}
+
+	if (regmap_write(priv->regmap, AST_PECI_INT_STS, status_ack))
+		return IRQ_NONE;
+
+	return IRQ_HANDLED;
+}
+
+static int aspeed_peci_init_ctrl(struct aspeed_peci *priv)
+{
+	u32 msg_timing_nego = 0, addr_timing_nego = 0, rd_sampling_point = 0;
+	u32 clk_freq, clk_divisor, clk_div_val = 0;
+	struct clk *clkin;
+	int ret;
+
+	clkin = devm_clk_get(priv->dev, NULL);
+	if (IS_ERR(clkin)) {
+		dev_err(priv->dev, "Failed to get clk source.\n");
+		return PTR_ERR(clkin);
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "clock-frequency",
+				   &clk_freq);
+	if (ret < 0) {
+		dev_err(priv->dev,
+			"Could not read clock-frequency property.\n");
+		return ret;
+	}
+
+	clk_divisor = clk_get_rate(clkin) / clk_freq;
+	devm_clk_put(priv->dev, clkin);
+
+	while ((clk_divisor >>= 1) && (clk_div_val < PECI_CLK_DIV_MAX))
+		clk_div_val++;
+
+	ret = of_property_read_u32(priv->dev->of_node, "msg-timing-nego",
+				   &msg_timing_nego);
+	if (ret || msg_timing_nego > PECI_MSG_TIMING_NEGO_MAX) {
+		dev_warn(priv->dev,
+			 "Invalid msg-timing-nego : %u, Use default : %u\n",
+			 msg_timing_nego, PECI_MSG_TIMING_NEGO_DEFAULT);
+		msg_timing_nego = PECI_MSG_TIMING_NEGO_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "addr-timing-nego",
+				   &addr_timing_nego);
+	if (ret || addr_timing_nego > PECI_ADDR_TIMING_NEGO_MAX) {
+		dev_warn(priv->dev,
+			 "Invalid addr-timing-nego : %u, Use default : %u\n",
+			 addr_timing_nego, PECI_ADDR_TIMING_NEGO_DEFAULT);
+		addr_timing_nego = PECI_ADDR_TIMING_NEGO_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "rd-sampling-point",
+				   &rd_sampling_point);
+	if (ret || rd_sampling_point > PECI_RD_SAMPLING_POINT_MAX) {
+		dev_warn(priv->dev,
+			 "Invalid rd-sampling-point : %u. Use default : %u\n",
+			 rd_sampling_point,
+			 PECI_RD_SAMPLING_POINT_DEFAULT);
+		rd_sampling_point = PECI_RD_SAMPLING_POINT_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "cmd-timeout-ms",
+				   &priv->cmd_timeout_ms);
+	if (ret || priv->cmd_timeout_ms > PECI_CMD_TIMEOUT_MS_MAX ||
+	    priv->cmd_timeout_ms == 0) {
+		dev_warn(priv->dev,
+			 "Invalid cmd-timeout-ms : %u. Use default : %u\n",
+			 priv->cmd_timeout_ms,
+			 PECI_CMD_TIMEOUT_MS_DEFAULT);
+		priv->cmd_timeout_ms = PECI_CMD_TIMEOUT_MS_DEFAULT;
+	}
+
+	ret = regmap_write(priv->regmap, AST_PECI_CTRL,
+			   PECI_CTRL_CLK_DIV(PECI_CLK_DIV_DEFAULT) |
+			   PECI_CTRL_PECI_CLK_EN);
+	if (ret)
+		return ret;
+
+	usleep_range(1000, 5000);
+
+	/*
+	 * Timing negotiation period setting.
+	 * The unit of the programmed value is 4 times the PECI clock period.
+	 */
+	ret = regmap_write(priv->regmap, AST_PECI_TIMING,
+			   PECI_TIMING_MESSAGE(msg_timing_nego) |
+			   PECI_TIMING_ADDRESS(addr_timing_nego));
+	if (ret)
+		return ret;
+
+	/* Clear interrupts */
+	ret = regmap_write(priv->regmap, AST_PECI_INT_STS, PECI_INT_MASK);
+	if (ret)
+		return ret;
+
+	/* Enable interrupts */
+	ret = regmap_write(priv->regmap, AST_PECI_INT_CTRL, PECI_INT_MASK);
+	if (ret)
+		return ret;
+
+	/* Read sampling point and clock speed setting */
+	ret = regmap_write(priv->regmap, AST_PECI_CTRL,
+			   PECI_CTRL_SAMPLING(rd_sampling_point) |
+			   PECI_CTRL_CLK_DIV(clk_div_val) |
+			   PECI_CTRL_PECI_EN | PECI_CTRL_PECI_CLK_EN);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static const struct regmap_config aspeed_peci_regmap_config = {
+	.reg_bits = 32,
+	.val_bits = 32,
+	.reg_stride = 4,
+	.max_register = AST_PECI_R_DATA7,
+	.val_format_endian = REGMAP_ENDIAN_LITTLE,
+	.fast_io = true,
+};
+
+static int aspeed_peci_xfer(struct peci_adapter *adapter,
+			    struct peci_xfer_msg *msg)
+{
+	struct aspeed_peci *priv = peci_get_adapdata(adapter);
+
+	return aspeed_peci_xfer_native(priv, msg);
+}
+
+static int aspeed_peci_probe(struct platform_device *pdev)
+{
+	struct aspeed_peci *priv;
+	struct resource *res;
+	void __iomem *base;
+	int ret = 0;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(&pdev->dev, priv);
+	priv->dev = &pdev->dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(base))
+		return PTR_ERR(base);
+
+	priv->regmap = devm_regmap_init_mmio(&pdev->dev, base,
+					     &aspeed_peci_regmap_config);
+	if (IS_ERR(priv->regmap))
+		return PTR_ERR(priv->regmap);
+
+	priv->irq = platform_get_irq(pdev, 0);
+	if (priv->irq < 0)
+		return priv->irq;
+
+	ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
+			       IRQF_SHARED,
+			       "peci-aspeed-irq",
+			       priv);
+	if (ret < 0)
+		return ret;
+
+	init_completion(&priv->xfer_complete);
+
+	priv->adaper.dev.parent = priv->dev;
+	priv->adaper.dev.of_node = of_node_get(dev_of_node(priv->dev));
+	strlcpy(priv->adaper.name, pdev->name, sizeof(priv->adaper.name));
+	priv->adaper.xfer = aspeed_peci_xfer;
+	peci_set_adapdata(&priv->adaper, priv);
+
+	ret = aspeed_peci_init_ctrl(priv);
+	if (ret < 0)
+		return ret;
+
+	ret = peci_add_adapter(&priv->adaper);
+	if (ret < 0)
+		return ret;
+
+	dev_info(&pdev->dev, "peci bus %d registered, irq %d\n",
+		 priv->adaper.nr, priv->irq);
+
+	return 0;
+}
+
+static int aspeed_peci_remove(struct platform_device *pdev)
+{
+	struct aspeed_peci *priv = dev_get_drvdata(&pdev->dev);
+
+	peci_del_adapter(&priv->adaper);
+	of_node_put(priv->adaper.dev.of_node);
+
+	return 0;
+}
+
+static const struct of_device_id aspeed_peci_of_table[] = {
+	{ .compatible = "aspeed,ast2400-peci", },
+	{ .compatible = "aspeed,ast2500-peci", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
+
+static struct platform_driver aspeed_peci_driver = {
+	.probe  = aspeed_peci_probe,
+	.remove = aspeed_peci_remove,
+	.driver = {
+		.name           = "peci-aspeed",
+		.of_match_table = of_match_ptr(aspeed_peci_of_table),
+	},
+};
+module_platform_driver(aspeed_peci_driver);
+
+MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("Aspeed PECI driver");
+MODULE_LICENSE("GPL v2");
-- 
2.16.2

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
                   ` (5 preceding siblings ...)
  2018-04-10 18:32 ` [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-16 18:14   ` Rob Herring
  2018-04-10 18:32 ` [PATCH v3 08/10] Documentation: hwmon: " Jae Hyun Yoo
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds dt-bindings documents for PECI cputemp and dimmtemp client
drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 .../devicetree/bindings/hwmon/peci-cputemp.txt     | 24 +++++++++++++++++++++
 .../devicetree/bindings/hwmon/peci-dimmtemp.txt    | 25 ++++++++++++++++++++++
 2 files changed, 49 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt

diff --git a/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
new file mode 100644
index 000000000000..d5530ef9cfd2
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
@@ -0,0 +1,24 @@
+Bindings for Intel PECI (Platform Environment Control Interface) cputemp driver.
+
+Required properties:
+- compatible : Should be "intel,peci-cputemp".
+- reg        : Should contain the address of a client CPU. Per the PECI
+	       specification, CPU client addresses start at 0x30:
+	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
+
+Example:
+	peci-bus@0 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		< more properties >
+
+		peci-cputemp@30 {
+			compatible = "intel,peci-cputemp";
+			reg = <0x30>;
+		};
+
+		peci-cputemp@31 {
+			compatible = "intel,peci-cputemp";
+			reg = <0x31>;
+		};
+	};
diff --git a/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
new file mode 100644
index 000000000000..56e5deb61e5c
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
@@ -0,0 +1,25 @@
+Bindings for Intel PECI (Platform Environment Control Interface) dimmtemp
+driver.
+
+Required properties:
+- compatible : Should be "intel,peci-dimmtemp".
+- reg        : Should contain the address of a client CPU. Per the PECI
+	       specification, CPU client addresses start at 0x30:
+	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
+
+Example:
+	peci-bus@0 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		< more properties >
+
+		peci-dimmtemp@30 {
+			compatible = "intel,peci-dimmtemp";
+			reg = <0x30>;
+		};
+
+		peci-dimmtemp@31 {
+			compatible = "intel,peci-dimmtemp";
+			reg = <0x31>;
+		};
+	};
-- 
2.16.2
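
Reader's note on the "reg" property in the two bindings above: the CPU
client address maps directly to a CPU number. The following is a minimal
userspace sketch of that mapping, not code from this series. The helper
name peci_addr_to_cpu_no() and the PECI_BASE_ADDR macro are hypothetical,
and PECI_OFFSET_MAX is assumed to be 8 so that the valid range is
0x30 .. 0x37 as the bindings state.

```c
#include <stdint.h>

#define PECI_BASE_ADDR  0x30 /* first CPU client address (assumed name) */
#define PECI_OFFSET_MAX 8    /* assumed value, giving 0x30 .. 0x37 */

/*
 * Return the CPU number for a PECI client address,
 * or -1 if the address is outside the CPU client range.
 */
static int peci_addr_to_cpu_no(uint8_t addr)
{
	if (addr < PECI_BASE_ADDR ||
	    addr >= PECI_BASE_ADDR + PECI_OFFSET_MAX)
		return -1;

	return addr - PECI_BASE_ADDR;
}
```

With these assumptions, reg = <0x30> corresponds to CPU 0 and
reg = <0x31> to CPU 1, matching the two example nodes.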

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 08/10] Documentation: hwmon: Add documents for PECI hwmon client drivers
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
                   ` (6 preceding siblings ...)
  2018-04-10 18:32 ` [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-10 18:32 ` [PATCH v3 09/10] drivers/hwmon: Add " Jae Hyun Yoo
  2018-04-10 18:32 ` [PATCH v3 10/10] Add a maintainer for the PECI subsystem Jae Hyun Yoo
  9 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds hwmon documents for PECI cputemp and dimmtemp drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 Documentation/hwmon/peci-cputemp  | 88 +++++++++++++++++++++++++++++++++++++++
 Documentation/hwmon/peci-dimmtemp | 50 ++++++++++++++++++++++
 2 files changed, 138 insertions(+)
 create mode 100644 Documentation/hwmon/peci-cputemp
 create mode 100644 Documentation/hwmon/peci-dimmtemp

diff --git a/Documentation/hwmon/peci-cputemp b/Documentation/hwmon/peci-cputemp
new file mode 100644
index 000000000000..cdd5ea49a4a2
--- /dev/null
+++ b/Documentation/hwmon/peci-cputemp
@@ -0,0 +1,88 @@
+Kernel driver peci-cputemp
+==========================
+
+Supported chips:
+	One of the Intel server CPUs listed below, connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+	Addresses scanned: PECI client address 0x30 - 0x37
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author:
+	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides Digital
+Thermal Sensor (DTS) thermal readings of the CPU package and CPU cores that are
+accessible using the PECI Client Command Suite via the processor PECI client.
+
+All temperature values are given in millidegrees Celsius and are measurable
+only while the target CPU is powered on.
+
+sysfs attributes
+----------------
+
+temp1_label		"Die"
+temp1_input		Provides current die temperature of the CPU package.
+temp1_max		Provides thermal control temperature of the CPU package
+			which is also known as Tcontrol.
+temp1_crit		Provides shutdown temperature of the CPU package which
+			is also known as the maximum processor junction
+			temperature, Tjmax or Tprochot.
+temp1_crit_hyst		Provides the hysteresis value from Tcontrol to Tjmax of
+			the CPU package.
+
+temp2_label		"DTS margin"
+temp2_input		Provides current DTS thermal margin to Tcontrol of the
+			CPU package. A value of 0 means the die temperature has
+			reached Tcontrol. A negative value means the die
+			temperature has passed Tcontrol toward Tjmax.
+temp2_min		Provides the minimum DTS thermal margin to Tcontrol of
+			the CPU package.
+temp2_lcrit		Provides the margin value at which the CPU package
+			temperature reaches Tjmax.
+
+temp3_label		"Tcontrol"
+temp3_input		Provides current Tcontrol temperature of the CPU
+			package which is also known as Fan Temperature target.
+			Indicates the relative value from thermal monitor trip
+			temperature at which fans should be engaged.
+temp3_crit		Provides Tcontrol critical value of the CPU package
+			which is the same as Tjmax.
+
+temp4_label		"Tthrottle"
+temp4_input		Provides current Tthrottle temperature of the CPU
+			package, which is used as the throttling temperature.
+			If this value is allowed and lower than Tjmax, the
+			throttle occurs and is reported below Tjmax.
+
+temp5_label		"Tjmax"
+temp5_input		Provides the maximum junction temperature, Tjmax of the
+			CPU package.
+
+temp[6-*]_label		Provides string "Core X", where X is resolved core
+			number.
+temp[6-*]_input		Provides current temperature of each core.
+temp[6-*]_max		Provides thermal control temperature of the core.
+temp[6-*]_crit		Provides shutdown temperature of the core.
+temp[6-*]_crit_hyst	Provides the hysteresis value from Tcontrol to Tjmax of
+			the core.
diff --git a/Documentation/hwmon/peci-dimmtemp b/Documentation/hwmon/peci-dimmtemp
new file mode 100644
index 000000000000..c54f2526188c
--- /dev/null
+++ b/Documentation/hwmon/peci-dimmtemp
@@ -0,0 +1,50 @@
+Kernel driver peci-dimmtemp
+===========================
+
+Supported chips:
+	One of the Intel server CPUs listed below, connected to a PECI bus.
+		* Intel Xeon E5/E7 v3 server processors
+			Intel Xeon E5-14xx v3 family
+			Intel Xeon E5-24xx v3 family
+			Intel Xeon E5-16xx v3 family
+			Intel Xeon E5-26xx v3 family
+			Intel Xeon E5-46xx v3 family
+			Intel Xeon E7-48xx v3 family
+			Intel Xeon E7-88xx v3 family
+		* Intel Xeon E5/E7 v4 server processors
+			Intel Xeon E5-16xx v4 family
+			Intel Xeon E5-26xx v4 family
+			Intel Xeon E5-46xx v4 family
+			Intel Xeon E7-48xx v4 family
+			Intel Xeon E7-88xx v4 family
+		* Intel Xeon Scalable server processors
+			Intel Xeon Bronze family
+			Intel Xeon Silver family
+			Intel Xeon Gold family
+			Intel Xeon Platinum family
+	Addresses scanned: PECI client address 0x30 - 0x37
+	Datasheet: Available from http://www.intel.com/design/literature.htm
+
+Author:
+	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+
+Description
+-----------
+
+This driver implements a generic PECI hwmon feature which provides Digital
+Thermal Sensor (DTS) thermal readings of DIMM components that are accessible
+using the PECI Client Command Suite via the processor PECI client.
+
+All temperature values are given in millidegrees Celsius and are measurable
+only while the target CPU is powered on.
+
+sysfs attributes
+----------------
+
+temp[N]_label		Provides string "DIMM CI", where C is DIMM channel and
+			I is DIMM index of the populated DIMM.
+temp[N]_input		Provides current temperature of the populated DIMM.
+
+Note:
+	DIMM temperature attributes will appear when the client CPU's BIOS
+	completes memory training and testing.
-- 
2.16.2
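
Reader's note on the peci-cputemp attribute table above: every reported
temperature is derived from Tjmax plus or minus an offset. The following
is a minimal userspace sketch of those relationships, not the driver
itself. The struct and function names are hypothetical, and it assumes
the raw GetTemp() value is a signed 16-bit offset from Tjmax in 1/64
degree steps, as the cputemp driver in this series treats it.

```c
#include <stdint.h>

/* All values are in millidegrees Celsius. */
struct cpu_thermal {
	int32_t tjmax;           /* maximum junction temperature */
	int32_t tcontrol_margin; /* distance of Tcontrol below Tjmax */
};

/* temp1_input: Tjmax plus the (normally negative) raw offset. */
static int32_t die_temp(const struct cpu_thermal *t, int16_t temp_raw)
{
	return t->tjmax + (int32_t)temp_raw * 1000 / 64;
}

/* temp1_max and temp3_input: Tcontrol as an absolute temperature. */
static int32_t tcontrol(const struct cpu_thermal *t)
{
	return t->tjmax - t->tcontrol_margin;
}

/* temp1_crit_hyst: span from Tcontrol up to Tjmax. */
static int32_t crit_hyst(const struct cpu_thermal *t)
{
	return t->tjmax - tcontrol(t);
}

/* temp2_lcrit: DTS margin value at which the die reaches Tjmax. */
static int32_t dts_margin_lcrit(const struct cpu_thermal *t)
{
	return tcontrol(t) - t->tjmax;
}
```

For example, with Tjmax at 100000 and a Tcontrol margin of 10000, a raw
reading of -640 gives a die temperature of 90000, which equals Tcontrol,
and temp2_lcrit comes out as -10000.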

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
                   ` (7 preceding siblings ...)
  2018-04-10 18:32 ` [PATCH v3 08/10] Documentation: hwmon: " Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  2018-04-10 22:28   ` Guenter Roeck
  2018-04-24 15:56   ` Andy Shevchenko
  2018-04-10 18:32 ` [PATCH v3 10/10] Add a maintainer for the PECI subsystem Jae Hyun Yoo
  9 siblings, 2 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds PECI cputemp and dimmtemp hwmon drivers.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
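
Reader's note on the 10.6 fixed-point handling in peci-cputemp.c: DTS
readings arrive as 16-bit two's-complement values with 6 fractional bits,
and raw values 0x8000 .. 0x81ff are error codes rather than temperatures.
The following is a minimal userspace sketch of that decoding. It mirrors
the arithmetic of ten_dot_six_to_millidegree() in the patch, but the
dts_raw_is_error() helper and the two DTS_ERR_* macros are hypothetical
names introduced here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raw values in this range are error codes, not temperatures. */
#define DTS_ERR_FIRST 0x8000
#define DTS_ERR_LAST  0x81ff

/* Return true if the raw 16-bit reading is one of the error codes. */
static bool dts_raw_is_error(uint16_t raw)
{
	return raw >= DTS_ERR_FIRST && raw <= DTS_ERR_LAST;
}

/*
 * Convert a 16-bit two's-complement 10.6 fixed-point reading to
 * millidegrees Celsius. The XOR/subtract pair sign-extends bit 15,
 * and dividing by 64 removes the 6 fractional bits.
 */
static int32_t ten_dot_six_to_millidegree(uint16_t raw)
{
	int32_t val = ((int32_t)raw ^ 0x8000) - 0x8000;

	return val * 1000 / 64;
}
```

The error check has to run before the conversion, because the error
codes would otherwise decode as large negative temperatures.
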
 drivers/hwmon/Kconfig         |  28 ++
 drivers/hwmon/Makefile        |   2 +
 drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
 drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
 4 files changed, 1245 insertions(+)
 create mode 100644 drivers/hwmon/peci-cputemp.c
 create mode 100644 drivers/hwmon/peci-dimmtemp.c
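
Reader's note on core channel numbering in peci-cputemp.c: hwmon channels
after the five fixed ones are assigned only to cores that are present, so
channel N maps to the position of the N-th set bit in the resolved-cores
mask. The following is a minimal userspace sketch of that lookup, written
along the lines of find_core_index() in the patch. The function name is
hypothetical, and unlike the driver it returns -1 when no matching core
exists, which is a choice made for this sketch only.

```c
#include <stdint.h>

/* Die, DTS margin, Tcontrol, Tthrottle and Tjmax come first. */
#define DEFAULT_CHANNEL_NUMS 5

/*
 * Return the core number behind hwmon channel 'channel': the bit
 * position of the n-th set bit in core_mask, where
 * n = channel - DEFAULT_CHANNEL_NUMS. Returns -1 if there is no
 * such core within the first core_max bits.
 */
static int core_index_for_channel(uint32_t core_mask, int core_max,
				  int channel)
{
	int wanted = channel - DEFAULT_CHANNEL_NUMS;
	int found = 0;
	int idx;

	for (idx = 0; idx < core_max && idx < 32; idx++) {
		if (!(core_mask & (UINT32_C(1) << idx)))
			continue;

		if (found == wanted)
			return idx;

		found++;
	}

	return -1;
}
```

With a mask of 0xb (cores 0, 1 and 3 present), channels 5, 6 and 7 map
to cores 0, 1 and 3, so the third core channel is labeled "Core 3".
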

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index f249a4428458..c52f610f81d0 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
 	  This driver can also be built as a module.  If so, the module
 	  will be called nct7904.
 
+config SENSORS_PECI_CPUTEMP
+	tristate "PECI CPU temperature monitoring support"
+	depends on OF
+	depends on PECI
+	help
+	  If you say yes here you get support for the generic Intel PECI
+	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
+	  readings of the CPU package and CPU cores that are accessible using
+	  the PECI Client Command Suite via the processor PECI client.
+	  Check Documentation/hwmon/peci-cputemp for details.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called peci-cputemp.
+
+config SENSORS_PECI_DIMMTEMP
+	tristate "PECI DIMM temperature monitoring support"
+	depends on OF
+	depends on PECI
+	help
+	  If you say yes here you get support for the generic Intel PECI
+	  dimmtemp driver which provides Digital Thermal Sensor (DTS) thermal
+	  readings of DIMM components that are accessible using the PECI Client
+	  Command Suite via the processor PECI client.
+	  Check Documentation/hwmon/peci-dimmtemp for details.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called peci-dimmtemp.
+
 config SENSORS_NSA320
 	tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
 	depends on GPIOLIB && OF
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index e7d52a36e6c4..48d9598fcd3a 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)	+= nct7802.o
 obj-$(CONFIG_SENSORS_NCT7904)	+= nct7904.o
 obj-$(CONFIG_SENSORS_NSA320)	+= nsa320-hwmon.o
 obj-$(CONFIG_SENSORS_NTC_THERMISTOR)	+= ntc_thermistor.o
+obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
+obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
 obj-$(CONFIG_SENSORS_PC87360)	+= pc87360.o
 obj-$(CONFIG_SENSORS_PC87427)	+= pc87427.o
 obj-$(CONFIG_SENSORS_PCF8591)	+= pcf8591.o
diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
new file mode 100644
index 000000000000..f0bc92687512
--- /dev/null
+++ b/drivers/hwmon/peci-cputemp.c
@@ -0,0 +1,783 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/delay.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/peci.h>
+
+#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
+
+#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
+#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
+#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
+
+#define DEFAULT_CHANNEL_NUMS  5
+#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
+#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
+
+#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
+
+#define UPDATE_INTERVAL_MIN   HZ
+
+enum cpu_gens {
+	CPU_GEN_HSX, /* Haswell Xeon */
+	CPU_GEN_BRX, /* Broadwell Xeon */
+	CPU_GEN_SKX, /* Skylake Xeon */
+	CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+	u32 type;
+	u32 cpu_id;
+	u32 core_max;
+};
+
+struct temp_data {
+	bool valid;
+	s32  value;
+	unsigned long last_updated;
+};
+
+struct temp_group {
+	struct temp_data die;
+	struct temp_data dts_margin;
+	struct temp_data tcontrol;
+	struct temp_data tthrottle;
+	struct temp_data tjmax;
+	struct temp_data core[CORETEMP_CHANNEL_NUMS];
+};
+
+struct peci_cputemp {
+	struct peci_client *client;
+	struct device *dev;
+	char name[PECI_NAME_SIZE];
+	struct temp_group temp;
+	u8 addr;
+	uint cpu_no;
+	const struct cpu_gen_info *gen_info;
+	u32 core_mask;
+	u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
+	uint config_idx;
+	struct hwmon_channel_info temp_info;
+	const struct hwmon_channel_info *info[2];
+	struct hwmon_chip_info chip;
+};
+
+enum cputemp_channels {
+	channel_die,
+	channel_dts_mrgn,
+	channel_tcontrol,
+	channel_tthrottle,
+	channel_tjmax,
+	channel_core,
+};
+
+static const struct cpu_gen_info cpu_gen_info_table[] = {
+	{ .type = CPU_GEN_HSX,
+	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
+	  .core_max = CORE_MAX_ON_HSX },
+	{ .type = CPU_GEN_BRX,
+	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
+	  .core_max = CORE_MAX_ON_BDX },
+	{ .type = CPU_GEN_SKX,
+	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
+	  .core_max = CORE_MAX_ON_SKX },
+};
+
+static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
+	/* Die temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
+	HWMON_T_CRIT_HYST,
+
+	/* DTS margin temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
+
+	/* Tcontrol temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
+
+	/* Tthrottle temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+
+	/* Tjmax temperature */
+	HWMON_T_LABEL | HWMON_T_INPUT,
+
+	/* Core temperature - for all core channels */
+	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
+	HWMON_T_CRIT_HYST,
+};
+
+static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
+	"Die",
+	"DTS margin",
+	"Tcontrol",
+	"Tthrottle",
+	"Tjmax",
+	"Core 0", "Core 1", "Core 2", "Core 3", "Core 4",
+	"Core 5", "Core 6", "Core 7", "Core 8", "Core 9",
+	"Core 10", "Core 11", "Core 12", "Core 13", "Core 14",
+	"Core 15", "Core 16", "Core 17", "Core 18", "Core 19",
+	"Core 20", "Core 21", "Core 22", "Core 23",
+	"Core 24", "Core 25", "Core 26", "Core 27",
+};
+
+static int send_peci_cmd(struct peci_cputemp *priv,
+			 enum peci_cmd cmd,
+			 void *msg)
+{
+	return peci_command(priv->client->adapter, cmd, msg);
+}
+
+static int need_update(struct temp_data *temp)
+{
+	if (temp->valid &&
+	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
+		return 0;
+
+	return 1;
+}
+
+static void mark_updated(struct temp_data *temp)
+{
+	temp->valid = true;
+	temp->last_updated = jiffies;
+}
+
+static s32 ten_dot_six_to_millidegree(s32 val)
+{
+	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
+}
+
+static int get_tjmax(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	int rc;
+
+	if (!priv->temp.tjmax.valid) {
+		msg.addr = priv->addr;
+		msg.index = MBX_INDEX_TEMP_TARGET;
+		msg.param = 0;
+		msg.rx_len = 4;
+
+		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+		if (rc)
+			return rc;
+
+		priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
+		priv->temp.tjmax.valid = true;
+	}
+
+	return 0;
+}
+
+static int get_tcontrol(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 tcontrol_margin;
+	s32 tthrottle_offset;
+	int rc;
+
+	if (!need_update(&priv->temp.tcontrol))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_TEMP_TARGET;
+	msg.param = 0;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	tcontrol_margin = msg.pkg_config[1];
+	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
+	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
+
+	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
+	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
+
+	mark_updated(&priv->temp.tcontrol);
+	mark_updated(&priv->temp.tthrottle);
+
+	return 0;
+}
+
+static int get_tthrottle(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 tcontrol_margin;
+	s32 tthrottle_offset;
+	int rc;
+
+	if (!need_update(&priv->temp.tthrottle))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_TEMP_TARGET;
+	msg.param = 0;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
+	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
+
+	tcontrol_margin = msg.pkg_config[1];
+	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
+	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
+
+	mark_updated(&priv->temp.tthrottle);
+	mark_updated(&priv->temp.tcontrol);
+
+	return 0;
+}
+
+static int get_die_temp(struct peci_cputemp *priv)
+{
+	struct peci_get_temp_msg msg;
+	int rc;
+
+	if (!need_update(&priv->temp.die))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+
+	rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
+	if (rc)
+		return rc;
+
+	priv->temp.die.value = priv->temp.tjmax.value +
+			       ((s32)msg.temp_raw * 1000 / 64);
+
+	mark_updated(&priv->temp.die);
+
+	return 0;
+}
+
+static int get_dts_margin(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 dts_margin;
+	int rc;
+
+	if (!need_update(&priv->temp.dts_margin))
+		return 0;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_DTS_MARGIN;
+	msg.param = 0;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
+
+	/*
+	 * Processors return the DTS reading in 10.6 fixed-point format
+	 * (10-bit signed integer part, 6-bit fractional part).
+	 * Error codes:
+	 *   0x8000: General sensor error
+	 *   0x8001: Reserved
+	 *   0x8002: Underflow on reading value
+	 *   0x8003-0x81ff: Reserved
+	 */
+	if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
+		return -EIO;
+
+	dts_margin = ten_dot_six_to_millidegree(dts_margin);
+
+	priv->temp.dts_margin.value = dts_margin;
+
+	mark_updated(&priv->temp.dts_margin);
+
+	return 0;
+}
+
+static int get_core_temp(struct peci_cputemp *priv, int core_index)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	s32 core_dts_margin;
+	int rc;
+
+	if (!need_update(&priv->temp.core[core_index]))
+		return 0;
+
+	rc = get_tjmax(priv);
+	if (rc)
+		return rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
+	msg.param = core_index;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
+
+	/*
+	 * Processors return the core DTS reading in 10.6 fixed-point format
+	 * (10-bit signed integer part, 6-bit fractional part).
+	 * Error codes:
+	 *   0x8000: General sensor error
+	 *   0x8001: Reserved
+	 *   0x8002: Underflow on reading value
+	 *   0x8003-0x81ff: Reserved
+	 */
+	if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
+		return -EIO;
+
+	core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
+
+	priv->temp.core[core_index].value = priv->temp.tjmax.value +
+					    core_dts_margin;
+
+	mark_updated(&priv->temp.core[core_index]);
+
+	return 0;
+}
+
+static int find_core_index(struct peci_cputemp *priv, int channel)
+{
+	int core_channel = channel - DEFAULT_CHANNEL_NUMS;
+	int idx, found = 0;
+
+	for (idx = 0; idx < priv->gen_info->core_max; idx++) {
+		if (priv->core_mask & BIT(idx)) {
+			if (core_channel == found)
+				break;
+
+			found++;
+		}
+	}
+
+	return idx;
+}
+
+static int cputemp_read_string(struct device *dev,
+			       enum hwmon_sensor_types type,
+			       u32 attr, int channel, const char **str)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int core_index;
+
+	switch (attr) {
+	case hwmon_temp_label:
+		if (channel < DEFAULT_CHANNEL_NUMS) {
+			*str = cputemp_label[channel];
+		} else {
+			core_index = find_core_index(priv, channel);
+			*str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
+		}
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_die(struct device *dev,
+			    enum hwmon_sensor_types type,
+			    u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_die_temp(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.die.value;
+		return 0;
+	case hwmon_temp_max:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value;
+		return 0;
+	case hwmon_temp_crit:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	case hwmon_temp_crit_hyst:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_dts_margin(struct device *dev,
+				   enum hwmon_sensor_types type,
+				   u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_dts_margin(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.dts_margin.value;
+		return 0;
+	case hwmon_temp_min:
+		*val = 0;
+		return 0;
+	case hwmon_temp_lcrit:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value - priv->temp.tjmax.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_tcontrol(struct device *dev,
+				 enum hwmon_sensor_types type,
+				 u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value;
+		return 0;
+	case hwmon_temp_crit:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_tthrottle(struct device *dev,
+				  enum hwmon_sensor_types type,
+				  u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_tthrottle(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tthrottle.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_tjmax(struct device *dev,
+			      enum hwmon_sensor_types type,
+			      u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read_core(struct device *dev,
+			     enum hwmon_sensor_types type,
+			     u32 attr, int channel, long *val)
+{
+	struct peci_cputemp *priv = dev_get_drvdata(dev);
+	int core_index = find_core_index(priv, channel);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_core_temp(priv, core_index);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.core[core_index].value;
+		return 0;
+	case hwmon_temp_max:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tcontrol.value;
+		return 0;
+	case hwmon_temp_crit:
+		rc = get_tjmax(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value;
+		return 0;
+	case hwmon_temp_crit_hyst:
+		rc = get_tcontrol(priv);
+		if (rc)
+			return rc;
+
+		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int cputemp_read(struct device *dev,
+			enum hwmon_sensor_types type,
+			u32 attr, int channel, long *val)
+{
+	switch (channel) {
+	case channel_die:
+		return cputemp_read_die(dev, type, attr, channel, val);
+	case channel_dts_mrgn:
+		return cputemp_read_dts_margin(dev, type, attr, channel, val);
+	case channel_tcontrol:
+		return cputemp_read_tcontrol(dev, type, attr, channel, val);
+	case channel_tthrottle:
+		return cputemp_read_tthrottle(dev, type, attr, channel, val);
+	case channel_tjmax:
+		return cputemp_read_tjmax(dev, type, attr, channel, val);
+	default:
+		if (channel < CPUTEMP_CHANNEL_NUMS)
+			return cputemp_read_core(dev, type, attr, channel, val);
+
+		return -EOPNOTSUPP;
+	}
+}
+
+static umode_t cputemp_is_visible(const void *data,
+				  enum hwmon_sensor_types type,
+				  u32 attr, int channel)
+{
+	const struct peci_cputemp *priv = data;
+
+	if (priv->temp_config[channel] & BIT(attr))
+		return 0444;
+
+	return 0;
+}
+
+static const struct hwmon_ops cputemp_ops = {
+	.is_visible = cputemp_is_visible,
+	.read_string = cputemp_read_string,
+	.read = cputemp_read,
+};
+
+static int check_resolved_cores(struct peci_cputemp *priv)
+{
+	struct peci_rd_pci_cfg_local_msg msg;
+	int rc;
+
+	if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
+		return -EINVAL;
+
+	/* Get the RESOLVED_CORES register value */
+	msg.addr = priv->addr;
+	msg.bus = 1;
+	msg.device = 30;
+	msg.function = 3;
+	msg.reg = 0xB4;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
+	if (rc)
+		return rc;
+
+	priv->core_mask = msg.pci_config[3] << 24 |
+			  msg.pci_config[2] << 16 |
+			  msg.pci_config[1] << 8 |
+			  msg.pci_config[0];
+
+	if (!priv->core_mask)
+		return -EAGAIN;
+
+	dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
+	return 0;
+}
+
+static int create_core_temp_info(struct peci_cputemp *priv)
+{
+	int rc, i;
+
+	rc = check_resolved_cores(priv);
+	if (!rc) {
+		for (i = 0; i < priv->gen_info->core_max; i++) {
+			if (priv->core_mask & BIT(i)) {
+				priv->temp_config[priv->config_idx++] =
+						     config_table[channel_core];
+			}
+		}
+	}
+
+	return rc;
+}
+
+static int check_cpu_id(struct peci_cputemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	u32 cpu_id;
+	int i, rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_CPU_ID;
+	msg.param = PKG_ID_CPU_ID;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
+		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
+
+	for (i = 0; i < CPU_GEN_MAX; i++) {
+		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
+			priv->gen_info = &cpu_gen_info_table[i];
+			break;
+		}
+	}
+
+	if (!priv->gen_info)
+		return -ENODEV;
+
+	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
+	return 0;
+}
+
+static int peci_cputemp_probe(struct peci_client *client)
+{
+	struct device *dev = &client->dev;
+	struct peci_cputemp *priv;
+	struct device *hwmon_dev;
+	int rc;
+
+	if ((client->adapter->cmd_mask &
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
+		dev_err(dev, "Client doesn't support temperature monitoring\n");
+		return -EINVAL;
+	}
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, priv);
+	priv->client = client;
+	priv->dev = dev;
+	priv->addr = client->addr;
+	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
+
+	snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
+		 priv->cpu_no);
+
+	rc = check_cpu_id(priv);
+	if (rc) {
+		dev_err(dev, "Client CPU is not supported\n");
+		return rc;
+	}
+
+	priv->temp_config[priv->config_idx++] = config_table[channel_die];
+	priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
+	priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
+	priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
+	priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
+
+	rc = create_core_temp_info(priv);
+	if (rc)
+		dev_dbg(dev, "Failed to create core temp info\n");
+
+	priv->chip.ops = &cputemp_ops;
+	priv->chip.info = priv->info;
+
+	priv->info[0] = &priv->temp_info;
+
+	priv->temp_info.type = hwmon_temp;
+	priv->temp_info.config = priv->temp_config;
+
+	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
+							 priv->name,
+							 priv,
+							 &priv->chip,
+							 NULL);
+
+	if (IS_ERR(hwmon_dev))
+		return PTR_ERR(hwmon_dev);
+
+	dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
+
+	return 0;
+}
+
+static const struct of_device_id peci_cputemp_of_table[] = {
+	{ .compatible = "intel,peci-cputemp" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
+
+static struct peci_driver peci_cputemp_driver = {
+	.probe  = peci_cputemp_probe,
+	.driver = {
+		.name           = "peci-cputemp",
+		.of_match_table = of_match_ptr(peci_cputemp_of_table),
+	},
+};
+module_peci_driver(peci_cputemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("PECI cputemp driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
new file mode 100644
index 000000000000..78bf29cb2c4c
--- /dev/null
+++ b/drivers/hwmon/peci-dimmtemp.c
@@ -0,0 +1,432 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/delay.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/peci.h>
+#include <linux/workqueue.h>
+
+#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
+
+#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
+#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
+
+#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
+#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
+
+#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
+#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
+
+#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
+#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
+
+#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
+
+#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
+
+#define UPDATE_INTERVAL_MIN  HZ
+
+#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
+#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
+
+enum cpu_gens {
+	CPU_GEN_HSX, /* Haswell Xeon */
+	CPU_GEN_BRX, /* Broadwell Xeon */
+	CPU_GEN_SKX, /* Skylake Xeon */
+	CPU_GEN_MAX
+};
+
+struct cpu_gen_info {
+	u32 type;
+	u32 cpu_id;
+	u32 chan_rank_max;
+	u32 dimm_idx_max;
+};
+
+struct temp_data {
+	bool valid;
+	s32  value;
+	unsigned long last_updated;
+};
+
+struct peci_dimmtemp {
+	struct peci_client *client;
+	struct device *dev;
+	struct workqueue_struct *work_queue;
+	struct delayed_work work_handler;
+	char name[PECI_NAME_SIZE];
+	struct temp_data temp[DIMM_NUMS_MAX];
+	u8 addr;
+	uint cpu_no;
+	const struct cpu_gen_info *gen_info;
+	u32 dimm_mask;
+	int retry_count;
+	int channels;
+	u32 temp_config[DIMM_NUMS_MAX + 1];
+	struct hwmon_channel_info temp_info;
+	const struct hwmon_channel_info *info[2];
+	struct hwmon_chip_info chip;
+};
+
+static const struct cpu_gen_info cpu_gen_info_table[] = {
+	{ .type  = CPU_GEN_HSX,
+	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
+	  .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
+	  .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
+	{ .type  = CPU_GEN_BRX,
+	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
+	  .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
+	  .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
+	{ .type  = CPU_GEN_SKX,
+	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
+	  .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
+	  .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
+};
+
+static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
+	{ "DIMM A0", "DIMM A1", "DIMM A2" },
+	{ "DIMM B0", "DIMM B1", "DIMM B2" },
+	{ "DIMM C0", "DIMM C1", "DIMM C2" },
+	{ "DIMM D0", "DIMM D1", "DIMM D2" },
+	{ "DIMM E0", "DIMM E1", "DIMM E2" },
+	{ "DIMM F0", "DIMM F1", "DIMM F2" },
+	{ "DIMM G0", "DIMM G1", "DIMM G2" },
+	{ "DIMM H0", "DIMM H1", "DIMM H2" },
+};
+
+static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
+			 void *msg)
+{
+	return peci_command(priv->client->adapter, cmd, msg);
+}
+
+static int need_update(struct temp_data *temp)
+{
+	if (temp->valid &&
+	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
+		return 0;
+
+	return 1;
+}
+
+static void mark_updated(struct temp_data *temp)
+{
+	temp->valid = true;
+	temp->last_updated = jiffies;
+}
+
+static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
+{
+	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
+	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
+	struct peci_rd_pkg_cfg_msg msg;
+	int rc;
+
+	if (!need_update(&priv->temp[dimm_no]))
+		return 0;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_DDR_DIMM_TEMP;
+	msg.param = chan_rank;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
+
+	mark_updated(&priv->temp[dimm_no]);
+
+	return 0;
+}
+
+static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
+{
+	int dimm_nums_max = priv->gen_info->chan_rank_max *
+			    priv->gen_info->dimm_idx_max;
+	int idx, found = 0;
+
+	for (idx = 0; idx < dimm_nums_max; idx++) {
+		if (priv->dimm_mask & BIT(idx)) {
+			if (channel == found)
+				break;
+
+			found++;
+		}
+	}
+
+	return idx;
+}
+
+static int dimmtemp_read_string(struct device *dev,
+				enum hwmon_sensor_types type,
+				u32 attr, int channel, const char **str)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
+	int dimm_no, chan_rank, dimm_idx;
+
+	switch (attr) {
+	case hwmon_temp_label:
+		dimm_no = find_dimm_number(priv, channel);
+		chan_rank = dimm_no / dimm_idx_max;
+		dimm_idx = dimm_no % dimm_idx_max;
+		*str = dimmtemp_label[chan_rank][dimm_idx];
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
+			 u32 attr, int channel, long *val)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
+	int dimm_no = find_dimm_number(priv, channel);
+	int rc;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		rc = get_dimm_temp(priv, dimm_no);
+		if (rc)
+			return rc;
+
+		*val = priv->temp[dimm_no].value;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static umode_t dimmtemp_is_visible(const void *data,
+				   enum hwmon_sensor_types type,
+				   u32 attr, int channel)
+{
+	switch (attr) {
+	case hwmon_temp_label:
+	case hwmon_temp_input:
+		return 0444;
+	default:
+		return 0;
+	}
+}
+
+static const struct hwmon_ops dimmtemp_ops = {
+	.is_visible = dimmtemp_is_visible,
+	.read_string = dimmtemp_read_string,
+	.read = dimmtemp_read,
+};
+
+static int check_populated_dimms(struct peci_dimmtemp *priv)
+{
+	u32 chan_rank_max = priv->gen_info->chan_rank_max;
+	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
+	struct peci_rd_pkg_cfg_msg msg;
+	int chan_rank, dimm_idx;
+	int rc, channels = 0;
+
+	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
+		msg.addr = priv->addr;
+		msg.index = MBX_INDEX_DDR_DIMM_TEMP;
+		msg.param = chan_rank;
+		msg.rx_len = 4;
+
+		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+		if (rc) {
+			priv->dimm_mask = 0;
+			return rc;
+		}
+
+		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
+			if (msg.pkg_config[dimm_idx]) {
+				priv->dimm_mask |= BIT(chan_rank *
+						       dimm_idx_max +
+						       dimm_idx);
+				channels++;
+			}
+		}
+	}
+
+	if (!priv->dimm_mask)
+		return -EAGAIN;
+
+	priv->channels = channels;
+
+	dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
+	return 0;
+}
+
+static int create_dimm_temp_info(struct peci_dimmtemp *priv)
+{
+	struct device *hwmon_dev;
+	int rc, i;
+
+	rc = check_populated_dimms(priv);
+	if (!rc) {
+		for (i = 0; i < priv->channels; i++)
+			priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
+
+		priv->chip.ops = &dimmtemp_ops;
+		priv->chip.info = priv->info;
+
+		priv->info[0] = &priv->temp_info;
+
+		priv->temp_info.type = hwmon_temp;
+		priv->temp_info.config = priv->temp_config;
+
+		hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
+								 priv->name,
+								 priv,
+								 &priv->chip,
+								 NULL);
+		rc = PTR_ERR_OR_ZERO(hwmon_dev);
+		if (!rc)
+			dev_dbg(priv->dev, "%s: sensor '%s'\n",
+				dev_name(hwmon_dev), priv->name);
+	} else if (rc == -EAGAIN) {
+		if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
+			queue_delayed_work(priv->work_queue,
+					   &priv->work_handler,
+					   DIMM_MASK_CHECK_DELAY_JIFFIES);
+			priv->retry_count++;
+			dev_dbg(priv->dev,
+				"Deferred DIMM temp info creation\n");
+		} else {
+			rc = -ETIMEDOUT;
+			dev_err(priv->dev,
+				"Timeout retrying DIMM temp info creation\n");
+		}
+	}
+
+	return rc;
+}
+
+static void create_dimm_temp_info_delayed(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
+						  work_handler);
+	int rc;
+
+	rc = create_dimm_temp_info(priv);
+	if (rc && rc != -EAGAIN)
+		dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
+}
+
+static int check_cpu_id(struct peci_dimmtemp *priv)
+{
+	struct peci_rd_pkg_cfg_msg msg;
+	u32 cpu_id;
+	int i, rc;
+
+	msg.addr = priv->addr;
+	msg.index = MBX_INDEX_CPU_ID;
+	msg.param = PKG_ID_CPU_ID;
+	msg.rx_len = 4;
+
+	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
+	if (rc)
+		return rc;
+
+	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
+		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
+
+	for (i = 0; i < CPU_GEN_MAX; i++) {
+		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
+			priv->gen_info = &cpu_gen_info_table[i];
+			break;
+		}
+	}
+
+	if (!priv->gen_info)
+		return -ENODEV;
+
+	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
+	return 0;
+}
+
+static int peci_dimmtemp_probe(struct peci_client *client)
+{
+	struct device *dev = &client->dev;
+	struct peci_dimmtemp *priv;
+	int rc;
+
+	if ((client->adapter->cmd_mask &
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
+	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
+		dev_err(dev, "Client doesn't support temperature monitoring\n");
+		return -EINVAL;
+	}
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, priv);
+	priv->client = client;
+	priv->dev = dev;
+	priv->addr = client->addr;
+	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
+
+	snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
+		 priv->cpu_no);
+
+	rc = check_cpu_id(priv);
+	if (rc) {
+		dev_err(dev, "Client CPU is not supported\n");
+		return rc;
+	}
+
+	priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
+	if (!priv->work_queue)
+		return -ENOMEM;
+
+	INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
+
+	rc = create_dimm_temp_info(priv);
+	if (rc && rc != -EAGAIN) {
+		dev_err(dev, "Failed to create DIMM temp info\n");
+		goto err_free_wq;
+	}
+
+	return 0;
+
+err_free_wq:
+	destroy_workqueue(priv->work_queue);
+	return rc;
+}
+
+static int peci_dimmtemp_remove(struct peci_client *client)
+{
+	struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
+
+	cancel_delayed_work_sync(&priv->work_handler);
+	destroy_workqueue(priv->work_queue);
+
+	return 0;
+}
+
+static const struct of_device_id peci_dimmtemp_of_table[] = {
+	{ .compatible = "intel,peci-dimmtemp" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
+
+static struct peci_driver peci_dimmtemp_driver = {
+	.probe  = peci_dimmtemp_probe,
+	.remove = peci_dimmtemp_remove,
+	.driver = {
+		.name           = "peci-dimmtemp",
+		.of_match_table = of_match_ptr(peci_dimmtemp_of_table),
+	},
+};
+module_peci_driver(peci_dimmtemp_driver);
+
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("PECI dimmtemp driver");
+MODULE_LICENSE("GPL v2");
-- 
2.16.2


* [PATCH v3 10/10] Add a maintainer for the PECI subsystem
  2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
                   ` (8 preceding siblings ...)
  2018-04-10 18:32 ` [PATCH v3 09/10] drivers/hwmon: Add " Jae Hyun Yoo
@ 2018-04-10 18:32 ` Jae Hyun Yoo
  9 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-10 18:32 UTC (permalink / raw)
  To: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Bills,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc, Jae Hyun Yoo

This commit adds maintainer information for the PECI subsystem.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Julia Cartwright <juliac@eso.teric.us>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
---
 MAINTAINERS | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5cd5ff0e4428..3e6917e1ad31 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10965,6 +10965,16 @@ L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
 F:	drivers/platform/x86/peaq-wmi.c
 
+PECI SUBSYSTEM
+M:	Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
+M:	Jason M Bills <jason.m.bills@linux.intel.com>
+S:	Maintained
+F:	Documentation/devicetree/bindings/peci/
+F:	drivers/peci/
+F:	drivers/hwmon/peci-*.c
+F:	include/linux/peci.h
+F:	include/uapi/linux/peci-ioctl.h
+
 PER-CPU MEMORY ALLOCATOR
 M:	Tejun Heo <tj@kernel.org>
 M:	Christoph Lameter <cl@linux.com>
-- 
2.16.2


* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-10 18:32 ` [PATCH v3 09/10] drivers/hwmon: Add " Jae Hyun Yoo
@ 2018-04-10 22:28   ` Guenter Roeck
  2018-04-11 21:59     ` Jae Hyun Yoo
  2018-04-24 15:56   ` Andy Shevchenko
  1 sibling, 1 reply; 54+ messages in thread
From: Guenter Roeck @ 2018-04-10 22:28 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Bills, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:
> This commit adds PECI cputemp and dimmtemp hwmon drivers.
> 
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
> Reviewed-by: James Feist <james.feist@linux.intel.com>
> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Andrew Jeffery <andrew@aj.id.au>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Jason M Bills <jason.m.bills@linux.intel.com>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Julia Cartwright <juliac@eso.teric.us>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Cc: Milton Miller II <miltonm@us.ibm.com>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
> ---
>  drivers/hwmon/Kconfig         |  28 ++
>  drivers/hwmon/Makefile        |   2 +
>  drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
>  4 files changed, 1245 insertions(+)
>  create mode 100644 drivers/hwmon/peci-cputemp.c
>  create mode 100644 drivers/hwmon/peci-dimmtemp.c
> 
> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> index f249a4428458..c52f610f81d0 100644
> --- a/drivers/hwmon/Kconfig
> +++ b/drivers/hwmon/Kconfig
> @@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
>  	  This driver can also be built as a module.  If so, the module
>  	  will be called nct7904.
>  
> +config SENSORS_PECI_CPUTEMP
> +	tristate "PECI CPU temperature monitoring support"
> +	depends on OF
> +	depends on PECI
> +	help
> +	  If you say yes here you get support for the generic Intel PECI
> +	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
> +	  readings of the CPU package and CPU cores that are accessible using
> +	  the PECI Client Command Suite via the processor PECI client.
> +	  Check Documentation/hwmon/peci-cputemp for details.
> +
> +	  This driver can also be built as a module.  If so, the module
> +	  will be called peci-cputemp.
> +
> +config SENSORS_PECI_DIMMTEMP
> +	tristate "PECI DIMM temperature monitoring support"
> +	depends on OF
> +	depends on PECI
> +	help
> +	  If you say yes here you get support for the generic Intel PECI hwmon
> +	  driver which provides Digital Thermal Sensor (DTS) thermal readings of
> +	  DIMM components that are accessible using the PECI Client Command
> +	  Suite via the processor PECI client.
> +	  Check Documentation/hwmon/peci-dimmtemp for details.
> +
> +	  This driver can also be built as a module.  If so, the module
> +	  will be called peci-dimmtemp.
> +
>  config SENSORS_NSA320
>  	tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
>  	depends on GPIOLIB && OF
> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> index e7d52a36e6c4..48d9598fcd3a 100644
> --- a/drivers/hwmon/Makefile
> +++ b/drivers/hwmon/Makefile
> @@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)	+= nct7802.o
>  obj-$(CONFIG_SENSORS_NCT7904)	+= nct7904.o
>  obj-$(CONFIG_SENSORS_NSA320)	+= nsa320-hwmon.o
>  obj-$(CONFIG_SENSORS_NTC_THERMISTOR)	+= ntc_thermistor.o
> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
>  obj-$(CONFIG_SENSORS_PC87360)	+= pc87360.o
>  obj-$(CONFIG_SENSORS_PC87427)	+= pc87427.o
>  obj-$(CONFIG_SENSORS_PCF8591)	+= pcf8591.o
> diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
> new file mode 100644
> index 000000000000..f0bc92687512
> --- /dev/null
> +++ b/drivers/hwmon/peci-cputemp.c
> @@ -0,0 +1,783 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2018 Intel Corporation
> +
> +#include <linux/delay.h>
> +#include <linux/hwmon.h>
> +#include <linux/hwmon-sysfs.h>

Is this include needed?

> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/of_device.h>
> +#include <linux/peci.h>
> +
> +#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
> +
> +#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
> +#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
> +#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
> +
> +#define DEFAULT_CHANNEL_NUMS  5
> +#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
> +#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
> +
> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
> +
> +#define UPDATE_INTERVAL_MIN   HZ
> +
> +enum cpu_gens {
> +	CPU_GEN_HSX, /* Haswell Xeon */
> +	CPU_GEN_BRX, /* Broadwell Xeon */
> +	CPU_GEN_SKX, /* Skylake Xeon */
> +	CPU_GEN_MAX
> +};
> +
> +struct cpu_gen_info {
> +	u32 type;
> +	u32 cpu_id;
> +	u32 core_max;
> +};
> +
> +struct temp_data {
> +	bool valid;
> +	s32  value;
> +	unsigned long last_updated;
> +};
> +
> +struct temp_group {
> +	struct temp_data die;
> +	struct temp_data dts_margin;
> +	struct temp_data tcontrol;
> +	struct temp_data tthrottle;
> +	struct temp_data tjmax;
> +	struct temp_data core[CORETEMP_CHANNEL_NUMS];
> +};
> +
> +struct peci_cputemp {
> +	struct peci_client *client;
> +	struct device *dev;
> +	char name[PECI_NAME_SIZE];
> +	struct temp_group temp;
> +	u8 addr;
> +	uint cpu_no;
> +	const struct cpu_gen_info *gen_info;
> +	u32 core_mask;
> +	u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
> +	uint config_idx;
> +	struct hwmon_channel_info temp_info;
> +	const struct hwmon_channel_info *info[2];
> +	struct hwmon_chip_info chip;
> +};
> +
> +enum cputemp_channels {
> +	channel_die,
> +	channel_dts_mrgn,
> +	channel_tcontrol,
> +	channel_tthrottle,
> +	channel_tjmax,
> +	channel_core,
> +};
> +
> +static const struct cpu_gen_info cpu_gen_info_table[] = {
> +	{ .type = CPU_GEN_HSX,
> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
> +	  .core_max = CORE_MAX_ON_HSX },
> +	{ .type = CPU_GEN_BRX,
> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
> +	  .core_max = CORE_MAX_ON_BDX },
> +	{ .type = CPU_GEN_SKX,
> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
> +	  .core_max = CORE_MAX_ON_SKX },
> +};
> +
> +static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
> +	/* Die temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
> +	HWMON_T_CRIT_HYST,
> +
> +	/* DTS margin temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
> +
> +	/* Tcontrol temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
> +
> +	/* Tthrottle temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +
> +	/* Tjmax temperature */
> +	HWMON_T_LABEL | HWMON_T_INPUT,
> +
> +	/* Core temperature - for all core channels */
> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
> +	HWMON_T_CRIT_HYST,
> +};
> +
> +static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
> +	"Die",
> +	"DTS margin",
> +	"Tcontrol",
> +	"Tthrottle",
> +	"Tjmax",
> +	"Core 0", "Core 1", "Core 2", "Core 3",
> +	"Core 4", "Core 5", "Core 6", "Core 7",
> +	"Core 8", "Core 9", "Core 10", "Core 11",
> +	"Core 12", "Core 13", "Core 14", "Core 15",
> +	"Core 16", "Core 17", "Core 18", "Core 19",
> +	"Core 20", "Core 21", "Core 22", "Core 23",
> +};
> +
> +static int send_peci_cmd(struct peci_cputemp *priv,
> +			 enum peci_cmd cmd,
> +			 void *msg)
> +{
> +	return peci_command(priv->client->adapter, cmd, msg);
> +}
> +
> +static int need_update(struct temp_data *temp)

Please use bool.

> +{
> +	if (temp->valid &&
> +	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
> +		return 0;
> +
> +	return 1;
> +}
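
For illustration only, the bool variant could look roughly like the
sketch below. It is a userspace model, not driver code: HZ, jiffies and
time_before() are simplified stand-ins for the kernel definitions so the
snippet builds on its own, and the value of HZ is an arbitrary assumption.

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel's HZ, jiffies and time_before(). */
#define HZ 100
static unsigned long jiffies;
#define time_before(a, b) ((long)((a) - (b)) < 0)

#define UPDATE_INTERVAL_MIN HZ

struct temp_data {
	bool valid;
	int value;
	unsigned long last_updated;
};

/* True when no value is cached yet or the cached one is at least one interval old. */
static bool need_update(const struct temp_data *temp)
{
	return !(temp->valid &&
		 time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN));
}
```

Callers keep the same shape, e.g. "if (!need_update(&priv->temp.die))
return 0;".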
> +
> +static void mark_updated(struct temp_data *temp)
> +{
> +	temp->valid = true;
> +	temp->last_updated = jiffies;
> +}
> +
> +static s32 ten_dot_six_to_millidegree(s32 val)
> +{
> +	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
> +}
> +
> +static int get_tjmax(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	int rc;
> +
> +	if (!priv->temp.tjmax.valid) {
> +		msg.addr = priv->addr;
> +		msg.index = MBX_INDEX_TEMP_TARGET;
> +		msg.param = 0;
> +		msg.rx_len = 4;
> +
> +		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +		if (rc)
> +			return rc;
> +
> +		priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
> +		priv->temp.tjmax.valid = true;
> +	}
> +
> +	return 0;
> +}
> +
> +static int get_tcontrol(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 tcontrol_margin;
> +	s32 tthrottle_offset;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.tcontrol))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_TEMP_TARGET;
> +	msg.param = 0;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	tcontrol_margin = msg.pkg_config[1];
> +	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
> +	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
> +
> +	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
> +	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
> +
> +	mark_updated(&priv->temp.tcontrol);
> +	mark_updated(&priv->temp.tthrottle);
> +
> +	return 0;
> +}
> +
> +static int get_tthrottle(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 tcontrol_margin;
> +	s32 tthrottle_offset;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.tthrottle))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_TEMP_TARGET;
> +	msg.param = 0;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
> +	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
> +
> +	tcontrol_margin = msg.pkg_config[1];
> +	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
> +	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
> +
> +	mark_updated(&priv->temp.tthrottle);
> +	mark_updated(&priv->temp.tcontrol);
> +
> +	return 0;
> +}

I am completely missing how the two functions above differ. Both issue the
same read and update both tcontrol and tthrottle.
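
As a rough sketch of how the shared part might be factored out: both
functions read the same temperature target bytes and derive the same
values, so a single decode step would do. The snippet below is a
userspace model under stated assumptions: struct temp_target and
decode_temp_target() are hypothetical names, the byte layout and the 0x2f
mask are copied unchanged from the patch, and the PECI transfer itself is
left out.

```c
#include <stdint.h>

/* Hypothetical container for the values derived from one read. */
struct temp_target {
	int32_t tjmax;     /* millidegrees Celsius */
	int32_t tcontrol;  /* millidegrees Celsius */
	int32_t tthrottle; /* millidegrees Celsius */
};

/*
 * Decode the four bytes returned for the temperature target read.
 * Byte 1 is a signed 8-bit Tcontrol margin, byte 2 is Tjmax, and
 * byte 3 carries the Tthrottle offset (mask taken from the patch).
 */
static void decode_temp_target(const uint8_t pkg_config[4],
			       struct temp_target *out)
{
	int32_t tcontrol_margin = pkg_config[1];
	int32_t tthrottle_offset;

	/* Sign-extend the 8-bit margin, then scale to millidegrees. */
	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
	tthrottle_offset = (pkg_config[3] & 0x2f) * 1000;

	out->tjmax = (int32_t)pkg_config[2] * 1000;
	out->tcontrol = out->tjmax - tcontrol_margin;
	out->tthrottle = out->tjmax - tthrottle_offset;
}
```

With something like this, get_tcontrol() and get_tthrottle() could
collapse into one function that performs the read once and marks both
cached values as updated.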

> +
> +static int get_die_temp(struct peci_cputemp *priv)
> +{
> +	struct peci_get_temp_msg msg;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.die))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
> +	if (rc)
> +		return rc;
> +
> +	priv->temp.die.value = priv->temp.tjmax.value +
> +			       ((s32)msg.temp_raw * 1000 / 64);
> +
> +	mark_updated(&priv->temp.die);
> +
> +	return 0;
> +}
> +
> +static int get_dts_margin(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 dts_margin;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.dts_margin))
> +		return 0;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_DTS_MARGIN;
> +	msg.param = 0;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
> +
> +	/**
> +	 * Processors return a value of DTS reading in 10.6 format
> +	 * (10 bits signed decimal, 6 bits fractional).
> +	 * Error codes:
> +	 *   0x8000: General sensor error
> +	 *   0x8001: Reserved
> +	 *   0x8002: Underflow on reading value
> +	 *   0x8003-0x81ff: Reserved
> +	 */
> +	if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
> +		return -EIO;
> +
> +	dts_margin = ten_dot_six_to_millidegree(dts_margin);
> +
> +	priv->temp.dts_margin.value = dts_margin;
> +
> +	mark_updated(&priv->temp.dts_margin);
> +
> +	return 0;
> +}
> +
> +static int get_core_temp(struct peci_cputemp *priv, int core_index)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	s32 core_dts_margin;
> +	int rc;
> +
> +	if (!need_update(&priv->temp.core[core_index]))
> +		return 0;
> +
> +	rc = get_tjmax(priv);
> +	if (rc)
> +		return rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
> +	msg.param = core_index;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
> +
> +	/**
> +	 * Processors return a value of the core DTS reading in 10.6 format
> +	 * (10 bits signed decimal, 6 bits fractional).
> +	 * Error codes:
> +	 *   0x8000: General sensor error
> +	 *   0x8001: Reserved
> +	 *   0x8002: Underflow on reading value
> +	 *   0x8003-0x81ff: Reserved
> +	 */
> +	if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
> +		return -EIO;
> +
> +	core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
> +
> +	priv->temp.core[core_index].value = priv->temp.tjmax.value +
> +					    core_dts_margin;
> +
> +	mark_updated(&priv->temp.core[core_index]);
> +
> +	return 0;
> +}
> +

There is a lot of duplication in those functions. Would it be possible
to factor the common code into helper functions instead of duplicating
everything several times ?
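
A rough sketch of one way to share the error-code check and the 10.6
conversion that get_dts_margin() and get_core_temp() both carry. It is
plain C so it can be compiled outside the kernel; the helper name
dts_raw_to_millidegree() and the sign-extension formula are assumptions
for illustration, not taken from this patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Error codes occupy 0x8000-0x81ff, per the comment in the quoted code. */
#define DTS_ERR_FIRST 0x8000
#define DTS_ERR_LAST  0x81ff

/*
 * Hypothetical helper: decode a raw 16-bit reading in 10.6 fixed-point
 * format into millidegrees. Returns 0 on success, -EIO for error codes.
 */
static int dts_raw_to_millidegree(uint8_t lo, uint8_t hi, int32_t *mdeg)
{
	int32_t raw = (hi << 8) | lo;

	assert(mdeg);
	if (raw >= DTS_ERR_FIRST && raw <= DTS_ERR_LAST)
		return -EIO;

	/* Sign-extend from 16 bits, then scale 1/64 degree units by 1000. */
	raw = (raw ^ 0x8000) - 0x8000;
	*mdeg = raw * 1000 / 64;
	return 0;
}
```

Each caller would then only differ in the mailbox index it requests and
in whether it adds tjmax to the result.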

> +static int find_core_index(struct peci_cputemp *priv, int channel)
> +{
> +	int core_channel = channel - DEFAULT_CHANNEL_NUMS;
> +	int idx, found = 0;
> +
> +	for (idx = 0; idx < priv->gen_info->core_max; idx++) {
> +		if (priv->core_mask & BIT(idx)) {
> +			if (core_channel == found)
> +				break;
> +
> +			found++;
> +		}
> +	}
> +
> +	return idx;

What if nothing is found ?
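
A minimal sketch of a lookup that reports the "not found" case
explicitly, so the caller can check for it instead of indexing with
core_max. This is plain C for illustration; the name find_nth_core(),
its parameters, and the choice of -ENOENT are assumptions, not part of
the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/*
 * Hypothetical variant of find_core_index(): return the index of the
 * nth set bit in core_mask, or -ENOENT if there are fewer set bits.
 */
static int find_nth_core(uint32_t core_mask, int core_max, int nth)
{
	int idx, found = 0;

	assert(core_max >= 0 && core_max <= 32);
	for (idx = 0; idx < core_max; idx++) {
		if (!(core_mask & BIT(idx)))
			continue;
		if (found++ == nth)
			return idx;
	}

	return -ENOENT;
}
```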

> +}
> +
> +static int cputemp_read_string(struct device *dev,
> +			       enum hwmon_sensor_types type,
> +			       u32 attr, int channel, const char **str)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int core_index;
> +
> +	switch (attr) {
> +	case hwmon_temp_label:
> +		if (channel < DEFAULT_CHANNEL_NUMS) {
> +			*str = cputemp_label[channel];
> +		} else {
> +			core_index = find_core_index(priv, channel);

FWIW, it might be better to pass channel - DEFAULT_CHANNEL_NUMS
as the parameter.

What if find_core_index() returns priv->gen_info->core_max, i.e.
if it didn't find a core ?

> +			*str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
> +		}
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_die(struct device *dev,
> +			    enum hwmon_sensor_types type,
> +			    u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_die_temp(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.die.value;
> +		return 0;
> +	case hwmon_temp_max:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value;
> +		return 0;
> +	case hwmon_temp_crit:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;
> +	case hwmon_temp_crit_hyst:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_dts_margin(struct device *dev,
> +				   enum hwmon_sensor_types type,
> +				   u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_dts_margin(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.dts_margin.value;
> +		return 0;
> +	case hwmon_temp_min:
> +		*val = 0;
> +		return 0;

This attribute should not exist.

> +	case hwmon_temp_lcrit:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value - priv->temp.tjmax.value;

lcrit is tcontrol - tjmax, and crit_hyst above is
tjmax - tcontrol ? How does this make sense ?

> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_tcontrol(struct device *dev,
> +				 enum hwmon_sensor_types type,
> +				 u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value;
> +		return 0;
> +	case hwmon_temp_crit:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;

Am I missing something, or is the same temperature reported several times ?
tjmax is also reported as temp_crit in cputemp_read_die(), for example.

> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_tthrottle(struct device *dev,
> +				  enum hwmon_sensor_types type,
> +				  u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_tthrottle(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tthrottle.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_tjmax(struct device *dev,
> +			      enum hwmon_sensor_types type,
> +			      u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int cputemp_read_core(struct device *dev,
> +			     enum hwmon_sensor_types type,
> +			     u32 attr, int channel, long *val)
> +{
> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
> +	int core_index = find_core_index(priv, channel);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_core_temp(priv, core_index);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.core[core_index].value;
> +		return 0;
> +	case hwmon_temp_max:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tcontrol.value;
> +		return 0;
> +	case hwmon_temp_crit:
> +		rc = get_tjmax(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value;
> +		return 0;
> +	case hwmon_temp_crit_hyst:
> +		rc = get_tcontrol(priv);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}

There is again a lot of duplication in those functions.

> +
> +static int cputemp_read(struct device *dev,
> +			enum hwmon_sensor_types type,
> +			u32 attr, int channel, long *val)
> +{
> +	switch (channel) {
> +	case channel_die:
> +		return cputemp_read_die(dev, type, attr, channel, val);
> +	case channel_dts_mrgn:
> +		return cputemp_read_dts_margin(dev, type, attr, channel, val);
> +	case channel_tcontrol:
> +		return cputemp_read_tcontrol(dev, type, attr, channel, val);
> +	case channel_tthrottle:
> +		return cputemp_read_tthrottle(dev, type, attr, channel, val);
> +	case channel_tjmax:
> +		return cputemp_read_tjmax(dev, type, attr, channel, val);
> +	default:
> +		if (channel < CPUTEMP_CHANNEL_NUMS)
> +			return cputemp_read_core(dev, type, attr, channel, val);
> +
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static umode_t cputemp_is_visible(const void *data,
> +				  enum hwmon_sensor_types type,
> +				  u32 attr, int channel)
> +{
> +	const struct peci_cputemp *priv = data;
> +
> +	if (priv->temp_config[channel] & BIT(attr))
> +		return 0444;
> +
> +	return 0;
> +}
> +
> +static const struct hwmon_ops cputemp_ops = {
> +	.is_visible = cputemp_is_visible,
> +	.read_string = cputemp_read_string,
> +	.read = cputemp_read,
> +};
> +
> +static int check_resolved_cores(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pci_cfg_local_msg msg;
> +	int rc;
> +
> +	if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
> +		return -EINVAL;
> +
> +	/* Get the RESOLVED_CORES register value */
> +	msg.addr = priv->addr;
> +	msg.bus = 1;
> +	msg.device = 30;
> +	msg.function = 3;
> +	msg.reg = 0xB4;

Can this be made less magic with some defines ?
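
Something along these lines, as a sketch only. The macro names are
hypothetical; the values are the ones used in the quoted code. The
helper (also hypothetical) shows the byte assembly done with unsigned
casts so that the top byte is never shifted into the sign bit of an int:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the values are the ones used in the quoted code. */
#define RESOLVED_CORES_BUS   1
#define RESOLVED_CORES_DEV   30
#define RESOLVED_CORES_FUNC  3
#define RESOLVED_CORES_REG   0xb4

/* Assemble a little-endian 32-bit value from four bytes. */
static uint32_t le_bytes_to_u32(const uint8_t *b)
{
	assert(b != NULL);
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}
```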

> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
> +	if (rc)
> +		return rc;
> +
> +	priv->core_mask = msg.pci_config[3] << 24 |
> +			  msg.pci_config[2] << 16 |
> +			  msg.pci_config[1] << 8 |
> +			  msg.pci_config[0];
> +
> +	if (!priv->core_mask)
> +		return -EAGAIN;
> +
> +	dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
> +	return 0;
> +}
> +
> +static int create_core_temp_info(struct peci_cputemp *priv)
> +{
> +	int rc, i;
> +
> +	rc = check_resolved_cores(priv);
> +	if (!rc) {
> +		for (i = 0; i < priv->gen_info->core_max; i++) {
> +			if (priv->core_mask & BIT(i)) {
> +				priv->temp_config[priv->config_idx++] =
> +						     config_table[channel_core];
> +			}
> +		}
> +	}
> +
> +	return rc;
> +}
> +
> +static int check_cpu_id(struct peci_cputemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	u32 cpu_id;
> +	int i, rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_CPU_ID;
> +	msg.param = PKG_ID_CPU_ID;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
> +		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
> +
> +	for (i = 0; i < CPU_GEN_MAX; i++) {
> +		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
> +			priv->gen_info = &cpu_gen_info_table[i];
> +			break;
> +		}
> +	}
> +
> +	if (!priv->gen_info)
> +		return -ENODEV;
> +
> +	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
> +	return 0;
> +}
> +
> +static int peci_cputemp_probe(struct peci_client *client)
> +{
> +	struct device *dev = &client->dev;
> +	struct peci_cputemp *priv;
> +	struct device *hwmon_dev;
> +	int rc;
> +
> +	if ((client->adapter->cmd_mask &
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
> +		dev_err(dev, "Client doesn't support temperature monitoring\n");
> +		return -EINVAL;

Does this mean there will be an error message for each non-supported CPU ?
Why ?

> +	}
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	dev_set_drvdata(dev, priv);
> +	priv->client = client;
> +	priv->dev = dev;
> +	priv->addr = client->addr;
> +	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
> +
> +	snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
> +		 priv->cpu_no);
> +
> +	rc = check_cpu_id(priv);
> +	if (rc) {
> +		dev_err(dev, "Client CPU is not supported\n");

-ENODEV is not an error, and should not result in an error message.
Besides, the error can also be propagated from peci core code,
and may well be something else.

> +		return rc;
> +	}
> +
> +	priv->temp_config[priv->config_idx++] = config_table[channel_die];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
> +	priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
> +
> +	rc = create_core_temp_info(priv);
> +	if (rc)
> +		dev_dbg(dev, "Failed to create core temp info\n");

Then what ? Shouldn't this result in probe deferral or something more useful
instead of just being ignored ?

> +
> +	priv->chip.ops = &cputemp_ops;
> +	priv->chip.info = priv->info;
> +
> +	priv->info[0] = &priv->temp_info;
> +
> +	priv->temp_info.type = hwmon_temp;
> +	priv->temp_info.config = priv->temp_config;
> +
> +	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
> +							 priv->name,
> +							 priv,
> +							 &priv->chip,
> +							 NULL);
> +
> +	if (IS_ERR(hwmon_dev))
> +		return PTR_ERR(hwmon_dev);
> +
> +	dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id peci_cputemp_of_table[] = {
> +	{ .compatible = "intel,peci-cputemp" },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
> +
> +static struct peci_driver peci_cputemp_driver = {
> +	.probe  = peci_cputemp_probe,
> +	.driver = {
> +		.name           = "peci-cputemp",
> +		.of_match_table = of_match_ptr(peci_cputemp_of_table),
> +	},
> +};
> +module_peci_driver(peci_cputemp_driver);
> +
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_DESCRIPTION("PECI cputemp driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
> new file mode 100644
> index 000000000000..78bf29cb2c4c
> --- /dev/null
> +++ b/drivers/hwmon/peci-dimmtemp.c

FWIW, this should be two separate patches.

> @@ -0,0 +1,432 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2018 Intel Corporation
> +
> +#include <linux/delay.h>
> +#include <linux/hwmon.h>
> +#include <linux/hwmon-sysfs.h>

Needed ?

> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/of_device.h>
> +#include <linux/peci.h>
> +#include <linux/workqueue.h>
> +
> +#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
> +
> +#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
> +#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
> +
> +#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
> +#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
> +
> +#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
> +#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
> +
> +#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
> +#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
> +
> +#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
> +
> +#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
> +
> +#define UPDATE_INTERVAL_MIN  HZ
> +
> +#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
> +#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
> +
> +enum cpu_gens {
> +	CPU_GEN_HSX, /* Haswell Xeon */
> +	CPU_GEN_BRX, /* Broadwell Xeon */
> +	CPU_GEN_SKX, /* Skylake Xeon */
> +	CPU_GEN_MAX
> +};
> +
> +struct cpu_gen_info {
> +	u32 type;
> +	u32 cpu_id;
> +	u32 chan_rank_max;
> +	u32 dimm_idx_max;
> +};
> +
> +struct temp_data {
> +	bool valid;
> +	s32  value;
> +	unsigned long last_updated;
> +};
> +
> +struct peci_dimmtemp {
> +	struct peci_client *client;
> +	struct device *dev;
> +	struct workqueue_struct *work_queue;
> +	struct delayed_work work_handler;
> +	char name[PECI_NAME_SIZE];
> +	struct temp_data temp[DIMM_NUMS_MAX];
> +	u8 addr;
> +	uint cpu_no;
> +	const struct cpu_gen_info *gen_info;
> +	u32 dimm_mask;
> +	int retry_count;
> +	int channels;
> +	u32 temp_config[DIMM_NUMS_MAX + 1];
> +	struct hwmon_channel_info temp_info;
> +	const struct hwmon_channel_info *info[2];
> +	struct hwmon_chip_info chip;
> +};
> +
> +static const struct cpu_gen_info cpu_gen_info_table[] = {
> +	{ .type  = CPU_GEN_HSX,
> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
> +	  .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
> +	{ .type  = CPU_GEN_BRX,
> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
> +	  .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
> +	{ .type  = CPU_GEN_SKX,
> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
> +	  .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
> +};
> +
> +static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
> +	{ "DIMM A0", "DIMM A1", "DIMM A2" },
> +	{ "DIMM B0", "DIMM B1", "DIMM B2" },
> +	{ "DIMM C0", "DIMM C1", "DIMM C2" },
> +	{ "DIMM D0", "DIMM D1", "DIMM D2" },
> +	{ "DIMM E0", "DIMM E1", "DIMM E2" },
> +	{ "DIMM F0", "DIMM F1", "DIMM F2" },
> +	{ "DIMM G0", "DIMM G1", "DIMM G2" },
> +	{ "DIMM H0", "DIMM H1", "DIMM H2" },
> +};
> +
> +static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
> +			 void *msg)
> +{
> +	return peci_command(priv->client->adapter, cmd, msg);
> +}
> +
> +static int need_update(struct temp_data *temp)
> +{
> +	if (temp->valid &&
> +	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +static void mark_updated(struct temp_data *temp)
> +{
> +	temp->valid = true;
> +	temp->last_updated = jiffies;
> +}

It might make sense to provide the duplicate functions in a core file.
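
A sketch of what such shared helpers could look like. It is written as
plain C with the current tick count passed in, so that it can be
exercised outside the kernel; the names temp_cache, tick_before(),
cache_need_update() and cache_mark_updated() are assumptions for
illustration only:

```c
#include <assert.h>
#include <stdbool.h>

struct temp_cache {
	bool valid;
	unsigned long last_updated;	/* tick count of the last refresh */
};

/* Wrap-safe "a is earlier than b", the same idea as time_before(). */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* True if the cached value is missing or older than min_interval ticks. */
static bool cache_need_update(const struct temp_cache *c, unsigned long now,
			      unsigned long min_interval)
{
	assert(c);
	return !(c->valid && tick_before(now, c->last_updated + min_interval));
}

static void cache_mark_updated(struct temp_cache *c, unsigned long now)
{
	assert(c);
	c->valid = true;
	c->last_updated = now;
}
```

In the drivers the tick count would simply be jiffies and the interval
UPDATE_INTERVAL_MIN.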

> +
> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
> +{
> +	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
> +	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
> +	struct peci_rd_pkg_cfg_msg msg;
> +	int rc;
> +
> +	if (!need_update(&priv->temp[dimm_no]))
> +		return 0;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_DDR_DIMM_TEMP;
> +	msg.param = chan_rank;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
> +
> +	mark_updated(&priv->temp[dimm_no]);
> +
> +	return 0;
> +}
> +
> +static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
> +{
> +	int dimm_nums_max = priv->gen_info->chan_rank_max *
> +			    priv->gen_info->dimm_idx_max;
> +	int idx, found = 0;
> +
> +	for (idx = 0; idx < dimm_nums_max; idx++) {
> +		if (priv->dimm_mask & BIT(idx)) {
> +			if (channel == found)
> +				break;
> +
> +			found++;
> +		}
> +	}
> +
> +	return idx;
> +}

This again looks like duplicate code.

> +
> +static int dimmtemp_read_string(struct device *dev,
> +				enum hwmon_sensor_types type,
> +				u32 attr, int channel, const char **str)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> +	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
> +	int dimm_no, chan_rank, dimm_idx;
> +
> +	switch (attr) {
> +	case hwmon_temp_label:
> +		dimm_no = find_dimm_number(priv, channel);
> +		chan_rank = dimm_no / dimm_idx_max;
> +		dimm_idx = dimm_no % dimm_idx_max;
> +		*str = dimmtemp_label[chan_rank][dimm_idx];
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
> +			 u32 attr, int channel, long *val)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
> +	int dimm_no = find_dimm_number(priv, channel);
> +	int rc;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		rc = get_dimm_temp(priv, dimm_no);
> +		if (rc)
> +			return rc;
> +
> +		*val = priv->temp[dimm_no].value;
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static umode_t dimmtemp_is_visible(const void *data,
> +				   enum hwmon_sensor_types type,
> +				   u32 attr, int channel)
> +{
> +	switch (attr) {
> +	case hwmon_temp_label:
> +	case hwmon_temp_input:
> +		return 0444;
> +	default:
> +		return 0;
> +	}
> +}
> +
> +static const struct hwmon_ops dimmtemp_ops = {
> +	.is_visible = dimmtemp_is_visible,
> +	.read_string = dimmtemp_read_string,
> +	.read = dimmtemp_read,
> +};
> +
> +static int check_populated_dimms(struct peci_dimmtemp *priv)
> +{
> +	u32 chan_rank_max = priv->gen_info->chan_rank_max;
> +	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
> +	struct peci_rd_pkg_cfg_msg msg;
> +	int chan_rank, dimm_idx;
> +	int rc, channels = 0;
> +
> +	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
> +		msg.addr = priv->addr;
> +		msg.index = MBX_INDEX_DDR_DIMM_TEMP;
> +		msg.param = chan_rank;
> +		msg.rx_len = 4;
> +
> +		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +		if (rc) {
> +			priv->dimm_mask = 0;
> +			return rc;
> +		}
> +
> +		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
> +			if (msg.pkg_config[dimm_idx]) {
> +				priv->dimm_mask |= BIT(chan_rank *
> +						       chan_rank_max +
> +						       dimm_idx);
> +				channels++;
> +			}
> +		}
> +	}
> +
> +	if (!priv->dimm_mask)
> +		return -EAGAIN;
> +
> +	priv->channels = channels;
> +
> +	dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
> +	return 0;
> +}
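
Note that the bit index above is chan_rank * chan_rank_max + dimm_idx,
while get_dimm_temp() and dimmtemp_read_string() split dimm_no using
dimm_idx_max, and find_dimm_number() only scans
chan_rank_max * dimm_idx_max bits. A minimal sketch of a consistent
mapping, in plain C with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map (channel rank, DIMM index) to a mask bit. */
static unsigned int dimm_to_bit(unsigned int chan_rank, unsigned int dimm_idx,
				unsigned int dimm_idx_max)
{
	assert(dimm_idx < dimm_idx_max);
	return chan_rank * dimm_idx_max + dimm_idx;
}

/* Hypothetical helper: inverse of dimm_to_bit(). */
static void bit_to_dimm(unsigned int bit, unsigned int dimm_idx_max,
			unsigned int *chan_rank, unsigned int *dimm_idx)
{
	assert(dimm_idx_max > 0);
	*chan_rank = bit / dimm_idx_max;
	*dimm_idx = bit % dimm_idx_max;
}
```

With the Haswell limits (8 channel ranks, 3 DIMMs per channel) the
highest bit is then 23, which fits in the u32 dimm_mask; with the
multiplier as posted it would be 7 * 8 + 2 = 58.
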
> +
> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
> +{
> +	struct device *hwmon_dev;
> +	int rc, i;
> +
> +	rc = check_populated_dimms(priv);
> +	if (!rc) {

Please handle error cases first.

> +		for (i = 0; i < priv->channels; i++)
> +			priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
> +
> +		priv->chip.ops = &dimmtemp_ops;
> +		priv->chip.info = priv->info;
> +
> +		priv->info[0] = &priv->temp_info;
> +
> +		priv->temp_info.type = hwmon_temp;
> +		priv->temp_info.config = priv->temp_config;
> +
> +		hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
> +								 priv->name,
> +								 priv,
> +								 &priv->chip,
> +								 NULL);
> +		rc = PTR_ERR_OR_ZERO(hwmon_dev);
> +		if (!rc)
> +			dev_dbg(priv->dev, "%s: sensor '%s'\n",
> +				dev_name(hwmon_dev), priv->name);
> +	} else if (rc == -EAGAIN) {
> +		if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
> +			queue_delayed_work(priv->work_queue,
> +					   &priv->work_handler,
> +					   DIMM_MASK_CHECK_DELAY_JIFFIES);
> +			priv->retry_count++;
> +			dev_dbg(priv->dev,
> +				"Deferred DIMM temp info creation\n");
> +		} else {
> +			rc = -ETIMEDOUT;
> +			dev_err(priv->dev,
> +				"Timeout retrying DIMM temp info creation\n");
> +		}
> +	}
> +
> +	return rc;
> +}
> +
> +static void create_dimm_temp_info_delayed(struct work_struct *work)
> +{
> +	struct delayed_work *dwork = to_delayed_work(work);
> +	struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
> +						  work_handler);
> +	int rc;
> +
> +	rc = create_dimm_temp_info(priv);
> +	if (rc && rc != -EAGAIN)
> +		dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
> +}
> +
> +static int check_cpu_id(struct peci_dimmtemp *priv)
> +{
> +	struct peci_rd_pkg_cfg_msg msg;
> +	u32 cpu_id;
> +	int i, rc;
> +
> +	msg.addr = priv->addr;
> +	msg.index = MBX_INDEX_CPU_ID;
> +	msg.param = PKG_ID_CPU_ID;
> +	msg.rx_len = 4;
> +
> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
> +	if (rc)
> +		return rc;
> +
> +	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
> +		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
> +
> +	for (i = 0; i < CPU_GEN_MAX; i++) {
> +		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
> +			priv->gen_info = &cpu_gen_info_table[i];
> +			break;
> +		}
> +	}
> +
> +	if (!priv->gen_info)
> +		return -ENODEV;
> +
> +	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
> +	return 0;
> +}

More duplicate code.

> +
> +static int peci_dimmtemp_probe(struct peci_client *client)
> +{
> +	struct device *dev = &client->dev;
> +	struct peci_dimmtemp *priv;
> +	int rc;
> +
> +	if ((client->adapter->cmd_mask &
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {

One set of ( ) is unnecessary on each side of the expression.
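
A small helper would also avoid spelling out the mask twice. This is a
sketch in plain C; has_all_cmds() is a hypothetical name, and the BIT()
definition here only mirrors the kernel macro for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Hypothetical helper: true if every bit in 'required' is set in 'mask'. */
static bool has_all_cmds(uint32_t mask, uint32_t required)
{
	return (mask & required) == required;
}
```

The probe check would then read as a single call with
BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG) as the second argument.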

> +		dev_err(dev, "Client doesn't support temperature monitoring\n");
> +		return -EINVAL;

Why is this "invalid", and why does it warrant an error message ?

> +	}
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	dev_set_drvdata(dev, priv);
> +	priv->client = client;
> +	priv->dev = dev;
> +	priv->addr = client->addr;
> +	priv->cpu_no = priv->addr - PECI_BASE_ADDR;

Is priv->addr guaranteed to be >= PECI_BASE_ADDR ?

> +
> +	snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
> +		 priv->cpu_no);
> +
> +	rc = check_cpu_id(priv);
> +	if (rc) {
> +		dev_err(dev, "Client CPU is not supported\n");

Or the peci command failed.

> +		return rc;
> +	}
> +
> +	priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
> +	if (!priv->work_queue)
> +		return -ENOMEM;
> +
> +	INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
> +
> +	rc = create_dimm_temp_info(priv);
> +	if (rc && rc != -EAGAIN) {
> +		dev_err(dev, "Failed to create DIMM temp info\n");
> +		goto err_free_wq;
> +	}
> +
> +	return 0;
> +
> +err_free_wq:
> +	destroy_workqueue(priv->work_queue);
> +	return rc;
> +}
> +
> +static int peci_dimmtemp_remove(struct peci_client *client)
> +{
> +	struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
> +
> +	cancel_delayed_work(&priv->work_handler);

cancel_delayed_work_sync() ?

> +	destroy_workqueue(priv->work_queue);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id peci_dimmtemp_of_table[] = {
> +	{ .compatible = "intel,peci-dimmtemp" },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
> +
> +static struct peci_driver peci_dimmtemp_driver = {
> +	.probe  = peci_dimmtemp_probe,
> +	.remove = peci_dimmtemp_remove,
> +	.driver = {
> +		.name           = "peci-dimmtemp",
> +		.of_match_table = of_match_ptr(peci_dimmtemp_of_table),
> +	},
> +};
> +module_peci_driver(peci_dimmtemp_driver);
> +
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_DESCRIPTION("PECI dimmtemp driver");
> +MODULE_LICENSE("GPL v2");
> -- 
> 2.16.2
> 

* Re: [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  2018-04-10 18:32 ` [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx Jae Hyun Yoo
@ 2018-04-11 11:51   ` Joel Stanley
  2018-04-12  2:03     ` Jae Hyun Yoo
  2018-04-17 13:37   ` Robin Murphy
  1 sibling, 1 reply; 54+ messages in thread
From: Joel Stanley @ 2018-04-11 11:51 UTC (permalink / raw)
  To: Jae Hyun Yoo, Ryan Chen
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

Hello Jae,

On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
> This commit adds PECI adapter driver implementation for Aspeed
> AST24xx/AST25xx.

The driver is looking good!

It looks like you've done some kind of review that we weren't allowed
to see, which is a double-edged sword - I might be asking about things
that you've already spoken about with someone else.

I'm only just learning about PECI, but I do have some general comments below.

> ---
>  drivers/peci/Kconfig       |  28 +++
>  drivers/peci/Makefile      |   3 +
>  drivers/peci/peci-aspeed.c | 504 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 535 insertions(+)
>  create mode 100644 drivers/peci/peci-aspeed.c
>
> diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
> index 1fbc13f9e6c2..0e33420365de 100644
> --- a/drivers/peci/Kconfig
> +++ b/drivers/peci/Kconfig
> @@ -14,4 +14,32 @@ config PECI
>           processors and chipset components to external monitoring or control
>           devices.
>
> +         If you want PECI support, you should say Y here and also to the
> +         specific driver for your bus adapter(s) below.
> +
> +if PECI
> +
> +#
> +# PECI hardware bus configuration
> +#
> +
> +menu "PECI Hardware Bus support"
> +
> +config PECI_ASPEED
> +       tristate "Aspeed AST24xx/AST25xx PECI support"

I think just saying ASPEED PECI support is enough. That way if the
next ASPEED SoC happens to have PECI we don't need to update all of
the help text :)

> +       select REGMAP_MMIO
> +       depends on OF
> +       depends on ARCH_ASPEED || COMPILE_TEST
> +       help
> +         Say Y here if you want support for the Platform Environment Control
> +         Interface (PECI) bus adapter driver on the Aspeed AST24XX and AST25XX
> +         SoCs.
> +
> +         This support is also available as a module.  If so, the module
> +         will be called peci-aspeed.
> +
> +endmenu
> +
> +endif # PECI
> +
>  endmenu
> diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
> index 9e8615e0d3ff..886285e69765 100644
> --- a/drivers/peci/Makefile
> +++ b/drivers/peci/Makefile
> @@ -4,3 +4,6 @@
>
>  # Core functionality
>  obj-$(CONFIG_PECI)             += peci-core.o
> +
> +# Hardware specific bus drivers
> +obj-$(CONFIG_PECI_ASPEED)      += peci-aspeed.o
> diff --git a/drivers/peci/peci-aspeed.c b/drivers/peci/peci-aspeed.c
> new file mode 100644
> index 000000000000..be2a1f327eb1
> --- /dev/null
> +++ b/drivers/peci/peci-aspeed.c
> @@ -0,0 +1,504 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2012-2017 ASPEED Technology Inc.
> +// Copyright (c) 2018 Intel Corporation
> +
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/interrupt.h>
> +#include <linux/jiffies.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/peci.h>
> +#include <linux/platform_device.h>
> +#include <linux/regmap.h>
> +
> +#define DUMP_DEBUG 0
> +
> +/* Aspeed PECI Registers */
> +#define AST_PECI_CTRL     0x00

Nit: we use ASPEED instead of AST in the upstream kernel to distinguish
from the ASPEED SDK drivers. If you feel strongly about this then I
won't insist you change.

> +#define AST_PECI_TIMING   0x04
> +#define AST_PECI_CMD      0x08
> +#define AST_PECI_CMD_CTRL 0x0c
> +#define AST_PECI_EXP_FCS  0x10
> +#define AST_PECI_CAP_FCS  0x14
> +#define AST_PECI_INT_CTRL 0x18
> +#define AST_PECI_INT_STS  0x1c
> +#define AST_PECI_W_DATA0  0x20
> +#define AST_PECI_W_DATA1  0x24
> +#define AST_PECI_W_DATA2  0x28
> +#define AST_PECI_W_DATA3  0x2c
> +#define AST_PECI_R_DATA0  0x30
> +#define AST_PECI_R_DATA1  0x34
> +#define AST_PECI_R_DATA2  0x38
> +#define AST_PECI_R_DATA3  0x3c
> +#define AST_PECI_W_DATA4  0x40
> +#define AST_PECI_W_DATA5  0x44
> +#define AST_PECI_W_DATA6  0x48
> +#define AST_PECI_W_DATA7  0x4c
> +#define AST_PECI_R_DATA4  0x50
> +#define AST_PECI_R_DATA5  0x54
> +#define AST_PECI_R_DATA6  0x58
> +#define AST_PECI_R_DATA7  0x5c
> +
> +/* AST_PECI_CTRL - 0x00 : Control Register */
> +#define PECI_CTRL_SAMPLING_MASK     GENMASK(19, 16)
> +#define PECI_CTRL_SAMPLING(x)       (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
> +#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)
> +#define PECI_CTRL_READ_MODE_MASK    GENMASK(13, 12)
> +#define PECI_CTRL_READ_MODE(x)      (((x) << 12) & PECI_CTRL_READ_MODE_MASK)
> +#define PECI_CTRL_READ_MODE_GET(x)  (((x) & PECI_CTRL_READ_MODE_MASK) >> 12)
> +#define PECI_CTRL_READ_MODE_COUNT   BIT(12)
> +#define PECI_CTRL_READ_MODE_DBG     BIT(13)
> +#define PECI_CTRL_CLK_SOURCE_MASK   BIT(11)
> +#define PECI_CTRL_CLK_SOURCE(x)     (((x) << 11) & PECI_CTRL_CLK_SOURCE_MASK)
> +#define PECI_CTRL_CLK_SOURCE_GET(x) (((x) & PECI_CTRL_CLK_SOURCE_MASK) >> 11)
> +#define PECI_CTRL_CLK_DIV_MASK      GENMASK(10, 8)
> +#define PECI_CTRL_CLK_DIV(x)        (((x) << 8) & PECI_CTRL_CLK_DIV_MASK)
> +#define PECI_CTRL_CLK_DIV_GET(x)    (((x) & PECI_CTRL_CLK_DIV_MASK) >> 8)
> +#define PECI_CTRL_INVERT_OUT        BIT(7)
> +#define PECI_CTRL_INVERT_IN         BIT(6)
> +#define PECI_CTRL_BUS_CONTENT_EN    BIT(5)
> +#define PECI_CTRL_PECI_EN           BIT(4)
> +#define PECI_CTRL_PECI_CLK_EN       BIT(0)

I know these come from the ASPEED SDK driver. Do we need them all?

> +
> +/* AST_PECI_TIMING - 0x04 : Timing Negotiation Register */
> +#define PECI_TIMING_MESSAGE_MASK   GENMASK(15, 8)
> +#define PECI_TIMING_MESSAGE(x)     (((x) << 8) & PECI_TIMING_MESSAGE_MASK)
> +#define PECI_TIMING_MESSAGE_GET(x) (((x) & PECI_TIMING_MESSAGE_MASK) >> 8)
> +#define PECI_TIMING_ADDRESS_MASK   GENMASK(7, 0)
> +#define PECI_TIMING_ADDRESS(x)     ((x) & PECI_TIMING_ADDRESS_MASK)
> +#define PECI_TIMING_ADDRESS_GET(x) ((x) & PECI_TIMING_ADDRESS_MASK)
> +
> +/* AST_PECI_CMD - 0x08 : Command Register */
> +#define PECI_CMD_PIN_MON    BIT(31)
> +#define PECI_CMD_STS_MASK   GENMASK(27, 24)
> +#define PECI_CMD_STS_GET(x) (((x) & PECI_CMD_STS_MASK) >> 24)
> +#define PECI_CMD_FIRE       BIT(0)
> +
> +/* AST_PECI_LEN - 0x0C : Read/Write Length Register */
> +#define PECI_AW_FCS_EN       BIT(31)
> +#define PECI_READ_LEN_MASK   GENMASK(23, 16)
> +#define PECI_READ_LEN(x)     (((x) << 16) & PECI_READ_LEN_MASK)
> +#define PECI_WRITE_LEN_MASK  GENMASK(15, 8)
> +#define PECI_WRITE_LEN(x)    (((x) << 8) & PECI_WRITE_LEN_MASK)
> +#define PECI_TAGET_ADDR_MASK GENMASK(7, 0)
> +#define PECI_TAGET_ADDR(x)   ((x) & PECI_TAGET_ADDR_MASK)
> +
> +/* AST_PECI_EXP_FCS - 0x10 : Expected FCS Data Register */
> +#define PECI_EXPECT_READ_FCS_MASK      GENMASK(23, 16)
> +#define PECI_EXPECT_READ_FCS_GET(x)    (((x) & PECI_EXPECT_READ_FCS_MASK) >> 16)
> +#define PECI_EXPECT_AW_FCS_AUTO_MASK   GENMASK(15, 8)
> +#define PECI_EXPECT_AW_FCS_AUTO_GET(x) (((x) & PECI_EXPECT_AW_FCS_AUTO_MASK) \
> +                                       >> 8)
> +#define PECI_EXPECT_WRITE_FCS_MASK     GENMASK(7, 0)
> +#define PECI_EXPECT_WRITE_FCS_GET(x)   ((x) & PECI_EXPECT_WRITE_FCS_MASK)
> +
> +/* AST_PECI_CAP_FCS - 0x14 : Captured FCS Data Register */
> +#define PECI_CAPTURE_READ_FCS_MASK    GENMASK(23, 16)
> +#define PECI_CAPTURE_READ_FCS_GET(x)  (((x) & PECI_CAPTURE_READ_FCS_MASK) >> 16)
> +#define PECI_CAPTURE_WRITE_FCS_MASK   GENMASK(7, 0)
> +#define PECI_CAPTURE_WRITE_FCS_GET(x) ((x) & PECI_CAPTURE_WRITE_FCS_MASK)
> +
> +/* AST_PECI_INT_CTRL/STS - 0x18/0x1c : Interrupt Register */
> +#define PECI_INT_TIMING_RESULT_MASK GENMASK(31, 30)
> +#define PECI_INT_TIMEOUT            BIT(4)
> +#define PECI_INT_CONNECT            BIT(3)
> +#define PECI_INT_W_FCS_BAD          BIT(2)
> +#define PECI_INT_W_FCS_ABORT        BIT(1)
> +#define PECI_INT_CMD_DONE           BIT(0)
> +
> +struct aspeed_peci {
> +       struct peci_adapter     adaper;
> +       struct device           *dev;
> +       struct regmap           *regmap;
> +       int                     irq;
> +       struct completion       xfer_complete;
> +       u32                     status;
> +       u32                     cmd_timeout_ms;
> +};
> +
> +#define PECI_INT_MASK  (PECI_INT_TIMEOUT | PECI_INT_CONNECT | \
> +                       PECI_INT_W_FCS_BAD | PECI_INT_W_FCS_ABORT | \
> +                       PECI_INT_CMD_DONE)
> +
> +#define PECI_IDLE_CHECK_TIMEOUT_MS      50
> +#define PECI_IDLE_CHECK_INTERVAL_MS     10
> +
> +#define PECI_RD_SAMPLING_POINT_DEFAULT  8
> +#define PECI_RD_SAMPLING_POINT_MAX      15
> +#define PECI_CLK_DIV_DEFAULT            0
> +#define PECI_CLK_DIV_MAX                7
> +#define PECI_MSG_TIMING_NEGO_DEFAULT    1
> +#define PECI_MSG_TIMING_NEGO_MAX        255
> +#define PECI_ADDR_TIMING_NEGO_DEFAULT   1
> +#define PECI_ADDR_TIMING_NEGO_MAX       255
> +#define PECI_CMD_TIMEOUT_MS_DEFAULT     1000
> +#define PECI_CMD_TIMEOUT_MS_MAX         60000
> +
> +static int aspeed_peci_xfer_native(struct aspeed_peci *priv,
> +                                  struct peci_xfer_msg *msg)
> +{
> +       long err, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
> +       u32 peci_head, peci_state, rx_data, cmd_sts;
> +       ktime_t start, end;
> +       s64 elapsed_ms;
> +       int i, rc = 0;
> +       uint reg;
> +
> +       start = ktime_get();
> +
> +       /* Check command sts and bus idle state */
> +       while (!regmap_read(priv->regmap, AST_PECI_CMD, &cmd_sts) &&
> +              (cmd_sts & (PECI_CMD_STS_MASK | PECI_CMD_PIN_MON))) {
> +               end = ktime_get();
> +               elapsed_ms = ktime_to_ms(ktime_sub(end, start));
> +               if (elapsed_ms >= PECI_IDLE_CHECK_TIMEOUT_MS) {
> +                       dev_dbg(priv->dev, "Timeout waiting for idle state!\n");
> +                       return -ETIMEDOUT;
> +               }
> +
> +               usleep_range(PECI_IDLE_CHECK_INTERVAL_MS * 1000,
> +                            (PECI_IDLE_CHECK_INTERVAL_MS * 1000) + 1000);
> +       };

Could the above use regmap_read_poll_timeout instead?

> +
> +       reinit_completion(&priv->xfer_complete);
> +
> +       peci_head = PECI_TAGET_ADDR(msg->addr) |
> +                                   PECI_WRITE_LEN(msg->tx_len) |
> +                                   PECI_READ_LEN(msg->rx_len);
> +
> +       rc = regmap_write(priv->regmap, AST_PECI_CMD_CTRL, peci_head);
> +       if (rc)
> +               return rc;
> +
> +       for (i = 0; i < msg->tx_len; i += 4) {
> +               reg = i < 16 ? AST_PECI_W_DATA0 + i % 16 :
> +                              AST_PECI_W_DATA4 + i % 16;
> +               rc = regmap_write(priv->regmap, reg,
> +                                 (msg->tx_buf[i + 3] << 24) |
> +                                 (msg->tx_buf[i + 2] << 16) |
> +                                 (msg->tx_buf[i + 1] << 8) |
> +                                 msg->tx_buf[i + 0]);

That looks like an endian swap. Can we do something like this?

 regmap_write(map, reg, cpu_to_be32p((void *)msg->tx_buf))
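
One caveat on the choice of helper: the quoted loop places tx_buf[i + 0]
in the least significant byte, which is a little-endian load, so a
big-endian conversion would not produce the same register value on a
little-endian CPU. The userspace sketch below only illustrates the
difference; pack_like_patch() and load_be32() are hypothetical names
used for this example, not kernel APIs.

```c
#include <stdint.h>

/* Same packing as the quoted write loop: byte 0 lands in bits 7:0. */
static uint32_t pack_like_patch(const uint8_t *b)
{
	return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[1] << 8) | (uint32_t)b[0];
}

/* Big-endian reading of the same four bytes, for comparison only. */
static uint32_t load_be32(const uint8_t *b)
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

For the bytes 11 22 33 44, pack_like_patch() gives 0x44332211 and
load_be32() gives 0x11223344. In the kernel, get_unaligned_le32() is the
helper that matches the first form.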

> +               if (rc)
> +                       return rc;
> +       }
> +
> +       dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
> +#if DUMP_DEBUG

Having #ifdefs in the code is frowned upon. I think print_hex_dump_debug
will do what you want here.

> +       print_hex_dump(KERN_DEBUG, "TX : ", DUMP_PREFIX_NONE, 16, 1,
> +                      msg->tx_buf, msg->tx_len, true);
> +#endif
> +
> +       rc = regmap_write(priv->regmap, AST_PECI_CMD, PECI_CMD_FIRE);
> +       if (rc)
> +               return rc;
> +
> +       err = wait_for_completion_interruptible_timeout(&priv->xfer_complete,
> +                                                       timeout);
> +
> +       dev_dbg(priv->dev, "INT_STS : 0x%08x\n", priv->status);
> +       if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
> +               dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
> +                       PECI_CMD_STS_GET(peci_state));
> +       else
> +               dev_dbg(priv->dev, "PECI_STATE : read error\n");
> +
> +       rc = regmap_write(priv->regmap, AST_PECI_CMD, 0);
> +       if (rc)
> +               return rc;
> +
> +       if (err <= 0 || !(priv->status & PECI_INT_CMD_DONE)) {
> +               if (err < 0) { /* -ERESTARTSYS */
> +                       return (int)err;
> +               } else if (err == 0) {
> +                       dev_dbg(priv->dev, "Timeout waiting for a response!\n");
> +                       return -ETIMEDOUT;
> +               }
> +
> +               dev_dbg(priv->dev, "No valid response!\n");
> +               return -EIO;
> +       }
> +
> +       for (i = 0; i < msg->rx_len; i++) {
> +               u8 byte_offset = i % 4;
> +
> +               if (byte_offset == 0) {
> +                       reg = i < 16 ? AST_PECI_R_DATA0 + i % 16 :
> +                                      AST_PECI_R_DATA4 + i % 16;

I find this hard to read. Use a few more lines to make it clear what
your code is doing.

Actually, the entire for loop is cryptic. I understand what it's doing
now. Can you rework it to make it more readable? You follow a similar
pattern above in the write case.
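
As a rough illustration of the kind of rework meant here, the register
selection could be separated from the byte extraction. This is a
standalone sketch rather than driver code: peci_data_reg() and
peci_data_shift() are hypothetical helpers, and the offsets are copied
from the defines quoted earlier in the patch.

```c
#include <stdint.h>

#define AST_PECI_W_DATA0 0x20
#define AST_PECI_R_DATA0 0x30
#define AST_PECI_W_DATA4 0x40
#define AST_PECI_R_DATA4 0x50

/*
 * Hypothetical helper: offset of the 32-bit data register that holds
 * message byte idx (0..31). Words 0-3 live in the first bank, words
 * 4-7 in the second bank.
 */
static unsigned int peci_data_reg(unsigned int bank0, unsigned int bank1,
				  unsigned int idx)
{
	unsigned int word = idx / 4;

	if (word < 4)
		return bank0 + word * 4;

	return bank1 + (word - 4) * 4;
}

/* Hypothetical helper: bit position of byte idx inside its register. */
static unsigned int peci_data_shift(unsigned int idx)
{
	return (idx % 4) * 8;
}
```

With these, the read loop reduces to reading
peci_data_reg(AST_PECI_R_DATA0, AST_PECI_R_DATA4, i) whenever
peci_data_shift(i) is 0 and then shifting the cached word by
peci_data_shift(i) for each byte. The write loop can use the same
helper with the W_DATA banks.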

> +                       rc = regmap_read(priv->regmap, reg, &rx_data);
> +                       if (rc)
> +                               return rc;
> +               }
> +
> +               msg->rx_buf[i] = (u8)(rx_data >> (byte_offset << 3));
> +       }
> +
> +#if DUMP_DEBUG
> +       print_hex_dump(KERN_DEBUG, "RX : ", DUMP_PREFIX_NONE, 16, 1,
> +                      msg->rx_buf, msg->rx_len, true);
> +#endif
> +       if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
> +               dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
> +                       PECI_CMD_STS_GET(peci_state));
> +       else
> +               dev_dbg(priv->dev, "PECI_STATE : read error\n");

Given the regmap_read is always going to be a memory read on the
aspeed, I can't think of a situation where the read will fail.

On that note, is there a reason you are using regmap and not just
accessing the hardware directly? regmap imposes a number of pointer
lookups and tests each time you do a read or write.

> +       dev_dbg(priv->dev, "------------------------\n");
> +
> +       return rc;
> +}
> +
> +static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
> +{
> +       struct aspeed_peci *priv = arg;
> +       u32 status_ack = 0;
> +
> +       if (regmap_read(priv->regmap, AST_PECI_INT_STS, &priv->status))
> +               return IRQ_NONE;

Again, a memory mapped read won't fail. How about we check that the
regmap is working once in your _probe() function, and assume it will
continue working from there (or remove the regmap abstraction
altogether).

> +
> +       /* Be noted that multiple interrupt bits can be set at the same time */
> +       if (priv->status & PECI_INT_TIMEOUT) {
> +               dev_dbg(priv->dev, "PECI_INT_TIMEOUT\n");
> +               status_ack |= PECI_INT_TIMEOUT;
> +       }
> +
> +       if (priv->status & PECI_INT_CONNECT) {
> +               dev_dbg(priv->dev, "PECI_INT_CONNECT\n");
> +               status_ack |= PECI_INT_CONNECT;
> +       }
> +
> +       if (priv->status & PECI_INT_W_FCS_BAD) {
> +               dev_dbg(priv->dev, "PECI_INT_W_FCS_BAD\n");
> +               status_ack |= PECI_INT_W_FCS_BAD;
> +       }
> +
> +       if (priv->status & PECI_INT_W_FCS_ABORT) {
> +               dev_dbg(priv->dev, "PECI_INT_W_FCS_ABORT\n");
> +               status_ack |= PECI_INT_W_FCS_ABORT;
> +       }

All of this code is for debugging only. Do you want to put it behind
some kind of conditional?

> +
> +       /**
> +        * All commands should be ended up with a PECI_INT_CMD_DONE bit set
> +        * even in an error case.
> +        */
> +       if (priv->status & PECI_INT_CMD_DONE) {
> +               dev_dbg(priv->dev, "PECI_INT_CMD_DONE\n");
> +               status_ack |= PECI_INT_CMD_DONE;
> +               complete(&priv->xfer_complete);
> +       }
> +
> +       if (regmap_write(priv->regmap, AST_PECI_INT_STS, status_ack))
> +               return IRQ_NONE;
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static int aspeed_peci_init_ctrl(struct aspeed_peci *priv)
> +{
> +       u32 msg_timing_nego, addr_timing_nego, rd_sampling_point;
> +       u32 clk_freq, clk_divisor, clk_div_val = 0;
> +       struct clk *clkin;
> +       int ret;
> +
> +       clkin = devm_clk_get(priv->dev, NULL);
> +       if (IS_ERR(clkin)) {
> +               dev_err(priv->dev, "Failed to get clk source.\n");
> +               return PTR_ERR(clkin);
> +       }
> +
> +       ret = of_property_read_u32(priv->dev->of_node, "clock-frequency",
> +                                  &clk_freq);
> +       if (ret < 0) {
> +               dev_err(priv->dev,
> +                       "Could not read clock-frequency property.\n");
> +               return ret;
> +       }
> +
> +       clk_divisor = clk_get_rate(clkin) / clk_freq;
> +       devm_clk_put(priv->dev, clkin);
> +
> +       while ((clk_divisor >>= 1) && (clk_div_val < PECI_CLK_DIV_MAX))
> +               clk_div_val++;

We have a framework for doing clocks in the kernel. Would it make
sense to write a driver for this clock and add it to
drivers/clk/clk-aspeed.c?
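
For reference while discussing the divider: the loop above is meant to
derive the 3-bit divider field as floor(log2(divisor)), capped at
PECI_CLK_DIV_MAX. A minimal userspace sketch of that calculation, with
peci_clk_div_val() as a hypothetical name chosen for this example:

```c
#define PECI_CLK_DIV_MAX 7

/*
 * Hypothetical helper: number of times the divisor can be halved
 * before reaching zero, i.e. floor(log2(clk_divisor)), limited to the
 * largest value the divider field can hold.
 */
static unsigned int peci_clk_div_val(unsigned int clk_divisor)
{
	unsigned int val = 0;

	/* The shift must update clk_divisor or the loop runs to the cap. */
	while ((clk_divisor >>= 1) && val < PECI_CLK_DIV_MAX)
		val++;

	return val;
}
```

Assuming a 24 MHz source clock, a requested 24000000 Hz gives a divisor
of 1 and a field value of 0, and 187500 Hz gives a divisor of 128 and a
field value of 7. The kernel's ilog2() computes the same logarithm, so
the loop could likely be replaced by a clamped ilog2() call.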

> +
> +       ret = of_property_read_u32(priv->dev->of_node, "msg-timing-nego",
> +                                  &msg_timing_nego);
> +       if (ret || msg_timing_nego > PECI_MSG_TIMING_NEGO_MAX) {
> +               dev_warn(priv->dev,
> +                        "Invalid msg-timing-nego : %u, Use default : %u\n",
> +                        msg_timing_nego, PECI_MSG_TIMING_NEGO_DEFAULT);

The property is optional so I suggest we don't print a message if it's
not present. We certainly don't want to print a message saying
"invalid".

The same comment applies to the other optional properties below.

> +               msg_timing_nego = PECI_MSG_TIMING_NEGO_DEFAULT;
> +       }
> +
> +       ret = of_property_read_u32(priv->dev->of_node, "addr-timing-nego",
> +                                  &addr_timing_nego);
> +       if (ret || addr_timing_nego > PECI_ADDR_TIMING_NEGO_MAX) {
> +               dev_warn(priv->dev,
> +                        "Invalid addr-timing-nego : %u, Use default : %u\n",
> +                        addr_timing_nego, PECI_ADDR_TIMING_NEGO_DEFAULT);
> +               addr_timing_nego = PECI_ADDR_TIMING_NEGO_DEFAULT;
> +       }
> +
> +       ret = of_property_read_u32(priv->dev->of_node, "rd-sampling-point",
> +                                  &rd_sampling_point);
> +       if (ret || rd_sampling_point > PECI_RD_SAMPLING_POINT_MAX) {
> +               dev_warn(priv->dev,
> +                        "Invalid rd-sampling-point : %u. Use default : %u\n",
> +                        rd_sampling_point,
> +                        PECI_RD_SAMPLING_POINT_DEFAULT);
> +               rd_sampling_point = PECI_RD_SAMPLING_POINT_DEFAULT;
> +       }
> +
> +       ret = of_property_read_u32(priv->dev->of_node, "cmd-timeout-ms",
> +                                  &priv->cmd_timeout_ms);
> +       if (ret || priv->cmd_timeout_ms > PECI_CMD_TIMEOUT_MS_MAX ||
> +           priv->cmd_timeout_ms == 0) {
> +               dev_warn(priv->dev,
> +                        "Invalid cmd-timeout-ms : %u. Use default : %u\n",
> +                        priv->cmd_timeout_ms,
> +                        PECI_CMD_TIMEOUT_MS_DEFAULT);
> +               priv->cmd_timeout_ms = PECI_CMD_TIMEOUT_MS_DEFAULT;
> +       }
> +
> +       ret = regmap_write(priv->regmap, AST_PECI_CTRL,
> +                          PECI_CTRL_CLK_DIV(PECI_CLK_DIV_DEFAULT) |
> +                          PECI_CTRL_PECI_CLK_EN);
> +       if (ret)
> +               return ret;
> +
> +       usleep_range(1000, 5000);

Can we probe in parallel? If not, putting a sleep in the _probe will
hold up the rest of the drivers from being able to do anything, and hold
up boot.

If you decide that you do need to sleep here, please add a comment.
(Is this the wait for the clock to be stable?)

> +
> +       /**
> +        * Timing negotiation period setting.
> +        * The unit of the programmed value is 4 times of PECI clock period.
> +        */
> +       ret = regmap_write(priv->regmap, AST_PECI_TIMING,
> +                          PECI_TIMING_MESSAGE(msg_timing_nego) |
> +                          PECI_TIMING_ADDRESS(addr_timing_nego));
> +       if (ret)
> +               return ret;
> +
> +       /* Clear interrupts */
> +       ret = regmap_write(priv->regmap, AST_PECI_INT_STS, PECI_INT_MASK);
> +       if (ret)
> +               return ret;
> +
> +       /* Enable interrupts */
> +       ret = regmap_write(priv->regmap, AST_PECI_INT_CTRL, PECI_INT_MASK);
> +       if (ret)
> +               return ret;
> +
> +       /* Read sampling point and clock speed setting */
> +       ret = regmap_write(priv->regmap, AST_PECI_CTRL,
> +                          PECI_CTRL_SAMPLING(rd_sampling_point) |
> +                          PECI_CTRL_CLK_DIV(clk_div_val) |
> +                          PECI_CTRL_PECI_EN | PECI_CTRL_PECI_CLK_EN);
> +       if (ret)
> +               return ret;
> +
> +       return 0;
> +}
> +
> +static const struct regmap_config aspeed_peci_regmap_config = {
> +       .reg_bits = 32,
> +       .val_bits = 32,
> +       .reg_stride = 4,
> +       .max_register = AST_PECI_R_DATA7,
> +       .val_format_endian = REGMAP_ENDIAN_LITTLE,
> +       .fast_io = true,
> +};
> +
> +static int aspeed_peci_xfer(struct peci_adapter *adaper,
> +                           struct peci_xfer_msg *msg)
> +{
> +       struct aspeed_peci *priv = peci_get_adapdata(adaper);
> +
> +       return aspeed_peci_xfer_native(priv, msg);
> +}
> +
> +static int aspeed_peci_probe(struct platform_device *pdev)
> +{
> +       struct aspeed_peci *priv;
> +       struct resource *res;
> +       void __iomem *base;
> +       int ret = 0;
> +
> +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> +       if (!priv)
> +               return -ENOMEM;
> +
> +       dev_set_drvdata(&pdev->dev, priv);
> +       priv->dev = &pdev->dev;
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       base = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(base))
> +               return PTR_ERR(base);
> +
> +       priv->regmap = devm_regmap_init_mmio(&pdev->dev, base,
> +                                            &aspeed_peci_regmap_config);
> +       if (IS_ERR(priv->regmap))
> +               return PTR_ERR(priv->regmap);
> +
> +       priv->irq = platform_get_irq(pdev, 0);
> +       if (!priv->irq)
> +               return -ENODEV;
> +
> +       ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
> +                              IRQF_SHARED,

This interrupt is only for the peci device. Why is it marked as shared?

> +                              "peci-aspeed-irq",
> +                              priv);
> +       if (ret < 0)
> +               return ret;
> +
> +       init_completion(&priv->xfer_complete);
> +
> +       priv->adaper.dev.parent = priv->dev;
> +       priv->adaper.dev.of_node = of_node_get(dev_of_node(priv->dev));
> +       strlcpy(priv->adaper.name, pdev->name, sizeof(priv->adaper.name));
> +       priv->adaper.xfer = aspeed_peci_xfer;
> +       peci_set_adapdata(&priv->adaper, priv);
> +
> +       ret = aspeed_peci_init_ctrl(priv);
> +       if (ret < 0)
> +               return ret;
> +
> +       ret = peci_add_adapter(&priv->adaper);
> +       if (ret < 0)
> +               return ret;
> +
> +       dev_info(&pdev->dev, "peci bus %d registered, irq %d\n",
> +                priv->adaper.nr, priv->irq);
> +
> +       return 0;
> +}
> +
> +static int aspeed_peci_remove(struct platform_device *pdev)
> +{
> +       struct aspeed_peci *priv = dev_get_drvdata(&pdev->dev);
> +
> +       peci_del_adapter(&priv->adaper);
> +       of_node_put(priv->adaper.dev.of_node);
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id aspeed_peci_of_table[] = {
> +       { .compatible = "aspeed,ast2400-peci", },
> +       { .compatible = "aspeed,ast2500-peci", },
> +       { }
> +};
> +MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
> +
> +static struct platform_driver aspeed_peci_driver = {
> +       .probe  = aspeed_peci_probe,
> +       .remove = aspeed_peci_remove,
> +       .driver = {
> +               .name           = "peci-aspeed",
> +               .of_match_table = of_match_ptr(aspeed_peci_of_table),
> +       },
> +};
> +module_platform_driver(aspeed_peci_driver);
> +
> +MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
> +MODULE_DESCRIPTION("Aspeed PECI driver");
> +MODULE_LICENSE("GPL v2");
> --
> 2.16.2
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers
  2018-04-10 18:32 ` [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers Jae Hyun Yoo
@ 2018-04-11 11:52   ` Joel Stanley
  2018-04-12  2:06     ` Jae Hyun Yoo
  2018-04-16 17:59   ` Rob Herring
  1 sibling, 1 reply; 54+ messages in thread
From: Joel Stanley @ 2018-04-11 11:52 UTC (permalink / raw)
  To: Jae Hyun Yoo, Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

Hi Jae,

On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
> This commit adds documents of generic PECI bus, adapter and client drivers.
>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
> Reviewed-by: James Feist <james.feist@linux.intel.com>
> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Andrew Jeffery <andrew@aj.id.au>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Julia Cartwright <juliac@eso.teric.us>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Cc: Milton Miller II <miltonm@us.ibm.com>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>

That's a hefty cc list. I can't see Rob Herring though, and he's
usually the person who you need to convince to get your bindings
accepted.

I recommend using ./scripts/get_maintainer.pl to build your CC list,
and then adding others you think are relevant.

I'm not sure what the guidelines are for generic bindings, so I'll
defer to Rob for this patch.

Cheers,

Joel

> ---
>  .../devicetree/bindings/peci/peci-adapter.txt      | 23 ++++++++++++++++++++
>  .../devicetree/bindings/peci/peci-bus.txt          | 15 +++++++++++++
>  .../devicetree/bindings/peci/peci-client.txt       | 25 ++++++++++++++++++++++
>  3 files changed, 63 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt
>
> diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt b/Documentation/devicetree/bindings/peci/peci-adapter.txt
> new file mode 100644
> index 000000000000..9221374f6b11
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
> @@ -0,0 +1,23 @@
> +Generic device tree configuration for PECI adapters.
> +
> +Required properties:
> +- compatible     : Should contain hardware specific definition strings that can
> +                  match an adapter driver implementation.
> +- reg            : Should contain PECI controller registers location and length.
> +- #address-cells : Should be <1>.
> +- #size-cells    : Should be <0>.
> +
> +Example:
> +       peci: peci@10000000 {
> +               compatible = "simple-bus";
> +               #address-cells = <1>;
> +               #size-cells = <1>;
> +               ranges = <0x0 0x10000000 0x1000>;
> +
> +               peci0: peci-bus@0 {
> +                       compatible = "soc,soc-peci";
> +                       reg = <0x0 0x1000>;
> +                       #address-cells = <1>;
> +                       #size-cells = <0>;
> +               };
> +       };
> diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt b/Documentation/devicetree/bindings/peci/peci-bus.txt
> new file mode 100644
> index 000000000000..90bcc791ccb0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
> @@ -0,0 +1,15 @@
> +Generic device tree configuration for PECI buses.
> +
> +Required properties:
> +- compatible     : Should be "simple-bus".
> +- #address-cells : Should be <1>.
> +- #size-cells    : Should be <1>.
> +- ranges         : Should contain PECI controller registers ranges.
> +
> +Example:
> +       peci: peci@10000000 {
> +               compatible = "simple-bus";
> +               #address-cells = <1>;
> +               #size-cells = <1>;
> +               ranges = <0x0 0x10000000 0x1000>;
> +       };
> diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt b/Documentation/devicetree/bindings/peci/peci-client.txt
> new file mode 100644
> index 000000000000..8e2bfd8532f6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-client.txt
> @@ -0,0 +1,25 @@
> +Generic device tree configuration for PECI clients.
> +
> +Required properties:
> +- compatible : Should contain target device specific definition strings that can
> +              match a client driver implementation.
> +- reg        : Should contain address of a client CPU. Address range of CPU
> +              clients is starting from 0x30 based on PECI specification.
> +              <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
> +
> +Example:
> +       peci-bus@0 {
> +               #address-cells = <1>;
> +               #size-cells = <0>;
> +               < more properties >
> +
> +               function@cpu0 {
> +                       compatible = "device,function";
> +                       reg = <0x30>;
> +               };
> +
> +               function@cpu1 {
> +                       compatible = "device,function";
> +                       reg = <0x31>;
> +               };
> +       };
> --
> 2.16.2
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-10 18:32 ` [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs Jae Hyun Yoo
@ 2018-04-11 11:52   ` Joel Stanley
  2018-04-12  2:11     ` Jae Hyun Yoo
  2018-04-16 18:10   ` Rob Herring
  1 sibling, 1 reply; 54+ messages in thread
From: Joel Stanley @ 2018-04-11 11:52 UTC (permalink / raw)
  To: Jae Hyun Yoo, Rob Herring, linux-aspeed, Ryan Chen
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
> This commit adds a dt-bindings document of PECI adapter driver for Aspeed

We try to capitalise ASPEED.

> AST24xx/25xx SoCs.
> ---
>  .../devicetree/bindings/peci/peci-aspeed.txt       | 60 ++++++++++++++++++++++
>  1 file changed, 60 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
>
> diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
> new file mode 100644
> index 000000000000..4598bb8c20fa
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
> @@ -0,0 +1,60 @@
> +Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
> +
> +Required properties:
> +- compatible        : Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
> +                     - aspeed,ast2400-peci: Aspeed AST2400 family PECI
> +                                            controller
> +                     - aspeed,ast2500-peci: Aspeed AST2500 family PECI
> +                                            controller
> +- reg               : Should contain PECI controller registers location and
> +                     length.
> +- #address-cells    : Should be <1>.
> +- #size-cells       : Should be <0>.
> +- interrupts        : Should contain PECI controller interrupt.
> +- clocks            : Should contain clock source for PECI controller.
> +                     Should reference clkin.

Are you sure that this is driven by clkin? Most peripherals on the
Aspeed are attached to the apb, so should reference that clock.

> +- clock-frequency   : Should contain the operation frequency of PECI controller
> +                     in units of Hz.
> +                     187500 ~ 24000000

Can you explain why you need both the parent clock and this frequency
to be specified?

> +
> +Optional properties:
> +- msg-timing-nego   : Message timing negotiation period. This value will

Perhaps msg-timing-period? Or just msg-timing?

> +                     determine the period of message timing negotiation to be
> +                     issued by PECI controller. The unit of the programmed
> +                     value is four times of PECI clock period.
> +                     0 ~ 255 (default: 1)
> +- addr-timing-nego  : Address timing negotiation period. This value will
> +                     determine the period of address timing negotiation to be
> +                     issued by PECI controller. The unit of the programmed
> +                     value is four times of PECI clock period.
> +                     0 ~ 255 (default: 1)
> +- rd-sampling-point : Read sampling point selection. The whole period of a bit
> +                     time will be divided into 16 time frames. This value will
> +                     determine the time frame in which the controller will
> +                     sample PECI signal for data read back. Usually in the
> +                     middle of a bit time is the best.
> +                     0 ~ 15 (default: 8)
> +- cmd-timeout-ms    : Command timeout in units of ms.
> +                     1 ~ 60000 (default: 1000)
> +
> +Example:
> +       peci: peci@1e78b000 {
> +               compatible = "simple-bus";
> +               #address-cells = <1>;
> +               #size-cells = <1>;
> +               ranges = <0x0 0x1e78b000 0x60>;
> +
> +               peci0: peci-bus@0 {
> +                       compatible = "aspeed,ast2500-peci";
> +                       reg = <0x0 0x60>;
> +                       #address-cells = <1>;
> +                       #size-cells = <0>;
> +                       interrupts = <15>;
> +                       clocks = <&clk_clkin>;
> +                       clock-frequency = <24000000>;
> +                       msg-timing-nego = <1>;
> +                       addr-timing-nego = <1>;
> +                       rd-sampling-point = <8>;
> +                       cmd-timeout-ms = <1000>;
> +               };
> +       };
> --
> 2.16.2
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node
  2018-04-10 18:32 ` [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node Jae Hyun Yoo
@ 2018-04-11 11:52   ` Joel Stanley
  2018-04-12  2:20     ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Joel Stanley @ 2018-04-11 11:52 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
> This commit adds PECI bus/adapter node of AST24xx/AST25xx into
> aspeed-g4 and aspeed-g5.
>

The patches to the device trees get merged by the ASPEED maintainer
(me). Once you have the bindings reviewed you can send the patches to
me and the linux-aspeed list (I've got a pending patch to MAINTAINERS
that will ensure get_maintainer.pl does the right thing as far as
email addresses go).

I'd suggest dropping it from your series and re-sending once the
bindings and driver are reviewed.

Cheers,

Joel


* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-10 22:28   ` Guenter Roeck
@ 2018-04-11 21:59     ` Jae Hyun Yoo
  2018-04-12  0:34       ` Guenter Roeck
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-11 21:59 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

Hi Guenter,

Thanks a lot for sharing your time. Please see my inline answers.

On 4/10/2018 3:28 PM, Guenter Roeck wrote:
> On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:
>> This commit adds PECI cputemp and dimmtemp hwmon drivers.
>>
>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>> Cc: Alan Cox <alan@linux.intel.com>
>> Cc: Andrew Jeffery <andrew@aj.id.au>
>> Cc: Andrew Lunn <andrew@lunn.ch>
>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>> Cc: Greg KH <gregkh@linuxfoundation.org>
>> Cc: Guenter Roeck <linux@roeck-us.net>
>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>> Cc: Jean Delvare <jdelvare@suse.com>
>> Cc: Joel Stanley <joel@jms.id.au>
>> Cc: Julia Cartwright <juliac@eso.teric.us>
>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>> Cc: Milton Miller II <miltonm@us.ibm.com>
>> Cc: Pavel Machek <pavel@ucw.cz>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>> ---
>>   drivers/hwmon/Kconfig         |  28 ++
>>   drivers/hwmon/Makefile        |   2 +
>>   drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>>   drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
>>   4 files changed, 1245 insertions(+)
>>   create mode 100644 drivers/hwmon/peci-cputemp.c
>>   create mode 100644 drivers/hwmon/peci-dimmtemp.c
>>
>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>> index f249a4428458..c52f610f81d0 100644
>> --- a/drivers/hwmon/Kconfig
>> +++ b/drivers/hwmon/Kconfig
>> @@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
>>   	  This driver can also be built as a module.  If so, the module
>>   	  will be called nct7904.
>>   
>> +config SENSORS_PECI_CPUTEMP
>> +	tristate "PECI CPU temperature monitoring support"
>> +	depends on OF
>> +	depends on PECI
>> +	help
>> +	  If you say yes here you get support for the generic Intel PECI
>> +	  cputemp driver which provides Digital Thermal Sensor (DTS) thermal
>> +	  readings of the CPU package and CPU cores that are accessible using
>> +	  the PECI Client Command Suite via the processor PECI client.
>> +	  Check Documentation/hwmon/peci-cputemp for details.
>> +
>> +	  This driver can also be built as a module.  If so, the module
>> +	  will be called peci-cputemp.
>> +
>> +config SENSORS_PECI_DIMMTEMP
>> +	tristate "PECI DIMM temperature monitoring support"
>> +	depends on OF
>> +	depends on PECI
>> +	help
>> +	  If you say yes here you get support for the generic Intel PECI hwmon
>> +	  driver which provides Digital Thermal Sensor (DTS) thermal readings of
>> +	  DIMM components that are accessible using the PECI Client Command
>> +	  Suite via the processor PECI client.
>> +	  Check Documentation/hwmon/peci-dimmtemp for details.
>> +
>> +	  This driver can also be built as a module.  If so, the module
>> +	  will be called peci-dimmtemp.
>> +
>>   config SENSORS_NSA320
>>   	tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
>>   	depends on GPIOLIB && OF
>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>> index e7d52a36e6c4..48d9598fcd3a 100644
>> --- a/drivers/hwmon/Makefile
>> +++ b/drivers/hwmon/Makefile
>> @@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)	+= nct7802.o
>>   obj-$(CONFIG_SENSORS_NCT7904)	+= nct7904.o
>>   obj-$(CONFIG_SENSORS_NSA320)	+= nsa320-hwmon.o
>>   obj-$(CONFIG_SENSORS_NTC_THERMISTOR)	+= ntc_thermistor.o
>> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)	+= peci-cputemp.o
>> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)	+= peci-dimmtemp.o
>>   obj-$(CONFIG_SENSORS_PC87360)	+= pc87360.o
>>   obj-$(CONFIG_SENSORS_PC87427)	+= pc87427.o
>>   obj-$(CONFIG_SENSORS_PCF8591)	+= pcf8591.o
>> diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
>> new file mode 100644
>> index 000000000000..f0bc92687512
>> --- /dev/null
>> +++ b/drivers/hwmon/peci-cputemp.c
>> @@ -0,0 +1,783 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2018 Intel Corporation
>> +
>> +#include <linux/delay.h>
>> +#include <linux/hwmon.h>
>> +#include <linux/hwmon-sysfs.h>
> 
> Is this include needed ?
> 

No, it isn't. I'll drop the line.

>> +#include <linux/jiffies.h>
>> +#include <linux/module.h>
>> +#include <linux/of_device.h>
>> +#include <linux/peci.h>
>> +
>> +#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
>> +
>> +#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
>> +#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
>> +#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
>> +
>> +#define DEFAULT_CHANNEL_NUMS  5
>> +#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
>> +#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
>> +
>> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
>> +
>> +#define UPDATE_INTERVAL_MIN   HZ
>> +
>> +enum cpu_gens {
>> +	CPU_GEN_HSX, /* Haswell Xeon */
>> +	CPU_GEN_BRX, /* Broadwell Xeon */
>> +	CPU_GEN_SKX, /* Skylake Xeon */
>> +	CPU_GEN_MAX
>> +};
>> +
>> +struct cpu_gen_info {
>> +	u32 type;
>> +	u32 cpu_id;
>> +	u32 core_max;
>> +};
>> +
>> +struct temp_data {
>> +	bool valid;
>> +	s32  value;
>> +	unsigned long last_updated;
>> +};
>> +
>> +struct temp_group {
>> +	struct temp_data die;
>> +	struct temp_data dts_margin;
>> +	struct temp_data tcontrol;
>> +	struct temp_data tthrottle;
>> +	struct temp_data tjmax;
>> +	struct temp_data core[CORETEMP_CHANNEL_NUMS];
>> +};
>> +
>> +struct peci_cputemp {
>> +	struct peci_client *client;
>> +	struct device *dev;
>> +	char name[PECI_NAME_SIZE];
>> +	struct temp_group temp;
>> +	u8 addr;
>> +	uint cpu_no;
>> +	const struct cpu_gen_info *gen_info;
>> +	u32 core_mask;
>> +	u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
>> +	uint config_idx;
>> +	struct hwmon_channel_info temp_info;
>> +	const struct hwmon_channel_info *info[2];
>> +	struct hwmon_chip_info chip;
>> +};
>> +
>> +enum cputemp_channels {
>> +	channel_die,
>> +	channel_dts_mrgn,
>> +	channel_tcontrol,
>> +	channel_tthrottle,
>> +	channel_tjmax,
>> +	channel_core,
>> +};
>> +
>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>> +	{ .type = CPU_GEN_HSX,
>> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>> +	  .core_max = CORE_MAX_ON_HSX },
>> +	{ .type = CPU_GEN_BRX,
>> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>> +	  .core_max = CORE_MAX_ON_BDX },
>> +	{ .type = CPU_GEN_SKX,
>> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>> +	  .core_max = CORE_MAX_ON_SKX },
>> +};
>> +
>> +static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
>> +	/* Die temperature */
>> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>> +	HWMON_T_CRIT_HYST,
>> +
>> +	/* DTS margin temperature */
>> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
>> +
>> +	/* Tcontrol temperature */
>> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
>> +
>> +	/* Tthrottle temperature */
>> +	HWMON_T_LABEL | HWMON_T_INPUT,
>> +
>> +	/* Tjmax temperature */
>> +	HWMON_T_LABEL | HWMON_T_INPUT,
>> +
>> +	/* Core temperature - for all core channels */
>> +	HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>> +	HWMON_T_CRIT_HYST,
>> +};
>> +
>> +static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
>> +	"Die",
>> +	"DTS margin",
>> +	"Tcontrol",
>> +	"Tthrottle",
>> +	"Tjmax",
>> +	"Core 0", "Core 1", "Core 2", "Core 3",
>> +	"Core 4", "Core 5", "Core 6", "Core 7",
>> +	"Core 8", "Core 9", "Core 10", "Core 11",
>> +	"Core 12", "Core 13", "Core 14", "Core 15",
>> +	"Core 16", "Core 17", "Core 18", "Core 19",
>> +	"Core 20", "Core 21", "Core 22", "Core 23",
>> +};
>> +
>> +static int send_peci_cmd(struct peci_cputemp *priv,
>> +			 enum peci_cmd cmd,
>> +			 void *msg)
>> +{
>> +	return peci_command(priv->client->adapter, cmd, msg);
>> +}
>> +
>> +static int need_update(struct temp_data *temp)
> 
> Please use bool.
> 

Okay. I'll use bool instead of int.

>> +{
>> +	if (temp->valid &&
>> +	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>> +		return 0;
>> +
>> +	return 1;
>> +}
>> +
>> +static void mark_updated(struct temp_data *temp)
>> +{
>> +	temp->valid = true;
>> +	temp->last_updated = jiffies;
>> +}
>> +
>> +static s32 ten_dot_six_to_millidegree(s32 val)
>> +{
>> +	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
>> +}
>> +
>> +static int get_tjmax(struct peci_cputemp *priv)
>> +{
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	int rc;
>> +
>> +	if (!priv->temp.tjmax.valid) {
>> +		msg.addr = priv->addr;
>> +		msg.index = MBX_INDEX_TEMP_TARGET;
>> +		msg.param = 0;
>> +		msg.rx_len = 4;
>> +
>> +		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +		if (rc)
>> +			return rc;
>> +
>> +		priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
>> +		priv->temp.tjmax.valid = true;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_tcontrol(struct peci_cputemp *priv)
>> +{
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	s32 tcontrol_margin;
>> +	s32 tthrottle_offset;
>> +	int rc;
>> +
>> +	if (!need_update(&priv->temp.tcontrol))
>> +		return 0;
>> +
>> +	rc = get_tjmax(priv);
>> +	if (rc)
>> +		return rc;
>> +
>> +	msg.addr = priv->addr;
>> +	msg.index = MBX_INDEX_TEMP_TARGET;
>> +	msg.param = 0;
>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	tcontrol_margin = msg.pkg_config[1];
>> +	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>> +	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>> +
>> +	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>> +	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>> +
>> +	mark_updated(&priv->temp.tcontrol);
>> +	mark_updated(&priv->temp.tthrottle);
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_tthrottle(struct peci_cputemp *priv)
>> +{
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	s32 tcontrol_margin;
>> +	s32 tthrottle_offset;
>> +	int rc;
>> +
>> +	if (!need_update(&priv->temp.tthrottle))
>> +		return 0;
>> +
>> +	rc = get_tjmax(priv);
>> +	if (rc)
>> +		return rc;
>> +
>> +	msg.addr = priv->addr;
>> +	msg.index = MBX_INDEX_TEMP_TARGET;
>> +	msg.param = 0;
>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>> +	priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>> +
>> +	tcontrol_margin = msg.pkg_config[1];
>> +	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>> +	priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>> +
>> +	mark_updated(&priv->temp.tthrottle);
>> +	mark_updated(&priv->temp.tcontrol);
>> +
>> +	return 0;
>> +}
> 
> I am quite completely missing how the two functions above are different.
> 

The two functions above are nearly identical: both use the same PECI
command, which returns the Tthrottle and Tcontrol values together in the
pkg_config array, so each function updates both values to avoid
duplicate PECI transactions. Combining them into a single
get_tthrottle_and_tcontrol() would probably look better. I'll rewrite it.
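
A minimal userspace sketch of such a combined decode, assuming one
4-byte TEMP_TARGET response is available. The struct and function names
are hypothetical; the byte positions, the sign extension and the 0x2f
mask are copied from the quoted patch, not verified against a datasheet.

```c
#include <stdint.h>

/* Hypothetical container for the two cached values, in millidegrees C. */
struct tcontrol_tthrottle {
	int32_t tcontrol;
	int32_t tthrottle;
};

/*
 * Decode Tcontrol and Tthrottle from one RdPkgConfig(TEMP_TARGET)
 * response. The pkg_config[] layout and the 0x2f mask follow the quoted
 * patch; tjmax is given in millidegrees C.
 */
static struct tcontrol_tthrottle
decode_temp_target(const uint8_t pkg_config[4], int32_t tjmax)
{
	struct tcontrol_tthrottle out;
	int32_t tcontrol_margin = pkg_config[1];
	int32_t tthrottle_offset;

	/* Sign-extend the 8-bit Tcontrol margin, then scale to millidegrees. */
	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
	tthrottle_offset = (pkg_config[3] & 0x2f) * 1000;

	out.tcontrol = tjmax - tcontrol_margin;
	out.tthrottle = tjmax - tthrottle_offset;
	return out;
}
```

With this shape, one PECI transaction feeds both cached values, and the
two callers only differ in which field they read afterwards.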

>> +
>> +static int get_die_temp(struct peci_cputemp *priv)
>> +{
>> +	struct peci_get_temp_msg msg;
>> +	int rc;
>> +
>> +	if (!need_update(&priv->temp.die))
>> +		return 0;
>> +
>> +	rc = get_tjmax(priv);
>> +	if (rc)
>> +		return rc;
>> +
>> +	msg.addr = priv->addr;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	priv->temp.die.value = priv->temp.tjmax.value +
>> +			       ((s32)msg.temp_raw * 1000 / 64);
>> +
>> +	mark_updated(&priv->temp.die);
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_dts_margin(struct peci_cputemp *priv)
>> +{
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	s32 dts_margin;
>> +	int rc;
>> +
>> +	if (!need_update(&priv->temp.dts_margin))
>> +		return 0;
>> +
>> +	msg.addr = priv->addr;
>> +	msg.index = MBX_INDEX_DTS_MARGIN;
>> +	msg.param = 0;
>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>> +
>> +	/**
>> +	 * Processors return a value of DTS reading in 10.6 format
>> +	 * (10 bits signed decimal, 6 bits fractional).
>> +	 * Error codes:
>> +	 *   0x8000: General sensor error
>> +	 *   0x8001: Reserved
>> +	 *   0x8002: Underflow on reading value
>> +	 *   0x8003-0x81ff: Reserved
>> +	 */
>> +	if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
>> +		return -EIO;
>> +
>> +	dts_margin = ten_dot_six_to_millidegree(dts_margin);
>> +
>> +	priv->temp.dts_margin.value = dts_margin;
>> +
>> +	mark_updated(&priv->temp.dts_margin);
>> +
>> +	return 0;
>> +}
>> +
>> +static int get_core_temp(struct peci_cputemp *priv, int core_index)
>> +{
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	s32 core_dts_margin;
>> +	int rc;
>> +
>> +	if (!need_update(&priv->temp.core[core_index]))
>> +		return 0;
>> +
>> +	rc = get_tjmax(priv);
>> +	if (rc)
>> +		return rc;
>> +
>> +	msg.addr = priv->addr;
>> +	msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
>> +	msg.param = core_index;
>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>> +
>> +	/**
>> +	 * Processors return a value of the core DTS reading in 10.6 format
>> +	 * (10 bits signed decimal, 6 bits fractional).
>> +	 * Error codes:
>> +	 *   0x8000: General sensor error
>> +	 *   0x8001: Reserved
>> +	 *   0x8002: Underflow on reading value
>> +	 *   0x8003-0x81ff: Reserved
>> +	 */
>> +	if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>> +		return -EIO;
>> +
>> +	core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
>> +
>> +	priv->temp.core[core_index].value = priv->temp.tjmax.value +
>> +					    core_dts_margin;
>> +
>> +	mark_updated(&priv->temp.core[core_index]);
>> +
>> +	return 0;
>> +}
>> +
> 
> There is a lot of duplication in those functions. Would it be possible
> to find common code and use functions for it instead of duplicating
> everything several times ?
> 

Are you pointing out this code?
/**
  * Processors return a value of the core DTS reading in 10.6 format
  * (10 bits signed decimal, 6 bits fractional).
  * Error codes:
  *   0x8000: General sensor error
  *   0x8001: Reserved
  *   0x8002: Underflow on reading value
  *   0x8003-0x81ff: Reserved
  */
if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
	return -EIO;

If so, I'll rewrite it as a function. If not, please point out the
duplication you mean.
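
One possible shape for such a helper, as a userspace sketch. The names
dts_reading_is_error() and dts_bytes_to_millidegree() are hypothetical;
the error range and the 10.6 conversion are taken from the quoted
patch, and -1 stands in for the -EIO the driver returns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Error codes 0x8000-0x81ff, per the comment in the quoted patch. */
#define DTS_ERR_FIRST 0x8000
#define DTS_ERR_LAST  0x81ff

/* Hypothetical helper: true if a raw 16-bit reading is an error code. */
static bool dts_reading_is_error(int32_t raw)
{
	return raw >= DTS_ERR_FIRST && raw <= DTS_ERR_LAST;
}

/* Same conversion as ten_dot_six_to_millidegree() in the patch. */
static int32_t ten_dot_six_to_millidegree(int32_t val)
{
	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
}

/*
 * Hypothetical wrapper: build the raw value from two little-endian
 * bytes, reject error codes, and convert to millidegrees C.
 * Returns 0 on success and -1 on a sensor error.
 */
static int dts_bytes_to_millidegree(uint8_t lo, uint8_t hi, int32_t *out)
{
	int32_t raw = (hi << 8) | lo;

	if (dts_reading_is_error(raw))
		return -1;

	*out = ten_dot_six_to_millidegree(raw);
	return 0;
}
```

Both get_dts_margin() and get_core_temp() could then share the wrapper
and keep only their own mailbox index and cache update.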

>> +static int find_core_index(struct peci_cputemp *priv, int channel)
>> +{
>> +	int core_channel = channel - DEFAULT_CHANNEL_NUMS;
>> +	int idx, found = 0;
>> +
>> +	for (idx = 0; idx < priv->gen_info->core_max; idx++) {
>> +		if (priv->core_mask & BIT(idx)) {
>> +			if (core_channel == found)
>> +				break;
>> +
>> +			found++;
>> +		}
>> +	}
>> +
>> +	return idx;
> 
> What if nothing is found ?
> 

The core temperature group is registered only when
check_resolved_cores() detects at least one core, so find_core_index()
can be called only when priv->core_mask is non-zero. The 'nothing is
found' case will not happen.
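
For illustration only, the channel-to-core mapping can be sketched in
userspace as below. The -1 return for an unmatched channel is an
assumption of this sketch and is not part of the quoted patch, which
returns the loop index in that case.

```c
#include <stdint.h>

#define DEFAULT_CHANNEL_NUMS 5  /* Same value as in the patch */

/*
 * Map a hwmon channel number to a core index by walking the set bits
 * of core_mask. Returns -1 when the channel has no matching core; that
 * return value is an assumption of this sketch.
 */
static int find_core_index(uint32_t core_mask, int core_max, int channel)
{
	int core_channel = channel - DEFAULT_CHANNEL_NUMS;
	int idx, found = 0;

	for (idx = 0; idx < core_max && idx < 32; idx++) {
		if (core_mask & (UINT32_C(1) << idx)) {
			if (core_channel == found)
				return idx;
			found++;
		}
	}

	return -1;
}
```

With core_mask = 0xB (cores 0, 1 and 3 resolved), channels 5, 6 and 7
map to cores 0, 1 and 3, and channel 8 has no core behind it.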

>> +}
>> +
>> +static int cputemp_read_string(struct device *dev,
>> +			       enum hwmon_sensor_types type,
>> +			       u32 attr, int channel, const char **str)
>> +{
>> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
>> +	int core_index;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_label:
>> +		if (channel < DEFAULT_CHANNEL_NUMS) {
>> +			*str = cputemp_label[channel];
>> +		} else {
>> +			core_index = find_core_index(priv, channel);
> 
> FWIW, it might be better to pass channel - DEFAULT_CHANNEL_NUMS
> as parameter.
> 

cputemp_read_string() is mapped to the read_string member of struct
hwmon_ops, so the hwmon subsystem passes the channel parameter based on
the registered channel order. Should I modify the hwmon subsystem code?

> What if find_core_index() returns priv->gen_info->core_max, ie
> if it didn't find a core ?
> 

As explained above, find_core_index() always returns a valid index.

>> +			*str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
>> +		}
>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static int cputemp_read_die(struct device *dev,
>> +			    enum hwmon_sensor_types type,
>> +			    u32 attr, int channel, long *val)
>> +{
>> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
>> +	int rc;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_input:
>> +		rc = get_die_temp(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.die.value;
>> +		return 0;
>> +	case hwmon_temp_max:
>> +		rc = get_tcontrol(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tcontrol.value;
>> +		return 0;
>> +	case hwmon_temp_crit:
>> +		rc = get_tjmax(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tjmax.value;
>> +		return 0;
>> +	case hwmon_temp_crit_hyst:
>> +		rc = get_tcontrol(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static int cputemp_read_dts_margin(struct device *dev,
>> +				   enum hwmon_sensor_types type,
>> +				   u32 attr, int channel, long *val)
>> +{
>> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
>> +	int rc;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_input:
>> +		rc = get_dts_margin(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.dts_margin.value;
>> +		return 0;
>> +	case hwmon_temp_min:
>> +		*val = 0;
>> +		return 0;
> 
> This attribute should not exist.
> 

This is an attribute of the DTS margin temperature, which reflects the
thermal margin to Tcontrol of the CPU package. A value of '0' means the
package has reached Tcontrol, the first level of thermal warning. If the
CPU keeps getting hotter, the DTS margin goes negative until the
temperature reaches Tjmax. At that point it shows the lower critical
value, which lcrit reports as the second level of thermal warning.

>> +	case hwmon_temp_lcrit:
>> +		rc = get_tcontrol(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tcontrol.value - priv->temp.tjmax.value;
> 
> lcrit is tcontrol - tjmax, and crit_hyst above is
> tjmax - tcontrol ? How does this make sense ?
> 

Both Tjmax and Tcontrol have positive values, and Tjmax is always
greater than Tcontrol. As explained above, lcrit of the DTS margin
should be negative, because the margin drops below '0' once Tcontrol is
crossed. On the other hand, crit_hyst of the Die temperature should show
the absolute hysteresis value between Tcontrol and Tjmax.
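
A small worked example of the two expressions, as a userspace sketch.
The function names are hypothetical, and the Tjmax and Tcontrol values
used in the note after the code are made-up inputs rather than readings
from real hardware.

```c
#include <stdint.h>

/* lcrit of the DTS margin channel in the patch: Tcontrol - Tjmax. */
static int32_t dts_margin_lcrit(int32_t tjmax, int32_t tcontrol)
{
	return tcontrol - tjmax;
}

/* crit_hyst of the Die channel in the patch: Tjmax - Tcontrol. */
static int32_t die_crit_hyst(int32_t tjmax, int32_t tcontrol)
{
	return tjmax - tcontrol;
}
```

Assuming Tjmax = 100000 and Tcontrol = 90000 millidegrees, lcrit comes
out as -10000 and crit_hyst as 10000: the same distance, expressed once
as a signed margin and once as a magnitude.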

>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static int cputemp_read_tcontrol(struct device *dev,
>> +				 enum hwmon_sensor_types type,
>> +				 u32 attr, int channel, long *val)
>> +{
>> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
>> +	int rc;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_input:
>> +		rc = get_tcontrol(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tcontrol.value;
>> +		return 0;
>> +	case hwmon_temp_crit:
>> +		rc = get_tjmax(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tjmax.value;
>> +		return 0;
> 
> Am I missing something, or is the same temperature reported several times ?
> tjmax is also reported as temp_crit cputemp_read_die(), for example.
> 

This driver provides multiple channels, and each channel has its own
supplementary attributes. As you mentioned, the Die temperature channel
and the Core temperature channel have their own crit attributes, and
they reflect the same value, Tjmax. So it is not one temperature
reported several times, but several per-channel attributes that report
the same value.

>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static int cputemp_read_tthrottle(struct device *dev,
>> +				  enum hwmon_sensor_types type,
>> +				  u32 attr, int channel, long *val)
>> +{
>> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
>> +	int rc;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_input:
>> +		rc = get_tthrottle(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tthrottle.value;
>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static int cputemp_read_tjmax(struct device *dev,
>> +			      enum hwmon_sensor_types type,
>> +			      u32 attr, int channel, long *val)
>> +{
>> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
>> +	int rc;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_input:
>> +		rc = get_tjmax(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tjmax.value;
>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static int cputemp_read_core(struct device *dev,
>> +			     enum hwmon_sensor_types type,
>> +			     u32 attr, int channel, long *val)
>> +{
>> +	struct peci_cputemp *priv = dev_get_drvdata(dev);
>> +	int core_index = find_core_index(priv, channel);
>> +	int rc;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_input:
>> +		rc = get_core_temp(priv, core_index);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.core[core_index].value;
>> +		return 0;
>> +	case hwmon_temp_max:
>> +		rc = get_tcontrol(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tcontrol.value;
>> +		return 0;
>> +	case hwmon_temp_crit:
>> +		rc = get_tjmax(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tjmax.value;
>> +		return 0;
>> +	case hwmon_temp_crit_hyst:
>> +		rc = get_tcontrol(priv);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
> 
> There is again a lot of duplication in those functions.
> 

Each function is called from cputemp_read(), which is mapped to the read
function pointer of struct hwmon_ops. Since each channel has a different
set of attributes, cputemp_read() calls an individual channel handler
after checking the channel type. Of course, we could handle all
attributes of all channels in a single function, but that approach would
also need channel type checking code for each attribute.

>> +
>> +static int cputemp_read(struct device *dev,
>> +			enum hwmon_sensor_types type,
>> +			u32 attr, int channel, long *val)
>> +{
>> +	switch (channel) {
>> +	case channel_die:
>> +		return cputemp_read_die(dev, type, attr, channel, val);
>> +	case channel_dts_mrgn:
>> +		return cputemp_read_dts_margin(dev, type, attr, channel, val);
>> +	case channel_tcontrol:
>> +		return cputemp_read_tcontrol(dev, type, attr, channel, val);
>> +	case channel_tthrottle:
>> +		return cputemp_read_tthrottle(dev, type, attr, channel, val);
>> +	case channel_tjmax:
>> +		return cputemp_read_tjmax(dev, type, attr, channel, val);
>> +	default:
>> +		if (channel < CPUTEMP_CHANNEL_NUMS)
>> +			return cputemp_read_core(dev, type, attr, channel, val);
>> +
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static umode_t cputemp_is_visible(const void *data,
>> +				  enum hwmon_sensor_types type,
>> +				  u32 attr, int channel)
>> +{
>> +	const struct peci_cputemp *priv = data;
>> +
>> +	if (priv->temp_config[channel] & BIT(attr))
>> +		return 0444;
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct hwmon_ops cputemp_ops = {
>> +	.is_visible = cputemp_is_visible,
>> +	.read_string = cputemp_read_string,
>> +	.read = cputemp_read,
>> +};
>> +
>> +static int check_resolved_cores(struct peci_cputemp *priv)
>> +{
>> +	struct peci_rd_pci_cfg_local_msg msg;
>> +	int rc;
>> +
>> +	if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
>> +		return -EINVAL;
>> +
>> +	/* Get the RESOLVED_CORES register value */
>> +	msg.addr = priv->addr;
>> +	msg.bus = 1;
>> +	msg.device = 30;
>> +	msg.function = 3;
>> +	msg.reg = 0xB4;
> 
> Can this be made less magic with some defines ?
> 

Sure, will use defines instead.
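
A sketch of what such defines could look like. The macro names and the
stand-in request struct are hypothetical (the real message type is
struct peci_rd_pci_cfg_local_msg, whose field widths are assumed here);
only the numeric values come from the quoted patch.

```c
#include <stdint.h>

/* Hypothetical names; the values are the literals from the quoted patch. */
#define REG_RESOLVED_CORES_BUS       1
#define REG_RESOLVED_CORES_DEVICE    30
#define REG_RESOLVED_CORES_FUNCTION  3
#define REG_RESOLVED_CORES_OFFSET    0xB4

/* Stand-in for the request fields used here; widths are assumptions. */
struct rd_pci_cfg_local_req {
	uint8_t addr;
	uint8_t bus;
	uint8_t device;
	uint8_t function;
	uint16_t reg;
	uint8_t rx_len;
};

/* Fill a request for the RESOLVED_CORES register of the given client. */
static void fill_resolved_cores_req(struct rd_pci_cfg_local_req *msg,
				    uint8_t addr)
{
	msg->addr = addr;
	msg->bus = REG_RESOLVED_CORES_BUS;
	msg->device = REG_RESOLVED_CORES_DEVICE;
	msg->function = REG_RESOLVED_CORES_FUNCTION;
	msg->reg = REG_RESOLVED_CORES_OFFSET;
	msg->rx_len = 4;
}

/* Assemble the 32-bit core mask from four little-endian response bytes. */
static uint32_t core_mask_from_bytes(const uint8_t b[4])
{
	return (uint32_t)b[3] << 24 | (uint32_t)b[2] << 16 |
	       (uint32_t)b[1] << 8 | b[0];
}
```

The casts in core_mask_from_bytes() keep the shift of the top byte in
unsigned arithmetic, which avoids shifting into the sign bit of an int.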

>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	priv->core_mask = msg.pci_config[3] << 24 |
>> +			  msg.pci_config[2] << 16 |
>> +			  msg.pci_config[1] << 8 |
>> +			  msg.pci_config[0];
>> +
>> +	if (!priv->core_mask)
>> +		return -EAGAIN;
>> +
>> +	dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
>> +	return 0;
>> +}
>> +
>> +static int create_core_temp_info(struct peci_cputemp *priv)
>> +{
>> +	int rc, i;
>> +
>> +	rc = check_resolved_cores(priv);
>> +	if (!rc) {
>> +		for (i = 0; i < priv->gen_info->core_max; i++) {
>> +			if (priv->core_mask & BIT(i)) {
>> +				priv->temp_config[priv->config_idx++] =
>> +						     config_table[channel_core];
>> +			}
>> +		}
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>> +static int check_cpu_id(struct peci_cputemp *priv)
>> +{
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	u32 cpu_id;
>> +	int i, rc;
>> +
>> +	msg.addr = priv->addr;
>> +	msg.index = MBX_INDEX_CPU_ID;
>> +	msg.param = PKG_ID_CPU_ID;
>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>> +		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>> +
>> +	for (i = 0; i < CPU_GEN_MAX; i++) {
>> +		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>> +			priv->gen_info = &cpu_gen_info_table[i];
>> +			break;
>> +		}
>> +	}
>> +
>> +	if (!priv->gen_info)
>> +		return -ENODEV;
>> +
>> +	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>> +	return 0;
>> +}
>> +
>> +static int peci_cputemp_probe(struct peci_client *client)
>> +{
>> +	struct device *dev = &client->dev;
>> +	struct peci_cputemp *priv;
>> +	struct device *hwmon_dev;
>> +	int rc;
>> +
>> +	if ((client->adapter->cmd_mask &
>> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>> +		dev_err(dev, "Client doesn't support temperature monitoring\n");
>> +		return -EINVAL;
> 
> Does this mean there will be an error message for each non-supported CPU ?
> Why ?
> 

For proper operation of this driver, PECI_CMD_GET_TEMP and
PECI_CMD_RD_PKG_CFG have to be supported by the client CPU.
PECI_CMD_GET_TEMP is provided as a default command, but
PECI_CMD_RD_PKG_CFG depends on the PECI minor revision of the CPU
package, so this check is needed.
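
The check itself can be sketched in userspace as a single mask
comparison. The command numbering and the DEMO_ names below are
hypothetical placeholders; in the driver the bits come from enum
peci_cmd and the adapter's cmd_mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical command numbering; the real enum lives in the PECI core. */
enum demo_peci_cmd {
	DEMO_CMD_GET_TEMP = 2,
	DEMO_CMD_RD_PKG_CFG = 3,
};

#define DEMO_BIT(n) (UINT32_C(1) << (n))
#define DEMO_REQUIRED_CMDS \
	(DEMO_BIT(DEMO_CMD_GET_TEMP) | DEMO_BIT(DEMO_CMD_RD_PKG_CFG))

/* True only if every required command bit is set in cmd_mask. */
static bool has_required_cmds(uint32_t cmd_mask)
{
	return (cmd_mask & DEMO_REQUIRED_CMDS) == DEMO_REQUIRED_CMDS;
}
```

Naming the required set once keeps the comparison readable and avoids
repeating the two BIT() terms on both sides of the test.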

>> +	}
>> +
>> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>> +	if (!priv)
>> +		return -ENOMEM;
>> +
>> +	dev_set_drvdata(dev, priv);
>> +	priv->client = client;
>> +	priv->dev = dev;
>> +	priv->addr = client->addr;
>> +	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>> +
>> +	snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
>> +		 priv->cpu_no);
>> +
>> +	rc = check_cpu_id(priv);
>> +	if (rc) {
>> +		dev_err(dev, "Client CPU is not supported\n");
> 
> -ENODEV is not an error, and should not result in an error message.
> Besides, the error can also be propagated from peci core code,
> and may well be something else.
> 

Got it. I'll remove the error message and add proper handling code to
the PECI core.

>> +		return rc;
>> +	}
>> +
>> +	priv->temp_config[priv->config_idx++] = config_table[channel_die];
>> +	priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
>> +	priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
>> +	priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
>> +	priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
>> +
>> +	rc = create_core_temp_info(priv);
>> +	if (rc)
>> +		dev_dbg(dev, "Failed to create core temp info\n");
> 
> Then what ? Shouldn't this result in probe deferral or something more useful
> instead of just being ignored ?
> 

This driver can't support core temperature monitoring if a CPU doesn't
support the PECI_CMD_RD_PCI_CFG_LOCAL command. In that case, it skips
core temperature group creation and supports only the basic temperature
monitoring of Die, DTS margin, etc. I'll add this description as a
comment.

>> +
>> +	priv->chip.ops = &cputemp_ops;
>> +	priv->chip.info = priv->info;
>> +
>> +	priv->info[0] = &priv->temp_info;
>> +
>> +	priv->temp_info.type = hwmon_temp;
>> +	priv->temp_info.config = priv->temp_config;
>> +
>> +	hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>> +							 priv->name,
>> +							 priv,
>> +							 &priv->chip,
>> +							 NULL);
>> +
>> +	if (IS_ERR(hwmon_dev))
>> +		return PTR_ERR(hwmon_dev);
>> +
>> +	dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct of_device_id peci_cputemp_of_table[] = {
>> +	{ .compatible = "intel,peci-cputemp" },
>> +	{ }
>> +};
>> +MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
>> +
>> +static struct peci_driver peci_cputemp_driver = {
>> +	.probe  = peci_cputemp_probe,
>> +	.driver = {
>> +		.name           = "peci-cputemp",
>> +		.of_match_table = of_match_ptr(peci_cputemp_of_table),
>> +	},
>> +};
>> +module_peci_driver(peci_cputemp_driver);
>> +
>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>> +MODULE_DESCRIPTION("PECI cputemp driver");
>> +MODULE_LICENSE("GPL v2");
>> diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
>> new file mode 100644
>> index 000000000000..78bf29cb2c4c
>> --- /dev/null
>> +++ b/drivers/hwmon/peci-dimmtemp.c
> 
> FWIW, this should be two separate patches.
> 

Should I split out hwmon documents and dt bindings too?

>> @@ -0,0 +1,432 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2018 Intel Corporation
>> +
>> +#include <linux/delay.h>
>> +#include <linux/hwmon.h>
>> +#include <linux/hwmon-sysfs.h>
> 
> Needed ?
> 

No. Will drop the line.

>> +#include <linux/jiffies.h>
>> +#include <linux/module.h>
>> +#include <linux/of_device.h>
>> +#include <linux/peci.h>
>> +#include <linux/workqueue.h>
>> +
>> +#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
>> +
>> +#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
>> +#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
>> +
>> +#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
>> +#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
>> +
>> +#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
>> +#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
>> +
>> +#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
>> +#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
>> +
>> +#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
>> +
>> +#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
>> +
>> +#define UPDATE_INTERVAL_MIN  HZ
>> +
>> +#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
>> +#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
>> +
>> +enum cpu_gens {
>> +	CPU_GEN_HSX, /* Haswell Xeon */
>> +	CPU_GEN_BRX, /* Broadwell Xeon */
>> +	CPU_GEN_SKX, /* Skylake Xeon */
>> +	CPU_GEN_MAX
>> +};
>> +
>> +struct cpu_gen_info {
>> +	u32 type;
>> +	u32 cpu_id;
>> +	u32 chan_rank_max;
>> +	u32 dimm_idx_max;
>> +};
>> +
>> +struct temp_data {
>> +	bool valid;
>> +	s32  value;
>> +	unsigned long last_updated;
>> +};
>> +
>> +struct peci_dimmtemp {
>> +	struct peci_client *client;
>> +	struct device *dev;
>> +	struct workqueue_struct *work_queue;
>> +	struct delayed_work work_handler;
>> +	char name[PECI_NAME_SIZE];
>> +	struct temp_data temp[DIMM_NUMS_MAX];
>> +	u8 addr;
>> +	uint cpu_no;
>> +	const struct cpu_gen_info *gen_info;
>> +	u32 dimm_mask;
>> +	int retry_count;
>> +	int channels;
>> +	u32 temp_config[DIMM_NUMS_MAX + 1];
>> +	struct hwmon_channel_info temp_info;
>> +	const struct hwmon_channel_info *info[2];
>> +	struct hwmon_chip_info chip;
>> +};
>> +
>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>> +	{ .type  = CPU_GEN_HSX,
>> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>> +	  .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
>> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
>> +	{ .type  = CPU_GEN_BRX,
>> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>> +	  .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
>> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
>> +	{ .type  = CPU_GEN_SKX,
>> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>> +	  .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
>> +	  .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
>> +};
>> +
>> +static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
>> +	{ "DIMM A0", "DIMM A1", "DIMM A2" },
>> +	{ "DIMM B0", "DIMM B1", "DIMM B2" },
>> +	{ "DIMM C0", "DIMM C1", "DIMM C2" },
>> +	{ "DIMM D0", "DIMM D1", "DIMM D2" },
>> +	{ "DIMM E0", "DIMM E1", "DIMM E2" },
>> +	{ "DIMM F0", "DIMM F1", "DIMM F2" },
>> +	{ "DIMM G0", "DIMM G1", "DIMM G2" },
>> +	{ "DIMM H0", "DIMM H1", "DIMM H2" },
>> +};
>> +
>> +static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
>> +			 void *msg)
>> +{
>> +	return peci_command(priv->client->adapter, cmd, msg);
>> +}
>> +
>> +static int need_update(struct temp_data *temp)
>> +{
>> +	if (temp->valid &&
>> +	    time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>> +		return 0;
>> +
>> +	return 1;
>> +}
>> +
>> +static void mark_updated(struct temp_data *temp)
>> +{
>> +	temp->valid = true;
>> +	temp->last_updated = jiffies;
>> +}
> 
> It might make sense to provide the duplicate functions in a core file.
> 

These are temperature-monitoring-specific functions and they touch 
module-specific variables. Do you really think that such non-generic 
functions should be moved to the PECI core?
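
For reference, a minimal userspace sketch of what a shared version of the
two helpers could look like, assuming the current tick count and the
minimum interval are passed in by the caller instead of being read from
jiffies and UPDATE_INTERVAL_MIN. The names tick_before(),
temp_need_update() and temp_mark_updated() are hypothetical and are not
part of the patch:

```c
#include <stdbool.h>

/* Cached reading, mirroring the fields of struct temp_data in the patch. */
struct temp_data {
	bool valid;
	int value;
	unsigned long last_updated;
};

/*
 * Wraparound-safe "a is before b" for free-running tick counters. This is
 * the same signed-difference idea that the kernel's time_before() uses.
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* A reading is stale when it was never taken or the interval has elapsed. */
static bool temp_need_update(const struct temp_data *temp,
			     unsigned long now, unsigned long interval)
{
	return !(temp->valid &&
		 tick_before(now, temp->last_updated + interval));
}

/* Record that the cached value was refreshed at tick 'now'. */
static void temp_mark_updated(struct temp_data *temp, unsigned long now)
{
	temp->valid = true;
	temp->last_updated = now;
}
```

Nothing here depends on cputemp or dimmtemp internals, so a header shared
by the two hwmon drivers would be enough; it would not have to live in the
PECI core itself.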

>> +
>> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
>> +{
>> +	int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
>> +	int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	int rc;
>> +
>> +	if (!need_update(&priv->temp[dimm_no]))
>> +		return 0;
>> +
>> +	msg.addr = priv->addr;
>> +	msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>> +	msg.param = chan_rank;
>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
>> +
>> +	mark_updated(&priv->temp[dimm_no]);
>> +
>> +	return 0;
>> +}
>> +
>> +static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
>> +{
>> +	int dimm_nums_max = priv->gen_info->chan_rank_max *
>> +			    priv->gen_info->dimm_idx_max;
>> +	int idx, found = 0;
>> +
>> +	for (idx = 0; idx < dimm_nums_max; idx++) {
>> +		if (priv->dimm_mask & BIT(idx)) {
>> +			if (channel == found)
>> +				break;
>> +
>> +			found++;
>> +		}
>> +	}
>> +
>> +	return idx;
>> +}
> 
> This again looks like duplicate code.
> 

find_dimm_number()? I'm sure it isn't.
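
To make the comparison concrete, here is a userspace sketch of the lookup,
assuming a DIMM's bit position in the mask is
chan_rank * dimm_idx_max + dimm_idx, which is the inverse of the '/' and
'%' split used by get_dimm_temp(). The helper names dimm_number() and
nth_set_bit() are hypothetical, and unlike the quoted code the lookup
reports a miss explicitly:

```c
#include <stdint.h>

/* Flat DIMM number for a (channel rank, DIMM index) pair. */
static int dimm_number(int chan_rank, int dimm_idx, int dimm_idx_max)
{
	return chan_rank * dimm_idx_max + dimm_idx;
}

/*
 * Return the bit position of the channel-th set bit in mask (channel
 * counts from 0), or -1 if fewer than channel + 1 bits are set below
 * nbits. nbits must not exceed 32.
 */
static int nth_set_bit(uint32_t mask, int nbits, int channel)
{
	int idx, found = 0;

	for (idx = 0; idx < nbits; idx++) {
		if (!(mask & (UINT32_C(1) << idx)))
			continue;
		if (found == channel)
			return idx;
		found++;
	}

	return -1;
}
```

The decode only round-trips when the mask is built with the same stride
that is later used to split the flat number back into rank and index.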

>> +
>> +static int dimmtemp_read_string(struct device *dev,
>> +				enum hwmon_sensor_types type,
>> +				u32 attr, int channel, const char **str)
>> +{
>> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>> +	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>> +	int dimm_no, chan_rank, dimm_idx;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_label:
>> +		dimm_no = find_dimm_number(priv, channel);
>> +		chan_rank = dimm_no / dimm_idx_max;
>> +		dimm_idx = dimm_no % dimm_idx_max;
>> +		*str = dimmtemp_label[chan_rank][dimm_idx];
>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
>> +			 u32 attr, int channel, long *val)
>> +{
>> +	struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>> +	int dimm_no = find_dimm_number(priv, channel);
>> +	int rc;
>> +
>> +	switch (attr) {
>> +	case hwmon_temp_input:
>> +		rc = get_dimm_temp(priv, dimm_no);
>> +		if (rc)
>> +			return rc;
>> +
>> +		*val = priv->temp[dimm_no].value;
>> +		return 0;
>> +	default:
>> +		return -EOPNOTSUPP;
>> +	}
>> +}
>> +
>> +static umode_t dimmtemp_is_visible(const void *data,
>> +				   enum hwmon_sensor_types type,
>> +				   u32 attr, int channel)
>> +{
>> +	switch (attr) {
>> +	case hwmon_temp_label:
>> +	case hwmon_temp_input:
>> +		return 0444;
>> +	default:
>> +		return 0;
>> +	}
>> +}
>> +
>> +static const struct hwmon_ops dimmtemp_ops = {
>> +	.is_visible = dimmtemp_is_visible,
>> +	.read_string = dimmtemp_read_string,
>> +	.read = dimmtemp_read,
>> +};
>> +
>> +static int check_populated_dimms(struct peci_dimmtemp *priv)
>> +{
>> +	u32 chan_rank_max = priv->gen_info->chan_rank_max;
>> +	u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	int chan_rank, dimm_idx;
>> +	int rc, channels = 0;
>> +
>> +	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
>> +		msg.addr = priv->addr;
>> +		msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>> +		msg.param = chan_rank;
>> +		msg.rx_len = 4;
>> +
>> +		rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +		if (rc) {
>> +			priv->dimm_mask = 0;
>> +			return rc;
>> +		}
>> +
>> +		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
>> +			if (msg.pkg_config[dimm_idx]) {
>> +				priv->dimm_mask |= BIT(chan_rank *
>> +						       chan_rank_max +
>> +						       dimm_idx);
>> +				channels++;
>> +			}
>> +		}
>> +	}
>> +
>> +	if (!priv->dimm_mask)
>> +		return -EAGAIN;
>> +
>> +	priv->channels = channels;
>> +
>> +	dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
>> +	return 0;
>> +}
>> +
>> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
>> +{
>> +	struct device *hwmon_dev;
>> +	int rc, i;
>> +
>> +	rc = check_populated_dimms(priv);
>> +	if (!rc) {
> 
> Please handle error cases first.
> 

Sure, I'll rewrite it.

>> +		for (i = 0; i < priv->channels; i++)
>> +			priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
>> +
>> +		priv->chip.ops = &dimmtemp_ops;
>> +		priv->chip.info = priv->info;
>> +
>> +		priv->info[0] = &priv->temp_info;
>> +
>> +		priv->temp_info.type = hwmon_temp;
>> +		priv->temp_info.config = priv->temp_config;
>> +
>> +		hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>> +								 priv->name,
>> +								 priv,
>> +								 &priv->chip,
>> +								 NULL);
>> +		rc = PTR_ERR_OR_ZERO(hwmon_dev);
>> +		if (!rc)
>> +			dev_dbg(priv->dev, "%s: sensor '%s'\n",
>> +				dev_name(hwmon_dev), priv->name);
>> +	} else if (rc == -EAGAIN) {
>> +		if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
>> +			queue_delayed_work(priv->work_queue,
>> +					   &priv->work_handler,
>> +					   DIMM_MASK_CHECK_DELAY_JIFFIES);
>> +			priv->retry_count++;
>> +			dev_dbg(priv->dev,
>> +				"Deferred DIMM temp info creation\n");
>> +		} else {
>> +			rc = -ETIMEDOUT;
>> +			dev_err(priv->dev,
>> +				"Timeout retrying DIMM temp info creation\n");
>> +		}
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>> +static void create_dimm_temp_info_delayed(struct work_struct *work)
>> +{
>> +	struct delayed_work *dwork = to_delayed_work(work);
>> +	struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
>> +						  work_handler);
>> +	int rc;
>> +
>> +	rc = create_dimm_temp_info(priv);
>> +	if (rc && rc != -EAGAIN)
>> +		dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
>> +}
>> +
>> +static int check_cpu_id(struct peci_dimmtemp *priv)
>> +{
>> +	struct peci_rd_pkg_cfg_msg msg;
>> +	u32 cpu_id;
>> +	int i, rc;
>> +
>> +	msg.addr = priv->addr;
>> +	msg.index = MBX_INDEX_CPU_ID;
>> +	msg.param = PKG_ID_CPU_ID;
>> +	msg.rx_len = 4;
>> +
>> +	rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>> +	if (rc)
>> +		return rc;
>> +
>> +	cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>> +		  msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>> +
>> +	for (i = 0; i < CPU_GEN_MAX; i++) {
>> +		if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>> +			priv->gen_info = &cpu_gen_info_table[i];
>> +			break;
>> +		}
>> +	}
>> +
>> +	if (!priv->gen_info)
>> +		return -ENODEV;
>> +
>> +	dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>> +	return 0;
>> +}
> 
> More duplicate code.
> 

Okay. check_cpu_id() could be used as a generic PECI function, so I'll 
move it into the PECI core.

>> +
>> +static int peci_dimmtemp_probe(struct peci_client *client)
>> +{
>> +	struct device *dev = &client->dev;
>> +	struct peci_dimmtemp *priv;
>> +	int rc;
>> +
>> +	if ((client->adapter->cmd_mask &
>> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>> +	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
> 
> One set of ( ) is unnecessary on each side of the expression.
> 

Both '&' and '|' bind less tightly than '!=', so the parentheses on each 
side have to stay. I'll keep it as:

	if ((client->adapter->cmd_mask &
	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
	    (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)))
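
A quick userspace check of the precedence involved may be useful before
any parentheses are dropped. In C, '!=' binds tighter than '&', '^' and
'|', so without the outer parentheses on the left-hand side the comparison
is evaluated first. The sketch below assumes a hypothetical SUPPORTED_MASK
standing in for the BIT() pair, and the second function spells out how the
compiler would group the unparenthesized form:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG). */
#define SUPPORTED_MASK 0x6u

/* Fully parenthesized: true when at least one required bit is missing. */
static bool missing_cmds(uint32_t cmd_mask)
{
	return (cmd_mask & SUPPORTED_MASK) != SUPPORTED_MASK;
}

/*
 * How "cmd_mask & SUPPORTED_MASK != SUPPORTED_MASK" is actually grouped:
 * the comparison yields 0, so the whole expression is cmd_mask & 0.
 */
static bool missing_cmds_as_parsed(uint32_t cmd_mask)
{
	return cmd_mask & (SUPPORTED_MASK != SUPPORTED_MASK);
}
```

The second form never reports a missing command, so the probe would accept
an adapter that lacks either command.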

>> +		dev_err(dev, "Client doesn't support temperature monitoring\n");
>> +		return -EINVAL;
> 
> Why is this "invalid", and why does it warrant an error message ?
> 

Should I use -EPERM? Any suggestion?

>> +	}
>> +
>> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>> +	if (!priv)
>> +		return -ENOMEM;
>> +
>> +	dev_set_drvdata(dev, priv);
>> +	priv->client = client;
>> +	priv->dev = dev;
>> +	priv->addr = client->addr;
>> +	priv->cpu_no = priv->addr - PECI_BASE_ADDR;
> 
> Is priv->addr guaranteed to be >= PECI_BASE_ADDR ?

Client address range validation will be done in 
peci_check_addr_validity() in PECI core before probing a device driver.

>> +
>> +	snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
>> +		 priv->cpu_no);
>> +
>> +	rc = check_cpu_id(priv);
>> +	if (rc) {
>> +		dev_err(dev, "Client CPU is not supported\n");
> 
> Or the peci command failed.
> 

I'll remove the error message and add proper handling code for each 
error type to the PECI core.

>> +		return rc;
>> +	}
>> +
>> +	priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
>> +	if (!priv->work_queue)
>> +		return -ENOMEM;
>> +
>> +	INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
>> +
>> +	rc = create_dimm_temp_info(priv);
>> +	if (rc && rc != -EAGAIN) {
>> +		dev_err(dev, "Failed to create DIMM temp info\n");
>> +		goto err_free_wq;
>> +	}
>> +
>> +	return 0;
>> +
>> +err_free_wq:
>> +	destroy_workqueue(priv->work_queue);
>> +	return rc;
>> +}
>> +
>> +static int peci_dimmtemp_remove(struct peci_client *client)
>> +{
>> +	struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
>> +
>> +	cancel_delayed_work(&priv->work_handler);
> 
> cancel_delayed_work_sync() ?
> 

Yes, it would be safer. Will fix it.

>> +	destroy_workqueue(priv->work_queue);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct of_device_id peci_dimmtemp_of_table[] = {
>> +	{ .compatible = "intel,peci-dimmtemp" },
>> +	{ }
>> +};
>> +MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
>> +
>> +static struct peci_driver peci_dimmtemp_driver = {
>> +	.probe  = peci_dimmtemp_probe,
>> +	.remove = peci_dimmtemp_remove,
>> +	.driver = {
>> +		.name           = "peci-dimmtemp",
>> +		.of_match_table = of_match_ptr(peci_dimmtemp_of_table),
>> +	},
>> +};
>> +module_peci_driver(peci_dimmtemp_driver);
>> +
>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>> +MODULE_DESCRIPTION("PECI dimmtemp driver");
>> +MODULE_LICENSE("GPL v2");
>> -- 
>> 2.16.2
>>

* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-11 21:59     ` Jae Hyun Yoo
@ 2018-04-12  0:34       ` Guenter Roeck
  2018-04-12  2:51         ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Guenter Roeck @ 2018-04-12  0:34 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On 04/11/2018 02:59 PM, Jae Hyun Yoo wrote:
> Hi Guenter,
> 
> Thanks a lot for sharing your time. Please see my inline answers.
> 
> On 4/10/2018 3:28 PM, Guenter Roeck wrote:
>> On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:
>>> This commit adds PECI cputemp and dimmtemp hwmon drivers.
>>>
>>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>>> Cc: Alan Cox <alan@linux.intel.com>
>>> Cc: Andrew Jeffery <andrew@aj.id.au>
>>> Cc: Andrew Lunn <andrew@lunn.ch>
>>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>>> Cc: Greg KH <gregkh@linuxfoundation.org>
>>> Cc: Guenter Roeck <linux@roeck-us.net>
>>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>>> Cc: Jean Delvare <jdelvare@suse.com>
>>> Cc: Joel Stanley <joel@jms.id.au>
>>> Cc: Julia Cartwright <juliac@eso.teric.us>
>>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>>> Cc: Milton Miller II <miltonm@us.ibm.com>
>>> Cc: Pavel Machek <pavel@ucw.cz>
>>> Cc: Randy Dunlap <rdunlap@infradead.org>
>>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>>> ---
>>>   drivers/hwmon/Kconfig         |  28 ++
>>>   drivers/hwmon/Makefile        |   2 +
>>>   drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>>>   drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
>>>   4 files changed, 1245 insertions(+)
>>>   create mode 100644 drivers/hwmon/peci-cputemp.c
>>>   create mode 100644 drivers/hwmon/peci-dimmtemp.c
>>>
>>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>>> index f249a4428458..c52f610f81d0 100644
>>> --- a/drivers/hwmon/Kconfig
>>> +++ b/drivers/hwmon/Kconfig
>>> @@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
>>>         This driver can also be built as a module.  If so, the module
>>>         will be called nct7904.
>>> +config SENSORS_PECI_CPUTEMP
>>> +    tristate "PECI CPU temperature monitoring support"
>>> +    depends on OF
>>> +    depends on PECI
>>> +    help
>>> +      If you say yes here you get support for the generic Intel PECI
>>> +      cputemp driver which provides Digital Thermal Sensor (DTS) thermal
>>> +      readings of the CPU package and CPU cores that are accessible using
>>> +      the PECI Client Command Suite via the processor PECI client.
>>> +      Check Documentation/hwmon/peci-cputemp for details.
>>> +
>>> +      This driver can also be built as a module.  If so, the module
>>> +      will be called peci-cputemp.
>>> +
>>> +config SENSORS_PECI_DIMMTEMP
>>> +    tristate "PECI DIMM temperature monitoring support"
>>> +    depends on OF
>>> +    depends on PECI
>>> +    help
>>> +      If you say yes here you get support for the generic Intel PECI hwmon
>>> +      driver which provides Digital Thermal Sensor (DTS) thermal readings of
>>> +      DIMM components that are accessible using the PECI Client Command
>>> +      Suite via the processor PECI client.
>>> +      Check Documentation/hwmon/peci-dimmtemp for details.
>>> +
>>> +      This driver can also be built as a module.  If so, the module
>>> +      will be called peci-dimmtemp.
>>> +
>>>   config SENSORS_NSA320
>>>       tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
>>>       depends on GPIOLIB && OF
>>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>>> index e7d52a36e6c4..48d9598fcd3a 100644
>>> --- a/drivers/hwmon/Makefile
>>> +++ b/drivers/hwmon/Makefile
>>> @@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)    += nct7802.o
>>>   obj-$(CONFIG_SENSORS_NCT7904)    += nct7904.o
>>>   obj-$(CONFIG_SENSORS_NSA320)    += nsa320-hwmon.o
>>>   obj-$(CONFIG_SENSORS_NTC_THERMISTOR)    += ntc_thermistor.o
>>> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)    += peci-cputemp.o
>>> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
>>>   obj-$(CONFIG_SENSORS_PC87360)    += pc87360.o
>>>   obj-$(CONFIG_SENSORS_PC87427)    += pc87427.o
>>>   obj-$(CONFIG_SENSORS_PCF8591)    += pcf8591.o
>>> diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
>>> new file mode 100644
>>> index 000000000000..f0bc92687512
>>> --- /dev/null
>>> +++ b/drivers/hwmon/peci-cputemp.c
>>> @@ -0,0 +1,783 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +// Copyright (c) 2018 Intel Corporation
>>> +
>>> +#include <linux/delay.h>
>>> +#include <linux/hwmon.h>
>>> +#include <linux/hwmon-sysfs.h>
>>
>> Is this include needed ?
>>
> 
> No it isn't. Will drop the line.
> 
>>> +#include <linux/jiffies.h>
>>> +#include <linux/module.h>
>>> +#include <linux/of_device.h>
>>> +#include <linux/peci.h>
>>> +
>>> +#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
>>> +
>>> +#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
>>> +#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
>>> +#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
>>> +
>>> +#define DEFAULT_CHANNEL_NUMS  5
>>> +#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
>>> +#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
>>> +
>>> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
>>> +
>>> +#define UPDATE_INTERVAL_MIN   HZ
>>> +
>>> +enum cpu_gens {
>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>> +    CPU_GEN_MAX
>>> +};
>>> +
>>> +struct cpu_gen_info {
>>> +    u32 type;
>>> +    u32 cpu_id;
>>> +    u32 core_max;
>>> +};
>>> +
>>> +struct temp_data {
>>> +    bool valid;
>>> +    s32  value;
>>> +    unsigned long last_updated;
>>> +};
>>> +
>>> +struct temp_group {
>>> +    struct temp_data die;
>>> +    struct temp_data dts_margin;
>>> +    struct temp_data tcontrol;
>>> +    struct temp_data tthrottle;
>>> +    struct temp_data tjmax;
>>> +    struct temp_data core[CORETEMP_CHANNEL_NUMS];
>>> +};
>>> +
>>> +struct peci_cputemp {
>>> +    struct peci_client *client;
>>> +    struct device *dev;
>>> +    char name[PECI_NAME_SIZE];
>>> +    struct temp_group temp;
>>> +    u8 addr;
>>> +    uint cpu_no;
>>> +    const struct cpu_gen_info *gen_info;
>>> +    u32 core_mask;
>>> +    u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
>>> +    uint config_idx;
>>> +    struct hwmon_channel_info temp_info;
>>> +    const struct hwmon_channel_info *info[2];
>>> +    struct hwmon_chip_info chip;
>>> +};
>>> +
>>> +enum cputemp_channels {
>>> +    channel_die,
>>> +    channel_dts_mrgn,
>>> +    channel_tcontrol,
>>> +    channel_tthrottle,
>>> +    channel_tjmax,
>>> +    channel_core,
>>> +};
>>> +
>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>> +    { .type = CPU_GEN_HSX,
>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>> +      .core_max = CORE_MAX_ON_HSX },
>>> +    { .type = CPU_GEN_BRX,
>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>> +      .core_max = CORE_MAX_ON_BDX },
>>> +    { .type = CPU_GEN_SKX,
>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>> +      .core_max = CORE_MAX_ON_SKX },
>>> +};
>>> +
>>> +static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
>>> +    /* Die temperature */
>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>> +    HWMON_T_CRIT_HYST,
>>> +
>>> +    /* DTS margin temperature */
>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
>>> +
>>> +    /* Tcontrol temperature */
>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
>>> +
>>> +    /* Tthrottle temperature */
>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>> +
>>> +    /* Tjmax temperature */
>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>> +
>>> +    /* Core temperature - for all core channels */
>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>> +    HWMON_T_CRIT_HYST,
>>> +};
>>> +
>>> +static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
>>> +    "Die",
>>> +    "DTS margin",
>>> +    "Tcontrol",
>>> +    "Tthrottle",
>>> +    "Tjmax",
>>> +    "Core 0", "Core 1", "Core 2", "Core 3",
>>> +    "Core 4", "Core 5", "Core 6", "Core 7",
>>> +    "Core 8", "Core 9", "Core 10", "Core 11",
>>> +    "Core 12", "Core 13", "Core 14", "Core 15",
>>> +    "Core 16", "Core 17", "Core 18", "Core 19",
>>> +    "Core 20", "Core 21", "Core 22", "Core 23",
>>> +};
>>> +
>>> +static int send_peci_cmd(struct peci_cputemp *priv,
>>> +             enum peci_cmd cmd,
>>> +             void *msg)
>>> +{
>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>> +}
>>> +
>>> +static int need_update(struct temp_data *temp)
>>
>> Please use bool.
>>
> 
> Okay. I'll use bool instead of int.
> 
>>> +{
>>> +    if (temp->valid &&
>>> +        time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>>> +        return 0;
>>> +
>>> +    return 1;
>>> +}
>>> +
>>> +static void mark_updated(struct temp_data *temp)
>>> +{
>>> +    temp->valid = true;
>>> +    temp->last_updated = jiffies;
>>> +}
>>> +
>>> +static s32 ten_dot_six_to_millidegree(s32 val)
>>> +{
>>> +    return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
>>> +}
>>> +
>>> +static int get_tjmax(struct peci_cputemp *priv)
>>> +{
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    int rc;
>>> +
>>> +    if (!priv->temp.tjmax.valid) {
>>> +        msg.addr = priv->addr;
>>> +        msg.index = MBX_INDEX_TEMP_TARGET;
>>> +        msg.param = 0;
>>> +        msg.rx_len = 4;
>>> +
>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
>>> +        priv->temp.tjmax.valid = true;
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int get_tcontrol(struct peci_cputemp *priv)
>>> +{
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    s32 tcontrol_margin;
>>> +    s32 tthrottle_offset;
>>> +    int rc;
>>> +
>>> +    if (!need_update(&priv->temp.tcontrol))
>>> +        return 0;
>>> +
>>> +    rc = get_tjmax(priv);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    msg.addr = priv->addr;
>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>> +    msg.param = 0;
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    tcontrol_margin = msg.pkg_config[1];
>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>>> +
>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>>> +
>>> +    mark_updated(&priv->temp.tcontrol);
>>> +    mark_updated(&priv->temp.tthrottle);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int get_tthrottle(struct peci_cputemp *priv)
>>> +{
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    s32 tcontrol_margin;
>>> +    s32 tthrottle_offset;
>>> +    int rc;
>>> +
>>> +    if (!need_update(&priv->temp.tthrottle))
>>> +        return 0;
>>> +
>>> +    rc = get_tjmax(priv);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    msg.addr = priv->addr;
>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>> +    msg.param = 0;
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>>> +
>>> +    tcontrol_margin = msg.pkg_config[1];
>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>>> +
>>> +    mark_updated(&priv->temp.tthrottle);
>>> +    mark_updated(&priv->temp.tcontrol);
>>> +
>>> +    return 0;
>>> +}
>>
>> I am quite completely missing how the two functions above are different.
>>
> 
> The two functions differ slightly but use the same PECI command, which provides both the Tthrottle and Tcontrol values in the pkg_config array, so each one updates both values to reduce duplicate PECI transactions. Combining them into a single get_tthrottle_and_tcontrol() would probably look better. I'll rewrite it.
> 
>>> +
>>> +static int get_die_temp(struct peci_cputemp *priv)
>>> +{
>>> +    struct peci_get_temp_msg msg;
>>> +    int rc;
>>> +
>>> +    if (!need_update(&priv->temp.die))
>>> +        return 0;
>>> +
>>> +    rc = get_tjmax(priv);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    msg.addr = priv->addr;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    priv->temp.die.value = priv->temp.tjmax.value +
>>> +                   ((s32)msg.temp_raw * 1000 / 64);
>>> +
>>> +    mark_updated(&priv->temp.die);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int get_dts_margin(struct peci_cputemp *priv)
>>> +{
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    s32 dts_margin;
>>> +    int rc;
>>> +
>>> +    if (!need_update(&priv->temp.dts_margin))
>>> +        return 0;
>>> +
>>> +    msg.addr = priv->addr;
>>> +    msg.index = MBX_INDEX_DTS_MARGIN;
>>> +    msg.param = 0;
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>> +
>>> +    /**
>>> +     * Processors return a value of DTS reading in 10.6 format
>>> +     * (10 bits signed decimal, 6 bits fractional).
>>> +     * Error codes:
>>> +     *   0x8000: General sensor error
>>> +     *   0x8001: Reserved
>>> +     *   0x8002: Underflow on reading value
>>> +     *   0x8003-0x81ff: Reserved
>>> +     */
>>> +    if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
>>> +        return -EIO;
>>> +
>>> +    dts_margin = ten_dot_six_to_millidegree(dts_margin);
>>> +
>>> +    priv->temp.dts_margin.value = dts_margin;
>>> +
>>> +    mark_updated(&priv->temp.dts_margin);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int get_core_temp(struct peci_cputemp *priv, int core_index)
>>> +{
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    s32 core_dts_margin;
>>> +    int rc;
>>> +
>>> +    if (!need_update(&priv->temp.core[core_index]))
>>> +        return 0;
>>> +
>>> +    rc = get_tjmax(priv);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    msg.addr = priv->addr;
>>> +    msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
>>> +    msg.param = core_index;
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>> +
>>> +    /**
>>> +     * Processors return a value of the core DTS reading in 10.6 format
>>> +     * (10 bits signed decimal, 6 bits fractional).
>>> +     * Error codes:
>>> +     *   0x8000: General sensor error
>>> +     *   0x8001: Reserved
>>> +     *   0x8002: Underflow on reading value
>>> +     *   0x8003-0x81ff: Reserved
>>> +     */
>>> +    if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>>> +        return -EIO;
>>> +
>>> +    core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
>>> +
>>> +    priv->temp.core[core_index].value = priv->temp.tjmax.value +
>>> +                        core_dts_margin;
>>> +
>>> +    mark_updated(&priv->temp.core[core_index]);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>
>> There is a lot of duplication in those functions. Would it be possible
>> to find common code and use functions for it instead of duplicating
>> everything several times ?
>>
> 
> Are you pointing out this code?
> /**
>   * Processors return a value of the core DTS reading in 10.6 format
>   * (10 bits signed decimal, 6 bits fractional).
>   * Error codes:
>   *   0x8000: General sensor error
>   *   0x8001: Reserved
>   *   0x8002: Underflow on reading value
>   *   0x8003-0x81ff: Reserved
>   */
> if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>      return -EIO;
> 
> Then I'll rewrite it as a function. If not, please point out the duplication.
> 

There is lots of other duplication.
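
As one illustration, here is a minimal userspace sketch of a helper that
get_dts_margin() and get_core_temp() could share, assuming the byte layout
and the error range shown in the quoted code. The name
dts_raw_to_millidegree() and the DTS_ERR_* macros are hypothetical and not
part of the patch:

```c
#include <errno.h>
#include <stdint.h>

/* Error codes 0x8000-0x81ff, per the comment in the quoted patch. */
#define DTS_ERR_FIRST 0x8000
#define DTS_ERR_LAST  0x81ff

/* Sign-extend a 16-bit 10.6 fixed-point value and scale to millidegrees. */
static int32_t ten_dot_six_to_millidegree(int32_t val)
{
	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
}

/*
 * Assemble the raw reading from the two low bytes of a package-config
 * response, reject the error range, and convert the rest.
 * Returns 0 and stores millidegrees in *out, or -EIO on a sensor error.
 */
static int dts_raw_to_millidegree(const uint8_t pkg_config[4], int32_t *out)
{
	int32_t raw = (pkg_config[1] << 8) | pkg_config[0];

	if (raw >= DTS_ERR_FIRST && raw <= DTS_ERR_LAST)
		return -EIO;

	*out = ten_dot_six_to_millidegree(raw);
	return 0;
}
```

With something along these lines, each caller shrinks to the PECI
transaction plus one call, and the error-code comment lives in one place.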

>>> +static int find_core_index(struct peci_cputemp *priv, int channel)
>>> +{
>>> +    int core_channel = channel - DEFAULT_CHANNEL_NUMS;
>>> +    int idx, found = 0;
>>> +
>>> +    for (idx = 0; idx < priv->gen_info->core_max; idx++) {
>>> +        if (priv->core_mask & BIT(idx)) {
>>> +            if (core_channel == found)
>>> +                break;
>>> +
>>> +            found++;
>>> +        }
>>> +    }
>>> +
>>> +    return idx;
>>
>> What if nothing is found ?
>>
> 
> The core temperature group is registered only when check_resolved_cores() detects at least one core, so find_core_index() can be called only when priv->core_mask has a non-zero value. The 'nothing is found' case will not happen.
> 
That doesn't guarantee a match. If what you are saying is correct there should always be
a well defined match of channel -> idx, and the search should be unnecessary.
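
One way to get that well-defined mapping, sketched in userspace C under
the assumption that core_mask is final by the time the channels are
registered. build_core_map(), core_index() and CORE_MAX are hypothetical
names used only for this illustration:

```c
#include <stdint.h>

#define CORE_MAX 28	/* same value as CORE_MAX_ON_SKX in the quoted patch */

/*
 * Fill core_of_channel[] with the core index behind each core channel and
 * return the number of channels. Meant to run once, at registration time.
 */
static int build_core_map(uint32_t core_mask, int core_max,
			  int core_of_channel[CORE_MAX])
{
	int idx, channels = 0;

	for (idx = 0; idx < core_max && idx < CORE_MAX; idx++) {
		if (core_mask & (UINT32_C(1) << idx))
			core_of_channel[channels++] = idx;
	}

	return channels;
}

/* Constant-time lookup; -1 flags a channel that was never registered. */
static int core_index(const int core_of_channel[CORE_MAX], int channels,
		      int core_channel)
{
	if (core_channel < 0 || core_channel >= channels)
		return -1;

	return core_of_channel[core_channel];
}
```

The read callbacks then index the table directly, and an out-of-range
channel becomes an explicit error instead of an index equal to core_max.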

>>> +}
>>> +
>>> +static int cputemp_read_string(struct device *dev,
>>> +                   enum hwmon_sensor_types type,
>>> +                   u32 attr, int channel, const char **str)
>>> +{
>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>> +    int core_index;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_label:
>>> +        if (channel < DEFAULT_CHANNEL_NUMS) {
>>> +            *str = cputemp_label[channel];
>>> +        } else {
>>> +            core_index = find_core_index(priv, channel);
>>
>> FWIW, it might be better to pass channel - DEFAULT_CHANNEL_NUMS
>> as parameter.
>>
> 
> cputemp_read_string() is mapped to the read_string member of struct hwmon_ops, so the hwmon subsystem passes the channel parameter based on the registered channel order. Should I modify hwmon subsystem code?
> 

Huh ? Changing
	f(x) { y = x - const; }
...
	f(x);

to
	f(y) { }
...
	f(x - const);

requires a hwmon core change ? Really ?

>> What if find_core_index() returns priv->gen_info->core_max, ie
>> if it didn't find a core ?
>>
> 
> As explained above, find_core_index() always returns a correct index.
> 
>>> +            *str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
>>> +        }
>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static int cputemp_read_die(struct device *dev,
>>> +                enum hwmon_sensor_types type,
>>> +                u32 attr, int channel, long *val)
>>> +{
>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>> +    int rc;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_input:
>>> +        rc = get_die_temp(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.die.value;
>>> +        return 0;
>>> +    case hwmon_temp_max:
>>> +        rc = get_tcontrol(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tcontrol.value;
>>> +        return 0;
>>> +    case hwmon_temp_crit:
>>> +        rc = get_tjmax(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tjmax.value;
>>> +        return 0;
>>> +    case hwmon_temp_crit_hyst:
>>> +        rc = get_tcontrol(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static int cputemp_read_dts_margin(struct device *dev,
>>> +                   enum hwmon_sensor_types type,
>>> +                   u32 attr, int channel, long *val)
>>> +{
>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>> +    int rc;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_input:
>>> +        rc = get_dts_margin(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.dts_margin.value;
>>> +        return 0;
>>> +    case hwmon_temp_min:
>>> +        *val = 0;
>>> +        return 0;
>>
>> This attribute should not exist.
>>
> 
> This is an attribute of the DTS margin temperature, which reflects the thermal margin to Tcontrol of the CPU package. A value of '0' means the temperature has reached Tcontrol, the first level of thermal warning. If the CPU keeps getting hotter, the DTS margin shows a negative value until it reaches Tjmax. When the temperature finally reaches Tjmax, it shows the lower critical value, which lcrit indicates as the second level of thermal warning.
> 

The hwmon ABI reports chip values, not constants. Even though some drivers do
it, reporting a constant is always wrong.

>>> +    case hwmon_temp_lcrit:
>>> +        rc = get_tcontrol(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tcontrol.value - priv->temp.tjmax.value;
>>
>> lcrit is tcontrol - tjmax, and crit_hyst above is
>> tjmax - tcontrol ? How does this make sense ?
>>
> 
> Both Tjmax and Tcontrol have positive values, and Tjmax is always greater than Tcontrol. As explained above, lcrit of the DTS margin should show a negative value, which means the margin has dropped below '0'. On the other hand, crit_hyst of the Die temperature should show the absolute hysteresis value between Tcontrol and Tjmax.
> 
The hwmon ABI requires reporting of absolute temperatures in milli-degrees C.
Your statements make it very clear that this driver does not report
absolute temperatures. This is not acceptable.
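
One way to read the requirement above is that any value the hardware reports relative to a reference point gets converted to an absolute temperature in milli-degrees C before it reaches hwmon. The sketch below is not the driver's code. It assumes, based only on the description in this thread, that the DTS margin is the distance below Tcontrol (positive while the package is cooler than Tcontrol, negative above it) and that Tcontrol is already known as an absolute value. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical conversion (an assumption, not taken from the patch):
 *   tcontrol_mc - Tcontrol as an absolute temperature, milli-degrees C
 *   margin_mc   - DTS margin to Tcontrol, milli-degrees C, positive
 *                 while the package is cooler than Tcontrol
 * Stores the absolute temperature in *val and returns 0.
 */
static int read_abs_from_margin(long tcontrol_mc, long margin_mc, long *val)
{
	assert(val != NULL);

	*val = tcontrol_mc - margin_mc;
	return 0;
}
```

Under these assumptions, a margin of 10000 against a Tcontrol of 90000 yields 80000, and a margin of -5000 yields 95000, so the limit attributes could stay plain absolute values as well.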

>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static int cputemp_read_tcontrol(struct device *dev,
>>> +                 enum hwmon_sensor_types type,
>>> +                 u32 attr, int channel, long *val)
>>> +{
>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>> +    int rc;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_input:
>>> +        rc = get_tcontrol(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tcontrol.value;
>>> +        return 0;
>>> +    case hwmon_temp_crit:
>>> +        rc = get_tjmax(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tjmax.value;
>>> +        return 0;
>>
>> Am I missing something, or is the same temperature reported several times ?
>> tjmax is also reported as temp_crit cputemp_read_die(), for example.
>>
> 
> This driver provides multiple channels, and each channel has its own supplementary attributes. As you mentioned, the Die temperature channel and the Core temperature channel have their individual crit attributes, and they reflect the same value, Tjmax. It is not reporting several times but reporting the same value.
> 
Then maybe fold the functions accordingly ?
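
A rough user-space sketch of what such folding might look like follows. The enum, the struct and read_common_limit() are stand-ins invented for illustration, not hwmon or patch identifiers. The computed values simply mirror what the quoted functions return, without judging whether those values are the right ones to report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the hwmon attribute ids; the names are illustrative. */
enum attr { ATTR_INPUT, ATTR_MAX, ATTR_CRIT, ATTR_CRIT_HYST };

struct temps {
	long tcontrol;	/* milli-degrees C */
	long tjmax;	/* milli-degrees C */
};

/*
 * Hypothetical shared handler: resolves the limit attributes that the
 * die and core channels have in common. Returns 0 on success, or
 * -EOPNOTSUPP if 'attr' is not a shared limit, in which case the
 * per-channel code would handle it (ATTR_INPUT, for example).
 */
static int read_common_limit(const struct temps *t, enum attr attr, long *val)
{
	assert(t != NULL && val != NULL);

	switch (attr) {
	case ATTR_MAX:
		*val = t->tcontrol;
		return 0;
	case ATTR_CRIT:
		*val = t->tjmax;
		return 0;
	case ATTR_CRIT_HYST:
		*val = t->tjmax - t->tcontrol;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

Each channel handler would then deal only with its own input attribute and defer everything else to the shared function.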

>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static int cputemp_read_tthrottle(struct device *dev,
>>> +                  enum hwmon_sensor_types type,
>>> +                  u32 attr, int channel, long *val)
>>> +{
>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>> +    int rc;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_input:
>>> +        rc = get_tthrottle(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tthrottle.value;
>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static int cputemp_read_tjmax(struct device *dev,
>>> +                  enum hwmon_sensor_types type,
>>> +                  u32 attr, int channel, long *val)
>>> +{
>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>> +    int rc;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_input:
>>> +        rc = get_tjmax(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tjmax.value;
>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static int cputemp_read_core(struct device *dev,
>>> +                 enum hwmon_sensor_types type,
>>> +                 u32 attr, int channel, long *val)
>>> +{
>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>> +    int core_index = find_core_index(priv, channel);
>>> +    int rc;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_input:
>>> +        rc = get_core_temp(priv, core_index);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.core[core_index].value;
>>> +        return 0;
>>> +    case hwmon_temp_max:
>>> +        rc = get_tcontrol(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tcontrol.value;
>>> +        return 0;
>>> +    case hwmon_temp_crit:
>>> +        rc = get_tjmax(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tjmax.value;
>>> +        return 0;
>>> +    case hwmon_temp_crit_hyst:
>>> +        rc = get_tcontrol(priv);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>
>> There is again a lot of duplication in those functions.
>>
> 
> Each function is called from cputemp_read(), which is mapped to the read function pointer of the hwmon_ops struct. Since each channel has a different set of attributes, cputemp_read() calls an individual channel handler after checking the channel type. Of course, we could handle all attributes of all channels in a single function, but that approach would also need channel type checking code for each attribute.
> 
>>> +
>>> +static int cputemp_read(struct device *dev,
>>> +            enum hwmon_sensor_types type,
>>> +            u32 attr, int channel, long *val)
>>> +{
>>> +    switch (channel) {
>>> +    case channel_die:
>>> +        return cputemp_read_die(dev, type, attr, channel, val);
>>> +    case channel_dts_mrgn:
>>> +        return cputemp_read_dts_margin(dev, type, attr, channel, val);
>>> +    case channel_tcontrol:
>>> +        return cputemp_read_tcontrol(dev, type, attr, channel, val);
>>> +    case channel_tthrottle:
>>> +        return cputemp_read_tthrottle(dev, type, attr, channel, val);
>>> +    case channel_tjmax:
>>> +        return cputemp_read_tjmax(dev, type, attr, channel, val);
>>> +    default:
>>> +        if (channel < CPUTEMP_CHANNEL_NUMS)
>>> +            return cputemp_read_core(dev, type, attr, channel, val);
>>> +
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static umode_t cputemp_is_visible(const void *data,
>>> +                  enum hwmon_sensor_types type,
>>> +                  u32 attr, int channel)
>>> +{
>>> +    const struct peci_cputemp *priv = data;
>>> +
>>> +    if (priv->temp_config[channel] & BIT(attr))
>>> +        return 0444;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static const struct hwmon_ops cputemp_ops = {
>>> +    .is_visible = cputemp_is_visible,
>>> +    .read_string = cputemp_read_string,
>>> +    .read = cputemp_read,
>>> +};
>>> +
>>> +static int check_resolved_cores(struct peci_cputemp *priv)
>>> +{
>>> +    struct peci_rd_pci_cfg_local_msg msg;
>>> +    int rc;
>>> +
>>> +    if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
>>> +        return -EINVAL;
>>> +
>>> +    /* Get the RESOLVED_CORES register value */
>>> +    msg.addr = priv->addr;
>>> +    msg.bus = 1;
>>> +    msg.device = 30;
>>> +    msg.function = 3;
>>> +    msg.reg = 0xB4;
>>
>> Can this be made less magic with some defines ?
>>
> 
> Sure, will use defines instead.
> 
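As a sketch of the agreed change, the register location could be named as below. The numeric values are the ones in the quoted patch; the macro names, the struct and the fill function are hypothetical and only stand in for the real PECI message type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical names for the location of the RESOLVED_CORES register.
 * The values come from the patch; the macro names do not.
 */
#define RESOLVED_CORES_BUS	1
#define RESOLVED_CORES_DEV	30
#define RESOLVED_CORES_FUNC	3
#define RESOLVED_CORES_REG	0xB4

/* Minimal stand-in for the address fields of the PECI message. */
struct pci_cfg_addr {
	uint8_t  bus;
	uint8_t  device;
	uint8_t  function;
	uint16_t reg;
};

static void fill_resolved_cores_addr(struct pci_cfg_addr *a)
{
	assert(a != NULL);

	a->bus = RESOLVED_CORES_BUS;
	a->device = RESOLVED_CORES_DEV;
	a->function = RESOLVED_CORES_FUNC;
	a->reg = RESOLVED_CORES_REG;
}
```
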
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    priv->core_mask = msg.pci_config[3] << 24 |
>>> +              msg.pci_config[2] << 16 |
>>> +              msg.pci_config[1] << 8 |
>>> +              msg.pci_config[0];
>>> +
>>> +    if (!priv->core_mask)
>>> +        return -EAGAIN;
>>> +
>>> +    dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
>>> +    return 0;
>>> +}
>>> +
>>> +static int create_core_temp_info(struct peci_cputemp *priv)
>>> +{
>>> +    int rc, i;
>>> +
>>> +    rc = check_resolved_cores(priv);
>>> +    if (!rc) {
>>> +        for (i = 0; i < priv->gen_info->core_max; i++) {
>>> +            if (priv->core_mask & BIT(i)) {
>>> +                priv->temp_config[priv->config_idx++] =
>>> +                             config_table[channel_core];
>>> +            }
>>> +        }
>>> +    }
>>> +
>>> +    return rc;
>>> +}
>>> +
>>> +static int check_cpu_id(struct peci_cputemp *priv)
>>> +{
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    u32 cpu_id;
>>> +    int i, rc;
>>> +
>>> +    msg.addr = priv->addr;
>>> +    msg.index = MBX_INDEX_CPU_ID;
>>> +    msg.param = PKG_ID_CPU_ID;
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>> +
>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    if (!priv->gen_info)
>>> +        return -ENODEV;
>>> +
>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>> +    return 0;
>>> +}
>>> +
>>> +static int peci_cputemp_probe(struct peci_client *client)
>>> +{
>>> +    struct device *dev = &client->dev;
>>> +    struct peci_cputemp *priv;
>>> +    struct device *hwmon_dev;
>>> +    int rc;
>>> +
>>> +    if ((client->adapter->cmd_mask &
>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>> +        return -EINVAL;
>>
>> Does this mean there will be an error message for each non-supported CPU ?
>> Why ?
>>
> 
> For proper operation of this driver, PECI_CMD_GET_TEMP and PECI_CMD_RD_PKG_CFG have to be supported by a client CPU. PECI_CMD_GET_TEMP is provided as a default command, but PECI_CMD_RD_PKG_CFG depends on the PECI minor revision of a CPU package, so this check is needed.
> 

I do not question the check. I question the error message and error return value.
Why is it an _error_ if the CPU does not support the functionality, and why does
it have to be reported in the kernel log ?
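
A user-space sketch of a quiet capability check is shown below. It assumes that -ENODEV is an acceptable return value here, following the remark elsewhere in this review that -ENODEV is not an error. The command ids, struct adapter and check_required_cmds() are illustrative stand-ins, not the PECI core's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n)	(UINT32_C(1) << (n))

/* Illustrative command ids; the real enum lives in the PECI headers. */
enum { CMD_GET_TEMP = 2, CMD_RD_PKG_CFG = 3 };

#define REQUIRED_CMDS	(BIT(CMD_GET_TEMP) | BIT(CMD_RD_PKG_CFG))

/* Minimal stand-in for the adapter's capability mask. */
struct adapter {
	uint32_t cmd_mask;
};

/*
 * Hypothetical probe-time check: a client that lacks the required
 * commands is simply not a device for this driver, so return -ENODEV
 * and log nothing.
 */
static int check_required_cmds(const struct adapter *adap)
{
	assert(adap != NULL);

	if ((adap->cmd_mask & REQUIRED_CMDS) != REQUIRED_CMDS)
		return -ENODEV;

	return 0;
}
```

The check itself is unchanged from the patch; only the return value differs and the log message is gone.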

>>> +    }
>>> +
>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>> +    if (!priv)
>>> +        return -ENOMEM;
>>> +
>>> +    dev_set_drvdata(dev, priv);
>>> +    priv->client = client;
>>> +    priv->dev = dev;
>>> +    priv->addr = client->addr;
>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>> +
>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
>>> +         priv->cpu_no);
>>> +
>>> +    rc = check_cpu_id(priv);
>>> +    if (rc) {
>>> +        dev_err(dev, "Client CPU is not supported\n");
>>
>> -ENODEV is not an error, and should not result in an error message.
>> Besides, the error can also be propagated from peci core code,
>> and may well be something else.
>>
> 
> Got it. I'll remove the error message and add proper handling code to PECI core.
> 
>>> +        return rc;
>>> +    }
>>> +
>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_die];
>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
>>> +
>>> +    rc = create_core_temp_info(priv);
>>> +    if (rc)
>>> +        dev_dbg(dev, "Failed to create core temp info\n");
>>
>> Then what ? Shouldn't this result in probe deferral or something more useful
>> instead of just being ignored ?
>>
> 
> This driver can't support core temperature monitoring if a CPU doesn't support the PECI_CMD_RD_PCI_CFG_LOCAL command. In that case, it skips core temperature group creation and supports only basic temperature monitoring of Die, DTS margin, etc. I'll add this description as a comment.
> 

The message says "Failed to ...". It does not say "This CPU does not support ...".

>>> +
>>> +    priv->chip.ops = &cputemp_ops;
>>> +    priv->chip.info = priv->info;
>>> +
>>> +    priv->info[0] = &priv->temp_info;
>>> +
>>> +    priv->temp_info.type = hwmon_temp;
>>> +    priv->temp_info.config = priv->temp_config;
>>> +
>>> +    hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>> +                             priv->name,
>>> +                             priv,
>>> +                             &priv->chip,
>>> +                             NULL);
>>> +
>>> +    if (IS_ERR(hwmon_dev))
>>> +        return PTR_ERR(hwmon_dev);
>>> +
>>> +    dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
>>> +

Why does this message display the device name twice ?

>>> +    return 0;
>>> +}
>>> +
>>> +static const struct of_device_id peci_cputemp_of_table[] = {
>>> +    { .compatible = "intel,peci-cputemp" },
>>> +    { }
>>> +};
>>> +MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
>>> +
>>> +static struct peci_driver peci_cputemp_driver = {
>>> +    .probe  = peci_cputemp_probe,
>>> +    .driver = {
>>> +        .name           = "peci-cputemp",
>>> +        .of_match_table = of_match_ptr(peci_cputemp_of_table),
>>> +    },
>>> +};
>>> +module_peci_driver(peci_cputemp_driver);
>>> +
>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>> +MODULE_DESCRIPTION("PECI cputemp driver");
>>> +MODULE_LICENSE("GPL v2");
>>> diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
>>> new file mode 100644
>>> index 000000000000..78bf29cb2c4c
>>> --- /dev/null
>>> +++ b/drivers/hwmon/peci-dimmtemp.c
>>
>> FWIW, this should be two separate patches.
>>
> 
> Should I split out hwmon documents and dt bindings too?
> 
>>> @@ -0,0 +1,432 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +// Copyright (c) 2018 Intel Corporation
>>> +
>>> +#include <linux/delay.h>
>>> +#include <linux/hwmon.h>
>>> +#include <linux/hwmon-sysfs.h>
>>
>> Needed ?
>>
> 
> No. Will drop the line.
> 
>>> +#include <linux/jiffies.h>
>>> +#include <linux/module.h>
>>> +#include <linux/of_device.h>
>>> +#include <linux/peci.h>
>>> +#include <linux/workqueue.h>
>>> +
>>> +#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
>>> +
>>> +#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
>>> +#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
>>> +
>>> +#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
>>> +#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
>>> +
>>> +#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
>>> +#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
>>> +
>>> +#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
>>> +#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
>>> +
>>> +#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
>>> +
>>> +#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
>>> +
>>> +#define UPDATE_INTERVAL_MIN  HZ
>>> +
>>> +#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
>>> +#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
>>> +
>>> +enum cpu_gens {
>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>> +    CPU_GEN_MAX
>>> +};
>>> +
>>> +struct cpu_gen_info {
>>> +    u32 type;
>>> +    u32 cpu_id;
>>> +    u32 chan_rank_max;
>>> +    u32 dimm_idx_max;
>>> +};
>>> +
>>> +struct temp_data {
>>> +    bool valid;
>>> +    s32  value;
>>> +    unsigned long last_updated;
>>> +};
>>> +
>>> +struct peci_dimmtemp {
>>> +    struct peci_client *client;
>>> +    struct device *dev;
>>> +    struct workqueue_struct *work_queue;
>>> +    struct delayed_work work_handler;
>>> +    char name[PECI_NAME_SIZE];
>>> +    struct temp_data temp[DIMM_NUMS_MAX];
>>> +    u8 addr;
>>> +    uint cpu_no;
>>> +    const struct cpu_gen_info *gen_info;
>>> +    u32 dimm_mask;
>>> +    int retry_count;
>>> +    int channels;
>>> +    u32 temp_config[DIMM_NUMS_MAX + 1];
>>> +    struct hwmon_channel_info temp_info;
>>> +    const struct hwmon_channel_info *info[2];
>>> +    struct hwmon_chip_info chip;
>>> +};
>>> +
>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>> +    { .type  = CPU_GEN_HSX,
>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
>>> +    { .type  = CPU_GEN_BRX,
>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
>>> +    { .type  = CPU_GEN_SKX,
>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
>>> +};
>>> +
>>> +static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
>>> +    { "DIMM A0", "DIMM A1", "DIMM A2" },
>>> +    { "DIMM B0", "DIMM B1", "DIMM B2" },
>>> +    { "DIMM C0", "DIMM C1", "DIMM C2" },
>>> +    { "DIMM D0", "DIMM D1", "DIMM D2" },
>>> +    { "DIMM E0", "DIMM E1", "DIMM E2" },
>>> +    { "DIMM F0", "DIMM F1", "DIMM F2" },
>>> +    { "DIMM G0", "DIMM G1", "DIMM G2" },
>>> +    { "DIMM H0", "DIMM H1", "DIMM H2" },
>>> +};
>>> +
>>> +static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
>>> +             void *msg)
>>> +{
>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>> +}
>>> +
>>> +static int need_update(struct temp_data *temp)
>>> +{
>>> +    if (temp->valid &&
>>> +        time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>>> +        return 0;
>>> +
>>> +    return 1;
>>> +}
>>> +
>>> +static void mark_updated(struct temp_data *temp)
>>> +{
>>> +    temp->valid = true;
>>> +    temp->last_updated = jiffies;
>>> +}
>>
>> It might make sense to provide the duplicate functions in a core file.
>>
> 
> It is a temperature-monitoring-specific function, and it touches module-specific variables. Do you really think that this non-generic function should be moved to PECI core?
> 
>>> +
>>> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
>>> +{
>>> +    int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
>>> +    int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    int rc;
>>> +
>>> +    if (!need_update(&priv->temp[dimm_no]))
>>> +        return 0;
>>> +
>>> +    msg.addr = priv->addr;
>>> +    msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>> +    msg.param = chan_rank;
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
>>> +
>>> +    mark_updated(&priv->temp[dimm_no]);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
>>> +{
>>> +    int dimm_nums_max = priv->gen_info->chan_rank_max *
>>> +                priv->gen_info->dimm_idx_max;
>>> +    int idx, found = 0;
>>> +
>>> +    for (idx = 0; idx < dimm_nums_max; idx++) {
>>> +        if (priv->dimm_mask & BIT(idx)) {
>>> +            if (channel == found)
>>> +                break;
>>> +
>>> +            found++;
>>> +        }
>>> +    }
>>> +
>>> +    return idx;
>>> +}
>>
>> This again looks like duplicate code.
>>
> 
> find_dimm_number()? I'm sure it isn't.
> 
>>> +
>>> +static int dimmtemp_read_string(struct device *dev,
>>> +                enum hwmon_sensor_types type,
>>> +                u32 attr, int channel, const char **str)
>>> +{
>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>> +    int dimm_no, chan_rank, dimm_idx;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_label:
>>> +        dimm_no = find_dimm_number(priv, channel);
>>> +        chan_rank = dimm_no / dimm_idx_max;
>>> +        dimm_idx = dimm_no % dimm_idx_max;
>>> +        *str = dimmtemp_label[chan_rank][dimm_idx];
>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
>>> +             u32 attr, int channel, long *val)
>>> +{
>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>> +    int dimm_no = find_dimm_number(priv, channel);
>>> +    int rc;
>>> +
>>> +    switch (attr) {
>>> +    case hwmon_temp_input:
>>> +        rc = get_dimm_temp(priv, dimm_no);
>>> +        if (rc)
>>> +            return rc;
>>> +
>>> +        *val = priv->temp[dimm_no].value;
>>> +        return 0;
>>> +    default:
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +}
>>> +
>>> +static umode_t dimmtemp_is_visible(const void *data,
>>> +                   enum hwmon_sensor_types type,
>>> +                   u32 attr, int channel)
>>> +{
>>> +    switch (attr) {
>>> +    case hwmon_temp_label:
>>> +    case hwmon_temp_input:
>>> +        return 0444;
>>> +    default:
>>> +        return 0;
>>> +    }
>>> +}
>>> +
>>> +static const struct hwmon_ops dimmtemp_ops = {
>>> +    .is_visible = dimmtemp_is_visible,
>>> +    .read_string = dimmtemp_read_string,
>>> +    .read = dimmtemp_read,
>>> +};
>>> +
>>> +static int check_populated_dimms(struct peci_dimmtemp *priv)
>>> +{
>>> +    u32 chan_rank_max = priv->gen_info->chan_rank_max;
>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    int chan_rank, dimm_idx;
>>> +    int rc, channels = 0;
>>> +
>>> +    for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
>>> +        msg.addr = priv->addr;
>>> +        msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>> +        msg.param = chan_rank;
>>> +        msg.rx_len = 4;
>>> +
>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +        if (rc) {
>>> +            priv->dimm_mask = 0;
>>> +            return rc;
>>> +        }
>>> +
>>> +        for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
>>> +            if (msg.pkg_config[dimm_idx]) {
>>> +                priv->dimm_mask |= BIT(chan_rank *
>>> +                               chan_rank_max +
>>> +                               dimm_idx);
>>> +                channels++;
>>> +            }
>>> +        }
>>> +    }
>>> +
>>> +    if (!priv->dimm_mask)
>>> +        return -EAGAIN;
>>> +
>>> +    priv->channels = channels;
>>> +
>>> +    dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
>>> +    return 0;
>>> +}
>>> +
>>> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
>>> +{
>>> +    struct device *hwmon_dev;
>>> +    int rc, i;
>>> +
>>> +    rc = check_populated_dimms(priv);
>>> +    if (!rc) {
>>
>> Please handle error cases first.
>>
> 
> Sure, I'll rewrite it.
> 
>>> +        for (i = 0; i < priv->channels; i++)
>>> +            priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
>>> +
>>> +        priv->chip.ops = &dimmtemp_ops;
>>> +        priv->chip.info = priv->info;
>>> +
>>> +        priv->info[0] = &priv->temp_info;
>>> +
>>> +        priv->temp_info.type = hwmon_temp;
>>> +        priv->temp_info.config = priv->temp_config;
>>> +
>>> +        hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>> +                                 priv->name,
>>> +                                 priv,
>>> +                                 &priv->chip,
>>> +                                 NULL);
>>> +        rc = PTR_ERR_OR_ZERO(hwmon_dev);
>>> +        if (!rc)
>>> +            dev_dbg(priv->dev, "%s: sensor '%s'\n",
>>> +                dev_name(hwmon_dev), priv->name);
>>> +    } else if (rc == -EAGAIN) {
>>> +        if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
>>> +            queue_delayed_work(priv->work_queue,
>>> +                       &priv->work_handler,
>>> +                       DIMM_MASK_CHECK_DELAY_JIFFIES);
>>> +            priv->retry_count++;
>>> +            dev_dbg(priv->dev,
>>> +                "Deferred DIMM temp info creation\n");
>>> +        } else {
>>> +            rc = -ETIMEDOUT;
>>> +            dev_err(priv->dev,
>>> +                "Timeout retrying DIMM temp info creation\n");
>>> +        }
>>> +    }
>>> +
>>> +    return rc;
>>> +}
>>> +
>>> +static void create_dimm_temp_info_delayed(struct work_struct *work)
>>> +{
>>> +    struct delayed_work *dwork = to_delayed_work(work);
>>> +    struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
>>> +                          work_handler);
>>> +    int rc;
>>> +
>>> +    rc = create_dimm_temp_info(priv);
>>> +    if (rc && rc != -EAGAIN)
>>> +        dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
>>> +}
>>> +
>>> +static int check_cpu_id(struct peci_dimmtemp *priv)
>>> +{
>>> +    struct peci_rd_pkg_cfg_msg msg;
>>> +    u32 cpu_id;
>>> +    int i, rc;
>>> +
>>> +    msg.addr = priv->addr;
>>> +    msg.index = MBX_INDEX_CPU_ID;
>>> +    msg.param = PKG_ID_CPU_ID;
>>> +    msg.rx_len = 4;
>>> +
>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>> +
>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    if (!priv->gen_info)
>>> +        return -ENODEV;
>>> +
>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>> +    return 0;
>>> +}
>>
>> More duplicate code.
>>
> 
> Okay. In case of check_cpu_id(), it could be used as a generic PECI function. I'll move it into PECI core.
> 
>>> +
>>> +static int peci_dimmtemp_probe(struct peci_client *client)
>>> +{
>>> +    struct device *dev = &client->dev;
>>> +    struct peci_dimmtemp *priv;
>>> +    int rc;
>>> +
>>> +    if ((client->adapter->cmd_mask &
>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>
>> One set of ( ) is unnecessary on each side of the expression.
>>
> 
> '&' has a precedence over '!=' but '|' doesn't. I'll rewrite it to:
> 

Actually, that is wrong. You refer to address-of. Bit operations do have lower
precedence than comparisons. I stand corrected.

>      if (client->adapter->cmd_mask &
>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)) !=
>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)))
> 
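Since binary '&' binds less tightly than '!=' in C, dropping the outer parentheses changes the meaning of the check. The sketch below compares both forms. The BIT() macro is redefined locally, the bit positions are illustrative, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)	(UINT32_C(1) << (n))
#define NEEDED	(BIT(2) | BIT(3))	/* illustrative bit positions */

/* Fully parenthesized, as in the original patch. */
static int missing_cmds_paren(uint32_t mask)
{
	return (mask & NEEDED) != NEEDED;
}

/*
 * Outer parentheses dropped, as in the proposed rewrite. Because '!='
 * binds tighter than binary '&', this parses as
 * mask & (NEEDED != NEEDED), that is mask & 0, which is always 0.
 * Compilers may warn about the missing parentheses here.
 */
static int missing_cmds_bare(uint32_t mask)
{
	return mask & NEEDED != NEEDED;
}

/* Self-check: only the parenthesized form detects a missing command. */
static void check_precedence(void)
{
	assert(missing_cmds_paren(BIT(2)) == 1);
	assert(missing_cmds_bare(BIT(2)) == 0);
}
```

So the parentheses around the '&' expression in the original patch are required for the check to work.
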
>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>> +        return -EINVAL;
>>
>> Why is this "invalid", and why does it warrant an error message ?
>>
> 
> Should I use -EPERM? Any suggestion?
> 

Is it an _error_ if the CPU does not support this functionality ?

>>> +    }
>>> +
>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>> +    if (!priv)
>>> +        return -ENOMEM;
>>> +
>>> +    dev_set_drvdata(dev, priv);
>>> +    priv->client = client;
>>> +    priv->dev = dev;
>>> +    priv->addr = client->addr;
>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>
>> Is priv->addr guaranteed to be >= PECI_BASE_ADDR ?
> 
> Client address range validation will be done in peci_check_addr_validity() in PECI core before probing a device driver.
> 
>>> +
>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
>>> +         priv->cpu_no);
>>> +
>>> +    rc = check_cpu_id(priv);
>>> +    if (rc) {
>>> +        dev_err(dev, "Client CPU is not supported\n");
>>
>> Or the peci command failed.
>>
> 
> I'll remove the error message and add proper handling code to PECI core for each error type.
> 
>>> +        return rc;
>>> +    }
>>> +
>>> +    priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
>>> +    if (!priv->work_queue)
>>> +        return -ENOMEM;
>>> +
>>> +    INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
>>> +
>>> +    rc = create_dimm_temp_info(priv);
>>> +    if (rc && rc != -EAGAIN) {
>>> +        dev_err(dev, "Failed to create DIMM temp info\n");
>>> +        goto err_free_wq;
>>> +    }
>>> +
>>> +    return 0;
>>> +
>>> +err_free_wq:
>>> +    destroy_workqueue(priv->work_queue);
>>> +    return rc;
>>> +}
>>> +
>>> +static int peci_dimmtemp_remove(struct peci_client *client)
>>> +{
>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
>>> +
>>> +    cancel_delayed_work(&priv->work_handler);
>>
>> cancel_delayed_work_sync() ?
>>
> 
> Yes, it would be safer. Will fix it.
> 
>>> +    destroy_workqueue(priv->work_queue);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static const struct of_device_id peci_dimmtemp_of_table[] = {
>>> +    { .compatible = "intel,peci-dimmtemp" },
>>> +    { }
>>> +};
>>> +MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
>>> +
>>> +static struct peci_driver peci_dimmtemp_driver = {
>>> +    .probe  = peci_dimmtemp_probe,
>>> +    .remove = peci_dimmtemp_remove,
>>> +    .driver = {
>>> +        .name           = "peci-dimmtemp",
>>> +        .of_match_table = of_match_ptr(peci_dimmtemp_of_table),
>>> +    },
>>> +};
>>> +module_peci_driver(peci_dimmtemp_driver);
>>> +
>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>> +MODULE_DESCRIPTION("PECI dimmtemp driver");
>>> +MODULE_LICENSE("GPL v2");
>>> -- 
>>> 2.16.2
>>>


* Re: [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  2018-04-11 11:51   ` Joel Stanley
@ 2018-04-12  2:03     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-12  2:03 UTC (permalink / raw)
  To: Joel Stanley, Ryan Chen
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

Hello Joel,

Thanks for sharing your time. Please see my answers inline.

On 4/11/2018 4:51 AM, Joel Stanley wrote:
> Hello Jae,
> 
> On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
>> This commit adds PECI adapter driver implementation for Aspeed
>> AST24xx/AST25xx.
> 
> The driver is looking good!
> 
> It looks like you've done some kind of review that we weren't allowed
> to see, which is a double edged sword - I might be asking about things
> that you've already spoken about with someone else.
> 
> I'm only just learning about PECI, but I do have some general comments below.
> 

Yes, there was a private review round between v2 and v3. I know it's an
unusual process, but it was requested. Hopefully the change logs in the
cover letter give at least a rough picture of the details. Thanks for your
comments.

>> ---
>>   drivers/peci/Kconfig       |  28 +++
>>   drivers/peci/Makefile      |   3 +
>>   drivers/peci/peci-aspeed.c | 504 +++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 535 insertions(+)
>>   create mode 100644 drivers/peci/peci-aspeed.c
>>
>> diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
>> index 1fbc13f9e6c2..0e33420365de 100644
>> --- a/drivers/peci/Kconfig
>> +++ b/drivers/peci/Kconfig
>> @@ -14,4 +14,32 @@ config PECI
>>            processors and chipset components to external monitoring or control
>>            devices.
>>
>> +         If you want PECI support, you should say Y here and also to the
>> +         specific driver for your bus adapter(s) below.
>> +
>> +if PECI
>> +
>> +#
>> +# PECI hardware bus configuration
>> +#
>> +
>> +menu "PECI Hardware Bus support"
>> +
>> +config PECI_ASPEED
>> +       tristate "Aspeed AST24xx/AST25xx PECI support"
> 
> I think just saying ASPEED PECI support is enough. That way if the
> next ASPEED SoC happens to have PECI we don't need to update all of
> the help text :)
> 

Agreed. I'll change the description.

>> +       select REGMAP_MMIO
>> +       depends on OF
>> +       depends on ARCH_ASPEED || COMPILE_TEST
>> +       help
>> +         Say Y here if you want support for the Platform Environment Control
>> +         Interface (PECI) bus adapter driver on the Aspeed AST24XX and AST25XX
>> +         SoCs.
>> +
>> +         This support is also available as a module.  If so, the module
>> +         will be called peci-aspeed.
>> +
>> +endmenu
>> +
>> +endif # PECI
>> +
>>   endmenu
>> diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
>> index 9e8615e0d3ff..886285e69765 100644
>> --- a/drivers/peci/Makefile
>> +++ b/drivers/peci/Makefile
>> @@ -4,3 +4,6 @@
>>
>>   # Core functionality
>>   obj-$(CONFIG_PECI)             += peci-core.o
>> +
>> +# Hardware specific bus drivers
>> +obj-$(CONFIG_PECI_ASPEED)      += peci-aspeed.o
>> diff --git a/drivers/peci/peci-aspeed.c b/drivers/peci/peci-aspeed.c
>> new file mode 100644
>> index 000000000000..be2a1f327eb1
>> --- /dev/null
>> +++ b/drivers/peci/peci-aspeed.c
>> @@ -0,0 +1,504 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (C) 2012-2017 ASPEED Technology Inc.
>> +// Copyright (c) 2018 Intel Corporation
>> +
>> +#include <linux/clk.h>
>> +#include <linux/delay.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/jiffies.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/peci.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/regmap.h>
>> +
>> +#define DUMP_DEBUG 0
>> +
>> +/* Aspeed PECI Registers */
>> +#define AST_PECI_CTRL     0x00
> 
> Nit: we use ASPEED instead of AST in the upstream kernel to distinguish
> from the aspeed sdk drivers. If you feel strongly about this then I
> won't insist you change.
> 

Okay then, better change it now than later. Will change all defines.

>> +#define AST_PECI_TIMING   0x04
>> +#define AST_PECI_CMD      0x08
>> +#define AST_PECI_CMD_CTRL 0x0c
>> +#define AST_PECI_EXP_FCS  0x10
>> +#define AST_PECI_CAP_FCS  0x14
>> +#define AST_PECI_INT_CTRL 0x18
>> +#define AST_PECI_INT_STS  0x1c
>> +#define AST_PECI_W_DATA0  0x20
>> +#define AST_PECI_W_DATA1  0x24
>> +#define AST_PECI_W_DATA2  0x28
>> +#define AST_PECI_W_DATA3  0x2c
>> +#define AST_PECI_R_DATA0  0x30
>> +#define AST_PECI_R_DATA1  0x34
>> +#define AST_PECI_R_DATA2  0x38
>> +#define AST_PECI_R_DATA3  0x3c
>> +#define AST_PECI_W_DATA4  0x40
>> +#define AST_PECI_W_DATA5  0x44
>> +#define AST_PECI_W_DATA6  0x48
>> +#define AST_PECI_W_DATA7  0x4c
>> +#define AST_PECI_R_DATA4  0x50
>> +#define AST_PECI_R_DATA5  0x54
>> +#define AST_PECI_R_DATA6  0x58
>> +#define AST_PECI_R_DATA7  0x5c
>> +
>> +/* AST_PECI_CTRL - 0x00 : Control Register */
>> +#define PECI_CTRL_SAMPLING_MASK     GENMASK(19, 16)
>> +#define PECI_CTRL_SAMPLING(x)       (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
>> +#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)
>> +#define PECI_CTRL_READ_MODE_MASK    GENMASK(13, 12)
>> +#define PECI_CTRL_READ_MODE(x)      (((x) << 12) & PECI_CTRL_READ_MODE_MASK)
>> +#define PECI_CTRL_READ_MODE_GET(x)  (((x) & PECI_CTRL_READ_MODE_MASK) >> 12)
>> +#define PECI_CTRL_READ_MODE_COUNT   BIT(12)
>> +#define PECI_CTRL_READ_MODE_DBG     BIT(13)
>> +#define PECI_CTRL_CLK_SOURCE_MASK   BIT(11)
>> +#define PECI_CTRL_CLK_SOURCE(x)     (((x) << 11) & PECI_CTRL_CLK_SOURCE_MASK)
>> +#define PECI_CTRL_CLK_SOURCE_GET(x) (((x) & PECI_CTRL_CLK_SOURCE_MASK) >> 11)
>> +#define PECI_CTRL_CLK_DIV_MASK      GENMASK(10, 8)
>> +#define PECI_CTRL_CLK_DIV(x)        (((x) << 8) & PECI_CTRL_CLK_DIV_MASK)
>> +#define PECI_CTRL_CLK_DIV_GET(x)    (((x) & PECI_CTRL_CLK_DIV_MASK) >> 8)
>> +#define PECI_CTRL_INVERT_OUT        BIT(7)
>> +#define PECI_CTRL_INVERT_IN         BIT(6)
>> +#define PECI_CTRL_BUS_CONTENT_EN    BIT(5)
>> +#define PECI_CTRL_PECI_EN           BIT(4)
>> +#define PECI_CTRL_PECI_CLK_EN       BIT(0)
> 
> I know these come from the ASPEED sdk driver. Do we need them all?
> 

The driver doesn't use all of them, but I think it's better to keep them
for future bug fixes or improvements.

>> +
>> +/* AST_PECI_TIMING - 0x04 : Timing Negotiation Register */
>> +#define PECI_TIMING_MESSAGE_MASK   GENMASK(15, 8)
>> +#define PECI_TIMING_MESSAGE(x)     (((x) << 8) & PECI_TIMING_MESSAGE_MASK)
>> +#define PECI_TIMING_MESSAGE_GET(x) (((x) & PECI_TIMING_MESSAGE_MASK) >> 8)
>> +#define PECI_TIMING_ADDRESS_MASK   GENMASK(7, 0)
>> +#define PECI_TIMING_ADDRESS(x)     ((x) & PECI_TIMING_ADDRESS_MASK)
>> +#define PECI_TIMING_ADDRESS_GET(x) ((x) & PECI_TIMING_ADDRESS_MASK)
>> +
>> +/* AST_PECI_CMD - 0x08 : Command Register */
>> +#define PECI_CMD_PIN_MON    BIT(31)
>> +#define PECI_CMD_STS_MASK   GENMASK(27, 24)
>> +#define PECI_CMD_STS_GET(x) (((x) & PECI_CMD_STS_MASK) >> 24)
>> +#define PECI_CMD_FIRE       BIT(0)
>> +
>> +/* AST_PECI_LEN - 0x0C : Read/Write Length Register */
>> +#define PECI_AW_FCS_EN       BIT(31)
>> +#define PECI_READ_LEN_MASK   GENMASK(23, 16)
>> +#define PECI_READ_LEN(x)     (((x) << 16) & PECI_READ_LEN_MASK)
>> +#define PECI_WRITE_LEN_MASK  GENMASK(15, 8)
>> +#define PECI_WRITE_LEN(x)    (((x) << 8) & PECI_WRITE_LEN_MASK)
>> +#define PECI_TAGET_ADDR_MASK GENMASK(7, 0)
>> +#define PECI_TAGET_ADDR(x)   ((x) & PECI_TAGET_ADDR_MASK)
>> +
>> +/* AST_PECI_EXP_FCS - 0x10 : Expected FCS Data Register */
>> +#define PECI_EXPECT_READ_FCS_MASK      GENMASK(23, 16)
>> +#define PECI_EXPECT_READ_FCS_GET(x)    (((x) & PECI_EXPECT_READ_FCS_MASK) >> 16)
>> +#define PECI_EXPECT_AW_FCS_AUTO_MASK   GENMASK(15, 8)
>> +#define PECI_EXPECT_AW_FCS_AUTO_GET(x) (((x) & PECI_EXPECT_AW_FCS_AUTO_MASK) \
>> +                                       >> 8)
>> +#define PECI_EXPECT_WRITE_FCS_MASK     GENMASK(7, 0)
>> +#define PECI_EXPECT_WRITE_FCS_GET(x)   ((x) & PECI_EXPECT_WRITE_FCS_MASK)
>> +
>> +/* AST_PECI_CAP_FCS - 0x14 : Captured FCS Data Register */
>> +#define PECI_CAPTURE_READ_FCS_MASK    GENMASK(23, 16)
>> +#define PECI_CAPTURE_READ_FCS_GET(x)  (((x) & PECI_CAPTURE_READ_FCS_MASK) >> 16)
>> +#define PECI_CAPTURE_WRITE_FCS_MASK   GENMASK(7, 0)
>> +#define PECI_CAPTURE_WRITE_FCS_GET(x) ((x) & PECI_CAPTURE_WRITE_FCS_MASK)
>> +
>> +/* AST_PECI_INT_CTRL/STS - 0x18/0x1c : Interrupt Register */
>> +#define PECI_INT_TIMING_RESULT_MASK GENMASK(31, 30)
>> +#define PECI_INT_TIMEOUT            BIT(4)
>> +#define PECI_INT_CONNECT            BIT(3)
>> +#define PECI_INT_W_FCS_BAD          BIT(2)
>> +#define PECI_INT_W_FCS_ABORT        BIT(1)
>> +#define PECI_INT_CMD_DONE           BIT(0)
>> +
>> +struct aspeed_peci {
>> +       struct peci_adapter     adaper;
>> +       struct device           *dev;
>> +       struct regmap           *regmap;
>> +       int                     irq;
>> +       struct completion       xfer_complete;
>> +       u32                     status;
>> +       u32                     cmd_timeout_ms;
>> +};
>> +
>> +#define PECI_INT_MASK  (PECI_INT_TIMEOUT | PECI_INT_CONNECT | \
>> +                       PECI_INT_W_FCS_BAD | PECI_INT_W_FCS_ABORT | \
>> +                       PECI_INT_CMD_DONE)
>> +
>> +#define PECI_IDLE_CHECK_TIMEOUT_MS      50
>> +#define PECI_IDLE_CHECK_INTERVAL_MS     10
>> +
>> +#define PECI_RD_SAMPLING_POINT_DEFAULT  8
>> +#define PECI_RD_SAMPLING_POINT_MAX      15
>> +#define PECI_CLK_DIV_DEFAULT            0
>> +#define PECI_CLK_DIV_MAX                7
>> +#define PECI_MSG_TIMING_NEGO_DEFAULT    1
>> +#define PECI_MSG_TIMING_NEGO_MAX        255
>> +#define PECI_ADDR_TIMING_NEGO_DEFAULT   1
>> +#define PECI_ADDR_TIMING_NEGO_MAX       255
>> +#define PECI_CMD_TIMEOUT_MS_DEFAULT     1000
>> +#define PECI_CMD_TIMEOUT_MS_MAX         60000
>> +
>> +static int aspeed_peci_xfer_native(struct aspeed_peci *priv,
>> +                                  struct peci_xfer_msg *msg)
>> +{
>> +       long err, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
>> +       u32 peci_head, peci_state, rx_data, cmd_sts;
>> +       ktime_t start, end;
>> +       s64 elapsed_ms;
>> +       int i, rc = 0;
>> +       uint reg;
>> +
>> +       start = ktime_get();
>> +
>> +       /* Check command sts and bus idle state */
>> +       while (!regmap_read(priv->regmap, AST_PECI_CMD, &cmd_sts) &&
>> +              (cmd_sts & (PECI_CMD_STS_MASK | PECI_CMD_PIN_MON))) {
>> +               end = ktime_get();
>> +               elapsed_ms = ktime_to_ms(ktime_sub(end, start));
>> +               if (elapsed_ms >= PECI_IDLE_CHECK_TIMEOUT_MS) {
>> +                       dev_dbg(priv->dev, "Timeout waiting for idle state!\n");
>> +                       return -ETIMEDOUT;
>> +               }
>> +
>> +               usleep_range(PECI_IDLE_CHECK_INTERVAL_MS * 1000,
>> +                            (PECI_IDLE_CHECK_INTERVAL_MS * 1000) + 1000);
>> +       };
> 
> Could the above use regmap_read_poll_timeout instead?
> 

Yes, that would be better. I'll rewrite it.

>> +
>> +       reinit_completion(&priv->xfer_complete);
>> +
>> +       peci_head = PECI_TAGET_ADDR(msg->addr) |
>> +                                   PECI_WRITE_LEN(msg->tx_len) |
>> +                                   PECI_READ_LEN(msg->rx_len);
>> +
>> +       rc = regmap_write(priv->regmap, AST_PECI_CMD_CTRL, peci_head);
>> +       if (rc)
>> +               return rc;
>> +
>> +       for (i = 0; i < msg->tx_len; i += 4) {
>> +               reg = i < 16 ? AST_PECI_W_DATA0 + i % 16 :
>> +                              AST_PECI_W_DATA4 + i % 16;
>> +               rc = regmap_write(priv->regmap, reg,
>> +                                 (msg->tx_buf[i + 3] << 24) |
>> +                                 (msg->tx_buf[i + 2] << 16) |
>> +                                 (msg->tx_buf[i + 1] << 8) |
>> +                                 msg->tx_buf[i + 0]);
> 
> That looks like an endian swap. Can we do something like this?
> 
>   regmap_write(map, reg, cpu_to_be32p((void *)msg->tx_buff))
> 

Yes, it could be simplified like you pointed out. Will change it.
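
For reference, a small user-space sketch (not part of the patch) of what
the quoted packing does. It places tx_buf[i + 0] in bits 7..0, so it reads
the four bytes as a little-endian 32-bit value. Under that reading, a helper
with little-endian semantics (for example get_unaligned_le32(), which is my
assumption and not something agreed in this thread) would be the matching
one, and it is worth double-checking before switching to a big-endian
helper. The function names below are hypothetical.

```c
#include <stdint.h>

/* Pack four bytes the way the quoted loop does: byte 0 lands in bits 7..0. */
static uint32_t pack_like_patch(const uint8_t *buf)
{
	return ((uint32_t)buf[3] << 24) | ((uint32_t)buf[2] << 16) |
	       ((uint32_t)buf[1] << 8) | (uint32_t)buf[0];
}

/* Big-endian interpretation, for contrast: byte 0 lands in bits 31..24. */
static uint32_t load_be32(const uint8_t *buf)
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}
```

With the bytes 0x11 0x22 0x33 0x44, the first function yields 0x44332211
and the second yields 0x11223344, independent of the host CPU's endianness.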

>> +               if (rc)
>> +                       return rc;
>> +       }
>> +
>> +       dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
>> +#if DUMP_DEBUG
> 
> Having #defines is frowned upon. I think print_hex_dump_debug will do
> what you want here.
> 

Got it. I'll replace it with print_hex_dump_debug() after removing the 
define.

>> +       print_hex_dump(KERN_DEBUG, "TX : ", DUMP_PREFIX_NONE, 16, 1,
>> +                      msg->tx_buf, msg->tx_len, true);
>> +#endif
>> +
>> +       rc = regmap_write(priv->regmap, AST_PECI_CMD, PECI_CMD_FIRE);
>> +       if (rc)
>> +               return rc;
>> +
>> +       err = wait_for_completion_interruptible_timeout(&priv->xfer_complete,
>> +                                                       timeout);
>> +
>> +       dev_dbg(priv->dev, "INT_STS : 0x%08x\n", priv->status);
>> +       if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
>> +               dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
>> +                       PECI_CMD_STS_GET(peci_state));
>> +       else
>> +               dev_dbg(priv->dev, "PECI_STATE : read error\n");
>> +
>> +       rc = regmap_write(priv->regmap, AST_PECI_CMD, 0);
>> +       if (rc)
>> +               return rc;
>> +
>> +       if (err <= 0 || !(priv->status & PECI_INT_CMD_DONE)) {
>> +               if (err < 0) { /* -ERESTARTSYS */
>> +                       return (int)err;
>> +               } else if (err == 0) {
>> +                       dev_dbg(priv->dev, "Timeout waiting for a response!\n");
>> +                       return -ETIMEDOUT;
>> +               }
>> +
>> +               dev_dbg(priv->dev, "No valid response!\n");
>> +               return -EIO;
>> +       }
>> +
>> +       for (i = 0; i < msg->rx_len; i++) {
>> +               u8 byte_offset = i % 4;
>> +
>> +               if (byte_offset == 0) {
>> +                       reg = i < 16 ? AST_PECI_R_DATA0 + i % 16 :
>> +                                      AST_PECI_R_DATA4 + i % 16;
> 
> I find this hard to read. Use a few more lines to make it clear what
> your code is doing.
> 
> Actually, the entire for loop is cryptic. I understand what it's doing
> now. Can you rework it to make it more readable? You follow a similar
> pattern above in the write case.
> 

The intention was to make the loop run only up to rx_len, but it's not
efficient. I'll rewrite it as you suggested.
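
One possible shape for the rework, as a user-space sketch only. It assumes
the register layout from the defines above (R_DATA0..3 at 0x30..0x3c hold
bytes 0..15, R_DATA4..7 at 0x50..0x5c hold bytes 16..31) and models the
register block as a plain array indexed by offset / 4. The helper names
are hypothetical, and the real driver would read through regmap or readl()
instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets copied from the quoted patch. */
#define AST_PECI_R_DATA0 0x30
#define AST_PECI_R_DATA4 0x50

/* Map a byte index (0..31) to the offset of the 32-bit register holding it. */
static unsigned int rx_reg_for_byte(size_t i)
{
	unsigned int base = i < 16 ? AST_PECI_R_DATA0 : AST_PECI_R_DATA4;

	/* Round the position within the 16-byte bank down to a word boundary. */
	return base + (unsigned int)((i % 16) & ~(size_t)3);
}

/*
 * Copy rx_len bytes out of a simulated register file.
 * regs[] is indexed by register offset / 4; byte 0 of each word is bits 7..0.
 */
static void unpack_rx(const uint32_t *regs, uint8_t *rx_buf, size_t rx_len)
{
	for (size_t i = 0; i < rx_len; i += 4) {
		uint32_t word = regs[rx_reg_for_byte(i) / 4];

		/* Stop early on the last word if rx_len is not a multiple of 4. */
		for (size_t j = 0; j < 4 && i + j < rx_len; j++)
			rx_buf[i + j] = (uint8_t)(word >> (8 * j));
	}
}
```

Splitting the address calculation from the byte extraction keeps one
register read per word, as in the original loop, while making the two
steps visible separately.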

>> +                       rc = regmap_read(priv->regmap, reg, &rx_data);
>> +                       if (rc)
>> +                               return rc;
>> +               }
>> +
>> +               msg->rx_buf[i] = (u8)(rx_data >> (byte_offset << 3));
>> +       }
>> +
>> +#if DUMP_DEBUG
>> +       print_hex_dump(KERN_DEBUG, "RX : ", DUMP_PREFIX_NONE, 16, 1,
>> +                      msg->rx_buf, msg->rx_len, true);
>> +#endif
>> +       if (!regmap_read(priv->regmap, AST_PECI_CMD, &peci_state))
>> +               dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
>> +                       PECI_CMD_STS_GET(peci_state));
>> +       else
>> +               dev_dbg(priv->dev, "PECI_STATE : read error\n");
> 
> Given the regmap_read is always going to be a memory read on the
> aspeed, I can't think of a situation where the read will fail.
> 
> On that note, is there a reason you are using regmap and not just
> accessing the hardware directly? regmap imposes a number of pointer
> lookups and tests each time you do a read or write.
> 

No specific reason. regmap adds some overhead, as you mentioned, but it
also provides some advantages: simpler register access, endianness
handling, and register dumps at run time. I won't insist on using regmap
if you prefer raw readl and writel. Do you?

>> +       dev_dbg(priv->dev, "------------------------\n");
>> +
>> +       return rc;
>> +}
>> +
>> +static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
>> +{
>> +       struct aspeed_peci *priv = arg;
>> +       u32 status_ack = 0;
>> +
>> +       if (regmap_read(priv->regmap, AST_PECI_INT_STS, &priv->status))
>> +               return IRQ_NONE;
> 
> Again, a memory mapped read won't fail. How about we check that the
> regmap is working once in your _probe() function, and assume it will
> continue working from there (or remove the regmap abstraction all
> together).
> 

You are right. I'll keep this check only in the _probe() function and
remove all the redundant error checking on memory mapped IO.

>> +
>> +       /* Be noted that multiple interrupt bits can be set at the same time */
>> +       if (priv->status & PECI_INT_TIMEOUT) {
>> +               dev_dbg(priv->dev, "PECI_INT_TIMEOUT\n");
>> +               status_ack |= PECI_INT_TIMEOUT;
>> +       }
>> +
>> +       if (priv->status & PECI_INT_CONNECT) {
>> +               dev_dbg(priv->dev, "PECI_INT_CONNECT\n");
>> +               status_ack |= PECI_INT_CONNECT;
>> +       }
>> +
>> +       if (priv->status & PECI_INT_W_FCS_BAD) {
>> +               dev_dbg(priv->dev, "PECI_INT_W_FCS_BAD\n");
>> +               status_ack |= PECI_INT_W_FCS_BAD;
>> +       }
>> +
>> +       if (priv->status & PECI_INT_W_FCS_ABORT) {
>> +               dev_dbg(priv->dev, "PECI_INT_W_FCS_ABORT\n");
>> +               status_ack |= PECI_INT_W_FCS_ABORT;
>> +       }
> 
> All of this code is for debugging only. Do you want to put it behind
> some kind of conditional?
> 

This code also updates the status_ack variable, which is written back to
acknowledge each interrupt bit.
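
If the per-bit debug prints were ever dropped or gated, the acknowledge
value could be computed with a single mask. The user-space sketch below
(not part of the patch) uses the bit definitions from the quoted code and
two hypothetical function names to show that both forms give the same
result.

```c
#include <stdint.h>

#define BIT(n)               (1u << (n))
#define PECI_INT_TIMEOUT     BIT(4)
#define PECI_INT_CONNECT     BIT(3)
#define PECI_INT_W_FCS_BAD   BIT(2)
#define PECI_INT_W_FCS_ABORT BIT(1)
#define PECI_INT_CMD_DONE    BIT(0)
#define PECI_INT_MASK  (PECI_INT_TIMEOUT | PECI_INT_CONNECT | \
			PECI_INT_W_FCS_BAD | PECI_INT_W_FCS_ABORT | \
			PECI_INT_CMD_DONE)

/* Branch-per-bit form, as in the quoted handler (debug prints omitted). */
static uint32_t ack_per_bit(uint32_t status)
{
	uint32_t status_ack = 0;

	if (status & PECI_INT_TIMEOUT)
		status_ack |= PECI_INT_TIMEOUT;
	if (status & PECI_INT_CONNECT)
		status_ack |= PECI_INT_CONNECT;
	if (status & PECI_INT_W_FCS_BAD)
		status_ack |= PECI_INT_W_FCS_BAD;
	if (status & PECI_INT_W_FCS_ABORT)
		status_ack |= PECI_INT_W_FCS_ABORT;
	if (status & PECI_INT_CMD_DONE)
		status_ack |= PECI_INT_CMD_DONE;

	return status_ack;
}

/* Single-mask form: keep only the interrupt bits the driver handles. */
static uint32_t ack_masked(uint32_t status)
{
	return status & PECI_INT_MASK;
}
```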

>> +
>> +       /**
>> +        * All commands should be ended up with a PECI_INT_CMD_DONE bit set
>> +        * even in an error case.
>> +        */
>> +       if (priv->status & PECI_INT_CMD_DONE) {
>> +               dev_dbg(priv->dev, "PECI_INT_CMD_DONE\n");
>> +               status_ack |= PECI_INT_CMD_DONE;
>> +               complete(&priv->xfer_complete);
>> +       }
>> +
>> +       if (regmap_write(priv->regmap, AST_PECI_INT_STS, status_ack))
>> +               return IRQ_NONE;
>> +
>> +       return IRQ_HANDLED;
>> +}
>> +
>> +static int aspeed_peci_init_ctrl(struct aspeed_peci *priv)
>> +{
>> +       u32 msg_timing_nego, addr_timing_nego, rd_sampling_point;
>> +       u32 clk_freq, clk_divisor, clk_div_val = 0;
>> +       struct clk *clkin;
>> +       int ret;
>> +
>> +       clkin = devm_clk_get(priv->dev, NULL);
>> +       if (IS_ERR(clkin)) {
>> +               dev_err(priv->dev, "Failed to get clk source.\n");
>> +               return PTR_ERR(clkin);
>> +       }
>> +
>> +       ret = of_property_read_u32(priv->dev->of_node, "clock-frequency",
>> +                                  &clk_freq);
>> +       if (ret < 0) {
>> +               dev_err(priv->dev,
>> +                       "Could not read clock-frequency property.\n");
>> +               return ret;
>> +       }
>> +
>> +       clk_divisor = clk_get_rate(clkin) / clk_freq;
>> +       devm_clk_put(priv->dev, clkin);
>> +
>> +       while ((clk_divisor >> 1) && (clk_div_val < PECI_CLK_DIV_MAX))
>> +               clk_div_val++;
> 
> We have a framework for doing clocks in the kernel. Would it make
> sense to write a driver for this clock and add it to
> drivers/clk/clk-aspeed.c?
> 

Unlike other HW modules, PECI uses the 24MHz external clock as its clock
source. Should it use clk-aspeed.c in this case?
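
Separately from the clock framework question: as quoted, the while loop
never modifies clk_divisor, so for any divisor of 2 or more it runs until
clk_div_val reaches PECI_CLK_DIV_MAX. Assuming the intent is
floor(log2(clk_divisor)) clamped to the 3-bit field (my reading, not
confirmed in this thread), a user-space sketch with a hypothetical helper
name could look like this:

```c
#include <stdint.h>

#define PECI_CLK_DIV_MAX 7

/* Return floor(log2(clk_divisor)) clamped to PECI_CLK_DIV_MAX; 0 if < 2. */
static uint32_t peci_clk_div_val(uint32_t clk_divisor)
{
	uint32_t clk_div_val = 0;

	while ((clk_divisor >> 1) && clk_div_val < PECI_CLK_DIV_MAX) {
		clk_divisor >>= 1;	/* consume one power of two per step */
		clk_div_val++;
	}

	return clk_div_val;
}
```

For example, a 24MHz source with a hypothetical 2MHz clock-frequency gives
a divisor of 12, for which this helper returns 3.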

>> +
>> +       ret = of_property_read_u32(priv->dev->of_node, "msg-timing-nego",
>> +                                  &msg_timing_nego);
>> +       if (ret || msg_timing_nego > PECI_MSG_TIMING_NEGO_MAX) {
>> +               dev_warn(priv->dev,
>> +                        "Invalid msg-timing-nego : %u, Use default : %u\n",
>> +                        msg_timing_nego, PECI_MSG_TIMING_NEGO_DEFAULT);
> 
> The property is optional so I suggest we don't print a message if it's
> not present. We certainly don't want to print a message saying
> "invalid".
> 
> The same comment applies to the other optional properties below.
> 

Agreed. I'll make it print out the message only when ret == 0 and 
msg_timing_nego > PECI_MSG_TIMING_NEGO_MAX.
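
A user-space sketch of that rule, assuming read_ret models the return
value of of_property_read_u32() (0 on success, a negative errno such as
-EINVAL when the property is absent). The helper name and the warned flag
are hypothetical and exist only to make the behaviour checkable. It also
avoids printing the value when the read failed, since in the quoted code
the variable is uninitialized in that case.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Pick the value for an optional u32 property.
 * - read_ret != 0: property absent or unreadable, use the default silently.
 * - value > max:   property present but out of range, warn and use default.
 */
static uint32_t pick_optional_u32(int read_ret, uint32_t value, uint32_t max,
				  uint32_t def, const char *name, bool *warned)
{
	*warned = false;

	if (read_ret)
		return def;

	if (value > max) {
		fprintf(stderr, "Invalid %s : %u, Use default : %u\n",
			name, (unsigned int)value, (unsigned int)def);
		*warned = true;
		return def;
	}

	return value;
}
```

The same helper shape would cover addr-timing-nego and rd-sampling-point;
cmd-timeout-ms needs an extra check because zero is also rejected there.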

>> +               msg_timing_nego = PECI_MSG_TIMING_NEGO_DEFAULT;
>> +       }
>> +
>> +       ret = of_property_read_u32(priv->dev->of_node, "addr-timing-nego",
>> +                                  &addr_timing_nego);
>> +       if (ret || addr_timing_nego > PECI_ADDR_TIMING_NEGO_MAX) {
>> +               dev_warn(priv->dev,
>> +                        "Invalid addr-timing-nego : %u, Use default : %u\n",
>> +                        addr_timing_nego, PECI_ADDR_TIMING_NEGO_DEFAULT);
>> +               addr_timing_nego = PECI_ADDR_TIMING_NEGO_DEFAULT;
>> +       }
>> +
>> +       ret = of_property_read_u32(priv->dev->of_node, "rd-sampling-point",
>> +                                  &rd_sampling_point);
>> +       if (ret || rd_sampling_point > PECI_RD_SAMPLING_POINT_MAX) {
>> +               dev_warn(priv->dev,
>> +                        "Invalid rd-sampling-point : %u. Use default : %u\n",
>> +                        rd_sampling_point,
>> +                        PECI_RD_SAMPLING_POINT_DEFAULT);
>> +               rd_sampling_point = PECI_RD_SAMPLING_POINT_DEFAULT;
>> +       }
>> +
>> +       ret = of_property_read_u32(priv->dev->of_node, "cmd-timeout-ms",
>> +                                  &priv->cmd_timeout_ms);
>> +       if (ret || priv->cmd_timeout_ms > PECI_CMD_TIMEOUT_MS_MAX ||
>> +           priv->cmd_timeout_ms == 0) {
>> +               dev_warn(priv->dev,
>> +                        "Invalid cmd-timeout-ms : %u. Use default : %u\n",
>> +                        priv->cmd_timeout_ms,
>> +                        PECI_CMD_TIMEOUT_MS_DEFAULT);
>> +               priv->cmd_timeout_ms = PECI_CMD_TIMEOUT_MS_DEFAULT;
>> +       }
>> +
>> +       ret = regmap_write(priv->regmap, AST_PECI_CTRL,
>> +                          PECI_CTRL_CLK_DIV(PECI_CLK_DIV_DEFAULT) |
>> +                          PECI_CTRL_PECI_CLK_EN);
>> +       if (ret)
>> +               return ret;
>> +
>> +       usleep_range(1000, 5000);
> 
> Can we probe in parallel? If not, putting a sleep in the _probe will
> hold up the rest of drivers from being able to do anything, and hold
> up boot.
> 
> If you decide that you do need to probe here, please add a comment.
> (This is the wait for the clock to be stable?)
> 

I'll test it again and will remove it if it is not necessary.

>> +
>> +       /**
>> +        * Timing negotiation period setting.
>> +        * The unit of the programmed value is 4 times of PECI clock period.
>> +        */
>> +       ret = regmap_write(priv->regmap, AST_PECI_TIMING,
>> +                          PECI_TIMING_MESSAGE(msg_timing_nego) |
>> +                          PECI_TIMING_ADDRESS(addr_timing_nego));
>> +       if (ret)
>> +               return ret;
>> +
>> +       /* Clear interrupts */
>> +       ret = regmap_write(priv->regmap, AST_PECI_INT_STS, PECI_INT_MASK);
>> +       if (ret)
>> +               return ret;
>> +
>> +       /* Enable interrupts */
>> +       ret = regmap_write(priv->regmap, AST_PECI_INT_CTRL, PECI_INT_MASK);
>> +       if (ret)
>> +               return ret;
>> +
>> +       /* Read sampling point and clock speed setting */
>> +       ret = regmap_write(priv->regmap, AST_PECI_CTRL,
>> +                          PECI_CTRL_SAMPLING(rd_sampling_point) |
>> +                          PECI_CTRL_CLK_DIV(clk_div_val) |
>> +                          PECI_CTRL_PECI_EN | PECI_CTRL_PECI_CLK_EN);
>> +       if (ret)
>> +               return ret;
>> +
>> +       return 0;
>> +}
>> +
>> +static const struct regmap_config aspeed_peci_regmap_config = {
>> +       .reg_bits = 32,
>> +       .val_bits = 32,
>> +       .reg_stride = 4,
>> +       .max_register = AST_PECI_R_DATA7,
>> +       .val_format_endian = REGMAP_ENDIAN_LITTLE,
>> +       .fast_io = true,
>> +};
>> +
>> +static int aspeed_peci_xfer(struct peci_adapter *adaper,
>> +                           struct peci_xfer_msg *msg)
>> +{
>> +       struct aspeed_peci *priv = peci_get_adapdata(adaper);
>> +
>> +       return aspeed_peci_xfer_native(priv, msg);
>> +}
>> +
>> +static int aspeed_peci_probe(struct platform_device *pdev)
>> +{
>> +       struct aspeed_peci *priv;
>> +       struct resource *res;
>> +       void __iomem *base;
>> +       int ret = 0;
>> +
>> +       priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
>> +       if (!priv)
>> +               return -ENOMEM;
>> +
>> +       dev_set_drvdata(&pdev->dev, priv);
>> +       priv->dev = &pdev->dev;
>> +
>> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +       base = devm_ioremap_resource(&pdev->dev, res);
>> +       if (IS_ERR(base))
>> +               return PTR_ERR(base);
>> +
>> +       priv->regmap = devm_regmap_init_mmio(&pdev->dev, base,
>> +                                            &aspeed_peci_regmap_config);
>> +       if (IS_ERR(priv->regmap))
>> +               return PTR_ERR(priv->regmap);
>> +
>> +       priv->irq = platform_get_irq(pdev, 0);
>> +       if (!priv->irq)
>> +               return -ENODEV;
>> +
>> +       ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
>> +                              IRQF_SHARED,
> 
> This interrupt is only for the peci device. Why is it marked as shared?
> 

You are right. I'll remove the flag.

>> +                              "peci-aspeed-irq",
>> +                              priv);
>> +       if (ret < 0)
>> +               return ret;
>> +
>> +       init_completion(&priv->xfer_complete);
>> +
>> +       priv->adaper.dev.parent = priv->dev;
>> +       priv->adaper.dev.of_node = of_node_get(dev_of_node(priv->dev));
>> +       strlcpy(priv->adaper.name, pdev->name, sizeof(priv->adaper.name));
>> +       priv->adaper.xfer = aspeed_peci_xfer;
>> +       peci_set_adapdata(&priv->adaper, priv);
>> +
>> +       ret = aspeed_peci_init_ctrl(priv);
>> +       if (ret < 0)
>> +               return ret;
>> +
>> +       ret = peci_add_adapter(&priv->adaper);
>> +       if (ret < 0)
>> +               return ret;
>> +
>> +       dev_info(&pdev->dev, "peci bus %d registered, irq %d\n",
>> +                priv->adaper.nr, priv->irq);
>> +
>> +       return 0;
>> +}
>> +
>> +static int aspeed_peci_remove(struct platform_device *pdev)
>> +{
>> +       struct aspeed_peci *priv = dev_get_drvdata(&pdev->dev);
>> +
>> +       peci_del_adapter(&priv->adaper);
>> +       of_node_put(priv->adaper.dev.of_node);
>> +
>> +       return 0;
>> +}
>> +
>> +static const struct of_device_id aspeed_peci_of_table[] = {
>> +       { .compatible = "aspeed,ast2400-peci", },
>> +       { .compatible = "aspeed,ast2500-peci", },
>> +       { }
>> +};
>> +MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
>> +
>> +static struct platform_driver aspeed_peci_driver = {
>> +       .probe  = aspeed_peci_probe,
>> +       .remove = aspeed_peci_remove,
>> +       .driver = {
>> +               .name           = "peci-aspeed",
>> +               .of_match_table = of_match_ptr(aspeed_peci_of_table),
>> +       },
>> +};
>> +module_platform_driver(aspeed_peci_driver);
>> +
>> +MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>> +MODULE_DESCRIPTION("Aspeed PECI driver");
>> +MODULE_LICENSE("GPL v2");
>> --
>> 2.16.2
>>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers
  2018-04-11 11:52   ` Joel Stanley
@ 2018-04-12  2:06     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-12  2:06 UTC (permalink / raw)
  To: Joel Stanley, Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

Hi Joel,

On 4/11/2018 4:52 AM, Joel Stanley wrote:
> Hi Jae,
> 
> On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
>> This commit adds documents of generic PECI bus, adapter and client drivers.
>>
>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>> Cc: Alan Cox <alan@linux.intel.com>
>> Cc: Andrew Jeffery <andrew@aj.id.au>
>> Cc: Andrew Lunn <andrew@lunn.ch>
>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>> Cc: Greg KH <gregkh@linuxfoundation.org>
>> Cc: Guenter Roeck <linux@roeck-us.net>
>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>> Cc: Jean Delvare <jdelvare@suse.com>
>> Cc: Joel Stanley <joel@jms.id.au>
>> Cc: Julia Cartwright <juliac@eso.teric.us>
>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>> Cc: Milton Miller II <miltonm@us.ibm.com>
>> Cc: Pavel Machek <pavel@ucw.cz>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
> 
> That's a hefty cc list. I can't see Rob Herring though, and he's
> usually the person who you need to convince to get your bindings
> accepted.
> 
> I recommend using ./scripts/get_maintainer.pl to build your CC list,
> and then add others you think are relevant.
> 
> I'm not sure what the guidelines are for generic bindings, so I'll
> defer to Rob for this patch.
> 
> Cheers,
> 
> Joel
> 

Thanks a lot for letting me know that. I'll do as you suggested.

-Jae

>> ---
>>   .../devicetree/bindings/peci/peci-adapter.txt      | 23 ++++++++++++++++++++
>>   .../devicetree/bindings/peci/peci-bus.txt          | 15 +++++++++++++
>>   .../devicetree/bindings/peci/peci-client.txt       | 25 ++++++++++++++++++++++
>>   3 files changed, 63 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt
>>
>> diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt b/Documentation/devicetree/bindings/peci/peci-adapter.txt
>> new file mode 100644
>> index 000000000000..9221374f6b11
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
>> @@ -0,0 +1,23 @@
>> +Generic device tree configuration for PECI adapters.
>> +
>> +Required properties:
>> +- compatible     : Should contain hardware specific definition strings that can
>> +                  match an adapter driver implementation.
>> +- reg            : Should contain PECI controller registers location and length.
>> +- #address-cells : Should be <1>.
>> +- #size-cells    : Should be <0>.
>> +
>> +Example:
>> +       peci: peci@10000000 {
>> +               compatible = "simple-bus";
>> +               #address-cells = <1>;
>> +               #size-cells = <1>;
>> +               ranges = <0x0 0x10000000 0x1000>;
>> +
>> +               peci0: peci-bus@0 {
>> +                       compatible = "soc,soc-peci";
>> +                       reg = <0x0 0x1000>;
>> +                       #address-cells = <1>;
>> +                       #size-cells = <0>;
>> +               };
>> +       };
>> diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt b/Documentation/devicetree/bindings/peci/peci-bus.txt
>> new file mode 100644
>> index 000000000000..90bcc791ccb0
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
>> @@ -0,0 +1,15 @@
>> +Generic device tree configuration for PECI buses.
>> +
>> +Required properties:
>> +- compatible     : Should be "simple-bus".
>> +- #address-cells : Should be <1>.
>> +- #size-cells    : Should be <1>.
>> +- ranges         : Should contain PECI controller registers ranges.
>> +
>> +Example:
>> +       peci: peci@10000000 {
>> +               compatible = "simple-bus";
>> +               #address-cells = <1>;
>> +               #size-cells = <1>;
>> +               ranges = <0x0 0x10000000 0x1000>;
>> +       };
>> diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt b/Documentation/devicetree/bindings/peci/peci-client.txt
>> new file mode 100644
>> index 000000000000..8e2bfd8532f6
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-client.txt
>> @@ -0,0 +1,25 @@
>> +Generic device tree configuration for PECI clients.
>> +
>> +Required properties:
>> +- compatible : Should contain target device specific definition strings that can
>> +              match a client driver implementation.
>> +- reg        : Should contain address of a client CPU. Address range of CPU
>> +              clients is starting from 0x30 based on PECI specification.
>> +              <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
>> +
>> +Example:
>> +       peci-bus@0 {
>> +               #address-cells = <1>;
>> +               #size-cells = <0>;
>> +               < more properties >
>> +
>> +               function@cpu0 {
>> +                       compatible = "device,function";
>> +                       reg = <0x30>;
>> +               };
>> +
>> +               function@cpu1 {
>> +                       compatible = "device,function";
>> +                       reg = <0x31>;
>> +               };
>> +       };
>> --
>> 2.16.2
>>


* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-11 11:52   ` Joel Stanley
@ 2018-04-12  2:11     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-12  2:11 UTC (permalink / raw)
  To: Joel Stanley, Rob Herring, linux-aspeed, Ryan Chen
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

Hi Joel,

On 4/11/2018 4:52 AM, Joel Stanley wrote:
> On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
>> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
> 
> We try to capitalise ASPEED.
> 

Got it. Will capitalize ASPEED everywhere.

>> AST24xx/25xx SoCs.
>> ---
>>   .../devicetree/bindings/peci/peci-aspeed.txt       | 60 ++++++++++++++++++++++
>>   1 file changed, 60 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
>>
>> diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
>> new file mode 100644
>> index 000000000000..4598bb8c20fa
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
>> @@ -0,0 +1,60 @@
>> +Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
>> +
>> +Required properties:
>> +- compatible        : Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
>> +                     - aspeed,ast2400-peci: Aspeed AST2400 family PECI
>> +                                            controller
>> +                     - aspeed,ast2500-peci: Aspeed AST2500 family PECI
>> +                                            controller
>> +- reg               : Should contain PECI controller registers location and
>> +                     length.
>> +- #address-cells    : Should be <1>.
>> +- #size-cells       : Should be <0>.
>> +- interrupts        : Should contain PECI controller interrupt.
>> +- clocks            : Should contain clock source for PECI controller.
>> +                     Should reference clkin.
> 
> Are you sure that this is driven by clkin? Most peripherals on the
> Aspeed are attached to the apb, so should reference that clock.
> 

According to the datasheet, the PECI controller module is attached to the
APB, but its clock source is the 24 MHz external clock.

>> +- clock_frequency   : Should contain the operation frequency of PECI controller
>> +                     in units of Hz.
>> +                     187500 ~ 24000000
> 
> Can you explain why you need both the parent clock and this frequency
> to be specified?
> 

The driver uses this setting to compute the clock divisor that sets the
operating clock of the PECI controller, which is adjustable.
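
As an illustration only (not the driver's actual code), the divisor
selection could be sketched as below. It assumes a power-of-two divisor,
which happens to match the documented range (24000000 / 128 = 187500);
peci_clk_div_exp() and the two macros are hypothetical names.

```c
#include <stdint.h>

#define PECI_CLKIN_HZ    24000000u /* 24 MHz external clock */
#define PECI_DIV_EXP_MAX 7         /* 24 MHz >> 7 = 187500 Hz, documented minimum */

/*
 * Return the smallest exponent n (0..7) such that clkin / 2^n does not
 * exceed the requested frequency, or -1 if the request is outside the
 * documented 187500..24000000 Hz range.
 * Assumption: the hardware divides by a power of two.
 */
static int peci_clk_div_exp(uint32_t requested_hz)
{
    int n;

    if (requested_hz < (PECI_CLKIN_HZ >> PECI_DIV_EXP_MAX) ||
        requested_hz > PECI_CLKIN_HZ)
        return -1;

    for (n = 0; n <= PECI_DIV_EXP_MAX; n++) {
        if ((PECI_CLKIN_HZ >> n) <= requested_hz)
            return n;
    }

    return -1; /* not reached for an in-range request */
}
```

With these assumptions, clock-frequency = <24000000> selects exponent 0
and clock-frequency = <187500> selects exponent 7.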

>> +
>> +Optional properties:
>> +- msg-timing-nego   : Message timing negotiation period. This value will
> 
> Perhaps msg-timing-period? Or just msg-timing?
> 

Will use msg-timing instead.

>> +                     determine the period of message timing negotiation to be
>> +                     issued by PECI controller. The unit of the programmed
>> +                     value is four times of PECI clock period.
>> +                     0 ~ 255 (default: 1)
>> +- addr-timing-nego  : Address timing negotiation period. This value will
>> +                     determine the period of address timing negotiation to be
>> +                     issued by PECI controller. The unit of the programmed
>> +                     value is four times of PECI clock period.
>> +                     0 ~ 255 (default: 1)
>> +- rd-sampling-point : Read sampling point selection. The whole period of a bit
>> +                     time will be divided into 16 time frames. This value will
>> +                     determine the time frame in which the controller will
>> +                     sample PECI signal for data read back. Usually in the
>> +                     middle of a bit time is the best.
>> +                     0 ~ 15 (default: 8)
>> +- cmd_timeout_ms    : Command timeout in units of ms.
>> +                     1 ~ 60000 (default: 1000)
>> +
>> +Example:
>> +       peci: peci@1e78b000 {
>> +               compatible = "simple-bus";
>> +               #address-cells = <1>;
>> +               #size-cells = <1>;
>> +               ranges = <0x0 0x1e78b000 0x60>;
>> +
>> +               peci0: peci-bus@0 {
>> +                       compatible = "aspeed,ast2500-peci";
>> +                       reg = <0x0 0x60>;
>> +                       #address-cells = <1>;
>> +                       #size-cells = <0>;
>> +                       interrupts = <15>;
>> +                       clocks = <&clk_clkin>;
>> +                       clock-frequency = <24000000>;
>> +                       msg-timing-nego = <1>;
>> +                       addr-timing-nego = <1>;
>> +                       rd-sampling-point = <8>;
>> +                       cmd-timeout-ms = <1000>;
>> +               };
>> +       };
>> --
>> 2.16.2
>>


* Re: [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node
  2018-04-11 11:52   ` Joel Stanley
@ 2018-04-12  2:20     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-12  2:20 UTC (permalink / raw)
  To: Joel Stanley
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, Linux Kernel Mailing List, linux-doc, devicetree,
	linux-hwmon, Linux ARM, OpenBMC Maillist

On 4/11/2018 4:52 AM, Joel Stanley wrote:
> On 11 April 2018 at 04:02, Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> wrote:
>> This commit adds PECI bus/adapter node of AST24xx/AST25xx into
>> aspeed-g4 and aspeed-g5.
>>
> 
> The patches to the device trees get merged by the ASPEED maintainer
> (me). Once you have the bindings reviewed you can send the patches to
> me and the linux-aspeed list (I've got a pending patch to maintainers
> that will ensure get_maintainers.pl does the right thing as far as
> email addresses go).
> 
> I'd suggest dropping it from your series and re-sending once the
> bindings and driver are reviewed.
> 
> Cheers,
> 
> Joel
> 

Do you mean the bindings and driver of the ASPEED PECI adapter,
including the documents?

Thanks,
-Jae


* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-12  0:34       ` Guenter Roeck
@ 2018-04-12  2:51         ` Jae Hyun Yoo
  2018-04-12  3:40           ` Guenter Roeck
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-12  2:51 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On 4/11/2018 5:34 PM, Guenter Roeck wrote:
> On 04/11/2018 02:59 PM, Jae Hyun Yoo wrote:
>> Hi Guenter,
>>
>> Thanks a lot for sharing your time. Please see my inline answers.
>>
>> On 4/10/2018 3:28 PM, Guenter Roeck wrote:
>>> On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:
>>>> This commit adds PECI cputemp and dimmtemp hwmon drivers.
>>>>
>>>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>>>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>>>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>>>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>>>> Cc: Alan Cox <alan@linux.intel.com>
>>>> Cc: Andrew Jeffery <andrew@aj.id.au>
>>>> Cc: Andrew Lunn <andrew@lunn.ch>
>>>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>>>> Cc: Greg KH <gregkh@linuxfoundation.org>
>>>> Cc: Guenter Roeck <linux@roeck-us.net>
>>>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>>>> Cc: Jean Delvare <jdelvare@suse.com>
>>>> Cc: Joel Stanley <joel@jms.id.au>
>>>> Cc: Julia Cartwright <juliac@eso.teric.us>
>>>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>>>> Cc: Milton Miller II <miltonm@us.ibm.com>
>>>> Cc: Pavel Machek <pavel@ucw.cz>
>>>> Cc: Randy Dunlap <rdunlap@infradead.org>
>>>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>>>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>>>> ---
>>>>   drivers/hwmon/Kconfig         |  28 ++
>>>>   drivers/hwmon/Makefile        |   2 +
>>>>   drivers/hwmon/peci-cputemp.c  | 783 
>>>> ++++++++++++++++++++++++++++++++++++++++++
>>>>   drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
>>>>   4 files changed, 1245 insertions(+)
>>>>   create mode 100644 drivers/hwmon/peci-cputemp.c
>>>>   create mode 100644 drivers/hwmon/peci-dimmtemp.c
>>>>
>>>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>>>> index f249a4428458..c52f610f81d0 100644
>>>> --- a/drivers/hwmon/Kconfig
>>>> +++ b/drivers/hwmon/Kconfig
>>>> @@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
>>>>         This driver can also be built as a module.  If so, the module
>>>>         will be called nct7904.
>>>> +config SENSORS_PECI_CPUTEMP
>>>> +    tristate "PECI CPU temperature monitoring support"
>>>> +    depends on OF
>>>> +    depends on PECI
>>>> +    help
>>>> +      If you say yes here you get support for the generic Intel PECI
>>>> +      cputemp driver which provides Digital Thermal Sensor (DTS) 
>>>> thermal
>>>> +      readings of the CPU package and CPU cores that are accessible 
>>>> using
>>>> +      the PECI Client Command Suite via the processor PECI client.
>>>> +      Check Documentation/hwmon/peci-cputemp for details.
>>>> +
>>>> +      This driver can also be built as a module.  If so, the module
>>>> +      will be called peci-cputemp.
>>>> +
>>>> +config SENSORS_PECI_DIMMTEMP
>>>> +    tristate "PECI DIMM temperature monitoring support"
>>>> +    depends on OF
>>>> +    depends on PECI
>>>> +    help
>>>> +      If you say yes here you get support for the generic Intel 
>>>> PECI hwmon
>>>> +      driver which provides Digital Thermal Sensor (DTS) thermal 
>>>> readings of
>>>> +      DIMM components that are accessible using the PECI Client 
>>>> Command
>>>> +      Suite via the processor PECI client.
>>>> +      Check Documentation/hwmon/peci-dimmtemp for details.
>>>> +
>>>> +      This driver can also be built as a module.  If so, the module
>>>> +      will be called peci-dimmtemp.
>>>> +
>>>>   config SENSORS_NSA320
>>>>       tristate "ZyXEL NSA320 and compatible fan speed and 
>>>> temperature sensors"
>>>>       depends on GPIOLIB && OF
>>>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>>>> index e7d52a36e6c4..48d9598fcd3a 100644
>>>> --- a/drivers/hwmon/Makefile
>>>> +++ b/drivers/hwmon/Makefile
>>>> @@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)    += nct7802.o
>>>>   obj-$(CONFIG_SENSORS_NCT7904)    += nct7904.o
>>>>   obj-$(CONFIG_SENSORS_NSA320)    += nsa320-hwmon.o
>>>>   obj-$(CONFIG_SENSORS_NTC_THERMISTOR)    += ntc_thermistor.o
>>>> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)    += peci-cputemp.o
>>>> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
>>>>   obj-$(CONFIG_SENSORS_PC87360)    += pc87360.o
>>>>   obj-$(CONFIG_SENSORS_PC87427)    += pc87427.o
>>>>   obj-$(CONFIG_SENSORS_PCF8591)    += pcf8591.o
>>>> diff --git a/drivers/hwmon/peci-cputemp.c 
>>>> b/drivers/hwmon/peci-cputemp.c
>>>> new file mode 100644
>>>> index 000000000000..f0bc92687512
>>>> --- /dev/null
>>>> +++ b/drivers/hwmon/peci-cputemp.c
>>>> @@ -0,0 +1,783 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +// Copyright (c) 2018 Intel Corporation
>>>> +
>>>> +#include <linux/delay.h>
>>>> +#include <linux/hwmon.h>
>>>> +#include <linux/hwmon-sysfs.h>
>>>
>>> Is this include needed ?
>>>
>>
>> No it isn't. Will drop the line.
>>
>>>> +#include <linux/jiffies.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/of_device.h>
>>>> +#include <linux/peci.h>
>>>> +
>>>> +#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
>>>> +
>>>> +#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
>>>> +#define CORE_MAX_ON_BDX       24 /* Max number of cores on 
>>>> Broadwell */
>>>> +#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
>>>> +
>>>> +#define DEFAULT_CHANNEL_NUMS  5
>>>> +#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
>>>> +#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + 
>>>> CORETEMP_CHANNEL_NUMS)
>>>> +
>>>> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model 
>>>> info */
>>>> +
>>>> +#define UPDATE_INTERVAL_MIN   HZ
>>>> +
>>>> +enum cpu_gens {
>>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>>> +    CPU_GEN_MAX
>>>> +};
>>>> +
>>>> +struct cpu_gen_info {
>>>> +    u32 type;
>>>> +    u32 cpu_id;
>>>> +    u32 core_max;
>>>> +};
>>>> +
>>>> +struct temp_data {
>>>> +    bool valid;
>>>> +    s32  value;
>>>> +    unsigned long last_updated;
>>>> +};
>>>> +
>>>> +struct temp_group {
>>>> +    struct temp_data die;
>>>> +    struct temp_data dts_margin;
>>>> +    struct temp_data tcontrol;
>>>> +    struct temp_data tthrottle;
>>>> +    struct temp_data tjmax;
>>>> +    struct temp_data core[CORETEMP_CHANNEL_NUMS];
>>>> +};
>>>> +
>>>> +struct peci_cputemp {
>>>> +    struct peci_client *client;
>>>> +    struct device *dev;
>>>> +    char name[PECI_NAME_SIZE];
>>>> +    struct temp_group temp;
>>>> +    u8 addr;
>>>> +    uint cpu_no;
>>>> +    const struct cpu_gen_info *gen_info;
>>>> +    u32 core_mask;
>>>> +    u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
>>>> +    uint config_idx;
>>>> +    struct hwmon_channel_info temp_info;
>>>> +    const struct hwmon_channel_info *info[2];
>>>> +    struct hwmon_chip_info chip;
>>>> +};
>>>> +
>>>> +enum cputemp_channels {
>>>> +    channel_die,
>>>> +    channel_dts_mrgn,
>>>> +    channel_tcontrol,
>>>> +    channel_tthrottle,
>>>> +    channel_tjmax,
>>>> +    channel_core,
>>>> +};
>>>> +
>>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>>> +    { .type = CPU_GEN_HSX,
>>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>>> +      .core_max = CORE_MAX_ON_HSX },
>>>> +    { .type = CPU_GEN_BRX,
>>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>>> +      .core_max = CORE_MAX_ON_BDX },
>>>> +    { .type = CPU_GEN_SKX,
>>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>>> +      .core_max = CORE_MAX_ON_SKX },
>>>> +};
>>>> +
>>>> +static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
>>>> +    /* Die temperature */
>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>>> +    HWMON_T_CRIT_HYST,
>>>> +
>>>> +    /* DTS margin temperature */
>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
>>>> +
>>>> +    /* Tcontrol temperature */
>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
>>>> +
>>>> +    /* Tthrottle temperature */
>>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>>> +
>>>> +    /* Tjmax temperature */
>>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>>> +
>>>> +    /* Core temperature - for all core channels */
>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>>> +    HWMON_T_CRIT_HYST,
>>>> +};
>>>> +
>>>> +static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
>>>> +    "Die",
>>>> +    "DTS margin",
>>>> +    "Tcontrol",
>>>> +    "Tthrottle",
>>>> +    "Tjmax",
>>>> +    "Core 0", "Core 1", "Core 2", "Core 3",
>>>> +    "Core 4", "Core 5", "Core 6", "Core 7",
>>>> +    "Core 8", "Core 9", "Core 10", "Core 11",
>>>> +    "Core 12", "Core 13", "Core 14", "Core 15",
>>>> +    "Core 16", "Core 17", "Core 18", "Core 19",
>>>> +    "Core 20", "Core 21", "Core 22", "Core 23",
>>>> +};
>>>> +
>>>> +static int send_peci_cmd(struct peci_cputemp *priv,
>>>> +             enum peci_cmd cmd,
>>>> +             void *msg)
>>>> +{
>>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>>> +}
>>>> +
>>>> +static int need_update(struct temp_data *temp)
>>>
>>> Please use bool.
>>>
>>
>> Okay. I'll use bool instead of int.
>>
>>>> +{
>>>> +    if (temp->valid &&
>>>> +        time_before(jiffies, temp->last_updated + 
>>>> UPDATE_INTERVAL_MIN))
>>>> +        return 0;
>>>> +
>>>> +    return 1;
>>>> +}
>>>> +
>>>> +static void mark_updated(struct temp_data *temp)
>>>> +{
>>>> +    temp->valid = true;
>>>> +    temp->last_updated = jiffies;
>>>> +}
>>>> +
>>>> +static s32 ten_dot_six_to_millidegree(s32 val)
>>>> +{
>>>> +    return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
>>>> +}
>>>> +
>>>> +static int get_tjmax(struct peci_cputemp *priv)
>>>> +{
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    int rc;
>>>> +
>>>> +    if (!priv->temp.tjmax.valid) {
>>>> +        msg.addr = priv->addr;
>>>> +        msg.index = MBX_INDEX_TEMP_TARGET;
>>>> +        msg.param = 0;
>>>> +        msg.rx_len = 4;
>>>> +
>>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
>>>> +        priv->temp.tjmax.valid = true;
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int get_tcontrol(struct peci_cputemp *priv)
>>>> +{
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    s32 tcontrol_margin;
>>>> +    s32 tthrottle_offset;
>>>> +    int rc;
>>>> +
>>>> +    if (!need_update(&priv->temp.tcontrol))
>>>> +        return 0;
>>>> +
>>>> +    rc = get_tjmax(priv);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>>> +    msg.param = 0;
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    tcontrol_margin = msg.pkg_config[1];
>>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - 
>>>> tcontrol_margin;
>>>> +
>>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - 
>>>> tthrottle_offset;
>>>> +
>>>> +    mark_updated(&priv->temp.tcontrol);
>>>> +    mark_updated(&priv->temp.tthrottle);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int get_tthrottle(struct peci_cputemp *priv)
>>>> +{
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    s32 tcontrol_margin;
>>>> +    s32 tthrottle_offset;
>>>> +    int rc;
>>>> +
>>>> +    if (!need_update(&priv->temp.tthrottle))
>>>> +        return 0;
>>>> +
>>>> +    rc = get_tjmax(priv);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>>> +    msg.param = 0;
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - 
>>>> tthrottle_offset;
>>>> +
>>>> +    tcontrol_margin = msg.pkg_config[1];
>>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - 
>>>> tcontrol_margin;
>>>> +
>>>> +    mark_updated(&priv->temp.tthrottle);
>>>> +    mark_updated(&priv->temp.tcontrol);
>>>> +
>>>> +    return 0;
>>>> +}
>>>
>>> I am quite completely missing how the two functions above are different.
>>>
>>
>> The two functions above differ slightly but use the same PECI
>> command, which returns both the Tthrottle and Tcontrol values in the
>> pkg_config array, so each function updates both values to avoid
>> duplicate PECI transactions. Combining these two functions into
>> get_tthrottle_and_tcontrol() would probably look better. I'll rewrite it.
>>
>>>> +
>>>> +static int get_die_temp(struct peci_cputemp *priv)
>>>> +{
>>>> +    struct peci_get_temp_msg msg;
>>>> +    int rc;
>>>> +
>>>> +    if (!need_update(&priv->temp.die))
>>>> +        return 0;
>>>> +
>>>> +    rc = get_tjmax(priv);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    priv->temp.die.value = priv->temp.tjmax.value +
>>>> +                   ((s32)msg.temp_raw * 1000 / 64);
>>>> +
>>>> +    mark_updated(&priv->temp.die);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int get_dts_margin(struct peci_cputemp *priv)
>>>> +{
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    s32 dts_margin;
>>>> +    int rc;
>>>> +
>>>> +    if (!need_update(&priv->temp.dts_margin))
>>>> +        return 0;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +    msg.index = MBX_INDEX_DTS_MARGIN;
>>>> +    msg.param = 0;
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>>> +
>>>> +    /**
>>>> +     * Processors return a value of DTS reading in 10.6 format
>>>> +     * (10 bits signed decimal, 6 bits fractional).
>>>> +     * Error codes:
>>>> +     *   0x8000: General sensor error
>>>> +     *   0x8001: Reserved
>>>> +     *   0x8002: Underflow on reading value
>>>> +     *   0x8003-0x81ff: Reserved
>>>> +     */
>>>> +    if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
>>>> +        return -EIO;
>>>> +
>>>> +    dts_margin = ten_dot_six_to_millidegree(dts_margin);
>>>> +
>>>> +    priv->temp.dts_margin.value = dts_margin;
>>>> +
>>>> +    mark_updated(&priv->temp.dts_margin);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int get_core_temp(struct peci_cputemp *priv, int core_index)
>>>> +{
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    s32 core_dts_margin;
>>>> +    int rc;
>>>> +
>>>> +    if (!need_update(&priv->temp.core[core_index]))
>>>> +        return 0;
>>>> +
>>>> +    rc = get_tjmax(priv);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +    msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
>>>> +    msg.param = core_index;
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>>> +
>>>> +    /**
>>>> +     * Processors return a value of the core DTS reading in 10.6 
>>>> format
>>>> +     * (10 bits signed decimal, 6 bits fractional).
>>>> +     * Error codes:
>>>> +     *   0x8000: General sensor error
>>>> +     *   0x8001: Reserved
>>>> +     *   0x8002: Underflow on reading value
>>>> +     *   0x8003-0x81ff: Reserved
>>>> +     */
>>>> +    if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>>>> +        return -EIO;
>>>> +
>>>> +    core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
>>>> +
>>>> +    priv->temp.core[core_index].value = priv->temp.tjmax.value +
>>>> +                        core_dts_margin;
>>>> +
>>>> +    mark_updated(&priv->temp.core[core_index]);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>
>>> There is a lot of duplication in those functions. Would it be possible
>>> to find common code and use functions for it instead of duplicating
>>> everything several times ?
>>>
>>
>> Are you pointing out this code?
>> /**
>>   * Processors return a value of the core DTS reading in 10.6 format
>>   * (10 bits signed decimal, 6 bits fractional).
>>   * Error codes:
>>   *   0x8000: General sensor error
>>   *   0x8001: Reserved
>>   *   0x8002: Underflow on reading value
>>   *   0x8003-0x81ff: Reserved
>>   */
>> if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>>      return -EIO;
>>
>> Then I'll rewrite it as a function. If not, please point out the 
>> duplication.
>>
> 
> There is lots of other duplication.
> 

Sorry, but could you point out the duplication?
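
For reference, the range check and conversion quoted above could be
factored into one helper roughly as follows. This is only a sketch:
dts_raw_to_millidegree() is a hypothetical name, and the error range
is taken from the comment in the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Sign-extend a 16-bit 10.6 fixed-point value and scale to millidegrees. */
static int32_t ten_dot_six_to_millidegree(int32_t val)
{
    return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
}

/*
 * Combine the two low bytes of a package config read into a 10.6 value,
 * reject the error codes 0x8000..0x81ff, and convert the rest.
 * Returns 0 on success or -EIO for an error code.
 */
static int dts_raw_to_millidegree(uint8_t lo, uint8_t hi, int32_t *out)
{
    int32_t raw = ((int32_t)hi << 8) | lo;

    if (raw >= 0x8000 && raw <= 0x81ff)
        return -EIO;

    *out = ten_dot_six_to_millidegree(raw);
    return 0;
}
```

Both get_dts_margin() and get_core_temp() could then call the helper
with msg.pkg_config[0] and msg.pkg_config[1].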

>>>> +static int find_core_index(struct peci_cputemp *priv, int channel)
>>>> +{
>>>> +    int core_channel = channel - DEFAULT_CHANNEL_NUMS;
>>>> +    int idx, found = 0;
>>>> +
>>>> +    for (idx = 0; idx < priv->gen_info->core_max; idx++) {
>>>> +        if (priv->core_mask & BIT(idx)) {
>>>> +            if (core_channel == found)
>>>> +                break;
>>>> +
>>>> +            found++;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return idx;
>>>
>>> What if nothing is found ?
>>>
>>
>> Core temperature group will be registered only when it detects at 
>> least one core checked by check_resolved_cores(), so find_core_index() 
>> can be called only when priv->core_mask has a non-zero value. The 
>> 'nothing is found' case will not happen.
>>
> That doesn't guarantee a match. If what you are saying is correct there 
> should always be
> a well defined match of channel -> idx, and the search should be 
> unnecessary.
> 

The resolved core mask can contain disabled cores, and the channel
numbering should not have gaps, so this search function is needed to map
the n-th channel to the n-th enabled core. A fixed channel -> idx mapping
would not always hold.
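
A minimal sketch of that mapping, with an explicit "not found" result
added to address the concern above. The function name and its
parameters are hypothetical; the patch keeps the mask and core count in
the private struct instead.

```c
#include <stdint.h>

/*
 * Map the n-th core channel (0-based) to the index of the n-th set bit
 * in core_mask, looking only at bits below core_max.
 * Returns -1 when fewer than core_channel + 1 cores are enabled.
 */
static int core_channel_to_index(uint32_t core_mask, int core_max,
                                 int core_channel)
{
    int idx, found = 0;

    for (idx = 0; idx < core_max && idx < 32; idx++) {
        if (core_mask & (UINT32_C(1) << idx)) {
            if (found == core_channel)
                return idx;
            found++;
        }
    }

    return -1;
}
```

For example, with core_mask = 0x2d (cores 0, 2, 3 and 5 enabled),
channels 0..3 map to cores 0, 2, 3 and 5, and channel 4 has no match.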

>>>> +}
>>>> +
>>>> +static int cputemp_read_string(struct device *dev,
>>>> +                   enum hwmon_sensor_types type,
>>>> +                   u32 attr, int channel, const char **str)
>>>> +{
>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>> +    int core_index;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_label:
>>>> +        if (channel < DEFAULT_CHANNEL_NUMS) {
>>>> +            *str = cputemp_label[channel];
>>>> +        } else {
>>>> +            core_index = find_core_index(priv, channel);
>>>
>>> FWIW, it might be better to pass channel - DEFAULT_CHANNEL_NUMS
>>> as parameter.
>>>
>>
>> cputemp_read_string() is mapped to the read_string member of the
>> hwmon_ops struct, so the hwmon subsystem passes the channel parameter
>> based on the registered channel order. Should I modify the hwmon
>> subsystem code?
>>
> 
> Huh ? Changing
>      f(x) { y = x - const; }
> ...
>      f(x);
> 
> to
>      f(y) { }
> ...
>      f(x - const);
> 
> requires a hwmon core change ? Really ?
> 

Sorry for my misunderstanding. You are right. I'll change the argument
passed to find_core_index() from 'channel' to
'channel - DEFAULT_CHANNEL_NUMS'.

>>> What if find_core_index() returns priv->gen_info->core_max, ie
>>> if it didn't find a core ?
>>>
>>
>> As explained above, find_core_index() always returns a correct index.
>>
>>>> +            *str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
>>>> +        }
>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static int cputemp_read_die(struct device *dev,
>>>> +                enum hwmon_sensor_types type,
>>>> +                u32 attr, int channel, long *val)
>>>> +{
>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>> +    int rc;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_input:
>>>> +        rc = get_die_temp(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.die.value;
>>>> +        return 0;
>>>> +    case hwmon_temp_max:
>>>> +        rc = get_tcontrol(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tcontrol.value;
>>>> +        return 0;
>>>> +    case hwmon_temp_crit:
>>>> +        rc = get_tjmax(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tjmax.value;
>>>> +        return 0;
>>>> +    case hwmon_temp_crit_hyst:
>>>> +        rc = get_tcontrol(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static int cputemp_read_dts_margin(struct device *dev,
>>>> +                   enum hwmon_sensor_types type,
>>>> +                   u32 attr, int channel, long *val)
>>>> +{
>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>> +    int rc;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_input:
>>>> +        rc = get_dts_margin(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.dts_margin.value;
>>>> +        return 0;
>>>> +    case hwmon_temp_min:
>>>> +        *val = 0;
>>>> +        return 0;
>>>
>>> This attribute should not exist.
>>>
>>
>> This is an attribute of the DTS margin temperature, which reflects the
>> thermal margin to Tcontrol of the CPU package. A value of '0' means the
>> package has reached Tcontrol, the first level of thermal warning. If the
>> CPU keeps getting hotter, the DTS margin shows a negative value until it
>> reaches Tjmax. When the temperature finally reaches Tjmax, the margin
>> equals the lower critical value that lcrit indicates, the second level
>> of thermal warning.
>>
> 
> The hwmon ABI reports chip values, not constants. Even though some 
> drivers do
> it, reporting a constant is always wrong.
> 
>>>> +    case hwmon_temp_lcrit:
>>>> +        rc = get_tcontrol(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tcontrol.value - priv->temp.tjmax.value;
>>>
>>> lcrit is tcontrol - tjmax, and crit_hyst above is
>>> tjmax - tcontrol ? How does this make sense ?
>>>
>>
>> Both Tjmax and Tcontrol are positive, and Tjmax is always greater than
>> Tcontrol. As explained above, lcrit of the DTS margin should be a
>> negative value, since the margin drops below '0'. On the other hand,
>> crit_hyst of the Die temperature should show the absolute hysteresis
>> value between Tcontrol and Tjmax.
>>
> The hwmon ABI requires reporting of absolute temperatures in
> milli-degrees C. Your statements make it very clear that this driver
> does not report absolute temperatures. This is not acceptable.
> 

Okay. I'll remove the 'DTS margin' temperature. All others are reporting 
absolute temperatures.
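
As an illustration of the conversion involved, here is a minimal sketch.
It assumes a PECI reading arrives as a zero-or-negative offset from
Tjmax, already scaled to millidegrees C. The helper name is hypothetical
and is not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: converts a relative
 * reading (offset from Tjmax, zero or negative) into the absolute
 * temperature that hwmon expects. All values are in millidegrees C.
 */
static inline int32_t peci_offset_to_absolute(int32_t tjmax_mc,
					      int32_t offset_mc)
{
	return tjmax_mc + offset_mc;
}
```

With Tjmax at 100000 and an offset of -35000, this reports 65000, that
is 65 degrees C.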

>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static int cputemp_read_tcontrol(struct device *dev,
>>>> +                 enum hwmon_sensor_types type,
>>>> +                 u32 attr, int channel, long *val)
>>>> +{
>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>> +    int rc;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_input:
>>>> +        rc = get_tcontrol(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tcontrol.value;
>>>> +        return 0;
>>>> +    case hwmon_temp_crit:
>>>> +        rc = get_tjmax(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tjmax.value;
>>>> +        return 0;
>>>
>>> Am I missing something, or is the same temperature reported several
>>> times ? tjmax is also reported as temp_crit in cputemp_read_die(),
>>> for example.
>>>
>>
>> This driver provides multiple channels, and each channel has its own
>> supplementary attributes. As you mentioned, the Die temperature
>> channel and the Core temperature channels have their own crit
>> attributes, and they reflect the same value, Tjmax. A single channel
>> does not report it several times; several channels report the same
>> value.
>>
> Then maybe fold the functions accordingly ?
> 

I'll use a single function for 'Die temperature' and 'Core temperature' 
that have the same attributes set. It would simplify this code a bit.
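
A rough sketch of what such a folded handler could look like. The
attribute enum and the state struct are hypothetical stand-ins for the
hwmon attribute IDs and the driver's private data, and the crit_hyst
arithmetic simply mirrors the patch as posted.

```c
#include <errno.h>

/* Hypothetical stand-ins for the hwmon attribute IDs. */
enum temp_attr { ATTR_INPUT, ATTR_MAX, ATTR_CRIT, ATTR_CRIT_HYST, ATTR_LABEL };

/* Hypothetical per-channel state, all values in millidegrees C. */
struct temp_state {
	long input;
	long tcontrol;
	long tjmax;
};

/* One handler for every channel exposing input/max/crit/crit_hyst. */
static int read_common_temp(const struct temp_state *st, enum temp_attr attr,
			    long *val)
{
	switch (attr) {
	case ATTR_INPUT:
		*val = st->input;
		return 0;
	case ATTR_MAX:
		*val = st->tcontrol;
		return 0;
	case ATTR_CRIT:
		*val = st->tjmax;
		return 0;
	case ATTR_CRIT_HYST:
		/* Same arithmetic as the posted patch. */
		*val = st->tjmax - st->tcontrol;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

The caller would only have to pick the right state (die or a given
core) before calling the shared handler.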

>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static int cputemp_read_tthrottle(struct device *dev,
>>>> +                  enum hwmon_sensor_types type,
>>>> +                  u32 attr, int channel, long *val)
>>>> +{
>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>> +    int rc;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_input:
>>>> +        rc = get_tthrottle(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tthrottle.value;
>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static int cputemp_read_tjmax(struct device *dev,
>>>> +                  enum hwmon_sensor_types type,
>>>> +                  u32 attr, int channel, long *val)
>>>> +{
>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>> +    int rc;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_input:
>>>> +        rc = get_tjmax(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tjmax.value;
>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static int cputemp_read_core(struct device *dev,
>>>> +                 enum hwmon_sensor_types type,
>>>> +                 u32 attr, int channel, long *val)
>>>> +{
>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>> +    int core_index = find_core_index(priv, channel);
>>>> +    int rc;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_input:
>>>> +        rc = get_core_temp(priv, core_index);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.core[core_index].value;
>>>> +        return 0;
>>>> +    case hwmon_temp_max:
>>>> +        rc = get_tcontrol(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tcontrol.value;
>>>> +        return 0;
>>>> +    case hwmon_temp_crit:
>>>> +        rc = get_tjmax(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tjmax.value;
>>>> +        return 0;
>>>> +    case hwmon_temp_crit_hyst:
>>>> +        rc = get_tcontrol(priv);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>
>>> There is again a lot of duplication in those functions.
>>>
>>
>> Each function is called from cputemp_read(), which is mapped to the
>> read function pointer of the hwmon_ops struct. Since each channel has
>> a different set of attributes, cputemp_read() calls an individual
>> channel handler after checking the channel type. Of course, we could
>> handle all attributes of all channels in a single function, but that
>> approach would also need a channel type check for each attribute.
>>
>>>> +
>>>> +static int cputemp_read(struct device *dev,
>>>> +            enum hwmon_sensor_types type,
>>>> +            u32 attr, int channel, long *val)
>>>> +{
>>>> +    switch (channel) {
>>>> +    case channel_die:
>>>> +        return cputemp_read_die(dev, type, attr, channel, val);
>>>> +    case channel_dts_mrgn:
>>>> +        return cputemp_read_dts_margin(dev, type, attr, channel, val);
>>>> +    case channel_tcontrol:
>>>> +        return cputemp_read_tcontrol(dev, type, attr, channel, val);
>>>> +    case channel_tthrottle:
>>>> +        return cputemp_read_tthrottle(dev, type, attr, channel, val);
>>>> +    case channel_tjmax:
>>>> +        return cputemp_read_tjmax(dev, type, attr, channel, val);
>>>> +    default:
>>>> +        if (channel < CPUTEMP_CHANNEL_NUMS)
>>>> +            return cputemp_read_core(dev, type, attr, channel, val);
>>>> +
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static umode_t cputemp_is_visible(const void *data,
>>>> +                  enum hwmon_sensor_types type,
>>>> +                  u32 attr, int channel)
>>>> +{
>>>> +    const struct peci_cputemp *priv = data;
>>>> +
>>>> +    if (priv->temp_config[channel] & BIT(attr))
>>>> +        return 0444;
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static const struct hwmon_ops cputemp_ops = {
>>>> +    .is_visible = cputemp_is_visible,
>>>> +    .read_string = cputemp_read_string,
>>>> +    .read = cputemp_read,
>>>> +};
>>>> +
>>>> +static int check_resolved_cores(struct peci_cputemp *priv)
>>>> +{
>>>> +    struct peci_rd_pci_cfg_local_msg msg;
>>>> +    int rc;
>>>> +
>>>> +    if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
>>>> +        return -EINVAL;
>>>> +
>>>> +    /* Get the RESOLVED_CORES register value */
>>>> +    msg.addr = priv->addr;
>>>> +    msg.bus = 1;
>>>> +    msg.device = 30;
>>>> +    msg.function = 3;
>>>> +    msg.reg = 0xB4;
>>>
>>> Can this be made less magic with some defines ?
>>>
>>
>> Sure, will use defines instead.
>>
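
A possible shape for those defines, together with the byte assembly
used a few lines further down. The macro names are hypothetical (only
the values come from the patch), and the sketch assumes the response
buffer is an array of unsigned bytes.

```c
#include <stdint.h>

/* Hypothetical names; the values are the ones used in the patch. */
#define RESOLVED_CORES_BUS	1
#define RESOLVED_CORES_DEV	30
#define RESOLVED_CORES_FUNC	3
#define RESOLVED_CORES_REG	0xB4

/* Assemble a little-endian 32-bit value from four response bytes. */
static inline uint32_t le_bytes_to_u32(const uint8_t b[4])
{
	/*
	 * Cast before shifting: a uint8_t promotes to int, and shifting
	 * a value >= 0x80 left by 24 would overflow a signed int.
	 */
	return (uint32_t)b[3] << 24 | (uint32_t)b[2] << 16 |
	       (uint32_t)b[1] << 8 | (uint32_t)b[0];
}
```
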
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    priv->core_mask = msg.pci_config[3] << 24 |
>>>> +              msg.pci_config[2] << 16 |
>>>> +              msg.pci_config[1] << 8 |
>>>> +              msg.pci_config[0];
>>>> +
>>>> +    if (!priv->core_mask)
>>>> +        return -EAGAIN;
>>>> +
>>>> +    dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int create_core_temp_info(struct peci_cputemp *priv)
>>>> +{
>>>> +    int rc, i;
>>>> +
>>>> +    rc = check_resolved_cores(priv);
>>>> +    if (!rc) {
>>>> +        for (i = 0; i < priv->gen_info->core_max; i++) {
>>>> +            if (priv->core_mask & BIT(i)) {
>>>> +                priv->temp_config[priv->config_idx++] =
>>>> +                             config_table[channel_core];
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static int check_cpu_id(struct peci_cputemp *priv)
>>>> +{
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    u32 cpu_id;
>>>> +    int i, rc;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +    msg.index = MBX_INDEX_CPU_ID;
>>>> +    msg.param = PKG_ID_CPU_ID;
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>>> +
>>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (!priv->gen_info)
>>>> +        return -ENODEV;
>>>> +
>>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int peci_cputemp_probe(struct peci_client *client)
>>>> +{
>>>> +    struct device *dev = &client->dev;
>>>> +    struct peci_cputemp *priv;
>>>> +    struct device *hwmon_dev;
>>>> +    int rc;
>>>> +
>>>> +    if ((client->adapter->cmd_mask &
>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>>> +        return -EINVAL;
>>>
>>> Does this mean there will be an error message for each non-supported
>>> CPU ? Why ?
>>>
>>
>> For proper operation of this driver, PECI_CMD_GET_TEMP and
>> PECI_CMD_RD_PKG_CFG have to be supported by the client CPU.
>> PECI_CMD_GET_TEMP is provided as a default command, but
>> PECI_CMD_RD_PKG_CFG depends on the PECI minor revision of the CPU
>> package, so this check is needed.
>>
> 
> I do not question the check. I question the error message and error
> return value. Why is it an _error_ if the CPU does not support the
> functionality, and why does it have to be reported in the kernel log ?
> 

Got it. I'll change that to dev_dbg.
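
A minimal sketch of the quiet variant of this check. The command
numbers are placeholders (the real enum lives in the PECI core), and
returning -ENODEV is an assumption based on the review comments, not
something settled in this thread.

```c
#include <errno.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Placeholder command numbers, for illustration only. */
enum { CMD_GET_TEMP = 3, CMD_RD_PKG_CFG = 4 };

#define REQUIRED_CMDS (BIT(CMD_GET_TEMP) | BIT(CMD_RD_PKG_CFG))

/* Returns 0 if every required command is available, -ENODEV otherwise. */
static int check_required_cmds(uint32_t cmd_mask)
{
	/* An unsupported client is not a fault, so nothing is logged. */
	if ((cmd_mask & REQUIRED_CMDS) != REQUIRED_CMDS)
		return -ENODEV;

	return 0;
}
```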

>>>> +    }
>>>> +
>>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>>> +    if (!priv)
>>>> +        return -ENOMEM;
>>>> +
>>>> +    dev_set_drvdata(dev, priv);
>>>> +    priv->client = client;
>>>> +    priv->dev = dev;
>>>> +    priv->addr = client->addr;
>>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>>> +
>>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
>>>> +         priv->cpu_no);
>>>> +
>>>> +    rc = check_cpu_id(priv);
>>>> +    if (rc) {
>>>> +        dev_err(dev, "Client CPU is not supported\n");
>>>
>>> -ENODEV is not an error, and should not result in an error message.
>>> Besides, the error can also be propagated from peci core code,
>>> and may well be something else.
>>>
>>
>> Got it. I'll remove the error message and will add proper handling
>> code to the PECI core.
>>
>>>> +        return rc;
>>>> +    }
>>>> +
>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_die];
>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
>>>> +
>>>> +    rc = create_core_temp_info(priv);
>>>> +    if (rc)
>>>> +        dev_dbg(dev, "Failed to create core temp info\n");
>>>
>>> Then what ? Shouldn't this result in probe deferral or something more
>>> useful instead of just being ignored ?
>>>
>>
>> This driver can't support core temperature monitoring if the CPU
>> doesn't support the PECI_CMD_RD_PCI_CFG_LOCAL command. In that case,
>> it skips core temperature group creation and supports only basic
>> temperature monitoring such as Die, DTS margin, etc. I'll add this
>> description as a comment.
>>
> 
> The message says "Failed to ...". It does not say "This CPU does not
> support ...".
> 

Got it. Will correct the message.

>>>> +
>>>> +    priv->chip.ops = &cputemp_ops;
>>>> +    priv->chip.info = priv->info;
>>>> +
>>>> +    priv->info[0] = &priv->temp_info;
>>>> +
>>>> +    priv->temp_info.type = hwmon_temp;
>>>> +    priv->temp_info.config = priv->temp_config;
>>>> +
>>>> +    hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>>> +                             priv->name,
>>>> +                             priv,
>>>> +                             &priv->chip,
>>>> +                             NULL);
>>>> +
>>>> +    if (IS_ERR(hwmon_dev))
>>>> +        return PTR_ERR(hwmon_dev);
>>>> +
>>>> +    dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
>>>> +
> 
> Why does this message display the device name twice ?
> 

For example, dev_name(hwmon_dev) shows 'hwmon5' and priv->name shows
'peci-cputemp0'.

>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static const struct of_device_id peci_cputemp_of_table[] = {
>>>> +    { .compatible = "intel,peci-cputemp" },
>>>> +    { }
>>>> +};
>>>> +MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
>>>> +
>>>> +static struct peci_driver peci_cputemp_driver = {
>>>> +    .probe  = peci_cputemp_probe,
>>>> +    .driver = {
>>>> +        .name           = "peci-cputemp",
>>>> +        .of_match_table = of_match_ptr(peci_cputemp_of_table),
>>>> +    },
>>>> +};
>>>> +module_peci_driver(peci_cputemp_driver);
>>>> +
>>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>>> +MODULE_DESCRIPTION("PECI cputemp driver");
>>>> +MODULE_LICENSE("GPL v2");
>>>> diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
>>>> new file mode 100644
>>>> index 000000000000..78bf29cb2c4c
>>>> --- /dev/null
>>>> +++ b/drivers/hwmon/peci-dimmtemp.c
>>>
>>> FWIW, this should be two separate patches.
>>>
>>
>> Should I split out hwmon documents and dt bindings too?
>>
>>>> @@ -0,0 +1,432 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +// Copyright (c) 2018 Intel Corporation
>>>> +
>>>> +#include <linux/delay.h>
>>>> +#include <linux/hwmon.h>
>>>> +#include <linux/hwmon-sysfs.h>
>>>
>>> Needed ?
>>>
>>
>> No. Will drop the line.
>>
>>>> +#include <linux/jiffies.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/of_device.h>
>>>> +#include <linux/peci.h>
>>>> +#include <linux/workqueue.h>
>>>> +
>>>> +#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
>>>> +
>>>> +#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
>>>> +#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
>>>> +
>>>> +#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
>>>> +#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
>>>> +
>>>> +#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
>>>> +#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
>>>> +
>>>> +#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
>>>> +#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
>>>> +
>>>> +#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
>>>> +
>>>> +#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
>>>> +
>>>> +#define UPDATE_INTERVAL_MIN  HZ
>>>> +
>>>> +#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
>>>> +#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
>>>> +
>>>> +enum cpu_gens {
>>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>>> +    CPU_GEN_MAX
>>>> +};
>>>> +
>>>> +struct cpu_gen_info {
>>>> +    u32 type;
>>>> +    u32 cpu_id;
>>>> +    u32 chan_rank_max;
>>>> +    u32 dimm_idx_max;
>>>> +};
>>>> +
>>>> +struct temp_data {
>>>> +    bool valid;
>>>> +    s32  value;
>>>> +    unsigned long last_updated;
>>>> +};
>>>> +
>>>> +struct peci_dimmtemp {
>>>> +    struct peci_client *client;
>>>> +    struct device *dev;
>>>> +    struct workqueue_struct *work_queue;
>>>> +    struct delayed_work work_handler;
>>>> +    char name[PECI_NAME_SIZE];
>>>> +    struct temp_data temp[DIMM_NUMS_MAX];
>>>> +    u8 addr;
>>>> +    uint cpu_no;
>>>> +    const struct cpu_gen_info *gen_info;
>>>> +    u32 dimm_mask;
>>>> +    int retry_count;
>>>> +    int channels;
>>>> +    u32 temp_config[DIMM_NUMS_MAX + 1];
>>>> +    struct hwmon_channel_info temp_info;
>>>> +    const struct hwmon_channel_info *info[2];
>>>> +    struct hwmon_chip_info chip;
>>>> +};
>>>> +
>>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>>> +    { .type  = CPU_GEN_HSX,
>>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
>>>> +    { .type  = CPU_GEN_BRX,
>>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
>>>> +    { .type  = CPU_GEN_SKX,
>>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
>>>> +};
>>>> +
>>>> +static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
>>>> +    { "DIMM A0", "DIMM A1", "DIMM A2" },
>>>> +    { "DIMM B0", "DIMM B1", "DIMM B2" },
>>>> +    { "DIMM C0", "DIMM C1", "DIMM C2" },
>>>> +    { "DIMM D0", "DIMM D1", "DIMM D2" },
>>>> +    { "DIMM E0", "DIMM E1", "DIMM E2" },
>>>> +    { "DIMM F0", "DIMM F1", "DIMM F2" },
>>>> +    { "DIMM G0", "DIMM G1", "DIMM G2" },
>>>> +    { "DIMM H0", "DIMM H1", "DIMM H2" },
>>>> +};
>>>> +
>>>> +static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
>>>> +             void *msg)
>>>> +{
>>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>>> +}
>>>> +
>>>> +static int need_update(struct temp_data *temp)
>>>> +{
>>>> +    if (temp->valid &&
>>>> +        time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>>>> +        return 0;
>>>> +
>>>> +    return 1;
>>>> +}
>>>> +
>>>> +static void mark_updated(struct temp_data *temp)
>>>> +{
>>>> +    temp->valid = true;
>>>> +    temp->last_updated = jiffies;
>>>> +}
>>>
>>> It might make sense to provide the duplicate functions in a core file.
>>>
>>
>> These are temperature monitoring specific functions and they touch
>> module specific variables. Do you really think that these non-generic
>> functions should be moved to the PECI core?
>>
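
For discussion, a sketch of how the freshness check could be written
so that it does not depend on either driver's private struct. The type
and function names are hypothetical, the tick counter stands in for
jiffies, and the comparison imitates what time_before() does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache entry shared by both drivers. */
struct temp_cache {
	bool valid;
	int32_t value;
	uint32_t last_updated;	/* tick counter, may wrap around */
};

/* Wraparound-safe "a is before b", in the spirit of time_before(). */
static bool tick_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* True if the entry was never filled or is older than min_interval. */
static bool cache_needs_update(const struct temp_cache *c, uint32_t now,
			       uint32_t min_interval)
{
	return !(c->valid && tick_before(now, c->last_updated + min_interval));
}

static void cache_mark_updated(struct temp_cache *c, uint32_t now)
{
	c->valid = true;
	c->last_updated = now;
}
```
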
>>>> +
>>>> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
>>>> +{
>>>> +    int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
>>>> +    int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    int rc;
>>>> +
>>>> +    if (!need_update(&priv->temp[dimm_no]))
>>>> +        return 0;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +    msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>>> +    msg.param = chan_rank;
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
>>>> +
>>>> +    mark_updated(&priv->temp[dimm_no]);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
>>>> +{
>>>> +    int dimm_nums_max = priv->gen_info->chan_rank_max *
>>>> +                priv->gen_info->dimm_idx_max;
>>>> +    int idx, found = 0;
>>>> +
>>>> +    for (idx = 0; idx < dimm_nums_max; idx++) {
>>>> +        if (priv->dimm_mask & BIT(idx)) {
>>>> +            if (channel == found)
>>>> +                break;
>>>> +
>>>> +            found++;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return idx;
>>>> +}
>>>
>>> This again looks like duplicate code.
>>>
>>
>> find_dimm_number()? I'm sure it isn't.
>>
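
For readers following along: find_dimm_number() maps a hwmon channel
number to the position of the n-th set bit in dimm_mask. A standalone
sketch of that lookup, with a made-up function name and plain integer
parameters instead of the driver's private struct:

```c
#include <stdint.h>

/*
 * Return the bit position of the (channel + 1)-th set bit in mask,
 * scanning upwards from bit 0. Returns nbits if the mask has fewer
 * set bits than that. nbits must not exceed 32.
 */
static int nth_set_bit(uint32_t mask, int nbits, int channel)
{
	int idx, found = 0;

	for (idx = 0; idx < nbits; idx++) {
		if (mask & (1u << idx)) {
			if (found == channel)
				break;
			found++;
		}
	}

	return idx;
}
```

With mask 0x29 (bits 0, 3 and 5 set), channels 0, 1 and 2 map to
positions 0, 3 and 5. A channel beyond that yields nbits, which a
caller would need to treat as "not found".
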
>>>> +
>>>> +static int dimmtemp_read_string(struct device *dev,
>>>> +                enum hwmon_sensor_types type,
>>>> +                u32 attr, int channel, const char **str)
>>>> +{
>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>>> +    int dimm_no, chan_rank, dimm_idx;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_label:
>>>> +        dimm_no = find_dimm_number(priv, channel);
>>>> +        chan_rank = dimm_no / dimm_idx_max;
>>>> +        dimm_idx = dimm_no % dimm_idx_max;
>>>> +        *str = dimmtemp_label[chan_rank][dimm_idx];
>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
>>>> +             u32 attr, int channel, long *val)
>>>> +{
>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>>> +    int dimm_no = find_dimm_number(priv, channel);
>>>> +    int rc;
>>>> +
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_input:
>>>> +        rc = get_dimm_temp(priv, dimm_no);
>>>> +        if (rc)
>>>> +            return rc;
>>>> +
>>>> +        *val = priv->temp[dimm_no].value;
>>>> +        return 0;
>>>> +    default:
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +}
>>>> +
>>>> +static umode_t dimmtemp_is_visible(const void *data,
>>>> +                   enum hwmon_sensor_types type,
>>>> +                   u32 attr, int channel)
>>>> +{
>>>> +    switch (attr) {
>>>> +    case hwmon_temp_label:
>>>> +    case hwmon_temp_input:
>>>> +        return 0444;
>>>> +    default:
>>>> +        return 0;
>>>> +    }
>>>> +}
>>>> +
>>>> +static const struct hwmon_ops dimmtemp_ops = {
>>>> +    .is_visible = dimmtemp_is_visible,
>>>> +    .read_string = dimmtemp_read_string,
>>>> +    .read = dimmtemp_read,
>>>> +};
>>>> +
>>>> +static int check_populated_dimms(struct peci_dimmtemp *priv)
>>>> +{
>>>> +    u32 chan_rank_max = priv->gen_info->chan_rank_max;
>>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    int chan_rank, dimm_idx;
>>>> +    int rc, channels = 0;
>>>> +
>>>> +    for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
>>>> +        msg.addr = priv->addr;
>>>> +        msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>>> +        msg.param = chan_rank;
>>>> +        msg.rx_len = 4;
>>>> +
>>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +        if (rc) {
>>>> +            priv->dimm_mask = 0;
>>>> +            return rc;
>>>> +        }
>>>> +
>>>> +        for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
>>>> +            if (msg.pkg_config[dimm_idx]) {
>>>> +                priv->dimm_mask |= BIT(chan_rank *
>>>> +                               chan_rank_max +
>>>> +                               dimm_idx);
>>>> +                channels++;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (!priv->dimm_mask)
>>>> +        return -EAGAIN;
>>>> +
>>>> +    priv->channels = channels;
>>>> +
>>>> +    dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
>>>> +{
>>>> +    struct device *hwmon_dev;
>>>> +    int rc, i;
>>>> +
>>>> +    rc = check_populated_dimms(priv);
>>>> +    if (!rc) {
>>>
>>> Please handle error cases first.
>>>
>>
>> Sure, I'll rewrite it.
>>
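
One way the error-first ordering could look, reduced to plain C so the
control flow is visible. The struct and its counters are hypothetical
stand-ins for the delayed work and the hwmon registration; only the
retry limit and the error codes follow the patch.

```c
#include <errno.h>

#define RETRY_MAX 60	/* mirrors DIMM_MASK_CHECK_RETRY_MAX */

/* Hypothetical state; the counters stand in for kernel calls. */
struct retry_state {
	int retry_count;
	int rescheduled;	/* stands in for queue_delayed_work() */
	int registered;		/* stands in for hwmon registration */
};

/* scan_rc is the DIMM scan result: 0, -EAGAIN, or another error. */
static int create_info_error_first(struct retry_state *st, int scan_rc)
{
	if (scan_rc && scan_rc != -EAGAIN)
		return scan_rc;		/* hard failure, give up */

	if (scan_rc == -EAGAIN) {
		if (st->retry_count >= RETRY_MAX)
			return -ETIMEDOUT;

		st->retry_count++;
		st->rescheduled++;	/* try again later */
		return -EAGAIN;
	}

	st->registered = 1;		/* success path, no extra nesting */
	return 0;
}
```
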
>>>> +        for (i = 0; i < priv->channels; i++)
>>>> +            priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
>>>> +
>>>> +        priv->chip.ops = &dimmtemp_ops;
>>>> +        priv->chip.info = priv->info;
>>>> +
>>>> +        priv->info[0] = &priv->temp_info;
>>>> +
>>>> +        priv->temp_info.type = hwmon_temp;
>>>> +        priv->temp_info.config = priv->temp_config;
>>>> +
>>>> +        hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>>> +                                 priv->name,
>>>> +                                 priv,
>>>> +                                 &priv->chip,
>>>> +                                 NULL);
>>>> +        rc = PTR_ERR_OR_ZERO(hwmon_dev);
>>>> +        if (!rc)
>>>> +            dev_dbg(priv->dev, "%s: sensor '%s'\n",
>>>> +                dev_name(hwmon_dev), priv->name);
>>>> +    } else if (rc == -EAGAIN) {
>>>> +        if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
>>>> +            queue_delayed_work(priv->work_queue,
>>>> +                       &priv->work_handler,
>>>> +                       DIMM_MASK_CHECK_DELAY_JIFFIES);
>>>> +            priv->retry_count++;
>>>> +            dev_dbg(priv->dev,
>>>> +                "Deferred DIMM temp info creation\n");
>>>> +        } else {
>>>> +            rc = -ETIMEDOUT;
>>>> +            dev_err(priv->dev,
>>>> +                "Timeout retrying DIMM temp info creation\n");
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static void create_dimm_temp_info_delayed(struct work_struct *work)
>>>> +{
>>>> +    struct delayed_work *dwork = to_delayed_work(work);
>>>> +    struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
>>>> +                          work_handler);
>>>> +    int rc;
>>>> +
>>>> +    rc = create_dimm_temp_info(priv);
>>>> +    if (rc && rc != -EAGAIN)
>>>> +        dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
>>>> +}
>>>> +
>>>> +static int check_cpu_id(struct peci_dimmtemp *priv)
>>>> +{
>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>> +    u32 cpu_id;
>>>> +    int i, rc;
>>>> +
>>>> +    msg.addr = priv->addr;
>>>> +    msg.index = MBX_INDEX_CPU_ID;
>>>> +    msg.param = PKG_ID_CPU_ID;
>>>> +    msg.rx_len = 4;
>>>> +
>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>> +    if (rc)
>>>> +        return rc;
>>>> +
>>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>>> +
>>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (!priv->gen_info)
>>>> +        return -ENODEV;
>>>> +
>>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>>> +    return 0;
>>>> +}
>>>
>>> More duplicate code.
>>>
>>
>> Okay. In case of check_cpu_id(), it could be used as a generic PECI 
>> function. I'll move it into PECI core.
>>
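
A sketch of what a shared lookup might look like once it no longer
depends on a driver-private struct. The function and struct names are
hypothetical; the mask and the three table IDs are the ones from the
patch.

```c
#include <stddef.h>
#include <stdint.h>

#define CPU_ID_MASK 0xf0ff0u	/* family / model bits, as in the patch */

/* Hypothetical, trimmed-down version of struct cpu_gen_info. */
struct cpu_gen {
	uint32_t cpu_id;
	const char *name;
};

static const struct cpu_gen cpu_gens[] = {
	{ 0x306f0, "Haswell Xeon" },
	{ 0x406f0, "Broadwell Xeon" },
	{ 0x50650, "Skylake Xeon" },
};

/* Return the matching table entry, or NULL for an unknown CPU. */
static const struct cpu_gen *lookup_cpu_gen(uint32_t raw_cpuid)
{
	uint32_t id = raw_cpuid & CPU_ID_MASK;
	size_t i;

	for (i = 0; i < sizeof(cpu_gens) / sizeof(cpu_gens[0]); i++) {
		if (cpu_gens[i].cpu_id == id)
			return &cpu_gens[i];
	}

	return NULL;
}
```

Because the mask clears the stepping bits, a raw value such as 0x306f2
still matches the 0x306f0 entry. Returning NULL leaves it to the caller
to decide whether an unknown CPU maps to -ENODEV.
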
>>>> +
>>>> +static int peci_dimmtemp_probe(struct peci_client *client)
>>>> +{
>>>> +    struct device *dev = &client->dev;
>>>> +    struct peci_dimmtemp *priv;
>>>> +    int rc;
>>>> +
>>>> +    if ((client->adapter->cmd_mask &
>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>>
>>> One set of ( ) is unnecessary on each side of the expression.
>>>
>>
>> '&' has a precedence over '!=' but '|' doesn't. I'll rewrite it to:
>>
> 
> Actually, that is wrong. You refer to address-of. Bit operations do have
> lower precedence than comparisons. I stand corrected.
> 
>>      if (client->adapter->cmd_mask &
>>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)) !=
>>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)))
>>
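
To make the precedence point concrete: in C, '!=' binds tighter than
binary '&', so the parentheses around the masked value are required. A
small sketch with made-up function names; the second function spells
out how the compiler groups the expression when those parentheses are
dropped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Intended test: is any bit of 'required' missing from 'mask'? */
static bool missing_bits(uint32_t mask, uint32_t required)
{
	return (mask & required) != required;
}

/*
 * "mask & a != b" is grouped as "mask & (a != b)". When a and b are
 * the same value, the comparison yields 0 and the result is always
 * false, whatever the mask contains.
 */
static bool parsed_without_parens(uint32_t mask, uint32_t a, uint32_t b)
{
	return mask & (a != b);
}
```

So with the inner parentheses removed, the "not supported" branch would
never be taken.
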
>>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>>> +        return -EINVAL;
>>>
>>> Why is this "invalid", and why does it warrant an error message ?
>>>
>>
>> Should I use -EPERM? Any suggestion?
>>
> 
> Is it an _error_ if the CPU does not support this functionality ?
> 

Actually, it returns from this probe() function without creating any
hwmon info, so I intended to handle this case as an error. Am I wrong?

>>>> +    }
>>>> +
>>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>>> +    if (!priv)
>>>> +        return -ENOMEM;
>>>> +
>>>> +    dev_set_drvdata(dev, priv);
>>>> +    priv->client = client;
>>>> +    priv->dev = dev;
>>>> +    priv->addr = client->addr;
>>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>>
>>> Is priv->addr guaranteed to be >= PECI_BASE_ADDR ?
>>
>> Client address range validation will be done in 
>> peci_check_addr_validity() in PECI core before probing a device driver.
>>
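
As a sketch of the kind of range check meant here. The base address
and the number of client addresses are assumptions made for this
example, and the function name is made up; the actual check is the one
in the PECI core.

```c
#include <stdint.h>

#define PECI_BASE_ADDR		0x30	/* assumed value */
#define PECI_CLIENTS_MAX	8	/* assumed number of client addresses */

/* Return the CPU number for a client address, or -1 if out of range. */
static int peci_addr_to_cpu_no(uint8_t addr)
{
	if (addr < PECI_BASE_ADDR ||
	    addr >= PECI_BASE_ADDR + PECI_CLIENTS_MAX)
		return -1;

	return addr - PECI_BASE_ADDR;
}
```
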
>>>> +
>>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
>>>> +         priv->cpu_no);
>>>> +
>>>> +    rc = check_cpu_id(priv);
>>>> +    if (rc) {
>>>> +        dev_err(dev, "Client CPU is not supported\n");
>>>
>>> Or the peci command failed.
>>>
>>
>> I'll remove the error message and will add proper handling code to
>> the PECI core for each error type.
>>
>>>> +        return rc;
>>>> +    }
>>>> +
>>>> +    priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
>>>> +    if (!priv->work_queue)
>>>> +        return -ENOMEM;
>>>> +
>>>> +    INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
>>>> +
>>>> +    rc = create_dimm_temp_info(priv);
>>>> +    if (rc && rc != -EAGAIN) {
>>>> +        dev_err(dev, "Failed to create DIMM temp info\n");
>>>> +        goto err_free_wq;
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +
>>>> +err_free_wq:
>>>> +    destroy_workqueue(priv->work_queue);
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static int peci_dimmtemp_remove(struct peci_client *client)
>>>> +{
>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
>>>> +
>>>> +    cancel_delayed_work(&priv->work_handler);
>>>
>>> cancel_delayed_work_sync() ?
>>>
>>
>> Yes, it would be safer. Will fix it.
>>
>>>> +    destroy_workqueue(priv->work_queue);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static const struct of_device_id peci_dimmtemp_of_table[] = {
>>>> +    { .compatible = "intel,peci-dimmtemp" },
>>>> +    { }
>>>> +};
>>>> +MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
>>>> +
>>>> +static struct peci_driver peci_dimmtemp_driver = {
>>>> +    .probe  = peci_dimmtemp_probe,
>>>> +    .remove = peci_dimmtemp_remove,
>>>> +    .driver = {
>>>> +        .name           = "peci-dimmtemp",
>>>> +        .of_match_table = of_match_ptr(peci_dimmtemp_of_table),
>>>> +    },
>>>> +};
>>>> +module_peci_driver(peci_dimmtemp_driver);
>>>> +
>>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>>> +MODULE_DESCRIPTION("PECI dimmtemp driver");
>>>> +MODULE_LICENSE("GPL v2");
>>>> -- 
>>>> 2.16.2
>>>>
>>
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-12  2:51         ` Jae Hyun Yoo
@ 2018-04-12  3:40           ` Guenter Roeck
  2018-04-12 17:09             ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Guenter Roeck @ 2018-04-12  3:40 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On 04/11/2018 07:51 PM, Jae Hyun Yoo wrote:
> On 4/11/2018 5:34 PM, Guenter Roeck wrote:
>> On 04/11/2018 02:59 PM, Jae Hyun Yoo wrote:
>>> Hi Guenter,
>>>
>>> Thanks a lot for sharing your time. Please see my inline answers.
>>>
>>> On 4/10/2018 3:28 PM, Guenter Roeck wrote:
>>>> On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:
>>>>> This commit adds PECI cputemp and dimmtemp hwmon drivers.
>>>>>
>>>>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>>>>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>>>>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>>>>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>>>>> Cc: Alan Cox <alan@linux.intel.com>
>>>>> Cc: Andrew Jeffery <andrew@aj.id.au>
>>>>> Cc: Andrew Lunn <andrew@lunn.ch>
>>>>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>>>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>>>>> Cc: Greg KH <gregkh@linuxfoundation.org>
>>>>> Cc: Guenter Roeck <linux@roeck-us.net>
>>>>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>>>>> Cc: Jean Delvare <jdelvare@suse.com>
>>>>> Cc: Joel Stanley <joel@jms.id.au>
>>>>> Cc: Julia Cartwright <juliac@eso.teric.us>
>>>>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>>>>> Cc: Milton Miller II <miltonm@us.ibm.com>
>>>>> Cc: Pavel Machek <pavel@ucw.cz>
>>>>> Cc: Randy Dunlap <rdunlap@infradead.org>
>>>>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>>>>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>>>>> ---
>>>>>   drivers/hwmon/Kconfig         |  28 ++
>>>>>   drivers/hwmon/Makefile        |   2 +
>>>>>   drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>>>>>   drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
>>>>>   4 files changed, 1245 insertions(+)
>>>>>   create mode 100644 drivers/hwmon/peci-cputemp.c
>>>>>   create mode 100644 drivers/hwmon/peci-dimmtemp.c
>>>>>
>>>>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>>>>> index f249a4428458..c52f610f81d0 100644
>>>>> --- a/drivers/hwmon/Kconfig
>>>>> +++ b/drivers/hwmon/Kconfig
>>>>> @@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
>>>>>         This driver can also be built as a module.  If so, the module
>>>>>         will be called nct7904.
>>>>> +config SENSORS_PECI_CPUTEMP
>>>>> +    tristate "PECI CPU temperature monitoring support"
>>>>> +    depends on OF
>>>>> +    depends on PECI
>>>>> +    help
>>>>> +      If you say yes here you get support for the generic Intel PECI
>>>>> +      cputemp driver which provides Digital Thermal Sensor (DTS) thermal
>>>>> +      readings of the CPU package and CPU cores that are accessible using
>>>>> +      the PECI Client Command Suite via the processor PECI client.
>>>>> +      Check Documentation/hwmon/peci-cputemp for details.
>>>>> +
>>>>> +      This driver can also be built as a module.  If so, the module
>>>>> +      will be called peci-cputemp.
>>>>> +
>>>>> +config SENSORS_PECI_DIMMTEMP
>>>>> +    tristate "PECI DIMM temperature monitoring support"
>>>>> +    depends on OF
>>>>> +    depends on PECI
>>>>> +    help
>>>>> +      If you say yes here you get support for the generic Intel PECI hwmon
>>>>> +      driver which provides Digital Thermal Sensor (DTS) thermal readings of
>>>>> +      DIMM components that are accessible using the PECI Client Command
>>>>> +      Suite via the processor PECI client.
>>>>> +      Check Documentation/hwmon/peci-dimmtemp for details.
>>>>> +
>>>>> +      This driver can also be built as a module.  If so, the module
>>>>> +      will be called peci-dimmtemp.
>>>>> +
>>>>>   config SENSORS_NSA320
>>>>>       tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
>>>>>       depends on GPIOLIB && OF
>>>>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>>>>> index e7d52a36e6c4..48d9598fcd3a 100644
>>>>> --- a/drivers/hwmon/Makefile
>>>>> +++ b/drivers/hwmon/Makefile
>>>>> @@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)    += nct7802.o
>>>>>   obj-$(CONFIG_SENSORS_NCT7904)    += nct7904.o
>>>>>   obj-$(CONFIG_SENSORS_NSA320)    += nsa320-hwmon.o
>>>>>   obj-$(CONFIG_SENSORS_NTC_THERMISTOR)    += ntc_thermistor.o
>>>>> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)    += peci-cputemp.o
>>>>> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
>>>>>   obj-$(CONFIG_SENSORS_PC87360)    += pc87360.o
>>>>>   obj-$(CONFIG_SENSORS_PC87427)    += pc87427.o
>>>>>   obj-$(CONFIG_SENSORS_PCF8591)    += pcf8591.o
>>>>> diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
>>>>> new file mode 100644
>>>>> index 000000000000..f0bc92687512
>>>>> --- /dev/null
>>>>> +++ b/drivers/hwmon/peci-cputemp.c
>>>>> @@ -0,0 +1,783 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>> +// Copyright (c) 2018 Intel Corporation
>>>>> +
>>>>> +#include <linux/delay.h>
>>>>> +#include <linux/hwmon.h>
>>>>> +#include <linux/hwmon-sysfs.h>
>>>>
>>>> Is this include needed ?
>>>>
>>>
>>> No it isn't. Will drop the line.
>>>
>>>>> +#include <linux/jiffies.h>
>>>>> +#include <linux/module.h>
>>>>> +#include <linux/of_device.h>
>>>>> +#include <linux/peci.h>
>>>>> +
>>>>> +#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
>>>>> +
>>>>> +#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
>>>>> +#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
>>>>> +#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
>>>>> +
>>>>> +#define DEFAULT_CHANNEL_NUMS  5
>>>>> +#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
>>>>> +#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
>>>>> +
>>>>> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
>>>>> +
>>>>> +#define UPDATE_INTERVAL_MIN   HZ
>>>>> +
>>>>> +enum cpu_gens {
>>>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>>>> +    CPU_GEN_MAX
>>>>> +};
>>>>> +
>>>>> +struct cpu_gen_info {
>>>>> +    u32 type;
>>>>> +    u32 cpu_id;
>>>>> +    u32 core_max;
>>>>> +};
>>>>> +
>>>>> +struct temp_data {
>>>>> +    bool valid;
>>>>> +    s32  value;
>>>>> +    unsigned long last_updated;
>>>>> +};
>>>>> +
>>>>> +struct temp_group {
>>>>> +    struct temp_data die;
>>>>> +    struct temp_data dts_margin;
>>>>> +    struct temp_data tcontrol;
>>>>> +    struct temp_data tthrottle;
>>>>> +    struct temp_data tjmax;
>>>>> +    struct temp_data core[CORETEMP_CHANNEL_NUMS];
>>>>> +};
>>>>> +
>>>>> +struct peci_cputemp {
>>>>> +    struct peci_client *client;
>>>>> +    struct device *dev;
>>>>> +    char name[PECI_NAME_SIZE];
>>>>> +    struct temp_group temp;
>>>>> +    u8 addr;
>>>>> +    uint cpu_no;
>>>>> +    const struct cpu_gen_info *gen_info;
>>>>> +    u32 core_mask;
>>>>> +    u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
>>>>> +    uint config_idx;
>>>>> +    struct hwmon_channel_info temp_info;
>>>>> +    const struct hwmon_channel_info *info[2];
>>>>> +    struct hwmon_chip_info chip;
>>>>> +};
>>>>> +
>>>>> +enum cputemp_channels {
>>>>> +    channel_die,
>>>>> +    channel_dts_mrgn,
>>>>> +    channel_tcontrol,
>>>>> +    channel_tthrottle,
>>>>> +    channel_tjmax,
>>>>> +    channel_core,
>>>>> +};
>>>>> +
>>>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>>>> +    { .type = CPU_GEN_HSX,
>>>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>>>> +      .core_max = CORE_MAX_ON_HSX },
>>>>> +    { .type = CPU_GEN_BRX,
>>>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>>>> +      .core_max = CORE_MAX_ON_BDX },
>>>>> +    { .type = CPU_GEN_SKX,
>>>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>>>> +      .core_max = CORE_MAX_ON_SKX },
>>>>> +};
>>>>> +
>>>>> +static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
>>>>> +    /* Die temperature */
>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>>>> +    HWMON_T_CRIT_HYST,
>>>>> +
>>>>> +    /* DTS margin temperature */
>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
>>>>> +
>>>>> +    /* Tcontrol temperature */
>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
>>>>> +
>>>>> +    /* Tthrottle temperature */
>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>>>> +
>>>>> +    /* Tjmax temperature */
>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>>>> +
>>>>> +    /* Core temperature - for all core channels */
>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>>>> +    HWMON_T_CRIT_HYST,
>>>>> +};
>>>>> +
>>>>> +static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
>>>>> +    "Die",
>>>>> +    "DTS margin",
>>>>> +    "Tcontrol",
>>>>> +    "Tthrottle",
>>>>> +    "Tjmax",
>>>>> +    "Core 0", "Core 1", "Core 2", "Core 3",
>>>>> +    "Core 4", "Core 5", "Core 6", "Core 7",
>>>>> +    "Core 8", "Core 9", "Core 10", "Core 11",
>>>>> +    "Core 12", "Core 13", "Core 14", "Core 15",
>>>>> +    "Core 16", "Core 17", "Core 18", "Core 19",
>>>>> +    "Core 20", "Core 21", "Core 22", "Core 23",
>>>>> +};
>>>>> +
>>>>> +static int send_peci_cmd(struct peci_cputemp *priv,
>>>>> +             enum peci_cmd cmd,
>>>>> +             void *msg)
>>>>> +{
>>>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>>>> +}
>>>>> +
>>>>> +static int need_update(struct temp_data *temp)
>>>>
>>>> Please use bool.
>>>>
>>>
>>> Okay. I'll use bool instead of int.
>>>
>>>>> +{
>>>>> +    if (temp->valid &&
>>>>> +        time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>>>>> +        return 0;
>>>>> +
>>>>> +    return 1;
>>>>> +}
>>>>> +
>>>>> +static void mark_updated(struct temp_data *temp)
>>>>> +{
>>>>> +    temp->valid = true;
>>>>> +    temp->last_updated = jiffies;
>>>>> +}
>>>>> +
>>>>> +static s32 ten_dot_six_to_millidegree(s32 val)
>>>>> +{
>>>>> +    return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
>>>>> +}
>>>>> +
>>>>> +static int get_tjmax(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!priv->temp.tjmax.valid) {
>>>>> +        msg.addr = priv->addr;
>>>>> +        msg.index = MBX_INDEX_TEMP_TARGET;
>>>>> +        msg.param = 0;
>>>>> +        msg.rx_len = 4;
>>>>> +
>>>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
>>>>> +        priv->temp.tjmax.valid = true;
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int get_tcontrol(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    s32 tcontrol_margin;
>>>>> +    s32 tthrottle_offset;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!need_update(&priv->temp.tcontrol))
>>>>> +        return 0;
>>>>> +
>>>>> +    rc = get_tjmax(priv);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>>>> +    msg.param = 0;
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    tcontrol_margin = msg.pkg_config[1];
>>>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>>>>> +
>>>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>>>>> +
>>>>> +    mark_updated(&priv->temp.tcontrol);
>>>>> +    mark_updated(&priv->temp.tthrottle);
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int get_tthrottle(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    s32 tcontrol_margin;
>>>>> +    s32 tthrottle_offset;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!need_update(&priv->temp.tthrottle))
>>>>> +        return 0;
>>>>> +
>>>>> +    rc = get_tjmax(priv);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>>>> +    msg.param = 0;
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>>>>> +
>>>>> +    tcontrol_margin = msg.pkg_config[1];
>>>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>>>>> +
>>>>> +    mark_updated(&priv->temp.tthrottle);
>>>>> +    mark_updated(&priv->temp.tcontrol);
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>
>>>> I am quite completely missing how the two functions above are different.
>>>>
>>>
>>> The two functions above differ slightly but use the same PECI command,
>>> which returns both the Tthrottle and Tcontrol values in the pkg_config
>>> array, so each function updates both values to avoid duplicate PECI
>>> transactions. Combining them into a single get_tthrottle_and_tcontrol()
>>> would probably look better. I'll rewrite it.
>>>
>>>>> +
>>>>> +static int get_die_temp(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    struct peci_get_temp_msg msg;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!need_update(&priv->temp.die))
>>>>> +        return 0;
>>>>> +
>>>>> +    rc = get_tjmax(priv);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    priv->temp.die.value = priv->temp.tjmax.value +
>>>>> +                   ((s32)msg.temp_raw * 1000 / 64);
>>>>> +
>>>>> +    mark_updated(&priv->temp.die);
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int get_dts_margin(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    s32 dts_margin;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!need_update(&priv->temp.dts_margin))
>>>>> +        return 0;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.index = MBX_INDEX_DTS_MARGIN;
>>>>> +    msg.param = 0;
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>>>> +
>>>>> +    /**
>>>>> +     * Processors return a value of DTS reading in 10.6 format
>>>>> +     * (10 bits signed decimal, 6 bits fractional).
>>>>> +     * Error codes:
>>>>> +     *   0x8000: General sensor error
>>>>> +     *   0x8001: Reserved
>>>>> +     *   0x8002: Underflow on reading value
>>>>> +     *   0x8003-0x81ff: Reserved
>>>>> +     */
>>>>> +    if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
>>>>> +        return -EIO;
>>>>> +
>>>>> +    dts_margin = ten_dot_six_to_millidegree(dts_margin);
>>>>> +
>>>>> +    priv->temp.dts_margin.value = dts_margin;
>>>>> +
>>>>> +    mark_updated(&priv->temp.dts_margin);
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int get_core_temp(struct peci_cputemp *priv, int core_index)
>>>>> +{
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    s32 core_dts_margin;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!need_update(&priv->temp.core[core_index]))
>>>>> +        return 0;
>>>>> +
>>>>> +    rc = get_tjmax(priv);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
>>>>> +    msg.param = core_index;
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>>>> +
>>>>> +    /**
>>>>> +     * Processors return a value of the core DTS reading in 10.6 format
>>>>> +     * (10 bits signed decimal, 6 bits fractional).
>>>>> +     * Error codes:
>>>>> +     *   0x8000: General sensor error
>>>>> +     *   0x8001: Reserved
>>>>> +     *   0x8002: Underflow on reading value
>>>>> +     *   0x8003-0x81ff: Reserved
>>>>> +     */
>>>>> +    if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>>>>> +        return -EIO;
>>>>> +
>>>>> +    core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
>>>>> +
>>>>> +    priv->temp.core[core_index].value = priv->temp.tjmax.value +
>>>>> +                        core_dts_margin;
>>>>> +
>>>>> +    mark_updated(&priv->temp.core[core_index]);
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>
>>>> There is a lot of duplication in those functions. Would it be possible
>>>> to find common code and use functions for it instead of duplicating
>>>> everything several times ?
>>>>
>>>
>>> Are you pointing out this code?
>>> /**
>>>   * Processors return a value of the core DTS reading in 10.6 format
>>>   * (10 bits signed decimal, 6 bits fractional).
>>>   * Error codes:
>>>   *   0x8000: General sensor error
>>>   *   0x8001: Reserved
>>>   *   0x8002: Underflow on reading value
>>>   *   0x8003-0x81ff: Reserved
>>>   */
>>> if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>>>      return -EIO;
>>>
>>> Then I'll rewrite it as a function. If not, please point out the duplication.
>>>
>>
>> There is lots of other duplication.
>>
> 
> Sorry but can you point out the duplication?
> 
Write a Python script to do a semantic comparison.
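
As one concrete instance of the duplication: get_dts_margin() and
get_core_temp() both assemble a 16-bit value from pkg_config[0..1], reject
the 0x8000-0x81ff error range, and convert from 10.6 format. A rough sketch
of a shared helper follows, written as plain userspace C for illustration
only. The names dts_raw_from_bytes(), dts_raw_is_error() and
dts_raw_to_millidegree() are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: combine the two low bytes of pkg_config[]. */
static uint16_t dts_raw_from_bytes(uint8_t lo, uint8_t hi)
{
	return (uint16_t)((hi << 8) | lo);
}

/* Raw values 0x8000..0x81ff are error codes, per the comment in the patch. */
static bool dts_raw_is_error(uint16_t raw)
{
	return raw >= 0x8000 && raw <= 0x81ff;
}

/*
 * Decode a signed 10.6 fixed-point reading into millidegrees.
 * Returns 0 on success, -EIO if raw is one of the error codes.
 */
static int dts_raw_to_millidegree(uint16_t raw, int32_t *out)
{
	int32_t val;

	assert(out);

	if (dts_raw_is_error(raw))
		return -EIO;

	/* Sign-extend from 16 bits, as ten_dot_six_to_millidegree() does. */
	val = ((int32_t)raw ^ 0x8000) - 0x8000;
	*out = val * 1000 / 64;
	return 0;
}
```

With something along these lines, the two callers would differ only in the
mailbox index, the parameter, and whether Tjmax is added to the result.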

>>>>> +static int find_core_index(struct peci_cputemp *priv, int channel)
>>>>> +{
>>>>> +    int core_channel = channel - DEFAULT_CHANNEL_NUMS;
>>>>> +    int idx, found = 0;
>>>>> +
>>>>> +    for (idx = 0; idx < priv->gen_info->core_max; idx++) {
>>>>> +        if (priv->core_mask & BIT(idx)) {
>>>>> +            if (core_channel == found)
>>>>> +                break;
>>>>> +
>>>>> +            found++;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return idx;
>>>>
>>>> What if nothing is found ?
>>>>
>>>
>>> The core temperature group is registered only when
>>> check_resolved_cores() detects at least one core, so find_core_index()
>>> can be called only when priv->core_mask is non-zero. The 'nothing is
>>> found' case will not happen.
>>>
>> That doesn't guarantee a match. If what you are saying is correct there should always be
>> a well defined match of channel -> idx, and the search should be unnecessary.
>>
> 
> The resolved core mask can contain disabled cores, and the channel
> numbering must not have gaps, which is why this search function is needed.
> A fixed, well-defined channel -> idx mapping would not always hold.
> 
Are you saying that each call to the function, with the same parameters,
can return a different result ?
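
If the answer is no, the mapping is fixed as soon as core_mask is known, so
it could be built once instead of searched on every read. A minimal sketch
of that idea in plain userspace C, not taken from the patch: struct
core_map, core_map_init() and core_map_lookup() are hypothetical names, and
CORE_MAX simply mirrors CORE_MAX_ON_SKX.

```c
#include <assert.h>
#include <stdint.h>

#define CORE_MAX 28 /* mirrors CORE_MAX_ON_SKX in the patch */

/* Hypothetical lookup table, filled once (for example at probe time). */
struct core_map {
	int count;               /* number of enabled cores */
	uint8_t index[CORE_MAX]; /* index[n] = bit position of n-th enabled core */
};

/* Walk the mask once and record the bit position of every enabled core. */
static void core_map_init(struct core_map *map, uint32_t core_mask,
			  int core_max)
{
	int idx;

	assert(core_max <= CORE_MAX);

	map->count = 0;
	for (idx = 0; idx < core_max; idx++) {
		if (core_mask & (UINT32_C(1) << idx))
			map->index[map->count++] = (uint8_t)idx;
	}
}

/*
 * Translate a gap-free core channel number into a core index.
 * Returns -1 when the channel does not correspond to an enabled core.
 */
static int core_map_lookup(const struct core_map *map, int core_channel)
{
	if (core_channel < 0 || core_channel >= map->count)
		return -1;

	return map->index[core_channel];
}
```

Unlike find_core_index(), which returns core_max when nothing matches, the
lookup here returns -1 so that a caller can reject an out-of-range channel
explicitly.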

>>>>> +}
>>>>> +
>>>>> +static int cputemp_read_string(struct device *dev,
>>>>> +                   enum hwmon_sensor_types type,
>>>>> +                   u32 attr, int channel, const char **str)
>>>>> +{
>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>> +    int core_index;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_label:
>>>>> +        if (channel < DEFAULT_CHANNEL_NUMS) {
>>>>> +            *str = cputemp_label[channel];
>>>>> +        } else {
>>>>> +            core_index = find_core_index(priv, channel);
>>>>
>>>> FWIW, it might be better to pass channel - DEFAULT_CHANNEL_NUMS
>>>> as parameter.
>>>>
>>>
>>> cputemp_read_string() is mapped to the read_string member of struct
>>> hwmon_ops, so the hwmon subsystem passes the channel parameter based on
>>> the registered channel order. Should I modify the hwmon subsystem code?
>>>
>>
>> Huh ? Changing
>>      f(x) { y = x - const; }
>> ...
>>      f(x);
>>
>> to
>>      f(y) { }
>> ...
>>      f(x - const);
>>
>> requires a hwmon core change ? Really ?
>>
> 
> Sorry for my misunderstanding. You are right. I'll change the parameter passing of find_core_index() from 'channel' to 'channel - DEFAULT_CHANNEL_NUMS'.
> 
>>>> What if find_core_index() returns priv->gen_info->core_max, ie
>>>> if it didn't find a core ?
>>>>
>>>
>>> As explained above, find_core_index() always returns a correct index.
>>>
>>>>> +            *str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
>>>>> +        }
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int cputemp_read_die(struct device *dev,
>>>>> +                enum hwmon_sensor_types type,
>>>>> +                u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>> +    int rc;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_input:
>>>>> +        rc = get_die_temp(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.die.value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_max:
>>>>> +        rc = get_tcontrol(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tcontrol.value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_crit:
>>>>> +        rc = get_tjmax(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tjmax.value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_crit_hyst:
>>>>> +        rc = get_tcontrol(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int cputemp_read_dts_margin(struct device *dev,
>>>>> +                   enum hwmon_sensor_types type,
>>>>> +                   u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>> +    int rc;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_input:
>>>>> +        rc = get_dts_margin(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.dts_margin.value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_min:
>>>>> +        *val = 0;
>>>>> +        return 0;
>>>>
>>>> This attribute should not exist.
>>>>
>>>
>>> This is an attribute of the DTS margin temperature, which reflects the
>>> thermal margin to Tcontrol of the CPU package. A value of '0' means the
>>> CPU has reached Tcontrol, the first level of thermal warning. If the CPU
>>> keeps getting hotter, the DTS margin goes negative until the temperature
>>> reaches Tjmax. At Tjmax it shows the lower critical value, which lcrit
>>> indicates as the second level of thermal warning.
>>>
>>
>> The hwmon ABI reports chip values, not constants. Even though some drivers do
>> it, reporting a constant is always wrong.
>>
>>>>> +    case hwmon_temp_lcrit:
>>>>> +        rc = get_tcontrol(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tcontrol.value - priv->temp.tjmax.value;
>>>>
>>>> lcrit is tcontrol - tjmax, and crit_hyst above is
>>>> tjmax - tcontrol ? How does this make sense ?
>>>>
>>>
>>> Both Tjmax and Tcontrol are positive, and Tjmax is always greater than
>>> Tcontrol. As explained above, lcrit of the DTS margin should be negative,
>>> meaning the margin has dropped below '0'. On the other hand, crit_hyst of
>>> the Die temperature should show the absolute hysteresis value between
>>> Tcontrol and Tjmax.
>>>
>> The hwmon ABI requires reporting of absolute temperatures in milli-degrees C.
>> Your statements make it very clear that this driver does not report
>> absolute temperatures. This is not acceptable.
>>
> 
> Okay. I'll remove the 'DTS margin' temperature. All others are reporting absolute temperatures.
> 
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int cputemp_read_tcontrol(struct device *dev,
>>>>> +                 enum hwmon_sensor_types type,
>>>>> +                 u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>> +    int rc;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_input:
>>>>> +        rc = get_tcontrol(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tcontrol.value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_crit:
>>>>> +        rc = get_tjmax(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tjmax.value;
>>>>> +        return 0;
>>>>
>>>> Am I missing something, or is the same temperature reported several times ?
>>>> tjmax is also reported as temp_crit cputemp_read_die(), for example.
>>>>
>>>
>>> This driver provides multiple channels, and each channel has its own
>>> supplementary attributes. As you mentioned, the Die temperature channel
>>> and the Core temperature channels have their own crit attributes, and
>>> they reflect the same value, Tjmax. The driver is not reporting one
>>> attribute several times; separate attributes report the same value.
>>>
>> Then maybe fold the functions accordingly ?
>>
> 
> I'll use a single function for 'Die temperature' and 'Core temperature' that have the same attributes set. It would simplify this code a bit.
> 
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int cputemp_read_tthrottle(struct device *dev,
>>>>> +                  enum hwmon_sensor_types type,
>>>>> +                  u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>> +    int rc;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_input:
>>>>> +        rc = get_tthrottle(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tthrottle.value;
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int cputemp_read_tjmax(struct device *dev,
>>>>> +                  enum hwmon_sensor_types type,
>>>>> +                  u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>> +    int rc;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_input:
>>>>> +        rc = get_tjmax(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tjmax.value;
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int cputemp_read_core(struct device *dev,
>>>>> +                 enum hwmon_sensor_types type,
>>>>> +                 u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>> +    int core_index = find_core_index(priv, channel);
>>>>> +    int rc;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_input:
>>>>> +        rc = get_core_temp(priv, core_index);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.core[core_index].value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_max:
>>>>> +        rc = get_tcontrol(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tcontrol.value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_crit:
>>>>> +        rc = get_tjmax(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tjmax.value;
>>>>> +        return 0;
>>>>> +    case hwmon_temp_crit_hyst:
>>>>> +        rc = get_tcontrol(priv);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>
>>>> There is again a lot of duplication in those functions.
>>>>
>>>
>>> Each function is called from cputemp_read(), which is mapped to the read
>>> function pointer of struct hwmon_ops. Since each channel has a different
>>> set of attributes, cputemp_read() calls an individual channel handler
>>> after checking the channel type. Of course, we could handle all
>>> attributes of all channels in a single function, but that approach would
>>> also need channel type checks for each attribute.
>>>
>>>>> +
>>>>> +static int cputemp_read(struct device *dev,
>>>>> +            enum hwmon_sensor_types type,
>>>>> +            u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    switch (channel) {
>>>>> +    case channel_die:
>>>>> +        return cputemp_read_die(dev, type, attr, channel, val);
>>>>> +    case channel_dts_mrgn:
>>>>> +        return cputemp_read_dts_margin(dev, type, attr, channel, val);
>>>>> +    case channel_tcontrol:
>>>>> +        return cputemp_read_tcontrol(dev, type, attr, channel, val);
>>>>> +    case channel_tthrottle:
>>>>> +        return cputemp_read_tthrottle(dev, type, attr, channel, val);
>>>>> +    case channel_tjmax:
>>>>> +        return cputemp_read_tjmax(dev, type, attr, channel, val);
>>>>> +    default:
>>>>> +        if (channel < CPUTEMP_CHANNEL_NUMS)
>>>>> +            return cputemp_read_core(dev, type, attr, channel, val);
>>>>> +
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static umode_t cputemp_is_visible(const void *data,
>>>>> +                  enum hwmon_sensor_types type,
>>>>> +                  u32 attr, int channel)
>>>>> +{
>>>>> +    const struct peci_cputemp *priv = data;
>>>>> +
>>>>> +    if (priv->temp_config[channel] & BIT(attr))
>>>>> +        return 0444;
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static const struct hwmon_ops cputemp_ops = {
>>>>> +    .is_visible = cputemp_is_visible,
>>>>> +    .read_string = cputemp_read_string,
>>>>> +    .read = cputemp_read,
>>>>> +};
>>>>> +
>>>>> +static int check_resolved_cores(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    struct peci_rd_pci_cfg_local_msg msg;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    /* Get the RESOLVED_CORES register value */
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.bus = 1;
>>>>> +    msg.device = 30;
>>>>> +    msg.function = 3;
>>>>> +    msg.reg = 0xB4;
>>>>
>>>> Can this be made less magic with some defines ?
>>>>
>>>
>>> Sure, will use defines instead.
>>>
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    priv->core_mask = msg.pci_config[3] << 24 |
>>>>> +              msg.pci_config[2] << 16 |
>>>>> +              msg.pci_config[1] << 8 |
>>>>> +              msg.pci_config[0];
>>>>> +
>>>>> +    if (!priv->core_mask)
>>>>> +        return -EAGAIN;
>>>>> +
>>>>> +    dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int create_core_temp_info(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    int rc, i;
>>>>> +
>>>>> +    rc = check_resolved_cores(priv);
>>>>> +    if (!rc) {
>>>>> +        for (i = 0; i < priv->gen_info->core_max; i++) {
>>>>> +            if (priv->core_mask & BIT(i)) {
>>>>> +                priv->temp_config[priv->config_idx++] =
>>>>> +                             config_table[channel_core];
>>>>> +            }
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return rc;
>>>>> +}
>>>>> +
>>>>> +static int check_cpu_id(struct peci_cputemp *priv)
>>>>> +{
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    u32 cpu_id;
>>>>> +    int i, rc;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.index = MBX_INDEX_CPU_ID;
>>>>> +    msg.param = PKG_ID_CPU_ID;
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>>>> +
>>>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    if (!priv->gen_info)
>>>>> +        return -ENODEV;
>>>>> +
>>>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int peci_cputemp_probe(struct peci_client *client)
>>>>> +{
>>>>> +    struct device *dev = &client->dev;
>>>>> +    struct peci_cputemp *priv;
>>>>> +    struct device *hwmon_dev;
>>>>> +    int rc;
>>>>> +
>>>>> +    if ((client->adapter->cmd_mask &
>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>>>> +        return -EINVAL;
>>>>
>>>> Does this mean there will be an error message for each non-supported CPU ?
>>>> Why ?
>>>>
>>>
>>> For proper operation of this driver, PECI_CMD_GET_TEMP and PECI_CMD_RD_PKG_CFG have to be supported by the client CPU. PECI_CMD_GET_TEMP is provided as a default command, but PECI_CMD_RD_PKG_CFG depends on the PECI minor revision of the CPU package, so this check is needed.
>>>
>>
>> I do not question the check. I question the error message and error return value.
>> Why is it an _error_ if the CPU does not support the functionality, and why does
>> it have to be reported in the kernel log ?
>>
> 
> Got it. I'll change that to dev_dbg.
> 
>>>>> +    }
>>>>> +
>>>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>>>> +    if (!priv)
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    dev_set_drvdata(dev, priv);
>>>>> +    priv->client = client;
>>>>> +    priv->dev = dev;
>>>>> +    priv->addr = client->addr;
>>>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>>>> +
>>>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
>>>>> +         priv->cpu_no);
>>>>> +
>>>>> +    rc = check_cpu_id(priv);
>>>>> +    if (rc) {
>>>>> +        dev_err(dev, "Client CPU is not supported\n");
>>>>
>>>> -ENODEV is not an error, and should not result in an error message.
>>>> Besides, the error can also be propagated from peci core code,
>>>> and may well be something else.
>>>>
>>>
>>> Got it. I'll remove the error message and will add proper handling code to PECI core.
>>>
>>>>> +        return rc;
>>>>> +    }
>>>>> +
>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_die];
>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
>>>>> +
>>>>> +    rc = create_core_temp_info(priv);
>>>>> +    if (rc)
>>>>> +        dev_dbg(dev, "Failed to create core temp info\n");
>>>>
>>>> Then what ? Shouldn't this result in probe deferral or something more useful
>>>> instead of just being ignored ?
>>>>
>>>
>>> This driver can't support core temperature monitoring if a CPU doesn't support the PECI_CMD_RD_PCI_CFG_LOCAL command. In that case, it skips core temperature group creation and supports only basic temperature monitoring of Die, DTS margin, and so on. I'll add this description as a comment.
>>>
>>
>> The message says "Failed to ...". It does not say "This CPU does not support ...".
>>
> 
> Got it. Will correct the message.
> 
>>>>> +
>>>>> +    priv->chip.ops = &cputemp_ops;
>>>>> +    priv->chip.info = priv->info;
>>>>> +
>>>>> +    priv->info[0] = &priv->temp_info;
>>>>> +
>>>>> +    priv->temp_info.type = hwmon_temp;
>>>>> +    priv->temp_info.config = priv->temp_config;
>>>>> +
>>>>> +    hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>>>> +                             priv->name,
>>>>> +                             priv,
>>>>> +                             &priv->chip,
>>>>> +                             NULL);
>>>>> +
>>>>> +    if (IS_ERR(hwmon_dev))
>>>>> +        return PTR_ERR(hwmon_dev);
>>>>> +
>>>>> +    dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
>>>>> +
>>
>> Why does this message display the device name twice ?
>>
> 
> For example, dev_name(hwmon_dev) shows 'hwmon5' and priv->name shows 'peci-cputemp0'.
> 
And dev_dbg() shows another device name. So you'll have something like

peci-cputemp0: hwmon5: sensor 'peci-cputemp0'

>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static const struct of_device_id peci_cputemp_of_table[] = {
>>>>> +    { .compatible = "intel,peci-cputemp" },
>>>>> +    { }
>>>>> +};
>>>>> +MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
>>>>> +
>>>>> +static struct peci_driver peci_cputemp_driver = {
>>>>> +    .probe  = peci_cputemp_probe,
>>>>> +    .driver = {
>>>>> +        .name           = "peci-cputemp",
>>>>> +        .of_match_table = of_match_ptr(peci_cputemp_of_table),
>>>>> +    },
>>>>> +};
>>>>> +module_peci_driver(peci_cputemp_driver);
>>>>> +
>>>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>>>> +MODULE_DESCRIPTION("PECI cputemp driver");
>>>>> +MODULE_LICENSE("GPL v2");
>>>>> diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
>>>>> new file mode 100644
>>>>> index 000000000000..78bf29cb2c4c
>>>>> --- /dev/null
>>>>> +++ b/drivers/hwmon/peci-dimmtemp.c
>>>>
>>>> FWIW, this should be two separate patches.
>>>>
>>>
>>> Should I split out hwmon documents and dt bindings too?
>>>
>>>>> @@ -0,0 +1,432 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>> +// Copyright (c) 2018 Intel Corporation
>>>>> +
>>>>> +#include <linux/delay.h>
>>>>> +#include <linux/hwmon.h>
>>>>> +#include <linux/hwmon-sysfs.h>
>>>>
>>>> Needed ?
>>>>
>>>
>>> No. Will drop the line.
>>>
>>>>> +#include <linux/jiffies.h>
>>>>> +#include <linux/module.h>
>>>>> +#include <linux/of_device.h>
>>>>> +#include <linux/peci.h>
>>>>> +#include <linux/workqueue.h>
>>>>> +
>>>>> +#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
>>>>> +
>>>>> +#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
>>>>> +#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
>>>>> +
>>>>> +#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
>>>>> +#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
>>>>> +
>>>>> +#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
>>>>> +#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
>>>>> +
>>>>> +#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
>>>>> +#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
>>>>> +
>>>>> +#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
>>>>> +
>>>>> +#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
>>>>> +
>>>>> +#define UPDATE_INTERVAL_MIN  HZ
>>>>> +
>>>>> +#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
>>>>> +#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
>>>>> +
>>>>> +enum cpu_gens {
>>>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>>>> +    CPU_GEN_MAX
>>>>> +};
>>>>> +
>>>>> +struct cpu_gen_info {
>>>>> +    u32 type;
>>>>> +    u32 cpu_id;
>>>>> +    u32 chan_rank_max;
>>>>> +    u32 dimm_idx_max;
>>>>> +};
>>>>> +
>>>>> +struct temp_data {
>>>>> +    bool valid;
>>>>> +    s32  value;
>>>>> +    unsigned long last_updated;
>>>>> +};
>>>>> +
>>>>> +struct peci_dimmtemp {
>>>>> +    struct peci_client *client;
>>>>> +    struct device *dev;
>>>>> +    struct workqueue_struct *work_queue;
>>>>> +    struct delayed_work work_handler;
>>>>> +    char name[PECI_NAME_SIZE];
>>>>> +    struct temp_data temp[DIMM_NUMS_MAX];
>>>>> +    u8 addr;
>>>>> +    uint cpu_no;
>>>>> +    const struct cpu_gen_info *gen_info;
>>>>> +    u32 dimm_mask;
>>>>> +    int retry_count;
>>>>> +    int channels;
>>>>> +    u32 temp_config[DIMM_NUMS_MAX + 1];
>>>>> +    struct hwmon_channel_info temp_info;
>>>>> +    const struct hwmon_channel_info *info[2];
>>>>> +    struct hwmon_chip_info chip;
>>>>> +};
>>>>> +
>>>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>>>> +    { .type  = CPU_GEN_HSX,
>>>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
>>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
>>>>> +    { .type  = CPU_GEN_BRX,
>>>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
>>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
>>>>> +    { .type  = CPU_GEN_SKX,
>>>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
>>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
>>>>> +};
>>>>> +
>>>>> +static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
>>>>> +    { "DIMM A0", "DIMM A1", "DIMM A2" },
>>>>> +    { "DIMM B0", "DIMM B1", "DIMM B2" },
>>>>> +    { "DIMM C0", "DIMM C1", "DIMM C2" },
>>>>> +    { "DIMM D0", "DIMM D1", "DIMM D2" },
>>>>> +    { "DIMM E0", "DIMM E1", "DIMM E2" },
>>>>> +    { "DIMM F0", "DIMM F1", "DIMM F2" },
>>>>> +    { "DIMM G0", "DIMM G1", "DIMM G2" },
>>>>> +    { "DIMM H0", "DIMM H1", "DIMM H2" },
>>>>> +};
>>>>> +
>>>>> +static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
>>>>> +             void *msg)
>>>>> +{
>>>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>>>> +}
>>>>> +
>>>>> +static int need_update(struct temp_data *temp)
>>>>> +{
>>>>> +    if (temp->valid &&
>>>>> +        time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>>>>> +        return 0;
>>>>> +
>>>>> +    return 1;
>>>>> +}
>>>>> +
>>>>> +static void mark_updated(struct temp_data *temp)
>>>>> +{
>>>>> +    temp->valid = true;
>>>>> +    temp->last_updated = jiffies;
>>>>> +}
>>>>
>>>> It might make sense to provide the duplicate functions in a core file.
>>>>
>>>
>>> It is a temperature-monitoring-specific function and it touches module-specific variables. Do you really think that this non-generic function should be moved to PECI core?
>>>
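If the two helpers were shared, a common version could look roughly like the sketch below. It is a userspace illustration only: the `now` tick argument, `TICKS_PER_SEC` and `tick_before()` stand in for the kernel's `jiffies`, `HZ` and `time_before()`, and none of the names come from the PECI core. Passing the current tick in as a parameter keeps the helpers free of driver-specific state.

```c
#include <stdbool.h>
#include <stdint.h>

#define TICKS_PER_SEC        100u           /* stand-in for HZ */
#define UPDATE_INTERVAL_MIN  TICKS_PER_SEC  /* refresh at most once a second */

struct temp_data {
	bool valid;
	int32_t value;
	uint32_t last_updated;
};

/* Wraparound-safe "a is earlier than b", the same idea as time_before(). */
static bool tick_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* A cached reading is reused only while it is valid and still fresh. */
static bool temp_need_update(const struct temp_data *temp, uint32_t now)
{
	if (temp->valid &&
	    tick_before(now, temp->last_updated + UPDATE_INTERVAL_MIN))
		return false;

	return true;
}

/* Record that the cached reading was refreshed at tick "now". */
static void temp_mark_updated(struct temp_data *temp, uint32_t now)
{
	temp->valid = true;
	temp->last_updated = now;
}
```
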
>>>>> +
>>>>> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
>>>>> +{
>>>>> +    int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
>>>>> +    int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    int rc;
>>>>> +
>>>>> +    if (!need_update(&priv->temp[dimm_no]))
>>>>> +        return 0;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>>>> +    msg.param = chan_rank;
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
>>>>> +
>>>>> +    mark_updated(&priv->temp[dimm_no]);
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
>>>>> +{
>>>>> +    int dimm_nums_max = priv->gen_info->chan_rank_max *
>>>>> +                priv->gen_info->dimm_idx_max;
>>>>> +    int idx, found = 0;
>>>>> +
>>>>> +    for (idx = 0; idx < dimm_nums_max; idx++) {
>>>>> +        if (priv->dimm_mask & BIT(idx)) {
>>>>> +            if (channel == found)
>>>>> +                break;
>>>>> +
>>>>> +            found++;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return idx;
>>>>> +}
>>>>
>>>> This again looks like duplicate code.
>>>>
>>>
>>> find_dimm_number()? I'm sure it isn't.
>>>
>>>>> +
>>>>> +static int dimmtemp_read_string(struct device *dev,
>>>>> +                enum hwmon_sensor_types type,
>>>>> +                u32 attr, int channel, const char **str)
>>>>> +{
>>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>>>> +    int dimm_no, chan_rank, dimm_idx;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_label:
>>>>> +        dimm_no = find_dimm_number(priv, channel);
>>>>> +        chan_rank = dimm_no / dimm_idx_max;
>>>>> +        dimm_idx = dimm_no % dimm_idx_max;
>>>>> +        *str = dimmtemp_label[chan_rank][dimm_idx];
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
>>>>> +             u32 attr, int channel, long *val)
>>>>> +{
>>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>>>> +    int dimm_no = find_dimm_number(priv, channel);
>>>>> +    int rc;
>>>>> +
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_input:
>>>>> +        rc = get_dimm_temp(priv, dimm_no);
>>>>> +        if (rc)
>>>>> +            return rc;
>>>>> +
>>>>> +        *val = priv->temp[dimm_no].value;
>>>>> +        return 0;
>>>>> +    default:
>>>>> +        return -EOPNOTSUPP;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static umode_t dimmtemp_is_visible(const void *data,
>>>>> +                   enum hwmon_sensor_types type,
>>>>> +                   u32 attr, int channel)
>>>>> +{
>>>>> +    switch (attr) {
>>>>> +    case hwmon_temp_label:
>>>>> +    case hwmon_temp_input:
>>>>> +        return 0444;
>>>>> +    default:
>>>>> +        return 0;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static const struct hwmon_ops dimmtemp_ops = {
>>>>> +    .is_visible = dimmtemp_is_visible,
>>>>> +    .read_string = dimmtemp_read_string,
>>>>> +    .read = dimmtemp_read,
>>>>> +};
>>>>> +
>>>>> +static int check_populated_dimms(struct peci_dimmtemp *priv)
>>>>> +{
>>>>> +    u32 chan_rank_max = priv->gen_info->chan_rank_max;
>>>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    int chan_rank, dimm_idx;
>>>>> +    int rc, channels = 0;
>>>>> +
>>>>> +    for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
>>>>> +        msg.addr = priv->addr;
>>>>> +        msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>>>> +        msg.param = chan_rank;
>>>>> +        msg.rx_len = 4;
>>>>> +
>>>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +        if (rc) {
>>>>> +            priv->dimm_mask = 0;
>>>>> +            return rc;
>>>>> +        }
>>>>> +
>>>>> +        for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
>>>>> +            if (msg.pkg_config[dimm_idx]) {
>>>>> +                priv->dimm_mask |= BIT(chan_rank *
>>>>> +                               chan_rank_max +
>>>>> +                               dimm_idx);
>>>>> +                channels++;
>>>>> +            }
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    if (!priv->dimm_mask)
>>>>> +        return -EAGAIN;
>>>>> +
>>>>> +    priv->channels = channels;
>>>>> +
>>>>> +    dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
>>>>> +    return 0;
>>>>> +}
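One thing worth checking in the function above: the mask bit is computed as chan_rank * chan_rank_max + dimm_idx, while get_dimm_temp() and dimmtemp_read_string() split dimm_no using dimm_idx_max. With the Haswell limits (8 channel ranks, 3 DIMMs each) the first form gives 7 * 8 + 2 = 58 for the last slot, which does not fit in a 32-bit mask. The sketch below shows an encode/decode pair that round-trips, assuming the intent is one bit per (channel rank, DIMM index) slot. The struct and helper names are hypothetical.

```c
#include <stdint.h>

/* One physical slot: which channel rank, and which DIMM on that channel. */
struct dimm_slot {
	unsigned int chan_rank;
	unsigned int dimm_idx;
};

/* Encode a slot as a flat DIMM number, dimm_idx_max slots per channel rank. */
static unsigned int dimm_slot_to_no(struct dimm_slot slot,
				    unsigned int dimm_idx_max)
{
	return slot.chan_rank * dimm_idx_max + slot.dimm_idx;
}

/* Decode a flat DIMM number with the same stride used for encoding. */
static struct dimm_slot dimm_no_to_slot(unsigned int dimm_no,
					unsigned int dimm_idx_max)
{
	struct dimm_slot slot = {
		.chan_rank = dimm_no / dimm_idx_max,
		.dimm_idx = dimm_no % dimm_idx_max,
	};

	return slot;
}
```

With a stride of 3, the last Haswell slot (channel rank 7, DIMM 2) maps to bit 23, so all 24 slots fit in a u32 mask.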
>>>>> +
>>>>> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
>>>>> +{
>>>>> +    struct device *hwmon_dev;
>>>>> +    int rc, i;
>>>>> +
>>>>> +    rc = check_populated_dimms(priv);
>>>>> +    if (!rc) {
>>>>
>>>> Please handle error cases first.
>>>>
>>>
>>> Sure, I'll rewrite it.
>>>
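As a rough illustration of the "error cases first" shape, here is a userspace sketch of the same control flow with early returns. Everything in it is a stand-in: `scan_result` replaces the call to check_populated_dimms(), `retry_queued` replaces queue_delayed_work(), and `registered` replaces the hwmon registration. The retry limit of 60 is the one from the quoted defines.

```c
#include <errno.h>
#include <stdbool.h>

#define RETRY_MAX 60  /* same limit as DIMM_MASK_CHECK_RETRY_MAX */

/* Hypothetical state, just enough to exercise the control flow. */
struct sketch_state {
	int scan_result;    /* what the DIMM scan would return */
	int retry_count;
	bool retry_queued;  /* stands in for queue_delayed_work() */
	bool registered;    /* stands in for hwmon registration */
};

static int create_info_sketch(struct sketch_state *s)
{
	int rc = s->scan_result;

	/* No DIMMs visible yet: retry later, or give up after too many tries. */
	if (rc == -EAGAIN) {
		if (s->retry_count >= RETRY_MAX)
			return -ETIMEDOUT;

		s->retry_count++;
		s->retry_queued = true;
		return rc;
	}

	/* Any other failure is passed straight back to the caller. */
	if (rc)
		return rc;

	/* Success path stays at the outermost indentation level. */
	s->registered = true;
	return 0;
}
```
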
>>>>> +        for (i = 0; i < priv->channels; i++)
>>>>> +            priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
>>>>> +
>>>>> +        priv->chip.ops = &dimmtemp_ops;
>>>>> +        priv->chip.info = priv->info;
>>>>> +
>>>>> +        priv->info[0] = &priv->temp_info;
>>>>> +
>>>>> +        priv->temp_info.type = hwmon_temp;
>>>>> +        priv->temp_info.config = priv->temp_config;
>>>>> +
>>>>> +        hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>>>> +                                 priv->name,
>>>>> +                                 priv,
>>>>> +                                 &priv->chip,
>>>>> +                                 NULL);
>>>>> +        rc = PTR_ERR_OR_ZERO(hwmon_dev);
>>>>> +        if (!rc)
>>>>> +            dev_dbg(priv->dev, "%s: sensor '%s'\n",
>>>>> +                dev_name(hwmon_dev), priv->name);
>>>>> +    } else if (rc == -EAGAIN) {
>>>>> +        if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
>>>>> +            queue_delayed_work(priv->work_queue,
>>>>> +                       &priv->work_handler,
>>>>> +                       DIMM_MASK_CHECK_DELAY_JIFFIES);
>>>>> +            priv->retry_count++;
>>>>> +            dev_dbg(priv->dev,
>>>>> +                "Deferred DIMM temp info creation\n");
>>>>> +        } else {
>>>>> +            rc = -ETIMEDOUT;
>>>>> +            dev_err(priv->dev,
>>>>> +                "Timeout retrying DIMM temp info creation\n");
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return rc;
>>>>> +}
>>>>> +
>>>>> +static void create_dimm_temp_info_delayed(struct work_struct *work)
>>>>> +{
>>>>> +    struct delayed_work *dwork = to_delayed_work(work);
>>>>> +    struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
>>>>> +                          work_handler);
>>>>> +    int rc;
>>>>> +
>>>>> +    rc = create_dimm_temp_info(priv);
>>>>> +    if (rc && rc != -EAGAIN)
>>>>> +        dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
>>>>> +}
>>>>> +
>>>>> +static int check_cpu_id(struct peci_dimmtemp *priv)
>>>>> +{
>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>> +    u32 cpu_id;
>>>>> +    int i, rc;
>>>>> +
>>>>> +    msg.addr = priv->addr;
>>>>> +    msg.index = MBX_INDEX_CPU_ID;
>>>>> +    msg.param = PKG_ID_CPU_ID;
>>>>> +    msg.rx_len = 4;
>>>>> +
>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>> +    if (rc)
>>>>> +        return rc;
>>>>> +
>>>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>>>> +
>>>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    if (!priv->gen_info)
>>>>> +        return -ENODEV;
>>>>> +
>>>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>>>> +    return 0;
>>>>> +}
>>>>
>>>> More duplicate code.
>>>>
>>>
>>> Okay. In case of check_cpu_id(), it could be used as a generic PECI function. I'll move it into PECI core.
>>>
>>>>> +
>>>>> +static int peci_dimmtemp_probe(struct peci_client *client)
>>>>> +{
>>>>> +    struct device *dev = &client->dev;
>>>>> +    struct peci_dimmtemp *priv;
>>>>> +    int rc;
>>>>> +
>>>>> +    if ((client->adapter->cmd_mask &
>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>>>
>>>> One set of ( ) is unnecessary on each side of the expression.
>>>>
>>>
>>> '&' has a precedence over '!=' but '|' doesn't. I'll rewrite it to:
>>>
>>
>> Actually, that is wrong. You refer to address-of. Bit operations do have lower
>> precedence than comparisons. I stand corrected.
>>
>>>      if (client->adapter->cmd_mask &
>>>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)) !=
>>>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)))
>>>
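To make the precedence point concrete: in C, != binds tighter than both & and |, so dropping the parentheses around the mask test changes what is compared. The userspace sketch below spells out the grouping the compiler would apply to the unparenthesized form. The command numbers and helper names are hypothetical and only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Hypothetical command numbers, not the real PECI values. */
enum { SKETCH_CMD_GET_TEMP = 1, SKETCH_CMD_RD_PKG_CFG = 3 };

#define REQUIRED_CMDS (BIT(SKETCH_CMD_GET_TEMP) | BIT(SKETCH_CMD_RD_PKG_CFG))

/* Intended test: true when at least one required command is missing. */
static bool missing_cmds(uint32_t cmd_mask)
{
	return (cmd_mask & REQUIRED_CMDS) != REQUIRED_CMDS;
}

/*
 * How "cmd_mask & REQUIRED_CMDS != REQUIRED_CMDS" is grouped without the
 * extra parentheses: the comparison runs first and yields 0, so the whole
 * expression is 0 for every cmd_mask.
 */
static bool missing_cmds_unparenthesized(uint32_t cmd_mask)
{
	return cmd_mask & (REQUIRED_CMDS != REQUIRED_CMDS);
}
```

So the outer parentheses around the & expression in the original patch are required, and the second form would never reject a client.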
>>>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>>>> +        return -EINVAL;
>>>>
>>>> Why is this "invalid", and why does it warrant an error message ?
>>>>
>>>
>>> Should I use -EPERM? Any suggestion?
>>>
>>
>> Is it an _error_ if the CPU does not support this functionality ?
>>
> 
> Actually, it returns from this probe() function without creating any hwmon info, so I intended to handle this case as an error. Am I wrong?
> 

If the functionality or HW supported by the driver isn't available, it is customary
to return -ENODEV and no error message. Otherwise the kernel log would drown in
"not supported" error messages. I don't see where it would add any value to handle
this driver differently.

EINVAL	Invalid argument
EPERM	Operation not permitted

You'll have to work hard to convince me that any of those makes sense, and that

ENODEV	No such device

doesn't. More specifically, if EINVAL makes sense, the caller did something wrong,
meaning there is a problem in the infrastructure which should get fixed.
The same is true for EPERM.
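
In other words, something along the lines of the sketch below. It is a userspace illustration, not driver code: sketch_probe(), the command numbers and the mask macro are hypothetical. The only point is the shape: a missing capability is a quiet -ENODEV, with no message in the log.

```c
#include <errno.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Hypothetical command numbers, not the real PECI values. */
enum { SKETCH_CMD_GET_TEMP = 1, SKETCH_CMD_RD_PKG_CFG = 3 };

#define SKETCH_REQUIRED_CMDS \
	(BIT(SKETCH_CMD_GET_TEMP) | BIT(SKETCH_CMD_RD_PKG_CFG))

/*
 * Decline to bind when the client lacks the commands this driver needs.
 * Nothing is logged: "not my device" is a normal outcome of probing.
 */
static int sketch_probe(uint32_t cmd_mask)
{
	if ((cmd_mask & SKETCH_REQUIRED_CMDS) != SKETCH_REQUIRED_CMDS)
		return -ENODEV;

	return 0;
}
```
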

>>>>> +    }
>>>>> +
>>>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>>>> +    if (!priv)
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    dev_set_drvdata(dev, priv);
>>>>> +    priv->client = client;
>>>>> +    priv->dev = dev;
>>>>> +    priv->addr = client->addr;
>>>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>>>
>>>> Is priv->addr guaranteed to be >= PECI_BASE_ADDR ?
>>>
>>> Client address range validation will be done in peci_check_addr_validity() in PECI core before probing a device driver.
>>>
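For reference, the kind of range check being discussed could be sketched as below. This is a userspace illustration with hypothetical names. The base address of 0x30 and the limit of 8 client addresses are assumptions made for the example, not values taken from this patch. Without such a check, an address below the base would make the unsigned cpu_no wrap to a very large value.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PECI_BASE_ADDR   0x30  /* assumed first client address */
#define SKETCH_PECI_CLIENT_MAX  8     /* assumed number of client addresses */

/* True when addr falls inside the assumed client address window. */
static bool sketch_addr_valid(uint8_t addr)
{
	return addr >= SKETCH_PECI_BASE_ADDR &&
	       addr < SKETCH_PECI_BASE_ADDR + SKETCH_PECI_CLIENT_MAX;
}

/* CPU number derived from the address, or -1 when the address is invalid. */
static int sketch_cpu_no(uint8_t addr)
{
	if (!sketch_addr_valid(addr))
		return -1;

	return addr - SKETCH_PECI_BASE_ADDR;
}
```
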
>>>>> +
>>>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
>>>>> +         priv->cpu_no);
>>>>> +
>>>>> +    rc = check_cpu_id(priv);
>>>>> +    if (rc) {
>>>>> +        dev_err(dev, "Client CPU is not supported\n");
>>>>
>>>> Or the peci command failed.
>>>>
>>>
>>> I'll remove the error message and will add proper handling code to PECI core for each error type.
>>>
>>>>> +        return rc;
>>>>> +    }
>>>>> +
>>>>> +    priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
>>>>> +    if (!priv->work_queue)
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
>>>>> +
>>>>> +    rc = create_dimm_temp_info(priv);
>>>>> +    if (rc && rc != -EAGAIN) {
>>>>> +        dev_err(dev, "Failed to create DIMM temp info\n");
>>>>> +        goto err_free_wq;
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>> +
>>>>> +err_free_wq:
>>>>> +    destroy_workqueue(priv->work_queue);
>>>>> +    return rc;
>>>>> +}
>>>>> +
>>>>> +static int peci_dimmtemp_remove(struct peci_client *client)
>>>>> +{
>>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
>>>>> +
>>>>> +    cancel_delayed_work(&priv->work_handler);
>>>>
>>>> cancel_delayed_work_sync() ?
>>>>
>>>
>>> Yes, it would be safer. Will fix it.
>>>
>>>>> +    destroy_workqueue(priv->work_queue);
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static const struct of_device_id peci_dimmtemp_of_table[] = {
>>>>> +    { .compatible = "intel,peci-dimmtemp" },
>>>>> +    { }
>>>>> +};
>>>>> +MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
>>>>> +
>>>>> +static struct peci_driver peci_dimmtemp_driver = {
>>>>> +    .probe  = peci_dimmtemp_probe,
>>>>> +    .remove = peci_dimmtemp_remove,
>>>>> +    .driver = {
>>>>> +        .name           = "peci-dimmtemp",
>>>>> +        .of_match_table = of_match_ptr(peci_dimmtemp_of_table),
>>>>> +    },
>>>>> +};
>>>>> +module_peci_driver(peci_dimmtemp_driver);
>>>>> +
>>>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>>>> +MODULE_DESCRIPTION("PECI dimmtemp driver");
>>>>> +MODULE_LICENSE("GPL v2");
>>>>> -- 
>>>>> 2.16.2
>>>>>
>>
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-12  3:40           ` Guenter Roeck
@ 2018-04-12 17:09             ` Jae Hyun Yoo
  2018-04-12 17:37               ` Guenter Roeck
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-12 17:09 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On 4/11/2018 8:40 PM, Guenter Roeck wrote:
> On 04/11/2018 07:51 PM, Jae Hyun Yoo wrote:
>> On 4/11/2018 5:34 PM, Guenter Roeck wrote:
>>> On 04/11/2018 02:59 PM, Jae Hyun Yoo wrote:
>>>> Hi Guenter,
>>>>
>>>> Thanks a lot for sharing your time. Please see my inline answers.
>>>>
>>>> On 4/10/2018 3:28 PM, Guenter Roeck wrote:
>>>>> On Tue, Apr 10, 2018 at 11:32:11AM -0700, Jae Hyun Yoo wrote:
>>>>>> This commit adds PECI cputemp and dimmtemp hwmon drivers.
>>>>>>
>>>>>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>>>>>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>>>>>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>>>>>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>>>>>> Cc: Alan Cox <alan@linux.intel.com>
>>>>>> Cc: Andrew Jeffery <andrew@aj.id.au>
>>>>>> Cc: Andrew Lunn <andrew@lunn.ch>
>>>>>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>>>>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>>>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>>>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>>>>>> Cc: Greg KH <gregkh@linuxfoundation.org>
>>>>>> Cc: Guenter Roeck <linux@roeck-us.net>
>>>>>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>>>>>> Cc: Jean Delvare <jdelvare@suse.com>
>>>>>> Cc: Joel Stanley <joel@jms.id.au>
>>>>>> Cc: Julia Cartwright <juliac@eso.teric.us>
>>>>>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>>>>>> Cc: Milton Miller II <miltonm@us.ibm.com>
>>>>>> Cc: Pavel Machek <pavel@ucw.cz>
>>>>>> Cc: Randy Dunlap <rdunlap@infradead.org>
>>>>>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>>>>>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>>>>>> ---
>>>>>>   drivers/hwmon/Kconfig         |  28 ++
>>>>>>   drivers/hwmon/Makefile        |   2 +
>>>>>>   drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>   drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
>>>>>>   4 files changed, 1245 insertions(+)
>>>>>>   create mode 100644 drivers/hwmon/peci-cputemp.c
>>>>>>   create mode 100644 drivers/hwmon/peci-dimmtemp.c
>>>>>>
>>>>>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>>>>>> index f249a4428458..c52f610f81d0 100644
>>>>>> --- a/drivers/hwmon/Kconfig
>>>>>> +++ b/drivers/hwmon/Kconfig
>>>>>> @@ -1259,6 +1259,34 @@ config SENSORS_NCT7904
>>>>>>         This driver can also be built as a module.  If so, the module
>>>>>>         will be called nct7904.
>>>>>> +config SENSORS_PECI_CPUTEMP
>>>>>> +    tristate "PECI CPU temperature monitoring support"
>>>>>> +    depends on OF
>>>>>> +    depends on PECI
>>>>>> +    help
>>>>>> +      If you say yes here you get support for the generic Intel PECI
>>>>>> +      cputemp driver which provides Digital Thermal Sensor (DTS) thermal
>>>>>> +      readings of the CPU package and CPU cores that are accessible using
>>>>>> +      the PECI Client Command Suite via the processor PECI client.
>>>>>> +      Check Documentation/hwmon/peci-cputemp for details.
>>>>>> +
>>>>>> +      This driver can also be built as a module.  If so, the module
>>>>>> +      will be called peci-cputemp.
>>>>>> +
>>>>>> +config SENSORS_PECI_DIMMTEMP
>>>>>> +    tristate "PECI DIMM temperature monitoring support"
>>>>>> +    depends on OF
>>>>>> +    depends on PECI
>>>>>> +    help
>>>>>> +      If you say yes here you get support for the generic Intel PECI hwmon
>>>>>> +      driver which provides Digital Thermal Sensor (DTS) thermal readings of
>>>>>> +      DIMM components that are accessible using the PECI Client Command
>>>>>> +      Suite via the processor PECI client.
>>>>>> +      Check Documentation/hwmon/peci-dimmtemp for details.
>>>>>> +
>>>>>> +      This driver can also be built as a module.  If so, the module
>>>>>> +      will be called peci-dimmtemp.
>>>>>> +
>>>>>>   config SENSORS_NSA320
>>>>>>       tristate "ZyXEL NSA320 and compatible fan speed and temperature sensors"
>>>>>>       depends on GPIOLIB && OF
>>>>>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>>>>>> index e7d52a36e6c4..48d9598fcd3a 100644
>>>>>> --- a/drivers/hwmon/Makefile
>>>>>> +++ b/drivers/hwmon/Makefile
>>>>>> @@ -136,6 +136,8 @@ obj-$(CONFIG_SENSORS_NCT7802)    += nct7802.o
>>>>>>   obj-$(CONFIG_SENSORS_NCT7904)    += nct7904.o
>>>>>>   obj-$(CONFIG_SENSORS_NSA320)    += nsa320-hwmon.o
>>>>>>   obj-$(CONFIG_SENSORS_NTC_THERMISTOR)    += ntc_thermistor.o
>>>>>> +obj-$(CONFIG_SENSORS_PECI_CPUTEMP)    += peci-cputemp.o
>>>>>> +obj-$(CONFIG_SENSORS_PECI_DIMMTEMP)    += peci-dimmtemp.o
>>>>>>   obj-$(CONFIG_SENSORS_PC87360)    += pc87360.o
>>>>>>   obj-$(CONFIG_SENSORS_PC87427)    += pc87427.o
>>>>>>   obj-$(CONFIG_SENSORS_PCF8591)    += pcf8591.o
>>>>>> diff --git a/drivers/hwmon/peci-cputemp.c b/drivers/hwmon/peci-cputemp.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..f0bc92687512
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/hwmon/peci-cputemp.c
>>>>>> @@ -0,0 +1,783 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>> +// Copyright (c) 2018 Intel Corporation
>>>>>> +
>>>>>> +#include <linux/delay.h>
>>>>>> +#include <linux/hwmon.h>
>>>>>> +#include <linux/hwmon-sysfs.h>
>>>>>
>>>>> Is this include needed ?
>>>>>
>>>>
>>>> No it isn't. Will drop the line.
>>>>
>>>>>> +#include <linux/jiffies.h>
>>>>>> +#include <linux/module.h>
>>>>>> +#include <linux/of_device.h>
>>>>>> +#include <linux/peci.h>
>>>>>> +
>>>>>> +#define TEMP_TYPE_PECI        6  /* Sensor type 6: Intel PECI */
>>>>>> +
>>>>>> +#define CORE_MAX_ON_HSX       18 /* Max number of cores on Haswell */
>>>>>> +#define CORE_MAX_ON_BDX       24 /* Max number of cores on Broadwell */
>>>>>> +#define CORE_MAX_ON_SKX       28 /* Max number of cores on Skylake */
>>>>>> +
>>>>>> +#define DEFAULT_CHANNEL_NUMS  5
>>>>>> +#define CORETEMP_CHANNEL_NUMS CORE_MAX_ON_SKX
>>>>>> +#define CPUTEMP_CHANNEL_NUMS  (DEFAULT_CHANNEL_NUMS + CORETEMP_CHANNEL_NUMS)
>>>>>> +
>>>>>> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
>>>>>> +
>>>>>> +#define UPDATE_INTERVAL_MIN   HZ
>>>>>> +
>>>>>> +enum cpu_gens {
>>>>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>>>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>>>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>>>>> +    CPU_GEN_MAX
>>>>>> +};
>>>>>> +
>>>>>> +struct cpu_gen_info {
>>>>>> +    u32 type;
>>>>>> +    u32 cpu_id;
>>>>>> +    u32 core_max;
>>>>>> +};
>>>>>> +
>>>>>> +struct temp_data {
>>>>>> +    bool valid;
>>>>>> +    s32  value;
>>>>>> +    unsigned long last_updated;
>>>>>> +};
>>>>>> +
>>>>>> +struct temp_group {
>>>>>> +    struct temp_data die;
>>>>>> +    struct temp_data dts_margin;
>>>>>> +    struct temp_data tcontrol;
>>>>>> +    struct temp_data tthrottle;
>>>>>> +    struct temp_data tjmax;
>>>>>> +    struct temp_data core[CORETEMP_CHANNEL_NUMS];
>>>>>> +};
>>>>>> +
>>>>>> +struct peci_cputemp {
>>>>>> +    struct peci_client *client;
>>>>>> +    struct device *dev;
>>>>>> +    char name[PECI_NAME_SIZE];
>>>>>> +    struct temp_group temp;
>>>>>> +    u8 addr;
>>>>>> +    uint cpu_no;
>>>>>> +    const struct cpu_gen_info *gen_info;
>>>>>> +    u32 core_mask;
>>>>>> +    u32 temp_config[CPUTEMP_CHANNEL_NUMS + 1];
>>>>>> +    uint config_idx;
>>>>>> +    struct hwmon_channel_info temp_info;
>>>>>> +    const struct hwmon_channel_info *info[2];
>>>>>> +    struct hwmon_chip_info chip;
>>>>>> +};
>>>>>> +
>>>>>> +enum cputemp_channels {
>>>>>> +    channel_die,
>>>>>> +    channel_dts_mrgn,
>>>>>> +    channel_tcontrol,
>>>>>> +    channel_tthrottle,
>>>>>> +    channel_tjmax,
>>>>>> +    channel_core,
>>>>>> +};
>>>>>> +
>>>>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>>>>> +    { .type = CPU_GEN_HSX,
>>>>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>>>>> +      .core_max = CORE_MAX_ON_HSX },
>>>>>> +    { .type = CPU_GEN_BRX,
>>>>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>>>>> +      .core_max = CORE_MAX_ON_BDX },
>>>>>> +    { .type = CPU_GEN_SKX,
>>>>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>>>>> +      .core_max = CORE_MAX_ON_SKX },
>>>>>> +};
>>>>>> +
>>>>>> +static const u32 config_table[DEFAULT_CHANNEL_NUMS + 1] = {
>>>>>> +    /* Die temperature */
>>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>>>>> +    HWMON_T_CRIT_HYST,
>>>>>> +
>>>>>> +    /* DTS margin temperature */
>>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_LCRIT,
>>>>>> +
>>>>>> +    /* Tcontrol temperature */
>>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_CRIT,
>>>>>> +
>>>>>> +    /* Tthrottle temperature */
>>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>>>>> +
>>>>>> +    /* Tjmax temperature */
>>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT,
>>>>>> +
>>>>>> +    /* Core temperature - for all core channels */
>>>>>> +    HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
>>>>>> +    HWMON_T_CRIT_HYST,
>>>>>> +};
>>>>>> +
>>>>>> +static const char *cputemp_label[CPUTEMP_CHANNEL_NUMS] = {
>>>>>> +    "Die",
>>>>>> +    "DTS margin",
>>>>>> +    "Tcontrol",
>>>>>> +    "Tthrottle",
>>>>>> +    "Tjmax",
>>>>>> +    "Core 0", "Core 1", "Core 2", "Core 3",
>>>>>> +    "Core 4", "Core 5", "Core 6", "Core 7",
>>>>>> +    "Core 8", "Core 9", "Core 10", "Core 11",
>>>>>> +    "Core 12", "Core 13", "Core 14", "Core 15",
>>>>>> +    "Core 16", "Core 17", "Core 18", "Core 19",
>>>>>> +    "Core 20", "Core 21", "Core 22", "Core 23",
>>>>>> +};
>>>>>> +
>>>>>> +static int send_peci_cmd(struct peci_cputemp *priv,
>>>>>> +             enum peci_cmd cmd,
>>>>>> +             void *msg)
>>>>>> +{
>>>>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>>>>> +}
>>>>>> +
>>>>>> +static int need_update(struct temp_data *temp)
>>>>>
>>>>> Please use bool.
>>>>>
>>>>
>>>> Okay. I'll use bool instead of int.
>>>>
>>>>>> +{
>>>>>> +    if (temp->valid &&
>>>>>> +        time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    return 1;
>>>>>> +}
>>>>>> +
>>>>>> +static void mark_updated(struct temp_data *temp)
>>>>>> +{
>>>>>> +    temp->valid = true;
>>>>>> +    temp->last_updated = jiffies;
>>>>>> +}
>>>>>> +
>>>>>> +static s32 ten_dot_six_to_millidegree(s32 val)
>>>>>> +{
>>>>>> +    return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
>>>>>> +}
>>>>>> +
>>>>>> +static int get_tjmax(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!priv->temp.tjmax.valid) {
>>>>>> +        msg.addr = priv->addr;
>>>>>> +        msg.index = MBX_INDEX_TEMP_TARGET;
>>>>>> +        msg.param = 0;
>>>>>> +        msg.rx_len = 4;
>>>>>> +
>>>>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        priv->temp.tjmax.value = (s32)msg.pkg_config[2] * 1000;
>>>>>> +        priv->temp.tjmax.valid = true;
>>>>>> +    }
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int get_tcontrol(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    s32 tcontrol_margin;
>>>>>> +    s32 tthrottle_offset;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!need_update(&priv->temp.tcontrol))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    rc = get_tjmax(priv);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>>>>> +    msg.param = 0;
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    tcontrol_margin = msg.pkg_config[1];
>>>>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>>>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>>>>>> +
>>>>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>>>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>>>>>> +
>>>>>> +    mark_updated(&priv->temp.tcontrol);
>>>>>> +    mark_updated(&priv->temp.tthrottle);
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int get_tthrottle(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    s32 tcontrol_margin;
>>>>>> +    s32 tthrottle_offset;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!need_update(&priv->temp.tthrottle))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    rc = get_tjmax(priv);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.index = MBX_INDEX_TEMP_TARGET;
>>>>>> +    msg.param = 0;
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    tthrottle_offset = (msg.pkg_config[3] & 0x2f) * 1000;
>>>>>> +    priv->temp.tthrottle.value = priv->temp.tjmax.value - tthrottle_offset;
>>>>>> +
>>>>>> +    tcontrol_margin = msg.pkg_config[1];
>>>>>> +    tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
>>>>>> +    priv->temp.tcontrol.value = priv->temp.tjmax.value - tcontrol_margin;
>>>>>> +
>>>>>> +    mark_updated(&priv->temp.tthrottle);
>>>>>> +    mark_updated(&priv->temp.tcontrol);
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>
>>>>> I am quite completely missing how the two functions above are 
>>>>> different.
>>>>>
>>>>
>>>> The two functions above differ only slightly. Both use the same PECI
>>>> command, which returns the Tthrottle and Tcontrol values together in
>>>> the pkg_config array, so each function updates both values to avoid
>>>> duplicate PECI transactions. Combining them into a single
>>>> get_tthrottle_and_tcontrol() would probably look better. I'll rewrite it.
>>>>
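
For illustration only, a combined helper could decode both values from a single RdPkgConfig response along the lines below. This is a userspace sketch using stdint types rather than kernel types; decode_temp_target() and struct temp_target are hypothetical names, and the bit arithmetic (including the 0x2f mask) is copied from the patch as posted, not checked against a datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two values derived from one response. */
struct temp_target {
	int32_t tcontrol;  /* millidegrees C */
	int32_t tthrottle; /* millidegrees C */
};

/*
 * Decode Tcontrol and Tthrottle from the 4 response bytes in one pass,
 * so that a single PECI transaction can refresh both cached values.
 * tjmax is given in millidegrees C.
 */
static void decode_temp_target(const uint8_t pkg_config[4], int32_t tjmax,
			       struct temp_target *out)
{
	int32_t tcontrol_margin, tthrottle_offset;

	assert(pkg_config != NULL && out != NULL);

	/* Byte 1 is sign-extended from 8 bits, as in the posted patch. */
	tcontrol_margin = pkg_config[1];
	tcontrol_margin = ((tcontrol_margin ^ 0x80) - 0x80) * 1000;
	out->tcontrol = tjmax - tcontrol_margin;

	/* Byte 3 holds the offset; the mask is taken verbatim from the patch. */
	tthrottle_offset = (pkg_config[3] & 0x2f) * 1000;
	out->tthrottle = tjmax - tthrottle_offset;
}
```

With one decoder, get_tcontrol() and get_tthrottle() would reduce to a single caller that checks need_update(), issues the command once, and marks both entries updated.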
>>>>>> +
>>>>>> +static int get_die_temp(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    struct peci_get_temp_msg msg;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!need_update(&priv->temp.die))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    rc = get_tjmax(priv);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_GET_TEMP, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    priv->temp.die.value = priv->temp.tjmax.value +
>>>>>> +                   ((s32)msg.temp_raw * 1000 / 64);
>>>>>> +
>>>>>> +    mark_updated(&priv->temp.die);
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int get_dts_margin(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    s32 dts_margin;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!need_update(&priv->temp.dts_margin))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.index = MBX_INDEX_DTS_MARGIN;
>>>>>> +    msg.param = 0;
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>>>>> +
>>>>>> +    /**
>>>>>> +     * Processors return a value of DTS reading in 10.6 format
>>>>>> +     * (10 bits signed decimal, 6 bits fractional).
>>>>>> +     * Error codes:
>>>>>> +     *   0x8000: General sensor error
>>>>>> +     *   0x8001: Reserved
>>>>>> +     *   0x8002: Underflow on reading value
>>>>>> +     *   0x8003-0x81ff: Reserved
>>>>>> +     */
>>>>>> +    if (dts_margin >= 0x8000 && dts_margin <= 0x81ff)
>>>>>> +        return -EIO;
>>>>>> +
>>>>>> +    dts_margin = ten_dot_six_to_millidegree(dts_margin);
>>>>>> +
>>>>>> +    priv->temp.dts_margin.value = dts_margin;
>>>>>> +
>>>>>> +    mark_updated(&priv->temp.dts_margin);
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int get_core_temp(struct peci_cputemp *priv, int core_index)
>>>>>> +{
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    s32 core_dts_margin;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!need_update(&priv->temp.core[core_index]))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    rc = get_tjmax(priv);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.index = MBX_INDEX_PER_CORE_DTS_TEMP;
>>>>>> +    msg.param = core_index;
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    core_dts_margin = (msg.pkg_config[1] << 8) | msg.pkg_config[0];
>>>>>> +
>>>>>> +    /**
>>>>>> +     * Processors return a value of the core DTS reading in 10.6 format
>>>>>> +     * (10 bits signed decimal, 6 bits fractional).
>>>>>> +     * Error codes:
>>>>>> +     *   0x8000: General sensor error
>>>>>> +     *   0x8001: Reserved
>>>>>> +     *   0x8002: Underflow on reading value
>>>>>> +     *   0x8003-0x81ff: Reserved
>>>>>> +     */
>>>>>> +    if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>>>>>> +        return -EIO;
>>>>>> +
>>>>>> +    core_dts_margin = ten_dot_six_to_millidegree(core_dts_margin);
>>>>>> +
>>>>>> +    priv->temp.core[core_index].value = priv->temp.tjmax.value +
>>>>>> +                        core_dts_margin;
>>>>>> +
>>>>>> +    mark_updated(&priv->temp.core[core_index]);
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>
>>>>> There is a lot of duplication in those functions. Would it be possible
>>>>> to find common code and use functions for it instead of duplicating
>>>>> everything several times ?
>>>>>
>>>>
>>>> Are you pointing out this code?
>>>> /**
>>>>   * Processors return a value of the core DTS reading in 10.6 format
>>>>   * (10 bits signed decimal, 6 bits fractional).
>>>>   * Error codes:
>>>>   *   0x8000: General sensor error
>>>>   *   0x8001: Reserved
>>>>   *   0x8002: Underflow on reading value
>>>>   *   0x8003-0x81ff: Reserved
>>>>   */
>>>> if (core_dts_margin >= 0x8000 && core_dts_margin <= 0x81ff)
>>>>      return -EIO;
>>>>
>>>> Then I'll rewrite it as a function. If not, please point out the 
>>>> duplication.
>>>>
>>>
>>> There is lots of other duplication.
>>>
>>
>> Sorry but can you point out the duplication?
>>
> write a python script to do a semantic comparison.
> 

Okay. I'll try to simplify this code again.
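
As one possible shape for that simplification, get_dts_margin() and get_core_temp() both assemble a 16-bit value, reject the error-code range, and convert from 10.6 format, so those three steps could live in one helper. The sketch below is userspace C with stdint types; dts_raw_to_millidegree() is a hypothetical name, and only the arithmetic and the error range are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same conversion as in the patch: sign-extend 16 bits, scale 1/64 C to mC. */
static int32_t ten_dot_six_to_millidegree(int32_t val)
{
	return ((val ^ 0x8000) - 0x8000) * 1000 / 64;
}

/*
 * Hypothetical shared helper: build the little-endian 16-bit reading from
 * the first two response bytes, filter the documented error codes
 * (0x8000-0x81ff), and return the value in millidegrees C through *out.
 */
static int dts_raw_to_millidegree(const uint8_t *pkg_config, int32_t *out)
{
	int32_t raw;

	assert(pkg_config != NULL && out != NULL);

	raw = (pkg_config[1] << 8) | pkg_config[0];
	if (raw >= 0x8000 && raw <= 0x81ff)
		return -EIO;

	*out = ten_dot_six_to_millidegree(raw);
	return 0;
}
```

Each caller would then only differ in the mailbox index and parameter it sends and in whether it adds Tjmax to the result.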

>>>>>> +static int find_core_index(struct peci_cputemp *priv, int channel)
>>>>>> +{
>>>>>> +    int core_channel = channel - DEFAULT_CHANNEL_NUMS;
>>>>>> +    int idx, found = 0;
>>>>>> +
>>>>>> +    for (idx = 0; idx < priv->gen_info->core_max; idx++) {
>>>>>> +        if (priv->core_mask & BIT(idx)) {
>>>>>> +            if (core_channel == found)
>>>>>> +                break;
>>>>>> +
>>>>>> +            found++;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return idx;
>>>>>
>>>>> What if nothing is found ?
>>>>>
>>>>
>>>> The core temperature group is registered only when
>>>> check_resolved_cores() detects at least one core, so
>>>> find_core_index() can be called only when priv->core_mask has a
>>>> non-zero value. The 'nothing is found' case will not happen.
>>>>
>>> That doesn't guarantee a match. If what you are saying is correct 
>>> there should always be
>>> a well defined match of channel -> idx, and the search should be 
>>> unnecessary.
>>>
>>
>> The resolved core mask can contain disabled cores, and the channel
>> numbering must not have gaps, which is why this search function is
>> needed. A fixed, well-defined channel -> idx mapping would not always
>> hold.
>>
> Are you saying that each call to the function, with the same parameters,
> can return a different result ?
> 

No, the result will be consistent. priv->core_mask is read once in
check_resolved_cores() and does not change afterwards. I mean a case
like this: if core number 2 is unresolved out of 4 cores in total, the
idx order will be '0, 1, 3' but the channel order will be '5, 6, 7',
with no indexing gap.
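
To make that mapping concrete, here is a small userspace sketch. It takes the already-subtracted channel offset (channel - DEFAULT_CHANNEL_NUMS) as suggested earlier in this thread, and it returns a negative errno when no core matches. That error return is an assumption added for illustration; the posted find_core_index() returns core_max in that case.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Map a gap-free core channel offset (0, 1, 2, ...) to the index of the
 * corresponding set bit in the resolved core mask.
 * Returns the core index, or -ENOENT if the mask has too few set bits.
 */
static int find_core_index(uint32_t core_mask, unsigned int core_max,
			   int core_channel)
{
	unsigned int idx;
	int found = 0;

	for (idx = 0; idx < core_max && idx < 32; idx++) {
		if (core_mask & (UINT32_C(1) << idx)) {
			if (found == core_channel)
				return (int)idx;
			found++;
		}
	}

	return -ENOENT;
}
```

With core_mask = 0xb (cores 0, 1 and 3 resolved) and core_max = 4, offsets 0, 1 and 2 map to core indexes 0, 1 and 3, and offset 3 has no match.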

>>>>>> +}
>>>>>> +
>>>>>> +static int cputemp_read_string(struct device *dev,
>>>>>> +                   enum hwmon_sensor_types type,
>>>>>> +                   u32 attr, int channel, const char **str)
>>>>>> +{
>>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>>> +    int core_index;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_label:
>>>>>> +        if (channel < DEFAULT_CHANNEL_NUMS) {
>>>>>> +            *str = cputemp_label[channel];
>>>>>> +        } else {
>>>>>> +            core_index = find_core_index(priv, channel);
>>>>>
>>>>> FWIW, it might be better to pass channel - DEFAULT_CHANNEL_NUMS
>>>>> as parameter.
>>>>>
>>>>
>>>> cputemp_read_string() is mapped to the read_string member of the
>>>> hwmon_ops struct, so the hwmon subsystem passes the channel parameter
>>>> based on the registered channel order. Should I modify the hwmon
>>>> subsystem code?
>>>>
>>>
>>> Huh ? Changing
>>>      f(x) { y = x - const; }
>>> ...
>>>      f(x);
>>>
>>> to
>>>      f(y) { }
>>> ...
>>>      f(x - const);
>>>
>>> requires a hwmon core change ? Really ?
>>>
>>
>> Sorry for my misunderstanding. You are right. I'll change the 
>> parameter passing of find_core_index() from 'channel' to 'channel - 
>> DEFAULT_CHANNEL_NUMS'.
>>
>>>>> What if find_core_index() returns priv->gen_info->core_max, ie
>>>>> if it didn't find a core ?
>>>>>
>>>>
>>>> As explained above, find_core_index() always returns a correct index.
>>>>
>>>>>> +            *str = cputemp_label[DEFAULT_CHANNEL_NUMS + core_index];
>>>>>> +        }
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static int cputemp_read_die(struct device *dev,
>>>>>> +                enum hwmon_sensor_types type,
>>>>>> +                u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_input:
>>>>>> +        rc = get_die_temp(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.die.value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_max:
>>>>>> +        rc = get_tcontrol(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tcontrol.value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_crit:
>>>>>> +        rc = get_tjmax(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tjmax.value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_crit_hyst:
>>>>>> +        rc = get_tcontrol(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static int cputemp_read_dts_margin(struct device *dev,
>>>>>> +                   enum hwmon_sensor_types type,
>>>>>> +                   u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_input:
>>>>>> +        rc = get_dts_margin(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.dts_margin.value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_min:
>>>>>> +        *val = 0;
>>>>>> +        return 0;
>>>>>
>>>>> This attribute should not exist.
>>>>>
>>>>
>>>> This is an attribute of the DTS margin temperature, which reflects the
>>>> thermal margin to Tcontrol of the CPU package. A value of '0' means
>>>> the package has reached Tcontrol, the first level of thermal warning.
>>>> If the CPU keeps getting hotter, the DTS margin shows a negative value
>>>> until it reaches Tjmax. When the temperature finally reaches Tjmax,
>>>> the margin equals the lower critical value, which lcrit reports as
>>>> the second level of thermal warning.
>>>>
>>>
>>> The hwmon ABI reports chip values, not constants. Even though some 
>>> drivers do
>>> it, reporting a constant is always wrong.
>>>
>>>>>> +    case hwmon_temp_lcrit:
>>>>>> +        rc = get_tcontrol(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tcontrol.value - priv->temp.tjmax.value;
>>>>>
>>>>> lcrit is tcontrol - tjmax, and crit_hyst above is
>>>>> tjmax - tcontrol ? How does this make sense ?
>>>>>
>>>>
>>>> Both Tjmax and Tcontrol have positive values, and Tjmax is always
>>>> greater than Tcontrol. As explained above, lcrit of the DTS margin
>>>> should show a negative value, meaning the margin has dropped below
>>>> '0'. On the other hand, crit_hyst of the Die temperature should show
>>>> the absolute hysteresis value between Tcontrol and Tjmax.
>>>>
>>> The hwmon ABI requires reporting of absolute temperatures in 
>>> milli-degrees C.
>>> Your statements make it very clear that this driver does not report
>>> absolute temperatures. This is not acceptable.
>>>
>>
>> Okay. I'll remove the 'DTS margin' temperature. All others are 
>> reporting absolute temperatures.
>>
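
To make "absolute" concrete for the remaining channels: get_die_temp() starts from a reading relative to Tjmax and adds Tjmax back before storing the value. A minimal userspace sketch of that step follows. It assumes the raw GetTemp reading is a signed count of 1/64 degree C, which is what the arithmetic in the patch implies; the int16_t type and the function name are assumptions.

```c
#include <stdint.h>

/*
 * Convert a raw reading relative to Tjmax (signed, 1/64 degree C per
 * count, normally zero or negative) into an absolute temperature in
 * millidegrees C. tjmax is given in millidegrees C.
 */
static int32_t die_temp_millidegree(int32_t tjmax, int16_t temp_raw)
{
	return tjmax + (int32_t)temp_raw * 1000 / 64;
}
```

For example, with Tjmax at 100000 and a raw reading of -640 (10 degrees below Tjmax), the reported value is 90000.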
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static int cputemp_read_tcontrol(struct device *dev,
>>>>>> +                 enum hwmon_sensor_types type,
>>>>>> +                 u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_input:
>>>>>> +        rc = get_tcontrol(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tcontrol.value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_crit:
>>>>>> +        rc = get_tjmax(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tjmax.value;
>>>>>> +        return 0;
>>>>>
>>>>> Am I missing something, or is the same temperature reported several 
>>>>> times ?
>>>>> tjmax is also reported as temp_crit cputemp_read_die(), for example.
>>>>>
>>>>
>>>> This driver provides multiple channels and each channel has its own
>>>> supplementary attributes. As you mentioned, the Die temperature
>>>> channel and the Core temperature channel have their individual crit
>>>> attributes, and they reflect the same value, Tjmax. It is not
>>>> reporting several times but reporting the same value.
>>>>
>>> Then maybe fold the functions accordingly ?
>>>
>>
>> I'll use a single function for 'Die temperature' and 'Core 
>> temperature' that have the same attributes set. It would simplify this 
>> code a bit.
>>
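
A rough sketch of what such a shared handler might look like, in userspace C. The enum, struct and function names are hypothetical stand-ins for the hwmon attribute constants and the driver's cached values; the per-attribute expressions mirror cputemp_read_die() and cputemp_read_core() as posted, without judging whether the crit_hyst expression is the right one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hwmon_temp_* attribute values. */
enum temp_attr { ATTR_INPUT, ATTR_MAX, ATTR_CRIT, ATTR_CRIT_HYST };

/* Cached package-wide limits, in millidegrees C. */
struct thresholds {
	int32_t tcontrol;
	int32_t tjmax;
};

/*
 * One handler for every channel that exposes input/max/crit/crit_hyst.
 * The caller supplies the channel-specific input reading (die or core);
 * everything else is identical between the two channel types.
 */
static int read_common_attr(enum temp_attr attr, int32_t input,
			    const struct thresholds *t, long *val)
{
	assert(t != NULL && val != NULL);

	switch (attr) {
	case ATTR_INPUT:
		*val = input;
		return 0;
	case ATTR_MAX:
		*val = t->tcontrol;
		return 0;
	case ATTR_CRIT:
		*val = t->tjmax;
		return 0;
	case ATTR_CRIT_HYST:
		*val = t->tjmax - t->tcontrol;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

The die and core paths would then differ only in how they obtain the input value before calling the shared handler.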
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static int cputemp_read_tthrottle(struct device *dev,
>>>>>> +                  enum hwmon_sensor_types type,
>>>>>> +                  u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_input:
>>>>>> +        rc = get_tthrottle(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tthrottle.value;
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static int cputemp_read_tjmax(struct device *dev,
>>>>>> +                  enum hwmon_sensor_types type,
>>>>>> +                  u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_input:
>>>>>> +        rc = get_tjmax(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tjmax.value;
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static int cputemp_read_core(struct device *dev,
>>>>>> +                 enum hwmon_sensor_types type,
>>>>>> +                 u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    struct peci_cputemp *priv = dev_get_drvdata(dev);
>>>>>> +    int core_index = find_core_index(priv, channel);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_input:
>>>>>> +        rc = get_core_temp(priv, core_index);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.core[core_index].value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_max:
>>>>>> +        rc = get_tcontrol(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tcontrol.value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_crit:
>>>>>> +        rc = get_tjmax(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tjmax.value;
>>>>>> +        return 0;
>>>>>> +    case hwmon_temp_crit_hyst:
>>>>>> +        rc = get_tcontrol(priv);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp.tjmax.value - priv->temp.tcontrol.value;
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>
>>>>> There is again a lot of duplication in those functions.
>>>>>
>>>>
>>>> Each function is called from cputemp_read(), which is mapped to the
>>>> read function pointer of the hwmon_ops struct. Since each channel has
>>>> a different set of attributes, cputemp_read() calls an individual
>>>> channel handler after checking the channel type. Of course, we can
>>>> handle all attributes of all channels in a single function, but that
>>>> approach also needs channel type checking code for each attribute.
>>>>
>>>>>> +
>>>>>> +static int cputemp_read(struct device *dev,
>>>>>> +            enum hwmon_sensor_types type,
>>>>>> +            u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    switch (channel) {
>>>>>> +    case channel_die:
>>>>>> +        return cputemp_read_die(dev, type, attr, channel, val);
>>>>>> +    case channel_dts_mrgn:
>>>>>> +        return cputemp_read_dts_margin(dev, type, attr, channel, val);
>>>>>> +    case channel_tcontrol:
>>>>>> +        return cputemp_read_tcontrol(dev, type, attr, channel, val);
>>>>>> +    case channel_tthrottle:
>>>>>> +        return cputemp_read_tthrottle(dev, type, attr, channel, val);
>>>>>> +    case channel_tjmax:
>>>>>> +        return cputemp_read_tjmax(dev, type, attr, channel, val);
>>>>>> +    default:
>>>>>> +        if (channel < CPUTEMP_CHANNEL_NUMS)
>>>>>> +            return cputemp_read_core(dev, type, attr, channel, val);
>>>>>> +
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static umode_t cputemp_is_visible(const void *data,
>>>>>> +                  enum hwmon_sensor_types type,
>>>>>> +                  u32 attr, int channel)
>>>>>> +{
>>>>>> +    const struct peci_cputemp *priv = data;
>>>>>> +
>>>>>> +    if (priv->temp_config[channel] & BIT(attr))
>>>>>> +        return 0444;
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static const struct hwmon_ops cputemp_ops = {
>>>>>> +    .is_visible = cputemp_is_visible,
>>>>>> +    .read_string = cputemp_read_string,
>>>>>> +    .read = cputemp_read,
>>>>>> +};
>>>>>> +
>>>>>> +static int check_resolved_cores(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    struct peci_rd_pci_cfg_local_msg msg;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!(priv->client->adapter->cmd_mask & BIT(PECI_CMD_RD_PCI_CFG_LOCAL)))
>>>>>> +        return -EINVAL;
>>>>>> +
>>>>>> +    /* Get the RESOLVED_CORES register value */
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.bus = 1;
>>>>>> +    msg.device = 30;
>>>>>> +    msg.function = 3;
>>>>>> +    msg.reg = 0xB4;
>>>>>
>>>>> Can this be made less magic with some defines ?
>>>>>
>>>>
>>>> Sure, will use defines instead.
>>>>
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PCI_CFG_LOCAL, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    priv->core_mask = msg.pci_config[3] << 24 |
>>>>>> +              msg.pci_config[2] << 16 |
>>>>>> +              msg.pci_config[1] << 8 |
>>>>>> +              msg.pci_config[0];
>>>>>> +
>>>>>> +    if (!priv->core_mask)
>>>>>> +        return -EAGAIN;
>>>>>> +
>>>>>> +    dev_dbg(priv->dev, "Scanned resolved cores: 0x%x\n", priv->core_mask);
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int create_core_temp_info(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    int rc, i;
>>>>>> +
>>>>>> +    rc = check_resolved_cores(priv);
>>>>>> +    if (!rc) {
>>>>>> +        for (i = 0; i < priv->gen_info->core_max; i++) {
>>>>>> +            if (priv->core_mask & BIT(i)) {
>>>>>> +                priv->temp_config[priv->config_idx++] =
>>>>>> +                             config_table[channel_core];
>>>>>> +            }
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return rc;
>>>>>> +}
>>>>>> +
>>>>>> +static int check_cpu_id(struct peci_cputemp *priv)
>>>>>> +{
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    u32 cpu_id;
>>>>>> +    int i, rc;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.index = MBX_INDEX_CPU_ID;
>>>>>> +    msg.param = PKG_ID_CPU_ID;
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>>>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>>>>> +
>>>>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>>>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>>>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>>>>> +            break;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!priv->gen_info)
>>>>>> +        return -ENODEV;
>>>>>> +
>>>>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int peci_cputemp_probe(struct peci_client *client)
>>>>>> +{
>>>>>> +    struct device *dev = &client->dev;
>>>>>> +    struct peci_cputemp *priv;
>>>>>> +    struct device *hwmon_dev;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if ((client->adapter->cmd_mask &
>>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>>>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>>>>> +        return -EINVAL;
>>>>>
>>>>> Does this mean there will be an error message for each 
>>>>> non-supported CPU ?
>>>>> Why ?
>>>>>
>>>>
>>>> For proper operation of this driver, PECI_CMD_GET_TEMP and 
>>>> PECI_CMD_RD_PKG_CFG have to be supported by a client CPU. 
>>>> PECI_CMD_GET_TEMP is provided as a default command but 
>>>> PECI_CMD_RD_PKG_CFG depends on PECI minor revision of a CPU package 
>>>> so this checking is needed.
>>>>
>>>
>>> I do not question the check. I question the error message and error 
>>> return value.
>>> Why is it an _error_ if the CPU does not support the functionality, 
>>> and why does
>>> it have to be reported in the kernel log ?
>>>
>>
>> Got it. I'll change that to dev_dbg.
>>
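
For reference, the check itself is an "all required bits set" test on the adapter's command mask, which can be written once instead of repeating the mask expression on both sides of the comparison. A minimal userspace sketch follows; the command numbers are placeholders (the real values come from the PECI core header) and has_all_cmds() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Placeholder command numbers, for illustration only. */
enum { CMD_GET_TEMP = 2, CMD_RD_PKG_CFG = 3, CMD_RD_PCI_CFG_LOCAL = 7 };

/* True only if every bit in 'required' is also set in 'cmd_mask'. */
static bool has_all_cmds(uint32_t cmd_mask, uint32_t required)
{
	return (cmd_mask & required) == required;
}
```

A probe function could call it with BIT(CMD_GET_TEMP) | BIT(CMD_RD_PKG_CFG) and decide separately how quietly to report a missing command.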
>>>>>> +    }
>>>>>> +
>>>>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>>>>> +    if (!priv)
>>>>>> +        return -ENOMEM;
>>>>>> +
>>>>>> +    dev_set_drvdata(dev, priv);
>>>>>> +    priv->client = client;
>>>>>> +    priv->dev = dev;
>>>>>> +    priv->addr = client->addr;
>>>>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>>>>> +
>>>>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_cputemp.cpu%d",
>>>>>> +         priv->cpu_no);
>>>>>> +
>>>>>> +    rc = check_cpu_id(priv);
>>>>>> +    if (rc) {
>>>>>> +        dev_err(dev, "Client CPU is not supported\n");
>>>>>
>>>>> -ENODEV is not an error, and should not result in an error message.
>>>>> Besides, the error can also be propagated from peci core code,
>>>>> and may well be something else.
>>>>>
>>>>
>>>> Got it. I'll remove the error message and will add a proper handling 
>>>> code into PECI core.
>>>>
>>>>>> +        return rc;
>>>>>> +    }
>>>>>> +
>>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_die];
>>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_dts_mrgn];
>>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tcontrol];
>>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tthrottle];
>>>>>> +    priv->temp_config[priv->config_idx++] = config_table[channel_tjmax];
>>>>>> +
>>>>>> +    rc = create_core_temp_info(priv);
>>>>>> +    if (rc)
>>>>>> +        dev_dbg(dev, "Failed to create core temp info\n");
>>>>>
>>>>> Then what ? Shouldn't this result in probe deferral or something 
>>>>> more useful
>>>>> instead of just being ignored ?
>>>>>
>>>>
>>>> This driver can't support core temperature monitoring if a CPU
>>>> doesn't support the PECI_CMD_RD_PCI_CFG_LOCAL command. In that case,
>>>> it skips core temperature group creation and supports only basic
>>>> temperature monitoring of Die, DTS margin, etc. I'll add this
>>>> description as a comment.
>>>>
>>>
>>> The message says "Failed to ...". It does not say "This CPU does not 
>>> support ...".
>>>
>>
>> Got it. Will correct the message.
>>
>>>>>> +
>>>>>> +    priv->chip.ops = &cputemp_ops;
>>>>>> +    priv->chip.info = priv->info;
>>>>>> +
>>>>>> +    priv->info[0] = &priv->temp_info;
>>>>>> +
>>>>>> +    priv->temp_info.type = hwmon_temp;
>>>>>> +    priv->temp_info.config = priv->temp_config;
>>>>>> +
>>>>>> +    hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>>>>> +                             priv->name,
>>>>>> +                             priv,
>>>>>> +                             &priv->chip,
>>>>>> +                             NULL);
>>>>>> +
>>>>>> +    if (IS_ERR(hwmon_dev))
>>>>>> +        return PTR_ERR(hwmon_dev);
>>>>>> +
>>>>>> +    dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
>>>>>> +
>>>
>>> Why does this message display the device name twice ?
>>>
>>
>> For example, dev_name(hwmon_dev) shows 'hwmon5' and priv->name
>> shows 'peci-cputemp0'.
>>
> And dev_dbg() shows another device name. So you'll have something like
> 
> peci-cputemp0: hwmon5: sensor 'peci-cputemp0'
> 

In practice it shows something like

peci-cputemp 0-30:00: hwmon10: sensor 'peci_cputemp.cpu0'

where 0-30:00 is assigned by peci core.

>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static const struct of_device_id peci_cputemp_of_table[] = {
>>>>>> +    { .compatible = "intel,peci-cputemp" },
>>>>>> +    { }
>>>>>> +};
>>>>>> +MODULE_DEVICE_TABLE(of, peci_cputemp_of_table);
>>>>>> +
>>>>>> +static struct peci_driver peci_cputemp_driver = {
>>>>>> +    .probe  = peci_cputemp_probe,
>>>>>> +    .driver = {
>>>>>> +        .name           = "peci-cputemp",
>>>>>> +        .of_match_table = of_match_ptr(peci_cputemp_of_table),
>>>>>> +    },
>>>>>> +};
>>>>>> +module_peci_driver(peci_cputemp_driver);
>>>>>> +
>>>>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>>>>> +MODULE_DESCRIPTION("PECI cputemp driver");
>>>>>> +MODULE_LICENSE("GPL v2");
>>>>>> diff --git a/drivers/hwmon/peci-dimmtemp.c b/drivers/hwmon/peci-dimmtemp.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..78bf29cb2c4c
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/hwmon/peci-dimmtemp.c
>>>>>
>>>>> FWIW, this should be two separate patches.
>>>>>
>>>>
>>>> Should I split out hwmon documents and dt bindings too?
>>>>
>>>>>> @@ -0,0 +1,432 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>> +// Copyright (c) 2018 Intel Corporation
>>>>>> +
>>>>>> +#include <linux/delay.h>
>>>>>> +#include <linux/hwmon.h>
>>>>>> +#include <linux/hwmon-sysfs.h>
>>>>>
>>>>> Needed ?
>>>>>
>>>>
>>>> No. Will drop the line.
>>>>
>>>>>> +#include <linux/jiffies.h>
>>>>>> +#include <linux/module.h>
>>>>>> +#include <linux/of_device.h>
>>>>>> +#include <linux/peci.h>
>>>>>> +#include <linux/workqueue.h>
>>>>>> +
>>>>>> +#define TEMP_TYPE_PECI       6  /* Sensor type 6: Intel PECI */
>>>>>> +
>>>>>> +#define CHAN_RANK_MAX_ON_HSX 8  /* Max number of channel ranks on Haswell */
>>>>>> +#define DIMM_IDX_MAX_ON_HSX  3  /* Max DIMM index per channel on Haswell */
>>>>>> +
>>>>>> +#define CHAN_RANK_MAX_ON_BDX 4  /* Max number of channel ranks on Broadwell */
>>>>>> +#define DIMM_IDX_MAX_ON_BDX  3  /* Max DIMM index per channel on Broadwell */
>>>>>> +
>>>>>> +#define CHAN_RANK_MAX_ON_SKX 6  /* Max number of channel ranks on Skylake */
>>>>>> +#define DIMM_IDX_MAX_ON_SKX  2  /* Max DIMM index per channel on Skylake */
>>>>>> +
>>>>>> +#define CHAN_RANK_MAX        CHAN_RANK_MAX_ON_HSX
>>>>>> +#define DIMM_IDX_MAX         DIMM_IDX_MAX_ON_HSX
>>>>>> +
>>>>>> +#define DIMM_NUMS_MAX        (CHAN_RANK_MAX * DIMM_IDX_MAX)
>>>>>> +
>>>>>> +#define CLIENT_CPU_ID_MASK   0xf0ff0  /* Mask for Family / Model info */
>>>>>> +
>>>>>> +#define UPDATE_INTERVAL_MIN  HZ
>>>>>> +
>>>>>> +#define DIMM_MASK_CHECK_DELAY_JIFFIES msecs_to_jiffies(5000)
>>>>>> +#define DIMM_MASK_CHECK_RETRY_MAX     60 /* 60 x 5 secs = 5 minutes */
>>>>>> +
>>>>>> +enum cpu_gens {
>>>>>> +    CPU_GEN_HSX, /* Haswell Xeon */
>>>>>> +    CPU_GEN_BRX, /* Broadwell Xeon */
>>>>>> +    CPU_GEN_SKX, /* Skylake Xeon */
>>>>>> +    CPU_GEN_MAX
>>>>>> +};
>>>>>> +
>>>>>> +struct cpu_gen_info {
>>>>>> +    u32 type;
>>>>>> +    u32 cpu_id;
>>>>>> +    u32 chan_rank_max;
>>>>>> +    u32 dimm_idx_max;
>>>>>> +};
>>>>>> +
>>>>>> +struct temp_data {
>>>>>> +    bool valid;
>>>>>> +    s32  value;
>>>>>> +    unsigned long last_updated;
>>>>>> +};
>>>>>> +
>>>>>> +struct peci_dimmtemp {
>>>>>> +    struct peci_client *client;
>>>>>> +    struct device *dev;
>>>>>> +    struct workqueue_struct *work_queue;
>>>>>> +    struct delayed_work work_handler;
>>>>>> +    char name[PECI_NAME_SIZE];
>>>>>> +    struct temp_data temp[DIMM_NUMS_MAX];
>>>>>> +    u8 addr;
>>>>>> +    uint cpu_no;
>>>>>> +    const struct cpu_gen_info *gen_info;
>>>>>> +    u32 dimm_mask;
>>>>>> +    int retry_count;
>>>>>> +    int channels;
>>>>>> +    u32 temp_config[DIMM_NUMS_MAX + 1];
>>>>>> +    struct hwmon_channel_info temp_info;
>>>>>> +    const struct hwmon_channel_info *info[2];
>>>>>> +    struct hwmon_chip_info chip;
>>>>>> +};
>>>>>> +
>>>>>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>>>>>> +    { .type  = CPU_GEN_HSX,
>>>>>> +      .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>>>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
>>>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_HSX },
>>>>>> +    { .type  = CPU_GEN_BRX,
>>>>>> +      .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>>>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_BDX,
>>>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_BDX },
>>>>>> +    { .type  = CPU_GEN_SKX,
>>>>>> +      .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>>>>>> +      .chan_rank_max = CHAN_RANK_MAX_ON_SKX,
>>>>>> +      .dimm_idx_max  = DIMM_IDX_MAX_ON_SKX },
>>>>>> +};
>>>>>> +
>>>>>> +static const char *dimmtemp_label[CHAN_RANK_MAX][DIMM_IDX_MAX] = {
>>>>>> +    { "DIMM A0", "DIMM A1", "DIMM A2" },
>>>>>> +    { "DIMM B0", "DIMM B1", "DIMM B2" },
>>>>>> +    { "DIMM C0", "DIMM C1", "DIMM C2" },
>>>>>> +    { "DIMM D0", "DIMM D1", "DIMM D2" },
>>>>>> +    { "DIMM E0", "DIMM E1", "DIMM E2" },
>>>>>> +    { "DIMM F0", "DIMM F1", "DIMM F2" },
>>>>>> +    { "DIMM G0", "DIMM G1", "DIMM G2" },
>>>>>> +    { "DIMM H0", "DIMM H1", "DIMM H2" },
>>>>>> +};
>>>>>> +
>>>>>> +static int send_peci_cmd(struct peci_dimmtemp *priv, enum peci_cmd cmd,
>>>>>> +             void *msg)
>>>>>> +{
>>>>>> +    return peci_command(priv->client->adapter, cmd, msg);
>>>>>> +}
>>>>>> +
>>>>>> +static int need_update(struct temp_data *temp)
>>>>>> +{
>>>>>> +    if (temp->valid &&
>>>>>> +        time_before(jiffies, temp->last_updated + UPDATE_INTERVAL_MIN))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    return 1;
>>>>>> +}
>>>>>> +
>>>>>> +static void mark_updated(struct temp_data *temp)
>>>>>> +{
>>>>>> +    temp->valid = true;
>>>>>> +    temp->last_updated = jiffies;
>>>>>> +}
>>>>>
>>>>> It might make sense to provide the duplicate functions in a core file.
>>>>>
>>>>
>>>> It is a temperature-monitoring-specific function and it touches
>>>> module-specific variables. Do you really think that this non-generic
>>>> function should be moved to PECI core?
>>>>
>>>>>> +
>>>>>> +static int get_dimm_temp(struct peci_dimmtemp *priv, int dimm_no)
>>>>>> +{
>>>>>> +    int dimm_order = dimm_no % priv->gen_info->dimm_idx_max;
>>>>>> +    int chan_rank = dimm_no / priv->gen_info->dimm_idx_max;
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if (!need_update(&priv->temp[dimm_no]))
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>>>>> +    msg.param = chan_rank;
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    priv->temp[dimm_no].value = msg.pkg_config[dimm_order] * 1000;
>>>>>> +
>>>>>> +    mark_updated(&priv->temp[dimm_no]);
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int find_dimm_number(struct peci_dimmtemp *priv, int channel)
>>>>>> +{
>>>>>> +    int dimm_nums_max = priv->gen_info->chan_rank_max *
>>>>>> +                priv->gen_info->dimm_idx_max;
>>>>>> +    int idx, found = 0;
>>>>>> +
>>>>>> +    for (idx = 0; idx < dimm_nums_max; idx++) {
>>>>>> +        if (priv->dimm_mask & BIT(idx)) {
>>>>>> +            if (channel == found)
>>>>>> +                break;
>>>>>> +
>>>>>> +            found++;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return idx;
>>>>>> +}
>>>>>
>>>>> This again looks like duplicate code.
>>>>>
>>>>
>>>> find_dimm_number()? I'm sure it isn't.
>>>>
>>>>>> +
>>>>>> +static int dimmtemp_read_string(struct device *dev,
>>>>>> +                enum hwmon_sensor_types type,
>>>>>> +                u32 attr, int channel, const char **str)
>>>>>> +{
>>>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>>>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>>>>> +    int dimm_no, chan_rank, dimm_idx;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_label:
>>>>>> +        dimm_no = find_dimm_number(priv, channel);
>>>>>> +        chan_rank = dimm_no / dimm_idx_max;
>>>>>> +        dimm_idx = dimm_no % dimm_idx_max;
>>>>>> +        *str = dimmtemp_label[chan_rank][dimm_idx];
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static int dimmtemp_read(struct device *dev, enum hwmon_sensor_types type,
>>>>>> +             u32 attr, int channel, long *val)
>>>>>> +{
>>>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(dev);
>>>>>> +    int dimm_no = find_dimm_number(priv, channel);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_input:
>>>>>> +        rc = get_dimm_temp(priv, dimm_no);
>>>>>> +        if (rc)
>>>>>> +            return rc;
>>>>>> +
>>>>>> +        *val = priv->temp[dimm_no].value;
>>>>>> +        return 0;
>>>>>> +    default:
>>>>>> +        return -EOPNOTSUPP;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static umode_t dimmtemp_is_visible(const void *data,
>>>>>> +                   enum hwmon_sensor_types type,
>>>>>> +                   u32 attr, int channel)
>>>>>> +{
>>>>>> +    switch (attr) {
>>>>>> +    case hwmon_temp_label:
>>>>>> +    case hwmon_temp_input:
>>>>>> +        return 0444;
>>>>>> +    default:
>>>>>> +        return 0;
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static const struct hwmon_ops dimmtemp_ops = {
>>>>>> +    .is_visible = dimmtemp_is_visible,
>>>>>> +    .read_string = dimmtemp_read_string,
>>>>>> +    .read = dimmtemp_read,
>>>>>> +};
>>>>>> +
>>>>>> +static int check_populated_dimms(struct peci_dimmtemp *priv)
>>>>>> +{
>>>>>> +    u32 chan_rank_max = priv->gen_info->chan_rank_max;
>>>>>> +    u32 dimm_idx_max = priv->gen_info->dimm_idx_max;
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    int chan_rank, dimm_idx;
>>>>>> +    int rc, channels = 0;
>>>>>> +
>>>>>> +    for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
>>>>>> +        msg.addr = priv->addr;
>>>>>> +        msg.index = MBX_INDEX_DDR_DIMM_TEMP;
>>>>>> +        msg.param = chan_rank;
>>>>>> +        msg.rx_len = 4;
>>>>>> +
>>>>>> +        rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +        if (rc) {
>>>>>> +            priv->dimm_mask = 0;
>>>>>> +            return rc;
>>>>>> +        }
>>>>>> +
>>>>>> +        for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++) {
>>>>>> +            if (msg.pkg_config[dimm_idx]) {
>>>>>> +                priv->dimm_mask |= BIT(chan_rank *
>>>>>> +                               chan_rank_max +
>>>>>> +                               dimm_idx);
>>>>>> +                channels++;
>>>>>> +            }
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!priv->dimm_mask)
>>>>>> +        return -EAGAIN;
>>>>>> +
>>>>>> +    priv->channels = channels;
>>>>>> +
>>>>>> +    dev_dbg(priv->dev, "Scanned populated DIMMs: 0x%x\n", priv->dimm_mask);
>>>>>> +    return 0;
>>>>>> +}
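One thing worth checking in the function above: the bit position is computed as chan_rank * chan_rank_max + dimm_idx, while find_dimm_number(), dimmtemp_read_string() and get_dimm_temp() split a DIMM number using dimm_idx_max. A minimal user-space sketch, assuming the intended layout is dimm_idx_max slots per channel rank (the helper names below are hypothetical and not part of the patch), shows an encode/decode pair that uses the same stride in both directions:

```c
/* Bit position of a DIMM in the mask: dimm_idx_max slots per channel rank. */
static unsigned int dimm_bit(unsigned int chan_rank, unsigned int dimm_idx,
			     unsigned int dimm_idx_max)
{
	return chan_rank * dimm_idx_max + dimm_idx;
}

/* Inverse of dimm_bit(): recover the channel rank from a bit position. */
static unsigned int bit_to_chan_rank(unsigned int bit, unsigned int dimm_idx_max)
{
	return bit / dimm_idx_max;
}

/* Inverse of dimm_bit(): recover the DIMM index within the channel rank. */
static unsigned int bit_to_dimm_idx(unsigned int bit, unsigned int dimm_idx_max)
{
	return bit % dimm_idx_max;
}
```

With the Haswell limits from the patch (8 channel ranks, 3 DIMMs each) the highest bit under this layout is 23, which fits in the u32 dimm_mask. A bit computed with a stride of 8 instead, for example 1 * 8 + 0 = 8, would be decoded by the helpers above as channel rank 2, DIMM index 2.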
>>>>>> +
>>>>>> +static int create_dimm_temp_info(struct peci_dimmtemp *priv)
>>>>>> +{
>>>>>> +    struct device *hwmon_dev;
>>>>>> +    int rc, i;
>>>>>> +
>>>>>> +    rc = check_populated_dimms(priv);
>>>>>> +    if (!rc) {
>>>>>
>>>>> Please handle error cases first.
>>>>>
>>>>
>>>> Sure, I'll rewrite it.
>>>>
>>>>>> +        for (i = 0; i < priv->channels; i++)
>>>>>> +            priv->temp_config[i] = HWMON_T_LABEL | HWMON_T_INPUT;
>>>>>> +
>>>>>> +        priv->chip.ops = &dimmtemp_ops;
>>>>>> +        priv->chip.info = priv->info;
>>>>>> +
>>>>>> +        priv->info[0] = &priv->temp_info;
>>>>>> +
>>>>>> +        priv->temp_info.type = hwmon_temp;
>>>>>> +        priv->temp_info.config = priv->temp_config;
>>>>>> +
>>>>>> +        hwmon_dev = devm_hwmon_device_register_with_info(priv->dev,
>>>>>> +                                 priv->name,
>>>>>> +                                 priv,
>>>>>> +                                 &priv->chip,
>>>>>> +                                 NULL);
>>>>>> +        rc = PTR_ERR_OR_ZERO(hwmon_dev);
>>>>>> +        if (!rc)
>>>>>> +            dev_dbg(priv->dev, "%s: sensor '%s'\n",
>>>>>> +                dev_name(hwmon_dev), priv->name);
>>>>>> +    } else if (rc == -EAGAIN) {
>>>>>> +        if (priv->retry_count < DIMM_MASK_CHECK_RETRY_MAX) {
>>>>>> +            queue_delayed_work(priv->work_queue,
>>>>>> +                       &priv->work_handler,
>>>>>> +                       DIMM_MASK_CHECK_DELAY_JIFFIES);
>>>>>> +            priv->retry_count++;
>>>>>> +            dev_dbg(priv->dev,
>>>>>> +                "Deferred DIMM temp info creation\n");
>>>>>> +        } else {
>>>>>> +            rc = -ETIMEDOUT;
>>>>>> +            dev_err(priv->dev,
>>>>>> +                "Timeout retrying DIMM temp info creation\n");
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return rc;
>>>>>> +}
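For the "handle error cases first" comment on create_dimm_temp_info(), one possible shape is sketched below as a small user-space model. It is only an illustration under stated assumptions: struct retry_state, create_info() and the scan_rc parameter are made up here, and the two driver calls are reduced to counters so that the control flow can be followed in isolation.

```c
#include <errno.h>

#define RETRY_MAX 60 /* mirrors DIMM_MASK_CHECK_RETRY_MAX in the patch */

/* Hypothetical stand-in for the relevant fields of struct peci_dimmtemp. */
struct retry_state {
	int retry_count;
	int requeued;   /* how often the delayed work would have been re-armed */
	int registered; /* set once the hwmon device would have been registered */
};

/*
 * Error-first variant: every failure path returns early, so the success
 * path stays at the outermost indentation level. scan_rc stands for the
 * return value of check_populated_dimms().
 */
static int create_info(struct retry_state *st, int scan_rc)
{
	if (scan_rc == -EAGAIN) {
		if (st->retry_count >= RETRY_MAX)
			return -ETIMEDOUT;

		st->retry_count++;
		st->requeued++; /* queue_delayed_work() in the driver */
		return -EAGAIN;
	}

	if (scan_rc)
		return scan_rc;

	st->registered = 1; /* hwmon registration in the driver */
	return 0;
}
```

The return values match the posted version: -EAGAIN while retries remain, -ETIMEDOUT once the retry budget is used up, any other scan error passed through unchanged, and 0 on success.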
>>>>>> +
>>>>>> +static void create_dimm_temp_info_delayed(struct work_struct *work)
>>>>>> +{
>>>>>> +    struct delayed_work *dwork = to_delayed_work(work);
>>>>>> +    struct peci_dimmtemp *priv = container_of(dwork, struct peci_dimmtemp,
>>>>>> +                          work_handler);
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    rc = create_dimm_temp_info(priv);
>>>>>> +    if (rc && rc != -EAGAIN)
>>>>>> +        dev_dbg(priv->dev, "Failed to create DIMM temp info\n");
>>>>>> +}
>>>>>> +
>>>>>> +static int check_cpu_id(struct peci_dimmtemp *priv)
>>>>>> +{
>>>>>> +    struct peci_rd_pkg_cfg_msg msg;
>>>>>> +    u32 cpu_id;
>>>>>> +    int i, rc;
>>>>>> +
>>>>>> +    msg.addr = priv->addr;
>>>>>> +    msg.index = MBX_INDEX_CPU_ID;
>>>>>> +    msg.param = PKG_ID_CPU_ID;
>>>>>> +    msg.rx_len = 4;
>>>>>> +
>>>>>> +    rc = send_peci_cmd(priv, PECI_CMD_RD_PKG_CFG, &msg);
>>>>>> +    if (rc)
>>>>>> +        return rc;
>>>>>> +
>>>>>> +    cpu_id = ((msg.pkg_config[2] << 16) | (msg.pkg_config[1] << 8) |
>>>>>> +          msg.pkg_config[0]) & CLIENT_CPU_ID_MASK;
>>>>>> +
>>>>>> +    for (i = 0; i < CPU_GEN_MAX; i++) {
>>>>>> +        if (cpu_id == cpu_gen_info_table[i].cpu_id) {
>>>>>> +            priv->gen_info = &cpu_gen_info_table[i];
>>>>>> +            break;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!priv->gen_info)
>>>>>> +        return -ENODEV;
>>>>>> +
>>>>>> +    dev_dbg(priv->dev, "CPU_ID: 0x%x\n", cpu_id);
>>>>>> +    return 0;
>>>>>> +}
>>>>>
>>>>> More duplicate code.
>>>>>
>>>>
>>>> Okay. In case of check_cpu_id(), it could be used as a generic PECI 
>>>> function. I'll move it into PECI core.
>>>>
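For reference, the masked IDs in cpu_gen_info_table follow the usual CPUID signature layout for family 6 parts. The sketch below, in plain user-space C, derives the "Family code" and "Model number" values given in the table comments from the masked ID. The helper names are hypothetical; only CLIENT_CPU_ID_MASK and the ID values come from the patch.

```c
#include <stdint.h>

#define CLIENT_CPU_ID_MASK 0xf0ff0u /* same mask value as in the patch */

/* The family code sits in bits 11:8 of the signature. */
static unsigned int cpu_family(uint32_t cpu_id)
{
	return (cpu_id >> 8) & 0xf;
}

/*
 * For family 6, the displayed model number combines the extended model
 * (bits 19:16) as the high nibble with the model field (bits 7:4).
 */
static unsigned int cpu_model(uint32_t cpu_id)
{
	return (((cpu_id >> 16) & 0xf) << 4) | ((cpu_id >> 4) & 0xf);
}

/* Drop the stepping (bits 3:0) and unrelated fields before a table lookup. */
static uint32_t cpu_id_masked(uint32_t raw)
{
	return raw & CLIENT_CPU_ID_MASK;
}
```

Applied to the three table entries this gives family 6 with models 63, 79 and 85, matching the comments, and any stepping value collapses onto the same table key.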
>>>>>> +
>>>>>> +static int peci_dimmtemp_probe(struct peci_client *client)
>>>>>> +{
>>>>>> +    struct device *dev = &client->dev;
>>>>>> +    struct peci_dimmtemp *priv;
>>>>>> +    int rc;
>>>>>> +
>>>>>> +    if ((client->adapter->cmd_mask &
>>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) !=
>>>>>> +        (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG))) {
>>>>>
>>>>> One set of ( ) is unnecessary on each side of the expression.
>>>>>
>>>>
>>>> '&' has a precedence over '!=' but '|' doesn't. I'll rewrite it to:
>>>>
>>>
>>> Actually, that is wrong. You refer to address-of. Bit operations do
>>> have lower precedence than comparisons. I stand corrected.
>>>
>>>>      if (client->adapter->cmd_mask &
>>>>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)) !=
>>>>          (BIT(PECI_CMD_GET_TEMP) | BIT(PECI_CMD_RD_PKG_CFG)))
>>>>
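To make the precedence point concrete: in C, '!=' binds tighter than '&', which in turn binds tighter than '|'. A minimal user-space sketch follows; the command numbers and function names are hypothetical and chosen only for illustration. It compares the fully parenthesized check with the grouping the compiler applies once the parentheses around the '&' expression are dropped.

```c
#define BIT(n) (1u << (n))

/* Hypothetical command numbers, for illustration only. */
enum { CMD_GET_TEMP = 2, CMD_RD_PKG_CFG = 4 };

#define REQUIRED_CMDS (BIT(CMD_GET_TEMP) | BIT(CMD_RD_PKG_CFG))

/* Parenthesized form: nonzero when at least one required command is missing. */
static int missing_cmds(unsigned int cmd_mask)
{
	return (cmd_mask & REQUIRED_CMDS) != REQUIRED_CMDS;
}

/*
 * How "mask & a != b" is grouped by the compiler: the comparison is
 * evaluated first and its 0/1 result is then ANDed with the mask.
 */
static int grouped_without_parens(unsigned int mask, unsigned int a,
				  unsigned int b)
{
	return mask & (a != b);
}
```

When a and b are the same required-command mask, the comparison is always 0, so the second form can never report a missing command. The outer parentheses in the original condition are therefore needed.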
>>>>>> +        dev_err(dev, "Client doesn't support temperature monitoring\n");
>>>>>> +        return -EINVAL;
>>>>>
>>>>> Why is this "invalid", and why does it warrant an error message ?
>>>>>
>>>>
>>>> Should I use -EPERM? Any suggestion?
>>>>
>>>
>>> Is it an _error_ if the CPU does not support this functionality ?
>>>
>>
>> Actually, it returns from this probe() function without making any 
>> hwmon info creation so I intended to handle this case as an error. Am 
>> I wrong?
>>
> 
> If the functionality or HW supported by the driver isn't available, it
> is customary to return -ENODEV and no error message. Otherwise the
> kernel log would drown in "not supported" error messages. I don't see
> where it would add any value to handle this driver differently.
> 
> EINVAL    Invalid argument
> EPERM    Operation not permitted
> 
> You'll have to work hard to convince me that any of those makes sense,
> and that
> 
> ENODEV    No such device
> 
> doesn't. More specifically, if EINVAL makes sense, the caller did
> something wrong, meaning there is a problem in the infrastructure which
> should get fixed. The same is true for EPERM.
> 

Now I fully understand what you pointed out. Thanks for the detailed
explanation. I'll change the error return value to -ENODEV and use
dev_dbg for the message. Thanks!
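A minimal sketch of that convention, as a user-space model rather than the actual probe code: the capability bit values and the function name are hypothetical, and the dev_dbg() call is left as a comment so that the snippet stays self-contained.

```c
#include <errno.h>

/* Stand-ins for the adapter capability bits; the values are hypothetical. */
#define CAP_GET_TEMP   (1u << 0)
#define CAP_RD_PKG_CFG (1u << 1)
#define CAP_REQUIRED   (CAP_GET_TEMP | CAP_RD_PKG_CFG)

/*
 * Model of the check at the start of probe(): missing functionality is
 * reported as "no such device". A driver would log this at dev_dbg()
 * level at most, so that unsupported hardware does not fill the kernel
 * log with error messages.
 */
static int probe_check_caps(unsigned int cmd_mask)
{
	if ((cmd_mask & CAP_REQUIRED) != CAP_REQUIRED)
		return -ENODEV;

	return 0;
}
```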

>>>>>> +    }
>>>>>> +
>>>>>> +    priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
>>>>>> +    if (!priv)
>>>>>> +        return -ENOMEM;
>>>>>> +
>>>>>> +    dev_set_drvdata(dev, priv);
>>>>>> +    priv->client = client;
>>>>>> +    priv->dev = dev;
>>>>>> +    priv->addr = client->addr;
>>>>>> +    priv->cpu_no = priv->addr - PECI_BASE_ADDR;
>>>>>
>>>>> Is priv->addr guaranteed to be >= PECI_BASE_ADDR ?
>>>>
>>>> Client address range validation will be done in 
>>>> peci_check_addr_validity() in PECI core before probing a device driver.
>>>>
>>>>>> +
>>>>>> +    snprintf(priv->name, PECI_NAME_SIZE, "peci_dimmtemp.cpu%d",
>>>>>> +         priv->cpu_no);
>>>>>> +
>>>>>> +    rc = check_cpu_id(priv);
>>>>>> +    if (rc) {
>>>>>> +        dev_err(dev, "Client CPU is not supported\n");
>>>>>
>>>>> Or the peci command failed.
>>>>>
>>>>
>>>> I'll remove the error message and will add a proper handling code 
>>>> into PECI core on each error type.
>>>>
>>>>>> +        return rc;
>>>>>> +    }
>>>>>> +
>>>>>> +    priv->work_queue = alloc_ordered_workqueue(priv->name, 0);
>>>>>> +    if (!priv->work_queue)
>>>>>> +        return -ENOMEM;
>>>>>> +
>>>>>> +    INIT_DELAYED_WORK(&priv->work_handler, create_dimm_temp_info_delayed);
>>>>>> +
>>>>>> +    rc = create_dimm_temp_info(priv);
>>>>>> +    if (rc && rc != -EAGAIN) {
>>>>>> +        dev_err(dev, "Failed to create DIMM temp info\n");
>>>>>> +        goto err_free_wq;
>>>>>> +    }
>>>>>> +
>>>>>> +    return 0;
>>>>>> +
>>>>>> +err_free_wq:
>>>>>> +    destroy_workqueue(priv->work_queue);
>>>>>> +    return rc;
>>>>>> +}
>>>>>> +
>>>>>> +static int peci_dimmtemp_remove(struct peci_client *client)
>>>>>> +{
>>>>>> +    struct peci_dimmtemp *priv = dev_get_drvdata(&client->dev);
>>>>>> +
>>>>>> +    cancel_delayed_work(&priv->work_handler);
>>>>>
>>>>> cancel_delayed_work_sync() ?
>>>>>
>>>>
>>>> Yes, it would be safer. Will fix it.
>>>>
>>>>>> +    destroy_workqueue(priv->work_queue);
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static const struct of_device_id peci_dimmtemp_of_table[] = {
>>>>>> +    { .compatible = "intel,peci-dimmtemp" },
>>>>>> +    { }
>>>>>> +};
>>>>>> +MODULE_DEVICE_TABLE(of, peci_dimmtemp_of_table);
>>>>>> +
>>>>>> +static struct peci_driver peci_dimmtemp_driver = {
>>>>>> +    .probe  = peci_dimmtemp_probe,
>>>>>> +    .remove = peci_dimmtemp_remove,
>>>>>> +    .driver = {
>>>>>> +        .name           = "peci-dimmtemp",
>>>>>> +        .of_match_table = of_match_ptr(peci_dimmtemp_of_table),
>>>>>> +    },
>>>>>> +};
>>>>>> +module_peci_driver(peci_dimmtemp_driver);
>>>>>> +
>>>>>> +MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
>>>>>> +MODULE_DESCRIPTION("PECI dimmtemp driver");
>>>>>> +MODULE_LICENSE("GPL v2");
>>>>>> -- 
>>>>>> 2.16.2
>>>>>>
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-12 17:09             ` Jae Hyun Yoo
@ 2018-04-12 17:37               ` Guenter Roeck
  2018-04-12 19:51                 ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Guenter Roeck @ 2018-04-12 17:37 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On Thu, Apr 12, 2018 at 10:09:51AM -0700, Jae Hyun Yoo wrote:
[ ... ]
> >>>>>>+static int find_core_index(struct peci_cputemp *priv, int channel)
> >>>>>>+{
> >>>>>>+    int core_channel = channel - DEFAULT_CHANNEL_NUMS;
> >>>>>>+    int idx, found = 0;
> >>>>>>+
> >>>>>>+    for (idx = 0; idx < priv->gen_info->core_max; idx++) {
> >>>>>>+        if (priv->core_mask & BIT(idx)) {
> >>>>>>+            if (core_channel == found)
> >>>>>>+                break;
> >>>>>>+
> >>>>>>+            found++;
> >>>>>>+        }
> >>>>>>+    }
> >>>>>>+
> >>>>>>+    return idx;
> >>>>>
> >>>>>What if nothing is found ?
> >>>>>
> >>>>
> >>>>Core temperature group will be registered only when it detects at
> >>>>least one core checked by check_resolved_cores(), so
> >>>>find_core_index() can be called only when priv->core_mask has a
> >>>>non-zero value. The 'nothing is found' case will not happen.
> >>>>
> >>>That doesn't guarantee a match. If what you are saying is correct
> >>>there should always be
> >>>a well defined match of channel -> idx, and the search should be
> >>>unnecessary.
> >>>
> >>
> >>There could be some disabled cores in the resolved core mask bit
> >>sequence also it should remove indexing gap in channel numbering so it
> >>is the reason why this search function is needed. Well defined match of
> >>channel -> idx would not be always satisfied.
> >>
> >Are you saying that each call to the function, with the same parameters,
> >can return a different result ?
> >
> 
> No, the result will be consistent. After reading the priv->core_mask once in
> check_resolved_cores(), the value will not be changed. I'm saying about this
> case, for example if core number 2 is unresolved in total 4 cores, then the
> idx order will be '0, 1, 3' but channel order will be '5, 6, 7' without
> making any indexing gap.
> 

And yet you claim that this is not well defined ? Or are you concerned
about the amount of memory consumed by providing an array for the mapping ?

Note that an indexing gap is acceptable and, in many cases, preferred.

[ ... ]

> >>>>>>+
> >>>>>>+    dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
> >>>>>>+
> >>>
> >>>Why does this message display the device name twice ?
> >>>
> >>
> >>For an example, dev_name(hwmon_dev) shows 'hwmon5' and priv->name shows
> >>'peci-cputemp0'.
> >>
> >And dev_dbg() shows another device name. So you'll have something like
> >
> >peci-cputemp0: hwmon5: sensor 'peci-cputemp0'
> >
> 
> Practically it shows like
> 
> peci-cputemp 0-30:00: hwmon10: sensor 'peci_cputemp.cpu0'
> 
> where 0-30:00 is assigned by peci core.
> 

And what message would you see for cpu1 ?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-12 17:37               ` Guenter Roeck
@ 2018-04-12 19:51                 ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-12 19:51 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Haiyue Wang, James Feist, Jason M Biils, Jean Delvare,
	Joel Stanley, Julia Cartwright, Miguel Ojeda, Milton Miller II,
	Pavel Machek, Randy Dunlap, Stef van Os, Sumeet R Pawnikar,
	Vernon Mauery, linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On 4/12/2018 10:37 AM, Guenter Roeck wrote:
> On Thu, Apr 12, 2018 at 10:09:51AM -0700, Jae Hyun Yoo wrote:
> [ ... ]
>>>>>>>> +static int find_core_index(struct peci_cputemp *priv, int channel)
>>>>>>>> +{
>>>>>>>> +    int core_channel = channel - DEFAULT_CHANNEL_NUMS;
>>>>>>>> +    int idx, found = 0;
>>>>>>>> +
>>>>>>>> +    for (idx = 0; idx < priv->gen_info->core_max; idx++) {
>>>>>>>> +        if (priv->core_mask & BIT(idx)) {
>>>>>>>> +            if (core_channel == found)
>>>>>>>> +                break;
>>>>>>>> +
>>>>>>>> +            found++;
>>>>>>>> +        }
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    return idx;
>>>>>>>
>>>>>>> What if nothing is found ?
>>>>>>>
>>>>>>
>>>>>> Core temperature group will be registered only when it detects at
>>>>>> least one core checked by check_resolved_cores(), so
>>>>>> find_core_index() can be called only when priv->core_mask has a
>>>>>> non-zero value. The 'nothing is found' case will not happen.
>>>>>>
>>>>> That doesn't guarantee a match. If what you are saying is correct
>>>>> there should always be
>>>>> a well defined match of channel -> idx, and the search should be
>>>>> unnecessary.
>>>>>
>>>>
>>>> There could be some disabled cores in the resolved core mask bit
>>>> sequence also it should remove indexing gap in channel numbering so it
>>>> is the reason why this search function is needed. Well defined match of
>>>> channel -> idx would not be always satisfied.
>>>>
>>> Are you saying that each call to the function, with the same parameters,
>>> can return a different result ?
>>>
>>
>> No, the result will be consistent. After reading the priv->core_mask once in
>> check_resolved_cores(), the value will not be changed. I'm saying about this
>> case, for example if core number 2 is unresolved in total 4 cores, then the
>> idx order will be '0, 1, 3' but channel order will be '5, 6, 7' without
>> making any indexing gap.
>>
> 
> And yet you claim that this is not well defined ? Or are you concerned
> about the amount of memory consumed by providing an array for the mapping ?
> 
> Note that an indexing gap is acceptable and, in many cases, preferred.
> 

If the indexing gap is acceptable, the index search function isn't
needed anymore. I'll fix all related code to use a direct mapping of
channel -> idx then. Thanks!
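To illustrate the difference between the two numbering schemes, here is a minimal user-space sketch. The function names are hypothetical, and DEFAULT_CHANNEL_NUMS is assumed to be 5, based only on the "5, 6, 7" channel example discussed above. The first helper models the compacted numbering of find_core_index() with an explicit "not found" result; the other two model a direct mapping in which unresolved cores are simply hidden.

```c
#define BIT(n) (1u << (n))
#define DEFAULT_CHANNEL_NUMS 5 /* assumed from the "5, 6, 7" example */

/*
 * Compacted numbering: the Nth set bit of core_mask backs core channel N.
 * Returns the core index, or -1 when the channel has no matching core.
 */
static int compact_core_index(unsigned int core_mask, int core_max, int channel)
{
	int want = channel - DEFAULT_CHANNEL_NUMS;
	int idx, found = 0;

	for (idx = 0; idx < core_max; idx++) {
		if (!(core_mask & BIT(idx)))
			continue;
		if (found++ == want)
			return idx;
	}

	return -1;
}

/* Direct numbering: channel and core index differ by a constant offset. */
static int direct_core_index(int channel)
{
	return channel - DEFAULT_CHANNEL_NUMS;
}

/* With direct numbering, an unresolved core is hidden instead of renumbered. */
static int core_channel_visible(unsigned int core_mask, int core_max, int channel)
{
	int idx = direct_core_index(channel);

	return idx >= 0 && idx < core_max && (core_mask & BIT(idx)) != 0;
}
```

For four cores with core 2 unresolved (mask 0xb), the compacted scheme maps channels 5, 6, 7 to cores 0, 1, 3, while the direct scheme maps channels 5, 6, 8 to cores 0, 1, 3 and leaves channel 7 hidden. A visibility test of this kind would naturally belong in the is_visible callback.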

> [ ... ]
> 
>>>>>>>> +
>>>>>>>> +    dev_dbg(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), priv->name);
>>>>>>>> +
>>>>>
>>>>> Why does this message display the device name twice ?
>>>>>
>>>>
>>>> For an example, dev_name(hwmon_dev) shows 'hwmon5' and priv->name shows
>>>> 'peci-cputemp0'.
>>>>
>>> And dev_dbg() shows another device name. So you'll have something like
>>>
>>> peci-cputemp0: hwmon5: sensor 'peci-cputemp0'
>>>
>>
>> Practically it shows like
>>
>> peci-cputemp 0-30:00: hwmon10: sensor 'peci_cputemp.cpu0'
>>
>> where 0-30:00 is assigned by peci core.
>>
> 
> And what message would you see for cpu1 ?
> 

It shows something like

peci-cputemp 0-31:00: hwmon10: sensor 'peci_cputemp.cpu1'

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers
  2018-04-10 18:32 ` [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers Jae Hyun Yoo
  2018-04-11 11:52   ` Joel Stanley
@ 2018-04-16 17:59   ` Rob Herring
  2018-04-16 23:06     ` Jae Hyun Yoo
  1 sibling, 1 reply; 54+ messages in thread
From: Rob Herring @ 2018-04-16 17:59 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-hwmon, devicetree,
	linux-doc, openbmc, linux-kernel, linux-arm-kernel

On Tue, Apr 10, 2018 at 11:32:03AM -0700, Jae Hyun Yoo wrote:
> This commit adds documents of generic PECI bus, adapter and client drivers.

"dt-bindings: ..." for the subject prefix please.

> 
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
> Reviewed-by: James Feist <james.feist@linux.intel.com>
> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Andrew Jeffery <andrew@aj.id.au>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Julia Cartwright <juliac@eso.teric.us>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Cc: Milton Miller II <miltonm@us.ibm.com>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
> ---
>  .../devicetree/bindings/peci/peci-adapter.txt      | 23 ++++++++++++++++++++
>  .../devicetree/bindings/peci/peci-bus.txt          | 15 +++++++++++++
>  .../devicetree/bindings/peci/peci-client.txt       | 25 ++++++++++++++++++++++

This should be all one document.

>  3 files changed, 63 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt
> 
> diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt b/Documentation/devicetree/bindings/peci/peci-adapter.txt
> new file mode 100644
> index 000000000000..9221374f6b11
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
> @@ -0,0 +1,23 @@
> +Generic device tree configuration for PECI adapters.
> +
> +Required properties:
> +- compatible     : Should contain hardware specific definition strings that can
> +		   match an adapter driver implementation.
> +- reg            : Should contain PECI controller registers location and length.

No need for these 2 here.

> +- #address-cells : Should be <1>.
> +- #size-cells    : Should be <0>.

Some details on the addressing for PECI would be good.

> +
> +Example:
> +	peci: peci@10000000 {
> +		compatible = "simple-bus";
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0x0 0x10000000 0x1000>;
> +

This part of the example is not relevant. Just start with the adapter 
node.

> +		peci0: peci-bus@0 {
> +			compatible = "soc,soc-peci";
> +			reg = <0x0 0x1000>;
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +		};
> +	};
> diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt b/Documentation/devicetree/bindings/peci/peci-bus.txt
> new file mode 100644
> index 000000000000..90bcc791ccb0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
> @@ -0,0 +1,15 @@
> +Generic device tree configuration for PECI buses.
> +
> +Required properties:
> +- compatible     : Should be "simple-bus".

I don't understand what this has to do with PECI? "simple-bus" already 
has a defined meaning.

> +- #address-cells : Should be <1>.
> +- #size-cells    : Should be <1>.
> +- ranges         : Should contain PECI controller registers ranges.
> +
> +Example:
> +	peci: peci@10000000 {
> +		compatible = "simple-bus";
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0x0 0x10000000 0x1000>;
> +	};
> diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt b/Documentation/devicetree/bindings/peci/peci-client.txt
> new file mode 100644
> index 000000000000..8e2bfd8532f6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-client.txt
> @@ -0,0 +1,25 @@
> +Generic device tree configuration for PECI clients.
> +
> +Required properties:
> +- compatible : Should contain target device specific definition strings that can
> +	       match a client driver implementation.

Bindings are for h/w, not client drivers.

How are PECI devices defined?

> +- reg        : Should contain address of a client CPU. Address range of CPU
> +	       clients is starting from 0x30 based on PECI specification.
> +	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)

8 devices should be enough for anyone...

Where is PECI_OFFSET_MAX defined?

> +
> +Example:
> +	peci-bus@0 {
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +		< more properties >
> +
> +		function@cpu0 {

Not a valid node name. "function@30" is what it probably should be. For 
a new bus you can define unit-address format you like, but it must be 
based on the contents of reg. However, it doesn't look like you should 
create anything special here.

> +			compatible = "device,function";
> +			reg = <0x30>;
> +		};
> +
> +		function@cpu1 {
> +			compatible = "device,function";
> +			reg = <0x31>;
> +		};
> +	};
> -- 
> 2.16.2
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-10 18:32 ` [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs Jae Hyun Yoo
  2018-04-11 11:52   ` Joel Stanley
@ 2018-04-16 18:10   ` Rob Herring
  2018-04-16 23:12     ` Jae Hyun Yoo
  1 sibling, 1 reply; 54+ messages in thread
From: Rob Herring @ 2018-04-16 18:10 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
> AST24xx/25xx SoCs.
> 
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
> Reviewed-by: James Feist <james.feist@linux.intel.com>
> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Andrew Jeffery <andrew@aj.id.au>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Julia Cartwright <juliac@eso.teric.us>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Cc: Milton Miller II <miltonm@us.ibm.com>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
> ---
>  .../devicetree/bindings/peci/peci-aspeed.txt       | 60 ++++++++++++++++++++++
>  1 file changed, 60 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
> 
> diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
> new file mode 100644
> index 000000000000..4598bb8c20fa
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
> @@ -0,0 +1,60 @@
> +Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
> +
> +Required properties:
> +- compatible        : Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
> +		      - aspeed,ast2400-peci: Aspeed AST2400 family PECI
> +					     controller
> +		      - aspeed,ast2500-peci: Aspeed AST2500 family PECI
> +					     controller
> +- reg               : Should contain PECI controller registers location and
> +		      length.
> +- #address-cells    : Should be <1>.
> +- #size-cells       : Should be <0>.
> +- interrupts        : Should contain PECI controller interrupt.
> +- clocks            : Should contain clock source for PECI controller.
> +		      Should reference clkin.
> +- clock_frequency   : Should contain the operation frequency of PECI controller
> +		      in units of Hz.
> +		      187500 ~ 24000000

This is the frequency of the bus or used to derive it? It would be 
better to specify the bus frequency instead and have the driver 
calculate its internal freq. And then use "bus-frequency" instead.

> +
> +Optional properties:
> +- msg-timing-nego   : Message timing negotiation period. This value will
> +		      determine the period of message timing negotiation to be
> +		      issued by PECI controller. The unit of the programmed
> +		      value is four times of PECI clock period.
> +		      0 ~ 255 (default: 1)
> +- addr-timing-nego  : Address timing negotiation period. This value will
> +		      determine the period of address timing negotiation to be
> +		      issued by PECI controller. The unit of the programmed
> +		      value is four times of PECI clock period.
> +		      0 ~ 255 (default: 1)
> +- rd-sampling-point : Read sampling point selection. The whole period of a bit
> +		      time will be divided into 16 time frames. This value will
> +		      determine the time frame in which the controller will
> +		      sample PECI signal for data read back. Usually in the
> +		      middle of a bit time is the best.
> +		      0 ~ 15 (default: 8)
> +- cmd_timeout_ms    : Command timeout in units of ms.
> +		      1 ~ 60000 (default: 1000)

s/_/-/


All these either need vendor prefixes or should be standard properties 
for PECI adapters. I think probably the latter case. If so, the first 
2 should probably be in units of clocks (not 4 clocks). And they should 
then be documented in the common PECI binding doc.

> +
> +Example:
> +	peci: peci@1e78b000 {
> +		compatible = "simple-bus";
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0x0 0x1e78b000 0x60>;

No need to show this part in examples.

> +
> +		peci0: peci-bus@0 {
> +			compatible = "aspeed,ast2500-peci";
> +			reg = <0x0 0x60>;
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			interrupts = <15>;
> +			clocks = <&clk_clkin>;
> +			clock-frequency = <24000000>;
> +			msg-timing-nego = <1>;
> +			addr-timing-nego = <1>;
> +			rd-sampling-point = <8>;
> +			cmd-timeout-ms = <1000>;
> +		};
> +	};
> -- 
> 2.16.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-10 18:32 ` [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers Jae Hyun Yoo
@ 2018-04-16 18:14   ` Rob Herring
  2018-04-16 23:22     ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Rob Herring @ 2018-04-16 18:14 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
> This commit adds dt-bindings documents for PECI cputemp and dimmtemp client
> drivers.

"dt-bindings: hwmon: ..." for the subject.

> 
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
> Reviewed-by: James Feist <james.feist@linux.intel.com>
> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Andrew Jeffery <andrew@aj.id.au>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
> Cc: Jean Delvare <jdelvare@suse.com>
> Cc: Joel Stanley <joel@jms.id.au>
> Cc: Julia Cartwright <juliac@eso.teric.us>
> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Cc: Milton Miller II <miltonm@us.ibm.com>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
> ---
>  .../devicetree/bindings/hwmon/peci-cputemp.txt     | 24 +++++++++++++++++++++
>  .../devicetree/bindings/hwmon/peci-dimmtemp.txt    | 25 ++++++++++++++++++++++
>  2 files changed, 49 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
>  create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
> 
> diff --git a/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
> new file mode 100644
> index 000000000000..d5530ef9cfd2
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
> @@ -0,0 +1,24 @@
> +Bindings for Intel PECI (Platform Environment Control Interface) cputemp driver.
> +
> +Required properties:
> +- compatible : Should be "intel,peci-cputemp".
> +- reg        : Should contain address of a client CPU. Address range of CPU
> +	       clients is starting from 0x30 based on PECI specification.
> +	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)

Again, where is PECI_OFFSET_MAX defined? It can't depend on something in 
the kernel.

> +
> +Example:
> +	peci-bus@0 {
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +		< more properties >
> +
> +		peci-cputemp@cpu0 {
> +			compatible = "intel,peci-cputemp";
> +			reg = <0x30>;
> +		};
> +
> +		peci-cputemp@cpu1 {
> +			compatible = "intel,peci-cputemp";
> +			reg = <0x31>;
> +		};
> +	};
> diff --git a/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
> new file mode 100644
> index 000000000000..56e5deb61e5c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
> @@ -0,0 +1,25 @@
> +Bindings for Intel PECI (Platform Environment Control Interface) dimmtemp
> +driver.
> +
> +Required properties:
> +- compatible : Should be "intel,peci-dimmtemp".
> +- reg        : Should contain address of a client CPU. Address range of CPU
> +	       clients is starting from 0x30 based on PECI specification.
> +	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
> +
> +Example:
> +	peci-bus@0 {
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +		< more properties >
> +
> +		peci-dimmtemp@cpu0 {

unit-address is wrong.

It is a different bus from cputemp? Otherwise, you have conflicting 
addresses. If that's the case, probably should make it clear by showing 
different host adapters for each example.

> +			compatible = "intel,peci-dimmtemp";
> +			reg = <0x30>;
> +		};
> +
> +		peci-dimmtemp@cpu1 {
> +			compatible = "intel,peci-dimmtemp";
> +			reg = <0x31>;
> +		};
> +	};
> -- 
> 2.16.2
> 


* Re: [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers
  2018-04-16 17:59   ` Rob Herring
@ 2018-04-16 23:06     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-16 23:06 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-hwmon, devicetree,
	linux-doc, openbmc, linux-kernel, linux-arm-kernel

Hi Rob,

Thanks for taking the time to review. Please see my answers inline.

On 4/16/2018 10:59 AM, Rob Herring wrote:
> On Tue, Apr 10, 2018 at 11:32:03AM -0700, Jae Hyun Yoo wrote:
>> This commit adds documents of generic PECI bus, adapter and client drivers.
> 
> "dt-bindings: ..." for the subject prefix please.
>

Sure, I'll change the subject.

>>
>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>> Cc: Alan Cox <alan@linux.intel.com>
>> Cc: Andrew Jeffery <andrew@aj.id.au>
>> Cc: Andrew Lunn <andrew@lunn.ch>
>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>> Cc: Greg KH <gregkh@linuxfoundation.org>
>> Cc: Guenter Roeck <linux@roeck-us.net>
>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>> Cc: Jean Delvare <jdelvare@suse.com>
>> Cc: Joel Stanley <joel@jms.id.au>
>> Cc: Julia Cartwright <juliac@eso.teric.us>
>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>> Cc: Milton Miller II <miltonm@us.ibm.com>
>> Cc: Pavel Machek <pavel@ucw.cz>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>> ---
>>   .../devicetree/bindings/peci/peci-adapter.txt      | 23 ++++++++++++++++++++
>>   .../devicetree/bindings/peci/peci-bus.txt          | 15 +++++++++++++
>>   .../devicetree/bindings/peci/peci-client.txt       | 25 ++++++++++++++++++++++
> 
> This should be all one document.
> 

Okay. I'll combine them into one document.

>>   3 files changed, 63 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-adapter.txt
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-bus.txt
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-client.txt
>>
>> diff --git a/Documentation/devicetree/bindings/peci/peci-adapter.txt b/Documentation/devicetree/bindings/peci/peci-adapter.txt
>> new file mode 100644
>> index 000000000000..9221374f6b11
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-adapter.txt
>> @@ -0,0 +1,23 @@
>> +Generic device tree configuration for PECI adapters.
>> +
>> +Required properties:
>> +- compatible     : Should contain hardware specific definition strings that can
>> +		   match an adapter driver implementation.
>> +- reg            : Should contain PECI controller registers location and length.
> 
> No need for these 2 here.
> 

Will drop these 2.

>> +- #address-cells : Should be <1>.
>> +- #size-cells    : Should be <0>.
> 
> Some details on the addressing for PECI would be good.
> 

It is for the PECI client address. Will add details.

>> +
>> +Example:
>> +	peci: peci@10000000 {
>> +		compatible = "simple-bus";
>> +		#address-cells = <1>;
>> +		#size-cells = <1>;
>> +		ranges = <0x0 0x10000000 0x1000>;
>> +
> 
> This part of the example is not relevant. Just start with the adapter
> node.
> 

Will remove that part. Thanks!

>> +		peci0: peci-bus@0 {
>> +			compatible = "soc,soc-peci";
>> +			reg = <0x0 0x1000>;
>> +			#address-cells = <1>;
>> +			#size-cells = <0>;
>> +		};
>> +	};
>> diff --git a/Documentation/devicetree/bindings/peci/peci-bus.txt b/Documentation/devicetree/bindings/peci/peci-bus.txt
>> new file mode 100644
>> index 000000000000..90bcc791ccb0
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-bus.txt
>> @@ -0,0 +1,15 @@
>> +Generic device tree configuration for PECI buses.
>> +
>> +Required properties:
>> +- compatible     : Should be "simple-bus".
> 
> I don't understand what this has to do with PECI? "simple-bus" already
> has a defined meaning.
> 

Maybe I'm wrong, but I intended to show that this node is an umbrella
node of a PECI bus subsystem. What should I use instead?

>> +- #address-cells : Should be <1>.
>> +- #size-cells    : Should be <1>.
>> +- ranges         : Should contain PECI controller registers ranges.
>> +
>> +Example:
>> +	peci: peci@10000000 {
>> +		compatible = "simple-bus";
>> +		#address-cells = <1>;
>> +		#size-cells = <1>;
>> +		ranges = <0x0 0x10000000 0x1000>;
>> +	};
>> diff --git a/Documentation/devicetree/bindings/peci/peci-client.txt b/Documentation/devicetree/bindings/peci/peci-client.txt
>> new file mode 100644
>> index 000000000000..8e2bfd8532f6
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-client.txt
>> @@ -0,0 +1,25 @@
>> +Generic device tree configuration for PECI clients.
>> +
>> +Required properties:
>> +- compatible : Should contain target device specific definition strings that can
>> +	       match a client driver implementation.
> 
> Bindings are for h/w, not client drivers.
> 
> How are PECI devices defined?
> 

Got it. I'll correct the description. A PECI client device is an Intel
CPU connected through a PECI bus.

>> +- reg        : Should contain address of a client CPU. Address range of CPU
>> +	       clients is starting from 0x30 based on PECI specification.
>> +	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
> 
> 8 devices should be enough for anyone...
> 
> Where is PECI_OFFSET_MAX defined?
> 

PECI_OFFSET_MAX is defined in include/linux/peci.h based on the maximum
number of CPUs in the current IA generation. I'll remove the unnecessary
details. An out-of-range setting would be handled in the kernel.

>> +
>> +Example:
>> +	peci-bus@0 {
>> +		#address-cells = <1>;
>> +		#size-cells = <0>;
>> +		< more properties >
>> +
>> +		function@cpu0 {
> 
> Not a valid node name. "function@30" is what it probably should be. For
> a new bus you can define unit-address format you like, but it must be
> based on the contents of reg. However, it doesn't look like you should
> create anything special here.
> 

Got it. I'll rename these nodes to function@30 and function@31.
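
For illustration, the renamed client nodes might look roughly like this.
It is only a sketch: "device,function" is the placeholder compatible
string from the patch, and each unit-address is simply the reg value in
hex without the 0x prefix.

	peci-bus@0 {
		#address-cells = <1>;
		#size-cells = <0>;
		/* adapter properties (compatible, reg, ...) omitted */

		/* unit-address follows reg: 0x30 -> @30 */
		function@30 {
			compatible = "device,function";
			reg = <0x30>;
		};

		/* unit-address follows reg: 0x31 -> @31 */
		function@31 {
			compatible = "device,function";
			reg = <0x31>;
		};
	};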

Thanks a lot for your comments!

-Jae

>> +			compatible = "device,function";
>> +			reg = <0x30>;
>> +		};
>> +
>> +		function@cpu1 {
>> +			compatible = "device,function";
>> +			reg = <0x31>;
>> +		};
>> +	};
>> -- 
>> 2.16.2
>>


* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-16 18:10   ` Rob Herring
@ 2018-04-16 23:12     ` Jae Hyun Yoo
  2018-04-17 13:16       ` Rob Herring
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-16 23:12 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On 4/16/2018 11:10 AM, Rob Herring wrote:
> On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
>> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
>> AST24xx/25xx SoCs.
>>
>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>> Cc: Alan Cox <alan@linux.intel.com>
>> Cc: Andrew Jeffery <andrew@aj.id.au>
>> Cc: Andrew Lunn <andrew@lunn.ch>
>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>> Cc: Greg KH <gregkh@linuxfoundation.org>
>> Cc: Guenter Roeck <linux@roeck-us.net>
>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>> Cc: Jean Delvare <jdelvare@suse.com>
>> Cc: Joel Stanley <joel@jms.id.au>
>> Cc: Julia Cartwright <juliac@eso.teric.us>
>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>> Cc: Milton Miller II <miltonm@us.ibm.com>
>> Cc: Pavel Machek <pavel@ucw.cz>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>> ---
>>   .../devicetree/bindings/peci/peci-aspeed.txt       | 60 ++++++++++++++++++++++
>>   1 file changed, 60 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
>>
>> diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
>> new file mode 100644
>> index 000000000000..4598bb8c20fa
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
>> @@ -0,0 +1,60 @@
>> +Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
>> +
>> +Required properties:
>> +- compatible        : Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
>> +		      - aspeed,ast2400-peci: Aspeed AST2400 family PECI
>> +					     controller
>> +		      - aspeed,ast2500-peci: Aspeed AST2500 family PECI
>> +					     controller
>> +- reg               : Should contain PECI controller registers location and
>> +		      length.
>> +- #address-cells    : Should be <1>.
>> +- #size-cells       : Should be <0>.
>> +- interrupts        : Should contain PECI controller interrupt.
>> +- clocks            : Should contain clock source for PECI controller.
>> +		      Should reference clkin.
>> +- clock_frequency   : Should contain the operation frequency of PECI controller
>> +		      in units of Hz.
>> +		      187500 ~ 24000000
> 
> This is the frequency of the bus or used to derive it? It would be
> better to specify the bus frequency instead and have the driver
> calculate its internal freq. And then use "bus-frequency" instead.
> 

I agree with you. Actually, this property sets the operating frequency
of the PECI controller module in the SoC, so its meaning differs from
"bus-frequency". I'll change it to "operation-frequency".

>> +
>> +Optional properties:
>> +- msg-timing-nego   : Message timing negotiation period. This value will
>> +		      determine the period of message timing negotiation to be
>> +		      issued by PECI controller. The unit of the programmed
>> +		      value is four times of PECI clock period.
>> +		      0 ~ 255 (default: 1)
>> +- addr-timing-nego  : Address timing negotiation period. This value will
>> +		      determine the period of address timing negotiation to be
>> +		      issued by PECI controller. The unit of the programmed
>> +		      value is four times of PECI clock period.
>> +		      0 ~ 255 (default: 1)
>> +- rd-sampling-point : Read sampling point selection. The whole period of a bit
>> +		      time will be divided into 16 time frames. This value will
>> +		      determine the time frame in which the controller will
>> +		      sample PECI signal for data read back. Usually in the
>> +		      middle of a bit time is the best.
>> +		      0 ~ 15 (default: 8)
>> +- cmd_timeout_ms    : Command timeout in units of ms.
>> +		      1 ~ 60000 (default: 1000)
> 
> s/_/-/
> 

Will fix it.

> 
> All these either need vendor prefixes or should be standard properties
> for PECI adapters. I think probably the latter case. If so, the first
> 2 should probably be in units of clocks (not 4 clocks). And they should
> then be documented in the common PECI binding doc.
> 

As far as I've checked, these are ASPEED PECI controller specific
properties, so they should be listed here.

>> +
>> +Example:
>> +	peci: peci@1e78b000 {
>> +		compatible = "simple-bus";
>> +		#address-cells = <1>;
>> +		#size-cells = <1>;
>> +		ranges = <0x0 0x1e78b000 0x60>;
> 
> No need to show this part in examples.
> 

Got it. Will drop the part.
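
A sketch of the trimmed example, starting at the adapter node. The
assumptions are mine: the reg base is taken from the ranges value of the
wrapper being dropped (0x1e78b000), the node's unit-address is changed to
match it, and the property names are left exactly as in the patch's
example, so the proposed frequency rename is not applied here.

	peci0: peci-bus@1e78b000 {
		compatible = "aspeed,ast2500-peci";
		/* absolute address, since the simple-bus wrapper is gone */
		reg = <0x1e78b000 0x60>;
		#address-cells = <1>;
		#size-cells = <0>;
		interrupts = <15>;
		clocks = <&clk_clkin>;
		clock-frequency = <24000000>;
		msg-timing-nego = <1>;
		addr-timing-nego = <1>;
		rd-sampling-point = <8>;
		cmd-timeout-ms = <1000>;
	};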

>> +
>> +		peci0: peci-bus@0 {
>> +			compatible = "aspeed,ast2500-peci";
>> +			reg = <0x0 0x60>;
>> +			#address-cells = <1>;
>> +			#size-cells = <0>;
>> +			interrupts = <15>;
>> +			clocks = <&clk_clkin>;
>> +			clock-frequency = <24000000>;
>> +			msg-timing-nego = <1>;
>> +			addr-timing-nego = <1>;
>> +			rd-sampling-point = <8>;
>> +			cmd-timeout-ms = <1000>;
>> +		};
>> +	};
>> -- 
>> 2.16.2
>>


* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-16 18:14   ` Rob Herring
@ 2018-04-16 23:22     ` Jae Hyun Yoo
  2018-04-16 23:51       ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-16 23:22 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On 4/16/2018 11:14 AM, Rob Herring wrote:
> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp client
>> drivers.
> 
> "dt-bindings: hwmon: ..." for the subject.
> 

I'll change the subject.

>>
>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>> Cc: Alan Cox <alan@linux.intel.com>
>> Cc: Andrew Jeffery <andrew@aj.id.au>
>> Cc: Andrew Lunn <andrew@lunn.ch>
>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>> Cc: Greg KH <gregkh@linuxfoundation.org>
>> Cc: Guenter Roeck <linux@roeck-us.net>
>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>> Cc: Jean Delvare <jdelvare@suse.com>
>> Cc: Joel Stanley <joel@jms.id.au>
>> Cc: Julia Cartwright <juliac@eso.teric.us>
>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>> Cc: Milton Miller II <miltonm@us.ibm.com>
>> Cc: Pavel Machek <pavel@ucw.cz>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>> ---
>>   .../devicetree/bindings/hwmon/peci-cputemp.txt     | 24 +++++++++++++++++++++
>>   .../devicetree/bindings/hwmon/peci-dimmtemp.txt    | 25 ++++++++++++++++++++++
>>   2 files changed, 49 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
>>   create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
>>
>> diff --git a/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
>> new file mode 100644
>> index 000000000000..d5530ef9cfd2
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
>> @@ -0,0 +1,24 @@
>> +Bindings for Intel PECI (Platform Environment Control Interface) cputemp driver.
>> +
>> +Required properties:
>> +- compatible : Should be "intel,peci-cputemp".
>> +- reg        : Should contain address of a client CPU. Address range of CPU
>> +	       clients is starting from 0x30 based on PECI specification.
>> +	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
> 
> Again, where is PECI_OFFSET_MAX defined? It can't depend on something in
> the kernel.
> 

I'll remove the unnecessary description.

>> +
>> +Example:
>> +	peci-bus@0 {
>> +		#address-cells = <1>;
>> +		#size-cells = <0>;
>> +		< more properties >
>> +
>> +		peci-cputemp@cpu0 {
>> +			compatible = "intel,peci-cputemp";
>> +			reg = <0x30>;
>> +		};
>> +
>> +		peci-cputemp@cpu1 {
>> +			compatible = "intel,peci-cputemp";
>> +			reg = <0x31>;
>> +		};
>> +	};
>> diff --git a/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
>> new file mode 100644
>> index 000000000000..56e5deb61e5c
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
>> @@ -0,0 +1,25 @@
>> +Bindings for Intel PECI (Platform Environment Control Interface) dimmtemp
>> +driver.
>> +
>> +Required properties:
>> +- compatible : Should be "intel,peci-dimmtemp".
>> +- reg        : Should contain address of a client CPU. Address range of CPU
>> +	       clients is starting from 0x30 based on PECI specification.
>> +	       <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
>> +
>> +Example:
>> +	peci-bus@0 {
>> +		#address-cells = <1>;
>> +		#size-cells = <0>;
>> +		< more properties >
>> +
>> +		peci-dimmtemp@cpu0 {
> 
> unit-address is wrong.
> 

Will fix it using the reg value.

> It is a different bus from cputemp? Otherwise, you have conflicting
> addresses. If that's the case, probably should make it clear by showing
> different host adapters for each example.
> 

It could be the same bus as cputemp. The PECI core also allows client
address sharing if the functionality is different. That is, cputemp and
dimmtemp can target the same client, like this:
peci-cputemp@30
peci-dimmtemp@30
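
Written out as a tree fragment, that arrangement would look roughly like
the sketch below. It only restates the two node names above with the
compatible strings from the patch. Whether two sibling nodes may share a
unit-address is exactly the open question here, and device tree tooling
may warn about it.

	peci-bus@0 {
		#address-cells = <1>;
		#size-cells = <0>;
		/* adapter properties (compatible, reg, ...) omitted */

		/* both nodes target the CPU client at address 0x30 */
		peci-cputemp@30 {
			compatible = "intel,peci-cputemp";
			reg = <0x30>;
		};

		peci-dimmtemp@30 {
			compatible = "intel,peci-dimmtemp";
			reg = <0x30>;
		};
	};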

>> +			compatible = "intel,peci-dimmtemp";
>> +			reg = <0x30>;
>> +		};
>> +
>> +		peci-dimmtemp@cpu1 {
>> +			compatible = "intel,peci-dimmtemp";
>> +			reg = <0x31>;
>> +		};
>> +	};
>> -- 
>> 2.16.2
>>


* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-16 23:22     ` Jae Hyun Yoo
@ 2018-04-16 23:51       ` Jae Hyun Yoo
  2018-04-17 20:40         ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-16 23:51 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
> On 4/16/2018 11:14 AM, Rob Herring wrote:
>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp 
>>> client
>>> drivers.
>>
>> "dt-bindings: hwmon: ..." for the subject.
>>
> 
> I'll change the subject.
> 
>>>
>>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>>> Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
>>> Reviewed-by: James Feist <james.feist@linux.intel.com>
>>> Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
>>> Cc: Alan Cox <alan@linux.intel.com>
>>> Cc: Andrew Jeffery <andrew@aj.id.au>
>>> Cc: Andrew Lunn <andrew@lunn.ch>
>>> Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> Cc: Fengguang Wu <fengguang.wu@intel.com>
>>> Cc: Greg KH <gregkh@linuxfoundation.org>
>>> Cc: Guenter Roeck <linux@roeck-us.net>
>>> Cc: Jason M Biils <jason.m.bills@linux.intel.com>
>>> Cc: Jean Delvare <jdelvare@suse.com>
>>> Cc: Joel Stanley <joel@jms.id.au>
>>> Cc: Julia Cartwright <juliac@eso.teric.us>
>>> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
>>> Cc: Milton Miller II <miltonm@us.ibm.com>
>>> Cc: Pavel Machek <pavel@ucw.cz>
>>> Cc: Randy Dunlap <rdunlap@infradead.org>
>>> Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
>>> Cc: Sumeet R Pawnikar <sumeet.r.pawnikar@intel.com>
>>> ---
>>>   .../devicetree/bindings/hwmon/peci-cputemp.txt     | 24 +++++++++++++++++++++
>>>   .../devicetree/bindings/hwmon/peci-dimmtemp.txt    | 25 ++++++++++++++++++++++
>>>   2 files changed, 49 insertions(+)
>>>   create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
>>>   create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
>>> new file mode 100644
>>> index 000000000000..d5530ef9cfd2
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
>>> @@ -0,0 +1,24 @@
>>> +Bindings for Intel PECI (Platform Environment Control Interface) cputemp driver.
>>> +
>>> +Required properties:
>>> +- compatible : Should be "intel,peci-cputemp".
>>> +- reg        : Should contain address of a client CPU. Address range of CPU
>>> +           clients is starting from 0x30 based on PECI specification.
>>> +           <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
>>
>> Again, where is PECI_OFFSET_MAX defined? It can't depend on something in
>> the kernel.
>>
> 
> I'll remove the unnecessary description.
> 
>>> +
>>> +Example:
>>> +    peci-bus@0 {
>>> +        #address-cells = <1>;
>>> +        #size-cells = <0>;
>>> +        < more properties >
>>> +
>>> +        peci-cputemp@cpu0 {
>>> +            compatible = "intel,peci-cputemp";
>>> +            reg = <0x30>;
>>> +        };
>>> +
>>> +        peci-cputemp@cpu1 {
>>> +            compatible = "intel,peci-cputemp";
>>> +            reg = <0x31>;
>>> +        };
>>> +    };
>>> diff --git a/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
>>> new file mode 100644
>>> index 000000000000..56e5deb61e5c
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
>>> @@ -0,0 +1,25 @@
>>> +Bindings for Intel PECI (Platform Environment Control Interface) dimmtemp
>>> +driver.
>>> +
>>> +Required properties:
>>> +- compatible : Should be "intel,peci-dimmtemp".
>>> +- reg        : Should contain address of a client CPU. Address range of CPU
>>> +           clients is starting from 0x30 based on PECI specification.
>>> +           <0x30> .. <0x37> (depends on the PECI_OFFSET_MAX definition)
>>> +
>>> +Example:
>>> +    peci-bus@0 {
>>> +        #address-cells = <1>;
>>> +        #size-cells = <0>;
>>> +        < more properties >
>>> +
>>> +        peci-dimmtemp@cpu0 {
>>
>> unit-address is wrong.
>>
> 
> Will fix it using the reg value.
> 
>> It is a different bus from cputemp? Otherwise, you have conflicting
>> addresses. If that's the case, probably should make it clear by showing
>> different host adapters for each example.
>>
> 
> It could be the same bus with cputemp. Also, client address sharing is 
> possible by PECI core if the functionality is different. I mean, cputemp 
> and dimmtemp targeting the same client is possible case like this.
> peci-cputemp@30
> peci-dimmtemp@30
> 

Oh, I see your point. I should probably merge these separate settings
into one, like:

peci-client@30 {
     compatible = "intel,peci-client";
     reg = <0x30>;
};

Then the cputemp and dimmtemp drivers could both refer to the same
compatible string. I'll rewrite it.

>>> +            compatible = "intel,peci-dimmtemp";
>>> +            reg = <0x30>;
>>> +        };
>>> +
>>> +        peci-dimmtemp@cpu1 {
>>> +            compatible = "intel,peci-dimmtemp";
>>> +            reg = <0x31>;
>>> +        };
>>> +    };
>>> -- 
>>> 2.16.2
>>>

* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-16 23:12     ` Jae Hyun Yoo
@ 2018-04-17 13:16       ` Rob Herring
  2018-04-17 18:16         ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Rob Herring @ 2018-04-17 13:16 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On Mon, Apr 16, 2018 at 6:12 PM, Jae Hyun Yoo
<jae.hyun.yoo@linux.intel.com> wrote:
> On 4/16/2018 11:10 AM, Rob Herring wrote:
>>
>> On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
>>>
>>> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
>>> AST24xx/25xx SoCs.

[...]

>>> +- clocks            : Should contain clock source for PECI controller.
>>> +                     Should reference clkin.
>>> +- clock_frequency   : Should contain the operation frequency of PECI controller
>>> +                     in units of Hz.
>>> +                     187500 ~ 24000000
>>
>>
>> This is the frequency of the bus or used to derive it? It would be
>> better to specify the bus frequency instead and have the driver
>> calculate its internal freq. And then use "bus-frequency" instead.
>>
>
> I agree with you. Actually, it is being used for operation frequency setting
> of PECI controller module in SoC so it's different from the meaning of
> "bus-frequency". I'll change it to "operation-frequency".

No, now you've gone from a standard property name to something custom.
Why do you need to set the frequency in DT if it is not related to the
interface frequency?

Rob

* Re: [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  2018-04-10 18:32 ` [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx Jae Hyun Yoo
  2018-04-11 11:51   ` Joel Stanley
@ 2018-04-17 13:37   ` Robin Murphy
  2018-04-17 18:21     ` Jae Hyun Yoo
  1 sibling, 1 reply; 54+ messages in thread
From: Robin Murphy @ 2018-04-17 13:37 UTC (permalink / raw)
  To: Jae Hyun Yoo, Alan Cox, Andrew Jeffery, Andrew Lunn,
	Andy Shevchenko, Arnd Bergmann, Benjamin Herrenschmidt,
	Fengguang Wu, Greg KH, Guenter Roeck, Haiyue Wang, James Feist,
	Jason M Biils, Jean Delvare, Joel Stanley, Julia Cartwright,
	Miguel Ojeda, Milton Miller II, Pavel Machek, Randy Dunlap,
	Stef van Os, Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-hwmon, devicetree, linux-doc, openbmc, linux-kernel,
	linux-arm-kernel

Just a drive-by nit:

On 10/04/18 19:32, Jae Hyun Yoo wrote:
[...]
> +#define PECI_CTRL_SAMPLING_MASK     GENMASK(19, 16)
> +#define PECI_CTRL_SAMPLING(x)       (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
> +#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)

FWIW, <linux/bitfield.h> already provides functionality like this, so it 
might be worth taking a look at FIELD_{GET,PREP}() to save all these 
local definitions.

Robin.
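
For reference, the suggestion above can be sketched outside the kernel. The
block below is only a userspace approximation: the SKETCH_GENMASK macro and
the sketch_field_prep()/sketch_field_get() helpers are hypothetical stand-ins
written for this note, not the kernel's GENMASK(), FIELD_PREP() and
FIELD_GET(), which are macros and add compile-time checks. Only the mask
name and bit range (19:16) come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's GENMASK(h, l):
 * a 32-bit mask with bits l..h (inclusive) set. */
#define SKETCH_GENMASK(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* The shift of a field is the index of the lowest set bit in its mask. */
static inline unsigned int sketch_mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);	/* an empty mask describes no field */
	while (!(mask & 1u)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Place val into the field described by mask (extra high bits are dropped). */
static inline uint32_t sketch_field_prep(uint32_t mask, uint32_t val)
{
	return (val << sketch_mask_shift(mask)) & mask;
}

/* Extract the field described by mask from a register value. */
static inline uint32_t sketch_field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> sketch_mask_shift(mask);
}

/* Same field as in the patch: bits 19:16 of the control register. */
#define PECI_CTRL_SAMPLING_MASK	SKETCH_GENMASK(19, 16)
```

With helpers like these, only the mask needs a per-field definition; the
separate PECI_CTRL_SAMPLING(x) and PECI_CTRL_SAMPLING_GET(x) macros become
calls that take PECI_CTRL_SAMPLING_MASK as an argument.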

* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-17 13:16       ` Rob Herring
@ 2018-04-17 18:16         ` Jae Hyun Yoo
  2018-04-17 22:06           ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-17 18:16 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On 4/17/2018 6:16 AM, Rob Herring wrote:
> On Mon, Apr 16, 2018 at 6:12 PM, Jae Hyun Yoo
> <jae.hyun.yoo@linux.intel.com> wrote:
>> On 4/16/2018 11:10 AM, Rob Herring wrote:
>>>
>>> On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
>>>>
>>>> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
>>>> AST24xx/25xx SoCs.
> 
> [...]
> 
>>>> +- clocks            : Should contain clock source for PECI controller.
>>>> +                     Should reference clkin.
>>>> +- clock_frequency   : Should contain the operation frequency of PECI controller
>>>> +                     in units of Hz.
>>>> +                     187500 ~ 24000000
>>>
>>>
>>> This is the frequency of the bus or used to derive it? It would be
>>> better to specify the bus frequency instead and have the driver
>>> calculate its internal freq. And then use "bus-frequency" instead.
>>>
>>
>> I agree with you. Actually, it is being used for operation frequency setting
>> of PECI controller module in SoC so it's different from the meaning of
>> "bus-frequency". I'll change it to "operation-frequency".
> 
> No, now you've gone from a standard property name to something custom.
> Why do you need to set the frequency in DT if it is not related to the
> interface frequency?
> 
> Rob
> 

Actually, the interface frequency is affected by the operation frequency,
but the datasheet does not describe the relationship. I'll check the
details with the ASPEED chip vendor and will use 'bus-frequency' if
possible.

Thanks,

Jae

* Re: [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  2018-04-17 13:37   ` Robin Murphy
@ 2018-04-17 18:21     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-17 18:21 UTC (permalink / raw)
  To: Robin Murphy, Alan Cox, Andrew Jeffery, Andrew Lunn,
	Andy Shevchenko, Arnd Bergmann, Benjamin Herrenschmidt,
	Fengguang Wu, Greg KH, Guenter Roeck, Haiyue Wang, James Feist,
	Jason M Biils, Jean Delvare, Joel Stanley, Julia Cartwright,
	Miguel Ojeda, Milton Miller II, Pavel Machek, Randy Dunlap,
	Stef van Os, Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-hwmon, devicetree, linux-doc, openbmc, linux-kernel,
	linux-arm-kernel

Hi Robin,

On 4/17/2018 6:37 AM, Robin Murphy wrote:
> Just a drive-by nit:
> 
> On 10/04/18 19:32, Jae Hyun Yoo wrote:
> [...]
>> +#define PECI_CTRL_SAMPLING_MASK     GENMASK(19, 16)
>> +#define PECI_CTRL_SAMPLING(x)       (((x) << 16) & PECI_CTRL_SAMPLING_MASK)
>> +#define PECI_CTRL_SAMPLING_GET(x)   (((x) & PECI_CTRL_SAMPLING_MASK) >> 16)
> 
> FWIW, <linux/bitfield.h> already provides functionality like this, so it 
> might be worth taking a look at FIELD_{GET,PREP}() to save all these 
> local definitions.
> 
> Robin.

Yes, that looks better. Thanks a lot for pointing it out.

Jae

* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-16 23:51       ` Jae Hyun Yoo
@ 2018-04-17 20:40         ` Jae Hyun Yoo
  2018-04-18 14:32           ` Rob Herring
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-17 20:40 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On 4/16/2018 4:51 PM, Jae Hyun Yoo wrote:
> On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
>> On 4/16/2018 11:14 AM, Rob Herring wrote:
>>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp client
>>>> drivers.
>>>

[...]

>>>> +Example:
>>>> +    peci-bus@0 {
>>>> +        #address-cells = <1>;
>>>> +        #size-cells = <0>;
>>>> +        < more properties >
>>>> +
>>>> +        peci-dimmtemp@cpu0 {
>>>
>>> unit-address is wrong.
>>>
>>
>> Will fix it using the reg value.
>>
>>> It is a different bus from cputemp? Otherwise, you have conflicting
>>> addresses. If that's the case, probably should make it clear by showing
>>> different host adapters for each example.
>>>
>>
>> It could be the same bus with cputemp. Also, client address sharing is 
>> possible by PECI core if the functionality is different. I mean, 
>> cputemp and dimmtemp targeting the same client is possible case like 
>> this.
>> peci-cputemp@30
>> peci-dimmtemp@30
>>
> 
> Oh, I got your point. Probably, I should change these separate settings 
> into one like
> 
> peci-client@30 {
>      compatible = "intel,peci-client";
>      reg = <0x30>;
> };
> 
> Then cputemp and dimmtemp drivers could refer the same compatible 
> string. Will rewrite it.
> 

I've checked it again and realized that it should use function-based
node names like:

peci-cputemp@30
peci-dimmtemp@30

If both use the same string, like 'peci-client@30', the drivers cannot be
selectively enabled. Client address sharing is handled well in the PECI
core, and this approach would be better for future implementations of
other PECI functional drivers, such as a crash dump driver. So I'm going
to change the unit-address only.

Thanks,

Jae
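
As an illustration of the unit-address fix discussed above (peci-dimmtemp@cpu0
becoming peci-dimmtemp@30), the device tree convention is that the
unit-address matches the first address in reg, written in hex without a 0x
prefix. The following is a minimal sketch only: sketch_peci_node_name() and
the SKETCH_ constants are hypothetical names made up for this note, and the
0x30..0x37 range is taken from the v3 binding text quoted earlier, which the
review asked to be reworded.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed CPU client address range, as quoted from the v3 binding text. */
#define SKETCH_PECI_ADDR_MIN	0x30u
#define SKETCH_PECI_ADDR_MAX	0x37u

/*
 * Build "<func>@<unit-address>" where the unit-address is the reg value in
 * lowercase hex without a "0x" prefix.
 * Returns 0 on success, -1 if reg is out of range or buf is too small.
 */
static int sketch_peci_node_name(char *buf, size_t len,
				 const char *func, uint32_t reg)
{
	int n;

	if (reg < SKETCH_PECI_ADDR_MIN || reg > SKETCH_PECI_ADDR_MAX)
		return -1;

	n = snprintf(buf, len, "%s@%x", func, (unsigned int)reg);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* encoding error or truncated output */

	return 0;
}
```

Called with "peci-dimmtemp" and 0x30, the helper fills the buffer with
peci-dimmtemp@30, the corrected form of the node name from the binding
example.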

* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-17 18:16         ` Jae Hyun Yoo
@ 2018-04-17 22:06           ` Jae Hyun Yoo
  2018-04-18 13:59             ` Rob Herring
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-17 22:06 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On 4/17/2018 11:16 AM, Jae Hyun Yoo wrote:
> On 4/17/2018 6:16 AM, Rob Herring wrote:
>> On Mon, Apr 16, 2018 at 6:12 PM, Jae Hyun Yoo
>> <jae.hyun.yoo@linux.intel.com> wrote:
>>> On 4/16/2018 11:10 AM, Rob Herring wrote:
>>>>
>>>> On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
>>>>>
>>>>> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
>>>>> AST24xx/25xx SoCs.
>>
>> [...]
>>
>>>>> +- clocks            : Should contain clock source for PECI controller.
>>>>> +                     Should reference clkin.
>>>>> +- clock_frequency   : Should contain the operation frequency of PECI controller
>>>>> +                     in units of Hz.
>>>>> +                     187500 ~ 24000000
>>>>
>>>>
>>>> This is the frequency of the bus or used to derive it? It would be
>>>> better to specify the bus frequency instead and have the driver
>>>> calculate its internal freq. And then use "bus-frequency" instead.
>>>>
>>>
>>> I agree with you. Actually, it is being used for operation frequency 
>>> setting
>>> of PECI controller module in SoC so it's different from the meaning of
>>> "bus-frequency". I'll change it to "operation-frequency".
>>
>> No, now you've gone from a standard property name to something custom.
>> Why do you need to set the frequency in DT if it is not related to the
>> interface frequency?
>>
>> Rob
>>
> 
> Actually, the interface frequency is affected by the operation frequency
> but there is no description of its relationship in datasheet. I'll check
> again about the detail to ASPEED chip vendor and will use
> 'bus-frequency' if available.
> 

I investigated it more deeply. According to the spec, the PECI bus speed
cannot be set to a fixed value. A PECI bus can run at a wide range of
speeds, from 2 Kbps to 2 Mbps, set dynamically by a handshaking sequence
between the originator and the clients, which the spec calls 'timing
negotiation'. This timing negotiation happens on every single transaction,
so the bus speed can also vary from one transaction to the next. So I'm
thinking of a custom property name for it, 'peci-clk-frequency', if that
is acceptable.

Thanks,

Jae

* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-17 22:06           ` Jae Hyun Yoo
@ 2018-04-18 13:59             ` Rob Herring
  2018-04-18 16:45               ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Rob Herring @ 2018-04-18 13:59 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On Tue, Apr 17, 2018 at 5:06 PM, Jae Hyun Yoo
<jae.hyun.yoo@linux.intel.com> wrote:
> On 4/17/2018 11:16 AM, Jae Hyun Yoo wrote:
>>
>> On 4/17/2018 6:16 AM, Rob Herring wrote:
>>>
>>> On Mon, Apr 16, 2018 at 6:12 PM, Jae Hyun Yoo
>>> <jae.hyun.yoo@linux.intel.com> wrote:
>>>>
>>>> On 4/16/2018 11:10 AM, Rob Herring wrote:
>>>>>
>>>>>
>>>>> On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
>>>>>>
>>>>>>
>>>>>> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
>>>>>> AST24xx/25xx SoCs.
>>>
>>>
>>> [...]
>>>
>>>>>> +- clocks            : Should contain clock source for PECI controller.
>>>>>> +                     Should reference clkin.
>>>>>> +- clock_frequency   : Should contain the operation frequency of PECI controller
>>>>>> +                     in units of Hz.
>>>>>> +                     187500 ~ 24000000
>>>>>
>>>>>
>>>>>
>>>>> This is the frequency of the bus or used to derive it? It would be
>>>>> better to specify the bus frequency instead and have the driver
>>>>> calculate its internal freq. And then use "bus-frequency" instead.
>>>>>
>>>>
>>>> I agree with you. Actually, it is being used for operation frequency
>>>> setting
>>>> of PECI controller module in SoC so it's different from the meaning of
>>>> "bus-frequency". I'll change it to "operation-frequency".
>>>
>>>
>>> No, now you've gone from a standard property name to something custom.
>>> Why do you need to set the frequency in DT if it is not related to the
>>> interface frequency?
>>>
>>> Rob
>>>
>>
>> Actually, the interface frequency is affected by the operation frequency
>> but there is no description of its relationship in datasheet. I'll check
>> again about the detail to ASPEED chip vendor and will use
>> 'bus-frequency' if available.
>>
>
> I investigated it more deeply. Basically, by the spec, PECI bus speed
> cannot be set as a fixed speed. A PECI bus can have a wide speed range
> from 2Kbps to 2Mbps which is dynamically set by a handshaking sequence
> between an originator and clients called 'timing negotiation' in spec.
> This timing negotiation behavior happens on every single transaction so the
> bus speed also can vary on every transactions. So I'm thinking a custom
> property name for it, 'peci-clk-frequency' if it is acceptable.

Okay, seems bus-frequency is not appropriate here. So use
'clock-frequency' (note the '-' not '_' as that is the standard
property).

Rob
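
To make the 'clock-frequency' discussion concrete, here is a rough sketch of
how a driver might turn the property value into a divider setting. It is not
taken from the posted driver. The only figures from the thread are the
documented range, 187500 to 24000000 Hz. The power-of-two divider with
exponent 0..7 is purely an assumption for illustration, chosen because
24000000 / 128 = 187500 reproduces the documented endpoints when the input
clock is 24 MHz. All SKETCH_/sketch_ names are hypothetical.

```c
#include <stdint.h>

/* Range quoted in the binding document under review (Hz). */
#define SKETCH_PECI_CLK_HZ_MIN	187500u
#define SKETCH_PECI_CLK_HZ_MAX	24000000u

/* Assumption: the divider is 2^exp with exp in 0..7. */
#define SKETCH_PECI_DIV_EXP_MAX	7u

/*
 * Pick the smallest exponent such that clkin_hz / 2^exp does not exceed
 * the requested clock-frequency value.
 * Returns the exponent, or -1 if the request is outside the documented
 * range or cannot be reached with the assumed divider.
 */
static int sketch_peci_clk_div_exp(uint32_t clkin_hz, uint32_t target_hz)
{
	unsigned int exp;

	if (target_hz < SKETCH_PECI_CLK_HZ_MIN ||
	    target_hz > SKETCH_PECI_CLK_HZ_MAX)
		return -1;

	for (exp = 0; exp <= SKETCH_PECI_DIV_EXP_MAX; exp++) {
		if ((clkin_hz >> exp) <= target_hz)
			return (int)exp;
	}

	return -1;
}
```

Keeping the range check and the divider search in the driver means the
device tree only carries the requested frequency.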

* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-17 20:40         ` Jae Hyun Yoo
@ 2018-04-18 14:32           ` Rob Herring
  2018-04-18 20:28             ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Rob Herring @ 2018-04-18 14:32 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On Tue, Apr 17, 2018 at 3:40 PM, Jae Hyun Yoo
<jae.hyun.yoo@linux.intel.com> wrote:
> On 4/16/2018 4:51 PM, Jae Hyun Yoo wrote:
>>
>> On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
>>>
>>> On 4/16/2018 11:14 AM, Rob Herring wrote:
>>>>
>>>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>>>>
>>>>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp client
>>>>> drivers.
>>>>
>>>>
>
> [...]
>
>>>>> +Example:
>>>>> +    peci-bus@0 {
>>>>> +        #address-cells = <1>;
>>>>> +        #size-cells = <0>;
>>>>> +        < more properties >
>>>>> +
>>>>> +        peci-dimmtemp@cpu0 {
>>>>
>>>>
>>>> unit-address is wrong.
>>>>
>>>
>>> Will fix it using the reg value.
>>>
>>>> It is a different bus from cputemp? Otherwise, you have conflicting
>>>> addresses. If that's the case, probably should make it clear by showing
>>>> different host adapters for each example.
>>>>
>>>
>>> It could be the same bus with cputemp. Also, client address sharing is
>>> possible by PECI core if the functionality is different. I mean, cputemp and
>>> dimmtemp targeting the same client is possible case like this.
>>> peci-cputemp@30
>>> peci-dimmtemp@30
>>>
>>
>> Oh, I got your point. Probably, I should change these separate settings
>> into one like
>>
>> peci-client@30 {
>>      compatible = "intel,peci-client";
>>      reg = <0x30>;
>> };
>>
>> Then cputemp and dimmtemp drivers could refer the same compatible string.
>> Will rewrite it.
>>
>
> I've checked it again and realized that it should use function based node
> name like:
>
> peci-cputemp@30
> peci-dimmtemp@30
>
> If it use the same string like 'peci-client@30', the drivers cannot be
> selectively enabled. The client address sharing way is well handled in PECI
> core and this way would be better for the future implementations of other
> PECI functional drivers such as crash dump driver and so on. So I'm going
> change the unit-address only.

2 nodes at the same address is wrong (and soon dtc will warn you on
this). You have 2 potential options. The first is you need additional
address information in the DT if these are in fact 2 independent
devices. This could be something like a function number to use
something from PCI addressing. From what I found on PECI, it doesn't
seem to have anything like that. The 2nd option is you have a single
DT node which registers multiple hwmon devices. DT nodes and drivers
don't have to be 1-1. Don't design your DT nodes from how you want to
partition drivers in some OS.

Rob
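
The second option above (one DT node, several functions registered from it)
can be modelled in a few lines of plain C. This is a toy userspace model
under stated assumptions, not the PECI core or the hwmon API: every struct
and function name below is hypothetical, and the probe callbacks only record
a flag where a real driver would register a hwmon device.

```c
#include <stddef.h>
#include <stdint.h>

/* One client per DT node; addr would come from the node's reg property. */
struct sketch_peci_client {
	uint32_t addr;
	unsigned int registered;	/* bitmask of functions that registered */
};

/* One entry per function that can hang off a single client node. */
struct sketch_peci_func {
	const char *name;
	int (*probe)(struct sketch_peci_client *client);
};

#define SKETCH_FUNC_CPUTEMP	(1u << 0)
#define SKETCH_FUNC_DIMMTEMP	(1u << 1)

static int sketch_cputemp_probe(struct sketch_peci_client *client)
{
	client->registered |= SKETCH_FUNC_CPUTEMP;
	return 0;
}

static int sketch_dimmtemp_probe(struct sketch_peci_client *client)
{
	client->registered |= SKETCH_FUNC_DIMMTEMP;
	return 0;
}

static const struct sketch_peci_func sketch_funcs[] = {
	{ "cputemp",  sketch_cputemp_probe },
	{ "dimmtemp", sketch_dimmtemp_probe },
};

/*
 * Probe every function for one client node.
 * Returns the number of functions that registered successfully.
 */
static size_t sketch_peci_client_probe(struct sketch_peci_client *client)
{
	size_t i, ok = 0;

	for (i = 0; i < sizeof(sketch_funcs) / sizeof(sketch_funcs[0]); i++) {
		if (sketch_funcs[i].probe(client) == 0)
			ok++;
	}
	return ok;
}
```

In this model the device tree describes the client at 0x30 once, and which
functions attach to it is decided by the table in the driver rather than by
extra nodes sharing one address.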

* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
  2018-04-18 13:59             ` Rob Herring
@ 2018-04-18 16:45               ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-18 16:45 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On 4/18/2018 6:59 AM, Rob Herring wrote:
> On Tue, Apr 17, 2018 at 5:06 PM, Jae Hyun Yoo
> <jae.hyun.yoo@linux.intel.com> wrote:
>> On 4/17/2018 11:16 AM, Jae Hyun Yoo wrote:
>>>
>>> On 4/17/2018 6:16 AM, Rob Herring wrote:
>>>>
>>>> On Mon, Apr 16, 2018 at 6:12 PM, Jae Hyun Yoo
>>>> <jae.hyun.yoo@linux.intel.com> wrote:
>>>>>
>>>>> On 4/16/2018 11:10 AM, Rob Herring wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
>>>>>>>
>>>>>>>
>>>>>>> This commit adds a dt-bindings document of PECI adapter driver for Aspeed
>>>>>>> AST24xx/25xx SoCs.
>>>>
>>>>
>>>> [...]
>>>>
>>>>>>> +- clocks            : Should contain clock source for PECI controller.
>>>>>>> +                     Should reference clkin.
>>>>>>> +- clock_frequency   : Should contain the operation frequency of PECI controller
>>>>>>> +                     in units of Hz.
>>>>>>> +                     187500 ~ 24000000
>>>>>>
>>>>>>
>>>>>>
>>>>>> This is the frequency of the bus or used to derive it? It would be
>>>>>> better to specify the bus frequency instead and have the driver
>>>>>> calculate its internal freq. And then use "bus-frequency" instead.
>>>>>>
>>>>>
>>>>> I agree with you. Actually, it is being used for operation frequency
>>>>> setting
>>>>> of PECI controller module in SoC so it's different from the meaning of
>>>>> "bus-frequency". I'll change it to "operation-frequency".
>>>>
>>>>
>>>> No, now you've gone from a standard property name to something custom.
>>>> Why do you need to set the frequency in DT if it is not related to the
>>>> interface frequency?
>>>>
>>>> Rob
>>>>
>>>
>>> Actually, the interface frequency is affected by the operation frequency
>>> but there is no description of its relationship in datasheet. I'll check
>>> again about the detail to ASPEED chip vendor and will use
>>> 'bus-frequency' if available.
>>>
>>
>> I investigated it more deeply. Basically, by the spec, PECI bus speed
>> cannot be set as a fixed speed. A PECI bus can have a wide speed range
>> from 2Kbps to 2Mbps which is dynamically set by a handshaking sequence
>> between an originator and clients called 'timing negotiation' in spec.
>> This timing negotiation behavior happens on every single transaction so the
>> bus speed also can vary on every transactions. So I'm thinking a custom
>> property name for it, 'peci-clk-frequency' if it is acceptable.
> 
> Okay, seems bus-frequency is not appropriate here. So use
> 'clock-frequency' (note the '-' not '_' as that is the standard
> property).
> 
> Rob
> 

Thanks! I'll use 'clock-frequency' for it.

Jae

* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-18 14:32           ` Rob Herring
@ 2018-04-18 20:28             ` Jae Hyun Yoo
  2018-04-18 21:28               ` Rob Herring
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-18 20:28 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On 4/18/2018 7:32 AM, Rob Herring wrote:
> On Tue, Apr 17, 2018 at 3:40 PM, Jae Hyun Yoo
> <jae.hyun.yoo@linux.intel.com> wrote:
>> On 4/16/2018 4:51 PM, Jae Hyun Yoo wrote:
>>>
>>> On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
>>>>
>>>> On 4/16/2018 11:14 AM, Rob Herring wrote:
>>>>>
>>>>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>>>>>
>>>>>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp client
>>>>>> drivers.
>>>>>
>>>>>
>>
>> [...]
>>
>>>>>> +Example:
>>>>>> +    peci-bus@0 {
>>>>>> +        #address-cells = <1>;
>>>>>> +        #size-cells = <0>;
>>>>>> +        < more properties >
>>>>>> +
>>>>>> +        peci-dimmtemp@cpu0 {
>>>>>
>>>>>
>>>>> unit-address is wrong.
>>>>>
>>>>
>>>> Will fix it using the reg value.
>>>>
>>>>> It is a different bus from cputemp? Otherwise, you have conflicting
>>>>> addresses. If that's the case, probably should make it clear by showing
>>>>> different host adapters for each example.
>>>>>
>>>>
>>>> It could be the same bus with cputemp. Also, client address sharing is
>>>> possible by PECI core if the functionality is different. I mean, cputemp and
>>>> dimmtemp targeting the same client is possible case like this.
>>>> peci-cputemp@30
>>>> peci-dimmtemp@30
>>>>
>>>
>>> Oh, I got your point. Probably, I should change these separate settings
>>> into one like
>>>
>>> peci-client@30 {
>>>       compatible = "intel,peci-client";
>>>       reg = <0x30>;
>>> };
>>>
>>> Then cputemp and dimmtemp drivers could refer the same compatible string.
>>> Will rewrite it.
>>>
>>
>> I've checked it again and realized that it should use function based node
>> name like:
>>
>> peci-cputemp@30
>> peci-dimmtemp@30
>>
>> If it use the same string like 'peci-client@30', the drivers cannot be
>> selectively enabled. The client address sharing way is well handled in PECI
>> core and this way would be better for the future implementations of other
>> PECI functional drivers such as crash dump driver and so on. So I'm going
>> change the unit-address only.
> 
> 2 nodes at the same address is wrong (and soon dtc will warn you on
> this). You have 2 potential options. The first is you need additional
> address information in the DT if these are in fact 2 independent
> devices. This could be something like a function number to use
> something from PCI addressing. From what I found on PECI, it doesn't
> seem to have anything like that. The 2nd option is you have a single
> DT node which registers multiple hwmon devices. DT nodes and drivers
> don't have to be 1-1. Don't design your DT nodes from how you want to
> partition drivers in some OS.
> 
> Rob
> 

Please correct me if I'm wrong, but I still think that it is
possible. Also, I compiled it and dtc did not emit a warning. Let me
show another use case that is similar to this one:

In arch/arm/boot/dts/aspeed-g5.dtsi
[...]
lpc_host: lpc-host@80 {
         compatible = "aspeed,ast2500-lpc-host", "simple-mfd", "syscon";
         reg = <0x80 0x1e0>;
         reg-io-width = <4>;

         #address-cells = <1>;
         #size-cells = <1>;
         ranges = <0x0 0x80 0x1e0>;

         lpc_ctrl: lpc-ctrl@0 {
                 compatible = "aspeed,ast2500-lpc-ctrl";
                 reg = <0x0 0x80>;
                 clocks = <&syscon ASPEED_CLK_GATE_LCLK>;
                 status = "disabled";
         };

         lpc_snoop: lpc-snoop@0 {
                 compatible = "aspeed,ast2500-lpc-snoop";
                 reg = <0x0 0x80>;
                 interrupts = <8>;
                 status = "disabled";
         };
};
[...]

This is the device tree setting for the LPC interface and its child
nodes. The LPC interface can be used as a multi-functional interface
for snoop 80, KCS, SIO and so on. In this use case, lpc-ctrl@0 and
lpc-snoop@0 share their address range across their individual driver
modules, and both can be registered without problems through either a
static dt or a dynamic dtoverlay. PECI is also a multi-functional
interface, similar to the above case, I think.

Thanks,

Jae


* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-18 20:28             ` Jae Hyun Yoo
@ 2018-04-18 21:28               ` Rob Herring
  2018-04-18 21:57                 ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Rob Herring @ 2018-04-18 21:28 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On Wed, Apr 18, 2018 at 3:28 PM, Jae Hyun Yoo
<jae.hyun.yoo@linux.intel.com> wrote:
> On 4/18/2018 7:32 AM, Rob Herring wrote:
>>
>> On Tue, Apr 17, 2018 at 3:40 PM, Jae Hyun Yoo
>> <jae.hyun.yoo@linux.intel.com> wrote:
>>>
>>> On 4/16/2018 4:51 PM, Jae Hyun Yoo wrote:
>>>>
>>>>
>>>> On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
>>>>>
>>>>>
>>>>> On 4/16/2018 11:14 AM, Rob Herring wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>>>>>>
>>>>>>>
>>>>>>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp
>>>>>>> client
>>>>>>> drivers.
>>>>>>
>>>>>>
>>>>>>
>>>
>>> [...]
>>>
>>>>>>> +Example:
>>>>>>> +    peci-bus@0 {
>>>>>>> +        #address-cells = <1>;
>>>>>>> +        #size-cells = <0>;
>>>>>>> +        < more properties >
>>>>>>> +
>>>>>>> +        peci-dimmtemp@cpu0 {
>>>>>>
>>>>>>
>>>>>>
>>>>>> unit-address is wrong.
>>>>>>
>>>>>
>>>>> Will fix it using the reg value.
>>>>>
>>>>>> It is a different bus from cputemp? Otherwise, you have conflicting
>>>>>> addresses. If that's the case, probably should make it clear by
>>>>>> showing
>>>>>> different host adapters for each example.
>>>>>>
>>>>>
>>>>> It could be the same bus with cputemp. Also, client address sharing is
>>>>> possible by PECI core if the functionality is different. I mean,
>>>>> cputemp and
>>>>> dimmtemp targeting the same client is possible case like this.
>>>>> peci-cputemp@30
>>>>> peci-dimmtemp@30
>>>>>
>>>>
>>>> Oh, I got your point. Probably, I should change these separate settings
>>>> into one like
>>>>
>>>> peci-client@30 {
>>>>       compatible = "intel,peci-client";
>>>>       reg = <0x30>;
>>>> };
>>>>
>>>> Then cputemp and dimmtemp drivers could refer the same compatible
>>>> string.
>>>> Will rewrite it.
>>>>
>>>
>>> I've checked it again and realized that it should use function based node
>>> name like:
>>>
>>> peci-cputemp@30
>>> peci-dimmtemp@30
>>>
>>> If it use the same string like 'peci-client@30', the drivers cannot be
>>> selectively enabled. The client address sharing way is well handled in
>>> PECI
>>> core and this way would be better for the future implementations of other
>>> PECI functional drivers such as crash dump driver and so on. So I'm going
>>> change the unit-address only.
>>
>>
>> 2 nodes at the same address is wrong (and soon dtc will warn you on
>> this). You have 2 potential options. The first is you need additional
>> address information in the DT if these are in fact 2 independent
>> devices. This could be something like a function number to use
>> something from PCI addressing. From what I found on PECI, it doesn't
>> seem to have anything like that. The 2nd option is you have a single
>> DT node which registers multiple hwmon devices. DT nodes and drivers
>> don't have to be 1-1. Don't design your DT nodes from how you want to
>> partition drivers in some OS.
>>
>> Rob
>>
>
> Please correct me if I'm wrong but I'm still thinking that it is
> possible. Also, I did compile it but dtc doesn't make a warning. Let me
> show an another use case which is similar to this case:

I did say *soon*. It's in dtc repo, but not the kernel copy yet.

> In arch/arm/boot/dts/aspeed-g5.dtsi
> [...]
> lpc_host: lpc-host@80 {
>         compatible = "aspeed,ast2500-lpc-host", "simple-mfd", "syscon";
>         reg = <0x80 0x1e0>;
>         reg-io-width = <4>;
>
>         #address-cells = <1>;
>         #size-cells = <1>;
>         ranges = <0x0 0x80 0x1e0>;
>
>         lpc_ctrl: lpc-ctrl@0 {
>                 compatible = "aspeed,ast2500-lpc-ctrl";
>                 reg = <0x0 0x80>;
>                 clocks = <&syscon ASPEED_CLK_GATE_LCLK>;
>                 status = "disabled";
>         };
>
>         lpc_snoop: lpc-snoop@0 {
>                 compatible = "aspeed,ast2500-lpc-snoop";
>                 reg = <0x0 0x80>;
>                 interrupts = <8>;
>                 status = "disabled";
>         };
> }
> [...]
>
> This is device tree setting for LPC interface and its child nodes.
> LPC interface can be used as a multi-functional interface such as
> snoop 80, KCS, SIO and so on. In this use case, lpc-ctrl@0 and
> lpc-snoop@0 are sharing their address range from their individual
> driver modules and they can be registered quite well through both
> static dt or dynamic dtoverlay. PECI is also a multi-functional
> interface which is similar to the above case, I think.

This case too is poor design and should be fixed as well. Simply put,
you can't have 2 devices on a bus at the same address without some sort
of mux or arbitration device in the middle. If you have a device/block
with multiple functions provided to the OS, then it is the OS's
problem to arbitrate access. It is not a DT problem because OSes can
vary in how they handle that, both from OS to OS and over time.

Rob
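
The second option described above (one node per address, with several
functions registered behind it) can be sketched in plain userspace C.
This is only an illustration under stated assumptions: every demo_*
name is hypothetical and none of it is PECI core or hwmon API.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CLIENTS 8
#define DEMO_MAX_FUNCS   4

/* Hypothetical model: one function (e.g. cputemp) behind a client. */
struct demo_func {
	const char *name;
};

/* Hypothetical model: one client node, identified by its bus address. */
struct demo_client {
	uint8_t addr;
	size_t nfuncs;
	struct demo_func funcs[DEMO_MAX_FUNCS];
};

struct demo_bus {
	size_t nclients;
	struct demo_client clients[DEMO_MAX_CLIENTS];
};

/*
 * Add a client node. Returns NULL if the address is already taken or the
 * bus is full, so two nodes can never share one address.
 */
static struct demo_client *demo_add_client(struct demo_bus *bus, uint8_t addr)
{
	size_t i;

	for (i = 0; i < bus->nclients; i++)
		if (bus->clients[i].addr == addr)
			return NULL;

	if (bus->nclients == DEMO_MAX_CLIENTS)
		return NULL;

	bus->clients[bus->nclients].addr = addr;
	bus->clients[bus->nclients].nfuncs = 0;
	return &bus->clients[bus->nclients++];
}

/* Register one more function behind an existing node; -1 when full. */
static int demo_add_func(struct demo_client *client, const char *name)
{
	if (client->nfuncs == DEMO_MAX_FUNCS)
		return -1;

	client->funcs[client->nfuncs++].name = name;
	return 0;
}
```

With this shape, a single client at 0x30 can carry both a "cputemp" and
a "dimmtemp" function, while a second node at 0x30 is rejected.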


* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-18 21:28               ` Rob Herring
@ 2018-04-18 21:57                 ` Jae Hyun Yoo
  2018-04-19 19:48                   ` Jae Hyun Yoo
  0 siblings, 1 reply; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-18 21:57 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On 4/18/2018 2:28 PM, Rob Herring wrote:
> On Wed, Apr 18, 2018 at 3:28 PM, Jae Hyun Yoo
> <jae.hyun.yoo@linux.intel.com> wrote:
>> On 4/18/2018 7:32 AM, Rob Herring wrote:
>>>
>>> On Tue, Apr 17, 2018 at 3:40 PM, Jae Hyun Yoo
>>> <jae.hyun.yoo@linux.intel.com> wrote:
>>>>
>>>> On 4/16/2018 4:51 PM, Jae Hyun Yoo wrote:
>>>>>
>>>>>
>>>>> On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
>>>>>>
>>>>>>
>>>>>> On 4/16/2018 11:14 AM, Rob Herring wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp
>>>>>>>> client
>>>>>>>> drivers.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>> [...]
>>>>
>>>>>>>> +Example:
>>>>>>>> +    peci-bus@0 {
>>>>>>>> +        #address-cells = <1>;
>>>>>>>> +        #size-cells = <0>;
>>>>>>>> +        < more properties >
>>>>>>>> +
>>>>>>>> +        peci-dimmtemp@cpu0 {
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> unit-address is wrong.
>>>>>>>
>>>>>>
>>>>>> Will fix it using the reg value.
>>>>>>
>>>>>>> It is a different bus from cputemp? Otherwise, you have conflicting
>>>>>>> addresses. If that's the case, probably should make it clear by
>>>>>>> showing
>>>>>>> different host adapters for each example.
>>>>>>>
>>>>>>
>>>>>> It could be the same bus with cputemp. Also, client address sharing is
>>>>>> possible by PECI core if the functionality is different. I mean,
>>>>>> cputemp and
>>>>>> dimmtemp targeting the same client is possible case like this.
>>>>>> peci-cputemp@30
>>>>>> peci-dimmtemp@30
>>>>>>
>>>>>
>>>>> Oh, I got your point. Probably, I should change these separate settings
>>>>> into one like
>>>>>
>>>>> peci-client@30 {
>>>>>        compatible = "intel,peci-client";
>>>>>        reg = <0x30>;
>>>>> };
>>>>>
>>>>> Then cputemp and dimmtemp drivers could refer the same compatible
>>>>> string.
>>>>> Will rewrite it.
>>>>>
>>>>
>>>> I've checked it again and realized that it should use function based node
>>>> name like:
>>>>
>>>> peci-cputemp@30
>>>> peci-dimmtemp@30
>>>>
>>>> If it use the same string like 'peci-client@30', the drivers cannot be
>>>> selectively enabled. The client address sharing way is well handled in
>>>> PECI
>>>> core and this way would be better for the future implementations of other
>>>> PECI functional drivers such as crash dump driver and so on. So I'm going
>>>> change the unit-address only.
>>>
>>>
>>> 2 nodes at the same address is wrong (and soon dtc will warn you on
>>> this). You have 2 potential options. The first is you need additional
>>> address information in the DT if these are in fact 2 independent
>>> devices. This could be something like a function number to use
>>> something from PCI addressing. From what I found on PECI, it doesn't
>>> seem to have anything like that. The 2nd option is you have a single
>>> DT node which registers multiple hwmon devices. DT nodes and drivers
>>> don't have to be 1-1. Don't design your DT nodes from how you want to
>>> partition drivers in some OS.
>>>
>>> Rob
>>>
>>
>> Please correct me if I'm wrong but I'm still thinking that it is
>> possible. Also, I did compile it but dtc doesn't make a warning. Let me
>> show an another use case which is similar to this case:
> 
> I did say *soon*. It's in dtc repo, but not the kernel copy yet.
> 
>> In arch/arm/boot/dts/aspeed-g5.dtsi
>> [...]
>> lpc_host: lpc-host@80 {
>>          compatible = "aspeed,ast2500-lpc-host", "simple-mfd", "syscon";
>>          reg = <0x80 0x1e0>;
>>          reg-io-width = <4>;
>>
>>          #address-cells = <1>;
>>          #size-cells = <1>;
>>          ranges = <0x0 0x80 0x1e0>;
>>
>>          lpc_ctrl: lpc-ctrl@0 {
>>                  compatible = "aspeed,ast2500-lpc-ctrl";
>>                  reg = <0x0 0x80>;
>>                  clocks = <&syscon ASPEED_CLK_GATE_LCLK>;
>>                  status = "disabled";
>>          };
>>
>>          lpc_snoop: lpc-snoop@0 {
>>                  compatible = "aspeed,ast2500-lpc-snoop";
>>                  reg = <0x0 0x80>;
>>                  interrupts = <8>;
>>                  status = "disabled";
>>          };
>> }
>> [...]
>>
>> This is device tree setting for LPC interface and its child nodes.
>> LPC interface can be used as a multi-functional interface such as
>> snoop 80, KCS, SIO and so on. In this use case, lpc-ctrl@0 and
>> lpc-snoop@0 are sharing their address range from their individual
>> driver modules and they can be registered quite well through both
>> static dt or dynamic dtoverlay. PECI is also a multi-functional
>> interface which is similar to the above case, I think.
> 
> This case too is poor design and should be fixed as well. Simply put,
> you can have 2 devices on a bus at the same address without some sort
> of mux or arbitration device in the middle. If you have a device/block
> with multiple functions provided to the OS, then it is the OS's
> problem to arbitrate access. It is not a DT problem because OS's can
> vary in how they handle that both from OS to OS and over time.
> 
> Rob
> 

If I change it to a single DT node which registers 2 hwmon devices using
the 2nd option above, then I still have 2 devices on a bus at the same
address. Does that also cause a problem for the OS?

Jae


* Re: [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core
  2018-04-10 18:32 ` [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core Jae Hyun Yoo
@ 2018-04-19 18:59   ` kbuild test robot
  2018-04-23 10:52   ` Greg KH
  2018-04-24 16:01   ` Andy Shevchenko
  2 siblings, 0 replies; 54+ messages in thread
From: kbuild test robot @ 2018-04-19 18:59 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: kbuild-all, Alan Cox, Andrew Jeffery, Andrew Lunn,
	Andy Shevchenko, Arnd Bergmann, Benjamin Herrenschmidt,
	Fengguang Wu, Greg KH, Guenter Roeck, Haiyue Wang, James Feist,
	Jason M Biils, Jean Delvare, Joel Stanley, Julia Cartwright,
	Miguel Ojeda, Milton Miller II, Pavel Machek, Randy Dunlap,
	Stef van Os, Sumeet R Pawnikar, Vernon Mauery, linux-kernel,
	linux-doc, devicetree, linux-hwmon, linux-arm-kernel, openbmc,
	Jae Hyun Yoo

[-- Attachment #1: Type: text/plain, Size: 2138 bytes --]

Hi Jae,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.17-rc1 next-20180419]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jae-Hyun-Yoo/PECI-device-driver-introduction/20180411-180018
config: x86_64-randconfig-s0-04192349 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers//peci/peci-core.c: In function 'peci_device_match':
>> drivers//peci/peci-core.c:739:6: error: implicit declaration of function 'peci_of_match_device' [-Werror=implicit-function-declaration]
     if (peci_of_match_device(drv->of_match_table, client))
         ^~~~~~~~~~~~~~~~~~~~
   At top level:
   drivers//peci/peci-core.c:840:28: warning: 'peci_new_device' defined but not used [-Wunused-function]
    static struct peci_client *peci_new_device(struct peci_adapter *adapter,
                               ^~~~~~~~~~~~~~~
   drivers//peci/peci-core.c:135:29: warning: 'peci_verify_adapter' defined but not used [-Wunused-function]
    static struct peci_adapter *peci_verify_adapter(struct device *dev)
                                ^~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/peci_of_match_device +739 drivers//peci/peci-core.c

   732	
   733	static int peci_device_match(struct device *dev, struct device_driver *drv)
   734	{
   735		struct peci_client *client = peci_verify_client(dev);
   736		struct peci_driver *driver;
   737	
   738		/* Attempt an OF style match */
 > 739		if (peci_of_match_device(drv->of_match_table, client))
   740			return 1;
   741	
   742		driver = to_peci_driver(drv);
   743	
   744		if (peci_match_id(driver->id_table, client))
   745			return 1;
   746	
   747		return 0;
   748	}
   749	
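
One plausible cause of the implicit-declaration error above, not
confirmed anywhere in this thread, is that the OF match helper is only
built when CONFIG_OF is enabled and has no fallback stub. A minimal
userspace sketch of the usual stub pattern follows; DEMO_HAVE_OF and all
demo_* names are hypothetical stand-ins, not the PECI core API.

```c
#include <stddef.h>
#include <string.h>

struct demo_id {
	const char *name;
};

struct demo_client {
	const char *name;
};

/* Walk a table that is terminated by an entry whose name is NULL. */
static const struct demo_id *demo_match_id(const struct demo_id *table,
					   const struct demo_client *client)
{
	if (!table || !client)
		return NULL;

	for (; table->name; table++)
		if (strcmp(table->name, client->name) == 0)
			return table;

	return NULL;
}

#ifdef DEMO_HAVE_OF
/* Stand-in for the real OF match; here it just reuses the table walk. */
static const struct demo_id *
demo_of_match_device(const struct demo_id *table,
		     const struct demo_client *client)
{
	return demo_match_id(table, client);
}
#else
/* Stub so that callers still compile when the OF code is configured out. */
static inline const struct demo_id *
demo_of_match_device(const struct demo_id *table,
		     const struct demo_client *client)
{
	(void)table;
	(void)client;
	return NULL;
}
#endif

/* Same flow as the listing above: try the OF match, then the id table. */
static int demo_device_match(const struct demo_id *of_table,
			     const struct demo_id *id_table,
			     const struct demo_client *client)
{
	if (demo_of_match_device(of_table, client))
		return 1;

	if (demo_match_id(id_table, client))
		return 1;

	return 0;
}
```

Because the stub always exists, the caller needs no #ifdef of its own
and simply falls through to the id-table match when OF is unavailable.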

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26598 bytes --]


* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
  2018-04-18 21:57                 ` Jae Hyun Yoo
@ 2018-04-19 19:48                   ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-19 19:48 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist

On 4/18/2018 2:57 PM, Jae Hyun Yoo wrote:
> On 4/18/2018 2:28 PM, Rob Herring wrote:
>> On Wed, Apr 18, 2018 at 3:28 PM, Jae Hyun Yoo
>> <jae.hyun.yoo@linux.intel.com> wrote:
>>> On 4/18/2018 7:32 AM, Rob Herring wrote:
>>>>
>>>> On Tue, Apr 17, 2018 at 3:40 PM, Jae Hyun Yoo
>>>> <jae.hyun.yoo@linux.intel.com> wrote:
>>>>>
>>>>> On 4/16/2018 4:51 PM, Jae Hyun Yoo wrote:
>>>>>>
>>>>>>
>>>>>> On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 4/16/2018 11:14 AM, Rob Herring wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This commit adds dt-bindings documents for PECI cputemp and 
>>>>>>>>> dimmtemp
>>>>>>>>> client
>>>>>>>>> drivers.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>
>>>>> [...]
>>>>>
>>>>>>>>> +Example:
>>>>>>>>> +    peci-bus@0 {
>>>>>>>>> +        #address-cells = <1>;
>>>>>>>>> +        #size-cells = <0>;
>>>>>>>>> +        < more properties >
>>>>>>>>> +
>>>>>>>>> +        peci-dimmtemp@cpu0 {
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> unit-address is wrong.
>>>>>>>>
>>>>>>>
>>>>>>> Will fix it using the reg value.
>>>>>>>
>>>>>>>> It is a different bus from cputemp? Otherwise, you have conflicting
>>>>>>>> addresses. If that's the case, probably should make it clear by
>>>>>>>> showing
>>>>>>>> different host adapters for each example.
>>>>>>>>
>>>>>>>
>>>>>>> It could be the same bus with cputemp. Also, client address 
>>>>>>> sharing is
>>>>>>> possible by PECI core if the functionality is different. I mean,
>>>>>>> cputemp and
>>>>>>> dimmtemp targeting the same client is possible case like this.
>>>>>>> peci-cputemp@30
>>>>>>> peci-dimmtemp@30
>>>>>>>
>>>>>>
>>>>>> Oh, I got your point. Probably, I should change these separate 
>>>>>> settings
>>>>>> into one like
>>>>>>
>>>>>> peci-client@30 {
>>>>>>        compatible = "intel,peci-client";
>>>>>>        reg = <0x30>;
>>>>>> };
>>>>>>
>>>>>> Then cputemp and dimmtemp drivers could refer the same compatible
>>>>>> string.
>>>>>> Will rewrite it.
>>>>>>
>>>>>
>>>>> I've checked it again and realized that it should use function 
>>>>> based node
>>>>> name like:
>>>>>
>>>>> peci-cputemp@30
>>>>> peci-dimmtemp@30
>>>>>
>>>>> If it use the same string like 'peci-client@30', the drivers cannot be
>>>>> selectively enabled. The client address sharing way is well handled in
>>>>> PECI
>>>>> core and this way would be better for the future implementations of 
>>>>> other
>>>>> PECI functional drivers such as crash dump driver and so on. So I'm 
>>>>> going
>>>>> change the unit-address only.
>>>>
>>>>
>>>> 2 nodes at the same address is wrong (and soon dtc will warn you on
>>>> this). You have 2 potential options. The first is you need additional
>>>> address information in the DT if these are in fact 2 independent
>>>> devices. This could be something like a function number to use
>>>> something from PCI addressing. From what I found on PECI, it doesn't
>>>> seem to have anything like that. The 2nd option is you have a single
>>>> DT node which registers multiple hwmon devices. DT nodes and drivers
>>>> don't have to be 1-1. Don't design your DT nodes from how you want to
>>>> partition drivers in some OS.
>>>>
>>>> Rob
>>>>
>>>
>>> Please correct me if I'm wrong but I'm still thinking that it is
>>> possible. Also, I did compile it but dtc doesn't make a warning. Let me
>>> show an another use case which is similar to this case:
>>
>> I did say *soon*. It's in dtc repo, but not the kernel copy yet.
>>
>>> In arch/arm/boot/dts/aspeed-g5.dtsi
>>> [...]
>>> lpc_host: lpc-host@80 {
>>>          compatible = "aspeed,ast2500-lpc-host", "simple-mfd", "syscon";
>>>          reg = <0x80 0x1e0>;
>>>          reg-io-width = <4>;
>>>
>>>          #address-cells = <1>;
>>>          #size-cells = <1>;
>>>          ranges = <0x0 0x80 0x1e0>;
>>>
>>>          lpc_ctrl: lpc-ctrl@0 {
>>>                  compatible = "aspeed,ast2500-lpc-ctrl";
>>>                  reg = <0x0 0x80>;
>>>                  clocks = <&syscon ASPEED_CLK_GATE_LCLK>;
>>>                  status = "disabled";
>>>          };
>>>
>>>          lpc_snoop: lpc-snoop@0 {
>>>                  compatible = "aspeed,ast2500-lpc-snoop";
>>>                  reg = <0x0 0x80>;
>>>                  interrupts = <8>;
>>>                  status = "disabled";
>>>          };
>>> }
>>> [...]
>>>
>>> This is device tree setting for LPC interface and its child nodes.
>>> LPC interface can be used as a multi-functional interface such as
>>> snoop 80, KCS, SIO and so on. In this use case, lpc-ctrl@0 and
>>> lpc-snoop@0 are sharing their address range from their individual
>>> driver modules and they can be registered quite well through both
>>> static dt or dynamic dtoverlay. PECI is also a multi-functional
>>> interface which is similar to the above case, I think.
>>
>> This case too is poor design and should be fixed as well. Simply put,
>> you can have 2 devices on a bus at the same address without some sort
>> of mux or arbitration device in the middle. If you have a device/block
>> with multiple functions provided to the OS, then it is the OS's
>> problem to arbitrate access. It is not a DT problem because OS's can
>> vary in how they handle that both from OS to OS and over time.
>>
>> Rob
>>
> 
> If I change it to a single DT node which registers 2 hwmon devices using
> the 2nd option above, then I still have 2 devices on a bus at the same
> address. Does it also make a problem to the OS then?
> 
> Jae

Additionally, I need to explain that there is one and only one bus host
(adapter) and multiple clients on a PECI bus, and the PECI spec doesn't
allow multiple originators, so only the host device can originate
messages. In this implementation, all message transactions on a bus from
client driver modules and user space are serialized in the PECI core bus
driver, so bus occupation and traffic arbitration are managed there even
when a bus has 2 client drivers at the same address. I'm sure that this
implementation doesn't cause that kind of problem for the OS.

Jae


* Re: [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core
  2018-04-10 18:32 ` [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core Jae Hyun Yoo
  2018-04-19 18:59   ` kbuild test robot
@ 2018-04-23 10:52   ` Greg KH
  2018-04-23 17:40     ` Jae Hyun Yoo
  2018-04-24 16:01   ` Andy Shevchenko
  2 siblings, 1 reply; 54+ messages in thread
From: Greg KH @ 2018-04-23 10:52 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On Tue, Apr 10, 2018 at 11:32:05AM -0700, Jae Hyun Yoo wrote:
> +static void peci_adapter_dev_release(struct device *dev)
> +{
> +	/* do nothing */
> +}

As per the in-kernel documentation, I am now allowed to make fun of you.

You are trying to "out smart" the kernel by getting rid of a warning
message that was explicitly put there for you to do something.  To think
that by just providing an "empty" function you are somehow fulfilling
the API requirement is quite bold, don't you think?

This has to be fixed.  I didn't put that warning in there for no good
reason.  Please go read the documentation again...

greg k-h


* Re: [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core
  2018-04-23 10:52   ` Greg KH
@ 2018-04-23 17:40     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-23 17:40 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel, linux-doc,
	devicetree, linux-hwmon, linux-arm-kernel, openbmc

On 4/23/2018 3:52 AM, Greg KH wrote:
> On Tue, Apr 10, 2018 at 11:32:05AM -0700, Jae Hyun Yoo wrote:
>> +static void peci_adapter_dev_release(struct device *dev)
>> +{
>> +	/* do nothing */
>> +}
> 
> As per the in-kernel documentation, I am now allowed to make fun of you.
> 
> You are trying to "out smart" the kernel by getting rid of a warning
> message that was explicitly put there for you to do something.  To think
> that by just providing an "empty" function you are somehow fulfilling
> the API requirement is quite bold, don't you think?
> 
> This has to be fixed.  I didn't put that warning in there for no good
> reason.  Please go read the documentation again...
> 
> greg k-h
> 

Hi Greg,

Thanks a lot for your review.

I think it should contain the actual device resource release code, which
is currently done by peci_del_adapter(), or some coupling logic should be
added between peci_adapter_dev_release() and peci_del_adapter().

As you suggested, I'll check it again after reading the documentation and
understanding the core.c code more deeply.

Jae
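
As a rough illustration of what a non-empty release callback is for, the
userspace sketch below frees the containing object only when the last
reference to the embedded device is dropped. It is a mock under stated
assumptions: the mock_* types and helpers are hypothetical and merely
imitate the get/put pattern of the driver core, they are not its API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct device. */
struct mock_device {
	int refcount;
	void (*release)(struct mock_device *dev);
};

/* Hypothetical adapter that embeds its device, as bus adapters usually do. */
struct mock_adapter {
	int nr;
	struct mock_device dev;
};

#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int released_count;	/* only here to make the sketch observable */

/* The release callback owns the final free of the containing object. */
static void mock_adapter_dev_release(struct mock_device *dev)
{
	struct mock_adapter *adapter =
		mock_container_of(dev, struct mock_adapter, dev);

	free(adapter);
	released_count++;
}

/* Allocate an adapter holding one initial reference. */
static struct mock_adapter *mock_adapter_alloc(int nr)
{
	struct mock_adapter *adapter = calloc(1, sizeof(*adapter));

	if (!adapter)
		return NULL;

	adapter->nr = nr;
	adapter->dev.refcount = 1;
	adapter->dev.release = mock_adapter_dev_release;
	return adapter;
}

static void mock_get_device(struct mock_device *dev)
{
	dev->refcount++;
}

/* Drop one reference; the last put triggers the release callback. */
static void mock_put_device(struct mock_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}
```

The point of the pattern is that whoever removes the adapter only drops
its own reference; the memory goes away in the callback, after every
other holder has dropped theirs.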


* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-10 18:32 ` [PATCH v3 09/10] drivers/hwmon: Add " Jae Hyun Yoo
  2018-04-10 22:28   ` Guenter Roeck
@ 2018-04-24 15:56   ` Andy Shevchenko
  2018-04-24 16:26     ` Jae Hyun Yoo
  1 sibling, 1 reply; 54+ messages in thread
From: Andy Shevchenko @ 2018-04-24 15:56 UTC (permalink / raw)
  To: Jae Hyun Yoo, Alan Cox, Andrew Jeffery, Andrew Lunn,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On Tue, 2018-04-10 at 11:32 -0700, Jae Hyun Yoo wrote:

>  drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++

Would it make sense to have one driver per patch?

> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */

> +struct cpu_gen_info {
> +	u32 type;
> +	u32 cpu_id;
> +	u32 core_max;
> +};
> 

> +static const struct cpu_gen_info cpu_gen_info_table[] = {
> +	{ .type = CPU_GEN_HSX,
> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
> +	  .core_max = CORE_MAX_ON_HSX },
> +	{ .type = CPU_GEN_BRX,
> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
> +	  .core_max = CORE_MAX_ON_BDX },
> +	{ .type = CPU_GEN_SKX,
> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
> +	  .core_max = CORE_MAX_ON_SKX },
> +};

Are we talking about x86 CPU IDs here?
If so, why are the corresponding x86 headers, including intel-family.h,
not used?

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


* Re: [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core
  2018-04-10 18:32 ` [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core Jae Hyun Yoo
  2018-04-19 18:59   ` kbuild test robot
  2018-04-23 10:52   ` Greg KH
@ 2018-04-24 16:01   ` Andy Shevchenko
  2018-04-24 16:29     ` Jae Hyun Yoo
  2 siblings, 1 reply; 54+ messages in thread
From: Andy Shevchenko @ 2018-04-24 16:01 UTC (permalink / raw)
  To: Jae Hyun Yoo, Alan Cox, Andrew Jeffery, Andrew Lunn,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On Tue, 2018-04-10 at 11:32 -0700, Jae Hyun Yoo wrote:
> This commit adds driver implementation for PECI bus core into linux
> driver framework.
> 

All comments you got for patch 6 are applicable here.

And perhaps in the rest of the series.

The rule of thumb: when you get even a single comment in a certain place,
re-check the _entire_ series for the same / similar patterns!

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


* Re: [PATCH v3 09/10] drivers/hwmon: Add PECI hwmon client drivers
  2018-04-24 15:56   ` Andy Shevchenko
@ 2018-04-24 16:26     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-24 16:26 UTC (permalink / raw)
  To: Andy Shevchenko, Alan Cox, Andrew Jeffery, Andrew Lunn,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

Hi Andy,

Thanks a lot for your review. Please check my inline answers.

On 4/24/2018 8:56 AM, Andy Shevchenko wrote:
> On Tue, 2018-04-10 at 11:32 -0700, Jae Hyun Yoo wrote:
> 
>>   drivers/hwmon/peci-cputemp.c  | 783 ++++++++++++++++++++++++++++++++++++++++++
>>   drivers/hwmon/peci-dimmtemp.c | 432 +++++++++++++++++++++++
> 
> Does it make sense one driver per patch?
> 

Yes, I'll separate it into two patches.

>> +#define CLIENT_CPU_ID_MASK    0xf0ff0  /* Mask for Family / Model info */
> 
>> +struct cpu_gen_info {
>> +	u32 type;
>> +	u32 cpu_id;
>> +	u32 core_max;
>> +};
>>
> 
>> +static const struct cpu_gen_info cpu_gen_info_table[] = {
>> +	{ .type = CPU_GEN_HSX,
>> +	  .cpu_id = 0x306f0, /* Family code: 6, Model number: 63 (0x3f) */
>> +	  .core_max = CORE_MAX_ON_HSX },
>> +	{ .type = CPU_GEN_BRX,
>> +	  .cpu_id = 0x406f0, /* Family code: 6, Model number: 79 (0x4f) */
>> +	  .core_max = CORE_MAX_ON_BDX },
>> +	{ .type = CPU_GEN_SKX,
>> +	  .cpu_id = 0x50650, /* Family code: 6, Model number: 85 (0x55) */
>> +	  .core_max = CORE_MAX_ON_SKX },
>> +};
> 
> Are we talking about x86 CPU IDs here?
> If so, why x86 corresponding headers, including intel-family.h are not
> used?
> 

Yes, that would make more sense. I'll include intel-family.h and use
these defines instead:
INTEL_FAM6_HASWELL_X
INTEL_FAM6_BROADWELL_X
INTEL_FAM6_SKYLAKE_X

Thanks,

Jae
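
For reference, the family and model numbers in the table comments above
can be recovered from the cpu_id values with the standard CPUID
signature layout (stepping in bits 3:0, model in 7:4, family in 11:8,
extended model in 19:16). The helpers below are a hypothetical sketch,
not part of the patch; only the mask value is taken from it.

```c
#include <stdint.h>

/*
 * Same value as CLIENT_CPU_ID_MASK in the patch: keeps the extended model,
 * family and model fields and drops the stepping bits.
 */
#define DEMO_CPU_ID_MASK 0xf0ff0u

/* Hypothetical helper: base family field, bits 11:8. */
static inline uint32_t demo_cpu_id_family(uint32_t cpu_id)
{
	return (cpu_id >> 8) & 0xf;
}

/*
 * Hypothetical helper: display model. For families 6 and 15 the extended
 * model field (bits 19:16) supplies the high nibble.
 */
static inline uint32_t demo_cpu_id_model(uint32_t cpu_id)
{
	uint32_t family = demo_cpu_id_family(cpu_id);
	uint32_t model = (cpu_id >> 4) & 0xf;

	if (family == 6 || family == 15)
		model |= ((cpu_id >> 16) & 0xf) << 4;

	return model;
}
```

For example, 0x306f0 decodes to family 6, model 0x3f, and masking a
signature such as 0x306f2 with 0xf0ff0 yields 0x306f0 again, which is
why the table entries carry a zero stepping nibble.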


* Re: [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core
  2018-04-24 16:01   ` Andy Shevchenko
@ 2018-04-24 16:29     ` Jae Hyun Yoo
  0 siblings, 0 replies; 54+ messages in thread
From: Jae Hyun Yoo @ 2018-04-24 16:29 UTC (permalink / raw)
  To: Andy Shevchenko, Alan Cox, Andrew Jeffery, Andrew Lunn,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery
  Cc: linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc

On 4/24/2018 9:01 AM, Andy Shevchenko wrote:
> On Tue, 2018-04-10 at 11:32 -0700, Jae Hyun Yoo wrote:
>> This commit adds a driver implementation for the PECI bus core to the
>> Linux driver framework.
>>
> 
> All comments you got for patch 6 are applicable here.
> 
> And perhaps in the rest of the series.
> 
> The rule of thumb: when you get even a single comment in a certain place,
> re-check the _entire_ series for the same / similar patterns!
> 

Thanks for your advice. I'll keep that in mind.

Jae



Thread overview: 54+ messages
2018-04-10 18:32 [PATCH v3 00/10] PECI device driver introduction Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 01/10] Documentations: dt-bindings: Add documents of generic PECI bus, adapter and client drivers Jae Hyun Yoo
2018-04-11 11:52   ` Joel Stanley
2018-04-12  2:06     ` Jae Hyun Yoo
2018-04-16 17:59   ` Rob Herring
2018-04-16 23:06     ` Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 02/10] Documentations: ioctl: Add ioctl numbers for PECI subsystem Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 03/10] drivers/peci: Add support for PECI bus driver core Jae Hyun Yoo
2018-04-19 18:59   ` kbuild test robot
2018-04-23 10:52   ` Greg KH
2018-04-23 17:40     ` Jae Hyun Yoo
2018-04-24 16:01   ` Andy Shevchenko
2018-04-24 16:29     ` Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs Jae Hyun Yoo
2018-04-11 11:52   ` Joel Stanley
2018-04-12  2:11     ` Jae Hyun Yoo
2018-04-16 18:10   ` Rob Herring
2018-04-16 23:12     ` Jae Hyun Yoo
2018-04-17 13:16       ` Rob Herring
2018-04-17 18:16         ` Jae Hyun Yoo
2018-04-17 22:06           ` Jae Hyun Yoo
2018-04-18 13:59             ` Rob Herring
2018-04-18 16:45               ` Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 05/10] ARM: dts: aspeed: peci: Add PECI node Jae Hyun Yoo
2018-04-11 11:52   ` Joel Stanley
2018-04-12  2:20     ` Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 06/10] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx Jae Hyun Yoo
2018-04-11 11:51   ` Joel Stanley
2018-04-12  2:03     ` Jae Hyun Yoo
2018-04-17 13:37   ` Robin Murphy
2018-04-17 18:21     ` Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers Jae Hyun Yoo
2018-04-16 18:14   ` Rob Herring
2018-04-16 23:22     ` Jae Hyun Yoo
2018-04-16 23:51       ` Jae Hyun Yoo
2018-04-17 20:40         ` Jae Hyun Yoo
2018-04-18 14:32           ` Rob Herring
2018-04-18 20:28             ` Jae Hyun Yoo
2018-04-18 21:28               ` Rob Herring
2018-04-18 21:57                 ` Jae Hyun Yoo
2018-04-19 19:48                   ` Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 08/10] Documentation: hwmon: " Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 09/10] drivers/hwmon: Add " Jae Hyun Yoo
2018-04-10 22:28   ` Guenter Roeck
2018-04-11 21:59     ` Jae Hyun Yoo
2018-04-12  0:34       ` Guenter Roeck
2018-04-12  2:51         ` Jae Hyun Yoo
2018-04-12  3:40           ` Guenter Roeck
2018-04-12 17:09             ` Jae Hyun Yoo
2018-04-12 17:37               ` Guenter Roeck
2018-04-12 19:51                 ` Jae Hyun Yoo
2018-04-24 15:56   ` Andy Shevchenko
2018-04-24 16:26     ` Jae Hyun Yoo
2018-04-10 18:32 ` [PATCH v3 10/10] Add a maintainer for the PECI subsystem Jae Hyun Yoo
