LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v3 0/2] Add temperature notification support
@ 2021-09-12 13:19 Avri Altman
  2021-09-12 13:19 ` [PATCH v3 1/2] scsi: ufs: Probe for " Avri Altman
  2021-09-12 13:19 ` [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling Avri Altman
  0 siblings, 2 replies; 14+ messages in thread
From: Avri Altman @ 2021-09-12 13:19 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter,
	Bean Huo, Guenter Roeck, Avri Altman

v2 -> v3:
 - Attend Bart's & Guenter's comments

v1 -> v2:
 - Add a hw monitor device if both the platform & the device support it
 - Remove the sysfs patch: no need to duplicate /sys/class/hwmon

UFS3.0 allows using the ufs device as a temperature sensor. The purpose
of this optional feature is to provide notification to the host of the
UFS device case temperature. It allows reading of a rough estimate
(+-10 degrees centigrade) of the current case temperature, and setting a
lower and upper temperature bounds, in which the device will trigger an
applicable exception event.

A previous attempt [1] tried a comprehensive approach.  Still, it was
unsuccessful. Here is a more modest approach that introduces just the
bare minimum to support temperature notification.

Thanks,
Avri

[1] https://lore.kernel.org/lkml/1582450522-13256-1-git-send-email-avi.shchislowski@wdc.com/

Avri Altman (2):
  scsi: ufs: Probe for temperature notification support
  scsi: ufs: Add temperature notification exception handling

 drivers/scsi/ufs/Kconfig     |  10 ++
 drivers/scsi/ufs/Makefile    |   1 +
 drivers/scsi/ufs/ufs-hwmon.c | 193 +++++++++++++++++++++++++++++++++++
 drivers/scsi/ufs/ufs.h       |   8 ++
 drivers/scsi/ufs/ufshcd.c    |  49 +++++++++
 drivers/scsi/ufs/ufshcd.h    |  20 ++++
 6 files changed, 281 insertions(+)
 create mode 100644 drivers/scsi/ufs/ufs-hwmon.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 1/2] scsi: ufs: Probe for temperature notification support
  2021-09-12 13:19 [PATCH v3 0/2] Add temperature notification support Avri Altman
@ 2021-09-12 13:19 ` Avri Altman
  2021-09-12 15:24   ` Guenter Roeck
  2021-09-12 13:19 ` [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling Avri Altman
  1 sibling, 1 reply; 14+ messages in thread
From: Avri Altman @ 2021-09-12 13:19 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter,
	Bean Huo, Guenter Roeck, Avri Altman

Probe the dExtendedUFSFeaturesSupport register for the device's
temperature notification support, and if supported, add a hw monitor
device.

Signed-off-by: Avri Altman <avri.altman@wdc.com>
---
 drivers/scsi/ufs/Kconfig     |  10 ++
 drivers/scsi/ufs/Makefile    |   1 +
 drivers/scsi/ufs/ufs-hwmon.c | 179 +++++++++++++++++++++++++++++++++++
 drivers/scsi/ufs/ufs.h       |   6 ++
 drivers/scsi/ufs/ufshcd.c    |  28 ++++++
 drivers/scsi/ufs/ufshcd.h    |  18 ++++
 6 files changed, 242 insertions(+)
 create mode 100644 drivers/scsi/ufs/ufs-hwmon.c

diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
index 432df76e6318..b930f29fc375 100644
--- a/drivers/scsi/ufs/Kconfig
+++ b/drivers/scsi/ufs/Kconfig
@@ -199,3 +199,13 @@ config SCSI_UFS_FAULT_INJECTION
 	help
 	  Enable fault injection support in the UFS driver. This makes it easier
 	  to test the UFS error handler and abort handler.
+
+config SCSI_UFS_HWMON
+	bool "UFS  Temperature Notification"
+	depends on SCSI_UFSHCD && HWMON
+	help
+	  This provides support for UFS hardware monitoring. If enabled,
+	  a hardware monitoring device will be created for the UFS device.
+
+	  If unsure, say N.
+
diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
index c407da9b5171..966048875b50 100644
--- a/drivers/scsi/ufs/Makefile
+++ b/drivers/scsi/ufs/Makefile
@@ -10,6 +10,7 @@ ufshcd-core-$(CONFIG_SCSI_UFS_BSG)	+= ufs_bsg.o
 ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO)	+= ufshcd-crypto.o
 ufshcd-core-$(CONFIG_SCSI_UFS_HPB)	+= ufshpb.o
 ufshcd-core-$(CONFIG_SCSI_UFS_FAULT_INJECTION) += ufs-fault-injection.o
+ufshcd-core-$(CONFIG_SCSI_UFS_HWMON) += ufs-hwmon.o
 
 obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o tc-dwc-g210.o
 obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o ufshcd-dwc.o tc-dwc-g210.o
diff --git a/drivers/scsi/ufs/ufs-hwmon.c b/drivers/scsi/ufs/ufs-hwmon.c
new file mode 100644
index 000000000000..a50e83f645f4
--- /dev/null
+++ b/drivers/scsi/ufs/ufs-hwmon.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * UFS hardware monitoring support
+ * Copyright (c) 2021, Western Digital Corporation
+ */
+
+#include <linux/hwmon.h>
+
+#include "ufshcd.h"
+
+struct ufs_hwmon_data {
+	struct ufs_hba *hba;
+	u8 mask;
+};
+
+static bool ufs_temp_enabled(struct ufs_hba *hba, u8 mask)
+{
+	u32 ee_mask;
+
+	if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,
+			      QUERY_ATTR_IDN_EE_CONTROL, 0, 0, &ee_mask))
+		return false;
+
+	return (mask & ee_mask & MASK_EE_TOO_HIGH_TEMP) ||
+		(mask & ee_mask & MASK_EE_TOO_LOW_TEMP);
+}
+
+static bool ufs_temp_valid(struct ufs_hba *hba, u8 mask,
+			   enum attr_idn idn, u32 value)
+{
+	return (idn == QUERY_ATTR_IDN_CASE_ROUGH_TEMP && value >= 1 &&
+		value <= 250 && ufs_temp_enabled(hba, mask)) ||
+	      (idn == QUERY_ATTR_IDN_HIGH_TEMP_BOUND && value >= 100 &&
+	       value <= 250) ||
+	      (idn == QUERY_ATTR_IDN_LOW_TEMP_BOUND && value >= 1 &&
+	       value <= 80);
+}
+
+static int ufs_get_temp(struct ufs_hba *hba, u8 mask, enum attr_idn idn)
+{
+	u32 value;
+
+	if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR, idn, 0, 0,
+	    &value))
+		return 0;
+
+	if (ufs_temp_valid(hba, mask, idn, value))
+		return value - 80;
+
+	return 0;
+}
+
+static int ufs_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
+			   u32 attr, int channel, long *val)
+{
+	struct ufs_hwmon_data *data = dev_get_drvdata(dev);
+	struct ufs_hba *hba = data->hba;
+	u8 mask = data->mask;
+	int err = 0;
+	bool get_temp = true;
+
+	if (type != hwmon_temp)
+		return 0;
+
+	down(&hba->host_sem);
+
+	if (!ufshcd_is_user_access_allowed(hba)) {
+		up(&hba->host_sem);
+		return -EBUSY;
+	}
+
+	ufshcd_rpm_get_sync(hba);
+
+	switch (attr) {
+	case hwmon_temp_enable:
+		*val = ufs_temp_enabled(hba, mask);
+		get_temp = false;
+
+		break;
+	case hwmon_temp_max_alarm:
+		*val = ufs_get_temp(hba, mask, QUERY_ATTR_IDN_HIGH_TEMP_BOUND);
+
+		break;
+	case hwmon_temp_min_alarm:
+		*val = ufs_get_temp(hba, mask, QUERY_ATTR_IDN_LOW_TEMP_BOUND);
+
+		break;
+	case hwmon_temp_input:
+		*val = ufs_get_temp(hba, mask, QUERY_ATTR_IDN_CASE_ROUGH_TEMP);
+
+		break;
+	default:
+		err = -EOPNOTSUPP;
+
+		break;
+	}
+
+	ufshcd_rpm_put_sync(hba);
+
+	up(&hba->host_sem);
+
+	if (get_temp && !err && *val == 0)
+		err = -EINVAL;
+
+	return err;
+}
+
+static umode_t ufs_hwmon_is_visible(const void *_data,
+				     enum hwmon_sensor_types type,
+				     u32 attr, int channel)
+{
+	if (type != hwmon_temp)
+		return 0;
+
+	switch (attr) {
+	case hwmon_temp_enable:
+	case hwmon_temp_max_alarm:
+	case hwmon_temp_min_alarm:
+	case hwmon_temp_input:
+		return 0444;
+	default:
+		break;
+	}
+	return 0;
+}
+
+static const struct hwmon_channel_info *ufs_hwmon_info[] = {
+	HWMON_CHANNEL_INFO(temp, HWMON_T_ENABLE | HWMON_T_INPUT |
+			    HWMON_T_MIN_ALARM | HWMON_T_MAX_ALARM),
+	NULL
+};
+
+static const struct hwmon_ops ufs_hwmon_ops = {
+	.is_visible	= ufs_hwmon_is_visible,
+	.read		= ufs_hwmon_read,
+};
+
+static const struct hwmon_chip_info ufs_hwmon_hba_info = {
+	.ops	= &ufs_hwmon_ops,
+	.info	= ufs_hwmon_info,
+};
+
+void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask)
+{
+	struct device *dev = hba->dev;
+	struct ufs_hwmon_data *data;
+	struct device *hwmon;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return;
+
+	data->hba = hba;
+	data->mask = mask;
+
+	hwmon = hwmon_device_register_with_info(dev, "ufs",
+						data, &ufs_hwmon_hba_info,
+						NULL);
+	if (IS_ERR(hwmon)) {
+		dev_warn(dev, "Failed to instantiate hwmon device\n");
+		kfree(data);
+		return;
+	}
+
+	hba->hwmon_device = hwmon;
+}
+
+void ufs_hwmon_remove(struct ufs_hba *hba)
+{
+	struct ufs_hwmon_data *data;
+
+	if (!hba->hwmon_device)
+		return;
+
+	data = dev_get_drvdata(hba->hwmon_device);
+	hwmon_device_unregister(hba->hwmon_device);
+	hba->hwmon_device = NULL;
+	kfree(data);
+}
diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index 8c6b38b1b142..171b27be7b1d 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -152,6 +152,9 @@ enum attr_idn {
 	QUERY_ATTR_IDN_PSA_STATE		= 0x15,
 	QUERY_ATTR_IDN_PSA_DATA_SIZE		= 0x16,
 	QUERY_ATTR_IDN_REF_CLK_GATING_WAIT_TIME	= 0x17,
+	QUERY_ATTR_IDN_CASE_ROUGH_TEMP          = 0x18,
+	QUERY_ATTR_IDN_HIGH_TEMP_BOUND          = 0x19,
+	QUERY_ATTR_IDN_LOW_TEMP_BOUND           = 0x1A,
 	QUERY_ATTR_IDN_WB_FLUSH_STATUS	        = 0x1C,
 	QUERY_ATTR_IDN_AVAIL_WB_BUFF_SIZE       = 0x1D,
 	QUERY_ATTR_IDN_WB_BUFF_LIFE_TIME_EST    = 0x1E,
@@ -338,6 +341,9 @@ enum {
 
 /* Possible values for dExtendedUFSFeaturesSupport */
 enum {
+	UFS_DEV_LOW_TEMP_NOTIF		= BIT(4),
+	UFS_DEV_HIGH_TEMP_NOTIF		= BIT(5),
+	UFS_DEV_EXT_TEMP_NOTIF		= BIT(6),
 	UFS_DEV_HPB_SUPPORT		= BIT(7),
 	UFS_DEV_WRITE_BOOSTER_SUP	= BIT(8),
 };
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 67889d74761c..90c2e9677435 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -7469,6 +7469,31 @@ static void ufshcd_wb_probe(struct ufs_hba *hba, u8 *desc_buf)
 	hba->caps &= ~UFSHCD_CAP_WB_EN;
 }
 
+static void ufshcd_temp_notif_probe(struct ufs_hba *hba, u8 *desc_buf)
+{
+	struct ufs_dev_info *dev_info = &hba->dev_info;
+	u32 ext_ufs_feature;
+	u8 mask = 0;
+
+	if (!(hba->caps & UFSHCD_CAP_TEMP_NOTIF) ||
+	    dev_info->wspecversion < 0x300)
+		return;
+
+	ext_ufs_feature = get_unaligned_be32(desc_buf +
+					DEVICE_DESC_PARAM_EXT_UFS_FEATURE_SUP);
+
+	if (ext_ufs_feature & UFS_DEV_LOW_TEMP_NOTIF)
+		mask |= MASK_EE_TOO_LOW_TEMP;
+
+	if (ext_ufs_feature & UFS_DEV_HIGH_TEMP_NOTIF)
+		mask |= MASK_EE_TOO_HIGH_TEMP;
+
+	if (mask) {
+		ufshcd_enable_ee(hba, mask);
+		ufs_hwmon_probe(hba, mask);
+	}
+}
+
 void ufshcd_fixup_dev_quirks(struct ufs_hba *hba, struct ufs_dev_fix *fixups)
 {
 	struct ufs_dev_fix *f;
@@ -7564,6 +7589,8 @@ static int ufs_get_device_desc(struct ufs_hba *hba)
 
 	ufshcd_wb_probe(hba, desc_buf);
 
+	ufshcd_temp_notif_probe(hba, desc_buf);
+
 	/*
 	 * ufshcd_read_string_desc returns size of the string
 	 * reset the error value
@@ -9408,6 +9435,7 @@ void ufshcd_remove(struct ufs_hba *hba)
 {
 	if (hba->sdev_ufs_device)
 		ufshcd_rpm_get_sync(hba);
+	ufs_hwmon_remove(hba);
 	ufs_bsg_remove(hba);
 	ufshpb_remove(hba);
 	ufs_sysfs_remove_nodes(hba->dev);
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 4723f27a55d1..798a408d71e5 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -653,6 +653,12 @@ enum ufshcd_caps {
 	 * in order to exit DeepSleep state.
 	 */
 	UFSHCD_CAP_DEEPSLEEP				= 1 << 10,
+
+	/*
+	 * This capability allows the host controller driver to use temperature
+	 * notification if it is supported by the UFS device.
+	 */
+	UFSHCD_CAP_TEMP_NOTIF				= 1 << 11,
 };
 
 struct ufs_hba_variant_params {
@@ -789,6 +795,10 @@ struct ufs_hba {
 	struct scsi_device *sdev_ufs_device;
 	struct scsi_device *sdev_rpmb;
 
+#ifdef CONFIG_SCSI_UFS_HWMON
+	struct device *hwmon_device;
+#endif
+
 	enum ufs_dev_pwr_mode curr_dev_pwr_mode;
 	enum uic_link_state uic_link_state;
 	/* Desired UFS power management level during runtime PM */
@@ -1050,6 +1060,14 @@ static inline u8 ufshcd_wb_get_query_index(struct ufs_hba *hba)
 	return 0;
 }
 
+#ifdef CONFIG_SCSI_UFS_HWMON
+void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask);
+void ufs_hwmon_remove(struct ufs_hba *hba);
+#else
+static inline void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask) {}
+static inline void ufs_hwmon_remove(struct ufs_hba *hba) {}
+#endif
+
 #ifdef CONFIG_PM
 extern int ufshcd_runtime_suspend(struct device *dev);
 extern int ufshcd_runtime_resume(struct device *dev);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling
  2021-09-12 13:19 [PATCH v3 0/2] Add temperature notification support Avri Altman
  2021-09-12 13:19 ` [PATCH v3 1/2] scsi: ufs: Probe for " Avri Altman
@ 2021-09-12 13:19 ` Avri Altman
  2021-09-12 15:43   ` kernel test robot
                     ` (3 more replies)
  1 sibling, 4 replies; 14+ messages in thread
From: Avri Altman @ 2021-09-12 13:19 UTC (permalink / raw)
  To: James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter,
	Bean Huo, Guenter Roeck, Avri Altman

The device may notify the host of an extreme temperature by using the
exception event mechanism. The exception can be raised when the device’s
Tcase temperature is either too high or too low.

It is essentially up to the platform to decide what further actions need
to be taken. leave a placeholder for a designated vop for that.

Signed-off-by: Avri Altman <avri.altman@wdc.com>
---
 drivers/scsi/ufs/ufs-hwmon.c | 14 ++++++++++++++
 drivers/scsi/ufs/ufs.h       |  2 ++
 drivers/scsi/ufs/ufshcd.c    | 21 +++++++++++++++++++++
 drivers/scsi/ufs/ufshcd.h    |  2 ++
 4 files changed, 39 insertions(+)

diff --git a/drivers/scsi/ufs/ufs-hwmon.c b/drivers/scsi/ufs/ufs-hwmon.c
index a50e83f645f4..b466b4649c21 100644
--- a/drivers/scsi/ufs/ufs-hwmon.c
+++ b/drivers/scsi/ufs/ufs-hwmon.c
@@ -177,3 +177,17 @@ void ufs_hwmon_remove(struct ufs_hba *hba)
 	hba->hwmon_device = NULL;
 	kfree(data);
 }
+
+void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask)
+{
+	if (!hba->hwmon_device)
+		return;
+
+	if (ee_mask & MASK_EE_TOO_HIGH_TEMP)
+		hwmon_notify_event(hba->hwmon_device, hwmon_temp,
+				   hwmon_temp_max_alarm, 0);
+
+	if (ee_mask & MASK_EE_TOO_LOW_TEMP)
+		hwmon_notify_event(hba->hwmon_device, hwmon_temp,
+				   hwmon_temp_min_alarm, 0);
+}
diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index 171b27be7b1d..d9bc048c2a71 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -377,6 +377,8 @@ enum {
 	MASK_EE_PERFORMANCE_THROTTLING	= BIT(6),
 };
 
+#define MASK_EE_URGENT_TEMP (MASK_EE_TOO_HIGH_TEMP | MASK_EE_TOO_LOW_TEMP)
+
 /* Background operation status */
 enum bkops_status {
 	BKOPS_STATUS_NO_OP               = 0x0,
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 90c2e9677435..3f4c7124b74b 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5642,6 +5642,24 @@ static void ufshcd_bkops_exception_event_handler(struct ufs_hba *hba)
 				__func__, err);
 }
 
+static void ufshcd_temp_exception_event_handler(struct ufs_hba *hba, u16 status)
+{
+	u32 value;
+
+	if (ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_READ_ATTR,
+				QUERY_ATTR_IDN_CASE_ROUGH_TEMP, 0, 0, &value))
+		return;
+
+	dev_info(hba->dev, "exception Tcase %d\n", value - 80);
+
+	ufs_hwmon_notify_event(hba, status & MASK_EE_URGENT_TEMP);
+
+	/*
+	 * A placeholder for the platform vendors to add whatever additional
+	 * steps required
+	 */
+}
+
 static int __ufshcd_wb_toggle(struct ufs_hba *hba, bool set, enum flag_idn idn)
 {
 	u8 index;
@@ -5821,6 +5839,9 @@ static void ufshcd_exception_event_handler(struct work_struct *work)
 	if (status & hba->ee_drv_mask & MASK_EE_URGENT_BKOPS)
 		ufshcd_bkops_exception_event_handler(hba);
 
+	if (status & hba->ee_drv_mask & MASK_EE_URGENT_TEMP)
+		ufshcd_temp_exception_event_handler(hba, status);
+
 	ufs_debugfs_exception_event(hba, status);
 out:
 	ufshcd_scsi_unblock_requests(hba);
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 798a408d71e5..e6abce9a8b00 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -1063,9 +1063,11 @@ static inline u8 ufshcd_wb_get_query_index(struct ufs_hba *hba)
 #ifdef CONFIG_SCSI_UFS_HWMON
 void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask);
 void ufs_hwmon_remove(struct ufs_hba *hba);
+void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask);
 #else
 static inline void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask) {}
 static inline void ufs_hwmon_remove(struct ufs_hba *hba) {}
+void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask) {}
 #endif
 
 #ifdef CONFIG_PM
-- 
2.17.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 1/2] scsi: ufs: Probe for temperature notification support
  2021-09-12 13:19 ` [PATCH v3 1/2] scsi: ufs: Probe for " Avri Altman
@ 2021-09-12 15:24   ` Guenter Roeck
  2021-09-13  7:06     ` Avri Altman
  0 siblings, 1 reply; 14+ messages in thread
From: Guenter Roeck @ 2021-09-12 15:24 UTC (permalink / raw)
  To: Avri Altman, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

On 9/12/21 6:19 AM, Avri Altman wrote:
> Probe the dExtendedUFSFeaturesSupport register for the device's
> temperature notification support, and if supported, add a hw monitor
> device.
> 
> Signed-off-by: Avri Altman <avri.altman@wdc.com>
> ---
>   drivers/scsi/ufs/Kconfig     |  10 ++
>   drivers/scsi/ufs/Makefile    |   1 +
>   drivers/scsi/ufs/ufs-hwmon.c | 179 +++++++++++++++++++++++++++++++++++
>   drivers/scsi/ufs/ufs.h       |   6 ++
>   drivers/scsi/ufs/ufshcd.c    |  28 ++++++
>   drivers/scsi/ufs/ufshcd.h    |  18 ++++
>   6 files changed, 242 insertions(+)
>   create mode 100644 drivers/scsi/ufs/ufs-hwmon.c
> 
> diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
> index 432df76e6318..b930f29fc375 100644
> --- a/drivers/scsi/ufs/Kconfig
> +++ b/drivers/scsi/ufs/Kconfig
> @@ -199,3 +199,13 @@ config SCSI_UFS_FAULT_INJECTION
>   	help
>   	  Enable fault injection support in the UFS driver. This makes it easier
>   	  to test the UFS error handler and abort handler.
> +
> +config SCSI_UFS_HWMON
> +	bool "UFS  Temperature Notification"
> +	depends on SCSI_UFSHCD && HWMON
> +	help
> +	  This provides support for UFS hardware monitoring. If enabled,
> +	  a hardware monitoring device will be created for the UFS device.
> +
> +	  If unsure, say N.
> +

git complains about blank line at EOF.

> diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
> index c407da9b5171..966048875b50 100644
> --- a/drivers/scsi/ufs/Makefile
> +++ b/drivers/scsi/ufs/Makefile
> @@ -10,6 +10,7 @@ ufshcd-core-$(CONFIG_SCSI_UFS_BSG)	+= ufs_bsg.o
>   ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO)	+= ufshcd-crypto.o
>   ufshcd-core-$(CONFIG_SCSI_UFS_HPB)	+= ufshpb.o
>   ufshcd-core-$(CONFIG_SCSI_UFS_FAULT_INJECTION) += ufs-fault-injection.o
> +ufshcd-core-$(CONFIG_SCSI_UFS_HWMON) += ufs-hwmon.o
>   
>   obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o tc-dwc-g210.o
>   obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o ufshcd-dwc.o tc-dwc-g210.o
> diff --git a/drivers/scsi/ufs/ufs-hwmon.c b/drivers/scsi/ufs/ufs-hwmon.c
> new file mode 100644
> index 000000000000..a50e83f645f4
> --- /dev/null
> +++ b/drivers/scsi/ufs/ufs-hwmon.c
> @@ -0,0 +1,179 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * UFS hardware monitoring support
> + * Copyright (c) 2021, Western Digital Corporation
> + */
> +
> +#include <linux/hwmon.h>
> +
> +#include "ufshcd.h"
> +
> +struct ufs_hwmon_data {
> +	struct ufs_hba *hba;
> +	u8 mask;
> +};
> +
> +static bool ufs_temp_enabled(struct ufs_hba *hba, u8 mask)
> +{
> +	u32 ee_mask;
> +
> +	if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,
> +			      QUERY_ATTR_IDN_EE_CONTROL, 0, 0, &ee_mask))
> +		return false;
> +
> +	return (mask & ee_mask & MASK_EE_TOO_HIGH_TEMP) ||
> +		(mask & ee_mask & MASK_EE_TOO_LOW_TEMP);
> +}
> +
> +static bool ufs_temp_valid(struct ufs_hba *hba, u8 mask,
> +			   enum attr_idn idn, u32 value)
> +{
> +	return (idn == QUERY_ATTR_IDN_CASE_ROUGH_TEMP && value >= 1 &&
> +		value <= 250 && ufs_temp_enabled(hba, mask)) ||
> +	      (idn == QUERY_ATTR_IDN_HIGH_TEMP_BOUND && value >= 100 &&
> +	       value <= 250) ||
> +	      (idn == QUERY_ATTR_IDN_LOW_TEMP_BOUND && value >= 1 &&
> +	       value <= 80);
> +}
> +
The value ranges checed above suggest that the temperature is reported
in degrees C (or maybe degrees C with an offset). The hwmon API expects
temperatures to be reported in milli-degrees C, and I don't see a
conversion in the actual read functions. What does the "sensors" command report ?

> +static int ufs_get_temp(struct ufs_hba *hba, u8 mask, enum attr_idn idn)
> +{
> +	u32 value;
> +
> +	if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR, idn, 0, 0,
> +	    &value))

checkpatch states that alignment is off, and I am quite sure this fits into one
line anyway (with the 100-column limit). There are more instances with bad alignment
according to checkpatch.

Also, ufshcd_query_attr() returns a valid Linux error code. That should be
returned to the caller and not be replaced. More on that below.

> +		return 0;
> +
> +	if (ufs_temp_valid(hba, mask, idn, value))
> +		return value - 80;
> +

This again suggests that the temperature is not milli-degrees C.

Is there reason to believe that this validation is necessary ?
Note that this reports an "error" if the returned temperature value
happens to have a value of 80. Again, more on that below.

> +	return 0;
> +}
> +
> +static int ufs_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> +			   u32 attr, int channel, long *val)
> +{
> +	struct ufs_hwmon_data *data = dev_get_drvdata(dev);
> +	struct ufs_hba *hba = data->hba;
> +	u8 mask = data->mask;
> +	int err = 0;
> +	bool get_temp = true;
> +
> +	if (type != hwmon_temp)
> +		return 0;
> +
> +	down(&hba->host_sem);
> +
> +	if (!ufshcd_is_user_access_allowed(hba)) {
> +		up(&hba->host_sem);
> +		return -EBUSY;
> +	}
> +
> +	ufshcd_rpm_get_sync(hba);
> +
> +	switch (attr) {
> +	case hwmon_temp_enable:
> +		*val = ufs_temp_enabled(hba, mask);
> +		get_temp = false;
> +

This seems to be read-only, and the mask only affects the limit registers as
far as I con see. If so, this is wrong: The mask should be used to enable
or hide the limit attributes as needed. If the mask is 0, and if this means
that the current temperature is not reported either, the driver should not
instantiate at all.

The "enable" attribute only makes sense if it can be used to actually enable
or disable a specific sensor, and is not tied to limit attributes but to the
actual sensor values.

> +		break;
> +	case hwmon_temp_max_alarm:
> +		*val = ufs_get_temp(hba, mask, QUERY_ATTR_IDN_HIGH_TEMP_BOUND);
> +
> +		break;
> +	case hwmon_temp_min_alarm:
> +		*val = ufs_get_temp(hba, mask, QUERY_ATTR_IDN_LOW_TEMP_BOUND);
> +
> +		break;
> +	case hwmon_temp_input:
> +		*val = ufs_get_temp(hba, mask, QUERY_ATTR_IDN_CASE_ROUGH_TEMP);
> +
If an enable attribute exists and is 0 (disabled), this should return -ENODATA.
In this case, that would imply that the driver should not be instantiated
in the first place since it has nothing to report.

> +		break;
> +	default:
> +		err = -EOPNOTSUPP;
> +
> +		break;
> +	}
> +
> +	ufshcd_rpm_put_sync(hba);
> +
> +	up(&hba->host_sem);
> +
> +	if (get_temp && !err && *val == 0)
> +		err = -EINVAL;
> +
That is an odd way of detection errors. If it was in the hwmon subsystem,
I'd ask for the error handling to be moved into the case statements. On
top of that, interpreting a return value of "0" as error seems wrong.
ufs_get_temp() returns 0 if the measured temperature or the reported limit
happens to have a value of 80, and that is perfectly valid. If ufs_get_temp()
reports an error, it should report that as error.

Also, EINVAL is "invalid argument", which I am quite sure does not apply here.

> +	return err;
> +}
> +
> +static umode_t ufs_hwmon_is_visible(const void *_data,
> +				     enum hwmon_sensor_types type,
> +				     u32 attr, int channel)
> +{
> +	if (type != hwmon_temp)
> +		return 0;
> +
> +	switch (attr) {
> +	case hwmon_temp_enable:
> +	case hwmon_temp_max_alarm:
> +	case hwmon_temp_min_alarm:
> +	case hwmon_temp_input:
> +		return 0444;
> +	default:
> +		break;
> +	}
> +	return 0;
> +}
> +
> +static const struct hwmon_channel_info *ufs_hwmon_info[] = {
> +	HWMON_CHANNEL_INFO(temp, HWMON_T_ENABLE | HWMON_T_INPUT |
> +			    HWMON_T_MIN_ALARM | HWMON_T_MAX_ALARM),
> +	NULL
> +};
> +
> +static const struct hwmon_ops ufs_hwmon_ops = {
> +	.is_visible	= ufs_hwmon_is_visible,
> +	.read		= ufs_hwmon_read,
> +};
> +
> +static const struct hwmon_chip_info ufs_hwmon_hba_info = {
> +	.ops	= &ufs_hwmon_ops,
> +	.info	= ufs_hwmon_info,
> +};
> +
> +void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask)
> +{
> +	struct device *dev = hba->dev;
> +	struct ufs_hwmon_data *data;
> +	struct device *hwmon;
> +
> +	data = kzalloc(sizeof(*data), GFP_KERNEL);
> +	if (!data)
> +		return;
> +
> +	data->hba = hba;
> +	data->mask = mask;
> +
> +	hwmon = hwmon_device_register_with_info(dev, "ufs",
> +						data, &ufs_hwmon_hba_info,
> +						NULL);
> +	if (IS_ERR(hwmon)) {
> +		dev_warn(dev, "Failed to instantiate hwmon device\n");
> +		kfree(data);
> +		return;
> +	}
> +
> +	hba->hwmon_device = hwmon;
> +}
> +
> +void ufs_hwmon_remove(struct ufs_hba *hba)
> +{
> +	struct ufs_hwmon_data *data;
> +
> +	if (!hba->hwmon_device)
> +		return;
> +
> +	data = dev_get_drvdata(hba->hwmon_device);
> +	hwmon_device_unregister(hba->hwmon_device);
> +	hba->hwmon_device = NULL;
> +	kfree(data);
> +}
> diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
> index 8c6b38b1b142..171b27be7b1d 100644
> --- a/drivers/scsi/ufs/ufs.h
> +++ b/drivers/scsi/ufs/ufs.h
> @@ -152,6 +152,9 @@ enum attr_idn {
>   	QUERY_ATTR_IDN_PSA_STATE		= 0x15,
>   	QUERY_ATTR_IDN_PSA_DATA_SIZE		= 0x16,
>   	QUERY_ATTR_IDN_REF_CLK_GATING_WAIT_TIME	= 0x17,
> +	QUERY_ATTR_IDN_CASE_ROUGH_TEMP          = 0x18,
> +	QUERY_ATTR_IDN_HIGH_TEMP_BOUND          = 0x19,
> +	QUERY_ATTR_IDN_LOW_TEMP_BOUND           = 0x1A,
>   	QUERY_ATTR_IDN_WB_FLUSH_STATUS	        = 0x1C,
>   	QUERY_ATTR_IDN_AVAIL_WB_BUFF_SIZE       = 0x1D,
>   	QUERY_ATTR_IDN_WB_BUFF_LIFE_TIME_EST    = 0x1E,
> @@ -338,6 +341,9 @@ enum {
>   
>   /* Possible values for dExtendedUFSFeaturesSupport */
>   enum {
> +	UFS_DEV_LOW_TEMP_NOTIF		= BIT(4),
> +	UFS_DEV_HIGH_TEMP_NOTIF		= BIT(5),
> +	UFS_DEV_EXT_TEMP_NOTIF		= BIT(6),
>   	UFS_DEV_HPB_SUPPORT		= BIT(7),
>   	UFS_DEV_WRITE_BOOSTER_SUP	= BIT(8),
>   };
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 67889d74761c..90c2e9677435 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -7469,6 +7469,31 @@ static void ufshcd_wb_probe(struct ufs_hba *hba, u8 *desc_buf)
>   	hba->caps &= ~UFSHCD_CAP_WB_EN;
>   }
>   
> +static void ufshcd_temp_notif_probe(struct ufs_hba *hba, u8 *desc_buf)
> +{
> +	struct ufs_dev_info *dev_info = &hba->dev_info;
> +	u32 ext_ufs_feature;
> +	u8 mask = 0;
> +
> +	if (!(hba->caps & UFSHCD_CAP_TEMP_NOTIF) ||
> +	    dev_info->wspecversion < 0x300)

I am quite sure this fits a single line.

> +		return;
> +
> +	ext_ufs_feature = get_unaligned_be32(desc_buf +
> +					DEVICE_DESC_PARAM_EXT_UFS_FEATURE_SUP);
> +
> +	if (ext_ufs_feature & UFS_DEV_LOW_TEMP_NOTIF)
> +		mask |= MASK_EE_TOO_LOW_TEMP;
> +
> +	if (ext_ufs_feature & UFS_DEV_HIGH_TEMP_NOTIF)
> +		mask |= MASK_EE_TOO_HIGH_TEMP;
> +
> +	if (mask) {
> +		ufshcd_enable_ee(hba, mask);
> +		ufs_hwmon_probe(hba, mask);
> +	}
> +}
> +
>   void ufshcd_fixup_dev_quirks(struct ufs_hba *hba, struct ufs_dev_fix *fixups)
>   {
>   	struct ufs_dev_fix *f;
> @@ -7564,6 +7589,8 @@ static int ufs_get_device_desc(struct ufs_hba *hba)
>   
>   	ufshcd_wb_probe(hba, desc_buf);
>   
> +	ufshcd_temp_notif_probe(hba, desc_buf);
> +
>   	/*
>   	 * ufshcd_read_string_desc returns size of the string
>   	 * reset the error value
> @@ -9408,6 +9435,7 @@ void ufshcd_remove(struct ufs_hba *hba)
>   {
>   	if (hba->sdev_ufs_device)
>   		ufshcd_rpm_get_sync(hba);
> +	ufs_hwmon_remove(hba);
>   	ufs_bsg_remove(hba);
>   	ufshpb_remove(hba);
>   	ufs_sysfs_remove_nodes(hba->dev);
> diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
> index 4723f27a55d1..798a408d71e5 100644
> --- a/drivers/scsi/ufs/ufshcd.h
> +++ b/drivers/scsi/ufs/ufshcd.h
> @@ -653,6 +653,12 @@ enum ufshcd_caps {
>   	 * in order to exit DeepSleep state.
>   	 */
>   	UFSHCD_CAP_DEEPSLEEP				= 1 << 10,
> +
> +	/*
> +	 * This capability allows the host controller driver to use temperature
> +	 * notification if it is supported by the UFS device.
> +	 */
> +	UFSHCD_CAP_TEMP_NOTIF				= 1 << 11,
>   };
>   
>   struct ufs_hba_variant_params {
> @@ -789,6 +795,10 @@ struct ufs_hba {
>   	struct scsi_device *sdev_ufs_device;
>   	struct scsi_device *sdev_rpmb;
>   
> +#ifdef CONFIG_SCSI_UFS_HWMON
> +	struct device *hwmon_device;
> +#endif
> +
>   	enum ufs_dev_pwr_mode curr_dev_pwr_mode;
>   	enum uic_link_state uic_link_state;
>   	/* Desired UFS power management level during runtime PM */
> @@ -1050,6 +1060,14 @@ static inline u8 ufshcd_wb_get_query_index(struct ufs_hba *hba)
>   	return 0;
>   }
>   
> +#ifdef CONFIG_SCSI_UFS_HWMON
> +void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask);
> +void ufs_hwmon_remove(struct ufs_hba *hba);
> +#else
> +static inline void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask) {}
> +static inline void ufs_hwmon_remove(struct ufs_hba *hba) {}
> +#endif
> +
>   #ifdef CONFIG_PM
>   extern int ufshcd_runtime_suspend(struct device *dev);
>   extern int ufshcd_runtime_resume(struct device *dev);
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling
  2021-09-12 13:19 ` [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling Avri Altman
@ 2021-09-12 15:43   ` kernel test robot
  2021-09-12 15:55   ` Guenter Roeck
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2021-09-12 15:43 UTC (permalink / raw)
  To: Avri Altman, James E . J . Bottomley, Martin K . Petersen
  Cc: kbuild-all, linux-scsi, linux-kernel, Bart Van Assche,
	Adrian Hunter, Bean Huo, Guenter Roeck, Avri Altman

[-- Attachment #1: Type: text/plain, Size: 2897 bytes --]

Hi Avri,

I love your patch! Yet something to improve:

[auto build test ERROR on mkp-scsi/for-next]
[also build test ERROR on scsi/for-next next-20210910]
[cannot apply to target/for-next v5.14]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Avri-Altman/Add-temperature-notification-support/20210912-212110
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: riscv-randconfig-r042-20210912 (attached as .config)
compiler: riscv64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2aa54f753f44e119f4629ae015fedb3e12cebac6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Avri-Altman/Add-temperature-notification-support/20210912-212110
        git checkout 2aa54f753f44e119f4629ae015fedb3e12cebac6
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from drivers/scsi/ufs/ufs-debugfs.c:7:
>> drivers/scsi/ufs/ufshcd.h:1070:6: warning: no previous prototype for 'ufs_hwmon_notify_event' [-Wmissing-prototypes]
    1070 | void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask) {}
         |      ^~~~~~~~~~~~~~~~~~~~~~
--
   riscv64-linux-ld: drivers/scsi/ufs/ufs-sysfs.o: in function `ufs_hwmon_notify_event':
>> ufs-sysfs.c:(.text+0xca00): multiple definition of `ufs_hwmon_notify_event'; drivers/scsi/ufs/ufshcd.o:ufshcd.c:(.text+0xb8c0): first defined here
   riscv64-linux-ld: drivers/scsi/ufs/ufs-debugfs.o: in function `ufs_hwmon_notify_event':
   ufs-debugfs.c:(.text+0x440): multiple definition of `ufs_hwmon_notify_event'; drivers/scsi/ufs/ufshcd.o:ufshcd.c:(.text+0xb8c0): first defined here
   riscv64-linux-ld: drivers/scsi/ufs/ufshcd-crypto.o: in function `ufs_hwmon_notify_event':
   ufshcd-crypto.c:(.text+0x580): multiple definition of `ufs_hwmon_notify_event'; drivers/scsi/ufs/ufshcd.o:ufshcd.c:(.text+0xb8c0): first defined here
   riscv64-linux-ld: drivers/scsi/ufs/ufshpb.o: in function `ufs_hwmon_notify_event':
   ufshpb.c:(.text+0x5240): multiple definition of `ufs_hwmon_notify_event'; drivers/scsi/ufs/ufshcd.o:ufshcd.c:(.text+0xb8c0): first defined here

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34160 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling
  2021-09-12 13:19 ` [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling Avri Altman
  2021-09-12 15:43   ` kernel test robot
@ 2021-09-12 15:55   ` Guenter Roeck
  2021-09-13  6:09     ` Avri Altman
  2021-09-12 16:33   ` kernel test robot
  2021-09-12 16:50   ` kernel test robot
  3 siblings, 1 reply; 14+ messages in thread
From: Guenter Roeck @ 2021-09-12 15:55 UTC (permalink / raw)
  To: Avri Altman, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

On 9/12/21 6:19 AM, Avri Altman wrote:
> The device may notify the host of an extreme temperature by using the
> exception event mechanism. The exception can be raised when the device’s
> Tcase temperature is either too high or too low.
> 
> It is essentially up to the platform to decide what further actions need
> to be taken. leave a placeholder for a designated vop for that.
> 
> Signed-off-by: Avri Altman <avri.altman@wdc.com>
> ---
>   drivers/scsi/ufs/ufs-hwmon.c | 14 ++++++++++++++
>   drivers/scsi/ufs/ufs.h       |  2 ++
>   drivers/scsi/ufs/ufshcd.c    | 21 +++++++++++++++++++++
>   drivers/scsi/ufs/ufshcd.h    |  2 ++
>   4 files changed, 39 insertions(+)
> 
> diff --git a/drivers/scsi/ufs/ufs-hwmon.c b/drivers/scsi/ufs/ufs-hwmon.c
> index a50e83f645f4..b466b4649c21 100644
> --- a/drivers/scsi/ufs/ufs-hwmon.c
> +++ b/drivers/scsi/ufs/ufs-hwmon.c
> @@ -177,3 +177,17 @@ void ufs_hwmon_remove(struct ufs_hba *hba)
>   	hba->hwmon_device = NULL;
>   	kfree(data);
>   }
> +
> +void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask)
> +{
> +	if (!hba->hwmon_device)
> +		return;
> +
> +	if (ee_mask & MASK_EE_TOO_HIGH_TEMP)
> +		hwmon_notify_event(hba->hwmon_device, hwmon_temp,
> +				   hwmon_temp_max_alarm, 0);
> +
> +	if (ee_mask & MASK_EE_TOO_LOW_TEMP)
> +		hwmon_notify_event(hba->hwmon_device, hwmon_temp,
> +				   hwmon_temp_min_alarm, 0);
> +}
> diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
> index 171b27be7b1d..d9bc048c2a71 100644
> --- a/drivers/scsi/ufs/ufs.h
> +++ b/drivers/scsi/ufs/ufs.h
> @@ -377,6 +377,8 @@ enum {
>   	MASK_EE_PERFORMANCE_THROTTLING	= BIT(6),
>   };
>   
> +#define MASK_EE_URGENT_TEMP (MASK_EE_TOO_HIGH_TEMP | MASK_EE_TOO_LOW_TEMP)
> +
>   /* Background operation status */
>   enum bkops_status {
>   	BKOPS_STATUS_NO_OP               = 0x0,
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 90c2e9677435..3f4c7124b74b 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -5642,6 +5642,24 @@ static void ufshcd_bkops_exception_event_handler(struct ufs_hba *hba)
>   				__func__, err);
>   }
>   
> +static void ufshcd_temp_exception_event_handler(struct ufs_hba *hba, u16 status)
> +{
> +	u32 value;
> +
> +	if (ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_READ_ATTR,
> +				QUERY_ATTR_IDN_CASE_ROUGH_TEMP, 0, 0, &value))
> +		return;
> +
> +	dev_info(hba->dev, "exception Tcase %d\n", value - 80);
> +
> +	ufs_hwmon_notify_event(hba, status & MASK_EE_URGENT_TEMP);
> +
> +	/*
> +	 * A placeholder for the platform vendors to add whatever additional
> +	 * steps required
> +	 */
> +}
> +
>   static int __ufshcd_wb_toggle(struct ufs_hba *hba, bool set, enum flag_idn idn)
>   {
>   	u8 index;
> @@ -5821,6 +5839,9 @@ static void ufshcd_exception_event_handler(struct work_struct *work)
>   	if (status & hba->ee_drv_mask & MASK_EE_URGENT_BKOPS)
>   		ufshcd_bkops_exception_event_handler(hba);
>   
> +	if (status & hba->ee_drv_mask & MASK_EE_URGENT_TEMP)
> +		ufshcd_temp_exception_event_handler(hba, status);
> +
>   	ufs_debugfs_exception_event(hba, status);
>   out:
>   	ufshcd_scsi_unblock_requests(hba);
> diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
> index 798a408d71e5..e6abce9a8b00 100644
> --- a/drivers/scsi/ufs/ufshcd.h
> +++ b/drivers/scsi/ufs/ufshcd.h
> @@ -1063,9 +1063,11 @@ static inline u8 ufshcd_wb_get_query_index(struct ufs_hba *hba)
>   #ifdef CONFIG_SCSI_UFS_HWMON
>   void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask);
>   void ufs_hwmon_remove(struct ufs_hba *hba);
> +void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask);
>   #else
>   static inline void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask) {}
>   static inline void ufs_hwmon_remove(struct ufs_hba *hba) {}
> +void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask) {}

static

>   #endif
>   
>   #ifdef CONFIG_PM
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling
  2021-09-12 13:19 ` [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling Avri Altman
  2021-09-12 15:43   ` kernel test robot
  2021-09-12 15:55   ` Guenter Roeck
@ 2021-09-12 16:33   ` kernel test robot
  2021-09-12 16:50   ` kernel test robot
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2021-09-12 16:33 UTC (permalink / raw)
  To: Avri Altman, James E . J . Bottomley, Martin K . Petersen
  Cc: llvm, kbuild-all, linux-scsi, linux-kernel, Bart Van Assche,
	Adrian Hunter, Bean Huo, Guenter Roeck, Avri Altman

[-- Attachment #1: Type: text/plain, Size: 2808 bytes --]

Hi Avri,

I love your patch! Yet something to improve:

[auto build test ERROR on mkp-scsi/for-next]
[also build test ERROR on scsi/for-next next-20210910]
[cannot apply to v5.14]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Avri-Altman/Add-temperature-notification-support/20210912-212110
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: x86_64-randconfig-a001-20210912 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 261cbe98c38f8c1ee1a482fe76511110e790f58a)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2aa54f753f44e119f4629ae015fedb3e12cebac6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Avri-Altman/Add-temperature-notification-support/20210912-212110
        git checkout 2aa54f753f44e119f4629ae015fedb3e12cebac6
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/scsi/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from drivers/scsi/ufs/ufshcd.c:22:
>> drivers/scsi/ufs/ufshcd.h:1070:6: warning: no previous prototype for function 'ufs_hwmon_notify_event' [-Wmissing-prototypes]
   void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask) {}
        ^
   drivers/scsi/ufs/ufshcd.h:1070:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask) {}
   ^
   static 
   1 warning generated.
--
>> ld.lld: error: duplicate symbol: ufs_hwmon_notify_event
   >>> defined at ufshcd.h:1070 (drivers/scsi/ufs/ufshcd.h:1070)
   >>> drivers/scsi/ufs/ufshcd.o:(ufs_hwmon_notify_event)
   >>> defined at ufshcd.h:1070 (drivers/scsi/ufs/ufshcd.h:1070)
   >>> drivers/scsi/ufs/ufs-sysfs.o:(.text+0x0)
--
>> ld.lld: error: duplicate symbol: ufs_hwmon_notify_event
   >>> defined at ufshcd.h:1070 (drivers/scsi/ufs/ufshcd.h:1070)
   >>> drivers/scsi/ufs/ufshcd.o:(ufs_hwmon_notify_event)
   >>> defined at ufshcd.h:1070 (drivers/scsi/ufs/ufshcd.h:1070)
   >>> drivers/scsi/ufs/ufs-debugfs.o:(.text+0x0)

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34068 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling
  2021-09-12 13:19 ` [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling Avri Altman
                     ` (2 preceding siblings ...)
  2021-09-12 16:33   ` kernel test robot
@ 2021-09-12 16:50   ` kernel test robot
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2021-09-12 16:50 UTC (permalink / raw)
  To: Avri Altman, James E . J . Bottomley, Martin K . Petersen
  Cc: kbuild-all, linux-scsi, linux-kernel, Bart Van Assche,
	Adrian Hunter, Bean Huo, Guenter Roeck, Avri Altman

[-- Attachment #1: Type: text/plain, Size: 2337 bytes --]

Hi Avri,

I love your patch! Yet something to improve:

[auto build test ERROR on mkp-scsi/for-next]
[also build test ERROR on scsi/for-next next-20210910]
[cannot apply to v5.14]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Avri-Altman/Add-temperature-notification-support/20210912-212110
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: x86_64-kexec (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/2aa54f753f44e119f4629ae015fedb3e12cebac6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Avri-Altman/Add-temperature-notification-support/20210912-212110
        git checkout 2aa54f753f44e119f4629ae015fedb3e12cebac6
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   ld: drivers/scsi/ufs/ufs-sysfs.o: in function `ufs_hwmon_notify_event':
>> drivers/scsi/ufs/ufshcd.h:1070: multiple definition of `ufs_hwmon_notify_event'; drivers/scsi/ufs/ufshcd.o:drivers/scsi/ufs/ufshcd.h:1070: first defined here
   ld: drivers/scsi/ufs/ufs-debugfs.o: in function `ufs_hwmon_notify_event':
>> drivers/scsi/ufs/ufshcd.h:1070: multiple definition of `ufs_hwmon_notify_event'; drivers/scsi/ufs/ufshcd.o:drivers/scsi/ufs/ufshcd.h:1070: first defined here


vim +1070 drivers/scsi/ufs/ufshcd.h

  1062	
  1063	#ifdef CONFIG_SCSI_UFS_HWMON
  1064	void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask);
  1065	void ufs_hwmon_remove(struct ufs_hba *hba);
  1066	void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask);
  1067	#else
  1068	static inline void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask) {}
  1069	static inline void ufs_hwmon_remove(struct ufs_hba *hba) {}
> 1070	void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask) {}
  1071	#endif
  1072	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28981 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling
  2021-09-12 15:55   ` Guenter Roeck
@ 2021-09-13  6:09     ` Avri Altman
  0 siblings, 0 replies; 14+ messages in thread
From: Avri Altman @ 2021-09-13  6:09 UTC (permalink / raw)
  To: Guenter Roeck, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

> > diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
> > index 798a408d71e5..e6abce9a8b00 100644
> > --- a/drivers/scsi/ufs/ufshcd.h
> > +++ b/drivers/scsi/ufs/ufshcd.h
> > @@ -1063,9 +1063,11 @@ static inline u8
> ufshcd_wb_get_query_index(struct ufs_hba *hba)
> >   #ifdef CONFIG_SCSI_UFS_HWMON
> >   void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask);
> >   void ufs_hwmon_remove(struct ufs_hba *hba);
> > +void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask);
> >   #else
> >   static inline void ufs_hwmon_probe(struct ufs_hba *hba, u8 mask) {}
> >   static inline void ufs_hwmon_remove(struct ufs_hba *hba) {}
> > +void ufs_hwmon_notify_event(struct ufs_hba *hba, u8 ee_mask) {}
> 
> static
Done.

> 
> >   #endif
> >
> >   #ifdef CONFIG_PM
> >


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH v3 1/2] scsi: ufs: Probe for temperature notification support
  2021-09-12 15:24   ` Guenter Roeck
@ 2021-09-13  7:06     ` Avri Altman
  2021-09-13  7:41       ` Guenter Roeck
  0 siblings, 1 reply; 14+ messages in thread
From: Avri Altman @ 2021-09-13  7:06 UTC (permalink / raw)
  To: Guenter Roeck, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

> > +config SCSI_UFS_HWMON
> > +     bool "UFS  Temperature Notification"
> > +     depends on SCSI_UFSHCD && HWMON
> > +     help
> > +       This provides support for UFS hardware monitoring. If enabled,
> > +       a hardware monitoring device will be created for the UFS device.
> > +
> > +       If unsure, say N.
> > +
> 
> git complains about blank line at EOF.
Done.

> 
> > diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
> > index c407da9b5171..966048875b50 100644
> > --- a/drivers/scsi/ufs/Makefile
> > +++ b/drivers/scsi/ufs/Makefile
> > @@ -10,6 +10,7 @@ ufshcd-core-$(CONFIG_SCSI_UFS_BSG)  += ufs_bsg.o
> >   ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO)       += ufshcd-crypto.o
> >   ufshcd-core-$(CONFIG_SCSI_UFS_HPB)  += ufshpb.o
> >   ufshcd-core-$(CONFIG_SCSI_UFS_FAULT_INJECTION) +=
> > ufs-fault-injection.o
> > +ufshcd-core-$(CONFIG_SCSI_UFS_HWMON) += ufs-hwmon.o
> >
> >   obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o
> tc-dwc-g210.o
> >   obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o
> > ufshcd-dwc.o tc-dwc-g210.o diff --git a/drivers/scsi/ufs/ufs-hwmon.c
> > b/drivers/scsi/ufs/ufs-hwmon.c new file mode 100644 index
> > 000000000000..a50e83f645f4
> > --- /dev/null
> > +++ b/drivers/scsi/ufs/ufs-hwmon.c
> > @@ -0,0 +1,179 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * UFS hardware monitoring support
> > + * Copyright (c) 2021, Western Digital Corporation  */
> > +
> > +#include <linux/hwmon.h>
> > +
> > +#include "ufshcd.h"
> > +
> > +struct ufs_hwmon_data {
> > +     struct ufs_hba *hba;
> > +     u8 mask;
> > +};
> > +
> > +static bool ufs_temp_enabled(struct ufs_hba *hba, u8 mask) {
> > +     u32 ee_mask;
> > +
> > +     if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,
> > +                           QUERY_ATTR_IDN_EE_CONTROL, 0, 0, &ee_mask))
> > +             return false;
> > +
> > +     return (mask & ee_mask & MASK_EE_TOO_HIGH_TEMP) ||
> > +             (mask & ee_mask & MASK_EE_TOO_LOW_TEMP); }
> > +
> > +static bool ufs_temp_valid(struct ufs_hba *hba, u8 mask,
> > +                        enum attr_idn idn, u32 value) {
> > +     return (idn == QUERY_ATTR_IDN_CASE_ROUGH_TEMP && value >= 1
> &&
> > +             value <= 250 && ufs_temp_enabled(hba, mask)) ||
> > +           (idn == QUERY_ATTR_IDN_HIGH_TEMP_BOUND && value >= 100 &&
> > +            value <= 250) ||
> > +           (idn == QUERY_ATTR_IDN_LOW_TEMP_BOUND && value >= 1 &&
> > +            value <= 80);
> > +}
> > +
> The value ranges checed above suggest that the temperature is reported in
> degrees C (or maybe degrees C with an offset). 
Yes.  No offset.

>The hwmon API expects
> temperatures to be reported in milli-degrees C, and I don't see a conversion in
> the actual read functions. What does the "sensors" command report ?
I missed that (Although it is well documented) - sorry about that.
I wasn't aware of the sensors command.  I don't have it in my arm64 android platform image (galaxy s21).
Will try to get it.
I was reading the temperature using hwmon sysfs entries, which indicate the correct temperature.
e.g
t2s:/ # ls -la /sys/class/hwmon/
total 0
drwxr-xr-x   2 root root 0 2020-12-20 10:16 .
drwxr-xr-x 104 root root 0 2020-12-19 19:05 ..
lrwxrwxrwx   1 root root 0 2020-12-20 10:16 hwmon0 -> ../../devices/platform/13100000.ufs/hwmon/hwmon0
.....

t2s:/ # cat /sys/class/hwmon/hwmon0/temp1_input
25

Will fix it.  Thanks.

> 
> > +static int ufs_get_temp(struct ufs_hba *hba, u8 mask, enum attr_idn
> > +idn) {
> > +     u32 value;
> > +
> > +     if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR, idn, 0, 0,
> > +         &value))
> 
> checkpatch states that alignment is off, and I am quite sure this fits into one
> line anyway (with the 100-column limit). There are more instances with bad
> alignment according to checkpatch.
I wasn't aware that the Linux Kernel deprecates the 80 Character Line Coding Style.
Will try to make it full 100-characters lines.
I didn't get any alignment complaints from checkpatch.

> 
> Also, ufshcd_query_attr() returns a valid Linux error code. That should be
> returned to the caller and not be replaced. More on that below.
> 
> > +             return 0;
> > +
> > +     if (ufs_temp_valid(hba, mask, idn, value))
> > +             return value - 80;
> > +
> 
> This again suggests that the temperature is not milli-degrees C.
> 
> Is there reason to believe that this validation is necessary ?
> Note that this reports an "error" if the returned temperature value happens to
> have a value of 80. Again, more on that below.
Data->mask holds the temperature related bits in the ufs features register: TOO_LOW_TEMPERATURE and TOO_HIGH_TEMPERATURE.
This is set for the device by the flash vendor and can't be changed by the OEMs.
If the device doesn't support any of that, then hwmon_probe is not even called, see ufshcd_temp_notif_probe.
So data->mask is not 0, and never changes.

When the device returns a 0 temperature value, it means that it is not valid.
The spec say about the Device’s rough package case surface temperature:
"
This value shall be valid when (TOO_HIGH_TEMPERATURE is supported and TOO_HIGH_TEMP_EN is enabled) or 
( TOO_LOW_TEMPERATURE is supported and TOO_LOW_TEMP_EN is enabled ).
0 : Unknown Temperature , 1~250 : ( this value – 80 ) degrees in Celsius. ( i.e., -79 ºC ~ 170 ºC )
Others: Reserved
"
data->mask is not 0, but the temperature exception bits: TOO_HIGH_TEMP_EN and TOO_LOW_TEMP_EN are of type read/volatile,
Meaning it can be written many times, e.g. by debugfs or ufs-utils.

To sum up:
 - yes, checking the temperature against the spec boundaries is useless.
   The device will return 0 if it is not valid.
   ufs_temp_valid() can be removed, and just need to check that the temperature is not 0.

 - The return value of querry_attr is of less interest.
    if it failed or temp == 0, then the temperature is invalid and the proper return value should be -EINVAL.

> 
> > +     return 0;
> > +}
> > +
> > +static int ufs_hwmon_read(struct device *dev, enum hwmon_sensor_types
> type,
> > +                        u32 attr, int channel, long *val) {
> > +     struct ufs_hwmon_data *data = dev_get_drvdata(dev);
> > +     struct ufs_hba *hba = data->hba;
> > +     u8 mask = data->mask;
> > +     int err = 0;
> > +     bool get_temp = true;
> > +
> > +     if (type != hwmon_temp)
> > +             return 0;
> > +
> > +     down(&hba->host_sem);
> > +
> > +     if (!ufshcd_is_user_access_allowed(hba)) {
> > +             up(&hba->host_sem);
> > +             return -EBUSY;
> > +     }
> > +
> > +     ufshcd_rpm_get_sync(hba);
> > +
> > +     switch (attr) {
> > +     case hwmon_temp_enable:
> > +             *val = ufs_temp_enabled(hba, mask);
> > +             get_temp = false;
> > +
> 
> This seems to be read-only, and the mask only affects the limit registers as far
> as I con see. If so, this is wrong: The mask should be used to enable or hide the
> limit attributes as needed. If the mask is 0, and if this means that the current
> temperature is not reported either, the driver should not instantiate at all.
> 
> The "enable" attribute only makes sense if it can be used to actually enable or
> disable a specific sensor, and is not tied to limit attributes but to the actual
> sensor values.
See explanation above.
 Will make it writable as well.

> 
> > +             break;
> > +     case hwmon_temp_max_alarm:
> > +             *val = ufs_get_temp(hba, mask,
> > + QUERY_ATTR_IDN_HIGH_TEMP_BOUND);
> > +
> > +             break;
> > +     case hwmon_temp_min_alarm:
> > +             *val = ufs_get_temp(hba, mask,
> > + QUERY_ATTR_IDN_LOW_TEMP_BOUND);
> > +
> > +             break;
> > +     case hwmon_temp_input:
> > +             *val = ufs_get_temp(hba, mask,
> > + QUERY_ATTR_IDN_CASE_ROUGH_TEMP);
> > +
> If an enable attribute exists and is 0 (disabled), this should return -ENODATA.
> In this case, that would imply that the driver should not be instantiated in the
> first place since it has nothing to report.
See explanation above.
Will fix it so the error value will make more sense.

> 
> > +             break;
> > +     default:
> > +             err = -EOPNOTSUPP;
> > +
> > +             break;
> > +     }
> > +
> > +     ufshcd_rpm_put_sync(hba);
> > +
> > +     up(&hba->host_sem);
> > +
> > +     if (get_temp && !err && *val == 0)
> > +             err = -EINVAL;
> > +
> That is an odd way of detection errors. If it was in the hwmon subsystem, I'd ask
> for the error handling to be moved into the case statements. On top of that,
> interpreting a return value of "0" as error seems wrong.
> ufs_get_temp() returns 0 if the measured temperature or the reported limit
> happens to have a value of 80, and that is perfectly valid. If ufs_get_temp()
> reports an error, it should report that as error.
> 
> Also, EINVAL is "invalid argument", which I am quite sure does not apply here.
Ditto.
EINVAL implies that the temperature is invalid.

> >
> > +static void ufshcd_temp_notif_probe(struct ufs_hba *hba, u8
> > +*desc_buf) {
> > +     struct ufs_dev_info *dev_info = &hba->dev_info;
> > +     u32 ext_ufs_feature;
> > +     u8 mask = 0;
> > +
> > +     if (!(hba->caps & UFSHCD_CAP_TEMP_NOTIF) ||
> > +         dev_info->wspecversion < 0x300)
> 
> I am quite sure this fits a single line.
Done.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 1/2] scsi: ufs: Probe for temperature notification support
  2021-09-13  7:06     ` Avri Altman
@ 2021-09-13  7:41       ` Guenter Roeck
  2021-09-13  7:49         ` Avri Altman
  0 siblings, 1 reply; 14+ messages in thread
From: Guenter Roeck @ 2021-09-13  7:41 UTC (permalink / raw)
  To: Avri Altman, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

On 9/13/21 12:06 AM, Avri Altman wrote:
>>> +config SCSI_UFS_HWMON
>>> +     bool "UFS  Temperature Notification"
>>> +     depends on SCSI_UFSHCD && HWMON
>>> +     help
>>> +       This provides support for UFS hardware monitoring. If enabled,
>>> +       a hardware monitoring device will be created for the UFS device.
>>> +
>>> +       If unsure, say N.
>>> +
>>
>> git complains about blank line at EOF.
> Done.
> 
>>
>>> diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
>>> index c407da9b5171..966048875b50 100644
>>> --- a/drivers/scsi/ufs/Makefile
>>> +++ b/drivers/scsi/ufs/Makefile
>>> @@ -10,6 +10,7 @@ ufshcd-core-$(CONFIG_SCSI_UFS_BSG)  += ufs_bsg.o
>>>    ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO)       += ufshcd-crypto.o
>>>    ufshcd-core-$(CONFIG_SCSI_UFS_HPB)  += ufshpb.o
>>>    ufshcd-core-$(CONFIG_SCSI_UFS_FAULT_INJECTION) +=
>>> ufs-fault-injection.o
>>> +ufshcd-core-$(CONFIG_SCSI_UFS_HWMON) += ufs-hwmon.o
>>>
>>>    obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-dwc.o
>> tc-dwc-g210.o
>>>    obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-pltfrm.o
>>> ufshcd-dwc.o tc-dwc-g210.o diff --git a/drivers/scsi/ufs/ufs-hwmon.c
>>> b/drivers/scsi/ufs/ufs-hwmon.c new file mode 100644 index
>>> 000000000000..a50e83f645f4
>>> --- /dev/null
>>> +++ b/drivers/scsi/ufs/ufs-hwmon.c
>>> @@ -0,0 +1,179 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * UFS hardware monitoring support
>>> + * Copyright (c) 2021, Western Digital Corporation  */
>>> +
>>> +#include <linux/hwmon.h>
>>> +
>>> +#include "ufshcd.h"
>>> +
>>> +struct ufs_hwmon_data {
>>> +     struct ufs_hba *hba;
>>> +     u8 mask;
>>> +};
>>> +
>>> +static bool ufs_temp_enabled(struct ufs_hba *hba, u8 mask) {
>>> +     u32 ee_mask;
>>> +
>>> +     if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,
>>> +                           QUERY_ATTR_IDN_EE_CONTROL, 0, 0, &ee_mask))
>>> +             return false;
>>> +
>>> +     return (mask & ee_mask & MASK_EE_TOO_HIGH_TEMP) ||
>>> +             (mask & ee_mask & MASK_EE_TOO_LOW_TEMP); }
>>> +
>>> +static bool ufs_temp_valid(struct ufs_hba *hba, u8 mask,
>>> +                        enum attr_idn idn, u32 value) {
>>> +     return (idn == QUERY_ATTR_IDN_CASE_ROUGH_TEMP && value >= 1
>> &&
>>> +             value <= 250 && ufs_temp_enabled(hba, mask)) ||
>>> +           (idn == QUERY_ATTR_IDN_HIGH_TEMP_BOUND && value >= 100 &&
>>> +            value <= 250) ||
>>> +           (idn == QUERY_ATTR_IDN_LOW_TEMP_BOUND && value >= 1 &&
>>> +            value <= 80);
>>> +}
>>> +
>> The value ranges checed above suggest that the temperature is reported in
>> degrees C (or maybe degrees C with an offset).
> Yes.  No offset.
> 
>> The hwmon API expects
>> temperatures to be reported in milli-degrees C, and I don't see a conversion in
>> the actual read functions. What does the "sensors" command report ?
> I missed that (Although it is well documented) - sorry about that.
> I wasn't aware of the sensors command.  I don't have it in my arm64 android platform image (galaxy s21).
> Will try to get it.
> I was reading the temperature using hwmon sysfs entries, which indicate the correct temperature.
> e.g
> t2s:/ # ls -la /sys/class/hwmon/
> total 0
> drwxr-xr-x   2 root root 0 2020-12-20 10:16 .
> drwxr-xr-x 104 root root 0 2020-12-19 19:05 ..
> lrwxrwxrwx   1 root root 0 2020-12-20 10:16 hwmon0 -> ../../devices/platform/13100000.ufs/hwmon/hwmon0
> .....
> 
> t2s:/ # cat /sys/class/hwmon/hwmon0/temp1_input
> 25
> 
> Will fix it.  Thanks.
> 
>>
>>> +static int ufs_get_temp(struct ufs_hba *hba, u8 mask, enum attr_idn
>>> +idn) {
>>> +     u32 value;
>>> +
>>> +     if (ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR, idn, 0, 0,
>>> +         &value))
>>
>> checkpatch states that alignment is off, and I am quite sure this fits into one
>> line anyway (with the 100-column limit). There are more instances with bad
>> alignment according to checkpatch.
> I wasn't aware that the Linux Kernel deprecates the 80 Character Line Coding Style.
> Will try to make it full 100-characters lines.
> I didn't get any alignment complaints from checkpatch.
> 
>>
>> Also, ufshcd_query_attr() returns a valid Linux error code. That should be
>> returned to the caller and not be replaced. More on that below.
>>
>>> +             return 0;
>>> +
>>> +     if (ufs_temp_valid(hba, mask, idn, value))
>>> +             return value - 80;
>>> +
>>
>> This again suggests that the temperature is not milli-degrees C.
>>
>> Is there reason to believe that this validation is necessary ?
>> Note that this reports an "error" if the returned temperature value happens to
>> have a value of 80. Again, more on that below.
> Data->mask holds the temperature related bits in the ufs features register: TOO_LOW_TEMPERATURE and TOO_HIGH_TEMPERATURE.
> This is set for the device by the flash vendor and can't be changed by the OEMs.
> If the device doesn't support any of that, then hwmon_probe is not even called, see ufshcd_temp_notif_probe.
> So data->mask is not 0, and never changes.
> 
> When the device returns a 0 temperature value, it means that it is not valid.
> The spec say about the Device’s rough package case surface temperature:
> "
> This value shall be valid when (TOO_HIGH_TEMPERATURE is supported and TOO_HIGH_TEMP_EN is enabled) or
> ( TOO_LOW_TEMPERATURE is supported and TOO_LOW_TEMP_EN is enabled ).
> 0 : Unknown Temperature , 1~250 : ( this value – 80 ) degrees in Celsius. ( i.e., -79 ºC ~ 170 ºC )
> Others: Reserved
> "
> data->mask is not 0, but the temperature exception bits: TOO_HIGH_TEMP_EN and TOO_LOW_TEMP_EN are of type read/volatile,
> Meaning it can be written many times, e.g. by debugfs or ufs-utils.
> 
> To sum up:
>   - yes, checking the temperature against the spec boundaries is useless.
>     The device will return 0 if it is not valid.
>     ufs_temp_valid() can be removed, and just need to check that the temperature is not 0.
> 
>   - The return value of querry_attr is of less interest.
>      if it failed or temp == 0, then the temperature is invalid and the proper return value should be -EINVAL.
> 
>>
>>> +     return 0;
>>> +}
>>> +
>>> +static int ufs_hwmon_read(struct device *dev, enum hwmon_sensor_types
>> type,
>>> +                        u32 attr, int channel, long *val) {
>>> +     struct ufs_hwmon_data *data = dev_get_drvdata(dev);
>>> +     struct ufs_hba *hba = data->hba;
>>> +     u8 mask = data->mask;
>>> +     int err = 0;
>>> +     bool get_temp = true;
>>> +
>>> +     if (type != hwmon_temp)
>>> +             return 0;
>>> +
>>> +     down(&hba->host_sem);
>>> +
>>> +     if (!ufshcd_is_user_access_allowed(hba)) {
>>> +             up(&hba->host_sem);
>>> +             return -EBUSY;
>>> +     }
>>> +
>>> +     ufshcd_rpm_get_sync(hba);
>>> +
>>> +     switch (attr) {
>>> +     case hwmon_temp_enable:
>>> +             *val = ufs_temp_enabled(hba, mask);
>>> +             get_temp = false;
>>> +
>>
>> This seems to be read-only, and the mask only affects the limit registers as far
>> as I con see. If so, this is wrong: The mask should be used to enable or hide the
>> limit attributes as needed. If the mask is 0, and if this means that the current
>> temperature is not reported either, the driver should not instantiate at all.
>>
>> The "enable" attribute only makes sense if it can be used to actually enable or
>> disable a specific sensor, and is not tied to limit attributes but to the actual
>> sensor values.
> See explanation above.
>   Will make it writable as well.
> 

That only makes sense if the information is passed to the chip. What is going
to happen if the user writes 0 into the attribute ?

Guenter

>>
>>> +             break;
>>> +     case hwmon_temp_max_alarm:
>>> +             *val = ufs_get_temp(hba, mask,
>>> + QUERY_ATTR_IDN_HIGH_TEMP_BOUND);
>>> +
>>> +             break;
>>> +     case hwmon_temp_min_alarm:
>>> +             *val = ufs_get_temp(hba, mask,
>>> + QUERY_ATTR_IDN_LOW_TEMP_BOUND);
>>> +
>>> +             break;
>>> +     case hwmon_temp_input:
>>> +             *val = ufs_get_temp(hba, mask,
>>> + QUERY_ATTR_IDN_CASE_ROUGH_TEMP);
>>> +
>> If an enable attribute exists and is 0 (disabled), this should return -ENODATA.
>> In this case, that would imply that the driver should not be instantiated in the
>> first place since it has nothing to report.
> See explanation above.
> Will fix it so the error value will make more sense.
> 
>>
>>> +             break;
>>> +     default:
>>> +             err = -EOPNOTSUPP;
>>> +
>>> +             break;
>>> +     }
>>> +
>>> +     ufshcd_rpm_put_sync(hba);
>>> +
>>> +     up(&hba->host_sem);
>>> +
>>> +     if (get_temp && !err && *val == 0)
>>> +             err = -EINVAL;
>>> +
>> That is an odd way of detection errors. If it was in the hwmon subsystem, I'd ask
>> for the error handling to be moved into the case statements. On top of that,
>> interpreting a return value of "0" as error seems wrong.
>> ufs_get_temp() returns 0 if the measured temperature or the reported limit
>> happens to have a value of 80, and that is perfectly valid. If ufs_get_temp()
>> reports an error, it should report that as error.
>>
>> Also, EINVAL is "invalid argument", which I am quite sure does not apply here.
> Ditto.
> EINVAL implies that the temperature is invalid.
> 
>>>
>>> +static void ufshcd_temp_notif_probe(struct ufs_hba *hba, u8
>>> +*desc_buf) {
>>> +     struct ufs_dev_info *dev_info = &hba->dev_info;
>>> +     u32 ext_ufs_feature;
>>> +     u8 mask = 0;
>>> +
>>> +     if (!(hba->caps & UFSHCD_CAP_TEMP_NOTIF) ||
>>> +         dev_info->wspecversion < 0x300)
>>
>> I am quite sure this fits a single line.
> Done.
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH v3 1/2] scsi: ufs: Probe for temperature notification support
  2021-09-13  7:41       ` Guenter Roeck
@ 2021-09-13  7:49         ` Avri Altman
  2021-09-13 14:24           ` Guenter Roeck
  0 siblings, 1 reply; 14+ messages in thread
From: Avri Altman @ 2021-09-13  7:49 UTC (permalink / raw)
  To: Guenter Roeck, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

> >> The "enable" attribute only makes sense if it can be used to actually
> >> enable or disable a specific sensor, and is not tied to limit
> >> attributes but to the actual sensor values.
> > See explanation above.
> >   Will make it writable as well.
> >
> 
> That only makes sense if the information is passed to the chip. What is going to
> happen if the user writes 0 into the attribute ?
Will turn off the temperature exception bits, so that Tcase is no longer valid,
and the device will always return Tcase = 0.

> Guenter

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 1/2] scsi: ufs: Probe for temperature notification support
  2021-09-13  7:49         ` Avri Altman
@ 2021-09-13 14:24           ` Guenter Roeck
  2021-09-13 15:26             ` Avri Altman
  0 siblings, 1 reply; 14+ messages in thread
From: Guenter Roeck @ 2021-09-13 14:24 UTC (permalink / raw)
  To: Avri Altman, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

On 9/13/21 12:49 AM, Avri Altman wrote:
>>>> The "enable" attribute only makes sense if it can be used to actually
>>>> enable or disable a specific sensor, and is not tied to limit
>>>> attributes but to the actual sensor values.
>>> See explanation above.
>>>    Will make it writable as well.
>>>
>>
>> That only makes sense if the information is passed to the chip. What is going to
>> happen if the user writes 0 into the attribute ?
> Will turn off the temperature exception bits, so that Tcase is no longer valid,
> and the device will always return Tcase = 0.
> 

Ok. Then attempts to read the temperature should return -ENODATA, not -EINVAL,
if Tcase == 0.

Guenter

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH v3 1/2] scsi: ufs: Probe for temperature notification support
  2021-09-13 14:24           ` Guenter Roeck
@ 2021-09-13 15:26             ` Avri Altman
  0 siblings, 0 replies; 14+ messages in thread
From: Avri Altman @ 2021-09-13 15:26 UTC (permalink / raw)
  To: Guenter Roeck, James E . J . Bottomley, Martin K . Petersen
  Cc: linux-scsi, linux-kernel, Bart Van Assche, Adrian Hunter, Bean Huo

 
> On 9/13/21 12:49 AM, Avri Altman wrote:
> >>>> The "enable" attribute only makes sense if it can be used to
> >>>> actually enable or disable a specific sensor, and is not tied to
> >>>> limit attributes but to the actual sensor values.
> >>> See explanation above.
> >>>    Will make it writable as well.
> >>>
> >>
> >> That only makes sense if the information is passed to the chip. What
> >> is going to happen if the user writes 0 into the attribute ?
> > Will turn off the temperature exception bits, so that Tcase is no
> > longer valid, and the device will always return Tcase = 0.
> >
> 
> Ok. Then attempts to read the temperature should return -ENODATA, not -
> EINVAL, if Tcase == 0.
OK. Thanks,

Avri

> 
> Guenter

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-09-13 15:26 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-12 13:19 [PATCH v3 0/2] Add temperature notification support Avri Altman
2021-09-12 13:19 ` [PATCH v3 1/2] scsi: ufs: Probe for " Avri Altman
2021-09-12 15:24   ` Guenter Roeck
2021-09-13  7:06     ` Avri Altman
2021-09-13  7:41       ` Guenter Roeck
2021-09-13  7:49         ` Avri Altman
2021-09-13 14:24           ` Guenter Roeck
2021-09-13 15:26             ` Avri Altman
2021-09-12 13:19 ` [PATCH v3 2/2] scsi: ufs: Add temperature notification exception handling Avri Altman
2021-09-12 15:43   ` kernel test robot
2021-09-12 15:55   ` Guenter Roeck
2021-09-13  6:09     ` Avri Altman
2021-09-12 16:33   ` kernel test robot
2021-09-12 16:50   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).