LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2 0/3] spi: A better solution for cros_ec_spi reliability
@ 2019-05-13 20:18 Douglas Anderson
  2019-05-13 20:18 ` [PATCH v2 1/3] spi: Allow SPI devices to force transfers on a realtime thread Douglas Anderson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Douglas Anderson @ 2019-05-13 20:18 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel, linux-spi

This series is a much better solution for getting the Chrome OS EC to
talk reliably and replaces commit 37a186225a0c ("platform/chrome:
cros_ec_spi: Transfer messages at high priority").

Specifically note that even though the above commit made things
better, we still saw some failures.

The majority of these failures were because we were competing for time
with dm-crypt which also scheduled work on HIGHPRI workqueues.  While
we can consider reverting the change that made dm-crypt run its work
at HIGHPRI, the argument in commit a1b89132dc4f ("dm crypt: use
WQ_HIGHPRI for the IO and crypt workqueues") is somewhat compelling.
It does make sense for IO to be scheduled at a priority that's higher
than the default user priority.

It turns out that we could also see problems because loop devices also
run at high priority.  See the set_user_nice() in
loop_prepare_queue().

Looking in more detail, it can be seen that the high priority
workqueue isn't actually that high of a priority.  It runs at MIN_NICE
which is _fairly_ high priority but still below all real time
priority.  We can do better by using realtime priority.  That makes
sense because cros_ec_spi actually needs to run quickly for
correctness.  As I understand this is exactly what real time priority
is for.  Note that there is a discussion going on about the dm-crypt
priority [1].

We also had other problems with the previous patch because sometimes
we'd end up on the SPI pumping thread and had our priority downgraded.

Both the competition with other high priority things and the priority
downgrading are fixed by this new series.

After this series I can run the following test on Chrome OS (which
mounts /var as stateful encrypted) with no errors:
  dd if=/dev/zero of=/var/log/foo.txt bs=4M count=512&
  while true; do
    ectool version > /dev/null;
  done

Special thanks to Guenter Roeck for pointing out the "realtime"
feature of the SPI framework so I didn't re-invent the wheel.  I have
no idea how I missed it.  :-/

Also note: if you want some history on investigation done here, feel
free to peruse the Chrome OS bug [2].

[1] https://lkml.kernel.org/r/CAD=FV=VOAjgdrvkK8YKPP-8zqwPpo39rA43JH2BCeYLB0UkgAQ@mail.gmail.com
[2] https://crbug.com/948742

Changes in v2:
- Now only force transfers to the thread for devices that want it.
- Squashed patch #1 and #2 together.
- Renamed variable to "force_rt_transfers".

Douglas Anderson (3):
  spi: Allow SPI devices to force transfers on a realtime thread
  platform/chrome: cros_ec_spi: Force transfers to realtime priority
  Revert "platform/chrome: cros_ec_spi: Transfer messages at high
    priority"

 drivers/platform/chrome/cros_ec_spi.c | 81 +++------------------------
 drivers/spi/spi.c                     | 49 +++++++++++++---
 include/linux/spi/spi.h               |  5 ++
 3 files changed, 52 insertions(+), 83 deletions(-)

-- 
2.21.0.1020.gf2820cf01a-goog


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 1/3] spi: Allow SPI devices to force transfers on a realtime thread
  2019-05-13 20:18 [PATCH v2 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
@ 2019-05-13 20:18 ` Douglas Anderson
  2019-05-13 20:18 ` [PATCH v2 2/3] platform/chrome: cros_ec_spi: Force transfers to realtime priority Douglas Anderson
  2019-05-13 20:18 ` [PATCH v2 3/3] Revert "platform/chrome: cros_ec_spi: Transfer messages at high priority" Douglas Anderson
  2 siblings, 0 replies; 4+ messages in thread
From: Douglas Anderson @ 2019-05-13 20:18 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel, linux-spi

For communication with some SPI devices it may be necessary (for
correctness) not to get interrupted during a transfer.  One example is
the EC (Embedded Controller) on Chromebooks.  The Chrome OS EC will
drop a transfer if more than ~8ms passes between the chip select being
asserted and the transfer finishing.

The best way to make transfers to devices like this succeed is to run
the transfers at realtime priority.  ...but, since there is no easy
way in the kernel to just temporarily bump the priority of the current
thread the current best way to do this is to schedule work on another
thread and give that thread a boosted priority.

The SPI framework already has another thread and, in fact, that other
thread is already made realtime priority in some cases.  Let's add an
ability for a SPI device driver to request that this thread always be
used and that it's realtime priority.

NOTE: Forcing all transfers to the message pumping thread will add
extra overhead to each transfer.  On some systems this may lower your
overall bus throughput and may increase CPU utilization.  You only
want to use this feature when it's important that a transfer not get
interrupt once started.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
Another option is to change this patch to allow a SPI device driver to
request that the pumping thread be realtime but not to _force_ all
messages onto that thread.  In the case of cros_ec this means a bunch
of code duplication (we need to create a thread to do basically the
same thing as the SPI core) but that would make sense if the SPI
framework doesn't expect other use cases like this and doesn't want to
carry the baggage in the SPI core.

Changes in v2:
- Now only force transfers to the thread for devices that want it.
- Squashed patch #1 and #2 together.
- Renamed variable to "force_rt_transfers".

 drivers/spi/spi.c       | 49 +++++++++++++++++++++++++++++++++--------
 include/linux/spi/spi.h |  5 +++++
 2 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 8eb7460dd744..911961bb230d 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1217,6 +1217,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer);
  */
 static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 {
+	struct spi_message *msg;
 	unsigned long flags;
 	bool was_busy = false;
 	int ret;
@@ -1276,10 +1277,16 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		return;
 	}
 
-	/* Extract head of queue */
-	ctlr->cur_msg =
-		list_first_entry(&ctlr->queue, struct spi_message, queue);
+	/* If device for this message needs a realtime thread; queue it */
+	msg = list_first_entry(&ctlr->queue, struct spi_message, queue);
+	if (msg->spi->force_rt_transfers & !in_kthread) {
+		kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);
+		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
+		return;
+	}
 
+	/* Extract head of queue */
+	ctlr->cur_msg = msg;
 	list_del_init(&ctlr->cur_msg->queue);
 	if (ctlr->busy)
 		was_busy = true;
@@ -1364,10 +1371,32 @@ static void spi_pump_messages(struct kthread_work *work)
 	__spi_pump_messages(ctlr, true);
 }
 
-static int spi_init_queue(struct spi_controller *ctlr)
+/**
+ * spi_set_thread_rt - set the controller to pump at realtime priority
+ * @ctlr: controller to boost priority of
+ *
+ * This can be called because the controller requested realtime priority
+ * (by setting the ->rt value before calling spi_register_controller()) or
+ * because a device on the bus said that its transfers needed realtime
+ * priority.
+ *
+ * NOTE: at the moment if any device on a bus says it needs realtime then
+ * the thread will be at realtime priority for all transfers on that
+ * controller.  If this eventually becomes a problem we may see if we can
+ * find a way to boost the priority only temporarily during relevant
+ * transfers.
+ */
+static void spi_set_thread_rt(struct spi_controller *ctlr)
 {
 	struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
 
+	dev_info(&ctlr->dev,
+		"will run message pump with realtime priority\n");
+	sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
+}
+
+static int spi_init_queue(struct spi_controller *ctlr)
+{
 	ctlr->running = false;
 	ctlr->busy = false;
 
@@ -1387,11 +1416,8 @@ static int spi_init_queue(struct spi_controller *ctlr)
 	 * request and the scheduling of the message pump thread. Without this
 	 * setting the message pump thread will remain at default priority.
 	 */
-	if (ctlr->rt) {
-		dev_info(&ctlr->dev,
-			"will run message pump with realtime priority\n");
-		sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
-	}
+	if (ctlr->rt)
+		spi_set_thread_rt(ctlr);
 
 	return 0;
 }
@@ -2982,6 +3008,11 @@ int spi_setup(struct spi_device *spi)
 
 	spi_set_cs(spi, false);
 
+	if (spi->force_rt_transfers && !spi->controller->rt) {
+		spi->controller->rt = true;
+		spi_set_thread_rt(spi->controller);
+	}
+
 	dev_dbg(&spi->dev, "setup mode %d, %s%s%s%s%u bits/w, %u Hz max --> %d\n",
 			(int) (spi->mode & (SPI_CPOL | SPI_CPHA)),
 			(spi->mode & SPI_CS_HIGH) ? "cs_high, " : "",
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 053abd22ad31..4d6ea3366480 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -109,6 +109,10 @@ void spi_statistics_add_transfer_stats(struct spi_statistics *stats,
  *	This may be changed by the device's driver, or left at the
  *	default (0) indicating protocol words are eight bit bytes.
  *	The spi_transfer.bits_per_word can override this for each transfer.
+ * @force_rt_transfers: Transfers for this device should always be done
+ *	at realtime priority. NOTE that this might add some latency in
+ *	starting the transfer (since we may need to context switch) but
+ *	once the transfer starts it will be done at high priority.
  * @irq: Negative, or the number passed to request_irq() to receive
  *	interrupts from this device.
  * @controller_state: Controller's runtime state
@@ -143,6 +147,7 @@ struct spi_device {
 	u32			max_speed_hz;
 	u8			chip_select;
 	u8			bits_per_word;
+	bool			force_rt_transfers;
 	u32			mode;
 #define	SPI_CPHA	0x01			/* clock phase */
 #define	SPI_CPOL	0x02			/* clock polarity */
-- 
2.21.0.1020.gf2820cf01a-goog


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 2/3] platform/chrome: cros_ec_spi: Force transfers to realtime priority
  2019-05-13 20:18 [PATCH v2 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
  2019-05-13 20:18 ` [PATCH v2 1/3] spi: Allow SPI devices to force transfers on a realtime thread Douglas Anderson
@ 2019-05-13 20:18 ` Douglas Anderson
  2019-05-13 20:18 ` [PATCH v2 3/3] Revert "platform/chrome: cros_ec_spi: Transfer messages at high priority" Douglas Anderson
  2 siblings, 0 replies; 4+ messages in thread
From: Douglas Anderson @ 2019-05-13 20:18 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel

All currently known ECs in the wild are very sensitive to timing.
Specifically the ECs are known to drop a transfer if more than 8 ms
passes from the assertion of the chip select until the transfer
finishes.

Let's use the new feature introduced in the patch ("spi: Allow SPI
devices to force transfers on a realtime thread") to specify this and
increase the success rate of our transfers.

NOTE: if future Chrome OS ECs ever fix themselves to be less sensitive
then we could consider adding a property (or compatible string) to not
set this property.  For now we need it across the board.

With this change we can revert the commit 37a186225a0c
("platform/chrome: cros_ec_spi: Transfer messages at high priority").
...and, in fact, transfers are _even more_ reliable than they were
with that commit since the SPI framework will use a higher priority
(realtime) and we no longer lose our priority when we get shunted over
to the message pumping thread (because we now always get shunted and
the thread is high priority).

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
---

Changes in v2:
- Renamed variable to "force_rt_transfers".

 drivers/platform/chrome/cros_ec_spi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c
index 8e9451720e73..a2959365a870 100644
--- a/drivers/platform/chrome/cros_ec_spi.c
+++ b/drivers/platform/chrome/cros_ec_spi.c
@@ -703,6 +703,7 @@ static int cros_ec_spi_probe(struct spi_device *spi)
 
 	spi->bits_per_word = 8;
 	spi->mode = SPI_MODE_0;
+	spi->force_rt_transfers = true;
 	err = spi_setup(spi);
 	if (err < 0)
 		return err;
-- 
2.21.0.1020.gf2820cf01a-goog


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 3/3] Revert "platform/chrome: cros_ec_spi: Transfer messages at high priority"
  2019-05-13 20:18 [PATCH v2 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
  2019-05-13 20:18 ` [PATCH v2 1/3] spi: Allow SPI devices to force transfers on a realtime thread Douglas Anderson
  2019-05-13 20:18 ` [PATCH v2 2/3] platform/chrome: cros_ec_spi: Force transfers to realtime priority Douglas Anderson
@ 2019-05-13 20:18 ` Douglas Anderson
  2 siblings, 0 replies; 4+ messages in thread
From: Douglas Anderson @ 2019-05-13 20:18 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel

This reverts commit 37a186225a0c020516bafad2727fdcdfc039a1e4.

We have a better solution in the patch ("platform/chrome: cros_ec_spi:
Force transfers to realtime priority").  Let's revert the uglier and
less reliable solution.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
---

Changes in v2: None

 drivers/platform/chrome/cros_ec_spi.c | 80 ++-------------------------
 1 file changed, 6 insertions(+), 74 deletions(-)

diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c
index a2959365a870..7adaf534eb8b 100644
--- a/drivers/platform/chrome/cros_ec_spi.c
+++ b/drivers/platform/chrome/cros_ec_spi.c
@@ -75,27 +75,6 @@ struct cros_ec_spi {
 	unsigned int end_of_msg_delay;
 };
 
-typedef int (*cros_ec_xfer_fn_t) (struct cros_ec_device *ec_dev,
-				  struct cros_ec_command *ec_msg);
-
-/**
- * struct cros_ec_xfer_work_params - params for our high priority workers
- *
- * @work: The work_struct needed to queue work
- * @fn: The function to use to transfer
- * @ec_dev: ChromeOS EC device
- * @ec_msg: Message to transfer
- * @ret: The return value of the function
- */
-
-struct cros_ec_xfer_work_params {
-	struct work_struct work;
-	cros_ec_xfer_fn_t fn;
-	struct cros_ec_device *ec_dev;
-	struct cros_ec_command *ec_msg;
-	int ret;
-};
-
 static void debug_packet(struct device *dev, const char *name, u8 *ptr,
 			 int len)
 {
@@ -371,13 +350,13 @@ static int cros_ec_spi_receive_response(struct cros_ec_device *ec_dev,
 }
 
 /**
- * do_cros_ec_pkt_xfer_spi - Transfer a packet over SPI and receive the reply
+ * cros_ec_pkt_xfer_spi - Transfer a packet over SPI and receive the reply
  *
  * @ec_dev: ChromeOS EC device
  * @ec_msg: Message to transfer
  */
-static int do_cros_ec_pkt_xfer_spi(struct cros_ec_device *ec_dev,
-				   struct cros_ec_command *ec_msg)
+static int cros_ec_pkt_xfer_spi(struct cros_ec_device *ec_dev,
+				struct cros_ec_command *ec_msg)
 {
 	struct ec_host_response *response;
 	struct cros_ec_spi *ec_spi = ec_dev->priv;
@@ -514,13 +493,13 @@ static int do_cros_ec_pkt_xfer_spi(struct cros_ec_device *ec_dev,
 }
 
 /**
- * do_cros_ec_cmd_xfer_spi - Transfer a message over SPI and receive the reply
+ * cros_ec_cmd_xfer_spi - Transfer a message over SPI and receive the reply
  *
  * @ec_dev: ChromeOS EC device
  * @ec_msg: Message to transfer
  */
-static int do_cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
-				   struct cros_ec_command *ec_msg)
+static int cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
+				struct cros_ec_command *ec_msg)
 {
 	struct cros_ec_spi *ec_spi = ec_dev->priv;
 	struct spi_transfer trans;
@@ -632,53 +611,6 @@ static int do_cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
 	return ret;
 }
 
-static void cros_ec_xfer_high_pri_work(struct work_struct *work)
-{
-	struct cros_ec_xfer_work_params *params;
-
-	params = container_of(work, struct cros_ec_xfer_work_params, work);
-	params->ret = params->fn(params->ec_dev, params->ec_msg);
-}
-
-static int cros_ec_xfer_high_pri(struct cros_ec_device *ec_dev,
-				 struct cros_ec_command *ec_msg,
-				 cros_ec_xfer_fn_t fn)
-{
-	struct cros_ec_xfer_work_params params;
-
-	INIT_WORK_ONSTACK(&params.work, cros_ec_xfer_high_pri_work);
-	params.ec_dev = ec_dev;
-	params.ec_msg = ec_msg;
-	params.fn = fn;
-
-	/*
-	 * This looks a bit ridiculous.  Why do the work on a
-	 * different thread if we're just going to block waiting for
-	 * the thread to finish?  The key here is that the thread is
-	 * running at high priority but the calling context might not
-	 * be.  We need to be at high priority to avoid getting
-	 * context switched out for too long and the EC giving up on
-	 * the transfer.
-	 */
-	queue_work(system_highpri_wq, &params.work);
-	flush_work(&params.work);
-	destroy_work_on_stack(&params.work);
-
-	return params.ret;
-}
-
-static int cros_ec_pkt_xfer_spi(struct cros_ec_device *ec_dev,
-				struct cros_ec_command *ec_msg)
-{
-	return cros_ec_xfer_high_pri(ec_dev, ec_msg, do_cros_ec_pkt_xfer_spi);
-}
-
-static int cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
-				struct cros_ec_command *ec_msg)
-{
-	return cros_ec_xfer_high_pri(ec_dev, ec_msg, do_cros_ec_cmd_xfer_spi);
-}
-
 static void cros_ec_spi_dt_probe(struct cros_ec_spi *ec_spi, struct device *dev)
 {
 	struct device_node *np = dev->of_node;
-- 
2.21.0.1020.gf2820cf01a-goog


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-13 20:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-13 20:18 [PATCH v2 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
2019-05-13 20:18 ` [PATCH v2 1/3] spi: Allow SPI devices to force transfers on a realtime thread Douglas Anderson
2019-05-13 20:18 ` [PATCH v2 2/3] platform/chrome: cros_ec_spi: Force transfers to realtime priority Douglas Anderson
2019-05-13 20:18 ` [PATCH v2 3/3] Revert "platform/chrome: cros_ec_spi: Transfer messages at high priority" Douglas Anderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).