LKML Archive on lore.kernel.org help / color / mirror / Atom feed
From: Sudeep Holla <sudeep.holla@arm.com> To: Cristian Marussi <cristian.marussi@arm.com> Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, virtualization@lists.linux-foundation.org, virtio-dev@lists.oasis-open.org, james.quinlan@broadcom.com, Jonathan.Cameron@Huawei.com, f.fainelli@gmail.com, etienne.carriere@linaro.org, vincent.guittot@linaro.org, souvik.chakravarty@arm.com, igor.skalkin@opensynergy.com, peter.hilber@opensynergy.com, alex.bennee@linaro.org, jean-philippe@linaro.org, mikhail.golubev@opensynergy.com, anton.yakovlev@opensynergy.com, Vasyl.Vavrychuk@opensynergy.com, Sudeep Holla <sudeep.holla@arm.com>, Andriy.Tryshnivskyy@opensynergy.com Subject: Re: [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages Date: Mon, 2 Aug 2021 11:10:32 +0100 [thread overview] Message-ID: <20210802101032.ozlidylogmdt2zqu@bogus> (raw) In-Reply-To: <20210712141833.6628-8-cristian.marussi@arm.com> On Mon, Jul 12, 2021 at 03:18:23PM +0100, Cristian Marussi wrote: > Even though in case of asynchronous commands an SCMI platform server is Drop the term "server" > constrained to emit the delayed response message only after the related > message response has been sent, the configured underlying transport could > still deliver such messages together or in inverted order, causing races > due to the concurrent or out-of-order access to the underlying xfer. > > Introduce a mechanism to grant exclusive access to an xfer in order to > properly serialize concurrent accesses to the same xfer originating from > multiple correlated messages. > > Add additional state information to xfer descriptors so as to be able to > identify out-of-order message deliveries and act accordingly: > > - when a delayed response is expected but delivered before the related > response, the synchronous response is considered as successfully > received and the delayed response processing is carried on as usual. > > - when/if the missing synchronous response is subsequently received, it > is discarded as not congruent with the current state of the xfer, or > simply, because the xfer has been already released and so, now, the > monotonically increasing sequence number carried by the late response > is stale. > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> > --- > v5 --> v6 > - added spinlock comment > --- > drivers/firmware/arm_scmi/common.h | 18 ++- > drivers/firmware/arm_scmi/driver.c | 229 ++++++++++++++++++++++++----- > 2 files changed, 212 insertions(+), 35 deletions(-) > > diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h > index 2233d0a188fc..9efebe1406d2 100644 > --- a/drivers/firmware/arm_scmi/common.h > +++ b/drivers/firmware/arm_scmi/common.h > @@ -19,6 +19,7 @@ > #include <linux/module.h> > #include <linux/refcount.h> > #include <linux/scmi_protocol.h> > +#include <linux/spinlock.h> > #include <linux/types.h> > > #include <asm/unaligned.h> > @@ -145,6 +146,13 @@ struct scmi_msg { > * @pending: True for xfers added to @pending_xfers hashtable > * @node: An hlist_node reference used to store this xfer, alternatively, on > * the free list @free_xfers or in the @pending_xfers hashtable > + * @busy: An atomic flag to ensure exclusive write access to this xfer > + * @state: The current state of this transfer, with states transitions deemed > + * valid being: > + * - SCMI_XFER_SENT_OK -> SCMI_XFER_RESP_OK [ -> SCMI_XFER_DRESP_OK ] > + * - SCMI_XFER_SENT_OK -> SCMI_XFER_DRESP_OK > + * (Missing synchronous response is assumed OK and ignored) > + * @lock: A spinlock to protect state and busy fields. > */ > struct scmi_xfer { > int transfer_id; > @@ -156,6 +164,15 @@ struct scmi_xfer { > refcount_t users; > bool pending; > struct hlist_node node; > +#define SCMI_XFER_FREE 0 > +#define SCMI_XFER_BUSY 1 > + atomic_t busy; > +#define SCMI_XFER_SENT_OK 0 > +#define SCMI_XFER_RESP_OK 1 > +#define SCMI_XFER_DRESP_OK 2 > + int state; > + /* A lock to protect state and busy fields */ > + spinlock_t lock; > }; > > /* > @@ -392,5 +409,4 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem, > void scmi_notification_instance_data_set(const struct scmi_handle *handle, > void *priv); > void *scmi_notification_instance_data_get(const struct scmi_handle *handle); > - > #endif /* _SCMI_COMMON_H */ > diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c > index 245ede223302..5ef33d692670 100644 > --- a/drivers/firmware/arm_scmi/driver.c > +++ b/drivers/firmware/arm_scmi/driver.c > @@ -369,6 +369,7 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle, > > if (!IS_ERR(xfer)) { > refcount_set(&xfer->users, 1); > + atomic_set(&xfer->busy, SCMI_XFER_FREE); > xfer->transfer_id = atomic_inc_return(&transfer_last_id); > } > spin_unlock_irqrestore(&minfo->xfer_lock, flags); > @@ -430,6 +431,168 @@ scmi_xfer_lookup_unlocked(struct scmi_xfers_info *minfo, u16 xfer_id) > return xfer ?: ERR_PTR(-EINVAL); > } > > +/** > + * scmi_msg_response_validate - Validate message type against state of related > + * xfer > + * > + * @cinfo: A reference to the channel descriptor. > + * @msg_type: Message type to check > + * @xfer: A reference to the xfer to validate against @msg_type > + * > + * This function checks if @msg_type is congruent with the current state of > + * a pending @xfer; if an asynchronous delayed response is received before the > + * related synchronous response (Out-of-Order Delayed Response) the missing > + * synchronous response is assumed to be OK and completed, carrying on with the > + * Delayed Response: this is done to address the case in which the underlying > + * SCMI transport can deliver such out-of-order responses. > + * > + * Context: Assumes to be called with xfer->lock already acquired. > + * > + * Return: 0 on Success, error otherwise > + */ > +static inline int scmi_msg_response_validate(struct scmi_chan_info *cinfo, > + u8 msg_type, > + struct scmi_xfer *xfer) > +{ > + /* > + * Even if a response was indeed expected on this slot at this point, > + * a buggy platform could wrongly reply feeding us an unexpected > + * delayed response we're not prepared to handle: bail-out safely > + * blaming firmware. > + */ > + if (msg_type == MSG_TYPE_DELAYED_RESP && !xfer->async_done) { > + dev_err(cinfo->dev, > + "Delayed Response for %d not expected! Buggy F/W ?\n", > + xfer->hdr.seq); > + return -EINVAL; > + } > + > + switch (xfer->state) { > + case SCMI_XFER_SENT_OK: > + if (msg_type == MSG_TYPE_DELAYED_RESP) { > + /* > + * Delayed Response expected but delivered earlier. > + * Assume message RESPONSE was OK and skip state. > + */ > + xfer->hdr.status = SCMI_SUCCESS; > + xfer->state = SCMI_XFER_RESP_OK; > + complete(&xfer->done); > + dev_warn(cinfo->dev, > + "Received valid OoO Delayed Response for %d\n", > + xfer->hdr.seq); > + } > + break; > + case SCMI_XFER_RESP_OK: > + if (msg_type != MSG_TYPE_DELAYED_RESP) > + return -EINVAL; > + break; > + case SCMI_XFER_DRESP_OK: > + /* No further message expected once in SCMI_XFER_DRESP_OK */ Do we really need this case ? If so, how can this happen. > + return -EINVAL; > + } > + > + return 0; > +} > + > +static bool scmi_xfer_is_free(struct scmi_xfer *xfer) > +{ > + int ret; > + > + ret = atomic_cmpxchg(&xfer->busy, SCMI_XFER_FREE, SCMI_XFER_BUSY); > + > + return ret == SCMI_XFER_FREE; > +} > + > +/** > + * scmi_xfer_command_acquire - Helper to lookup and acquire a command xfer > + * > + * @cinfo: A reference to the channel descriptor. > + * @msg_hdr: A message header to use as lookup key > + * > + * When a valid xfer is found for the sequence number embedded in the provided > + * msg_hdr, reference counting is properly updated and exclusive access to this > + * xfer is granted till released with @scmi_xfer_command_release. > + * > + * Return: A valid @xfer on Success or error otherwise. > + */ > +static inline struct scmi_xfer * > +scmi_xfer_command_acquire(struct scmi_chan_info *cinfo, u32 msg_hdr) > +{ > + int ret; > + unsigned long flags; > + struct scmi_xfer *xfer; > + struct scmi_info *info = handle_to_scmi_info(cinfo->handle); > + struct scmi_xfers_info *minfo = &info->tx_minfo; > + u8 msg_type = MSG_XTRACT_TYPE(msg_hdr); > + u16 xfer_id = MSG_XTRACT_TOKEN(msg_hdr); > + > + /* Are we even expecting this? */ > + spin_lock_irqsave(&minfo->xfer_lock, flags); > + xfer = scmi_xfer_lookup_unlocked(minfo, xfer_id); > + if (IS_ERR(xfer)) { > + dev_err(cinfo->dev, > + "Message for %d type %d is not expected!\n", > + xfer_id, msg_type); > + spin_unlock_irqrestore(&minfo->xfer_lock, flags); > + return xfer; > + } > + refcount_inc(&xfer->users); > + spin_unlock_irqrestore(&minfo->xfer_lock, flags); > + > + spin_lock_irqsave(&xfer->lock, flags); > + ret = scmi_msg_response_validate(cinfo, msg_type, xfer); > + /* > + * If a pending xfer was found which was also in a congruent state with > + * the received message, acquire exclusive access to it setting the busy > + * flag. > + * Spins only on the rare limit condition of concurrent reception of > + * RESP and DRESP for the same xfer. > + */ > + if (!ret) { > + spin_until_cond(scmi_xfer_is_free(xfer)); I agree with the discussion between you and Peter around this, so I assume it will be renamed or handled accordingly. > + xfer->hdr.type = msg_type; > + } > + spin_unlock_irqrestore(&xfer->lock, flags); > + > + if (ret) { > + dev_err(cinfo->dev, > + "Invalid message type:%d for %d - HDR:0x%X state:%d\n", > + msg_type, xfer_id, msg_hdr, xfer->state); > + /* On error the refcount incremented above has to be dropped */ > + __scmi_xfer_put(minfo, xfer); > + xfer = ERR_PTR(-EINVAL); > + } > + > + return xfer; > +} > + > +static inline void scmi_xfer_command_release(struct scmi_info *info, > + struct scmi_xfer *xfer) > +{ > + atomic_set(&xfer->busy, SCMI_XFER_FREE); > + __scmi_xfer_put(&info->tx_minfo, xfer); > +} > + > +/** > + * scmi_xfer_state_update - Update xfer state > + * > + * @xfer: A reference to the xfer to update > + * > + * Context: Assumes to be called on an xfer exclusively acquired using the > + * busy flag. > + */ > +static inline void scmi_xfer_state_update(struct scmi_xfer *xfer) > +{ > + switch (xfer->hdr.type) { > + case MSG_TYPE_COMMAND: > + xfer->state = SCMI_XFER_RESP_OK; > + break; > + case MSG_TYPE_DELAYED_RESP: > + xfer->state = SCMI_XFER_DRESP_OK; > + break; > + } > +} Can't this be if () .. else if(), switch sounds unnecessary for 2 conditions. Other than the things already discussed with you and Peter, don't have much to add ATM. I may look at this with fresh eyes once again in the next version. -- Regards, Sudeep
next prev parent reply other threads:[~2021-08-02 10:11 UTC|newest] Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-07-12 14:18 [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 01/17] firmware: arm_scmi: Avoid padding in sensor message structure Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 02/17] firmware: arm_scmi: Fix max pending messages boundary check Cristian Marussi 2021-07-14 16:46 ` Sudeep Holla 2021-07-12 14:18 ` [PATCH v6 03/17] firmware: arm_scmi: Add support for type handling in common functions Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 04/17] firmware: arm_scmi: Remove scmi_dump_header_dbg() helper Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 05/17] firmware: arm_scmi: Add transport optional init/exit support Cristian Marussi 2021-07-28 11:40 ` Sudeep Holla 2021-07-28 12:28 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 06/17] firmware: arm_scmi: Introduce monotonically increasing tokens Cristian Marussi 2021-07-28 14:17 ` Sudeep Holla 2021-07-28 16:54 ` Cristian Marussi 2021-08-02 10:24 ` Sudeep Holla 2021-08-03 12:52 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 07/17] firmware: arm_scmi: Handle concurrent and out-of-order messages Cristian Marussi 2021-07-15 16:36 ` Peter Hilber 2021-07-19 9:14 ` Cristian Marussi 2021-07-22 8:32 ` Peter Hilber 2021-07-28 8:31 ` Cristian Marussi 2021-08-02 10:10 ` Sudeep Holla [this message] 2021-08-02 10:27 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 08/17] firmware: arm_scmi: Add priv parameter to scmi_rx_callback Cristian Marussi 2021-07-28 14:26 ` Sudeep Holla 2021-07-28 17:25 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 09/17] firmware: arm_scmi: Make .clear_channel optional Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 10/17] firmware: arm_scmi: Make polling mode optional Cristian Marussi 2021-07-15 16:36 ` Peter Hilber 2021-07-19 9:15 ` Cristian Marussi 2021-07-28 14:34 ` Sudeep Holla 2021-07-28 17:41 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 11/17] firmware: arm_scmi: Make SCMI transports configurable Cristian Marussi 2021-07-28 14:50 ` Sudeep Holla 2021-07-29 16:18 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 12/17] firmware: arm_scmi: Make shmem support optional for transports Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 13/17] firmware: arm_scmi: Add method to override max message number Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 14/17] firmware: arm_scmi: Add message passing abstractions for transports Cristian Marussi 2021-07-15 16:36 ` Peter Hilber 2021-07-19 9:16 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 15/17] firmware: arm_scmi: Add optional link_supplier() transport op Cristian Marussi 2021-07-28 15:36 ` Sudeep Holla 2021-07-29 16:19 ` Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 16/17] dt-bindings: arm: Add virtio transport for SCMI Cristian Marussi 2021-07-12 14:18 ` [PATCH v6 17/17] firmware: arm_scmi: Add virtio transport Cristian Marussi 2021-07-15 16:35 ` [PATCH v6 00/17] Introduce SCMI transport based on VirtIO Peter Hilber 2021-07-19 11:36 ` Cristian Marussi 2021-07-22 8:30 ` Peter Hilber 2021-08-11 9:31 ` Floris Westermann 2021-08-11 15:26 ` Cristian Marussi
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210802101032.ozlidylogmdt2zqu@bogus \ --to=sudeep.holla@arm.com \ --cc=Andriy.Tryshnivskyy@opensynergy.com \ --cc=Jonathan.Cameron@Huawei.com \ --cc=Vasyl.Vavrychuk@opensynergy.com \ --cc=alex.bennee@linaro.org \ --cc=anton.yakovlev@opensynergy.com \ --cc=cristian.marussi@arm.com \ --cc=etienne.carriere@linaro.org \ --cc=f.fainelli@gmail.com \ --cc=igor.skalkin@opensynergy.com \ --cc=james.quinlan@broadcom.com \ --cc=jean-philippe@linaro.org \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=mikhail.golubev@opensynergy.com \ --cc=peter.hilber@opensynergy.com \ --cc=souvik.chakravarty@arm.com \ --cc=vincent.guittot@linaro.org \ --cc=virtio-dev@lists.oasis-open.org \ --cc=virtualization@lists.linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).