LKML Archive on lore.kernel.org
* [PATCH 00/18] net: introduce Qualcomm IPA driver
@ 2019-05-12  1:24 Alex Elder
  2019-05-12  1:24 ` [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max() Alex Elder
                   ` (18 more replies)
  0 siblings, 19 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel

This series presents the driver for the Qualcomm IP Accelerator (IPA).
The IPA is a component present in some Qualcomm SoCs that allows
network functions such as aggregation, filtering, routing, and NAT
to be performed without active involvement of the main application
processor (AP).

Initially, these advanced features are disabled; the IPA driver
simply provides a network interface that makes the modem's LTE
network available to the AP.  In addition, only support for the
IPA found in the Qualcomm SDM845 SoC is provided.

This code is derived from a driver developed internally by Qualcomm.
A version of the original source can be seen here:
  https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree
in the "drivers/platform/msm/ipa" directory.  Many were involved in
developing this, but the following individuals deserve explicit
acknowledgement for their substantial contributions:

    Abhishek Choubey
    Ady Abraham
    Chaitanya Pratapa
    David Arinzon
    Ghanim Fodi
    Gidon Studinski
    Ravi Gummadidala
    Shihuan Liu
    Skylar Chang

A version of this code was posted in November 2018 as an RFC.
  https://lore.kernel.org/lkml/20181107003250.5832-1-elder@linaro.org/
All feedback received has been addressed.  The code has undergone
considerable further rework since that time, and most of the
"future work" described then has now been completed.

This code (including its dependencies) is available in buildable
form here, based on kernel v5.1:
  remote: ssh://git@git.linaro.org/people/alex.elder/linux.git
  branch: ipa-v1_kernel-v5.1
    f5d4a676a981 arm64: defconfig: enable build of IPA code

					-Alex

Alex Elder (18):
  bitfield.h: add FIELD_MAX() and field_max()
  soc: qcom: create "include/soc/qcom/rmnet.h"
  dt-bindings: soc: qcom: add IPA bindings
  soc: qcom: ipa: main code
  soc: qcom: ipa: configuration data
  soc: qcom: ipa: clocking, interrupts, and memory
  soc: qcom: ipa: GSI headers
  soc: qcom: ipa: the generic software interface
  soc: qcom: ipa: GSI transactions
  soc: qcom: ipa: IPA interface to GSI
  soc: qcom: ipa: IPA endpoints
  soc: qcom: ipa: immediate commands
  soc: qcom: ipa: IPA network device and microcontroller
  soc: qcom: ipa: AP/modem communications
  soc: qcom: ipa: support build of IPA code
  MAINTAINERS: add entry for the Qualcomm IPA driver
  arm64: dts: sdm845: add IPA information
  arm64: defconfig: enable build of IPA code

 .../devicetree/bindings/net/qcom,ipa.txt      |  164 ++
 MAINTAINERS                                   |    6 +
 arch/arm64/boot/dts/qcom/sdm845.dtsi          |   51 +
 arch/arm64/configs/defconfig                  |    1 +
 drivers/net/Kconfig                           |    2 +
 drivers/net/Makefile                          |    1 +
 .../ethernet/qualcomm/rmnet/rmnet_handlers.c  |    1 +
 .../net/ethernet/qualcomm/rmnet/rmnet_map.h   |   24 -
 .../qualcomm/rmnet/rmnet_map_command.c        |    1 +
 .../ethernet/qualcomm/rmnet/rmnet_map_data.c  |    1 +
 .../net/ethernet/qualcomm/rmnet/rmnet_vnd.c   |    1 +
 drivers/net/ipa/Kconfig                       |   16 +
 drivers/net/ipa/Makefile                      |    7 +
 drivers/net/ipa/gsi.c                         | 1741 +++++++++++++++++
 drivers/net/ipa/gsi.h                         |  241 +++
 drivers/net/ipa/gsi_private.h                 |  147 ++
 drivers/net/ipa/gsi_reg.h                     |  376 ++++
 drivers/net/ipa/gsi_trans.c                   |  604 ++++++
 drivers/net/ipa/gsi_trans.h                   |  106 +
 drivers/net/ipa/ipa.h                         |  131 ++
 drivers/net/ipa/ipa_clock.c                   |  291 +++
 drivers/net/ipa/ipa_clock.h                   |   52 +
 drivers/net/ipa/ipa_cmd.c                     |  372 ++++
 drivers/net/ipa/ipa_cmd.h                     |  116 ++
 drivers/net/ipa/ipa_data-sdm845.c             |  245 +++
 drivers/net/ipa/ipa_data.h                    |  267 +++
 drivers/net/ipa/ipa_endpoint.c                | 1253 ++++++++++++
 drivers/net/ipa/ipa_endpoint.h                |   96 +
 drivers/net/ipa/ipa_gsi.c                     |   48 +
 drivers/net/ipa/ipa_gsi.h                     |   49 +
 drivers/net/ipa/ipa_interrupt.c               |  279 +++
 drivers/net/ipa/ipa_interrupt.h               |   53 +
 drivers/net/ipa/ipa_main.c                    |  920 +++++++++
 drivers/net/ipa/ipa_mem.c                     |  237 +++
 drivers/net/ipa/ipa_mem.h                     |   82 +
 drivers/net/ipa/ipa_netdev.c                  |  250 +++
 drivers/net/ipa/ipa_netdev.h                  |   24 +
 drivers/net/ipa/ipa_qmi.c                     |  399 ++++
 drivers/net/ipa/ipa_qmi.h                     |   35 +
 drivers/net/ipa/ipa_qmi_msg.c                 |  583 ++++++
 drivers/net/ipa/ipa_qmi_msg.h                 |  238 +++
 drivers/net/ipa/ipa_reg.h                     |  279 +++
 drivers/net/ipa/ipa_smp2p.c                   |  304 +++
 drivers/net/ipa/ipa_smp2p.h                   |   47 +
 drivers/net/ipa/ipa_uc.c                      |  208 ++
 drivers/net/ipa/ipa_uc.h                      |   32 +
 include/linux/bitfield.h                      |   14 +
 include/soc/qcom/rmnet.h                      |   38 +
 48 files changed, 10409 insertions(+), 24 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/qcom,ipa.txt
 create mode 100644 drivers/net/ipa/Kconfig
 create mode 100644 drivers/net/ipa/Makefile
 create mode 100644 drivers/net/ipa/gsi.c
 create mode 100644 drivers/net/ipa/gsi.h
 create mode 100644 drivers/net/ipa/gsi_private.h
 create mode 100644 drivers/net/ipa/gsi_reg.h
 create mode 100644 drivers/net/ipa/gsi_trans.c
 create mode 100644 drivers/net/ipa/gsi_trans.h
 create mode 100644 drivers/net/ipa/ipa.h
 create mode 100644 drivers/net/ipa/ipa_clock.c
 create mode 100644 drivers/net/ipa/ipa_clock.h
 create mode 100644 drivers/net/ipa/ipa_cmd.c
 create mode 100644 drivers/net/ipa/ipa_cmd.h
 create mode 100644 drivers/net/ipa/ipa_data-sdm845.c
 create mode 100644 drivers/net/ipa/ipa_data.h
 create mode 100644 drivers/net/ipa/ipa_endpoint.c
 create mode 100644 drivers/net/ipa/ipa_endpoint.h
 create mode 100644 drivers/net/ipa/ipa_gsi.c
 create mode 100644 drivers/net/ipa/ipa_gsi.h
 create mode 100644 drivers/net/ipa/ipa_interrupt.c
 create mode 100644 drivers/net/ipa/ipa_interrupt.h
 create mode 100644 drivers/net/ipa/ipa_main.c
 create mode 100644 drivers/net/ipa/ipa_mem.c
 create mode 100644 drivers/net/ipa/ipa_mem.h
 create mode 100644 drivers/net/ipa/ipa_netdev.c
 create mode 100644 drivers/net/ipa/ipa_netdev.h
 create mode 100644 drivers/net/ipa/ipa_qmi.c
 create mode 100644 drivers/net/ipa/ipa_qmi.h
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.c
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.h
 create mode 100644 drivers/net/ipa/ipa_reg.h
 create mode 100644 drivers/net/ipa/ipa_smp2p.c
 create mode 100644 drivers/net/ipa/ipa_smp2p.h
 create mode 100644 drivers/net/ipa/ipa_uc.c
 create mode 100644 drivers/net/ipa/ipa_uc.h
 create mode 100644 include/soc/qcom/rmnet.h

-- 
2.20.1



* [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max()
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-12  6:33   ` Kalle Valo
  2019-05-12  1:24 ` [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h" Alex Elder
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas, kvalo, johannes,
	andy.shevchenko
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

Define FIELD_MAX(), which supplies the maximum value that can be
represented by a field.  Define field_max() as well, to go along
with the lower-case forms of the field mask functions.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 include/linux/bitfield.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
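
The arithmetic behind FIELD_MAX() can be tried outside the kernel.  What
follows is only a userspace sketch: SKETCH_BF_SHF() and SKETCH_FIELD_MAX()
are stand-ins written for this note (they rely on the GCC/Clang
__builtin_ctzll() builtin and omit the compile-time checks done by
__BF_FIELD_CHECK()), and the DEMO_* masks and demo_count_fits() are made-up
names, not IPA definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's __bf_shf(): index of the lowest set bit.
 * The mask must be non-zero.
 */
#define SKETCH_BF_SHF(mask)	((unsigned int)__builtin_ctzll(mask))

/* Same shift as the FIELD_MAX() body in this patch, without the
 * compile-time validation of the mask.
 */
#define SKETCH_FIELD_MAX(mask)	((uint64_t)(mask) >> SKETCH_BF_SHF(mask))

/* Hypothetical register fields, for illustration only */
#define DEMO_COUNT_FMASK	0x00000ff0u	/* 8-bit field, bits 4..11 */
#define DEMO_FLAG_FMASK		0x00008000u	/* 1-bit field, bit 15 */

/* True if @count can be stored in the hypothetical count field */
static bool demo_count_fits(uint64_t count)
{
	return count <= SKETCH_FIELD_MAX(DEMO_COUNT_FMASK);
}
```

With these definitions SKETCH_FIELD_MAX(DEMO_COUNT_FMASK) evaluates to 0xff
and SKETCH_FIELD_MAX(DEMO_FLAG_FMASK) to 1, so a range check can be written
against the mask alone rather than against a separately maintained limit.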

diff --git a/include/linux/bitfield.h b/include/linux/bitfield.h
index 3f1ef4450a7c..cf4f06774520 100644
--- a/include/linux/bitfield.h
+++ b/include/linux/bitfield.h
@@ -63,6 +63,19 @@
 					      (1ULL << __bf_shf(_mask))); \
 	})
 
+/**
+ * FIELD_MAX() - produce the maximum value representable by a field
+ * @_mask: shifted mask defining the field's length and position
+ *
+ * FIELD_MAX() returns the maximum value that can be held in the field
+ * specified by @_mask.
+ */
+#define FIELD_MAX(_mask)						\
+	({								\
+		__BF_FIELD_CHECK(_mask, 0ULL, 0ULL, "FIELD_MAX: ");	\
+		(typeof(_mask))((_mask) >> __bf_shf(_mask));		\
+	})
+
 /**
  * FIELD_FIT() - check if value fits in the field
  * @_mask: shifted mask defining the field's length and position
@@ -118,6 +131,7 @@ static __always_inline u64 field_mask(u64 field)
 {
 	return field / field_multiplier(field);
 }
+#define field_max(field)	((typeof(field))field_mask(field))
 #define ____MAKE_OP(type,base,to,from)					\
 static __always_inline __##type type##_encode_bits(base v, base field)	\
 {									\
-- 
2.20.1



* [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
  2019-05-12  1:24 ` [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max() Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-12  2:34   ` Joe Perches
  2019-05-15  6:59   ` Arnd Bergmann
  2019-05-12  1:24 ` [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings Alex Elder
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas, subashab,
	stranche, yuehaibing, joe
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

The IPA driver requires some (but not all) symbols defined in
"drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h".  Create a new
public header file "include/soc/qcom/rmnet.h" and move the needed
definitions there.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 .../ethernet/qualcomm/rmnet/rmnet_handlers.c  |  1 +
 .../net/ethernet/qualcomm/rmnet/rmnet_map.h   | 24 ------------
 .../qualcomm/rmnet/rmnet_map_command.c        |  1 +
 .../ethernet/qualcomm/rmnet/rmnet_map_data.c  |  1 +
 .../net/ethernet/qualcomm/rmnet/rmnet_vnd.c   |  1 +
 include/soc/qcom/rmnet.h                      | 38 +++++++++++++++++++
 6 files changed, 42 insertions(+), 24 deletions(-)
 create mode 100644 include/soc/qcom/rmnet.h
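
As an illustration of the layout that struct rmnet_map_header describes,
the sketch below decodes the four header bytes with explicit shifts and
masks.  It assumes the bit-field allocation GCC uses on little-endian
targets, where the first-declared bit-field (pad_len) occupies the least
significant bits of its byte; that allocation is implementation-defined.
The struct map_header_sketch type and map_header_decode() are hypothetical
names used only here.

```c
#include <stdint.h>

/* Decoded copy of the MAP header fields (hypothetical helper type) */
struct map_header_sketch {
	uint8_t pad_len;	/* low 6 bits of byte 0 */
	uint8_t cd_bit;		/* top bit of byte 0 */
	uint8_t mux_id;		/* byte 1 */
	uint16_t pkt_len;	/* bytes 2-3, big-endian on the wire */
};

/* Decode a MAP header from four raw bytes, independent of host endianness */
static struct map_header_sketch map_header_decode(const uint8_t bytes[4])
{
	struct map_header_sketch hdr;

	hdr.pad_len = bytes[0] & 0x3f;
	hdr.cd_bit = (bytes[0] >> 7) & 0x1;
	hdr.mux_id = bytes[1];
	hdr.pkt_len = (uint16_t)(((unsigned int)bytes[2] << 8) | bytes[3]);

	return hdr;
}
```

For example, the bytes 0x83 0x05 0x01 0x00 decode to pad_len 3, cd_bit 1,
mux_id 5 and pkt_len 256.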

diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
index 11167abe5934..3aa79b7ed539 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
@@ -17,6 +17,7 @@
 #include <linux/netdev_features.h>
 #include <linux/if_arp.h>
 #include <net/sock.h>
+#include <soc/qcom/rmnet.h>
 #include "rmnet_private.h"
 #include "rmnet_config.h"
 #include "rmnet_vnd.h"
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
index 884f1f52dcc2..39d0be99a771 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map.h
@@ -39,30 +39,6 @@ enum rmnet_map_commands {
 	RMNET_MAP_COMMAND_ENUM_LENGTH
 };
 
-struct rmnet_map_header {
-	u8  pad_len:6;
-	u8  reserved_bit:1;
-	u8  cd_bit:1;
-	u8  mux_id;
-	__be16 pkt_len;
-}  __aligned(1);
-
-struct rmnet_map_dl_csum_trailer {
-	u8  reserved1;
-	u8  valid:1;
-	u8  reserved2:7;
-	u16 csum_start_offset;
-	u16 csum_length;
-	__be16 csum_value;
-} __aligned(1);
-
-struct rmnet_map_ul_csum_header {
-	__be16 csum_start_offset;
-	u16 csum_insert_offset:14;
-	u16 udp_ip4_ind:1;
-	u16 csum_enabled:1;
-} __aligned(1);
-
 #define RMNET_MAP_GET_MUX_ID(Y) (((struct rmnet_map_header *) \
 				 (Y)->data)->mux_id)
 #define RMNET_MAP_GET_CD_BIT(Y) (((struct rmnet_map_header *) \
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
index f6cf59aee212..54b86a8be570 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c
@@ -11,6 +11,7 @@
  */
 
 #include <linux/netdevice.h>
+#include <soc/qcom/rmnet.h>
 #include "rmnet_config.h"
 #include "rmnet_map.h"
 #include "rmnet_private.h"
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
index 57a9c314a665..e3fb4035820c 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
@@ -17,6 +17,7 @@
 #include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <net/ip6_checksum.h>
+#include <soc/qcom/rmnet.h>
 #include "rmnet_config.h"
 #include "rmnet_map.h"
 #include "rmnet_private.h"
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
index d11c16aeb19a..b8df36e827d4 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
@@ -17,6 +17,7 @@
 #include <linux/etherdevice.h>
 #include <linux/if_arp.h>
 #include <net/pkt_sched.h>
+#include <soc/qcom/rmnet.h>
 #include "rmnet_config.h"
 #include "rmnet_handlers.h"
 #include "rmnet_private.h"
diff --git a/include/soc/qcom/rmnet.h b/include/soc/qcom/rmnet.h
new file mode 100644
index 000000000000..80dcd6e68c3d
--- /dev/null
+++ b/include/soc/qcom/rmnet.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2013-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _SOC_QCOM_RMNET_H_
+#define _SOC_QCOM_RMNET_H_
+
+#include <linux/types.h>
+
+/* Header structure that precedes packets in ETH_P_MAP protocol */
+struct rmnet_map_header {
+	u8  pad_len		: 6;
+	u8  reserved_bit	: 1;
+	u8  cd_bit		: 1;
+	u8  mux_id;
+	__be16 pkt_len;
+}  __aligned(1);
+
+/* Checksum offload metadata header for outbound packets */
+struct rmnet_map_ul_csum_header {
+	__be16 csum_start_offset;
+	u16 csum_insert_offset	: 14;
+	u16 udp_ip4_ind		: 1;
+	u16 csum_enabled	: 1;
+} __aligned(1);
+
+/* Checksum offload metadata trailer for inbound packets */
+struct rmnet_map_dl_csum_trailer {
+	u8  reserved1;
+	u8  valid		: 1;
+	u8  reserved2		: 7;
+	u16 csum_start_offset;
+	u16 csum_length;
+	__be16 csum_value;
+} __aligned(1);
+
+#endif /* _SOC_QCOM_RMNET_H_ */
-- 
2.20.1



* [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
  2019-05-12  1:24 ` [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max() Alex Elder
  2019-05-12  1:24 ` [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h" Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-15  7:03   ` Arnd Bergmann
  2019-05-12  1:24 ` [PATCH 04/18] soc: qcom: ipa: main code Alex Elder
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas, robh+dt,
	mark.rutland, andy.gross, david.brown
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

Add the binding definitions for the "qcom,ipa" device tree node.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 .../devicetree/bindings/net/qcom,ipa.txt      | 164 ++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/qcom,ipa.txt
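
The two SMP2P output states in this binding form a small protocol: one bit
says whether the other is meaningful, and the other reports whether the IPA
clock is enabled.  A minimal userspace sketch of that encoding follows.  The
bit positions 0 and 1 are taken from the <&ipa_smp2p_out 0> and
<&ipa_smp2p_out 1> entries in the example node; the DEMO_* macros and the
demo_clock_*() functions are hypothetical and do not use the kernel's SMEM
state interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CLOCK_VALID_BIT	0	/* "ipa-clock-enabled-valid" */
#define DEMO_CLOCK_ENABLED_BIT	1	/* "ipa-clock-enabled" */

/* Compose the state value the AP would publish to the modem */
static uint32_t demo_clock_state(bool valid, bool enabled)
{
	uint32_t state = 0;

	if (valid) {
		state |= UINT32_C(1) << DEMO_CLOCK_VALID_BIT;
		if (enabled)
			state |= UINT32_C(1) << DEMO_CLOCK_ENABLED_BIT;
	}

	return state;
}

/* Interpret a state value: 1 = enabled, 0 = disabled, -1 = not valid */
static int demo_clock_query(uint32_t state)
{
	if (!(state & (UINT32_C(1) << DEMO_CLOCK_VALID_BIT)))
		return -1;

	return (state >> DEMO_CLOCK_ENABLED_BIT) & 1;
}
```

In this sketch the enabled bit is ignored whenever the valid bit is clear,
which matches the binding's statement that the second state is defined only
"if valid".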

diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.txt b/Documentation/devicetree/bindings/net/qcom,ipa.txt
new file mode 100644
index 000000000000..2705e198f12e
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/qcom,ipa.txt
@@ -0,0 +1,164 @@
+Qualcomm IP Accelerator (IPA)
+
+This binding describes the Qualcomm IPA.  The IPA is capable of offloading
+certain network processing tasks (e.g. filtering, routing, and NAT) from
+the main processor.
+
+The IPA sits between multiple independent "execution environments,"
+including the Application Processor (AP) and the modem.  The IPA presents
+a Generic Software Interface (GSI) to each execution environment.
+The GSI is an integral part of the IPA, but it is logically isolated
+and has a distinct interrupt and a separately-defined address space.
+
+	--------	     ---------
+	|      |	     |	     |
+	|  AP  +<---.	.----+ Modem |
+	|      +--. |	| .->+	     |
+	|      |  | |	| |  |	     |
+	--------  | |	| |  ---------
+		  v |	v |
+		--+-+---+-+--
+		|    GSI    |
+		|-----------|
+		|	    |
+		|    IPA    |
+		|	    |
+		-------------
+
+See also:
+  bindings/interrupt-controller/interrupts.txt
+  bindings/interconnect/interconnect.txt
+  bindings/soc/qcom/qcom,smp2p.txt
+  bindings/reserved-memory/reserved-memory.txt
+  bindings/clock/clock-bindings.txt
+
+All properties except "modem-init" defined below are required.
+
+- compatible:
+	Must be "qcom,sdm845-ipa".
+
+- modem-init:
+	This Boolean property is optional.  If present, it indicates that
+	the modem is responsible for performing early IPA initialization,
+	including loading and validating firmware used by the GSI.  This
+	early initialization is performed by Trust Zone otherwise.
+
+- reg:
+	Resources specifying the physical address spaces of the IPA and GSI.
+
+- reg-names:
+	The names of the three address space ranges defined by the "reg"
+	property.  Must be:
+		"ipa-reg"
+		"ipa-shared"
+		"gsi"
+
+- interrupts:
+	Specifies the IRQs used by the IPA.  Four interrupts are required,
+	specifying: the IPA IRQ; the GSI IRQ; the clock query interrupt
+	from the modem; and the "ready for setup" interrupt from the modem.
+	The first two are hardware IRQs; the third and fourth are SMP2P
+	input interrupts.
+
+- interrupt-names:
+	The names of the interrupts defined by the "interrupts" property.
+	Must be:
+		"ipa"
+		"gsi"
+		"ipa-clock-query"
+		"ipa-setup-ready"
+
+- clocks:
+	Resource that defines the IPA core clock.
+
+- clock-names:
+	The name used for the IPA core clock.  Must be "core".
+
+- interconnects:
+	Specifies the interconnects used by the IPA.  Three interconnects
+	are required, specifying:  the path from the IPA to memory; from
+	IPA to internal (SoC resident) memory; and between the AP subsystem
+	and IPA for register access.
+
+- interconnect-names:
+	The names of the interconnects defined by the "interconnects"
+	property.  Must be:
+		"memory"
+		"imem"
+		"config"
+
+- qcom,smem-states:
+	The state bits used for SMP2P output.  Two states must be specified.
+	The first indicates whether the value in the second bit is valid
+	(1 means valid).  The second, if valid, defines whether the IPA
+	clock is enabled (1 means enabled).
+
+- qcom,smem-state-names:
+	The names of the state bits used for SMP2P output.  Must be:
+		"ipa-clock-enabled-valid"
+		"ipa-clock-enabled"
+
+- memory-region:
+	A phandle for a reserved memory area that holds the firmware passed
+	to Trust Zone for authentication.  (Note, this is required
+	only when Trust Zone performs early initialization; that is,
+	it is required if "modem-init" is not defined.)
+
+= EXAMPLE
+
+The following example represents the IPA present in the SDM845 SoC.  It
+shows portions of the "smp2p-mpss" node to indicate its relationship
+with the interrupts and SMEM states used by the IPA.
+
+	smp2p-mpss {
+		compatible = "qcom,smp2p";
+		. . .
+		ipa_smp2p_out: ipa-ap-to-modem {
+			qcom,entry-name = "ipa";
+			#qcom,smem-state-cells = <1>;
+		};
+
+		ipa_smp2p_in: ipa-modem-to-ap {
+			qcom,entry-name = "ipa";
+			interrupt-controller;
+			#interrupt-cells = <2>;
+		};
+	};
+
+	ipa@1e40000 {
+		compatible = "qcom,sdm845-ipa";
+
+		modem-init;
+
+		reg = <0 0x1e40000 0 0x7000>,
+		      <0 0x1e47000 0 0x2000>,
+		      <0 0x1e04000 0 0x2c000>;
+		reg-names = "ipa-reg",
+			    "ipa-shared",
+			    "gsi";
+
+		interrupts-extended = <&intc 0 311 IRQ_TYPE_EDGE_RISING>,
+				      <&intc 0 432 IRQ_TYPE_LEVEL_HIGH>,
+				      <&ipa_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+				      <&ipa_smp2p_in 1 IRQ_TYPE_EDGE_RISING>;
+		interrupt-names = "ipa",
+				   "gsi",
+				   "ipa-clock-query",
+				   "ipa-setup-ready";
+
+		clocks = <&rpmhcc RPMH_IPA_CLK>;
+		clock-names = "core";
+
+		interconnects =
+			<&rsc_hlos MASTER_IPA &rsc_hlos SLAVE_EBI1>,
+			<&rsc_hlos MASTER_IPA &rsc_hlos SLAVE_IMEM>,
+			<&rsc_hlos MASTER_APPSS_PROC &rsc_hlos SLAVE_IPA_CFG>;
+		interconnect-names = "memory",
+				     "imem",
+				     "config";
+
+		qcom,smem-states = <&ipa_smp2p_out 0>,
+				   <&ipa_smp2p_out 1>;
+		qcom,smem-state-names = "ipa-clock-enabled-valid",
+					"ipa-clock-enabled";
+	};
-- 
2.20.1



* [PATCH 04/18] soc: qcom: ipa: main code
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (2 preceding siblings ...)
  2019-05-12  1:24 ` [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-12  1:24 ` [PATCH 05/18] soc: qcom: ipa: configuration data Alex Elder
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch includes three source files that represent some basic "main
program" code for the IPA driver.  They are:
  - "ipa.h" defines the top-level IPA structure which represents an IPA
    device throughout the code.
  - "ipa_main.c" contains the platform driver probe function, along with
    some general code used during initialization.
  - "ipa_reg.h" defines the offsets of the 32-bit registers used for the
    IPA device, along with masks that define the position and width of
    fields less than 32 bits located within these registers.

Each file includes some documentation that provides a little more
overview of how the code is organized and used.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/ipa.h      | 131 ++++++
 drivers/net/ipa/ipa_main.c | 920 +++++++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_reg.h  | 279 +++++++++++
 3 files changed, 1330 insertions(+)
 create mode 100644 drivers/net/ipa/ipa.h
 create mode 100644 drivers/net/ipa/ipa_main.c
 create mode 100644 drivers/net/ipa/ipa_reg.h
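
Two idioms recur in the ipa_main.c code in this patch: clearing a register
field with a read-modify-write, and walking the set bits of a mask with
__ffs().  The userspace sketch below models both, along with the "all
entries except the modem's" mask computation.  It is an illustration under
stated assumptions, not driver code: a plain integer stands in for the
register, __builtin_ctz() (a GCC/Clang builtin) stands in for __ffs(), and
every demo_*() name is hypothetical.

```c
#include <stdint.h>

/* Return @val with every bit selected by @mask cleared.  @val is passed by
 * value, so the caller has to store the result for it to have any effect.
 */
static uint32_t demo_field_zero(uint32_t val, uint32_t mask)
{
	return val & ~mask;
}

/* Userspace equivalent of GENMASK(high, low) for 32-bit values.
 * Requires low <= high <= 31.
 */
static uint32_t demo_genmask(unsigned int high, unsigned int low)
{
	return (UINT32_MAX << low) & (UINT32_MAX >> (31 - high));
}

/* Mask of table entries 0..count-1, excluding entries first..last.
 * Requires 1 <= count <= 32 and first <= last <= 31.
 */
static uint32_t demo_mask_excluding(unsigned int count, unsigned int first,
				    unsigned int last)
{
	return demo_genmask(count - 1, 0) & ~demo_genmask(last, first);
}

/* Visit the set bits of @mask from lowest to highest, storing each bit
 * index in @ids (at most @max of them).  Returns the number stored.
 */
static unsigned int demo_mask_walk(uint32_t mask, unsigned int *ids,
				   unsigned int max)
{
	unsigned int count = 0;

	while (mask && count < max) {
		/* __builtin_ctz() stands in for the kernel's __ffs() */
		unsigned int id = (unsigned int)__builtin_ctz(mask);

		mask ^= UINT32_C(1) << id;	/* clear the bit just visited */
		ids[count++] = id;
	}

	return count;
}
```

Like demo_field_zero(), the kernel's u32_replace_bits() returns the updated
value rather than modifying its argument; u32p_replace_bits() is the variant
that updates a value in place through a pointer.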

diff --git a/drivers/net/ipa/ipa.h b/drivers/net/ipa/ipa.h
new file mode 100644
index 000000000000..c580254d1e0e
--- /dev/null
+++ b/drivers/net/ipa/ipa.h
@@ -0,0 +1,131 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_H_
+#define _IPA_H_
+
+#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/notifier.h>
+#include <linux/pm_wakeup.h>
+
+#include "gsi.h"
+#include "ipa_qmi.h"
+#include "ipa_endpoint.h"
+#include "ipa_interrupt.h"
+
+struct clk;
+struct icc_path;
+struct net_device;
+struct platform_device;
+
+struct ipa_clock;
+struct ipa_smp2p;
+struct ipa_interrupt;
+
+/**
+ * struct ipa - IPA information
+ * @gsi:		Embedded GSI structure
+ * @pdev:		Platform device
+ * @smp2p:		SMP2P information
+ * @clock:		IPA clocking information
+ * @suspend_ref:	Whether clock reference preventing suspend taken
+ * @route_virt:		Virtual address of routing table
+ * @route_addr:		DMA address for routing table
+ * @filter_virt:	Virtual address of filter table
+ * @filter_addr:	DMA address for filter table
+ * @interrupt:		IPA Interrupt information
+ * @uc_loaded:		Non-zero when microcontroller has reported it's ready
+ * @reg_phys:		Physical address of IPA memory space
+ * @reg_virt:		Virtual address used for IPA register
+ *			access
+ * @shared_phys:	Physical address of memory space shared with modem
+ * @shared_virt:	Virtual address of memory space shared with modem
+ * @shared_offset:	Additional offset used for shared memory
+ * @wakeup:		Wakeup source information
+ * @filter_support:	Bit mask indicating endpoints that support filtering
+ * @initialized:	Bit mask indicating endpoints initialized
+ * @set_up:		Bit mask indicating endpoints set up
+ * @enabled:		Bit mask indicating endpoints enabled
+ * @suspended:		Bit mask indicating endpoints suspended
+ * @endpoint:		Array of endpoint information
+ * @endpoint_map:	Mapping of GSI channel to IPA endpoint information
+ * @command_endpoint:	Endpoint used for command TX
+ * @default_endpoint:	Endpoint used for default route RX
+ * @modem_netdev:	Network device structure used for modem
+ * @setup_complete:	Flag indicating whether setup stage has completed
+ * @qmi:		QMI information
+ */
+struct ipa {
+	struct gsi gsi;
+	struct platform_device *pdev;
+	struct ipa_smp2p *smp2p;
+	struct ipa_clock *clock;
+	atomic_t suspend_ref;
+
+	void *route_virt;
+	dma_addr_t route_addr;
+	void *filter_virt;
+	dma_addr_t filter_addr;
+
+	struct ipa_interrupt *interrupt;
+	u32 uc_loaded;
+
+	phys_addr_t reg_phys;
+	void __iomem *reg_virt;
+	phys_addr_t shared_phys;
+	void *shared_virt;
+	u32 shared_offset;
+
+	struct wakeup_source wakeup;
+
+	/* Bit masks indicating endpoint state */
+	u32 filter_support;
+	u32 initialized;
+	u32 set_up;
+	u32 enabled;
+	u32 suspended;
+
+	struct ipa_endpoint endpoint[IPA_ENDPOINT_MAX];
+	struct ipa_endpoint *endpoint_map[GSI_CHANNEL_MAX];
+	struct ipa_endpoint *command_endpoint;	/* TX */
+	struct ipa_endpoint *default_endpoint;	/* Default route RX */
+
+	struct net_device *modem_netdev;
+	u32 setup_complete;
+
+	struct ipa_qmi qmi;
+};
+
+/**
+ * ipa_setup() - Perform IPA setup
+ * @ipa:		IPA pointer
+ *
+ * IPA initialization is broken into stages:  init; config; setup; and
+ * sometimes enable.  (These have inverses exit, deconfig, teardown, and
+ * disable.)  Activities performed at the init stage can be done without
+ * requiring any access to hardware.  For IPA, activities performed at the
+ * config stage require the IPA clock to be running, because they involve
+ * access to IPA registers.  The setup stage is performed only after the
+ * GSI hardware is ready (more on this below).  And finally IPA endpoints
+ * can be enabled once they're successfully set up.
+ *
+ * This function, ipa_setup(), starts the setup stage.
+ *
+ * In order for the GSI hardware to be functional it needs firmware to be
+ * loaded (in addition to some other low-level initialization).  This early
+ * GSI initialization can be done either by Trust Zone or by the modem.  If
+ * it's done by Trust Zone, the AP loads the GSI firmware and supplies it to
+ * Trust Zone to verify and install.  The AP knows when this completes, and
+ * whether it was successful.  In this case the AP proceeds to setup once it
+ * knows GSI is ready.
+ *
+ * If the modem performs early GSI initialization, the AP needs to know when
+ * this has occurred.  An SMP2P interrupt is used for this purpose, and
+ * receipt of that interrupt triggers the call to ipa_setup().
+ */
+int ipa_setup(struct ipa *ipa);
+
+#endif /* _IPA_H_ */
diff --git a/drivers/net/ipa/ipa_main.c b/drivers/net/ipa/ipa_main.c
new file mode 100644
index 000000000000..4dc5ead7bab0
--- /dev/null
+++ b/drivers/net/ipa/ipa_main.c
@@ -0,0 +1,920 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/atomic.h>
+#include <linux/bitfield.h>
+#include <linux/device.h>
+#include <linux/bug.h>
+#include <linux/io.h>
+#include <linux/firmware.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/of_address.h>
+#include <linux/remoteproc.h>
+#include <linux/qcom_scm.h>
+#include <linux/soc/qcom/mdt_loader.h>
+
+#include "ipa.h"
+#include "ipa_clock.h"
+#include "ipa_data.h"
+#include "ipa_endpoint.h"
+#include "ipa_cmd.h"
+#include "ipa_mem.h"
+#include "ipa_netdev.h"
+#include "ipa_smp2p.h"
+#include "ipa_uc.h"
+#include "ipa_interrupt.h"
+
+/**
+ * DOC: The IP Accelerator
+ *
+ * This driver supports the Qualcomm IP Accelerator (IPA), which is a
+ * networking component found in many Qualcomm SoCs.  The IPA is connected
+ * to the application processor (AP), but is also connected to (and partially
+ * controlled by) other "execution environments" (EEs), such as a modem.
+ *
+ * The IPA is the conduit between the AP and the modem that carries network
+ * traffic.  This driver presents a network interface representing the
+ * connection of the modem to external (e.g. LTE) networks.  The IPA can
+ * provide protocol checksum calculation, offloading this work from the AP.
+ * The IPA is able to provide additional functionality, including routing,
+ * filtering, and NAT support, but that more advanced functionality is not
+ * currently supported.
+ *
+ * Certain resources--including routing tables and filter tables--are still
+ * defined in this driver, because they must be initialized even when the
+ * advanced hardware features are not used.
+ *
+ * There are two distinct layers that implement the IPA hardware, and this
+ * is reflected in the organization of the driver.  The generic software
+ * interface (GSI) is an integral component of the IPA, providing a
+ * well-defined communication layer between the AP subsystem and the IPA
+ * core.  The GSI implements a set of "channels" used for communication
+ * between the AP and the IPA.
+ *
+ * The IPA layer uses GSI channels to implement its "endpoints".  And while
+ * a GSI channel carries data between the AP and the IPA, a pair of IPA
+ * endpoints is used to carry traffic between two EEs.  Specifically, the main
+ * modem network interface is implemented by two pairs of endpoints:  a TX
+ * endpoint on the AP coupled with an RX endpoint on the modem; and another
+ * RX endpoint on the AP receiving data from a TX endpoint on the modem.
+ */
+
+#define IPA_TABLE_ALIGN		128		/* Minimum table alignment */
+#define IPA_TABLE_ENTRY_SIZE	sizeof(u64)	/* Holds a physical address */
+#define IPA_FILTER_SIZE		8		/* Filter descriptor size */
+#define IPA_ROUTE_SIZE		8		/* Route descriptor size */
+
+/* Backward compatibility register value to use for SDM845 */
+#define IPA_BCR_REG_VAL		0x0000003b
+
+/* The name of the main firmware file relative to /lib/firmware */
+#define IPA_FWS_PATH		"ipa_fws.mdt"
+#define IPA_PAS_ID		15
+/**
+ * ipa_filter_tuple_zero() - Zero an endpoint's filter tuple
+ * @endpoint:	Endpoint whose filter tuple should be zeroed
+ *
+ * Endpoint must be for AP (not modem) and support filtering. Updates the
+ * filter mask values without changing routing ones.
+ */
+static void ipa_filter_tuple_zero(struct ipa_endpoint *endpoint)
+{
+	enum ipa_endpoint_id endpoint_id = endpoint->endpoint_id;
+	u32 offset;
+	u32 val;
+
+	offset = IPA_REG_ENDP_FILTER_ROUTER_HSH_CFG_N_OFFSET(endpoint_id);
+
+	val = ioread32(endpoint->ipa->reg_virt + offset);
+
+	/* Zero all filter-related fields, preserving the rest */
+	u32p_replace_bits(&val, 0, IPA_REG_ENDP_FILTER_HASH_MSK_ALL);
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+static void ipa_filter_hash_tuple_config(struct ipa *ipa)
+{
+	u32 ep_mask = ipa->filter_support;
+
+	while (ep_mask) {
+		enum ipa_endpoint_id endpoint_id = __ffs(ep_mask);
+		struct ipa_endpoint *endpoint;
+
+		ep_mask ^= BIT(endpoint_id);
+
+		endpoint = &ipa->endpoint[endpoint_id];
+		if (endpoint->ee_id != GSI_EE_MODEM)
+			ipa_filter_tuple_zero(endpoint);
+	}
+}
+
+/**
+ * ipa_route_tuple_zero() - Zero a routing table entry tuple
+ * @route_id:	Identifier for routing table entry to be zeroed
+ *
+ * Updates the routing table values without changing filtering ones.
+ */
+static void ipa_route_tuple_zero(struct ipa *ipa, u32 route_id)
+{
+	u32 offset = IPA_REG_ENDP_FILTER_ROUTER_HSH_CFG_N_OFFSET(route_id);
+	u32 val;
+
+	val = ioread32(ipa->reg_virt + offset);
+
+	/* Zero all route-related fields, preserving the rest */
+	val = u32_replace_bits(val, 0, IPA_REG_ENDP_ROUTER_HASH_MSK_ALL);
+
+	iowrite32(val, ipa->reg_virt + offset);
+}
+
+static void ipa_route_hash_tuple_config(struct ipa *ipa)
+{
+	u32 route_mask;
+	u32 modem_mask;
+
+	BUILD_BUG_ON(!IPA_SMEM_MODEM_RT_COUNT);
+	BUILD_BUG_ON(IPA_SMEM_RT_COUNT < IPA_SMEM_MODEM_RT_COUNT);
+	BUILD_BUG_ON(IPA_SMEM_RT_COUNT >= BITS_PER_LONG);
+
+	/* Compute a mask representing non-modem routing table entries */
+	route_mask = GENMASK(IPA_SMEM_RT_COUNT - 1, 0);
+	modem_mask = GENMASK(IPA_SMEM_MODEM_RT_INDEX_MAX,
+			     IPA_SMEM_MODEM_RT_INDEX_MIN);
+	route_mask &= ~modem_mask;
+
+	while (route_mask) {
+		u32 route_id = __ffs(route_mask);
+
+		route_mask ^= BIT(route_id);
+
+		ipa_route_tuple_zero(ipa, route_id);
+	}
+}
+
+/**
+ * ipa_route_setup() - Initialize an empty routing table
+ * @ipa:	IPA pointer
+ *
+ * Each entry in the routing table contains the DMA address of a route
+ * descriptor.  A special zero descriptor is allocated that represents "no
+ * route" and this function initializes all its entries to point at that
+ * zero route.  The zero route is allocated with the table, immediately past
+ * its end.
+ *
+ * Return:	0 if successful or -ENOMEM
+ */
+static int ipa_route_setup(struct ipa *ipa)
+{
+	struct device *dev = &ipa->pdev->dev;
+	u64 zero_route_addr;
+	dma_addr_t addr;
+	u32 route_id;
+	size_t size;
+	u64 *virt;
+
+	BUILD_BUG_ON(!IPA_ROUTE_SIZE);
+	BUILD_BUG_ON(sizeof(*virt) != IPA_TABLE_ENTRY_SIZE);
+
+	/* Allocate the routing table, with enough space at the end of the
+	 * table to hold the zero route descriptor.  Initialize all route
+	 * table entries to point to the zero route.
+	 */
+	size = IPA_SMEM_RT_COUNT * IPA_TABLE_ENTRY_SIZE;
+	virt = dma_alloc_coherent(dev, size + IPA_ROUTE_SIZE, &addr,
+				   GFP_KERNEL);
+	if (!virt)
+		return -ENOMEM;
+	ipa->route_virt = virt;
+	ipa->route_addr = addr;
+
+	/* Zero route is immediately after the route table */
+	zero_route_addr = addr + size;
+
+	for (route_id = 0; route_id < IPA_SMEM_RT_COUNT; route_id++)
+		*virt++ = zero_route_addr;
+
+	ipa_cmd_route_config_ipv4(ipa, size);
+	ipa_cmd_route_config_ipv6(ipa, size);
+
+	ipa_route_hash_tuple_config(ipa);
+
+	/* Configure default route for exception packets */
+	ipa_endpoint_default_route_setup(ipa->default_endpoint);
+
+	return 0;
+}
+
+/**
+ * ipa_route_teardown() - Inverse of ipa_route_setup().
+ * @ipa:	IPA pointer
+ */
+static void ipa_route_teardown(struct ipa *ipa)
+{
+	struct device *dev = &ipa->pdev->dev;
+	size_t size;
+
+	ipa_endpoint_default_route_teardown(ipa->default_endpoint);
+
+	size = IPA_SMEM_RT_COUNT * IPA_TABLE_ENTRY_SIZE;
+	size += IPA_ROUTE_SIZE;
+
+	dma_free_coherent(dev, size, ipa->route_virt, ipa->route_addr);
+	ipa->route_virt = NULL;
+	ipa->route_addr = 0;
+}
+
+/**
+ * ipa_filter_setup() - Initialize an empty filter table
+ * @ipa:	IPA pointer
+ *
+ * The filter table consists of a bitmask representing which endpoints support
+ * filtering, followed by one table entry for each set bit in the mask.  Each
+ * entry in the filter table contains the DMA address of a filter descriptor.
+ * A special zero descriptor is allocated that represents "no filter" and this
+ * function initializes all its entries to point at that zero filter.  The
+ * zero filter is allocated with the table, immediately past its end.
+ *
+ * Return:	0 if successful or a negative error code
+ */
+static int ipa_filter_setup(struct ipa *ipa)
+{
+	struct device *dev = &ipa->pdev->dev;
+	u64 zero_filter_addr;
+	u32 filter_count;
+	dma_addr_t addr;
+	size_t size;
+	u64 *virt;
+	u32 i;
+
+	BUILD_BUG_ON(!IPA_FILTER_SIZE);
+
+	/* Allocate the filter table, with an extra slot for the bitmap.  Also
+	 * allocate enough space at the end of the table to hold the zero
+	 * filter descriptor.  Initialize all filter table entries to point
+	 * to that.
+	 */
+	filter_count = hweight32(ipa->filter_support);
+	size = (filter_count + 1) * IPA_TABLE_ENTRY_SIZE;
+	virt = dma_alloc_coherent(dev, size + IPA_FILTER_SIZE, &addr,
+				   GFP_KERNEL);
+	if (!virt)
+		goto err_clear_filter_support;
+	ipa->filter_virt = virt;
+	ipa->filter_addr = addr;
+
+	/* Zero filter is immediately after the filter table */
+	zero_filter_addr = addr + size;
+
+	/* Save the filter table bitmap.  The "soft" bitmap value must be
+	 * converted to the hardware representation by shifting it left one
+	 * position.  (Bit 0 represents global filtering, which is possible
+	 * but not used.)
+	 */
+	*virt++ = ipa->filter_support << 1;
+
+	/* Now point every entry in the table at the empty filter */
+	for (i = 0; i < filter_count; i++)
+		*virt++ = zero_filter_addr;
+
+	ipa_cmd_filter_config_ipv4(ipa, size);
+	ipa_cmd_filter_config_ipv6(ipa, size);
+
+	ipa_filter_hash_tuple_config(ipa);
+
+	return 0;
+
+err_clear_filter_support:
+	ipa->filter_support = 0;
+
+	return -ENOMEM;
+}
+
+/**
+ * ipa_filter_teardown() - Inverse of ipa_filter_setup().
+ * @ipa:	IPA pointer
+ */
+static void ipa_filter_teardown(struct ipa *ipa)
+{
+	u32 filter_count = hweight32(ipa->filter_support);
+	struct device *dev = &ipa->pdev->dev;
+	size_t size;
+
+	size = (filter_count + 1) * IPA_TABLE_ENTRY_SIZE;
+	size += IPA_FILTER_SIZE;
+
+	dma_free_coherent(dev, size, ipa->filter_virt, ipa->filter_addr);
+	ipa->filter_virt = NULL;
+	ipa->filter_addr = 0;
+	ipa->filter_support = 0;
+}
+
+/**
+ * ipa_suspend_handler() - Handle the suspend interrupt
+ * @ipa:	IPA pointer
+ * @interrupt_id:	Interrupt type
+ *
+ * When in suspended state, the IPA can trigger a resume by sending a SUSPEND
+ * IPA interrupt.
+ */
+static void ipa_suspend_handler(struct ipa *ipa,
+				enum ipa_interrupt_id interrupt_id)
+{
+	/* Take a single clock reference to prevent suspend.  All
+	 * endpoints will be resumed as a result.  This reference will
+	 * be dropped when we get a power management suspend request.
+	 */
+	if (!atomic_xchg(&ipa->suspend_ref, 1))
+		ipa_clock_get(ipa->clock);
+
+	/* Acknowledge/clear the suspend interrupt on all endpoints */
+	ipa_interrupt_suspend_clear_all(ipa->interrupt);
+}
+
+/* Remoteproc callbacks for SSR events: prepare, start, stop, unprepare */
+int ipa_ssr_prepare(struct rproc_subdev *subdev)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(ipa_ssr_prepare);
+
+int ipa_ssr_start(struct rproc_subdev *subdev)
+{
+	return 0;
+}
+EXPORT_SYMBOL_GPL(ipa_ssr_start);
+
+void ipa_ssr_stop(struct rproc_subdev *subdev, bool crashed)
+{
+}
+EXPORT_SYMBOL_GPL(ipa_ssr_stop);
+
+void ipa_ssr_unprepare(struct rproc_subdev *subdev)
+{
+}
+EXPORT_SYMBOL_GPL(ipa_ssr_unprepare);
+
+/**
+ * ipa_setup() - Set up IPA hardware
+ * @ipa:	IPA pointer
+ *
+ * Perform initialization that requires issuing immediate commands using the
+ * command TX endpoint.  This cannot be run until early initialization
+ * (including loading GSI firmware) is complete.
+ */
+int ipa_setup(struct ipa *ipa)
+{
+	struct ipa_endpoint *rx_endpoint;
+	struct ipa_endpoint *tx_endpoint;
+	int ret;
+
+	dev_dbg(&ipa->pdev->dev, "%s() started\n", __func__);
+
+	ret = gsi_setup(&ipa->gsi);
+	if (ret)
+		return ret;
+
+	ipa->interrupt = ipa_interrupt_setup(ipa);
+	if (IS_ERR(ipa->interrupt)) {
+		ret = PTR_ERR(ipa->interrupt);
+		goto err_gsi_teardown;
+	}
+	ipa_interrupt_add(ipa->interrupt, IPA_INTERRUPT_TX_SUSPEND,
+			  ipa_suspend_handler);
+
+	ipa_uc_setup(ipa);
+
+	ipa_endpoint_setup(ipa);
+
+	/* We need to use the AP command out endpoint to perform other
+	 * initialization, so we set that up first.
+	 */
+	ret = ipa_endpoint_enable_one(ipa->command_endpoint);
+	if (ret)
+		goto err_endpoint_teardown;
+
+	ret = ipa_smem_setup(ipa);
+	if (ret)
+		goto err_command_disable;
+
+	ret = ipa_route_setup(ipa);
+	if (ret)
+		goto err_smem_teardown;
+
+	ret = ipa_filter_setup(ipa);
+	if (ret)
+		goto err_route_teardown;
+
+	ret = ipa_endpoint_enable_one(ipa->default_endpoint);
+	if (ret)
+		goto err_filter_teardown;
+
+	rx_endpoint = &ipa->endpoint[IPA_ENDPOINT_AP_MODEM_RX];
+	tx_endpoint = &ipa->endpoint[IPA_ENDPOINT_AP_MODEM_TX];
+	ipa->modem_netdev = ipa_netdev_setup(ipa, rx_endpoint, tx_endpoint);
+	if (IS_ERR(ipa->modem_netdev)) {
+		ret = PTR_ERR(ipa->modem_netdev);
+		goto err_default_disable;
+	}
+
+	ipa->setup_complete = 1;
+
+	dev_info(&ipa->pdev->dev, "IPA driver setup completed successfully\n");
+
+	return 0;
+
+err_default_disable:
+	ipa_endpoint_disable_one(ipa->default_endpoint);
+err_filter_teardown:
+	ipa_filter_teardown(ipa);
+err_route_teardown:
+	ipa_route_teardown(ipa);
+err_smem_teardown:
+	ipa_smem_teardown(ipa);
+err_command_disable:
+	ipa_endpoint_disable_one(ipa->command_endpoint);
+err_endpoint_teardown:
+	ipa_endpoint_teardown(ipa);
+	ipa_uc_teardown(ipa);
+	ipa_interrupt_remove(ipa->interrupt, IPA_INTERRUPT_TX_SUSPEND);
+	ipa_interrupt_teardown(ipa->interrupt);
+err_gsi_teardown:
+	gsi_teardown(&ipa->gsi);
+
+	return ret;
+}
+
+/**
+ * ipa_teardown() - Inverse of ipa_setup()
+ * @ipa:	IPA pointer
+ */
+static void ipa_teardown(struct ipa *ipa)
+{
+	ipa_netdev_teardown(ipa->modem_netdev);
+	ipa_endpoint_disable_one(ipa->default_endpoint);
+	ipa_filter_teardown(ipa);
+	ipa_route_teardown(ipa);
+	ipa_smem_teardown(ipa);
+	ipa_endpoint_disable_one(ipa->command_endpoint);
+	ipa_endpoint_teardown(ipa);
+	ipa_uc_teardown(ipa);
+	ipa_interrupt_remove(ipa->interrupt, IPA_INTERRUPT_TX_SUSPEND);
+	ipa_interrupt_teardown(ipa->interrupt);
+	gsi_teardown(&ipa->gsi);
+}
+
+/**
+ * ipa_hardware_config() - Primitive hardware initialization
+ * @ipa:	IPA pointer
+ */
+static void ipa_hardware_config(struct ipa *ipa)
+{
+	u32 val;
+
+	/* SDM845 has IPA version 3.5.1 */
+	val = IPA_BCR_REG_VAL;
+	iowrite32(val, ipa->reg_virt + IPA_REG_BCR_OFFSET);
+
+	val = u32_encode_bits(8, GEN_QMB_0_MAX_WRITES_FMASK);
+	val |= u32_encode_bits(4, GEN_QMB_1_MAX_WRITES_FMASK);
+	iowrite32(val, ipa->reg_virt + IPA_REG_QSB_MAX_WRITES_OFFSET);
+
+	val = u32_encode_bits(8, GEN_QMB_0_MAX_READS_FMASK);
+	val |= u32_encode_bits(12, GEN_QMB_1_MAX_READS_FMASK);
+	iowrite32(val, ipa->reg_virt + IPA_REG_QSB_MAX_READS_OFFSET);
+}
+
+/**
+ * ipa_hardware_deconfig() - Inverse of ipa_hardware_config()
+ * @ipa:	IPA pointer
+ *
+ * This restores the power-on reset values (even if they aren't different)
+ */
+static void ipa_hardware_deconfig(struct ipa *ipa)
+{
+	/* Values we program above are the same as the power-on reset values */
+}
+
+static void ipa_resource_config_src_one(struct ipa *ipa,
+					const struct ipa_resource_src *resource)
+{
+	u32 offset = IPA_REG_SRC_RSRC_GRP_01_RSRC_TYPE_N_OFFSET;
+	u32 stride = IPA_REG_SRC_RSRC_GRP_01_RSRC_TYPE_N_STRIDE;
+	enum ipa_resource_type_src n = resource->type;
+	const struct ipa_resource_limits *xlimits;
+	const struct ipa_resource_limits *ylimits;
+	u32 val;
+
+	xlimits = &resource->limits[IPA_RESOURCE_GROUP_LWA_DL];
+	ylimits = &resource->limits[IPA_RESOURCE_GROUP_UL_DL];
+
+	val = u32_encode_bits(xlimits->min, X_MIN_LIM_FMASK);
+	val |= u32_encode_bits(xlimits->max, X_MAX_LIM_FMASK);
+	val |= u32_encode_bits(ylimits->min, Y_MIN_LIM_FMASK);
+	val |= u32_encode_bits(ylimits->max, Y_MAX_LIM_FMASK);
+
+	iowrite32(val, ipa->reg_virt + offset + n * stride);
+}
+
+static void ipa_resource_config_dst_one(struct ipa *ipa,
+					const struct ipa_resource_dst *resource)
+{
+	u32 offset = IPA_REG_DST_RSRC_GRP_01_RSRC_TYPE_N_OFFSET;
+	u32 stride = IPA_REG_DST_RSRC_GRP_01_RSRC_TYPE_N_STRIDE;
+	enum ipa_resource_type_dst n = resource->type;
+	const struct ipa_resource_limits *xlimits;
+	const struct ipa_resource_limits *ylimits;
+	u32 val;
+
+	xlimits = &resource->limits[IPA_RESOURCE_GROUP_LWA_DL];
+	ylimits = &resource->limits[IPA_RESOURCE_GROUP_UL_DL];
+
+	val = u32_encode_bits(xlimits->min, X_MIN_LIM_FMASK);
+	val |= u32_encode_bits(xlimits->max, X_MAX_LIM_FMASK);
+	val |= u32_encode_bits(ylimits->min, Y_MIN_LIM_FMASK);
+	val |= u32_encode_bits(ylimits->max, Y_MAX_LIM_FMASK);
+
+	iowrite32(val, ipa->reg_virt + offset + n * stride);
+}
+
+static void
+ipa_resource_config(struct ipa *ipa, const struct ipa_resource_data *data)
+{
+	const struct ipa_resource_src *resource_src;
+	const struct ipa_resource_dst *resource_dst;
+	u32 i;
+
+	resource_src = data->resource_src;
+	resource_dst = data->resource_dst;
+
+	for (i = 0; i < data->resource_src_count; i++)
+		ipa_resource_config_src_one(ipa, &resource_src[i]);
+
+	for (i = 0; i < data->resource_dst_count; i++)
+		ipa_resource_config_dst_one(ipa, &resource_dst[i]);
+}
+
+static void ipa_resource_deconfig(struct ipa *ipa)
+{
+	/* Nothing to do */
+}
+
+static void ipa_idle_indication_cfg(struct ipa *ipa,
+				    u32 enter_idle_debounce_thresh,
+				    bool const_non_idle_enable)
+{
+	u32 val;
+
+	val = u32_encode_bits(enter_idle_debounce_thresh,
+			      ENTER_IDLE_DEBOUNCE_THRESH_FMASK);
+	if (const_non_idle_enable)
+		val |= CONST_NON_IDLE_ENABLE_FMASK;
+
+	iowrite32(val, ipa->reg_virt + IPA_REG_IDLE_INDICATION_CFG_OFFSET);
+}
+
+/**
+ * ipa_dcd_config() - Enable dynamic clock division on IPA
+ *
+ * Configures when the IPA signals it is idle to the global clock
+ * controller, which can respond by scaling down the clock to
+ * save power.
+ */
+static void ipa_dcd_config(struct ipa *ipa)
+{
+	/* Recommended values for IPA 3.5 according to IPA HPG */
+	ipa_idle_indication_cfg(ipa, 256, false);
+}
+
+static void ipa_dcd_deconfig(struct ipa *ipa)
+{
+	/* Power-on reset values */
+	ipa_idle_indication_cfg(ipa, 0, true);
+}
+
+/**
+ * ipa_config() - Configure IPA hardware
+ * @ipa:	IPA pointer
+ *
+ * Perform initialization requiring IPA clock to be enabled.
+ */
+static int ipa_config(struct ipa *ipa, const struct ipa_data *data)
+{
+	u32 val;
+	int ret;
+
+	/* Get a clock reference to allow initialization.  This reference
+	 * is held after initialization completes, and won't get dropped
+	 * unless/until a system suspend request arrives.
+	 */
+	atomic_set(&ipa->suspend_ref, 1);
+	ipa_clock_get(ipa->clock);
+
+	ipa_hardware_config(ipa);
+
+	/* Ensure we support the number of endpoints supplied by hardware */
+	val = ioread32(ipa->reg_virt + IPA_REG_ENABLED_PIPES_OFFSET);
+	if (val > IPA_ENDPOINT_MAX) {
+		ret = -EINVAL;
+		goto err_hardware_deconfig;
+	}
+
+	ret = ipa_smem_config(ipa);
+	if (ret)
+		goto err_hardware_deconfig;
+
+	/* Assign resource limitation to each group */
+	ipa_resource_config(ipa, data->resource_data);
+
+	/* Note enabling dynamic clock division must not be
+	 * attempted for IPA hardware versions prior to 3.5.
+	 */
+	ipa_dcd_config(ipa);
+
+	return 0;
+
+err_hardware_deconfig:
+	ipa_hardware_deconfig(ipa);
+	ipa_clock_put(ipa->clock);
+
+	return ret;
+}
+
+/**
+ * ipa_deconfig() - Inverse of ipa_config()
+ * @ipa:	IPA pointer
+ */
+static void ipa_deconfig(struct ipa *ipa)
+{
+	ipa_dcd_deconfig(ipa);
+	ipa_resource_deconfig(ipa);
+	ipa_smem_deconfig(ipa);
+	ipa_hardware_deconfig(ipa);
+
+	ipa_clock_put(ipa->clock);
+}
+
+static int ipa_firmware_load(struct device *dev)
+{
+	const struct firmware *fw;
+	struct device_node *node;
+	struct resource res;
+	phys_addr_t phys;
+	size_t size;
+	void *virt;
+	int ret;
+
+	node = of_parse_phandle(dev->of_node, "memory-region", 0);
+	if (!node) {
+		dev_err(dev, "memory-region not specified\n");
+		return -EINVAL;
+	}
+
+	ret = of_address_to_resource(node, 0, &res);
+	of_node_put(node);	/* drop reference from of_parse_phandle() */
+	if (ret)
+		return ret;
+
+	ret = request_firmware(&fw, IPA_FWS_PATH, dev);
+	if (ret)
+		return ret;
+
+	phys = res.start;
+	size = (size_t)resource_size(&res);
+	virt = memremap(phys, size, MEMREMAP_WC);
+	if (!virt) {
+		ret = -ENOMEM;
+		goto out_release_firmware;
+	}
+
+	ret = qcom_mdt_load(dev, fw, IPA_FWS_PATH, IPA_PAS_ID,
+			    virt, phys, size, NULL);
+	if (!ret)
+		ret = qcom_scm_pas_auth_and_reset(IPA_PAS_ID);
+
+	memunmap(virt);
+out_release_firmware:
+	release_firmware(fw);
+
+	return ret;
+}
+
+static const struct of_device_id ipa_match[] = {
+	{
+		.compatible	= "qcom,sdm845-ipa",
+		.data		= &ipa_data_sdm845,
+	},
+	{ },
+};
+
+/**
+ * ipa_probe() - IPA platform driver probe function
+ * @pdev:	Platform device pointer
+ *
+ * Return:	0 if successful, or a negative error code (possibly
+ *		-EPROBE_DEFER)
+ *
+ * This is the main entry point for the IPA driver.  When successful, it
+ * initializes the IPA hardware for use.
+ *
+ * Initialization proceeds in several stages.  The "init" stage involves
+ * activities that can be initialized without access to the IPA hardware.
+ * The "config" stage requires the IPA clock to be active so IPA registers
+ * can be accessed, but does not require access to the GSI layer.  The
+ * "setup" stage requires access to GSI, and includes initialization that's
+ * performed by issuing IPA immediate commands.
+ */
+static int ipa_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	const struct ipa_data *data;
+	struct ipa *ipa;
+	bool modem_init;
+	int ret;
+
+	/* We assume we're working on 64-bit hardware */
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_64BIT));
+	BUILD_BUG_ON(ARCH_DMA_MINALIGN % IPA_TABLE_ALIGN);
+
+	data = of_device_get_match_data(dev);
+
+	modem_init = of_property_read_bool(dev->of_node, "modem-init");
+
+	/* If we need Trust Zone, make sure it's ready */
+	if (!modem_init)
+		if (!qcom_scm_is_available())
+			return -EPROBE_DEFER;
+
+	ipa = kzalloc(sizeof(*ipa), GFP_KERNEL);
+	if (!ipa)
+		return -ENOMEM;
+	ipa->pdev = pdev;
+	dev_set_drvdata(dev, ipa);
+
+	/* Initialize the clock and interconnects early.  They might
+	 * not be ready when we're probed, so might return -EPROBE_DEFER.
+	 */
+	atomic_set(&ipa->suspend_ref, 0);
+
+	ipa->clock = ipa_clock_init(ipa);
+	if (IS_ERR(ipa->clock)) {
+		ret = PTR_ERR(ipa->clock);
+		goto err_free_ipa;
+	}
+
+	ret = ipa_mem_init(ipa);
+	if (ret)
+		goto err_clock_exit;
+
+	ret = gsi_init(&ipa->gsi, pdev, data->endpoint_data_count,
+		       data->endpoint_data);
+	if (ret)
+		goto err_mem_exit;
+
+	ipa->smp2p = ipa_smp2p_init(ipa, modem_init);
+	if (IS_ERR(ipa->smp2p)) {
+		ret = PTR_ERR(ipa->smp2p);
+		goto err_gsi_exit;
+	}
+
+	ret = ipa_endpoint_init(ipa, data->endpoint_data_count,
+				data->endpoint_data);
+	if (ret)
+		goto err_smp2p_exit;
+	ipa->command_endpoint = &ipa->endpoint[IPA_ENDPOINT_AP_COMMAND_TX];
+	ipa->default_endpoint = &ipa->endpoint[IPA_ENDPOINT_AP_LAN_RX];
+
+	/* Create a wakeup source. */
+	wakeup_source_init(&ipa->wakeup, "ipa");
+
+	/* Proceed to real initialization */
+	ret = ipa_config(ipa, data);
+	if (ret)
+		goto err_endpoint_exit;
+
+	dev_info(dev, "IPA driver initialized\n");
+
+	/* If the modem is verifying and loading firmware, we're
+	 * done.  We will receive an SMP2P interrupt when it is OK
+	 * to proceed with the setup phase (involving issuing
+	 * immediate commands after GSI is initialized).
+	 */
+	if (modem_init)
+		return 0;
+
+	/* Otherwise we need to load the firmware and have Trust
+	 * Zone validate and install it.  If that succeeds we can
+	 * proceed with setup.
+	 */
+	ret = ipa_firmware_load(dev);
+	if (ret)
+		goto err_deconfig;
+
+	ret = ipa_setup(ipa);
+	if (ret)
+		goto err_deconfig;
+
+	return 0;
+
+err_deconfig:
+	ipa_deconfig(ipa);
+err_endpoint_exit:
+	wakeup_source_remove(&ipa->wakeup);
+	ipa_endpoint_exit(ipa);
+err_smp2p_exit:
+	ipa_smp2p_exit(ipa->smp2p);
+err_gsi_exit:
+	gsi_exit(&ipa->gsi);
+err_mem_exit:
+	ipa_mem_exit(ipa);
+err_clock_exit:
+	ipa_clock_exit(ipa->clock);
+err_free_ipa:
+	kfree(ipa);
+
+	return ret;
+}
+
+static int ipa_remove(struct platform_device *pdev)
+{
+	struct ipa *ipa = dev_get_drvdata(&pdev->dev);
+
+	ipa_smp2p_disable(ipa->smp2p);
+	if (ipa->setup_complete)
+		ipa_teardown(ipa);
+
+	ipa_deconfig(ipa);
+	wakeup_source_remove(&ipa->wakeup);
+	ipa_endpoint_exit(ipa);
+	ipa_smp2p_exit(ipa->smp2p);
+	gsi_exit(&ipa->gsi);
+	ipa_mem_exit(ipa);
+	ipa_clock_exit(ipa->clock);
+	kfree(ipa);
+
+	return 0;
+}
+
+/**
+ * ipa_suspend() - Power management system suspend callback
+ * @dev:	IPA device structure
+ *
+ * Return:	Always returns 0
+ *
+ * Called by the PM framework when a system suspend operation is invoked.
+ */
+int ipa_suspend(struct device *dev)
+{
+	struct ipa *ipa = dev_get_drvdata(dev);
+
+	ipa_clock_put(ipa->clock);
+	atomic_set(&ipa->suspend_ref, 0);
+
+	return 0;
+}
+
+/**
+ * ipa_resume() - Power management system resume callback
+ * @dev:	IPA device structure
+ *
+ * Return:	Always returns 0
+ *
+ * Called by the PM framework when a system resume operation is invoked.
+ */
+int ipa_resume(struct device *dev)
+{
+	struct ipa *ipa = dev_get_drvdata(dev);
+
+	/* This clock reference will keep the IPA out of suspend
+	 * until we get a power management suspend request.
+	 */
+	atomic_set(&ipa->suspend_ref, 1);
+	ipa_clock_get(ipa->clock);
+
+	return 0;
+}
+
+static const struct dev_pm_ops ipa_pm_ops = {
+	.suspend_noirq	= ipa_suspend,
+	.resume_noirq	= ipa_resume,
+};
+
+static struct platform_driver ipa_driver = {
+	.probe	= ipa_probe,
+	.remove	= ipa_remove,
+	.driver	= {
+		.name		= "ipa",
+		.owner		= THIS_MODULE,
+		.pm		= &ipa_pm_ops,
+		.of_match_table	= ipa_match,
+	},
+};
+
+module_platform_driver(ipa_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Qualcomm IP Accelerator device driver");
diff --git a/drivers/net/ipa/ipa_reg.h b/drivers/net/ipa/ipa_reg.h
new file mode 100644
index 000000000000..8d04db6f7b00
--- /dev/null
+++ b/drivers/net/ipa/ipa_reg.h
@@ -0,0 +1,279 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_REG_H_
+#define _IPA_REG_H_
+
+#include <linux/bits.h>
+
+/**
+ * DOC: IPA Registers
+ *
+ * IPA registers are located within the "ipa" address space defined by
+ * Device Tree.  The offset of each register within that space is specified
+ * by symbols defined below.  The address space is mapped to virtual memory
+ * space in ipa_mem_init().  All IPA registers are 32 bits wide.
+ *
+ * Certain register types are duplicated for a number of instances of
+ * something.  For example, each IPA endpoint has a set of registers
+ * defining its configuration.  The offset to an endpoint's set of registers
+ * is computed based on a "base" offset plus an additional "stride" offset
+ * that's dependent on the endpoint's ID.  For such registers, the offset
+ * is computed by a function-like macro that takes a parameter used in
+ * the computation.
+ *
+ * The offset of a register dependent on execution environment is computed
+ * by a macro that is supplied a parameter "ee".  The "ee" value is a member
+ * of the gsi_ee enumerated type.
+ *
+ * The offset of a register dependent on endpoint id is computed by a macro
+ * that is supplied a parameter "ep".  The "ep" value must be less than
+ * IPA_ENDPOINT_MAX.
+ *
+ * The offset of registers related to hashed filter and router tables is
+ * computed by a macro that is supplied a parameter "er".  The "er" represents
+ * an endpoint ID for filters, or a route ID for routes.  For filters, the
+ * endpoint ID must be less than IPA_ENDPOINT_MAX, but is further restricted
+ * because not all endpoints support filtering.  For routes, the route ID
+ * must be less than IPA_SMEM_RT_COUNT.
+ *
+ * Some registers encode multiple fields within them.  For these, each field
+ * has a symbol below defining a mask that specifies both the position and
+ * width of the field within its register.
+ */
+
+#define IPA_REG_ENABLED_PIPES_OFFSET			0x00000038
+
+#define IPA_REG_ROUTE_OFFSET				0x00000048
+#define ROUTE_DIS_FMASK				GENMASK(0, 0)
+#define ROUTE_DEF_PIPE_FMASK			GENMASK(5, 1)
+#define ROUTE_DEF_HDR_TABLE_FMASK		GENMASK(6, 6)
+#define ROUTE_DEF_HDR_OFST_FMASK		GENMASK(16, 7)
+#define ROUTE_FRAG_DEF_PIPE_FMASK		GENMASK(21, 17)
+#define ROUTE_DEF_RETAIN_HDR_FMASK		GENMASK(24, 24)
+
+#define IPA_REG_SHARED_MEM_SIZE_OFFSET			0x00000054
+#define SHARED_MEM_SIZE_FMASK			GENMASK(15, 0)
+#define SHARED_MEM_BADDR_FMASK			GENMASK(31, 16)
+
+#define IPA_REG_QSB_MAX_WRITES_OFFSET			0x00000074
+#define GEN_QMB_0_MAX_WRITES_FMASK		GENMASK(3, 0)
+#define GEN_QMB_1_MAX_WRITES_FMASK		GENMASK(7, 4)
+
+#define IPA_REG_QSB_MAX_READS_OFFSET			0x00000078
+#define GEN_QMB_0_MAX_READS_FMASK		GENMASK(3, 0)
+#define GEN_QMB_1_MAX_READS_FMASK		GENMASK(7, 4)
+
+#define IPA_REG_STATE_AGGR_ACTIVE_OFFSET		0x0000010c
+
+#define IPA_REG_BCR_OFFSET				0x000001d0
+
+#define IPA_REG_LOCAL_PKT_PROC_CNTXT_BASE_OFFSET	0x000001e8
+
+#define IPA_REG_AGGR_FORCE_CLOSE_OFFSET			0x000001ec
+#define PIPE_BITMAP_FMASK			GENMASK(19, 0)
+
+#define IPA_REG_IDLE_INDICATION_CFG_OFFSET		0x00000220
+#define ENTER_IDLE_DEBOUNCE_THRESH_FMASK	GENMASK(15, 0)
+#define CONST_NON_IDLE_ENABLE_FMASK		GENMASK(16, 16)
+
+#define IPA_REG_SRC_RSRC_GRP_01_RSRC_TYPE_N_OFFSET	0x00000400
+#define IPA_REG_SRC_RSRC_GRP_01_RSRC_TYPE_N_STRIDE	0x0020
+#define IPA_REG_DST_RSRC_GRP_01_RSRC_TYPE_N_OFFSET	0x00000500
+#define IPA_REG_DST_RSRC_GRP_01_RSRC_TYPE_N_STRIDE	0x0020
+#define X_MIN_LIM_FMASK				GENMASK(5, 0)
+#define X_MAX_LIM_FMASK				GENMASK(13, 8)
+#define Y_MIN_LIM_FMASK				GENMASK(21, 16)
+#define Y_MAX_LIM_FMASK				GENMASK(29, 24)
+
+#define IPA_REG_ENDP_INIT_CTRL_N_OFFSET(ep) \
+					(0x00000800 + 0x0070 * (ep))
+#define ENDP_SUSPEND_FMASK			GENMASK(0, 0)
+#define ENDP_DELAY_FMASK			GENMASK(1, 1)
+
+#define IPA_REG_ENDP_INIT_CFG_N_OFFSET(ep) \
+					(0x00000808 + 0x0070 * (ep))
+#define FRAG_OFFLOAD_EN_FMASK			GENMASK(0, 0)
+#define CS_OFFLOAD_EN_FMASK			GENMASK(2, 1)
+#define CS_METADATA_HDR_OFFSET_FMASK		GENMASK(6, 3)
+#define CS_GEN_QMB_MASTER_SEL_FMASK		GENMASK(8, 8)
+
+#define IPA_REG_ENDP_INIT_HDR_N_OFFSET(ep) \
+					(0x00000810 + 0x0070 * (ep))
+#define HDR_LEN_FMASK				GENMASK(5, 0)
+#define HDR_OFST_METADATA_VALID_FMASK		GENMASK(6, 6)
+#define HDR_OFST_METADATA_FMASK			GENMASK(12, 7)
+#define HDR_ADDITIONAL_CONST_LEN_FMASK		GENMASK(18, 13)
+#define HDR_OFST_PKT_SIZE_VALID_FMASK		GENMASK(19, 19)
+#define HDR_OFST_PKT_SIZE_FMASK			GENMASK(25, 20)
+#define HDR_A5_MUX_FMASK			GENMASK(26, 26)
+#define HDR_LEN_INC_DEAGG_HDR_FMASK		GENMASK(27, 27)
+#define HDR_METADATA_REG_VALID_FMASK		GENMASK(28, 28)
+
+#define IPA_REG_ENDP_INIT_HDR_EXT_N_OFFSET(ep) \
+					(0x00000814 + 0x0070 * (ep))
+#define HDR_ENDIANNESS_FMASK			GENMASK(0, 0)
+#define HDR_TOTAL_LEN_OR_PAD_VALID_FMASK	GENMASK(1, 1)
+#define HDR_TOTAL_LEN_OR_PAD_FMASK		GENMASK(2, 2)
+#define HDR_PAYLOAD_LEN_INC_PADDING_FMASK	GENMASK(3, 3)
+#define HDR_TOTAL_LEN_OR_PAD_OFFSET_FMASK	GENMASK(9, 4)
+#define HDR_PAD_TO_ALIGNMENT_FMASK		GENMASK(13, 10)
+
+#define IPA_REG_ENDP_INIT_HDR_METADATA_MASK_N_OFFSET(ep) \
+					(0x00000818 + 0x0070 * (ep))
+
+#define IPA_REG_ENDP_INIT_AGGR_N_OFFSET(ep) \
+					(0x00000824 + 0x0070 * (ep))
+#define AGGR_EN_FMASK				GENMASK(1, 0)
+#define AGGR_TYPE_FMASK				GENMASK(4, 2)
+#define AGGR_BYTE_LIMIT_FMASK			GENMASK(9, 5)
+#define AGGR_TIME_LIMIT_FMASK			GENMASK(14, 10)
+#define AGGR_PKT_LIMIT_FMASK			GENMASK(20, 15)
+#define AGGR_SW_EOF_ACTIVE_FMASK		GENMASK(21, 21)
+#define AGGR_FORCE_CLOSE_FMASK			GENMASK(22, 22)
+#define AGGR_HARD_BYTE_LIMIT_ENABLE_FMASK	GENMASK(24, 24)
+
+#define IPA_REG_ENDP_INIT_MODE_N_OFFSET(ep) \
+					(0x00000820 + 0x0070 * (ep))
+#define MODE_FMASK				GENMASK(2, 0)
+#define DEST_PIPE_INDEX_FMASK			GENMASK(8, 4)
+#define BYTE_THRESHOLD_FMASK			GENMASK(27, 12)
+#define PIPE_REPLICATION_EN_FMASK		GENMASK(28, 28)
+#define PAD_EN_FMASK				GENMASK(29, 29)
+#define HDR_FTCH_DISABLE_FMASK			GENMASK(30, 30)
+
+#define IPA_REG_ENDP_INIT_DEAGGR_N_OFFSET(ep) \
+					(0x00000834 + 0x0070 * (ep))
+#define DEAGGR_HDR_LEN_FMASK			GENMASK(5, 0)
+#define PACKET_OFFSET_VALID_FMASK		GENMASK(7, 7)
+#define PACKET_OFFSET_LOCATION_FMASK		GENMASK(13, 8)
+#define MAX_PACKET_LEN_FMASK			GENMASK(31, 16)
+
+#define IPA_REG_ENDP_INIT_SEQ_N_OFFSET(ep) \
+					(0x0000083c + 0x0070 * (ep))
+#define HPS_SEQ_TYPE_FMASK			GENMASK(3, 0)
+#define DPS_SEQ_TYPE_FMASK			GENMASK(7, 4)
+#define HPS_REP_SEQ_TYPE_FMASK			GENMASK(11, 8)
+#define DPS_REP_SEQ_TYPE_FMASK			GENMASK(15, 12)
+
+#define IPA_REG_ENDP_STATUS_N_OFFSET(ep) \
+					(0x00000840 + 0x0070 * (ep))
+#define STATUS_EN_FMASK				GENMASK(0, 0)
+#define STATUS_ENDP_FMASK			GENMASK(5, 1)
+#define STATUS_LOCATION_FMASK			GENMASK(8, 8)
+#define STATUS_PKT_SUPPRESS_FMASK		GENMASK(9, 9)
+
+/* "er" is either an endpoint id (for filters) or a route id (for routes) */
+#define IPA_REG_ENDP_FILTER_ROUTER_HSH_CFG_N_OFFSET(er) \
+					(0x0000085c + 0x0070 * (er))
+#define FILTER_HASH_MSK_SRC_ID_FMASK		GENMASK(0, 0)
+#define FILTER_HASH_MSK_SRC_IP_FMASK		GENMASK(1, 1)
+#define FILTER_HASH_MSK_DST_IP_FMASK		GENMASK(2, 2)
+#define FILTER_HASH_MSK_SRC_PORT_FMASK		GENMASK(3, 3)
+#define FILTER_HASH_MSK_DST_PORT_FMASK		GENMASK(4, 4)
+#define FILTER_HASH_MSK_PROTOCOL_FMASK		GENMASK(5, 5)
+#define FILTER_HASH_MSK_METADATA_FMASK		GENMASK(6, 6)
+#define FILTER_HASH_UNDEFINED1_FMASK		GENMASK(15, 7)
+#define IPA_REG_ENDP_FILTER_HASH_MSK_ALL	GENMASK(15, 0)
+
+#define ROUTER_HASH_MSK_SRC_ID_FMASK		GENMASK(16, 16)
+#define ROUTER_HASH_MSK_SRC_IP_FMASK		GENMASK(17, 17)
+#define ROUTER_HASH_MSK_DST_IP_FMASK		GENMASK(18, 18)
+#define ROUTER_HASH_MSK_SRC_PORT_FMASK		GENMASK(19, 19)
+#define ROUTER_HASH_MSK_DST_PORT_FMASK		GENMASK(20, 20)
+#define ROUTER_HASH_MSK_PROTOCOL_FMASK		GENMASK(21, 21)
+#define ROUTER_HASH_MSK_METADATA_FMASK		GENMASK(22, 22)
+#define ROUTER_HASH_UNDEFINED2_FMASK		GENMASK(31, 23)
+#define IPA_REG_ENDP_ROUTER_HASH_MSK_ALL	GENMASK(31, 16)
+
+#define IPA_REG_IRQ_STTS_OFFSET	\
+				IPA_REG_IRQ_STTS_EE_N_OFFSET(GSI_EE_AP)
+#define IPA_REG_IRQ_STTS_EE_N_OFFSET(ee) \
+					(0x00003008 + 0x1000 * (ee))
+
+#define IPA_REG_IRQ_EN_OFFSET \
+				IPA_REG_IRQ_EN_EE_N_OFFSET(GSI_EE_AP)
+#define IPA_REG_IRQ_EN_EE_N_OFFSET(ee) \
+					(0x0000300c + 0x1000 * (ee))
+
+#define IPA_REG_IRQ_CLR_OFFSET \
+				IPA_REG_IRQ_CLR_EE_N_OFFSET(GSI_EE_AP)
+#define IPA_REG_IRQ_CLR_EE_N_OFFSET(ee) \
+					(0x00003010 + 0x1000 * (ee))
+
+#define IPA_REG_IRQ_UC_OFFSET \
+				IPA_REG_IRQ_UC_EE_N_OFFSET(GSI_EE_AP)
+#define IPA_REG_IRQ_UC_EE_N_OFFSET(ee) \
+					(0x0000301c + 0x1000 * (ee))
+
+#define IPA_REG_IRQ_SUSPEND_INFO_OFFSET \
+				IPA_REG_IRQ_SUSPEND_INFO_EE_N_OFFSET(GSI_EE_AP)
+#define IPA_REG_IRQ_SUSPEND_INFO_EE_N_OFFSET(ee) \
+					(0x00003030 + 0x1000 * (ee))
+
+#define IPA_REG_SUSPEND_IRQ_EN_OFFSET \
+				IPA_REG_SUSPEND_IRQ_EN_EE_N_OFFSET(GSI_EE_AP)
+#define IPA_REG_SUSPEND_IRQ_EN_EE_N_OFFSET(ee) \
+					(0x00003034 + 0x1000 * (ee))
+
+#define IPA_REG_SUSPEND_IRQ_CLR_OFFSET \
+				IPA_REG_SUSPEND_IRQ_CLR_EE_N_OFFSET(GSI_EE_AP)
+#define IPA_REG_SUSPEND_IRQ_CLR_EE_N_OFFSET(ee) \
+					(0x00003038 + 0x1000 * (ee))
+
+/** enum ipa_cs_offload_en - checksum offload field in ENDP_INIT_CFG_N */
+enum ipa_cs_offload_en {
+	IPA_CS_OFFLOAD_NONE	= 0,
+	IPA_CS_OFFLOAD_UL	= 1,
+	IPA_CS_OFFLOAD_DL	= 2,
+	IPA_CS_RSVD
+};
+
+/** enum ipa_aggr_en - aggregation enable field in ENDP_INIT_AGGR_N */
+enum ipa_aggr_en {
+	IPA_BYPASS_AGGR		= 0,
+	IPA_ENABLE_AGGR		= 1,
+	IPA_ENABLE_DEAGGR	= 2,
+};
+
+/** enum ipa_aggr_type - aggregation type field in ENDP_INIT_AGGR_N */
+enum ipa_aggr_type {
+	IPA_MBIM_16 = 0,
+	IPA_HDLC    = 1,
+	IPA_TLP	    = 2,
+	IPA_RNDIS   = 3,
+	IPA_GENERIC = 4,
+	IPA_QCMAP   = 6,
+};
+
+/** enum ipa_mode - mode field in ENDP_INIT_MODE_N */
+enum ipa_mode {
+	IPA_BASIC			= 0,
+	IPA_ENABLE_FRAMING_HDLC		= 1,
+	IPA_ENABLE_DEFRAMING_HDLC	= 2,
+	IPA_DMA				= 3,
+};
+
+/**
+ * enum ipa_seq_type - HPS and DPS sequencer type fields in ENDP_INIT_SEQ_N
+ * @IPA_SEQ_DMA_ONLY:		only DMA is performed
+ * @IPA_SEQ_PKT_PROCESS_NO_DEC_UCP:
+ *	packet processing + no decipher + microcontroller (Ethernet Bridging)
+ * @IPA_SEQ_2ND_PKT_PROCESS_PASS_NO_DEC_UCP:
+ *	second packet processing pass + no decipher + microcontroller
+ * @IPA_SEQ_DMA_DEC:		DMA + cipher/decipher
+ * @IPA_SEQ_DMA_COMP_DECOMP:	DMA + compression/decompression
+ * @IPA_SEQ_INVALID:		invalid sequencer type
+ */
+enum ipa_seq_type {
+	IPA_SEQ_DMA_ONLY			= 0x00,
+	IPA_SEQ_PKT_PROCESS_NO_DEC_UCP		= 0x02,
+	IPA_SEQ_2ND_PKT_PROCESS_PASS_NO_DEC_UCP	= 0x04,
+	IPA_SEQ_DMA_DEC				= 0x11,
+	IPA_SEQ_DMA_COMP_DECOMP			= 0x20,
+	IPA_SEQ_INVALID				= 0xff,
+};
+
+#endif /* _IPA_REG_H_ */
-- 
2.20.1



* [PATCH 05/18] soc: qcom: ipa: configuration data
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (3 preceding siblings ...)
  2019-05-12  1:24 ` [PATCH 04/18] soc: qcom: ipa: main code Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-12  1:24 ` [PATCH 06/18] soc: qcom: ipa: clocking, interrupts, and memory Alex Elder
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch defines configuration data that is used to specify some
of the details of IPA hardware supported by the driver.  It is built
as Device Tree match data, discovered at boot time.  Initially the
driver only supports the Qualcomm SDM845 SoC.

Signed-off-by: Alex Elder <elder@linaro.org>
---
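The "Device Tree match data" idea in the description above can be sketched
in user space as follows.  This is only an illustration under stated
assumptions, not the driver's code: the demo_* names are hypothetical
stand-ins for struct of_device_id and struct ipa_data, and the compatible
string "qcom,sdm845-ipa" is assumed rather than taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct ipa_data */
struct demo_ipa_data {
	unsigned int endpoint_data_count;
};

/* Hypothetical stand-in for one entry of a Device Tree match table */
struct demo_match {
	const char *compatible;
	const struct demo_ipa_data *data;
};

static const struct demo_ipa_data demo_data_sdm845 = {
	/* gsi_ipa_endpoint_data[] in this patch has nine entries */
	.endpoint_data_count = 9,
};

/* The compatible string here is an assumption made for illustration */
static const struct demo_match demo_match_table[] = {
	{ .compatible = "qcom,sdm845-ipa", .data = &demo_data_sdm845 },
	{ .compatible = NULL },		/* sentinel ends the table */
};

/* Return the configuration data for a compatible string, or NULL */
const struct demo_ipa_data *demo_match_data(const char *compatible)
{
	const struct demo_match *match;

	assert(compatible != NULL);

	for (match = demo_match_table; match->compatible; match++)
		if (!strcmp(match->compatible, compatible))
			return match->data;

	return NULL;
}
```

Supporting another SoC would then amount to one more configuration
structure and one more table entry, with no change to the lookup.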
 drivers/net/ipa/ipa_data-sdm845.c | 245 +++++++++++++++++++++++++++
 drivers/net/ipa/ipa_data.h        | 267 ++++++++++++++++++++++++++++++
 2 files changed, 512 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_data-sdm845.c
 create mode 100644 drivers/net/ipa/ipa_data.h

diff --git a/drivers/net/ipa/ipa_data-sdm845.c b/drivers/net/ipa/ipa_data-sdm845.c
new file mode 100644
index 000000000000..62c0f25f5161
--- /dev/null
+++ b/drivers/net/ipa/ipa_data-sdm845.c
@@ -0,0 +1,245 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/log2.h>
+
+#include "gsi.h"
+#include "ipa_data.h"
+#include "ipa_endpoint.h"
+
+/* Differentiate Boolean from numerical options */
+#define NO	0
+#define YES	1
+
+/* Endpoint configuration for the SDM845 SoC. */
+static const struct gsi_ipa_endpoint_data gsi_ipa_endpoint_data[] = {
+	{
+		.ee_id		= GSI_EE_AP,
+		.channel_id	= 4,
+		.endpoint_id	= IPA_ENDPOINT_AP_COMMAND_TX,
+		.toward_ipa	= YES,
+		.channel = {
+			.tlv_count	= 20,
+			.wrr_priority	= YES,
+			.tre_count	= 256,
+			.event_count	= 512,
+		},
+		.endpoint = {
+			.seq_type	= IPA_SEQ_DMA_ONLY,
+			.config = {
+				.dma_mode	= YES,
+				.dma_endpoint	= IPA_ENDPOINT_AP_LAN_RX,
+			},
+		},
+	},
+	{
+		.ee_id		= GSI_EE_AP,
+		.channel_id	= 5,
+		.endpoint_id	= IPA_ENDPOINT_AP_LAN_RX,
+		.toward_ipa	= NO,
+		.channel = {
+			.tlv_count	= 8,
+			.tre_count	= 256,
+			.event_count	= 256,
+		},
+		.endpoint = {
+			.seq_type	= IPA_SEQ_INVALID,
+			.config = {
+				.checksum	= YES,
+				.aggregation	= YES,
+				.status_enable	= YES,
+				.rx = {
+					.pad_align	= ilog2(sizeof(u32)),
+				},
+			},
+		},
+	},
+	{
+		.ee_id		= GSI_EE_AP,
+		.channel_id	= 3,
+		.endpoint_id	= IPA_ENDPOINT_AP_MODEM_TX,
+		.toward_ipa	= YES,
+		.channel = {
+			.tlv_count	= 16,
+			.tre_count	= 512,
+			.event_count	= 512,
+		},
+		.endpoint = {
+			.support_flt	= YES,
+			.seq_type	=
+				IPA_SEQ_2ND_PKT_PROCESS_PASS_NO_DEC_UCP,
+			.config = {
+				.checksum	= YES,
+				.qmap		= YES,
+				.status_enable	= YES,
+				.tx = {
+					.delay	= YES,
+					.status_endpoint =
+						IPA_ENDPOINT_MODEM_AP_RX,
+				},
+			},
+		},
+	},
+	{
+		.ee_id		= GSI_EE_AP,
+		.channel_id	= 6,
+		.endpoint_id	= IPA_ENDPOINT_AP_MODEM_RX,
+		.toward_ipa	= NO,
+		.channel = {
+			.tlv_count	= 8,
+			.tre_count	= 256,
+			.event_count	= 256,
+		},
+		.endpoint = {
+			.seq_type	= IPA_SEQ_INVALID,
+			.config = {
+				.checksum	= YES,
+				.qmap		= YES,
+				.aggregation	= YES,
+				.rx = {
+					.aggr_close_eof	= YES,
+				},
+			},
+		},
+	},
+	{
+		.ee_id		= GSI_EE_MODEM,
+		.channel_id	= 1,
+		.endpoint_id	= IPA_ENDPOINT_MODEM_COMMAND_TX,
+		.toward_ipa	= YES,
+		.endpoint = {
+			.seq_type	= IPA_SEQ_PKT_PROCESS_NO_DEC_UCP,
+		},
+	},
+	{
+		.ee_id		= GSI_EE_MODEM,
+		.channel_id	= 0,
+		.endpoint_id	= IPA_ENDPOINT_MODEM_LAN_TX,
+		.toward_ipa	= YES,
+		.endpoint = {
+			.support_flt	= YES,
+		},
+	},
+	{
+		.ee_id		= GSI_EE_MODEM,
+		.channel_id	= 3,
+		.endpoint_id	= IPA_ENDPOINT_MODEM_LAN_RX,
+		.toward_ipa	= NO,
+	},
+	{
+		.ee_id		= GSI_EE_MODEM,
+		.channel_id	= 4,
+		.endpoint_id	= IPA_ENDPOINT_MODEM_AP_TX,
+		.toward_ipa	= YES,
+		.endpoint = {
+			.support_flt	= YES,
+		},
+	},
+	{
+		.ee_id		= GSI_EE_MODEM,
+		.channel_id	= 2,
+		.endpoint_id	= IPA_ENDPOINT_MODEM_AP_RX,
+		.toward_ipa	= NO,
+	},
+};
+
+static const struct ipa_resource_src ipa_resource_src[] = {
+	{
+		.type = IPA_RESOURCE_TYPE_SRC_PKT_CONTEXTS,
+		.limits[IPA_RESOURCE_GROUP_LWA_DL] = {
+			.min = 1,
+			.max = 63,
+		},
+		.limits[IPA_RESOURCE_GROUP_UL_DL] = {
+			.min = 1,
+			.max = 63,
+		},
+	},
+	{
+		.type = IPA_RESOURCE_TYPE_SRC_DESCRIPTOR_LISTS,
+		.limits[IPA_RESOURCE_GROUP_LWA_DL] = {
+			.min = 10,
+			.max = 10,
+		},
+		.limits[IPA_RESOURCE_GROUP_UL_DL] = {
+			.min = 10,
+			.max = 10,
+		},
+	},
+	{
+		.type = IPA_RESOURCE_TYPE_SRC_DESCRIPTOR_BUFF,
+		.limits[IPA_RESOURCE_GROUP_LWA_DL] = {
+			.min = 12,
+			.max = 12,
+		},
+		.limits[IPA_RESOURCE_GROUP_UL_DL] = {
+			.min = 14,
+			.max = 14,
+		},
+	},
+	{
+		.type = IPA_RESOURCE_TYPE_SRC_HPS_DMARS,
+		.limits[IPA_RESOURCE_GROUP_LWA_DL] = {
+			.min = 0,
+			.max = 63,
+		},
+		.limits[IPA_RESOURCE_GROUP_UL_DL] = {
+			.min = 0,
+			.max = 63,
+		},
+	},
+	{
+		.type = IPA_RESOURCE_TYPE_SRC_ACK_ENTRIES,
+		.limits[IPA_RESOURCE_GROUP_LWA_DL] = {
+			.min = 14,
+			.max = 14,
+		},
+		.limits[IPA_RESOURCE_GROUP_UL_DL] = {
+			.min = 20,
+			.max = 20,
+		},
+	},
+};
+
+static const struct ipa_resource_dst ipa_resource_dst[] = {
+	{
+		.type = IPA_RESOURCE_TYPE_DST_DATA_SECTORS,
+		.limits[IPA_RESOURCE_GROUP_LWA_DL] = {
+			.min = 4,
+			.max = 4,
+		},
+		.limits[IPA_RESOURCE_GROUP_UL_DL] = {
+			.min = 4,
+			.max = 4,
+		},
+	},
+	{
+		.type = IPA_RESOURCE_TYPE_DST_DPS_DMARS,
+		.limits[IPA_RESOURCE_GROUP_LWA_DL] = {
+			.min = 2,
+			.max = 63,
+		},
+		.limits[IPA_RESOURCE_GROUP_UL_DL] = {
+			.min = 1,
+			.max = 63,
+		},
+	},
+};
+
+/* Resource configuration for the SDM845 SoC. */
+static const struct ipa_resource_data ipa_resource_data = {
+	.resource_src		= ipa_resource_src,
+	.resource_src_count	= ARRAY_SIZE(ipa_resource_src),
+	.resource_dst		= ipa_resource_dst,
+	.resource_dst_count	= ARRAY_SIZE(ipa_resource_dst),
+};
+
+/* Configuration data for the SDM845 SoC. */
+const struct ipa_data ipa_data_sdm845 = {
+	.endpoint_data		= gsi_ipa_endpoint_data,
+	.endpoint_data_count	= ARRAY_SIZE(gsi_ipa_endpoint_data),
+	.resource_data		= &ipa_resource_data,
+};
diff --git a/drivers/net/ipa/ipa_data.h b/drivers/net/ipa/ipa_data.h
new file mode 100644
index 000000000000..f7669f73efc3
--- /dev/null
+++ b/drivers/net/ipa/ipa_data.h
@@ -0,0 +1,267 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_DATA_H_
+#define _IPA_DATA_H_
+
+#include <linux/types.h>
+
+#include "ipa_endpoint.h"
+
+/**
+ * DOC: IPA/GSI Configuration Data
+ *
+ * Boot-time configuration data is used to define the configuration of the
+ * IPA and GSI resources to use for a given platform.  This data is supplied
+ * via the Device Tree match table, associated with a particular compatible
+ * string.  The data defines information about resources, endpoints, and
+ * channels.  For endpoints and channels, the configuration data defines how
+ * these hardware entities are initially configured, but in almost all cases,
+ * this configuration never changes.
+ *
+ * Resources are data structures used internally by the IPA hardware.  The
+ * configuration data defines the number (or limits of the number) of various
+ * types of these resources.
+ *
+ * Endpoint configuration data defines properties of both IPA endpoints and
+ * GSI channels.  A channel is a GSI construct, and represents a single
+ * communication path between the IPA and a particular execution environment
+ * (EE), such as the AP or Modem.  Each EE has a set of channels associated
+ * with it, and each channel has an ID unique for that EE.  Only GSI channels
+ * associated with the AP are of concern to this driver.
+ *
+ * An endpoint is an IPA construct representing a single channel anywhere
+ * within the system.  As such, an IPA endpoint ID maps directly to an
+ * (EE, channel_id) pair.  Generally, this driver is concerned with only
+ * endpoints associated with the AP; however, this will change when support
+ * for routing (etc.) is added.  IPA endpoint and GSI channel configuration
+ * data are defined together, establishing the endpoint_id->(EE, channel_id)
+ * mapping.
+ *
+ * Endpoint configuration data consists of three parts:  properties that
+ * are common to IPA and GSI (EE ID, channel ID, endpoint ID, and direction);
+ * properties associated with the GSI channel; and properties associated with
+ * the IPA endpoint.
+ */
+
+/**
+ * struct gsi_channel_data - GSI channel configuration data
+ * @tlv_count:		number of entries in channel's TLV FIFO
+ * @wrr_priority:	whether channel gets priority (AP command TX only)
+ * @tre_count:		number of TREs in the channel ring
+ * @event_count:	number of slots in the associated event ring
+ *
+ * A GSI channel is a unidirectional means of transferring data to or from
+ * (and through) the IPA.  A GSI channel has a fixed number of "transfer
+ * elements" (TREs) that specify individual commands.  A set of commands
+ * are provided to a GSI channel, and when they complete the GSI generates
+ * an event (and an interrupt) to signal their completion.  These event
+ * structures are managed in a fixed-size event ring.
+ *
+ * Each GSI channel is fed by a FIFO of type/length/value (TLV) structures,
+ * and the number of entries in this FIFO limits the number of TREs that can
+ * be included in a single transaction.
+ *
+ * The GSI does weighted round-robin servicing of its channels, and it's
+ * possible to adjust a channel's priority of service.  Only the AP command
+ * TX channel specifies that it should get priority.
+ */
+struct gsi_channel_data {
+	u32 tlv_count;
+
+	u32 wrr_priority;
+	u32 tre_count;
+	u32 event_count;
+};
+
+/**
+ * struct ipa_endpoint_tx_data - configuration data for TX endpoints
+ * @delay:		whether endpoint starts in delay mode
+ * @status_endpoint:	endpoint to which status elements are sent
+ *
+ * Delay mode prevents an endpoint from transmitting anything, even if
+ * commands have been presented to the hardware.  Once the endpoint exits
+ * delay mode, queued transfer commands are sent.
+ *
+ * The @status_endpoint is only valid if the endpoint's @status_enable
+ * flag is set.
+ */
+struct ipa_endpoint_tx_data {
+	u32 delay;
+	enum ipa_endpoint_id status_endpoint;
+};
+
+/**
+ * struct ipa_endpoint_rx_data - configuration data for RX endpoints
+ * @pad_align:	power-of-2 boundary to which packet payload is aligned
+ * @aggr_close_eof: whether aggregation closes on end-of-frame
+ *
+ * With each packet it transfers, the IPA hardware can perform certain
+ * transformations of its packet data.  One of these is adding pad bytes
+ * to the end of the packet data so the result ends on a power-of-2 boundary.
+ *
+ * It is also able to aggregate multiple packets into a single receive buffer.
+ * Aggregation is "open" while a buffer is being filled, and "closes" when
+ * certain criteria are met.  One of those criteria is the sender indicating
+ * a "frame" consisting of several transfers has ended.
+ */
+struct ipa_endpoint_rx_data {
+	u32 pad_align;
+	u32 aggr_close_eof;
+};
+
+/**
+ * struct ipa_endpoint_config_data - IPA endpoint hardware configuration
+ * @checksum:		whether checksum offload is enabled
+ * @qmap:		whether endpoint uses QMAP protocol
+ * @aggregation:	whether endpoint supports aggregation
+ * @dma_mode:		whether endpoint operates in DMA mode
+ * @dma_endpoint:	peer endpoint, if operating in DMA mode
+ * @status_enable:	whether status elements are generated for endpoint
+ * @tx:			TX-specific endpoint information (see above)
+ * @rx:			RX-specific endpoint information (see above)
+ */
+struct ipa_endpoint_config_data {
+	u32 checksum;
+	u32 qmap;
+	u32 aggregation;
+	u32 dma_mode;
+	enum ipa_endpoint_id dma_endpoint;
+	u32 status_enable;
+	union {
+		struct ipa_endpoint_tx_data tx;
+		struct ipa_endpoint_rx_data rx;
+	};
+};
+
+/**
+ * struct ipa_endpoint_data - IPA endpoint configuration data
+ * @support_flt:	whether endpoint supports filtering
+ * @seq_type:		hardware sequencer type used for endpoint
+ * @config:		hardware configuration (see above)
+ *
+ * Not all endpoints support the IPA filtering capability.  A filter table
+ * defines the filters to apply for those endpoints that support it.  The
+ * AP is responsible for initializing this table, and it must include entries
+ * for non-AP endpoints.  For this reason we define *all* endpoints used
+ * in the system, and indicate whether they support filtering.
+ *
+ * The remaining endpoint configuration data applies only to AP endpoints.
+ * The IPA hardware is implemented by sequencers, and the AP must program
+ * the type(s) of these sequencers at initialization time.  The remaining
+ * endpoint configuration data is defined above.
+ */
+struct ipa_endpoint_data {
+	u32 support_flt;
+	/* The rest are specified only for AP endpoints */
+	enum ipa_seq_type seq_type;
+	struct ipa_endpoint_config_data config;
+};
+
+/**
+ * struct gsi_ipa_endpoint_data - GSI channel/IPA endpoint data
+ * @ee_id:		GSI execution environment ID
+ * @channel_id:		GSI channel ID
+ * @endpoint_id:	IPA endpoint ID
+ * @toward_ipa:		whether data flows toward the IPA (TX) or away from it (RX)
+ * @channel:		GSI channel configuration data (see above)
+ * @endpoint:		IPA endpoint configuration data (see above)
+ */
+struct gsi_ipa_endpoint_data {
+	u32 ee_id;
+	u32 channel_id;
+	enum ipa_endpoint_id endpoint_id;
+	u32 toward_ipa;
+
+	struct gsi_channel_data channel;
+	struct ipa_endpoint_data endpoint;
+};
+
+/** enum ipa_resource_group - IPA resource group */
+enum ipa_resource_group {
+	IPA_RESOURCE_GROUP_LWA_DL,	/* currently not used */
+	IPA_RESOURCE_GROUP_UL_DL,
+	IPA_RESOURCE_GROUP_MAX,
+};
+
+/** enum ipa_resource_type_src - source resource types */
+enum ipa_resource_type_src {
+	IPA_RESOURCE_TYPE_SRC_PKT_CONTEXTS,
+	IPA_RESOURCE_TYPE_SRC_DESCRIPTOR_LISTS,
+	IPA_RESOURCE_TYPE_SRC_DESCRIPTOR_BUFF,
+	IPA_RESOURCE_TYPE_SRC_HPS_DMARS,
+	IPA_RESOURCE_TYPE_SRC_ACK_ENTRIES,
+};
+
+/** enum ipa_resource_type_dst - destination resource types */
+enum ipa_resource_type_dst {
+	IPA_RESOURCE_TYPE_DST_DATA_SECTORS,
+	IPA_RESOURCE_TYPE_DST_DPS_DMARS,
+};
+
+/**
+ * struct ipa_resource_limits - minimum and maximum resource counts
+ * @min:	minimum number of resources of a given type
+ * @max:	maximum number of resources of a given type
+ */
+struct ipa_resource_limits {
+	u32 min;
+	u32 max;
+};
+
+/**
+ * struct ipa_resource_src - source endpoint group resource usage
+ * @type:	source group resource type
+ * @limits:	array of limits to use for each resource group
+ */
+struct ipa_resource_src {
+	enum ipa_resource_type_src type;
+	struct ipa_resource_limits limits[IPA_RESOURCE_GROUP_MAX];
+};
+
+/**
+ * struct ipa_resource_dst - destination endpoint group resource usage
+ * @type:	destination group resource type
+ * @limits:	array of limits to use for each resource group
+ */
+struct ipa_resource_dst {
+	enum ipa_resource_type_dst type;
+	struct ipa_resource_limits limits[IPA_RESOURCE_GROUP_MAX];
+};
+
+/**
+ * struct ipa_resource_data - IPA resource configuration data
+ * @resource_src:	source endpoint group resources
+ * @resource_src_count:	number of entries in the resource_src array
+ * @resource_dst:	destination endpoint group resources
+ * @resource_dst_count:	number of entries in the resource_dst array
+ *
+ * In order to manage quality of service between endpoints, certain resources
+ * required for operation are allocated to groups of endpoints.  Generally
+ * this information is invisible to the AP, but the AP is responsible for
+ * programming it at initialization time, so we specify it here.
+ */
+struct ipa_resource_data {
+	const struct ipa_resource_src *resource_src;
+	u32 resource_src_count;
+	const struct ipa_resource_dst *resource_dst;
+	u32 resource_dst_count;
+};
+
+/**
+ * struct ipa_data - combined IPA/GSI configuration data
+ * @resource_data:		IPA resource configuration data
+ * @endpoint_data:		IPA endpoint/GSI channel data
+ * @endpoint_data_count:	number of entries in endpoint_data array
+ */
+struct ipa_data {
+	const struct ipa_resource_data *resource_data;
+	const struct gsi_ipa_endpoint_data *endpoint_data;
+	u32 endpoint_data_count;	/* # entries in endpoint_data[] */
+};
+
+extern const struct ipa_data ipa_data_sdm845;
+
+#endif /* _IPA_DATA_H_ */
-- 
2.20.1



* [PATCH 06/18] soc: qcom: ipa: clocking, interrupts, and memory
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (4 preceding siblings ...)
  2019-05-12  1:24 ` [PATCH 05/18] soc: qcom: ipa: configuration data Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-12  1:24 ` [PATCH 07/18] soc: qcom: ipa: GSI headers Alex Elder
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch incorporates three source files (and their headers).  They're
grouped into one patch mainly for the purpose of making the number and
size of patches in this series somewhat reasonable.

  - "ipa_clock.c" and "ipa_clock.h" implement clocking for the IPA device.
    The IPA has a single core clock managed by the common clock framework.
    In addition, the IPA has three buses whose bandwidth is managed by the
    Linux interconnect framework.  At this time the core clock and all
    three buses are either on or off; we don't yet do any more fine-grained
    management than that.  The core clock and interconnects are enabled
    and disabled as a unit, using a unified clock-like abstraction,
    ipa_clock_get()/ipa_clock_put().

  - "ipa_interrupt.c" and "ipa_interrupt.h" implement IPA interrupts.
    The IPA driver uses two hardware IRQs; this one is the IPA
    interrupt (the other is the GSI interrupt, described in a separate
    patch).  Several types of interrupt are handled by the IPA IRQ
    handler; none of them is part of the data/fast path.

  - The IPA has a region of local memory that is accessible by the AP
    (and modem).  Within that region are areas with certain defined
    purposes.  "ipa_mem.c" and "ipa_mem.h" define those regions, and
    implement their initialization.

Signed-off-by: Alex Elder <elder@linaro.org>
---
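The ipa_clock_get()/ipa_clock_put() reference counting described above can
be sketched in user space as follows.  This is a simplified illustration
under stated assumptions, not the driver's code: the demo_* names are
hypothetical, a C11 atomic_flag spinlock stands in for the kernel mutex,
and two counters stand in for enabling and disabling the hardware.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct ipa_clock */
struct demo_clock {
	atomic_int count;	/* reference count */
	atomic_flag lock;	/* spinlock standing in for the mutex */
	int enable_calls;	/* times the "hardware" was enabled */
	int disable_calls;	/* times the "hardware" was disabled */
};

void demo_clock_init(struct demo_clock *clock)
{
	atomic_init(&clock->count, 0);
	atomic_flag_clear(&clock->lock);
	clock->enable_calls = 0;
	clock->disable_calls = 0;
}

static void demo_lock(struct demo_clock *clock)
{
	while (atomic_flag_test_and_set(&clock->lock))
		;	/* spin until the flag was previously clear */
}

static void demo_unlock(struct demo_clock *clock)
{
	atomic_flag_clear(&clock->lock);
}

/* Take a reference only if one is already held (inc-not-zero) */
bool demo_clock_get_additional(struct demo_clock *clock)
{
	int old = atomic_load(&clock->count);

	while (old != 0)
		if (atomic_compare_exchange_weak(&clock->count, &old, old + 1))
			return true;

	return false;
}

void demo_clock_get(struct demo_clock *clock)
{
	/* Fast path: the clock is already running */
	if (demo_clock_get_additional(clock))
		return;

	/* Slow path: recheck under the lock, then enable */
	demo_lock(clock);
	if (!demo_clock_get_additional(clock)) {
		clock->enable_calls++;	/* stand-in for enabling hardware */
		/* Publish the first reference only once enabled */
		atomic_fetch_add(&clock->count, 1);
	}
	demo_unlock(clock);
}

void demo_clock_put(struct demo_clock *clock)
{
	int old = atomic_load(&clock->count);

	assert(old > 0);

	/* Fast path: drop a reference that is not the last one */
	while (old > 1)
		if (atomic_compare_exchange_weak(&clock->count, &old, old - 1))
			return;

	/* Slow path: this may be the last reference */
	demo_lock(clock);
	if (atomic_fetch_sub(&clock->count, 1) == 1)
		clock->disable_calls++;	/* stand-in for disabling hardware */
	demo_unlock(clock);
}
```

Because the first reference is published only after the enable step, a
concurrent caller on the fast path cannot observe a non-zero count while
the hardware is still off.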
 drivers/net/ipa/ipa_clock.c     | 291 ++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_clock.h     |  52 ++++++
 drivers/net/ipa/ipa_interrupt.c | 279 ++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_interrupt.h |  53 ++++++
 drivers/net/ipa/ipa_mem.c       | 237 ++++++++++++++++++++++++++
 drivers/net/ipa/ipa_mem.h       |  82 +++++++++
 6 files changed, 994 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_clock.c
 create mode 100644 drivers/net/ipa/ipa_clock.h
 create mode 100644 drivers/net/ipa/ipa_interrupt.c
 create mode 100644 drivers/net/ipa/ipa_interrupt.h
 create mode 100644 drivers/net/ipa/ipa_mem.c
 create mode 100644 drivers/net/ipa/ipa_mem.h
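
The way the IPA IRQ handler dispatches several interrupt types from one
status register can be sketched in user space as follows.  This is an
illustration under stated assumptions, not the driver's code: the demo_*
names are hypothetical, the status register is a plain variable, and a
portable loop stands in for the kernel's __ffs().

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_IRQ_MAX	32

/* Hypothetical stand-in for struct ipa_interrupt */
struct demo_interrupt {
	uint32_t status;			/* simulated status register */
	uint32_t enabled;			/* mask of enabled interrupts */
	unsigned int handled[DEMO_IRQ_MAX];	/* per-interrupt call counts */
};

/* Index of the lowest set bit; the mask must be non-zero */
static unsigned int demo_ffs(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		bit++;
	}

	return bit;
}

/* Run the "handler", then acknowledge by clearing the status bit */
static void demo_process(struct demo_interrupt *interrupt, unsigned int irq)
{
	interrupt->handled[irq]++;
	interrupt->status &= ~(UINT32_C(1) << irq);
}

/* Handle every pending interrupt that is also enabled */
void demo_process_all(struct demo_interrupt *interrupt)
{
	uint32_t mask = interrupt->status;

	while ((mask &= interrupt->enabled)) {
		do {
			unsigned int irq = demo_ffs(mask);

			mask ^= UINT32_C(1) << irq;
			demo_process(interrupt, irq);
		} while (mask);

		/* Re-read: new conditions may have arrived meanwhile */
		mask = interrupt->status;
	}
}
```

Conditions that are pending but not enabled stay set in the status word
and are never passed to a handler.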

diff --git a/drivers/net/ipa/ipa_clock.c b/drivers/net/ipa/ipa_clock.c
new file mode 100644
index 000000000000..686f7ac2ce15
--- /dev/null
+++ b/drivers/net/ipa/ipa_clock.c
@@ -0,0 +1,291 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include <linux/atomic.h>
+#include <linux/mutex.h>
+#include <linux/clk.h>
+#include <linux/device.h>
+#include <linux/interconnect.h>
+
+#include "ipa.h"
+#include "ipa_clock.h"
+#include "ipa_netdev.h"
+
+/**
+ * DOC: IPA Clocking
+ *
+ * The "IPA Clock" manages both the IPA core clock and the interconnects
+ * (buses) the IPA depends on as a single logical entity.  A reference count
+ * is incremented by "get" operations and decremented by "put" operations.
+ * Transitions of that count from 0 to 1 result in the clock and interconnects
+ * being enabled, and transitions of the count from 1 to 0 cause them to be
+ * disabled.  We currently operate the core clock at a fixed clock rate, and
+ * all buses at a fixed average and peak bandwidth.  As more advanced IPA
+ * features are enabled, we can make better use of clock and bus scaling.
+ *
+ * An IPA clock reference must be held for any access to IPA hardware.
+ */
+
+#define	IPA_CORE_CLOCK_RATE		(75UL * 1000 * 1000)	/* Hz */
+
+/* Interconnect path bandwidths (each times 1000 bytes per second) */
+#define IPA_MEMORY_AVG			(80 * 1000)	/* 80 MBps */
+#define IPA_MEMORY_PEAK			(600 * 1000)
+
+#define IPA_IMEM_AVG			(80 * 1000)
+#define IPA_IMEM_PEAK			(350 * 1000)
+
+#define IPA_CONFIG_AVG			(40 * 1000)
+#define IPA_CONFIG_PEAK			(40 * 1000)
+
+/**
+ * struct ipa_clock - IPA clocking information
+ * @core:		IPA core clock
+ * @memory_path:	Memory interconnect
+ * @imem_path:		Internal memory interconnect
+ * @config_path:	Configuration space interconnect
+ * @mutex:		Protects clock enable/disable
+ * @count:		Clocking reference count
+ */
+struct ipa_clock {
+	struct ipa *ipa;
+	atomic_t count;
+	struct mutex mutex; /* protects clock enable/disable */
+	struct clk *core;
+	struct icc_path *memory_path;
+	struct icc_path *imem_path;
+	struct icc_path *config_path;
+};
+
+/* Initialize interconnects required for IPA operation */
+static int ipa_interconnect_init(struct ipa_clock *clock, struct device *dev)
+{
+	struct icc_path *path;
+
+	path = of_icc_get(dev, "memory");
+	if (IS_ERR(path))
+		goto err_return;
+	clock->memory_path = path;
+
+	path = of_icc_get(dev, "imem");
+	if (IS_ERR(path))
+		goto err_memory_path_put;
+	clock->imem_path = path;
+
+	path = of_icc_get(dev, "config");
+	if (IS_ERR(path))
+		goto err_imem_path_put;
+	clock->config_path = path;
+
+	return 0;
+
+err_imem_path_put:
+	icc_put(clock->imem_path);
+err_memory_path_put:
+	icc_put(clock->memory_path);
+err_return:
+
+	return PTR_ERR(path);
+}
+
+/* Inverse of ipa_interconnect_init() */
+static void ipa_interconnect_exit(struct ipa_clock *clock)
+{
+	icc_put(clock->config_path);
+	icc_put(clock->imem_path);
+	icc_put(clock->memory_path);
+}
+
+/* Currently we only use one bandwidth level, so just "enable" interconnects */
+static int ipa_interconnect_enable(struct ipa_clock *clock)
+{
+	int ret;
+
+	ret = icc_set_bw(clock->memory_path, IPA_MEMORY_AVG, IPA_MEMORY_PEAK);
+	if (ret)
+		return ret;
+
+	ret = icc_set_bw(clock->imem_path, IPA_IMEM_AVG, IPA_IMEM_PEAK);
+	if (ret)
+		goto err_disable_memory_path;
+
+	ret = icc_set_bw(clock->config_path, IPA_CONFIG_AVG, IPA_CONFIG_PEAK);
+	if (ret)
+		goto err_disable_imem_path;
+
+	return 0;
+
+err_disable_imem_path:
+	(void)icc_set_bw(clock->imem_path, 0, 0);
+err_disable_memory_path:
+	(void)icc_set_bw(clock->memory_path, 0, 0);
+
+	return ret;
+}
+
+/* To disable an interconnect, we just set its bandwidth to 0 */
+static int ipa_interconnect_disable(struct ipa_clock *clock)
+{
+	int ret;
+
+	ret = icc_set_bw(clock->memory_path, 0, 0);
+	if (ret)
+		return ret;
+
+	ret = icc_set_bw(clock->imem_path, 0, 0);
+	if (ret)
+		goto err_reenable_memory_path;
+
+	ret = icc_set_bw(clock->config_path, 0, 0);
+	if (ret)
+		goto err_reenable_imem_path;
+
+	return 0;
+
+err_reenable_imem_path:
+	(void)icc_set_bw(clock->imem_path, IPA_IMEM_AVG, IPA_IMEM_PEAK);
+err_reenable_memory_path:
+	(void)icc_set_bw(clock->memory_path, IPA_MEMORY_AVG, IPA_MEMORY_PEAK);
+
+	return ret;
+}
+
+/* Turn on IPA clocks, including interconnects */
+static int ipa_clock_enable(struct ipa_clock *clock)
+{
+	int ret;
+
+	ret = ipa_interconnect_enable(clock);
+	if (ret)
+		return ret;
+
+	ret = clk_prepare_enable(clock->core);
+	if (ret)
+		ipa_interconnect_disable(clock);
+
+	return ret;
+}
+
+/* Inverse of ipa_clock_enable() */
+static void ipa_clock_disable(struct ipa_clock *clock)
+{
+	clk_disable_unprepare(clock->core);
+	(void)ipa_interconnect_disable(clock);
+}
+
+/* Get an IPA clock reference, but only if the reference count is
+ * already non-zero.  Returns true if the additional reference was
+ * added successfully, or false otherwise.
+ */
+bool ipa_clock_get_additional(struct ipa_clock *clock)
+{
+	return !!atomic_inc_not_zero(&clock->count);
+}
+
+/* Get an IPA clock reference.  If the reference count is non-zero, it is
+ * incremented and return is immediate.  Otherwise the count is rechecked
+ * under the mutex and, if still zero, clocks are enabled and RX endpoints
+ * resumed before returning.  For the first reference, the count is
+ * intentionally not incremented until these activities are complete.
+ */
+void ipa_clock_get(struct ipa_clock *clock)
+{
+	/* If the clock is running, just bump the reference count */
+	if (ipa_clock_get_additional(clock))
+		return;
+
+	/* Otherwise get the mutex and check again */
+	mutex_lock(&clock->mutex);
+
+	/* A reference might have been added before we got the mutex. */
+	if (!ipa_clock_get_additional(clock)) {
+		int ret;
+
+		ret = ipa_clock_enable(clock);
+		if (!WARN(ret, "error %d enabling IPA clock\n", ret)) {
+			struct ipa *ipa = clock->ipa;
+
+			if (ipa->default_endpoint)
+				ipa_endpoint_resume(ipa->default_endpoint);
+
+			if (ipa->modem_netdev)
+				ipa_netdev_resume(ipa->modem_netdev);
+
+			atomic_inc(&clock->count);
+		}
+	}
+
+	mutex_unlock(&clock->mutex);
+}
+
+/* Attempt to remove an IPA clock reference.  If this represents
+ * the last reference, suspend endpoints and disable the clock
+ * (and interconnects) under protection of a mutex.
+ */
+void ipa_clock_put(struct ipa_clock *clock)
+{
+	/* If this is not the last reference there's nothing more to do */
+	if (!atomic_dec_and_mutex_lock(&clock->count, &clock->mutex))
+		return;
+
+	if (clock->ipa->modem_netdev)
+		ipa_netdev_suspend(clock->ipa->modem_netdev);
+
+	if (clock->ipa->default_endpoint)
+		ipa_endpoint_suspend(clock->ipa->default_endpoint);
+
+	ipa_clock_disable(clock);
+
+	mutex_unlock(&clock->mutex);
+}
+
+/* Initialize IPA clocking */
+struct ipa_clock *ipa_clock_init(struct ipa *ipa)
+{
+	struct device *dev = &ipa->pdev->dev;
+	struct ipa_clock *clock;
+	int ret;
+
+	clock = kzalloc(sizeof(*clock), GFP_KERNEL);
+	if (!clock)
+		return ERR_PTR(-ENOMEM);
+
+	clock->ipa = ipa;
+	clock->core = clk_get(dev, "core");
+	if (IS_ERR(clock->core)) {
+		ret = PTR_ERR(clock->core);
+		goto err_free_clock;
+	}
+
+	ret = clk_set_rate(clock->core, IPA_CORE_CLOCK_RATE);
+	if (ret)
+		goto err_clk_put;
+
+	ret = ipa_interconnect_init(clock, dev);
+	if (ret)
+		goto err_clk_put;
+
+	mutex_init(&clock->mutex);
+	atomic_set(&clock->count, 0);
+
+	return clock;
+
+err_clk_put:
+	clk_put(clock->core);
+err_free_clock:
+	kfree(clock);
+
+	return ERR_PTR(ret);
+}
+
+/* Inverse of ipa_clock_init() */
+void ipa_clock_exit(struct ipa_clock *clock)
+{
+	mutex_destroy(&clock->mutex);
+	ipa_interconnect_exit(clock);
+	clk_put(clock->core);
+	kfree(clock);
+}
diff --git a/drivers/net/ipa/ipa_clock.h b/drivers/net/ipa/ipa_clock.h
new file mode 100644
index 000000000000..f38c3face29a
--- /dev/null
+++ b/drivers/net/ipa/ipa_clock.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_CLOCK_H_
+#define _IPA_CLOCK_H_
+
+struct ipa;
+struct ipa_clock;
+
+/**
+ * ipa_clock_init() - Initialize IPA clocking
+ * @ipa:	IPA pointer
+ *
+ * Return:	A pointer to an ipa_clock structure, or a pointer-coded error
+ */
+struct ipa_clock *ipa_clock_init(struct ipa *ipa);
+
+/**
+ * ipa_clock_exit() - Inverse of ipa_clock_init()
+ * @clock:	IPA clock pointer
+ */
+void ipa_clock_exit(struct ipa_clock *clock);
+
+/**
+ * ipa_clock_get() - Get an IPA clock reference
+ * @clock:	IPA clock pointer
+ *
+ * This call blocks if this is the first reference.
+ */
+void ipa_clock_get(struct ipa_clock *clock);
+
+/**
+ * ipa_clock_get_additional() - Get an IPA clock reference if not first
+ * @clock:	IPA clock pointer
+ *
+ * This returns immediately, and only takes a reference if not the first
+ */
+bool ipa_clock_get_additional(struct ipa_clock *clock);
+
+/**
+ * ipa_clock_put() - Drop an IPA clock reference
+ * @clock:	IPA clock pointer
+ *
+ * This drops a clock reference.  If the last reference is being dropped,
+ * the clock is stopped and RX endpoints are suspended.  This call will
+ * not block unless the last reference is dropped.
+ */
+void ipa_clock_put(struct ipa_clock *clock);
+
+#endif /* _IPA_CLOCK_H_ */
diff --git a/drivers/net/ipa/ipa_interrupt.c b/drivers/net/ipa/ipa_interrupt.c
new file mode 100644
index 000000000000..5be6b3c762ed
--- /dev/null
+++ b/drivers/net/ipa/ipa_interrupt.c
@@ -0,0 +1,279 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2014-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+/* DOC: IPA Interrupts
+ *
+ * The IPA has an interrupt line distinct from the interrupt used by the GSI
+ * code.  Whereas GSI interrupts are generally related to channel events (like
+ * transfer completions), IPA interrupts signal other events involving the
+ * IPA.  Some of the IPA interrupts come from a microcontroller embedded in
+ * the IPA.  Each IPA interrupt type can be both masked and acknowledged
+ * independent of the others.
+ *
+ * Two of the IPA interrupts are initiated by the microcontroller.  A third
+ * can be generated to signal the need for a wakeup/resume when an IPA
+ * endpoint has been suspended.  There are other IPA events defined, but at
+ * this time only these three are supported.
+ */
+
+#include <linux/types.h>
+#include <linux/interrupt.h>
+
+#include "ipa.h"
+#include "ipa_clock.h"
+#include "ipa_reg.h"
+#include "ipa_endpoint.h"
+#include "ipa_interrupt.h"
+
+/* Maximum number of bits in an IPA interrupt mask */
+#define IPA_INTERRUPT_MAX	(sizeof(u32) * BITS_PER_BYTE)
+
+struct ipa_interrupt_info {
+	ipa_irq_handler_t handler;
+	enum ipa_interrupt_id interrupt_id;
+};
+
+/**
+ * struct ipa_interrupt - IPA interrupt information
+ * @ipa:		IPA pointer
+ * @irq:		Linux IRQ number used for IPA interrupts
+ * @info:		Information for each IPA interrupt type
+ */
+struct ipa_interrupt {
+	struct ipa *ipa;
+	u32 irq;
+	u32 enabled;
+	struct ipa_interrupt_info info[IPA_INTERRUPT_MAX];
+};
+
+/* Map a logical interrupt number to a hardware IPA IRQ number */
+static const u32 ipa_interrupt_mapping[] = {
+	[IPA_INTERRUPT_UC_0]		= 2,
+	[IPA_INTERRUPT_UC_1]		= 3,
+	[IPA_INTERRUPT_TX_SUSPEND]	= 14,
+};
+
+static bool ipa_interrupt_uc(struct ipa_interrupt *interrupt, u32 ipa_irq)
+{
+	return ipa_irq == ipa_interrupt_mapping[IPA_INTERRUPT_UC_0] ||
+		ipa_irq == ipa_interrupt_mapping[IPA_INTERRUPT_UC_1];
+}
+
+static void ipa_interrupt_process(struct ipa_interrupt *interrupt, u32 ipa_irq)
+{
+	struct ipa_interrupt_info *info = &interrupt->info[ipa_irq];
+	bool uc_irq = ipa_interrupt_uc(interrupt, ipa_irq);
+	struct ipa *ipa = interrupt->ipa;
+	u32 mask = BIT(ipa_irq);
+
+	/* For microcontroller interrupts, clear the interrupt right away,
+	 * "to avoid clearing unhandled interrupts."
+	 */
+	if (uc_irq)
+		iowrite32(mask, ipa->reg_virt + IPA_REG_IRQ_CLR_OFFSET);
+
+	if (info->handler)
+		info->handler(interrupt->ipa, info->interrupt_id);
+
+	/* Clearing the TX_SUSPEND interrupt also clears the register
+	 * that tells us which suspended endpoint(s) caused the interrupt,
+	 * so defer clearing until after the handler's been called.
+	 */
+	if (!uc_irq)
+		iowrite32(mask, ipa->reg_virt + IPA_REG_IRQ_CLR_OFFSET);
+}
+
+static void ipa_interrupt_process_all(struct ipa_interrupt *interrupt)
+{
+	struct ipa *ipa = interrupt->ipa;
+	u32 enabled = interrupt->enabled;
+	u32 mask;
+
+	/* The status register indicates which conditions are present,
+	 * including conditions whose interrupt is not enabled.  Handle
+	 * only the enabled ones.
+	 */
+	mask = ioread32(ipa->reg_virt + IPA_REG_IRQ_STTS_OFFSET);
+	while ((mask &= enabled)) {
+		do {
+			u32 ipa_irq = __ffs(mask);
+
+			mask ^= BIT(ipa_irq);
+
+			ipa_interrupt_process(interrupt, ipa_irq);
+		} while (mask);
+		mask = ioread32(ipa->reg_virt + IPA_REG_IRQ_STTS_OFFSET);
+	}
+}
+
+/* Threaded part of the IRQ handler */
+static irqreturn_t ipa_isr_thread(int irq, void *dev_id)
+{
+	struct ipa_interrupt *interrupt = dev_id;
+
+	ipa_clock_get(interrupt->ipa->clock);
+
+	ipa_interrupt_process_all(interrupt);
+
+	ipa_clock_put(interrupt->ipa->clock);
+
+	return IRQ_HANDLED;
+}
+
+/* Hard part of the IRQ handler */
+static irqreturn_t ipa_isr(int irq, void *dev_id)
+{
+	struct ipa_interrupt *interrupt = dev_id;
+	struct ipa *ipa = interrupt->ipa;
+	u32 mask;
+
+	mask = ioread32(ipa->reg_virt + IPA_REG_IRQ_STTS_OFFSET);
+	if (mask & interrupt->enabled)
+		return IRQ_WAKE_THREAD;
+
+	/* Nothing in the mask was supposed to cause an interrupt */
+	iowrite32(mask, ipa->reg_virt + IPA_REG_IRQ_CLR_OFFSET);
+
+	dev_err(&ipa->pdev->dev, "%s: unexpected interrupt, mask 0x%08x\n",
+		__func__, mask);
+
+	return IRQ_HANDLED;
+}
+
+static void ipa_interrupt_suspend_control(struct ipa_interrupt *interrupt,
+					  enum ipa_endpoint_id endpoint_id,
+					  bool enable)
+{
+	u32 offset = IPA_REG_SUSPEND_IRQ_EN_OFFSET;
+	u32 mask = BIT(endpoint_id);
+	u32 val;
+
+	val = ioread32(interrupt->ipa->reg_virt + offset);
+	if (enable)
+		val |= mask;
+	else
+		val &= ~mask;
+	iowrite32(val, interrupt->ipa->reg_virt + offset);
+}
+
+void ipa_interrupt_suspend_enable(struct ipa_interrupt *interrupt,
+				  enum ipa_endpoint_id endpoint_id)
+{
+	ipa_interrupt_suspend_control(interrupt, endpoint_id, true);
+}
+
+void ipa_interrupt_suspend_disable(struct ipa_interrupt *interrupt,
+				   enum ipa_endpoint_id endpoint_id)
+{
+	ipa_interrupt_suspend_control(interrupt, endpoint_id, false);
+}
+
+/* Clear the suspend interrupt for all endpoints that signaled it */
+void ipa_interrupt_suspend_clear_all(struct ipa_interrupt *interrupt)
+{
+	struct ipa *ipa = interrupt->ipa;
+	u32 val;
+
+	val = ioread32(ipa->reg_virt + IPA_REG_IRQ_SUSPEND_INFO_OFFSET);
+	iowrite32(val, ipa->reg_virt + IPA_REG_SUSPEND_IRQ_CLR_OFFSET);
+}
+
+/**
+ * ipa_interrupt_simulate_suspend() - Simulate an IPA TX_SUSPEND interrupt
+ *
+ * This is needed to work around a problem that occurs if aggregation
+ * is active on an endpoint when its underlying channel is suspended.
+ */
+void ipa_interrupt_simulate_suspend(struct ipa_interrupt *interrupt)
+{
+	u32 ipa_irq = ipa_interrupt_mapping[IPA_INTERRUPT_TX_SUSPEND];
+
+	ipa_interrupt_process(interrupt, ipa_irq);
+}
+
+/**
+ * ipa_interrupt_add() - Add a handler for an IPA interrupt
+ * @interrupt:		IPA interrupt structure
+ * @interrupt_id:	IPA interrupt type
+ * @handler:		The handler for that interrupt
+ *
+ * Add a handler for an IPA interrupt and enable it.  IPA interrupt handlers
+ * run in threaded interrupt context, so they are allowed to block.
+ */
+void ipa_interrupt_add(struct ipa_interrupt *interrupt,
+		       enum ipa_interrupt_id interrupt_id,
+		       ipa_irq_handler_t handler)
+{
+	u32 ipa_irq = ipa_interrupt_mapping[interrupt_id];
+	struct ipa *ipa = interrupt->ipa;
+
+	interrupt->info[ipa_irq].handler = handler;
+	interrupt->info[ipa_irq].interrupt_id = interrupt_id;
+
+	/* Update the IPA interrupt mask to enable it */
+	interrupt->enabled |= BIT(ipa_irq);
+	iowrite32(interrupt->enabled, ipa->reg_virt + IPA_REG_IRQ_EN_OFFSET);
+}
+
+/**
+ * ipa_interrupt_remove() - Disable an IPA interrupt and remove its handler
+ * @interrupt:		IPA interrupt structure
+ * @interrupt_id:	IPA interrupt type to disable; the handler
+ *			registered for it is removed
+ */
+void ipa_interrupt_remove(struct ipa_interrupt *interrupt,
+			  enum ipa_interrupt_id interrupt_id)
+{
+	u32 ipa_irq = ipa_interrupt_mapping[interrupt_id];
+	struct ipa *ipa = interrupt->ipa;
+
+	/* Update the IPA interrupt mask to disable it */
+	interrupt->enabled &= ~BIT(ipa_irq);
+	iowrite32(interrupt->enabled, ipa->reg_virt + IPA_REG_IRQ_EN_OFFSET);
+
+	interrupt->info[ipa_irq].handler = NULL;
+}
+
+/**
+ * ipa_interrupt_setup() - Set up the IPA interrupt framework
+ */
+struct ipa_interrupt *ipa_interrupt_setup(struct ipa *ipa)
+{
+	struct ipa_interrupt *interrupt;
+	unsigned int irq;
+	int ret;
+
+	ret = platform_get_irq_byname(ipa->pdev, "ipa");
+	if (ret < 0)
+		return ERR_PTR(ret);
+	irq = ret;
+
+	interrupt = kzalloc(sizeof(*interrupt), GFP_KERNEL);
+	if (!interrupt)
+		return ERR_PTR(-ENOMEM);
+	interrupt->ipa = ipa;
+	interrupt->irq = irq;
+
+	/* Start with all IPA interrupts disabled */
+	iowrite32(0, ipa->reg_virt + IPA_REG_IRQ_EN_OFFSET);
+
+	ret = request_threaded_irq(irq, ipa_isr, ipa_isr_thread, IRQF_ONESHOT,
+				   "ipa", interrupt);
+	if (ret)
+		goto err_free_interrupt;
+
+	return interrupt;
+
+err_free_interrupt:
+	kfree(interrupt);
+	return ERR_PTR(ret);
+}
+
+void ipa_interrupt_teardown(struct ipa_interrupt *interrupt)
+{
+	free_irq(interrupt->irq, interrupt);
+	kfree(interrupt);
+}
diff --git a/drivers/net/ipa/ipa_interrupt.h b/drivers/net/ipa/ipa_interrupt.h
new file mode 100644
index 000000000000..6e452430c156
--- /dev/null
+++ b/drivers/net/ipa/ipa_interrupt.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_INTERRUPT_H_
+#define _IPA_INTERRUPT_H_
+
+#include <linux/types.h>
+#include <linux/bits.h>
+
+struct ipa;
+struct ipa_interrupt;
+
+/**
+ * enum ipa_interrupt_id - IPA Interrupt Type
+ *
+ * Used to register handlers for IPA interrupts.
+ */
+enum ipa_interrupt_id {
+	IPA_INTERRUPT_UC_0,
+	IPA_INTERRUPT_UC_1,
+	IPA_INTERRUPT_TX_SUSPEND,
+};
+
+/**
+ * typedef ipa_irq_handler_t - IPA interrupt handler function type
+ * @ipa:		IPA pointer
+ * @interrupt_id:	IPA interrupt type
+ *
+ * Callback function registered by ipa_interrupt_add() to handle a specific
+ * interrupt type
+ */
+typedef void (*ipa_irq_handler_t)(struct ipa *ipa,
+				  enum ipa_interrupt_id interrupt_id);
+
+struct ipa_interrupt *ipa_interrupt_setup(struct ipa *ipa);
+void ipa_interrupt_teardown(struct ipa_interrupt *interrupt);
+
+void ipa_interrupt_add(struct ipa_interrupt *interrupt,
+		       enum ipa_interrupt_id interrupt_id,
+		       ipa_irq_handler_t handler);
+void ipa_interrupt_remove(struct ipa_interrupt *interrupt,
+			  enum ipa_interrupt_id interrupt_id);
+
+void ipa_interrupt_suspend_enable(struct ipa_interrupt *interrupt,
+				  enum ipa_endpoint_id endpoint_id);
+void ipa_interrupt_suspend_disable(struct ipa_interrupt *interrupt,
+				   enum ipa_endpoint_id endpoint_id);
+void ipa_interrupt_suspend_clear_all(struct ipa_interrupt *interrupt);
+void ipa_interrupt_simulate_suspend(struct ipa_interrupt *interrupt);
+
+#endif /* _IPA_INTERRUPT_H_ */
diff --git a/drivers/net/ipa/ipa_mem.c b/drivers/net/ipa/ipa_mem.c
new file mode 100644
index 000000000000..dc4190ddc9db
--- /dev/null
+++ b/drivers/net/ipa/ipa_mem.c
@@ -0,0 +1,237 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/bitfield.h>
+#include <linux/bug.h>
+#include <linux/dma-mapping.h>
+#include <linux/io.h>
+
+#include "ipa.h"
+#include "ipa_reg.h"
+#include "ipa_cmd.h"
+#include "ipa_mem.h"
+
+/* "Canary" value placed between memory regions to detect overflow */
+#define IPA_SMEM_CANARY_VAL		cpu_to_le32(0xdeadbeef)
+
+/* Only used for IPA_SMEM_UC_EVENT_RING */
+static __always_inline void smem_set_canary(struct ipa *ipa, u32 offset)
+{
+	__le32 *cp = ipa->shared_virt + offset;
+
+	BUILD_BUG_ON(offset < sizeof(*cp));
+
+	*--cp = IPA_SMEM_CANARY_VAL;
+}
+
+static __always_inline void smem_set_canaries(struct ipa *ipa, u32 offset)
+{
+	__le32 *cp = ipa->shared_virt + offset;
+
+	BUILD_BUG_ON(offset < 2 * sizeof(*cp));
+	BUILD_BUG_ON(offset % 8);
+
+	*--cp = IPA_SMEM_CANARY_VAL;
+	*--cp = IPA_SMEM_CANARY_VAL;
+}
+
+/* Initialize AP-owned areas in the shared memory by zero-filling them. */
+static int ipa_smem_zero_ap(struct ipa *ipa)
+{
+	int ret = 0;
+
+	BUILD_BUG_ON(IPA_SMEM_AP_HDR_OFFSET % 8);
+	BUILD_BUG_ON(IPA_SMEM_AP_HDR_PROC_CTX_OFFSET % 8);
+
+	if (IPA_SMEM_AP_HDR_SIZE) {
+		ret = ipa_cmd_smem_dma_zero(ipa, IPA_SMEM_AP_HDR_OFFSET,
+					    IPA_SMEM_AP_HDR_SIZE);
+		if (ret)
+			return ret;
+	}
+
+	if (IPA_SMEM_AP_HDR_PROC_CTX_SIZE)
+		ret = ipa_cmd_smem_dma_zero(ipa,
+					    IPA_SMEM_AP_HDR_PROC_CTX_OFFSET,
+					    IPA_SMEM_AP_HDR_PROC_CTX_SIZE);
+
+	return ret;
+}
+
+/**
+ * ipa_smem_setup() - Set up IPA AP and modem shared memory areas
+ *
+ * Configure the shared memory areas used for AP and modem header
+ * structures.  Zero the AP areas; the modem areas will be zeroed
+ * each time the modem comes up.
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+int ipa_smem_setup(struct ipa *ipa)
+{
+	int ret = 0;
+	u32 val;
+
+	if (IPA_SMEM_AP_HDR_SIZE) {
+		ret = ipa_cmd_hdr_init_local(ipa, IPA_SMEM_AP_HDR_OFFSET,
+					     IPA_SMEM_AP_HDR_SIZE);
+		if (ret)
+			return ret;
+	}
+
+	if (IPA_SMEM_MODEM_HDR_SIZE)
+		ret = ipa_cmd_hdr_init_local(ipa, IPA_SMEM_MODEM_HDR_OFFSET,
+					     IPA_SMEM_MODEM_HDR_SIZE);
+	if (ret)
+		return ret;
+
+	val = ipa->shared_offset + IPA_SMEM_MODEM_HDR_PROC_CTX_OFFSET;
+	iowrite32(val, ipa->reg_virt +
+			IPA_REG_LOCAL_PKT_PROC_CNTXT_BASE_OFFSET);
+
+	/* Modem smem regions will be zeroed whenever modem comes up */
+	return ipa_smem_zero_ap(ipa);
+}
+
+void ipa_smem_teardown(struct ipa *ipa)
+{
+	/* Nothing to do */
+}
+
+/**
+ * ipa_smem_config() - Configure IPA shared memory
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+int ipa_smem_config(struct ipa *ipa)
+{
+	u32 size;
+	u32 val;
+
+	/* Check the advertised location and size of the shared memory area */
+	val = ioread32(ipa->reg_virt + IPA_REG_SHARED_MEM_SIZE_OFFSET);
+
+	/* The fields in the register are in 8 byte units */
+	ipa->shared_offset = 8 * u32_get_bits(val, SHARED_MEM_BADDR_FMASK);
+	dev_dbg(&ipa->pdev->dev, "shared memory offset 0x%x bytes\n",
+		ipa->shared_offset);
+	if (WARN_ON(ipa->shared_offset))
+		return -EINVAL;
+
+	/* The code assumes a certain minimum shared memory area size */
+	size = 8 * u32_get_bits(val, SHARED_MEM_SIZE_FMASK);
+	dev_dbg(&ipa->pdev->dev, "shared memory size 0x%x bytes\n", size);
+	if (WARN_ON(size < IPA_SMEM_SIZE))
+		return -EINVAL;
+
+	/* Now write "canary" values before each sub-section. */
+	smem_set_canaries(ipa, IPA_SMEM_V4_FLT_HASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_V4_FLT_NHASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_V6_FLT_HASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_V6_FLT_NHASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_V4_RT_HASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_V4_RT_NHASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_V6_RT_HASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_V6_RT_NHASH_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_MODEM_HDR_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_MODEM_HDR_PROC_CTX_OFFSET);
+	smem_set_canaries(ipa, IPA_SMEM_MODEM_OFFSET);
+
+	/* Only one canary precedes the microcontroller ring */
+	BUILD_BUG_ON(IPA_SMEM_UC_EVENT_RING_OFFSET % 1024);
+	smem_set_canary(ipa, IPA_SMEM_UC_EVENT_RING_OFFSET);
+
+	return 0;
+}
+
+void ipa_smem_deconfig(struct ipa *ipa)
+{
+	/* Don't bother zeroing any of the shared memory on exit */
+}
+
+/**
+ * ipa_smem_zero_modem() - Initialize modem general memory and header memory
+ */
+int ipa_smem_zero_modem(struct ipa *ipa)
+{
+	int ret = 0;
+
+	if (IPA_SMEM_MODEM_SIZE) {
+		ret = ipa_cmd_smem_dma_zero(ipa, IPA_SMEM_MODEM_OFFSET,
+					    IPA_SMEM_MODEM_SIZE);
+		if (ret)
+			return ret;
+	}
+
+	if (IPA_SMEM_MODEM_HDR_SIZE) {
+		ret = ipa_cmd_smem_dma_zero(ipa, IPA_SMEM_MODEM_HDR_OFFSET,
+					    IPA_SMEM_MODEM_HDR_SIZE);
+		if (ret)
+			return ret;
+	}
+
+	if (IPA_SMEM_MODEM_HDR_PROC_CTX_SIZE)
+		ret = ipa_cmd_smem_dma_zero(ipa,
+					    IPA_SMEM_MODEM_HDR_PROC_CTX_OFFSET,
+					    IPA_SMEM_MODEM_HDR_PROC_CTX_SIZE);
+
+	return ret;
+}
+
+int ipa_mem_init(struct ipa *ipa)
+{
+	struct resource *res;
+	int ret;
+
+	ret = dma_set_mask_and_coherent(&ipa->pdev->dev, DMA_BIT_MASK(64));
+	if (ret)
+		return ret;
+
+	/* Set up IPA shared memory */
+	res = platform_get_resource_byname(ipa->pdev, IORESOURCE_MEM,
+					   "ipa-shared");
+	if (!res)
+		return -ENODEV;
+
+	/* The code assumes a certain minimum shared memory area size */
+	if (WARN_ON(resource_size(res) < IPA_SMEM_SIZE))
+		return -EINVAL;
+
+	ipa->shared_virt = memremap(res->start, resource_size(res),
+				    MEMREMAP_WC);
+	if (!ipa->shared_virt)
+		return -ENOMEM;
+	ipa->shared_phys = res->start;
+
+	/* Set up IPA register memory */
+	res = platform_get_resource_byname(ipa->pdev, IORESOURCE_MEM,
+					   "ipa-reg");
+	if (!res) {
+		ret = -ENODEV;
+		goto err_unmap_shared;
+	}
+
+	ipa->reg_virt = ioremap(res->start, resource_size(res));
+	if (!ipa->reg_virt) {
+		ret = -ENOMEM;
+		goto err_unmap_shared;
+	}
+	ipa->reg_phys = res->start;
+
+	return 0;
+
+err_unmap_shared:
+	memunmap(ipa->shared_virt);
+
+	return ret;
+}
+
+void ipa_mem_exit(struct ipa *ipa)
+{
+	iounmap(ipa->reg_virt);
+	memunmap(ipa->shared_virt);
+}
diff --git a/drivers/net/ipa/ipa_mem.h b/drivers/net/ipa/ipa_mem.h
new file mode 100644
index 000000000000..53f32c32ac06
--- /dev/null
+++ b/drivers/net/ipa/ipa_mem.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_MEM_H_
+#define _IPA_MEM_H_
+
+struct ipa;
+
+/**
+ * DOC:
+ * The IPA has a block of shared memory, divided into regions used for
+ * specific purposes.  The offset within the IPA address space of this shared
+ * memory block is defined by the IPA_SMEM_DIRECT_ACCESS_OFFSET register.
+ *
+ * The regions within the shared block are bounded by an offset and size found
+ * in the IPA_SHARED_MEM_SIZE register.  The first 128 bytes of the shared
+ * memory block are shared with the microcontroller, and the first 40 bytes of
+ * that contain a structure used to communicate between the microcontroller
+ * and the AP.
+ *
+ * There is a set of filter and routing tables, and each is given a 128 byte
+ * region in shared memory.  Each entry in a filter or route table is
+ * IPA_TABLE_ENTRY_SIZE, or 8 bytes.  The first "slot" of every table is
+ * filled with a "canary" value, and the table offsets defined below represent
+ * the location of the first real entry in each table after this.
+ *
+ * The number of filter table entries depends on the number of endpoints that
+ * support filtering.  The first non-canary slot of a filter table contains a
+ * bitmap, with each set bit indicating an endpoint containing an entry in the
+ * table.  Bit 0 is used to represent a global filter.
+ *
+ * About half of the routing table entries are reserved for modem use.
+ */
+
+/* The maximum number of filter table entries (IPv4, IPv6; hashed and not) */
+#define IPA_SMEM_FLT_COUNT			14
+
+/* The number of routing table entries (IPv4, IPv6; hashed and not) */
+#define IPA_SMEM_RT_COUNT			15
+
+/* Which routing table entries are for the modem */
+#define IPA_SMEM_MODEM_RT_COUNT			8
+#define IPA_SMEM_MODEM_RT_INDEX_MIN		0
+#define IPA_SMEM_MODEM_RT_INDEX_MAX \
+		(IPA_SMEM_MODEM_RT_INDEX_MIN + IPA_SMEM_MODEM_RT_COUNT - 1)
+
+/* Regions within the shared memory block.  Table sizes are 0x80 bytes. */
+#define IPA_SMEM_V4_FLT_HASH_OFFSET		0x0288
+#define IPA_SMEM_V4_FLT_NHASH_OFFSET		0x0308
+#define IPA_SMEM_V6_FLT_HASH_OFFSET		0x0388
+#define IPA_SMEM_V6_FLT_NHASH_OFFSET		0x0408
+#define IPA_SMEM_V4_RT_HASH_OFFSET		0x0488
+#define IPA_SMEM_V4_RT_NHASH_OFFSET		0x0508
+#define IPA_SMEM_V6_RT_HASH_OFFSET		0x0588
+#define IPA_SMEM_V6_RT_NHASH_OFFSET		0x0608
+#define IPA_SMEM_MODEM_HDR_OFFSET		0x0688
+#define IPA_SMEM_MODEM_HDR_SIZE			0x0140
+#define IPA_SMEM_AP_HDR_OFFSET			0x07c8
+#define IPA_SMEM_AP_HDR_SIZE			0x0000
+#define IPA_SMEM_MODEM_HDR_PROC_CTX_OFFSET	0x07d0
+#define IPA_SMEM_MODEM_HDR_PROC_CTX_SIZE	0x0200
+#define IPA_SMEM_AP_HDR_PROC_CTX_OFFSET		0x09d0
+#define IPA_SMEM_AP_HDR_PROC_CTX_SIZE		0x0200
+#define IPA_SMEM_MODEM_OFFSET			0x0bd8
+#define IPA_SMEM_MODEM_SIZE			0x1024
+#define IPA_SMEM_UC_EVENT_RING_OFFSET		0x1c00	/* v3.5 and later */
+#define IPA_SMEM_SIZE				0x2000
+
+int ipa_smem_config(struct ipa *ipa);
+void ipa_smem_deconfig(struct ipa *ipa);
+
+int ipa_smem_setup(struct ipa *ipa);
+void ipa_smem_teardown(struct ipa *ipa);
+
+int ipa_smem_zero_modem(struct ipa *ipa);
+
+int ipa_mem_init(struct ipa *ipa);
+void ipa_mem_exit(struct ipa *ipa);
+
+#endif /* _IPA_MEM_H_ */
-- 
2.20.1
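
The dispatch loop in ipa_interrupt_process_all() above can be modeled in
userspace roughly as follows.  This is only a sketch under stated
assumptions: the structure, the ex_* names, and the plain "status"
variable standing in for the IRQ status and clear registers are all
hypothetical.  Unlike the driver, it acknowledges every interrupt after
its handler has run (the driver clears microcontroller interrupts before
calling the handler).

```c
#include <assert.h>
#include <stdint.h>

#define EX_IRQ_MAX	32

struct ex_interrupt;
typedef void (*ex_handler_t)(struct ex_interrupt *intr, unsigned int irq);

/* Hypothetical model: "status" stands in for the hardware status register */
struct ex_interrupt {
	uint32_t status;	/* conditions currently asserted */
	uint32_t enabled;	/* conditions we asked to be interrupted for */
	uint32_t seen;		/* bits recorded by ex_record() below */
	ex_handler_t handler[EX_IRQ_MAX];
};

/* Index of the lowest set bit; mask must be nonzero */
static unsigned int ex_ffs(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		bit++;
	}

	return bit;
}

/* Example handler: just remember which interrupt fired */
static void ex_record(struct ex_interrupt *intr, unsigned int irq)
{
	intr->seen |= UINT32_C(1) << irq;
}

/* Handle every asserted and enabled condition; return how many were handled */
static unsigned int ex_process_all(struct ex_interrupt *intr)
{
	unsigned int count = 0;
	uint32_t mask = intr->status;

	while ((mask &= intr->enabled)) {
		do {
			unsigned int irq = ex_ffs(mask);
			uint32_t bit = UINT32_C(1) << irq;

			mask ^= bit;
			if (intr->handler[irq])
				intr->handler[irq](intr, irq);
			intr->status &= ~bit;	/* models the "clear" write */
			count++;
		} while (mask);
		mask = intr->status;	/* re-read: more may have arrived */
	}

	return count;
}
```

With bits 2, 5 and 14 asserted and only 2 and 14 enabled, ex_process_all()
runs two handlers and leaves bit 5 set in the status, mirroring how the
driver ignores conditions it did not enable.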


* [PATCH 07/18] soc: qcom: ipa: GSI headers
From: Alex Elder @ 2019-05-12  1:24 UTC
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

The Generic Software Interface is a layer of the IPA driver that
abstracts the underlying hardware.  The next patch includes the
main code for GSI (including some additional documentation).  This
patch just includes three GSI header files.

  - "gsi.h" is the top-level GSI header file.  There is one GSI
    structure associated with the IPA structure; in fact, it is embedded
    within the IPA structure.  (Were it not embedded this way, many of
    the structures defined here could be private to GSI code.)
    The main abstraction implemented by the GSI code is the channel,
    and this header exposes several operations that can be performed
    on a GSI channel.

  - "gsi_private.h" exposes some definitions that are intended to be
    private, used only by the main GSI code and the GSI transaction
    code (defined in an upcoming patch).

  - Like "ipa_reg.h", "gsi_reg.h" defines the offsets of the 32-bit
    registers used by the GSI layer, along with masks that define the
    position and width of fields less than 32 bits located within
    these registers.

Signed-off-by: Alex Elder <elder@linaro.org>
---
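As a rough illustration of the offset-plus-field-mask convention that
"gsi_reg.h" follows, here is a minimal userspace sketch.  Every name in
it (EX_GENMASK, EX_CH_REG_OFFSET, EX_LEN_FMASK, ex_field_get, ...) and
every numeric value is hypothetical, chosen only for the example; the
real header includes <linux/bits.h> and uses offsets dictated by the
hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a contiguous 32-bit mask covering bits h..l */
#define EX_GENMASK(h, l) \
	(((uint32_t)~UINT32_C(0) >> (31 - (h))) & \
	 ((uint32_t)~UINT32_C(0) << (l)))

/* Hypothetical per-channel register: a base offset plus a per-channel stride */
#define EX_CH_REG_OFFSET(ch)	(0x1000u + 0x80u * (ch))

/* Hypothetical 8-bit field occupying bits 23..16 of that register */
#define EX_LEN_FMASK		EX_GENMASK(23, 16)

/* Position of a field: index of the lowest set bit in its (nonzero) mask */
static unsigned int ex_mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		shift++;
	}

	return shift;
}

/* Extract the field selected by mask from a register value */
static uint32_t ex_field_get(uint32_t reg, uint32_t mask)
{
	return (reg & mask) >> ex_mask_shift(mask);
}

/* Return reg with the field selected by mask replaced by val */
static uint32_t ex_field_set(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | ((val << ex_mask_shift(mask)) & mask);
}
```

For example, ex_field_set(0, EX_LEN_FMASK, 0x5a) yields 0x005a0000, and
ex_field_get() applied to that value with the same mask returns 0x5a.
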
 drivers/net/ipa/gsi.h         | 241 ++++++++++++++++++++++
 drivers/net/ipa/gsi_private.h | 147 +++++++++++++
 drivers/net/ipa/gsi_reg.h     | 376 ++++++++++++++++++++++++++++++++++
 3 files changed, 764 insertions(+)
 create mode 100644 drivers/net/ipa/gsi.h
 create mode 100644 drivers/net/ipa/gsi_private.h
 create mode 100644 drivers/net/ipa/gsi_reg.h

diff --git a/drivers/net/ipa/gsi.h b/drivers/net/ipa/gsi.h
new file mode 100644
index 000000000000..8194a4110a40
--- /dev/null
+++ b/drivers/net/ipa/gsi.h
@@ -0,0 +1,241 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _GSI_H_
+#define _GSI_H_
+
+#include <linux/types.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/completion.h>
+#include <linux/platform_device.h>
+#include <linux/netdevice.h>
+
+#define GSI_CHANNEL_MAX		14
+#define GSI_EVT_RING_MAX	10
+
+struct device;
+struct scatterlist;
+struct platform_device;
+
+struct gsi;
+struct gsi_trans;
+struct gsi_channel_data;
+struct gsi_ipa_endpoint_data;
+
+/* Execution environment IDs */
+enum gsi_ee_id {
+	GSI_EE_AP	= 0,
+	GSI_EE_MODEM	= 1,
+	GSI_EE_UC	= 2,
+	GSI_EE_TZ	= 3,
+};
+
+/* Channel operation statistics, aggregated across all channels */
+struct gsi_channel_stats {
+	u64 allocate;
+	u64 start;
+	u64 stop;
+	u64 reset;
+	u64 free;
+};
+
+struct gsi_ring {
+	spinlock_t spinlock;		/* protects wp, rp updates */
+	void *virt;
+	dma_addr_t addr;
+	size_t size;
+	u32 base;			/* low 32 bits of addr */
+	u32 wp;
+	u32 wp_local;
+	u32 rp_local;
+	u32 end;			/* offset past last element */
+};
+
+struct gsi_trans_info {
+	struct gsi_trans **map;		/* TRE -> transaction map */
+	u32 pool_count;			/* # transactions in the pool */
+	struct gsi_trans *pool;		/* transaction allocation pool */
+	u32 pool_free;			/* next free transaction in pool */
+	u32 sg_pool_count;		/* # SGs in the allocation pool */
+	struct scatterlist *sg_pool;	/* SG allocation pool */
+	u32 sg_pool_free;		/* next free SG pool entry */
+
+	spinlock_t spinlock;		/* protects updates to the rest */
+	u32 tre_avail;			/* unallocated TREs in ring */
+	struct list_head alloc;		/* allocated, not committed */
+	struct list_head pending;	/* committed, awaiting completion */
+	struct list_head complete;	/* completed, awaiting poll */
+	struct list_head polled;	/* returned by gsi_channel_poll_one() */
+};
+
+/* Hardware values signifying the state of a channel */
+enum gsi_channel_state {
+	GSI_CHANNEL_STATE_NOT_ALLOCATED	= 0x0,
+	GSI_CHANNEL_STATE_ALLOCATED	= 0x1,
+	GSI_CHANNEL_STATE_STARTED	= 0x2,
+	GSI_CHANNEL_STATE_STOPPED	= 0x3,
+	GSI_CHANNEL_STATE_STOP_IN_PROC	= 0x4,
+	GSI_CHANNEL_STATE_ERROR		= 0xf,
+};
+
+/* We only care about channels between IPA and AP */
+struct gsi_channel {
+	struct gsi *gsi;
+	u32 toward_ipa;			/* 0: IPA->AP; 1: AP->IPA */
+
+	const struct gsi_channel_data *data;	/* initialization data */
+
+	struct completion completion;	/* signals channel state changes */
+	enum gsi_channel_state state;
+
+	struct gsi_ring tre_ring;
+	u32 evt_ring_id;
+
+	u64 byte_count;			/* total # bytes transferred */
+	u64 trans_count;		/* total # transactions */
+	u64 doorbell_byte_count;	/* TX byte_count at last doorbell */
+	u64 doorbell_trans_count;	/* TX trans_count at last doorbell */
+
+	struct gsi_trans_info trans_info;
+
+	struct napi_struct napi;
+};
+
+/* Hardware values signifying the state of an event ring */
+enum gsi_evt_ring_state {
+	GSI_EVT_RING_STATE_NOT_ALLOCATED	= 0x0,
+	GSI_EVT_RING_STATE_ALLOCATED		= 0x1,
+	GSI_EVT_RING_STATE_ERROR		= 0xf,
+};
+
+struct gsi_evt_ring {
+	struct gsi_channel *channel;
+	struct completion completion;	/* signals event ring state changes */
+	enum gsi_evt_ring_state state;
+	struct gsi_ring ring;
+};
+
+struct gsi {
+	struct device *dev;		/* Same as IPA device */
+	struct net_device dummy_dev;	/* needed for NAPI */
+	void __iomem *virt;
+	u32 irq;
+	u32 irq_wake_enabled;		/* 1: irq wake was enabled */
+	struct gsi_channel channel[GSI_CHANNEL_MAX];
+	struct gsi_channel_stats channel_stats;
+	struct gsi_evt_ring evt_ring[GSI_EVT_RING_MAX];
+	u32 event_bitmap;
+	u32 event_enable_bitmap;
+	spinlock_t spinlock;		/* global register updates */
+	struct mutex mutex;		/* protects commands, programming */
+};
+
+/**
+ * gsi_setup() - Set up the GSI subsystem
+ * @gsi:	Address of GSI structure embedded in an IPA structure
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * Performs initialization that must wait until the GSI hardware is
+ * ready (including firmware loaded).
+ */
+int gsi_setup(struct gsi *gsi);
+
+/**
+ * gsi_teardown() - Tear down GSI subsystem
+ * @gsi:	GSI address previously passed to a successful gsi_setup() call
+ */
+void gsi_teardown(struct gsi *gsi);
+
+/**
+ * gsi_channel_trans_max() - Channel maximum number of transactions
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel whose limit is to be returned
+ *
+ * Return:	The maximum number of pending transactions on the channel
+ */
+u32 gsi_channel_trans_max(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_trans_tre_max() - Return the maximum TREs per transaction
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel whose limit is to be returned
+ *
+ * Return:	The maximum TRE count per transaction on the channel
+ */
+u32 gsi_channel_trans_tre_max(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_trans_quiesce() - Wait for channel transactions to complete
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel to quiesce
+ *
+ * Wait for all of a channel's currently-allocated transactions to
+ * be committed, complete, and be freed.
+ *
+ * NOTE:  Assumes no new transactions will be issued before it returns.
+ */
+void gsi_channel_trans_quiesce(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_start() - Make a GSI channel operational
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel to start
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+int gsi_channel_start(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_stop() - Stop an operational GSI channel
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel to stop
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+int gsi_channel_stop(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_reset() - Reset a GSI channel
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel to be reset
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * GSI hardware relinquishes ownership of all pending receive buffer
+ * transactions as a result of reset.  They will be completed with
+ * result code -ECANCELED.
+ */
+int gsi_channel_reset(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_config() - Configure a GSI channel
+ * @gsi:		GSI pointer
+ * @channel_id:		Channel to be configured
+ * @doorbell_enable:	Whether to enable hardware doorbell engine
+ */
+void gsi_channel_config(struct gsi *gsi, u32 channel_id, bool doorbell_enable);
+
+/**
+ * gsi_init() - Initialize the GSI subsystem
+ * @gsi:	Address of GSI structure embedded in an IPA structure
+ * @pdev:	IPA platform device
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * Early stage initialization of the GSI subsystem, performing tasks
+ * that can be done before the GSI hardware is ready to use.
+ */
+int gsi_init(struct gsi *gsi, struct platform_device *pdev, u32 data_count,
+	     const struct gsi_ipa_endpoint_data *data);
+
+/**
+ * gsi_exit() - Exit the GSI subsystem
+ * @gsi:	GSI address previously passed to a successful gsi_init() call
+ */
+void gsi_exit(struct gsi *gsi);
+
+#endif /* _GSI_H_ */
diff --git a/drivers/net/ipa/gsi_private.h b/drivers/net/ipa/gsi_private.h
new file mode 100644
index 000000000000..31a665fb7756
--- /dev/null
+++ b/drivers/net/ipa/gsi_private.h
@@ -0,0 +1,147 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _GSI_PRIVATE_H_
+#define _GSI_PRIVATE_H_
+
+#include <linux/types.h>
+
+/* === NOTE:  Only "gsi.c" and "gsi_trans.c" should include this file === */
+
+struct gsi_trans;
+struct gsi_ring;
+struct gsi_channel;
+
+/* An entry in an event ring */
+struct gsi_xfer_compl_evt {
+	__le64 xfer_ptr;
+	__le16 len;
+	u8 rsvd1;
+	u8 code;
+	__le16 rsvd;
+	u8 type;
+	u8 chid;
+} __packed;
+
+/* An entry in a channel ring */
+struct gsi_tre {
+	__le64 addr;		/* DMA address */
+	__le16 len_opcode;	/* length in bytes or enum IPA_CMD_* */
+	__le16 reserved;
+	__le32 flags;		/* GSI_TRE_FLAGS_* */
+};
+
+/**
+ * gsi_trans_move_complete() - Mark a GSI transaction completed
+ * @trans:	Transaction to mark completed
+ */
+void gsi_trans_move_complete(struct gsi_trans *trans);
+
+/**
+ * gsi_trans_move_polled() - Mark a transaction polled
+ * @trans:	Transaction to update
+ */
+void gsi_trans_move_polled(struct gsi_trans *trans);
+
+/**
+ * gsi_trans_complete() - Complete a GSI transaction
+ * @trans:	Transaction to complete
+ *
+ * Marks a transaction complete (including freeing it).
+ */
+void gsi_trans_complete(struct gsi_trans *trans);
+
+/**
+ * gsi_channel_trans_mapped() - Return a transaction mapped to a TRE index
+ * @channel:	Channel associated with the transaction
+ * @index:	Index of the TRE having a transaction
+ *
+ * Return:	The GSI transaction pointer associated with the TRE index
+ */
+struct gsi_trans *gsi_channel_trans_mapped(struct gsi_channel *channel,
+					   u32 index);
+
+/**
+ * gsi_channel_trans_complete() - Return a channel's next completed transaction
+ * @channel:	Channel whose next transaction is to be returned
+ *
+ * Return:	The next completed transaction, or NULL if nothing new
+ */
+struct gsi_trans *gsi_channel_trans_complete(struct gsi_channel *channel);
+
+/**
+ * gsi_channel_trans_cancel_pending() - Cancel pending transactions
+ * @channel:	Channel whose pending transactions should be cancelled
+ *
+ * Cancel all pending transactions on a channel.  These are
+ * transactions that have been committed but not yet completed.  This
+ * is required when the channel gets reset.  At that time all
+ * pending transactions will be completed with a result -ECANCELED.
+ *
+ * NOTE:  Transactions already complete at the time of this call are
+ *	  unaffected.
+ */
+void gsi_channel_trans_cancel_pending(struct gsi_channel *channel);
+
+/**
+ * gsi_channel_trans_init() - Initialize a channel's GSI transaction info
+ * @channel:	The channel whose transaction info is to be set up
+ *
+ * Return:	0 if successful, or -ENOMEM on allocation failure
+ *
+ * Creates and sets up information for managing transactions on a channel
+ */
+int gsi_channel_trans_init(struct gsi_channel *channel);
+
+/**
+ * gsi_channel_trans_exit() - Inverse of gsi_channel_trans_init()
+ * @channel:	Channel whose transaction information is to be cleaned up
+ */
+void gsi_channel_trans_exit(struct gsi_channel *channel);
+
+/**
+ * gsi_channel_doorbell() - Ring a channel's doorbell
+ * @channel:	Channel whose doorbell should be rung
+ *
+ * Rings a channel's doorbell to inform the GSI hardware that new
+ * transactions (TREs, really) are available for it to process.
+ */
+void gsi_channel_doorbell(struct gsi_channel *channel);
+
+/**
+ * gsi_ring_virt() - Return virtual address for a 32-bit ring offset
+ * @ring:	Ring whose address is to be translated
+ * @offset:	32-bit ring "hardware" address (low 32 bits of DMA address)
+ */
+void *gsi_ring_virt(struct gsi_ring *ring, u32 offset);
+
+/**
+ * gsi_ring_index() - Return index for a 32-bit ring offset
+ * @ring:	Ring whose address is to be translated
+ * @offset:	32-bit ring "hardware" address (low 32 bits of DMA address)
+ */
+u32 ring_index(struct gsi_ring *ring, u32 offset);
+
+/**
+ * gsi_ring_wp_local_add() - Advance ring local write pointer
+ * @ring:	Ring whose local write pointer is to be advanced
+ * @val:	Number of ring slots to add to local write pointer
+ */
+void gsi_ring_wp_local_add(struct gsi_ring *ring, u32 val);
+
+/**
+ * gsi_event_handle() - Handle the arrival of an event
+ * @gsi:	GSI pointer
+ * @evt_ring_id: Event ring for which an event has occurred
+ *
+ * This is normally used in IRQ handling when an IEOB interrupt
+ * arrives.  But it is also used when cancelling transactions
+ * following a channel reset.
+ */
+void gsi_event_handle(struct gsi *gsi, u32 evt_ring_id);
+
+u32 gsi_channel_id(struct gsi_channel *channel);
+
+#endif /* _GSI_PRIVATE_H_ */
diff --git a/drivers/net/ipa/gsi_reg.h b/drivers/net/ipa/gsi_reg.h
new file mode 100644
index 000000000000..969215ae1580
--- /dev/null
+++ b/drivers/net/ipa/gsi_reg.h
@@ -0,0 +1,376 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _GSI_REG_H_
+#define _GSI_REG_H_
+
+#include <linux/bits.h>
+
+/* === NOTE:  Only "gsi.c" should include this file === */
+
+/**
+ * DOC: GSI Registers
+ *
+ * GSI registers are located within the "gsi" address space defined by Device
+ * Tree.  The offset of each register within that space is specified by
+ * symbols defined below.  The GSI address space is mapped to virtual memory
+ * space in gsi_init().  All GSI registers are 32 bits wide.
+ *
+ * Each register type is replicated, with one instance per execution
+ * environment, channel, or event ring.  For example, each GSI channel has
+ * its own set of registers defining its configuration.  The offset to a
+ * channel's set of registers is a "base" offset plus a "stride" amount
+ * computed from the channel's ID.  For such registers, the offset is
+ * computed by a function-like macro that takes the ID as a parameter.
+ *
+ * The offset of a register dependent on execution environment is computed
+ * by a macro that is supplied a parameter "ee".  The "ee" value is a member
+ * of the gsi_ee enumerated type.
+ *
+ * The offset of a channel register is computed by a macro that is supplied a
+ * parameter "ch".  The "ch" value is a channel id whose maximum value is 30
+ * (though the actual limit is hardware-dependent).
+ *
+ * The offset of an event register is computed by a macro that is supplied a
+ * parameter "ev".  The "ev" value is an event id whose maximum value is 15
+ * (though the actual limit is hardware-dependent).
+ */
+
+#define GSI_INTER_EE_SRC_CH_IRQ_OFFSET \
+			GSI_INTER_EE_N_SRC_CH_IRQ_OFFSET(GSI_EE_AP)
+#define GSI_INTER_EE_N_SRC_CH_IRQ_OFFSET(ee) \
+			(0x0000c018 + 0x1000 * (ee))
+
+#define GSI_INTER_EE_SRC_EV_CH_IRQ_OFFSET \
+			GSI_INTER_EE_N_SRC_EV_CH_IRQ_OFFSET(GSI_EE_AP)
+#define GSI_INTER_EE_N_SRC_EV_CH_IRQ_OFFSET(ee) \
+			(0x0000c01c + 0x1000 * (ee))
+
+#define GSI_INTER_EE_SRC_CH_IRQ_CLR_OFFSET \
+			GSI_INTER_EE_N_SRC_CH_IRQ_CLR_OFFSET(GSI_EE_AP)
+#define GSI_INTER_EE_N_SRC_CH_IRQ_CLR_OFFSET(ee) \
+			(0x0000c028 + 0x1000 * (ee))
+
+#define GSI_INTER_EE_SRC_EV_CH_IRQ_CLR_OFFSET \
+			GSI_INTER_EE_N_SRC_EV_CH_IRQ_CLR_OFFSET(GSI_EE_AP)
+#define GSI_INTER_EE_N_SRC_EV_CH_IRQ_CLR_OFFSET(ee) \
+			(0x0000c02c + 0x1000 * (ee))
+
+#define GSI_CH_C_CNTXT_0_OFFSET(ch) \
+		GSI_EE_N_CH_C_CNTXT_0_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_0_OFFSET(ch, ee) \
+		(0x0001c000 + 0x4000 * (ee) + 0x80 * (ch))
+#define CHTYPE_PROTOCOL_FMASK		GENMASK(2, 0)
+#define CHTYPE_DIR_FMASK		GENMASK(3, 3)
+#define EE_FMASK			GENMASK(7, 4)
+#define CHID_FMASK			GENMASK(12, 8)
+#define ERINDEX_FMASK			GENMASK(18, 14)
+#define CHSTATE_FMASK			GENMASK(23, 20)
+#define ELEMENT_SIZE_FMASK		GENMASK(31, 24)
+
+#define GSI_CH_C_CNTXT_1_OFFSET(ch) \
+		GSI_EE_N_CH_C_CNTXT_1_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_1_OFFSET(ch, ee) \
+		(0x0001c004 + 0x4000 * (ee) + 0x80 * (ch))
+#define R_LENGTH_FMASK			GENMASK(15, 0)
+
+#define GSI_CH_C_CNTXT_2_OFFSET(ch) \
+		GSI_EE_N_CH_C_CNTXT_2_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_2_OFFSET(ch, ee) \
+		(0x0001c008 + 0x4000 * (ee) + 0x80 * (ch))
+
+#define GSI_CH_C_CNTXT_3_OFFSET(ch) \
+		GSI_EE_N_CH_C_CNTXT_3_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_3_OFFSET(ch, ee) \
+		(0x0001c00c + 0x4000 * (ee) + 0x80 * (ch))
+
+#define GSI_CH_C_QOS_OFFSET(ch) \
+		GSI_EE_N_CH_C_QOS_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_QOS_OFFSET(ch, ee) \
+		(0x0001c05c + 0x4000 * (ee) + 0x80 * (ch))
+#define WRR_WEIGHT_FMASK		GENMASK(3, 0)
+#define MAX_PREFETCH_FMASK		GENMASK(8, 8)
+#define USE_DB_ENG_FMASK		GENMASK(9, 9)
+
+#define GSI_CH_C_SCRATCH_0_OFFSET(ch) \
+		GSI_EE_N_CH_C_SCRATCH_0_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_0_OFFSET(ch, ee) \
+		(0x0001c060 + 0x4000 * (ee) + 0x80 * (ch))
+
+#define GSI_CH_C_SCRATCH_1_OFFSET(ch) \
+		GSI_EE_N_CH_C_SCRATCH_1_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_1_OFFSET(ch, ee) \
+		(0x0001c064 + 0x4000 * (ee) + 0x80 * (ch))
+
+#define GSI_CH_C_SCRATCH_2_OFFSET(ch) \
+		GSI_EE_N_CH_C_SCRATCH_2_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_2_OFFSET(ch, ee) \
+		(0x0001c068 + 0x4000 * (ee) + 0x80 * (ch))
+
+#define GSI_CH_C_SCRATCH_3_OFFSET(ch) \
+		GSI_EE_N_CH_C_SCRATCH_3_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_3_OFFSET(ch, ee) \
+		(0x0001c06c + 0x4000 * (ee) + 0x80 * (ch))
+
+#define GSI_EV_CH_E_CNTXT_0_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_0_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_0_OFFSET(ev, ee) \
+		(0x0001d000 + 0x4000 * (ee) + 0x80 * (ev))
+#define EV_CHTYPE_FMASK			GENMASK(3, 0)
+#define EV_EE_FMASK			GENMASK(7, 4)
+#define EV_EVCHID_FMASK			GENMASK(15, 8)
+#define EV_INTYPE_FMASK			GENMASK(16, 16)
+#define EV_CHSTATE_FMASK		GENMASK(23, 20)
+#define EV_ELEMENT_SIZE_FMASK		GENMASK(31, 24)
+
+#define GSI_EV_CH_E_CNTXT_1_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_1_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_1_OFFSET(ev, ee) \
+		(0x0001d004 + 0x4000 * (ee) + 0x80 * (ev))
+#define EV_R_LENGTH_FMASK		GENMASK(15, 0)
+
+#define GSI_EV_CH_E_CNTXT_2_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_2_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_2_OFFSET(ev, ee) \
+		(0x0001d008 + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_CNTXT_3_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_3_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_3_OFFSET(ev, ee) \
+		(0x0001d00c + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_CNTXT_4_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_4_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_4_OFFSET(ev, ee) \
+		(0x0001d010 + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_CNTXT_8_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_8_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_8_OFFSET(ev, ee) \
+		(0x0001d020 + 0x4000 * (ee) + 0x80 * (ev))
+#define MODT_FMASK			GENMASK(15, 0)
+#define MODC_FMASK			GENMASK(23, 16)
+#define MOD_CNT_FMASK			GENMASK(31, 24)
+
+#define GSI_EV_CH_E_CNTXT_9_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_9_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_9_OFFSET(ev, ee) \
+		(0x0001d024 + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_CNTXT_10_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_10_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_10_OFFSET(ev, ee) \
+		(0x0001d028 + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_CNTXT_11_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_11_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_11_OFFSET(ev, ee) \
+		(0x0001d02c + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_CNTXT_12_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_12_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_12_OFFSET(ev, ee) \
+		(0x0001d030 + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_CNTXT_13_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_CNTXT_13_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_13_OFFSET(ev, ee) \
+		(0x0001d034 + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_SCRATCH_0_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_SCRATCH_0_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_SCRATCH_0_OFFSET(ev, ee) \
+		(0x0001d048 + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_EV_CH_E_SCRATCH_1_OFFSET(ev) \
+		GSI_EE_N_EV_CH_E_SCRATCH_1_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_SCRATCH_1_OFFSET(ev, ee) \
+		(0x0001d04c + 0x4000 * (ee) + 0x80 * (ev))
+
+#define GSI_CH_C_DOORBELL_0_OFFSET(ch) \
+		GSI_EE_N_CH_C_DOORBELL_0_OFFSET((ch), GSI_EE_AP)
+#define GSI_EE_N_CH_C_DOORBELL_0_OFFSET(ch, ee) \
+			(0x0001e000 + 0x4000 * (ee) + 0x08 * (ch))
+
+#define GSI_EV_CH_E_DOORBELL_0_OFFSET(ev) \
+			GSI_EE_N_EV_CH_E_DOORBELL_0_OFFSET((ev), GSI_EE_AP)
+#define GSI_EE_N_EV_CH_E_DOORBELL_0_OFFSET(ev, ee) \
+			(0x0001e100 + 0x4000 * (ee) + 0x08 * (ev))
+
+#define GSI_GSI_STATUS_OFFSET \
+			GSI_EE_N_GSI_STATUS_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_GSI_STATUS_OFFSET(ee) \
+			(0x0001f000 + 0x4000 * (ee))
+#define ENABLED_FMASK			GENMASK(0, 0)
+
+#define GSI_CH_CMD_OFFSET \
+			GSI_EE_N_CH_CMD_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CH_CMD_OFFSET(ee) \
+			(0x0001f008 + 0x4000 * (ee))
+#define CH_CHID_FMASK			GENMASK(7, 0)
+#define CH_OPCODE_FMASK			GENMASK(31, 24)
+
+#define GSI_EV_CH_CMD_OFFSET \
+			GSI_EE_N_EV_CH_CMD_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_EV_CH_CMD_OFFSET(ee) \
+			(0x0001f010 + 0x4000 * (ee))
+#define EV_CHID_FMASK			GENMASK(7, 0)
+#define EV_OPCODE_FMASK			GENMASK(31, 24)
+
+#define GSI_GSI_HW_PARAM_2_OFFSET \
+			GSI_EE_N_GSI_HW_PARAM_2_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_GSI_HW_PARAM_2_OFFSET(ee) \
+			(0x0001f040 + 0x4000 * (ee))
+#define IRAM_SIZE_FMASK			GENMASK(2, 0)
+#define NUM_CH_PER_EE_FMASK		GENMASK(7, 3)
+#define NUM_EV_PER_EE_FMASK		GENMASK(12, 8)
+#define GSI_CH_PEND_TRANSLATE_FMASK	GENMASK(13, 13)
+#define GSI_CH_FULL_LOGIC_FMASK		GENMASK(14, 14)
+#define IRAM_SIZE_ONE_KB_FVAL			0
+#define IRAM_SIZE_TWO_KB_FVAL			1
+
+#define GSI_CNTXT_TYPE_IRQ_OFFSET \
+			GSI_EE_N_CNTXT_TYPE_IRQ_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_TYPE_IRQ_OFFSET(ee) \
+			(0x0001f080 + 0x4000 * (ee))
+#define CH_CTRL_FMASK			GENMASK(0, 0)
+#define EV_CTRL_FMASK			GENMASK(1, 1)
+#define GLOB_EE_FMASK			GENMASK(2, 2)
+#define IEOB_FMASK			GENMASK(3, 3)
+#define INTER_EE_CH_CTRL_FMASK		GENMASK(4, 4)
+#define INTER_EE_EV_CTRL_FMASK		GENMASK(5, 5)
+#define GENERAL_FMASK			GENMASK(6, 6)
+
+#define GSI_CNTXT_TYPE_IRQ_MSK_OFFSET \
+			GSI_EE_N_CNTXT_TYPE_IRQ_MSK_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_TYPE_IRQ_MSK_OFFSET(ee) \
+			(0x0001f088 + 0x4000 * (ee))
+#define MSK_CH_CTRL_FMASK		GENMASK(0, 0)
+#define MSK_EV_CTRL_FMASK		GENMASK(1, 1)
+#define MSK_GLOB_EE_FMASK		GENMASK(2, 2)
+#define MSK_IEOB_FMASK			GENMASK(3, 3)
+#define MSK_INTER_EE_CH_CTRL_FMASK	GENMASK(4, 4)
+#define MSK_INTER_EE_EV_CTRL_FMASK	GENMASK(5, 5)
+#define MSK_GENERAL_FMASK		GENMASK(6, 6)
+#define GSI_CNTXT_TYPE_IRQ_MSK_ALL	GENMASK(6, 0)
+
+#define GSI_CNTXT_SRC_CH_IRQ_OFFSET \
+			GSI_EE_N_CNTXT_SRC_CH_IRQ_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_CH_IRQ_OFFSET(ee) \
+			(0x0001f090 + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_EV_CH_IRQ_OFFSET \
+			GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_OFFSET(ee) \
+			(0x0001f094 + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_CH_IRQ_MSK_OFFSET \
+			GSI_EE_N_CNTXT_SRC_CH_IRQ_MSK_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_CH_IRQ_MSK_OFFSET(ee) \
+			(0x0001f098 + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_EV_CH_IRQ_MSK_OFFSET \
+			GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_MSK_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_MSK_OFFSET(ee) \
+			(0x0001f09c + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_CH_IRQ_CLR_OFFSET \
+			GSI_EE_N_CNTXT_SRC_CH_IRQ_CLR_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_CH_IRQ_CLR_OFFSET(ee) \
+			(0x0001f0a0 + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_EV_CH_IRQ_CLR_OFFSET \
+			GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_CLR_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_CLR_OFFSET(ee) \
+			(0x0001f0a4 + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_IEOB_IRQ_OFFSET \
+			GSI_EE_N_CNTXT_SRC_IEOB_IRQ_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_IEOB_IRQ_OFFSET(ee) \
+			(0x0001f0b0 + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET \
+			GSI_EE_N_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET(ee) \
+			(0x0001f0b8 + 0x4000 * (ee))
+
+#define GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET \
+			GSI_EE_N_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET(ee) \
+			(0x0001f0c0 + 0x4000 * (ee))
+
+#define GSI_CNTXT_GLOB_IRQ_STTS_OFFSET \
+			GSI_EE_N_CNTXT_GLOB_IRQ_STTS_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_GLOB_IRQ_STTS_OFFSET(ee) \
+			(0x0001f100 + 0x4000 * (ee))
+#define ERROR_INT_FMASK			GENMASK(0, 0)
+#define GP_INT1_FMASK			GENMASK(1, 1)
+#define GP_INT2_FMASK			GENMASK(2, 2)
+#define GP_INT3_FMASK			GENMASK(3, 3)
+
+#define GSI_CNTXT_GLOB_IRQ_EN_OFFSET \
+			GSI_EE_N_CNTXT_GLOB_IRQ_EN_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_GLOB_IRQ_EN_OFFSET(ee) \
+			(0x0001f108 + 0x4000 * (ee))
+#define EN_ERROR_INT_FMASK		GENMASK(0, 0)
+#define EN_GP_INT1_FMASK		GENMASK(1, 1)
+#define EN_GP_INT2_FMASK		GENMASK(2, 2)
+#define EN_GP_INT3_FMASK		GENMASK(3, 3)
+#define GSI_CNTXT_GLOB_IRQ_ALL		GENMASK(3, 0)
+
+#define GSI_CNTXT_GLOB_IRQ_CLR_OFFSET \
+			GSI_EE_N_CNTXT_GLOB_IRQ_CLR_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_GLOB_IRQ_CLR_OFFSET(ee) \
+			(0x0001f110 + 0x4000 * (ee))
+#define CLR_ERROR_INT_FMASK		GENMASK(0, 0)
+#define CLR_GP_INT1_FMASK		GENMASK(1, 1)
+#define CLR_GP_INT2_FMASK		GENMASK(2, 2)
+#define CLR_GP_INT3_FMASK		GENMASK(3, 3)
+
+#define GSI_CNTXT_GSI_IRQ_STTS_OFFSET \
+			GSI_EE_N_CNTXT_GSI_IRQ_STTS_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_GSI_IRQ_STTS_OFFSET(ee) \
+			(0x0001f118 + 0x4000 * (ee))
+#define BREAK_POINT_FMASK		GENMASK(0, 0)
+#define BUS_ERROR_FMASK			GENMASK(1, 1)
+#define CMD_FIFO_OVRFLOW_FMASK		GENMASK(2, 2)
+#define MCS_STACK_OVRFLOW_FMASK		GENMASK(3, 3)
+
+#define GSI_CNTXT_GSI_IRQ_EN_OFFSET \
+			GSI_EE_N_CNTXT_GSI_IRQ_EN_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_GSI_IRQ_EN_OFFSET(ee) \
+			(0x0001f120 + 0x4000 * (ee))
+#define EN_BREAK_POINT_FMASK		GENMASK(0, 0)
+#define EN_BUS_ERROR_FMASK		GENMASK(1, 1)
+#define EN_CMD_FIFO_OVRFLOW_FMASK	GENMASK(2, 2)
+#define EN_MCS_STACK_OVRFLOW_FMASK	GENMASK(3, 3)
+#define GSI_CNTXT_GSI_IRQ_ALL		GENMASK(3, 0)
+
+#define GSI_CNTXT_GSI_IRQ_CLR_OFFSET \
+			GSI_EE_N_CNTXT_GSI_IRQ_CLR_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_GSI_IRQ_CLR_OFFSET(ee) \
+			(0x0001f128 + 0x4000 * (ee))
+#define CLR_BREAK_POINT_FMASK		GENMASK(0, 0)
+#define CLR_BUS_ERROR_FMASK		GENMASK(1, 1)
+#define CLR_CMD_FIFO_OVRFLOW_FMASK	GENMASK(2, 2)
+#define CLR_MCS_STACK_OVRFLOW_FMASK	GENMASK(3, 3)
+
+#define GSI_CNTXT_INTSET_OFFSET \
+			GSI_EE_N_CNTXT_INTSET_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_CNTXT_INTSET_OFFSET(ee) \
+			(0x0001f180 + 0x4000 * (ee))
+#define INTYPE_FMASK			GENMASK(0, 0)
+
+#define GSI_ERROR_LOG_OFFSET \
+			GSI_EE_N_ERROR_LOG_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_ERROR_LOG_OFFSET(ee) \
+			(0x0001f200 + 0x4000 * (ee))
+
+#define GSI_ERROR_LOG_CLR_OFFSET \
+			GSI_EE_N_ERROR_LOG_CLR_OFFSET(GSI_EE_AP)
+#define GSI_EE_N_ERROR_LOG_CLR_OFFSET(ee) \
+			(0x0001f210 + 0x4000 * (ee))
+
+#endif	/* _GSI_REG_H_ */
-- 
2.20.1



* [PATCH 08/18] soc: qcom: ipa: the generic software interface
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (6 preceding siblings ...)
  2019-05-12  1:24 ` [PATCH 07/18] soc: qcom: ipa: GSI headers Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-15  7:21   ` Arnd Bergmann
                     ` (2 more replies)
  2019-05-12  1:24 ` [PATCH 09/18] soc: qcom: ipa: GSI transactions Alex Elder
                   ` (10 subsequent siblings)
  18 siblings, 3 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch includes "gsi.c", which implements the generic software
interface (GSI) for IPA.  The generic software interface abstracts
channels, which provide a means of transferring data either from the
AP to the IPA, or from the IPA to the AP.  A ring buffer of "transfer
elements" (TREs) is used to describe data transfers to perform.  The
AP writes a doorbell register associated with a channel to let the IPA
know it has added new entries (for an AP->IPA channel) or has finished
processing entries (for an IPA->AP channel).

Each channel also has an event ring buffer, used by the IPA to
communicate information about events related to a channel (for
example, the completion of TREs).  The IPA writes its own doorbell
register, which triggers an interrupt on the AP, to signal that
new event information has arrived.

Signed-off-by: Alex Elder <elder@linaro.org>
---
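The register offsets and field masks that gsi.c takes from gsi_reg.h
follow a base-plus-stride pattern.  The user-space sketch below mirrors
that arithmetic for illustration only.  All sketch_* and SKETCH_* names
are hypothetical; SKETCH_GENMASK() and sketch_get_bits() are stand-ins
for the kernel's GENMASK() and u32_get_bits(), and the execution
environment number is passed explicitly because the value of GSI_EE_AP
is not shown in this patch.

```c
#include <stdint.h>

/* User-space stand-in for the kernel's GENMASK(h, l), 32-bit only */
#define SKETCH_GENMASK(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Field positions copied from the masks defined in this series */
#define SKETCH_CHSTATE_FMASK		SKETCH_GENMASK(23, 20)
#define SKETCH_LOG_ERR_CODE_FMASK	SKETCH_GENMASK(15, 12)
#define SKETCH_LOG_ERR_VIRT_IDX_FMASK	SKETCH_GENMASK(23, 19)
#define SKETCH_LOG_ERR_TYPE_FMASK	SKETCH_GENMASK(27, 24)

/* Same arithmetic as GSI_EE_N_CH_C_CNTXT_0_OFFSET(ch, ee) */
static inline uint32_t sketch_ch_cntxt_0_offset(uint32_t ch, uint32_t ee)
{
	return 0x0001c000 + 0x4000 * ee + 0x80 * ch;
}

/* Stand-in for u32_get_bits(): extract the field selected by mask */
static inline uint32_t sketch_get_bits(uint32_t val, uint32_t mask)
{
	/* mask & (~mask + 1) isolates the lowest set bit of the mask;
	 * dividing by it shifts the field down to bit 0
	 */
	return (val & mask) / (mask & (~mask + 1));
}
```

With ee = 0 and ch = 3 the sketch yields offset 0x0001c180, and a
register value of 0x00200000 decodes to a CHSTATE field of 2.
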
 drivers/net/ipa/gsi.c | 1741 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1741 insertions(+)
 create mode 100644 drivers/net/ipa/gsi.c
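
The interrupt handlers in gsi.c walk a bitmask of pending sources by
repeatedly taking the lowest set bit with __ffs() and clearing it.  A
minimal user-space sketch of that loop follows.  sketch_ffs(),
sketch_walk_mask() and sketch_unhandled() are hypothetical names, not
functions in the driver.  The sketch also contrasts clearing a bit with
XOR, which is only correct when the bit is known to be set, against
AND-NOT, which is correct either way.

```c
#include <stdint.h>

/* Index of the lowest set bit; like __ffs(), the caller must ensure
 * mask is non-zero
 */
static unsigned int sketch_ffs(uint32_t mask)
{
	unsigned int bit = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		bit++;
	}

	return bit;
}

/* Visit each set bit from lowest to highest, recording its index in
 * ids[]; returns the number of indices recorded (at most max)
 */
static unsigned int
sketch_walk_mask(uint32_t mask, unsigned int *ids, unsigned int max)
{
	unsigned int count = 0;

	while (mask && count < max) {
		unsigned int id = sketch_ffs(mask);

		/* XOR is safe here: the bit was just found to be set */
		mask ^= UINT32_C(1) << id;
		ids[count++] = id;
	}

	return count;
}

/* Bits left over once the handled ones are removed.  AND-NOT never
 * sets a bit, so a handled bit that was already clear stays clear.
 */
static uint32_t sketch_unhandled(uint32_t status, uint32_t handled)
{
	return status & ~handled;
}
```

For a mask of 0x29 the walk visits bits 0, 3 and 5 in that order.  With
a status of 0 and a handled mask of 0x1, sketch_unhandled() returns 0,
whereas XOR of the same two values would yield 0x1.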

diff --git a/drivers/net/ipa/gsi.c b/drivers/net/ipa/gsi.c
new file mode 100644
index 000000000000..e9dd40c058c6
--- /dev/null
+++ b/drivers/net/ipa/gsi.c
@@ -0,0 +1,1741 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/bits.h>
+#include <linux/bitfield.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/completion.h>
+#include <linux/io.h>
+#include <linux/bug.h>
+#include <linux/interrupt.h>
+#include <linux/platform_device.h>
+#include <linux/netdevice.h>
+
+#include "gsi.h"
+#include "gsi_reg.h"
+#include "gsi_private.h"
+#include "gsi_trans.h"
+#include "ipa_gsi.h"
+#include "ipa_data.h"
+
+/**
+ * DOC: The IPA Generic Software Interface
+ *
+ * The generic software interface (GSI) is an integral component of the IPA,
+ * providing a well-defined communication layer between the AP subsystem
+ * and the IPA core.  The modem uses the GSI layer as well.
+ *
+ *	--------	     ---------
+ *	|      |	     |	     |
+ *	|  AP  +<---.	.----+ Modem |
+ *	|      +--. |	| .->+	     |
+ *	|      |  | |	| |  |	     |
+ *	--------  | |	| |  ---------
+ *		  v |	v |
+ *		--+-+---+-+--
+ *		|    GSI    |
+ *		|-----------|
+ *		|	    |
+ *		|    IPA    |
+ *		|	    |
+ *		-------------
+ *
+ * In the above diagram, the AP and Modem represent "execution environments"
+ * (EEs), which are independent operating environments that use the IPA for
+ * data transfer.
+ *
+ * Each EE uses a set of unidirectional GSI "channels," which allow transfer
+ * of data to or from the IPA.  A channel is implemented as a ring buffer,
+ * with a DRAM-resident array of "transfer elements" (TREs) available to
+ * describe transfers to or from other EEs through the IPA.  A transfer
+ * element can also contain an immediate command, requesting the IPA perform
+ * actions other than data transfer.
+ *
+ * Each TRE refers to a block of data--also located in DRAM.  After writing one
+ * or more TREs to a channel, the writer (either the IPA or an EE) writes a
+ * doorbell register to inform the receiving side how many elements have
+ * been written.  Writing to a doorbell register triggers an interrupt on
+ * the receiver.
+ *
+ * Each channel has a GSI "event ring" associated with it.  An event ring
+ * is implemented very much like a channel ring, but is always directed from
+ * the IPA to an EE.  The IPA notifies an EE (such as the AP) about channel
+ * events by adding an entry to the event ring associated with the channel;
+ * when it writes the event ring's doorbell register the EE is interrupted.
+ * Each entry in an event ring contains a pointer to the channel TRE whose
+ * completion the event represents.
+ *
+ * Each TRE in a channel ring has a set of flags.  One flag indicates whether
+ * the completion of the transfer operation generates an entry (and possibly
+ * an interrupt) in the channel's event ring.  Other flags allow transfer
+ * elements to be chained together, forming a single logical transaction.
+ * TRE flags are used to control whether and when interrupts are generated
+ * to signal completion of channel transfers.
+ *
+ * Elements in channel and event rings are completed (or consumed) strictly
+ * in order.  Completion of one entry implies the completion of all preceding
+ * entries.  A single completion interrupt can communicate the completion of
+ * many transfers.
+ *
+ * Note that all GSI registers are little-endian, which is the assumed
+ * endianness of I/O space accesses.  The accessor functions perform byte
+ * swapping if needed (i.e., for a big endian CPU).
+ */
+
+/* Delay period for interrupt moderation (in 32KHz IPA timer ticks) */
+#define IPA_GSI_EVT_RING_INT_MODT	(32 * 1) /* 1ms under 32KHz clock */
+
+#define GSI_CMD_TIMEOUT		5	/* seconds */
+
+#define GSI_MHI_ER_START	10	/* First reserved event number */
+#define GSI_MHI_ER_END		16	/* Last reserved event number */
+
+#define GSI_RESET_WA_MIN_SLEEP	1000	/* microseconds */
+#define GSI_RESET_WA_MAX_SLEEP	2000	/* microseconds */
+
+#define GSI_ISR_MAX_ITER	50
+
+/* Hardware values from the error log register error code field */
+enum gsi_err_code {
+	GSI_INVALID_TRE_ERR			= 0x1,
+	GSI_OUT_OF_BUFFERS_ERR			= 0x2,
+	GSI_OUT_OF_RESOURCES_ERR		= 0x3,
+	GSI_UNSUPPORTED_INTER_EE_OP_ERR		= 0x4,
+	GSI_EVT_RING_EMPTY_ERR			= 0x5,
+	GSI_NON_ALLOCATED_EVT_ACCESS_ERR	= 0x6,
+	GSI_HWO_1_ERR				= 0x8,
+};
+
+/* Hardware values from the error log register error type field */
+enum gsi_err_type {
+	GSI_ERR_TYPE_GLOB	= 0x1,
+	GSI_ERR_TYPE_CHAN	= 0x2,
+	GSI_ERR_TYPE_EVT	= 0x3,
+};
+
+/* Fields in an error log register at GSI_ERROR_LOG_OFFSET */
+#define GSI_LOG_ERR_ARG3_FMASK		GENMASK(3, 0)
+#define GSI_LOG_ERR_ARG2_FMASK		GENMASK(7, 4)
+#define GSI_LOG_ERR_ARG1_FMASK		GENMASK(11, 8)
+#define GSI_LOG_ERR_CODE_FMASK		GENMASK(15, 12)
+#define GSI_LOG_ERR_VIRT_IDX_FMASK	GENMASK(23, 19)
+#define GSI_LOG_ERR_TYPE_FMASK		GENMASK(27, 24)
+#define GSI_LOG_ERR_EE_FMASK		GENMASK(31, 28)
+
+/* Hardware values used when programming an event ring */
+enum gsi_evt_chtype {
+	GSI_EVT_CHTYPE_MHI_EV	= 0x0,
+	GSI_EVT_CHTYPE_XHCI_EV	= 0x1,
+	GSI_EVT_CHTYPE_GPI_EV	= 0x2,
+	GSI_EVT_CHTYPE_XDCI_EV	= 0x3,
+};
+
+/* Hardware values used when programming a channel */
+enum gsi_channel_protocol {
+	GSI_CHANNEL_PROTOCOL_MHI	= 0x0,
+	GSI_CHANNEL_PROTOCOL_XHCI	= 0x1,
+	GSI_CHANNEL_PROTOCOL_GPI	= 0x2,
+	GSI_CHANNEL_PROTOCOL_XDCI	= 0x3,
+};
+
+/* Hardware values representing an event ring immediate command opcode */
+enum gsi_evt_ch_cmd_opcode {
+	GSI_EVT_ALLOCATE	= 0x0,
+	GSI_EVT_RESET		= 0x9,
+	GSI_EVT_DE_ALLOC	= 0xa,
+};
+
+/* Hardware values representing a channel immediate command opcode */
+enum gsi_ch_cmd_opcode {
+	GSI_CH_ALLOCATE	= 0x0,
+	GSI_CH_START	= 0x1,
+	GSI_CH_STOP	= 0x2,
+	GSI_CH_RESET	= 0x9,
+	GSI_CH_DE_ALLOC	= 0xa,
+	GSI_CH_DB_STOP	= 0xb,
+};
+
+/** gsi_gpi_channel_scratch - GPI protocol scratch register
+ *
+ * @max_outstanding_tre:
+ *	Defines the maximum number of TREs allowed in a single transaction
+ *	on a channel (in Bytes).  This determines the amount of prefetch
+ *	performed by the hardware.  We configure this to equal the size of
+ *	the TLV FIFO for the channel.
+ * @outstanding_threshold:
+ *	Defines the threshold (in Bytes) determining when the sequencer
+ *	should update the channel doorbell.  We configure this to equal
+ *	the size of two TREs.
+ */
+struct gsi_gpi_channel_scratch {
+	u64 rsvd1;
+	u16 rsvd2;
+	u16 max_outstanding_tre;
+	u16 rsvd3;
+	u16 outstanding_threshold;
+} __packed;
+
+/** gsi_channel_scratch - channel scratch configuration area
+ *
+ * The exact interpretation of this register is protocol-specific.
+ * We only use GPI channels; see struct gsi_gpi_channel_scratch, above.
+ */
+union gsi_channel_scratch {
+	struct gsi_gpi_channel_scratch gpi;
+	struct {
+		u32 word1;
+		u32 word2;
+		u32 word3;
+		u32 word4;
+	} data;
+} __packed;
+
+/* Enable or disable an event interrupt */
+static void
+_gsi_irq_control_event(struct gsi *gsi, u32 evt_ring_id, bool enable)
+{
+	u32 mask = BIT(evt_ring_id);
+	u32 val;
+
+	if (enable)
+		gsi->event_enable_bitmap |= mask;
+	else
+		gsi->event_enable_bitmap &= ~mask;
+
+	val = gsi->event_enable_bitmap;
+	iowrite32(val, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
+}
+
+static void gsi_irq_enable_event(struct gsi *gsi, u32 evt_ring_id)
+{
+	_gsi_irq_control_event(gsi, evt_ring_id, true);
+}
+
+static void gsi_irq_disable_event(struct gsi *gsi, u32 evt_ring_id)
+{
+	_gsi_irq_control_event(gsi, evt_ring_id, false);
+}
+
+/* Enable or disable all interrupt types */
+static void _gsi_irq_control_all(struct gsi *gsi, bool enable)
+{
+	u32 val;
+
+	/* Inter-EE commands / interrupts are not supported. */
+	val = enable ? GSI_CNTXT_TYPE_IRQ_MSK_ALL : 0;
+	iowrite32(val, gsi->virt + GSI_CNTXT_TYPE_IRQ_MSK_OFFSET);
+
+	val = enable ? GENMASK(GSI_CHANNEL_MAX - 1, 0) : 0;
+	iowrite32(val, gsi->virt + GSI_CNTXT_SRC_CH_IRQ_MSK_OFFSET);
+
+	val = enable ? GENMASK(GSI_EVT_RING_MAX - 1, 0) : 0;
+	iowrite32(val, gsi->virt + GSI_CNTXT_SRC_EV_CH_IRQ_MSK_OFFSET);
+
+	/* IEOB interrupts are managed individually */
+	val = enable ? gsi->event_enable_bitmap : 0;
+	iowrite32(val, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
+
+	val = enable ? GSI_CNTXT_GLOB_IRQ_ALL : 0;
+	iowrite32(val, gsi->virt + GSI_CNTXT_GLOB_IRQ_EN_OFFSET);
+
+	/* Never enable GSI_BREAK_POINT */
+	val = enable ? GSI_CNTXT_GSI_IRQ_ALL & ~EN_BREAK_POINT_FMASK : 0;
+	iowrite32(val, gsi->virt + GSI_CNTXT_GSI_IRQ_EN_OFFSET);
+}
+
+static void gsi_irq_disable_all(struct gsi *gsi)
+{
+	_gsi_irq_control_all(gsi, false);
+}
+
+static void gsi_irq_enable_all(struct gsi *gsi)
+{
+	_gsi_irq_control_all(gsi, true);
+}
+
+/* Return the channel id associated with a given channel */
+u32 gsi_channel_id(struct gsi_channel *channel)
+{
+	return channel - &channel->gsi->channel[0];
+}
+
+/* Return the hardware's notion of the current state of a channel */
+static enum gsi_channel_state gsi_channel_state(struct gsi_channel *channel)
+{
+	u32 channel_id = gsi_channel_id(channel);
+	struct gsi *gsi = channel->gsi;
+	u32 val;
+
+	val = ioread32(gsi->virt + GSI_CH_C_CNTXT_0_OFFSET(channel_id));
+
+	return u32_get_bits(val, CHSTATE_FMASK);
+}
+
+/* Return the hardware's notion of the current state of an event ring */
+static enum gsi_evt_ring_state
+gsi_evt_ring_state(struct gsi *gsi, u32 evt_ring_id)
+{
+	u32 val = ioread32(gsi->virt + GSI_EV_CH_E_CNTXT_0_OFFSET(evt_ring_id));
+
+	return u32_get_bits(val, EV_CHSTATE_FMASK);
+}
+
+/* Channel control interrupt handler */
+static void gsi_isr_chan_ctrl(struct gsi *gsi)
+{
+	u32 channel_mask;
+
+	channel_mask = ioread32(gsi->virt + GSI_CNTXT_SRC_CH_IRQ_OFFSET);
+	iowrite32(channel_mask, gsi->virt + GSI_CNTXT_SRC_CH_IRQ_CLR_OFFSET);
+
+	while (channel_mask) {
+		u32 channel_id = __ffs(channel_mask);
+		struct gsi_channel *channel;
+
+		channel_mask ^= BIT(channel_id);
+
+		channel = &gsi->channel[channel_id];
+		channel->state = gsi_channel_state(channel);
+
+		complete(&channel->completion);
+	}
+}
+
+static void gsi_isr_evt_ctrl(struct gsi *gsi)
+{
+	u32 evt_mask;
+
+	evt_mask = ioread32(gsi->virt + GSI_CNTXT_SRC_EV_CH_IRQ_OFFSET);
+	iowrite32(evt_mask, gsi->virt + GSI_CNTXT_SRC_EV_CH_IRQ_CLR_OFFSET);
+
+	while (evt_mask) {
+		u32 evt_ring_id = __ffs(evt_mask);
+		struct gsi_evt_ring *evt_ring;
+
+		evt_mask ^= BIT(evt_ring_id);
+
+		evt_ring = &gsi->evt_ring[evt_ring_id];
+		evt_ring->state = gsi_evt_ring_state(gsi, evt_ring_id);
+
+		complete(&evt_ring->completion);
+	}
+}
+
+static void
+gsi_isr_glob_chan_err(struct gsi *gsi, u32 err_ee, u32 channel_id, u32 code)
+{
+	if (code == GSI_OUT_OF_RESOURCES_ERR) {
+		dev_err(gsi->dev, "channel %u out of resources\n", channel_id);
+		complete(&gsi->channel[channel_id].completion);
+		return;
+	}
+
+	/* Report, but otherwise ignore all other error codes */
+	WARN(true, "channel %u global error ee 0x%08x code 0x%08x\n",
+	     channel_id, err_ee, code);
+}
+
+static void
+gsi_isr_glob_evt_err(struct gsi *gsi, u32 err_ee, u32 evt_ring_id, u32 code)
+{
+	if (code == GSI_OUT_OF_RESOURCES_ERR) {
+		struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+		u32 channel_id = gsi_channel_id(evt_ring->channel);
+
+		complete(&evt_ring->completion);
+		dev_err(gsi->dev, "evt_ring for channel %u out of resources\n",
+			channel_id);
+		return;
+	}
+
+	/* Report, but otherwise ignore all other error codes */
+	WARN(true, "event ring 0x%08x global error ee %u code 0x%08x\n",
+	     evt_ring_id, err_ee, code);
+}
+
+static void gsi_isr_glob_err(struct gsi *gsi)
+{
+	enum gsi_err_type type;
+	enum gsi_err_code code;
+	u32 which;
+	u32 val;
+	u32 ee;
+
+	/* Get the logged error, then reinitialize the log */
+	val = ioread32(gsi->virt + GSI_ERROR_LOG_OFFSET);
+	iowrite32(0, gsi->virt + GSI_ERROR_LOG_OFFSET);
+	iowrite32(~0, gsi->virt + GSI_ERROR_LOG_CLR_OFFSET);
+
+	ee = u32_get_bits(val, GSI_LOG_ERR_EE_FMASK);
+	which = u32_get_bits(val, GSI_LOG_ERR_VIRT_IDX_FMASK);
+	type = u32_get_bits(val, GSI_LOG_ERR_TYPE_FMASK);
+	code = u32_get_bits(val, GSI_LOG_ERR_CODE_FMASK);
+
+	if (type == GSI_ERR_TYPE_CHAN)
+		gsi_isr_glob_chan_err(gsi, ee, which, code);
+	else if (type == GSI_ERR_TYPE_EVT)
+		gsi_isr_glob_evt_err(gsi, ee, which, code);
+	else	/* type GSI_ERR_TYPE_GLOB should be fatal */
+		WARN(true, "unexpected global error 0x%08x\n", type);
+}
+
+static void gsi_isr_glob_ee(struct gsi *gsi)
+{
+	u32 val;
+
+	val = ioread32(gsi->virt + GSI_CNTXT_GLOB_IRQ_STTS_OFFSET);
+
+	if (val & ERROR_INT_FMASK)
+		gsi_isr_glob_err(gsi);
+
+	iowrite32(val, gsi->virt + GSI_CNTXT_GLOB_IRQ_CLR_OFFSET);
+
+	val &= ~ERROR_INT_FMASK;
+
+	if (val & GP_INT1_FMASK)
+		dev_err(gsi->dev, "unexpected global INT1\n");
+	val &= ~GP_INT1_FMASK;
+
+	WARN(val, "unexpected global interrupt 0x%08x\n", val);
+}
+
+/* Returns true if the interrupt state (enabled or not) changed */
+static bool gsi_channel_intr(struct gsi_channel *channel, bool enable)
+{
+	u32 evt_ring_id = channel->evt_ring_id;
+	struct gsi *gsi = channel->gsi;
+	u32 mask = BIT(evt_ring_id);
+	unsigned long flags;
+	bool different;
+	u32 enabled;
+
+	spin_lock_irqsave(&gsi->spinlock, flags);
+
+	enabled = gsi->event_enable_bitmap & mask;
+	different = enable == !enabled;
+
+	if (different) {
+		if (enabled)
+			gsi_irq_disable_event(channel->gsi, evt_ring_id);
+		else
+			gsi_irq_enable_event(channel->gsi, evt_ring_id);
+	}
+
+	spin_unlock_irqrestore(&gsi->spinlock, flags);
+
+	return different;
+}
+
+/* This function is almost always called in interrupt context,
+ * meaning the interrupt is enabled.  The request to disable
+ * the interrupt here will therefore "succeed", that is, it
+ * will disable an enabled interrupt.
+ *
+ * However, this function is also called when cancelling pending
+ * transactions, and when that occurs it's possible interrupts are
+ * already disabled.  For that reason we only schedule NAPI if we
+ * actually caused interrupts to become disabled.
+ */
+void gsi_event_handle(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	struct gsi_channel *channel = evt_ring->channel;
+
+	if (gsi_channel_intr(channel, false))
+		napi_schedule(&channel->napi);
+}
+
+static void gsi_isr_ieob(struct gsi *gsi)
+{
+	u32 evt_mask;
+
+	evt_mask = ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_OFFSET);
+	evt_mask &= ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
+	iowrite32(evt_mask, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET);
+
+	while (evt_mask) {
+		u32 evt_ring_id = __ffs(evt_mask);
+
+		evt_mask ^= BIT(evt_ring_id);
+
+		gsi_event_handle(gsi, evt_ring_id);
+	}
+}
+
+static void gsi_isr_inter_ee_chan_ctrl(struct gsi *gsi)
+{
+	u32 channel_mask;
+
+	channel_mask = ioread32(gsi->virt + GSI_INTER_EE_SRC_CH_IRQ_OFFSET);
+	iowrite32(channel_mask, gsi->virt + GSI_INTER_EE_SRC_CH_IRQ_CLR_OFFSET);
+
+	while (channel_mask) {
+		u32 channel_id = __ffs(channel_mask);
+
+		/* not currently expected */
+		dev_err(gsi->dev, "ch %u inter-EE interrupt\n", channel_id);
+		channel_mask ^= BIT(channel_id);
+	}
+}
+
+static void gsi_isr_inter_ee_evt_ctrl(struct gsi *gsi)
+{
+	u32 evt_mask;
+
+	evt_mask = ioread32(gsi->virt + GSI_INTER_EE_SRC_EV_CH_IRQ_OFFSET);
+	iowrite32(evt_mask, gsi->virt + GSI_INTER_EE_SRC_EV_CH_IRQ_CLR_OFFSET);
+
+	while (evt_mask) {
+		u32 evt_ring_id = __ffs(evt_mask);
+
+		/* not currently expected */
+		dev_err(gsi->dev, "evt %u inter-EE interrupt\n", evt_ring_id);
+		evt_mask ^= BIT(evt_ring_id);
+	}
+}
+
+static void gsi_isr_general(struct gsi *gsi)
+{
+	u32 val;
+
+	val = ioread32(gsi->virt + GSI_CNTXT_GSI_IRQ_STTS_OFFSET);
+	iowrite32(val, gsi->virt + GSI_CNTXT_GSI_IRQ_CLR_OFFSET);
+
+	if (val & CLR_BREAK_POINT_FMASK)
+		dev_err(gsi->dev, "breakpoint!\n");
+	val &= ~CLR_BREAK_POINT_FMASK;
+
+	WARN(val, "unexpected general interrupt 0x%08x\n", val);
+}
+
+/**
+ * gsi_isr() - Top level GSI interrupt service routine
+ * @irq:	Interrupt number (ignored)
+ * @dev_id:	Device id pointer supplied to request_irq()
+ *
+ * This is the main handler function registered for the GSI IRQ.  The
+ * GSI pointer is supplied as the "device id" value when the handler
+ * is registered, and is provided here.  Each type of interrupt has a
+ * separate handler function that is called from here.
+ */
+static irqreturn_t gsi_isr(int irq, void *dev_id)
+{
+	struct gsi *gsi = dev_id;
+	u32 intr_mask;
+	u32 cnt = 0;
+
+	while ((intr_mask = ioread32(gsi->virt + GSI_CNTXT_TYPE_IRQ_OFFSET))) {
+		/* intr_mask contains bitmask of pending GSI interrupts */
+		do {
+			u32 gsi_intr = BIT(__ffs(intr_mask));
+
+			intr_mask ^= gsi_intr;
+
+			switch (gsi_intr) {
+			case CH_CTRL_FMASK:
+				gsi_isr_chan_ctrl(gsi);
+				break;
+			case EV_CTRL_FMASK:
+				gsi_isr_evt_ctrl(gsi);
+				break;
+			case GLOB_EE_FMASK:
+				gsi_isr_glob_ee(gsi);
+				break;
+			case IEOB_FMASK:
+				gsi_isr_ieob(gsi);
+				break;
+			case INTER_EE_CH_CTRL_FMASK:
+				gsi_isr_inter_ee_chan_ctrl(gsi);
+				break;
+			case INTER_EE_EV_CTRL_FMASK:
+				gsi_isr_inter_ee_evt_ctrl(gsi);
+				break;
+			case GENERAL_FMASK:
+				gsi_isr_general(gsi);
+				break;
+			default:
+				WARN(true, "%s: unrecognized type 0x%08x\n",
+				     __func__, gsi_intr);
+				break;
+			}
+		} while (intr_mask);
+
+		if (WARN(++cnt > GSI_ISR_MAX_ITER, "interrupt flood\n"))
+			break;
+	}
+
+	return IRQ_HANDLED;
+}
+
+/* Return the virtual address associated with a 32-bit ring offset */
+void *gsi_ring_virt(struct gsi_ring *ring, u32 offset)
+{
+	return ring->virt + (offset - ring->base);
+}
+
+/* Return the ring index of a 32-bit ring offset */
+u32 ring_index(struct gsi_ring *ring, u32 offset)
+{
+	/* Code assumes channel and event ring elements are the same size */
+	BUILD_BUG_ON(sizeof(struct gsi_tre) !=
+		     sizeof(struct gsi_xfer_compl_evt));
+
+	return (offset - ring->base) / sizeof(struct gsi_tre);
+}
+
+/* Return the 32-bit ring offset that precedes the one at the given offset */
+static u32 ring_prev(struct gsi_ring *ring, u32 offset)
+{
+	if (offset == ring->base)
+		offset = ring->end;
+
+	return offset - sizeof(struct gsi_tre);
+}
+
+/* Advance a ring's local write pointer by the given number of slots */
+void gsi_ring_wp_local_add(struct gsi_ring *ring, u32 val)
+{
+	ring->wp_local += val * sizeof(struct gsi_tre);
+	if (ring->wp_local >= ring->end)
+		ring->wp_local -= ring->size;
+}
+
+/* Advance a ring's local read pointer by the given number of slots */
+static void gsi_ring_rp_local_add(struct gsi_ring *ring, u32 val)
+{
+	ring->rp_local += val * sizeof(struct gsi_tre);
+	if (ring->rp_local >= ring->end)
+		ring->rp_local -= ring->size;
+}
+
+static void __gsi_evt_tx_update(struct gsi_evt_ring *evt_ring, u32 rp)
+{
+	struct gsi_channel *channel = evt_ring->channel;
+	struct gsi_ring *ring = &evt_ring->ring;
+	struct gsi_xfer_compl_evt *evt;
+	struct gsi_trans *first_trans;
+	struct gsi_trans *last_trans;
+	u32 trans_count;
+	u32 byte_count;
+	u32 tre_offset;
+	u32 tre_index;
+
+	/* Get the first (oldest) un-processed event */
+	evt = gsi_ring_virt(ring, ring->rp_local);
+	/* Get the TRE offset from that, and its associated transaction */
+	tre_offset = le64_to_cpu(evt->xfer_ptr) & GENMASK(31, 0);
+	tre_index = ring_index(&channel->tre_ring, tre_offset);
+	first_trans = gsi_channel_trans_mapped(channel, tre_index);
+
+	/* Get the last (newest) un-processed event */
+	evt = gsi_ring_virt(ring, ring_prev(ring, rp));
+	/* Get the TRE offset from that, and its associated transaction */
+	tre_offset = le64_to_cpu(evt->xfer_ptr) & GENMASK(31, 0);
+	tre_index = ring_index(&channel->tre_ring, tre_offset);
+	last_trans = gsi_channel_trans_mapped(channel, tre_index);
+
+	/* Report the total number of transactions and bytes that have
+	 * been transferred, *including* the last one.
+	 */
+	trans_count = last_trans->trans_count - first_trans->trans_count + 1;
+	byte_count = last_trans->byte_count - first_trans->byte_count;
+	byte_count += last_trans->len;
+
+	ipa_gsi_channel_tx_completed(channel->gsi, gsi_channel_id(channel),
+				     trans_count, byte_count);
+}
+
+/**
+ * __gsi_evt_rx_update() - Record lengths of received data
+ * @evt_ring:	Event ring associated with channel that received packets
+ * @rp:		Event ring offset just past the last completed event
+ *
+ * Events for RX channels contain the actual number of bytes received into
+ * the buffer.  Every event has a transaction associated with it, and here
+ * we update each transaction's result code to record the received length.
+ *
+ * This function is called whenever we learn that the GSI hardware has filled
+ * new events since the last time we checked.  We need to update transaction
+ * lengths for events starting at the ring's rp_local up to (but not
+ * including) the ring offset supplied as an argument.
+ *
+ * Events are sequential within the event ring, and transactions are
+ * sequential within the transaction pool.  We compute the first event's
+ * transaction pointer; the next event's transaction will just be the next
+ * one in the transaction pool.
+ *
+ * Note that @rp always points to an element *within* the event ring.
+ */
+static void __gsi_evt_rx_update(struct gsi_evt_ring *evt_ring, u32 rp)
+{
+	struct gsi_channel *channel = evt_ring->channel;
+	struct gsi_ring *ring = &evt_ring->ring;
+	struct gsi_xfer_compl_evt *evt_last;
+	struct gsi_xfer_compl_evt *evt_end;
+	struct gsi_trans_info *trans_info;
+	struct gsi_xfer_compl_evt *evt;
+	struct gsi_trans *trans_end;
+	struct gsi_trans *trans;
+	u32 byte_count = 0;
+	u32 tre_offset;
+	u32 tre_index;
+
+	/* Start with the first un-processed event */
+	evt = gsi_ring_virt(ring, ring->rp_local);
+	evt_last = gsi_ring_virt(ring, rp);
+	evt_end = gsi_ring_virt(ring, ring->end);
+
+	/* Event xfer_ptr records the TRE it's associated with */
+	tre_offset = le64_to_cpu(evt->xfer_ptr) & GENMASK(31, 0);
+	tre_index = ring_index(&channel->tre_ring, tre_offset);
+	/* Get the transaction mapped to the first unprocessed event */
+	trans = gsi_channel_trans_mapped(channel, tre_index);
+	trans_info = &channel->trans_info;
+	trans_end = &trans_info->pool[trans_info->pool_count];
+
+	do {
+		trans->len = __le16_to_cpu(evt->len);
+		trans->result = __le16_to_cpu(evt->len);
+		byte_count += trans->result;
+		if (++evt == evt_end)
+			evt = gsi_ring_virt(&evt_ring->ring, ring->base);
+		if (++trans == trans_end)
+			trans = &trans_info->pool[0];
+	} while (evt != evt_last);
+
+	/* We record RX bytes when they are received */
+	channel->byte_count += byte_count;
+	channel->trans_count++;
+}
+
+static void
+gsi_evt_ring_doorbell(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	u32 val;
+
+	/* We only need to write the lower 32 bits */
+	val = evt_ring->ring.wp_local;
+	iowrite32(val, gsi->virt + GSI_EV_CH_E_DOORBELL_0_OFFSET(evt_ring_id));
+}
+
+static u32 gsi_channel_max(struct gsi *gsi)
+{
+	u32 val = ioread32(gsi->virt + GSI_GSI_HW_PARAM_2_OFFSET);
+
+	return u32_get_bits(val, NUM_CH_PER_EE_FMASK);
+}
+
+static u32 gsi_evt_ring_max(struct gsi *gsi)
+{
+	u32 val = ioread32(gsi->virt + GSI_GSI_HW_PARAM_2_OFFSET);
+
+	return u32_get_bits(val, NUM_EV_PER_EE_FMASK);
+}
+
+/* Issue a GSI command by writing a value to a register, then wait
+ * for completion to be signaled.  Nothing is returned; a timeout is
+ * reported with a warning.
+ */
+static void
+gsi_command(struct gsi *gsi, u32 reg, u32 val, struct completion *completion)
+{
+	unsigned long ret;
+
+	reinit_completion(completion);
+
+	iowrite32(val, gsi->virt + reg);
+	ret = wait_for_completion_timeout(completion, GSI_CMD_TIMEOUT * HZ);
+	WARN(!ret, "%s timeout reg 0x%08x val 0x%08x\n", __func__, reg, val);
+}
+
+/* Issue an event ring command and wait for it to complete */
+static void evt_ring_command(struct gsi *gsi, u32 evt_ring_id,
+			     enum gsi_evt_ch_cmd_opcode op)
+{
+	struct completion *completion = &gsi->evt_ring[evt_ring_id].completion;
+	u32 val = 0;
+
+	val |= u32_encode_bits(evt_ring_id, EV_CHID_FMASK);
+	val |= u32_encode_bits(op, EV_OPCODE_FMASK);
+
+	gsi_command(gsi, GSI_EV_CH_CMD_OFFSET, val, completion);
+}
+
+/* Issue a channel command and wait for it to complete */
+static void
+gsi_channel_command(struct gsi_channel *channel, enum gsi_ch_cmd_opcode op)
+{
+	u32 channel_id = gsi_channel_id(channel);
+	u32 val = 0;
+
+	val |= u32_encode_bits(channel_id, CH_CHID_FMASK);
+	val |= u32_encode_bits(op, CH_OPCODE_FMASK);
+
+	gsi_command(channel->gsi, GSI_CH_CMD_OFFSET, val, &channel->completion);
+}
+
+static int gsi_ring_alloc(struct gsi *gsi, struct gsi_ring *ring, u32 count)
+{
+	size_t size = roundup_pow_of_two(count * sizeof(struct gsi_tre));
+	dma_addr_t addr;
+
+	/* Hardware requires a power-of-2 ring size (and alignment) */
+	ring->virt = dma_alloc_coherent(gsi->dev, size, &addr, GFP_KERNEL);
+	if (!ring->virt)
+		return -ENOMEM;
+	ring->addr = addr;
+	ring->base = addr & GENMASK(31, 0);
+	ring->size = size;
+	ring->end = ring->base + size;
+	spin_lock_init(&ring->spinlock);
+
+	return 0;
+}
+
+static void gsi_ring_free(struct gsi *gsi, struct gsi_ring *ring)
+{
+	dma_free_coherent(gsi->dev, ring->size, ring->virt, ring->addr);
+	memset(ring, 0, sizeof(*ring));
+}
+
+static void gsi_evt_ring_prime(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	struct gsi_ring *ring = &evt_ring->ring;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ring->spinlock, flags);
+
+	memset(ring->virt, 0, ring->size);
+	/* Point the write pointer at the last element */
+	ring->wp_local = ring_prev(ring, ring->base);
+	gsi_evt_ring_doorbell(gsi, evt_ring_id);
+
+	spin_unlock_irqrestore(&ring->spinlock, flags);
+}
+
+static void gsi_evt_ring_program(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	u32 val = 0;
+
+	BUILD_BUG_ON(sizeof(struct gsi_xfer_compl_evt) >
+		     field_max(EV_ELEMENT_SIZE_FMASK));
+
+	val |= u32_encode_bits(GSI_EVT_CHTYPE_GPI_EV, EV_CHTYPE_FMASK);
+	val |= EV_INTYPE_FMASK;
+	val |= u32_encode_bits(sizeof(struct gsi_xfer_compl_evt),
+			       EV_ELEMENT_SIZE_FMASK);
+	iowrite32(val, gsi->virt + GSI_EV_CH_E_CNTXT_0_OFFSET(evt_ring_id));
+
+	val = u32_encode_bits(evt_ring->ring.size, EV_R_LENGTH_FMASK);
+	iowrite32(val, gsi->virt + GSI_EV_CH_E_CNTXT_1_OFFSET(evt_ring_id));
+
+	/* The context 2 and 3 registers store the low-order and
+	 * high-order 32 bits of the address of the event ring,
+	 * respectively.
+	 */
+	val = evt_ring->ring.base;
+	iowrite32(val, gsi->virt + GSI_EV_CH_E_CNTXT_2_OFFSET(evt_ring_id));
+
+	val = evt_ring->ring.addr >> 32;
+	iowrite32(val, gsi->virt + GSI_EV_CH_E_CNTXT_3_OFFSET(evt_ring_id));
+
+	/* Enable interrupt moderation by setting the moderation delay */
+	val = u32_encode_bits(IPA_GSI_EVT_RING_INT_MODT, MODT_FMASK);
+	val |= u32_encode_bits(1, MODC_FMASK);	/* comes from channel */
+	iowrite32(val, gsi->virt + GSI_EV_CH_E_CNTXT_8_OFFSET(evt_ring_id));
+
+	/* No MSI write data, and both halves of the MSI address are 0 */
+	iowrite32(0, gsi->virt + GSI_EV_CH_E_CNTXT_9_OFFSET(evt_ring_id));
+	iowrite32(0, gsi->virt + GSI_EV_CH_E_CNTXT_10_OFFSET(evt_ring_id));
+	iowrite32(0, gsi->virt + GSI_EV_CH_E_CNTXT_11_OFFSET(evt_ring_id));
+
+	/* We don't need to get event read pointer updates */
+	iowrite32(0, gsi->virt + GSI_EV_CH_E_CNTXT_12_OFFSET(evt_ring_id));
+	iowrite32(0, gsi->virt + GSI_EV_CH_E_CNTXT_13_OFFSET(evt_ring_id));
+}
+
+static void gsi_ring_init(struct gsi_ring *ring)
+{
+	ring->wp = ring->base;
+	ring->wp_local = ring->base;
+	ring->rp_local = ring->base;
+}
+
+static void gsi_evt_ring_scratch_zero(struct gsi *gsi, u32 evt_ring_id)
+{
+	iowrite32(0, gsi->virt + GSI_EV_CH_E_SCRATCH_0_OFFSET(evt_ring_id));
+	iowrite32(0, gsi->virt + GSI_EV_CH_E_SCRATCH_1_OFFSET(evt_ring_id));
+}
+
+static int gsi_evt_ring_alloc_hw(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	unsigned long flags;
+	u32 val;
+
+	evt_ring_command(gsi, evt_ring_id, GSI_EVT_ALLOCATE);
+
+	if (evt_ring->state != GSI_EVT_RING_STATE_ALLOCATED) {
+		dev_err(gsi->dev, "evt_ring_id %u allocation bad state %u\n",
+			evt_ring_id, evt_ring->state);
+		return -EIO;
+	}
+
+	gsi_evt_ring_program(gsi, evt_ring_id);
+	gsi_ring_init(&evt_ring->ring);
+	gsi_evt_ring_prime(gsi, evt_ring_id);
+
+	spin_lock_irqsave(&gsi->spinlock, flags);
+
+	/* Enable the event interrupt (clear it first in case pending) */
+	val = BIT(evt_ring_id);
+	iowrite32(val, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET);
+	gsi_irq_enable_event(gsi, evt_ring_id);
+
+	spin_unlock_irqrestore(&gsi->spinlock, flags);
+
+	return 0;
+}
+
+static void gsi_evt_ring_free_hw(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	unsigned long flags;
+
+	spin_lock_irqsave(&gsi->spinlock, flags);
+
+	/* Disable the event interrupt */
+	gsi_irq_disable_event(gsi, evt_ring_id);
+
+	spin_unlock_irqrestore(&gsi->spinlock, flags);
+
+	evt_ring_command(gsi, evt_ring_id, GSI_EVT_RESET);
+
+	gsi_evt_ring_program(gsi, evt_ring_id);
+	gsi_ring_init(&evt_ring->ring);
+	gsi_evt_ring_scratch_zero(gsi, evt_ring_id);
+	gsi_evt_ring_prime(gsi, evt_ring_id);
+
+	evt_ring_command(gsi, evt_ring_id, GSI_EVT_DE_ALLOC);
+}
+
+static int gsi_evt_ring_id_alloc(struct gsi *gsi)
+{
+	u32 evt_ring_id;
+
+	if (gsi->event_bitmap == ~0U)
+		return -ENOSPC;
+
+	evt_ring_id = ffz(gsi->event_bitmap);
+	gsi->event_bitmap |= BIT(evt_ring_id);
+
+	return (int)evt_ring_id;
+}
+
+static void gsi_evt_ring_id_free(struct gsi *gsi, u32 evt_ring_id)
+{
+	gsi->event_bitmap &= ~BIT(evt_ring_id);
+}
+
+void gsi_channel_doorbell(struct gsi_channel *channel)
+{
+	u32 channel_id = gsi_channel_id(channel);
+	struct gsi *gsi = channel->gsi;
+	u32 val;
+
+	channel->tre_ring.wp = channel->tre_ring.wp_local;
+
+	/* We only need to write the lower 32 bits */
+	val = channel->tre_ring.wp_local;
+	iowrite32(val, gsi->virt + GSI_CH_C_DOORBELL_0_OFFSET(channel_id));
+}
+
+static void __gsi_evt_ring_update(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	u32 offset = GSI_EV_CH_E_CNTXT_4_OFFSET(evt_ring_id);
+	struct gsi_channel *channel = evt_ring->channel;
+	struct gsi_ring *tre_ring = &channel->tre_ring;
+	struct gsi_ring *ring = &evt_ring->ring;
+	u32 rp = ioread32(gsi->virt + offset);
+	struct gsi_xfer_compl_evt *evt;
+	struct gsi_trans *trans;
+	u32 tre_offset;
+	u32 tre_index;
+	u32 rp_last;
+
+	/* If we have nothing new to process we're done */
+	if (ring->rp_local == rp)
+		return;
+
+	/* Extract information from the newly-completed events.  For TX
+	 * channels, report the number of transferred bytes they represent.
+	 * For RX channels, update each transaction with the number of bytes
+	 * actually received.
+	 */
+	if (channel->toward_ipa)
+		__gsi_evt_tx_update(evt_ring, rp);
+	else
+		__gsi_evt_rx_update(evt_ring, rp);
+
+	/* Get the TRE pointer from the latest completion event, and get
+	 * the transaction associated with that.  Move all new transactions
+	 * up to and including that one to the completed list.
+	 */
+	rp_last = ring_prev(ring, rp);
+	evt = gsi_ring_virt(ring, rp_last);
+	tre_offset = le64_to_cpu(evt->xfer_ptr) & GENMASK(31, 0);
+	tre_index = ring_index(tre_ring, tre_offset);
+	trans = gsi_channel_trans_mapped(channel, tre_index);
+	gsi_trans_move_complete(trans);
+
+	/* We need nothing more from these TREs, so consume them */
+	tre_ring->rp_local = tre_offset;
+	gsi_ring_rp_local_add(tre_ring, 1);
+
+	/* Record that we're caught up on these events, and give the
+	 * completed ones back to the hardware for reuse.
+	 */
+	ring->rp_local = rp;
+	ring->wp_local = rp_last;
+	gsi_evt_ring_doorbell(channel->gsi, channel->evt_ring_id);
+}
+
+/* Consult hardware, move any newly completed transactions to completed list */
+static void gsi_channel_update(struct gsi_channel *channel)
+{
+	struct gsi_evt_ring *evt_ring;
+	unsigned long flags;
+
+	evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
+
+	spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
+
+	__gsi_evt_ring_update(channel->gsi, channel->evt_ring_id);
+
+	spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
+}
+
+/**
+ * gsi_channel_poll_one() - Return a single completed transaction on a channel
+ * @channel:	Channel to be polled
+ *
+ * Return:	Transaction pointer, or null if none are available
+ *
+ * This function returns the first entry on a channel's completed
+ * transaction list.  If that list is empty, the hardware is consulted
+ * to determine whether any new transactions have completed.  If so,
+ * they're moved to the completed list and the new first entry is
+ * returned.  If there are no more completed transactions, a null
+ * pointer is returned.
+ */
+static struct gsi_trans *gsi_channel_poll_one(struct gsi_channel *channel)
+{
+	struct gsi_trans *trans;
+
+	/* Get the first transaction from the completed list */
+	trans = gsi_channel_trans_complete(channel);
+	if (!trans) {
+		/* List is empty; see if there's more to do */
+		gsi_channel_update(channel);
+		trans = gsi_channel_trans_complete(channel);
+	}
+
+	if (trans)
+		gsi_trans_move_polled(trans);
+
+	return trans;
+}
+
+/**
+ * gsi_channel_poll() - NAPI poll function for a channel
+ * @napi:	NAPI structure for the channel
+ * @budget:	Budget supplied by NAPI core
+ *
+ * Return:	Number of items polled (<= budget)
+ *
+ * Single transactions completed by hardware are polled until either
+ * the budget is exhausted, or there are no more.  Each transaction
+ * polled is passed to gsi_trans_complete(), to perform remaining
+ * completion processing and retire/free the transaction.
+ */
+static int gsi_channel_poll(struct napi_struct *napi, int budget)
+{
+	struct gsi_channel *channel;
+	int count = 0;
+
+	channel = container_of(napi, struct gsi_channel, napi);
+	while (count < budget) {
+		struct gsi_trans *trans;
+
+		trans = gsi_channel_poll_one(channel);
+		if (!trans)
+			break;
+		gsi_trans_complete(trans);
+		count++;
+	}
+
+	if (count < budget) {
+		napi_complete(&channel->napi);
+		(void)gsi_channel_intr(channel, true);
+	}
+
+	return count;
+}
+
+/* The event bitmap represents which event ids are available for
+ * allocation.  Set bits are not available, clear bits can be used.
+ * This function initializes the map so all events supported by the
+ * hardware are available, then precludes any reserved events from
+ * being allocated.
+ */
+static u32 gsi_event_bitmap_init(u32 evt_ring_max)
+{
+	u32 event_bitmap = GENMASK(BITS_PER_LONG - 1, evt_ring_max);
+
+	return event_bitmap | GENMASK(GSI_MHI_ER_END, GSI_MHI_ER_START);
+}
+
+/* Setup function for event rings */
+static int gsi_evt_ring_setup(struct gsi *gsi)
+{
+	u32 evt_ring_max;
+	u32 evt_ring_id;
+
+	evt_ring_max = gsi_evt_ring_max(gsi);
+	dev_dbg(gsi->dev, "evt_ring_max %u\n", evt_ring_max);
+	if (evt_ring_max != GSI_EVT_RING_MAX)
+		return -EIO;
+
+	for (evt_ring_id = 0; evt_ring_id < GSI_EVT_RING_MAX; evt_ring_id++) {
+		struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+
+		evt_ring->state = gsi_evt_ring_state(gsi, evt_ring_id);
+		if (evt_ring->state != GSI_EVT_RING_STATE_NOT_ALLOCATED)
+			return -EIO;
+	}
+
+	/* Enable all event interrupts */
+	gsi_irq_enable_all(gsi);
+
+	return 0;
+}
+
+/* Inverse of gsi_evt_ring_setup() */
+static void gsi_evt_ring_teardown(struct gsi *gsi)
+{
+	gsi_irq_disable_all(gsi);
+}
+
+static void gsi_channel_scratch_write(struct gsi_channel *channel)
+{
+	u32 channel_id = gsi_channel_id(channel);
+	struct gsi_gpi_channel_scratch *gpi;
+	union gsi_channel_scratch scr = { };
+	struct gsi *gsi = channel->gsi;
+	u32 val;
+
+	/* See comments above definition of gsi_gpi_channel_scratch */
+	gpi = &scr.gpi;
+	gpi->max_outstanding_tre = channel->data->tlv_count *
+					sizeof(struct gsi_tre);
+	gpi->outstanding_threshold = 2 * sizeof(struct gsi_tre);
+
+	val = scr.data.word1;
+	iowrite32(val, gsi->virt + GSI_CH_C_SCRATCH_0_OFFSET(channel_id));
+
+	val = scr.data.word2;
+	iowrite32(val, gsi->virt + GSI_CH_C_SCRATCH_1_OFFSET(channel_id));
+
+	val = scr.data.word3;
+	iowrite32(val, gsi->virt + GSI_CH_C_SCRATCH_2_OFFSET(channel_id));
+
+	/* We must preserve the lower 16 bits of the last scratch
+	 * register.  The next sequence assumes those bits remain
+	 * unchanged between the read and the write.
+	 */
+	val = ioread32(gsi->virt + GSI_CH_C_SCRATCH_3_OFFSET(channel_id));
+	val = (scr.data.word4 & GENMASK(31, 16)) | (val & GENMASK(15, 0));
+	iowrite32(val, gsi->virt + GSI_CH_C_SCRATCH_3_OFFSET(channel_id));
+}
+
+static void gsi_channel_program(struct gsi_channel *channel, bool doorbell)
+{
+	u32 channel_id = gsi_channel_id(channel);
+	struct gsi *gsi = channel->gsi;
+	u32 wrr_weight = 0;
+	u32 val = 0;
+
+	BUILD_BUG_ON(sizeof(struct gsi_tre) > field_max(ELEMENT_SIZE_FMASK));
+
+	val |= u32_encode_bits(GSI_CHANNEL_PROTOCOL_GPI, CHTYPE_PROTOCOL_FMASK);
+	if (channel->toward_ipa)
+		val |= CHTYPE_DIR_FMASK;
+	val |= u32_encode_bits(channel->evt_ring_id, ERINDEX_FMASK);
+	val |= u32_encode_bits(sizeof(struct gsi_tre), ELEMENT_SIZE_FMASK);
+	iowrite32(val, gsi->virt + GSI_CH_C_CNTXT_0_OFFSET(channel_id));
+
+	val = u32_encode_bits(channel->tre_ring.size, R_LENGTH_FMASK);
+	iowrite32(val, gsi->virt + GSI_CH_C_CNTXT_1_OFFSET(channel_id));
+
+	/* The context 2 and 3 registers store the low-order and
+	 * high-order 32 bits of the address of the channel ring,
+	 * respectively.
+	 */
+	val = channel->tre_ring.addr & GENMASK(31, 0);
+	iowrite32(val, gsi->virt + GSI_CH_C_CNTXT_2_OFFSET(channel_id));
+
+	val = channel->tre_ring.addr >> 32;
+	iowrite32(val, gsi->virt + GSI_CH_C_CNTXT_3_OFFSET(channel_id));
+
+	if (channel->data->wrr_priority)
+		wrr_weight = field_max(WRR_WEIGHT_FMASK);
+	val = u32_encode_bits(wrr_weight, WRR_WEIGHT_FMASK);
+
+	/* Max prefetch is 1 segment (do not set MAX_PREFETCH_FMASK) */
+	if (doorbell)
+		val |= USE_DB_ENG_FMASK;
+	iowrite32(val, gsi->virt + GSI_CH_C_QOS_OFFSET(channel_id));
+}
+
+static void
+__gsi_channel_config(struct gsi_channel *channel, bool doorbell_enable)
+{
+	gsi_channel_program(channel, doorbell_enable);
+	gsi_ring_init(&channel->tre_ring);
+	gsi_channel_scratch_write(channel);
+}
+
+void gsi_channel_config(struct gsi *gsi, u32 channel_id, bool doorbell_enable)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	mutex_lock(&gsi->mutex);
+
+	__gsi_channel_config(channel, doorbell_enable);
+
+	mutex_unlock(&gsi->mutex);
+}
+
+/* Setup function for a single channel */
+static int gsi_channel_setup_one(struct gsi_channel *channel)
+{
+	struct gsi *gsi = channel->gsi;
+	int ret;
+
+	if (!gsi)
+		return 0;	/* Ignore uninitialized channels */
+
+	channel->state = gsi_channel_state(channel);
+	if (channel->state != GSI_CHANNEL_STATE_NOT_ALLOCATED)
+		return -EIO;
+
+	mutex_lock(&gsi->mutex);
+
+	ret = gsi_evt_ring_alloc_hw(gsi, channel->evt_ring_id);
+	if (ret) {
+		mutex_unlock(&gsi->mutex);
+
+		return ret;
+	}
+
+	gsi_channel_command(channel, GSI_CH_ALLOCATE);
+	ret = channel->state == GSI_CHANNEL_STATE_ALLOCATED ? 0 : -EIO;
+	if (ret) {
+		gsi_evt_ring_free_hw(gsi, channel->evt_ring_id);
+		mutex_unlock(&gsi->mutex);
+
+		return ret;
+	}
+
+	__gsi_channel_config(channel, true);
+
+	mutex_unlock(&gsi->mutex);
+
+	gsi->channel_stats.allocate++;
+
+	if (channel->toward_ipa)
+		netif_tx_napi_add(&gsi->dummy_dev, &channel->napi,
+				  gsi_channel_poll, NAPI_POLL_WEIGHT);
+	else
+		netif_napi_add(&gsi->dummy_dev, &channel->napi,
+			       gsi_channel_poll, NAPI_POLL_WEIGHT);
+
+	return 0;
+}
+
+/* Inverse of gsi_channel_setup_one() */
+static void gsi_channel_teardown_one(struct gsi_channel *channel)
+{
+	struct gsi *gsi = channel->gsi;
+
+	if (!gsi)
+		return;
+
+	netif_napi_del(&channel->napi);
+
+	mutex_lock(&gsi->mutex);
+
+	gsi_channel_command(channel, GSI_CH_DE_ALLOC);
+
+	gsi->channel_stats.free++;
+
+	gsi_evt_ring_free_hw(gsi, channel->evt_ring_id);
+
+	mutex_unlock(&gsi->mutex);
+
+	gsi_channel_trans_exit(channel);
+}
+
+/* Setup function for channels */
+static int gsi_channel_setup(struct gsi *gsi)
+{
+	u32 channel_max;
+	u32 channel_id;
+	int ret;
+
+	channel_max = gsi_channel_max(gsi);
+	dev_dbg(gsi->dev, "channel_max %u\n", channel_max);
+	if (channel_max != GSI_CHANNEL_MAX)
+		return -EIO;
+
+	ret = gsi_evt_ring_setup(gsi);
+	if (ret)
+		return ret;
+
+	for (channel_id = 0; channel_id < GSI_CHANNEL_MAX; channel_id++) {
+		ret = gsi_channel_setup_one(&gsi->channel[channel_id]);
+		if (ret)
+			goto err_unwind;
+	}
+
+	return 0;
+
+err_unwind:
+	while (channel_id--)
+		gsi_channel_teardown_one(&gsi->channel[channel_id]);
+	gsi_evt_ring_teardown(gsi);
+
+	return ret;
+}
+
+/* Inverse of gsi_channel_setup() */
+static void gsi_channel_teardown(struct gsi *gsi)
+{
+	u32 channel_id;
+
+	for (channel_id = 0; channel_id < GSI_CHANNEL_MAX; channel_id++) {
+		struct gsi_channel *channel = &gsi->channel[channel_id];
+
+		gsi_channel_teardown_one(channel);
+	}
+
+	gsi_evt_ring_teardown(gsi);
+}
+
+/* Setup function for GSI.  GSI firmware must be loaded and initialized */
+int gsi_setup(struct gsi *gsi)
+{
+	u32 val;
+
+	/* Here is where we first touch the GSI hardware */
+	val = ioread32(gsi->virt + GSI_GSI_STATUS_OFFSET);
+	if (!(val & ENABLED_FMASK)) {
+		dev_err(gsi->dev, "GSI has not been enabled\n");
+		return -EIO;
+	}
+
+	/* Initialize the error log */
+	iowrite32(0, gsi->virt + GSI_ERROR_LOG_OFFSET);
+
+	/* Writing 1 indicates IRQ interrupts; 0 would be MSI */
+	iowrite32(1, gsi->virt + GSI_CNTXT_INTSET_OFFSET);
+
+	return gsi_channel_setup(gsi);
+}
+
+/* Inverse of gsi_setup() */
+void gsi_teardown(struct gsi *gsi)
+{
+	gsi_channel_teardown(gsi);
+}
+
+/* Initialize a channel's event ring */
+static int gsi_channel_evt_ring_init(struct gsi_channel *channel)
+{
+	struct gsi *gsi = channel->gsi;
+	struct gsi_evt_ring *evt_ring;
+	int ret;
+
+	ret = gsi_evt_ring_id_alloc(gsi);
+	if (ret < 0)
+		return ret;
+	channel->evt_ring_id = ret;
+
+	evt_ring = &gsi->evt_ring[channel->evt_ring_id];
+	evt_ring->channel = channel;
+
+	ret = gsi_ring_alloc(gsi, &evt_ring->ring, channel->data->event_count);
+	if (ret)
+		goto err_free_evt_ring_id;
+
+	return 0;
+
+err_free_evt_ring_id:
+	gsi_evt_ring_id_free(gsi, channel->evt_ring_id);
+
+	return ret;
+}
+
+/* Inverse of gsi_channel_evt_ring_init() */
+static void gsi_channel_evt_ring_exit(struct gsi_channel *channel)
+{
+	struct gsi *gsi = channel->gsi;
+	struct gsi_evt_ring *evt_ring;
+
+	evt_ring = &gsi->evt_ring[channel->evt_ring_id];
+	gsi_ring_free(gsi, &evt_ring->ring);
+
+	gsi_evt_ring_id_free(gsi, channel->evt_ring_id);
+}
+
+/* Init function for event rings */
+static void gsi_evt_ring_init(struct gsi *gsi)
+{
+	u32 evt_ring_id;
+
+	BUILD_BUG_ON(GSI_EVT_RING_MAX >= BITS_PER_LONG);
+
+	gsi->event_bitmap = gsi_event_bitmap_init(GSI_EVT_RING_MAX);
+	gsi->event_enable_bitmap = 0;
+	for (evt_ring_id = 0; evt_ring_id < GSI_EVT_RING_MAX; evt_ring_id++)
+		init_completion(&gsi->evt_ring[evt_ring_id].completion);
+}
+
+/* Inverse of gsi_evt_ring_init() */
+static void gsi_evt_ring_exit(struct gsi *gsi)
+{
+	/* Nothing to do */
+}
+
+/* Init function for a single channel */
+static int
+gsi_channel_init_one(struct gsi *gsi, const struct gsi_ipa_endpoint_data *data)
+{
+	struct gsi_channel *channel;
+	int ret;
+
+	if (data->ee_id != GSI_EE_AP)
+		return 0;	/* Ignore non-AP channels */
+
+	if (data->channel_id >= GSI_CHANNEL_MAX)
+		return -EIO;
+	channel = &gsi->channel[data->channel_id];
+
+	channel->gsi = gsi;
+	channel->toward_ipa = data->toward_ipa;
+	channel->data = &data->channel;
+
+	init_completion(&channel->completion);
+
+	ret = gsi_channel_evt_ring_init(channel);
+	if (ret)
+		return ret;
+
+	ret = gsi_ring_alloc(gsi, &channel->tre_ring, channel->data->tre_count);
+	if (ret)
+		goto err_channel_evt_ring_exit;
+
+	ret = gsi_channel_trans_init(channel);
+	if (ret)
+		goto err_ring_free;
+
+	return 0;
+
+err_ring_free:
+	gsi_ring_free(gsi, &channel->tre_ring);
+err_channel_evt_ring_exit:
+	gsi_channel_evt_ring_exit(channel);
+
+	return ret;
+}
+
+/* Inverse of gsi_channel_init_one() */
+static void gsi_channel_exit_one(struct gsi_channel *channel)
+{
+	gsi_channel_trans_exit(channel);
+	gsi_ring_free(channel->gsi, &channel->tre_ring);
+	gsi_channel_evt_ring_exit(channel);
+}
+
+/* Init function for channels */
+static int gsi_channel_init(struct gsi *gsi, u32 data_count,
+			    const struct gsi_ipa_endpoint_data *data)
+{
+	int ret = 0;
+	u32 i;
+
+	gsi_evt_ring_init(gsi);
+	for (i = 0; i < data_count; i++) {
+		ret = gsi_channel_init_one(gsi, &data[i]);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+/* Inverse of gsi_channel_init() */
+static void gsi_channel_exit(struct gsi *gsi)
+{
+	u32 channel_id;
+
+	for (channel_id = 0; channel_id < GSI_CHANNEL_MAX; channel_id++)
+		gsi_channel_exit_one(&gsi->channel[channel_id]);
+	gsi_evt_ring_exit(gsi);
+}
+
+/* Init function for GSI.  GSI hardware does not need to be "ready" */
+int gsi_init(struct gsi *gsi, struct platform_device *pdev, u32 data_count,
+	     const struct gsi_ipa_endpoint_data *data)
+{
+	struct resource *res;
+	resource_size_t size;
+	unsigned int irq;
+	int ret;
+
+	gsi->dev = &pdev->dev;
+	init_dummy_netdev(&gsi->dummy_dev);
+
+	/* Get GSI memory range and map it */
+	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "gsi");
+	if (!res)
+		return -ENXIO;
+
+	size = resource_size(res);
+	if (res->start > U32_MAX || size > U32_MAX - res->start)
+		return -EINVAL;
+
+	gsi->virt = ioremap_nocache(res->start, size);
+	if (!gsi->virt)
+		return -ENOMEM;
+
+	ret = platform_get_irq_byname(pdev, "gsi");
+	if (ret < 0)
+		goto err_unmap_virt;
+	irq = ret;
+
+	ret = request_irq(irq, gsi_isr, 0, "gsi", gsi);
+	if (ret)
+		goto err_unmap_virt;
+	gsi->irq = irq;
+
+	ret = enable_irq_wake(gsi->irq);
+	if (ret)
+		dev_err(gsi->dev, "error %d enabling gsi wake irq\n", ret);
+	gsi->irq_wake_enabled = ret ? 0 : 1;
+
+	spin_lock_init(&gsi->spinlock);
+	mutex_init(&gsi->mutex);
+
+	ret = gsi_channel_init(gsi, data_count, data);
+	if (ret)
+		goto err_mutex_destroy;
+
+	return 0;
+
+err_mutex_destroy:
+	mutex_destroy(&gsi->mutex);
+	if (gsi->irq_wake_enabled)
+		(void)disable_irq_wake(gsi->irq);
+	free_irq(gsi->irq, gsi);
+err_unmap_virt:
+	iounmap(gsi->virt);
+
+	return ret;
+}
+
+/* Inverse of gsi_init() */
+void gsi_exit(struct gsi *gsi)
+{
+	gsi_channel_exit(gsi);
+
+	mutex_destroy(&gsi->mutex);
+	if (gsi->irq_wake_enabled)
+		(void)disable_irq_wake(gsi->irq);
+	free_irq(gsi->irq, gsi);
+	iounmap(gsi->virt);
+}
+
+/* Returns the maximum number of pending transactions on a channel */
+u32 gsi_channel_trans_max(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	return channel->data->tre_count;
+}
+
+/* Returns the maximum number of TREs in a single transaction for a channel */
+u32 gsi_channel_trans_tre_max(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	return channel->data->tlv_count;
+}
+
+/* Wait for all transaction activity on a channel to complete */
+void gsi_channel_trans_quiesce(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	struct gsi_trans_info *trans_info;
+	struct gsi_trans *trans = NULL;
+	struct gsi_evt_ring *evt_ring;
+	struct list_head *list;
+	unsigned long flags;
+
+	trans_info = &channel->trans_info;
+	evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
+
+	spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
+
+	/* Find the last list to which a transaction was added */
+	if (!list_empty(&trans_info->alloc))
+		list = &trans_info->alloc;
+	else if (!list_empty(&trans_info->pending))
+		list = &trans_info->pending;
+	else if (!list_empty(&trans_info->complete))
+		list = &trans_info->complete;
+	else if (!list_empty(&trans_info->polled))
+		list = &trans_info->polled;
+	else
+		list = NULL;
+
+	if (list) {
+		/* The last entry on this list is the last one allocated.
+		 * Grab a reference so we can wait for it.  This assigns
+		 * the function-scope trans, which is tested after the
+		 * lock is dropped.
+		 */
+		trans = list_last_entry(list, struct gsi_trans, links);
+		refcount_inc(&trans->refcount);
+	}
+
+	spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
+
+	/* If there is one, wait for it to complete */
+	if (trans) {
+		wait_for_completion(&trans->completion);
+		gsi_trans_free(trans);
+	}
+}
+
+/* Make a channel operational */
+int gsi_channel_start(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	if (channel->state != GSI_CHANNEL_STATE_ALLOCATED &&
+	    channel->state != GSI_CHANNEL_STATE_STOP_IN_PROC &&
+	    channel->state != GSI_CHANNEL_STATE_STOPPED) {
+		dev_err(gsi->dev, "channel %u bad state %u\n", channel_id,
+			(u32)channel->state);
+		return -ENOTSUPP;
+	}
+
+	napi_enable(&channel->napi);
+
+	mutex_lock(&gsi->mutex);
+
+	gsi_channel_command(channel, GSI_CH_START);
+
+	mutex_unlock(&gsi->mutex);
+
+	gsi->channel_stats.start++;
+
+	return 0;
+}
+
+/* Stop an operational channel */
+int gsi_channel_stop(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	int ret;
+
+	if (channel->state == GSI_CHANNEL_STATE_STOPPED)
+		return 0;
+
+	if (channel->state != GSI_CHANNEL_STATE_STARTED &&
+	    channel->state != GSI_CHANNEL_STATE_STOP_IN_PROC &&
+	    channel->state != GSI_CHANNEL_STATE_ERROR) {
+		dev_err(gsi->dev, "channel %u bad state %u\n", channel_id,
+			(u32)channel->state);
+		return -ENOTSUPP;
+	}
+
+	gsi_channel_trans_quiesce(gsi, channel_id);
+
+	mutex_lock(&gsi->mutex);
+
+	gsi_channel_command(channel, GSI_CH_STOP);
+
+	mutex_unlock(&gsi->mutex);
+
+	if (channel->state == GSI_CHANNEL_STATE_STOPPED)
+		ret = 0;
+	else if (channel->state == GSI_CHANNEL_STATE_STOP_IN_PROC)
+		ret = -EAGAIN;
+	else
+		ret = -EIO;
+
+	gsi->channel_stats.stop++;
+
+	if (!ret)
+		napi_disable(&channel->napi);
+
+	return ret;
+}
+
+/* Reset a GSI channel */
+int gsi_channel_reset(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	if (channel->state != GSI_CHANNEL_STATE_STOPPED) {
+		dev_err(gsi->dev, "channel %u bad state %u\n", channel_id,
+			(u32)channel->state);
+		return -ENOTSUPP;
+	}
+
+	/* In case the reset follows stop, need to wait 1 msec */
+	usleep_range(USEC_PER_MSEC, 2 * USEC_PER_MSEC);
+
+	mutex_lock(&gsi->mutex);
+
+	gsi_channel_command(channel, GSI_CH_RESET);
+
+	/* workaround: reset RX channels again */
+	if (!channel->toward_ipa) {
+		usleep_range(USEC_PER_MSEC, 2 * USEC_PER_MSEC);
+		gsi_channel_command(channel, GSI_CH_RESET);
+	}
+
+	__gsi_channel_config(channel, true);
+
+	/* Cancel pending transactions before the channel is started again */
+	gsi_channel_trans_cancel_pending(channel);
+
+	mutex_unlock(&gsi->mutex);
+
+	gsi->channel_stats.reset++;
+
+	return 0;
+}
-- 
2.20.1


* [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (7 preceding siblings ...)
  2019-05-12  1:24 ` [PATCH 08/18] soc: qcom: ipa: the generic software interface Alex Elder
@ 2019-05-12  1:24 ` Alex Elder
  2019-05-15  7:34   ` Arnd Bergmann
  2019-05-12  1:25 ` [PATCH 10/18] soc: qcom: ipa: IPA interface to GSI Alex Elder
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:24 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch implements GSI transactions.  A GSI transaction is a
structure that represents a single request (consisting of one or
more TREs) sent to the GSI hardware.  The last TRE in a transaction
includes a flag requesting that the GSI interrupt the AP to notify
that it has completed.

TREs are executed and completed strictly in order.  For this reason,
the completion of a single TRE implies that all previous TREs (in
particular all of those "earlier" in a transaction) have completed.
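
As an illustration of the two paragraphs above, here is a minimal
user-space sketch of how the per-TRE flags could be chosen.  The bit
positions are taken from gsi_trans.c in this patch; the helper names
(example_tre_flags() and example_interrupt_count()) are hypothetical,
and the type field and the BEI flag are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bit positions copied from gsi_trans.c in this patch */
#define TRE_FLAGS_CHAIN	(UINT32_C(1) << 0)	/* more TREs follow */
#define TRE_FLAGS_IEOT	(UINT32_C(1) << 9)	/* interrupt on end of transfer */

/* Hypothetical helper: flags for TRE number i of a tre_count-long transaction */
static uint32_t example_tre_flags(uint32_t i, uint32_t tre_count)
{
	assert(tre_count > 0 && i < tre_count);

	/* Only the last TRE requests an interrupt; the others are chained */
	return i == tre_count - 1 ? TRE_FLAGS_IEOT : TRE_FLAGS_CHAIN;
}

/* Count the completion interrupts a single transaction asks for */
static uint32_t example_interrupt_count(uint32_t tre_count)
{
	uint32_t count = 0;
	uint32_t i;

	for (i = 0; i < tre_count; i++)
		if (example_tre_flags(i, tre_count) & TRE_FLAGS_IEOT)
			count++;

	return count;
}
```

Because TREs complete in order, the single interrupt requested by the
last TRE is enough to tell the AP that the whole transaction is done.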

Whenever there is a need to send a request (a set of TREs) to the
IPA, a GSI transaction is allocated, specifying the number of TREs
that will be required.  Details of the request (e.g. transfer offsets
and length) are represented in a Linux scatterlist array that is
incorporated in the transaction structure.
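
The reservation step can be pictured with the following user-space
sketch.  It is only a model under stated assumptions: the structure and
function names are hypothetical, locking is omitted, and the real code
also hands out a transaction structure and scatterlist entries from
preallocated pools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-channel bookkeeping (no locking) */
struct example_trans_info {
	uint32_t tre_total;	/* TREs in the channel ring */
	uint32_t tre_avail;	/* TREs not reserved by any transaction */
	uint32_t tre_max;	/* most TREs a single transaction may use */
};

/* Reserve tre_count TREs; returns false if the request cannot be met */
static bool example_reserve(struct example_trans_info *info, uint32_t tre_count)
{
	if (tre_count > info->tre_max)
		return false;	/* exceeds the per-transaction limit */
	if (info->tre_avail < tre_count)
		return false;	/* ring exhausted, detected at allocation */

	info->tre_avail -= tre_count;

	return true;
}

/* Give a transaction's TREs back when the transaction is freed */
static void example_release(struct example_trans_info *info, uint32_t tre_count)
{
	assert(info->tre_avail + tre_count <= info->tre_total);

	info->tre_avail += tre_count;
}
```

Reserving at allocation time means a caller learns about ring
exhaustion before it has built up any request state.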

Once "filled," the transaction is committed.  The GSI transaction
layer performs all needed mapping (and unmapping) for DMA, and
issues the request to the hardware.  When the hardware signals
that the request has completed, a function in the IPA layer is
called, allowing for cleanup or followup activity to be performed
before the transaction is freed.
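
The "issue the request" step amounts to writing one ring entry per
scatterlist element, wrapping at the end of the ring.  A minimal sketch
follows, assuming an 8-entry ring whose entries hold only a length; the
names are hypothetical and the real driver tracks byte offsets rather
than indices.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_RING_COUNT	8	/* assumed ring size, in entries */

/* Hypothetical ring: each entry holds a transfer length; wp is the
 * index of the next entry to write.
 */
struct example_ring {
	uint32_t entry[EXAMPLE_RING_COUNT];
	uint32_t wp;
};

/* Copy count lengths into the ring starting at wp, wrapping at the end.
 * Returns the index of the last entry written, which is where the driver
 * would record the transaction pointer for lookup at completion time.
 */
static uint32_t example_ring_fill(struct example_ring *ring,
				  const uint32_t *len, uint32_t count)
{
	uint32_t last = ring->wp;
	uint32_t i;

	assert(count > 0 && count <= EXAMPLE_RING_COUNT);

	for (i = 0; i < count; i++) {
		last = ring->wp;
		ring->entry[ring->wp] = len[i];
		ring->wp = (ring->wp + 1) % EXAMPLE_RING_COUNT;
	}

	return last;
}
```

Mapping only the last entry back to its transaction is sufficient
because completion of that entry implies completion of the rest.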

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/gsi_trans.c | 604 ++++++++++++++++++++++++++++++++++++
 drivers/net/ipa/gsi_trans.h | 106 +++++++
 2 files changed, 710 insertions(+)
 create mode 100644 drivers/net/ipa/gsi_trans.c
 create mode 100644 drivers/net/ipa/gsi_trans.h

diff --git a/drivers/net/ipa/gsi_trans.c b/drivers/net/ipa/gsi_trans.c
new file mode 100644
index 000000000000..e35e8369d5d0
--- /dev/null
+++ b/drivers/net/ipa/gsi_trans.c
@@ -0,0 +1,604 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/bits.h>
+#include <linux/bitfield.h>
+#include <linux/refcount.h>
+#include <linux/scatterlist.h>
+
+#include "gsi.h"
+#include "gsi_private.h"
+#include "gsi_trans.h"
+#include "ipa_gsi.h"
+#include "ipa_data.h"
+#include "ipa_cmd.h"
+
+/**
+ * DOC: GSI Transactions
+ *
+ * A GSI transaction abstracts the behavior of a GSI channel by representing
+ * everything about a related group of data transfers in a single structure.
+ * Most details of interaction with the GSI hardware are managed by the GSI
+ * transaction core, allowing users to simply describe transfers to be
+ * performed and optionally supply a callback function to run once the set
+ * of transfers has been completed.
+ *
+ * To perform a data transfer (or a related set of them), a user of the GSI
+ * transaction interface allocates a transaction, indicating the number of
+ * TREs required (one per data transfer).  If sufficient TREs are available,
+ * they are reserved for use in the transaction and the allocation succeeds.
+ * This way exhaustion of the available TREs in a channel ring is detected
+ * as early as possible.  All resources required to complete a transaction
+ * are allocated at transaction allocation time.
+ *
+ * Transfers performed as part of a transaction are represented in an array
+ * of Linux scatterlist structures.  This array is allocated with the
+ * transaction, and its entries must be initialized using standard
+ * scatterlist functions (such as sg_init_one() or skb_to_sgvec()).  The
+ * user must supply the total number of bytes represented by all transfers
+ * in the transaction.
+ *
+ * Once a transaction has been prepared, it is committed.  The GSI transaction
+ * layer is responsible for DMA mapping (and unmapping) memory described in
+ * the transaction's scatterlist array.  The only way committing a transaction
+ * fails is if this DMA mapping step returns an error.  Otherwise, ownership
+ * of the entire transaction is transferred to the GSI transaction core.  The
+ * GSI transaction code formats the content of the scatterlist array into the
+ * channel ring buffer and informs the hardware that new TREs are available
+ * to process.
+ *
+ * The last TRE in each transaction is marked to interrupt the AP when the
+ * GSI hardware has completed it.  Because transfers described by TREs are
+ * performed strictly in order, signaling the completion of just the last
+ * TRE in the transaction is sufficient to indicate the full transaction
+ * is complete.
+ *
+ * When a transaction is complete, ipa_gsi_trans_complete() is called by the
+ * GSI code into the IPA layer, allowing it to perform any final cleanup
+ * required before the transaction is freed.
+ *
+ * It is possible to await the completion of a transaction; only immediate
+ * commands currently use this functionality.
+ */
+
+/* gsi_tre->flags mask values (in CPU byte order) */
+#define GSI_TRE_FLAGS_CHAIN_FMASK	GENMASK(0, 0)
+#define GSI_TRE_FLAGS_IEOB_FMASK	GENMASK(8, 8)
+#define GSI_TRE_FLAGS_IEOT_FMASK	GENMASK(9, 9)
+#define GSI_TRE_FLAGS_BEI_FMASK		GENMASK(10, 10)
+#define GSI_TRE_FLAGS_TYPE_FMASK	GENMASK(23, 16)
+
+/* Hardware values representing a transfer element type */
+enum gsi_tre_type {
+	GSI_RE_XFER	= 0x2,
+	GSI_RE_IMMD_CMD	= 0x3,
+	GSI_RE_NOP	= 0x4,
+};
+
+/* Map a given ring entry index to the transaction associated with it */
+static void gsi_channel_trans_map(struct gsi_channel *channel, u32 index,
+				  struct gsi_trans *trans)
+{
+	channel->trans_info.map[index] = trans;
+}
+
+/* Return the transaction mapped to a given ring entry */
+struct gsi_trans *
+gsi_channel_trans_mapped(struct gsi_channel *channel, u32 index)
+{
+	return channel->trans_info.map[index];
+}
+
+/* Return the oldest completed transaction for a channel (or null) */
+struct gsi_trans *gsi_channel_trans_complete(struct gsi_channel *channel)
+{
+	return list_first_entry_or_null(&channel->trans_info.complete,
+					struct gsi_trans, links);
+}
+
+/* Move a transaction from the allocated list to the pending list */
+static void gsi_trans_move_pending(struct gsi_trans *trans)
+{
+	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
+	struct gsi_trans_info *trans_info = &channel->trans_info;
+	unsigned long flags;
+
+	spin_lock_irqsave(&trans_info->spinlock, flags);
+
+	list_move_tail(&trans->links, &trans_info->pending);
+
+	spin_unlock_irqrestore(&trans_info->spinlock, flags);
+}
+
+/* Move a transaction and all of its predecessors from the pending list
+ * to the completed list.
+ */
+void gsi_trans_move_complete(struct gsi_trans *trans)
+{
+	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
+	struct gsi_trans_info *trans_info;
+	struct list_head list;
+	unsigned long flags;
+
+	trans_info = &channel->trans_info;
+
+	spin_lock_irqsave(&trans_info->spinlock, flags);
+
+	/* Move this transaction and all predecessors to completed list */
+	list_cut_position(&list, &trans_info->pending, &trans->links);
+	list_splice_tail(&list, &trans_info->complete);
+
+	spin_unlock_irqrestore(&trans_info->spinlock, flags);
+}
+
+/* Move a transaction from the completed list to the polled list */
+void gsi_trans_move_polled(struct gsi_trans *trans)
+{
+	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
+	struct gsi_trans_info *trans_info = &channel->trans_info;
+	unsigned long flags;
+
+	spin_lock_irqsave(&trans_info->spinlock, flags);
+
+	list_move_tail(&trans->links, &trans_info->polled);
+
+	spin_unlock_irqrestore(&trans_info->spinlock, flags);
+}
+
+/* Allocate a GSI transaction on a channel */
+struct gsi_trans *
+gsi_channel_trans_alloc(struct gsi *gsi, u32 channel_id, u32 tre_count)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	struct gsi_trans_info *trans_info = &channel->trans_info;
+	struct gsi_trans *trans = NULL;
+	unsigned long flags;
+
+	/* Caller should know the limit is gsi_channel_trans_tre_max() */
+	if (WARN_ON(tre_count > channel->data->tlv_count))
+		return NULL;
+
+	spin_lock_irqsave(&trans_info->spinlock, flags);
+
+	if (trans_info->tre_avail >= tre_count) {
+		u32 avail;
+
+		/* Allocate the transaction */
+		if (trans_info->pool_free == trans_info->pool_count)
+			trans_info->pool_free = 0;
+		trans = &trans_info->pool[trans_info->pool_free++];
+
+		/* Allocate the scatter/gather entries it will use.  If
+		 * what's needed would cross the end-of-pool boundary,
+		 * allocate them from the beginning.
+		 */
+		avail = trans_info->sg_pool_count - trans_info->sg_pool_free;
+		if (tre_count > avail)
+			trans_info->sg_pool_free = 0;
+		trans->sgl = &trans_info->sg_pool[trans_info->sg_pool_free];
+		trans->sgc = tre_count;
+		trans_info->sg_pool_free += tre_count;
+
+		/* We reserve the TREs now, but consume them at commit time */
+		trans_info->tre_avail -= tre_count;
+
+		list_add_tail(&trans->links, &trans_info->alloc);
+	}
+
+	spin_unlock_irqrestore(&trans_info->spinlock, flags);
+
+	if (!trans)
+		return NULL;
+
+	trans->gsi = gsi;
+	trans->channel_id = channel_id;
+	refcount_set(&trans->refcount, 1);
+	trans->tre_count = tre_count;
+	init_completion(&trans->completion);
+
+	/* We're reusing, so make sure all fields are reinitialized */
+	trans->dev = gsi->dev;
+	trans->result = 0;	/* Success assumed unless overwritten */
+	trans->data = NULL;
+
+	return trans;
+}
+
+/* Free a previously-allocated transaction (used only in case of error) */
+void gsi_trans_free(struct gsi_trans *trans)
+{
+	struct gsi_trans_info *trans_info;
+	struct gsi_channel *channel;
+	unsigned long flags;
+
+	if (!refcount_dec_and_test(&trans->refcount))
+		return;
+
+	channel = &trans->gsi->channel[trans->channel_id];
+	trans_info = &channel->trans_info;
+
+	spin_lock_irqsave(&trans_info->spinlock, flags);
+
+	list_del(&trans->links);
+	trans_info->tre_avail += trans->tre_count;
+
+	spin_unlock_irqrestore(&trans_info->spinlock, flags);
+}
+
+/* Compute the length/opcode value to use for a TRE */
+static __le16 gsi_tre_len_opcode(enum ipa_cmd_opcode opcode, u32 len)
+{
+	return opcode == IPA_CMD_NONE ? cpu_to_le16((u16)len)
+				      : cpu_to_le16((u16)opcode);
+}
+
+/* Compute the flags value to use for a given TRE */
+static __le32 gsi_tre_flags(bool last_tre, bool bei, enum ipa_cmd_opcode opcode)
+{
+	enum gsi_tre_type tre_type;
+	u32 tre_flags;
+
+	tre_type = opcode == IPA_CMD_NONE ? GSI_RE_XFER : GSI_RE_IMMD_CMD;
+	tre_flags = u32_encode_bits(tre_type, GSI_TRE_FLAGS_TYPE_FMASK);
+
+	/* Last TRE contains interrupt flags */
+	if (last_tre) {
+		/* All transactions end in a transfer completion interrupt */
+		tre_flags |= GSI_TRE_FLAGS_IEOT_FMASK;
+		/* Don't interrupt when outbound commands are acknowledged */
+		if (bei)
+			tre_flags |= GSI_TRE_FLAGS_BEI_FMASK;
+	} else {	/* All others indicate there's more to come */
+		tre_flags |= GSI_TRE_FLAGS_CHAIN_FMASK;
+	}
+
+	return cpu_to_le32(tre_flags);
+}
+
+static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
+			       u32 len, bool last_tre, bool bei,
+			       enum ipa_cmd_opcode opcode)
+{
+	struct gsi_tre tre;
+
+	tre.addr = cpu_to_le64(addr);
+	tre.len_opcode = gsi_tre_len_opcode(opcode, len);
+	tre.reserved = 0;
+	tre.flags = gsi_tre_flags(last_tre, bei, opcode);
+
+	*dest_tre = tre;	/* Write TRE as a single (16-byte) unit */
+}
+
+int gsi_trans_read_byte(struct gsi *gsi, u32 channel_id, dma_addr_t addr)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	struct gsi_trans_info *trans_info;
+	struct gsi_evt_ring *evt_ring;
+	struct gsi_tre *dest_tre;
+	bool exhausted = false;
+	unsigned long flags;
+	u32 wp_local;
+
+	/* assert(!channel->toward_ipa); */
+
+	trans_info = &channel->trans_info;
+	spin_lock_irqsave(&trans_info->spinlock, flags);
+
+	if (trans_info->tre_avail)
+		trans_info->tre_avail--;
+	else
+		exhausted = true;
+
+	spin_unlock_irqrestore(&trans_info->spinlock, flags);
+
+	if (exhausted)
+		return -EBUSY;
+
+	evt_ring = &gsi->evt_ring[channel->evt_ring_id];
+	spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
+
+	wp_local = channel->tre_ring.wp_local;
+	if (wp_local == channel->tre_ring.end)
+		wp_local = channel->tre_ring.base;
+	dest_tre = gsi_ring_virt(&channel->tre_ring, wp_local);
+
+	gsi_trans_tre_fill(dest_tre, addr, 1, true, false, IPA_CMD_NONE);
+
+	gsi_ring_wp_local_add(&channel->tre_ring, 1);
+
+	gsi_channel_doorbell(channel);
+
+	spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
+
+	return 0;
+}
+
+/**
+ * __gsi_trans_commit() - Common GSI transaction commit code
+ * @trans:	Transaction to commit
+ * @opcode:	Immediate command opcode, or IPA_CMD_NONE
+ * @ring_db:	Whether to tell the hardware about these queued transfers
+ *
+ * @Return:	0 if successful, or a negative error code
+ *
+ * Maps the transaction's scatterlist array for DMA, and returns -ENOMEM
+ * if that fails.  Formats channel ring TRE entries based on the content of
+ * the scatterlist.  Records the transaction pointer in the map entry for
+ * the last ring entry used for the transaction so it can be recovered when
+ * it completes, and moves the transaction to the pending list.  Updates the
+ * channel ring pointer and optionally rings the doorbell.
+ */
+static int __gsi_trans_commit(struct gsi_trans *trans,
+			      enum ipa_cmd_opcode opcode, bool ring_db)
+{
+	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
+	enum dma_data_direction direction;
+	bool bei = channel->toward_ipa;
+	struct gsi_evt_ring *evt_ring;
+	struct scatterlist *sg;
+	struct gsi_tre *dest_tre;
+	u32 queued_trans_count;
+	u32 queued_byte_count;
+	unsigned long flags;
+	u32 byte_count = 0;
+	u32 wp_local;
+	u32 index;
+	u32 avail;
+	int ret;
+	u32 i;
+
+	direction = channel->toward_ipa ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+	ret = dma_map_sg(trans->dev, trans->sgl, trans->sgc, direction);
+	if (!ret)
+		return -ENOMEM;
+
+	evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
+
+	spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
+
+	/* We'll consume the entries available at the end of the ring,
+	 * switching to the beginning to finish if necessary.
+	 */
+	wp_local = channel->tre_ring.wp_local;
+	dest_tre = gsi_ring_virt(&channel->tre_ring, wp_local);
+
+	avail = (channel->tre_ring.end - wp_local) / sizeof(*dest_tre);
+
+	for_each_sg(trans->sgl, sg, trans->sgc, i) {
+		bool last_tre = i == trans->tre_count - 1;
+		dma_addr_t addr = sg_dma_address(sg);
+		u32 len = sg_dma_len(sg);
+
+		byte_count += len;
+		if (!avail--)
+			dest_tre = gsi_ring_virt(&channel->tre_ring,
+						 channel->tre_ring.base);
+
+		gsi_trans_tre_fill(dest_tre, addr, len, last_tre, bei, opcode);
+		dest_tre++;
+	}
+
+	if (channel->toward_ipa) {
+		/* We record TX bytes when they are sent */
+		trans->len = byte_count;
+		trans->trans_count = channel->trans_count;
+		trans->byte_count = channel->byte_count;
+		channel->trans_count++;
+		channel->byte_count += byte_count;
+	}
+
+	/* Advance the write pointer; record info for last used element */
+	gsi_ring_wp_local_add(&channel->tre_ring, trans->tre_count - 1);
+	index = ring_index(&channel->tre_ring, channel->tre_ring.wp_local);
+	gsi_channel_trans_map(channel, index, trans);
+	gsi_ring_wp_local_add(&channel->tre_ring, 1);
+
+	gsi_trans_move_pending(trans);
+
+	/* Ring doorbell if requested, or if all TREs are allocated */
+	if (ring_db || !channel->trans_info.tre_avail) {
+		/* Report what we're handing off to hardware for TX channels */
+		if (channel->toward_ipa) {
+			queued_trans_count = channel->trans_count -
+						channel->doorbell_trans_count;
+			queued_byte_count = channel->byte_count -
+						channel->doorbell_byte_count;
+			channel->doorbell_trans_count = channel->trans_count;
+			channel->doorbell_byte_count = channel->byte_count;
+
+			ipa_gsi_channel_tx_queued(trans->gsi,
+						  gsi_channel_id(channel),
+						  queued_trans_count,
+						  queued_byte_count);
+		}
+
+		gsi_channel_doorbell(channel);
+	}
+
+	spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
+
+	return 0;
+}
+
+/* Commit a GSI transaction */
+int gsi_trans_commit(struct gsi_trans *trans, bool ring_db)
+{
+	return __gsi_trans_commit(trans, IPA_CMD_NONE, ring_db);
+}
+
+/* Commit a GSI command transaction and wait for it to complete */
+int gsi_trans_commit_command(struct gsi_trans *trans,
+			     enum ipa_cmd_opcode opcode)
+{
+	int ret;
+
+	refcount_inc(&trans->refcount);
+
+	ret = __gsi_trans_commit(trans, opcode, true);
+	if (ret)
+		goto out_free_trans;
+
+	wait_for_completion(&trans->completion);
+
+out_free_trans:
+	gsi_trans_free(trans);
+
+	return ret;
+}
+
+/* Commit a GSI command transaction, wait for it to complete, with timeout */
+int gsi_trans_commit_command_timeout(struct gsi_trans *trans,
+				     enum ipa_cmd_opcode opcode,
+				     unsigned long timeout)
+{
+	unsigned long timeout_jiffies = msecs_to_jiffies(timeout);
+	unsigned long remaining;
+	int ret;
+
+	refcount_inc(&trans->refcount);
+
+	ret = __gsi_trans_commit(trans, opcode, true);
+	if (ret)
+		goto out_free_trans;
+
+	remaining = wait_for_completion_timeout(&trans->completion,
+						timeout_jiffies);
+out_free_trans:
+	gsi_trans_free(trans);
+
+	return ret ? ret : remaining ? 0 : -ETIMEDOUT;
+}
+
+/* Complete a transaction: unmap it, notify the IPA layer, wake any waiter */
+void gsi_trans_complete(struct gsi_trans *trans)
+{
+	struct gsi_channel *channel = &trans->gsi->channel[trans->channel_id];
+	enum dma_data_direction direction;
+
+	direction = channel->toward_ipa ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+
+	dma_unmap_sg(trans->dev, trans->sgl, trans->sgc, direction);
+
+	ipa_gsi_trans_complete(trans);
+
+	complete(&trans->completion);
+
+	gsi_trans_free(trans);
+}
+
+/* Cancel a channel's pending transactions */
+void gsi_channel_trans_cancel_pending(struct gsi_channel *channel)
+{
+	struct gsi_trans_info *trans_info = &channel->trans_info;
+	u32 evt_ring_id = channel->evt_ring_id;
+	struct gsi *gsi = channel->gsi;
+	struct gsi_evt_ring *evt_ring;
+	struct gsi_trans *trans;
+	unsigned long flags;
+
+	evt_ring = &gsi->evt_ring[evt_ring_id];
+
+	spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
+
+	list_for_each_entry(trans, &trans_info->pending, links)
+		trans->result = -ECANCELED;
+
+	list_splice_tail_init(&trans_info->pending, &trans_info->complete);
+
+	spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
+
+	spin_lock_irqsave(&gsi->spinlock, flags);
+
+	if (gsi->event_enable_bitmap & BIT(evt_ring_id))
+		gsi_event_handle(gsi, evt_ring_id);
+
+	spin_unlock_irqrestore(&gsi->spinlock, flags);
+}
+
+/* Initialize a channel's GSI transaction info */
+int gsi_channel_trans_init(struct gsi_channel *channel)
+{
+	struct gsi_trans_info *trans_info = &channel->trans_info;
+	u32 tre_count = channel->data->tre_count;
+
+	trans_info->map = kcalloc(tre_count, sizeof(*trans_info->map),
+				  GFP_KERNEL);
+	if (!trans_info->map)
+		return -ENOMEM;
+
+	/* We will never need more transactions than there are TRE
+	 * entries in the transfer ring.  For that reason, we can
+	 * preallocate an array of (at least) that many transactions,
+	 * and use a single free index to determine the next one
+	 * available for allocation.
+	 */
+	trans_info->pool_count = tre_count;
+	trans_info->pool = kcalloc(trans_info->pool_count,
+				   sizeof(*trans_info->pool), GFP_KERNEL);
+	if (!trans_info->pool)
+		goto err_free_map;
+	/* If we get extra memory from the allocator, use it */
+	trans_info->pool_count =
+		ksize(trans_info->pool) / sizeof(*trans_info->pool);
+	trans_info->pool_free = 0;
+
+	/* While transactions are allocated one at a time, a transaction
+	 * can have multiple TREs.  The number of TRE entries in a single
+	 * transaction is limited by the number of TLV FIFO entries the
+	 * channel has.  We reserve TREs when a transaction is allocated,
+	 * but we don't actually use them until the transaction is
+	 * committed.
+	 *
+	 * A transaction uses a scatterlist array to represent the data
+	 * transfers implemented by the transaction.  Each scatterlist
+	 * element is used to fill a single TRE when the transaction is
+	 * committed.  As a result, we need the same number of scatterlist
+	 * elements as there are TREs in the transfer ring, and we can
+	 * preallocate them in a pool.
+	 *
+	 * If we allocate a few (tlv_count - 1) extra entries in our pool
+	 * we can always satisfy requests without ever worrying about
+	 * straddling the end of the array.  If there aren't enough
+	 * entries starting at the free index, we just allocate free
+	 * entries from the beginning of the pool.
+	 */
+	trans_info->sg_pool_count = tre_count + channel->data->tlv_count - 1;
+	trans_info->sg_pool = kcalloc(trans_info->sg_pool_count,
+				      sizeof(*trans_info->sg_pool), GFP_KERNEL);
+	if (!trans_info->sg_pool)
+		goto err_free_pool;
+	/* Use any extra memory we get from the allocator */
+	trans_info->sg_pool_count =
+		ksize(trans_info->sg_pool) / sizeof(*trans_info->sg_pool);
+	trans_info->sg_pool_free = 0;
+
+	spin_lock_init(&trans_info->spinlock);
+	trans_info->tre_avail = tre_count;	/* maximum active */
+	INIT_LIST_HEAD(&trans_info->alloc);
+	INIT_LIST_HEAD(&trans_info->pending);
+	INIT_LIST_HEAD(&trans_info->complete);
+	INIT_LIST_HEAD(&trans_info->polled);
+
+	return 0;
+
+err_free_pool:
+	kfree(trans_info->pool);
+err_free_map:
+	kfree(trans_info->map);
+
+	return -ENOMEM;
+}
+
+/* Inverse of gsi_channel_trans_init() */
+void gsi_channel_trans_exit(struct gsi_channel *channel)
+{
+	struct gsi_trans_info *trans_info = &channel->trans_info;
+
+	kfree(trans_info->sg_pool);
+	kfree(trans_info->pool);
+	kfree(trans_info->map);
+}
diff --git a/drivers/net/ipa/gsi_trans.h b/drivers/net/ipa/gsi_trans.h
new file mode 100644
index 000000000000..160902b077a7
--- /dev/null
+++ b/drivers/net/ipa/gsi_trans.h
@@ -0,0 +1,106 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _GSI_TRANS_H_
+#define _GSI_TRANS_H_
+
+#include <linux/types.h>
+#include <linux/refcount.h>
+#include <linux/completion.h>
+
+struct scatterlist;
+struct device;
+
+struct gsi;
+struct gsi_trans;
+enum ipa_cmd_opcode;
+
+struct gsi_trans {
+	struct list_head links;		/* gsi_channel lists */
+
+	struct gsi *gsi;
+	u32 channel_id;
+
+	u32 tre_count;			/* # TREs requested */
+	u32 len;			/* total # bytes in sgl */
+	struct scatterlist *sgl;
+	u32 sgc;			/* # entries in sgl[] */
+
+	struct completion completion;
+	refcount_t refcount;
+
+	/* fields above are internal only */
+
+	struct device *dev;		/* Use this for DMA mapping */
+	long result;			/* RX count, 0, or error code */
+
+	u64 byte_count;			/* channel byte_count when committed */
+	u64 trans_count;		/* channel trans_count when committed */
+
+	void *data;
+};
+
+/**
+ * gsi_channel_trans_alloc() - Allocate a GSI transaction on a channel
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel the transaction is associated with
+ * @tre_count:	Number of elements in the transaction
+ *
+ * @Return:	A GSI transaction structure, or a null pointer if all
+ *		available transactions are in use
+ */
+struct gsi_trans *gsi_channel_trans_alloc(struct gsi *gsi, u32 channel_id,
+					  u32 tre_count);
+
+/**
+ * gsi_trans_free() - Free a previously-allocated GSI transaction
+ * @trans:	Transaction to be freed
+ *
+ * Note: this should only be used in error paths, before the transaction is
+ * committed or in the event committing the transaction produces an error.
+ * Successfully committing a transaction passes ownership of the structure
+ * to the core transaction code.
+ */
+void gsi_trans_free(struct gsi_trans *trans);
+
+/**
+ * gsi_trans_commit() - Commit a GSI transaction
+ * @trans:	Transaction to commit
+ * @ring_db:	Whether to tell the hardware about these queued transfers
+ * @Return:	0 if successful, or a negative error code
+ */
+int gsi_trans_commit(struct gsi_trans *trans, bool ring_db);
+
+/**
+ * gsi_trans_commit_command() - Commit a GSI command and await completion
+ * @trans:	Transaction to commit
+ * @opcode:	Immediate command opcode
+ */
+int gsi_trans_commit_command(struct gsi_trans *trans,
+			     enum ipa_cmd_opcode opcode);
+
+/**
+ * gsi_trans_commit_command_timeout() - Commit a GSI command transaction,
+ *					wait for it to complete, with timeout
+ * @trans:	Transaction to commit
+ * @opcode:	Immediate command opcode
+ * @timeout:	Timeout period (in milliseconds)
+ */
+int gsi_trans_commit_command_timeout(struct gsi_trans *trans,
+				     enum ipa_cmd_opcode opcode,
+				     unsigned long timeout);
+
+/**
+ * gsi_trans_read_byte() - Issue a single byte read TRE on a channel
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel on which to read a byte
+ * @addr:	DMA address into which to transfer the one byte
+ *
+ * This is not a transaction operation at all.  It's defined here because
+ * it needs to be done in coordination with other transaction activity.
+ */
+int gsi_trans_read_byte(struct gsi *gsi, u32 channel_id, dma_addr_t addr);
+
+#endif /* _GSI_TRANS_H_ */
-- 
2.20.1


* [PATCH 10/18] soc: qcom: ipa: IPA interface to GSI
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (8 preceding siblings ...)
  2019-05-12  1:24 ` [PATCH 09/18] soc: qcom: ipa: GSI transactions Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-12  1:25 ` [PATCH 11/18] soc: qcom: ipa: IPA endpoints Alex Elder
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch provides interface functions supplied by the IPA layer
that are called from the GSI layer.  One function is called when a
GSI transaction has completed.  The others allow the GSI layer to
inform the IPA layer when transactions have been supplied to
hardware, and when the hardware has indicated transactions have
completed.
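
The "supplied to hardware" notification reports a delta: whatever was
committed since the previous doorbell.  The sketch below models that
bookkeeping in user space.  The counter names follow those used in
gsi_trans.c, but the structure and functions are hypothetical and their
64-bit widths are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-channel counters, named after those in gsi_trans.c */
struct example_channel {
	uint64_t trans_count;		/* transactions committed so far */
	uint64_t byte_count;		/* bytes committed so far */
	uint64_t doorbell_trans_count;	/* trans_count at the last doorbell */
	uint64_t doorbell_byte_count;	/* byte_count at the last doorbell */
};

/* Account for one committed transaction of len bytes */
static void example_commit(struct example_channel *channel, uint32_t len)
{
	channel->trans_count++;
	channel->byte_count += len;
}

/* Ring the doorbell: report what was queued since the previous doorbell */
static void example_doorbell(struct example_channel *channel,
			     uint32_t *queued_trans, uint32_t *queued_bytes)
{
	assert(channel->trans_count >= channel->doorbell_trans_count);

	*queued_trans = (uint32_t)(channel->trans_count -
				   channel->doorbell_trans_count);
	*queued_bytes = (uint32_t)(channel->byte_count -
				   channel->doorbell_byte_count);

	channel->doorbell_trans_count = channel->trans_count;
	channel->doorbell_byte_count = channel->byte_count;
}
```

In the patch, the byte delta is what ipa_gsi_channel_tx_queued() passes
on to netdev_sent_queue() for endpoints that have a network device.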

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/ipa_gsi.c | 48 ++++++++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_gsi.h | 49 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 97 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_gsi.c
 create mode 100644 drivers/net/ipa/ipa_gsi.h

diff --git a/drivers/net/ipa/ipa_gsi.c b/drivers/net/ipa/ipa_gsi.c
new file mode 100644
index 000000000000..c4e6c96d1676
--- /dev/null
+++ b/drivers/net/ipa/ipa_gsi.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+
+#include "gsi_trans.h"
+#include "ipa.h"
+#include "ipa_endpoint.h"
+
+void ipa_gsi_trans_complete(struct gsi_trans *trans)
+{
+	struct ipa *ipa = container_of(trans->gsi, struct ipa, gsi);
+	struct ipa_endpoint *endpoint;
+
+	endpoint = ipa->endpoint_map[trans->channel_id];
+	if (endpoint == ipa->command_endpoint)
+		return;		/* Nothing to do for commands */
+
+	if (endpoint->toward_ipa)
+		ipa_endpoint_skb_tx_callback(trans);
+	else
+		ipa_endpoint_rx_callback(trans);
+}
+
+void ipa_gsi_channel_tx_queued(struct gsi *gsi, u32 channel_id, u32 count,
+			       u32 byte_count)
+{
+	struct ipa *ipa = container_of(gsi, struct ipa, gsi);
+	struct ipa_endpoint *endpoint;
+
+	endpoint = ipa->endpoint_map[channel_id];
+	if (endpoint->netdev)
+		netdev_sent_queue(endpoint->netdev, byte_count);
+}
+
+void ipa_gsi_channel_tx_completed(struct gsi *gsi, u32 channel_id, u32 count,
+				  u32 byte_count)
+{
+	struct ipa *ipa = container_of(gsi, struct ipa, gsi);
+	struct ipa_endpoint *endpoint;
+
+	endpoint = ipa->endpoint_map[channel_id];
+	if (endpoint->netdev)
+		netdev_completed_queue(endpoint->netdev, count, byte_count);
+}
diff --git a/drivers/net/ipa/ipa_gsi.h b/drivers/net/ipa/ipa_gsi.h
new file mode 100644
index 000000000000..72adb520da40
--- /dev/null
+++ b/drivers/net/ipa/ipa_gsi.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_GSI_TRANS_H_
+#define _IPA_GSI_TRANS_H_
+
+#include <linux/types.h>
+
+struct gsi_trans;
+
+/**
+ * ipa_gsi_trans_complete() - GSI transaction completion callback
+ * @trans:	Transaction that has completed
+ *
+ * This is called from the GSI layer to notify the IPA layer that a
+ * transaction has completed.  The GSI pointer is available through
+ * the transaction itself (trans->gsi).
+ */
+void ipa_gsi_trans_complete(struct gsi_trans *trans);
+
+/**
+ * ipa_gsi_channel_tx_queued() - GSI queued to hardware notification
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel number
+ * @count:	Number of transactions queued
+ * @byte_count:	Number of bytes to transfer represented by transactions
+ *
+ * This is called from the GSI layer to notify the IPA layer that some
+ * number of transactions have been queued to hardware for execution.
+ */
+void ipa_gsi_channel_tx_queued(struct gsi *gsi, u32 channel_id, u32 count,
+			       u32 byte_count);
+
+/**
+ * ipa_gsi_channel_tx_completed() - GSI hardware completion notification
+ * @gsi:	GSI pointer
+ * @channel_id:	Channel number
+ * @count:	Number of transactions completed since last report
+ * @byte_count:	Number of bytes transferred represented by transactions
+ *
+ * This is called from the GSI layer to notify the IPA layer that the hardware
+ * has reported the completion of some number of transactions.
+ */
+void ipa_gsi_channel_tx_completed(struct gsi *gsi, u32 channel_id, u32 count,
+				  u32 byte_count);
+
+#endif /* _IPA_GSI_TRANS_H_ */
-- 
2.20.1



* [PATCH 11/18] soc: qcom: ipa: IPA endpoints
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (9 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 10/18] soc: qcom: ipa: IPA interface to GSI Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-12  1:25 ` [PATCH 12/18] soc: qcom: ipa: immediate commands Alex Elder
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch includes the code implementing an IPA endpoint.  This is
the primary abstraction implemented by the IPA.  An endpoint is one
end of a network connection between two entities physically
connected to the IPA.  Specifically, the AP and the modem implement
endpoints, and an (AP endpoint, modem endpoint) pair implements the
transfer of network data in one direction between the AP and modem.

Endpoints are built on top of GSI channels, but IPA endpoints
represent the higher-level functionality that the IPA provides.
Data can be sent through a GSI channel, but it is the IPA endpoint
that represents what is on the "other end" to receive that data.
Other functionality, including aggregation, checksum offload, and
(at some future date) IP routing and filtering, is associated with
the IPA endpoint.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/ipa_endpoint.c | 1253 ++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_endpoint.h |   96 +++
 2 files changed, 1349 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_endpoint.c
 create mode 100644 drivers/net/ipa/ipa_endpoint.h
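
Since aggregation is one of the features tied to an endpoint, a small
userspace sketch of the arithmetic in ipa_aggr_size_kb() from this patch
may help.  The aggregation byte limit is expressed in KB, and because the
"hard byte limit" is not enabled, room for one more MTU plus skb overhead
is held back from the buffer before the limit is computed.  The function
below takes the MTU and overhead as parameters; the numbers used in the
note that follows are placeholders I chose for illustration, not the
driver's actual IPA_MTU or IPA_RX_BUFFER_OVERHEAD values.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1K	1024u

/* Compute an aggregation byte limit (in KB) for a receive buffer.
 * One MTU plus the per-buffer overhead is reserved first, so that a
 * packet arriving just under the limit still fits in the buffer.
 */
static uint32_t aggr_size_kb(uint32_t rx_buffer_size, uint32_t mtu,
			     uint32_t overhead)
{
	/* The buffer must at least hold the reserved space */
	assert(rx_buffer_size >= mtu + overhead);

	rx_buffer_size -= mtu + overhead;

	return rx_buffer_size / SZ_1K;	/* rounds down to whole KB */
}
```

With an 8192-byte buffer, a placeholder MTU of 1500 and a placeholder
overhead of 384, aggr_size_kb(8192, 1500, 384) leaves 6308 usable bytes
and returns 6, so the limit programmed into the hardware would be 6 KB.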

diff --git a/drivers/net/ipa/ipa_endpoint.c b/drivers/net/ipa/ipa_endpoint.c
new file mode 100644
index 000000000000..5dc10abc6480
--- /dev/null
+++ b/drivers/net/ipa/ipa_endpoint.c
@@ -0,0 +1,1253 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/bitfield.h>
+#include <soc/qcom/rmnet.h>
+
+#include "gsi.h"
+#include "gsi_trans.h"
+#include "ipa.h"
+#include "ipa_data.h"
+#include "ipa_endpoint.h"
+#include "ipa_cmd.h"
+#include "ipa_mem.h"
+#include "ipa_netdev.h"
+
+#define atomic_dec_not_zero(v)	atomic_add_unless((v), -1, 0)
+
+#define IPA_REPLENISH_BATCH	16
+
+#define IPA_RX_BUFFER_SIZE	(PAGE_SIZE << IPA_RX_BUFFER_ORDER)
+#define IPA_RX_BUFFER_ORDER	1	/* 8KB endpoint RX buffers (2 pages) */
+
+/* The amount of RX buffer space consumed by standard skb overhead */
+#define IPA_RX_BUFFER_OVERHEAD	(PAGE_SIZE - SKB_MAX_ORDER(NET_SKB_PAD, 0))
+
+#define IPA_ENDPOINT_STOP_RETRY_MAX		10
+#define IPA_ENDPOINT_STOP_RX_SIZE		1	/* bytes */
+
+#define IPA_ENDPOINT_RESET_AGGR_RETRY_MAX	3
+#define IPA_AGGR_TIME_LIMIT_DEFAULT		1	/* milliseconds */
+
+/** enum ipa_status_opcode - status element opcode hardware values */
+enum ipa_status_opcode {
+	IPA_STATUS_OPCODE_PACKET		= 0x01,
+	IPA_STATUS_OPCODE_NEW_FRAG_RULE		= 0x02,
+	IPA_STATUS_OPCODE_DROPPED_PACKET	= 0x04,
+	IPA_STATUS_OPCODE_SUSPENDED_PACKET	= 0x08,
+	IPA_STATUS_OPCODE_LOG			= 0x10,
+	IPA_STATUS_OPCODE_DCMP			= 0x20,
+	IPA_STATUS_OPCODE_PACKET_2ND_PASS	= 0x40,
+};
+
+/** enum ipa_status_exception - status element exception type */
+enum ipa_status_exception {
+	IPA_STATUS_EXCEPTION_NONE,
+	IPA_STATUS_EXCEPTION_DEAGGR,
+	IPA_STATUS_EXCEPTION_IPTYPE,
+	IPA_STATUS_EXCEPTION_PACKET_LENGTH,
+	IPA_STATUS_EXCEPTION_PACKET_THRESHOLD,
+	IPA_STATUS_EXCEPTION_FRAG_RULE_MISS,
+	IPA_STATUS_EXCEPTION_SW_FILT,
+	IPA_STATUS_EXCEPTION_NAT,
+	IPA_STATUS_EXCEPTION_IPV6CT,
+	IPA_STATUS_EXCEPTION_MAX,
+};
+
+/**
+ * struct ipa_status - Abstracted IPA status element
+ * @opcode:		Status element type
+ * @exception:		The first exception that took place
+ * @pkt_len:		Payload length
+ * @dst_endpoint:	Destination endpoint
+ * @metadata:		32-bit metadata value used by packet
+ * @rt_miss:		Flag; if 1, indicates there was a routing rule miss
+ *
+ * Note that the hardware status element supplies additional information
+ * that is currently unused.
+ */
+struct ipa_status {
+	enum ipa_status_opcode opcode;
+	enum ipa_status_exception exception;
+	u32 pkt_len;
+	u32 dst_endpoint;
+	u32 metadata;
+	u32 rt_miss;
+};
+
+/* Field masks for struct ipa_status_raw structure fields */
+
+#define IPA_STATUS_SRC_IDX_FMASK		GENMASK(4, 0)
+
+#define IPA_STATUS_DST_IDX_FMASK		GENMASK(4, 0)
+
+#define IPA_STATUS_FLAGS1_FLT_LOCAL_FMASK	GENMASK(0, 0)
+#define IPA_STATUS_FLAGS1_FLT_HASH_FMASK	GENMASK(1, 1)
+#define IPA_STATUS_FLAGS1_FLT_GLOBAL_FMASK	GENMASK(2, 2)
+#define IPA_STATUS_FLAGS1_FLT_RET_HDR_FMASK	GENMASK(3, 3)
+#define IPA_STATUS_FLAGS1_FLT_RULE_ID_FMASK	GENMASK(13, 4)
+#define IPA_STATUS_FLAGS1_RT_LOCAL_FMASK	GENMASK(14, 14)
+#define IPA_STATUS_FLAGS1_RT_HASH_FMASK		GENMASK(15, 15)
+#define IPA_STATUS_FLAGS1_UCP_FMASK		GENMASK(16, 16)
+#define IPA_STATUS_FLAGS1_RT_TBL_IDX_FMASK	GENMASK(21, 17)
+#define IPA_STATUS_FLAGS1_RT_RULE_ID_FMASK	GENMASK(31, 22)
+
+#define IPA_STATUS_FLAGS2_NAT_HIT_FMASK		GENMASK_ULL(0, 0)
+#define IPA_STATUS_FLAGS2_NAT_ENTRY_IDX_FMASK	GENMASK_ULL(13, 1)
+#define IPA_STATUS_FLAGS2_NAT_TYPE_FMASK	GENMASK_ULL(15, 14)
+#define IPA_STATUS_FLAGS2_TAG_INFO_FMASK	GENMASK_ULL(63, 16)
+
+#define IPA_STATUS_FLAGS3_SEQ_NUM_FMASK		GENMASK(7, 0)
+#define IPA_STATUS_FLAGS3_TOD_CTR_FMASK		GENMASK(31, 8)
+
+#define IPA_STATUS_FLAGS4_HDR_LOCAL_FMASK	GENMASK(0, 0)
+#define IPA_STATUS_FLAGS4_HDR_OFFSET_FMASK	GENMASK(10, 1)
+#define IPA_STATUS_FLAGS4_FRAG_HIT_FMASK	GENMASK(11, 11)
+#define IPA_STATUS_FLAGS4_FRAG_RULE_FMASK	GENMASK(15, 12)
+#define IPA_STATUS_FLAGS4_HW_SPECIFIC_FMASK	GENMASK(31, 16)
+
+/* Status element provided by hardware */
+struct ipa_status_raw {
+	u8 opcode;
+	u8 exception;
+	u16 mask;
+	u16 pkt_len;
+	u8 endp_src_idx;	/* Only bottom 5 bits valid */
+	u8 endp_dst_idx;	/* Only bottom 5 bits valid */
+	u32 metadata;
+	u32 flags1;
+	u64 flags2;
+	u32 flags3;
+	u32 flags4;
+};
+
+static void ipa_endpoint_replenish(struct ipa_endpoint *endpoint);
+
+/* suspend_delay represents suspend for RX, delay for TX endpoints */
+bool ipa_endpoint_init_ctrl(struct ipa_endpoint *endpoint, bool suspend_delay)
+{
+	u32 offset = IPA_REG_ENDP_INIT_CTRL_N_OFFSET(endpoint->endpoint_id);
+	u32 mask;
+	u32 val;
+
+	mask = endpoint->toward_ipa ? ENDP_DELAY_FMASK : ENDP_SUSPEND_FMASK;
+
+	val = ioread32(endpoint->ipa->reg_virt + offset);
+	if (suspend_delay == !!(val & mask))
+		return false;	/* Already set to desired state */
+
+	val ^= mask;
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+
+	return true;
+}
+
+static void ipa_endpoint_init_cfg(struct ipa_endpoint *endpoint)
+{
+	u32 offset = IPA_REG_ENDP_INIT_CFG_N_OFFSET(endpoint->endpoint_id);
+	u32 val = 0;
+
+	/* FRAG_OFFLOAD_EN is 0 */
+	if (endpoint->data->config.checksum) {
+		if (endpoint->toward_ipa) {
+			u32 checksum_offset;
+
+			val |= u32_encode_bits(IPA_CS_OFFLOAD_UL,
+					       CS_OFFLOAD_EN_FMASK);
+			/* Checksum header offset is in 4-byte units */
+			checksum_offset = sizeof(struct rmnet_map_header);
+			checksum_offset /= sizeof(u32);
+			val |= u32_encode_bits(checksum_offset,
+					       CS_METADATA_HDR_OFFSET_FMASK);
+		} else {
+			val |= u32_encode_bits(IPA_CS_OFFLOAD_DL,
+					       CS_OFFLOAD_EN_FMASK);
+		}
+	} else {
+		val |= u32_encode_bits(IPA_CS_OFFLOAD_NONE,
+				       CS_OFFLOAD_EN_FMASK);
+	}
+	/* CS_GEN_QMB_MASTER_SEL is 0 */
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+static void ipa_endpoint_init_hdr(struct ipa_endpoint *endpoint)
+{
+	u32 offset = IPA_REG_ENDP_INIT_HDR_N_OFFSET(endpoint->endpoint_id);
+	u32 val = 0;
+
+	if (endpoint->data->config.qmap) {
+		size_t header_size = sizeof(struct rmnet_map_header);
+
+		if (endpoint->toward_ipa && endpoint->data->config.checksum)
+			header_size += sizeof(struct rmnet_map_ul_csum_header);
+
+		val |= u32_encode_bits(header_size, HDR_LEN_FMASK);
+		/* metadata is the 4 byte rmnet_map header itself */
+		val |= HDR_OFST_METADATA_VALID_FMASK;
+		val |= u32_encode_bits(0, HDR_OFST_METADATA_FMASK);
+		/* HDR_ADDITIONAL_CONST_LEN is 0; (IPA->AP only) */
+		if (!endpoint->toward_ipa) {
+			u32 size_offset = offsetof(struct rmnet_map_header,
+						   pkt_len);
+
+			val |= HDR_OFST_PKT_SIZE_VALID_FMASK;
+			val |= u32_encode_bits(size_offset,
+					       HDR_OFST_PKT_SIZE_FMASK);
+		}
+		/* HDR_A5_MUX is 0 */
+		/* HDR_LEN_INC_DEAGG_HDR is 0 */
+		/* HDR_METADATA_REG_VALID is 0; (AP->IPA only) */
+	}
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+static void ipa_endpoint_init_hdr_ext(struct ipa_endpoint *endpoint)
+{
+	u32 offset = IPA_REG_ENDP_INIT_HDR_EXT_N_OFFSET(endpoint->endpoint_id);
+	u32 pad_align = endpoint->data->config.rx.pad_align;
+	u32 val = 0;
+
+	val |= HDR_ENDIANNESS_FMASK;		/* big endian */
+	val |= HDR_TOTAL_LEN_OR_PAD_VALID_FMASK;
+	/* HDR_TOTAL_LEN_OR_PAD is 0 (pad, not total_len) */
+	/* HDR_PAYLOAD_LEN_INC_PADDING is 0 */
+	/* HDR_TOTAL_LEN_OR_PAD_OFFSET is 0 */
+	if (!endpoint->toward_ipa)
+		val |= u32_encode_bits(pad_align, HDR_PAD_TO_ALIGNMENT_FMASK);
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+/* Generate a metadata mask value that will select only the mux_id
+ * field in an rmnet_map header structure.  The mux_id is at offset
+ * 1 byte from the beginning of the structure, but the metadata
+ * value is treated as a 4-byte unit.  So this mask must be computed
+ * with endianness in mind.  The value returned here is already in
+ * big-endian byte order; ipa_endpoint_init_hdr_metadata_mask() passes
+ * it to iowrite32() unchanged.
+ *
+ * Marked __always_inline because this is really computing a
+ * constant value.
+ */
+static __always_inline __be32 ipa_rmnet_mux_id_metadata_mask(void)
+{
+	size_t mux_id_offset = offsetof(struct rmnet_map_header, mux_id);
+	u32 mux_id_mask = 0;
+	u8 *bytes;
+
+	bytes = (u8 *)&mux_id_mask;
+	bytes[mux_id_offset] = 0xff;	/* mux_id is 1 byte */
+
+	return cpu_to_be32(mux_id_mask);
+}
+
+static void ipa_endpoint_init_hdr_metadata_mask(struct ipa_endpoint *endpoint)
+{
+	u32 endpoint_id = endpoint->endpoint_id;
+	u32 val = 0;
+	u32 offset;
+
+	offset = IPA_REG_ENDP_INIT_HDR_METADATA_MASK_N_OFFSET(endpoint_id);
+
+	if (!endpoint->toward_ipa && endpoint->data->config.qmap)
+		val = ipa_rmnet_mux_id_metadata_mask();
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+/* Compute the aggregation size value to use for a given buffer size */
+static u32 ipa_aggr_size_kb(u32 rx_buffer_size)
+{
+	BUILD_BUG_ON(IPA_RX_BUFFER_SIZE >
+		     field_max(AGGR_BYTE_LIMIT_FMASK) * SZ_1K +
+		     IPA_MTU + IPA_RX_BUFFER_OVERHEAD);
+
+	/* Because we don't have the "hard byte limit" enabled, we
+	 * need to make sure there's enough space in the buffer to
+	 * receive a complete MTU (plus normal skb overhead) beyond
+	 * the aggregated size limit we specify.
+	 */
+	rx_buffer_size -= IPA_MTU + IPA_RX_BUFFER_OVERHEAD;
+
+	return rx_buffer_size / SZ_1K;
+}
+
+static void ipa_endpoint_init_aggr(struct ipa_endpoint *endpoint)
+{
+	const struct ipa_endpoint_config_data *config = &endpoint->data->config;
+	u32 offset = IPA_REG_ENDP_INIT_AGGR_N_OFFSET(endpoint->endpoint_id);
+	u32 val = 0;
+
+	if (config->aggregation) {
+		if (!endpoint->toward_ipa) {
+			u32 aggr_size = ipa_aggr_size_kb(IPA_RX_BUFFER_SIZE);
+
+			val |= u32_encode_bits(IPA_ENABLE_AGGR, AGGR_EN_FMASK);
+			val |= u32_encode_bits(IPA_GENERIC, AGGR_TYPE_FMASK);
+			val |= u32_encode_bits(aggr_size,
+					       AGGR_BYTE_LIMIT_FMASK);
+			val |= u32_encode_bits(IPA_AGGR_TIME_LIMIT_DEFAULT,
+					       AGGR_TIME_LIMIT_FMASK);
+			val |= u32_encode_bits(0, AGGR_PKT_LIMIT_FMASK);
+			if (config->rx.aggr_close_eof)
+				val |= AGGR_SW_EOF_ACTIVE_FMASK;
+			/* AGGR_HARD_BYTE_LIMIT_ENABLE is 0 */
+		} else {
+			val |= u32_encode_bits(IPA_ENABLE_DEAGGR,
+					       AGGR_EN_FMASK);
+			val |= u32_encode_bits(IPA_QCMAP, AGGR_TYPE_FMASK);
+			/* other fields ignored */
+		}
+		/* AGGR_FORCE_CLOSE is 0 */
+	} else {
+		val |= u32_encode_bits(IPA_BYPASS_AGGR, AGGR_EN_FMASK);
+		/* other fields ignored */
+	}
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+static void ipa_endpoint_init_mode(struct ipa_endpoint *endpoint)
+{
+	u32 offset = IPA_REG_ENDP_INIT_MODE_N_OFFSET(endpoint->endpoint_id);
+	u32 val = 0;
+
+	if (endpoint->toward_ipa && endpoint->data->config.dma_mode) {
+		u32 dma_endpoint_id = endpoint->data->config.dma_endpoint;
+
+		val |= u32_encode_bits(IPA_DMA, MODE_FMASK);
+		val |= u32_encode_bits(dma_endpoint_id, DEST_PIPE_INDEX_FMASK);
+	} else {
+		val |= u32_encode_bits(IPA_BASIC, MODE_FMASK);
+	}
+	/* Other bitfields unspecified (and 0) */
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+static void ipa_endpoint_init_deaggr(struct ipa_endpoint *endpoint)
+{
+	u32 offset = IPA_REG_ENDP_INIT_DEAGGR_N_OFFSET(endpoint->endpoint_id);
+	u32 val = 0;
+
+	/* DEAGGR_HDR_LEN is 0 */
+	/* PACKET_OFFSET_VALID is 0 */
+	/* PACKET_OFFSET_LOCATION is ignored (not valid) */
+	/* MAX_PACKET_LEN is 0 (not enforced) */
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+static void ipa_endpoint_init_seq(struct ipa_endpoint *endpoint)
+{
+	u32 offset = IPA_REG_ENDP_INIT_SEQ_N_OFFSET(endpoint->endpoint_id);
+	u32 seq_type = endpoint->data->seq_type;
+	u32 val = 0;
+
+	val |= u32_encode_bits(seq_type & 0xf, HPS_SEQ_TYPE_FMASK);
+	val |= u32_encode_bits((seq_type >> 4) & 0xf, DPS_SEQ_TYPE_FMASK);
+	/* HPS_REP_SEQ_TYPE is 0 */
+	/* DPS_REP_SEQ_TYPE is 0 */
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+/* Complete transaction initiated in ipa_endpoint_skb_tx() */
+void ipa_endpoint_skb_tx_callback(struct gsi_trans *trans)
+{
+	struct sk_buff *skb = trans->data;
+
+	dev_kfree_skb_any(skb);
+}
+
+/**
+ * ipa_endpoint_skb_tx() - Transmit a socket buffer
+ * @endpoint:	Endpoint pointer
+ * @skb:	Socket buffer to send
+ *
+ * Returns:	0 if successful, or a negative error code
+ */
+int ipa_endpoint_skb_tx(struct ipa_endpoint *endpoint, struct sk_buff *skb)
+{
+	struct gsi_trans *trans;
+	bool doorbell;
+	u32 nr_frags;
+	int ret;
+
+	/* Make sure source endpoint's TLV FIFO has enough entries to
+	 * hold the linear portion of the skb and all its fragments.
+	 * If not, see if we can linearize it before giving up.
+	 */
+	nr_frags = skb_shinfo(skb)->nr_frags;
+	if (1 + nr_frags > endpoint->trans_tre_max) {
+		if (skb_linearize(skb))
+			return -ENOMEM;
+		nr_frags = 0;
+	}
+
+	trans = gsi_channel_trans_alloc(&endpoint->ipa->gsi,
+					endpoint->channel_id, nr_frags + 1);
+	if (!trans)
+		return -EBUSY;
+	trans->data = skb;
+
+	ret = skb_to_sgvec(skb, trans->sgl, 0, skb->len);
+	if (ret < 0)
+		goto err_trans_free;
+	trans->sgc = ret;
+
+	/* doorbell = __netdev_sent_queue(skb->dev, ->len, ->xmit_more); */
+	doorbell = !skb->xmit_more;
+	ret = gsi_trans_commit(trans, doorbell);
+	if (ret)
+		goto err_trans_free;
+	return 0;
+
+err_trans_free:
+	gsi_trans_free(trans);
+
+	return ret;
+}
+
+static void ipa_endpoint_status(struct ipa_endpoint *endpoint)
+{
+	const struct ipa_endpoint_config_data *config = &endpoint->data->config;
+	enum ipa_endpoint_id endpoint_id = endpoint->endpoint_id;
+	u32 val = 0;
+	u32 offset;
+
+	offset = IPA_REG_ENDP_STATUS_N_OFFSET(endpoint_id);
+
+	if (endpoint->data->config.status_enable) {
+		val |= STATUS_EN_FMASK;
+		if (endpoint->toward_ipa) {
+			u32 status_endpoint_id = config->tx.status_endpoint;
+
+			val |= u32_encode_bits(status_endpoint_id,
+					       STATUS_ENDP_FMASK);
+		}
+		/* STATUS_LOCATION is 0 (status element precedes packet) */
+		/* STATUS_PKT_SUPPRESS is 0 */
+	}
+
+	iowrite32(val, endpoint->ipa->reg_virt + offset);
+}
+
+static void ipa_endpoint_skb_copy(struct ipa_endpoint *endpoint,
+				  void *data, u32 len, u32 extra)
+{
+	struct sk_buff *skb;
+
+	skb = __dev_alloc_skb(len, GFP_ATOMIC);
+	if (skb) {
+		skb_put(skb, len);
+		memcpy(skb->data, data, len);
+		skb->truesize += extra;
+	}
+
+	/* Now receive it, or drop it if there's no netdev */
+	if (endpoint->netdev)
+		ipa_netdev_skb_rx(endpoint->netdev, skb);
+	else if (skb)
+		dev_kfree_skb_any(skb);
+}
+
+static void ipa_endpoint_skb_build(struct ipa_endpoint *endpoint,
+				   struct page *page, u32 len)
+{
+	struct sk_buff *skb;
+
+	/* assert(len <= SKB_WITH_OVERHEAD(IPA_RX_BUFFER_SIZE-NET_SKB_PAD)); */
+	skb = build_skb(page_address(page), IPA_RX_BUFFER_SIZE);
+	if (skb) {
+		/* Reserve the headroom and account for the data */
+		skb_reserve(skb, NET_SKB_PAD);
+		skb_put(skb, len);
+	}
+
+	/* Now receive it, or drop it if there's no netdev */
+	if (endpoint->netdev)
+		ipa_netdev_skb_rx(endpoint->netdev, skb);
+	else if (skb)
+		dev_kfree_skb_any(skb);
+
+	/* If no socket buffer took the pages, free them */
+	if (!skb)
+		__free_pages(page, IPA_RX_BUFFER_ORDER);
+}
+
+/* Maps an exception type returned in an ipa_status_raw structure
+ * to the ipa_status_exception value that represents it in
+ * the exception field of an ipa_status structure.  Returns
+ * IPA_STATUS_EXCEPTION_MAX for an unrecognized value.
+ */
+static enum ipa_status_exception exception_map(u8 exception, bool is_ipv6)
+{
+	switch (exception) {
+	case 0x00:	return IPA_STATUS_EXCEPTION_NONE;
+	case 0x01:	return IPA_STATUS_EXCEPTION_DEAGGR;
+	case 0x04:	return IPA_STATUS_EXCEPTION_IPTYPE;
+	case 0x08:	return IPA_STATUS_EXCEPTION_PACKET_LENGTH;
+	case 0x10:	return IPA_STATUS_EXCEPTION_FRAG_RULE_MISS;
+	case 0x20:	return IPA_STATUS_EXCEPTION_SW_FILT;
+	case 0x40:	return is_ipv6 ? IPA_STATUS_EXCEPTION_IPV6CT
+				       : IPA_STATUS_EXCEPTION_NAT;
+	default:	return IPA_STATUS_EXCEPTION_MAX;
+	}
+}
+
+/* A rule miss is indicated as an all-1's value in the rt_rule_id
+ * or flt_rule_id field of the ipa_status structure.
+ */
+static bool ipa_rule_miss_id(u32 id)
+{
+	return id == field_max(IPA_STATUS_FLAGS1_RT_RULE_ID_FMASK);
+}
+
+size_t ipa_status_parse(struct ipa_status *status, void *data, u32 count)
+{
+	const struct ipa_status_raw *status_raw = data;
+	bool is_ipv6;
+	u32 val;
+
+	BUILD_BUG_ON(sizeof(*status_raw) % 4);
+	if (WARN_ON(count < sizeof(*status_raw)))
+		return 0;
+
+	status->opcode = status_raw->opcode;
+	is_ipv6 = status_raw->mask & BIT(7) ? false : true;
+	status->exception = exception_map(status_raw->exception, is_ipv6);
+	status->pkt_len = status_raw->pkt_len;
+	val = u32_get_bits(status_raw->endp_dst_idx, IPA_STATUS_DST_IDX_FMASK);
+	status->dst_endpoint = val;
+	status->metadata = status_raw->metadata;
+	val = u32_get_bits(status_raw->flags1,
+			   IPA_STATUS_FLAGS1_RT_RULE_ID_FMASK);
+	status->rt_miss = ipa_rule_miss_id(val) ? 1 : 0;
+
+	return sizeof(*status_raw);
+}
+
+/* The format of a packet status element is the same for several status
+ * types (opcodes).  The NEW_FRAG_RULE, LOG, DCMP (decompression) types
+ * aren't currently supported.
+ */
+static bool ipa_status_format_packet(enum ipa_status_opcode opcode)
+{
+	switch (opcode) {
+	case IPA_STATUS_OPCODE_PACKET:
+	case IPA_STATUS_OPCODE_DROPPED_PACKET:
+	case IPA_STATUS_OPCODE_SUSPENDED_PACKET:
+	case IPA_STATUS_OPCODE_PACKET_2ND_PASS:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static bool ipa_endpoint_status_skip(struct ipa_endpoint *endpoint,
+				     struct ipa_status *status)
+{
+	if (!ipa_status_format_packet(status->opcode))
+		return true;
+	if (!status->pkt_len)
+		return true;
+	if (status->dst_endpoint != endpoint->endpoint_id)
+		return true;
+
+	return false;	/* Don't skip this packet, process it */
+}
+
+static void ipa_endpoint_status_parse(struct ipa_endpoint *endpoint,
+				      struct page *page, u32 total_len)
+{
+	void *data = page_address(page) + NET_SKB_PAD;
+	u32 unused = IPA_RX_BUFFER_SIZE - total_len;
+	u32 resid = total_len;
+
+	while (resid) {
+		struct ipa_status status;
+		bool drop_packet = false;
+		size_t status_size;
+		u32 align;
+		u32 len;
+
+		status_size = ipa_status_parse(&status, data, resid);
+
+		/* Skip over status packets that lack packet data */
+		if (ipa_endpoint_status_skip(endpoint, &status)) {
+			data += status_size;
+			resid -= status_size;
+			continue;
+		}
+
+		/* Packet data follows the status structure.  Unless
+		 * the packet failed to match a routing rule, or it
+		 * had a deaggregation exception, we'll consume it.
+		 */
+		if (status.exception == IPA_STATUS_EXCEPTION_NONE) {
+			if (status.rt_miss)
+				drop_packet = true;
+		} else if (status.exception == IPA_STATUS_EXCEPTION_DEAGGR) {
+			drop_packet = true;
+		}
+
+		/* Compute the amount of buffer space consumed by the
+		 * packet, including the status element.  If the hardware
+		 * is configured to pad packet data to an aligned boundary,
+		 * account for that.  And if checksum offload is enabled,
+		 * a trailer containing computed checksum information will
+		 * be appended.
+		 */
+		align = endpoint->data->config.rx.pad_align ? : 1;
+		len = status_size + ALIGN(status.pkt_len, align);
+		if (endpoint->data->config.checksum)
+			len += sizeof(struct rmnet_map_dl_csum_trailer);
+
+		/* Charge the new packet with a proportional fraction of
+		 * the unused space in the original receive buffer.
+		 * XXX Charge a proportion of the *whole* receive buffer?
+		 */
+		if (!drop_packet) {
+			u32 extra = unused * len / total_len;
+			void *data2 = data + status_size;
+			u32 len2 = status.pkt_len;
+
+			/* Client receives only packet data (no status) */
+			ipa_endpoint_skb_copy(endpoint, data2, len2, extra);
+		}
+
+		/* Consume status and the full packet it describes */
+		data += len;
+		resid -= len;
+	}
+
+	__free_pages(page, IPA_RX_BUFFER_ORDER);
+}
+
+/* Complete transaction initiated in ipa_endpoint_replenish_one() */
+void ipa_endpoint_rx_callback(struct gsi_trans *trans)
+{
+	struct page *page = trans->data;
+	struct ipa_endpoint *endpoint;
+	struct ipa *ipa;
+	u32 len;
+
+	ipa = container_of(trans->gsi, struct ipa, gsi);
+	endpoint = ipa->endpoint_map[trans->channel_id];
+
+	atomic_inc(&endpoint->replenish_backlog);
+	ipa_endpoint_replenish(endpoint);
+
+	if (trans->result == -ECANCELED) {
+		__free_pages(page, IPA_RX_BUFFER_ORDER);
+		return;
+	}
+
+	/* Record the actual received length and build a socket buffer */
+	/* assert(trans->result > 0); */
+	len = trans->result;
+
+	if (endpoint->data->config.status_enable)
+		ipa_endpoint_status_parse(endpoint, page, len);
+	else
+		ipa_endpoint_skb_build(endpoint, page, len);
+}
+
+static int ipa_endpoint_replenish_one(struct ipa_endpoint *endpoint)
+{
+	struct gsi_trans *trans;
+	bool doorbell = false;
+	struct page *page;
+	u32 offset;
+	u32 len;
+
+	page = dev_alloc_pages(IPA_RX_BUFFER_ORDER);
+	if (!page)
+		return -ENOMEM;
+	offset = NET_SKB_PAD;
+	len = IPA_RX_BUFFER_SIZE - offset;
+
+	trans = gsi_channel_trans_alloc(&endpoint->ipa->gsi,
+					endpoint->channel_id, 1);
+	if (!trans)
+		goto err_page_free;
+	trans->data = page;
+
+	/* Set up and map a scatterlist entry representing the buffer */
+	sg_init_table(trans->sgl, trans->sgc);
+	sg_set_page(trans->sgl, page, len, offset);
+
+	if (++endpoint->replenish_ready == IPA_REPLENISH_BATCH) {
+		doorbell = true;
+		endpoint->replenish_ready = 0;
+	}
+
+	if (!gsi_trans_commit(trans, doorbell))
+		return 0;
+
+err_page_free:
+	__free_pages(page, IPA_RX_BUFFER_ORDER);
+
+	return -ENOMEM;
+}
+
+/**
+ * ipa_endpoint_replenish() - Replenish an endpoint's receive buffers
+ * @endpoint:	Endpoint to be replenished
+ *
+ * Allocate receive buffers for an endpoint and supply them to the
+ * hardware, which fills them with incoming data.
+ */
+static void ipa_endpoint_replenish(struct ipa_endpoint *endpoint)
+{
+	u32 backlog;
+
+	while (atomic_dec_not_zero(&endpoint->replenish_backlog))
+		if (ipa_endpoint_replenish_one(endpoint))
+			goto try_again_later;
+
+	return;
+
+try_again_later:
+	/* The last one didn't succeed, so fix the backlog */
+	backlog = atomic_inc_return(&endpoint->replenish_backlog);
+
+	/* Whenever a receive buffer transaction completes we'll try to
+	 * replenish again.  It's unlikely, but if we fail to supply even
+	 * one buffer, nothing will trigger another replenish attempt.
+	 * If this happens, schedule work to try again.
+	 */
+	if (backlog == endpoint->replenish_max)
+		schedule_delayed_work(&endpoint->replenish_work,
+				      msecs_to_jiffies(1));
+}
+
+static void ipa_endpoint_replenish_work(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct ipa_endpoint *endpoint;
+
+	endpoint = container_of(dwork, struct ipa_endpoint, replenish_work);
+
+	ipa_endpoint_replenish(endpoint);
+}
+
+static bool ipa_endpoint_set_up(struct ipa_endpoint *endpoint)
+{
+	struct ipa *ipa = endpoint->ipa;
+
+	return ipa && (ipa->set_up & BIT(endpoint->endpoint_id));
+}
+
+static void ipa_endpoint_default_route_set(struct ipa *ipa,
+					   enum ipa_endpoint_id endpoint_id)
+{
+	u32 val;
+
+	/* ROUTE_DIS is 0 */
+	val = u32_encode_bits(endpoint_id, ROUTE_DEF_PIPE_FMASK);
+	val |= ROUTE_DEF_HDR_TABLE_FMASK;
+	val |= u32_encode_bits(0, ROUTE_DEF_HDR_OFST_FMASK);
+	val |= u32_encode_bits(endpoint_id, ROUTE_FRAG_DEF_PIPE_FMASK);
+	val |= ROUTE_DEF_RETAIN_HDR_FMASK;
+
+	iowrite32(val, ipa->reg_virt + IPA_REG_ROUTE_OFFSET);
+}
+
+/**
+ * ipa_endpoint_default_route_setup() - Configure IPA default route
+ * @endpoint:	Endpoint to which exceptions should be directed
+ */
+void ipa_endpoint_default_route_setup(struct ipa_endpoint *endpoint)
+{
+	ipa_endpoint_default_route_set(endpoint->ipa, endpoint->endpoint_id);
+}
+
+/**
+ * ipa_endpoint_default_route_teardown() -
+ *			Inverse of ipa_endpoint_default_route_setup()
+ * @endpoint:	Endpoint pointer
+ */
+void ipa_endpoint_default_route_teardown(struct ipa_endpoint *endpoint)
+{
+	ipa_endpoint_default_route_set(endpoint->ipa, 0);
+}
+
+/**
+ * ipa_endpoint_stop() - Stops a GSI channel in IPA
+ * @endpoint:	Endpoint whose channel should be stopped
+ *
+ * This function implements the sequence to stop a GSI channel
+ * in IPA. This function returns when the channel is in STOP state.
+ *
+ * Return:	0 on success, negative otherwise
+ */
+int ipa_endpoint_stop(struct ipa_endpoint *endpoint)
+{
+	struct device *dev = &endpoint->ipa->pdev->dev;
+	size_t size = IPA_ENDPOINT_STOP_RX_SIZE;
+	struct gsi *gsi = &endpoint->ipa->gsi;
+	void *virt = NULL;
+	dma_addr_t addr;
+	int ret;
+	int i;
+
+	/* An RX endpoint might not stop right away.  In that case we issue
+	 * a small (1-byte) DMA command, delay for a bit (1-2 milliseconds),
+	 * and try again.  Allocate the DMA buffer in case this is needed.
+	 */
+	if (!endpoint->toward_ipa) {
+		virt = dma_alloc_coherent(dev, size, &addr, GFP_KERNEL);
+		if (!virt)
+			return -ENOMEM;
+	}
+
+	for (i = 0; i < IPA_ENDPOINT_STOP_RETRY_MAX; i++) {
+		ret = gsi_channel_stop(gsi, endpoint->channel_id);
+		if (ret != -EAGAIN)
+			break;
+
+		if (endpoint->toward_ipa)
+			continue;
+
+		/* Send a 1 byte 32-bit DMA task and try again after a delay */
+		ret = ipa_cmd_dma_task_32(endpoint->ipa, size, addr);
+		if (ret)
+			break;
+
+		usleep_range(USEC_PER_MSEC, 2 * USEC_PER_MSEC);
+	}
+	if (i >= IPA_ENDPOINT_STOP_RETRY_MAX)
+		ret = -EIO;
+
+	if (!endpoint->toward_ipa)
+		dma_free_coherent(dev, size, virt, addr);
+
+	return ret;
+}
+
+bool ipa_endpoint_enabled(struct ipa_endpoint *endpoint)
+{
+	return !!(endpoint->ipa->enabled & BIT(endpoint->endpoint_id));
+}
+
+int ipa_endpoint_enable_one(struct ipa_endpoint *endpoint)
+{
+	struct ipa *ipa = endpoint->ipa;
+	int ret;
+
+	if (WARN_ON(!ipa_endpoint_set_up(endpoint)))
+		return -EINVAL;
+
+	ret = gsi_channel_start(&ipa->gsi, endpoint->channel_id);
+	if (ret)
+		return ret;
+
+	ipa_interrupt_suspend_enable(ipa->interrupt, endpoint->endpoint_id);
+
+	if (!endpoint->toward_ipa)
+		ipa_endpoint_replenish(endpoint);
+
+	ipa->enabled |= BIT(endpoint->endpoint_id);
+
+	return 0;
+}
+
+void ipa_endpoint_disable_one(struct ipa_endpoint *endpoint)
+{
+	struct ipa *ipa = endpoint->ipa;
+	int ret;
+
+	if (WARN_ON(!ipa_endpoint_enabled(endpoint)))
+		return;
+
+	ipa_interrupt_suspend_disable(ipa->interrupt, endpoint->endpoint_id);
+
+	ret = ipa_endpoint_stop(endpoint);
+	WARN(ret, "error %d attempting to stop endpoint %u\n", ret,
+	     endpoint->endpoint_id);
+
+	if (!ret)
+		endpoint->ipa->enabled &= ~BIT(endpoint->endpoint_id);
+}
+
+static bool ipa_endpoint_aggr_active(struct ipa_endpoint *endpoint)
+{
+	u32 mask = BIT(endpoint->endpoint_id);
+	struct ipa *ipa = endpoint->ipa;
+	u32 val;
+
+	val = ioread32(ipa->reg_virt + IPA_REG_STATE_AGGR_ACTIVE_OFFSET);
+
+	return !!(val & mask);
+}
+
+static void ipa_endpoint_force_close(struct ipa_endpoint *endpoint)
+{
+	u32 mask = BIT(endpoint->endpoint_id);
+	struct ipa *ipa = endpoint->ipa;
+	u32 val;
+
+	val = u32_encode_bits(mask, PIPE_BITMAP_FMASK);
+	iowrite32(val, ipa->reg_virt + IPA_REG_AGGR_FORCE_CLOSE_OFFSET);
+}
+
+/**
+ * ipa_endpoint_reset_rx_aggr() - Reset RX endpoint with aggregation active
+ * @endpoint:	Endpoint to be reset
+ *
+ * If aggregation is active on an RX endpoint when a reset is performed
+ * on its underlying GSI channel, a special sequence of actions must be
+ * taken to ensure the IPA pipeline is properly cleared.
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+static int ipa_endpoint_reset_rx_aggr(struct ipa_endpoint *endpoint)
+{
+	struct device *dev = &endpoint->ipa->pdev->dev;
+	struct ipa *ipa = endpoint->ipa;
+	bool endpoint_suspended = false;
+	struct gsi *gsi = &ipa->gsi;
+	dma_addr_t addr;
+	u32 len = 1;
+	void *virt;
+	int ret;
+	int i;
+
+	virt = kzalloc(len, GFP_KERNEL);
+	if (!virt)
+		return -ENOMEM;
+
+	addr = dma_map_single(dev, virt, len, DMA_FROM_DEVICE);
+	if (dma_mapping_error(dev, addr)) {
+		ret = -ENOMEM;
+		goto out_free_virt;
+	}
+
+	/* Force close aggregation before issuing the reset */
+	ipa_endpoint_force_close(endpoint);
+
+	ret = gsi_channel_reset(gsi, endpoint->channel_id);
+	if (ret)
+		goto out_unmap_addr;
+
+	/* Reconfigure the channel with the doorbell engine disabled,
+	 * and poll until we know aggregation isn't active any more.
+	 */
+	gsi_channel_config(gsi, endpoint->channel_id, false);
+
+	if (ipa_endpoint_init_ctrl(endpoint, false))
+		endpoint_suspended = true;
+
+	/* Start channel and do a 1 byte read */
+	ret = gsi_channel_start(gsi, endpoint->channel_id);
+	if (ret)
+		goto out_suspend_again;
+
+	ret = gsi_trans_read_byte(gsi, endpoint->channel_id, addr);
+	if (ret)
+		goto err_stop_channel;
+
+	/* Wait for aggregation to be closed on the channel */
+	for (i = 0; i < IPA_ENDPOINT_RESET_AGGR_RETRY_MAX; i++) {
+		if (!ipa_endpoint_aggr_active(endpoint))
+			break;
+		usleep_range(USEC_PER_MSEC, 2 * USEC_PER_MSEC);
+	}
+	WARN_ON(ipa_endpoint_aggr_active(endpoint));
+
+	ret = ipa_endpoint_stop(endpoint);
+	if (ret)
+		goto out_suspend_again;
+
+	/* Finally, reset the channel again, and sleep for 1 millisecond
+	 * to complete the channel reset sequence.
+	 *
+	 * Finish by suspending the channel again (if necessary) and
+	 * re-enabling its doorbell engine.
+	 */
+	ret = gsi_channel_reset(gsi, endpoint->channel_id);
+
+	usleep_range(USEC_PER_MSEC, 2 * USEC_PER_MSEC);
+
+	goto out_suspend_again;
+
+err_stop_channel:
+	ipa_endpoint_stop(endpoint);
+out_suspend_again:
+	if (endpoint_suspended)
+		(void)ipa_endpoint_init_ctrl(endpoint, true);
+
+	/* Reconfigure the channel, with doorbell engine enabled again */
+	gsi_channel_config(gsi, endpoint->channel_id, true);
+out_unmap_addr:
+	dma_unmap_single(dev, addr, len, DMA_FROM_DEVICE);
+out_free_virt:
+	kfree(virt);
+
+	return ret;
+}
+
+static void ipa_endpoint_reset(struct ipa_endpoint *endpoint)
+{
+	u32 channel_id = endpoint->channel_id;
+	struct ipa *ipa = endpoint->ipa;
+	struct gsi *gsi = &ipa->gsi;
+	int ret;
+
+	/* For TX endpoints, or RX endpoints without aggregation active,
+	 * we only need to reset the underlying GSI channel.
+	 */
+	if (!endpoint->toward_ipa && endpoint->data->config.aggregation) {
+		if (ipa_endpoint_aggr_active(endpoint))
+			ret = ipa_endpoint_reset_rx_aggr(endpoint);
+		else
+			ret = gsi_channel_reset(gsi, channel_id);
+	} else {
+		ret = gsi_channel_reset(gsi, channel_id);
+	}
+	WARN(ret, "error %d attempting to reset channel %u\n", ret,
+	     endpoint->channel_id);
+}
+
+static bool ipa_endpoint_suspended(struct ipa_endpoint *endpoint)
+{
+	return !!(endpoint->ipa->suspended & BIT(endpoint->endpoint_id));
+}
+
+/**
+ * ipa_endpoint_suspend_aggr() - Emulate suspend interrupt
+ * @endpoint:	Endpoint on which to emulate a suspend
+ *
+ *  Emulate suspend IPA interrupt to unsuspend an endpoint suspended
+ *  with an open aggregation frame.  This is to work around a hardware
+ *  issue where the suspend interrupt will not be generated when it
+ *  should be.
+ */
+static void ipa_endpoint_suspend_aggr(struct ipa_endpoint *endpoint)
+{
+	struct ipa *ipa = endpoint->ipa;
+
+	/* Nothing to do if the endpoint doesn't have aggregation open */
+	if (!ipa_endpoint_aggr_active(endpoint))
+		return;
+
+	/* Force close aggregation */
+	ipa_endpoint_force_close(endpoint);
+
+	ipa_interrupt_simulate_suspend(ipa->interrupt);
+}
+
+/* Suspend an RX endpoint */
+void ipa_endpoint_suspend(struct ipa_endpoint *endpoint)
+{
+	struct gsi *gsi = &endpoint->ipa->gsi;
+
+	if (!ipa_endpoint_enabled(endpoint))
+		return;
+
+	if (!ipa_endpoint_init_ctrl(endpoint, true))
+		return;
+
+	/* Due to a hardware bug, a client suspended with an open
+	 * aggregation frame will not generate a SUSPEND IPA interrupt.
+	 * We work around this by force-closing the aggregation frame,
+	 * then simulating the arrival of such an interrupt.
+	 */
+	if (endpoint->data->config.aggregation)
+		ipa_endpoint_suspend_aggr(endpoint);
+
+	gsi_channel_trans_quiesce(gsi, endpoint->channel_id);
+
+	endpoint->ipa->suspended |= BIT(endpoint->endpoint_id);
+}
+
+/* Resume a suspended RX endpoint */
+void ipa_endpoint_resume(struct ipa_endpoint *endpoint)
+{
+	if (!ipa_endpoint_suspended(endpoint))
+		return;
+
+	if (!ipa_endpoint_init_ctrl(endpoint, false))
+		return;
+
+	endpoint->ipa->suspended &= ~BIT(endpoint->endpoint_id);
+}
+
+static void ipa_endpoint_program(struct ipa_endpoint *endpoint)
+{
+	if (endpoint->toward_ipa) {
+		bool delay_mode = !!endpoint->data->config.tx.delay;
+
+		(void)ipa_endpoint_init_ctrl(endpoint, delay_mode);
+		ipa_endpoint_init_hdr_ext(endpoint);
+		ipa_endpoint_init_aggr(endpoint);
+		ipa_endpoint_init_deaggr(endpoint);
+		ipa_endpoint_init_seq(endpoint);
+	} else {
+		(void)ipa_endpoint_init_ctrl(endpoint, false);
+		ipa_endpoint_init_hdr_ext(endpoint);
+		ipa_endpoint_init_aggr(endpoint);
+	}
+	ipa_endpoint_init_cfg(endpoint);
+	ipa_endpoint_init_hdr(endpoint);
+	ipa_endpoint_init_hdr_metadata_mask(endpoint);
+	ipa_endpoint_init_mode(endpoint);
+	ipa_endpoint_status(endpoint);
+}
+
+static void ipa_endpoint_setup_one(struct ipa_endpoint *endpoint)
+{
+	struct gsi *gsi = &endpoint->ipa->gsi;
+	u32 channel_id = endpoint->channel_id;
+
+	/* Only AP endpoints get configured */
+	if (endpoint->ee_id != GSI_EE_AP)
+		return;
+
+	endpoint->trans_tre_max = gsi_channel_trans_tre_max(gsi, channel_id);
+	if (!endpoint->toward_ipa) {
+		endpoint->replenish_max =
+				gsi_channel_trans_max(gsi, channel_id);
+		atomic_set(&endpoint->replenish_backlog,
+			   endpoint->replenish_max);
+		INIT_DELAYED_WORK(&endpoint->replenish_work,
+				  ipa_endpoint_replenish_work);
+	}
+
+	ipa_endpoint_program(endpoint);
+
+	endpoint->ipa->set_up |= BIT(endpoint->endpoint_id);
+}
+
+static void ipa_endpoint_teardown_one(struct ipa_endpoint *endpoint)
+{
+	if (!endpoint->toward_ipa)
+		cancel_delayed_work_sync(&endpoint->replenish_work);
+
+	ipa_endpoint_reset(endpoint);
+
+	endpoint->ipa->set_up &= ~BIT(endpoint->endpoint_id);
+}
+
+void ipa_endpoint_setup(struct ipa *ipa)
+{
+	u32 initialized = ipa->initialized;
+
+	ipa->set_up = 0;
+	while (initialized) {
+		enum ipa_endpoint_id endpoint_id = __ffs(initialized);
+
+		initialized ^= BIT(endpoint_id);
+
+		ipa_endpoint_setup_one(&ipa->endpoint[endpoint_id]);
+	}
+}
+
+void ipa_endpoint_teardown(struct ipa *ipa)
+{
+	u32 set_up = ipa->set_up;
+
+	while (set_up) {
+		enum ipa_endpoint_id endpoint_id = __fls(set_up);
+
+		set_up ^= BIT(endpoint_id);
+
+		ipa_endpoint_teardown_one(&ipa->endpoint[endpoint_id]);
+	}
+}
+
+static int ipa_endpoint_init_one(struct ipa *ipa,
+				 const struct gsi_ipa_endpoint_data *data)
+{
+	struct ipa_endpoint *endpoint;
+
+	if (data->endpoint_id >= IPA_ENDPOINT_MAX)
+		return -EIO;
+	endpoint = &ipa->endpoint[data->endpoint_id];
+
+	if (data->ee_id == GSI_EE_AP)
+		ipa->endpoint_map[data->channel_id] = endpoint;
+
+	endpoint->ipa = ipa;
+	endpoint->ee_id = data->ee_id;
+	endpoint->channel_id = data->channel_id;
+	endpoint->endpoint_id = data->endpoint_id;
+	endpoint->toward_ipa = data->toward_ipa;
+	endpoint->data = &data->endpoint;
+
+	if (endpoint->data->support_flt)
+		ipa->filter_support |= BIT(endpoint->endpoint_id);
+
+	ipa->initialized |= BIT(endpoint->endpoint_id);
+
+	return 0;
+}
+
+void ipa_endpoint_exit_one(struct ipa_endpoint *endpoint)
+{
+	endpoint->ipa->initialized &= ~BIT(endpoint->endpoint_id);
+}
+
+int ipa_endpoint_init(struct ipa *ipa, u32 data_count,
+		      const struct gsi_ipa_endpoint_data *data)
+{
+	u32 initialized;
+	int ret;
+	u32 i;
+
+	ipa->initialized = 0;
+
+	ipa->filter_support = 0;
+	for (i = 0; i < data_count; i++) {
+		ret = ipa_endpoint_init_one(ipa, &data[i]);
+		if (ret)
+			goto err_endpoint_unwind;
+	}
+	dev_dbg(&ipa->pdev->dev, "initialized 0x%08x\n", ipa->initialized);
+
+	/* Verify the bitmap of endpoints that support filtering. */
+	dev_dbg(&ipa->pdev->dev, "filter_support 0x%08x\n",
+		ipa->filter_support);
+	ret = -EINVAL;
+	if (!ipa->filter_support ||
+	    hweight32(ipa->filter_support) > IPA_SMEM_FLT_COUNT)
+		goto err_endpoint_unwind;
+
+	return 0;
+
+err_endpoint_unwind:
+	initialized = ipa->initialized;
+	while (initialized) {
+		enum ipa_endpoint_id endpoint_id = __fls(initialized);
+
+		initialized ^= BIT(endpoint_id);
+
+		ipa_endpoint_exit_one(&ipa->endpoint[endpoint_id]);
+	}
+
+	return ret;
+}
+
+void ipa_endpoint_exit(struct ipa *ipa)
+{
+	u32 initialized = ipa->initialized;
+
+	while (initialized) {
+		enum ipa_endpoint_id endpoint_id = __fls(initialized);
+
+		initialized ^= BIT(endpoint_id);
+
+		ipa_endpoint_exit_one(&ipa->endpoint[endpoint_id]);
+	}
+}
diff --git a/drivers/net/ipa/ipa_endpoint.h b/drivers/net/ipa/ipa_endpoint.h
new file mode 100644
index 000000000000..39fb02f524bc
--- /dev/null
+++ b/drivers/net/ipa/ipa_endpoint.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_ENDPOINT_H_
+#define _IPA_ENDPOINT_H_
+
+#include <linux/types.h>
+#include <linux/workqueue.h>
+#include <linux/if_ether.h>
+
+#include "gsi.h"
+#include "ipa_reg.h"
+
+struct net_device;
+struct sk_buff;
+
+struct ipa;
+struct gsi_ipa_endpoint_data;
+
+#define IPA_MTU	ETH_DATA_LEN
+
+enum ipa_endpoint_id {
+	IPA_ENDPOINT_INVALID		= 0,
+	IPA_ENDPOINT_AP_MODEM_TX	= 2,
+	IPA_ENDPOINT_MODEM_LAN_TX	= 3,
+	IPA_ENDPOINT_MODEM_COMMAND_TX	= 4,
+	IPA_ENDPOINT_AP_COMMAND_TX	= 5,
+	IPA_ENDPOINT_MODEM_AP_TX	= 6,
+	IPA_ENDPOINT_AP_LAN_RX		= 9,
+	IPA_ENDPOINT_AP_MODEM_RX	= 10,
+	IPA_ENDPOINT_MODEM_AP_RX	= 12,
+	IPA_ENDPOINT_MODEM_LAN_RX	= 13,
+};
+
+#define IPA_ENDPOINT_MAX		32	/* Max supported */
+
+/**
+ * struct ipa_endpoint - IPA endpoint information
+ * @ipa:	IPA structure the endpoint belongs to
+ * @channel_id:	EP's GSI channel
+ * @evt_ring_id: EP's GSI channel event ring
+ */
+struct ipa_endpoint {
+	struct ipa *ipa;
+	enum ipa_seq_type seq_type;
+	enum gsi_ee_id ee_id;
+	u32 channel_id;
+	enum ipa_endpoint_id endpoint_id;
+	u32 toward_ipa;		/* Boolean */
+	const struct ipa_endpoint_data *data;
+
+	u32 trans_tre_max;	/* maximum descriptors per transaction */
+	u32 evt_ring_id;
+
+	/* Net device this endpoint is associated with, if any */
+	struct net_device *netdev;
+
+	/* Receive buffer replenishing for RX endpoints */
+	u32 replenish_max;
+	u32 replenish_ready;
+	atomic_t replenish_backlog;
+	struct delayed_work replenish_work;		/* global wq */
+};
+
+bool ipa_endpoint_init_ctrl(struct ipa_endpoint *endpoint, bool suspend_delay);
+
+int ipa_endpoint_skb_tx(struct ipa_endpoint *endpoint, struct sk_buff *skb);
+
+int ipa_endpoint_stop(struct ipa_endpoint *endpoint);
+
+void ipa_endpoint_exit_one(struct ipa_endpoint *endpoint);
+
+bool ipa_endpoint_enabled(struct ipa_endpoint *endpoint);
+int ipa_endpoint_enable_one(struct ipa_endpoint *endpoint);
+void ipa_endpoint_disable_one(struct ipa_endpoint *endpoint);
+
+void ipa_endpoint_default_route_setup(struct ipa_endpoint *endpoint);
+void ipa_endpoint_default_route_teardown(struct ipa_endpoint *endpoint);
+
+void ipa_endpoint_suspend(struct ipa_endpoint *endpoint);
+void ipa_endpoint_resume(struct ipa_endpoint *endpoint);
+
+void ipa_endpoint_setup(struct ipa *ipa);
+void ipa_endpoint_teardown(struct ipa *ipa);
+
+int ipa_endpoint_init(struct ipa *ipa, u32 data_count,
+		      const struct gsi_ipa_endpoint_data *data);
+void ipa_endpoint_exit(struct ipa *ipa);
+
+void ipa_endpoint_skb_tx_callback(struct gsi_trans *trans);
+void ipa_endpoint_rx_callback(struct gsi_trans *trans);
+
+
+#endif /* _IPA_ENDPOINT_H_ */
-- 
2.20.1


* [PATCH 12/18] soc: qcom: ipa: immediate commands
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (10 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 11/18] soc: qcom: ipa: IPA endpoints Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-15  8:16   ` Arnd Bergmann
  2019-05-12  1:25 ` [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller Alex Elder
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

One TX endpoint (per EE) is used for issuing immediate commands to
the IPA.  These commands request activities beyond simple data
transfers to be done by the IPA hardware.  For example, the IPA is
able to manage routing packets among endpoints, and immediate commands
are used to configure tables used for that routing.

Immediate commands are built on top of GSI transactions.  They are
different from normal transfers (in that they use a special endpoint,
and their "payload" is interpreted differently), so separate functions
are used to issue immediate command transactions.
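
As a rough illustration of how an immediate command's payload is
interpreted differently from ordinary data, the user-space sketch below
packs the flags word of the header-init command the way
ipa_cmd_hdr_init_local() in this patch does: table size in bits 11:0
and local address in bits 27:12.  The helper names (field_max_u32(),
encode_u32(), hdr_init_flags()) are made up for this example and only
stand in for the kernel's field_max() and u32_encode_bits(); this is
not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Masks mirror IPA_CMD_HDR_INIT_FLAGS_*_FMASK in ipa_cmd.c */
#define TABLE_SIZE_FMASK	0x00000fffu	/* GENMASK(11, 0) */
#define HDR_ADDR_FMASK		0x0ffff000u	/* GENMASK(27, 12) */

/* Lowest set bit of a mask (hypothetical helper) */
static uint32_t mask_low_bit(uint32_t mask)
{
	return mask & (~mask + 1u);
}

/* Hypothetical stand-in for field_max(): largest value the field holds */
static uint32_t field_max_u32(uint32_t mask)
{
	assert(mask != 0);

	return mask / mask_low_bit(mask);
}

/* Hypothetical stand-in for u32_encode_bits(): move val into its field */
static uint32_t encode_u32(uint32_t val, uint32_t mask)
{
	return (val * mask_low_bit(mask)) & mask;
}

/* Build the flags word.  Returns 0 on success, or -1 if a value does
 * not fit its field.  shared_offset models ipa->shared_offset, which
 * is added to the caller's offset before encoding.
 */
static int hdr_init_flags(uint32_t offset, uint32_t size,
			  uint32_t shared_offset, uint32_t *flags)
{
	uint32_t max;

	/* A size of zero is allowed, as in the driver */
	if (size > field_max_u32(TABLE_SIZE_FMASK))
		return -1;

	/* Compare against max - offset so the addition cannot wrap */
	max = field_max_u32(HDR_ADDR_FMASK);
	if (offset > max || shared_offset > max - offset)
		return -1;
	offset += shared_offset;

	*flags = encode_u32(size, TABLE_SIZE_FMASK) |
		 encode_u32(offset, HDR_ADDR_FMASK);

	return 0;
}
```

Calling hdr_init_flags(0x100, 0x40, 0x200, &flags) places 0x40 in the
size field and 0x300 in the address field.  Checking each value against
the field maximum before encoding is what lets the driver return
-EINVAL rather than silently truncating a value.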

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/ipa_cmd.c | 372 ++++++++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_cmd.h | 116 ++++++++++++
 2 files changed, 488 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_cmd.c
 create mode 100644 drivers/net/ipa/ipa_cmd.h

diff --git a/drivers/net/ipa/ipa_cmd.c b/drivers/net/ipa/ipa_cmd.c
new file mode 100644
index 000000000000..7b655d6f9b7b
--- /dev/null
+++ b/drivers/net/ipa/ipa_cmd.c
@@ -0,0 +1,372 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/bitfield.h>
+
+#include "gsi.h"
+#include "gsi_trans.h"
+#include "ipa.h"
+#include "ipa_endpoint.h"
+#include "ipa_cmd.h"
+#include "ipa_mem.h"
+
+/**
+ * DOC:  IPA Immediate Commands
+ *
+ * The AP command TX endpoint is used to issue immediate commands to the IPA.
+ * An immediate command is generally used to request the IPA do something
+ * other than data transfer to another endpoint.
+ *
+ * Like other transfer requests, immediate commands are represented by GSI
+ * transactions; each command uses a single GSI TRE.  Each immediate
+ * command has a well-defined format, having a payload of a known length.
+ * This allows the transfer element's length field to be used to hold an
+ * immediate command's opcode.  The payload for a command resides in DRAM
+ * and is described by a single scatterlist entry in its transaction.
+ * Commands do not require a transaction completion callback.  To commit
+ * an immediate command transaction, either gsi_trans_commit_command() or
+ * gsi_trans_commit_command_timeout() is used.
+ */
+
+#define IPA_GSI_DMA_TASK_TIMEOUT	15	/* milliseconds */
+
+/**
+ * __ipa_cmd_timeout() - Send an immediate command with timeout
+ * @ipa:	IPA structure
+ * @opcode:	Immediate command opcode (must not be IPA_CMD_NONE)
+ * @payload:	Pointer to command payload
+ * @size:	Size of payload
+ * @timeout:	Milliseconds to wait for completion (0 waits indefinitely)
+ *
+ * This common function implements ipa_cmd() and ipa_cmd_timeout().  It
+ * allocates, initializes, and commits a transaction for the immediate
+ * command.  The transaction is committed using gsi_trans_commit_command(),
+ * or if a non-zero timeout is supplied, gsi_trans_commit_command_timeout().
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+static int __ipa_cmd_timeout(struct ipa *ipa, enum ipa_cmd_opcode opcode,
+			     void *payload, size_t size, u32 timeout)
+{
+	struct ipa_endpoint *endpoint = ipa->command_endpoint;
+	struct gsi_trans *trans;
+	int ret;
+
+	/* assert(opcode != IPA_CMD_NONE) */
+	trans = gsi_channel_trans_alloc(&ipa->gsi, endpoint->channel_id, 1);
+	if (!trans)
+		return -EBUSY;
+
+	sg_init_one(trans->sgl, payload, size);
+
+	if (timeout)
+		ret = gsi_trans_commit_command_timeout(trans, opcode, timeout);
+	else
+		ret = gsi_trans_commit_command(trans, opcode);
+	if (ret)
+		goto err_trans_free;
+
+	return 0;
+
+err_trans_free:
+	gsi_trans_free(trans);
+
+	return ret;
+}
+
+static int
+ipa_cmd(struct ipa *ipa, enum ipa_cmd_opcode opcode, void *payload, size_t size)
+{
+	return __ipa_cmd_timeout(ipa, opcode, payload, size, 0);
+}
+
+static int ipa_cmd_timeout(struct ipa *ipa, enum ipa_cmd_opcode opcode,
+			   void *payload, size_t size)
+{
+	return __ipa_cmd_timeout(ipa, opcode, payload, size,
+				 IPA_GSI_DMA_TASK_TIMEOUT);
+}
+
+/* Field masks for ipa_imm_cmd_hw_hdr_init_local structure fields */
+#define IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK	GENMASK(11, 0)
+#define IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK	GENMASK(27, 12)
+
+struct ipa_imm_cmd_hw_hdr_init_local {
+	u64 hdr_table_addr;
+	u32 flags;
+	u32 pad;	/* XXX Needed? */
+};
+
+/* Initialize header space in IPA local memory */
+int ipa_cmd_hdr_init_local(struct ipa *ipa, u32 offset, u32 size)
+{
+	struct ipa_imm_cmd_hw_hdr_init_local *payload;
+	struct device *dev = &ipa->pdev->dev;
+	dma_addr_t addr;
+	void *virt;
+	u32 flags;
+	u32 max;
+	int ret;
+
+	/* Note: size *can* be zero in this case */
+	if (size > field_max(IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK))
+		return -EINVAL;
+
+	max = field_max(IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
+	if (offset > max || ipa->shared_offset > max - offset)
+		return -EINVAL;
+	offset += ipa->shared_offset;
+
+	/* A zero-filled buffer of the right size is all that's required */
+	virt = dma_alloc_coherent(dev, size, &addr, GFP_KERNEL);
+	if (!virt)
+		return -ENOMEM;
+
+	payload = kzalloc(sizeof(*payload), GFP_KERNEL);
+	if (!payload) {
+		ret = -ENOMEM;
+		goto out_dma_free;
+	}
+
+	payload->hdr_table_addr = addr;
+	flags = u32_encode_bits(size, IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK);
+	flags |= u32_encode_bits(offset, IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
+	payload->flags = flags;
+
+	ret = ipa_cmd(ipa, IPA_CMD_HDR_INIT_LOCAL, payload, sizeof(*payload));
+
+	kfree(payload);
+out_dma_free:
+	dma_free_coherent(dev, size, virt, addr);
+
+	return ret;
+}
+
+enum ipahal_pipeline_clear_option {
+	IPAHAL_HPS_CLEAR		= 0,
+	IPAHAL_SRC_GRP_CLEAR		= 1,
+	IPAHAL_FULL_PIPELINE_CLEAR	= 2,
+};
+
+/* Field masks for ipa_imm_cmd_hw_dma_shared_mem structure fields */
+#define IPA_CMD_DMA_SHARED_FLAGS_DIRECTION_FMASK	GENMASK(0, 0)
+#define IPA_CMD_DMA_SHARED_FLAGS_SKIP_CLEAR_FMASK	GENMASK(1, 1)
+#define IPA_CMD_DMA_SHARED_FLAGS_CLEAR_OPTIONS_FMASK	GENMASK(3, 2)
+
+struct ipa_imm_cmd_hw_dma_shared_mem {
+	u16 sw_rsvd;
+	u16 size;
+	u16 local_addr;
+	u16 flags;
+	u64 system_addr;
+};
+
+/* Use a DMA command to zero a block of memory */
+int ipa_cmd_smem_dma_zero(struct ipa *ipa, u32 offset, u32 size)
+{
+	struct ipa_imm_cmd_hw_dma_shared_mem *payload;
+	struct device *dev = &ipa->pdev->dev;
+	dma_addr_t addr;
+	void *virt;
+	u32 val;
+	int ret;
+
+	/* size must be non-zero, and must fit in a 16 bit field */
+	if (!size || size > U16_MAX)
+		return -EINVAL;
+
+	/* offset must fit in a 16 bit local_addr field */
+	if (offset > U16_MAX || ipa->shared_offset > U16_MAX - offset)
+		return -EINVAL;
+	offset += ipa->shared_offset;
+
+	/* A zero-filled buffer of the right size is all that's required */
+	virt = dma_alloc_coherent(dev, size, &addr, GFP_KERNEL);
+	if (!virt)
+		return -ENOMEM;
+
+	payload = kzalloc(sizeof(*payload), GFP_KERNEL);
+	if (!payload) {
+		ret = -ENOMEM;
+		goto out_dma_free;
+	}
+
+	payload->size = size;
+	payload->local_addr = offset;
+	/* direction: 0 = write to IPA; skip clear: 0 = don't wait */
+	val = u16_encode_bits(IPAHAL_HPS_CLEAR,
+			      IPA_CMD_DMA_SHARED_FLAGS_CLEAR_OPTIONS_FMASK);
+	payload->flags = val;
+	payload->system_addr = addr;
+
+	ret = ipa_cmd(ipa, IPA_CMD_DMA_SHARED_MEM, payload, sizeof(*payload));
+
+	kfree(payload);
+out_dma_free:
+	dma_free_coherent(dev, size, virt, addr);
+
+	return ret;
+}
+
+/* Field masks for ipa_imm_cmd_hw_ip_fltrt_init structure fields */
+#define IPA_CMD_IP_FLTRT_FLAGS_HASH_SIZE_FMASK	GENMASK_ULL(11, 0)
+#define IPA_CMD_IP_FLTRT_FLAGS_HASH_ADDR_FMASK	GENMASK_ULL(27, 12)
+#define IPA_CMD_IP_FLTRT_FLAGS_NHASH_SIZE_FMASK	GENMASK_ULL(39, 28)
+#define IPA_CMD_IP_FLTRT_FLAGS_NHASH_ADDR_FMASK	GENMASK_ULL(55, 40)
+
+struct ipa_imm_cmd_hw_ip_fltrt_init {
+	u64 hash_rules_addr;
+	u64 flags;
+	u64 nhash_rules_addr;
+};
+
+/* Configure a routing or filter table, for IPv4 or IPv6 */
+static int ipa_cmd_table_config(struct ipa *ipa, enum ipa_cmd_opcode opcode,
+				dma_addr_t addr, size_t size, u32 hash_offset,
+				u32 nhash_offset)
+{
+	struct ipa_imm_cmd_hw_ip_fltrt_init *payload;
+	u64 val;
+	u32 max;
+	int ret;
+
+	if (size > field_max(IPA_CMD_IP_FLTRT_FLAGS_HASH_SIZE_FMASK))
+		return -EINVAL;
+	if (size > field_max(IPA_CMD_IP_FLTRT_FLAGS_NHASH_SIZE_FMASK))
+		return -EINVAL;
+
+	max = field_max(IPA_CMD_IP_FLTRT_FLAGS_HASH_ADDR_FMASK);
+	if (hash_offset > max || ipa->shared_offset > max - hash_offset)
+		return -EINVAL;
+	hash_offset += ipa->shared_offset;
+
+	max = field_max(IPA_CMD_IP_FLTRT_FLAGS_NHASH_ADDR_FMASK);
+	if (nhash_offset > max || ipa->shared_offset > max - nhash_offset)
+		return -EINVAL;
+	nhash_offset += ipa->shared_offset;
+
+	payload = kzalloc(sizeof(*payload), GFP_KERNEL);
+	if (!payload)
+		return -ENOMEM;
+
+	payload->hash_rules_addr = addr;
+	val = u64_encode_bits(size, IPA_CMD_IP_FLTRT_FLAGS_HASH_SIZE_FMASK);
+	val |= u64_encode_bits(hash_offset,
+			       IPA_CMD_IP_FLTRT_FLAGS_HASH_ADDR_FMASK);
+	val |= u64_encode_bits(size, IPA_CMD_IP_FLTRT_FLAGS_NHASH_SIZE_FMASK);
+	val |= u64_encode_bits(nhash_offset,
+			       IPA_CMD_IP_FLTRT_FLAGS_NHASH_ADDR_FMASK);
+	payload->flags = val;
+	payload->nhash_rules_addr = addr;
+
+	ret = ipa_cmd(ipa, opcode, payload, sizeof(*payload));
+
+	kfree(payload);
+
+	return ret;
+}
+
+/* Configure IPv4 routing table */
+int ipa_cmd_route_config_ipv4(struct ipa *ipa, size_t size)
+{
+	enum ipa_cmd_opcode opcode = IPA_CMD_IP_V4_ROUTING_INIT;
+	u32 nhash_offset = IPA_SMEM_V4_RT_NHASH_OFFSET;
+	u32 hash_offset = IPA_SMEM_V4_RT_HASH_OFFSET;
+	dma_addr_t addr = ipa->route_addr;
+
+	return ipa_cmd_table_config(ipa, opcode, addr, size, hash_offset,
+				    nhash_offset);
+}
+
+/* Configure IPv6 routing table */
+int ipa_cmd_route_config_ipv6(struct ipa *ipa, size_t size)
+{
+	enum ipa_cmd_opcode opcode = IPA_CMD_IP_V6_ROUTING_INIT;
+	u32 nhash_offset = IPA_SMEM_V6_RT_NHASH_OFFSET;
+	u32 hash_offset = IPA_SMEM_V6_RT_HASH_OFFSET;
+	dma_addr_t addr = ipa->route_addr;
+
+	return ipa_cmd_table_config(ipa, opcode, addr, size, hash_offset,
+				    nhash_offset);
+}
+
+/* Configure IPv4 filter table */
+int ipa_cmd_filter_config_ipv4(struct ipa *ipa, size_t size)
+{
+	enum ipa_cmd_opcode opcode = IPA_CMD_IP_V4_FILTER_INIT;
+	u32 nhash_offset = IPA_SMEM_V4_FLT_NHASH_OFFSET;
+	u32 hash_offset = IPA_SMEM_V4_FLT_HASH_OFFSET;
+	dma_addr_t addr = ipa->filter_addr;
+
+	return ipa_cmd_table_config(ipa, opcode, addr, size, hash_offset,
+				    nhash_offset);
+}
+
+/* Configure IPv6 filter table */
+int ipa_cmd_filter_config_ipv6(struct ipa *ipa, size_t size)
+{
+	enum ipa_cmd_opcode opcode = IPA_CMD_IP_V6_FILTER_INIT;
+	u32 nhash_offset = IPA_SMEM_V6_FLT_NHASH_OFFSET;
+	u32 hash_offset = IPA_SMEM_V6_FLT_HASH_OFFSET;
+	dma_addr_t addr = ipa->filter_addr;
+
+	return ipa_cmd_table_config(ipa, opcode, addr, size, hash_offset,
+				    nhash_offset);
+}
+
+/* Field masks for ipa_imm_cmd_hw_dma_task_32b_addr structure fields */
+#define IPA_CMD_DMA32_TASK_SW_RSVD_FMASK	GENMASK(10, 0)
+#define IPA_CMD_DMA32_TASK_CMPLT_FMASK		GENMASK(11, 11)
+#define IPA_CMD_DMA32_TASK_EOF_FMASK		GENMASK(12, 12)
+#define IPA_CMD_DMA32_TASK_FLSH_FMASK		GENMASK(13, 13)
+#define IPA_CMD_DMA32_TASK_LOCK_FMASK		GENMASK(14, 14)
+#define IPA_CMD_DMA32_TASK_UNLOCK_FMASK		GENMASK(15, 15)
+#define IPA_CMD_DMA32_SIZE1_FMASK		GENMASK(31, 16)
+#define IPA_CMD_DMA32_PACKET_SIZE_FMASK		GENMASK(15, 0)
+
+struct ipa_imm_cmd_hw_dma_task_32b_addr {
+	u32 size1_flags;
+	u32 addr1;
+	u32 packet_size;
+	u32 rsvd;
+};
+
+/* Use a 32-bit DMA command to zero a block of memory */
+int ipa_cmd_dma_task_32(struct ipa *ipa, size_t size, dma_addr_t addr)
+{
+	struct ipa_imm_cmd_hw_dma_task_32b_addr *payload;
+	u32 size1_flags;
+	int ret;
+
+	if (size > field_max(IPA_CMD_DMA32_SIZE1_FMASK))
+		return -EINVAL;
+	if (size > field_max(IPA_CMD_DMA32_PACKET_SIZE_FMASK))
+		return -EINVAL;
+
+	payload = kzalloc(sizeof(*payload), GFP_KERNEL);
+	if (!payload)
+		return -ENOMEM;
+
+	/* complete: 0 = don't interrupt; eof: 0 = don't assert eot */
+	size1_flags = IPA_CMD_DMA32_TASK_FLSH_FMASK;
+	/* lock: 0 = don't lock endpoint; unlock: 0 = don't unlock */
+	size1_flags |= u32_encode_bits(size, IPA_CMD_DMA32_SIZE1_FMASK);
+
+	payload->size1_flags = size1_flags;
+	payload->addr1 = addr;
+	payload->packet_size =
+			u32_encode_bits(size, IPA_CMD_DMA32_PACKET_SIZE_FMASK);
+
+	ret = ipa_cmd_timeout(ipa, IPA_CMD_DMA_TASK_32B_ADDR, payload,
+			      sizeof(*payload));
+
+	kfree(payload);
+
+	return ret;
+}
diff --git a/drivers/net/ipa/ipa_cmd.h b/drivers/net/ipa/ipa_cmd.h
new file mode 100644
index 000000000000..dadb9d92067a
--- /dev/null
+++ b/drivers/net/ipa/ipa_cmd.h
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_CMD_H_
+#define _IPA_CMD_H_
+
+#include <linux/types.h>
+
+struct sk_buff;
+
+struct ipa;
+
+/**
+ * enum ipa_cmd_opcode:	IPA immediate commands
+ *
+ * All immediate commands are issued using the AP command TX endpoint.
+ * The numeric values here are the opcodes for IPA v3.5.1 hardware.
+ *
+ * IPA_CMD_NONE is a special (invalid) value that's used to indicate
+ * a request is *not* an immediate command.
+ */
+enum ipa_cmd_opcode {
+	IPA_CMD_NONE			= 0,
+	IPA_CMD_IP_V4_FILTER_INIT	= 3,
+	IPA_CMD_IP_V6_FILTER_INIT	= 4,
+	IPA_CMD_IP_V4_ROUTING_INIT	= 7,
+	IPA_CMD_IP_V6_ROUTING_INIT	= 8,
+	IPA_CMD_HDR_INIT_LOCAL		= 9,
+	IPA_CMD_DMA_TASK_32B_ADDR	= 17,
+	IPA_CMD_DMA_SHARED_MEM		= 19,
+};
+
+/**
+ * ipa_cmd_hdr_init_local() - Initialize header space in IPA local memory
+ * @ipa:	IPA structure
+ * @offset:	Offset of memory to be initialized
+ * @size:	Size of memory to be initialized
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * Defines the location of a block of local memory to use for
+ * headers and fills it with zeroes.
+ */
+int ipa_cmd_hdr_init_local(struct ipa *ipa, u32 offset, u32 size);
+
+/**
+ * ipa_cmd_smem_dma_zero() - Use a DMA command to zero a block of memory
+ * @ipa:	IPA structure
+ * @offset:	Offset of memory to be zeroed
+ * @size:	Size in bytes of memory to be zeroed
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+int ipa_cmd_smem_dma_zero(struct ipa *ipa, u32 offset, u32 size);
+
+/**
+ * ipa_cmd_route_config_ipv4() - Configure IPv4 routing table
+ * @ipa:	IPA structure
+ * @size:	Size in bytes of table
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * Defines the location and size of the IPv4 routing table and
+ * zeroes its content.
+ */
+int ipa_cmd_route_config_ipv4(struct ipa *ipa, size_t size);
+
+/**
+ * ipa_cmd_route_config_ipv6() - Configure IPv6 routing table
+ * @ipa:	IPA structure
+ * @size:	Size in bytes of table
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * Defines the location and size of the IPv6 routing table and
+ * zeroes its content.
+ */
+int ipa_cmd_route_config_ipv6(struct ipa *ipa, size_t size);
+
+/**
+ * ipa_cmd_filter_config_ipv4() - Configure IPv4 filter table
+ * @ipa:	IPA structure
+ * @size:	Size in bytes of table
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * Defines the location and size of the IPv4 filter table and
+ * zeroes its content.
+ */
+int ipa_cmd_filter_config_ipv4(struct ipa *ipa, size_t size);
+
+/**
+ * ipa_cmd_filter_config_ipv6() - Configure IPv6 filter table
+ * @ipa:	IPA structure
+ * @size:	Size in bytes of table
+ *
+ * Return:	0 if successful, or a negative error code
+ *
+ * Defines the location and size of the IPv6 filter table and
+ * zeroes its content.
+ */
+int ipa_cmd_filter_config_ipv6(struct ipa *ipa, size_t size);
+
+/**
+ * ipa_cmd_dma_task_32() - Use a 32-bit DMA command to zero a block of memory
+ * @ipa:	IPA structure
+ * @size:	Size of memory to be zeroed
+ * @addr:	DMA address defining start of range to be zeroed
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+int ipa_cmd_dma_task_32(struct ipa *ipa, size_t size, dma_addr_t addr);
+
+#endif /* _IPA_CMD_H_ */
-- 
2.20.1


* [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (11 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 12/18] soc: qcom: ipa: immediate commands Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-15  8:21   ` Arnd Bergmann
  2019-05-12  1:25 ` [PATCH 14/18] soc: qcom: ipa: AP/modem communications Alex Elder
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch includes the code that implements a Linux network device,
using one TX and one RX IPA endpoint.  It is used to implement the
network device representing the modem and its connection to wireless
networks.  There are only a few things that are really modem-specific
though, and they aren't clearly called out here.  Such distinctions
will be made clearer if we wish to support a network device for
anything other than the modem.

Sort of unrelated, this patch also includes the code supporting the
microcontroller CPU present on the IPA.  The microcontroller can be
used to implement special handling of packets, but at this time we
don't support that.  Still, it is a component that needs to be
initialized, and in the event of a crash we need to do some
synchronization between the AP and the microcontroller.
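
To make the relationship between the network device and its two
endpoints concrete, here is a small user-space model of the open/stop
ordering used by ipa_netdev_open() and ipa_netdev_stop() in this patch:
RX is enabled before TX, a TX failure unwinds RX, and stop disables TX
before RX.  The types and functions (struct endpoint, struct
netdev_model, endpoint_enable() and so on) are invented for this sketch
and are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy endpoint; setting 'fail' makes enabling it return an error */
struct endpoint {
	bool enabled;
	bool fail;
};

/* Toy network device owning one TX and one RX endpoint */
struct netdev_model {
	struct endpoint tx;
	struct endpoint rx;
	bool queue_started;
};

static int endpoint_enable(struct endpoint *ep)
{
	if (ep->fail)
		return -1;
	ep->enabled = true;

	return 0;
}

static void endpoint_disable(struct endpoint *ep)
{
	ep->enabled = false;
}

/* Same order as ipa_netdev_open(): RX first, then TX */
static int netdev_open(struct netdev_model *nd)
{
	int ret;

	assert(!nd->queue_started);	/* opening twice is a caller bug */

	ret = endpoint_enable(&nd->rx);
	if (ret)
		return ret;

	ret = endpoint_enable(&nd->tx);
	if (ret) {
		endpoint_disable(&nd->rx);	/* unwind the RX enable */
		return ret;
	}

	nd->queue_started = true;

	return 0;
}

/* Same order as ipa_netdev_stop(): stop the queue, then TX, then RX */
static void netdev_stop(struct netdev_model *nd)
{
	nd->queue_started = false;
	endpoint_disable(&nd->tx);
	endpoint_disable(&nd->rx);
}
```

Stopping the queue before disabling TX means no new transmit requests
arrive while the endpoints are being shut down.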

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/ipa_netdev.c | 250 +++++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_netdev.h |  24 ++++
 drivers/net/ipa/ipa_uc.c     | 208 +++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_uc.h     |  32 +++++
 4 files changed, 514 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_netdev.c
 create mode 100644 drivers/net/ipa/ipa_netdev.h
 create mode 100644 drivers/net/ipa/ipa_uc.c
 create mode 100644 drivers/net/ipa/ipa_uc.h

diff --git a/drivers/net/ipa/ipa_netdev.c b/drivers/net/ipa/ipa_netdev.c
new file mode 100644
index 000000000000..18e6a9de401a
--- /dev/null
+++ b/drivers/net/ipa/ipa_netdev.c
@@ -0,0 +1,250 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2014-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+/* Modem Transport Network Driver. */
+
+#include <linux/errno.h>
+#include <linux/if_arp.h>
+#include <linux/netdevice.h>
+#include <linux/skbuff.h>
+#include <soc/qcom/rmnet.h>
+
+#include "ipa.h"
+#include "ipa_data.h"
+#include "ipa_endpoint.h"
+#include "ipa_mem.h"
+#include "ipa_netdev.h"
+#include "ipa_qmi.h"
+
+#define IPA_NETDEV_NAME		"rmnet_ipa%d"
+
+#define TAILROOM		0	/* for padding by mux layer */
+
+#define IPA_NETDEV_TIMEOUT	10	/* seconds */
+
+/** struct ipa_priv - IPA network device private data */
+struct ipa_priv {
+	struct ipa_endpoint *tx_endpoint;
+	struct ipa_endpoint *rx_endpoint;
+};
+
+/** ipa_netdev_open() - Opens the modem network interface */
+static int ipa_netdev_open(struct net_device *netdev)
+{
+	struct ipa_priv *priv = netdev_priv(netdev);
+	int ret;
+
+	ret = ipa_endpoint_enable_one(priv->rx_endpoint);
+	if (ret)
+		return ret;
+	ret = ipa_endpoint_enable_one(priv->tx_endpoint);
+	if (ret)
+		goto err_disable_rx;
+
+	netif_start_queue(netdev);
+
+	return 0;
+
+err_disable_rx:
+	ipa_endpoint_disable_one(priv->rx_endpoint);
+
+	return ret;
+}
+
+/** ipa_netdev_stop() - Stops the modem network interface. */
+static int ipa_netdev_stop(struct net_device *netdev)
+{
+	struct ipa_priv *priv = netdev_priv(netdev);
+
+	netif_stop_queue(netdev);
+
+	ipa_endpoint_disable_one(priv->tx_endpoint);
+	ipa_endpoint_disable_one(priv->rx_endpoint);
+
+	return 0;
+}
+
+/** ipa_netdev_xmit() - Transmits an skb.
+ *
+ * @skb: skb to be transmitted
+ * @netdev: network device
+ *
+ * Return codes:
+ * NETDEV_TX_OK: Success
+ * NETDEV_TX_BUSY: Error while transmitting the skb. Try again later
+ */
+static int ipa_netdev_xmit(struct sk_buff *skb, struct net_device *netdev)
+{
+	struct net_device_stats *stats = &netdev->stats;
+	struct ipa_priv *priv = netdev_priv(netdev);
+	struct ipa_endpoint *endpoint;
+	u32 skb_len = skb->len;
+
+	if (!skb_len)
+		goto err_drop;
+
+	endpoint = priv->tx_endpoint;
+	if (endpoint->data->config.qmap && skb->protocol != htons(ETH_P_MAP))
+		goto err_drop;
+
+	if (ipa_endpoint_skb_tx(endpoint, skb))
+		return NETDEV_TX_BUSY;
+
+	stats->tx_packets++;
+	stats->tx_bytes += skb_len;
+
+	return NETDEV_TX_OK;
+
+err_drop:
+	dev_kfree_skb_any(skb);
+	stats->tx_dropped++;
+
+	return NETDEV_TX_OK;
+}
+
+void ipa_netdev_skb_rx(struct net_device *netdev, struct sk_buff *skb)
+{
+	struct net_device_stats *stats = &netdev->stats;
+
+	if (skb) {
+		skb->dev = netdev;
+		skb->protocol = htons(ETH_P_MAP);
+		stats->rx_packets++;
+		stats->rx_bytes += skb->len;
+
+		(void)netif_receive_skb(skb);
+	} else {
+		stats->rx_dropped++;
+	}
+}
+
+static const struct net_device_ops ipa_netdev_ops = {
+	.ndo_open	= ipa_netdev_open,
+	.ndo_stop	= ipa_netdev_stop,
+	.ndo_start_xmit	= ipa_netdev_xmit,
+};
+
+/** netdev_setup() - netdev setup function  */
+static void netdev_setup(struct net_device *netdev)
+{
+	netdev->netdev_ops = &ipa_netdev_ops;
+	ether_setup(netdev);
+	/* No header ops (override value set by ether_setup()) */
+	netdev->header_ops = NULL;
+	netdev->type = ARPHRD_RAWIP;
+	netdev->hard_header_len = 0;
+	netdev->max_mtu = IPA_MTU;
+	netdev->mtu = netdev->max_mtu;
+	netdev->addr_len = 0;
+	netdev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
+	/* The endpoint is configured for QMAP */
+	netdev->needed_headroom = sizeof(struct rmnet_map_header);
+	netdev->needed_tailroom = TAILROOM;
+	netdev->watchdog_timeo = IPA_NETDEV_TIMEOUT * HZ;
+	netdev->hw_features = NETIF_F_SG;
+}
+
+/** ipa_netdev_suspend() - suspend the modem network device
+ * @netdev: network device
+ *
+ * Called when an AP suspend operation is invoked, usually by pressing a
+ * suspend button.
+ *
+ * Stops the network interface transmit queue so no new packets are
+ * handed to the driver, then suspends the RX endpoint.  The TX endpoint
+ * is not touched here.
+ *
+ * This function returns no value, so it cannot postpone the suspend
+ * operation; it does not wait for packets already queued for
+ * transmission.
+ */
+void ipa_netdev_suspend(struct net_device *netdev)
+{
+	struct ipa_priv *priv = netdev_priv(netdev);
+
+	netif_stop_queue(netdev);
+
+	ipa_endpoint_suspend(priv->rx_endpoint);
+}
+
+/** ipa_netdev_resume() - resume the modem network device
+ * @netdev: network device
+ *
+ * Called when an AP resume operation is invoked.  Inverse of
+ * ipa_netdev_suspend().
+ *
+ * Resumes the RX endpoint, then wakes the network interface transmit
+ * queue.  This function returns no value.
+ */
+void ipa_netdev_resume(struct net_device *netdev)
+{
+	struct ipa_priv *priv = netdev_priv(netdev);
+
+	ipa_endpoint_resume(priv->rx_endpoint);
+
+	netif_wake_queue(netdev);
+}
+
+struct net_device *ipa_netdev_setup(struct ipa *ipa,
+				    struct ipa_endpoint *rx_endpoint,
+				    struct ipa_endpoint *tx_endpoint)
+{
+	struct net_device *netdev;
+	struct ipa_priv *priv;
+	int ret;
+
+	/* Zero modem shared memory before we begin */
+	ret = ipa_smem_zero_modem(ipa);
+	if (ret)
+		return ERR_PTR(ret);
+
+	/* Start QMI communication with the modem */
+	ret = ipa_qmi_setup(ipa);
+	if (ret)
+		return ERR_PTR(ret);
+
+	netdev = alloc_netdev(sizeof(struct ipa_priv), IPA_NETDEV_NAME,
+			      NET_NAME_UNKNOWN, netdev_setup);
+	if (!netdev) {
+		ret = -ENOMEM;
+		goto err_qmi_exit;
+	}
+
+	rx_endpoint->netdev = netdev;
+	tx_endpoint->netdev = netdev;
+
+	priv = netdev_priv(netdev);
+	priv->rx_endpoint = rx_endpoint;
+	priv->tx_endpoint = tx_endpoint;
+
+	ret = register_netdev(netdev);
+	if (ret)
+		goto err_free_netdev;
+
+	return netdev;
+
+err_free_netdev:
+	free_netdev(netdev);
+err_qmi_exit:
+	ipa_qmi_teardown(ipa);
+
+	return ERR_PTR(ret);
+}
+
+void ipa_netdev_teardown(struct net_device *netdev)
+{
+	struct ipa_priv *priv = netdev_priv(netdev);
+	struct ipa *ipa = priv->tx_endpoint->ipa;
+
+	if (!netif_queue_stopped(netdev))
+		(void)ipa_netdev_stop(netdev);
+
+	unregister_netdev(netdev);
+
+	free_netdev(netdev);
+
+	ipa_qmi_teardown(ipa);
+}
diff --git a/drivers/net/ipa/ipa_netdev.h b/drivers/net/ipa/ipa_netdev.h
new file mode 100644
index 000000000000..8ab1e8ea0b4a
--- /dev/null
+++ b/drivers/net/ipa/ipa_netdev.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_NETDEV_H_
+#define _IPA_NETDEV_H_
+
+struct ipa;
+struct ipa_endpoint;
+struct net_device;
+struct sk_buff;
+
+struct net_device *ipa_netdev_setup(struct ipa *ipa,
+				    struct ipa_endpoint *rx_endpoint,
+				    struct ipa_endpoint *tx_endpoint);
+void ipa_netdev_teardown(struct net_device *netdev);
+
+void ipa_netdev_skb_rx(struct net_device *netdev, struct sk_buff *skb);
+
+void ipa_netdev_suspend(struct net_device *netdev);
+void ipa_netdev_resume(struct net_device *netdev);
+
+#endif /* _IPA_NETDEV_H_ */
diff --git a/drivers/net/ipa/ipa_uc.c b/drivers/net/ipa/ipa_uc.c
new file mode 100644
index 000000000000..57256d1c3b90
--- /dev/null
+++ b/drivers/net/ipa/ipa_uc.c
@@ -0,0 +1,208 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/io.h>
+#include <linux/delay.h>
+
+#include "ipa.h"
+#include "ipa_clock.h"
+#include "ipa_uc.h"
+
+/**
+ * DOC:  The IPA embedded microcontroller
+ *
+ * The IPA incorporates a microcontroller that is able to do some additional
+ * handling/offloading of network activity.  The current code makes
+ * essentially no use of the microcontroller, but it still requires some
+ * initialization.  It needs to be notified in the event the AP crashes.
+ *
+ * The microcontroller can generate two interrupts to the AP.  One interrupt
+ * is used to indicate that a response to a request from the AP is available.
+ * The other is used to notify the AP of the occurrence of an event.  In
+ * addition, the AP can interrupt the microcontroller by writing a register.
+ *
+ * A 128 byte block of structured memory within the IPA SRAM is used together
+ * with these interrupts to implement the communication interface between the
+ * AP and the IPA microcontroller.  Each side writes data to the shared area
+ * before interrupting its peer, which will read the written data in response
+ * to the interrupt.  Some information found in the shared area is currently
+ * unused.  All remaining space in the shared area is reserved, and must not
+ * be read or written by the AP.
+ */
+/* Supports hardware interface version 0x2000 */
+
+/* Offset relative to the base of the IPA shared address space of the
+ * shared region used for communication with the microcontroller.  The
+ * region is 128 bytes in size, but only the first 40 bytes are used.
+ */
+#define IPA_SMEM_UC_OFFSET	0x0000
+
+/* Delay to allow the microcontroller to save state when crashing */
+#define IPA_SEND_DELAY		100	/* microseconds */
+
+/**
+ * struct ipa_uc_shared_area - AP/microcontroller shared memory area
+ * @command:		command code (AP->microcontroller)
+ * @command_param:	low 32 bits of command parameter (AP->microcontroller)
+ * @command_param_hi:	high 32 bits of command parameter (AP->microcontroller)
+ *
+ * @response:		response code (microcontroller->AP)
+ * @response_param:	response parameter (microcontroller->AP)
+ *
+ * @event:		event code (microcontroller->AP)
+ * @event_param:	event parameter (microcontroller->AP)
+ *
+ * @first_error_address: address of first error-source on SNOC
+ * @hw_state:		state of hardware (including error type information)
+ * @warning_counter:	counter of non-fatal hardware errors
+ * @interface_version:	hardware-reported interface version
+ */
+struct ipa_uc_shared_area {
+	u8 command;		/* enum ipa_uc_command */
+	u8 reserved0[3];
+	__le32 command_param;
+	__le32 command_param_hi;
+	u8 response;		/* enum ipa_uc_response */
+	u8 reserved1[3];
+	__le32 response_param;
+	u8 event;		/* enum ipa_uc_event */
+	u8 reserved2[3];
+
+	__le32 event_param;
+	__le32 first_error_address;
+	u8 hw_state;
+	u8 warning_counter;
+	__le16 reserved3;
+	__le16 interface_version;
+	__le16 reserved4;
+};
+
+/** enum ipa_uc_command - commands from the AP to the microcontroller */
+enum ipa_uc_command {
+	IPA_UC_COMMAND_NO_OP		= 0,
+	IPA_UC_COMMAND_UPDATE_FLAGS	= 1,
+	IPA_UC_COMMAND_DEBUG_RUN_TEST	= 2,
+	IPA_UC_COMMAND_DEBUG_GET_INFO	= 3,
+	IPA_UC_COMMAND_ERR_FATAL	= 4,
+	IPA_UC_COMMAND_CLK_GATE		= 5,
+	IPA_UC_COMMAND_CLK_UNGATE	= 6,
+	IPA_UC_COMMAND_MEMCPY		= 7,
+	IPA_UC_COMMAND_RESET_PIPE	= 8,
+	IPA_UC_COMMAND_REG_WRITE	= 9,
+	IPA_UC_COMMAND_GSI_CH_EMPTY	= 10,
+};
+
+/** enum ipa_uc_response - microcontroller response codes */
+enum ipa_uc_response {
+	IPA_UC_RESPONSE_NO_OP		= 0,
+	IPA_UC_RESPONSE_INIT_COMPLETED	= 1,
+	IPA_UC_RESPONSE_CMD_COMPLETED	= 2,
+	IPA_UC_RESPONSE_DEBUG_GET_INFO	= 3,
+};
+
+/** enum ipa_uc_event - common cpu events reported by the microcontroller */
+enum ipa_uc_event {
+	IPA_UC_EVENT_NO_OP     = 0,
+	IPA_UC_EVENT_ERROR     = 1,
+	IPA_UC_EVENT_LOG_INFO  = 2,
+};
+
+/* Microcontroller event IPA interrupt handler */
+static void ipa_uc_event_handler(struct ipa *ipa,
+				 enum ipa_interrupt_id interrupt_id)
+{
+	struct ipa_uc_shared_area *shared;
+
+	shared = ipa->shared_virt + IPA_SMEM_UC_OFFSET;
+	dev_err(&ipa->pdev->dev, "unsupported microcontroller event %hhu\n",
+		shared->event);
+	WARN_ON(shared->event == IPA_UC_EVENT_ERROR);
+}
+
+/* Microcontroller response IPA interrupt handler */
+static void ipa_uc_response_hdlr(struct ipa *ipa,
+				 enum ipa_interrupt_id interrupt_id)
+{
+	struct ipa_uc_shared_area *shared;
+
+	/* An INIT_COMPLETED response message is sent to the AP by the
+	 * microcontroller when it is operational.  Other than this, the AP
+	 * should only receive responses from the microcontroller when it has
+	 * sent it a request message.
+	 *
+	 * We can drop the clock reference taken in ipa_uc_setup() once we
+	 * know the microcontroller has finished its initialization.
+	 */
+	shared = ipa->shared_virt + IPA_SMEM_UC_OFFSET;
+	switch (shared->response) {
+	case IPA_UC_RESPONSE_INIT_COMPLETED:
+		ipa->uc_loaded = 1;
+		ipa_clock_put(ipa->clock);
+		break;
+	default:
+		dev_warn(&ipa->pdev->dev,
+			 "unsupported microcontroller response %hhu\n",
+			 shared->response);
+		break;
+	}
+}
+
+/* ipa_uc_setup() - Set up the microcontroller */
+void ipa_uc_setup(struct ipa *ipa)
+{
+	/* The microcontroller needs the IPA clock running until it has
+	 * completed its initialization.  It signals this by sending an
+	 * INIT_COMPLETED response message to the AP.  This could occur after
+	 * we have finished doing the rest of the IPA initialization, so we
+	 * need to take an extra "proxy" reference, and hold it until we've
+	 * received that signal.  (This reference is dropped in
+	 * ipa_uc_response_hdlr(), above.)
+	 */
+	ipa_clock_get(ipa->clock);
+
+	ipa->uc_loaded = 0;
+	ipa_interrupt_add(ipa->interrupt, IPA_INTERRUPT_UC_0,
+			  ipa_uc_event_handler);
+	ipa_interrupt_add(ipa->interrupt, IPA_INTERRUPT_UC_1,
+			  ipa_uc_response_hdlr);
+}
+
+/* Inverse of ipa_uc_setup() */
+void ipa_uc_teardown(struct ipa *ipa)
+{
+	ipa_interrupt_remove(ipa->interrupt, IPA_INTERRUPT_UC_1);
+	ipa_interrupt_remove(ipa->interrupt, IPA_INTERRUPT_UC_0);
+	if (!ipa->uc_loaded)
+		ipa_clock_put(ipa->clock);
+}
+
+/* Send a command to the microcontroller */
+static void send_uc_command(struct ipa *ipa, u32 command, u32 command_param)
+{
+	struct ipa_uc_shared_area *shared;
+
+	shared = ipa->shared_virt + IPA_SMEM_UC_OFFSET;
+	shared->command = command;
+	shared->command_param = cpu_to_le32(command_param);
+	shared->command_param_hi = 0;
+	shared->response = 0;
+	shared->response_param = 0;
+
+	iowrite32(1, ipa->reg_virt + IPA_REG_IRQ_UC_OFFSET);
+}
+
+/* Tell the microcontroller the AP has crashed */
+void ipa_uc_panic_notifier(struct ipa *ipa)
+{
+	if (!ipa->uc_loaded)
+		return;
+
+	send_uc_command(ipa, IPA_UC_COMMAND_ERR_FATAL, 0);
+
+	/* give uc enough time to save state */
+	udelay(IPA_SEND_DELAY);
+}
diff --git a/drivers/net/ipa/ipa_uc.h b/drivers/net/ipa/ipa_uc.h
new file mode 100644
index 000000000000..c258cb6e1161
--- /dev/null
+++ b/drivers/net/ipa/ipa_uc.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_UC_H_
+#define _IPA_UC_H_
+
+struct ipa;
+
+/**
+ * ipa_uc_setup() - set up the IPA microcontroller subsystem
+ * @ipa:	IPA pointer
+ */
+void ipa_uc_setup(struct ipa *ipa);
+
+/**
+ * ipa_uc_teardown() - inverse of ipa_uc_setup()
+ * @ipa:	IPA pointer
+ */
+void ipa_uc_teardown(struct ipa *ipa);
+
+/**
+ * ipa_uc_panic_notifier() - notify the microcontroller of a system crash
+ * @ipa:	IPA pointer
+ *
+ * Notifier function called when the system crashes, to inform the
+ * microcontroller of the event.
+ */
+void ipa_uc_panic_notifier(struct ipa *ipa);
+
+#endif /* _IPA_UC_H_ */
-- 
2.20.1



* [PATCH 14/18] soc: qcom: ipa: AP/modem communications
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (12 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-12  1:25 ` [PATCH 15/18] soc: qcom: ipa: support build of IPA code Alex Elder
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

This patch implements two forms of out-of-band communication between
the AP and modem.

  - QMI is a mechanism that allows clients running on the AP to
    interact with services running on the modem (and vice-versa).
    The AP IPA driver uses QMI to communicate with the corresponding
    IPA driver resident on the modem, to agree on parameters used
    with the IPA hardware and to ensure both sides are ready before
    entering operational mode.

  - SMP2P is a more primitive mechanism available for the modem and
    AP to communicate with each other.  It provides a means for either
    the AP or modem to interrupt the other, and furthermore, to provide
    32 bits worth of information.  The IPA driver uses SMP2P to tell
    the modem what the state of the IPA clock was in the event of a
    crash.  This allows the modem to safely access the IPA hardware
    (or avoid doing so) when a crash occurs, for example, to access
    information within the IPA hardware.

Signed-off-by: Alex Elder <elder@linaro.org>
---
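Not part of the patch: the handshake gating done by
ipa_handshake_complete() in ipa_qmi.c can be sketched in plain
user-space C as below.  This is a minimal illustration under stated
assumptions, not the driver code.  The names struct handshake,
enum handshake_event and handshake_event() are hypothetical, and a
counter stands in for the call to qmi_send_indication().  It shows
that the final indication goes out on whichever of the two events
arrives second, in either order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two flags kept in struct ipa_qmi */
struct handshake {
	bool indication_register_received;	/* modem request seen */
	bool init_driver_response_received;	/* modem response seen */
	unsigned int indications_sent;		/* final messages sent */
};

/* Hypothetical names for the two events that gate the handshake */
enum handshake_event {
	EVENT_INDICATION_REGISTER,	/* request from the modem */
	EVENT_INIT_DRIVER_RESPONSE,	/* response to INIT_DRIVER */
};

/* Record one event; return true if this event completed the handshake */
static bool handshake_event(struct handshake *hs, enum handshake_event event)
{
	bool other_seen;

	assert(hs);

	if (event == EVENT_INIT_DRIVER_RESPONSE) {
		hs->init_driver_response_received = true;
		other_seen = hs->indication_register_received;
	} else {
		hs->indication_register_received = true;
		other_seen = hs->init_driver_response_received;
	}
	if (!other_seen)
		return false;	/* still waiting for the other event */

	/* Assumption: this counter replaces qmi_send_indication() */
	hs->indications_sent++;

	return true;
}
```

Calling handshake_event() once for each event on a zeroed
struct handshake returns false for the first call and true for the
second, whichever event comes first.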
 drivers/net/ipa/ipa_qmi.c     | 399 +++++++++++++++++++++++
 drivers/net/ipa/ipa_qmi.h     |  35 ++
 drivers/net/ipa/ipa_qmi_msg.c | 583 ++++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_qmi_msg.h | 238 ++++++++++++++
 drivers/net/ipa/ipa_smp2p.c   | 304 ++++++++++++++++++
 drivers/net/ipa/ipa_smp2p.h   |  47 +++
 6 files changed, 1606 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_qmi.c
 create mode 100644 drivers/net/ipa/ipa_qmi.h
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.c
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.h
 create mode 100644 drivers/net/ipa/ipa_smp2p.c
 create mode 100644 drivers/net/ipa/ipa_smp2p.h

diff --git a/drivers/net/ipa/ipa_qmi.c b/drivers/net/ipa/ipa_qmi.c
new file mode 100644
index 000000000000..ce5455e58ca9
--- /dev/null
+++ b/drivers/net/ipa/ipa_qmi.c
@@ -0,0 +1,399 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2013-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/qrtr.h>
+#include <linux/soc/qcom/qmi.h>
+
+#include "ipa.h"
+#include "ipa_endpoint.h"
+#include "ipa_mem.h"
+#include "ipa_qmi_msg.h"
+
+#define QMI_INIT_DRIVER_TIMEOUT	60000	/* A minute in milliseconds */
+
+/* The AP and modem perform a "handshake" at initialization time to ensure
+ * each side knows the other side is ready.  Two QMI handles (endpoints) are
+ * used for this; one provides service on the modem for AP requests, and the
+ * other is on the AP to service modem requests (and to supply an indication
+ * from the AP).
+ *
+ * The QMI service on the modem expects to receive an INIT_DRIVER request from
+ * the AP, which contains parameters used by the modem during initialization.
+ * The AP sends this request using the client handle as soon as it is knows
+ * the modem side service is available.  The modem responds to this request
+ * immediately.
+ *
+ * When the modem learns the AP service is available, it is able to
+ * communicate its status to the AP.  The modem uses this to tell
+ * the AP when it is ready to receive an indication, sending an
+ * INDICATION_REGISTER request to the handle served by the AP.  This
+ * is independent of the modem's initialization of its driver.
+ *
+ * When the modem has completed the driver initialization requested by the
+ * AP, it sends a DRIVER_INIT_COMPLETE request to the AP.   This request
+ * could arrive at the AP either before or after the INDICATION_REGISTER
+ * request.
+ *
+ * The final step in the handshake occurs after the AP has received both the
+ * INDICATION_REGISTER request and the response to its INIT_DRIVER request.
+ * The AP then sends an INIT_COMPLETE_IND indication message to the modem.
+ */
+
+#define IPA_HOST_SERVICE_SVC_ID		0x31
+#define IPA_HOST_SVC_VERS		1
+#define IPA_HOST_SERVICE_INS_ID		1
+
+#define IPA_MODEM_SERVICE_SVC_ID	0x31
+#define IPA_MODEM_SERVICE_INS_ID	2
+#define IPA_MODEM_SVC_VERS		1
+
+/* Send an INIT_COMPLETE_IND indication message to the modem */
+static int ipa_send_master_driver_init_complete_ind(struct qmi_handle *qmi,
+						    struct sockaddr_qrtr *sq)
+{
+	struct ipa_init_complete_ind ind = { };
+
+	ind.status.result = QMI_RESULT_SUCCESS_V01;
+	ind.status.error = QMI_ERR_NONE_V01;
+
+	return qmi_send_indication(qmi, sq, IPA_QMI_INIT_COMPLETE_IND,
+				   IPA_QMI_INIT_COMPLETE_IND_SZ,
+				   ipa_init_complete_ind_ei, &ind);
+}
+
+/* This function is called to determine whether to complete the handshake by
+ * sending an INIT_COMPLETE_IND indication message to the modem.  The
+ * "init_driver" parameter is false when we've received an INDICATION_REGISTER
+ * request message from the modem, or true when we've received the response
+ * to the INIT_DRIVER request message we sent.  If this function decides the
+ * message should be sent, it calls ipa_send_master_driver_init_complete_ind()
+ * to send it.
+ */
+static void ipa_handshake_complete(struct qmi_handle *qmi,
+				   struct sockaddr_qrtr *sq, bool init_driver)
+{
+	struct ipa *ipa;
+	bool send_it;
+	int ret;
+
+	if (init_driver) {
+		ipa = container_of(qmi, struct ipa, qmi.client_handle);
+		ipa->qmi.init_driver_response_received = true;
+		send_it = !!ipa->qmi.indication_register_received;
+	} else {
+		ipa = container_of(qmi, struct ipa, qmi.server_handle);
+		ipa->qmi.indication_register_received = 1;
+		send_it = !!ipa->qmi.init_driver_response_received;
+	}
+	if (!send_it)
+		return;
+
+	ret = ipa_send_master_driver_init_complete_ind(qmi, sq);
+	WARN(ret, "error %d sending init complete indication\n", ret);
+}
+
+/* Callback function to handle an INDICATION_REGISTER request message from the
+ * modem.  This informs the AP that the modem is now ready to receive the
+ * INIT_COMPLETE_IND indication message.
+ */
+static void ipa_indication_register_fn(struct qmi_handle *qmi,
+				       struct sockaddr_qrtr *sq,
+				       struct qmi_txn *txn,
+				       const void *decoded)
+{
+	struct ipa_indication_register_rsp rsp = { };
+	int ret;
+
+	rsp.rsp.result = QMI_RESULT_SUCCESS_V01;
+	rsp.rsp.error = QMI_ERR_NONE_V01;
+
+	ret = qmi_send_response(qmi, sq, txn, IPA_QMI_INDICATION_REGISTER,
+				IPA_QMI_INDICATION_REGISTER_RSP_SZ,
+				ipa_indication_register_rsp_ei, &rsp);
+	if (!WARN(ret, "error %d sending response\n", ret))
+		ipa_handshake_complete(qmi, sq, false);
+}
+
+/* Callback function to handle a DRIVER_INIT_COMPLETE request message from the
+ * modem.  This informs the AP that the modem has completed the initialization
+ * of its driver.
+ */
+static void ipa_driver_init_complete_fn(struct qmi_handle *qmi,
+					struct sockaddr_qrtr *sq,
+					struct qmi_txn *txn,
+					const void *decoded)
+{
+	struct ipa_driver_init_complete_rsp rsp = { };
+	int ret;
+
+	rsp.rsp.result = QMI_RESULT_SUCCESS_V01;
+	rsp.rsp.error = QMI_ERR_NONE_V01;
+
+	ret = qmi_send_response(qmi, sq, txn, IPA_QMI_DRIVER_INIT_COMPLETE,
+				IPA_QMI_DRIVER_INIT_COMPLETE_RSP_SZ,
+				ipa_driver_init_complete_rsp_ei, &rsp);
+
+	WARN(ret, "error %d sending response\n", ret);
+}
+
+/* The server handles two request message types sent by the modem. */
+static struct qmi_msg_handler ipa_server_msg_handlers[] = {
+	{
+		.type		= QMI_REQUEST,
+		.msg_id		= IPA_QMI_INDICATION_REGISTER,
+		.ei		= ipa_indication_register_req_ei,
+		.decoded_size	= IPA_QMI_INDICATION_REGISTER_REQ_SZ,
+		.fn		= ipa_indication_register_fn,
+	},
+	{
+		.type		= QMI_REQUEST,
+		.msg_id		= IPA_QMI_DRIVER_INIT_COMPLETE,
+		.ei		= ipa_driver_init_complete_req_ei,
+		.decoded_size	= IPA_QMI_DRIVER_INIT_COMPLETE_REQ_SZ,
+		.fn		= ipa_driver_init_complete_fn,
+	},
+};
+
+/* Callback function to handle an IPA_QMI_INIT_DRIVER response message from
+ * the modem.  This only acknowledges that the modem received the request.
+ * The modem will eventually report that it has completed its driver
+ * initialization by sending an IPA_QMI_DRIVER_INIT_COMPLETE request.
+ */
+static void ipa_init_driver_rsp_fn(struct qmi_handle *qmi,
+				   struct sockaddr_qrtr *sq,
+				   struct qmi_txn *txn,
+				   const void *decoded)
+{
+	txn->result = 0;	/* IPA_QMI_INIT_DRIVER request was successful */
+	complete(&txn->completion);
+
+	ipa_handshake_complete(qmi, sq, true);
+}
+
+/* The client handles one response message type sent by the modem. */
+static struct qmi_msg_handler ipa_client_msg_handlers[] = {
+	{
+		.type		= QMI_RESPONSE,
+		.msg_id		= IPA_QMI_INIT_DRIVER,
+		.ei		= ipa_init_modem_driver_rsp_ei,
+		.decoded_size	= IPA_QMI_INIT_DRIVER_RSP_SZ,
+		.fn		= ipa_init_driver_rsp_fn,
+	},
+};
+
+/* Return a pointer to an init modem driver request structure, which contains
+ * configuration parameters for the modem.  The modem may be started multiple
+ * times, but generally these parameters don't change so we can reuse the
+ * request structure once it's initialized.  The only exception is the
+ * skip_uc_load field, which will be set only after the microcontroller has
+ * reported it has completed its initialization.
+ */
+static const struct ipa_init_modem_driver_req *
+init_modem_driver_req(struct ipa_qmi *ipa_qmi)
+{
+	struct ipa *ipa = container_of(ipa_qmi, struct ipa, qmi);
+	static struct ipa_init_modem_driver_req req;
+
+	/* This is not the first boot if the microcontroller is loaded */
+	req.skip_uc_load = ipa->uc_loaded;
+	req.skip_uc_load_valid = true;
+
+	/* We only have to initialize most of it once */
+	if (req.platform_type_valid)
+		return &req;
+
+	req.platform_type_valid = true;
+	req.platform_type = IPA_QMI_PLATFORM_TYPE_MSM_ANDROID;
+
+	req.hdr_tbl_info_valid = IPA_SMEM_MODEM_HDR_SIZE ? 1 : 0;
+	req.hdr_tbl_info.start = ipa_qmi->base + IPA_SMEM_MODEM_HDR_OFFSET;
+	req.hdr_tbl_info.end = req.hdr_tbl_info.start +
+					IPA_SMEM_MODEM_HDR_SIZE - 1;
+
+	req.v4_route_tbl_info_valid = true;
+	req.v4_route_tbl_info.start =
+			ipa_qmi->base + IPA_SMEM_V4_RT_NHASH_OFFSET;
+	req.v4_route_tbl_info.count = IPA_SMEM_MODEM_RT_COUNT;
+
+	req.v6_route_tbl_info_valid = true;
+	req.v6_route_tbl_info.start =
+			ipa_qmi->base + IPA_SMEM_V6_RT_NHASH_OFFSET;
+	req.v6_route_tbl_info.count = IPA_SMEM_MODEM_RT_COUNT;
+
+	req.v4_filter_tbl_start_valid = true;
+	req.v4_filter_tbl_start = ipa_qmi->base + IPA_SMEM_V4_FLT_NHASH_OFFSET;
+
+	req.v6_filter_tbl_start_valid = true;
+	req.v6_filter_tbl_start = ipa_qmi->base + IPA_SMEM_V6_FLT_NHASH_OFFSET;
+
+	req.modem_mem_info_valid = IPA_SMEM_MODEM_SIZE ? 1 : 0;
+	req.modem_mem_info.start = ipa_qmi->base + IPA_SMEM_MODEM_OFFSET;
+	req.modem_mem_info.size = IPA_SMEM_MODEM_SIZE;
+
+	req.ctrl_comm_dest_end_pt_valid = true;
+	req.ctrl_comm_dest_end_pt = IPA_ENDPOINT_AP_MODEM_RX;
+
+	req.hdr_proc_ctx_tbl_info_valid =
+			IPA_SMEM_MODEM_HDR_PROC_CTX_SIZE ? 1 : 0;
+	req.hdr_proc_ctx_tbl_info.start =
+			ipa_qmi->base + IPA_SMEM_MODEM_HDR_PROC_CTX_OFFSET;
+	req.hdr_proc_ctx_tbl_info.end = req.hdr_proc_ctx_tbl_info.start +
+			IPA_SMEM_MODEM_HDR_PROC_CTX_SIZE - 1;
+
+	req.v4_hash_route_tbl_info_valid = true;
+	req.v4_hash_route_tbl_info.start =
+			ipa_qmi->base + IPA_SMEM_V4_RT_HASH_OFFSET;
+	req.v4_hash_route_tbl_info.count = IPA_SMEM_MODEM_RT_COUNT;
+
+	req.v6_hash_route_tbl_info_valid = true;
+	req.v6_hash_route_tbl_info.start =
+			ipa_qmi->base + IPA_SMEM_V6_RT_HASH_OFFSET;
+	req.v6_hash_route_tbl_info.count = IPA_SMEM_MODEM_RT_COUNT;
+
+	req.v4_hash_filter_tbl_start_valid = true;
+	req.v4_hash_filter_tbl_start =
+			ipa_qmi->base + IPA_SMEM_V4_FLT_HASH_OFFSET;
+
+	req.v6_hash_filter_tbl_start_valid = true;
+	req.v6_hash_filter_tbl_start =
+			ipa_qmi->base + IPA_SMEM_V6_FLT_HASH_OFFSET;
+
+	return &req;
+}
+
+/* The modem service we requested is now available via the client handle.
+ * Send an INIT_DRIVER request to the modem.
+ */
+static int
+ipa_client_new_server(struct qmi_handle *qmi, struct qmi_service *svc)
+{
+	const struct ipa_init_modem_driver_req *req;
+	struct ipa *ipa;
+	struct sockaddr_qrtr sq;
+	struct qmi_txn *txn;
+	int ret;
+
+	ipa = container_of(qmi, struct ipa, qmi.client_handle);
+	req = init_modem_driver_req(&ipa->qmi);
+
+	txn = kzalloc(sizeof(*txn), GFP_KERNEL);
+	if (!txn)
+		return -ENOMEM;
+
+	ret = qmi_txn_init(qmi, txn, NULL, NULL);
+	if (ret) {
+		kfree(txn);
+		return ret;
+	}
+
+	sq.sq_family = AF_QIPCRTR;
+	sq.sq_node = svc->node;
+	sq.sq_port = svc->port;
+
+	ret = qmi_send_request(qmi, &sq, txn, IPA_QMI_INIT_DRIVER,
+			       IPA_QMI_INIT_DRIVER_REQ_SZ,
+			       ipa_init_modem_driver_req_ei, req);
+	if (!ret)
+		ret = qmi_txn_wait(txn, MAX_SCHEDULE_TIMEOUT);
+	if (ret)
+		qmi_txn_cancel(txn);
+	kfree(txn);
+
+	return ret;
+}
+
+/* The only callback we supply for the client handle is notification that the
+ * service on the modem has become available.
+ */
+static struct qmi_ops ipa_client_ops = {
+	.new_server	= ipa_client_new_server,
+};
+
+static int ipa_qmi_initialize(struct ipa *ipa)
+{
+	struct ipa_qmi *ipa_qmi = &ipa->qmi;
+	int ret;
+
+	/* The only handle operation that might be interesting for the server
+	 * would be del_client, to find out when the modem side client has
+	 * disappeared.  But other than reporting the event, we wouldn't do
+	 * anything about that.  So we just pass a null pointer for its handle
+	 * operations.  All the real work is done by the message handlers.
+	 */
+	ret = qmi_handle_init(&ipa_qmi->server_handle,
+			      IPA_QMI_SERVER_MAX_RCV_SZ, NULL,
+			      ipa_server_msg_handlers);
+	if (ret)
+		return ret;
+
+	ret = qmi_add_server(&ipa_qmi->server_handle, IPA_HOST_SERVICE_SVC_ID,
+			     IPA_HOST_SVC_VERS, IPA_HOST_SERVICE_INS_ID);
+	if (ret)
+		goto err_release_server_handle;
+
+	/* The client handle is only used for sending an INIT_DRIVER request
+	 * to the modem, and receiving its response message.
+	 */
+	ret = qmi_handle_init(&ipa_qmi->client_handle,
+			      IPA_QMI_CLIENT_MAX_RCV_SZ, &ipa_client_ops,
+			      ipa_client_msg_handlers);
+	if (ret)
+		goto err_release_server_handle;
+
+	ret = qmi_add_lookup(&ipa_qmi->client_handle, IPA_MODEM_SERVICE_SVC_ID,
+			     IPA_MODEM_SVC_VERS, IPA_MODEM_SERVICE_INS_ID);
+	if (ret)
+		goto err_release_client_handle;
+
+	/* All QMI offsets are relative to the start of IPA shared memory */
+	ipa_qmi->base = ipa->shared_offset;
+	ipa_qmi->initialized = 1;
+
+	return 0;
+
+err_release_client_handle:
+	/* Releasing the handle also removes registered lookups */
+	qmi_handle_release(&ipa_qmi->client_handle);
+	memset(&ipa_qmi->client_handle, 0, sizeof(ipa_qmi->client_handle));
+err_release_server_handle:
+	/* Releasing the handle also removes registered services */
+	qmi_handle_release(&ipa_qmi->server_handle);
+	memset(&ipa_qmi->server_handle, 0, sizeof(ipa_qmi->server_handle));
+
+	return ret;
+}
+
+/* This is called by ipa_netdev_setup().  We can be informed via remoteproc
+ * that the modem has shut down, in which case this function will be called
+ * again to prepare for the modem coming back up.
+ */
+int ipa_qmi_setup(struct ipa *ipa)
+{
+	ipa->qmi.init_driver_response_received = 0;
+	ipa->qmi.indication_register_received = 0;
+
+	if (!ipa->qmi.initialized)
+		return ipa_qmi_initialize(ipa);
+
+	return 0;
+}
+
+void ipa_qmi_teardown(struct ipa *ipa)
+{
+	if (!ipa->qmi.initialized)
+		return;
+
+	qmi_handle_release(&ipa->qmi.client_handle);
+	memset(&ipa->qmi.client_handle, 0, sizeof(ipa->qmi.client_handle));
+
+	qmi_handle_release(&ipa->qmi.server_handle);
+	memset(&ipa->qmi.server_handle, 0, sizeof(ipa->qmi.server_handle));
+
+	ipa->qmi.initialized = 0;
+}
diff --git a/drivers/net/ipa/ipa_qmi.h b/drivers/net/ipa/ipa_qmi.h
new file mode 100644
index 000000000000..cfdafa23cf8f
--- /dev/null
+++ b/drivers/net/ipa/ipa_qmi.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_QMI_H_
+#define _IPA_QMI_H_
+
+#include <linux/types.h>
+#include <linux/soc/qcom/qmi.h>
+
+struct ipa;
+
+/**
+ * struct ipa_qmi - QMI state associated with an IPA
+ * @initialized		- whether QMI initialization has completed
+ * @client_handle	- used to send QMI requests to the modem
+ * @server_handle	- used to handle QMI requests from the modem
+ * @indication_register_received - tracks modem request receipt
+ * @init_driver_response_received - tracks modem response receipt
+ */
+struct ipa_qmi {
+	u32 initialized;
+	u32 base;
+	struct qmi_handle client_handle;
+	struct qmi_handle server_handle;
+	u32 indication_register_received;
+	u32 init_driver_response_received;
+
+};
+
+int ipa_qmi_setup(struct ipa *ipa);
+void ipa_qmi_teardown(struct ipa *ipa);
+
+#endif /* !_IPA_QMI_H_ */
diff --git a/drivers/net/ipa/ipa_qmi_msg.c b/drivers/net/ipa/ipa_qmi_msg.c
new file mode 100644
index 000000000000..b6b278dff6fb
--- /dev/null
+++ b/drivers/net/ipa/ipa_qmi_msg.c
@@ -0,0 +1,583 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#include <linux/stddef.h>
+#include <linux/soc/qcom/qmi.h>
+
+#include "ipa_qmi_msg.h"
+
+/* QMI message structure definition for struct ipa_indication_register_req */
+struct qmi_elem_info ipa_indication_register_req_ei[] = {
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_indication_register_req,
+				     master_driver_init_complete_valid),
+		.tlv_type	= 0x10,
+		.offset		= offsetof(struct ipa_indication_register_req,
+					   master_driver_init_complete_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_1_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_indication_register_req,
+				     master_driver_init_complete),
+		.tlv_type	= 0x10,
+		.offset		= offsetof(struct ipa_indication_register_req,
+					   master_driver_init_complete),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_indication_register_req,
+				     data_usage_quota_reached_valid),
+		.tlv_type	= 0x11,
+		.offset		= offsetof(struct ipa_indication_register_req,
+					   data_usage_quota_reached_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_1_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_indication_register_req,
+				     data_usage_quota_reached),
+		.tlv_type	= 0x11,
+		.offset		= offsetof(struct ipa_indication_register_req,
+					   data_usage_quota_reached),
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_indication_register_rsp */
+struct qmi_elem_info ipa_indication_register_rsp_ei[] = {
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_indication_register_rsp,
+				     rsp),
+		.tlv_type	= 0x02,
+		.offset		= offsetof(struct ipa_indication_register_rsp,
+					   rsp),
+		.ei_array	= qmi_response_type_v01_ei,
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_driver_init_complete_req */
+struct qmi_elem_info ipa_driver_init_complete_req_ei[] = {
+	{
+		.data_type	= QMI_UNSIGNED_1_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_driver_init_complete_req,
+				     status),
+		.tlv_type	= 0x01,
+		.offset		= offsetof(struct ipa_driver_init_complete_req,
+					   status),
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_driver_init_complete_rsp */
+struct qmi_elem_info ipa_driver_init_complete_rsp_ei[] = {
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_driver_init_complete_rsp,
+				     rsp),
+		.tlv_type	= 0x02,
+		.offset		= offsetof(struct ipa_driver_init_complete_rsp,
+					   rsp),
+		.ei_array	= qmi_response_type_v01_ei,
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_init_complete_ind */
+struct qmi_elem_info ipa_init_complete_ind_ei[] = {
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_complete_ind,
+				     status),
+		.tlv_type	= 0x02,
+		.offset		= offsetof(struct ipa_init_complete_ind,
+					   status),
+		.ei_array	= qmi_response_type_v01_ei,
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_mem_bounds */
+struct qmi_elem_info ipa_mem_bounds_ei[] = {
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_mem_bounds, start),
+		.offset		= offsetof(struct ipa_mem_bounds, start),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_mem_bounds, end),
+		.offset		= offsetof(struct ipa_mem_bounds, end),
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_mem_array */
+struct qmi_elem_info ipa_mem_array_ei[] = {
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_mem_array, start),
+		.offset		= offsetof(struct ipa_mem_array, start),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_mem_array, count),
+		.offset		= offsetof(struct ipa_mem_array, count),
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_mem_range */
+struct qmi_elem_info ipa_mem_range_ei[] = {
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_mem_range, start),
+		.offset		= offsetof(struct ipa_mem_range, start),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_mem_range, size),
+		.offset		= offsetof(struct ipa_mem_range, size),
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_init_modem_driver_req */
+struct qmi_elem_info ipa_init_modem_driver_req_ei[] = {
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     platform_type_valid),
+		.tlv_type	= 0x10,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   platform_type_valid),
+	},
+	{
+		.data_type	= QMI_SIGNED_4_BYTE_ENUM,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     platform_type),
+		.tlv_type	= 0x10,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   platform_type),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     hdr_tbl_info_valid),
+		.tlv_type	= 0x11,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   hdr_tbl_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     hdr_tbl_info),
+		.tlv_type	= 0x11,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   hdr_tbl_info),
+		.ei_array	= ipa_mem_bounds_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_route_tbl_info_valid),
+		.tlv_type	= 0x12,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_route_tbl_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_route_tbl_info),
+		.tlv_type	= 0x12,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_route_tbl_info),
+		.ei_array	= ipa_mem_array_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_route_tbl_info_valid),
+		.tlv_type	= 0x13,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_route_tbl_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_route_tbl_info),
+		.tlv_type	= 0x13,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_route_tbl_info),
+		.ei_array	= ipa_mem_array_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_filter_tbl_start_valid),
+		.tlv_type	= 0x14,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_filter_tbl_start_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_filter_tbl_start),
+		.tlv_type	= 0x14,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_filter_tbl_start),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_filter_tbl_start_valid),
+		.tlv_type	= 0x15,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_filter_tbl_start_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_filter_tbl_start),
+		.tlv_type	= 0x15,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_filter_tbl_start),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     modem_mem_info_valid),
+		.tlv_type	= 0x16,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   modem_mem_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     modem_mem_info),
+		.tlv_type	= 0x16,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   modem_mem_info),
+		.ei_array	= ipa_mem_range_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     ctrl_comm_dest_end_pt_valid),
+		.tlv_type	= 0x17,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   ctrl_comm_dest_end_pt_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     ctrl_comm_dest_end_pt),
+		.tlv_type	= 0x17,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   ctrl_comm_dest_end_pt),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     skip_uc_load_valid),
+		.tlv_type	= 0x18,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   skip_uc_load_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_1_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     skip_uc_load),
+		.tlv_type	= 0x18,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   skip_uc_load),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     hdr_proc_ctx_tbl_info_valid),
+		.tlv_type	= 0x19,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   hdr_proc_ctx_tbl_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     hdr_proc_ctx_tbl_info),
+		.tlv_type	= 0x19,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   hdr_proc_ctx_tbl_info),
+		.ei_array	= ipa_mem_bounds_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     zip_tbl_info_valid),
+		.tlv_type	= 0x1a,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   zip_tbl_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     zip_tbl_info),
+		.tlv_type	= 0x1a,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   zip_tbl_info),
+		.ei_array	= ipa_mem_bounds_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_hash_route_tbl_info_valid),
+		.tlv_type	= 0x1b,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_hash_route_tbl_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_hash_route_tbl_info),
+		.tlv_type	= 0x1b,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_hash_route_tbl_info),
+		.ei_array	= ipa_mem_array_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_hash_route_tbl_info_valid),
+		.tlv_type	= 0x1c,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_hash_route_tbl_info_valid),
+	},
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_hash_route_tbl_info),
+		.tlv_type	= 0x1c,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_hash_route_tbl_info),
+		.ei_array	= ipa_mem_array_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_hash_filter_tbl_start_valid),
+		.tlv_type	= 0x1d,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_hash_filter_tbl_start_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v4_hash_filter_tbl_start),
+		.tlv_type	= 0x1d,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v4_hash_filter_tbl_start),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_hash_filter_tbl_start_valid),
+		.tlv_type	= 0x1e,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_hash_filter_tbl_start_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_req,
+				     v6_hash_filter_tbl_start),
+		.tlv_type	= 0x1e,
+		.offset		= offsetof(struct ipa_init_modem_driver_req,
+					   v6_hash_filter_tbl_start),
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
+
+/* QMI message structure definition for struct ipa_init_modem_driver_rsp */
+struct qmi_elem_info ipa_init_modem_driver_rsp_ei[] = {
+	{
+		.data_type	= QMI_STRUCT,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_rsp,
+				     rsp),
+		.tlv_type	= 0x02,
+		.offset		= offsetof(struct ipa_init_modem_driver_rsp,
+					   rsp),
+		.ei_array	= qmi_response_type_v01_ei,
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_rsp,
+				     ctrl_comm_dest_end_pt_valid),
+		.tlv_type	= 0x10,
+		.offset		= offsetof(struct ipa_init_modem_driver_rsp,
+					   ctrl_comm_dest_end_pt_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_rsp,
+				     ctrl_comm_dest_end_pt),
+		.tlv_type	= 0x10,
+		.offset		= offsetof(struct ipa_init_modem_driver_rsp,
+					   ctrl_comm_dest_end_pt),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_rsp,
+				     default_end_pt_valid),
+		.tlv_type	= 0x11,
+		.offset		= offsetof(struct ipa_init_modem_driver_rsp,
+					   default_end_pt_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_4_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_rsp,
+				     default_end_pt),
+		.tlv_type	= 0x11,
+		.offset		= offsetof(struct ipa_init_modem_driver_rsp,
+					   default_end_pt),
+	},
+	{
+		.data_type	= QMI_OPT_FLAG,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_rsp,
+				     modem_driver_init_pending_valid),
+		.tlv_type	= 0x12,
+		.offset		= offsetof(struct ipa_init_modem_driver_rsp,
+					   modem_driver_init_pending_valid),
+	},
+	{
+		.data_type	= QMI_UNSIGNED_1_BYTE,
+		.elem_len	= 1,
+		.elem_size	=
+			sizeof_field(struct ipa_init_modem_driver_rsp,
+				     modem_driver_init_pending),
+		.tlv_type	= 0x12,
+		.offset		= offsetof(struct ipa_init_modem_driver_rsp,
+					   modem_driver_init_pending),
+	},
+	{
+		.data_type	= QMI_EOTI,
+	},
+};
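The element-info tables above pair each C structure member with its
on-wire type byte, using sizeof_field() and offsetof() to locate it.
The following is a minimal user-space sketch of how a table of that
shape can drive encoding.  It is not the kernel's QMI encoder: the
names elem_info, demo_req, field_size() and demo_encode() are
hypothetical, nested structures are left out, and the payload is
copied in host byte order (so a little-endian host is assumed).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct qmi_elem_info */
enum elem_type { ELEM_END, ELEM_OPT_FLAG, ELEM_DATA };

struct elem_info {
	enum elem_type	type;
	size_t		elem_size;	/* bytes occupied by the field */
	uint8_t		tlv_type;	/* type byte used on the wire */
	size_t		offset;		/* where the field lives in the struct */
};

/* Hypothetical message with one 1-byte and one 4-byte optional field */
struct demo_req {
	uint8_t		status_valid;
	uint8_t		status;
	uint8_t		start_valid;
	uint32_t	start;
};

#define field_size(type, member)	sizeof(((type *)0)->member)

const struct elem_info demo_req_ei[] = {
	{
		.type		= ELEM_OPT_FLAG,
		.elem_size	= field_size(struct demo_req, status_valid),
		.tlv_type	= 0x10,
		.offset		= offsetof(struct demo_req, status_valid),
	},
	{
		.type		= ELEM_DATA,
		.elem_size	= field_size(struct demo_req, status),
		.tlv_type	= 0x10,
		.offset		= offsetof(struct demo_req, status),
	},
	{
		.type		= ELEM_OPT_FLAG,
		.elem_size	= field_size(struct demo_req, start_valid),
		.tlv_type	= 0x11,
		.offset		= offsetof(struct demo_req, start_valid),
	},
	{
		.type		= ELEM_DATA,
		.elem_size	= field_size(struct demo_req, start),
		.tlv_type	= 0x11,
		.offset		= offsetof(struct demo_req, start),
	},
	{
		.type		= ELEM_END,
	},
};

/* Walk the table, emitting type (1 byte), length (2 bytes, little
 * endian) and payload for every field whose "valid" flag is set.
 * Returns the number of bytes written to out.
 */
size_t demo_encode(const struct elem_info *ei, const void *msg, uint8_t *out)
{
	const uint8_t *base = msg;
	size_t len = 0;
	int present = 1;

	for (; ei->type != ELEM_END; ei++) {
		if (ei->type == ELEM_OPT_FLAG) {
			/* The flag only gates the data entry that follows */
			present = base[ei->offset] != 0;
			continue;
		}
		if (present) {
			out[len++] = ei->tlv_type;
			out[len++] = ei->elem_size & 0xff;
			out[len++] = (ei->elem_size >> 8) & 0xff;
			memcpy(out + len, base + ei->offset, ei->elem_size);
			len += ei->elem_size;
		}
		present = 1;
	}

	return len;
}
```

A walker like this relies entirely on the offset value of each entry;
an entry whose offset is left at zero would silently read the first
member of the structure instead of the intended one.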
diff --git a/drivers/net/ipa/ipa_qmi_msg.h b/drivers/net/ipa/ipa_qmi_msg.h
new file mode 100644
index 000000000000..174e7789efa4
--- /dev/null
+++ b/drivers/net/ipa/ipa_qmi_msg.h
@@ -0,0 +1,238 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2019 Linaro Ltd.
+ */
+#ifndef _IPA_QMI_MSG_H_
+#define _IPA_QMI_MSG_H_
+
+#include <linux/types.h>
+#include <linux/soc/qcom/qmi.h>
+
+/* === NOTE:  Only "ipa_qmi_msg.c" should include this file === */
+
+/* Request/response/indication QMI message ids used for IPA.  Receiving
+ * end issues a response for requests; indications require no response.
+ */
+#define IPA_QMI_INDICATION_REGISTER	0x20	/* modem -> AP request */
+#define IPA_QMI_INIT_DRIVER		0x21	/* AP -> modem request */
+#define IPA_QMI_INIT_COMPLETE_IND	0x22	/* AP -> modem indication */
+#define IPA_QMI_DRIVER_INIT_COMPLETE	0x35	/* modem -> AP request */
+
+/* The maximum size required for message types.  These sizes include
+ * the message data, along with type (1 byte) and length (2 byte)
+ * information for each field.  The qmi_send_*() interfaces require
+ * the message size to be provided.
+ */
+#define IPA_QMI_INDICATION_REGISTER_REQ_SZ	8	/* -> server handle */
+#define IPA_QMI_INDICATION_REGISTER_RSP_SZ	7	/* <- server handle */
+#define IPA_QMI_INIT_DRIVER_REQ_SZ		134	/* client handle -> */
+#define IPA_QMI_INIT_DRIVER_RSP_SZ		25	/* client handle <- */
+#define IPA_QMI_INIT_COMPLETE_IND_SZ		7	/* server handle -> */
+#define IPA_QMI_DRIVER_INIT_COMPLETE_REQ_SZ	4	/* -> server handle */
+#define IPA_QMI_DRIVER_INIT_COMPLETE_RSP_SZ	7	/* <- server handle */
+
+/* Maximum size of messages we expect the AP to receive (max of above) */
+#define IPA_QMI_SERVER_MAX_RCV_SZ		8
+#define IPA_QMI_CLIENT_MAX_RCV_SZ		25
+
+/* Request message for the IPA_QMI_INDICATION_REGISTER request */
+struct ipa_indication_register_req {
+	u8 master_driver_init_complete_valid;
+	u8 master_driver_init_complete;
+	u8 data_usage_quota_reached_valid;
+	u8 data_usage_quota_reached;
+};
+
+/* The response to an IPA_QMI_INDICATION_REGISTER request consists only of
+ * a standard QMI response.
+ */
+struct ipa_indication_register_rsp {
+	struct qmi_response_type_v01 rsp;
+};
+
+/* Request message for the IPA_QMI_DRIVER_INIT_COMPLETE request */
+struct ipa_driver_init_complete_req {
+	u8 status;
+};
+
+/* The response to an IPA_QMI_DRIVER_INIT_COMPLETE request consists only
+ * of a standard QMI response.
+ */
+struct ipa_driver_init_complete_rsp {
+	struct qmi_response_type_v01 rsp;
+};
+
+/* The message for the IPA_QMI_INIT_COMPLETE_IND indication consists
+ * only of a standard QMI response.
+ */
+struct ipa_init_complete_ind {
+	struct qmi_response_type_v01 status;
+};
+
+/* The AP tells the modem its platform type.  We assume Android. */
+enum ipa_platform_type {
+	IPA_QMI_PLATFORM_TYPE_INVALID		= 0,	/* Invalid */
+	IPA_QMI_PLATFORM_TYPE_TN		= 1,	/* Data card */
+	IPA_QMI_PLATFORM_TYPE_LE		= 2,	/* Data router */
+	IPA_QMI_PLATFORM_TYPE_MSM_ANDROID	= 3,	/* Android MSM */
+	IPA_QMI_PLATFORM_TYPE_MSM_WINDOWS	= 4,	/* Windows MSM */
+	IPA_QMI_PLATFORM_TYPE_MSM_QNX_V01	= 5,	/* QNX MSM */
+};
+
+/* This defines the start and end offset of a range of memory.  Both
+ * fields are offsets relative to the start of IPA shared memory.
+ * The end value is the last addressable byte *within* the range.
+ */
+struct ipa_mem_bounds {
+	u32 start;
+	u32 end;
+};
+
+/* This defines the location and size of an array.  The start value
+ * is an offset relative to the start of IPA shared memory.  The
+ * size of the array is implied by the number of entries (the entry
+ * size is assumed to be known).
+ */
+struct ipa_mem_array {
+	u32 start;
+	u32 count;
+};
+
+/* This defines the location and size of a range of memory.  The
+ * start is an offset relative to the start of IPA shared memory.
+ * This differs from the ipa_mem_bounds structure in that the size
+ * (in bytes) of the memory region is specified rather than the
+ * offset of its last byte.
+ */
+struct ipa_mem_range {
+	u32 start;
+	u32 size;
+};
+
+/* The message for the IPA_QMI_INIT_DRIVER request contains information
+ * from the AP that affects modem initialization.
+ */
+struct ipa_init_modem_driver_req {
+	u8			platform_type_valid;
+	u32			platform_type;	/* enum ipa_platform_type */
+
+	/* Modem header table information.  This defines the IPA shared
+	 * memory in which the modem may insert header table entries.
+	 */
+	u8			hdr_tbl_info_valid;
+	struct ipa_mem_bounds	hdr_tbl_info;
+
+	/* Routing table information.  These define the location and size of
+	 * non-hashable IPv4 and IPv6 routing tables.  The start values are
+	 * offsets relative to the start of IPA shared memory.
+	 */
+	u8			v4_route_tbl_info_valid;
+	struct ipa_mem_array	v4_route_tbl_info;
+	u8			v6_route_tbl_info_valid;
+	struct ipa_mem_array	v6_route_tbl_info;
+
+	/* Filter table information.  These define the location of
+	 * non-hashable IPv4 and IPv6 filter tables.  The start values are
+	 * offsets relative to the start of IPA shared memory.
+	 */
+	u8			v4_filter_tbl_start_valid;
+	u32			v4_filter_tbl_start;
+	u8			v6_filter_tbl_start_valid;
+	u32			v6_filter_tbl_start;
+
+	/* Modem memory information.  This defines the location and
+	 * size of memory available for the modem to use.
+	 */
+	u8			modem_mem_info_valid;
+	struct ipa_mem_range	modem_mem_info;
+
+	/* This defines the destination endpoint on the AP to which
+	 * the modem driver can send control commands.  IPA supports
+	 * 20 endpoints, so this must be 19 or less.
+	 */
+	u8			ctrl_comm_dest_end_pt_valid;
+	u32			ctrl_comm_dest_end_pt;
+
+	/* This defines whether the modem should load the microcontroller
+	 * or not.  It is unnecessary to reload it if the modem is being
+	 * restarted.
+	 *
+	 * NOTE: this field is named "is_ssr_bootup" elsewhere.
+	 */
+	u8			skip_uc_load_valid;
+	u8			skip_uc_load;
+
+	/* Processing context memory information.  This defines the memory in
+	 * which the modem may insert header processing context table entries.
+	 */
+	u8			hdr_proc_ctx_tbl_info_valid;
+	struct ipa_mem_bounds	hdr_proc_ctx_tbl_info;
+
+	/* Compression command memory information.  This defines the memory
+	 * in which the modem may insert compression/decompression commands.
+	 */
+	u8			zip_tbl_info_valid;
+	struct ipa_mem_bounds	zip_tbl_info;
+
+	/* Routing table information.  These define the location and size
+	 * of hashable IPv4 and IPv6 routing tables.  The start values are
+	 * offsets relative to the start of IPA shared memory.
+	 */
+	u8			v4_hash_route_tbl_info_valid;
+	struct ipa_mem_array	v4_hash_route_tbl_info;
+	u8			v6_hash_route_tbl_info_valid;
+	struct ipa_mem_array	v6_hash_route_tbl_info;
+
+	/* Filter table information.  These define the location
+	 * of hashable IPv4 and IPv6 filter tables.  The start values are
+	 * offsets relative to the start of IPA shared memory.
+	 */
+	u8			v4_hash_filter_tbl_start_valid;
+	u32			v4_hash_filter_tbl_start;
+	u8			v6_hash_filter_tbl_start_valid;
+	u32			v6_hash_filter_tbl_start;
+};
+
+/* The response to an IPA_QMI_INIT_DRIVER request begins with a standard
+ * QMI response, but contains other information as well.  Currently we
+ * simply wait for the INIT_DRIVER transaction to complete and
+ * ignore any other data that might be returned.
+ */
+struct ipa_init_modem_driver_rsp {
+	struct qmi_response_type_v01	rsp;
+
+	/* This defines the destination endpoint on the modem to which
+	 * the AP driver can send control commands.  IPA supports
+	 * 20 endpoints, so this must be 19 or less.
+	 */
+	u8				ctrl_comm_dest_end_pt_valid;
+	u32				ctrl_comm_dest_end_pt;
+
+	/* This defines the default endpoint.  The AP driver is not
+	 * required to configure the hardware with this value.  IPA
+	 * supports 20 endpoints, so this must be 19 or less.
+	 */
+	u8				default_end_pt_valid;
+	u32				default_end_pt;
+
+	/* This defines whether a second handshake is required to complete
+	 * initialization.
+	 */
+	u8				modem_driver_init_pending_valid;
+	u8				modem_driver_init_pending;
+};
+
+/* Message structure definitions defined in "ipa_qmi_msg.c" */
+extern struct qmi_elem_info ipa_indication_register_req_ei[];
+extern struct qmi_elem_info ipa_indication_register_rsp_ei[];
+extern struct qmi_elem_info ipa_driver_init_complete_req_ei[];
+extern struct qmi_elem_info ipa_driver_init_complete_rsp_ei[];
+extern struct qmi_elem_info ipa_init_complete_ind_ei[];
+extern struct qmi_elem_info ipa_mem_bounds_ei[];
+extern struct qmi_elem_info ipa_mem_array_ei[];
+extern struct qmi_elem_info ipa_mem_range_ei[];
+extern struct qmi_elem_info ipa_init_modem_driver_req_ei[];
+extern struct qmi_elem_info ipa_init_modem_driver_rsp_ei[];
+
+#endif /* !_IPA_QMI_MSG_H_ */
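The maximum message sizes defined near the top of ipa_qmi_msg.h can
be re-derived from the rule stated there: each field costs its data
size plus 1 byte of type and 2 bytes of length.  The sketch below is
a stand-alone user-space check, not part of the driver.  The helper
names are hypothetical, and it assumes a standard QMI response
carries 4 bytes of data (two 16-bit values) and that members of a
nested structure add no type/length overhead of their own.

```c
#include <stddef.h>

#define TLV_HDR_SZ	3	/* type (1 byte) + length (2 bytes) */

#define ARRAY_LEN(a)	(sizeof(a) / sizeof((a)[0]))

/* Encoded size of a message whose fields carry these payload sizes */
size_t qmi_msg_size(const size_t *payload, size_t count)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < count; i++)
		total += TLV_HDR_SZ + payload[i];

	return total;
}

/* Two optional one-byte flags */
size_t indication_register_req_size(void)
{
	static const size_t payload[] = { 1, 1 };

	return qmi_msg_size(payload, ARRAY_LEN(payload));
}

/* A lone standard QMI response (assumed to be 4 bytes of data) */
size_t standard_rsp_size(void)
{
	static const size_t payload[] = { 4 };

	return qmi_msg_size(payload, ARRAY_LEN(payload));
}

/* Every optional field of struct ipa_init_modem_driver_req present */
size_t init_driver_req_size(void)
{
	static const size_t payload[] = {
		4,	/* platform_type */
		8,	/* hdr_tbl_info */
		8, 8,	/* v4/v6_route_tbl_info */
		4, 4,	/* v4/v6_filter_tbl_start */
		8,	/* modem_mem_info */
		4,	/* ctrl_comm_dest_end_pt */
		1,	/* skip_uc_load */
		8,	/* hdr_proc_ctx_tbl_info */
		8,	/* zip_tbl_info */
		8, 8,	/* v4/v6_hash_route_tbl_info */
		4, 4,	/* v4/v6_hash_filter_tbl_start */
	};

	return qmi_msg_size(payload, ARRAY_LEN(payload));
}

/* Standard response plus the three optional response fields */
size_t init_driver_rsp_size(void)
{
	static const size_t payload[] = { 4, 4, 4, 1 };

	return qmi_msg_size(payload, ARRAY_LEN(payload));
}

/* A single one-byte status */
size_t driver_init_complete_req_size(void)
{
	static const size_t payload[] = { 1 };

	return qmi_msg_size(payload, ARRAY_LEN(payload));
}
```

Under those assumptions the helpers yield 8, 7, 134, 25 and 4 bytes
respectively, matching the *_SZ constants defined in the header.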
diff --git a/drivers/net/ipa/ipa_smp2p.c b/drivers/net/ipa/ipa_smp2p.c
new file mode 100644
index 000000000000..c59f358b44b4
--- /dev/null
+++ b/drivers/net/ipa/ipa_smp2p.c
@@ -0,0 +1,304 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/notifier.h>
+#include <linux/soc/qcom/smem.h>
+#include <linux/soc/qcom/smem_state.h>
+
+#include "ipa_smp2p.h"
+#include "ipa.h"
+#include "ipa_uc.h"
+#include "ipa_clock.h"
+
+/**
+ * DOC: IPA SMP2P communication with the modem
+ *
+ * SMP2P is a primitive communication mechanism available between the AP and
+ * the modem.  The IPA driver uses this for two purposes:  to enable the modem
+ * to state that the GSI hardware is ready to use; and to communicate the
+ * state of the IPA clock in the event of a crash.
+ *
+ * GSI needs to have early initialization completed before it can be used.
+ * This initialization is done either by Trust Zone or by the modem.  In the
+ * latter case, the modem uses an SMP2P interrupt to tell the AP IPA driver
+ * when the GSI is ready to use.
+ *
+ * The modem is also able to inquire about the current state of the IPA
+ * clock by triggering another SMP2P interrupt to the AP.  We communicate
+ * whether the clock is enabled using two SMP2P state bits--one to
+ * indicate the clock state (on or off), and a second to indicate the
+ * clock state bit is valid.  The modem will poll the valid bit until it
+ * is set, and at that time records whether the AP has the IPA clock enabled.
+ *
+ * Finally, if the AP kernel panics, we update the SMP2P state bits even if
+ * we never receive an interrupt from the modem requesting this.
+ */
+
+/**
+ * struct ipa_smp2p - IPA SMP2P information
+ * @ipa:		IPA pointer
+ * @valid_state:	SMEM state indicating enabled state is valid
+ * @enabled_state:	SMEM state to indicate clock is enabled
+ * @valid_bit:		Valid bit in 32-bit SMEM state mask
+ * @enabled_bit:	Enabled bit in 32-bit SMEM state mask
+ * @clock_query_irq:	IPA interrupt triggered by modem for clock query
+ * @setup_ready_irq:	IPA interrupt triggered by modem to signal GSI ready
+ * @clock_on:		Whether IPA clock is on
+ * @notified:		Whether modem has been notified of clock state
+ * @disabled:		Whether setup ready interrupt handling is disabled
+ * @mutex:		Mutex protecting the interlock between the setup ready
+ *			interrupt handler and shutdown
+ * @panic_notifier:	Panic notifier structure
+ */
+struct ipa_smp2p {
+	struct ipa *ipa;
+	struct qcom_smem_state *valid_state;
+	struct qcom_smem_state *enabled_state;
+	u32 valid_bit;
+	u32 enabled_bit;
+	u32 clock_query_irq;
+	u32 setup_ready_irq;
+	u32 clock_on;
+	u32 notified;
+	u32 disabled;
+	struct mutex mutex;
+	struct notifier_block panic_notifier;
+};
+
+/**
+ * ipa_smp2p_notify() - use SMP2P to tell modem about IPA clock state
+ * @smp2p:	SMP2P information
+ *
+ * This is called either when the modem has requested it (by triggering
+ * the modem clock query IPA interrupt) or whenever the AP is shutting down
+ * (via a panic notifier).  It sets the two SMP2P state bits--one saying
+ * whether the IPA clock is running, and the other indicating the first bit
+ * is valid.
+ */
+static void ipa_smp2p_notify(struct ipa_smp2p *smp2p)
+{
+	u32 value;
+	u32 mask;
+
+	if (smp2p->notified)
+		return;
+
+	smp2p->clock_on = ipa_clock_get_additional(smp2p->ipa->clock) ? 1 : 0;
+
+	/* Signal whether the clock is enabled */
+	mask = BIT(smp2p->enabled_bit);
+	value = smp2p->clock_on ? mask : 0;
+	qcom_smem_state_update_bits(smp2p->enabled_state, mask, value);
+
+	/* Now indicate that the enabled flag is valid */
+	mask = BIT(smp2p->valid_bit);
+	value = mask;
+	qcom_smem_state_update_bits(smp2p->valid_state, mask, value);
+
+	smp2p->notified = 1;
+}
+
+/* Threaded IRQ handler for modem "ipa-clock-query" SMP2P interrupt */
+static irqreturn_t ipa_smp2p_modem_clk_query_isr(int irq, void *dev_id)
+{
+	struct ipa_smp2p *smp2p = dev_id;
+
+	ipa_smp2p_notify(smp2p);
+
+	return IRQ_HANDLED;
+}
+
+static int ipa_smp2p_panic_notifier(struct notifier_block *nb,
+				    unsigned long action, void *data)
+{
+	struct ipa_smp2p *smp2p;
+
+	smp2p = container_of(nb, struct ipa_smp2p, panic_notifier);
+
+	ipa_smp2p_notify(smp2p);
+
+	if (smp2p->clock_on)
+		ipa_uc_panic_notifier(smp2p->ipa);
+
+	return NOTIFY_DONE;
+}
+
+static int ipa_smp2p_panic_notifier_register(struct ipa_smp2p *smp2p)
+{
+	/* IPA panic handler needs to run before modem shuts down */
+	smp2p->panic_notifier.notifier_call = ipa_smp2p_panic_notifier;
+	smp2p->panic_notifier.priority = INT_MAX;	/* Do it early */
+
+	return atomic_notifier_chain_register(&panic_notifier_list,
+					      &smp2p->panic_notifier);
+}
+
+static void ipa_smp2p_panic_notifier_unregister(struct ipa_smp2p *smp2p)
+{
+	atomic_notifier_chain_unregister(&panic_notifier_list,
+					 &smp2p->panic_notifier);
+}
+
+/* Threaded IRQ handler for modem "ipa-setup-ready" SMP2P interrupt */
+static irqreturn_t ipa_smp2p_modem_setup_ready_isr(int irq, void *dev_id)
+{
+	struct ipa_smp2p *smp2p = dev_id;
+	int ret;
+
+	mutex_lock(&smp2p->mutex);
+	if (!smp2p->disabled) {
+		ret = ipa_setup(smp2p->ipa);
+		WARN(ret, "error %d from IPA setup\n", ret);
+	}
+	mutex_unlock(&smp2p->mutex);
+
+	return IRQ_HANDLED;
+}
+
+/* Initialize SMP2P interrupts */
+static int ipa_smp2p_irq_init(struct ipa_smp2p *smp2p, const char *name,
+			      irq_handler_t handler)
+{
+	unsigned int irq;
+	int ret;
+
+	ret = platform_get_irq_byname(smp2p->ipa->pdev, name);
+	if (ret < 0)
+		return ret;
+	if (!ret)
+		return -EINVAL;		/* IRQ mapping failure */
+	irq = ret;
+
+	ret = request_threaded_irq(irq, NULL, handler, 0, name, smp2p);
+	if (ret)
+		return ret;
+
+	return irq;
+}
+
+static void ipa_smp2p_irq_exit(struct ipa_smp2p *smp2p, u32 irq)
+{
+	free_irq(irq, smp2p);
+}
+
+/* Initialize the IPA SMP2P subsystem */
+struct ipa_smp2p *ipa_smp2p_init(struct ipa *ipa, bool modem_init)
+{
+	struct qcom_smem_state *enabled_state;
+	struct device *dev = &ipa->pdev->dev;
+	struct qcom_smem_state *valid_state;
+	struct ipa_smp2p *smp2p;
+	u32 enabled_bit;
+	u32 valid_bit;
+	int ret;
+
+	valid_state = qcom_smem_state_get(dev, "ipa-clock-enabled-valid",
+					  &valid_bit);
+	if (IS_ERR(valid_state))
+		return ERR_CAST(valid_state);
+	if (valid_bit >= 32)		/* SMEM state mask is 32 bits wide */
+		return ERR_PTR(-EINVAL);
+
+	enabled_state = qcom_smem_state_get(dev, "ipa-clock-enabled",
+					    &enabled_bit);
+	if (IS_ERR(enabled_state))
+		return ERR_CAST(enabled_state);
+	if (enabled_bit >= 32)		/* SMEM state mask is 32 bits wide */
+		return ERR_PTR(-EINVAL);
+
+	smp2p = kzalloc(sizeof(*smp2p), GFP_KERNEL);
+	if (!smp2p)
+		return ERR_PTR(-ENOMEM);
+
+	smp2p->ipa = ipa;
+
+	/* These fields are needed by the clock query interrupt
+	 * handler, so initialize them now.
+	 */
+	mutex_init(&smp2p->mutex);
+	smp2p->valid_state = valid_state;
+	smp2p->valid_bit = valid_bit;
+	smp2p->enabled_state = enabled_state;
+	smp2p->enabled_bit = enabled_bit;
+
+	ret = ipa_smp2p_irq_init(smp2p, "ipa-clock-query",
+				 ipa_smp2p_modem_clk_query_isr);
+	if (ret < 0)
+		goto err_mutex_destroy;
+	smp2p->clock_query_irq = ret;
+
+	ret = ipa_smp2p_panic_notifier_register(smp2p);
+	if (ret)
+		goto err_irq_exit;
+
+	if (modem_init) {
+		/* Result will be non-zero (negative for error) */
+		ret = ipa_smp2p_irq_init(smp2p, "ipa-setup-ready",
+					 ipa_smp2p_modem_setup_ready_isr);
+		if (ret < 0)
+			goto err_notifier_unregister;
+		smp2p->setup_ready_irq = ret;
+	}
+
+	return smp2p;
+
+err_notifier_unregister:
+	ipa_smp2p_panic_notifier_unregister(smp2p);
+err_irq_exit:
+	ipa_smp2p_irq_exit(smp2p, smp2p->clock_query_irq);
+err_mutex_destroy:
+	mutex_destroy(&smp2p->mutex);
+	kfree(smp2p);
+
+	return ERR_PTR(ret);
+}
+
+void ipa_smp2p_exit(struct ipa_smp2p *smp2p)
+{
+	if (smp2p->setup_ready_irq)
+		ipa_smp2p_irq_exit(smp2p, smp2p->setup_ready_irq);
+	ipa_smp2p_panic_notifier_unregister(smp2p);
+	ipa_smp2p_irq_exit(smp2p, smp2p->clock_query_irq);
+	mutex_destroy(&smp2p->mutex);
+	kfree(smp2p);
+}
+
+void ipa_smp2p_disable(struct ipa_smp2p *smp2p)
+{
+	if (smp2p->setup_ready_irq) {
+		mutex_lock(&smp2p->mutex);
+		smp2p->disabled = 1;
+		mutex_unlock(&smp2p->mutex);
+	}
+}
+
+/* Reset state tracking whether we have notified the modem */
+void ipa_smp2p_notify_reset(struct ipa_smp2p *smp2p)
+{
+	u32 mask;
+
+	if (!smp2p->notified)
+		return;
+
+	/* Drop the clock reference if it was taken above */
+	if (smp2p->clock_on) {
+		ipa_clock_put(smp2p->ipa->clock);
+		smp2p->clock_on = 0;
+	}
+
+	/* Reset the clock enabled valid flag */
+	mask = BIT(smp2p->valid_bit);
+	qcom_smem_state_update_bits(smp2p->valid_state, mask, 0);
+
+	/* Mark the clock disabled for good measure... */
+	mask = BIT(smp2p->enabled_bit);
+	qcom_smem_state_update_bits(smp2p->enabled_state, mask, 0);
+
+	smp2p->notified = 0;
+}
diff --git a/drivers/net/ipa/ipa_smp2p.h b/drivers/net/ipa/ipa_smp2p.h
new file mode 100644
index 000000000000..9c7e4339a7b0
--- /dev/null
+++ b/drivers/net/ipa/ipa_smp2p.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _IPA_SMP2P_H_
+#define _IPA_SMP2P_H_
+
+#include <linux/types.h>
+
+struct ipa;
+
+/**
+ * ipa_smp2p_init() - Initialize the IPA SMP2P subsystem
+ * @ipa:	IPA pointer
+ * @modem_init:	Whether the modem is responsible for GSI initialization
+ *
+ * Return:	Pointer to IPA SMP2P info, or a pointer-coded error
+ */
+struct ipa_smp2p *ipa_smp2p_init(struct ipa *ipa, bool modem_init);
+
+/**
+ * ipa_smp2p_exit() - Inverse of ipa_smp2p_init()
+ * @smp2p:	SMP2P information pointer
+ */
+void ipa_smp2p_exit(struct ipa_smp2p *smp2p);
+
+/**
+ * ipa_smp2p_disable() - Prevent "ipa-setup-ready" interrupt handling
+ * @smp2p:	SMP2P information pointer
+ *
+ * Prevent handling of the "setup ready" interrupt from the modem.
+ * This is used before initiating shutdown of the driver.
+ */
+void ipa_smp2p_disable(struct ipa_smp2p *smp2p);
+
+/**
+ * ipa_smp2p_notify_reset() - Reset modem notification state
+ * @smp2p:	SMP2P information pointer
+ *
+ * If the modem crashes it queries the IPA clock state.  In cleaning
+ * up after such a crash this is used to reset some state maintained
+ * for managing this notification.
+ */
+void ipa_smp2p_notify_reset(struct ipa_smp2p *smp2p);
+
+#endif /* _IPA_SMP2P_H_ */
-- 
2.20.1



* [PATCH 15/18] soc: qcom: ipa: support build of IPA code
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (13 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 14/18] soc: qcom: ipa: AP/modem communications Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-12  1:25 ` [PATCH 16/18] MAINTAINERS: add entry for the Qualcomm IPA driver Alex Elder
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas,
	sridhar.samudrala, jakub.kicinski, daniel, mcroce, j.neuschaefer
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

Add build and Kconfig support for the Qualcomm IPA driver.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/Kconfig      |  2 ++
 drivers/net/Makefile     |  1 +
 drivers/net/ipa/Kconfig  | 16 ++++++++++++++++
 drivers/net/ipa/Makefile |  7 +++++++
 4 files changed, 26 insertions(+)
 create mode 100644 drivers/net/ipa/Kconfig
 create mode 100644 drivers/net/ipa/Makefile

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 7a96d168efc4..0603fde43d54 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -387,6 +387,8 @@ source "drivers/net/fddi/Kconfig"
 
 source "drivers/net/hippi/Kconfig"
 
+source "drivers/net/ipa/Kconfig"
+
 config NET_SB1000
 	tristate "General Instruments Surfboard 1000"
 	depends on PNP
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 21cde7e78621..c01f48badba6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -45,6 +45,7 @@ obj-$(CONFIG_ETHERNET) += ethernet/
 obj-$(CONFIG_FDDI) += fddi/
 obj-$(CONFIG_HIPPI) += hippi/
 obj-$(CONFIG_HAMRADIO) += hamradio/
+obj-$(CONFIG_IPA) += ipa/
 obj-$(CONFIG_PLIP) += plip/
 obj-$(CONFIG_PPP) += ppp/
 obj-$(CONFIG_PPP_ASYNC) += ppp/
diff --git a/drivers/net/ipa/Kconfig b/drivers/net/ipa/Kconfig
new file mode 100644
index 000000000000..b1e3f7405992
--- /dev/null
+++ b/drivers/net/ipa/Kconfig
@@ -0,0 +1,16 @@
+config IPA
+	tristate "Qualcomm IPA support"
+	depends on NET
+	select QCOM_QMI_HELPERS
+	select QCOM_MDT_LOADER
+	default n
+	help
+	  Choose Y here to include support for the Qualcomm IP Accelerator
+	  (IPA), a hardware block present in some Qualcomm SoCs.  The IPA
+	  is a programmable protocol processor that is capable of generic
+	  hardware handling of IP packets, including routing, filtering,
+	  and NAT.  Currently the IPA driver supports only basic transport
+	  of network traffic between the AP and modem, on the Qualcomm
+	  SDM845 SoC.
+
+	  If unsure, say N.
diff --git a/drivers/net/ipa/Makefile b/drivers/net/ipa/Makefile
new file mode 100644
index 000000000000..a43039c09a25
--- /dev/null
+++ b/drivers/net/ipa/Makefile
@@ -0,0 +1,7 @@
+obj-$(CONFIG_IPA)	+=	ipa.o
+
+ipa-y			:=	ipa_main.o ipa_clock.o ipa_mem.o \
+				ipa_interrupt.o gsi.o gsi_trans.o \
+				ipa_gsi.o ipa_smp2p.o ipa_uc.o \
+				ipa_endpoint.o ipa_cmd.o ipa_netdev.o \
+				ipa_qmi.o ipa_qmi_msg.o ipa_data-sdm845.o
-- 
2.20.1



* [PATCH 16/18] MAINTAINERS: add entry for the Qualcomm IPA driver
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (14 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 15/18] soc: qcom: ipa: support build of IPA code Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-12  1:25 ` [PATCH 17/18] arm64: dts: sdm845: add IPA information Alex Elder
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas, mchehab+samsung,
	gregkh, nicolas.ferre, paulmck
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

Add an entry in the MAINTAINERS file for the Qualcomm IPA driver.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 MAINTAINERS | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2c2fce72e694..2348a90d4dff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12694,6 +12694,12 @@ L:	alsa-devel@alsa-project.org (moderated for non-subscribers)
 S:	Supported
 F:	sound/soc/qcom/
 
+QCOM IPA DRIVER
+M:	Alex Elder <elder@kernel.org>
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	drivers/net/ipa/
+
 QEMU MACHINE EMULATOR AND VIRTUALIZER SUPPORT
 M:	Gabriel Somlo <somlo@cmu.edu>
 M:	"Michael S. Tsirkin" <mst@redhat.com>
-- 
2.20.1



* [PATCH 17/18] arm64: dts: sdm845: add IPA information
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (15 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 16/18] MAINTAINERS: add entry for the Qualcomm IPA driver Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-12  1:25 ` [PATCH 18/18] arm64: defconfig: enable build of IPA code Alex Elder
  2019-05-15 12:37 ` [PATCH 00/18] net: introduce Qualcomm IPA driver Arnd Bergmann
  18 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas, robh+dt,
	mark.rutland, andy.gross, david.brown
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

Add IPA-related nodes and definitions to "sdm845.dtsi".

Signed-off-by: Alex Elder <elder@linaro.org>
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 51 ++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 5308f1671824..b8b2bb753710 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -18,6 +18,7 @@
 #include <dt-bindings/soc/qcom,rpmh-rsc.h>
 #include <dt-bindings/clock/qcom,gcc-sdm845.h>
 #include <dt-bindings/thermal/thermal.h>
+#include <dt-bindings/interconnect/qcom,sdm845.h>
 
 / {
 	interrupt-parent = <&intc>;
@@ -342,6 +343,17 @@
 			interrupt-controller;
 			#interrupt-cells = <2>;
 		};
+
+		ipa_smp2p_out: ipa-ap-to-modem {
+			qcom,entry-name = "ipa";
+			#qcom,smem-state-cells = <1>;
+		};
+
+		ipa_smp2p_in: ipa-modem-to-ap {
+			qcom,entry-name = "ipa";
+			interrupt-controller;
+			#interrupt-cells = <2>;
+		};
 	};
 
 	smp2p-slpi {
@@ -1090,6 +1102,45 @@
 			};
 		};
 
+		ipa@1e40000 {
+			compatible = "qcom,sdm845-ipa";
+
+			modem-init;
+
+			reg = <0 0x1e40000 0 0x7000>,
+			      <0 0x1e47000 0 0x2000>,
+			      <0 0x1e04000 0 0x2c000>;
+			reg-names = "ipa-reg",
+				    "ipa-shared",
+				    "gsi";
+
+			interrupts-extended =
+					<&intc 0 311 IRQ_TYPE_EDGE_RISING>,
+					<&intc 0 432 IRQ_TYPE_LEVEL_HIGH>,
+					<&ipa_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+					<&ipa_smp2p_in 1 IRQ_TYPE_EDGE_RISING>;
+			interrupt-names = "ipa",
+					  "gsi",
+					  "ipa-clock-query",
+					  "ipa-setup-ready";
+
+			clocks = <&rpmhcc RPMH_IPA_CLK>;
+			clock-names = "core";
+
+			interconnects =
+				<&rsc_hlos MASTER_IPA &rsc_hlos SLAVE_EBI1>,
+				<&rsc_hlos MASTER_IPA &rsc_hlos SLAVE_IMEM>,
+				<&rsc_hlos MASTER_APPSS_PROC &rsc_hlos SLAVE_IPA_CFG>;
+			interconnect-names = "memory",
+					     "imem",
+					     "config";
+
+			qcom,smem-states = <&ipa_smp2p_out 0>,
+					   <&ipa_smp2p_out 1>;
+			qcom,smem-state-names = "ipa-clock-enabled-valid",
+						"ipa-clock-enabled";
+		};
+
 		tcsr_mutex_regs: syscon@1f40000 {
 			compatible = "syscon";
 			reg = <0 0x01f40000 0 0x40000>;
-- 
2.20.1



* [PATCH 18/18] arm64: defconfig: enable build of IPA code
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (16 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 17/18] arm64: dts: sdm845: add IPA information Alex Elder
@ 2019-05-12  1:25 ` Alex Elder
  2019-05-15  8:23   ` Arnd Bergmann
  2019-05-15 12:37 ` [PATCH 00/18] net: introduce Qualcomm IPA driver Arnd Bergmann
  18 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-12  1:25 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas, catalin.marinas,
	will.deacon, andy.gross, olof, maxime.ripard, horms+renesas,
	jagan, stefan.wahren, marc.w.gonzalez, enric.balletbo
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel, Alex Elder

Add CONFIG_IPA to the 64-bit Arm defconfig.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 2d9c39033c1a..4f4d803e563d 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -268,6 +268,7 @@ CONFIG_SMSC911X=y
 CONFIG_SNI_AVE=y
 CONFIG_SNI_NETSEC=y
 CONFIG_STMMAC_ETH=m
+CONFIG_IPA=y
 CONFIG_MDIO_BUS_MUX_MMIOREG=y
 CONFIG_AT803X_PHY=m
 CONFIG_MARVELL_PHY=m
-- 
2.20.1



* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-12  1:24 ` [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h" Alex Elder
@ 2019-05-12  2:34   ` Joe Perches
  2019-05-12 12:15     ` Alex Elder
  2019-05-15  6:59   ` Arnd Bergmann
  1 sibling, 1 reply; 66+ messages in thread
From: Joe Perches @ 2019-05-12  2:34 UTC (permalink / raw)
  To: Alex Elder, davem, arnd, bjorn.andersson, ilias.apalodimas,
	subashab, stranche, yuehaibing
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel

On Sat, 2019-05-11 at 20:24 -0500, Alex Elder wrote:
>  include/soc/qcom/rmnet.h

Should this file be added to the MAINTAINERS file
update in patch 16/18 ?



* Re: [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max()
  2019-05-12  1:24 ` [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max() Alex Elder
@ 2019-05-12  6:33   ` Kalle Valo
  2019-05-12 12:18     ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Kalle Valo @ 2019-05-12  6:33 UTC (permalink / raw)
  To: Alex Elder
  Cc: davem, arnd, bjorn.andersson, ilias.apalodimas, johannes,
	andy.shevchenko, syadagir, mjavid, evgreen, benchan, ejcaruso,
	abhishek.esse, linux-kernel

Alex Elder <elder@linaro.org> writes:

> Define FIELD_MAX(), which supplies the maximum value that can be
> represented by a field value.  Define field_max() as well, to go
> along with the lower-case forms of the field mask functions.
>
> Signed-off-by: Alex Elder <elder@linaro.org>

Via which tree is this going? I assume I do not have to take it unless
someone says otherwise.

-- 
Kalle Valo
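
For readers unfamiliar with the helper described in the quoted commit
message, the following is a minimal userspace sketch of the idea: the
largest value a field can hold is its mask shifted down to bit 0.  The
name field_max_u32() is hypothetical and only stands in for the kernel
macros; this is not the bitfield.h implementation, and __builtin_ctz()
assumes GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical stand-in for FIELD_MAX()/field_max().  Assumes mask is a
 * non-zero run of contiguous set bits.
 */
static uint32_t field_max_u32(uint32_t mask)
{
	/* Shift the mask down so its lowest set bit lands at bit 0 */
	return mask >> __builtin_ctz(mask);
}
```

With a mask of 0x00f0 this yields 15, and with a mask of 0x3f it
yields 63.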


* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-12  2:34   ` Joe Perches
@ 2019-05-12 12:15     ` Alex Elder
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-12 12:15 UTC (permalink / raw)
  To: Joe Perches, davem, arnd, bjorn.andersson, ilias.apalodimas,
	subashab, stranche, yuehaibing
  Cc: syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel

On 5/11/19 9:34 PM, Joe Perches wrote:
> On Sat, 2019-05-11 at 20:24 -0500, Alex Elder wrote:
>>  include/soc/qcom/rmnet.h
> 
> Should this file be added to the MAINTAINERS file
> update in patch 16/18 ?

Sure, that's a good point.  I'll add it when I submit a v2.
Thank you.

					-Alex


* Re: [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max()
  2019-05-12  6:33   ` Kalle Valo
@ 2019-05-12 12:18     ` Alex Elder
  2019-05-12 19:30       ` Johannes Berg
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-12 12:18 UTC (permalink / raw)
  To: Kalle Valo
  Cc: davem, arnd, bjorn.andersson, ilias.apalodimas, johannes,
	andy.shevchenko, syadagir, mjavid, evgreen, benchan, ejcaruso,
	abhishek.esse, linux-kernel

On 5/12/19 1:33 AM, Kalle Valo wrote:
> Alex Elder <elder@linaro.org> writes:
> 
>> Define FIELD_MAX(), which supplies the maximum value that can be
>> represented by a field value.  Define field_max() as well, to go
>> along with the lower-case forms of the field mask functions.
>>
>> Signed-off-by: Alex Elder <elder@linaro.org>
> 
> Via which tree is this going? I assume I do not have to take it unless
> someone says otherwise.

Sorry about that, perhaps I should have posted it separately.

I don't have an answer, but we could avoid having to coordinate
if it went together with the rest.

					-Alex


* Re: [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max()
  2019-05-12 12:18     ` Alex Elder
@ 2019-05-12 19:30       ` Johannes Berg
  0 siblings, 0 replies; 66+ messages in thread
From: Johannes Berg @ 2019-05-12 19:30 UTC (permalink / raw)
  To: Alex Elder, Kalle Valo
  Cc: davem, arnd, bjorn.andersson, ilias.apalodimas, andy.shevchenko,
	syadagir, mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	linux-kernel

On Sun, 2019-05-12 at 07:18 -0500, Alex Elder wrote:
> On 5/12/19 1:33 AM, Kalle Valo wrote:
> > Alex Elder <elder@linaro.org> writes:
> > 
> > > Define FIELD_MAX(), which supplies the maximum value that can be
> > > represented by a field value.  Define field_max() as well, to go
> > > along with the lower-case forms of the field mask functions.
> > > 
> > > Signed-off-by: Alex Elder <elder@linaro.org>
> > 
> > Via which tree is this going? I assume I do not have to take it unless
> > someone says otherwise.
> 
> Sorry about that, perhaps I should have posted it separately.
> 
> I don't have an answer, but we could avoid having to coordinate
> if it went together with the rest.

It's unlikely to conflict, and I don't think anyone really thinks that
the file is "theirs" (being basically standalone), so I think you should
just take it with whatever code that needs it.

johannes



* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-12  1:24 ` [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h" Alex Elder
  2019-05-12  2:34   ` Joe Perches
@ 2019-05-15  6:59   ` Arnd Bergmann
  2019-05-15 12:03     ` Alex Elder
  1 sibling, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15  6:59 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, subashab,
	stranche, YueHaibing, Joe Perches, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:

> diff --git a/include/soc/qcom/rmnet.h b/include/soc/qcom/rmnet.h
> new file mode 100644
> index 000000000000..80dcd6e68c3d
> --- /dev/null
> +++ b/include/soc/qcom/rmnet.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/* Copyright (c) 2013-2018, The Linux Foundation. All rights reserved.
> + * Copyright (C) 2018-2019 Linaro Ltd.
> + */
> +#ifndef _SOC_QCOM_RMNET_H_
> +#define _SOC_QCOM_RMNET_H_
> +
> +#include <linux/types.h>
> +
> +/* Header structure that precedes packets in ETH_P_MAP protocol */
> +struct rmnet_map_header {
> +       u8  pad_len             : 6;
> +       u8  reserved_bit        : 1;
> +       u8  cd_bit              : 1;
> +       u8  mux_id;
> +       __be16 pkt_len;
> +}  __aligned(1);

If we move this into include/soc/, I want the structure to be portable,
and avoid the bit fields. Please use mask/shift operations or the
include/linux/bits.h macros instead to make this work with big-endian
kernels.

     Arnd
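
As an illustration of the mask/shift approach requested above, here is
a minimal userspace sketch that decodes the four header bytes without
bit fields, so the result does not depend on the compiler's bit-field
ordering or on host endianness.  The mask names, struct
map_header_view and map_header_parse() are hypothetical.  The bit
positions assume the layout a little-endian GCC build gives the quoted
struct (pad_len in the low six bits of the first byte, cd_bit in the
top bit).

```c
#include <stdint.h>

/* Hypothetical masks for the first header byte (assumed layout) */
#define MAP_PAD_LEN_MASK	0x3fu	/* bits 5..0 */
#define MAP_RESERVED_FLAG	0x40u	/* bit 6 */
#define MAP_CD_FLAG		0x80u	/* bit 7 */

/* Decoded, host-order view of the header (hypothetical helper type) */
struct map_header_view {
	uint8_t pad_len;
	int cd;
	uint8_t mux_id;
	uint16_t pkt_len;
};

static struct map_header_view map_header_parse(const uint8_t hdr[4])
{
	struct map_header_view v;

	v.pad_len = hdr[0] & MAP_PAD_LEN_MASK;
	v.cd = (hdr[0] & MAP_CD_FLAG) != 0;
	v.mux_id = hdr[1];
	/* pkt_len is big-endian on the wire: high byte first */
	v.pkt_len = (uint16_t)((hdr[2] << 8) | hdr[3]);

	return v;
}
```

In kernel code the same thing would normally be written with GENMASK()
and BIT() plus the bitfield.h accessors rather than open-coded
constants.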


* Re: [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings
  2019-05-12  1:24 ` [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings Alex Elder
@ 2019-05-15  7:03   ` Arnd Bergmann
  2019-05-15 12:04     ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15  7:03 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, Rob Herring,
	Mark Rutland, Andy Gross, David Brown, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
>
> Add the binding definitions for the "qcom,ipa" device tree node.
>
> Signed-off-by: Alex Elder <elder@linaro.org>
> ---
>  .../devicetree/bindings/net/qcom,ipa.txt      | 164 ++++++++++++++++++
>  1 file changed, 164 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/qcom,ipa.txt
>
> diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.txt b/Documentation/devicetree/bindings/net/qcom,ipa.txt
> new file mode 100644
> index 000000000000..2705e198f12e
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/qcom,ipa.txt

For new bindings, we should use the yaml format so we can verify the
device tree files against the binding.

> +
> +- reg:
> +       Resources specifying the physical address spaces of the IPA and GSI.
> +
> +- reg-names:
> +       The names of the two address space ranges defined by the "reg"
> +       property.  Must be:
> +               "ipa-reg"
> +               "ipa-shared"
> +               "gsi"

Those are three, not two ;-)

        Arnd


* Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface
  2019-05-12  1:24 ` [PATCH 08/18] soc: qcom: ipa: the generic software interface Alex Elder
@ 2019-05-15  7:21   ` Arnd Bergmann
  2019-05-15 12:13     ` Alex Elder
  2019-05-15 10:47   ` Arnd Bergmann
  2019-05-15 19:37   ` Arnd Bergmann
  2 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15  7:21 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:

> +/** gsi_gpi_channel_scratch - GPI protocol scratch register
> + *
> + * @max_outstanding_tre:
> + *     Defines the maximum number of TREs allowed in a single transaction
> + *     on a channel (in Bytes).  This determines the amount of prefetch
> + *     performed by the hardware.  We configure this to equal the size of
> + *     the TLV FIFO for the channel.
> + * @outstanding_threshold:
> + *     Defines the threshold (in Bytes) determining when the sequencer
> + *     should update the channel doorbell.  We configure this to equal
> + *     the size of two TREs.
> + */
> +struct gsi_gpi_channel_scratch {
> +       u64 rsvd1;
> +       u16 rsvd2;
> +       u16 max_outstanding_tre;
> +       u16 rsvd3;
> +       u16 outstanding_threshold;
> +} __packed;
> +
> +/** gsi_channel_scratch - channel scratch configuration area
> + *
> + * The exact interpretation of this register is protocol-specific.
> + * We only use GPI channels; see struct gsi_gpi_channel_scratch, above.
> + */
> +union gsi_channel_scratch {
> +       struct gsi_gpi_channel_scratch gpi;
> +       struct {
> +               u32 word1;
> +               u32 word2;
> +               u32 word3;
> +               u32 word4;
> +       } data;
> +} __packed;

What are the exact alignment requirements on these structures,
do you ever need to have them on odd addresses? If not, please
remove the __packed, or add __aligned() with the actual alignment,
e.g. __aligned(4), to let the compiler create better code and
avoid bytewise accesses.

> +/* Init function for GSI.  GSI hardware does not need to be "ready" */
> +int gsi_init(struct gsi *gsi, struct platform_device *pdev, u32 data_count,
> +            const struct gsi_ipa_endpoint_data *data)
> +{
> +       struct resource *res;
> +       resource_size_t size;
> +       unsigned int irq;
> +       int ret;
> +
> +       gsi->dev = &pdev->dev;
> +       init_dummy_netdev(&gsi->dummy_dev);

Can you add a comment here to explain what the 'dummy' device is
needed for?

> +       /* Get GSI memory range and map it */
> +       res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "gsi");
> +       if (!res)
> +               return -ENXIO;
> +
> +       size = resource_size(res);
> +       if (res->start > U32_MAX || size > U32_MAX - res->start)
> +               return -EINVAL;
> +
> +       gsi->virt = ioremap_nocache(res->start, size);
> +       if (!gsi->virt)
> +               return -ENOMEM;

The _nocache() postfix is not needed here, and I find it a bit
confusing, just use plain ioremap, or maybe even
devm_platform_ioremap_resource() to save the
platform_get_resource_byname().

> +       ret = request_irq(irq, gsi_isr, 0, "gsi", gsi);
> +       if (ret)
> +               goto err_unmap_virt;
> +       gsi->irq = irq;
> +
> +       ret = enable_irq_wake(gsi->irq);
> +       if (ret)
> +               dev_err(gsi->dev, "error %d enabling gsi wake irq\n", ret);
> +       gsi->irq_wake_enabled = ret ? 0 : 1;
> +
> +       spin_lock_init(&gsi->spinlock);
> +       mutex_init(&gsi->mutex);

This looks a bit dangerous if you can ever get to the point of
having a pending interrupt before the structure is fully initialized.
This can probably not happen in practice, but it's better to request
the interrupts last to be on the safe side.

> +/* Wait for all transaction activity on a channel to complete */
> +void gsi_channel_trans_quiesce(struct gsi *gsi, u32 channel_id)
> +{
> +       struct gsi_channel *channel = &gsi->channel[channel_id];
> +       struct gsi_trans_info *trans_info;
> +       struct gsi_trans *trans = NULL;
> +       struct gsi_evt_ring *evt_ring;
> +       struct list_head *list;
> +       unsigned long flags;
> +
> +       trans_info = &channel->trans_info;
> +       evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
> +
> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
> +
> +       /* Find the last list to which a transaction was added */
> +       if (!list_empty(&trans_info->alloc))
> +               list = &trans_info->alloc;
> +       else if (!list_empty(&trans_info->pending))
> +               list = &trans_info->pending;
> +       else if (!list_empty(&trans_info->complete))
> +               list = &trans_info->complete;
> +       else if (!list_empty(&trans_info->polled))
> +               list = &trans_info->polled;
> +       else
> +               list = NULL;
> +
> +       if (list) {
> +               /* The last entry on this list is the last one allocated.
> +                * Grab a reference so we can wait for it.
> +                */
> +               trans = list_last_entry(list, struct gsi_trans, links);
> +               refcount_inc(&trans->refcount);
> +       }
> +
> +       spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
> +
> +       /* If there is one, wait for it to complete */
> +       if (trans) {
> +               wait_for_completion(&trans->completion);

Since you are waiting here, you clearly can't be called
from interrupt context, or with interrupts disabled, so it's
clearer to use spin_lock_irq() instead of spin_lock_irqsave().

I generally try to avoid the _irqsave versions altogether, unless
it is really needed for a function that is called both from
irq-disabled and irq-enabled context.

     Arnd


* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-12  1:24 ` [PATCH 09/18] soc: qcom: ipa: GSI transactions Alex Elder
@ 2019-05-15  7:34   ` Arnd Bergmann
  2019-05-15 12:25     ` Alex Elder
  2019-05-17 18:08     ` Alex Elder
  0 siblings, 2 replies; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15  7:34 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	Linux Kernel Mailing List

> +static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
> +                              u32 len, bool last_tre, bool bei,
> +                              enum ipa_cmd_opcode opcode)
> +{
> +       struct gsi_tre tre;
> +
> +       tre.addr = cpu_to_le64(addr);
> +       tre.len_opcode = gsi_tre_len_opcode(opcode, len);
> +       tre.reserved = 0;
> +       tre.flags = gsi_tre_flags(last_tre, bei, opcode);
> +
> +       *dest_tre = tre;        /* Write TRE as a single (16-byte) unit */
> +}

Have you checked that the atomic write is actually what happens here,
by looking at the compiler output? You might need to add a 'volatile'
qualifier to the dest_tre argument so the temporary structure doesn't
get optimized away here.
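
A userspace sketch of the suggested change follows.  The names struct
tre and tre_fill() are hypothetical, and the 16-byte layout is an
assumption based on the fields assigned in the quoted function (the
driver's struct uses little-endian types).  Note that 'volatile' only
obliges the compiler to perform the store to the destination; it does
not by itself guarantee a single 16-byte access, which is why checking
the generated code is still necessary.

```c
#include <stdint.h>

/* Assumed 16-byte TRE layout, host-endian for simplicity */
struct tre {
	uint64_t addr;
	uint16_t len_opcode;
	uint16_t reserved;
	uint32_t flags;
};

static void tre_fill(volatile struct tre *dest, uint64_t addr,
		     uint16_t len_opcode, uint32_t flags)
{
	struct tre tre;

	/* Build the whole entry in a local copy first */
	tre.addr = addr;
	tre.len_opcode = len_opcode;
	tre.reserved = 0;
	tre.flags = flags;

	/* Store through the volatile-qualified pointer, so the compiler
	 * cannot drop this assignment or defer it past later accesses
	 * to volatile objects.
	 */
	*dest = tre;
}
```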

> +/* Cancel a channel's pending transactions */
> +void gsi_channel_trans_cancel_pending(struct gsi_channel *channel)
> +{
> +       struct gsi_trans_info *trans_info = &channel->trans_info;
> +       u32 evt_ring_id = channel->evt_ring_id;
> +       struct gsi *gsi = channel->gsi;
> +       struct gsi_evt_ring *evt_ring;
> +       struct gsi_trans *trans;
> +       unsigned long flags;
> +
> +       evt_ring = &gsi->evt_ring[evt_ring_id];
> +
> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
> +
> +       list_for_each_entry(trans, &trans_info->pending, links)
> +               trans->result = -ECANCELED;
> +
> +       list_splice_tail_init(&trans_info->pending, &trans_info->complete);
> +
> +       spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
> +
> +       spin_lock_irqsave(&gsi->spinlock, flags);
> +
> +       if (gsi->event_enable_bitmap & BIT(evt_ring_id))
> +               gsi_event_handle(gsi, evt_ring_id);
> +
> +       spin_unlock_irqrestore(&gsi->spinlock, flags);
> +}

That is a lot of irqsave()/irqrestore() operations. Do you actually call
all of these functions from hardirq context?

      Arnd


* Re: [PATCH 12/18] soc: qcom: ipa: immediate commands
  2019-05-12  1:25 ` [PATCH 12/18] soc: qcom: ipa: immediate commands Alex Elder
@ 2019-05-15  8:16   ` Arnd Bergmann
  2019-05-15 12:35     ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15  8:16 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:

> +/* Initialize header space in IPA local memory */
> +int ipa_cmd_hdr_init_local(struct ipa *ipa, u32 offset, u32 size)
> +{
> +       struct ipa_imm_cmd_hw_hdr_init_local *payload;
> +       struct device *dev = &ipa->pdev->dev;
> +       dma_addr_t addr;
> +       void *virt;
> +       u32 flags;
> +       u32 max;
> +       int ret;
> +
> +       /* Note: size *can* be zero in this case */
> +       if (size > field_max(IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK))
> +               return -EINVAL;
> +
> +       max = field_max(IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
> +       if (offset > max || ipa->shared_offset > max - offset)
> +               return -EINVAL;
> +       offset += ipa->shared_offset;
> +
> +       /* A zero-filled buffer of the right size is all that's required */
> +       virt = dma_alloc_coherent(dev, size, &addr, GFP_KERNEL);
> +       if (!virt)
> +               return -ENOMEM;
> +
> +       payload = kzalloc(sizeof(*payload), GFP_KERNEL);
> +       if (!payload) {
> +               ret = -ENOMEM;
> +               goto out_dma_free;
> +       }
> +
> +       payload->hdr_table_addr = addr;
> +       flags = u32_encode_bits(size, IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK);
> +       flags |= u32_encode_bits(offset, IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
> +       payload->flags = flags;
> +
> +       ret = ipa_cmd(ipa, IPA_CMD_HDR_INIT_LOCAL, payload, sizeof(*payload));
> +
> +       kfree(payload);
> +out_dma_free:
> +       dma_free_coherent(dev, size, virt, addr);
> +
> +       return ret;
> +}

This looks rather strange. I think I looked at it before and you explained
it, but I have since forgotten what you do it for, so I assume everyone else
that tries to understand this will have problems too.

The issue I see is that you do an expensive dma_alloc_coherent()
but then never actually use the pointer returned by it, only the
dma address that cannot be turned back into a virtual address
in order to access the data in it.

If you can't actually use payload->hdr_table_addr, why even allocate
it here?

     Arnd


* Re: [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller
  2019-05-12  1:25 ` [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller Alex Elder
@ 2019-05-15  8:21   ` Arnd Bergmann
  2019-05-15 12:46     ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15  8:21 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
>
> This patch includes the code that implements a Linux network device,
> using one TX and one RX IPA endpoint.  It is used to implement the
> network device representing the modem and its connection to wireless
> networks.  There are only a few things that are really modem-specific
> though, and they aren't clearly called out here.  Such distinctions
> will be made clearer if we wish to support a network device for
> anything other than the modem.

This does not seem to do much at all, as far as I can see it's a fairly
small abstraction between the linux netdev layer and the actual
implementation. Could you just merge this file into whichever file
it interacts with most closely, and open-code the wrappers there?

      Arnd


* Re: [PATCH 18/18] arm64: defconfig: enable build of IPA code
  2019-05-12  1:25 ` [PATCH 18/18] arm64: defconfig: enable build of IPA code Alex Elder
@ 2019-05-15  8:23   ` Arnd Bergmann
  2019-05-15 12:49     ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15  8:23 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, Catalin Marinas,
	Will Deacon, Andy Gross, Olof Johansson, Maxime Ripard,
	Simon Horman, Jagan Teki, Stefan Wahren, Marc Gonzalez,
	Enric Balletbo i Serra, syadagir, mjavid, evgreen, Ben Chan,
	Eric Caruso, abhishek.esse, Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:

> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 2d9c39033c1a..4f4d803e563d 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -268,6 +268,7 @@ CONFIG_SMSC911X=y
>  CONFIG_SNI_AVE=y
>  CONFIG_SNI_NETSEC=y
>  CONFIG_STMMAC_ETH=m
> +CONFIG_IPA=y
>  CONFIG_MDIO_BUS_MUX_MMIOREG=y
>  CONFIG_AT803X_PHY=m
>  CONFIG_MARVELL_PHY=m

Since the device is not needed for booting, please make this
CONFIG_IPA=m instead to keep the kernel image a little smaller.

     Arnd


* Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface
  2019-05-12  1:24 ` [PATCH 08/18] soc: qcom: ipa: the generic software interface Alex Elder
  2019-05-15  7:21   ` Arnd Bergmann
@ 2019-05-15 10:47   ` Arnd Bergmann
  2019-05-15 13:32     ` Alex Elder
  2019-05-15 19:37   ` Arnd Bergmann
  2 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15 10:47 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:

The per-event interrupt handling seems to be more complex than
necessary:

> +/* Enable or disable an event interrupt */
> +static void
> +_gsi_irq_control_event(struct gsi *gsi, u32 evt_ring_id, bool enable)
> +{
> +       u32 mask = BIT(evt_ring_id);
> +       u32 val;
> +
> +       if (enable)
> +               gsi->event_enable_bitmap |= mask;
> +       else
> +               gsi->event_enable_bitmap &= ~mask;
> +
> +       val = gsi->event_enable_bitmap;
> +       iowrite32(val, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
> +}
> +
> +static void gsi_irq_enable_event(struct gsi *gsi, u32 evt_ring_id)
> +{
> +       _gsi_irq_control_event(gsi, evt_ring_id, true);

You maintain a bitmap here of the enabled-state, and have
to use a spinlock to ensure that the two are in sync.

> +/* Returns true if the interrupt state (enabled or not) changed */
> +static bool gsi_channel_intr(struct gsi_channel *channel, bool enable)
> +{
> +       u32 evt_ring_id = channel->evt_ring_id;
> +       struct gsi *gsi = channel->gsi;
> +       u32 mask = BIT(evt_ring_id);
> +       unsigned long flags;
> +       bool different;
> +       u32 enabled;
> +
> +       spin_lock_irqsave(&gsi->spinlock, flags);
> +
> +       enabled = gsi->event_enable_bitmap & mask;
> +       different = enable == !enabled;
> +
> +       if (different) {
> +               if (enabled)
> +                       gsi_irq_disable_event(channel->gsi, evt_ring_id);
> +               else
> +                       gsi_irq_enable_event(channel->gsi, evt_ring_id);
> +       }
> +
> +       spin_unlock_irqrestore(&gsi->spinlock, flags);
> +
> +       return different;
> +}

This gets called for each active channel, so you repeatedly
have to get the spinlock and read the irq-enabled state for it.

> +static void gsi_isr_ieob(struct gsi *gsi)
> +{
> +       u32 evt_mask;
> +
> +       evt_mask = ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_OFFSET);
> +       evt_mask &= ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
> +       iowrite32(evt_mask, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET);
> +
> +       while (evt_mask) {
> +               u32 evt_ring_id = __ffs(evt_mask);
> +
> +               evt_mask ^= BIT(evt_ring_id);
> +
> +               gsi_event_handle(gsi, evt_ring_id);
> +       }
> +}

However, you start out by clearing all bits here.

Why not skip the clearing and leave the interrupts enabled,
while moving the GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET
write (for a single channel that was completed) to the end of
gsi_channel_poll()?

Something like

static void gsi_isr_ieob(struct gsi *gsi)
{
      u32 evt_mask;

      evt_mask = ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_OFFSET);
      while (evt_mask) {
               u32 evt_ring_id = __ffs(evt_mask);
               evt_mask ^= BIT(evt_ring_id);

               napi_schedule(gsi->evt_ring[evt_ring_id].channel.napi);
      }
}

I also removed the GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET
read here, as that is probably more expensive than calling napi_schedule()
for a channel that is already scheduled. Most of the time, I'd expect the
interrupt to only signal a single channel anyway.
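
As a rough userspace sketch of the bit-walking loop above (not driver
code): lowest_set_bit() is a hypothetical stand-in for the kernel's
__ffs(), and collect_ring_ids() is a hypothetical helper that records
each ring ID where the real handler would call napi_schedule().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's __ffs(): index of the lowest set
 * bit.  The caller must pass a nonzero mask.
 */
static inline unsigned int lowest_set_bit(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		bit++;
	}

	return bit;
}

/* Hypothetical helper: record each pending event ring ID in ascending
 * order, where the real handler would schedule NAPI for that ring.
 * Returns the number of IDs written; ids[] needs room for 32 entries.
 */
static inline unsigned int collect_ring_ids(uint32_t evt_mask,
					    unsigned int *ids)
{
	unsigned int count = 0;

	while (evt_mask) {
		unsigned int evt_ring_id = lowest_set_bit(evt_mask);

		/* Clear the bit just handled so the loop terminates */
		evt_mask ^= UINT32_C(1) << evt_ring_id;
		ids[count++] = evt_ring_id;
	}

	return count;
}
```

Each set bit is visited exactly once, so the cost scales with the
number of pending rings rather than with the width of the register.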

        Arnd


* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-15  6:59   ` Arnd Bergmann
@ 2019-05-15 12:03     ` Alex Elder
  2019-05-16  1:09       ` Subash Abhinov Kasiviswanathan
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, subashab,
	stranche, YueHaibing, Joe Perches, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

On 5/15/19 1:59 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
> 
>> diff --git a/include/soc/qcom/rmnet.h b/include/soc/qcom/rmnet.h
>> new file mode 100644
>> index 000000000000..80dcd6e68c3d
>> --- /dev/null
>> +++ b/include/soc/qcom/rmnet.h
>> @@ -0,0 +1,38 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +/* Copyright (c) 2013-2018, The Linux Foundation. All rights reserved.
>> + * Copyright (C) 2018-2019 Linaro Ltd.
>> + */
>> +#ifndef _SOC_QCOM_RMNET_H_
>> +#define _SOC_QCOM_RMNET_H_
>> +
>> +#include <linux/types.h>
>> +
>> +/* Header structure that precedes packets in ETH_P_MAP protocol */
>> +struct rmnet_map_header {
>> +       u8  pad_len             : 6;
>> +       u8  reserved_bit        : 1;
>> +       u8  cd_bit              : 1;
>> +       u8  mux_id;
>> +       __be16 pkt_len;
>> +}  __aligned(1);
> 
> If we move this into include/soc/, I want the structure to be portable,
> and avoid the bit fields. Please use mask/shift operations or the
> include/linux/bits.h macros instead to make this work with big-endian
> kernels.

Sure, I'll do that.  I did that everywhere else in the driver,
but here I just tried to preserve the original code as I moved
it.  I will update at least these structures, and all existing
code (plus the IPA code) to use field masks.
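
A minimal userspace sketch of what such a conversion could look like,
not the final code.  The mask and helper names are hypothetical, and
the bit positions assume the layout GCC gives the bit-fields quoted
above on a little-endian build (pad_len in bits 0-5, reserved_bit in
bit 6, cd_bit in bit 7).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical masks for the first byte of the MAP header */
#define MAP_PAD_LEN_MASK	0x3fu	/* bits 0-5 */
#define MAP_RESERVED_MASK	0x40u	/* bit 6 */
#define MAP_CD_MASK		0x80u	/* bit 7 */

struct map_header {
	uint8_t flags;		/* pad_len, reserved and cd bits */
	uint8_t mux_id;
	uint8_t pkt_len[2];	/* big-endian on the wire */
};

/* Build the flags byte from its fields; pad_len must fit in 6 bits */
static inline uint8_t map_flags_encode(unsigned int pad_len, bool cd)
{
	assert(pad_len <= MAP_PAD_LEN_MASK);

	return (uint8_t)(pad_len | (cd ? MAP_CD_MASK : 0u));
}

static inline unsigned int map_pad_len(uint8_t flags)
{
	return flags & MAP_PAD_LEN_MASK;
}

static inline bool map_cd(uint8_t flags)
{
	return (flags & MAP_CD_MASK) != 0;
}

/* Store and load the length byte by byte, so host endianness is moot */
static inline void map_pkt_len_set(struct map_header *hdr, uint16_t len)
{
	hdr->pkt_len[0] = (uint8_t)(len >> 8);
	hdr->pkt_len[1] = (uint8_t)(len & 0xff);
}

static inline uint16_t map_pkt_len(const struct map_header *hdr)
{
	return (uint16_t)((hdr->pkt_len[0] << 8) | hdr->pkt_len[1]);
}
```

Because every access goes through an explicit mask, the result does not
depend on how a given compiler or endianness orders bit-fields.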

					-Alex
> 
>      Arnd
> 



* Re: [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings
  2019-05-15  7:03   ` Arnd Bergmann
@ 2019-05-15 12:04     ` Alex Elder
  2019-05-15 16:50       ` Rob Herring
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, Rob Herring,
	Mark Rutland, Andy Gross, David Brown, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

On 5/15/19 2:03 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
>>
>> Add the binding definitions for the "qcom,ipa" device tree node.
>>
>> Signed-off-by: Alex Elder <elder@linaro.org>
>> ---
>>  .../devicetree/bindings/net/qcom,ipa.txt      | 164 ++++++++++++++++++
>>  1 file changed, 164 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/net/qcom,ipa.txt
>>
>> diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.txt b/Documentation/devicetree/bindings/net/qcom,ipa.txt
>> new file mode 100644
>> index 000000000000..2705e198f12e
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/net/qcom,ipa.txt
> 
> For new bindings, we should use the yaml format so we can verify the
> device tree files against the binding.

OK.  I didn't realize that was upstream yet.  I will convert.

>> +
>> +- reg:
>> +       Resources specifying the physical address spaces of the IPA and GSI.
>> +
>> +- reg-names:
>> +       The names of the two address space ranges defined by the "reg"
>> +       property.  Must be:
>> +               "ipa-reg"
>> +               "ipa-shared"
>> +               "gsi"
> 
> Those are three, not two ;-)

Oops!  I added one recently and I guess I missed that.  Thanks
for catching it.

					-Alex

> 
>         Arnd
> 



* Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface
  2019-05-15  7:21   ` Arnd Bergmann
@ 2019-05-15 12:13     ` Alex Elder
  2019-05-15 12:40       ` Arnd Bergmann
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:13 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 2:21 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
> 
>> +/** gsi_gpi_channel_scratch - GPI protocol scratch register
>> + *
>> + * @max_outstanding_tre:
>> + *     Defines the maximum number of TREs allowed in a single transaction
>> + *     on a channel (in Bytes).  This determines the amount of prefetch
>> + *     performed by the hardware.  We configure this to equal the size of
>> + *     the TLV FIFO for the channel.
>> + * @outstanding_threshold:
>> + *     Defines the threshold (in Bytes) determining when the sequencer
>> + *     should update the channel doorbell.  We configure this to equal
>> + *     the size of two TREs.
>> + */
>> +struct gsi_gpi_channel_scratch {
>> +       u64 rsvd1;
>> +       u16 rsvd2;
>> +       u16 max_outstanding_tre;
>> +       u16 rsvd3;
>> +       u16 outstanding_threshold;
>> +} __packed;
>> +
>> +/** gsi_channel_scratch - channel scratch configuration area
>> + *
>> + * The exact interpretation of this register is protocol-specific.
>> + * We only use GPI channels; see struct gsi_gpi_channel_scratch, above.
>> + */
>> +union gsi_channel_scratch {
>> +       struct gsi_gpi_channel_scratch gpi;
>> +       struct {
>> +               u32 word1;
>> +               u32 word2;
>> +               u32 word3;
>> +               u32 word4;
>> +       } data;
>> +} __packed;
> 
> What are the exact alignment requirements on these structures,
> do you ever need to have them on odd addresses? If not, please
> remove the __packed, or add __aligned() with the actual alignment,
> e.g. __aligned(4), to let the compiler create better code and
> avoid bytewise accesses.

Honestly I don't know but I would guess they've actually
got alignment requirements consistent with the C standard...
Many, many structures had the __packed attribute attached
in the original code.  I removed most but apparently not
all.  I will remove the __packed here, and will scan through
the rest of the code for other similar instances and will
remove those if appropriate as well.
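
One way to check that guess at build time is sketched below in
userspace C.  The struct and union names are hypothetical copies of the
ones quoted above, with <stdint.h> types standing in for u16/u32/u64.
It assumes the only requirement is that the layout stays 16 bytes with
no padding once __packed is gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same members as the quoted GPI scratch structure, without __packed */
struct gpi_scratch {
	uint64_t rsvd1;
	uint16_t rsvd2;
	uint16_t max_outstanding_tre;
	uint16_t rsvd3;
	uint16_t outstanding_threshold;
};

union channel_scratch {
	struct gpi_scratch gpi;
	struct {
		uint32_t word1;
		uint32_t word2;
		uint32_t word3;
		uint32_t word4;
	} data;
};

/* Natural alignment must not introduce any padding */
static_assert(sizeof(struct gpi_scratch) == 16,
	      "GPI scratch must be exactly four 32-bit words");
static_assert(offsetof(struct gpi_scratch, max_outstanding_tre) == 10,
	      "unexpected padding before max_outstanding_tre");
static_assert(offsetof(struct gpi_scratch, outstanding_threshold) == 14,
	      "unexpected padding before outstanding_threshold");
static_assert(sizeof(union channel_scratch) == 16,
	      "scratch union must match the four scratch words");
```

If the assertions hold, dropping __packed changes nothing about the
layout and only lets the compiler assume aligned accesses.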

>> +/* Init function for GSI.  GSI hardware does not need to be "ready" */
>> +int gsi_init(struct gsi *gsi, struct platform_device *pdev, u32 data_count,
>> +            const struct gsi_ipa_endpoint_data *data)
>> +{
>> +       struct resource *res;
>> +       resource_size_t size;
>> +       unsigned int irq;
>> +       int ret;
>> +
>> +       gsi->dev = &pdev->dev;
>> +       init_dummy_netdev(&gsi->dummy_dev);
> 
> Can you add a comment here to explain what the 'dummy' device is
> needed for?

Yes, good idea.

FYI it's needed because the GSI code is not a "real"
network device (that, where needed, is implemented in
"ipa_netdev.c", two logical layers up), but in order
to use NAPI there needs to be one.


>> +       /* Get GSI memory range and map it */
>> +       res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "gsi");
>> +       if (!res)
>> +               return -ENXIO;
>> +
>> +       size = resource_size(res);
>> +       if (res->start > U32_MAX || size > U32_MAX - res->start)
>> +               return -EINVAL;
>> +
>> +       gsi->virt = ioremap_nocache(res->start, size);
>> +       if (!gsi->virt)
>> +               return -ENOMEM;
> 
> The _nocache() postfix is not needed here, and I find it a bit
> confusing, just use plain ioremap, or maybe even
> devm_platform_ioremap_resource() to save the
> platform_get_resource_byname().

OK good idea.  This was in the original code and I neglected
to chase this down.  Thank you for catching it.

>> +       ret = request_irq(irq, gsi_isr, 0, "gsi", gsi);
>> +       if (ret)
>> +               goto err_unmap_virt;
>> +       gsi->irq = irq;
>> +
>> +       ret = enable_irq_wake(gsi->irq);
>> +       if (ret)
>> +               dev_err(gsi->dev, "error %d enabling gsi wake irq\n", ret);
>> +       gsi->irq_wake_enabled = ret ? 0 : 1;
>> +
>> +       spin_lock_init(&gsi->spinlock);
>> +       mutex_init(&gsi->mutex);
> 
> This looks a bit dangerous if you can ever get to the point of
> having a pending interrupt before the structure is fully initialized.
> This can probably not happen in practice, but it's better to request
> the interrupts last to be on the safe side.

Understood.  I'll fix that.

>> +/* Wait for all transaction activity on a channel to complete */
>> +void gsi_channel_trans_quiesce(struct gsi *gsi, u32 channel_id)
>> +{
>> +       struct gsi_channel *channel = &gsi->channel[channel_id];
>> +       struct gsi_trans_info *trans_info;
>> +       struct gsi_trans *trans = NULL;
>> +       struct gsi_evt_ring *evt_ring;
>> +       struct list_head *list;
>> +       unsigned long flags;
>> +
>> +       trans_info = &channel->trans_info;
>> +       evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
>> +
>> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
>> +
>> +       /* Find the last list to which a transaction was added */
>> +       if (!list_empty(&trans_info->alloc))
>> +               list = &trans_info->alloc;
>> +       else if (!list_empty(&trans_info->pending))
>> +               list = &trans_info->pending;
>> +       else if (!list_empty(&trans_info->complete))
>> +               list = &trans_info->complete;
>> +       else if (!list_empty(&trans_info->polled))
>> +               list = &trans_info->polled;
>> +       else
>> +               list = NULL;
>> +
>> +       if (list) {
>> +               struct gsi_trans *trans;
>> +
>> +               /* The last entry on this list is the last one allocated.
>> +                * Grab a reference so we can wait for it.
>> +                */
>> +               trans = list_last_entry(list, struct gsi_trans, links);
>> +               refcount_inc(&trans->refcount);
>> +       }
>> +
>> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
>> +
>> +       /* If there is one, wait for it to complete */
>> +       if (trans) {
>> +               wait_for_completion(&trans->completion);
> 
> Since you are waiting here, you clearly can't be called
> from interrupt context, or with interrupts disabled, so it's
> clearer to use spin_lock_irq() instead of spin_lock_irqsave().
> 
> I generally try to avoid the _irqsave versions altogether, unless
> it is really needed for a function that is called both from
> irq-disabled and irq-enabled context.

OK.  And I appreciate what you're saying here because I do prefer
code that communicates more about the context in ways like
you describe.

Thank you.

					-Alex

> 
>      Arnd
> 



* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-15  7:34   ` Arnd Bergmann
@ 2019-05-15 12:25     ` Alex Elder
  2019-05-15 20:50       ` Arnd Bergmann
  2019-05-17 18:08     ` Alex Elder
  1 sibling, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:25 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 2:34 AM, Arnd Bergmann wrote:
>> +static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
>> +                              u32 len, bool last_tre, bool bei,
>> +                              enum ipa_cmd_opcode opcode)
>> +{
>> +       struct gsi_tre tre;
>> +
>> +       tre.addr = cpu_to_le64(addr);
>> +       tre.len_opcode = gsi_tre_len_opcode(opcode, len);
>> +       tre.reserved = 0;
>> +       tre.flags = gsi_tre_flags(last_tre, bei, opcode);
>> +
>> +       *dest_tre = tre;        /* Write TRE as a single (16-byte) unit */
>> +}
> 
> Have you checked that the atomic write is actually what happens here,
> by looking at the compiler output? You might need to add a 'volatile'
> qualifier to the dest_tre argument so the temporary structure doesn't
> get optimized away here.

No, and I really should have checked, since I'm assuming that's
what will happen.  I will check, and may well add the volatile
regardless.

>> +/* Cancel a channel's pending transactions */
>> +void gsi_channel_trans_cancel_pending(struct gsi_channel *channel)
>> +{
>> +       struct gsi_trans_info *trans_info = &channel->trans_info;
>> +       u32 evt_ring_id = channel->evt_ring_id;
>> +       struct gsi *gsi = channel->gsi;
>> +       struct gsi_evt_ring *evt_ring;
>> +       struct gsi_trans *trans;
>> +       unsigned long flags;
>> +
>> +       evt_ring = &gsi->evt_ring[evt_ring_id];
>> +
>> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
>> +
>> +       list_for_each_entry(trans, &trans_info->pending, links)
>> +               trans->result = -ECANCELED;
>> +
>> +       list_splice_tail_init(&trans_info->pending, &trans_info->complete);
>> +
>> +       spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
>> +
>> +       spin_lock_irqsave(&gsi->spinlock, flags);
>> +
>> +       if (gsi->event_enable_bitmap & BIT(evt_ring_id))
>> +               gsi_event_handle(gsi, evt_ring_id);
>> +
>> +       spin_unlock_irqrestore(&gsi->spinlock, flags);
>> +}
> 
> That is a lot of irqsave()/irqrestore() operations. Do you actually call
> all of these functions from hardirq context?

The transaction list is definitely updated in IRQ context,
but I think it is no longer updated in hardirq context (the
softirq was a recent change).  This particular function is
definitely not called in a hardirq context, so I can remove
the irqsave/irqrestore.

I'll survey my spinlock use throughout the driver and will
remove any irqsave/irqrestore used in non-hardirq contexts.

Thanks.

					-Alex


>       Arnd
> 



* Re: [PATCH 12/18] soc: qcom: ipa: immediate commands
  2019-05-15  8:16   ` Arnd Bergmann
@ 2019-05-15 12:35     ` Alex Elder
  2019-05-18  0:34       ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:35 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 3:16 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
> 
>> +/* Initialize header space in IPA local memory */
>> +int ipa_cmd_hdr_init_local(struct ipa *ipa, u32 offset, u32 size)
>> +{
>> +       struct ipa_imm_cmd_hw_hdr_init_local *payload;
>> +       struct device *dev = &ipa->pdev->dev;
>> +       dma_addr_t addr;
>> +       void *virt;
>> +       u32 flags;
>> +       u32 max;
>> +       int ret;
>> +
>> +       /* Note: size *can* be zero in this case */
>> +       if (size > field_max(IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK))
>> +               return -EINVAL;
>> +
>> +       max = field_max(IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
>> +       if (offset > max || ipa->shared_offset > max - offset)
>> +               return -EINVAL;
>> +       offset += ipa->shared_offset;
>> +
>> +       /* A zero-filled buffer of the right size is all that's required */
>> +       virt = dma_alloc_coherent(dev, size, &addr, GFP_KERNEL);
>> +       if (!virt)
>> +               return -ENOMEM;
>> +
>> +       payload = kzalloc(sizeof(*payload), GFP_KERNEL);
>> +       if (!payload) {
>> +               ret = -ENOMEM;
>> +               goto out_dma_free;
>> +       }
>> +
>> +       payload->hdr_table_addr = addr;
>> +       flags = u32_encode_bits(size, IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK);
>> +       flags |= u32_encode_bits(offset, IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
>> +       payload->flags = flags;
>> +
>> +       ret = ipa_cmd(ipa, IPA_CMD_HDR_INIT_LOCAL, payload, sizeof(*payload));
>> +
>> +       kfree(payload);
>> +out_dma_free:
>> +       dma_free_coherent(dev, size, virt, addr);
>> +
>> +       return ret;
>> +}
> 
> This looks rather strange. I think I looked at it before and you explained
> it, but I have since forgotten what you do it for, so I assume everyone else
> that tries to understand this will have problems too.

This is a bug.  I think I misunderstood why you were
puzzled before.  Now I get it.  I need to save that
DMA address and not free it at the end of the function
(except on error).

Here's what I think happened.  There are two parts of
initializing these tables.  One part tells the hardware
where the table is located.  Another part zeroes the
contents of those tables.  (The zeroing part could be
accomplished when the table is allocated, but there
are cases where they have to be zeroed again without
needing to tell the hardware so we need to at least
be able to do that independently.)

I think I was assuming this was the function that did
the zeroing, and I thought that adding the comment about
"all we need is a zero-filled buffer" addressed what
you thought should be made clearer.

I will definitely fix this, and I'm glad you repeated
it so I was forced to take another look.
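
The intended ownership can be sketched in userspace C as follows.  This
is only an illustration under stated assumptions: calloc() stands in
for dma_alloc_coherent(), and struct hdr_table, table_cmd_fn and the
cmd_ok()/cmd_fail() hooks are hypothetical names, not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the header table.  In the driver the buffer
 * would come from dma_alloc_coherent() and its DMA address would be
 * saved alongside the virtual address.
 */
struct hdr_table {
	void *virt;
	size_t size;
};

/* Hypothetical command hook; returns 0 on success, nonzero on error */
typedef int (*table_cmd_fn)(const void *virt, size_t size);

/* Allocate a zero-filled table and tell the "hardware" where it is.
 * The buffer is kept on success and released only on failure.
 */
static inline int hdr_table_init(struct hdr_table *table, size_t size,
				 table_cmd_fn cmd)
{
	int ret;

	table->virt = calloc(1, size ? size : 1);
	if (!table->virt)
		return -1;
	table->size = size;

	ret = cmd(table->virt, size);
	if (ret) {
		free(table->virt);
		table->virt = NULL;
		table->size = 0;
	}

	return ret;
}

/* Zero the table again without re-issuing the command */
static inline void hdr_table_zero(struct hdr_table *table)
{
	assert(table->virt);
	memset(table->virt, 0, table->size);
}

/* Release the table when the owner is torn down */
static inline void hdr_table_exit(struct hdr_table *table)
{
	free(table->virt);
	table->virt = NULL;
	table->size = 0;
}

/* Trivial hooks used to exercise both paths */
static inline int cmd_ok(const void *virt, size_t size)
{
	(void)virt;
	(void)size;
	return 0;
}

static inline int cmd_fail(const void *virt, size_t size)
{
	(void)virt;
	(void)size;
	return -1;
}
```

Keeping the virtual address around is what makes the separate zeroing
step possible without a second command to the hardware.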

If I again misunderstand your point, please let me know.

					-Alex

> The issue I see is that you do an expensive dma_alloc_coherent()
> but then never actually use the pointer returned by it, only the
> dma address that cannot be turned back into a virtual address
> in order to access the data in it.
> 
> If you can't actually use payload->hdr_table_addr, why even allocate
> it here?
> 
>      Arnd
> 



* Re: [PATCH 00/18] net: introduce Qualcomm IPA driver
  2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
                   ` (17 preceding siblings ...)
  2019-05-12  1:25 ` [PATCH 18/18] arm64: defconfig: enable build of IPA code Alex Elder
@ 2019-05-15 12:37 ` Arnd Bergmann
  2019-05-15 12:52   ` Alex Elder
  18 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15 12:37 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
>
> A version of this code was posted in November 2018 as an RFC.
>   https://lore.kernel.org/lkml/20181107003250.5832-1-elder@linaro.org/
> Fixes addressing all feedback received have been implemented.  It
> has undergone considerable further rework since that time, and
> most of the "future work" described then has now been completed.

I think this has turned out really well now.  I've gone through the patches
today and not found any real show-stoppers, but replied with a couple of
minor things I noticed.

I think it's probably worth rearranging the rx and tx code to avoid
the spinlocks, which would be the main optimization I can still
think of to reduce the coherency traffic.

       Arnd


* Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface
  2019-05-15 12:13     ` Alex Elder
@ 2019-05-15 12:40       ` Arnd Bergmann
  0 siblings, 0 replies; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15 12:40 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Wed, May 15, 2019 at 2:13 PM Alex Elder <elder@linaro.org> wrote:
> On 5/15/19 2:21 AM, Arnd Bergmann wrote:


> >> +/* Wait for all transaction activity on a channel to complete */
> >> +void gsi_channel_trans_quiesce(struct gsi *gsi, u32 channel_id)
> >> +{
> >> +       struct gsi_channel *channel = &gsi->channel[channel_id];
> >> +       struct gsi_trans_info *trans_info;
> >> +       struct gsi_trans *trans = NULL;
> >> +       struct gsi_evt_ring *evt_ring;
> >> +       struct list_head *list;
> >> +       unsigned long flags;
> >> +
> >> +       trans_info = &channel->trans_info;
> >> +       evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
> >> +
> >> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
> >> +
> >> +       /* Find the last list to which a transaction was added */
> >> +       if (!list_empty(&trans_info->alloc))
> >> +               list = &trans_info->alloc;
> >> +       else if (!list_empty(&trans_info->pending))
> >> +               list = &trans_info->pending;
> >> +       else if (!list_empty(&trans_info->complete))
> >> +               list = &trans_info->complete;
> >> +       else if (!list_empty(&trans_info->polled))
> >> +               list = &trans_info->polled;
> >> +       else
> >> +               list = NULL;
> >> +
> >> +       if (list) {
> >> +               struct gsi_trans *trans;
> >> +
> >> +               /* The last entry on this list is the last one allocated.
> >> +                * Grab a reference so we can wait for it.
> >> +                */
> >> +               trans = list_last_entry(list, struct gsi_trans, links);
> >> +               refcount_inc(&trans->refcount);
> >> +       }
> >> +
> >> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
> >> +
> >> +       /* If there is one, wait for it to complete */
> >> +       if (trans) {
> >> +               wait_for_completion(&trans->completion);
> >
> > Since you are waiting here, you clearly can't be called
> > from interrupt context, or with interrupts disabled, so it's
> > clearer to use spin_lock_irq() instead of spin_lock_irqsave().
> >
> > I generally try to avoid the _irqsave versions altogether, unless
> > it is really needed for a function that is called both from
> > irq-disabled and irq-enabled context.
>
> OK.  And I appreciate what you're saying here because I do prefer
> code that communicates more about the context in ways like
> you describe.

Right, also reading the status of the irq-enable flag can be
expensive on some CPUs, so spin_lock_irqsave() ends up
much slower than spin_lock() or spin_lock_irq(). Not sure
if it makes a huge difference on this particular platform, but
it's better not to have to worry about it.

     Arnd


* Re: [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller
  2019-05-15  8:21   ` Arnd Bergmann
@ 2019-05-15 12:46     ` Alex Elder
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:46 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 3:21 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
>>
>> This patch includes the code that implements a Linux network device,
>> using one TX and one RX IPA endpoint.  It is used to implement the
>> network device representing the modem and its connection to wireless
>> networks.  There are only a few things that are really modem-specific
>> though, and they aren't clearly called out here.  Such distinctions
>> will be made clearer if we wish to support a network device for
>> anything other than the modem.
> 
> This does not seem to do much at all, as far as I can see it's a fairly
> small abstraction between the linux netdev layer and the actual
> implementation. Could you just merge this file into whichever file
> it interacts with most closely, and open-code the wrappers there?

This used to be a bigger file, containing IOCTLs for configuring
the endpoints used.

It is logically separate from endpoints, because not all endpoints
are attached to network devices.  The IPA command TX endpoint isn't
associated with a network device, and the default route RX endpoint
isn't either.

In addition, the modem can crash and be restarted independent of
all other endpoints.  I haven't added proper handling for that yet,
though (just the ipa_ssr_*() stubs), and I thought maybe in
finishing that I might find keeping it separated in this way
would better fit how that would work out.  The presence of the
"rmnet_ipa0" network device essentially represents the presence
of a functioning modem.

That being said, in the process of trying to streamline the data
path, I *did* add a netdev pointer to the ipa_endpoint structure,
so I've already blurred the line between these layers.

So I will try to do as you suggest, and this code will most likely
end up in "ipa_endpoint.c".

					-Alex


>       Arnd
> 



* Re: [PATCH 18/18] arm64: defconfig: enable build of IPA code
  2019-05-15  8:23   ` Arnd Bergmann
@ 2019-05-15 12:49     ` Alex Elder
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:49 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, Catalin Marinas,
	Will Deacon, Andy Gross, Olof Johansson, Maxime Ripard,
	Simon Horman, Jagan Teki, Stefan Wahren, Marc Gonzalez,
	Enric Balletbo i Serra, syadagir, mjavid, evgreen, Ben Chan,
	Eric Caruso, abhishek.esse, Linux Kernel Mailing List

On 5/15/19 3:23 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
> 
>> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
>> index 2d9c39033c1a..4f4d803e563d 100644
>> --- a/arch/arm64/configs/defconfig
>> +++ b/arch/arm64/configs/defconfig
>> @@ -268,6 +268,7 @@ CONFIG_SMSC911X=y
>>  CONFIG_SNI_AVE=y
>>  CONFIG_SNI_NETSEC=y
>>  CONFIG_STMMAC_ETH=m
>> +CONFIG_IPA=y
>>  CONFIG_MDIO_BUS_MUX_MMIOREG=y
>>  CONFIG_AT803X_PHY=m
>>  CONFIG_MARVELL_PHY=m
> 
> Since the device is not needed for booting, please make this
> CONFIG_IPA=m instead to keep the kernel image a little smaller.
> 
>      Arnd
> 

Oops, yes, that was my intention but I forgot to fix that
before I sent it out.   This code works as a module, but
in order to make the whole system allow the module to be both
removed and re-inserted safely, I need some work to be done
on the modem end and that's beyond my direct control.  I have
been testing with it as a kernel built-in driver in the mean
time.

In any case, it is my intention to have it be normally built
as a module and I will ensure that when I send out future
revisions of this series.

					-Alex


* Re: [PATCH 00/18] net: introduce Qualcomm IPA driver
  2019-05-15 12:37 ` [PATCH 00/18] net: introduce Qualcomm IPA driver Arnd Bergmann
@ 2019-05-15 12:52   ` Alex Elder
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-15 12:52 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 7:37 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
>>
>> A version of this code was posted in November 2018 as an RFC.
>>   https://lore.kernel.org/lkml/20181107003250.5832-1-elder@linaro.org/
>> Fixes addressing all feedback received have been implemented.  It
>> has undergone considerable further rework since that time, and
>> most of the "future work" described then has now been completed.
> 
> I think this has turned out really well now.  I've gone through the patches
> today and not found any real show-stoppers, but replied with a couple of
> minor things I noticed.
> 
> I think it's probably worth rearranging the rx and tx code to avoid
> the spinlocks, which would be the main optimization I can still
> think of to reduce the coherency traffic.

Arnd I appreciate your review *so* much.  You clearly committed
a significant block of time to this and your comments were very
insightful and focused.  I concur with everything you pointed
out and will address all of your concerns when I send out v2
of the series.

Thank you for your careful review.

					-Alex

> 
>        Arnd
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface
  2019-05-15 10:47   ` Arnd Bergmann
@ 2019-05-15 13:32     ` Alex Elder
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-15 13:32 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 5:47 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
> 
> The per-event interrupt handling seems to be more complex than
> necessary:

I just noticed this message.  I'll take another look at this
whole interrupt control mechanism and will try to streamline
it along the lines of what you describe.

Thanks.

					-Alex

> 
>> +/* Enable or disable an event interrupt */
>> +static void
>> +_gsi_irq_control_event(struct gsi *gsi, u32 evt_ring_id, bool enable)
>> +{
>> +       u32 mask = BIT(evt_ring_id);
>> +       u32 val;
>> +
>> +       if (enable)
>> +               gsi->event_enable_bitmap |= mask;
>> +       else
>> +               gsi->event_enable_bitmap &= ~mask;
>> +
>> +       val = gsi->event_enable_bitmap;
>> +       iowrite32(val, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
>> +}
>> +
>> +static void gsi_irq_enable_event(struct gsi *gsi, u32 evt_ring_id)
>> +{
>> +       _gsi_irq_control_event(gsi, evt_ring_id, true);
> 
> You maintain a bitmap here of the enabled-state, and have
> to use a spinlock to ensure that the two are in sync.
> 
>> +/* Returns true if the interrupt state (enabled or not) changed */
>> +static bool gsi_channel_intr(struct gsi_channel *channel, bool enable)
>> +{
>> +       u32 evt_ring_id = channel->evt_ring_id;
>> +       struct gsi *gsi = channel->gsi;
>> +       u32 mask = BIT(evt_ring_id);
>> +       unsigned long flags;
>> +       bool different;
>> +       u32 enabled;
>> +
>> +       spin_lock_irqsave(&gsi->spinlock, flags);
>> +
>> +       enabled = gsi->event_enable_bitmap & mask;
>> +       different = enable == !enabled;
>> +
>> +       if (different) {
>> +               if (enabled)
>> +                       gsi_irq_disable_event(channel->gsi, evt_ring_id);
>> +               else
>> +                       gsi_irq_enable_event(channel->gsi, evt_ring_id);
>> +       }
>> +
>> +       spin_unlock_irqrestore(&gsi->spinlock, flags);
>> +
>> +       return different;
>> +}
> 
> This gets called for each active channel, so you repeatedly
> have to get the spinlock and read the irq-enabled state for it.
> 
>> +static void gsi_isr_ieob(struct gsi *gsi)
>> +{
>> +       u32 evt_mask;
>> +
>> +       evt_mask = ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_OFFSET);
>> +       evt_mask &= ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET);
>> +       iowrite32(evt_mask, gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET);
>> +
>> +       while (evt_mask) {
>> +               u32 evt_ring_id = __ffs(evt_mask);
>> +
>> +               evt_mask ^= BIT(evt_ring_id);
>> +
>> +               gsi_event_handle(gsi, evt_ring_id);
>> +       }
>> +}
> 
> However, you start out by clearing all bits here.
> 
> Why not skip the clearing and leave the interrupts enabled,
> while moving the GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFSET
> write (for a single channel that was completed) to the end of
> gsi_channel_poll()?
> 
> Something like
> 
> static void gsi_isr_ieob(struct gsi *gsi)
> {
>       u32 evt_mask;
> 
>       evt_mask = ioread32(gsi->virt + GSI_CNTXT_SRC_IEOB_IRQ_OFFSET);
>       while (evt_mask) {
>                u32 evt_ring_id = __ffs(evt_mask);
>                evt_mask ^= BIT(evt_ring_id);
> 
>                napi_schedule(gsi->evt_ring[evt_ring_id].channel.napi);
>       }
> }
> 
> I also removed the GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFSET
> read here, as that is probably more expensive than calling napi_schedule()
> for a channel that is already scheduled. Most of the time, I'd expect the
> interrupt to only signal a single channel anyway.
> 
>         Arnd
> 
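The bit-walking loop in the suggested gsi_isr_ieob() can be tried outside
the kernel. What follows is only a minimal userspace sketch, not driver
code: lowest_set_bit() stands in for the kernel's __ffs() and relies on the
GCC/Clang builtin __builtin_ctz(), and for_each_event(), evt_handler_t and
record_event() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for __ffs(): index of the lowest set bit.
 * The argument must be nonzero (assumes a GCC/Clang compatible compiler).
 */
static unsigned int lowest_set_bit(uint32_t mask)
{
	assert(mask != 0);
	return (unsigned int)__builtin_ctz(mask);
}

/* Hypothetical per-event-ring callback */
typedef void (*evt_handler_t)(unsigned int evt_ring_id, void *data);

/* Call handler once for each set bit in evt_mask, lowest bit first.
 * Returns the number of events handled.
 */
static unsigned int for_each_event(uint32_t evt_mask, evt_handler_t handler,
				   void *data)
{
	unsigned int count = 0;

	while (evt_mask) {
		unsigned int evt_ring_id = lowest_set_bit(evt_mask);

		/* Clear the bit being handled so the loop terminates */
		evt_mask ^= UINT32_C(1) << evt_ring_id;

		handler(evt_ring_id, data);
		count++;
	}

	return count;
}

/* Example handler: record each visited ring ID in a bitmap */
static void record_event(unsigned int evt_ring_id, void *data)
{
	uint32_t *seen = data;

	*seen |= UINT32_C(1) << evt_ring_id;
}
```

In the driver the handler would be the napi_schedule() call shown in the
quoted suggestion; the sketch only demonstrates that every pending bit is
visited exactly once.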


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings
  2019-05-15 12:04     ` Alex Elder
@ 2019-05-15 16:50       ` Rob Herring
  2019-05-15 17:05         ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Rob Herring @ 2019-05-15 16:50 UTC (permalink / raw)
  To: Alex Elder
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	Mark Rutland, Andy Gross, David Brown, syadagir, mjavid,
	Evan Green, Ben Chan, ejcaruso, abhishek.esse,
	Linux Kernel Mailing List

On Wed, May 15, 2019 at 7:04 AM Alex Elder <elder@linaro.org> wrote:
>
> On 5/15/19 2:03 AM, Arnd Bergmann wrote:
> > On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
> >>
> >> Add the binding definitions for the "qcom,ipa" device tree node.
> >>
> >> Signed-off-by: Alex Elder <elder@linaro.org>
> >> ---
> >>  .../devicetree/bindings/net/qcom,ipa.txt      | 164 ++++++++++++++++++
> >>  1 file changed, 164 insertions(+)
> >>  create mode 100644 Documentation/devicetree/bindings/net/qcom,ipa.txt
> >>
> >> diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.txt b/Documentation/devicetree/bindings/net/qcom,ipa.txt
> >> new file mode 100644
> >> index 000000000000..2705e198f12e
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/net/qcom,ipa.txt
> >
> > For new bindings, we should use the yaml format so we can verify the
> > device tree files against the binding.
>
> OK.  I didn't realize that was upstream yet.  I will convert.

Not required yet, but it puts the maintainer in a good mood. :)

As does CCing the DT list.

Rob

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings
  2019-05-15 16:50       ` Rob Herring
@ 2019-05-15 17:05         ` Alex Elder
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-15 17:05 UTC (permalink / raw)
  To: Rob Herring
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	Mark Rutland, Andy Gross, David Brown, syadagir, mjavid,
	Evan Green, Ben Chan, ejcaruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 11:50 AM, Rob Herring wrote:
>> OK.  I didn't realize that was upstream yet.  I will convert.
> Not required yet, but it puts the maintainer in a good mood. :)
> 
> As does CCing the DT list.

I'll convert it to YAML *and* CC the DT list next time, to
avoid triggering a crabby maintainer.

					-Alex


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface
  2019-05-12  1:24 ` [PATCH 08/18] soc: qcom: ipa: the generic software interface Alex Elder
  2019-05-15  7:21   ` Arnd Bergmann
  2019-05-15 10:47   ` Arnd Bergmann
@ 2019-05-15 19:37   ` Arnd Bergmann
  2 siblings, 0 replies; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15 19:37 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:

> +static int gsi_ring_alloc(struct gsi *gsi, struct gsi_ring *ring, u32 count)
> +{
> +       size_t size = roundup_pow_of_two(count * sizeof(struct gsi_tre));
> +       dma_addr_t addr;
> +
> +       /* Hardware requires a power-of-2 ring size (and alignment) */
> +       ring->virt = dma_alloc_coherent(gsi->dev, size, &addr, GFP_KERNEL);
> +       if (!ring->virt)
> +               return -ENOMEM;
> +       ring->addr = addr;
> +       ring->base = addr & GENMASK(31, 0);
> +       ring->size = size;
> +       ring->end = ring->base + size;
> +       spin_lock_init(&ring->spinlock);
> +
> +       return 0;
> +}

Another comment for this patch: dma_alloc_coherent() does not guarantee
alignment of the requested buffer as implied by the comment. In many
configurations, it /is/ naturally aligned because the buffer comes from
alloc_pages(), but you can't really be sure.

I suspect it's actually only broken when the buffer spans a 4GB boundary
(and updating the lower 32 bit in the register gives a wrong pointer), which
is unlikely but will happen at some point according to Murphy's law.
If you just need the dma_addr_t to not cross a 4GB boundary, the
easiest solution would be to use GFP_DMA32, which gives you a
buffer that is mapped to the first 4GB bus address space (not necessarily
the first 4GB of RAM if you have an iommu).
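The 4GB concern above can be made concrete with a small check. This is a
userspace sketch under assumptions, not driver code: crosses_4g_boundary()
and is_naturally_aligned() are hypothetical helpers, and a plain uint64_t
stands in for dma_addr_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the buffer [addr, addr + size) spans more than one 4 GiB window,
 * that is, if the upper 32 bits of its first and last byte addresses differ.
 * size must be nonzero; wraparound at the top of the 64-bit space is ignored.
 */
static bool crosses_4g_boundary(uint64_t addr, size_t size)
{
	uint64_t last;

	assert(size != 0);
	last = addr + (uint64_t)size - 1;

	return (addr >> 32) != (last >> 32);
}

/* True if addr is a multiple of size, where size is a power of two */
static bool is_naturally_aligned(uint64_t addr, size_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);

	return (addr & ((uint64_t)size - 1)) == 0;
}
```

A power-of-2 sized buffer of at most 4 GiB that is naturally aligned can
never cross such a boundary, which is why either manual alignment or a
buffer below 4 GiB would avoid the wrong-pointer case described above.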

If you manually align the ring buffer, it should be fine too, though I have
to say that the way the driver does pointer arithmetic on 32-bit integers
seems rather fragile as well.

A nicer way to deal with ring buffers in general is to only ever use a
32-bit index number stored in an atomic_t, use atomic_inc_return()
to advance the index and then mask the number when turning it into
an index. With that, you should also be able to avoid the shared
spinlock. Moving the rp and wp into separate cache lines further
reduces the coherency traffic by avoiding concurrent writes on the
same line.

      Arnd
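The free-running index idea can be sketched in userspace with C11 atomics.
This is only an illustration under assumptions: RING_COUNT, struct
ring_index and ring_claim_slot() are hypothetical names, and
atomic_fetch_add() (which returns the old value) is used in place of the
kernel's atomic_inc_return() (which returns the new one).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_COUNT	8u	/* hypothetical; must be a power of two */

struct ring_index {
	atomic_uint next;	/* free-running counter, never masked in place */
};

/* Claim the next slot: advance the counter atomically, then mask the old
 * value to turn it into a slot number.  Because RING_COUNT divides 2^32,
 * the slot sequence stays consistent when the counter wraps.
 */
static uint32_t ring_claim_slot(struct ring_index *ri)
{
	unsigned int old = atomic_fetch_add(&ri->next, 1u);

	return old & (RING_COUNT - 1u);
}
```

Only the counter is ever written, so no lock is needed to keep an index and
a separately stored "wrapped" position in sync. Whether this is enough to
drop the ring spinlock in the driver depends on what else that lock
protects.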

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-15 12:25     ` Alex Elder
@ 2019-05-15 20:50       ` Arnd Bergmann
  0 siblings, 0 replies; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-15 20:50 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Wed, May 15, 2019 at 2:26 PM Alex Elder <elder@linaro.org> wrote:
> On 5/15/19 2:34 AM, Arnd Bergmann wrote:
> >> +/* Cancel a channel's pending transactions */
> >> +void gsi_channel_trans_cancel_pending(struct gsi_channel *channel)
> >> +{
> >> +       struct gsi_trans_info *trans_info = &channel->trans_info;
> >> +       u32 evt_ring_id = channel->evt_ring_id;
> >> +       struct gsi *gsi = channel->gsi;
> >> +       struct gsi_evt_ring *evt_ring;
> >> +       struct gsi_trans *trans;
> >> +       unsigned long flags;
> >> +
> >> +       evt_ring = &gsi->evt_ring[evt_ring_id];
> >> +
> >> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
> >> +
> >> +       list_for_each_entry(trans, &trans_info->pending, links)
> >> +               trans->result = -ECANCELED;
> >> +
> >> +       list_splice_tail_init(&trans_info->pending, &trans_info->complete);
> >> +
> >> +       spin_unlock_irqrestore(&evt_ring->ring.spinlock, flags);
> >> +
> >> +       spin_lock_irqsave(&gsi->spinlock, flags);
> >> +
> >> +       if (gsi->event_enable_bitmap & BIT(evt_ring_id))
> >> +               gsi_event_handle(gsi, evt_ring_id);
> >> +
> >> +       spin_unlock_irqrestore(&gsi->spinlock, flags);
> >> +}
> >
> > That is a lot of irqsave()/irqrestore() operations. Do you actually call
> > all of these functions from hardirq context?
>
> The transaction list is definitely updated in IRQ context,
> but I think it is no longer updated in hardirq context (the
> softirq was a recent change).  This particular function is
> definitely not called in a hardirq context, so I can remove
> the irqsave/irqrestore.

If you want to protect against concurrent softirqs, you still
need spin_lock_bh(), which is cheaper than spin_lock_irqsave()
but still requires writing to the shared cache line for the
atomic update of the lock.

> I'll survey my spinlock use throughout the driver and will
> remove any irqsave/irqrestore used in non-hardirq contexts.

Ok. I actually hope that most of the spinlocks can be
removed from the data path entirely. I just replied on the
ring.spinlock, which I think can go away and be replaced
either with two atomic_t values (rp_local and wp_local
only; 'wp' appears to be unused), or even just an smp_rmb()/
smp_wmb() pair for each access. The gsi register spinlock
can probably be avoided as well if we stop disabling and
re-enabling the interrupts as I suggested.

gsi_trans_info->spinlock is harder to get rid of unfortunately,
as that would require changing the way you do the doubly linked
lists.

     Arnd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-15 12:03     ` Alex Elder
@ 2019-05-16  1:09       ` Subash Abhinov Kasiviswanathan
  2019-05-17 17:27         ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Subash Abhinov Kasiviswanathan @ 2019-05-16  1:09 UTC (permalink / raw)
  To: Alex Elder
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	stranche, YueHaibing, Joe Perches, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

>>> +#ifndef _SOC_QCOM_RMNET_H_
>>> +#define _SOC_QCOM_RMNET_H_
>>> +
>>> +#include <linux/types.h>
>>> +
>>> +/* Header structure that precedes packets in ETH_P_MAP protocol */
>>> +struct rmnet_map_header {
>>> +       u8  pad_len             : 6;
>>> +       u8  reserved_bit        : 1;
>>> +       u8  cd_bit              : 1;
>>> +       u8  mux_id;
>>> +       __be16 pkt_len;
>>> +}  __aligned(1);
>> 
>> If we move this into include/soc/, I want the structure to be 
>> portable,
>> and avoid the bit fields. Please use mask/shift operations or the
>> include/linux/bits.h macros instead to make this work with big-endian
>> kernels.
> 
> Sure, I'll do that.  I did that everywhere else in the driver,
> but here I just tried to preserve the original code as I moved
> it.  I will update at least these structures, and all existing
> code (plus the IPA code) to use field masks.
> 
> 					-Alex
>> 
>>      Arnd
>> 
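The mask/shift form requested above might look roughly like the following.
This is a sketch under stated assumptions, not the eventual header: the
mask values assume the bit-field order a little-endian ABI gives the quoted
structure (pad_len in bits 5..0, reserved in bit 6, cd_bit in bit 7), and
struct map_header, the MAP_* macros and the helper names are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the first header byte (see note above) */
#define MAP_PAD_LEN_MASK	0x3fu	/* bits 5..0 */
#define MAP_RESERVED_FLAG	0x40u	/* bit 6 */
#define MAP_CD_FLAG		0x80u	/* bit 7 */

struct map_header {
	uint8_t flags;		/* pad_len, reserved bit and cd bit */
	uint8_t mux_id;
	uint8_t pkt_len[2];	/* big-endian on the wire */
};

static void map_header_init(struct map_header *h, int cd_bit,
			    uint8_t pad_len, uint8_t mux_id, uint16_t pkt_len)
{
	h->flags = pad_len & MAP_PAD_LEN_MASK;
	if (cd_bit)
		h->flags |= MAP_CD_FLAG;
	h->mux_id = mux_id;
	h->pkt_len[0] = (uint8_t)(pkt_len >> 8);	/* high byte first */
	h->pkt_len[1] = (uint8_t)(pkt_len & 0xff);
}

static uint8_t map_pad_len(const struct map_header *h)
{
	return h->flags & MAP_PAD_LEN_MASK;
}

static int map_cd_bit(const struct map_header *h)
{
	return (h->flags & MAP_CD_FLAG) != 0;
}

static uint16_t map_pkt_len(const struct map_header *h)
{
	return (uint16_t)((h->pkt_len[0] << 8) | h->pkt_len[1]);
}
```

Because every access goes through explicit masks and byte order is handled
by hand, the same source gives the same wire layout on big-endian and
little-endian builds, which bit-fields do not guarantee.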

Hi Alex

Could we instead have the rmnet header definition in
include/linux/if_rmnet.h?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-16  1:09       ` Subash Abhinov Kasiviswanathan
@ 2019-05-17 17:27         ` Alex Elder
  2019-05-17 18:08           ` Subash Abhinov Kasiviswanathan
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-17 17:27 UTC (permalink / raw)
  To: Subash Abhinov Kasiviswanathan
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	stranche, YueHaibing, Joe Perches, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

On 5/15/19 8:09 PM, Subash Abhinov Kasiviswanathan wrote:
. . .
> Hi Alex
> 
> Could we instead have the rmnet header definition in
> include/linux/if_rmnet.h?

I have no objection to that, but I don't actually know what
the criteria are for putting a file in that directory.

Glancing at other "if_*" files there it seems sensible, but
because I don't know, I'd like to have a little better
justification.

Can you provide a good explanation about why these
definitions belong in "include/linux/if_rmnet.h" instead
of "include/soc/qcom/rmnet.h"?

Thanks.

					-Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-17 17:27         ` Alex Elder
@ 2019-05-17 18:08           ` Subash Abhinov Kasiviswanathan
  2019-05-19 17:37             ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Subash Abhinov Kasiviswanathan @ 2019-05-17 18:08 UTC (permalink / raw)
  To: Alex Elder
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	stranche, YueHaibing, Joe Perches, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

On 2019-05-17 11:27, Alex Elder wrote:
> On 5/15/19 8:09 PM, Subash Abhinov Kasiviswanathan wrote:
> . . .
>> Hi Alex
>> 
>> Could we instead have the rmnet header definition in
>> include/linux/if_rmnet.h?
> 
> I have no objection to that, but I don't actually know what
> the criteria are for putting a file in that directory.
> 
> Glancing at other "if_*" files there it seems sensible, but
> because I don't know, I'd like to have a little better
> justification.
> 
> Can you provide a good explanation about why these
> definitions belong in "include/linux/if_rmnet.h" instead
> of "include/soc/qcom/rmnet.h"?
> 
> Thanks.
> 
> 					-Alex

rmnet was designed similar to vlan / macvlan / ipvlan / bridge.
These drivers support creation of virtual netdevices,
define custom rtnl_link_ops, expose netlink attributes to
uapi via if_link.h and register rx_handlers.

They expose some common structs and helpers via if_vlan.h /
if_macvlan.h / if_bridge.h. I would prefer rmnet to use if_rmnet.h
similar to them.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-15  7:34   ` Arnd Bergmann
  2019-05-15 12:25     ` Alex Elder
@ 2019-05-17 18:08     ` Alex Elder
  2019-05-17 18:33       ` Arnd Bergmann
  1 sibling, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-17 18:08 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, benchan, ejcaruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 2:34 AM, Arnd Bergmann wrote:
>> +static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
>> +                              u32 len, bool last_tre, bool bei,
>> +                              enum ipa_cmd_opcode opcode)
>> +{
>> +       struct gsi_tre tre;
>> +
>> +       tre.addr = cpu_to_le64(addr);
>> +       tre.len_opcode = gsi_tre_len_opcode(opcode, len);
>> +       tre.reserved = 0;
>> +       tre.flags = gsi_tre_flags(last_tre, bei, opcode);
>> +
>> +       *dest_tre = tre;        /* Write TRE as a single (16-byte) unit */
>> +}
> Have you checked that the atomic write is actually what happens here,
> by looking at the compiler output? You might need to add a 'volatile'
> qualifier to the dest_tre argument so the temporary structure doesn't
> get optimized away here.

Currently, the assignment *does* become a "stp" instruction.
But I don't know that we can *force* the compiler to write it
as a pair of registers, so I'll soften the comment with
"Attempt to write" or something similar.

To my knowledge, adding a volatile qualifier only prevents the
compiler from performing funny optimizations, but that has no
effect on whether the 128-bit assignment is made as a single
unit.  Do you know otherwise?

					-Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-17 18:08     ` Alex Elder
@ 2019-05-17 18:33       ` Arnd Bergmann
  2019-05-17 18:44         ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-17 18:33 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Fri, May 17, 2019 at 8:08 PM Alex Elder <elder@linaro.org> wrote:
>
> On 5/15/19 2:34 AM, Arnd Bergmann wrote:
> >> +static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
> >> +                              u32 len, bool last_tre, bool bei,
> >> +                              enum ipa_cmd_opcode opcode)
> >> +{
> >> +       struct gsi_tre tre;
> >> +
> >> +       tre.addr = cpu_to_le64(addr);
> >> +       tre.len_opcode = gsi_tre_len_opcode(opcode, len);
> >> +       tre.reserved = 0;
> >> +       tre.flags = gsi_tre_flags(last_tre, bei, opcode);
> >> +
> >> +       *dest_tre = tre;        /* Write TRE as a single (16-byte) unit */
> >> +}
> > Have you checked that the atomic write is actually what happens here,
> > by looking at the compiler output? You might need to add a 'volatile'
> > qualifier to the dest_tre argument so the temporary structure doesn't
> > get optimized away here.
>
> Currently, the assignment *does* become a "stp" instruction.
> But I don't know that we can *force* the compiler to write it
> as a pair of registers, so I'll soften the comment with
> "Attempt to write" or something similar.
>
> To my knowledge, adding a volatile qualifier only prevents the
> compiler from performing funny optimizations, but that has no
> effect on whether the 128-bit assignment is made as a single
> unit.  Do you know otherwise?

I don't think you can force the 128-bit assignment to be
atomic, but marking 'dest_tre' should serve to prevent a
specific optimization that replaces the function with

    dest_tre->addr = ...
    dest_tre->len_opcode = ...
    dest_tre->reserved = ...
    dest_tre->flags = ...

which it might find more efficient than the stp and is equivalent
when the pointer is not marked volatile. We also have the WRITE_ONCE()
macro that can help prevent this, but it does not work reliably beyond
64 bit assignments.

      Arnd
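The two forms being compared can be written side by side in userspace.
This is a sketch under assumptions: struct tre is a hypothetical 16-byte
stand-in for struct gsi_tre with host-endian fields, and tre_fill_whole()
and tre_fill_fields() are illustrative names. It shows only that the two
forms leave identical bytes in memory; it says nothing about which store
instructions a given compiler emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte ring element (no padding: 8 + 2 + 2 + 4 bytes) */
struct tre {
	uint64_t addr;
	uint16_t len_opcode;
	uint16_t reserved;
	uint32_t flags;
};

/* Build the element in a local copy, then publish it with one assignment */
static void tre_fill_whole(struct tre *dest, uint64_t addr,
			   uint16_t len_opcode, uint32_t flags)
{
	struct tre tre;

	tre.addr = addr;
	tre.len_opcode = len_opcode;
	tre.reserved = 0;
	tre.flags = flags;

	*dest = tre;
}

/* The field-by-field form a compiler is free to substitute for the above
 * when dest is not volatile-qualified
 */
static void tre_fill_fields(struct tre *dest, uint64_t addr,
			    uint16_t len_opcode, uint32_t flags)
{
	dest->addr = addr;
	dest->len_opcode = len_opcode;
	dest->reserved = 0;
	dest->flags = flags;
}
```

Since both functions have the same observable result in the C abstract
machine, only inspecting the generated code (for example with objdump)
shows whether a single 16-byte store was produced.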

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-17 18:33       ` Arnd Bergmann
@ 2019-05-17 18:44         ` Alex Elder
  2019-05-19 17:11           ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-17 18:44 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/17/19 1:33 PM, Arnd Bergmann wrote:
> On Fri, May 17, 2019 at 8:08 PM Alex Elder <elder@linaro.org> wrote:
>>
>> On 5/15/19 2:34 AM, Arnd Bergmann wrote:
>>>> +static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
>>>> +                              u32 len, bool last_tre, bool bei,
>>>> +                              enum ipa_cmd_opcode opcode)
>>>> +{
>>>> +       struct gsi_tre tre;
>>>> +
>>>> +       tre.addr = cpu_to_le64(addr);
>>>> +       tre.len_opcode = gsi_tre_len_opcode(opcode, len);
>>>> +       tre.reserved = 0;
>>>> +       tre.flags = gsi_tre_flags(last_tre, bei, opcode);
>>>> +
>>>> +       *dest_tre = tre;        /* Write TRE as a single (16-byte) unit */
>>>> +}
>>> Have you checked that the atomic write is actually what happens here,
>>> by looking at the compiler output? You might need to add a 'volatile'
>>> qualifier to the dest_tre argument so the temporary structure doesn't
>>> get optimized away here.
>>
>> Currently, the assignment *does* become a "stp" instruction.
>> But I don't know that we can *force* the compiler to write it
>> as a pair of registers, so I'll soften the comment with
>> "Attempt to write" or something similar.
>>
>> To my knowledge, adding a volatile qualifier only prevents the
>> compiler from performing funny optimizations, but that has no
>> effect on whether the 128-bit assignment is made as a single
>> unit.  Do you know otherwise?
> 
> I don't think you can force the 128-bit assignment to be
> atomic, but marking 'dest_tre' should serve to prevent a
> specific optimization that replaces the function with
> 
>     dest_tre->addr = ...
>     dest_tre->len_opcode = ...
>     dest_tre->reserved = ...
>     dest_tre->flags = ...
> 
> which it might find more efficient than the stp and is equivalent
> when the pointer is not marked volatile. We also have the WRITE_ONCE()
> macro that can help prevent this, but it does not work reliably beyond
> 64 bit assignments.

OK, I'll mark it volatile to avoid that potential result.
Thanks.

					-Alex

> 
>       Arnd
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 12/18] soc: qcom: ipa: immediate commands
  2019-05-15 12:35     ` Alex Elder
@ 2019-05-18  0:34       ` Alex Elder
  2019-05-20 14:50         ` Arnd Bergmann
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-18  0:34 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/15/19 7:35 AM, Alex Elder wrote:
> On 5/15/19 3:16 AM, Arnd Bergmann wrote:
>> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@linaro.org> wrote:
>>
>>> +/* Initialize header space in IPA local memory */
>>> +int ipa_cmd_hdr_init_local(struct ipa *ipa, u32 offset, u32 size)
>>> +{
>>> +       struct ipa_imm_cmd_hw_hdr_init_local *payload;
>>> +       struct device *dev = &ipa->pdev->dev;
>>> +       dma_addr_t addr;
>>> +       void *virt;
>>> +       u32 flags;
>>> +       u32 max;
>>> +       int ret;
>>> +
>>> +       /* Note: size *can* be zero in this case */
>>> +       if (size > field_max(IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK))
>>> +               return -EINVAL;
>>> +
>>> +       max = field_max(IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
>>> +       if (offset > max || ipa->shared_offset > max - offset)
>>> +               return -EINVAL;
>>> +       offset += ipa->shared_offset;
>>> +
>>> +       /* A zero-filled buffer of the right size is all that's required */
>>> +       virt = dma_alloc_coherent(dev, size, &addr, GFP_KERNEL);
>>> +       if (!virt)
>>> +               return -ENOMEM;
>>> +
>>> +       payload = kzalloc(sizeof(*payload), GFP_KERNEL);
>>> +       if (!payload) {
>>> +               ret = -ENOMEM;
>>> +               goto out_dma_free;
>>> +       }
>>> +
>>> +       payload->hdr_table_addr = addr;
>>> +       flags = u32_encode_bits(size, IPA_CMD_HDR_INIT_FLAGS_TABLE_SIZE_FMASK);
>>> +       flags |= u32_encode_bits(offset, IPA_CMD_HDR_INIT_FLAGS_HDR_ADDR_FMASK);
>>> +       payload->flags = flags;
>>> +
>>> +       ret = ipa_cmd(ipa, IPA_CMD_HDR_INIT_LOCAL, payload, sizeof(*payload));
>>> +
>>> +       kfree(payload);
>>> +out_dma_free:
>>> +       dma_free_coherent(dev, size, virt, addr);
>>> +
>>> +       return ret;
>>> +}
>>
>> This looks rather strange. I think I looked at it before and you explained
>> it, but I have since forgotten what you do it for, so I assume everyone else
>> that tries to understand this will have problems too.
> 
> This is a bug.  I think I misunderstood why you were
> puzzled before.  Now I get it.  I need to save that
> DMA address and not free it at the end of the function
> (except on error).

OK, now I'm going to correct myself.  I hope I don't make
any mistakes here because things are confused enough...

Part of what I described previously is still true, namely
there are tables that need to be initialized (i.e., the
IPA needs to be told where they reside), and a separate
step is available to zero the content of the tables.

But there really is no need for the AP to hang onto this
DMA memory after this immediate command has been issued.
I will add comments in the code to make it less surprising.

But here's a summary of why.

I think there are two things at play that make it confusing.

The first thing is that these "header tables" are actually
located in a region of shared memory ("smem") that is local
to the IPA (not the AP).  The IPA_CMD_HDR_INIT_LOCAL
immediate command is meant to:
1) define the header table location in IPA local memory
2) define the header table size
3) provide a buffer used to fill the table with its initial
   contents

The location and size are encoded in the flags field
of the payload (offset and size).

The initial contents are filled via DMA from a buffer
in main memory, whose DMA address is supplied in the
hdr_table_addr parameter in the payload.  The initial
contents we supply are all zero.  So this is why we
need to allocate DMA memory.

The second thing is that this is an instance where the
AP is responsible for performing some initialization
of resources it may not "own" thereafter.  The IPA
hardware owns this table, even though the AP needs to
tell it where it sits in IPA local memory.  The AP is
able to copy (using DMA) content into that table, but
doing so involves a DMA transfer.

More advanced features of the IPA would make more use
of this header table, but those features are not yet
supported so this initialization (and a subsequent,
seemingly redundant zeroing) is all we do.

Does that make sense?

					-Alex
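The range checks and flag encoding described above can be sketched in
userspace as follows. This is an illustration under assumptions: the two
mask values are invented (the real IPA_CMD_HDR_INIT_FLAGS_* masks are not
shown in this thread), field_max_of() and encode_bits() merely imitate what
field_max() and u32_encode_bits() do, __builtin_ctz() requires a GCC/Clang
compatible compiler, and hdr_init_flags() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field masks; the driver's real masks may differ */
#define TABLE_SIZE_MASK	0x00000fffu	/* bits 11..0 */
#define HDR_ADDR_MASK	0x0ffff000u	/* bits 27..12 */

/* Largest value that fits in the field described by a nonzero mask */
static uint32_t field_max_of(uint32_t mask)
{
	return mask >> __builtin_ctz(mask);
}

/* Shift val into the position described by a nonzero mask */
static uint32_t encode_bits(uint32_t val, uint32_t mask)
{
	return (val << __builtin_ctz(mask)) & mask;
}

/* Validate size and offset, then encode both into *flags.
 * Returns false (leaving *flags untouched) if either is out of range.
 */
static bool hdr_init_flags(uint32_t offset, uint32_t shared_offset,
			   uint32_t size, uint32_t *flags)
{
	uint32_t max = field_max_of(HDR_ADDR_MASK);

	/* size may legitimately be zero */
	if (size > field_max_of(TABLE_SIZE_MASK))
		return false;

	/* Checking against (max - offset) avoids overflow in the sum */
	if (offset > max || shared_offset > max - offset)
		return false;
	offset += shared_offset;

	*flags = encode_bits(size, TABLE_SIZE_MASK) |
		 encode_bits(offset, HDR_ADDR_MASK);

	return true;
}
```

The second test is written as a subtraction so that a large offset plus a
large shared_offset cannot wrap around and slip past the limit.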


> Here's what I think happened.  There are two parts of
> initializing these tables.  One part tells the hardware
> where the table is located.  Another part zeroes the
> contents of those tables.  (The zeroing part could be
> accomplished when the table is allocated, but there
> are cases where they have to be zeroed again without
> needing to tell the hardware so we need to at least
> be able to do that independently.)
> 
> I think I was assuming this was the function that did
> the zeroing, and I thought that adding the comment about
> "all we need is a zero-filled buffer" addressed what
> you thought should be made clearer.
> 
> I will definitely fix this, and I'm glad you repeated
> it so I was forced to take another look.
> 
> If I again misunderstand your point, please let me know.
> 
> 					-Alex
> 
>> The issue I see is that you do an expensive dma_alloc_coherent()
>> but then never actually use the pointer returned by it, only the
>> dma address that cannot be turned back into a virtual address
>> in order to access the data in it.
>>
>> If you can't actually use payload->hdr_table_addr, why even allocate
>> it here?
>>
>>      Arnd
>>
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-17 18:44         ` Alex Elder
@ 2019-05-19 17:11           ` Alex Elder
  2019-05-20  9:25             ` Arnd Bergmann
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-19 17:11 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/17/19 1:44 PM, Alex Elder wrote:
> On 5/17/19 1:33 PM, Arnd Bergmann wrote:
>> On Fri, May 17, 2019 at 8:08 PM Alex Elder <elder@linaro.org>
>> wrote:
>>> 
>>> On 5/15/19 2:34 AM, Arnd Bergmann wrote:
>>>>> +static void gsi_trans_tre_fill(struct gsi_tre *dest_tre, dma_addr_t addr,
>>>>> +                              u32 len, bool last_tre, bool bei,
>>>>> +                              enum ipa_cmd_opcode opcode)
>>>>> +{
>>>>> +       struct gsi_tre tre;
>>>>> +
>>>>> +       tre.addr = cpu_to_le64(addr);
>>>>> +       tre.len_opcode = gsi_tre_len_opcode(opcode, len);
>>>>> +       tre.reserved = 0;
>>>>> +       tre.flags = gsi_tre_flags(last_tre, bei, opcode);
>>>>> +
>>>>> +       *dest_tre = tre;        /* Write TRE as a single (16-byte) unit */
>>>>> +}
>>>> Have you checked that the atomic write is actually what happens
>>>> here, by looking at the compiler output? You might need to add
>>>> a 'volatile' qualifier to the dest_tre argument so the
>>>> temporary structure doesn't get optimized away here.
>>> 
>>> Currently, the assignment *does* become a "stp" instruction. But
>>> I don't know that we can *force* the compiler to write it as a
>>> pair of registers, so I'll soften the comment with "Attempt to
>>> write" or something similar.
>>> 
>>> To my knowledge, adding a volatile qualifier only prevents the 
>>> compiler from performing funny optimizations, but that has no 
>>> effect on whether the 128-bit assignment is made as a single 
>>> unit.  Do you know otherwise?
>> 
>> I don't think you can force the 128-bit assignment to be atomic,
>> but marking 'dest_tre' volatile should serve to prevent a specific
>> optimization that replaces the function with
>> 
>> dest_tre->addr = ...
>> dest_tre->len_opcode = ...
>> dest_tre->reserved = ...
>> dest_tre->flags = ...
>> 
>> which it might find more efficient than the stp and is equivalent 
>> when the pointer is not marked volatile. We also have the
>> WRITE_ONCE() macro that can help prevent this, but it does not work
>> reliably beyond 64 bit assignments.
> 
> OK, I'll mark it volatile to avoid that potential result.

OK I got interesting results so I wanted to report back.

The way it is currently written (no volatile qualifier) is
the *only* way I get a 16-byte store instruction.

Specifically, with dest_tre having type (struct gsi_tre *):
        *dest_tre = tre;

Produces this:
 4ec: a9002013        stp     x19, x8, [x0]

I attempted to make the assigned-to pointer volatile:
        *(volatile struct gsi_tre *)dest_tre = tre;

But that apparently is interpreted as "assign things
in the destination in exactly the way they were assigned
above into the 'tre' structure," because it produced this:
 748: f9000348        str     x8, [x26]
 74c: 7940c3e8        ldrh    w8, [sp, #96]
 750: 79001348        strh    w8, [x26, #8]
 754: 7940bbe8        ldrh    w8, [sp, #92]
 758: 79001748        strh    w8, [x26, #10]
 75c: b9405be8        ldr     w8, [sp, #88]
 760: b9000f48        str     w8, [x26, #12]

From there I went back and changed the type of the dest_tre pointer
parameter to (volatile struct gsi_tre *), and changed the type
of the values passed through that argument in the two callers to
also have the volatile qualifier.  This way there was no need to
use a cast in the left-hand side.  That too produced a string of
separate assignments, not a single 128-bit one.

I also attempted this:
	*dest_tre = *(volatile struct gsi_tre *)&tre;
And even this:
        *(volatile struct gsi_tre *)dest_tre = *(volatile struct gsi_tre *)&tre;
But neither produced a single "stp" instruction; all produced
the same sequence of instructions above.

So it seems that I must *not* apply a volatile qualifier,
because doing so restricts the compiler from making the
single instruction optimization.
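
For anyone wanting to reproduce the comparison, a minimal user-space
sketch of the two main variants follows.  The layout of struct gsi_tre
here is an assumption inferred from the store offsets in the listing
above (an 8-byte store at offset 0, 2-byte stores at 8 and 10, and a
4-byte store at 12), and the names tre_fill_plain() and
tre_fill_volatile() are made up for illustration; the driver itself
uses kernel types and helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, inferred from the store offsets in the disassembly */
struct gsi_tre {
	uint64_t addr;		/* offset 0 */
	uint16_t len_opcode;	/* offset 8 */
	uint16_t reserved;	/* offset 10 */
	uint32_t flags;		/* offset 12 */
};

static_assert(sizeof(struct gsi_tre) == 16, "TRE is expected to be 16 bytes");

/* Plain destination: the compiler is free to merge the copy into one store */
void tre_fill_plain(struct gsi_tre *dest_tre, uint64_t addr,
		    uint16_t len_opcode, uint32_t flags)
{
	struct gsi_tre tre;

	tre.addr = addr;
	tre.len_opcode = len_opcode;
	tre.reserved = 0;
	tre.flags = flags;

	*dest_tre = tre;
}

/* Volatile destination: the compiler may copy the structure field by field */
void tre_fill_volatile(volatile struct gsi_tre *dest_tre, uint64_t addr,
		       uint16_t len_opcode, uint32_t flags)
{
	struct gsi_tre tre;

	tre.addr = addr;
	tre.len_opcode = len_opcode;
	tre.reserved = 0;
	tre.flags = flags;

	*dest_tre = tre;
}
```

Building this for arm64 with something like "gcc -O2 -c" and looking
at the result with "objdump -d" should show which form each function
takes, though the outcome will depend on the compiler and its version.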

If I've missed something and you have another suggestion for
me to try let me know and I'll try it.

					-Alex


> Thanks.
> 
> -Alex
> 
>> 
>> Arnd
>> 
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h"
  2019-05-17 18:08           ` Subash Abhinov Kasiviswanathan
@ 2019-05-19 17:37             ` Alex Elder
  0 siblings, 0 replies; 66+ messages in thread
From: Alex Elder @ 2019-05-19 17:37 UTC (permalink / raw)
  To: Subash Abhinov Kasiviswanathan
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	stranche, YueHaibing, Joe Perches, syadagir, mjavid, evgreen,
	benchan, ejcaruso, abhishek.esse, Linux Kernel Mailing List

On 5/17/19 1:08 PM, Subash Abhinov Kasiviswanathan wrote:
> On 2019-05-17 11:27, Alex Elder wrote:
. . .
>> Can you provide a good explanation about why these
>> definitions belong in "include/linux/if_rmnet.h" instead
>> of "include/soc/qcom/rmnet.h"?
>>
>> Thanks.
>>
>>                     -Alex
> 
> rmnet was designed similar to vlan / macvlan / ipvlan / bridge.
> These drivers support creation of virtual netdevices,
> define custom rtnl_link_ops, expose netlink attributes to
> uapi via if_link.h and register rx_handlers.
> 
> They expose some common structs and helpers via if_vlan.h /
> if_macvlan.h / if_bridge.h. I would prefer rmnet to use if_rmnet.h
> similar to them.

OK, I will name the file "include/linux/if_rmnet.h" as you suggest.
It will still only define the three structures that I need in the
IPA driver; I won't expose anything else from the rmnet_data driver.

I will mention now that, to facilitate addressing Arnd's concerns
about the portability of using C bit-fields in these structures,
I made a set of other changes (including a bug fix in one of
the structure definitions).  As a preview, here are the subject
lines for that series:
    net: qualcomm: rmnet: fix struct rmnet_map_header
    net: qualcomm: rmnet: kill RMNET_MAP_GET_*() accessor macros
    net: qualcomm: rmnet: use field masks instead of C bit-fields
    net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum header
    net: qualcomm: rmnet: don't use C bit-fields in rmnet checksum trailer
    soc: qcom: ipa: get rid of a variable in rmnet_map_ipv4_ul_csum_header()
    net: create "include/linux/if_rmnet.h"

I will be posting that as a separate series now and will have
the IPA driver series mention a dependence on that.
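
To illustrate the general idea of replacing C bit-fields with field
masks, here is a small sketch.  The one-byte layout and every name in
it (EX_PAD_LEN_MASK, EX_CMD_FLAG, ex_flags_encode() and so on) are
hypothetical and are not taken from the rmnet structures; the point is
only that explicit masks pin down which bits hold which field, where
the placement of bit-fields is left to the compiler and ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-byte flags field: bits 0-5 pad length, bit 7 command flag */
#define EX_PAD_LEN_MASK	0x3fu
#define EX_CMD_FLAG	0x80u

static_assert((EX_PAD_LEN_MASK & EX_CMD_FLAG) == 0, "fields must not overlap");

/* Pack the two fields into a single byte using explicit masks */
static inline uint8_t ex_flags_encode(uint8_t pad_len, int is_cmd)
{
	return (uint8_t)((pad_len & EX_PAD_LEN_MASK) |
			 (is_cmd ? EX_CMD_FLAG : 0u));
}

/* Extract the pad length from the packed byte */
static inline uint8_t ex_pad_len(uint8_t flags)
{
	return (uint8_t)(flags & EX_PAD_LEN_MASK);
}

/* Report whether the command flag is set in the packed byte */
static inline int ex_is_cmd(uint8_t flags)
{
	return (flags & EX_CMD_FLAG) != 0;
}
```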

					-Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-19 17:11           ` Alex Elder
@ 2019-05-20  9:25             ` Arnd Bergmann
  2019-05-20 12:50               ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-20  9:25 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Sun, May 19, 2019 at 7:11 PM Alex Elder <elder@linaro.org> wrote:
> On 5/17/19 1:44 PM, Alex Elder wrote:
> > On 5/17/19 1:33 PM, Arnd Bergmann wrote:
> >> On Fri, May 17, 2019 at 8:08 PM Alex Elder <elder@linaro.org>
>
> So it seems that I must *not* apply a volatile qualifier,
> because doing so restricts the compiler from making the
> single instruction optimization.

Right, I guess that makes sense.

> If I've missed something and you have another suggestion for
> me to try let me know and I'll try it.

A memcpy() might do the right thing as well. Another idea would
be a cast to __int128 like

#ifdef CONFIG_ARCH_SUPPORTS_INT128
typedef __int128 tre128_t;
#else
typedef struct { __u64 a; __u64 b; } tre128_t;
#endif

static inline void set_tre(struct gsi_tre *dest_tre, struct gsi_tre *src_tre)
{
     *(volatile tre128_t *)dest_tre = *(tre128_t *)src_tre;
}

      Arnd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-20  9:25             ` Arnd Bergmann
@ 2019-05-20 12:50               ` Alex Elder
  2019-05-20 14:43                 ` Arnd Bergmann
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-20 12:50 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/20/19 4:25 AM, Arnd Bergmann wrote:
> On Sun, May 19, 2019 at 7:11 PM Alex Elder <elder@linaro.org> wrote:
>> On 5/17/19 1:44 PM, Alex Elder wrote:
>>> On 5/17/19 1:33 PM, Arnd Bergmann wrote:
>>>> On Fri, May 17, 2019 at 8:08 PM Alex Elder <elder@linaro.org>
>>
>> So it seems that I must *not* apply a volatile qualifier,
>> because doing so restricts the compiler from making the
>> single instruction optimization.
> 
> Right, I guess that makes sense.
> 
>> If I've missed something and you have another suggestion for
>> me to try let me know and I'll try it.
> 
> A memcpy() might do the right thing as well. Another idea would

I find memcpy() does the right thing.
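
A user-space sketch of the memcpy() form is below.  The structure
layout is an assumption (a 16-byte TRE with a 64-bit address, two
16-bit fields and a 32-bit flags word), and tre_fill_memcpy() is a
name invented for this example.  With a constant 16-byte size the
compiler will typically expand the copy inline, but whether that
becomes a single store pair is still up to the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed 16-byte TRE layout */
struct gsi_tre {
	uint64_t addr;
	uint16_t len_opcode;
	uint16_t reserved;
	uint32_t flags;
};

static_assert(sizeof(struct gsi_tre) == 16, "TRE is expected to be 16 bytes");

/* Build the TRE in a local, then copy all 16 bytes in one memcpy() call */
void tre_fill_memcpy(struct gsi_tre *dest_tre, uint64_t addr,
		     uint16_t len_opcode, uint32_t flags)
{
	struct gsi_tre tre;

	tre.addr = addr;
	tre.len_opcode = len_opcode;
	tre.reserved = 0;
	tre.flags = flags;

	memcpy(dest_tre, &tre, sizeof(tre));
}
```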

> be a cast to __int128 like

I find that my environment supports 128 bit integers.  But...

> #ifdef CONFIG_ARCH_SUPPORTS_INT128
> typedef __int128 tre128_t;
> #else
> typedef struct { __u64 a; __u64 b; } tre128_t;
> #endif
> 
> static inline void set_tre(struct gsi_tre *dest_tre, struct gsi_tre *src_tre)
> {
>      *(volatile tre128_t *)dest_tre = *(tre128_t *)src_tre;
> }
...this produces two 8-bit assignments.  Could it be because
it's implemented as two 64-bit values?  I think so.  Dropping
the volatile qualifier produces a single "stp" instruction.

The only other thing I thought I could do to encourage
the compiler to do the right thing is define the type (or
variables) to have 128-bit alignment.  And doing that for
the original simple assignment didn't change the (desirable)
outcome, but I don't think it's really necessary in this
case, considering the single instruction uses two 64-bit
registers.
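
For reference, the alignment experiment can be sketched in user space
roughly as follows.  The structure name gsi_tre_aligned, its layout
and the helper tre_copy_aligned() are assumptions made for this
example; C11 alignas is used here, where kernel code would more likely
use an alignment attribute.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumed TRE layout, with the first member forcing 16-byte alignment */
struct gsi_tre_aligned {
	alignas(16) uint64_t addr;
	uint16_t len_opcode;
	uint16_t reserved;
	uint32_t flags;
};

static_assert(sizeof(struct gsi_tre_aligned) == 16, "size must stay 16 bytes");
static_assert(alignof(struct gsi_tre_aligned) == 16, "alignment must be 16");

/* Same simple structure assignment, now between 16-byte aligned objects */
void tre_copy_aligned(struct gsi_tre_aligned *dest,
		      const struct gsi_tre_aligned *src)
{
	*dest = *src;
}
```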

I'm going to leave it as it was originally; it's the simplest:
	*dest_tre = tre;

I added a comment about structuring the code this way with
the intention of getting the single instruction.  If a different
compiler produces different result

					-Alex

> 
>       Arnd
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-20 12:50               ` Alex Elder
@ 2019-05-20 14:43                 ` Arnd Bergmann
  2019-05-20 14:44                   ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-20 14:43 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Mon, May 20, 2019 at 2:50 PM Alex Elder <elder@linaro.org> wrote:
>
> On 5/20/19 4:25 AM, Arnd Bergmann wrote:
> > On Sun, May 19, 2019 at 7:11 PM Alex Elder <elder@linaro.org> wrote:
> >> On 5/17/19 1:44 PM, Alex Elder wrote:
> >>> On 5/17/19 1:33 PM, Arnd Bergmann wrote:
> >>>> On Fri, May 17, 2019 at 8:08 PM Alex Elder <elder@linaro.org>
> >>
> >> So it seems that I must *not* apply a volatile qualifier,
> >> because doing so restricts the compiler from making the
> >> single instruction optimization.
> >
> > Right, I guess that makes sense.
> >
> >> If I've missed something and you have another suggestion for
> >> me to try let me know and I'll try it.
> >
> > A memcpy() might do the right thing as well. Another idea would
>
> I find memcpy() does the right thing.
>
> > be a cast to __int128 like
>
> I find that my environment supports 128 bit integers.  But...
>
> > #ifdef CONFIG_ARCH_SUPPORTS_INT128
> > typedef __int128 tre128_t;
> > #else
> > typedef struct { __u64 a; __u64 b; } tre128_t;
> > #endif
> >
> > static inline void set_tre(struct gsi_tre *dest_tre, struct gsi_tre *src_tre)
> > {
> >      *(volatile tre128_t *)dest_tre = *(tre128_t *)src_tre;
> > }
> ...this produces two 8-bit assignments.  Could it be because
> it's implemented as two 64-bit values?  I think so.  Dropping
> the volatile qualifier produces a single "stp" instruction.

I have no idea how two 8-bit assignments could do that,
it sounds like a serious gcc bug, unless you mean two
8-byte assignments, which would be within the range
of expected behavior. If it's actually 8-bit stores, please
open a bug against gcc with a minimized test case.

> The only other thing I thought I could do to encourage
> the compiler to do the right thing is define the type (or
> variables) to have 128-bit alignment.  And doing that for
> the original simple assignment didn't change the (desirable)
> outcome, but I don't think it's really necessary in this
> case, considering the single instruction uses two 64-bit
> registers.
>
> I'm going to leave it as it was originally; it's the simplest:
>         *dest_tre = tre;
>
> I added a comment about structuring the code this way with
> the intention of getting the single instruction.  If a different
> compiler produces different result.

Ok, that's probably the best we can do then.

      Arnd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-20 14:43                 ` Arnd Bergmann
@ 2019-05-20 14:44                   ` Alex Elder
  2019-05-20 16:34                     ` Evan Green
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-20 14:44 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/20/19 9:43 AM, Arnd Bergmann wrote:
> I have no idea how two 8-bit assignments could do that,
> it sounds like a serious gcc bug, unless you mean two
> 8-byte assignments, which would be within the range
> of expected behavior. If it's actually 8-bit stores, please
> open a bug against gcc with a minimized test case.

Sorry, it's 8 *byte* assignments, not 8 bit.	-Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 12/18] soc: qcom: ipa: immediate commands
  2019-05-18  0:34       ` Alex Elder
@ 2019-05-20 14:50         ` Arnd Bergmann
  2019-05-20 14:55           ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Arnd Bergmann @ 2019-05-20 14:50 UTC (permalink / raw)
  To: Alex Elder
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Sat, May 18, 2019 at 2:34 AM Alex Elder <elder@linaro.org> wrote:
> On 5/15/19 7:35 AM, Alex Elder wrote:
> > On 5/15/19 3:16 AM, Arnd Bergmann wrote:
> >>
> >> This looks rather strange. I think I looked at it before and you explained
> >> it, but I have since forgotten what you do it for, so I assume everyone else
> >> that tries to understand this will have problems too.
> >
> > This is a bug.  I think I misunderstood why you were
> > puzzled before.  Now I get it.  I need to save that
> > DMA address and not free it at the end of the function
> > (except on error).
>
> OK, now I'm going to correct myself.  I hope I don't make
> any mistakes here because things are confused enough...
>
> Part of what I described previously is still true, namely
> there are tables that need to be initialized (i.e., the
> IPA needs to be told where they reside), and there is a
> separate step available to zero the content of the tables.
>
> But there really is no need for the AP to hang onto this
> DMA memory after this immediate command has been issued.
> I will add comments in the code to make it less surprising.
>
> But here's a summary of why.
>
> I think there are two things at play that make it confusing.
>
> The first thing is that these "header tables" are actually
> located in a region of shared memory ("smem") that is local
> to the IPA (not the AP).  The IPA_CMD_HDR_INIT_LOCAL
> immediate command is meant to:
> 1) define the header table location in IPA local memory
> 2) define the header table size
> 3) provide a buffer used to fill the table with its initial
>    contents
>
> The location and size are encoded in the flags field
> of the payload (offset and size).
>
> The initial contents are filled via DMA from a buffer
> in main memory, whose DMA address is supplied in the
> hdr_table_addr parameter in the payload.  The initial
> contents we supply are all zero.  So this is why we
> need to allocate DMA memory.
>
> The second thing is that this is an instance where the
> AP is responsible for performing some initialization
> of resources it may not "own" thereafter.  The IPA
> hardware owns this table, even though the AP needs to
> tell it where it sits in IPA local memory.  The AP is
> able to copy (using DMA) content into that table, but
> doing so involves a DMA transfer.
>
> More advanced features of the IPA would make more use
> of this header table, but those features are not yet
> supported so this initialization (and a subsequent,
> seemingly redundant zeroing) is all we do.
>
> Does that make sense?

Ok, that sounds reasonable, yes. I'm not sure if
dma_alloc_coherent() guarantees zero-initialization
though, so if that is required, you may have to
add a memset().

       Arnd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 12/18] soc: qcom: ipa: immediate commands
  2019-05-20 14:50         ` Arnd Bergmann
@ 2019-05-20 14:55           ` Alex Elder
  2019-05-20 17:35             ` Christoph Hellwig
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-20 14:55 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David Miller, Bjorn Andersson, Ilias Apalodimas, syadagir,
	mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List, Christoph Hellwig

On 5/20/19 9:50 AM, Arnd Bergmann wrote:
> Ok, that sounds reasonable, yes. I'm not sure if
> dma_alloc_coherent() guarantees zero-initialization
> though, so if that is required, you may have to
> add a memset().
I had dma_zalloc_coherent() originally but I think
Christoph Hellwig posted something recently that
removed that function, because dma_alloc_coherent()
always zeroes the underlying memory.

+Christoph, who might be able to explain or confirm

					-Alex

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-20 14:44                   ` Alex Elder
@ 2019-05-20 16:34                     ` Evan Green
  2019-05-20 16:50                       ` Alex Elder
  0 siblings, 1 reply; 66+ messages in thread
From: Evan Green @ 2019-05-20 16:34 UTC (permalink / raw)
  To: Alex Elder
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	syadagir, mjavid, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Mon, May 20, 2019 at 7:44 AM Alex Elder <elder@linaro.org> wrote:
>
> On 5/20/19 9:43 AM, Arnd Bergmann wrote:
> > I have no idea how two 8-bit assignments could do that,
> > it sounds like a serious gcc bug, unless you mean two
> > 8-byte assignments, which would be within the range
> > of expected behavior. If it's actually 8-bit stores, please
> > open a bug against gcc with a minimized test case.
>
> Sorry, it's 8 *byte* assignments, not 8 bit.    -Alex

Is it important to the hardware that you're writing all 128 bits of
this in an atomic manner? My understanding is that while you may get
different behaviors using various combinations of
volatile/aligned/packed, this is way deep in the compiler's "free to
do whatever I want" territory. If the hardware's going to misbehave if
you don't get an atomic 128-bit write, then I don't think this has
been nailed down enough, since I think Clang or even a different
version of GCC is within its right to split the writes up differently.

Is filling out the TRE touching memory that the hardware is also
watching at the same time? Usually when the hardware cares about the
contents of a struct, there's a particular (smaller) field that can be
written out atomically. I remember USB having these structs that
needed to be filled out, but the hardware wouldn't actually slurp them
up until the smaller "token" part was non-zero. If the hardware is
scanning this struct, it might be safer to do it in two steps: 1)
flush out the filled out struct except for the field with the "go" bit
(which you'd have zeroed), then 2) set the field containing the "go"
bit.
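
The two-step scheme described above can be sketched in user space
like this.  Everything here is hypothetical: struct ex_desc, its
fields and both helpers are invented for illustration, and a C11
release store stands in for whatever barrier a real driver would need
between filling the descriptor and setting its "go" field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor: the consumer ignores it while token is zero */
struct ex_desc {
	uint64_t addr;
	uint32_t len;
	_Atomic uint32_t token;	/* non-zero means "go" */
};

/* Producer side: fill in the body first, then publish the token */
void ex_desc_publish(struct ex_desc *d, uint64_t addr, uint32_t len,
		     uint32_t token)
{
	assert(token != 0);	/* zero is reserved for "not ready" */

	/* Step 1: everything except the "go" field, which stays zero */
	atomic_store_explicit(&d->token, 0, memory_order_relaxed);
	d->addr = addr;
	d->len = len;

	/* Step 2: release ordering keeps the stores above ahead of this one */
	atomic_store_explicit(&d->token, token, memory_order_release);
}

/* Consumer side: returns 1 and copies the body only once token is set */
int ex_desc_consume(struct ex_desc *d, uint64_t *addr, uint32_t *len)
{
	if (!atomic_load_explicit(&d->token, memory_order_acquire))
		return 0;

	*addr = d->addr;
	*len = d->len;

	return 1;
}
```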

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-20 16:34                     ` Evan Green
@ 2019-05-20 16:50                       ` Alex Elder
  2019-05-20 17:36                         ` Evan Green
  0 siblings, 1 reply; 66+ messages in thread
From: Alex Elder @ 2019-05-20 16:50 UTC (permalink / raw)
  To: Evan Green
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	syadagir, mjavid, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On 5/20/19 11:34 AM, Evan Green wrote:
> On Mon, May 20, 2019 at 7:44 AM Alex Elder <elder@linaro.org> wrote:
>>
>> On 5/20/19 9:43 AM, Arnd Bergmann wrote:
>>> I have no idea how two 8-bit assignments could do that,
>>> it sounds like a serious gcc bug, unless you mean two
>>> 8-byte assignments, which would be within the range
>>> of expected behavior. If it's actually 8-bit stores, please
>>> open a bug against gcc with a minimized test case.
>>
>> Sorry, it's 8 *byte* assignments, not 8 bit.    -Alex
> 
> Is it important to the hardware that you're writing all 128 bits of

No, it is not important in the ways you are describing.

We're just geeking out over how to get optimal performance.
A single 128-bit write is nicer than two 64-bit writes,
or more smaller writes.

The hardware won't touch the TRE until the doorbell gets
rung telling it that it is permitted to do so.  The doorbell
is an I/O write, which implies a memory barrier, so the entire
TRE will be up-to-date in memory regardless of whether we
write it 128 bits or 8 bits at a time.
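
As a rough user-space model of that ordering: TREs are written into
ring memory with ordinary stores, and nothing is handed to the
hardware until the doorbell is written.  The ring structure, its size
and the ex_ring_*() helpers are hypothetical, the TRE layout is
assumed, and a C11 release fence followed by a volatile store stands
in for the barrier implied by the I/O write in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define EX_RING_COUNT	8u	/* hypothetical ring size, a power of 2 */

static_assert((EX_RING_COUNT & (EX_RING_COUNT - 1)) == 0,
	      "ring size must be a power of 2");

/* Assumed 16-byte TRE layout */
struct gsi_tre {
	uint64_t addr;
	uint16_t len_opcode;
	uint16_t reserved;
	uint32_t flags;
};

struct ex_ring {
	struct gsi_tre tre[EX_RING_COUNT];
	uint32_t wr_index;		/* next slot software will fill */
	volatile uint32_t doorbell;	/* stand-in for the MMIO register */
};

/* Ordinary stores; how the compiler splits them does not matter here */
void ex_ring_add(struct ex_ring *ring, const struct gsi_tre *tre)
{
	ring->tre[ring->wr_index++ % EX_RING_COUNT] = *tre;
}

/* Make all earlier TRE stores visible, then tell the "hardware" */
void ex_ring_doorbell(struct ex_ring *ring)
{
	atomic_thread_fence(memory_order_release);
	ring->doorbell = ring->wr_index;
}
```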

					-Alex

> this in an atomic manner? My understanding is that while you may get
> different behaviors using various combinations of
> volatile/aligned/packed, this is way deep in the compiler's "free to
> do whatever I want" territory. If the hardware's going to misbehave if
> you don't get an atomic 128-bit write, then I don't think this has
> been nailed down enough, since I think Clang or even a different
> version of GCC is within its right to split the writes up differently.
> 
> Is filling out the TRE touching memory that the hardware is also
> watching at the same time? Usually when the hardware cares about the
> contents of a struct, there's a particular (smaller) field that can be
> written out atomically. I remember USB having these structs that
> needed to be filled out, but the hardware wouldn't actually slurp them
> up until the smaller "token" part was non-zero. If the hardware is
> scanning this struct, it might be safer to do it in two steps: 1)
> flush out the filled out struct except for the field with the "go" bit
> (which you'd have zeroed), then 2) set the field containing the "go"
> bit.
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 12/18] soc: qcom: ipa: immediate commands
  2019-05-20 14:55           ` Alex Elder
@ 2019-05-20 17:35             ` Christoph Hellwig
  0 siblings, 0 replies; 66+ messages in thread
From: Christoph Hellwig @ 2019-05-20 17:35 UTC (permalink / raw)
  To: Alex Elder
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	syadagir, mjavid, evgreen, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List, Christoph Hellwig

On Mon, May 20, 2019 at 09:55:42AM -0500, Alex Elder wrote:
> On 5/20/19 9:50 AM, Arnd Bergmann wrote:
> > Ok, that sounds reasonable, yes. I'm not sure if
> > dma_alloc_coherent() guarantees zero-initialization
> > though, so if that is required, you may have to
> > add a memset().
> I had dma_zalloc_coherent() originally but I think
> Christoph Hellwig posted something recently that
> removed that function, because dma_alloc_coherent()
> always zeroes the underlying memory.
> 
> +Christoph, who might be able to explain or confirm

dma_alloc_coherent always zeroes the returned memory, which
is why dma_zalloc_coherent has been removed.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions
  2019-05-20 16:50                       ` Alex Elder
@ 2019-05-20 17:36                         ` Evan Green
  0 siblings, 0 replies; 66+ messages in thread
From: Evan Green @ 2019-05-20 17:36 UTC (permalink / raw)
  To: Alex Elder
  Cc: Arnd Bergmann, David Miller, Bjorn Andersson, Ilias Apalodimas,
	syadagir, mjavid, Ben Chan, Eric Caruso, abhishek.esse,
	Linux Kernel Mailing List

On Mon, May 20, 2019 at 9:50 AM Alex Elder <elder@linaro.org> wrote:
>
> On 5/20/19 11:34 AM, Evan Green wrote:
> > On Mon, May 20, 2019 at 7:44 AM Alex Elder <elder@linaro.org> wrote:
> >>
> >> On 5/20/19 9:43 AM, Arnd Bergmann wrote:
> >>> I have no idea how two 8-bit assignments could do that,
> >>> it sounds like a serious gcc bug, unless you mean two
> >>> 8-byte assignments, which would be within the range
> >>> of expected behavior. If it's actually 8-bit stores, please
> >>> open a bug against gcc with a minimized test case.
> >>
> >> Sorry, it's 8 *byte* assignments, not 8 bit.    -Alex
> >
> > Is it important to the hardware that you're writing all 128 bits of
>
> No, it is not important in the ways you are describing.
>
> We're just geeking out over how to get optimal performance.
> A single 128-bit write is nicer than two 64-bit writes,
> or more smaller writes.
>
> The hardware won't touch the TRE until the doorbell gets
> rung telling it that it is permitted to do so.  The doorbell
> is an I/O write, which implies a memory barrier, so the entire
> TRE will be up-to-date in memory regardless of whether we
> write it 128 bits or 8 bits at a time.
>

Ah, understood. Carry on!

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2019-05-20 17:37 UTC | newest]

Thread overview: 66+ messages
2019-05-12  1:24 [PATCH 00/18] net: introduce Qualcomm IPA driver Alex Elder
2019-05-12  1:24 ` [PATCH 01/18] bitfield.h: add FIELD_MAX() and field_max() Alex Elder
2019-05-12  6:33   ` Kalle Valo
2019-05-12 12:18     ` Alex Elder
2019-05-12 19:30       ` Johannes Berg
2019-05-12  1:24 ` [PATCH 02/18] soc: qcom: create "include/soc/qcom/rmnet.h" Alex Elder
2019-05-12  2:34   ` Joe Perches
2019-05-12 12:15     ` Alex Elder
2019-05-15  6:59   ` Arnd Bergmann
2019-05-15 12:03     ` Alex Elder
2019-05-16  1:09       ` Subash Abhinov Kasiviswanathan
2019-05-17 17:27         ` Alex Elder
2019-05-17 18:08           ` Subash Abhinov Kasiviswanathan
2019-05-19 17:37             ` Alex Elder
2019-05-12  1:24 ` [PATCH 03/18] dt-bindings: soc: qcom: add IPA bindings Alex Elder
2019-05-15  7:03   ` Arnd Bergmann
2019-05-15 12:04     ` Alex Elder
2019-05-15 16:50       ` Rob Herring
2019-05-15 17:05         ` Alex Elder
2019-05-12  1:24 ` [PATCH 04/18] soc: qcom: ipa: main code Alex Elder
2019-05-12  1:24 ` [PATCH 05/18] soc: qcom: ipa: configuration data Alex Elder
2019-05-12  1:24 ` [PATCH 06/18] soc: qcom: ipa: clocking, interrupts, and memory Alex Elder
2019-05-12  1:24 ` [PATCH 07/18] soc: qcom: ipa: GSI headers Alex Elder
2019-05-12  1:24 ` [PATCH 08/18] soc: qcom: ipa: the generic software interface Alex Elder
2019-05-15  7:21   ` Arnd Bergmann
2019-05-15 12:13     ` Alex Elder
2019-05-15 12:40       ` Arnd Bergmann
2019-05-15 10:47   ` Arnd Bergmann
2019-05-15 13:32     ` Alex Elder
2019-05-15 19:37   ` Arnd Bergmann
2019-05-12  1:24 ` [PATCH 09/18] soc: qcom: ipa: GSI transactions Alex Elder
2019-05-15  7:34   ` Arnd Bergmann
2019-05-15 12:25     ` Alex Elder
2019-05-15 20:50       ` Arnd Bergmann
2019-05-17 18:08     ` Alex Elder
2019-05-17 18:33       ` Arnd Bergmann
2019-05-17 18:44         ` Alex Elder
2019-05-19 17:11           ` Alex Elder
2019-05-20  9:25             ` Arnd Bergmann
2019-05-20 12:50               ` Alex Elder
2019-05-20 14:43                 ` Arnd Bergmann
2019-05-20 14:44                   ` Alex Elder
2019-05-20 16:34                     ` Evan Green
2019-05-20 16:50                       ` Alex Elder
2019-05-20 17:36                         ` Evan Green
2019-05-12  1:25 ` [PATCH 10/18] soc: qcom: ipa: IPA interface to GSI Alex Elder
2019-05-12  1:25 ` [PATCH 11/18] soc: qcom: ipa: IPA endpoints Alex Elder
2019-05-12  1:25 ` [PATCH 12/18] soc: qcom: ipa: immediate commands Alex Elder
2019-05-15  8:16   ` Arnd Bergmann
2019-05-15 12:35     ` Alex Elder
2019-05-18  0:34       ` Alex Elder
2019-05-20 14:50         ` Arnd Bergmann
2019-05-20 14:55           ` Alex Elder
2019-05-20 17:35             ` Christoph Hellwig
2019-05-12  1:25 ` [PATCH 13/18] soc: qcom: ipa: IPA network device and microcontroller Alex Elder
2019-05-15  8:21   ` Arnd Bergmann
2019-05-15 12:46     ` Alex Elder
2019-05-12  1:25 ` [PATCH 14/18] soc: qcom: ipa: AP/modem communications Alex Elder
2019-05-12  1:25 ` [PATCH 15/18] soc: qcom: ipa: support build of IPA code Alex Elder
2019-05-12  1:25 ` [PATCH 16/18] MAINTAINERS: add entry for the Qualcomm IPA driver Alex Elder
2019-05-12  1:25 ` [PATCH 17/18] arm64: dts: sdm845: add IPA information Alex Elder
2019-05-12  1:25 ` [PATCH 18/18] arm64: defconfig: enable build of IPA code Alex Elder
2019-05-15  8:23   ` Arnd Bergmann
2019-05-15 12:49     ` Alex Elder
2019-05-15 12:37 ` [PATCH 00/18] net: introduce Qualcomm IPA driver Arnd Bergmann
2019-05-15 12:52   ` Alex Elder
