Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RFC v4 net-next 0/2] Add RTNL interface for SyncE
@ 2021-09-03 15:14 Maciej Machnikowski
  2021-09-03 15:14 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
  2021-09-03 15:14 ` [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state Maciej Machnikowski
  0 siblings, 2 replies; 42+ messages in thread
From: Maciej Machnikowski @ 2021-09-03 15:14 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba, linux-kselftest

Synchronous Ethernet networks use a physical layer clock to syntonize
the frequency across different network elements.

Basic SyncE node defined in the ITU-T G.8264 consist of an Ethernet
Equipment Clock (EEC) and have the ability to recover synchronization
from the synchronization inputs - either traffic interfaces or external
frequency sources.
The EEC can synchronize its frequency (syntonize) to any of those sources.
It is also able to select synchronization source through priority tables
and synchronization status messaging. It also provides neccessary
filtering and holdover capabilities

This patch series introduces basic interface for reading the Ethernet
Equipment Clock (EEC) state on a SyncE capable device. This state gives
information about the source of the syntonization signal (ether my port,
or any external one) and the state of EEC. This interface is required\
to implement Synchronization Status Messaging on upper layers.

Next steps:
 - add interface to enable source clocks and get information about them
 - properly return the EEC_SRC_PORT flag depending on the port recovered
   clock being enabled and locked

v2:
- removed whitespace changes
- fix issues reported by test robot
v3:
- Changed naming from SyncE to EEC
- Clarify cover letter and commit message for patch 1
v4:
- Removed sync_source and pin_idx info
- Changed one structure to attributes
- Added EEC_SRC_PORT flag to indicate that the EEC is synchronized
  to the recovered clock of a port that returns the state

Maciej Machnikowski (2):
  rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  ice: add support for reading SyncE DPLL state

 drivers/net/ethernet/intel/ice/ice.h          |  5 ++
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   | 34 +++++++++
 drivers/net/ethernet/intel/ice/ice_common.c   | 62 ++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |  4 ++
 drivers/net/ethernet/intel/ice/ice_devids.h   |  3 +
 drivers/net/ethernet/intel/ice/ice_main.c     | 29 ++++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c      | 35 +++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   | 44 ++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h   | 22 ++++++
 include/linux/netdevice.h                     |  6 ++
 include/uapi/linux/if_link.h                  | 31 ++++++++
 include/uapi/linux/rtnetlink.h                |  3 +
 net/core/rtnetlink.c                          | 71 +++++++++++++++++++
 security/selinux/nlmsgtab.c                   |  3 +-
 14 files changed, 351 insertions(+), 1 deletion(-)

-- 
2.26.3


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-03 15:14 [RFC v4 net-next 0/2] Add RTNL interface for SyncE Maciej Machnikowski
@ 2021-09-03 15:14 ` Maciej Machnikowski
  2021-09-03 16:18   ` Stephen Hemminger
                     ` (2 more replies)
  2021-09-03 15:14 ` [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state Maciej Machnikowski
  1 sibling, 3 replies; 42+ messages in thread
From: Maciej Machnikowski @ 2021-09-03 15:14 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba, linux-kselftest

This patch series introduces basic interface for reading the Ethernet
Equipment Clock (EEC) state on a SyncE capable device. This state gives
information about the state of EEC. This interface is required to
implement Synchronization Status Messaging on upper layers.

Initial implementation returns SyncE EEC state and flags attributes.
The only flag currently implemented is the EEC_SRC_PORT. When it's set
the EEC is synchronized to the recovered clock recovered from the
current port.

SyncE EEC state read needs to be implemented as a ndo_get_eec_state
function.

Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 include/linux/netdevice.h      |  6 +++
 include/uapi/linux/if_link.h   | 31 +++++++++++++++
 include/uapi/linux/rtnetlink.h |  3 ++
 net/core/rtnetlink.c           | 71 ++++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |  3 +-
 5 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 7c41593c1d6a..de78b8ed903c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1344,6 +1344,8 @@ struct netdev_net_notifier {
  *	The caller must be under RCU read context.
  * int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path);
  *     Get the forwarding path to reach the real device from the HW destination address
+ * int (*ndo_get_synce_state)(struct net_device *dev, struct if_eec_state_msg *state)
+ *	Get state of physical layer frequency syntonization (SyncE)
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1563,6 +1565,10 @@ struct net_device_ops {
 	struct net_device *	(*ndo_get_peer_dev)(struct net_device *dev);
 	int                     (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx,
                                                          struct net_device_path *path);
+	int			(*ndo_get_eec_state)(struct net_device *dev,
+						     enum if_eec_state *state,
+						     u32 *eec_flags,
+						     struct netlink_ext_ack *extack);
 };
 
 /**
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index eebd3894fe89..78a8a5af8cd8 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1273,4 +1273,35 @@ enum {
 
 #define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
 
+/* SyncE section */
+
+enum if_eec_state {
+	IF_EEC_STATE_INVALID = 0,
+	IF_EEC_STATE_FREERUN,
+	IF_EEC_STATE_LOCKACQ,
+	IF_EEC_STATE_LOCKREC,
+	IF_EEC_STATE_LOCKED,
+	IF_EEC_STATE_HOLDOVER,
+	IF_EEC_STATE_OPEN_LOOP,
+	__IF_EEC_STATE_MAX,
+};
+
+#define IF_EEC_STATE_MAX	(__IF_EEC_STATE_MAX - 1)
+#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the port is
+					  * currently the source for the EEC
+					  */
+
+struct if_eec_state_msg {
+	__u32 ifindex;
+};
+
+enum {
+	IFLA_EEC_UNSPEC,
+	IFLA_EEC_STATE,
+	IFLA_EEC_FLAGS,
+	__IFLA_EEC_MAX,
+};
+
+#define IFLA_EEC_MAX (__IFLA_EEC_MAX - 1)
+
 #endif /* _UAPI_LINUX_IF_LINK_H */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 5888492a5257..642ac18a09d9 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -185,6 +185,9 @@ enum {
 	RTM_GETNEXTHOPBUCKET,
 #define RTM_GETNEXTHOPBUCKET	RTM_GETNEXTHOPBUCKET
 
+	RTM_GETEECSTATE = 120,
+#define RTM_GETEECSTATE	RTM_GETEECSTATE
+
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 972c8cb303a5..11d4e96d3371 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -5468,6 +5468,75 @@ static int rtnl_stats_dump(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
+static int rtnl_fill_eec_state(struct sk_buff *skb, struct net_device *dev,
+			       u32 portid, u32 seq, struct netlink_callback *cb,
+			       int flags, struct netlink_ext_ack *extack)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct if_eec_state_msg *state_msg;
+	enum if_eec_state state;
+	struct nlmsghdr *nlh;
+	u32 eec_flags;
+	int err;
+
+	ASSERT_RTNL();
+
+	if (!ops->ndo_get_eec_state)
+		return -EOPNOTSUPP;
+
+	err = ops->ndo_get_eec_state(dev, &state, &eec_flags, extack);
+	if (err)
+		return err;
+
+	nlh = nlmsg_put(skb, portid, seq, RTM_GETEECSTATE, sizeof(*state_msg),
+			flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	state_msg = nlmsg_data(nlh);
+	state_msg->ifindex = dev->ifindex;
+
+	if (nla_put(skb, IFLA_EEC_STATE, sizeof(state), &state) ||
+	    nla_put_u32(skb, IFLA_EEC_FLAGS, eec_flags))
+		return -EMSGSIZE;
+
+	nlmsg_end(skb, nlh);
+	return 0;
+}
+
+static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh,
+			      struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct if_eec_state_msg *state;
+	struct net_device *dev;
+	struct sk_buff *nskb;
+	int err;
+
+	state = nlmsg_data(nlh);
+	if (state->ifindex > 0)
+		dev = __dev_get_by_index(net, state->ifindex);
+	else
+		return -EINVAL;
+
+	if (!dev)
+		return -ENODEV;
+
+	nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!nskb)
+		return -ENOBUFS;
+
+	err = rtnl_fill_eec_state(nskb, dev, NETLINK_CB(skb).portid,
+				  nlh->nlmsg_seq, NULL, nlh->nlmsg_flags,
+				  extack);
+	if (err < 0)
+		kfree_skb(nskb);
+	else
+		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+
+	return err;
+}
+
 /* Process one rtnetlink message. */
 
 static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -5693,4 +5762,6 @@ void __init rtnetlink_init(void)
 
 	rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump,
 		      0);
+
+	rtnl_register(PF_UNSPEC, RTM_GETEECSTATE, rtnl_eec_state_get, NULL, 0);
 }
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index d59276f48d4f..0e2d84d6045f 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -91,6 +91,7 @@ static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_NEWNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_DELNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_GETNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
+	{ RTM_GETEECSTATE,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
 };
 
 static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
@@ -174,7 +175,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 		 * structures at the top of this file with the new mappings
 		 * before updating the BUILD_BUG_ON() macro!
 		 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWNEXTHOPBUCKET + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_GETEECSTATE + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;
-- 
2.26.3


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state
  2021-09-03 15:14 [RFC v4 net-next 0/2] Add RTNL interface for SyncE Maciej Machnikowski
  2021-09-03 15:14 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
@ 2021-09-03 15:14 ` Maciej Machnikowski
  2021-09-03 22:06   ` Jakub Kicinski
  2021-09-21 13:25   ` Ido Schimmel
  1 sibling, 2 replies; 42+ messages in thread
From: Maciej Machnikowski @ 2021-09-03 15:14 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba, linux-kselftest

Implement SyncE DPLL monitoring for E810-T devices.
Poll loop will periodically check the state of the DPLL and cache it
in the pf structure. State changes will be logged in the system log.

Cached state can be read using the RTM_GETEECSTATE rtnetlink
message.

Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          |  5 ++
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   | 34 ++++++++++
 drivers/net/ethernet/intel/ice/ice_common.c   | 62 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |  4 ++
 drivers/net/ethernet/intel/ice/ice_devids.h   |  3 +
 drivers/net/ethernet/intel/ice/ice_main.c     | 29 +++++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c      | 35 +++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   | 44 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h   | 22 +++++++
 9 files changed, 238 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index eadcb9958346..6fb7e07e8a62 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -508,6 +508,11 @@ struct ice_pf {
 #define ICE_VF_AGG_NODE_ID_START	65
 #define ICE_MAX_VF_AGG_NODES		32
 	struct ice_agg_node vf_agg_node[ICE_MAX_VF_AGG_NODES];
+
+	enum if_eec_state synce_dpll_state;
+	u8 synce_dpll_pin;
+	enum if_eec_state ptp_dpll_state;
+	u8 ptp_dpll_pin;
 };
 
 struct ice_netdev_priv {
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 21b4c7cd6f05..b84da5e9d025 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1727,6 +1727,36 @@ struct ice_aqc_add_rdma_qset_data {
 	struct ice_aqc_add_tx_rdma_qset_entry rdma_qsets[];
 };
 
+/* Get CGU DPLL status (direct 0x0C66) */
+struct ice_aqc_get_cgu_dpll_status {
+	u8 dpll_num;
+	u8 ref_state;
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_LOS		BIT(0)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_SCM		BIT(1)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_CFM		BIT(2)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_GST		BIT(3)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_PFM		BIT(4)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_ESYNC	BIT(6)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_FAST_LOCK_EN	BIT(7)
+	__le16 dpll_state;
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK		BIT(0)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO		BIT(1)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY	BIT(2)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_FLHIT		BIT(5)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_PSLHIT	BIT(7)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT	8
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL	\
+	ICE_M(0x1F, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT)
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT	13
+#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE \
+	ICE_M(0x7, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT)
+	__le32 phase_offset_h;
+	__le32 phase_offset_l;
+	u8 eec_mode;
+	u8 rsvd[1];
+	__le16 node_handle;
+};
+
 /* Configure Firmware Logging Command (indirect 0xFF09)
  * Logging Information Read Response (indirect 0xFF10)
  * Note: The 0xFF10 command has no input parameters.
@@ -1954,6 +1984,7 @@ struct ice_aq_desc {
 		struct ice_aqc_fw_logging fw_logging;
 		struct ice_aqc_get_clear_fw_log get_clear_fw_log;
 		struct ice_aqc_download_pkg download_pkg;
+		struct ice_aqc_get_cgu_dpll_status get_cgu_dpll_status;
 		struct ice_aqc_driver_shared_params drv_shared_params;
 		struct ice_aqc_set_mac_lb set_mac_lb;
 		struct ice_aqc_alloc_free_res_cmd sw_res_ctrl;
@@ -2108,6 +2139,9 @@ enum ice_adminq_opc {
 	ice_aqc_opc_update_pkg				= 0x0C42,
 	ice_aqc_opc_get_pkg_info_list			= 0x0C43,
 
+	/* 1588/SyncE commands/events */
+	ice_aqc_opc_get_cgu_dpll_status			= 0x0C66,
+
 	ice_aqc_opc_driver_shared_params		= 0x0C90,
 
 	/* Standalone Commands/Events */
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 2fb81e359cdf..e7474643a421 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -69,6 +69,31 @@ bool ice_is_e810(struct ice_hw *hw)
 	return hw->mac_type == ICE_MAC_E810;
 }
 
+/**
+ * ice_is_e810t
+ * @hw: pointer to the hardware structure
+ *
+ * returns true if the device is E810T based, false if not.
+ */
+bool ice_is_e810t(struct ice_hw *hw)
+{
+	switch (hw->device_id) {
+	case ICE_DEV_ID_E810C_SFP:
+		if (hw->subsystem_device_id == ICE_SUBDEV_ID_E810T ||
+		    hw->subsystem_device_id == ICE_SUBDEV_ID_E810T2)
+			return true;
+		break;
+	case ICE_DEV_ID_E810C_QSFP:
+		if (hw->subsystem_device_id == ICE_SUBDEV_ID_E810T2)
+			return true;
+		break;
+	default:
+		break;
+	}
+
+	return false;
+}
+
 /**
  * ice_clear_pf_cfg - Clear PF configuration
  * @hw: pointer to the hardware structure
@@ -4520,6 +4545,42 @@ ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
 	return ice_status_to_errno(status);
 }
 
+/**
+ * ice_aq_get_cgu_dpll_status
+ * @hw: pointer to the HW struct
+ * @dpll_num: DPLL index
+ * @ref_state: Reference clock state
+ * @dpll_state: DPLL state
+ * @phase_offset: Phase offset in ps
+ * @eec_mode: EEC_mode
+ *
+ * Get CGU DPLL status (0x0C66)
+ */
+enum ice_status
+ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state,
+			   u16 *dpll_state, u64 *phase_offset, u8 *eec_mode)
+{
+	struct ice_aqc_get_cgu_dpll_status *cmd;
+	struct ice_aq_desc desc;
+	enum ice_status status;
+
+	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_cgu_dpll_status);
+	cmd = &desc.params.get_cgu_dpll_status;
+	cmd->dpll_num = dpll_num;
+
+	status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL);
+	if (!status) {
+		*ref_state = cmd->ref_state;
+		*dpll_state = le16_to_cpu(cmd->dpll_state);
+		*phase_offset = le32_to_cpu(cmd->phase_offset_h);
+		*phase_offset <<= 32;
+		*phase_offset += le32_to_cpu(cmd->phase_offset_l);
+		*eec_mode = cmd->eec_mode;
+	}
+
+	return status;
+}
+
 /**
  * ice_replay_pre_init - replay pre initialization
  * @hw: pointer to the HW struct
@@ -4974,3 +5035,4 @@ bool ice_fw_supports_report_dflt_cfg(struct ice_hw *hw)
 	}
 	return false;
 }
+
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h
index fb16070f02e2..ccd76c0cbf2c 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.h
+++ b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -100,6 +100,7 @@ enum ice_status
 ice_aq_manage_mac_write(struct ice_hw *hw, const u8 *mac_addr, u8 flags,
 			struct ice_sq_cd *cd);
 bool ice_is_e810(struct ice_hw *hw);
+bool ice_is_e810t(struct ice_hw *hw);
 enum ice_status ice_clear_pf_cfg(struct ice_hw *hw);
 enum ice_status
 ice_aq_set_phy_cfg(struct ice_hw *hw, struct ice_port_info *pi,
@@ -156,6 +157,9 @@ ice_cfg_vsi_rdma(struct ice_port_info *pi, u16 vsi_handle, u16 tc_bitmap,
 int
 ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
 		      u16 *rdma_qset, u16 num_qsets, u32 *qset_teid);
+enum ice_status
+ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state,
+			   u16 *dpll_state, u64 *phase_offset, u8 *eec_mode);
 int
 ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
 		      u16 *q_id);
diff --git a/drivers/net/ethernet/intel/ice/ice_devids.h b/drivers/net/ethernet/intel/ice/ice_devids.h
index 9d8194671f6a..e52dbeddb783 100644
--- a/drivers/net/ethernet/intel/ice/ice_devids.h
+++ b/drivers/net/ethernet/intel/ice/ice_devids.h
@@ -52,4 +52,7 @@
 /* Intel(R) Ethernet Connection E822-L 1GbE */
 #define ICE_DEV_ID_E822L_SGMII		0x189A
 
+#define ICE_SUBDEV_ID_E810T		0x000E
+#define ICE_SUBDEV_ID_E810T2		0x000F
+
 #endif /* _ICE_DEVIDS_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 0d6c143f6653..62a011f135bd 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5978,6 +5978,34 @@ static void ice_napi_disable_all(struct ice_vsi *vsi)
 	}
 }
 
+/**
+ * ice_get_eec_state - get state of SyncE DPLL
+ * @netdev: network interface device structure
+ * @state: state of SyncE DPLL
+ * @sync_src: port is a source of the sync signal
+ */
+static int
+ice_get_eec_state(struct net_device *netdev, enum if_eec_state *state,
+		  u32 *eec_flags, struct netlink_ext_ack *extack)
+{
+	struct ice_netdev_priv *np = netdev_priv(netdev);
+	struct ice_vsi *vsi = np->vsi;
+	struct ice_pf *pf = vsi->back;
+
+	if (!ice_is_e810t(&pf->hw))
+		return -EOPNOTSUPP;
+
+	*state = pf->synce_dpll_state;
+
+	*eec_flags = 0;
+	if (pf->synce_dpll_pin == REF1P || pf->synce_dpll_pin == REF1N) {
+		/* Add check if the current PF drives the EEC clock */
+		*eec_flags |= EEC_SRC_PORT;
+	}
+
+	return 0;
+}
+
 /**
  * ice_down - Shutdown the connection
  * @vsi: The VSI being stopped
@@ -7268,4 +7296,5 @@ static const struct net_device_ops ice_netdev_ops = {
 	.ndo_bpf = ice_xdp,
 	.ndo_xdp_xmit = ice_xdp_xmit,
 	.ndo_xsk_wakeup = ice_xsk_wakeup,
+	.ndo_get_eec_state = ice_get_eec_state,
 };
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 05cc5870e4ef..ccbc2d1cc754 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -1409,6 +1409,36 @@ static void ice_ptp_tx_tstamp_cleanup(struct ice_ptp_tx *tx)
 	}
 }
 
+static void ice_handle_cgu_state(struct ice_pf *pf)
+{
+	enum if_eec_state cgu_state;
+	u8 pin;
+
+	cgu_state = ice_get_dpll_state(&pf->hw, ICE_CGU_DPLL_SYNCE, &pin);
+	if (pf->synce_dpll_state != cgu_state) {
+		pf->synce_dpll_state = cgu_state;
+		pf->synce_dpll_pin = pin;
+
+		dev_warn(ice_pf_to_dev(pf),
+			 "<DPLL%i> state changed to: %d, pin %d",
+			 ICE_CGU_DPLL_SYNCE,
+			 pf->synce_dpll_state,
+			 pin);
+	}
+
+	cgu_state = ice_get_dpll_state(&pf->hw, ICE_CGU_DPLL_PTP, &pin);
+	if (pf->ptp_dpll_state != cgu_state) {
+		pf->ptp_dpll_state = cgu_state;
+		pf->ptp_dpll_pin = pin;
+
+		dev_warn(ice_pf_to_dev(pf),
+			 "<DPLL%i> state changed to: %d, pin %d",
+			 ICE_CGU_DPLL_PTP,
+			 pf->ptp_dpll_state,
+			 pin);
+	}
+}
+
 static void ice_ptp_periodic_work(struct kthread_work *work)
 {
 	struct ice_ptp *ptp = container_of(work, struct ice_ptp, work.work);
@@ -1417,6 +1447,10 @@ static void ice_ptp_periodic_work(struct kthread_work *work)
 	if (!test_bit(ICE_FLAG_PTP, pf->flags))
 		return;
 
+	if (ice_is_e810t(&pf->hw) &&
+	    &pf->hw.func_caps.ts_func_info.src_tmr_owned)
+		ice_handle_cgu_state(pf);
+
 	ice_ptp_update_cached_phctime(pf);
 
 	ice_ptp_tx_tstamp_cleanup(&pf->ptp.port.tx);
@@ -1598,3 +1632,4 @@ void ice_ptp_release(struct ice_pf *pf)
 
 	dev_info(ice_pf_to_dev(pf), "Removed PTP clock\n");
 }
+
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 3eca0e4eab0b..df09f176512b 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -375,6 +375,50 @@ static int ice_ptp_port_cmd_e810(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
 	return 0;
 }
 
+/**
+ * ice_get_dpll_state - get the state of the DPLL
+ * @hw: pointer to the hw struct
+ * @dpll_idx: Index of internal DPLL unit
+ * @pin: pointer to a buffer for returning currently active pin
+ *
+ * This function will read the state of the DPLL(dpll_idx). If optional
+ * parameter pin is given it'll be used to retrieve currently active pin.
+ *
+ * Return: state of the DPLL
+ */
+enum if_eec_state
+ice_get_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin)
+{
+	enum ice_status status;
+	u64 phase_offset;
+	u16 dpll_state;
+	u8 ref_state;
+	u8 eec_mode;
+
+	if (dpll_idx >= ICE_CGU_DPLL_MAX)
+		return IF_EEC_STATE_INVALID;
+
+	status = ice_aq_get_cgu_dpll_status(hw, dpll_idx, &ref_state,
+					    &dpll_state, &phase_offset,
+					    &eec_mode);
+	if (status)
+		return IF_EEC_STATE_INVALID;
+
+	if (pin) {
+		/* current ref pin in dpll_state_refsel_status_X register */
+		*pin = (dpll_state &
+			ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL) >>
+		       ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT;
+	}
+
+	if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK)
+		return IF_EEC_STATE_LOCKED;
+	else if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO)
+		return IF_EEC_STATE_HOLDOVER;
+
+	return IF_EEC_STATE_FREERUN;
+}
+
 /* Device agnostic functions
  *
  * The following functions implement useful behavior to hide the differences
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
index 55a414e87018..cc5dc79fc926 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
@@ -30,6 +30,8 @@ int ice_clear_phy_tstamp(struct ice_hw *hw, u8 block, u8 idx);
 
 /* E810 family functions */
 int ice_ptp_init_phy_e810(struct ice_hw *hw);
+enum if_eec_state
+ice_get_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin);
 
 #define PFTSYN_SEM_BYTES	4
 
@@ -76,4 +78,24 @@ int ice_ptp_init_phy_e810(struct ice_hw *hw);
 #define LOW_TX_MEMORY_BANK_START	0x03090000
 #define HIGH_TX_MEMORY_BANK_START	0x03090004
 
+enum ice_e810t_cgu_dpll {
+	ICE_CGU_DPLL_SYNCE,
+	ICE_CGU_DPLL_PTP,
+	ICE_CGU_DPLL_MAX
+};
+
+enum ice_e810t_cgu_pins {
+	REF0P,
+	REF0N,
+	REF1P,
+	REF1N,
+	REF2P,
+	REF2N,
+	REF3P,
+	REF3N,
+	REF4P,
+	REF4N,
+	NUM_E810T_CGU_PINS
+};
+
 #endif /* _ICE_PTP_HW_H_ */
-- 
2.26.3


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-03 15:14 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
@ 2021-09-03 16:18   ` Stephen Hemminger
  2021-09-03 22:20     ` Machnikowski, Maciej
  2021-09-03 22:14   ` Jakub Kicinski
  2021-09-21 13:15   ` Ido Schimmel
  2 siblings, 1 reply; 42+ messages in thread
From: Stephen Hemminger @ 2021-09-03 16:18 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi,
	anthony.l.nguyen, davem, kuba, linux-kselftest

On Fri,  3 Sep 2021 17:14:35 +0200
Maciej Machnikowski <maciej.machnikowski@intel.com> wrote:

> This patch series introduces basic interface for reading the Ethernet
> Equipment Clock (EEC) state on a SyncE capable device. This state gives
> information about the state of EEC. This interface is required to
> implement Synchronization Status Messaging on upper layers.
> 
> Initial implementation returns SyncE EEC state and flags attributes.
> The only flag currently implemented is the EEC_SRC_PORT. When it's set
> the EEC is synchronized to the recovered clock recovered from the
> current port.
> 
> SyncE EEC state read needs to be implemented as a ndo_get_eec_state
> function.
> 
> Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>

Is there a simpler way to do this? Seems like you are adding
a lot for a use case specific to a small class of devices.
For example adding a new network device operation adds small
amount of bloat to every other network device in the kernel.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state
  2021-09-03 15:14 ` [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state Maciej Machnikowski
@ 2021-09-03 22:06   ` Jakub Kicinski
  2021-09-21 13:25   ` Ido Schimmel
  1 sibling, 0 replies; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-03 22:06 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi,
	anthony.l.nguyen, davem, linux-kselftest

On Fri,  3 Sep 2021 17:14:36 +0200 Maciej Machnikowski wrote:
> Implement SyncE DPLL monitoring for E810-T devices.
> Poll loop will periodically check the state of the DPLL and cache it
> in the pf structure. State changes will be logged in the system log.
> 
> Cached state can be read using the RTM_GETEECSTATE rtnetlink
> message.
> 
> Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>

kdoc issues here:

drivers/net/ethernet/intel/ice/ice_main.c:5990: warning: Function parameter or member 'eec_flags' not described in 'ice_get_eec_state'
drivers/net/ethernet/intel/ice/ice_main.c:5990: warning: Function parameter or member 'extack' not described in 'ice_get_eec_state'
drivers/net/ethernet/intel/ice/ice_main.c:5990: warning: Excess function parameter 'sync_src' description in 'ice_get_eec_state'

./scripts/kernel-doc -none and/or W=1 is your friend.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-03 15:14 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
  2021-09-03 16:18   ` Stephen Hemminger
@ 2021-09-03 22:14   ` Jakub Kicinski
  2021-09-06 18:30     ` Machnikowski, Maciej
  2021-09-21 13:15   ` Ido Schimmel
  2 siblings, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-03 22:14 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi,
	anthony.l.nguyen, davem, linux-kselftest, Andrew Lunn,
	Michal Kubecek

On Fri,  3 Sep 2021 17:14:35 +0200 Maciej Machnikowski wrote:
> This patch series introduces basic interface for reading the Ethernet
> Equipment Clock (EEC) state on a SyncE capable device. This state gives
> information about the state of EEC. This interface is required to
> implement Synchronization Status Messaging on upper layers.
> 
> Initial implementation returns SyncE EEC state and flags attributes.
> The only flag currently implemented is the EEC_SRC_PORT. When it's set
> the EEC is synchronized to the recovered clock recovered from the
> current port.
> 
> SyncE EEC state read needs to be implemented as a ndo_get_eec_state
> function.
> 
> Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>

Since we're talking SyncE-only now my intuition would be to put this 
op in ethtool. Was there a reason ethtool was not chosen? If not what
do others think / if yes can the reason be included in the commit
message?


> +/* SyncE section */
> +
> +enum if_eec_state {
> +	IF_EEC_STATE_INVALID = 0,
> +	IF_EEC_STATE_FREERUN,
> +	IF_EEC_STATE_LOCKACQ,
> +	IF_EEC_STATE_LOCKREC,
> +	IF_EEC_STATE_LOCKED,
> +	IF_EEC_STATE_HOLDOVER,
> +	IF_EEC_STATE_OPEN_LOOP,
> +	__IF_EEC_STATE_MAX,
> +};
> +
> +#define IF_EEC_STATE_MAX	(__IF_EEC_STATE_MAX - 1)

You don't need MAC for an output-only enum, MAX define in netlink is
usually for attributes to know how to size attribute arrays.

> +#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the port is
> +					  * currently the source for the EEC
> +					  */

Why include it then? Just leave the value out and if the attr is not
present user space should assume the source is port.

> +struct if_eec_state_msg {
> +	__u32 ifindex;
> +};
> +
> +enum {
> +	IFLA_EEC_UNSPEC,
> +	IFLA_EEC_STATE,
> +	IFLA_EEC_FLAGS,

With "SRC_PORT" gone there's no reason for flags at this point.

> +	__IFLA_EEC_MAX,
> +};
> +
> +#define IFLA_EEC_MAX (__IFLA_EEC_MAX - 1)

> +static int rtnl_fill_eec_state(struct sk_buff *skb, struct net_device *dev,
> +			       u32 portid, u32 seq, struct netlink_callback *cb,
> +			       int flags, struct netlink_ext_ack *extack)
> +{
> +	const struct net_device_ops *ops = dev->netdev_ops;
> +	struct if_eec_state_msg *state_msg;
> +	enum if_eec_state state;
> +	struct nlmsghdr *nlh;
> +	u32 eec_flags;
> +	int err;
> +
> +	ASSERT_RTNL();
> +
> +	if (!ops->ndo_get_eec_state)
> +		return -EOPNOTSUPP;
> +
> +	err = ops->ndo_get_eec_state(dev, &state, &eec_flags, extack);
> +	if (err)
> +		return err;
> +
> +	nlh = nlmsg_put(skb, portid, seq, RTM_GETEECSTATE, sizeof(*state_msg),
> +			flags);
> +	if (!nlh)
> +		return -EMSGSIZE;
> +
> +	state_msg = nlmsg_data(nlh);
> +	state_msg->ifindex = dev->ifindex;
> +
> +	if (nla_put(skb, IFLA_EEC_STATE, sizeof(state), &state) ||

This should be a u32.

> +	    nla_put_u32(skb, IFLA_EEC_FLAGS, eec_flags))
> +		return -EMSGSIZE;
> +
> +	nlmsg_end(skb, nlh);
> +	return 0;
> +}
> +
> +static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh,
> +			      struct netlink_ext_ack *extack)
> +{
> +	struct net *net = sock_net(skb->sk);
> +	struct if_eec_state_msg *state;
> +	struct net_device *dev;
> +	struct sk_buff *nskb;
> +	int err;
> +
> +	state = nlmsg_data(nlh);
> +	if (state->ifindex > 0)
> +		dev = __dev_get_by_index(net, state->ifindex);
> +	else
> +		return -EINVAL;
> +
> +	if (!dev)
> +		return -ENODEV;

I think I pointed this out already, this is more natural without the
else branch:

if (ifindex <= 0)
	return -EINVAL;

dev = ...
if (!dev)
	return -ENOENT;

or don't check the ifindex at all and let dev_get_by.. fail.


Thanks for pushing this API forward!

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-03 16:18   ` Stephen Hemminger
@ 2021-09-03 22:20     ` Machnikowski, Maciej
  0 siblings, 0 replies; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-03 22:20 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, kuba, linux-kselftest

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Friday, September 3, 2021 6:19 PM
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Fri,  3 Sep 2021 17:14:35 +0200
> Maciej Machnikowski <maciej.machnikowski@intel.com> wrote:
> 
> > This patch series introduces basic interface for reading the Ethernet
> > Equipment Clock (EEC) state on a SyncE capable device. This state gives
> > information about the state of EEC. This interface is required to
> > implement Synchronization Status Messaging on upper layers.
> >
> > Initial implementation returns SyncE EEC state and flags attributes.
> > The only flag currently implemented is the EEC_SRC_PORT. When it's set
> > the EEC is synchronized to the recovered clock recovered from the
> > current port.
> >
> > SyncE EEC state read needs to be implemented as a ndo_get_eec_state
> > function.
> >
> > Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
> 
> Is there a simpler way to do this? Seems like you are adding
> a lot for a use case specific to a small class of devices.
> For example adding a new network device operation adds small
> amount of bloat to every other network device in the kernel.

Hi!

I couldn't find any simpler way. Do you have something specific in mind?
A function pointer is only U64. I can hardly think of anything smaller.

Regards
Maciek


^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-03 22:14   ` Jakub Kicinski
@ 2021-09-06 18:30     ` Machnikowski, Maciej
  2021-09-06 18:39       ` Jakub Kicinski
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-06 18:30 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Saturday, September 4, 2021 12:14 AM
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Fri,  3 Sep 2021 17:14:35 +0200 Maciej Machnikowski wrote:
> > This patch series introduces basic interface for reading the Ethernet
> > Equipment Clock (EEC) state on a SyncE capable device. This state gives
> > information about the state of EEC. This interface is required to
> > implement Synchronization Status Messaging on upper layers.
> >
> > Initial implementation returns SyncE EEC state and flags attributes.
> > The only flag currently implemented is the EEC_SRC_PORT. When it's set
> > the EEC is synchronized to the recovered clock recovered from the
> > current port.
> >
> > SyncE EEC state read needs to be implemented as a ndo_get_eec_state
> > function.
> >
> > Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
> 
> Since we're talking SyncE-only now my intuition would be to put this
> op in ethtool. Was there a reason ethtool was not chosen? If not what
> do others think / if yes can the reason be included in the commit
> message?

Hmm. Main reason for netlink is that linuxptp already supports it,
and it was suggested by Richard.
Having an NDO would also make it easier to add a SyncE-related
files to the sysfs for easier operation (following the ideas from the ptp
pins subsystem).
But I'm open for suggestions. 

> 
>  
> > +#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the
> port is
> > +					  * currently the source for the EEC
> > +					  */
> 
> Why include it then? Just leave the value out and if the attr is not
> present user space should assume the source is port.

This bit has a different meaning. If it's set the port in question
is a frequency source for the multiport device, if it's cleared - some other
source is used as a source. This is needed to prevent setting invalid 
configurations in the PHY (like setting the frequency source as a Master
in AN) or sending invalid messages. If the port is a frequency source
it must always send back QL-DNU messages to prevent synchronization
loops.

> 
> or don't check the ifindex at all and let dev_get_by.. fail.
> 
> 
> Thanks for pushing this API forward!

Addressed all other comments - and thanks for giving a lot of helpful
suggestions!

Regards
Maciek

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-06 18:30     ` Machnikowski, Maciej
@ 2021-09-06 18:39       ` Jakub Kicinski
  2021-09-06 19:01         ` Machnikowski, Maciej
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-06 18:39 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek

On Mon, 6 Sep 2021 18:30:40 +0000 Machnikowski, Maciej wrote:
> > -----Original Message-----
> > From: Jakub Kicinski <kuba@kernel.org>
> > Sent: Saturday, September 4, 2021 12:14 AM
> > Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> > message to get SyncE status
> > 
> > On Fri,  3 Sep 2021 17:14:35 +0200 Maciej Machnikowski wrote:  
> > > This patch series introduces basic interface for reading the Ethernet
> > > Equipment Clock (EEC) state on a SyncE capable device. This state gives
> > > information about the state of EEC. This interface is required to
> > > implement Synchronization Status Messaging on upper layers.
> > >
> > > Initial implementation returns SyncE EEC state and flags attributes.
> > > The only flag currently implemented is the EEC_SRC_PORT. When it's set
> > > the EEC is synchronized to the recovered clock recovered from the
> > > current port.
> > >
> > > SyncE EEC state read needs to be implemented as a ndo_get_eec_state
> > > function.
> > >
> > > Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>  
> > 
> > Since we're talking SyncE-only now my intuition would be to put this
> > op in ethtool. Was there a reason ethtool was not chosen? If not what
> > do others think / if yes can the reason be included in the commit
> > message?  
> 
> Hmm. Main reason for netlink is that linuxptp already supports it,
> and it was suggested by Richard.
> Having an NDO would also make it easier to add a SyncE-related
> files to the sysfs for easier operation (following the ideas from the ptp
> pins subsystem).
> But I'm open for suggestions. 

I think linuxptp will need support for ethtool netlink sockets sooner
rather than later. Moving this to ethtool makes sense to me since it's
very much a Ethernet-oriented API at this point.

> > > +#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the  
> > port is  
> > > +					  * currently the source for the EEC
> > > +					  */  
> > 
> > Why include it then? Just leave the value out and if the attr is not
> > present user space should assume the source is port.  
> 
> This bit has a different meaning. If it's set the port in question
> is a frequency source for the multiport device, if it's cleared - some other
> source is used as a source. This is needed to prevent setting invalid 
> configurations in the PHY (like setting the frequency source as a Master
> in AN) or sending invalid messages. If the port is a frequency source
> it must always send back QL-DNU messages to prevent synchronization
> loops.

Ah! I see. Is being the "source" negotiated somehow? Don't we need to
give the user / linuxptp to select the source based on whatever info 
it has about topology?

> > or don't check the ifindex at all and let dev_get_by.. fail.
> > 
> > 
> > Thanks for pushing this API forward!  
> 
> Addressed all other comments - and thanks for giving a lot of helpful
> suggestions!

Thanks, BTW I think I forgot to ask for documentation, dumping info
about the API and context under Documentation/ would be great!

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-06 18:39       ` Jakub Kicinski
@ 2021-09-06 19:01         ` Machnikowski, Maciej
  2021-09-07  1:01           ` Jakub Kicinski
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-06 19:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek

> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Monday, September 6, 2021 8:39 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Mon, 6 Sep 2021 18:30:40 +0000 Machnikowski, Maciej wrote:
> > > -----Original Message-----
> > > From: Jakub Kicinski <kuba@kernel.org>
> > > Sent: Saturday, September 4, 2021 12:14 AM
> > > Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> > > message to get SyncE status
> > >
> > > On Fri,  3 Sep 2021 17:14:35 +0200 Maciej Machnikowski wrote:
> > > > This patch series introduces basic interface for reading the Ethernet
> > > > Equipment Clock (EEC) state on a SyncE capable device. This state gives
> > > > information about the state of EEC. This interface is required to
> > > > implement Synchronization Status Messaging on upper layers.
> > > >
> > > > Initial implementation returns SyncE EEC state and flags attributes.
> > > > The only flag currently implemented is the EEC_SRC_PORT. When it's
> set
> > > > the EEC is synchronized to the recovered clock recovered from the
> > > > current port.
> > > >
> > > > SyncE EEC state read needs to be implemented as a ndo_get_eec_state
> > > > function.
> > > >
> > > > Signed-off-by: Maciej Machnikowski
> <maciej.machnikowski@intel.com>
> > >
> > > Since we're talking SyncE-only now my intuition would be to put this
> > > op in ethtool. Was there a reason ethtool was not chosen? If not what
> > > do others think / if yes can the reason be included in the commit
> > > message?
> >
> > Hmm. Main reason for netlink is that linuxptp already supports it,
> > and it was suggested by Richard.
> > Having an NDO would also make it easier to add a SyncE-related
> > files to the sysfs for easier operation (following the ideas from the ptp
> > pins subsystem).
> > But I'm open for suggestions.
> 
> I think linuxptp will need support for ethtool netlink sockets sooner
> rather than later. Moving this to ethtool makes sense to me since it's
> very much a Ethernet-oriented API at this point.

Ethtool also makes a lot of sense, but will it be possible to still make sysfs,
and it makes sense to add it for some deployments (more on that below)

> > > > +#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the
> > > port is
> > > > +					  * currently the source for the EEC
> > > > +					  */
> > >
> > > Why include it then? Just leave the value out and if the attr is not
> > > present user space should assume the source is port.
> >
> > This bit has a different meaning. If it's set the port in question
> > is a frequency source for the multiport device, if it's cleared - some other
> > source is used as a source. This is needed to prevent setting invalid
> > configurations in the PHY (like setting the frequency source as a Master
> > in AN) or sending invalid messages. If the port is a frequency source
> > it must always send back QL-DNU messages to prevent synchronization
> > loops.
> 
> Ah! I see. Is being the "source" negotiated somehow? Don't we need to
> give the user / linuxptp to select the source based on whatever info
> it has about topology?

The frequency source can be either pre-set statically, negotiated using
ESMC QL-levels (if working in QL-Enabled mode), or follow automatic
fallback inside the device. This  flag gives feedback about the validity
of recovered clock coming from a given port and is useful when you
enable multiple recovered clocks on more than one port in
active-passive model. In that case the "driving" port may change 
dynamically, so it's a good idea to have some interface to reflect that.

That's where sysfs file be useful. When I add the implementation for
recovered clock configuration, the sysfs may be used as standalone 
interface for configuring them when no dynamic change is needed.
 
> > > or don't check the ifindex at all and let dev_get_by.. fail.
> > >
> > >
> > > Thanks for pushing this API forward!
> >
> > Addressed all other comments - and thanks for giving a lot of helpful
> > suggestions!
> 
> Thanks, BTW I think I forgot to ask for documentation, dumping info
> about the API and context under Documentation/ would be great!

Could you suggest where to add that? Grepping for ndo_ don't give much.
I can add a new synce.rst file if it makes sense.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-06 19:01         ` Machnikowski, Maciej
@ 2021-09-07  1:01           ` Jakub Kicinski
  2021-09-07  8:50             ` Machnikowski, Maciej
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-07  1:01 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek

On Mon, 6 Sep 2021 19:01:54 +0000 Machnikowski, Maciej wrote:
> > > Hmm. Main reason for netlink is that linuxptp already supports it,
> > > and it was suggested by Richard.
> > > Having an NDO would also make it easier to add a SyncE-related
> > > files to the sysfs for easier operation (following the ideas from the ptp
> > > pins subsystem).
> > > But I'm open for suggestions.  
> > 
> > I think linuxptp will need support for ethtool netlink sockets sooner
> > rather than later. Moving this to ethtool makes sense to me since it's
> > very much a Ethernet-oriented API at this point.  
> 
> Ethtool also makes a lot of sense, but will it be possible to still make sysfs,
> and it makes sense to add it for some deployments (more on that below)

It should not make much difference whether ndo or ethtool op is used.

> > > This bit has a different meaning. If it's set the port in question
> > > is a frequency source for the multiport device, if it's cleared - some other
> > > source is used as a source. This is needed to prevent setting invalid
> > > configurations in the PHY (like setting the frequency source as a Master
> > > in AN) or sending invalid messages. If the port is a frequency source
> > > it must always send back QL-DNU messages to prevent synchronization
> > > loops.  
> > 
> > Ah! I see. Is being the "source" negotiated somehow? Don't we need to
> > give the user / linuxptp to select the source based on whatever info
> > it has about topology?  
> 
> The frequency source can be either pre-set statically, negotiated using
> ESMC QL-levels (if working in QL-Enabled mode), or follow automatic
> fallback inside the device. This  flag gives feedback about the validity
> of recovered clock coming from a given port and is useful when you
> enable multiple recovered clocks on more than one port in
> active-passive model. In that case the "driving" port may change 
> dynamically, so it's a good idea to have some interface to reflect that.

The ESMC messages are handled by Linux or some form of firmware?
I don't see how you can implement any selection policy with a read-only
API.

In general it would be more natural to place a "source id" at the
DPLL/clock, the "source" flag seems to mark the wrong end of the
relationship. If there ever are multiple consumers we won't be able 
to tell which "target" the "source" is referring to. Hard to judge 
how much of a problem that could be by looking at a small slice of 
the system.

> That's where sysfs file be useful. When I add the implementation for
> recovered clock configuration, the sysfs may be used as standalone 
> interface for configuring them when no dynamic change is needed.

I didn't get that. Do you mean using a sysfs file to configure 
the parameters of the DPLL? 

If the DPLL has its own set of concerns we should go ahead and create
explicit object / configuration channel for it.

Somehow I got it into my head that you care mostly about transmitting
the clock, IOW recovering it from one port and using on another but
that's probably not even a strong use case for you or NICs in general :S

> > > Addressed all other comments - and thanks for giving a lot of helpful
> > > suggestions!  
> > 
> > Thanks, BTW I think I forgot to ask for documentation, dumping info
> > about the API and context under Documentation/ would be great!  
> 
> Could you suggest where to add that? Grepping for ndo_ don't give much.
> I can add a new synce.rst file if it makes sense.

New networking/synce.rst file makes perfect sense to me. And perhaps
link to it from driver-api/ptp.rst.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-07  1:01           ` Jakub Kicinski
@ 2021-09-07  8:50             ` Machnikowski, Maciej
  2021-09-07 14:55               ` Jakub Kicinski
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-07  8:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek

> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, September 7, 2021 3:01 AM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Mon, 6 Sep 2021 19:01:54 +0000 Machnikowski, Maciej wrote:
> > > > Hmm. Main reason for netlink is that linuxptp already supports it,
> > > > and it was suggested by Richard.
> > > > Having an NDO would also make it easier to add a SyncE-related
> > > > files to the sysfs for easier operation (following the ideas from the ptp
> > > > pins subsystem).
> > > > But I'm open for suggestions.
> > >
> > > I think linuxptp will need support for ethtool netlink sockets sooner
> > > rather than later. Moving this to ethtool makes sense to me since it's
> > > very much a Ethernet-oriented API at this point.
> >
> > Ethtool also makes a lot of sense, but will it be possible to still make sysfs,
> > and it makes sense to add it for some deployments (more on that below)
> 
> It should not make much difference whether ndo or ethtool op is used.

Will look into it then.

> 
> > > > This bit has a different meaning. If it's set the port in question
> > > > is a frequency source for the multiport device, if it's cleared - some
> other
> > > > source is used as a source. This is needed to prevent setting invalid
> > > > configurations in the PHY (like setting the frequency source as a Master
> > > > in AN) or sending invalid messages. If the port is a frequency source
> > > > it must always send back QL-DNU messages to prevent synchronization
> > > > loops.
> > >
> > > Ah! I see. Is being the "source" negotiated somehow? Don't we need to
> > > give the user / linuxptp to select the source based on whatever info
> > > it has about topology?
> >
> > The frequency source can be either pre-set statically, negotiated using
> > ESMC QL-levels (if working in QL-Enabled mode), or follow automatic
> > fallback inside the device. This  flag gives feedback about the validity
> > of recovered clock coming from a given port and is useful when you
> > enable multiple recovered clocks on more than one port in
> > active-passive model. In that case the "driving" port may change
> > dynamically, so it's a good idea to have some interface to reflect that.
> 
> The ESMC messages are handled by Linux or some form of firmware?
> I don't see how you can implement any selection policy with a read-only
> API.

It can be either in FW or in Linux - depending on the deployment.
We try to define the API that would enable Linux to manage that.
EEC state will be read-only, but the recovered clock management part
will allow changes for QL-disabled SyncE deployments that only need
to see if the clock they receive on a given port is valid or not.

> In general it would be more natural to place a "source id" at the
> DPLL/clock, the "source" flag seems to mark the wrong end of the
> relationship. If there ever are multiple consumers we won't be able
> to tell which "target" the "source" is referring to. Hard to judge
> how much of a problem that could be by looking at a small slice of
> the system.

The DPLL will operate on pins, so it will have a pin connected from the
MAC/PHY that will have the recovered clock, but the recovered clock
can be enabled from any port/lane. That information is kept in the 
MAC/PHY and the DPLL side will not be aware who it belongs to.

We can come up with a better name,  but think of it like:
You have multiport device (switch/NIC). One port is recovering
the clock, the PHY/MAC outputs that clock through the pin
to the EEC (DPLL). The DPLL knows if it locked to the signal coming
from the multiport PHY/MAC, but it doesn't know which port is the one
that generates that clock signal. All other ports can also present the
"locked" state, but they are following the clock that was received
in the chosen port. If we drop this flag we won't be able to easily tell
which port/lane drives the recovered clock.
In short: the port with that flag on is following the network clock
and leading clock of other ports of the multiport device.

In the most basic SyncE deployment you can put the passive DPLL that
will only give you the lock/holdover/unlocked info and just use this flag 
to know who currently drives the DPLL.

> > That's where sysfs file be useful. When I add the implementation for
> > recovered clock configuration, the sysfs may be used as standalone
> > interface for configuring them when no dynamic change is needed.
> 
> I didn't get that. Do you mean using a sysfs file to configure
> the parameters of the DPLL?

Only the PHY/MAC side of thing which is recovered clock configuration
and the ECC state.
 
> If the DPLL has its own set of concerns we should go ahead and create
> explicit object / configuration channel for it.
> 
> Somehow I got it into my head that you care mostly about transmitting
> the clock, IOW recovering it from one port and using on another but
> that's probably not even a strong use case for you or NICs in general :S

This is the right thinking. The DPLL can also have different external sources,
like the GNSS, and can also drive different output clocks. But for the most
basic SyncE implementation, which only runs on a recovered clock, we won't
need the DPLL subsystem.

> > > > Addressed all other comments - and thanks for giving a lot of helpful
> > > > suggestions!
> > >
> > > Thanks, BTW I think I forgot to ask for documentation, dumping info
> > > about the API and context under Documentation/ would be great!
> >
> > Could you suggest where to add that? Grepping for ndo_ don't give much.
> > I can add a new synce.rst file if it makes sense.
> 
> New networking/synce.rst file makes perfect sense to me. And perhaps
> link to it from driver-api/ptp.rst.

OK will try to come up with something there

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-07  8:50             ` Machnikowski, Maciej
@ 2021-09-07 14:55               ` Jakub Kicinski
  2021-09-07 15:47                 ` Machnikowski, Maciej
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-07 14:55 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan

On Tue, 7 Sep 2021 08:50:55 +0000 Machnikowski, Maciej wrote:
> > > The frequency source can be either pre-set statically, negotiated using
> > > ESMC QL-levels (if working in QL-Enabled mode), or follow automatic
> > > fallback inside the device. This  flag gives feedback about the validity
> > > of recovered clock coming from a given port and is useful when you
> > > enable multiple recovered clocks on more than one port in
> > > active-passive model. In that case the "driving" port may change
> > > dynamically, so it's a good idea to have some interface to reflect that.  
> > 
> > The ESMC messages are handled by Linux or some form of firmware?
> > I don't see how you can implement any selection policy with a read-only
> > API.  
> 
> It can be either in FW or in Linux - depending on the deployment.
> We try to define the API that would enable Linux to manage that.

We should implement the API for Linux to manage things from the get go.

> EEC state will be read-only, but the recovered clock management part
> will allow changes for QL-disabled SyncE deployments that only need
> to see if the clock they receive on a given port is valid or not.
> 
> > In general it would be more natural to place a "source id" at the
> > DPLL/clock, the "source" flag seems to mark the wrong end of the
> > relationship. If there ever are multiple consumers we won't be able
> > to tell which "target" the "source" is referring to. Hard to judge
> > how much of a problem that could be by looking at a small slice of
> > the system.  
> 
> The DPLL will operate on pins, so it will have a pin connected from the
> MAC/PHY that will have the recovered clock, but the recovered clock
> can be enabled from any port/lane. That information is kept in the 
> MAC/PHY and the DPLL side will not be aware who it belongs to.

So the clock outputs are muxed to a single pin at the Ethernet IP
level, in your design. I wonder if this is the common implementation
and therefore if it's safe to bake that into the API. Input from other
vendors would be great...

Also do I understand correctly that the output of the Ethernet IP 
is just the raw Rx clock once receiver is locked and the DPLL which 
enum if_synce_state refers to is in the time IP, that DPLL could be
driven by GNSS etc?

> We can come up with a better name,  but think of it like:
> You have multiport device (switch/NIC). One port is recovering
> the clock, the PHY/MAC outputs that clock through the pin
> to the EEC (DPLL). The DPLL knows if it locked to the signal coming
> from the multiport PHY/MAC, but it doesn't know which port is the one
> that generates that clock signal. All other ports can also present the
> "locked" state, but they are following the clock that was received
> in the chosen port. If we drop this flag we won't be able to easily tell
> which port/lane drives the recovered clock.
> In short: the port with that flag on is following the network clock
> and leading clock of other ports of the multiport device.
> 
> In the most basic SyncE deployment you can put the passive DPLL that
> will only give you the lock/holdover/unlocked info and just use this flag 
> to know who currently drives the DPLL.
> 
> > > That's where sysfs file be useful. When I add the implementation for
> > > recovered clock configuration, the sysfs may be used as standalone
> > > interface for configuring them when no dynamic change is needed.  
> > 
> > I didn't get that. Do you mean using a sysfs file to configure
> > the parameters of the DPLL?  
> 
> Only the PHY/MAC side of thing which is recovered clock configuration
> and the ECC state.
>  
> > If the DPLL has its own set of concerns we should go ahead and create
> > explicit object / configuration channel for it.
> > 
> > Somehow I got it into my head that you care mostly about transmitting
> > the clock, IOW recovering it from one port and using on another but
> > that's probably not even a strong use case for you or NICs in general :S  
> 
> This is the right thinking. The DPLL can also have different external sources,
> like the GNSS, and can also drive different output clocks. But for the most
> basic SyncE implementation, which only runs on a recovered clock, we won't
> need the DPLL subsystem.

The GNSS pulse would come in over an external pin, tho, right? Your
earlier version of the patchset had GNSS as an enum value, how would
the driver / FW know that a given pin means GNSS?

> > > Could you suggest where to add that? Grepping for ndo_ don't give much.
> > > I can add a new synce.rst file if it makes sense.  
> > 
> > New networking/synce.rst file makes perfect sense to me. And perhaps
> > link to it from driver-api/ptp.rst.  
> 
> OK will try to come up with something there

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-07 14:55               ` Jakub Kicinski
@ 2021-09-07 15:47                 ` Machnikowski, Maciej
  2021-09-07 19:47                   ` Jakub Kicinski
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-07 15:47 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, September 7, 2021 4:55 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Cc: netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org;
> richardcochran@gmail.com; abyagowi@fb.com; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net; linux-
> kselftest@vger.kernel.org; Andrew Lunn <andrew@lunn.ch>; Michal
> Kubecek <mkubecek@suse.cz>; Saeed Mahameed <saeed@kernel.org>;
> Michael Chan <michael.chan@broadcom.com>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Tue, 7 Sep 2021 08:50:55 +0000 Machnikowski, Maciej wrote:
> > > > The frequency source can be either pre-set statically, negotiated using
> > > > ESMC QL-levels (if working in QL-Enabled mode), or follow automatic
> > > > fallback inside the device. This  flag gives feedback about the validity
> > > > of recovered clock coming from a given port and is useful when you
> > > > enable multiple recovered clocks on more than one port in
> > > > active-passive model. In that case the "driving" port may change
> > > > dynamically, so it's a good idea to have some interface to reflect that.
> > >
> > > The ESMC messages are handled by Linux or some form of firmware?
> > > I don't see how you can implement any selection policy with a read-only
> > > API.
> >
> > It can be either in FW or in Linux - depending on the deployment.
> > We try to define the API that would enable Linux to manage that.
> 
> We should implement the API for Linux to manage things from the get go.

Yep! Yet let's go one step at a time. I believe once we have the basics (EEC 
monitoring and recovered clock configuration) we'll be able to implement
a basic functionality - and add bells and whistles later on, as there are more
capabilities that we could support in SW.
 
> > EEC state will be read-only, but the recovered clock management part
> > will allow changes for QL-disabled SyncE deployments that only need
> > to see if the clock they receive on a given port is valid or not.
> >
> > > In general it would be more natural to place a "source id" at the
> > > DPLL/clock, the "source" flag seems to mark the wrong end of the
> > > relationship. If there ever are multiple consumers we won't be able
> > > to tell which "target" the "source" is referring to. Hard to judge
> > > how much of a problem that could be by looking at a small slice of
> > > the system.
> >
> > The DPLL will operate on pins, so it will have a pin connected from the
> > MAC/PHY that will have the recovered clock, but the recovered clock
> > can be enabled from any port/lane. That information is kept in the
> > MAC/PHY and the DPLL side will not be aware who it belongs to.
> 
> So the clock outputs are muxed to a single pin at the Ethernet IP
> level, in your design. I wonder if this is the common implementation
> and therefore if it's safe to bake that into the API. Input from other
> vendors would be great...

I believe this is the state-of-art: here's the Broadcom public one
https://docs.broadcom.com/doc/1211168567832, I believe Marvel
has similar solution. But would also be happy to hear others.

> Also do I understand correctly that the output of the Ethernet IP
> is just the raw Rx clock once receiver is locked and the DPLL which
> enum if_synce_state refers to is in the time IP, that DPLL could be
> driven by GNSS etc?

Ethernet IP/PHY usually outputs a divided clock signal (since it's 
easier to route) derived from the RX clock.
The DPLL connectivity is vendor-specific, as you can use it to connect 
some external signals, but you can as well just care about relying 
the SyncE clock and only allow recovering it and passing along 
the QL info when your EEC is locked. That's why I backed up from
a full DPLL implementation in favor of a more generic EEC clock.
The Time IP is again relative and vendor-specific. If SyncE is deployed 
alongside PTP it will most likely be tightly coupled, but if you only
care about having a frequency source - it's not mandatory and it can be
as well in the PHY IP.

Also I think I will strip the reported states to the bare minimum defined
in the ITU-T G.781 instead of reusing the states that were already defined 
for a specific DPLL.
 
> > We can come up with a better name,  but think of it like:
> > You have multiport device (switch/NIC). One port is recovering
> > the clock, the PHY/MAC outputs that clock through the pin
> > to the EEC (DPLL). The DPLL knows if it locked to the signal coming
> > from the multiport PHY/MAC, but it doesn't know which port is the one
> > that generates that clock signal. All other ports can also present the
> > "locked" state, but they are following the clock that was received
> > in the chosen port. If we drop this flag we won't be able to easily tell
> > which port/lane drives the recovered clock.
> > In short: the port with that flag on is following the network clock
> > and leading clock of other ports of the multiport device.
> >
> > In the most basic SyncE deployment you can put the passive DPLL that
> > will only give you the lock/holdover/unlocked info and just use this flag
> > to know who currently drives the DPLL.
> >
> > > > That's where sysfs file be useful. When I add the implementation for
> > > > recovered clock configuration, the sysfs may be used as standalone
> > > > interface for configuring them when no dynamic change is needed.
> > >
> > > I didn't get that. Do you mean using a sysfs file to configure
> > > the parameters of the DPLL?
> >
> > Only the PHY/MAC side of thing which is recovered clock configuration
> > and the ECC state.
> >
> > > If the DPLL has its own set of concerns we should go ahead and create
> > > explicit object / configuration channel for it.
> > >
> > > Somehow I got it into my head that you care mostly about transmitting
> > > the clock, IOW recovering it from one port and using on another but
> > > that's probably not even a strong use case for you or NICs in general :S
> >
> > This is the right thinking. The DPLL can also have different external sources,
> > like the GNSS, and can also drive different output clocks. But for the most
> > basic SyncE implementation, which only runs on a recovered clock, we
> won't
> > need the DPLL subsystem.
> 
> The GNSS pulse would come in over an external pin, tho, right? Your
> earlier version of the patchset had GNSS as an enum value, how would
> the driver / FW know that a given pin means GNSS?

The GNSS 1PPS will more likely go directly to the "full" DPLL. 
The pin topology can be derived from FW or any vendor-specific way of mapping
pins to their sources. And, in "worst" case can just be hardcoded for a specific
device.
 
> > > > Could you suggest where to add that? Grepping for ndo_ don't give
> much.
> > > > I can add a new synce.rst file if it makes sense.
> > >
> > > New networking/synce.rst file makes perfect sense to me. And perhaps
> > > link to it from driver-api/ptp.rst.
> >
> > OK will try to come up with something there

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-07 15:47                 ` Machnikowski, Maciej
@ 2021-09-07 19:47                   ` Jakub Kicinski
  2021-09-08  8:03                     ` Machnikowski, Maciej
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-07 19:47 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan

On Tue, 7 Sep 2021 15:47:05 +0000 Machnikowski, Maciej wrote:
> > > It can be either in FW or in Linux - depending on the deployment.
> > > We try to define the API that would enable Linux to manage that.  
> > 
> > We should implement the API for Linux to manage things from the get go.  
> 
> Yep! Yet let's go one step at a time. I believe once we have the basics (EEC 
> monitoring and recovered clock configuration) we'll be able to implement
> a basic functionality - and add bells and whistles later on, as there are more
> capabilities that we could support in SW.

The set API may shape how the get API looks. We need a minimal viable
API where the whole control part of it is not "firmware or proprietary
tools take care of that".

Do you have public docs on how the whole solution works?

> > > The DPLL will operate on pins, so it will have a pin connected from the
> > > MAC/PHY that will have the recovered clock, but the recovered clock
> > > can be enabled from any port/lane. That information is kept in the
> > > MAC/PHY and the DPLL side will not be aware who it belongs to.  
> > 
> > So the clock outputs are muxed to a single pin at the Ethernet IP
> > level, in your design. I wonder if this is the common implementation
> > and therefore if it's safe to bake that into the API. Input from other
> > vendors would be great...  
> 
> I believe this is the state-of-art: here's the Broadcom public one
> https://docs.broadcom.com/doc/1211168567832, I believe Marvel
> has similar solution. But would also be happy to hear others.

Interesting. That reveals the need for also marking the backup
(/secondary) clock.

Have you seen any docs on how systems with discreet PHY ASICs mux 
the clocks?

> > Also do I understand correctly that the output of the Ethernet IP
> > is just the raw Rx clock once receiver is locked and the DPLL which
> > enum if_synce_state refers to is in the time IP, that DPLL could be
> > driven by GNSS etc?  
> 
> Ethernet IP/PHY usually outputs a divided clock signal (since it's 
> easier to route) derived from the RX clock.
> The DPLL connectivity is vendor-specific, as you can use it to connect 
> some external signals, but you can as well just care about relying 
> the SyncE clock and only allow recovering it and passing along 
> the QL info when your EEC is locked. That's why I backed up from
> a full DPLL implementation in favor of a more generic EEC clock.

What is an ECC clock? To me the PLL state in the Ethernet port is the
state of the recovered clock. enum if_eec_state has values like
holdover which seem to be more applicable to the "system wide" PLL.

Let me ask this - if one port is training the link and the other one has
the lock and is the source - what state will be reported for each port?

> The Time IP is again relative and vendor-specific. If SyncE is deployed 
> alongside PTP it will most likely be tightly coupled, but if you only
> care about having a frequency source - it's not mandatory and it can be
> as well in the PHY IP.

I would not think having just the freq is very useful.

> Also I think I will strip the reported states to the bare minimum defined
> in the ITU-T G.781 instead of reusing the states that were already defined 
> for a specific DPLL.
> 
> > > This is the right thinking. The DPLL can also have different external sources,
> > > like the GNSS, and can also drive different output clocks. But for the most
> > > basic SyncE implementation, which only runs on a recovered clock, we won't  
> > > need the DPLL subsystem.  
> > 
> > The GNSS pulse would come in over an external pin, tho, right? Your
> > earlier version of the patchset had GNSS as an enum value, how would
> > the driver / FW know that a given pin means GNSS?  
> 
> The GNSS 1PPS will more likely go directly to the "full" DPLL. 
> The pin topology can be derived from FW or any vendor-specific way of mapping
> pins to their sources. And, in "worst" case can just be hardcoded for a specific
> device.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-07 19:47                   ` Jakub Kicinski
@ 2021-09-08  8:03                     ` Machnikowski, Maciej
  2021-09-08 16:21                       ` Jakub Kicinski
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-08  8:03 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, September 7, 2021 9:48 PM
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Tue, 7 Sep 2021 15:47:05 +0000 Machnikowski, Maciej wrote:
> > > > It can be either in FW or in Linux - depending on the deployment.
> > > > We try to define the API that would enable Linux to manage that.
> > >
> > > We should implement the API for Linux to manage things from the get
> go.
> >
> > Yep! Yet let's go one step at a time. I believe once we have the basics (EEC
> > monitoring and recovered clock configuration) we'll be able to implement
> > a basic functionality - and add bells and whistles later on, as there are more
> > capabilities that we could support in SW.
> 
> The set API may shape how the get API looks. We need a minimal viable
> API where the whole control part of it is not "firmware or proprietary
> tools take care of that".
> 
> Do you have public docs on how the whole solution works?

The best reference would be my netdev 0x15 tutorial:
https://netdevconf.info/0x15/session.html?Introduction-to-time-synchronization-over-Ethernet
The SyncE API considerations starts ~54:00, but basically we need API for:
- Controlling the lane to pin mapping for clock recovery
- Check the EEC/DPLL state and see what's the source of reference frequency
(in more advanced deployments)
- control additional input and output pins (GNSS input, external inputs, recovered
  frequency reference)
 
> > > > The DPLL will operate on pins, so it will have a pin connected from the
> > > > MAC/PHY that will have the recovered clock, but the recovered clock
> > > > can be enabled from any port/lane. That information is kept in the
> > > > MAC/PHY and the DPLL side will not be aware who it belongs to.
> > >
> > > So the clock outputs are muxed to a single pin at the Ethernet IP
> > > level, in your design. I wonder if this is the common implementation
> > > and therefore if it's safe to bake that into the API. Input from other
> > > vendors would be great...
> >
> > I believe this is the state-of-art: here's the Broadcom public one
> > https://docs.broadcom.com/doc/1211168567832, I believe Marvel
> > has similar solution. But would also be happy to hear others.
> 
> Interesting. That reveals the need for also marking the backup
> (/secondary) clock.

That's optional, but useful. And here's where we need a feedback
on which port/lane is currently used, as the switch may be automatic

> Have you seen any docs on how systems with discreet PHY ASICs mux
> the clocks?

Yes - unfortunately they are not public :(

 
> > > Also do I understand correctly that the output of the Ethernet IP
> > > is just the raw Rx clock once receiver is locked and the DPLL which
> > > enum if_synce_state refers to is in the time IP, that DPLL could be
> > > driven by GNSS etc?
> >
> > Ethernet IP/PHY usually outputs a divided clock signal (since it's
> > easier to route) derived from the RX clock.
> > The DPLL connectivity is vendor-specific, as you can use it to connect
> > some external signals, but you can as well just care about relying
> > the SyncE clock and only allow recovering it and passing along
> > the QL info when your EEC is locked. That's why I backed up from
> > a full DPLL implementation in favor of a more generic EEC clock.
> 
> What is an ECC clock? To me the PLL state in the Ethernet port is the
> state of the recovered clock. enum if_eec_state has values like
> holdover which seem to be more applicable to the "system wide" PLL.

EEC is Ethernet Equipment Clock. In most cases this will be a DPLL, but that's
not mandatory and I believe it may be different is switches where
you only need to drive all ports TX from a single frequency source. In this
case the DPLL can be embedded in the multiport PHY,
 
> Let me ask this - if one port is training the link and the other one has
> the lock and is the source - what state will be reported for each port?

In this case the port that has the lock source will report the lock and 
the EEC_SRC_PORT flag. The port that trains the link will show the
lock without the flag and once it completes the training sequence it will
use the EEC's frequency to transmit the data so that the next hop is able
to synchronize its EEC to the incoming RX frequency

> > The Time IP is again relative and vendor-specific. If SyncE is deployed
> > alongside PTP it will most likely be tightly coupled, but if you only
> > care about having a frequency source - it's not mandatory and it can be
> > as well in the PHY IP.
> 
> I would not think having just the freq is very useful.

This depends on the deployment. There are couple popular frequencies
Most popular are 2,048 kHz, 10 MHz and 64 kHz. There are many 
deployments that only require frequency sync without the phase
and/or time. I.e. if you deploy frequency division duplex you only need the
frequency reference, and the higher frequency you have - the faster you can
lock to it.
 
> > Also I think I will strip the reported states to the bare minimum defined
> > in the ITU-T G.781 instead of reusing the states that were already defined
> > for a specific DPLL.
> >
> > > > This is the right thinking. The DPLL can also have different external
> sources,
> > > > like the GNSS, and can also drive different output clocks. But for the
> most
> > > > basic SyncE implementation, which only runs on a recovered clock, we
> won't
> > > > need the DPLL subsystem.
> > >
> > > The GNSS pulse would come in over an external pin, tho, right? Your
> > > earlier version of the patchset had GNSS as an enum value, how would
> > > the driver / FW know that a given pin means GNSS?
> >
> > The GNSS 1PPS will more likely go directly to the "full" DPLL.
> > The pin topology can be derived from FW or any vendor-specific way of
> mapping
> > pins to their sources. And, in "worst" case can just be hardcoded for a
> specific
> > device.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08  8:03                     ` Machnikowski, Maciej
@ 2021-09-08 16:21                       ` Jakub Kicinski
  2021-09-08 17:30                         ` Machnikowski, Maciej
                                           ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-08 16:21 UTC (permalink / raw)
  To: Machnikowski, Maciej, Florian Fainelli, Ido Schimmel
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan

On Wed, 8 Sep 2021 08:03:35 +0000 Machnikowski, Maciej wrote:
> > > Yep! Yet let's go one step at a time. I believe once we have the basics (EEC
> > > monitoring and recovered clock configuration) we'll be able to implement
> > > a basic functionality - and add bells and whistles later on, as there are more
> > > capabilities that we could support in SW.  
> > 
> > The set API may shape how the get API looks. We need a minimal viable
> > API where the whole control part of it is not "firmware or proprietary
> > tools take care of that".
> > 
> > Do you have public docs on how the whole solution works?  
> 
> The best reference would be my netdev 0x15 tutorial:
> https://netdevconf.info/0x15/session.html?Introduction-to-time-synchronization-over-Ethernet
> The SyncE API considerations starts ~54:00, but basically we need API for:
> - Controlling the lane to pin mapping for clock recovery
> - Check the EEC/DPLL state and see what's the source of reference frequency
> (in more advanced deployments)
> - control additional input and output pins (GNSS input, external inputs, recovered
>   frequency reference)

Thanks, good presentation! I haven't seen much in terms of system
design, at the level similar to the Broadcom whitepaper you shared.

> > > I believe this is the state-of-art: here's the Broadcom public one
> > > https://docs.broadcom.com/doc/1211168567832, I believe Marvel
> > > has similar solution. But would also be happy to hear others.  
> > 
> > Interesting. That reveals the need for also marking the backup
> > (/secondary) clock.  
> 
> That's optional, but useful. And here's where we need a feedback
> on which port/lane is currently used, as the switch may be automatic

What do you mean by optional? How does the user know if there is
fallback or not? Is it that Intel is intending to support only
devices with up to 2 ports and the second port is always the
backup (apologies for speculating).

> > Have you seen any docs on how systems with discreet PHY ASICs mux
> > the clocks?  
> 
> Yes - unfortunately they are not public :(

Mumble, mumble. Ido, Florian - any devices within your purview which
would support SyncE? 

> > > Ethernet IP/PHY usually outputs a divided clock signal (since it's
> > > easier to route) derived from the RX clock.
> > > The DPLL connectivity is vendor-specific, as you can use it to connect
> > > some external signals, but you can as well just care about relying
> > > the SyncE clock and only allow recovering it and passing along
> > > the QL info when your EEC is locked. That's why I backed up from
> > > a full DPLL implementation in favor of a more generic EEC clock.  
> > 
> > What is an ECC clock? To me the PLL state in the Ethernet port is the
> > state of the recovered clock. enum if_eec_state has values like
> > holdover which seem to be more applicable to the "system wide" PLL.  
> 
> EEC is Ethernet Equipment Clock. In most cases this will be a DPLL, but that's
> not mandatory and I believe it may be different is switches where
> you only need to drive all ports TX from a single frequency source. In this
> case the DPLL can be embedded in the multiport PHY,
>  
> > Let me ask this - if one port is training the link and the other one has
> > the lock and is the source - what state will be reported for each port?  
> 
> In this case the port that has the lock source will report the lock and 
> the EEC_SRC_PORT flag. The port that trains the link will show the
> lock without the flag and once it completes the training sequence it will
> use the EEC's frequency to transmit the data so that the next hop is able
> to synchronize its EEC to the incoming RX frequency

Alright, I don't like that. It feels like you're attaching one object's
information (ECC) to other objects (ports), and repeating it. Prof
Goczyła and dr Landowska would not be proud.

> > > The Time IP is again relative and vendor-specific. If SyncE is deployed
> > > alongside PTP it will most likely be tightly coupled, but if you only
> > > care about having a frequency source - it's not mandatory and it can be
> > > as well in the PHY IP.  
> > 
> > I would not think having just the freq is very useful.  
> 
> This depends on the deployment. There are couple popular frequencies
> Most popular are 2,048 kHz, 10 MHz and 64 kHz. There are many 
> deployments that only require frequency sync without the phase
> and/or time. I.e. if you deploy frequency division duplex you only need the
> frequency reference, and the higher frequency you have - the faster you can
> lock to it.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 16:21                       ` Jakub Kicinski
@ 2021-09-08 17:30                         ` Machnikowski, Maciej
  2021-09-08 19:34                           ` Andrew Lunn
  2021-09-08 22:18                           ` Jakub Kicinski
  2021-09-13  8:50                         ` Ido Schimmel
  2021-09-21 13:36                         ` Ido Schimmel
  2 siblings, 2 replies; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-08 17:30 UTC (permalink / raw)
  To: Jakub Kicinski, Florian Fainelli, Ido Schimmel
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan

> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Wednesday, September 8, 2021 6:21 PM
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Wed, 8 Sep 2021 08:03:35 +0000 Machnikowski, Maciej wrote:
> > > > Yep! Yet let's go one step at a time. I believe once we have the basics
> (EEC
> > > > monitoring and recovered clock configuration) we'll be able to
> implement
> > > > a basic functionality - and add bells and whistles later on, as there are
> more
> > > > capabilities that we could support in SW.
> > >
> > > The set API may shape how the get API looks. We need a minimal viable
> > > API where the whole control part of it is not "firmware or proprietary
> > > tools take care of that".
> > >
> > > Do you have public docs on how the whole solution works?
> >
> > The best reference would be my netdev 0x15 tutorial:
> > https://netdevconf.info/0x15/session.html?Introduction-to-time-
> synchronization-over-Ethernet
> > The SyncE API considerations starts ~54:00, but basically we need API for:
> > - Controlling the lane to pin mapping for clock recovery
> > - Check the EEC/DPLL state and see what's the source of reference
> frequency
> > (in more advanced deployments)
> > - control additional input and output pins (GNSS input, external inputs,
> recovered
> >   frequency reference)
> 
> Thanks, good presentation! I haven't seen much in terms of system
> design, at the level similar to the Broadcom whitepaper you shared.

See the "drawing" below.
 
> > > > I believe this is the state-of-art: here's the Broadcom public one
> > > > https://docs.broadcom.com/doc/1211168567832, I believe Marvel
> > > > has similar solution. But would also be happy to hear others.
> > >
> > > Interesting. That reveals the need for also marking the backup
> > > (/secondary) clock.
> >
> > That's optional, but useful. And here's where we need a feedback
> > on which port/lane is currently used, as the switch may be automatic
> 
> What do you mean by optional? How does the user know if there is
> fallback or not? Is it that Intel is intending to support only
> devices with up to 2 ports and the second port is always the
> backup (apologies for speculating).

That will be a part of pin config API. It needs to give info about the number
of supported pins that the PHY/MAC can use as recovered clock outputs.
Once the pin is assigned to the lane the recovered clock (divided or not)
will appear on the configured PHY/MAC pin and EEC will be able to
use it. If there is more than 1 available - they will have some priority
assigned to them that will be known to the EEC/board designer.

And we plan to support devices that only comes with 1 recovered clock
output as well.
 
> > > Have you seen any docs on how systems with discreet PHY ASICs mux
> > > the clocks?
> >
> > Yes - unfortunately they are not public :(
> 
> Mumble, mumble. Ido, Florian - any devices within your purview which
> would support SyncE?

OK Found one that's public: 
https://www.marvell.com/content/dam/marvell/en/public-collateral/transceivers/marvell-phys-transceivers-alaska-c-88x5113-datasheet.pdf
see Fig. 23 and chapter 3.11  for details, but conceptually it's similar.
 
> > > > Ethernet IP/PHY usually outputs a divided clock signal (since it's
> > > > easier to route) derived from the RX clock.
> > > > The DPLL connectivity is vendor-specific, as you can use it to connect
> > > > some external signals, but you can as well just care about relying
> > > > the SyncE clock and only allow recovering it and passing along
> > > > the QL info when your EEC is locked. That's why I backed up from
> > > > a full DPLL implementation in favor of a more generic EEC clock.
> > >
> > > What is an ECC clock? To me the PLL state in the Ethernet port is the
> > > state of the recovered clock. enum if_eec_state has values like
> > > holdover which seem to be more applicable to the "system wide" PLL.
> >
> > EEC is Ethernet Equipment Clock. In most cases this will be a DPLL, but that's
> > not mandatory and I believe it may be different is switches where
> > you only need to drive all ports TX from a single frequency source. In this
> > case the DPLL can be embedded in the multiport PHY,
> >
> > > Let me ask this - if one port is training the link and the other one has
> > > the lock and is the source - what state will be reported for each port?
> >
> > In this case the port that has the lock source will report the lock and
> > the EEC_SRC_PORT flag. The port that trains the link will show the
> > lock without the flag and once it completes the training sequence it will
> > use the EEC's frequency to transmit the data so that the next hop is able
> > to synchronize its EEC to the incoming RX frequency
> 
> Alright, I don't like that. It feels like you're attaching one object's
> information (ECC) to other objects (ports), and repeating it. Prof
> Goczyła and dr Landowska would not be proud.

Hahaha - did not expect them to pop up here :)
It's true, but the problem is that they depend on each other. The path is:

Lane0
------------- |\  Pin0     RefN   ____
------------- | |-----------------|         |      synced clk
...               | |-----------------| EEC  |------------------
------------- |/ PinN     RefM|____ |
Lane N      MUX

To get the full info a port needs to know the EEC state and which lane is used
as a source (or rather - my lane or any other).
The lane -> Pin mapping is buried in the PHY/MAC, but the source of frequency
is in the EEC.
What's even more - the Pin->Ref mapping is board specific.

The viable solutions are:
- Limit to the proposed "I drive the clock" vs "Someone drives it" and assume the
   Driver returns all info
- return the EEC Ref index, figure out which pin is connected to it and then check
  which MAC/PHY lane that drives it.

I assume option one is easy to implement and keep in the future even if we
finally move to option 2 once we define EEC/DPLL subsystem.

In future #1 can take the lock information from the DPLL subsystem, but
will also enable simple deployments that won't expose the whole DPLL, 
like a filter PLL embedded in a multiport PHY that will only work for
SyncE in which case this API will only touch a single component.
 
> > > > The Time IP is again relative and vendor-specific. If SyncE is deployed
> > > > alongside PTP it will most likely be tightly coupled, but if you only
> > > > care about having a frequency source - it's not mandatory and it can be
> > > > as well in the PHY IP.
> > >
> > > I would not think having just the freq is very useful.
> >
> > This depends on the deployment. There are couple popular frequencies
> > Most popular are 2,048 kHz, 10 MHz and 64 kHz. There are many
> > deployments that only require frequency sync without the phase
> > and/or time. I.e. if you deploy frequency division duplex you only need the
> > frequency reference, and the higher frequency you have - the faster you
> can
> > lock to it.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 17:30                         ` Machnikowski, Maciej
@ 2021-09-08 19:34                           ` Andrew Lunn
  2021-09-08 20:27                             ` Machnikowski, Maciej
  2021-09-08 22:20                             ` Jakub Kicinski
  2021-09-08 22:18                           ` Jakub Kicinski
  1 sibling, 2 replies; 42+ messages in thread
From: Andrew Lunn @ 2021-09-08 19:34 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Jakub Kicinski, Florian Fainelli, Ido Schimmel, netdev,
	intel-wired-lan, richardcochran, abyagowi, Nguyen, Anthony L,
	davem, linux-kselftest, Michal Kubecek, Saeed Mahameed,
	Michael Chan

> > > The SyncE API considerations starts ~54:00, but basically we need API for:
> > > - Controlling the lane to pin mapping for clock recovery
> > > - Check the EEC/DPLL state and see what's the source of reference
> > frequency
> > > (in more advanced deployments)
> > > - control additional input and output pins (GNSS input, external inputs,
> > recovered
> > >   frequency reference)

Now that you have pointed to a datasheet...

> - Controlling the lane to pin mapping for clock recovery

So this is a PHY property. That could be Linux driving the PHY, via
phylib, drivers/net/phy, or there could be firmware in the MAC driver
which hides the PHY and gives you some sort of API to access it.

> Check the EEC/DPLL state and see what's the source of reference
> frequency

Where is the EEC/DPLL implemented? Is it typically also in the PHY? Or
some other hardware block?

I just want to make sure we have an API which we can easily delegate
to different subsystems, some of it in the PHY driver, maybe some of
it somewhere else.

Also, looking at the Marvell datasheet, it appears these registers are
in the MDIO_MMD_VEND2 range. Has any of this been specified? Can we
expect to be able to write a generic implementation sometime in the
future which PHY drivers can share?

I just looked at a 1G Marvell PHY. It uses RGMII or SGMII towards the
host. But there is no indication you can take the clock from the SGMII
SERDES, it is only the recovered clock from the line. And the
recovered clock always goes out the CLK125 pin, which can either be
125MHz or 25MHz.

So in this case, you have no need to control the lane to pin mapping,
it is fixed, but do we want to be able to control the divider?

Do we need a mechanism to actually enumerate what the hardware can do?

Since we are talking about clocks and dividers, and multiplexors,
should all this be using the common clock framework, which already
supports most of this? Do we actually need something new?

       Andrew

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 19:34                           ` Andrew Lunn
@ 2021-09-08 20:27                             ` Machnikowski, Maciej
  2021-09-08 22:20                             ` Jakub Kicinski
  1 sibling, 0 replies; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-08 20:27 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jakub Kicinski, Florian Fainelli, Ido Schimmel, netdev,
	intel-wired-lan, richardcochran, abyagowi, Nguyen, Anthony L,
	davem, linux-kselftest, Michal Kubecek, Saeed Mahameed,
	Michael Chan



> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Wednesday, September 8, 2021 9:35 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> > > > The SyncE API considerations starts ~54:00, but basically we need API
> for:
> > > > - Controlling the lane to pin mapping for clock recovery
> > > > - Check the EEC/DPLL state and see what's the source of reference
> > > frequency
> > > > (in more advanced deployments)
> > > > - control additional input and output pins (GNSS input, external inputs,
> > > recovered
> > > >   frequency reference)
> 
> Now that you have pointed to a datasheet...
> 
> > - Controlling the lane to pin mapping for clock recovery
> 
> So this is a PHY property. That could be Linux driving the PHY, via
> phylib, drivers/net/phy, or there could be firmware in the MAC driver
> which hides the PHY and gives you some sort of API to access it.

Yes, that's deployment dependent - in our case we use MAC driver that
proxies that.
 
> > Check the EEC/DPLL state and see what's the source of reference
> > frequency
> 
> Where is the EEC/DPLL implemented? Is it typically also in the PHY? Or
> some other hardware block?

In most cases it will be an external device, but there are implementations 
That have a PLL/DPLL inside the PHY (like a Broadcom example). And
I don't know how switches implements that.
 
> I just want to make sure we have an API which we can easily delegate
> to different subsystems, some of it in the PHY driver, maybe some of
> it somewhere else.

The reasoning behind putting it in the ndo subsystem is because ultimately
the netdev receives the ESMC message and is the entity that should
know how it is connected to the PHY. That's because it will be very
complex to move the mapping between netdevs and PHY lanes/ports
to the SW running in the userspace. I believe the right way is to let the
netdev driver manage that complexity and forward the request to the
PHY subsystem if needed.
The logic is:
- Receive ESMC message on a netdev
- send the message to enable recovered clock to the netdev that received it
- ask the netdev to give the status of EEC that drives it.

> Also, looking at the Marvell datasheet, it appears these registers are
> in the MDIO_MMD_VEND2 range. Has any of this been specified? Can we
> expect to be able to write a generic implementation sometime in the
> future which PHY drivers can share?

I believe it will be vendor-specific, as there are different implementations
of it. It can be an internal PLL that gets all inputs and syncs to a one of them
or it can be a MUX with divder(s)

> I just looked at a 1G Marvell PHY. It uses RGMII or SGMII towards the
> host. But there is no indication you can take the clock from the SGMII
> SERDES, it is only the recovered clock from the line. And the
> recovered clock always goes out the CLK125 pin, which can either be
> 125MHz or 25MHz.
> 
> So in this case, you have no need to control the lane to pin mapping,
> it is fixed, but do we want to be able to control the divider?

That's a complex question. Controlling the divider would make sense
when we have control over the DPLL, and it still needs to be optional,
as it's not always available. I assume that in the initial
implementation we can rely on MAC driver to set up the dividers
to output the expected frequency to the DPLL. On faster interfaces
the RCLK speed is also speed-dependent and differs across the link
speeds.

> Do we need a mechanism to actually enumerate what the hardware can do?

For recovered clocks I think we need to start with the number of
PHY outputs that go towards the DPLL. We can add additional attributes
like a frequency and others once we need them. I want to avoid creating
a swiss-knife with too many options here :)

> Since we are talking about clocks and dividers, and multiplexors,
> should all this be using the common clock framework, which already
> supports most of this? Do we actually need something new?

I believe that's a specific enough case that deserves a separate one
Regards
Maciek

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 17:30                         ` Machnikowski, Maciej
  2021-09-08 19:34                           ` Andrew Lunn
@ 2021-09-08 22:18                           ` Jakub Kicinski
  2021-09-08 23:14                             ` Andrew Lunn
  2021-09-09  8:11                             ` Machnikowski, Maciej
  1 sibling, 2 replies; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-08 22:18 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan

On Wed, 8 Sep 2021 17:30:24 +0000 Machnikowski, Maciej wrote:
> Lane0
> ------------- |\  Pin0        RefN ____
> ------------- | |-----------------|     |      synced clk
>               | |-----------------| EEC |------------------
> ------------- |/ PinN         RefM|____ |
> Lane N      MUX
> 
> To get the full info a port needs to know the EEC state and which lane is used
> as a source (or rather - my lane or any other).

EEC here is what the PHY documentation calls "Cleanup PLL" right?

> The lane -> Pin mapping is buried in the PHY/MAC, but the source of frequency
> is in the EEC.

Not sure what "source of frequency" means here. There's a lot of
frequencies here.

> What's even more - the Pin->Ref mapping is board specific.

Breaking down the system into components we have:

Port
  A.1 Rx lanes
  A.2 Rx pins (outputs)
  A.3 Rx clk divider
  B.1 Tx lanes
  B.2 Tx pins (inputs)

ECC
  C.1 Inputs
  C.2 Outputs
  C.3 PLL state

In the most general case we want to be able to:
 map recovered clocks to PHY output pins (A.1 <> A.2)
 set freq div on the recovered clock (A.2 <> A.3)
 set the priorities of inputs on ECC (C.1)
 read the ECC state (C.3)
 control outputs of the ECC (C.2)
 select the clock source for port Tx (B.2 <> B.1)

As you said, pin -> ref mapping is board specific, so the API should
not assume knowledge of routing between Port and ECC. If it does just
give the pins matching names.

We don't have to implement the entire design but the pieces we do create
must be right for the larger context. With the current code the
ECC/Cleanup PLL is not represented as a separate entity, and mapping of
what source means is on the wrong "end" of the A.3 <> C.1 relationship.

> The viable solutions are:
> - Limit to the proposed "I drive the clock" vs "Someone drives it" and assume the
>    Driver returns all info
> - return the EEC Ref index, figure out which pin is connected to it and then check
>   which MAC/PHY lane that drives it.
> 
> I assume option one is easy to implement and keep in the future even if we
> finally move to option 2 once we define EEC/DPLL subsystem.
> 
> In future #1 can take the lock information from the DPLL subsystem, but
> will also enable simple deployments that won't expose the whole DPLL, 
> like a filter PLL embedded in a multiport PHY that will only work for
> SyncE in which case this API will only touch a single component.

Imagine a system with two cascaded switch ASICs and a bunch of PHYs.
How do you express that by pure extensions to the proposed API?
Here either the cleanup PLLs would be cascaded (subordinate one needs
to express that its "source" is another PLL) or single lane can be
designated as a source for both PLLs (but then there is only one
"source" bit and multiple "enum if_eec_state"s).

I think we can't avoid having a separate object for ECC/Cleanup PLL. 
You can add it as a subobject to devlink but new genetlink family seems
much preferable given the devlink instances themselves have unclear
semantics at this point. Or you can try to convince Richard that ECC
belongs as part of PTP :)

In fact I don't think you care about any of the PHY / port stuff
currently. All you need is the ECC side of the API. IIUC you have
relatively simple setup where there is only one pin per port, and
you don't care about syncing the Tx clock.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 19:34                           ` Andrew Lunn
  2021-09-08 20:27                             ` Machnikowski, Maciej
@ 2021-09-08 22:20                             ` Jakub Kicinski
  2021-09-08 22:59                               ` Andrew Lunn
  1 sibling, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-08 22:20 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Machnikowski, Maciej, Florian Fainelli, Ido Schimmel, netdev,
	intel-wired-lan, richardcochran, abyagowi, Nguyen, Anthony L,
	davem, linux-kselftest, Michal Kubecek, Saeed Mahameed,
	Michael Chan

On Wed, 8 Sep 2021 21:34:37 +0200 Andrew Lunn wrote:
> Since we are talking about clocks and dividers, and multiplexors,
> should all this be using the common clock framework, which already
> supports most of this? Do we actually need something new?

Does the common clock framework expose any user space API?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 22:20                             ` Jakub Kicinski
@ 2021-09-08 22:59                               ` Andrew Lunn
  2021-09-09  2:09                                 ` Richard Cochran
  0 siblings, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2021-09-08 22:59 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Machnikowski, Maciej, Florian Fainelli, Ido Schimmel, netdev,
	intel-wired-lan, richardcochran, abyagowi, Nguyen, Anthony L,
	davem, linux-kselftest, Michal Kubecek, Saeed Mahameed,
	Michael Chan

On Wed, Sep 08, 2021 at 03:20:27PM -0700, Jakub Kicinski wrote:
> On Wed, 8 Sep 2021 21:34:37 +0200 Andrew Lunn wrote:
> > Since we are talking about clocks and dividers, and multiplexors,
> > should all this be using the common clock framework, which already
> > supports most of this? Do we actually need something new?
> 
> Does the common clock framework expose any user space API?

Ah, good point. No, i don't think it does, apart from debugfs, which
is not really a user space API, and it contains read only descriptions
of the clock tree, current status, mux settings, dividers etc.

So probably anybody implemented the proposed API the Linux way will
use the common clock framework and its internal API, and debug the
implementation via debugfs.

PHY drivers already do use it, e.g. when the PHY is using a clock
provided by the MAC, it uses the API to enable/disable and set ifs
frequency as needed.

    Andrew

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 22:18                           ` Jakub Kicinski
@ 2021-09-08 23:14                             ` Andrew Lunn
  2021-09-08 23:58                               ` Jakub Kicinski
  2021-09-09  8:11                             ` Machnikowski, Maciej
  1 sibling, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2021-09-08 23:14 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Machnikowski, Maciej, netdev, intel-wired-lan, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, linux-kselftest,
	Michal Kubecek, Saeed Mahameed, Michael Chan

> As you said, pin -> ref mapping is board specific, so the API should
> not assume knowledge of routing between Port and ECC.

That information will probably end up in device tree. And X different
implementations of ACPI, unless somebody puts there foot down and
stops the snow flakes.

> Imagine a system with two cascaded switch ASICs and a bunch of PHYs.
> How do you express that by pure extensions to the proposed API?

Device tree is good at that. ACPI might eventually catch up.

How complex a setup do we actually expect? Can there be multiple
disjoint SyncE trees within an Ethernet switch cluster? Or would it be
reasonable to say all you need to configure is the clock source, and
all other ports of the switches are slaves if SyncE is enabled for the
port? I've never see any SOHO switch hardware which allows you to have
disjoint PTP trees, so it does not sound too unreasonable to only
allow a single SyncE tree per switch cluster.

Also, if you are cascading switches, you generally don't put PHYs in
the middle, you just connect the SERDES lanes together.

	 Andrew

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 23:14                             ` Andrew Lunn
@ 2021-09-08 23:58                               ` Jakub Kicinski
  2021-09-09  8:26                                 ` Machnikowski, Maciej
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-08 23:58 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Machnikowski, Maciej, netdev, intel-wired-lan, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, linux-kselftest,
	Michal Kubecek, Saeed Mahameed, Michael Chan

On Thu, 9 Sep 2021 01:14:36 +0200 Andrew Lunn wrote:
> > As you said, pin -> ref mapping is board specific, so the API should
> > not assume knowledge of routing between Port and ECC.  
> 
> That information will probably end up in device tree. And X different
> implementations of ACPI, unless somebody puts there foot down and
> stops the snow flakes.
> 
> > Imagine a system with two cascaded switch ASICs and a bunch of PHYs.
> > How do you express that by pure extensions to the proposed API?  
> 
> Device tree is good at that. ACPI might eventually catch up.

I could well be wrong but some of those connectors could well be just
SMAs on the back plate, no? User could cable those up to their heart
content.

https://engineering.fb.com/2021/08/11/open-source/time-appliance/

> How complex a setup do we actually expect? Can there be multiple
> disjoint SyncE trees within an Ethernet switch cluster? Or would it be
> reasonable to say all you need to configure is the clock source, and
> all other ports of the switches are slaves if SyncE is enabled for the
> port? I've never see any SOHO switch hardware which allows you to have
> disjoint PTP trees, so it does not sound too unreasonable to only
> allow a single SyncE tree per switch cluster.

Not sure. I get the (perhaps unfounded) feeling that just forwarding
the clock from one port to the rest is simpler. Maciej cares primarily
about exposing the clock to other non-Ethernet things, the device would
be an endpoint here, using the clock for LTE or whatnot.

> Also, if you are cascading switches, you generally don't put PHYs in
> the middle, you just connect the SERDES lanes together.

My concern was a case where PHY connected to one switch exposes the
refclock on an aux pin which is then muxed to more than one switch ASIC.
IOW the "source port" is not actually under the same switch. 

Again, IDK if that's realistic.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 22:59                               ` Andrew Lunn
@ 2021-09-09  2:09                                 ` Richard Cochran
  2021-09-09  8:18                                   ` Machnikowski, Maciej
  0 siblings, 1 reply; 42+ messages in thread
From: Richard Cochran @ 2021-09-09  2:09 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jakub Kicinski, Machnikowski, Maciej, Florian Fainelli,
	Ido Schimmel, netdev, intel-wired-lan, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Michal Kubecek,
	Saeed Mahameed, Michael Chan

On Thu, Sep 09, 2021 at 12:59:27AM +0200, Andrew Lunn wrote:
> On Wed, Sep 08, 2021 at 03:20:27PM -0700, Jakub Kicinski wrote:
> > On Wed, 8 Sep 2021 21:34:37 +0200 Andrew Lunn wrote:
> > > Since we are talking about clocks and dividers, and multiplexors,
> > > should all this be using the common clock framework, which already
> > > supports most of this? Do we actually need something new?
> > 
> > Does the common clock framework expose any user space API?
> 
> Ah, good point. No, i don't think it does, apart from debugfs, which
> is not really a user space API, and it contains read only descriptions
> of the clock tree, current status, mux settings, dividers etc.

Wouldn't it make sense to develop some kind of user land API to
manipulate the common clock framework at run time?

Just dreaming...

Richard

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 22:18                           ` Jakub Kicinski
  2021-09-08 23:14                             ` Andrew Lunn
@ 2021-09-09  8:11                             ` Machnikowski, Maciej
  1 sibling, 0 replies; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-09  8:11 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Andrew Lunn, Michal Kubecek,
	Saeed Mahameed, Michael Chan

> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Thursday, September 9, 2021 12:19 AM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Wed, 8 Sep 2021 17:30:24 +0000 Machnikowski, Maciej wrote:
> > Lane0
> > ------------- |\  Pin0        RefN ____
> > ------------- | |-----------------|     |      synced clk
> >               | |-----------------| EEC |------------------
> > ------------- |/ PinN         RefM|____ |
> > Lane N      MUX
> >
> > To get the full info a port needs to know the EEC state and which lane is
> used
> > as a source (or rather - my lane or any other).
> 
> EEC here is what the PHY documentation calls "Cleanup PLL" right?

It's a generic term for an internal or external PLL that takes the reference
frequency from the certain port/lane and drives its TX part. In a specific 
case it can be the internal cleanup PLL.

> > The lane -> Pin mapping is buried in the PHY/MAC, but the source of
> frequency
> > is in the EEC.
> 
> Not sure what "source of frequency" means here. There's a lot of
> frequencies here.

It's the source lane/port that drives the DPLL reference. In other words,
the DPLL will tune its frequency to match the one recovered from 
a certain PHY lane/port.

> > What's even more - the Pin->Ref mapping is board specific.
> 
> Breaking down the system into components we have:
> 
> Port
>   A.1 Rx lanes
>   A.2 Rx pins (outputs)
>   A.3 Rx clk divider
>   B.1 Tx lanes
>   B.2 Tx pins (inputs)
> 
> ECC
>   C.1 Inputs
>   C.2 Outputs
>   C.3 PLL state
> 
> In the most general case we want to be able to:
>  map recovered clocks to PHY output pins (A.1 <> A.2)
>  set freq div on the recovered clock (A.2 <> A.3)

Technically the pin (if exists) will be the last thing, so a better way would 
be to map lane to the divider (A.1<>A.3) and then a divider to pin (A.3<>A.2), 
but the idea is the same

>  set the priorities of inputs on ECC (C.1)
>  read the ECC state (C.3)
>  control outputs of the ECC (C.2)

Yes!

>  select the clock source for port Tx (B.2 <> B.1)

And here we usually don't allow control over this. The DPLL is preconfigured to 
output the frequency that's expected on the PHY/MAC clock input and we
shouldn't mess with it in runtime.

> As you said, pin -> ref mapping is board specific, so the API should
> not assume knowledge of routing between Port and ECC. If it does just
> give the pins matching names.

I believe referring to a board user guide is enough to cover that one, just like
with PTP pins. There may be 1000 different ways of connecting those signals.

> We don't have to implement the entire design but the pieces we do create
> must be right for the larger context. With the current code the
> ECC/Cleanup PLL is not represented as a separate entity, and mapping of
> what source means is on the wrong "end" of the A.3 <> C.1 relationship.

That's why I initially started with the EEC state + pin idx of the driving source.
I believe it's a cleaner solution, as then we can implement the same pin
indexes in the recovered clock part of the interface and the user will be
able to see the state and driving pin from the EEC_STATE (both belongs to
the DPLL) and use the PHY pins subsystem to see if the driving pin index
matches the index that I drive. In that case we keep all C-thingies in the C
subsystem and A stuff in A subsystem. The only "mix" is in the pin indexes
that would use numbering from C.
If it's an attribute - it can as well be optional for the deployments that
don't need/support it.

> > The viable solutions are:
> > - Limit to the proposed "I drive the clock" vs "Someone drives it" and
> assume the
> >    Driver returns all info
> > - return the EEC Ref index, figure out which pin is connected to it and then
> check
> >   which MAC/PHY lane that drives it.
> >
> > I assume option one is easy to implement and keep in the future even if
> we
> > finally move to option 2 once we define EEC/DPLL subsystem.
> >
> > In future #1 can take the lock information from the DPLL subsystem, but
> > will also enable simple deployments that won't expose the whole DPLL,
> > like a filter PLL embedded in a multiport PHY that will only work for
> > SyncE in which case this API will only touch a single component.
> 
> Imagine a system with two cascaded switch ASICs and a bunch of PHYs.
> How do you express that by pure extensions to the proposed API?
> Here either the cleanup PLLs would be cascaded (subordinate one needs
> to express that its "source" is another PLL) or single lane can be
> designated as a source for both PLLs (but then there is only one
> "source" bit and multiple "enum if_eec_state"s).

In that case - once we have pins- we'll see that the leader DPLL is synced
to the pin that comes from the PHY/MAC and be able to find the corresponding
lane, and on the follower side we'll see that it's locked to the pin that
corresponds to the master DPLL. The logic to "do something" with
that knowledge needs to be in the userspace app, as there may be
some external connections needed that are unknown at the board level
(think of 2 separate adapters connected with an external cable).

> I think we can't avoid having a separate object for ECC/Cleanup PLL.
> You can add it as a subobject to devlink but new genetlink family seems
> much preferable given the devlink instances themselves have unclear
> semantics at this point. Or you can try to convince Richard that ECC
> belongs as part of PTP :)

> In fact I don't think you care about any of the PHY / port stuff
> currently. All you need is the ECC side of the API. IIUC you have
> relatively simple setup where there is only one pin per port, and
> you don't care about syncing the Tx clock.

I actually do. There's (almost) always less recovered clock resources
(aka pins) than ports/lanes in the system. The TX clock will be
synchronized once the EEC reports the lock state, as it's the part that
generates clocks for the TX part of the PHY.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-09  2:09                                 ` Richard Cochran
@ 2021-09-09  8:18                                   ` Machnikowski, Maciej
  2021-09-10 14:14                                     ` Richard Cochran
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-09  8:18 UTC (permalink / raw)
  To: Richard Cochran, Andrew Lunn
  Cc: Jakub Kicinski, Florian Fainelli, Ido Schimmel, netdev,
	intel-wired-lan, abyagowi, Nguyen, Anthony L, davem,
	linux-kselftest, Michal Kubecek, Saeed Mahameed, Michael Chan

> -----Original Message-----
> From: Richard Cochran <richardcochran@gmail.com>
> Sent: Thursday, September 9, 2021 4:09 AM
> To: Andrew Lunn <andrew@lunn.ch>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Thu, Sep 09, 2021 at 12:59:27AM +0200, Andrew Lunn wrote:
> > On Wed, Sep 08, 2021 at 03:20:27PM -0700, Jakub Kicinski wrote:
> > > On Wed, 8 Sep 2021 21:34:37 +0200 Andrew Lunn wrote:
> > > > Since we are talking about clocks and dividers, and multiplexors,
> > > > should all this be using the common clock framework, which already
> > > > supports most of this? Do we actually need something new?
> > >
> > > Does the common clock framework expose any user space API?
> >
> > Ah, good point. No, i don't think it does, apart from debugfs, which
> > is not really a user space API, and it contains read only descriptions
> > of the clock tree, current status, mux settings, dividers etc.
> 
> Wouldn't it make sense to develop some kind of user land API to
> manipulate the common clock framework at run time?
> 
> Just dreaming...
> 
> Richard

That may be dangerous. In SyncE world we control clocks that are only
references to the EEC block. The worst that can happen is that the DPLL
will ignore incoming clock as it has invalid frequency, or it will drift off to 
the edge of allowed range.
Controlling the clock that actually drives any components (PHY/MAC) in
runtime can be a good way to brick the part. I feel that, while the reuse 
of structures may be a good idea, the userspace API for clocks is not. 
They are usually set up once at the board init level and stay like that "forever".

The outputs we need to control are only a subset of all of them and they
rather fall in the PTP pins level of details, rather than clock ones.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 23:58                               ` Jakub Kicinski
@ 2021-09-09  8:26                                 ` Machnikowski, Maciej
  2021-09-09  9:24                                   ` Machnikowski, Maciej
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-09  8:26 UTC (permalink / raw)
  To: Jakub Kicinski, Andrew Lunn
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, linux-kselftest, Michal Kubecek,
	Saeed Mahameed, Michael Chan



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Thursday, September 9, 2021 1:58 AM
> To: Andrew Lunn <andrew@lunn.ch>
> Cc: Machnikowski, Maciej <maciej.machnikowski@intel.com>;
> netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org;
> richardcochran@gmail.com; abyagowi@fb.com; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; davem@davemloft.net; linux-
> kselftest@vger.kernel.org; Michal Kubecek <mkubecek@suse.cz>; Saeed
> Mahameed <saeed@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Thu, 9 Sep 2021 01:14:36 +0200 Andrew Lunn wrote:
> > > As you said, pin -> ref mapping is board specific, so the API should
> > > not assume knowledge of routing between Port and ECC.
> >
> > That information will probably end up in device tree. And X different
> > implementations of ACPI, unless somebody puts there foot down and
> > stops the snow flakes.
> >
> > > Imagine a system with two cascaded switch ASICs and a bunch of PHYs.
> > > How do you express that by pure extensions to the proposed API?
> >
> > Device tree is good at that. ACPI might eventually catch up.
> 
> I could well be wrong but some of those connectors could well be just
> SMAs on the back plate, no? User could cable those up to their heart
> content.
> 
> https://engineering.fb.com/2021/08/11/open-source/time-appliance/

Yep! We should base on the abstract indexes, rather than the device tree.
The user needs to make sense of the indexes, just like with PTP pins.
 
> > How complex a setup do we actually expect? Can there be multiple
> > disjoint SyncE trees within an Ethernet switch cluster? Or would it be
> > reasonable to say all you need to configure is the clock source, and
> > all other ports of the switches are slaves if SyncE is enabled for the
> > port? I've never see any SOHO switch hardware which allows you to have
> > disjoint PTP trees, so it does not sound too unreasonable to only
> > allow a single SyncE tree per switch cluster.
> 
> Not sure. I get the (perhaps unfounded) feeling that just forwarding
> the clock from one port to the rest is simpler. Maciej cares primarily
> about exposing the clock to other non-Ethernet things, the device would
> be an endpoint here, using the clock for LTE or whatnot.

Also multiple PTP domain switches starts appearing on the market.
I know Cisco makes them:
https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus3548/sw/system_mgmt/602/b_N3548_System_Management_Config_602A11/b_N3548_Sysetm_Management_Config_602A11_chapter_011.html
Disjoint SyncE trees will be harder to implement due to the physical nature
of it.

> > Also, if you are cascading switches, you generally don't put PHYs in
> > the middle, you just connect the SERDES lanes together.
> 
> My concern was a case where PHY connected to one switch exposes the
> refclock on an aux pin which is then muxed to more than one switch ASIC.
> IOW the "source port" is not actually under the same switch.
> 
> Again, IDK if that's realistic.

It can be done, but the userspace app should make sense out of this config.
Coding all scenarios in kernel would make 1000000 different possible
configurations in the driver - which probably no one wants to keep/maintain.
We can return info about hardwired pins (like GNSS, PHY recovered clks,
PTP clock and so on) but should also give some generic ones.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-09  8:26                                 ` Machnikowski, Maciej
@ 2021-09-09  9:24                                   ` Machnikowski, Maciej
  2021-09-09 10:15                                     ` David Miller
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-09  9:24 UTC (permalink / raw)
  To: Machnikowski, Maciej, davem
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, linux-kselftest, Michal Kubecek, Saeed Mahameed,
	Michael Chan, Jakub Kicinski, Andrew Lunn

Dave,

Are there any free slots on Plumbers to discuss and close on SyncE interfaces 
(or can we add an extra one). I can reuse the slides from the Netdev to give 
background and a live discussion may help closing opens around it,
and I'd be happy to co-present with anyone who wants to also join this effort.

Regards
Maciek

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-09  9:24                                   ` Machnikowski, Maciej
@ 2021-09-09 10:15                                     ` David Miller
  0 siblings, 0 replies; 42+ messages in thread
From: David Miller @ 2021-09-09 10:15 UTC (permalink / raw)
  To: maciej.machnikowski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi,
	anthony.l.nguyen, linux-kselftest, mkubecek, saeed, michael.chan,
	kuba, andrew

From: "Machnikowski, Maciej" <maciej.machnikowski@intel.com>
Date: Thu, 9 Sep 2021 09:24:07 +0000

> Dave,
> 
> Are there any free slots on Plumbers to discuss and close on SyncE interfaces 
> (or can we add an extra one). I can reuse the slides from the Netdev to give 
> background and a live discussion may help closing opens around it,
> and I'd be happy to co-present with anyone who wants to also join this effort.

Sorry, I think it's much too late for this.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-09  8:18                                   ` Machnikowski, Maciej
@ 2021-09-10 14:14                                     ` Richard Cochran
  0 siblings, 0 replies; 42+ messages in thread
From: Richard Cochran @ 2021-09-10 14:14 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: Andrew Lunn, Jakub Kicinski, Florian Fainelli, Ido Schimmel,
	netdev, intel-wired-lan, abyagowi, Nguyen, Anthony L, davem,
	linux-kselftest, Michal Kubecek, Saeed Mahameed, Michael Chan

On Thu, Sep 09, 2021 at 08:18:10AM +0000, Machnikowski, Maciej wrote:

> Controlling the clock that actually drives any components (PHY/MAC) in
> runtime can be a good way to brick the part.

I didn't say that.

> I feel that, while the reuse 
> of structures may be a good idea, the userspace API for clocks is not. 
> They are usually set up once at the board init level and stay like that "forever".
> 
> The outputs we need to control are only a subset of all of them and they
> rather fall in the PTP pins level of details, rather than clock ones.

clk-gate.c
clk-mux.c

Making that available for user space to twiddle is a better way that
tacking on to the PTP stuff.

You can model your device as having a multiplexer in front of it.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 16:21                       ` Jakub Kicinski
  2021-09-08 17:30                         ` Machnikowski, Maciej
@ 2021-09-13  8:50                         ` Ido Schimmel
  2021-09-21 13:36                         ` Ido Schimmel
  2 siblings, 0 replies; 42+ messages in thread
From: Ido Schimmel @ 2021-09-13  8:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Machnikowski, Maciej, Florian Fainelli, netdev, intel-wired-lan,
	richardcochran, abyagowi, Nguyen, Anthony L, davem,
	linux-kselftest, Andrew Lunn, Michal Kubecek, Saeed Mahameed,
	Michael Chan

On Wed, Sep 08, 2021 at 09:21:15AM -0700, Jakub Kicinski wrote:
> Mumble, mumble. Ido, Florian - any devices within your purview which
> would support SyncE? 

Sorry for the delay, was AFK. I remember SyncE being mentioned a few
times in the past, so it is likely that we will be requested to add
SyncE support in mlxsw in the future. I will try to solicit feedback
from colleagues more familiar with the subject and provide it here next
week.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-03 15:14 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
  2021-09-03 16:18   ` Stephen Hemminger
  2021-09-03 22:14   ` Jakub Kicinski
@ 2021-09-21 13:15   ` Ido Schimmel
  2021-09-21 13:37     ` Machnikowski, Maciej
  2 siblings, 1 reply; 42+ messages in thread
From: Ido Schimmel @ 2021-09-21 13:15 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi,
	anthony.l.nguyen, davem, kuba, linux-kselftest

On Fri, Sep 03, 2021 at 05:14:35PM +0200, Maciej Machnikowski wrote:
> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index eebd3894fe89..78a8a5af8cd8 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -1273,4 +1273,35 @@ enum {
>  
>  #define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
>  
> +/* SyncE section */
> +
> +enum if_eec_state {
> +	IF_EEC_STATE_INVALID = 0,
> +	IF_EEC_STATE_FREERUN,
> +	IF_EEC_STATE_LOCKACQ,
> +	IF_EEC_STATE_LOCKREC,
> +	IF_EEC_STATE_LOCKED,
> +	IF_EEC_STATE_HOLDOVER,
> +	IF_EEC_STATE_OPEN_LOOP,
> +	__IF_EEC_STATE_MAX,

Can you document these states? I'm not clear on what LOCKACQ, LOCKREC
and OPEN_LOOP mean. I also don't see ice using them and it's not really
a good practice to add new uAPI without any current users.

From v1 I understand that these states were copied from commit
797d3186544f ("ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.")
from Renesas.

Figure 11 in the following PDF describes the states, but it seems
specific to the particular device and probably shouldn't be exposed to
user space as-is:
https://www.renesas.com/us/en/document/dst/8a34041-datasheet

I have a few questions about this being a per-netdev attribute:

1. My understanding is that in the multi-port adapter you are working
with you have a single EEC that is used to set the Tx frequency of all
the ports. Is that correct?

2. Assuming the above is correct, is it possible that one port is in
LOCKED state and another (for some reason) is in HOLDOVER state? If yes,
then I agree about this being a per-netdev attribute. The interface can
also be extended with another attribute specifying the HOLDOVER reason.
For example, netdev being down.

Regardless, I agree with previous comments made about this belonging in
ethtool rather than rtnetlink.

> +};
> +
> +#define IF_EEC_STATE_MAX	(__IF_EEC_STATE_MAX - 1)
> +#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the port is
> +					  * currently the source for the EEC
> +					  */

I'm not sure about this one. If we are going to expose this as a
per-netdev attribute (see more below), any reason it can't be added as
another state (e.g., IF_EEC_STATE_LOCKED_SRC)?

IIUC, in the common case of a simple NE the source of the EEC is always
one of the ports, but in the case of a PRC the source is most likely
external (e.g., GPS).

If so, I think we need a way to represent the EEC as a separate object
with the ability to set its source and report it via the same interface.
I'm unclear on how exactly an external source looks like, but for the
netdev case maybe something like this:

devlink clock show [ dev clock CLOCK ]
devlink clock set DEV clock CLOCK [ { src_type SRC_TYPE } ]
SRC_TYPE : = { port DEV/PORT_INDEX }

The only source type above is 'port' with the ability to set the
relevant port, but more can be added. Obviously, 'devlink clock show'
will give you the current source in addition to other information such
as frequency difference with respect to the input frequency.

Finally, can you share more info about the relation to the PHC? My
understanding is that one of the primary use cases for SyncE is to drive
all the PHCs in the network using the same frequency. How do you imagine
this configuration is going to look like? Can the above interface be
extended for that?

All of the above might be complete nonsense as I'm still trying to wrap
my head around the subject.

Thanks for working on this

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state
  2021-09-03 15:14 ` [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state Maciej Machnikowski
  2021-09-03 22:06   ` Jakub Kicinski
@ 2021-09-21 13:25   ` Ido Schimmel
  1 sibling, 0 replies; 42+ messages in thread
From: Ido Schimmel @ 2021-09-21 13:25 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi,
	anthony.l.nguyen, davem, kuba, linux-kselftest

On Fri, Sep 03, 2021 at 05:14:36PM +0200, Maciej Machnikowski wrote:
> Implement SyncE DPLL monitoring for E810-T devices.
> Poll loop will periodically check the state of the DPLL and cache it
> in the pf structure. State changes will be logged in the system log.
> 
> Cached state can be read using the RTM_GETEECSTATE rtnetlink
> message.

This seems sub-optimal. My understanding is that this information is of
importance to the user space process that takes care of the ESMC
protocol. It would probably want to receive a notification when the
state of the DPLL/EEC changes as opposed to polling the kernel or poking
into its log.

So I think that whatever netlink-based interface we agree on to
represent the EEC (devlink or something else), should have the ability
to send a notification when the state changes.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-08 16:21                       ` Jakub Kicinski
  2021-09-08 17:30                         ` Machnikowski, Maciej
  2021-09-13  8:50                         ` Ido Schimmel
@ 2021-09-21 13:36                         ` Ido Schimmel
  2 siblings, 0 replies; 42+ messages in thread
From: Ido Schimmel @ 2021-09-21 13:36 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Machnikowski, Maciej, Florian Fainelli, netdev, intel-wired-lan,
	richardcochran, abyagowi, Nguyen, Anthony L, davem,
	linux-kselftest, Andrew Lunn, Michal Kubecek, Saeed Mahameed,
	Michael Chan

On Wed, Sep 08, 2021 at 09:21:15AM -0700, Jakub Kicinski wrote:
> Mumble, mumble. Ido, Florian - any devices within your purview which
> would support SyncE?

So it's public info that Connect-X has it:
https://www.mellanox.com/files/doc-2020/pb-connectx-6-dx-en-card.pdf

Given the nature of SyncE, I think you can extrapolate from that about
other devices :)

Anyway, Petr and I started discussing this with the colleagues defining
the HW/SW interfaces and we will help with the review to make sure we
end up with a uAPI that works across multiple implementations.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-21 13:15   ` Ido Schimmel
@ 2021-09-21 13:37     ` Machnikowski, Maciej
  2021-09-21 14:58       ` Ido Schimmel
  0 siblings, 1 reply; 42+ messages in thread
From: Machnikowski, Maciej @ 2021-09-21 13:37 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, kuba, linux-kselftest

> -----Original Message-----
> From: Ido Schimmel <idosch@idosch.org>
> Sent: Tuesday, September 21, 2021 3:16 PM
> To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> message to get SyncE status
> 
> On Fri, Sep 03, 2021 at 05:14:35PM +0200, Maciej Machnikowski wrote:
> > diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> > index eebd3894fe89..78a8a5af8cd8 100644
> > --- a/include/uapi/linux/if_link.h
> > +++ b/include/uapi/linux/if_link.h
> > @@ -1273,4 +1273,35 @@ enum {
> >
> >  #define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
> >
> > +/* SyncE section */
> > +
> > +enum if_eec_state {
> > +	IF_EEC_STATE_INVALID = 0,
> > +	IF_EEC_STATE_FREERUN,
> > +	IF_EEC_STATE_LOCKACQ,
> > +	IF_EEC_STATE_LOCKREC,
> > +	IF_EEC_STATE_LOCKED,
> > +	IF_EEC_STATE_HOLDOVER,
> > +	IF_EEC_STATE_OPEN_LOOP,
> > +	__IF_EEC_STATE_MAX,
> 
> Can you document these states? I'm not clear on what LOCKACQ, LOCKREC
> and OPEN_LOOP mean. I also don't see ice using them and it's not really
> a good practice to add new uAPI without any current users.
> 

I'll fix that enum to use generic states defined in G.781 which are limited to:
- Freerun
- LockedACQ (locked, acquiring holdover memory)
- Locked (locked with holdover acquired)
- Holdover

> From v1 I understand that these states were copied from commit
> 797d3186544f ("ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.")
> from Renesas.
> 
> Figure 11 in the following PDF describes the states, but it seems
> specific to the particular device and probably shouldn't be exposed to
> user space as-is:
> https://www.renesas.com/us/en/document/dst/8a34041-datasheet
> 
> I have a few questions about this being a per-netdev attribute:
> 
> 1. My understanding is that in the multi-port adapter you are working
> with you have a single EEC that is used to set the Tx frequency of all
> the ports. Is that correct?

That's correct.
 
> 2. Assuming the above is correct, is it possible that one port is in
> LOCKED state and another (for some reason) is in HOLDOVER state? If yes,
> then I agree about this being a per-netdev attribute. The interface can
> also be extended with another attribute specifying the HOLDOVER reason.
> For example, netdev being down.

All ports driven by a single EEC will report the same state.

> Regardless, I agree with previous comments made about this belonging in
> ethtool rather than rtnetlink.

Will take a look at it - as it will require support in linuxptp as well.

> > +};
> > +
> > +#define IF_EEC_STATE_MAX	(__IF_EEC_STATE_MAX - 1)
> > +#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the
> port is
> > +					  * currently the source for the EEC
> > +					  */
> 
> I'm not sure about this one. If we are going to expose this as a
> per-netdev attribute (see more below), any reason it can't be added as
> another state (e.g., IF_EEC_STATE_LOCKED_SRC)?

OK! That's a great idea! Yet we'll need LOCKED_SRC and LOCKED_ACQ_SRC,
but still sounds good.

> IIUC, in the common case of a simple NE the source of the EEC is always
> one of the ports, but in the case of a PRC the source is most likely
> external (e.g., GPS).

True

> If so, I think we need a way to represent the EEC as a separate object
> with the ability to set its source and report it via the same interface.
> I'm unclear on how exactly an external source looks like, but for the
> netdev case maybe something like this:
> 
> devlink clock show [ dev clock CLOCK ]
> devlink clock set DEV clock CLOCK [ { src_type SRC_TYPE } ]
> SRC_TYPE : = { port DEV/PORT_INDEX }

Unfortunately, EEC lives in 2 worlds - it belongs to a netdev (in very simple
deployments the EEC may be a part of the PHY and only allow synchronizing
the TX frequency to a single lane/port), but also can go outside of netdev
and be a boar-wise frequency source.

> The only source type above is 'port' with the ability to set the
> relevant port, but more can be added. Obviously, 'devlink clock show'
> will give you the current source in addition to other information such
> as frequency difference with respect to the input frequency.

We considered devlink interface for configuring the clock/DPLL, but a
new concept was born at the list to add a DPLL subsystem that will
cover more use cases, like a TimeCard.

> Finally, can you share more info about the relation to the PHC? My
> understanding is that one of the primary use cases for SyncE is to drive
> all the PHCs in the network using the same frequency. How do you imagine
> this configuration is going to look like? Can the above interface be
> extended for that?

That would be a configurable parameter/option of the PTP program.
Just like it can check the existence of link on a given port, it'll also be
able to check if we use EEC and it's locked. If it is, and is a source of
PHC frequency - the ptp tool can decide to not apply frequency corrections
to the PHC, just like the ptp4l does when nullf servo is used, but can do that
dynamically.

> All of the above might be complete nonsense as I'm still trying to wrap
> my head around the subject.

It's certainly not! Thanks for your input!


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-21 13:37     ` Machnikowski, Maciej
@ 2021-09-21 14:58       ` Ido Schimmel
  2021-09-21 21:14         ` Jakub Kicinski
  0 siblings, 1 reply; 42+ messages in thread
From: Ido Schimmel @ 2021-09-21 14:58 UTC (permalink / raw)
  To: Machnikowski, Maciej
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi, Nguyen,
	Anthony L, davem, kuba, linux-kselftest

On Tue, Sep 21, 2021 at 01:37:37PM +0000, Machnikowski, Maciej wrote:
> > -----Original Message-----
> > From: Ido Schimmel <idosch@idosch.org>
> > Sent: Tuesday, September 21, 2021 3:16 PM
> > To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
> > Subject: Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE
> > message to get SyncE status
> > 
> > On Fri, Sep 03, 2021 at 05:14:35PM +0200, Maciej Machnikowski wrote:
> > > diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> > > index eebd3894fe89..78a8a5af8cd8 100644
> > > --- a/include/uapi/linux/if_link.h
> > > +++ b/include/uapi/linux/if_link.h
> > > @@ -1273,4 +1273,35 @@ enum {
> > >
> > >  #define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
> > >
> > > +/* SyncE section */
> > > +
> > > +enum if_eec_state {
> > > +	IF_EEC_STATE_INVALID = 0,
> > > +	IF_EEC_STATE_FREERUN,
> > > +	IF_EEC_STATE_LOCKACQ,
> > > +	IF_EEC_STATE_LOCKREC,
> > > +	IF_EEC_STATE_LOCKED,
> > > +	IF_EEC_STATE_HOLDOVER,
> > > +	IF_EEC_STATE_OPEN_LOOP,
> > > +	__IF_EEC_STATE_MAX,
> > 
> > Can you document these states? I'm not clear on what LOCKACQ, LOCKREC
> > and OPEN_LOOP mean. I also don't see ice using them and it's not really
> > a good practice to add new uAPI without any current users.
> > 
> 
> I'll fix that enum to use generic states defined in G.781 which are limited to:
> - Freerun
> - LockedACQ (locked, acquiring holdover memory)
> - Locked (locked with holdover acquired)
> - Holdover

Thanks, it is good to conform to a standard.

Can ice distinguish between LockedACQ and Locked? From G.781 I
understand that the former is a transient state. Is the distinction
between the two important for user space / the selection operation? If
not, maybe we only need Locked?

> 
> > From v1 I understand that these states were copied from commit
> > 797d3186544f ("ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.")
> > from Renesas.
> > 
> > Figure 11 in the following PDF describes the states, but it seems
> > specific to the particular device and probably shouldn't be exposed to
> > user space as-is:
> > https://www.renesas.com/us/en/document/dst/8a34041-datasheet
> > 
> > I have a few questions about this being a per-netdev attribute:
> > 
> > 1. My understanding is that in the multi-port adapter you are working
> > with you have a single EEC that is used to set the Tx frequency of all
> > the ports. Is that correct?
> 
> That's correct.
>  
> > 2. Assuming the above is correct, is it possible that one port is in
> > LOCKED state and another (for some reason) is in HOLDOVER state? If yes,
> > then I agree about this being a per-netdev attribute. The interface can
> > also be extended with another attribute specifying the HOLDOVER reason.
> > For example, netdev being down.
> 
> All ports driven by a single EEC will report the same state.

So maybe we just need to report via ethtool the association between the
EEC and the netdev and expose the state as an attribute of the EEC
(along with the selected source and other info)?

This is similar to how PHC/netdev association is queried via ethtool.
For a given netdev, TSINFO_GET will report the PTP hw clock index via
ETHTOOL_A_TSINFO_PHC_INDEX. See Documentation/networking/ethtool-netlink.rst

> 
> > Regardless, I agree with previous comments made about this belonging in
> > ethtool rather than rtnetlink.
> 
> Will take a look at it - as it will require support in linuxptp as well.
> 
> > > +};
> > > +
> > > +#define IF_EEC_STATE_MAX	(__IF_EEC_STATE_MAX - 1)
> > > +#define EEC_SRC_PORT		(1 << 0) /* recovered clock from the
> > port is
> > > +					  * currently the source for the EEC
> > > +					  */
> > 
> > I'm not sure about this one. If we are going to expose this as a
> > per-netdev attribute (see more below), any reason it can't be added as
> > another state (e.g., IF_EEC_STATE_LOCKED_SRC)?
> 
> OK! That's a great idea! Yet we'll need LOCKED_SRC and LOCKED_ACQ_SRC,
> but still sounds good.
> 
> > IIUC, in the common case of a simple NE the source of the EEC is always
> > one of the ports, but in the case of a PRC the source is most likely
> > external (e.g., GPS).
> 
> True
> 
> > If so, I think we need a way to represent the EEC as a separate object
> > with the ability to set its source and report it via the same interface.
> > I'm unclear on how exactly an external source looks like, but for the
> > netdev case maybe something like this:
> > 
> > devlink clock show [ dev clock CLOCK ]
> > devlink clock set DEV clock CLOCK [ { src_type SRC_TYPE } ]
> > SRC_TYPE : = { port DEV/PORT_INDEX }
> 
> Unfortunately, EEC lives in 2 worlds - it belongs to a netdev (in very simple
> deployments the EEC may be a part of the PHY and only allow synchronizing
> the TX frequency to a single lane/port), but also can go outside of netdev
> and be a boar-wise frequency source.
> 
> > The only source type above is 'port' with the ability to set the
> > relevant port, but more can be added. Obviously, 'devlink clock show'
> > will give you the current source in addition to other information such
> > as frequency difference with respect to the input frequency.
> 
> We considered devlink interface for configuring the clock/DPLL, but a
> new concept was born at the list to add a DPLL subsystem that will
> cover more use cases, like a TimeCard.

The reason I suggested devlink is that it is suited for device-wide
configuration and it is already used by both MAC drivers and the
TimeCard driver. If we have a good reason to create a new generic
netlink family for this stuff, then OK.

> 
> > Finally, can you share more info about the relation to the PHC? My
> > understanding is that one of the primary use cases for SyncE is to drive
> > all the PHCs in the network using the same frequency. How do you imagine
> > this configuration is going to look like? Can the above interface be
> > extended for that?
> 
> That would be a configurable parameter/option of the PTP program.
> Just like it can check the existence of link on a given port, it'll also be
> able to check if we use EEC and it's locked. If it is, and is a source of
> PHC frequency - the ptp tool can decide to not apply frequency corrections
> to the PHC, just like the ptp4l does when nullf servo is used, but can do that
> dynamically.

The part I don't understand is "is a source of PHC frequency". How do we
configure / query that?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-21 14:58       ` Ido Schimmel
@ 2021-09-21 21:14         ` Jakub Kicinski
  2021-09-22  6:22           ` Ido Schimmel
  0 siblings, 1 reply; 42+ messages in thread
From: Jakub Kicinski @ 2021-09-21 21:14 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Machnikowski, Maciej, netdev, intel-wired-lan, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, linux-kselftest

On Tue, 21 Sep 2021 17:58:05 +0300 Ido Schimmel wrote:
> > > The only source type above is 'port' with the ability to set the
> > > relevant port, but more can be added. Obviously, 'devlink clock show'
> > > will give you the current source in addition to other information such
> > > as frequency difference with respect to the input frequency.  
> > 
> > We considered devlink interface for configuring the clock/DPLL, but a
> > new concept was born at the list to add a DPLL subsystem that will
> > cover more use cases, like a TimeCard.  
> 
> The reason I suggested devlink is that it is suited for device-wide
> configuration and it is already used by both MAC drivers and the
> TimeCard driver. If we have a good reason to create a new generic
> netlink family for this stuff, then OK.

For NICs mapping between devlink instances and HW is not clear.
Most register devlink per PCI dev which usually maps to a Eth port.
So if we have one DPLL on a 2 port NIC mapping will get icky, no?

Is the motivation to save the boilerplate code associated with new
genetlink family or something more? 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-09-21 21:14         ` Jakub Kicinski
@ 2021-09-22  6:22           ` Ido Schimmel
  0 siblings, 0 replies; 42+ messages in thread
From: Ido Schimmel @ 2021-09-22  6:22 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Machnikowski, Maciej, netdev, intel-wired-lan, richardcochran,
	abyagowi, Nguyen, Anthony L, davem, linux-kselftest

On Tue, Sep 21, 2021 at 02:14:45PM -0700, Jakub Kicinski wrote:
> On Tue, 21 Sep 2021 17:58:05 +0300 Ido Schimmel wrote:
> > > > The only source type above is 'port' with the ability to set the
> > > > relevant port, but more can be added. Obviously, 'devlink clock show'
> > > > will give you the current source in addition to other information such
> > > > as frequency difference with respect to the input frequency.  
> > > 
> > > We considered devlink interface for configuring the clock/DPLL, but a
> > > new concept was born at the list to add a DPLL subsystem that will
> > > cover more use cases, like a TimeCard.  
> > 
> > The reason I suggested devlink is that it is suited for device-wide
> > configuration and it is already used by both MAC drivers and the
> > TimeCard driver. If we have a good reason to create a new generic
> > netlink family for this stuff, then OK.
> 
> For NICs mapping between devlink instances and HW is not clear.
> Most register devlink per PCI dev which usually maps to a Eth port.
> So if we have one DPLL on a 2 port NIC mapping will get icky, no?

Yes, having to represent the same EEC in multiple devlink instances is
not nice.

> 
> Is the motivation to save the boilerplate code associated with new
> genetlink family or something more? 

I don't mind either way. I simply wanted to understand the motivation
for not using any existing framework. The above argument is convincing
enough, IMO.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-08-31 11:52 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
@ 2021-08-31 13:44   ` Jakub Kicinski
  0 siblings, 0 replies; 42+ messages in thread
From: Jakub Kicinski @ 2021-08-31 13:44 UTC (permalink / raw)
  To: Maciej Machnikowski
  Cc: netdev, intel-wired-lan, richardcochran, abyagowi,
	anthony.l.nguyen, davem, linux-kselftest

On Tue, 31 Aug 2021 13:52:32 +0200 Maciej Machnikowski wrote:
> This patch series introduces basic interface for reading the Ethernet
> Equipment Clock (EEC) state on a SyncE capable device. This state gives
> information about the source of the syntonization signal and the state
> of EEC. This interface is required to implement Synchronization Status
> Messaging on upper layers.
> 
> Initial implementation returns:
>  - SyncE EEC state
>  - Source of signal driving SyncE EEC (SyncE, GNSS, PTP or External)
>  - Current index of the pin driving the EEC to track multiple recovered
>    clock paths
> 
> SyncE EEC state read needs to be implemented as ndo_get_eec_state
> function.
> 
> Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>

> +#define EEC_FLAG_STATE_VAL	(1 << 0)
> +#define EEC_FLAG_SRC_VAL	(1 << 1)
> +#define EEC_FLAG_PIN_VAL	(1 << 2)
> +
> +struct if_eec_state_msg {
> +	__u8 flags;
> +	__u8 state;
> +	__u8 src;
> +	__u8 pin;
> +	__u32 ifindex;
> +};

Break it up into attributes, please. It's the idiomatic way of dealing
with multiple values these days in netlink. Makes validation,
extensiblility, feature discover etc. etc. much easier down the line.

> +static int rtnl_fill_eec_state(struct sk_buff *msg, struct net_device *dev,
> +			       u32 portid, u32 seq, struct netlink_callback *cb,
> +			       int flags, struct netlink_ext_ack *extack)
> +{
> +	const struct net_device_ops *ops = dev->netdev_ops;
> +	struct if_eec_state_msg *state;
> +	struct nlmsghdr *nlh;
> +	int err;
> +
> +	ASSERT_RTNL();
> +
> +	if (!ops->ndo_get_eec_state)
> +		return -EOPNOTSUPP;
> +
> +	nlh = nlmsg_put(msg, portid, seq, RTM_GETEECSTATE,
> +			sizeof(*state), flags);
> +	if (!nlh)
> +		return -EMSGSIZE;
> +
> +	state = nlmsg_data(nlh);
> +
> +	memset(state, 0, sizeof(*state));
> +	err = ops->ndo_get_eec_state(dev, state, extack);
> +	if (err)
> +		return err;

return ops->ndo_get_eec_state(dev, state, extack);

Even tho it's perfectly fine as is IMO there are bots out there
matching on this pattern so let's not feed them.

> +	return 0;
> +}
> +
> +static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh,
> +			      struct netlink_ext_ack *extack)
> +{
> +	struct net *net = sock_net(skb->sk);
> +	struct if_eec_state_msg *state;
> +	struct net_device *dev;
> +	struct sk_buff *nskb;
> +	int err;
> +
> +	state = nlmsg_data(nlh);
> +	if (state->ifindex > 0)
> +		dev = __dev_get_by_index(net, state->ifindex);
> +	else
> +		return -EINVAL;

Keep the expected path unindented:

	if (state->ifindex <= 0)
		return -EINVAL;

	dev = __dev_get_by_index(net, state->ifindex);
	if (!dev)
		return -ENODEV;

That said I'm not sure why rtnl_stats_get() checks the ifindex is
positive in the first place (which is what I assumed inspired you).
We can just call __dev_get_by_index() and have it fail AFAICS.

> +	if (!dev)
> +		return -ENODEV;
> +
> +	nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +	if (!nskb)
> +		return -ENOBUFS;
> +
> +	err = rtnl_fill_eec_state(nskb, dev, NETLINK_CB(skb).portid,
> +				  nlh->nlmsg_seq, NULL, nlh->nlmsg_flags,
> +				  extack);
> +	if (err < 0)
> +		kfree_skb(nskb);
> +	else
> +		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
> +
> +	return err;
> +}

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  2021-08-31 11:52 [PATCH net-next 0/2] Add RTNL interface for SyncE EEC state Maciej Machnikowski
@ 2021-08-31 11:52 ` Maciej Machnikowski
  2021-08-31 13:44   ` Jakub Kicinski
  0 siblings, 1 reply; 42+ messages in thread
From: Maciej Machnikowski @ 2021-08-31 11:52 UTC (permalink / raw)
  To: maciej.machnikowski, netdev, intel-wired-lan
  Cc: richardcochran, abyagowi, anthony.l.nguyen, davem, kuba, linux-kselftest

This patch series introduces basic interface for reading the Ethernet
Equipment Clock (EEC) state on a SyncE capable device. This state gives
information about the source of the syntonization signal and the state
of EEC. This interface is required to implement Synchronization Status
Messaging on upper layers.

Initial implementation returns:
 - SyncE EEC state
 - Source of signal driving SyncE EEC (SyncE, GNSS, PTP or External)
 - Current index of the pin driving the EEC to track multiple recovered
   clock paths

SyncE EEC state read needs to be implemented as ndo_get_eec_state
function.

Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 include/linux/netdevice.h      |  5 +++
 include/uapi/linux/if_link.h   | 46 ++++++++++++++++++++++++
 include/uapi/linux/rtnetlink.h |  3 ++
 net/core/rtnetlink.c           | 64 ++++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |  3 +-
 5 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 6fd3a4d42668..4e5380900134 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1344,6 +1344,8 @@ struct netdev_net_notifier {
  *	The caller must be under RCU read context.
  * int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path);
  *     Get the forwarding path to reach the real device from the HW destination address
+ * int (*ndo_get_synce_state)(struct net_device *dev, struct if_eec_state_msg *state)
+ *	Get state of physical layer frequency syntonization (SyncE)
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1563,6 +1565,9 @@ struct net_device_ops {
 	struct net_device *	(*ndo_get_peer_dev)(struct net_device *dev);
 	int                     (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx,
                                                          struct net_device_path *path);
+	int			(*ndo_get_eec_state)(struct net_device *dev,
+						     struct if_eec_state_msg *state,
+						     struct netlink_ext_ack *extack);
 };
 
 /**
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index eebd3894fe89..7158a4b2b78f 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1273,4 +1273,50 @@ enum {
 
 #define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
 
+/* SyncE section */
+
+enum if_eec_state {
+	IF_EEC_STATE_INVALID = 0,
+	IF_EEC_STATE_FREERUN,
+	IF_EEC_STATE_LOCKACQ,
+	IF_EEC_STATE_LOCKREC,
+	IF_EEC_STATE_LOCKED,
+	IF_EEC_STATE_HOLDOVER,
+	IF_EEC_STATE_OPEN_LOOP,
+	__IF_EEC_STATE_MAX,
+};
+
+#define IF_EEC_STATE_MAX (__IF_EEC_STATE_MAX - 1)
+
+enum if_eec_src {
+	IF_EEC_SRC_UNKNOWN = 0,
+	IF_EEC_SRC_SYNCE,
+	IF_EEC_SRC_GNSS,
+	IF_EEC_SRC_PTP,
+	IF_EEC_SRC_EXT,
+	__IF_EEC_SRC_MAX,
+};
+
+#define EEC_FLAG_STATE_VAL	(1 << 0)
+#define EEC_FLAG_SRC_VAL	(1 << 1)
+#define EEC_FLAG_PIN_VAL	(1 << 2)
+
+struct if_eec_state_msg {
+	__u8 flags;
+	__u8 state;
+	__u8 src;
+	__u8 pin;
+	__u32 ifindex;
+};
+
+enum {
+	IFLA_EEC_FLAGS,
+	IFLA_EEC_STATE,
+	IFLA_EEC_SRC,
+	IFLA_EEC_PIN,
+	__IFLA_EEC_MAX,
+};
+
+#define IFLA_EEC_MAX (__IFLA_EEC_MAX - 1)
+
 #endif /* _UAPI_LINUX_IF_LINK_H */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 5888492a5257..642ac18a09d9 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -185,6 +185,9 @@ enum {
 	RTM_GETNEXTHOPBUCKET,
 #define RTM_GETNEXTHOPBUCKET	RTM_GETNEXTHOPBUCKET
 
+	RTM_GETEECSTATE = 120,
+#define RTM_GETEECSTATE	RTM_GETEECSTATE
+
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 972c8cb303a5..f5aa6e341952 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -5468,6 +5468,68 @@ static int rtnl_stats_dump(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
+static int rtnl_fill_eec_state(struct sk_buff *msg, struct net_device *dev,
+			       u32 portid, u32 seq, struct netlink_callback *cb,
+			       int flags, struct netlink_ext_ack *extack)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct if_eec_state_msg *state;
+	struct nlmsghdr *nlh;
+	int err;
+
+	ASSERT_RTNL();
+
+	if (!ops->ndo_get_eec_state)
+		return -EOPNOTSUPP;
+
+	nlh = nlmsg_put(msg, portid, seq, RTM_GETEECSTATE,
+			sizeof(*state), flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	state = nlmsg_data(nlh);
+
+	memset(state, 0, sizeof(*state));
+	err = ops->ndo_get_eec_state(dev, state, extack);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh,
+			      struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct if_eec_state_msg *state;
+	struct net_device *dev;
+	struct sk_buff *nskb;
+	int err;
+
+	state = nlmsg_data(nlh);
+	if (state->ifindex > 0)
+		dev = __dev_get_by_index(net, state->ifindex);
+	else
+		return -EINVAL;
+
+	if (!dev)
+		return -ENODEV;
+
+	nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!nskb)
+		return -ENOBUFS;
+
+	err = rtnl_fill_eec_state(nskb, dev, NETLINK_CB(skb).portid,
+				  nlh->nlmsg_seq, NULL, nlh->nlmsg_flags,
+				  extack);
+	if (err < 0)
+		kfree_skb(nskb);
+	else
+		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+
+	return err;
+}
+
 /* Process one rtnetlink message. */
 
 static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -5693,4 +5755,6 @@ void __init rtnetlink_init(void)
 
 	rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump,
 		      0);
+
+	rtnl_register(PF_UNSPEC, RTM_GETEECSTATE, rtnl_eec_state_get, NULL, 0);
 }
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index d59276f48d4f..0e2d84d6045f 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -91,6 +91,7 @@ static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_NEWNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_DELNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_GETNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
+	{ RTM_GETEECSTATE,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
 };
 
 static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
@@ -174,7 +175,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 		 * structures at the top of this file with the new mappings
 		 * before updating the BUILD_BUG_ON() macro!
 		 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWNEXTHOPBUCKET + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_GETEECSTATE + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;
-- 
2.26.3


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2021-09-22  6:22 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-03 15:14 [RFC v4 net-next 0/2] Add RTNL interface for SyncE Maciej Machnikowski
2021-09-03 15:14 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
2021-09-03 16:18   ` Stephen Hemminger
2021-09-03 22:20     ` Machnikowski, Maciej
2021-09-03 22:14   ` Jakub Kicinski
2021-09-06 18:30     ` Machnikowski, Maciej
2021-09-06 18:39       ` Jakub Kicinski
2021-09-06 19:01         ` Machnikowski, Maciej
2021-09-07  1:01           ` Jakub Kicinski
2021-09-07  8:50             ` Machnikowski, Maciej
2021-09-07 14:55               ` Jakub Kicinski
2021-09-07 15:47                 ` Machnikowski, Maciej
2021-09-07 19:47                   ` Jakub Kicinski
2021-09-08  8:03                     ` Machnikowski, Maciej
2021-09-08 16:21                       ` Jakub Kicinski
2021-09-08 17:30                         ` Machnikowski, Maciej
2021-09-08 19:34                           ` Andrew Lunn
2021-09-08 20:27                             ` Machnikowski, Maciej
2021-09-08 22:20                             ` Jakub Kicinski
2021-09-08 22:59                               ` Andrew Lunn
2021-09-09  2:09                                 ` Richard Cochran
2021-09-09  8:18                                   ` Machnikowski, Maciej
2021-09-10 14:14                                     ` Richard Cochran
2021-09-08 22:18                           ` Jakub Kicinski
2021-09-08 23:14                             ` Andrew Lunn
2021-09-08 23:58                               ` Jakub Kicinski
2021-09-09  8:26                                 ` Machnikowski, Maciej
2021-09-09  9:24                                   ` Machnikowski, Maciej
2021-09-09 10:15                                     ` David Miller
2021-09-09  8:11                             ` Machnikowski, Maciej
2021-09-13  8:50                         ` Ido Schimmel
2021-09-21 13:36                         ` Ido Schimmel
2021-09-21 13:15   ` Ido Schimmel
2021-09-21 13:37     ` Machnikowski, Maciej
2021-09-21 14:58       ` Ido Schimmel
2021-09-21 21:14         ` Jakub Kicinski
2021-09-22  6:22           ` Ido Schimmel
2021-09-03 15:14 ` [PATCH net-next 2/2] ice: add support for reading SyncE DPLL state Maciej Machnikowski
2021-09-03 22:06   ` Jakub Kicinski
2021-09-21 13:25   ` Ido Schimmel
  -- strict thread matches above, loose matches on Subject: below --
2021-08-31 11:52 [PATCH net-next 0/2] Add RTNL interface for SyncE EEC state Maciej Machnikowski
2021-08-31 11:52 ` [PATCH net-next 1/2] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status Maciej Machnikowski
2021-08-31 13:44   ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).