LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH rdma-next 0/5] Misc update for hns driver
@ 2018-05-17 8:02 Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API Wei Hu (Xavier)
` (4 more replies)
0 siblings, 5 replies; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-17 8:02 UTC (permalink / raw)
To: dledford, jgg; +Cc: linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
This patchset included fixing bug, some optimization, reset
process and implementation of the disassociate_ucontext API
for hns driver.
Wei Hu (Xavier) (5):
RDMA/hns: Implement the disassociate_ucontext API
RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust
RDMA/hns: Increase checking CMQ status timeout value
RDMA/hns: Add reset process for RoCE in hip08
RDMA/hns: Fix the illegal memory operation when cross page
drivers/infiniband/hw/hns/hns_roce_cmd.c | 3 +
drivers/infiniband/hw/hns/hns_roce_device.h | 3 +
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 130 +++++++++++++++++++++++-----
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 3 +-
drivers/infiniband/hw/hns/hns_roce_main.c | 43 +++++++++
drivers/infiniband/hw/hns/hns_roce_pd.c | 10 ++-
6 files changed, 167 insertions(+), 25 deletions(-)
--
1.9.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API
2018-05-17 8:02 [PATCH rdma-next 0/5] Misc update for hns driver Wei Hu (Xavier)
@ 2018-05-17 8:02 ` Wei Hu (Xavier)
2018-05-17 15:00 ` Jason Gunthorpe
2018-05-17 8:02 ` [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust Wei Hu (Xavier)
` (3 subsequent siblings)
4 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-17 8:02 UTC (permalink / raw)
To: dledford, jgg; +Cc: linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
This patch Implements the IB core disassociate_ucontext API.
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_main.c | 36 +++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 96fb6a9..7fafe9d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -33,6 +33,9 @@
#include <linux/acpi.h>
#include <linux/of_platform.h>
#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/sched/mm.h>
+#include <linux/sched/task.h>
#include <rdma/ib_addr.h>
#include <rdma/ib_smi.h>
#include <rdma/ib_user_verbs.h>
@@ -422,6 +425,38 @@ static int hns_roce_port_immutable(struct ib_device *ib_dev, u8 port_num,
return 0;
}
+static void hns_roce_disassociate_ucontext(struct ib_ucontext *ibcontext)
+{
+ struct task_struct *process;
+ struct mm_struct *mm;
+
+ process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
+ if (!process)
+ return;
+
+ mm = get_task_mm(process);
+ if (!mm) {
+ pr_info("no mm, disassociate ucontext is pending task termination\n");
+ while (1) {
+ put_task_struct(process);
+ usleep_range(1000, 2000);
+ process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
+ if (!process || process->state == TASK_DEAD) {
+ pr_info("disassociate ucontext done, task was terminated\n");
+ /* if task was dead, need to release the task
+ * struct.
+ */
+ if (process)
+ put_task_struct(process);
+ return;
+ }
+ }
+ }
+
+ mmput(mm);
+ put_task_struct(process);
+}
+
static void hns_roce_unregister_device(struct hns_roce_dev *hr_dev)
{
struct hns_roce_ib_iboe *iboe = &hr_dev->iboe;
@@ -516,6 +551,7 @@ static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
/* OTHERS */
ib_dev->get_port_immutable = hns_roce_port_immutable;
+ ib_dev->disassociate_ucontext = hns_roce_disassociate_ucontext;
ib_dev->driver_id = RDMA_DRIVER_HNS;
ret = ib_register_device(ib_dev, NULL);
--
1.9.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust
2018-05-17 8:02 [PATCH rdma-next 0/5] Misc update for hns driver Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API Wei Hu (Xavier)
@ 2018-05-17 8:02 ` Wei Hu (Xavier)
2018-05-23 6:05 ` Leon Romanovsky
2018-05-17 8:02 ` [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value Wei Hu (Xavier)
` (2 subsequent siblings)
4 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-17 8:02 UTC (permalink / raw)
To: dledford, jgg; +Cc: linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
This patch modified uar allocation algorithm in hns_roce_uar_alloc
function to avoid bitmap exhaust.
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_device.h | 1 +
drivers/infiniband/hw/hns/hns_roce_pd.c | 10 ++++++----
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 53c2f1b..412297d4 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -214,6 +214,7 @@ enum {
struct hns_roce_uar {
u64 pfn;
unsigned long index;
+ unsigned long logic_idx;
};
struct hns_roce_ucontext {
diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c b/drivers/infiniband/hw/hns/hns_roce_pd.c
index 4b41e04..b9f2c87 100644
--- a/drivers/infiniband/hw/hns/hns_roce_pd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
@@ -107,13 +107,15 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
int ret = 0;
/* Using bitmap to manager UAR index */
- ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->index);
+ ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
if (ret == -1)
return -ENOMEM;
- if (uar->index > 0)
- uar->index = (uar->index - 1) %
+ if (uar->logic_idx > 0 && hr_dev->caps.phy_num_uars > 1)
+ uar->index = (uar->logic_idx - 1) %
(hr_dev->caps.phy_num_uars - 1) + 1;
+ else
+ uar->index = 0;
if (!dev_is_pci(hr_dev->dev)) {
res = platform_get_resource(hr_dev->pdev, IORESOURCE_MEM, 0);
@@ -132,7 +134,7 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
void hns_roce_uar_free(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
{
- hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->index,
+ hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->logic_idx,
BITMAP_NO_RR);
}
--
1.9.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value
2018-05-17 8:02 [PATCH rdma-next 0/5] Misc update for hns driver Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust Wei Hu (Xavier)
@ 2018-05-17 8:02 ` Wei Hu (Xavier)
2018-05-23 5:49 ` Leon Romanovsky
2018-05-17 8:02 ` [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08 Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page Wei Hu (Xavier)
4 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-17 8:02 UTC (permalink / raw)
To: dledford, jgg; +Cc: linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
This patch increases checking CMQ status timeout value and
uses the same value with NIC driver to avoid deficiency of
time.
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 182b672..f16df1b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -76,7 +76,7 @@
#define HNS_ROCE_V2_PAGE_SIZE_SUPPORTED 0xFFFFF000
#define HNS_ROCE_V2_MAX_INNER_MTPT_NUM 2
#define HNS_ROCE_INVALID_LKEY 0x100
-#define HNS_ROCE_CMQ_TX_TIMEOUT 200
+#define HNS_ROCE_CMQ_TX_TIMEOUT 30000
#define HNS_ROCE_CONTEXT_HOP_NUM 1
#define HNS_ROCE_MTT_HOP_NUM 1
--
1.9.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-17 8:02 [PATCH rdma-next 0/5] Misc update for hns driver Wei Hu (Xavier)
` (2 preceding siblings ...)
2018-05-17 8:02 ` [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value Wei Hu (Xavier)
@ 2018-05-17 8:02 ` Wei Hu (Xavier)
2018-05-17 15:14 ` Jason Gunthorpe
2018-05-17 8:02 ` [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page Wei Hu (Xavier)
4 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-17 8:02 UTC (permalink / raw)
To: dledford, jgg; +Cc: linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
This patch added reset process for RoCE in hip08.
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_cmd.c | 3 ++
drivers/infiniband/hw/hns/hns_roce_device.h | 2 +
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 60 +++++++++++++++++++++++++++++
drivers/infiniband/hw/hns/hns_roce_main.c | 7 ++++
4 files changed, 72 insertions(+)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 9ebe839..a0ba19d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -176,6 +176,9 @@ int hns_roce_cmd_mbox(struct hns_roce_dev *hr_dev, u64 in_param, u64 out_param,
unsigned long in_modifier, u8 op_modifier, u16 op,
unsigned long timeout)
{
+ if (hr_dev->is_reset)
+ return 0;
+
if (hr_dev->cmd.use_events)
return hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
in_modifier, op_modifier, op,
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 412297d4..da8512b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -774,6 +774,8 @@ struct hns_roce_dev {
const char *irq_names[HNS_ROCE_MAX_IRQ_NUM];
spinlock_t sm_lock;
spinlock_t bt_cmd_lock;
+ bool active;
+ bool is_reset;
struct hns_roce_ib_iboe iboe;
struct list_head pgdir_list;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 86ef15f..e1c44a6 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
int ret = 0;
int ntc;
+ if (hr_dev->is_reset)
+ return 0;
+
spin_lock_bh(&csq->lock);
if (num > hns_roce_cmq_space(csq)) {
@@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
return 0;
error_failed_get_cfg:
+ handle->priv = NULL;
kfree(hr_dev->priv);
error_failed_kzalloc:
@@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
{
struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
+ if (!hr_dev)
+ return;
+
hns_roce_exit(hr_dev);
+ handle->priv = NULL;
kfree(hr_dev->priv);
ib_dealloc_device(&hr_dev->ib_dev);
}
+static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
+{
+ struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
+ struct ib_event event;
+
+ if (!hr_dev) {
+ dev_err(&handle->pdev->dev,
+ "Input parameter handle->priv is NULL!\n");
+ return -EINVAL;
+ }
+
+ hr_dev->active = false;
+ hr_dev->is_reset = true;
+
+ event.event = IB_EVENT_DEVICE_FATAL;
+ event.device = &hr_dev->ib_dev;
+ event.element.port_num = 1;
+ ib_dispatch_event(&event);
+
+ return 0;
+}
+
+static int hns_roce_hw_v2_reset_notify_uninit(struct hnae3_handle *handle)
+{
+ msleep(100);
+ hns_roce_hw_v2_uninit_instance(handle, false);
+ return 0;
+}
+
+static int hns_roce_hw_v2_reset_notify(struct hnae3_handle *handle,
+ enum hnae3_reset_notify_type type)
+{
+ int ret = 0;
+
+ switch (type) {
+ case HNAE3_DOWN_CLIENT:
+ ret = hns_roce_hw_v2_reset_notify_down(handle);
+ break;
+ case HNAE3_INIT_CLIENT:
+ ret = hns_roce_hw_v2_init_instance(handle);
+ break;
+ case HNAE3_UNINIT_CLIENT:
+ ret = hns_roce_hw_v2_reset_notify_uninit(handle);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
static const struct hnae3_client_ops hns_roce_hw_v2_ops = {
.init_instance = hns_roce_hw_v2_init_instance,
.uninit_instance = hns_roce_hw_v2_uninit_instance,
+ .reset_notify = hns_roce_hw_v2_reset_notify,
};
static struct hnae3_client hns_roce_hw_v2_client = {
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 7fafe9d..56ee172 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -336,6 +336,9 @@ static struct ib_ucontext *hns_roce_alloc_ucontext(struct ib_device *ib_dev,
struct hns_roce_ib_alloc_ucontext_resp resp = {};
struct hns_roce_dev *hr_dev = to_hr_dev(ib_dev);
+ if (!hr_dev->active)
+ return ERR_PTR(-EAGAIN);
+
resp.qp_tab_size = hr_dev->caps.num_qps;
context = kmalloc(sizeof(*context), GFP_KERNEL);
@@ -461,6 +464,7 @@ static void hns_roce_unregister_device(struct hns_roce_dev *hr_dev)
{
struct hns_roce_ib_iboe *iboe = &hr_dev->iboe;
+ hr_dev->active = false;
unregister_netdevice_notifier(&iboe->nb);
ib_unregister_device(&hr_dev->ib_dev);
}
@@ -573,6 +577,7 @@ static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
goto error_failed_setup_mtu_mac;
}
+ hr_dev->active = true;
return 0;
error_failed_setup_mtu_mac:
@@ -765,6 +770,7 @@ int hns_roce_init(struct hns_roce_dev *hr_dev)
return ret;
}
}
+ hr_dev->is_reset = false;
if (hr_dev->hw->cmq_init) {
ret = hr_dev->hw->cmq_init(hr_dev);
@@ -864,6 +870,7 @@ int hns_roce_init(struct hns_roce_dev *hr_dev)
void hns_roce_exit(struct hns_roce_dev *hr_dev)
{
hns_roce_unregister_device(hr_dev);
+
if (hr_dev->hw->hw_exit)
hr_dev->hw->hw_exit(hr_dev);
hns_roce_cleanup_bitmap(hr_dev);
--
1.9.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page
2018-05-17 8:02 [PATCH rdma-next 0/5] Misc update for hns driver Wei Hu (Xavier)
` (3 preceding siblings ...)
2018-05-17 8:02 ` [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08 Wei Hu (Xavier)
@ 2018-05-17 8:02 ` Wei Hu (Xavier)
2018-05-23 6:17 ` Leon Romanovsky
4 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-17 8:02 UTC (permalink / raw)
To: dledford, jgg; +Cc: linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
This patch fixed the potential illegal operation when using the
extend sge buffer cross page in post send operation. The bug
will cause the calltrace.
Reported-by: Jie Chen <chenjie103@huawei.com>
Reported-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
Fixes: b1c1583("RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 70 +++++++++++++++++++++---------
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 1 +
2 files changed, 51 insertions(+), 20 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index e1c44a6..5393149 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -52,6 +52,52 @@ static void set_data_seg_v2(struct hns_roce_v2_wqe_data_seg *dseg,
dseg->len = cpu_to_le32(sg->length);
}
+static void set_extend_sge(struct hns_roce_qp *qp, struct ib_send_wr *wr,
+ unsigned int *sge_ind)
+{
+ struct hns_roce_v2_wqe_data_seg *dseg;
+ struct ib_sge *sg;
+ int num_in_wqe = 0;
+ int extend_sge_num;
+ int fi_sge_num;
+ int se_sge_num;
+ int shift;
+ int i;
+
+ if (qp->ibqp.qp_type == IB_QPT_RC || qp->ibqp.qp_type == IB_QPT_UC)
+ num_in_wqe = HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE;
+ extend_sge_num = wr->num_sge - num_in_wqe;
+ sg = wr->sg_list + num_in_wqe;
+ shift = qp->hr_buf.page_shift;
+
+ /*
+ * Check whether wr->num_sge sges are in the same page. If not, we
+ * should calculate how many sges in the first page and the second
+ * page.
+ */
+ dseg = get_send_extend_sge(qp, (*sge_ind) & (qp->sge.sge_cnt - 1));
+ fi_sge_num = (((((u64)dseg >> shift) + 1) << shift) - (u64)dseg) /
+ sizeof(struct hns_roce_v2_wqe_data_seg);
+ if (extend_sge_num > fi_sge_num) {
+ se_sge_num = extend_sge_num - fi_sge_num;
+ for (i = 0; i < fi_sge_num; i++) {
+ set_data_seg_v2(dseg++, sg + i);
+ (*sge_ind)++;
+ }
+ dseg = get_send_extend_sge(qp,
+ (*sge_ind) & (qp->sge.sge_cnt - 1));
+ for (i = 0; i < se_sge_num; i++) {
+ set_data_seg_v2(dseg++, sg + fi_sge_num + i);
+ (*sge_ind)++;
+ }
+ } else {
+ for (i = 0; i < extend_sge_num; i++) {
+ set_data_seg_v2(dseg++, sg + i);
+ (*sge_ind)++;
+ }
+ }
+}
+
static int set_rwqe_data_seg(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct hns_roce_v2_rc_send_wqe *rc_sq_wqe,
void *wqe, unsigned int *sge_ind,
@@ -85,7 +131,7 @@ static int set_rwqe_data_seg(struct ib_qp *ibqp, struct ib_send_wr *wr,
roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_INLINE_S,
1);
} else {
- if (wr->num_sge <= 2) {
+ if (wr->num_sge <= HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE) {
for (i = 0; i < wr->num_sge; i++) {
if (likely(wr->sg_list[i].length)) {
set_data_seg_v2(dseg, wr->sg_list + i);
@@ -98,24 +144,14 @@ static int set_rwqe_data_seg(struct ib_qp *ibqp, struct ib_send_wr *wr,
V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S,
(*sge_ind) & (qp->sge.sge_cnt - 1));
- for (i = 0; i < 2; i++) {
+ for (i = 0; i < HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE; i++) {
if (likely(wr->sg_list[i].length)) {
set_data_seg_v2(dseg, wr->sg_list + i);
dseg++;
}
}
- dseg = get_send_extend_sge(qp,
- (*sge_ind) & (qp->sge.sge_cnt - 1));
-
- for (i = 0; i < wr->num_sge - 2; i++) {
- if (likely(wr->sg_list[i + 2].length)) {
- set_data_seg_v2(dseg,
- wr->sg_list + 2 + i);
- dseg++;
- (*sge_ind)++;
- }
- }
+ set_extend_sge(qp, wr, sge_ind);
}
roce_set_field(rc_sq_wqe->byte_16,
@@ -318,13 +354,7 @@ static int hns_roce_v2_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
memcpy(&ud_sq_wqe->dgid[0], &ah->av.dgid[0],
GID_LEN_V2);
- dseg = get_send_extend_sge(qp,
- sge_ind & (qp->sge.sge_cnt - 1));
- for (i = 0; i < wr->num_sge; i++) {
- set_data_seg_v2(dseg + i, wr->sg_list + i);
- sge_ind++;
- }
-
+ set_extend_sge(qp, wr, &sge_ind);
ind++;
} else if (ibqp->qp_type == IB_QPT_RC) {
rc_sq_wqe = wqe;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index f16df1b..97f9bef 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -77,6 +77,7 @@
#define HNS_ROCE_V2_MAX_INNER_MTPT_NUM 2
#define HNS_ROCE_INVALID_LKEY 0x100
#define HNS_ROCE_CMQ_TX_TIMEOUT 30000
+#define HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE 2
#define HNS_ROCE_CONTEXT_HOP_NUM 1
#define HNS_ROCE_MTT_HOP_NUM 1
--
1.9.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API
2018-05-17 8:02 ` [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API Wei Hu (Xavier)
@ 2018-05-17 15:00 ` Jason Gunthorpe
2018-05-19 8:24 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2018-05-17 15:00 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On Thu, May 17, 2018 at 04:02:49PM +0800, Wei Hu (Xavier) wrote:
> This patch Implements the IB core disassociate_ucontext API.
>
> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> drivers/infiniband/hw/hns/hns_roce_main.c | 36 +++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> index 96fb6a9..7fafe9d 100644
> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> @@ -33,6 +33,9 @@
> #include <linux/acpi.h>
> #include <linux/of_platform.h>
> #include <linux/module.h>
> +#include <linux/sched.h>
> +#include <linux/sched/mm.h>
> +#include <linux/sched/task.h>
> #include <rdma/ib_addr.h>
> #include <rdma/ib_smi.h>
> #include <rdma/ib_user_verbs.h>
> @@ -422,6 +425,38 @@ static int hns_roce_port_immutable(struct ib_device *ib_dev, u8 port_num,
> return 0;
> }
>
> +static void hns_roce_disassociate_ucontext(struct ib_ucontext *ibcontext)
> +{
> + struct task_struct *process;
> + struct mm_struct *mm;
> +
> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
> + if (!process)
> + return;
> +
> + mm = get_task_mm(process);
> + if (!mm) {
> + pr_info("no mm, disassociate ucontext is pending task termination\n");
> + while (1) {
> + put_task_struct(process);
> + usleep_range(1000, 2000);
> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
> + if (!process || process->state == TASK_DEAD) {
> + pr_info("disassociate ucontext done, task was terminated\n");
> + /* if task was dead, need to release the task
> + * struct.
> + */
> + if (process)
> + put_task_struct(process);
> + return;
> + }
> + }
> + }
I don't want to see this boilerplate code copied into every
driver. Hoist it into the core code, have the disassociate driver callback
accept a mm_struct parameter, and refactor the other drivers using this.
> + mmput(mm);
> + put_task_struct(process);
> +}
This can't be right, disassociate requires the driver to replace all
the mmaps it make to user space with the 0 page, I see hns does use
mmaps, so it must zap them here.
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-17 8:02 ` [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08 Wei Hu (Xavier)
@ 2018-05-17 15:14 ` Jason Gunthorpe
2018-05-18 3:28 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2018-05-17 15:14 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> index 86ef15f..e1c44a6 100644
> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
> int ret = 0;
> int ntc;
>
> + if (hr_dev->is_reset)
> + return 0;
> +
> spin_lock_bh(&csq->lock);
>
> if (num > hns_roce_cmq_space(csq)) {
> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
> return 0;
>
> error_failed_get_cfg:
> + handle->priv = NULL;
> kfree(hr_dev->priv);
>
> error_failed_kzalloc:
> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
> {
> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
>
> + if (!hr_dev)
> + return;
> +
> hns_roce_exit(hr_dev);
> + handle->priv = NULL;
> kfree(hr_dev->priv);
> ib_dealloc_device(&hr_dev->ib_dev);
> }
Why are these hunks here? If init fails then uninit should not be
called, so why meddle with priv?
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-17 15:14 ` Jason Gunthorpe
@ 2018-05-18 3:28 ` Wei Hu (Xavier)
2018-05-18 4:15 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-18 3:28 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/17 23:14, Jason Gunthorpe wrote:
> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> index 86ef15f..e1c44a6 100644
>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
>> int ret = 0;
>> int ntc;
>>
>> + if (hr_dev->is_reset)
>> + return 0;
>> +
>> spin_lock_bh(&csq->lock);
>>
>> if (num > hns_roce_cmq_space(csq)) {
>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
>> return 0;
>>
>> error_failed_get_cfg:
>> + handle->priv = NULL;
>> kfree(hr_dev->priv);
>>
>> error_failed_kzalloc:
>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>> {
>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
>>
>> + if (!hr_dev)
>> + return;
>> +
>> hns_roce_exit(hr_dev);
>> + handle->priv = NULL;
>> kfree(hr_dev->priv);
>> ib_dealloc_device(&hr_dev->ib_dev);
>> }
> Why are these hunks here? If init fails then uninit should not be
> called, so why meddle with priv?
In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
hr_dev,
We want clear the value in hns_roce_hw_v2_uninit_instance function.
So we can ensure no problem in RoCE driver.
static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
{
struct hns_roce_dev *hr_dev;
int ret;
hr_dev = (struct hns_roce_dev *)ib_alloc_device(sizeof(*hr_dev));
if (!hr_dev)
return -ENOMEM;
...// other code
handle->priv = hr_dev;
....// other code
return 0;
error_xxx:
handle->priv = NULL;
...// other code
error_yyyy:
ib_dealloc_device(&hr_dev->ib_dev);
return ret;
}
static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
bool reset)
{
struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
if (!hr_dev)
return;
hns_roce_exit(hr_dev);
handle->priv = NULL;
kfree(hr_dev->priv);
ib_dealloc_device(&hr_dev->ib_dev);
}
>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-18 3:28 ` Wei Hu (Xavier)
@ 2018-05-18 4:15 ` Jason Gunthorpe
2018-05-18 7:23 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2018-05-18 4:15 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2018/5/17 23:14, Jason Gunthorpe wrote:
> > On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
> >> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> >> index 86ef15f..e1c44a6 100644
> >> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> >> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
> >> int ret = 0;
> >> int ntc;
> >>
> >> + if (hr_dev->is_reset)
> >> + return 0;
> >> +
> >> spin_lock_bh(&csq->lock);
> >>
> >> if (num > hns_roce_cmq_space(csq)) {
> >> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
> >> return 0;
> >>
> >> error_failed_get_cfg:
> >> + handle->priv = NULL;
> >> kfree(hr_dev->priv);
> >>
> >> error_failed_kzalloc:
> >> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
> >> {
> >> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
> >>
> >> + if (!hr_dev)
> >> + return;
> >> +
> >> hns_roce_exit(hr_dev);
> >> + handle->priv = NULL;
> >> kfree(hr_dev->priv);
> >> ib_dealloc_device(&hr_dev->ib_dev);
> >> }
> > Why are these hunks here? If init fails then uninit should not be
> > called, so why meddle with priv?
> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
> hr_dev,
> We want clear the value in hns_roce_hw_v2_uninit_instance function.
> So we can ensure no problem in RoCE driver.
What problem could happen?
I keep removing unnecessary sets to null and checks of null, so please
don't add them if they cannot happen.
Eg uninit should never be called with a null priv, that is a serious
logic mis-design someplace if it happens.
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-18 4:15 ` Jason Gunthorpe
@ 2018-05-18 7:23 ` Wei Hu (Xavier)
2018-05-22 20:26 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-18 7:23 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/18 12:15, Jason Gunthorpe wrote:
> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
>>
>> On 2018/5/17 23:14, Jason Gunthorpe wrote:
>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>> index 86ef15f..e1c44a6 100644
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
>>>> int ret = 0;
>>>> int ntc;
>>>>
>>>> + if (hr_dev->is_reset)
>>>> + return 0;
>>>> +
>>>> spin_lock_bh(&csq->lock);
>>>>
>>>> if (num > hns_roce_cmq_space(csq)) {
>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
>>>> return 0;
>>>>
>>>> error_failed_get_cfg:
>>>> + handle->priv = NULL;
>>>> kfree(hr_dev->priv);
>>>>
>>>> error_failed_kzalloc:
>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>>>> {
>>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
>>>>
>>>> + if (!hr_dev)
>>>> + return;
>>>> +
>>>> hns_roce_exit(hr_dev);
>>>> + handle->priv = NULL;
>>>> kfree(hr_dev->priv);
>>>> ib_dealloc_device(&hr_dev->ib_dev);
>>>> }
>>> Why are these hunks here? If init fails then uninit should not be
>>> called, so why meddle with priv?
>> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
>> hr_dev,
>> We want clear the value in hns_roce_hw_v2_uninit_instance function.
>> So we can ensure no problem in RoCE driver.
> What problem could happen?
>
> I keep removing unnecessary sets to null and checks of null, so please
> don't add them if they cannot happen.
>
> Eg uninit should never be called with a null priv, that is a serious
> logic mis-design someplace if it happens.
>
> Jason
NIC driver call the registered reset_notify() function to finish the
part of RoCE reset process.
In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT,
we call hns_roce_hw_v2_uninit_instance(handle, false) to release the
resources.
when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call
hns_roce_hw_v2_init_instance.
if hns_roce_hw_v2_init_instance failed, we should ensure no problem in
the other callback
function registered by RoCE driver.
The related RoCE driver:
static int hns_roce_hw_v2_reset_notify_uninit(struct hnae3_handle *handle)
{
msleep(100);
hns_roce_hw_v2_uninit_instance(handle, false);
return 0;
}
static int hns_roce_hw_v2_reset_notify(struct hnae3_handle *handle,
enum hnae3_reset_notify_type type)
{
int ret = 0;
switch (type) {
case HNAE3_DOWN_CLIENT:
ret = hns_roce_hw_v2_reset_notify_down(handle);
break;
case HNAE3_INIT_CLIENT:
ret = hns_roce_hw_v2_init_instance(handle);
break;
case HNAE3_UNINIT_CLIENT:
ret = hns_roce_hw_v2_reset_notify_uninit(handle);
break;
default:
break;
}
return ret;
}
The related NIC driver:
static int hclge_notify_roce_client(struct hclge_dev *hdev,
enum hnae3_reset_notify_type type)
{
struct hnae3_client *client = hdev->roce_client;
struct hnae3_handle *handle;
int ret = 0;
u16 i;
if (!client)
return 0;
if (!client->ops->reset_notify)
return -EOPNOTSUPP;
for (i = 0; i < hdev->num_vmdq_vport + 1; i++) {
handle = &hdev->vport[i].roce;
ret = client->ops->reset_notify(handle, type);
if (ret) {
dev_err(&hdev->pdev->dev,
"notify roce client failed %d", ret);
return ret;
}
}
return ret;
}
static void hclge_reset(struct hclge_dev *hdev)
{
struct hnae3_handle *handle;
/* perform reset of the stack & ae device for a client */
handle = &hdev->vport[0].nic;
hclge_notify_roce_client(hdev, HNAE3_DOWN_CLIENT);
hclge_notify_roce_client(hdev, HNAE3_UNINIT_CLIENT);
rtnl_lock();
hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
if (!hclge_reset_wait(hdev)) {
hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
hclge_reset_ae_dev(hdev->ae_dev);
hclge_notify_client(hdev, HNAE3_INIT_CLIENT);
hclge_clear_reset_cause(hdev);
} else {
/* schedule again to check pending resets later */
set_bit(hdev->reset_type, &hdev->reset_pending);
hclge_reset_task_schedule(hdev);
}
hclge_notify_client(hdev, HNAE3_UP_CLIENT);
handle->last_reset_time = jiffies;
rtnl_unlock();
hclge_notify_roce_client(hdev, HNAE3_INIT_CLIENT);
hclge_notify_roce_client(hdev, HNAE3_UP_CLIENT);
}
Thanks, Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API
2018-05-17 15:00 ` Jason Gunthorpe
@ 2018-05-19 8:24 ` Wei Hu (Xavier)
2018-05-22 20:21 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-19 8:24 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/17 23:00, Jason Gunthorpe wrote:
> On Thu, May 17, 2018 at 04:02:49PM +0800, Wei Hu (Xavier) wrote:
>> This patch Implements the IB core disassociate_ucontext API.
>>
>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>> drivers/infiniband/hw/hns/hns_roce_main.c | 36 +++++++++++++++++++++++++++++++
>> 1 file changed, 36 insertions(+)
>>
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
>> index 96fb6a9..7fafe9d 100644
>> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
>> @@ -33,6 +33,9 @@
>> #include <linux/acpi.h>
>> #include <linux/of_platform.h>
>> #include <linux/module.h>
>> +#include <linux/sched.h>
>> +#include <linux/sched/mm.h>
>> +#include <linux/sched/task.h>
>> #include <rdma/ib_addr.h>
>> #include <rdma/ib_smi.h>
>> #include <rdma/ib_user_verbs.h>
>> @@ -422,6 +425,38 @@ static int hns_roce_port_immutable(struct ib_device *ib_dev, u8 port_num,
>> return 0;
>> }
>>
>> +static void hns_roce_disassociate_ucontext(struct ib_ucontext *ibcontext)
>> +{
>> + struct task_struct *process;
>> + struct mm_struct *mm;
>> +
>> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
>> + if (!process)
>> + return;
>> +
>> + mm = get_task_mm(process);
>> + if (!mm) {
>> + pr_info("no mm, disassociate ucontext is pending task termination\n");
>> + while (1) {
>> + put_task_struct(process);
>> + usleep_range(1000, 2000);
>> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
>> + if (!process || process->state == TASK_DEAD) {
>> + pr_info("disassociate ucontext done, task was terminated\n");
>> + /* if task was dead, need to release the task
>> + * struct.
>> + */
>> + if (process)
>> + put_task_struct(process);
>> + return;
>> + }
>> + }
>> + }
> I don't want to see this boilerplate code copied into every
> driver. Hoist it into the core code, have the disassociate driver callback
> accept a mm_struct parameter, and refactor the other drivers using this.
When the userspace RDMA application process is suspended for some reason
without executing ibv_close_device function,
There will be calltrace as follows when rmmod roce kernel driver ko in
the current version.
It looks like a common problem to every driver and the code segment
above is suitable for every driver.
Pardon me for asking, but if you have any plan to do this?
root@(none)# rmmod
../ko/hns-roce-hw-v2.ko
[ 1222.676069] INFO: task rmmod:1996 blocked for more than 120
seconds.
[ 1222.682423] Not tainted 4.16.0-rc1-29112-ge237d0c-dirty
#15
[ 1222.688507] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1222.696327] rmmod D 0 1996 1951
0x00000000
[ 1222.701807] Call
trace:
[ 1222.704252]
__switch_to+0x9c/0xd8
[ 1222.707644]
__schedule+0x1d8/0x854
[ 1222.711125]
schedule+0x3c/0x9c
[ 1222.714258]
schedule_timeout+0x1dc/0x3f8
[ 1222.718260]
wait_for_common+0x120/0x1e0
[ 1222.722174]
wait_for_completion+0x28/0x34
[ 1222.726264]
ib_uverbs_remove_one+0x29c/0x2bc
[ 1222.730614]
ib_unregister_device+0xe8/0x198
[ 1222.734888] hns_roce_exit+0xb4/0xc4
[hns_roce]
[ 1222.739414] hns_roce_hw_v2_uninit_instance+0x24/0x40
[hns_roce_hw_v2]
[ 1222.745934]
hclge_uninit_client_instance+0x88/0xb8
[ 1222.750803]
hnae3_match_n_instantiate+0xbc/0xd0
[ 1222.755411]
hnae3_unregister_client+0x50/0xb0
[ 1222.759850] hns_roce_hw_v2_exit+0x10/0xd48
[hns_roce_hw_v2]
[ 1222.765501]
SyS_delete_module+0x1e8/0x238
[ 1222.769589]
el0_svc_naked+0x30/0x34
Thanks, Jason
>> + mmput(mm);
>> + put_task_struct(process);
>> +}
> This can't be right, disassociate requires the driver to replace all
> the mmaps it make to user space with the 0 page, I see hns does use
> mmaps, so it must zap them here.
Ok, got it.
>
> Jason
>
> .
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API
2018-05-19 8:24 ` Wei Hu (Xavier)
@ 2018-05-22 20:21 ` Jason Gunthorpe
2018-05-23 9:33 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2018-05-22 20:21 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On Sat, May 19, 2018 at 04:24:40PM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2018/5/17 23:00, Jason Gunthorpe wrote:
> > On Thu, May 17, 2018 at 04:02:49PM +0800, Wei Hu (Xavier) wrote:
> >> This patch Implements the IB core disassociate_ucontext API.
> >>
> >> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> >> drivers/infiniband/hw/hns/hns_roce_main.c | 36 +++++++++++++++++++++++++++++++
> >> 1 file changed, 36 insertions(+)
> >>
> >> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> >> index 96fb6a9..7fafe9d 100644
> >> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> >> @@ -33,6 +33,9 @@
> >> #include <linux/acpi.h>
> >> #include <linux/of_platform.h>
> >> #include <linux/module.h>
> >> +#include <linux/sched.h>
> >> +#include <linux/sched/mm.h>
> >> +#include <linux/sched/task.h>
> >> #include <rdma/ib_addr.h>
> >> #include <rdma/ib_smi.h>
> >> #include <rdma/ib_user_verbs.h>
> >> @@ -422,6 +425,38 @@ static int hns_roce_port_immutable(struct ib_device *ib_dev, u8 port_num,
> >> return 0;
> >> }
> >>
> >> +static void hns_roce_disassociate_ucontext(struct ib_ucontext *ibcontext)
> >> +{
> >> + struct task_struct *process;
> >> + struct mm_struct *mm;
> >> +
> >> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
> >> + if (!process)
> >> + return;
> >> +
> >> + mm = get_task_mm(process);
> >> + if (!mm) {
> >> + pr_info("no mm, disassociate ucontext is pending task termination\n");
> >> + while (1) {
> >> + put_task_struct(process);
> >> + usleep_range(1000, 2000);
> >> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
> >> + if (!process || process->state == TASK_DEAD) {
> >> + pr_info("disassociate ucontext done, task was terminated\n");
> >> + /* if task was dead, need to release the task
> >> + * struct.
> >> + */
> >> + if (process)
> >> + put_task_struct(process);
> >> + return;
> >> + }
> >> + }
> >> + }
> > I don't want to see this boilerplate code copied into every
> > driver. Hoist it into the core code, have the disassociate driver callback
> > accept a mm_struct parameter, and refactor the other drivers using this.
>
> When the userspace RDMA application process is suspended for some reason
> without executing ibv_close_device function,
> There will be calltrace as follows when rmmod roce kernel driver ko in
> the current version.
> It looks like a common problem to every driver and the code segment
> above is suitable for every driver.
> Pardon me for asking, but if you have any plan to do this?
My plan is to ask you to do it :)
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-18 7:23 ` Wei Hu (Xavier)
@ 2018-05-22 20:26 ` Jason Gunthorpe
2018-05-23 2:54 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2018-05-22 20:26 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On Fri, May 18, 2018 at 03:23:00PM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2018/5/18 12:15, Jason Gunthorpe wrote:
> > On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
> >>
> >> On 2018/5/17 23:14, Jason Gunthorpe wrote:
> >>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
> >>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> >>>> index 86ef15f..e1c44a6 100644
> >>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> >>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
> >>>> int ret = 0;
> >>>> int ntc;
> >>>>
> >>>> + if (hr_dev->is_reset)
> >>>> + return 0;
> >>>> +
> >>>> spin_lock_bh(&csq->lock);
> >>>>
> >>>> if (num > hns_roce_cmq_space(csq)) {
> >>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
> >>>> return 0;
> >>>>
> >>>> error_failed_get_cfg:
> >>>> + handle->priv = NULL;
> >>>> kfree(hr_dev->priv);
> >>>>
> >>>> error_failed_kzalloc:
> >>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
> >>>> {
> >>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
> >>>>
> >>>> + if (!hr_dev)
> >>>> + return;
> >>>> +
> >>>> hns_roce_exit(hr_dev);
> >>>> + handle->priv = NULL;
> >>>> kfree(hr_dev->priv);
> >>>> ib_dealloc_device(&hr_dev->ib_dev);
> >>>> }
> >>> Why are these hunks here? If init fails then uninit should not be
> >>> called, so why meddle with priv?
> >> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
> >> hr_dev,
> >> We want clear the value in hns_roce_hw_v2_uninit_instance function.
> >> So we can ensure no problem in RoCE driver.
> > What problem could happen?
> >
> > I keep removing unnecessary sets to null and checks of null, so please
> > don't add them if they cannot happen.
> >
> > Eg uninit should never be called with a null priv, that is a serious
> > logic mis-design someplace if it happens.
> >
> > Jason
> NIC driver call the registered reset_notify() function to finish the
> part of RoCE reset process.
> In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT,
> we call hns_roce_hw_v2_uninit_instance(handle, false) to release the
> resources.
> when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call
> hns_roce_hw_v2_init_instance.
> if hns_roce_hw_v2_init_instance failed, we should ensure no problem in
> the other callback
> function registered by RoCE driver.
Don't design things like this.
init/uninit are paired - do not call something uninit if it can be
called after init fails, or better, arrange to prevent that so things
are sane.
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-22 20:26 ` Jason Gunthorpe
@ 2018-05-23 2:54 ` Wei Hu (Xavier)
2018-05-23 3:47 ` Jason Gunthorpe
2018-05-23 3:49 ` Wei Hu (Xavier)
0 siblings, 2 replies; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 2:54 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 4:26, Jason Gunthorpe wrote:
> On Fri, May 18, 2018 at 03:23:00PM +0800, Wei Hu (Xavier) wrote:
>>
>> On 2018/5/18 12:15, Jason Gunthorpe wrote:
>>> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
>>>> On 2018/5/17 23:14, Jason Gunthorpe wrote:
>>>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
>>>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>>>> index 86ef15f..e1c44a6 100644
>>>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
>>>>>> int ret = 0;
>>>>>> int ntc;
>>>>>>
>>>>>> + if (hr_dev->is_reset)
>>>>>> + return 0;
>>>>>> +
>>>>>> spin_lock_bh(&csq->lock);
>>>>>>
>>>>>> if (num > hns_roce_cmq_space(csq)) {
>>>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
>>>>>> return 0;
>>>>>>
>>>>>> error_failed_get_cfg:
>>>>>> + handle->priv = NULL;
>>>>>> kfree(hr_dev->priv);
>>>>>>
>>>>>> error_failed_kzalloc:
>>>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>>>>>> {
>>>>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
>>>>>>
>>>>>> + if (!hr_dev)
>>>>>> + return;
>>>>>> +
>>>>>> hns_roce_exit(hr_dev);
>>>>>> + handle->priv = NULL;
>>>>>> kfree(hr_dev->priv);
>>>>>> ib_dealloc_device(&hr_dev->ib_dev);
>>>>>> }
>>>>> Why are these hunks here? If init fails then uninit should not be
>>>>> called, so why meddle with priv?
>>>> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
>>>> hr_dev,
>>>> We want clear the value in hns_roce_hw_v2_uninit_instance function.
>>>> So we can ensure no problem in RoCE driver.
>>> What problem could happen?
>>>
>>> I keep removing unnecessary sets to null and checks of null, so please
>>> don't add them if they cannot happen.
>>>
>>> Eg uninit should never be called with a null priv, that is a serious
>>> logic mis-design someplace if it happens.
>>>
>>> Jason
>> NIC driver call the registered reset_notify() function to finish the
>> part of RoCE reset process.
>> In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT,
>> we call hns_roce_hw_v2_uninit_instance(handle, false) to release the
>> resources.
>> when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call
>> hns_roce_hw_v2_init_instance.
>> if hns_roce_hw_v2_init_instance failed, we should ensure no problem in
>> the other callback
>> function registered by RoCE driver.
> Don't design things like this.
>
> init/uninit are paired - do not call something uninit if it can be
> called after init fails, or better, arrange to prevent that so things
> are sane.
>
> Jason
>
> .
The current RoCE driver registered 3 callback function to NIC driver as
belows:
1.init_instance/uninit_instance are paired.
2.In reset_notify function, RoCE dirver still call
init_instance/uninit_instance function.
but NIC driver does not perceive the behavior. We need to judge in RoCE
driver.
static const struct hnae3_client_ops hns_roce_hw_v2_ops = {
.init_instance = hns_roce_hw_v2_init_instance,
.uninit_instance = hns_roce_hw_v2_uninit_instance,
.reset_notify = hns_roce_hw_v2_reset_notify,
};
Wei Hu
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-23 2:54 ` Wei Hu (Xavier)
@ 2018-05-23 3:47 ` Jason Gunthorpe
2018-05-23 9:35 ` Wei Hu (Xavier)
2018-05-23 3:49 ` Wei Hu (Xavier)
1 sibling, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2018-05-23 3:47 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On Wed, May 23, 2018 at 10:54:54AM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2018/5/23 4:26, Jason Gunthorpe wrote:
> > On Fri, May 18, 2018 at 03:23:00PM +0800, Wei Hu (Xavier) wrote:
> >>
> >> On 2018/5/18 12:15, Jason Gunthorpe wrote:
> >>> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
> >>>> On 2018/5/17 23:14, Jason Gunthorpe wrote:
> >>>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
> >>>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> >>>>>> index 86ef15f..e1c44a6 100644
> >>>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> >>>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
> >>>>>> int ret = 0;
> >>>>>> int ntc;
> >>>>>>
> >>>>>> + if (hr_dev->is_reset)
> >>>>>> + return 0;
> >>>>>> +
> >>>>>> spin_lock_bh(&csq->lock);
> >>>>>>
> >>>>>> if (num > hns_roce_cmq_space(csq)) {
> >>>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
> >>>>>> return 0;
> >>>>>>
> >>>>>> error_failed_get_cfg:
> >>>>>> + handle->priv = NULL;
> >>>>>> kfree(hr_dev->priv);
> >>>>>>
> >>>>>> error_failed_kzalloc:
> >>>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
> >>>>>> {
> >>>>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
> >>>>>>
> >>>>>> + if (!hr_dev)
> >>>>>> + return;
> >>>>>> +
> >>>>>> hns_roce_exit(hr_dev);
> >>>>>> + handle->priv = NULL;
> >>>>>> kfree(hr_dev->priv);
> >>>>>> ib_dealloc_device(&hr_dev->ib_dev);
> >>>>>> }
> >>>>> Why are these hunks here? If init fails then uninit should not be
> >>>>> called, so why meddle with priv?
> >>>> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
> >>>> hr_dev,
> >>>> We want clear the value in hns_roce_hw_v2_uninit_instance function.
> >>>> So we can ensure no problem in RoCE driver.
> >>> What problem could happen?
> >>>
> >>> I keep removing unnecessary sets to null and checks of null, so please
> >>> don't add them if they cannot happen.
> >>>
> >>> Eg uninit should never be called with a null priv, that is a serious
> >>> logic mis-design someplace if it happens.
> >>>
> >>> Jason
> >> NIC driver call the registered reset_notify() function to finish the
> >> part of RoCE reset process.
> >> In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT,
> >> we call hns_roce_hw_v2_uninit_instance(handle, false) to release the
> >> resources.
> >> when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call
> >> hns_roce_hw_v2_init_instance.
> >> if hns_roce_hw_v2_init_instance failed, we should ensure no problem in
> >> the other callback
> >> function registered by RoCE driver.
> > Don't design things like this.
> >
> > init/uninit are paired - do not call something uninit if it can be
> > called after init fails, or better, arrange to prevent that so things
> > are sane.
> >
> > Jason
> >
> > .
> The current RoCE driver registered 3 callback function to NIC driver as
> belows:
> 1.init_instance/uninit_instance are paired.
> 2.In reset_notify function, RoCE dirver still call
> init_instance/uninit_instance function.
> but NIC driver does not perceive the behavior. We need to judge in RoCE
> driver.
fix the nic driver
Jason
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-23 2:54 ` Wei Hu (Xavier)
2018-05-23 3:47 ` Jason Gunthorpe
@ 2018-05-23 3:49 ` Wei Hu (Xavier)
1 sibling, 0 replies; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 3:49 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 10:54, Wei Hu (Xavier) wrote:
>
> On 2018/5/23 4:26, Jason Gunthorpe wrote:
>> On Fri, May 18, 2018 at 03:23:00PM +0800, Wei Hu (Xavier) wrote:
>>> On 2018/5/18 12:15, Jason Gunthorpe wrote:
>>>> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
>>>>> On 2018/5/17 23:14, Jason Gunthorpe wrote:
>>>>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
>>>>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>>>>> index 86ef15f..e1c44a6 100644
>>>>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
>>>>>>> int ret = 0;
>>>>>>> int ntc;
>>>>>>>
>>>>>>> + if (hr_dev->is_reset)
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> spin_lock_bh(&csq->lock);
>>>>>>>
>>>>>>> if (num > hns_roce_cmq_space(csq)) {
>>>>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
>>>>>>> return 0;
>>>>>>>
>>>>>>> error_failed_get_cfg:
>>>>>>> + handle->priv = NULL;
>>>>>>> kfree(hr_dev->priv);
>>>>>>>
>>>>>>> error_failed_kzalloc:
>>>>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>>>>>>> {
>>>>>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
>>>>>>>
>>>>>>> + if (!hr_dev)
>>>>>>> + return;
>>>>>>> +
>>>>>>> hns_roce_exit(hr_dev);
>>>>>>> + handle->priv = NULL;
>>>>>>> kfree(hr_dev->priv);
>>>>>>> ib_dealloc_device(&hr_dev->ib_dev);
>>>>>>> }
>>>>>> Why are these hunks here? If init fails then uninit should not be
>>>>>> called, so why meddle with priv?
>>>>> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
>>>>> hr_dev,
>>>>> We want clear the value in hns_roce_hw_v2_uninit_instance function.
>>>>> So we can ensure no problem in RoCE driver.
>>>> What problem could happen?
>>>>
>>>> I keep removing unnecessary sets to null and checks of null, so please
>>>> don't add them if they cannot happen.
>>>>
>>>> Eg uninit should never be called with a null priv, that is a serious
>>>> logic mis-design someplace if it happens.
>>>>
>>>> Jason
>>> NIC driver call the registered reset_notify() function to finish the
>>> part of RoCE reset process.
>>> In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT,
>>> we call hns_roce_hw_v2_uninit_instance(handle, false) to release the
>>> resources.
>>> when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call
>>> hns_roce_hw_v2_init_instance.
>>> if hns_roce_hw_v2_init_instance failed, we should ensure no problem in
>>> the other callback
>>> function registered by RoCE driver.
>> Don't design things like this.
>>
>> init/uninit are paired - do not call something uninit if it can be
>> called after init fails, or better, arrange to prevent that so things
>> are sane.
>>
>> Jason
>>
>> .
> The current RoCE driver registered 3 callback function to NIC driver as
> belows:
> 1.init_instance/uninit_instance are paired.
> 2.In reset_notify function, RoCE dirver still call
> init_instance/uninit_instance function.
> but NIC driver does not perceive the behavior. We need to judge in RoCE
> driver.
>
> static const struct hnae3_client_ops hns_roce_hw_v2_ops = {
> .init_instance = hns_roce_hw_v2_init_instance,
> .uninit_instance = hns_roce_hw_v2_uninit_instance,
> .reset_notify = hns_roce_hw_v2_reset_notify,
> };
struct hnae3_handle is defined in NIC driver, and handle->priv is used
for RoCE driver,
NIC driver will not use this member handle->priv.
struct hnae3_handle {
struct hnae3_client *client;
struct pci_dev *pdev;
void *priv;
struct hnae3_ae_algo *ae_algo; /* the class who provides this handle */
u64 flags; /* Indicate the capabilities for this handle*/
unsigned long last_reset_time;
enum hnae3_reset_type reset_level;
union {
struct net_device *netdev; /* first member */
struct hnae3_knic_private_info kinfo;
struct hnae3_unic_private_info uinfo;
struct hnae3_roce_private_info rinfo;
};
u32 numa_node_mask; /* for multi-chip support */
};
> Wei Hu
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> .
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value
2018-05-17 8:02 ` [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value Wei Hu (Xavier)
@ 2018-05-23 5:49 ` Leon Romanovsky
2018-05-23 6:09 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Leon Romanovsky @ 2018-05-23 5:49 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
[-- Attachment #1: Type: text/plain, Size: 1000 bytes --]
On Thu, May 17, 2018 at 04:02:51PM +0800, Wei Hu (Xavier) wrote:
> This patch increases checking CMQ status timeout value and
> uses the same value with NIC driver to avoid deficiency of
> time.
>
> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> ---
> drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
> index 182b672..f16df1b 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
> @@ -76,7 +76,7 @@
> #define HNS_ROCE_V2_PAGE_SIZE_SUPPORTED 0xFFFFF000
> #define HNS_ROCE_V2_MAX_INNER_MTPT_NUM 2
> #define HNS_ROCE_INVALID_LKEY 0x100
> -#define HNS_ROCE_CMQ_TX_TIMEOUT 200
> +#define HNS_ROCE_CMQ_TX_TIMEOUT 30000
Don't you think that loop of 30 seconds too much?
Thanks
>
> #define HNS_ROCE_CONTEXT_HOP_NUM 1
> #define HNS_ROCE_MTT_HOP_NUM 1
> --
> 1.9.1
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust
2018-05-17 8:02 ` [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust Wei Hu (Xavier)
@ 2018-05-23 6:05 ` Leon Romanovsky
2018-05-23 6:49 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Leon Romanovsky @ 2018-05-23 6:05 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
[-- Attachment #1: Type: text/plain, Size: 2401 bytes --]
On Thu, May 17, 2018 at 04:02:50PM +0800, Wei Hu (Xavier) wrote:
> This patch modified uar allocation algorithm in hns_roce_uar_alloc
> function to avoid bitmap exhaust.
>
> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> ---
> drivers/infiniband/hw/hns/hns_roce_device.h | 1 +
> drivers/infiniband/hw/hns/hns_roce_pd.c | 10 ++++++----
> 2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
> index 53c2f1b..412297d4 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
> @@ -214,6 +214,7 @@ enum {
> struct hns_roce_uar {
> u64 pfn;
> unsigned long index;
> + unsigned long logic_idx;
> };
>
> struct hns_roce_ucontext {
> diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c b/drivers/infiniband/hw/hns/hns_roce_pd.c
> index 4b41e04..b9f2c87 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_pd.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
> @@ -107,13 +107,15 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
> int ret = 0;
>
> /* Using bitmap to manager UAR index */
> - ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->index);
> + ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
> if (ret == -1)
> return -ENOMEM;
>
> - if (uar->index > 0)
> - uar->index = (uar->index - 1) %
> + if (uar->logic_idx > 0 && hr_dev->caps.phy_num_uars > 1)
> + uar->index = (uar->logic_idx - 1) %
> (hr_dev->caps.phy_num_uars - 1) + 1;
> + else
> + uar->index = 0;
>
Sorry, but maybe I didn't understand this change fully, but logic_idx is
not initialized at all and one of two (needs to check your uar
allocation): the logic_idx is always zero -> index will be zero too,
or logic_idx is random variable -> index will be random too.
What did you want to do?
> if (!dev_is_pci(hr_dev->dev)) {
> res = platform_get_resource(hr_dev->pdev, IORESOURCE_MEM, 0);
> @@ -132,7 +134,7 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
>
> void hns_roce_uar_free(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
> {
> - hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->index,
> + hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->logic_idx,
> BITMAP_NO_RR);
> }
>
> --
> 1.9.1
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value
2018-05-23 5:49 ` Leon Romanovsky
@ 2018-05-23 6:09 ` Wei Hu (Xavier)
2018-05-23 6:15 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 6:09 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 13:49, Leon Romanovsky wrote:
> On Thu, May 17, 2018 at 04:02:51PM +0800, Wei Hu (Xavier) wrote:
>> This patch increases checking CMQ status timeout value and
>> uses the same value with NIC driver to avoid deficiency of
>> time.
>>
>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>> ---
>> drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
>> index 182b672..f16df1b 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
>> @@ -76,7 +76,7 @@
>> #define HNS_ROCE_V2_PAGE_SIZE_SUPPORTED 0xFFFFF000
>> #define HNS_ROCE_V2_MAX_INNER_MTPT_NUM 2
>> #define HNS_ROCE_INVALID_LKEY 0x100
>> -#define HNS_ROCE_CMQ_TX_TIMEOUT 200
>> +#define HNS_ROCE_CMQ_TX_TIMEOUT 30000
> Don't you think that loop of 30 seconds too much?
>
> Thanks
Leon, 30ms
>
>> #define HNS_ROCE_CONTEXT_HOP_NUM 1
>> #define HNS_ROCE_MTT_HOP_NUM 1
>> --
>> 1.9.1
>>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value
2018-05-23 6:09 ` Wei Hu (Xavier)
@ 2018-05-23 6:15 ` Wei Hu (Xavier)
2018-05-23 6:23 ` Leon Romanovsky
0 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 6:15 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 14:09, Wei Hu (Xavier) wrote:
>
> On 2018/5/23 13:49, Leon Romanovsky wrote:
>> On Thu, May 17, 2018 at 04:02:51PM +0800, Wei Hu (Xavier) wrote:
>>> This patch increases checking CMQ status timeout value and
>>> uses the same value with NIC driver to avoid deficiency of
>>> time.
>>>
>>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>>> ---
>>> drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
>>> index 182b672..f16df1b 100644
>>> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
>>> @@ -76,7 +76,7 @@
>>> #define HNS_ROCE_V2_PAGE_SIZE_SUPPORTED 0xFFFFF000
>>> #define HNS_ROCE_V2_MAX_INNER_MTPT_NUM 2
>>> #define HNS_ROCE_INVALID_LKEY 0x100
>>> -#define HNS_ROCE_CMQ_TX_TIMEOUT 200
>>> +#define HNS_ROCE_CMQ_TX_TIMEOUT 30000
>> Don't you think that loop of 30 seconds too much?
>>
>> Thanks
> Leon, 30ms
Because firmware take turns excuting NIC and RoCE command.
Although RoCE command does not take so long, we still use
the same timeout value with NIC driver.
Thanks
>>> #define HNS_ROCE_CONTEXT_HOP_NUM 1
>>> #define HNS_ROCE_MTT_HOP_NUM 1
>>> --
>>> 1.9.1
>>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> .
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page
2018-05-17 8:02 ` [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page Wei Hu (Xavier)
@ 2018-05-23 6:17 ` Leon Romanovsky
2018-05-23 6:38 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Leon Romanovsky @ 2018-05-23 6:17 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
[-- Attachment #1: Type: text/plain, Size: 854 bytes --]
On Thu, May 17, 2018 at 04:02:53PM +0800, Wei Hu (Xavier) wrote:
> This patch fixed the potential illegal operation when using the
> extend sge buffer cross page in post send operation. The bug
> will cause the calltrace.
>
> Reported-by: Jie Chen <chenjie103@huawei.com>
> Reported-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
> Fixes: b1c1583("RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent")
Documentation/process/submitting-patches.rst
183 If your patch fixes a bug in a specific commit, e.g. you found an issue using
184 ``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
185 the SHA-1 ID, and the one line summary. For example::
186
187 Fixes: e21d2170f366 ("video: remove unnecessary platform_set_drvdata()")
Thanks
> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value
2018-05-23 6:15 ` Wei Hu (Xavier)
@ 2018-05-23 6:23 ` Leon Romanovsky
0 siblings, 0 replies; 30+ messages in thread
From: Leon Romanovsky @ 2018-05-23 6:23 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
[-- Attachment #1: Type: text/plain, Size: 1452 bytes --]
On Wed, May 23, 2018 at 02:15:48PM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2018/5/23 14:09, Wei Hu (Xavier) wrote:
> >
> > On 2018/5/23 13:49, Leon Romanovsky wrote:
> >> On Thu, May 17, 2018 at 04:02:51PM +0800, Wei Hu (Xavier) wrote:
> >>> This patch increases checking CMQ status timeout value and
> >>> uses the same value with NIC driver to avoid deficiency of
> >>> time.
> >>>
> >>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> >>> ---
> >>> drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +-
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
> >>> index 182b672..f16df1b 100644
> >>> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
> >>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
> >>> @@ -76,7 +76,7 @@
> >>> #define HNS_ROCE_V2_PAGE_SIZE_SUPPORTED 0xFFFFF000
> >>> #define HNS_ROCE_V2_MAX_INNER_MTPT_NUM 2
> >>> #define HNS_ROCE_INVALID_LKEY 0x100
> >>> -#define HNS_ROCE_CMQ_TX_TIMEOUT 200
> >>> +#define HNS_ROCE_CMQ_TX_TIMEOUT 30000
> >> Don't you think that loop of 30 seconds too much?
> >>
> >> Thanks
> > Leon, 30ms
> Because firmware take turns excuting NIC and RoCE command.
> Although RoCE command does not take so long, we still use
> the same timeout value with NIC driver.
Sorry, you are right it is 30ms and not 30s.
Thanks,
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page
2018-05-23 6:17 ` Leon Romanovsky
@ 2018-05-23 6:38 ` Wei Hu (Xavier)
0 siblings, 0 replies; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 6:38 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 14:17, Leon Romanovsky wrote:
> On Thu, May 17, 2018 at 04:02:53PM +0800, Wei Hu (Xavier) wrote:
>> This patch fixed the potential illegal operation when using the
>> extend sge buffer cross page in post send operation. The bug
>> will cause the calltrace.
>>
>> Reported-by: Jie Chen <chenjie103@huawei.com>
>> Reported-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com>
>> Fixes: b1c1583("RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent")
> Documentation/process/submitting-patches.rst
> 183 If your patch fixes a bug in a specific commit, e.g. you found an issue using
> 184 ``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
> 185 the SHA-1 ID, and the one line summary. For example::
> 186
> 187 Fixes: e21d2170f366 ("video: remove unnecessary platform_set_drvdata()")
>
> Thanks
Thanks, Leon
Got it. We will fix it in v2.
Wei Hu
>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust
2018-05-23 6:05 ` Leon Romanovsky
@ 2018-05-23 6:49 ` Wei Hu (Xavier)
2018-05-23 7:00 ` Leon Romanovsky
0 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 6:49 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 14:05, Leon Romanovsky wrote:
> On Thu, May 17, 2018 at 04:02:50PM +0800, Wei Hu (Xavier) wrote:
>> This patch modified uar allocation algorithm in hns_roce_uar_alloc
>> function to avoid bitmap exhaust.
>>
>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>> ---
>> drivers/infiniband/hw/hns/hns_roce_device.h | 1 +
>> drivers/infiniband/hw/hns/hns_roce_pd.c | 10 ++++++----
>> 2 files changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
>> index 53c2f1b..412297d4 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
>> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
>> @@ -214,6 +214,7 @@ enum {
>> struct hns_roce_uar {
>> u64 pfn;
>> unsigned long index;
>> + unsigned long logic_idx;
>> };
>>
>> struct hns_roce_ucontext {
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c b/drivers/infiniband/hw/hns/hns_roce_pd.c
>> index 4b41e04..b9f2c87 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_pd.c
>> +++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
>> @@ -107,13 +107,15 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
>> int ret = 0;
>>
>> /* Using bitmap to manager UAR index */
>> - ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->index);
>> + ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
>> if (ret == -1)
>> return -ENOMEM;
>>
>> - if (uar->index > 0)
>> - uar->index = (uar->index - 1) %
>> + if (uar->logic_idx > 0 && hr_dev->caps.phy_num_uars > 1)
>> + uar->index = (uar->logic_idx - 1) %
>> (hr_dev->caps.phy_num_uars - 1) + 1;
>> + else
>> + uar->index = 0;
>>
> Sorry, but maybe I didn't understand this change fully, but logic_idx is
> not initialized at all and one of two (needs to check your uar
> allocation): the logic_idx is always zero -> index will be zero too,
> or logic_idx is random variable -> index will be random too.
>
> What did you want to do?
>
Hi, Leon
The prototype of hns_roce_bitmap_alloc as belows:
int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap,
unsigned long *obj);
In this statement, we evaluate uar->logic_idx.
ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap,
&uar->logic_idx);
In hip06, hr_dev->caps.phy_num_uars equals 8,
if (uar->logic_idx > 0)
uar-> index = 0;
else
uar-> index =(uar->logic_idx - 1) %
(hr_dev->caps.phy_num_uars - 1) + 1;
In hip08, hr_dev->caps.phy_num_uars equals 1, uar-> index = 0;
Regards
Wei Hu
>> if (!dev_is_pci(hr_dev->dev)) {
>> res = platform_get_resource(hr_dev->pdev, IORESOURCE_MEM, 0);
>> @@ -132,7 +134,7 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
>>
>> void hns_roce_uar_free(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
>> {
>> - hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->index,
>> + hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->logic_idx,
>> BITMAP_NO_RR);
>> }
>>
>> --
>> 1.9.1
>>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust
2018-05-23 6:49 ` Wei Hu (Xavier)
@ 2018-05-23 7:00 ` Leon Romanovsky
2018-05-23 7:12 ` Wei Hu (Xavier)
0 siblings, 1 reply; 30+ messages in thread
From: Leon Romanovsky @ 2018-05-23 7:00 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
[-- Attachment #1: Type: text/plain, Size: 3583 bytes --]
On Wed, May 23, 2018 at 02:49:35PM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2018/5/23 14:05, Leon Romanovsky wrote:
> > On Thu, May 17, 2018 at 04:02:50PM +0800, Wei Hu (Xavier) wrote:
> >> This patch modified uar allocation algorithm in hns_roce_uar_alloc
> >> function to avoid bitmap exhaust.
> >>
> >> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> >> ---
> >> drivers/infiniband/hw/hns/hns_roce_device.h | 1 +
> >> drivers/infiniband/hw/hns/hns_roce_pd.c | 10 ++++++----
> >> 2 files changed, 7 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
> >> index 53c2f1b..412297d4 100644
> >> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
> >> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
> >> @@ -214,6 +214,7 @@ enum {
> >> struct hns_roce_uar {
> >> u64 pfn;
> >> unsigned long index;
> >> + unsigned long logic_idx;
> >> };
> >>
> >> struct hns_roce_ucontext {
> >> diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c b/drivers/infiniband/hw/hns/hns_roce_pd.c
> >> index 4b41e04..b9f2c87 100644
> >> --- a/drivers/infiniband/hw/hns/hns_roce_pd.c
> >> +++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
> >> @@ -107,13 +107,15 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
> >> int ret = 0;
> >>
> >> /* Using bitmap to manager UAR index */
> >> - ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->index);
> >> + ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
> >> if (ret == -1)
> >> return -ENOMEM;
> >>
> >> - if (uar->index > 0)
> >> - uar->index = (uar->index - 1) %
> >> + if (uar->logic_idx > 0 && hr_dev->caps.phy_num_uars > 1)
> >> + uar->index = (uar->logic_idx - 1) %
> >> (hr_dev->caps.phy_num_uars - 1) + 1;
> >> + else
> >> + uar->index = 0;
> >>
> > Sorry, but maybe I didn't understand this change fully, but logic_idx is
> > not initialized at all and one of two (needs to check your uar
> > allocation): the logic_idx is always zero -> index will be zero too,
> > or logic_idx is random variable -> index will be random too.
> >
> > What did you want to do?
> >
> Hi, Leon
>
> The prototype of hns_roce_bitmap_alloc as belows:
> int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap,
> unsigned long *obj);
> In this statement, we evaluate uar->logic_idx.
> ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap,
> &uar->logic_idx);
>
> In hip06, hr_dev->caps.phy_num_uars equals 8,
> if (uar->logic_idx > 0)
> uar-> index = 0;
> else
> uar-> index =(uar->logic_idx - 1) %
> (hr_dev->caps.phy_num_uars - 1) + 1;
> In hip08, hr_dev->caps.phy_num_uars equals 1, uar-> index = 0;
>
> Regards
Where did you change/set logic_idx?
Thanks
> Wei Hu
> >> if (!dev_is_pci(hr_dev->dev)) {
> >> res = platform_get_resource(hr_dev->pdev, IORESOURCE_MEM, 0);
> >> @@ -132,7 +134,7 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
> >>
> >> void hns_roce_uar_free(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
> >> {
> >> - hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->index,
> >> + hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->logic_idx,
> >> BITMAP_NO_RR);
> >> }
> >>
> >> --
> >> 1.9.1
> >>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust
2018-05-23 7:00 ` Leon Romanovsky
@ 2018-05-23 7:12 ` Wei Hu (Xavier)
2018-05-23 7:22 ` Leon Romanovsky
0 siblings, 1 reply; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 7:12 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 15:00, Leon Romanovsky wrote:
> On Wed, May 23, 2018 at 02:49:35PM +0800, Wei Hu (Xavier) wrote:
>>
>> On 2018/5/23 14:05, Leon Romanovsky wrote:
>>> On Thu, May 17, 2018 at 04:02:50PM +0800, Wei Hu (Xavier) wrote:
>>>> This patch modified uar allocation algorithm in hns_roce_uar_alloc
>>>> function to avoid bitmap exhaust.
>>>>
>>>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>>>> ---
>>>> drivers/infiniband/hw/hns/hns_roce_device.h | 1 +
>>>> drivers/infiniband/hw/hns/hns_roce_pd.c | 10 ++++++----
>>>> 2 files changed, 7 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
>>>> index 53c2f1b..412297d4 100644
>>>> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
>>>> @@ -214,6 +214,7 @@ enum {
>>>> struct hns_roce_uar {
>>>> u64 pfn;
>>>> unsigned long index;
>>>> + unsigned long logic_idx;
>>>> };
>>>>
>>>> struct hns_roce_ucontext {
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c b/drivers/infiniband/hw/hns/hns_roce_pd.c
>>>> index 4b41e04..b9f2c87 100644
>>>> --- a/drivers/infiniband/hw/hns/hns_roce_pd.c
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
>>>> @@ -107,13 +107,15 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
>>>> int ret = 0;
>>>>
>>>> /* Using bitmap to manager UAR index */
>>>> - ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->index);
>>>> + ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
>>>> if (ret == -1)
>>>> return -ENOMEM;
>>>>
>>>> - if (uar->index > 0)
>>>> - uar->index = (uar->index - 1) %
>>>> + if (uar->logic_idx > 0 && hr_dev->caps.phy_num_uars > 1)
>>>> + uar->index = (uar->logic_idx - 1) %
>>>> (hr_dev->caps.phy_num_uars - 1) + 1;
>>>> + else
>>>> + uar->index = 0;
>>>>
>>> Sorry, but maybe I didn't understand this change fully, but logic_idx is
>>> not initialized at all and one of two (needs to check your uar
>>> allocation): the logic_idx is always zero -> index will be zero too,
>>> or logic_idx is random variable -> index will be random too.
>>>
>>> What did you want to do?
>>>
>> Hi, Leon
>>
>> The prototype of hns_roce_bitmap_alloc as belows:
>> int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap,
>> unsigned long *obj);
>> In this statement, we evaluate uar->logic_idx.
>> ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap,
>> &uar->logic_idx);
>>
>> In hip06, hr_dev->caps.phy_num_uars equals 8,
>> if (uar->logic_idx > 0)
>> uar-> index = 0;
>> else
>> uar-> index =(uar->logic_idx - 1) %
>> (hr_dev->caps.phy_num_uars - 1) + 1;
>> In hip08, hr_dev->caps.phy_num_uars equals 1, uar-> index = 0;
>>
>> Regards
> Where did you change/set logic_idx?
In hns_roce_uar_alloc,
ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
In hns_roce_uar_free,
hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->logic_idx,
BITMAP_NO_RR);
Thanks
> Thanks
>
>
>> Wei Hu
>>>> if (!dev_is_pci(hr_dev->dev)) {
>>>> res = platform_get_resource(hr_dev->pdev, IORESOURCE_MEM, 0);
>>>> @@ -132,7 +134,7 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
>>>>
>>>> void hns_roce_uar_free(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
>>>> {
>>>> - hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->index,
>>>> + hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->logic_idx,
>>>> BITMAP_NO_RR);
>>>> }
>>>>
>>>> --
>>>> 1.9.1
>>>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust
2018-05-23 7:12 ` Wei Hu (Xavier)
@ 2018-05-23 7:22 ` Leon Romanovsky
0 siblings, 0 replies; 30+ messages in thread
From: Leon Romanovsky @ 2018-05-23 7:22 UTC (permalink / raw)
To: Wei Hu (Xavier)
Cc: dledford, jgg, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
[-- Attachment #1: Type: text/plain, Size: 3410 bytes --]
On Wed, May 23, 2018 at 03:12:45PM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2018/5/23 15:00, Leon Romanovsky wrote:
> > On Wed, May 23, 2018 at 02:49:35PM +0800, Wei Hu (Xavier) wrote:
> >>
> >> On 2018/5/23 14:05, Leon Romanovsky wrote:
> >>> On Thu, May 17, 2018 at 04:02:50PM +0800, Wei Hu (Xavier) wrote:
> >>>> This patch modified uar allocation algorithm in hns_roce_uar_alloc
> >>>> function to avoid bitmap exhaust.
> >>>>
> >>>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
> >>>> ---
> >>>> drivers/infiniband/hw/hns/hns_roce_device.h | 1 +
> >>>> drivers/infiniband/hw/hns/hns_roce_pd.c | 10 ++++++----
> >>>> 2 files changed, 7 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
> >>>> index 53c2f1b..412297d4 100644
> >>>> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
> >>>> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
> >>>> @@ -214,6 +214,7 @@ enum {
> >>>> struct hns_roce_uar {
> >>>> u64 pfn;
> >>>> unsigned long index;
> >>>> + unsigned long logic_idx;
> >>>> };
> >>>>
> >>>> struct hns_roce_ucontext {
> >>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c b/drivers/infiniband/hw/hns/hns_roce_pd.c
> >>>> index 4b41e04..b9f2c87 100644
> >>>> --- a/drivers/infiniband/hw/hns/hns_roce_pd.c
> >>>> +++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
> >>>> @@ -107,13 +107,15 @@ int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
> >>>> int ret = 0;
> >>>>
> >>>> /* Using bitmap to manager UAR index */
> >>>> - ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->index);
> >>>> + ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
> >>>> if (ret == -1)
> >>>> return -ENOMEM;
> >>>>
> >>>> - if (uar->index > 0)
> >>>> - uar->index = (uar->index - 1) %
> >>>> + if (uar->logic_idx > 0 && hr_dev->caps.phy_num_uars > 1)
> >>>> + uar->index = (uar->logic_idx - 1) %
> >>>> (hr_dev->caps.phy_num_uars - 1) + 1;
> >>>> + else
> >>>> + uar->index = 0;
> >>>>
> >>> Sorry, but maybe I didn't understand this change fully, but logic_idx is
> >>> not initialized at all and one of two (needs to check your uar
> >>> allocation): the logic_idx is always zero -> index will be zero too,
> >>> or logic_idx is random variable -> index will be random too.
> >>>
> >>> What did you want to do?
> >>>
> >> Hi, Leon
> >>
> >> The prototype of hns_roce_bitmap_alloc as belows:
> >> int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap,
> >> unsigned long *obj);
> >> In this statement, we evaluate uar->logic_idx.
> >> ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap,
> >> &uar->logic_idx);
> >>
> >> In hip06, hr_dev->caps.phy_num_uars equals 8,
> >> if (uar->logic_idx > 0)
> >> uar-> index = 0;
> >> else
> >> uar-> index =(uar->logic_idx - 1) %
> >> (hr_dev->caps.phy_num_uars - 1) + 1;
> >> In hip08, hr_dev->caps.phy_num_uars equals 1, uar-> index = 0;
> >>
> >> Regards
> > Where did you change/set logic_idx?
> In hns_roce_uar_alloc,
> ret = hns_roce_bitmap_alloc(&hr_dev->uar_table.bitmap, &uar->logic_idx);
> In hns_roce_uar_free,
> hns_roce_bitmap_free(&hr_dev->uar_table.bitmap, uar->logic_idx,
> BITMAP_NO_RR);
>
I see it, thanks
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API
2018-05-22 20:21 ` Jason Gunthorpe
@ 2018-05-23 9:33 ` Wei Hu (Xavier)
0 siblings, 0 replies; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 9:33 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 4:21, Jason Gunthorpe wrote:
> On Sat, May 19, 2018 at 04:24:40PM +0800, Wei Hu (Xavier) wrote:
>>
>> On 2018/5/17 23:00, Jason Gunthorpe wrote:
>>> On Thu, May 17, 2018 at 04:02:49PM +0800, Wei Hu (Xavier) wrote:
>>>> This patch Implements the IB core disassociate_ucontext API.
>>>>
>>>> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
>>>> drivers/infiniband/hw/hns/hns_roce_main.c | 36 +++++++++++++++++++++++++++++++
>>>> 1 file changed, 36 insertions(+)
>>>>
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> index 96fb6a9..7fafe9d 100644
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> @@ -33,6 +33,9 @@
>>>> #include <linux/acpi.h>
>>>> #include <linux/of_platform.h>
>>>> #include <linux/module.h>
>>>> +#include <linux/sched.h>
>>>> +#include <linux/sched/mm.h>
>>>> +#include <linux/sched/task.h>
>>>> #include <rdma/ib_addr.h>
>>>> #include <rdma/ib_smi.h>
>>>> #include <rdma/ib_user_verbs.h>
>>>> @@ -422,6 +425,38 @@ static int hns_roce_port_immutable(struct ib_device *ib_dev, u8 port_num,
>>>> return 0;
>>>> }
>>>>
>>>> +static void hns_roce_disassociate_ucontext(struct ib_ucontext *ibcontext)
>>>> +{
>>>> + struct task_struct *process;
>>>> + struct mm_struct *mm;
>>>> +
>>>> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
>>>> + if (!process)
>>>> + return;
>>>> +
>>>> + mm = get_task_mm(process);
>>>> + if (!mm) {
>>>> + pr_info("no mm, disassociate ucontext is pending task termination\n");
>>>> + while (1) {
>>>> + put_task_struct(process);
>>>> + usleep_range(1000, 2000);
>>>> + process = get_pid_task(ibcontext->tgid, PIDTYPE_PID);
>>>> + if (!process || process->state == TASK_DEAD) {
>>>> + pr_info("disassociate ucontext done, task was terminated\n");
>>>> + /* if task was dead, need to release the task
>>>> + * struct.
>>>> + */
>>>> + if (process)
>>>> + put_task_struct(process);
>>>> + return;
>>>> + }
>>>> + }
>>>> + }
>>> I don't want to see this boilerplate code copied into every
>>> driver. Hoist it into the core code, have the disassociate driver callback
>>> accept a mm_struct parameter, and refactor the other drivers using this.
>> When the userspace RDMA application process is suspended for some reason
>> without executing ibv_close_device function,
>> There will be calltrace as follows when rmmod roce kernel driver ko in
>> the current version.
>> It looks like a common problem to every driver and the code segment
>> above is suitable for every driver.
>> Pardon me for asking, but if you have any plan to do this?
> My plan is to ask you to do it :)
Hi, Jason
I will pull this patch out of the series and send V2.
We will think how to hoist it into the core code later.
Thanks
Regards
Wei Hu
> Jason
>
> .
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
2018-05-23 3:47 ` Jason Gunthorpe
@ 2018-05-23 9:35 ` Wei Hu (Xavier)
0 siblings, 0 replies; 30+ messages in thread
From: Wei Hu (Xavier) @ 2018-05-23 9:35 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford, linux-rdma, linux-kernel, xavier.huwei, lijun_nudt
On 2018/5/23 11:47, Jason Gunthorpe wrote:
> On Wed, May 23, 2018 at 10:54:54AM +0800, Wei Hu (Xavier) wrote:
>>
>> On 2018/5/23 4:26, Jason Gunthorpe wrote:
>>> On Fri, May 18, 2018 at 03:23:00PM +0800, Wei Hu (Xavier) wrote:
>>>> On 2018/5/18 12:15, Jason Gunthorpe wrote:
>>>>> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
>>>>>> On 2018/5/17 23:14, Jason Gunthorpe wrote:
>>>>>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
>>>>>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>>>>>> index 86ef15f..e1c44a6 100644
>>>>>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>>>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
>>>>>>>> int ret = 0;
>>>>>>>> int ntc;
>>>>>>>>
>>>>>>>> + if (hr_dev->is_reset)
>>>>>>>> + return 0;
>>>>>>>> +
>>>>>>>> spin_lock_bh(&csq->lock);
>>>>>>>>
>>>>>>>> if (num > hns_roce_cmq_space(csq)) {
>>>>>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
>>>>>>>> return 0;
>>>>>>>>
>>>>>>>> error_failed_get_cfg:
>>>>>>>> + handle->priv = NULL;
>>>>>>>> kfree(hr_dev->priv);
>>>>>>>>
>>>>>>>> error_failed_kzalloc:
>>>>>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>>>>>>>> {
>>>>>>>> struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
>>>>>>>>
>>>>>>>> + if (!hr_dev)
>>>>>>>> + return;
>>>>>>>> +
>>>>>>>> hns_roce_exit(hr_dev);
>>>>>>>> + handle->priv = NULL;
>>>>>>>> kfree(hr_dev->priv);
>>>>>>>> ib_dealloc_device(&hr_dev->ib_dev);
>>>>>>>> }
>>>>>>> Why are these hunks here? If init fails then uninit should not be
>>>>>>> called, so why meddle with priv?
>>>>>> In hns_roce_hw_v2_init_instance function, we evaluate handle->priv with
>>>>>> hr_dev,
>>>>>> We want clear the value in hns_roce_hw_v2_uninit_instance function.
>>>>>> So we can ensure no problem in RoCE driver.
>>>>> What problem could happen?
>>>>>
>>>>> I keep removing unnecessary sets to null and checks of null, so please
>>>>> don't add them if they cannot happen.
>>>>>
>>>>> Eg uninit should never be called with a null priv, that is a serious
>>>>> logic mis-design someplace if it happens.
>>>>>
>>>>> Jason
>>>> NIC driver call the registered reset_notify() function to finish the
>>>> part of RoCE reset process.
>>>> In RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT,
>>>> we call hns_roce_hw_v2_uninit_instance(handle, false) to release the
>>>> resources.
>>>> when hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call
>>>> hns_roce_hw_v2_init_instance.
>>>> if hns_roce_hw_v2_init_instance failed, we should ensure no problem in
>>>> the other callback
>>>> function registered by RoCE driver.
>>> Don't design things like this.
>>>
>>> init/uninit are paired - do not call something uninit if it can be
>>> called after init fails, or better, arrange to prevent that so things
>>> are sane.
>>>
>>> Jason
>>>
>>> .
>> The current RoCE driver registered 3 callback function to NIC driver as
>> belows:
>> 1.init_instance/uninit_instance are paired.
>> 2.In reset_notify function, RoCE dirver still call
>> init_instance/uninit_instance function.
>> but NIC driver does not perceive the behavior. We need to judge in RoCE
>> driver.
Hi, Jason
I will send v2, thanks.
Regards
Wei Hu
> fix the nic driver
>
> Jason
>
> .
>
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2018-05-23 9:35 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-17 8:02 [PATCH rdma-next 0/5] Misc update for hns driver Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 1/5] RDMA/hns: Implement the disassociate_ucontext API Wei Hu (Xavier)
2018-05-17 15:00 ` Jason Gunthorpe
2018-05-19 8:24 ` Wei Hu (Xavier)
2018-05-22 20:21 ` Jason Gunthorpe
2018-05-23 9:33 ` Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 2/5] RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaust Wei Hu (Xavier)
2018-05-23 6:05 ` Leon Romanovsky
2018-05-23 6:49 ` Wei Hu (Xavier)
2018-05-23 7:00 ` Leon Romanovsky
2018-05-23 7:12 ` Wei Hu (Xavier)
2018-05-23 7:22 ` Leon Romanovsky
2018-05-17 8:02 ` [PATCH rdma-next 3/5] RDMA/hns: Increase checking CMQ status timeout value Wei Hu (Xavier)
2018-05-23 5:49 ` Leon Romanovsky
2018-05-23 6:09 ` Wei Hu (Xavier)
2018-05-23 6:15 ` Wei Hu (Xavier)
2018-05-23 6:23 ` Leon Romanovsky
2018-05-17 8:02 ` [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08 Wei Hu (Xavier)
2018-05-17 15:14 ` Jason Gunthorpe
2018-05-18 3:28 ` Wei Hu (Xavier)
2018-05-18 4:15 ` Jason Gunthorpe
2018-05-18 7:23 ` Wei Hu (Xavier)
2018-05-22 20:26 ` Jason Gunthorpe
2018-05-23 2:54 ` Wei Hu (Xavier)
2018-05-23 3:47 ` Jason Gunthorpe
2018-05-23 9:35 ` Wei Hu (Xavier)
2018-05-23 3:49 ` Wei Hu (Xavier)
2018-05-17 8:02 ` [PATCH rdma-next 5/5] RDMA/hns: Fix the illegal memory operation when cross page Wei Hu (Xavier)
2018-05-23 6:17 ` Leon Romanovsky
2018-05-23 6:38 ` Wei Hu (Xavier)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).