LKML Archive on lore.kernel.org
From: "Wei Hu (Xavier)" <xavier.huwei@huawei.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: <dledford@redhat.com>, <linux-rdma@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <xavier.huwei@tom.com>,
	<lijun_nudt@163.com>
Subject: Re: [PATCH rdma-next 4/5] RDMA/hns: Add reset process for RoCE in hip08
Date: Fri, 18 May 2018 15:23:00 +0800
Message-ID: <5AFE7F54.9030201@huawei.com>
In-Reply-To: <20180518041502.GS10842@ziepe.ca>



On 2018/5/18 12:15, Jason Gunthorpe wrote:
> On Fri, May 18, 2018 at 11:28:11AM +0800, Wei Hu (Xavier) wrote:
>>
>> On 2018/5/17 23:14, Jason Gunthorpe wrote:
>>> On Thu, May 17, 2018 at 04:02:52PM +0800, Wei Hu (Xavier) wrote:
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>> index 86ef15f..e1c44a6 100644
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>>>> @@ -774,6 +774,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
>>>>  	int ret = 0;
>>>>  	int ntc;
>>>>  
>>>> +	if (hr_dev->is_reset)
>>>> +		return 0;
>>>> +
>>>>  	spin_lock_bh(&csq->lock);
>>>>  
>>>>  	if (num > hns_roce_cmq_space(csq)) {
>>>> @@ -4790,6 +4793,7 @@ static int hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
>>>>  	return 0;
>>>>  
>>>>  error_failed_get_cfg:
>>>> +	handle->priv = NULL;
>>>>  	kfree(hr_dev->priv);
>>>>  
>>>>  error_failed_kzalloc:
>>>> @@ -4803,14 +4807,70 @@ static void hns_roce_hw_v2_uninit_instance(struct hnae3_handle *handle,
>>>>  {
>>>>  	struct hns_roce_dev *hr_dev = (struct hns_roce_dev *)handle->priv;
>>>>  
>>>> +	if (!hr_dev)
>>>> +		return;
>>>> +
>>>>  	hns_roce_exit(hr_dev);
>>>> +	handle->priv = NULL;
>>>>  	kfree(hr_dev->priv);
>>>>  	ib_dealloc_device(&hr_dev->ib_dev);
>>>>  }
>>> Why are these hunks here? If init fails then uninit should not be
>>> called, so why meddle with priv?
>> In hns_roce_hw_v2_init_instance(), we assign hr_dev to handle->priv.
>> We want to clear that value in hns_roce_hw_v2_uninit_instance(),
>> so that we can ensure there is no problem in the RoCE driver.
> What problem could happen?
>
> I keep removing unnecessary sets to null and checks of null, so please
> don't add them if they cannot happen.
>
> Eg uninit should never be called with a null priv, that is a serious
> logic mis-design someplace if it happens.
>
> Jason
The NIC driver calls the registered reset_notify() function to carry out
the RoCE part of the reset process.
In the RoCE driver, when hnae3_reset_notify_type is HNAE3_UNINIT_CLIENT,
we call hns_roce_hw_v2_uninit_instance(handle, false) to release the
resources.
When hnae3_reset_notify_type is HNAE3_INIT_CLIENT, we call
hns_roce_hw_v2_init_instance().
If hns_roce_hw_v2_init_instance() fails, we must make sure that the other
callback functions registered by the RoCE driver do not run into problems.
For example, on the next reset HNAE3_UNINIT_CLIENT is notified again, and
hns_roce_hw_v2_uninit_instance() is then called with the same handle.
 
The related RoCE driver:
static int hns_roce_hw_v2_reset_notify_uninit(struct hnae3_handle *handle)
{
    msleep(100);
    hns_roce_hw_v2_uninit_instance(handle, false);
    return 0;
}

static int hns_roce_hw_v2_reset_notify(struct hnae3_handle *handle,
                       enum hnae3_reset_notify_type type)
{
    int ret = 0;

    switch (type) {
    case HNAE3_DOWN_CLIENT:
        ret = hns_roce_hw_v2_reset_notify_down(handle);
        break;
    case HNAE3_INIT_CLIENT:
        ret = hns_roce_hw_v2_init_instance(handle);
        break;
    case HNAE3_UNINIT_CLIENT:
        ret = hns_roce_hw_v2_reset_notify_uninit(handle);
        break;
    default:
        break;
    }

    return ret;
}

The related NIC driver:

static int hclge_notify_roce_client(struct hclge_dev *hdev,
                    enum hnae3_reset_notify_type type)
{
    struct hnae3_client *client = hdev->roce_client;
    struct hnae3_handle *handle;
    int ret = 0;
    u16 i;

    if (!client)
        return 0;

    if (!client->ops->reset_notify)
        return -EOPNOTSUPP;

    for (i = 0; i < hdev->num_vmdq_vport + 1; i++) {
        handle = &hdev->vport[i].roce;
        ret = client->ops->reset_notify(handle, type);
        if (ret) {
            dev_err(&hdev->pdev->dev,
                "notify roce client failed %d", ret);
            return ret;
        }
    }

    return ret;
}

static void hclge_reset(struct hclge_dev *hdev)
{
    struct hnae3_handle *handle;

    /* perform reset of the stack & ae device for a client */
    handle = &hdev->vport[0].nic;

    hclge_notify_roce_client(hdev, HNAE3_DOWN_CLIENT);
    hclge_notify_roce_client(hdev, HNAE3_UNINIT_CLIENT);

    rtnl_lock();
    hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);

    if (!hclge_reset_wait(hdev)) {
        hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
        hclge_reset_ae_dev(hdev->ae_dev);
        hclge_notify_client(hdev, HNAE3_INIT_CLIENT);

        hclge_clear_reset_cause(hdev);
    } else {
        /* schedule again to check pending resets later */
        set_bit(hdev->reset_type, &hdev->reset_pending);
        hclge_reset_task_schedule(hdev);
    }

    hclge_notify_client(hdev, HNAE3_UP_CLIENT);
    handle->last_reset_time = jiffies;
    rtnl_unlock();

    hclge_notify_roce_client(hdev, HNAE3_INIT_CLIENT);
    hclge_notify_roce_client(hdev, HNAE3_UP_CLIENT);
}
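
The sequence described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not driver code:
struct model_handle, struct model_dev, the model_* functions and the two
counters are hypothetical stand-ins for struct hnae3_handle, struct
hns_roce_dev and the init/uninit instance callbacks. As in the
hclge_reset() excerpt, the result of the re-init is not checked by the
caller.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hnae3_handle. */
struct model_handle {
	void *priv;
};

/* Hypothetical stand-in for struct hns_roce_dev. */
struct model_dev {
	int placeholder;
};

static int teardown_calls;	/* uninit calls that released a device */
static int skipped_calls;	/* uninit calls that found priv == NULL */

/* Models init_instance(): on failure, priv is cleared before freeing. */
static int model_init_instance(struct model_handle *handle, bool fail)
{
	struct model_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return -1;

	handle->priv = dev;
	if (fail) {
		handle->priv = NULL;	/* no stale pointer is left behind */
		free(dev);
		return -1;
	}
	return 0;
}

/* Models uninit_instance(): returns early when there is no device. */
static void model_uninit_instance(struct model_handle *handle)
{
	struct model_dev *dev = handle->priv;

	if (!dev) {
		skipped_calls++;
		return;
	}

	teardown_calls++;
	handle->priv = NULL;
	free(dev);
}

/* Two resets in a row; the re-init of the first reset fails. */
static int model_two_resets(void)
{
	struct model_handle handle = { 0 };

	teardown_calls = 0;
	skipped_calls = 0;

	if (model_init_instance(&handle, false))	/* initial probe */
		return -1;

	/* Reset 1: uninit, then a failing init whose result is ignored. */
	model_uninit_instance(&handle);
	(void)model_init_instance(&handle, true);

	/* Reset 2: uninit sees priv == NULL and returns early. */
	model_uninit_instance(&handle);
	(void)model_init_instance(&handle, false);

	/* Final cleanup of the instance created during reset 2. */
	model_uninit_instance(&handle);

	return skipped_calls;
}
```

In this model, dropping the "handle->priv = NULL" on the failure path
would make the uninit call in reset 2 free the same pointer a second
time. The alternative design, in which uninit is never called after a
failed init, would need the caller to track the init result instead.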

Thanks, Jason


