Netdev Archive on
help / color / mirror / Atom feed
From: Karsten Graul <>
To: Wen Gu <>,,
Subject: Re: [PATCH net] net/smc: Avoid setting clcsock options after clcsock released
Date: Wed, 12 Jan 2022 10:38:27 +0100	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On 11/01/2022 17:34, Wen Gu wrote:
> Thanks for your reply.
> On 2022/1/11 6:03 pm, Karsten Graul wrote:
>> On 10/01/2022 10:38, Wen Gu wrote:
>>> We encountered a crash in smc_setsockopt() and it is caused by
>>> accessing smc->clcsock after clcsock was released.
>> In the switch() the function smc_switch_to_fallback() might be called which also
>> accesses smc->clcsock without further checking. This should also be protected then?
>> Also from all callers of smc_switch_to_fallback() ?
>> There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does this problem
>> happen in setsockopt() for you only? I suspect it depends on the test case.
> Yes, it depends on the test case. The crash described here only happens one time when
> I run a stress test of nginx/wrk, accompanied with frequent RNIC up/down operations.
> Considering accessing smc->clcsock after its release is an uncommon, low probability
> issue and only happens in setsockopt() in my test, I choce an simple way to fix it, using
> the existing clcsock_release_lock, and only check in smc_setsockopt() and smc_getsockopt().
>> I wonder if it makes sense to check and protect smc->clcsock at all places in the code where
>> it is used... as of now we had no such races like you encountered. But I see that in theory
>> this problem could also happen in other code areas.
> But inspired by your questions, I think maybe we should treat the race as a general problem?
> Do you think it is necessary to find all the potential race related to the clcsock release and
> fix them in a unified approach? like define smc->clcsock as RCU pointer, hold rcu read lock
> before accessing smc->clcsock and call synchronize_rcu() before resetting smc->clcsock? just a rough idea :)
> Or we should decide it later, do some more tests to see the probability of recurrence of this problem?

I like the idea to use RCU with rcu_assign_pointer() to protect that pointer!

Lets go with your initial patch (improved to address the access in smc_switch_to_fallback())
for now because it solves your current problem. 

I put that RCU thing on our list, but if either of us here starts working on that please let the
others know so we don't end up doing parallel work on this. But I doubt that we will be able to start working
on that soon.

Thanks for the good idea!

  reply	other threads:[~2022-01-12  9:40 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-10  9:38 [PATCH net] net/smc: Avoid setting clcsock options after clcsock released Wen Gu
2022-01-11 10:03 ` Karsten Graul
2022-01-11 16:34   ` Wen Gu
2022-01-12  9:38     ` Karsten Graul [this message]
2022-01-13  8:23       ` Wen Gu
2022-01-13 15:15       ` Wen Gu
2022-01-11 18:14 ` Jakub Kicinski
2022-01-12  3:32   ` Wen Gu
2022-01-12  7:11 `
2022-01-12  8:16   ` Wen Gu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).