Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH net] sctp: not disable bh in the whole sctp_get_port_local()
@ 2020-08-20  6:48 Xin Long
  2020-08-20 13:13 ` Eric Dumazet
  0 siblings, 1 reply; 3+ messages in thread
From: Xin Long @ 2020-08-20  6:48 UTC (permalink / raw)
  To: network dev, linux-sctp
  Cc: davem, Eric Dumazet, Marcelo Ricardo Leitner, Neil Horman

With disabling bh in the whole sctp_get_port_local(), when
snum == 0 and too many ports have been used, the do-while
loop will take the cpu for a long time and cause cpu stuck:

  [ ] watchdog: BUG: soft lockup - CPU#11 stuck for 22s!
  [ ] RIP: 0010:native_queued_spin_lock_slowpath+0x4de/0x940
  [ ] Call Trace:
  [ ]  _raw_spin_lock+0xc1/0xd0
  [ ]  sctp_get_port_local+0x527/0x650 [sctp]
  [ ]  sctp_do_bind+0x208/0x5e0 [sctp]
  [ ]  sctp_autobind+0x165/0x1e0 [sctp]
  [ ]  sctp_connect_new_asoc+0x355/0x480 [sctp]
  [ ]  __sctp_connect+0x360/0xb10 [sctp]

There's no need to disable bh in the whole function of
sctp_get_port_local. So fix this cpu stuck by removing
local_bh_disable() called at the beginning, and using
spin_lock_bh() instead.

The same thing was actually done for inet_csk_get_port() in
Commit ea8add2b1903 ("tcp/dccp: better use of ephemeral
ports in bind()").

Thanks to Marcelo for pointing the buggy code out.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/socket.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index ec1fba1..f0f9956 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -8060,8 +8060,6 @@ static int sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 
 	pr_debug("%s: begins, snum:%d\n", __func__, snum);
 
-	local_bh_disable();
-
 	if (snum == 0) {
 		/* Search for an available port. */
 		int low, high, remaining, index;
@@ -8079,20 +8077,20 @@ static int sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 				continue;
 			index = sctp_phashfn(net, rover);
 			head = &sctp_port_hashtable[index];
-			spin_lock(&head->lock);
+			spin_lock_bh(&head->lock);
 			sctp_for_each_hentry(pp, &head->chain)
 				if ((pp->port == rover) &&
 				    net_eq(net, pp->net))
 					goto next;
 			break;
 		next:
-			spin_unlock(&head->lock);
+			spin_unlock_bh(&head->lock);
 		} while (--remaining > 0);
 
 		/* Exhausted local port range during search? */
 		ret = 1;
 		if (remaining <= 0)
-			goto fail;
+			return ret;
 
 		/* OK, here is the one we will use.  HEAD (the port
 		 * hash table list entry) is non-NULL and we hold it's
@@ -8107,7 +8105,7 @@ static int sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 		 * port iterator, pp being NULL.
 		 */
 		head = &sctp_port_hashtable[sctp_phashfn(net, snum)];
-		spin_lock(&head->lock);
+		spin_lock_bh(&head->lock);
 		sctp_for_each_hentry(pp, &head->chain) {
 			if ((pp->port == snum) && net_eq(pp->net, net))
 				goto pp_found;
@@ -8207,10 +8205,7 @@ static int sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 	ret = 0;
 
 fail_unlock:
-	spin_unlock(&head->lock);
-
-fail:
-	local_bh_enable();
+	spin_unlock_bh(&head->lock);
 	return ret;
 }
 
-- 
2.1.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net] sctp: not disable bh in the whole sctp_get_port_local()
  2020-08-20  6:48 [PATCH net] sctp: not disable bh in the whole sctp_get_port_local() Xin Long
@ 2020-08-20 13:13 ` Eric Dumazet
  2020-08-21  7:10   ` Xin Long
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Dumazet @ 2020-08-20 13:13 UTC (permalink / raw)
  To: Xin Long, network dev, linux-sctp
  Cc: davem, Eric Dumazet, Marcelo Ricardo Leitner, Neil Horman



On 8/19/20 11:48 PM, Xin Long wrote:
> With disabling bh in the whole sctp_get_port_local(), when
> snum == 0 and too many ports have been used, the do-while
> loop will take the cpu for a long time and cause cpu stuck:
> 
>   [ ] watchdog: BUG: soft lockup - CPU#11 stuck for 22s!
>   [ ] RIP: 0010:native_queued_spin_lock_slowpath+0x4de/0x940
>   [ ] Call Trace:
>   [ ]  _raw_spin_lock+0xc1/0xd0
>   [ ]  sctp_get_port_local+0x527/0x650 [sctp]
>   [ ]  sctp_do_bind+0x208/0x5e0 [sctp]
>   [ ]  sctp_autobind+0x165/0x1e0 [sctp]
>   [ ]  sctp_connect_new_asoc+0x355/0x480 [sctp]
>   [ ]  __sctp_connect+0x360/0xb10 [sctp]
> 
> There's no need to disable bh in the whole function of
> sctp_get_port_local. So fix this cpu stuck by removing
> local_bh_disable() called at the beginning, and using
> spin_lock_bh() instead.
> 
> The same thing was actually done for inet_csk_get_port() in
> Commit ea8add2b1903 ("tcp/dccp: better use of ephemeral
> ports in bind()").
> 
> Thanks to Marcelo for pointing the buggy code out.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: Ying Xu <yinxu@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---


Any reason you chose to not use a cond_resched() then ?

Clearly this function needs to yield, not only BH, but to other threads.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net] sctp: not disable bh in the whole sctp_get_port_local()
  2020-08-20 13:13 ` Eric Dumazet
@ 2020-08-21  7:10   ` Xin Long
  0 siblings, 0 replies; 3+ messages in thread
From: Xin Long @ 2020-08-21  7:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: network dev, linux-sctp, davem, Eric Dumazet,
	Marcelo Ricardo Leitner, Neil Horman

On Thu, Aug 20, 2020 at 9:13 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 8/19/20 11:48 PM, Xin Long wrote:
> > With disabling bh in the whole sctp_get_port_local(), when
> > snum == 0 and too many ports have been used, the do-while
> > loop will take the cpu for a long time and cause cpu stuck:
> >
> >   [ ] watchdog: BUG: soft lockup - CPU#11 stuck for 22s!
> >   [ ] RIP: 0010:native_queued_spin_lock_slowpath+0x4de/0x940
> >   [ ] Call Trace:
> >   [ ]  _raw_spin_lock+0xc1/0xd0
> >   [ ]  sctp_get_port_local+0x527/0x650 [sctp]
> >   [ ]  sctp_do_bind+0x208/0x5e0 [sctp]
> >   [ ]  sctp_autobind+0x165/0x1e0 [sctp]
> >   [ ]  sctp_connect_new_asoc+0x355/0x480 [sctp]
> >   [ ]  __sctp_connect+0x360/0xb10 [sctp]
> >
> > There's no need to disable bh in the whole function of
> > sctp_get_port_local. So fix this cpu stuck by removing
> > local_bh_disable() called at the beginning, and using
> > spin_lock_bh() instead.
> >
> > The same thing was actually done for inet_csk_get_port() in
> > Commit ea8add2b1903 ("tcp/dccp: better use of ephemeral
> > ports in bind()").
> >
> > Thanks to Marcelo for pointing the buggy code out.
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Reported-by: Ying Xu <yinxu@redhat.com>
> > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > ---
>
>
> Any reason you chose to not use a cond_resched() then ?
>
> Clearly this function needs to yield, not only BH, but to other threads.
>
Right, it explains why the cpu stuck is gone, but sctp_get_port_local()
still use the cpu quite a lot:


Thanks, will post v2.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-08-21  6:57 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-20  6:48 [PATCH net] sctp: not disable bh in the whole sctp_get_port_local() Xin Long
2020-08-20 13:13 ` Eric Dumazet
2020-08-21  7:10   ` Xin Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).