LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [syzbot] possible deadlock in br_ioctl_call
@ 2021-08-01 10:34 syzbot
       [not found] ` <20210801131406.1750-1-hdanton@sina.com>
  0 siblings, 1 reply; 3+ messages in thread
From: syzbot @ 2021-08-01 10:34 UTC (permalink / raw)
  To: andrii, ast, bpf, daniel, davem, john.fastabend, kafai, kpsingh,
	kuba, linux-kernel, netdev, songliubraving, syzkaller-bugs, yhs

Hello,

syzbot found the following issue on:

HEAD commit:    3bdc70669eb2 Merge branch 'devlink-register'
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=11ee370a300000
kernel config:  https://syzkaller.appspot.com/x/.config?x=914a8107c0ffdc14
dashboard link: https://syzkaller.appspot.com/bug?extid=34fe5894623c4ab1b379
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=114398c6300000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10d6d61a300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+34fe5894623c4ab1b379@syzkaller.appspotmail.com

netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor772/8460 is trying to acquire lock:
ffffffff8d0a9608 (br_ioctl_mutex){+.+.}-{3:3}, at: br_ioctl_call+0x3b/0xa0 net/socket.c:1089

but task is already holding lock:
ffffffff8d0cb568 (rtnl_mutex){+.+.}-{3:3}, at: dev_ioctl+0x1a7/0xee0 net/core/dev_ioctl.c:579

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:959 [inline]
       __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104
       register_netdev+0x11/0x50 net/core/dev.c:10474
       br_add_bridge+0x97/0xf0 net/bridge/br_if.c:459
       br_ioctl_stub+0x750/0x7f0 net/bridge/br_ioctl.c:390
       br_ioctl_call+0x5e/0xa0 net/socket.c:1091
       sock_ioctl+0x30c/0x640 net/socket.c:1185
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:1069 [inline]
       __se_sys_ioctl fs/ioctl.c:1055 [inline]
       __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (br_ioctl_mutex){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3051 [inline]
       check_prevs_add kernel/locking/lockdep.c:3174 [inline]
       validate_chain kernel/locking/lockdep.c:3789 [inline]
       __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5015
       lock_acquire kernel/locking/lockdep.c:5625 [inline]
       lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
       __mutex_lock_common kernel/locking/mutex.c:959 [inline]
       __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104
       br_ioctl_call+0x3b/0xa0 net/socket.c:1089
       dev_ifsioc+0xc1f/0xf60 net/core/dev_ioctl.c:382
       dev_ioctl+0x1b9/0xee0 net/core/dev_ioctl.c:580
       sock_do_ioctl+0x18b/0x210 net/socket.c:1128
       sock_ioctl+0x2f1/0x640 net/socket.c:1231
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:1069 [inline]
       __se_sys_ioctl fs/ioctl.c:1055 [inline]
       __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock(br_ioctl_mutex);
                               lock(rtnl_mutex);
  lock(br_ioctl_mutex);

 *** DEADLOCK ***

1 lock held by syz-executor772/8460:
 #0: ffffffff8d0cb568 (rtnl_mutex){+.+.}-{3:3}, at: dev_ioctl+0x1a7/0xee0 net/core/dev_ioctl.c:579

stack backtrace:
CPU: 0 PID: 8460 Comm: syz-executor772 Not tainted 5.14.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
 check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2131
 check_prev_add kernel/locking/lockdep.c:3051 [inline]
 check_prevs_add kernel/locking/lockdep.c:3174 [inline]
 validate_chain kernel/locking/lockdep.c:3789 [inline]
 __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5015
 lock_acquire kernel/locking/lockdep.c:5625 [inline]
 lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
 __mutex_lock_common kernel/locking/mutex.c:959 [inline]
 __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104
 br_ioctl_call+0x3b/0xa0 net/socket.c:1089
 dev_ifsioc+0xc1f/0xf60 net/core/dev_ioctl.c:382
 dev_ioctl+0x1b9/0xee0 net/core/dev_ioctl.c:580
 sock_do_ioctl+0x18b/0x210 net/socket.c:1128
 sock_ioctl+0x2f1/0x640 net/socket.c:1231
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:1069 [inline]
 __se_sys_ioctl fs/ioctl.c:1055 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4431f9
Code: 28 c3 e8 4a 15 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd0ab19648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffd0ab19658 RCX: 00000000004431f9
RDX: 0000000020000000 RSI: 00000000000089a2 RDI: 0000000000000004
RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd0ab19660
R13


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [syzbot] possible deadlock in br_ioctl_call
       [not found] ` <20210801131406.1750-1-hdanton@sina.com>
@ 2021-08-02  8:29   ` Nikolay Aleksandrov
  2021-08-02  8:40     ` Arnd Bergmann
  0 siblings, 1 reply; 3+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-02  8:29 UTC (permalink / raw)
  To: Hillf Danton, syzbot
  Cc: Arnd Bergmann, bridge, linux-kernel, netdev, syzkaller-bugs

On 01/08/2021 16:14, Hillf Danton wrote:
> On Sun, 01 Aug 2021 03:34:24 -0700
>> syzbot found the following issue on:
>>
>> HEAD commit:    3bdc70669eb2 Merge branch 'devlink-register'
>> git tree:       net-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=11ee370a300000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=914a8107c0ffdc14
>> dashboard link: https://syzkaller.appspot.com/bug?extid=34fe5894623c4ab1b379
>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=114398c6300000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10d6d61a300000
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+34fe5894623c4ab1b379@syzkaller.appspotmail.com
>>
>> netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
>> netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
>> netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 5.14.0-rc2-syzkaller #0 Not tainted
>> ------------------------------------------------------
>> syz-executor772/8460 is trying to acquire lock:
>> ffffffff8d0a9608 (br_ioctl_mutex){+.+.}-{3:3}, at: br_ioctl_call+0x3b/0xa0 net/socket.c:1089
>>
>> but task is already holding lock:
>> ffffffff8d0cb568 (rtnl_mutex){+.+.}-{3:3}, at: dev_ioctl+0x1a7/0xee0 net/core/dev_ioctl.c:579
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (rtnl_mutex){+.+.}-{3:3}:
>>        __mutex_lock_common kernel/locking/mutex.c:959 [inline]
>>        __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104
>>        register_netdev+0x11/0x50 net/core/dev.c:10474
>>        br_add_bridge+0x97/0xf0 net/bridge/br_if.c:459
>>        br_ioctl_stub+0x750/0x7f0 net/bridge/br_ioctl.c:390
>>        br_ioctl_call+0x5e/0xa0 net/socket.c:1091
>>        sock_ioctl+0x30c/0x640 net/socket.c:1185
>>        vfs_ioctl fs/ioctl.c:51 [inline]
>>        __do_sys_ioctl fs/ioctl.c:1069 [inline]
>>        __se_sys_ioctl fs/ioctl.c:1055 [inline]
>>        __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
>>        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>        do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>>        entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> -> #0 (br_ioctl_mutex){+.+.}-{3:3}:
>>        check_prev_add kernel/locking/lockdep.c:3051 [inline]
>>        check_prevs_add kernel/locking/lockdep.c:3174 [inline]
>>        validate_chain kernel/locking/lockdep.c:3789 [inline]
>>        __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5015
>>        lock_acquire kernel/locking/lockdep.c:5625 [inline]
>>        lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
>>        __mutex_lock_common kernel/locking/mutex.c:959 [inline]
>>        __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104
>>        br_ioctl_call+0x3b/0xa0 net/socket.c:1089
>>        dev_ifsioc+0xc1f/0xf60 net/core/dev_ioctl.c:382
>>        dev_ioctl+0x1b9/0xee0 net/core/dev_ioctl.c:580
>>        sock_do_ioctl+0x18b/0x210 net/socket.c:1128
>>        sock_ioctl+0x2f1/0x640 net/socket.c:1231
>>        vfs_ioctl fs/ioctl.c:51 [inline]
>>        __do_sys_ioctl fs/ioctl.c:1069 [inline]
>>        __se_sys_ioctl fs/ioctl.c:1055 [inline]
>>        __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
>>        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>        do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>>        entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> other info that might help us debug this:
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(rtnl_mutex);
>>                                lock(br_ioctl_mutex);
>>                                lock(rtnl_mutex);
>>   lock(br_ioctl_mutex);
>>
>>  *** DEADLOCK ***
> 
> Fix it by doing bridge ioctl outside rtnl lock after checking netdev present
> and bumping up its reference. Recheck netdev state (or take rtnl lock) after
> acquiring br_ioctl_mutex with a stable netdev.
> 
> Now only for thoughts.
> 
> +++ x/net/core/dev_ioctl.c
> @@ -379,7 +379,12 @@ static int dev_ifsioc(struct net *net, s
>  	case SIOCBRDELIF:
>  		if (!netif_device_present(dev))
>  			return -ENODEV;
> -		return br_ioctl_call(net, netdev_priv(dev), cmd, ifr, NULL);
> +		dev_hold(dev);
> +		rtnl_unlock();
> +		err = br_ioctl_call(net, netdev_priv(dev), cmd, ifr, NULL);
> +		dev_put(dev);
> +		rtnl_lock();
> +		return err;
>  
>  	case SIOCSHWTSTAMP:
>  		err = net_hwtstamp_validate(ifr);
> 

Thanks, but it will need more work, the bridge ioctl calls were divided in two parts
before: one was deviceless called by sock_ioctl and didn't expect rtnl to be held, the other was
with a device called by dev_ifsioc() and expected rtnl to be held.
Then ad2f99aedf8f ("net: bridge: move bridge ioctls out of .ndo_do_ioctl")
united them in a single ioctl stub, but didn't take care of the locking expectations.
For sock_ioctl now we acquire  (1) br_ioctl_mutex, (2) rtnl and for dev_ifsioc we
acquire (1) rtnl, (2) br_ioctl_mutex as the lockdep warning has demonstrated.

That fix above can work if rtnl gets reacquired by the ioctl in the proper switch cases.
To avoid playing even more locking games it'd probably be best to always acquire and
release rtnl by the bridge ioctl which will need a bit more work.

Arnd, should I take care of it?

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [syzbot] possible deadlock in br_ioctl_call
  2021-08-02  8:29   ` Nikolay Aleksandrov
@ 2021-08-02  8:40     ` Arnd Bergmann
  0 siblings, 0 replies; 3+ messages in thread
From: Arnd Bergmann @ 2021-08-02  8:40 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Hillf Danton, syzbot, Arnd Bergmann, bridge,
	Linux Kernel Mailing List, Networking, syzkaller-bugs

On Mon, Aug 2, 2021 at 10:30 AM Nikolay Aleksandrov <nikolay@nvidia.com> wrote:
> On 01/08/2021 16:14, Hillf Danton wrote:
> > On Sun, 01 Aug 2021 03:34:24 -0700
> >> syzbot found the following issue on:
>
> Thanks, but it will need more work, the bridge ioctl calls were divided in two parts
> before: one was deviceless called by sock_ioctl and didn't expect rtnl to be held, the other was
> with a device called by dev_ifsioc() and expected rtnl to be held.
> Then ad2f99aedf8f ("net: bridge: move bridge ioctls out of .ndo_do_ioctl")
> united them in a single ioctl stub, but didn't take care of the locking expectations.
> For sock_ioctl now we acquire  (1) br_ioctl_mutex, (2) rtnl and for dev_ifsioc we
> acquire (1) rtnl, (2) br_ioctl_mutex as the lockdep warning has demonstrated.

Right, sorry about causing problems here.

> That fix above can work if rtnl gets reacquired by the ioctl in the proper switch cases.
> To avoid playing even more locking games it'd probably be best to always acquire and
> release rtnl by the bridge ioctl which will need a bit more work.
>
> Arnd, should I take care of it?

That would be best I think. As you have already analyzed the problem and come
up with a possible solution, I'm sure you will get to a better fix
more quickly than
I would.

Thanks,

       Arnd

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-08-02  8:41 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-01 10:34 [syzbot] possible deadlock in br_ioctl_call syzbot
     [not found] ` <20210801131406.1750-1-hdanton@sina.com>
2021-08-02  8:29   ` Nikolay Aleksandrov
2021-08-02  8:40     ` Arnd Bergmann

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox