LKML Archive on lore.kernel.org
From: Nikolay Aleksandrov <nikolay@nvidia.com>
To: Hillf Danton <hdanton@sina.com>,
	syzbot <syzbot+34fe5894623c4ab1b379@syzkaller.appspotmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>,
	bridge@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] possible deadlock in br_ioctl_call
Date: Mon, 2 Aug 2021 11:29:42 +0300
Message-ID: <6f05c1a9-801a-6174-048a-90688a23941d@nvidia.com>
In-Reply-To: <20210801131406.1750-1-hdanton@sina.com>

On 01/08/2021 16:14, Hillf Danton wrote:
> On Sun, 01 Aug 2021 03:34:24 -0700
>> syzbot found the following issue on:
>>
>> HEAD commit:    3bdc70669eb2 Merge branch 'devlink-register'
>> git tree:       net-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=11ee370a300000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=914a8107c0ffdc14
>> dashboard link: https://syzkaller.appspot.com/bug?extid=34fe5894623c4ab1b379
>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=114398c6300000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10d6d61a300000
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+34fe5894623c4ab1b379@syzkaller.appspotmail.com
>>
>> netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
>> netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
>> netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 5.14.0-rc2-syzkaller #0 Not tainted
>> ------------------------------------------------------
>> syz-executor772/8460 is trying to acquire lock:
>> ffffffff8d0a9608 (br_ioctl_mutex){+.+.}-{3:3}, at: br_ioctl_call+0x3b/0xa0 net/socket.c:1089
>>
>> but task is already holding lock:
>> ffffffff8d0cb568 (rtnl_mutex){+.+.}-{3:3}, at: dev_ioctl+0x1a7/0xee0 net/core/dev_ioctl.c:579
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (rtnl_mutex){+.+.}-{3:3}:
>>        __mutex_lock_common kernel/locking/mutex.c:959 [inline]
>>        __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104
>>        register_netdev+0x11/0x50 net/core/dev.c:10474
>>        br_add_bridge+0x97/0xf0 net/bridge/br_if.c:459
>>        br_ioctl_stub+0x750/0x7f0 net/bridge/br_ioctl.c:390
>>        br_ioctl_call+0x5e/0xa0 net/socket.c:1091
>>        sock_ioctl+0x30c/0x640 net/socket.c:1185
>>        vfs_ioctl fs/ioctl.c:51 [inline]
>>        __do_sys_ioctl fs/ioctl.c:1069 [inline]
>>        __se_sys_ioctl fs/ioctl.c:1055 [inline]
>>        __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
>>        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>        do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>>        entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> -> #0 (br_ioctl_mutex){+.+.}-{3:3}:
>>        check_prev_add kernel/locking/lockdep.c:3051 [inline]
>>        check_prevs_add kernel/locking/lockdep.c:3174 [inline]
>>        validate_chain kernel/locking/lockdep.c:3789 [inline]
>>        __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5015
>>        lock_acquire kernel/locking/lockdep.c:5625 [inline]
>>        lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
>>        __mutex_lock_common kernel/locking/mutex.c:959 [inline]
>>        __mutex_lock+0x12a/0x10a0 kernel/locking/mutex.c:1104
>>        br_ioctl_call+0x3b/0xa0 net/socket.c:1089
>>        dev_ifsioc+0xc1f/0xf60 net/core/dev_ioctl.c:382
>>        dev_ioctl+0x1b9/0xee0 net/core/dev_ioctl.c:580
>>        sock_do_ioctl+0x18b/0x210 net/socket.c:1128
>>        sock_ioctl+0x2f1/0x640 net/socket.c:1231
>>        vfs_ioctl fs/ioctl.c:51 [inline]
>>        __do_sys_ioctl fs/ioctl.c:1069 [inline]
>>        __se_sys_ioctl fs/ioctl.c:1055 [inline]
>>        __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:1055
>>        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>        do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>>        entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> other info that might help us debug this:
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(rtnl_mutex);
>>                                lock(br_ioctl_mutex);
>>                                lock(rtnl_mutex);
>>   lock(br_ioctl_mutex);
>>
>>  *** DEADLOCK ***
> 
> Fix it by doing bridge ioctl outside rtnl lock after checking netdev present
> and bumping up its reference. Recheck netdev state (or take rtnl lock) after
> acquiring br_ioctl_mutex with a stable netdev.
> 
> Now only for thoughts.
> 
> +++ x/net/core/dev_ioctl.c
> @@ -379,7 +379,12 @@ static int dev_ifsioc(struct net *net, s
>  	case SIOCBRDELIF:
>  		if (!netif_device_present(dev))
>  			return -ENODEV;
> -		return br_ioctl_call(net, netdev_priv(dev), cmd, ifr, NULL);
> +		dev_hold(dev);
> +		rtnl_unlock();
> +		err = br_ioctl_call(net, netdev_priv(dev), cmd, ifr, NULL);
> +		dev_put(dev);
> +		rtnl_lock();
> +		return err;
>  
>  	case SIOCSHWTSTAMP:
>  		err = net_hwtstamp_validate(ifr);
> 

Thanks, but it will need more work. The bridge ioctl calls were divided in two parts
before: one was deviceless, called by sock_ioctl, and didn't expect rtnl to be held;
the other was with a device, called by dev_ifsioc(), and expected rtnl to be held.
Then ad2f99aedf8f ("net: bridge: move bridge ioctls out of .ndo_do_ioctl")
united them in a single ioctl stub, but didn't take care of the locking expectations.
For sock_ioctl we now acquire (1) br_ioctl_mutex, (2) rtnl, and for dev_ifsioc we
acquire (1) rtnl, (2) br_ioctl_mutex, as the lockdep warning has demonstrated.
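
To make the inversion concrete, here is a minimal userspace sketch of the kind
of order tracking that flags it. This is not kernel code and not how lockdep is
implemented; the enum, the table and the function names are hypothetical and
only model the two paths described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes; they only mirror the two mutexes in the report. */
enum lock_class { LOCK_RTNL, LOCK_BR_IOCTL, LOCK_CLASS_MAX };

/* seen[a][b] is true once some path acquired b while holding a. */
static bool seen[LOCK_CLASS_MAX][LOCK_CLASS_MAX];

/*
 * Record that `next` is acquired while `held` is already held.
 * Returns true if the opposite order was recorded earlier, i.e. the
 * two paths together form an AB-BA inversion.
 */
static bool record_order(enum lock_class held, enum lock_class next)
{
	assert(held < LOCK_CLASS_MAX && next < LOCK_CLASS_MAX);
	seen[held][next] = true;
	return seen[next][held];
}

/* Deviceless path (sock_ioctl): br_ioctl_mutex first, then rtnl. */
static bool model_sock_ioctl_path(void)
{
	return record_order(LOCK_BR_IOCTL, LOCK_RTNL);
}

/* Device path (dev_ifsioc): rtnl first, then br_ioctl_mutex. */
static bool model_dev_ifsioc_path(void)
{
	return record_order(LOCK_RTNL, LOCK_BR_IOCTL);
}
```

Running the deviceless path first records br_ioctl_mutex -> rtnl without
complaint; the device path then reports the inversion, which matches the
#1 / #0 chain in the warning.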

The fix above can work if rtnl gets reacquired by the bridge ioctl in the proper
switch cases. To avoid playing even more locking games, it'd probably be best to have
the bridge ioctl always acquire and release rtnl itself, which will need a bit more work.
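
A rough userspace sketch of that second idea, assuming both callers enter the
bridge ioctl without rtnl held. The pthread mutexes are stand-ins for
br_ioctl_mutex and rtnl, every name here is hypothetical, and this is only a
model of the lock ordering, not a proposed patch.

```c
#include <pthread.h>

/* Userspace stand-ins for the two kernel mutexes (assumption, not kernel API). */
static pthread_mutex_t rtnl_model = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t br_ioctl_model = PTHREAD_MUTEX_INITIALIZER;

/* Counts how many modelled bridge operations ran under both locks. */
static int bridge_ops_done;

/*
 * Model of a bridge ioctl stub that owns the rtnl handling: it always
 * takes br_ioctl_model first and rtnl_model second, then releases them
 * in reverse order. Callers must not hold rtnl_model on entry.
 */
static int bridge_ioctl_model(void)
{
	pthread_mutex_lock(&br_ioctl_model);
	pthread_mutex_lock(&rtnl_model);

	bridge_ops_done++;	/* the actual bridge work would go here */

	pthread_mutex_unlock(&rtnl_model);
	pthread_mutex_unlock(&br_ioctl_model);
	return 0;
}

/* Deviceless path, modelled on sock_ioctl: never held rtnl to begin with. */
static int sock_path_model(void)
{
	return bridge_ioctl_model();
}

/* Device path, modelled on dev_ifsioc: no longer wraps the call in rtnl. */
static int dev_path_model(void)
{
	return bridge_ioctl_model();
}
```

With a single acquisition order in one place, neither path can end up holding
rtnl while waiting for br_ioctl_mutex. The device path would still need to keep
the netdev stable across the call, which this model leaves out.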

Arnd, should I take care of it?

Cheers,
 Nik
