LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
To: dsterba@suse.cz, clm@fb.com, josef@toxicpanda.com,
	dsterba@suse.com, anand.jain@oracle.com,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	skhan@linuxfoundation.org, gregkh@linuxfoundation.org,
	linux-kernel-mentees@lists.linuxfoundation.org,
	syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com
Subject: Re: [PATCH v2] btrfs: fix rw device counting in __btrfs_free_extra_devids
Date: Fri, 20 Aug 2021 01:11:58 +0800	[thread overview]
Message-ID: <89172356-335f-1ca3-d3a2-78fac7ef93fb@gmail.com> (raw)
In-Reply-To: <20210813103032.GR5047@twin.jikos.cz>

On 13/8/21 6:30 pm, David Sterba wrote:
> On Fri, Aug 13, 2021 at 05:57:26PM +0800, Desmond Cheong Zhi Xi wrote:
>> On 13/8/21 4:51 pm, David Sterba wrote:
>>> On Fri, Aug 13, 2021 at 01:31:25AM +0800, Desmond Cheong Zhi Xi wrote:
>>>> On 12/8/21 11:50 pm, David Sterba wrote:
>>>>> On Thu, Aug 12, 2021 at 11:43:16PM +0800, Desmond Cheong Zhi Xi wrote:
>>>>>> On 12/8/21 6:38 pm, David Sterba wrote:
>>>>>>> On Tue, Jul 27, 2021 at 03:13:03PM +0800, Desmond Cheong Zhi Xi wrote:
>>>>>>>> --- a/fs/btrfs/volumes.c
>>>>>>>> +++ b/fs/btrfs/volumes.c
>>>>>>>> @@ -1078,6 +1078,7 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
>>>>>>>>      		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
>>>>>>>>      			list_del_init(&device->dev_alloc_list);
>>>>>>>>      			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
>>>>>>>> +			fs_devices->rw_devices--;
>>>>>>>>      		}
>>>>>>>>      		list_del_init(&device->dev_list);
>>>>>>>>      		fs_devices->num_devices--;
>>>>>>>
>>>>>>> I've hit a crash on master branch with stacktrace very similar to one
>>>>>>> this bug was supposed to fix. It's a failed assertion on device close.
>>>>>>> This patch was the last one to touch it and it matches some of the
>>>>>>> keywords, namely the BTRFS_DEV_STATE_REPLACE_TGT bit that used to be in
>>>>>>> the original patch but was not reinstated in your fix.
>>>>>>>
>>>>>>> I'm not sure how reproducible it is, right now I have only one instance
>>>>>>> and am hunting another strange problem. They could be related.
>>>>>>>
>>>>>>> assertion failed: !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state), in fs/btrfs/volumes.c:1150
>>>>>>>
>>>>>>> https://susepaste.org/view/raw/18223056 full log with other stacktraces,
>>>>>>> possibly relatedg
>>>>>>>
>>>>>>
>>>>>> Looking at the logs, it seems that a dev_replace was started, then
>>>>>> suspended. But it wasn't canceled or resumed before the fs devices were
>>>>>> closed.
>>>>>>
>>>>>> I'll investigate further, just throwing some observations out there.
>>>>>
>>>>> Thanks. I'm testing the patch revert, no crash after first loop, I'll
>>>>> run a few more to be sure as it's not entirely reliable.
>>>>>
>>>>> Sending the revert is option of last resort as we're approaching end of
>>>>> 5.14 dev cycle and the crash prevents testing (unlike the fuzzer
>>>>> warning).
>>>>>
>>>>
>>>> I might be missing something, so any thoughts would be appreciated. But
>>>> I don't think the assertion in btrfs_close_one_device is correct.
>>>>
>>>>    From what I see, this crash happens when close_ctree is called while a
>>>> dev_replace hasn't completed. In close_ctree, we suspend the
>>>> dev_replace, but keep the replace target around so that we can resume
>>>> the dev_replace procedure when we mount the root again. This is the call
>>>> trace:
>>>>
>>>>      close_ctree():
>>>>        btrfs_dev_replace_suspend_for_unmount();
>>>>        btrfs_close_devices():
>>>>          btrfs_close_fs_devices():
>>>>            btrfs_close_one_device():
>>>>              ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
>>>> &device->dev_state));
>>>>
>>>> However, since the replace target sticks around, there is a device with
>>>> BTRFS_DEV_STATE_REPLACE_TGT set, and we fail the assertion in
>>>> btrfs_close_one_device.
>>>>
>>>> Two options I can think of:
>>>>
>>>> - We could remove the assertion.
>>>>
>>>> - Or we could clear the BTRFS_DEV_STATE_REPLACE_TGT bit in
>>>> btrfs_dev_replace_suspend_for_unmount. This is fine since the bit is set
>>>> again in btrfs_init_dev_replace if the dev_replace->replace_state is
>>>> BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED. But this approach strikes me as
>>>> a little odd because the device is still the replace target when
>>>> mounting in the future.
>>>
>>> The option #2 does not sound safe because the TGT bit is checked in
>>> several places where device list is queried for various reasons, even
>>> without a mounted filesystem.
>>>
>>> Removing the assertion makes more sense but I'm still not convinced that
>>> the this is expected/allowed state of a closed device.
>>>
>>
>> Would it be better if we cleared the REPLACE_TGT bit only when closing
>> the device where device->devid == BTRFS_DEV_REPLACE_DEVID?
>>
>> The first conditional in btrfs_close_one_device assumes that we can come
>> across such a device. If we come across it, we should properly reset it.
>>
>> If other devices has this bit set, the ASSERT will still catch it and
>> let us know something is wrong.
> 
> That sounds great.
> 
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 70f94b75f25a..a5afebb78ecf 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1130,6 +1130,9 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>                   fs_devices->rw_devices--;
>>           }
>>    
>> +       if (device->devid == BTRFS_DEV_REPLACE_DEVID)
>> +               clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
>> +
>>           if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
>>                   fs_devices->missing_devices--;
> 
> I'll do a few test rounds, thanks.
> 

Hi David,

Just following up. Did that resolve the issue or is further 
investigation needed?

  reply	other threads:[~2021-08-19 17:12 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-27  7:13 Desmond Cheong Zhi Xi
2021-07-28 12:58 ` David Sterba
2021-08-12 10:38 ` David Sterba
2021-08-12 15:43   ` Desmond Cheong Zhi Xi
2021-08-12 15:50     ` David Sterba
2021-08-12 17:31       ` Desmond Cheong Zhi Xi
2021-08-13  8:51         ` David Sterba
2021-08-13  9:57           ` Desmond Cheong Zhi Xi
2021-08-13 10:30             ` David Sterba
2021-08-19 17:11               ` Desmond Cheong Zhi Xi [this message]
2021-08-19 17:34                 ` David Sterba
2021-08-20  3:09                   ` Desmond Cheong Zhi Xi
2021-08-20 10:58                     ` David Sterba
2021-08-20 17:53                       ` Desmond Cheong Zhi Xi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=89172356-335f-1ca3-d3a2-78fac7ef93fb@gmail.com \
    --to=desmondcheongzx@gmail.com \
    --cc=anand.jain@oracle.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=dsterba@suse.cz \
    --cc=gregkh@linuxfoundation.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=syzbot+a70e2ad0879f160b9217@syzkaller.appspotmail.com \
    --subject='Re: [PATCH v2] btrfs: fix rw device counting in __btrfs_free_extra_devids' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).