LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] ext4: avoid huge mmp update interval value
@ 2021-08-05 15:14 Pavel Skripkin
  2021-08-05 19:45 ` Theodore Ts'o
  0 siblings, 1 reply; 5+ messages in thread
From: Pavel Skripkin @ 2021-08-05 15:14 UTC (permalink / raw)
  To: tytso, adilger.kernel, johann
  Cc: linux-ext4, linux-kernel, Pavel Skripkin, syzbot+c9ff4822a62eee994ea3

Syzbot reported task hung bug in ext4_fill_super(). The problem was in
too huge mmp update interval.

Syzkaller reproducer setted s_mmp_update_interval to 39785 seconds. This
update interaval is unreasonable huge and it can cause tasks to hung on
kthread_stop() call, since it will wait until timeout timer expires.

To avoid this sutiation, I've added MIN and MAX constants for kmmp
interval and clamped s_mmp_update_interval within the boundaries

Reported-and-tested-by: syzbot+c9ff4822a62eee994ea3@syzkaller.appspotmail.com
Fixes: c5e06d101aaf ("ext4: add support for multiple mount protection")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
---

Hi, Ted and ext4 maintainers!

I am not sure about min/max values for interval, so I look forward to
receiving your views on these values and patch in general!



With regards,
Pavel Skripkin

---
 fs/ext4/mmp.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index bc364c119af6..160abee66dce 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -7,6 +7,9 @@
 
 #include "ext4.h"
 
+#define EXT4_KMMP_MAX_INTERVAL		100
+#define EXT4_KMMP_MIN_INTERVAL		5
+
 /* Checksumming functions */
 static __le32 ext4_mmp_csum(struct super_block *sb, struct mmp_struct *mmp)
 {
@@ -140,6 +143,12 @@ static int kmmpd(void *data)
 	unsigned long diff;
 	int retval;
 
+	/* We should avoid unreasonable huge update interval, since it can cause
+	 * task hung bug on umount or on error handling path in ext4_fill_super()
+	 */
+	mmp_update_interval = clamp(mmp_update_interval, EXT4_KMMP_MIN_INTERVAL,
+							 EXT4_KMMP_MAX_INTERVAL);
+
 	mmp_block = le64_to_cpu(es->s_mmp_block);
 	mmp = (struct mmp_struct *)(bh->b_data);
 	mmp->mmp_time = cpu_to_le64(ktime_get_real_seconds());
@@ -156,6 +165,9 @@ static int kmmpd(void *data)
 	memcpy(mmp->mmp_nodename, init_utsname()->nodename,
 	       sizeof(mmp->mmp_nodename));
 
+	ext4_msg(sb, KERN_INFO, "Started kmmp thread with update interval = %u\n",
+		 mmp_update_interval);
+
 	while (!kthread_should_stop() && !sb_rdonly(sb)) {
 		if (!ext4_has_feature_mmp(sb)) {
 			ext4_warning(sb, "kmmpd being stopped since MMP feature"
-- 
2.32.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ext4: avoid huge mmp update interval value
  2021-08-05 15:14 [PATCH] ext4: avoid huge mmp update interval value Pavel Skripkin
@ 2021-08-05 19:45 ` Theodore Ts'o
  2021-08-05 20:12   ` Pavel Skripkin
  0 siblings, 1 reply; 5+ messages in thread
From: Theodore Ts'o @ 2021-08-05 19:45 UTC (permalink / raw)
  To: Pavel Skripkin
  Cc: adilger.kernel, johann, linux-ext4, linux-kernel,
	syzbot+c9ff4822a62eee994ea3

On Thu, Aug 05, 2021 at 06:14:18PM +0300, Pavel Skripkin wrote:
> Syzbot reported task hung bug in ext4_fill_super(). The problem was in
> too huge mmp update interval.
> 
> Syzkaller reproducer setted s_mmp_update_interval to 39785 seconds. This
> update interaval is unreasonable huge and it can cause tasks to hung on
> kthread_stop() call, since it will wait until timeout timer expires.

I must be missing something.  kthread_stop() should wake up the
kmmpd() thread, which should see kthread_should_stop(), and then it
should exit.  What is causing it to wait until the timeout timer
expires?

					- Ted

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ext4: avoid huge mmp update interval value
  2021-08-05 19:45 ` Theodore Ts'o
@ 2021-08-05 20:12   ` Pavel Skripkin
  2021-08-05 22:59     ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Pavel Skripkin @ 2021-08-05 20:12 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: adilger.kernel, johann, linux-ext4, linux-kernel,
	syzbot+c9ff4822a62eee994ea3

On 8/5/21 10:45 PM, Theodore Ts'o wrote:
> On Thu, Aug 05, 2021 at 06:14:18PM +0300, Pavel Skripkin wrote:
>> Syzbot reported task hung bug in ext4_fill_super(). The problem was in
>> too huge mmp update interval.
>> 
>> Syzkaller reproducer setted s_mmp_update_interval to 39785 seconds. This
>> update interaval is unreasonable huge and it can cause tasks to hung on
>> kthread_stop() call, since it will wait until timeout timer expires.
> 
> I must be missing something.  kthread_stop() should wake up the
> kmmpd() thread, which should see kthread_should_stop(), and then it
> should exit.  What is causing it to wait until the timeout timer
> expires?
> 
> 					- Ted
> 


Hi, Ted!

I guess, I've explained my idea badly, sorry :)

I mean, that there is a chance to hit this situation:

CPU0				CPU1
				kthread_should_stop()  <-- false
kthread_stop()
set_bit(KTHREAD_SHOULD_STOP)				
wake_up_process()
wait_for_completion()
				schedule_timeout_interruptible()

*waits until timer expires*


Since there wasn't any validation checks for mmp_update_interval, CPU0 
will wait for up to (1 << 16) seconds (s_mmp_update_interval it __le16).


With regards,
Pavel Skripkin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ext4: avoid huge mmp update interval value
  2021-08-05 20:12   ` Pavel Skripkin
@ 2021-08-05 22:59     ` Dave Chinner
  2021-08-06 10:20       ` Pavel Skripkin
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2021-08-05 22:59 UTC (permalink / raw)
  To: Pavel Skripkin
  Cc: Theodore Ts'o, adilger.kernel, johann, linux-ext4,
	linux-kernel, syzbot+c9ff4822a62eee994ea3

On Thu, Aug 05, 2021 at 11:12:42PM +0300, Pavel Skripkin wrote:
> On 8/5/21 10:45 PM, Theodore Ts'o wrote:
> > On Thu, Aug 05, 2021 at 06:14:18PM +0300, Pavel Skripkin wrote:
> > > Syzbot reported task hung bug in ext4_fill_super(). The problem was in
> > > too huge mmp update interval.
> > > 
> > > Syzkaller reproducer setted s_mmp_update_interval to 39785 seconds. This
> > > update interaval is unreasonable huge and it can cause tasks to hung on
> > > kthread_stop() call, since it will wait until timeout timer expires.
> > 
> > I must be missing something.  kthread_stop() should wake up the
> > kmmpd() thread, which should see kthread_should_stop(), and then it
> > should exit.  What is causing it to wait until the timeout timer
> > expires?
> > 
> > 					- Ted
> > 
> 
> 
> Hi, Ted!
> 
> I guess, I've explained my idea badly, sorry :)
> 
> I mean, that there is a chance to hit this situation:
> 
> CPU0				CPU1
> 				kthread_should_stop()  <-- false
> kthread_stop()
> set_bit(KTHREAD_SHOULD_STOP)				
> wake_up_process()
> wait_for_completion()
> 				schedule_timeout_interruptible()
> 
> *waits until timer expires*

Yeah, so the bug here is checking kthread_should_stop() while
the task state is TASK_RUNNING.

What you need to do here is:

while (run) {

	....
	set_current_state(TASK_INTERRUPTIBLE);
	if (kthread_should_stop()) {
		__set_current_state(TASK_RUNNING);
		break;
	}
	schedule_timeout(tout);

	.....
}


That means in the case above where schedule() occurs after the
kthread_should_stop() check has raced with kthread_stop(), then
wake_up_process() will handle any races with schedule() correctly.

i.e.  wake_up_process() will set the task state to TASK_RUNNING and
schedule() will not sleep if it is called after wake_up_process().
Or if schedule() runs first then wake_up_process() will wake it
correctly after setting the state to TASK_RUNNING.

Either way, the loop then runs around again straight away to the next
kthread_should_stop() call, at which point it breaks out.

I note that the "wait_to_exit:" code in the same function does this
properly....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ext4: avoid huge mmp update interval value
  2021-08-05 22:59     ` Dave Chinner
@ 2021-08-06 10:20       ` Pavel Skripkin
  0 siblings, 0 replies; 5+ messages in thread
From: Pavel Skripkin @ 2021-08-06 10:20 UTC (permalink / raw)
  To: Dave Chinner, Theodore Ts'o
  Cc: adilger.kernel, johann, linux-ext4, linux-kernel,
	syzbot+c9ff4822a62eee994ea3

On 8/6/21 1:59 AM, Dave Chinner wrote:
> On Thu, Aug 05, 2021 at 11:12:42PM +0300, Pavel Skripkin wrote:
>> On 8/5/21 10:45 PM, Theodore Ts'o wrote:
>> > On Thu, Aug 05, 2021 at 06:14:18PM +0300, Pavel Skripkin wrote:
>> > > Syzbot reported task hung bug in ext4_fill_super(). The problem was in
>> > > too huge mmp update interval.
>> > > 
>> > > Syzkaller reproducer setted s_mmp_update_interval to 39785 seconds. This
>> > > update interaval is unreasonable huge and it can cause tasks to hung on
>> > > kthread_stop() call, since it will wait until timeout timer expires.
>> > 
>> > I must be missing something.  kthread_stop() should wake up the
>> > kmmpd() thread, which should see kthread_should_stop(), and then it
>> > should exit.  What is causing it to wait until the timeout timer
>> > expires?
>> > 
>> > 					- Ted
>> > 
>> 
>> 
>> Hi, Ted!
>> 
>> I guess, I've explained my idea badly, sorry :)
>> 
>> I mean, that there is a chance to hit this situation:
>> 
>> CPU0				CPU1
>> 				kthread_should_stop()  <-- false
>> kthread_stop()
>> set_bit(KTHREAD_SHOULD_STOP)				
>> wake_up_process()
>> wait_for_completion()
>> 				schedule_timeout_interruptible()
>> 
>> *waits until timer expires*
> 
> Yeah, so the bug here is checking kthread_should_stop() while
> the task state is TASK_RUNNING.
> 
> What you need to do here is:
> 
> while (run) {
> 
> 	....
> 	set_current_state(TASK_INTERRUPTIBLE);
> 	if (kthread_should_stop()) {
> 		__set_current_state(TASK_RUNNING);
> 		break;
> 	}
> 	schedule_timeout(tout);
> 
> 	.....
> }
> 
> 
> That means in the case above where schedule() occurs after the
> kthread_should_stop() check has raced with kthread_stop(), then
> wake_up_process() will handle any races with schedule() correctly.
> 
> i.e.  wake_up_process() will set the task state to TASK_RUNNING and
> schedule() will not sleep if it is called after wake_up_process().
> Or if schedule() runs first then wake_up_process() will wake it
> correctly after setting the state to TASK_RUNNING.
> 
> Either way, the loop then runs around again straight away to the next
> kthread_should_stop() call, at which point it breaks out.
> 
> I note that the "wait_to_exit:" code in the same function does this
> properly....
> 

Hi, Dave!

I've tested your suggestion with syzbot and it works, thank you!


Anyway, @Ted, does it make sense to add boundaries for 
s_mmp_update_interval? I think, too big update interval for mmp isn't 
reasonable. I can send patch series with Dave's suggestion and previous 
patch. What do you think?




With regards,
Pavel Skripkin

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-08-06 10:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-05 15:14 [PATCH] ext4: avoid huge mmp update interval value Pavel Skripkin
2021-08-05 19:45 ` Theodore Ts'o
2021-08-05 20:12   ` Pavel Skripkin
2021-08-05 22:59     ` Dave Chinner
2021-08-06 10:20       ` Pavel Skripkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).