* [PATCH v3 0/2] ceph_check_delayed_caps() softlockup
@ 2021-07-06 13:52 Luis Henriques
  2021-07-06 13:52 ` [PATCH v3 1/2] ceph: allow schedule_delayed() callers to set delay for workqueue Luis Henriques
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Luis Henriques @ 2021-07-06 13:52 UTC (permalink / raw)
  To: Jeff Layton, Ilya Dryomov; +Cc: ceph-devel, linux-kernel, Luis Henriques

* changes since v2:
  - always round the delay with round_jiffies_relative() in function
    schedule_delayed() (patch 0001)

This is an attempt to fix the soft lockup on the delayed_work workqueue.  As
stated in patch 0002:

  Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
  workqueue and it can be kept looping for quite some time if caps keep being
  added back to the mdsc->cap_delay_list.  This may result in the watchdog
  tainting the kernel with the softlockup flag.

v2 of this fix modifies the approach by time-bounding the loop in this
function, so that any caps added to the list *after* the loop starts will
be postponed to the next wq run.
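
A rough userspace sketch of the time-bounding test described above.  It is not
the kernel code: time_before_ul() is a stand-in for the kernel's time_before()
macro, cap_added_after_loop_start() and check_examples() are hypothetical
helper names, and the sketch assumes that a cap's hold deadline is set to
roughly "queue time + delay_max" when it is put on the list (rounding of the
deadline is ignored).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's time_before(a, b): true if tick
 * count a is earlier than b, and still correct when the counter wraps.
 */
static bool time_before_ul(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Hypothetical helper: under the assumption that
 * hold_max == queue_time + delay_max, the expression
 * hold_max - delay_max recovers the queue time, so this returns true
 * when the cap was queued after the loop started.
 */
static bool cap_added_after_loop_start(unsigned long loop_start,
				       unsigned long hold_max,
				       unsigned long delay_max)
{
	return time_before_ul(loop_start, hold_max - delay_max);
}

/* A few worked cases, with delay_max = 1250 ticks (5 secs at HZ=250). */
static void check_examples(void)
{
	unsigned long delay_max = 1250;
	unsigned long late_start = ULONG_MAX - 5;

	/* Queued at tick 9000, loop started at 10000: process it. */
	assert(!cap_added_after_loop_start(10000, 9000 + delay_max,
					   delay_max));

	/* Queued at tick 10010, after the loop started: stop here. */
	assert(cap_added_after_loop_start(10000, 10010 + delay_max,
					  delay_max));

	/* Counter wraps between loop start and queue time: still detected. */
	assert(cap_added_after_loop_start(late_start,
					  late_start + 10 + delay_max,
					  delay_max));
}
```

In the loop, a true result corresponds to breaking out and leaving the
remaining entries for the next workqueue run.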

An extra change in 0001 (suggested by Jeff) allows scheduling runs for
periods shorter than the default period of 5 secs.  This way,
delayed_work() can have the next run scheduled according to the next list
element's ci->i_hold_caps_max instead of always waiting 5 secs.
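
As a minimal sketch of the delay selection in 0001, here is a userspace model
of the clamping rule only.  HZ_ASSUMED is an assumed tick rate chosen for the
example, effective_delay() is a hypothetical name, and the
round_jiffies_relative() step from the patch is deliberately left out.

```c
#include <assert.h>

/* Assumed tick rate for this example; the real HZ is a kernel config value. */
#define HZ_ASSUMED 250UL

/*
 * Hypothetical model of the delay selection: a delay of 0, or one longer
 * than 5 secs, falls back to the 5 secs default; anything else is kept.
 */
static unsigned long effective_delay(unsigned long delay)
{
	unsigned long max_delay = HZ_ASSUMED * 5;

	if (!delay || delay > max_delay)
		delay = max_delay;
	return delay;
}

/* Worked cases for the three branches of the rule. */
static void check_delay_examples(void)
{
	assert(effective_delay(0) == 5 * HZ_ASSUMED);     /* default */
	assert(effective_delay(100) == 100);              /* shorter: kept */
	assert(effective_delay(5 * HZ_ASSUMED) == 5 * HZ_ASSUMED);
	assert(effective_delay(10 * HZ_ASSUMED) == 5 * HZ_ASSUMED); /* capped */
}
```

Callers that do not care about the delay pass 0 and keep the previous
behaviour.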

This patchset should fix the issue reported here [1], although a quick
search for "ceph_check_delayed_caps" in the tracker returns a few more
bugs, possibly duplicates.

[1] https://tracker.ceph.com/issues/46284

Luis Henriques (2):
  ceph: allow schedule_delayed() callers to set delay for workqueue
  ceph: reduce contention in ceph_check_delayed_caps()

 fs/ceph/caps.c       | 17 ++++++++++++++++-
 fs/ceph/mds_client.c | 25 ++++++++++++++++---------
 fs/ceph/super.h      |  2 +-
 3 files changed, 33 insertions(+), 11 deletions(-)



* [PATCH v3 1/2] ceph: allow schedule_delayed() callers to set delay for workqueue
  2021-07-06 13:52 [PATCH v3 0/2] ceph_check_delayed_caps() softlockup Luis Henriques
@ 2021-07-06 13:52 ` Luis Henriques
  2021-07-06 13:52 ` [PATCH v3 2/2] ceph: reduce contention in ceph_check_delayed_caps() Luis Henriques
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Luis Henriques @ 2021-07-06 13:52 UTC (permalink / raw)
  To: Jeff Layton, Ilya Dryomov; +Cc: ceph-devel, linux-kernel, Luis Henriques

Allow schedule_delayed() callers to explicitly set the delay value instead
of defaulting to a 5 secs value.

Signed-off-by: Luis Henriques <lhenriques@suse.de>
---
 fs/ceph/mds_client.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index e5af591d3bd4..f5dc58a05f9f 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4502,13 +4502,19 @@ void inc_session_sequence(struct ceph_mds_session *s)
 }
 
 /*
- * delayed work -- periodically trim expired leases, renew caps with mds
+ * delayed work -- periodically trim expired leases, renew caps with mds.  If
+ * the @delay parameter is set to 0 or if it's more than 5 secs, the default
+ * workqueue delay value of 5 secs will be used.
  */
-static void schedule_delayed(struct ceph_mds_client *mdsc)
+static void schedule_delayed(struct ceph_mds_client *mdsc, unsigned long delay)
 {
-	int delay = 5;
-	unsigned hz = round_jiffies_relative(HZ * delay);
-	schedule_delayed_work(&mdsc->delayed_work, hz);
+	unsigned long max_delay = HZ * 5;
+
+	/* 5 secs default delay */
+	if (!delay || (delay > max_delay))
+		delay = max_delay;
+	schedule_delayed_work(&mdsc->delayed_work,
+			      round_jiffies_relative(delay));
 }
 
 static void delayed_work(struct work_struct *work)
@@ -4565,7 +4571,7 @@ static void delayed_work(struct work_struct *work)
 
 	maybe_recover_session(mdsc);
 
-	schedule_delayed(mdsc);
+	schedule_delayed(mdsc, 0);
 }
 
 int ceph_mdsc_init(struct ceph_fs_client *fsc)
@@ -5042,7 +5048,7 @@ void ceph_mdsc_handle_mdsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg)
 			  mdsc->mdsmap->m_epoch);
 
 	mutex_unlock(&mdsc->mutex);
-	schedule_delayed(mdsc);
+	schedule_delayed(mdsc, 0);
 	return;
 
 bad_unlock:


* [PATCH v3 2/2] ceph: reduce contention in ceph_check_delayed_caps()
  2021-07-06 13:52 [PATCH v3 0/2] ceph_check_delayed_caps() softlockup Luis Henriques
  2021-07-06 13:52 ` [PATCH v3 1/2] ceph: allow schedule_delayed() callers to set delay for workqueue Luis Henriques
@ 2021-07-06 13:52 ` Luis Henriques
  2021-07-06 17:03 ` [PATCH v3 0/2] ceph_check_delayed_caps() softlockup Jeff Layton
  2021-08-04 15:52 ` Jeff Layton
  3 siblings, 0 replies; 5+ messages in thread
From: Luis Henriques @ 2021-07-06 13:52 UTC (permalink / raw)
  To: Jeff Layton, Ilya Dryomov
  Cc: ceph-devel, linux-kernel, Luis Henriques, stable

Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
workqueue and it can be kept looping for quite some time if caps keep
being added back to the mdsc->cap_delay_list.  This may result in the
watchdog tainting the kernel with the softlockup flag.

This patch breaks out of this loop if the caps have been added recently
(i.e. during the loop execution).  Any new caps added to the list will be
handled in the next run.

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/46284
Signed-off-by: Luis Henriques <lhenriques@suse.de>
---
 fs/ceph/caps.c       | 17 ++++++++++++++++-
 fs/ceph/mds_client.c |  7 ++++---
 fs/ceph/super.h      |  2 +-
 3 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index a5e93b185515..c79b8dff25d7 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -4224,11 +4224,19 @@ void ceph_handle_caps(struct ceph_mds_session *session,
 
 /*
  * Delayed work handler to process end of delayed cap release LRU list.
+ *
+ * If new caps are added to the list while processing it, these won't get
+ * processed in this run.  In this case, the ci->i_hold_caps_max will be
+ * returned so that the work can be scheduled accordingly.
  */
-void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
+unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
 {
 	struct inode *inode;
 	struct ceph_inode_info *ci;
+	struct ceph_mount_options *opt = mdsc->fsc->mount_options;
+	unsigned long delay_max = opt->caps_wanted_delay_max * HZ;
+	unsigned long loop_start = jiffies;
+	unsigned long delay = 0;
 
 	dout("check_delayed_caps\n");
 	spin_lock(&mdsc->cap_delay_lock);
@@ -4236,6 +4244,11 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
 		ci = list_first_entry(&mdsc->cap_delay_list,
 				      struct ceph_inode_info,
 				      i_cap_delay_list);
+		if (time_before(loop_start, ci->i_hold_caps_max - delay_max)) {
+			dout("%s caps added recently.  Exiting loop", __func__);
+			delay = ci->i_hold_caps_max;
+			break;
+		}
 		if ((ci->i_ceph_flags & CEPH_I_FLUSH) == 0 &&
 		    time_before(jiffies, ci->i_hold_caps_max))
 			break;
@@ -4252,6 +4265,8 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
 		}
 	}
 	spin_unlock(&mdsc->cap_delay_lock);
+
+	return delay;
 }
 
 /*
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index f5dc58a05f9f..a6f985786d68 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4519,11 +4519,12 @@ static void schedule_delayed(struct ceph_mds_client *mdsc, unsigned long delay)
 
 static void delayed_work(struct work_struct *work)
 {
-	int i;
 	struct ceph_mds_client *mdsc =
 		container_of(work, struct ceph_mds_client, delayed_work.work);
+	unsigned long delay;
 	int renew_interval;
 	int renew_caps;
+	int i;
 
 	dout("mdsc delayed_work\n");
 
@@ -4563,7 +4564,7 @@ static void delayed_work(struct work_struct *work)
 	}
 	mutex_unlock(&mdsc->mutex);
 
-	ceph_check_delayed_caps(mdsc);
+	delay = ceph_check_delayed_caps(mdsc);
 
 	ceph_queue_cap_reclaim_work(mdsc);
 
@@ -4571,7 +4572,7 @@ static void delayed_work(struct work_struct *work)
 
 	maybe_recover_session(mdsc);
 
-	schedule_delayed(mdsc, 0);
+	schedule_delayed(mdsc, delay);
 }
 
 int ceph_mdsc_init(struct ceph_fs_client *fsc)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 839e6b0239ee..3b5207c82767 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1170,7 +1170,7 @@ extern void ceph_flush_snaps(struct ceph_inode_info *ci,
 extern bool __ceph_should_report_size(struct ceph_inode_info *ci);
 extern void ceph_check_caps(struct ceph_inode_info *ci, int flags,
 			    struct ceph_mds_session *session);
-extern void ceph_check_delayed_caps(struct ceph_mds_client *mdsc);
+extern unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc);
 extern void ceph_flush_dirty_caps(struct ceph_mds_client *mdsc);
 extern int  ceph_drop_caps_for_unlink(struct inode *inode);
 extern int ceph_encode_inode_release(void **p, struct inode *inode,


* Re: [PATCH v3 0/2] ceph_check_delayed_caps() softlockup
  2021-07-06 13:52 [PATCH v3 0/2] ceph_check_delayed_caps() softlockup Luis Henriques
  2021-07-06 13:52 ` [PATCH v3 1/2] ceph: allow schedule_delayed() callers to set delay for workqueue Luis Henriques
  2021-07-06 13:52 ` [PATCH v3 2/2] ceph: reduce contention in ceph_check_delayed_caps() Luis Henriques
@ 2021-07-06 17:03 ` Jeff Layton
  2021-08-04 15:52 ` Jeff Layton
  3 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2021-07-06 17:03 UTC (permalink / raw)
  To: Luis Henriques, Ilya Dryomov; +Cc: ceph-devel, linux-kernel

On Tue, 2021-07-06 at 14:52 +0100, Luis Henriques wrote:
> * changes since v2:
>   - always round the delay with round_jiffies_relative() in function
>     schedule_delayed() (patch 0001)
> 
> This is an attempt to fix the soft lockup on the delayed_work workqueue.  As
> stated in patch 0002:
> 
>   Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
>   workqueue and it can be kept looping for quite some time if caps keep being
>   added back to the mdsc->cap_delay_list.  This may result in the watchdog
>   tainting the kernel with the softlockup flag.
> 
> v2 of this fix modifies the approach by time-bounding the loop in this
> function, so that any caps added to the list *after* the loop starts will
> be postponed to the next wq run.
> 
> An extra change in 0001 (suggested by Jeff) allows scheduling runs for
> periods smaller than the default (5 secs) period.  This way,
> delayed_work() can have the next run scheduled for the next list element
> ci->i_hold_caps_max instead of 5 secs.
> 
> This patchset should fix the issue reported here [1], although a quick
> search for "ceph_check_delayed_caps" in the tracker returns a few more
> bugs, possibly duplicates.
> 
> [1] https://tracker.ceph.com/issues/46284
> 
> Luis Henriques (2):
>   ceph: allow schedule_delayed() callers to set delay for workqueue
>   ceph: reduce contention in ceph_check_delayed_caps()
> 
>  fs/ceph/caps.c       | 17 ++++++++++++++++-
>  fs/ceph/mds_client.c | 25 ++++++++++++++++---------
>  fs/ceph/super.h      |  2 +-
>  3 files changed, 33 insertions(+), 11 deletions(-)
> 

Looks good. I'll do some testing with this today and will merge it into
the testing branch if all goes well.

Thanks!
-- 
Jeff Layton <jlayton@kernel.org>



* Re: [PATCH v3 0/2] ceph_check_delayed_caps() softlockup
  2021-07-06 13:52 [PATCH v3 0/2] ceph_check_delayed_caps() softlockup Luis Henriques
                   ` (2 preceding siblings ...)
  2021-07-06 17:03 ` [PATCH v3 0/2] ceph_check_delayed_caps() softlockup Jeff Layton
@ 2021-08-04 15:52 ` Jeff Layton
  3 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2021-08-04 15:52 UTC (permalink / raw)
  To: Luis Henriques, Ilya Dryomov; +Cc: ceph-devel, linux-kernel

On Tue, 2021-07-06 at 14:52 +0100, Luis Henriques wrote:
> * changes since v2:
>   - always round the delay with round_jiffies_relative() in function
>     schedule_delayed() (patch 0001)
> 
> This is an attempt to fix the soft lockup on the delayed_work workqueue.  As
> stated in patch 0002:
> 
>   Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
>   workqueue and it can be kept looping for quite some time if caps keep being
>   added back to the mdsc->cap_delay_list.  This may result in the watchdog
>   tainting the kernel with the softlockup flag.
> 
> v2 of this fix modifies the approach by time-bounding the loop in this
> function, so that any caps added to the list *after* the loop starts will
> be postponed to the next wq run.
> 
> An extra change in 0001 (suggested by Jeff) allows scheduling runs for
> periods smaller than the default (5 secs) period.  This way,
> delayed_work() can have the next run scheduled for the next list element
> ci->i_hold_caps_max instead of 5 secs.
> 
> This patchset should fix the issue reported here [1], although a quick
> search for "ceph_check_delayed_caps" in the tracker returns a few more
> bugs, possibly duplicates.
> 
> [1] https://tracker.ceph.com/issues/46284
> 
> Luis Henriques (2):
>   ceph: allow schedule_delayed() callers to set delay for workqueue
>   ceph: reduce contention in ceph_check_delayed_caps()
> 
>  fs/ceph/caps.c       | 17 ++++++++++++++++-
>  fs/ceph/mds_client.c | 25 ++++++++++++++++---------
>  fs/ceph/super.h      |  2 +-
>  3 files changed, 33 insertions(+), 11 deletions(-)
> 

FWIW, we've had some more reports of this, so I think we should get this
into mainline and stable soon. I'm going to squash these two patches
together as it should (hopefully) make it simpler for stable backports.

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>

