Linux-Fsdevel Archive on lore.kernel.org
* [PATCH 0/2] iowait and idle fixes in /proc/stat
@ 2020-06-10 21:05 Tom Hromatka
  2020-06-10 21:05 ` [PATCH 1/2] tick-sched: Do not clear the iowait and idle times Tom Hromatka
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Tom Hromatka @ 2020-06-10 21:05 UTC
  To: tom.hromatka, linux-kernel, linux-fsdevel
  Cc: fweisbec, tglx, mingo, adobriyan

A customer is using /proc/stat to track cpu usage in a VM and noted
that the iowait and idle times can decrease or spike when a cpu goes
offline and comes back online.

This patchset addresses two issues that can cause iowait and idle
to fluctuate up and down.  With these changes, cpu iowait and idle
increase monotonically.

Tom Hromatka (2):
  tick-sched: Do not clear the iowait and idle times
  /proc/stat: Simplify iowait and idle calculations when cpu is offline

 fs/proc/stat.c           | 24 ++++++------------------
 kernel/time/tick-sched.c |  9 +++++++++
 2 files changed, 15 insertions(+), 18 deletions(-)

-- 
2.25.3



* [PATCH 1/2] tick-sched: Do not clear the iowait and idle times
  2020-06-10 21:05 [PATCH 0/2] iowait and idle fixes in /proc/stat Tom Hromatka
@ 2020-06-10 21:05 ` Tom Hromatka
  2020-06-10 21:05 ` [PATCH 2/2] /proc/stat: Simplify iowait and idle calculations when cpu is offline Tom Hromatka
  2020-07-14 18:55 ` [PATCH 0/2] iowait and idle fixes in /proc/stat Tom Hromatka
  2 siblings, 0 replies; 4+ messages in thread
From: Tom Hromatka @ 2020-06-10 21:05 UTC
  To: tom.hromatka, linux-kernel, linux-fsdevel
  Cc: fweisbec, tglx, mingo, adobriyan

A customer reported that when a cpu goes offline and then comes back
online, the aggregate cpu idle and iowait values in /proc/stat decrease.
This is wreaking havoc with their cpu usage calculations.

Prior to this patch:

	        user nice system    idle iowait
	cpu  1390748  636 209444 9802206  19598
	cpu1  178384   75  24545 1392450   3025

take cpu1 offline and bring it back online

	        user nice system    idle iowait
	cpu  1391209  636 209682 8453440  16595
	cpu1  178440   75  24572     627      0
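
For illustration only (not part of the patch): a minimal user-space
sketch of how a monitoring tool might notice the regression shown in
the two aggregate "cpu" rows above.  The struct and function names are
hypothetical, and only the first five counters of the line are parsed.

```c
#include <assert.h>
#include <stdio.h>

/* First five counters of a "cpu" line in /proc/stat (USER_HZ ticks). */
struct cpu_sample {
	unsigned long long user, nice, system, idle, iowait;
};

/*
 * Parse a line such as "cpu  1390748 636 209444 9802206 19598".
 * Returns 0 on success, -1 if fewer than five counters were found.
 */
static int parse_cpu_line(const char *line, struct cpu_sample *s)
{
	int n;

	assert(line && s);
	/* %*s skips the "cpu"/"cpuN" label and is not counted in n. */
	n = sscanf(line, "%*s %llu %llu %llu %llu %llu",
		   &s->user, &s->nice, &s->system, &s->idle, &s->iowait);
	return n == 5 ? 0 : -1;
}

/* Returns 1 if idle or iowait went backwards between two samples. */
static int idle_went_backwards(const struct cpu_sample *prev,
			       const struct cpu_sample *cur)
{
	return cur->idle < prev->idle || cur->iowait < prev->iowait;
}
```

Fed the two aggregate rows above, idle_went_backwards() reports a
regression.  A tool that instead subtracts the samples as unsigned
values would see the idle delta wrap around to a very large number.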

To prevent this, do not clear the idle and iowait times for the
cpu that has come back online.
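
For illustration only: the save-and-restore pattern used around the
memset() can be sketched in user space as follows.  The struct below
is a hypothetical stand-in, not the kernel's struct tick_sched, and
the field types are simplified.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct tick_sched: the two accumulated
 * counters named in the patch plus one unrelated piece of state.
 */
struct fake_tick_sched {
	int tick_stopped;
	long long idle_sleeptime;	/* accumulated idle time */
	long long iowait_sleeptime;	/* accumulated iowait time */
};

/* Reset all state except the two accumulated counters. */
static void reset_keep_sleeptimes(struct fake_tick_sched *ts)
{
	long long idle_sleeptime, iowait_sleeptime;

	assert(ts != NULL);

	/* Save the counters, wipe everything, then put them back. */
	idle_sleeptime = ts->idle_sleeptime;
	iowait_sleeptime = ts->iowait_sleeptime;
	memset(ts, 0, sizeof(*ts));
	ts->idle_sleeptime = idle_sleeptime;
	ts->iowait_sleeptime = iowait_sleeptime;
}
```

Every other field is still zeroed, so only the two counters survive
the reset.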

With this patch:

	        user nice system    idle iowait
	cpu   129913   17  17590  166512    704
	cpu1   15916    3   2395   20989     47

take cpu1 offline and bring it back online

	        user nice system    idle iowait
	cpu   130089   17  17686  184625    711
	cpu1   15942    3   2401   23088     47

Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
---
 kernel/time/tick-sched.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3e2dc9b8858c..8103bad7bbd6 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1375,13 +1375,22 @@ void tick_setup_sched_timer(void)
 void tick_cancel_sched_timer(int cpu)
 {
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+	ktime_t idle_sleeptime, iowait_sleeptime;
 
 # ifdef CONFIG_HIGH_RES_TIMERS
 	if (ts->sched_timer.base)
 		hrtimer_cancel(&ts->sched_timer);
 # endif
 
+	/* save off and restore the idle_sleeptime and the iowait_sleeptime
+	 * to avoid discontinuities and ensure that they are monotonically
+	 * increasing
+	 */
+	idle_sleeptime = ts->idle_sleeptime;
+	iowait_sleeptime = ts->iowait_sleeptime;
 	memset(ts, 0, sizeof(*ts));
+	ts->idle_sleeptime = idle_sleeptime;
+	ts->iowait_sleeptime = iowait_sleeptime;
 }
 #endif
 
-- 
2.25.3



* [PATCH 2/2] /proc/stat: Simplify iowait and idle calculations when cpu is offline
  2020-06-10 21:05 [PATCH 0/2] iowait and idle fixes in /proc/stat Tom Hromatka
  2020-06-10 21:05 ` [PATCH 1/2] tick-sched: Do not clear the iowait and idle times Tom Hromatka
@ 2020-06-10 21:05 ` Tom Hromatka
  2020-07-14 18:55 ` [PATCH 0/2] iowait and idle fixes in /proc/stat Tom Hromatka
  2 siblings, 0 replies; 4+ messages in thread
From: Tom Hromatka @ 2020-06-10 21:05 UTC
  To: tom.hromatka, linux-kernel, linux-fsdevel
  Cc: fweisbec, tglx, mingo, adobriyan

A customer reported that when a cpu goes offline, the iowait and idle
times reported in /proc/stat will sometimes spike.  This happens
because /proc/stat reads these values from a different data source
when a cpu is offline.

Prior to this patch:

put the system under heavy load so that there is little idle time

	       user nice system    idle iowait
	cpu  109515   17  32111  220686    607

take cpu1 offline

	       user nice system    idle iowait
	cpu  113742   17  32721  220724    612

bring cpu1 back online

	       user nice system    idle iowait
	cpu  118332   17  33430  220687    607

To prevent this, let's use the same data source whether a cpu is
online or not.
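
For illustration only: a user-space sketch of the source selection
before and after this change.  The struct and its fields are
hypothetical stand-ins for cpu_online(), get_cpu_idle_time_us() and
kcs->cpustat[CPUTIME_IDLE]; the values used with it are made up.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/* Hypothetical snapshot of the two idle-time sources for one cpu. */
struct idle_sources {
	int online;		/* stand-in for cpu_online(cpu) */
	uint64_t nohz_usecs;	/* stand-in for get_cpu_idle_time_us() */
	uint64_t cpustat_nsecs;	/* stand-in for cpustat[CPUTIME_IDLE] */
};

/* Before the patch: the source depends on the cpu's online state. */
static uint64_t idle_before_patch(const struct idle_sources *s)
{
	uint64_t idle_usecs = UINT64_MAX;	/* plays the role of -1ULL */

	assert(s);
	if (s->online)
		idle_usecs = s->nohz_usecs;
	if (idle_usecs == UINT64_MAX)
		return s->cpustat_nsecs;
	return idle_usecs * NSEC_PER_USEC;
}

/* After the patch: always the tick-sched counter, online or not. */
static uint64_t idle_after_patch(const struct idle_sources *s)
{
	assert(s);
	return s->nohz_usecs * NSEC_PER_USEC;
}
```

If the two sources disagree, the pre-patch result jumps whenever
"online" flips, while the post-patch result does not.  The sketch does
not model the case where the helper itself returns the -1ULL sentinel,
which the pre-patch code also handled.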

With this patch:

put the system under heavy load so that there is little idle time

	       user nice system    idle iowait
	cpu   14096   16   4646  157687    426

take cpu1 offline

	       user nice system    idle iowait
	cpu   21614   16   7179  157687    426

bring cpu1 back online

	       user nice system    idle iowait
	cpu   27362   16   9555  157688    426

Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
---
 fs/proc/stat.c | 24 ++++++------------------
 1 file changed, 6 insertions(+), 18 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 46b3293015fe..35b92539e711 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -47,32 +47,20 @@ static u64 get_iowait_time(struct kernel_cpustat *kcs, int cpu)
 
 static u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
 {
-	u64 idle, idle_usecs = -1ULL;
+	u64 idle, idle_usecs;
 
-	if (cpu_online(cpu))
-		idle_usecs = get_cpu_idle_time_us(cpu, NULL);
-
-	if (idle_usecs == -1ULL)
-		/* !NO_HZ or cpu offline so we can rely on cpustat.idle */
-		idle = kcs->cpustat[CPUTIME_IDLE];
-	else
-		idle = idle_usecs * NSEC_PER_USEC;
+	idle_usecs = get_cpu_idle_time_us(cpu, NULL);
+	idle = idle_usecs * NSEC_PER_USEC;
 
 	return idle;
 }
 
 static u64 get_iowait_time(struct kernel_cpustat *kcs, int cpu)
 {
-	u64 iowait, iowait_usecs = -1ULL;
-
-	if (cpu_online(cpu))
-		iowait_usecs = get_cpu_iowait_time_us(cpu, NULL);
+	u64 iowait, iowait_usecs;
 
-	if (iowait_usecs == -1ULL)
-		/* !NO_HZ or cpu offline so we can rely on cpustat.iowait */
-		iowait = kcs->cpustat[CPUTIME_IOWAIT];
-	else
-		iowait = iowait_usecs * NSEC_PER_USEC;
+	iowait_usecs = get_cpu_iowait_time_us(cpu, NULL);
+	iowait = iowait_usecs * NSEC_PER_USEC;
 
 	return iowait;
 }
-- 
2.25.3



* Re: [PATCH 0/2] iowait and idle fixes in /proc/stat
  2020-06-10 21:05 [PATCH 0/2] iowait and idle fixes in /proc/stat Tom Hromatka
  2020-06-10 21:05 ` [PATCH 1/2] tick-sched: Do not clear the iowait and idle times Tom Hromatka
  2020-06-10 21:05 ` [PATCH 2/2] /proc/stat: Simplify iowait and idle calculations when cpu is offline Tom Hromatka
@ 2020-07-14 18:55 ` Tom Hromatka
  2 siblings, 0 replies; 4+ messages in thread
From: Tom Hromatka @ 2020-07-14 18:55 UTC
  To: linux-kernel, linux-fsdevel, tglx; +Cc: fweisbec, mingo, adobriyan

Ping.

Thanks.

Tom


On 6/10/20 3:05 PM, Tom Hromatka wrote:
> A customer is using /proc/stat to track cpu usage in a VM and noted
> that the iowait and idle times can decrease or spike when a cpu goes
> offline and comes back online.
>
> This patchset addresses two issues that can cause iowait and idle
> to fluctuate up and down.  With these changes, cpu iowait and idle
> increase monotonically.
>
> Tom Hromatka (2):
>    tick-sched: Do not clear the iowait and idle times
>    /proc/stat: Simplify iowait and idle calculations when cpu is offline
>
>   fs/proc/stat.c           | 24 ++++++------------------
>   kernel/time/tick-sched.c |  9 +++++++++
>   2 files changed, 15 insertions(+), 18 deletions(-)
>

