LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 00/12] Increased clocksource validation and cleanups (v4)
@ 2015-03-12  4:16 John Stultz
  2015-03-12  4:16 ` [PATCH 01/12] clocksource: Simplify clocks_calc_max_nsecs logic John Stultz
                   ` (12 more replies)
  0 siblings, 13 replies; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

So here is another round of this series, which is the result of
earlier discussions with Linus and his suggestions around
improvements to clocksource validation in the hope we can more
easily catch bad hardware.

There's also a few cleanups Linus suggested as well as a few I've been
meaning to get to for awhile.

I tried in address all the feedback that had been given, adding
the checks behind CONFIG_DEBUG_TIMEKEEPING.  I also sorted out a
sane way to print rate-limited warnings if we see cycle deltas that
are too large, or if they look like small underflows.

I'd like to get this queued into -tip soon so it can get as much
testing in -next as possible.

If there are any objections or feedback, I'd love to hear it!

New in v4:
* Lots and lots of typo corrections and minor cleanups suggested
  by Ingo.
* Dropped "Remove clocksource_max_deferment()" patch
* Added "Rename __clocksource_updatefreq_*..." patch
* I realized one of the patches (Improve clocksource watchdog
  reporting) didn't have a proper cc list, so while it was on lkml
  folks may not have reviewed it before.

thanks
-john

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>

John Stultz (12):
  clocksource: Simplify clocks_calc_max_nsecs logic
  clocksource: Simplify logic around clocksource wrapping safety margins
  clocksource: Add max_cycles to clocksource structure
  time: Add debugging checks to warn if we see delays
  time: Add infrastructure to cap clocksource reads to the max_cycles
    value
  time: Try to catch clocksource delta underflows
  time: Add warnings when overflows or underflows are observed
  clocksource: Improve clocksource watchdog reporting
  clocksource: Mostly kill clocksource_register()
  sparc: Convert to using clocksource_register_hz()
  clocksource: Add some debug info about clocksources being registered
  clocksource: Rename __clocksource_updatefreq_* to
    __clocksource_update_freq_*

 arch/arm/plat-omap/counter_32k.c |   2 +-
 arch/s390/kernel/time.c          |   2 +-
 arch/sparc/kernel/time_32.c      |   6 +-
 drivers/clocksource/em_sti.c     |   2 +-
 drivers/clocksource/sh_cmt.c     |   2 +-
 drivers/clocksource/sh_tmu.c     |   2 +-
 include/linux/clocksource.h      |  26 ++++--
 kernel/time/clocksource.c        | 171 ++++++++++++++++++---------------------
 kernel/time/jiffies.c            |   5 +-
 kernel/time/sched_clock.c        |   6 +-
 kernel/time/timekeeping.c        | 127 ++++++++++++++++++++++++++---
 lib/Kconfig.debug                |  12 +++
 12 files changed, 235 insertions(+), 128 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 01/12] clocksource: Simplify clocks_calc_max_nsecs logic
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:01   ` [tip:timers/core] clocksource: Simplify the clocks_calc_max_nsecs () logic tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins John Stultz
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

The previous clocks_calc_max_nsecs had some unecessarily
complex bit logic to find the max interval that could cause
multiplication overflows. Since this is not in the hot
path, just do the divide to make it easier to read.

The previous implementation also had a subtle issue
that it avoided overflows into signed 64-bit values, where
as the intervals are always unsigned. This resulted in
overly conservative intervals, which other safety margins
were then added to, reducing the intended interval length.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/clocksource.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 4892352..11323f4 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -476,19 +476,10 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 
 	/*
 	 * Calculate the maximum number of cycles that we can pass to the
-	 * cyc2ns function without overflowing a 64-bit signed result. The
-	 * maximum number of cycles is equal to ULLONG_MAX/(mult+maxadj)
-	 * which is equivalent to the below.
-	 * max_cycles < (2^63)/(mult + maxadj)
-	 * max_cycles < 2^(log2((2^63)/(mult + maxadj)))
-	 * max_cycles < 2^(log2(2^63) - log2(mult + maxadj))
-	 * max_cycles < 2^(63 - log2(mult + maxadj))
-	 * max_cycles < 1 << (63 - log2(mult + maxadj))
-	 * Please note that we add 1 to the result of the log2 to account for
-	 * any rounding errors, ensure the above inequality is satisfied and
-	 * no overflow will occur.
+	 * cyc2ns function without overflowing a 64-bit result.
 	 */
-	max_cycles = 1ULL << (63 - (ilog2(mult + maxadj) + 1));
+	max_cycles = ULLONG_MAX;
+	do_div(max_cycles, mult+maxadj);
 
 	/*
 	 * The actual maximum number of cycles we can defer the clocksource is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
  2015-03-12  4:16 ` [PATCH 01/12] clocksource: Simplify clocks_calc_max_nsecs logic John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-12  5:55   ` Ingo Molnar
  2015-03-13  9:01   ` [tip:timers/core] clocksource: Simplify the " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 03/12] clocksource: Add max_cycles to clocksource structure John Stultz
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

The clocksource logic has a number of places where we try to
include a safety margin. Most of these are 12% safety margins,
but they are inconsistently applied and sometimes are applied
on top of each other.

Additionally, in the previous patch, we corrected an issue
where we unintentionally in effect created a 50% safety margin,
which these 12.5% margins where then added to.

So to simplify the logic here, this patch removes the various
12.5% margins, and consolidates adding the margin in one place:
clocks_calc_max_nsecs().

Additionally, Linus prefers a 50% safety margin, as it allows
bad clock values to be more easily caught. This should really
have no net effect, due to the corrected issue earlier which
caused greater then 50% margins to be used w/o issue.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org> (for sched_clock.c bit)
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/clocksource.c | 26 ++++++++++++--------------
 kernel/time/sched_clock.c |  4 ++--
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 11323f4..fe64c7f 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -469,6 +469,9 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
  * @shift:	cycle to nanosecond divisor (power of two)
  * @maxadj:	maximum adjustment value to mult (~11%)
  * @mask:	bitmask for two's complement subtraction of non 64 bit counters
+ *
+ * NOTE: This function includes a safety margin of 50%, so that bad clock values
+ * can be detected.
  */
 u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 {
@@ -490,11 +493,14 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 	max_cycles = min(max_cycles, mask);
 	max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift);
 
+	/* Return 50% of the actual maximum, so we can detect bad values */
+	max_nsecs >>= 1;
+
 	return max_nsecs;
 }
 
 /**
- * clocksource_max_deferment - Returns max time the clocksource can be deferred
+ * clocksource_max_deferment - Returns max time the clocksource should be deferred
  * @cs:         Pointer to clocksource
  *
  */
@@ -504,13 +510,7 @@ static u64 clocksource_max_deferment(struct clocksource *cs)
 
 	max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj,
 					  cs->mask);
-	/*
-	 * To ensure that the clocksource does not wrap whilst we are idle,
-	 * limit the time the clocksource can be deferred by 12.5%. Please
-	 * note a margin of 12.5% is used because this can be computed with
-	 * a shift, versus say 10% which would require division.
-	 */
-	return max_nsecs - (max_nsecs >> 3);
+	return max_nsecs;
 }
 
 #ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
@@ -659,10 +659,9 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 	 * conversion precision. 10 minutes is still a reasonable
 	 * amount. That results in a shift value of 24 for a
 	 * clocksource with mask >= 40bit and f >= 4GHz. That maps to
-	 * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
-	 * margin as we do in clocksource_max_deferment()
+	 * ~ 0.06ppm granularity for NTP.
 	 */
-	sec = (cs->mask - (cs->mask >> 3));
+	sec = cs->mask;
 	do_div(sec, freq);
 	do_div(sec, scale);
 	if (!sec)
@@ -674,9 +673,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 			       NSEC_PER_SEC / scale, sec * scale);
 
 	/*
-	 * for clocksources that have large mults, to avoid overflow.
-	 * Since mult may be adjusted by ntp, add an safety extra margin
-	 *
+	 * Ensure clocksources that have large mults don't overflow
+	 * when adjusted.
 	 */
 	cs->maxadj = clocksource_max_adjustment(cs);
 	while ((cs->mult + cs->maxadj < cs->mult)
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 01d2d15..c794b84 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -125,9 +125,9 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 
 	new_mask = CLOCKSOURCE_MASK(bits);
 
-	/* calculate how many ns until we wrap */
+	/* calculate how many ns until we risk wrapping */
 	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
-	new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+	new_wrap_kt = ns_to_ktime(wrap);
 
 	/* update epoch for new counter and update epoch_ns from old counter*/
 	new_epoch = read();
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 03/12] clocksource: Add max_cycles to clocksource structure
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
  2015-03-12  4:16 ` [PATCH 01/12] clocksource: Simplify clocks_calc_max_nsecs logic John Stultz
  2015-03-12  4:16 ` [PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:01   ` [tip:timers/core] clocksource: Add 'max_cycles' to ' struct clocksource' tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 04/12] time: Add debugging checks to warn if we see delays John Stultz
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

In order to facilitate some clocksource validation,
add a max_cycles entry to the structure which will
hold the maximum cycle value that can safely be
multiplied without potentially causing an overflow.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/linux/clocksource.h |  6 ++++--
 kernel/time/clocksource.c   | 29 +++++++++++++++++------------
 kernel/time/sched_clock.c   |  2 +-
 3 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 9c78d15..63fe52f 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -56,6 +56,7 @@ struct module;
  * @shift:		cycle to nanosecond divisor (power of two)
  * @max_idle_ns:	max idle time permitted by the clocksource (nsecs)
  * @maxadj:		maximum adjustment value to mult (~11%)
+ * @max_cycles:		maximum safe cycle value which won't overflow on mult
  * @flags:		flags describing special properties
  * @archdata:		arch-specific data
  * @suspend:		suspend function for the clocksource, if necessary
@@ -76,7 +77,7 @@ struct clocksource {
 #ifdef CONFIG_ARCH_CLOCKSOURCE_DATA
 	struct arch_clocksource_data archdata;
 #endif
-
+	u64 max_cycles;
 	const char *name;
 	struct list_head list;
 	int rating;
@@ -189,7 +190,8 @@ extern struct clocksource * __init clocksource_default_clock(void);
 extern void clocksource_mark_unstable(struct clocksource *cs);
 
 extern u64
-clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask);
+clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask,
+							u64 *max_cycles);
 extern void
 clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index fe64c7f..a491803 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -469,11 +469,14 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
  * @shift:	cycle to nanosecond divisor (power of two)
  * @maxadj:	maximum adjustment value to mult (~11%)
  * @mask:	bitmask for two's complement subtraction of non 64 bit counters
+ * @max_cyc:	maximum cycle value before potential overflow (does not include
+ *		any safety margin)
  *
  * NOTE: This function includes a safety margin of 50%, so that bad clock values
  * can be detected.
  */
-u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
+u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask,
+								u64 *max_cyc)
 {
 	u64 max_nsecs, max_cycles;
 
@@ -493,6 +496,10 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 	max_cycles = min(max_cycles, mask);
 	max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift);
 
+	/* return the max_cycles value as well if requested */
+	if (max_cyc)
+		*max_cyc = max_cycles;
+
 	/* Return 50% of the actual maximum, so we can detect bad values */
 	max_nsecs >>= 1;
 
@@ -500,17 +507,15 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 }
 
 /**
- * clocksource_max_deferment - Returns max time the clocksource should be deferred
- * @cs:         Pointer to clocksource
+ * clocksource_update_max_deferment - Updates the clocksource max_idle_ns & max_cycles
+ * @cs:         Pointer to clocksource to be updated
  *
  */
-static u64 clocksource_max_deferment(struct clocksource *cs)
+static inline void clocksource_update_max_deferment(struct clocksource *cs)
 {
-	u64 max_nsecs;
-
-	max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj,
-					  cs->mask);
-	return max_nsecs;
+	cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift,
+						cs->maxadj, cs->mask,
+						&cs->max_cycles);
 }
 
 #ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
@@ -684,7 +689,7 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 		cs->maxadj = clocksource_max_adjustment(cs);
 	}
 
-	cs->max_idle_ns = clocksource_max_deferment(cs);
+	clocksource_update_max_deferment(cs);
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
 
@@ -730,8 +735,8 @@ int clocksource_register(struct clocksource *cs)
 		"Clocksource %s might overflow on 11%% adjustment\n",
 		cs->name);
 
-	/* calculate max idle time permitted for this clocksource */
-	cs->max_idle_ns = clocksource_max_deferment(cs);
+	/* Update max idle time permitted for this clocksource */
+	clocksource_update_max_deferment(cs);
 
 	mutex_lock(&clocksource_mutex);
 	clocksource_enqueue(cs);
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index c794b84..d43855b 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -126,7 +126,7 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	new_mask = CLOCKSOURCE_MASK(bits);
 
 	/* calculate how many ns until we risk wrapping */
-	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
+	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask, NULL);
 	new_wrap_kt = ns_to_ktime(wrap);
 
 	/* update epoch for new counter and update epoch_ns from old counter*/
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 04/12] time: Add debugging checks to warn if we see delays
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (2 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 03/12] clocksource: Add max_cycles to clocksource structure John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-12  6:32   ` Ingo Molnar
  2015-03-13  9:02   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 05/12] time: Add infrastructure to cap clocksource reads to the max_cycles value John Stultz
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

Recently there's been some request for better sanity
checking in the time code, so that its more clear
when something is going wrong since timekeeping issues
could manifest in a large number of strange ways with
various subsystems.

Thus, this patch adds some extra infrastructure to
add a check update_wall_time() to print warnings if we
see the call delayed beyond the max_cycles overflow
point, or beyond the clocksource max_idle_ns value
which is currently 50% of the overflow point.

This extra infrastructure is conditionalized
behind a new CONFIG_DEBUG_TIMEKEEPING option
also added in this patch.

Tested this a bit by halting qemu for specified
lengths of time to trigger the warnings.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/jiffies.c     |  1 +
 kernel/time/timekeeping.c | 27 +++++++++++++++++++++++++++
 lib/Kconfig.debug         | 12 ++++++++++++
 3 files changed, 40 insertions(+)

diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index a6a5bf5..7e41390 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -71,6 +71,7 @@ static struct clocksource clocksource_jiffies = {
 	.mask		= 0xffffffff, /*32bits*/
 	.mult		= NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
 	.shift		= JIFFIES_SHIFT,
+	.max_cycles	= 10,
 };
 
 __cacheline_aligned_in_smp DEFINE_SEQLOCK(jiffies_lock);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 91db941..882ba5b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -118,6 +118,30 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 	tk->offs_boot = ktime_add(tk->offs_boot, delta);
 }
 
+#ifdef CONFIG_DEBUG_TIMEKEEPING
+static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+
+	cycle_t max_cycles = tk->tkr.clock->max_cycles;
+	const char *name = tk->tkr.clock->name;
+
+	if (offset > max_cycles) {
+		printk_deferred("ERROR: cycle offset (%lld) is larger than  allowed %s max_cycles (%lld)\n",
+					offset, name, max_cycles);
+	} else {
+		if (offset > (max_cycles >> 1)) {
+			printk_deferred("WARNING: cycle offset (%lld) is past the %s 50%% safety margin (%lld)\n",
+					offset, name, max_cycles>>1);
+		}
+	}
+}
+#else
+static inline
+void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+}
+#endif
+
 /**
  * tk_setup_internals - Set up internals to use clocksource clock.
  *
@@ -1630,6 +1654,9 @@ void update_wall_time(void)
 	if (offset < real_tk->cycle_interval)
 		goto out;
 
+	/* Do some additional sanity checking */
+	timekeeping_check_update(real_tk, offset);
+
 	/*
 	 * With NO_HZ we may have to accumulate many cycle_intervals
 	 * (think "ticks") worth of time at once. To do this efficiently,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c5cefb3..408d32e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -865,6 +865,18 @@ config SCHED_STACK_END_CHECK
 	  data corruption or a sporadic crash at a later stage once the region
 	  is examined. The runtime overhead introduced is minimal.
 
+config DEBUG_TIMEKEEPING
+	bool "Enable extra timekeeping sanity checking"
+	help
+	  This option will enable additional timekeeping sanity checks
+	  which may be helpful when diagnosing issues where timekeeping
+	  problems are suspected.
+
+	  This may include checks in the timekeeping hotpaths, so this
+	  option may have a performance impact to some workloads.
+
+	  If unsure, say N.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 05/12] time: Add infrastructure to cap clocksource reads to the max_cycles value
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (3 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 04/12] time: Add debugging checks to warn if we see delays John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:02   ` [tip:timers/core] timekeeping: Add checks to cap clocksource reads to the 'max_cycles' value tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 06/12] time: Try to catch clocksource delta underflows John Stultz
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

When calculating the current delta since the last tick, we
currently have no hard protections to prevent a multiplication
overflow from occurring.

This patch introduces infrastructure to allow a cap that
limits the read delta value to the max_cycles value, which is
where an overflow would occur.

Since this is in the hotpath, it adds the extra checking under
CONFIG_DEBUG_TIMEKEEPING.

There was some concern that capping time like this could cause
problems as we may stop expiring timers, which could go circular
if the timer that triggers time accumulation were mis-scheduled
too far in the future, which would cause time to stop.

However, since the mult overflow would result in a smaller time
value, we would effectively have the same problem there.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c | 45 +++++++++++++++++++++++++++++++++------------
 1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 882ba5b..c06fe6e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -135,11 +135,40 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 		}
 	}
 }
+
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+	cycle_t cycle_now, delta;
+
+	/* read clocksource */
+	cycle_now = tkr->read(tkr->clock);
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+
+	/* Cap delta value to the max_cycles values to avoid mult overflows */
+	if (unlikely(delta > tkr->clock->max_cycles))
+		delta = tkr->clock->max_cycles;
+
+	return delta;
+}
 #else
 static inline
 void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 {
 }
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+	cycle_t cycle_now, delta;
+
+	/* read clocksource */
+	cycle_now = tkr->read(tkr->clock);
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+
+	return delta;
+}
 #endif
 
 /**
@@ -217,14 +246,10 @@ static inline u32 arch_gettimeoffset(void) { return 0; }
 
 static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 {
-	cycle_t cycle_now, delta;
+	cycle_t delta;
 	s64 nsec;
 
-	/* read clocksource: */
-	cycle_now = tkr->read(tkr->clock);
-
-	/* calculate the delta since the last update_wall_time: */
-	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+	delta = timekeeping_get_delta(tkr);
 
 	nsec = delta * tkr->mult + tkr->xtime_nsec;
 	nsec >>= tkr->shift;
@@ -236,14 +261,10 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
 {
 	struct clocksource *clock = tk->tkr.clock;
-	cycle_t cycle_now, delta;
+	cycle_t delta;
 	s64 nsec;
 
-	/* read clocksource: */
-	cycle_now = tk->tkr.read(clock);
-
-	/* calculate the delta since the last update_wall_time: */
-	delta = clocksource_delta(cycle_now, tk->tkr.cycle_last, tk->tkr.mask);
+	delta = timekeeping_get_delta(&tk->tkr);
 
 	/* convert delta to nanoseconds. */
 	nsec = clocksource_cyc2ns(delta, clock->mult, clock->shift);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 06/12] time: Try to catch clocksource delta underflows
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (4 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 05/12] time: Add infrastructure to cap clocksource reads to the max_cycles value John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:02   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 07/12] time: Add warnings when overflows or underflows are observed John Stultz
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

In the case where there is a broken clocksource
where there are multiple actual clocks that
aren't perfectly aligned, we may see small "negative"
deltas when we subtract 'now' from 'cycle_last'.

The values are actually negative with respect to the
clocksource mask value, not necessarily negative
if cast to a s64, but we can check by checking the
delta see if it is a small (relative to the mask)
negative value (again negative relative to the mask).

If so, we assume we jumped backwards somehow and
instead use zero for our delta.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index c06fe6e..2ba62fe 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -146,6 +146,13 @@ static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
 	/* calculate the delta since the last update_wall_time */
 	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
 
+	/*
+	 * Try to catch underflows by checking if we are seeing small
+	 * mask-relative negative values.
+	 */
+	if (unlikely((~delta & tkr->mask) < (tkr->mask >> 3)))
+		delta = 0;
+
 	/* Cap delta value to the max_cycles values to avoid mult overflows */
 	if (unlikely(delta > tkr->clock->max_cycles))
 		delta = tkr->clock->max_cycles;
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 07/12] time: Add warnings when overflows or underflows are observed
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (5 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 06/12] time: Try to catch clocksource delta underflows John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-12  7:37   ` Ingo Molnar
  2015-03-13  9:03   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 08/12] clocksource: Improve clocksource watchdog reporting John Stultz
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

It was suggested that the underflow/overflow protection
should probably throw some sort of warning out, rather
then just silently fixing the issue.

So this patch adds some warnings here. The flag variables
used are not protected by locks, but since we can't print
from the reading functions, just being able to say we
saw an issue in the update interval is useful enough,
and can be slightly racy without real consequence.

The big complication is that we're only under a read
seqlock, so the data could shift under us during
our calculation to see if there was a problem. This
patch avoids this issue by nesting another seqlock
which allows us to snapshot the just required values
atomically. So we shouldn't see false positives.

I also added some basic rate-limiting here, since
on one build machine w/ skewed TSCs it was fairly
noisy at bootup.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c | 62 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 55 insertions(+), 7 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 2ba62fe..a7204ad 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -119,6 +119,20 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 }
 
 #ifdef CONFIG_DEBUG_TIMEKEEPING
+#define WARNING_FREQ (HZ*300) /* 5 minute rate-limiting */
+/*
+ * These simple flag variables are managed
+ * without locks, which is racy, but ok since
+ * we don't really care about being super
+ * precise about how many events were seen,
+ * just that a problem was observed.
+ */
+static int timekeeping_underflow_seen;
+static int timekeeping_overflow_seen;
+
+/* last_warning is only modified under the timekeeping lock */
+static long timekeeping_last_warning;
+
 static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 {
 
@@ -134,28 +148,62 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 					offset, name, max_cycles>>1);
 		}
 	}
+
+	if (timekeeping_underflow_seen) {
+		if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
+			printk_deferred("WARNING: Clocksource %s underflow observed. You should report\n", name);
+			printk_deferred("         this, consider using a different clocksource.\n");
+			timekeeping_last_warning = jiffies;
+		}
+		timekeeping_underflow_seen = 0;
+	}
+
+	if (timekeeping_overflow_seen) {
+		if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
+			printk_deferred("WARNING: Clocksource %s overflow observed. You should report\n", name);
+			printk_deferred("         this, consider using a different clocksource.\n");
+			timekeeping_last_warning = jiffies;
+		}
+		timekeeping_overflow_seen = 0;
+	}
 }
 
 static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
 {
-	cycle_t cycle_now, delta;
+	cycle_t now, last, mask, max, delta;
+	unsigned int seq;
 
-	/* read clocksource */
-	cycle_now = tkr->read(tkr->clock);
+	/*
+	 * Since we're called holding a seqlock, the data may shift
+	 * under us while we're doing the calculation. This can cause
+	 * false positives, since we'd note a problem but throw the
+	 * results away. So nest another seqlock here to atomically
+	 * grab the points we are checking with.
+	 */
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		now = tkr->read(tkr->clock);
+		last = tkr->cycle_last;
+		mask = tkr->mask;
+		max = tkr->clock->max_cycles;
+	} while (read_seqcount_retry(&tk_core.seq, seq));
 
-	/* calculate the delta since the last update_wall_time */
-	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+	delta = clocksource_delta(now, last, mask);
 
 	/*
 	 * Try to catch underflows by checking if we are seeing small
 	 * mask-relative negative values.
 	 */
-	if (unlikely((~delta & tkr->mask) < (tkr->mask >> 3)))
+	if (unlikely((~delta & mask) < (mask >> 3))) {
+		timekeeping_underflow_seen = 1;
 		delta = 0;
+	}
 
 	/* Cap delta value to the max_cycles values to avoid mult overflows */
-	if (unlikely(delta > tkr->clock->max_cycles))
+	if (unlikely(delta > max)) {
+		timekeeping_overflow_seen = 1;
 		delta = tkr->clock->max_cycles;
+	}
 
 	return delta;
 }
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 08/12] clocksource: Improve clocksource watchdog reporting
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (6 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 07/12] time: Add warnings when overflows or underflows are observed John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:03   ` [tip:timers/core] " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 09/12] clocksource: Mostly kill clocksource_register() John Stultz
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

The clocksource watchdog reporting has been less helpful
then desired, as it just printed the delta between
the two clocksources. This prevents any useful analysis
of why the skew occurred.

Thus this patch tries to improve the output when we
mark a clocksource as unstable, printing out the cycle
last and now values for both the current clocksource
and the watchdog clocksource. This will allow us to see
if the result was due to a false positive caused by
a problematic watchdog.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/clocksource.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index a491803..340461e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -142,13 +142,6 @@ static void __clocksource_unstable(struct clocksource *cs)
 		schedule_work(&watchdog_work);
 }
 
-static void clocksource_unstable(struct clocksource *cs, int64_t delta)
-{
-	printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
-	       cs->name, delta);
-	__clocksource_unstable(cs);
-}
-
 /**
  * clocksource_mark_unstable - mark clocksource unstable via watchdog
  * @cs:		clocksource to be marked unstable
@@ -174,7 +167,7 @@ void clocksource_mark_unstable(struct clocksource *cs)
 static void clocksource_watchdog(unsigned long data)
 {
 	struct clocksource *cs;
-	cycle_t csnow, wdnow, delta;
+	cycle_t csnow, wdnow, cslast, wdlast, delta;
 	int64_t wd_nsec, cs_nsec;
 	int next_cpu, reset_pending;
 
@@ -213,6 +206,8 @@ static void clocksource_watchdog(unsigned long data)
 
 		delta = clocksource_delta(csnow, cs->cs_last, cs->mask);
 		cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+		wdlast = cs->wd_last; /* save these in case we print them */
+		cslast = cs->cs_last;
 		cs->cs_last = csnow;
 		cs->wd_last = wdnow;
 
@@ -221,7 +216,12 @@ static void clocksource_watchdog(unsigned long data)
 
 		/* Check the deviation from the watchdog clocksource. */
 		if ((abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD)) {
-			clocksource_unstable(cs, cs_nsec - wd_nsec);
+			pr_warn("Watchdog: clocksource %s unstable\n", cs->name);
+			pr_warn("	%s wd_now: %llx wd_last: %llx mask: %llx\n",
+				watchdog->name, wdnow, wdlast, watchdog->mask);
+			pr_warn("	%s cs_now: %llx cs_last: %llx mask: %llx\n",
+				cs->name, csnow, cslast, cs->mask);
+			__clocksource_unstable(cs);
 			continue;
 		}
 
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 09/12] clocksource: Mostly kill clocksource_register()
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (7 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 08/12] clocksource: Improve clocksource watchdog reporting John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:03   ` [tip:timers/core] " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 10/12] sparc: Convert to using clocksource_register_hz() John Stultz
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra, David S. Miller, Martin Schwidefsky

A long running project has been to cleanup remaining uses
of clocksource_register(), replacing it with the simpler
clocksource_register_khz/hz() functions.

However, there are a few cases where we need to self-define
our mult/shift values, so switch the function to a more
obviously internal __clocksource_register() name, and
consolidate much of the internal logic so we don't have
duplication.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 arch/s390/kernel/time.c     |  2 +-
 arch/sparc/kernel/time_32.c |  2 +-
 include/linux/clocksource.h | 10 +++++-
 kernel/time/clocksource.c   | 81 +++++++++++++++++++--------------------------
 kernel/time/jiffies.c       |  4 +--
 5 files changed, 47 insertions(+), 52 deletions(-)

diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index 20660dd..6c273cd 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -283,7 +283,7 @@ void __init time_init(void)
 	if (register_external_irq(EXT_IRQ_TIMING_ALERT, timing_alert_interrupt))
 		panic("Couldn't request external interrupt 0x1406");
 
-	if (clocksource_register(&clocksource_tod) != 0)
+	if (__clocksource_register(&clocksource_tod) != 0)
 		panic("Could not register TOD clock source");
 
 	/* Enable TOD clock interrupts on the boot cpu. */
diff --git a/arch/sparc/kernel/time_32.c b/arch/sparc/kernel/time_32.c
index 2f80d23..a31c0c8 100644
--- a/arch/sparc/kernel/time_32.c
+++ b/arch/sparc/kernel/time_32.c
@@ -191,7 +191,7 @@ static __init int setup_timer_cs(void)
 	timer_cs.mult = clocksource_hz2mult(sparc_config.clock_rate,
 	                                    timer_cs.shift);
 
-	return clocksource_register(&timer_cs);
+	return __clocksource_register(&timer_cs);
 }
 
 #ifdef CONFIG_SMP
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 63fe52f..c064349 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -179,7 +179,6 @@ static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
 }
 
 
-extern int clocksource_register(struct clocksource*);
 extern int clocksource_unregister(struct clocksource*);
 extern void clocksource_touch_watchdog(void);
 extern struct clocksource* clocksource_get_next(void);
@@ -204,6 +203,15 @@ __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
 extern void
 __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);
 
+/*
+ * Dont' call this unless you're a default clocksource
+ * (AKA: jiffies) and absolutely have to.
+ */
+static inline int __clocksource_register(struct clocksource *cs)
+{
+	return __clocksource_register_scale(cs, 1, 0);
+}
+
 static inline int clocksource_register_hz(struct clocksource *cs, u32 hz)
 {
 	return __clocksource_register_scale(cs, 1, hz);
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 340461e..e31bd60 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -657,38 +657,52 @@ static void clocksource_enqueue(struct clocksource *cs)
 void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 	u64 sec;
+
 	/*
-	 * Calc the maximum number of seconds which we can run before
-	 * wrapping around. For clocksources which have a mask > 32bit
-	 * we need to limit the max sleep time to have a good
-	 * conversion precision. 10 minutes is still a reasonable
-	 * amount. That results in a shift value of 24 for a
-	 * clocksource with mask >= 40bit and f >= 4GHz. That maps to
-	 * ~ 0.06ppm granularity for NTP.
+	 * Default clocksources are *special* and self-define their mult/shift.
+	 * But, you're not special, so you should specify a freq value.
 	 */
-	sec = cs->mask;
-	do_div(sec, freq);
-	do_div(sec, scale);
-	if (!sec)
-		sec = 1;
-	else if (sec > 600 && cs->mask > UINT_MAX)
-		sec = 600;
-
-	clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
-			       NSEC_PER_SEC / scale, sec * scale);
-
+	if (freq) {
+		/*
+		 * Calc the maximum number of seconds which we can run before
+		 * wrapping around. For clocksources which have a mask > 32-bit
+		 * we need to limit the max sleep time to have a good
+		 * conversion precision. 10 minutes is still a reasonable
+		 * amount. That results in a shift value of 24 for a
+		 * clocksource with mask >= 40-bit and f >= 4GHz. That maps to
+		 * ~ 0.06ppm granularity for NTP.
+		 */
+		sec = cs->mask;
+		do_div(sec, freq);
+		do_div(sec, scale);
+		if (!sec)
+			sec = 1;
+		else if (sec > 600 && cs->mask > UINT_MAX)
+			sec = 600;
+
+		clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
+				       NSEC_PER_SEC / scale, sec * scale);
+	}
 	/*
 	 * Ensure clocksources that have large mults don't overflow
 	 * when adjusted.
 	 */
 	cs->maxadj = clocksource_max_adjustment(cs);
-	while ((cs->mult + cs->maxadj < cs->mult)
-		|| (cs->mult - cs->maxadj > cs->mult)) {
+	while (freq && ((cs->mult + cs->maxadj < cs->mult)
+		|| (cs->mult - cs->maxadj > cs->mult))) {
 		cs->mult >>= 1;
 		cs->shift--;
 		cs->maxadj = clocksource_max_adjustment(cs);
 	}
 
+	/*
+	 * Only warn for *special* clocksources that self-define
+	 * their mult/shift values and don't specify a freq.
+	 */
+	WARN_ONCE(cs->mult + cs->maxadj < cs->mult,
+		"Clocksource %s might overflow on 11%% adjustment\n",
+		cs->name);
+
 	clocksource_update_max_deferment(cs);
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
@@ -720,33 +734,6 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
 }
 EXPORT_SYMBOL_GPL(__clocksource_register_scale);
 
-
-/**
- * clocksource_register - Used to install new clocksources
- * @cs:		clocksource to be registered
- *
- * Returns -EBUSY if registration fails, zero otherwise.
- */
-int clocksource_register(struct clocksource *cs)
-{
-	/* calculate max adjustment for given mult/shift */
-	cs->maxadj = clocksource_max_adjustment(cs);
-	WARN_ONCE(cs->mult + cs->maxadj < cs->mult,
-		"Clocksource %s might overflow on 11%% adjustment\n",
-		cs->name);
-
-	/* Update max idle time permitted for this clocksource */
-	clocksource_update_max_deferment(cs);
-
-	mutex_lock(&clocksource_mutex);
-	clocksource_enqueue(cs);
-	clocksource_enqueue_watchdog(cs);
-	clocksource_select();
-	mutex_unlock(&clocksource_mutex);
-	return 0;
-}
-EXPORT_SYMBOL(clocksource_register);
-
 static void __clocksource_change_rating(struct clocksource *cs, int rating)
 {
 	list_del(&cs->list);
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index 7e41390..c4bb518 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -95,7 +95,7 @@ EXPORT_SYMBOL(jiffies);
 
 static int __init init_jiffies_clocksource(void)
 {
-	return clocksource_register(&clocksource_jiffies);
+	return __clocksource_register(&clocksource_jiffies);
 }
 
 core_initcall(init_jiffies_clocksource);
@@ -131,6 +131,6 @@ int register_refined_jiffies(long cycles_per_second)
 
 	refined_jiffies.mult = ((u32)nsec_per_tick) << JIFFIES_SHIFT;
 
-	clocksource_register(&refined_jiffies);
+	__clocksource_register(&refined_jiffies);
 	return 0;
 }
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 10/12] sparc: Convert to using clocksource_register_hz()
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (8 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 09/12] clocksource: Mostly kill clocksource_register() John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:04   ` [tip:timers/core] clocksource, sparc32: " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 11/12] clocksource: Add some debug info about clocksources being registered John Stultz
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra, David S. Miller

While cleaning up some clocksource code, I noticed the
time_32 implementation uses the clocksource_hz2mult()
helper, but doesn't use the clocksource_register_hz()
method.

I don't believe the Sparc clocksource is a default
clocksource, so we shouldn't need to self-define
the mult/shift pair.

So convert the time_32.c implementation to use
clocksource_register_hz().

Untested.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 arch/sparc/kernel/time_32.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/sparc/kernel/time_32.c b/arch/sparc/kernel/time_32.c
index a31c0c8..18147a5 100644
--- a/arch/sparc/kernel/time_32.c
+++ b/arch/sparc/kernel/time_32.c
@@ -181,17 +181,13 @@ static struct clocksource timer_cs = {
 	.rating	= 100,
 	.read	= timer_cs_read,
 	.mask	= CLOCKSOURCE_MASK(64),
-	.shift	= 2,
 	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 static __init int setup_timer_cs(void)
 {
 	timer_cs_enabled = 1;
-	timer_cs.mult = clocksource_hz2mult(sparc_config.clock_rate,
-	                                    timer_cs.shift);
-
-	return __clocksource_register(&timer_cs);
+	return clocksource_register_hz(&timer_cs, sparc_config.clock_rate);
 }
 
 #ifdef CONFIG_SMP
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 11/12] clocksource: Add some debug info about clocksources being registered
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (9 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 10/12] sparc: Convert to using clocksource_register_hz() John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:04   ` [tip:timers/core] " tip-bot for John Stultz
  2015-03-12  4:16 ` [PATCH 12/12] clocksource: Rename __clocksource_updatefreq_* to __clocksource_update_freq_* John Stultz
  2015-03-12  9:19 ` [PATCH 00/12] Increased clocksource validation and cleanups (v4) Ingo Molnar
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra

Print the mask, max_cycles, and max_idle_ns values for clocksources
being registered.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/clocksource.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index e31bd60..757d770 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -704,6 +704,9 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 		cs->name);
 
 	clocksource_update_max_deferment(cs);
+
+	pr_info("clocksource %s: mask: 0x%llx max_cycles: 0x%llx, max_idle_ns: %lld ns\n",
+			cs->name, cs->mask, cs->max_cycles, cs->max_idle_ns);
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
 
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 12/12] clocksource: Rename __clocksource_updatefreq_* to __clocksource_update_freq_*
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (10 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 11/12] clocksource: Add some debug info about clocksources being registered John Stultz
@ 2015-03-12  4:16 ` John Stultz
  2015-03-13  9:04   ` [tip:timers/core] clocksource: Rename __clocksource_updatefreq_*( ) to __clocksource_update_freq_*() tip-bot for John Stultz
  2015-03-12  9:19 ` [PATCH 00/12] Increased clocksource validation and cleanups (v4) Ingo Molnar
  12 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12  4:16 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Ingo Molnar,
	Peter Zijlstra, Daniel Lezcano

Ingo requested this function be renamed to improve readability, so
I've renamed __clocksource_updatefreq_scale() as well as the
__clocksource_updatefreq_hz/khz() functions to avoid
squishedtogethernames.

This touches some of the sh clocksources, which I've not tested.

The arch/arm/plat-omap change is just a comment change for
consistency.

Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 arch/arm/plat-omap/counter_32k.c |  2 +-
 drivers/clocksource/em_sti.c     |  2 +-
 drivers/clocksource/sh_cmt.c     |  2 +-
 drivers/clocksource/sh_tmu.c     |  2 +-
 include/linux/clocksource.h      | 10 +++++-----
 kernel/time/clocksource.c        | 11 ++++++-----
 6 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/arm/plat-omap/counter_32k.c b/arch/arm/plat-omap/counter_32k.c
index 61b4d70..43cf745 100644
--- a/arch/arm/plat-omap/counter_32k.c
+++ b/arch/arm/plat-omap/counter_32k.c
@@ -103,7 +103,7 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)
 
 	/*
 	 * 120000 rough estimate from the calculations in
-	 * __clocksource_updatefreq_scale.
+	 * __clocksource_update_freq_scale.
 	 */
 	clocks_calc_mult_shift(&persistent_mult, &persistent_shift,
 			32768, NSEC_PER_SEC, 120000);
diff --git a/drivers/clocksource/em_sti.c b/drivers/clocksource/em_sti.c
index d0a7bd6..dc3c6ee 100644
--- a/drivers/clocksource/em_sti.c
+++ b/drivers/clocksource/em_sti.c
@@ -210,7 +210,7 @@ static int em_sti_clocksource_enable(struct clocksource *cs)
 
 	ret = em_sti_start(p, USER_CLOCKSOURCE);
 	if (!ret)
-		__clocksource_updatefreq_hz(cs, p->rate);
+		__clocksource_update_freq_hz(cs, p->rate);
 	return ret;
 }
 
diff --git a/drivers/clocksource/sh_cmt.c b/drivers/clocksource/sh_cmt.c
index 2bd13b5..b8ff3c6 100644
--- a/drivers/clocksource/sh_cmt.c
+++ b/drivers/clocksource/sh_cmt.c
@@ -641,7 +641,7 @@ static int sh_cmt_clocksource_enable(struct clocksource *cs)
 
 	ret = sh_cmt_start(ch, FLAG_CLOCKSOURCE);
 	if (!ret) {
-		__clocksource_updatefreq_hz(cs, ch->rate);
+		__clocksource_update_freq_hz(cs, ch->rate);
 		ch->cs_enabled = true;
 	}
 	return ret;
diff --git a/drivers/clocksource/sh_tmu.c b/drivers/clocksource/sh_tmu.c
index f150ca82..b6b8fa3 100644
--- a/drivers/clocksource/sh_tmu.c
+++ b/drivers/clocksource/sh_tmu.c
@@ -272,7 +272,7 @@ static int sh_tmu_clocksource_enable(struct clocksource *cs)
 
 	ret = sh_tmu_enable(ch);
 	if (!ret) {
-		__clocksource_updatefreq_hz(cs, ch->rate);
+		__clocksource_update_freq_hz(cs, ch->rate);
 		ch->cs_enabled = true;
 	}
 
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index c064349..68a8525 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -201,7 +201,7 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
 extern int
 __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
 extern void
-__clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);
+__clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq);
 
 /*
  * Dont' call this unless you're a default clocksource
@@ -222,14 +222,14 @@ static inline int clocksource_register_khz(struct clocksource *cs, u32 khz)
 	return __clocksource_register_scale(cs, 1000, khz);
 }
 
-static inline void __clocksource_updatefreq_hz(struct clocksource *cs, u32 hz)
+static inline void __clocksource_update_freq_hz(struct clocksource *cs, u32 hz)
 {
-	__clocksource_updatefreq_scale(cs, 1, hz);
+	__clocksource_update_freq_scale(cs, 1, hz);
 }
 
-static inline void __clocksource_updatefreq_khz(struct clocksource *cs, u32 khz)
+static inline void __clocksource_update_freq_khz(struct clocksource *cs, u32 khz)
 {
-	__clocksource_updatefreq_scale(cs, 1000, khz);
+	__clocksource_update_freq_scale(cs, 1000, khz);
 }
 
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 757d770..e5590c9 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -644,7 +644,7 @@ static void clocksource_enqueue(struct clocksource *cs)
 }
 
 /**
- * __clocksource_updatefreq_scale - Used update clocksource with new freq
+ * __clocksource_update_freq_scale - Used update clocksource with new freq
  * @cs:		clocksource to be registered
  * @scale:	Scale factor multiplied against freq to get clocksource hz
  * @freq:	clocksource frequency (cycles per second) divided by scale
@@ -652,9 +652,10 @@ static void clocksource_enqueue(struct clocksource *cs)
  * This should only be called from the clocksource->enable() method.
  *
  * This *SHOULD NOT* be called directly! Please use the
- * clocksource_updatefreq_hz() or clocksource_updatefreq_khz helper functions.
+ * __clocksource_update_freq_hz() or __clocksource_update_freq_khz() helper
+ * functions.
  */
-void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
+void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 	u64 sec;
 
@@ -708,7 +709,7 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 	pr_info("clocksource %s: mask: 0x%llx max_cycles: 0x%llx, max_idle_ns: %lld ns\n",
 			cs->name, cs->mask, cs->max_cycles, cs->max_idle_ns);
 }
-EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
+EXPORT_SYMBOL_GPL(__clocksource_update_freq_scale);
 
 /**
  * __clocksource_register_scale - Used to install new clocksources
@@ -725,7 +726,7 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 
 	/* Initialize mult/shift and max_idle_ns */
-	__clocksource_updatefreq_scale(cs, scale, freq);
+	__clocksource_update_freq_scale(cs, scale, freq);
 
 	/* Add clocksource to the clocksource list */
 	mutex_lock(&clocksource_mutex);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins
  2015-03-12  4:16 ` [PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins John Stultz
@ 2015-03-12  5:55   ` Ingo Molnar
  2015-03-13  9:01   ` [tip:timers/core] clocksource: Simplify the " tip-bot for John Stultz
  1 sibling, 0 replies; 33+ messages in thread
From: Ingo Molnar @ 2015-03-12  5:55 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra


* John Stultz <john.stultz@linaro.org> wrote:

> The clocksource logic has a number of places where we try to
> include a safety margin. Most of these are 12% safety margins,
> but they are inconsistently applied and sometimes are applied
> on top of each other.
> 
> Additionally, in the previous patch, we corrected an issue
> where we unintentionally in effect created a 50% safety margin,
> which these 12.5% margins where then added to.
> 
> So to simplify the logic here, this patch removes the various
> 12.5% margins, and consolidates adding the margin in one place:
> clocks_calc_max_nsecs().
> 
> Additionally, Linus prefers a 50% safety margin, as it allows
> bad clock values to be more easily caught. This should really
> have no net effect, due to the corrected issue earlier which
> caused greater then 50% margins to be used w/o issue.

> +++ b/kernel/time/clocksource.c
> @@ -469,6 +469,9 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
>   * @shift:	cycle to nanosecond divisor (power of two)
>   * @maxadj:	maximum adjustment value to mult (~11%)
>   * @mask:	bitmask for two's complement subtraction of non 64 bit counters
> + *
> + * NOTE: This function includes a safety margin of 50%, so that bad clock values
> + * can be detected.
>   */
>  u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
>  {

So it would be nice if there was also a comment here explaining the 
'safety margin': what values are checked, what stream of values is 
expected from clocksources, why clocksources can be off, what the core 
clocksource code does with that, what symptoms it can cause, what is 
considered 'normal', what is considered 'abnormal', what happens if a 
'safety margin' is exceeded, etc.

I.e. all the code and all the changelogs talk about 'safety margins' 
in a somewhat circular, self-defining fashion - without there being 
any easily visible place where it's explained from first principles.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 04/12] time: Add debugging checks to warn if we see delays
  2015-03-12  4:16 ` [PATCH 04/12] time: Add debugging checks to warn if we see delays John Stultz
@ 2015-03-12  6:32   ` Ingo Molnar
  2015-03-13  9:02   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
  1 sibling, 0 replies; 33+ messages in thread
From: Ingo Molnar @ 2015-03-12  6:32 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra


* John Stultz <john.stultz@linaro.org> wrote:

> Recently there's been some request for better sanity
> checking in the time code, so that its more clear
> when something is going wrong since timekeeping issues
> could manifest in a large number of strange ways with
> various subsystems.
> 
> Thus, this patch adds some extra infrastructure to
> add a check update_wall_time() to print warnings if we
> see the call delayed beyond the max_cycles overflow
> point, or beyond the clocksource max_idle_ns value
> which is currently 50% of the overflow point.
> 
> This extra infrastructure is conditionalized
> behind a new CONFIG_DEBUG_TIMEKEEPING option
> also added in this patch.
> 
> Tested this a bit by halting qemu for specified
> lengths of time to trigger the warnings.

> +static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
> +{
> +
> +	cycle_t max_cycles = tk->tkr.clock->max_cycles;
> +	const char *name = tk->tkr.clock->name;
> +
> +	if (offset > max_cycles) {
> +		printk_deferred("ERROR: cycle offset (%lld) is larger than  allowed %s max_cycles (%lld)\n",
> +					offset, name, max_cycles);
> +	} else {
> +		if (offset > (max_cycles >> 1)) {
> +			printk_deferred("WARNING: cycle offset (%lld) is past the %s 50%% safety margin (%lld)\n",
> +					offset, name, max_cycles>>1);

Since we don't know the intensity with which these warnings will 
trigger on various systems, I've adjusted the messages a bit:

               printk_deferred("WARNING: timekeeping: cycle offset (%lld) is larger than allowed by '%s' max_cycles (%lld)\n",
                               offset, name, max_cycles);

               printk_deferred("INFO: timekeeping: cycle offset (%lld) is past the '%s' 50%% safety margin (%lld)\n",
                               offset, name, max_cycles>>1);

'INFO:' / 'WARNING:' is more in line with how we warn about various 
problems in kernel code.

We can upgrade this to 'WARNING:' / 'BUG:' pair once we are confident 
that most clocksources are doing OK.

I also improved the messages to make it apparent when we are printing 
a clocksource name and removed extra spaces from the message, etc. See 
the tip commit log notification email for details.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 07/12] time: Add warnings when overflows or underflows are observed
  2015-03-12  4:16 ` [PATCH 07/12] time: Add warnings when overflows or underflows are observed John Stultz
@ 2015-03-12  7:37   ` Ingo Molnar
  2015-03-13  9:03   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
  1 sibling, 0 replies; 33+ messages in thread
From: Ingo Molnar @ 2015-03-12  7:37 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra


* John Stultz <john.stultz@linaro.org> wrote:

> It was suggested that the underflow/overflow protection
> should probably throw some sort of warning out, rather
> then just silently fixing the issue.
> 
> So this patch adds some warnings here. The flag variables
> used are not protected by locks, but since we can't print
> from the reading functions, just being able to say we
> saw an issue in the update interval is useful enough,
> and can be slightly racy without real consequence.

> + * These simple flag variables are managed
> + * without locks, which is racy, but ok since
> + * we don't really care about being super
> + * precise about how many events were seen,
> + * just that a problem was observed.
> + */
> +static int timekeeping_underflow_seen;
> +static int timekeeping_overflow_seen;
> +
> +/* last_warning is only modified under the timekeeping lock */
> +static long timekeeping_last_warning;
> +
>  static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
>  {
>  
> @@ -134,28 +148,62 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
>  					offset, name, max_cycles>>1);
>  		}
>  	}
> +
> +	if (timekeeping_underflow_seen) {
> +		if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
> +			printk_deferred("WARNING: Clocksource %s underflow observed. You should report\n", name);
> +			printk_deferred("         this, consider using a different clocksource.\n");
> +			timekeeping_last_warning = jiffies;
> +		}
> +		timekeeping_underflow_seen = 0;
> +	}
> +
> +	if (timekeeping_overflow_seen) {
> +		if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
> +			printk_deferred("WARNING: Clocksource %s overflow observed. You should report\n", name);
> +			printk_deferred("         this, consider using a different clocksource.\n");
> +			timekeeping_last_warning = jiffies;
> +		}
> +		timekeeping_overflow_seen = 0;
> +	}
>  }
>  
>  static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
>  {
> -	cycle_t cycle_now, delta;
> +	cycle_t now, last, mask, max, delta;
> +	unsigned int seq;
>  
> -	/* read clocksource */
> -	cycle_now = tkr->read(tkr->clock);
> +	/*
> +	 * Since we're called holding a seqlock, the data may shift
> +	 * under us while we're doing the calculation. This can cause
> +	 * false positives, since we'd note a problem but throw the
> +	 * results away. So nest another seqlock here to atomically
> +	 * grab the points we are checking with.
> +	 */
> +	do {
> +		seq = read_seqcount_begin(&tk_core.seq);
> +		now = tkr->read(tkr->clock);
> +		last = tkr->cycle_last;
> +		mask = tkr->mask;
> +		max = tkr->clock->max_cycles;
> +	} while (read_seqcount_retry(&tk_core.seq, seq));
>  
> -	/* calculate the delta since the last update_wall_time */
> -	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
> +	delta = clocksource_delta(now, last, mask);
>  
>  	/*
>  	 * Try to catch underflows by checking if we are seeing small
>  	 * mask-relative negative values.
>  	 */
> -	if (unlikely((~delta & tkr->mask) < (tkr->mask >> 3)))
> +	if (unlikely((~delta & mask) < (mask >> 3))) {
> +		timekeeping_underflow_seen = 1;
>  		delta = 0;
> +	}
>  
>  	/* Cap delta value to the max_cycles values to avoid mult overflows */
> -	if (unlikely(delta > tkr->clock->max_cycles))
> +	if (unlikely(delta > max)) {
> +		timekeeping_overflow_seen = 1;
>  		delta = tkr->clock->max_cycles;
> +	}

Please add the flags as new fields in the 'struct tk_read_base' data 
structure in include/linux/timekeeper_internal.h - we don't want to go 
back to the old pattern of adding globals to the timekeeping code, 
even if they are just for debugging!

also, timekeeping_check_update() should pass in 'struct tk_read_base', 
not 'struct timekeeper' - it's really only using the tkr bits and 
doing this change would make it similar to timekeeping_get_delta(). It 
would also shorten:

        cycle_t max_cycles = tk->tkr.clock->max_cycles;
        const char *name = tk->tkr.clock->name;

to the more natural looking:

        cycle_t max_cycles = tkr->clock->max_cycles;
        const char *name = tkr->clock->name;

hm?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/12] Increased clocksource validation and cleanups (v4)
  2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
                   ` (11 preceding siblings ...)
  2015-03-12  4:16 ` [PATCH 12/12] clocksource: Rename __clocksource_updatefreq_* to __clocksource_update_freq_* John Stultz
@ 2015-03-12  9:19 ` Ingo Molnar
  2015-03-12 16:26   ` John Stultz
  12 siblings, 1 reply; 33+ messages in thread
From: Ingo Molnar @ 2015-03-12  9:19 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra


* John Stultz <john.stultz@linaro.org> wrote:

> So here is another round of this series, which is the result of
> earlier discussions with Linus and his suggestions around
> improvements to clocksource validation in the hope we can more
> easily catch bad hardware.
> 
> There's also a few cleanups Linus suggested as well as a few I've been
> meaning to get to for awhile.
> 
> I tried in address all the feedback that had been given, adding
> the checks behind CONFIG_DEBUG_TIMEKEEPING.  I also sorted out a
> sane way to print rate-limited warnings if we see cycle deltas that
> are too large, or if they look like small underflows.
> 
> I'd like to get this queued into -tip soon so it can get as much
> testing in -next as possible.
> 
> If there are any objections or feedback, I'd love to hear it!
> 
> New in v4:
> * Lots and lots of typo corrections and minor cleanups suggested
>   by Ingo.
> * Dropped "Remove clocksource_max_deferment()" patch
> * Added "Rename __clocksource_updatefreq_*..." patch
> * I realized one of the patches (Improve clocksource watchdog
>   reporting) didn't have a proper cc list, so while it was on lkml
>   folks may not have reviewed it before.

So I've applied them with some changes to tip:timers/core, please look 
at the commit notifications for details. Any outstanding review 
feedback can be addressed as delta patches on top of this I think, 
none of my observations were show-stoppers.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/12] Increased clocksource validation and cleanups (v4)
  2015-03-12  9:19 ` [PATCH 00/12] Increased clocksource validation and cleanups (v4) Ingo Molnar
@ 2015-03-12 16:26   ` John Stultz
  2015-03-13  8:58     ` Ingo Molnar
  0 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-12 16:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra

On Thu, Mar 12, 2015 at 2:19 AM, Ingo Molnar <mingo@kernel.org> wrote:
> * John Stultz <john.stultz@linaro.org> wrote:
>> New in v4:
>> * Lots and lots of typo corrections and minor cleanups suggested
>>   by Ingo.
>> * Dropped "Remove clocksource_max_deferment()" patch
>> * Added "Rename __clocksource_updatefreq_*..." patch
>> * I realized one of the patches (Improve clocksource watchdog
>>   reporting) didn't have a proper cc list, so while it was on lkml
>>   folks may not have reviewed it before.
>
> So I've applied them with some changes to tip:timers/core, please look
> at the commit notifications for details. Any outstanding review
> feedback can be addressed as delta patches on top of this I think,
> none of my observations were show-stoppers.

Have you not yet pushed the changed out publicly? (I'm only seeing one
change from Viresh in tip/timers/core)
Once its available I'll review your changes and try to integrate any
remaining issues you pointed out with additional patches.

thanks
-john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/12] Increased clocksource validation and cleanups (v4)
  2015-03-12 16:26   ` John Stultz
@ 2015-03-13  8:58     ` Ingo Molnar
  2015-03-13 17:54       ` John Stultz
  0 siblings, 1 reply; 33+ messages in thread
From: Ingo Molnar @ 2015-03-13  8:58 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra


* John Stultz <john.stultz@linaro.org> wrote:

> On Thu, Mar 12, 2015 at 2:19 AM, Ingo Molnar <mingo@kernel.org> wrote:
> > * John Stultz <john.stultz@linaro.org> wrote:
> >> New in v4:
> >> * Lots and lots of typo corrections and minor cleanups suggested
> >>   by Ingo.
> >> * Dropped "Remove clocksource_max_deferment()" patch
> >> * Added "Rename __clocksource_updatefreq_*..." patch
> >> * I realized one of the patches (Improve clocksource watchdog
> >>   reporting) didn't have a proper cc list, so while it was on lkml
> >>   folks may not have reviewed it before.
> >
> > So I've applied them with some changes to tip:timers/core, please look
> > at the commit notifications for details. Any outstanding review
> > feedback can be addressed as delta patches on top of this I think,
> > none of my observations were show-stoppers.
> 
> Have you not yet pushed the changed out publicly? (I'm only seeing one
> change from Viresh in tip/timers/core)

Yeah, yesterday I had some test failures (elsewhere) so they were held 
up.

I pushed them out a minute ago, the commit notifications should appear 
shortly.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource: Simplify the clocks_calc_max_nsecs () logic
  2015-03-12  4:16 ` [PATCH 01/12] clocksource: Simplify clocks_calc_max_nsecs logic John Stultz
@ 2015-03-13  9:01   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, sboyd, mingo, prarit, torvalds, davej, peterz,
	john.stultz, tglx, hpa, richardcochran

Commit-ID:  6086e346fdea1ae64d974c94c1acacc2605567ae
Gitweb:     http://git.kernel.org/tip/6086e346fdea1ae64d974c94c1acacc2605567ae
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:29 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 12 Mar 2015 10:16:38 +0100

clocksource: Simplify the clocks_calc_max_nsecs() logic

The previous clocks_calc_max_nsecs() code had some unecessarily
complex bit logic to find the max interval that could cause
multiplication overflows. Since this is not in the hot
path, just do the divide to make it easier to read.

The previous implementation also had a subtle issue
that it avoided overflows with signed 64-bit values, where
as the intervals are always unsigned. This resulted in
overly conservative intervals, which other safety margins
were then added to, reducing the intended interval length.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/clocksource.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 4892352..2148f41 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -476,19 +476,10 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 
 	/*
 	 * Calculate the maximum number of cycles that we can pass to the
-	 * cyc2ns function without overflowing a 64-bit signed result. The
-	 * maximum number of cycles is equal to ULLONG_MAX/(mult+maxadj)
-	 * which is equivalent to the below.
-	 * max_cycles < (2^63)/(mult + maxadj)
-	 * max_cycles < 2^(log2((2^63)/(mult + maxadj)))
-	 * max_cycles < 2^(log2(2^63) - log2(mult + maxadj))
-	 * max_cycles < 2^(63 - log2(mult + maxadj))
-	 * max_cycles < 1 << (63 - log2(mult + maxadj))
-	 * Please note that we add 1 to the result of the log2 to account for
-	 * any rounding errors, ensure the above inequality is satisfied and
-	 * no overflow will occur.
+	 * cyc2ns() function without overflowing a 64-bit result.
 	 */
-	max_cycles = 1ULL << (63 - (ilog2(mult + maxadj) + 1));
+	max_cycles = ULLONG_MAX;
+	do_div(max_cycles, mult+maxadj);
 
 	/*
 	 * The actual maximum number of cycles we can defer the clocksource is

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource: Simplify the logic around clocksource wrapping safety margins
  2015-03-12  4:16 ` [PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins John Stultz
  2015-03-12  5:55   ` Ingo Molnar
@ 2015-03-13  9:01   ` tip-bot for John Stultz
  1 sibling, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: davej, mingo, richardcochran, peterz, hpa, john.stultz, prarit,
	tglx, linux-kernel, torvalds

Commit-ID:  362fde0410377e468ca00ad363fdf3e3ec42eb6a
Gitweb:     http://git.kernel.org/tip/362fde0410377e468ca00ad363fdf3e3ec42eb6a
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:30 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 12 Mar 2015 10:16:38 +0100

clocksource: Simplify the logic around clocksource wrapping safety margins

The clocksource logic has a number of places where we try to
include a safety margin. Most of these are 12% safety margins,
but they are inconsistently applied and sometimes are applied
on top of each other.

Additionally, in the previous patch, we corrected an issue
where we unintentionally in effect created a 50% safety margin,
which these 12.5% margins where then added to.

So to simplify the logic here, this patch removes the various
12.5% margins, and consolidates adding the margin in one place:
clocks_calc_max_nsecs().

Additionally, Linus prefers a 50% safety margin, as it allows
bad clock values to be more easily caught. This should really
have no net effect, due to the corrected issue earlier which
caused greater then 50% margins to be used w/o issue.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org> (for the sched_clock.c bit)
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/clocksource.c | 26 ++++++++++++--------------
 kernel/time/sched_clock.c |  4 ++--
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 2148f41..ace9576 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -469,6 +469,9 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
  * @shift:	cycle to nanosecond divisor (power of two)
  * @maxadj:	maximum adjustment value to mult (~11%)
  * @mask:	bitmask for two's complement subtraction of non 64 bit counters
+ *
+ * NOTE: This function includes a safety margin of 50%, so that bad clock values
+ * can be detected.
  */
 u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 {
@@ -490,11 +493,14 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 	max_cycles = min(max_cycles, mask);
 	max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift);
 
+	/* Return 50% of the actual maximum, so we can detect bad values */
+	max_nsecs >>= 1;
+
 	return max_nsecs;
 }
 
 /**
- * clocksource_max_deferment - Returns max time the clocksource can be deferred
+ * clocksource_max_deferment - Returns max time the clocksource should be deferred
  * @cs:         Pointer to clocksource
  *
  */
@@ -504,13 +510,7 @@ static u64 clocksource_max_deferment(struct clocksource *cs)
 
 	max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj,
 					  cs->mask);
-	/*
-	 * To ensure that the clocksource does not wrap whilst we are idle,
-	 * limit the time the clocksource can be deferred by 12.5%. Please
-	 * note a margin of 12.5% is used because this can be computed with
-	 * a shift, versus say 10% which would require division.
-	 */
-	return max_nsecs - (max_nsecs >> 3);
+	return max_nsecs;
 }
 
 #ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
@@ -659,10 +659,9 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 	 * conversion precision. 10 minutes is still a reasonable
 	 * amount. That results in a shift value of 24 for a
 	 * clocksource with mask >= 40bit and f >= 4GHz. That maps to
-	 * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
-	 * margin as we do in clocksource_max_deferment()
+	 * ~ 0.06ppm granularity for NTP.
 	 */
-	sec = (cs->mask - (cs->mask >> 3));
+	sec = cs->mask;
 	do_div(sec, freq);
 	do_div(sec, scale);
 	if (!sec)
@@ -674,9 +673,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 			       NSEC_PER_SEC / scale, sec * scale);
 
 	/*
-	 * for clocksources that have large mults, to avoid overflow.
-	 * Since mult may be adjusted by ntp, add an safety extra margin
-	 *
+	 * Ensure clocksources that have large 'mult' values don't overflow
+	 * when adjusted.
 	 */
 	cs->maxadj = clocksource_max_adjustment(cs);
 	while ((cs->mult + cs->maxadj < cs->mult)
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 01d2d15..3b8ae45 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -125,9 +125,9 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 
 	new_mask = CLOCKSOURCE_MASK(bits);
 
-	/* calculate how many ns until we wrap */
+	/* calculate how many nanosecs until we risk wrapping */
 	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
-	new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+	new_wrap_kt = ns_to_ktime(wrap);
 
 	/* update epoch for new counter and update epoch_ns from old counter*/
 	new_epoch = read();

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource: Add 'max_cycles' to ' struct clocksource'
  2015-03-12  4:16 ` [PATCH 03/12] clocksource: Add max_cycles to clocksource structure John Stultz
@ 2015-03-13  9:01   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, tglx, sboyd, richardcochran, hpa, john.stultz, mingo,
	linux-kernel, davej, peterz, prarit

Commit-ID:  fb82fe2fe8588745edd73aa3a6229facac5c1e15
Gitweb:     http://git.kernel.org/tip/fb82fe2fe8588745edd73aa3a6229facac5c1e15
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:31 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 12 Mar 2015 10:16:38 +0100

clocksource: Add 'max_cycles' to 'struct clocksource'

In order to facilitate clocksource validation, add a
'max_cycles' field to the clocksource structure which
will hold the maximum cycle value that can safely be
multiplied without potentially causing an overflow.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/clocksource.h |  5 +++--
 kernel/time/clocksource.c   | 28 ++++++++++++++++------------
 kernel/time/sched_clock.c   |  2 +-
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 9c78d15..16d048c 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -56,6 +56,7 @@ struct module;
  * @shift:		cycle to nanosecond divisor (power of two)
  * @max_idle_ns:	max idle time permitted by the clocksource (nsecs)
  * @maxadj:		maximum adjustment value to mult (~11%)
+ * @max_cycles:		maximum safe cycle value which won't overflow on multiplication
  * @flags:		flags describing special properties
  * @archdata:		arch-specific data
  * @suspend:		suspend function for the clocksource, if necessary
@@ -76,7 +77,7 @@ struct clocksource {
 #ifdef CONFIG_ARCH_CLOCKSOURCE_DATA
 	struct arch_clocksource_data archdata;
 #endif
-
+	u64 max_cycles;
 	const char *name;
 	struct list_head list;
 	int rating;
@@ -189,7 +190,7 @@ extern struct clocksource * __init clocksource_default_clock(void);
 extern void clocksource_mark_unstable(struct clocksource *cs);
 
 extern u64
-clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask);
+clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, u64 *max_cycles);
 extern void
 clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index ace9576..fc2a9de 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -469,11 +469,13 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
  * @shift:	cycle to nanosecond divisor (power of two)
  * @maxadj:	maximum adjustment value to mult (~11%)
  * @mask:	bitmask for two's complement subtraction of non 64 bit counters
+ * @max_cyc:	maximum cycle value before potential overflow (does not include
+ *		any safety margin)
  *
  * NOTE: This function includes a safety margin of 50%, so that bad clock values
  * can be detected.
  */
-u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
+u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, u64 *max_cyc)
 {
 	u64 max_nsecs, max_cycles;
 
@@ -493,6 +495,10 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 	max_cycles = min(max_cycles, mask);
 	max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift);
 
+	/* return the max_cycles value as well if requested */
+	if (max_cyc)
+		*max_cyc = max_cycles;
+
 	/* Return 50% of the actual maximum, so we can detect bad values */
 	max_nsecs >>= 1;
 
@@ -500,17 +506,15 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
 }
 
 /**
- * clocksource_max_deferment - Returns max time the clocksource should be deferred
- * @cs:         Pointer to clocksource
+ * clocksource_update_max_deferment - Updates the clocksource max_idle_ns & max_cycles
+ * @cs:         Pointer to clocksource to be updated
  *
  */
-static u64 clocksource_max_deferment(struct clocksource *cs)
+static inline void clocksource_update_max_deferment(struct clocksource *cs)
 {
-	u64 max_nsecs;
-
-	max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj,
-					  cs->mask);
-	return max_nsecs;
+	cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift,
+						cs->maxadj, cs->mask,
+						&cs->max_cycles);
 }
 
 #ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
@@ -684,7 +688,7 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 		cs->maxadj = clocksource_max_adjustment(cs);
 	}
 
-	cs->max_idle_ns = clocksource_max_deferment(cs);
+	clocksource_update_max_deferment(cs);
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
 
@@ -730,8 +734,8 @@ int clocksource_register(struct clocksource *cs)
 		"Clocksource %s might overflow on 11%% adjustment\n",
 		cs->name);
 
-	/* calculate max idle time permitted for this clocksource */
-	cs->max_idle_ns = clocksource_max_deferment(cs);
+	/* Update max idle time permitted for this clocksource */
+	clocksource_update_max_deferment(cs);
 
 	mutex_lock(&clocksource_mutex);
 	clocksource_enqueue(cs);
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 3b8ae45..ca3bc5c 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -126,7 +126,7 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	new_mask = CLOCKSOURCE_MASK(bits);
 
 	/* calculate how many nanosecs until we risk wrapping */
-	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
+	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask, NULL);
 	new_wrap_kt = ns_to_ktime(wrap);
 
 	/* update epoch for new counter and update epoch_ns from old counter*/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] timekeeping: Add debugging checks to warn if we see delays
  2015-03-12  4:16 ` [PATCH 04/12] time: Add debugging checks to warn if we see delays John Stultz
  2015-03-12  6:32   ` Ingo Molnar
@ 2015-03-13  9:02   ` tip-bot for John Stultz
  1 sibling, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, torvalds, john.stultz, peterz, mingo, davej, hpa, prarit,
	sboyd, richardcochran, linux-kernel

Commit-ID:  3c17ad19f0697ffe5ef7438cdafc2d2b7757d8a5
Gitweb:     http://git.kernel.org/tip/3c17ad19f0697ffe5ef7438cdafc2d2b7757d8a5
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:32 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:06:58 +0100

timekeeping: Add debugging checks to warn if we see delays

Recently there's been requests for better sanity
checking in the time code, so that it's more clear
when something is going wrong, since timekeeping issues
could manifest in a large number of strange ways in
various subsystems.

Thus, this patch adds some extra infrastructure to
add a check to update_wall_time() to print two new
warnings:

 1) if we see the call delayed beyond the 'max_cycles'
    overflow point,

 2) or if we see the call delayed beyond the clocksource's
    'max_idle_ns' value, which is currently 50% of the
    overflow point.

This extra infrastructure is conditional on
a new CONFIG_DEBUG_TIMEKEEPING option, also
added in this patch - default off.

Tested this a bit by halting qemu for specified
lengths of time to trigger the warnings.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-5-git-send-email-john.stultz@linaro.org
[ Improved the changelog and the messages a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/jiffies.c     |  1 +
 kernel/time/timekeeping.c | 28 ++++++++++++++++++++++++++++
 lib/Kconfig.debug         | 13 +++++++++++++
 3 files changed, 42 insertions(+)

diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index a6a5bf5..7e41390 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -71,6 +71,7 @@ static struct clocksource clocksource_jiffies = {
 	.mask		= 0xffffffff, /*32bits*/
 	.mult		= NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
 	.shift		= JIFFIES_SHIFT,
+	.max_cycles	= 10,
 };
 
 __cacheline_aligned_in_smp DEFINE_SEQLOCK(jiffies_lock);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 91db941..acf0491 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -118,6 +118,31 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 	tk->offs_boot = ktime_add(tk->offs_boot, delta);
 }
 
+#ifdef CONFIG_DEBUG_TIMEKEEPING
+static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+
+	cycle_t max_cycles = tk->tkr.clock->max_cycles;
+	const char *name = tk->tkr.clock->name;
+
+	if (offset > max_cycles) {
+		printk_deferred("WARNING: timekeeping: Cycle offset (%lld) is larger than allowed by the '%s' clock's max_cycles value (%lld): time overflow\n",
+				offset, name, max_cycles);
+		printk_deferred("         timekeeping: Your kernel is sick, but tries to cope\n");
+	} else {
+		if (offset > (max_cycles >> 1)) {
+			printk_deferred("INFO: timekeeping: Cycle offset (%lld) is larger than the the '%s' clock's 50%% safety margin (%lld)\n",
+					offset, name, max_cycles >> 1);
+			printk_deferred("      timekeeping: Your kernel is still fine, but is feeling a bit nervous\n");
+		}
+	}
+}
+#else
+static inline void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+}
+#endif
+
 /**
  * tk_setup_internals - Set up internals to use clocksource clock.
  *
@@ -1630,6 +1655,9 @@ void update_wall_time(void)
 	if (offset < real_tk->cycle_interval)
 		goto out;
 
+	/* Do some additional sanity checking */
+	timekeeping_check_update(real_tk, offset);
+
 	/*
 	 * With NO_HZ we may have to accumulate many cycle_intervals
 	 * (think "ticks") worth of time at once. To do this efficiently,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c5cefb3..36b6fa8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -865,6 +865,19 @@ config SCHED_STACK_END_CHECK
 	  data corruption or a sporadic crash at a later stage once the region
 	  is examined. The runtime overhead introduced is minimal.
 
+config DEBUG_TIMEKEEPING
+	bool "Enable extra timekeeping sanity checking"
+	help
+	  This option will enable additional timekeeping sanity checks
+	  which may be helpful when diagnosing issues where timekeeping
+	  problems are suspected.
+
+	  This may include checks in the timekeeping hotpaths, so this
+	  option may have a (very small) performance impact to some
+	  workloads.
+
+	  If unsure, say N.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] timekeeping: Add checks to cap clocksource reads to the 'max_cycles' value
  2015-03-12  4:16 ` [PATCH 05/12] time: Add infrastructure to cap clocksource reads to the max_cycles value John Stultz
@ 2015-03-13  9:02   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, tglx, mingo, torvalds, richardcochran, davej, sboyd, hpa,
	john.stultz, prarit, linux-kernel

Commit-ID:  a558cd021d83b65c47ee5b9bec1fcfe5298a769f
Gitweb:     http://git.kernel.org/tip/a558cd021d83b65c47ee5b9bec1fcfe5298a769f
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:33 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:04 +0100

timekeeping: Add checks to cap clocksource reads to the 'max_cycles' value

When calculating the current delta since the last tick, we
currently have no hard protections to prevent a multiplication
overflow from occuring.

This patch introduces infrastructure to allow a cap that
limits the clocksource read delta value to the 'max_cycles' value,
which is where an overflow would occur.

Since this is in the hotpath, it adds the extra checking under
CONFIG_DEBUG_TIMEKEEPING=y.

There was some concern that capping time like this could cause
problems as we may stop expiring timers, which could go circular
if the timer that triggers time accumulation were mis-scheduled
too far in the future, which would cause time to stop.

However, since the mult overflow would result in a smaller time
value, we would effectively have the same problem there.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-6-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/timekeeping.c | 49 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index acf0491..657414c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -126,9 +126,9 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 	const char *name = tk->tkr.clock->name;
 
 	if (offset > max_cycles) {
-		printk_deferred("WARNING: timekeeping: Cycle offset (%lld) is larger than allowed by the '%s' clock's max_cycles value (%lld): time overflow\n",
+		printk_deferred("WARNING: timekeeping: Cycle offset (%lld) is larger than allowed by the '%s' clock's max_cycles value (%lld): time overflow danger\n",
 				offset, name, max_cycles);
-		printk_deferred("         timekeeping: Your kernel is sick, but tries to cope\n");
+		printk_deferred("         timekeeping: Your kernel is sick, but tries to cope by capping time updates\n");
 	} else {
 		if (offset > (max_cycles >> 1)) {
 			printk_deferred("INFO: timekeeping: Cycle offset (%lld) is larger than the the '%s' clock's 50%% safety margin (%lld)\n",
@@ -137,10 +137,39 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 		}
 	}
 }
+
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+	cycle_t cycle_now, delta;
+
+	/* read clocksource */
+	cycle_now = tkr->read(tkr->clock);
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+
+	/* Cap delta value to the max_cycles values to avoid mult overflows */
+	if (unlikely(delta > tkr->clock->max_cycles))
+		delta = tkr->clock->max_cycles;
+
+	return delta;
+}
 #else
 static inline void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 {
 }
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+	cycle_t cycle_now, delta;
+
+	/* read clocksource */
+	cycle_now = tkr->read(tkr->clock);
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+
+	return delta;
+}
 #endif
 
 /**
@@ -218,14 +247,10 @@ static inline u32 arch_gettimeoffset(void) { return 0; }
 
 static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 {
-	cycle_t cycle_now, delta;
+	cycle_t delta;
 	s64 nsec;
 
-	/* read clocksource: */
-	cycle_now = tkr->read(tkr->clock);
-
-	/* calculate the delta since the last update_wall_time: */
-	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+	delta = timekeeping_get_delta(tkr);
 
 	nsec = delta * tkr->mult + tkr->xtime_nsec;
 	nsec >>= tkr->shift;
@@ -237,14 +262,10 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
 {
 	struct clocksource *clock = tk->tkr.clock;
-	cycle_t cycle_now, delta;
+	cycle_t delta;
 	s64 nsec;
 
-	/* read clocksource: */
-	cycle_now = tk->tkr.read(clock);
-
-	/* calculate the delta since the last update_wall_time: */
-	delta = clocksource_delta(cycle_now, tk->tkr.cycle_last, tk->tkr.mask);
+	delta = timekeeping_get_delta(&tk->tkr);
 
 	/* convert delta to nanoseconds. */
 	nsec = clocksource_cyc2ns(delta, clock->mult, clock->shift);

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] timekeeping: Try to catch clocksource delta underflows
  2015-03-12  4:16 ` [PATCH 06/12] time: Try to catch clocksource delta underflows John Stultz
@ 2015-03-13  9:02   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, tglx, torvalds, mingo, sboyd, john.stultz,
	richardcochran, davej, prarit, peterz, hpa

Commit-ID:  057b87e3161d1194a095718f9918c01b2c389e74
Gitweb:     http://git.kernel.org/tip/057b87e3161d1194a095718f9918c01b2c389e74
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:34 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:05 +0100

timekeeping: Try to catch clocksource delta underflows

In the case where there is a broken clocksource
where there are multiple actual clocks that
aren't perfectly aligned, we may see small "negative"
deltas when we subtract 'now' from 'cycle_last'.

The values are actually negative with respect to the
clocksource mask value, not necessarily negative
if cast to a s64, but we can check by checking the
delta to see if it is a small (relative to the mask)
negative value (again negative relative to the mask).

If so, we assume we jumped backwards somehow and
instead use zero for our delta.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-7-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/timekeeping.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 657414c..187149b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -148,6 +148,13 @@ static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
 	/* calculate the delta since the last update_wall_time */
 	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
 
+	/*
+	 * Try to catch underflows by checking if we are seeing small
+	 * mask-relative negative values.
+	 */
+	if (unlikely((~delta & tkr->mask) < (tkr->mask >> 3)))
+		delta = 0;
+
 	/* Cap delta value to the max_cycles values to avoid mult overflows */
 	if (unlikely(delta > tkr->clock->max_cycles))
 		delta = tkr->clock->max_cycles;

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] timekeeping: Add warnings when overflows or underflows are observed
  2015-03-12  4:16 ` [PATCH 07/12] time: Add warnings when overflows or underflows are observed John Stultz
  2015-03-12  7:37   ` Ingo Molnar
@ 2015-03-13  9:03   ` tip-bot for John Stultz
  1 sibling, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: prarit, tglx, davej, mingo, torvalds, linux-kernel, peterz,
	john.stultz, sboyd, richardcochran, hpa

Commit-ID:  4ca22c2648f9c1cec0b242f58d7302136f5a4cbb
Gitweb:     http://git.kernel.org/tip/4ca22c2648f9c1cec0b242f58d7302136f5a4cbb
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:35 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:05 +0100

timekeeping: Add warnings when overflows or underflows are observed

It was suggested that the underflow/overflow protection
should probably throw some sort of warning out, rather
than just silently fixing the issue.

So this patch adds some warnings here. The flag variables
used are not protected by locks, but since we can't print
from the reading functions, just being able to say we
saw an issue in the update interval is useful enough,
and can be slightly racy without real consequence.

The big complication is that we're only under a read
seqlock, so the data could shift under us during
our calculation to see if there was a problem. This
patch avoids this issue by nesting another seqlock
which allows us to snapshot the just required values
atomically. So we shouldn't see false positives.

I also added some basic rate-limiting here, since
on one build machine w/ skewed TSCs it was fairly
noisy at bootup.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-8-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/timekeeping.c | 64 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 57 insertions(+), 7 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 187149b..892f6cb 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -119,6 +119,20 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 }
 
 #ifdef CONFIG_DEBUG_TIMEKEEPING
+#define WARNING_FREQ (HZ*300) /* 5 minute rate-limiting */
+/*
+ * These simple flag variables are managed
+ * without locks, which is racy, but ok since
+ * we don't really care about being super
+ * precise about how many events were seen,
+ * just that a problem was observed.
+ */
+static int timekeeping_underflow_seen;
+static int timekeeping_overflow_seen;
+
+/* last_warning is only modified under the timekeeping lock */
+static long timekeeping_last_warning;
+
 static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 {
 
@@ -136,28 +150,64 @@ static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
 			printk_deferred("      timekeeping: Your kernel is still fine, but is feeling a bit nervous\n");
 		}
 	}
+
+	if (timekeeping_underflow_seen) {
+		if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
+			printk_deferred("WARNING: Underflow in clocksource '%s' observed, time update ignored.\n", name);
+			printk_deferred("         Please report this, consider using a different clocksource, if possible.\n");
+			printk_deferred("         Your kernel is probably still fine.\n");
+			timekeeping_last_warning = jiffies;
+		}
+		timekeeping_underflow_seen = 0;
+	}
+
+	if (timekeeping_overflow_seen) {
+		if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
+			printk_deferred("WARNING: Overflow in clocksource '%s' observed, time update capped.\n", name);
+			printk_deferred("         Please report this, consider using a different clocksource, if possible.\n");
+			printk_deferred("         Your kernel is probably still fine.\n");
+			timekeeping_last_warning = jiffies;
+		}
+		timekeeping_overflow_seen = 0;
+	}
 }
 
 static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
 {
-	cycle_t cycle_now, delta;
+	cycle_t now, last, mask, max, delta;
+	unsigned int seq;
 
-	/* read clocksource */
-	cycle_now = tkr->read(tkr->clock);
+	/*
+	 * Since we're called holding a seqlock, the data may shift
+	 * under us while we're doing the calculation. This can cause
+	 * false positives, since we'd note a problem but throw the
+	 * results away. So nest another seqlock here to atomically
+	 * grab the points we are checking with.
+	 */
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		now = tkr->read(tkr->clock);
+		last = tkr->cycle_last;
+		mask = tkr->mask;
+		max = tkr->clock->max_cycles;
+	} while (read_seqcount_retry(&tk_core.seq, seq));
 
-	/* calculate the delta since the last update_wall_time */
-	delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+	delta = clocksource_delta(now, last, mask);
 
 	/*
 	 * Try to catch underflows by checking if we are seeing small
 	 * mask-relative negative values.
 	 */
-	if (unlikely((~delta & tkr->mask) < (tkr->mask >> 3)))
+	if (unlikely((~delta & mask) < (mask >> 3))) {
+		timekeeping_underflow_seen = 1;
 		delta = 0;
+	}
 
 	/* Cap delta value to the max_cycles values to avoid mult overflows */
-	if (unlikely(delta > tkr->clock->max_cycles))
+	if (unlikely(delta > max)) {
+		timekeeping_overflow_seen = 1;
 		delta = tkr->clock->max_cycles;
+	}
 
 	return delta;
 }

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource: Improve clocksource watchdog reporting
  2015-03-12  4:16 ` [PATCH 08/12] clocksource: Improve clocksource watchdog reporting John Stultz
@ 2015-03-13  9:03   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: sboyd, hpa, torvalds, tglx, prarit, linux-kernel, davej,
	richardcochran, peterz, john.stultz, mingo

Commit-ID:  0b046b217ad4c64fbbeaaac24d0648cb1fa49ad8
Gitweb:     http://git.kernel.org/tip/0b046b217ad4c64fbbeaaac24d0648cb1fa49ad8
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:36 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:06 +0100

clocksource: Improve clocksource watchdog reporting

The clocksource watchdog reporting has been less helpful
then desired, as it just printed the delta between
the two clocksources. This prevents any useful analysis
of why the skew occurred.

Thus this patch tries to improve the output when we
mark a clocksource as unstable, printing out the cycle
last and now values for both the current clocksource
and the watchdog clocksource. This will allow us to see
if the result was due to a false positive caused by
a problematic watchdog.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-9-git-send-email-john.stultz@linaro.org
[ Minor cleanups of kernel messages. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/clocksource.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index fc2a9de..c4cc04b 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -142,13 +142,6 @@ static void __clocksource_unstable(struct clocksource *cs)
 		schedule_work(&watchdog_work);
 }
 
-static void clocksource_unstable(struct clocksource *cs, int64_t delta)
-{
-	printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
-	       cs->name, delta);
-	__clocksource_unstable(cs);
-}
-
 /**
  * clocksource_mark_unstable - mark clocksource unstable via watchdog
  * @cs:		clocksource to be marked unstable
@@ -174,7 +167,7 @@ void clocksource_mark_unstable(struct clocksource *cs)
 static void clocksource_watchdog(unsigned long data)
 {
 	struct clocksource *cs;
-	cycle_t csnow, wdnow, delta;
+	cycle_t csnow, wdnow, cslast, wdlast, delta;
 	int64_t wd_nsec, cs_nsec;
 	int next_cpu, reset_pending;
 
@@ -213,6 +206,8 @@ static void clocksource_watchdog(unsigned long data)
 
 		delta = clocksource_delta(csnow, cs->cs_last, cs->mask);
 		cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+		wdlast = cs->wd_last; /* save these in case we print them */
+		cslast = cs->cs_last;
 		cs->cs_last = csnow;
 		cs->wd_last = wdnow;
 
@@ -221,7 +216,12 @@ static void clocksource_watchdog(unsigned long data)
 
 		/* Check the deviation from the watchdog clocksource. */
 		if ((abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD)) {
-			clocksource_unstable(cs, cs_nsec - wd_nsec);
+			pr_warn("timekeeping watchdog: Marking clocksource '%s' as unstable, because the skew is too large:\n", cs->name);
+			pr_warn("	'%s' wd_now: %llx wd_last: %llx mask: %llx\n",
+				watchdog->name, wdnow, wdlast, watchdog->mask);
+			pr_warn("	'%s' cs_now: %llx cs_last: %llx mask: %llx\n",
+				cs->name, csnow, cslast, cs->mask);
+			__clocksource_unstable(cs);
 			continue;
 		}
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource: Mostly kill clocksource_register()
  2015-03-12  4:16 ` [PATCH 09/12] clocksource: Mostly kill clocksource_register() John Stultz
@ 2015-03-13  9:03   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, john.stultz, hpa, richardcochran, mingo, schwidefsky,
	prarit, davem, sboyd, davej, peterz, linux-kernel, tglx

Commit-ID:  f8935983f110505daa38e8d36ee406807f83a069
Gitweb:     http://git.kernel.org/tip/f8935983f110505daa38e8d36ee406807f83a069
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:37 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:06 +0100

clocksource: Mostly kill clocksource_register()

A long running project has been to clean up remaining uses
of clocksource_register(), replacing it with the simpler
clocksource_register_khz/hz() functions.

However, there are a few cases where we need to self-define
our mult/shift values, so switch the function to a more
obviously internal __clocksource_register() name, and
consolidate much of the internal logic so we don't have
duplication.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-10-git-send-email-john.stultz@linaro.org
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/s390/kernel/time.c     |  2 +-
 arch/sparc/kernel/time_32.c |  2 +-
 include/linux/clocksource.h | 10 +++++-
 kernel/time/clocksource.c   | 81 +++++++++++++++++++--------------------------
 kernel/time/jiffies.c       |  4 +--
 5 files changed, 47 insertions(+), 52 deletions(-)

diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index 20660dd..6c273cd 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -283,7 +283,7 @@ void __init time_init(void)
 	if (register_external_irq(EXT_IRQ_TIMING_ALERT, timing_alert_interrupt))
 		panic("Couldn't request external interrupt 0x1406");
 
-	if (clocksource_register(&clocksource_tod) != 0)
+	if (__clocksource_register(&clocksource_tod) != 0)
 		panic("Could not register TOD clock source");
 
 	/* Enable TOD clock interrupts on the boot cpu. */
diff --git a/arch/sparc/kernel/time_32.c b/arch/sparc/kernel/time_32.c
index 2f80d23..a31c0c8 100644
--- a/arch/sparc/kernel/time_32.c
+++ b/arch/sparc/kernel/time_32.c
@@ -191,7 +191,7 @@ static __init int setup_timer_cs(void)
 	timer_cs.mult = clocksource_hz2mult(sparc_config.clock_rate,
 	                                    timer_cs.shift);
 
-	return clocksource_register(&timer_cs);
+	return __clocksource_register(&timer_cs);
 }
 
 #ifdef CONFIG_SMP
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 16d048c..bd98eaa 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -179,7 +179,6 @@ static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
 }
 
 
-extern int clocksource_register(struct clocksource*);
 extern int clocksource_unregister(struct clocksource*);
 extern void clocksource_touch_watchdog(void);
 extern struct clocksource* clocksource_get_next(void);
@@ -203,6 +202,15 @@ __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
 extern void
 __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);
 
+/*
+ * Don't call this unless you are a default clocksource
+ * (AKA: jiffies) and absolutely have to.
+ */
+static inline int __clocksource_register(struct clocksource *cs)
+{
+	return __clocksource_register_scale(cs, 1, 0);
+}
+
 static inline int clocksource_register_hz(struct clocksource *cs, u32 hz)
 {
 	return __clocksource_register_scale(cs, 1, hz);
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c4cc04b..5cdf17e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -656,38 +656,52 @@ static void clocksource_enqueue(struct clocksource *cs)
 void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 	u64 sec;
+
 	/*
-	 * Calc the maximum number of seconds which we can run before
-	 * wrapping around. For clocksources which have a mask > 32bit
-	 * we need to limit the max sleep time to have a good
-	 * conversion precision. 10 minutes is still a reasonable
-	 * amount. That results in a shift value of 24 for a
-	 * clocksource with mask >= 40bit and f >= 4GHz. That maps to
-	 * ~ 0.06ppm granularity for NTP.
+	 * Default clocksources are *special* and self-define their mult/shift.
+	 * But, you're not special, so you should specify a freq value.
 	 */
-	sec = cs->mask;
-	do_div(sec, freq);
-	do_div(sec, scale);
-	if (!sec)
-		sec = 1;
-	else if (sec > 600 && cs->mask > UINT_MAX)
-		sec = 600;
-
-	clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
-			       NSEC_PER_SEC / scale, sec * scale);
-
+	if (freq) {
+		/*
+		 * Calc the maximum number of seconds which we can run before
+		 * wrapping around. For clocksources which have a mask > 32-bit
+		 * we need to limit the max sleep time to have a good
+		 * conversion precision. 10 minutes is still a reasonable
+		 * amount. That results in a shift value of 24 for a
+		 * clocksource with mask >= 40-bit and f >= 4GHz. That maps to
+		 * ~ 0.06ppm granularity for NTP.
+		 */
+		sec = cs->mask;
+		do_div(sec, freq);
+		do_div(sec, scale);
+		if (!sec)
+			sec = 1;
+		else if (sec > 600 && cs->mask > UINT_MAX)
+			sec = 600;
+
+		clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
+				       NSEC_PER_SEC / scale, sec * scale);
+	}
 	/*
 	 * Ensure clocksources that have large 'mult' values don't overflow
 	 * when adjusted.
 	 */
 	cs->maxadj = clocksource_max_adjustment(cs);
-	while ((cs->mult + cs->maxadj < cs->mult)
-		|| (cs->mult - cs->maxadj > cs->mult)) {
+	while (freq && ((cs->mult + cs->maxadj < cs->mult)
+		|| (cs->mult - cs->maxadj > cs->mult))) {
 		cs->mult >>= 1;
 		cs->shift--;
 		cs->maxadj = clocksource_max_adjustment(cs);
 	}
 
+	/*
+	 * Only warn for *special* clocksources that self-define
+	 * their mult/shift values and don't specify a freq.
+	 */
+	WARN_ONCE(cs->mult + cs->maxadj < cs->mult,
+		"timekeeping: Clocksource %s might overflow on 11%% adjustment\n",
+		cs->name);
+
 	clocksource_update_max_deferment(cs);
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
@@ -719,33 +733,6 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
 }
 EXPORT_SYMBOL_GPL(__clocksource_register_scale);
 
-
-/**
- * clocksource_register - Used to install new clocksources
- * @cs:		clocksource to be registered
- *
- * Returns -EBUSY if registration fails, zero otherwise.
- */
-int clocksource_register(struct clocksource *cs)
-{
-	/* calculate max adjustment for given mult/shift */
-	cs->maxadj = clocksource_max_adjustment(cs);
-	WARN_ONCE(cs->mult + cs->maxadj < cs->mult,
-		"Clocksource %s might overflow on 11%% adjustment\n",
-		cs->name);
-
-	/* Update max idle time permitted for this clocksource */
-	clocksource_update_max_deferment(cs);
-
-	mutex_lock(&clocksource_mutex);
-	clocksource_enqueue(cs);
-	clocksource_enqueue_watchdog(cs);
-	clocksource_select();
-	mutex_unlock(&clocksource_mutex);
-	return 0;
-}
-EXPORT_SYMBOL(clocksource_register);
-
 static void __clocksource_change_rating(struct clocksource *cs, int rating)
 {
 	list_del(&cs->list);
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index 7e41390..c4bb518 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -95,7 +95,7 @@ EXPORT_SYMBOL(jiffies);
 
 static int __init init_jiffies_clocksource(void)
 {
-	return clocksource_register(&clocksource_jiffies);
+	return __clocksource_register(&clocksource_jiffies);
 }
 
 core_initcall(init_jiffies_clocksource);
@@ -131,6 +131,6 @@ int register_refined_jiffies(long cycles_per_second)
 
 	refined_jiffies.mult = ((u32)nsec_per_tick) << JIFFIES_SHIFT;
 
-	clocksource_register(&refined_jiffies);
+	__clocksource_register(&refined_jiffies);
 	return 0;
 }

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource, sparc32: Convert to using clocksource_register_hz()
  2015-03-12  4:16 ` [PATCH 10/12] sparc: Convert to using clocksource_register_hz() John Stultz
@ 2015-03-13  9:04   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, linux-kernel, mingo, hpa, john.stultz, richardcochran,
	tglx, peterz, davem, prarit, sboyd, davej

Commit-ID:  3142f76022fe46f6e0a0d3940b23fb6ccb794692
Gitweb:     http://git.kernel.org/tip/3142f76022fe46f6e0a0d3940b23fb6ccb794692
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:38 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:07 +0100

clocksource, sparc32: Convert to using clocksource_register_hz()

While cleaning up some clocksource code, I noticed the
time_32 implementation uses the clocksource_hz2mult()
helper, but doesn't use the clocksource_register_hz()
method.

I don't believe the Sparc clocksource is a default
clocksource, so we shouldn't need to self-define
the mult/shift pair.

So convert the time_32.c implementation to use
clocksource_register_hz().

Untested.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-11-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/sparc/kernel/time_32.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/sparc/kernel/time_32.c b/arch/sparc/kernel/time_32.c
index a31c0c8..18147a5 100644
--- a/arch/sparc/kernel/time_32.c
+++ b/arch/sparc/kernel/time_32.c
@@ -181,17 +181,13 @@ static struct clocksource timer_cs = {
 	.rating	= 100,
 	.read	= timer_cs_read,
 	.mask	= CLOCKSOURCE_MASK(64),
-	.shift	= 2,
 	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
 static __init int setup_timer_cs(void)
 {
 	timer_cs_enabled = 1;
-	timer_cs.mult = clocksource_hz2mult(sparc_config.clock_rate,
-	                                    timer_cs.shift);
-
-	return __clocksource_register(&timer_cs);
+	return clocksource_register_hz(&timer_cs, sparc_config.clock_rate);
 }
 
 #ifdef CONFIG_SMP

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource: Add some debug info about clocksources being registered
  2015-03-12  4:16 ` [PATCH 11/12] clocksource: Add some debug info about clocksources being registered John Stultz
@ 2015-03-13  9:04   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: sboyd, tglx, davej, hpa, linux-kernel, john.stultz, torvalds,
	peterz, prarit, richardcochran, mingo

Commit-ID:  8cc8c525ad4e7b581cacf84119e1a28dcb4044db
Gitweb:     http://git.kernel.org/tip/8cc8c525ad4e7b581cacf84119e1a28dcb4044db
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:39 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:07 +0100

clocksource: Add some debug info about clocksources being registered

Print the mask, max_cycles, and max_idle_ns values for
clocksources being registered.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-12-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/time/clocksource.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 5cdf17e..1977eba 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -703,6 +703,9 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 		cs->name);
 
 	clocksource_update_max_deferment(cs);
+
+	pr_info("clocksource %s: mask: 0x%llx max_cycles: 0x%llx, max_idle_ns: %lld ns\n",
+			cs->name, cs->mask, cs->max_cycles, cs->max_idle_ns);
 }
 EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [tip:timers/core] clocksource: Rename __clocksource_updatefreq_*( ) to __clocksource_update_freq_*()
  2015-03-12  4:16 ` [PATCH 12/12] clocksource: Rename __clocksource_updatefreq_* to __clocksource_update_freq_* John Stultz
@ 2015-03-13  9:04   ` tip-bot for John Stultz
  0 siblings, 0 replies; 33+ messages in thread
From: tip-bot for John Stultz @ 2015-03-13  9:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, davej, john.stultz, tglx, richardcochran, prarit,
	torvalds, linux-kernel, daniel.lezcano, mingo, sboyd, hpa

Commit-ID:  fba9e07208c0f9d92d9f73761c99c8612039da44
Gitweb:     http://git.kernel.org/tip/fba9e07208c0f9d92d9f73761c99c8612039da44
Author:     John Stultz <john.stultz@linaro.org>
AuthorDate: Wed, 11 Mar 2015 21:16:40 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 13 Mar 2015 08:07:08 +0100

clocksource: Rename __clocksource_updatefreq_*() to __clocksource_update_freq_*()

Ingo requested this function be renamed to improve readability,
so I've renamed __clocksource_updatefreq_scale() as well as the
__clocksource_updatefreq_hz/khz() functions to avoid
squishedtogethernames.

This touches some of the sh clocksources, which I've not tested.

The arch/arm/plat-omap change is just a comment change for
consistency.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-13-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/arm/plat-omap/counter_32k.c |  2 +-
 drivers/clocksource/em_sti.c     |  2 +-
 drivers/clocksource/sh_cmt.c     |  2 +-
 drivers/clocksource/sh_tmu.c     |  2 +-
 include/linux/clocksource.h      | 10 +++++-----
 kernel/time/clocksource.c        | 11 ++++++-----
 6 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/arm/plat-omap/counter_32k.c b/arch/arm/plat-omap/counter_32k.c
index 61b4d70..43cf745 100644
--- a/arch/arm/plat-omap/counter_32k.c
+++ b/arch/arm/plat-omap/counter_32k.c
@@ -103,7 +103,7 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)
 
 	/*
 	 * 120000 rough estimate from the calculations in
-	 * __clocksource_updatefreq_scale.
+	 * __clocksource_update_freq_scale.
 	 */
 	clocks_calc_mult_shift(&persistent_mult, &persistent_shift,
 			32768, NSEC_PER_SEC, 120000);
diff --git a/drivers/clocksource/em_sti.c b/drivers/clocksource/em_sti.c
index d0a7bd6..dc3c6ee 100644
--- a/drivers/clocksource/em_sti.c
+++ b/drivers/clocksource/em_sti.c
@@ -210,7 +210,7 @@ static int em_sti_clocksource_enable(struct clocksource *cs)
 
 	ret = em_sti_start(p, USER_CLOCKSOURCE);
 	if (!ret)
-		__clocksource_updatefreq_hz(cs, p->rate);
+		__clocksource_update_freq_hz(cs, p->rate);
 	return ret;
 }
 
diff --git a/drivers/clocksource/sh_cmt.c b/drivers/clocksource/sh_cmt.c
index 2bd13b5..b8ff3c6 100644
--- a/drivers/clocksource/sh_cmt.c
+++ b/drivers/clocksource/sh_cmt.c
@@ -641,7 +641,7 @@ static int sh_cmt_clocksource_enable(struct clocksource *cs)
 
 	ret = sh_cmt_start(ch, FLAG_CLOCKSOURCE);
 	if (!ret) {
-		__clocksource_updatefreq_hz(cs, ch->rate);
+		__clocksource_update_freq_hz(cs, ch->rate);
 		ch->cs_enabled = true;
 	}
 	return ret;
diff --git a/drivers/clocksource/sh_tmu.c b/drivers/clocksource/sh_tmu.c
index f150ca82..b6b8fa3 100644
--- a/drivers/clocksource/sh_tmu.c
+++ b/drivers/clocksource/sh_tmu.c
@@ -272,7 +272,7 @@ static int sh_tmu_clocksource_enable(struct clocksource *cs)
 
 	ret = sh_tmu_enable(ch);
 	if (!ret) {
-		__clocksource_updatefreq_hz(cs, ch->rate);
+		__clocksource_update_freq_hz(cs, ch->rate);
 		ch->cs_enabled = true;
 	}
 
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index bd98eaa..1355098 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -200,7 +200,7 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
 extern int
 __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
 extern void
-__clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);
+__clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq);
 
 /*
  * Don't call this unless you are a default clocksource
@@ -221,14 +221,14 @@ static inline int clocksource_register_khz(struct clocksource *cs, u32 khz)
 	return __clocksource_register_scale(cs, 1000, khz);
 }
 
-static inline void __clocksource_updatefreq_hz(struct clocksource *cs, u32 hz)
+static inline void __clocksource_update_freq_hz(struct clocksource *cs, u32 hz)
 {
-	__clocksource_updatefreq_scale(cs, 1, hz);
+	__clocksource_update_freq_scale(cs, 1, hz);
 }
 
-static inline void __clocksource_updatefreq_khz(struct clocksource *cs, u32 khz)
+static inline void __clocksource_update_freq_khz(struct clocksource *cs, u32 khz)
 {
-	__clocksource_updatefreq_scale(cs, 1000, khz);
+	__clocksource_update_freq_scale(cs, 1000, khz);
 }
 
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 1977eba..c3be3c7 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -643,7 +643,7 @@ static void clocksource_enqueue(struct clocksource *cs)
 }
 
 /**
- * __clocksource_updatefreq_scale - Used update clocksource with new freq
+ * __clocksource_update_freq_scale - Used update clocksource with new freq
  * @cs:		clocksource to be registered
  * @scale:	Scale factor multiplied against freq to get clocksource hz
  * @freq:	clocksource frequency (cycles per second) divided by scale
@@ -651,9 +651,10 @@ static void clocksource_enqueue(struct clocksource *cs)
  * This should only be called from the clocksource->enable() method.
  *
  * This *SHOULD NOT* be called directly! Please use the
- * clocksource_updatefreq_hz() or clocksource_updatefreq_khz helper functions.
+ * __clocksource_update_freq_hz() or __clocksource_update_freq_khz() helper
+ * functions.
  */
-void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
+void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 	u64 sec;
 
@@ -707,7 +708,7 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
 	pr_info("clocksource %s: mask: 0x%llx max_cycles: 0x%llx, max_idle_ns: %lld ns\n",
 			cs->name, cs->mask, cs->max_cycles, cs->max_idle_ns);
 }
-EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
+EXPORT_SYMBOL_GPL(__clocksource_update_freq_scale);
 
 /**
  * __clocksource_register_scale - Used to install new clocksources
@@ -724,7 +725,7 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
 {
 
 	/* Initialize mult/shift and max_idle_ns */
-	__clocksource_updatefreq_scale(cs, scale, freq);
+	__clocksource_update_freq_scale(cs, scale, freq);
 
 	/* Add clocksource to the clocksource list */
 	mutex_lock(&clocksource_mutex);

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/12] Increased clocksource validation and cleanups (v4)
  2015-03-13  8:58     ` Ingo Molnar
@ 2015-03-13 17:54       ` John Stultz
  2015-03-16  8:28         ` Ingo Molnar
  0 siblings, 1 reply; 33+ messages in thread
From: John Stultz @ 2015-03-13 17:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra

On Fri, Mar 13, 2015 at 1:58 AM, Ingo Molnar <mingo@kernel.org> wrote:
> * John Stultz <john.stultz@linaro.org> wrote:
>> On Thu, Mar 12, 2015 at 2:19 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> > * John Stultz <john.stultz@linaro.org> wrote:
>> >> New in v4:
>> >> * Lots and lots of typo corrections and minor cleanups suggested
>> >>   by Ingo.
>> >> * Dropped "Remove clocksource_max_deferment()" patch
>> >> * Added "Rename __clocksource_updatefreq_*..." patch
>> >> * I realized one of the patches (Improve clocksource watchdog
>> >>   reporting) didn't have a proper cc list, so while it was on lkml
>> >>   folks may not have reviewed it before.
>> >
>> > So I've applied them with some changes to tip:timers/core, please look
>> > at the commit notifications for details. Any outstanding review
>> > feedback can be addressed as delta patches on top of this I think,
>> > none of my observations were show-stoppers.
>>
>> Have you not yet pushed the changed out publicly? (I'm only seeing one
>> change from Viresh in tip/timers/core)
>
> Yeah, yesterday I had some test failures (elsewhere) so they were held
> up.
>
> I pushed them out a minute ago, the commit notifications should appear
> shortly.

Ok, just looked them over and your changes look good to me. Thanks for
the improvements!

I'll generate a few patches to address the other comments you made the
last cycle and send those to you in a bit.

thanks
-john

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 00/12] Increased clocksource validation and cleanups (v4)
  2015-03-13 17:54       ` John Stultz
@ 2015-03-16  8:28         ` Ingo Molnar
  0 siblings, 0 replies; 33+ messages in thread
From: Ingo Molnar @ 2015-03-16  8:28 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Dave Jones, Linus Torvalds, Thomas Gleixner,
	Richard Cochran, Prarit Bhargava, Stephen Boyd, Peter Zijlstra


* John Stultz <john.stultz@linaro.org> wrote:

> On Fri, Mar 13, 2015 at 1:58 AM, Ingo Molnar <mingo@kernel.org> wrote:
> > * John Stultz <john.stultz@linaro.org> wrote:
> >> On Thu, Mar 12, 2015 at 2:19 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >> > * John Stultz <john.stultz@linaro.org> wrote:
> >> >> New in v4:
> >> >> * Lots and lots of typo corrections and minor cleanups suggested
> >> >>   by Ingo.
> >> >> * Dropped "Remove clocksource_max_deferment()" patch
> >> >> * Added "Rename __clocksource_updatefreq_*..." patch
> >> >> * I realized one of the patches (Improve clocksource watchdog
> >> >>   reporting) didn't have a proper cc list, so while it was on lkml
> >> >>   folks may not have reviewed it before.
> >> >
> >> > So I've applied them with some changes to tip:timers/core, please look
> >> > at the commit notifications for details. Any outstanding review
> >> > feedback can be addressed as delta patches on top of this I think,
> >> > none of my observations were show-stoppers.
> >>
> >> Have you not yet pushed the changed out publicly? (I'm only seeing one
> >> change from Viresh in tip/timers/core)
> >
> > Yeah, yesterday I had some test failures (elsewhere) so they were held
> > up.
> >
> > I pushed them out a minute ago, the commit notifications should appear
> > shortly.
> 
> Ok, just looked them over and your changes look good to me. Thanks 
> for the improvements!
> 
> I'll generate a few patches to address the other comments you made 
> the last cycle and send those to you in a bit.

Thanks!

	Ingo

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2015-03-16  8:28 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-12  4:16 [PATCH 00/12] Increased clocksource validation and cleanups (v4) John Stultz
2015-03-12  4:16 ` [PATCH 01/12] clocksource: Simplify clocks_calc_max_nsecs logic John Stultz
2015-03-13  9:01   ` [tip:timers/core] clocksource: Simplify the clocks_calc_max_nsecs () logic tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 02/12] clocksource: Simplify logic around clocksource wrapping safety margins John Stultz
2015-03-12  5:55   ` Ingo Molnar
2015-03-13  9:01   ` [tip:timers/core] clocksource: Simplify the " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 03/12] clocksource: Add max_cycles to clocksource structure John Stultz
2015-03-13  9:01   ` [tip:timers/core] clocksource: Add 'max_cycles' to ' struct clocksource' tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 04/12] time: Add debugging checks to warn if we see delays John Stultz
2015-03-12  6:32   ` Ingo Molnar
2015-03-13  9:02   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 05/12] time: Add infrastructure to cap clocksource reads to the max_cycles value John Stultz
2015-03-13  9:02   ` [tip:timers/core] timekeeping: Add checks to cap clocksource reads to the 'max_cycles' value tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 06/12] time: Try to catch clocksource delta underflows John Stultz
2015-03-13  9:02   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 07/12] time: Add warnings when overflows or underflows are observed John Stultz
2015-03-12  7:37   ` Ingo Molnar
2015-03-13  9:03   ` [tip:timers/core] timekeeping: " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 08/12] clocksource: Improve clocksource watchdog reporting John Stultz
2015-03-13  9:03   ` [tip:timers/core] " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 09/12] clocksource: Mostly kill clocksource_register() John Stultz
2015-03-13  9:03   ` [tip:timers/core] " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 10/12] sparc: Convert to using clocksource_register_hz() John Stultz
2015-03-13  9:04   ` [tip:timers/core] clocksource, sparc32: " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 11/12] clocksource: Add some debug info about clocksources being registered John Stultz
2015-03-13  9:04   ` [tip:timers/core] " tip-bot for John Stultz
2015-03-12  4:16 ` [PATCH 12/12] clocksource: Rename __clocksource_updatefreq_* to __clocksource_update_freq_* John Stultz
2015-03-13  9:04   ` [tip:timers/core] clocksource: Rename __clocksource_updatefreq_*( ) to __clocksource_update_freq_*() tip-bot for John Stultz
2015-03-12  9:19 ` [PATCH 00/12] Increased clocksource validation and cleanups (v4) Ingo Molnar
2015-03-12 16:26   ` John Stultz
2015-03-13  8:58     ` Ingo Molnar
2015-03-13 17:54       ` John Stultz
2015-03-16  8:28         ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).