LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 1/2] Keep track of original clocksource frequency
@ 2008-03-18 22:11 john stultz
  2008-03-18 22:13 ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW john stultz
  2008-04-02 11:20 ` [PATCH 1/2] Keep track of original clocksource frequency Roman Zippel
  0 siblings, 2 replies; 13+ messages in thread
From: john stultz @ 2008-03-18 22:11 UTC (permalink / raw)
  To: lkml, Andrew Morton; +Cc: Roman Zippel

Here's my earlier mult/adj split patch reworked as suggested by Roman to
instead just introduce mult_orig.

Andrew, if there are not major objections, could you add it to your
2.6.26 pending list?

thanks
-john


The clocksource frequency is represented by
clocksource->mult/2^(clocksource->shift). Currently, when NTP makes
adjustments to the clock frequency, they are made directly to the mult
value.

This has the drawback that once changed, we cannot know what the orignal
mult value was, or how much adjustment has been applied.

This property causes problems in calculating proper ntp intervals when
switching back and forth between clocksources. 

This patch separates the current mult value into a mult and mult_orig
pair. The mult_orig value stays constant, while the ntp clocksource
adjustments are done only to the mult value.

This allows for correct ntp interval calculation and additionally lays
the groundwork for a new notion of time, what I'm calling the
monotonic-raw time, which is introduced in a following patch.

Signed-off-by: John Stultz <johnstul@us.ibm.com>

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index cde2b9f..b282b79 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -45,7 +45,8 @@ struct clocksource;
  * @read:		returns a cycle value
  * @mask:		bitmask for two's complement
  *			subtraction of non 64 bit counters
- * @mult:		cycle to nanosecond multiplier
+ * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
+ * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
  * @shift:		cycle to nanosecond divisor (power of two)
  * @flags:		flags describing special properties
  * @vread:		vsyscall based read
@@ -63,6 +64,7 @@ struct clocksource {
 	cycle_t (*read)(void);
 	cycle_t mask;
 	u32 mult;
+	s32 mult_orig;
 	u32 shift;
 	unsigned long flags;
 	cycle_t (*vread)(void);
@@ -201,16 +203,17 @@ static inline void clocksource_calculate_interval(struct clocksource *c,
 {
 	u64 tmp;
 
-	/* XXX - All of this could use a whole lot of optimization */
+	/* Do the ns -> cycle conversion first, using original mult */
 	tmp = length_nsec;
 	tmp <<= c->shift;
-	tmp += c->mult/2;
-	do_div(tmp, c->mult);
+	tmp += c->mult_orig/2;
+	do_div(tmp, c->mult_orig);
 
 	c->cycle_interval = (cycle_t)tmp;
 	if (c->cycle_interval == 0)
 		c->cycle_interval = 1;
 
+	/* Go back from cycles -> shifted ns, this time use ntp adjused mult */
 	c->xtime_interval = (u64)c->cycle_interval * c->mult;
 }
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 548c436..7a4a1b4 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -309,6 +309,9 @@ int clocksource_register(struct clocksource *c)
 	unsigned long flags;
 	int ret;
 
+	/* save mult_orig on registration */
+	c->mult_orig = c->mult;
+
 	spin_lock_irqsave(&clocksource_lock, flags);
 	ret = clocksource_enqueue(c);
 	if (!ret)
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index 4c256fd..1ca9955 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -61,6 +61,7 @@ struct clocksource clocksource_jiffies = {
 	.read		= jiffies_read,
 	.mask		= 0xffffffff, /*32bits*/
 	.mult		= NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
+	.mult_orig	= NSEC_PER_JIFFY << JIFFIES_SHIFT,
 	.shift		= JIFFIES_SHIFT,
 };
 



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW
  2008-03-18 22:11 [PATCH 1/2] Keep track of original clocksource frequency john stultz
@ 2008-03-18 22:13 ` john stultz
  2008-03-19  2:43   ` Andrew Morton
                     ` (2 more replies)
  2008-04-02 11:20 ` [PATCH 1/2] Keep track of original clocksource frequency Roman Zippel
  1 sibling, 3 replies; 13+ messages in thread
From: john stultz @ 2008-03-18 22:13 UTC (permalink / raw)
  To: lkml; +Cc: Andrew Morton, Roman Zippel

Here's my CLOCK_MONOTONIC_RAW, including some suggested changes from
Roman.

Andrew, if there are not major objections, could you add it to your
2.6.26 pending list?

thanks
-john


In talking with Josip Loncaric, and his work on clock synchronization
(see btime.sf.net), he mentioned that for really close synchronization,
it is useful to have access to "hardware time", that is a notion of time
that is not in any way adjusted by the clock slewing done to keep close
time sync.

Part of the issue is if we are using the kernel's ntp adjusted
representation of time in order to measure how we should correct time,
we can run into what Paul McKenney aptly described as "Painting a road
using the lines we're painting as the guide". 

I had been thinking of a similar problem, and was trying to come up with
a way to give users access to a purely hardware based time
representation that avoided users having to know the underlying
frequency and mask values needed to deal with the wide variety of
possible underlying hardware counters.

My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
nanosecond based time value, that increments starting at bootup and has
no frequency adjustments made to it what so ever.

The time is accessed from userspace via the posix_clock_gettime()
syscall, passing CLOCK_MONOTONIC_RAW as the clock_id.

This patch depends on the mult_orig patch, just sent a moment ago.


Signed-off-by: John Stultz <johnstul@us.ibm.com>

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index b282b79..3936d2e 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -79,6 +79,7 @@ struct clocksource {
 	/* timekeeping specific data, ignore */
 	cycle_t cycle_interval;
 	u64	xtime_interval;
+	u64	raw_interval;
 	/*
 	 * Second part is written at each timer interrupt
 	 * Keep it in a different cache line to dirty no
@@ -86,6 +87,7 @@ struct clocksource {
 	 */
 	cycle_t cycle_last ____cacheline_aligned_in_smp;
 	u64 xtime_nsec;
+	u64 raw_snsec;
 	s64 error;
 
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
@@ -215,6 +217,7 @@ static inline void clocksource_calculate_interval(struct clocksource *c,
 
 	/* Go back from cycles -> shifted ns, this time use ntp adjused mult */
 	c->xtime_interval = (u64)c->cycle_interval * c->mult;
+	c->raw_interval = (u64)c->cycle_interval * c->mult_orig;
 }
 
 
diff --git a/include/linux/time.h b/include/linux/time.h
index d32ef0a..f9f41be 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -116,6 +116,7 @@ extern int do_setitimer(int which, struct itimerval *value,
 extern unsigned int alarm_setitimer(unsigned int seconds);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday(struct timespec *tv);
+extern void getrawmonotonic(struct timespec *ts);
 extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
 
@@ -218,6 +219,7 @@ struct itimerval {
 #define CLOCK_MONOTONIC			1
 #define CLOCK_PROCESS_CPUTIME_ID	2
 #define CLOCK_THREAD_CPUTIME_ID		3
+#define CLOCK_MONOTONIC_RAW		4
 
 /*
  * The IDs of various hardware clocks:
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index a9b0420..f75adfa 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -224,6 +224,15 @@ static int posix_ktime_get_ts(clockid_t which_clock, struct timespec *tp)
 }
 
 /*
+ * Get monotonic time for posix timers
+ */
+static int posix_get_monotonic_raw(clockid_t which_clock, struct timespec *tp)
+{
+	getrawmonotonic(tp);
+	return 0;
+}
+
+/*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
 static __init int init_posix_timers(void)
@@ -236,9 +245,15 @@ static __init int init_posix_timers(void)
 		.clock_get = posix_ktime_get_ts,
 		.clock_set = do_posix_clock_nosettime,
 	};
+	struct k_clock clock_monotonic_raw = {
+		.clock_getres = hrtimer_get_res,
+		.clock_get = posix_get_monotonic_raw,
+		.clock_set = do_posix_clock_nosettime,
+	};
 
 	register_posix_clock(CLOCK_REALTIME, &clock_realtime);
 	register_posix_clock(CLOCK_MONOTONIC, &clock_monotonic);
+	register_posix_clock(CLOCK_MONOTONIC_RAW, &clock_monotonic_raw);
 
 	posix_timers_cache = kmem_cache_create("posix_timers_cache",
 					sizeof (struct k_itimer), 0, SLAB_PANIC,
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 26cf0e7..f5cce8d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -44,6 +44,7 @@ __cacheline_aligned_in_smp DEFINE_SEQLOCK(xtime_lock);
  */
 struct timespec xtime __attribute__ ((aligned (16)));
 struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
+struct timespec monotonic_raw;
 static unsigned long total_sleep_time;		/* seconds */
 
 static struct timespec xtime_cache __attribute__ ((aligned (16)));
@@ -106,6 +107,39 @@ void getnstimeofday(struct timespec *ts)
 EXPORT_SYMBOL(getnstimeofday);
 
 /**
+ * getrawmonotonic - Returns the raw monotonic time in a timespec
+ * @ts:		pointer to the timespec to be set
+ *
+ * Returns the raw monotonic time (completely un-modified by ntp)
+ */
+void getrawmonotonic(struct timespec *ts)
+{
+	unsigned long seq;
+	s64 nsecs;
+	cycle_t cycle_now, cycle_delta;
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+
+		/* read clocksource: */
+		cycle_now = clocksource_read(clock);
+
+		/* calculate the delta since the last update_wall_time: */
+		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+
+		/* convert to nanoseconds: */
+		nsecs = ((s64)cycle_delta * clock->mult_orig) >> clock->shift;
+
+		*ts = monotonic_raw;
+		
+	} while (read_seqretry(&xtime_lock, seq));
+
+	timespec_add_ns(ts, nsecs);
+}
+
+EXPORT_SYMBOL(getrawmonotonic);
+
+/**
  * do_gettimeofday - Returns the time of day in a timeval
  * @tv:		pointer to the timeval to be set
  *
@@ -187,6 +221,7 @@ static void change_clocksource(void)
 
 	clock->error = 0;
 	clock->xtime_nsec = 0;
+	clock->raw_snsec = 0;
 	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
 
 	tick_clock_notify();
@@ -251,6 +286,8 @@ void __init timekeeping_init(void)
 	xtime.tv_nsec = 0;
 	set_normalized_timespec(&wall_to_monotonic,
 		-xtime.tv_sec, -xtime.tv_nsec);
+	set_normalized_timespec(&monotonic_raw, 0,0);
+	
 	update_xtime_cache(0);
 	total_sleep_time = 0;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
@@ -449,6 +486,7 @@ void update_wall_time(void)
 	offset = clock->cycle_interval;
 #endif
 	clock->xtime_nsec += (s64)xtime.tv_nsec << clock->shift;
+	clock->raw_snsec += (s64) monotonic_raw.tv_nsec <<  clock->shift;
 
 	/* normally this loop will run just once, however in the
 	 * case of lost or late ticks, it will accumulate correctly.
@@ -456,6 +494,7 @@ void update_wall_time(void)
 	while (offset >= clock->cycle_interval) {
 		/* accumulate one interval */
 		clock->xtime_nsec += clock->xtime_interval;
+		clock->raw_snsec += clock->raw_interval;
 		clock->cycle_last += clock->cycle_interval;
 		offset -= clock->cycle_interval;
 
@@ -465,6 +504,11 @@ void update_wall_time(void)
 			second_overflow();
 		}
 
+		if (clock->raw_snsec >= (u64)NSEC_PER_SEC << clock->shift) {
+			clock->raw_snsec -= (u64)NSEC_PER_SEC << clock->shift;
+			monotonic_raw.tv_sec++;
+		}
+	
 		/* accumulate error between NTP and clock interval */
 		clock->error += tick_length;
 		clock->error -= clock->xtime_interval << (NTP_SCALE_SHIFT - clock->shift);
@@ -477,6 +521,11 @@ void update_wall_time(void)
 	xtime.tv_nsec = (s64)clock->xtime_nsec >> clock->shift;
 	clock->xtime_nsec -= (s64)xtime.tv_nsec << clock->shift;
 
+	/* store full nanoseconds into raw_monotonic */
+	monotonic_raw.tv_nsec = (s64)clock->raw_snsec >> clock->shift;
+	clock->raw_snsec -= (s64)monotonic_raw.tv_nsec << clock->shift;
+
+
 	update_xtime_cache(cyc2ns(clock, offset));
 
 	/* check to see if there is a new clocksource to use */



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW
  2008-03-18 22:13 ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW john stultz
@ 2008-03-19  2:43   ` Andrew Morton
  2008-03-19  3:01     ` john stultz
  2008-04-02 11:39   ` [PATCH 1/2] Introduce clocksource_forward_now Roman Zippel
  2008-04-02 11:50   ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW Roman Zippel
  2 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2008-03-19  2:43 UTC (permalink / raw)
  To: john stultz; +Cc: lkml, Roman Zippel

On Tue, 18 Mar 2008 15:13:40 -0700 john stultz <johnstul@us.ibm.com> wrote:

> Here's my CLOCK_MONOTONIC_RAW, including some suggested changes from
> Roman.
> 
> Andrew, if there are not major objections, could you add it to your
> 2.6.26 pending list?
> 
> thanks
> -john
> 
> 
> In talking with Josip Loncaric, and his work on clock synchronization
> (see btime.sf.net), he mentioned that for really close synchronization,
> it is useful to have access to "hardware time", that is a notion of time
> that is not in any way adjusted by the clock slewing done to keep close
> time sync.
> 
> Part of the issue is if we are using the kernel's ntp adjusted
> representation of time in order to measure how we should correct time,
> we can run into what Paul McKenney aptly described as "Painting a road
> using the lines we're painting as the guide". 
> 
> I had been thinking of a similar problem, and was trying to come up with
> a way to give users access to a purely hardware based time
> representation that avoided users having to know the underlying
> frequency and mask values needed to deal with the wide variety of
> possible underlying hardware counters.
> 
> My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> nanosecond based time value, that increments starting at bootup and has
> no frequency adjustments made to it what so ever.
> 
> The time is accessed from userspace via the posix_clock_gettime()
> syscall, passing CLOCK_MONOTONIC_RAW as the clock_id.
> 
> This patch depends on the mult_orig patch, just sent a moment ago.

alpha:

kernel/built-in.o: In function `posix_get_monotonic_raw':
: undefined reference to `getrawmonotonic'
kernel/built-in.o: In function `posix_get_monotonic_raw':
: undefined reference to `getrawmonotonic'

presumably busted for all CONFIG_GENERIC_TIME=n architectures.

Couldn't see a quick fix so I dropped it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW
  2008-03-19  2:43   ` Andrew Morton
@ 2008-03-19  3:01     ` john stultz
  0 siblings, 0 replies; 13+ messages in thread
From: john stultz @ 2008-03-19  3:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: lkml, Roman Zippel


On Tue, 2008-03-18 at 19:43 -0700, Andrew Morton wrote:
> alpha:
> 
> kernel/built-in.o: In function `posix_get_monotonic_raw':
> : undefined reference to `getrawmonotonic'
> kernel/built-in.o: In function `posix_get_monotonic_raw':
> : undefined reference to `getrawmonotonic'
> 
> presumably busted for all CONFIG_GENERIC_TIME=n architectures.
> 
> Couldn't see a quick fix so I dropped it.

Ah, crud. Sorry about that.

Hrmm.. How about this quick fix, move getrawmonotonic outside "#ifdef
CONFIG_GENERIC_TIME" ?

Build tested on x86_64 and also alpha.

thanks
-john



In talking with Josip Loncaric, and his work on clock synchronization
(see btime.sf.net), he mentioned that for really close synchronization,
it is useful to have access to "hardware time", that is a notion of time
that is not in any way adjusted by the clock slewing done to keep close
time sync.

Part of the issue is if we are using the kernel's ntp adjusted
representation of time in order to measure how we should correct time,
we can run into what Paul McKenney aptly described as "Painting a road
using the lines we're painting as the guide". 

I had been thinking of a similar problem, and was trying to come up with
a way to give users access to a purely hardware based time
representation that avoided users having to know the underlying
frequency and mask values needed to deal with the wide variety of
possible underlying hardware counters.

My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
nanosecond based time value, that increments starting at bootup and has
no frequency adjustments made to it what so ever.

The time is accessed from userspace via the posix_clock_gettime()
syscall, passing CLOCK_MONOTONIC_RAW as the clock_id.

This patch depends on the mult_orig patch sent earlier.

Signed-off-by: John Stultz <johnstul@us.ibm.com>

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index b282b79..3936d2e 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -79,6 +79,7 @@ struct clocksource {
 	/* timekeeping specific data, ignore */
 	cycle_t cycle_interval;
 	u64	xtime_interval;
+	u64	raw_interval;
 	/*
 	 * Second part is written at each timer interrupt
 	 * Keep it in a different cache line to dirty no
@@ -86,6 +87,7 @@ struct clocksource {
 	 */
 	cycle_t cycle_last ____cacheline_aligned_in_smp;
 	u64 xtime_nsec;
+	u64 raw_snsec;
 	s64 error;
 
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
@@ -215,6 +217,7 @@ static inline void clocksource_calculate_interval(struct clocksource *c,
 
 	/* Go back from cycles -> shifted ns, this time use ntp adjused mult */
 	c->xtime_interval = (u64)c->cycle_interval * c->mult;
+	c->raw_interval = (u64)c->cycle_interval * c->mult_orig;
 }
 
 
diff --git a/include/linux/time.h b/include/linux/time.h
index d32ef0a..f9f41be 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -116,6 +116,7 @@ extern int do_setitimer(int which, struct itimerval *value,
 extern unsigned int alarm_setitimer(unsigned int seconds);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday(struct timespec *tv);
+extern void getrawmonotonic(struct timespec *ts);
 extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
 
@@ -218,6 +219,7 @@ struct itimerval {
 #define CLOCK_MONOTONIC			1
 #define CLOCK_PROCESS_CPUTIME_ID	2
 #define CLOCK_THREAD_CPUTIME_ID		3
+#define CLOCK_MONOTONIC_RAW		4
 
 /*
  * The IDs of various hardware clocks:
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index a9b0420..f75adfa 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -224,6 +224,15 @@ static int posix_ktime_get_ts(clockid_t which_clock, struct timespec *tp)
 }
 
 /*
+ * Get monotonic time for posix timers
+ */
+static int posix_get_monotonic_raw(clockid_t which_clock, struct timespec *tp)
+{
+	getrawmonotonic(tp);
+	return 0;
+}
+
+/*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
 static __init int init_posix_timers(void)
@@ -236,9 +245,15 @@ static __init int init_posix_timers(void)
 		.clock_get = posix_ktime_get_ts,
 		.clock_set = do_posix_clock_nosettime,
 	};
+	struct k_clock clock_monotonic_raw = {
+		.clock_getres = hrtimer_get_res,
+		.clock_get = posix_get_monotonic_raw,
+		.clock_set = do_posix_clock_nosettime,
+	};
 
 	register_posix_clock(CLOCK_REALTIME, &clock_realtime);
 	register_posix_clock(CLOCK_MONOTONIC, &clock_monotonic);
+	register_posix_clock(CLOCK_MONOTONIC_RAW, &clock_monotonic_raw);
 
 	posix_timers_cache = kmem_cache_create("posix_timers_cache",
 					sizeof (struct k_itimer), 0, SLAB_PANIC,
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 26cf0e7..eb27bd8 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -44,6 +44,7 @@ __cacheline_aligned_in_smp DEFINE_SEQLOCK(xtime_lock);
  */
 struct timespec xtime __attribute__ ((aligned (16)));
 struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
+struct timespec monotonic_raw;
 static unsigned long total_sleep_time;		/* seconds */
 
 static struct timespec xtime_cache __attribute__ ((aligned (16)));
@@ -187,6 +188,7 @@ static void change_clocksource(void)
 
 	clock->error = 0;
 	clock->xtime_nsec = 0;
+	clock->raw_snsec = 0;
 	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
 
 	tick_clock_notify();
@@ -200,6 +202,40 @@ static inline s64 __get_nsec_offset(void) { return 0; }
 #endif
 
 /**
+ * getrawmonotonic - Returns the raw monotonic time in a timespec
+ * @ts:		pointer to the timespec to be set
+ *
+ * Returns the raw monotonic time (completely un-modified by ntp)
+ */
+void getrawmonotonic(struct timespec *ts)
+{
+	unsigned long seq;
+	s64 nsecs;
+	cycle_t cycle_now, cycle_delta;
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+
+		/* read clocksource: */
+		cycle_now = clocksource_read(clock);
+
+		/* calculate the delta since the last update_wall_time: */
+		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+
+		/* convert to nanoseconds: */
+		nsecs = ((s64)cycle_delta * clock->mult_orig) >> clock->shift;
+
+		*ts = monotonic_raw;
+		
+	} while (read_seqretry(&xtime_lock, seq));
+
+	timespec_add_ns(ts, nsecs);
+}
+
+EXPORT_SYMBOL(getrawmonotonic);
+
+
+/**
  * timekeeping_valid_for_hres - Check if timekeeping is suitable for hres
  */
 int timekeeping_valid_for_hres(void)
@@ -251,6 +287,8 @@ void __init timekeeping_init(void)
 	xtime.tv_nsec = 0;
 	set_normalized_timespec(&wall_to_monotonic,
 		-xtime.tv_sec, -xtime.tv_nsec);
+	set_normalized_timespec(&monotonic_raw, 0,0);
+	
 	update_xtime_cache(0);
 	total_sleep_time = 0;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
@@ -449,6 +487,7 @@ void update_wall_time(void)
 	offset = clock->cycle_interval;
 #endif
 	clock->xtime_nsec += (s64)xtime.tv_nsec << clock->shift;
+	clock->raw_snsec += (s64) monotonic_raw.tv_nsec <<  clock->shift;
 
 	/* normally this loop will run just once, however in the
 	 * case of lost or late ticks, it will accumulate correctly.
@@ -456,6 +495,7 @@ void update_wall_time(void)
 	while (offset >= clock->cycle_interval) {
 		/* accumulate one interval */
 		clock->xtime_nsec += clock->xtime_interval;
+		clock->raw_snsec += clock->raw_interval;
 		clock->cycle_last += clock->cycle_interval;
 		offset -= clock->cycle_interval;
 
@@ -465,6 +505,11 @@ void update_wall_time(void)
 			second_overflow();
 		}
 
+		if (clock->raw_snsec >= (u64)NSEC_PER_SEC << clock->shift) {
+			clock->raw_snsec -= (u64)NSEC_PER_SEC << clock->shift;
+			monotonic_raw.tv_sec++;
+		}
+	
 		/* accumulate error between NTP and clock interval */
 		clock->error += tick_length;
 		clock->error -= clock->xtime_interval << (NTP_SCALE_SHIFT - clock->shift);
@@ -477,6 +522,11 @@ void update_wall_time(void)
 	xtime.tv_nsec = (s64)clock->xtime_nsec >> clock->shift;
 	clock->xtime_nsec -= (s64)xtime.tv_nsec << clock->shift;
 
+	/* store full nanoseconds into raw_monotonic */
+	monotonic_raw.tv_nsec = (s64)clock->raw_snsec >> clock->shift;
+	clock->raw_snsec -= (s64)monotonic_raw.tv_nsec << clock->shift;
+
+
 	update_xtime_cache(cyc2ns(clock, offset));
 
 	/* check to see if there is a new clocksource to use */



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] Keep track of original clocksource frequency
  2008-03-18 22:11 [PATCH 1/2] Keep track of original clocksource frequency john stultz
  2008-03-18 22:13 ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW john stultz
@ 2008-04-02 11:20 ` Roman Zippel
  2008-04-02 15:38   ` John Stultz
  1 sibling, 1 reply; 13+ messages in thread
From: Roman Zippel @ 2008-04-02 11:20 UTC (permalink / raw)
  To: john stultz; +Cc: lkml, Andrew Morton

Hi,

On Tue, 18 Mar 2008, john stultz wrote:

> @@ -63,6 +64,7 @@ struct clocksource {
>  	cycle_t (*read)(void);
>  	cycle_t mask;
>  	u32 mult;
> +	s32 mult_orig;
>  	u32 shift;
>  	unsigned long flags;
>  	cycle_t (*vread)(void);

This is wrong, with HZ=100 the jiffies clock multiplier suddenly becomes 
negative and later the raw interval underflows.

bye, Roman


Signed-off-by: Roman Zippel <zippel@linux-m68k.org>

---
 include/linux/clocksource.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -64,7 +64,7 @@ struct clocksource {
 	cycle_t (*read)(void);
 	cycle_t mask;
 	u32 mult;
-	s32 mult_orig;
+	u32 mult_orig;
 	u32 shift;
 	unsigned long flags;
 	cycle_t (*vread)(void);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] Introduce clocksource_forward_now
  2008-03-18 22:13 ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW john stultz
  2008-03-19  2:43   ` Andrew Morton
@ 2008-04-02 11:39   ` Roman Zippel
  2008-04-02 16:01     ` John Stultz
  2008-04-03 21:07     ` Andrew Morton
  2008-04-02 11:50   ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW Roman Zippel
  2 siblings, 2 replies; 13+ messages in thread
From: Roman Zippel @ 2008-04-02 11:39 UTC (permalink / raw)
  To: john stultz; +Cc: lkml, Andrew Morton

Hi,

On Tue, 18 Mar 2008, john stultz wrote:

> My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> nanosecond based time value, that increments starting at bootup and has
> no frequency adjustments made to it what so ever.

There is a problem with the time offset since the last update_wall_time() 
call, which isn't taken into account when switching clocks (possibly 
during suspend/resume too), so that the clock might jump back during a 
clock switch.
To avoid making the whole more complex it's better to do small cleanup 
first, so this patch introduces clocksource_forward_now() which takes care 
of this offset since the last update_wall_time() call and adds it to the 
clock, so there is no need anymore to deal with it explicitly.
This is also gets rid of the timekeeping_suspend_nsecs hack, instead of 
waiting until resume, the value is accumulated during suspend. In the end 
there is only a single user of __get_nsec_offset() left, so I integrated 
it back to getnstimeofday().

bye, Roman

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>

---
 kernel/time/timekeeping.c |   67 ++++++++++++++++++++--------------------------
 1 file changed, 30 insertions(+), 37 deletions(-)

Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -58,27 +58,23 @@ struct clocksource *clock;
 
 #ifdef CONFIG_GENERIC_TIME
 /**
- * __get_nsec_offset - Returns nanoseconds since last call to periodic_hook
+ * clocksource_forward_now - update clock to the current time
  *
- * private function, must hold xtime_lock lock when being
- * called. Returns the number of nanoseconds since the
- * last call to update_wall_time() (adjusted by NTP scaling)
+ * Forward the current clock to update its state since the last call to
+ * update_wall_time(). This is useful before significant clock changes,
+ * as it avoids having to deal with this time offset explicitly.
  */
-static inline s64 __get_nsec_offset(void)
+static void clocksource_forward_now(void)
 {
 	cycle_t cycle_now, cycle_delta;
-	s64 ns_offset;
+	s64 nsec;
 
-	/* read clocksource: */
 	cycle_now = clocksource_read(clock);
-
-	/* calculate the delta since the last update_wall_time: */
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+	clock->cycle_last = cycle_now;
 
-	/* convert to nanoseconds: */
-	ns_offset = cyc2ns(clock, cycle_delta);
-
-	return ns_offset;
+	nsec = cyc2ns(clock, cycle_delta);
+	timespec_add_ns(&xtime, nsec);
 }
 
 /**
@@ -89,6 +85,7 @@ static inline s64 __get_nsec_offset(void
  */
 void getnstimeofday(struct timespec *ts)
 {
+	cycle_t cycle_now, cycle_delta;
 	unsigned long seq;
 	s64 nsecs;
 
@@ -96,7 +93,15 @@ void getnstimeofday(struct timespec *ts)
 		seq = read_seqbegin(&xtime_lock);
 
 		*ts = xtime;
-		nsecs = __get_nsec_offset();
+
+		/* read clocksource: */
+		cycle_now = clocksource_read(clock);
+
+		/* calculate the delta since the last update_wall_time: */
+		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+
+		/* convert to nanoseconds: */
+		nsecs = cyc2ns(clock, cycle_delta);
 
 	} while (read_seqretry(&xtime_lock, seq));
 
@@ -130,21 +135,19 @@ EXPORT_SYMBOL(do_gettimeofday);
 int do_settimeofday(struct timespec *tv)
 {
 	unsigned long flags;
-	time_t wtm_sec, sec = tv->tv_sec;
-	long wtm_nsec, nsec = tv->tv_nsec;
 
 	if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
 		return -EINVAL;
 
 	write_seqlock_irqsave(&xtime_lock, flags);
 
-	nsec -= __get_nsec_offset();
+	clocksource_forward_now();
 
-	wtm_sec  = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec);
-	wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec);
+	wall_to_monotonic.tv_sec += xtime.tv_sec - tv->tv_sec;
+	timespec_add_ns(&wall_to_monotonic, xtime.tv_nsec - tv->tv_nsec);
+
+	xtime = *tv;
 
-	set_normalized_timespec(&xtime, sec, nsec);
-	set_normalized_timespec(&wall_to_monotonic, wtm_sec, wtm_nsec);
 	update_xtime_cache(0);
 
 	clock->error = 0;
@@ -170,21 +173,16 @@ EXPORT_SYMBOL(do_settimeofday);
 static void change_clocksource(void)
 {
 	struct clocksource *new;
-	cycle_t now;
-	u64 nsec;
 
 	new = clocksource_get_next();
 
 	if (clock == new)
 		return;
 
-	now = clocksource_read(new);
-	nsec =  __get_nsec_offset();
-	timespec_add_ns(&xtime, nsec);
+	clocksource_forward_now();
 
 	clock = new;
-	clock->cycle_last = now;
-
+	clock->cycle_last = clocksource_read(new);
 	clock->error = 0;
 	clock->xtime_nsec = 0;
 	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
@@ -199,8 +197,8 @@ static void change_clocksource(void)
 	 */
 }
 #else
+static inline void clocksource_forward_now(void) { }
 static inline void change_clocksource(void) { }
-static inline s64 __get_nsec_offset(void) { return 0; }
 #endif
 
 /**
@@ -264,8 +262,6 @@ void __init timekeeping_init(void)
 static int timekeeping_suspended;
 /* time in seconds when suspend began */
 static unsigned long timekeeping_suspend_time;
-/* xtime offset when we went into suspend */
-static s64 timekeeping_suspend_nsecs;
 
 /**
  * timekeeping_resume - Resumes the generic timekeeping subsystem.
@@ -291,8 +287,6 @@ static int timekeeping_resume(struct sys
 		wall_to_monotonic.tv_sec -= sleep_length;
 		total_sleep_time += sleep_length;
 	}
-	/* Make sure that we have the correct xtime reference */
-	timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
 	update_xtime_cache(0);
 	/* re-base the last cycle value */
 	clock->cycle_last = clocksource_read(clock);
@@ -317,8 +311,7 @@ static int timekeeping_suspend(struct sy
 	timekeeping_suspend_time = read_persistent_clock();
 
 	write_seqlock_irqsave(&xtime_lock, flags);
-	/* Get the current xtime offset */
-	timekeeping_suspend_nsecs = __get_nsec_offset();
+	clocksource_forward_now();
 	timekeeping_suspended = 1;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
@@ -459,10 +452,10 @@ void update_wall_time(void)
 	 */
 	while (offset >= clock->cycle_interval) {
 		/* accumulate one interval */
-		clock->xtime_nsec += clock->xtime_interval;
-		clock->cycle_last += clock->cycle_interval;
 		offset -= clock->cycle_interval;
+		clock->cycle_last += clock->cycle_interval;
 
+		clock->xtime_nsec += clock->xtime_interval;
 		if (clock->xtime_nsec >= (u64)NSEC_PER_SEC << clock->shift) {
 			clock->xtime_nsec -= (u64)NSEC_PER_SEC << clock->shift;
 			xtime.tv_sec++;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW
  2008-03-18 22:13 ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW john stultz
  2008-03-19  2:43   ` Andrew Morton
  2008-04-02 11:39   ` [PATCH 1/2] Introduce clocksource_forward_now Roman Zippel
@ 2008-04-02 11:50   ` Roman Zippel
  2008-04-02 16:01     ` John Stultz
  2008-04-03 21:11     ` Andrew Morton
  2 siblings, 2 replies; 13+ messages in thread
From: Roman Zippel @ 2008-04-02 11:50 UTC (permalink / raw)
  To: john stultz; +Cc: lkml, Andrew Morton

Hi,

On Tue, 18 Mar 2008, john stultz wrote:

> My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> nanosecond based time value, that increments starting at bootup and has
> no frequency adjustments made to it what so ever.
> 
> The time is accessed from userspace via the posix_clock_gettime()
> syscall, passing CLOCK_MONOTONIC_RAW as the clock_id.

This is a reworked version of this patch based on the previous 
clocksource_forward_now patch, since clocksource_forward_now() takes 
care of the time offset now, it's not needed to do this at various places.
I also got rid of the monotonic_raw splitting, so the work done during 
update_wall_time() is quite a bit simpler.

bye, Roman

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>

---
 include/linux/clocksource.h |    5 ++++
 include/linux/time.h        |    2 +
 kernel/posix-timers.c       |   15 ++++++++++++++
 kernel/time/timekeeping.c   |   47 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 69 insertions(+)

Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -79,6 +79,7 @@ struct clocksource {
 	/* timekeeping specific data, ignore */
 	cycle_t cycle_interval;
 	u64	xtime_interval;
+	u64	raw_interval;
 	/*
 	 * Second part is written at each timer interrupt
 	 * Keep it in a different cache line to dirty no
@@ -87,6 +88,8 @@ struct clocksource {
 	cycle_t cycle_last ____cacheline_aligned_in_smp;
 	u64 xtime_nsec;
 	s64 error;
+	u64 raw_nsec;
+	long raw_sec;
 
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
 	/* Watchdog related data, used by the framework */
@@ -215,6 +218,8 @@ static inline void clocksource_calculate
 
 	/* Go back from cycles -> shifted ns, this time use ntp adjused mult */
 	c->xtime_interval = (u64)c->cycle_interval * c->mult;
+	c->raw_interval = ((u64)c->cycle_interval * c->mult_orig) <<
+			  (NTP_SCALE_SHIFT - c->shift);
 }
 
 
Index: linux-2.6/include/linux/time.h
===================================================================
--- linux-2.6.orig/include/linux/time.h
+++ linux-2.6/include/linux/time.h
@@ -116,6 +116,7 @@ extern int do_setitimer(int which, struc
 extern unsigned int alarm_setitimer(unsigned int seconds);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday(struct timespec *tv);
+extern void getrawmonotonic(struct timespec *ts);
 extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
 
@@ -218,6 +219,7 @@ struct itimerval {
 #define CLOCK_MONOTONIC			1
 #define CLOCK_PROCESS_CPUTIME_ID	2
 #define CLOCK_THREAD_CPUTIME_ID		3
+#define CLOCK_MONOTONIC_RAW		4
 
 /*
  * The IDs of various hardware clocks:
Index: linux-2.6/kernel/posix-timers.c
===================================================================
--- linux-2.6.orig/kernel/posix-timers.c
+++ linux-2.6/kernel/posix-timers.c
@@ -224,6 +224,15 @@ static int posix_ktime_get_ts(clockid_t 
 }
 
 /*
+ * Get monotonic time for posix timers
+ */
+static int posix_get_monotonic_raw(clockid_t which_clock, struct timespec *tp)
+{
+	getrawmonotonic(tp);
+	return 0;
+}
+
+/*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
 static __init int init_posix_timers(void)
@@ -236,9 +245,15 @@ static __init int init_posix_timers(void
 		.clock_get = posix_ktime_get_ts,
 		.clock_set = do_posix_clock_nosettime,
 	};
+	struct k_clock clock_monotonic_raw = {
+		.clock_getres = hrtimer_get_res,
+		.clock_get = posix_get_monotonic_raw,
+		.clock_set = do_posix_clock_nosettime,
+	};
 
 	register_posix_clock(CLOCK_REALTIME, &clock_realtime);
 	register_posix_clock(CLOCK_MONOTONIC, &clock_monotonic);
+	register_posix_clock(CLOCK_MONOTONIC_RAW, &clock_monotonic_raw);
 
 	posix_timers_cache = kmem_cache_create("posix_timers_cache",
 					sizeof (struct k_itimer), 0, SLAB_PANIC,
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -75,6 +75,10 @@ static void clocksource_forward_now(void
 
 	nsec = cyc2ns(clock, cycle_delta);
 	timespec_add_ns(&xtime, nsec);
+
+	nsec = ((s64)cycle_delta * clock->mult_orig) <<
+	       (NTP_SCALE_SHIFT - clock->shift);
+	clock->raw_nsec += nsec;
 }
 
 /**
@@ -181,6 +185,9 @@ static void change_clocksource(void)
 
 	clocksource_forward_now();
 
+	new->raw_sec = clock->raw_sec;
+	new->raw_nsec = clock->raw_nsec;
+
 	clock = new;
 	clock->cycle_last = clocksource_read(new);
 	clock->error = 0;
@@ -202,6 +209,40 @@ static inline void change_clocksource(vo
 #endif
 
 /**
+ * getrawmonotonic - Returns the raw monotonic time in a timespec
+ * @ts:		pointer to the timespec to be set
+ *
+ * Returns the raw monotonic time (completely un-modified by ntp)
+ */
+void getrawmonotonic(struct timespec *ts)
+{
+	unsigned long seq;
+	s64 nsecs;
+	cycle_t cycle_now, cycle_delta;
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+
+		/* read clocksource: */
+		cycle_now = clocksource_read(clock);
+
+		/* calculate the delta since the last update_wall_time: */
+		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+
+		/* convert to nanoseconds: */
+		nsecs = ((s64)cycle_delta * clock->mult_orig) >> clock->shift;
+
+		ts->tv_sec = clock->raw_sec;
+		ts->tv_nsec = clock->raw_nsec >> NTP_SCALE_SHIFT;
+
+	} while (read_seqretry(&xtime_lock, seq));
+
+	timespec_add_ns(ts, nsecs);
+}
+EXPORT_SYMBOL(getrawmonotonic);
+
+
+/**
  * timekeeping_valid_for_hres - Check if timekeeping is suitable for hres
  */
 int timekeeping_valid_for_hres(void)
@@ -462,6 +503,12 @@ void update_wall_time(void)
 			second_overflow();
 		}
 
+		clock->raw_nsec += clock->raw_interval;
+		if ((u32)(clock->raw_nsec >> NTP_SCALE_SHIFT) >= NSEC_PER_SEC) {
+			clock->raw_nsec -= (u64)NSEC_PER_SEC << NTP_SCALE_SHIFT;
+			clock->raw_sec++;
+		}
+
 		/* accumulate error between NTP and clock interval */
 		clock->error += tick_length;
 		clock->error -= clock->xtime_interval << (NTP_SCALE_SHIFT - clock->shift);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] Keep track of original clocksource frequency
  2008-04-02 11:20 ` [PATCH 1/2] Keep track of original clocksource frequency Roman Zippel
@ 2008-04-02 15:38   ` John Stultz
  0 siblings, 0 replies; 13+ messages in thread
From: John Stultz @ 2008-04-02 15:38 UTC (permalink / raw)
  To: Roman Zippel; +Cc: lkml, Andrew Morton


On Wed, 2008-04-02 at 13:20 +0200, Roman Zippel wrote:
> Hi,
> 
> On Tue, 18 Mar 2008, john stultz wrote:
> 
> > @@ -63,6 +64,7 @@ struct clocksource {
> >  	cycle_t (*read)(void);
> >  	cycle_t mask;
> >  	u32 mult;
> > +	s32 mult_orig;
> >  	u32 shift;
> >  	unsigned long flags;
> >  	cycle_t (*vread)(void);
> 
> This is wrong, with HZ=100 the jiffies clock multiplier suddenly becomes 
> negative and later the raw interval underflows.

Ah, thanks for catching that!

> 
> Signed-off-by: Roman Zippel <zippel@linux-m68k.org>

Acked-by: John Stultz <johnstul@us.ibm.com>




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] Introduce clocksource_forward_now
  2008-04-02 11:39   ` [PATCH 1/2] Introduce clocksource_forward_now Roman Zippel
@ 2008-04-02 16:01     ` John Stultz
  2008-04-03 21:07     ` Andrew Morton
  1 sibling, 0 replies; 13+ messages in thread
From: John Stultz @ 2008-04-02 16:01 UTC (permalink / raw)
  To: Roman Zippel; +Cc: lkml, Andrew Morton


On Wed, 2008-04-02 at 13:39 +0200, Roman Zippel wrote:
> Hi,
> 
> On Tue, 18 Mar 2008, john stultz wrote:
> 
> > My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> > nanosecond based time value, that increments starting at bootup and has
> > no frequency adjustments made to it what so ever.
> 
> There is a problem with the time offset since the last update_wall_time() 
> call, which isn't taken into account when switching clocks (possibly 
> during suspend/resume too), so that the clock might jump back during a 
> clock switch.

Yep, thanks for catching that!


> To avoid making the whole more complex it's better to do small cleanup 
> first, so this patch introduces clocksource_forward_now() which takes care 
> of this offset since the last update_wall_time() call and adds it to the 
> clock, so there is no need anymore to deal with it explicitly.
> This is also gets rid of the timekeeping_suspend_nsecs hack, instead of 
> waiting until resume, the value is accumulated during suspend. In the end 
> there is only a single user of __get_nsec_offset() left, so I integrated 
> it back to getnstimeofday().
> 
> bye, Roman
> 
> Signed-off-by: Roman Zippel <zippel@linux-m68k.org>

Looks ok to me.

Acked-by: John Stultz <johnstul@us.ibm.com>

> ---
>  kernel/time/timekeeping.c |   67 ++++++++++++++++++++--------------------------
>  1 file changed, 30 insertions(+), 37 deletions(-)
> 
> Index: linux-2.6/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c
> @@ -58,27 +58,23 @@ struct clocksource *clock;
> 
>  #ifdef CONFIG_GENERIC_TIME
>  /**
> - * __get_nsec_offset - Returns nanoseconds since last call to periodic_hook
> + * clocksource_forward_now - update clock to the current time
>   *
> - * private function, must hold xtime_lock lock when being
> - * called. Returns the number of nanoseconds since the
> - * last call to update_wall_time() (adjusted by NTP scaling)
> + * Forward the current clock to update its state since the last call to
> + * update_wall_time(). This is useful before significant clock changes,
> + * as it avoids having to deal with this time offset explicitly.
>   */
> -static inline s64 __get_nsec_offset(void)
> +static void clocksource_forward_now(void)
>  {
>  	cycle_t cycle_now, cycle_delta;
> -	s64 ns_offset;
> +	s64 nsec;
> 
> -	/* read clocksource: */
>  	cycle_now = clocksource_read(clock);
> -
> -	/* calculate the delta since the last update_wall_time: */
>  	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> +	clock->cycle_last = cycle_now;
> 
> -	/* convert to nanoseconds: */
> -	ns_offset = cyc2ns(clock, cycle_delta);
> -
> -	return ns_offset;
> +	nsec = cyc2ns(clock, cycle_delta);
> +	timespec_add_ns(&xtime, nsec);
>  }
> 
>  /**
> @@ -89,6 +85,7 @@ static inline s64 __get_nsec_offset(void
>   */
>  void getnstimeofday(struct timespec *ts)
>  {
> +	cycle_t cycle_now, cycle_delta;
>  	unsigned long seq;
>  	s64 nsecs;
> 
> @@ -96,7 +93,15 @@ void getnstimeofday(struct timespec *ts)
>  		seq = read_seqbegin(&xtime_lock);
> 
>  		*ts = xtime;
> -		nsecs = __get_nsec_offset();
> +
> +		/* read clocksource: */
> +		cycle_now = clocksource_read(clock);
> +
> +		/* calculate the delta since the last update_wall_time: */
> +		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> +
> +		/* convert to nanoseconds: */
> +		nsecs = cyc2ns(clock, cycle_delta);
> 
>  	} while (read_seqretry(&xtime_lock, seq));
> 
> @@ -130,21 +135,19 @@ EXPORT_SYMBOL(do_gettimeofday);
>  int do_settimeofday(struct timespec *tv)
>  {
>  	unsigned long flags;
> -	time_t wtm_sec, sec = tv->tv_sec;
> -	long wtm_nsec, nsec = tv->tv_nsec;
> 
>  	if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
>  		return -EINVAL;
> 
>  	write_seqlock_irqsave(&xtime_lock, flags);
> 
> -	nsec -= __get_nsec_offset();
> +	clocksource_forward_now();
> 
> -	wtm_sec  = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec);
> -	wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec);
> +	wall_to_monotonic.tv_sec += xtime.tv_sec - tv->tv_sec;
> +	timespec_add_ns(&wall_to_monotonic, xtime.tv_nsec - tv->tv_nsec);
> +
> +	xtime = *tv;
> 
> -	set_normalized_timespec(&xtime, sec, nsec);
> -	set_normalized_timespec(&wall_to_monotonic, wtm_sec, wtm_nsec);
>  	update_xtime_cache(0);
> 
>  	clock->error = 0;
> @@ -170,21 +173,16 @@ EXPORT_SYMBOL(do_settimeofday);
>  static void change_clocksource(void)
>  {
>  	struct clocksource *new;
> -	cycle_t now;
> -	u64 nsec;
> 
>  	new = clocksource_get_next();
> 
>  	if (clock == new)
>  		return;
> 
> -	now = clocksource_read(new);
> -	nsec =  __get_nsec_offset();
> -	timespec_add_ns(&xtime, nsec);
> +	clocksource_forward_now();
> 
>  	clock = new;
> -	clock->cycle_last = now;
> -
> +	clock->cycle_last = clocksource_read(new);
>  	clock->error = 0;
>  	clock->xtime_nsec = 0;
>  	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
> @@ -199,8 +197,8 @@ static void change_clocksource(void)
>  	 */
>  }
>  #else
> +static inline void clocksource_forward_now(void) { }
>  static inline void change_clocksource(void) { }
> -static inline s64 __get_nsec_offset(void) { return 0; }
>  #endif
> 
>  /**
> @@ -264,8 +262,6 @@ void __init timekeeping_init(void)
>  static int timekeeping_suspended;
>  /* time in seconds when suspend began */
>  static unsigned long timekeeping_suspend_time;
> -/* xtime offset when we went into suspend */
> -static s64 timekeeping_suspend_nsecs;
> 
>  /**
>   * timekeeping_resume - Resumes the generic timekeeping subsystem.
> @@ -291,8 +287,6 @@ static int timekeeping_resume(struct sys
>  		wall_to_monotonic.tv_sec -= sleep_length;
>  		total_sleep_time += sleep_length;
>  	}
> -	/* Make sure that we have the correct xtime reference */
> -	timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
>  	update_xtime_cache(0);
>  	/* re-base the last cycle value */
>  	clock->cycle_last = clocksource_read(clock);
> @@ -317,8 +311,7 @@ static int timekeeping_suspend(struct sy
>  	timekeeping_suspend_time = read_persistent_clock();
> 
>  	write_seqlock_irqsave(&xtime_lock, flags);
> -	/* Get the current xtime offset */
> -	timekeeping_suspend_nsecs = __get_nsec_offset();
> +	clocksource_forward_now();
>  	timekeeping_suspended = 1;
>  	write_sequnlock_irqrestore(&xtime_lock, flags);
> 
> @@ -459,10 +452,10 @@ void update_wall_time(void)
>  	 */
>  	while (offset >= clock->cycle_interval) {
>  		/* accumulate one interval */
> -		clock->xtime_nsec += clock->xtime_interval;
> -		clock->cycle_last += clock->cycle_interval;
>  		offset -= clock->cycle_interval;
> +		clock->cycle_last += clock->cycle_interval;
> 
> +		clock->xtime_nsec += clock->xtime_interval;
>  		if (clock->xtime_nsec >= (u64)NSEC_PER_SEC << clock->shift) {
>  			clock->xtime_nsec -= (u64)NSEC_PER_SEC << clock->shift;
>  			xtime.tv_sec++;


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW
  2008-04-02 11:50   ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW Roman Zippel
@ 2008-04-02 16:01     ` John Stultz
  2008-04-02 16:37       ` Roman Zippel
  2008-04-03 21:11     ` Andrew Morton
  1 sibling, 1 reply; 13+ messages in thread
From: John Stultz @ 2008-04-02 16:01 UTC (permalink / raw)
  To: Roman Zippel; +Cc: lkml, Andrew Morton


On Wed, 2008-04-02 at 13:50 +0200, Roman Zippel wrote:
> Hi,
> 
> On Tue, 18 Mar 2008, john stultz wrote:
> 
> > My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> > nanosecond based time value, that increments starting at bootup and has
> > no frequency adjustments made to it what so ever.
> > 
> > The time is accessed from userspace via the posix_clock_gettime()
> > syscall, passing CLOCK_MONOTONIC_RAW as the clock_id.
> 
> This is a reworked version of this patch based on the previous 
> clocksource_forward_now patch, since clocksource_forward_now() takes 
> care of the time offset now, it's not needed to do this at various places.
> I also got rid of the monotonic_raw splitting, so the work done during 
> update_wall_time() is quite a bit simpler.
> 
> bye, Roman
> 
> Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
> 
> ---
>  include/linux/clocksource.h |    5 ++++
>  include/linux/time.h        |    2 +
>  kernel/posix-timers.c       |   15 ++++++++++++++
>  kernel/time/timekeeping.c   |   47 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 69 insertions(+)
> 
> Index: linux-2.6/include/linux/clocksource.h
> ===================================================================
> --- linux-2.6.orig/include/linux/clocksource.h
> +++ linux-2.6/include/linux/clocksource.h
> @@ -79,6 +79,7 @@ struct clocksource {
>  	/* timekeeping specific data, ignore */
>  	cycle_t cycle_interval;
>  	u64	xtime_interval;
> +	u64	raw_interval;
>  	/*
>  	 * Second part is written at each timer interrupt
>  	 * Keep it in a different cache line to dirty no
> @@ -87,6 +88,8 @@ struct clocksource {
>  	cycle_t cycle_last ____cacheline_aligned_in_smp;
>  	u64 xtime_nsec;
>  	s64 error;
> +	u64 raw_nsec;
> +	long raw_sec;


So, with the raw_sec being stored in the clocksource, and there not
being a monotonic_raw value, doesn't this mean the MONOTONIC_RAW value
will clear to zero on clocksource changes? 


>  #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
>  	/* Watchdog related data, used by the framework */
> @@ -215,6 +218,8 @@ static inline void clocksource_calculate
> 
>  	/* Go back from cycles -> shifted ns, this time use ntp adjused mult */
>  	c->xtime_interval = (u64)c->cycle_interval * c->mult;
> +	c->raw_interval = ((u64)c->cycle_interval * c->mult_orig) <<
> +			  (NTP_SCALE_SHIFT - c->shift);
>  }

Could you explain further how this extra shift scaling is beneficial?
(Additionally, if we're using it for more then just NTP's shift, we
might want to change its name).

thanks
-john


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW
  2008-04-02 16:01     ` John Stultz
@ 2008-04-02 16:37       ` Roman Zippel
  0 siblings, 0 replies; 13+ messages in thread
From: Roman Zippel @ 2008-04-02 16:37 UTC (permalink / raw)
  To: John Stultz; +Cc: lkml, Andrew Morton

Hi,

On Wed, 2 Apr 2008, John Stultz wrote:

> > +	u64 raw_nsec;
> > +	long raw_sec;
> 
> 
> So, with the raw_sec being stored in the clocksource, and there not
> being a monotonic_raw value, doesn't this mean the MONOTONIC_RAW value
> will clear to zero on clocksource changes? 

It's copied during the clock change.

> > @@ -215,6 +218,8 @@ static inline void clocksource_calculate
> > 
> >  	/* Go back from cycles -> shifted ns, this time use ntp adjused mult */
> >  	c->xtime_interval = (u64)c->cycle_interval * c->mult;
> > +	c->raw_interval = ((u64)c->cycle_interval * c->mult_orig) <<
> > +			  (NTP_SCALE_SHIFT - c->shift);
> >  }
> 
> Could you explain further how this extra shift scaling is beneficial?

The value has a constant scale, which allows a few optimizations, e.g. 
look at update_wall_time(), where the 64bit-shift and 64bit-compare 
has become a simple 32bit-compare, as it's now very easy to extract the 
full nanosecond part and to drop the fraction part.

> (Additionally, if we're using it for more then just NTP's shift, we
> might want to change its name).

I don't mind.

bye, Roman

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] Introduce clocksource_forward_now
  2008-04-02 11:39   ` [PATCH 1/2] Introduce clocksource_forward_now Roman Zippel
  2008-04-02 16:01     ` John Stultz
@ 2008-04-03 21:07     ` Andrew Morton
  1 sibling, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2008-04-03 21:07 UTC (permalink / raw)
  To: Roman Zippel; +Cc: johnstul, linux-kernel

On Wed, 2 Apr 2008 13:39:43 +0200 (CEST)
Roman Zippel <zippel@linux-m68k.org> wrote:

> Hi,
> 
> On Tue, 18 Mar 2008, john stultz wrote:
> 
> > My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> > nanosecond based time value, that increments starting at bootup and has
> > no frequency adjustments made to it what so ever.
> 
> There is a problem with the time offset since the last update_wall_time() 
> call, which isn't taken into account when switching clocks (possibly 
> during suspend/resume too), so that the clock might jump back during a 
> clock switch.
> To avoid making the whole more complex it's better to do small cleanup 
> first, so this patch introduces clocksource_forward_now() which takes care 
> of this offset since the last update_wall_time() call and adds it to the 
> clock, so there is no need anymore to deal with it explicitly.
> This is also gets rid of the timekeeping_suspend_nsecs hack, instead of 
> waiting until resume, the value is accumulated during suspend. In the end 
> there is only a single user of __get_nsec_offset() left, so I integrated 
> it back to getnstimeofday().
> 

It's unclear what tree this was agaisnt, but it was the wrong one.

> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c

Hopefully these changes will work when mixed with the pending 2.6.26
changes, but nobody yet knows.

> @@ -459,10 +452,10 @@ void update_wall_time(void)
>  	 */
>  	while (offset >= clock->cycle_interval) {
>  		/* accumulate one interval */
> -		clock->xtime_nsec += clock->xtime_interval;
> -		clock->cycle_last += clock->cycle_interval;
>  		offset -= clock->cycle_interval;
> +		clock->cycle_last += clock->cycle_interval;
>  
> +		clock->xtime_nsec += clock->xtime_interval;
>  		if (clock->xtime_nsec >= (u64)NSEC_PER_SEC << clock->shift) {
>  			clock->xtime_nsec -= (u64)NSEC_PER_SEC << clock->shift;
>  			xtime.tv_sec++;

This do-nothing change conflited with
clocksource-introduce-clock_monotonic_raw.patch (and fixes).  Please check
that the result is OK (ie: clocksource.raw_snsec ddd not need any changes)



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW
  2008-04-02 11:50   ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW Roman Zippel
  2008-04-02 16:01     ` John Stultz
@ 2008-04-03 21:11     ` Andrew Morton
  1 sibling, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2008-04-03 21:11 UTC (permalink / raw)
  To: Roman Zippel; +Cc: johnstul, linux-kernel

On Wed, 2 Apr 2008 13:50:36 +0200 (CEST)
Roman Zippel <zippel@linux-m68k.org> wrote:

> On Tue, 18 Mar 2008, john stultz wrote:
> 
> > My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> > nanosecond based time value, that increments starting at bootup and has
> > no frequency adjustments made to it what so ever.
> > 
> > The time is accessed from userspace via the posix_clock_gettime()
> > syscall, passing CLOCK_MONOTONIC_RAW as the clock_id.
> 
> This is a reworked version of this patch based on the previous 
> clocksource_forward_now patch, since clocksource_forward_now() takes 
> care of the time offset now, it's not needed to do this at various places.
> I also got rid of the monotonic_raw splitting, so the work done during 
> update_wall_time() is quite a bit simpler.

All right, I give up.  I dropped

clocksource-keep-track-of-original-clocksource-frequency.patch
clocksource-keep-track-of-original-clocksource-frequency-fix.patch
clocksource-introduce-clock_monotonic_raw.patch
clocksource-introduce-clock_monotonic_raw-fix.patch
clocksource-introduce-clock_monotonic_raw-fix-checkpatch-fixes.patch
clocksource-introduce-clocksource_forward_now.patch

Please someone resend everything from scratch when it's all sorted out.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-04-03 21:23 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-18 22:11 [PATCH 1/2] Keep track of original clocksource frequency john stultz
2008-03-18 22:13 ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW john stultz
2008-03-19  2:43   ` Andrew Morton
2008-03-19  3:01     ` john stultz
2008-04-02 11:39   ` [PATCH 1/2] Introduce clocksource_forward_now Roman Zippel
2008-04-02 16:01     ` John Stultz
2008-04-03 21:07     ` Andrew Morton
2008-04-02 11:50   ` [PATCH 2/2] Introduce CLOCK_MONOTONIC_RAW Roman Zippel
2008-04-02 16:01     ` John Stultz
2008-04-02 16:37       ` Roman Zippel
2008-04-03 21:11     ` Andrew Morton
2008-04-02 11:20 ` [PATCH 1/2] Keep track of original clocksource frequency Roman Zippel
2008-04-02 15:38   ` John Stultz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).