LKML Archive on lore.kernel.org
* Re: gettimeofday resolution seriously degraded in test9
       [not found]       ` <LGLz.1h2.5@gated-at.bofh.it>
@ 2003-10-28 19:19         ` David Mosberger-Tang
  2003-10-28 19:59           ` Stephen Hemminger
       [not found]         ` <LVAR.4Mb.3@gated-at.bofh.it>
  1 sibling, 1 reply; 9+ messages in thread
From: David Mosberger-Tang @ 2003-10-28 19:19 UTC (permalink / raw)
  To: linux-kernel

>>>>> On Tue, 28 Oct 2003 19:30:13 +0100, Stephen Hemminger <shemminger@osdl.org> said:

  Stephen> This should work better. Patch against 2.6.0-test9

Why not use the time-interpolator interface defined in timex.h?  It
should handle such things without any special hacks.

	--david

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: gettimeofday resolution seriously degraded in test9
  2003-10-28 19:19         ` gettimeofday resolution seriously degraded in test9 David Mosberger-Tang
@ 2003-10-28 19:59           ` Stephen Hemminger
  2003-10-29  0:19             ` David Mosberger
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2003-10-28 19:59 UTC (permalink / raw)
  To: linux-kernel

On 28 Oct 2003 11:19:21 -0800
David Mosberger-Tang <David.Mosberger@acm.org> wrote:

> >>>>> On Tue, 28 Oct 2003 19:30:13 +0100, Stephen Hemminger <shemminger@osdl.org> said:
> 
>   Stephen> This should work better. Patch against 2.6.0-test9
> 
> Why not use the time-interpolator interface defined in timex.h?  It
> should handle such things without any special hacks.
> 
> 	--david

Because it has not been used yet outside of ia64.  It would be worth investigating
post 2.6.0 if it could be shown to be as fast and more correct.  Several people
have talked about redoing the existing mess, but now is not the time to attack
this dragon...


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: gettimeofday resolution seriously degraded in test9
  2003-10-28 19:59           ` Stephen Hemminger
@ 2003-10-29  0:19             ` David Mosberger
  0 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2003-10-29  0:19 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: linux-kernel

>>>>> On Tue, 28 Oct 2003 11:59:17 -0800, Stephen Hemminger <shemminger@osdl.org> said:

  Stephen> On 28 Oct 2003 11:19:21 -0800
  Stephen> David Mosberger-Tang <David.Mosberger@acm.org> wrote:

  >> >>>>> On Tue, 28 Oct 2003 19:30:13 +0100, Stephen Hemminger <shemminger@osdl.org> said:

  Stephen> This should work better. Patch against 2.6.0-test9

  >> Why not use the time-interpolator interface defined in timex.h?  It
  >> should handle such things without any special hacks.

  Stephen> Because it has not been used yet outside of ia64.  It would
  Stephen> be worth investigating post 2.6.0 if it could be shown to
  Stephen> be as fast and more correct.  Several people have talked
  Stephen> about redoing the existing mess, but now is not the time to
  Stephen> attack this dragon...

OK, I certainly agree that it's not something that should be done in a
rush, so it's definitely post 2.6.0.

	--david

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] possible erronous use of tick_usec in do_gettimeofday
       [not found]                     ` <VOyG.w9.35@gated-at.bofh.it>
@ 2003-11-28  1:29                       ` Andi Kleen
  0 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2003-11-28  1:29 UTC (permalink / raw)
  To: Joe Korty; +Cc: shemminger, linux-kernel

Joe Korty <joe.korty@ccur.com> writes:

> test10's version of do_gettimeofday is using tick_usec which is
> defined in terms of USER_HZ not HZ.
>
> Against 2.6.0-test10-bk1.  Compiled, not tested, for comment only.
>

I added the changes to x86-64, but at least ping still complains
that the time is going backwards. The machine is running ntpd
and has a high drift (the AMD 8111 chipset doesn't have the most stable
timer in the world).

-Andi

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] possible erronous use of tick_usec in do_gettimeofday
  2003-11-25 21:12                       ` Joe Korty
@ 2003-11-25 23:26                         ` George Anzinger
  0 siblings, 0 replies; 9+ messages in thread
From: George Anzinger @ 2003-11-25 23:26 UTC (permalink / raw)
  To: Joe Korty
  Cc: Peter Chubb, root, Stephen Hemminger, Gabriel Paubert,
	john stultz, Linus Torvalds, lkml, Andrew Morton

Joe Korty wrote:
> On Tue, Nov 25, 2003 at 11:57:55AM -0800, George Anzinger wrote:
> 
>>Joe Korty wrote:
>>
>>>test10's version of do_gettimeofday is using tick_usec which is
>>>defined in terms of USER_HZ not HZ.
>>
>>We still have the problem that we are doing this calculation in usecs while 
>>the wall clock uses nsecs.  This would be fine if there were an even number 
>>of usecs in tick_nsec, but in fact it is somewhat less than (USEC_PER_SEC / 
>>HZ).  This means that this correction (if we are behind by 7 or more ticks) 
>>will push the clock past current time.  Here are the numbers:
>>
>>tick_nsec =999849 or 1ms less 151 ns.  So if we are behind 7 or more ticks 
>>we will report the time out 1 us too high.  (7 * 151 = 1057 or 1.057 usec).
>>
>>Question is, do we care?  Will we ever be 7ms late in updating the wall 
>>clock? As I recall, the wall clock is updated in the interrupt handler for 
>>the tick so, to be this late, we would need to suffer a long interrupt hold 
>>off AND the tick recovery code would need to have done its thing.  But this 
>>whole time is covered by a write_seqlock on xtime_lock, so how can this 
>>even happen?  Seems like it is only possible when we are locked and we then 
>>throw the whole thing away.
>>
>>A test I would like to see is to put this in the code AFTER the read unlock:
>>
>>if (lost)
>>	printk("Lost is %lu\n", lost);
>>
>>(need to pull "	unsigned long lost;" out of the do{}while loop to do 
>>this)
>>
>>In short, I think we are beating a dead issue.
> 
> 
> There are other issues too: the 'lost' calculation is a prediction
> over the next 'lost' number of ticks.  That prediction will be wrong
> if 1) adjtime goes to zero within that interval or, 2) adjtime was
> zero but went nonzero in that interval due to an adjtimex(2) call.
> 
> Despite these flaws the patch replaces truly broken code with code
> that is good but slightly inaccurate, which is good enough for now.

Can you prove that "lost" is EVER non-zero in a case we care about?  I.e. a case 
where the read_seq will exit the loop?

I could be wrong here, but I don't think it can happen.  That is why I suggested 
the if(lost) test.
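
The argument above rests on how a sequence-lock read loop discards
anything sampled while a writer is active. The following is only a
single-threaded userspace model of that retry rule, not the kernel's
seqlock code: struct seqcount_model and the four helpers are
hypothetical names, and memory barriers and the spinlock half of a
real seqlock are deliberately left out.

```c
#include <assert.h>

/* Hypothetical model of a sequence counter: odd means "write in progress". */
struct seqcount_model {
	unsigned int sequence;
};

/* Writer side: bump the counter before and after touching the data. */
static void model_write_begin(struct seqcount_model *s)
{
	s->sequence++;
	assert(s->sequence & 1);	/* now odd: readers must retry */
}

static void model_write_end(struct seqcount_model *s)
{
	assert(s->sequence & 1);	/* must pair with model_write_begin */
	s->sequence++;
}

/* Reader side: remember the counter value seen at the start. */
static unsigned int model_read_begin(const struct seqcount_model *s)
{
	return s->sequence;
}

/*
 * Nonzero if the values read since model_read_begin() must be thrown
 * away: either a write was already in progress (odd start value) or
 * the counter moved while we were reading.
 */
static int model_read_retry(const struct seqcount_model *s, unsigned int start)
{
	return (start & 1) || (s->sequence != start);
}
```

In this model, a reader that starts while the writer is inside its
critical section always sees an odd start value and retries, so
whatever it computed for "lost" on that pass never leaves the loop.
Whether the real timer interrupt path keeps xtime_lock held for the
whole window in question is the open point of the message above; the
sketch does not settle it.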

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] possible erronous use of tick_usec in do_gettimeofday
  2003-11-25 19:57                     ` George Anzinger
@ 2003-11-25 21:12                       ` Joe Korty
  2003-11-25 23:26                         ` George Anzinger
  0 siblings, 1 reply; 9+ messages in thread
From: Joe Korty @ 2003-11-25 21:12 UTC (permalink / raw)
  To: George Anzinger
  Cc: Peter Chubb, root, Stephen Hemminger, Gabriel Paubert,
	john stultz, Linus Torvalds, lkml, Andrew Morton

On Tue, Nov 25, 2003 at 11:57:55AM -0800, George Anzinger wrote:
> Joe Korty wrote:
> >test10's version of do_gettimeofday is using tick_usec which is
> >defined in terms of USER_HZ not HZ.
> 
> We still have the problem that we are doing this calculation in usecs while 
> the wall clock uses nsecs.  This would be fine if there were an even number 
> of usecs in tick_nsec, but in fact it is somewhat less than (USEC_PER_SEC / 
> HZ).  This means that this correction (if we are behind by 7 or more ticks) 
> will push the clock past current time.  Here are the numbers:
> 
> tick_nsec =999849 or 1ms less 151 ns.  So if we are behind 7 or more ticks 
> we will report the time out 1 us too high.  (7 * 151 = 1057 or 1.057 usec).
> 
> Question is, do we care?  Will we ever be 7ms late in updating the wall 
> clock? As I recall, the wall clock is updated in the interrupt handler for 
> the tick so, to be this late, we would need to suffer a long interrupt hold 
> off AND the tick recovery code would need to have done its thing.  But this 
> whole time is covered by a write_seqlock on xtime_lock, so how can this 
> even happen?  Seems like it is only possible when we are locked and we then 
> throw the whole thing away.
> 
> A test I would like to see is to put this in the code AFTER the read unlock:
> 
> if (lost)
> 	printk("Lost is %lu\n", lost);
> 
> (need to pull "	unsigned long lost;" out of the do{}while loop to do 
> this)
> 
> In short, I think we are beating a dead issue.

There are other issues too: the 'lost' calculation is a prediction
over the next 'lost' number of ticks.  That prediction will be wrong
if 1) adjtime goes to zero within that interval or, 2) adjtime was
zero but went nonzero in that interval due to an adjtimex(2) call.

Despite these flaws the patch replaces truly broken code with code
that is good but slightly inaccurate, which is good enough for now.

Joe

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] possible erronous use of tick_usec in do_gettimeofday
  2003-11-25 16:42                   ` [RFC] possible erronous use of tick_usec in do_gettimeofday Joe Korty
  2003-11-25 17:13                     ` Stephen Hemminger
@ 2003-11-25 19:57                     ` George Anzinger
  2003-11-25 21:12                       ` Joe Korty
  1 sibling, 1 reply; 9+ messages in thread
From: George Anzinger @ 2003-11-25 19:57 UTC (permalink / raw)
  To: Joe Korty
  Cc: Peter Chubb, root, Stephen Hemminger, Gabriel Paubert,
	john stultz, Linus Torvalds, lkml, Andrew Morton

Joe Korty wrote:
> test10's version of do_gettimeofday is using tick_usec which is
> defined in terms of USER_HZ not HZ.
> 
> Against 2.6.0-test10-bk1.  Compiled, not tested, for comment only.

We still have the problem that we are doing this calculation in usecs while the 
wall clock uses nsecs.  This would be fine if there were an even number of usecs 
in tick_nsec, but in fact it is somewhat less than (USEC_PER_SEC / HZ).  This 
means that this correction (if we are behind by 7 or more ticks) will push the 
clock past current time.  Here are the numbers:

tick_nsec =999849 or 1ms less 151 ns.  So if we are behind 7 or more ticks we 
will report the time out 1 us too high.  (7 * 151 = 1057 or 1.057 usec).
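
The arithmetic above can be checked with a small standalone sketch.
It assumes HZ = 1000 (implied by the "1ms" figure but not stated) and
uses the tick_nsec value quoted in this message; overshoot_nsec() and
ticks_until_usec_error() are hypothetical helper names, not kernel
functions.

```c
#include <assert.h>

#define HZ            1000UL	/* assumption: 1 ms tick */
#define USEC_PER_SEC  1000000UL
#define TICK_NSEC     999849UL	/* value quoted above: 1 ms less 151 ns */

/*
 * Nanoseconds by which a usec-based correction of
 * lost * (USEC_PER_SEC / HZ) runs ahead of a wall clock that
 * advances TICK_NSEC per tick, after `lost` missed ticks.
 */
static unsigned long overshoot_nsec(unsigned long lost)
{
	unsigned long credited = lost * (USEC_PER_SEC / HZ) * 1000UL;
	unsigned long actual = lost * TICK_NSEC;

	assert(credited >= actual);
	return credited - actual;
}

/* Smallest number of lost ticks whose overshoot reaches a full microsecond. */
static unsigned long ticks_until_usec_error(void)
{
	unsigned long lost = 0;

	while (overshoot_nsec(lost) < 1000UL)
		lost++;
	return lost;
}
```

With these values overshoot_nsec(7) is 1057 and
ticks_until_usec_error() is 7, matching the figures given above; six
lost ticks accumulate only 906 ns, which is still below one
microsecond.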

Question is, do we care?  Will we ever be 7ms late in updating the wall clock? 
As I recall, the wall clock is updated in the interrupt handler for the tick so, 
to be this late, we would need to suffer a long interrupt hold off AND the tick 
recovery code would need to have done its thing.  But this whole time is covered 
by a write_seqlock on xtime_lock, so how can this even happen?  Seems like it is 
only possible when we are locked and we then throw the whole thing away.

A test I would like to see is to put this in the code AFTER the read unlock:

if (lost)
	printk("Lost is %lu\n", lost);

(need to pull "	unsigned long lost;" out of the do{}while loop to do this)

In short, I think we are beating a dead issue.

-g
> 
> Joe
> 
> --- base/arch/i386/kernel/time.c	2003-11-23 20:31:55.000000000 -0500
> +++ new/arch/i386/kernel/time.c	2003-11-25 11:22:38.000000000 -0500
> @@ -94,7 +94,7 @@
>  {
>  	unsigned long seq;
>  	unsigned long usec, sec;
> -	unsigned long max_ntp_tick = tick_usec - tickadj;
> +	unsigned long max_ntp_tick;
>  
>  	do {
>  		unsigned long lost;
> @@ -110,13 +110,14 @@
>  		 * Better to lose some accuracy than have time go backwards..
>  		 */
>  		if (unlikely(time_adjust < 0)) {
> +			max_ntp_tick = (USEC_PER_SEC / HZ) - tickadj;
>  			usec = min(usec, max_ntp_tick);
>  
>  			if (lost)
>  				usec += lost * max_ntp_tick;
>  		}
>  		else if (unlikely(lost))
> -			usec += lost * tick_usec;
> +			usec += lost * (USEC_PER_SEC / HZ);
>  
>  		sec = xtime.tv_sec;
>  		usec += (xtime.tv_nsec / 1000);
> 
> 

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] possible erronous use of tick_usec in do_gettimeofday
  2003-11-25 16:42                   ` [RFC] possible erronous use of tick_usec in do_gettimeofday Joe Korty
@ 2003-11-25 17:13                     ` Stephen Hemminger
  2003-11-25 19:57                     ` George Anzinger
  1 sibling, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2003-11-25 17:13 UTC (permalink / raw)
  To: Joe Korty; +Cc: linux-kernel

On Tue, 25 Nov 2003 11:42:38 -0500
Joe Korty <joe.korty@ccur.com> wrote:

> test10's version of do_gettimeofday is using tick_usec which is
> defined in terms of USER_HZ not HZ.
> 
> Against 2.6.0-test10-bk1.  Compiled, not tested, for comment only.

You're right. tick_usec is in USER_HZ units, so the value of max_ntp_tick
would be too large.
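
A rough illustration of how large the difference is, as a standalone
sketch. The constants are assumptions, not taken from this thread:
HZ = 1000 and USER_HZ = 100 as on i386, and tick_usec initialised to
the rounded value of USEC_PER_SEC / USER_HZ. The helper names are
hypothetical, and tickadj is passed in rather than assumed.

```c
#include <assert.h>

/* Assumed i386 values; neither is stated in the thread. */
#define HZ            1000UL	/* kernel tick rate */
#define USER_HZ       100UL	/* tick rate exported to user space */
#define USEC_PER_SEC  1000000UL

/* Assumed initial value of tick_usec: microseconds per USER_HZ tick. */
#define TICK_USEC     ((USEC_PER_SEC + USER_HZ / 2) / USER_HZ)

/* Bound as computed before the patch: based on tick_usec. */
static unsigned long max_ntp_tick_old(unsigned long tickadj)
{
	assert(tickadj < TICK_USEC);
	return TICK_USEC - tickadj;
}

/* Bound as computed by the patch: based on the real tick length. */
static unsigned long max_ntp_tick_new(unsigned long tickadj)
{
	assert(tickadj < USEC_PER_SEC / HZ);
	return USEC_PER_SEC / HZ - tickadj;
}

/* The clamp applied in the time_adjust < 0 branch. */
static unsigned long clamp_usec(unsigned long usec, unsigned long max_ntp_tick)
{
	return usec < max_ntp_tick ? usec : max_ntp_tick;
}
```

Under these assumptions and tickadj = 1, the old bound is 9999 us
while the new one is 999 us. An intra-tick offset of 1000 us passes
through the old clamp unchanged but is limited to 999 us by the new
one, so the old code effectively never clamped within a 1 ms tick.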

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC] possible erronous use of tick_usec in do_gettimeofday
  2003-10-30 23:15                 ` Peter Chubb
@ 2003-11-25 16:42                   ` Joe Korty
  2003-11-25 17:13                     ` Stephen Hemminger
  2003-11-25 19:57                     ` George Anzinger
  0 siblings, 2 replies; 9+ messages in thread
From: Joe Korty @ 2003-11-25 16:42 UTC (permalink / raw)
  To: Peter Chubb
  Cc: root, George Anzinger, Stephen Hemminger, Gabriel Paubert,
	john stultz, Linus Torvalds, lkml, Andrew Morton

test10's version of do_gettimeofday is using tick_usec which is
defined in terms of USER_HZ not HZ.

Against 2.6.0-test10-bk1.  Compiled, not tested, for comment only.

Joe

--- base/arch/i386/kernel/time.c	2003-11-23 20:31:55.000000000 -0500
+++ new/arch/i386/kernel/time.c	2003-11-25 11:22:38.000000000 -0500
@@ -94,7 +94,7 @@
 {
 	unsigned long seq;
 	unsigned long usec, sec;
-	unsigned long max_ntp_tick = tick_usec - tickadj;
+	unsigned long max_ntp_tick;
 
 	do {
 		unsigned long lost;
@@ -110,13 +110,14 @@
 		 * Better to lose some accuracy than have time go backwards..
 		 */
 		if (unlikely(time_adjust < 0)) {
+			max_ntp_tick = (USEC_PER_SEC / HZ) - tickadj;
 			usec = min(usec, max_ntp_tick);
 
 			if (lost)
 				usec += lost * max_ntp_tick;
 		}
 		else if (unlikely(lost))
-			usec += lost * tick_usec;
+			usec += lost * (USEC_PER_SEC / HZ);
 
 		sec = xtime.tv_sec;
 		usec += (xtime.tv_nsec / 1000);

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2003-11-28  1:29 UTC | newest]

Thread overview: 9+ messages
     [not found] <LphK.2Dl.15@gated-at.bofh.it>
     [not found] ` <Lq47.3Go.11@gated-at.bofh.it>
     [not found]   ` <LqGL.4zF.11@gated-at.bofh.it>
     [not found]     ` <LAPN.1dU.11@gated-at.bofh.it>
     [not found]       ` <LGLz.1h2.5@gated-at.bofh.it>
2003-10-28 19:19         ` gettimeofday resolution seriously degraded in test9 David Mosberger-Tang
2003-10-28 19:59           ` Stephen Hemminger
2003-10-29  0:19             ` David Mosberger
     [not found]         ` <LVAR.4Mb.3@gated-at.bofh.it>
     [not found]           ` <M4uv.bw.5@gated-at.bofh.it>
     [not found]             ` <M7sx.4et.13@gated-at.bofh.it>
     [not found]               ` <MsGE.8cN.19@gated-at.bofh.it>
     [not found]                 ` <MsZZ.c3.5@gated-at.bofh.it>
     [not found]                   ` <Mufp.1YL.15@gated-at.bofh.it>
     [not found]                     ` <VOyG.w9.35@gated-at.bofh.it>
2003-11-28  1:29                       ` [RFC] possible erronous use of tick_usec in do_gettimeofday Andi Kleen
2003-10-28  0:29 gettimeofday resolution seriously degraded in test9 john stultz
2003-10-28  1:17 ` Stephen Hemminger
2003-10-28 11:55   ` Gabriel Paubert
2003-10-28 18:21     ` Stephen Hemminger
2003-10-29 10:07       ` Gabriel Paubert
2003-10-29 19:38         ` Stephen Hemminger
2003-10-29 22:50           ` Peter Chubb
2003-10-30 21:33             ` George Anzinger
2003-10-30 21:52               ` Richard B. Johnson
2003-10-30 23:15                 ` Peter Chubb
2003-11-25 16:42                   ` [RFC] possible erronous use of tick_usec in do_gettimeofday Joe Korty
2003-11-25 17:13                     ` Stephen Hemminger
2003-11-25 19:57                     ` George Anzinger
2003-11-25 21:12                       ` Joe Korty
2003-11-25 23:26                         ` George Anzinger
