LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* ALSA: hda: Make proper use of timecounter
@ 2021-11-24 22:40 Thomas Gleixner
  2021-11-26  8:55 ` Takashi Iwai
  2021-11-29 16:06 ` Pierre-Louis Bossart
  0 siblings, 2 replies; 4+ messages in thread
From: Thomas Gleixner @ 2021-11-24 22:40 UTC (permalink / raw)
  To: LKML
  Cc: Jaroslav Kysela, Takashi Iwai, Cezary Rojewski,
	Pierre-Louis Bossart, Liam Girdwood, Jie Yang, alsa-devel

HDA uses a timecounter to read a hardware clock running at 24 MHz. The
conversion factor is set with a mult value of 125 and a shift value of 0,
which is not converting the hardware clock to nanoseconds, it is converting
to 1/3 nanoseconds because the conversion factor from 24Mhz to nanoseconds
is 125/3. The usage sites divide the "nanoseconds" value returned by
timecounter_read() by 3 to get a real nanoseconds value.

There is a lengthy comment in azx_timecounter_init() explaining this
choice. That comment makes blatantly wrong assumptions about how
timecounters work and what can overflow.

The comment says:

     * Applying the 1/3 factor as part of the multiplication
     * requires at least 20 bits for a decent precision, however
     * overflows occur after about 4 hours or less, not a option.

timecounters operate on time deltas between two readouts of a clock and use
the mult/shift pair to calculate a precise nanoseconds value:

    delta_nsec = (delta_clock * mult) >> shift;

The fractional part is also taken into account and preserved to prevent
accumulated rounding errors. For details see cyclecounter_cyc2ns().

The mult/shift pair has to be chosen so that the multiplication of the
maximum expected delta value does not result in a 64bit overflow. As the
counter wraps around on 32bit, the maximum observable delta between two
reads is (1 << 32) - 1 which is about 178.9 seconds.

That in turn means the maximum multiplication factor which fits into an u32
will not cause a 64bit overflow ever because it's guaranteed that:

     ((1 << 32) - 1) ^ 2 < (1 << 64)

The resulting correct multiplication factor is 2796202667 and the shift
value is 26, i.e. 26 bit precision. The overflow of the multiplication
would happen exactly at a clock readout delta of 6597069765 which is way
after the wrap around of the hardware clock at around 274.8 seconds which
is off from the claimed 4 hours by more than an order of magnitude.

If the counter ever wraps around the last read value then the calculation
is off by the number of wrap arounds times 178.9 seconds because the
overflow cannot be observed.

Use clocks_calc_mult_shift(), which calculates the most accurate mult/shift
pair based on the given clock frequency, and remove the bogus comment along
with the divisions at the readout sites.

Fixes: 5d890f591d15 ("ALSA: hda: support for wallclock timestamps")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 sound/hda/hdac_stream.c           |   14 ++++----------
 sound/pci/hda/hda_controller.c    |    1 -
 sound/soc/intel/skylake/skl-pcm.c |    1 -
 3 files changed, 4 insertions(+), 12 deletions(-)

--- a/sound/hda/hdac_stream.c
+++ b/sound/hda/hdac_stream.c
@@ -534,17 +534,11 @@ static void azx_timecounter_init(struct
 	cc->mask = CLOCKSOURCE_MASK(32);
 
 	/*
-	 * Converting from 24 MHz to ns means applying a 125/3 factor.
-	 * To avoid any saturation issues in intermediate operations,
-	 * the 125 factor is applied first. The division is applied
-	 * last after reading the timecounter value.
-	 * Applying the 1/3 factor as part of the multiplication
-	 * requires at least 20 bits for a decent precision, however
-	 * overflows occur after about 4 hours or less, not a option.
+	 * Calculate the optimal mult/shift values. The counter wraps
+	 * around after ~178.9 seconds.
 	 */
-
-	cc->mult = 125; /* saturation after 195 years */
-	cc->shift = 0;
+	clocks_calc_mult_shift(&cc->mult, &cc->shift, 24000000,
+			       NSEC_PER_SEC, 178);
 
 	nsec = 0; /* audio time is elapsed time since trigger */
 	timecounter_init(tc, cc, nsec);
--- a/sound/pci/hda/hda_controller.c
+++ b/sound/pci/hda/hda_controller.c
@@ -504,7 +504,6 @@ static int azx_get_time_info(struct snd_
 		snd_pcm_gettime(substream->runtime, system_ts);
 
 		nsec = timecounter_read(&azx_dev->core.tc);
-		nsec = div_u64(nsec, 3); /* can be optimized */
 		if (audio_tstamp_config->report_delay)
 			nsec = azx_adjust_codec_delay(substream, nsec);
 
--- a/sound/soc/intel/skylake/skl-pcm.c
+++ b/sound/soc/intel/skylake/skl-pcm.c
@@ -1251,7 +1251,6 @@ static int skl_platform_soc_get_time_inf
 		snd_pcm_gettime(substream->runtime, system_ts);
 
 		nsec = timecounter_read(&hstr->tc);
-		nsec = div_u64(nsec, 3); /* can be optimized */
 		if (audio_tstamp_config->report_delay)
 			nsec = skl_adjust_codec_delay(substream, nsec);
 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: ALSA: hda: Make proper use of timecounter
  2021-11-24 22:40 ALSA: hda: Make proper use of timecounter Thomas Gleixner
@ 2021-11-26  8:55 ` Takashi Iwai
  2021-11-29 16:06 ` Pierre-Louis Bossart
  1 sibling, 0 replies; 4+ messages in thread
From: Takashi Iwai @ 2021-11-26  8:55 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Thomas Gleixner, LKML, Jaroslav Kysela, Takashi Iwai,
	Cezary Rojewski, Liam Girdwood, Jie Yang, alsa-devel

On Wed, 24 Nov 2021 23:40:01 +0100,
Thomas Gleixner wrote:
> 
> HDA uses a timecounter to read a hardware clock running at 24 MHz. The
> conversion factor is set with a mult value of 125 and a shift value of 0,
> which is not converting the hardware clock to nanoseconds, it is converting
> to 1/3 nanoseconds because the conversion factor from 24Mhz to nanoseconds
> is 125/3. The usage sites divide the "nanoseconds" value returned by
> timecounter_read() by 3 to get a real nanoseconds value.
> 
> There is a lengthy comment in azx_timecounter_init() explaining this
> choice. That comment makes blatantly wrong assumptions about how
> timecounters work and what can overflow.
> 
> The comment says:
> 
>      * Applying the 1/3 factor as part of the multiplication
>      * requires at least 20 bits for a decent precision, however
>      * overflows occur after about 4 hours or less, not a option.
> 
> timecounters operate on time deltas between two readouts of a clock and use
> the mult/shift pair to calculate a precise nanoseconds value:
> 
>     delta_nsec = (delta_clock * mult) >> shift;
> 
> The fractional part is also taken into account and preserved to prevent
> accumulated rounding errors. For details see cyclecounter_cyc2ns().
> 
> The mult/shift pair has to be chosen so that the multiplication of the
> maximum expected delta value does not result in a 64bit overflow. As the
> counter wraps around on 32bit, the maximum observable delta between two
> reads is (1 << 32) - 1 which is about 178.9 seconds.
> 
> That in turn means the maximum multiplication factor which fits into an u32
> will not cause a 64bit overflow ever because it's guaranteed that:
> 
>      ((1 << 32) - 1) ^ 2 < (1 << 64)
> 
> The resulting correct multiplication factor is 2796202667 and the shift
> value is 26, i.e. 26 bit precision. The overflow of the multiplication
> would happen exactly at a clock readout delta of 6597069765 which is way
> after the wrap around of the hardware clock at around 274.8 seconds which
> is off from the claimed 4 hours by more than an order of magnitude.
> 
> If the counter ever wraps around the last read value then the calculation
> is off by the number of wrap arounds times 178.9 seconds because the
> overflow cannot be observed.
> 
> Use clocks_calc_mult_shift(), which calculates the most accurate mult/shift
> pair based on the given clock frequency, and remove the bogus comment along
> with the divisions at the readout sites.
> 
> Fixes: 5d890f591d15 ("ALSA: hda: support for wallclock timestamps")
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Looks like a nice fix / cleanup.  Pierre, could you check it?


thanks,

Takashi


> ---
>  sound/hda/hdac_stream.c           |   14 ++++----------
>  sound/pci/hda/hda_controller.c    |    1 -
>  sound/soc/intel/skylake/skl-pcm.c |    1 -
>  3 files changed, 4 insertions(+), 12 deletions(-)
> 
> --- a/sound/hda/hdac_stream.c
> +++ b/sound/hda/hdac_stream.c
> @@ -534,17 +534,11 @@ static void azx_timecounter_init(struct
>  	cc->mask = CLOCKSOURCE_MASK(32);
>  
>  	/*
> -	 * Converting from 24 MHz to ns means applying a 125/3 factor.
> -	 * To avoid any saturation issues in intermediate operations,
> -	 * the 125 factor is applied first. The division is applied
> -	 * last after reading the timecounter value.
> -	 * Applying the 1/3 factor as part of the multiplication
> -	 * requires at least 20 bits for a decent precision, however
> -	 * overflows occur after about 4 hours or less, not a option.
> +	 * Calculate the optimal mult/shift values. The counter wraps
> +	 * around after ~178.9 seconds.
>  	 */
> -
> -	cc->mult = 125; /* saturation after 195 years */
> -	cc->shift = 0;
> +	clocks_calc_mult_shift(&cc->mult, &cc->shift, 24000000,
> +			       NSEC_PER_SEC, 178);
>  
>  	nsec = 0; /* audio time is elapsed time since trigger */
>  	timecounter_init(tc, cc, nsec);
> --- a/sound/pci/hda/hda_controller.c
> +++ b/sound/pci/hda/hda_controller.c
> @@ -504,7 +504,6 @@ static int azx_get_time_info(struct snd_
>  		snd_pcm_gettime(substream->runtime, system_ts);
>  
>  		nsec = timecounter_read(&azx_dev->core.tc);
> -		nsec = div_u64(nsec, 3); /* can be optimized */
>  		if (audio_tstamp_config->report_delay)
>  			nsec = azx_adjust_codec_delay(substream, nsec);
>  
> --- a/sound/soc/intel/skylake/skl-pcm.c
> +++ b/sound/soc/intel/skylake/skl-pcm.c
> @@ -1251,7 +1251,6 @@ static int skl_platform_soc_get_time_inf
>  		snd_pcm_gettime(substream->runtime, system_ts);
>  
>  		nsec = timecounter_read(&hstr->tc);
> -		nsec = div_u64(nsec, 3); /* can be optimized */
>  		if (audio_tstamp_config->report_delay)
>  			nsec = skl_adjust_codec_delay(substream, nsec);
>  
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: ALSA: hda: Make proper use of timecounter
  2021-11-24 22:40 ALSA: hda: Make proper use of timecounter Thomas Gleixner
  2021-11-26  8:55 ` Takashi Iwai
@ 2021-11-29 16:06 ` Pierre-Louis Bossart
  2021-11-29 16:42   ` Takashi Iwai
  1 sibling, 1 reply; 4+ messages in thread
From: Pierre-Louis Bossart @ 2021-11-29 16:06 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Jaroslav Kysela, Takashi Iwai, Cezary Rojewski, Liam Girdwood,
	Jie Yang, alsa-devel



On 11/24/21 4:40 PM, Thomas Gleixner wrote:
> HDA uses a timecounter to read a hardware clock running at 24 MHz. The
> conversion factor is set with a mult value of 125 and a shift value of 0,
> which is not converting the hardware clock to nanoseconds, it is converting
> to 1/3 nanoseconds because the conversion factor from 24Mhz to nanoseconds
> is 125/3. The usage sites divide the "nanoseconds" value returned by
> timecounter_read() by 3 to get a real nanoseconds value.
> 
> There is a lengthy comment in azx_timecounter_init() explaining this
> choice. That comment makes blatantly wrong assumptions about how
> timecounters work and what can overflow.
> 
> The comment says:
> 
>      * Applying the 1/3 factor as part of the multiplication
>      * requires at least 20 bits for a decent precision, however
>      * overflows occur after about 4 hours or less, not a option.
> 
> timecounters operate on time deltas between two readouts of a clock and use
> the mult/shift pair to calculate a precise nanoseconds value:
> 
>     delta_nsec = (delta_clock * mult) >> shift;
> 
> The fractional part is also taken into account and preserved to prevent
> accumulated rounding errors. For details see cyclecounter_cyc2ns().
> 
> The mult/shift pair has to be chosen so that the multiplication of the
> maximum expected delta value does not result in a 64bit overflow. As the
> counter wraps around on 32bit, the maximum observable delta between two
> reads is (1 << 32) - 1 which is about 178.9 seconds.
> 
> That in turn means the maximum multiplication factor which fits into an u32
> will not cause a 64bit overflow ever because it's guaranteed that:
> 
>      ((1 << 32) - 1) ^ 2 < (1 << 64)
> 
> The resulting correct multiplication factor is 2796202667 and the shift
> value is 26, i.e. 26 bit precision. The overflow of the multiplication
> would happen exactly at a clock readout delta of 6597069765 which is way
> after the wrap around of the hardware clock at around 274.8 seconds which
> is off from the claimed 4 hours by more than an order of magnitude.
> 
> If the counter ever wraps around the last read value then the calculation
> is off by the number of wrap arounds times 178.9 seconds because the
> overflow cannot be observed.
> 
> Use clocks_calc_mult_shift(), which calculates the most accurate mult/shift
> pair based on the given clock frequency, and remove the bogus comment along
> with the divisions at the readout sites.
> 
> Fixes: 5d890f591d15 ("ALSA: hda: support for wallclock timestamps")
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

I don't recall the reason of why I added separate steps for
multiplication by 125 and division by 3 back in 2012, but obviously they
weren't aligned with my own comment "Max buffer time is limited to 178
seconds to make sure wall clock counter does not overflow".

Thanks for the patch, much appreciated.

Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: ALSA: hda: Make proper use of timecounter
  2021-11-29 16:06 ` Pierre-Louis Bossart
@ 2021-11-29 16:42   ` Takashi Iwai
  0 siblings, 0 replies; 4+ messages in thread
From: Takashi Iwai @ 2021-11-29 16:42 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Thomas Gleixner, LKML, Jaroslav Kysela, Takashi Iwai,
	Cezary Rojewski, Liam Girdwood, Jie Yang, alsa-devel

On Mon, 29 Nov 2021 17:06:40 +0100,
Pierre-Louis Bossart wrote:
> 
> 
> 
> On 11/24/21 4:40 PM, Thomas Gleixner wrote:
> > HDA uses a timecounter to read a hardware clock running at 24 MHz. The
> > conversion factor is set with a mult value of 125 and a shift value of 0,
> > which is not converting the hardware clock to nanoseconds, it is converting
> > to 1/3 nanoseconds because the conversion factor from 24Mhz to nanoseconds
> > is 125/3. The usage sites divide the "nanoseconds" value returned by
> > timecounter_read() by 3 to get a real nanoseconds value.
> > 
> > There is a lengthy comment in azx_timecounter_init() explaining this
> > choice. That comment makes blatantly wrong assumptions about how
> > timecounters work and what can overflow.
> > 
> > The comment says:
> > 
> >      * Applying the 1/3 factor as part of the multiplication
> >      * requires at least 20 bits for a decent precision, however
> >      * overflows occur after about 4 hours or less, not a option.
> > 
> > timecounters operate on time deltas between two readouts of a clock and use
> > the mult/shift pair to calculate a precise nanoseconds value:
> > 
> >     delta_nsec = (delta_clock * mult) >> shift;
> > 
> > The fractional part is also taken into account and preserved to prevent
> > accumulated rounding errors. For details see cyclecounter_cyc2ns().
> > 
> > The mult/shift pair has to be chosen so that the multiplication of the
> > maximum expected delta value does not result in a 64bit overflow. As the
> > counter wraps around on 32bit, the maximum observable delta between two
> > reads is (1 << 32) - 1 which is about 178.9 seconds.
> > 
> > That in turn means the maximum multiplication factor which fits into an u32
> > will not cause a 64bit overflow ever because it's guaranteed that:
> > 
> >      ((1 << 32) - 1) ^ 2 < (1 << 64)
> > 
> > The resulting correct multiplication factor is 2796202667 and the shift
> > value is 26, i.e. 26 bit precision. The overflow of the multiplication
> > would happen exactly at a clock readout delta of 6597069765 which is way
> > after the wrap around of the hardware clock at around 274.8 seconds which
> > is off from the claimed 4 hours by more than an order of magnitude.
> > 
> > If the counter ever wraps around the last read value then the calculation
> > is off by the number of wrap arounds times 178.9 seconds because the
> > overflow cannot be observed.
> > 
> > Use clocks_calc_mult_shift(), which calculates the most accurate mult/shift
> > pair based on the given clock frequency, and remove the bogus comment along
> > with the divisions at the readout sites.
> > 
> > Fixes: 5d890f591d15 ("ALSA: hda: support for wallclock timestamps")
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> 
> I don't recall the reason of why I added separate steps for
> multiplication by 125 and division by 3 back in 2012, but obviously they
> weren't aligned with my own comment "Max buffer time is limited to 178
> seconds to make sure wall clock counter does not overflow".
> 
> Thanks for the patch, much appreciated.
> 
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

Now queued to for-next branch.  Thanks.


Takashi

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-11-29 16:45 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-24 22:40 ALSA: hda: Make proper use of timecounter Thomas Gleixner
2021-11-26  8:55 ` Takashi Iwai
2021-11-29 16:06 ` Pierre-Louis Bossart
2021-11-29 16:42   ` Takashi Iwai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).