LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y
@ 2008-02-11 13:41 Carlos R. Mafra
  2008-02-12  8:26 ` Eric Piel
  0 siblings, 1 reply; 5+ messages in thread
From: Carlos R. Mafra @ 2008-02-11 13:41 UTC (permalink / raw)
  To: linux-kernel, hpa

I apologize in advance if I am crazy about this, but I noticed
a strange regression wrt 2.6.24 in cpufreq (I think) in 2.6.25-rc1, which
goes away if I revert the following commit:

commit bdc807871d58285737d50dc6163d0feb72cb0dc2
Author: H. Peter Anvin <hpa@zytor.com>
Date:   Fri Feb 8 04:21:26 2008 -0800

    avoid overflows in kernel/time.c

    When the conversion factor between jiffies and milli- or microseconds is
    not a single multiply or divide, as for the case of HZ == 300, we currently
    do a multiply followed by a divide.  The intervening result, however, is
    subject to overflows, especially since the fraction is not simplified (for
    HZ == 300, we multiply by 300 and divide by 1000).

    This is exposed to the user when passing a large timeout to poll(), for
    example.

    This patch replaces the multiply-divide with a reciprocal multiplication on
    32-bit platforms.  When the input is an unsigned long, there is no portable
    way to do this on 64-bit platforms there is no portable way to do this
    since it requires a 128-bit intermediate result (which gcc does support on
    64-bit platforms but may generate libgcc calls, e.g.  on 64-bit s390), but
    since the output is a 32-bit integer in the cases affected, just simplify
    the multiply-divide (*3/10 instead of *300/1000).

    The reciprocal multiply used can have off-by-one errors in the upper half
    of the valid output range.  This could be avoided at the expense of having
    to deal with a potential 65-bit intermediate result.  Since the intent is
    to avoid overflow problems and most of the other time conversions are only
    semiexact, the off-by-one errors were considered an acceptable tradeoff.

[...]
[more text follows]

The problem in vanilla 2.6.25-rc1 happens with CONFIG_HZ_300=y (and doesn't
with CONFIG_HZ_250=y or with the above commit reverted). The cpu frequency doesn't
change anymore regardless of the load, and it stays high (2.0 GHz or 1.2 GHz) even
when idle (I checked with 'top'), when the usual is to go to 800 Mhz when idle (I
always use the ondemand governor compiled in and as the default governor).

The laptop is a Vaio VGN-FZ240E, core 2 duo T7250 @ 2.0 GHz and the kernel is x86_64.

If someone needs more information about this I will be happy to provide.

Carlos R. Mafra




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y
  2008-02-11 13:41 [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y Carlos R. Mafra
@ 2008-02-12  8:26 ` Eric Piel
  2008-02-12 11:41   ` Carlos R. Mafra
  2008-02-12 19:23   ` H. Peter Anvin
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Piel @ 2008-02-12  8:26 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: linux-kernel, hpa

[-- Attachment #1: Type: text/plain, Size: 2618 bytes --]

Carlos R. Mafra wrote:
> I apologize in advance if I am crazy about this, but I noticed
> a strange regression wrt 2.6.24 in cpufreq (I think) in 2.6.25-rc1, which
> goes away if I revert the following commit:
> 
> commit bdc807871d58285737d50dc6163d0feb72cb0dc2
> Author: H. Peter Anvin <hpa@zytor.com>
> Date:   Fri Feb 8 04:21:26 2008 -0800
> 
>     avoid overflows in kernel/time.c
> 
>     When the conversion factor between jiffies and milli- or microseconds is
>     not a single multiply or divide, as for the case of HZ == 300, we currently
>     do a multiply followed by a divide.  The intervening result, however, is
>     subject to overflows, especially since the fraction is not simplified (for
>     HZ == 300, we multiply by 300 and divide by 1000).
> 
>     This is exposed to the user when passing a large timeout to poll(), for
>     example.
> 
>     This patch replaces the multiply-divide with a reciprocal multiplication on
>     32-bit platforms.  When the input is an unsigned long, there is no portable
>     way to do this on 64-bit platforms there is no portable way to do this
>     since it requires a 128-bit intermediate result (which gcc does support on
>     64-bit platforms but may generate libgcc calls, e.g.  on 64-bit s390), but
>     since the output is a 32-bit integer in the cases affected, just simplify
>     the multiply-divide (*3/10 instead of *300/1000).
> 
>     The reciprocal multiply used can have off-by-one errors in the upper half
>     of the valid output range.  This could be avoided at the expense of having
>     to deal with a potential 65-bit intermediate result.  Since the intent is
>     to avoid overflow problems and most of the other time conversions are only
>     semiexact, the off-by-one errors were considered an acceptable tradeoff.
> 
> [...]
> [more text follows]
> 
> The problem in vanilla 2.6.25-rc1 happens with CONFIG_HZ_300=y (and doesn't
> with CONFIG_HZ_250=y or with the above commit reverted). The cpu frequency doesn't
> change anymore regardless of the load, and it stays high (2.0 GHz or 1.2 GHz) even
> when idle (I checked with 'top'), when the usual is to go to 800 Mhz when idle (I
> always use the ondemand governor compiled in and as the default governor).
> 
> The laptop is a Vaio VGN-FZ240E, core 2 duo T7250 @ 2.0 GHz and the kernel is x86_64.

Hi, it's great you found out the culprit commit because I was really 
wondering where this bug was coming from...
As a data point, my machine has a core 2 duo @ 1.2GHz and x86_64 arch. 
Do you also have the tickless option activated? (it could play a role)

See you,
Eric

[-- Attachment #2: E_A_B_Piel.vcf --]
[-- Type: text/x-vcard, Size: 342 bytes --]

begin:vcard
fn;quoted-printable:=C3=89ric Piel
n;quoted-printable:Piel;=C3=89ric
org:Technical University of Delft;Software Engineering Research Group
adr:HB 08.080;;Mekelweg 4;Delft;;2628 CD;The Netherlands
email;internet:E.A.B.Piel@tudelft.nl
tel;work:+31 15 278 6338
x-mozilla-html:FALSE
url:http://pieleric.free.fr
version:2.1
end:vcard


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y
  2008-02-12  8:26 ` Eric Piel
@ 2008-02-12 11:41   ` Carlos R. Mafra
  2008-02-12 19:23   ` H. Peter Anvin
  1 sibling, 0 replies; 5+ messages in thread
From: Carlos R. Mafra @ 2008-02-12 11:41 UTC (permalink / raw)
  To: Eric Piel, hpa, linux-kernel

Eric Piel wrote:
> Carlos R. Mafra wrote:
>> I apologize in advance if I am crazy about this, but I noticed
>> a strange regression wrt 2.6.24 in cpufreq (I think) in 2.6.25-rc1, which
>> goes away if I revert the following commit:
>>
>> commit bdc807871d58285737d50dc6163d0feb72cb0dc2
>> Author: H. Peter Anvin <hpa@zytor.com>
>> Date:   Fri Feb 8 04:21:26 2008 -0800
>>
>>     avoid overflows in kernel/time.c
>>
>>     When the conversion factor between jiffies and milli- or
>> microseconds is
>>     not a single multiply or divide, as for the case of HZ == 300, we
>> currently
>>     do a multiply followed by a divide.  The intervening result,
>> however, is
>>     subject to overflows, especially since the fraction is not
>> simplified (for
>>     HZ == 300, we multiply by 300 and divide by 1000).
>>
>>     This is exposed to the user when passing a large timeout to
>> poll(), for
>>     example.
>>
>>     This patch replaces the multiply-divide with a reciprocal
>> multiplication on
>>     32-bit platforms.  When the input is an unsigned long, there is no
>> portable
>>     way to do this on 64-bit platforms there is no portable way to do
>> this
>>     since it requires a 128-bit intermediate result (which gcc does
>> support on
>>     64-bit platforms but may generate libgcc calls, e.g.  on 64-bit
>> s390), but
>>     since the output is a 32-bit integer in the cases affected, just
>> simplify
>>     the multiply-divide (*3/10 instead of *300/1000).
>>
>>     The reciprocal multiply used can have off-by-one errors in the
>> upper half
>>     of the valid output range.  This could be avoided at the expense
>> of having
>>     to deal with a potential 65-bit intermediate result.  Since the
>> intent is
>>     to avoid overflow problems and most of the other time conversions
>> are only
>>     semiexact, the off-by-one errors were considered an acceptable
>> tradeoff.
>>
>> [...]
>> [more text follows]
>>
>> The problem in vanilla 2.6.25-rc1 happens with CONFIG_HZ_300=y (and
>> doesn't
>> with CONFIG_HZ_250=y or with the above commit reverted). The cpu
>> frequency doesn't
>> change anymore regardless of the load, and it stays high (2.0 GHz or
>> 1.2 GHz) even
>> when idle (I checked with 'top'), when the usual is to go to 800 Mhz
>> when idle (I
>> always use the ondemand governor compiled in and as the default
>> governor).
>>
>> The laptop is a Vaio VGN-FZ240E, core 2 duo T7250 @ 2.0 GHz and the
>> kernel is x86_64.
> 
> Hi, it's great you found out the culprit commit because I was really
> wondering where this bug was coming from...

Nice!

> As a data point, my machine has a core 2 duo @ 1.2GHz and x86_64 arch.
> Do you also have the tickless option activated? (it could play a role)

Yes, I have tickless enabled.

> See you,
> Eric


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y
  2008-02-12  8:26 ` Eric Piel
  2008-02-12 11:41   ` Carlos R. Mafra
@ 2008-02-12 19:23   ` H. Peter Anvin
  2008-02-12 19:53     ` Carlos R. Mafra
  1 sibling, 1 reply; 5+ messages in thread
From: H. Peter Anvin @ 2008-02-12 19:23 UTC (permalink / raw)
  To: Eric Piel; +Cc: Carlos R. Mafra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 470 bytes --]

Eric Piel wrote:
>>
>> The laptop is a Vaio VGN-FZ240E, core 2 duo T7250 @ 2.0 GHz and the 
>> kernel is x86_64.
> 
> Hi, it's great you found out the culprit commit because I was really 
> wondering where this bug was coming from...
> As a data point, my machine has a core 2 duo @ 1.2GHz and x86_64 arch. 
> Do you also have the tickless option activated? (it could play a role)
> 

I believe this patch should fix the problem.  Can you please verify?

Thanks,

	-hpa

[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 409 bytes --]

diff --git a/kernel/timeconst.pl b/kernel/timeconst.pl
index 62b1287..4146803 100644
--- a/kernel/timeconst.pl
+++ b/kernel/timeconst.pl
@@ -339,7 +339,7 @@ sub output($@)
 	print "\n";
 
 	foreach $pfx ('HZ_TO_MSEC','MSEC_TO_HZ',
-		      'USEC_TO_HZ','HZ_TO_USEC') {
+		      'HZ_TO_USEC','USEC_TO_HZ') {
 		foreach $bit (32, 64) {
 			foreach $suf ('MUL', 'ADJ', 'SHR') {
 				printf "#define %-23s %s\n",

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y
  2008-02-12 19:23   ` H. Peter Anvin
@ 2008-02-12 19:53     ` Carlos R. Mafra
  0 siblings, 0 replies; 5+ messages in thread
From: Carlos R. Mafra @ 2008-02-12 19:53 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Eric Piel, linux-kernel

H. Peter Anvin wrote:
> Eric Piel wrote:
>>>
>>> The laptop is a Vaio VGN-FZ240E, core 2 duo T7250 @ 2.0 GHz and the
>>> kernel is x86_64.
>>
>> Hi, it's great you found out the culprit commit because I was really
>> wondering where this bug was coming from...
>> As a data point, my machine has a core 2 duo @ 1.2GHz and x86_64 arch.
>> Do you also have the tickless option activated? (it could play a role)
>>
> 
> I believe this patch should fix the problem.  Can you please verify?

I've just tested your patch on top of 2.6.25-rc1 and it fixes the
problem!

> Thanks,
> 
>     -hpa

Thank you very much for solving this so fast,
Carlos



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-02-12 19:49 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-11 13:41 [2.6.25-rc1] Strange regression with CONFIG_HZ_300=y Carlos R. Mafra
2008-02-12  8:26 ` Eric Piel
2008-02-12 11:41   ` Carlos R. Mafra
2008-02-12 19:23   ` H. Peter Anvin
2008-02-12 19:53     ` Carlos R. Mafra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).