LKML Archive on lore.kernel.org
* [PATCH] Warn of incorrect cpu_khz on AMD systems
@ 2008-11-04 15:27 Prarit Bhargava
  2008-11-06  9:01 ` Ingo Molnar
  0 siblings, 1 reply; 5+ messages in thread
From: Prarit Bhargava @ 2008-11-04 15:27 UTC (permalink / raw)
  To: linux-kernel, tglx, mark.langsdorf; +Cc: Prarit Bhargava

If none of the perfctrs are free when calculating cpu_khz we default to using
ctr 3 (ie, we just choose 3).  This may lead to an incorrect tsc freq value
which can cause the system to be unstable.

To aid in future debugging, WARN the user of a potential problem.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>

diff --git a/arch/x86/kernel/time_64.c b/arch/x86/kernel/time_64.c
index cb19d65..86d71b3 100644
--- a/arch/x86/kernel/time_64.c
+++ b/arch/x86/kernel/time_64.c
@@ -80,6 +80,8 @@ unsigned long __init calibrate_cpu(void)
 			break;
 	no_ctr_free = (i == 4);
 	if (no_ctr_free) {
+		printk(KERN_WARNING "Warning: AMD perfctrs busy ... "
+		       "cpu_khz value may be incorrect.\n");
 		i = 3;
 		rdmsrl(MSR_K7_EVNTSEL3, evntsel3);
 		wrmsrl(MSR_K7_EVNTSEL3, 0);
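
For readers following the thread, the selection logic the patch touches can be
sketched in user space. This is only an illustration, not the kernel code: the
busy[] array is a hypothetical stand-in for avail_to_resrv_perfctr_nmi_bit(),
fprintf() stands in for the kernel warning, and struct ctr_choice and
pick_perfctr() are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_PERFCTRS 4	/* counters 0..3, matching the loop bound in calibrate_cpu() */
#define FALLBACK_CTR 3	/* counter taken anyway when every counter is busy */

/* Hypothetical result type: which counter was picked, and whether it was forced. */
struct ctr_choice {
	int index;
	bool forced;
};

/*
 * busy[i] is an assumed stand-in for the kernel's reservation state:
 * true means counter i is already reserved by someone else.
 */
static struct ctr_choice pick_perfctr(const bool busy[NUM_PERFCTRS])
{
	struct ctr_choice choice;
	int i;

	assert(busy != NULL);

	/* Take the first counter that is not reserved. */
	for (i = 0; i < NUM_PERFCTRS; i++)
		if (!busy[i])
			break;

	/* The loop ran off the end: nothing was free. */
	choice.forced = (i == NUM_PERFCTRS);
	if (choice.forced) {
		/* User-space stand-in for the warning this patch adds. */
		fprintf(stderr, "Warning: AMD perfctrs busy ... "
				"cpu_khz value may be incorrect.\n");
		i = FALLBACK_CTR;
	}

	choice.index = i;
	return choice;
}
```

With all four entries of busy[] set, pick_perfctr() returns index 3 with
forced set, which is the case where the measurement shares a counter with
another user and the resulting cpu_khz cannot be trusted.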


* Re: [PATCH] Warn of incorrect cpu_khz on AMD systems
  2008-11-04 15:27 [PATCH] Warn of incorrect cpu_khz on AMD systems Prarit Bhargava
@ 2008-11-06  9:01 ` Ingo Molnar
  2008-11-06 13:42   ` Prarit Bhargava
  2008-11-12 18:35   ` Prarit Bhargava
  0 siblings, 2 replies; 5+ messages in thread
From: Ingo Molnar @ 2008-11-06  9:01 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, tglx, mark.langsdorf


* Prarit Bhargava <prarit@redhat.com> wrote:

> If none of the perfctrs are free when calculating cpu_khz we default 
> to using ctr 3 (ie, we just choose 3).  This may lead to an 
> incorrect tsc freq value which can cause the system to be unstable.
> 
> To aid in future debugging, WARN the user of a potential problem.

oh, nasty... when can this happen - are you using nmi_watchdog=2?

Cannot we avoid this situation somehow? The calibrate_cpu() function 
is quite ugly and does a dangerous thing by ignoring the reservation. 

This whole sequence is sloppy:

        for (i = 0; i < 4; i++)
                if (avail_to_resrv_perfctr_nmi_bit(i))
                        break;
        no_ctr_free = (i == 4);
        if (no_ctr_free) {
                i = 3;
                rdmsrl(MSR_K7_EVNTSEL3, evntsel3);
                wrmsrl(MSR_K7_EVNTSEL3, 0);
                rdmsrl(MSR_K7_PERFCTR3, pmc3);
        } else {
                reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i);
                reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
        }

>  	no_ctr_free = (i == 4);
>  	if (no_ctr_free) {
> +		printk(KERN_WARNING "Warning: AMD perfctrs busy ... "
> +		       "cpu_khz value may be incorrect.\n");

also, please use a WARN() instead so that kerneloops.org picks it up.

	Ingo


* Re: [PATCH] Warn of incorrect cpu_khz on AMD systems
  2008-11-06  9:01 ` Ingo Molnar
@ 2008-11-06 13:42   ` Prarit Bhargava
  2008-11-12 18:35   ` Prarit Bhargava
  1 sibling, 0 replies; 5+ messages in thread
From: Prarit Bhargava @ 2008-11-06 13:42 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, tglx, mark.langsdorf


> Cannot we avoid this situation somehow? The calibrate_cpu() function 
> is quite ugly and does a dangerous thing by ignoring the reservation. 
>
>   

Yes, I noticed that too -- it's really the crux of the problem.  The 
no_ctr_free path is a last-resort option for getting the system booted, 
but I wonder if it should exist at all.

I was originally thinking the system should just stop booting if 
no_ctr_free and panic() ... and I'm willing to make that patch.

The likelihood of hitting this is low, below 0.0125% of the time.  But 
when it does hit, it is nasty and difficult to diagnose: a busted 
cpu_khz was the last thing I suspected.

> This whole sequence is sloppy:
>
>         for (i = 0; i < 4; i++)
>                 if (avail_to_resrv_perfctr_nmi_bit(i))
>                         break;
>         no_ctr_free = (i == 4);
>         if (no_ctr_free) {
>                 i = 3;
>                 rdmsrl(MSR_K7_EVNTSEL3, evntsel3);
>                 wrmsrl(MSR_K7_EVNTSEL3, 0);
>                 rdmsrl(MSR_K7_PERFCTR3, pmc3);
>         } else {
>                 reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i);
>                 reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i);
>         }
>
>   
>>  	no_ctr_free = (i == 4);
>>  	if (no_ctr_free) {
>> +		printk(KERN_WARNING "Warning: AMD perfctrs busy ... "
>> +		       "cpu_khz value may be incorrect.\n");
>>     
>
> also, please use a WARN() instead so that kerneloops.org picks it up.
>
>   

Will do -- but do you think a panic() is more appropriate?

P.

> 	Ingo
>   


* Re: [PATCH] Warn of incorrect cpu_khz on AMD systems
  2008-11-06  9:01 ` Ingo Molnar
  2008-11-06 13:42   ` Prarit Bhargava
@ 2008-11-12 18:35   ` Prarit Bhargava
  2008-11-12 18:54     ` Ingo Molnar
  1 sibling, 1 reply; 5+ messages in thread
From: Prarit Bhargava @ 2008-11-12 18:35 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, tglx, mark.langsdorf

[-- Attachment #1: Type: text/plain, Size: 74 bytes --]

New patch replacing printk with WARN() based on Ingo's suggestion...

P.


[-- Attachment #2: upstream.patch --]
[-- Type: text/plain, Size: 775 bytes --]

If none of the perfctrs are free when calculating cpu_khz we default to using
ctr 3 (ie, we just choose 3).  This may lead to an incorrect tsc freq value
which can cause the system to be unstable.

To aid in future debugging, WARN the user of a potential problem.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>

diff --git a/arch/x86/kernel/time_64.c b/arch/x86/kernel/time_64.c
index cb19d65..e71d1ba 100644
--- a/arch/x86/kernel/time_64.c
+++ b/arch/x86/kernel/time_64.c
@@ -80,6 +80,8 @@ unsigned long __init calibrate_cpu(void)
 			break;
 	no_ctr_free = (i == 4);
 	if (no_ctr_free) {
+		WARN(1, KERN_WARNING "Warning: AMD perfctrs busy ... "
+		     "cpu_khz value may be incorrect.\n");
 		i = 3;
 		rdmsrl(MSR_K7_EVNTSEL3, evntsel3);
 		wrmsrl(MSR_K7_EVNTSEL3, 0);


* Re: [PATCH] Warn of incorrect cpu_khz on AMD systems
  2008-11-12 18:35   ` Prarit Bhargava
@ 2008-11-12 18:54     ` Ingo Molnar
  0 siblings, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2008-11-12 18:54 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-kernel, tglx, mark.langsdorf


* Prarit Bhargava <prarit@redhat.com> wrote:

> New patch replacing printk with WARN() based on Ingo's suggestion...

applied to tip/x86/debug, thanks Prarit!

	Ingo

