LKML Archive on lore.kernel.org
* [BUG] Patch "CPU hotplug: call check_tsc_sync_source() with irqs off" breaks some drivers
@ 2007-03-25 19:16 Nicolas Boichat
  2007-03-26  4:32 ` Nicolas Boichat
  0 siblings, 1 reply; 4+ messages in thread
From: Nicolas Boichat @ 2007-03-25 19:16 UTC (permalink / raw)
  To: Ingo Molnar, Linus Torvalds
  Cc: Michal Piotrowski, linux-kernel, alsa-devel, tiwai

Hello,

I'm running a first-generation MacBook Pro (Core Duo, so x86).

I ran across these two problems while upgrading from 2.6.21-rc3 to the
current git HEAD:
 1. appletouch cannot initialize the device properly at boot time (the
module is automatically loaded by Gentoo); I have to reload the module
to get it to work.
 2. ALSA hda_intel (patch_sigmatel) fails to read the subsystem ID
properly. "head  /proc/asound/card0/codec#0" returns:
> Codec: SigmaTel STAC9221 A1
> Address: 0
> Vendor Id: 0x83847680
> Subsystem Id: 0x100
> Revision Id: 0x103401
while I expect this subsystem ID: 0x106b0200.
This is due to a read failure in sound/pci/hda/hda_codec.c at line 553.
(I have to reboot into OS X to get the correct ID again. The failure
seems to happen quite randomly, so I'm not absolutely certain this bug
is related, but it's something I have never seen before.)

I found out which commit seems to cause these bugs:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d04f41e35343f1d788551fd3f753f51794f4afcf

The latest git HEAD works fine with this commit reverted, but not with
it applied.

See below for the relevant parts of a diff of the dmesg output with and
without this commit, with my comments where needed. You can get the raw
dmesg outputs and my .config here:
http://www.boichat.ch/nicolas/linux/

Best regards,

Nicolas Boichat

--- dmesg-2.6.21-rc4-head-notimestamp	2007-03-26 02:19:57.000000000 +0800

Vanilla git

+++ dmesg-2.6.21-rc4-head-without-d04f41-notimestamp	2007-03-26 02:20:14.000000000 +0800

Vanilla git with commit d04f41e35343f1d788551fd3f753f51794f4afcf reverted.

@@ -1,4 +1,4 @@
-Linux version 2.6.21-rc4 (nicolas@nunuche) (gcc version 4.1.2 (Gentoo 4.1.2)) #1 SMP PREEMPT Sun Mar 25 17:06:13 SGT 2007
+Linux version 2.6.21-rc4 (nicolas@nunuche) (gcc version 4.1.2 (Gentoo 4.1.2)) #2 SMP PREEMPT Sun Mar 25 18:16:15 SGT 2007
 BIOS-provided physical RAM map:
 sanitize start
 sanitize end
@@ -93,14 +93,14 @@
 Enabling fast FPU save and restore... done.
 Enabling unmasked SIMD FPU exception support... done.
 Initializing CPU#0
 PID hash table entries: 4096 (order: 12, 16384 bytes)
-Detected 1952.371 MHz processor.

This is WRONG. I have a 1.83 GHz Core Duo.

+Detected 1831.082 MHz processor.

This is right.

 Console: colour VGA+ 80x25
 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
 Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
@@ -114,7 +114,7 @@
       .data : 0xc03457ab - 0xc041782c   ( 840 kB)
       .text : 0xc0100000 - 0xc03457ab   (2325 kB)
 Checking if this processor honours the WP bit even in supervisor mode... Ok.
-Calibrating delay using timer specific routine.. 4945.92 BogoMIPS (lpj=2472960)
+Calibrating delay using timer specific routine.. 4689.63 BogoMIPS (lpj=2344818)
 Mount-cache hash table entries: 512
 CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000 0000c1a9 00000000 00000000
 monitor/mwait feature present.
@@ -143,7 +143,7 @@
 SMP alternatives: switching to SMP code
 Booting processor 1/1 eip 3000
 Initializing CPU#1
-Calibrating delay using timer specific routine.. 3661.28 BogoMIPS (lpj=1830642)
+Calibrating delay using timer specific routine.. 3661.30 BogoMIPS (lpj=1830650)
 CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000 0000c1a9 00000000 00000000
 monitor/mwait feature present.
 CPU: L1 I cache: 32K, L1 D cache: 32K
@@ -154,12 +154,12 @@
 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#1.
 CPU1: Intel Genuine Intel(R) CPU           T2400  @ 1.83GHz stepping 08
-Total of 2 processors activated (8607.20 BogoMIPS).
+Total of 2 processors activated (8350.93 BogoMIPS).
 ENABLING IO-APIC IRQs
 ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
 checking TSC synchronization [CPU#0 -> CPU#1]: passed.
 Brought up 2 CPUs
-migration_cost=34
+migration_cost=46
 PM: Adding info for No Bus:platform
 NET: Registered protocol family 16
 PM: Adding info for No Bus:vtcon0
@@ -336,9 +336,9 @@
 pnp: 00:01: iomem range 0xfed14000-0xfed17fff could not be reserved
 pnp: 00:01: iomem range 0xfed18000-0xfed18fff could not be reserved
 pnp: 00:01: iomem range 0xfed19000-0xfed19fff could not be reserved
+Time: tsc clocksource has been installed.
 pnp: 00:06: iomem range 0xfed00000-0xfed003ff has been reserved
 PM: Adding info for No Bus:mem
-Time: tsc clocksource has been installed.
 PM: Adding info for No Bus:kmem
 PM: Adding info for No Bus:null
 PM: Adding info for No Bus:port
@@ -385,7 +385,7 @@
 Machine check exception polling timer started.
 PM: Adding info for platform:pcspkr
 audit: initializing netlink socket (disabled)
-audit(1174817446.273:1): initialized
+audit(1174817951.346:1): initialized
 highmem bounce pool size: 64 pages
 Total HugeTLB memory allocated, 0
 io scheduler noop registered
@@ -755,66 +772,76 @@
 sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
 Uniform CD-ROM driver Revision: 3.20
 sr 0:0:0:0: Attached scsi CD-ROM sr0
-drivers/usb/input/appletouch.c: Could not do mode read request from device (Geyser 3 mode)
-appletouch: probe of 1-2:1.1 failed with error -12
 usb 5-4: new high speed USB device using ehci_hcd and address 4
+Clocksource tsc unstable (delta = -288700753 ns)
 PM: Adding info for usb:5-4
 PM: Adding info for No Bus:usbdev5.4_ep00
 usb 5-4: configuration #1 chosen from 1 choice
 PM: Adding info for usb:5-4:1.0
 PM: Adding info for No Bus:usbdev5.4
-Clocksource tsc unstable (delta = -292504439 ns)
 usbcore: registered new interface driver appletouch
@@ -921,6 +948,7 @@
 -> GSI 22 (level, low) -> IRQ 21
 PCI: Setting latency timer of device 0000:00:1b.0 to 64
 hda_codec: STAC922x, Apple subsys_id=100
+hda_intel: azx_get_response timeout, switching to polling mode...
 PM: Adding info for No Bus:pcmC0D1p
 PM: Adding info for No Bus:pcmC0D0p
 PM: Adding info for No Bus:pcmC0D0c




* Re: [BUG] Patch "CPU hotplug: call check_tsc_sync_source() with irqs off" breaks some drivers
  2007-03-25 19:16 [BUG] Patch "CPU hotplug: call check_tsc_sync_source() with irqs off" breaks some drivers Nicolas Boichat
@ 2007-03-26  4:32 ` Nicolas Boichat
  2007-03-26  6:56   ` Ingo Molnar
  0 siblings, 1 reply; 4+ messages in thread
From: Nicolas Boichat @ 2007-03-26  4:32 UTC (permalink / raw)
  To: Ingo Molnar, Linus Torvalds
  Cc: Nicolas Boichat, Michal Piotrowski, linux-kernel, alsa-devel, tiwai

Nicolas Boichat wrote:
> Hello,
> 
> I'm running a first-generation MacBook Pro (Core Duo, so x86).
> 
> I ran across these two problems while upgrading from 2.6.21-rc3 to the
> current git HEAD:
>  1. appletouch cannot initialize the device properly at boot time (the
> module is automatically loaded by Gentoo); I have to reload the module
> to get it to work.
>  2. ALSA hda_intel (patch_sigmatel) fails to read the subsystem ID
> properly. "head  /proc/asound/card0/codec#0" returns:
>> Codec: SigmaTel STAC9221 A1
>> Address: 0
>> Vendor Id: 0x83847680
>> Subsystem Id: 0x100
>> Revision Id: 0x103401
> while I expect this subsystem ID: 0x106b0200.
> This is due to a read failure in sound/pci/hda/hda_codec.c at line 553.
> (I have to reboot into OS X to get the correct ID again. The failure
> seems to happen quite randomly, so I'm not absolutely certain this bug
> is related, but it's something I have never seen before.)
> 
> I found out which commit seems to cause these bugs:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d04f41e35343f1d788551fd3f753f51794f4afcf
> 
> The latest git HEAD works fine with this commit reverted, but not with
> it applied.

Sorry about blaming this commit. The problems happen randomly (about one
boot in two is OK, at least with 2.6.21-rc4). I'll run more tests and
post the results later.

Best regards,

Nicolas



* Re: [BUG] Patch "CPU hotplug: call check_tsc_sync_source() with irqs off" breaks some drivers
  2007-03-26  4:32 ` Nicolas Boichat
@ 2007-03-26  6:56   ` Ingo Molnar
  2007-03-26  9:47     ` [BUG] Macbook pro timer bug (was: "Patch "CPU hotplug: call check_tsc_sync_source() with irqs off" breaks some drivers") Nicolas Boichat
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Molnar @ 2007-03-26  6:56 UTC (permalink / raw)
  To: Nicolas Boichat
  Cc: Linus Torvalds, Michal Piotrowski, linux-kernel, alsa-devel, tiwai


* Nicolas Boichat <nicolas@boichat.ch> wrote:

> > I found out which commit seems to cause these bugs:
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d04f41e35343f1d788551fd3f753f51794f4afcf
> > 
> > The latest git HEAD works fine with this commit reverted, but not with
> > it applied.
> 
> Sorry about blaming this commit. The problems happen randomly (about one
> boot in two is OK, at least with 2.6.21-rc4). I'll run more tests
> and post the results later.

is this the 32-bit kernel? If yes then does commit 
4edc5db83f574dfcc8be35b7b96760ded543b360 (included in -rc5) fix it?

	Ingo


* Re: [BUG] Macbook pro timer bug (was: "Patch "CPU hotplug: call check_tsc_sync_source() with irqs off" breaks some drivers")
  2007-03-26  6:56   ` Ingo Molnar
@ 2007-03-26  9:47     ` Nicolas Boichat
  0 siblings, 0 replies; 4+ messages in thread
From: Nicolas Boichat @ 2007-03-26  9:47 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linus Torvalds, Michal Piotrowski, linux-kernel

Ingo Molnar wrote:
> * Nicolas Boichat <nicolas@boichat.ch> wrote:
>
>   
>>> I found out which commit seems to cause these bugs:
>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d04f41e35343f1d788551fd3f753f51794f4afcf
>>>
>>> The latest git HEAD works fine with this commit reverted, but not with
>>> it applied.
>>>       
>> Sorry about blaming this commit. The problems happen randomly (about one
>> boot in two is OK, at least with 2.6.21-rc4). I'll run more tests
>> and post the results later.
>>     
>
> is this the 32-bit kernel? 

Yes it is.

> If yes then does commit 
> 4edc5db83f574dfcc8be35b7b96760ded543b360 (included in -rc5) fix it?
>   

I don't know precisely which commit you are talking about, but I
upgraded to 2.6.21-rc5, and the problem became more obvious.

From what I have tested, I conclude that when the frequency of the
first CPU is detected incorrectly, timekeeping goes wrong regardless of
the timer configuration (tsc or hpet, with or without high-res timers).

It is not a new bug: it happened in 2.6.20 too (I also found an old
dmesg from 2.6.18-rc2 with the same frequency detection problem), except
that for some reason it didn't cause as many obvious bugs, so I
didn't notice it at the time.

(In case you don't remember my first message, I have a first-generation
MacBook Pro (Core Duo, so x86); see below for /proc/cpuinfo.)

This is the diff of two consecutive soft reboots, with the same kernel
(CONFIG_HPET_TIMER unset, CONFIG_HIGH_RES_TIMERS set):

--- dmesg-2.6.21-rc5-nohpet-error-notime	2007-03-26 16:42:00.000000000 +0800
+++ dmesg-2.6.21-rc5-nohpet-success-notime	2007-03-26 16:42:09.000000000 +0800
@@ -100,7 +100,7 @@
 Enabling unmasked SIMD FPU exception support... done.
 Initializing CPU#0
 PID hash table entries: 4096 (order: 12, 16384 bytes)
-Detected 1952.283 MHz processor.
+Detected 1830.920 MHz processor.
 Console: colour VGA+ 80x25
 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
 Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)


The "+" boot reports the correct CPU frequency (1.83 GHz).
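
A bad boot can be spotted automatically by comparing the "Detected ... MHz
processor." line in dmesg with the nominal frequency in the cpuinfo model
name. The following is only a rough sketch: the function names and the 2%
tolerance are my own assumptions, not anything the kernel provides.

```python
import re

# Matches the calibration line printed early in dmesg.
DETECTED_RE = re.compile(r"Detected (\d+\.\d+) MHz processor")
# Matches the nominal frequency embedded in the cpuinfo "model name" field.
NOMINAL_RE = re.compile(r"@ (\d+(?:\.\d+)?)GHz")


def detected_mhz(dmesg_text):
    """Return the calibrated frequency in MHz, or None if the line is absent."""
    match = DETECTED_RE.search(dmesg_text)
    return float(match.group(1)) if match else None


def nominal_mhz(model_name):
    """Return the nominal frequency in MHz taken from the model name, or None."""
    match = NOMINAL_RE.search(model_name)
    return float(match.group(1)) * 1000.0 if match else None


def calibration_looks_wrong(dmesg_text, model_name, tolerance=0.02):
    """True if detected and nominal frequencies differ by more than `tolerance`.

    The 2% default is an arbitrary (hypothetical) threshold chosen for
    illustration. Returns None when either value cannot be parsed.
    """
    detected = detected_mhz(dmesg_text)
    nominal = nominal_mhz(model_name)
    if detected is None or nominal is None:
        return None
    return abs(detected - nominal) / nominal > tolerance


if __name__ == "__main__":
    model = "Genuine Intel(R) CPU           T2400  @ 1.83GHz"
    for line in ("Detected 1952.283 MHz processor.",
                 "Detected 1830.920 MHz processor."):
        print(line, "->", calibration_looks_wrong(line, model))
```

With the two lines from the diff above, the first boot is flagged and the
second is not.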

When you run something like this on both:
# ntpdate swisstime.ethz.ch
# ntpdate swisstime.ethz.ch
# sleep 60
# ntpdate swisstime.ethz.ch

You get, with the "+" boot, what you expect:

26 Mar 16:33:34 ntpdate[3385]: adjust time server 129.132.2.21 offset 0.120391 sec
26 Mar 16:32:52 ntpdate[3376]: adjust time server 129.132.2.21 offset 0.137979 sec
-- pause of about 60 seconds
26 Mar 16:33:55 ntpdate[3386]: adjust time server 129.132.2.21 offset 0.125751 sec


i.e. the clock is still correct 60 seconds after you adjust it.

While, with the "-" boot, you get:

26 Mar 16:18:05 ntpdate[3444]: step time server 129.132.2.21 offset 3.829351 sec
26 Mar 16:18:08 ntpdate[3445]: adjust time server 129.132.2.21 offset 0.232023 sec
-- pause
26 Mar 16:19:16 ntpdate[3447]: step time server 129.132.2.21 offset 4.320653 sec
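
The offsets in the two ntpdate runs above can also be compared
mechanically. This is a small sketch with made-up helper names; it simply
subtracts the offset reported before the pause from the one reported after
it. Since "adjust" means ntpdate slews the clock rather than stepping it,
the result is only an approximation of the drift.

```python
import re

# Extracts the signed offset (in seconds) from an ntpdate output line.
OFFSET_RE = re.compile(r"offset (-?\d+\.\d+) sec")


def offsets(ntpdate_output):
    """Return every offset found in the ntpdate output, in order of appearance."""
    return [float(value) for value in OFFSET_RE.findall(ntpdate_output)]


def drift_over_pause(ntpdate_output):
    """Offset after the pause minus the offset right before it, in seconds."""
    values = offsets(ntpdate_output)
    if len(values) < 2:
        raise ValueError("need at least two ntpdate lines")
    return values[-1] - values[-2]


# Output lines copied from this message.
GOOD_BOOT = """\
26 Mar 16:33:34 ntpdate[3385]: adjust time server 129.132.2.21 offset 0.120391 sec
26 Mar 16:32:52 ntpdate[3376]: adjust time server 129.132.2.21 offset 0.137979 sec
26 Mar 16:33:55 ntpdate[3386]: adjust time server 129.132.2.21 offset 0.125751 sec
"""
BAD_BOOT = """\
26 Mar 16:18:05 ntpdate[3444]: step time server 129.132.2.21 offset 3.829351 sec
26 Mar 16:18:08 ntpdate[3445]: adjust time server 129.132.2.21 offset 0.232023 sec
26 Mar 16:19:16 ntpdate[3447]: step time server 129.132.2.21 offset 4.320653 sec
"""

if __name__ == "__main__":
    print("good boot drift: %.3f s" % drift_over_pause(GOOD_BOOT))
    print("bad boot drift:  %.3f s" % drift_over_pause(BAD_BOOT))
```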

The clock has drifted by about 4.1 seconds in 60 seconds, i.e. an error
of 6.8%, which is in the same range as the error between 1952 and 1833
MHz (6.5%).
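
For reference, the two percentages can be reproduced as follows. This is
just the arithmetic spelled out with the rounded figures quoted in this
message; the helper name is arbitrary.

```python
def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Clock drift: about 4.1 s accumulated over a pause of about 60 s.
clock_drift_pct = percent(4.1, 60.0)

# Calibration error: 1952 MHz detected versus 1833 MHz expected.
freq_error_pct = percent(1952.0 - 1833.0, 1833.0)

print(f"clock drift:     {clock_drift_pct:.1f}%")   # prints 6.8%
print(f"frequency error: {freq_error_pct:.1f}%")    # prints 6.5%
```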
 
I get the same results with both CONFIG_HPET_TIMER and
CONFIG_HIGH_RES_TIMERS unset: some boots are OK, some aren't.

I couldn't get a correct boot with these two options enabled, but that's
probably because I didn't reboot enough times.

If you want detailed kernel logs, see
http://www.boichat.ch/nicolas/linux/2nd/ .
Please tell me if you need more information, or if you would prefer
that I file a bug in Bugzilla.

Best regards,

Nicolas

---

/proc/cpuinfo on an incorrect boot:

nunuche ~ # cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU           T2400  @ 1.83GHz
stepping	: 8
cpu MHz		: 1952.216
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc pni monitor vmx est tm2 xtpr
bogomips	: 4689.63
clflush size	: 64

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU           T2400  @ 1.83GHz
stepping	: 8
cpu MHz		: 1952.216
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc pni monitor vmx est tm2 xtpr
bogomips	: 3661.27
clflush size	: 64


