LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
@ 2007-12-07 22:53 Parag Warudkar
  2007-12-07 23:12 ` Pallipadi, Venkatesh
  2007-12-07 23:17 ` BUG: " Andrew Morton
  0 siblings, 2 replies; 36+ messages in thread
From: Parag Warudkar @ 2007-12-07 22:53 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, venkatesh.pallipadi, Linus Torvalds

Got this on today's git (2.6.24-rc4) while compiling  stuff - Looks
like it is related to CpuIdle stuff.
I chose CONFIG_CPU_IDLE for the first time so I don't know when this
was introduced.

This is on x86_32, SMP.

BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #3)
EIP: 0060:[<c0603e22>] EFLAGS: 00000202 CPU: 1
EIP is at _spin_lock_irqsave+0x16/0x27
EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 004f5960 CR3: 372c5000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c0438233>] tick_broadcast_oneshot_control+0x10/0xda
 [<c0437c82>] tick_notify+0x1d4/0x2eb
 [<c04281b4>] get_next_timer_interrupt+0x143/0x1b4
 [<c0605819>] notifier_call_chain+0x2a/0x47
 [<c04345a0>] raw_notifier_call_chain+0x17/0x1a
 [<c04378b7>] clockevents_notify+0x19/0x4f
 [<c0533cc3>] acpi_idle_enter_simple+0x183/0x1d0
 [<c058cea3>] cpuidle_idle_call+0x53/0x78
 [<c058ce50>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-07 22:53 BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0] Parag Warudkar
@ 2007-12-07 23:12 ` Pallipadi, Venkatesh
  2007-12-07 23:31   ` Parag Warudkar
  2007-12-08  2:56   ` Thomas Gleixner
  2007-12-07 23:17 ` BUG: " Andrew Morton
  1 sibling, 2 replies; 36+ messages in thread
From: Pallipadi, Venkatesh @ 2007-12-07 23:12 UTC (permalink / raw)
  To: Parag Warudkar, LKML
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner, Ingo Molnar

 

>-----Original Message-----
>From: Parag Warudkar [mailto:parag.warudkar@gmail.com] 
>Sent: Friday, December 07, 2007 2:54 PM
>To: LKML
>Cc: Andrew Morton; Pallipadi, Venkatesh; Linus Torvalds
>Subject: BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
>
>Got this on today's git (2.6.24-rc4) while compiling  stuff - Looks
>like it is related to CpuIdle stuff.
>I chose CONFIG_CPU_IDLE for the first time so I don't know when this
>was introduced.
>
>This is on x86_32, SMP.
>
>BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
>
>Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #3)
>EIP: 0060:[<c0603e22>] EFLAGS: 00000202 CPU: 1
>EIP is at _spin_lock_irqsave+0x16/0x27
>EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
>ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>CR0: 8005003b CR2: 004f5960 CR3: 372c5000 CR4: 000006d0
>DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>DR6: ffff0ff0 DR7: 00000400
> [<c0438233>] tick_broadcast_oneshot_control+0x10/0xda
> [<c0437c82>] tick_notify+0x1d4/0x2eb
> [<c04281b4>] get_next_timer_interrupt+0x143/0x1b4
> [<c0605819>] notifier_call_chain+0x2a/0x47
> [<c04345a0>] raw_notifier_call_chain+0x17/0x1a
> [<c04378b7>] clockevents_notify+0x19/0x4f
> [<c0533cc3>] acpi_idle_enter_simple+0x183/0x1d0
> [<c058cea3>] cpuidle_idle_call+0x53/0x78
> [<c058ce50>] cpuidle_idle_call+0x0/0x78
> [<c0402575>] cpu_idle+0x97/0xb8
>

Looks like tick_broadcast_lock did not get freed in some path.
You do not see this when you CPU_IDLE is not configured? 

Thanks,
Venki

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-07 22:53 BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0] Parag Warudkar
  2007-12-07 23:12 ` Pallipadi, Venkatesh
@ 2007-12-07 23:17 ` Andrew Morton
  2007-12-08  2:35   ` Rafael J. Wysocki
  1 sibling, 1 reply; 36+ messages in thread
From: Andrew Morton @ 2007-12-07 23:17 UTC (permalink / raw)
  To: Parag Warudkar, Rafael J. Wysocki
  Cc: linux-kernel, venkatesh.pallipadi, torvalds, linux-acpi

On Fri, 7 Dec 2007 17:53:47 -0500
"Parag Warudkar" <parag.warudkar@gmail.com> wrote:

> Got this on today's git (2.6.24-rc4) while compiling  stuff - Looks
> like it is related to CpuIdle stuff.
> I chose CONFIG_CPU_IDLE for the first time so I don't know when this
> was introduced.
> 
> This is on x86_32, SMP.
> 
> BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
> 
> Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #3)
> EIP: 0060:[<c0603e22>] EFLAGS: 00000202 CPU: 1
> EIP is at _spin_lock_irqsave+0x16/0x27
> EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
> ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: 004f5960 CR3: 372c5000 CR4: 000006d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
>  [<c0438233>] tick_broadcast_oneshot_control+0x10/0xda
>  [<c0437c82>] tick_notify+0x1d4/0x2eb
>  [<c04281b4>] get_next_timer_interrupt+0x143/0x1b4
>  [<c0605819>] notifier_call_chain+0x2a/0x47
>  [<c04345a0>] raw_notifier_call_chain+0x17/0x1a
>  [<c04378b7>] clockevents_notify+0x19/0x4f
>  [<c0533cc3>] acpi_idle_enter_simple+0x183/0x1d0
>  [<c058cea3>] cpuidle_idle_call+0x53/0x78
>  [<c058ce50>] cpuidle_idle_call+0x0/0x78
>  [<c0402575>] cpu_idle+0x97/0xb8

OK, thanks.  Another one for the regression list, please.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-07 23:12 ` Pallipadi, Venkatesh
@ 2007-12-07 23:31   ` Parag Warudkar
  2007-12-08  2:56   ` Thomas Gleixner
  1 sibling, 0 replies; 36+ messages in thread
From: Parag Warudkar @ 2007-12-07 23:31 UTC (permalink / raw)
  To: Pallipadi, Venkatesh
  Cc: LKML, Andrew Morton, Linus Torvalds, Thomas Gleixner, Ingo Molnar

On Dec 7, 2007 6:12 PM, Pallipadi, Venkatesh
<venkatesh.pallipadi@intel.com> wrote:

> Looks like tick_broadcast_lock did not get freed in some path.
> You do not see this when you CPU_IDLE is not configured?
>
> Thanks,
> Venki
>

No, I did not see this prior to enabling CPU_IDLE.

All previous kernels without CPU_IDLE also had soft lockup detection enabled.

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-07 23:17 ` BUG: " Andrew Morton
@ 2007-12-08  2:35   ` Rafael J. Wysocki
  0 siblings, 0 replies; 36+ messages in thread
From: Rafael J. Wysocki @ 2007-12-08  2:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Parag Warudkar, linux-kernel, venkatesh.pallipadi, torvalds, linux-acpi

On Saturday, 8 of December 2007, Andrew Morton wrote:
> On Fri, 7 Dec 2007 17:53:47 -0500
> "Parag Warudkar" <parag.warudkar@gmail.com> wrote:
> 
> > Got this on today's git (2.6.24-rc4) while compiling  stuff - Looks
> > like it is related to CpuIdle stuff.
> > I chose CONFIG_CPU_IDLE for the first time so I don't know when this
> > was introduced.
> > 
> > This is on x86_32, SMP.
> > 
> > BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
> > 
> > Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #3)
> > EIP: 0060:[<c0603e22>] EFLAGS: 00000202 CPU: 1
> > EIP is at _spin_lock_irqsave+0x16/0x27
> > EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
> > ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
> >  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > CR0: 8005003b CR2: 004f5960 CR3: 372c5000 CR4: 000006d0
> > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > DR6: ffff0ff0 DR7: 00000400
> >  [<c0438233>] tick_broadcast_oneshot_control+0x10/0xda
> >  [<c0437c82>] tick_notify+0x1d4/0x2eb
> >  [<c04281b4>] get_next_timer_interrupt+0x143/0x1b4
> >  [<c0605819>] notifier_call_chain+0x2a/0x47
> >  [<c04345a0>] raw_notifier_call_chain+0x17/0x1a
> >  [<c04378b7>] clockevents_notify+0x19/0x4f
> >  [<c0533cc3>] acpi_idle_enter_simple+0x183/0x1d0
> >  [<c058cea3>] cpuidle_idle_call+0x53/0x78
> >  [<c058ce50>] cpuidle_idle_call+0x0/0x78
> >  [<c0402575>] cpu_idle+0x97/0xb8
> 
> OK, thanks.  Another one for the regression list, please.

Added, http://bugzilla.kernel.org/show_bug.cgi?id=9525, thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-07 23:12 ` Pallipadi, Venkatesh
  2007-12-07 23:31   ` Parag Warudkar
@ 2007-12-08  2:56   ` Thomas Gleixner
  2007-12-08 15:10     ` Parag Warudkar
  1 sibling, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2007-12-08  2:56 UTC (permalink / raw)
  To: Pallipadi, Venkatesh
  Cc: Parag Warudkar, LKML, Andrew Morton, Linus Torvalds, Ingo Molnar

On Fri, 7 Dec 2007, Pallipadi, Venkatesh wrote:
> >-----Original Message-----
> >From: Parag Warudkar [mailto:parag.warudkar@gmail.com] 
> >Sent: Friday, December 07, 2007 2:54 PM
> >To: LKML
> >Cc: Andrew Morton; Pallipadi, Venkatesh; Linus Torvalds
> >Subject: BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
> >
> >Got this on today's git (2.6.24-rc4) while compiling  stuff - Looks
> >like it is related to CpuIdle stuff.
> >I chose CONFIG_CPU_IDLE for the first time so I don't know when this
> >was introduced.
> >
> >This is on x86_32, SMP.
> >
> >BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
> >
> >Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #3)
> >EIP: 0060:[<c0603e22>] EFLAGS: 00000202 CPU: 1
> >EIP is at _spin_lock_irqsave+0x16/0x27
> >EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
> >ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
> > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> >CR0: 8005003b CR2: 004f5960 CR3: 372c5000 CR4: 000006d0
> >DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> >DR6: ffff0ff0 DR7: 00000400
> > [<c0438233>] tick_broadcast_oneshot_control+0x10/0xda
> > [<c0437c82>] tick_notify+0x1d4/0x2eb
> > [<c04281b4>] get_next_timer_interrupt+0x143/0x1b4
> > [<c0605819>] notifier_call_chain+0x2a/0x47
> > [<c04345a0>] raw_notifier_call_chain+0x17/0x1a
> > [<c04378b7>] clockevents_notify+0x19/0x4f
> > [<c0533cc3>] acpi_idle_enter_simple+0x183/0x1d0
> > [<c058cea3>] cpuidle_idle_call+0x53/0x78
> > [<c058ce50>] cpuidle_idle_call+0x0/0x78
> > [<c0402575>] cpu_idle+0x97/0xb8
> >
> 
> Looks like tick_broadcast_lock did not get freed in some path.
> You do not see this when you CPU_IDLE is not configured? 

This looks pretty much like the problem I was solving yesterday.

Parag, can you please try Linus latest and please check whether there
is a stack trace with clockevents_program_event on the top in your
dmesg.

Thanks,
	tglx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08  2:56   ` Thomas Gleixner
@ 2007-12-08 15:10     ` Parag Warudkar
  2007-12-08 15:36       ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 15:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Pallipadi, Venkatesh, LKML, Andrew Morton, Linus Torvalds, Ingo Molnar

On Dec 7, 2007 9:56 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> This looks pretty much like the problem I was solving yesterday.
>
> Parag, can you please try Linus latest and please check whether there
> is a stack trace with clockevents_program_event on the top in your
> dmesg.
>

Just booted with latest git and there is no  clockevents_program_event
in dmesg anywhere.

Any way I have enabled CPU_IDLE again and the machine's got to do lot
of compiles today.
I'll see if the problem reproduces.

Thanks

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 15:10     ` Parag Warudkar
@ 2007-12-08 15:36       ` Parag Warudkar
  2007-12-08 15:47         ` Ingo Molnar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 15:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Pallipadi, Venkatesh, LKML, Andrew Morton, Linus Torvalds, Ingo Molnar

On Dec 8, 2007 10:10 AM, Parag Warudkar <parag.warudkar@gmail.com> wrote:
> On Dec 7, 2007 9:56 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > This looks pretty much like the problem I was solving yesterday.
> >
> > Parag, can you please try Linus latest and please check whether there
> > is a stack trace with clockevents_program_event on the top in your
> > dmesg.
> >
>
> Just booted with latest git and there is no  clockevents_program_event
> in dmesg anywhere.
>
> Any way I have enabled CPU_IDLE again and the machine's got to do lot
> of compiles today.
> I'll see if the problem reproduces.
>

Whoa whoa whoa - this happens left and right with latest Linus +CPU_IDLE.

As soon as I start compiling, dmesg starts filling up with -

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c0533d06>] EFLAGS: 00000202 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000da1 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00042070 EBP: 000412cf ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 080ce178 CR3: 00715000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
BUG: soft lockup - CPU#1 stuck for 11s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c0603eaa>] EFLAGS: 00000202 CPU: 1
EIP is at _spin_lock_irqsave+0x16/0x27
EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 080ff4b0 CR3: 37cf5000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c0438293>] tick_broadcast_oneshot_control+0x10/0xda
 [<c0437ce2>] tick_notify+0x1d4/0x2eb
 [<c04281bc>] get_next_timer_interrupt+0x143/0x1b4
 [<c06058a1>] notifier_call_chain+0x2a/0x47
 [<c04345c0>] raw_notifier_call_chain+0x17/0x1a
 [<c043781e>] clockevents_notify+0x19/0x4f
 [<c0533d23>] acpi_idle_enter_simple+0x183/0x1d0
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
BUG: soft lockup - CPU#1 stuck for 11s! [vim:3736]

Pid: 3736, comm: vim Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c060540b>] EFLAGS: 00000206 CPU: 1
EIP is at do_page_fault+0x117/0x583
EAX: f6ec1380 EBX: 005d0ff4 ECX: 0000007b EDX: 00000003
ESI: f6dd1fb8 EDI: 004c2000 EBP: f149eae0 ESP: f6dd1f70
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: 004c2000 CR3: 3736a000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c046d80d>] vfs_read+0x105/0x115
 [<c06052f4>] do_page_fault+0x0/0x583
 [<c06040aa>] error_code+0x72/0x78
 [<c0600000>] netlbl_domhsh_walk+0x94/0xba
 =======================
BUG: soft lockup - CPU#1 stuck for 11s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c0533d06>] EFLAGS: 00000202 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 0000000a ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 008e8c88 EBP: 008e8c7e ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 005da044 CR3: 37cf5000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
BUG: soft lockup - CPU#1 stuck for 22s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000d9b ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 007d1802 EBP: 007d0a67 ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 005da044 CR3: 37cf5000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
BUG: soft lockup - CPU#1 stuck for 18s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000da2 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00b651f2 EBP: 00b64450 ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 005da044 CR3: 37cf5000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
BUG: soft lockup - CPU#1 stuck for 14s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c0603eaa>] EFLAGS: 00000202 CPU: 1
EIP is at _spin_lock_irqsave+0x16/0x27
EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 080ff4b0 CR3: 37cf5000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c0438293>] tick_broadcast_oneshot_control+0x10/0xda
 [<c0437ce2>] tick_notify+0x1d4/0x2eb
 [<c04281bc>] get_next_timer_interrupt+0x143/0x1b4
 [<c06058a1>] notifier_call_chain+0x2a/0x47
 [<c04345c0>] raw_notifier_call_chain+0x17/0x1a
 [<c043781e>] clockevents_notify+0x19/0x4f
 [<c0533d23>] acpi_idle_enter_simple+0x183/0x1d0
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
BUG: soft lockup - CPU#1 stuck for 11s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #5)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000020 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00261276 EBP: 00261256 ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7f5e000 CR3: 37314000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 15:36       ` Parag Warudkar
@ 2007-12-08 15:47         ` Ingo Molnar
  2007-12-08 16:00           ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2007-12-08 15:47 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds


* Parag Warudkar <parag.warudkar@gmail.com> wrote:

>  [<c0438293>] tick_broadcast_oneshot_control+0x10/0xda
>  [<c0437ce2>] tick_notify+0x1d4/0x2eb
>  [<c04281bc>] get_next_timer_interrupt+0x143/0x1b4
>  [<c06058a1>] notifier_call_chain+0x2a/0x47
>  [<c04345c0>] raw_notifier_call_chain+0x17/0x1a
>  [<c043781e>] clockevents_notify+0x19/0x4f
>  [<c0533d23>] acpi_idle_enter_simple+0x183/0x1d0
>  [<c058cf03>] cpuidle_idle_call+0x53/0x78
>  [<c058ceb0>] cpuidle_idle_call+0x0/0x78
>  [<c0402575>] cpu_idle+0x97/0xb8
>  =======================
> BUG: soft lockup - CPU#1 stuck for 11s! [vim:3736]

does the patch below help? But the root cause is likely some timer 
problems - do you get consistent results from:

   while :; do time usleep 1111111; done

or do these sleeps fluctuate?

	Ingo

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -774,6 +774,7 @@ void sched_clock_idle_wakeup_event(u64 d
 	struct rq *rq = cpu_rq(smp_processor_id());
 	u64 now = sched_clock();
 
+	touch_softlockup_watchdog();
 	rq->idle_clock += delta_ns;
 	/*
 	 * Override the previous timestamp and ignore all

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 15:47         ` Ingo Molnar
@ 2007-12-08 16:00           ` Parag Warudkar
  2007-12-08 19:13             ` Ingo Molnar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 16:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds

On Dec 8, 2007 10:47 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> does the patch below help? But the root cause is likely some timer
> problems - do you get consistent results from:
>

Haven't yet tried the patch - will try a little later.

>    while :; do time usleep 1111111; done
>
> or do these sleeps fluctuate?
>

They seem to fluctuate - not sure if that's supposed to be exact or if
below variations are normal -
This is when my compiles are running -

[parag@mini ~]$ while :; do time usleep 1111111; done

real    0m1.116s
user    0m0.000s
sys     0m0.000s

real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.130s
user    0m0.000s
sys     0m0.004s

real    0m1.144s
user    0m0.000s
sys     0m0.000s


Thanks
Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 16:00           ` Parag Warudkar
@ 2007-12-08 19:13             ` Ingo Molnar
  2007-12-08 19:33               ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2007-12-08 19:13 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds


* Parag Warudkar <parag.warudkar@gmail.com> wrote:

> >    while :; do time usleep 1111111; done
> >
> > or do these sleeps fluctuate?
> 
> They seem to fluctuate - not sure if that's supposed to be exact or if 
> below variations are normal - This is when my compiles are running -

it's normal for them to fluctuate a little when load goes up and the 
task gets delayed.

'bad' fluctuations are ones that happen when the system is idle.

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 19:13             ` Ingo Molnar
@ 2007-12-08 19:33               ` Parag Warudkar
  2007-12-08 19:42                 ` Ingo Molnar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 19:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds

On Dec 8, 2007 2:13 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Parag Warudkar <parag.warudkar@gmail.com> wrote:
>
> > >    while :; do time usleep 1111111; done
> > >
> > > or do these sleeps fluctuate?
> >
> > They seem to fluctuate - not sure if that's supposed to be exact or if
> > below variations are normal - This is when my compiles are running -
>
> it's normal for them to fluctuate a little when load goes up and the
> task gets delayed.
>
> 'bad' fluctuations are ones that happen when the system is idle.
>

Even on 100% idle I get variations that are approx in the same range
when not idle.
Clocksource is hpet if that matters.  Next I think I will disable
CPU_IDLE, Tickless

My ssh connection just died - another lockup in dmesg -

BUG: soft lockup - CPU#1 stuck for 14s! [sshd:509]

Pid: 509, comm: sshd Not tainted (2.6.24-rc4 #6)
EIP: 0060:[<c060540b>] EFLAGS: 00200206 CPU: 1
EIP is at do_page_fault+0x117/0x583
EAX: ef5aaa80 EBX: 0048cfc0 ECX: 0000007b EDX: 00000003
ESI: ef691fb8 EDI: b7ca1ab4 EBP: ef47f050 ESP: ef691f70
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7ca1ab4 CR3: 2f555000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c06052f4>] do_page_fault+0x0/0x583
 [<c06040aa>] error_code+0x72/0x78
 [<c0600000>] netlbl_domhsh_walk+0x94/0xba


$> while :; do time usleep 1111111; done
real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.385s
user    0m0.000s
sys     0m0.004s

real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.279s
user    0m0.000s
sys     0m0.000s

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 19:33               ` Parag Warudkar
@ 2007-12-08 19:42                 ` Ingo Molnar
  2007-12-08 20:08                   ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2007-12-08 19:42 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds


* Parag Warudkar <parag.warudkar@gmail.com> wrote:

> Even on 100% idle I get variations that are approx in the same range 
> when not idle. Clocksource is hpet if that matters.  Next I think I 
> will disable CPU_IDLE, Tickless

also try the hpet=disable boot option.

> My ssh connection just died - another lockup in dmesg -
> 
> BUG: soft lockup - CPU#1 stuck for 14s! [sshd:509]

this would suggest that the softlockup watchdog is correct and indeed 
there was some lockup of sorts?

> $> while :; do time usleep 1111111; done
> real    0m1.112s
> real    0m1.385s
> real    0m1.112s
> real    0m1.112s
> real    0m1.279s

this, if done on an idle system, looks anomalous to me. Does 
hpet=disable help?

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 19:42                 ` Ingo Molnar
@ 2007-12-08 20:08                   ` Parag Warudkar
  2007-12-08 20:11                     ` Ingo Molnar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 20:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds

On Dec 8, 2007 2:42 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Parag Warudkar <parag.warudkar@gmail.com> wrote:
>
> > Even on 100% idle I get variations that are approx in the same range
> > when not idle. Clocksource is hpet if that matters.  Next I think I
> > will disable CPU_IDLE, Tickless
>
> also try the hpet=disable boot option.

With hpet=disable (clocksource is acpi_pm ) the usleeps are somewhat
more closer to correct - under idle as well as under kernel
compilation load.
But there are still fluctuations under 100% idle -

Kernel compile
--------------------------
real    0m1.112s
user    0m0.004s
sys     0m0.000s

real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.115s
user    0m0.000s
sys     0m0.000s

real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.112s
user    0m0.000s
sys     0m0.000s
------------------------------
IDLE
real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.131s
user    0m0.000s
sys     0m0.000s

real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.112s
user    0m0.000s
sys     0m0.000s

real    0m1.139s
user    0m0.000s
sys     0m0.000s

>
> > My ssh connection just died - another lockup in dmesg -
> >
> > BUG: soft lockup - CPU#1 stuck for 14s! [sshd:509]
>
> this would suggest that the softlockup watchdog is correct and indeed
> there was some lockup of sorts?

Yep, I was about to say that there are indeed real soft-lockups going
on as opposed to the psuedo ones that we thought may be due to timer
inconsistencies.

BTW this is no-preempt config and 2.6.22  and latest Centos-5.1
kernels ran fine with HPET.

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 20:08                   ` Parag Warudkar
@ 2007-12-08 20:11                     ` Ingo Molnar
  2007-12-08 20:46                       ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Ingo Molnar @ 2007-12-08 20:11 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds


* Parag Warudkar <parag.warudkar@gmail.com> wrote:

> But there are still fluctuations under 100% idle -

do you have CONFIG_HIGH_RES_TIMERS=y?

> IDLE
> real    0m1.112s
> real    0m1.131s
> real    0m1.112s
> real    0m1.112s
> real    0m1.139s

these fluctuations would still be OK if they are due to HZ granularity.

	Ingo

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 20:11                     ` Ingo Molnar
@ 2007-12-08 20:46                       ` Parag Warudkar
  2007-12-08 20:51                         ` Arjan van de Ven
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 20:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Pallipadi, Venkatesh, LKML, Andrew Morton,
	Linus Torvalds

On Dec 8, 2007 3:11 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Parag Warudkar <parag.warudkar@gmail.com> wrote:
>
> > But there are still fluctuations under 100% idle -
>
> do you have CONFIG_HIGH_RES_TIMERS=y?

Yes - NO_HZ=y  and HIGH_RES_TIMERS=y.
My ssh connection still died with hpet=disable although this time I
did not see soft lockup message.
Also it randomly gets stuck - suddenly there is a freeze and I can't
type for seconds.

> these fluctuations would still be OK if they are due to HZ granularity.
>

So are the hpet ones ok too or do they seem off even with HRTIMERS=y?
Just trying to figure out if I should be testing further with hpet
disabled or enabling it is ok.

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 20:46                       ` Parag Warudkar
@ 2007-12-08 20:51                         ` Arjan van de Ven
  2007-12-08 21:04                           ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Arjan van de Ven @ 2007-12-08 20:51 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Ingo Molnar, Thomas Gleixner, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds

On Sat, 8 Dec 2007 15:46:54 -0500
"Parag Warudkar" <parag.warudkar@gmail.com> wrote:

> On Dec 8, 2007 3:11 PM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > * Parag Warudkar <parag.warudkar@gmail.com> wrote:
> >
> > > But there are still fluctuations under 100% idle -
> >
> > do you have CONFIG_HIGH_RES_TIMERS=y?
> 
> Yes - NO_HZ=y  and HIGH_RES_TIMERS=y.
> My ssh connection still died with hpet=disable although this time I
> did not see soft lockup message.
> Also it randomly gets stuck - suddenly there is a freeze and I can't
> type for seconds.
> 
> > these fluctuations would still be OK if they are due to HZ
> > granularity.
> >
> 
> So are the hpet ones ok too or do they seem off even with HRTIMERS=y?
> Just trying to figure out if I should be testing further with hpet
> disabled or enabling it is ok.
> 

what chipset is this?
(I wonder if there's one where we shouldn't be force-enabling the hpet)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 20:51                         ` Arjan van de Ven
@ 2007-12-08 21:04                           ` Parag Warudkar
  2007-12-08 23:12                             ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 21:04 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Thomas Gleixner, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds

On Dec 8, 2007 3:51 PM, Arjan van de Ven <arjan@infradead.org> wrote:
> what chipset is this?
> (I wonder if there's one where we shouldn't be force-enabling the hpet)

It's an Intel Mac Mini - Intel 945GM chipset.
I believe OSX requires HPET and so there shouldn't be a need to force
enable it on this chipset?

Also I don't have problems under Centos 5.1 kernel (2.6.18-53.1.4.el5)
and hpet is enabled there.

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 21:04                           ` Parag Warudkar
@ 2007-12-08 23:12                             ` Parag Warudkar
  2007-12-09 21:57                               ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-08 23:12 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Thomas Gleixner, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds

No problems after disabling CONFIG_HIGHRES_TIMERS , CONFIG_CPU_IDLE
and CONFIG_NO_HZ.

I will try enabling them one by one - HRT, NOHZ and CPU_IDLE last -
that way we can at least tell what is required to be hit with this
problem.

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-08 23:12                             ` Parag Warudkar
@ 2007-12-09 21:57                               ` Parag Warudkar
  2007-12-09 22:30                                 ` Arjan van de Ven
  2007-12-10  5:49                                 ` Thomas Gleixner
  0 siblings, 2 replies; 36+ messages in thread
From: Parag Warudkar @ 2007-12-09 21:57 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Thomas Gleixner, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds

On Dec 8, 2007 6:12 PM, Parag Warudkar <parag.warudkar@gmail.com> wrote:
> No problems after disabling CONFIG_HIGHRES_TIMERS , CONFIG_CPU_IDLE
> and CONFIG_NO_HZ.
>
> I will try enabling them one by one - HRT, NOHZ and CPU_IDLE last -
> that way we can at least tell what is required to be hit with this
> problem.

Looks like CPU_IDLE=y is necessary for the problem to show up.
With CPU_IDLE=n HRT+NO_HZ+TICK_ONESHOT does not give soft lockup problems.
(Actually with HIGH_RES_TIMERS=NO_HZ=TICK_ONESHOT=y  I do see short
freezes on ssh - when I cannot type anything for may be a second even
under 100% idle. But Soft Lock up doesnt show up in dmesg with this
configuration.)

No Soft Lockup - CPU_IDLE=n 5+ Hours uptime
---------------------------------------------------------------------------
HIGH_RES_TIMERS=NO_HZ=TICK_ONESHOT=n
HIGH_RES_TIMERS=y NO_HZ=TICK_ONESHOT=n
HIGH_RES_TIMERS=NO_HZ=y TICK_ONESHOT=n
HIGH_RES_TIMERS=NO_HZ=TICK_ONESHOT=y - short freezes.


Soft Lockup - CPU_IDLE=y 2 hours uptime
--------------------------------------------------------------------------
HIGH_RES_TIMERS=NO_HZ=TICK_ONESHOT=y

dmesg -

BUG: soft lockup - CPU#1 stuck for 11s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #13)
EIP: 0060:[<c0603eaa>] EFLAGS: 00000202 CPU: 1
EIP is at _spin_lock_irqsave+0x16/0x27
EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 800f5958 CR3: 37284000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c0438293>] tick_broadcast_oneshot_control+0x10/0xda
 [<c0437ce2>] tick_notify+0x1d4/0x2eb
 [<c04281bc>] get_next_timer_interrupt+0x143/0x1b4
 [<c06058a1>] notifier_call_chain+0x2a/0x47
 [<c04345c0>] raw_notifier_call_chain+0x17/0x1a
 [<c043781e>] clockevents_notify+0x19/0x4f
 [<c0533d23>] acpi_idle_enter_simple+0x183/0x1d0
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-09 21:57                               ` Parag Warudkar
@ 2007-12-09 22:30                                 ` Arjan van de Ven
  2007-12-10  5:49                                 ` Thomas Gleixner
  1 sibling, 0 replies; 36+ messages in thread
From: Arjan van de Ven @ 2007-12-09 22:30 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Ingo Molnar, Thomas Gleixner, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds

On Sun, 9 Dec 2007 16:57:38 -0500
"Parag Warudkar" <parag.warudkar@gmail.com> wrote:

> On Dec 8, 2007 6:12 PM, Parag Warudkar <parag.warudkar@gmail.com>
> wrote:
> > No problems after disabling CONFIG_HIGHRES_TIMERS , CONFIG_CPU_IDLE
> > and CONFIG_NO_HZ.
> >
> > I will try enabling them one by one - HRT, NOHZ and CPU_IDLE last -
> > that way we can at least tell what is required to be hit with this
> > problem.
> 
> Looks like CPU_IDLE=y is necessary for the problem to show up.
> With CPU_IDLE=n HRT+NO_HZ+TICK_ONESHOT does not give soft lockup
> problems. (Actually with HIGH_RES_TIMERS=NO_HZ=TICK_ONESHOT=y  I do
> see short freezes on ssh - when I cannot type anything for may be a
> second even under 100% idle. But Soft Lock up doesnt show up in dmesg
> with this configuration.)


starts to sound like something is turning off the clock during a somewhat deeper C-state...

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-09 21:57                               ` Parag Warudkar
  2007-12-09 22:30                                 ` Arjan van de Ven
@ 2007-12-10  5:49                                 ` Thomas Gleixner
  2007-12-11  4:49                                   ` Parag Warudkar
  1 sibling, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2007-12-10  5:49 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds

On Sun, 9 Dec 2007, Parag Warudkar wrote:

> On Dec 8, 2007 6:12 PM, Parag Warudkar <parag.warudkar@gmail.com> wrote:
> > No problems after disabling CONFIG_HIGHRES_TIMERS , CONFIG_CPU_IDLE
> > and CONFIG_NO_HZ.
> >
> > I will try enabling them one by one - HRT, NOHZ and CPU_IDLE last -
> > that way we can at least tell what is required to be hit with this
> > problem.
> 
> Looks like CPU_IDLE=y is necessary for the problem to show up.
> With CPU_IDLE=n HRT+NO_HZ+TICK_ONESHOT does not give soft lockup problems.
> (Actually with HIGH_RES_TIMERS=NO_HZ=TICK_ONESHOT=y  I do see short
> freezes on ssh - when I cannot type anything for may be a second even
> under 100% idle. But Soft Lock up doesnt show up in dmesg with this
> configuration.)

Can you please apply the patch below ? It prints out the internal
state of the clockevents/timer system when the softlockup is detected.

Thanks,

	tglx

diff --git a/kernel/softlockup.c b/kernel/softlockup.c
index 11df812..82f1a05 100644
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -118,6 +118,7 @@ void softlockup_tick(void)
 		show_regs(regs);
 	else
 		dump_stack();
+	sysrq_timer_list_show();
 	spin_unlock(&print_lock);
 }
 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-10  5:49                                 ` Thomas Gleixner
@ 2007-12-11  4:49                                   ` Parag Warudkar
  2007-12-14 23:17                                     ` Len Brown
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-11  4:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds

On Dec 10, 2007 12:49 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> Can you please apply the patch below ? It prints out the internal
> state of the clockevents/timer system when the softlockup is detected.
>

Below is the relevant output after multiple soft lockups.

Thanks
Parag

BUG: soft lockup - CPU#1 stuck for 14s! [squid_redirect:3109]

Pid: 3109, comm: squid_redirect Not tainted (2.6.24-rc4 #15)
EIP: 0073:[<0077a362>] EFLAGS: 00000202 CPU: 1
EIP is at 0x77a362
EAX: 0804c008 EBX: 007ca5cc ECX: 08199947 EDX: 0846da74
ESI: 0846da74 EDI: 08199968 EBP: bfcc2d58 ESP: bfcc2d20
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
CR0: 80050033 CR2: b7ea2c84 CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 2899263521779 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1e9c>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 2899384000000 nsecs [in 120478221 nsecs]
 #1: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 2902364827067 nsecs [in 3101305288 nsecs]
 #2: <f6eb1e9c>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 2918995045613 nsecs [in 19731523834 nsecs]
 #3: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 2938118668932 nsecs [in 38855147153 nsecs]
 #4: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3037524193866 nsecs [in 138260672087 nsecs]
 #5: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 746313186881 nsecs]
  .expires_next   : 2899500000000 nsecs
  .hres_active    : 1
  .nr_events      : 113175
  .nohz_mode      : 2
  .idle_tick      : 2899000000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 649750
  .idle_calls     : 848768
  .idle_sleeps    : 642128
  .idle_entrytime : 2899580013124 nsecs
  .idle_sleeptime : 2675153362437 nsecs
  .last_jiffies   : 649900
  .next_jiffies   : 649918
  .idle_expires   : 2899496000000 nsecs
jiffies: 649906

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1e9c>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 735281596423 nsecs]
 #1: <f6eb1e9c>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 735369584826 nsecs]
  .expires_next   : 2899261000000 nsecs
  .hres_active    : 1
  .nr_events      : 51469
  .nohz_mode      : 2
  .idle_tick      : 2883861000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 645965
  .idle_calls     : 59946
  .idle_sleeps    : 21647
  .idle_entrytime : 2883857038167 nsecs
  .idle_sleeptime : 2760707498759 nsecs
  .last_jiffies   : 645965
  .next_jiffies   : 652750
  .idle_expires   : 2910996000000 nsecs
jiffies: 649967


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     2899924000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     2900012000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 13s! [squid_redirect:3110]

Pid: 3110, comm: squid_redirect Not tainted (2.6.24-rc4 #15)
EIP: 0073:[<00725c40>] EFLAGS: 00000246 CPU: 1
EIP is at 0x725c40
EAX: 0829da00 EBX: 007ca5cc ECX: 080bb420 EDX: 00000000
ESI: 0804c008 EDI: 00000003 EBP: bfd26158 ESP: bfd2613c
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
CR0: 80050033 CR2: b7ea2c84 CR3: 37c06000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 2925319278498 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb5e9c>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 2925436000000 nsecs [in 116721502 nsecs]
 #1: <f6eb5e9c>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 2932364827220 nsecs [in 7045548722 nsecs]
 #2: <f6eb5e9c>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 2938118668932 nsecs [in 12799390434 nsecs]
 #3: <f6eb5e9c>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 2948995090537 nsecs [in 23675812039 nsecs]
 #4: <f6eb5e9c>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3037524193866 nsecs [in 112204915368 nsecs]
 #5: <f6eb5e9c>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 720257430162 nsecs]
  .expires_next   : 2925532000000 nsecs
  .hres_active    : 1
  .nr_events      : 113548
  .nohz_mode      : 2
  .idle_tick      : 2925000000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 656250
  .idle_calls     : 849830
  .idle_sleeps    : 642265
  .idle_entrytime : 2925608012256 nsecs
  .idle_sleeptime : 2700129054859 nsecs
  .last_jiffies   : 656311
  .next_jiffies   : 656321
  .idle_expires   : 2925496000000 nsecs
jiffies: 656311

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb5e9c>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 709225839704 nsecs]
 #1: <f6eb5e9c>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 709313828107 nsecs]
  .expires_next   : 2925241000000 nsecs
  .hres_active    : 1
  .nr_events      : 51474
  .nohz_mode      : 2
  .idle_tick      : 2911001000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 652750
  .idle_calls     : 59951
  .idle_sleeps    : 21649
  .idle_entrytime : 2910997024596 nsecs
  .idle_sleeptime : 2785834078493 nsecs
  .last_jiffies   : 652750
  .next_jiffies   : 660250
  .idle_expires   : 2940996000000 nsecs
jiffies: 656311


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     2925952000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     2926044000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 107s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000202 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000dab ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00bcbed8 EBP: 00bcb12d ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 086e902c CR3: 3720a000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3175981092730 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_stop_sched_tick, swapper/0
 # expires at 3175984000000 nsecs [in 2907270 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3178119140931 nsecs [in 2138048201 nsecs]
 #2: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3182364827430 nsecs [in 6383734700 nsecs]
 #3: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3188995455070 nsecs [in 13014362340 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3337524287270 nsecs [in 161543194540 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 469595615930 nsecs]
  .expires_next   : 3175984000000 nsecs
  .hres_active    : 1
  .nr_events      : 116977
  .nohz_mode      : 2
  .idle_tick      : 3175872000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 718969
  .idle_calls     : 982424
  .idle_sleeps    : 768351
  .idle_entrytime : 3175868036812 nsecs
  .idle_sleeptime : 2942681121008 nsecs
  .last_jiffies   : 718968
  .next_jiffies   : 718995
  .idle_expires   : 3175976000000 nsecs
jiffies: 718996

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 458564025472 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 458652013875 nsecs]
  .expires_next   : 3175981000000 nsecs
  .hres_active    : 1
  .nr_events      : 52243
  .nohz_mode      : 2
  .idle_tick      : 3061841000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 690460
  .idle_calls     : 61325
  .idle_sleeps    : 22249
  .idle_entrytime : 3175980020722 nsecs
  .idle_sleeptime : 3034270703089 nsecs
  .last_jiffies   : 718996
  .next_jiffies   : 732728
  .idle_expires   : 3230908000000 nsecs
jiffies: 718996


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000002


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           3
 next_event:     3175984000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 13s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000da2 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 009232c2 EBP: 00922520 ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 086e902c CR3: 37890000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3315910720388 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3316024000000 nsecs [in 113279612 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3322364826957 nsecs [in 6454106569 nsecs]
 #2: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3337524287270 nsecs [in 21613566882 nsecs]
 #3: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3338995684982 nsecs [in 23084964594 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3358119467677 nsecs [in 42208747289 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 329665988272 nsecs]
  .expires_next   : 3316124000000 nsecs
  .hres_active    : 1
  .nr_events      : 118223
  .nohz_mode      : 2
  .idle_tick      : 3315500000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 753875
  .idle_calls     : 985645
  .idle_sleeps    : 770122
  .idle_entrytime : 3316196013563 nsecs
  .idle_sleeptime : 3079535676376 nsecs
  .last_jiffies   : 754054
  .next_jiffies   : 754085
  .idle_expires   : 3315996000000 nsecs
jiffies: 754060

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 318634397814 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 318722386217 nsecs]
  .expires_next   : 3315813000000 nsecs
  .hres_active    : 1
  .nr_events      : 52555
  .nohz_mode      : 2
  .idle_tick      : 3302001000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 750500
  .idle_calls     : 61676
  .idle_sleeps    : 22325
  .idle_entrytime : 3315812024367 nsecs
  .idle_sleeptime : 3172329298971 nsecs
  .last_jiffies   : 753954
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 754122


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3316544000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3316636000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 17s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 0000000a ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00855ac8 EBP: 00855abe ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 086e902c CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3339110116699 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3339224000000 nsecs [in 113883301 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3342364827041 nsecs [in 3254710342 nsecs]
 #2: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3358119467677 nsecs [in 19009350978 nsecs]
 #3: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3368995727615 nsecs [in 29885610916 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 298414243859 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 306466591961 nsecs]
  .expires_next   : 3339320000000 nsecs
  .hres_active    : 1
  .nr_events      : 118638
  .nohz_mode      : 2
  .idle_tick      : 3339000000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 759750
  .idle_calls     : 986993
  .idle_sleeps    : 770348
  .idle_entrytime : 3339396057527 nsecs
  .idle_sleeptime : 3101499603762 nsecs
  .last_jiffies   : 759854
  .next_jiffies   : 759875
  .idle_expires   : 3339396000000 nsecs
jiffies: 759860

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 295435001503 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 295522989906 nsecs]
  .expires_next   : 3339013000000 nsecs
  .hres_active    : 1
  .nr_events      : 52575
  .nohz_mode      : 2
  .idle_tick      : 3320845000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 755211
  .idle_calls     : 61702
  .idle_sleeps    : 22333
  .idle_entrytime : 3339010504626 nsecs
  .idle_sleeptime : 3194546573260 nsecs
  .last_jiffies   : 759753
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 759921


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3339744000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3339832000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000da7 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 009dd0d4 EBP: 009dc32d ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 086e902c CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3358306003372 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3358420000000 nsecs [in 113996628 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3362364827247 nsecs [in 4058823875 nsecs]
 #2: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3368995727615 nsecs [in 10689724243 nsecs]
 #3: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3418119574321 nsecs [in 59813570949 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 279218357186 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 287270705288 nsecs]
  .expires_next   : 3358516000000 nsecs
  .hres_active    : 1
  .nr_events      : 119037
  .nohz_mode      : 2
  .idle_tick      : 3358132000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 764533
  .idle_calls     : 988268
  .idle_sleeps    : 770593
  .idle_entrytime : 3358592013279 nsecs
  .idle_sleeptime : 3119453550713 nsecs
  .last_jiffies   : 764653
  .next_jiffies   : 764735
  .idle_expires   : 3358288000000 nsecs
jiffies: 764659

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 276239114830 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 276327103233 nsecs]
  .expires_next   : 3358209000000 nsecs
  .hres_active    : 1
  .nr_events      : 52601
  .nohz_mode      : 2
  .idle_tick      : 3343153000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 760788
  .idle_calls     : 61735
  .idle_sleeps    : 22340
  .idle_entrytime : 3358208022533 nsecs
  .idle_sleeptime : 3212747374331 nsecs
  .last_jiffies   : 764553
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 764720


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3358940000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3359028000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 13s! [squid_redirect:3109]

Pid: 3109, comm: squid_redirect Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c05a0ca2>] EFLAGS: 00010246 CPU: 1
EIP is at __alloc_skb+0x5e/0xfb
EAX: 00000000 EBX: f6ebb300 ECX: 00000022 EDX: f7006800
ESI: 00000080 EDI: f6ebb304 EBP: 000004d0 ESP: f6eb1dd8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 086e902c CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c059d70b>] sock_alloc_send_skb+0x6e/0x196
 [<c05fae2d>] unix_stream_sendmsg+0x15c/0x317
 [<c059b0d5>] sock_aio_write+0xe3/0xef
 [<c046ced1>] do_sync_write+0xc7/0x10a
 [<c0430e29>] autoremove_wake_function+0x0/0x35
 [<c0447e34>] audit_syscall_exit+0x2c7/0x2e3
 [<c046d6aa>] vfs_write+0xb5/0x117
 [<c046dc05>] sys_write+0x41/0x67
 [<c0403e26>] syscall_call+0x7/0xb
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3378758554114 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1c84>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3378880000000 nsecs [in 121445886 nsecs]
 #1: <f6eb1c84>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3382364827171 nsecs [in 3606273057 nsecs]
 #2: <f6eb1c84>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3398995770136 nsecs [in 20237216022 nsecs]
 #3: <f6eb1c84>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3418119574321 nsecs [in 39361020207 nsecs]
 #4: <f6eb1c84>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 258765806444 nsecs]
 #5: <f6eb1c84>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 266818154546 nsecs]
  .expires_next   : 3378980000000 nsecs
  .hres_active    : 1
  .nr_events      : 119459
  .nohz_mode      : 2
  .idle_tick      : 3378480000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 769621
  .idle_calls     : 989598
  .idle_sleeps    : 770848
  .idle_entrytime : 3379048011858 nsecs
  .idle_sleeptime : 3138622137644 nsecs
  .last_jiffies   : 769655
  .next_jiffies   : 769735
  .idle_expires   : 3378936000000 nsecs
jiffies: 769655

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1c84>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 255786564088 nsecs]
 #1: <f6eb1c84>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 255874552491 nsecs]
  .expires_next   : 3378617000000 nsecs
  .hres_active    : 1
  .nr_events      : 52630
  .nohz_mode      : 2
  .idle_tick      : 3365009000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 766253
  .idle_calls     : 61772
  .idle_sleeps    : 22349
  .idle_entrytime : 3365008034288 nsecs
  .idle_sleeptime : 3232136157785 nsecs
  .last_jiffies   : 766253
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 769655


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3379392000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3379480000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 13s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 000003d2 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 004d7480 EBP: 004d70ae ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7cb8eb4 CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3399017378968 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3399132000000 nsecs [in 114621032 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3402364827246 nsecs [in 3347448278 nsecs]
 #2: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3418119574321 nsecs [in 19102195353 nsecs]
 #3: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3428995802046 nsecs [in 29978423078 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 238506981590 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 246559329692 nsecs]
  .expires_next   : 3399228000000 nsecs
  .hres_active    : 1
  .nr_events      : 119857
  .nohz_mode      : 2
  .idle_tick      : 3398596000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 774649
  .idle_calls     : 990978
  .idle_sleeps    : 771180
  .idle_entrytime : 3399300011523 nsecs
  .idle_sleeptime : 3157671876745 nsecs
  .last_jiffies   : 774731
  .next_jiffies   : 774741
  .idle_expires   : 3398996000000 nsecs
jiffies: 774731

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 235527739234 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 235615727637 nsecs]
  .expires_next   : 3398921000000 nsecs
  .hres_active    : 1
  .nr_events      : 52664
  .nohz_mode      : 2
  .idle_tick      : 3384993000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 771248
  .idle_calls     : 61816
  .idle_sleeps    : 22362
  .idle_entrytime : 3398920723452 nsecs
  .idle_sleeptime : 3251373016329 nsecs
  .last_jiffies   : 774731
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 774731


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3399644000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3399736000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 21s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 0000026a ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 000dc235 EBP: 000dbfcb ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7cb8eb4 CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3425973316433 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3426088000000 nsecs [in 114683567 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3428995802046 nsecs [in 3022485613 nsecs]
 #2: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3432364826941 nsecs [in 6391510508 nsecs]
 #3: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3478119686256 nsecs [in 52146369823 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 211551044125 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 219603392227 nsecs]
  .expires_next   : 3426184000000 nsecs
  .hres_active    : 1
  .nr_events      : 120297
  .nohz_mode      : 2
  .idle_tick      : 3425876000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 781469
  .idle_calls     : 992318
  .idle_sleeps    : 771457
  .idle_entrytime : 3426256011783 nsecs
  .idle_sleeptime : 3183364803024 nsecs
  .last_jiffies   : 781470
  .next_jiffies   : 781480
  .idle_expires   : 3425996000000 nsecs
jiffies: 781470

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 208571801769 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 208659790172 nsecs]
  .expires_next   : 3425877000000 nsecs
  .hres_active    : 1
  .nr_events      : 52690
  .nohz_mode      : 2
  .idle_tick      : 3404129000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 776032
  .idle_calls     : 61850
  .idle_sleeps    : 22372
  .idle_entrytime : 3425876824416 nsecs
  .idle_sleeptime : 3277331921223 nsecs
  .last_jiffies   : 781470
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 781470


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3426600000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3426692000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 13s! [squid_redirect:3109]

Pid: 3109, comm: squid_redirect Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0406798>] EFLAGS: 00000202 CPU: 1
EIP is at do_syscall_trace+0xfe/0x16a
EAX: f6de15c0 EBX: 00000000 ECX: f6de15c0 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: f6eb1fb8 ESP: f6eb1f94
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: b7cb8eb4 CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c046db9e>] sys_read+0x41/0x67
 [<c0403f26>] syscall_exit_work+0x16/0x1b
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3445397166939 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1e40>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3445516000000 nsecs [in 118833061 nsecs]
 #1: <f6eb1e40>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3452364827341 nsecs [in 6967660402 nsecs]
 #2: <f6eb1e40>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3458995868510 nsecs [in 13598701571 nsecs]
 #3: <f6eb1e40>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3478119686256 nsecs [in 32722519317 nsecs]
 #4: <f6eb1e40>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 192127193619 nsecs]
 #5: <f6eb1e40>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 200179541721 nsecs]
  .expires_next   : 3445616000000 nsecs
  .hres_active    : 1
  .nr_events      : 120695
  .nohz_mode      : 2
  .idle_tick      : 3445000000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 786250
  .idle_calls     : 993728
  .idle_sleeps    : 771801
  .idle_entrytime : 3445688011834 nsecs
  .idle_sleeptime : 3201567185186 nsecs
  .last_jiffies   : 786327
  .next_jiffies   : 786337
  .idle_expires   : 3445496000000 nsecs
jiffies: 786327

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1e40>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 189147951263 nsecs]
 #1: <f6eb1e40>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 189235939666 nsecs]
  .expires_next   : 3445305000000 nsecs
  .hres_active    : 1
  .nr_events      : 52725
  .nohz_mode      : 2
  .idle_tick      : 3432009000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 783003
  .idle_calls     : 61890
  .idle_sleeps    : 22382
  .idle_entrytime : 3432008033864 nsecs
  .idle_sleeptime : 3295603622377 nsecs
  .last_jiffies   : 783003
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 786327


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3446036000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3446124000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 11s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000da6 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00804711 EBP: 0080396b ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7cb8eb4 CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3470253369529 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3470368000000 nsecs [in 114630471 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3472364827179 nsecs [in 2111457650 nsecs]
 #2: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3478119686256 nsecs [in 7866316727 nsecs]
 #3: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3488995915727 nsecs [in 18742546198 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 167270991029 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 175323339131 nsecs]
  .expires_next   : 3470464000000 nsecs
  .hres_active    : 1
  .nr_events      : 121145
  .nohz_mode      : 2
  .idle_tick      : 3470000000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 792500
  .idle_calls     : 995079
  .idle_sleeps    : 772053
  .idle_entrytime : 3470532015086 nsecs
  .idle_sleeptime : 3225097811941 nsecs
  .last_jiffies   : 792639
  .next_jiffies   : 792643
  .idle_expires   : 3470496000000 nsecs
jiffies: 792645

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 164291748673 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 164379737076 nsecs]
  .expires_next   : 3470157000000 nsecs
  .hres_active    : 1
  .nr_events      : 52762
  .nohz_mode      : 2
  .idle_tick      : 3458613000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 789653
  .idle_calls     : 61939
  .idle_sleeps    : 22394
  .idle_entrytime : 3470156022499 nsecs
  .idle_sleeptime : 3319306712683 nsecs
  .last_jiffies   : 792540
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 792706


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3470880000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3470968000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 13s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0603eaa>] EFLAGS: 00000202 CPU: 1
EIP is at _spin_lock_irqsave+0x16/0x27
EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7cb8eb4 CR3: 37dcc000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c0438293>] tick_broadcast_oneshot_control+0x10/0xda
 [<c0437ce2>] tick_notify+0x1d4/0x2eb
 [<c04281bc>] get_next_timer_interrupt+0x143/0x1b4
 [<c06058a1>] notifier_call_chain+0x2a/0x47
 [<c04345c0>] raw_notifier_call_chain+0x17/0x1a
 [<c043781e>] clockevents_notify+0x19/0x4f
 [<c0533d23>] acpi_idle_enter_simple+0x183/0x1d0
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3489148567147 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829dbc>, tick_sched_timer, S:01, tick_nohz_stop_sched_tick, swapper/0
 # expires at 3489004000000 nsecs [in 18446744073564984469 nsecs]
 #1: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3492364827517 nsecs [in 3216260370 nsecs]
 #2: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3538119802465 nsecs [in 48971235318 nsecs]
 #3: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 148375793411 nsecs]
 #4: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 156428141513 nsecs]
  .expires_next   : 3489004000000 nsecs
  .hres_active    : 1
  .nr_events      : 121552
  .nohz_mode      : 2
  .idle_tick      : 3488500000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 797126
  .idle_calls     : 996547
  .idle_sleeps    : 772606
  .idle_entrytime : 3489000016669 nsecs
  .idle_sleeptime : 3242430478734 nsecs
  .last_jiffies   : 797251
  .next_jiffies   : 797375
  .idle_expires   : 3488996000000 nsecs
jiffies: 797251

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829dbc>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3518995976435 nsecs [in 29847409288 nsecs]
 #1: <f7829dbc>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 145396551055 nsecs]
 #2: <f7829dbc>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 145484539458 nsecs]
  .expires_next   : 3488997000000 nsecs
  .hres_active    : 1
  .nr_events      : 52842
  .nohz_mode      : 2
  .idle_tick      : 3474001000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 793501
  .idle_calls     : 62051
  .idle_sleeps    : 22430
  .idle_entrytime : 3488995988657 nsecs
  .idle_sleeptime : 3336928804329 nsecs
  .last_jiffies   : 797249
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 797251


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3489004000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 18s! [squid_redirect:3109]

Pid: 3109, comm: squid_redirect Not tainted (2.6.24-rc4 #15)
EIP: 0073:[<00778e26>] EFLAGS: 00000286 CPU: 1
EIP is at 0x778e26
EAX: 00000000 EBX: 007ca5cc ECX: 0819993b EDX: 00000000
ESI: 082eafb0 EDI: 08199963 EBP: bfcc2d18 ESP: bfcc2ca0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
CR0: 80050033 CR2: b7cb8eb4 CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3509959588108 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1e9c>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3510076000000 nsecs [in 116411892 nsecs]
 #1: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3512364827593 nsecs [in 2405239485 nsecs]
 #2: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3538119802465 nsecs [in 28160214357 nsecs]
 #3: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 127564772450 nsecs]
 #4: <f6eb1e9c>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 135617120552 nsecs]
  .expires_next   : 3510156000000 nsecs
  .hres_active    : 1
  .nr_events      : 121736
  .nohz_mode      : 2
  .idle_tick      : 3509500000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 802375
  .idle_calls     : 997613
  .idle_sleeps    : 773233
  .idle_entrytime : 3510228013028 nsecs
  .idle_sleeptime : 3262173789757 nsecs
  .last_jiffies   : 802562
  .next_jiffies   : 802590
  .idle_expires   : 3509976000000 nsecs
jiffies: 802568

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f6eb1e9c>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3518995976435 nsecs [in 9036388327 nsecs]
 #1: <f6eb1e9c>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 124585530094 nsecs]
 #2: <f6eb1e9c>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 124673518497 nsecs]
  .expires_next   : 3509881000000 nsecs
  .hres_active    : 1
  .nr_events      : 52891
  .nohz_mode      : 2
  .idle_tick      : 3490217000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 797554
  .idle_calls     : 62239
  .idle_sleeps    : 22472
  .idle_entrytime : 3509880022423 nsecs
  .idle_sleeptime : 3356666831101 nsecs
  .last_jiffies   : 802471
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 802632


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3510588000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3510680000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 73s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000009 ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00cd838d EBP: 00cd8384 ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7cb8eb4 CR3: 3721b000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 3598217317189 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 3598332000000 nsecs [in 114682811 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 3602364827289 nsecs [in 4147510100 nsecs]
 #2: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 3608996157924 nsecs [in 10778840735 nsecs]
 #3: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 3637524360558 nsecs [in 39307043369 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 3645576708660 nsecs [in 47359391471 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 3658120033497 nsecs [in 59902716308 nsecs]
  .expires_next   : 3598428000000 nsecs
  .hres_active    : 1
  .nr_events      : 122506
  .nohz_mode      : 2
  .idle_tick      : 3598000000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 824500
  .idle_calls     : 999222
  .idle_sleeps    : 773694
  .idle_entrytime : 3598504013419 nsecs
  .idle_sleeptime : 3348824164217 nsecs
  .last_jiffies   : 824631
  .next_jiffies   : 824750
  .idle_expires   : 3598320000000 nsecs
jiffies: 824636

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 3634545118202 nsecs [in 36327801013 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 3634633106605 nsecs [in 36415789416 nsecs]
  .expires_next   : 3598121000000 nsecs
  .hres_active    : 1
  .nr_events      : 52922
  .nohz_mode      : 2
  .idle_tick      : 3519005000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 804751
  .idle_calls     : 62274
  .idle_sleeps    : 22480
  .idle_entrytime : 3598120023719 nsecs
  .idle_sleeptime : 3443870779570 nsecs
  .last_jiffies   : 824531
  .next_jiffies   : 825443
  .idle_expires   : 3601768000000 nsecs
jiffies: 824698


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     3598848000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     3598940000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 16s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0603eaa>] EFLAGS: 00000202 CPU: 1
EIP is at _spin_lock_irqsave+0x16/0x27
EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: bfef28a8 CR3: 37370000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c0438293>] tick_broadcast_oneshot_control+0x10/0xda
 [<c0437ce2>] tick_notify+0x1d4/0x2eb
 [<c04281bc>] get_next_timer_interrupt+0x143/0x1b4
 [<c06058a1>] notifier_call_chain+0x2a/0x47
 [<c04345c0>] raw_notifier_call_chain+0x17/0x1a
 [<c043781e>] clockevents_notify+0x19/0x4f
 [<c0533d23>] acpi_idle_enter_simple+0x183/0x1d0
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 4167069767337 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829dbc>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 4166924000000 nsecs [in 18446744073563784279 nsecs]
 #1: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 4172364827191 nsecs [in 5295059854 nsecs]
 #2: <f7829dbc>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 4178996866281 nsecs [in 11927098944 nsecs]
 #3: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 4198120966920 nsecs [in 31051199583 nsecs]
 #4: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 4237524532033 nsecs [in 70454764696 nsecs]
 #5: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 5445738202858 nsecs [in 1278668435521 nsecs]
  .expires_next   : 4166924000000 nsecs
  .hres_active    : 1
  .nr_events      : 125855
  .nohz_mode      : 2
  .idle_tick      : 4166872000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 966718
  .idle_calls     : 1003972
  .idle_sleeps    : 777081
  .idle_entrytime : 4166921107361 nsecs
  .idle_sleeptime : 3914047603252 nsecs
  .last_jiffies   : 966731
  .next_jiffies   : 966740
  .idle_expires   : 4166996000000 nsecs
jiffies: 966731

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829dbc>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 7234546066766 nsecs [in 3067476299429 nsecs]
 #1: <f7829dbc>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 7234633977840 nsecs [in 3067564210503 nsecs]
  .expires_next   : 4166921000000 nsecs
  .hres_active    : 1
  .nr_events      : 53499
  .nohz_mode      : 2
  .idle_tick      : 4150253000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 962563
  .idle_calls     : 62865
  .idle_sleeps    : 22602
  .idle_entrytime : 4166918657364 nsecs
  .idle_sleeptime : 4009849868806 nsecs
  .last_jiffies   : 966730
  .next_jiffies   : 970000
  .idle_expires   : 4179996000000 nsecs
jiffies: 966731


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     4166924000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 20s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0533d06>] EFLAGS: 00000202 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 0000000a ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 0053a918 EBP: 0053a90e ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 080ff4a8 CR3: 36f0e000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 4261546668428 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 4261660000000 nsecs [in 113331572 nsecs]
 #1: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 4262364827121 nsecs [in 818158693 nsecs]
 #2: <f7829e34>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 4268996983467 nsecs [in 7450315039 nsecs]
 #3: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 4318121181735 nsecs [in 56574513307 nsecs]
 #4: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 4537524611720 nsecs [in 275977943292 nsecs]
 #5: <f7829e34>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 5445738202858 nsecs [in 1184191534430 nsecs]
  .expires_next   : 4261756000000 nsecs
  .hres_active    : 1
  .nr_events      : 126606
  .nohz_mode      : 2
  .idle_tick      : 4261356000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 990339
  .idle_calls     : 1007304
  .idle_sleeps    : 779634
  .idle_entrytime : 4261828014556 nsecs
  .idle_sleeptime : 4006637937696 nsecs
  .last_jiffies   : 990462
  .next_jiffies   : 990500
  .idle_expires   : 4261496000000 nsecs
jiffies: 990468

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 7234546066766 nsecs [in 2972999398338 nsecs]
 #1: <f7829e34>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 7234633977840 nsecs [in 2973087309412 nsecs]
  .expires_next   : 4261449000000 nsecs
  .hres_active    : 1
  .nr_events      : 53573
  .nohz_mode      : 2
  .idle_tick      : 4240017000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 985004
  .idle_calls     : 62997
  .idle_sleeps    : 22668
  .idle_entrytime : 4261446283566 nsecs
  .idle_sleeptime : 4103195796783 nsecs
  .last_jiffies   : 990362
  .next_jiffies   : 992500
  .idle_expires   : 4269996000000 nsecs
jiffies: 990529


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     4262172000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     4262264000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 11s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #15)
EIP: 0060:[<c0603eaa>] EFLAGS: 00000202 CPU: 1
EIP is at _spin_lock_irqsave+0x16/0x27
EAX: c06b4110 EBX: 00000001 ECX: f7873808 EDX: 00000293
ESI: 00000005 EDI: f7873808 EBP: 00000000 ESP: f7829f10
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 080ff4a8 CR3: 37344000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c0438293>] tick_broadcast_oneshot_control+0x10/0xda
 [<c0437ce2>] tick_notify+0x1d4/0x2eb
 [<c04281bc>] get_next_timer_interrupt+0x143/0x1b4
 [<c06058a1>] notifier_call_chain+0x2a/0x47
 [<c04345c0>] raw_notifier_call_chain+0x17/0x1a
 [<c043781e>] clockevents_notify+0x19/0x4f
 [<c0533d23>] acpi_idle_enter_simple+0x183/0x1d0
 [<c058cf03>] cpuidle_idle_call+0x53/0x78
 [<c058ceb0>] cpuidle_idle_call+0x0/0x78
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 4281836818905 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829dbc>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 4281692000000 nsecs [in 18446744073564732711 nsecs]
 #1: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, irqbalance/2663
 # expires at 4282364827815 nsecs [in 528008910 nsecs]
 #2: <f7829dbc>, it_real_fn, S:01, do_setitimer, syslogd/2643
 # expires at 4298997022577 nsecs [in 17160203672 nsecs]
 #3: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, crond/3083
 # expires at 4318121181735 nsecs [in 36284362830 nsecs]
 #4: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, atd/3184
 # expires at 4537524611720 nsecs [in 255687792815 nsecs]
 #5: <f7829dbc>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3313
 # expires at 5445738202858 nsecs [in 1163901383953 nsecs]
  .expires_next   : 4281692000000 nsecs
  .hres_active    : 1
  .nr_events      : 126862
  .nohz_mode      : 2
  .idle_tick      : 4281500000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 995375
  .idle_calls     : 1007974
  .idle_sleeps    : 779749
  .idle_entrytime : 4281689136364 nsecs
  .idle_sleeptime : 4025848101273 nsecs
  .last_jiffies   : 995423
  .next_jiffies   : 995438
  .idle_expires   : 4281996000000 nsecs
jiffies: 995423

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197343983635176004 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829dbc>, it_real_fn, S:01, do_setitimer, sendmail/3036
 # expires at 7234546066766 nsecs [in 2952709247861 nsecs]
 #1: <f7829dbc>, it_real_fn, S:01, do_setitimer, sendmail/3047
 # expires at 7234633977840 nsecs [in 2952797158935 nsecs]
  .expires_next   : 4281689000000 nsecs
  .hres_active    : 1
  .nr_events      : 53582
  .nohz_mode      : 2
  .idle_tick      : 4270017000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 992504
  .idle_calls     : 63006
  .idle_sleeps    : 22670
  .idle_entrytime : 4281688022031 nsecs
  .idle_sleeptime : 4122505836637 nsecs
  .last_jiffies   : 995423
  .next_jiffies   : 1000000
  .idle_expires   : 4299996000000 nsecs
jiffies: 995423


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     4281692000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805384969
 min_delta_ns:   1440
 mult:           44734871
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-11  4:49                                   ` Parag Warudkar
@ 2007-12-14 23:17                                     ` Len Brown
  2007-12-15  0:51                                       ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Len Brown @ 2007-12-14 23:17 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Thomas Gleixner, Arjan van de Ven, Ingo Molnar, Pallipadi,
	Venkatesh, LKML, Andrew Morton, Linus Torvalds

does processor.max_cstate=1 make the failing configuration work?
If yes, how about processor.max_cstate=2?

what do you see in /proc/acpi/processor/*/power?

-Len

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-14 23:17                                     ` Len Brown
@ 2007-12-15  0:51                                       ` Parag Warudkar
  2007-12-15  8:10                                         ` Thomas Gleixner
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-15  0:51 UTC (permalink / raw)
  To: Len Brown
  Cc: Thomas Gleixner, Arjan van de Ven, Ingo Molnar, Pallipadi,
	Venkatesh, LKML, Andrew Morton, Linus Torvalds

On Dec 14, 2007 6:17 PM, Len Brown <lenb@kernel.org> wrote:
> does processor.max_cstate=1 make the failing configuration work?
> If yes, how about processor.max_cstate=2?

Until now 2 things were necessary to reproduce the problem -
1) CPU_IDLE=y and
2) Wakeups from Idle = 5-7 Per second (== Longer/deeper C state residency)

If I left the wakeups to high number (50-60) - there were no lockups
but it was very jerky over ssh.
(Typing keys had no effect for seconds etc.)

CPU_IDLE=y

With max_cstate=1 and CPU_IDLE=y things are pretty smooth - no lockups
for the last hour. (Soft lockups used to appear in minutes
previously.)

With max_cstate=2 - old story repeats - it's very jerky and soft
lockups appear in under a minute after going to 3-5 wakeups from idle
per sec.

>
> what do you see in /proc/acpi/processor/*/power?
>

Normally (without any max_cstate= parameter) I see this -
(I admit I never thought there was a C8 - C3 was all I ever heard.)

[parag@mini router]$ cat /proc/acpi/processor/CPU0/power
active state:            C2
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 8000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--]
latency[001] usage[00001010] duration[00000000000000000000]
   *C2:                  type[C2] promotion[--] demotion[C1]
latency[001] usage[07181700] duration[00000000379397304978]


Thanks
Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-15  0:51                                       ` Parag Warudkar
@ 2007-12-15  8:10                                         ` Thomas Gleixner
  2007-12-15 18:48                                           ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2007-12-15  8:10 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Fri, 14 Dec 2007, Parag Warudkar wrote:

> On Dec 14, 2007 6:17 PM, Len Brown <lenb@kernel.org> wrote:
> > does processor.max_cstate=1 make the failing configuration work?
> > If yes, how about processor.max_cstate=2?
> 
> Until now 2 things were necessary to reproduce the problem -
> 1) CPU_IDLE=y and
> 2) Wakeups from Idle = 5-7 Per second (== Longer/deeper C state residency)
> 
> If I left the wakeups to high number (50-60) - there were no lockups
> but it was very jerky over ssh.
> (Typing keys had no effect for seconds etc.)
> 
> CPU_IDLE=y
> 
> With max_cstate=1 and CPU_IDLE=y things are pretty smooth - no lockups
> for the last hour. (Soft lockups used to appear in minutes
> previously.)
> 
> With max_cstate=2 - old story repeats - it's very jerky and soft
> lockups appear in under a minute after going to 3-5 wakeups from idle
> per sec.

I have a patch staged for Linus, which fixes a thinko in the broadcast
code. It might be related to your problem. Can you give it a try ?

Thanks,

	tglx

------>

Subject: clockevents: fix reprogramming decision in oneshot broadcast
From: Thomas Gleixner <tglx@linutronix.de>

A previous version of the code did the reprogramming of the broadcast
device in the return from idle code. This was removed, but the logic in
tick_handle_oneshot_broadcast() was kept the same.

When a broadcast interrupt happens we signal the expiry to all CPUs
which have an expired event. If none of the CPUs has an expired event,
which can happen in dyntick mode, then we reprogram the broadcast
device. We do not reprogram otherwise, but this is only correct if all
CPUs, which are in the idle broadcast state have been woken up.

The code ignores, that there might be pending not yet expired events on
other CPUs, which are in the idle broadcast state. So the delivery of
those events can be delayed for quite a time.

Change the tick_handle_oneshot_broadcast() function to check for CPUs,
which are in broadcast state and are not woken up by the current event,
and enforce the rearming of the broadcast device for those CPUs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/time/tick-broadcast.c |   56 ++++++++++++++++---------------------------
 1 file changed, 21 insertions(+), 35 deletions(-)

Index: linux-x86.q/kernel/time/tick-broadcast.c
===================================================================
--- linux-x86.q.orig/kernel/time/tick-broadcast.c
+++ linux-x86.q/kernel/time/tick-broadcast.c
@@ -384,45 +384,19 @@ int tick_resume_broadcast_oneshot(struct
 }
 
 /*
- * Reprogram the broadcast device:
- *
- * Called with tick_broadcast_lock held and interrupts disabled.
- */
-static int tick_broadcast_reprogram(void)
-{
-	ktime_t expires = { .tv64 = KTIME_MAX };
-	struct tick_device *td;
-	int cpu;
-
-	/*
-	 * Find the event which expires next:
-	 */
-	for (cpu = first_cpu(tick_broadcast_oneshot_mask); cpu != NR_CPUS;
-	     cpu = next_cpu(cpu, tick_broadcast_oneshot_mask)) {
-		td = &per_cpu(tick_cpu_device, cpu);
-		if (td->evtdev->next_event.tv64 < expires.tv64)
-			expires = td->evtdev->next_event;
-	}
-
-	if (expires.tv64 == KTIME_MAX)
-		return 0;
-
-	return tick_broadcast_set_event(expires, 0);
-}
-
-/*
  * Handle oneshot mode broadcasting
  */
 static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
 {
 	struct tick_device *td;
 	cpumask_t mask;
-	ktime_t now;
+	ktime_t now, next_event;
 	int cpu;
 
 	spin_lock(&tick_broadcast_lock);
 again:
 	dev->next_event.tv64 = KTIME_MAX;
+	next_event.tv64 = KTIME_MAX;
 	mask = CPU_MASK_NONE;
 	now = ktime_get();
 	/* Find all expired events */
@@ -431,19 +405,31 @@ again:
 		td = &per_cpu(tick_cpu_device, cpu);
 		if (td->evtdev->next_event.tv64 <= now.tv64)
 			cpu_set(cpu, mask);
+		else if (td->evtdev->next_event.tv64 < next_event.tv64)
+			next_event.tv64 = td->evtdev->next_event.tv64;
 	}
 
 	/*
-	 * Wakeup the cpus which have an expired event. The broadcast
-	 * device is reprogrammed in the return from idle code.
+	 * Wakeup the cpus which have an expired event.
+	 */
+	tick_do_broadcast(mask);
+
+	/*
+	 * Two reasons for reprogram:
+	 *
+	 * - The global event did not expire any CPU local
+	 * events. This happens in dyntick mode, as the maximum PIT
+	 * delta is quite small.
+	 *
+	 * - There are pending events on sleeping CPUs which were not
+	 * in the event mask
 	 */
-	if (!tick_do_broadcast(mask)) {
+	if (next_event.tv64 != KTIME_MAX) {
 		/*
-		 * The global event did not expire any CPU local
-		 * events. This happens in dyntick mode, as the
-		 * maximum PIT delta is quite small.
+		 * Rearm the broadcast device. If event expired,
+		 * repeat the above
 		 */
-		if (tick_broadcast_reprogram())
+		if (tick_broadcast_set_event(next_event, 0))
 			goto again;
 	}
 	spin_unlock(&tick_broadcast_lock);

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-15  8:10                                         ` Thomas Gleixner
@ 2007-12-15 18:48                                           ` Parag Warudkar
  2007-12-16  5:15                                             ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-15 18:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Parag Warudkar, Len Brown, Arjan van de Ven, Ingo Molnar,
	Pallipadi, Venkatesh, LKML, Andrew Morton, Linus Torvalds

On Sat, 15 Dec 2007, Thomas Gleixner wrote:

>I have a patch staged for Linus, which fixes a thinko in the broadcast
>code. It might be related to your problem. Can you give it a try ?

Yep - this looks promising. No soft lockups for last half an hour 
with 4-5 Wakeups from idle. It is almost smooth - jerkiness gone down by 
99%. There are still very infrequent and quick spells where the key typed 
on the keyboard takes effect a little later - but nothing like earlier.

I will run it for a little longer just to be sure - but I don't think it 
will be a problem.

Thanks Thomas.

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-15 18:48                                           ` Parag Warudkar
@ 2007-12-16  5:15                                             ` Parag Warudkar
  2007-12-16 13:43                                               ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-16  5:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Sat, 15 Dec 2007, Parag Warudkar wrote:

>> I will run it for a little longer just to be sure - but I don't think it 
>> will be a problem.

No problems for last 10 hours - I consider this fixed.

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-16  5:15                                             ` Parag Warudkar
@ 2007-12-16 13:43                                               ` Parag Warudkar
  2007-12-17  8:05                                                 ` Thomas Gleixner
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-16 13:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Dec 16, 2007 12:15 AM, Parag Warudkar <parag.warudkar@gmail.com> wrote:
> On Sat, 15 Dec 2007, Parag Warudkar wrote:
>
> >> I will run it for a little longer just to be sure - but I don't think it
> >> will be a problem.
>
> No problems for last 10 hours - I consider this fixed.
>

Arghh - spoke 8 hours too soon. I left it running overnight and
morning I see a bunch of softlockups - so NO NOT FIXED.

Parag

BUG: soft lockup - CPU#1 stuck for 13s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc5 #21)
EIP: 0060:[<c0533ca6>] EFLAGS: 00000206 CPU: 1
EIP is at acpi_idle_enter_simple+0x166/0x1d0
EAX: f7829f88 EBX: 00000dab ECX: 00000266 EDX: 00000000
ESI: 00000000 EDI: 00c056e5 EBP: 00c0493a ESP: f7829f88
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 080f6c78 CR3: 00718000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-16 13:43                                               ` Parag Warudkar
@ 2007-12-17  8:05                                                 ` Thomas Gleixner
  2007-12-17 13:08                                                   ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2007-12-17  8:05 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Sun, 16 Dec 2007, Parag Warudkar wrote:

> On Dec 16, 2007 12:15 AM, Parag Warudkar <parag.warudkar@gmail.com> wrote:
> > On Sat, 15 Dec 2007, Parag Warudkar wrote:
> >
> > >> I will run it for a little longer just to be sure - but I don't think it
> > >> will be a problem.
> >
> > No problems for last 10 hours - I consider this fixed.
> >
> 
> Arghh - spoke 8 hours too soon. I left it running overnight and
> morning I see a bunch of softlockups - so NO NOT FIXED.
> 
> Parag
> 
> BUG: soft lockup - CPU#1 stuck for 13s! [swapper:0]
> 
> Pid: 0, comm: swapper Not tainted (2.6.24-rc5 #21)
> EIP: 0060:[<c0533ca6>] EFLAGS: 00000206 CPU: 1
> EIP is at acpi_idle_enter_simple+0x166/0x1d0
> EAX: f7829f88 EBX: 00000dab ECX: 00000266 EDX: 00000000
> ESI: 00000000 EDI: 00c056e5 EBP: 00c0493a ESP: f7829f88
>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: 080f6c78 CR3: 00718000 CR4: 000006d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400

Sigh. You did not have the debug patch applied anymore, which printks
the timer_list data ? Can you apply it again and provide the output
please ?

Thanks,

	tglx


 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-17  8:05                                                 ` Thomas Gleixner
@ 2007-12-17 13:08                                                   ` Parag Warudkar
  2007-12-17 22:05                                                     ` Thomas Gleixner
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2007-12-17 13:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Dec 17, 2007 3:05 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sun, 16 Dec 2007, Parag Warudkar wrote:
>
> > On Dec 16, 2007 12:15 AM, Parag Warudkar <parag.warudkar@gmail.com> wrote:
> > > On Sat, 15 Dec 2007, Parag Warudkar wrote:
> > >
> > > >> I will run it for a little longer just to be sure - but I don't think it
> > > >> will be a problem.
> > >
> > > No problems for last 10 hours - I consider this fixed.
> > >
> >
> > Arghh - spoke 8 hours too soon. I left it running overnight and
> > morning I see a bunch of softlockups - so NO NOT FIXED.
> >
> > Parag
> >
> > BUG: soft lockup - CPU#1 stuck for 13s! [swapper:0]
> >
> > Pid: 0, comm: swapper Not tainted (2.6.24-rc5 #21)
> > EIP: 0060:[<c0533ca6>] EFLAGS: 00000206 CPU: 1
> > EIP is at acpi_idle_enter_simple+0x166/0x1d0
> > EAX: f7829f88 EBX: 00000dab ECX: 00000266 EDX: 00000000
> > ESI: 00000000 EDI: 00c056e5 EBP: 00c0493a ESP: f7829f88
> >  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > CR0: 8005003b CR2: 080f6c78 CR3: 00718000 CR4: 000006d0
> > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > DR6: ffff0ff0 DR7: 00000400
>
> Sigh. You did not have the debug patch applied anymore, which printks
> the timer_list data ? Can you apply it again and provide the output
> please ?
>

This keeps getting more and more weird - This time I was running with
CONFIG_CPU_IDLE=N and I have ton of soft lockups after 14hr uptime.

The timer_list data for a few lockups is below. If you need the full
thing grab it from http://warudkars.net/messages.bz2 .

BUG: soft lockup - CPU#1 stuck for 12s! [kjournald:424]

Pid: 424, comm: kjournald Not tainted (2.6.24-rc5 #24)
EIP: 0060:[<c04bbd57>] EFLAGS: 00000206 CPU: 1
EIP is at __journal_remove_journal_head+0x18/0xe8
EAX: dc75e1c0 EBX: dc75e1c0 ECX: f6d539f0 EDX: dc75e1f8
ESI: dc75e1f8 EDI: 00000009 EBP: f712a000 ESP: f7157f28
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7d18eb4 CR3: 37352000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c04b8680>] __journal_unfile_buffer+0x8/0x11
 [<c04bbe64>] journal_put_journal_head+0x3d/0x4b
 [<c04ba254>] journal_commit_transaction+0x75d/0xbc9
 [<c0428246>] lock_timer_base+0x19/0x35
 [<c04bcda0>] kjournald+0xa2/0x1d0
 [<c0430e29>] autoremove_wake_function+0x0/0x35
 [<c04bccfe>] kjournald+0x0/0x1d0
 [<c0430d63>] kthread+0x38/0x5d
 [<c0430d2b>] kthread+0x0/0x5d
 [<c0404a0f>] kernel_thread_helper+0x7/0x10
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 39080147128907 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197846065352886518 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7157dd4>, tick_sched_timer, S:01, tick_nohz_stop_sched_tick, swapper/0
 # expires at 39080272000000 nsecs [in 124871093 nsecs]
 #1: <f7157dd4>, it_real_fn, S:01, do_setitimer, syslogd/2522
 # expires at 39086475543333 nsecs [in 6328414426 nsecs]
 #2: <f7157dd4>, hrtimer_wakeup, S:01, do_nanosleep, atd/3039
 # expires at 39332442305024 nsecs [in 252295176117 nsecs]
  .expires_next   : 39080320000000 nsecs
  .hres_active    : 1
  .nr_events      : 81214
  .nohz_mode      : 2
  .idle_tick      : 39078000000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 9694501
  .idle_calls     : 171123
  .idle_sleeps    : 106664
  .idle_entrytime : 39080384013047 nsecs
  .idle_sleeptime : 38923743508612 nsecs
  .last_jiffies   : 9695101
  .next_jiffies   : 9695250
  .idle_expires   : 39079996000000 nsecs
jiffies: 9695106

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197846065352886518 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7157dd4>, hrtimer_wakeup, S:01, do_nanosleep, crond/2946
 # expires at 39115730211364 nsecs [in 35583082457 nsecs]
 #1: <f7157dd4>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3480
 # expires at 39641838017337 nsecs [in 561690888430 nsecs]
  .expires_next   : 39079997000000 nsecs
  .hres_active    : 1
  .nr_events      : 48789
  .nohz_mode      : 2
  .idle_tick      : 39066865000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 9691717
  .idle_calls     : 125065
  .idle_sleeps    : 79815
  .idle_entrytime : 39079996603653 nsecs
  .idle_sleeptime : 38923914717954 nsecs
  .last_jiffies   : 9695000
  .next_jiffies   : 9695467
  .idle_expires   : 39081864000000 nsecs
jiffies: 9695162


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     39080704000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805378776
 min_delta_ns:   1440
 mult:           44735215
 shift:          32
 mode:           1
 next_event:     39080796000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805378776
 min_delta_ns:   1440
 mult:           44735215
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 12s! [kjournald:424]

Pid: 424, comm: kjournald Not tainted (2.6.24-rc5 #24)
EIP: 0060:[<f8bb3e78>] EFLAGS: 00000202 CPU: 1
EIP is at scsi_request_fn+0x265/0x326 [scsi_mod]
EAX: f7bb8000 EBX: f7bb8000 ECX: f70fe028 EDX: f70fe028
ESI: f70fe000 EDI: f7105400 EBP: f70de408 ESP: f7157e18
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7d18eb4 CR3: 37d98000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<c04e4ca4>] blk_remove_plug+0x4f/0x5b
 [<c04e4ccd>] __generic_unplug_device+0x1d/0x1f
 [<c04e58e8>] generic_unplug_device+0x15/0x21
 [<c04e3919>] blk_unplug+0x72/0x7a
 [<f8b94813>] dm_table_unplug_all+0x1e/0x27 [dm_mod]
 [<f8b92c63>] dm_unplug_all+0x17/0x21 [dm_mod]
 [<c04e3919>] blk_unplug+0x72/0x7a
 [<c0488fe1>] sync_buffer+0x2b/0x33
 [<c0602e5a>] __wait_on_bit+0x33/0x58
 [<c0488fb6>] sync_buffer+0x0/0x33
 [<c0488fb6>] sync_buffer+0x0/0x33
 [<c0602ee2>] out_of_line_wait_on_bit+0x63/0x6b
 [<c0430e5e>] wake_bit_function+0x0/0x3c
 [<c0488f80>] __wait_on_buffer+0x24/0x27
 [<c0489d72>] sync_dirty_buffer+0x7f/0xb4
 [<c04bd21f>] journal_get_descriptor_buffer+0x79/0x7f
 [<c04ba3d3>] journal_commit_transaction+0x8dc/0xbc9
 [<c0428246>] lock_timer_base+0x19/0x35
 [<c04bcda0>] kjournald+0xa2/0x1d0
 [<c0430e29>] autoremove_wake_function+0x0/0x35
 [<c04bccfe>] kjournald+0x0/0x1d0
 [<c0430d63>] kthread+0x38/0x5d
 [<c0430d2b>] kthread+0x0/0x5d
 [<c0404a0f>] kernel_thread_helper+0x7/0x10
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 40490254040892 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197846065352886518 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7157cc4>, tick_sched_timer, S:01, tick_nohz_stop_sched_tick, swapper/0
 # expires at 40490384000000 nsecs [in 129959108 nsecs]
 #1: <f7157cc4>, it_real_fn, S:01, do_setitimer, syslogd/2522
 # expires at 40496478002707 nsecs [in 6223961815 nsecs]
 #2: <f7157cc4>, hrtimer_wakeup, S:01, do_nanosleep, atd/3039
 # expires at 40532442653957 nsecs [in 42188613065 nsecs]
  .expires_next   : 40490428000000 nsecs
  .hres_active    : 1
  .nr_events      : 83410
  .nohz_mode      : 2
  .idle_tick      : 40489800000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 10047451
  .idle_calls     : 174800
  .idle_sleeps    : 109338
  .idle_entrytime : 40490492012833 nsecs
  .idle_sleeptime : 40331022857511 nsecs
  .last_jiffies   : 10047628
  .next_jiffies   : 10048000
  .idle_expires   : 40489996000000 nsecs
jiffies: 10047633

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197846065352886518 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7157cc4>, hrtimer_wakeup, S:01, do_nanosleep, crond/2946
 # expires at 40495766729200 nsecs [in 5512688308 nsecs]
 #1: <f7157cc4>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3480
 # expires at 41441988869277 nsecs [in 951734828385 nsecs]
  .expires_next   : 40489997000000 nsecs
  .hres_active    : 1
  .nr_events      : 49576
  .nohz_mode      : 2
  .idle_tick      : 40476865000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 10044217
  .idle_calls     : 127015
  .idle_sleeps    : 81254
  .idle_entrytime : 40489996578090 nsecs
  .idle_sleeptime : 40331714407873 nsecs
  .last_jiffies   : 10047500
  .next_jiffies   : 10047967
  .idle_expires   : 40491864000000 nsecs
jiffies: 10047688


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     40490812000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000001


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805378776
 min_delta_ns:   1440
 mult:           44735215
 shift:          32
 mode:           1
 next_event:     40490904000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805378776
 min_delta_ns:   1440
 mult:           44735215
 shift:          32
 mode:           3
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

BUG: soft lockup - CPU#1 stuck for 12s! [swapper:0]

Pid: 0, comm: swapper Not tainted (2.6.24-rc5 #24)
EIP: 0060:[<c0603b70>] EFLAGS: 00000292 CPU: 1
EIP is at _spin_unlock_irqrestore+0x5/0x6
EAX: f7bdd24c EBX: f7110000 ECX: f70cc000 EDX: 00000292
ESI: f70cc07c EDI: 00000050 EBP: 00000064 ESP: f7829ee8
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b7d18eb4 CR3: 37352000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [<f8bd96ca>] ata_interrupt+0x1a8/0x1ba [libata]
 [<c044aa0a>] handle_IRQ_event+0x1a/0x3f
 [<c044bc42>] handle_fasteoi_irq+0x71/0xa4
 [<c0405e8a>] do_IRQ+0x79/0x93
 [<c04047cf>] common_interrupt+0x23/0x28
 [<c041007b>] acpi_map_lsapic+0x2d/0xda
 [<c0533a75>] acpi_processor_idle+0x293/0x43e
 [<c05337e2>] acpi_processor_idle+0x0/0x43e
 [<c05337e2>] acpi_processor_idle+0x0/0x43e
 [<c0402575>] cpu_idle+0x97/0xb8
 =======================
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 40700144217096 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197846065352886518 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829d94>, tick_sched_timer, S:01, tick_nohz_stop_sched_tick, swapper/0
 # expires at 40700264000000 nsecs [in 119782904 nsecs]
 #1: <f7829d94>, it_real_fn, S:01, do_setitimer, syslogd/2522
 # expires at 40706478360651 nsecs [in 6334143555 nsecs]
 #2: <f7829d94>, hrtimer_wakeup, S:01, do_nanosleep, atd/3039
 # expires at 40832442740683 nsecs [in 132298523587 nsecs]
  .expires_next   : 40700308000000 nsecs
  .hres_active    : 1
  .nr_events      : 83948
  .nohz_mode      : 2
  .idle_tick      : 40699800000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 10099951
  .idle_calls     : 175667
  .idle_sleeps    : 109741
  .idle_entrytime : 40700372012887 nsecs
  .idle_sleeptime : 40539661547333 nsecs
  .last_jiffies   : 10100098
  .next_jiffies   : 10100500
  .idle_expires   : 40699996000000 nsecs
jiffies: 10100103

cpu: 1
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1197846065352886518 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <f7829d94>, hrtimer_wakeup, S:01, do_nanosleep, crond/2946
 # expires at 40735767233236 nsecs [in 35623016140 nsecs]
 #1: <f7829d94>, hrtimer_wakeup, S:01, do_nanosleep, smartd/3480
 # expires at 41441988869277 nsecs [in 741844652181 nsecs]
  .expires_next   : 40699997000000 nsecs
  .hres_active    : 1
  .nr_events      : 49671
  .nohz_mode      : 2
  .idle_tick      : 40686865000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 10096717
  .idle_calls     : 127753
  .idle_sleeps    : 81895
  .idle_entrytime : 40699996620694 nsecs
  .idle_sleeptime : 40540536046589 nsecs
  .last_jiffies   : 10100000
  .next_jiffies   : 10100467
  .idle_expires   : 40701864000000 nsecs
jiffies: 10100158


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     40700692000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  tick_handle_oneshot_broadcast
tick_broadcast_mask: 00000003
tick_broadcast_oneshot_mask: 00000003


Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805378776
 min_delta_ns:   1440
 mult:           44735215
 shift:          32
 mode:           1
 next_event:     40700784000000 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

Tick Device: mode:     1
Clock Event Device: lapic
 max_delta_ns:   805378776
 min_delta_ns:   1440
 mult:           44735215
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: lapic_next_event
 set_mode:       lapic_timer_setup
 event_handler:  hrtimer_interrupt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-17 13:08                                                   ` Parag Warudkar
@ 2007-12-17 22:05                                                     ` Thomas Gleixner
  2008-01-09 11:56                                                       ` Thomas Gleixner
  0 siblings, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2007-12-17 22:05 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Mon, 17 Dec 2007, Parag Warudkar wrote:
> On Dec 17, 2007 3:05 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > Sigh. You did not have the debug patch applied anymore, which printks
> > the timer_list data ? Can you apply it again and provide the output
> > please ?
> >
> 
> This keeps getting more and more weird - This time I was running with
> CONFIG_CPU_IDLE=N and I have ton of soft lockups after 14hr uptime.

It's more than weird.
 
> now at 39080147128907 nsecs

>   .idle_entrytime : 39080384013047 nsecs

The time, when we entered idle on CPU#0 is in the future. 

>   .tick_stopped   : 0

But the tick is not stopped, that means CPU#0 has work to do

On CPU#1 idle entry just happened:

>   .idle_entrytime : 39079996603653 nsecs

> now at 40490254040892 nsecs

>   .idle_entrytime : 40490492012833 nsecs

Again, idle_entry on CPU#0 is in the future.

On CPU#1 the idle entry was at: 
>   .idle_entrytime : 40489996578090 nsecs

which means: 0.257462802 sec. ago

> now at 40700144217096 nsecs

Aarg. On CPU#0 this is consistently in the future:

>   .idle_entrytime : 40700372012887 nsecs

I'm really confused.

>   .idle_entrytime : 40699996620694 nsecs
>   .idle_sleeptime : 40540536046589 nsecs
>   .last_jiffies   : 10100000
>   .next_jiffies   : 10100467
>   .idle_expires   : 40701864000000 nsecs
> jiffies: 10100158
> 
> 
> Tick Device: mode:     1
> Clock Event Device: hpet
>  max_delta_ns:   2147483647
>  min_delta_ns:   3352
>  mult:           61496110
>  shift:          32
>  mode:           3
>  next_event:     40700692000000 nsecs
>  set_next_event: hpet_legacy_next_event
>  set_mode:       hpet_legacy_set_mode
>  event_handler:  tick_handle_oneshot_broadcast
> tick_broadcast_mask: 00000003

Here is the next inconsistent data:

> tick_broadcast_oneshot_mask: 00000003

CPU#1 just woke up. That means the broadcast oneshot mask must be
cleared for CPU#1.

Some real strange thing is going on in your box.

I try to come up with some more debug patches tomorrow.

      tglx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2007-12-17 22:05                                                     ` Thomas Gleixner
@ 2008-01-09 11:56                                                       ` Thomas Gleixner
  2008-01-11  0:31                                                         ` Parag Warudkar
  0 siblings, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2008-01-09 11:56 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Mon, 17 Dec 2007, Thomas Gleixner wrote:
> I try to come up with some more debug patches tomorrow.

Sorry took a bit longer than a day :(

Can you apply the patch below + the debug patch which prints the timer
stats on softlockup and provide the output of this.

Thanks,

	tglx

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f4a1395..22d921f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -39,6 +39,8 @@ enum tick_nohz_mode {
  * @idle_calls:		Total number of idle calls
  * @idle_sleeps:	Number of idle calls, where the sched tick was stopped
  * @idle_entrytime:	Time when the idle call was entered
+ * @idle_waketime:	Time when the idle was interrupted
+ * @idle_exittime:	Time when the idle state was left
  * @idle_sleeptime:	Sum of the time slept in idle with sched tick stopped
  * @sleep_length:	Duration of the current idle sleep
  */
@@ -52,6 +54,8 @@ struct tick_sched {
 	unsigned long			idle_calls;
 	unsigned long			idle_sleeps;
 	ktime_t				idle_entrytime;
+	ktime_t				idle_waketime;
+	ktime_t				idle_exittime;
 	ktime_t				idle_sleeptime;
 	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cb89fa8..206a436 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -133,14 +133,15 @@ void tick_nohz_update_jiffies(void)
 	if (!ts->tick_stopped)
 		return;
 
-	touch_softlockup_watchdog();
-
 	cpu_clear(cpu, nohz_cpu_mask);
 	now = ktime_get();
 
 	local_irq_save(flags);
 	tick_do_update_jiffies64(now);
 	local_irq_restore(flags);
+
+	ts->idle_waketime = now;
+	touch_softlockup_watchdog();
 }
 
 /**
@@ -369,6 +370,7 @@ void tick_nohz_restart_sched_tick(void)
 	 * Cancel the scheduled timer and restore the tick
 	 */
 	ts->tick_stopped  = 0;
+	ts->idle_exittime = now;
 	hrtimer_cancel(&ts->sched_timer);
 	ts->sched_timer.expires = ts->idle_tick;
 
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index 12c5f4c..d3d94c1 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -166,6 +166,8 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now)
 		P(idle_calls);
 		P(idle_sleeps);
 		P_ns(idle_entrytime);
+		P_ns(idle_waketime);
+		P_ns(idle_exittime);
 		P_ns(idle_sleeptime);
 		P(last_jiffies);
 		P(next_jiffies);



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2008-01-09 11:56                                                       ` Thomas Gleixner
@ 2008-01-11  0:31                                                         ` Parag Warudkar
  2008-01-11  9:50                                                           ` Thomas Gleixner
  0 siblings, 1 reply; 36+ messages in thread
From: Parag Warudkar @ 2008-01-11  0:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Jan 9, 2008 6:56 AM, Thomas Gleixner <tglx@linutronix.de> wrote:

> Can you apply the patch below + the debug patch which prints the timer
> stats on softlockup and provide the output of this.

Applied to today's git and running for 21 hours - no recurrence yet
even with 1.2 wakeups per second.

I will re-open this thread if it bites again.

Thanks

Parag

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
  2008-01-11  0:31                                                         ` Parag Warudkar
@ 2008-01-11  9:50                                                           ` Thomas Gleixner
  0 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2008-01-11  9:50 UTC (permalink / raw)
  To: Parag Warudkar
  Cc: Len Brown, Arjan van de Ven, Ingo Molnar, Pallipadi, Venkatesh,
	LKML, Andrew Morton, Linus Torvalds

On Thu, 10 Jan 2008, Parag Warudkar wrote:

> On Jan 9, 2008 6:56 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > Can you apply the patch below + the debug patch which prints the timer
> > stats on softlockup and provide the output of this.
> 
> Applied to today's git and running for 21 hours - no recurrence yet
> even with 1.2 wakeups per second.
> 
> I will re-open this thread if it bites again.

Would be interesting which commit fixed that problem. I'm a bit
worried about self healing bugs.

	tglx

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: soft lockup - CPU#1 stuck for 15s! [swapper:0]
@ 2007-12-09 22:42 parag.warudkar
  0 siblings, 0 replies; 36+ messages in thread
From: parag.warudkar @ 2007-12-09 22:42 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Thomas Gleixner, Pallipadi, Venkatesh, LKML,
	Andrew Morton, Linus Torvalds


>starts to sound like something is >turning off the clock during a >somewhat deeper C-state...

Sure may be - powertop is behind all this - if there wasn't long lasting deep c-state we wouldn't have this problem :) 

Actually I run powertop and disable things until the machine goes into around 5-6 wakeups per second. So what you said is quite likely.

Thanks for powertop.

Parag
Sent via BlackBerry from T-Mobile


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2008-01-11  9:51 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-07 22:53 BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0] Parag Warudkar
2007-12-07 23:12 ` Pallipadi, Venkatesh
2007-12-07 23:31   ` Parag Warudkar
2007-12-08  2:56   ` Thomas Gleixner
2007-12-08 15:10     ` Parag Warudkar
2007-12-08 15:36       ` Parag Warudkar
2007-12-08 15:47         ` Ingo Molnar
2007-12-08 16:00           ` Parag Warudkar
2007-12-08 19:13             ` Ingo Molnar
2007-12-08 19:33               ` Parag Warudkar
2007-12-08 19:42                 ` Ingo Molnar
2007-12-08 20:08                   ` Parag Warudkar
2007-12-08 20:11                     ` Ingo Molnar
2007-12-08 20:46                       ` Parag Warudkar
2007-12-08 20:51                         ` Arjan van de Ven
2007-12-08 21:04                           ` Parag Warudkar
2007-12-08 23:12                             ` Parag Warudkar
2007-12-09 21:57                               ` Parag Warudkar
2007-12-09 22:30                                 ` Arjan van de Ven
2007-12-10  5:49                                 ` Thomas Gleixner
2007-12-11  4:49                                   ` Parag Warudkar
2007-12-14 23:17                                     ` Len Brown
2007-12-15  0:51                                       ` Parag Warudkar
2007-12-15  8:10                                         ` Thomas Gleixner
2007-12-15 18:48                                           ` Parag Warudkar
2007-12-16  5:15                                             ` Parag Warudkar
2007-12-16 13:43                                               ` Parag Warudkar
2007-12-17  8:05                                                 ` Thomas Gleixner
2007-12-17 13:08                                                   ` Parag Warudkar
2007-12-17 22:05                                                     ` Thomas Gleixner
2008-01-09 11:56                                                       ` Thomas Gleixner
2008-01-11  0:31                                                         ` Parag Warudkar
2008-01-11  9:50                                                           ` Thomas Gleixner
2007-12-07 23:17 ` BUG: " Andrew Morton
2007-12-08  2:35   ` Rafael J. Wysocki
2007-12-09 22:42 parag.warudkar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).