LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* 2.6.21-rc4-rt0 BUG: at kernel/fork.c:1033 copy_process()
@ 2007-03-21 20:38 Michal Piotrowski
2007-03-22 9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-21 20:38 UTC (permalink / raw)
To: Ingo Molnar; +Cc: LKML
Hi Ingo,
It might be lockdep related
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_LOCKDEP)
BUG: at kernel/fork.c:1033 copy_process()
[<c0105474>] dump_trace+0x78/0x21a
[<c010564b>] show_trace_log_lvl+0x35/0x54
[<c0105ddc>] show_trace+0x2c/0x2e
[<c0105ea3>] dump_stack+0x29/0x2b
[<c01234e0>] copy_process+0x1d1/0x1440
[<c01249fc>] do_fork+0xa8/0x18f
[<c010239e>] kernel_thread+0x93/0x99
[<c01393ec>] keventd_create_kthread+0x2f/0x7c
[<c01394ae>] kthread_create+0x75/0xbb
[<c011feba>] migration_call+0x5c/0x3cf
[<c04ddb54>] migration_init+0x2b/0x62
[<c04d14c5>] init+0x56/0x361
[<c0104ffb>] kernel_thread_helper+0x7/0x10
=======================
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-dmesg
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
^ permalink raw reply [flat|nested] 12+ messages in thread
* [patch] setup_boot_APIC_clock() irq-enable fix
2007-03-21 20:38 2.6.21-rc4-rt0 BUG: at kernel/fork.c:1033 copy_process() Michal Piotrowski
@ 2007-03-22 9:31 ` Ingo Molnar
2007-03-22 10:56 ` Thomas Gleixner
2007-03-22 12:57 ` Michal Piotrowski
0 siblings, 2 replies; 12+ messages in thread
From: Ingo Molnar @ 2007-03-22 9:31 UTC (permalink / raw)
To: Michal Piotrowski; +Cc: LKML, Linus Torvalds, Thomas Gleixner, Andrew Morton
* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> Hi Ingo,
> 2.6.21-rc4-rt0
> BUG: at kernel/fork.c:1033 copy_process()
thanks Michal - this is a real bug that affects upstream too. Find the
fix below - i've test-booted it and it fixes the warning.
Linus, Andrew, this is a must-have for v2.6.21.
Ingo
--------------->
Subject: [patch] setup_boot_APIC_clock() irq-enable fix
From: Ingo Molnar <mingo@elte.hu>
latest -git triggers an irqtrace/lockdep warning of a leaked
irqs-off condition:
BUG: at kernel/fork.c:1033 copy_process()
after some debugging it turns out that commit ca1b940c accidentally left
interrupts disabled - which trickled down all the way to the first time
we fork a kernel thread and triggered the warning.
the fix is to re-enable interrupts in the 'else' branch of
setup_boot_APIC_clock()'s pmtimers calibration path.
Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/i386/kernel/apic.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Index: linux/arch/i386/kernel/apic.c
===================================================================
--- linux.orig/arch/i386/kernel/apic.c
+++ linux/arch/i386/kernel/apic.c
@@ -507,7 +507,8 @@ void __init setup_boot_APIC_clock(void)
apic_printk(APIC_VERBOSE, "... jiffies result ok\n");
else
local_apic_timer_verify_ok = 0;
- }
+ } else
+ local_irq_enable();
if (!local_apic_timer_verify_ok) {
printk(KERN_WARNING
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch] setup_boot_APIC_clock() irq-enable fix
2007-03-22 9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
@ 2007-03-22 10:56 ` Thomas Gleixner
2007-03-22 12:57 ` Michal Piotrowski
1 sibling, 0 replies; 12+ messages in thread
From: Thomas Gleixner @ 2007-03-22 10:56 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Michal Piotrowski, LKML, Linus Torvalds, Andrew Morton
On Thu, 2007-03-22 at 10:31 +0100, Ingo Molnar wrote:
> Subject: [patch] setup_boot_APIC_clock() irq-enable fix
> From: Ingo Molnar <mingo@elte.hu>
>
> latest -git triggers an irqtrace/lockdep warning of a leaked
> irqs-off condition:
>
> BUG: at kernel/fork.c:1033 copy_process()
>
> after some debugging it turns out that commit ca1b940c accidentally left
> interrupts disabled - which trickled down all the way to the first time
> we fork a kernel thread and triggered the warning.
>
> the fix is to re-enable interrupts in the 'else' branch of
> setup_boot_APIC_clock()'s pmtimers calibration path.
>
> Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@brown.paperbag.linutronix.de>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [patch] setup_boot_APIC_clock() irq-enable fix
2007-03-22 9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
2007-03-22 10:56 ` Thomas Gleixner
@ 2007-03-22 12:57 ` Michal Piotrowski
2007-03-22 13:27 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
1 sibling, 1 reply; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-22 12:57 UTC (permalink / raw)
To: Ingo Molnar; +Cc: LKML, Linus Torvalds, Thomas Gleixner, Andrew Morton
On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > Hi Ingo,
>
> > 2.6.21-rc4-rt0
>
> > BUG: at kernel/fork.c:1033 copy_process()
>
> thanks Michal - this is a real bug that affects upstream too. Find the
> fix below - i've test-booted it and it fixes the warning.
Problem is fixed, thanks.
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
^ permalink raw reply [flat|nested] 12+ messages in thread
* 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
2007-03-22 12:57 ` Michal Piotrowski
@ 2007-03-22 13:27 ` Michal Piotrowski
2007-03-23 5:25 ` Vivek Goyal
2007-03-23 7:15 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
0 siblings, 2 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-22 13:27 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, LKML
Michal Piotrowski napisał(a):
> On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>>
>> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>>
>> > Hi Ingo,
>>
>> > 2.6.21-rc4-rt0
>>
>> > BUG: at kernel/fork.c:1033 copy_process()
>>
>> thanks Michal - this is a real bug that affects upstream too. Find the
>> fix below - i've test-booted it and it fixes the warning.
>
> Problem is fixed, thanks.
BTW. It seems that nobody uses -rt as a crash dump kernel ;)
BUG: unable to handle kernel paging request at virtual address f7ebf8c4
printing eip:
c1610192
*pde = 00000000
stopped custom tracer.
Oops: 0000 [#1]
PREEMPT
Modules linked in:
CPU: 0
EIP: 0060:[<c1610192>] Not tainted VLI
EFLAGS: 00010206 (2.6.21-rc4-rt0-kdump #3)
EIP is at copy_oldmem_page+0x4a/0xd0
eax: 000008c4 ebx: f7ebf000 ecx: 00000100 edx: 00000246
esi: f7ebf8c4 edi: c4c520fc ebp: c4d54e30 esp: c4d54e18
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 preempt:00000001
Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
Call Trace:
[<c16af187>] read_from_oldmem+0x73/0x98
[<c19a9cfd>] vmcore_init+0x26c/0xab7
[<c199979b>] init+0xaa/0x287
[<c16044eb>] kernel_thread_helper+0x7/0x10
=======================
l *copy_oldmem_page+0x4a/0xd0
0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
30 * copying the data to a pre-allocated kernel page and then copying to user
31 * space in non-atomic context.
32 */
33 ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
34 size_t csize, unsigned long offset, int userbuf)
35 {
36 void *vaddr;
37
38 if (!csize)
39 return 0;
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c184045a>] .... __spin_lock_irqsave+0x23/0x65
.....[<c1604f8c>] .. ( <= die+0x44/0x24d)
l *0xc184045a
0xc184045a is in __spin_lock_irqsave (kernel/spinlock.c:122).
117 {
118 unsigned long flags;
119
120 local_irq_save(flags);
121 preempt_disable();
122 spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
123 /*
124 * On lockdep we dont want the hand-coded irq-enable of
125 * _raw_spin_lock_flags() code, because lockdep assumes
126 * that interrupts are not re-enabled during lock-acquire:
l *0xc1604f8c
0xc1604f8c is in die (arch/i386/kernel/traps.c:477).
472
473 oops_enter();
474
475 if (die.lock_owner != raw_smp_processor_id()) {
476 console_verbose();
477 spin_lock_irqsave(&die.lock, flags);
478 die.lock_owner = smp_processor_id();
479 die.lock_owner_depth = 0;
480 bust_spinlocks(1);
481 }
Code: 10 05 00 c1 e3 05 03 1d 60 8e d6 c1 89 1c 24 e8 fc 33 00 00 89 c3 83 7d 18 00 75 2a 8b 4d 10 c1 e9 02 8b 45 14 8d 34 03 8b 7d 0c <f3> a5 8b 4d 10 83 e1 03 74 02 f3 a4 e8 cb 10 05 00 89 1c 24 e8
EIP: [<c1610192>] copy_oldmem_page+0x4a/0xd0 SS:ESP 0068:c4d54e18
Kernel panic - not syncing: Attempted to kill init!
[<c160496d>] dump_trace+0x78/0x21a
[<c1604b44>] show_trace_log_lvl+0x35/0x54
[<c16052c4>] show_trace+0x2c/0x2e
[<c160538b>] dump_stack+0x29/0x2b
[<c1618b30>] panic+0x68/0x130
[<c161b67c>] do_exit+0xa1/0x7e3
[<c160516a>] die+0x222/0x24d
[<c1612a3f>] do_page_fault+0x4a1/0x586
[<c1841044>] error_code+0x74/0x7c
[<c1610192>] copy_oldmem_page+0x4a/0xd0
[<c16af187>] read_from_oldmem+0x73/0x98
[<c19a9cfd>] vmcore_init+0x26c/0xab7
[<c199979b>] init+0xaa/0x287
[<c16044eb>] kernel_thread_helper+0x7/0x10
=======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c1618ae6>] .... panic+0x1e/0x130
.....[<c161b67c>] .. ( <= do_exit+0xa1/0x7e3)
l *0xc1618ae6
0xc1618ae6 is in panic (kernel/panic.c:85).
80 * have preempt disabled. Some functions called from here want
81 * preempt to be disabled. No point enabling it later though...
82 */
83 preempt_disable();
84
85 bust_spinlocks(1);
86 va_start(args, fmt);
87 vsnprintf(buf, sizeof(buf), fmt, args);
88 va_end(args);
89 printk(KERN_EMERG "Kernel panic - not syncing: %s\n",buf);
l *0xc161b67c
0xc161b67c is in do_exit (include/linux/pid_namespace.h:42).
37 kref_put(&ns->kref, free_pid_ns);
38 }
39
40 static inline struct task_struct *child_reaper(struct task_struct *tsk)
41 {
42 return init_pid_ns.child_reaper;
43 }
44
45 #endif /* _LINUX_PID_NS_H */
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/kdump-console.log
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/kdump-config
NOHZ: local_softirq_pending 02 on CPU#1
NOHZ: local_softirq_pending 02 on CPU#0
NOHZ: local_softirq_pending 08 on CPU#0
NOHZ: local_softirq_pending 02 on CPU#1
NOHZ: local_softirq_pending 10 on CPU#0
NOHZ: local_softirq_pending 02 on CPU#0
NOHZ: local_softirq_pending 02 on CPU#1
NOHZ: local_softirq_pending 08 on CPU#0
CPU0 CPU1
0: 304 0 IO-APIC-edge timer
1: 2319 0 IO-APIC-edge i8042
7: 0 0 IO-APIC-edge parport0
8: 1 0 IO-APIC-edge rtc
9: 1 0 IO-APIC-fasteoi acpi
12: 3 0 IO-APIC-edge i8042
14: 738 0 IO-APIC-edge ide0
15: 3050 0 IO-APIC-edge ide1
16: 23802 0 IO-APIC-fasteoi uhci_hcd:usb2, uhci_hcd:usb5
17: 33123 0 IO-APIC-fasteoi eth1
19: 29280 0 IO-APIC-fasteoi libata, uhci_hcd:usb4
20: 2 0 IO-APIC-fasteoi ehci_hcd:usb1
21: 0 0 IO-APIC-fasteoi uhci_hcd:usb3
22: 38572 0 IO-APIC-fasteoi Intel ICH5
NMI: 0 0
LOC: 351305 268386
ERR: 0
MIS: 0
Hibernation is still broken.
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
2007-03-22 13:27 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
@ 2007-03-23 5:25 ` Vivek Goyal
2007-03-23 8:23 ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
2007-03-23 12:10 ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
2007-03-23 7:15 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
1 sibling, 2 replies; 12+ messages in thread
From: Vivek Goyal @ 2007-03-23 5:25 UTC (permalink / raw)
To: Michal Piotrowski; +Cc: Ingo Molnar, Thomas Gleixner, LKML
On Thu, Mar 22, 2007 at 02:27:25PM +0100, Michal Piotrowski wrote:
> Michal Piotrowski napisał(a):
> > On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
> >>
> >> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> >>
> >> > Hi Ingo,
> >>
> >> > 2.6.21-rc4-rt0
> >>
> >> > BUG: at kernel/fork.c:1033 copy_process()
> >>
> >> thanks Michal - this is a real bug that affects upstream too. Find the
> >> fix below - i've test-booted it and it fixes the warning.
> >
> > Problem is fixed, thanks.
>
> BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>
> BUG: unable to handle kernel paging request at virtual address f7ebf8c4
> printing eip:
> c1610192
> *pde = 00000000
> stopped custom tracer.
> Oops: 0000 [#1]
> PREEMPT
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c1610192>] Not tainted VLI
> EFLAGS: 00010206 (2.6.21-rc4-rt0-kdump #3)
> EIP is at copy_oldmem_page+0x4a/0xd0
> eax: 000008c4 ebx: f7ebf000 ecx: 00000100 edx: 00000246
> esi: f7ebf8c4 edi: c4c520fc ebp: c4d54e30 esp: c4d54e18
> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 preempt:00000001
> Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
> Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
> 00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
> c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
> Call Trace:
> [<c16af187>] read_from_oldmem+0x73/0x98
> [<c19a9cfd>] vmcore_init+0x26c/0xab7
> [<c199979b>] init+0xaa/0x287
> [<c16044eb>] kernel_thread_helper+0x7/0x10
> =======================
>
> l *copy_oldmem_page+0x4a/0xd0
> 0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
> 30 * copying the data to a pre-allocated kernel page and then copying to user
> 31 * space in non-atomic context.
> 32 */
> 33 ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> 34 size_t csize, unsigned long offset, int userbuf)
> 35 {
> 36 void *vaddr;
> 37
> 38 if (!csize)
> 39 return 0;
>
Can you please paste the disassembly of copy_oldmem_page() on your system.
Not sure from where this faulting address 0xf7ebf8c4 is coming. We are still
in vmcore_init(), so we should be copying the data to kernel buffers only.
This looks like a valid kernel address.
Can you also put some printk() here to find out from where 0xf7ebf8c4 has
come? It does not look like a fixed kernel virutual address returned by
kmap_atomic_pfn(). Then is it passed by kernel as a parameter to
copy_oldmem_page()?
Thanks
Vivek
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
2007-03-22 13:27 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
2007-03-23 5:25 ` Vivek Goyal
@ 2007-03-23 7:15 ` Ingo Molnar
2007-03-23 7:58 ` Michal Piotrowski
1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2007-03-23 7:15 UTC (permalink / raw)
To: Michal Piotrowski; +Cc: Thomas Gleixner, LKML
* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> >> > BUG: at kernel/fork.c:1033 copy_process()
> >>
> >> thanks Michal - this is a real bug that affects upstream too. Find
> >> the fix below - i've test-booted it and it fixes the warning.
> >
> > Problem is fixed, thanks.
>
> BTW. It seems that nobody uses -rt as a crash dump kernel ;)
it's been tested with v2.6.20-rt8, and it should work as long as you
enable CONFIG_RELOCATABLE. But i'm not using it myself, and
v2.6.21-rc4-rt0 isnt a particularly encouraging version string for
people to try ;)
> Hibernation is still broken.
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
what's the failure mode besides the lockdep + other debug messages - it
doesnt resume? Your log seems to have at least one sequence of resume
related messages - those seem to have worked fine.
Ingo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
2007-03-23 7:15 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
@ 2007-03-23 7:58 ` Michal Piotrowski
2007-03-23 8:02 ` Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23 7:58 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, LKML
On 23/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > >> > BUG: at kernel/fork.c:1033 copy_process()
> > >>
> > >> thanks Michal - this is a real bug that affects upstream too. Find
> > >> the fix below - i've test-booted it and it fixes the warning.
> > >
> > > Problem is fixed, thanks.
> >
> > BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>
> it's been tested with v2.6.20-rt8, and it should work as long as you
> enable CONFIG_RELOCATABLE. But i'm not using it myself, and
> v2.6.21-rc4-rt0 isnt a particularly encouraging version string for
> people to try ;)
>
> > Hibernation is still broken.
> >
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
>
> what's the failure mode besides the lockdep + other debug messages - it
> doesnt resume? Your log seems to have at least one sequence of resume
> related messages - those seem to have worked fine.
Kernel has crashed after
Read 497936 kbytes in 23.09 seconds (21.56 MB/s)
swsusp: Reading resume file was successful
PM: Preparing devices for restore.
Suspending console(s)
>
> Ingo
>
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
2007-03-23 7:58 ` Michal Piotrowski
@ 2007-03-23 8:02 ` Ingo Molnar
2007-03-23 8:17 ` Michal Piotrowski
0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2007-03-23 8:02 UTC (permalink / raw)
To: Michal Piotrowski; +Cc: Thomas Gleixner, LKML
* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> >> Hibernation is still broken.
> >>
> >>
> >http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> >>
> >http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
> >
> >what's the failure mode besides the lockdep + other debug messages -
> >it doesnt resume? Your log seems to have at least one sequence of
> >resume related messages - those seem to have worked fine.
>
> Kernel has crashed after
crashed == 'hung hard' or 'spontaneous reboot' or 'other'?
> Read 497936 kbytes in 23.09 seconds (21.56 MB/s)
> swsusp: Reading resume file was successful
> PM: Preparing devices for restore.
> Suspending console(s)
i havent used sw-suspend for a while, so here's a stupid question: why
does it try to suspend the consoles in the resume path? I assume the
messages above mean that we are already in the resume path?
Ingo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
2007-03-23 8:02 ` Ingo Molnar
@ 2007-03-23 8:17 ` Michal Piotrowski
0 siblings, 0 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23 8:17 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Thomas Gleixner, LKML
On 23/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > >> Hibernation is still broken.
> > >>
> > >>
> > >http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> > >>
> > >http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
> > >
> > >what's the failure mode besides the lockdep + other debug messages -
> > >it doesnt resume? Your log seems to have at least one sequence of
> > >resume related messages - those seem to have worked fine.
> >
> > Kernel has crashed after
>
> crashed == 'hung hard' or 'spontaneous reboot' or 'other'?
It means 'nothing has happened for a five minutes'.
>
> > Read 497936 kbytes in 23.09 seconds (21.56 MB/s)
> > swsusp: Reading resume file was successful
> > PM: Preparing devices for restore.
> > Suspending console(s)
>
> i havent used sw-suspend for a while, so here's a stupid question: why
> does it try to suspend the consoles in the resume path? I assume the
> messages above mean that we are already in the resume path?
Yes. (I'm not sure, but "Suspending console(s)" might come from PM_DEBUG.)
>
> Ingo
>
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.21-rc4-rt0-kdump
2007-03-23 5:25 ` Vivek Goyal
@ 2007-03-23 8:23 ` Michal Piotrowski
2007-03-23 12:10 ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
1 sibling, 0 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23 8:23 UTC (permalink / raw)
To: vgoyal; +Cc: Michal Piotrowski, Ingo Molnar, Thomas Gleixner, LKML
Vivek Goyal napisał(a):
> On Thu, Mar 22, 2007 at 02:27:25PM +0100, Michal Piotrowski wrote:
>> Michal Piotrowski napisał(a):
>>> On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>>>> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>>>>
>>>>> Hi Ingo,
>>>>> 2.6.21-rc4-rt0
>>>>> BUG: at kernel/fork.c:1033 copy_process()
>>>> thanks Michal - this is a real bug that affects upstream too. Find the
>>>> fix below - i've test-booted it and it fixes the warning.
>>> Problem is fixed, thanks.
>> BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>>
>> BUG: unable to handle kernel paging request at virtual address f7ebf8c4
>> printing eip:
>> c1610192
>> *pde = 00000000
>> stopped custom tracer.
>> Oops: 0000 [#1]
>> PREEMPT
>> Modules linked in:
>> CPU: 0
>> EIP: 0060:[<c1610192>] Not tainted VLI
>> EFLAGS: 00010206 (2.6.21-rc4-rt0-kdump #3)
>> EIP is at copy_oldmem_page+0x4a/0xd0
>> eax: 000008c4 ebx: f7ebf000 ecx: 00000100 edx: 00000246
>> esi: f7ebf8c4 edi: c4c520fc ebp: c4d54e30 esp: c4d54e18
>> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 preempt:00000001
>> Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
>> Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
>> 00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
>> c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
>> Call Trace:
>> [<c16af187>] read_from_oldmem+0x73/0x98
>> [<c19a9cfd>] vmcore_init+0x26c/0xab7
>> [<c199979b>] init+0xaa/0x287
>> [<c16044eb>] kernel_thread_helper+0x7/0x10
>> =======================
>>
>> l *copy_oldmem_page+0x4a/0xd0
>> 0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
>> 30 * copying the data to a pre-allocated kernel page and then copying to user
>> 31 * space in non-atomic context.
>> 32 */
>> 33 ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> 34 size_t csize, unsigned long offset, int userbuf)
>> 35 {
>> 36 void *vaddr;
>> 37
>> 38 if (!csize)
>> 39 return 0;
>>
>
> Can you please paste the disassembly of copy_oldmem_page() on your system.
disassemble *copy_oldmem_page
Dump of assembler code for function copy_oldmem_page:
0xc1610148 <copy_oldmem_page+0>: push %ebp
0xc1610149 <copy_oldmem_page+1>: mov %esp,%ebp
0xc161014b <copy_oldmem_page+3>: push %edi
0xc161014c <copy_oldmem_page+4>: push %esi
0xc161014d <copy_oldmem_page+5>: push %ebx
0xc161014e <copy_oldmem_page+6>: sub $0xc,%esp
0xc1610151 <copy_oldmem_page+9>: call 0xc160f5c4 <mcount>
0xc1610156 <copy_oldmem_page+14>: mov 0x8(%ebp),%ebx
0xc1610159 <copy_oldmem_page+17>: xor %edx,%edx
0xc161015b <copy_oldmem_page+19>: cmpl $0x0,0x10(%ebp)
0xc161015f <copy_oldmem_page+23>: je 0xc161020d <copy_oldmem_page+197>
0xc1610165 <copy_oldmem_page+29>: call 0xc1661258 <pagefault_disable>
0xc161016a <copy_oldmem_page+34>: shl $0x5,%ebx
0xc161016d <copy_oldmem_page+37>: add 0xc1d68e60,%ebx
0xc1610173 <copy_oldmem_page+43>: mov %ebx,(%esp)
0xc1610176 <copy_oldmem_page+46>: call 0xc1613577 <kmap>
0xc161017b <copy_oldmem_page+51>: mov %eax,%ebx
0xc161017d <copy_oldmem_page+53>: cmpl $0x0,0x18(%ebp)
0xc1610181 <copy_oldmem_page+57>: jne 0xc16101ad <copy_oldmem_page+101>
0xc1610183 <copy_oldmem_page+59>: mov 0x10(%ebp),%ecx
0xc1610186 <copy_oldmem_page+62>: shr $0x2,%ecx
0xc1610189 <copy_oldmem_page+65>: mov 0x14(%ebp),%eax
0xc161018c <copy_oldmem_page+68>: lea (%ebx,%eax,1),%esi
0xc161018f <copy_oldmem_page+71>: mov 0xc(%ebp),%edi
0xc1610192 <copy_oldmem_page+74>: rep movsl %ds:(%esi),%es:(%edi)
0xc1610194 <copy_oldmem_page+76>: mov 0x10(%ebp),%ecx
0xc1610197 <copy_oldmem_page+79>: and $0x3,%ecx
0xc161019a <copy_oldmem_page+82>: je 0xc161019e <copy_oldmem_page+86>
0xc161019c <copy_oldmem_page+84>: rep movsb %ds:(%esi),%es:(%edi)
0xc161019e <copy_oldmem_page+86>: call 0xc166126e <pagefault_enable>
0xc16101a3 <copy_oldmem_page+91>: mov %ebx,(%esp)
0xc16101a6 <copy_oldmem_page+94>: call 0xc1613533 <kunmap_virt>
0xc16101ab <copy_oldmem_page+99>: jmp 0xc161020a <copy_oldmem_page+194>
0xc16101ad <copy_oldmem_page+101>: mov 0xc19d4004,%edi
0xc16101b3 <copy_oldmem_page+107>: test %edi,%edi
0xc16101b5 <copy_oldmem_page+109>: jne 0xc16101ca <copy_oldmem_page+130>
0xc16101b7 <copy_oldmem_page+111>: movl $0xc18bbc93,(%esp)
0xc16101be <copy_oldmem_page+118>: call 0xc1619671 <printk>
0xc16101c3 <copy_oldmem_page+123>: mov $0xfffffff2,%edx
0xc16101c8 <copy_oldmem_page+128>: jmp 0xc161020d <copy_oldmem_page+197>
0xc16101ca <copy_oldmem_page+130>: mov $0x400,%ecx
0xc16101cf <copy_oldmem_page+135>: mov %eax,%esi
0xc16101d1 <copy_oldmem_page+137>: rep movsl %ds:(%esi),%es:(%edi)
0xc16101d3 <copy_oldmem_page+139>: call 0xc166126e <pagefault_enable>
0xc16101d8 <copy_oldmem_page+144>: mov %ebx,(%esp)
0xc16101db <copy_oldmem_page+147>: call 0xc1613533 <kunmap_virt>
0xc16101e0 <copy_oldmem_page+152>: mov 0x10(%ebp),%eax
0xc16101e3 <copy_oldmem_page+155>: mov %eax,0x8(%esp)
0xc16101e7 <copy_oldmem_page+159>: mov 0xc19d4004,%eax
0xc16101ec <copy_oldmem_page+164>: add %eax,0x14(%ebp)
0xc16101ef <copy_oldmem_page+167>: mov 0x14(%ebp),%eax
0xc16101f2 <copy_oldmem_page+170>: mov %eax,0x4(%esp)
0xc16101f6 <copy_oldmem_page+174>: mov 0xc(%ebp),%eax
0xc16101f9 <copy_oldmem_page+177>: mov %eax,(%esp)
0xc16101fc <copy_oldmem_page+180>: call 0xc1700e98 <copy_to_user>
0xc1610201 <copy_oldmem_page+185>: mov $0xfffffff2,%edx
0xc1610206 <copy_oldmem_page+190>: test %eax,%eax
0xc1610208 <copy_oldmem_page+192>: jne 0xc161020d <copy_oldmem_page+197>
0xc161020a <copy_oldmem_page+194>: mov 0x10(%ebp),%edx
0xc161020d <copy_oldmem_page+197>: mov %edx,%eax
0xc161020f <copy_oldmem_page+199>: add $0xc,%esp
0xc1610212 <copy_oldmem_page+202>: pop %ebx
0xc1610213 <copy_oldmem_page+203>: pop %esi
0xc1610214 <copy_oldmem_page+204>: pop %edi
0xc1610215 <copy_oldmem_page+205>: pop %ebp
0xc1610216 <copy_oldmem_page+206>: ret
End of assembler dump.
> Not sure from where this faulting address 0xf7ebf8c4 is coming. We are still
> in vmcore_init(), so we should be copying the data to kernel buffers only.
> This looks like a valid kernel address.
>
> Can you also put some printk() here to find out from where 0xf7ebf8c4 has
> come? It does not look like a fixed kernel virutual address returned by
> kmap_atomic_pfn(). Then is it passed by kernel as a parameter to
> copy_oldmem_page()?
>
> Thanks
> Vivek
>
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.21-rc4-rt0-kdump
2007-03-23 5:25 ` Vivek Goyal
2007-03-23 8:23 ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
@ 2007-03-23 12:10 ` Michal Piotrowski
1 sibling, 0 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23 12:10 UTC (permalink / raw)
To: vgoyal; +Cc: Michal Piotrowski, Ingo Molnar, Thomas Gleixner, LKML
Vivek Goyal napisał(a):
> On Thu, Mar 22, 2007 at 02:27:25PM +0100, Michal Piotrowski wrote:
>> Michal Piotrowski napisał(a):
>>> On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>>>> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>>>>
>>>>> Hi Ingo,
>>>>> 2.6.21-rc4-rt0
>>>>> BUG: at kernel/fork.c:1033 copy_process()
>>>> thanks Michal - this is a real bug that affects upstream too. Find the
>>>> fix below - i've test-booted it and it fixes the warning.
>>> Problem is fixed, thanks.
>> BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>>
>> BUG: unable to handle kernel paging request at virtual address f7ebf8c4
>> printing eip:
>> c1610192
>> *pde = 00000000
>> stopped custom tracer.
>> Oops: 0000 [#1]
>> PREEMPT
>> Modules linked in:
>> CPU: 0
>> EIP: 0060:[<c1610192>] Not tainted VLI
>> EFLAGS: 00010206 (2.6.21-rc4-rt0-kdump #3)
>> EIP is at copy_oldmem_page+0x4a/0xd0
>> eax: 000008c4 ebx: f7ebf000 ecx: 00000100 edx: 00000246
>> esi: f7ebf8c4 edi: c4c520fc ebp: c4d54e30 esp: c4d54e18
>> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 preempt:00000001
>> Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
>> Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
>> 00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
>> c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
>> Call Trace:
>> [<c16af187>] read_from_oldmem+0x73/0x98
>> [<c19a9cfd>] vmcore_init+0x26c/0xab7
>> [<c199979b>] init+0xaa/0x287
>> [<c16044eb>] kernel_thread_helper+0x7/0x10
>> =======================
>>
>> l *copy_oldmem_page+0x4a/0xd0
>> 0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
>> 30 * copying the data to a pre-allocated kernel page and then copying to user
>> 31 * space in non-atomic context.
>> 32 */
>> 33 ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> 34 size_t csize, unsigned long offset, int userbuf)
>> 35 {
>> 36 void *vaddr;
>> 37
>> 38 if (!csize)
>> 39 return 0;
>>
>
> Can you please paste the disassembly of copy_oldmem_page() on your system.
> Not sure from where this faulting address 0xf7ebf8c4 is coming. We are still
> in vmcore_init(), so we should be copying the data to kernel buffers only.
> This looks like a valid kernel address.
>
> Can you also put some printk() here to find out from where 0xf7ebf8c4 has
> come? It does not look like a fixed kernel virutual address returned by
> kmap_atomic_pfn(). Then is it passed by kernel as a parameter to
> copy_oldmem_page()?
I added
printk(KERN_WARNING "copy_oldmem_page() pfn=%lu , buf=%s , nr_bytes=%d , offset=%lu , userbuf=%d\n", pfn, buf, nr_bytes, offset, userbuf);
before
tmp = copy_oldmem_page(pfn, buf, nr_bytes, offset, userbuf);
result is here
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/kdump-console2.log
'buf' might be broken.
>
> Thanks
> Vivek
>
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2007-03-23 12:11 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-21 20:38 2.6.21-rc4-rt0 BUG: at kernel/fork.c:1033 copy_process() Michal Piotrowski
2007-03-22 9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
2007-03-22 10:56 ` Thomas Gleixner
2007-03-22 12:57 ` Michal Piotrowski
2007-03-22 13:27 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
2007-03-23 5:25 ` Vivek Goyal
2007-03-23 8:23 ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
2007-03-23 12:10 ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
2007-03-23 7:15 ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
2007-03-23 7:58 ` Michal Piotrowski
2007-03-23 8:02 ` Ingo Molnar
2007-03-23 8:17 ` Michal Piotrowski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).