LKML Archive on lore.kernel.org
* 2.6.21-rc4-rt0 BUG: at kernel/fork.c:1033 copy_process()
@ 2007-03-21 20:38 Michal Piotrowski
  2007-03-22  9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-21 20:38 UTC
To: Ingo Molnar; +Cc: LKML

Hi Ingo,

It might be lockdep related

-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_LOCKDEP)

BUG: at kernel/fork.c:1033 copy_process()
 [<c0105474>] dump_trace+0x78/0x21a
 [<c010564b>] show_trace_log_lvl+0x35/0x54
 [<c0105ddc>] show_trace+0x2c/0x2e
 [<c0105ea3>] dump_stack+0x29/0x2b
 [<c01234e0>] copy_process+0x1d1/0x1440
 [<c01249fc>] do_fork+0xa8/0x18f
 [<c010239e>] kernel_thread+0x93/0x99
 [<c01393ec>] keventd_create_kthread+0x2f/0x7c
 [<c01394ae>] kthread_create+0x75/0xbb
 [<c011feba>] migration_call+0x5c/0x3cf
 [<c04ddb54>] migration_init+0x2b/0x62
 [<c04d14c5>] init+0x56/0x361
 [<c0104ffb>] kernel_thread_helper+0x7/0x10
 =======================
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-dmesg

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
* [patch] setup_boot_APIC_clock() irq-enable fix
  2007-03-21 20:38 2.6.21-rc4-rt0 BUG: at kernel/fork.c:1033 copy_process() Michal Piotrowski
@ 2007-03-22  9:31 ` Ingo Molnar
  2007-03-22 10:56   ` Thomas Gleixner
  2007-03-22 12:57   ` Michal Piotrowski
  0 siblings, 2 replies; 12+ messages in thread
From: Ingo Molnar @ 2007-03-22  9:31 UTC
To: Michal Piotrowski; +Cc: LKML, Linus Torvalds, Thomas Gleixner, Andrew Morton

* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:

> Hi Ingo,

> 2.6.21-rc4-rt0

> BUG: at kernel/fork.c:1033 copy_process()

thanks Michal - this is a real bug that affects upstream too. Find the
fix below - i've test-booted it and it fixes the warning.

Linus, Andrew, this is a must-have for v2.6.21.

	Ingo

--------------->
Subject: [patch] setup_boot_APIC_clock() irq-enable fix
From: Ingo Molnar <mingo@elte.hu>

latest -git triggers an irqtrace/lockdep warning of a leaked irqs-off
condition:

  BUG: at kernel/fork.c:1033 copy_process()

after some debugging it turns out that commit ca1b940c accidentally left
interrupts disabled - which trickled down all the way to the first time
we fork a kernel thread and triggered the warning.

the fix is to re-enable interrupts in the 'else' branch of
setup_boot_APIC_clock()'s pmtimers calibration path.

Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/i386/kernel/apic.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux/arch/i386/kernel/apic.c
===================================================================
--- linux.orig/arch/i386/kernel/apic.c
+++ linux/arch/i386/kernel/apic.c
@@ -507,7 +507,8 @@ void __init setup_boot_APIC_clock(void)
 			apic_printk(APIC_VERBOSE, "... jiffies result ok\n");
 		else
 			local_apic_timer_verify_ok = 0;
-	}
+	} else
+		local_irq_enable();
 
 	if (!local_apic_timer_verify_ok) {
 		printk(KERN_WARNING
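The failure mode described in the patch above (one branch of a function restores the interrupt state, the other returns with interrupts still off) can be sketched as a small userspace model. This is only an illustration under stated assumptions: the `sim_*` names are hypothetical, the boolean flag stands in for the CPU interrupt flag, and the branch structure is inferred from the diff rather than copied from arch/i386/kernel/apic.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU interrupt flag; not a kernel API. */
static bool sim_irqs_enabled = true;

static void sim_irq_disable(void) { sim_irqs_enabled = false; }
static void sim_irq_enable(void)  { sim_irqs_enabled = true; }

/*
 * Models a calibration routine that turns interrupts off on entry and is
 * expected to turn them back on before returning.
 *
 * verify_path: take the branch that (by assumption) already re-enabled
 *              interrupts before the patch.
 * apply_fix:   add the re-enable on the other branch, mirroring the
 *              "} else local_irq_enable();" arm of the patch.
 */
static void sim_calibrate(bool verify_path, bool apply_fix)
{
	sim_irq_disable();
	/* ... calibration would run here with interrupts off ... */
	if (verify_path) {
		sim_irq_enable();	/* this branch restores the flag */
		/* ... verification that needs interrupts enabled ... */
	} else if (apply_fix) {
		sim_irq_enable();	/* the added 'else' arm */
	}
	/* without the fix, the 'else' case falls through with irqs off */
}

/*
 * Returns true when code running after the calibration (for example the
 * first kernel-thread fork) would observe interrupts still disabled.
 */
bool sim_leaks_irqs_off(bool verify_path, bool apply_fix)
{
	sim_irqs_enabled = true;
	sim_calibrate(verify_path, apply_fix);
	return !sim_irqs_enabled;
}
```

In this model only the combination "not the verify path, fix not applied" leaks the irqs-off state, which matches the description of a warning that fires much later, at the first fork, rather than at the point where the state was leaked.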
* Re: [patch] setup_boot_APIC_clock() irq-enable fix
  2007-03-22  9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
@ 2007-03-22 10:56   ` Thomas Gleixner
  2007-03-22 12:57   ` Michal Piotrowski
  1 sibling, 0 replies; 12+ messages in thread
From: Thomas Gleixner @ 2007-03-22 10:56 UTC
To: Ingo Molnar; +Cc: Michal Piotrowski, LKML, Linus Torvalds, Andrew Morton

On Thu, 2007-03-22 at 10:31 +0100, Ingo Molnar wrote:
> Subject: [patch] setup_boot_APIC_clock() irq-enable fix
> From: Ingo Molnar <mingo@elte.hu>
>
> latest -git triggers an irqtrace/lockdep warning of a leaked
> irqs-off condition:
>
>   BUG: at kernel/fork.c:1033 copy_process()
>
> after some debugging it turns out that commit ca1b940c accidentally left
> interrupts disabled - which trickled down all the way to the first time
> we fork a kernel thread and triggered the warning.
>
> the fix is to re-enable interrupts in the 'else' branch of
> setup_boot_APIC_clock()'s pmtimers calibration path.
>
> Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>

Acked-by: Thomas Gleixner <tglx@brown.paperbag.linutronix.de>
* Re: [patch] setup_boot_APIC_clock() irq-enable fix
  2007-03-22  9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
  2007-03-22 10:56   ` Thomas Gleixner
@ 2007-03-22 12:57   ` Michal Piotrowski
  2007-03-22 13:27     ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
  1 sibling, 1 reply; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-22 12:57 UTC
To: Ingo Molnar; +Cc: LKML, Linus Torvalds, Thomas Gleixner, Andrew Morton

On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > Hi Ingo,
>
> > 2.6.21-rc4-rt0
>
> > BUG: at kernel/fork.c:1033 copy_process()
>
> thanks Michal - this is a real bug that affects upstream too. Find the
> fix below - i've test-booted it and it fixes the warning.

Problem is fixed, thanks.

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
* 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
  2007-03-22 12:57   ` Michal Piotrowski
@ 2007-03-22 13:27     ` Michal Piotrowski
  2007-03-23  5:25       ` Vivek Goyal
  2007-03-23  7:15       ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
  0 siblings, 2 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-22 13:27 UTC
To: Ingo Molnar; +Cc: Thomas Gleixner, LKML

Michal Piotrowski wrote:
> On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>>
>> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>>
>> > Hi Ingo,
>>
>> > 2.6.21-rc4-rt0
>>
>> > BUG: at kernel/fork.c:1033 copy_process()
>>
>> thanks Michal - this is a real bug that affects upstream too. Find the
>> fix below - i've test-booted it and it fixes the warning.
>
> Problem is fixed, thanks.

BTW. It seems that nobody uses -rt as a crash dump kernel ;)

BUG: unable to handle kernel paging request at virtual address f7ebf8c4
 printing eip:
c1610192
*pde = 00000000
stopped custom tracer.
Oops: 0000 [#1]
PREEMPT
Modules linked in:
CPU:    0
EIP:    0060:[<c1610192>]    Not tainted VLI
EFLAGS: 00010206   (2.6.21-rc4-rt0-kdump #3)
EIP is at copy_oldmem_page+0x4a/0xd0
eax: 000008c4   ebx: f7ebf000   ecx: 00000100   edx: 00000246
esi: f7ebf8c4   edi: c4c520fc   ebp: c4d54e30   esp: c4d54e18
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068  preempt:00000001
Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
       00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
       c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
Call Trace:
 [<c16af187>] read_from_oldmem+0x73/0x98
 [<c19a9cfd>] vmcore_init+0x26c/0xab7
 [<c199979b>] init+0xaa/0x287
 [<c16044eb>] kernel_thread_helper+0x7/0x10
 =======================

l *copy_oldmem_page+0x4a/0xd0
0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
30	 * copying the data to a pre-allocated kernel page and then copying to user
31	 * space in non-atomic context.
32	 */
33	ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
34	                               size_t csize, unsigned long offset, int userbuf)
35	{
36		void  *vaddr;
37
38		if (!csize)
39			return 0;

---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c184045a>] .... __spin_lock_irqsave+0x23/0x65
.....[<c1604f8c>] ..   ( <= die+0x44/0x24d)

l *0xc184045a
0xc184045a is in __spin_lock_irqsave (kernel/spinlock.c:122).
117	{
118		unsigned long flags;
119
120		local_irq_save(flags);
121		preempt_disable();
122		spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
123		/*
124		 * On lockdep we dont want the hand-coded irq-enable of
125		 * _raw_spin_lock_flags() code, because lockdep assumes
126		 * that interrupts are not re-enabled during lock-acquire:

l *0xc1604f8c
0xc1604f8c is in die (arch/i386/kernel/traps.c:477).
472
473		oops_enter();
474
475		if (die.lock_owner != raw_smp_processor_id()) {
476			console_verbose();
477			spin_lock_irqsave(&die.lock, flags);
478			die.lock_owner = smp_processor_id();
479			die.lock_owner_depth = 0;
480			bust_spinlocks(1);
481		}

Code: 10 05 00 c1 e3 05 03 1d 60 8e d6 c1 89 1c 24 e8 fc 33 00 00 89 c3 83 7d 18 00 75 2a 8b 4d 10 c1 e9 02 8b 45 14 8d 34 03 8b 7d 0c <f3> a5 8b 4d 10 83 e1 03 74 02 f3 a4 e8 cb 10 05 00 89 1c 24 e8
EIP: [<c1610192>] copy_oldmem_page+0x4a/0xd0 SS:ESP 0068:c4d54e18
Kernel panic - not syncing: Attempted to kill init!
 [<c160496d>] dump_trace+0x78/0x21a
 [<c1604b44>] show_trace_log_lvl+0x35/0x54
 [<c16052c4>] show_trace+0x2c/0x2e
 [<c160538b>] dump_stack+0x29/0x2b
 [<c1618b30>] panic+0x68/0x130
 [<c161b67c>] do_exit+0xa1/0x7e3
 [<c160516a>] die+0x222/0x24d
 [<c1612a3f>] do_page_fault+0x4a1/0x586
 [<c1841044>] error_code+0x74/0x7c
 [<c1610192>] copy_oldmem_page+0x4a/0xd0
 [<c16af187>] read_from_oldmem+0x73/0x98
 [<c19a9cfd>] vmcore_init+0x26c/0xab7
 [<c199979b>] init+0xaa/0x287
 [<c16044eb>] kernel_thread_helper+0x7/0x10
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c1618ae6>] .... panic+0x1e/0x130
.....[<c161b67c>] ..   ( <= do_exit+0xa1/0x7e3)

l *0xc1618ae6
0xc1618ae6 is in panic (kernel/panic.c:85).
80		 * have preempt disabled. Some functions called from here want
81		 * preempt to be disabled. No point enabling it later though...
82		 */
83		preempt_disable();
84
85		bust_spinlocks(1);
86		va_start(args, fmt);
87		vsnprintf(buf, sizeof(buf), fmt, args);
88		va_end(args);
89		printk(KERN_EMERG "Kernel panic - not syncing: %s\n",buf);

l *0xc161b67c
0xc161b67c is in do_exit (include/linux/pid_namespace.h:42).
37		kref_put(&ns->kref, free_pid_ns);
38	}
39
40	static inline struct task_struct *child_reaper(struct task_struct *tsk)
41	{
42		return init_pid_ns.child_reaper;
43	}
44
45	#endif /* _LINUX_PID_NS_H */

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/kdump-console.log
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/kdump-config

NOHZ: local_softirq_pending 02 on CPU#1
NOHZ: local_softirq_pending 02 on CPU#0
NOHZ: local_softirq_pending 08 on CPU#0
NOHZ: local_softirq_pending 02 on CPU#1
NOHZ: local_softirq_pending 10 on CPU#0
NOHZ: local_softirq_pending 02 on CPU#0
NOHZ: local_softirq_pending 02 on CPU#1
NOHZ: local_softirq_pending 08 on CPU#0

           CPU0       CPU1
  0:        304          0   IO-APIC-edge      timer
  1:       2319          0   IO-APIC-edge      i8042
  7:          0          0   IO-APIC-edge      parport0
  8:          1          0   IO-APIC-edge      rtc
  9:          1          0   IO-APIC-fasteoi   acpi
 12:          3          0   IO-APIC-edge      i8042
 14:        738          0   IO-APIC-edge      ide0
 15:       3050          0   IO-APIC-edge      ide1
 16:      23802          0   IO-APIC-fasteoi   uhci_hcd:usb2, uhci_hcd:usb5
 17:      33123          0   IO-APIC-fasteoi   eth1
 19:      29280          0   IO-APIC-fasteoi   libata, uhci_hcd:usb4
 20:          2          0   IO-APIC-fasteoi   ehci_hcd:usb1
 21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 22:      38572          0   IO-APIC-fasteoi   Intel ICH5
NMI:          0          0
LOC:     351305     268386
ERR:          0
MIS:          0

Hibernation is still broken.

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
  2007-03-22 13:27     ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
@ 2007-03-23  5:25       ` Vivek Goyal
  2007-03-23  8:23         ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
  2007-03-23 12:10         ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
  2007-03-23  7:15       ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
  1 sibling, 2 replies; 12+ messages in thread
From: Vivek Goyal @ 2007-03-23  5:25 UTC
To: Michal Piotrowski; +Cc: Ingo Molnar, Thomas Gleixner, LKML

On Thu, Mar 22, 2007 at 02:27:25PM +0100, Michal Piotrowski wrote:
> Michal Piotrowski wrote:
> > On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
> >>
> >> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> >>
> >> > Hi Ingo,
> >>
> >> > 2.6.21-rc4-rt0
> >>
> >> > BUG: at kernel/fork.c:1033 copy_process()
> >>
> >> thanks Michal - this is a real bug that affects upstream too. Find the
> >> fix below - i've test-booted it and it fixes the warning.
> >
> > Problem is fixed, thanks.
>
> BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>
> BUG: unable to handle kernel paging request at virtual address f7ebf8c4
>  printing eip:
> c1610192
> *pde = 00000000
> stopped custom tracer.
> Oops: 0000 [#1]
> PREEMPT
> Modules linked in:
> CPU:    0
> EIP:    0060:[<c1610192>]    Not tainted VLI
> EFLAGS: 00010206   (2.6.21-rc4-rt0-kdump #3)
> EIP is at copy_oldmem_page+0x4a/0xd0
> eax: 000008c4   ebx: f7ebf000   ecx: 00000100   edx: 00000246
> esi: f7ebf8c4   edi: c4c520fc   ebp: c4d54e30   esp: c4d54e18
> ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068  preempt:00000001
> Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
> Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
>        00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
>        c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
> Call Trace:
>  [<c16af187>] read_from_oldmem+0x73/0x98
>  [<c19a9cfd>] vmcore_init+0x26c/0xab7
>  [<c199979b>] init+0xaa/0x287
>  [<c16044eb>] kernel_thread_helper+0x7/0x10
>  =======================
>
> l *copy_oldmem_page+0x4a/0xd0
> 0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
> 30	 * copying the data to a pre-allocated kernel page and then copying to user
> 31	 * space in non-atomic context.
> 32	 */
> 33	ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> 34	                               size_t csize, unsigned long offset, int userbuf)
> 35	{
> 36		void  *vaddr;
> 37
> 38		if (!csize)
> 39			return 0;
>

Can you please paste the disassembly of copy_oldmem_page() on your system?
I am not sure where the faulting address 0xf7ebf8c4 is coming from. We are
still in vmcore_init(), so we should be copying the data to kernel buffers
only. This looks like a valid kernel address.

Can you also put some printk() here to find out where 0xf7ebf8c4 has come
from? It does not look like a fixed kernel virtual address returned by
kmap_atomic_pfn(). Is it then passed by the kernel as a parameter to
copy_oldmem_page()?

Thanks
Vivek
* Re: 2.6.21-rc4-rt0-kdump
  2007-03-23  5:25       ` Vivek Goyal
@ 2007-03-23  8:23         ` Michal Piotrowski
  2007-03-23 12:10         ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
  1 sibling, 0 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23  8:23 UTC
To: vgoyal; +Cc: Michal Piotrowski, Ingo Molnar, Thomas Gleixner, LKML

Vivek Goyal wrote:
> On Thu, Mar 22, 2007 at 02:27:25PM +0100, Michal Piotrowski wrote:
>> Michal Piotrowski wrote:
>>> On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>>>> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>>>>
>>>>> Hi Ingo,
>>>>> 2.6.21-rc4-rt0
>>>>> BUG: at kernel/fork.c:1033 copy_process()
>>>> thanks Michal - this is a real bug that affects upstream too. Find the
>>>> fix below - i've test-booted it and it fixes the warning.
>>> Problem is fixed, thanks.
>> BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>>
>> BUG: unable to handle kernel paging request at virtual address f7ebf8c4
>>  printing eip:
>> c1610192
>> *pde = 00000000
>> stopped custom tracer.
>> Oops: 0000 [#1]
>> PREEMPT
>> Modules linked in:
>> CPU:    0
>> EIP:    0060:[<c1610192>]    Not tainted VLI
>> EFLAGS: 00010206   (2.6.21-rc4-rt0-kdump #3)
>> EIP is at copy_oldmem_page+0x4a/0xd0
>> eax: 000008c4   ebx: f7ebf000   ecx: 00000100   edx: 00000246
>> esi: f7ebf8c4   edi: c4c520fc   ebp: c4d54e30   esp: c4d54e18
>> ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068  preempt:00000001
>> Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
>> Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
>>        00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
>>        c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
>> Call Trace:
>>  [<c16af187>] read_from_oldmem+0x73/0x98
>>  [<c19a9cfd>] vmcore_init+0x26c/0xab7
>>  [<c199979b>] init+0xaa/0x287
>>  [<c16044eb>] kernel_thread_helper+0x7/0x10
>>  =======================
>>
>> l *copy_oldmem_page+0x4a/0xd0
>> 0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
>> 30	 * copying the data to a pre-allocated kernel page and then copying to user
>> 31	 * space in non-atomic context.
>> 32	 */
>> 33	ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> 34	                               size_t csize, unsigned long offset, int userbuf)
>> 35	{
>> 36		void  *vaddr;
>> 37
>> 38		if (!csize)
>> 39			return 0;
>>
>
> Can you please paste the disassembly of copy_oldmem_page() on your system.

disassemble *copy_oldmem_page
Dump of assembler code for function copy_oldmem_page:
0xc1610148 <copy_oldmem_page+0>:    push   %ebp
0xc1610149 <copy_oldmem_page+1>:    mov    %esp,%ebp
0xc161014b <copy_oldmem_page+3>:    push   %edi
0xc161014c <copy_oldmem_page+4>:    push   %esi
0xc161014d <copy_oldmem_page+5>:    push   %ebx
0xc161014e <copy_oldmem_page+6>:    sub    $0xc,%esp
0xc1610151 <copy_oldmem_page+9>:    call   0xc160f5c4 <mcount>
0xc1610156 <copy_oldmem_page+14>:   mov    0x8(%ebp),%ebx
0xc1610159 <copy_oldmem_page+17>:   xor    %edx,%edx
0xc161015b <copy_oldmem_page+19>:   cmpl   $0x0,0x10(%ebp)
0xc161015f <copy_oldmem_page+23>:   je     0xc161020d <copy_oldmem_page+197>
0xc1610165 <copy_oldmem_page+29>:   call   0xc1661258 <pagefault_disable>
0xc161016a <copy_oldmem_page+34>:   shl    $0x5,%ebx
0xc161016d <copy_oldmem_page+37>:   add    0xc1d68e60,%ebx
0xc1610173 <copy_oldmem_page+43>:   mov    %ebx,(%esp)
0xc1610176 <copy_oldmem_page+46>:   call   0xc1613577 <kmap>
0xc161017b <copy_oldmem_page+51>:   mov    %eax,%ebx
0xc161017d <copy_oldmem_page+53>:   cmpl   $0x0,0x18(%ebp)
0xc1610181 <copy_oldmem_page+57>:   jne    0xc16101ad <copy_oldmem_page+101>
0xc1610183 <copy_oldmem_page+59>:   mov    0x10(%ebp),%ecx
0xc1610186 <copy_oldmem_page+62>:   shr    $0x2,%ecx
0xc1610189 <copy_oldmem_page+65>:   mov    0x14(%ebp),%eax
0xc161018c <copy_oldmem_page+68>:   lea    (%ebx,%eax,1),%esi
0xc161018f <copy_oldmem_page+71>:   mov    0xc(%ebp),%edi
0xc1610192 <copy_oldmem_page+74>:   rep movsl %ds:(%esi),%es:(%edi)
0xc1610194 <copy_oldmem_page+76>:   mov    0x10(%ebp),%ecx
0xc1610197 <copy_oldmem_page+79>:   and    $0x3,%ecx
0xc161019a <copy_oldmem_page+82>:   je     0xc161019e <copy_oldmem_page+86>
0xc161019c <copy_oldmem_page+84>:   rep movsb %ds:(%esi),%es:(%edi)
0xc161019e <copy_oldmem_page+86>:   call   0xc166126e <pagefault_enable>
0xc16101a3 <copy_oldmem_page+91>:   mov    %ebx,(%esp)
0xc16101a6 <copy_oldmem_page+94>:   call   0xc1613533 <kunmap_virt>
0xc16101ab <copy_oldmem_page+99>:   jmp    0xc161020a <copy_oldmem_page+194>
0xc16101ad <copy_oldmem_page+101>:  mov    0xc19d4004,%edi
0xc16101b3 <copy_oldmem_page+107>:  test   %edi,%edi
0xc16101b5 <copy_oldmem_page+109>:  jne    0xc16101ca <copy_oldmem_page+130>
0xc16101b7 <copy_oldmem_page+111>:  movl   $0xc18bbc93,(%esp)
0xc16101be <copy_oldmem_page+118>:  call   0xc1619671 <printk>
0xc16101c3 <copy_oldmem_page+123>:  mov    $0xfffffff2,%edx
0xc16101c8 <copy_oldmem_page+128>:  jmp    0xc161020d <copy_oldmem_page+197>
0xc16101ca <copy_oldmem_page+130>:  mov    $0x400,%ecx
0xc16101cf <copy_oldmem_page+135>:  mov    %eax,%esi
0xc16101d1 <copy_oldmem_page+137>:  rep movsl %ds:(%esi),%es:(%edi)
0xc16101d3 <copy_oldmem_page+139>:  call   0xc166126e <pagefault_enable>
0xc16101d8 <copy_oldmem_page+144>:  mov    %ebx,(%esp)
0xc16101db <copy_oldmem_page+147>:  call   0xc1613533 <kunmap_virt>
0xc16101e0 <copy_oldmem_page+152>:  mov    0x10(%ebp),%eax
0xc16101e3 <copy_oldmem_page+155>:  mov    %eax,0x8(%esp)
0xc16101e7 <copy_oldmem_page+159>:  mov    0xc19d4004,%eax
0xc16101ec <copy_oldmem_page+164>:  add    %eax,0x14(%ebp)
0xc16101ef <copy_oldmem_page+167>:  mov    0x14(%ebp),%eax
0xc16101f2 <copy_oldmem_page+170>:  mov    %eax,0x4(%esp)
0xc16101f6 <copy_oldmem_page+174>:  mov    0xc(%ebp),%eax
0xc16101f9 <copy_oldmem_page+177>:  mov    %eax,(%esp)
0xc16101fc <copy_oldmem_page+180>:  call   0xc1700e98 <copy_to_user>
0xc1610201 <copy_oldmem_page+185>:  mov    $0xfffffff2,%edx
0xc1610206 <copy_oldmem_page+190>:  test   %eax,%eax
0xc1610208 <copy_oldmem_page+192>:  jne    0xc161020d <copy_oldmem_page+197>
0xc161020a <copy_oldmem_page+194>:  mov    0x10(%ebp),%edx
0xc161020d <copy_oldmem_page+197>:  mov    %edx,%eax
0xc161020f <copy_oldmem_page+199>:  add    $0xc,%esp
0xc1610212 <copy_oldmem_page+202>:  pop    %ebx
0xc1610213 <copy_oldmem_page+203>:  pop    %esi
0xc1610214 <copy_oldmem_page+204>:  pop    %edi
0xc1610215 <copy_oldmem_page+205>:  pop    %ebp
0xc1610216 <copy_oldmem_page+206>:  ret
End of assembler dump.

> Not sure from where this faulting address 0xf7ebf8c4 is coming. We are still
> in vmcore_init(), so we should be copying the data to kernel buffers only.
> This looks like a valid kernel address.
>
> Can you also put some printk() here to find out from where 0xf7ebf8c4 has
> come? It does not look like a fixed kernel virtual address returned by
> kmap_atomic_pfn(). Then is it passed by kernel as a parameter to
> copy_oldmem_page()?
>
> Thanks
> Vivek
>

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
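The register dump and the disassembly posted in this thread can be cross-checked with a little arithmetic. The sketch below is a hedged reading of those numbers, not a confirmed diagnosis: the `sim_*` helpers are hypothetical, the input values (pfn 0x37ebf, offset 0x8c4, csize 0x400) are read off the oops stack dump, and PAGE_OFFSET = 0xC0000000 is an assumption (the default i386 3G/1G split; the config is linked but not quoted in the thread).

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PAGE_SHIFT  12u		/* 4 KiB pages on i386 */
#define SIM_PAGE_OFFSET 0xC0000000u	/* assumed default 3G/1G split */

/* Direct-map (lowmem) virtual address of a page frame number. */
uint32_t sim_lowmem_vaddr(uint32_t pfn)
{
	return SIM_PAGE_OFFSET + (pfn << SIM_PAGE_SHIFT);
}

/*
 * Source pointer of the copy, as formed by
 * "lea (%ebx,%eax,1),%esi": mapped page base plus in-page offset.
 */
uint32_t sim_copy_source(uint32_t vaddr, uint32_t offset)
{
	return vaddr + offset;
}

/* Dword count for "rep movsl", as formed by "shr $0x2,%ecx". */
uint32_t sim_dword_count(uint32_t csize)
{
	return csize >> 2;
}
```

Under these assumptions, `sim_lowmem_vaddr(0x37ebf)` gives 0xf7ebf000 (the value in ebx after the call to kmap), adding the offset 0x8c4 gives the faulting address 0xf7ebf8c4 (esi), and 0x400 bytes become 0x100 dwords (ecx). That would be consistent with the source pointer being the direct-map address of the old kernel's page rather than a fixmap slot, and the disassembly does show a call to kmap rather than kmap_atomic_pfn(). Whether that is the actual cause is not settled in this thread.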
* Re: 2.6.21-rc4-rt0-kdump
  2007-03-23  5:25       ` Vivek Goyal
  2007-03-23  8:23         ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
@ 2007-03-23 12:10         ` Michal Piotrowski
  1 sibling, 0 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23 12:10 UTC
To: vgoyal; +Cc: Michal Piotrowski, Ingo Molnar, Thomas Gleixner, LKML

Vivek Goyal wrote:
> On Thu, Mar 22, 2007 at 02:27:25PM +0100, Michal Piotrowski wrote:
>> Michal Piotrowski wrote:
>>> On 22/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>>>> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>>>>
>>>>> Hi Ingo,
>>>>> 2.6.21-rc4-rt0
>>>>> BUG: at kernel/fork.c:1033 copy_process()
>>>> thanks Michal - this is a real bug that affects upstream too. Find the
>>>> fix below - i've test-booted it and it fixes the warning.
>>> Problem is fixed, thanks.
>> BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>>
>> BUG: unable to handle kernel paging request at virtual address f7ebf8c4
>>  printing eip:
>> c1610192
>> *pde = 00000000
>> stopped custom tracer.
>> Oops: 0000 [#1]
>> PREEMPT
>> Modules linked in:
>> CPU:    0
>> EIP:    0060:[<c1610192>]    Not tainted VLI
>> EFLAGS: 00010206   (2.6.21-rc4-rt0-kdump #3)
>> EIP is at copy_oldmem_page+0x4a/0xd0
>> eax: 000008c4   ebx: f7ebf000   ecx: 00000100   edx: 00000246
>> esi: f7ebf8c4   edi: c4c520fc   ebp: c4d54e30   esp: c4d54e18
>> ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068  preempt:00000001
>> Process swapper (pid: 1, ti=c4d54000 task=c4d52c20 task.ti=c4d54000)
>> Stack: c17ab7e0 c183f982 c1969658 00000400 00000400 00037ebf c4d54e5c c16af187
>>        00037ebf c4c520fc 00000400 000008c4 00000000 00000000 c4c696e0 00000400
>>        c4c520fc c4d54f94 c19a9cfd c4c520fc 00000400 c4d54f78 00000000 c1840996
>> Call Trace:
>>  [<c16af187>] read_from_oldmem+0x73/0x98
>>  [<c19a9cfd>] vmcore_init+0x26c/0xab7
>>  [<c199979b>] init+0xaa/0x287
>>  [<c16044eb>] kernel_thread_helper+0x7/0x10
>>  =======================
>>
>> l *copy_oldmem_page+0x4a/0xd0
>> 0xc1610148 is in copy_oldmem_page (arch/i386/kernel/crash_dump.c:35).
>> 30	 * copying the data to a pre-allocated kernel page and then copying to user
>> 31	 * space in non-atomic context.
>> 32	 */
>> 33	ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> 34	                               size_t csize, unsigned long offset, int userbuf)
>> 35	{
>> 36		void  *vaddr;
>> 37
>> 38		if (!csize)
>> 39			return 0;
>>
>
> Can you please paste the disassembly of copy_oldmem_page() on your system.
> Not sure from where this faulting address 0xf7ebf8c4 is coming. We are still
> in vmcore_init(), so we should be copying the data to kernel buffers only.
> This looks like a valid kernel address.
>
> Can you also put some printk() here to find out from where 0xf7ebf8c4 has
> come? It does not look like a fixed kernel virtual address returned by
> kmap_atomic_pfn(). Then is it passed by kernel as a parameter to
> copy_oldmem_page()?

I added

	printk(KERN_WARNING "copy_oldmem_page() pfn=%lu , buf=%s , nr_bytes=%d , offset=%lu , userbuf=%d\n",
	       pfn, buf, nr_bytes, offset, userbuf);

before

	tmp = copy_oldmem_page(pfn, buf, nr_bytes, offset, userbuf);

The result is here:

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/kdump-console2.log

'buf' might be broken.

>
> Thanks
> Vivek
>

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
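One thing worth noting about the debug printk() above: it prints `buf` with `%s`, which interprets the destination buffer's current contents as a NUL-terminated string, so odd-looking output for `buf` may say nothing about whether the pointer itself is valid. It also prints `nr_bytes` with `%d`, which does not match if that variable is a `size_t` (as the `csize` parameter of copy_oldmem_page() suggests). A minimal userspace sketch of the same trace line with pointer and size_t conversions follows; `sim_format_copy_args()` is a hypothetical helper, and snprintf() stands in for printk(), which accepts the same `%p` and `%zu` conversions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats the arguments of a copy_oldmem_page()-style call into 'out'.
 * The buffer is printed as an address (%p), never dereferenced, and the
 * byte count uses %zu because it is a size_t.
 * Returns what snprintf() returns (length that was or would be written).
 */
int sim_format_copy_args(char *out, size_t outlen,
			 unsigned long pfn, const char *buf,
			 size_t nr_bytes, unsigned long offset, int userbuf)
{
	return snprintf(out, outlen,
			"pfn=%#lx buf=%p nr_bytes=%zu offset=%#lx userbuf=%d",
			pfn, (const void *)buf, nr_bytes, offset, userbuf);
}
```

Printing pfn and offset in hex is a presentational choice made here so the values can be compared directly with the oops register and stack dump.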
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
  2007-03-22 13:27     ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
  2007-03-23  5:25       ` Vivek Goyal
@ 2007-03-23  7:15       ` Ingo Molnar
  2007-03-23  7:58         ` Michal Piotrowski
  1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2007-03-23  7:15 UTC
To: Michal Piotrowski; +Cc: Thomas Gleixner, LKML

* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:

> >> > BUG: at kernel/fork.c:1033 copy_process()
> >>
> >> thanks Michal - this is a real bug that affects upstream too. Find
> >> the fix below - i've test-booted it and it fixes the warning.
> >
> > Problem is fixed, thanks.
>
> BTW. It seems that nobody uses -rt as a crash dump kernel ;)

it's been tested with v2.6.20-rt8, and it should work as long as you
enable CONFIG_RELOCATABLE. But i'm not using it myself, and
v2.6.21-rc4-rt0 isnt a particularly encouraging version string for
people to try ;)

> Hibernation is still broken.
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config

what's the failure mode besides the lockdep + other debug messages - it
doesnt resume? Your log seems to have at least one sequence of resume
related messages - those seem to have worked fine.

	Ingo
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
  2007-03-23  7:15       ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
@ 2007-03-23  7:58         ` Michal Piotrowski
  2007-03-23  8:02           ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23  7:58 UTC
To: Ingo Molnar; +Cc: Thomas Gleixner, LKML

On 23/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > >> > BUG: at kernel/fork.c:1033 copy_process()
> > >>
> > >> thanks Michal - this is a real bug that affects upstream too. Find
> > >> the fix below - i've test-booted it and it fixes the warning.
> > >
> > > Problem is fixed, thanks.
> >
> > BTW. It seems that nobody uses -rt as a crash dump kernel ;)
>
> it's been tested with v2.6.20-rt8, and it should work as long as you
> enable CONFIG_RELOCATABLE. But i'm not using it myself, and
> v2.6.21-rc4-rt0 isnt a particularly encouraging version string for
> people to try ;)
>
> > Hibernation is still broken.
> >
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
>
> what's the failure mode besides the lockdep + other debug messages - it
> doesnt resume? Your log seems to have at least one sequence of resume
> related messages - those seem to have worked fine.

Kernel has crashed after

Read 497936 kbytes in 23.09 seconds (21.56 MB/s)
swsusp: Reading resume file was successful
PM: Preparing devices for restore.
Suspending console(s)

>
>         Ingo
>

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
  2007-03-23  7:58         ` Michal Piotrowski
@ 2007-03-23  8:02           ` Ingo Molnar
  2007-03-23  8:17             ` Michal Piotrowski
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2007-03-23  8:02 UTC
To: Michal Piotrowski; +Cc: Thomas Gleixner, LKML

* Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:

> >> Hibernation is still broken.
> >>
> >> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> >> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
> >
> >what's the failure mode besides the lockdep + other debug messages -
> >it doesnt resume? Your log seems to have at least one sequence of
> >resume related messages - those seem to have worked fine.
>
> Kernel has crashed after

crashed == 'hung hard' or 'spontaneous reboot' or 'other'?

> Read 497936 kbytes in 23.09 seconds (21.56 MB/s)
> swsusp: Reading resume file was successful
> PM: Preparing devices for restore.
> Suspending console(s)

i havent used sw-suspend for a while, so here's a stupid question: why
does it try to suspend the consoles in the resume path? I assume the
messages above mean that we are already in the resume path?

	Ingo
* Re: 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix)
  2007-03-23  8:02           ` Ingo Molnar
@ 2007-03-23  8:17             ` Michal Piotrowski
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Piotrowski @ 2007-03-23  8:17 UTC
To: Ingo Molnar; +Cc: Thomas Gleixner, LKML

On 23/03/07, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
>
> > >> Hibernation is still broken.
> > >>
> > >> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/console.log
> > >> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc4-rt0/rt-config
> > >
> > >what's the failure mode besides the lockdep + other debug messages -
> > >it doesnt resume? Your log seems to have at least one sequence of
> > >resume related messages - those seem to have worked fine.
> >
> > Kernel has crashed after
>
> crashed == 'hung hard' or 'spontaneous reboot' or 'other'?

It means 'nothing has happened for five minutes'.

>
> > Read 497936 kbytes in 23.09 seconds (21.56 MB/s)
> > swsusp: Reading resume file was successful
> > PM: Preparing devices for restore.
> > Suspending console(s)
>
> i havent used sw-suspend for a while, so here's a stupid question: why
> does it try to suspend the consoles in the resume path? I assume the
> messages above mean that we are already in the resume path?

Yes. (I'm not sure, but "Suspending console(s)" might come from PM_DEBUG.)

>
>         Ingo
>

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
end of thread (last message 2007-03-23 12:11 UTC)

Thread overview: 12+ messages
2007-03-21 20:38 2.6.21-rc4-rt0 BUG: at kernel/fork.c:1033 copy_process() Michal Piotrowski
2007-03-22  9:31 ` [patch] setup_boot_APIC_clock() irq-enable fix Ingo Molnar
2007-03-22 10:56   ` Thomas Gleixner
2007-03-22 12:57   ` Michal Piotrowski
2007-03-22 13:27     ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Michal Piotrowski
2007-03-23  5:25       ` Vivek Goyal
2007-03-23  8:23         ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
2007-03-23 12:10         ` 2.6.21-rc4-rt0-kdump Michal Piotrowski
2007-03-23  7:15       ` 2.6.21-rc4-rt0-kdump (was: Re: [patch] setup_boot_APIC_clock() irq-enable fix) Ingo Molnar
2007-03-23  7:58         ` Michal Piotrowski
2007-03-23  8:02           ` Ingo Molnar
2007-03-23  8:17             ` Michal Piotrowski