From: "Haar János" <djani22@netcenter.hu>
To: "Haar János" <djani22@netcenter.hu>
Cc: <linux-xfs@oss.sgi.com>, <linux-kernel@vger.kernel.org>
Subject: Re: xfslogd-spinlock bug?
Date: Sat, 16 Dec 2006 12:19:45 +0100
Message-ID: <000d01c72127$3d7509b0$0400a8c0@dcccs>
In-Reply-To: 00ab01c71e53$942af2f0$0400a8c0@dcccs

Hi,

I have some news.

I don't know whether there is a connection between the 2 messages, but I can
see that the spinlock bug always occurs on CPU #3.

Does anybody have any idea?

Thanks,
Janos

Dec 16 11:42:48 dy-base BUG: warning at mm/truncate.c:398/invalidate_inode_pages2_range()
Dec 16 11:42:49 dy-base
Dec 16 11:42:49 dy-base Call Trace:
Dec 16 11:42:49 dy-base [<ffffffff8025b8a2>] invalidate_inode_pages2_range+0x285/0x2b8
Dec 16 11:42:49 dy-base [<ffffffff805e7ddd>] _spin_unlock+0x9/0xb
Dec 16 11:42:49 dy-base [<ffffffff8031b98d>] nfs_sync_inode_wait+0x1ab/0x1bd
Dec 16 11:42:49 dy-base [<ffffffff8025b8e4>] invalidate_inode_pages2+0xf/0x11
Dec 16 11:42:49 dy-base [<ffffffff80314859>] nfs_revalidate_mapping+0xa0/0x152
Dec 16 11:42:49 dy-base [<ffffffff80312a07>] nfs_file_read+0x6e/0xbe
Dec 16 11:42:49 dy-base [<ffffffff80273f99>] do_sync_read+0xe2/0x126
Dec 16 11:42:49 dy-base [<ffffffff805e7ffc>] unlock_kernel+0x35/0x37
Dec 16 11:42:49 dy-base [<ffffffff8023c804>] autoremove_wake_function+0x0/0x38
Dec 16 11:42:49 dy-base [<ffffffff802729e7>] nameidata_to_filp+0x2d/0x3e
Dec 16 11:42:49 dy-base [<ffffffff8026fd15>] poison_obj+0x27/0x32
Dec 16 11:42:49 dy-base [<ffffffff8026fee6>] cache_free_debugcheck+0x1c6/0x1d6
Dec 16 11:42:49 dy-base [<ffffffff8027cb26>] putname+0x37/0x39
Dec 16 11:42:49 dy-base [<ffffffff80274849>] vfs_read+0xcc/0x172
Dec 16 11:42:49 dy-base [<ffffffff80274d3e>] sys_pread64+0x55/0x76
Dec 16 11:42:49 dy-base [<ffffffff802098ee>] system_call+0x7e/0x83
Dec 16 11:42:49 dy-base
....
Dec 16 12:08:36 dy-base BUG: spinlock bad magic on CPU#3, xfslogd/3/317
Dec 16 12:08:36 dy-base general protection fault: 0000 [1]
Dec 16 12:08:36 dy-base SMP
Dec 16 12:08:36 dy-base
Dec 16 12:08:36 dy-base CPU 3
Dec 16 12:08:36 dy-base
Dec 16 12:08:36 dy-base Modules linked in:
Dec 16 12:08:36 dy-base nbd
Dec 16 12:08:36 dy-base rd
Dec 16 12:08:36 dy-base netconsole
Dec 16 12:08:36 dy-base e1000
Dec 16 12:08:36 dy-base video
Dec 16 12:08:36 dy-base
Dec 16 12:08:36 dy-base Pid: 317, comm: xfslogd/3 Not tainted 2.6.19 #1
Dec 16 12:08:36 dy-base RIP: 0010:[<ffffffff803f3aba>]
Dec 16 12:08:36 dy-base [<ffffffff803f3aba>] spin_bug+0x69/0xdf
Dec 16 12:08:36 dy-base RSP: 0018:ffff81011fdedbc0 EFLAGS: 00010002
Dec 16 12:08:36 dy-base RAX: 0000000000000033 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
Dec 16 12:08:36 dy-base RDX: ffffffff807f3be2 RSI: 0000000000000082 RDI: 0000000100000000
Dec 16 12:08:36 dy-base RBP: ffff81011fdedbe0 R08: 0000000000006eb2 R09: 000000006b6b6b6b
Dec 16 12:08:36 dy-base R10: 0000000000000082 R11: ffff81000584d280 R12: ffff810081476098
Dec 16 12:08:36 dy-base R13: ffffffff80642dc6 R14: 0000000000000000 R15: 0000000000000003
Dec 16 12:08:36 dy-base FS: 0000000000000000(0000) GS:ffff81011fc76b90(0000) knlGS:0000000000000000
Dec 16 12:08:36 dy-base CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Dec 16 12:08:36 dy-base CR2: 00002afc7d3ea000 CR3: 0000000117afc000 CR4: 00000000000006e0
Dec 16 12:08:36 dy-base Process xfslogd/3 (pid: 317, threadinfo ffff81011fdec000, task ffff81011fafc140)
Dec 16 12:08:36 dy-base Stack:
Dec 16 12:08:36 dy-base ffff81011fdedbe0
Dec 16 12:08:36 dy-base ffff810081476098
Dec 16 12:08:36 dy-base 0000000000000000
Dec 16 12:08:36 dy-base 0000000000000000
Dec 16 12:08:36 dy-base
Dec 16 12:08:36 dy-base ffff81011fdedc10
Dec 16 12:08:36 dy-base ffffffff803f3bdc
Dec 16 12:08:36 dy-base 0000000000000282
Dec 16 12:08:36 dy-base 0000000000000000
Dec 16 12:08:36 dy-base
Dec 16 12:08:36 dy-base 0000000000000000
Dec 16 12:08:36 dy-base 0000000000000000
Dec 16 12:08:36 dy-base ffff81011fdedc30
Dec 16 12:08:36 dy-base ffffffff805e7f2b
Dec 16 12:08:36 dy-base
Dec 16 12:08:36 dy-base Call Trace:
Dec 16 12:08:36 dy-base [<ffffffff803f3bdc>] _raw_spin_lock+0x23/0xf1
Dec 16 12:08:36 dy-base [<ffffffff805e7f2b>] _spin_lock_irqsave+0x11/0x18
Dec 16 12:08:36 dy-base [<ffffffff80222aab>] __wake_up+0x22/0x50
Dec 16 12:08:36 dy-base [<ffffffff803c97f9>] xfs_buf_unpin+0x21/0x23
Dec 16 12:08:36 dy-base [<ffffffff803970a4>] xfs_buf_item_unpin+0x2e/0xa6
Dec 16 12:08:36 dy-base [<ffffffff803bc460>] xfs_trans_chunk_committed+0xc3/0xf7
Dec 16 12:08:37 dy-base [<ffffffff803bc4dd>] xfs_trans_committed+0x49/0xde
Dec 16 12:08:37 dy-base [<ffffffff803b1bde>] xlog_state_do_callback+0x185/0x33f
Dec 16 12:08:37 dy-base [<ffffffff803b1e9c>] xlog_iodone+0x104/0x131
Dec 16 12:08:37 dy-base [<ffffffff803c9dae>] xfs_buf_iodone_work+0x1a/0x3e
Dec 16 12:08:37 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
Dec 16 12:08:37 dy-base [<ffffffff8023937e>] run_workqueue+0xa8/0xf8
Dec 16 12:08:37 dy-base [<ffffffff803c9d94>] xfs_buf_iodone_work+0x0/0x3e
Dec 16 12:08:37 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
Dec 16 12:08:37 dy-base [<ffffffff80239ad3>] worker_thread+0xfb/0x134
Dec 16 12:08:37 dy-base [<ffffffff80223f6c>] default_wake_function+0x0/0xf
Dec 16 12:08:37 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
Dec 16 12:08:37 dy-base [<ffffffff8023c6e5>] kthread+0xd8/0x10b
Dec 16 12:08:37 dy-base [<ffffffff802256ac>] schedule_tail+0x45/0xa6
Dec 16 12:08:37 dy-base [<ffffffff8020a6a8>] child_rip+0xa/0x12
Dec 16 12:08:37 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
Dec 16 12:08:37 dy-base [<ffffffff8023c60d>] kthread+0x0/0x10b
Dec 16 12:08:38 dy-base [<ffffffff8020a69e>] child_rip+0x0/0x12
Dec 16 12:08:38 dy-base
Dec 16 12:08:38 dy-base
Dec 16 12:08:38 dy-base Code:
Dec 16 12:08:38 dy-base 8b
Dec 16 12:08:38 dy-base 83
Dec 16 12:08:38 dy-base 0c
Dec 16 12:08:38 dy-base 01
Dec 16 12:08:38 dy-base 00
Dec 16 12:08:38 dy-base 00
Dec 16 12:08:38 dy-base 48
Dec 16 12:08:38 dy-base 8d
Dec 16 12:08:38 dy-base 8b
Dec 16 12:08:38 dy-base 98
Dec 16 12:08:38 dy-base 02
Dec 16 12:08:38 dy-base 00
Dec 16 12:08:38 dy-base 00
Dec 16 12:08:38 dy-base 41
Dec 16 12:08:38 dy-base 8b
Dec 16 12:08:38 dy-base 54
Dec 16 12:08:38 dy-base 24
Dec 16 12:08:38 dy-base 04
Dec 16 12:08:38 dy-base 41
Dec 16 12:08:38 dy-base 89
Dec 16 12:08:38 dy-base
Dec 16 12:08:38 dy-base RIP
Dec 16 12:08:38 dy-base [<ffffffff803f3aba>] spin_bug+0x69/0xdf
Dec 16 12:08:38 dy-base RSP <ffff81011fdedbc0>
Dec 16 12:08:38 dy-base
Dec 16 12:08:38 dy-base Kernel panic - not syncing: Fatal exception
Dec 16 12:08:38 dy-base
Dec 16 12:08:38 dy-base Rebooting in 5 seconds..

----- Original Message -----
From: "Haar János" <djani22@netcenter.hu>
To: "Justin Piszcz" <jpiszcz@lucidpixels.com>
Cc: <linux-xfs@oss.sgi.com>; <linux-kernel@vger.kernel.org>
Sent: Wednesday, December 13, 2006 2:11 AM
Subject: Re: xfslogd-spinlock bug?

> Hello, Justin,
>
> This is a 64-bit system.
>
> But I cannot understand: what is curious about it? :-)
>
> I am not a kernel developer, and not a C programmer, but the long pointers
> tell me it is 64-bit.
> Or am I on the wrong track? :-)
>
> Anyway, this issue happens for me about daily, or at most every 2-3 days.
> But I have no idea what exactly causes it.
> 2.6.16.18 was stable for me for a long time, and then one day it started
> playing tricks on me, and it happens more and more often.
> That's why I suspect some bad part of the (14TB) FS on the disks.
>
> (This fs has a lot of errors that xfs_repair cannot correct, but
> anyway this is a production system, and it works well, except for this issue.
> Some months ago I replaced the parity disk in one of the RAID4
> arrays, and the next day, during the resync process, another disk died.
> I almost lost everything, but thanks to RAID4 and mdadm, I
> successfully recovered a lot of data with the 1-day-older parity-only drive.
> This was really bad, and it left some scars on the fs.)
>
> This issue looks to me like a race condition between the CPUs. (2x Xeon HT)
>
> Am I right? :-)
>
> Thanks,
>
> Janos
>
>
> ----- Original Message -----
> From: "Justin Piszcz" <jpiszcz@lucidpixels.com>
> To: "Haar János" <djani22@netcenter.hu>
> Cc: <linux-xfs@oss.sgi.com>; <linux-kernel@vger.kernel.org>
> Sent: Tuesday, December 12, 2006 3:32 PM
> Subject: Re: xfslogd-spinlock bug?
>
>
> I'm not sure what is causing this problem, but I was curious: is this on a
> 32-bit or 64-bit platform?
>
> Justin.
>
> On Tue, 12 Dec 2006, Haar János wrote:
>
> > Hello, list,
> >
> > I am the "big red button man" with the one big 14TB xfs, if anybody
> > remembers me. :-)
> >
> > Now I found something in 2.6.16.18, and tried 2.6.18.4 and
> > 2.6.19, but the bug still exists:
> >
> > Dec 11 22:47:21 dy-base BUG: spinlock bad magic on CPU#3, xfslogd/3/317
> > Dec 11 22:47:21 dy-base general protection fault: 0000 [1]
> > Dec 11 22:47:21 dy-base SMP
> > Dec 11 22:47:21 dy-base
> > Dec 11 22:47:21 dy-base CPU 3
> > Dec 11 22:47:21 dy-base
> > Dec 11 22:47:21 dy-base Modules linked in:
> > Dec 11 22:47:21 dy-base nbd
> > Dec 11 22:47:21 dy-base rd
> > Dec 11 22:47:21 dy-base netconsole
> > Dec 11 22:47:21 dy-base e1000
> > Dec 11 22:47:21 dy-base video
> > Dec 11 22:47:21 dy-base
> > Dec 11 22:47:21 dy-base Pid: 317, comm: xfslogd/3 Not tainted 2.6.19 #1
> > Dec 11 22:47:21 dy-base RIP: 0010:[<ffffffff803f3aba>]
> > Dec 11 22:47:21 dy-base [<ffffffff803f3aba>] spin_bug+0x69/0xdf
> > Dec 11 22:47:21 dy-base RSP: 0018:ffff81011fb89bc0 EFLAGS: 00010002
> > Dec 11 22:47:21 dy-base RAX: 0000000000000033 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
> > Dec 11 22:47:21 dy-base RDX: ffffffff808137a0 RSI: 0000000000000082 RDI: 0000000100000000
> > Dec 11 22:47:21 dy-base RBP: ffff81011fb89be0 R08: 0000000000026a70 R09: 000000006b6b6b6b
> > Dec 11 22:47:21 dy-base R10: 0000000000000082 R11: ffff81000584d380 R12: ffff8100db92ad80
> > Dec 11 22:47:21 dy-base R13: ffffffff80642dc6 R14: 0000000000000000 R15: 0000000000000003
> > Dec 11 22:47:21 dy-base FS: 0000000000000000(0000) GS:ffff81011fc76b90(0000) knlGS:0000000000000000
> > Dec 11 22:47:21 dy-base CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > Dec 11 22:47:21 dy-base CR2: 00002ba007700000 CR3: 0000000108c05000 CR4: 00000000000006e0
> > Dec 11 22:47:21 dy-base Process xfslogd/3 (pid: 317, threadinfo ffff81011fb88000, task ffff81011fa7f830)
> > Dec 11 22:47:21 dy-base Stack:
> > Dec 11 22:47:21 dy-base ffff81011fb89be0
> > Dec 11 22:47:21 dy-base ffff8100db92ad80
> > Dec 11 22:47:21 dy-base 0000000000000000
> > Dec 11 22:47:21 dy-base 0000000000000000
> > Dec 11 22:47:21 dy-base
> > Dec 11 22:47:21 dy-base ffff81011fb89c10
> > Dec 11 22:47:21 dy-base ffffffff803f3bdc
> > Dec 11 22:47:21 dy-base 0000000000000282
> > Dec 11 22:47:21 dy-base 0000000000000000
> > Dec 11 22:47:21 dy-base
> > Dec 11 22:47:21 dy-base 0000000000000000
> > Dec 11 22:47:21 dy-base 0000000000000000
> > Dec 11 22:47:21 dy-base ffff81011fb89c30
> > Dec 11 22:47:21 dy-base ffffffff805e7f2b
> > Dec 11 22:47:21 dy-base
> > Dec 11 22:47:21 dy-base Call Trace:
> > Dec 11 22:47:21 dy-base [<ffffffff803f3bdc>] _raw_spin_lock+0x23/0xf1
> > Dec 11 22:47:21 dy-base [<ffffffff805e7f2b>] _spin_lock_irqsave+0x11/0x18
> > Dec 11 22:47:21 dy-base [<ffffffff80222aab>] __wake_up+0x22/0x50
> > Dec 11 22:47:21 dy-base [<ffffffff803c97f9>] xfs_buf_unpin+0x21/0x23
> > Dec 11 22:47:21 dy-base [<ffffffff803970a4>] xfs_buf_item_unpin+0x2e/0xa6
> > Dec 11 22:47:21 dy-base [<ffffffff803bc460>] xfs_trans_chunk_committed+0xc3/0xf7
> > Dec 11 22:47:21 dy-base [<ffffffff803bc4dd>] xfs_trans_committed+0x49/0xde
> > Dec 11 22:47:21 dy-base [<ffffffff803b1bde>] xlog_state_do_callback+0x185/0x33f
> > Dec 11 22:47:21 dy-base [<ffffffff803b1e9c>] xlog_iodone+0x104/0x131
> > Dec 11 22:47:22 dy-base [<ffffffff803c9dae>] xfs_buf_iodone_work+0x1a/0x3e
> > Dec 11 22:47:22 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
> > Dec 11 22:47:22 dy-base [<ffffffff8023937e>] run_workqueue+0xa8/0xf8
> > Dec 11 22:47:22 dy-base [<ffffffff803c9d94>] xfs_buf_iodone_work+0x0/0x3e
> > Dec 11 22:47:22 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
> > Dec 11 22:47:22 dy-base [<ffffffff80239ad3>] worker_thread+0xfb/0x134
> > Dec 11 22:47:22 dy-base [<ffffffff80223f6c>] default_wake_function+0x0/0xf
> > Dec 11 22:47:22 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
> > Dec 11 22:47:22 dy-base [<ffffffff8023c6e5>] kthread+0xd8/0x10b
> > Dec 11 22:47:22 dy-base [<ffffffff802256ac>] schedule_tail+0x45/0xa6
> > Dec 11 22:47:22 dy-base [<ffffffff8020a6a8>] child_rip+0xa/0x12
> > Dec 11 22:47:22 dy-base [<ffffffff802399d8>] worker_thread+0x0/0x134
> > Dec 11 22:47:22 dy-base [<ffffffff8023c60d>] kthread+0x0/0x10b
> > Dec 11 22:47:22 dy-base [<ffffffff8020a69e>] child_rip+0x0/0x12
> > Dec 11 22:47:22 dy-base
> > Dec 11 22:47:22 dy-base
> > Dec 11 22:47:22 dy-base Code:
> > Dec 11 22:47:22 dy-base 8b
> > Dec 11 22:47:22 dy-base 83
> > Dec 11 22:47:22 dy-base 0c
> > Dec 11 22:47:22 dy-base 01
> > Dec 11 22:47:22 dy-base 00
> > Dec 11 22:47:22 dy-base 00
> > Dec 11 22:47:22 dy-base 48
> > Dec 11 22:47:22 dy-base 8d
> > Dec 11 22:47:22 dy-base 8b
> > Dec 11 22:47:22 dy-base 98
> > Dec 11 22:47:22 dy-base 02
> > Dec 11 22:47:22 dy-base 00
> > Dec 11 22:47:22 dy-base 00
> > Dec 11 22:47:22 dy-base 41
> > Dec 11 22:47:22 dy-base 8b
> > Dec 11 22:47:22 dy-base 54
> > Dec 11 22:47:22 dy-base 24
> > Dec 11 22:47:22 dy-base 04
> > Dec 11 22:47:22 dy-base 41
> > Dec 11 22:47:22 dy-base 89
> > Dec 11 22:47:22 dy-base
> > Dec 11 22:47:22 dy-base RIP
> > Dec 11 22:47:22 dy-base [<ffffffff803f3aba>] spin_bug+0x69/0xdf
> > Dec 11 22:47:22 dy-base RSP <ffff81011fb89bc0>
> > Dec 11 22:47:22 dy-base
> > Dec 11 22:47:22 dy-base Kernel panic - not syncing: Fatal exception
> > Dec 11 22:47:22 dy-base
> > Dec 11 22:47:22 dy-base Rebooting in 5 seconds..
> >
> > After this, sometimes the server reboots normally, but sometimes it hangs:
> > no console, no sysrq, nothing.
> >
> > This is a "simple" crash, with not "too much" data lost or anything else.
> >
> > Can somebody help me track down the problem?
> >
> > Thanks,
> > Janos Haar
Thread overview: 17+ messages
2006-12-11 23:00 xfslogd-spinlock bug? Haar János
2006-12-12 14:32 ` Justin Piszcz
2006-12-13  1:11 ` Haar János
2006-12-16 11:19 ` Haar János [this message]
2006-12-17 22:44 ` David Chinner
2006-12-17 23:56 ` Haar János
2006-12-18  6:24 ` David Chinner
2006-12-18  8:17 ` Haar János
2006-12-18 22:36 ` David Chinner
2006-12-18 23:39 ` Haar János
2006-12-19  2:52 ` David Chinner
2006-12-19  4:47 ` David Chinner
2006-12-27 12:58 ` Haar János
2007-01-07 23:14 ` David Chinner
2007-01-10 17:18 ` Janos Haar
2007-01-11  3:34 ` David Chinner
2007-01-11 20:15 ` Janos Haar