LKML Archive on lore.kernel.org
* xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3
@ 2004-05-26 16:13 dag
2004-05-26 20:37 ` Nathan Scott
2004-05-27 5:58 ` Nathan Scott
0 siblings, 2 replies; 7+ messages in thread
From: dag @ 2004-05-26 16:13 UTC (permalink / raw)
To: linux-kernel
I experience hangs with xfsdump, when dumping my rootfs to a USB 2.0
connected drive. The hangs are reproducible within 0.2-2 GB of dump, and
always come together with one or two instances of :
pagebuf_get: failed to lookup pages
I do not know if this is a problem with xfs, ide, scsi, usb, VM or some
other area of the kernel. But it is reproducible with 2.6.6 + a few
select patches, and with plain 2.6.7-rc1-bk3.
I have collected sysrq-t, sysrq-p info. A snippet below.
If none of this explains the hang, maybe the gurus would like to point a
browser at:
http://thaifood.homeip.net/xfsdumphang/xfsdump.dmesg.txt
http://thaifood.homeip.net/xfsdumphang/config-2.6.7-rc1-bk3
http://thaifood.homeip.net/xfsdumphang/lspci.txt
http://thaifood.homeip.net/xfsdumphang/lsusb.txt
xfssyncd S C04F25E0 0 331 1 342 317 (L-TLB)
cfccbf9c 00000046 c1370610 c04f25e0 cfc31d60 c0238bec cfc31d98 c04fecd8
00000031 00000000 cfccbfb0 00002773 37e96cbf 00000210 c13707b8 000a43c5
cfccbfb0 00000000 00000000 c03d2ec3 cfccbfb0 000a43c5 00000000 c048e508
Call Trace:
[<c0238bec>] pagebuf_rele+0x2c/0x120
[<c03d2ec3>] schedule_timeout+0x63/0xc0
[<c0121110>] process_timeout+0x0/0x10
[<c023f5e7>] xfssyncd+0x57/0xc0
[<c023f590>] xfssyncd+0x0/0xc0
[<c0103f4d>] kernel_thread_helper+0x5/0x18
usb-storage S C04F2A88 0 342 1 343 331 (L-TLB)
cfc09f4c 00000046 c13ff0d0 c04f2a88 0000020f 3ccbf196 00000000 c58bfcea
c58c004c 0000020f c13ff0d0 0000012d c58c004c 0000020f c1370238 c13b0f04
00000246 cfc08000 c1370090 c03d24c7 cfc08000 c13b0f0c 00000000 00000001
Call Trace:
[<c03d24c7>] __down_interruptible+0xa7/0x140
[<c0115e60>] default_wake_function+0x0/0x20
[<c011555d>] wake_up_process+0x1d/0x30
[<c03d2573>] __down_failed_interruptible+0x7/0xc
[<c032eead>] .text.lock.usb+0x5/0x58
[<c01158f7>] schedule_tail+0x17/0x50
[<c0105c82>] ret_from_fork+0x6/0x14
[<c032e150>] usb_stor_control_thread+0x0/0x280
[<c032e150>] usb_stor_control_thread+0x0/0x280
[<c0103f4d>] kernel_thread_helper+0x5/0x18
scsi_eh_0 S C04F25E0 0 343 1 485 342 (L-TLB)
cfab7f78 00000046 cfab96b0 c04f25e0 00000000 00000000 00000000 00000000
00000086 cfab7f7c c13ff650 000015c8 2850a2f5 00000184 cfab9858 cfab7fd4
00000246 cfab6000 cfab96b0 c03d24c7 cfab6000 cfab7fdc 00000000 00000001
Call Trace:
[<c03d24c7>] __down_interruptible+0xa7/0x140
[<c0115e60>] default_wake_function+0x0/0x20
[<c03d2573>] __down_failed_interruptible+0x7/0xc
[<c02ddca8>] .text.lock.scsi_error+0x41/0x49
[<c02dd960>] scsi_error_handler+0x0/0x110
[<c0103f4d>] kernel_thread_helper+0x5/0x18
A few more bits of info, as I have no idea where to start *):
- the target filesystem is writeable after xfsdump hangs
- the USB2IDE chip is an ISD-300.
- the USB 2.0 controller is a NEC chip on a CardBus card.
- gcc 3.3, xfsdump 2.2.16
*) Yeah, I can start doing a binary search for a working kernel....
Anyone?
Dag B
* Re: xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3
2004-05-26 16:13 xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3 dag
@ 2004-05-26 20:37 ` Nathan Scott
2004-05-27 5:58 ` Nathan Scott
1 sibling, 0 replies; 7+ messages in thread
From: Nathan Scott @ 2004-05-26 20:37 UTC (permalink / raw)
To: dag; +Cc: linux-kernel, linux-xfs
hi Dag,
On Wed, May 26, 2004 at 09:13:14AM -0700, dag@bakke.com wrote:
>
> I experience hangs with xfsdump, when dumping my rootfs to a USB 2.0
> connected drive. The hangs are reproducible within 0.2-2 GB of dump, and
> always come together with one or two instances of :
>
> pagebuf_get: failed to lookup pages
>
> xfssyncd S C04F25E0 0 331 1 342 317 (L-TLB)
> cfccbf9c 00000046 c1370610 c04f25e0 cfc31d60 c0238bec cfc31d98 c04fecd8
> 00000031 00000000 cfccbfb0 00002773 37e96cbf 00000210 c13707b8 000a43c5
> cfccbfb0 00000000 00000000 c03d2ec3 cfccbfb0 000a43c5 00000000 c048e508
> Call Trace:
> [<c0238bec>] pagebuf_rele+0x2c/0x120
> [<c03d2ec3>] schedule_timeout+0x63/0xc0
> [<c0121110>] process_timeout+0x0/0x10
> [<c023f5e7>] xfssyncd+0x57/0xc0
> [<c023f590>] xfssyncd+0x0/0xc0
> [<c0103f4d>] kernel_thread_helper+0x5/0x18
>
> Anyone?
>
This looks like the result of an earlier error on the code
path at that initial warning there (known problem) - in the
current code there is a situation where we attempt metadata
readahead, cannot initialise an XFS buffer completely due to
low memory, but fail to correctly tear down that partially
created buffer when passing back the (recoverable) error.
We're working on a fix.
cheers.
--
Nathan
* Re: xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3
2004-05-26 16:13 xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3 dag
2004-05-26 20:37 ` Nathan Scott
@ 2004-05-27 5:58 ` Nathan Scott
1 sibling, 0 replies; 7+ messages in thread
From: Nathan Scott @ 2004-05-27 5:58 UTC (permalink / raw)
To: dag; +Cc: linux-kernel, linux-xfs
On Wed, May 26, 2004 at 09:13:14AM -0700, dag@bakke.com wrote:
>
> I experience hangs with xfsdump, when dumping my rootfs to a USB 2.0
> ...
> http://thaifood.homeip.net/xfsdumphang/xfsdump.dmesg.txt
> ...
The xfsdump stack trace in there is the important one.
Can you try this patch and let me know how it goes?
thanks.
--
Nathan
--- fs/xfs/linux/xfs_buf.c.orig 2004-05-27 14:06:59.992936144 +1000
+++ fs/xfs/linux/xfs_buf.c 2004-05-27 14:08:21.548537808 +1000
@@ -370,8 +370,12 @@
 	retry:
 		page = find_or_create_page(mapping, first + i, gfp_mask);
 		if (unlikely(page == NULL)) {
-			if (flags & PBF_READ_AHEAD)
+			if (flags & PBF_READ_AHEAD) {
+				for (--i; i >= 0; i--)
+					page_cache_release(bp->pb_pages[i]);
+				_pagebuf_free_pages(bp);
 				return -ENOMEM;
+			}
 
 			/*
 			 * This could deadlock.
* Re: xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3
2004-05-27 8:09 dag
2004-05-27 8:18 ` Christoph Hellwig
@ 2004-05-27 10:05 ` Nathan Scott
1 sibling, 0 replies; 7+ messages in thread
From: Nathan Scott @ 2004-05-27 10:05 UTC (permalink / raw)
To: dag; +Cc: linux-kernel, linux-xfs
On Thu, May 27, 2004 at 01:09:46AM -0700, dag@bakke.com wrote:
> One failure, one success, one question :-)
> ...
> But: this patch from hch Works For Me:
>
Yep, use that final patch from Christoph; that's got all of
the bases covered.
> The one remaining question is: why does xfsrestore print
> xfsrestore: WARNING: open_by_handle of mnt failed:Bad file descriptor
That's familiar - I can't remember the exact cause anymore,
but I think a more recent xfsdump will solve that for you.
cheers.
--
Nathan
* Re: xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3
2004-05-27 8:09 dag
@ 2004-05-27 8:18 ` Christoph Hellwig
2004-05-27 10:05 ` Nathan Scott
1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2004-05-27 8:18 UTC (permalink / raw)
To: dag; +Cc: nathans, linux-kernel, linux-xfs
My patch still wasn't complete: you're still leaking pages, just not
locked ones. This patch should be better, and I'll check it in in a few
minutes:
--- 1.111/fs/xfs/linux/xfs_buf.c 2004-04-28 06:45:14 +02:00
+++ edited/fs/xfs/linux/xfs_buf.c 2004-05-27 08:38:46 +02:00
@@ -359,6 +359,7 @@
 	error = _pagebuf_get_pages(bp, page_count, flags);
 	if (unlikely(error))
 		return error;
+	bp->pb_flags |= _PBF_PAGE_CACHE;
 
 	offset = bp->pb_offset;
 	first = bp->pb_file_offset >> PAGE_CACHE_SHIFT;
@@ -370,8 +371,12 @@
 	retry:
 		page = find_or_create_page(mapping, first + i, gfp_mask);
 		if (unlikely(page == NULL)) {
-			if (flags & PBF_READ_AHEAD)
+			if (flags & PBF_READ_AHEAD) {
+				bp->pb_page_count = i;
+				for (i = 0; i < bp->pb_page_count; i++)
+					unlock_page(bp->pb_pages[i]);
 				return -ENOMEM;
+			}
 
 			/*
 			 * This could deadlock.
@@ -426,8 +431,6 @@
 		for (i = 0; i < bp->pb_page_count; i++)
 			unlock_page(bp->pb_pages[i]);
 	}
-
-	bp->pb_flags |= _PBF_PAGE_CACHE;
 
 	if (page_count) {
 		/* if we have any uptodate pages, mark that in the buffer */
* Re: xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3
@ 2004-05-27 8:09 dag
2004-05-27 8:18 ` Christoph Hellwig
2004-05-27 10:05 ` Nathan Scott
0 siblings, 2 replies; 7+ messages in thread
From: dag @ 2004-05-27 8:09 UTC (permalink / raw)
To: nathans; +Cc: linux-kernel, linux-xfs
One failure, one success, one question :-)
On Thu, 27 May 2004 15:58:29 +1000, Nathan Scott wrote:
>
> On Wed, May 26, 2004 at 09:13:14AM -0700, dag@bakke.com wrote:
> >
> > I experience hangs with xfsdump, when dumping my rootfs to a USB 2.0
> > ...
> > http://thaifood.homeip.net/xfsdumphang/xfsdump.dmesg.txt
> > ...
>
> The xfsdump stack trace in there is the important one.
> Can you try this patch and let me know how it goes?
>
> --- fs/xfs/linux/xfs_buf.c.orig 2004-05-27 14:06:59.992936144 +1000
> +++ fs/xfs/linux/xfs_buf.c 2004-05-27 14:08:21.548537808 +1000
> @@ -370,8 +370,12 @@
>  	retry:
>  		page = find_or_create_page(mapping, first + i, gfp_mask);
>  		if (unlikely(page == NULL)) {
> -			if (flags & PBF_READ_AHEAD)
> +			if (flags & PBF_READ_AHEAD) {
> +				for (--i; i >= 0; i--)
> +					page_cache_release(bp->pb_pages[i]);
> +				_pagebuf_free_pages(bp);
>  				return -ENOMEM;
> +			}
>
>  			/*
>  			 * This could deadlock.
xfsdump completes "successfully", but prematurely. And I get an oops.
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c02381b3
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: 3c589_cs
CPU: 0
EIP: 0060:[<c02381b3>] Not tainted
EFLAGS: 00010206 (2.6.7-rc1-bk3)
EIP is at _pagebuf_lookup_pages+0x2b3/0x2e0
eax: ffffffff ebx: 0000ffff ecx: cba8db84 edx: 00000000
esi: cba8dac0 edi: 000389b4 ebp: 00002000 esp: cefa5cc8
ds: 007b es: 007b ss: 0068
Process xfsdump (pid: 5769, threadinfo=cefa4000 task=cf54b6f0)
Stack: cffa6afc 000389b4 00001200 00000000 cefa4000 00000000 00000000 000389b4
00000002 00001200 00000000 00001000 00000009 cffa6afc cba8dac0 cba8dac0
00019011 00000002 c023867b cba8dac0 00019011 00000000 00002000 00019011
Call Trace:
[<c023867b>] pagebuf_get+0x19b/0x1b0
[<c01f05e1>] xfs_btree_reada_bufs+0x71/0x90
[<c0217668>] xfs_bulkstat+0xd18/0x1000
[<c023c43b>] xfs_ioc_bulkstat+0x10b/0x1f0
[<c0216360>] xfs_bulkstat_one+0x0/0x5f0
[<c023c0bd>] xfs_ioctl+0x68d/0x840
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c022f9f7>] xfs_inactive_free_eofblocks+0x107/0x2d0
[<c01657e1>] dput+0x31/0x220
[<c023acad>] linvfs_ioctl+0x3d/0x70
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c0160d60>] sys_ioctl+0x100/0x270
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c014e1d1>] sys_close+0x61/0xa0
[<c0105d59>] sysenter_past_esp+0x52/0x71
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
Code: 8b 02 f6 c4 08 75 f0 8b 42 04 40 74 14 83 42 04 ff 0f 98 c0
xfsrestore: restore complete: 403 seconds elapsed
xfsrestore: Restore Status: SUCCESS
Filesystem Size Used Avail Use% Mounted on
/dev/hda3 3.3G 2.0G 1.3G 62% /
/dev/scsi/host0/bus0/target0/lun0/part3
9.4G 562M 8.8G 6% /mnt/target
But: this patch from hch Works For Me:
--- 1.111/fs/xfs/linux/xfs_buf.c 2004-04-28 06:45:14 +02:00
+++ edited/fs/xfs/linux/xfs_buf.c 2004-05-26 18:58:14 +02:00
@@ -370,8 +370,12 @@
 	retry:
 		page = find_or_create_page(mapping, first + i, gfp_mask);
 		if (unlikely(page == NULL)) {
-			if (flags & PBF_READ_AHEAD)
+			if (flags & PBF_READ_AHEAD) {
+				bp->pb_page_count = i;
+				for (i = 0; i < bp->pb_page_count; i++)
+					unlock_page(bp->pb_pages[i]);
 				return -ENOMEM;
+			}
 
 			/*
 			 * This could deadlock.
Tested two dumps now, and both complete successfully. And for real.
I have yet to boot on the new root fs, though. :-)
The one remaining question is: why does xfsrestore print
xfsrestore: WARNING: open_by_handle of mnt failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of bin failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of dev/rd failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of dev/ida failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of dev failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of sys failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of tftpboot failed:Bad file descriptor
etc. etc. for what appears to be every directory in the source fs? This
is at the end of the dump, just prior to the
xfsrestore: restore complete: 403 seconds elapsed
xfsrestore: Restore Status: SUCCESS
message.
* Re: xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3
@ 2004-05-26 17:01 dag
0 siblings, 0 replies; 7+ messages in thread
From: dag @ 2004-05-26 17:01 UTC (permalink / raw)
To: linux-kernel
On Wed, 26 May 2004 09:13:14 -0700 (PDT), dag@bakke.com wrote:
>
>
> I experience hangs with xfsdump, when dumping my rootfs to a USB 2.0
> connected drive. The hangs are reproducible within 0.2-2 GB of dump,
Bah... ambiguity...
xfsdump hangs. Not the kernel. So it could quite possibly be a bug in
xfsdump. But the
pagebuf_get: failed to lookup pages
message in syslog makes me think otherwise.
Dag B