LKML Archive on lore.kernel.org
* 2.6.20 kernel hang with USB drive and vfat doing ftruncate
@ 2007-02-16 19:54 Kumar Gala
  2007-02-18 16:10 ` OGAWA Hirofumi
  0 siblings, 1 reply; 19+ messages in thread
From: Kumar Gala @ 2007-02-16 19:54 UTC
To: Linux Kernel list

I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
PPC.  I've got a USB flash drive plugged in and the filesystem on the
drive is vfat.  Running with 64M and no swap.

If I execute a series of large (100M+) ftruncate() calls on the disk,
the kernel will hang and never return.  It seems to be stuck in the
idle loop.

The following is the test program I'm running:

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>	/* exit(), strtoul() */
#include <unistd.h>
#include <errno.h>

void usage (void)
{
	printf ("truncate_test <filename> <size>\n\n");
}

int main(int argc, char *argv[])
{
	int fd;
	int ret = 0;
	unsigned int len;

	if (argc != 3) {
		printf("Invalid number of arguments\n\n");
		usage();
		exit(1);
	}

	/* create (or empty) the file, then extend it to <size> bytes */
	fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
	len = strtoul(argv[2], NULL, 0);

	ret = ftruncate(fd, len);

	if (ret)
		printf ("ftruncate ret = %d %d\n", ret, errno);

	close(fd);

	return ret;
}

I usually run the following twice to get the hang state:

time ./trunc_test bar 100000000 &
time ./trunc_test baz 100000000 &

I was wondering if anyone had any suggestions on what to poke at next
to try and figure out what is going on.

- k
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-16 19:54 2.6.20 kernel hang with USB drive and vfat doing ftruncate Kumar Gala @ 2007-02-18 16:10 ` OGAWA Hirofumi 2007-02-19 21:58 ` Kumar Gala 2007-02-19 22:06 ` Kumar Gala 0 siblings, 2 replies; 19+ messages in thread From: OGAWA Hirofumi @ 2007-02-18 16:10 UTC (permalink / raw) To: Kumar Gala; +Cc: Linux Kernel list Kumar Gala <galak@kernel.crashing.org> writes: > I'm seeing an issue with a stock 2.6.20 kernel running on an embedded > PPC. I've got a usb flash drive plugged in and the filesystem on the > drive is vfat. Running with 64M and no swap. > > If I execute a series of large (100M+) ftruncate() on the disk the > kernel will hang and never return. It seems to be stuck in the idle > loop(). > > The following is the test program I'm running: > > #include <sys/mman.h> > #include <sys/types.h> > #include <sys/stat.h> > #include <fcntl.h> > #include <stdio.h> > #include <unistd.h> > #include <errno.h> > > void usage (void) > { > printf ("truncate_test <filename> <size>\n\n"); > } > > int main(int argc, char *argv[]) > { > int fd, i; > int ret = 0; > unsigned int len; > > if (argc != 3) { > printf("Invalid number of arguments\n\n"); > usage(); > exit(1); > } > > fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU); > len = strtoul(argv[2], NULL, 0); > > ret = ftruncate(fd, len); > > if (ret) > printf ("ftruncate ret = %d %d\n", ret, errno); > > close(fd); > > return ret; > } > > I usually run the following twice to get the hang state: > > time ./trunc_test bar 100000000 & > time ./trunc_test baz 100000000 & > > I was wondering if anyone had any suggestions on what to poke at next > to try and figure out what is going on. Can you check /sys/block/xxx/stat or something to make sure there is no outstanding IO request? It seems to be no response from the lower layer... -- OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> ^ permalink raw reply [flat|nested] 19+ messages in thread
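The check suggested above can be scripted while the machine is still responsive. What follows is a minimal sketch, not something posted in the thread: it assumes the stat file uses the same field order as the debug patch later in this thread prints (reads, read merges, read sectors, read ms, writes, write merges, write sectors, write ms, in flight, ...), so the ninth field is the number of requests currently in flight. The helper names parse_in_flight() and read_in_flight() are hypothetical.

```c
#include <stdio.h>

/*
 * Parse one line in the /sys/block/<dev>/stat layout and return the
 * ninth field (requests currently in flight).  Returns -1 if the line
 * has fewer than nine numeric fields.
 */
static long parse_in_flight(const char *line)
{
	unsigned long long f[9];
	int n;

	n = sscanf(line, "%llu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &f[0], &f[1], &f[2], &f[3], &f[4],
		   &f[5], &f[6], &f[7], &f[8]);

	return n == 9 ? (long)f[8] : -1;
}

/*
 * Read /sys/block/<dev>/stat for the given device name (e.g. "sda")
 * and return the in-flight count, or -1 on any error.
 */
static long read_in_flight(const char *dev)
{
	char path[256];
	char line[512];
	FILE *fp;

	snprintf(path, sizeof(path), "/sys/block/%s/stat", dev);

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	if (!fgets(line, sizeof(line), fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);

	return parse_in_flight(line);
}
```

Calling read_in_flight() with the USB disk's device name in a loop while the test runs would show whether the count stays non-zero just before the hang. Once the box is locked up this cannot run, which is why the thread moves on to sysrq.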
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-18 16:10 ` OGAWA Hirofumi @ 2007-02-19 21:58 ` Kumar Gala 2007-02-19 22:19 ` OGAWA Hirofumi 2007-02-19 22:06 ` Kumar Gala 1 sibling, 1 reply; 19+ messages in thread From: Kumar Gala @ 2007-02-19 21:58 UTC (permalink / raw) To: OGAWA Hirofumi; +Cc: Linux Kernel list On Feb 18, 2007, at 10:10 AM, OGAWA Hirofumi wrote: > Kumar Gala <galak@kernel.crashing.org> writes: > >> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded >> PPC. I've got a usb flash drive plugged in and the filesystem on the >> drive is vfat. Running with 64M and no swap. >> >> If I execute a series of large (100M+) ftruncate() on the disk the >> kernel will hang and never return. It seems to be stuck in the idle >> loop(). >> >> The following is the test program I'm running: >> >> #include <sys/mman.h> >> #include <sys/types.h> >> #include <sys/stat.h> >> #include <fcntl.h> >> #include <stdio.h> >> #include <unistd.h> >> #include <errno.h> >> >> void usage (void) >> { >> printf ("truncate_test <filename> <size>\n\n"); >> } >> >> int main(int argc, char *argv[]) >> { >> int fd, i; >> int ret = 0; >> unsigned int len; >> >> if (argc != 3) { >> printf("Invalid number of arguments\n\n"); >> usage(); >> exit(1); >> } >> >> fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU); >> len = strtoul(argv[2], NULL, 0); >> >> ret = ftruncate(fd, len); >> >> if (ret) >> printf ("ftruncate ret = %d %d\n", ret, errno); >> >> close(fd); >> >> return ret; >> } >> >> I usually run the following twice to get the hang state: >> >> time ./trunc_test bar 100000000 & >> time ./trunc_test baz 100000000 & >> >> I was wondering if anyone had any suggestions on what to poke at next >> to try and figure out what is going on. > > Can you check /sys/block/xxx/stat or something to make sure there is > no outstanding IO request? > > It seems to be no response from the lower layer... 
Once the system locks up I don't have any ability to do anything.

- k
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-19 21:58 ` Kumar Gala
@ 2007-02-19 22:19 ` OGAWA Hirofumi
  2007-02-19 22:27 ` Kumar Gala
  0 siblings, 1 reply; 19+ messages in thread
From: OGAWA Hirofumi @ 2007-02-19 22:19 UTC
To: Kumar Gala; +Cc: Linux Kernel list

Kumar Gala <galak@kernel.crashing.org> writes:

>>> I usually run the following twice to get the hang state:
>>>
>>> time ./trunc_test bar 100000000 &
>>> time ./trunc_test baz 100000000 &
>>>
>>> I was wondering if anyone had any suggestions on what to poke at next
>>> to try and figure out what is going on.
>>
>> Can you check /sys/block/xxx/stat or something to make sure there is
>> no outstanding IO request?
>>
>> It seems to be no response from the lower layer...
>
> Once the system locks up I don't have any ability to do anything.

Ah, doesn't sysrq work either? If sysrq works, it can be used to see
the IO request state with a patch.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-19 22:19 ` OGAWA Hirofumi
@ 2007-02-19 22:27 ` Kumar Gala
  2007-02-20 17:20 ` OGAWA Hirofumi
  0 siblings, 1 reply; 19+ messages in thread
From: Kumar Gala @ 2007-02-19 22:27 UTC
To: OGAWA Hirofumi; +Cc: Linux Kernel list

On Feb 19, 2007, at 4:19 PM, OGAWA Hirofumi wrote:

> Kumar Gala <galak@kernel.crashing.org> writes:
>
>>>> I usually run the following twice to get the hang state:
>>>>
>>>> time ./trunc_test bar 100000000 &
>>>> time ./trunc_test baz 100000000 &
>>>>
>>>> I was wondering if anyone had any suggestions on what to poke at
>>>> next to try and figure out what is going on.
>>>
>>> Can you check /sys/block/xxx/stat or something to make sure there is
>>> no outstanding IO request?
>>>
>>> It seems to be no response from the lower layer...
>>
>> Once the system locks up I don't have any ability to do anything.
>
> Ah, doesn't sysrq also work? If sysrq work, it can use to see IO
> request state with a patch.

Yeah, got sysrq working today.  If you can point me at the patch, I'm
happy to apply it and get data.

- k
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-19 22:27 ` Kumar Gala
@ 2007-02-20 17:20 ` OGAWA Hirofumi
  0 siblings, 0 replies; 19+ messages in thread
From: OGAWA Hirofumi @ 2007-02-20 17:20 UTC
To: Kumar Gala; +Cc: Linux Kernel list

[-- Attachment #1: Type: text/plain, Size: 593 bytes --]

Kumar Gala <galak@kernel.crashing.org> writes:

> On Feb 19, 2007, at 4:19 PM, OGAWA Hirofumi wrote:
>
>> Kumar Gala <galak@kernel.crashing.org> writes:
>>
>>> Once the system locks up I dont have any ability to do anything.
>>
>> Ah, doesn't sysrq also work? If sysrq work, it can use to see IO
>> request state with a patch.
>
> Yeah, got sysrq working today.  If you can point me at the patch I
> happy to apply it and get data.

Ok, please try the attached patch. I hope it helps you.

BTW, the new sysrq is sysrq-j, and it will show disk stats.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: debug-block.patch --]
[-- Type: text/x-diff, Size: 2821 bytes --]


Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---

 block/genhd.c        |   27 +++++++++++++++++++++++++++
 drivers/char/sysrq.c |   15 ++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff -puN drivers/char/sysrq.c~debug-block drivers/char/sysrq.c
--- linux-2.6/drivers/char/sysrq.c~debug-block	2007-02-21 00:58:35.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/sysrq.c	2007-02-21 02:02:52.000000000 +0900
@@ -311,6 +311,19 @@ static struct sysrq_key_op sysrq_kill_op
 	.enable_mask	= SYSRQ_ENABLE_SIGNAL,
 };
 
+extern void block_req_callback(struct work_struct *ignored);
+static DECLARE_WORK(block_req_work, block_req_callback);
+static void sysrq_handle_block_req(int key, struct tty_struct *tty)
+{
+	schedule_work(&block_req_work);
+}
+static struct sysrq_key_op sysrq_block_req_op = {
+	.handler	= sysrq_handle_block_req,
+	.help_msg	= "block req (j)",
+	.action_msg	= "Block Req",
+	.enable_mask	= SYSRQ_ENABLE_DUMP,
+};
+
 static void sysrq_handle_unrt(int key, struct tty_struct *tty)
 {
 	normalize_rt_tasks();
@@ -351,7 +364,7 @@ static struct sysrq_key_op *sysrq_key_ta
 	NULL,				/* g */
 	NULL,				/* h */
 	&sysrq_kill_op,			/* i */
-	NULL,				/* j */
+	&sysrq_block_req_op,		/* j */
 	&sysrq_SAK_op,			/* k */
 	NULL,				/* l */
 	&sysrq_showmem_op,		/* m */
diff -puN block/genhd.c~debug-block block/genhd.c
--- linux-2.6/block/genhd.c~debug-block	2007-02-21 01:02:13.000000000 +0900
+++ linux-2.6-hirofumi/block/genhd.c	2007-02-21 02:15:56.000000000 +0900
@@ -555,6 +555,33 @@ static struct kset_uevent_ops block_ueve
 
 decl_subsys(block, &ktype_block, &block_uevent_ops);
 
+void block_req_callback(struct work_struct *ignored)
+{
+	struct gendisk *gp;
+	char buf[BDEVNAME_SIZE];
+
+	mutex_lock(&block_subsys_lock);
+	list_for_each_entry(gp, &block_subsys.kset.list, kobj.entry) {
+		printk("%4d %4d %s %lu %lu %llu %u %lu %lu %llu %u %u %u %u:"
+		       " %u %u %u\n",
+		       gp->major, gp->first_minor, disk_name(gp, 0, buf),
+		       disk_stat_read(gp, ios[0]),
+		       disk_stat_read(gp, merges[0]),
+		       (unsigned long long)disk_stat_read(gp, sectors[0]),
+		       jiffies_to_msecs(disk_stat_read(gp, ticks[0])),
+		       disk_stat_read(gp, ios[1]),
+		       disk_stat_read(gp, merges[1]),
+		       (unsigned long long)disk_stat_read(gp, sectors[1]),
+		       jiffies_to_msecs(disk_stat_read(gp, ticks[1])),
+		       gp->in_flight,
+		       jiffies_to_msecs(disk_stat_read(gp, io_ticks)),
+		       jiffies_to_msecs(disk_stat_read(gp, time_in_queue)),
+		       gp->queue->rq.count[0], gp->queue->rq.count[1],
+		       gp->queue->in_flight);
+	}
+	mutex_unlock(&block_subsys_lock);
+}
+
 /*
  * aggregate disk stat collector.  Uses the same stats that the sysfs
  * entries do, above, but makes them available through one seq_file.
_
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-18 16:10 ` OGAWA Hirofumi 2007-02-19 21:58 ` Kumar Gala @ 2007-02-19 22:06 ` Kumar Gala 2007-02-21 20:18 ` OGAWA Hirofumi 1 sibling, 1 reply; 19+ messages in thread From: Kumar Gala @ 2007-02-19 22:06 UTC (permalink / raw) To: Linux Kernel list; +Cc: Andrew Morton, Greg KH On Feb 18, 2007, at 10:10 AM, OGAWA Hirofumi wrote: > Kumar Gala <galak@kernel.crashing.org> writes: > >> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded >> PPC. I've got a usb flash drive plugged in and the filesystem on the >> drive is vfat. Running with 64M and no swap. >> >> If I execute a series of large (100M+) ftruncate() on the disk the >> kernel will hang and never return. It seems to be stuck in the idle >> loop(). >> >> The following is the test program I'm running: >> >> #include <sys/mman.h> >> #include <sys/types.h> >> #include <sys/stat.h> >> #include <fcntl.h> >> #include <stdio.h> >> #include <unistd.h> >> #include <errno.h> >> >> void usage (void) >> { >> printf ("truncate_test <filename> <size>\n\n"); >> } >> >> int main(int argc, char *argv[]) >> { >> int fd, i; >> int ret = 0; >> unsigned int len; >> >> if (argc != 3) { >> printf("Invalid number of arguments\n\n"); >> usage(); >> exit(1); >> } >> >> fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU); >> len = strtoul(argv[2], NULL, 0); >> >> ret = ftruncate(fd, len); >> >> if (ret) >> printf ("ftruncate ret = %d %d\n", ret, errno); >> >> close(fd); >> >> return ret; >> } >> >> I usually run the following twice to get the hang state: >> >> time ./trunc_test bar 100000000 & >> time ./trunc_test baz 100000000 & >> >> I was wondering if anyone had any suggestions on what to poke at next >> to try and figure out what is going on. So I realized I could use sysrq to provide some more debug information. 
When the system locks up I get the following output from 't' [ 496.901002] Show State [ 496.903356] [ 496.903360] free sibling [ 496.911532] task PC stack pid father child younger older [ 496.918486] init S 3009C7EC 0 1 0 2 (NOTLB) [ 496.926169] Call Trace: [ 496.928611] [C3FC7DA0] [C006F03C] __link_path_walk+0xd24/0x112c (unreliable) [ 496.935687] [C3FC7E60] [C00083AC] __switch_to+0x28/0x40 [ 496.940931] [C3FC7E80] [C01F4B78] schedule+0x324/0x6bc [ 496.946086] [C3FC7EC0] [C001E164] do_wait+0x700/0x100c [ 496.951242] [C3FC7F40] [C000FAD4] ret_from_syscall+0x0/0x38 [ 496.956828] --- Exception: c01 at 0x3009c7ec [ 496.961099] LR = 0x3009c3e0 [ 496.964234] ksoftirqd/0 S 00000000 0 2 1 3 (L-TLB) [ 496.971913] Call Trace: [ 496.974355] [C033DE80] [C0133F64] scsi_io_completion+0x74/0x318 (unreliable) [ 496.981428] [C033DF40] [C00083AC] __switch_to+0x28/0x40 [ 496.986664] [C033DF60] [C01F4B78] schedule+0x324/0x6bc [ 496.991811] [C033DFA0] [C00210CC] ksoftirqd+0xfc/0x114 [ 496.996960] [C033DFC0] [C0033E48] kthread+0xf4/0x130 [ 497.001941] [C033DFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.007350] events/0 S 00000000 0 3 1 4 2 (L-TLB) [ 497.015030] Call Trace: [ 497.017472] [C033FEE0] [C00083AC] __switch_to+0x28/0x40 [ 497.022707] [C033FF00] [C01F4B78] schedule+0x324/0x6bc [ 497.027855] [C033FF40] [C002F67C] worker_thread+0x144/0x148 [ 497.033435] [C033FFC0] [C0033E48] kthread+0xf4/0x130 [ 497.038409] [C033FFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.043817] khelper S 00000000 0 4 1 5 3 (L-TLB) [ 497.051497] Call Trace: [ 497.053940] [C3FE1E20] [C3FE0000] 0xc3fe0000 (unreliable) [ 497.059351] [C3FE1EE0] [C00083AC] __switch_to+0x28/0x40 [ 497.064586] [C3FE1F00] [C01F4B78] schedule+0x324/0x6bc [ 497.069734] [C3FE1F40] [C002F67C] worker_thread+0x144/0x148 [ 497.075316] [C3FE1FC0] [C0033E48] kthread+0xf4/0x130 [ 497.080291] [C3FE1FF0] [C001093C] kernel_thread+0x44/0x60 [ 497.085697] kthread S 00000000 0 5 1 37 617 4 (L-TLB) [ 497.093378] Call Trace: [ 497.095820] [C3FCBE20] 
[00001032] 0x1032 (unreliable) [ 497.100881] --- Exception: c3fcbef0 at __switch_to+0x28/0x40 [ 497.106545] LR = 0xc3fcbef0 [ 497.109681] [C3FCBEE0] [C00083AC] __switch_to+0x28/0x40 (unreliable) [ 497.116051] [C3FCBF00] [C01F4B78] schedule+0x324/0x6bc [ 497.121201] [C3FCBF40] [C002F67C] worker_thread+0x144/0x148 [ 497.126783] [C3FCBFC0] [C0033E48] kthread+0xf4/0x130 [ 497.131758] [C3FCBFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.137165] kblockd/0 S 00000000 0 37 5 41 (L-TLB) [ 497.144845] Call Trace: [ 497.147286] [C3D9FE20] [C3EBF490] 0xc3ebf490 (unreliable) [ 497.152697] [C3D9FEE0] [C00083AC] __switch_to+0x28/0x40 [ 497.157933] [C3D9FF00] [C01F4B78] schedule+0x324/0x6bc [ 497.163082] [C3D9FF40] [C002F67C] worker_thread+0x144/0x148 [ 497.168663] [C3D9FFC0] [C0033E48] kthread+0xf4/0x130 [ 497.173637] [C3D9FFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.179045] khubd S 00000000 0 41 5 53 37 (L-TLB) [ 497.186726] Call Trace: [ 497.189167] [C0341E00] [C3F03900] 0xc3f03900 (unreliable) [ 497.194578] [C0341EC0] [C00083AC] __switch_to+0x28/0x40 [ 497.199813] [C0341EE0] [C01F4B78] schedule+0x324/0x6bc [ 497.204961] [C0341F20] [C0152288] hub_thread+0xb40/0xcc0 [ 497.210283] [C0341FC0] [C0033E48] kthread+0xf4/0x130 [ 497.215257] [C0341FF0] [C001093C] kernel_thread+0x44/0x60 [ 497.220664] pdflush D 00000000 0 53 5 55 41 (L-TLB) [ 497.228344] Call Trace: [ 497.230786] [C3CABD10] [C008E098] __find_get_block+0x10c/0x288 (unreliable) [ 497.237769] [C3CABDD0] [C00083AC] __switch_to+0x28/0x40 [ 497.243004] [C3CABDF0] [C01F4B78] schedule+0x324/0x6bc [ 497.248152] [C3CABE30] [C01F5D6C] schedule_timeout+0x6c/0xd0 [ 497.253822] [C3CABE70] [C01F5C9C] io_schedule_timeout+0x30/0x54 [ 497.259752] [C3CABE90] [C0050DE4] congestion_wait+0x64/0x8c [ 497.265343] [C3CABEE0] [C004AA9C] background_writeout+0x44/0xe8 [ 497.271274] [C3CABF50] [C004BE04] pdflush+0x16c/0x27c [ 497.276335] [C3CABFC0] [C0033E48] kthread+0xf4/0x130 [ 497.281311] [C3CABFF0] [C001093C] kernel_thread+0x44/0x60 [ 
497.286718] kswapd0 D 00000000 0 55 5 56 53 (L-TLB) [ 497.294399] Call Trace: [ 497.296841] [C3DEFB70] [C0025FE8] run_timer_softirq+0x20/0x230 (unreliable) [ 497.303823] [C3DEFC30] [C00083AC] __switch_to+0x28/0x40 [ 497.309058] [C3DEFC50] [C01F4B78] schedule+0x324/0x6bc [ 497.314207] [C3DEFC90] [C01F5D6C] schedule_timeout+0x6c/0xd0 [ 497.319876] --- Exception: c3defd60 at 0xc3deff40 [ 497.324582] LR = 0xc3dee000 [ 497.327718] [C3DEFCD0] [C01F5C9C] io_schedule_timeout+0x30/0x54 (unreliable) [ 497.334783] [C3DEFCF0] [C0050DE4] congestion_wait+0x64/0x8c [ 497.340367] [C3DEFD40] [C004A9F0] throttle_vm_writeout+0x1c/0x84 [ 497.346384] [C3DEFD60] [C004F33C] shrink_zone+0xbb0/0xfe4 [ 497.351792] [C3DEFF10] [C004FD10] kswapd+0x2d4/0x424 [ 497.356767] [C3DEFFC0] [C0033E48] kthread+0xf4/0x130 [ 497.361741] [C3DEFFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.367151] aio/0 S 00000000 0 56 5 670 55 (L-TLB) [ 497.374832] Call Trace: [ 497.377272] [C3CADE20] [00000020] 0x20 (unreliable) [ 497.382162] [C3CADEE0] [C00083AC] __switch_to+0x28/0x40 [ 497.387398] [C3CADF00] [C01F4B78] schedule+0x324/0x6bc [ 497.392547] [C3CADF40] [C002F67C] worker_thread+0x144/0x148 [ 497.398128] [C3CADFC0] [C0033E48] kthread+0xf4/0x130 [ 497.403102] [C3CADFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.408508] mtdblockd S 00000000 0 617 1 718 5 (L-TLB) [ 497.416191] Call Trace: [ 497.418632] [C3F27E70] [C02A0000] 0xc02a0000 (unreliable) [ 497.424043] [C3F27F30] [C00083AC] __switch_to+0x28/0x40 [ 497.429278] [C3F27F50] [C01F4B78] schedule+0x324/0x6bc [ 497.434425] [C3F27F90] [C013FAAC] mtd_blktrans_thread+0x250/0x340 [ 497.440534] [C3F27FF0] [C001093C] kernel_thread+0x44/0x60 [ 497.445941] scsi_eh_0 D 00000000 0 670 5 671 56 (L-TLB) [ 497.453622] Call Trace: [ 497.456062] [C3F1FDF0] [00000011] 0x11 (unreliable) [ 497.460951] [C3F1FEB0] [C00083AC] __switch_to+0x28/0x40 [ 497.466187] [C3F1FED0] [C01F4B78] schedule+0x324/0x6bc [ 497.471335] [C3F1FF10] [C01F50D4] wait_for_completion+0xa0/0x150 [ 497.477351] 
[C3F1FF50] [C016641C] command_abort+0xdc/0x118 [ 497.482846] [C3F1FF60] [C0132BC0] scsi_error_handler+0x5f0/0x810 [ 497.488868] [C3F1FFC0] [C0033E48] kthread+0xf4/0x130 [ 497.493842] [C3F1FFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.499249] usb-storage D 00000000 0 671 5 773 670 (L-TLB) [ 497.506930] Call Trace: [ 497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40 [ 497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc [ 497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0 [ 497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54 [ 497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c [ 497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84 [ 497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4 [ 497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc [ 497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0 [ 497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694 [ 497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc [ 497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c [ 497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8 [ 497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/ 0x138 [ 497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310 [ 497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/ 0x344 [ 497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command +0x10/0x20 [ 497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290 [ 497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130 [ 497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60 [ 497.625285] sh D 3009C7EC 0 718 1 787 617 (NOTLB) [ 497.632968] Call Trace: [ 497.635410] [C3C37A90] [C01339AC] scsi_run_queue+0x220/0x2e0 (unreliable) [ 497.642216] [C3C37B50] [C00083AC] __switch_to+0x28/0x40 [ 497.647452] [C3C37B70] [C01F4B78] schedule+0x324/0x6bc [ 497.652601] [C3C37BB0] [C01F5D6C] schedule_timeout+0x6c/0xd0 [ 497.658272] [C3C37BF0] [C01F5C9C] 
io_schedule_timeout+0x30/0x54 [ 497.664202] [C3C37C10] [C0050DE4] congestion_wait+0x64/0x8c [ 497.669786] [C3C37C60] [C004A9F0] throttle_vm_writeout+0x1c/0x84 [ 497.675802] [C3C37C80] [C004F33C] shrink_zone+0xbb0/0xfe4 [ 497.681211] [C3C37E30] [C004F8F4] try_to_free_pages+0x184/0x2cc [ 497.687141] [C3C37EA0] [C0049AA8] __alloc_pages+0x110/0x2c0 [ 497.692723] [C3C37EF0] [C0049C8C] __get_free_pages+0x34/0x74 [ 497.698390] [C3C37F00] [C007C3C4] sys_getcwd+0x30/0x2b0 [ 497.703629] [C3C37F40] [C000FAD4] ret_from_syscall+0x0/0x38 [ 497.709210] --- Exception: c01 at 0x3009c7ec [ 497.713480] LR = 0x3009a7b0 [ 497.716614] pdflush D 00000000 0 773 5 671 (L-TLB) [ 497.724294] Call Trace: [ 497.726737] [C32ADD10] [C008E098] __find_get_block+0x10c/0x288 (unreliable) [ 497.733718] [C32ADDD0] [C00083AC] __switch_to+0x28/0x40 [ 497.738954] [C32ADDF0] [C01F4B78] schedule+0x324/0x6bc [ 497.744102] [C32ADE30] [C01F5D6C] schedule_timeout+0x6c/0xd0 [ 497.749772] [C32ADE70] [C01F5C9C] io_schedule_timeout+0x30/0x54 [ 497.755702] [C32ADE90] [C0050DE4] congestion_wait+0x64/0x8c [ 497.761285] [C32ADEE0] [C004AC74] wb_kupdate+0xf0/0x160 [ 497.766520] [C32ADF50] [C004BE04] pdflush+0x16c/0x27c [ 497.771581] [C32ADFC0] [C0033E48] kthread+0xf4/0x130 [ 497.776556] [C32ADFF0] [C001093C] kernel_thread+0x44/0x60 [ 497.781965] time S 3009C7EC 0 787 1 789 788 718 (NOTLB) [ 497.789648] Call Trace: [ 497.792089] [C03A3E60] [C00083AC] __switch_to+0x28/0x40 [ 497.797326] [C03A3E80] [C01F4B78] schedule+0x324/0x6bc [ 497.802474] [C03A3EC0] [C001E164] do_wait+0x700/0x100c [ 497.807630] [C03A3F40] [C000FAD4] ret_from_syscall+0x0/0x38 [ 497.813210] --- Exception: c01 at 0x3009c7ec [ 497.817481] LR = 0x3009c414 [ 497.820616] trunc_test D 300787EC 0 789 787 (NOTLB) [ 497.828295] Call Trace: [ 497.830737] [C19B3960] [C0160000] handshake+0x6c/0x9c (unreliable) [ 497.836939] [C19B3A20] [C00083AC] __switch_to+0x28/0x40 [ 497.842175] [C19B3A40] [C01F4B78] schedule+0x324/0x6bc [ 497.847323] [C19B3A80] [C01F5D6C] 
schedule_timeout+0x6c/0xd0 [ 497.852994] [C19B3AC0] [C01F5C9C] io_schedule_timeout+0x30/0x54 [ 497.858924] [C19B3AE0] [C0050DE4] congestion_wait+0x64/0x8c [ 497.864510] [C19B3B30] [C004A9F0] throttle_vm_writeout+0x1c/0x84 [ 497.870525] [C19B3B50] [C004F33C] shrink_zone+0xbb0/0xfe4 [ 497.875934] [C19B3D00] [C004F8F4] try_to_free_pages+0x184/0x2cc [ 497.881864] [C19B3D70] [C0049AA8] __alloc_pages+0x110/0x2c0 [ 497.887445] [C19B3DC0] [C00447F4] find_or_create_page+0x8c/0xe4 [ 497.893386] [C19B3DE0] [C0090DAC] cont_prepare_write+0xac/0x32c [ 497.899321] [C19B3E20] [C00D7A50] fat_prepare_write+0x30/0x40 [ 497.905077] [C19B3E30] [C008E68C] __generic_cont_expand+0xa4/0x158 [ 497.911268] [C19B3E50] [C00D7254] fat_notify_change+0xf4/0x208 [ 497.917109] [C19B3E80] [C007EB24] notify_change+0x1ec/0x1fc [ 497.922695] [C19B3EB0] [C0062DC0] do_truncate+0x58/0x88 [ 497.927935] [C19B3F10] [C006316C] do_sys_ftruncate+0x180/0x1a8 [ 497.933780] [C19B3F40] [C000FAD4] ret_from_syscall+0x0/0x38 [ 497.939361] --- Exception: c01 at 0x300787ec [ 497.943634] LR = 0x1000073c [ 497.946768] time S 3009C7EC 0 788 1 790 787 (NOTLB) [ 497.954450] Call Trace: [ 497.956892] [C1919E60] [C00083AC] __switch_to+0x28/0x40 [ 497.962129] [C1919E80] [C01F4B78] schedule+0x324/0x6bc [ 497.967278] [C1919EC0] [C001E164] do_wait+0x700/0x100c [ 497.972431] [C1919F40] [C000FAD4] ret_from_syscall+0x0/0x38 [ 497.978011] --- Exception: c01 at 0x3009c7ec [ 497.982282] LR = 0x3009c414 [ 497.985417] trunc_test D 300787EC 0 790 788 (NOTLB) [ 497.993101] Call Trace: [ 497.995542] [C2BFDA00] [C0047E68] mempool_alloc+0x38/0x144 (unreliable) [ 498.002171] [C2BFDAC0] [C00083AC] __switch_to+0x28/0x40 [ 498.007406] [C2BFDAE0] [C01F4B78] schedule+0x324/0x6bc [ 498.012554] [C2BFDB20] [C01F5C48] io_schedule+0x30/0x54 [ 498.017790] [C2BFDB40] [C008D01C] sync_buffer+0x68/0x7c [ 498.023026] [C2BFDB50] [C01F5E80] __wait_on_bit+0x98/0xec [ 498.028435] [C2BFDB70] [C01F5F34] out_of_line_wait_on_bit+0x60/0x74 [ 498.034713] [C2BFDBC0] 
[C008CF3C] __wait_on_buffer+0x3c/0x4c [ 498.040382] [C2BFDBD0] [C00916F4] __bread+0xe8/0xf4 [ 498.045270] [C2BFDBE0] [C00D5C24] fat_ent_bread+0x48/0xa8 [ 498.050678] [C2BFDC00] [C00D6358] fat_ent_read+0x168/0x1f0 [ 498.056171] [C2BFDC30] [C00D6690] fat_free_clusters+0x64/0x260 [ 498.062011] [C2BFDCC0] [C00D75C4] fat_truncate+0x25c/0x334 [ 498.067507] [C2BFDD30] [C0053EE4] vmtruncate+0x184/0x1a4 [ 498.072833] [C2BFDD50] [C007E810] inode_setattr+0x7c/0x1a4 [ 498.078329] [C2BFDD90] [C00D7314] fat_notify_change+0x1b4/0x208 [ 498.084257] [C2BFDDC0] [C007EB24] notify_change+0x1ec/0x1fc [ 498.089840] [C2BFDDF0] [C0062DC0] do_truncate+0x58/0x88 [ 498.095077] [C2BFDE50] [C007028C] may_open+0x1fc/0x200 [ 498.100230] [C2BFDE70] [C0070380] open_namei+0xf0/0x714 [ 498.105465] [C2BFDEB0] [C0063BB8] do_filp_open+0x30/0x78 [ 498.110788] [C2BFDF20] [C0064018] do_sys_open+0x70/0xc0 [ 498.116023] [C2BFDF40] [C000FAD4] ret_from_syscall+0x0/0x38 [ 498.121605] --- Exception: c01 at 0x300787ec [ 498.125878] LR = 0x30077580 and from 'm' [ 731.834529] Show Memory [ 731.836968] Mem-info: [ 731.839234] DMA per-cpu: [ 731.841768] CPU 0: Hot: hi: 18, btch: 3 usd: 3 Cold: hi: 6, btch: 1 usd: 2 [ 731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330 unstable:0 free:1009 slab:1671 mapped:110 pagetables:19 [ 731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB active:6040kB inactive:45236kB present:65024kB pages_scanned:292 all_unreclaimable? no [ 731.874363] lowmem_reserve[]: 0 0 [ 731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB [ 731.887669] Free swap: 0kB [ 731.893913] 16384 pages of RAM [ 731.896963] 798 reserved pages [ 731.900011] 10946 pages shared [ 731.903058] 0 pages swap cached It seems like usb-storage and aio are completely off in the weeds. Ideas? If you need any additional debug output let me know. - k ^ permalink raw reply [flat|nested] 19+ messages in thread
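The sysrq-m numbers above can be cross-checked with a little arithmetic. This is a hedged sketch that assumes 4 kB pages, which fits the smallest "1*4kB" buddy order and 16384 pages of RAM on a 64M board. The helper names are made up for illustration. The point it shows: free:1009 pages is 4036 kB, the buddy list sums to the same 4036 kB, and that is below the zone's min:4096kB watermark, so every allocation that is allowed to reclaim will try to.

```c
#include <stddef.h>

/*
 * Sum a buddy free list as printed by sysrq-m: counts[i] is the number
 * of free blocks of order i, each (base_kb << i) kB in size.
 */
static unsigned long buddy_free_kb(const unsigned int *counts, size_t orders,
				   unsigned long base_kb)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < orders; i++)
		total += (unsigned long)counts[i] * (base_kb << i);

	return total;
}

/* Convert a page count to kB for a given page size in kB. */
static unsigned long pages_to_kb(unsigned long pages, unsigned long page_kb)
{
	return pages * page_kb;
}

/* A zone is below its minimum watermark when free memory is under min. */
static int below_min_watermark(unsigned long free_kb, unsigned long min_kb)
{
	return free_kb < min_kb;
}
```

Feeding in the counts from the "DMA:" line (1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0 for orders 4 kB through 4096 kB) reproduces the 4036 kB total reported in the trace.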
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-19 22:06 ` Kumar Gala @ 2007-02-21 20:18 ` OGAWA Hirofumi 2007-02-21 20:57 ` Andrew Morton 0 siblings, 1 reply; 19+ messages in thread From: OGAWA Hirofumi @ 2007-02-21 20:18 UTC (permalink / raw) To: Kumar Gala; +Cc: Linux Kernel list, Andrew Morton, Greg KH Kumar Gala <galak@kernel.crashing.org> writes: >>> I usually run the following twice to get the hang state: >>> >>> time ./trunc_test bar 100000000 & >>> time ./trunc_test baz 100000000 & >>> >>> I was wondering if anyone had any suggestions on what to poke at next >>> to try and figure out what is going on. > > So I realized I could use sysrq to provide some more debug > information. When the system locks up I get the following output > from 't' > > [ 497.499249] usb-storage D 00000000 0 671 5 > 773 670 (L-TLB) > [ 497.506930] Call Trace: > [ 497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40 > [ 497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc > [ 497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0 > [ 497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54 > [ 497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c > [ 497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84 > [ 497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4 > [ 497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc > [ 497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0 > [ 497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694 > [ 497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc > [ 497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c > [ 497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8 > [ 497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/ > 0x138 > [ 497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310 > [ 497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/ > 0x344 > [ 497.601303] [C3F35F50] [C0166B2C] 
usb_stor_transparent_scsi_command+0x10/0x20
> [ 497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
> [ 497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
> [ 497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60
> [ 497.625285] sh D 3009C7EC 0 718 1

[...]

> and from 'm'
>
> [ 731.834529] Show Memory
> [ 731.836968] Mem-info:
> [ 731.839234] DMA per-cpu:
> [ 731.841768] CPU 0: Hot: hi: 18, btch: 3 usd: 3 Cold: hi: 6, btch: 1 usd: 2
> [ 731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330 unstable:0 free:1009 slab:1671 mapped:110 pagetables:19
> [ 731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB active:6040kB inactive:45236kB present:65024kB pages_scanned:292 all_unreclaimable? no
> [ 731.874363] lowmem_reserve[]: 0 0
> [ 731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB
> [ 731.887669] Free swap: 0kB
> [ 731.893913] 16384 pages of RAM
> [ 731.896963] 798 reserved pages
> [ 731.900011] 10946 pages shared
> [ 731.903058] 0 pages swap cached
>
> It seems like usb-storage and aio are completely off in the weeds.
> Ideas?

It seems usb-storage should drop some kmalloc() calls and use a mempool
for URBs... Is anyone working on this? Any ideas?
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
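The mempool idea raised above can be illustrated outside the kernel. The following is a userspace sketch only, not the kernel mempool API and not a proposed usb-storage change: every name in it (reserve_pool, pool_alloc(), failing_alloc(), POOL_MIN) is hypothetical. It shows the general technique: keep a small preallocated reserve, try the normal allocator first, and dip into the reserve only when that fails, so the I/O path can keep making progress under memory pressure.

```c
#include <stdlib.h>

#define POOL_MIN 4	/* hypothetical reserve size, chosen for illustration */

struct reserve_pool {
	void *slots[POOL_MIN];		/* preallocated reserve elements */
	int curr;			/* reserve elements currently available */
	size_t elem_size;
	void *(*backing_alloc)(size_t);	/* normal allocator; may fail */
};

/* Stand-in for an allocator that fails under memory pressure. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Fill the reserve up front, while memory is still plentiful. */
static int pool_init(struct reserve_pool *p, size_t elem_size,
		     void *(*backing_alloc)(size_t))
{
	p->curr = 0;
	p->elem_size = elem_size;
	p->backing_alloc = backing_alloc;

	while (p->curr < POOL_MIN) {
		void *e = malloc(elem_size);

		if (!e) {
			while (p->curr > 0)
				free(p->slots[--p->curr]);
			return -1;
		}
		p->slots[p->curr++] = e;
	}
	return 0;
}

/* Try the normal allocator first; fall back to the reserve. */
static void *pool_alloc(struct reserve_pool *p)
{
	void *e = p->backing_alloc(p->elem_size);

	if (e)
		return e;
	if (p->curr > 0)
		return p->slots[--p->curr];

	/* A sleeping implementation would wait here for pool_free(). */
	return NULL;
}

/* Refill the reserve before handing memory back to the system. */
static void pool_free(struct reserve_pool *p, void *e)
{
	if (p->curr < POOL_MIN)
		p->slots[p->curr++] = e;
	else
		free(e);
}

/* Release whatever is left in the reserve. */
static void pool_destroy(struct reserve_pool *p)
{
	while (p->curr > 0)
		free(p->slots[--p->curr]);
}
```

The design point is that the reserve is sized for the minimum needed to complete one unit of work, so freeing that work always refills the reserve. The backing allocator is assumed to return memory that free() accepts.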
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 20:18 ` OGAWA Hirofumi @ 2007-02-21 20:57 ` Andrew Morton 2007-02-21 21:22 ` [linux-usb-devel] " Alan Stern 0 siblings, 1 reply; 19+ messages in thread From: Andrew Morton @ 2007-02-21 20:57 UTC (permalink / raw) To: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev Cc: Kumar Gala, Linux Kernel list, Greg KH On Thu, 22 Feb 2007 05:18:45 +0900 OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> wrote: > Kumar Gala <galak@kernel.crashing.org> writes: > > >>> I usually run the following twice to get the hang state: > >>> > >>> time ./trunc_test bar 100000000 & > >>> time ./trunc_test baz 100000000 & > >>> > >>> I was wondering if anyone had any suggestions on what to poke at next > >>> to try and figure out what is going on. > > > > So I realized I could use sysrq to provide some more debug > > information. When the system locks up I get the following output > > from 't' > > > > [ 497.499249] usb-storage D 00000000 0 671 5 > > 773 670 (L-TLB) > > [ 497.506930] Call Trace: > > [ 497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40 > > [ 497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc > > [ 497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0 > > [ 497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54 > > [ 497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c > > [ 497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84 > > [ 497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4 > > [ 497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc > > [ 497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0 > > [ 497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694 > > [ 497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc > > [ 497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c > > [ 497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8 > > [ 497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/ > > 0x138 > > [ 
497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310 > > [ 497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/ > > 0x344 > > [ 497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command > > +0x10/0x20 > > [ 497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290 > > [ 497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130 > > [ 497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60 > > [ 497.625285] sh D 3009C7EC 0 718 1 > > [...] > > > and from 'm' > > > > [ 731.834529] Show Memory > > [ 731.836968] Mem-info: > > [ 731.839234] DMA per-cpu: > > [ 731.841768] CPU 0: Hot: hi: 18, btch: 3 usd: 3 Cold: > > hi: 6, btch: 1 usd: 2 > > [ 731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330 > > unstable:0 free:1009 slab:1671 mapped:110 pagetables:19 > > [ 731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB > > active:6040kB inactive:45236kB present:65024kB pages_scanned:292 > > all_unreclaimable? no > > [ 731.874363] lowmem_reserve[]: 0 0 > > [ 731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB > > 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB > > [ 731.887669] Free swap: 0kB > > [ 731.893913] 16384 pages of RAM > > [ 731.896963] 798 reserved pages > > [ 731.900011] 10946 pages shared > > [ 731.903058] 0 pages swap cached > > > > It seems like usb-storage and aio are completely off in the weeds. > > Ideas? > > It seems usb-storage should remove some kmalloc and use mempool() for > urb... Is someone working on this? And idea? I think Pete said that we're supposed to be using GFP_NOIO in there. Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO and GFP_NOFS allocations, which is a bug. Because if the caller holds locks which prevent filesystem or IO progress, we deadlock. I'll fix the VM if someone else fixes USB ;) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 20:57 ` Andrew Morton @ 2007-02-21 21:22 ` Alan Stern 2007-02-21 21:31 ` Andrew Morton 0 siblings, 1 reply; 19+ messages in thread From: Alan Stern @ 2007-02-21 21:22 UTC (permalink / raw) To: Andrew Morton Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007, Andrew Morton wrote: > > > It seems like usb-storage and aio are completely off in the weeds. > > > Ideas? > > > > It seems usb-storage should remove some kmalloc and use mempool() for > > urb... Is someone working on this? And idea? > > I think Pete said that we're supposed to be using GFP_NOIO in there. We _are_ using it. > Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO > and GFP_NOFS allocations, which is a bug. Because if the caller holds > locks which prevent filesystem or IO progress, we deadlock. > > I'll fix the VM if someone else fixes USB ;) What else needs to be fixed? Alan Stern ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:22 ` [linux-usb-devel] " Alan Stern @ 2007-02-21 21:31 ` Andrew Morton 2007-02-21 21:50 ` Alan Stern 2007-02-22 7:40 ` Kumar Gala 0 siblings, 2 replies; 19+ messages in thread From: Andrew Morton @ 2007-02-21 21:31 UTC (permalink / raw) To: Alan Stern Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007 16:22:17 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > On Wed, 21 Feb 2007, Andrew Morton wrote: > > > > > It seems like usb-storage and aio are completely off in the weeds. > > > > Ideas? > > > > > > It seems usb-storage should remove some kmalloc and use mempool() for > > > urb... Is someone working on this? And idea? > > > > I think Pete said that we're supposed to be using GFP_NOIO in there. > > We _are_ using it. How admirably prompt. > > Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO > > and GFP_NOFS allocations, which is a bug. Because if the caller holds > > locks which prevent filesystem or IO progress, we deadlock. > > > > I'll fix the VM if someone else fixes USB ;) > > What else needs to be fixed? Would be nice if someone can confirm that this fixes it: From: Andrew Morton <akpm@linux-foundation.org> throttle_vm_writeout() is designed to wait for the dirty levels to subside. But if the caller holds IO or FS locks, we might be holding up that writeout. So change it to take a single nap to give other devices a chance to clean some memory, then return. 
Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- include/linux/writeback.h | 2 +- mm/page-writeback.c | 13 +++++++++++-- mm/vmscan.c | 2 +- 3 files changed, 13 insertions(+), 4 deletions(-) diff -puN mm/vmscan.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations mm/vmscan.c --- a/mm/vmscan.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations +++ a/mm/vmscan.c @@ -952,7 +952,7 @@ static unsigned long shrink_zone(int pri } } - throttle_vm_writeout(); + throttle_vm_writeout(sc->gfp_mask); atomic_dec(&zone->reclaim_in_progress); return nr_reclaimed; diff -puN mm/page-writeback.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations mm/page-writeback.c --- a/mm/page-writeback.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations +++ a/mm/page-writeback.c @@ -296,11 +296,21 @@ void balance_dirty_pages_ratelimited_nr( } EXPORT_SYMBOL(balance_dirty_pages_ratelimited_nr); -void throttle_vm_writeout(void) +void throttle_vm_writeout(gfp_t gfp_mask) { long background_thresh; long dirty_thresh; + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { + /* + * The caller might hold locks which can prevert IO completion + * or progress in the filesystem. So we cannot just sit here + * waiting for IO to complete. + */ + congestion_wait(WRITE, HZ/10); + return; + } + for ( ; ; ) { get_dirty_limits(&background_thresh, &dirty_thresh, NULL); @@ -317,7 +327,6 @@ void throttle_vm_writeout(void) } } - /* * writeback at least _min_pages, and keep writing until the amount of dirty * memory is less than the background threshold, or until we're all clean. 
diff -puN include/linux/writeback.h~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations include/linux/writeback.h --- a/include/linux/writeback.h~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations +++ a/include/linux/writeback.h @@ -84,7 +84,7 @@ static inline void wait_on_inode(struct int wakeup_pdflush(long nr_pages); void laptop_io_completion(void); void laptop_sync_completion(void); -void throttle_vm_writeout(void); +void throttle_vm_writeout(gfp_t gfp_mask); /* These are exported to sysctl. */ extern int dirty_background_ratio; _ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:31 ` Andrew Morton @ 2007-02-21 21:50 ` Alan Stern 2007-02-21 22:54 ` Andrew Morton 2007-02-22 7:40 ` Kumar Gala 1 sibling, 1 reply; 19+ messages in thread From: Alan Stern @ 2007-02-21 21:50 UTC (permalink / raw) To: Andrew Morton Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007, Andrew Morton wrote: > On Wed, 21 Feb 2007 16:22:17 -0500 (EST) > Alan Stern <stern@rowland.harvard.edu> wrote: > > > On Wed, 21 Feb 2007, Andrew Morton wrote: > > > > > > > It seems like usb-storage and aio are completely off in the weeds. > > > > > Ideas? > > > > > > > > It seems usb-storage should remove some kmalloc and use mempool() for > > > > urb... Is someone working on this? And idea? > > > > > > I think Pete said that we're supposed to be using GFP_NOIO in there. > > > > We _are_ using it. > > How admirably prompt. Shucks, we've been using it for years... > > > Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO > > > and GFP_NOFS allocations, which is a bug. Because if the caller holds > > > locks which prevent filesystem or IO progress, we deadlock. > > > > > > I'll fix the VM if someone else fixes USB ;) > > > > What else needs to be fixed? > > Would be nice if someone can confirm that this fixes it: Not having experienced the problem, I can't confirm the fix. However... > + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { Is that really the correct test? I don't know enough about the memory management subsystem to say one way or the other. What's special about having both flags set? > + /* > + * The caller might hold locks which can prevert IO completion --------------------------------------------------------------^ Typo Although perhaps "prevert" is an acceptable neologism in this context. > + * or progress in the filesystem. 
So we cannot just sit here > + * waiting for IO to complete. > + */ Alan Stern ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:50 ` Alan Stern @ 2007-02-21 22:54 ` Andrew Morton 0 siblings, 0 replies; 19+ messages in thread From: Andrew Morton @ 2007-02-21 22:54 UTC (permalink / raw) To: Alan Stern Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007 16:50:23 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > > + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { > > Is that really the correct test? I don't know enough about the memory > management subsystem to say one way or the other. What's special about > having both flags set? yup. We're saying "if the caller is unable to take either IO locks or FS locks, don't wait on FS or IO completion". ie: don't wait on writeout progress unless we know that both the IO system and the FS are able to make progress. ^ permalink raw reply [flat|nested] 19+ messages in thread
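The test Andrew describes above can be tried out in userspace. The following is only a minimal sketch: the `DEMO_GFP_*` macros stand in for the kernel's `__GFP_WAIT`, `__GFP_IO` and `__GFP_FS` bits (the numeric values are assumptions chosen for illustration; only their distinctness matters), and `nap_once_and_return()` is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel's gfp bits. The values are
 * assumptions for this sketch, not copied from any kernel header. */
#define DEMO_GFP_WAIT 0x10u
#define DEMO_GFP_IO   0x40u
#define DEMO_GFP_FS   0x80u

/* Composite masks built the way the thread describes them:
 * NOIO lacks both IO and FS, NOFS lacks FS, KERNEL has both. */
#define DEMO_GFP_NOIO   (DEMO_GFP_WAIT)
#define DEMO_GFP_NOFS   (DEMO_GFP_WAIT | DEMO_GFP_IO)
#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)

/* Hypothetical helper mirroring the condition in the patch: returns 1
 * when the caller is missing the IO bit, the FS bit, or both, i.e. when
 * throttle_vm_writeout() should take a single nap and return. */
static int nap_once_and_return(unsigned int gfp_mask)
{
	return (gfp_mask & (DEMO_GFP_FS | DEMO_GFP_IO)) !=
	       (DEMO_GFP_FS | DEMO_GFP_IO);
}

/* Walk the three allocation contexts discussed in the thread. */
static void check_gfp_test(void)
{
	assert(nap_once_and_return(DEMO_GFP_NOIO));    /* neither bit set */
	assert(nap_once_and_return(DEMO_GFP_NOFS));    /* IO only */
	assert(!nap_once_and_return(DEMO_GFP_KERNEL)); /* both bits set */
}
```

Calling `check_gfp_test()` from a small `main()` exercises all three cases. Only a mask carrying both bits falls through to the waiting loop, which matches "don't wait on writeout progress unless we know that both the IO system and the FS are able to make progress".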
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:31 ` Andrew Morton 2007-02-21 21:50 ` Alan Stern @ 2007-02-22 7:40 ` Kumar Gala 2007-02-22 18:20 ` Kumar Gala 1 sibling, 1 reply; 19+ messages in thread From: Kumar Gala @ 2007-02-22 7:40 UTC (permalink / raw) To: Andrew Morton Cc: Alan Stern, OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Linux Kernel list On Feb 21, 2007, at 3:31 PM, Andrew Morton wrote: > On Wed, 21 Feb 2007 16:22:17 -0500 (EST) > Alan Stern <stern@rowland.harvard.edu> wrote: > >> On Wed, 21 Feb 2007, Andrew Morton wrote: >> >>>>> It seems like usb-storage and aio are completely off in the weeds. >>>>> Ideas? >>>> >>>> It seems usb-storage should remove some kmalloc and use mempool >>>> () for >>>> urb... Is someone working on this? And idea? >>> >>> I think Pete said that we're supposed to be using GFP_NOIO in there. >> >> We _are_ using it. > > How admirably prompt. > >>> Not that it'll help much: the VM calls throttle_vm_writeout() for >>> GFP_NOIO >>> and GFP_NOFS allocations, which is a bug. Because if the caller >>> holds >>> locks which prevent filesystem or IO progress, we deadlock. >>> >>> I'll fix the VM if someone else fixes USB ;) >> >> What else needs to be fixed? > > Would be nice if someone can confirm that this fixes it: Doesn't seem to help my problem in a quick test, will get more data in the morning. - k ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-22 7:40 ` Kumar Gala @ 2007-02-22 18:20 ` Kumar Gala 2007-02-22 21:57 ` Andrew Morton 0 siblings, 1 reply; 19+ messages in thread From: Kumar Gala @ 2007-02-22 18:20 UTC (permalink / raw) To: Andrew Morton Cc: Alan Stern, OGAWA Hirofumi, USB development list, Pete Zaitcev, Greg KH, Linux Kernel list >>>> Not that it'll help much: the VM calls throttle_vm_writeout() >>>> for GFP_NOIO >>>> and GFP_NOFS allocations, which is a bug. Because if the caller >>>> holds >>>> locks which prevent filesystem or IO progress, we deadlock. >>>> >>>> I'll fix the VM if someone else fixes USB ;) >>> >>> What else needs to be fixed? >> >> Would be nice if someone can confirm that this fixes it: > > Doesn't seem to help my problem in a quick test, will get more data > in the morning. Well, I didn't realize the patch you sent via mm-commits and the one here are actually different. I noticed that mm-commits one has: + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != __GFP_FS|__GFP_IO) { vs + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { The second seems to make more sense. I tested with the first last night which didn't help. With the proper patch in place things look good. Is this a candidate for 2.6.20-stable? - k ^ permalink raw reply [flat|nested] 19+ messages in thread
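The difference between the two conditions Kumar quotes comes down to C operator precedence: `!=` binds more tightly than `|`, so the unparenthesized form is read as `((gfp_mask & (FS|IO)) != FS) | IO`. The sketch below spells out that parse explicitly. The `DEMO_GFP_*` values and both function names are assumptions made for illustration; nothing here is taken from the kernel tree.

```c
#include <assert.h>

/* Illustrative stand-ins for __GFP_WAIT, __GFP_IO and __GFP_FS
 * (values are assumptions for this sketch). */
#define DEMO_GFP_WAIT 0x10u
#define DEMO_GFP_IO   0x40u
#define DEMO_GFP_FS   0x80u

/* How the compiler reads the mm-commits variant
 *   (mask & (FS|IO)) != FS|IO
 * The comparison yields 0 or 1, and the IO bit is then OR-ed in,
 * so the result is nonzero for every possible mask. */
static unsigned int cond_as_parsed_without_parens(unsigned int gfp_mask)
{
	return ((gfp_mask & (DEMO_GFP_FS | DEMO_GFP_IO)) != DEMO_GFP_FS)
	       | DEMO_GFP_IO;
}

/* The variant posted in this thread, with the right-hand side grouped. */
static unsigned int cond_with_parens(unsigned int gfp_mask)
{
	return (gfp_mask & (DEMO_GFP_FS | DEMO_GFP_IO)) !=
	       (DEMO_GFP_FS | DEMO_GFP_IO);
}

/* Compare the two on a mask with both bits and a mask with neither. */
static void check_precedence(void)
{
	unsigned int both = DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS;
	unsigned int none = DEMO_GFP_WAIT;

	assert(cond_as_parsed_without_parens(both) != 0); /* always true */
	assert(cond_as_parsed_without_parens(none) != 0); /* always true */
	assert(cond_with_parens(both) == 0); /* both bits set: keep waiting */
	assert(cond_with_parens(none) == 1); /* bits missing: nap and return */
}
```

So the two variants do not express the same test: the unparenthesized one cannot distinguish between allocation contexts at all. Compilers may warn about this kind of mixed comparison and bitwise OR when parentheses warnings are enabled.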
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-22 18:20 ` Kumar Gala @ 2007-02-22 21:57 ` Andrew Morton 0 siblings, 0 replies; 19+ messages in thread From: Andrew Morton @ 2007-02-22 21:57 UTC (permalink / raw) To: Kumar Gala Cc: stern, hirofumi, linux-usb-devel, zaitcev, gregkh, linux-kernel > On Thu, 22 Feb 2007 12:20:06 -0600 Kumar Gala <galak@kernel.crashing.org> wrote: > + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { > > The second seems to make more sense. I tested with the first last > night which didn't help. > > With the proper patch in place things look good. Is this a candidate > for 2.6.20-stable? I suppose so, yes. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate [not found] <fa.bxsB54F+7+006rE+o/VWUj5keQk@ifi.uio.no> @ 2007-02-16 23:10 ` Robert Hancock 2007-02-17 5:05 ` Kumar Gala [not found] ` <fa.zK5W70l1vhk1YNxuwxFwZ8t1uIs@ifi.uio.no> 1 sibling, 1 reply; 19+ messages in thread From: Robert Hancock @ 2007-02-16 23:10 UTC (permalink / raw) To: Kumar Gala; +Cc: Linux Kernel list Kumar Gala wrote: > I'm seeing an issue with a stock 2.6.20 kernel running on an embedded > PPC. I've got a usb flash drive plugged in and the filesystem on the > drive is vfat. Running with 64M and no swap. > > If I execute a series of large (100M+) ftruncate() on the disk the > kernel will hang and never return. It seems to be stuck in the idle > loop(). On FAT filesystems this forces the entire file contents of that size to be written out with zeros. Are you sure the kernel just isn't busy writing out all that data to the disk? -- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from hancockr@nospamshaw.ca Home Page: http://www.roberthancock.com/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-16 23:10 ` Robert Hancock @ 2007-02-17 5:05 ` Kumar Gala 0 siblings, 0 replies; 19+ messages in thread From: Kumar Gala @ 2007-02-17 5:05 UTC (permalink / raw) To: Robert Hancock; +Cc: Linux Kernel list On Feb 16, 2007, at 5:10 PM, Robert Hancock wrote: > Kumar Gala wrote: >> I'm seeing an issue with a stock 2.6.20 kernel running on an >> embedded PPC. I've got a usb flash drive plugged in and the >> filesystem on the drive is vfat. Running with 64M and no swap. >> If I execute a series of large (100M+) ftruncate() on the disk the >> kernel will hang and never return. It seems to be stuck in the >> idle loop(). > > On FAT filesystems this forces the entire file contents of that > size to be written out with zeros. Are you sure the kernel just > isn't busy writing out all that data to the disk? I'm pretty sure, seeing as if I run the test it takes maybe 20-30 seconds to create the file if it succeeds. However, I've waited 10 minutes and still no prompt. I'm also able to break in with a HW debugger and am always in the idle loop. - k ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate [not found] ` <fa.qFxa41A3LU3cQ19L+5DTEFMtCEY@ifi.uio.no> @ 2007-02-20 5:28 ` Robert Hancock 0 siblings, 0 replies; 19+ messages in thread From: Robert Hancock @ 2007-02-20 5:28 UTC (permalink / raw) To: Kumar Gala; +Cc: Linux Kernel list, linux-usb-users Kumar Gala wrote: > [ 497.499249] usb-storage D 00000000 0 671 5 > 773 670 (L-TLB) > [ 497.506930] Call Trace: > [ 497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40 > [ 497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc > [ 497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0 > [ 497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54 > [ 497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c > [ 497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84 > [ 497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4 > [ 497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc > [ 497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0 > [ 497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694 > [ 497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc > [ 497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c > [ 497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8 > [ 497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138 > [ 497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310 > [ 497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344 > [ 497.601303] [C3F35F50] [C0166B2C] > usb_stor_transparent_scsi_command+0x10/0x20 > [ 497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290 > [ 497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130 > [ 497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60 This seems like a problem, the usb-storage thread is trying to allocate some memory which is ending up waiting for VM writeout, which obviously won't proceed since this thread is the one that needs to do this.. 
It looks like the allocation in usb_stor_bulk_transfer_sglist is done with GFP_NOIO, so I wonder why we're getting into this state? -- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from hancockr@nospamshaw.ca Home Page: http://www.roberthancock.com/ ^ permalink raw reply [flat|nested] 19+ messages in thread