LKML Archive on lore.kernel.org
* 2.6.20 kernel hang with USB drive and vfat doing ftruncate
@ 2007-02-16 19:54 Kumar Gala
  2007-02-18 16:10 ` OGAWA Hirofumi
  0 siblings, 1 reply; 16+ messages in thread
From: Kumar Gala @ 2007-02-16 19:54 UTC
  To: Linux Kernel list

I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
PPC.  I've got a usb flash drive plugged in and the filesystem on the
drive is vfat.  Running with 64M and no swap.

If I execute a series of large (100M+) ftruncate() on the disk the
kernel will hang and never return.  It seems to be stuck in the idle
loop().

The following is the test program I'm running:

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>

void usage (void)
{
	printf ("truncate_test <filename> <size>\n\n");
}

int main(int argc, char *argv[])
{
	int fd, i;
	int ret = 0;
	unsigned int len;

	if (argc != 3) {
		printf("Invalid number of arguments\n\n");
		usage();
		exit(1);
	}

	fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
	len = strtoul(argv[2], NULL, 0);

	ret = ftruncate(fd, len);

	if (ret)
		printf ("ftruncate ret = %d %d\n", ret, errno);

	close(fd);

	return ret;
}

I usually run the following twice to get the hang state:

time ./trunc_test bar 100000000 &
time ./trunc_test baz 100000000 &

I was wondering if anyone had any suggestions on what to poke at next
to try and figure out what is going on.

- k
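The test program above calls exit() and strtoul() without including <stdlib.h> and does not check the result of open(). A minimal sketch of a stricter variant follows, assuming nothing about the original build setup; the helper names parse_size() and truncate_to() are hypothetical and not part of the posted program.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: parse a byte count (decimal, octal or hex).
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
int parse_size(const char *s, off_t *out)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(s, &end, 0);
	if (errno != 0 || end == s || *end != '\0')
		return -1;
	*out = (off_t)v;
	/* Reject values that do not survive the conversion to off_t. */
	if (*out < 0 || (unsigned long long)*out != v)
		return -1;
	return 0;
}

/*
 * Hypothetical helper: create or truncate path, then extend it to len
 * bytes with ftruncate(). Returns 0 on success or a negative errno.
 */
int truncate_to(const char *path, off_t len)
{
	int fd, ret = 0;

	fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRWXU);
	if (fd < 0)
		return -errno;
	if (ftruncate(fd, len) != 0)
		ret = -errno;
	if (close(fd) != 0 && ret == 0)
		ret = -errno;
	return ret;
}
```

A main() built on these would pass argv[2] to parse_size() and then call truncate_to(argv[1], len), printing the negative errno on failure. The open flags and mode are the same as in the posted program, so the kernel path exercised should be unchanged.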
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-16 19:54 2.6.20 kernel hang with USB drive and vfat doing ftruncate Kumar Gala
@ 2007-02-18 16:10 ` OGAWA Hirofumi
  2007-02-19 21:58   ` Kumar Gala
  2007-02-19 22:06   ` Kumar Gala
  0 siblings, 2 replies; 16+ messages in thread
From: OGAWA Hirofumi @ 2007-02-18 16:10 UTC
  To: Kumar Gala; +Cc: Linux Kernel list

Kumar Gala <galak@kernel.crashing.org> writes:

> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
> PPC.  I've got a usb flash drive plugged in and the filesystem on the
> drive is vfat.  Running with 64M and no swap.
>
> If I execute a series of large (100M+) ftruncate() on the disk the
> kernel will hang and never return.  It seems to be stuck in the idle
> loop().
>
> The following is the test program I'm running:
>
> #include <sys/mman.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <errno.h>
>
> void usage (void)
> {
> 	printf ("truncate_test <filename> <size>\n\n");
> }
>
> int main(int argc, char *argv[])
> {
> 	int fd, i;
> 	int ret = 0;
> 	unsigned int len;
>
> 	if (argc != 3) {
> 		printf("Invalid number of arguments\n\n");
> 		usage();
> 		exit(1);
> 	}
>
> 	fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
> 	len = strtoul(argv[2], NULL, 0);
>
> 	ret = ftruncate(fd, len);
>
> 	if (ret)
> 		printf ("ftruncate ret = %d %d\n", ret, errno);
>
> 	close(fd);
>
> 	return ret;
> }
>
> I usually run the following twice to get the hang state:
>
> time ./trunc_test bar 100000000 &
> time ./trunc_test baz 100000000 &
>
> I was wondering if anyone had any suggestions on what to poke at next
> to try and figure out what is going on.

Can you check /sys/block/xxx/stat or something to make sure there is
no outstanding IO request?

There seems to be no response from the lower layer...
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
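The /sys/block/xxx/stat check suggested above can be sketched as a small parser. The file holds one line of counters, and the ninth field is the number of requests currently in flight; the function name stat_in_flight() is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: extract the in-flight request count (field 9)
 * from one line of /sys/block/<dev>/stat.
 *
 * Field order assumed here: read I/Os, read merges, read sectors,
 * read ticks, write I/Os, write merges, write sectors, write ticks,
 * in_flight, io_ticks, time_in_queue.
 *
 * Returns 0 on success, -1 if fewer than nine fields could be parsed.
 */
int stat_in_flight(const char *line, unsigned long *in_flight)
{
	unsigned long long f[9];
	int n;

	n = sscanf(line, "%llu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &f[0], &f[1], &f[2], &f[3], &f[4],
		   &f[5], &f[6], &f[7], &f[8]);
	if (n != 9)
		return -1;
	*in_flight = (unsigned long)f[8];
	return 0;
}
```

A non-zero in-flight count that never drops while the system appears idle would point at requests stuck below the block layer, which is the condition being asked about.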
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-18 16:10 ` OGAWA Hirofumi
@ 2007-02-19 21:58   ` Kumar Gala
  2007-02-19 22:19     ` OGAWA Hirofumi
  2007-02-19 22:06   ` Kumar Gala
  1 sibling, 1 reply; 16+ messages in thread
From: Kumar Gala @ 2007-02-19 21:58 UTC
  To: OGAWA Hirofumi; +Cc: Linux Kernel list

On Feb 18, 2007, at 10:10 AM, OGAWA Hirofumi wrote:

> Kumar Gala <galak@kernel.crashing.org> writes:
>
>> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
>> PPC.  I've got a usb flash drive plugged in and the filesystem on the
>> drive is vfat.  Running with 64M and no swap.
>>
>> If I execute a series of large (100M+) ftruncate() on the disk the
>> kernel will hang and never return.  It seems to be stuck in the idle
>> loop().
>>
>> The following is the test program I'm running:
>>
>> #include <sys/mman.h>
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <errno.h>
>>
>> void usage (void)
>> {
>> 	printf ("truncate_test <filename> <size>\n\n");
>> }
>>
>> int main(int argc, char *argv[])
>> {
>> 	int fd, i;
>> 	int ret = 0;
>> 	unsigned int len;
>>
>> 	if (argc != 3) {
>> 		printf("Invalid number of arguments\n\n");
>> 		usage();
>> 		exit(1);
>> 	}
>>
>> 	fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
>> 	len = strtoul(argv[2], NULL, 0);
>>
>> 	ret = ftruncate(fd, len);
>>
>> 	if (ret)
>> 		printf ("ftruncate ret = %d %d\n", ret, errno);
>>
>> 	close(fd);
>>
>> 	return ret;
>> }
>>
>> I usually run the following twice to get the hang state:
>>
>> time ./trunc_test bar 100000000 &
>> time ./trunc_test baz 100000000 &
>>
>> I was wondering if anyone had any suggestions on what to poke at next
>> to try and figure out what is going on.
>
> Can you check /sys/block/xxx/stat or something to make sure there is
> no outstanding IO request?
>
> There seems to be no response from the lower layer...

Once the system locks up I don't have any ability to do anything.

- k
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-19 21:58   ` Kumar Gala
@ 2007-02-19 22:19     ` OGAWA Hirofumi
  2007-02-19 22:27       ` Kumar Gala
  0 siblings, 1 reply; 16+ messages in thread
From: OGAWA Hirofumi @ 2007-02-19 22:19 UTC
  To: Kumar Gala; +Cc: Linux Kernel list

Kumar Gala <galak@kernel.crashing.org> writes:

>>> I usually run the following twice to get the hang state:
>>>
>>> time ./trunc_test bar 100000000 &
>>> time ./trunc_test baz 100000000 &
>>>
>>> I was wondering if anyone had any suggestions on what to poke at next
>>> to try and figure out what is going on.
>>
>> Can you check /sys/block/xxx/stat or something to make sure there is
>> no outstanding IO request?
>>
>> There seems to be no response from the lower layer...
>
> Once the system locks up I don't have any ability to do anything.

Ah, doesn't sysrq work either? If sysrq works, it can be used to see
the IO request state with a patch.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-19 22:19     ` OGAWA Hirofumi
@ 2007-02-19 22:27       ` Kumar Gala
  2007-02-20 17:20         ` OGAWA Hirofumi
  0 siblings, 1 reply; 16+ messages in thread
From: Kumar Gala @ 2007-02-19 22:27 UTC
  To: OGAWA Hirofumi; +Cc: Linux Kernel list

On Feb 19, 2007, at 4:19 PM, OGAWA Hirofumi wrote:

> Kumar Gala <galak@kernel.crashing.org> writes:
>
>>>> I usually run the following twice to get the hang state:
>>>>
>>>> time ./trunc_test bar 100000000 &
>>>> time ./trunc_test baz 100000000 &
>>>>
>>>> I was wondering if anyone had any suggestions on what to poke at next
>>>> to try and figure out what is going on.
>>>
>>> Can you check /sys/block/xxx/stat or something to make sure there is
>>> no outstanding IO request?
>>>
>>> There seems to be no response from the lower layer...
>>
>> Once the system locks up I don't have any ability to do anything.
>
> Ah, doesn't sysrq work either? If sysrq works, it can be used to see
> the IO request state with a patch.

Yeah, got sysrq working today.  If you can point me at the patch I'm
happy to apply it and get data.

- k
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-19 22:27       ` Kumar Gala
@ 2007-02-20 17:20         ` OGAWA Hirofumi
  0 siblings, 0 replies; 16+ messages in thread
From: OGAWA Hirofumi @ 2007-02-20 17:20 UTC
  To: Kumar Gala; +Cc: Linux Kernel list

[-- Attachment #1: Type: text/plain, Size: 593 bytes --]

Kumar Gala <galak@kernel.crashing.org> writes:

> On Feb 19, 2007, at 4:19 PM, OGAWA Hirofumi wrote:
>
>> Kumar Gala <galak@kernel.crashing.org> writes:
>>
>>> Once the system locks up I don't have any ability to do anything.
>>
>> Ah, doesn't sysrq work either? If sysrq works, it can be used to see
>> the IO request state with a patch.
>
> Yeah, got sysrq working today.  If you can point me at the patch I'm
> happy to apply it and get data.

OK, please try the attached patch. I hope it helps you.

BTW, the new sysrq key is sysrq-j, and it will show disk stats.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

[-- Attachment #2: debug-block.patch --]
[-- Type: text/x-diff, Size: 2821 bytes --]


Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---

 block/genhd.c        |   27 +++++++++++++++++++++++++++
 drivers/char/sysrq.c |   15 ++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff -puN drivers/char/sysrq.c~debug-block drivers/char/sysrq.c
--- linux-2.6/drivers/char/sysrq.c~debug-block	2007-02-21 00:58:35.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/sysrq.c	2007-02-21 02:02:52.000000000 +0900
@@ -311,6 +311,19 @@ static struct sysrq_key_op sysrq_kill_op
 	.enable_mask	= SYSRQ_ENABLE_SIGNAL,
 };
 
+extern void block_req_callback(struct work_struct *ignored);
+static DECLARE_WORK(block_req_work, block_req_callback);
+static void sysrq_handle_block_req(int key, struct tty_struct *tty)
+{
+	schedule_work(&block_req_work);
+}
+static struct sysrq_key_op sysrq_block_req_op = {
+	.handler	= sysrq_handle_block_req,
+	.help_msg	= "block req (j)",
+	.action_msg	= "Block Req",
+	.enable_mask	= SYSRQ_ENABLE_DUMP,
+};
+
 static void sysrq_handle_unrt(int key, struct tty_struct *tty)
 {
 	normalize_rt_tasks();
@@ -351,7 +364,7 @@ static struct sysrq_key_op *sysrq_key_ta
 	NULL,				/* g */
 	NULL,				/* h */
 	&sysrq_kill_op,			/* i */
-	NULL,				/* j */
+	&sysrq_block_req_op,		/* j */
 	&sysrq_SAK_op,			/* k */
 	NULL,				/* l */
 	&sysrq_showmem_op,		/* m */
diff -puN block/genhd.c~debug-block block/genhd.c
--- linux-2.6/block/genhd.c~debug-block	2007-02-21 01:02:13.000000000 +0900
+++ linux-2.6-hirofumi/block/genhd.c	2007-02-21 02:15:56.000000000 +0900
@@ -555,6 +555,33 @@ static struct kset_uevent_ops block_ueve
 
 decl_subsys(block, &ktype_block, &block_uevent_ops);
 
+void block_req_callback(struct work_struct *ignored)
+{
+	struct gendisk *gp;
+	char buf[BDEVNAME_SIZE];
+
+	mutex_lock(&block_subsys_lock);
+	list_for_each_entry(gp, &block_subsys.kset.list, kobj.entry) {
+		printk("%4d %4d %s %lu %lu %llu %u %lu %lu %llu %u %u %u %u:"
+		       " %u %u %u\n",
+		       gp->major, gp->first_minor, disk_name(gp, 0, buf),
+		       disk_stat_read(gp, ios[0]),
+		       disk_stat_read(gp, merges[0]),
+		       (unsigned long long)disk_stat_read(gp, sectors[0]),
+		       jiffies_to_msecs(disk_stat_read(gp, ticks[0])),
+		       disk_stat_read(gp, ios[1]),
+		       disk_stat_read(gp, merges[1]),
+		       (unsigned long long)disk_stat_read(gp, sectors[1]),
+		       jiffies_to_msecs(disk_stat_read(gp, ticks[1])),
+		       gp->in_flight,
+		       jiffies_to_msecs(disk_stat_read(gp, io_ticks)),
+		       jiffies_to_msecs(disk_stat_read(gp, time_in_queue)),
+		       gp->queue->rq.count[0], gp->queue->rq.count[1],
+		       gp->queue->in_flight);
+	}
+	mutex_unlock(&block_subsys_lock);
+}
+
 /*
  * aggregate disk stat collector.  Uses the same stats that the sysfs
  * entries do, above, but makes them available through one seq_file.
_
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-18 16:10 ` OGAWA Hirofumi
  2007-02-19 21:58   ` Kumar Gala
@ 2007-02-19 22:06   ` Kumar Gala
  2007-02-21 20:18     ` OGAWA Hirofumi
  1 sibling, 1 reply; 16+ messages in thread
From: Kumar Gala @ 2007-02-19 22:06 UTC
  To: Linux Kernel list; +Cc: Andrew Morton, Greg KH

On Feb 18, 2007, at 10:10 AM, OGAWA Hirofumi wrote:

> Kumar Gala <galak@kernel.crashing.org> writes:
>
>> I'm seeing an issue with a stock 2.6.20 kernel running on an embedded
>> PPC.  I've got a usb flash drive plugged in and the filesystem on the
>> drive is vfat.  Running with 64M and no swap.
>>
>> If I execute a series of large (100M+) ftruncate() on the disk the
>> kernel will hang and never return.  It seems to be stuck in the idle
>> loop().
>>
>> The following is the test program I'm running:
>>
>> #include <sys/mman.h>
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <errno.h>
>>
>> void usage (void)
>> {
>> 	printf ("truncate_test <filename> <size>\n\n");
>> }
>>
>> int main(int argc, char *argv[])
>> {
>> 	int fd, i;
>> 	int ret = 0;
>> 	unsigned int len;
>>
>> 	if (argc != 3) {
>> 		printf("Invalid number of arguments\n\n");
>> 		usage();
>> 		exit(1);
>> 	}
>>
>> 	fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, S_IRWXU);
>> 	len = strtoul(argv[2], NULL, 0);
>>
>> 	ret = ftruncate(fd, len);
>>
>> 	if (ret)
>> 		printf ("ftruncate ret = %d %d\n", ret, errno);
>>
>> 	close(fd);
>>
>> 	return ret;
>> }
>>
>> I usually run the following twice to get the hang state:
>>
>> time ./trunc_test bar 100000000 &
>> time ./trunc_test baz 100000000 &
>>
>> I was wondering if anyone had any suggestions on what to poke at next
>> to try and figure out what is going on.

So I realized I could use sysrq to provide some more debug
information.  When the system locks up I get the following output
from 't'

[  496.901002] Show State
[  496.903356]
[  496.903360] free sibling
[  496.911532] task PC stack pid father child younger older
[  496.918486] init S 3009C7EC 0 1 0 2 (NOTLB)
[  496.926169] Call Trace:
[  496.928611] [C3FC7DA0] [C006F03C] __link_path_walk+0xd24/0x112c (unreliable)
[  496.935687] [C3FC7E60] [C00083AC] __switch_to+0x28/0x40
[  496.940931] [C3FC7E80] [C01F4B78] schedule+0x324/0x6bc
[  496.946086] [C3FC7EC0] [C001E164] do_wait+0x700/0x100c
[  496.951242] [C3FC7F40] [C000FAD4] ret_from_syscall+0x0/0x38
[  496.956828] --- Exception: c01 at 0x3009c7ec
[  496.961099] LR = 0x3009c3e0
[  496.964234] ksoftirqd/0 S 00000000 0 2 1 3 (L-TLB)
[  496.971913] Call Trace:
[  496.974355] [C033DE80] [C0133F64] scsi_io_completion+0x74/0x318 (unreliable)
[  496.981428] [C033DF40] [C00083AC] __switch_to+0x28/0x40
[  496.986664] [C033DF60] [C01F4B78] schedule+0x324/0x6bc
[  496.991811] [C033DFA0] [C00210CC] ksoftirqd+0xfc/0x114
[  496.996960] [C033DFC0] [C0033E48] kthread+0xf4/0x130
[  497.001941] [C033DFF0] [C001093C] kernel_thread+0x44/0x60
[  497.007350] events/0 S 00000000 0 3 1 4 2 (L-TLB)
[  497.015030] Call Trace:
[  497.017472] [C033FEE0] [C00083AC] __switch_to+0x28/0x40
[  497.022707] [C033FF00] [C01F4B78] schedule+0x324/0x6bc
[  497.027855] [C033FF40] [C002F67C] worker_thread+0x144/0x148
[  497.033435] [C033FFC0] [C0033E48] kthread+0xf4/0x130
[  497.038409] [C033FFF0] [C001093C] kernel_thread+0x44/0x60
[  497.043817] khelper S 00000000 0 4 1 5 3 (L-TLB)
[  497.051497] Call Trace:
[  497.053940] [C3FE1E20] [C3FE0000] 0xc3fe0000 (unreliable)
[  497.059351] [C3FE1EE0] [C00083AC] __switch_to+0x28/0x40
[  497.064586] [C3FE1F00] [C01F4B78] schedule+0x324/0x6bc
[  497.069734] [C3FE1F40] [C002F67C] worker_thread+0x144/0x148
[  497.075316] [C3FE1FC0] [C0033E48] kthread+0xf4/0x130
[  497.080291] [C3FE1FF0] [C001093C] kernel_thread+0x44/0x60
[  497.085697] kthread S 00000000 0 5 1 37 617 4 (L-TLB)
[  497.093378] Call Trace:
[  497.095820] [C3FCBE20] [00001032] 0x1032 (unreliable)
[  497.100881] --- Exception: c3fcbef0 at __switch_to+0x28/0x40
[  497.106545] LR = 0xc3fcbef0
[  497.109681] [C3FCBEE0] [C00083AC] __switch_to+0x28/0x40 (unreliable)
[  497.116051] [C3FCBF00] [C01F4B78] schedule+0x324/0x6bc
[  497.121201] [C3FCBF40] [C002F67C] worker_thread+0x144/0x148
[  497.126783] [C3FCBFC0] [C0033E48] kthread+0xf4/0x130
[  497.131758] [C3FCBFF0] [C001093C] kernel_thread+0x44/0x60
[  497.137165] kblockd/0 S 00000000 0 37 5 41 (L-TLB)
[  497.144845] Call Trace:
[  497.147286] [C3D9FE20] [C3EBF490] 0xc3ebf490 (unreliable)
[  497.152697] [C3D9FEE0] [C00083AC] __switch_to+0x28/0x40
[  497.157933] [C3D9FF00] [C01F4B78] schedule+0x324/0x6bc
[  497.163082] [C3D9FF40] [C002F67C] worker_thread+0x144/0x148
[  497.168663] [C3D9FFC0] [C0033E48] kthread+0xf4/0x130
[  497.173637] [C3D9FFF0] [C001093C] kernel_thread+0x44/0x60
[  497.179045] khubd S 00000000 0 41 5 53 37 (L-TLB)
[  497.186726] Call Trace:
[  497.189167] [C0341E00] [C3F03900] 0xc3f03900 (unreliable)
[  497.194578] [C0341EC0] [C00083AC] __switch_to+0x28/0x40
[  497.199813] [C0341EE0] [C01F4B78] schedule+0x324/0x6bc
[  497.204961] [C0341F20] [C0152288] hub_thread+0xb40/0xcc0
[  497.210283] [C0341FC0] [C0033E48] kthread+0xf4/0x130
[  497.215257] [C0341FF0] [C001093C] kernel_thread+0x44/0x60
[  497.220664] pdflush D 00000000 0 53 5 55 41 (L-TLB)
[  497.228344] Call Trace:
[  497.230786] [C3CABD10] [C008E098] __find_get_block+0x10c/0x288 (unreliable)
[  497.237769] [C3CABDD0] [C00083AC] __switch_to+0x28/0x40
[  497.243004] [C3CABDF0] [C01F4B78] schedule+0x324/0x6bc
[  497.248152] [C3CABE30] [C01F5D6C] schedule_timeout+0x6c/0xd0
[  497.253822] [C3CABE70] [C01F5C9C] io_schedule_timeout+0x30/0x54
[  497.259752] [C3CABE90] [C0050DE4] congestion_wait+0x64/0x8c
[  497.265343] [C3CABEE0] [C004AA9C] background_writeout+0x44/0xe8
[  497.271274] [C3CABF50] [C004BE04] pdflush+0x16c/0x27c
[  497.276335] [C3CABFC0] [C0033E48] kthread+0xf4/0x130
[  497.281311] [C3CABFF0] [C001093C] kernel_thread+0x44/0x60
[  497.286718] kswapd0 D 00000000 0 55 5 56 53 (L-TLB)
[  497.294399] Call Trace:
[  497.296841] [C3DEFB70] [C0025FE8] run_timer_softirq+0x20/0x230 (unreliable)
[  497.303823] [C3DEFC30] [C00083AC] __switch_to+0x28/0x40
[  497.309058] [C3DEFC50] [C01F4B78] schedule+0x324/0x6bc
[  497.314207] [C3DEFC90] [C01F5D6C] schedule_timeout+0x6c/0xd0
[  497.319876] --- Exception: c3defd60 at 0xc3deff40
[  497.324582] LR = 0xc3dee000
[  497.327718] [C3DEFCD0] [C01F5C9C] io_schedule_timeout+0x30/0x54 (unreliable)
[  497.334783] [C3DEFCF0] [C0050DE4] congestion_wait+0x64/0x8c
[  497.340367] [C3DEFD40] [C004A9F0] throttle_vm_writeout+0x1c/0x84
[  497.346384] [C3DEFD60] [C004F33C] shrink_zone+0xbb0/0xfe4
[  497.351792] [C3DEFF10] [C004FD10] kswapd+0x2d4/0x424
[  497.356767] [C3DEFFC0] [C0033E48] kthread+0xf4/0x130
[  497.361741] [C3DEFFF0] [C001093C] kernel_thread+0x44/0x60
[  497.367151] aio/0 S 00000000 0 56 5 670 55 (L-TLB)
[  497.374832] Call Trace:
[  497.377272] [C3CADE20] [00000020] 0x20 (unreliable)
[  497.382162] [C3CADEE0] [C00083AC] __switch_to+0x28/0x40
[  497.387398] [C3CADF00] [C01F4B78] schedule+0x324/0x6bc
[  497.392547] [C3CADF40] [C002F67C] worker_thread+0x144/0x148
[  497.398128] [C3CADFC0] [C0033E48] kthread+0xf4/0x130
[  497.403102] [C3CADFF0] [C001093C] kernel_thread+0x44/0x60
[  497.408508] mtdblockd S 00000000 0 617 1 718 5 (L-TLB)
[  497.416191] Call Trace:
[  497.418632] [C3F27E70] [C02A0000] 0xc02a0000 (unreliable)
[  497.424043] [C3F27F30] [C00083AC] __switch_to+0x28/0x40
[  497.429278] [C3F27F50] [C01F4B78] schedule+0x324/0x6bc
[  497.434425] [C3F27F90] [C013FAAC] mtd_blktrans_thread+0x250/0x340
[  497.440534] [C3F27FF0] [C001093C] kernel_thread+0x44/0x60
[  497.445941] scsi_eh_0 D 00000000 0 670 5 671 56 (L-TLB)
[  497.453622] Call Trace:
[  497.456062] [C3F1FDF0] [00000011] 0x11 (unreliable)
[  497.460951] [C3F1FEB0] [C00083AC] __switch_to+0x28/0x40
[  497.466187] [C3F1FED0] [C01F4B78] schedule+0x324/0x6bc
[  497.471335] [C3F1FF10] [C01F50D4] wait_for_completion+0xa0/0x150
[  497.477351] [C3F1FF50] [C016641C] command_abort+0xdc/0x118
[  497.482846] [C3F1FF60] [C0132BC0] scsi_error_handler+0x5f0/0x810
[  497.488868] [C3F1FFC0] [C0033E48] kthread+0xf4/0x130
[  497.493842] [C3F1FFF0] [C001093C] kernel_thread+0x44/0x60
[  497.499249] usb-storage D 00000000 0 671 5 773 670 (L-TLB)
[  497.506930] Call Trace:
[  497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40
[  497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc
[  497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0
[  497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54
[  497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c
[  497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84
[  497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4
[  497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc
[  497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0
[  497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694
[  497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc
[  497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c
[  497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8
[  497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138
[  497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310
[  497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344
[  497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command+0x10/0x20
[  497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
[  497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
[  497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60
[  497.625285] sh D 3009C7EC 0 718 1 787 617 (NOTLB)
[  497.632968] Call Trace:
[  497.635410] [C3C37A90] [C01339AC] scsi_run_queue+0x220/0x2e0 (unreliable)
[  497.642216] [C3C37B50] [C00083AC] __switch_to+0x28/0x40
[  497.647452] [C3C37B70] [C01F4B78] schedule+0x324/0x6bc
[  497.652601] [C3C37BB0] [C01F5D6C] schedule_timeout+0x6c/0xd0
[  497.658272] [C3C37BF0] [C01F5C9C] io_schedule_timeout+0x30/0x54
[  497.664202] [C3C37C10] [C0050DE4] congestion_wait+0x64/0x8c
[  497.669786] [C3C37C60] [C004A9F0] throttle_vm_writeout+0x1c/0x84
[  497.675802] [C3C37C80] [C004F33C] shrink_zone+0xbb0/0xfe4
[  497.681211] [C3C37E30] [C004F8F4] try_to_free_pages+0x184/0x2cc
[  497.687141] [C3C37EA0] [C0049AA8] __alloc_pages+0x110/0x2c0
[  497.692723] [C3C37EF0] [C0049C8C] __get_free_pages+0x34/0x74
[  497.698390] [C3C37F00] [C007C3C4] sys_getcwd+0x30/0x2b0
[  497.703629] [C3C37F40] [C000FAD4] ret_from_syscall+0x0/0x38
[  497.709210] --- Exception: c01 at 0x3009c7ec
[  497.713480] LR = 0x3009a7b0
[  497.716614] pdflush D 00000000 0 773 5 671 (L-TLB)
[  497.724294] Call Trace:
[  497.726737] [C32ADD10] [C008E098] __find_get_block+0x10c/0x288 (unreliable)
[  497.733718] [C32ADDD0] [C00083AC] __switch_to+0x28/0x40
[  497.738954] [C32ADDF0] [C01F4B78] schedule+0x324/0x6bc
[  497.744102] [C32ADE30] [C01F5D6C] schedule_timeout+0x6c/0xd0
[  497.749772] [C32ADE70] [C01F5C9C] io_schedule_timeout+0x30/0x54
[  497.755702] [C32ADE90] [C0050DE4] congestion_wait+0x64/0x8c
[  497.761285] [C32ADEE0] [C004AC74] wb_kupdate+0xf0/0x160
[  497.766520] [C32ADF50] [C004BE04] pdflush+0x16c/0x27c
[  497.771581] [C32ADFC0] [C0033E48] kthread+0xf4/0x130
[  497.776556] [C32ADFF0] [C001093C] kernel_thread+0x44/0x60
[  497.781965] time S 3009C7EC 0 787 1 789 788 718 (NOTLB)
[  497.789648] Call Trace:
[  497.792089] [C03A3E60] [C00083AC] __switch_to+0x28/0x40
[  497.797326] [C03A3E80] [C01F4B78] schedule+0x324/0x6bc
[  497.802474] [C03A3EC0] [C001E164] do_wait+0x700/0x100c
[  497.807630] [C03A3F40] [C000FAD4] ret_from_syscall+0x0/0x38
[  497.813210] --- Exception: c01 at 0x3009c7ec
[  497.817481] LR = 0x3009c414
[  497.820616] trunc_test D 300787EC 0 789 787 (NOTLB)
[  497.828295] Call Trace:
[  497.830737] [C19B3960] [C0160000] handshake+0x6c/0x9c (unreliable)
[  497.836939] [C19B3A20] [C00083AC] __switch_to+0x28/0x40
[  497.842175] [C19B3A40] [C01F4B78] schedule+0x324/0x6bc
[  497.847323] [C19B3A80] [C01F5D6C] schedule_timeout+0x6c/0xd0
[  497.852994] [C19B3AC0] [C01F5C9C] io_schedule_timeout+0x30/0x54
[  497.858924] [C19B3AE0] [C0050DE4] congestion_wait+0x64/0x8c
[  497.864510] [C19B3B30] [C004A9F0] throttle_vm_writeout+0x1c/0x84
[  497.870525] [C19B3B50] [C004F33C] shrink_zone+0xbb0/0xfe4
[  497.875934] [C19B3D00] [C004F8F4] try_to_free_pages+0x184/0x2cc
[  497.881864] [C19B3D70] [C0049AA8] __alloc_pages+0x110/0x2c0
[  497.887445] [C19B3DC0] [C00447F4] find_or_create_page+0x8c/0xe4
[  497.893386] [C19B3DE0] [C0090DAC] cont_prepare_write+0xac/0x32c
[  497.899321] [C19B3E20] [C00D7A50] fat_prepare_write+0x30/0x40
[  497.905077] [C19B3E30] [C008E68C] __generic_cont_expand+0xa4/0x158
[  497.911268] [C19B3E50] [C00D7254] fat_notify_change+0xf4/0x208
[  497.917109] [C19B3E80] [C007EB24] notify_change+0x1ec/0x1fc
[  497.922695] [C19B3EB0] [C0062DC0] do_truncate+0x58/0x88
[  497.927935] [C19B3F10] [C006316C] do_sys_ftruncate+0x180/0x1a8
[  497.933780] [C19B3F40] [C000FAD4] ret_from_syscall+0x0/0x38
[  497.939361] --- Exception: c01 at 0x300787ec
[  497.943634] LR = 0x1000073c
[  497.946768] time S 3009C7EC 0 788 1 790 787 (NOTLB)
[  497.954450] Call Trace:
[  497.956892] [C1919E60] [C00083AC] __switch_to+0x28/0x40
[  497.962129] [C1919E80] [C01F4B78] schedule+0x324/0x6bc
[  497.967278] [C1919EC0] [C001E164] do_wait+0x700/0x100c
[  497.972431] [C1919F40] [C000FAD4] ret_from_syscall+0x0/0x38
[  497.978011] --- Exception: c01 at 0x3009c7ec
[  497.982282] LR = 0x3009c414
[  497.985417] trunc_test D 300787EC 0 790 788 (NOTLB)
[  497.993101] Call Trace:
[  497.995542] [C2BFDA00] [C0047E68] mempool_alloc+0x38/0x144 (unreliable)
[  498.002171] [C2BFDAC0] [C00083AC] __switch_to+0x28/0x40
[  498.007406] [C2BFDAE0] [C01F4B78] schedule+0x324/0x6bc
[  498.012554] [C2BFDB20] [C01F5C48] io_schedule+0x30/0x54
[  498.017790] [C2BFDB40] [C008D01C] sync_buffer+0x68/0x7c
[  498.023026] [C2BFDB50] [C01F5E80] __wait_on_bit+0x98/0xec
[  498.028435] [C2BFDB70] [C01F5F34] out_of_line_wait_on_bit+0x60/0x74
[  498.034713] [C2BFDBC0] [C008CF3C] __wait_on_buffer+0x3c/0x4c
[  498.040382] [C2BFDBD0] [C00916F4] __bread+0xe8/0xf4
[  498.045270] [C2BFDBE0] [C00D5C24] fat_ent_bread+0x48/0xa8
[  498.050678] [C2BFDC00] [C00D6358] fat_ent_read+0x168/0x1f0
[  498.056171] [C2BFDC30] [C00D6690] fat_free_clusters+0x64/0x260
[  498.062011] [C2BFDCC0] [C00D75C4] fat_truncate+0x25c/0x334
[  498.067507] [C2BFDD30] [C0053EE4] vmtruncate+0x184/0x1a4
[  498.072833] [C2BFDD50] [C007E810] inode_setattr+0x7c/0x1a4
[  498.078329] [C2BFDD90] [C00D7314] fat_notify_change+0x1b4/0x208
[  498.084257] [C2BFDDC0] [C007EB24] notify_change+0x1ec/0x1fc
[  498.089840] [C2BFDDF0] [C0062DC0] do_truncate+0x58/0x88
[  498.095077] [C2BFDE50] [C007028C] may_open+0x1fc/0x200
[  498.100230] [C2BFDE70] [C0070380] open_namei+0xf0/0x714
[  498.105465] [C2BFDEB0] [C0063BB8] do_filp_open+0x30/0x78
[  498.110788] [C2BFDF20] [C0064018] do_sys_open+0x70/0xc0
[  498.116023] [C2BFDF40] [C000FAD4] ret_from_syscall+0x0/0x38
[  498.121605] --- Exception: c01 at 0x300787ec
[  498.125878] LR = 0x30077580

and from 'm'

[  731.834529] Show Memory
[  731.836968] Mem-info:
[  731.839234] DMA per-cpu:
[  731.841768] CPU 0: Hot: hi: 18, btch: 3 usd: 3 Cold: hi: 6, btch: 1 usd: 2
[  731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330 unstable:0 free:1009 slab:1671 mapped:110 pagetables:19
[  731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB active:6040kB inactive:45236kB present:65024kB pages_scanned:292 all_unreclaimable? no
[  731.874363] lowmem_reserve[]: 0 0
[  731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB
[  731.887669] Free swap: 0kB
[  731.893913] 16384 pages of RAM
[  731.896963] 798 reserved pages
[  731.900011] 10946 pages shared
[  731.903058] 0 pages swap cached

It seems like usb-storage and aio are completely off in the weeds.
Ideas?

If you need any additional debug output let me know.

- k
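The 'm' output can be cross-checked with a little arithmetic: the buddy list sums to the reported free total, and that total sits just under the zone's min watermark, which fits with the tasks stuck in try_to_free_pages(). A minimal sketch, assuming 4 kB pages (16384 pages of RAM on a 64M system); the function name buddy_free_kb() is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: sum a buddy free list as printed by sysrq-m.
 * counts[i] is the number of free blocks of order i, and each block
 * of order i is (4 << i) kB, assuming a 4 kB page size.
 */
unsigned long buddy_free_kb(const unsigned int *counts, size_t orders)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < orders; i++)
		total += (unsigned long)counts[i] * (4UL << i);
	return total;
}
```

With the counts from the dump (1*4kB, 0*8kB, 0*16kB, 0*32kB, 1*64kB, 1*128kB, 1*256kB, 1*512kB, 1*1024kB, 1*2048kB, 0*4096kB) the sum is 4036 kB, matching both "free:1009" pages times 4 kB and "DMA free:4036kB", and it is below "min:4096kB".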
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-19 22:06   ` Kumar Gala
@ 2007-02-21 20:18     ` OGAWA Hirofumi
  2007-02-21 20:57       ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: OGAWA Hirofumi @ 2007-02-21 20:18 UTC
  To: Kumar Gala; +Cc: Linux Kernel list, Andrew Morton, Greg KH

Kumar Gala <galak@kernel.crashing.org> writes:

>>> I usually run the following twice to get the hang state:
>>>
>>> time ./trunc_test bar 100000000 &
>>> time ./trunc_test baz 100000000 &
>>>
>>> I was wondering if anyone had any suggestions on what to poke at next
>>> to try and figure out what is going on.
>
> So I realized I could use sysrq to provide some more debug
> information.  When the system locks up I get the following output
> from 't'
>
> [  497.499249] usb-storage D 00000000 0 671 5 773 670 (L-TLB)
> [  497.506930] Call Trace:
> [  497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40
> [  497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc
> [  497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0
> [  497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54
> [  497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c
> [  497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84
> [  497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4
> [  497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc
> [  497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0
> [  497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694
> [  497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc
> [  497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c
> [  497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8
> [  497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138
> [  497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310
> [  497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344
> [  497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command+0x10/0x20
> [  497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
> [  497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
> [  497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60
> [  497.625285] sh D 3009C7EC 0 718 1

[...]

> and from 'm'
>
> [  731.834529] Show Memory
> [  731.836968] Mem-info:
> [  731.839234] DMA per-cpu:
> [  731.841768] CPU 0: Hot: hi: 18, btch: 3 usd: 3 Cold: hi: 6, btch: 1 usd: 2
> [  731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330 unstable:0 free:1009 slab:1671 mapped:110 pagetables:19
> [  731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB active:6040kB inactive:45236kB present:65024kB pages_scanned:292 all_unreclaimable? no
> [  731.874363] lowmem_reserve[]: 0 0
> [  731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB
> [  731.887669] Free swap: 0kB
> [  731.893913] 16384 pages of RAM
> [  731.896963] 798 reserved pages
> [  731.900011] 10946 pages shared
> [  731.903058] 0 pages swap cached
>
> It seems like usb-storage and aio are completely off in the weeds.
> Ideas?

It seems usb-storage should drop some of its kmalloc calls and use a
mempool for URBs... Is someone working on this? Any ideas?
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
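The mempool idea mentioned above is that a fixed reserve of objects is set aside up front, so an allocation on the I/O path can still succeed when the general allocator cannot make progress. A minimal user-space sketch of that behaviour follows; it is not the kernel mempool API, all names (struct pool, pool_alloc() and so on) are hypothetical, and the allow_malloc flag only simulates memory pressure. Unlike this sketch, a kernel mempool can sleep until an element is returned instead of failing.

```c
#include <stdlib.h>

#define RESERVE 4	/* illustrative reserve size, chosen arbitrarily */

struct pool {
	void *reserve[RESERVE];	/* preallocated elements */
	int nr;			/* elements currently held in reserve */
	size_t size;		/* size of each element in bytes */
};

/* Fill the reserve up front. Returns 0 on success, -1 on failure. */
int pool_init(struct pool *p, size_t size)
{
	p->nr = 0;
	p->size = size;
	while (p->nr < RESERVE) {
		void *e = malloc(size);

		if (!e) {
			while (p->nr > 0)
				free(p->reserve[--p->nr]);
			return -1;
		}
		p->reserve[p->nr++] = e;
	}
	return 0;
}

/*
 * Try the normal allocator first and fall back to the reserve.
 * allow_malloc == 0 simulates an allocator that cannot make progress.
 */
void *pool_alloc(struct pool *p, int allow_malloc)
{
	void *e = allow_malloc ? malloc(p->size) : NULL;

	if (e)
		return e;
	return p->nr > 0 ? p->reserve[--p->nr] : NULL;
}

/* Refill the reserve first; only then hand memory back to malloc. */
void pool_free(struct pool *p, void *e)
{
	if (p->nr < RESERVE)
		p->reserve[p->nr++] = e;
	else
		free(e);
}

/* Release whatever is still held in the reserve. */
void pool_destroy(struct pool *p)
{
	while (p->nr > 0)
		free(p->reserve[--p->nr]);
}
```

The point of refilling the reserve in pool_free() is that completing one request makes the next one possible, which is the forward-progress guarantee a storage driver needs while memory is being reclaimed through it.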
* Re: 2.6.20 kernel hang with USB drive and vfat doing ftruncate
  2007-02-21 20:18     ` OGAWA Hirofumi
@ 2007-02-21 20:57       ` Andrew Morton
  2007-02-21 21:22         ` [linux-usb-devel] " Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2007-02-21 20:57 UTC
  To: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev
  Cc: Kumar Gala, Linux Kernel list, Greg KH

On Thu, 22 Feb 2007 05:18:45 +0900
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> wrote:

> Kumar Gala <galak@kernel.crashing.org> writes:
>
> >>> I usually run the following twice to get the hang state:
> >>>
> >>> time ./trunc_test bar 100000000 &
> >>> time ./trunc_test baz 100000000 &
> >>>
> >>> I was wondering if anyone had any suggestions on what to poke at next
> >>> to try and figure out what is going on.
> >
> > So I realized I could use sysrq to provide some more debug
> > information.  When the system locks up I get the following output
> > from 't'
> >
> > [  497.499249] usb-storage D 00000000 0 671 5 773 670 (L-TLB)
> > [  497.506930] Call Trace:
> > [  497.509372] [C3F35A60] [C00083AC] __switch_to+0x28/0x40
> > [  497.514608] [C3F35A80] [C01F4B78] schedule+0x324/0x6bc
> > [  497.519756] [C3F35AC0] [C01F5D6C] schedule_timeout+0x6c/0xd0
> > [  497.525426] [C3F35B00] [C01F5C9C] io_schedule_timeout+0x30/0x54
> > [  497.531356] [C3F35B20] [C0050DE4] congestion_wait+0x64/0x8c
> > [  497.536941] [C3F35B70] [C004A9F0] throttle_vm_writeout+0x1c/0x84
> > [  497.542958] [C3F35B90] [C004F33C] shrink_zone+0xbb0/0xfe4
> > [  497.548367] [C3F35D40] [C004F8F4] try_to_free_pages+0x184/0x2cc
> > [  497.554298] [C3F35DB0] [C0049AA8] __alloc_pages+0x110/0x2c0
> > [  497.559878] [C3F35E00] [C0060F84] cache_alloc_refill+0x394/0x694
> > [  497.565900] [C3F35E30] [C00614A0] __kmalloc+0xc4/0xcc
> > [  497.570961] [C3F35E40] [C01544D0] usb_alloc_urb+0x1c/0x5c
> > [  497.576371] [C3F35E50] [C015520C] usb_sg_init+0x1a0/0x2f8
> > [  497.581779] [C3F35EA0] [C0167318] usb_stor_bulk_transfer_sg+0x8c/0x138
> > [  497.588317] [C3F35ED0] [C0167960] usb_stor_Bulk_transport+0x140/0x310
> > [  497.594767] [C3F35F00] [C0167DCC] usb_stor_invoke_transport+0x2c/0x344
> > [  497.601303] [C3F35F50] [C0166B2C] usb_stor_transparent_scsi_command+0x10/0x20
> > [  497.608449] [C3F35F60] [C0168498] usb_stor_control_thread+0x1f8/0x290
> > [  497.614900] [C3F35FC0] [C0033E48] kthread+0xf4/0x130
> > [  497.619876] [C3F35FF0] [C001093C] kernel_thread+0x44/0x60
> > [  497.625285] sh D 3009C7EC 0 718 1
>
> [...]
>
> > and from 'm'
> >
> > [  731.834529] Show Memory
> > [  731.836968] Mem-info:
> > [  731.839234] DMA per-cpu:
> > [  731.841768] CPU 0: Hot: hi: 18, btch: 3 usd: 3 Cold: hi: 6, btch: 1 usd: 2
> > [  731.850206] Active:1510 inactive:11309 dirty:7188 writeback:3330 unstable:0 free:1009 slab:1671 mapped:110 pagetables:19
> > [  731.861075] DMA free:4036kB min:4096kB low:5120kB high:6144kB active:6040kB inactive:45236kB present:65024kB pages_scanned:292 all_unreclaimable? no
> > [  731.874363] lowmem_reserve[]: 0 0
> > [  731.877685] DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 4036kB
> > [  731.887669] Free swap: 0kB
> > [  731.893913] 16384 pages of RAM
> > [  731.896963] 798 reserved pages
> > [  731.900011] 10946 pages shared
> > [  731.903058] 0 pages swap cached
> >
> > It seems like usb-storage and aio are completely off in the weeds.
> > Ideas?
>
> It seems usb-storage should drop some of its kmalloc calls and use a
> mempool for URBs... Is someone working on this? Any ideas?

I think Pete said that we're supposed to be using GFP_NOIO in there.

Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO
and GFP_NOFS allocations, which is a bug.  Because if the caller holds
locks which prevent filesystem or IO progress, we deadlock.

I'll fix the VM if someone else fixes USB ;)
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 20:57 ` Andrew Morton @ 2007-02-21 21:22 ` Alan Stern 2007-02-21 21:31 ` Andrew Morton 0 siblings, 1 reply; 16+ messages in thread From: Alan Stern @ 2007-02-21 21:22 UTC (permalink / raw) To: Andrew Morton Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007, Andrew Morton wrote: > > > It seems like usb-storage and aio are completely off in the weeds. > > > Ideas? > > > > It seems usb-storage should remove some kmalloc and use mempool() for > > urb... Is someone working on this? And idea? > > I think Pete said that we're supposed to be using GFP_NOIO in there. We _are_ using it. > Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO > and GFP_NOFS allocations, which is a bug. Because if the caller holds > locks which prevent filesystem or IO progress, we deadlock. > > I'll fix the VM if someone else fixes USB ;) What else needs to be fixed? Alan Stern ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:22 ` [linux-usb-devel] " Alan Stern @ 2007-02-21 21:31 ` Andrew Morton 2007-02-21 21:50 ` Alan Stern 2007-02-22 7:40 ` Kumar Gala 0 siblings, 2 replies; 16+ messages in thread From: Andrew Morton @ 2007-02-21 21:31 UTC (permalink / raw) To: Alan Stern Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007 16:22:17 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > On Wed, 21 Feb 2007, Andrew Morton wrote: > > > > > It seems like usb-storage and aio are completely off in the weeds. > > > > Ideas? > > > > > > It seems usb-storage should remove some kmalloc and use mempool() for > > > urb... Is someone working on this? And idea? > > > > I think Pete said that we're supposed to be using GFP_NOIO in there. > > We _are_ using it. How admirably prompt. > > Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO > > and GFP_NOFS allocations, which is a bug. Because if the caller holds > > locks which prevent filesystem or IO progress, we deadlock. > > > > I'll fix the VM if someone else fixes USB ;) > > What else needs to be fixed? Would be nice if someone can confirm that this fixes it: From: Andrew Morton <akpm@linux-foundation.org> throttle_vm_writeout() is designed to wait for the dirty levels to subside. But if the caller holds IO or FS locks, we might be holding up that writeout. So change it to take a single nap to give other devices a chance to clean some memory, then return. 
Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- include/linux/writeback.h | 2 +- mm/page-writeback.c | 13 +++++++++++-- mm/vmscan.c | 2 +- 3 files changed, 13 insertions(+), 4 deletions(-) diff -puN mm/vmscan.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations mm/vmscan.c --- a/mm/vmscan.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations +++ a/mm/vmscan.c @@ -952,7 +952,7 @@ static unsigned long shrink_zone(int pri } } - throttle_vm_writeout(); + throttle_vm_writeout(sc->gfp_mask); atomic_dec(&zone->reclaim_in_progress); return nr_reclaimed; diff -puN mm/page-writeback.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations mm/page-writeback.c --- a/mm/page-writeback.c~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations +++ a/mm/page-writeback.c @@ -296,11 +296,21 @@ void balance_dirty_pages_ratelimited_nr( } EXPORT_SYMBOL(balance_dirty_pages_ratelimited_nr); -void throttle_vm_writeout(void) +void throttle_vm_writeout(gfp_t gfp_mask) { long background_thresh; long dirty_thresh; + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { + /* + * The caller might hold locks which can prevert IO completion + * or progress in the filesystem. So we cannot just sit here + * waiting for IO to complete. + */ + congestion_wait(WRITE, HZ/10); + return; + } + for ( ; ; ) { get_dirty_limits(&background_thresh, &dirty_thresh, NULL); @@ -317,7 +327,6 @@ void throttle_vm_writeout(void) } } - /* * writeback at least _min_pages, and keep writing until the amount of dirty * memory is less than the background threshold, or until we're all clean. 
diff -puN include/linux/writeback.h~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations include/linux/writeback.h --- a/include/linux/writeback.h~throttle_vm_writeout-dont-loop-on-gfp_nofs-and-gfp_noio-allocations +++ a/include/linux/writeback.h @@ -84,7 +84,7 @@ static inline void wait_on_inode(struct int wakeup_pdflush(long nr_pages); void laptop_io_completion(void); void laptop_sync_completion(void); -void throttle_vm_writeout(void); +void throttle_vm_writeout(gfp_t gfp_mask); /* These are exported to sysctl. */ extern int dirty_background_ratio; _ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:31 ` Andrew Morton @ 2007-02-21 21:50 ` Alan Stern 2007-02-21 22:54 ` Andrew Morton 2007-02-22 7:40 ` Kumar Gala 1 sibling, 1 reply; 16+ messages in thread From: Alan Stern @ 2007-02-21 21:50 UTC (permalink / raw) To: Andrew Morton Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007, Andrew Morton wrote: > On Wed, 21 Feb 2007 16:22:17 -0500 (EST) > Alan Stern <stern@rowland.harvard.edu> wrote: > > > On Wed, 21 Feb 2007, Andrew Morton wrote: > > > > > > > It seems like usb-storage and aio are completely off in the weeds. > > > > > Ideas? > > > > > > > > It seems usb-storage should remove some kmalloc and use mempool() for > > > > urb... Is someone working on this? And idea? > > > > > > I think Pete said that we're supposed to be using GFP_NOIO in there. > > > > We _are_ using it. > > How admirably prompt. Shucks, we've been using it for years... > > > Not that it'll help much: the VM calls throttle_vm_writeout() for GFP_NOIO > > > and GFP_NOFS allocations, which is a bug. Because if the caller holds > > > locks which prevent filesystem or IO progress, we deadlock. > > > > > > I'll fix the VM if someone else fixes USB ;) > > > > What else needs to be fixed? > > Would be nice if someone can confirm that this fixes it: Not having experienced the problem, I can't confirm the fix. However... > + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { Is that really the correct test? I don't know enough about the memory management subsystem to say one way or the other. What's special about having both flags set? > + /* > + * The caller might hold locks which can prevert IO completion --------------------------------------------------------------^ Typo Although perhaps "prevert" is an acceptable neologism in this context. > + * or progress in the filesystem. 
So we cannot just sit here > + * waiting for IO to complete. > + */ Alan Stern ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:50 ` Alan Stern @ 2007-02-21 22:54 ` Andrew Morton 0 siblings, 0 replies; 16+ messages in thread From: Andrew Morton @ 2007-02-21 22:54 UTC (permalink / raw) To: Alan Stern Cc: OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Kumar Gala, Linux Kernel list On Wed, 21 Feb 2007 16:50:23 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > > + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { > > Is that really the correct test? I don't know enough about the memory > management subsystem to say one way or the other. What's special about > having both flags set? yup. We're saying "if the caller is unable to take either IO locks or FS locks, don't wait on FS or IO completion". ie: don't wait on writeout progress unless we know that both the IO system and the FS are able to make progress. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-21 21:31 ` Andrew Morton 2007-02-21 21:50 ` Alan Stern @ 2007-02-22 7:40 ` Kumar Gala 2007-02-22 18:20 ` Kumar Gala 1 sibling, 1 reply; 16+ messages in thread From: Kumar Gala @ 2007-02-22 7:40 UTC (permalink / raw) To: Andrew Morton Cc: Alan Stern, OGAWA Hirofumi, linux-usb-devel, Pete Zaitcev, Greg KH, Linux Kernel list On Feb 21, 2007, at 3:31 PM, Andrew Morton wrote: > On Wed, 21 Feb 2007 16:22:17 -0500 (EST) > Alan Stern <stern@rowland.harvard.edu> wrote: > >> On Wed, 21 Feb 2007, Andrew Morton wrote: >> >>>>> It seems like usb-storage and aio are completely off in the weeds. >>>>> Ideas? >>>> >>>> It seems usb-storage should remove some kmalloc and use mempool >>>> () for >>>> urb... Is someone working on this? And idea? >>> >>> I think Pete said that we're supposed to be using GFP_NOIO in there. >> >> We _are_ using it. > > How admirably prompt. > >>> Not that it'll help much: the VM calls throttle_vm_writeout() for >>> GFP_NOIO >>> and GFP_NOFS allocations, which is a bug. Because if the caller >>> holds >>> locks which prevent filesystem or IO progress, we deadlock. >>> >>> I'll fix the VM if someone else fixes USB ;) >> >> What else needs to be fixed? > > Would be nice if someone can confirm that this fixes it: Doesn't seem to help my problem in a quick test, will get more data in the morning. - k ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-22 7:40 ` Kumar Gala @ 2007-02-22 18:20 ` Kumar Gala 2007-02-22 21:57 ` Andrew Morton 0 siblings, 1 reply; 16+ messages in thread From: Kumar Gala @ 2007-02-22 18:20 UTC (permalink / raw) To: Andrew Morton Cc: Alan Stern, OGAWA Hirofumi, USB development list, Pete Zaitcev, Greg KH, Linux Kernel list >>>> Not that it'll help much: the VM calls throttle_vm_writeout() >>>> for GFP_NOIO >>>> and GFP_NOFS allocations, which is a bug. Because if the caller >>>> holds >>>> locks which prevent filesystem or IO progress, we deadlock. >>>> >>>> I'll fix the VM if someone else fixes USB ;) >>> >>> What else needs to be fixed? >> >> Would be nice if someone can confirm that this fixes it: > > Doesn't seem to help my problem in a quick test, will get more data > in the morning. Well, I didn't realize the patch you sent via mm-commits and the one here are actually different. I noticed that mm-commits one has: + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != __GFP_FS|__GFP_IO) { vs + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { The second seems to make more sense. I tested with the first last night which didn't help. With the proper patch in place things look good. Is this a candidate for 2.6.20-stable? - k ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-usb-devel] 2.6.20 kernel hang with USB drive and vfat doing ftruncate 2007-02-22 18:20 ` Kumar Gala @ 2007-02-22 21:57 ` Andrew Morton 0 siblings, 0 replies; 16+ messages in thread From: Andrew Morton @ 2007-02-22 21:57 UTC (permalink / raw) To: Kumar Gala Cc: stern, hirofumi, linux-usb-devel, zaitcev, gregkh, linux-kernel > On Thu, 22 Feb 2007 12:20:06 -0600 Kumar Gala <galak@kernel.crashing.org> wrote: > + if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { > > The second seems to make more sense. I tested with the first last > night which didn't help. > > With the proper patch in place things look good. Is this a candidate > for 2.6.20-stable? I suppose so, yes. ^ permalink raw reply [flat|nested] 16+ messages in thread