LKML Archive on lore.kernel.org help / color / mirror / Atom feed
* INFO: rcu detected stall in n_tty_receive_char_special @ 2018-04-06 7:12 syzbot 2018-04-06 7:13 ` Dmitry Vyukov 2018-04-15 13:43 ` syzbot 0 siblings, 2 replies; 8+ messages in thread From: syzbot @ 2018-04-06 7:12 UTC (permalink / raw) To: gregkh, jslaby, linux-kernel, syzkaller-bugs Hello, syzbot hit the following crash on upstream commit 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 (Sat Mar 31 01:52:36 2018 +0000) kernel.h: Retain constant expression output for max()/min() syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=18df353d7540aa6b5467 Unfortunately, I don't have any reproducer for this crash yet. Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5836679554269184 Kernel config: https://syzkaller.appspot.com/x/.config?id=-1647968177339044852 compiler: gcc (GCC) 8.0.1 20180301 (experimental) IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+18df353d7540aa6b5467@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. See footer for details. If you forward the report, please keep this part and the footer. INFO: rcu_sched detected stalls on CPUs/tasks: (detected by 1, t=125007 jiffies, g=42488, c=42487, q=11) All QSes seen, last rcu_sched kthread activity 125014 (4295022441-4294897427), jiffies_till_next_fqs=3, root ->qsmask 0x0 kworker/u4:5 R running task 15272 8806 2 0x80000008 Workqueue: events_unbound flush_to_ldisc Call Trace: <IRQ> sched_show_task.cold.87+0x27a/0x301 kernel/sched/core.c:5325 print_other_cpu_stall.cold.79+0x92f/0x9d2 kernel/rcu/tree.c:1481 check_cpu_stall.isra.61+0x706/0xf50 kernel/rcu/tree.c:1599 __rcu_pending kernel/rcu/tree.c:3356 [inline] rcu_pending kernel/rcu/tree.c:3401 [inline] rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763 update_process_times+0x2d/0x70 kernel/time/timer.c:1636 tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:171 tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1179 __run_hrtimer kernel/time/hrtimer.c:1337 [inline] __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1399 hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1457 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline] smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862 </IRQ> RIP: 0010:echo_char+0xae/0x2e0 drivers/tty/n_tty.c:915 RSP: 0018:ffff8801d33e71e0 EFLAGS: 00000a07 ORIG_RAX: ffffffffffffff13 RAX: dffffc0000000000 RBX: ffffc90013158000 RCX: ffffffff8375b1b7 RDX: 1ffff1003ad87636 RSI: ffffffff8375b1c6 RDI: ffff8801d6c3b1b4 RBP: ffff8801d33e7210 R08: ffff8801cf482540 R09: fffff5200262b460 R10: fffff5200262b460 R11: ffffc9001315a307 R12: 00000000000000cb R13: ffff8801d6c3ae00 R14: 00000000c240f0bb R15: 00000000000000bb n_tty_receive_char_special+0x13b3/0x31c0 drivers/tty/n_tty.c:1306 n_tty_receive_buf_fast drivers/tty/n_tty.c:1577 [inline] __receive_buf drivers/tty/n_tty.c:1611 [inline] n_tty_receive_buf_common+0x20ca/0x2c50 drivers/tty/n_tty.c:1709 n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744 tty_ldisc_receive_buf+0xb0/0x190 drivers/tty/tty_buffer.c:456 tty_port_default_receive_buf+0x110/0x170 drivers/tty/tty_port.c:38 receive_buf drivers/tty/tty_buffer.c:475 [inline] flush_to_ldisc+0x3e9/0x560 drivers/tty/tty_buffer.c:524 process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145 worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279 kthread+0x345/0x410 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411 rcu_sched kthread starved for 125626 jiffies! g42488 c42487 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0 RCU grace-period kthread stack dump: rcu_sched R running task 23592 9 2 0x80000000 Call Trace: context_switch kernel/sched/core.c:2848 [inline] __schedule+0x807/0x1e40 kernel/sched/core.c:3490 schedule+0xef/0x430 kernel/sched/core.c:3549 schedule_timeout+0x138/0x240 kernel/time/timer.c:1801 rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231 kthread+0x345/0x410 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411 --- This bug is generated by a dumb bot. It may contain errors. See https://goo.gl/tpsmEJ for details. Direct all questions to syzkaller@googlegroups.com. syzbot will keep track of this bug report. If you forgot to add the Reported-by tag, once the fix for this bug is merged into any tree, please reply to this email with: #syz fix: exact-commit-title To mark this as a duplicate of another syzbot report, please reply with: #syz dup: exact-subject-of-another-report If it's a one-off invalid bug report, please reply with: #syz invalid Note: if the crash happens again, it will cause creation of a new bug report. Note: all commands must start from beginning of the line in the email body. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: rcu detected stall in n_tty_receive_char_special 2018-04-06 7:12 INFO: rcu detected stall in n_tty_receive_char_special syzbot @ 2018-04-06 7:13 ` Dmitry Vyukov 2018-04-15 13:43 ` syzbot 1 sibling, 0 replies; 8+ messages in thread From: Dmitry Vyukov @ 2018-04-06 7:13 UTC (permalink / raw) To: syzbot; +Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, syzkaller-bugs On Fri, Apr 6, 2018 at 9:12 AM, syzbot <syzbot+18df353d7540aa6b5467@syzkaller.appspotmail.com> wrote: > Hello, > > syzbot hit the following crash on upstream commit > 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 (Sat Mar 31 01:52:36 2018 +0000) > kernel.h: Retain constant expression output for max()/min() > syzbot dashboard link: > https://syzkaller.appspot.com/bug?extid=18df353d7540aa6b5467 > > Unfortunately, I don't have any reproducer for this crash yet. > Raw console output: > https://syzkaller.appspot.com/x/log.txt?id=5836679554269184 > Kernel config: > https://syzkaller.appspot.com/x/.config?id=-1647968177339044852 > compiler: gcc (GCC) 8.0.1 20180301 (experimental) > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+18df353d7540aa6b5467@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. See footer for > details. > If you forward the report, please keep this part and the footer. This looks somewhat similar to "INFO: rcu detected stall in __process_echoes": https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 But I am not sure because stall stacks are somewhat different. > INFO: rcu_sched detected stalls on CPUs/tasks: > (detected by 1, t=125007 jiffies, g=42488, c=42487, q=11) > All QSes seen, last rcu_sched kthread activity 125014 > (4295022441-4294897427), jiffies_till_next_fqs=3, root ->qsmask 0x0 > kworker/u4:5 R running task 15272 8806 2 0x80000008 > Workqueue: events_unbound flush_to_ldisc > Call Trace: > <IRQ> > sched_show_task.cold.87+0x27a/0x301 kernel/sched/core.c:5325 > print_other_cpu_stall.cold.79+0x92f/0x9d2 kernel/rcu/tree.c:1481 > check_cpu_stall.isra.61+0x706/0xf50 kernel/rcu/tree.c:1599 > __rcu_pending kernel/rcu/tree.c:3356 [inline] > rcu_pending kernel/rcu/tree.c:3401 [inline] > rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763 > update_process_times+0x2d/0x70 kernel/time/timer.c:1636 > tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:171 > tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1179 > __run_hrtimer kernel/time/hrtimer.c:1337 [inline] > __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1399 > hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1457 > local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline] > smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050 > apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862 > </IRQ> > RIP: 0010:echo_char+0xae/0x2e0 drivers/tty/n_tty.c:915 > RSP: 0018:ffff8801d33e71e0 EFLAGS: 00000a07 ORIG_RAX: ffffffffffffff13 > RAX: dffffc0000000000 RBX: ffffc90013158000 RCX: ffffffff8375b1b7 > RDX: 1ffff1003ad87636 RSI: ffffffff8375b1c6 RDI: ffff8801d6c3b1b4 > RBP: ffff8801d33e7210 R08: ffff8801cf482540 R09: fffff5200262b460 > R10: fffff5200262b460 R11: ffffc9001315a307 R12: 00000000000000cb > R13: ffff8801d6c3ae00 R14: 00000000c240f0bb R15: 00000000000000bb > n_tty_receive_char_special+0x13b3/0x31c0 drivers/tty/n_tty.c:1306 > n_tty_receive_buf_fast drivers/tty/n_tty.c:1577 [inline] > __receive_buf drivers/tty/n_tty.c:1611 [inline] > n_tty_receive_buf_common+0x20ca/0x2c50 drivers/tty/n_tty.c:1709 > n_tty_receive_buf2+0x33/0x40 drivers/tty/n_tty.c:1744 > tty_ldisc_receive_buf+0xb0/0x190 drivers/tty/tty_buffer.c:456 > tty_port_default_receive_buf+0x110/0x170 drivers/tty/tty_port.c:38 > receive_buf drivers/tty/tty_buffer.c:475 [inline] > flush_to_ldisc+0x3e9/0x560 drivers/tty/tty_buffer.c:524 > process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145 > worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279 > kthread+0x345/0x410 kernel/kthread.c:238 > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411 > rcu_sched kthread starved for 125626 jiffies! g42488 c42487 f0x2 > RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0 > RCU grace-period kthread stack dump: > rcu_sched R running task 23592 9 2 0x80000000 > Call Trace: > context_switch kernel/sched/core.c:2848 [inline] > __schedule+0x807/0x1e40 kernel/sched/core.c:3490 > schedule+0xef/0x430 kernel/sched/core.c:3549 > schedule_timeout+0x138/0x240 kernel/time/timer.c:1801 > rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231 > kthread+0x345/0x410 kernel/kthread.c:238 > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411 > > > --- > This bug is generated by a dumb bot. It may contain errors. > See https://goo.gl/tpsmEJ for details. > Direct all questions to syzkaller@googlegroups.com. > > syzbot will keep track of this bug report. > If you forgot to add the Reported-by tag, once the fix for this bug is > merged > into any tree, please reply to this email with: > #syz fix: exact-commit-title > To mark this as a duplicate of another syzbot report, please reply with: > #syz dup: exact-subject-of-another-report > If it's a one-off invalid bug report, please reply with: > #syz invalid > Note: if the crash happens again, it will cause creation of a new bug > report. > Note: all commands must start from beginning of the line in the email body. > > -- > You received this message because you are subscribed to the Google Groups > "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to syzkaller-bugs+unsubscribe@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/syzkaller-bugs/883d24f7d4ccc52e19056928c5be%40google.com. > For more options, visit https://groups.google.com/d/optout. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: rcu detected stall in n_tty_receive_char_special 2018-04-06 7:12 INFO: rcu detected stall in n_tty_receive_char_special syzbot 2018-04-06 7:13 ` Dmitry Vyukov @ 2018-04-15 13:43 ` syzbot 2018-04-21 15:34 ` Tetsuo Handa 1 sibling, 1 reply; 8+ messages in thread From: syzbot @ 2018-04-15 13:43 UTC (permalink / raw) To: dvyukov, gregkh, jslaby, linux-kernel, syzkaller-bugs syzbot has found reproducer for the following crash on upstream commit 18b7fd1c93e5204355ddbf2608a097d64df81b88 (Sat Apr 14 15:50:50 2018 +0000) Merge branch 'akpm' (patches from Andrew) syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=18df353d7540aa6b5467 So far this crash happened 2 times on upstream. C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5283196245639168 syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=5978502260064256 Raw console output: https://syzkaller.appspot.com/x/log.txt?id=4993708470566912 Kernel config: https://syzkaller.appspot.com/x/.config?id=-8852471259444315113 compiler: gcc (GCC) 8.0.1 20180413 (experimental) IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+18df353d7540aa6b5467@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. INFO: rcu_sched self-detected stall on CPU 1-....: (124999 ticks this GP) idle=792/1/4611686018427387906 softirq=9920/9920 fqs=31243 (t=125000 jiffies g=4830 c=4829 q=20) NMI backtrace for cpu 1 CPU: 1 PID: 4467 Comm: syzkaller756606 Not tainted 4.16.0+ #3 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103 nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline] rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376 print_cpu_stall kernel/rcu/tree.c:1525 [inline] check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593 __rcu_pending kernel/rcu/tree.c:3356 [inline] rcu_pending kernel/rcu/tree.c:3401 [inline] rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763 update_process_times+0x2d/0x70 kernel/time/timer.c:1636 tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:173 tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1283 __run_hrtimer kernel/time/hrtimer.c:1386 [inline] __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1448 hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1506 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline] smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862 </IRQ> RIP: 0010:echo_char+0x96/0x2e0 drivers/tty/n_tty.c:915 RSP: 0018:ffff8801b079f6f8 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 RAX: ffff8801b06ee1c0 RBX: ffffc90001e64000 RCX: ffffffff837ab4a7 RDX: 0000000000000000 RSI: ffffffff837ab4b6 RDI: 0000000000000001 RBP: ffff8801b079f728 R08: ffff8801b06ee1c0 R09: 0000000000000001 R10: ffff8801b06eea20 R11: ffff8801b06ee1c0 R12: 000000000000000f R13: ffff8801b02b0dc0 R14: 000000012b1bcc47 R15: 0000000000000c47 n_tty_receive_char_special+0x13b3/0x31c0 drivers/tty/n_tty.c:1306 n_tty_receive_buf_fast drivers/tty/n_tty.c:1577 [inline] __receive_buf drivers/tty/n_tty.c:1611 [inline] n_tty_receive_buf_common+0x20ca/0x2c50 drivers/tty/n_tty.c:1709 n_tty_receive_buf+0x30/0x40 drivers/tty/n_tty.c:1738 tiocsti drivers/tty/tty_io.c:2171 [inline] tty_ioctl+0x7e7/0x1870 drivers/tty/tty_io.c:2557 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16a0 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 SYSC_ioctl fs/ioctl.c:708 [inline] SyS_ioctl+0x24/0x30 fs/ioctl.c:706 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x4404b9 RSP: 002b:00007ffee94bd7f8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00000000200006c0 RCX: 00000000004404b9 RDX: 00000000200000c0 RSI: 0000000000005412 RDI: 0000000000000012 RBP: 6d74702f7665642f R08: 000000000000000a R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401c30 R13: 0000000000401cc0 R14: 0000000000000000 R15: 0000000000000000 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: rcu detected stall in n_tty_receive_char_special 2018-04-15 13:43 ` syzbot @ 2018-04-21 15:34 ` Tetsuo Handa 2018-04-23 16:01 ` Tetsuo Handa 0 siblings, 1 reply; 8+ messages in thread From: Tetsuo Handa @ 2018-04-21 15:34 UTC (permalink / raw) To: dvyukov, gregkh, jslaby; +Cc: syzbot, linux-kernel, syzkaller-bugs Dmitry Vyukov wrote: > This looks somewhat similar to "INFO: rcu detected stall in __process_echoes": > https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 > But I am not sure because stall stacks are somewhat different. Yes, this will be the same one which can be reproduced using same reproducer provided by syzbot. You can try below one like "while :; do echo -n .; ./a.out; done". ---------------------------------------- #include <fcntl.h> #include <stdio.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <unistd.h> int main(int argc, char *argv[]) { int fd[2]; unsigned int zero = 0; char buf[128]; int ptyno = 0; char c = 0; char data[64] = { }; while (1) { fd[0] = open("/dev/ptmx", O_RDWR); ioctl(fd[0], TIOCSPTLCK, &zero); write(fd[0], data, sizeof(data)); if (ioctl(fd[0], TIOCGPTN, &ptyno)) break; sprintf(buf, "/dev/pts/%d", ptyno); fd[1] = open(buf, O_RDONLY); if (fd[1] == EOF) break; ioctl(fd[1], TIOCSTI, &c); } return 1; } ---------------------------------------- Since tty subsystem is prone to race condition, same reproducer can hit various different results. [ 524.349009] a.out R running task 0 4052 1 0x80000084 [ 524.350879] Call Trace: [ 524.351816] <IRQ> [ 524.352672] sched_show_task+0x16b/0x230 [ 524.353864] show_state_filter+0x9d/0x190 [ 524.355098] sysrq_handle_showstate+0x7/0x10 [ 524.356342] __handle_sysrq+0x123/0x200 [ 524.357566] sysrq_filter+0x356/0x390 [ 524.358710] input_to_handler+0x4b/0xf0 [ 524.359911] input_pass_values.part.2+0x1ba/0x260 [ 524.361267] input_handle_event+0x11a/0x4f0 [ 524.362504] input_event+0x45/0x60 [ 524.363601] atkbd_interrupt+0x41a/0x6f0 [ 524.364815] serio_interrupt+0x3b/0x80 [ 524.365975] i8042_interrupt+0x1e8/0x3a0 [ 524.367197] ? _raw_spin_unlock_irq+0x24/0x40 [ 524.368456] __handle_irq_event_percpu+0x5b/0xc0 [ 524.369797] handle_irq_event_percpu+0x1b/0x50 [ 524.371102] handle_irq_event+0x2f/0x50 [ 524.372245] handle_edge_irq+0xcc/0x1a0 [ 524.373433] handle_irq+0x11/0x20 [ 524.374519] do_IRQ+0x52/0x100 [ 524.375519] common_interrupt+0xf/0xf [ 524.376648] </IRQ> [ 524.377473] RIP: 0010:n_tty_receive_char_special+0x4d1/0xad0 [ 524.379032] RSP: 0018:ffffc90006d2bd08 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffd2 [ 524.380957] RAX: ffffc90010f7d000 RBX: ffffc90010f7d000 RCX: 00000011c8d443f0 [ 524.382783] RDX: 00000000000003ef RSI: ffff880126ce3800 RDI: 0000000000000000 [ 524.384644] RBP: ffff880126ce3800 R08: 0000000000000012 R09: 0000000000000000 [ 524.386479] R10: 0000000000000000 R11: ffffffff814184f3 R12: 0000000000000012 [ 524.388300] R13: ffff880126ce3800 R14: 0000000a13a9e288 R15: 0000000000000012 [ 524.390158] ? n_tty_receive_buf_common+0x43/0xd00 [ 524.391567] ? n_tty_receive_char_special+0x4ea/0xad0 [ 524.393052] ? n_tty_receive_buf_common+0x43/0xd00 [ 524.394464] n_tty_receive_buf_common+0xa43/0xd00 [ 524.395832] tty_ioctl+0x3e7/0x910 [ 524.397041] ? trace_hardirqs_on_caller+0xe7/0x180 [ 524.398512] do_vfs_ioctl+0x8a/0x650 [ 524.399666] ? syscall_trace_enter+0x19e/0x250 [ 524.400977] ksys_ioctl+0x6b/0x80 [ 524.402102] __x64_sys_ioctl+0x11/0x20 [ 524.403327] do_syscall_64+0x4f/0x1f0 [ 524.404494] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 524.660246] 7 locks held by a.out/4052: [ 524.661533] #0: 00000000eb1b6dd2 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x1f/0x50 [ 524.663752] #1: 000000005920aabb (&o_tty->termios_rwsem/1){++++}, at: n_tty_receive_buf_common+0x43/0xd00 [ 524.666284] #2: 00000000fc85438a (&serio->lock){-.-.}, at: serio_interrupt+0x1f/0x80 [ 524.668422] #3: 00000000806da945 (&(&dev->event_lock)->rlock){-.-.}, at: input_event+0x32/0x60 [ 524.670712] #4: 000000005906d89a (rcu_read_lock){....}, at: input_pass_values.part.2+0x0/0x260 [ 524.673017] #5: 000000005906d89a (rcu_read_lock){....}, at: __handle_sysrq+0x0/0x200 [ 524.675144] #6: 00000000ed0ed6ec (tasklist_lock){.?.+}, at: debug_show_all_locks+0x37/0x190 [ 122.170234] a.out R running task 0 2818 1 0x80000084 [ 122.172442] Call Trace: [ 122.173402] <IRQ> [ 122.174271] sched_show_task+0x16b/0x230 [ 122.175512] show_state_filter+0x9d/0x190 [ 122.176693] sysrq_handle_showstate+0x7/0x10 [ 122.177924] __handle_sysrq+0x123/0x200 [ 122.179122] sysrq_filter+0x356/0x390 [ 122.180237] input_to_handler+0x4b/0xf0 [ 122.181385] input_pass_values.part.2+0x1ba/0x260 [ 122.182769] input_handle_event+0x11a/0x4f0 [ 122.184131] input_event+0x45/0x60 [ 122.185191] atkbd_interrupt+0x41a/0x6f0 [ 122.186350] serio_interrupt+0x3b/0x80 [ 122.187502] i8042_interrupt+0x1e8/0x3a0 [ 122.188668] __handle_irq_event_percpu+0x5b/0xc0 [ 122.189966] handle_irq_event_percpu+0x1b/0x50 [ 122.191384] handle_irq_event+0x2f/0x50 [ 122.192522] handle_edge_irq+0xcc/0x1a0 [ 122.193679] handle_irq+0x11/0x20 [ 122.194754] do_IRQ+0x52/0x100 [ 122.195764] common_interrupt+0xf/0xf [ 122.196875] RIP: 0010:__do_softirq+0xa0/0x233 [ 122.198102] RSP: 0018:ffff88013fc03f68 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffd3 [ 122.200212] RAX: ffff8801379bad80 RBX: 00000000ffffffff RCX: 0000000000000000 [ 122.202019] RDX: 0000000000000d84 RSI: ffff8801379bb5b8 RDI: ffff8801379bad80 [ 122.203886] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 122.205670] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 122.207536] R13: 0000000000000000 R14: 0000000000000002 R15: 0000000000000000 [ 122.209307] ? common_interrupt+0xa/0xf [ 122.210533] ? __do_softirq+0x9a/0x233 [ 122.211711] ? clockevents_program_event+0x76/0xe0 [ 122.213047] irq_exit+0xdc/0xe0 [ 122.214156] smp_apic_timer_interrupt+0x4d/0x80 [ 122.215446] apic_timer_interrupt+0xf/0x20 [ 122.216671] </IRQ> [ 122.217497] RIP: 0010:echo_char+0x29/0x90 [ 122.218744] RSP: 0018:ffffc90006787d00 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 [ 122.220616] RAX: ffffc90006959000 RBX: ffffc90006959000 RCX: 000000019b8c0c19 [ 122.222449] RDX: 0000000000000000 RSI: ffff880138ee1800 RDI: 0000000000000000 [ 122.224247] RBP: ffff880138ee1800 R08: 0000000000000012 R09: 0000000000000000 [ 122.226116] R10: 0000000000000000 R11: ffffffff814184f3 R12: 0000000000000012 [ 122.227921] R13: ffff880138ee1800 R14: 00000000d59f6933 R15: 0000000000000012 [ 122.229862] ? n_tty_receive_buf_common+0x43/0xd00 [ 122.231235] n_tty_receive_char_special+0x4ea/0xad0 [ 122.232622] ? n_tty_receive_buf_common+0x43/0xd00 [ 122.234084] n_tty_receive_buf_common+0xa43/0xd00 [ 122.235449] tty_ioctl+0x3e7/0x910 [ 122.236565] ? trace_hardirqs_on_caller+0xe7/0x180 [ 122.237993] do_vfs_ioctl+0x8a/0x650 [ 122.239157] ? syscall_trace_enter+0x19e/0x250 [ 122.240467] ksys_ioctl+0x6b/0x80 [ 122.241633] __x64_sys_ioctl+0x11/0x20 [ 122.242809] do_syscall_64+0x4f/0x1f0 [ 122.243969] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 122.624858] 7 locks held by a.out/2818: [ 122.626110] #0: 00000000d2e12ac0 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x1f/0x50 [ 122.628238] #1: 00000000c1852a2e (&o_tty->termios_rwsem/1){++++}, at: n_tty_receive_buf_common+0x43/0xd00 [ 122.630579] #2: 00000000019e0dd4 (&serio->lock){-.-.}, at: serio_interrupt+0x1f/0x80 [ 122.632634] #3: 00000000165952ed (&(&dev->event_lock)->rlock){-.-.}, at: input_event+0x32/0x60 [ 122.634823] #4: 0000000013aedeb2 (rcu_read_lock){....}, at: input_pass_values.part.2+0x0/0x260 [ 122.637122] #5: 0000000013aedeb2 (rcu_read_lock){....}, at: __handle_sysrq+0x0/0x200 [ 122.639133] #6: 0000000094121093 (tasklist_lock){.?.+}, at: debug_show_all_locks+0x37/0x190 [ 701.099036] a.out D 0 4365 1 0x80000082 [ 701.100533] Call Trace: [ 701.101735] ? __schedule+0x278/0x750 [ 701.103178] ? _raw_spin_unlock_irq+0x24/0x40 [ 701.104611] schedule+0x34/0x80 [ 701.105755] rwsem_down_write_failed+0x10a/0x230 [ 701.107228] ? call_rwsem_down_write_failed+0x13/0x20 [ 701.108781] call_rwsem_down_write_failed+0x13/0x20 [ 701.110544] ? n_tty_flush_buffer+0x14/0x50 [ 701.111905] down_write+0x3b/0x50 [ 701.113148] ? n_tty_flush_buffer+0x14/0x50 [ 701.114550] n_tty_flush_buffer+0x14/0x50 [ 701.115962] tty_ldisc_hangup+0x2a/0x1d0 [ 701.117270] __tty_hangup+0x13f/0x2b0 [ 701.118820] tty_release+0xf0/0x500 [ 701.120450] ? lock_acquire+0x43/0x60 [ 701.122099] __fput+0x95/0x1d0 [ 701.123377] task_work_run+0x84/0xa0 [ 701.124632] do_exit+0x2cc/0xac0 [ 701.125713] ? syscall_trace_enter+0x19e/0x250 [ 701.127039] do_group_exit+0x38/0xb0 [ 701.128469] __x64_sys_exit_group+0xf/0x10 [ 701.129795] do_syscall_64+0x4f/0x1f0 [ 701.130946] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 701.635816] 4 locks held by a.out/4365: [ 701.637551] #0: 0000000051b314cc (&tty->legacy_mutex){+.+.}, at: tty_release+0x4d/0x500 [ 701.639958] #1: 00000000fcddaa23 (&tty->legacy_mutex/1){+.+.}, at: __tty_hangup+0x5c/0x2b0 [ 701.642259] #2: 000000005a4dd91b (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x15/0x50 [ 701.645185] #3: 00000000997a5bc4 (&o_tty->termios_rwsem/1){++++}, at: n_tty_flush_buffer+0x14/0x50 [ 264.159739] a.out R running task 0 8464 1 0x80000084 [ 264.159739] Call Trace: [ 264.159739] <IRQ> [ 264.159739] sched_show_task+0x16b/0x230 [ 264.159739] show_state_filter+0x9d/0x190 [ 264.159739] sysrq_handle_showstate+0x7/0x10 [ 264.159739] __handle_sysrq+0x123/0x200 [ 264.159739] sysrq_filter+0x356/0x390 [ 264.159739] input_to_handler+0x4b/0xf0 [ 264.159739] input_pass_values.part.2+0x1ba/0x260 [ 264.159739] input_handle_event+0x11a/0x4f0 [ 264.159739] input_event+0x45/0x60 [ 264.159739] atkbd_interrupt+0x41a/0x6f0 [ 264.159739] serio_interrupt+0x3b/0x80 [ 264.159739] i8042_interrupt+0x1e8/0x3a0 [ 264.159739] ? _raw_spin_unlock_irq+0x24/0x40 [ 264.159739] __handle_irq_event_percpu+0x5b/0xc0 [ 264.159739] handle_irq_event_percpu+0x1b/0x50 [ 264.159739] handle_irq_event+0x2f/0x50 [ 264.159739] handle_edge_irq+0xcc/0x1a0 [ 264.159739] handle_irq+0x11/0x20 [ 264.159739] do_IRQ+0x52/0x100 [ 264.159739] common_interrupt+0xf/0xf [ 264.159739] </IRQ> [ 264.159739] RIP: 0010:__process_echoes+0xc4/0x260 [ 264.159739] RSP: 0018:ffffc90003257d18 EFLAGS: 00000296 ORIG_RAX: ffffffffffffffd3 [ 264.159739] RAX: 0000000000000004 RBX: 00000009abab4dd2 RCX: 00000009abab4dce [ 264.159739] RDX: fffffff65454b232 RSI: ffff880137fed800 RDI: 0000000000000000 [ 264.159739] RBP: ffffc90003305000 R08: 0000000000000000 R09: 0000000000000001 [ 264.159739] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 264.159739] R13: ffff880137fed800 R14: 0000000000001f00 R15: 0000000000000000 [ 264.159739] ? __process_echoes+0x14c/0x260 [ 264.159739] n_tty_receive_buf_common+0x4b6/0xd00 [ 264.159739] tty_ioctl+0x3e7/0x910 [ 264.159739] ? trace_hardirqs_on_caller+0xe7/0x180 [ 264.159739] do_vfs_ioctl+0x8a/0x650 [ 264.159739] ? syscall_trace_enter+0x19e/0x250 [ 264.159739] ksys_ioctl+0x6b/0x80 [ 264.159739] __x64_sys_ioctl+0x11/0x20 [ 264.159739] do_syscall_64+0x4f/0x1f0 [ 264.159739] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 264.159739] 8 locks held by a.out/8464: [ 264.159739] #0: 000000000a54adcf (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x1f/0x50 [ 264.159739] #1: 000000002abad3ab (&o_tty->termios_rwsem/1){++++}, at: n_tty_receive_buf_common+0x43/0xd00 [ 264.159739] #2: 00000000705a196b (&ldata->output_lock){+.+.}, at: n_tty_receive_buf_common+0x4a4/0xd00 [ 264.159739] #3: 000000008434776a (&serio->lock){-.-.}, at: serio_interrupt+0x1f/0x80 [ 264.159739] #4: 00000000710258a9 (&(&dev->event_lock)->rlock){-.-.}, at: input_event+0x32/0x60 [ 264.159739] #5: 000000004b5299a2 (rcu_read_lock){....}, at: input_pass_values.part.2+0x0/0x260 [ 264.159739] #6: 000000004b5299a2 (rcu_read_lock){....}, at: __handle_sysrq+0x0/0x200 [ 264.159739] #7: 00000000082c04ee (tasklist_lock){.?.+}, at: debug_show_all_locks+0x37/0x190 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: rcu detected stall in n_tty_receive_char_special 2018-04-21 15:34 ` Tetsuo Handa @ 2018-04-23 16:01 ` Tetsuo Handa 2018-04-24 12:25 ` Tetsuo Handa 0 siblings, 1 reply; 8+ messages in thread From: Tetsuo Handa @ 2018-04-23 16:01 UTC (permalink / raw) To: gregkh, jslaby, Peter Hurley Cc: dvyukov, syzbot, linux-kernel, syzkaller-bugs On 2018/04/22 0:34, Tetsuo Handa wrote: > Dmitry Vyukov wrote: >> This looks somewhat similar to "INFO: rcu detected stall in __process_echoes": >> https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40 >> But I am not sure because stall stacks are somewhat different. > > Yes, this will be the same one which can be reproduced using same reproducer > provided by syzbot. You can try below one like "while :; do echo -n .; ./a.out; done". > > ---------------------------------------- > #include <fcntl.h> > #include <stdio.h> > #include <sys/ioctl.h> > #include <sys/stat.h> > #include <unistd.h> > > int main(int argc, char *argv[]) > { > int fd[2]; > unsigned int zero = 0; > char buf[128]; > int ptyno = 0; > char c = 0; > char data[64] = { }; > while (1) { > fd[0] = open("/dev/ptmx", O_RDWR); > ioctl(fd[0], TIOCSPTLCK, &zero); > write(fd[0], data, sizeof(data)); > if (ioctl(fd[0], TIOCGPTN, &ptyno)) > break; > sprintf(buf, "/dev/pts/%d", ptyno); > fd[1] = open(buf, O_RDONLY); > if (fd[1] == EOF) > break; > ioctl(fd[1], TIOCSTI, &c); > } > return 1; > } > ---------------------------------------- Bisection reached to commit e052c6d15c61cc4c ("tty: Use unbound workqueue for all input workers") added in 4.4-rc1. Reverting that commit on top of 4.4 solved the lockup. But unfortunately, reverting that commit on top of 4.16 only helped surviving the test longer than vanilla 4.16; still stalls at __process_echoes(). There might be similar changes which need to be reverted as well. # bad: [8005c49d9aea74d382f474ce11afbbc7d7130bec] Linux 4.4-rc1 # good: [6a13feb9c82803e2b815eca72fa7a9f5561d7861] Linux 4.3 # good: [64291f7db5bd8150a74ad2036f1037e6a0428df2] Linux 4.2 # good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1 # good: [39a8804455fb23f09157341d3ba7db6d7ae6ee76] Linux 4.0 # good: [bfa76d49576599a4b9f9b7a71f23d73d6dcff735] Linux 3.19 # good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18 # good: [bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9] Linux 3.17 # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16 # good: [1860e379875dfe7271c649058aeddffe5afd9d0d] Linux 3.15 # good: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14 # good: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13 # good: [5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52] Linux 3.12 git bisect start 'HEAD' 'v4.3' 'v4.2' 'v4.1' 'v4.0' 'v3.19' 'v3.18' 'v3.17' 'v3.16' 'v3.15' 'v3.14' 'v3.13' 'v3.12' '--' 'drivers/tty/' # good: [cc32382d9fd22dc8eebba4a245f50417267bda8e] serial: imx: make setup_ufcr more useful git bisect good cc32382d9fd22dc8eebba4a245f50417267bda8e # good: [32ede4a51754cb62b0d43d91cb7c4e3c57069a9c] serial: tegra: Add helper function for handling RX buffer git bisect good 32ede4a51754cb62b0d43d91cb7c4e3c57069a9c # good: [4b41b9539a1e9531f942ededfcdcff372317d2e7] tty: Prevent tty teardown during tty_write_message() git bisect good 4b41b9539a1e9531f942ededfcdcff372317d2e7 # bad: [fd0d351de7bbd718bc2b34d5846854831aa2b88c] Merge tag 'tty-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty git bisect bad fd0d351de7bbd718bc2b34d5846854831aa2b88c # good: [14c79092909a52b6fd6394b6ad5e7756c4f9565e] Merge branch 'parisc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux git bisect good 14c79092909a52b6fd6394b6ad5e7756c4f9565e # good: [41ecf1404b34d9975eb97f5005d9e4274eaeb76a] Merge tag 'for-linus-4.4-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip git bisect good 41ecf1404b34d9975eb97f5005d9e4274eaeb76a # bad: [e052c6d15c61cc4caff2f06cbca72b183da9f15e] tty: Use unbound workqueue for all input workers git bisect bad e052c6d15c61cc4caff2f06cbca72b183da9f15e # good: [e176058f0de53c2346734e5254835e0045364001] tty: Abstract tty buffer work git bisect good e176058f0de53c2346734e5254835e0045364001 # first bad commit: [e052c6d15c61cc4caff2f06cbca72b183da9f15e] tty: Use unbound workqueue for all input workers ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: rcu detected stall in n_tty_receive_char_special 2018-04-23 16:01 ` Tetsuo Handa @ 2018-04-24 12:25 ` Tetsuo Handa 2018-05-17 12:19 ` Tetsuo Handa 0 siblings, 1 reply; 8+ messages in thread From: Tetsuo Handa @ 2018-04-24 12:25 UTC (permalink / raw) To: gregkh, jslaby, Peter Hurley Cc: dvyukov, syzbot, linux-kernel, syzkaller-bugs On 2018/04/24 1:01, Tetsuo Handa wrote: > Bisection reached to commit e052c6d15c61cc4c ("tty: Use unbound workqueue > for all input workers") added in 4.4-rc1. > > Reverting that commit on top of 4.4 solved the lockup. But unfortunately, > reverting that commit on top of 4.16 only helped surviving the test longer > than vanilla 4.16; still stalls at __process_echoes(). There might be similar > changes which need to be reverted as well. I noticed that when stall inside __process_echoes() occurs, it is looping at discard loop added by commit cbfd0340ae199337 ("n_tty: Process echoes in blocks"). Since ldata->echo_commit < ldata->echo_tail is for some reason true upon entry, the discard loop serves as almost infinite loop. --- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -613,12 +613,21 @@ static size_t __process_echoes(struct tty_struct *tty) int space, old_space; size_t tail; unsigned char c; + unsigned long start = jiffies; old_space = space = tty_write_room(tty); tail = ldata->echo_tail; + if (ldata->echo_commit < tail) + printk("BUG1: ldata->echo_commit=%ld tail=%ld space=%d old_space=%d\n", + ldata->echo_commit, tail, space, old_space); while (ldata->echo_commit != tail) { c = echo_buf(ldata, tail); + if (time_after(jiffies, start + 5 * HZ)) { + printk("ldata->echo_commit=%ld tail=%ld c=%c space=%d old_space=%d\n", + ldata->echo_commit, tail, c, space, old_space); + start = jiffies; + } if (c == ECHO_OP_START) { unsigned char op; int no_space_left = 0; @@ -729,6 +738,11 @@ static size_t __process_echoes(struct tty_struct *tty) * of echo overrun before the next commit), then discard enough * data at the tail to prevent a subsequent overrun */ while (ldata->echo_commit - tail >= ECHO_DISCARD_WATERMARK) { + if (time_after(jiffies, start + 5 * HZ)) { + printk("ldata->echo_commit=%ld ldata->echo_tail=%ld tail=%ld space=%d old_space=%d\n", + ldata->echo_commit, ldata->echo_tail, tail, space, old_space); + start = jiffies; + } if (echo_buf(ldata, tail) == ECHO_OP_START) { if (echo_buf(ldata, tail + 1) == ECHO_OP_ERASE_TAB) tail += 3; @@ -739,6 +753,9 @@ static size_t __process_echoes(struct tty_struct *tty) } ldata->echo_tail = tail; + if (ldata->echo_commit < tail) + printk("BUG2: ldata->echo_commit=%ld tail=%ld space=%d old_space=%d\n", + ldata->echo_commit, tail, space, old_space); return old_space - space; } [ 5.890788] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 27.357324] BUG1: ldata->echo_commit=120 tail=130 space=7936 old_space=7936 [ 32.358527] ldata->echo_commit=120 ldata->echo_tail=130 tail=2994833619 space=0 old_space=7936 [ 37.362404] ldata->echo_commit=120 ldata->echo_tail=130 tail=5930715049 space=0 old_space=7936 [ 42.366124] ldata->echo_commit=120 ldata->echo_tail=130 tail=8943875147 space=0 old_space=7936 [ 47.370508] ldata->echo_commit=120 ldata->echo_tail=130 tail=11950164125 space=0 old_space=7936 [ 52.374224] ldata->echo_commit=120 ldata->echo_tail=130 tail=14880807667 space=0 old_space=7936 [ 57.378597] ldata->echo_commit=120 ldata->echo_tail=130 tail=17821338400 space=0 old_space=7936 [ 62.383638] ldata->echo_commit=120 ldata->echo_tail=130 tail=20811717629 space=0 old_space=7936 [ 63.203807] sysrq: SysRq : Resetting [ 5.843667] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 28.306346] BUG1: ldata->echo_commit=128 tail=130 space=7936 old_space=7936 [ 33.307133] ldata->echo_commit=128 ldata->echo_tail=130 tail=2892537244 space=0 old_space=7936 [ 38.311478] ldata->echo_commit=128 ldata->echo_tail=130 tail=5766457276 space=0 old_space=7936 [ 43.315319] ldata->echo_commit=128 ldata->echo_tail=130 tail=8686538551 space=0 old_space=7936 [ 48.319164] ldata->echo_commit=128 ldata->echo_tail=130 tail=11556458920 space=0 old_space=7936 [ 50.122017] sysrq: SysRq : Resetting [ 6.026322] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 38.149864] BUG1: ldata->echo_commit=118 tail=130 space=7936 old_space=7936 [ 43.150167] ldata->echo_commit=118 ldata->echo_tail=130 tail=2789490540 space=0 old_space=7936 [ 48.154012] ldata->echo_commit=118 ldata->echo_tail=130 tail=5592467000 space=0 old_space=7936 [ 53.158365] ldata->echo_commit=118 ldata->echo_tail=130 tail=8376906144 space=0 old_space=7936 [ 58.162211] ldata->echo_commit=118 ldata->echo_tail=130 tail=11158984120 space=0 old_space=7936 [ 60.487338] sysrq: SysRq : Resetting [ 9.170831] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 25.929908] BUG1: ldata->echo_commit=120 tail=130 space=7936 old_space=7936 [ 30.930553] ldata->echo_commit=120 ldata->echo_tail=130 tail=3008469343 space=0 old_space=7936 [ 35.935475] ldata->echo_commit=120 ldata->echo_tail=130 tail=5857403750 space=0 old_space=7936 [ 40.940153] ldata->echo_commit=120 ldata->echo_tail=130 tail=8858176994 space=0 old_space=7936 [ 45.944527] ldata->echo_commit=120 ldata->echo_tail=130 tail=11838898685 space=0 old_space=7936 [ 48.948625] sysrq: SysRq : Resetting [ 6.071991] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 22.890428] BUG1: ldata->echo_commit=128 tail=130 space=7936 old_space=7936 [ 27.891372] ldata->echo_commit=128 ldata->echo_tail=130 tail=3064270209 space=0 old_space=7936 [ 32.895212] ldata->echo_commit=128 ldata->echo_tail=130 tail=6085270166 space=0 old_space=7936 [ 37.899047] ldata->echo_commit=128 ldata->echo_tail=130 tail=9102650663 space=0 old_space=7936 [ 42.903405] ldata->echo_commit=128 ldata->echo_tail=130 tail=12195198062 space=0 old_space=7936 [ 44.497876] sysrq: SysRq : Resetting [ 5.804688] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 23.632487] BUG1: ldata->echo_commit=102 tail=130 space=7936 old_space=7936 [ 28.633878] ldata->echo_commit=102 ldata->echo_tail=130 tail=2930081035 space=0 old_space=7936 [ 33.638168] ldata->echo_commit=102 ldata->echo_tail=130 tail=5867627875 space=0 old_space=7936 [ 38.654062] ldata->echo_commit=102 ldata->echo_tail=130 tail=8804236495 space=0 old_space=7936 [ 40.296804] sysrq: SysRq : Resetting ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: rcu detected stall in n_tty_receive_char_special 2018-04-24 12:25 ` Tetsuo Handa @ 2018-05-17 12:19 ` Tetsuo Handa 2018-05-17 12:40 ` syzbot 0 siblings, 1 reply; 8+ messages in thread From: Tetsuo Handa @ 2018-05-17 12:19 UTC (permalink / raw) To: gregkh, jslaby, Peter Hurley Cc: dvyukov, syzbot, linux-kernel, syzkaller-bugs #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master >From ada7279f7f034c5fd79fc04e1120069ea5f6cef2 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Thu, 17 May 2018 20:42:50 +0900 Subject: [PATCH] n_tty: Access echo_* variables carefully. syzbot is reporting stalls at __process_echoes() [1]. This is because since ldata->echo_commit < ldata->echo_tail becomes true for some reason, the discard loop is serving as almost infinite loop. This patch tries to avoid falling into ldata->echo_commit < ldata->echo_tail situation by making access to echo_* variables more carefully. Since reset_buffer_flags() is called without output_lock held, it should not touch echo_* variables. And omit a call to reset_buffer_flags() from n_tty_open() by using vzalloc(). Since add_echo_byte() is called without output_lock held, it needs memory barrier between storing into echo_buf[] and incrementing echo_head counter. echo_buf() needs corresponding memory barrier before reading echo_buf[]. Lack of handling the possibility of not-yet-stored multi-byte operation might be the reason of falling into ldata->echo_commit < ldata->echo_tail situation, for if I do WARN_ON(ldata->echo_commit == tail + 1) prior to echo_buf(ldata, tail + 1), the WARN_ON() fires. The echo_* variables are now updated with output_lock held. Thus, I guess that ldata->echo_commit < ldata->echo_tail is no longer possible. But to become more carefully, use ldata->echo_commit > ldata->echo_tail as the preventive condition to continue the "while" loops. [1] https://syzkaller.appspot.com/bug?id=3d7481a346958d9469bebbeb0537d5f056bdd6e8 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+18df353d7540aa6b5467@syzkaller.appspotmail.com> Cc: Peter Hurley <peter@hurleysoftware.com> --- drivers/tty/n_tty.c | 51 ++++++++++++++++++++++++++++++--------------------- 1 file changed, 30 insertions(+), 21 deletions(-) diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c index cbe98bc..060ef91 100644 --- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -90,9 +90,6 @@ struct n_tty_data { size_t read_head; size_t commit_head; size_t canon_head; - size_t echo_head; - size_t echo_commit; - size_t echo_mark; DECLARE_BITMAP(char_map, 256); /* private to n_tty_receive_overrun (single-threaded) */ @@ -119,6 +116,9 @@ struct n_tty_data { unsigned int column; unsigned int canon_column; size_t echo_tail; + size_t echo_head; + size_t echo_commit; + size_t echo_mark; struct mutex atomic_read_lock; struct mutex output_lock; @@ -141,6 +141,7 @@ static inline unsigned char *read_buf_addr(struct n_tty_data *ldata, size_t i) static inline unsigned char echo_buf(struct n_tty_data *ldata, size_t i) { + smp_rmb(); /* Matches smp_wmb() in add_echo_byte(). */ return ldata->echo_buf[i & (N_TTY_BUF_SIZE - 1)]; } @@ -316,9 +317,7 @@ static inline void put_tty_queue(unsigned char c, struct n_tty_data *ldata) static void reset_buffer_flags(struct n_tty_data *ldata) { ldata->read_head = ldata->canon_head = ldata->read_tail = 0; - ldata->echo_head = ldata->echo_tail = ldata->echo_commit = 0; ldata->commit_head = 0; - ldata->echo_mark = 0; ldata->line_start = 0; ldata->erasing = 0; @@ -617,13 +616,22 @@ static size_t __process_echoes(struct tty_struct *tty) old_space = space = tty_write_room(tty); tail = ldata->echo_tail; - while (ldata->echo_commit != tail) { + if (WARN_ON(ldata->echo_commit < tail)) + return 0; + while (ldata->echo_commit > tail) { c = echo_buf(ldata, tail); if (c == ECHO_OP_START) { unsigned char op; int no_space_left = 0; /* + * Since add_echo_byte() is called without holding + * output_lock, we might see only portion of multi-byte + * operation. + */ + if (ldata->echo_commit == tail + 1) + goto not_yet_stored; + /* * If the buffer byte is the start of a multi-byte * operation, get the next byte, which is either the * op code or a control character value. @@ -634,6 +642,8 @@ static size_t __process_echoes(struct tty_struct *tty) unsigned int num_chars, num_bs; case ECHO_OP_ERASE_TAB: + if (ldata->echo_commit == tail + 2) + goto not_yet_stored; num_chars = echo_buf(ldata, tail + 2); /* @@ -728,7 +738,9 @@ static size_t __process_echoes(struct tty_struct *tty) /* If the echo buffer is nearly full (so that the possibility exists * of echo overrun before the next commit), then discard enough * data at the tail to prevent a subsequent overrun */ - while (ldata->echo_commit - tail >= ECHO_DISCARD_WATERMARK) { + WARN_ON(ldata->echo_commit < tail); + while (ldata->echo_commit > tail && + ldata->echo_commit - tail >= ECHO_DISCARD_WATERMARK) { if (echo_buf(ldata, tail) == ECHO_OP_START) { if (echo_buf(ldata, tail + 1) == ECHO_OP_ERASE_TAB) tail += 3; @@ -738,6 +750,7 @@ static size_t __process_echoes(struct tty_struct *tty) tail++; } + not_yet_stored: ldata->echo_tail = tail; return old_space - space; } @@ -748,6 +761,7 @@ static void commit_echoes(struct tty_struct *tty) size_t nr, old, echoed; size_t head; + mutex_lock(&ldata->output_lock); head = ldata->echo_head; ldata->echo_mark = head; old = ldata->echo_commit - ldata->echo_tail; @@ -756,10 +770,12 @@ static void commit_echoes(struct tty_struct *tty) * is over the threshold (and try again each time another * block is accumulated) */ nr = head - ldata->echo_tail; - if (nr < ECHO_COMMIT_WATERMARK || (nr % ECHO_BLOCK > old % ECHO_BLOCK)) + if (nr < ECHO_COMMIT_WATERMARK || + (nr % ECHO_BLOCK > old % ECHO_BLOCK)) { + mutex_unlock(&ldata->output_lock); return; + } - mutex_lock(&ldata->output_lock); ldata->echo_commit = head; echoed = __process_echoes(tty); mutex_unlock(&ldata->output_lock); @@ -810,7 +826,9 @@ static void flush_echoes(struct tty_struct *tty) static inline void add_echo_byte(unsigned char c, struct n_tty_data *ldata) { - *echo_buf_addr(ldata, ldata->echo_head++) = c; + *echo_buf_addr(ldata, ldata->echo_head) = c; + smp_wmb(); /* Matches smp_rmb() in echo_buf(). */ + ldata->echo_head++; } /** @@ -1878,30 +1896,21 @@ static int n_tty_open(struct tty_struct *tty) struct n_tty_data *ldata; /* Currently a malloc failure here can panic */ - ldata = vmalloc(sizeof(*ldata)); + ldata = vzalloc(sizeof(*ldata)); if (!ldata) - goto err; + return -ENOMEM; ldata->overrun_time = jiffies; mutex_init(&ldata->atomic_read_lock); mutex_init(&ldata->output_lock); tty->disc_data = ldata; - reset_buffer_flags(tty->disc_data); - ldata->column = 0; - ldata->canon_column = 0; - ldata->num_overrun = 0; - ldata->no_room = 0; - ldata->lnext = 0; tty->closing = 0; /* indicate buffer work may resume */ clear_bit(TTY_LDISC_HALTED, &tty->flags); n_tty_set_termios(tty, NULL); tty_unthrottle(tty); - return 0; -err: - return -ENOMEM; } static inline int input_available_p(struct tty_struct *tty, int poll) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: INFO: rcu detected stall in n_tty_receive_char_special 2018-05-17 12:19 ` Tetsuo Handa @ 2018-05-17 12:40 ` syzbot 0 siblings, 0 replies; 8+ messages in thread From: syzbot @ 2018-05-17 12:40 UTC (permalink / raw) To: dvyukov, gregkh, jslaby, linux-kernel, penguin-kernel, peter, syzkaller-bugs Hello, syzbot has tested the proposed patch but the reproducer still triggered crash: INFO: rcu detected stall in corrupted IPVS: ftp: loaded support on port[0] = 21 IPVS: ftp: loaded support on port[0] = 21 IPVS: ftp: loaded support on port[0] = 21 IPVS: ftp: loaded support on port[0] = 21 IPVS: ftp: loaded support on port[0] = 21 INFO: rcu_sched self-detected stall on CPU INFO: rcu_sched detected stalls on CPUs/tasks: 1-....: (124998 ticks this GP) idle=da2/1/4611686018427387906 softirq=12308/12308 fqs=30969 1-....: (124998 ticks this GP) idle=da2/1/4611686018427387906 softirq=12308/12308 fqs=30969 (t=125002 jiffies g=6046 c=6045 q=1179) (detected by 0, t=125002 jiffies, g=6046, c=6045, q=1179) Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 5134 Comm: syz-executor3 Not tainted 4.17.0-rc5+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:rep_nop arch/x86/include/asm/processor.h:667 [inline] RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:672 [inline] RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:69 [inline] RIP: 0010:native_queued_spin_lock_slowpath+0x204/0xde0 kernel/locking/qspinlock.c:305 RSP: 0018:ffff8801db107030 EFLAGS: 00000002 RAX: 0000000000000000 RBX: ffffed003b620e1f RCX: 0000000000000004 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff88b911c0 RBP: ffff8801db1073a0 R08: fffffbfff1172239 R09: fffffbfff1172238 R10: fffffbfff1172238 R11: ffffffff88b911c3 R12: ffffffff88b911c0 R13: ffff8801db107378 R14: 0000000000000001 R15: dffffc0000000000 FS: 00007fd0bf7b5700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004d91e4 CR3: 00000001c68eb000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:674 [inline] queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:30 [inline] queued_spin_lock include/asm-generic/qspinlock.h:90 [inline] do_raw_spin_lock+0x1a7/0x200 kernel/locking/spinlock_debug.c:113 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline] _raw_spin_lock_irqsave+0x9e/0xc0 kernel/locking/spinlock.c:152 rcu_dump_cpu_stacks+0x9d/0x1c2 kernel/rcu/tree.c:1373 print_cpu_stall kernel/rcu/tree.c:1525 [inline] check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593 __rcu_pending kernel/rcu/tree.c:3356 [inline] rcu_pending kernel/rcu/tree.c:3401 [inline] rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763 update_process_times+0x2d/0x70 kernel/time/timer.c:1636 tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:164 tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1274 __run_hrtimer kernel/time/hrtimer.c:1398 [inline] __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1460 hrtimer_interrupt+0x2f3/0x750 kernel/time/hrtimer.c:1518 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline] smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 </IRQ> RIP: 0010:add_echo_byte drivers/tty/n_tty.c:829 [inline] RIP: 0010:echo_char+0x152/0x340 drivers/tty/n_tty.c:934 RSP: 0018:ffff8801aa1ef720 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 RAX: ffff8801a9a5e180 RBX: ffffc90002bb4000 RCX: ffffffff837ff667 RDX: 0000000000000000 RSI: ffffffff837ff672 RDI: 0000000000000001 RBP: ffff8801aa1ef748 R08: ffff8801a9a5e180 R09: 0000000000000001 R10: ffff8801a9a5e9e0 R11: ffff8801a9a5e180 R12: 00000000000008df R13: 0000000000000000 R14: 0000000000000008 R15: ffffc90002bb6268 n_tty_receive_char_special+0x1451/0x3510 drivers/tty/n_tty.c:1324 n_tty_receive_buf_fast drivers/tty/n_tty.c:1595 [inline] __receive_buf drivers/tty/n_tty.c:1629 [inline] n_tty_receive_buf_common+0x21bd/0x2ca0 drivers/tty/n_tty.c:1727 n_tty_receive_buf+0x30/0x40 drivers/tty/n_tty.c:1756 tiocsti drivers/tty/tty_io.c:2171 [inline] tty_ioctl+0x7e7/0x1870 drivers/tty/tty_io.c:2557 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16a0 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x455319 RSP: 002b:00007fd0bf7b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fd0bf7b56d4 RCX: 0000000000455319 RDX: 00000000200000c0 RSI: 0000000000005412 RDI: 0000000000000004 RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000000002f9 R14: 00000000006f77f8 R15: 0000000000000000 Code: 5c 00 00 00 00 48 83 e7 f8 49 29 fc 41 8d 4c 24 64 c1 e9 03 f3 48 ab 48 81 c4 48 03 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 f3 90 <e9> 3b ff ff ff 89 f1 81 fe 00 01 00 00 0f 84 9e 03 00 00 41 be NMI backtrace for cpu 1 CPU: 1 PID: 5134 Comm: syz-executor3 Not tainted 4.17.0-rc5+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103 nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline] rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376 print_cpu_stall kernel/rcu/tree.c:1525 [inline] check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593 __rcu_pending kernel/rcu/tree.c:3356 [inline] rcu_pending kernel/rcu/tree.c:3401 [inline] rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763 update_process_times+0x2d/0x70 kernel/time/timer.c:1636 tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:164 tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1274 __run_hrtimer kernel/time/hrtimer.c:1398 [inline] __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1460 hrtimer_interrupt+0x2f3/0x750 kernel/time/hrtimer.c:1518 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline] smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 </IRQ> RIP: 0010:add_echo_byte drivers/tty/n_tty.c:829 [inline] RIP: 0010:echo_char+0x152/0x340 drivers/tty/n_tty.c:934 RSP: 0018:ffff8801aa1ef720 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13 RAX: ffff8801a9a5e180 RBX: ffffc90002bb4000 RCX: ffffffff837ff667 RDX: 0000000000000000 RSI: ffffffff837ff672 RDI: 0000000000000001 RBP: ffff8801aa1ef748 R08: ffff8801a9a5e180 R09: 0000000000000001 R10: ffff8801a9a5e9e0 R11: ffff8801a9a5e180 R12: 00000000000008df R13: 0000000000000000 R14: 0000000000000008 R15: ffffc90002bb6268 n_tty_receive_char_special+0x1451/0x3510 drivers/tty/n_tty.c:1324 n_tty_receive_buf_fast drivers/tty/n_tty.c:1595 [inline] __receive_buf drivers/tty/n_tty.c:1629 [inline] n_tty_receive_buf_common+0x21bd/0x2ca0 drivers/tty/n_tty.c:1727 n_tty_receive_buf+0x30/0x40 drivers/tty/n_tty.c:1756 tiocsti drivers/tty/tty_io.c:2171 [inline] tty_ioctl+0x7e7/0x1870 drivers/tty/tty_io.c:2557 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16a0 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x455319 RSP: 002b:00007fd0bf7b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fd0bf7b56d4 RCX: 0000000000455319 RDX: 00000000200000c0 RSI: 0000000000005412 RDI: 0000000000000004 RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000000002f9 R14: 00000000006f77f8 R15: 0000000000000000 INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.375 msecs Tested on: commit: e6506eb24187 Merge tag 'trace-v4.17-rc4-2' of git://git.ke.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=15804957800000 kernel config: https://syzkaller.appspot.com/x/.config?x=d57d2358c5d82736 compiler: gcc (GCC) 8.0.1 20180413 (experimental) patch: https://syzkaller.appspot.com/x/patch.diff?x=11608417800000 ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-05-17 12:40 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-04-06 7:12 INFO: rcu detected stall in n_tty_receive_char_special syzbot 2018-04-06 7:13 ` Dmitry Vyukov 2018-04-15 13:43 ` syzbot 2018-04-21 15:34 ` Tetsuo Handa 2018-04-23 16:01 ` Tetsuo Handa 2018-04-24 12:25 ` Tetsuo Handa 2018-05-17 12:19 ` Tetsuo Handa 2018-05-17 12:40 ` syzbot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).