LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).
@ 2011-02-03  3:19 Nick Bowler
  2011-02-03  9:12 ` Yong Zhang
  0 siblings, 1 reply; 12+ messages in thread
From: Nick Bowler @ 2011-02-03  3:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yong Zhang, Andrew Morton, Thomas Gleixner

With 2.6.38-rc3+, I'm receiving nasty WARNINGS and lockdep spews on my
laptop.  Other than the noise, things seem to work fine.  The lockdep
warning is spewed as soon as the second of the following two events
occurs (in any order):

  * a key on the keyboard is pressed.
  * the radio switch is in the off position.

For example, pressing a key after booting with the switch in the off
position will trigger a dump.  The following output was what I got when I
booted with the switch on, flipped the switch off, flipped the switch
back on, then pressed a key (note that the iwlagn errors are normal).
Bisection results follow the traces -- reverting the implicated commit
solves the issue.

  ------------[ cut here ]------------
  WARNING: at /home/nick/misc/linux-2.6/kernel/softirq.c:159 _local_bh_enable_ip+0x49/0xe3()
  Hardware name: 2242CTO
  Modules linked in: nfs nfs_acl aes_x86_64 aes_generic nfsd lockd sunrpc exportfs ipv6 iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter xt_mark ip_tables snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc ufs ext2 vhost_net tun arc4 ecb crypto_blkcipher kvm_intel cryptomgr aead crypto_algapi kvm iwlagn iwlcore mac80211 btusb bluetooth cfg80211 e1000e sg joydev sr_mod cdrom uhci_hcd mmc_block ehci_hcd usbcore nls_base evdev thinkpad_acpi hwmon rfkill nvram acpi_cpufreq mperf sdhci_pci sdhci pcmcia mmc_core yenta_socket pcmcia_rsrc pcmcia_core psmouse loop rtc_cmos rtc_core rtc_lib
  Pid: 10, comm: kworker/0:1 Not tainted 2.6.38-rc3-bisect-00062-g8c007a1 #92
  Call Trace:
  [<ffffffff8103f57b>] ? warn_slowpath_common+0x80/0x98
  [<ffffffff8104a418>] ? del_timer_sync+0x4f/0x8e
  [<ffffffff8103f5a8>] ? warn_slowpath_null+0x15/0x17
  [<ffffffff81044c88>] ? _local_bh_enable_ip+0x49/0xe3
  [<ffffffff81044d3a>] ? local_bh_enable+0xd/0xf
  [<ffffffff8104a418>] ? del_timer_sync+0x4f/0x8e
  [<ffffffff8104a3c9>] ? del_timer_sync+0x0/0x8e
  [<ffffffffa00947fe>] ? rfkill_schedule_global_op+0x41/0x84 [rfkill]
  [<ffffffffa0094858>] ? rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
  [<ffffffffa009495f>] ? rfkill_event+0x58/0x5a [rfkill]
  [<ffffffff8125c980>] ? input_pass_event+0x97/0xea
  [<ffffffff8125de13>] ? input_handle_event+0x460/0x46f
  [<ffffffff8125df1b>] ? input_event+0x66/0x83
  [<ffffffffa009becb>] ? tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
  [<ffffffffa009c417>] ? hotkey_notify+0x507/0x581 [thinkpad_acpi]
  [<ffffffff810bc242>] ? kfree+0xd2/0xeb
  [<ffffffffa009b029>] ? dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
  [<ffffffff8119a18a>] ? acpi_ev_notify_dispatch+0x62/0x7c
  [<ffffffff8118b323>] ? acpi_os_execute_deferred+0x24/0x31
  [<ffffffff810511bf>] ? process_one_work+0x204/0x33b
  [<ffffffff8105115e>] ? process_one_work+0x1a3/0x33b
  [<ffffffff8118b2ff>] ? acpi_os_execute_deferred+0x0/0x31
  [<ffffffff81053384>] ? worker_thread+0x136/0x255
  [<ffffffff8105324e>] ? worker_thread+0x0/0x255
  [<ffffffff810566da>] ? kthread+0x7d/0x85
  [<ffffffff81003864>] ? kernel_thread_helper+0x4/0x10
  [<ffffffff812ffed8>] ? restore_args+0x0/0x30
  [<ffffffff8105665d>] ? kthread+0x0/0x85
  [<ffffffff81003860>] ? kernel_thread_helper+0x0/0x10
  ---[ end trace aae23672f55877aa ]---
  iwlagn 0000:03:00.0: Not sending command - RF KILL
  iwlagn 0000:03:00.0: Error sending REPLY_REMOVE_STA: enqueue_hcmd failed: -5
  iwlagn 0000:03:00.0: Error removing station 00:1e:e5:7a:b2:6f
  iwlagn 0000:03:00.0: Not sending command - RF KILL
  iwlagn 0000:03:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
  iwlagn 0000:03:00.0: Error clearing ASSOC_MSK on BSS (-5)
  iwlagn 0000:03:00.0: Not sending command - RF KILL
  iwlagn 0000:03:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
  iwlagn 0000:03:00.0: Error clearing ASSOC_MSK on BSS (-5)
  cfg80211: Calling CRDA to update world regulatory domain
  iwlagn 0000:03:00.0: Not sending command - RF KILL
  iwlagn 0000:03:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
  iwlagn 0000:03:00.0: Error clearing ASSOC_MSK on BSS (-5)
  usb 4-2: USB disconnect, address 2
  iwlagn 0000:03:00.0: RF_KILL bit toggled to enable radio.
  usb 4-2: new full speed USB device using uhci_hcd and address 3
  
  =================================
  [ INFO: inconsistent lock state ]
  2.6.38-rc3-bisect-00062-g8c007a1 #92
  ---------------------------------
  inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
  swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
  (&(&dev->event_lock)->rlock){?.+...}, at: [<ffffffff8125def6>] input_event+0x41/0x83
  {HARDIRQ-ON-W} state was registered at:
  [<ffffffff810656d4>] mark_held_locks+0x52/0x70
  [<ffffffff8106593a>] trace_hardirqs_on_caller+0xf7/0x135
  [<ffffffff81065985>] trace_hardirqs_on+0xd/0xf
  [<ffffffff81044ac2>] __do_softirq+0x62/0x160
  [<ffffffff8100395c>] call_softirq+0x1c/0x28
  [<ffffffff81044cf2>] _local_bh_enable_ip+0xb3/0xe3
  [<ffffffff81044d3a>] local_bh_enable+0xd/0xf
  [<ffffffff8104a418>] del_timer_sync+0x4f/0x8e
  [<ffffffffa00947fe>] rfkill_schedule_global_op+0x41/0x84 [rfkill]
  [<ffffffffa0094858>] rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
  [<ffffffffa009495f>] rfkill_event+0x58/0x5a [rfkill]
  [<ffffffff8125c980>] input_pass_event+0x97/0xea
  [<ffffffff8125de13>] input_handle_event+0x460/0x46f
  [<ffffffff8125df1b>] input_event+0x66/0x83
  [<ffffffffa009becb>] tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
  [<ffffffffa009c417>] hotkey_notify+0x507/0x581 [thinkpad_acpi]
  [<ffffffffa009b029>] dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
  [<ffffffff8119a18a>] acpi_ev_notify_dispatch+0x62/0x7c
  [<ffffffff8118b323>] acpi_os_execute_deferred+0x24/0x31
  [<ffffffff810511bf>] process_one_work+0x204/0x33b
  [<ffffffff81053384>] worker_thread+0x136/0x255
  [<ffffffff810566da>] kthread+0x7d/0x85
  [<ffffffff81003864>] kernel_thread_helper+0x4/0x10
  irq event stamp: 559266
  hardirqs last  enabled at (559263): [<ffffffff811b40a5>] acpi_idle_enter_bm+0x236/0x275
  hardirqs last disabled at (559264): [<ffffffff812ffd6a>] save_args+0x6a/0x70
  softirqs last  enabled at (559266): [<ffffffff81044bce>] _local_bh_enable+0xe/0x10
  softirqs last disabled at (559265): [<ffffffff81044c0d>] irq_enter+0x3d/0x6f
  
  other info that might help us debug this:
  1 lock held by swapper/0:
  #0:  (&serio->lock){-.-...}, at: [<ffffffff81259018>] serio_interrupt+0x24/0x7e
  
  stack backtrace:
  Pid: 0, comm: swapper Tainted: G        W   2.6.38-rc3-bisect-00062-g8c007a1 #92
  Call Trace:
  <IRQ>  [<ffffffff81065442>] ? valid_state+0x17e/0x191
  [<ffffffff8100e159>] ? save_stack_trace+0x28/0x45
  [<ffffffff81065d18>] ? check_usage_forwards+0x0/0x87
  [<ffffffff81065568>] ? mark_lock+0x113/0x22d
  [<ffffffff8106687f>] ? __lock_acquire+0x2ac/0xd01
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81067338>] ? lock_acquire+0x64/0x81
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff812ff378>] ? _raw_spin_lock_irqsave+0x53/0x65
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff81261fb6>] ? atkbd_interrupt+0x9a/0x574
  [<ffffffff81259034>] ? serio_interrupt+0x40/0x7e
  [<ffffffff8125a31c>] ? i8042_interrupt+0x28e/0x2a8
  [<ffffffff8107e834>] ? handle_IRQ_event+0x20/0xae
  [<ffffffff810805e6>] ? handle_edge_irq+0xde/0x12a
  [<ffffffff810055c9>] ? handle_irq+0x1f/0x2a
  [<ffffffff81004c37>] ? do_IRQ+0x48/0xaf
  [<ffffffff812ffe13>] ? ret_from_intr+0x0/0x1a
  <EOI>  [<ffffffff8100a1aa>] ? native_sched_clock+0x2d/0x5f
  [<ffffffff811b40ac>] ? acpi_idle_enter_bm+0x23d/0x275
  [<ffffffff811b40a5>] ? acpi_idle_enter_bm+0x236/0x275
  [<ffffffff8126c2ae>] ? cpuidle_idle_call+0x9f/0xd5
  [<ffffffff81001e75>] ? cpu_idle+0x61/0xaa
  [<ffffffff812ecdc9>] ? rest_init+0xcd/0xd4
  [<ffffffff812eccfc>] ? rest_init+0x0/0xd4
  [<ffffffff81682d3d>] ? start_kernel+0x3a6/0x3b1
  [<ffffffff816822b1>] ? x86_64_start_reservations+0xb8/0xbc
  [<ffffffff816823bb>] ? x86_64_start_kernel+0x106/0x115

1118e2cd33d47254854e1ba3ba8e32802ff14fdf is the first bad commit
commit 1118e2cd33d47254854e1ba3ba8e32802ff14fdf
Author: Yong Zhang <yong.zhang@windriver.com>
Date:   Wed Oct 20 15:57:32 2010 -0700

    timer: Del_timer_sync() can be used in softirq context
    
    Actually we have used del_timer_sync() in softirq context for a long time,
    e.g. in __dst_free()::cancel_delayed_work().
    
    So change the comments of it to warn on hardirq context only, and make
    lockdep know about this change.
    
    Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

:040000 040000 864a7fbdaeb5226c2cc47a49e56c2d5509d548be adb06c69b689b1ec1a9853e9d4f22b631683c164 M	kernel

git bisect start
# bad: [afe8a887550f7cc24eb16449670b93f6b43c32d8] Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
git bisect bad afe8a887550f7cc24eb16449670b93f6b43c32d8
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect good 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
# bad: [949f6711b83d2809d1ccb9d830155a65fdacdff9] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
git bisect bad 949f6711b83d2809d1ccb9d830155a65fdacdff9
# bad: [3c0cb7c31c206aaedb967e44b98442bbeb17a6c4] Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
git bisect bad 3c0cb7c31c206aaedb967e44b98442bbeb17a6c4
# good: [63e35cd9bd4c8ae085c8b9a70554595b529c4100] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
git bisect good 63e35cd9bd4c8ae085c8b9a70554595b529c4100
# bad: [9858a38ea3a940762ae3028cce88f686d0e0c28b] Merge branch 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
git bisect bad 9858a38ea3a940762ae3028cce88f686d0e0c28b
# bad: [6f46b120a96212b85cbdcb84a64c854dfd791ede] Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 6f46b120a96212b85cbdcb84a64c854dfd791ede
# good: [4eed11d5e24540dc133003b6e8f904cb747ac4bb] perf evsel: Auto allocate resources needed for some methods
git bisect good 4eed11d5e24540dc133003b6e8f904cb747ac4bb
# good: [28d9bfc37c861aa9c8386dff1ac7e9a10e5c5162] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good 28d9bfc37c861aa9c8386dff1ac7e9a10e5c5162
# good: [65b2074f84be2287e020839e93b4cdaaf60eb37c] Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good 65b2074f84be2287e020839e93b4cdaaf60eb37c
# bad: [42cbd8efb0746b55112de45173219f76c54390da] Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 42cbd8efb0746b55112de45173219f76c54390da
# bad: [87de5ac782761a3ebf806e434e8c9cc205a87274] timers: Introduce timerlist infrastructure.
git bisect bad 87de5ac782761a3ebf806e434e8c9cc205a87274
# good: [dd6414b50fa2b1cd247a8aa8f8bd42414b7453e1] timer: Permit statically-declared work with deferrable timers
git bisect good dd6414b50fa2b1cd247a8aa8f8bd42414b7453e1
# bad: [1118e2cd33d47254854e1ba3ba8e32802ff14fdf] timer: Del_timer_sync() can be used in softirq context
git bisect bad 1118e2cd33d47254854e1ba3ba8e32802ff14fdf
# good: [6f1bc451e6a79470b122a37ee1fc6bbca450f444] timer: Make try_to_del_timer_sync() the same on SMP and UP
git bisect good 6f1bc451e6a79470b122a37ee1fc6bbca450f444

-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).
  2011-02-03  3:19 Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected) Nick Bowler
@ 2011-02-03  9:12 ` Yong Zhang
  2011-02-03  9:30   ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Yong Zhang @ 2011-02-03  9:12 UTC (permalink / raw)
  To: Nick Bowler; +Cc: linux-kernel, Andrew Morton, Thomas Gleixner

Hi Nick,

Thanks for reporting this.
Could you please try below patch?

Thanks,
Yong

---
From: Yong Zhang <yong.zhang0@gmail.com>
Subject: [PATCH] timer: use _local_bh_enable() in del_timer_sync()

Nick Bowler reported:
  =================================
  [ INFO: inconsistent lock state ]
  2.6.38-rc3-bisect-00062-g8c007a1 #92
  ---------------------------------
  inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
  swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
  (&(&dev->event_lock)->rlock){?.+...}, at: [<ffffffff8125def6>] input_event+0x41/0x83
  {HARDIRQ-ON-W} state was registered at:
  [<ffffffff810656d4>] mark_held_locks+0x52/0x70
  [<ffffffff8106593a>] trace_hardirqs_on_caller+0xf7/0x135
  [<ffffffff81065985>] trace_hardirqs_on+0xd/0xf
  [<ffffffff81044ac2>] __do_softirq+0x62/0x160
  [<ffffffff8100395c>] call_softirq+0x1c/0x28
  [<ffffffff81044cf2>] _local_bh_enable_ip+0xb3/0xe3
  [<ffffffff81044d3a>] local_bh_enable+0xd/0xf
  [<ffffffff8104a418>] del_timer_sync+0x4f/0x8e
  [<ffffffffa00947fe>] rfkill_schedule_global_op+0x41/0x84 [rfkill]
  [<ffffffffa0094858>] rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
  [<ffffffffa009495f>] rfkill_event+0x58/0x5a [rfkill]
  [<ffffffff8125c980>] input_pass_event+0x97/0xea
  [<ffffffff8125de13>] input_handle_event+0x460/0x46f
  [<ffffffff8125df1b>] input_event+0x66/0x83
  [<ffffffffa009becb>] tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
  [<ffffffffa009c417>] hotkey_notify+0x507/0x581 [thinkpad_acpi]
  [<ffffffffa009b029>] dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
  [<ffffffff8119a18a>] acpi_ev_notify_dispatch+0x62/0x7c
  [<ffffffff8118b323>] acpi_os_execute_deferred+0x24/0x31
  [<ffffffff810511bf>] process_one_work+0x204/0x33b
  [<ffffffff81053384>] worker_thread+0x136/0x255
  [<ffffffff810566da>] kthread+0x7d/0x85
  [<ffffffff81003864>] kernel_thread_helper+0x4/0x10
  irq event stamp: 559266
  hardirqs last  enabled at (559263): [<ffffffff811b40a5>] acpi_idle_enter_bm+0x236/0x275
  hardirqs last disabled at (559264): [<ffffffff812ffd6a>] save_args+0x6a/0x70
  softirqs last  enabled at (559266): [<ffffffff81044bce>] _local_bh_enable+0xe/0x10
  softirqs last disabled at (559265): [<ffffffff81044c0d>] irq_enter+0x3d/0x6f

  other info that might help us debug this:
  1 lock held by swapper/0:
  #0:  (&serio->lock){-.-...}, at: [<ffffffff81259018>] serio_interrupt+0x24/0x7e

  stack backtrace:
  Pid: 0, comm: swapper Tainted: G        W   2.6.38-rc3-bisect-00062-g8c007a1 #92
  Call Trace:
  <IRQ>  [<ffffffff81065442>] ? valid_state+0x17e/0x191
  [<ffffffff8100e159>] ? save_stack_trace+0x28/0x45
  [<ffffffff81065d18>] ? check_usage_forwards+0x0/0x87
  [<ffffffff81065568>] ? mark_lock+0x113/0x22d
  [<ffffffff8106687f>] ? __lock_acquire+0x2ac/0xd01
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81067338>] ? lock_acquire+0x64/0x81
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff812ff378>] ? _raw_spin_lock_irqsave+0x53/0x65
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff81261fb6>] ? atkbd_interrupt+0x9a/0x574
  [<ffffffff81259034>] ? serio_interrupt+0x40/0x7e
  [<ffffffff8125a31c>] ? i8042_interrupt+0x28e/0x2a8
  [<ffffffff8107e834>] ? handle_IRQ_event+0x20/0xae
  [<ffffffff810805e6>] ? handle_edge_irq+0xde/0x12a
  [<ffffffff810055c9>] ? handle_irq+0x1f/0x2a
  [<ffffffff81004c37>] ? do_IRQ+0x48/0xaf
  [<ffffffff812ffe13>] ? ret_from_intr+0x0/0x1a
  <EOI>  [<ffffffff8100a1aa>] ? native_sched_clock+0x2d/0x5f
  [<ffffffff811b40ac>] ? acpi_idle_enter_bm+0x23d/0x275
  [<ffffffff811b40a5>] ? acpi_idle_enter_bm+0x236/0x275
  [<ffffffff8126c2ae>] ? cpuidle_idle_call+0x9f/0xd5
  [<ffffffff81001e75>] ? cpu_idle+0x61/0xaa
  [<ffffffff812ecdc9>] ? rest_init+0xcd/0xd4
  [<ffffffff812eccfc>] ? rest_init+0x0/0xd4
  [<ffffffff81682d3d>] ? start_kernel+0x3a6/0x3b1
  [<ffffffff816822b1>] ? x86_64_start_reservations+0xb8/0xbc
  [<ffffffff816823bb>] ? x86_64_start_kernel+0x106/0x115

The cause is that local_bh_enable() could call do_softirq()
in certain time, thus break irq status and make lock bite
us.

Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/timer.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 43ca993..253aefe 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -969,10 +969,15 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
 int del_timer_sync(struct timer_list *timer)
 {
 #ifdef CONFIG_LOCKDEP
+	unsigned long flags;
+
 	local_bh_disable();
 	lock_map_acquire(&timer->lockdep_map);
 	lock_map_release(&timer->lockdep_map);
-	local_bh_enable();
+	/* raw_local_irq_[save|restore] is to protect _local_bh_enable() */
+	raw_local_irq_save(flags);
+	_local_bh_enable();
+	raw_local_irq_restore(flags);
 #endif
 	/*
 	 * don't use it in hardirq context, because it
-- 
1.7.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).
  2011-02-03  9:12 ` Yong Zhang
@ 2011-02-03  9:30   ` Peter Zijlstra
  2011-02-03 10:17     ` Yong Zhang
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2011-02-03  9:30 UTC (permalink / raw)
  To: Yong Zhang; +Cc: Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner

On Thu, 2011-02-03 at 17:12 +0800, Yong Zhang wrote:
>  #ifdef CONFIG_LOCKDEP
> +       unsigned long flags;
> +
>         local_bh_disable();
>         lock_map_acquire(&timer->lockdep_map);
>         lock_map_release(&timer->lockdep_map);
> -       local_bh_enable();
> +       /* raw_local_irq_[save|restore] is to protect _local_bh_enable() */
> +       raw_local_irq_save(flags);
> +       _local_bh_enable();
> +       raw_local_irq_restore(flags);
>  #endif 

_why_ are you doing the raw_local_irq stuff? That's just weird, and that
comment isn't helping.

Also, calling _local_bh_enable() will leave pending softirqs, not
particularly nice


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).
  2011-02-03  9:30   ` Peter Zijlstra
@ 2011-02-03 10:17     ` Yong Zhang
  2011-02-03 10:33       ` Peter Zijlstra
                         ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Yong Zhang @ 2011-02-03 10:17 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner

On Thu, Feb 03, 2011 at 10:30:40AM +0100, Peter Zijlstra wrote:
> On Thu, 2011-02-03 at 17:12 +0800, Yong Zhang wrote:
> >  #ifdef CONFIG_LOCKDEP
> > +       unsigned long flags;
> > +
> >         local_bh_disable();
> >         lock_map_acquire(&timer->lockdep_map);
> >         lock_map_release(&timer->lockdep_map);
> > -       local_bh_enable();
> > +       /* raw_local_irq_[save|restore] is to protect _local_bh_enable() */
> > +       raw_local_irq_save(flags);
> > +       _local_bh_enable();
> > +       raw_local_irq_restore(flags);
> >  #endif 
> 
> _why_ are you doing the raw_local_irq stuff? That's just weird, and that
> comment isn't helping.

Just to avoid trigger WARN_ON_ONCE(!irqs_disabled()) in __local_bh_enable(),

> 
> Also, calling _local_bh_enable() will leave pending softirqs, not
> particularly nice

Ah, indeed. That's bad.

Now I'm not sure how to work around this issue, because local_bh_enable()
and _local_bh_enable() are not suitable here.

Maybe we can introduce another type of local_bh_enable() in which
waking up ksoftirqd is forced if needed.

Thanks,
Yong

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).
  2011-02-03 10:17     ` Yong Zhang
@ 2011-02-03 10:33       ` Peter Zijlstra
  2011-02-03 11:42         ` Yong Zhang
  2011-02-03 11:50       ` [PATCH 1/2] softirq: introduce loacal_bh_enable_force_wake() Yong Zhang
  2011-02-03 11:53       ` [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync() Yong Zhang
  2 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2011-02-03 10:33 UTC (permalink / raw)
  To: Yong Zhang; +Cc: Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner


How about we revert your patch and go back to not allowing
del_timer_sync() from any interrupt context, we can fix __dst_free by
removing the need for cancel_delayed_work().

Are there any other del_timer_sync() callers from softirq?

---
 include/linux/timer.h |    1 +
 kernel/timer.c        |   67 ++++++++++++++++++++++++++++++------------------
 kernel/workqueue.c    |    4 +-
 net/core/dst.c        |    1 -
 4 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 6abd913..3e263f9 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -209,6 +209,7 @@ static inline int timer_pending(const struct timer_list * timer)
 extern void add_timer_on(struct timer_list *timer, int cpu);
 extern int del_timer(struct timer_list * timer);
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
+extern int mod_timer_on(struct timer_list *timer, unsigned long expires, int cpu);
 extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
 extern int mod_timer_pinned(struct timer_list *timer, unsigned long expires);
 
diff --git a/kernel/timer.c b/kernel/timer.c
index 87f656c..c9e98f7 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -648,8 +648,8 @@ static struct tvec_base *lock_timer_base(struct timer_list *timer,
 }
 
 static inline int
-__mod_timer(struct timer_list *timer, unsigned long expires,
-						bool pending_only, int pinned)
+__mod_timer_on(struct timer_list *timer, unsigned long expires,
+	       bool pending_only, int pinned, int cpu)
 {
 	struct tvec_base *base, *new_base;
 	unsigned long flags;
@@ -673,12 +673,14 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 
 	debug_activate(timer, expires);
 
-	cpu = smp_processor_id();
+	if (cpu == -1) {
+		cpu = smp_processor_id();
 
 #if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
-	if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
-		cpu = get_nohz_timer_target();
+		if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
+			cpu = get_nohz_timer_target();
 #endif
+	}
 	new_base = per_cpu(tvec_bases, cpu);
 
 	if (base != new_base) {
@@ -705,12 +707,30 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 		base->next_timer = timer->expires;
 	internal_add_timer(base, timer);
 
+	if (cpu != smp_processor_id()) {
+		/*
+		 * Check whether the other CPU is idle and needs to be
+		 * triggered to reevaluate the timer wheel when nohz is
+		 * active. We are protected against the other CPU fiddling
+		 * with the timer by holding the timer base lock. This also
+		 * makes sure that a CPU on the way to idle can not evaluate
+		 * the timer wheel.
+		 */
+		wake_up_idle_cpu(cpu);
+	}
 out_unlock:
 	spin_unlock_irqrestore(&base->lock, flags);
 
 	return ret;
 }
 
+static inline int
+__mod_timer(struct timer_list *timer, unsigned long expires,
+	    bool pending_only, int pinned)
+{
+	return __mod_timer_on(timer, expires, pending_only, pinned, -1);
+}
+
 /**
  * mod_timer_pending - modify a pending timer's timeout
  * @timer: the pending timer to be modified
@@ -825,6 +845,17 @@ int mod_timer_pinned(struct timer_list *timer, unsigned long expires)
 }
 EXPORT_SYMBOL(mod_timer_pinned);
 
+int mod_timer_on(struct timer_list *timer, unsigned long expires, int cpu)
+{
+	if (timer_pending(timer) && timer->expires == expires)
+		return 1;
+
+	expires = apply_slack(timer, expires);
+
+	return __mod_timer_on(timer, expires, false, TIMER_NOT_PINNED, cpu);
+}
+EXPORT_SYMBOL(mod_timer_on);
+
 /**
  * add_timer - start a timer
  * @timer: the timer to be added
@@ -860,23 +891,7 @@ void add_timer_on(struct timer_list *timer, int cpu)
 
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(timer_pending(timer) || !timer->function);
-	spin_lock_irqsave(&base->lock, flags);
-	timer_set_base(timer, base);
-	debug_activate(timer, timer->expires);
-	if (time_before(timer->expires, base->next_timer) &&
-	    !tbase_get_deferrable(timer->base))
-		base->next_timer = timer->expires;
-	internal_add_timer(base, timer);
-	/*
-	 * Check whether the other CPU is idle and needs to be
-	 * triggered to reevaluate the timer wheel when nohz is
-	 * active. We are protected against the other CPU fiddling
-	 * with the timer by holding the timer base lock. This also
-	 * makes sure that a CPU on the way to idle can not evaluate
-	 * the timer wheel.
-	 */
-	wake_up_idle_cpu(cpu);
-	spin_unlock_irqrestore(&base->lock, flags);
+	mod_timer_on(timer, timer->expires, cpu);
 }
 EXPORT_SYMBOL_GPL(add_timer_on);
 
@@ -959,7 +974,7 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
  *
  * Synchronization rules: Callers must prevent restarting of the timer,
  * otherwise this function is meaningless. It must not be called from
- * hardirq contexts. The caller must not hold locks which would prevent
+ * interrupt contexts. The caller must not hold locks which would prevent
  * completion of the timer's handler. The timer's handler must not call
  * add_timer_on(). Upon exit the timer is not queued and the handler is
  * not running on any CPU.
@@ -969,10 +984,12 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
 int del_timer_sync(struct timer_list *timer)
 {
 #ifdef CONFIG_LOCKDEP
-	local_bh_disable();
+	unsigned long flags;
+
+	local_irq_save(flags);
 	lock_map_acquire(&timer->lockdep_map);
 	lock_map_release(&timer->lockdep_map);
-	local_bh_enable();
+	local_irq_restore(flags);
 #endif
 	/*
 	 * don't use it in hardirq context, because it
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 11869fa..3944acc 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1160,9 +1160,9 @@ int queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
 		timer->function = delayed_work_timer_fn;
 
 		if (unlikely(cpu >= 0))
-			add_timer_on(timer, cpu);
+			mod_timer_on(timer, cpu);
 		else
-			add_timer(timer);
+			mod_timer(timer);
 		ret = 1;
 	}
 	return ret;
diff --git a/net/core/dst.c b/net/core/dst.c
index b99c7c7..a4f28bc 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -207,7 +207,6 @@ void __dst_free(struct dst_entry *dst)
 	if (dst_garbage.timer_inc > DST_GC_INC) {
 		dst_garbage.timer_inc = DST_GC_INC;
 		dst_garbage.timer_expires = DST_GC_MIN;
-		cancel_delayed_work(&dst_gc_work);
 		schedule_delayed_work(&dst_gc_work, dst_garbage.timer_expires);
 	}
 	spin_unlock_bh(&dst_garbage.lock);



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).
  2011-02-03 10:33       ` Peter Zijlstra
@ 2011-02-03 11:42         ` Yong Zhang
  2011-02-08 16:55           ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Yong Zhang @ 2011-02-03 11:42 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner

On Thu, Feb 03, 2011 at 11:33:04AM +0100, Peter Zijlstra wrote:
> 
> How about we revert your patch and go back to not allowing
> del_timer_sync() from any interrupt context, 

The fact is there is someone want to use del_timer_sync()
in softirq context.
My patch is inspired by this http://lkml.org/lkml/2010/8/16/291
though in the end that issue is resolved by not using
del_timer_sync() in softirq.

> we can fix __dst_free by
> removing the need for cancel_delayed_work().
> 
> Are there any other del_timer_sync() callers from softirq?

I'm not sure about this.

IMHO, we don't need to make more restriction if we don't
have to.

Thanks,
Yong

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] softirq: introduce loacal_bh_enable_force_wake()
  2011-02-03 10:17     ` Yong Zhang
  2011-02-03 10:33       ` Peter Zijlstra
@ 2011-02-03 11:50       ` Yong Zhang
  2011-02-03 11:53       ` [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync() Yong Zhang
  2 siblings, 0 replies; 12+ messages in thread
From: Yong Zhang @ 2011-02-03 11:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner, Ingo Molnar

On Thu, Feb 03, 2011 at 06:17:39PM +0800, Yong Zhang wrote:
> Maybe we can introduce another type of local_bh_enable() in which
> waking up ksoftirqd is forced if needed.

---
From: Yong Zhang <yong.zhang0@gmail.com>
Subject: [PATCH 1/2] softirq: introduce loacal_bh_enable_force_wake()

If there is pending softirq, don't handle it in the caller's
context, invoke ksoftirqd directly instead.

del_timer_sync() will be the first caller.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/bottom_half.h |    1 +
 kernel/softirq.c            |   19 ++++++++++++++-----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/bottom_half.h b/include/linux/bottom_half.h
index 27b1bcf..665d697 100644
--- a/include/linux/bottom_half.h
+++ b/include/linux/bottom_half.h
@@ -5,5 +5,6 @@ extern void local_bh_disable(void);
 extern void _local_bh_enable(void);
 extern void local_bh_enable(void);
 extern void local_bh_enable_ip(unsigned long ip);
+extern void local_bh_enable_force_wake(void);
 
 #endif /* _LINUX_BH_H */
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 68eb5ef..5f224d4 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -154,7 +154,7 @@ void _local_bh_enable(void)
 
 EXPORT_SYMBOL(_local_bh_enable);
 
-static inline void _local_bh_enable_ip(unsigned long ip)
+static inline void _local_bh_enable_ip(unsigned long ip, bool force_wake)
 {
 	WARN_ON_ONCE(in_irq() || irqs_disabled());
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -171,8 +171,12 @@ static inline void _local_bh_enable_ip(unsigned long ip)
  	 */
 	sub_preempt_count(SOFTIRQ_DISABLE_OFFSET - 1);
 
-	if (unlikely(!in_interrupt() && local_softirq_pending()))
-		do_softirq();
+	if (unlikely(!in_interrupt() && local_softirq_pending())) {
+		if (!force_wake)
+			do_softirq();
+		else
+			wakeup_softirqd();
+	}
 
 	dec_preempt_count();
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -183,16 +187,21 @@ static inline void _local_bh_enable_ip(unsigned long ip)
 
 void local_bh_enable(void)
 {
-	_local_bh_enable_ip((unsigned long)__builtin_return_address(0));
+	_local_bh_enable_ip((unsigned long)__builtin_return_address(0), false);
 }
 EXPORT_SYMBOL(local_bh_enable);
 
 void local_bh_enable_ip(unsigned long ip)
 {
-	_local_bh_enable_ip(ip);
+	_local_bh_enable_ip(ip, false);
 }
 EXPORT_SYMBOL(local_bh_enable_ip);
 
+void local_bh_enable_force_wake(void)
+{
+	_local_bh_enable_ip((unsigned long)__builtin_return_address(0), true);
+}
+EXPORT_SYMBOL(local_bh_enable_force_wake);
 /*
  * We restart softirq processing MAX_SOFTIRQ_RESTART times,
  * and we fall back to softirqd after that.
-- 
1.7.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync()
  2011-02-03 10:17     ` Yong Zhang
  2011-02-03 10:33       ` Peter Zijlstra
  2011-02-03 11:50       ` [PATCH 1/2] softirq: introduce loacal_bh_enable_force_wake() Yong Zhang
@ 2011-02-03 11:53       ` Yong Zhang
  2 siblings, 0 replies; 12+ messages in thread
From: Yong Zhang @ 2011-02-03 11:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner, Ingo Molnar

From: Yong Zhang <yong.zhang0@gmail.com>
Subject: [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync()

Nick Bowler reported:
  =================================
  [ INFO: inconsistent lock state ]
  2.6.38-rc3-bisect-00062-g8c007a1 #92
  ---------------------------------
  inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
  swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
  (&(&dev->event_lock)->rlock){?.+...}, at: [<ffffffff8125def6>] input_event+0x41/0x83
  {HARDIRQ-ON-W} state was registered at:
  [<ffffffff810656d4>] mark_held_locks+0x52/0x70
  [<ffffffff8106593a>] trace_hardirqs_on_caller+0xf7/0x135
  [<ffffffff81065985>] trace_hardirqs_on+0xd/0xf
  [<ffffffff81044ac2>] __do_softirq+0x62/0x160
  [<ffffffff8100395c>] call_softirq+0x1c/0x28
  [<ffffffff81044cf2>] _local_bh_enable_ip+0xb3/0xe3
  [<ffffffff81044d3a>] local_bh_enable+0xd/0xf
  [<ffffffff8104a418>] del_timer_sync+0x4f/0x8e
  [<ffffffffa00947fe>] rfkill_schedule_global_op+0x41/0x84 [rfkill]
  [<ffffffffa0094858>] rfkill_schedule_evsw_rfkillall+0x17/0x19 [rfkill]
  [<ffffffffa009495f>] rfkill_event+0x58/0x5a [rfkill]
  [<ffffffff8125c980>] input_pass_event+0x97/0xea
  [<ffffffff8125de13>] input_handle_event+0x460/0x46f
  [<ffffffff8125df1b>] input_event+0x66/0x83
  [<ffffffffa009becb>] tpacpi_send_radiosw_update+0x7f/0xc4 [thinkpad_acpi]
  [<ffffffffa009c417>] hotkey_notify+0x507/0x581 [thinkpad_acpi]
  [<ffffffffa009b029>] dispatch_acpi_notify+0x29/0x2b [thinkpad_acpi]
  [<ffffffff8119a18a>] acpi_ev_notify_dispatch+0x62/0x7c
  [<ffffffff8118b323>] acpi_os_execute_deferred+0x24/0x31
  [<ffffffff810511bf>] process_one_work+0x204/0x33b
  [<ffffffff81053384>] worker_thread+0x136/0x255
  [<ffffffff810566da>] kthread+0x7d/0x85
  [<ffffffff81003864>] kernel_thread_helper+0x4/0x10
  irq event stamp: 559266
  hardirqs last  enabled at (559263): [<ffffffff811b40a5>] acpi_idle_enter_bm+0x236/0x275
  hardirqs last disabled at (559264): [<ffffffff812ffd6a>] save_args+0x6a/0x70
  softirqs last  enabled at (559266): [<ffffffff81044bce>] _local_bh_enable+0xe/0x10
  softirqs last disabled at (559265): [<ffffffff81044c0d>] irq_enter+0x3d/0x6f

  other info that might help us debug this:
  1 lock held by swapper/0:
  #0:  (&serio->lock){-.-...}, at: [<ffffffff81259018>] serio_interrupt+0x24/0x7e

  stack backtrace:
  Pid: 0, comm: swapper Tainted: G        W   2.6.38-rc3-bisect-00062-g8c007a1 #92
  Call Trace:
  <IRQ>  [<ffffffff81065442>] ? valid_state+0x17e/0x191
  [<ffffffff8100e159>] ? save_stack_trace+0x28/0x45
  [<ffffffff81065d18>] ? check_usage_forwards+0x0/0x87
  [<ffffffff81065568>] ? mark_lock+0x113/0x22d
  [<ffffffff8106687f>] ? __lock_acquire+0x2ac/0xd01
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81065482>] ? mark_lock+0x2d/0x22d
  [<ffffffff81067338>] ? lock_acquire+0x64/0x81
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff812ff378>] ? _raw_spin_lock_irqsave+0x53/0x65
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff8125def6>] ? input_event+0x41/0x83
  [<ffffffff81261fb6>] ? atkbd_interrupt+0x9a/0x574
  [<ffffffff81259034>] ? serio_interrupt+0x40/0x7e
  [<ffffffff8125a31c>] ? i8042_interrupt+0x28e/0x2a8
  [<ffffffff8107e834>] ? handle_IRQ_event+0x20/0xae
  [<ffffffff810805e6>] ? handle_edge_irq+0xde/0x12a
  [<ffffffff810055c9>] ? handle_irq+0x1f/0x2a
  [<ffffffff81004c37>] ? do_IRQ+0x48/0xaf
  [<ffffffff812ffe13>] ? ret_from_intr+0x0/0x1a
  <EOI>  [<ffffffff8100a1aa>] ? native_sched_clock+0x2d/0x5f
  [<ffffffff811b40ac>] ? acpi_idle_enter_bm+0x23d/0x275
  [<ffffffff811b40a5>] ? acpi_idle_enter_bm+0x236/0x275
  [<ffffffff8126c2ae>] ? cpuidle_idle_call+0x9f/0xd5
  [<ffffffff81001e75>] ? cpu_idle+0x61/0xaa
  [<ffffffff812ecdc9>] ? rest_init+0xcd/0xd4
  [<ffffffff812eccfc>] ? rest_init+0x0/0xd4
  [<ffffffff81682d3d>] ? start_kernel+0x3a6/0x3b1
  [<ffffffff816822b1>] ? x86_64_start_reservations+0xb8/0xbc
  [<ffffffff816823bb>] ? x86_64_start_kernel+0x106/0x115

The cause is that local_bh_enable() could call do_softirq()
in certain time, thus break irq status and make lock bite
us.

Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>

---
 kernel/timer.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 43ca993..1654dc1 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -972,7 +972,7 @@ int del_timer_sync(struct timer_list *timer)
 	local_bh_disable();
 	lock_map_acquire(&timer->lockdep_map);
 	lock_map_release(&timer->lockdep_map);
-	local_bh_enable();
+	local_bh_enable_force_wake();
 #endif
 	/*
 	 * don't use it in hardirq context, because it
-- 
1.7.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected).
  2011-02-03 11:42         ` Yong Zhang
@ 2011-02-08 16:55           ` Peter Zijlstra
  2011-02-08 17:39             ` [PATCH] lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2011-02-08 16:55 UTC (permalink / raw)
  To: Yong Zhang
  Cc: Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner,
	Steven Rostedt

On Thu, 2011-02-03 at 19:42 +0800, Yong Zhang wrote:
> My patch is inspired by this http://lkml.org/lkml/2010/8/16/291

Right, so that splat warns about:

spin_lock_irqsave(&shost->host_lock);
del_timer_sync()
  spin_lock_irqsave(&t->split_timeout_timer);

vs

<SOFTIRQ>
  call_timer_fn()
    spin_lock(&t->split_timeout_timer);
    <IRQ>
      spin_lock_irqsave(&shost->host_lock);

Which Steven just pointed out is an actual deadlock because:


CPU0					CPU1
spin_lock_irqsave(&shost->host_lock);
					<SOFTIRQ>
					  call_timer_fn();
					    base->running_timer = timer;
					    func();
					      <IRQ>
						spin_lock(&shost->host_lock);
del_timer_sync()
  while(base->running_timer == timer);

And the world stops turning..


Also note that your patch changes the above to:

spin_lock_irqsave(&shost->host_lock);
del_timer_sync()
  spin_lock_bh(&t->split_timeout_timer);

Which wouldn't make one bit of difference.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
  2011-02-08 16:55           ` Peter Zijlstra
@ 2011-02-08 17:39             ` Steven Rostedt
  2011-02-16 13:51               ` [tip:core/locking] " tip-bot for Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2011-02-08 17:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yong Zhang, Nick Bowler, linux-kernel, Andrew Morton, Thomas Gleixner

Twice I had to explain the output about why lockdep gives an error with
locks in IRQ context and with del_timer_sync(). Might as well write it
up and place it in the comments above the code in del_timer_sync().
Perhaps the next time this lockdep dump triggers people will understand
the issues.

It is a ticky issue and very subtle, explaining it in detail in the code
may help others understand the issue when they stumble upon the bug
again.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/timer.c b/kernel/timer.c
index d53ce66..d748321 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -964,6 +964,25 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
  * add_timer_on(). Upon exit the timer is not queued and the handler is
  * not running on any CPU.
  *
+ * Note: You must not hold locks that are held in interrupt context
+ *   while calling this function. Even if the lock has nothing to do
+ *   with the timer in question.  Here's why:
+ *
+ *    CPU0                             CPU1
+ *    ----                             ----
+ *                                   <SOFTIRQ>
+ *                                   call_timer_fn();
+ *                                     base->running_timer = mytimer;
+ *  spin_lock_irq(somelock);
+ *                                     <IRQ>
+ *                                        spin_lock(somelock);
+ *  del_timer_sync(mytimer);
+ *   while (base->running_timer == mytimer);
+ *
+ * Now del_timer_sync() will never return and never release somelock.
+ * The interrupt on the other CPU is waiting to grab somelock but
+ * it has interrupted the softirq that CPU0 is waiting to finish.
+ *
  * The function returns whether it has deactivated a pending timer or not.
  */
 int del_timer_sync(struct timer_list *timer)
@@ -971,6 +990,10 @@ int del_timer_sync(struct timer_list *timer)
 #ifdef CONFIG_LOCKDEP
 	unsigned long flags;
 
+	/*
+	 * If lockdep gives a backtrace here, please reference
+	 * the synchronization rules above.
+	 */
 	raw_local_irq_save(flags);
 	local_bh_disable();
 	lock_map_acquire(&timer->lockdep_map);



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [tip:core/locking] lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
  2011-02-08 17:39             ` [PATCH] lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause Steven Rostedt
@ 2011-02-16 13:51               ` tip-bot for Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Steven Rostedt @ 2011-02-16 13:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, a.p.zijlstra, rostedt, tglx, mingo

Commit-ID:  48228f7b470a74b6469a250d2977a13128d8fe96
Gitweb:     http://git.kernel.org/tip/48228f7b470a74b6469a250d2977a13128d8fe96
Author:     Steven Rostedt <rostedt@goodmis.org>
AuthorDate: Tue, 8 Feb 2011 12:39:54 -0500
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Wed, 16 Feb 2011 13:35:08 +0100

lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause

Twice I had to explain the output about why lockdep gives an error with
locks in IRQ context and with del_timer_sync(). Might as well write it
up and place it in the comments above the code in del_timer_sync().
Perhaps the next time this lockdep dump triggers people will understand
the issues.

It is a ticky issue and very subtle, explaining it in detail in the code
may help others understand the issue when they stumble upon the bug
again.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1297186794.23343.19.camel@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/timer.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index d645992..5f40c2e 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -964,6 +964,25 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
  * add_timer_on(). Upon exit the timer is not queued and the handler is
  * not running on any CPU.
  *
+ * Note: You must not hold locks that are held in interrupt context
+ *   while calling this function. Even if the lock has nothing to do
+ *   with the timer in question.  Here's why:
+ *
+ *    CPU0                             CPU1
+ *    ----                             ----
+ *                                   <SOFTIRQ>
+ *                                   call_timer_fn();
+ *                                     base->running_timer = mytimer;
+ *  spin_lock_irq(somelock);
+ *                                     <IRQ>
+ *                                        spin_lock(somelock);
+ *  del_timer_sync(mytimer);
+ *   while (base->running_timer == mytimer);
+ *
+ * Now del_timer_sync() will never return and never release somelock.
+ * The interrupt on the other CPU is waiting to grab somelock but
+ * it has interrupted the softirq that CPU0 is waiting to finish.
+ *
  * The function returns whether it has deactivated a pending timer or not.
  */
 int del_timer_sync(struct timer_list *timer)
@@ -971,6 +990,10 @@ int del_timer_sync(struct timer_list *timer)
 #ifdef CONFIG_LOCKDEP
 	unsigned long flags;
 
+	/*
+	 * If lockdep gives a backtrace here, please reference
+	 * the synchronization rules above.
+	 */
 	local_irq_save(flags);
 	lock_map_acquire(&timer->lockdep_map);
 	lock_map_release(&timer->lockdep_map);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync()
  2011-02-08 12:11     ` Yong Zhang
@ 2011-02-08 12:14       ` Yong Zhang
  0 siblings, 0 replies; 12+ messages in thread
From: Yong Zhang @ 2011-02-08 12:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, Linus Torvalds, Linux Kernel Mailing List, x86,
	tglx, Ingo Molnar, Andrew Morton, Peter Zijlstra

From: Yong Zhang <yong.zhang0@gmail.com>
Subject: [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync()

raw_local_irq_save()/raw_local_irq_restore() is also not suitable
here because lockdep will report "possible reason: unannotated irqs-off"

To cure this issue, use local_bh_enable_force_wake() here in case of
we have pending softirq left during this time.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/timer.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index d53ce66..1654dc1 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -969,14 +969,10 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
 int del_timer_sync(struct timer_list *timer)
 {
 #ifdef CONFIG_LOCKDEP
-	unsigned long flags;
-
-	raw_local_irq_save(flags);
 	local_bh_disable();
 	lock_map_acquire(&timer->lockdep_map);
 	lock_map_release(&timer->lockdep_map);
-	_local_bh_enable();
-	raw_local_irq_restore(flags);
+	local_bh_enable_force_wake();
 #endif
 	/*
 	 * don't use it in hardirq context, because it
-- 
1.7.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-02-16 13:51 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-03  3:19 Regression: WARNINGS and lockdep spews in 2.6.38-rc3+ (bisected) Nick Bowler
2011-02-03  9:12 ` Yong Zhang
2011-02-03  9:30   ` Peter Zijlstra
2011-02-03 10:17     ` Yong Zhang
2011-02-03 10:33       ` Peter Zijlstra
2011-02-03 11:42         ` Yong Zhang
2011-02-08 16:55           ` Peter Zijlstra
2011-02-08 17:39             ` [PATCH] lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause Steven Rostedt
2011-02-16 13:51               ` [tip:core/locking] " tip-bot for Steven Rostedt
2011-02-03 11:50       ` [PATCH 1/2] softirq: introduce loacal_bh_enable_force_wake() Yong Zhang
2011-02-03 11:53       ` [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync() Yong Zhang
2011-02-08  0:23 Linux 2.6.38-rc4 Linus Torvalds
2011-02-08 10:17 ` lockdep: possible reason: unannotated irqs-off. (was: Re: Linux 2.6.38-rc4) Borislav Petkov
2011-02-08 10:41   ` Peter Zijlstra
2011-02-08 12:11     ` Yong Zhang
2011-02-08 12:14       ` [PATCH 2/2] timer: use local_bh_enable_force_wake() in del_timer_sync() Yong Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).