LKML Archive on lore.kernel.org
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Nitesh Lal <nilal@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Nicolas Saenz Julienne <nsaenzju@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Alex Belits <abelits@belits.com>, Peter Xu <peterx@redhat.com>
Subject: Re: [patch 4/4] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean
Date: Mon, 9 Aug 2021 14:34:39 -0300
Message-ID: <20210809173439.GA17223@fuller.cnet>
In-Reply-To: <CAFki+L==9YGLS-qk0RgPCmO_-9+Ea1mJ0LTEN0cKF1OURpWdyg@mail.gmail.com>

On Fri, Aug 06, 2021 at 10:47:40PM -0400, Nitesh Lal wrote:
> Hi Marcelo,
> 
> On Fri, Jul 30, 2021 at 4:21 PM Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >
> > It is not necessary to queue a work item to run refresh_vm_stats
> > on a remote CPU if that CPU has no dirty stats and no per-CPU
> > allocations for remote nodes.
> 
> [...]
> 
> >         /*
> >          * The regular update, every sysctl_stat_interval, may come later
> > @@ -1850,9 +1875,15 @@ int vmstat_refresh(struct ctl_table *tab
> >          * transiently negative values, report an error here if any of
> >          * the stats is negative, so we know to go looking for imbalance.
> >          */
> > -       err = schedule_on_each_cpu(refresh_vm_stats);
> > -       if (err)
> > -               return err;
> > +       get_online_cpus();
> > +       for_each_online_cpu(cpu) {
> > +               if (need_update(cpu) || need_drain_remote_zones(cpu))
> > +                       work_on_cpu(cpu, refresh_vm_stats, NULL);
> > +
> > +               cond_resched();
> > +       }
> > +       put_online_cpus();
> 
> While testing this patch-set by enabling the debug config options, I
> ran into the following splat:
> 
> [  945.149209] ======================================================
> [  945.156111] WARNING: possible circular locking dependency detected
> [  945.163013] 5.14.0-rc3+ #2 Tainted: G S
> [  945.168754] ------------------------------------------------------
> [  945.175655] sysctl/6820 is trying to acquire lock:
> [  945.181006] ffff8881679efb00 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: __flush_work+0x5c2/0x8c0
> [  945.191332]
> [  945.191332] but task is already holding lock:
> [  945.197846] ffffffff93b43660 (cpu_hotplug_lock){++++}-{0:0}, at: vmstat_refresh+0x4a/0x450
> [  945.207098]
> [  945.207098] which lock already depends on the new lock.
> [  945.207098]
> [  945.216228]
> [  945.216228] the existing dependency chain (in reverse order) is:
> [  945.224574]
> [  945.224574] -> #1 (cpu_hotplug_lock){++++}-{0:0}:
> [  945.231488]        lock_acquire+0x1d7/0x540
> [  945.236157]        cpus_read_lock+0x3b/0xc0
> [  945.240834]        alloc_workqueue+0x884/0xd00
> [  945.245800]        scsi_host_alloc+0xbdb/0xf60
> [  945.250769]        megasas_probe_one+0x164/0x800 [megaraid_sas]
> [  945.257402]        local_pci_probe+0xd8/0x170
> [  945.262270]        work_for_cpu_fn+0x51/0xa0
> [  945.267041]        process_one_work+0x960/0x16a0
> [  945.272200]        worker_thread+0x55e/0xbf0
> [  945.276970]        kthread+0x371/0x440
> [  945.281158]        ret_from_fork+0x22/0x30
> [  945.285737]
> [  945.285737] -> #0 ((work_completion)(&wfc.work)){+.+.}-{0:0}:
> [  945.293813]        check_prevs_add+0x3fd/0x2470
> [  945.298874]        __lock_acquire+0x23f7/0x2f80
> [  945.303934]        lock_acquire+0x1d7/0x540
> [  945.308605]        __flush_work+0x5e2/0x8c0
> [  945.313278]        work_on_cpu+0xee/0x140
> [  945.317753]        vmstat_refresh+0x12f/0x450
> [  945.322622]        proc_sys_call_handler+0x389/0x500
> [  945.328170]        new_sync_read+0x3af/0x610
> [  945.332942]        vfs_read+0x268/0x490
> [  945.337225]        ksys_read+0xf1/0x1c0
> [  945.341511]        do_syscall_64+0x42/0x90
> [  945.346091]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  945.352318]
> [  945.352318] other info that might help us debug this:
> [  945.352318]
> [  945.361254]  Possible unsafe locking scenario:
> [  945.361254]
> [  945.367863]        CPU0                    CPU1
> [  945.372920]        ----                    ----
> [  945.377976]   lock(cpu_hotplug_lock);
> [  945.382072]                                lock((work_completion)(&wfc.work));
> [  945.390140]                                lock(cpu_hotplug_lock);
> [  945.397046]   lock((work_completion)(&wfc.work));
> [  945.402301]
> [  945.402301]  *** DEADLOCK ***
> [  945.402301]
> [  945.408909] 1 lock held by sysctl/6820:
> [  945.413194]  #0: ffffffff93b43660 (cpu_hotplug_lock){++++}-{0:0}, at: vmstat_refresh+0x4a/0x450
> [  945.422929]
> [  945.422929] stack backtrace:
> [  945.427793] CPU: 25 PID: 6820 Comm: sysctl Tainted: G S                5.14.0-rc3+ #2
> [  945.436540] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.6.0 10/31/2017
> [  945.444886] Call Trace:
> [  945.447623]  dump_stack_lvl+0x44/0x57
> [  945.451720]  check_noncircular+0x280/0x320
> [  945.456299]  ? print_circular_bug.isra.49+0x440/0x440
> [  945.461943]  ? mark_lock.part.54+0x107/0x1370
> [  945.466813]  ? mark_lock.part.54+0x107/0x1370
> [  945.471683]  ? native_sched_clock+0x32/0x130
> [  945.476449]  ? lock_chain_count+0x20/0x20
> [  945.480929]  ? sched_clock_cpu+0x151/0x1d0
> [  945.485512]  check_prevs_add+0x3fd/0x2470
> [  945.489999]  ? native_sched_clock+0x32/0x130
> [  945.494770]  ? sched_clock_cpu+0x151/0x1d0
> [  945.499347]  ? find_held_lock+0x3a/0x1c0
> [  945.503731]  ? check_irq_usage+0xe10/0xe10
> [  945.508306]  ? lockdep_lock+0xbf/0x1b0
> [  945.512495]  ? static_obj+0xc0/0xc0
> [  945.516392]  ? sched_clock_cpu+0x151/0x1d0
> [  945.520975]  __lock_acquire+0x23f7/0x2f80
> [  945.525461]  ? rcu_read_lock_bh_held+0xc0/0xc0
> [  945.530430]  lock_acquire+0x1d7/0x540
> [  945.534523]  ? __flush_work+0x5c2/0x8c0
> [  945.538811]  ? rcu_read_unlock+0x50/0x50
> [  945.543196]  ? native_sched_clock+0x32/0x130
> [  945.547969]  ? __queue_work+0x4a3/0xfd0
> [  945.552256]  ? rcu_read_lock_sched_held+0xaf/0xe0
> [  945.557514]  __flush_work+0x5e2/0x8c0
> [  945.561604]  ? __flush_work+0x5c2/0x8c0
> [  945.565894]  ? queue_delayed_work_on+0x80/0x80
> [  945.570853]  ? lock_downgrade+0x700/0x700
> [  945.575339]  ? mark_held_locks+0xb7/0x120
> [  945.579829]  ? lockdep_hardirqs_on_prepare+0x28f/0x3e0
> [  945.585572]  ? queue_work_on+0x48/0x80
> [  945.589763]  ? trace_hardirqs_on+0x32/0x170
> [  945.594436]  work_on_cpu+0xee/0x140
> [  945.598327]  ? flush_delayed_work+0xc0/0xc0
> [  945.603004]  ? work_debug_hint+0x40/0x40
> [  945.607388]  ? refresh_cpu_vm_stats+0x530/0x530
> [  945.612452]  ? need_update+0x162/0x210
> [  945.616646]  vmstat_refresh+0x12f/0x450
> [  945.620938]  proc_sys_call_handler+0x389/0x500
> [  945.625905]  ? proc_sys_permission+0x120/0x120
> [  945.630872]  ? native_sched_clock+0x32/0x130
> [  945.635644]  new_sync_read+0x3af/0x610
> [  945.639838]  ? __x64_sys_llseek+0x2e0/0x2e0
> [  945.644505]  ? native_sched_clock+0x32/0x130
> [  945.649290]  ? fsnotify+0xf10/0xf10
> [  945.653199]  vfs_read+0x268/0x490
> [  945.656913]  ksys_read+0xf1/0x1c0
> [  945.660619]  ? vfs_write+0x910/0x910
> [  945.664615]  ? syscall_trace_enter.isra.17+0x18c/0x260
> [  945.670369]  do_syscall_64+0x42/0x90
> [  945.674371]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> Maybe we should replace get/put_online_cpus() with cpu_hotplug_disable()/cpu_hotplug_enable()?

schedule_on_each_cpu() has a similar pattern and should exhibit
the same problem.
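
From memory, the core of schedule_on_each_cpu() in kernel/workqueue.c
looks roughly like this (a paraphrased sketch, not the exact source):

	int schedule_on_each_cpu(work_func_t func)
	{
		int cpu;
		struct work_struct __percpu *works;

		works = alloc_percpu(struct work_struct);
		if (!works)
			return -ENOMEM;

		/* Takes cpu_hotplug_lock for read. */
		cpus_read_lock();

		for_each_online_cpu(cpu) {
			struct work_struct *work = per_cpu_ptr(works, cpu);

			INIT_WORK(work, func);
			schedule_work_on(cpu, work);
		}

		/* Waits for the work items to complete while
		 * cpu_hotplug_lock is still held: the same ordering
		 * the splat above complains about. */
		for_each_online_cpu(cpu)
			flush_work(per_cpu_ptr(works, cpu));

		cpus_read_unlock();
		free_percpu(works);
		return 0;
	}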

So perhaps this problem already existed; I don't see the warning on a VM.

Can you revert the last patch and confirm whether it is what
introduces this issue?
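
If it does, then a fix along the lines of your suggestion, blocking
hotplug with cpu_hotplug_disable() rather than holding cpu_hotplug_lock
for read across the flush, might look like this (an untested sketch,
reusing need_drain_remote_zones() from this patch):

	/* cpu_hotplug_disable() only bumps a counter under
	 * cpu_add_remove_lock and returns, so cpu_hotplug_lock is not
	 * held while work_on_cpu() flushes the work item, which breaks
	 * the reported lock cycle. */
	cpu_hotplug_disable();
	for_each_online_cpu(cpu) {
		if (need_update(cpu) || need_drain_remote_zones(cpu))
			work_on_cpu(cpu, refresh_vm_stats, NULL);

		cond_resched();
	}
	cpu_hotplug_enable();

Whether that just trades this lockdep dependency for another one would
need checking, though.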

Thanks for testing!


