From: Marcelo Tosatti <mtosatti@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Nitesh Lal <nilal@redhat.com>,
	Nicolas Saenz Julienne <nsaenzju@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Alex Belits <abelits@belits.com>, Peter Xu <peterx@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: [patch V3 3/8] task isolation: sync vmstats on return to userspace
Date: Tue, 24 Aug 2021 12:24:26 -0300
Message-ID: <20210824152646.743604666@fuller.cnet>
In-Reply-To: 20210824152423.300346181@fuller.cnet

The logic to disable vmstat worker thread, when entering
nohz full, does not cover all scenarios. For example, it is possible
for the following to happen:

1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop

Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
the mlock, vmstat shepherd can restart vmstat worker thread on
the CPU in question.

To fix this, use the task isolation prctl interface to quiesce
deferred actions when returning to userspace.
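The three-step scenario above can be sketched as a small userspace model. This is only an illustration of the ordering problem, not kernel code: struct cpu_model, the model_* helpers and run_scenario() are hypothetical names invented here, and the shepherd is reduced to "requeue the worker if the CPU still has unsynced deltas".

```c
/* Userspace model of the race described above; none of these are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vmstat state of one CPU. */
struct cpu_model {
	long pending_diff;	/* per-CPU delta not yet folded into the global counter */
	long global_count;	/* globally visible counter */
	bool worker_queued;	/* would vmstat_update work be pending on this CPU? */
};

/* Models refresh_cpu_vm_stats(): fold the per-CPU delta into the global counter. */
static void model_refresh(struct cpu_model *c)
{
	c->global_count += c->pending_diff;
	c->pending_diff = 0;
}

/* Models the application calling mlock(): counters change after the earlier sync. */
static void model_mlock(struct cpu_model *c, long pages)
{
	assert(pages > 0);
	c->pending_diff += pages;
}

/* Models the shepherd: requeue the worker if the CPU has unsynced deltas. */
static bool model_shepherd(struct cpu_model *c)
{
	if (c->pending_diff != 0)
		c->worker_queued = true;
	return c->worker_queued;
}

/* Models the proposed sync on return to userspace: flush, then cancel the worker. */
static void model_sync_on_exit(struct cpu_model *c)
{
	model_refresh(c);
	c->worker_queued = false;
}

/* Returns true if the worker would be restarted during the isolated loop. */
static bool run_scenario(bool sync_on_exit)
{
	struct cpu_model c = { 0 };

	model_refresh(&c);		/* 1) enter nohz_full, stats are synced */
	model_mlock(&c, 16);		/* 2) app runs mlock, counters change */
	if (sync_on_exit)
		model_sync_on_exit(&c);	/* fix: sync when the syscall returns */
	return model_shepherd(&c);	/* 3) RT loop starts; shepherd runs later */
}
```

In this model run_scenario(false) reports that the worker is restarted, while run_scenario(true) does not, which is the behavior the patch aims for.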

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 include/linux/task_isolation.h |   12 ++++++++++++
 include/linux/vmstat.h         |    8 ++++++++
 kernel/entry/common.c          |    2 ++
 kernel/task_isolation.c        |   26 ++++++++++++++++++++++++++
 mm/vmstat.c                    |   21 +++++++++++++++++++++
 5 files changed, 69 insertions(+)

Index: linux-2.6/include/linux/task_isolation.h
===================================================================
--- linux-2.6.orig/include/linux/task_isolation.h
+++ linux-2.6/include/linux/task_isolation.h
@@ -41,8 +41,20 @@ int prctl_task_isolation_ctrl_set(unsign
 
 int __copy_task_isolation(struct task_struct *tsk);
 
+void __isolation_exit_to_user_mode_prepare(void);
+
+static inline void isolation_exit_to_user_mode_prepare(void)
+{
+	if (current->isol_info)
+		__isolation_exit_to_user_mode_prepare();
+}
+
 #else
 
+static void isolation_exit_to_user_mode_prepare(void)
+{
+}
+
 static inline void tsk_isol_free(struct task_struct *tsk)
 {
 }
Index: linux-2.6/include/linux/vmstat.h
===================================================================
--- linux-2.6.orig/include/linux/vmstat.h
+++ linux-2.6/include/linux/vmstat.h
@@ -21,6 +21,14 @@ int sysctl_vm_numa_stat_handler(struct c
 		void *buffer, size_t *length, loff_t *ppos);
 #endif
 
+#ifdef CONFIG_SMP
+void sync_vmstat(void);
+#else
+static inline void sync_vmstat(void)
+{
+}
+#endif
+
 struct reclaim_stat {
 	unsigned nr_dirty;
 	unsigned nr_unqueued_dirty;
Index: linux-2.6/kernel/entry/common.c
===================================================================
--- linux-2.6.orig/kernel/entry/common.c
+++ linux-2.6/kernel/entry/common.c
@@ -6,6 +6,7 @@
 #include <linux/livepatch.h>
 #include <linux/audit.h>
 #include <linux/tick.h>
+#include <linux/task_isolation.h>
 
 #include "common.h"
 
@@ -287,6 +288,7 @@ static void syscall_exit_to_user_mode_pr
 static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *regs)
 {
 	syscall_exit_to_user_mode_prepare(regs);
+	isolation_exit_to_user_mode_prepare();
 	local_irq_disable_exit_to_user();
 	exit_to_user_mode_prepare(regs);
 }
Index: linux-2.6/kernel/task_isolation.c
===================================================================
--- linux-2.6.orig/kernel/task_isolation.c
+++ linux-2.6/kernel/task_isolation.c
@@ -18,6 +18,8 @@
 #include <linux/sysfs.h>
 #include <linux/init.h>
 #include <linux/sched/task.h>
+#include <linux/mm.h>
+#include <linux/vmstat.h>
 
 void __tsk_isol_free(struct task_struct *tsk)
 {
@@ -278,3 +280,19 @@ int prctl_task_isolation_ctrl_get(unsign
 
 	return ret;
 }
+
+void __isolation_exit_to_user_mode_prepare(void)
+{
+	struct isol_info *i;
+
+	i = current->isol_info;
+	if (!i)
+		return;
+
+	if (i->active_mask != ISOL_F_QUIESCE)
+		return;
+
+	if (i->quiesce_mask & ISOL_F_QUIESCE_VMSTATS)
+		sync_vmstat();
+}
+EXPORT_SYMBOL_GPL(__isolation_exit_to_user_mode_prepare);
Index: linux-2.6/mm/vmstat.c
===================================================================
--- linux-2.6.orig/mm/vmstat.c
+++ linux-2.6/mm/vmstat.c
@@ -1964,6 +1964,27 @@ static void vmstat_shepherd(struct work_
 		round_jiffies_relative(sysctl_stat_interval));
 }
 
+void sync_vmstat(void)
+{
+	int cpu;
+
+	cpu = get_cpu();
+
+	refresh_cpu_vm_stats(false);
+	put_cpu();
+
+	/*
+	 * If task is migrated to another CPU between put_cpu
+	 * and cancel_delayed_work_sync, the code below might
+	 * cancel vmstat_update work for a different cpu
+	 * (than the one from which the vmstats were flushed).
+	 *
+	 * However, vmstat shepherd will re-enable it later,
+	 * so its harmless.
+	 */
+	cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
+}
+
 static void __init start_shepherd_timer(void)
 {
 	int cpu;
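
For readers who want to check when the hook in kernel/task_isolation.c actually triggers a sync, here is a minimal userspace model of its gating logic. The flag values below are assumptions made for illustration only (the real definitions live elsewhere in the series and are not shown in this patch), and struct isol_info_model, model_sync_vmstat() and model_exit_to_user_prepare() are hypothetical stand-ins.

```c
/* Userspace model of the gating in __isolation_exit_to_user_mode_prepare(). */
#include <assert.h>
#include <stddef.h>

/* ASSUMED values; the real ones are defined elsewhere in the series. */
#define ISOL_F_QUIESCE		(1ULL << 0)
#define ISOL_F_QUIESCE_VMSTATS	(1ULL << 0)

/* Hypothetical stand-in carrying only the two fields the hook reads. */
struct isol_info_model {
	unsigned long long active_mask;
	unsigned long long quiesce_mask;
};

static int sync_calls;	/* counts how often the modelled sync ran */

/* Stand-in for sync_vmstat(); only records that it was called. */
static void model_sync_vmstat(void)
{
	sync_calls++;
}

/* Same three checks, in the same order, as the function in the patch. */
static void model_exit_to_user_prepare(const struct isol_info_model *i)
{
	if (!i)
		return;		/* task never configured isolation */

	if (i->active_mask != ISOL_F_QUIESCE)
		return;		/* exact comparison, not a bit test */

	if (i->quiesce_mask & ISOL_F_QUIESCE_VMSTATS)
		model_sync_vmstat();
}

/* Walks the cases: no info, quiesce inactive, vmstat bit clear, all set. */
static void self_check(void)
{
	struct isol_info_model inactive = { 0, ISOL_F_QUIESCE_VMSTATS };
	struct isol_info_model no_vmstat = { ISOL_F_QUIESCE, 0 };
	struct isol_info_model enabled = { ISOL_F_QUIESCE, ISOL_F_QUIESCE_VMSTATS };

	sync_calls = 0;
	model_exit_to_user_prepare(NULL);
	model_exit_to_user_prepare(&inactive);
	model_exit_to_user_prepare(&no_vmstat);
	assert(sync_calls == 0);

	model_exit_to_user_prepare(&enabled);
	assert(sync_calls == 1);
}
```

Because the active_mask check is an exact comparison, an active mask with any additional bit set would skip the sync in this model; whether that is intended once more isolation features exist is a question for the series, not something this sketch can answer.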

Thread overview: 27+ messages
2021-08-24 15:24 [patch V3 0/8] extensible prctl task isolation interface and vmstat sync Marcelo Tosatti
2021-08-24 15:24 ` [patch V3 1/8] add basic task isolation prctl interface Marcelo Tosatti
2021-08-24 15:24 ` [patch V3 2/8] add prctl task isolation prctl docs and samples Marcelo Tosatti
2021-08-26  9:59 ` Frederic Weisbecker
2021-08-26 12:11 ` Marcelo Tosatti
2021-08-26 19:15 ` Christoph Lameter
2021-08-26 20:37 ` Marcelo Tosatti
2021-08-27 13:08 ` Frederic Weisbecker
2021-08-27 14:44 ` Marcelo Tosatti
2021-08-30 11:38 ` Frederic Weisbecker
2021-09-01 13:11 ` Nitesh Lal
2021-09-01 17:34 ` Marcelo Tosatti
2021-09-01 17:49 ` Nitesh Lal
2021-08-24 15:24 ` Marcelo Tosatti [this message]
2021-09-10 13:49 ` [patch V3 3/8] task isolation: sync vmstats on return to userspace nsaenzju
2021-08-24 15:24 ` [patch V3 4/8] procfs: add per-pid task isolation state Marcelo Tosatti
2021-08-24 15:24 ` [patch V3 5/8] task isolation: sync vmstats conditional on changes Marcelo Tosatti
2021-08-25  9:46 ` Christoph Lameter
2021-08-24 15:24 ` [patch V3 6/8] KVM: x86: call isolation prepare from VM-entry code path Marcelo Tosatti
2021-08-24 15:24 ` [patch V3 7/8] mm: vmstat: move need_update Marcelo Tosatti
2021-08-24 15:24 ` [patch V3 8/8] mm: vmstat_refresh: avoid queueing work item if cpu stats are clean Marcelo Tosatti
2021-08-25  9:30 ` Christoph Lameter
2021-09-01 13:05 ` Nitesh Lal
2021-09-01 17:32 ` Marcelo Tosatti
2021-09-01 18:33 ` Marcelo Tosatti
2021-09-03 17:38 ` Nitesh Lal
2021-08-25 10:02 ` [patch V3 0/8] extensible prctl task isolation interface and vmstat sync Marcelo Tosatti