From: Marcelo Tosatti
Cc: Nitesh Lal,
	Nicolas Saenz Julienne,
	Frederic Weisbecker,
	Christoph Lameter,
	Juri Lelli,
	Peter Zijlstra,
	Alex Belits, Peter Xu
Subject: [patch 0/4] prctl task isolation interface and vmstat sync
Date: Tue, 27 Jul 2021 07:38:03 -0300
Message-ID: <20210727103803.464432924@fuller.cnet>

The logic that disables the vmstat worker thread when entering
nohz_full does not cover all scenarios. For example, the following
sequence is possible:

1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
2) app runs mlock, which increases counters for mlock'ed pages.
3) start -RT loop

Since the refresh_cpu_vm_stats call from the nohz_full logic can happen
_before_ the mlock, the vmstat shepherd can restart the vmstat worker
thread on the CPU in question.

To fix this, add a task isolation prctl interface to quiesce
deferred actions when returning to userspace.
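
The ordering problem and the proposed fix can be sketched with a toy
userspace model. This is not kernel code: struct cpu_model and the
model_* and scenario_* helpers are hypothetical names invented for
illustration, and only refresh_cpu_vm_stats, mlock and the vmstat
shepherd come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one CPU's vmstat state (hypothetical, not a kernel struct). */
struct cpu_model {
	long pending_diff;   /* per-CPU counter deltas not yet folded into global stats */
	bool worker_queued;  /* true once the vmstat worker is scheduled on this CPU */
};

/* Stands in for refresh_cpu_vm_stats(): fold the deltas, leaving the CPU clean. */
void model_refresh(struct cpu_model *c)
{
	assert(c != NULL);
	c->pending_diff = 0;
}

/* Stands in for mlock(): accounting mlock'ed pages dirties the per-CPU deltas. */
void model_mlock(struct cpu_model *c, long pages)
{
	assert(c != NULL);
	c->pending_diff += pages;
}

/* Stands in for the vmstat shepherd: requeue the worker if deltas are pending. */
void model_shepherd(struct cpu_model *c)
{
	assert(c != NULL);
	if (c->pending_diff != 0)
		c->worker_queued = true;
}

/* Steps 1-3 as described: sync at nohz_full entry, mlock, then the RT loop. */
bool scenario_without_quiesce(long pages)
{
	struct cpu_model c = { 0, false };

	model_refresh(&c);       /* 1) nohz_full entry syncs the stats */
	model_mlock(&c, pages);  /* 2) mlock happens after the sync */
	model_shepherd(&c);      /* 3) shepherd runs while the RT loop spins */
	return c.worker_queued;  /* true: the worker would interrupt the loop */
}

/* Same sequence, with a second sync on return to userspace after mlock. */
bool scenario_with_quiesce(long pages)
{
	struct cpu_model c = { 0, false };

	model_refresh(&c);
	model_mlock(&c, pages);
	model_refresh(&c);       /* quiesce deferred work before returning to userspace */
	model_shepherd(&c);
	return c.worker_queued;  /* false: nothing is pending, the worker stays idle */
}
```

In this model the first scenario leaves the worker queued for any
non-zero page count, while the second does not, which is the behaviour
the series aims for.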

Task isolation prctl interface

Set the thread isolation mode and parameters, which informs
the kernel that the application is executing
latency-sensitive code (where interruptions
are undesired).

It is composed of 4 prctl commands (passed as arg1 to prctl):

PR_ISOL_SET:   set isolation parameters for the task

PR_ISOL_GET:   get isolation parameters for the task

PR_ISOL_ENTER: indicate that task should be considered
               isolated from this point on

PR_ISOL_EXIT:  indicate that task should not be considered
               isolated from this point on

The isolation parameters and mode are not inherited by
children created by fork(2) and clone(2). The setting is
preserved across execve(2).

The meaning of isolated is specified as follows, when setting arg2 to
PR_ISOL_SET or PR_ISOL_GET, with the following arguments passed as arg3.

Isolation mode (PR_ISOL_MODE):

- PR_ISOL_MODE_NONE (arg4): no per-task isolation (default mode).

- PR_ISOL_MODE_NORMAL (arg4): applications can perform system calls normally,
  and in case of interruption events, the notifications can be collected
  by BPF programs.
  In this mode, if system calls are performed, deferred actions initiated
  by the system call will be executed before return to userspace.
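
A minimal sketch of how an application might use the interface,
assuming a kernel with this series applied. The numeric values below
are placeholders chosen for illustration, since this message does not
give them; the real ones come from the patched kernel headers. The
argument positions (command as arg1, PR_ISOL_MODE as arg2, mode value
as arg3) are also an assumption, and isol_prctl, run_isolated and
spin_loop are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>

/*
 * Placeholder numbers, assumed for illustration only. The real values
 * are defined by the patched kernel headers.
 */
#ifndef PR_ISOL_SET
#define PR_ISOL_SET          0x49534f01
#define PR_ISOL_GET          0x49534f02
#define PR_ISOL_ENTER        0x49534f03
#define PR_ISOL_EXIT         0x49534f04
#define PR_ISOL_MODE         1
#define PR_ISOL_MODE_NONE    0
#define PR_ISOL_MODE_NORMAL  1
#endif

/* Issue one isolation command; returns prctl's result or a negative errno. */
int isol_prctl(int cmd, unsigned long a2, unsigned long a3)
{
	int ret = prctl(cmd, a2, a3, 0UL, 0UL);

	return ret == -1 ? -errno : ret;
}

/* Stand-in for the latency-sensitive loop: bumps a counter 1000 times. */
void spin_loop(void *arg)
{
	volatile unsigned long *count = arg;

	for (unsigned long i = 0; i < 1000; i++)
		(*count)++;
}

/* Set the mode, enter isolation, run fn(arg), then exit isolation. */
int run_isolated(void (*fn)(void *), void *arg)
{
	int ret;

	assert(fn != NULL);

	ret = isol_prctl(PR_ISOL_SET, PR_ISOL_MODE, PR_ISOL_MODE_NORMAL);
	if (ret < 0)
		return ret;

	ret = isol_prctl(PR_ISOL_ENTER, 0, 0);
	if (ret < 0)
		return ret;

	fn(arg);  /* the section where interruptions are undesired */

	return isol_prctl(PR_ISOL_EXIT, 0, 0);
}
```

A caller would invoke run_isolated(spin_loop, &counter). On a kernel
without the series the first prctl call is expected to fail, so the
helper returns a negative errno and the loop never runs.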

Other modes, which for example send signals upon interruption events,
can be implemented.


The ``samples/task_isolation/`` directory contains a sample
