LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation
@ 2021-07-20 13:12 SeongJae Park
  2021-07-20 13:12 ` [RFC v3 01/15] mm/damon/paddr: Support the pageout scheme SeongJae Park
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:12 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

NOTE: This is only an RFC for future features of DAMON patchset[1], which is
not merged in the mainline yet.  The aim of this RFC is to show how DAMON would
be evolved once it is merged in.  So, if you have some interest here, please
consider reviewing the DAMON patchset, either.

[1] https://lore.kernel.org/linux-mm/20210716081449.22187-1-sj38.park@gmail.com/

Changes from Previous Version (RFC v2)
======================================

Compared to the RFC v2
(https://lore.kernel.org/linux-mm/20210608115254.11930-1-sj38.park@gmail.com/),
this version contains below changes.

- Rebase on latest -mm tree (v5.14-rc1-mmots-2021-07-15-18-47)
- Implement a time quota (limits the time for trying reclamation of cold pages)
- Make reclamation restarts from exactly the point it stopped due to the limit

Introduction
============

In short, this patchset 1) makes the engine for general data access
pattern-oriented memory management be useful for production environments, and
2) implements a static kernel module for lightweight proactive reclamation
using the engine.

Proactive Reclamation
---------------------

On general memory over-committed systems, proactively reclaiming cold pages
helps saving memory and reducing latency spikes that incurred by the direct
reclaim or the CPU consumption of kswapd, while incurring only minimal
performance degradation[2].

Particularly, a Free Pages Reporting[9] based memory over-commit virtualization
system would be one of such use cases.  In the system, the guest VMs reports
their free memory to host, and the host reallocates the reported memory to
other guests.  As a result, the system's memory can be fully utilized.
However, the guests could be not so memory-frugal, mainly because some kernel
subsystems and user-space applications are designed to use as much memory as
available.  Then, guests would report only small amount of free memory to host,
and results in poor memory utilization.  Running the proactive reclamation in
guests could help mitigating this problem.

Google has implemented the general idea and using it in their data center.
They further proposed upstreaming it in LSFMM'19, and "the general consensus
was that, while this sort of proactive reclaim would be useful for a number of
users, the cost of this particular solution was too high to consider merging it
upstream"[3].  The cost mainly comes from the coldness tracking.  Roughly
speaking, the implementation periodically scans the 'Accessed' bit of each
page.  For the reason, the overhead linearly increases as the size of the
memory and the scanning frequency grows.  As a result, Google is known to
dedicating one CPU for the work.  That's a reasonable option to someone like
Google, but it wouldn't be so to some others.

DAMON and DAMOS: An engine for data access pattern-oriented memory management
-----------------------------------------------------------------------------

DAMON[4] is a framework for general data access monitoring.  Its adaptive
monitoring overhead control feature minimizes its monitoring overhead.  It also
let the upper-bounded of the overhead be configurable by clients, regardless of
the size of the monitoring target memory.  While monitoring 70 GB memory of a
production system every 5 milliseconds, it consumes less than 1% single CPU
time.  For this, it could sacrify some of the quality of the monitoring
results.  Nevertheless, the lower-bound of the quality is configurable, and it
uses a best-effort algorithm for better quality.  Our test results[5] show the
quality is practical enough.  From the production system monitoring, we were
able to find a 4 KB region in the 70 GB memory that shows highest access
frequency.  For people having different requirements, the features can
selectively turned off, and DAMON supports the page-granularity monitoring[6],
though it makes the overhead higher and proportional to the memory size again.

We normally don't monitor the data access pattern just for fun but to improve
something like memory management.  Proactive reclamation is one such usage.
For such general cases, DAMON provides a feature called DAMon-based Operation
Schemes (DAMOS)[7].  It makes DAMON an engine for general data access pattern
oriented memory management.  Using this, clients can ask DAMON to find memory
regions of specific data access pattern and apply some memory management action
(e.g., page out, move to head of the LRU list, use huge page, ...).  We call
the request 'scheme'.

Proactive Reclamation on top of DAMON/DAMOS
-------------------------------------------

Therefore, by using DAMON for the cold pages detection, the proactive
reclamation's monitoring overhead issue could be solved.  If someone like
Google is ok to dedicate some CPUs for the monitoring and wants
page-granularity monitoring, they can configure DAMON so.

Actually, we previously implemented a version of proactive reclamation using
DAMOS and achieved noticeable improvements with our evaluation setup[5].
Nevertheless, it was only for a proof-of-concept.  It supports only virtual
address spaces of processes, and require additional tuning efforts for given
workloads and the hardware.  For the tuning, we recently introduced a simple
auto-tuning user space tool[8].  Google is also known to using a ML-based
similar approach for their fleets[2].  But, making it just works in the kernel
would be more convenient for general users.

To this end, this patchset improves DAMOS to be ready for such production
usages, and implements another version of the proactive reclamation, namely
DAMON_RECLAIM, on top of it.

DAMOS Improvements: Speed Limit, Prioritization, and Watermarks
---------------------------------------------------------------

First of all, the current version of DAMOS supports only virtual address
spaces.  This patchset makes it supports the physical address space for the
page out action.

One major problem of the current version of DAMOS is the lack of the
aggressiveness control, which can results in arbitrary overhead.  For example,
if huge memory regions having the data access pattern of interest are found,
applying the requested action to all of the regions could incur significant
overhead.  It can be controlled by modifying the target data access pattern
with manual or automated approaches[2,8].  But, some people would prefer the
kernel to just work with only intuitive tuning or default values.

For this, this patchset implements a safeguard time/size quota.  Using this,
the clients can specify up to how much time can be used for applying the
action, and/or up to how much memory regions the action can be applied within
specific time duration.  A followup question is, to which memory regions should
the action applied within the limits?  We implement a simple regions
prioritization mechanism for each action and make DAMOS to apply the action to
high priority regions first.  It also allows clients tune the prioritization
mechanism to use different weights for region's size, access frequency, and
age.  This means we could use not only LRU but also LFU or some fancy
algorithms like CAR[10] with lightweight overhead.

Though DAMON is lightweight, someone would want to remove even the overhead
when it is unnecessary.  Currently, it should manually turned on and off by
clients, but some clients would simply want to turn it on and off based on some
metrics like free memory ratio or memory fragmentation.  For such cases, this
patchset implements a watermarks-based automatic activation feature.  It allows
the clients configure the metric of their interest, and three watermarks of the
metric.  If the metric is higher than the high watermark or lower than the low
watermark, the scheme is deactivated.  If the metric is lower than the mid
watermark but higher than the low watermark, the scheme is activated.

DAMON-based Reclaim
-------------------

Using the improved DAMOS, this patchset implements a static kernel module
called 'damon_reclaim'.  It finds memory regions that didn't accessed for
specific time duration and page out.  Consuming too much CPU for the paging out
operations, or invoking it too frequently can be critical for systems
configuring its swap devices with software-defined in-memory block devices like
zram or total number of writes limited devices like SSDs, respectively.  To
avoid the problems, the time and/or size quotas can be configured.  Under the
quotas, it pages out memory regions that didn't accessed longer first.  Also,
to remove the monitoring overhead under peaceful situation, and to fall back to
the LRU-list based page granularity reclamation when it doesn't make progress,
the three watermarks based activation mechanism is used, with the free memory
ratio as the watermark metric.

For convenient configurations, it provides several module parameters.  Using
these, sysadmins can enable/disable it and tune the coldness identification
time threshold, the time/size quotas, and the three watermarks.  In detail,
sysadmins can use the kernel command line for a boot time tuning, or the sysfs
('/sys/modules/damon_reclaimparameters/') for overriding those in runtime.

Evaluation
==========

In short, DAMON_RECLAIM on v5.13 Linux kernel with ZRAM swap device and 50ms/s
time quota achieves 40.34% memory saving with only 3.38% runtime overhead.  For
this, DAMON_RECLAIM consumes only 5.16% of single CPU time.  Among the CPU
consumption, only up to about 1.448% of single CPU time is expected to be used
for the access pattern monitoring.

Setup
-----

We evaluate DAMON_RECLAIM to show how each of the DAMOS improvements make
effect.  For this, we measure entire system memory footprint and runtime of 24
realistic workloads in PARSEC3 and SPLASH-2X benchmark suites on my QEMU/KVM
based virtual machine.  The virtual machine runs on an i3.metal AWS instance
and has 130GiB memory.  It also utilizes a 4 GiB ZRAM swap device.  We do the
measurement 5 times and use averages.  We also measure the CPU consumption of
DAMON_RECLAIM.

Detailed Results
----------------

The result numbers are shown in below table.

DAMON_RECLAIM without the speed limit achieves 47.16% memory saving, but incur
5.4% runtime slowdown to the workloads on average.  For this, DAMON_RECLAIM
consumes about 11.62% single CPU time.

Applying 10ms/s, 50ms/s, and 200ms/s time quotas without the regions
prioritization reduces the slowdown to 2.51%, 4.53%, and 4.69%, respectively.
DAMON_RECLAIM's CPU utilization also similarly reduced: 1.78%, 5.7%, and 10.92%
of single CPU time.  That is, the overhead is proportional to the speed limit.
Nevertheless, it also reduces the memory saving because it becomes less
aggressive.  In detail, the three variants show 4.55%, 40.84%, and 48.42%
memory saving, respectively.

Applying the regions prioritization (page out regions that not accessed longer
first within the time quota) further reduces the performance degradation.
Runtime slowdowns has been 2.51% -> 1.84% (10ms/s), 4.53% -> 3.38% (50ms/s), and
4.69% -> 5.1% (200ms/s).  Interestingly, prioritization also reduced memory
saving a little bit.  I think that's because already paged out regions are
prioritized again.

    time quota   prioritization  memory_saving  cpu_util  slowdown
    N            N               47.16%         11.62%    5.4%
    10ms/s       N               4.55%          1.78%     2.51%
    50ms/s       N               40.84%         5.7%      4.53%
    200ms/s      N               48.42%         10.92%    4.69%
    10ms/s       Y               0.77%          1.37%     1.84%
    50ms/s       Y               40.34%         5.16%     3.38%
    200ms/s      Y               47.99%         10.41%    5.1%

Baseline and Complete Git Trees
===============================

The patches are based on the latest -mm tree (v5.14-rc1-mmots-2021-07-15-18-47)
plus DAMON patchset[1], DAMOS patchset[7], and physical address space support
patchset[6].  You can also clone the complete git tree from:

    $ git clone git://github.com/sjp38/linux -b damon_reclaim/rfc/v3

The web is also available:
https://github.com/sjp38/linux/releases/tag/damon_reclaim/rfc/v3

Development Trees
-----------------

There are a couple of trees for entire DAMON patchset series and
features for future release.

- For latest release: https://github.com/sjp38/linux/tree/damon/master
- For next release: https://github.com/sjp38/linux/tree/damon/next

Long-term Support Trees
-----------------------

For people who want to test DAMON patchset series but using only LTS kernels,
there are another couple of trees based on two latest LTS kernels respectively
and containing the 'damon/master' backports.

- For v5.4.y: https://github.com/sjp38/linux/tree/damon/for-v5.4.y
- For v5.10.y: https://github.com/sjp38/linux/tree/damon/for-v5.10.y

Sequence Of Patches
===================

The first patch makes DAMOS to support the physical address space for the page
out action.  Following five patches (patches 2-6) implement the time/size
quotas.  Next four patches (patches 7-10) implement the memory regions
prioritization within the limit.  Then, three following patches (patches 11-13)
implement the watermarks-based schemes activation.  Finally, the last two
patches (patches 14-15) implement and document the DAMON-based reclamation on
top of the advanced DAMOS.

[1] https://lore.kernel.org/linux-mm/20210716081449.22187-1-sj38.park@gmail.com/
[2] https://research.google/pubs/pub48551/
[3] https://lwn.net/Articles/787611/
[4] https://damonitor.github.io
[5] https://damonitor.github.io/doc/html/latest/vm/damon/eval.html
[6] https://lore.kernel.org/linux-mm/20201216094221.11898-1-sjpark@amazon.com/
[7] https://lore.kernel.org/linux-mm/20201216084404.23183-1-sjpark@amazon.com/
[8] https://github.com/awslabs/damoos
[9] https://www.kernel.org/doc/html/latest/vm/free_page_reporting.html
[10] https://www.usenix.org/conference/fast-04/car-clock-adaptive-replacement

Patch History
=============

Changes from RFC v2
(https://lore.kernel.org/linux-mm/20210608115254.11930-1-sj38.park@gmail.com/)
- Rebase on latest -mm tree (v5.14-rc1-mmots-2021-07-15-18-47)
- Make reclamation restarts from exactly the point it stopped due to the limit
- Implement a time quota (limits the time for trying reclamation of cold pages)

[1] https://lore.kernel.org/linux-mm/20210716081449.22187-1-sj38.park@gmail.com/

Changes from RFC v1
(https://lore.kernel.org/linux-mm/20210531133816.12689-1-sj38.park@gmail.com/)
- Avoid fake I/O load reporting (James Gowans)
- Remove kernel configs for the build time enabling and the parameters setting
- Export kdamond pid via a readonly parameter file
- Elaborate coverletter, especially for evaluation and DAMON_RECLAIM interface
- Add documentation
- Rebase on -mm tree
- Cleanup code

SeongJae Park (15):
  mm/damon/paddr: Support the pageout scheme
  mm/damon/damos: Make schemes aggressiveness controllable
  damon/core/schemes: Skip already charged targets and regions
  mm/damon/schemes: Implement time quota
  mm/damon/dbgfs: Support schemes' time/IO quotas
  mm/damon/selftests: Support schemes quotas
  mm/damon/schemes: Prioritize regions within the quotas
  mm/damon/vaddr,paddr: Support pageout prioritization
  mm/damon/dbgfs: Support prioritization weights
  tools/selftests/damon: Update for regions prioritization of schemes
  mm/damon/schemes: Activate schemes based on a watermarks mechanism
  mm/damon/dbgfs: Support watermarks
  selftests/damon: Support watermarks
  mm/damon: Introduce DAMON-based reclamation
  Documentation/admin-guide/mm/damon: Add a document for DAMON_RECLAIM

 Documentation/admin-guide/mm/damon/index.rst  |   1 +
 .../admin-guide/mm/damon/reclaim.rst          | 233 ++++++++++++
 include/linux/damon.h                         | 136 ++++++-
 mm/damon/Kconfig                              |  12 +
 mm/damon/Makefile                             |   1 +
 mm/damon/core.c                               | 283 +++++++++++++-
 mm/damon/dbgfs.c                              |  47 ++-
 mm/damon/paddr.c                              |  52 ++-
 mm/damon/prmtv-common.c                       |  48 ++-
 mm/damon/prmtv-common.h                       |   5 +
 mm/damon/reclaim.c                            | 354 ++++++++++++++++++
 mm/damon/vaddr.c                              |  15 +
 .../testing/selftests/damon/debugfs_attrs.sh  |   4 +-
 13 files changed, 1163 insertions(+), 28 deletions(-)
 create mode 100644 Documentation/admin-guide/mm/damon/reclaim.rst
 create mode 100644 mm/damon/reclaim.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 01/15] mm/damon/paddr: Support the pageout scheme
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
@ 2021-07-20 13:12 ` SeongJae Park
  2021-07-20 13:12 ` [RFC v3 02/15] mm/damon/damos: Make schemes aggressiveness controllable SeongJae Park
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:12 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit makes the DAMON primitives for physical address space to
support the pageout action for DAMON-based Operation Schemes.  IOW, now
the users can implement their own data access-aware reclamations for
whole system using DAMOS.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/damon/paddr.c        | 38 +++++++++++++++++++++++++++++++++++++-
 mm/damon/prmtv-common.c |  2 +-
 mm/damon/prmtv-common.h |  2 ++
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index b92b07a3ce53..303db372e53b 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -7,6 +7,9 @@
 
 #define pr_fmt(fmt) "damon-pa: " fmt
 
+#include <linux/swap.h>
+
+#include "../internal.h"
 #include "prmtv-common.h"
 
 /*
@@ -85,6 +88,39 @@ bool damon_pa_target_valid(void *t)
 	return true;
 }
 
+int damon_pa_apply_scheme(struct damon_ctx *ctx, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme)
+{
+	unsigned long addr;
+	LIST_HEAD(page_list);
+
+	if (scheme->action != DAMOS_PAGEOUT)
+		return -EINVAL;
+
+	for (addr = r->ar.start; addr < r->ar.end; addr += PAGE_SIZE) {
+		struct page *page = damon_get_page(PHYS_PFN(addr));
+
+		if (!page)
+			continue;
+
+		ClearPageReferenced(page);
+		test_and_clear_page_young(page);
+		if (isolate_lru_page(page)) {
+			put_page(page);
+			continue;
+		}
+		if (PageUnevictable(page)) {
+			putback_lru_page(page);
+		} else {
+			list_add(&page->lru, &page_list);
+			put_page(page);
+		}
+	}
+	reclaim_pages(&page_list);
+	cond_resched();
+	return 0;
+}
+
 void damon_pa_set_primitives(struct damon_ctx *ctx)
 {
 	ctx->primitive.init = NULL;
@@ -94,5 +130,5 @@ void damon_pa_set_primitives(struct damon_ctx *ctx)
 	ctx->primitive.reset_aggregated = NULL;
 	ctx->primitive.target_valid = damon_pa_target_valid;
 	ctx->primitive.cleanup = NULL;
-	ctx->primitive.apply_scheme = NULL;
+	ctx->primitive.apply_scheme = damon_pa_apply_scheme;
 }
diff --git a/mm/damon/prmtv-common.c b/mm/damon/prmtv-common.c
index 08e9318d67ed..01c1c1b37859 100644
--- a/mm/damon/prmtv-common.c
+++ b/mm/damon/prmtv-common.c
@@ -14,7 +14,7 @@
  * The body of this function is stolen from the 'page_idle_get_page()'.  We
  * steal rather than reuse it because the code is quite simple.
  */
-static struct page *damon_get_page(unsigned long pfn)
+struct page *damon_get_page(unsigned long pfn)
 {
 	struct page *page = pfn_to_online_page(pfn);
 
diff --git a/mm/damon/prmtv-common.h b/mm/damon/prmtv-common.h
index 939c41af6b59..ba0c4eecbb79 100644
--- a/mm/damon/prmtv-common.h
+++ b/mm/damon/prmtv-common.h
@@ -18,6 +18,8 @@
 /* Get a random number in [l, r) */
 #define damon_rand(l, r) (l + prandom_u32_max(r - l))
 
+struct page *damon_get_page(unsigned long pfn);
+
 void damon_va_mkold(struct mm_struct *mm, unsigned long addr);
 bool damon_va_young(struct mm_struct *mm, unsigned long addr,
 			unsigned long *page_sz);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 02/15] mm/damon/damos: Make schemes aggressiveness controllable
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
  2021-07-20 13:12 ` [RFC v3 01/15] mm/damon/paddr: Support the pageout scheme SeongJae Park
@ 2021-07-20 13:12 ` SeongJae Park
  2021-07-20 13:12 ` [RFC v3 03/15] damon/core/schemes: Skip already charged targets and regions SeongJae Park
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:12 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

If there are too large memory regions fulfilling the target data access
pattern of a DAMON-based operation scheme, applying the action of the
scheme could consume too much CPU.  To avoid that, this commit
implements a limit for the action application speed.  Using the feature,
the client can set up to how much amount of memory regions the action
could applied within specific time duration.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h | 36 +++++++++++++++++++++++---
 mm/damon/core.c       | 60 +++++++++++++++++++++++++++++++++++++------
 mm/damon/dbgfs.c      |  4 ++-
 3 files changed, 87 insertions(+), 13 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 6eb265717fd4..9c996adb02b8 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -89,6 +89,26 @@ enum damos_action {
 	DAMOS_STAT,		/* Do nothing but only record the stat */
 };
 
+/**
+ * struct damos_quota - Controls the aggressiveness of the given scheme.
+ * @sz:			Maximum bytes of memory that the action can be applied.
+ * @reset_interval:	Charge reset interval in milliseconds.
+ *
+ * To avoid consuming too much CPU time or IO resources for applying the
+ * &struct damos->action to large memory, DAMON allows users to set a size
+ * quota.  The quota can be set by writing non-zero values to &sz.  If the size
+ * quota is set, DAMON tries to apply the action only up to &sz bytes within
+ * &reset_interval.
+ */
+struct damos_quota {
+	unsigned long sz;
+	unsigned long reset_interval;
+
+/* private: For charging the quota */
+	unsigned long charged_sz;
+	unsigned long charged_from;
+};
+
 /**
  * struct damos - Represents a Data Access Monitoring-based Operation Scheme.
  * @min_sz_region:	Minimum size of target regions.
@@ -98,13 +118,20 @@ enum damos_action {
  * @min_age_region:	Minimum age of target regions.
  * @max_age_region:	Maximum age of target regions.
  * @action:		&damo_action to be applied to the target regions.
+ * @quota:		Control the aggressiveness of this scheme.
  * @stat_count:		Total number of regions that this scheme is applied.
  * @stat_sz:		Total size of regions that this scheme is applied.
  * @list:		List head for siblings.
  *
- * For each aggregation interval, DAMON applies @action to monitoring target
- * regions fit in the condition and updates the statistics.  Note that both
- * the minimums and the maximums are inclusive.
+ * For each aggregation interval, DAMON finds regions which fit in the
+ * condition (&min_sz_region, &max_sz_region, &min_nr_accesses,
+ * &max_nr_accesses, &min_age_region, &max_age_region) and applies &action to
+ * those.  To avoid consuming too much CPU time or IO resources for the
+ * &action, &quota is used.
+ *
+ * After applying the &action to each region, &stat_count and &stat_sz is
+ * updated to reflect the number of regions and total size of regions that the
+ * &action is applied.
  */
 struct damos {
 	unsigned long min_sz_region;
@@ -114,6 +141,7 @@ struct damos {
 	unsigned int min_age_region;
 	unsigned int max_age_region;
 	enum damos_action action;
+	struct damos_quota quota;
 	unsigned long stat_count;
 	unsigned long stat_sz;
 	struct list_head list;
@@ -338,7 +366,7 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action);
+		enum damos_action action, struct damos_quota *quota);
 void damon_add_scheme(struct damon_ctx *ctx, struct damos *s);
 void damon_destroy_scheme(struct damos *s);
 
diff --git a/mm/damon/core.c b/mm/damon/core.c
index c45827e853cf..00804a1e5e2a 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -89,7 +89,7 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action)
+		enum damos_action action, struct damos_quota *quota)
 {
 	struct damos *scheme;
 
@@ -107,6 +107,11 @@ struct damos *damon_new_scheme(
 	scheme->stat_sz = 0;
 	INIT_LIST_HEAD(&scheme->list);
 
+	scheme->quota.sz = quota->sz;
+	scheme->quota.reset_interval = quota->reset_interval;
+	scheme->quota.charged_sz = 0;
+	scheme->quota.charged_from = 0;
+
 	return scheme;
 }
 
@@ -535,15 +540,25 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 	}
 }
 
+static void damon_split_region_at(struct damon_ctx *ctx,
+		struct damon_target *t, struct damon_region *r,
+		unsigned long sz_r);
+
 static void damon_do_apply_schemes(struct damon_ctx *c,
 				   struct damon_target *t,
 				   struct damon_region *r)
 {
 	struct damos *s;
-	unsigned long sz;
 
 	damon_for_each_scheme(s, c) {
-		sz = r->ar.end - r->ar.start;
+		struct damos_quota *quota = &s->quota;
+		unsigned long sz = r->ar.end - r->ar.start;
+
+		/* Check the quota */
+		if (quota->sz && quota->charged_sz >= quota->sz)
+			continue;
+
+		/* Check the target regions condition */
 		if (sz < s->min_sz_region || s->max_sz_region < sz)
 			continue;
 		if (r->nr_accesses < s->min_nr_accesses ||
@@ -551,22 +566,51 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 			continue;
 		if (r->age < s->min_age_region || s->max_age_region < r->age)
 			continue;
-		s->stat_count++;
-		s->stat_sz += sz;
-		if (c->primitive.apply_scheme)
+
+		/* Apply the scheme */
+		if (c->primitive.apply_scheme) {
+			if (quota->sz && quota->charged_sz + sz > quota->sz) {
+				sz = ALIGN_DOWN(quota->sz - quota->charged_sz,
+						DAMON_MIN_REGION);
+				if (!sz)
+					goto update_stat;
+				damon_split_region_at(c, t, r, sz);
+			}
 			c->primitive.apply_scheme(c, t, r, s);
+			quota->charged_sz += sz;
+		}
 		if (s->action != DAMOS_STAT)
 			r->age = 0;
+
+update_stat:
+		s->stat_count++;
+		s->stat_sz += sz;
 	}
 }
 
 static void kdamond_apply_schemes(struct damon_ctx *c)
 {
 	struct damon_target *t;
-	struct damon_region *r;
+	struct damon_region *r, *next_r;
+	struct damos *s;
+
+	damon_for_each_scheme(s, c) {
+		struct damos_quota *quota = &s->quota;
+
+		if (!quota->sz)
+			continue;
+
+		/* New charge window starts */
+		if (time_after_eq(jiffies, quota->charged_from +
+					msecs_to_jiffies(
+						quota->reset_interval))) {
+			quota->charged_from = jiffies;
+			quota->charged_sz = 0;
+		}
+	}
 
 	damon_for_each_target(t, c) {
-		damon_for_each_region(r, t)
+		damon_for_each_region_safe(r, next_r, t)
 			damon_do_apply_schemes(c, t, r);
 	}
 }
diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index c244e7689c70..ac0de2de1987 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -310,6 +310,8 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
 
 	*nr_schemes = 0;
 	while (pos < len && *nr_schemes < max_nr_schemes) {
+		struct damos_quota quota = {};
+
 		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
 				&min_age, &max_age, &action, &parsed);
@@ -322,7 +324,7 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
 
 		pos += parsed;
 		scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
-				min_age, max_age, action);
+				min_age, max_age, action, &quota);
 		if (!scheme)
 			goto fail;
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 03/15] damon/core/schemes: Skip already charged targets and regions
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
  2021-07-20 13:12 ` [RFC v3 01/15] mm/damon/paddr: Support the pageout scheme SeongJae Park
  2021-07-20 13:12 ` [RFC v3 02/15] mm/damon/damos: Make schemes aggressiveness controllable SeongJae Park
@ 2021-07-20 13:12 ` SeongJae Park
  2021-07-20 13:12 ` [RFC v3 04/15] mm/damon/schemes: Implement time quota SeongJae Park
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:12 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

If DAMOS stopped applying action to memory regions due to the speed
limit, it does nothing until next charge window starts.  Then, it starts
the work from the beginning of the address space.  If there is a huge
memory region at the beginning of the address space and it fulfills the
scheme target data access pattern, the action will applied to only the
region.

This commit mitigates the case by skipping memory regions that charged
in previous charge window at the beginning of current charge window.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h |  5 +++++
 mm/damon/core.c       | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 9c996adb02b8..7b1fa506e7a6 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -107,6 +107,8 @@ struct damos_quota {
 /* private: For charging the quota */
 	unsigned long charged_sz;
 	unsigned long charged_from;
+	struct damon_target *charge_target_from;
+	unsigned long charge_addr_from;
 };
 
 /**
@@ -335,6 +337,9 @@ struct damon_ctx {
 #define damon_prev_region(r) \
 	(container_of(r->list.prev, struct damon_region, list))
 
+#define damon_last_region(t) \
+	(list_last_entry(&t->regions_list, struct damon_region, list))
+
 #define damon_for_each_region(r, t) \
 	list_for_each_entry(r, &t->regions_list, list)
 
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 00804a1e5e2a..a41eb9d885bb 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -111,6 +111,8 @@ struct damos *damon_new_scheme(
 	scheme->quota.reset_interval = quota->reset_interval;
 	scheme->quota.charged_sz = 0;
 	scheme->quota.charged_from = 0;
+	scheme->quota.charge_target_from = NULL;
+	scheme->quota.charge_addr_from = 0;
 
 	return scheme;
 }
@@ -558,6 +560,37 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 		if (quota->sz && quota->charged_sz >= quota->sz)
 			continue;
 
+		/* Skip previously charged regions */
+		if (quota->charge_target_from) {
+			if (t != quota->charge_target_from)
+				continue;
+			if (r == damon_last_region(t)) {
+				quota->charge_target_from = NULL;
+				quota->charge_addr_from = 0;
+				continue;
+			}
+			if (quota->charge_addr_from &&
+					r->ar.end <= quota->charge_addr_from)
+				continue;
+
+			if (quota->charge_addr_from && r->ar.start <
+					quota->charge_addr_from) {
+				sz = ALIGN_DOWN(quota->charge_addr_from -
+						r->ar.start, DAMON_MIN_REGION);
+				if (!sz) {
+					if (r->ar.end - r->ar.start <=
+							DAMON_MIN_REGION)
+						continue;
+					sz = DAMON_MIN_REGION;
+				}
+				damon_split_region_at(c, t, r, sz);
+				r = damon_next_region(r);
+				sz = r->ar.end - r->ar.start;
+			}
+			quota->charge_target_from = NULL;
+			quota->charge_addr_from = 0;
+		}
+
 		/* Check the target regions condition */
 		if (sz < s->min_sz_region || s->max_sz_region < sz)
 			continue;
@@ -578,6 +611,10 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 			}
 			c->primitive.apply_scheme(c, t, r, s);
 			quota->charged_sz += sz;
+			if (quota->sz && quota->charged_sz >= quota->sz) {
+				quota->charge_target_from = t;
+				quota->charge_addr_from = r->ar.end + 1;
+			}
 		}
 		if (s->action != DAMOS_STAT)
 			r->age = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 04/15] mm/damon/schemes: Implement time quota
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (2 preceding siblings ...)
  2021-07-20 13:12 ` [RFC v3 03/15] damon/core/schemes: Skip already charged targets and regions SeongJae Park
@ 2021-07-20 13:12 ` SeongJae Park
  2021-07-20 13:12 ` [RFC v3 05/15] mm/damon/dbgfs: Support schemes' time/IO quotas SeongJae Park
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:12 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit implements time-based quota for DAMON-based Operation
Schemes.  If the quota is set, DAMOS tries to use only up to the
user-defined quota within the 'reset_interval' milliseconds.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h | 25 +++++++++++++++++++-----
 mm/damon/core.c       | 45 ++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 7b1fa506e7a6..d2dd36b9dd6c 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -91,20 +91,35 @@ enum damos_action {
 
 /**
  * struct damos_quota - Controls the aggressiveness of the given scheme.
+ * @ms:			Maximum milliseconds that the scheme can use.
  * @sz:			Maximum bytes of memory that the action can be applied.
  * @reset_interval:	Charge reset interval in milliseconds.
  *
  * To avoid consuming too much CPU time or IO resources for applying the
- * &struct damos->action to large memory, DAMON allows users to set a size
- * quota.  The quota can be set by writing non-zero values to &sz.  If the size
- * quota is set, DAMON tries to apply the action only up to &sz bytes within
- * &reset_interval.
+ * &struct damos->action to large memory, DAMON allows users to set time and/or
+ * size quotas.  The quotas can be set by writing non-zero values to &ms and
+ * &sz, respectively.  If the time quota is set, DAMON tries to use only up to
+ * &ms milliseconds within &reset_interval for applying the action.  If the
+ * size quota is set, DAMON tries to apply the action only up to &sz bytes
+ * within &reset_interval.
+ *
+ * Internally, the time quota is transformed to a size quota using estimated
+ * throughput of the scheme's action.  DAMON then compares it against &sz and
+ * uses smaller one as the effective quota.
  */
 struct damos_quota {
+	unsigned long ms;
 	unsigned long sz;
 	unsigned long reset_interval;
 
-/* private: For charging the quota */
+/* private: */
+	/* For throughput estimation */
+	unsigned long total_charged_sz;
+	unsigned long total_charged_ns;
+
+	unsigned long esz;	/* Effective size quota in bytes */
+
+	/* For charging the quota */
 	unsigned long charged_sz;
 	unsigned long charged_from;
 	struct damon_target *charge_target_from;
diff --git a/mm/damon/core.c b/mm/damon/core.c
index a41eb9d885bb..321523604ef6 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -107,8 +107,12 @@ struct damos *damon_new_scheme(
 	scheme->stat_sz = 0;
 	INIT_LIST_HEAD(&scheme->list);
 
+	scheme->quota.ms = quota->ms;
 	scheme->quota.sz = quota->sz;
 	scheme->quota.reset_interval = quota->reset_interval;
+	scheme->quota.total_charged_sz = 0;
+	scheme->quota.total_charged_ns = 0;
+	scheme->quota.esz = 0;
 	scheme->quota.charged_sz = 0;
 	scheme->quota.charged_from = 0;
 	scheme->quota.charge_target_from = NULL;
@@ -555,9 +559,10 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 	damon_for_each_scheme(s, c) {
 		struct damos_quota *quota = &s->quota;
 		unsigned long sz = r->ar.end - r->ar.start;
+		struct timespec64 begin, end;
 
 		/* Check the quota */
-		if (quota->sz && quota->charged_sz >= quota->sz)
+		if (quota->esz && quota->charged_sz >= quota->esz)
 			continue;
 
 		/* Skip previously charged regions */
@@ -602,16 +607,21 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 
 		/* Apply the scheme */
 		if (c->primitive.apply_scheme) {
-			if (quota->sz && quota->charged_sz + sz > quota->sz) {
-				sz = ALIGN_DOWN(quota->sz - quota->charged_sz,
+			if (quota->esz &&
+					quota->charged_sz + sz > quota->esz) {
+				sz = ALIGN_DOWN(quota->esz - quota->charged_sz,
 						DAMON_MIN_REGION);
 				if (!sz)
 					goto update_stat;
 				damon_split_region_at(c, t, r, sz);
 			}
+			ktime_get_coarse_ts64(&begin);
 			c->primitive.apply_scheme(c, t, r, s);
+			ktime_get_coarse_ts64(&end);
+			quota->total_charged_ns += timespec64_to_ns(&end) -
+				timespec64_to_ns(&begin);
 			quota->charged_sz += sz;
-			if (quota->sz && quota->charged_sz >= quota->sz) {
+			if (quota->esz && quota->charged_sz >= quota->esz) {
 				quota->charge_target_from = t;
 				quota->charge_addr_from = r->ar.end + 1;
 			}
@@ -625,6 +635,29 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 	}
 }
 
+/* Shouldn't be called if quota->ms and quota->sz are zero */
+static void damos_set_effective_quota(struct damos_quota *quota)
+{
+	unsigned long throughput;
+	unsigned long esz;
+
+	if (!quota->ms) {
+		quota->esz = quota->sz;
+		return;
+	}
+
+	if (quota->total_charged_ns)
+		throughput = quota->total_charged_sz * 1000000 /
+			quota->total_charged_ns;
+	else
+		throughput = PAGE_SIZE * 1024;
+	esz = throughput * quota->ms;
+
+	if (quota->sz && quota->sz < esz)
+		esz = quota->sz;
+	quota->esz = esz;
+}
+
 static void kdamond_apply_schemes(struct damon_ctx *c)
 {
 	struct damon_target *t;
@@ -634,15 +667,17 @@ static void kdamond_apply_schemes(struct damon_ctx *c)
 	damon_for_each_scheme(s, c) {
 		struct damos_quota *quota = &s->quota;
 
-		if (!quota->sz)
+		if (!quota->ms && !quota->sz)
 			continue;
 
 		/* New charge window starts */
 		if (time_after_eq(jiffies, quota->charged_from +
 					msecs_to_jiffies(
 						quota->reset_interval))) {
+			quota->total_charged_sz += quota->charged_sz;
 			quota->charged_from = jiffies;
 			quota->charged_sz = 0;
+			damos_set_effective_quota(quota);
 		}
 	}
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 05/15] mm/damon/dbgfs: Support schemes' time/IO quotas
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (3 preceding siblings ...)
  2021-07-20 13:12 ` [RFC v3 04/15] mm/damon/schemes: Implement time quota SeongJae Park
@ 2021-07-20 13:12 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 06/15] mm/damon/selftests: Support schemes quotas SeongJae Park
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:12 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit makes the debugfs interface of DAMON to support the schemes'
time/IO quotas by chaning the format of the input for the schemes file.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/damon/dbgfs.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index ac0de2de1987..3cd253a07116 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -227,11 +227,14 @@ static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
 
 	damon_for_each_scheme(s, c) {
 		rc = scnprintf(&buf[written], len - written,
-				"%lu %lu %u %u %u %u %d %lu %lu\n",
+				"%lu %lu %u %u %u %u %d %lu %lu %lu %lu %lu\n",
 				s->min_sz_region, s->max_sz_region,
 				s->min_nr_accesses, s->max_nr_accesses,
 				s->min_age_region, s->max_age_region,
-				s->action, s->stat_count, s->stat_sz);
+				s->action,
+				s->quota.ms, s->quota.sz,
+				s->quota.reset_interval,
+				s->stat_count, s->stat_sz);
 		if (!rc)
 			return -ENOMEM;
 
@@ -312,10 +315,11 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
 
-		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n",
+		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
-				&min_age, &max_age, &action, &parsed);
-		if (ret != 7)
+				&min_age, &max_age, &action, &quota.ms,
+				&quota.sz, &quota.reset_interval, &parsed);
+		if (ret != 10)
 			break;
 		if (!damos_action_valid(action)) {
 			pr_err("wrong action %d\n", action);
@@ -1137,6 +1141,15 @@ static ssize_t dbgfs_monitor_on_write(struct file *file,
 	return ret;
 }
 
+/*
+ * v1: Add the scheme speed limit
+ */
+static ssize_t dbgfs_version_read(struct file *file,
+		char __user *buf, size_t count, loff_t *ppos)
+{
+	return simple_read_from_buffer(buf, count, ppos, "1\n", 2);
+}
+
 static const struct file_operations mk_contexts_fops = {
 	.write = dbgfs_mk_context_write,
 };
@@ -1150,13 +1163,18 @@ static const struct file_operations monitor_on_fops = {
 	.write = dbgfs_monitor_on_write,
 };
 
+static const struct file_operations version_fops = {
+	.owner = THIS_MODULE,
+	.read = dbgfs_version_read,
+};
+
 static int __init __damon_dbgfs_init(void)
 {
 	struct dentry *dbgfs_root;
 	const char * const file_names[] = {"mk_contexts", "rm_contexts",
-		"monitor_on"};
+		"monitor_on", "version"};
 	const struct file_operations *fops[] = {&mk_contexts_fops,
-		&rm_contexts_fops, &monitor_on_fops};
+		&rm_contexts_fops, &monitor_on_fops, &version_fops};
 	int i;
 
 	dbgfs_root = debugfs_create_dir("damon", NULL);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 06/15] mm/damon/selftests: Support schemes quotas
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (4 preceding siblings ...)
  2021-07-20 13:12 ` [RFC v3 05/15] mm/damon/dbgfs: Support schemes' time/IO quotas SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 07/15] mm/damon/schemes: Prioritize regions within the quotas SeongJae Park
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit updates DAMON selftests to support updated schemes debugfs
file format for the quotas.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 tools/testing/selftests/damon/debugfs_attrs.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/damon/debugfs_attrs.sh b/tools/testing/selftests/damon/debugfs_attrs.sh
index 63908a4c397b..b6a09ed6a1fc 100755
--- a/tools/testing/selftests/damon/debugfs_attrs.sh
+++ b/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -78,10 +78,10 @@ echo "$orig_content" > "$file"
 file="$DBGFS/schemes"
 orig_content=$(cat "$file")
 
-test_write_succ "$file" "1 2 3 4 5 6 4" \
+test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0" \
 	"$orig_content" "valid input"
 test_write_fail "$file" "1 2
-3 4 5 6 3" "$orig_content" "multi lines"
+3 4 5 6 3 0 0 0" "$orig_content" "multi lines"
 test_write_succ "$file" "" "$orig_content" "disabling"
 echo "$orig_content" > "$file"
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 07/15] mm/damon/schemes: Prioritize regions within the quotas
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (5 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 06/15] mm/damon/selftests: Support schemes quotas SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 08/15] mm/damon/vaddr,paddr: Support pageout prioritization SeongJae Park
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit makes DAMON to apply schemes to regions having higher
priority first, if it cannot apply schemes to all regions due to the
quotas.

The prioritization function should be implemented in each monitoring
primitive.  Those would commonly calculate the priority of the region
using attributes of regions, namely 'size', 'nr_accesses', and 'age'.
For example, some primitive would calculate the priority of each region
using a weighted sum of 'nr_accesses' and 'age' of the region.

The optimal weights would depend on give environments, so this commit
allows it to be customizable.  Nevertheless, the score calculation
functions are only encouraged to respect the weights, not mandated.  So,
the customization might not work for some primitives.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h | 26 ++++++++++++++++++
 mm/damon/core.c       | 62 ++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index d2dd36b9dd6c..ddea4b4f3bfa 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -14,6 +14,8 @@
 
 /* Minimal region size.  Every damon_region is aligned by this. */
 #define DAMON_MIN_REGION	PAGE_SIZE
+/* Max priority score for DAMON-based operation schemes */
+#define DAMOS_MAX_SCORE		(99)
 
 /**
  * struct damon_addr_range - Represents an address region of [@start, @end).
@@ -95,6 +97,10 @@ enum damos_action {
  * @sz:			Maximum bytes of memory that the action can be applied.
  * @reset_interval:	Charge reset interval in milliseconds.
  *
+ * @weight_sz:		Weight of the region's size for prioritization.
+ * @weight_nr_accesses:	Weight of the region's nr_accesses for prioritization.
+ * @weight_age:		Weight of the region's age for prioritization.
+ *
  * To avoid consuming too much CPU time or IO resources for applying the
  * &struct damos->action to large memory, DAMON allows users to set time and/or
  * size quotas.  The quotas can be set by writing non-zero values to &ms and
@@ -106,12 +112,22 @@ enum damos_action {
  * Internally, the time quota is transformed to a size quota using estimated
  * throughput of the scheme's action.  DAMON then compares it against &sz and
  * uses smaller one as the effective quota.
+ *
+ * For selecting regions within the quota, DAMON prioritizes current scheme's
+ * target memory regions using the &struct damon_primitive->get_scheme_score.
+ * You could customize the prioritization logic by setting &weight_sz,
+ * &weight_nr_accesses, and &weight_age, because monitoring primitives are
+ * encouraged to respect those.
  */
 struct damos_quota {
 	unsigned long ms;
 	unsigned long sz;
 	unsigned long reset_interval;
 
+	unsigned int weight_sz;
+	unsigned int weight_nr_accesses;
+	unsigned int weight_age;
+
 /* private: */
 	/* For throughput estimation */
 	unsigned long total_charged_sz;
@@ -124,6 +140,10 @@ struct damos_quota {
 	unsigned long charged_from;
 	struct damon_target *charge_target_from;
 	unsigned long charge_addr_from;
+
+	/* For prioritization */
+	unsigned long histogram[DAMOS_MAX_SCORE + 1];
+	unsigned int min_score;
 };
 
 /**
@@ -174,6 +194,7 @@ struct damon_ctx;
  * @prepare_access_checks:	Prepare next access check of target regions.
  * @check_accesses:		Check the accesses to target regions.
  * @reset_aggregated:		Reset aggregated accesses monitoring results.
+ * @get_scheme_score:		Get the score of a region for a scheme.
  * @apply_scheme:		Apply a DAMON-based operation scheme.
  * @target_valid:		Determine if the target is valid.
  * @cleanup:			Clean up the context.
@@ -201,6 +222,8 @@ struct damon_ctx;
  * of its update.  The value will be used for regions adjustment threshold.
  * @reset_aggregated should reset the access monitoring results that aggregated
  * by @check_accesses.
+ * @get_scheme_score should return the priority score of a region for a scheme
+ * as an integer in [0, &DAMOS_MAX_SCORE].
  * @apply_scheme is called from @kdamond when a region for user provided
  * DAMON-based operation scheme is found.  It should apply the scheme's action
  * to the region.  This is not used for &DAMON_ARBITRARY_TARGET case.
@@ -215,6 +238,9 @@ struct damon_primitive {
 	void (*prepare_access_checks)(struct damon_ctx *context);
 	unsigned int (*check_accesses)(struct damon_ctx *context);
 	void (*reset_aggregated)(struct damon_ctx *context);
+	int (*get_scheme_score)(struct damon_ctx *context,
+			struct damon_target *t, struct damon_region *r,
+			struct damos *scheme);
 	int (*apply_scheme)(struct damon_ctx *context, struct damon_target *t,
 			struct damon_region *r, struct damos *scheme);
 	bool (*target_valid)(void *target);
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 321523604ef6..4e8b44fe02a0 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -12,6 +12,7 @@
 #include <linux/kthread.h>
 #include <linux/random.h>
 #include <linux/slab.h>
+#include <linux/string.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/damon.h>
@@ -110,6 +111,9 @@ struct damos *damon_new_scheme(
 	scheme->quota.ms = quota->ms;
 	scheme->quota.sz = quota->sz;
 	scheme->quota.reset_interval = quota->reset_interval;
+	scheme->quota.weight_sz = quota->weight_sz;
+	scheme->quota.weight_nr_accesses = quota->weight_nr_accesses;
+	scheme->quota.weight_age = quota->weight_age;
 	scheme->quota.total_charged_sz = 0;
 	scheme->quota.total_charged_ns = 0;
 	scheme->quota.esz = 0;
@@ -550,6 +554,28 @@ static void damon_split_region_at(struct damon_ctx *ctx,
 		struct damon_target *t, struct damon_region *r,
 		unsigned long sz_r);
 
+static bool __damos_valid_target(struct damon_region *r, struct damos *s)
+{
+	unsigned long sz;
+
+	sz = r->ar.end - r->ar.start;
+	return s->min_sz_region <= sz && sz <= s->max_sz_region &&
+		s->min_nr_accesses <= r->nr_accesses &&
+		r->nr_accesses <= s->max_nr_accesses &&
+		s->min_age_region <= r->age && r->age <= s->max_age_region;
+}
+
+static bool damos_valid_target(struct damon_ctx *c, struct damon_target *t,
+		struct damon_region *r, struct damos *s)
+{
+	bool ret = __damos_valid_target(r, s);
+
+	if (!ret || !s->quota.esz || !c->primitive.get_scheme_score)
+		return ret;
+
+	return c->primitive.get_scheme_score(c, t, r, s) >= s->quota.min_score;
+}
+
 static void damon_do_apply_schemes(struct damon_ctx *c,
 				   struct damon_target *t,
 				   struct damon_region *r)
@@ -596,13 +622,7 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 			quota->charge_addr_from = 0;
 		}
 
-		/* Check the target regions condition */
-		if (sz < s->min_sz_region || s->max_sz_region < sz)
-			continue;
-		if (r->nr_accesses < s->min_nr_accesses ||
-				s->max_nr_accesses < r->nr_accesses)
-			continue;
-		if (r->age < s->min_age_region || s->max_age_region < r->age)
+		if (!damos_valid_target(c, t, r, s))
 			continue;
 
 		/* Apply the scheme */
@@ -666,6 +686,8 @@ static void kdamond_apply_schemes(struct damon_ctx *c)
 
 	damon_for_each_scheme(s, c) {
 		struct damos_quota *quota = &s->quota;
+		unsigned long cumulated_sz;
+		unsigned int score, max_score = 0;
 
 		if (!quota->ms && !quota->sz)
 			continue;
@@ -679,6 +701,32 @@ static void kdamond_apply_schemes(struct damon_ctx *c)
 			quota->charged_sz = 0;
 			damos_set_effective_quota(quota);
 		}
+
+		if (!c->primitive.get_scheme_score)
+			continue;
+
+		/* Fill up the score histogram */
+		memset(quota->histogram, 0, sizeof(quota->histogram));
+		damon_for_each_target(t, c) {
+			damon_for_each_region(r, t) {
+				if (!__damos_valid_target(r, s))
+					continue;
+				score = c->primitive.get_scheme_score(
+						c, t, r, s);
+				quota->histogram[score] +=
+					r->ar.end - r->ar.start;
+				if (score > max_score)
+					max_score = score;
+			}
+		}
+
+		/* Set the min score limit */
+		for (cumulated_sz = 0, score = max_score; ; score--) {
+			cumulated_sz += quota->histogram[score];
+			if (cumulated_sz >= quota->esz || !score)
+				break;
+		}
+		quota->min_score = score;
 	}
 
 	damon_for_each_target(t, c) {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 08/15] mm/damon/vaddr,paddr: Support pageout prioritization
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (6 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 07/15] mm/damon/schemes: Prioritize regions within the quotas SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 09/15] mm/damon/dbgfs: Support prioritization weights SeongJae Park
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit makes the default monitoring primitives for virtual address
spaces and the physical address sapce to support memory regions
prioritization for 'PAGEOUT' DAMOS action.  It calculates hotness of
each region as weighted sum of 'nr_accesses' and 'age' of the region and
get the priority score as reverse of the hotness, so that cold regions
can be paged out first.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h   |  4 ++++
 mm/damon/paddr.c        | 14 +++++++++++++
 mm/damon/prmtv-common.c | 46 +++++++++++++++++++++++++++++++++++++++++
 mm/damon/prmtv-common.h |  3 +++
 mm/damon/vaddr.c        | 15 ++++++++++++++
 5 files changed, 82 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index ddea4b4f3bfa..f56f04b63b1c 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -449,6 +449,8 @@ bool damon_va_target_valid(void *t);
 void damon_va_cleanup(struct damon_ctx *ctx);
 int damon_va_apply_scheme(struct damon_ctx *context, struct damon_target *t,
 		struct damon_region *r, struct damos *scheme);
+int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme);
 void damon_va_set_primitives(struct damon_ctx *ctx);
 
 #endif	/* CONFIG_DAMON_VADDR */
@@ -459,6 +461,8 @@ void damon_va_set_primitives(struct damon_ctx *ctx);
 void damon_pa_prepare_access_checks(struct damon_ctx *ctx);
 unsigned int damon_pa_check_accesses(struct damon_ctx *ctx);
 bool damon_pa_target_valid(void *t);
+int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme);
 void damon_pa_set_primitives(struct damon_ctx *ctx);
 
 #endif	/* CONFIG_DAMON_PADDR */
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 303db372e53b..99a579e8d046 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -121,6 +121,19 @@ int damon_pa_apply_scheme(struct damon_ctx *ctx, struct damon_target *t,
 	return 0;
 }
 
+int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme)
+{
+	switch (scheme->action) {
+	case DAMOS_PAGEOUT:
+		return damon_pageout_score(context, r, scheme);
+	default:
+		break;
+	}
+
+	return DAMOS_MAX_SCORE;
+}
+
 void damon_pa_set_primitives(struct damon_ctx *ctx)
 {
 	ctx->primitive.init = NULL;
@@ -131,4 +144,5 @@ void damon_pa_set_primitives(struct damon_ctx *ctx)
 	ctx->primitive.target_valid = damon_pa_target_valid;
 	ctx->primitive.cleanup = NULL;
 	ctx->primitive.apply_scheme = damon_pa_apply_scheme;
+	ctx->primitive.get_scheme_score = damon_pa_scheme_score;
 }
diff --git a/mm/damon/prmtv-common.c b/mm/damon/prmtv-common.c
index 01c1c1b37859..1df31d1f8590 100644
--- a/mm/damon/prmtv-common.c
+++ b/mm/damon/prmtv-common.c
@@ -236,3 +236,49 @@ bool damon_pa_young(unsigned long paddr, unsigned long *page_sz)
 	*page_sz = result.page_sz;
 	return result.accessed;
 }
+
+#define DAMON_MAX_SUBSCORE	(100)
+#define DAMON_MAX_AGE_IN_LOG	(32)
+
+int damon_pageout_score(struct damon_ctx *c, struct damon_region *r,
+			struct damos *s)
+{
+	unsigned int max_nr_accesses;
+	int freq_subscore;
+	unsigned int age_in_sec;
+	int age_in_log, age_subscore;
+	unsigned int freq_weight = s->quota.weight_nr_accesses;
+	unsigned int age_weight = s->quota.weight_age;
+	int hotness;
+
+	max_nr_accesses = c->aggr_interval / c->sample_interval;
+	freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / max_nr_accesses;
+
+	age_in_sec = (unsigned long)r->age * c->aggr_interval / 1000000;
+	for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec;
+			age_in_log++, age_in_sec >>= 1)
+		;
+
+	/* If frequency is 0, higher age means it's colder */
+	if (freq_subscore == 0)
+		age_in_log *= -1;
+
+	/*
+	 * Now age_in_log is in [-DAMON_MAX_AGE_IN_LOG, DAMON_MAX_AGE_IN_LOG].
+	 * Scale it to be in [0, 100] and set it as age subscore.
+	 */
+	age_in_log += DAMON_MAX_AGE_IN_LOG;
+	age_subscore = age_in_log * DAMON_MAX_SUBSCORE /
+		DAMON_MAX_AGE_IN_LOG / 2;
+
+	hotness = (freq_weight * freq_subscore + age_weight * age_subscore);
+	if (freq_weight + age_weight)
+		hotness /= freq_weight + age_weight;
+	/*
+	 * Transform it to fit in [0, DAMOS_MAX_SCORE]
+	 */
+	hotness = hotness * DAMOS_MAX_SCORE / DAMON_MAX_SUBSCORE;
+
+	/* Return coldness of the region */
+	return DAMOS_MAX_SCORE - hotness;
+}
diff --git a/mm/damon/prmtv-common.h b/mm/damon/prmtv-common.h
index ba0c4eecbb79..b27c4e94917e 100644
--- a/mm/damon/prmtv-common.h
+++ b/mm/damon/prmtv-common.h
@@ -26,3 +26,6 @@ bool damon_va_young(struct mm_struct *mm, unsigned long addr,
 
 void damon_pa_mkold(unsigned long paddr);
 bool damon_pa_young(unsigned long paddr, unsigned long *page_sz);
+
+int damon_pageout_score(struct damon_ctx *c, struct damon_region *r,
+			struct damos *s);
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index ee4cbe12ac30..38d5434315e4 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -515,6 +515,20 @@ int damon_va_apply_scheme(struct damon_ctx *ctx, struct damon_target *t,
 	return damos_madvise(t, r, madv_action);
 }
 
+int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme)
+{
+
+	switch (scheme->action) {
+	case DAMOS_PAGEOUT:
+		return damon_pageout_score(context, r, scheme);
+	default:
+		break;
+	}
+
+	return DAMOS_MAX_SCORE;
+}
+
 void damon_va_set_primitives(struct damon_ctx *ctx)
 {
 	ctx->primitive.init = damon_va_init;
@@ -525,6 +539,7 @@ void damon_va_set_primitives(struct damon_ctx *ctx)
 	ctx->primitive.target_valid = damon_va_target_valid;
 	ctx->primitive.cleanup = NULL;
 	ctx->primitive.apply_scheme = damon_va_apply_scheme;
+	ctx->primitive.get_scheme_score = damon_va_scheme_score;
 }
 
 #include "vaddr-test.h"
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 09/15] mm/damon/dbgfs: Support prioritization weights
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (7 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 08/15] mm/damon/vaddr,paddr: Support pageout prioritization SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 10/15] tools/selftests/damon: Update for regions prioritization of schemes SeongJae Park
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit allows DAMON debugfs interface users set the prioritization
weights by putting three more numbers to the 'schemes' file.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/damon/dbgfs.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index 3cd253a07116..93ac15cb5786 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -227,13 +227,16 @@ static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
 
 	damon_for_each_scheme(s, c) {
 		rc = scnprintf(&buf[written], len - written,
-				"%lu %lu %u %u %u %u %d %lu %lu %lu %lu %lu\n",
+				"%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %lu %lu\n",
 				s->min_sz_region, s->max_sz_region,
 				s->min_nr_accesses, s->max_nr_accesses,
 				s->min_age_region, s->max_age_region,
 				s->action,
 				s->quota.ms, s->quota.sz,
 				s->quota.reset_interval,
+				s->quota.weight_sz,
+				s->quota.weight_nr_accesses,
+				s->quota.weight_age,
 				s->stat_count, s->stat_sz);
 		if (!rc)
 			return -ENOMEM;
@@ -315,11 +318,14 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
 
-		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu%n",
+		ret = sscanf(&str[pos],
+				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
 				&min_age, &max_age, &action, &quota.ms,
-				&quota.sz, &quota.reset_interval, &parsed);
-		if (ret != 10)
+				&quota.sz, &quota.reset_interval,
+				&quota.weight_sz, &quota.weight_nr_accesses,
+				&quota.weight_age, &parsed);
+		if (ret != 13)
 			break;
 		if (!damos_action_valid(action)) {
 			pr_err("wrong action %d\n", action);
@@ -1147,7 +1153,7 @@ static ssize_t dbgfs_monitor_on_write(struct file *file,
 static ssize_t dbgfs_version_read(struct file *file,
 		char __user *buf, size_t count, loff_t *ppos)
 {
-	return simple_read_from_buffer(buf, count, ppos, "1\n", 2);
+	return simple_read_from_buffer(buf, count, ppos, "2\n", 2);
 }
 
 static const struct file_operations mk_contexts_fops = {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 10/15] tools/selftests/damon: Update for regions prioritization of schemes
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (8 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 09/15] mm/damon/dbgfs: Support prioritization weights SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 11/15] mm/damon/schemes: Activate schemes based on a watermarks mechanism SeongJae Park
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit updates the DAMON selftests for 'schemes' debugfs file, as
the file format is updated.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 tools/testing/selftests/damon/debugfs_attrs.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/damon/debugfs_attrs.sh b/tools/testing/selftests/damon/debugfs_attrs.sh
index b6a09ed6a1fc..4c69d8709e3b 100755
--- a/tools/testing/selftests/damon/debugfs_attrs.sh
+++ b/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -78,10 +78,10 @@ echo "$orig_content" > "$file"
 file="$DBGFS/schemes"
 orig_content=$(cat "$file")
 
-test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0" \
+test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3" \
 	"$orig_content" "valid input"
 test_write_fail "$file" "1 2
-3 4 5 6 3 0 0 0" "$orig_content" "multi lines"
+3 4 5 6 3 0 0 0 1 2 3" "$orig_content" "multi lines"
 test_write_succ "$file" "" "$orig_content" "disabling"
 echo "$orig_content" > "$file"
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 11/15] mm/damon/schemes: Activate schemes based on a watermarks mechanism
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (9 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 10/15] tools/selftests/damon: Update for regions prioritization of schemes SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 12/15] mm/damon/dbgfs: Support watermarks SeongJae Park
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

DAMON-based operation schemes need to be manually turned on and off.  In
some use cases, however, the condition for turning a scheme on and off
would depend on the system's situation.  For example, schemes for
proactive pages reclamation would need to be turned on when some memory
pressure is detected, and turned off when the system has enough free
memory.

For easier control of schemes activation based on the system situation,
this commit introduces a watermarks-based mechanism.  The client can
describe the watermark metric (e.g., amount of free memory in the
system), watermark check interval, and three watermarks, namely high,
mid, and low.  If the scheme is deactivated, it only gets the metric and
compare that to the three watermarks for every check interval.  If the
metric is higher than the high watermark, the scheme is deactivated.  If
the metric is between the mid watermark and the low watermark, the
scheme is activated.  If the metric is lower than the low watermark, the
scheme is deactivated again.  This is to allow users fall back to
traditional page-granularity mechanisms.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 include/linux/damon.h | 52 ++++++++++++++++++++++-
 mm/damon/core.c       | 97 ++++++++++++++++++++++++++++++++++++++++++-
 mm/damon/dbgfs.c      |  5 ++-
 3 files changed, 151 insertions(+), 3 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index f56f04b63b1c..b33a5ee94f62 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -146,6 +146,45 @@ struct damos_quota {
 	unsigned int min_score;
 };
 
+/**
+ * enum damos_wmark_metric - Represents the watermark metric.
+ *
+ * @DAMOS_WMARK_NONE:		Ignore the watermarks of the given scheme.
+ * @DAMOS_WMARK_FREE_MEM_RATE:	Free memory rate of the system in [0,1000].
+ */
+enum damos_wmark_metric {
+	DAMOS_WMARK_NONE,
+	DAMOS_WMARK_FREE_MEM_RATE,
+};
+
+/**
+ * struct damos_watermarks - Controls when a given scheme should be activated.
+ * @metric:	Metric for the watermarks.
+ * @interval:	Watermarks check time interval in microseconds.
+ * @high:	High watermark.
+ * @mid:	Middle watermark.
+ * @low:	Low watermark.
+ *
+ * If &metric is &DAMOS_WMARK_NONE, the scheme is always active.  Being active
+ * means DAMON does monitoring and applying the action of the scheme to
+ * appropriate memory regions.  Else, DAMON checks &metric of the system for at
+ * least every &interval microseconds and works as below.
+ *
+ * If &metric is higher than &high, the scheme is inactivated.  If &metric is
+ * between &mid and &low, the scheme is activated.  If &metric is lower than
+ * &low, the scheme is inactivated.
+ */
+struct damos_watermarks {
+	enum damos_wmark_metric metric;
+	unsigned long interval;
+	unsigned long high;
+	unsigned long mid;
+	unsigned long low;
+
+/* private: */
+	bool activated;
+};
+
 /**
  * struct damos - Represents a Data Access Monitoring-based Operation Scheme.
  * @min_sz_region:	Minimum size of target regions.
@@ -156,6 +195,7 @@ struct damos_quota {
  * @max_age_region:	Maximum age of target regions.
  * @action:		&damo_action to be applied to the target regions.
  * @quota:		Control the aggressiveness of this scheme.
+ * @wmarks:		Watermarks for automated (in)activation of this scheme.
  * @stat_count:		Total number of regions that this scheme is applied.
  * @stat_sz:		Total size of regions that this scheme is applied.
  * @list:		List head for siblings.
@@ -166,6 +206,14 @@ struct damos_quota {
  * those.  To avoid consuming too much CPU time or IO resources for the
  * &action, &quota is used.
  *
+ * To do the work only when needed, schemes can be activated for specific
+ * system situations using &wmarks.  If all schemes that registered to the
+ * monitoring context are inactive, DAMON stops monitoring either, and just
+ * repeatedly checks the watermarks.
+ *
+ * If all schemes that registered to a &struct damon_ctx are inactive, DAMON
+ * stops monitoring and just repeatedly checks the watermarks.
+ *
  * After applying the &action to each region, &stat_count and &stat_sz is
  * updated to reflect the number of regions and total size of regions that the
  * &action is applied.
@@ -179,6 +227,7 @@ struct damos {
 	unsigned int max_age_region;
 	enum damos_action action;
 	struct damos_quota quota;
+	struct damos_watermarks wmarks;
 	unsigned long stat_count;
 	unsigned long stat_sz;
 	struct list_head list;
@@ -412,7 +461,8 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action, struct damos_quota *quota);
+		enum damos_action action, struct damos_quota *quota,
+		struct damos_watermarks *wmarks);
 void damon_add_scheme(struct damon_ctx *ctx, struct damos *s);
 void damon_destroy_scheme(struct damos *s);
 
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 4e8b44fe02a0..5b578f65c109 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -10,6 +10,7 @@
 #include <linux/damon.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
+#include <linux/mm.h>
 #include <linux/random.h>
 #include <linux/slab.h>
 #include <linux/string.h>
@@ -90,7 +91,8 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action, struct damos_quota *quota)
+		enum damos_action action, struct damos_quota *quota,
+		struct damos_watermarks *wmarks)
 {
 	struct damos *scheme;
 
@@ -122,6 +124,13 @@ struct damos *damon_new_scheme(
 	scheme->quota.charge_target_from = NULL;
 	scheme->quota.charge_addr_from = 0;
 
+	scheme->wmarks.metric = wmarks->metric;
+	scheme->wmarks.interval = wmarks->interval;
+	scheme->wmarks.high = wmarks->high;
+	scheme->wmarks.mid = wmarks->mid;
+	scheme->wmarks.low = wmarks->low;
+	scheme->wmarks.activated = true;
+
 	return scheme;
 }
 
@@ -587,6 +596,9 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
 		unsigned long sz = r->ar.end - r->ar.start;
 		struct timespec64 begin, end;
 
+		if (!s->wmarks.activated)
+			continue;
+
 		/* Check the quota */
 		if (quota->esz && quota->charged_sz >= quota->esz)
 			continue;
@@ -689,6 +701,9 @@ static void kdamond_apply_schemes(struct damon_ctx *c)
 		unsigned long cumulated_sz;
 		unsigned int score, max_score = 0;
 
+		if (!s->wmarks.activated)
+			continue;
+
 		if (!quota->ms && !quota->sz)
 			continue;
 
@@ -932,6 +947,83 @@ static bool kdamond_need_stop(struct damon_ctx *ctx)
 	return true;
 }
 
+static unsigned long damos_wmark_metric_value(enum damos_wmark_metric metric)
+{
+	struct sysinfo i;
+
+	switch (metric) {
+	case DAMOS_WMARK_FREE_MEM_RATE:
+		si_meminfo(&i);
+		return i.freeram * 1000 / i.totalram;
+	default:
+		break;
+	}
+	return -EINVAL;
+}
+
+/*
+ * Returns zero if the scheme is active.  Else, returns time to wait for next
+ * watermark check in micro-seconds.
+ */
+static unsigned long damos_wmark_wait_us(struct damos *scheme)
+{
+	unsigned long metric;
+
+	if (scheme->wmarks.metric == DAMOS_WMARK_NONE)
+		return 0;
+
+	metric = damos_wmark_metric_value(scheme->wmarks.metric);
+	/* higher than high watermark or lower than low watermark */
+	if (metric > scheme->wmarks.high || scheme->wmarks.low > metric) {
+		if (scheme->wmarks.activated)
+			pr_info("inactivate a scheme (%d) for %s wmark\n",
+					scheme->action,
+					metric > scheme->wmarks.high ?
+					"high" : "low");
+		scheme->wmarks.activated = false;
+		return scheme->wmarks.interval;
+	}
+
+	/* inactive and higher than middle watermark */
+	if ((scheme->wmarks.high >= metric && metric >= scheme->wmarks.mid) &&
+			!scheme->wmarks.activated)
+		return scheme->wmarks.interval;
+
+	if (!scheme->wmarks.activated)
+		pr_info("activate a scheme (%d)\n", scheme->action);
+	scheme->wmarks.activated = true;
+	return 0;
+}
+
+static void kdamond_usleep(unsigned long usecs)
+{
+	if (usecs > 100 * 1000)
+		schedule_timeout_interruptible(usecs_to_jiffies(usecs));
+	else
+		usleep_range(usecs, usecs + 1);
+}
+
+/* Returns negative error code if it's not activated but should return */
+static int kdamond_wait_activation(struct damon_ctx *ctx)
+{
+	struct damos *s;
+	unsigned long wait_time;
+	unsigned long min_wait_time = 0;
+
+	while (!kdamond_need_stop(ctx)) {
+		damon_for_each_scheme(s, ctx) {
+			wait_time = damos_wmark_wait_us(s);
+			if (!min_wait_time || wait_time < min_wait_time)
+				min_wait_time = wait_time;
+		}
+		if (!min_wait_time)
+			return 0;
+
+		kdamond_usleep(min_wait_time);
+	}
+	return -EBUSY;
+}
+
 static void set_kdamond_stop(struct damon_ctx *ctx)
 {
 	mutex_lock(&ctx->kdamond_lock);
@@ -962,6 +1054,9 @@ static int kdamond_fn(void *data)
 	sz_limit = damon_region_sz_limit(ctx);
 
 	while (!kdamond_need_stop(ctx)) {
+		if (kdamond_wait_activation(ctx))
+			continue;
+
 		if (ctx->primitive.prepare_access_checks)
 			ctx->primitive.prepare_access_checks(ctx);
 		if (ctx->callback.after_sampling &&
diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index 93ac15cb5786..2509c0b634b4 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -317,6 +317,9 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
 	*nr_schemes = 0;
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
+		struct damos_watermarks wmarks = {
+			.metric = DAMOS_WMARK_NONE,
+		};
 
 		ret = sscanf(&str[pos],
 				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n",
@@ -334,7 +337,7 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
 
 		pos += parsed;
 		scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
-				min_age, max_age, action, &quota);
+				min_age, max_age, action, &quota, &wmarks);
 		if (!scheme)
 			goto fail;
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 12/15] mm/damon/dbgfs: Support watermarks
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (10 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 11/15] mm/damon/schemes: Activate schemes based on a watermarks mechanism SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 13/15] selftests/damon: " SeongJae Park
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit updates DAMON debugfs interface to support the watermarks
based schemes activation.  For this, now 'schemes' file receives five
more values.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/damon/dbgfs.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index 2509c0b634b4..168bd6406a3f 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -227,7 +227,7 @@ static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
 
 	damon_for_each_scheme(s, c) {
 		rc = scnprintf(&buf[written], len - written,
-				"%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %lu %lu\n",
+				"%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %d %lu %lu %lu %lu %lu %lu\n",
 				s->min_sz_region, s->max_sz_region,
 				s->min_nr_accesses, s->max_nr_accesses,
 				s->min_age_region, s->max_age_region,
@@ -237,6 +237,8 @@ static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
 				s->quota.weight_sz,
 				s->quota.weight_nr_accesses,
 				s->quota.weight_age,
+				s->wmarks.metric, s->wmarks.interval,
+				s->wmarks.high, s->wmarks.mid, s->wmarks.low,
 				s->stat_count, s->stat_sz);
 		if (!rc)
 			return -ENOMEM;
@@ -317,18 +319,18 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
 	*nr_schemes = 0;
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
-		struct damos_watermarks wmarks = {
-			.metric = DAMOS_WMARK_NONE,
-		};
+		struct damos_watermarks wmarks;
 
 		ret = sscanf(&str[pos],
-				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n",
+				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u %u %lu %lu %lu %lu%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
 				&min_age, &max_age, &action, &quota.ms,
 				&quota.sz, &quota.reset_interval,
 				&quota.weight_sz, &quota.weight_nr_accesses,
-				&quota.weight_age, &parsed);
-		if (ret != 13)
+				&quota.weight_age, &wmarks.metric,
+				&wmarks.interval, &wmarks.high, &wmarks.mid,
+				&wmarks.low, &parsed);
+		if (ret != 18)
 			break;
 		if (!damos_action_valid(action)) {
 			pr_err("wrong action %d\n", action);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 13/15] selftests/damon: Support watermarks
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (11 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 12/15] mm/damon/dbgfs: Support watermarks SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 14/15] mm/damon: Introduce DAMON-based reclamation SeongJae Park
  2021-07-20 13:13 ` [RFC v3 15/15] Documentation/admin-guide/mm/damon: Add a document for DAMON_RECLAIM SeongJae Park
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit updates DAMON selftests for 'schemes' debugfs file to
reflect the changes in the format.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 tools/testing/selftests/damon/debugfs_attrs.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/damon/debugfs_attrs.sh b/tools/testing/selftests/damon/debugfs_attrs.sh
index 4c69d8709e3b..154f9f76f84a 100755
--- a/tools/testing/selftests/damon/debugfs_attrs.sh
+++ b/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -78,10 +78,10 @@ echo "$orig_content" > "$file"
 file="$DBGFS/schemes"
 orig_content=$(cat "$file")
 
-test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3" \
+test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3 1 100 3 2 1" \
 	"$orig_content" "valid input"
 test_write_fail "$file" "1 2
-3 4 5 6 3 0 0 0 1 2 3" "$orig_content" "multi lines"
+3 4 5 6 3 0 0 0 1 2 3 1 100 3 2 1" "$orig_content" "multi lines"
 test_write_succ "$file" "" "$orig_content" "disabling"
 echo "$orig_content" > "$file"
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 14/15] mm/damon: Introduce DAMON-based reclamation
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (12 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 13/15] selftests/damon: " SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  2021-07-20 13:13 ` [RFC v3 15/15] Documentation/admin-guide/mm/damon: Add a document for DAMON_RECLAIM SeongJae Park
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit implements a new kernel subsystem that finds cold memory
regions using DAMON and reclaims those immediately.  It is intended to
be used as proactive lightweigh reclamation logic for light memory
pressure.  For heavy memory pressure, it could be inactivated and fall
back to the traditional page-scanning based reclamation.

It's implemented on top of DAMON framework to use the DAMON-based
Operation Schemes (DAMOS) feature.  It utilizes all the DAMOS features
including speed limit, prioritization, and watermarks.

It could be enabled and tuned in build time via the kernel
configuration, in boot time via the kernel boot parameter, and in run
time via its module parameter ('/sys/module/damon_reclaim/parameters/')
interface.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 mm/damon/Kconfig   |  12 ++
 mm/damon/Makefile  |   1 +
 mm/damon/reclaim.c | 354 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 367 insertions(+)
 create mode 100644 mm/damon/reclaim.c

diff --git a/mm/damon/Kconfig b/mm/damon/Kconfig
index eeefb5b633b6..d9cd88810279 100644
--- a/mm/damon/Kconfig
+++ b/mm/damon/Kconfig
@@ -84,4 +84,16 @@ config DAMON_DBGFS_KUNIT_TEST
 
 	  If unsure, say N.
 
+config DAMON_RECLAIM
+	bool "Build DAMON-based reclaim (DAMON_RECLAIM)"
+	depends on DAMON_PADDR
+	help
+	  This builds the DAMON-based reclamation subsystem.  It finds pages
+	  that not accessed for a long time (cold) using DAMON and reclaim
+	  those if enabled.
+
+	  This is suggested to be used as a proactive and lightweight
+	  reclamation under light memory pressure, while the traditional page
+	  scanning-based reclamation is used for heavy pressure.
+
 endmenu
diff --git a/mm/damon/Makefile b/mm/damon/Makefile
index 017799e5670a..39433e7d570c 100644
--- a/mm/damon/Makefile
+++ b/mm/damon/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_DAMON_VADDR)	+= prmtv-common.o vaddr.o
 obj-$(CONFIG_DAMON_PADDR)	+= prmtv-common.o paddr.o
 obj-$(CONFIG_DAMON_PGIDLE)	+= prmtv-common.o pgidle.o
 obj-$(CONFIG_DAMON_DBGFS)	+= dbgfs.o
+obj-$(CONFIG_DAMON_RECLAIM)	+= reclaim.o
diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
new file mode 100644
index 000000000000..147fd1304b7b
--- /dev/null
+++ b/mm/damon/reclaim.c
@@ -0,0 +1,354 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DAMON-based page reclamation
+ *
+ * Author: SeongJae Park <sjpark@amazon.de>
+ */
+
+#define pr_fmt(fmt) "damon-reclaim: " fmt
+
+#include <linux/damon.h>
+#include <linux/ioport.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/workqueue.h>
+
+#ifdef MODULE_PARAM_PREFIX
+#undef MODULE_PARAM_PREFIX
+#endif
+#define MODULE_PARAM_PREFIX "damon_reclaim."
+
+/*
+ * Enable or disable DAMON_RECLAIM.
+ *
+ * You can enable DAMON_RCLAIM by setting the value of this parameter as ``Y``.
+ * Setting it as ``N`` disables DAMON_RECLAIM.  Note that DAMON_RECLAIM could
+ * do no real monitoring and reclamation due to the watermarks-based activation
+ * condition.  Refer to below descriptions for the watermarks parameter for
+ * this.
+ */
+static bool enabled __read_mostly;
+module_param(enabled, bool, 0600);
+
+/*
+ * Time threshold for cold memory regions identification in microseconds.
+ *
+ * If a memory region is not accessed for this or longer time, DAMON_RECLAIM
+ * identifies the region as cold, and reclaims.  5 seconds by default.
+ */
+static unsigned long min_age __read_mostly = 5000000;
+module_param(min_age, ulong, 0600);
+
+/*
+ * Limit of time for trying the reclamation in milliseconds.
+ *
+ * DAMON_RECLAIM tries to use only up to this time within a time window
+ * (quota_reset_interval_ms) for trying reclamation of cold pages.  This can be
+ * used for limiting CPU consumption of DAMON_RECLAIM.  If the value is zero,
+ * the limit is disabled.
+ *
+ * 100 ms by default.
+ */
+static unsigned long quota_ms __read_mostly = 100;
+module_param(quota_ms, ulong, 0600);
+
+/*
+ * Limit of size of memory for the reclamation in bytes.
+ *
+ * DAMON_RECLAIM charges amount of memory which it tried to reclaim within a
+ * time window (quota_reset_interval_ms) and makes no more than this limit is
+ * tried.  This can be used for limiting consumption of CPU and IO.  If this
+ * value is zero, the limit is disabled.
+ *
+ * 1 GiB by default.
+ */
+static unsigned long quota_sz __read_mostly = 1024 * 1024 * 1024;
+module_param(quota_sz, ulong, 0600);
+
+/*
+ * The time/size quota charge reset interval in milliseconds.
+ *
+ * The charge reset interval for the quota of time (quota_ms) and size
+ * (quota_sz).  That is, DAMON_RECLAIM does not try reclamation for more than
+ * quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms
+ * milliseconds.
+ *
+ * 1 second by default.
+ */
+static unsigned long quota_reset_interval_ms __read_mostly = 1000;
+module_param(quota_reset_interval_ms, ulong, 0600);
+
+/*
+ * The watermarks check time interval in microseconds.
+ *
+ * Minimal time to wait before checking the watermarks, when DAMON_RECLAIM is
+ * enabled but inactive due to its watermarks rule.  5 seconds by default.
+ */
+static unsigned long wmarks_interval __read_mostly = 5000000;
+module_param(wmarks_interval, ulong, 0600);
+
+/*
+ * Free memory rate (per thousand) for the high watermark.
+ *
+ * If free memory of the system in bytes per thousand bytes is higher than
+ * this, DAMON_RECLAIM becomes inactive, so it does nothing but periodically
+ * checks the watermarks.  500 (50%) by default.
+ */
+static unsigned long wmarks_high __read_mostly = 500;
+module_param(wmarks_high, ulong, 0600);
+
+/*
+ * Free memory rate (per thousand) for the middle watermark.
+ *
+ * If free memory of the system in bytes per thousand bytes is between this and
+ * the low watermark, DAMON_RECLAIM becomes active, so starts the monitoring
+ * and the reclaiming.  400 (40%) by default.
+ */
+static unsigned long wmarks_mid __read_mostly = 400;
+module_param(wmarks_mid, ulong, 0600);
+
+/*
+ * Free memory rate (per thousand) for the low watermark.
+ *
+ * If free memory of the system in bytes per thousand bytes is lower than this,
+ * DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks
+ * the watermarks.  In the case, the system falls back to the LRU-based page
+ * granularity reclamation logic.  200 (20%) by default.
+ */
+static unsigned long wmarks_low __read_mostly = 200;
+module_param(wmarks_low, ulong, 0600);
+
+/*
+ * Sampling interval for the monitoring in microseconds.
+ *
+ * The sampling interval of DAMON for the cold memory monitoring.  Please refer
+ * to the DAMON documentation for more detail.  5 ms by default.
+ */
+static unsigned long sample_interval __read_mostly = 5000;
+module_param(sample_interval, ulong, 0600);
+
+/*
+ * Aggregation interval for the monitoring in microseconds.
+ *
+ * The aggregation interval of DAMON for the cold memory monitoring.  Please
+ * refer to the DAMON documentation for more detail.  100 ms by default.
+ */
+static unsigned long aggr_interval __read_mostly = 100000;
+module_param(aggr_interval, ulong, 0600);
+
+/*
+ * Minimum number of monitoring regions.
+ *
+ * The minimal number of monitoring regions of DAMON for the cold memory
+ * monitoring.  This can be used to set lower-bound of the monitoring quality.
+ * But, setting this too high could result in increased monitoring overhead.
+ * Please refer to the DAMON documentation for more detail.  10 by default.
+ */
+static unsigned long min_nr_regions __read_mostly = 10;
+module_param(min_nr_regions, ulong, 0600);
+
+/*
+ * Maximum number of monitoring regions.
+ *
+ * The maximum number of monitoring regions of DAMON for the cold memory
+ * monitoring.  This can be used to set upper-bound of the monitoring overhead.
+ * However, setting this too low could result in bad monitoring quality.
+ * Please refer to the DAMON documentation for more detail.  1000 by default.
+ */
+static unsigned long max_nr_regions __read_mostly = 1000;
+module_param(max_nr_regions, ulong, 0600);
+
+/*
+ * Start of the target memory region in physical address.
+ *
+ * The start physical address of memory region that DAMON_RECLAIM will do work
+ * against.  By default, biggest System RAM is used as the region.
+ */
+static unsigned long monitor_region_start __read_mostly;
+module_param(monitor_region_start, ulong, 0600);
+
+/*
+ * End of the target memory region in physical address.
+ *
+ * The end physical address of memory region that DAMON_RECLAIM will do work
+ * against.  By default, biggest System RAM is used as the region.
+ */
+static unsigned long monitor_region_end __read_mostly;
+module_param(monitor_region_end, ulong, 0600);
+
+/*
+ * PID of the DAMON thread
+ *
+ * If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread.
+ * Else, -1.
+ */
+static int kdamond_pid __read_mostly = -1;
+module_param(kdamond_pid, int, 0400);
+
+static struct damon_ctx *ctx;
+static struct damon_target *target;
+
+struct damon_reclaim_ram_walk_arg {
+	unsigned long start;
+	unsigned long end;
+};
+
+static int walk_system_ram(struct resource *res, void *arg)
+{
+	struct damon_reclaim_ram_walk_arg *a = arg;
+
+	if (a->end - a->start < res->end - res->start) {
+		a->start = res->start;
+		a->end = res->end;
+	}
+	return 0;
+}
+
+/*
+ * Find biggest 'System RAM' resource and store its start and end address in
+ * @start and @end, respectively.  If no System RAM is found, returns false.
+ */
+static bool get_monitoring_region(unsigned long *start, unsigned long *end)
+{
+	struct damon_reclaim_ram_walk_arg arg = {};
+
+	walk_system_ram_res(0, ULONG_MAX, &arg, walk_system_ram);
+	if (arg.end <= arg.start)
+		return false;
+
+	*start = arg.start;
+	*end = arg.end;
+	return true;
+}
+
+static struct damos *damon_reclaim_new_scheme(void)
+{
+	struct damos_watermarks wmarks = {
+		.metric = DAMOS_WMARK_FREE_MEM_RATE,
+		.interval = wmarks_interval,
+		.high = wmarks_high,
+		.mid = wmarks_mid,
+		.low = wmarks_low,
+	};
+	struct damos_quota quota = {
+		/*
+		 * Do not try reclamation for more than quota_ms milliseconds
+		 * or quota_sz bytes within quota_reset_interval_ms.
+		 */
+		.ms = quota_ms,
+		.sz = quota_sz,
+		.reset_interval = quota_reset_interval_ms,
+		/* Within the quota, page out older regions first. */
+		.weight_sz = 0,
+		.weight_nr_accesses = 0,
+		.weight_age = 1
+	};
+	struct damos *scheme = damon_new_scheme(
+			/* Find regions having PAGE_SIZE or larger size */
+			PAGE_SIZE, ULONG_MAX,
+			/* and not accessed at all */
+			0, 0,
+			/* for min_age or more micro-seconds, and */
+			min_age / aggr_interval, UINT_MAX,
+			/* page out those, as soon as found */
+			DAMOS_PAGEOUT,
+			/* under the quota. */
+			&quota,
+			/* (De)activate this according to the watermarks. */
+			&wmarks);
+
+	return scheme;
+}
+
+static int damon_reclaim_turn(bool on)
+{
+	struct damon_region *region;
+	struct damos *scheme;
+	int err;
+
+	if (!on) {
+		err = damon_stop(&ctx, 1);
+		if (!err)
+			kdamond_pid = -1;
+		return err;
+	}
+
+	err = damon_set_attrs(ctx, sample_interval, aggr_interval, 0,
+			min_nr_regions, max_nr_regions);
+	if (err)
+		return err;
+
+	if (monitor_region_start > monitor_region_end)
+		return -EINVAL;
+	if (!monitor_region_start && !monitor_region_end &&
+			!get_monitoring_region(&monitor_region_start,
+				&monitor_region_end))
+		return -EINVAL;
+	/* DAMON will free this on its own when finish monitoring */
+	region = damon_new_region(monitor_region_start, monitor_region_end);
+	if (!region)
+		return -ENOMEM;
+	damon_add_region(region, target);
+
+	/* Will be freed by 'damon_set_schemes()' below */
+	scheme = damon_reclaim_new_scheme();
+	if (!scheme)
+		goto free_region_out;
+	err = damon_set_schemes(ctx, &scheme, 1);
+	if (err)
+		goto free_scheme_out;
+
+	err = damon_start(&ctx, 1);
+	if (!err) {
+		kdamond_pid = ctx->kdamond->pid;
+		return 0;
+	}
+
+free_scheme_out:
+	damon_destroy_scheme(scheme);
+free_region_out:
+	damon_destroy_region(region, target);
+	return err;
+}
+
+#define ENABLE_CHECK_INTERVAL_MS	1000
+static struct delayed_work damon_reclaim_timer;
+static void damon_reclaim_timer_fn(struct work_struct *work)
+{
+	static bool last_enabled;
+	bool now_enabled;
+
+	now_enabled = enabled;
+	if (last_enabled != now_enabled) {
+		if (!damon_reclaim_turn(now_enabled))
+			last_enabled = now_enabled;
+		else
+			enabled = last_enabled;
+	}
+
+	schedule_delayed_work(&damon_reclaim_timer,
+			msecs_to_jiffies(ENABLE_CHECK_INTERVAL_MS));
+}
+static DECLARE_DELAYED_WORK(damon_reclaim_timer, damon_reclaim_timer_fn);
+
+static int __init damon_reclaim_init(void)
+{
+	ctx = damon_new_ctx(DAMON_ADAPTIVE_TARGET);
+	if (!ctx)
+		return -ENOMEM;
+
+	damon_pa_set_primitives(ctx);
+
+	/* 4242 means nothing but fun */
+	target = damon_new_target(4242);
+	if (!target) {
+		damon_destroy_ctx(ctx);
+		return -ENOMEM;
+	}
+	damon_add_target(ctx, target);
+
+	schedule_delayed_work(&damon_reclaim_timer, 0);
+	return 0;
+}
+
+module_init(damon_reclaim_init);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC v3 15/15] Documentation/admin-guide/mm/damon: Add a document for DAMON_RECLAIM
  2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
                   ` (13 preceding siblings ...)
  2021-07-20 13:13 ` [RFC v3 14/15] mm/damon: Introduce DAMON-based reclamation SeongJae Park
@ 2021-07-20 13:13 ` SeongJae Park
  14 siblings, 0 replies; 16+ messages in thread
From: SeongJae Park @ 2021-07-20 13:13 UTC (permalink / raw)
  To: akpm
  Cc: SeongJae Park, Jonathan.Cameron, acme, alexander.shishkin, amit,
	benh, brendanhiggins, corbet, david, dwmw, elver, fan.du,
	foersleo, greg, gthelen, guoju.fgj, jgowans, joe, mgorman,
	mheyne, minchan, mingo, namhyung, peterz, riel, rientjes,
	rostedt, rppt, shakeelb, shuah, sieberf, sj38.park, snu, vbabka,
	vdavydov.dev, zgf574564920, linux-damon, linux-mm, linux-doc,
	linux-kernel

From: SeongJae Park <sjpark@amazon.de>

This commit adds an admin-guide document for DAMON-based Reclamation.

Signed-off-by: SeongJae Park <sjpark@amazon.de>
---
 Documentation/admin-guide/mm/damon/index.rst  |   1 +
 .../admin-guide/mm/damon/reclaim.rst          | 233 ++++++++++++++++++
 2 files changed, 234 insertions(+)
 create mode 100644 Documentation/admin-guide/mm/damon/reclaim.rst

diff --git a/Documentation/admin-guide/mm/damon/index.rst b/Documentation/admin-guide/mm/damon/index.rst
index 8c5dde3a5754..61aff88347f3 100644
--- a/Documentation/admin-guide/mm/damon/index.rst
+++ b/Documentation/admin-guide/mm/damon/index.rst
@@ -13,3 +13,4 @@ optimize those.
 
    start
    usage
+   reclaim
diff --git a/Documentation/admin-guide/mm/damon/reclaim.rst b/Documentation/admin-guide/mm/damon/reclaim.rst
new file mode 100644
index 000000000000..1f4ad79919ca
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/reclaim.rst
@@ -0,0 +1,233 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+DAMON-based Reclamation
+=======================
+
+DAMON-based Reclamation (DAMON_RECLAIM) is a static kernel module that aimed to
+be used for proactive and lightweight reclamation under light memory pressure.
+It doesn't aim to replace the LRU-list based page_granularity reclamation, but
+to be selectively used for different level of memory pressure and requirements.
+
+Where Proactive Reclamation is Required?
+========================================
+
+On general memory over-committed systems, proactively reclaiming cold pages
+helps saving memory and reducing latency spikes that incurred by the direct
+reclaim of the process or CPU consumption of kswapd, while incurring only
+minimal performance degradation [1]_ [2]_ .
+
+Free Pages Reporting [3]_ based memory over-commit virtualization systems are
+good example of the cases.  In such systems, the guest VMs reports their free
+memory to host, and the host reallocates the reported memory to other guests.
+As a result, the memory of the systems are fully utilized.  However, the
+guests could be not so memory-frugal, mainly because some kernel subsystems and
+user-space applications are designed to use as much memory as available.  Then,
+guests could report only small amount of memory as free to host, results in
+memory utilization drop of the systems.  Running the proactive reclamation in
+guests could mitigate this problem.
+
+How It Works?
+=============
+
+DAMON_RECLAIM finds memory regions that didn't accessed for specific time
+duration and page out.  To avoid it consuming too much CPU for the paging out
+operation, a speed limit can be configured.  Under the speed limit, it pages
+out memory regions that didn't accessed longer time first.  System
+administrators can also configure under what situation this scheme should
+automatically activated and deactivated with three memory pressure watermarks.
+
+Interface: Module Parameters
+============================
+
+To use this feature, you should first ensure your system is running on a kernel
+that is built with ``CONFIG_DAMON_RECLAIM=y``.
+
+To let sysadmins enable or disable it and tune for the given system,
+DAMON_RECLAIM utilizes module parameters.  That is, you can put
+``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write
+proper values to ``/sys/modules/damon_reclaim/parameters/<parameter>`` files.
+
+Note that the parameter values except ``enabled`` are applied only when
+DAMON_RECLAIM starts.  Therefore, if you want to apply new parameter values in
+runtime and DAMON_RECLAIM is already enabled, you should disable and re-enable
+it via ``enabled`` parameter file.  Writing of the new values to proper
+parameter values should be done before the re-enablement.
+
+Below are the description of each parameter.
+
+enabled
+-------
+
+Enable or disable DAMON_RECLAIM.
+
+You can enable DAMON_RCLAIM by setting the value of this parameter as ``Y``.
+Setting it as ``N`` disables DAMON_RECLAIM.  Note that DAMON_RECLAIM could do
+no real monitoring and reclamation due to the watermarks-based activation
+condition.  Refer to below descriptions for the watermarks parameter for this.
+
+min_age
+-------
+
+Time threshold for cold memory regions identification in microseconds.
+
+If a memory region is not accessed for this or longer time, DAMON_RECLAIM
+identifies the region as cold, and reclaims it.
+
+quota_ms
+--------
+
+Limit of time for the reclamation in milliseconds.
+
+DAMON_RECLAIM tries to use only up to this time within a time window
+(quota_reset_interval_ms) for trying reclamation of cold pages.  This can be
+used for limiting CPU consumption of DAMON_RECLAIM.  If the value is zero, the
+limit is disabled.
+
+100 ms by default.
+
+quota_sz
+--------
+
+Limit of size of memory for the reclamation in bytes.
+
+DAMON_RECLAIM charges amount of memory which it tried to reclaim within a time
+window (quota_reset_interval_ms) and makes no more than this limit is tried.
+This can be used for limiting consumption of CPU and IO.  If this value is
+zero, the limit is disabled.
+
+1 GiB by default.
+
+quota_reset_interval_ms
+-----------------------
+
+The time/size quota charge reset interval in milliseconds.
+
+The charget reset interval for the quota of time (quota_ms) and size
+(quota_sz).  That is, DAMON_RECLAIM does not try reclamation for more than
+quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms
+milliseconds.
+
+1 second by default.
+
+wmarks_interval
+---------------
+
+Minimal time to wait before checking the watermarks, when DAMON_RECLAIM is
+enabled but inactive due to its watermarks rule.
+
+wmarks_high
+-----------
+
+Free memory rate (per thousand) for the high watermark.
+
+If free memory of the system in bytes per thousand bytes is higher than this,
+DAMON_RECLAIM becomes inactive, so it does nothing but only periodically checks
+the watermarks.
+
+wmarks_mid
+----------
+
+Free memory rate (per thousand) for the middle watermark.
+
+If free memory of the system in bytes per thousand bytes is between this and
+the low watermark, DAMON_RECLAIM becomes active, so starts the monitoring and
+the reclaiming.
+
+wmarks_low
+----------
+
+Free memory rate (per thousand) for the low watermark.
+
+If free memory of the system in bytes per thousand bytes is lower than this,
+DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks the
+watermarks.  In the case, the system falls back to the LRU-list based page
+granularity reclamation logic.
+
+sample_interval
+---------------
+
+Sampling interval for the monitoring in microseconds.
+
+The sampling interval of DAMON for the cold memory monitoring.  Please refer to
+the DAMON documentation (:doc:`usage`) for more detail.
+
+aggr_interval
+-------------
+
+Aggregation interval for the monitoring in microseconds.
+
+The aggregation interval of DAMON for the cold memory monitoring.  Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+min_nr_regions
+--------------
+
+Minimum number of monitoring regions.
+
+The minimal number of monitoring regions of DAMON for the cold memory
+monitoring.  This can be used to set lower-bound of the monitoring quality.
+But, setting this too high could result in increased monitoring overhead.
+Please refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+max_nr_regions
+--------------
+
+Maximum number of monitoring regions.
+
+The maximum number of monitoring regions of DAMON for the cold memory
+monitoring.  This can be used to set upper-bound of the monitoring overhead.
+However, setting this too low could result in bad monitoring quality.  Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+monitor_region_start
+--------------------
+
+Start of target memory region in physical address.
+
+The start physical address of memory region that DAMON_RECLAIM will do work
+against.  That is, DAMON_RECLAIM will find cold memory regions in this region
+and reclaims.  By default, biggest System RAM is used as the region.
+
+monitor_region_end
+------------------
+
+End of target memory region in physical address.
+
+The end physical address of memory region that DAMON_RECLAIM will do work
+against.  That is, DAMON_RECLAIM will find cold memory regions in this region
+and reclaims.  By default, biggest System RAM is used as the region.
+
+kdamond_pid
+-----------
+
+PID of the DAMON thread.
+
+If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread.  Else,
+-1.
+
+Example
+=======
+
+Below runtime example commands make DAMON_RECLAIM to find memory regions that
+not accessed for 30 seconds or more and pages out.  The reclamation is limited
+to be done only up to 1 GiB per second to avoid DAMON_RECLAIM consuming too
+much CPU time for the paging out operation.  It also asks DAMON_RECLAIM to do
+nothing if the system's free memory rate is more than 50%, but start the real
+works if it becomes lower than 40%.  If DAMON_RECLAIM doesn't make progress and
+therefore the free memory rate becomes lower than 20%, it asks DAMON_RECLAIM to
+do nothing again, so that we can fall back to the LRU-list based page
+granularity reclamation. ::
+
+    # cd /sys/modules/damon_reclaim/parameters
+    # echo 30000000 > min_age
+    # echo $((1 * 1024 * 1024 * 1024)) > quota_sz
+    # echo 1000 > quota_reset_interval_ms
+    # echo 500 > wmarks_high
+    # echo 400 > wmarks_mid
+    # echo 200 > wmarks_low
+    # echo Y > enabled
+
+.. [1] https://research.google/pubs/pub48551/
+.. [2] https://lwn.net/Articles/787611/
+.. [3] https://www.kernel.org/doc/html/latest/vm/free_page_reporting.html
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-07-20 13:30 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-20 13:12 [RFC v3 00/15] Introduce DAMON-based Proactive Reclamation SeongJae Park
2021-07-20 13:12 ` [RFC v3 01/15] mm/damon/paddr: Support the pageout scheme SeongJae Park
2021-07-20 13:12 ` [RFC v3 02/15] mm/damon/damos: Make schemes aggressiveness controllable SeongJae Park
2021-07-20 13:12 ` [RFC v3 03/15] damon/core/schemes: Skip already charged targets and regions SeongJae Park
2021-07-20 13:12 ` [RFC v3 04/15] mm/damon/schemes: Implement time quota SeongJae Park
2021-07-20 13:12 ` [RFC v3 05/15] mm/damon/dbgfs: Support schemes' time/IO quotas SeongJae Park
2021-07-20 13:13 ` [RFC v3 06/15] mm/damon/selftests: Support schemes quotas SeongJae Park
2021-07-20 13:13 ` [RFC v3 07/15] mm/damon/schemes: Prioritize regions within the quotas SeongJae Park
2021-07-20 13:13 ` [RFC v3 08/15] mm/damon/vaddr,paddr: Support pageout prioritization SeongJae Park
2021-07-20 13:13 ` [RFC v3 09/15] mm/damon/dbgfs: Support prioritization weights SeongJae Park
2021-07-20 13:13 ` [RFC v3 10/15] tools/selftests/damon: Update for regions prioritization of schemes SeongJae Park
2021-07-20 13:13 ` [RFC v3 11/15] mm/damon/schemes: Activate schemes based on a watermarks mechanism SeongJae Park
2021-07-20 13:13 ` [RFC v3 12/15] mm/damon/dbgfs: Support watermarks SeongJae Park
2021-07-20 13:13 ` [RFC v3 13/15] selftests/damon: " SeongJae Park
2021-07-20 13:13 ` [RFC v3 14/15] mm/damon: Introduce DAMON-based reclamation SeongJae Park
2021-07-20 13:13 ` [RFC v3 15/15] Documentation/admin-guide/mm/damon: Add a document for DAMON_RECLAIM SeongJae Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).