LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups
@ 2021-11-23 20:23 Luis Chamberlain
  2021-11-23 20:23 ` [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface Luis Chamberlain
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

Finally had time to respin the series of the work we had started
last year on cleaning up the kernel/sysct.c kitchen sink. People keeps
stuffing their sysctls in that file and this creates a maintenance
burden. So this effort is aimed at placing sysctls where they actually
belong.

I'm going to split patches up into series as there is quite a bit
of work.

This first set adds register_sysctl_init() for uses of registerting a
sysctl on the init path, adds const where missing to a few places, generalizes
common values so to be more easy to share, and starts the move of a
few kernel/sysctl.c out where they belong.

The majority of rework on v2 in this first patch set is 0-day fixes.
Eric W. Biederman's feedback is later addressed in subsequent patch
sets.

I'll only post the first two patch sets for now. We can address the
rest once the first two patch sets get completely reviewed / Acked.
Since the sysctls are all over the place I can either put up a tree
to keep track of these changes and later send a pull request to Linus
or we can have them trickle into Andrew's tree. Let me know what folks
prefer.

Changes in v2:

  * 0-day compile issues
  * added reviewed-by tags
  * enhanced commit logs
  * Added patch by Stephen Kitt

Stephen Kitt (1):
  sysctl: make ngroups_max const

Xiaoming Ni (8):
  sysctl: add a new register_sysctl_init() interface
  sysctl: Move some boundary constants from sysctl.c to sysctl_vals
  hung_task: Move hung_task sysctl interface to hung_task.c
  watchdog: move watchdog sysctl interface to watchdog.c
  sysctl: use const for typically used max/min proc sysctls
  sysctl: use SYSCTL_ZERO to replace some static int zero uses
  aio: move aio sysctl to aio.c
  dnotify: move dnotify sysctl to dnotify.c

 fs/aio.c                     |  31 +++-
 fs/notify/dnotify/dnotify.c  |  21 ++-
 fs/proc/proc_sysctl.c        |  35 ++++-
 include/linux/aio.h          |   4 -
 include/linux/dnotify.h      |   1 -
 include/linux/sched/sysctl.h |  14 +-
 include/linux/sysctl.h       |  15 +-
 kernel/hung_task.c           |  81 +++++++++-
 kernel/sysctl.c              | 285 ++++++-----------------------------
 kernel/watchdog.c            | 101 +++++++++++++
 10 files changed, 322 insertions(+), 266 deletions(-)

-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-25 16:14   ` Petr Mladek
  2021-11-23 20:23 ` [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals Luis Chamberlain
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

The kernel/sysctl.c is a kitchen sink where everyone leaves
their dirty dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to
places where they actually belong. The proc sysctl maintainers
do not want to know what sysctl knobs you wish to add for your own
piece of code, we just care about the core logic.

Today though folks heavily rely on tables on kernel/sysctl.c so
they can easily just extend this table with their needed sysctls.
In order to help users move their sysctls out we need to provide a
helper which can be used during code initialization.

We special-case the initialization use of register_sysctl() since
it *is* safe to fail, given all that sysctls do is provide a dynamic
interface to query or modify at runtime an existing variable. So the
use case of register_sysctl() on init should *not* stop if the sysctls
don't end up getting registered. It would be counter productive to
stop boot if a simple sysctl registration failed.

Provide a helper for init then, and document the recommended init
levels to use for callers of this routine. We will later use this
in subsequent patches to start slimming down kernel/sysctl.c tables
and moving sysctl registration to the code which actually needs
these sysctls.

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
[mcgrof: major commit log and documentation rephrasing
 also moved to fs/proc/proc_sysctl.c                  ]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/proc/proc_sysctl.c  | 33 +++++++++++++++++++++++++++++++++
 include/linux/sysctl.h |  3 +++
 2 files changed, 36 insertions(+)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5d66faecd4ef..b4950843d90a 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -16,6 +16,7 @@
 #include <linux/module.h>
 #include <linux/bpf-cgroup.h>
 #include <linux/mount.h>
+#include <linux/kmemleak.h>
 #include "internal.h"
 
 static const struct dentry_operations proc_sys_dentry_operations;
@@ -1384,6 +1385,38 @@ struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *tab
 }
 EXPORT_SYMBOL(register_sysctl);
 
+/**
+ * __register_sysctl_init() - register sysctl table to path
+ * @path: path name for sysctl base
+ * @table: This is the sysctl table that needs to be registered to the path
+ * @table_name: The name of sysctl table, only used for log printing when
+ *              registration fails
+ *
+ * The sysctl interface is used by userspace to query or modify at runtime
+ * a predefined value set on a variable. These variables however have default
+ * values pre-set. Code which depends on these variables will always work even
+ * if register_sysctl() fails. If register_sysctl() fails you'd just loose the
+ * ability to query or modify the sysctls dynamically at run time. Chances of
+ * register_sysctl() failing on init are extremely low, and so for both reasons
+ * this function does not return any error as it is used by initialization code.
+ *
+ * Context: Can only be called after your respective sysctl base path has been
+ * registered. So for instance, most base directories are registered early on
+ * init before init levels are processed through proc_sys_init() and
+ * sysctl_init().
+ */
+void __init __register_sysctl_init(const char *path, struct ctl_table *table,
+				 const char *table_name)
+{
+	struct ctl_table_header *hdr = register_sysctl(path, table);
+
+	if (unlikely(!hdr)) {
+		pr_err("failed when register_sysctl %s to %s\n", table_name, path);
+		return;
+	}
+	kmemleak_not_leak(hdr);
+}
+
 static char *append_path(const char *path, char *pos, const char *name)
 {
 	int namelen;
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 1fa2b69c6fc3..d3ab7969b6b5 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -199,6 +199,9 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 void unregister_sysctl_table(struct ctl_table_header * table);
 
 extern int sysctl_init(void);
+extern void __register_sysctl_init(const char *path, struct ctl_table *table,
+				 const char *table_name);
+#define register_sysctl_init(path, table) __register_sysctl_init(path, table, #table)
 void do_sysctl_args(void);
 
 extern int pwrsw_enabled;
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
  2021-11-23 20:23 ` [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-24  4:51   ` Eric W. Biederman
  2021-11-23 20:23 ` [PATCH v2 3/9] hung_task: Move hung_task sysctl interface to hung_task.c Luis Chamberlain
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

sysctl has helpers which let us specify boundary values for a min or
max int value. Since these are used for a boundary check only they don't
change, so move these variables to sysctl_vals to avoid adding duplicate
variables. This will help with our cleanup of kernel/sysctl.c.

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
[mcgrof: major rebase]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/proc/proc_sysctl.c  |  2 +-
 include/linux/sysctl.h | 12 +++++++++---
 kernel/sysctl.c        | 44 ++++++++++++++++++------------------------
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b4950843d90a..6d462644bb00 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -26,7 +26,7 @@ static const struct file_operations proc_sys_dir_file_operations;
 static const struct inode_operations proc_sys_dir_operations;
 
 /* shared constants to be used in various sysctls */
-const int sysctl_vals[] = { 0, 1, INT_MAX };
+const int sysctl_vals[] = { -1, 0, 1, 2, 4, 100, 200, 1000, INT_MAX };
 EXPORT_SYMBOL(sysctl_vals);
 
 /* Support for permanently empty directories */
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index d3ab7969b6b5..718492057c70 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -38,9 +38,15 @@ struct ctl_table_header;
 struct ctl_dir;
 
 /* Keep the same order as in fs/proc/proc_sysctl.c */
-#define SYSCTL_ZERO	((void *)&sysctl_vals[0])
-#define SYSCTL_ONE	((void *)&sysctl_vals[1])
-#define SYSCTL_INT_MAX	((void *)&sysctl_vals[2])
+#define SYSCTL_NEG_ONE			((void *)&sysctl_vals[0])
+#define SYSCTL_ZERO			((void *)&sysctl_vals[1])
+#define SYSCTL_ONE			((void *)&sysctl_vals[2])
+#define SYSCTL_TWO			((void *)&sysctl_vals[3])
+#define SYSCTL_FOUR			((void *)&sysctl_vals[4])
+#define SYSCTL_ONE_HUNDRED		((void *)&sysctl_vals[5])
+#define SYSCTL_TWO_HUNDRED		((void *)&sysctl_vals[6])
+#define SYSCTL_ONE_THOUSAND		((void *)&sysctl_vals[7])
+#define SYSCTL_INT_MAX			((void *)&sysctl_vals[8])
 
 extern const int sysctl_vals[];
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 857c1ccad9e8..3097f0286504 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -113,15 +113,9 @@
 static int sixty = 60;
 #endif
 
-static int __maybe_unused neg_one = -1;
-static int __maybe_unused two = 2;
-static int __maybe_unused four = 4;
 static unsigned long zero_ul;
 static unsigned long one_ul = 1;
 static unsigned long long_max = LONG_MAX;
-static int one_hundred = 100;
-static int two_hundred = 200;
-static int one_thousand = 1000;
 #ifdef CONFIG_PRINTK
 static int ten_thousand = 10000;
 #endif
@@ -1962,7 +1956,7 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &neg_one,
+		.extra1		= SYSCTL_NEG_ONE,
 		.extra2		= SYSCTL_ONE,
 	},
 #endif
@@ -2304,7 +2298,7 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax_sysadmin,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= SYSCTL_TWO,
 	},
 #endif
 	{
@@ -2564,7 +2558,7 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &neg_one,
+		.extra1		= SYSCTL_NEG_ONE,
 	},
 #endif
 #ifdef CONFIG_RT_MUTEXES
@@ -2626,7 +2620,7 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= perf_cpu_time_max_percent_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &one_hundred,
+		.extra2		= SYSCTL_ONE_HUNDRED,
 	},
 	{
 		.procname	= "perf_event_max_stack",
@@ -2644,7 +2638,7 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= perf_event_max_stack_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &one_thousand,
+		.extra2		= SYSCTL_ONE_THOUSAND,
 	},
 #endif
 	{
@@ -2675,7 +2669,7 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= bpf_unpriv_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= SYSCTL_TWO,
 	},
 	{
 		.procname	= "bpf_stats_enabled",
@@ -2729,7 +2723,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= overcommit_policy_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= SYSCTL_TWO,
 	},
 	{
 		.procname	= "panic_on_oom",
@@ -2738,7 +2732,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= SYSCTL_TWO,
 	},
 	{
 		.procname	= "oom_kill_allocating_task",
@@ -2783,7 +2777,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= dirty_background_ratio_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &one_hundred,
+		.extra2		= SYSCTL_ONE_HUNDRED,
 	},
 	{
 		.procname	= "dirty_background_bytes",
@@ -2800,7 +2794,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= dirty_ratio_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &one_hundred,
+		.extra2		= SYSCTL_ONE_HUNDRED,
 	},
 	{
 		.procname	= "dirty_bytes",
@@ -2840,7 +2834,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two_hundred,
+		.extra2		= SYSCTL_TWO_HUNDRED,
 	},
 #ifdef CONFIG_HUGETLB_PAGE
 	{
@@ -2897,7 +2891,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0200,
 		.proc_handler	= drop_caches_sysctl_handler,
 		.extra1		= SYSCTL_ONE,
-		.extra2		= &four,
+		.extra2		= SYSCTL_FOUR,
 	},
 #ifdef CONFIG_COMPACTION
 	{
@@ -2914,7 +2908,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= compaction_proactiveness_sysctl_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &one_hundred,
+		.extra2		= SYSCTL_ONE_HUNDRED,
 	},
 	{
 		.procname	= "extfrag_threshold",
@@ -2959,7 +2953,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= watermark_scale_factor_sysctl_handler,
 		.extra1		= SYSCTL_ONE,
-		.extra2		= &one_thousand,
+		.extra2		= SYSCTL_ONE_THOUSAND,
 	},
 	{
 		.procname	= "percpu_pagelist_high_fraction",
@@ -3038,7 +3032,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= sysctl_min_unmapped_ratio_sysctl_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &one_hundred,
+		.extra2		= SYSCTL_ONE_HUNDRED,
 	},
 	{
 		.procname	= "min_slab_ratio",
@@ -3047,7 +3041,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= sysctl_min_slab_ratio_sysctl_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &one_hundred,
+		.extra2		= SYSCTL_ONE_HUNDRED,
 	},
 #endif
 #ifdef CONFIG_SMP
@@ -3337,7 +3331,7 @@ static struct ctl_table fs_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= SYSCTL_TWO,
 	},
 	{
 		.procname	= "protected_regular",
@@ -3346,7 +3340,7 @@ static struct ctl_table fs_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= SYSCTL_TWO,
 	},
 	{
 		.procname	= "suid_dumpable",
@@ -3355,7 +3349,7 @@ static struct ctl_table fs_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax_coredump,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= SYSCTL_TWO,
 	},
 #if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
 	{
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 3/9] hung_task: Move hung_task sysctl interface to hung_task.c
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
  2021-11-23 20:23 ` [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface Luis Chamberlain
  2021-11-23 20:23 ` [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-26  9:46   ` Petr Mladek
  2021-11-23 20:23 ` [PATCH v2 4/9] watchdog: move watchdog sysctl interface to watchdog.c Luis Chamberlain
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

The kernel/sysctl.c is a kitchen sink where everyone leaves
their dirty dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to
places where they actually belong. The proc sysctl maintainers
do not want to know what sysctl knobs you wish to add for your own
piece of code, we just care about the core logic.

So move hung_task sysctl interface to hung_task.c and use
register_sysctl() to register the sysctl interface.

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
[mcgrof: commit log refresh and fixed 2-3 0day reported compile issues]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 include/linux/sched/sysctl.h | 14 +------
 kernel/hung_task.c           | 81 ++++++++++++++++++++++++++++++++++--
 kernel/sysctl.c              | 61 ---------------------------
 3 files changed, 79 insertions(+), 77 deletions(-)

diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 304f431178fd..c19dd5a2c05c 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -7,20 +7,8 @@
 struct ctl_table;
 
 #ifdef CONFIG_DETECT_HUNG_TASK
-
-#ifdef CONFIG_SMP
-extern unsigned int sysctl_hung_task_all_cpu_backtrace;
-#else
-#define sysctl_hung_task_all_cpu_backtrace 0
-#endif /* CONFIG_SMP */
-
-extern int	     sysctl_hung_task_check_count;
-extern unsigned int  sysctl_hung_task_panic;
+/* used for hung_task and block/ */
 extern unsigned long sysctl_hung_task_timeout_secs;
-extern unsigned long sysctl_hung_task_check_interval_secs;
-extern int sysctl_hung_task_warnings;
-int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
-		void *buffer, size_t *lenp, loff_t *ppos);
 #else
 /* Avoid need for ifdefs elsewhere in the code */
 enum { sysctl_hung_task_timeout_secs = 0 };
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 8cc07e7f29aa..40220dfd6fa9 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -63,7 +63,9 @@ static struct task_struct *watchdog_task;
  * Should we dump all CPUs backtraces in a hung task event?
  * Defaults to 0, can be changed via sysctl.
  */
-unsigned int __read_mostly sysctl_hung_task_all_cpu_backtrace;
+static unsigned int __read_mostly sysctl_hung_task_all_cpu_backtrace;
+#else
+#define sysctl_hung_task_all_cpu_backtrace 0
 #endif /* CONFIG_SMP */
 
 /*
@@ -266,11 +268,13 @@ static long hung_timeout_jiffies(unsigned long last_checked,
 		MAX_SCHEDULE_TIMEOUT;
 }
 
+#ifdef CONFIG_SYSCTL
 /*
  * Process updating of timeout sysctl
  */
-int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
-				  void *buffer, size_t *lenp, loff_t *ppos)
+static int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
+				  void __user *buffer,
+				  size_t *lenp, loff_t *ppos)
 {
 	int ret;
 
@@ -285,6 +289,76 @@ int proc_dohung_task_timeout_secs(struct ctl_table *table, int write,
 	return ret;
 }
 
+/*
+ * This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs
+ * and hung_task_check_interval_secs
+ */
+static const unsigned long hung_task_timeout_max = (LONG_MAX / HZ);
+static struct ctl_table hung_task_sysctls[] = {
+#ifdef CONFIG_SMP
+	{
+		.procname	= "hung_task_all_cpu_backtrace",
+		.data		= &sysctl_hung_task_all_cpu_backtrace,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+#endif /* CONFIG_SMP */
+	{
+		.procname	= "hung_task_panic",
+		.data		= &sysctl_hung_task_panic,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+	{
+		.procname	= "hung_task_check_count",
+		.data		= &sysctl_hung_task_check_count,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+	},
+	{
+		.procname	= "hung_task_timeout_secs",
+		.data		= &sysctl_hung_task_timeout_secs,
+		.maxlen		= sizeof(unsigned long),
+		.mode		= 0644,
+		.proc_handler	= proc_dohung_task_timeout_secs,
+		.extra2		= (void *)&hung_task_timeout_max,
+	},
+	{
+		.procname	= "hung_task_check_interval_secs",
+		.data		= &sysctl_hung_task_check_interval_secs,
+		.maxlen		= sizeof(unsigned long),
+		.mode		= 0644,
+		.proc_handler	= proc_dohung_task_timeout_secs,
+		.extra2		= (void *)&hung_task_timeout_max,
+	},
+	{
+		.procname	= "hung_task_warnings",
+		.data		= &sysctl_hung_task_warnings,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_NEG_ONE,
+	},
+	{}
+};
+
+static void __init hung_task_sysctl_init(void)
+{
+	register_sysctl_init("kernel", hung_task_sysctls);
+}
+#else
+#define hung_task_sysctl_init() do { } while (0)
+#endif /* CONFIG_SYSCTL */
+
+
 static atomic_t reset_hung_task = ATOMIC_INIT(0);
 
 void reset_hung_task_detector(void)
@@ -354,6 +428,7 @@ static int __init hung_task_init(void)
 	pm_notifier(hungtask_pm_notify, 0);
 
 	watchdog_task = kthread_run(watchdog, NULL, "khungtaskd");
+	hung_task_sysctl_init();
 
 	return 0;
 }
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 3097f0286504..9fc6a5222cee 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -133,13 +133,6 @@ static int minolduid;
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
 
-/*
- * This is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs
- * and hung_task_check_interval_secs
- */
-#ifdef CONFIG_DETECT_HUNG_TASK
-static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
-#endif
 
 #ifdef CONFIG_INOTIFY_USER
 #include <linux/inotify.h>
@@ -2507,60 +2500,6 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
-#ifdef CONFIG_DETECT_HUNG_TASK
-#ifdef CONFIG_SMP
-	{
-		.procname	= "hung_task_all_cpu_backtrace",
-		.data		= &sysctl_hung_task_all_cpu_backtrace,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-#endif /* CONFIG_SMP */
-	{
-		.procname	= "hung_task_panic",
-		.data		= &sysctl_hung_task_panic,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-	{
-		.procname	= "hung_task_check_count",
-		.data		= &sysctl_hung_task_check_count,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-	},
-	{
-		.procname	= "hung_task_timeout_secs",
-		.data		= &sysctl_hung_task_timeout_secs,
-		.maxlen		= sizeof(unsigned long),
-		.mode		= 0644,
-		.proc_handler	= proc_dohung_task_timeout_secs,
-		.extra2		= &hung_task_timeout_max,
-	},
-	{
-		.procname	= "hung_task_check_interval_secs",
-		.data		= &sysctl_hung_task_check_interval_secs,
-		.maxlen		= sizeof(unsigned long),
-		.mode		= 0644,
-		.proc_handler	= proc_dohung_task_timeout_secs,
-		.extra2		= &hung_task_timeout_max,
-	},
-	{
-		.procname	= "hung_task_warnings",
-		.data		= &sysctl_hung_task_warnings,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_NEG_ONE,
-	},
-#endif
 #ifdef CONFIG_RT_MUTEXES
 	{
 		.procname	= "max_lock_depth",
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 4/9] watchdog: move watchdog sysctl interface to watchdog.c
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
                   ` (2 preceding siblings ...)
  2021-11-23 20:23 ` [PATCH v2 3/9] hung_task: Move hung_task sysctl interface to hung_task.c Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-26  9:40   ` Petr Mladek
  2021-11-23 20:23 ` [PATCH v2 5/9] sysctl: make ngroups_max const Luis Chamberlain
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

The kernel/sysctl.c is a kitchen sink where everyone leaves
their dirty dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to
places where they actually belong. The proc sysctl maintainers
do not want to know what sysctl knobs you wish to add for your own
piece of code, we just care about the core logic of proc sysctl.

So, move the watchdog syscl interface to watchdog.c.
Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the
same time.

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
[mcgrof: justify the move on the commit log]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kernel/sysctl.c   |  96 -------------------------------------------
 kernel/watchdog.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+), 96 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 9fc6a5222cee..8d5bcf1f08f3 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -102,16 +102,10 @@
 #ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
 #include <linux/stackleak.h>
 #endif
-#ifdef CONFIG_LOCKUP_DETECTOR
-#include <linux/nmi.h>
-#endif
 
 #if defined(CONFIG_SYSCTL)
 
 /* Constants used for minimum and  maximum */
-#ifdef CONFIG_LOCKUP_DETECTOR
-static int sixty = 60;
-#endif
 
 static unsigned long zero_ul;
 static unsigned long one_ul = 1;
@@ -2308,96 +2302,6 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0444,
 		.proc_handler	= proc_dointvec,
 	},
-#if defined(CONFIG_LOCKUP_DETECTOR)
-	{
-		.procname       = "watchdog",
-		.data		= &watchdog_user_enabled,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler   = proc_watchdog,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-	{
-		.procname	= "watchdog_thresh",
-		.data		= &watchdog_thresh,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_watchdog_thresh,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= &sixty,
-	},
-	{
-		.procname       = "nmi_watchdog",
-		.data		= &nmi_watchdog_user_enabled,
-		.maxlen		= sizeof(int),
-		.mode		= NMI_WATCHDOG_SYSCTL_PERM,
-		.proc_handler   = proc_nmi_watchdog,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-	{
-		.procname	= "watchdog_cpumask",
-		.data		= &watchdog_cpumask_bits,
-		.maxlen		= NR_CPUS,
-		.mode		= 0644,
-		.proc_handler	= proc_watchdog_cpumask,
-	},
-#ifdef CONFIG_SOFTLOCKUP_DETECTOR
-	{
-		.procname       = "soft_watchdog",
-		.data		= &soft_watchdog_user_enabled,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler   = proc_soft_watchdog,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-	{
-		.procname	= "softlockup_panic",
-		.data		= &softlockup_panic,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-#ifdef CONFIG_SMP
-	{
-		.procname	= "softlockup_all_cpu_backtrace",
-		.data		= &sysctl_softlockup_all_cpu_backtrace,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-#endif /* CONFIG_SMP */
-#endif
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
-	{
-		.procname	= "hardlockup_panic",
-		.data		= &hardlockup_panic,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-#ifdef CONFIG_SMP
-	{
-		.procname	= "hardlockup_all_cpu_backtrace",
-		.data		= &sysctl_hardlockup_all_cpu_backtrace,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-#endif /* CONFIG_SMP */
-#endif
-#endif
-
 #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
 	{
 		.procname       = "unknown_nmi_panic",
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index ad912511a0c0..99afb88d2e85 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -740,6 +740,106 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 	mutex_unlock(&watchdog_mutex);
 	return err;
 }
+
+static const int sixty = 60;
+
+static struct ctl_table watchdog_sysctls[] = {
+	{
+		.procname       = "watchdog",
+		.data		= &watchdog_user_enabled,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler   = proc_watchdog,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+	{
+		.procname	= "watchdog_thresh",
+		.data		= &watchdog_thresh,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_watchdog_thresh,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= (void *)&sixty,
+	},
+	{
+		.procname       = "nmi_watchdog",
+		.data		= &nmi_watchdog_user_enabled,
+		.maxlen		= sizeof(int),
+		.mode		= NMI_WATCHDOG_SYSCTL_PERM,
+		.proc_handler   = proc_nmi_watchdog,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+	{
+		.procname	= "watchdog_cpumask",
+		.data		= &watchdog_cpumask_bits,
+		.maxlen		= NR_CPUS,
+		.mode		= 0644,
+		.proc_handler	= proc_watchdog_cpumask,
+	},
+#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+	{
+		.procname       = "soft_watchdog",
+		.data		= &soft_watchdog_user_enabled,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler   = proc_soft_watchdog,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+	{
+		.procname	= "softlockup_panic",
+		.data		= &softlockup_panic,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+#ifdef CONFIG_SMP
+	{
+		.procname	= "softlockup_all_cpu_backtrace",
+		.data		= &sysctl_softlockup_all_cpu_backtrace,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+#endif /* CONFIG_SMP */
+#endif
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+	{
+		.procname	= "hardlockup_panic",
+		.data		= &hardlockup_panic,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+#ifdef CONFIG_SMP
+	{
+		.procname	= "hardlockup_all_cpu_backtrace",
+		.data		= &sysctl_hardlockup_all_cpu_backtrace,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+#endif /* CONFIG_SMP */
+#endif
+	{}
+};
+
+static void __init watchdog_sysctl_init(void)
+{
+	register_sysctl_init("kernel", watchdog_sysctls);
+}
+#else
+#define watchdog_sysctl_init() do { } while (0)
 #endif /* CONFIG_SYSCTL */
 
 void __init lockup_detector_init(void)
@@ -753,4 +853,5 @@ void __init lockup_detector_init(void)
 	if (!watchdog_nmi_probe())
 		nmi_watchdog_available = true;
 	lockup_detector_setup();
+	watchdog_sysctl_init();
 }
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 5/9] sysctl: make ngroups_max const
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
                   ` (3 preceding siblings ...)
  2021-11-23 20:23 ` [PATCH v2 4/9] watchdog: move watchdog sysctl interface to watchdog.c Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-23 20:23 ` [PATCH v2 6/9] sysctl: use const for typically used max/min proc sysctls Luis Chamberlain
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Stephen Kitt, Luis Chamberlain

From: Stephen Kitt <steve@sk2.org>

ngroups_max is a read-only sysctl entry, reflecting NGROUPS_MAX. Make
it const, in the same way as cap_last_cap.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Stephen Kitt <steve@sk2.org>
---
 kernel/sysctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8d5bcf1f08f3..b22149c46fa8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -124,7 +124,7 @@ static unsigned long dirty_bytes_min = 2 * PAGE_SIZE;
 static int maxolduid = 65535;
 static int minolduid;
 
-static int ngroups_max = NGROUPS_MAX;
+static const int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
 
 
@@ -2290,7 +2290,7 @@ static struct ctl_table kern_table[] = {
 #endif
 	{
 		.procname	= "ngroups_max",
-		.data		= &ngroups_max,
+		.data		= (void *)&ngroups_max,
 		.maxlen		= sizeof (int),
 		.mode		= 0444,
 		.proc_handler	= proc_dointvec,
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 6/9] sysctl: use const for typically used max/min proc sysctls
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
                   ` (4 preceding siblings ...)
  2021-11-23 20:23 ` [PATCH v2 5/9] sysctl: make ngroups_max const Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-23 20:23 ` [PATCH v2 7/9] sysctl: use SYSCTL_ZERO to replace some static int zero uses Luis Chamberlain
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

When proc_dointvec_minmax() or proc_doulongvec_minmax() are used we
are using the extra1 and extra2 parameters on the sysctl table only
for a min and max boundary, these extra1 and extra2 arguments are
then used for read-only operations. So make them const to reflect
this.

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
[mcgrof: commit log love]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kernel/sysctl.c | 53 ++++++++++++++++++++++++-------------------------
 1 file changed, 26 insertions(+), 27 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b22149c46fa8..03bbd26d4df0 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -107,27 +107,26 @@
 
 /* Constants used for minimum and  maximum */
 
-static unsigned long zero_ul;
-static unsigned long one_ul = 1;
-static unsigned long long_max = LONG_MAX;
+static const unsigned long zero_ul;
+static const unsigned long one_ul = 1;
+static const unsigned long long_max = LONG_MAX;
 #ifdef CONFIG_PRINTK
-static int ten_thousand = 10000;
+static const int ten_thousand = 10000;
 #endif
 #ifdef CONFIG_PERF_EVENTS
-static int six_hundred_forty_kb = 640 * 1024;
+static const int six_hundred_forty_kb = 640 * 1024;
 #endif
 
 /* this is needed for the proc_doulongvec_minmax of vm_dirty_bytes */
-static unsigned long dirty_bytes_min = 2 * PAGE_SIZE;
+static const unsigned long dirty_bytes_min = 2 * PAGE_SIZE;
 
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
-static int maxolduid = 65535;
-static int minolduid;
+static const int maxolduid = 65535;
+static const int minolduid;
 
 static const int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
 
-
 #ifdef CONFIG_INOTIFY_USER
 #include <linux/inotify.h>
 #endif
@@ -171,8 +170,8 @@ int sysctl_legacy_va_layout;
 #endif
 
 #ifdef CONFIG_COMPACTION
-static int min_extfrag_threshold;
-static int max_extfrag_threshold = 1000;
+static const int min_extfrag_threshold;
+static const int max_extfrag_threshold = 1000;
 #endif
 
 #endif /* CONFIG_SYSCTL */
@@ -2176,8 +2175,8 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &minolduid,
-		.extra2		= &maxolduid,
+		.extra1		= (void *)&minolduid,
+		.extra2		= (void *)&maxolduid,
 	},
 	{
 		.procname	= "overflowgid",
@@ -2185,8 +2184,8 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &minolduid,
-		.extra2		= &maxolduid,
+		.extra1		= (void *)&minolduid,
+		.extra2		= (void *)&maxolduid,
 	},
 #ifdef CONFIG_S390
 	{
@@ -2260,7 +2259,7 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &ten_thousand,
+		.extra2		= (void *)&ten_thousand,
 	},
 	{
 		.procname	= "printk_devkmsg",
@@ -2472,7 +2471,7 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= perf_event_max_stack_handler,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &six_hundred_forty_kb,
+		.extra2		= (void *)&six_hundred_forty_kb,
 	},
 	{
 		.procname	= "perf_event_max_contexts_per_stack",
@@ -2628,7 +2627,7 @@ static struct ctl_table vm_table[] = {
 		.maxlen		= sizeof(dirty_background_bytes),
 		.mode		= 0644,
 		.proc_handler	= dirty_background_bytes_handler,
-		.extra1		= &one_ul,
+		.extra1		= (void *)&one_ul,
 	},
 	{
 		.procname	= "dirty_ratio",
@@ -2645,7 +2644,7 @@ static struct ctl_table vm_table[] = {
 		.maxlen		= sizeof(vm_dirty_bytes),
 		.mode		= 0644,
 		.proc_handler	= dirty_bytes_handler,
-		.extra1		= &dirty_bytes_min,
+		.extra1		= (void *)&dirty_bytes_min,
 	},
 	{
 		.procname	= "dirty_writeback_centisecs",
@@ -2759,8 +2758,8 @@ static struct ctl_table vm_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &min_extfrag_threshold,
-		.extra2		= &max_extfrag_threshold,
+		.extra1		= (void *)&min_extfrag_threshold,
+		.extra2		= (void *)&max_extfrag_threshold,
 	},
 	{
 		.procname	= "compact_unevictable_allowed",
@@ -3046,8 +3045,8 @@ static struct ctl_table fs_table[] = {
 		.maxlen		= sizeof(files_stat.max_files),
 		.mode		= 0644,
 		.proc_handler	= proc_doulongvec_minmax,
-		.extra1		= &zero_ul,
-		.extra2		= &long_max,
+		.extra1		= (void *)&zero_ul,
+		.extra2		= (void *)&long_max,
 	},
 	{
 		.procname	= "nr_open",
@@ -3071,8 +3070,8 @@ static struct ctl_table fs_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &minolduid,
-		.extra2		= &maxolduid,
+		.extra1		= (void *)&minolduid,
+		.extra2		= (void *)&maxolduid,
 	},
 	{
 		.procname	= "overflowgid",
@@ -3080,8 +3079,8 @@ static struct ctl_table fs_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= &minolduid,
-		.extra2		= &maxolduid,
+		.extra1		= (void *)&minolduid,
+		.extra2		= (void *)&maxolduid,
 	},
 #ifdef CONFIG_FILE_LOCKING
 	{
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 7/9] sysctl: use SYSCTL_ZERO to replace some static int zero uses
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
                   ` (5 preceding siblings ...)
  2021-11-23 20:23 ` [PATCH v2 6/9] sysctl: use const for typically used max/min proc sysctls Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-23 20:23 ` [PATCH v2 8/9] aio: move aio sysctl to aio.c Luis Chamberlain
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

Use the variable SYSCTL_ZERO to replace some static int boundary
variables with a value of 0 (minolduid, min_extfrag_threshold,
min_wakeup_granularity_ns).

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kernel/sysctl.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 03bbd26d4df0..597ab5ad4879 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -122,7 +122,7 @@ static const unsigned long dirty_bytes_min = 2 * PAGE_SIZE;
 
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static const int maxolduid = 65535;
-static const int minolduid;
+/* minolduid is SYSCTL_ZERO */
 
 static const int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
@@ -170,7 +170,7 @@ int sysctl_legacy_va_layout;
 #endif
 
 #ifdef CONFIG_COMPACTION
-static const int min_extfrag_threshold;
+/* min_extfrag_threshold is SYSCTL_ZERO */;
 static const int max_extfrag_threshold = 1000;
 #endif
 
@@ -2175,7 +2175,7 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= (void *)&minolduid,
+		.extra1		= SYSCTL_ZERO,
 		.extra2		= (void *)&maxolduid,
 	},
 	{
@@ -2184,7 +2184,7 @@ static struct ctl_table kern_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= (void *)&minolduid,
+		.extra1		= SYSCTL_ZERO,
 		.extra2		= (void *)&maxolduid,
 	},
 #ifdef CONFIG_S390
@@ -2758,7 +2758,7 @@ static struct ctl_table vm_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= (void *)&min_extfrag_threshold,
+		.extra1		= SYSCTL_ZERO,
 		.extra2		= (void *)&max_extfrag_threshold,
 	},
 	{
@@ -3070,7 +3070,7 @@ static struct ctl_table fs_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= (void *)&minolduid,
+		.extra1		= SYSCTL_ZERO,
 		.extra2		= (void *)&maxolduid,
 	},
 	{
@@ -3079,7 +3079,7 @@ static struct ctl_table fs_table[] = {
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= (void *)&minolduid,
+		.extra1		= SYSCTL_ZERO,
 		.extra2		= (void *)&maxolduid,
 	},
 #ifdef CONFIG_FILE_LOCKING
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 8/9] aio: move aio sysctl to aio.c
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
                   ` (6 preceding siblings ...)
  2021-11-23 20:23 ` [PATCH v2 7/9] sysctl: use SYSCTL_ZERO to replace some static int zero uses Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-24  9:32   ` Jan Kara
  2021-11-23 20:23 ` [PATCH v2 9/9] dnotify: move dnotify sysctl to dnotify.c Luis Chamberlain
  2021-11-24  0:14 ` [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Andrew Morton
  9 siblings, 1 reply; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

The kernel/sysctl.c is a kitchen sink where everyone leaves
their dirty dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to
places where they actually belong. The proc sysctl maintainers
do not want to know what sysctl knobs you wish to add for your own
piece of code, we just care about the core logic.

Move aio sysctl to aio.c and use the new register_sysctl_init() to
register the sysctl interface for aio.

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
[mcgrof: adjust commit log to justify the move]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/aio.c            | 31 +++++++++++++++++++++++++++++--
 include/linux/aio.h |  4 ----
 kernel/sysctl.c     | 17 -----------------
 3 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 9c81cf611d65..83ef2341e73f 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -219,9 +219,35 @@ struct aio_kiocb {
 
 /*------ sysctl variables----*/
 static DEFINE_SPINLOCK(aio_nr_lock);
-unsigned long aio_nr;		/* current system wide number of aio requests */
-unsigned long aio_max_nr = 0x10000; /* system wide maximum number of aio requests */
+static unsigned long aio_nr;		/* current system wide number of aio requests */
+static unsigned long aio_max_nr = 0x10000; /* system wide maximum number of aio requests */
 /*----end sysctl variables---*/
+#ifdef CONFIG_SYSCTL
+static struct ctl_table aio_sysctls[] = {
+	{
+		.procname	= "aio-nr",
+		.data		= &aio_nr,
+		.maxlen		= sizeof(aio_nr),
+		.mode		= 0444,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
+	{
+		.procname	= "aio-max-nr",
+		.data		= &aio_max_nr,
+		.maxlen		= sizeof(aio_max_nr),
+		.mode		= 0644,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
+	{}
+};
+
+static void __init aio_sysctl_init(void)
+{
+	register_sysctl_init("fs", aio_sysctls);
+}
+#else
+#define aio_sysctl_init() do { } while (0)
+#endif
 
 static struct kmem_cache	*kiocb_cachep;
 static struct kmem_cache	*kioctx_cachep;
@@ -274,6 +300,7 @@ static int __init aio_setup(void)
 
 	kiocb_cachep = KMEM_CACHE(aio_kiocb, SLAB_HWCACHE_ALIGN|SLAB_PANIC);
 	kioctx_cachep = KMEM_CACHE(kioctx,SLAB_HWCACHE_ALIGN|SLAB_PANIC);
+	aio_sysctl_init();
 	return 0;
 }
 __initcall(aio_setup);
diff --git a/include/linux/aio.h b/include/linux/aio.h
index b83e68dd006f..86892a4fe7c8 100644
--- a/include/linux/aio.h
+++ b/include/linux/aio.h
@@ -20,8 +20,4 @@ static inline void kiocb_set_cancel_fn(struct kiocb *req,
 				       kiocb_cancel_fn *cancel) { }
 #endif /* CONFIG_AIO */
 
-/* for sysctl: */
-extern unsigned long aio_nr;
-extern unsigned long aio_max_nr;
-
 #endif /* __LINUX__AIO_H */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 597ab5ad4879..20326d67b814 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -20,7 +20,6 @@
  */
 
 #include <linux/module.h>
-#include <linux/aio.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
 #include <linux/slab.h>
@@ -3110,22 +3109,6 @@ static struct ctl_table fs_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
-#ifdef CONFIG_AIO
-	{
-		.procname	= "aio-nr",
-		.data		= &aio_nr,
-		.maxlen		= sizeof(aio_nr),
-		.mode		= 0444,
-		.proc_handler	= proc_doulongvec_minmax,
-	},
-	{
-		.procname	= "aio-max-nr",
-		.data		= &aio_max_nr,
-		.maxlen		= sizeof(aio_max_nr),
-		.mode		= 0644,
-		.proc_handler	= proc_doulongvec_minmax,
-	},
-#endif /* CONFIG_AIO */
 #ifdef CONFIG_INOTIFY_USER
 	{
 		.procname	= "inotify",
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 9/9] dnotify: move dnotify sysctl to dnotify.c
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
                   ` (7 preceding siblings ...)
  2021-11-23 20:23 ` [PATCH v2 8/9] aio: move aio sysctl to aio.c Luis Chamberlain
@ 2021-11-23 20:23 ` Luis Chamberlain
  2021-11-24  9:31   ` Jan Kara
  2021-11-24  0:14 ` [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Andrew Morton
  9 siblings, 1 reply; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-23 20:23 UTC (permalink / raw)
  To: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il
  Cc: linux-fsdevel, linux-kernel, Luis Chamberlain

From: Xiaoming Ni <nixiaoming@huawei.com>

The kernel/sysctl.c is a kitchen sink where everyone leaves
their dirty dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to
places where they actually belong. The proc sysctl maintainers
do not want to know what sysctl knobs you wish to add for your own
piece of code, we just care about the core logic.

So move dnotify sysctls to dnotify.c and use the new
register_sysctl_init() to register the sysctl interface.

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
[mcgrof: adjust the commit log to justify the move]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/notify/dnotify/dnotify.c | 21 ++++++++++++++++++++-
 include/linux/dnotify.h     |  1 -
 kernel/sysctl.c             | 10 ----------
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index e85e13c50d6d..2b04e2296fb6 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -19,7 +19,25 @@
 #include <linux/fdtable.h>
 #include <linux/fsnotify_backend.h>
 
-int dir_notify_enable __read_mostly = 1;
+static int dir_notify_enable __read_mostly = 1;
+#ifdef CONFIG_SYSCTL
+static struct ctl_table dnotify_sysctls[] = {
+	{
+		.procname	= "dir-notify-enable",
+		.data		= &dir_notify_enable,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+	{}
+};
+static void __init dnotify_sysctl_init(void)
+{
+	register_sysctl_init("fs", dnotify_sysctls);
+}
+#else
+#define dnotify_sysctl_init() do { } while (0)
+#endif
 
 static struct kmem_cache *dnotify_struct_cache __read_mostly;
 static struct kmem_cache *dnotify_mark_cache __read_mostly;
@@ -386,6 +404,7 @@ static int __init dnotify_init(void)
 	dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops);
 	if (IS_ERR(dnotify_group))
 		panic("unable to allocate fsnotify group for dnotify\n");
+	dnotify_sysctl_init();
 	return 0;
 }
 
diff --git a/include/linux/dnotify.h b/include/linux/dnotify.h
index 0aad774beaec..4f3b25d47436 100644
--- a/include/linux/dnotify.h
+++ b/include/linux/dnotify.h
@@ -29,7 +29,6 @@ struct dnotify_struct {
 			    FS_CREATE | FS_DN_RENAME |\
 			    FS_MOVED_FROM | FS_MOVED_TO)
 
-extern int dir_notify_enable;
 extern void dnotify_flush(struct file *, fl_owner_t);
 extern int fcntl_dirnotify(int, struct file *, unsigned long);
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 20326d67b814..7a90a12b9ea4 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -48,7 +48,6 @@
 #include <linux/times.h>
 #include <linux/limits.h>
 #include <linux/dcache.h>
-#include <linux/dnotify.h>
 #include <linux/syscalls.h>
 #include <linux/vmstat.h>
 #include <linux/nfs_fs.h>
@@ -3090,15 +3089,6 @@ static struct ctl_table fs_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
-#ifdef CONFIG_DNOTIFY
-	{
-		.procname	= "dir-notify-enable",
-		.data		= &dir_notify_enable,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
-	},
-#endif
 #ifdef CONFIG_MMU
 #ifdef CONFIG_FILE_LOCKING
 	{
-- 
2.33.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups
  2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
                   ` (8 preceding siblings ...)
  2021-11-23 20:23 ` [PATCH v2 9/9] dnotify: move dnotify sysctl to dnotify.c Luis Chamberlain
@ 2021-11-24  0:14 ` Andrew Morton
  2021-11-24  0:27   ` Luis Chamberlain
  9 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2021-11-24  0:14 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh, pjt,
	liu.hailong6, andriy.shevchenko, sre, penguin-kernel, pmladek,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On Tue, 23 Nov 2021 12:23:38 -0800 Luis Chamberlain <mcgrof@kernel.org> wrote:

> Since the sysctls are all over the place I can either put up a tree
> to keep track of these changes and later send a pull request to Linus
> or we can have them trickle into Andrew's tree. Let me know what folks
> prefer.

I grabbed.  I staged them behind all-of-linux-next, to get visibility
into changes in the various trees which might conflict.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups
  2021-11-24  0:14 ` [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Andrew Morton
@ 2021-11-24  0:27   ` Luis Chamberlain
  0 siblings, 0 replies; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-24  0:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh, pjt,
	liu.hailong6, andriy.shevchenko, sre, penguin-kernel, pmladek,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On Tue, Nov 23, 2021 at 04:14:26PM -0800, Andrew Morton wrote:
> On Tue, 23 Nov 2021 12:23:38 -0800 Luis Chamberlain <mcgrof@kernel.org> wrote:
> 
> > Since the sysctls are all over the place I can either put up a tree
> > to keep track of these changes and later send a pull request to Linus
> > or we can have them trickle into Andrew's tree. Let me know what folks
> > prefer.
> 
> I grabbed.  I staged them behind all-of-linux-next, to get visibility
> into changes in the various trees which might conflict.

Groovy thanks, if nothing barks back next week I can send out a third
set of changes. I'll note that I've been testing these with 0-day prior
to posting sets.

  Luis

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals
  2021-11-23 20:23 ` [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals Luis Chamberlain
@ 2021-11-24  4:51   ` Eric W. Biederman
  2021-11-24  7:05     ` Xiaoming Ni
  0 siblings, 1 reply; 22+ messages in thread
From: Eric W. Biederman @ 2021-11-24  4:51 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, keescook, yzaikin, nixiaoming, peterz, gregkh, pjt,
	liu.hailong6, andriy.shevchenko, sre, penguin-kernel, pmladek,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

Luis Chamberlain <mcgrof@kernel.org> writes:

> From: Xiaoming Ni <nixiaoming@huawei.com>
>
> sysctl has helpers which let us specify boundary values for a min or
> max int value. Since these are used for a boundary check only they don't
> change, so move these variables to sysctl_vals to avoid adding duplicate
> variables. This will help with our cleanup of kernel/sysctl.c.

Ouch.

I kind of, sort of, have to protest.

Where the macros that use sysctl_vals don't have a type they have caused
mysterious code breakage because people did not realize they can not be
used with sysctls that take a long instead of an int.

This came up with when the internal storage for ucounts see
kernel/ucount.c changed from an int to a long.  We were quite a while
tracking what was going on until we realized that the code could not use
SYSCTL_ZERO and SYSCTL_INT_MAX and that we had to defined our own that
were long.

So before we extend something like this can we please change the
macro naming convention so that it includes the size of the type
we want.


I am also not a fan of sysctl_vals living in proc_sysctl.  They
have nothing to do with the code in that file.  They would do much
better in kernel/sysctl.c close to the helpers that use them.

Eric


> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> [mcgrof: major rebase]
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  fs/proc/proc_sysctl.c  |  2 +-
>  include/linux/sysctl.h | 12 +++++++++---
>  kernel/sysctl.c        | 44 ++++++++++++++++++------------------------
>  3 files changed, 29 insertions(+), 29 deletions(-)
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index b4950843d90a..6d462644bb00 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -26,7 +26,7 @@ static const struct file_operations proc_sys_dir_file_operations;
>  static const struct inode_operations proc_sys_dir_operations;
>  
>  /* shared constants to be used in various sysctls */
> -const int sysctl_vals[] = { 0, 1, INT_MAX };
> +const int sysctl_vals[] = { -1, 0, 1, 2, 4, 100, 200, 1000, INT_MAX };
>  EXPORT_SYMBOL(sysctl_vals);
>  
>  /* Support for permanently empty directories */
> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
> index d3ab7969b6b5..718492057c70 100644
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -38,9 +38,15 @@ struct ctl_table_header;
>  struct ctl_dir;
>  
>  /* Keep the same order as in fs/proc/proc_sysctl.c */
> -#define SYSCTL_ZERO	((void *)&sysctl_vals[0])
> -#define SYSCTL_ONE	((void *)&sysctl_vals[1])
> -#define SYSCTL_INT_MAX	((void *)&sysctl_vals[2])
> +#define SYSCTL_NEG_ONE			((void *)&sysctl_vals[0])
> +#define SYSCTL_ZERO			((void *)&sysctl_vals[1])
> +#define SYSCTL_ONE			((void *)&sysctl_vals[2])
> +#define SYSCTL_TWO			((void *)&sysctl_vals[3])
> +#define SYSCTL_FOUR			((void *)&sysctl_vals[4])
> +#define SYSCTL_ONE_HUNDRED		((void *)&sysctl_vals[5])
> +#define SYSCTL_TWO_HUNDRED		((void *)&sysctl_vals[6])
> +#define SYSCTL_ONE_THOUSAND		((void *)&sysctl_vals[7])
> +#define SYSCTL_INT_MAX			((void *)&sysctl_vals[8])
>  
>  extern const int sysctl_vals[];
>  
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 857c1ccad9e8..3097f0286504 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -113,15 +113,9 @@
>  static int sixty = 60;
>  #endif
>  
> -static int __maybe_unused neg_one = -1;
> -static int __maybe_unused two = 2;
> -static int __maybe_unused four = 4;
>  static unsigned long zero_ul;
>  static unsigned long one_ul = 1;
>  static unsigned long long_max = LONG_MAX;
> -static int one_hundred = 100;
> -static int two_hundred = 200;
> -static int one_thousand = 1000;
>  #ifdef CONFIG_PRINTK
>  static int ten_thousand = 10000;
>  #endif
> @@ -1962,7 +1956,7 @@ static struct ctl_table kern_table[] = {
>  		.maxlen		= sizeof(int),
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax,
> -		.extra1		= &neg_one,
> +		.extra1		= SYSCTL_NEG_ONE,
>  		.extra2		= SYSCTL_ONE,
>  	},
>  #endif
> @@ -2304,7 +2298,7 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax_sysadmin,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two,
> +		.extra2		= SYSCTL_TWO,
>  	},
>  #endif
>  	{
> @@ -2564,7 +2558,7 @@ static struct ctl_table kern_table[] = {
>  		.maxlen		= sizeof(int),
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax,
> -		.extra1		= &neg_one,
> +		.extra1		= SYSCTL_NEG_ONE,
>  	},
>  #endif
>  #ifdef CONFIG_RT_MUTEXES
> @@ -2626,7 +2620,7 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= perf_cpu_time_max_percent_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &one_hundred,
> +		.extra2		= SYSCTL_ONE_HUNDRED,
>  	},
>  	{
>  		.procname	= "perf_event_max_stack",
> @@ -2644,7 +2638,7 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= perf_event_max_stack_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &one_thousand,
> +		.extra2		= SYSCTL_ONE_THOUSAND,
>  	},
>  #endif
>  	{
> @@ -2675,7 +2669,7 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= bpf_unpriv_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two,
> +		.extra2		= SYSCTL_TWO,
>  	},
>  	{
>  		.procname	= "bpf_stats_enabled",
> @@ -2729,7 +2723,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= overcommit_policy_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two,
> +		.extra2		= SYSCTL_TWO,
>  	},
>  	{
>  		.procname	= "panic_on_oom",
> @@ -2738,7 +2732,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two,
> +		.extra2		= SYSCTL_TWO,
>  	},
>  	{
>  		.procname	= "oom_kill_allocating_task",
> @@ -2783,7 +2777,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= dirty_background_ratio_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &one_hundred,
> +		.extra2		= SYSCTL_ONE_HUNDRED,
>  	},
>  	{
>  		.procname	= "dirty_background_bytes",
> @@ -2800,7 +2794,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= dirty_ratio_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &one_hundred,
> +		.extra2		= SYSCTL_ONE_HUNDRED,
>  	},
>  	{
>  		.procname	= "dirty_bytes",
> @@ -2840,7 +2834,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two_hundred,
> +		.extra2		= SYSCTL_TWO_HUNDRED,
>  	},
>  #ifdef CONFIG_HUGETLB_PAGE
>  	{
> @@ -2897,7 +2891,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0200,
>  		.proc_handler	= drop_caches_sysctl_handler,
>  		.extra1		= SYSCTL_ONE,
> -		.extra2		= &four,
> +		.extra2		= SYSCTL_FOUR,
>  	},
>  #ifdef CONFIG_COMPACTION
>  	{
> @@ -2914,7 +2908,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= compaction_proactiveness_sysctl_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &one_hundred,
> +		.extra2		= SYSCTL_ONE_HUNDRED,
>  	},
>  	{
>  		.procname	= "extfrag_threshold",
> @@ -2959,7 +2953,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= watermark_scale_factor_sysctl_handler,
>  		.extra1		= SYSCTL_ONE,
> -		.extra2		= &one_thousand,
> +		.extra2		= SYSCTL_ONE_THOUSAND,
>  	},
>  	{
>  		.procname	= "percpu_pagelist_high_fraction",
> @@ -3038,7 +3032,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= sysctl_min_unmapped_ratio_sysctl_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &one_hundred,
> +		.extra2		= SYSCTL_ONE_HUNDRED,
>  	},
>  	{
>  		.procname	= "min_slab_ratio",
> @@ -3047,7 +3041,7 @@ static struct ctl_table vm_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= sysctl_min_slab_ratio_sysctl_handler,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &one_hundred,
> +		.extra2		= SYSCTL_ONE_HUNDRED,
>  	},
>  #endif
>  #ifdef CONFIG_SMP
> @@ -3337,7 +3331,7 @@ static struct ctl_table fs_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two,
> +		.extra2		= SYSCTL_TWO,
>  	},
>  	{
>  		.procname	= "protected_regular",
> @@ -3346,7 +3340,7 @@ static struct ctl_table fs_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two,
> +		.extra2		= SYSCTL_TWO,
>  	},
>  	{
>  		.procname	= "suid_dumpable",
> @@ -3355,7 +3349,7 @@ static struct ctl_table fs_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax_coredump,
>  		.extra1		= SYSCTL_ZERO,
> -		.extra2		= &two,
> +		.extra2		= SYSCTL_TWO,
>  	},
>  #if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
>  	{

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals
  2021-11-24  4:51   ` Eric W. Biederman
@ 2021-11-24  7:05     ` Xiaoming Ni
  2021-11-24 17:38       ` Eric W. Biederman
  0 siblings, 1 reply; 22+ messages in thread
From: Xiaoming Ni @ 2021-11-24  7:05 UTC (permalink / raw)
  To: Eric W. Biederman, Luis Chamberlain
  Cc: akpm, keescook, yzaikin, peterz, gregkh, pjt, liu.hailong6,
	andriy.shevchenko, sre, penguin-kernel, pmladek, senozhatsky,
	wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On 2021/11/24 12:51, Eric W. Biederman wrote:
> Luis Chamberlain <mcgrof@kernel.org> writes:
> 
>> From: Xiaoming Ni <nixiaoming@huawei.com>
>>
>> sysctl has helpers which let us specify boundary values for a min or
>> max int value. Since these are used for a boundary check only they don't
>> change, so move these variables to sysctl_vals to avoid adding duplicate
>> variables. This will help with our cleanup of kernel/sysctl.c.
> 
> Ouch.
> 
> I kind of, sort of, have to protest.
> 
> Where the macros that use sysctl_vals don't have a type they have caused
> mysterious code breakage because people did not realize they can not be
> used with sysctls that take a long instead of an int.
> 
> This came up with when the internal storage for ucounts see
> kernel/ucount.c changed from an int to a long.  We were quite a while
> tracking what was going on until we realized that the code could not use
> SYSCTL_ZERO and SYSCTL_INT_MAX and that we had to defined our own thatSYSCTL_ZERO and SYSCTL_ZERO involve dozens of files and are used in hundreds of places.
> were long.
> 
static unsigned long zero_ul;
static unsigned long one_ul = 1;
static unsigned long long_max = LONG_MAX;
EXPORT_SYMBOL(proc_doulongvec_minmax);

Yes, min/max of type unsigned long is used in multiple sysctl 
interfaces. It is necessary to add an unsigned long sysctl_val array to 
avoid repeated definitions in different .c files.

> So before we extend something like this can we please change the
> macro naming convention so that it includes the size of the type
> we want.
>
The int type is the most widely used type. By default, numeric constants 
are also of the int type. SYSCTL_ZERO and SYSCTL_ZERO involve dozens of 
files and are used in hundreds of places. Whether only non-int macros 
need to be named with their type size?

> 
> I am also not a fan of sysctl_vals living in proc_sysctl.  They
> have nothing to do with the code in that file.  They would do much
> better in kernel/sysctl.c close to the helpers that use them.
> 
yes

Thanks
Xiaoming Ni


> Eric
> 
> 
>> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
>> Reviewed-by: Kees Cook <keescook@chromium.org>
>> [mcgrof: major rebase]
>> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
>> ---
>>   fs/proc/proc_sysctl.c  |  2 +-
>>   include/linux/sysctl.h | 12 +++++++++---
>>   kernel/sysctl.c        | 44 ++++++++++++++++++------------------------
>>   3 files changed, 29 insertions(+), 29 deletions(-)
>>
>> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
>> index b4950843d90a..6d462644bb00 100644
>> --- a/fs/proc/proc_sysctl.c
>> +++ b/fs/proc/proc_sysctl.c
>> @@ -26,7 +26,7 @@ static const struct file_operations proc_sys_dir_file_operations;
>>   static const struct inode_operations proc_sys_dir_operations;
>>   
>>   /* shared constants to be used in various sysctls */
>> -const int sysctl_vals[] = { 0, 1, INT_MAX };
>> +const int sysctl_vals[] = { -1, 0, 1, 2, 4, 100, 200, 1000, INT_MAX };
>>   EXPORT_SYMBOL(sysctl_vals);
>>   
>>   /* Support for permanently empty directories */
>> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
>> index d3ab7969b6b5..718492057c70 100644
>> --- a/include/linux/sysctl.h
>> +++ b/include/linux/sysctl.h
>> @@ -38,9 +38,15 @@ struct ctl_table_header;
>>   struct ctl_dir;
>>   
>>   /* Keep the same order as in fs/proc/proc_sysctl.c */
>> -#define SYSCTL_ZERO	((void *)&sysctl_vals[0])
>> -#define SYSCTL_ONE	((void *)&sysctl_vals[1])
>> -#define SYSCTL_INT_MAX	((void *)&sysctl_vals[2])
>> +#define SYSCTL_NEG_ONE			((void *)&sysctl_vals[0])
>> +#define SYSCTL_ZERO			((void *)&sysctl_vals[1])
>> +#define SYSCTL_ONE			((void *)&sysctl_vals[2])
>> +#define SYSCTL_TWO			((void *)&sysctl_vals[3])
>> +#define SYSCTL_FOUR			((void *)&sysctl_vals[4])
>> +#define SYSCTL_ONE_HUNDRED		((void *)&sysctl_vals[5])
>> +#define SYSCTL_TWO_HUNDRED		((void *)&sysctl_vals[6])
>> +#define SYSCTL_ONE_THOUSAND		((void *)&sysctl_vals[7])
>> +#define SYSCTL_INT_MAX			((void *)&sysctl_vals[8])
>>   
>>   extern const int sysctl_vals[];
>>   
>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index 857c1ccad9e8..3097f0286504 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -113,15 +113,9 @@
>>   static int sixty = 60;
>>   #endif
>>   
>> -static int __maybe_unused neg_one = -1;
>> -static int __maybe_unused two = 2;
>> -static int __maybe_unused four = 4;
>>   static unsigned long zero_ul;
>>   static unsigned long one_ul = 1;
>>   static unsigned long long_max = LONG_MAX;
>> -static int one_hundred = 100;
>> -static int two_hundred = 200;
>> -static int one_thousand = 1000;
>>   #ifdef CONFIG_PRINTK
>>   static int ten_thousand = 10000;
>>   #endif
>> @@ -1962,7 +1956,7 @@ static struct ctl_table kern_table[] = {
>>   		.maxlen		= sizeof(int),
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax,
>> -		.extra1		= &neg_one,
>> +		.extra1		= SYSCTL_NEG_ONE,
>>   		.extra2		= SYSCTL_ONE,
>>   	},
>>   #endif
>> @@ -2304,7 +2298,7 @@ static struct ctl_table kern_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax_sysadmin,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two,
>> +		.extra2		= SYSCTL_TWO,
>>   	},
>>   #endif
>>   	{
>> @@ -2564,7 +2558,7 @@ static struct ctl_table kern_table[] = {
>>   		.maxlen		= sizeof(int),
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax,
>> -		.extra1		= &neg_one,
>> +		.extra1		= SYSCTL_NEG_ONE,
>>   	},
>>   #endif
>>   #ifdef CONFIG_RT_MUTEXES
>> @@ -2626,7 +2620,7 @@ static struct ctl_table kern_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= perf_cpu_time_max_percent_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &one_hundred,
>> +		.extra2		= SYSCTL_ONE_HUNDRED,
>>   	},
>>   	{
>>   		.procname	= "perf_event_max_stack",
>> @@ -2644,7 +2638,7 @@ static struct ctl_table kern_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= perf_event_max_stack_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &one_thousand,
>> +		.extra2		= SYSCTL_ONE_THOUSAND,
>>   	},
>>   #endif
>>   	{
>> @@ -2675,7 +2669,7 @@ static struct ctl_table kern_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= bpf_unpriv_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two,
>> +		.extra2		= SYSCTL_TWO,
>>   	},
>>   	{
>>   		.procname	= "bpf_stats_enabled",
>> @@ -2729,7 +2723,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= overcommit_policy_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two,
>> +		.extra2		= SYSCTL_TWO,
>>   	},
>>   	{
>>   		.procname	= "panic_on_oom",
>> @@ -2738,7 +2732,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two,
>> +		.extra2		= SYSCTL_TWO,
>>   	},
>>   	{
>>   		.procname	= "oom_kill_allocating_task",
>> @@ -2783,7 +2777,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= dirty_background_ratio_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &one_hundred,
>> +		.extra2		= SYSCTL_ONE_HUNDRED,
>>   	},
>>   	{
>>   		.procname	= "dirty_background_bytes",
>> @@ -2800,7 +2794,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= dirty_ratio_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &one_hundred,
>> +		.extra2		= SYSCTL_ONE_HUNDRED,
>>   	},
>>   	{
>>   		.procname	= "dirty_bytes",
>> @@ -2840,7 +2834,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two_hundred,
>> +		.extra2		= SYSCTL_TWO_HUNDRED,
>>   	},
>>   #ifdef CONFIG_HUGETLB_PAGE
>>   	{
>> @@ -2897,7 +2891,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0200,
>>   		.proc_handler	= drop_caches_sysctl_handler,
>>   		.extra1		= SYSCTL_ONE,
>> -		.extra2		= &four,
>> +		.extra2		= SYSCTL_FOUR,
>>   	},
>>   #ifdef CONFIG_COMPACTION
>>   	{
>> @@ -2914,7 +2908,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= compaction_proactiveness_sysctl_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &one_hundred,
>> +		.extra2		= SYSCTL_ONE_HUNDRED,
>>   	},
>>   	{
>>   		.procname	= "extfrag_threshold",
>> @@ -2959,7 +2953,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= watermark_scale_factor_sysctl_handler,
>>   		.extra1		= SYSCTL_ONE,
>> -		.extra2		= &one_thousand,
>> +		.extra2		= SYSCTL_ONE_THOUSAND,
>>   	},
>>   	{
>>   		.procname	= "percpu_pagelist_high_fraction",
>> @@ -3038,7 +3032,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= sysctl_min_unmapped_ratio_sysctl_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &one_hundred,
>> +		.extra2		= SYSCTL_ONE_HUNDRED,
>>   	},
>>   	{
>>   		.procname	= "min_slab_ratio",
>> @@ -3047,7 +3041,7 @@ static struct ctl_table vm_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= sysctl_min_slab_ratio_sysctl_handler,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &one_hundred,
>> +		.extra2		= SYSCTL_ONE_HUNDRED,
>>   	},
>>   #endif
>>   #ifdef CONFIG_SMP
>> @@ -3337,7 +3331,7 @@ static struct ctl_table fs_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two,
>> +		.extra2		= SYSCTL_TWO,
>>   	},
>>   	{
>>   		.procname	= "protected_regular",
>> @@ -3346,7 +3340,7 @@ static struct ctl_table fs_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two,
>> +		.extra2		= SYSCTL_TWO,
>>   	},
>>   	{
>>   		.procname	= "suid_dumpable",
>> @@ -3355,7 +3349,7 @@ static struct ctl_table fs_table[] = {
>>   		.mode		= 0644,
>>   		.proc_handler	= proc_dointvec_minmax_coredump,
>>   		.extra1		= SYSCTL_ZERO,
>> -		.extra2		= &two,
>> +		.extra2		= SYSCTL_TWO,
>>   	},
>>   #if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
>>   	{
> .
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 9/9] dnotify: move dnotify sysctl to dnotify.c
  2021-11-23 20:23 ` [PATCH v2 9/9] dnotify: move dnotify sysctl to dnotify.c Luis Chamberlain
@ 2021-11-24  9:31   ` Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2021-11-24  9:31 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il,
	linux-fsdevel, linux-kernel

On Tue 23-11-21 12:23:47, Luis Chamberlain wrote:
> From: Xiaoming Ni <nixiaoming@huawei.com>
> 
> The kernel/sysctl.c is a kitchen sink where everyone leaves
> their dirty dishes, this makes it very difficult to maintain.
> 
> To help with this maintenance let's start by moving sysctls to
> places where they actually belong. The proc sysctl maintainers
> do not want to know what sysctl knobs you wish to add for your own
> piece of code, we just care about the core logic.
> 
> So move dnotify sysctls to dnotify.c and use the new
> register_sysctl_init() to register the sysctl interface.
> 
> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
> [mcgrof: adjust the commit log to justify the move]
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Looks sane. Feel free to add:

Acked-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/notify/dnotify/dnotify.c | 21 ++++++++++++++++++++-
>  include/linux/dnotify.h     |  1 -
>  kernel/sysctl.c             | 10 ----------
>  3 files changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
> index e85e13c50d6d..2b04e2296fb6 100644
> --- a/fs/notify/dnotify/dnotify.c
> +++ b/fs/notify/dnotify/dnotify.c
> @@ -19,7 +19,25 @@
>  #include <linux/fdtable.h>
>  #include <linux/fsnotify_backend.h>
>  
> -int dir_notify_enable __read_mostly = 1;
> +static int dir_notify_enable __read_mostly = 1;
> +#ifdef CONFIG_SYSCTL
> +static struct ctl_table dnotify_sysctls[] = {
> +	{
> +		.procname	= "dir-notify-enable",
> +		.data		= &dir_notify_enable,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec,
> +	},
> +	{}
> +};
> +static void __init dnotify_sysctl_init(void)
> +{
> +	register_sysctl_init("fs", dnotify_sysctls);
> +}
> +#else
> +#define dnotify_sysctl_init() do { } while (0)
> +#endif
>  
>  static struct kmem_cache *dnotify_struct_cache __read_mostly;
>  static struct kmem_cache *dnotify_mark_cache __read_mostly;
> @@ -386,6 +404,7 @@ static int __init dnotify_init(void)
>  	dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops);
>  	if (IS_ERR(dnotify_group))
>  		panic("unable to allocate fsnotify group for dnotify\n");
> +	dnotify_sysctl_init();
>  	return 0;
>  }
>  
> diff --git a/include/linux/dnotify.h b/include/linux/dnotify.h
> index 0aad774beaec..4f3b25d47436 100644
> --- a/include/linux/dnotify.h
> +++ b/include/linux/dnotify.h
> @@ -29,7 +29,6 @@ struct dnotify_struct {
>  			    FS_CREATE | FS_DN_RENAME |\
>  			    FS_MOVED_FROM | FS_MOVED_TO)
>  
> -extern int dir_notify_enable;
>  extern void dnotify_flush(struct file *, fl_owner_t);
>  extern int fcntl_dirnotify(int, struct file *, unsigned long);
>  
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 20326d67b814..7a90a12b9ea4 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -48,7 +48,6 @@
>  #include <linux/times.h>
>  #include <linux/limits.h>
>  #include <linux/dcache.h>
> -#include <linux/dnotify.h>
>  #include <linux/syscalls.h>
>  #include <linux/vmstat.h>
>  #include <linux/nfs_fs.h>
> @@ -3090,15 +3089,6 @@ static struct ctl_table fs_table[] = {
>  		.proc_handler	= proc_dointvec,
>  	},
>  #endif
> -#ifdef CONFIG_DNOTIFY
> -	{
> -		.procname	= "dir-notify-enable",
> -		.data		= &dir_notify_enable,
> -		.maxlen		= sizeof(int),
> -		.mode		= 0644,
> -		.proc_handler	= proc_dointvec,
> -	},
> -#endif
>  #ifdef CONFIG_MMU
>  #ifdef CONFIG_FILE_LOCKING
>  	{
> -- 
> 2.33.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/9] aio: move aio sysctl to aio.c
  2021-11-23 20:23 ` [PATCH v2 8/9] aio: move aio sysctl to aio.c Luis Chamberlain
@ 2021-11-24  9:32   ` Jan Kara
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2021-11-24  9:32 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	pmladek, senozhatsky, wangqing, bcrl, viro, jack, amir73il,
	linux-fsdevel, linux-kernel

On Tue 23-11-21 12:23:46, Luis Chamberlain wrote:
> From: Xiaoming Ni <nixiaoming@huawei.com>
> 
> The kernel/sysctl.c is a kitchen sink where everyone leaves
> their dirty dishes, this makes it very difficult to maintain.
> 
> To help with this maintenance let's start by moving sysctls to
> places where they actually belong. The proc sysctl maintainers
> do not want to know what sysctl knobs you wish to add for your own
> piece of code, we just care about the core logic.
> 
> Move aio sysctl to aio.c and use the new register_sysctl_init() to
> register the sysctl interface for aio.
> 
> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
> [mcgrof: adjust commit log to justify the move]
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Makes sense. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/aio.c            | 31 +++++++++++++++++++++++++++++--
>  include/linux/aio.h |  4 ----
>  kernel/sysctl.c     | 17 -----------------
>  3 files changed, 29 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index 9c81cf611d65..83ef2341e73f 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -219,9 +219,35 @@ struct aio_kiocb {
>  
>  /*------ sysctl variables----*/
>  static DEFINE_SPINLOCK(aio_nr_lock);
> -unsigned long aio_nr;		/* current system wide number of aio requests */
> -unsigned long aio_max_nr = 0x10000; /* system wide maximum number of aio requests */
> +static unsigned long aio_nr;		/* current system wide number of aio requests */
> +static unsigned long aio_max_nr = 0x10000; /* system wide maximum number of aio requests */
>  /*----end sysctl variables---*/
> +#ifdef CONFIG_SYSCTL
> +static struct ctl_table aio_sysctls[] = {
> +	{
> +		.procname	= "aio-nr",
> +		.data		= &aio_nr,
> +		.maxlen		= sizeof(aio_nr),
> +		.mode		= 0444,
> +		.proc_handler	= proc_doulongvec_minmax,
> +	},
> +	{
> +		.procname	= "aio-max-nr",
> +		.data		= &aio_max_nr,
> +		.maxlen		= sizeof(aio_max_nr),
> +		.mode		= 0644,
> +		.proc_handler	= proc_doulongvec_minmax,
> +	},
> +	{}
> +};
> +
> +static void __init aio_sysctl_init(void)
> +{
> +	register_sysctl_init("fs", aio_sysctls);
> +}
> +#else
> +#define aio_sysctl_init() do { } while (0)
> +#endif
>  
>  static struct kmem_cache	*kiocb_cachep;
>  static struct kmem_cache	*kioctx_cachep;
> @@ -274,6 +300,7 @@ static int __init aio_setup(void)
>  
>  	kiocb_cachep = KMEM_CACHE(aio_kiocb, SLAB_HWCACHE_ALIGN|SLAB_PANIC);
>  	kioctx_cachep = KMEM_CACHE(kioctx,SLAB_HWCACHE_ALIGN|SLAB_PANIC);
> +	aio_sysctl_init();
>  	return 0;
>  }
>  __initcall(aio_setup);
> diff --git a/include/linux/aio.h b/include/linux/aio.h
> index b83e68dd006f..86892a4fe7c8 100644
> --- a/include/linux/aio.h
> +++ b/include/linux/aio.h
> @@ -20,8 +20,4 @@ static inline void kiocb_set_cancel_fn(struct kiocb *req,
>  				       kiocb_cancel_fn *cancel) { }
>  #endif /* CONFIG_AIO */
>  
> -/* for sysctl: */
> -extern unsigned long aio_nr;
> -extern unsigned long aio_max_nr;
> -
>  #endif /* __LINUX__AIO_H */
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 597ab5ad4879..20326d67b814 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -20,7 +20,6 @@
>   */
>  
>  #include <linux/module.h>
> -#include <linux/aio.h>
>  #include <linux/mm.h>
>  #include <linux/swap.h>
>  #include <linux/slab.h>
> @@ -3110,22 +3109,6 @@ static struct ctl_table fs_table[] = {
>  		.proc_handler	= proc_dointvec,
>  	},
>  #endif
> -#ifdef CONFIG_AIO
> -	{
> -		.procname	= "aio-nr",
> -		.data		= &aio_nr,
> -		.maxlen		= sizeof(aio_nr),
> -		.mode		= 0444,
> -		.proc_handler	= proc_doulongvec_minmax,
> -	},
> -	{
> -		.procname	= "aio-max-nr",
> -		.data		= &aio_max_nr,
> -		.maxlen		= sizeof(aio_max_nr),
> -		.mode		= 0644,
> -		.proc_handler	= proc_doulongvec_minmax,
> -	},
> -#endif /* CONFIG_AIO */
>  #ifdef CONFIG_INOTIFY_USER
>  	{
>  		.procname	= "inotify",
> -- 
> 2.33.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals
  2021-11-24  7:05     ` Xiaoming Ni
@ 2021-11-24 17:38       ` Eric W. Biederman
  2021-11-24 23:12         ` Luis Chamberlain
  0 siblings, 1 reply; 22+ messages in thread
From: Eric W. Biederman @ 2021-11-24 17:38 UTC (permalink / raw)
  To: Xiaoming Ni
  Cc: Luis Chamberlain, akpm, keescook, yzaikin, peterz, gregkh, pjt,
	liu.hailong6, andriy.shevchenko, sre, penguin-kernel, pmladek,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

Xiaoming Ni <nixiaoming@huawei.com> writes:

> On 2021/11/24 12:51, Eric W. Biederman wrote:
>> Luis Chamberlain <mcgrof@kernel.org> writes:
>>
>>> From: Xiaoming Ni <nixiaoming@huawei.com>
>>>
>>> sysctl has helpers which let us specify boundary values for a min or
>>> max int value. Since these are used for a boundary check only they don't
>>> change, so move these variables to sysctl_vals to avoid adding duplicate
>>> variables. This will help with our cleanup of kernel/sysctl.c.
>>
>> Ouch.
>>
>> I kind of, sort of, have to protest.
>>
>> Where the macros that use sysctl_vals don't have a type they have caused
>> mysterious code breakage because people did not realize they can not be
>> used with sysctls that take a long instead of an int.
>>
>> This came up with when the internal storage for ucounts see
>> kernel/ucount.c changed from an int to a long.  We were quite a while
>> tracking what was going on until we realized that the code could not use
>> SYSCTL_ZERO and SYSCTL_INT_MAX and that we had to defined our own thatSYSCTL_ZERO and SYSCTL_ZERO involve dozens of files and are used in hundreds of places.
>> were long.
>>
> static unsigned long zero_ul;
> static unsigned long one_ul = 1;
> static unsigned long long_max = LONG_MAX;
> EXPORT_SYMBOL(proc_doulongvec_minmax);
>
> Yes, min/max of type unsigned long is used in multiple sysctl
> interfaces. It is necessary to add an unsigned long sysctl_val array
> to avoid repeated definitions in different .c files.
>
>> So before we extend something like this can we please change the
>> macro naming convention so that it includes the size of the type
>> we want.
>>
> The int type is the most widely used type. By default, numeric
> constants are also of the int type. SYSCTL_ZERO and SYSCTL_ZERO
> involve dozens of files and are used in hundreds of places. Whether
> only non-int macros need to be named with their type size?
>
>>
>> I am also not a fan of sysctl_vals living in proc_sysctl.  They
>> have nothing to do with the code in that file.  They would do much
>> better in kernel/sysctl.c close to the helpers that use them.
>>
> yes
>

Looking a little more.  I think it makes sense to do something like:

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 1fa2b69c6fc3..c299009421ea 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -121,8 +121,8 @@ struct ctl_table {
        struct ctl_table *child;        /* Deprecated */
        proc_handler *proc_handler;     /* Callback for text formatting */
        struct ctl_table_poll *poll;
-       void *extra1;
-       void *extra2;
+       long min;
+       long max;
 } __randomize_layout;

Nearly every use of .extra1 and .extra2 are for min and max values.
A long takes the same storage as a void * parameter.

So it would be a net saving in storage as you don't have separate
storage for the values anywhere.

There are a few cases where .extra1 is used for something else
so keeping a "void *extra" field will probably be needed.

By finishing the removal of the child field adding a "void *extra"
field can be done at no storage cost.

Having the min and max parameters in the structure has the major
advantage that there is no redirection, and no fancy games.  People
can just read the value from the structure initializer.  Plus a
conversion from int to long won't requiring changing the min and
max constants.

So really I think instead of doubling down on the error prone case
that we have and extending sysctl_vals we should just get rid of
it entirely.

It is a bit more work to make that change but the long term result is
much better.

Any chance we can do that for a cleanup instead of extending sysctl_vals?

Eric

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals
  2021-11-24 17:38       ` Eric W. Biederman
@ 2021-11-24 23:12         ` Luis Chamberlain
  0 siblings, 0 replies; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-24 23:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Xiaoming Ni, akpm, keescook, yzaikin, peterz, gregkh, pjt,
	liu.hailong6, andriy.shevchenko, sre, penguin-kernel, pmladek,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On Wed, Nov 24, 2021 at 11:38:19AM -0600, Eric W. Biederman wrote:
> Looking a little more.  I think it makes sense to do something like:
> 
> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
> index 1fa2b69c6fc3..c299009421ea 100644
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -121,8 +121,8 @@ struct ctl_table {
>         struct ctl_table *child;        /* Deprecated */
>         proc_handler *proc_handler;     /* Callback for text formatting */
>         struct ctl_table_poll *poll;
> -       void *extra1;
> -       void *extra2;
> +       long min;
> +       long max;
>  } __randomize_layout;
> 
> Any chance we can do that for a cleanup instead of extending sysctl_vals?

Essentially the change is *big*. We either do that *not yet implemented*
change *now, and then have to rebase all the non-upstream work to adapt
to that, or we do that after the cleanup.

Both I think are going to cause some suffering. I'd prefer to do it
after though. Would you be OK with that?

  Luis

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface
  2021-11-23 20:23 ` [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface Luis Chamberlain
@ 2021-11-25 16:14   ` Petr Mladek
  2021-11-29 21:01     ` Luis Chamberlain
  0 siblings, 1 reply; 22+ messages in thread
From: Petr Mladek @ 2021-11-25 16:14 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On Tue 2021-11-23 12:23:39, Luis Chamberlain wrote:
> From: Xiaoming Ni <nixiaoming@huawei.com>
> 
> The kernel/sysctl.c is a kitchen sink where everyone leaves
> their dirty dishes, this makes it very difficult to maintain.
> 
> To help with this maintenance let's start by moving sysctls to
> places where they actually belong. The proc sysctl maintainers
> do not want to know what sysctl knobs you wish to add for your own
> piece of code, we just care about the core logic.
> 
> Today though folks heavily rely on tables on kernel/sysctl.c so
> they can easily just extend this table with their needed sysctls.
> In order to help users move their sysctls out we need to provide a
> helper which can be used during code initialization.
> 
> We special-case the initialization use of register_sysctl() since
> it *is* safe to fail, given all that sysctls do is provide a dynamic
> interface to query or modify at runtime an existing variable. So the
> use case of register_sysctl() on init should *not* stop if the sysctls
> don't end up getting registered. It would be counter productive to
> stop boot if a simple sysctl registration failed.
>
> Provide a helper for init then, and document the recommended init
> levels to use for callers of this routine. We will later use this
> in subsequent patches to start slimming down kernel/sysctl.c tables
> and moving sysctl registration to the code which actually needs
> these sysctls.

Do we really need a new helper for this?
Is the failure acceptable only during system initialization?

The warning would be useful even for the original register_sysctl().

It should be up-to-the caller to decide if the failure is fatal
or not. It might be enough to document the reasoning why a warning
is enough in most cases.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/9] watchdog: move watchdog sysctl interface to watchdog.c
  2021-11-23 20:23 ` [PATCH v2 4/9] watchdog: move watchdog sysctl interface to watchdog.c Luis Chamberlain
@ 2021-11-26  9:40   ` Petr Mladek
  0 siblings, 0 replies; 22+ messages in thread
From: Petr Mladek @ 2021-11-26  9:40 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On Tue 2021-11-23 12:23:42, Luis Chamberlain wrote:
> From: Xiaoming Ni <nixiaoming@huawei.com>
> 
> The kernel/sysctl.c is a kitchen sink where everyone leaves
> their dirty dishes, this makes it very difficult to maintain.
> 
> To help with this maintenance let's start by moving sysctls to
> places where they actually belong. The proc sysctl maintainers
> do not want to know what sysctl knobs you wish to add for your own
> piece of code, we just care about the core logic of proc sysctl.
> 
> So, move the watchdog syscl interface to watchdog.c.
> Use register_sysctl() to register the sysctl interface to avoid
> merge conflicts when different features modify sysctl.c at the
> same time.
> 
> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> [mcgrof: justify the move on the commit log]
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/9] hung_task: Move hung_task sysctl interface to hung_task.c
  2021-11-23 20:23 ` [PATCH v2 3/9] hung_task: Move hung_task sysctl interface to hung_task.c Luis Chamberlain
@ 2021-11-26  9:46   ` Petr Mladek
  0 siblings, 0 replies; 22+ messages in thread
From: Petr Mladek @ 2021-11-26  9:46 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On Tue 2021-11-23 12:23:41, Luis Chamberlain wrote:
> From: Xiaoming Ni <nixiaoming@huawei.com>
> 
> The kernel/sysctl.c is a kitchen sink where everyone leaves
> their dirty dishes, this makes it very difficult to maintain.
> 
> To help with this maintenance let's start by moving sysctls to
> places where they actually belong. The proc sysctl maintainers
> do not want to know what sysctl knobs you wish to add for your own
> piece of code, we just care about the core logic.
> 
> So move hung_task sysctl interface to hung_task.c and use
> register_sysctl() to register the sysctl interface.
> 
> Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> [mcgrof: commit log refresh and fixed 2-3 0day reported compile issues]
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface
  2021-11-25 16:14   ` Petr Mladek
@ 2021-11-29 21:01     ` Luis Chamberlain
  0 siblings, 0 replies; 22+ messages in thread
From: Luis Chamberlain @ 2021-11-29 21:01 UTC (permalink / raw)
  To: Petr Mladek
  Cc: akpm, keescook, yzaikin, nixiaoming, ebiederm, peterz, gregkh,
	pjt, liu.hailong6, andriy.shevchenko, sre, penguin-kernel,
	senozhatsky, wangqing, bcrl, viro, jack, amir73il, linux-fsdevel,
	linux-kernel

On Thu, Nov 25, 2021 at 05:14:23PM +0100, Petr Mladek wrote:
> On Tue 2021-11-23 12:23:39, Luis Chamberlain wrote:
> > From: Xiaoming Ni <nixiaoming@huawei.com>
> > 
> > The kernel/sysctl.c is a kitchen sink where everyone leaves
> > their dirty dishes, this makes it very difficult to maintain.
> > 
> > To help with this maintenance let's start by moving sysctls to
> > places where they actually belong. The proc sysctl maintainers
> > do not want to know what sysctl knobs you wish to add for your own
> > piece of code, we just care about the core logic.
> > 
> > Today though folks heavily rely on tables on kernel/sysctl.c so
> > they can easily just extend this table with their needed sysctls.
> > In order to help users move their sysctls out we need to provide a
> > helper which can be used during code initialization.
> > 
> > We special-case the initialization use of register_sysctl() since
> > it *is* safe to fail, given all that sysctls do is provide a dynamic
> > interface to query or modify at runtime an existing variable. So the
> > use case of register_sysctl() on init should *not* stop if the sysctls
> > don't end up getting registered. It would be counter productive to
> > stop boot if a simple sysctl registration failed.
> >
> > Provide a helper for init then, and document the recommended init
> > levels to use for callers of this routine. We will later use this
> > in subsequent patches to start slimming down kernel/sysctl.c tables
> > and moving sysctl registration to the code which actually needs
> > these sysctls.
> 
> Do we really need a new helper for this?
> Is the failure acceptable only during system initialization?

Yes because it is __init and we allow / guide folks to *think* clearly
about not stopping the init process when it comes to sysctls on failure.

> The warning would be useful even for the original register_sysctl().

We can open code those.

> It should be up-to-the caller to decide if the failure is fatal
> or not. It might be enough to document the reasoning why a warning
> is enough in most cases.

For most case I have seen so far special casing init seems like a worthy
objective. When we're done with the full conversion we can re-visit
things but at this point I can't say sharing this outside of init uses
makes too much sense.

  Luis


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-11-29 22:11 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-23 20:23 [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Luis Chamberlain
2021-11-23 20:23 ` [PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface Luis Chamberlain
2021-11-25 16:14   ` Petr Mladek
2021-11-29 21:01     ` Luis Chamberlain
2021-11-23 20:23 ` [PATCH v2 2/9] sysctl: Move some boundary constants from sysctl.c to sysctl_vals Luis Chamberlain
2021-11-24  4:51   ` Eric W. Biederman
2021-11-24  7:05     ` Xiaoming Ni
2021-11-24 17:38       ` Eric W. Biederman
2021-11-24 23:12         ` Luis Chamberlain
2021-11-23 20:23 ` [PATCH v2 3/9] hung_task: Move hung_task sysctl interface to hung_task.c Luis Chamberlain
2021-11-26  9:46   ` Petr Mladek
2021-11-23 20:23 ` [PATCH v2 4/9] watchdog: move watchdog sysctl interface to watchdog.c Luis Chamberlain
2021-11-26  9:40   ` Petr Mladek
2021-11-23 20:23 ` [PATCH v2 5/9] sysctl: make ngroups_max const Luis Chamberlain
2021-11-23 20:23 ` [PATCH v2 6/9] sysctl: use const for typically used max/min proc sysctls Luis Chamberlain
2021-11-23 20:23 ` [PATCH v2 7/9] sysctl: use SYSCTL_ZERO to replace some static int zero uses Luis Chamberlain
2021-11-23 20:23 ` [PATCH v2 8/9] aio: move aio sysctl to aio.c Luis Chamberlain
2021-11-24  9:32   ` Jan Kara
2021-11-23 20:23 ` [PATCH v2 9/9] dnotify: move dnotify sysctl to dnotify.c Luis Chamberlain
2021-11-24  9:31   ` Jan Kara
2021-11-24  0:14 ` [PATCH v2 0/9] sysctl: first set of kernel/sysctl cleanups Andrew Morton
2021-11-24  0:27   ` Luis Chamberlain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).