LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RFC PATCH] livepatch/module: Do not patch modules that are not ready
@ 2015-03-03 11:38 Petr Mladek
  2015-03-03 14:55 ` Josh Poimboeuf
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Mladek @ 2015-03-03 11:38 UTC (permalink / raw)
  To: Seth Jennings, Josh Poimboeuf, Jiri Kosina, Rusty Russell
  Cc: Miroslav Benes, Masami Hiramatsu, mingo, mathieu.desnoyers, oleg,
	paulmck, linux-kernel, andi, rostedt, tglx, Petr Mladek

There is a notifier that handles live patches for coming and going modules.
It takes klp_mutex lock to avoid races with coming and going patches.

Unfortunately, there are some possible races in the current implementation.
The problem is that we do not keep the klp_mutex lock all the time when
the module is being added or removed.

First, the module is visible even before ftrace is ready. If we enable a patch
in this time frame, adding ftrace ops will fail and the patch will get rejected
just because bad timing.

Second, if we are "lucky" and enable the patch for the coming module when the
ftrace is ready but before the module notifier has been called. The notifier
will try to enable the patch as well. It will detect that it is already patched,
return error, and the module will get rejected just because bad timing.
The more serious problem is that it will not call the notifier for
going module, so that the mess will stay there and we wont be able to load
the module later.

Third, similar problems are there for going module. If a patch is enabled after
the notifier finishes but before the module is removed from the list of modules,
the new patch will be applied to the module. The module might disappear at
anytime when the patch enabling is in progress, so there might be an access out
of memory. Or the whole patch might be applied and some mess will be left,
so it will not be possible to load/patch the module again.

This patch solves the problem by adding two flags into struct module. They are
switched when the notifier is called. Note that we try to solve a race with a
coming patch, therefore we do not know which modules will get patched and we
need to monitor all modules. This is why I added this to the struct module.

The flags are set and checked under the klp_mutex lock. The related operation
is finished under the same lock. Therefore they are properly serialized now.

Note that the patch solves only the situation when a new patch is registered or
enabled. There are no such problems when the patch is being removed. it does
not matter who disable the patch first, whether the normal disable_patch() or
the module notifier. There is nothing to do once the patch is disabled.

Signed-off-by: Petr Mladek <pmladek@suse.cz>
---
 include/linux/module.h  |  5 +++++
 kernel/livepatch/core.c | 20 +++++++++++++++++++-
 kernel/module.c         |  6 +++++-
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index b653d7c0a05a..7e50d87da510 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -344,6 +344,11 @@ struct module {
 	unsigned long *ftrace_callsites;
 #endif
 
+#ifdef CONFIG_LIVEPATCH
+	bool klp_patched;
+	bool klp_unpatched;
+#endif
+
 #ifdef CONFIG_MODULE_UNLOAD
 	/* What modules depend on me? */
 	struct list_head source_list;
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index a664e485365f..dee4bbcb60e6 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -89,6 +89,8 @@ static bool klp_is_object_loaded(struct klp_object *obj)
 /* sets obj->mod if object is not vmlinux and module is found */
 static void klp_find_object_module(struct klp_object *obj)
 {
+	struct module *mod;
+
 	if (!klp_is_module(obj))
 		return;
 
@@ -98,7 +100,14 @@ static void klp_find_object_module(struct klp_object *obj)
 	 * the klp_mutex, which is also taken by the module notifier.  This
 	 * prevents any module from unloading until we release the klp_mutex.
 	 */
-	obj->mod = find_module(obj->name);
+	mod = find_module(obj->name);
+	/* Do not mess work of the module notifier */
+	if ((mod->state == MODULE_STATE_COMING && !mod->klp_patched) ||
+	    (mod->state == MODULE_STATE_GOING && mod->klp_unpatched))
+		obj->mod = NULL;
+	else
+		obj->mod = mod;
+
 	mutex_unlock(&module_mutex);
 }
 
@@ -927,6 +936,15 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
 
 	mutex_lock(&klp_mutex);
 
+	/*
+	 * Each module has to know that the notifier has been called.
+	 * We never know what module will get patched by a new patch.
+	 */
+	if (action == MODULE_STATE_COMING)
+		mod->klp_patched = true;
+	else
+		mod->klp_unpatched = true;
+
 	list_for_each_entry(patch, &klp_patches, list) {
 		for (obj = patch->objs; obj->funcs; obj++) {
 			if (!klp_is_module(obj) || strcmp(obj->name, mod->name))
diff --git a/kernel/module.c b/kernel/module.c
index d856e96a3cce..8357f15b7ed0 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -852,7 +852,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
 
 	/* Store the name of the last unloaded module for diagnostic purposes */
 	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
-
 	free_module(mod);
 	return 0;
 out:
@@ -3271,6 +3270,11 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	}
 #endif
 
+#ifdef CONFIG_LIVEPATCH
+	mod->klp_patched = false;
+	mod->klp_unpatched = false;
+#endif
+
 	/* To avoid stressing percpu allocator, do this once we're unique. */
 	err = percpu_modalloc(mod, info);
 	if (err)
-- 
1.8.5.6


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH] livepatch/module: Do not patch modules that are not ready
  2015-03-03 11:38 [RFC PATCH] livepatch/module: Do not patch modules that are not ready Petr Mladek
@ 2015-03-03 14:55 ` Josh Poimboeuf
  2015-03-03 15:48   ` Petr Mladek
  2015-03-03 17:34   ` Petr Mladek
  0 siblings, 2 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-03 14:55 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Kosina, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck,
	linux-kernel, andi, rostedt, tglx

Hi Petr,

Good find.  Some comments below.

On Tue, Mar 03, 2015 at 12:38:29PM +0100, Petr Mladek wrote:
> There is a notifier that handles live patches for coming and going modules.
> It takes klp_mutex lock to avoid races with coming and going patches.
> 
> Unfortunately, there are some possible races in the current implementation.
> The problem is that we do not keep the klp_mutex lock all the time when
> the module is being added or removed.
> 
> First, the module is visible even before ftrace is ready. If we enable a patch
> in this time frame, adding ftrace ops will fail and the patch will get rejected
> just because bad timing.
> 
> Second, if we are "lucky" and enable the patch for the coming module when the
> ftrace is ready but before the module notifier has been called.

Based on the notifier priorities, it looks like the ftrace notifier gets
called last, so I think this particular case can't happen.

> The notifier
> will try to enable the patch as well. It will detect that it is already patched,
> return error, and the module will get rejected just because bad timing.
> The more serious problem is that it will not call the notifier for
> going module, so that the mess will stay there and we wont be able to load
> the module later.
> 
> Third, similar problems are there for going module. If a patch is enabled after
> the notifier finishes but before the module is removed from the list of modules,
> the new patch will be applied to the module. The module might disappear at
> anytime when the patch enabling is in progress, so there might be an access out
> of memory. Or the whole patch might be applied and some mess will be left,
> so it will not be possible to load/patch the module again.
> 
> This patch solves the problem by adding two flags into struct module. They are
> switched when the notifier is called. Note that we try to solve a race with a
> coming patch, therefore we do not know which modules will get patched and we
> need to monitor all modules. This is why I added this to the struct module.
> 
> The flags are set and checked under the klp_mutex lock. The related operation
> is finished under the same lock. Therefore they are properly serialized now.
> 
> Note that the patch solves only the situation when a new patch is registered or
> enabled.

Did we have a reason for calling klp_find_object_module() in both
register and enable?  I'd think we'd only need it for the register path,
since the notifier would catch any future loads/unloads.  Or am I
missing something?

> There are no such problems when the patch is being removed. it does
> not matter who disable the patch first, whether the normal disable_patch() or
> the module notifier. There is nothing to do once the patch is disabled.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.cz>
> ---
>  include/linux/module.h  |  5 +++++
>  kernel/livepatch/core.c | 20 +++++++++++++++++++-
>  kernel/module.c         |  6 +++++-
>  3 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/module.h b/include/linux/module.h
> index b653d7c0a05a..7e50d87da510 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -344,6 +344,11 @@ struct module {
>  	unsigned long *ftrace_callsites;
>  #endif
>  
> +#ifdef CONFIG_LIVEPATCH
> +	bool klp_patched;
> +	bool klp_unpatched;
> +#endif
> +
>  #ifdef CONFIG_MODULE_UNLOAD
>  	/* What modules depend on me? */
>  	struct list_head source_list;
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index a664e485365f..dee4bbcb60e6 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -89,6 +89,8 @@ static bool klp_is_object_loaded(struct klp_object *obj)
>  /* sets obj->mod if object is not vmlinux and module is found */
>  static void klp_find_object_module(struct klp_object *obj)
>  {
> +	struct module *mod;
> +
>  	if (!klp_is_module(obj))
>  		return;
>  
> @@ -98,7 +100,14 @@ static void klp_find_object_module(struct klp_object *obj)
>  	 * the klp_mutex, which is also taken by the module notifier.  This
>  	 * prevents any module from unloading until we release the klp_mutex.
>  	 */
> -	obj->mod = find_module(obj->name);
> +	mod = find_module(obj->name);
> +	/* Do not mess work of the module notifier */
> +	if ((mod->state == MODULE_STATE_COMING && !mod->klp_patched) ||
> +	    (mod->state == MODULE_STATE_GOING && mod->klp_unpatched))
> +		obj->mod = NULL;
> +	else
> +		obj->mod = mod;
> +
>  	mutex_unlock(&module_mutex);
>  }
>  

Why do we need two flags for this?  The notifer already
sets/clears obj->mod, so can we rely on the value obj->mod to determine
if the notifier already ran?

For example:

@@ -89,7 +89,7 @@ static bool klp_is_object_loaded(struct klp_object *obj)
 /* sets obj->mod if object is not vmlinux and module is found */
 static void klp_find_object_module(struct klp_object *obj)
 {
-	if (!klp_is_module(obj))
+	if (!klp_is_module(obj) || obj->mod)
 		return;
 
 	mutex_lock(&module_mutex);
@@ -98,7 +98,9 @@ static void klp_find_object_module(struct klp_object *obj)
 	 * the klp_mutex, which is also taken by the module notifier.  This
 	 * prevents any module from unloading until we release the klp_mutex.
 	 */
-	obj->mod = find_module(obj->name);
+	mod = find_module(obj->name);
+	if (mod->state == MODULE_STATE_LIVE)
+		obj->mod = mod;
 	mutex_unlock(&module_mutex);
 }
 
> @@ -927,6 +936,15 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
>  
>  	mutex_lock(&klp_mutex);
>  
> +	/*
> +	 * Each module has to know that the notifier has been called.
> +	 * We never know what module will get patched by a new patch.
> +	 */
> +	if (action == MODULE_STATE_COMING)
> +		mod->klp_patched = true;
> +	else
> +		mod->klp_unpatched = true;
> +
>  	list_for_each_entry(patch, &klp_patches, list) {
>  		for (obj = patch->objs; obj->funcs; obj++) {
>  			if (!klp_is_module(obj) || strcmp(obj->name, mod->name))
> diff --git a/kernel/module.c b/kernel/module.c
> index d856e96a3cce..8357f15b7ed0 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -852,7 +852,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
>  
>  	/* Store the name of the last unloaded module for diagnostic purposes */
>  	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
> -
>  	free_module(mod);
>  	return 0;
>  out:

Gratuitous whitespace change.

> @@ -3271,6 +3270,11 @@ static int load_module(struct load_info *info, const char __user *uargs,
>  	}
>  #endif
>  
> +#ifdef CONFIG_LIVEPATCH
> +	mod->klp_patched = false;
> +	mod->klp_unpatched = false;
> +#endif
> +
>  	/* To avoid stressing percpu allocator, do this once we're unique. */
>  	err = percpu_modalloc(mod, info);
>  	if (err)
> -- 
> 1.8.5.6
> 

-- 
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH] livepatch/module: Do not patch modules that are not ready
  2015-03-03 14:55 ` Josh Poimboeuf
@ 2015-03-03 15:48   ` Petr Mladek
  2015-03-03 16:01     ` Josh Poimboeuf
  2015-03-03 17:34   ` Petr Mladek
  1 sibling, 1 reply; 29+ messages in thread
From: Petr Mladek @ 2015-03-03 15:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck,
	linux-kernel, andi, rostedt, tglx

On Tue 2015-03-03 08:55:17, Josh Poimboeuf wrote:
> Hi Petr,
> 
> Good find.  Some comments below.
> 
> On Tue, Mar 03, 2015 at 12:38:29PM +0100, Petr Mladek wrote:
> > There is a notifier that handles live patches for coming and going modules.
> > It takes klp_mutex lock to avoid races with coming and going patches.
> > 
> > Unfortunately, there are some possible races in the current implementation.
> > The problem is that we do not keep the klp_mutex lock all the time when
> > the module is being added or removed.
> > 
> > First, the module is visible even before ftrace is ready. If we enable a patch
> > in this time frame, adding ftrace ops will fail and the patch will get rejected
> > just because bad timing.
> > 
> > Second, if we are "lucky" and enable the patch for the coming module when the
> > ftrace is ready but before the module notifier has been called.
> 
> Based on the notifier priorities, it looks like the ftrace notifier gets
> called last, so I think this particular case can't happen.

Ftrace handles only going modules using the module notifier.

Coming modules are handled by ftrace_module_init() that is called directly
from load_module(). The notifiers for the coming modules seems to be
called right after by complete_formation().


> > The notifier
> > will try to enable the patch as well. It will detect that it is already patched,
> > return error, and the module will get rejected just because bad timing.
> > The more serious problem is that it will not call the notifier for
> > going module, so that the mess will stay there and we wont be able to load
> > the module later.
> > 
> > Third, similar problems are there for going module. If a patch is enabled after
> > the notifier finishes but before the module is removed from the list of modules,
> > the new patch will be applied to the module. The module might disappear at
> > anytime when the patch enabling is in progress, so there might be an access out
> > of memory. Or the whole patch might be applied and some mess will be left,
> > so it will not be possible to load/patch the module again.
> > 
> > This patch solves the problem by adding two flags into struct module. They are
> > switched when the notifier is called. Note that we try to solve a race with a
> > coming patch, therefore we do not know which modules will get patched and we
> > need to monitor all modules. This is why I added this to the struct module.
> > 
> > The flags are set and checked under the klp_mutex lock. The related operation
> > is finished under the same lock. Therefore they are properly serialized now.
> > 
> > Note that the patch solves only the situation when a new patch is registered or
> > enabled.
> 
> Did we have a reason for calling klp_find_object_module() in both
> register and enable?  I'd think we'd only need it for the register path,
> since the notifier would catch any future loads/unloads.  Or am I
> missing something?

Good question! I guess that it was there before we added the module
notifier and it was supposed to detect modules that were loaded after
the patch registration.

IMHO, we could replace klp_find_object_module() with
klp_is_object_loaded() in __klp_enable_patch(). It will work because
it will see only modules that were there during registration or
modules that were added by coming handler. It will not see modules
removed by module going handler.

I saw this problem and I considered it only an optimization. We need this
patch anyway because the race might get propagated via
klp_register_patch() and followup klp_enable_patch().

Best Regards,
Petr

 
> > There are no such problems when the patch is being removed. it does
> > not matter who disable the patch first, whether the normal disable_patch() or
> > the module notifier. There is nothing to do once the patch is disabled.
> > 
> > Signed-off-by: Petr Mladek <pmladek@suse.cz>
> > ---
> >  include/linux/module.h  |  5 +++++
> >  kernel/livepatch/core.c | 20 +++++++++++++++++++-
> >  kernel/module.c         |  6 +++++-
> >  3 files changed, 29 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index b653d7c0a05a..7e50d87da510 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
> > @@ -344,6 +344,11 @@ struct module {
> >  	unsigned long *ftrace_callsites;
> >  #endif
> >  
> > +#ifdef CONFIG_LIVEPATCH
> > +	bool klp_patched;
> > +	bool klp_unpatched;
> > +#endif
> > +
> >  #ifdef CONFIG_MODULE_UNLOAD
> >  	/* What modules depend on me? */
> >  	struct list_head source_list;
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index a664e485365f..dee4bbcb60e6 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -89,6 +89,8 @@ static bool klp_is_object_loaded(struct klp_object *obj)
> >  /* sets obj->mod if object is not vmlinux and module is found */
> >  static void klp_find_object_module(struct klp_object *obj)
> >  {
> > +	struct module *mod;
> > +
> >  	if (!klp_is_module(obj))
> >  		return;
> >  
> > @@ -98,7 +100,14 @@ static void klp_find_object_module(struct klp_object *obj)
> >  	 * the klp_mutex, which is also taken by the module notifier.  This
> >  	 * prevents any module from unloading until we release the klp_mutex.
> >  	 */
> > -	obj->mod = find_module(obj->name);
> > +	mod = find_module(obj->name);
> > +	/* Do not mess work of the module notifier */
> > +	if ((mod->state == MODULE_STATE_COMING && !mod->klp_patched) ||
> > +	    (mod->state == MODULE_STATE_GOING && mod->klp_unpatched))
> > +		obj->mod = NULL;
> > +	else
> > +		obj->mod = mod;
> > +
> >  	mutex_unlock(&module_mutex);
> >  }
> >  
> 
> Why do we need two flags for this?  The notifer already
> sets/clears obj->mod, so can we rely on the value obj->mod to determine
> if the notifier already ran?
> 
> For example:
> 
> @@ -89,7 +89,7 @@ static bool klp_is_object_loaded(struct klp_object *obj)
>  /* sets obj->mod if object is not vmlinux and module is found */
>  static void klp_find_object_module(struct klp_object *obj)
>  {
> -	if (!klp_is_module(obj))
> +	if (!klp_is_module(obj) || obj->mod)
>  		return;
>  
>  	mutex_lock(&module_mutex);
> @@ -98,7 +98,9 @@ static void klp_find_object_module(struct klp_object *obj)
>  	 * the klp_mutex, which is also taken by the module notifier.  This
>  	 * prevents any module from unloading until we release the klp_mutex.
>  	 */
> -	obj->mod = find_module(obj->name);
> +	mod = find_module(obj->name);
> +	if (mod->state == MODULE_STATE_LIVE)
> +		obj->mod = mod;
>  	mutex_unlock(&module_mutex);
>  }
>  
> > @@ -927,6 +936,15 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
> >  
> >  	mutex_lock(&klp_mutex);
> >  
> > +	/*
> > +	 * Each module has to know that the notifier has been called.
> > +	 * We never know what module will get patched by a new patch.
> > +	 */
> > +	if (action == MODULE_STATE_COMING)
> > +		mod->klp_patched = true;
> > +	else
> > +		mod->klp_unpatched = true;
> > +
> >  	list_for_each_entry(patch, &klp_patches, list) {
> >  		for (obj = patch->objs; obj->funcs; obj++) {
> >  			if (!klp_is_module(obj) || strcmp(obj->name, mod->name))
> > diff --git a/kernel/module.c b/kernel/module.c
> > index d856e96a3cce..8357f15b7ed0 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -852,7 +852,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> >  
> >  	/* Store the name of the last unloaded module for diagnostic purposes */
> >  	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
> > -
> >  	free_module(mod);
> >  	return 0;
> >  out:
> 
> Gratuitous whitespace change.
> 
> > @@ -3271,6 +3270,11 @@ static int load_module(struct load_info *info, const char __user *uargs,
> >  	}
> >  #endif
> >  
> > +#ifdef CONFIG_LIVEPATCH
> > +	mod->klp_patched = false;
> > +	mod->klp_unpatched = false;
> > +#endif
> > +
> >  	/* To avoid stressing percpu allocator, do this once we're unique. */
> >  	err = percpu_modalloc(mod, info);
> >  	if (err)
> > -- 
> > 1.8.5.6
> > 
> 
> -- 
> Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH] livepatch/module: Do not patch modules that are not ready
  2015-03-03 15:48   ` Petr Mladek
@ 2015-03-03 16:01     ` Josh Poimboeuf
  0 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-03 16:01 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Kosina, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck,
	linux-kernel, andi, rostedt, tglx

On Tue, Mar 03, 2015 at 04:48:16PM +0100, Petr Mladek wrote:
> On Tue 2015-03-03 08:55:17, Josh Poimboeuf wrote:
> > Hi Petr,
> > 
> > Good find.  Some comments below.
> > 
> > On Tue, Mar 03, 2015 at 12:38:29PM +0100, Petr Mladek wrote:
> > > There is a notifier that handles live patches for coming and going modules.
> > > It takes klp_mutex lock to avoid races with coming and going patches.
> > > 
> > > Unfortunately, there are some possible races in the current implementation.
> > > The problem is that we do not keep the klp_mutex lock all the time when
> > > the module is being added or removed.
> > > 
> > > First, the module is visible even before ftrace is ready. If we enable a patch
> > > in this time frame, adding ftrace ops will fail and the patch will get rejected
> > > just because bad timing.
> > > 
> > > Second, if we are "lucky" and enable the patch for the coming module when the
> > > ftrace is ready but before the module notifier has been called.
> > 
> > Based on the notifier priorities, it looks like the ftrace notifier gets
> > called last, so I think this particular case can't happen.
> 
> Ftrace handles only going modules using the module notifier.
> 
> Coming modules are handled by ftrace_module_init() that is called directly
> from load_module(). The notifiers for the coming modules seems to be
> called right after by complete_formation().

Ah, ok.  Makes sense.

> > > The notifier
> > > will try to enable the patch as well. It will detect that it is already patched,
> > > return error, and the module will get rejected just because bad timing.
> > > The more serious problem is that it will not call the notifier for
> > > going module, so that the mess will stay there and we wont be able to load
> > > the module later.
> > > 
> > > Third, similar problems are there for going module. If a patch is enabled after
> > > the notifier finishes but before the module is removed from the list of modules,
> > > the new patch will be applied to the module. The module might disappear at
> > > anytime when the patch enabling is in progress, so there might be an access out
> > > of memory. Or the whole patch might be applied and some mess will be left,
> > > so it will not be possible to load/patch the module again.
> > > 
> > > This patch solves the problem by adding two flags into struct module. They are
> > > switched when the notifier is called. Note that we try to solve a race with a
> > > coming patch, therefore we do not know which modules will get patched and we
> > > need to monitor all modules. This is why I added this to the struct module.
> > > 
> > > The flags are set and checked under the klp_mutex lock. The related operation
> > > is finished under the same lock. Therefore they are properly serialized now.
> > > 
> > > Note that the patch solves only the situation when a new patch is registered or
> > > enabled.
> > 
> > Did we have a reason for calling klp_find_object_module() in both
> > register and enable?  I'd think we'd only need it for the register path,
> > since the notifier would catch any future loads/unloads.  Or am I
> > missing something?
> 
> Good question! I guess that it was there before we added the module
> notifier and it was supposed to detect modules that were loaded after
> the patch registration.
> 
> IMHO, we could replace klp_find_object_module() with
> klp_is_object_loaded() in __klp_enable_patch(). It will work because
> it will see only modules that were there during registration or
> modules that were added by coming handler. It will not see modules
> removed by module going handler.

Yeah, and __klp_enable_patch() already calls klp_is_object_loaded(), so
we can just remove its call to klp_find_object_module().

> I saw this problem and I considered it only an optimization. We need this
> patch anyway because the race might get propagated via
> klp_register_patch() and followup klp_enable_patch().

I agree, the race exists regardless.

Also, you missed my two code comments which are buried below :-)

Thanks,
Josh

> > > There are no such problems when the patch is being removed. it does
> > > not matter who disable the patch first, whether the normal disable_patch() or
> > > the module notifier. There is nothing to do once the patch is disabled.
> > > 
> > > Signed-off-by: Petr Mladek <pmladek@suse.cz>
> > > ---
> > >  include/linux/module.h  |  5 +++++
> > >  kernel/livepatch/core.c | 20 +++++++++++++++++++-
> > >  kernel/module.c         |  6 +++++-
> > >  3 files changed, 29 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/linux/module.h b/include/linux/module.h
> > > index b653d7c0a05a..7e50d87da510 100644
> > > --- a/include/linux/module.h
> > > +++ b/include/linux/module.h
> > > @@ -344,6 +344,11 @@ struct module {
> > >  	unsigned long *ftrace_callsites;
> > >  #endif
> > >  
> > > +#ifdef CONFIG_LIVEPATCH
> > > +	bool klp_patched;
> > > +	bool klp_unpatched;
> > > +#endif
> > > +
> > >  #ifdef CONFIG_MODULE_UNLOAD
> > >  	/* What modules depend on me? */
> > >  	struct list_head source_list;
> > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > > index a664e485365f..dee4bbcb60e6 100644
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > > @@ -89,6 +89,8 @@ static bool klp_is_object_loaded(struct klp_object *obj)
> > >  /* sets obj->mod if object is not vmlinux and module is found */
> > >  static void klp_find_object_module(struct klp_object *obj)
> > >  {
> > > +	struct module *mod;
> > > +
> > >  	if (!klp_is_module(obj))
> > >  		return;
> > >  
> > > @@ -98,7 +100,14 @@ static void klp_find_object_module(struct klp_object *obj)
> > >  	 * the klp_mutex, which is also taken by the module notifier.  This
> > >  	 * prevents any module from unloading until we release the klp_mutex.
> > >  	 */
> > > -	obj->mod = find_module(obj->name);
> > > +	mod = find_module(obj->name);
> > > +	/* Do not mess work of the module notifier */
> > > +	if ((mod->state == MODULE_STATE_COMING && !mod->klp_patched) ||
> > > +	    (mod->state == MODULE_STATE_GOING && mod->klp_unpatched))
> > > +		obj->mod = NULL;
> > > +	else
> > > +		obj->mod = mod;
> > > +
> > >  	mutex_unlock(&module_mutex);
> > >  }
> > >  
> > 
> > Why do we need two flags for this?  The notifer already
> > sets/clears obj->mod, so can we rely on the value obj->mod to determine
> > if the notifier already ran?
> > 
> > For example:
> > 
> > @@ -89,7 +89,7 @@ static bool klp_is_object_loaded(struct klp_object *obj)
> >  /* sets obj->mod if object is not vmlinux and module is found */
> >  static void klp_find_object_module(struct klp_object *obj)
> >  {
> > -	if (!klp_is_module(obj))
> > +	if (!klp_is_module(obj) || obj->mod)
> >  		return;
> >  
> >  	mutex_lock(&module_mutex);
> > @@ -98,7 +98,9 @@ static void klp_find_object_module(struct klp_object *obj)
> >  	 * the klp_mutex, which is also taken by the module notifier.  This
> >  	 * prevents any module from unloading until we release the klp_mutex.
> >  	 */
> > -	obj->mod = find_module(obj->name);
> > +	mod = find_module(obj->name);
> > +	if (mod->state == MODULE_STATE_LIVE)
> > +		obj->mod = mod;
> >  	mutex_unlock(&module_mutex);
> >  }
> >  
> > > @@ -927,6 +936,15 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
> > >  
> > >  	mutex_lock(&klp_mutex);
> > >  
> > > +	/*
> > > +	 * Each module has to know that the notifier has been called.
> > > +	 * We never know what module will get patched by a new patch.
> > > +	 */
> > > +	if (action == MODULE_STATE_COMING)
> > > +		mod->klp_patched = true;
> > > +	else
> > > +		mod->klp_unpatched = true;
> > > +
> > >  	list_for_each_entry(patch, &klp_patches, list) {
> > >  		for (obj = patch->objs; obj->funcs; obj++) {
> > >  			if (!klp_is_module(obj) || strcmp(obj->name, mod->name))
> > > diff --git a/kernel/module.c b/kernel/module.c
> > > index d856e96a3cce..8357f15b7ed0 100644
> > > --- a/kernel/module.c
> > > +++ b/kernel/module.c
> > > @@ -852,7 +852,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> > >  
> > >  	/* Store the name of the last unloaded module for diagnostic purposes */
> > >  	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
> > > -
> > >  	free_module(mod);
> > >  	return 0;
> > >  out:
> > 
> > Gratuitous whitespace change.
> > 
> > > @@ -3271,6 +3270,11 @@ static int load_module(struct load_info *info, const char __user *uargs,
> > >  	}
> > >  #endif
> > >  
> > > +#ifdef CONFIG_LIVEPATCH
> > > +	mod->klp_patched = false;
> > > +	mod->klp_unpatched = false;
> > > +#endif
> > > +
> > >  	/* To avoid stressing percpu allocator, do this once we're unique. */
> > >  	err = percpu_modalloc(mod, info);
> > >  	if (err)
> > > -- 
> > > 1.8.5.6
> > > 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH] livepatch/module: Do not patch modules that are not ready
  2015-03-03 14:55 ` Josh Poimboeuf
  2015-03-03 15:48   ` Petr Mladek
@ 2015-03-03 17:34   ` Petr Mladek
  2015-03-03 19:31     ` Josh Poimboeuf
  1 sibling, 1 reply; 29+ messages in thread
From: Petr Mladek @ 2015-03-03 17:34 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck,
	linux-kernel, andi, rostedt, tglx

Ah, I have missed the comments in the code in the first reply.

On Tue 2015-03-03 08:55:17, Josh Poimboeuf wrote:
> Hi Petr,
> 
> Good find.  Some comments below.
> 
> On Tue, Mar 03, 2015 at 12:38:29PM +0100, Petr Mladek wrote:
> > There is a notifier that handles live patches for coming and going modules.
> > It takes klp_mutex lock to avoid races with coming and going patches.
> > 
> > Unfortunately, there are some possible races in the current implementation.
> > The problem is that we do not keep the klp_mutex lock all the time when
> > the module is being added or removed.
> > 
> > First, the module is visible even before ftrace is ready. If we enable a patch
> > in this time frame, adding ftrace ops will fail and the patch will get rejected
> > just because bad timing.
> > 
> > Second, if we are "lucky" and enable the patch for the coming module when the
> > ftrace is ready but before the module notifier has been called.
> 
> Based on the notifier priorities, it looks like the ftrace notifier gets
> called last, so I think this particular case can't happen.
> 
> > The notifier
> > will try to enable the patch as well. It will detect that it is already patched,
> > return error, and the module will get rejected just because bad timing.
> > The more serious problem is that it will not call the notifier for
> > going module, so that the mess will stay there and we wont be able to load
> > the module later.
> > 
> > Third, similar problems are there for going module. If a patch is enabled after
> > the notifier finishes but before the module is removed from the list of modules,
> > the new patch will be applied to the module. The module might disappear at
> > anytime when the patch enabling is in progress, so there might be an access out
> > of memory. Or the whole patch might be applied and some mess will be left,
> > so it will not be possible to load/patch the module again.
> > 
> > This patch solves the problem by adding two flags into struct module. They are
> > switched when the notifier is called. Note that we try to solve a race with a
> > coming patch, therefore we do not know which modules will get patched and we
> > need to monitor all modules. This is why I added this to the struct module.
> > 
> > The flags are set and checked under the klp_mutex lock. The related operation
> > is finished under the same lock. Therefore they are properly serialized now.
> > 
> > Note that the patch solves only the situation when a new patch is registered or
> > enabled.
> 
> Did we have a reason for calling klp_find_object_module() in both
> register and enable?  I'd think we'd only need it for the register path,
> since the notifier would catch any future loads/unloads.  Or am I
> missing something?
> 
> > There are no such problems when the patch is being removed. it does
> > not matter who disable the patch first, whether the normal disable_patch() or
> > the module notifier. There is nothing to do once the patch is disabled.
> > 
> > Signed-off-by: Petr Mladek <pmladek@suse.cz>
> > ---
> >  include/linux/module.h  |  5 +++++
> >  kernel/livepatch/core.c | 20 +++++++++++++++++++-
> >  kernel/module.c         |  6 +++++-
> >  3 files changed, 29 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index b653d7c0a05a..7e50d87da510 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
> > @@ -344,6 +344,11 @@ struct module {
> >  	unsigned long *ftrace_callsites;
> >  #endif
> >  
> > +#ifdef CONFIG_LIVEPATCH
> > +	bool klp_patched;
> > +	bool klp_unpatched;
> > +#endif
> > +
> >  #ifdef CONFIG_MODULE_UNLOAD
> >  	/* What modules depend on me? */
> >  	struct list_head source_list;
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index a664e485365f..dee4bbcb60e6 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -89,6 +89,8 @@ static bool klp_is_object_loaded(struct klp_object *obj)
> >  /* sets obj->mod if object is not vmlinux and module is found */
> >  static void klp_find_object_module(struct klp_object *obj)
> >  {
> > +	struct module *mod;
> > +
> >  	if (!klp_is_module(obj))
> >  		return;
> >  
> > @@ -98,7 +100,14 @@ static void klp_find_object_module(struct klp_object *obj)
> >  	 * the klp_mutex, which is also taken by the module notifier.  This
> >  	 * prevents any module from unloading until we release the klp_mutex.
> >  	 */
> > -	obj->mod = find_module(obj->name);
> > +	mod = find_module(obj->name);
> > +	/* Do not mess work of the module notifier */
> > +	if ((mod->state == MODULE_STATE_COMING && !mod->klp_patched) ||
> > +	    (mod->state == MODULE_STATE_GOING && mod->klp_unpatched))
> > +		obj->mod = NULL;
> > +	else
> > +		obj->mod = mod;
> > +
> >  	mutex_unlock(&module_mutex);
> >  }
> >  
> 
> Why do we need two flags for this?  The notifer already
> sets/clears obj->mod, so can we rely on the value obj->mod to determine
> if the notifier already ran?

We need this for new patches. There we do not know if the module
notifier has already been called or not, see also below.

Of course, we might use more optimal storage for the two flags
if requested. For example, two bits in an unsigned char.


> For example:
> 
> @@ -89,7 +89,7 @@ static bool klp_is_object_loaded(struct klp_object *obj)
>  /* sets obj->mod if object is not vmlinux and module is found */
>  static void klp_find_object_module(struct klp_object *obj)
>  {
> -	if (!klp_is_module(obj))
> +	if (!klp_is_module(obj) || obj->mod)
>  		return;
>  
>  	mutex_lock(&module_mutex);
> @@ -98,7 +98,9 @@ static void klp_find_object_module(struct klp_object *obj)
>  	 * the klp_mutex, which is also taken by the module notifier.  This
>  	 * prevents any module from unloading until we release the klp_mutex.
>  	 */
> -	obj->mod = find_module(obj->name);
> +	mod = find_module(obj->name);
> +	if (mod->state == MODULE_STATE_LIVE)

The check of the MODULE_STATE_LIVE is not enough. We need to init
modules when a new patch is registered, the module is still in
MODULE_STATE_COMING and the module notify handler has already
been called.


> +		obj->mod = mod;
>  	mutex_unlock(&module_mutex);
>  }
>  
> > @@ -927,6 +936,15 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
> >  
> >  	mutex_lock(&klp_mutex);
> >  
> > +	/*
> > +	 * Each module has to know that the notifier has been called.
> > +	 * We never know what module will get patched by a new patch.
> > +	 */
> > +	if (action == MODULE_STATE_COMING)
> > +		mod->klp_patched = true;
> > +	else
> > +		mod->klp_unpatched = true;
> > +
> >  	list_for_each_entry(patch, &klp_patches, list) {
> >  		for (obj = patch->objs; obj->funcs; obj++) {
> >  			if (!klp_is_module(obj) || strcmp(obj->name, mod->name))
> > diff --git a/kernel/module.c b/kernel/module.c
> > index d856e96a3cce..8357f15b7ed0 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -852,7 +852,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> >  
> >  	/* Store the name of the last unloaded module for diagnostic purposes */
> >  	strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
> > -
> >  	free_module(mod);
> >  	return 0;
> >  out:
> 
> Gratuitous whitespace change.

Ah, great catch. I will remove this in v2.

Best Regards,
Petr


> > @@ -3271,6 +3270,11 @@ static int load_module(struct load_info *info, const char __user *uargs,
> >  	}
> >  #endif
> >  
> > +#ifdef CONFIG_LIVEPATCH
> > +	mod->klp_patched = false;
> > +	mod->klp_unpatched = false;
> > +#endif
> > +
> >  	/* To avoid stressing percpu allocator, do this once we're unique. */
> >  	err = percpu_modalloc(mod, info);
> >  	if (err)
> > -- 
> > 1.8.5.6
> > 
> 
> -- 
> Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH] livepatch/module: Do not patch modules that are not ready
  2015-03-03 17:34   ` Petr Mladek
@ 2015-03-03 19:31     ` Josh Poimboeuf
  2015-03-03 19:35       ` Josh Poimboeuf
  2015-03-03 23:02       ` [PATCH 0/2] livepatch: fix patch module loading race Josh Poimboeuf
  0 siblings, 2 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-03 19:31 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Kosina, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck,
	linux-kernel, andi, rostedt, tglx

On Tue, Mar 03, 2015 at 06:34:57PM +0100, Petr Mladek wrote:
> Ah, I have missed the comments in the code in the first reply.
> > Why do we need two flags for this?  The notifer already
> > sets/clears obj->mod, so can we rely on the value obj->mod to determine
> > if the notifier already ran?
> 
> We need this for new patches. There we do not know if the module
> notifier has already been called or not, see also below.
> 
> Of course, we might use more optimal storage for the two flags
> if requested. For example, two bits in an unsigned char.
> 
> 
> > For example:
> > 
> > @@ -89,7 +89,7 @@ static bool klp_is_object_loaded(struct klp_object *obj)
> >  /* sets obj->mod if object is not vmlinux and module is found */
> >  static void klp_find_object_module(struct klp_object *obj)
> >  {
> > -	if (!klp_is_module(obj))
> > +	if (!klp_is_module(obj) || obj->mod)
> >  		return;
> >  
> >  	mutex_lock(&module_mutex);
> > @@ -98,7 +98,9 @@ static void klp_find_object_module(struct klp_object *obj)
> >  	 * the klp_mutex, which is also taken by the module notifier.  This
> >  	 * prevents any module from unloading until we release the klp_mutex.
> >  	 */
> > -	obj->mod = find_module(obj->name);
> > +	mod = find_module(obj->name);
> > +	if (mod->state == MODULE_STATE_LIVE)
> 
> The check of the MODULE_STATE_LIVE is not enough. We need to init
> modules when a new patch is registered, the module is still in
> MODULE_STATE_COMING and the module notify handler has already
> been called.

Ok, thanks for explaining.  I think I got it now.  But the two module
flags aren't ideal.

MODULE_STATE_COMING means that ftrace is already initialized, so I think
it doesn't _have_ to be the notifier which does the work.  Instead, can
we just let it be done by whoever gets there first?  Like this:


diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 01ca088..05390ab 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -89,16 +89,29 @@ static bool klp_is_object_loaded(struct klp_object *obj)
 /* sets obj->mod if object is not vmlinux and module is found */
 static void klp_find_object_module(struct klp_object *obj)
 {
-	if (!klp_is_module(obj))
+	struct module *mod;
+
+	if (!klp_is_module(obj) || obj->mod)
 		return;
 
 	mutex_lock(&module_mutex);
+
 	/*
 	 * We don't need to take a reference on the module here because we have
 	 * the klp_mutex, which is also taken by the module notifier.  This
 	 * prevents any module from unloading until we release the klp_mutex.
 	 */
-	obj->mod = find_module(obj->name);
+	mod = find_module(obj->name);
+
+	/*
+	 * MODULE_STATE_COMING means we got to the module first before the
+	 * notifier did.  ftrace is already initialized, so it's fine to go
+	 * ahead and start using it.
+	 */
+	if (mod->state == MODULE_STATE_COMING ||
+	    mod->state == MODULE_STATE_LIVE)
+		obj->mod = mod;
+
 	mutex_unlock(&module_mutex);
 }
 
@@ -697,8 +710,6 @@ static void klp_free_object_loaded(struct klp_object *obj)
 {
 	struct klp_func *func;
 
-	obj->mod = NULL;
-
 	for (func = obj->funcs; func->old_name; func++)
 		func->old_addr = 0;
 }
@@ -967,10 +978,31 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
 				continue;
 
 			if (action == MODULE_STATE_COMING) {
+
+				/*
+				 * There's a small window where a register or
+				 * enable call may have already done this
+				 * initialization.  First one there wins.
+				 */
+				if (obj->mod)
+					continue;
+
 				obj->mod = mod;
 				klp_module_notify_coming(patch, obj);
-			} else /* MODULE_STATE_GOING */
+			} else {
+				/* MODULE_STATE_GOING */
+
+				/*
+				 * There's a small window where a register or
+				 * enable call may have already done this
+				 * teardown.  First one there wins.
+				 */
+				if (!obj->mod)
+					continue;
+
 				klp_module_notify_going(patch, obj);
+				obj->mod = NULL;
+			}
 
 			break;
 		}

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH] livepatch/module: Do not patch modules that are not ready
  2015-03-03 19:31     ` Josh Poimboeuf
@ 2015-03-03 19:35       ` Josh Poimboeuf
  2015-03-03 23:02       ` [PATCH 0/2] livepatch: fix patch module loading race Josh Poimboeuf
  1 sibling, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-03 19:35 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Kosina, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck,
	linux-kernel, andi, rostedt, tglx

On Tue, Mar 03, 2015 at 01:31:28PM -0600, Josh Poimboeuf wrote:
> @@ -89,16 +89,29 @@ static bool klp_is_object_loaded(struct klp_object *obj)
>  /* sets obj->mod if object is not vmlinux and module is found */
>  static void klp_find_object_module(struct klp_object *obj)
>  {
> -	if (!klp_is_module(obj))
> +	struct module *mod;
> +
> +	if (!klp_is_module(obj) || obj->mod)
>  		return;
>  
>  	mutex_lock(&module_mutex);
> +
>  	/*
>  	 * We don't need to take a reference on the module here because we have
>  	 * the klp_mutex, which is also taken by the module notifier.  This
>  	 * prevents any module from unloading until we release the klp_mutex.
>  	 */
> -	obj->mod = find_module(obj->name);
> +	mod = find_module(obj->name);
> +
> +	/*
> +	 * MODULE_STATE_COMING means we got to the module first before the
> +	 * notifier did.  ftrace is already initialized, so it's fine to go
> +	 * ahead and start using it.
> +	 */

This comment should probably be improved to say:

"MODULE_STATE_COMING and !obj->mod means..."

--
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 0/2] livepatch: fix patch module loading race
  2015-03-03 19:31     ` Josh Poimboeuf
  2015-03-03 19:35       ` Josh Poimboeuf
@ 2015-03-03 23:02       ` Josh Poimboeuf
  2015-03-03 23:02         ` [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module() Josh Poimboeuf
  2015-03-03 23:02         ` [PATCH 2/2] livepatch: fix patched module loading race Josh Poimboeuf
  1 sibling, 2 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-03 23:02 UTC (permalink / raw)
  To: Petr Mladek, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck, andi,
	rostedt, tglx

Patch 1/2 removes the extra klp_find_object_module() call, as we discussed.
It's a prerequisite for making the 2nd patch simpler.

Patch 2/2 is a fixed up version of the scratch patch I posted earlier today,
with better comments and a few simplifications which are possible with the
single caller to klp_find_object_module().

Josh Poimboeuf (2):
  livepatch: remove unnecessary call to klp_find_object_module()
  livepatch: fix patched module loading race

 kernel/livepatch/core.c | 55 +++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 49 insertions(+), 6 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module()
  2015-03-03 23:02       ` [PATCH 0/2] livepatch: fix patch module loading race Josh Poimboeuf
@ 2015-03-03 23:02         ` Josh Poimboeuf
  2015-03-04  9:00           ` Petr Mladek
  2015-03-04 21:48           ` Jiri Kosina
  2015-03-03 23:02         ` [PATCH 2/2] livepatch: fix patched module loading race Josh Poimboeuf
  1 sibling, 2 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-03 23:02 UTC (permalink / raw)
  To: Petr Mladek, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck, andi,
	rostedt, tglx

klp_find_object_module() is called from both the klp register and enable
paths.  Only the call from the register path is necessary because the
module notifier will let us know if the patched module gets loaded or
unloaded.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/livepatch/core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 01ca088..a74e4e8 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -541,8 +541,6 @@ static int __klp_enable_patch(struct klp_patch *patch)
 	pr_notice("enabling patch '%s'\n", patch->mod->name);
 
 	for (obj = patch->objs; obj->funcs; obj++) {
-		klp_find_object_module(obj);
-
 		if (!klp_is_object_loaded(obj))
 			continue;
 
-- 
2.1.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-03 23:02       ` [PATCH 0/2] livepatch: fix patch module loading race Josh Poimboeuf
  2015-03-03 23:02         ` [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module() Josh Poimboeuf
@ 2015-03-03 23:02         ` Josh Poimboeuf
  2015-03-04 13:17           ` Petr Mladek
  1 sibling, 1 reply; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-03 23:02 UTC (permalink / raw)
  To: Petr Mladek, Seth Jennings, Jiri Kosina, Vojtech Pavlik
  Cc: live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	Masami Hiramatsu, mingo, mathieu.desnoyers, oleg, paulmck, andi,
	rostedt, tglx

It's possible for klp_register_patch() to see a module before the COMING
notifier is called, or after the GOING notifier is called.

That can cause all kinds of ugly races.  As Pter Mladek reported:

  "The problem is that we do not keep the klp_mutex lock all the time when
  the module is being added or removed.

  First, the module is visible even before ftrace is ready. If we enable a patch
  in this time frame, adding ftrace ops will fail and the patch will get rejected
  just because bad timing.

  Second, if we are "lucky" and enable the patch for the coming module when the
  ftrace is ready but before the module notifier has been called. The notifier
  will try to enable the patch as well. It will detect that it is already patched,
  return error, and the module will get rejected just because bad timing.
  The more serious problem is that it will not call the notifier for
  going module, so that the mess will stay there and we wont be able to load
  the module later.

  Third, similar problems are there for going module. If a patch is enabled after
  the notifier finishes but before the module is removed from the list of modules,
  the new patch will be applied to the module. The module might disappear at
  anytime when the patch enabling is in progress, so there might be an access out
  of memory. Or the whole patch might be applied and some mess will be left,
  so it will not be possible to load/patch the module again."

Fix these races by letting the first one who sees the module do the
needed work.

Reported-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 kernel/livepatch/core.c | 53 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 4 deletions(-)

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index a74e4e8..19a758c 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj)
 /* sets obj->mod if object is not vmlinux and module is found */
 static void klp_find_object_module(struct klp_object *obj)
 {
+	struct module *mod;
+
 	if (!klp_is_module(obj))
 		return;
 
 	mutex_lock(&module_mutex);
+
 	/*
 	 * We don't need to take a reference on the module here because we have
 	 * the klp_mutex, which is also taken by the module notifier.  This
 	 * prevents any module from unloading until we release the klp_mutex.
 	 */
-	obj->mod = find_module(obj->name);
+	mod = find_module(obj->name);
+
+	/*
+	 * Be careful here to prevent races with the notifier:
+	 *
+	 * - MODULE_STATE_COMING: This may be before or after the notifier gets
+	 *   called.  If after, the notifier didn't see the object anyway
+	 *   because it hadn't been added to the patch list yet.  Either way,
+	 *   ftrace is already initialized, so it's safe to just go ahead and
+	 *   initialize the object here.
+	 *
+	 * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be
+	 *   before or after the notifier gets called.  If after, the notifer
+	 *   didn't see the object anyway.  Either way it's safe to just
+	 *   pretend that it's already gone and leave obj->mod at NULL.
+	 *
+	 *   MODULE_STATE_LIVE: The common case.  The module already finished
+	 *   loading.  Just like with MODULE_STATE_COMING, we know the notifier
+	 *   didn't see the object, so we init it here.
+	 */
+	if (mod && (mod->state == MODULE_STATE_COMING ||
+		    mod->state == MODULE_STATE_LIVE))
+		obj->mod = mod;
+
 	mutex_unlock(&module_mutex);
 }
 
@@ -695,8 +721,6 @@ static void klp_free_object_loaded(struct klp_object *obj)
 {
 	struct klp_func *func;
 
-	obj->mod = NULL;
-
 	for (func = obj->funcs; func->old_name; func++)
 		func->old_addr = 0;
 }
@@ -765,6 +789,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
 		return -EINVAL;
 
 	obj->state = KLP_DISABLED;
+	obj->mod = NULL;
 
 	klp_find_object_module(obj);
 
@@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
 				continue;
 
 			if (action == MODULE_STATE_COMING) {
+
+				/*
+				 * Check for a small window where the register
+				 * path already initialized the object.
+				 */
+				if (obj->mod)
+					continue;
+
 				obj->mod = mod;
 				klp_module_notify_coming(patch, obj);
-			} else /* MODULE_STATE_GOING */
+			} else {
+				/* MODULE_STATE_GOING */
+
+				/*
+				 * Check for a small window where the register
+				 * path already saw the GOING state and thus
+				 * didn't set obj->mod.
+				 */
+				if (!obj->mod)
+					continue;
+
 				klp_module_notify_going(patch, obj);
+				obj->mod = NULL;
+			}
 
 			break;
 		}
-- 
2.1.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module()
  2015-03-03 23:02         ` [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module() Josh Poimboeuf
@ 2015-03-04  9:00           ` Petr Mladek
  2015-03-04 21:48           ` Jiri Kosina
  1 sibling, 0 replies; 29+ messages in thread
From: Petr Mladek @ 2015-03-04  9:00 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Tue 2015-03-03 17:02:21, Josh Poimboeuf wrote:
> klp_find_object_module() is called from both the klp register and enable
> paths.  Only the call from the register path is necessary because the
> module notifier will let us know if the patched module gets loaded or
> unloaded.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

Reviewed-by: Petr Mladek <pmladek@suse.cz>

Yup, the call is redundant.

Best Regards,
Petr

> ---
>  kernel/livepatch/core.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 01ca088..a74e4e8 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -541,8 +541,6 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	pr_notice("enabling patch '%s'\n", patch->mod->name);
>  
>  	for (obj = patch->objs; obj->funcs; obj++) {
> -		klp_find_object_module(obj);
> -
>  		if (!klp_is_object_loaded(obj))
>  			continue;
>  
> -- 
> 2.1.0
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-03 23:02         ` [PATCH 2/2] livepatch: fix patched module loading race Josh Poimboeuf
@ 2015-03-04 13:17           ` Petr Mladek
  2015-03-04 14:18             ` Petr Mladek
                               ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Petr Mladek @ 2015-03-04 13:17 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> It's possible for klp_register_patch() to see a module before the COMING
> notifier is called, or after the GOING notifier is called.
> 
> That can cause all kinds of ugly races.  As Pter Mladek reported:
> 
>   "The problem is that we do not keep the klp_mutex lock all the time when
>   the module is being added or removed.
> 
>   First, the module is visible even before ftrace is ready. If we enable a patch
>   in this time frame, adding ftrace ops will fail and the patch will get rejected
>   just because bad timing.

Ah, this is not true after all. I did not properly check when
MODULE_STATE_COMING was set. I though that it was before ftrace was
initialized but it was not true.


>   Second, if we are "lucky" and enable the patch for the coming module when the
>   ftrace is ready but before the module notifier has been called. The notifier
>   will try to enable the patch as well. It will detect that it is already patched,
>   return error, and the module will get rejected just because bad
>   timing. The more serious problem is that it will not call the notifier for
>   going module, so that the mess will stay there and we wont be able to load
>   the module later.

Ah, the race is there but the effect is not that serious in the
end. It seems that errors from module notifiers are ignored. In fact,
we do not propagate the error from klp_module_notify_coming(). It means
that WARN() from klp_enable_object() will be printed but the module
will be loaded and patched.

I am sorry, I was confused by kGraft where kgr_module_init() was
called directly from module_load(). The errors were propagated. It
means that kGraft rejects module when the patch cannot be applied.

Note that the current solution is perfectly fine for the simple
consistency model.


>   Third, similar problems are there for going module. If a patch is enabled after
>   the notifier finishes but before the module is removed from the list of modules,
>   the new patch will be applied to the module. The module might disappear at
>   anytime when the patch enabling is in progress, so there might be an access out
>   of memory. Or the whole patch might be applied and some mess will be left,
>   so it will not be possible to load/patch the module again."

This is true.


> Fix these races by letting the first one who sees the module do the
> needed work.
> 
> Reported-by: Petr Mladek <pmladek@suse.cz>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  kernel/livepatch/core.c | 53 +++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 49 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index a74e4e8..19a758c 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj)
>  /* sets obj->mod if object is not vmlinux and module is found */
>  static void klp_find_object_module(struct klp_object *obj)
>  {
> +	struct module *mod;
> +
>  	if (!klp_is_module(obj))
>  		return;
>  
>  	mutex_lock(&module_mutex);
> +
>  	/*
>  	 * We don't need to take a reference on the module here because we have
>  	 * the klp_mutex, which is also taken by the module notifier.  This
>  	 * prevents any module from unloading until we release the klp_mutex.
>  	 */
> -	obj->mod = find_module(obj->name);
> +	mod = find_module(obj->name);
> +
> +	/*
> +	 * Be careful here to prevent races with the notifier:
> +	 *
> +	 * - MODULE_STATE_COMING: This may be before or after the notifier gets
> +	 *   called.  If after, the notifier didn't see the object anyway
> +	 *   because it hadn't been added to the patch list yet.  Either way,
> +	 *   ftrace is already initialized, so it's safe to just go ahead and
> +	 *   initialize the object here.

Well, we need to be careful. This will handle only the newly
registered patch. If there are other patches against the module
on the stack, it might produce wrong order at func_stack, see below.


> +	 *
> +	 * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be
> +	 *   before or after the notifier gets called.  If after, the notifer
> +	 *   didn't see the object anyway.  Either way it's safe to just
> +	 *   pretend that it's already gone and leave obj->mod at NULL.

This is true for the current simple consistency model.

Note that there will be a small race window if we allow dependency
between patched functions and introduce more a complex consistency model
with lazy patching. The problem is the following sequence:


CPU0					CPU1

delete_module()  #SYSCALL

   try_stop_module()
     mod->state = MODULE_STATE_GOING;

   mutex_unlock(&module_mutex);

					klp_register_patch()
					klp_enable_patch()

					#save place to switch universe

					b()     # from module that is going
					  a()   # from core (patched)


   mod->exit();


Note that the function b() can be called until we call mod->exit().

If we do not apply patch against b() because it is in
MODULE_STATE_GOING. It will call patched a() with modified semantic
and things might get wrong.


Well, the above described race is rather theoretical. It will happen
only when a patched module is being removed and a module with a patch
is added at the same time. Also it means that some other CPU will
manage to register, enable the patch, switch the universe, and
call function from the patched module during the small race window.

I am not sure if we need to handle such a type of race if the solution
adds too big complexity.


> +	 *   MODULE_STATE_LIVE: The common case.  The module already finished
> +	 *   loading.  Just like with MODULE_STATE_COMING, we know the notifier
> +	 *   didn't see the object, so we init it here.
> +	 */
> +	if (mod && (mod->state == MODULE_STATE_COMING ||
> +		    mod->state == MODULE_STATE_LIVE))
> +		obj->mod = mod;
> +
>  	mutex_unlock(&module_mutex);
>  }
>  
> @@ -695,8 +721,6 @@ static void klp_free_object_loaded(struct klp_object *obj)
>  {
>  	struct klp_func *func;
>  
> -	obj->mod = NULL;
> -
>
>  	for (func = obj->funcs; func->old_name; func++)
>  		func->old_addr = 0;
>  }
> @@ -765,6 +789,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
>  		return -EINVAL;
>  
>  	obj->state = KLP_DISABLED;
> +	obj->mod = NULL;
>  
>  	klp_find_object_module(obj);
>  
> @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
>  				continue;
>  
>  			if (action == MODULE_STATE_COMING) {
> +
> +				/*
> +				 * Check for a small window where the register
> +				 * path already initialized the object.
> +				 */
s/path/patch/



> +				if (obj->mod)
> +					continue;

This might break stacking. The recently registered patch might become
the last on the stack and thus unused.

>  				obj->mod = mod;
>  				klp_module_notify_coming(patch, obj);
> -			} else /* MODULE_STATE_GOING */
> +			} else {
> +				/* MODULE_STATE_GOING */
> +
> +				/*
> +				 * Check for a small window where the register
> +				 * path already saw the GOING state and thus
> +				 * didn't set obj->mod.

Same typo here.

Also I would write that it did not initialize the object. It is not
only about setting obj->mod.

Best Regards,
Petr

> +				 */
> +				if (!obj->mod)
> +					continue;


>  				klp_module_notify_going(patch, obj);
> +				obj->mod = NULL;
> +			}
>  
>  			break;
>  		}
> -- 
> 2.1.0
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 13:17           ` Petr Mladek
@ 2015-03-04 14:18             ` Petr Mladek
  2015-03-04 15:34             ` Josh Poimboeuf
  2015-03-05  0:52             ` Masami Hiramatsu
  2 siblings, 0 replies; 29+ messages in thread
From: Petr Mladek @ 2015-03-04 14:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed 2015-03-04 14:17:52, Petr Mladek wrote:
> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> > It's possible for klp_register_patch() to see a module before the COMING
> > notifier is called, or after the GOING notifier is called.
> > 
> > That can cause all kinds of ugly races.  As Pter Mladek reported:
> > 
> >   "The problem is that we do not keep the klp_mutex lock all the time when
> >   the module is being added or removed.
> > 
> >   First, the module is visible even before ftrace is ready. If we enable a patch
> >   in this time frame, adding ftrace ops will fail and the patch will get rejected
> >   just because bad timing.
> 
> Ah, this is not true after all. I did not properly check when
> MODULE_STATE_COMING was set. I though that it was before ftrace was
> initialized but it was not true.
> 
> 
> >   Second, if we are "lucky" and enable the patch for the coming module when the
> >   ftrace is ready but before the module notifier has been called. The notifier
> >   will try to enable the patch as well. It will detect that it is already patched,
> >   return error, and the module will get rejected just because bad
> >   timing. The more serious problem is that it will not call the notifier for
> >   going module, so that the mess will stay there and we wont be able to load
> >   the module later.
> 
> Ah, the race is there but the effect is not that serious in the
> end. It seems that errors from module notifiers are ignored. In fact,
> we do not propagate the error from klp_module_notify_coming(). It means
> that WARN() from klp_enable_object() will be printed but the module
> will be loaded and patched.
> 
> I am sorry, I was confused by kGraft where kgr_module_init() was
> called directly from module_load(). The errors were propagated. It
> means that kGraft rejects module when the patch cannot be applied.
> 
> Note that the current solution is perfectly fine for the simple
> consistency model.
> 
> 
> >   Third, similar problems are there for going module. If a patch is enabled after
> >   the notifier finishes but before the module is removed from the list of modules,
> >   the new patch will be applied to the module. The module might disappear at
> >   anytime when the patch enabling is in progress, so there might be an access out
> >   of memory. Or the whole patch might be applied and some mess will be left,
> >   so it will not be possible to load/patch the module again."
> 
> This is true.
> 
> 
> > Fix these races by letting the first one who sees the module do the
> > needed work.
> > 
> > Reported-by: Petr Mladek <pmladek@suse.cz>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  kernel/livepatch/core.c | 53 +++++++++++++++++++++++++++++++++++++++++++++----
> > @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
> >  				continue;
> >  
> >  			if (action == MODULE_STATE_COMING) {
> > +
> > +				/*
> > +				 * Check for a small window where the register
> > +				 * path already initialized the object.
> > +				 */
> s/path/patch/
> 
> 
> 
> > +				if (obj->mod)
> > +					continue;
> 
> This might break stacking. The recently registered patch might become
> the last on the stack and thus unused.

Going through the stack when registering new patch would be quite
ugly. I am going to provide yet another solution.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 13:17           ` Petr Mladek
  2015-03-04 14:18             ` Petr Mladek
@ 2015-03-04 15:34             ` Josh Poimboeuf
  2015-03-04 15:51               ` Jiri Kosina
  2015-03-04 16:36               ` Petr Mladek
  2015-03-05  0:52             ` Masami Hiramatsu
  2 siblings, 2 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-04 15:34 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed, Mar 04, 2015 at 02:17:52PM +0100, Petr Mladek wrote:
> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> > It's possible for klp_register_patch() to see a module before the COMING
> > notifier is called, or after the GOING notifier is called.
> > 
> > That can cause all kinds of ugly races.  As Pter Mladek reported:
> > 
> >   "The problem is that we do not keep the klp_mutex lock all the time when
> >   the module is being added or removed.
> > 
> >   First, the module is visible even before ftrace is ready. If we enable a patch
> >   in this time frame, adding ftrace ops will fail and the patch will get rejected
> >   just because bad timing.
> 
> Ah, this is not true after all. I did not properly check when
> MODULE_STATE_COMING was set. I though that it was before ftrace was
> initialized but it was not true.

Ah, right.  I didn't re-read your description after realizing that
MODULE_STATE_COMING means ftrace is already initialized.

> >   Second, if we are "lucky" and enable the patch for the coming module when the
> >   ftrace is ready but before the module notifier has been called. The notifier
> >   will try to enable the patch as well. It will detect that it is already patched,
> >   return error, and the module will get rejected just because bad
> >   timing. The more serious problem is that it will not call the notifier for
> >   going module, so that the mess will stay there and we wont be able to load
> >   the module later.
> 
> Ah, the race is there but the effect is not that serious in the
> end. It seems that errors from module notifiers are ignored. In fact,
> we do not propagate the error from klp_module_notify_coming(). It means
> that WARN() from klp_enable_object() will be printed but the module
> will be loaded and patched.
> 
> I am sorry, I was confused by kGraft where kgr_module_init() was
> called directly from module_load(). The errors were propagated. It
> means that kGraft rejects module when the patch cannot be applied.

True, but we should still eliminate the race.  Initializing the object
twice could cause some sneaky bugs, if not now then in the future.

> Note that the current solution is perfectly fine for the simple
> consistency model.
> 
> 
> >   Third, similar problems are there for going module. If a patch is enabled after
> >   the notifier finishes but before the module is removed from the list of modules,
> >   the new patch will be applied to the module. The module might disappear at
> >   anytime when the patch enabling is in progress, so there might be an access out
> >   of memory. Or the whole patch might be applied and some mess will be left,
> >   so it will not be possible to load/patch the module again."
> 
> This is true.
> 
> 
> > Fix these races by letting the first one who sees the module do the
> > needed work.
> > 
> > Reported-by: Petr Mladek <pmladek@suse.cz>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> >  kernel/livepatch/core.c | 53 +++++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 49 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index a74e4e8..19a758c 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj)
> >  /* sets obj->mod if object is not vmlinux and module is found */
> >  static void klp_find_object_module(struct klp_object *obj)
> >  {
> > +	struct module *mod;
> > +
> >  	if (!klp_is_module(obj))
> >  		return;
> >  
> >  	mutex_lock(&module_mutex);
> > +
> >  	/*
> >  	 * We don't need to take a reference on the module here because we have
> >  	 * the klp_mutex, which is also taken by the module notifier.  This
> >  	 * prevents any module from unloading until we release the klp_mutex.
> >  	 */
> > -	obj->mod = find_module(obj->name);
> > +	mod = find_module(obj->name);
> > +
> > +	/*
> > +	 * Be careful here to prevent races with the notifier:
> > +	 *
> > +	 * - MODULE_STATE_COMING: This may be before or after the notifier gets
> > +	 *   called.  If after, the notifier didn't see the object anyway
> > +	 *   because it hadn't been added to the patch list yet.  Either way,
> > +	 *   ftrace is already initialized, so it's safe to just go ahead and
> > +	 *   initialize the object here.
> 
> Well, we need to be careful. This will handle only the newly
> registered patch. If there are other patches against the module
> on the stack, it might produce wrong order at func_stack, see below.

Sorry, but I don't follow.  Can you give an example?

> > +	 *
> > +	 * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be
> > +	 *   before or after the notifier gets called.  If after, the notifer
> > +	 *   didn't see the object anyway.  Either way it's safe to just
> > +	 *   pretend that it's already gone and leave obj->mod at NULL.
> 
> This is true for the current simple consistency model.
> 
> Note that there will be a small race window if we allow dependency
> between patched functions and introduce more a complex consistency model
> with lazy patching. The problem is the following sequence:
> 
> 
> CPU0					CPU1
> 
> delete_module()  #SYSCALL
> 
>    try_stop_module()
>      mod->state = MODULE_STATE_GOING;
> 
>    mutex_unlock(&module_mutex);
> 
> 					klp_register_patch()
> 					klp_enable_patch()
> 
> 					#save place to switch universe
> 
> 					b()     # from module that is going
> 					  a()   # from core (patched)
> 
> 
>    mod->exit();
> 
> 
> Note that the function b() can be called until we call mod->exit().
> 
> If we do not apply patch against b() because it is in
> MODULE_STATE_GOING. It will call patched a() with modified semantic
> and things might get wrong.
> 
> 
> Well, the above described race is rather theoretical. It will happen
> only when a patched module is being removed and a module with a patch
> is added at the same time. Also it means that some other CPU will
> manage to register, enable the patch, switch the universe, and
> call function from the patched module during the small race window.
> 
> I am not sure if we need to handle such a type of race if the solution
> adds too big complexity.

But b() can't be called after the module is in MODULE_STATE_GOING,
right?  try_stop_module() makes sure it's not being used before changing
its state.

> > +	 *   MODULE_STATE_LIVE: The common case.  The module already finished
> > +	 *   loading.  Just like with MODULE_STATE_COMING, we know the notifier
> > +	 *   didn't see the object, so we init it here.
> > +	 */
> > +	if (mod && (mod->state == MODULE_STATE_COMING ||
> > +		    mod->state == MODULE_STATE_LIVE))
> > +		obj->mod = mod;
> > +
> >  	mutex_unlock(&module_mutex);
> >  }
> >  
> > @@ -695,8 +721,6 @@ static void klp_free_object_loaded(struct klp_object *obj)
> >  {
> >  	struct klp_func *func;
> >  
> > -	obj->mod = NULL;
> > -
> >
> >  	for (func = obj->funcs; func->old_name; func++)
> >  		func->old_addr = 0;
> >  }
> > @@ -765,6 +789,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
> >  		return -EINVAL;
> >  
> >  	obj->state = KLP_DISABLED;
> > +	obj->mod = NULL;
> >  
> >  	klp_find_object_module(obj);
> >  
> > @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
> >  				continue;
> >  
> >  			if (action == MODULE_STATE_COMING) {
> > +
> > +				/*
> > +				 * Check for a small window where the register
> > +				 * path already initialized the object.
> > +				 */
> s/path/patch/

Not a typo, I was referring to the register code path.  Is there a
clearer way to say that?

> 
> 
> 
> > +				if (obj->mod)
> > +					continue;
> 
> This might break stacking. The recently registered patch might become
> the last on the stack and thus unused.

Example please :-)

> >  				obj->mod = mod;
> >  				klp_module_notify_coming(patch, obj);
> > -			} else /* MODULE_STATE_GOING */
> > +			} else {
> > +				/* MODULE_STATE_GOING */
> > +
> > +				/*
> > +				 * Check for a small window where the register
> > +				 * path already saw the GOING state and thus
> > +				 * didn't set obj->mod.
> 
> Same typo here.

Same not actually a typo ;-)

> Also I would write that it did not initialize the object. It is not
> only about setting obj->mod.

Ok.


-- 
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 15:34             ` Josh Poimboeuf
@ 2015-03-04 15:51               ` Jiri Kosina
  2015-03-04 16:41                 ` Josh Poimboeuf
  2015-03-04 16:36               ` Petr Mladek
  1 sibling, 1 reply; 29+ messages in thread
From: Jiri Kosina @ 2015-03-04 15:51 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Seth Jennings, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed, 4 Mar 2015, Josh Poimboeuf wrote:

> > CPU0					CPU1
> > 
> > delete_module()  #SYSCALL
> > 
> >    try_stop_module()
> >      mod->state = MODULE_STATE_GOING;
> > 
> >    mutex_unlock(&module_mutex);
> > 
> > 					klp_register_patch()
> > 					klp_enable_patch()
> > 
> > 					#save place to switch universe
> > 
> > 					b()     # from module that is going
> > 					  a()   # from core (patched)
> > 
> > 
> >    mod->exit();
> > 
> > 
> > Note that the function b() can be called until we call mod->exit().
> > 
> > If we do not apply patch against b() because it is in
> > MODULE_STATE_GOING. It will call patched a() with modified semantic
> > and things might get wrong.
> > 
> > Well, the above described race is rather theoretical. It will happen
> > only when a patched module is being removed and a module with a patch
> > is added at the same time. Also it means that some other CPU will
> > manage to register, enable the patch, switch the universe, and
> > call function from the patched module during the small race window.
> > 
> > I am not sure if we need to handle such a type of race if the solution
> > adds too big complexity.
> 
> But b() can't be called after the module is in MODULE_STATE_GOING,
> right?  try_stop_module() makes sure it's not being used before changing
> its state.

If b() is called from __exit() of the particular module, then you end up 
in exactly the situation described above, don't you?

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 15:34             ` Josh Poimboeuf
  2015-03-04 15:51               ` Jiri Kosina
@ 2015-03-04 16:36               ` Petr Mladek
  2015-03-04 17:57                 ` Josh Poimboeuf
  1 sibling, 1 reply; 29+ messages in thread
From: Petr Mladek @ 2015-03-04 16:36 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed 2015-03-04 09:34:15, Josh Poimboeuf wrote:
> On Wed, Mar 04, 2015 at 02:17:52PM +0100, Petr Mladek wrote:
> > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> > > It's possible for klp_register_patch() to see a module before the COMING
> > > notifier is called, or after the GOING notifier is called.
> > > 
> > > That can cause all kinds of ugly races.  As Pter Mladek reported:
> > > 
> > >   "The problem is that we do not keep the klp_mutex lock all the time when
> > >   the module is being added or removed.

> > >   Second, if we are "lucky" and enable the patch for the coming module when the
> > >   ftrace is ready but before the module notifier has been called. The notifier
> > >   will try to enable the patch as well. It will detect that it is already patched,
> > >   return error, and the module will get rejected just because bad
> > >   timing. The more serious problem is that it will not call the notifier for
> > >   going module, so that the mess will stay there and we wont be able to load
> > >   the module later.
> > 
> > Ah, the race is there but the effect is not that serious in the
> > end. It seems that errors from module notifiers are ignored. In fact,
> > we do not propagate the error from klp_module_notify_coming(). It means
> > that WARN() from klp_enable_object() will be printed but the module
> > will be loaded and patched.
> > 
> > I am sorry, I was confused by kGraft where kgr_module_init() was
> > called directly from module_load(). The errors were propagated. It
> > means that kGraft rejects module when the patch cannot be applied.
> 
> True, but we should still eliminate the race.  Initializing the object
> twice could cause some sneaky bugs, if not now then in the future.

sure

> > Note that the current solution is perfectly fine for the simple
> > consistency model.
> > 
> > 
> > >   Third, similar problems are there for going module. If a patch is enabled after
> > >   the notifier finishes but before the module is removed from the list of modules,
> > >   the new patch will be applied to the module. The module might disappear at
> > >   anytime when the patch enabling is in progress, so there might be an access out
> > >   of memory. Or the whole patch might be applied and some mess will be left,
> > >   so it will not be possible to load/patch the module again."
> > 
> > This is true.
> > 
> > 
> > > Fix these races by letting the first one who sees the module do the
> > > needed work.
> > > 
> > > Reported-by: Petr Mladek <pmladek@suse.cz>
> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > ---
> > >  kernel/livepatch/core.c | 53 +++++++++++++++++++++++++++++++++++++++++++++----
> > >  1 file changed, 49 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > > index a74e4e8..19a758c 100644
> > > --- a/kernel/livepatch/core.c
> > > +++ b/kernel/livepatch/core.c
> > > @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj)
> > >  /* sets obj->mod if object is not vmlinux and module is found */
> > >  static void klp_find_object_module(struct klp_object *obj)
> > >  {
> > > +	struct module *mod;
> > > +
> > >  	if (!klp_is_module(obj))
> > >  		return;
> > >  
> > >  	mutex_lock(&module_mutex);
> > > +
> > >  	/*
> > >  	 * We don't need to take a reference on the module here because we have
> > >  	 * the klp_mutex, which is also taken by the module notifier.  This
> > >  	 * prevents any module from unloading until we release the klp_mutex.
> > >  	 */
> > > -	obj->mod = find_module(obj->name);
> > > +	mod = find_module(obj->name);
> > > +
> > > +	/*
> > > +	 * Be careful here to prevent races with the notifier:
> > > +	 *
> > > +	 * - MODULE_STATE_COMING: This may be before or after the notifier gets
> > > +	 *   called.  If after, the notifier didn't see the object anyway
> > > +	 *   because it hadn't been added to the patch list yet.  Either way,
> > > +	 *   ftrace is already initialized, so it's safe to just go ahead and
> > > +	 *   initialize the object here.
> > 
> > Well, we need to be careful. This will handle only the newly
> > registered patch. If there are other patches against the module
> > on the stack, it might produce wrong order at func_stack, see below.
> 
> Sorry, but I don't follow.  Can you give an example?

See below.

> > > +	 *
> > > +	 * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be
> > > +	 *   before or after the notifier gets called.  If after, the notifer
> > > +	 *   didn't see the object anyway.  Either way it's safe to just
> > > +	 *   pretend that it's already gone and leave obj->mod at NULL.
> > 
> > This is true for the current simple consistency model.
> > 
> > Note that there will be a small race window if we allow dependency
> > between patched functions and introduce more a complex consistency model
> > with lazy patching. The problem is the following sequence:
> > 
> > 
> > CPU0					CPU1
> > 
> > delete_module()  #SYSCALL
> > 
> >    try_stop_module()
> >      mod->state = MODULE_STATE_GOING;
> > 
> >    mutex_unlock(&module_mutex);
> > 
> > 					klp_register_patch()
> > 					klp_enable_patch()
> > 
> > 					#save place to switch universe
> > 
> > 					b()     # from module that is going
> > 					  a()   # from core (patched)
> > 
> > 
> >    mod->exit();
> > 
> > 
> > Note that the function b() can be called until we call mod->exit().
> > 
> > If we do not apply patch against b() because it is in
> > MODULE_STATE_GOING. It will call patched a() with modified semantic
> > and things might get wrong.
> > 
> > 
> > Well, the above described race is rather theoretical. It will happen
> > only when a patched module is being removed and a module with a patch
> > is added at the same time. Also it means that some other CPU will
> > manage to register, enable the patch, switch the universe, and
> > call function from the patched module during the small race window.
> > 
> > I am not sure if we need to handle such a type of race if the solution
> > adds too big complexity.
> 
> But b() can't be called after the module is in MODULE_STATE_GOING,
> right?  try_stop_module() makes sure it's not being used before changing
> its state.

As Jikos mentioned. b() could get called from mod->exit();

IMHO, try_stop_module() is just a misleading name. If you look
inside, it just decrements the refcount. This could not stop anyone
from using the code.

AFAIK, mod->exit() is responsible for canceling all pending
operations, unregistering the module from any callbacks, queues,
lists. If mod->exit() is not done correctly, the module functions could
get called even after it finishes. But then it is a bug in the module.
This is why module removal is a dangerous operation and any
administrator should think twice before doing it.


> > > +	 *   MODULE_STATE_LIVE: The common case.  The module already finished
> > > +	 *   loading.  Just like with MODULE_STATE_COMING, we know the notifier
> > > +	 *   didn't see the object, so we init it here.
> > > +	 */
> > > +	if (mod && (mod->state == MODULE_STATE_COMING ||
> > > +		    mod->state == MODULE_STATE_LIVE))
> > > +		obj->mod = mod;
> > > +
> > >  	mutex_unlock(&module_mutex);
> > >  }
> > >  
> > > @@ -695,8 +721,6 @@ static void klp_free_object_loaded(struct klp_object *obj)
> > >  {
> > >  	struct klp_func *func;
> > >  
> > > -	obj->mod = NULL;
> > > -
> > >
> > >  	for (func = obj->funcs; func->old_name; func++)
> > >  		func->old_addr = 0;
> > >  }
> > > @@ -765,6 +789,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj)
> > >  		return -EINVAL;
> > >  
> > >  	obj->state = KLP_DISABLED;
> > > +	obj->mod = NULL;
> > >  
> > >  	klp_find_object_module(obj);
> > >  
> > > @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action,
> > >  				continue;
> > >  
> > >  			if (action == MODULE_STATE_COMING) {
> > > +
> > > +				/*
> > > +				 * Check for a small window where the register
> > > +				 * path already initialized the object.
> > > +				 */
> > s/path/patch/
> 
> Not a typo, I was referring to the register code path.  Is there a
> clearer way to say that?

I see.

> > 
> > 
> > 
> > > +				if (obj->mod)
> > > +					continue;
> > 
> > This might break stacking. The recently registered patch might become
> > the last on the stack and thus unused.
> 
> Example please :-)

For example, let's have three patches (P1, P2, P3) for the functions a() and b()
where a() is from vmcore and b() is from a module M. Something like:

	a()	b()
P1	a1()	b1()
P2	a2()	b2()
P3	a3()	b3(3)

If you load the module M after all patches are registered and enabled.
The ftrace ops for function a() and b() has listed the functions in this
order

	ops_a->func_stack -> list(a3,a2,a1)
	ops_b->func_stack -> list(b3,b2,b1)

, so the pointer to b3() is the first and will be used.

Then you might have the following scenario. Let's start with state
when patches P1 and P2 are registered and enabled but the module M
is not loaded. Then ftrace ops for b() does not exist. Then we
get into the following race:


CPU0					CPU1

load_module(M)

  complete_formation()

  mod->state = MODULE_STATE_COMING;
  mutex_unlock(&module_mutex);

					klp_register_patch(P3);
					klp_enable_patch(P3);

					# STATE 1


  klp_module_notify(M)
    klp_module_notify_coming(P1);
    klp_module_notify_coming(P2);
    klp_module_notify_coming(P3);

					# STATE 2


The ftrace ops for a() and b() then looks:

  STATE1:

	ops_a->func_stack -> list(a3,a2,a1);
	ops_b->func_stack -> list(b3);

  STATE2:
	ops_a->func_stack -> list(a3,a2,a1);
	ops_b->func_stack -> list(b2,b1,b3);

therefore, b2() is used for the module but a3() is used for vmcore
because they were the last added.


My plan is to fix this problem by calling klp_module_init() directly
in load_module() just after ftrace_module_init(). It will solve this
problem because it will be called in MODULE_STATE_UNFORMED.

It will have another big advantage. It will allow to pass the error
code and refuse loading modules that could not get patched. This will
be needed for the more complex patches anyway. We have to prevent
running module code that is inconsistent with the patched system.

I am still in doubts how to best solve the problem for going modules.
Your suggested solution is fine for now. But we will need a better fix
after adding the more complex consistency model.

Best Regards,
Petr

> > >  				obj->mod = mod;
> > >  				klp_module_notify_coming(patch, obj);
> > > -			} else /* MODULE_STATE_GOING */
> > > +			} else {
> > > +				/* MODULE_STATE_GOING */
> > > +
> > > +				/*
> > > +				 * Check for a small window where the register
> > > +				 * path already saw the GOING state and thus
> > > +				 * didn't set obj->mod.
> > 
> > Same typo here.
> 
> Same not actually a typo ;-)
> 
> > Also I would write that it did not initialize the object. It is not
> > only about setting obj->mod.
> 
> Ok.
> 
> 
> -- 
> Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 15:51               ` Jiri Kosina
@ 2015-03-04 16:41                 ` Josh Poimboeuf
  0 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-04 16:41 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Petr Mladek, Seth Jennings, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed, Mar 04, 2015 at 04:51:39PM +0100, Jiri Kosina wrote:
> On Wed, 4 Mar 2015, Josh Poimboeuf wrote:
> 
> > > CPU0					CPU1
> > > 
> > > delete_module()  #SYSCALL
> > > 
> > >    try_stop_module()
> > >      mod->state = MODULE_STATE_GOING;
> > > 
> > >    mutex_unlock(&module_mutex);
> > > 
> > > 					klp_register_patch()
> > > 					klp_enable_patch()
> > > 
> > > 					#save place to switch universe
> > > 
> > > 					b()     # from module that is going
> > > 					  a()   # from core (patched)
> > > 
> > > 
> > >    mod->exit();
> > > 
> > > 
> > > Note that the function b() can be called until we call mod->exit().
> > > 
> > > If we do not apply patch against b() because it is in
> > > MODULE_STATE_GOING. It will call patched a() with modified semantic
> > > and things might get wrong.
> > > 
> > > Well, the above described race is rather theoretical. It will happen
> > > only when a patched module is being removed and a module with a patch
> > > is added at the same time. Also it means that some other CPU will
> > > manage to register, enable the patch, switch the universe, and
> > > call function from the patched module during the small race window.
> > > 
> > > I am not sure if we need to handle such a type of race if the solution
> > > adds too big complexity.
> > 
> > But b() can't be called after the module is in MODULE_STATE_GOING,
> > right?  try_stop_module() makes sure it's not being used before changing
> > its state.
> 
> If b() is called from __exit() of the particular module, then you end up 
> in exactly the situation described above, don't you?

Well, maybe.  I guess it depends on the consistency model.  The current
"inconsistency" model doesn't allow for function semantic changes
anyway.

If we went to the per-task consistency model with stack checking (like
my RFC), if mod->exit() calls schedule() before calling b(), the module
task can potentially switch to the new universe, and call old b() which
calls new a().

So yeah.  Maybe this isn't the best approach.

-- 
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 16:36               ` Petr Mladek
@ 2015-03-04 17:57                 ` Josh Poimboeuf
  2015-03-04 22:02                   ` Jiri Kosina
  0 siblings, 1 reply; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-04 17:57 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Seth Jennings, Jiri Kosina, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed, Mar 04, 2015 at 05:36:11PM +0100, Petr Mladek wrote:
> For example, let's have three patches (P1, P2, P3) for the functions a() and b()
> where a() is from vmcore and b() is from a module M. Something like:
> 
> 	a()	b()
> P1	a1()	b1()
> P2	a2()	b2()
> P3	a3()	b3(3)
> 
> If you load the module M after all patches are registered and enabled.
> The ftrace ops for function a() and b() has listed the functions in this
> order
> 
> 	ops_a->func_stack -> list(a3,a2,a1)
> 	ops_b->func_stack -> list(b3,b2,b1)
> 
> , so the pointer to b3() is the first and will be used.
> 
> Then you might have the following scenario. Let's start with state
> when patches P1 and P2 are registered and enabled but the module M
> is not loaded. Then ftrace ops for b() does not exist. Then we
> get into the following race:
> 
> 
> CPU0					CPU1
> 
> load_module(M)
> 
>   complete_formation()
> 
>   mod->state = MODULE_STATE_COMING;
>   mutex_unlock(&module_mutex);
> 
> 					klp_register_patch(P3);
> 					klp_enable_patch(P3);
> 
> 					# STATE 1
> 
> 
>   klp_module_notify(M)
>     klp_module_notify_coming(P1);
>     klp_module_notify_coming(P2);
>     klp_module_notify_coming(P3);
> 
> 					# STATE 2
> 
> 
> The ftrace ops for a() and b() then looks:
> 
>   STATE1:
> 
> 	ops_a->func_stack -> list(a3,a2,a1);
> 	ops_b->func_stack -> list(b3);
> 
>   STATE2:
> 	ops_a->func_stack -> list(a3,a2,a1);
> 	ops_b->func_stack -> list(b2,b1,b3);
> 
> therefore, b2() is used for the module but a3() is used for vmcore
> because they were the last added.

Thanks for the excellent explanation.  That makes sense.

> My plan is to fix this problem by calling klp_module_init() directly
> in load_module() just after ftrace_module_init(). It will solve this
> problem because it will be called in MODULE_STATE_UNFORMED.

Ok, looking forward to that.

> It will have another big advantage. It will allow to pass the error
> code and refuse loading modules that could not get patched. This will
> be needed for the more complex patches anyway. We have to prevent
> running module code that is inconsistent with the patched system.

Yeah, that does need to be fixed up too.

> I am still in doubts how to best solve the problem for going modules.
> Your suggested solution is fine for now. But we will need a better fix
> after adding the more complex consistency model.

Well, we could just get a reference on all patched modules to prevent them
from being unloaded.

-- 
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module()
  2015-03-03 23:02         ` [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module() Josh Poimboeuf
  2015-03-04  9:00           ` Petr Mladek
@ 2015-03-04 21:48           ` Jiri Kosina
  1 sibling, 0 replies; 29+ messages in thread
From: Jiri Kosina @ 2015-03-04 21:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Seth Jennings, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Tue, 3 Mar 2015, Josh Poimboeuf wrote:

> klp_find_object_module() is called from both the klp register and enable
> paths.  Only the call from the register path is necessary because the
> module notifier will let us know if the patched module gets loaded or
> unloaded.
> 
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

Applied to for-4.1/core.

> ---
>  kernel/livepatch/core.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 01ca088..a74e4e8 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -541,8 +541,6 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  	pr_notice("enabling patch '%s'\n", patch->mod->name);
>  
>  	for (obj = patch->objs; obj->funcs; obj++) {
> -		klp_find_object_module(obj);
> -
>  		if (!klp_is_object_loaded(obj))
>  			continue;
>  

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 17:57                 ` Josh Poimboeuf
@ 2015-03-04 22:02                   ` Jiri Kosina
  2015-03-04 22:45                     ` Josh Poimboeuf
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Kosina @ 2015-03-04 22:02 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Seth Jennings, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed, 4 Mar 2015, Josh Poimboeuf wrote:

> Well, we could just get a reference on all patched modules to prevent them
> from being unloaded.

Is that really a solution for cases where you are unloading a module (it 
has "just" been switched to MODULE_STATE_GOING) and enable a patch which 
patches one of its functions directly after it got switched to 
MODULE_STATE_GOING?

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 22:02                   ` Jiri Kosina
@ 2015-03-04 22:45                     ` Josh Poimboeuf
  0 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-04 22:45 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Petr Mladek, Seth Jennings, Vojtech Pavlik, live-patching,
	linux-kernel, Rusty Russell, Miroslav Benes, Masami Hiramatsu,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Wed, Mar 04, 2015 at 11:02:59PM +0100, Jiri Kosina wrote:
> On Wed, 4 Mar 2015, Josh Poimboeuf wrote:
> 
> > Well, we could just get a reference on all patched modules to prevent them
> > from being unloaded.
> 
> Is that really a solution for cases where you are unloading a module (it 
> has "just" been switched to MODULE_STATE_GOING) and enable a patch which 
> patches one of its functions directly after it got switched to 
> MODULE_STATE_GOING?

Ah, right.  try_module_get() fails, we don't patch the module, and we
end up in the same mod->exit() race.  Never mind, again :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-04 13:17           ` Petr Mladek
  2015-03-04 14:18             ` Petr Mladek
  2015-03-04 15:34             ` Josh Poimboeuf
@ 2015-03-05  0:52             ` Masami Hiramatsu
  2015-03-05 14:18               ` Josh Poimboeuf
  2015-03-05 14:24               ` Petr Mladek
  2 siblings, 2 replies; 29+ messages in thread
From: Masami Hiramatsu @ 2015-03-05  0:52 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

(2015/03/04 22:17), Petr Mladek wrote:
> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
>> It's possible for klp_register_patch() to see a module before the COMING
>> notifier is called, or after the GOING notifier is called.
>>
>> That can cause all kinds of ugly races.  As Pter Mladek reported:
>>
>>   "The problem is that we do not keep the klp_mutex lock all the time when
>>   the module is being added or removed.
>>
>>   First, the module is visible even before ftrace is ready. If we enable a patch
>>   in this time frame, adding ftrace ops will fail and the patch will get rejected
>>   just because bad timing.
> 
> Ah, this is not true after all. I did not properly check when
> MODULE_STATE_COMING was set. I though that it was before ftrace was
> initialized but it was not true.
> 
> 
>>   Second, if we are "lucky" and enable the patch for the coming module when the
>>   ftrace is ready but before the module notifier has been called. The notifier
>>   will try to enable the patch as well. It will detect that it is already patched,
>>   return error, and the module will get rejected just because bad
>>   timing. The more serious problem is that it will not call the notifier for
>>   going module, so that the mess will stay there and we wont be able to load
>>   the module later.
> 
> Ah, the race is there but the effect is not that serious in the
> end. It seems that errors from module notifiers are ignored. In fact,
> we do not propagate the error from klp_module_notify_coming(). It means
> that WARN() from klp_enable_object() will be printed but the module
> will be loaded and patched.
> 
> I am sorry, I was confused by kGraft where kgr_module_init() was
> called directly from module_load(). The errors were propagated. It
> means that kGraft rejects module when the patch cannot be applied.
> 
> Note that the current solution is perfectly fine for the simple
> consistency model.
> 
> 
>>   Third, similar problems are there for going module. If a patch is enabled after
>>   the notifier finishes but before the module is removed from the list of modules,
>>   the new patch will be applied to the module. The module might disappear at
>>   anytime when the patch enabling is in progress, so there might be an access out
>>   of memory. Or the whole patch might be applied and some mess will be left,
>>   so it will not be possible to load/patch the module again."
> 
> This is true.

No, that's not true if you try_get_module() before patching. After the
module state goes GOING (more correctly say, after try_release_module_ref()
succeeded), all try_get_module() must fail :)
So, please make sure to get module when applying patches.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-05  0:52             ` Masami Hiramatsu
@ 2015-03-05 14:18               ` Josh Poimboeuf
  2015-03-06  1:24                 ` Masami Hiramatsu
  2015-03-05 14:24               ` Petr Mladek
  1 sibling, 1 reply; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-05 14:18 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Petr Mladek, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote:
> (2015/03/04 22:17), Petr Mladek wrote:
> > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> >> It's possible for klp_register_patch() to see a module before the COMING
> >> notifier is called, or after the GOING notifier is called.
> >>
> >> That can cause all kinds of ugly races.  As Pter Mladek reported:
> >>
> >>   "The problem is that we do not keep the klp_mutex lock all the time when
> >>   the module is being added or removed.
> >>
> >>   First, the module is visible even before ftrace is ready. If we enable a patch
> >>   in this time frame, adding ftrace ops will fail and the patch will get rejected
> >>   just because bad timing.
> > 
> > Ah, this is not true after all. I did not properly check when
> > MODULE_STATE_COMING was set. I though that it was before ftrace was
> > initialized but it was not true.
> > 
> > 
> >>   Second, if we are "lucky" and enable the patch for the coming module when the
> >>   ftrace is ready but before the module notifier has been called. The notifier
> >>   will try to enable the patch as well. It will detect that it is already patched,
> >>   return error, and the module will get rejected just because bad
> >>   timing. The more serious problem is that it will not call the notifier for
> >>   going module, so that the mess will stay there and we wont be able to load
> >>   the module later.
> > 
> > Ah, the race is there but the effect is not that serious in the
> > end. It seems that errors from module notifiers are ignored. In fact,
> > we do not propagate the error from klp_module_notify_coming(). It means
> > that WARN() from klp_enable_object() will be printed but the module
> > will be loaded and patched.
> > 
> > I am sorry, I was confused by kGraft where kgr_module_init() was
> > called directly from module_load(). The errors were propagated. It
> > means that kGraft rejects module when the patch cannot be applied.
> > 
> > Note that the current solution is perfectly fine for the simple
> > consistency model.
> > 
> > 
> >>   Third, similar problems are there for going module. If a patch is enabled after
> >>   the notifier finishes but before the module is removed from the list of modules,
> >>   the new patch will be applied to the module. The module might disappear at
> >>   anytime when the patch enabling is in progress, so there might be an access out
> >>   of memory. Or the whole patch might be applied and some mess will be left,
> >>   so it will not be possible to load/patch the module again."
> > 
> > This is true.
> 
> No, that's not true if you try_get_module() before patching. After the
> module state goes GOING (more correctly say, after try_release_module_ref()
> succeeded), all try_get_module() must fail :)
> So, please make sure to get module when applying patches.

Hi Masami,

As Jikos pointed out elsewhere, try_get_module() won't solve all the
GOING races.

The module can be in GOING before mod->exit() is called.  If we apply a
patch between GOING getting set and mod->exit(), try_module_get() will
fail and the module won't be patched.  But module code can still run
before or during mod->exit(), so the unpatched module code might
interact badly with new patched code elsewhere.

-- 
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-05  0:52             ` Masami Hiramatsu
  2015-03-05 14:18               ` Josh Poimboeuf
@ 2015-03-05 14:24               ` Petr Mladek
  1 sibling, 0 replies; 29+ messages in thread
From: Petr Mladek @ 2015-03-05 14:24 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Thu 2015-03-05 09:52:41, Masami Hiramatsu wrote:
> (2015/03/04 22:17), Petr Mladek wrote:
> > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> >> It's possible for klp_register_patch() to see a module before the COMING
> >> notifier is called, or after the GOING notifier is called.
> >>
> >> That can cause all kinds of ugly races.  As Pter Mladek reported:
> >>
> >>   "The problem is that we do not keep the klp_mutex lock all the time when
> >>   the module is being added or removed.
> >>
> >>   First, the module is visible even before ftrace is ready. If we enable a patch
> >>   in this time frame, adding ftrace ops will fail and the patch will get rejected
> >>   just because bad timing.
> > 
> > Ah, this is not true after all. I did not properly check when
> > MODULE_STATE_COMING was set. I though that it was before ftrace was
> > initialized but it was not true.
> > 
> > 
> >>   Second, if we are "lucky" and enable the patch for the coming module when the
> >>   ftrace is ready but before the module notifier has been called. The notifier
> >>   will try to enable the patch as well. It will detect that it is already patched,
> >>   return error, and the module will get rejected just because bad
> >>   timing. The more serious problem is that it will not call the notifier for
> >>   going module, so that the mess will stay there and we wont be able to load
> >>   the module later.
> > 
> > Ah, the race is there but the effect is not that serious in the
> > end. It seems that errors from module notifiers are ignored. In fact,
> > we do not propagate the error from klp_module_notify_coming(). It means
> > that WARN() from klp_enable_object() will be printed but the module
> > will be loaded and patched.
> > 
> > I am sorry, I was confused by kGraft where kgr_module_init() was
> > called directly from module_load(). The errors were propagated. It
> > means that kGraft rejects module when the patch cannot be applied.
> > 
> > Note that the current solution is perfectly fine for the simple
> > consistency model.
> > 
> > 
> >>   Third, similar problems are there for going module. If a patch is enabled after
> >>   the notifier finishes but before the module is removed from the list of modules,
> >>   the new patch will be applied to the module. The module might disappear at
> >>   anytime when the patch enabling is in progress, so there might be an access out
> >>   of memory. Or the whole patch might be applied and some mess will be left,
> >>   so it will not be possible to load/patch the module again."
> > 
> > This is true.
> 
> No, that's not true if you try_get_module() before patching. After the
> module state goes GOING (more correctly say, after try_release_module_ref()
> succeeded), all try_get_module() must fail :)
> So, please make sure to get module when applying patches.

The question is what to do if we could not get the module reference
because the module is going. We must not ignore the module because there is
a window when the module code could still get called, see
https://lkml.org/lkml/2015/3/4/776

It would be possible to wait until the module is removed but it might
take quite some time, especially if the mod->exit() function is complex
and need to wait for something.

Or we could refuse to add the patch but it is ugly.

I tend to go back and use the original idea with a boolean that is
updated when the module going notifier is called. It sets the border,
when it is safe to ignore the going module. There is no waiting,
no rejection.

I am actually getting near to send rfc patch v2 with this implementation.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-05 14:18               ` Josh Poimboeuf
@ 2015-03-06  1:24                 ` Masami Hiramatsu
  2015-03-06 10:51                   ` Petr Mladek
  0 siblings, 1 reply; 29+ messages in thread
From: Masami Hiramatsu @ 2015-03-06  1:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Petr Mladek, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

(2015/03/05 23:18), Josh Poimboeuf wrote:
> On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote:
>> (2015/03/04 22:17), Petr Mladek wrote:
>>> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
>>>> It's possible for klp_register_patch() to see a module before the COMING
>>>> notifier is called, or after the GOING notifier is called.
>>>>
>>>> That can cause all kinds of ugly races.  As Pter Mladek reported:
>>>>
>>>>   "The problem is that we do not keep the klp_mutex lock all the time when
>>>>   the module is being added or removed.
>>>>
>>>>   First, the module is visible even before ftrace is ready. If we enable a patch
>>>>   in this time frame, adding ftrace ops will fail and the patch will get rejected
>>>>   just because bad timing.
>>>
>>> Ah, this is not true after all. I did not properly check when
>>> MODULE_STATE_COMING was set. I though that it was before ftrace was
>>> initialized but it was not true.
>>>
>>>
>>>>   Second, if we are "lucky" and enable the patch for the coming module when the
>>>>   ftrace is ready but before the module notifier has been called. The notifier
>>>>   will try to enable the patch as well. It will detect that it is already patched,
>>>>   return error, and the module will get rejected just because bad
>>>>   timing. The more serious problem is that it will not call the notifier for
>>>>   going module, so that the mess will stay there and we wont be able to load
>>>>   the module later.
>>>
>>> Ah, the race is there but the effect is not that serious in the
>>> end. It seems that errors from module notifiers are ignored. In fact,
>>> we do not propagate the error from klp_module_notify_coming(). It means
>>> that WARN() from klp_enable_object() will be printed but the module
>>> will be loaded and patched.
>>>
>>> I am sorry, I was confused by kGraft where kgr_module_init() was
>>> called directly from module_load(). The errors were propagated. It
>>> means that kGraft rejects module when the patch cannot be applied.
>>>
>>> Note that the current solution is perfectly fine for the simple
>>> consistency model.
>>>
>>>
>>>>   Third, similar problems are there for going module. If a patch is enabled after
>>>>   the notifier finishes but before the module is removed from the list of modules,
>>>>   the new patch will be applied to the module. The module might disappear at
>>>>   anytime when the patch enabling is in progress, so there might be an access out
>>>>   of memory. Or the whole patch might be applied and some mess will be left,
>>>>   so it will not be possible to load/patch the module again."
>>>
>>> This is true.
>>
>> No, that's not true if you try_get_module() before patching. After the
>> module state goes GOING (more correctly say, after try_release_module_ref()
>> succeeded), all try_get_module() must fail :)
>> So, please make sure to get module when applying patches.
> 
> Hi Masami,
> 
> As Jikos pointed out elsewhere, try_get_module() won't solve all the
> GOING races.
> 
> The module can be in GOING before mod->exit() is called.  If we apply a
> patch between GOING getting set and mod->exit(), try_module_get() will
> fail and the module won't be patched.  But module code can still run
> before or during mod->exit(), so the unpatched module code might
> interact badly with new patched code elsewhere.

Hmm, in that case, we'd better have new GONE state for the module.
At least kprobe needs it.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-06  1:24                 ` Masami Hiramatsu
@ 2015-03-06 10:51                   ` Petr Mladek
  2015-03-06 11:37                     ` Masami Hiramatsu
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Mladek @ 2015-03-06 10:51 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote:
> (2015/03/05 23:18), Josh Poimboeuf wrote:
> > On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote:
> >> (2015/03/04 22:17), Petr Mladek wrote:
> >>> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> >>>> It's possible for klp_register_patch() to see a module before the COMING
> >>>> notifier is called, or after the GOING notifier is called.
> >>>>
> >>>> That can cause all kinds of ugly races.  As Pter Mladek reported:
> >>>>
> >>>>   "The problem is that we do not keep the klp_mutex lock all the time when
> >>>>   the module is being added or removed.
> >>>>
> >>>>   First, the module is visible even before ftrace is ready. If we enable a patch
> >>>>   in this time frame, adding ftrace ops will fail and the patch will get rejected
> >>>>   just because bad timing.
> >>>
> >>> Ah, this is not true after all. I did not properly check when
> >>> MODULE_STATE_COMING was set. I though that it was before ftrace was
> >>> initialized but it was not true.
> >>>
> >>>
> >>>>   Second, if we are "lucky" and enable the patch for the coming module when the
> >>>>   ftrace is ready but before the module notifier has been called. The notifier
> >>>>   will try to enable the patch as well. It will detect that it is already patched,
> >>>>   return error, and the module will get rejected just because bad
> >>>>   timing. The more serious problem is that it will not call the notifier for
> >>>>   going module, so that the mess will stay there and we wont be able to load
> >>>>   the module later.
> >>>
> >>> Ah, the race is there but the effect is not that serious in the
> >>> end. It seems that errors from module notifiers are ignored. In fact,
> >>> we do not propagate the error from klp_module_notify_coming(). It means
> >>> that WARN() from klp_enable_object() will be printed but the module
> >>> will be loaded and patched.
> >>>
> >>> I am sorry, I was confused by kGraft where kgr_module_init() was
> >>> called directly from module_load(). The errors were propagated. It
> >>> means that kGraft rejects module when the patch cannot be applied.
> >>>
> >>> Note that the current solution is perfectly fine for the simple
> >>> consistency model.
> >>>
> >>>
> >>>>   Third, similar problems are there for going module. If a patch is enabled after
> >>>>   the notifier finishes but before the module is removed from the list of modules,
> >>>>   the new patch will be applied to the module. The module might disappear at
> >>>>   anytime when the patch enabling is in progress, so there might be an access out
> >>>>   of memory. Or the whole patch might be applied and some mess will be left,
> >>>>   so it will not be possible to load/patch the module again."
> >>>
> >>> This is true.
> >>
> >> No, that's not true if you try_get_module() before patching. After the
> >> module state goes GOING (more correctly say, after try_release_module_ref()
> >> succeeded), all try_get_module() must fail :)
> >> So, please make sure to get module when applying patches.
> > 
> > Hi Masami,
> > 
> > As Jikos pointed out elsewhere, try_get_module() won't solve all the
> > GOING races.
> > 
> > The module can be in GOING before mod->exit() is called.  If we apply a
> > patch between GOING getting set and mod->exit(), try_module_get() will
> > fail and the module won't be patched.  But module code can still run
> > before or during mod->exit(), so the unpatched module code might
> > interact badly with new patched code elsewhere.
> 
> Hmm, in that case, we'd better have new GONE state for the module.
> At least kprobe needs it.

What is the exact problem with kprobes, please?

Note that the notifiers for MODULE_STATE_GOING are called
after mod->exit(). Therefore it is safe to kill kprobes
the fast way when using the notifier.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-06 10:51                   ` Petr Mladek
@ 2015-03-06 11:37                     ` Masami Hiramatsu
  2015-03-06 13:05                       ` Petr Mladek
  2015-03-06 14:43                       ` Josh Poimboeuf
  0 siblings, 2 replies; 29+ messages in thread
From: Masami Hiramatsu @ 2015-03-06 11:37 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

(2015/03/06 19:51), Petr Mladek wrote:
> On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote:
>> (2015/03/05 23:18), Josh Poimboeuf wrote:
>>> On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote:
>>>> (2015/03/04 22:17), Petr Mladek wrote:
>>>>> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
>>>>>> It's possible for klp_register_patch() to see a module before the COMING
>>>>>> notifier is called, or after the GOING notifier is called.
>>>>>>
>>>>>> That can cause all kinds of ugly races.  As Pter Mladek reported:
>>>>>>
>>>>>>   "The problem is that we do not keep the klp_mutex lock all the time when
>>>>>>   the module is being added or removed.
>>>>>>
>>>>>>   First, the module is visible even before ftrace is ready. If we enable a patch
>>>>>>   in this time frame, adding ftrace ops will fail and the patch will get rejected
>>>>>>   just because bad timing.
>>>>>
>>>>> Ah, this is not true after all. I did not properly check when
>>>>> MODULE_STATE_COMING was set. I though that it was before ftrace was
>>>>> initialized but it was not true.
>>>>>
>>>>>
>>>>>>   Second, if we are "lucky" and enable the patch for the coming module when the
>>>>>>   ftrace is ready but before the module notifier has been called. The notifier
>>>>>>   will try to enable the patch as well. It will detect that it is already patched,
>>>>>>   return error, and the module will get rejected just because bad
>>>>>>   timing. The more serious problem is that it will not call the notifier for
>>>>>>   going module, so that the mess will stay there and we wont be able to load
>>>>>>   the module later.
>>>>>
>>>>> Ah, the race is there but the effect is not that serious in the
>>>>> end. It seems that errors from module notifiers are ignored. In fact,
>>>>> we do not propagate the error from klp_module_notify_coming(). It means
>>>>> that WARN() from klp_enable_object() will be printed but the module
>>>>> will be loaded and patched.
>>>>>
>>>>> I am sorry, I was confused by kGraft where kgr_module_init() was
>>>>> called directly from module_load(). The errors were propagated. It
>>>>> means that kGraft rejects module when the patch cannot be applied.
>>>>>
>>>>> Note that the current solution is perfectly fine for the simple
>>>>> consistency model.
>>>>>
>>>>>
>>>>>>   Third, similar problems are there for going module. If a patch is enabled after
>>>>>>   the notifier finishes but before the module is removed from the list of modules,
>>>>>>   the new patch will be applied to the module. The module might disappear at
>>>>>>   anytime when the patch enabling is in progress, so there might be an access out
>>>>>>   of memory. Or the whole patch might be applied and some mess will be left,
>>>>>>   so it will not be possible to load/patch the module again."
>>>>>
>>>>> This is true.
>>>>
>>>> No, that's not true if you try_get_module() before patching. After the
>>>> module state goes GOING (more correctly say, after try_release_module_ref()
>>>> succeeded), all try_get_module() must fail :)
>>>> So, please make sure to get module when applying patches.
>>>
>>> Hi Masami,
>>>
>>> As Jikos pointed out elsewhere, try_get_module() won't solve all the
>>> GOING races.
>>>
>>> The module can be in GOING before mod->exit() is called.  If we apply a
>>> patch between GOING getting set and mod->exit(), try_module_get() will
>>> fail and the module won't be patched.  But module code can still run
>>> before or during mod->exit(), so the unpatched module code might
>>> interact badly with new patched code elsewhere.
>>
>> Hmm, in that case, we'd better have new GONE state for the module.
>> At least kprobe needs it.
> 
> What is the exact problem with kprobes, please?

Ah, sorry, I miss understood the issue.


> Note that the notifiers for MODULE_STATE_GOING are called
> after mod->exit(). Therefore it is safe to kill kprobes
> the fast way when using the notifier.

Right.

        /* Stop the machine so refcounts can't move and disable module. */
        ret = try_stop_module(mod, flags, &forced);
        if (ret != 0)
                goto out;

        mutex_unlock(&module_mutex);
        /* Final destruction now no one is using it. */
        if (mod->exit != NULL)
                mod->exit();
        blocking_notifier_call_chain(&module_notify_list,
                                     MODULE_STATE_GOING, mod);
        async_synchronize_full();

And, kprobes also can try to put a probe between mutex_unlock()
and mod->exit() on mod->exit() function.
Since kprobe tries to get module when registering, it will
fail but mod->exit() is called after that fail.

So, there is also a race.

However, I'm not sure that is actual serious issue.
Actually, we can suppose this module unloading context is
not changing universe. thus it is expected behavior, isn't it?

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-06 11:37                     ` Masami Hiramatsu
@ 2015-03-06 13:05                       ` Petr Mladek
  2015-03-06 14:43                       ` Josh Poimboeuf
  1 sibling, 0 replies; 29+ messages in thread
From: Petr Mladek @ 2015-03-06 13:05 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Fri 2015-03-06 20:37:26, Masami Hiramatsu wrote:
> (2015/03/06 19:51), Petr Mladek wrote:
> > On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote:
> >> (2015/03/05 23:18), Josh Poimboeuf wrote:
> >>> On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote:
> >>>> (2015/03/04 22:17), Petr Mladek wrote:
> >>>>> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote:
> >>>>>> It's possible for klp_register_patch() to see a module before the COMING
> >>>>>> notifier is called, or after the GOING notifier is called.
> >>>>>>
> >>>>>> That can cause all kinds of ugly races.  As Pter Mladek reported:
> >>>>>>
> >>>>>>   "The problem is that we do not keep the klp_mutex lock all the time when
> >>>>>>   the module is being added or removed.
> >>>>>>
> >>>>>>   First, the module is visible even before ftrace is ready. If we enable a patch
> >>>>>>   in this time frame, adding ftrace ops will fail and the patch will get rejected
> >>>>>>   just because bad timing.
> >>>>>
> >>>>> Ah, this is not true after all. I did not properly check when
> >>>>> MODULE_STATE_COMING was set. I though that it was before ftrace was
> >>>>> initialized but it was not true.
> >>>>>
> >>>>>
> >>>>>>   Second, if we are "lucky" and enable the patch for the coming module when the
> >>>>>>   ftrace is ready but before the module notifier has been called. The notifier
> >>>>>>   will try to enable the patch as well. It will detect that it is already patched,
> >>>>>>   return error, and the module will get rejected just because bad
> >>>>>>   timing. The more serious problem is that it will not call the notifier for
> >>>>>>   going module, so that the mess will stay there and we wont be able to load
> >>>>>>   the module later.
> >>>>>
> >>>>> Ah, the race is there but the effect is not that serious in the
> >>>>> end. It seems that errors from module notifiers are ignored. In fact,
> >>>>> we do not propagate the error from klp_module_notify_coming(). It means
> >>>>> that WARN() from klp_enable_object() will be printed but the module
> >>>>> will be loaded and patched.
> >>>>>
> >>>>> I am sorry, I was confused by kGraft where kgr_module_init() was
> >>>>> called directly from module_load(). The errors were propagated. It
> >>>>> means that kGraft rejects module when the patch cannot be applied.
> >>>>>
> >>>>> Note that the current solution is perfectly fine for the simple
> >>>>> consistency model.
> >>>>>
> >>>>>
> >>>>>>   Third, similar problems are there for going module. If a patch is enabled after
> >>>>>>   the notifier finishes but before the module is removed from the list of modules,
> >>>>>>   the new patch will be applied to the module. The module might disappear at
> >>>>>>   anytime when the patch enabling is in progress, so there might be an access out
> >>>>>>   of memory. Or the whole patch might be applied and some mess will be left,
> >>>>>>   so it will not be possible to load/patch the module again."
> >>>>>
> >>>>> This is true.
> >>>>
> >>>> No, that's not true if you try_get_module() before patching. After the
> >>>> module state goes GOING (more correctly say, after try_release_module_ref()
> >>>> succeeded), all try_get_module() must fail :)
> >>>> So, please make sure to get module when applying patches.
> >>>
> >>> Hi Masami,
> >>>
> >>> As Jikos pointed out elsewhere, try_get_module() won't solve all the
> >>> GOING races.
> >>>
> >>> The module can be in GOING before mod->exit() is called.  If we apply a
> >>> patch between GOING getting set and mod->exit(), try_module_get() will
> >>> fail and the module won't be patched.  But module code can still run
> >>> before or during mod->exit(), so the unpatched module code might
> >>> interact badly with new patched code elsewhere.
> >>
> >> Hmm, in that case, we'd better have new GONE state for the module.
> >> At least kprobe needs it.
> > 
> > What is the exact problem with kprobes, please?
> 
> Ah, sorry, I miss understood the issue.
> 
> 
> > Note that the notifiers for MODULE_STATE_GOING are called
> > after mod->exit(). Therefore it is safe to kill kprobes
> > the fast way when using the notifier.
> 
> Right.
> 
>         /* Stop the machine so refcounts can't move and disable module. */
>         ret = try_stop_module(mod, flags, &forced);
>         if (ret != 0)
>                 goto out;
> 
>         mutex_unlock(&module_mutex);
>         /* Final destruction now no one is using it. */
>         if (mod->exit != NULL)
>                 mod->exit();
>         blocking_notifier_call_chain(&module_notify_list,
>                                      MODULE_STATE_GOING, mod);
>         async_synchronize_full();
> 
> And, kprobes also can try to put a probe between mutex_unlock()
> and mod->exit() on mod->exit() function.
> Since kprobe tries to get module when registering, it will
> fail but mod->exit() is called after that fail.
> 
> So, there is also a race.

I think that it is not a problem if the Kprobe can not be registered
at this stage. If anyone wants to register Kprobe on the mod->exit(),
she should do it earlier.

> However, I'm not sure that is actual serious issue.
> Actually, we can suppose this module unloading context is
> not changing universe. thus it is expected behavior, isn't it?

Live Patching is in different situation. The patch modifies many
functions at once. If we allow semantic changes in functions, we need
to have the patches applied consistently on all code that might get
called, including going module. The potential race is described at
https://lkml.org/lkml/2015/3/4/776. See the middle of the message.

Best Regards,
Petr


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] livepatch: fix patched module loading race
  2015-03-06 11:37                     ` Masami Hiramatsu
  2015-03-06 13:05                       ` Petr Mladek
@ 2015-03-06 14:43                       ` Josh Poimboeuf
  1 sibling, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2015-03-06 14:43 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Petr Mladek, Seth Jennings, Jiri Kosina, Vojtech Pavlik,
	live-patching, linux-kernel, Rusty Russell, Miroslav Benes,
	mingo, mathieu.desnoyers, oleg, paulmck, andi, rostedt, tglx

On Fri, Mar 06, 2015 at 08:37:26PM +0900, Masami Hiramatsu wrote:
> Actually, we can suppose this module unloading context is
> not changing universe. thus it is expected behavior, isn't it?

In the case of my proposed consistency model RFC, if the module
unloading task gets preempted, or if mod->exit() calls schedule(), its
task can switch to the new universe before it's done.

And for many modules it could also be possible for other contexts to
access the module's functions in the GOING state before mod->exit()
disables them.

-- 
Josh

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2015-03-06 14:44 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-03 11:38 [RFC PATCH] livepatch/module: Do not patch modules that are not ready Petr Mladek
2015-03-03 14:55 ` Josh Poimboeuf
2015-03-03 15:48   ` Petr Mladek
2015-03-03 16:01     ` Josh Poimboeuf
2015-03-03 17:34   ` Petr Mladek
2015-03-03 19:31     ` Josh Poimboeuf
2015-03-03 19:35       ` Josh Poimboeuf
2015-03-03 23:02       ` [PATCH 0/2] livepatch: fix patch module loading race Josh Poimboeuf
2015-03-03 23:02         ` [PATCH 1/2] livepatch: remove unnecessary call to klp_find_object_module() Josh Poimboeuf
2015-03-04  9:00           ` Petr Mladek
2015-03-04 21:48           ` Jiri Kosina
2015-03-03 23:02         ` [PATCH 2/2] livepatch: fix patched module loading race Josh Poimboeuf
2015-03-04 13:17           ` Petr Mladek
2015-03-04 14:18             ` Petr Mladek
2015-03-04 15:34             ` Josh Poimboeuf
2015-03-04 15:51               ` Jiri Kosina
2015-03-04 16:41                 ` Josh Poimboeuf
2015-03-04 16:36               ` Petr Mladek
2015-03-04 17:57                 ` Josh Poimboeuf
2015-03-04 22:02                   ` Jiri Kosina
2015-03-04 22:45                     ` Josh Poimboeuf
2015-03-05  0:52             ` Masami Hiramatsu
2015-03-05 14:18               ` Josh Poimboeuf
2015-03-06  1:24                 ` Masami Hiramatsu
2015-03-06 10:51                   ` Petr Mladek
2015-03-06 11:37                     ` Masami Hiramatsu
2015-03-06 13:05                       ` Petr Mladek
2015-03-06 14:43                       ` Josh Poimboeuf
2015-03-05 14:24               ` Petr Mladek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).