LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [patch 00/11] Text Edit Lock
@ 2007-11-13 18:46 Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 01/11] Kprobes - use a mutex to protect the instruction pages list Mathieu Desnoyers
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel

Hi Andrew,

Here is a repost of the Text Edit Lock. It is required by the immediate values.

It applies on top of 2.6.24-rc2-git3 in this order :

#Text Edit Lock
#for -mm
kprobes-use-mutex-for-insn-pages.patch
kprobes-dont-use-kprobes-mutex-in-arch-code.patch
kprobes-declare-kprobes-mutex-static.patch
declare-array.patch
text-edit-lock-architecture-independent-code.patch
text-edit-lock-alternative-i386-and-x86_64.patch
text-edit-lock-kprobes-architecture-independent.patch
text-edit-lock-kprobes-i386.patch
text-edit-lock-kprobes-x86_64.patch
text-edit-lock-i386-standardize-debug-rodata.patch
text-edit-lock-x86_64-standardize-debug-rodata.patch

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 01/11] Kprobes - use a mutex to protect the instruction pages list.
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 02/11] Kprobes - do not use kprobes mutex in arch code Mathieu Desnoyers
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel
  Cc: Mathieu Desnoyers, Ananth N Mavinakayanahalli, hch, prasanna,
	anil.s.keshavamurthy, davem

[-- Attachment #1: kprobes-use-mutex-for-insn-pages.patch --]
[-- Type: text/plain, Size: 3625 bytes --]

Protect the instruction pages list by a specific insn pages mutex, called in 
get_insn_slot() and free_insn_slot(). It makes sure that architectures that does
not need to call arch_remove_kprobe() does not take an unneeded kprobes mutex.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: hch@infradead.org
CC: prasanna@in.ibm.com
CC: anil.s.keshavamurthy@intel.com
CC: davem@davemloft.net
---
 kernel/kprobes.c |   27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

Index: linux-2.6-lttng/kernel/kprobes.c
===================================================================
--- linux-2.6-lttng.orig/kernel/kprobes.c	2007-08-27 11:48:56.000000000 -0400
+++ linux-2.6-lttng/kernel/kprobes.c	2007-08-27 11:48:58.000000000 -0400
@@ -95,6 +95,10 @@ enum kprobe_slot_state {
 	SLOT_USED = 2,
 };
 
+/*
+ * Protects the kprobe_insn_pages list. Can nest into kprobe_mutex.
+ */
+static DEFINE_MUTEX(kprobe_insn_mutex);
 static struct hlist_head kprobe_insn_pages;
 static int kprobe_garbage_slots;
 static int collect_garbage_slots(void);
@@ -131,7 +135,9 @@ kprobe_opcode_t __kprobes *get_insn_slot
 {
 	struct kprobe_insn_page *kip;
 	struct hlist_node *pos;
+	kprobe_opcode_t *ret;
 
+	mutex_lock(&kprobe_insn_mutex);
  retry:
 	hlist_for_each_entry(kip, pos, &kprobe_insn_pages, hlist) {
 		if (kip->nused < INSNS_PER_PAGE) {
@@ -140,7 +146,8 @@ kprobe_opcode_t __kprobes *get_insn_slot
 				if (kip->slot_used[i] == SLOT_CLEAN) {
 					kip->slot_used[i] = SLOT_USED;
 					kip->nused++;
-					return kip->insns + (i * MAX_INSN_SIZE);
+					ret = kip->insns + (i * MAX_INSN_SIZE);
+					goto end;
 				}
 			}
 			/* Surprise!  No unused slots.  Fix kip->nused. */
@@ -154,8 +161,10 @@ kprobe_opcode_t __kprobes *get_insn_slot
 	}
 	/* All out of space.  Need to allocate a new page. Use slot 0. */
 	kip = kmalloc(sizeof(struct kprobe_insn_page), GFP_KERNEL);
-	if (!kip)
-		return NULL;
+	if (!kip) {
+		ret = NULL;
+		goto end;
+	}
 
 	/*
 	 * Use module_alloc so this page is within +/- 2GB of where the
@@ -165,7 +174,8 @@ kprobe_opcode_t __kprobes *get_insn_slot
 	kip->insns = module_alloc(PAGE_SIZE);
 	if (!kip->insns) {
 		kfree(kip);
-		return NULL;
+		ret = NULL;
+		goto end;
 	}
 	INIT_HLIST_NODE(&kip->hlist);
 	hlist_add_head(&kip->hlist, &kprobe_insn_pages);
@@ -173,7 +183,10 @@ kprobe_opcode_t __kprobes *get_insn_slot
 	kip->slot_used[0] = SLOT_USED;
 	kip->nused = 1;
 	kip->ngarbage = 0;
-	return kip->insns;
+	ret = kip->insns;
+end:
+	mutex_unlock(&kprobe_insn_mutex);
+	return ret;
 }
 
 /* Return 1 if all garbages are collected, otherwise 0. */
@@ -207,7 +220,7 @@ static int __kprobes collect_garbage_slo
 	struct kprobe_insn_page *kip;
 	struct hlist_node *pos, *next;
 
-	/* Ensure no-one is preepmted on the garbages */
+	/* Ensure no-one is preempted on the garbages */
 	if (check_safety() != 0)
 		return -EAGAIN;
 
@@ -231,6 +244,7 @@ void __kprobes free_insn_slot(kprobe_opc
 	struct kprobe_insn_page *kip;
 	struct hlist_node *pos;
 
+	mutex_lock(&kprobe_insn_mutex);
 	hlist_for_each_entry(kip, pos, &kprobe_insn_pages, hlist) {
 		if (kip->insns <= slot &&
 		    slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) {
@@ -247,6 +261,7 @@ void __kprobes free_insn_slot(kprobe_opc
 
 	if (dirty && ++kprobe_garbage_slots > INSNS_PER_PAGE)
 		collect_garbage_slots();
+	mutex_unlock(&kprobe_insn_mutex);
 }
 #endif
 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 02/11] Kprobes - do not use kprobes mutex in arch code
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 01/11] Kprobes - use a mutex to protect the instruction pages list Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 03/11] Kprobes - declare kprobe_mutex static Mathieu Desnoyers
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel
  Cc: Mathieu Desnoyers, Ananth N Mavinakayanahalli, prasanna,
	anil.s.keshavamurthy, davem

[-- Attachment #1: kprobes-dont-use-kprobes-mutex-in-arch-code.patch --]
[-- Type: text/plain, Size: 5142 bytes --]

Remove the kprobes mutex from kprobes.h, since it does not belong there. Also
remove all use of this mutex in the architecture specific code, replacing it by
a proper mutex lock/unlock in the architecture agnostic code.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: prasanna@in.ibm.com
CC: anil.s.keshavamurthy@intel.com
CC: davem@davemloft.net
---
 arch/ia64/kernel/kprobes.c    |    2 --
 arch/powerpc/kernel/kprobes.c |    2 --
 arch/s390/kernel/kprobes.c    |    2 --
 arch/x86/kernel/kprobes_32.c  |    2 --
 arch/x86/kernel/kprobes_64.c  |    2 --
 include/linux/kprobes.h       |    2 --
 kernel/kprobes.c              |    2 ++
 7 files changed, 2 insertions(+), 12 deletions(-)

Index: linux-2.6-lttng/include/linux/kprobes.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/kprobes.h	2007-11-13 09:45:33.000000000 -0500
+++ linux-2.6-lttng/include/linux/kprobes.h	2007-11-13 09:45:35.000000000 -0500
@@ -35,7 +35,6 @@
 #include <linux/percpu.h>
 #include <linux/spinlock.h>
 #include <linux/rcupdate.h>
-#include <linux/mutex.h>
 
 #ifdef CONFIG_KPROBES
 #include <asm/kprobes.h>
@@ -183,7 +182,6 @@ static inline void kretprobe_assert(stru
 }
 
 extern spinlock_t kretprobe_lock;
-extern struct mutex kprobe_mutex;
 extern int arch_prepare_kprobe(struct kprobe *p);
 extern void arch_arm_kprobe(struct kprobe *p);
 extern void arch_disarm_kprobe(struct kprobe *p);
Index: linux-2.6-lttng/arch/x86/kernel/kprobes_32.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_32.c	2007-11-13 09:45:33.000000000 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_32.c	2007-11-13 09:45:35.000000000 -0500
@@ -186,9 +186,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-	mutex_lock(&kprobe_mutex);
 	free_insn_slot(p->ainsn.insn, (p->ainsn.boostable == 1));
-	mutex_unlock(&kprobe_mutex);
 }
 
 static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
Index: linux-2.6-lttng/kernel/kprobes.c
===================================================================
--- linux-2.6-lttng.orig/kernel/kprobes.c	2007-11-13 09:45:35.000000000 -0500
+++ linux-2.6-lttng/kernel/kprobes.c	2007-11-13 09:45:35.000000000 -0500
@@ -644,7 +644,9 @@ valid_p:
 			list_del_rcu(&p->list);
 			kfree(old_p);
 		}
+		mutex_lock(&kprobe_mutex);
 		arch_remove_kprobe(p);
+		mutex_unlock(&kprobe_mutex);
 	} else {
 		mutex_lock(&kprobe_mutex);
 		if (p->break_handler)
Index: linux-2.6-lttng/arch/ia64/kernel/kprobes.c
===================================================================
--- linux-2.6-lttng.orig/arch/ia64/kernel/kprobes.c	2007-11-13 09:45:33.000000000 -0500
+++ linux-2.6-lttng/arch/ia64/kernel/kprobes.c	2007-11-13 09:45:35.000000000 -0500
@@ -567,9 +567,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-	mutex_lock(&kprobe_mutex);
 	free_insn_slot(p->ainsn.insn, 0);
-	mutex_unlock(&kprobe_mutex);
 }
 /*
  * We are resuming execution after a single step fault, so the pt_regs
Index: linux-2.6-lttng/arch/powerpc/kernel/kprobes.c
===================================================================
--- linux-2.6-lttng.orig/arch/powerpc/kernel/kprobes.c	2007-11-13 09:45:33.000000000 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/kprobes.c	2007-11-13 09:45:35.000000000 -0500
@@ -88,9 +88,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-	mutex_lock(&kprobe_mutex);
 	free_insn_slot(p->ainsn.insn, 0);
-	mutex_unlock(&kprobe_mutex);
 }
 
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
Index: linux-2.6-lttng/arch/s390/kernel/kprobes.c
===================================================================
--- linux-2.6-lttng.orig/arch/s390/kernel/kprobes.c	2007-11-13 09:45:33.000000000 -0500
+++ linux-2.6-lttng/arch/s390/kernel/kprobes.c	2007-11-13 09:45:35.000000000 -0500
@@ -220,9 +220,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-	mutex_lock(&kprobe_mutex);
 	free_insn_slot(p->ainsn.insn, 0);
-	mutex_unlock(&kprobe_mutex);
 }
 
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
Index: linux-2.6-lttng/arch/x86/kernel/kprobes_64.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_64.c	2007-11-13 09:45:33.000000000 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_64.c	2007-11-13 09:45:35.000000000 -0500
@@ -225,9 +225,7 @@ void __kprobes arch_disarm_kprobe(struct
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
 {
-	mutex_lock(&kprobe_mutex);
 	free_insn_slot(p->ainsn.insn, 0);
-	mutex_unlock(&kprobe_mutex);
 }
 
 static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 03/11] Kprobes - declare kprobe_mutex static
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 01/11] Kprobes - use a mutex to protect the instruction pages list Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 02/11] Kprobes - do not use kprobes mutex in arch code Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 04/11] Add INIT_ARRAY() to kernel.h Mathieu Desnoyers
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel
  Cc: Mathieu Desnoyers, Ananth N Mavinakayanahalli, hch, prasanna,
	anil.s.keshavamurthy, davem

[-- Attachment #1: kprobes-declare-kprobes-mutex-static.patch --]
[-- Type: text/plain, Size: 1229 bytes --]

Since it will not be used by other kernel objects, it makes sense to declare it
static.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: hch@infradead.org
CC: prasanna@in.ibm.com
CC: anil.s.keshavamurthy@intel.com
CC: davem@davemloft.net
---
 kernel/kprobes.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-lttng/kernel/kprobes.c
===================================================================
--- linux-2.6-lttng.orig/kernel/kprobes.c	2007-08-19 09:09:15.000000000 -0400
+++ linux-2.6-lttng/kernel/kprobes.c	2007-08-19 17:18:07.000000000 -0400
@@ -68,7 +68,7 @@ static struct hlist_head kretprobe_inst_
 /* NOTE: change this value only with kprobe_mutex held */
 static bool kprobe_enabled;
 
-DEFINE_MUTEX(kprobe_mutex);		/* Protects kprobe_table */
+static DEFINE_MUTEX(kprobe_mutex);	/* Protects kprobe_table */
 DEFINE_SPINLOCK(kretprobe_lock);	/* Protects kretprobe_inst_table */
 static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL;
 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 04/11] Add INIT_ARRAY() to kernel.h
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (2 preceding siblings ...)
  2007-11-13 18:46 ` [patch 03/11] Kprobes - declare kprobe_mutex static Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 05/11] Text Edit Lock - Architecture Independent Code Mathieu Desnoyers
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers

[-- Attachment #1: declare-array.patch --]
[-- Type: text/plain, Size: 968 bytes --]

Add initialization of an array, which needs brackets that would pollute kernel
code, to kernel.h. It is used to declare arguments passed as function parameters
such as:
text_poke(addr, INIT_ARRAY(unsigned char, 0xf0, len), len);

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 include/linux/kernel.h |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/linux/kernel.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/kernel.h	2007-11-13 09:25:29.000000000 -0500
+++ linux-2.6-lttng/include/linux/kernel.h	2007-11-13 09:45:38.000000000 -0500
@@ -421,4 +421,6 @@ struct sysinfo {
 #define NUMA_BUILD 0
 #endif
 
+#define INIT_ARRAY(type, val, len) ((type [len]) { [0 ... (len)-1] = (val) })
+
 #endif

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 05/11] Text Edit Lock - Architecture Independent Code
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (3 preceding siblings ...)
  2007-11-13 18:46 ` [patch 04/11] Add INIT_ARRAY() to kernel.h Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 06/11] Text Edit Lock - Alternative code for x86 Mathieu Desnoyers
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, Andi Kleen

[-- Attachment #1: text-edit-lock-architecture-independent-code.patch --]
[-- Type: text/plain, Size: 3141 bytes --]

This is an architecture independant synchronization around kernel text
modifications through use of a global mutex.

A mutex has been chosen so that kprobes, the main user of this, can sleep during
memory allocation between the memory read of the instructions it must replace
and the memory write of the breakpoint.

Other user of this interface: immediate values.

Paravirt and alternatives are always done when SMP is inactive, so there is no
need to use locks.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
---
 include/linux/memory.h |    7 +++++++
 mm/memory.c            |   34 ++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

Index: linux-2.6-lttng/include/linux/memory.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/memory.h	2007-11-07 11:11:26.000000000 -0500
+++ linux-2.6-lttng/include/linux/memory.h	2007-11-07 11:13:48.000000000 -0500
@@ -93,4 +93,11 @@ extern int memory_notify(unsigned long v
 #define hotplug_memory_notifier(fn, pri) do { } while (0)
 #endif
 
+/*
+ * Take and release the kernel text modification lock, used for code patching.
+ * Users of this lock can sleep.
+ */
+extern void kernel_text_lock(void);
+extern void kernel_text_unlock(void);
+
 #endif /* _LINUX_MEMORY_H_ */
Index: linux-2.6-lttng/mm/memory.c
===================================================================
--- linux-2.6-lttng.orig/mm/memory.c	2007-11-07 11:12:33.000000000 -0500
+++ linux-2.6-lttng/mm/memory.c	2007-11-07 11:14:25.000000000 -0500
@@ -50,6 +50,8 @@
 #include <linux/delayacct.h>
 #include <linux/init.h>
 #include <linux/writeback.h>
+#include <linux/kprobes.h>
+#include <linux/mutex.h>
 
 #include <asm/pgalloc.h>
 #include <asm/uaccess.h>
@@ -84,6 +86,12 @@ EXPORT_SYMBOL(high_memory);
 
 int randomize_va_space __read_mostly = 1;
 
+/*
+ * mutex protecting text section modification (dynamic code patching).
+ * some users need to sleep (allocating memory...) while they hold this lock.
+ */
+static DEFINE_MUTEX(text_mutex);
+
 static int __init disable_randmaps(char *s)
 {
 	randomize_va_space = 0;
@@ -2748,3 +2756,29 @@ int access_process_vm(struct task_struct
 
 	return buf - old_buf;
 }
+
+/**
+ * kernel_text_lock     -   Take the kernel text modification lock
+ *
+ * Insures mutual write exclusion of kernel and modules text live text
+ * modification. Should be used for code patching.
+ * Users of this lock can sleep.
+ */
+void __kprobes kernel_text_lock(void)
+{
+	mutex_lock(&text_mutex);
+}
+EXPORT_SYMBOL_GPL(kernel_text_lock);
+
+/**
+ * kernel_text_unlock   -   Release the kernel text modification lock
+ *
+ * Insures mutual write exclusion of kernel and modules text live text
+ * modification. Should be used for code patching.
+ * Users of this lock can sleep.
+ */
+void __kprobes kernel_text_unlock(void)
+{
+	mutex_unlock(&text_mutex);
+}
+EXPORT_SYMBOL_GPL(kernel_text_unlock);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (4 preceding siblings ...)
  2007-11-13 18:46 ` [patch 05/11] Text Edit Lock - Architecture Independent Code Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 20:54   ` pageexec
  2007-11-13 18:46 ` [patch 07/11] Text Edit Lock - kprobes architecture independent support Mathieu Desnoyers
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, Andi Kleen, pageexec

[-- Attachment #1: text-edit-lock-alternative-i386-and-x86_64.patch --]
[-- Type: text/plain, Size: 9932 bytes --]

Fix a memcpy that should be a text_poke (in apply_alternatives).

Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
correctly and so the CPU HOTPLUG special case can be removed.

clflush all the cachelines touched by text_poke.

Add text_poke_early, for alternatives and paravirt boot-time and module load
time patching.

Notes:
- the clflush is left there, even though Andi Kleen says it breaks some
  architecture.  The proper fix is to detect these CPUs and set the
  cpu_has_clflush flag appropriately. It does not belong here.
- we use a macro for kernel_wp_save/restore to mimic local_irq_save/restore: the
  argument is passed without &.

Changelog:

- Fix text_set and text_poke alignment check (mixed up bitwise and and or)
- Remove text_set
- Use the new macro INIT_ARRAY() to stop polluting the C files with ({ })
  brackets (which breaks some c parsers in editors).
- Export add_nops, so it can be used by others.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: pageexec@freemail.hu
---
 arch/x86/kernel/alternative.c    |   78 +++++++++++++++++++++++++++++----------
 include/asm-x86/alternative_32.h |   38 ++++++++++++++++++-
 include/asm-x86/alternative_64.h |   38 ++++++++++++++++++-
 3 files changed, 132 insertions(+), 22 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c	2007-11-13 09:25:29.000000000 -0500
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c	2007-11-13 09:45:41.000000000 -0500
@@ -27,6 +27,58 @@ __setup("smp-alt-boot", bootonly);
 #define smp_alt_once 1
 #endif
 
+/*
+ * Warning:
+ * When you use this code to patch more than one byte of an instruction
+ * you need to make sure that other CPUs cannot execute this code in parallel.
+ * Also no thread must be currently preempted in the middle of these
+ * instructions.  And on the local CPU you need to be protected again NMI or MCE
+ * handlers seeing an inconsistent instruction while you patch.
+ * Warning: read_cr0 is modified by paravirt, this is why we have _early
+ * versions. They are not in the __init section because they can be used at
+ * module load time.
+ */
+static inline void text_sync(void *addr, size_t len)
+{
+	void *faddr;
+
+	sync_core();
+	/* FIXME Could also do a CLFLUSH here to speed up CPU recovery; but
+	   that causes hangs on some VIA CPUs. */
+	/* Not strictly needed, but can speed CPU recovery up. */
+	if (0 && cpu_has_clflush)
+		for (faddr = addr; faddr < addr + len;
+				faddr += boot_cpu_data.x86_clflush_size)
+			asm("clflush (%0) " :: "r" (faddr) : "memory");
+}
+
+void *text_poke_early(void *addr, const void *opcode, size_t len)
+{
+	memcpy(addr, opcode, len);
+	text_sync(addr, len);
+	return addr;
+}
+
+/*
+ * Only atomic text poke/set should be allowed when not doing early patching.
+ * It means the size must be writable atomically and the address must be aligned
+ * in a way that permits an atomic write.
+ */
+
+void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
+{
+	unsigned long cr0;
+
+	BUG_ON(len > sizeof(long));
+	BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
+		- ((long)addr & ~(sizeof(long) - 1)));
+	kernel_wp_save(cr0);
+	memcpy(addr, opcode, len);
+	kernel_wp_restore(cr0);
+	text_sync(addr, len);
+	return addr;
+}
+
 static int debug_alternative;
 
 static int __init debug_alt(char *str)
@@ -173,7 +225,7 @@ static const unsigned char*const * find_
 #endif /* CONFIG_X86_64 */
 
 /* Use this to add nops to a buffer, then text_poke the whole buffer. */
-static void add_nops(void *insns, unsigned int len)
+void add_nops(void *insns, unsigned int len)
 {
 	const unsigned char *const *noptable = find_nop_table();
 
@@ -186,6 +238,7 @@ static void add_nops(void *insns, unsign
 		len -= noplen;
 	}
 }
+EXPORT_SYMBOL_GPL(add_nops);
 
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
 extern u8 *__smp_locks[], *__smp_locks_end[];
@@ -219,7 +272,7 @@ void apply_alternatives(struct alt_instr
 		memcpy(insnbuf, a->replacement, a->replacementlen);
 		add_nops(insnbuf + a->replacementlen,
 			 a->instrlen - a->replacementlen);
-		text_poke(instr, insnbuf, a->instrlen);
+		text_poke_early(instr, insnbuf, a->instrlen);
 	}
 }
 
@@ -234,7 +287,8 @@ static void alternatives_smp_lock(u8 **s
 			continue;
 		if (*ptr > text_end)
 			continue;
-		text_poke(*ptr, ((unsigned char []){0xf0}), 1); /* add lock prefix */
+		/* add lock prefix */
+		text_poke(*ptr, INIT_ARRAY(unsigned char, 0xf0, 1), 1);
 	};
 }
 
@@ -397,7 +451,7 @@ void apply_paravirt(struct paravirt_patc
 
 		/* Pad the rest with nops */
 		add_nops(insnbuf + used, p->len - used);
-		text_poke(p->instr, insnbuf, p->len);
+		text_poke_early(p->instr, insnbuf, p->len);
 	}
 }
 extern struct paravirt_patch_site __start_parainstructions[],
@@ -456,19 +510,3 @@ void __init alternative_instructions(voi
 	restart_mce();
 #endif
 }
-
-/*
- * Warning:
- * When you use this code to patch more than one byte of an instruction
- * you need to make sure that other CPUs cannot execute this code in parallel.
- * Also no thread must be currently preempted in the middle of these instructions.
- * And on the local CPU you need to be protected again NMI or MCE handlers
- * seeing an inconsistent instruction while you patch.
- */
-void __kprobes text_poke(void *addr, unsigned char *opcode, int len)
-{
-	memcpy(addr, opcode, len);
-	sync_core();
-	/* Could also do a CLFLUSH here to speed up CPU recovery; but
-	   that causes hangs on some VIA CPUs. */
-}
Index: linux-2.6-lttng/include/asm-x86/alternative_32.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-x86/alternative_32.h	2007-11-13 09:25:29.000000000 -0500
+++ linux-2.6-lttng/include/asm-x86/alternative_32.h	2007-11-13 09:45:41.000000000 -0500
@@ -4,6 +4,7 @@
 #include <asm/types.h>
 #include <linux/stddef.h>
 #include <linux/types.h>
+#include <asm/processor-flags.h>
 
 struct alt_instr {
 	u8 *instr; 		/* original instruction */
@@ -149,6 +150,41 @@ apply_paravirt(struct paravirt_patch_sit
 #define __parainstructions_end	NULL
 #endif
 
-extern void text_poke(void *addr, unsigned char *opcode, int len);
+extern void add_nops(void *insns, unsigned int len);
+
+/*
+ * Clear and restore the kernel write-protection flag on the local CPU.
+ * Allows the kernel to edit read-only pages.
+ * Side-effect: any interrupt handler running between save and restore will have
+ * the ability to write to read-only pages.
+ *
+ * Warning:
+ * Code patching in the UP case is safe if NMIs and MCE handlers are stopped and
+ * no thread can be preempted in the instructions being modified (no iret to an
+ * invalid instruction possible) or if the instructions are changed from a
+ * consistent state to another consistent state atomically.
+ * More care must be taken when modifying code in the SMP case because of
+ * Intel's errata.
+ * On the local CPU you need to be protected again NMI or MCE handlers seeing an
+ * inconsistent instruction while you patch.
+ */
+
+extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern void *text_poke_early(void *addr, const void *opcode, size_t len);
+
+#define kernel_wp_save(cr0)					\
+	do {							\
+		preempt_disable();				\
+		cr0 = read_cr0();				\
+		if (cpu_data(smp_processor_id()).wp_works_ok)	\
+			write_cr0(cr0 & ~X86_CR0_WP);		\
+	} while (0)
+
+#define kernel_wp_restore(cr0)					\
+	do {							\
+		if (cpu_data(smp_processor_id()).wp_works_ok)	\
+			write_cr0(cr0);				\
+		preempt_enable();				\
+	} while (0)
 
 #endif /* _I386_ALTERNATIVE_H */
Index: linux-2.6-lttng/include/asm-x86/alternative_64.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-x86/alternative_64.h	2007-11-13 09:25:29.000000000 -0500
+++ linux-2.6-lttng/include/asm-x86/alternative_64.h	2007-11-13 09:45:41.000000000 -0500
@@ -5,6 +5,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <asm/processor-flags.h>
 
 /*
  * Alternative inline assembly for SMP.
@@ -154,6 +155,41 @@ apply_paravirt(struct paravirt_patch *st
 #define __parainstructions_end NULL
 #endif
 
-extern void text_poke(void *addr, unsigned char *opcode, int len);
+extern void add_nops(void *insns, unsigned int len);
+
+/*
+ * Clear and restore the kernel write-protection flag on the local CPU.
+ * Allows the kernel to edit read-only pages.
+ * Side-effect: any interrupt handler running between save and restore will have
+ * the ability to write to read-only pages.
+ *
+ * Warning:
+ * Code patching in the UP case is safe if NMIs and MCE handlers are stopped and
+ * no thread can be preempted in the instructions being modified (no iret to an
+ * invalid instruction possible) or if the instructions are changed from a
+ * consistent state to another consistent state atomically.
+ * More care must be taken when modifying code in the SMP case because of
+ * Intel's errata.
+ * On the local CPU you need to be protected again NMI or MCE handlers seeing an
+ * inconsistent instruction while you patch.
+ */
+
+extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern void *text_poke_early(void *addr, const void *opcode, size_t len);
+
+#define kernel_wp_save(cr0)					\
+	do {							\
+		typecheck(unsigned long, cr0);			\
+		preempt_disable();				\
+		cr0 = read_cr0();				\
+		write_cr0(cr0 & ~X86_CR0_WP);			\
+	} while (0)
+
+#define kernel_wp_restore(cr0)					\
+	do {							\
+		typecheck(unsigned long, cr0);			\
+		write_cr0(cr0);					\
+		preempt_enable();				\
+	} while (0)
 
 #endif /* _X86_64_ALTERNATIVE_H */

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 07/11] Text Edit Lock - kprobes architecture independent support
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (5 preceding siblings ...)
  2007-11-13 18:46 ` [patch 06/11] Text Edit Lock - Alternative code for x86 Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 08/11] Text Edit Lock - kprobes x86_32 Mathieu Desnoyers
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel
  Cc: Mathieu Desnoyers, Ananth N Mavinakayanahalli, prasanna,
	anil.s.keshavamurthy, davem

[-- Attachment #1: text-edit-lock-kprobes-architecture-independent.patch --]
[-- Type: text/plain, Size: 3064 bytes --]

Use the mutual exclusion provided by the text edit lock in the kprobes code. It
allows coherent manipulation of the kernel code by other subsystems.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: prasanna@in.ibm.com
CC: ananth@in.ibm.com
CC: anil.s.keshavamurthy@intel.com
CC: davem@davemloft.net
---
 kernel/kprobes.c |   19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

Index: linux-2.6-lttng/kernel/kprobes.c
===================================================================
--- linux-2.6-lttng.orig/kernel/kprobes.c	2007-09-07 10:12:06.000000000 -0400
+++ linux-2.6-lttng/kernel/kprobes.c	2007-09-07 10:13:09.000000000 -0400
@@ -43,6 +43,7 @@
 #include <linux/seq_file.h>
 #include <linux/debugfs.h>
 #include <linux/kdebug.h>
+#include <linux/memory.h>
 
 #include <asm-generic/sections.h>
 #include <asm/cacheflush.h>
@@ -568,9 +569,10 @@ static int __kprobes __register_kprobe(s
 		goto out;
 	}
 
+	kernel_text_lock();
 	ret = arch_prepare_kprobe(p);
 	if (ret)
-		goto out;
+		goto out_unlock_text;
 
 	INIT_HLIST_NODE(&p->hlist);
 	hlist_add_head_rcu(&p->hlist,
@@ -578,7 +580,8 @@ static int __kprobes __register_kprobe(s
 
 	if (kprobe_enabled)
 		arch_arm_kprobe(p);
-
+out_unlock_text:
+	kernel_text_unlock();
 out:
 	mutex_unlock(&kprobe_mutex);
 
@@ -621,8 +624,11 @@ valid_p:
 		 * enabled - otherwise, the breakpoint would already have
 		 * been removed. We save on flushing icache.
 		 */
-		if (kprobe_enabled)
+		if (kprobe_enabled) {
+			kernel_text_lock();
 			arch_disarm_kprobe(p);
+			kernel_text_unlock();
+		}
 		hlist_del_rcu(&old_p->hlist);
 		cleanup_p = 1;
 	} else {
@@ -644,9 +650,7 @@ valid_p:
 			list_del_rcu(&p->list);
 			kfree(old_p);
 		}
-		mutex_lock(&kprobe_mutex);
 		arch_remove_kprobe(p);
-		mutex_unlock(&kprobe_mutex);
 	} else {
 		mutex_lock(&kprobe_mutex);
 		if (p->break_handler)
@@ -717,7 +721,6 @@ static int __kprobes pre_handler_kretpro
 		ri->rp = rp;
 		ri->task = current;
 		arch_prepare_kretprobe(ri, regs);
-
 		/* XXX(hch): why is there no hlist_move_head? */
 		hlist_del(&ri->uflist);
 		hlist_add_head(&ri->uflist, &ri->rp->used_instances);
@@ -940,8 +943,10 @@ static void __kprobes enable_all_kprobes
 
 	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
 		head = &kprobe_table[i];
+		kernel_text_lock();
 		hlist_for_each_entry_rcu(p, node, head, hlist)
 			arch_arm_kprobe(p);
+		kernel_text_unlock();
 	}
 
 	kprobe_enabled = true;
@@ -969,10 +974,12 @@ static void __kprobes disable_all_kprobe
 	printk(KERN_INFO "Kprobes globally disabled\n");
 	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
 		head = &kprobe_table[i];
+		kernel_text_lock();
 		hlist_for_each_entry_rcu(p, node, head, hlist) {
 			if (!arch_trampoline_kprobe(p))
 				arch_disarm_kprobe(p);
 		}
+		kernel_text_unlock();
 	}
 
 	mutex_unlock(&kprobe_mutex);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 08/11] Text Edit Lock - kprobes x86_32
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (6 preceding siblings ...)
  2007-11-13 18:46 ` [patch 07/11] Text Edit Lock - kprobes architecture independent support Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 09/11] Text Edit Lock - kprobes x86_64 Mathieu Desnoyers
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel
  Cc: Mathieu Desnoyers, Andi Kleen, prasanna, ananth,
	anil.s.keshavamurthy, davem

[-- Attachment #1: text-edit-lock-kprobes-i386.patch --]
[-- Type: text/plain, Size: 1349 bytes --]

Make kprobes use INIT_ARRAY().

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Tested-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: prasanna@in.ibm.com
CC: ananth@in.ibm.com
CC: anil.s.keshavamurthy@intel.com
CC: davem@davemloft.net
---
 arch/x86/kernel/kprobes_32.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/kprobes_32.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_32.c	2007-11-13 09:45:35.000000000 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_32.c	2007-11-13 09:45:44.000000000 -0500
@@ -176,12 +176,13 @@ int __kprobes arch_prepare_kprobe(struct
 
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, BREAKPOINT_INSTRUCTION, 1),
+			1);
 }
 
 void __kprobes arch_disarm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, &p->opcode, 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, p->opcode, 1), 1);
 }
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 09/11] Text Edit Lock - kprobes x86_64
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (7 preceding siblings ...)
  2007-11-13 18:46 ` [patch 08/11] Text Edit Lock - kprobes x86_32 Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 10/11] Text Edit Lock - x86_32 standardize debug rodata Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 11/11] Text Edit Lock - x86_64 " Mathieu Desnoyers
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel
  Cc: Mathieu Desnoyers, Andi Kleen, prasanna, ananth,
	anil.s.keshavamurthy, davem

[-- Attachment #1: text-edit-lock-kprobes-x86_64.patch --]
[-- Type: text/plain, Size: 1349 bytes --]

Make kprobes use INIT_ARRAY().

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Tested-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: prasanna@in.ibm.com
CC: ananth@in.ibm.com
CC: anil.s.keshavamurthy@intel.com
CC: davem@davemloft.net
---
 arch/x86/kernel/kprobes_64.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/kprobes_64.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/kprobes_64.c	2007-11-13 09:45:35.000000000 -0500
+++ linux-2.6-lttng/arch/x86/kernel/kprobes_64.c	2007-11-13 09:45:46.000000000 -0500
@@ -215,12 +215,13 @@ static void __kprobes arch_copy_kprobe(s
 
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, BREAKPOINT_INSTRUCTION, 1),
+			1);
 }
 
 void __kprobes arch_disarm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, &p->opcode, 1);
+	text_poke(p->addr, INIT_ARRAY(unsigned char, p->opcode, 1), 1);
 }
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 10/11] Text Edit Lock - x86_32 standardize debug rodata
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (8 preceding siblings ...)
  2007-11-13 18:46 ` [patch 09/11] Text Edit Lock - kprobes x86_64 Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  2007-11-13 18:46 ` [patch 11/11] Text Edit Lock - x86_64 " Mathieu Desnoyers
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, Andi Kleen, pageexec

[-- Attachment #1: text-edit-lock-i386-standardize-debug-rodata.patch --]
[-- Type: text/plain, Size: 1951 bytes --]

Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: pageexec@freemail.hu
---
 arch/x86/mm/init_32.c |   20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

Index: linux-2.6-lttng/arch/x86/mm/init_32.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/mm/init_32.c	2007-11-13 09:25:29.000000000 -0500
+++ linux-2.6-lttng/arch/x86/mm/init_32.c	2007-11-13 09:45:48.000000000 -0500
@@ -784,28 +784,21 @@ static int noinline do_test_wp_bit(void)
 }
 
 #ifdef CONFIG_DEBUG_RODATA
-
 void mark_rodata_ro(void)
 {
 	unsigned long start = PFN_ALIGN(_text);
 	unsigned long size = PFN_ALIGN(_etext) - start;
 
-#ifndef CONFIG_KPROBES
-#ifdef CONFIG_HOTPLUG_CPU
-	/* It must still be possible to apply SMP alternatives. */
-	if (num_possible_cpus() <= 1)
-#endif
-	{
-		change_page_attr(virt_to_page(start),
-		                 size >> PAGE_SHIFT, PAGE_KERNEL_RX);
-		printk("Write protecting the kernel text: %luk\n", size >> 10);
-	}
-#endif
+	change_page_attr(virt_to_page(start),
+		size >> PAGE_SHIFT, PAGE_KERNEL_RX);
+	printk(KERN_INFO "Write protecting the kernel text: %luk\n",
+		size >> 10);
+
 	start += size;
 	size = (unsigned long)__end_rodata - start;
 	change_page_attr(virt_to_page(start),
 	                 size >> PAGE_SHIFT, PAGE_KERNEL_RO);
-	printk("Write protecting the kernel read-only data: %luk\n",
+	printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
 	       size >> 10);
 
 	/*
@@ -816,6 +809,7 @@ void mark_rodata_ro(void)
 	 */
 	global_flush_tlb();
 }
+
 #endif
 
 void free_init_pages(char *what, unsigned long begin, unsigned long end)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 11/11] Text Edit Lock - x86_64 standardize debug rodata
  2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
                   ` (9 preceding siblings ...)
  2007-11-13 18:46 ` [patch 10/11] Text Edit Lock - x86_32 standardize debug rodata Mathieu Desnoyers
@ 2007-11-13 18:46 ` Mathieu Desnoyers
  10 siblings, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 18:46 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, Andi Kleen, pageexec

[-- Attachment #1: text-edit-lock-x86_64-standardize-debug-rodata.patch --]
[-- Type: text/plain, Size: 1732 bytes --]

Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: pageexec@freemail.hu
---
 arch/x86_64/mm/init.c |   23 +++++------------------
 1 file changed, 5 insertions(+), 18 deletions(-)

Index: linux-2.6-lttng/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/mm/init_64.c	2007-09-24 11:00:01.000000000 -0400
+++ linux-2.6-lttng/arch/x86/mm/init_64.c	2007-09-24 11:00:02.000000000 -0400
@@ -592,25 +592,11 @@ void free_initmem(void)
 
 void mark_rodata_ro(void)
 {
-	unsigned long start = (unsigned long)_stext, end;
+	unsigned long start = PFN_ALIGN(_stext);
+	unsigned long end = PFN_ALIGN(__end_rodata);
 
-#ifdef CONFIG_HOTPLUG_CPU
-	/* It must still be possible to apply SMP alternatives. */
-	if (num_possible_cpus() > 1)
-		start = (unsigned long)_etext;
-#endif
-
-#ifdef CONFIG_KPROBES
-	start = (unsigned long)__start_rodata;
-#endif
-	
-	end = (unsigned long)__end_rodata;
-	start = (start + PAGE_SIZE - 1) & PAGE_MASK;
-	end &= PAGE_MASK;
-	if (end <= start)
-		return;
-
-	change_page_attr_addr(start, (end - start) >> PAGE_SHIFT, PAGE_KERNEL_RO);
+	change_page_attr_addr(start, (end - start) >> PAGE_SHIFT,
+				PAGE_KERNEL_RO);
 
 	printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
 	       (end - start) >> 10);
@@ -623,6 +609,7 @@ void mark_rodata_ro(void)
 	 */
 	global_flush_tlb();
 }
+
 #endif
 
 #ifdef CONFIG_BLK_DEV_INITRD

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-11-13 18:46 ` [patch 06/11] Text Edit Lock - Alternative code for x86 Mathieu Desnoyers
@ 2007-11-13 20:54   ` pageexec
  2007-11-13 22:10     ` Mathieu Desnoyers
  2007-11-14  1:42     ` [patch 06/11] Text Edit Lock - Alternative code for x86 (updated) Mathieu Desnoyers
  0 siblings, 2 replies; 20+ messages in thread
From: pageexec @ 2007-11-13 20:54 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, Andi Kleen

On 13 Nov 2007 at 13:46, Mathieu Desnoyers wrote:

> +void *text_poke_early(void *addr, const void *opcode, size_t len)
> +{
> +	memcpy(addr, opcode, len);
> +	text_sync(addr, len);
> +	return addr;
> +}

why do you need this function (vs. using text_poke throughout)?

> +#define kernel_wp_save(cr0)					\
> +	do {							\
> +		preempt_disable();				\
> +		cr0 = read_cr0();				\
> +		if (cpu_data(smp_processor_id()).wp_works_ok)	\

why do you need this test? if cr0.wp is ineffective, then it doesn't
matter whether it's on or off (in fact, at least the intel manual
says that 386s would not even let you change its value, they'll
silently ignore attempts of setting the wp bit).

> +			write_cr0(cr0 & ~X86_CR0_WP);		\
> +	} while (0)
> +
> +#define kernel_wp_restore(cr0)					\
> +	do {							\
> +		if (cpu_data(smp_processor_id()).wp_works_ok)	\

ditto...

> +			write_cr0(cr0);				\
> +		preempt_enable();				\
> +	} while (0)
>  
>  #endif /* _I386_ALTERNATIVE_H */



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-11-13 20:54   ` pageexec
@ 2007-11-13 22:10     ` Mathieu Desnoyers
  2007-11-14  1:42     ` [patch 06/11] Text Edit Lock - Alternative code for x86 (updated) Mathieu Desnoyers
  1 sibling, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 22:10 UTC (permalink / raw)
  To: pageexec; +Cc: akpm, linux-kernel, Andi Kleen

* pageexec@freemail.hu (pageexec@freemail.hu) wrote:
> On 13 Nov 2007 at 13:46, Mathieu Desnoyers wrote:
> 
> > +void *text_poke_early(void *addr, const void *opcode, size_t len)
> > +{
> > +	memcpy(addr, opcode, len);
> > +	text_sync(addr, len);
> > +	return addr;
> > +}
> 
> why do you need this function (vs. using text_poke throughout)?
> 

Because it's not safe to use read_cr0() in paravirtualization
before the alternatives are set.

> > +#define kernel_wp_save(cr0)					\
> > +	do {							\
> > +		preempt_disable();				\
> > +		cr0 = read_cr0();				\
> > +		if (cpu_data(smp_processor_id()).wp_works_ok)	\
> 
> why do you need this test? if cr0.wp is ineffective, then it doesn't
> matter whether it's on or off (in fact, at least the intel manual
> says that 386s would not even let you change its value, they'll
> silently ignore attempts of setting the wp bit).
> 

Ok.. then this test could go away then. I prefered to use a conservative
approach. Will fix. Thanks for the hint.

Mathieu

> > +			write_cr0(cr0 & ~X86_CR0_WP);		\
> > +	} while (0)
> > +
> > +#define kernel_wp_restore(cr0)					\
> > +	do {							\
> > +		if (cpu_data(smp_processor_id()).wp_works_ok)	\
> 
> ditto...
> 
> > +			write_cr0(cr0);				\
> > +		preempt_enable();				\
> > +	} while (0)
> >  
> >  #endif /* _I386_ALTERNATIVE_H */
> 
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [patch 06/11] Text Edit Lock - Alternative code for x86 (updated)
  2007-11-13 20:54   ` pageexec
  2007-11-13 22:10     ` Mathieu Desnoyers
@ 2007-11-14  1:42     ` Mathieu Desnoyers
  1 sibling, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-11-14  1:42 UTC (permalink / raw)
  To: pageexec; +Cc: akpm, linux-kernel, Andi Kleen

Text Edit Lock - Alternative code for x86

Fix a memcpy that should be a text_poke (in apply_alternatives).

Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
correctly and so the CPU HOTPLUG special case can be removed.

clflush all the cachelines touched by text_poke.

Add text_poke_early, for alternatives and paravirt boot-time and module load
time patching.

Notes:
- the clflush is left there, even though Andi Kleen says it breaks some
  architecture.  The proper fix is to detect these CPUs and set the
  cpu_has_clflush flag appropriately. It does not belong here.
- we use a macro for kernel_wp_save/restore to mimic local_irq_save/restore: the
  argument is passed without &.

Changelog:

- Fix text_set and text_poke alignment check (mixed up bitwise and and or)
- Remove text_set
- Use the new macro INIT_ARRAY() to stop polluting the C files with ({ })
  brackets (which breaks some c parsers in editors).
- Export add_nops, so it can be used by others.
- Remove x86 test for "wp_works_ok", it will just be ignored by the architecture
  if not supported.
- Document text_poke_early.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: pageexec@freemail.hu
---
 arch/x86/kernel/alternative.c    |   78 +++++++++++++++++++++++++++++----------
 include/asm-x86/alternative_32.h |   37 ++++++++++++++++++
 include/asm-x86/alternative_64.h |   39 +++++++++++++++++++
 3 files changed, 132 insertions(+), 22 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c	2007-11-13 13:43:20.000000000 -0500
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c	2007-11-13 20:40:54.000000000 -0500
@@ -27,6 +27,58 @@ __setup("smp-alt-boot", bootonly);
 #define smp_alt_once 1
 #endif
 
+/*
+ * Warning:
+ * When you use this code to patch more than one byte of an instruction
+ * you need to make sure that other CPUs cannot execute this code in parallel.
+ * Also no thread must be currently preempted in the middle of these
+ * instructions.  And on the local CPU you need to be protected again NMI or MCE
+ * handlers seeing an inconsistent instruction while you patch.
+ * Warning: read_cr0 is modified by paravirt, this is why we have _early
+ * versions. They are not in the __init section because they can be used at
+ * module load time.
+ */
+static inline void text_sync(void *addr, size_t len)
+{
+	void *faddr;
+
+	sync_core();
+	/* FIXME Could also do a CLFLUSH here to speed up CPU recovery; but
+	   that causes hangs on some VIA CPUs. */
+	/* Not strictly needed, but can speed CPU recovery up. */
+	if (0 && cpu_has_clflush)
+		for (faddr = addr; faddr < addr + len;
+				faddr += boot_cpu_data.x86_clflush_size)
+			asm("clflush (%0) " :: "r" (faddr) : "memory");
+}
+
+void *text_poke_early(void *addr, const void *opcode, size_t len)
+{
+	memcpy(addr, opcode, len);
+	text_sync(addr, len);
+	return addr;
+}
+
+/*
+ * Only atomic text poke/set should be allowed when not doing early patching.
+ * It means the size must be writable atomically and the address must be aligned
+ * in a way that permits an atomic write.
+ */
+
+void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
+{
+	unsigned long cr0;
+
+	BUG_ON(len > sizeof(long));
+	BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
+		- ((long)addr & ~(sizeof(long) - 1)));
+	kernel_wp_save(cr0);
+	memcpy(addr, opcode, len);
+	kernel_wp_restore(cr0);
+	text_sync(addr, len);
+	return addr;
+}
+
 static int debug_alternative;
 
 static int __init debug_alt(char *str)
@@ -173,7 +225,7 @@ static const unsigned char*const * find_
 #endif /* CONFIG_X86_64 */
 
 /* Use this to add nops to a buffer, then text_poke the whole buffer. */
-static void add_nops(void *insns, unsigned int len)
+void add_nops(void *insns, unsigned int len)
 {
 	const unsigned char *const *noptable = find_nop_table();
 
@@ -186,6 +238,7 @@ static void add_nops(void *insns, unsign
 		len -= noplen;
 	}
 }
+EXPORT_SYMBOL_GPL(add_nops);
 
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
 extern u8 *__smp_locks[], *__smp_locks_end[];
@@ -219,7 +272,7 @@ void apply_alternatives(struct alt_instr
 		memcpy(insnbuf, a->replacement, a->replacementlen);
 		add_nops(insnbuf + a->replacementlen,
 			 a->instrlen - a->replacementlen);
-		text_poke(instr, insnbuf, a->instrlen);
+		text_poke_early(instr, insnbuf, a->instrlen);
 	}
 }
 
@@ -234,7 +287,8 @@ static void alternatives_smp_lock(u8 **s
 			continue;
 		if (*ptr > text_end)
 			continue;
-		text_poke(*ptr, ((unsigned char []){0xf0}), 1); /* add lock prefix */
+		/* add lock prefix */
+		text_poke(*ptr, INIT_ARRAY(unsigned char, 0xf0, 1), 1);
 	};
 }
 
@@ -397,7 +451,7 @@ void apply_paravirt(struct paravirt_patc
 
 		/* Pad the rest with nops */
 		add_nops(insnbuf + used, p->len - used);
-		text_poke(p->instr, insnbuf, p->len);
+		text_poke_early(p->instr, insnbuf, p->len);
 	}
 }
 extern struct paravirt_patch_site __start_parainstructions[],
@@ -456,19 +510,3 @@ void __init alternative_instructions(voi
 	restart_mce();
 #endif
 }
-
-/*
- * Warning:
- * When you use this code to patch more than one byte of an instruction
- * you need to make sure that other CPUs cannot execute this code in parallel.
- * Also no thread must be currently preempted in the middle of these instructions.
- * And on the local CPU you need to be protected again NMI or MCE handlers
- * seeing an inconsistent instruction while you patch.
- */
-void __kprobes text_poke(void *addr, unsigned char *opcode, int len)
-{
-	memcpy(addr, opcode, len);
-	sync_core();
-	/* Could also do a CLFLUSH here to speed up CPU recovery; but
-	   that causes hangs on some VIA CPUs. */
-}
Index: linux-2.6-lttng/include/asm-x86/alternative_32.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-x86/alternative_32.h	2007-11-13 13:43:20.000000000 -0500
+++ linux-2.6-lttng/include/asm-x86/alternative_32.h	2007-11-13 20:40:01.000000000 -0500
@@ -4,6 +4,7 @@
 #include <asm/types.h>
 #include <linux/stddef.h>
 #include <linux/types.h>
+#include <asm/processor-flags.h>
 
 struct alt_instr {
 	u8 *instr; 		/* original instruction */
@@ -149,6 +150,40 @@ apply_paravirt(struct paravirt_patch_sit
 #define __parainstructions_end	NULL
 #endif
 
-extern void text_poke(void *addr, unsigned char *opcode, int len);
+extern void add_nops(void *insns, unsigned int len);
+
+/*
+ * Clear and restore the kernel write-protection flag on the local CPU.
+ * Allows the kernel to edit read-only pages.
+ * Side-effect: any interrupt handler running between save and restore will have
+ * the ability to write to read-only pages.
+ *
+ * Warning:
+ * Code patching in the UP case is safe if NMIs and MCE handlers are stopped and
+ * no thread can be preempted in the instructions being modified (no iret to an
+ * invalid instruction possible) or if the instructions are changed from a
+ * consistent state to another consistent state atomically.
+ * More care must be taken when modifying code in the SMP case because of
+ * Intel's errata.
+ * On the local CPU you need to be protected again NMI or MCE handlers seeing an
+ * inconsistent instruction while you patch.
+ * The _early version does not use read_cr0(), which can be paravirtualized.
+ */
+
+extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern void *text_poke_early(void *addr, const void *opcode, size_t len);
+
+#define kernel_wp_save(cr0)					\
+	do {							\
+		preempt_disable();				\
+		cr0 = read_cr0();				\
+		write_cr0(cr0 & ~X86_CR0_WP);			\
+	} while (0)
+
+#define kernel_wp_restore(cr0)					\
+	do {							\
+		write_cr0(cr0);					\
+		preempt_enable();				\
+	} while (0)
 
 #endif /* _I386_ALTERNATIVE_H */
Index: linux-2.6-lttng/include/asm-x86/alternative_64.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-x86/alternative_64.h	2007-11-13 13:43:20.000000000 -0500
+++ linux-2.6-lttng/include/asm-x86/alternative_64.h	2007-11-13 20:40:09.000000000 -0500
@@ -5,6 +5,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <asm/processor-flags.h>
 
 /*
  * Alternative inline assembly for SMP.
@@ -154,6 +155,42 @@ apply_paravirt(struct paravirt_patch *st
 #define __parainstructions_end NULL
 #endif
 
-extern void text_poke(void *addr, unsigned char *opcode, int len);
+extern void add_nops(void *insns, unsigned int len);
+
+/*
+ * Clear and restore the kernel write-protection flag on the local CPU.
+ * Allows the kernel to edit read-only pages.
+ * Side-effect: any interrupt handler running between save and restore will have
+ * the ability to write to read-only pages.
+ *
+ * Warning:
+ * Code patching in the UP case is safe if NMIs and MCE handlers are stopped and
+ * no thread can be preempted in the instructions being modified (no iret to an
+ * invalid instruction possible) or if the instructions are changed from a
+ * consistent state to another consistent state atomically.
+ * More care must be taken when modifying code in the SMP case because of
+ * Intel's errata.
+ * On the local CPU you need to be protected again NMI or MCE handlers seeing an
+ * inconsistent instruction while you patch.
+ * The _early version does not use read_cr0(), which can be paravirtualized.
+ */
+
+extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern void *text_poke_early(void *addr, const void *opcode, size_t len);
+
+#define kernel_wp_save(cr0)					\
+	do {							\
+		typecheck(unsigned long, cr0);			\
+		preempt_disable();				\
+		cr0 = read_cr0();				\
+		write_cr0(cr0 & ~X86_CR0_WP);			\
+	} while (0)
+
+#define kernel_wp_restore(cr0)					\
+	do {							\
+		typecheck(unsigned long, cr0);			\
+		write_cr0(cr0);					\
+		preempt_enable();				\
+	} while (0)
 
 #endif /* _X86_64_ALTERNATIVE_H */
-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-12-06 12:19   ` pageexec
  2007-12-06 14:21     ` Mathieu Desnoyers
@ 2007-12-06 14:44     ` Mathieu Desnoyers
  2007-12-06 13:58       ` pageexec
  1 sibling, 1 reply; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-12-06 14:44 UTC (permalink / raw)
  To: pageexec
  Cc: akpm, linux-kernel, Andi Kleen, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin

* pageexec@freemail.hu (pageexec@freemail.hu) wrote:
> On 5 Dec 2007 at 21:02, Mathieu Desnoyers wrote:
> 
> > Fix a memcpy that should be a text_poke (in apply_alternatives).
> > 
> > Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
> > correctly and so the CPU HOTPLUG special case can be removed.
> > 
> > Add text_poke_early, for alternatives and paravirt boot-time and module load
> > time patching.
> > 
> > Notes:
> > - we use a macro for kernel_wp_save/restore to mimic local_irq_save/restore: the
> >   argument is passed without &.
> 
> sorry to chime in again, but lately i've been thinking that the
> cr0 argument is not really needed if one can ensure that calls
> to the kernel open/close macros won't nest (i checked and even
> in the PaX case it's easy to ensure, even desirable in fact).
> 
> in your case it's also true for now and i can't think of a situation
> where you'd really want to nest in the future (that'd mean opening
> up the kernel while some complex piece of code runs, more complex
> than a mere memset at least ;-). what do you think?
> 

It is correct to assume that the WP bit will always be activated, in
every configuration, even though we don't use the DEBUG_RODATA ?

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-12-06 12:19   ` pageexec
@ 2007-12-06 14:21     ` Mathieu Desnoyers
  2007-12-06 14:44     ` Mathieu Desnoyers
  1 sibling, 0 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-12-06 14:21 UTC (permalink / raw)
  To: pageexec
  Cc: akpm, linux-kernel, Andi Kleen, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin

* pageexec@freemail.hu (pageexec@freemail.hu) wrote:
> On 5 Dec 2007 at 21:02, Mathieu Desnoyers wrote:
> 
> > Fix a memcpy that should be a text_poke (in apply_alternatives).
> > 
> > Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
> > correctly and so the CPU HOTPLUG special case can be removed.
> > 
> > Add text_poke_early, for alternatives and paravirt boot-time and module load
> > time patching.
> > 
> > Notes:
> > - we use a macro for kernel_wp_save/restore to mimic local_irq_save/restore: the
> >   argument is passed without &.
> 
> sorry to chime in again, but lately i've been thinking that the
> cr0 argument is not really needed if one can ensure that calls
> to the kernel open/close macros won't nest (i checked and even
> in the PaX case it's easy to ensure, even desirable in fact).
> 
> in your case it's also true for now and i can't think of a situation
> where you'd really want to nest in the future (that'd mean opening
> up the kernel while some complex piece of code runs, more complex
> than a mere memset at least ;-). what do you think?
> 

Yes, your proposal makes sense. I'll do the changes.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-12-06 14:44     ` Mathieu Desnoyers
@ 2007-12-06 13:58       ` pageexec
  0 siblings, 0 replies; 20+ messages in thread
From: pageexec @ 2007-12-06 13:58 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: akpm, linux-kernel, Andi Kleen, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin

On 6 Dec 2007 at 9:44, Mathieu Desnoyers wrote:

> It is correct to assume that the WP bit will always be activated, in
> every configuration, even though we don't use the DEBUG_RODATA ?

yes, without it the kernel couldn't rely on page faults to trigger
copy-on-write on userland pages for example (just look at what it
has to do on the original i386...).


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-12-06  2:02 ` [patch 06/11] Text Edit Lock - Alternative code for x86 Mathieu Desnoyers
@ 2007-12-06 12:19   ` pageexec
  2007-12-06 14:21     ` Mathieu Desnoyers
  2007-12-06 14:44     ` Mathieu Desnoyers
  0 siblings, 2 replies; 20+ messages in thread
From: pageexec @ 2007-12-06 12:19 UTC (permalink / raw)
  To: akpm, linux-kernel
  Cc: Mathieu Desnoyers, Andi Kleen, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin

On 5 Dec 2007 at 21:02, Mathieu Desnoyers wrote:

> Fix a memcpy that should be a text_poke (in apply_alternatives).
> 
> Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
> correctly and so the CPU HOTPLUG special case can be removed.
> 
> Add text_poke_early, for alternatives and paravirt boot-time and module load
> time patching.
> 
> Notes:
> - we use a macro for kernel_wp_save/restore to mimic local_irq_save/restore: the
>   argument is passed without &.

sorry to chime in again, but lately i've been thinking that the
cr0 argument is not really needed if one can ensure that calls
to the kernel open/close macros won't nest (i checked and even
in the PaX case it's easy to ensure, even desirable in fact).

in your case it's also true for now and i can't think of a situation
where you'd really want to nest in the future (that'd mean opening
up the kernel while some complex piece of code runs, more complex
than a mere memset at least ;-). what do you think?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [patch 06/11] Text Edit Lock - Alternative code for x86
  2007-12-06  2:02 [patch 00/11] Text Edit Lock for 2.6.24-rc4-git3 Mathieu Desnoyers
@ 2007-12-06  2:02 ` Mathieu Desnoyers
  2007-12-06 12:19   ` pageexec
  0 siblings, 1 reply; 20+ messages in thread
From: Mathieu Desnoyers @ 2007-12-06  2:02 UTC (permalink / raw)
  To: akpm, Ingo Molnar, linux-kernel
  Cc: Mathieu Desnoyers, Andi Kleen, pageexec, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin

[-- Attachment #1: text-edit-lock-alternative-i386-and-x86_64.patch --]
[-- Type: text/plain, Size: 9688 bytes --]

Fix a memcpy that should be a text_poke (in apply_alternatives).

Use kernel_wp_save/kernel_wp_restore in text_poke to support DEBUG_RODATA
correctly and so the CPU HOTPLUG special case can be removed.

Add text_poke_early, for alternatives and paravirt boot-time and module load
time patching.

Notes:
- we use a macro for kernel_wp_save/restore to mimic local_irq_save/restore: the
  argument is passed without &.

Changelog:

- Fix text_set and text_poke alignment check (mixed up bitwise and and or)
- Remove text_set
- Use the new macro INIT_ARRAY() to stop polluting the C files with ({ })
  brackets (which breaks some c parsers in editors).
- Export add_nops, so it can be used by others.
- Remove x86 test for "wp_works_ok", it will just be ignored by the architecture
  if not supported.
- Document text_poke_early.
- Remove clflush, since it breaks some VIA architectures and is not strictly
  necessary.
- Add kerneldoc to text_poke and text_poke_early.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: pageexec@freemail.hu
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: H. Peter Anvin <hpa@zytor.com>
---
 arch/x86/kernel/alternative.c    |   58 ++++++++++++++++++++++++++++++++-------
 include/asm-x86/alternative_32.h |   37 ++++++++++++++++++++++++
 include/asm-x86/alternative_64.h |   39 +++++++++++++++++++++++++-
 3 files changed, 122 insertions(+), 12 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c	2007-11-16 12:39:40.000000000 -0500
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c	2007-11-16 12:42:50.000000000 -0500
@@ -173,7 +173,7 @@ static const unsigned char*const * find_
 #endif /* CONFIG_X86_64 */
 
 /* Use this to add nops to a buffer, then text_poke the whole buffer. */
-static void add_nops(void *insns, unsigned int len)
+void add_nops(void *insns, unsigned int len)
 {
 	const unsigned char *const *noptable = find_nop_table();
 
@@ -186,6 +186,7 @@ static void add_nops(void *insns, unsign
 		len -= noplen;
 	}
 }
+EXPORT_SYMBOL_GPL(add_nops);
 
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
 extern u8 *__smp_locks[], *__smp_locks_end[];
@@ -219,7 +220,7 @@ void apply_alternatives(struct alt_instr
 		memcpy(insnbuf, a->replacement, a->replacementlen);
 		add_nops(insnbuf + a->replacementlen,
 			 a->instrlen - a->replacementlen);
-		text_poke(instr, insnbuf, a->instrlen);
+		text_poke_early(instr, insnbuf, a->instrlen);
 	}
 }
 
@@ -234,7 +235,8 @@ static void alternatives_smp_lock(u8 **s
 			continue;
 		if (*ptr > text_end)
 			continue;
-		text_poke(*ptr, ((unsigned char []){0xf0}), 1); /* add lock prefix */
+		/* add lock prefix */
+		text_poke(*ptr, INIT_ARRAY(unsigned char, 0xf0, 1), 1);
 	};
 }
 
@@ -397,7 +399,7 @@ void apply_paravirt(struct paravirt_patc
 
 		/* Pad the rest with nops */
 		add_nops(insnbuf + used, p->len - used);
-		text_poke(p->instr, insnbuf, p->len);
+		text_poke_early(p->instr, insnbuf, p->len);
 	}
 }
 extern struct paravirt_patch_site __start_parainstructions[],
@@ -457,18 +459,54 @@ void __init alternative_instructions(voi
 #endif
 }
 
-/*
- * Warning:
+/**
+ * text_poke_early - Update instructions on a live kernel at boot time
+ * @addr: address to modify
+ * @opcode: source of the copy
+ * @len: length to copy
+ *
  * When you use this code to patch more than one byte of an instruction
  * you need to make sure that other CPUs cannot execute this code in parallel.
- * Also no thread must be currently preempted in the middle of these instructions.
- * And on the local CPU you need to be protected again NMI or MCE handlers
- * seeing an inconsistent instruction while you patch.
+ * Also no thread must be currently preempted in the middle of these
+ * instructions.  And on the local CPU you need to be protected again NMI or MCE
+ * handlers seeing an inconsistent instruction while you patch.
+ * Warning: read_cr0 is modified by paravirt, this is why we have _early
+ * versions. They are not in the __init section because they can be used at
+ * module load time.
  */
-void __kprobes text_poke(void *addr, unsigned char *opcode, int len)
+void *text_poke_early(void *addr, const void *opcode, size_t len)
 {
 	memcpy(addr, opcode, len);
 	sync_core();
 	/* Could also do a CLFLUSH here to speed up CPU recovery; but
 	   that causes hangs on some VIA CPUs. */
+	return addr;
 }
+
+/**
+ * text_poke - Update instructions on a live kernel
+ * @addr: address to modify
+ * @opcode: source of the copy
+ * @len: length to copy
+ *
+ * Only atomic text poke/set should be allowed when not doing early patching.
+ * It means the size must be writable atomically and the address must be aligned
+ * in a way that permits an atomic write.
+ */
+void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
+{
+	unsigned long cr0;
+
+	BUG_ON(len > sizeof(long));
+	BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
+		- ((long)addr & ~(sizeof(long) - 1)));
+	kernel_wp_save(cr0);
+	memcpy(addr, opcode, len);
+	kernel_wp_restore(cr0);
+	sync_core();
+	/* Could also do a CLFLUSH here to speed up CPU recovery; but
+	   that causes hangs on some VIA CPUs. */
+	return addr;
+}
+
+
Index: linux-2.6-lttng/include/asm-x86/alternative_32.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-x86/alternative_32.h	2007-11-16 12:39:40.000000000 -0500
+++ linux-2.6-lttng/include/asm-x86/alternative_32.h	2007-11-16 12:40:03.000000000 -0500
@@ -4,6 +4,7 @@
 #include <asm/types.h>
 #include <linux/stddef.h>
 #include <linux/types.h>
+#include <asm/processor-flags.h>
 
 struct alt_instr {
 	u8 *instr; 		/* original instruction */
@@ -149,6 +150,40 @@ apply_paravirt(struct paravirt_patch_sit
 #define __parainstructions_end	NULL
 #endif
 
-extern void text_poke(void *addr, unsigned char *opcode, int len);
+extern void add_nops(void *insns, unsigned int len);
+
+/*
+ * Clear and restore the kernel write-protection flag on the local CPU.
+ * Allows the kernel to edit read-only pages.
+ * Side-effect: any interrupt handler running between save and restore will have
+ * the ability to write to read-only pages.
+ *
+ * Warning:
+ * Code patching in the UP case is safe if NMIs and MCE handlers are stopped and
+ * no thread can be preempted in the instructions being modified (no iret to an
+ * invalid instruction possible) or if the instructions are changed from a
+ * consistent state to another consistent state atomically.
+ * More care must be taken when modifying code in the SMP case because of
+ * Intel's errata.
+ * On the local CPU you need to be protected again NMI or MCE handlers seeing an
+ * inconsistent instruction while you patch.
+ * The _early version does not use read_cr0(), which can be paravirtualized.
+ */
+
+extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern void *text_poke_early(void *addr, const void *opcode, size_t len);
+
+#define kernel_wp_save(cr0)					\
+	do {							\
+		preempt_disable();				\
+		cr0 = read_cr0();				\
+		write_cr0(cr0 & ~X86_CR0_WP);			\
+	} while (0)
+
+#define kernel_wp_restore(cr0)					\
+	do {							\
+		write_cr0(cr0);					\
+		preempt_enable();				\
+	} while (0)
 
 #endif /* _I386_ALTERNATIVE_H */
Index: linux-2.6-lttng/include/asm-x86/alternative_64.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-x86/alternative_64.h	2007-11-16 12:39:40.000000000 -0500
+++ linux-2.6-lttng/include/asm-x86/alternative_64.h	2007-11-16 12:40:03.000000000 -0500
@@ -5,6 +5,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <asm/processor-flags.h>
 
 /*
  * Alternative inline assembly for SMP.
@@ -154,6 +155,42 @@ apply_paravirt(struct paravirt_patch *st
 #define __parainstructions_end NULL
 #endif
 
-extern void text_poke(void *addr, unsigned char *opcode, int len);
+extern void add_nops(void *insns, unsigned int len);
+
+/*
+ * Clear and restore the kernel write-protection flag on the local CPU.
+ * Allows the kernel to edit read-only pages.
+ * Side-effect: any interrupt handler running between save and restore will have
+ * the ability to write to read-only pages.
+ *
+ * Warning:
+ * Code patching in the UP case is safe if NMIs and MCE handlers are stopped and
+ * no thread can be preempted in the instructions being modified (no iret to an
+ * invalid instruction possible) or if the instructions are changed from a
+ * consistent state to another consistent state atomically.
+ * More care must be taken when modifying code in the SMP case because of
+ * Intel's errata.
+ * On the local CPU you need to be protected again NMI or MCE handlers seeing an
+ * inconsistent instruction while you patch.
+ * The _early version does not use read_cr0(), which can be paravirtualized.
+ */
+
+extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern void *text_poke_early(void *addr, const void *opcode, size_t len);
+
+#define kernel_wp_save(cr0)					\
+	do {							\
+		typecheck(unsigned long, cr0);			\
+		preempt_disable();				\
+		cr0 = read_cr0();				\
+		write_cr0(cr0 & ~X86_CR0_WP);			\
+	} while (0)
+
+#define kernel_wp_restore(cr0)					\
+	do {							\
+		typecheck(unsigned long, cr0);			\
+		write_cr0(cr0);					\
+		preempt_enable();				\
+	} while (0)
 
 #endif /* _X86_64_ALTERNATIVE_H */

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2007-12-06 15:01 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-13 18:46 [patch 00/11] Text Edit Lock Mathieu Desnoyers
2007-11-13 18:46 ` [patch 01/11] Kprobes - use a mutex to protect the instruction pages list Mathieu Desnoyers
2007-11-13 18:46 ` [patch 02/11] Kprobes - do not use kprobes mutex in arch code Mathieu Desnoyers
2007-11-13 18:46 ` [patch 03/11] Kprobes - declare kprobe_mutex static Mathieu Desnoyers
2007-11-13 18:46 ` [patch 04/11] Add INIT_ARRAY() to kernel.h Mathieu Desnoyers
2007-11-13 18:46 ` [patch 05/11] Text Edit Lock - Architecture Independent Code Mathieu Desnoyers
2007-11-13 18:46 ` [patch 06/11] Text Edit Lock - Alternative code for x86 Mathieu Desnoyers
2007-11-13 20:54   ` pageexec
2007-11-13 22:10     ` Mathieu Desnoyers
2007-11-14  1:42     ` [patch 06/11] Text Edit Lock - Alternative code for x86 (updated) Mathieu Desnoyers
2007-11-13 18:46 ` [patch 07/11] Text Edit Lock - kprobes architecture independent support Mathieu Desnoyers
2007-11-13 18:46 ` [patch 08/11] Text Edit Lock - kprobes x86_32 Mathieu Desnoyers
2007-11-13 18:46 ` [patch 09/11] Text Edit Lock - kprobes x86_64 Mathieu Desnoyers
2007-11-13 18:46 ` [patch 10/11] Text Edit Lock - x86_32 standardize debug rodata Mathieu Desnoyers
2007-11-13 18:46 ` [patch 11/11] Text Edit Lock - x86_64 " Mathieu Desnoyers
2007-12-06  2:02 [patch 00/11] Text Edit Lock for 2.6.24-rc4-git3 Mathieu Desnoyers
2007-12-06  2:02 ` [patch 06/11] Text Edit Lock - Alternative code for x86 Mathieu Desnoyers
2007-12-06 12:19   ` pageexec
2007-12-06 14:21     ` Mathieu Desnoyers
2007-12-06 14:44     ` Mathieu Desnoyers
2007-12-06 13:58       ` pageexec

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).