LKML Archive on lore.kernel.org
* [PATCH] x86 Fix text_poke for vmalloced pages
@ 2008-03-20  0:39 Mathieu Desnoyers
  2008-03-21  9:38 ` Ingo Molnar
  0 siblings, 1 reply; 3+ messages in thread
From: Mathieu Desnoyers @ 2008-03-20  0:39 UTC
  To: Ingo Molnar, linux-kernel

The shadow vmap for DEBUG_RODATA kernel text modification uses virt_to_page to
get the pages from the pointer address.

However, I think vmalloc_to_page would be required in case the page is used for
modules.

Since only the core kernel text is marked read-only, use core_kernel_text()
to make sure we only shadow map the core kernel text, not modules.

This is an incremental change to make the DEBUG_RODATA and text_poke play
together nicely. A future step will be to make the module text read-only too,
which will require changes to load module, module free and text_poke.
The idea is to fix the current x86 git tree quickly.
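
As a rough illustration of the dispatch described above, here is a user-space
sketch. It is not kernel code: struct text_range, in_core_text(),
choose_poke_path() and the enum are hypothetical names, and the range test only
stands in for what core_kernel_text() decides in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the address range of the core kernel text. */
struct text_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/* Assumed analogue of core_kernel_text(): is addr inside the core image? */
static bool in_core_text(const struct text_range *core, uintptr_t addr)
{
	return addr >= core->start && addr < core->end;
}

enum poke_path {
	POKE_SHADOW_MAP,	/* read-only core text: write through a vmap alias */
	POKE_DIRECT,		/* writable module text: plain memcpy */
};

/* Pick the write strategy according to where the address lives. */
static enum poke_path choose_poke_path(const struct text_range *core,
				       uintptr_t addr)
{
	return in_core_text(core, addr) ? POKE_SHADOW_MAP : POKE_DIRECT;
}
```

With a made-up core range of [0x1000, 0x2000), address 0x1800 takes the shadow
map path, while 0x2000 is treated as module text and written directly.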

- Changelog:
kernel_text_address() -> core_kernel_text().

It applies on top of the x86 git tree, 2.6.25-rc6.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/alternative.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c	2008-03-19 18:57:29.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c	2008-03-19 20:01:10.000000000 -0400
@@ -511,7 +511,7 @@ void *__kprobes text_poke(void *addr, co
 	BUG_ON(len > sizeof(long));
 	BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
 		- ((long)addr & ~(sizeof(long) - 1)));
-	{
+	if (core_kernel_text((unsigned long)addr)) {
 		struct page *pages[2] = { virt_to_page(addr),
 			virt_to_page(addr + PAGE_SIZE) };
 		if (!pages[1])
@@ -522,6 +522,13 @@ void *__kprobes text_poke(void *addr, co
 		memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
 		local_irq_restore(flags);
 		vunmap(vaddr);
+	} else {
+		/*
+		 * modules are in vmalloc'ed memory, always writable.
+		 */
+		local_irq_save(flags);
+		memcpy(addr, opcode, len);
+		local_irq_restore(flags);
 	}
 	sync_core();
 	/* Could also do a CLFLUSH here to speed up CPU recovery; but
-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] x86 Fix text_poke for vmalloced pages
  2008-03-20  0:39 [PATCH] x86 Fix text_poke for vmalloced pages Mathieu Desnoyers
@ 2008-03-21  9:38 ` Ingo Molnar
  2008-03-24 17:02   ` Mathieu Desnoyers
  0 siblings, 1 reply; 3+ messages in thread
From: Ingo Molnar @ 2008-03-21  9:38 UTC
  To: Mathieu Desnoyers; +Cc: linux-kernel, Arjan van de Ven, Thomas Gleixner


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> The shadow vmap for DEBUG_RODATA kernel text modification uses 
> virt_to_page to get the pages from the pointer address.
> 
> However, I think vmalloc_to_page would be required in case the page is 
> used for modules.
> 
> Since only the core kernel text is marked read-only, use 
> core_kernel_text() to make sure we only shadow map the core kernel 
> text, not modules.
> 
> This is an incremental change to make the DEBUG_RODATA and text_poke 
> play together nicely. A future step will be to make the module text 
> read-only too, which will require changes to load module, module free 
> and text_poke. The idea is to fix the current x86 git tree quickly.

> +	if (core_kernel_text((unsigned long)addr)) {
>  		struct page *pages[2] = { virt_to_page(addr),
>  			virt_to_page(addr + PAGE_SIZE) };
>  		if (!pages[1])
> @@ -522,6 +522,13 @@ void *__kprobes text_poke(void *addr, co
>  		memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
>  		local_irq_restore(flags);
>  		vunmap(vaddr);
> +	} else {
> +		/*
> +		 * modules are in vmalloc'ed memory, always writable.
> +		 */
> +		local_irq_save(flags);
> +		memcpy(addr, opcode, len);
> +		local_irq_restore(flags);

hm, this looks ugly, and the whole text_poke() function looks ugly. For 
example why the extra code block + indentation here:

+void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
+{
+       unsigned long flags;
+       char *vaddr;
+       int nr_pages = 2;
+
+       BUG_ON(len > sizeof(long));
+       BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
+               - ((long)addr & ~(sizeof(long) - 1)));
+       {
+               struct page *pages[2] = { virt_to_page(addr),
+                       virt_to_page(addr + PAGE_SIZE) };

also, more fundamentally - why not introduce a proper, generic "look up 
kernel text struct page *" method, instead of open-coding various 
assumptions about which kernel text is readonly and which isn't?

	Ingo


* Re: [PATCH] x86 Fix text_poke for vmalloced pages
  2008-03-21  9:38 ` Ingo Molnar
@ 2008-03-24 17:02   ` Mathieu Desnoyers
  0 siblings, 0 replies; 3+ messages in thread
From: Mathieu Desnoyers @ 2008-03-24 17:02 UTC
  To: Ingo Molnar; +Cc: linux-kernel, Arjan van de Ven, Thomas Gleixner

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > The shadow vmap for DEBUG_RODATA kernel text modification uses 
> > virt_to_page to get the pages from the pointer address.
> > 
> > However, I think vmalloc_to_page would be required in case the page is 
> > used for modules.
> > 
> > Since only the core kernel text is marked read-only, use 
> > core_kernel_text() to make sure we only shadow map the core kernel 
> > text, not modules.
> > 
> > This is an incremental change to make the DEBUG_RODATA and text_poke 
> > play together nicely. A future step will be to make the module text 
> > read-only too, which will require changes to load module, module free 
> > and text_poke. The idea is to fix the current x86 git tree quickly.
> 
> > +	if (core_kernel_text((unsigned long)addr)) {
> >  		struct page *pages[2] = { virt_to_page(addr),
> >  			virt_to_page(addr + PAGE_SIZE) };
> >  		if (!pages[1])
> > @@ -522,6 +522,13 @@ void *__kprobes text_poke(void *addr, co
> >  		memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> >  		local_irq_restore(flags);
> >  		vunmap(vaddr);
> > +	} else {
> > +		/*
> > +		 * modules are in vmalloc'ed memory, always writable.
> > +		 */
> > +		local_irq_save(flags);
> > +		memcpy(addr, opcode, len);
> > +		local_irq_restore(flags);
> 
> hm, this looks ugly, and the whole text_poke() function looks ugly. For 
> example why the extra code block + indentation here:
> 
> +void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> +{
> +       unsigned long flags;
> +       char *vaddr;
> +       int nr_pages = 2;
> +
> +       BUG_ON(len > sizeof(long));
> +       BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
> +               - ((long)addr & ~(sizeof(long) - 1)));
> +       {
> +               struct page *pages[2] = { virt_to_page(addr),
> +                       virt_to_page(addr + PAGE_SIZE) };
> 

The extra indentation is there so we can declare and initialize
struct page *pages[2] = { };
in a single statement.

Otherwise, the initialization would come after the BUG_ON lines and we would
have to do the following, which takes extra lines...

  unsigned long flags;
  char *vaddr;
  int nr_pages = 2;
  struct page *pages[2];

  BUG_ONs...
  pages[0] = virt_to_page(addr);
  pages[1] = virt_to_page(addr + PAGE_SIZE);

But in the new text_poke the extra indentation sits inside a block that
depends on an "if" statement, so it becomes less cumbersome. In any case, it
all goes away in the update below.


> also, more fundamentally - why not introduce a proper, generic "look up 
> kernel text struct page *" method, instead of open-coding various 
> assumptions about which kernel text is readonly and which isn't?
> 

I totally agree with you, that's the correct way to do it. And why
should we special-case the "kernel text is not read-only" case?
Since this is a slow path, and special-casing it would only make bugs
potentially harder to detect, I would simply _always_ use a shadow
map to modify the kernel text. New patch attached.
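
To illustrate the "always write through a shadow map" idea, here is a
user-space analogy, not the kernel implementation. It assumes a POSIX system,
and uses a temporary file with two mmap() views in place of vmap();
poke_via_alias() and shadow_map_demo() are hypothetical names. The "text" view
is read-only, and the write goes through a short-lived writable alias of the
same page.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write len bytes at offset within the first page of fd through a
 * temporary writable mapping. Returns 0 on success, -1 on failure.
 */
static int poke_via_alias(int fd, size_t offset, const void *opcode, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	char *alias;

	if (page <= 0 || offset + len > (size_t)page)
		return -1;
	alias = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
		     fd, 0);
	if (alias == MAP_FAILED)
		return -1;
	memcpy(alias + offset, opcode, len);
	munmap(alias, (size_t)page);	/* drop the alias, like vunmap() */
	return 0;
}

/* Returns 0 if the read-only view observes the bytes poked via the alias. */
static int shadow_map_demo(void)
{
	static const unsigned char opcode[] = { 0x90, 0x90 };	/* x86 NOPs */
	long page = sysconf(_SC_PAGESIZE);
	FILE *backing = tmpfile();
	const unsigned char *text;
	int ret = -1;

	if (!backing)
		return -1;
	if (page <= 0 || ftruncate(fileno(backing), page) != 0)
		goto out_close;

	/* The "text": a mapping that cannot be written through directly. */
	text = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED,
		    fileno(backing), 0);
	if (text == MAP_FAILED)
		goto out_close;

	if (poke_via_alias(fileno(backing), 16, opcode, sizeof(opcode)) == 0 &&
	    memcmp(text + 16, opcode, sizeof(opcode)) == 0)
		ret = 0;

	munmap((void *)text, (size_t)page);
out_close:
	fclose(backing);
	return ret;
}
```

The point of the analogy is that the writer never needs to know whether the
original mapping is writable, which is why a single code path can serve both
core kernel text and module text.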

Mathieu

> 	Ingo

x86 Fix text_poke for vmalloced pages

The shadow vmap for DEBUG_RODATA kernel text modification uses virt_to_page to
get the pages from the pointer address.

However, I think vmalloc_to_page is required when the page is used for
modules, so use vmalloc_to_page for vmalloc'ed addresses and virt_to_page
otherwise.
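
The page lookup works on whole pages, while the memcpy in text_poke() lands at
the offset of the address within its page, and the second BUG_ON rejects writes
that cross a long-aligned boundary. The following user-space sketch shows that
arithmetic. It assumes 4 KiB pages, and the SKETCH_* macros and helper names
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL			/* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Offset of addr inside its page, as in (unsigned long)addr & ~PAGE_MASK. */
static size_t page_offset(uintptr_t addr)
{
	return (size_t)(addr & ~SKETCH_PAGE_MASK);
}

/* True when [addr, addr + len) also touches the following page. */
static bool spans_two_pages(uintptr_t addr, size_t len)
{
	return len > 0 && page_offset(addr) + len > SKETCH_PAGE_SIZE;
}

/* Same condition as the second BUG_ON: first and last byte of the write
 * fall into different long-aligned words. */
static bool crosses_word_boundary(uintptr_t addr, size_t len)
{
	uintptr_t mask = ~(uintptr_t)(sizeof(long) - 1);

	return ((addr + len - 1) & mask) != (addr & mask);
}
```

For example, page_offset(0x1234) is 0x234, a 4-byte write at 0x1ffe spans two
pages, and a write of sizeof(long) bytes at 0x1001 crosses a word boundary
whereas the same write at 0x1000 does not.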

- Changelog:
Deal with read-only module text.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/alternative.c |   27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c	2008-03-24 10:00:30.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c	2008-03-24 10:18:19.000000000 -0400
@@ -507,22 +507,27 @@ void *__kprobes text_poke(void *addr, co
 	unsigned long flags;
 	char *vaddr;
 	int nr_pages = 2;
+	struct page *pages[2];
 
 	BUG_ON(len > sizeof(long));
 	BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
 		- ((long)addr & ~(sizeof(long) - 1)));
-	{
-		struct page *pages[2] = { virt_to_page(addr),
-			virt_to_page(addr + PAGE_SIZE) };
-		if (!pages[1])
-			nr_pages = 1;
-		vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
-		WARN_ON(!vaddr);
-		local_irq_save(flags);
-		memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
-		local_irq_restore(flags);
-		vunmap(vaddr);
+	if (is_vmalloc_addr(addr)) {
+		pages[0] = vmalloc_to_page(addr);
+		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
+	} else {
+		pages[0] = virt_to_page(addr);
+		pages[1] = virt_to_page(addr + PAGE_SIZE);
 	}
+	BUG_ON(!pages[0]);
+	if (!pages[1])
+		nr_pages = 1;
+	vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
+	BUG_ON(!vaddr);
+	local_irq_save(flags);
+	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
+	local_irq_restore(flags);
+	vunmap(vaddr);
 	sync_core();
 	/* Could also do a CLFLUSH here to speed up CPU recovery; but
 	   that causes hangs on some VIA CPUs. */



-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
