LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
@ 2008-02-29 20:39 Jeremy Fitzhardinge
  2008-02-29 20:56 ` H. Peter Anvin
  2008-02-29 21:24 ` Ingo Molnar
  0 siblings, 2 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-02-29 20:39 UTC (permalink / raw)
  To: Mathieu Desnoyers, Ingo Molnar
  Cc: Linux Kernel Mailing List, H. Peter Anvin, Andi Kleen, Zachary Amsden

The patch "x86 - Enhance DEBUG_RODATA support - alternatives" enables 
the kernel for writing by clearing X86_CR0_WP allow privileged writes.  
This won't work in a paravirt environment for two reasons:

   1. the kernel may not be running in ring 0, so writes will still be
      prevented
   2. the hypervisor prevents X86_CR0_WP from being cleared anyway (it
      GPFs the cr0 update)

This crashes on Xen, and it would probably break VMI too.

The only safe way to allow writes is to change the page permissions 
(either on the page itself, or create a temporary writable alias for 
that page).  Perhaps something you could do it with kmap_atomic.

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-02-29 20:39 bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives" Jeremy Fitzhardinge
@ 2008-02-29 20:56 ` H. Peter Anvin
  2008-02-29 21:09   ` Jeremy Fitzhardinge
  2008-02-29 21:24 ` Ingo Molnar
  1 sibling, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2008-02-29 20:56 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Mathieu Desnoyers, Ingo Molnar, Linux Kernel Mailing List,
	Andi Kleen, Zachary Amsden

Jeremy Fitzhardinge wrote:
> The patch "x86 - Enhance DEBUG_RODATA support - alternatives" enables 
> the kernel for writing by clearing X86_CR0_WP allow privileged writes.  
> This won't work in a paravirt environment for two reasons:
> 
>   1. the kernel may not be running in ring 0, so writes will still be
>      prevented
>   2. the hypervisor prevents X86_CR0_WP from being cleared anyway (it
>      GPFs the cr0 update)
> 
> This crashes on Xen, and it would probably break VMI too.
> 
> The only safe way to allow writes is to change the page permissions 
> (either on the page itself, or create a temporary writable alias for 
> that page).  Perhaps something you could do it with kmap_atomic.
> 

A properly implemented hypervisor should arguably emulate this.

Doesn't really mean the patch is worth the pain.

	-hpa

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-02-29 20:56 ` H. Peter Anvin
@ 2008-02-29 21:09   ` Jeremy Fitzhardinge
  2008-03-03 17:30     ` Mathieu Desnoyers
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-02-29 21:09 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Mathieu Desnoyers, Ingo Molnar, Linux Kernel Mailing List,
	Andi Kleen, Zachary Amsden

H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>> The patch "x86 - Enhance DEBUG_RODATA support - alternatives" enables 
>> the kernel for writing by clearing X86_CR0_WP allow privileged 
>> writes.  This won't work in a paravirt environment for two reasons:
>>
>>   1. the kernel may not be running in ring 0, so writes will still be
>>      prevented
>>   2. the hypervisor prevents X86_CR0_WP from being cleared anyway (it
>>      GPFs the cr0 update)
>>
>> This crashes on Xen, and it would probably break VMI too.

(lguest too, of course)

>> The only safe way to allow writes is to change the page permissions 
>> (either on the page itself, or create a temporary writable alias for 
>> that page).  Perhaps something you could do it with kmap_atomic.
>>
>
> A properly implemented hypervisor should arguably emulate this.
>
> Doesn't really mean the patch is worth the pain.

No, it would be irritating to implement.

Seems to me that doing the update in a temporary kmap_atomic mapping 
would be a more straightforward way to go, anyway.  How would you 
implement this on a processor without something like X86_CR0_WP?

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-02-29 20:39 bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives" Jeremy Fitzhardinge
  2008-02-29 20:56 ` H. Peter Anvin
@ 2008-02-29 21:24 ` Ingo Molnar
  1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2008-02-29 21:24 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Mathieu Desnoyers, Linux Kernel Mailing List, H. Peter Anvin,
	Andi Kleen, Zachary Amsden


* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> The patch "x86 - Enhance DEBUG_RODATA support - alternatives" enables 
> the kernel for writing by clearing X86_CR0_WP allow privileged writes.  
> This won't work in a paravirt environment for two reasons:

ok, have reverted it and have pushed out a new x86.git.

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-02-29 21:09   ` Jeremy Fitzhardinge
@ 2008-03-03 17:30     ` Mathieu Desnoyers
  2008-03-03 17:35       ` Andi Kleen
  2008-03-03 17:58       ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 12+ messages in thread
From: Mathieu Desnoyers @ 2008-03-03 17:30 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: H. Peter Anvin, Ingo Molnar, Linux Kernel Mailing List,
	Andi Kleen, Zachary Amsden

* Jeremy Fitzhardinge (jeremy@goop.org) wrote:
> H. Peter Anvin wrote:
>> Jeremy Fitzhardinge wrote:
>>> The patch "x86 - Enhance DEBUG_RODATA support - alternatives" enables the 
>>> kernel for writing by clearing X86_CR0_WP allow privileged writes.  This 
>>> won't work in a paravirt environment for two reasons:
>>>
>>>   1. the kernel may not be running in ring 0, so writes will still be
>>>      prevented
>>>   2. the hypervisor prevents X86_CR0_WP from being cleared anyway (it
>>>      GPFs the cr0 update)
>>>
>>> This crashes on Xen, and it would probably break VMI too.
>
> (lguest too, of course)
>
>>> The only safe way to allow writes is to change the page permissions 
>>> (either on the page itself, or create a temporary writable alias for that 
>>> page).  Perhaps something you could do it with kmap_atomic.
>>>
>>
>> A properly implemented hypervisor should arguably emulate this.
>>
>> Doesn't really mean the patch is worth the pain.
>
> No, it would be irritating to implement.
>
> Seems to me that doing the update in a temporary kmap_atomic mapping would 
> be a more straightforward way to go, anyway.  How would you implement this 
> on a processor without something like X86_CR0_WP?
>
>    J

I think kmap_atomic is only implemented on x86_32 and only deals with
highmem pages. It will simply return the original page address without
changing the protection for other pages, which is not what we want.
Would ioremap() be a good alternative ?

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-03-03 17:30     ` Mathieu Desnoyers
@ 2008-03-03 17:35       ` Andi Kleen
  2008-03-03 17:58       ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2008-03-03 17:35 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jeremy Fitzhardinge, H. Peter Anvin, Ingo Molnar,
	Linux Kernel Mailing List, Andi Kleen, Zachary Amsden

> Would ioremap() be a good alternative ?

One of the early attempts at patching ro data used ioremap. Unfortunately
it was a little buggy and was thus reverted, but that were only detail
problems, nothing fundamental of the approach.

-Andi


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-03-03 17:30     ` Mathieu Desnoyers
  2008-03-03 17:35       ` Andi Kleen
@ 2008-03-03 17:58       ` Jeremy Fitzhardinge
  2008-03-03 18:03         ` Andi Kleen
  1 sibling, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-03-03 17:58 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: H. Peter Anvin, Ingo Molnar, Linux Kernel Mailing List,
	Andi Kleen, Zachary Amsden

Mathieu Desnoyers wrote:
> I think kmap_atomic is only implemented on x86_32 and only deals with
> highmem pages. It will simply return the original page address without
> changing the protection for other pages, which is not what we want.
> Would ioremap() be a good alternative ?
>   

Perhaps, though that's uncached by default.  You could reserve a fixmap 
slot and use set_fixmap to create the mapping.  Or use vmap, which may 
make dealing with instructions crossing page boundaries a little easier.

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-03-03 17:58       ` Jeremy Fitzhardinge
@ 2008-03-03 18:03         ` Andi Kleen
  2008-03-03 18:05           ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2008-03-03 18:03 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Mathieu Desnoyers, H. Peter Anvin, Ingo Molnar,
	Linux Kernel Mailing List, Zachary Amsden

On Monday 03 March 2008 18:58:03 Jeremy Fitzhardinge wrote:

> Perhaps, though that's uncached by default.

ioremap_cached()

-Andi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-03-03 18:03         ` Andi Kleen
@ 2008-03-03 18:05           ` Jeremy Fitzhardinge
  2008-03-03 20:53             ` Mathieu Desnoyers
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-03-03 18:05 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mathieu Desnoyers, H. Peter Anvin, Ingo Molnar,
	Linux Kernel Mailing List, Zachary Amsden

Andi Kleen wrote:
> On Monday 03 March 2008 18:58:03 Jeremy Fitzhardinge wrote:
>
>   
>> Perhaps, though that's uncached by default.
>>     
>
> ioremap_cached()

Sure.  But given that from the perspective of this problem ioremap* is 
just a wrapper for vmap, we may as well use it directly and avoid 
getting tangled up in any current or future io-related stuff that 
ioremap may want to do.

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-03-03 18:05           ` Jeremy Fitzhardinge
@ 2008-03-03 20:53             ` Mathieu Desnoyers
  2008-03-03 22:15               ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 12+ messages in thread
From: Mathieu Desnoyers @ 2008-03-03 20:53 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, H. Peter Anvin, Ingo Molnar,
	Linux Kernel Mailing List, Zachary Amsden

* Jeremy Fitzhardinge (jeremy@goop.org) wrote:
> Andi Kleen wrote:
>> On Monday 03 March 2008 18:58:03 Jeremy Fitzhardinge wrote:
>>
>>   
>>> Perhaps, though that's uncached by default.
>>>     
>>
>> ioremap_cached()
>
> Sure.  But given that from the perspective of this problem ioremap* is just 
> a wrapper for vmap, we may as well use it directly and avoid getting 
> tangled up in any current or future io-related stuff that ioremap may want 
> to do.
>
>    J

Would you have a quick hint on why I get a page fault with the following
implementation ? There is probably a fundamental detail I missed.


void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
{
        char *vaddr;
        struct page *pages[1];

        BUG_ON(len > sizeof(long));
        BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
                - ((long)addr & ~(sizeof(long) - 1)));
        if (kernel_text_address((unsigned long)addr))
                pages[0] = virt_to_page(addr);
        else
                vaddr = addr;
        vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
        memcpy(&vaddr[(unsigned long)addr & PAGE_MASK], opcode, len);
        if (kernel_text_address((unsigned long)addr))
                vunmap(vaddr);
        sync_core();
        /* Could also do a CLFLUSH here to speed up CPU recovery; but
           that causes hangs on some VIA CPUs. */
        return addr;
}


[    0.149856] SMP alternatives: switching to UP code                           
[    0.152009] BUG: unable to handle kernel paging request at b8902000          
[    0.152009] IP: [<c03acfa1>] text_poke+0x85/0xb4                             
[    0.152009] *pde = 00000000                                                  
[    0.152009] Oops: 0002 [#1] SMP                                              
[    0.152009] LTT NESTING LEVEL : 0 <0>                                        
[    0.152009] Modules linked in:
[    0.152009] 
[    0.152009] Pid: 0, comm: swapper Not tainted (2.6.25-rc3-testssmp #744)
[    0.152009] EIP: 0060:[<c03acfa1>] EFLAGS: 00010002 CPU: 0
[    0.152009] EIP is at text_poke+0x85/0xb4
[    0.152009] EAX: f8800000 EBX: 00000001 ECX: 00000001 EDX: f8800000
[    0.152009] ESI: c04a3fab EDI: b8902000 EBP: c0102114 ESP: c04a3f88
[    0.152009]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    0.152009] Process swapper (pid: 0, ti=c04a2000 task=c04703a0 task.ti=c04a2)
[    0.152009] Stack: 00000163 f8800000 c1002040 c04a400c c04a7cac c0100000 c03 
[    0.152009]        90411244 00000206 c04e072c c17fb72c 0000672c c04aaefe c03 
[    0.152009]        c04ab76a c17f5000 c04a896a 00000092 c04a80d7 00000008 000 
[    0.152009] Call Trace:                                                      
[    0.152009]  [<c03b0139>] _etext+0x0/0xf7ec7                                 
[    0.152009]  [<c0107814>] alternatives_smp_unlock+0x57/0x59                  
[    0.152009]  [<c04aaefe>] alternative_instructions+0x152/0x157               
[    0.152009]  [<c03b0139>] _etext+0x0/0xf7ec7                                 
[    0.152009]  [<c04ab76a>] check_bugs+0x131/0x14e                             
[    0.152009]  [<c04a896a>] start_kernel+0x2d1/0x327                           
[    0.152009]  [<c04a80d7>] unknown_bootoption+0x0/0x1e3                       
[    0.152009]  =======================                                         
[    0.152009] Code: b9 04 00 00 00 ba 01 00 00 00 e8 a6 87 db ff 89 44 24 04 8 
[    0.152009] EIP: [<c03acfa1>] text_poke+0x85/0xb4 SS:ESP 0068:c04a3f88       
[    0.152009] ---[ end trace ca143223eefdc828 ]---                             
[    0.152009] Kernel panic - not syncing: Attempted to kill the idle task!     


Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-03-03 20:53             ` Mathieu Desnoyers
@ 2008-03-03 22:15               ` Jeremy Fitzhardinge
  2008-03-04  4:12                 ` Mathieu Desnoyers
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-03-03 22:15 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Andi Kleen, H. Peter Anvin, Ingo Molnar,
	Linux Kernel Mailing List, Zachary Amsden

Mathieu Desnoyers wrote:
> * Jeremy Fitzhardinge (jeremy@goop.org) wrote:
>   
>> Andi Kleen wrote:
>>     
>>> On Monday 03 March 2008 18:58:03 Jeremy Fitzhardinge wrote:
>>>
>>>   
>>>       
>>>> Perhaps, though that's uncached by default.
>>>>     
>>>>         
>>> ioremap_cached()
>>>       
>> Sure.  But given that from the perspective of this problem ioremap* is just 
>> a wrapper for vmap, we may as well use it directly and avoid getting 
>> tangled up in any current or future io-related stuff that ioremap may want 
>> to do.
>>
>>    J
>>     
>
> Would you have a quick hint on why I get a page fault with the following
> implementation ? There is probably a fundamental detail I missed.
>
>
> void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
> {
>         char *vaddr;
>         struct page *pages[1];
>
>         BUG_ON(len > sizeof(long));
>         BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
>                 - ((long)addr & ~(sizeof(long) - 1)));
>         if (kernel_text_address((unsigned long)addr))
>                 pages[0] = virt_to_page(addr);
>         else
>                 vaddr = addr;
>   

What's this for?  You just overwrite vaddr with the vmap on the next line.

>         vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
>   
Do you need to handle the case of an instruction spanning a page boundary?

>         memcpy(&vaddr[(unsigned long)addr & PAGE_MASK], opcode, len);
>         if (kernel_text_address((unsigned long)addr))
>                 vunmap(vaddr);
>         sync_core();
>         /* Could also do a CLFLUSH here to speed up CPU recovery; but
>            that causes hangs on some VIA CPUs. */
>         return addr;
> }
>
>
> [    0.149856] SMP alternatives: switching to UP code                           
> [    0.152009] BUG: unable to handle kernel paging request at b8902000          
>   
That's a userspace address, unless you've got 2G:2G, and the error code 
says there's nothing mapped there.

> [    0.152009] IP: [<c03acfa1>] text_poke+0x85/0xb4                             
> [    0.152009] *pde = 00000000                                                  
> [    0.152009] Oops: 0002 [#1] SMP                                              
> [    0.152009] LTT NESTING LEVEL : 0 <0>                                        
> [    0.152009] Modules linked in:
> [    0.152009] 
> [    0.152009] Pid: 0, comm: swapper Not tainted (2.6.25-rc3-testssmp #744)
> [    0.152009] EIP: 0060:[<c03acfa1>] EFLAGS: 00010002 CPU: 0
> [    0.152009] EIP is at text_poke+0x85/0xb4
> [    0.152009] EAX: f8800000 EBX: 00000001 ECX: 00000001 EDX: f8800000
> [    0.152009] ESI: c04a3fab EDI: b8902000 EBP: c0102114 ESP: c04a3f88
> [    0.152009]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [    0.152009] Process swapper (pid: 0, ti=c04a2000 task=c04703a0 task.ti=c04a2)
> [    0.152009] Stack: 00000163 f8800000 c1002040 c04a400c c04a7cac c0100000 c03 
> [    0.152009]        90411244 00000206 c04e072c c17fb72c 0000672c c04aaefe c03 
> [    0.152009]        c04ab76a c17f5000 c04a896a 00000092 c04a80d7 00000008 000 
> [    0.152009] Call Trace:                                                      
> [    0.152009]  [<c03b0139>] _etext+0x0/0xf7ec7                                 
> [    0.152009]  [<c0107814>] alternatives_smp_unlock+0x57/0x59                  
> [    0.152009]  [<c04aaefe>] alternative_instructions+0x152/0x157               
> [    0.152009]  [<c03b0139>] _etext+0x0/0xf7ec7                                 
> [    0.152009]  [<c04ab76a>] check_bugs+0x131/0x14e                             
> [    0.152009]  [<c04a896a>] start_kernel+0x2d1/0x327                           
> [    0.152009]  [<c04a80d7>] unknown_bootoption+0x0/0x1e3                       
> [    0.152009]  =======================                                         
> [    0.152009] Code: b9 04 00 00 00 ba 01 00 00 00 e8 a6 87 db ff 89 44 24 04 8 
> [    0.152009] EIP: [<c03acfa1>] text_poke+0x85/0xb4 SS:ESP 0068:c04a3f88       
> [    0.152009] ---[ end trace ca143223eefdc828 ]---                             
> [    0.152009] Kernel panic - not syncing: Attempted to kill the idle task!     
>   

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
  2008-03-03 22:15               ` Jeremy Fitzhardinge
@ 2008-03-04  4:12                 ` Mathieu Desnoyers
  0 siblings, 0 replies; 12+ messages in thread
From: Mathieu Desnoyers @ 2008-03-04  4:12 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andi Kleen, H. Peter Anvin, Ingo Molnar,
	Linux Kernel Mailing List, Zachary Amsden

* Jeremy Fitzhardinge (jeremy@goop.org) wrote:
> Mathieu Desnoyers wrote:
>> * Jeremy Fitzhardinge (jeremy@goop.org) wrote:
>>   
>>> Andi Kleen wrote:
>>>     
>>>> On Monday 03 March 2008 18:58:03 Jeremy Fitzhardinge wrote:
>>>>
>>>>         
>>>>> Perhaps, though that's uncached by default.
>>>>>             
>>>> ioremap_cached()
>>>>       
>>> Sure.  But given that from the perspective of this problem ioremap* is 
>>> just a wrapper for vmap, we may as well use it directly and avoid getting 
>>> tangled up in any current or future io-related stuff that ioremap may 
>>> want to do.
>>>
>>>    J
>>>     
>>
>> Would you have a quick hint on why I get a page fault with the following
>> implementation ? There is probably a fundamental detail I missed.
>>
>>
>> void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
>> {
>>         char *vaddr;
>>         struct page *pages[1];
>>
>>         BUG_ON(len > sizeof(long));
>>         BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
>>                 - ((long)addr & ~(sizeof(long) - 1)));
>>         if (kernel_text_address((unsigned long)addr))
>>                 pages[0] = virt_to_page(addr);
>>         else
>>                 vaddr = addr;
>>   
>

Hi Jeremy,

(Connecting brain...)

> What's this for?  You just overwrite vaddr with the vmap on the next line.
>

Ah, it's supposed to be

if (kernel_text_address((unsigned long)addr)) {
  pages[0] = virt_to_page(addr);
  vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
  memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
} else
  memcpy(addr, opcode, len);



>>         vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
>>   
> Do you need to handle the case of an instruction spanning a page boundary?
>

Not really :

1-byte changes :

- kprobes is a 1-byte breakpoint
- smp alternatives changes an f0 prefix into a 1-byte nop when a single
  CPU is active (the opposite is done when bringing up a second CPU).

Aligned on word size :
- immediate values are aligned on word size

At boot time only, before kernel pages are marked RO :
- alternatives and paravirt are applied at boot time only

But yes, I guess it's cheap and allows supporting users which does not
have their instructions aligned. I'll change it.

>>         memcpy(&vaddr[(unsigned long)addr & PAGE_MASK], opcode, len);
>>         if (kernel_text_address((unsigned long)addr))
>>                 vunmap(vaddr);
>>         sync_core();
>>         /* Could also do a CLFLUSH here to speed up CPU recovery; but
>>            that causes hangs on some VIA CPUs. */
>>         return addr;
>> }
>>
>>
>> [    0.149856] SMP alternatives: switching to UP code                      
>>      [    0.152009] BUG: unable to handle kernel paging request at 
>> b8902000            
> That's a userspace address, unless you've got 2G:2G, and the error code 
> says there's nothing mapped there.
>

<banging head on the wall> & PAGE_MASK -> & ~PAGE_MASK  :)

Thanks!

Mathieu

>> [    0.152009] IP: [<c03acfa1>] text_poke+0x85/0xb4                        
>>      [    0.152009] *pde = 00000000                                        
>>           [    0.152009] Oops: 0002 [#1] SMP                               
>>                [    0.152009] LTT NESTING LEVEL : 0 <0>                    
>>                     [    0.152009] Modules linked in:
>> [    0.152009] [    0.152009] Pid: 0, comm: swapper Not tainted 
>> (2.6.25-rc3-testssmp #744)
>> [    0.152009] EIP: 0060:[<c03acfa1>] EFLAGS: 00010002 CPU: 0
>> [    0.152009] EIP is at text_poke+0x85/0xb4
>> [    0.152009] EAX: f8800000 EBX: 00000001 ECX: 00000001 EDX: f8800000
>> [    0.152009] ESI: c04a3fab EDI: b8902000 EBP: c0102114 ESP: c04a3f88
>> [    0.152009]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> [    0.152009] Process swapper (pid: 0, ti=c04a2000 task=c04703a0 
>> task.ti=c04a2)
>> [    0.152009] Stack: 00000163 f8800000 c1002040 c04a400c c04a7cac 
>> c0100000 c03 [    0.152009]        90411244 00000206 c04e072c c17fb72c 
>> 0000672c c04aaefe c03 [    0.152009]        c04ab76a c17f5000 c04a896a 
>> 00000092 c04a80d7 00000008 000 [    0.152009] Call Trace:                  
>>                                     [    0.152009]  [<c03b0139>] 
>> _etext+0x0/0xf7ec7                                 [    0.152009]  
>> [<c0107814>] alternatives_smp_unlock+0x57/0x59                  [    
>> 0.152009]  [<c04aaefe>] alternative_instructions+0x152/0x157               
>> [    0.152009]  [<c03b0139>] _etext+0x0/0xf7ec7                            
>>      [    0.152009]  [<c04ab76a>] check_bugs+0x131/0x14e                   
>>           [    0.152009]  [<c04a896a>] start_kernel+0x2d1/0x327            
>>                [    0.152009]  [<c04a80d7>] unknown_bootoption+0x0/0x1e3   
>>                     [    0.152009]  =======================                
>>                          [    0.152009] Code: b9 04 00 00 00 ba 01 00 00 
>> 00 e8 a6 87 db ff 89 44 24 04 8 [    0.152009] EIP: [<c03acfa1>] 
>> text_poke+0x85/0xb4 SS:ESP 0068:c04a3f88       [    0.152009] ---[ end 
>> trace ca143223eefdc828 ]---                             [    0.152009] 
>> Kernel panic - not syncing: Attempted to kill the idle task!       
>
>    J

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2008-03-04  4:12 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-29 20:39 bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives" Jeremy Fitzhardinge
2008-02-29 20:56 ` H. Peter Anvin
2008-02-29 21:09   ` Jeremy Fitzhardinge
2008-03-03 17:30     ` Mathieu Desnoyers
2008-03-03 17:35       ` Andi Kleen
2008-03-03 17:58       ` Jeremy Fitzhardinge
2008-03-03 18:03         ` Andi Kleen
2008-03-03 18:05           ` Jeremy Fitzhardinge
2008-03-03 20:53             ` Mathieu Desnoyers
2008-03-03 22:15               ` Jeremy Fitzhardinge
2008-03-04  4:12                 ` Mathieu Desnoyers
2008-02-29 21:24 ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).