LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@elte.hu>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Zachary Amsden <zach@vmware.com>
Subject: Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"
Date: Mon, 3 Mar 2008 23:12:30 -0500	[thread overview]
Message-ID: <20080304041230.GA22753@Krystal> (raw)
In-Reply-To: <47CC7897.1080703@goop.org>

* Jeremy Fitzhardinge (jeremy@goop.org) wrote:
> Mathieu Desnoyers wrote:
>> * Jeremy Fitzhardinge (jeremy@goop.org) wrote:
>>   
>>> Andi Kleen wrote:
>>>     
>>>> On Monday 03 March 2008 18:58:03 Jeremy Fitzhardinge wrote:
>>>>
>>>>         
>>>>> Perhaps, though that's uncached by default.
>>>>>             
>>>> ioremap_cached()
>>>>       
>>> Sure.  But given that from the perspective of this problem ioremap* is 
>>> just a wrapper for vmap, we may as well use it directly and avoid getting 
>>> tangled up in any current or future io-related stuff that ioremap may 
>>> want to do.
>>>
>>>    J
>>>     
>>
>> Would you have a quick hint on why I get a page fault with the following
>> implementation ? There is probably a fundamental detail I missed.
>>
>>
>> void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
>> {
>>         char *vaddr;
>>         struct page *pages[1];
>>
>>         BUG_ON(len > sizeof(long));
>>         BUG_ON((((long)addr + len - 1) & ~(sizeof(long) - 1))
>>                 - ((long)addr & ~(sizeof(long) - 1)));
>>         if (kernel_text_address((unsigned long)addr))
>>                 pages[0] = virt_to_page(addr);
>>         else
>>                 vaddr = addr;
>>   
>

Hi Jeremy,

(Connecting brain...)

> What's this for?  You just overwrite vaddr with the vmap on the next line.
>

Ah, it's supposed to be

if (kernel_text_address((unsigned long)addr)) {
  pages[0] = virt_to_page(addr);
  vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
  memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
} else
  memcpy(addr, opcode, len);



>>         vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
>>   
> Do you need to handle the case of an instruction spanning a page boundary?
>

Not really :

1-byte changes :

- kprobes is a 1-byte breakpoint
- smp alternatives changes an f0 prefix into a 1-byte nop when a single
  CPU is active (the opposite is done when bringing up a second CPU).

Aligned on word size :
- immediate values are aligned on word size

At boot time only, before kernel pages are marked RO :
- alternatives and paravirt are applied at boot time only

But yes, I guess it's cheap and allows supporting users which does not
have their instructions aligned. I'll change it.

>>         memcpy(&vaddr[(unsigned long)addr & PAGE_MASK], opcode, len);
>>         if (kernel_text_address((unsigned long)addr))
>>                 vunmap(vaddr);
>>         sync_core();
>>         /* Could also do a CLFLUSH here to speed up CPU recovery; but
>>            that causes hangs on some VIA CPUs. */
>>         return addr;
>> }
>>
>>
>> [    0.149856] SMP alternatives: switching to UP code                      
>>      [    0.152009] BUG: unable to handle kernel paging request at 
>> b8902000            
> That's a userspace address, unless you've got 2G:2G, and the error code 
> says there's nothing mapped there.
>

<banging head on the wall> & PAGE_MASK -> & ~PAGE_MASK  :)

Thanks!

Mathieu

>> [    0.152009] IP: [<c03acfa1>] text_poke+0x85/0xb4                        
>>      [    0.152009] *pde = 00000000                                        
>>           [    0.152009] Oops: 0002 [#1] SMP                               
>>                [    0.152009] LTT NESTING LEVEL : 0 <0>                    
>>                     [    0.152009] Modules linked in:
>> [    0.152009] [    0.152009] Pid: 0, comm: swapper Not tainted 
>> (2.6.25-rc3-testssmp #744)
>> [    0.152009] EIP: 0060:[<c03acfa1>] EFLAGS: 00010002 CPU: 0
>> [    0.152009] EIP is at text_poke+0x85/0xb4
>> [    0.152009] EAX: f8800000 EBX: 00000001 ECX: 00000001 EDX: f8800000
>> [    0.152009] ESI: c04a3fab EDI: b8902000 EBP: c0102114 ESP: c04a3f88
>> [    0.152009]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
>> [    0.152009] Process swapper (pid: 0, ti=c04a2000 task=c04703a0 
>> task.ti=c04a2)
>> [    0.152009] Stack: 00000163 f8800000 c1002040 c04a400c c04a7cac 
>> c0100000 c03 [    0.152009]        90411244 00000206 c04e072c c17fb72c 
>> 0000672c c04aaefe c03 [    0.152009]        c04ab76a c17f5000 c04a896a 
>> 00000092 c04a80d7 00000008 000 [    0.152009] Call Trace:                  
>>                                     [    0.152009]  [<c03b0139>] 
>> _etext+0x0/0xf7ec7                                 [    0.152009]  
>> [<c0107814>] alternatives_smp_unlock+0x57/0x59                  [    
>> 0.152009]  [<c04aaefe>] alternative_instructions+0x152/0x157               
>> [    0.152009]  [<c03b0139>] _etext+0x0/0xf7ec7                            
>>      [    0.152009]  [<c04ab76a>] check_bugs+0x131/0x14e                   
>>           [    0.152009]  [<c04a896a>] start_kernel+0x2d1/0x327            
>>                [    0.152009]  [<c04a80d7>] unknown_bootoption+0x0/0x1e3   
>>                     [    0.152009]  =======================                
>>                          [    0.152009] Code: b9 04 00 00 00 ba 01 00 00 
>> 00 e8 a6 87 db ff 89 44 24 04 8 [    0.152009] EIP: [<c03acfa1>] 
>> text_poke+0x85/0xb4 SS:ESP 0068:c04a3f88       [    0.152009] ---[ end 
>> trace ca143223eefdc828 ]---                             [    0.152009] 
>> Kernel panic - not syncing: Attempted to kill the idle task!       
>
>    J

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2008-03-04  4:12 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-29 20:39 Jeremy Fitzhardinge
2008-02-29 20:56 ` H. Peter Anvin
2008-02-29 21:09   ` Jeremy Fitzhardinge
2008-03-03 17:30     ` Mathieu Desnoyers
2008-03-03 17:35       ` Andi Kleen
2008-03-03 17:58       ` Jeremy Fitzhardinge
2008-03-03 18:03         ` Andi Kleen
2008-03-03 18:05           ` Jeremy Fitzhardinge
2008-03-03 20:53             ` Mathieu Desnoyers
2008-03-03 22:15               ` Jeremy Fitzhardinge
2008-03-04  4:12                 ` Mathieu Desnoyers [this message]
2008-02-29 21:24 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080304041230.GA22753@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=ak@suse.de \
    --cc=hpa@zytor.com \
    --cc=jeremy@goop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=zach@vmware.com \
    --subject='Re: bad paravirt/Xen interaction in "x86 - Enhance DEBUG_RODATA support - alternatives"' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).