LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Yinghai Lu <yinghai@kernel.org>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"x86@kernel.org" <x86@kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Jan Beulich <JBeulich@novell.com>
Subject: Re: [PATCH] x86/mm/init: respect memblock reserved regions when destroying mappings
Date: Thu, 10 Feb 2011 16:54:45 -0800	[thread overview]
Message-ID: <4D5488D5.2020607@kernel.org> (raw)
In-Reply-To: <4D54844B.4010007@zytor.com>

On 02/10/2011 04:35 PM, H. Peter Anvin wrote:
> On 02/10/2011 03:57 PM, Yinghai Lu wrote:
>> On 02/10/2011 03:48 PM, Jeremy Fitzhardinge wrote:
>>> On 02/08/2011 11:34 AM, H. Peter Anvin wrote:
>>>> On 02/07/2011 07:12 PM, Yinghai Lu wrote:
>>>>> why punishing native path with those checking?
>>>>>
>>>> What happens if you end up with a reserved range in an unfortunate place
>>>> on real hardware?
>>>
>>> Yes, exactly.  The reserved region code isn't very useful if you can't
>>> rely on it to reserve stuff.
>>
>> assume context is under:
>> moving cleanup_highmap() down after brk is concluded, and check memblock_reserved there.
>>
>> one case for that: native path, bootloader could put initrd under 512M. and it is with memblock reserved.
>> if we check those range with memblock_reserved, initial kernel mapping will not be cleaned up.
>>
>> or worse if we are checking if there is any range from __pa(_brk_end) to 512M is with memblock reserved to decide
>> if we need to clean-up highmap. it will skip for whole range.
>>
> 
> I'm afraid I simply can't parse the above.

1. we have patch that will move down cleanup_highmap, and it will clean initial mapping from _brk_end to 512M (before we have two steps: clear _end to 512M and then _brk_end to _end)

2. So checking memblock_reserved with _brk_end to 512M will cause problem:
   a. will check 256 times less.
   b. if bootloader put initrd ramdisk overlapped with [_brk_end++, 512M), and overlap range will make clean_highmap bail out early. because those range is memblock_reserved.

BTW: Do we really need to cleanup initial mapping between _brk_end to _end?

origin patch from jan:

commit 498343967613183611ac37dccb2846496d954c06
Author: Jan Beulich <jbeulich@novell.com>
Date:   Wed May 6 13:06:47 2009 +0100

    x86-64: finish cleanup_highmaps()'s job wrt. _brk_end
    
    With the introduction of the .brk section, special care must be taken
    that no unused page table entries remain if _brk_end and _end are
    separated by a 2M page boundary. cleanup_highmap() runs very early and
    hence cannot take care of that, hence potential entries needing to be
    removed past _brk_end must be cleared once the brk allocator has done
    its job.
    
    [ Impact: avoids undesirable TLB aliases ]
    
    Signed-off-by: Jan Beulich <jbeulich@novell.com>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index fd3da1d..ae4f7b5 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -7,6 +7,7 @@
 #include <asm/page.h>
 #include <asm/page_types.h>
 #include <asm/sections.h>
+#include <asm/setup.h>
 #include <asm/system.h>
 #include <asm/tlbflush.h>
 
@@ -304,8 +305,23 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 #endif
 
 #ifdef CONFIG_X86_64
-       if (!after_bootmem)
+       if (!after_bootmem && !start) {
+               pud_t *pud;
+               pmd_t *pmd;
+
                mmu_cr4_features = read_cr4();
+
+               /*
+                * _brk_end cannot change anymore, but it and _end may be
+                * located on different 2M pages. cleanup_highmap(), however,
+                * can only consider _end when it runs, so destroy any
+                * mappings beyond _brk_end here.
+                */
+               pud = pud_offset(pgd_offset_k(_brk_end), _brk_end);
+               pmd = pmd_offset(pud, _brk_end - 1);
+               while (++pmd <= pmd_offset(pud, (unsigned long)_end - 1))
+                       pmd_clear(pmd);
+       }
 #endif
        __flush_tlb_all();
 


  reply	other threads:[~2011-02-11  0:55 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-31 15:18 Stefano Stabellini
2011-02-02 20:15 ` Jeremy Fitzhardinge
2011-02-03  5:05 ` H. Peter Anvin
2011-02-03 11:25   ` Stefano Stabellini
2011-02-03 17:02     ` H. Peter Anvin
2011-02-04 11:35       ` Stefano Stabellini
2011-02-05  1:18         ` Jeremy Fitzhardinge
2011-02-06  7:02           ` Yinghai Lu
2011-02-06  7:30             ` H. Peter Anvin
2011-02-06 17:49               ` Yinghai Lu
2011-02-06 19:24               ` Yinghai Lu
2011-02-07 16:50                 ` Stefano Stabellini
2011-02-07 18:04                   ` Yinghai Lu
2011-02-07 18:58                     ` Stefano Stabellini
2011-02-07 19:00                       ` Yinghai Lu
2011-02-07 19:18                         ` Yinghai Lu
2011-02-07 21:56                         ` Jeremy Fitzhardinge
2011-02-08  3:12                           ` Yinghai Lu
2011-02-08  4:56                             ` Jeremy Fitzhardinge
2011-02-08  5:09                               ` Yinghai Lu
2011-02-08 14:55                                 ` Konrad Rzeszutek Wilk
2011-02-08 19:24                                   ` Jeremy Fitzhardinge
2011-02-08 20:26                                     ` Stefano Stabellini
2011-02-08 19:34                             ` H. Peter Anvin
2011-02-10 23:48                               ` Jeremy Fitzhardinge
2011-02-10 23:57                                 ` Yinghai Lu
2011-02-11  0:35                                   ` H. Peter Anvin
2011-02-11  0:54                                     ` Yinghai Lu [this message]
2011-02-14 16:26                                       ` Konrad Rzeszutek Wilk
2011-02-14 17:55                                         ` Yinghai Lu
2011-02-14 17:58                                           ` Stefano Stabellini
2011-02-14 18:09                                             ` Yinghai Lu
2011-02-14 20:02                                           ` H. Peter Anvin
2011-02-16 17:36                                             ` Stefano Stabellini
2011-02-07 19:00                   ` Stefano Stabellini
2011-02-08  5:16                     ` Yinghai Lu
2011-02-08 14:03                       ` Stefano Stabellini
2011-02-08 16:04                         ` Yinghai Lu
2011-02-07 16:02           ` Stefano Stabellini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D5488D5.2020607@kernel.org \
    --to=yinghai@kernel.org \
    --cc=JBeulich@novell.com \
    --cc=hpa@zytor.com \
    --cc=jeremy@goop.org \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --subject='Re: [PATCH] x86/mm/init: respect memblock reserved regions when destroying mappings' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).