LKML Archive on lore.kernel.org
* [PATCH] [0/8] GB pages (PDP1GB) support for the kernel direct mapping
From: Andi Kleen @ 2008-01-03 17:26 UTC
To: linux-kernel
This patchkit implements GB pages support for AMD Fam10h CPUs. For now it
only covers the kernel direct mapping; support for hugetlbfs is upcoming.
This allows the kernel direct mapping to use 1GB TLB entries instead of
2MB ones, which should mean fewer TLB misses for the kernel.
GB pages are implemented only for 64-bit, because the CPU supports them
only in long mode, and only for data pages, because Fam10h has no GB ITLBs
and AMD recommends against running code in them.
There is an option to turn them off (direct_gbpages=off), although I hope that
won't be needed.
Also includes one generic bug fix for clear_kernel_mapping.
-Andi
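For reference, the CPUID bit the kit keys off can be probed from user space
too. A minimal standalone sketch, not part of the patchkit, assuming the CPU
implements the extended CPUID leaves; it checks leaf 0x80000001, EDX bit 26,
the bit patch 2 encodes as X86_FEATURE_GBPAGES (1*32+26):

#include <stdio.h>

int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        /* CPUID leaf 0x80000001: AMD extended processor features */
        asm volatile("cpuid"
                     : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
                     : "a" (0x80000001U));

        /* EDX bit 26 is the 1GB-pages (pdpe1gb) capability */
        printf("GB pages %ssupported\n", (edx & (1U << 26)) ? "" : "not ");
        return 0;
}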
* [PATCH] [1/8] GBPAGES: Handle kernel near memory hole in clear_kernel_mapping
From: Andi Kleen @ 2008-01-03 17:26 UTC
To: ebiederm, vgoyal, linux-kernel
This was a long-standing, obscure problem in the relocatable kernel. The
AMD GART driver needs to unmap part of the GART in the kernel direct mapping
to prevent cache corruption. With the relocatable kernel the separate kernel
text mapping can, in theory, straddle that area too.
Normally this should not happen, because the GART tends to sit at >= 2GB and
the kernel is normally not loaded that high, but it is possible.
Teach clear_kernel_mapping() about this case.
This will become more important once the kernel mapping uses 1GB pages.
Cc: ebiederm@xmission.com
Cc: vgoyal@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86/mm/init_64.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
Index: linux/arch/x86/mm/init_64.c
===================================================================
--- linux.orig/arch/x86/mm/init_64.c
+++ linux/arch/x86/mm/init_64.c
@@ -411,7 +411,8 @@ void __init paging_init(void)
from the CPU leading to inconsistent cache lines. address and size
must be aligned to 2MB boundaries.
Does nothing when the mapping doesn't exist. */
-void __init clear_kernel_mapping(unsigned long address, unsigned long size)
+static void __init
+__clear_kernel_mapping(unsigned long address, unsigned long size)
{
unsigned long end = address + size;
@@ -441,6 +442,23 @@ void __init clear_kernel_mapping(unsigne
__flush_tlb_all();
}
+#define overlaps(as,ae,bs,be) ((ae) >= (bs) && (as) <= (be))
+
+void __init clear_kernel_mapping(unsigned long address, unsigned long size)
+{
+ int sh = PMD_SHIFT;
+ unsigned long kernel = __pa(__START_KERNEL_map);
+
+ if (overlaps(kernel>>sh, (kernel + KERNEL_TEXT_SIZE)>>sh,
+ __pa(address)>>sh, __pa(address + size)>>sh)) {
+ printk(KERN_INFO
+ "Kernel at %lx overlaps memory hole at %lx-%lx\n",
+ kernel, __pa(address), __pa(address+size));
+ __clear_kernel_mapping(__START_KERNEL_map+__pa(address), size);
+ }
+ __clear_kernel_mapping(address, size);
+}
+
/*
* Memory hotplug specific functions
*/
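To make the aliasing concrete: the same physical range can be reachable both
through the direct mapping and through the separate kernel text mapping,
which is why the overlap case above clears both. A standalone sketch using
the 2.6.24-era x86-64 constants and a made-up aperture address:

#define PAGE_OFFSET         0xffff810000000000UL /* direct mapping base */
#define __START_KERNEL_map  0xffffffff80000000UL /* kernel text mapping base */

/* assumed physical address of a GART aperture, for illustration only */
static unsigned long aperture_phys = 0x80000000UL;

/* the virtual alias callers pass in (direct mapping) */
static unsigned long direct_alias(void)
{
        return PAGE_OFFSET + aperture_phys;
}

/* the alias clear_kernel_mapping() additionally clears on overlap */
static unsigned long text_alias(void)
{
        return __START_KERNEL_map + aperture_phys;
}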
* [PATCH] [2/8] GBPAGES: Add feature macros for the gbpages cpuid bit
From: Andi Kleen @ 2008-01-03 17:26 UTC
To: linux-kernel
Signed-off-by: Andi Kleen <ak@suse.de>
---
include/asm-x86/cpufeature.h | 2 ++
1 file changed, 2 insertions(+)
Index: linux/include/asm-x86/cpufeature.h
===================================================================
--- linux.orig/include/asm-x86/cpufeature.h
+++ linux/include/asm-x86/cpufeature.h
@@ -49,6 +49,7 @@
#define X86_FEATURE_MP (1*32+19) /* MP Capable. */
#define X86_FEATURE_NX (1*32+20) /* Execute Disable */
#define X86_FEATURE_MMXEXT (1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_GBPAGES (1*32+26) /* GB pages */
#define X86_FEATURE_RDTSCP (1*32+27) /* RDTSCP */
#define X86_FEATURE_LM (1*32+29) /* Long Mode (x86-64) */
#define X86_FEATURE_3DNOWEXT (1*32+30) /* AMD 3DNow! extensions */
@@ -168,6 +169,7 @@
#define cpu_has_clflush boot_cpu_has(X86_FEATURE_CLFLSH)
#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)
#define cpu_has_ss boot_cpu_has(X86_FEATURE_SELFSNOOP)
+#define cpu_has_gbpages boot_cpu_has(X86_FEATURE_GBPAGES)
#if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64)
# define cpu_has_invlpg 1
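The (1*32+26) value is a packed (word, bit) pair: word 1 of the kernel's
capability array mirrors the AMD extended CPUID leaf's EDX, and 26 is the
bit inside it. A simplified sketch of the decoding that boot_cpu_has()
ultimately performs:

#define X86_FEATURE_GBPAGES (1*32 + 26) /* word 1 (0x80000001 EDX), bit 26 */

/* simplified stand-in for the real cpu_has() machinery */
static inline int test_feature(const unsigned int *caps, int feature)
{
        return (caps[feature / 32] >> (feature % 32)) & 1;
}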
* [PATCH] [3/8] GBPAGES: Split LARGE_PAGE_SIZE/MASK into PUD_PAGE_SIZE/PMD_PAGE_SIZE
From: Andi Kleen @ 2008-01-03 17:26 UTC
To: linux-kernel
Split the existing LARGE_PAGE_SIZE/MASK macro into two new macros
PUD_PAGE_SIZE/MASK and PMD_PAGE_SIZE/MASK.
Fix up all callers to use the new names.
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86/boot/compressed/head_64.S | 4 ++--
arch/x86/kernel/head_64.S | 4 ++--
arch/x86/kernel/pci-gart_64.c | 2 +-
arch/x86/mm/init_64.c | 6 +++---
arch/x86/mm/pageattr_64.c | 4 ++--
include/asm-x86/page_64.h | 7 +++++--
6 files changed, 15 insertions(+), 12 deletions(-)
Index: linux/include/asm-x86/page_64.h
===================================================================
--- linux.orig/include/asm-x86/page_64.h
+++ linux/include/asm-x86/page_64.h
@@ -29,8 +29,11 @@
#define MCE_STACK 5
#define N_EXCEPTION_STACKS 5 /* hw limit: 7 */
-#define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
-#define LARGE_PAGE_SIZE (_AC(1,UL) << PMD_SHIFT)
+#define PMD_PAGE_SIZE (_AC(1,UL) << PMD_SHIFT)
+#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1))
+
+#define PUD_PAGE_SIZE (_AC(1,UL) << PUD_SHIFT)
+#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1))
#define HPAGE_SHIFT PMD_SHIFT
#define HPAGE_SIZE (_AC(1,UL) << HPAGE_SHIFT)
Index: linux/arch/x86/boot/compressed/head_64.S
===================================================================
--- linux.orig/arch/x86/boot/compressed/head_64.S
+++ linux/arch/x86/boot/compressed/head_64.S
@@ -80,8 +80,8 @@ startup_32:
#ifdef CONFIG_RELOCATABLE
movl %ebp, %ebx
- addl $(LARGE_PAGE_SIZE -1), %ebx
- andl $LARGE_PAGE_MASK, %ebx
+ addl $(PMD_PAGE_SIZE -1), %ebx
+ andl $PMD_PAGE_MASK, %ebx
#else
movl $CONFIG_PHYSICAL_START, %ebx
#endif
Index: linux/arch/x86/kernel/pci-gart_64.c
===================================================================
--- linux.orig/arch/x86/kernel/pci-gart_64.c
+++ linux/arch/x86/kernel/pci-gart_64.c
@@ -501,7 +501,7 @@ static __init unsigned long check_iommu_
}
a = aper + iommu_size;
- iommu_size -= round_up(a, LARGE_PAGE_SIZE) - a;
+ iommu_size -= round_up(a, PMD_PAGE_SIZE) - a;
if (iommu_size < 64*1024*1024) {
printk(KERN_WARNING
Index: linux/arch/x86/kernel/head_64.S
===================================================================
--- linux.orig/arch/x86/kernel/head_64.S
+++ linux/arch/x86/kernel/head_64.S
@@ -63,7 +63,7 @@ startup_64:
/* Is the address not 2M aligned? */
movq %rbp, %rax
- andl $~LARGE_PAGE_MASK, %eax
+ andl $~PMD_PAGE_MASK, %eax
testl %eax, %eax
jnz bad_address
@@ -88,7 +88,7 @@ startup_64:
/* Add an Identity mapping if I am above 1G */
leaq _text(%rip), %rdi
- andq $LARGE_PAGE_MASK, %rdi
+ andq $PMD_PAGE_MASK, %rdi
movq %rdi, %rax
shrq $PUD_SHIFT, %rax
Index: linux/arch/x86/mm/init_64.c
===================================================================
--- linux.orig/arch/x86/mm/init_64.c
+++ linux/arch/x86/mm/init_64.c
@@ -416,10 +416,10 @@ __clear_kernel_mapping(unsigned long add
{
unsigned long end = address + size;
- BUG_ON(address & ~LARGE_PAGE_MASK);
- BUG_ON(size & ~LARGE_PAGE_MASK);
+ BUG_ON(address & ~PMD_PAGE_MASK);
+ BUG_ON(size & ~PMD_PAGE_MASK);
- for (; address < end; address += LARGE_PAGE_SIZE) {
+ for (; address < end; address += PMD_PAGE_SIZE) {
pgd_t *pgd = pgd_offset_k(address);
pud_t *pud;
pmd_t *pmd;
Index: linux/arch/x86/mm/pageattr_64.c
===================================================================
--- linux.orig/arch/x86/mm/pageattr_64.c
+++ linux/arch/x86/mm/pageattr_64.c
@@ -70,7 +70,7 @@ static struct page *split_large_page(uns
page_private(base) = 0;
address = __pa(address);
- addr = address & LARGE_PAGE_MASK;
+ addr = address & PMD_PAGE_MASK;
pbase = (pte_t *)page_address(base);
for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE) {
pbase[i] = pfn_pte(addr >> PAGE_SHIFT,
@@ -150,7 +150,7 @@ static void revert_page(unsigned long ad
BUG_ON(pud_none(*pud));
pmd = pmd_offset(pud, address);
BUG_ON(pmd_val(*pmd) & _PAGE_PSE);
- pfn = (__pa(address) & LARGE_PAGE_MASK) >> PAGE_SHIFT;
+ pfn = (__pa(address) & PMD_PAGE_MASK) >> PAGE_SHIFT;
large_pte = pfn_pte(pfn, ref_prot);
large_pte = pte_mkhuge(large_pte);
set_pte((pte_t *)pmd, large_pte);
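With the standard x86-64 shifts (PMD_SHIFT = 21, PUD_SHIFT = 30) the new
macros come out to 2MB and 1GB as expected; a quick standalone sanity check:

#include <stdio.h>

#define PMD_SHIFT 21 /* x86-64 value */
#define PUD_SHIFT 30 /* x86-64 value */

#define PMD_PAGE_SIZE (1UL << PMD_SHIFT)
#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE - 1))
#define PUD_PAGE_SIZE (1UL << PUD_SHIFT)
#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE - 1))

int main(void)
{
        /* prints 0x200000 (2MB, the old LARGE_PAGE_SIZE) and 0x40000000 (1GB) */
        printf("PMD: size %#lx mask %#lx\n", PMD_PAGE_SIZE, PMD_PAGE_MASK);
        printf("PUD: size %#lx mask %#lx\n", PUD_PAGE_SIZE, PUD_PAGE_MASK);
        return 0;
}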
* [PATCH] [4/8] GBPAGES: Add pgtable accessor functions for GB pages
From: Andi Kleen @ 2008-01-03 17:27 UTC
To: linux-kernel
Signed-off-by: Andi Kleen <ak@suse.de>
---
include/asm-x86/pgtable_64.h | 3 +++
1 file changed, 3 insertions(+)
Index: linux/include/asm-x86/pgtable_64.h
===================================================================
--- linux.orig/include/asm-x86/pgtable_64.h
+++ linux/include/asm-x86/pgtable_64.h
@@ -320,6 +320,9 @@ static inline int pmd_large(pmd_t pte) {
return (pmd_val(pte) & __LARGE_PTE) == __LARGE_PTE;
}
+static inline int pud_large(pud_t pte) {
+ return (pud_val(pte) & __LARGE_PTE) == __LARGE_PTE;
+}
/*
* Conversion functions: convert a page and protection to a page entry,
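pud_large() deliberately mirrors pmd_large() above it. A sketch of the
underlying test, assuming the 2.6.24-era definition
__LARGE_PTE = (_PAGE_PSE | _PAGE_PRESENT):

#define _PAGE_PRESENT 0x001UL
#define _PAGE_PSE     0x080UL /* "page size" bit */
#define __LARGE_PTE   (_PAGE_PSE | _PAGE_PRESENT)

/* an entry is a huge mapping only if it is present AND has PSE set */
static inline int entry_is_large(unsigned long val)
{
        return (val & __LARGE_PTE) == __LARGE_PTE;
}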
* [PATCH] [5/8] GBPAGES: Support gbpages in pagetable dump
From: Andi Kleen @ 2008-01-03 17:27 UTC
To: linux-kernel
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86/mm/fault_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux/arch/x86/mm/fault_64.c
===================================================================
--- linux.orig/arch/x86/mm/fault_64.c
+++ linux/arch/x86/mm/fault_64.c
@@ -288,7 +288,7 @@ void dump_pagetable(unsigned long addres
pud = pud_offset(pgd, address);
if (bad_address(pud)) goto bad;
printk("PUD %lx ", pud_val(*pud));
- if (!pud_present(*pud)) goto ret;
+ if (!pud_present(*pud) || pud_large(*pud)) goto ret;
pmd = pmd_offset(pud, address);
if (bad_address(pmd)) goto bad;
* [PATCH] [6/8] GBPAGES: Add an option to disable direct mapping gbpages and a global variable
From: Andi Kleen @ 2008-01-03 17:27 UTC
To: linux-kernel
Signed-off-by: Andi Kleen <ak@suse.de>
---
Documentation/x86_64/boot-options.txt | 3 +++
arch/x86/mm/init_64.c | 12 ++++++++++++
include/asm-x86/pgtable_64.h | 2 ++
3 files changed, 17 insertions(+)
Index: linux/arch/x86/mm/init_64.c
===================================================================
--- linux.orig/arch/x86/mm/init_64.c
+++ linux/arch/x86/mm/init_64.c
@@ -57,6 +57,18 @@ static unsigned long dma_reserve __initd
DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
+int direct_gbpages;
+
+static int __init parse_direct_gbpages(char *arg)
+{
+ if (!strcmp(arg, "off")) {
+ direct_gbpages = -1;
+ return 0;
+ }
+ return -1;
+}
+early_param("direct_gbpages", parse_direct_gbpages);
+
/*
* NOTE: pagetable_init alloc all the fixmap pagetables contiguous on the
* physical space so we can cache the place of the first one and move
Index: linux/include/asm-x86/pgtable_64.h
===================================================================
--- linux.orig/include/asm-x86/pgtable_64.h
+++ linux/include/asm-x86/pgtable_64.h
@@ -408,6 +408,8 @@ static inline pte_t pte_modify(pte_t pte
__changed; \
})
+extern int direct_gbpages;
+
/* Encode and de-code a swap entry */
#define __swp_type(x) (((x).val >> 1) & 0x3f)
#define __swp_offset(x) ((x).val >> 8)
Index: linux/Documentation/x86_64/boot-options.txt
===================================================================
--- linux.orig/Documentation/x86_64/boot-options.txt
+++ linux/Documentation/x86_64/boot-options.txt
@@ -307,3 +307,6 @@ Debugging
stuck (default)
Miscellaneous
+
+ direct_gbpages=off
+ Do not use GB pages for kernel direct mapping.
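The variable ends up tri-state: the parser above stores -1 for
direct_gbpages=off, it stays 0 while undecided, and patch 8 later promotes
it to 1 when the CPU qualifies. A sketch of that protocol, simplified from
the boot-time code in patch 8:

static int direct_gbpages; /* -1: forced off, 0: undecided, 1: in use */

static void decide_gbpages(int cpu_supports_gbpages)
{
        if (direct_gbpages >= 0 && cpu_supports_gbpages)
                direct_gbpages = 1;  /* map with 1GB PUD entries */
        else
                direct_gbpages = 0;  /* fall back to 2MB PMD entries */
}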
* [PATCH] [7/8] GBPAGES: Implement GBpages support in change_page_attr()
From: Andi Kleen @ 2008-01-03 17:27 UTC
To: linux-kernel
Teach c_p_a() to split and unsplit GB pages.
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86/mm/pageattr_64.c | 149 ++++++++++++++++++++++++++++++++++++----------
1 file changed, 118 insertions(+), 31 deletions(-)
Index: linux/arch/x86/mm/pageattr_64.c
===================================================================
--- linux.orig/arch/x86/mm/pageattr_64.c
+++ linux/arch/x86/mm/pageattr_64.c
@@ -14,6 +14,8 @@
#include <asm/io.h>
#include <asm/kdebug.h>
+#define Cprintk(x...)
+
enum flush_mode { FLUSH_NONE, FLUSH_CACHE, FLUSH_TLB };
struct flush {
@@ -40,6 +42,9 @@ pte_t *lookup_address(unsigned long addr
pud = pud_offset(pgd, address);
if (!pud_present(*pud))
return NULL;
+ *level = 2;
+ if (pud_large(*pud))
+ return (pte_t *)pud;
pmd = pmd_offset(pud, address);
if (!pmd_present(*pmd))
return NULL;
@@ -53,30 +58,88 @@ pte_t *lookup_address(unsigned long addr
return pte;
}
-static struct page *split_large_page(unsigned long address, pgprot_t prot,
- pgprot_t ref_prot)
-{
- int i;
+static pte_t *alloc_split_page(struct page **base)
+{
+ struct page *p = alloc_page(GFP_KERNEL);
+ if (!p)
+ return NULL;
+ SetPagePrivate(p);
+ page_private(p) = 0;
+ *base = p;
+ return page_address(p);
+}
+
+static struct page *free_split_page(struct page *base)
+{
+ BUG_ON(!PagePrivate(base));
+ BUG_ON(page_private(base) != 0);
+ ClearPagePrivate(base);
+ __free_page(base);
+ return NULL;
+}
+
+static struct page *
+split_pmd(unsigned long paddr, pgprot_t prot, pgprot_t ref_prot)
+{
+ int i;
unsigned long addr;
- struct page *base = alloc_pages(GFP_KERNEL, 0);
- pte_t *pbase;
- if (!base)
+ struct page *base;
+ pte_t *pbase = alloc_split_page(&base);
+ if (!pbase)
return NULL;
- /*
- * page_private is used to track the number of entries in
- * the page table page have non standard attributes.
- */
- SetPagePrivate(base);
- page_private(base) = 0;
- address = __pa(address);
- addr = address & PMD_PAGE_MASK;
- pbase = (pte_t *)page_address(base);
- for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE) {
- pbase[i] = pfn_pte(addr >> PAGE_SHIFT,
- addr == address ? prot : ref_prot);
+ Cprintk("cpa split l3 paddr %lx\n", paddr);
+ addr = paddr & PMD_PAGE_MASK;
+ for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE)
+ pbase[i] = pfn_pte(addr >> PAGE_SHIFT,
+ addr == paddr ? prot : ref_prot);
+
+ return base;
+}
+
+static struct page *
+split_gb(unsigned long paddr, pgprot_t prot, pgprot_t ref_prot)
+{
+ unsigned long addr;
+ int i;
+ struct page *base;
+ pte_t *pbase = alloc_split_page(&base);
+
+ if (!pbase)
+ return NULL;
+ Cprintk("cpa split gb paddr %lx\n", paddr);
+ addr = paddr & PUD_PAGE_MASK;
+ for (i = 0; i < PTRS_PER_PMD; i++, addr += PMD_PAGE_SIZE) {
+ if (paddr >= addr && paddr < addr + PMD_PAGE_SIZE) {
+ struct page *l3;
+ l3 = split_pmd(paddr, prot, ref_prot);
+ if (!l3)
+ return free_split_page(base);
+ page_private(l3)++;
+ pbase[i] = mk_pte(l3, ref_prot);
+ } else {
+ pbase[i] = pfn_pte(addr>>PAGE_SHIFT, ref_prot);
+ pbase[i] = pte_mkhuge(pbase[i]);
+ }
}
return base;
+}
+
+static struct page *split_large_page(unsigned long address, pgprot_t prot,
+ pgprot_t ref_prot, int level)
+{
+ unsigned long paddr = __pa(address);
+ Cprintk("cpa splitting %lx level %d\n", address, level);
+ if (level == 2)
+ return split_gb(paddr, prot, ref_prot);
+ else if (level == 3)
+ return split_pmd(paddr, prot, ref_prot);
+ else {
+ printk("address %lx\n", address);
+ dump_pagetable(address);
+ BUG();
+ }
+ return NULL;
}
struct flush_arg {
@@ -132,17 +195,42 @@ static inline void save_page(struct page
list_add(&fpage->lru, &deferred_pages);
}
+static void reset_large_pte(pte_t *pte, unsigned long addr, pgprot_t prot)
+{
+ unsigned long pfn = __pa(addr) >> PAGE_SHIFT;
+ set_pte(pte, pte_mkhuge(pfn_pte(pfn, prot)));
+}
+
+static void
+revert_gb(unsigned long address, pud_t *pud, pmd_t *pmd, pgprot_t ref_prot)
+{
+ struct page *p = virt_to_page(pmd);
+
+ /* Reserved pages means it has been already set up at boot. Don't touch those. */
+ if (PageReserved(p))
+ return;
+
+ Cprintk("cpa revert gb %lx count %ld\n", address, page_private(p));
+ --page_private(p);
+ BUG_ON(page_private(p) < 0);
+ if (page_private(p) == 0) {
+ save_page(p);
+ reset_large_pte((pte_t *)pud, address & PUD_PAGE_MASK, ref_prot);
+ }
+}
+
/*
* No more special protections in this 2MB area - revert to a
- * large page again.
+ * large or GB page again.
*/
+
static void revert_page(unsigned long address, pgprot_t ref_prot)
{
pgd_t *pgd;
pud_t *pud;
pmd_t *pmd;
- pte_t large_pte;
- unsigned long pfn;
+
+ Cprintk("cpa revert %lx\n", address);
pgd = pgd_offset_k(address);
BUG_ON(pgd_none(*pgd));
@@ -150,10 +238,9 @@ static void revert_page(unsigned long ad
BUG_ON(pud_none(*pud));
pmd = pmd_offset(pud, address);
BUG_ON(pmd_val(*pmd) & _PAGE_PSE);
- pfn = (__pa(address) & PMD_PAGE_MASK) >> PAGE_SHIFT;
- large_pte = pfn_pte(pfn, ref_prot);
- large_pte = pte_mkhuge(large_pte);
- set_pte((pte_t *)pmd, large_pte);
+ reset_large_pte((pte_t *)pmd, address & PMD_PAGE_MASK, ref_prot);
+
+ revert_gb(address, pud, pmd, ref_prot);
}
/*
@@ -189,6 +276,7 @@ static void set_tlb_flush(unsigned long
static unsigned short pat_bit[5] = {
[4] = _PAGE_PAT,
[3] = _PAGE_PAT_LARGE,
+ [2] = _PAGE_PAT_LARGE,
};
static int cache_attr_changed(pte_t pte, pgprot_t prot, int level)
@@ -224,15 +312,14 @@ __change_page_attr(unsigned long address
page_private(kpte_page)++;
set_pte(kpte, pfn_pte(pfn, prot));
} else {
- /*
- * split_large_page will take the reference for this
- * change_page_attr on the split page.
- */
struct page *split;
ref_prot2 = pte_pgprot(pte_clrhuge(*kpte));
- split = split_large_page(address, prot, ref_prot2);
+ split = split_large_page(address, prot, ref_prot2,
+ level);
if (!split)
return -ENOMEM;
+ if (level == 3 && !PageReserved(kpte_page))
+ page_private(kpte_page)++;
pgprot_val(ref_prot2) &= ~_PAGE_NX;
set_pte(kpte, mk_pte(split, ref_prot2));
kpte_page = split;
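The shape of a two-level split may be easier to see outside the diff. A
sketch of where a 4KB target lands after split_gb() plus split_pmd(), using
only index arithmetic (shift values are the x86-64 ones):

/*
 * After splitting a 1GB mapping for one 4KB page:
 *   - the PUD now points to a pmd page: 511 entries still 2MB-huge,
 *     one pointing to a pte page
 *   - that pte page: 511 entries with ref_prot, one with the new prot
 * page_private() on each split page counts the non-standard entries,
 * so revert_page()/revert_gb() can fold a page back once it hits zero.
 */
static void locate_split_target(unsigned long paddr,
                                unsigned long *pmd_idx, unsigned long *pte_idx)
{
        *pmd_idx = (paddr & ((1UL << 30) - 1)) >> 21; /* 2MB slot in the GB */
        *pte_idx = (paddr & ((1UL << 21) - 1)) >> 12; /* 4KB slot in the 2MB */
}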
* [PATCH] [8/8] GBPAGES: Do kernel direct mapping at boot using GB pages
From: Andi Kleen @ 2008-01-03 17:27 UTC
To: linux-kernel
This should decrease TLB pressure because the kernel will take
fewer TLB misses for its own data accesses.
Only done for 64-bit because i386 does not support GB page tables.
This only applies to the data portion of the direct mapping; the
kernel text mapping stays with 2MB pages because the AMD Fam10h
microarchitecture does not support GB ITLBs and AMD recommends
against using GB mappings for code.
Can be disabled with direct_gbpages=off.
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86/mm/init_64.c | 63 ++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 54 insertions(+), 9 deletions(-)
Index: linux/arch/x86/mm/init_64.c
===================================================================
--- linux.orig/arch/x86/mm/init_64.c
+++ linux/arch/x86/mm/init_64.c
@@ -264,13 +264,20 @@ __meminit void early_iounmap(void *addr,
__flush_tlb();
}
+static unsigned long direct_entry(unsigned long paddr)
+{
+ unsigned long entry;
+ entry = __PAGE_KERNEL_LARGE|_PAGE_GLOBAL|paddr;
+ entry &= __supported_pte_mask;
+ return entry;
+}
+
static void __meminit
phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end)
{
int i = pmd_index(address);
for (; i < PTRS_PER_PMD; i++, address += PMD_SIZE) {
- unsigned long entry;
pmd_t *pmd = pmd_page + pmd_index(address);
if (address >= end) {
@@ -283,9 +290,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned
if (pmd_val(*pmd))
continue;
- entry = __PAGE_KERNEL_LARGE|_PAGE_GLOBAL|address;
- entry &= __supported_pte_mask;
- set_pmd(pmd, __pmd(entry));
+ set_pmd(pmd, __pmd(direct_entry(address)));
}
}
@@ -318,7 +323,13 @@ static void __meminit phys_pud_init(pud_
}
if (pud_val(*pud)) {
- phys_pmd_update(pud, addr, end);
+ if (!pud_large(*pud))
+ phys_pmd_update(pud, addr, end);
+ continue;
+ }
+
+ if (direct_gbpages > 0) {
+ set_pud(pud, __pud(direct_entry(addr)));
continue;
}
@@ -337,9 +348,11 @@ static void __init find_early_table_spac
unsigned long puds, pmds, tables, start;
puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
- pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
- tables = round_up(puds * sizeof(pud_t), PAGE_SIZE) +
- round_up(pmds * sizeof(pmd_t), PAGE_SIZE);
+ tables = round_up(puds * sizeof(pud_t), PAGE_SIZE);
+ if (!direct_gbpages) {
+ pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
+ tables += round_up(pmds * sizeof(pmd_t), PAGE_SIZE);
+ }
/* RED-PEN putting page tables only on node 0 could
cause a hotspot and fill up ZONE_DMA. The page tables
@@ -372,8 +385,15 @@ void __init_refok init_memory_mapping(un
* mapped. Unfortunately this is done currently before the nodes are
* discovered.
*/
- if (!after_bootmem)
+ if (!after_bootmem) {
+ if (direct_gbpages >= 0 && cpu_has_gbpages) {
+ printk(KERN_INFO "Using GB pages for direct mapping\n");
+ direct_gbpages = 1;
+ } else
+ direct_gbpages = 0;
+
find_early_table_space(end);
+ }
start = (unsigned long)__va(start);
end = (unsigned long)__va(end);
@@ -419,6 +439,27 @@ void __init paging_init(void)
}
#endif
+static void split_gb_page(pud_t *pud, unsigned long paddr)
+{
+ int i;
+ pmd_t *pmd;
+ struct page *p = alloc_page(GFP_KERNEL);
+ if (!p)
+ return;
+
+ Dprintk("split_gb_page %lx\n", paddr);
+
+ SetPagePrivate(p);
+ /* Set reference to 1 so that c_p_a() does not undo it */
+ page_private(p) = 1;
+
+ paddr &= PUD_PAGE_MASK;
+ pmd = page_address(p);
+ for (i = 0; i < PTRS_PER_PTE; i++, paddr += PMD_PAGE_SIZE)
+ pmd[i] = __pmd(direct_entry(paddr));
+ pud_populate(NULL, pud, pmd);
+}
+
/* Unmap a kernel mapping if it exists. This is useful to avoid prefetches
from the CPU leading to inconsistent cache lines. address and size
must be aligned to 2MB boundaries.
@@ -430,6 +471,8 @@ __clear_kernel_mapping(unsigned long add
BUG_ON(address & ~PMD_PAGE_MASK);
BUG_ON(size & ~PMD_PAGE_MASK);
+
+ Dprintk("clear_kernel_mapping %lx-%lx\n", address, address+size);
for (; address < end; address += PMD_PAGE_SIZE) {
pgd_t *pgd = pgd_offset_k(address);
@@ -438,6 +481,8 @@ __clear_kernel_mapping(unsigned long add
if (pgd_none(*pgd))
continue;
pud = pud_offset(pgd, address);
+ if (pud_large(*pud))
+ split_gb_page(pud, __pa(address));
if (pud_none(*pud))
continue;
pmd = pmd_offset(pud, address);
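The find_early_table_space() hunk is where the savings show up first. A
back-of-the-envelope sketch for an assumed 64GB machine, using 8-byte table
entries:

#include <stdio.h>

int main(void)
{
        unsigned long end  = 64UL << 30;               /* assumed: 64GB of RAM */
        unsigned long puds = (end + (1UL << 30) - 1) >> 30;
        unsigned long pmds = (end + (1UL << 21) - 1) >> 21;

        /* with GB pages only pud tables are reserved early: 64 * 8 = 512 bytes */
        printf("pud entries: %lu (%lu bytes)\n", puds, puds * 8);
        /* without them, pmd tables are needed too: 32768 * 8 = 256KB */
        printf("pmd entries: %lu (%lu bytes)\n", pmds, pmds * 8);
        return 0;
}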
* Re: [PATCH] [1/8] GBPAGES: Handle kernel near memory hole in clear_kernel_mapping
From: Vivek Goyal @ 2008-01-03 18:29 UTC
To: Andi Kleen; +Cc: ebiederm, linux-kernel
On Thu, Jan 03, 2008 at 06:26:57PM +0100, Andi Kleen wrote:
>
> This was a long standing obscure problem in the relocatable kernel. The
> AMD GART driver needs to unmap part of the GART in the kernel direct mapping to
> prevent cache corruption. With the relocatable kernel it is in theory possible
> that the separate kernel text mapping straddles that area too.
>
> Normally it should not happen because GART tends to be >= 2GB, and the kernel
> is normally not loaded that high, but it is possible in theory.
>
> Teach clear_kernel_mapping() about this case.
>
> This will become more important once the kernel mapping uses 1GB pages.
>
> Cc: ebiederm@xmission.com
> Cc: vgoyal@redhat.com
>
> Signed-off-by: Andi Kleen <ak@suse.de>
>
> ---
> arch/x86/mm/init_64.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> Index: linux/arch/x86/mm/init_64.c
> ===================================================================
> --- linux.orig/arch/x86/mm/init_64.c
> +++ linux/arch/x86/mm/init_64.c
> @@ -411,7 +411,8 @@ void __init paging_init(void)
> from the CPU leading to inconsistent cache lines. address and size
> must be aligned to 2MB boundaries.
> Does nothing when the mapping doesn't exist. */
> -void __init clear_kernel_mapping(unsigned long address, unsigned long size)
> +static void __init
> +__clear_kernel_mapping(unsigned long address, unsigned long size)
> {
> unsigned long end = address + size;
>
> @@ -441,6 +442,23 @@ void __init clear_kernel_mapping(unsigne
> __flush_tlb_all();
> }
>
> +#define overlaps(as,ae,bs,be) ((ae) >= (bs) && (as) <= (be))
> +
> +void __init clear_kernel_mapping(unsigned long address, unsigned long size)
> +{
> + int sh = PMD_SHIFT;
> + unsigned long kernel = __pa(__START_KERNEL_map);
> +
> + if (overlaps(kernel>>sh, (kernel + KERNEL_TEXT_SIZE)>>sh,
> + __pa(address)>>sh, __pa(address + size)>>sh)) {
> + printk(KERN_INFO
> + "Kernel at %lx overlaps memory hole at %lx-%lx\n",
> + kernel, __pa(address), __pa(address+size));
> + __clear_kernel_mapping(__START_KERNEL_map+__pa(address), size);
Hi Andi,
Got a question. How will the kernel continue to run if we unmap the kernel
text/data region mappings?
Thanks
Vivek
* Re: [PATCH] [1/8] GBPAGES: Handle kernel near memory hole in clear_kernel_mapping
From: Andi Kleen @ 2008-01-03 18:43 UTC
To: Vivek Goyal; +Cc: ebiederm, linux-kernel
> Got a question. How will the kernel continue to run if we unmap the kernel
> text/data region mappings?
Normally it shouldn't be in the same 2MB area as the aperture (which
is the only thing that is unmapped). The problem is mostly
the rest of the 40MB kernel mapping.
-Andi
* Re: [PATCH] [6/8] GBPAGES: Add an option to disable direct mapping gbpages and a global variable
From: Nish Aravamudan @ 2008-01-03 19:03 UTC
To: Andi Kleen; +Cc: linux-kernel
On 1/3/08, Andi Kleen <ak@suse.de> wrote:
>
> Signed-off-by: Andi Kleen <ak@suse.de>
<snip>
> Index: linux/Documentation/x86_64/boot-options.txt
> ===================================================================
> --- linux.orig/Documentation/x86_64/boot-options.txt
> +++ linux/Documentation/x86_64/boot-options.txt
> @@ -307,3 +307,6 @@ Debugging
> stuck (default)
>
> Miscellaneous
> +
> + direct_gbpages=off
> + Do not use GB pages for kernel direct mapping.
Sorry if this is a FAQ, but why do we have this file in addition to
kernel-parameters.txt? I see that kernel-parameters.txt refers to this
file, so I guess it's ok, but shouldn't we try to consolidate?
Thanks,
Nish
* Re: [PATCH] [2/8] GBPAGES: Add feature macros for the gbpages cpuid bit
From: Andi Kleen @ 2008-01-24 6:57 UTC
To: Jan Engelhardt; +Cc: mingo, tglx, linux-kernel
On Wednesday 23 January 2008 22:26:35 Jan Engelhardt wrote:
> On Jan 19 2008 07:48, Andi Kleen wrote:
> >Subject: [PATCH] [2/8] GBPAGES: Add feature macros for the gbpages cpuid
> > bit
>
> Is there already a flag for /proc/cpuinfo or could you add one?
There is already one called pdpe1gb. I don't think it's a very clear name,
although it is AMD's own name for the bit. Calling it gbpages in /proc/cpuinfo
would probably have been better (and my old original patch did that too), but I
didn't catch the new name submitted by someone else in time.
-Andi
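So today the bit shows up as pdpe1gb in the flags line; an illustrative
user-space sketch of checking for it:

#include <stdio.h>
#include <string.h>

int main(void)
{
        char line[4096];
        FILE *f = fopen("/proc/cpuinfo", "r");

        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f)) {
                /* every flag in the "flags" line is preceded by a space */
                if (!strncmp(line, "flags", 5) && strstr(line, " pdpe1gb")) {
                        puts("GB pages (pdpe1gb) supported");
                        break;
                }
        }
        fclose(f);
        return 0;
}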
* Re: [PATCH] [2/8] GBPAGES: Add feature macros for the gbpages cpuid bit
From: Jan Engelhardt @ 2008-01-23 21:26 UTC
To: Andi Kleen; +Cc: mingo, tglx, linux-kernel
On Jan 19 2008 07:48, Andi Kleen wrote:
>Subject: [PATCH] [2/8] GBPAGES: Add feature macros for the gbpages cpuid bit
Is there already a flag for /proc/cpuinfo or could you add one?
>Index: linux/include/asm-x86/cpufeature.h
>===================================================================
>--- linux.orig/include/asm-x86/cpufeature.h
>+++ linux/include/asm-x86/cpufeature.h
>@@ -49,6 +49,7 @@
> #define X86_FEATURE_MP (1*32+19) /* MP Capable. */
> #define X86_FEATURE_NX (1*32+20) /* Execute Disable */
> #define X86_FEATURE_MMXEXT (1*32+22) /* AMD MMX extensions */
>+#define X86_FEATURE_GBPAGES (1*32+26) /* GB pages */
> #define X86_FEATURE_RDTSCP (1*32+27) /* RDTSCP */
> #define X86_FEATURE_LM (1*32+29) /* Long Mode (x86-64) */
> #define X86_FEATURE_3DNOWEXT (1*32+30) /* AMD 3DNow! extensions */
* [PATCH] [2/8] GBPAGES: Add feature macros for the gbpages cpuid bit
From: Andi Kleen @ 2008-01-19 6:48 UTC
To: mingo, tglx, linux-kernel
Signed-off-by: Andi Kleen <ak@suse.de>
---
include/asm-x86/cpufeature.h | 2 ++
1 file changed, 2 insertions(+)
Index: linux/include/asm-x86/cpufeature.h
===================================================================
--- linux.orig/include/asm-x86/cpufeature.h
+++ linux/include/asm-x86/cpufeature.h
@@ -49,6 +49,7 @@
#define X86_FEATURE_MP (1*32+19) /* MP Capable. */
#define X86_FEATURE_NX (1*32+20) /* Execute Disable */
#define X86_FEATURE_MMXEXT (1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_GBPAGES (1*32+26) /* GB pages */
#define X86_FEATURE_RDTSCP (1*32+27) /* RDTSCP */
#define X86_FEATURE_LM (1*32+29) /* Long Mode (x86-64) */
#define X86_FEATURE_3DNOWEXT (1*32+30) /* AMD 3DNow! extensions */
@@ -173,6 +174,7 @@
#define cpu_has_bts boot_cpu_has(X86_FEATURE_BTS)
#define cpu_has_pat boot_cpu_has(X86_FEATURE_PAT)
#define cpu_has_ss boot_cpu_has(X86_FEATURE_SELFSNOOP)
+#define cpu_has_gbpages boot_cpu_has(X86_FEATURE_GBPAGES)
#if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64)
# define cpu_has_invlpg 1