LKML Archive on lore.kernel.org
* [PATCHv5 0/7] 5-level paging changes for v4.18
@ 2018-05-18 10:35 Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 1/7] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
` (7 more replies)
0 siblings, 8 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
Here are several patches that I would like to queue for v4.18. Please review
and consider applying them.
In this version I've addressed Thomas' feedback.
Changing __pgtable_l5_enabled to __initdata is not as trivial as I hoped.
It requires a few tricks to avoid a section mismatch, and I'm not sure it's
worth the gain. We can keep it __ro_after_init instead.
If you feel it's too invasive, just drop the last three patches.
Kirill A. Shutemov (7):
x86/boot/compressed/64: Fix trampoline page table address calculation
x86/mm: Unify pgtable_l5_enabled usage in early boot code
x86/mm: Stop pretending pgtable_l5_enabled is a variable
x86/mm: Introduce 'no5lvl' kernel parameter
x86/cpu: Move early cpu initialization into a separate translation
unit
x86/mm: Mark p4d_offset() __always_inline
x86/mm: Mark __pgtable_l5_enabled __initdata
.../admin-guide/kernel-parameters.txt | 3 +
arch/x86/boot/compressed/cmdline.c | 2 +-
arch/x86/boot/compressed/head_64.S | 1 +
arch/x86/boot/compressed/kaslr.c | 4 +-
arch/x86/boot/compressed/misc.h | 6 +-
arch/x86/boot/compressed/pgtable_64.c | 14 +-
arch/x86/include/asm/page_64_types.h | 2 +-
arch/x86/include/asm/paravirt.h | 4 +-
arch/x86/include/asm/pgalloc.h | 4 +-
arch/x86/include/asm/pgtable.h | 12 +-
arch/x86/include/asm/pgtable_32_types.h | 2 +-
arch/x86/include/asm/pgtable_64.h | 2 +-
arch/x86/include/asm/pgtable_64_types.h | 25 ++-
arch/x86/include/asm/sparsemem.h | 4 +-
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/common.c | 179 +++---------------
arch/x86/kernel/cpu/cpu.h | 7 +
arch/x86/kernel/cpu/early.c | 159 ++++++++++++++++
arch/x86/kernel/head64.c | 25 ++-
arch/x86/kernel/machine_kexec_64.c | 3 +-
arch/x86/mm/dump_pagetables.c | 6 +-
arch/x86/mm/fault.c | 4 +-
arch/x86/mm/ident_map.c | 2 +-
arch/x86/mm/init_64.c | 8 +-
arch/x86/mm/kasan_init_64.c | 14 +-
arch/x86/mm/kaslr.c | 8 +-
arch/x86/mm/tlb.c | 2 +-
arch/x86/platform/efi/efi_64.c | 2 +-
arch/x86/power/hibernate_64.c | 2 +-
29 files changed, 279 insertions(+), 228 deletions(-)
create mode 100644 arch/x86/kernel/cpu/early.c
--
2.17.0
* [PATCH 1/7] x86/boot/compressed/64: Fix trampoline page table address calculation
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
@ 2018-05-18 10:35 ` Kirill A. Shutemov
2018-05-19 8:43 ` Thomas Gleixner
2018-05-19 11:33 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 2/7] x86/mm: Unify pgtable_l5_enabled usage in early boot code Kirill A. Shutemov
` (6 subsequent siblings)
7 siblings, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
Hugh noticed that I calculate the address of the trampoline page table
wrongly in cleanup_trampoline(): TRAMPOLINE_32BIT_PGTABLE_OFFSET has to be
divided by sizeof(unsigned long), since trampoline_32bit is an unsigned long
pointer.
TRAMPOLINE_32BIT_PGTABLE_OFFSET happens to be zero, so the bug has no
visible effect.
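For illustration, a minimal standalone sketch of the scaling rule (plain
userspace C, not kernel code; the buffer and offset values are made up):

	#include <stdio.h>

	int main(void)
	{
		unsigned long buf[1024];
		unsigned long *base = buf;	/* like trampoline_32bit */
		unsigned long byte_off = 16;	/* a byte offset */

		/* Pointer arithmetic scales by sizeof(*base), so this
		 * advances 16 * sizeof(unsigned long) bytes: */
		void *wrong = base + byte_off;
		/* Dividing the byte offset down first advances 16 bytes: */
		void *right = base + byte_off / sizeof(unsigned long);

		printf("wrong: +%td bytes, right: +%td bytes\n",
		       (char *)wrong - (char *)buf,
		       (char *)right - (char *)buf);
		return 0;
	}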
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Hugh Dickins <hughd@google.com>
Fixes: e9d0e6330eb8 ("x86/boot/compressed/64: Prepare new top-level page table for trampoline")
---
arch/x86/boot/compressed/pgtable_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index a362fa0b849c..23707e1da1ff 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -130,7 +130,7 @@ void cleanup_trampoline(void *pgtable)
{
void *trampoline_pgtable;
- trampoline_pgtable = trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET;
+ trampoline_pgtable = trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET / sizeof(unsigned long);
/*
* Move the top level page table out of trampoline memory,
--
2.17.0
* [PATCH 2/7] x86/mm: Unify pgtable_l5_enabled usage in early boot code
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 1/7] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
@ 2018-05-18 10:35 ` Kirill A. Shutemov
2018-05-19 8:44 ` Thomas Gleixner
2018-05-19 11:34 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 3/7] x86/mm: Stop pretending pgtable_l5_enabled is a variable Kirill A. Shutemov
` (5 subsequent siblings)
7 siblings, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
Usually, pgtable_l5_enabled is defined using cpu_feature_enabled().
cpu_feature_enabled() is not available in early boot code, so we use
several different preprocessor tricks to get around it. It's messy.
Unify them all.
If cpu_feature_enabled() is not yet available, USE_EARLY_PGTABLE_L5 can
be defined before all includes. It makes pgtable_l5_enabled rely on the
__pgtable_l5_enabled variable instead. This approach fits all early
users.
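The intended usage pattern, sketched (the file below is hypothetical; the
macro and the comment wording follow the patch):

	/* hypothetical early-boot-user.c */

	/* cpu_feature_enabled() cannot be used this early */
	#define USE_EARLY_PGTABLE_L5

	#include <linux/init.h>
	#include <linux/types.h>
	#include <asm/pgtable_64_types.h>

	static bool __init early_user(void)
	{
		/* Resolves to the __pgtable_l5_enabled variable rather
		 * than cpu_feature_enabled(X86_FEATURE_LA57). */
		return pgtable_l5_enabled;
	}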
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/boot/compressed/kaslr.c | 4 ++--
arch/x86/boot/compressed/misc.h | 6 ++----
arch/x86/include/asm/pgtable_64_types.h | 13 ++++++++++---
arch/x86/kernel/head64.c | 12 +++++-------
arch/x86/mm/kasan_init_64.c | 6 ++----
5 files changed, 21 insertions(+), 20 deletions(-)
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index a0a50b91ecef..b87a7582853d 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -47,7 +47,7 @@
#include <linux/decompress/mm.h>
#ifdef CONFIG_X86_5LEVEL
-unsigned int pgtable_l5_enabled __ro_after_init;
+unsigned int __pgtable_l5_enabled;
unsigned int pgdir_shift __ro_after_init = 39;
unsigned int ptrs_per_p4d __ro_after_init = 1;
#endif
@@ -734,7 +734,7 @@ void choose_random_location(unsigned long input,
#ifdef CONFIG_X86_5LEVEL
if (__read_cr4() & X86_CR4_LA57) {
- pgtable_l5_enabled = 1;
+ __pgtable_l5_enabled = 1;
pgdir_shift = 48;
ptrs_per_p4d = 512;
}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 9e11be4cae19..a423bdb42686 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -12,10 +12,8 @@
#undef CONFIG_PARAVIRT_SPINLOCKS
#undef CONFIG_KASAN
-#ifdef CONFIG_X86_5LEVEL
-/* cpu_feature_enabled() cannot be used that early */
-#define pgtable_l5_enabled __pgtable_l5_enabled
-#endif
+/* cpu_feature_enabled() cannot be used this early */
+#define USE_EARLY_PGTABLE_L5
#include <linux/linkage.h>
#include <linux/screen_info.h>
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index adb47552e6bb..c14a4116a693 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -22,12 +22,19 @@ typedef struct { pteval_t pte; } pte_t;
#ifdef CONFIG_X86_5LEVEL
extern unsigned int __pgtable_l5_enabled;
-#ifndef pgtable_l5_enabled
+
+#ifdef USE_EARLY_PGTABLE_L5
+/*
+ * cpu_feature_enabled() is not available in early boot code.
+ * Use variable instead.
+ */
+#define pgtable_l5_enabled __pgtable_l5_enabled
+#else
#define pgtable_l5_enabled cpu_feature_enabled(X86_FEATURE_LA57)
-#endif
+#endif /* USE_EARLY_PGTABLE_L5 */
#else
#define pgtable_l5_enabled 0
-#endif
+#endif /* CONFIG_X86_5LEVEL */
extern unsigned int pgdir_shift;
extern unsigned int ptrs_per_p4d;
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 0c408f8c4ed4..ef629f2bcd61 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -6,6 +6,10 @@
*/
#define DISABLE_BRANCH_PROFILING
+
+/* cpu_feature_enabled() cannot be used this early */
+#define USE_EARLY_PGTABLE_L5
+
#include <linux/init.h>
#include <linux/linkage.h>
#include <linux/types.h>
@@ -32,11 +36,6 @@
#include <asm/microcode.h>
#include <asm/kasan.h>
-#ifdef CONFIG_X86_5LEVEL
-#undef pgtable_l5_enabled
-#define pgtable_l5_enabled __pgtable_l5_enabled
-#endif
-
/*
* Manage page tables very early on.
*/
@@ -46,7 +45,6 @@ pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
#ifdef CONFIG_X86_5LEVEL
unsigned int __pgtable_l5_enabled __ro_after_init;
-EXPORT_SYMBOL(__pgtable_l5_enabled);
unsigned int pgdir_shift __ro_after_init = 39;
EXPORT_SYMBOL(pgdir_shift);
unsigned int ptrs_per_p4d __ro_after_init = 1;
@@ -88,7 +86,7 @@ static bool __head check_la57_support(unsigned long physaddr)
if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
return false;
- *fixup_int(&pgtable_l5_enabled, physaddr) = 1;
+ *fixup_int(&__pgtable_l5_enabled, physaddr) = 1;
*fixup_int(&pgdir_shift, physaddr) = 48;
*fixup_int(&ptrs_per_p4d, physaddr) = 512;
*fixup_long(&page_offset_base, physaddr) = __PAGE_OFFSET_BASE_L5;
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 980dbebd0ca7..340bb9b32e01 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -2,10 +2,8 @@
#define DISABLE_BRANCH_PROFILING
#define pr_fmt(fmt) "kasan: " fmt
-#ifdef CONFIG_X86_5LEVEL
-/* Too early to use cpu_feature_enabled() */
-#define pgtable_l5_enabled __pgtable_l5_enabled
-#endif
+/* cpu_feature_enabled() cannot be used this early */
+#define USE_EARLY_PGTABLE_L5
#include <linux/bootmem.h>
#include <linux/kasan.h>
--
2.17.0
* [PATCH 3/7] x86/mm: Stop pretending pgtable_l5_enabled is a variable
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 1/7] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 2/7] x86/mm: Unify pgtable_l5_enabled usage in early boot code Kirill A. Shutemov
@ 2018-05-18 10:35 ` Kirill A. Shutemov
2018-05-19 8:45 ` Thomas Gleixner
2018-05-19 11:34 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 4/7] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
` (4 subsequent siblings)
7 siblings, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
pgtable_l5_enabled is defined using cpu_feature_enabled() but we refer
to it as a variable. This is misleading.
Make pgtable_l5_enabled() a function.
We cannot literally define it as a function due to circular dependencies
between header files. A function-like macro is close enough.
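A simplified illustration of why the macro sidesteps the include cycle
(header relationships simplified; not the real include graph):

	/*
	 * pgtable_64_types.h cannot #include the header that declares
	 * cpu_feature_enabled() without creating an include cycle. A macro
	 * is expanded only at the point of use, where that declaration is
	 * already visible, and callers still get function-call syntax:
	 */
	#define pgtable_l5_enabled()	cpu_feature_enabled(X86_FEATURE_LA57)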
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/include/asm/page_64_types.h | 2 +-
arch/x86/include/asm/paravirt.h | 4 ++--
arch/x86/include/asm/pgalloc.h | 4 ++--
arch/x86/include/asm/pgtable.h | 10 +++++-----
arch/x86/include/asm/pgtable_32_types.h | 2 +-
arch/x86/include/asm/pgtable_64.h | 2 +-
arch/x86/include/asm/pgtable_64_types.h | 14 +++++++++-----
arch/x86/include/asm/sparsemem.h | 4 ++--
arch/x86/kernel/head64.c | 2 +-
arch/x86/kernel/machine_kexec_64.c | 3 ++-
arch/x86/mm/dump_pagetables.c | 6 +++---
arch/x86/mm/fault.c | 4 ++--
arch/x86/mm/ident_map.c | 2 +-
arch/x86/mm/init_64.c | 8 ++++----
arch/x86/mm/kasan_init_64.c | 8 ++++----
arch/x86/mm/kaslr.c | 8 ++++----
arch/x86/mm/tlb.c | 2 +-
arch/x86/platform/efi/efi_64.c | 2 +-
arch/x86/power/hibernate_64.c | 2 +-
19 files changed, 47 insertions(+), 42 deletions(-)
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 2c5a966dc222..6afac386a434 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -53,7 +53,7 @@
#define __PHYSICAL_MASK_SHIFT 52
#ifdef CONFIG_X86_5LEVEL
-#define __VIRTUAL_MASK_SHIFT (pgtable_l5_enabled ? 56 : 47)
+#define __VIRTUAL_MASK_SHIFT (pgtable_l5_enabled() ? 56 : 47)
#else
#define __VIRTUAL_MASK_SHIFT 47
#endif
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 9be2bf13825b..d49bbf4bb5c8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -574,14 +574,14 @@ static inline void __set_pgd(pgd_t *pgdp, pgd_t pgd)
}
#define set_pgd(pgdp, pgdval) do { \
- if (pgtable_l5_enabled) \
+ if (pgtable_l5_enabled()) \
__set_pgd(pgdp, pgdval); \
else \
set_p4d((p4d_t *)(pgdp), (p4d_t) { (pgdval).pgd }); \
} while (0)
#define pgd_clear(pgdp) do { \
- if (pgtable_l5_enabled) \
+ if (pgtable_l5_enabled()) \
set_pgd(pgdp, __pgd(0)); \
} while (0)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index 263c142a6a6c..ada6410fd2ec 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -167,7 +167,7 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
#if CONFIG_PGTABLE_LEVELS > 4
static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return;
paravirt_alloc_p4d(mm, __pa(p4d) >> PAGE_SHIFT);
set_pgd(pgd, __pgd(_PAGE_TABLE | __pa(p4d)));
@@ -193,7 +193,7 @@ extern void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d);
static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long address)
{
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
___p4d_free_tlb(tlb, p4d);
}
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f1633de5a675..5715647fc4fe 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -65,7 +65,7 @@ extern pmdval_t early_pmd_flags;
#ifndef __PAGETABLE_P4D_FOLDED
#define set_pgd(pgdp, pgd) native_set_pgd(pgdp, pgd)
-#define pgd_clear(pgd) (pgtable_l5_enabled ? native_pgd_clear(pgd) : 0)
+#define pgd_clear(pgd) (pgtable_l5_enabled() ? native_pgd_clear(pgd) : 0)
#endif
#ifndef set_p4d
@@ -881,7 +881,7 @@ static inline unsigned long p4d_index(unsigned long address)
#if CONFIG_PGTABLE_LEVELS > 4
static inline int pgd_present(pgd_t pgd)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return 1;
return pgd_flags(pgd) & _PAGE_PRESENT;
}
@@ -900,7 +900,7 @@ static inline unsigned long pgd_page_vaddr(pgd_t pgd)
/* to find an entry in a page-table-directory. */
static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return (p4d_t *)pgd;
return (p4d_t *)pgd_page_vaddr(*pgd) + p4d_index(address);
}
@@ -909,7 +909,7 @@ static inline int pgd_bad(pgd_t pgd)
{
unsigned long ignore_flags = _PAGE_USER;
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return 0;
if (IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION))
@@ -920,7 +920,7 @@ static inline int pgd_bad(pgd_t pgd)
static inline int pgd_none(pgd_t pgd)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return 0;
/*
* There is no need to do a workaround for the KNL stray
diff --git a/arch/x86/include/asm/pgtable_32_types.h b/arch/x86/include/asm/pgtable_32_types.h
index e3225e83db7d..d9a001a4a872 100644
--- a/arch/x86/include/asm/pgtable_32_types.h
+++ b/arch/x86/include/asm/pgtable_32_types.h
@@ -15,7 +15,7 @@
# include <asm/pgtable-2level_types.h>
#endif
-#define pgtable_l5_enabled 0
+#define pgtable_l5_enabled() 0
#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE - 1))
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 877bc27718ae..3c5385f9a88f 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -220,7 +220,7 @@ static inline void native_set_p4d(p4d_t *p4dp, p4d_t p4d)
{
pgd_t pgd;
- if (pgtable_l5_enabled || !IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION)) {
+ if (pgtable_l5_enabled() || !IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION)) {
*p4dp = p4d;
return;
}
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index c14a4116a693..054765ab2da2 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -28,12 +28,16 @@ extern unsigned int __pgtable_l5_enabled;
* cpu_feature_enabled() is not available in early boot code.
* Use variable instead.
*/
-#define pgtable_l5_enabled __pgtable_l5_enabled
+static inline bool pgtable_l5_enabled(void)
+{
+ return __pgtable_l5_enabled;
+}
#else
-#define pgtable_l5_enabled cpu_feature_enabled(X86_FEATURE_LA57)
+#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
#endif /* USE_EARLY_PGTABLE_L5 */
+
#else
-#define pgtable_l5_enabled 0
+#define pgtable_l5_enabled() 0
#endif /* CONFIG_X86_5LEVEL */
extern unsigned int pgdir_shift;
@@ -109,7 +113,7 @@ extern unsigned int ptrs_per_p4d;
#define LDT_PGD_ENTRY_L4 -3UL
#define LDT_PGD_ENTRY_L5 -112UL
-#define LDT_PGD_ENTRY (pgtable_l5_enabled ? LDT_PGD_ENTRY_L5 : LDT_PGD_ENTRY_L4)
+#define LDT_PGD_ENTRY (pgtable_l5_enabled() ? LDT_PGD_ENTRY_L5 : LDT_PGD_ENTRY_L4)
#define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
#define __VMALLOC_BASE_L4 0xffffc90000000000UL
@@ -123,7 +127,7 @@ extern unsigned int ptrs_per_p4d;
#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
# define VMALLOC_START vmalloc_base
-# define VMALLOC_SIZE_TB (pgtable_l5_enabled ? VMALLOC_SIZE_TB_L5 : VMALLOC_SIZE_TB_L4)
+# define VMALLOC_SIZE_TB (pgtable_l5_enabled() ? VMALLOC_SIZE_TB_L5 : VMALLOC_SIZE_TB_L4)
# define VMEMMAP_START vmemmap_base
#else
# define VMALLOC_START __VMALLOC_BASE_L4
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index 4617a2bf123c..199218719a86 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -27,8 +27,8 @@
# endif
#else /* CONFIG_X86_32 */
# define SECTION_SIZE_BITS 27 /* matt - 128 is convenient right now */
-# define MAX_PHYSADDR_BITS (pgtable_l5_enabled ? 52 : 44)
-# define MAX_PHYSMEM_BITS (pgtable_l5_enabled ? 52 : 46)
+# define MAX_PHYSADDR_BITS (pgtable_l5_enabled() ? 52 : 44)
+# define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
#endif
#endif /* CONFIG_SPARSEMEM */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index ef629f2bcd61..ac470e1ea102 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -271,7 +271,7 @@ int __init __early_make_pgtable(unsigned long address, pmdval_t pmd)
* critical -- __PAGE_OFFSET would point us back into the dynamic
* range and we might end up looping forever...
*/
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
p4d_p = pgd_p;
else if (pgd)
p4d_p = (p4dval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index a5e55d832d0a..ffe0f3535200 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -351,7 +351,8 @@ void arch_crash_save_vmcoreinfo(void)
{
VMCOREINFO_NUMBER(phys_base);
VMCOREINFO_SYMBOL(init_top_pgt);
- VMCOREINFO_NUMBER(pgtable_l5_enabled);
+ vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
+ pgtable_l5_enabled());
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index cc7ff5957194..2f3c9196b834 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -360,7 +360,7 @@ static inline bool kasan_page_table(struct seq_file *m, struct pg_state *st,
void *pt)
{
if (__pa(pt) == __pa(kasan_zero_pmd) ||
- (pgtable_l5_enabled && __pa(pt) == __pa(kasan_zero_p4d)) ||
+ (pgtable_l5_enabled() && __pa(pt) == __pa(kasan_zero_p4d)) ||
__pa(pt) == __pa(kasan_zero_pud)) {
pgprotval_t prot = pte_flags(kasan_zero_pte[0]);
note_page(m, st, __pgprot(prot), 0, 5);
@@ -476,8 +476,8 @@ static void walk_p4d_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
}
}
-#define pgd_large(a) (pgtable_l5_enabled ? pgd_large(a) : p4d_large(__p4d(pgd_val(a))))
-#define pgd_none(a) (pgtable_l5_enabled ? pgd_none(a) : p4d_none(__p4d(pgd_val(a))))
+#define pgd_large(a) (pgtable_l5_enabled() ? pgd_large(a) : p4d_large(__p4d(pgd_val(a))))
+#define pgd_none(a) (pgtable_l5_enabled() ? pgd_none(a) : p4d_none(__p4d(pgd_val(a))))
static inline bool is_hypervisor_range(int idx)
{
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 73bd8c95ac71..77ec014554e7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -439,7 +439,7 @@ static noinline int vmalloc_fault(unsigned long address)
if (pgd_none(*pgd_k))
return -1;
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
if (pgd_none(*pgd)) {
set_pgd(pgd, *pgd_k);
arch_flush_lazy_mmu_mode();
@@ -454,7 +454,7 @@ static noinline int vmalloc_fault(unsigned long address)
if (p4d_none(*p4d_k))
return -1;
- if (p4d_none(*p4d) && !pgtable_l5_enabled) {
+ if (p4d_none(*p4d) && !pgtable_l5_enabled()) {
set_p4d(p4d, *p4d_k);
arch_flush_lazy_mmu_mode();
} else {
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index a2f0c7e20fb0..fe7a12599d8e 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -123,7 +123,7 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
result = ident_p4d_init(info, p4d, addr, next);
if (result)
return result;
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
set_pgd(pgd, __pgd(__pa(p4d) | info->kernpg_flag));
} else {
/*
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0a400606dea0..17383f9677fa 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -180,7 +180,7 @@ static void sync_global_pgds_l4(unsigned long start, unsigned long end)
*/
void sync_global_pgds(unsigned long start, unsigned long end)
{
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
sync_global_pgds_l5(start, end);
else
sync_global_pgds_l4(start, end);
@@ -643,7 +643,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
unsigned long vaddr = (unsigned long)__va(paddr);
int i = p4d_index(vaddr);
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return phys_pud_init((pud_t *) p4d_page, paddr, paddr_end, page_size_mask);
for (; i < PTRS_PER_P4D; i++, paddr = paddr_next) {
@@ -723,7 +723,7 @@ kernel_physical_mapping_init(unsigned long paddr_start,
page_size_mask);
spin_lock(&init_mm.page_table_lock);
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
pgd_populate(&init_mm, pgd, p4d);
else
p4d_populate(&init_mm, p4d_offset(pgd, vaddr), (pud_t *) p4d);
@@ -1100,7 +1100,7 @@ remove_p4d_table(p4d_t *p4d_start, unsigned long addr, unsigned long end,
* 5-level case we should free them. This code will have to change
* to adapt for boot-time switching between 4 and 5 level page tables.
*/
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
free_pud_table(pud_base, p4d);
}
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 340bb9b32e01..e3e77527f8df 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -180,7 +180,7 @@ static void __init clear_pgds(unsigned long start,
* With folded p4d, pgd_clear() is nop, use p4d_clear()
* instead.
*/
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
pgd_clear(pgd);
else
p4d_clear(p4d_offset(pgd, start));
@@ -195,7 +195,7 @@ static inline p4d_t *early_p4d_offset(pgd_t *pgd, unsigned long addr)
{
unsigned long p4d;
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return (p4d_t *)pgd;
p4d = __pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK;
@@ -282,7 +282,7 @@ void __init kasan_early_init(void)
for (i = 0; i < PTRS_PER_PUD; i++)
kasan_zero_pud[i] = __pud(pud_val);
- for (i = 0; pgtable_l5_enabled && i < PTRS_PER_P4D; i++)
+ for (i = 0; pgtable_l5_enabled() && i < PTRS_PER_P4D; i++)
kasan_zero_p4d[i] = __p4d(p4d_val);
kasan_map_early_shadow(early_top_pgt);
@@ -313,7 +313,7 @@ void __init kasan_init(void)
* bunch of things like kernel code, modules, EFI mapping, etc.
* We need to take extra steps to not overwrite them.
*/
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
void *ptr;
ptr = (void *)pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_END));
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 615cc03ced84..61db77b0eda9 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -78,7 +78,7 @@ void __init kernel_randomize_memory(void)
struct rnd_state rand_state;
unsigned long remain_entropy;
- vaddr_start = pgtable_l5_enabled ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
+ vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
vaddr = vaddr_start;
/*
@@ -124,7 +124,7 @@ void __init kernel_randomize_memory(void)
*/
entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);
prandom_bytes_state(&rand_state, &rand, sizeof(rand));
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
entropy = (rand % (entropy + 1)) & P4D_MASK;
else
entropy = (rand % (entropy + 1)) & PUD_MASK;
@@ -136,7 +136,7 @@ void __init kernel_randomize_memory(void)
* randomization alignment.
*/
vaddr += get_padding(&kaslr_regions[i]);
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
vaddr = round_up(vaddr + 1, P4D_SIZE);
else
vaddr = round_up(vaddr + 1, PUD_SIZE);
@@ -212,7 +212,7 @@ void __meminit init_trampoline(void)
return;
}
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
init_trampoline_p4d();
else
init_trampoline_pud();
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index e055d1a06699..6eb1f34c3c85 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -157,7 +157,7 @@ static void sync_current_stack_to_mm(struct mm_struct *mm)
unsigned long sp = current_stack_pointer;
pgd_t *pgd = pgd_offset(mm, sp);
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
if (unlikely(pgd_none(*pgd))) {
pgd_t *pgd_ref = pgd_offset_k(sp);
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index bed7e7f4e44c..e01f7ceb9e7a 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -225,7 +225,7 @@ int __init efi_alloc_page_tables(void)
pud = pud_alloc(&init_mm, p4d, EFI_VA_END);
if (!pud) {
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
free_page((unsigned long) pgd_page_vaddr(*pgd));
free_pages((unsigned long)efi_pgd, PGD_ALLOCATION_ORDER);
return -ENOMEM;
diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index ccf4a49bb065..67ccf64c8bd8 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -72,7 +72,7 @@ static int set_up_temporary_text_mapping(pgd_t *pgd)
* tables used by the image kernel.
*/
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
p4d = (p4d_t *)get_safe_page(GFP_ATOMIC);
if (!p4d)
return -ENOMEM;
--
2.17.0
* [PATCH 4/7] x86/mm: Introduce 'no5lvl' kernel parameter
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
` (2 preceding siblings ...)
2018-05-18 10:35 ` [PATCH 3/7] x86/mm: Stop pretending pgtable_l5_enabled is a variable Kirill A. Shutemov
@ 2018-05-18 10:35 ` Kirill A. Shutemov
2018-05-19 8:46 ` Thomas Gleixner
2018-05-19 11:35 ` [tip:x86/boot] x86/mm: Introduce the " tip-bot for Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 5/7] x86/cpu: Move early cpu initialization into a separate translation unit Kirill A. Shutemov
` (3 subsequent siblings)
7 siblings, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
The kernel parameter allows forcing the kernel to use 4-level paging even
if the hardware and kernel support 5-level paging.
The option may be useful for working around regressions related to 5-level
paging.
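For example, to force 4-level paging on LA57-capable hardware, append the
option to the kernel command line (GRUB-style entry; the kernel path and
root device are illustrative):

	linux /boot/vmlinuz-4.18 root=/dev/sda1 ro no5lvl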
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
Documentation/admin-guide/kernel-parameters.txt | 3 +++
arch/x86/boot/compressed/cmdline.c | 2 +-
arch/x86/boot/compressed/head_64.S | 1 +
arch/x86/boot/compressed/pgtable_64.c | 12 ++++++++++--
arch/x86/kernel/cpu/common.c | 15 +++++++++++++++
arch/x86/kernel/head64.c | 9 +++++----
6 files changed, 35 insertions(+), 7 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 11fc28ecdb6d..364a33c1534d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2600,6 +2600,9 @@
emulation library even if a 387 maths coprocessor
is present.
+ no5lvl [X86-64] Disable 5-level paging mode. Forces
+ kernel to use 4-level paging instead.
+
no_console_suspend
[HW] Never suspend the console
Disable suspending of consoles during suspend and
diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
index 0cb325734cfb..af6cda0b7900 100644
--- a/arch/x86/boot/compressed/cmdline.c
+++ b/arch/x86/boot/compressed/cmdline.c
@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include "misc.h"
-#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE
+#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE || CONFIG_X86_5LEVEL
static unsigned long fs;
static inline void set_fs(unsigned long seg)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 8169e8b7a4dc..64037895b085 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -365,6 +365,7 @@ ENTRY(startup_64)
* this function call.
*/
pushq %rsi
+ movq %rsi, %rdi /* real mode address */
call paging_prepare
popq %rsi
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index 23707e1da1ff..8c5107545251 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -31,16 +31,23 @@ static char trampoline_save[TRAMPOLINE_32BIT_SIZE];
*/
unsigned long *trampoline_32bit __section(.data);
-struct paging_config paging_prepare(void)
+extern struct boot_params *boot_params;
+int cmdline_find_option_bool(const char *option);
+
+struct paging_config paging_prepare(void *rmode)
{
struct paging_config paging_config = {};
unsigned long bios_start, ebda_start;
+ /* Initialize boot_params. Required for cmdline_find_option_bool(). */
+ boot_params = rmode;
+
/*
* Check if LA57 is desired and supported.
*
- * There are two parts to the check:
+ * There are several parts to the check:
* - if the kernel supports 5-level paging: CONFIG_X86_5LEVEL=y
+ * - if user asked to disable 5-level paging: no5lvl in cmdline
* - if the machine supports 5-level paging:
* + CPUID leaf 7 is supported
* + the leaf has the feature bit set
@@ -48,6 +55,7 @@ struct paging_config paging_prepare(void)
* That's substitute for boot_cpu_has() in early boot code.
*/
if (IS_ENABLED(CONFIG_X86_5LEVEL) &&
+ !cmdline_find_option_bool("no5lvl") &&
native_cpuid_eax(0) >= 7 &&
(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) {
paging_config.l5_required = 1;
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ce243f7d2d4e..a32f3c02327f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1008,6 +1008,21 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
*/
setup_clear_cpu_cap(X86_FEATURE_PCID);
#endif
+
+ /*
+ * Later in the boot process pgtable_l5_enabled() relies on
+ * cpu_feature_enabled(X86_FEATURE_LA57). If 5-level paging is not
+ * enabled by this point we need to clear the feature bit to avoid
+ * false-positives at the later stage.
+ *
+ * pgtable_l5_enabled() can be false here for several reasons:
+ * - 5-level paging is disabled compile-time;
+ * - it's 32-bit kernel;
+ * - machine doesn't support 5-level paging;
+ * - user specified 'no5lvl' in kernel command line.
+ */
+ if (!pgtable_l5_enabled())
+ setup_clear_cpu_cap(X86_FEATURE_LA57);
}
void __init early_cpu_init(void)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index ac470e1ea102..43b009a97f23 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -80,10 +80,11 @@ static unsigned int __head *fixup_int(void *ptr, unsigned long physaddr)
static bool __head check_la57_support(unsigned long physaddr)
{
- if (native_cpuid_eax(0) < 7)
- return false;
-
- if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+ /*
+ * 5-level paging is detected and enabled at the kernel decompression
+ * stage. Only check if it has been enabled there.
+ */
+ if (!(native_read_cr4() & X86_CR4_LA57))
return false;
*fixup_int(&__pgtable_l5_enabled, physaddr) = 1;
--
2.17.0
* [PATCH 5/7] x86/cpu: Move early cpu initialization into a separate translation unit
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
` (3 preceding siblings ...)
2018-05-18 10:35 ` [PATCH 4/7] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
@ 2018-05-18 10:35 ` Kirill A. Shutemov
2018-05-19 8:47 ` Thomas Gleixner
2018-05-18 10:35 ` [PATCH 6/7] x86/mm: Mark p4d_offset() __always_inline Kirill A. Shutemov
` (2 subsequent siblings)
7 siblings, 1 reply; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
__pgtable_l5_enabled shouldn't be needed after the system has booted. We
can mark it as __initdata, but that requires preparation.
This patch moves early CPU initialization into a separate translation
unit, which limits the effect of USE_EARLY_PGTABLE_L5 to less code.
Without the change, cpu_init() uses __pgtable_l5_enabled. cpu_init() is
not an __init function, which leads to a section mismatch.
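A condensed sketch of the mismatch being avoided (this assumes the
__initdata conversion done by the last patch of the series):

	/* head64.c, after the final patch: */
	unsigned int __pgtable_l5_enabled __initdata; /* discarded after boot */

	/* common.c: cpu_init() is not __init -- it also runs on CPU hotplug. */
	void cpu_init(void)
	{
		/*
		 * If this translation unit defined USE_EARLY_PGTABLE_L5,
		 * pgtable_l5_enabled() would expand to __pgtable_l5_enabled
		 * here: a non-__init function referencing __initdata, which
		 * modpost reports as a section mismatch. Keeping the early
		 * code in its own file confines the macro's effect.
		 */
	}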
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/common.c | 194 ++++-------------------------------
arch/x86/kernel/cpu/cpu.h | 7 ++
arch/x86/kernel/cpu/early.c | 159 ++++++++++++++++++++++++++++
4 files changed, 189 insertions(+), 172 deletions(-)
create mode 100644 arch/x86/kernel/cpu/early.c
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index a66229f51b12..6d88889706a8 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -19,6 +19,7 @@ CFLAGS_common.o := $(nostackp)
obj-y := intel_cacheinfo.o scattered.o topology.o
obj-y += common.o
+obj-y += early.o
obj-y += rdrand.o
obj-y += match.o
obj-y += bugs.o
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a32f3c02327f..381675c7e485 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -47,7 +47,6 @@
#include <asm/pat.h>
#include <asm/microcode.h>
#include <asm/microcode_intel.h>
-#include <asm/intel-family.h>
#include <asm/cpu_device_id.h>
#ifdef CONFIG_X86_LOCAL_APIC
@@ -98,7 +97,7 @@ static const struct cpu_dev default_cpu = {
.c_x86_vendor = X86_VENDOR_UNKNOWN,
};
-static const struct cpu_dev *this_cpu = &default_cpu;
+const struct cpu_dev *this_cpu_dev = &default_cpu;
DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
#ifdef CONFIG_X86_64
@@ -419,7 +418,7 @@ cpuid_dependent_features[] = {
{ 0, 0 }
};
-static void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
+void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
{
const struct cpuid_dependent_feature *df;
@@ -464,10 +463,10 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
if (c->x86_model >= 16)
return NULL; /* Range check */
- if (!this_cpu)
+ if (!this_cpu_dev)
return NULL;
- info = this_cpu->legacy_models;
+ info = this_cpu_dev->legacy_models;
while (info->family) {
if (info->family == c->x86)
@@ -544,7 +543,7 @@ void switch_to_new_gdt(int cpu)
load_percpu_segment(cpu);
}
-static const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
+const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
static void get_model_name(struct cpuinfo_x86 *c)
{
@@ -602,8 +601,8 @@ void cpu_detect_cache_sizes(struct cpuinfo_x86 *c)
c->x86_tlbsize += ((ebx >> 16) & 0xfff) + (ebx & 0xfff);
#else
/* do processor-specific cache resizing */
- if (this_cpu->legacy_cache_size)
- l2size = this_cpu->legacy_cache_size(c, l2size);
+ if (this_cpu_dev->legacy_cache_size)
+ l2size = this_cpu_dev->legacy_cache_size(c, l2size);
/* Allow user to override all this if necessary. */
if (cachesize_override != -1)
@@ -626,8 +625,8 @@ u16 __read_mostly tlb_lld_1g[NR_INFO];
static void cpu_detect_tlb(struct cpuinfo_x86 *c)
{
- if (this_cpu->c_detect_tlb)
- this_cpu->c_detect_tlb(c);
+ if (this_cpu_dev->c_detect_tlb)
+ this_cpu_dev->c_detect_tlb(c);
pr_info("Last level iTLB entries: 4KB %d, 2MB %d, 4MB %d\n",
tlb_lli_4k[ENTRIES], tlb_lli_2m[ENTRIES],
@@ -689,7 +688,7 @@ void detect_ht(struct cpuinfo_x86 *c)
#endif
}
-static void get_cpu_vendor(struct cpuinfo_x86 *c)
+void get_cpu_vendor(struct cpuinfo_x86 *c)
{
char *v = c->x86_vendor_id;
int i;
@@ -702,8 +701,8 @@ static void get_cpu_vendor(struct cpuinfo_x86 *c)
(cpu_devs[i]->c_ident[1] &&
!strcmp(v, cpu_devs[i]->c_ident[1]))) {
- this_cpu = cpu_devs[i];
- c->x86_vendor = this_cpu->c_x86_vendor;
+ this_cpu_dev = cpu_devs[i];
+ c->x86_vendor = this_cpu_dev->c_x86_vendor;
return;
}
}
@@ -712,7 +711,7 @@ static void get_cpu_vendor(struct cpuinfo_x86 *c)
"CPU: Your system may be unstable.\n", v);
c->x86_vendor = X86_VENDOR_UNKNOWN;
- this_cpu = &default_cpu;
+ this_cpu_dev = &default_cpu;
}
void cpu_detect(struct cpuinfo_x86 *c)
@@ -867,7 +866,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
apply_forced_caps(c);
}
-static void get_cpu_address_sizes(struct cpuinfo_x86 *c)
+void get_cpu_address_sizes(struct cpuinfo_x86 *c)
{
u32 eax, ebx, ecx, edx;
@@ -883,7 +882,7 @@ static void get_cpu_address_sizes(struct cpuinfo_x86 *c)
#endif
}
-static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
+void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_X86_32
int i;
@@ -909,155 +908,6 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#endif
}
-static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_CENTAUR, 5 },
- { X86_VENDOR_INTEL, 5 },
- { X86_VENDOR_NSC, 5 },
- { X86_VENDOR_ANY, 4 },
- {}
-};
-
-static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
- { X86_VENDOR_AMD },
- {}
-};
-
-static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c)
-{
- u64 ia32_cap = 0;
-
- if (x86_match_cpu(cpu_no_meltdown))
- return false;
-
- if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
- rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
-
- /* Rogue Data Cache Load? No! */
- if (ia32_cap & ARCH_CAP_RDCL_NO)
- return false;
-
- return true;
-}
-
-/*
- * Do minimum CPU detection early.
- * Fields really needed: vendor, cpuid_level, family, model, mask,
- * cache alignment.
- * The others are not touched to avoid unwanted side effects.
- *
- * WARNING: this function is only called on the boot CPU. Don't add code
- * here that is supposed to run on all CPUs.
- */
-static void __init early_identify_cpu(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_X86_64
- c->x86_clflush_size = 64;
- c->x86_phys_bits = 36;
- c->x86_virt_bits = 48;
-#else
- c->x86_clflush_size = 32;
- c->x86_phys_bits = 32;
- c->x86_virt_bits = 32;
-#endif
- c->x86_cache_alignment = c->x86_clflush_size;
-
- memset(&c->x86_capability, 0, sizeof c->x86_capability);
- c->extended_cpuid_level = 0;
-
- /* cyrix could have cpuid enabled via c_identify()*/
- if (have_cpuid_p()) {
- cpu_detect(c);
- get_cpu_vendor(c);
- get_cpu_cap(c);
- get_cpu_address_sizes(c);
- setup_force_cpu_cap(X86_FEATURE_CPUID);
-
- if (this_cpu->c_early_init)
- this_cpu->c_early_init(c);
-
- c->cpu_index = 0;
- filter_cpuid_features(c, false);
-
- if (this_cpu->c_bsp_init)
- this_cpu->c_bsp_init(c);
- } else {
- identify_cpu_without_cpuid(c);
- setup_clear_cpu_cap(X86_FEATURE_CPUID);
- }
-
- setup_force_cpu_cap(X86_FEATURE_ALWAYS);
-
- if (!x86_match_cpu(cpu_no_speculation)) {
- if (cpu_vulnerable_to_meltdown(c))
- setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
- setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
- setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
- }
-
- fpu__init_system(c);
-
-#ifdef CONFIG_X86_32
- /*
- * Regardless of whether PCID is enumerated, the SDM says
- * that it can't be enabled in 32-bit mode.
- */
- setup_clear_cpu_cap(X86_FEATURE_PCID);
-#endif
-
- /*
- * Later in the boot process pgtable_l5_enabled() relies on
- * cpu_feature_enabled(X86_FEATURE_LA57). If 5-level paging is not
- * enabled by this point we need to clear the feature bit to avoid
- * false-positives at the later stage.
- *
- * pgtable_l5_enabled() can be false here for several reasons:
- * - 5-level paging is disabled compile-time;
- * - it's 32-bit kernel;
- * - machine doesn't support 5-level paging;
- * - user specified 'no5lvl' in kernel command line.
- */
- if (!pgtable_l5_enabled())
- setup_clear_cpu_cap(X86_FEATURE_LA57);
-}
-
-void __init early_cpu_init(void)
-{
- const struct cpu_dev *const *cdev;
- int count = 0;
-
-#ifdef CONFIG_PROCESSOR_SELECT
- pr_info("KERNEL supported cpus:\n");
-#endif
-
- for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) {
- const struct cpu_dev *cpudev = *cdev;
-
- if (count >= X86_VENDOR_NUM)
- break;
- cpu_devs[count] = cpudev;
- count++;
-
-#ifdef CONFIG_PROCESSOR_SELECT
- {
- unsigned int j;
-
- for (j = 0; j < 2; j++) {
- if (!cpudev->c_ident[j])
- continue;
- pr_info(" %s %s\n", cpudev->c_vendor,
- cpudev->c_ident[j]);
- }
- }
-#endif
- }
- early_identify_cpu(&boot_cpu_data);
-}
-
/*
* The NOPL instruction is supposed to exist on all CPUs of family >= 6;
* unfortunately, that's not true in practice because of early VIA
@@ -1234,8 +1084,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
generic_identify(c);
- if (this_cpu->c_identify)
- this_cpu->c_identify(c);
+ if (this_cpu_dev->c_identify)
+ this_cpu_dev->c_identify(c);
/* Clear/Set all flags overridden by options, after probe */
apply_forced_caps(c);
@@ -1254,8 +1104,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
* At the end of this section, c->x86_capability better
* indicate the features this CPU genuinely supports!
*/
- if (this_cpu->c_init)
- this_cpu->c_init(c);
+ if (this_cpu_dev->c_init)
+ this_cpu_dev->c_init(c);
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
@@ -1389,7 +1239,7 @@ void print_cpu_info(struct cpuinfo_x86 *c)
const char *vendor = NULL;
if (c->x86_vendor < X86_VENDOR_NUM) {
- vendor = this_cpu->c_vendor;
+ vendor = this_cpu_dev->c_vendor;
} else {
if (c->cpuid_level >= 0)
vendor = c->x86_vendor_id;
@@ -1763,8 +1613,8 @@ void cpu_init(void)
static void bsp_resume(void)
{
- if (this_cpu->c_bsp_resume)
- this_cpu->c_bsp_resume(&boot_cpu_data);
+ if (this_cpu_dev->c_bsp_resume)
+ this_cpu_dev->c_bsp_resume(&boot_cpu_data);
}
static struct syscore_ops cpu_syscore_ops = {
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index e806b11a99af..d633835b59ee 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -45,8 +45,15 @@ struct _tlb_table {
extern const struct cpu_dev *const __x86_cpu_dev_start[],
*const __x86_cpu_dev_end[];
+extern const struct cpu_dev *cpu_devs[];
+extern const struct cpu_dev *this_cpu_dev;
+
extern void get_cpu_cap(struct cpuinfo_x86 *c);
+extern void get_cpu_vendor(struct cpuinfo_x86 *c);
+extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
+extern void identify_cpu_without_cpuid(struct cpuinfo_x86 *c);
+extern void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn);
unsigned int aperfmperf_get_khz(int cpu);
diff --git a/arch/x86/kernel/cpu/early.c b/arch/x86/kernel/cpu/early.c
new file mode 100644
index 000000000000..cb42c1d909f6
--- /dev/null
+++ b/arch/x86/kernel/cpu/early.c
@@ -0,0 +1,159 @@
+#include <linux/linkage.h>
+#include <linux/kernel.h>
+
+#include <asm/processor.h>
+#include <asm/cpu.h>
+#include <asm/cpu_device_id.h>
+#include <asm/intel-family.h>
+#include <asm/fpu/internal.h>
+
+#include "cpu.h"
+
+static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_CENTAUR, 5 },
+ { X86_VENDOR_INTEL, 5 },
+ { X86_VENDOR_NSC, 5 },
+ { X86_VENDOR_ANY, 4 },
+ {}
+};
+
+static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
+ { X86_VENDOR_AMD },
+ {}
+};
+
+static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c)
+{
+ u64 ia32_cap = 0;
+
+ if (x86_match_cpu(cpu_no_meltdown))
+ return false;
+
+ if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+ /* Rogue Data Cache Load? No! */
+ if (ia32_cap & ARCH_CAP_RDCL_NO)
+ return false;
+
+ return true;
+}
+
+/*
+ * Do minimum CPU detection early.
+ * Fields really needed: vendor, cpuid_level, family, model, mask,
+ * cache alignment.
+ * The others are not touched to avoid unwanted side effects.
+ *
+ * WARNING: this function is only called on the boot CPU. Don't add code
+ * here that is supposed to run on all CPUs.
+ */
+static void __init early_identify_cpu(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_X86_64
+ c->x86_clflush_size = 64;
+ c->x86_phys_bits = 36;
+ c->x86_virt_bits = 48;
+#else
+ c->x86_clflush_size = 32;
+ c->x86_phys_bits = 32;
+ c->x86_virt_bits = 32;
+#endif
+ c->x86_cache_alignment = c->x86_clflush_size;
+
+ memset(&c->x86_capability, 0, sizeof c->x86_capability);
+ c->extended_cpuid_level = 0;
+
+ /* cyrix could have cpuid enabled via c_identify()*/
+ if (have_cpuid_p()) {
+ cpu_detect(c);
+ get_cpu_vendor(c);
+ get_cpu_cap(c);
+ get_cpu_address_sizes(c);
+ setup_force_cpu_cap(X86_FEATURE_CPUID);
+
+ if (this_cpu_dev->c_early_init)
+ this_cpu_dev->c_early_init(c);
+
+ c->cpu_index = 0;
+ filter_cpuid_features(c, false);
+
+ if (this_cpu_dev->c_bsp_init)
+ this_cpu_dev->c_bsp_init(c);
+ } else {
+ identify_cpu_without_cpuid(c);
+ setup_clear_cpu_cap(X86_FEATURE_CPUID);
+ }
+
+ setup_force_cpu_cap(X86_FEATURE_ALWAYS);
+
+ if (!x86_match_cpu(cpu_no_speculation)) {
+ if (cpu_vulnerable_to_meltdown(c))
+ setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+ }
+
+ fpu__init_system(c);
+
+#ifdef CONFIG_X86_32
+ /*
+ * Regardless of whether PCID is enumerated, the SDM says
+ * that it can't be enabled in 32-bit mode.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+#endif
+
+ /*
+ * Later in the boot process pgtable_l5_enabled() relies on
+ * cpu_feature_enabled(X86_FEATURE_LA57). If 5-level paging is not
+ * enabled by this point we need to clear the feature bit to avoid
+ * false-positives at the later stage.
+ *
+ * pgtable_l5_enabled() can be false here for several reasons:
+ * - 5-level paging is disabled compile-time;
+ * - it's 32-bit kernel;
+ * - machine doesn't support 5-level paging;
+ * - user specified 'no5lvl' in kernel command line.
+ */
+ if (!pgtable_l5_enabled())
+ setup_clear_cpu_cap(X86_FEATURE_LA57);
+}
+
+void __init early_cpu_init(void)
+{
+ const struct cpu_dev *const *cdev;
+ int count = 0;
+
+#ifdef CONFIG_PROCESSOR_SELECT
+ pr_info("KERNEL supported cpus:\n");
+#endif
+
+ for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) {
+ const struct cpu_dev *cpudev = *cdev;
+
+ if (count >= X86_VENDOR_NUM)
+ break;
+ cpu_devs[count] = cpudev;
+ count++;
+
+#ifdef CONFIG_PROCESSOR_SELECT
+ {
+ unsigned int j;
+
+ for (j = 0; j < 2; j++) {
+ if (!cpudev->c_ident[j])
+ continue;
+ pr_info(" %s %s\n", cpudev->c_vendor,
+ cpudev->c_ident[j]);
+ }
+ }
+#endif
+ }
+ early_identify_cpu(&boot_cpu_data);
+}
--
2.17.0
* [PATCH 6/7] x86/mm: Mark p4d_offset() __always_inline
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
` (4 preceding siblings ...)
2018-05-18 10:35 ` [PATCH 5/7] x86/cpu: Move early cpu initialization into a separate translation unit Kirill A. Shutemov
@ 2018-05-18 10:35 ` Kirill A. Shutemov
2018-05-19 8:47 ` Thomas Gleixner
2018-05-19 11:35 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
2018-05-18 10:35 ` [PATCH 7/7] x86/mm: Mark __pgtable_l5_enabled __initdata Kirill A. Shutemov
2018-05-19 8:49 ` [PATCHv5 0/7] 5-level paging changes for v4.18 Thomas Gleixner
7 siblings, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
__pgtable_l5_enabled shouldn't be needed after the system has booted. We
can mark it as __initdata, but that requires preparation.
The KASAN initialization code is a user of USE_EARLY_PGTABLE_L5, so all
pgtable_l5_enabled() calls translate to __pgtable_l5_enabled there,
including the one in p4d_offset().
That may lead to a section mismatch if the compiler does not inline
p4d_offset() but leaves it as a standalone function: p4d_offset() is not
marked __init.
Marking p4d_offset() __always_inline fixes the issue.
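Schematically, the failure mode this prevents (a sketch; the __initdata
conversion lands in the next patch):

	/* If the compiler emitted p4d_offset() out of line in a file that
	 * defines USE_EARLY_PGTABLE_L5, the standalone copy would live in
	 * .text and reference the __initdata variable:
	 */
	p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
	{
		if (!pgtable_l5_enabled()) /* reads __pgtable_l5_enabled */
			return (p4d_t *)pgd;
		return (p4d_t *)pgd_page_vaddr(*pgd) + p4d_index(address);
	}
	/* .text -> .init.data is what modpost flags; __always_inline makes
	 * sure every copy is inlined into its __init callers instead.
	 */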
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/include/asm/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5715647fc4fe..99ecde23c3ec 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -898,7 +898,7 @@ static inline unsigned long pgd_page_vaddr(pgd_t pgd)
#define pgd_page(pgd) pfn_to_page(pgd_pfn(pgd))
/* to find an entry in a page-table-directory. */
-static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
+static __always_inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
if (!pgtable_l5_enabled())
return (p4d_t *)pgd;
--
2.17.0
* [PATCH 7/7] x86/mm: Mark __pgtable_l5_enabled __initdata
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
` (5 preceding siblings ...)
2018-05-18 10:35 ` [PATCH 6/7] x86/mm: Mark p4d_offset() __always_inline Kirill A. Shutemov
@ 2018-05-18 10:35 ` Kirill A. Shutemov
2018-05-19 8:48 ` Thomas Gleixner
2018-05-19 11:36 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
2018-05-19 8:49 ` [PATCHv5 0/7] 5-level paging changes for v4.18 Thomas Gleixner
7 siblings, 2 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-05-18 10:35 UTC
To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov
__pgtable_l5_enabled shouldn't be needed after the system has booted.
All preparation is done; we can now mark it as __initdata.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
arch/x86/kernel/head64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 43b009a97f23..b56160efb1f9 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -44,7 +44,7 @@ static unsigned int __initdata next_early_pgt;
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
#ifdef CONFIG_X86_5LEVEL
-unsigned int __pgtable_l5_enabled __ro_after_init;
+unsigned int __pgtable_l5_enabled __initdata;
unsigned int pgdir_shift __ro_after_init = 39;
EXPORT_SYMBOL(pgdir_shift);
unsigned int ptrs_per_p4d __ro_after_init = 1;
--
2.17.0
* Re: [PATCH 1/7] x86/boot/compressed/64: Fix trampoline page table address calculation
2018-05-18 10:35 ` [PATCH 1/7] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
@ 2018-05-19 8:43 ` Thomas Gleixner
2018-05-19 11:33 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:43 UTC
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> Hugh noticed that I calculate the address of the trampoline page table
> wrongly in cleanup_trampoline(): TRAMPOLINE_32BIT_PGTABLE_OFFSET has to
> be divided by sizeof(unsigned long), since trampoline_32bit is an
> unsigned long pointer.
>
> TRAMPOLINE_32BIT_PGTABLE_OFFSET happens to be zero, so the bug has no
> visible effect.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reported-by: Hugh Dickins <hughd@google.com>
> Fixes: e9d0e6330eb8 ("x86/boot/compressed/64: Prepare new top-level page table for trampoline")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [PATCH 2/7] x86/mm: Unify pgtable_l5_enabled usage in early boot code
2018-05-18 10:35 ` [PATCH 2/7] x86/mm: Unify pgtable_l5_enabled usage in early boot code Kirill A. Shutemov
@ 2018-05-19 8:44 ` Thomas Gleixner
2018-05-19 11:34 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:44 UTC
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> Usually, pgtable_l5_enabled is defined using cpu_feature_enabled().
> cpu_feature_enabled() is not available in early boot code, so we use
> several different preprocessor tricks to get around it. It's messy.
>
> Unify them all.
>
> If cpu_feature_enabled() is not yet available, USE_EARLY_PGTABLE_L5 can
> be defined before all includes. It makes pgtable_l5_enabled rely on the
> __pgtable_l5_enabled variable instead. This approach fits all early
> users.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [PATCH 3/7] x86/mm: Stop pretending pgtable_l5_enabled is a variable
2018-05-18 10:35 ` [PATCH 3/7] x86/mm: Stop pretending pgtable_l5_enabled is a variable Kirill A. Shutemov
@ 2018-05-19 8:45 ` Thomas Gleixner
2018-05-19 11:34 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:45 UTC
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> pgtable_l5_enabled is defined using cpu_feature_enabled() but we refer
> to it as a variable. This is misleading.
>
> Make pgtable_l5_enabled() a function.
>
> We cannot literally define it as a function due to circular dependencies
> between header files. A function-like macro is close enough.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [PATCH 4/7] x86/mm: Introduce 'no5lvl' kernel parameter
2018-05-18 10:35 ` [PATCH 4/7] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
@ 2018-05-19 8:46 ` Thomas Gleixner
2018-05-19 11:35 ` [tip:x86/boot] x86/mm: Introduce the " tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:46 UTC
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> The kernel parameter allows forcing the kernel to use 4-level paging even
> if the hardware and kernel support 5-level paging.
>
> The option may be useful for working around regressions related to 5-level
> paging.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [PATCH 5/7] x86/cpu: Move early cpu initialization into a separate translation unit
2018-05-18 10:35 ` [PATCH 5/7] x86/cpu: Move early cpu initialization into a separate translation unit Kirill A. Shutemov
@ 2018-05-19 8:47 ` Thomas Gleixner
2018-06-05 10:19 ` Kirill A. Shutemov
0 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:47 UTC
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> __pgtable_l5_enabled shouldn't be needed after the system has booted. We
> can mark it as __initdata, but that requires preparation.
>
> This patch moves early cpu initialization into a separate translation
> unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.
>
> Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
> not an __init function, which leads to a section mismatch.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
This makes a lot of sense independent of 5level changes.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [PATCH 6/7] x86/mm: Mark p4d_offset() __always_inline
2018-05-18 10:35 ` [PATCH 6/7] x86/mm: Mark p4d_offset() __always_inline Kirill A. Shutemov
@ 2018-05-19 8:47 ` Thomas Gleixner
2018-05-19 11:35 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:47 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> __pgtable_l5_enabled shouldn't be needed after the system has booted. We
> can mark it as __initdata, but that requires preparation.
>
> KASAN initialization code is a user of USE_EARLY_PGTABLE_L5, so all
> pgtable_l5_enabled() calls are translated to __pgtable_l5_enabled there,
> including the one in p4d_offset().
>
> It may lead to a section mismatch if the compiler does not inline
> p4d_offset() but leaves it as a standalone function: p4d_offset() is not
> marked __init.
>
> Marking p4d_offset() as __always_inline fixes the issue.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [PATCH 7/7] x86/mm: Mark __pgtable_l5_enabled __initdata
2018-05-18 10:35 ` [PATCH 7/7] x86/mm: Mark __pgtable_l5_enabled __initdata Kirill A. Shutemov
@ 2018-05-19 8:48 ` Thomas Gleixner
2018-05-19 11:36 ` [tip:x86/boot] " tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:48 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> __pgtable_l5_enabled shouldn't be needed after the system has booted.
> All preparation is done. We can now mark it as __initdata.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* Re: [PATCHv5 0/7] 5-level paging changes for v4.18
2018-05-18 10:35 [PATCHv5 0/7] 5-level paging changes for v4.18 Kirill A. Shutemov
` (6 preceding siblings ...)
2018-05-18 10:35 ` [PATCH 7/7] x86/mm: Mark __pgtable_l5_enabled __initdata Kirill A. Shutemov
@ 2018-05-19 8:49 ` Thomas Gleixner
7 siblings, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2018-05-19 8:49 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Fri, 18 May 2018, Kirill A. Shutemov wrote:
> Here are several patches that I would like to queue for v4.18. Please review
> and consider applying.
>
> In this version I've addressed Thomas' feedback.
>
> Changing __pgtable_l5_enabled to __initdata is not as trivial as I hoped.
> It requires a few tricks to avoid a section mismatch. I'm not sure it's
> worth the gain. We can keep it __ro_after_init.
>
> If you feel it's too invasive, just drop the last three patches.
Well done. Thanks for cleaning it up.
Thanks,
tglx
* [tip:x86/boot] x86/boot/compressed/64: Fix trampoline page table address calculation
2018-05-18 10:35 ` [PATCH 1/7] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
2018-05-19 8:43 ` Thomas Gleixner
@ 2018-05-19 11:33 ` tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Kirill A. Shutemov @ 2018-05-19 11:33 UTC (permalink / raw)
To: linux-tip-commits
Cc: tglx, kirill.shutemov, peterz, torvalds, hughd, hpa, linux-kernel, mingo
Commit-ID: 30bbf728ba91b1e8b0e539126cd105ad7e2fa16a
Gitweb: https://git.kernel.org/tip/30bbf728ba91b1e8b0e539126cd105ad7e2fa16a
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate: Fri, 18 May 2018 13:35:22 +0300
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 19 May 2018 11:56:57 +0200
x86/boot/compressed/64: Fix trampoline page table address calculation
Hugh noticed that we calculate the address of the trampoline page table
incorrectly in cleanup_trampoline().
TRAMPOLINE_32BIT_PGTABLE_OFFSET has to be divided by sizeof(unsigned long),
since trampoline_32bit is an 'unsigned long' pointer.
TRAMPOLINE_32BIT_PGTABLE_OFFSET is zero, so the bug doesn't have a
visible effect.
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e9d0e6330eb8 ("x86/boot/compressed/64: Prepare new top-level page table for trampoline")
Link: http://lkml.kernel.org/r/20180518103528.59260-2-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/boot/compressed/pgtable_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index a362fa0b849c..23707e1da1ff 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -130,7 +130,7 @@ void cleanup_trampoline(void *pgtable)
{
void *trampoline_pgtable;
- trampoline_pgtable = trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET;
+ trampoline_pgtable = trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET / sizeof(unsigned long);
/*
* Move the top level page table out of trampoline memory,
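The bug class here is ordinary C pointer arithmetic: adding N to an unsigned long pointer advances N elements, not N bytes, so a byte-denominated offset has to be divided by the element size first. A minimal standalone sketch of the same mistake (illustrative only, not taken from the kernel sources):

#include <stdio.h>

int main(void)
{
        unsigned long buf[64] = { 0 };
        unsigned long *base = buf;
        unsigned long byte_off = 16;    /* a byte offset, like TRAMPOLINE_32BIT_PGTABLE_OFFSET */

        /* Wrong: advances byte_off elements == byte_off * sizeof(unsigned long) bytes. */
        void *wrong = base + byte_off;

        /* Right: scale the byte offset down to elements first. */
        void *right = base + byte_off / sizeof(unsigned long);

        printf("wrong: +%td bytes, right: +%td bytes\n",
               (char *)wrong - (char *)buf, (char *)right - (char *)buf);
        return 0;
}

On an LP64 target this prints "wrong: +128 bytes, right: +16 bytes"; the kernel got away with the wrong form only because the offset happens to be zero.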
* [tip:x86/boot] x86/mm: Unify pgtable_l5_enabled usage in early boot code
2018-05-18 10:35 ` [PATCH 2/7] x86/mm: Unify pgtable_l5_enabled usage in early boot code Kirill A. Shutemov
2018-05-19 8:44 ` Thomas Gleixner
@ 2018-05-19 11:34 ` tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Kirill A. Shutemov @ 2018-05-19 11:34 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, tglx, torvalds, hpa, peterz, kirill.shutemov, hughd, mingo
Commit-ID: ad3fe525b9507d8d750d60e8e5dd8e0c0836fb99
Gitweb: https://git.kernel.org/tip/ad3fe525b9507d8d750d60e8e5dd8e0c0836fb99
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate: Fri, 18 May 2018 13:35:23 +0300
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 19 May 2018 11:56:57 +0200
x86/mm: Unify pgtable_l5_enabled usage in early boot code
Usually pgtable_l5_enabled is defined using cpu_feature_enabled().
cpu_feature_enabled() is not available in early boot code. We use
several different preprocessor tricks to get around it. It's messy.
Unify them all.
If cpu_feature_enabled() is not yet available, USE_EARLY_PGTABLE_L5 can
be defined before all includes. It makes pgtable_l5_enabled rely on
the __pgtable_l5_enabled variable instead. This approach fits all early
users.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180518103528.59260-3-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/boot/compressed/kaslr.c | 4 ++--
arch/x86/boot/compressed/misc.h | 6 ++----
arch/x86/include/asm/pgtable_64_types.h | 13 ++++++++++---
arch/x86/kernel/head64.c | 12 +++++-------
arch/x86/mm/kasan_init_64.c | 6 ++----
5 files changed, 21 insertions(+), 20 deletions(-)
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index a0a50b91ecef..b87a7582853d 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -47,7 +47,7 @@
#include <linux/decompress/mm.h>
#ifdef CONFIG_X86_5LEVEL
-unsigned int pgtable_l5_enabled __ro_after_init;
+unsigned int __pgtable_l5_enabled;
unsigned int pgdir_shift __ro_after_init = 39;
unsigned int ptrs_per_p4d __ro_after_init = 1;
#endif
@@ -734,7 +734,7 @@ void choose_random_location(unsigned long input,
#ifdef CONFIG_X86_5LEVEL
if (__read_cr4() & X86_CR4_LA57) {
- pgtable_l5_enabled = 1;
+ __pgtable_l5_enabled = 1;
pgdir_shift = 48;
ptrs_per_p4d = 512;
}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 9e11be4cae19..a423bdb42686 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -12,10 +12,8 @@
#undef CONFIG_PARAVIRT_SPINLOCKS
#undef CONFIG_KASAN
-#ifdef CONFIG_X86_5LEVEL
-/* cpu_feature_enabled() cannot be used that early */
-#define pgtable_l5_enabled __pgtable_l5_enabled
-#endif
+/* cpu_feature_enabled() cannot be used this early */
+#define USE_EARLY_PGTABLE_L5
#include <linux/linkage.h>
#include <linux/screen_info.h>
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index adb47552e6bb..c14a4116a693 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -22,12 +22,19 @@ typedef struct { pteval_t pte; } pte_t;
#ifdef CONFIG_X86_5LEVEL
extern unsigned int __pgtable_l5_enabled;
-#ifndef pgtable_l5_enabled
+
+#ifdef USE_EARLY_PGTABLE_L5
+/*
+ * cpu_feature_enabled() is not available in early boot code.
+ * Use variable instead.
+ */
+#define pgtable_l5_enabled __pgtable_l5_enabled
+#else
#define pgtable_l5_enabled cpu_feature_enabled(X86_FEATURE_LA57)
-#endif
+#endif /* USE_EARLY_PGTABLE_L5 */
#else
#define pgtable_l5_enabled 0
-#endif
+#endif /* CONFIG_X86_5LEVEL */
extern unsigned int pgdir_shift;
extern unsigned int ptrs_per_p4d;
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 2d29e47c056e..494fea1dbd6e 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -6,6 +6,10 @@
*/
#define DISABLE_BRANCH_PROFILING
+
+/* cpu_feature_enabled() cannot be used this early */
+#define USE_EARLY_PGTABLE_L5
+
#include <linux/init.h>
#include <linux/linkage.h>
#include <linux/types.h>
@@ -32,11 +36,6 @@
#include <asm/microcode.h>
#include <asm/kasan.h>
-#ifdef CONFIG_X86_5LEVEL
-#undef pgtable_l5_enabled
-#define pgtable_l5_enabled __pgtable_l5_enabled
-#endif
-
/*
* Manage page tables very early on.
*/
@@ -46,7 +45,6 @@ pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
#ifdef CONFIG_X86_5LEVEL
unsigned int __pgtable_l5_enabled __ro_after_init;
-EXPORT_SYMBOL(__pgtable_l5_enabled);
unsigned int pgdir_shift __ro_after_init = 39;
EXPORT_SYMBOL(pgdir_shift);
unsigned int ptrs_per_p4d __ro_after_init = 1;
@@ -88,7 +86,7 @@ static bool __head check_la57_support(unsigned long physaddr)
if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
return false;
- *fixup_int(&pgtable_l5_enabled, physaddr) = 1;
+ *fixup_int(&__pgtable_l5_enabled, physaddr) = 1;
*fixup_int(&pgdir_shift, physaddr) = 48;
*fixup_int(&ptrs_per_p4d, physaddr) = 512;
*fixup_long(&page_offset_base, physaddr) = __PAGE_OFFSET_BASE_L5;
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 980dbebd0ca7..340bb9b32e01 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -2,10 +2,8 @@
#define DISABLE_BRANCH_PROFILING
#define pr_fmt(fmt) "kasan: " fmt
-#ifdef CONFIG_X86_5LEVEL
-/* Too early to use cpu_feature_enabled() */
-#define pgtable_l5_enabled __pgtable_l5_enabled
-#endif
+/* cpu_feature_enabled() cannot be used this early */
+#define USE_EARLY_PGTABLE_L5
#include <linux/bootmem.h>
#include <linux/kasan.h>
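The mechanism this patch converges on is small enough to model in full: the header chooses an expansion for pgtable_l5_enabled based on a macro that the including file may define before any #include. A reduced sketch, with illustrative file and helper names standing in for the real kernel ones:

/* pgtable_l5.h -- simplified model of pgtable_64_types.h */
#ifndef PGTABLE_L5_H
#define PGTABLE_L5_H

extern unsigned int __pgtable_l5_enabled;
int cpu_feature_enabled_la57(void);     /* stand-in for cpu_feature_enabled(X86_FEATURE_LA57) */

#ifdef USE_EARLY_PGTABLE_L5
/* Early boot: the feature framework is not up yet; read the variable. */
#define pgtable_l5_enabled __pgtable_l5_enabled
#else
/* Normal operation: the regular feature check. */
#define pgtable_l5_enabled cpu_feature_enabled_la57()
#endif

#endif /* PGTABLE_L5_H */

/* early_user.c -- an early-boot file opts in before ANY include */
#define USE_EARLY_PGTABLE_L5
#include "pgtable_l5.h"

int early_check(void)
{
        return pgtable_l5_enabled;      /* expands to __pgtable_l5_enabled */
}

The point of the cleanup is that the opt-in is a single well-known macro, rather than each early file redefining pgtable_l5_enabled by hand.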
* [tip:x86/boot] x86/mm: Stop pretending pgtable_l5_enabled is a variable
2018-05-18 10:35 ` [PATCH 3/7] x86/mm: Stop pretending pgtable_l5_enabled is a variable Kirill A. Shutemov
2018-05-19 8:45 ` Thomas Gleixner
@ 2018-05-19 11:34 ` tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Kirill A. Shutemov @ 2018-05-19 11:34 UTC (permalink / raw)
To: linux-tip-commits
Cc: hpa, peterz, torvalds, mingo, tglx, linux-kernel, kirill.shutemov, hughd
Commit-ID: ed7588d5dc6f5e7202fb9bbeb14d94706ba225d7
Gitweb: https://git.kernel.org/tip/ed7588d5dc6f5e7202fb9bbeb14d94706ba225d7
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate: Fri, 18 May 2018 13:35:24 +0300
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 19 May 2018 11:56:57 +0200
x86/mm: Stop pretending pgtable_l5_enabled is a variable
pgtable_l5_enabled is defined using cpu_feature_enabled() but we refer
to it as a variable. This is misleading.
Make pgtable_l5_enabled() a function.
We cannot literally define it as a function due to circular dependencies
between header files. A function-like macro is close enough.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180518103528.59260-4-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/page_64_types.h | 2 +-
arch/x86/include/asm/paravirt.h | 4 ++--
arch/x86/include/asm/pgalloc.h | 4 ++--
arch/x86/include/asm/pgtable.h | 10 +++++-----
arch/x86/include/asm/pgtable_32_types.h | 2 +-
arch/x86/include/asm/pgtable_64.h | 2 +-
arch/x86/include/asm/pgtable_64_types.h | 14 +++++++++-----
arch/x86/include/asm/sparsemem.h | 4 ++--
arch/x86/kernel/head64.c | 2 +-
arch/x86/kernel/machine_kexec_64.c | 3 ++-
arch/x86/mm/dump_pagetables.c | 6 +++---
arch/x86/mm/fault.c | 4 ++--
arch/x86/mm/ident_map.c | 2 +-
arch/x86/mm/init_64.c | 8 ++++----
arch/x86/mm/kasan_init_64.c | 8 ++++----
arch/x86/mm/kaslr.c | 8 ++++----
arch/x86/mm/tlb.c | 2 +-
arch/x86/platform/efi/efi_64.c | 2 +-
arch/x86/power/hibernate_64.c | 2 +-
19 files changed, 47 insertions(+), 42 deletions(-)
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 2c5a966dc222..6afac386a434 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -53,7 +53,7 @@
#define __PHYSICAL_MASK_SHIFT 52
#ifdef CONFIG_X86_5LEVEL
-#define __VIRTUAL_MASK_SHIFT (pgtable_l5_enabled ? 56 : 47)
+#define __VIRTUAL_MASK_SHIFT (pgtable_l5_enabled() ? 56 : 47)
#else
#define __VIRTUAL_MASK_SHIFT 47
#endif
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 9be2bf13825b..d49bbf4bb5c8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -574,14 +574,14 @@ static inline void __set_pgd(pgd_t *pgdp, pgd_t pgd)
}
#define set_pgd(pgdp, pgdval) do { \
- if (pgtable_l5_enabled) \
+ if (pgtable_l5_enabled()) \
__set_pgd(pgdp, pgdval); \
else \
set_p4d((p4d_t *)(pgdp), (p4d_t) { (pgdval).pgd }); \
} while (0)
#define pgd_clear(pgdp) do { \
- if (pgtable_l5_enabled) \
+ if (pgtable_l5_enabled()) \
set_pgd(pgdp, __pgd(0)); \
} while (0)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index 263c142a6a6c..ada6410fd2ec 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -167,7 +167,7 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
#if CONFIG_PGTABLE_LEVELS > 4
static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return;
paravirt_alloc_p4d(mm, __pa(p4d) >> PAGE_SHIFT);
set_pgd(pgd, __pgd(_PAGE_TABLE | __pa(p4d)));
@@ -193,7 +193,7 @@ extern void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d);
static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long address)
{
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
___p4d_free_tlb(tlb, p4d);
}
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f1633de5a675..5715647fc4fe 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -65,7 +65,7 @@ extern pmdval_t early_pmd_flags;
#ifndef __PAGETABLE_P4D_FOLDED
#define set_pgd(pgdp, pgd) native_set_pgd(pgdp, pgd)
-#define pgd_clear(pgd) (pgtable_l5_enabled ? native_pgd_clear(pgd) : 0)
+#define pgd_clear(pgd) (pgtable_l5_enabled() ? native_pgd_clear(pgd) : 0)
#endif
#ifndef set_p4d
@@ -881,7 +881,7 @@ static inline unsigned long p4d_index(unsigned long address)
#if CONFIG_PGTABLE_LEVELS > 4
static inline int pgd_present(pgd_t pgd)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return 1;
return pgd_flags(pgd) & _PAGE_PRESENT;
}
@@ -900,7 +900,7 @@ static inline unsigned long pgd_page_vaddr(pgd_t pgd)
/* to find an entry in a page-table-directory. */
static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return (p4d_t *)pgd;
return (p4d_t *)pgd_page_vaddr(*pgd) + p4d_index(address);
}
@@ -909,7 +909,7 @@ static inline int pgd_bad(pgd_t pgd)
{
unsigned long ignore_flags = _PAGE_USER;
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return 0;
if (IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION))
@@ -920,7 +920,7 @@ static inline int pgd_bad(pgd_t pgd)
static inline int pgd_none(pgd_t pgd)
{
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return 0;
/*
* There is no need to do a workaround for the KNL stray
diff --git a/arch/x86/include/asm/pgtable_32_types.h b/arch/x86/include/asm/pgtable_32_types.h
index e3225e83db7d..d9a001a4a872 100644
--- a/arch/x86/include/asm/pgtable_32_types.h
+++ b/arch/x86/include/asm/pgtable_32_types.h
@@ -15,7 +15,7 @@
# include <asm/pgtable-2level_types.h>
#endif
-#define pgtable_l5_enabled 0
+#define pgtable_l5_enabled() 0
#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE - 1))
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 877bc27718ae..3c5385f9a88f 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -220,7 +220,7 @@ static inline void native_set_p4d(p4d_t *p4dp, p4d_t p4d)
{
pgd_t pgd;
- if (pgtable_l5_enabled || !IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION)) {
+ if (pgtable_l5_enabled() || !IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION)) {
*p4dp = p4d;
return;
}
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index c14a4116a693..054765ab2da2 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -28,12 +28,16 @@ extern unsigned int __pgtable_l5_enabled;
* cpu_feature_enabled() is not available in early boot code.
* Use variable instead.
*/
-#define pgtable_l5_enabled __pgtable_l5_enabled
+static inline bool pgtable_l5_enabled(void)
+{
+ return __pgtable_l5_enabled;
+}
#else
-#define pgtable_l5_enabled cpu_feature_enabled(X86_FEATURE_LA57)
+#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
#endif /* USE_EARLY_PGTABLE_L5 */
+
#else
-#define pgtable_l5_enabled 0
+#define pgtable_l5_enabled() 0
#endif /* CONFIG_X86_5LEVEL */
extern unsigned int pgdir_shift;
@@ -109,7 +113,7 @@ extern unsigned int ptrs_per_p4d;
#define LDT_PGD_ENTRY_L4 -3UL
#define LDT_PGD_ENTRY_L5 -112UL
-#define LDT_PGD_ENTRY (pgtable_l5_enabled ? LDT_PGD_ENTRY_L5 : LDT_PGD_ENTRY_L4)
+#define LDT_PGD_ENTRY (pgtable_l5_enabled() ? LDT_PGD_ENTRY_L5 : LDT_PGD_ENTRY_L4)
#define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
#define __VMALLOC_BASE_L4 0xffffc90000000000UL
@@ -123,7 +127,7 @@ extern unsigned int ptrs_per_p4d;
#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
# define VMALLOC_START vmalloc_base
-# define VMALLOC_SIZE_TB (pgtable_l5_enabled ? VMALLOC_SIZE_TB_L5 : VMALLOC_SIZE_TB_L4)
+# define VMALLOC_SIZE_TB (pgtable_l5_enabled() ? VMALLOC_SIZE_TB_L5 : VMALLOC_SIZE_TB_L4)
# define VMEMMAP_START vmemmap_base
#else
# define VMALLOC_START __VMALLOC_BASE_L4
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index 4617a2bf123c..199218719a86 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -27,8 +27,8 @@
# endif
#else /* CONFIG_X86_32 */
# define SECTION_SIZE_BITS 27 /* matt - 128 is convenient right now */
-# define MAX_PHYSADDR_BITS (pgtable_l5_enabled ? 52 : 44)
-# define MAX_PHYSMEM_BITS (pgtable_l5_enabled ? 52 : 46)
+# define MAX_PHYSADDR_BITS (pgtable_l5_enabled() ? 52 : 44)
+# define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
#endif
#endif /* CONFIG_SPARSEMEM */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 494fea1dbd6e..8d372d1c266d 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -279,7 +279,7 @@ again:
* critical -- __PAGE_OFFSET would point us back into the dynamic
* range and we might end up looping forever...
*/
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
p4d_p = pgd_p;
else if (pgd)
p4d_p = (p4dval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 6010449ca6d2..4c8acdfdc5a7 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -354,7 +354,8 @@ void arch_crash_save_vmcoreinfo(void)
{
VMCOREINFO_NUMBER(phys_base);
VMCOREINFO_SYMBOL(init_top_pgt);
- VMCOREINFO_NUMBER(pgtable_l5_enabled);
+ vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
+ pgtable_l5_enabled());
#ifdef CONFIG_NUMA
VMCOREINFO_SYMBOL(node_data);
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index cc7ff5957194..2f3c9196b834 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -360,7 +360,7 @@ static inline bool kasan_page_table(struct seq_file *m, struct pg_state *st,
void *pt)
{
if (__pa(pt) == __pa(kasan_zero_pmd) ||
- (pgtable_l5_enabled && __pa(pt) == __pa(kasan_zero_p4d)) ||
+ (pgtable_l5_enabled() && __pa(pt) == __pa(kasan_zero_p4d)) ||
__pa(pt) == __pa(kasan_zero_pud)) {
pgprotval_t prot = pte_flags(kasan_zero_pte[0]);
note_page(m, st, __pgprot(prot), 0, 5);
@@ -476,8 +476,8 @@ static void walk_p4d_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
}
}
-#define pgd_large(a) (pgtable_l5_enabled ? pgd_large(a) : p4d_large(__p4d(pgd_val(a))))
-#define pgd_none(a) (pgtable_l5_enabled ? pgd_none(a) : p4d_none(__p4d(pgd_val(a))))
+#define pgd_large(a) (pgtable_l5_enabled() ? pgd_large(a) : p4d_large(__p4d(pgd_val(a))))
+#define pgd_none(a) (pgtable_l5_enabled() ? pgd_none(a) : p4d_none(__p4d(pgd_val(a))))
static inline bool is_hypervisor_range(int idx)
{
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 73bd8c95ac71..77ec014554e7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -439,7 +439,7 @@ static noinline int vmalloc_fault(unsigned long address)
if (pgd_none(*pgd_k))
return -1;
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
if (pgd_none(*pgd)) {
set_pgd(pgd, *pgd_k);
arch_flush_lazy_mmu_mode();
@@ -454,7 +454,7 @@ static noinline int vmalloc_fault(unsigned long address)
if (p4d_none(*p4d_k))
return -1;
- if (p4d_none(*p4d) && !pgtable_l5_enabled) {
+ if (p4d_none(*p4d) && !pgtable_l5_enabled()) {
set_p4d(p4d, *p4d_k);
arch_flush_lazy_mmu_mode();
} else {
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index a2f0c7e20fb0..fe7a12599d8e 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -123,7 +123,7 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
result = ident_p4d_init(info, p4d, addr, next);
if (result)
return result;
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
set_pgd(pgd, __pgd(__pa(p4d) | info->kernpg_flag));
} else {
/*
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0a400606dea0..17383f9677fa 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -180,7 +180,7 @@ static void sync_global_pgds_l4(unsigned long start, unsigned long end)
*/
void sync_global_pgds(unsigned long start, unsigned long end)
{
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
sync_global_pgds_l5(start, end);
else
sync_global_pgds_l4(start, end);
@@ -643,7 +643,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
unsigned long vaddr = (unsigned long)__va(paddr);
int i = p4d_index(vaddr);
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return phys_pud_init((pud_t *) p4d_page, paddr, paddr_end, page_size_mask);
for (; i < PTRS_PER_P4D; i++, paddr = paddr_next) {
@@ -723,7 +723,7 @@ kernel_physical_mapping_init(unsigned long paddr_start,
page_size_mask);
spin_lock(&init_mm.page_table_lock);
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
pgd_populate(&init_mm, pgd, p4d);
else
p4d_populate(&init_mm, p4d_offset(pgd, vaddr), (pud_t *) p4d);
@@ -1100,7 +1100,7 @@ remove_p4d_table(p4d_t *p4d_start, unsigned long addr, unsigned long end,
* 5-level case we should free them. This code will have to change
* to adapt for boot-time switching between 4 and 5 level page tables.
*/
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
free_pud_table(pud_base, p4d);
}
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 340bb9b32e01..e3e77527f8df 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -180,7 +180,7 @@ static void __init clear_pgds(unsigned long start,
* With folded p4d, pgd_clear() is nop, use p4d_clear()
* instead.
*/
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
pgd_clear(pgd);
else
p4d_clear(p4d_offset(pgd, start));
@@ -195,7 +195,7 @@ static inline p4d_t *early_p4d_offset(pgd_t *pgd, unsigned long addr)
{
unsigned long p4d;
- if (!pgtable_l5_enabled)
+ if (!pgtable_l5_enabled())
return (p4d_t *)pgd;
p4d = __pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK;
@@ -282,7 +282,7 @@ void __init kasan_early_init(void)
for (i = 0; i < PTRS_PER_PUD; i++)
kasan_zero_pud[i] = __pud(pud_val);
- for (i = 0; pgtable_l5_enabled && i < PTRS_PER_P4D; i++)
+ for (i = 0; pgtable_l5_enabled() && i < PTRS_PER_P4D; i++)
kasan_zero_p4d[i] = __p4d(p4d_val);
kasan_map_early_shadow(early_top_pgt);
@@ -313,7 +313,7 @@ void __init kasan_init(void)
* bunch of things like kernel code, modules, EFI mapping, etc.
* We need to take extra steps to not overwrite them.
*/
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
void *ptr;
ptr = (void *)pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_END));
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 615cc03ced84..61db77b0eda9 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -78,7 +78,7 @@ void __init kernel_randomize_memory(void)
struct rnd_state rand_state;
unsigned long remain_entropy;
- vaddr_start = pgtable_l5_enabled ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
+ vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
vaddr = vaddr_start;
/*
@@ -124,7 +124,7 @@ void __init kernel_randomize_memory(void)
*/
entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);
prandom_bytes_state(&rand_state, &rand, sizeof(rand));
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
entropy = (rand % (entropy + 1)) & P4D_MASK;
else
entropy = (rand % (entropy + 1)) & PUD_MASK;
@@ -136,7 +136,7 @@ void __init kernel_randomize_memory(void)
* randomization alignment.
*/
vaddr += get_padding(&kaslr_regions[i]);
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
vaddr = round_up(vaddr + 1, P4D_SIZE);
else
vaddr = round_up(vaddr + 1, PUD_SIZE);
@@ -212,7 +212,7 @@ void __meminit init_trampoline(void)
return;
}
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
init_trampoline_p4d();
else
init_trampoline_pud();
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index e055d1a06699..6eb1f34c3c85 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -157,7 +157,7 @@ static void sync_current_stack_to_mm(struct mm_struct *mm)
unsigned long sp = current_stack_pointer;
pgd_t *pgd = pgd_offset(mm, sp);
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
if (unlikely(pgd_none(*pgd))) {
pgd_t *pgd_ref = pgd_offset_k(sp);
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index bed7e7f4e44c..e01f7ceb9e7a 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -225,7 +225,7 @@ int __init efi_alloc_page_tables(void)
pud = pud_alloc(&init_mm, p4d, EFI_VA_END);
if (!pud) {
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled())
free_page((unsigned long) pgd_page_vaddr(*pgd));
free_pages((unsigned long)efi_pgd, PGD_ALLOCATION_ORDER);
return -ENOMEM;
diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
index ccf4a49bb065..67ccf64c8bd8 100644
--- a/arch/x86/power/hibernate_64.c
+++ b/arch/x86/power/hibernate_64.c
@@ -72,7 +72,7 @@ static int set_up_temporary_text_mapping(pgd_t *pgd)
* tables used by the image kernel.
*/
- if (pgtable_l5_enabled) {
+ if (pgtable_l5_enabled()) {
p4d = (p4d_t *)get_safe_page(GFP_ATOMIC);
if (!p4d)
return -ENOMEM;
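The function-like spelling keeps one property that matters for code size: when CONFIG_X86_5LEVEL is off, pgtable_l5_enabled() expands to the constant 0, so the compiler proves the 5-level branch dead and emits no runtime check at all. A compilable sketch of that folding (the stub walkers are illustrative, not kernel functions):

/* CONFIG_X86_5LEVEL=n case: the whole check folds to a constant. */
#define pgtable_l5_enabled() 0

static void walk_p4d_level(void) { /* 5-level walk would go here */ }
static void walk_pud_level(void) { /* 4-level walk would go here */ }

void walk_top_level(void)
{
        if (pgtable_l5_enabled())
                walk_p4d_level();       /* dead code, eliminated at compile time */
        else
                walk_pud_level();
}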
* [tip:x86/boot] x86/mm: Introduce the 'no5lvl' kernel parameter
2018-05-18 10:35 ` [PATCH 4/7] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
2018-05-19 8:46 ` Thomas Gleixner
@ 2018-05-19 11:35 ` tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Kirill A. Shutemov @ 2018-05-19 11:35 UTC (permalink / raw)
To: linux-tip-commits
Cc: kirill.shutemov, hpa, tglx, torvalds, mingo, hughd, linux-kernel, peterz
Commit-ID: 372fddf709041743a93e381556f4c41aad1e28f8
Gitweb: https://git.kernel.org/tip/372fddf709041743a93e381556f4c41aad1e28f8
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate: Fri, 18 May 2018 13:35:25 +0300
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 19 May 2018 11:56:57 +0200
x86/mm: Introduce the 'no5lvl' kernel parameter
This kernel parameter allows forcing the kernel to use 4-level paging even
if the hardware and kernel support 5-level paging.
The option may be useful to work around regressions related to 5-level
paging.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180518103528.59260-5-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
Documentation/admin-guide/kernel-parameters.txt | 3 +++
arch/x86/boot/compressed/cmdline.c | 2 +-
arch/x86/boot/compressed/head_64.S | 1 +
arch/x86/boot/compressed/pgtable_64.c | 12 ++++++++++--
arch/x86/kernel/cpu/common.c | 15 +++++++++++++++
arch/x86/kernel/head64.c | 9 +++++----
6 files changed, 35 insertions(+), 7 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 11fc28ecdb6d..364a33c1534d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2600,6 +2600,9 @@
emulation library even if a 387 maths coprocessor
is present.
+ no5lvl [X86-64] Disable 5-level paging mode. Forces
+ kernel to use 4-level paging instead.
+
no_console_suspend
[HW] Never suspend the console
Disable suspending of consoles during suspend and
diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
index 0cb325734cfb..af6cda0b7900 100644
--- a/arch/x86/boot/compressed/cmdline.c
+++ b/arch/x86/boot/compressed/cmdline.c
@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include "misc.h"
-#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE
+#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE || CONFIG_X86_5LEVEL
static unsigned long fs;
static inline void set_fs(unsigned long seg)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 8169e8b7a4dc..64037895b085 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -365,6 +365,7 @@ ENTRY(startup_64)
* this function call.
*/
pushq %rsi
+ movq %rsi, %rdi /* real mode address */
call paging_prepare
popq %rsi
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index 23707e1da1ff..8c5107545251 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -31,16 +31,23 @@ static char trampoline_save[TRAMPOLINE_32BIT_SIZE];
*/
unsigned long *trampoline_32bit __section(.data);
-struct paging_config paging_prepare(void)
+extern struct boot_params *boot_params;
+int cmdline_find_option_bool(const char *option);
+
+struct paging_config paging_prepare(void *rmode)
{
struct paging_config paging_config = {};
unsigned long bios_start, ebda_start;
+ /* Initialize boot_params. Required for cmdline_find_option_bool(). */
+ boot_params = rmode;
+
/*
* Check if LA57 is desired and supported.
*
- * There are two parts to the check:
+ * There are several parts to the check:
* - if the kernel supports 5-level paging: CONFIG_X86_5LEVEL=y
+ * - if user asked to disable 5-level paging: no5lvl in cmdline
* - if the machine supports 5-level paging:
* + CPUID leaf 7 is supported
* + the leaf has the feature bit set
@@ -48,6 +55,7 @@ struct paging_config paging_prepare(void)
* That's substitute for boot_cpu_has() in early boot code.
*/
if (IS_ENABLED(CONFIG_X86_5LEVEL) &&
+ !cmdline_find_option_bool("no5lvl") &&
native_cpuid_eax(0) >= 7 &&
(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) {
paging_config.l5_required = 1;
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 39ed2e6ff8a0..27f68d14c962 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1028,6 +1028,21 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
*/
setup_clear_cpu_cap(X86_FEATURE_PCID);
#endif
+
+ /*
+ * Later in the boot process pgtable_l5_enabled() relies on
+ * cpu_feature_enabled(X86_FEATURE_LA57). If 5-level paging is not
+ * enabled by this point we need to clear the feature bit to avoid
+ * false-positives at the later stage.
+ *
+ * pgtable_l5_enabled() can be false here for several reasons:
+ * - 5-level paging is disabled compile-time;
+ * - it's 32-bit kernel;
+ * - machine doesn't support 5-level paging;
+ * - user specified 'no5lvl' in kernel command line.
+ */
+ if (!pgtable_l5_enabled())
+ setup_clear_cpu_cap(X86_FEATURE_LA57);
}
void __init early_cpu_init(void)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 8d372d1c266d..8047379e575a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -80,10 +80,11 @@ static unsigned int __head *fixup_int(void *ptr, unsigned long physaddr)
static bool __head check_la57_support(unsigned long physaddr)
{
- if (native_cpuid_eax(0) < 7)
- return false;
-
- if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+ /*
+ * 5-level paging is detected and enabled at kernel decompression
+ * stage. Only check if it has been enabled there.
+ */
+ if (!(native_read_cr4() & X86_CR4_LA57))
return false;
*fixup_int(&__pgtable_l5_enabled, physaddr) = 1;
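The whole decision in paging_prepare() is one conjunction, and everything after decompression then trusts CR4.LA57 instead of redoing the checks. A condensed sketch, with stubs standing in for the real early-boot helpers (the stub names and return values are assumptions for illustration):

#include <stdbool.h>

/* Stubs for the helpers used in the compressed-boot code. */
static bool kernel_supports_5level(void) { return true;  }      /* CONFIG_X86_5LEVEL */
static bool cmdline_has_no5lvl(void)     { return false; }      /* cmdline_find_option_bool("no5lvl") */
static unsigned int cpuid_max_leaf(void) { return 7;     }      /* native_cpuid_eax(0) */
static bool cpuid_has_la57(void)         { return true;  }      /* X86_FEATURE_LA57 bit in CPUID.7:ECX */

static bool l5_required(void)
{
        return kernel_supports_5level() &&
               !cmdline_has_no5lvl() &&
               cpuid_max_leaf() >= 7 &&
               cpuid_has_la57();
}

int main(void)
{
        return l5_required() ? 0 : 1;
}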
* [tip:x86/boot] x86/mm: Mark p4d_offset() __always_inline
2018-05-18 10:35 ` [PATCH 6/7] x86/mm: Mark p4d_offset() __always_inline Kirill A. Shutemov
2018-05-19 8:47 ` Thomas Gleixner
@ 2018-05-19 11:35 ` tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Kirill A. Shutemov @ 2018-05-19 11:35 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, torvalds, linux-kernel, hughd, hpa, peterz, tglx, kirill.shutemov
Commit-ID: 1ea66554d3b09ce09c42e6a871899c84a276bb39
Gitweb: https://git.kernel.org/tip/1ea66554d3b09ce09c42e6a871899c84a276bb39
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate: Fri, 18 May 2018 13:35:27 +0300
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 19 May 2018 11:56:57 +0200
x86/mm: Mark p4d_offset() __always_inline
__pgtable_l5_enabled shouldn't be needed after the system has booted. We
can mark it as __initdata, but that requires preparation.
KASAN initialization code is a user of USE_EARLY_PGTABLE_L5, so all
pgtable_l5_enabled() calls are translated to __pgtable_l5_enabled there,
including the one in p4d_offset().
It may lead to a section mismatch if the compiler does not inline
p4d_offset() but leaves it as a standalone function: p4d_offset() is not
marked __init.
Marking p4d_offset() as __always_inline fixes the issue.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180518103528.59260-7-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5715647fc4fe..99ecde23c3ec 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -898,7 +898,7 @@ static inline unsigned long pgd_page_vaddr(pgd_t pgd)
#define pgd_page(pgd) pfn_to_page(pgd_pfn(pgd))
/* to find an entry in a page-table-directory. */
-static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
+static __always_inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
if (!pgtable_l5_enabled())
return (p4d_t *)pgd;
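The failure mode being prevented, reduced to its essentials: if the compiler emits p4d_offset() out of line, a function in .text ends up referencing an object in .init.data, and since .init.data is freed after boot, modpost reports a section mismatch. A sketch with simplified attribute spellings (the #defines below only approximate the kernel's, they are not the real definitions):

/* Simplified stand-ins for the kernel's annotations. */
#define __initdata      __attribute__((__section__(".init.data")))
#define __always_inline inline __attribute__((__always_inline__))

unsigned int __pgtable_l5_enabled __initdata;

/*
 * Forced inlining guarantees this body is absorbed into its __init
 * callers. Without it, an out-of-line copy would live in .text and
 * keep referencing .init.data after that memory is freed -- exactly
 * the mismatch modpost warns about.
 */
static __always_inline int l5_enabled(void)
{
        return __pgtable_l5_enabled;
}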
* [tip:x86/boot] x86/mm: Mark __pgtable_l5_enabled __initdata
2018-05-18 10:35 ` [PATCH 7/7] x86/mm: Mark __pgtable_l5_enabled __initdata Kirill A. Shutemov
2018-05-19 8:48 ` Thomas Gleixner
@ 2018-05-19 11:36 ` tip-bot for Kirill A. Shutemov
1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Kirill A. Shutemov @ 2018-05-19 11:36 UTC (permalink / raw)
To: linux-tip-commits
Cc: kirill.shutemov, hughd, mingo, tglx, linux-kernel, torvalds, hpa, peterz
Commit-ID: e4e961e36f063484c48bed919013c106d178995d
Gitweb: https://git.kernel.org/tip/e4e961e36f063484c48bed919013c106d178995d
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate: Fri, 18 May 2018 13:35:28 +0300
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 19 May 2018 11:56:58 +0200
x86/mm: Mark __pgtable_l5_enabled __initdata
__pgtable_l5_enabled shouldn't be needed after the system has booted.
All preparation is done. We can now mark it as __initdata.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180518103528.59260-8-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/head64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 8047379e575a..a21d6ace648e 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -44,7 +44,7 @@ static unsigned int __initdata next_early_pgt;
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
#ifdef CONFIG_X86_5LEVEL
-unsigned int __pgtable_l5_enabled __ro_after_init;
+unsigned int __pgtable_l5_enabled __initdata;
unsigned int pgdir_shift __ro_after_init = 39;
EXPORT_SYMBOL(pgdir_shift);
unsigned int ptrs_per_p4d __ro_after_init = 1;
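The payoff of the series in one comparison: __ro_after_init data stays resident for the life of the kernel (merely write-protected once init completes), while __initdata lands in .init.data, which free_initmem() hands back to the page allocator after boot, so nothing may touch it afterwards. Schematically (attribute spellings simplified as assumptions):

#define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
#define __initdata      __attribute__((__section__(".init.data")))

/* Kept forever; the section is write-protected after init. */
unsigned int pgdir_shift __ro_after_init = 39;

/* Discarded by free_initmem(); any later access is a use-after-free
 * of init memory. */
unsigned int __pgtable_l5_enabled __initdata;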
* Re: [PATCH 5/7] x86/cpu: Move early cpu initialization into a separate translation unit
2018-05-19 8:47 ` Thomas Gleixner
@ 2018-06-05 10:19 ` Kirill A. Shutemov
0 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2018-06-05 10:19 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, x86, H. Peter Anvin, Hugh Dickins, linux-kernel
On Sat, May 19, 2018 at 08:47:33AM +0000, Thomas Gleixner wrote:
> On Fri, 18 May 2018, Kirill A. Shutemov wrote:
>
> > __pgtable_l5_enabled shouldn't be needed after the system has booted. We
> > can mark it as __initdata, but that requires preparation.
> >
> > This patch moves early cpu initialization into a separate translation
> > unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.
> >
> > Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
> > not an __init function, which leads to a section mismatch.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> This makes a lot of sense independent of 5level changes.
>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Ingo, I've just noticed that this patch wasn't applied.
Below is a rebased version. It applies cleanly to current tip/master and
Linus' tree.
---------------------8<----------------------------------
From ff84fea44db72d09890dd69f4afb82060e6633a1 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Fri, 18 May 2018 13:35:26 +0300
Subject: [PATCH] x86/cpu: Move early cpu initialization into a separate
translation unit
__pgtable_l5_enabled shouldn't be needed after the system has booted. We
can mark it as __initdata, but that requires preparation.
This patch moves early cpu initialization into a separate translation
unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.
Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
not an __init function, which leads to a section mismatch.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/common.c | 215 ++++-------------------------------
arch/x86/kernel/cpu/cpu.h | 7 ++
arch/x86/kernel/cpu/early.c | 183 +++++++++++++++++++++++++++++
4 files changed, 213 insertions(+), 193 deletions(-)
create mode 100644 arch/x86/kernel/cpu/early.c
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 7a40196967cb..b1da5a7c145c 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -19,6 +19,7 @@ CFLAGS_common.o := $(nostackp)
obj-y := cacheinfo.o scattered.o topology.o
obj-y += common.o
+obj-y += early.o
obj-y += rdrand.o
obj-y += match.o
obj-y += bugs.o
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 95c8e507580d..fa3dcbb7d4d8 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -47,7 +47,6 @@
#include <asm/pat.h>
#include <asm/microcode.h>
#include <asm/microcode_intel.h>
-#include <asm/intel-family.h>
#include <asm/cpu_device_id.h>
#ifdef CONFIG_X86_LOCAL_APIC
@@ -105,7 +104,7 @@ static const struct cpu_dev default_cpu = {
.c_x86_vendor = X86_VENDOR_UNKNOWN,
};
-static const struct cpu_dev *this_cpu = &default_cpu;
+const struct cpu_dev *this_cpu_dev = &default_cpu;
DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
#ifdef CONFIG_X86_64
@@ -426,7 +425,7 @@ cpuid_dependent_features[] = {
{ 0, 0 }
};
-static void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
+void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
{
const struct cpuid_dependent_feature *df;
@@ -471,10 +470,10 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
if (c->x86_model >= 16)
return NULL; /* Range check */
- if (!this_cpu)
+ if (!this_cpu_dev)
return NULL;
- info = this_cpu->legacy_models;
+ info = this_cpu_dev->legacy_models;
while (info->family) {
if (info->family == c->x86)
@@ -551,7 +550,7 @@ void switch_to_new_gdt(int cpu)
load_percpu_segment(cpu);
}
-static const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
+const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
static void get_model_name(struct cpuinfo_x86 *c)
{
@@ -622,8 +621,8 @@ void cpu_detect_cache_sizes(struct cpuinfo_x86 *c)
c->x86_tlbsize += ((ebx >> 16) & 0xfff) + (ebx & 0xfff);
#else
/* do processor-specific cache resizing */
- if (this_cpu->legacy_cache_size)
- l2size = this_cpu->legacy_cache_size(c, l2size);
+ if (this_cpu_dev->legacy_cache_size)
+ l2size = this_cpu_dev->legacy_cache_size(c, l2size);
/* Allow user to override all this if necessary. */
if (cachesize_override != -1)
@@ -646,8 +645,8 @@ u16 __read_mostly tlb_lld_1g[NR_INFO];
static void cpu_detect_tlb(struct cpuinfo_x86 *c)
{
- if (this_cpu->c_detect_tlb)
- this_cpu->c_detect_tlb(c);
+ if (this_cpu_dev->c_detect_tlb)
+ this_cpu_dev->c_detect_tlb(c);
pr_info("Last level iTLB entries: 4KB %d, 2MB %d, 4MB %d\n",
tlb_lli_4k[ENTRIES], tlb_lli_2m[ENTRIES],
@@ -709,7 +708,7 @@ void detect_ht(struct cpuinfo_x86 *c)
#endif
}
-static void get_cpu_vendor(struct cpuinfo_x86 *c)
+void get_cpu_vendor(struct cpuinfo_x86 *c)
{
char *v = c->x86_vendor_id;
int i;
@@ -722,8 +721,8 @@ static void get_cpu_vendor(struct cpuinfo_x86 *c)
(cpu_devs[i]->c_ident[1] &&
!strcmp(v, cpu_devs[i]->c_ident[1]))) {
- this_cpu = cpu_devs[i];
- c->x86_vendor = this_cpu->c_x86_vendor;
+ this_cpu_dev = cpu_devs[i];
+ c->x86_vendor = this_cpu_dev->c_x86_vendor;
return;
}
}
@@ -732,7 +731,7 @@ static void get_cpu_vendor(struct cpuinfo_x86 *c)
"CPU: Your system may be unstable.\n", v);
c->x86_vendor = X86_VENDOR_UNKNOWN;
- this_cpu = &default_cpu;
+ this_cpu_dev = &default_cpu;
}
void cpu_detect(struct cpuinfo_x86 *c)
@@ -902,7 +901,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
apply_forced_caps(c);
}
-static void get_cpu_address_sizes(struct cpuinfo_x86 *c)
+void get_cpu_address_sizes(struct cpuinfo_x86 *c)
{
u32 eax, ebx, ecx, edx;
@@ -918,7 +917,7 @@ static void get_cpu_address_sizes(struct cpuinfo_x86 *c)
#endif
}
-static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
+void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_X86_32
int i;
@@ -944,176 +943,6 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#endif
}
-static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_CENTAUR, 5 },
- { X86_VENDOR_INTEL, 5 },
- { X86_VENDOR_NSC, 5 },
- { X86_VENDOR_ANY, 4 },
- {}
-};
-
-static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
- { X86_VENDOR_AMD },
- {}
-};
-
-/* Only list CPUs which speculate but are non susceptible to SSB */
-static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- { X86_VENDOR_AMD, 0x12, },
- { X86_VENDOR_AMD, 0x11, },
- { X86_VENDOR_AMD, 0x10, },
- { X86_VENDOR_AMD, 0xf, },
- {}
-};
-
-static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
-{
- u64 ia32_cap = 0;
-
- if (x86_match_cpu(cpu_no_speculation))
- return;
-
- setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
- setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
-
- if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
- rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
-
- if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
- !(ia32_cap & ARCH_CAP_SSB_NO))
- setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
-
- if (x86_match_cpu(cpu_no_meltdown))
- return;
-
- /* Rogue Data Cache Load? No! */
- if (ia32_cap & ARCH_CAP_RDCL_NO)
- return;
-
- setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
-}
-
-/*
- * Do minimum CPU detection early.
- * Fields really needed: vendor, cpuid_level, family, model, mask,
- * cache alignment.
- * The others are not touched to avoid unwanted side effects.
- *
- * WARNING: this function is only called on the boot CPU. Don't add code
- * here that is supposed to run on all CPUs.
- */
-static void __init early_identify_cpu(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_X86_64
- c->x86_clflush_size = 64;
- c->x86_phys_bits = 36;
- c->x86_virt_bits = 48;
-#else
- c->x86_clflush_size = 32;
- c->x86_phys_bits = 32;
- c->x86_virt_bits = 32;
-#endif
- c->x86_cache_alignment = c->x86_clflush_size;
-
- memset(&c->x86_capability, 0, sizeof c->x86_capability);
- c->extended_cpuid_level = 0;
-
- /* cyrix could have cpuid enabled via c_identify()*/
- if (have_cpuid_p()) {
- cpu_detect(c);
- get_cpu_vendor(c);
- get_cpu_cap(c);
- get_cpu_address_sizes(c);
- setup_force_cpu_cap(X86_FEATURE_CPUID);
-
- if (this_cpu->c_early_init)
- this_cpu->c_early_init(c);
-
- c->cpu_index = 0;
- filter_cpuid_features(c, false);
-
- if (this_cpu->c_bsp_init)
- this_cpu->c_bsp_init(c);
- } else {
- identify_cpu_without_cpuid(c);
- setup_clear_cpu_cap(X86_FEATURE_CPUID);
- }
-
- setup_force_cpu_cap(X86_FEATURE_ALWAYS);
-
- cpu_set_bug_bits(c);
-
- fpu__init_system(c);
-
-#ifdef CONFIG_X86_32
- /*
- * Regardless of whether PCID is enumerated, the SDM says
- * that it can't be enabled in 32-bit mode.
- */
- setup_clear_cpu_cap(X86_FEATURE_PCID);
-#endif
-
- /*
- * Later in the boot process pgtable_l5_enabled() relies on
- * cpu_feature_enabled(X86_FEATURE_LA57). If 5-level paging is not
- * enabled by this point we need to clear the feature bit to avoid
- * false-positives at the later stage.
- *
- * pgtable_l5_enabled() can be false here for several reasons:
- * - 5-level paging is disabled compile-time;
- * - it's 32-bit kernel;
- * - machine doesn't support 5-level paging;
- * - user specified 'no5lvl' in kernel command line.
- */
- if (!pgtable_l5_enabled())
- setup_clear_cpu_cap(X86_FEATURE_LA57);
-}
-
-void __init early_cpu_init(void)
-{
- const struct cpu_dev *const *cdev;
- int count = 0;
-
-#ifdef CONFIG_PROCESSOR_SELECT
- pr_info("KERNEL supported cpus:\n");
-#endif
-
- for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) {
- const struct cpu_dev *cpudev = *cdev;
-
- if (count >= X86_VENDOR_NUM)
- break;
- cpu_devs[count] = cpudev;
- count++;
-
-#ifdef CONFIG_PROCESSOR_SELECT
- {
- unsigned int j;
-
- for (j = 0; j < 2; j++) {
- if (!cpudev->c_ident[j])
- continue;
- pr_info(" %s %s\n", cpudev->c_vendor,
- cpudev->c_ident[j]);
- }
- }
-#endif
- }
- early_identify_cpu(&boot_cpu_data);
-}
-
/*
* The NOPL instruction is supposed to exist on all CPUs of family >= 6;
* unfortunately, that's not true in practice because of early VIA
@@ -1290,8 +1119,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
generic_identify(c);
- if (this_cpu->c_identify)
- this_cpu->c_identify(c);
+ if (this_cpu_dev->c_identify)
+ this_cpu_dev->c_identify(c);
/* Clear/Set all flags overridden by options, after probe */
apply_forced_caps(c);
@@ -1310,8 +1139,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
* At the end of this section, c->x86_capability better
* indicate the features this CPU genuinely supports!
*/
- if (this_cpu->c_init)
- this_cpu->c_init(c);
+ if (this_cpu_dev->c_init)
+ this_cpu_dev->c_init(c);
/* Disable the PN if appropriate */
squash_the_stupid_serial_number(c);
@@ -1446,7 +1275,7 @@ void print_cpu_info(struct cpuinfo_x86 *c)
const char *vendor = NULL;
if (c->x86_vendor < X86_VENDOR_NUM) {
- vendor = this_cpu->c_vendor;
+ vendor = this_cpu_dev->c_vendor;
} else {
if (c->cpuid_level >= 0)
vendor = c->x86_vendor_id;
@@ -1820,8 +1649,8 @@ void cpu_init(void)
static void bsp_resume(void)
{
- if (this_cpu->c_bsp_resume)
- this_cpu->c_bsp_resume(&boot_cpu_data);
+ if (this_cpu_dev->c_bsp_resume)
+ this_cpu_dev->c_bsp_resume(&boot_cpu_data);
}
static struct syscore_ops cpu_syscore_ops = {
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index 38216f678fc3..959529a61f9b 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -45,8 +45,15 @@ struct _tlb_table {
extern const struct cpu_dev *const __x86_cpu_dev_start[],
*const __x86_cpu_dev_end[];
+extern const struct cpu_dev *cpu_devs[];
+extern const struct cpu_dev *this_cpu_dev;
+
extern void get_cpu_cap(struct cpuinfo_x86 *c);
+extern void get_cpu_vendor(struct cpuinfo_x86 *c);
+extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
+extern void identify_cpu_without_cpuid(struct cpuinfo_x86 *c);
+extern void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn);
extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
extern u32 get_scattered_cpuid_leaf(unsigned int level,
unsigned int sub_leaf,
diff --git a/arch/x86/kernel/cpu/early.c b/arch/x86/kernel/cpu/early.c
new file mode 100644
index 000000000000..3014203b684c
--- /dev/null
+++ b/arch/x86/kernel/cpu/early.c
@@ -0,0 +1,183 @@
+/* cpu_feature_enabled() cannot be used this early */
+#define USE_EARLY_PGTABLE_L5
+
+#include <linux/linkage.h>
+#include <linux/kernel.h>
+
+#include <asm/processor.h>
+#include <asm/cpu.h>
+#include <asm/cpu_device_id.h>
+#include <asm/intel-family.h>
+#include <asm/fpu/internal.h>
+
+#include "cpu.h"
+
+static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_CENTAUR, 5 },
+ { X86_VENDOR_INTEL, 5 },
+ { X86_VENDOR_NSC, 5 },
+ { X86_VENDOR_ANY, 4 },
+ {}
+};
+
+static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
+ { X86_VENDOR_AMD },
+ {}
+};
+
+/* Only list CPUs which speculate but are non susceptible to SSB */
+static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
+ { X86_VENDOR_AMD, 0x12, },
+ { X86_VENDOR_AMD, 0x11, },
+ { X86_VENDOR_AMD, 0x10, },
+ { X86_VENDOR_AMD, 0xf, },
+ {}
+};
+
+static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+{
+ u64 ia32_cap = 0;
+
+ if (x86_match_cpu(cpu_no_speculation))
+ return;
+
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+
+ if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+ if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
+ !(ia32_cap & ARCH_CAP_SSB_NO))
+ setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
+
+ if (x86_match_cpu(cpu_no_meltdown))
+ return;
+
+ /* Rogue Data Cache Load? No! */
+ if (ia32_cap & ARCH_CAP_RDCL_NO)
+ return;
+
+ setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+}
+
+/*
+ * Do minimum CPU detection early.
+ * Fields really needed: vendor, cpuid_level, family, model, mask,
+ * cache alignment.
+ * The others are not touched to avoid unwanted side effects.
+ *
+ * WARNING: this function is only called on the boot CPU. Don't add code
+ * here that is supposed to run on all CPUs.
+ */
+static void __init early_identify_cpu(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_X86_64
+ c->x86_clflush_size = 64;
+ c->x86_phys_bits = 36;
+ c->x86_virt_bits = 48;
+#else
+ c->x86_clflush_size = 32;
+ c->x86_phys_bits = 32;
+ c->x86_virt_bits = 32;
+#endif
+ c->x86_cache_alignment = c->x86_clflush_size;
+
+ memset(&c->x86_capability, 0, sizeof c->x86_capability);
+ c->extended_cpuid_level = 0;
+
+ /* cyrix could have cpuid enabled via c_identify()*/
+ if (have_cpuid_p()) {
+ cpu_detect(c);
+ get_cpu_vendor(c);
+ get_cpu_cap(c);
+ get_cpu_address_sizes(c);
+ setup_force_cpu_cap(X86_FEATURE_CPUID);
+
+ if (this_cpu_dev->c_early_init)
+ this_cpu_dev->c_early_init(c);
+
+ c->cpu_index = 0;
+ filter_cpuid_features(c, false);
+
+ if (this_cpu_dev->c_bsp_init)
+ this_cpu_dev->c_bsp_init(c);
+ } else {
+ identify_cpu_without_cpuid(c);
+ setup_clear_cpu_cap(X86_FEATURE_CPUID);
+ }
+
+ setup_force_cpu_cap(X86_FEATURE_ALWAYS);
+
+ cpu_set_bug_bits(c);
+
+ fpu__init_system(c);
+
+#ifdef CONFIG_X86_32
+ /*
+ * Regardless of whether PCID is enumerated, the SDM says
+ * that it can't be enabled in 32-bit mode.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+#endif
+
+ /*
+ * Later in the boot process pgtable_l5_enabled() relies on
+ * cpu_feature_enabled(X86_FEATURE_LA57). If 5-level paging is not
+ * enabled by this point we need to clear the feature bit to avoid
+ * false-positives at the later stage.
+ *
+ * pgtable_l5_enabled() can be false here for several reasons:
+ * - 5-level paging is disabled compile-time;
+ * - it's 32-bit kernel;
+ * - machine doesn't support 5-level paging;
+ * - user specified 'no5lvl' in kernel command line.
+ */
+ if (!pgtable_l5_enabled())
+ setup_clear_cpu_cap(X86_FEATURE_LA57);
+}
+
+void __init early_cpu_init(void)
+{
+ const struct cpu_dev *const *cdev;
+ int count = 0;
+
+#ifdef CONFIG_PROCESSOR_SELECT
+ pr_info("KERNEL supported cpus:\n");
+#endif
+
+ for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) {
+ const struct cpu_dev *cpudev = *cdev;
+
+ if (count >= X86_VENDOR_NUM)
+ break;
+ cpu_devs[count] = cpudev;
+ count++;
+
+#ifdef CONFIG_PROCESSOR_SELECT
+ {
+ unsigned int j;
+
+ for (j = 0; j < 2; j++) {
+ if (!cpudev->c_ident[j])
+ continue;
+ pr_info(" %s %s\n", cpudev->c_vendor,
+ cpudev->c_ident[j]);
+ }
+ }
+#endif
+ }
+ early_identify_cpu(&boot_cpu_data);
+}
--
Kirill A. Shutemov
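Why the file split is what makes the __initdata conversion safe: USE_EARLY_PGTABLE_L5 is a per-translation-unit switch, so only objects that define it before their includes bind pgtable_l5_enabled to the __initdata variable; every other object keeps the late feature-bit form. In terms of the simplified header sketched earlier in this thread (names illustrative, not the real kernel files):

/* early.c -- the only object allowed to read the __initdata variable */
#define USE_EARLY_PGTABLE_L5
#include "pgtable_l5.h"         /* pgtable_l5_enabled -> __pgtable_l5_enabled */

/* common.c -- no define, same header */
#include "pgtable_l5.h"         /* pgtable_l5_enabled -> cpu_feature_enabled_la57() */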