LKML Archive on lore.kernel.org
* [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages)
@ 2021-07-14 2:21 Anshuman Khandual
2021-07-14 2:21 ` [RFC 01/10] mm/mmap: Dynamically initialize protection_map[] Anshuman Khandual
` (9 more replies)
0 siblings, 10 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
This series enables 52-bit PA support for the 4K and 16K page configs via
the existing CONFIG_ARM64_PA_BITS_52, utilizing the new architecture feature
FEAT_LPA2, which is available from ARMv8.7. The IDMAP needs changes to
accommodate two additional levels of page tables in certain scenarios like
(4K|39VA|52PA), but the same problem also exists for (16K|36VA|48PA), which
needs fixing. I am currently working on the IDMAP fix for 16K and will later
enable it for FEAT_LPA2 as well.
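For reference, after this series the 52-bit PA configs map to architecture
features as follows (64K page support via FEAT_LPA already exists):
	4K  pages : 52-bit PA via FEAT_LPA2 (this series)
	16K pages : 52-bit PA via FEAT_LPA2 (this series)
	64K pages : 52-bit PA via FEAT_LPA  (existing)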
This series applies on v5.14-rc1.
Testing:
Build and boot tested (individual patches) on all existing and new
FEAT_LPA2-enabled config combinations.
Pending:
- Enable IDMAP for FEAT_LPA2
- Enable 52-bit VA range on 4K/16K page sizes
- Evaluate KVM and SMMU impacts from FEAT_LPA2
Anshuman Khandual (10):
mm/mmap: Dynamically initialize protection_map[]
arm64/mm: Consolidate TCR_EL1 fields
arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field
arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]
arm64/mm: Add FEAT_LPA2 specific encoding
arm64/mm: Detect and enable FEAT_LPA2
arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S
arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented
arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES
arch/arm64/Kconfig | 9 +++++-
arch/arm64/include/asm/assembler.h | 48 ++++++++++++++++++++++------
arch/arm64/include/asm/kernel-pgtable.h | 4 +--
arch/arm64/include/asm/memory.h | 1 +
arch/arm64/include/asm/pgtable-hwdef.h | 28 ++++++++++++++---
arch/arm64/include/asm/pgtable.h | 18 +++++++++--
arch/arm64/include/asm/sysreg.h | 9 +++---
arch/arm64/kernel/head.S | 55 ++++++++++++++++++++++++++-------
arch/arm64/mm/mmu.c | 3 ++
arch/arm64/mm/pgd.c | 2 +-
arch/arm64/mm/proc.S | 11 ++++++-
arch/arm64/mm/ptdump.c | 26 ++++++++++++++--
mm/mmap.c | 26 +++++++++++++---
13 files changed, 195 insertions(+), 45 deletions(-)
--
2.7.4
* [RFC 01/10] mm/mmap: Dynamically initialize protection_map[]
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 02/10] arm64/mm: Consolidate TCR_EL1 fields Anshuman Khandual
` (8 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
The protection_map[] elements (__PXXX and __SXXX) may contain runtime
variables on certain platforms like arm64, preventing a successful build
with the current static initialization. So defer the initialization until
mmap_init() via a new helper init_protection_map().
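To illustrate the problem (a sketch, based on how PTE_SHARED ends up being
defined later in this series, patch 08):
	/* PTE_SHARED becomes a runtime-evaluated expression on arm64 ... */
	#define PTE_SHARED	(arm64_lpa2_enabled ? 0 : PTE_SHARED_STATIC)
	/*
	 * ... so a file-scope static initializer like the one below is no
	 * longer a compile-time constant expression and fails to build:
	 */
	pgprot_t protection_map[16] __ro_after_init = {
		__P000, __P001, /* ..., these expand to values using PTE_SHARED */
	};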
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
mm/mmap.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index ca54d36..a95b078 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -100,10 +100,7 @@ static void unmap_region(struct mm_struct *mm,
* w: (no) no
* x: (yes) yes
*/
-pgprot_t protection_map[16] __ro_after_init = {
- __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
- __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
-};
+pgprot_t protection_map[16] __ro_after_init;
#ifndef CONFIG_ARCH_HAS_FILTER_PGPROT
static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
@@ -3708,6 +3705,26 @@ void mm_drop_all_locks(struct mm_struct *mm)
mutex_unlock(&mm_all_locks_mutex);
}
+static void init_protection_map(void)
+{
+ protection_map[0] = __P000;
+ protection_map[1] = __P001;
+ protection_map[2] = __P010;
+ protection_map[3] = __P011;
+ protection_map[4] = __P100;
+ protection_map[5] = __P101;
+ protection_map[6] = __P110;
+ protection_map[7] = __P111;
+ protection_map[8] = __S000;
+ protection_map[9] = __S001;
+ protection_map[10] = __S010;
+ protection_map[11] = __S011;
+ protection_map[12] = __S100;
+ protection_map[13] = __S101;
+ protection_map[14] = __S110;
+ protection_map[15] = __S111;
+}
+
/*
* initialise the percpu counter for VM
*/
@@ -3715,6 +3732,7 @@ void __init mmap_init(void)
{
int ret;
+ init_protection_map();
ret = percpu_counter_init(&vm_committed_as, 0, GFP_KERNEL);
VM_BUG_ON(ret);
}
--
2.7.4
* [RFC 02/10] arm64/mm: Consolidate TCR_EL1 fields
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
2021-07-14 2:21 ` [RFC 01/10] mm/mmap: Dynamically initialize protection_map[] Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 03/10] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field Anshuman Khandual
` (7 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
This renames and moves the SYS_TCR_EL1_TCMA1 and SYS_TCR_EL1_TCMA0
definitions into pgtable-hwdef.h, thus consolidating all TCR fields in a
single header. This does not cause any functional change.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
arch/arm64/include/asm/sysreg.h | 4 ----
arch/arm64/mm/proc.S | 2 +-
3 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 40085e5..66671ff 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -273,6 +273,8 @@
#define TCR_NFD1 (UL(1) << 54)
#define TCR_E0PD0 (UL(1) << 55)
#define TCR_E0PD1 (UL(1) << 56)
+#define TCR_TCMA0 (UL(1) << 57)
+#define TCR_TCMA1 (UL(1) << 58)
/*
* TTBR.
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 7b9c3ac..5cbfaf6 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1059,10 +1059,6 @@
#define CPACR_EL1_ZEN_EL0EN (BIT(17)) /* enable EL0 access, if EL1EN set */
#define CPACR_EL1_ZEN (CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)
-/* TCR EL1 Bit Definitions */
-#define SYS_TCR_EL1_TCMA1 (BIT(58))
-#define SYS_TCR_EL1_TCMA0 (BIT(57))
-
/* GCR_EL1 Definitions */
#define SYS_GCR_EL1_RRND (BIT(16))
#define SYS_GCR_EL1_EXCL_MASK 0xffffUL
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 35936c5..1ae0c2b 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -46,7 +46,7 @@
#endif
#ifdef CONFIG_KASAN_HW_TAGS
-#define TCR_MTE_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
+#define TCR_MTE_FLAGS TCR_TCMA1 | TCR_TBI1 | TCR_TBID1
#else
/*
* The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
--
2.7.4
* [RFC 03/10] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
2021-07-14 2:21 ` [RFC 01/10] mm/mmap: Dynamically initialize protection_map[] Anshuman Khandual
2021-07-14 2:21 ` [RFC 02/10] arm64/mm: Consolidate TCR_EL1 fields Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 04/10] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Anshuman Khandual
` (6 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
As per the ARM ARM (0487G.A), the TCR_EL1.DS field controls whether 52-bit
input and output addresses are supported on the 4K and 16K page size
configurations when FEAT_LPA2 is implemented. This adds the TCR_DS field
definition, which will be used when FEAT_LPA2 gets enabled.
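A minimal C-level sketch of the intended use (the series does this in
assembly in __cpu_setup, patch 07; 'arm64_lpa2_enabled' is introduced there):
	u64 tcr = read_sysreg(tcr_el1);

	if (arm64_lpa2_enabled)		/* FEAT_LPA2 detected at boot */
		tcr |= TCR_DS;		/* TCR_EL1.DS, bit 59 */
	write_sysreg(tcr, tcr_el1);
	isb();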
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/pgtable-hwdef.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 66671ff..1eb5574 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -275,6 +275,7 @@
#define TCR_E0PD1 (UL(1) << 56)
#define TCR_TCMA0 (UL(1) << 57)
#define TCR_TCMA1 (UL(1) << 58)
+#define TCR_DS (UL(1) << 59)
/*
* TTBR.
--
2.7.4
* [RFC 04/10] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
` (2 preceding siblings ...)
2021-07-14 2:21 ` [RFC 03/10] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 05/10] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2] Anshuman Khandual
` (5 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
PAGE_SIZE support is tested against the possible minimum and maximum values
of its respective ID_AA64MMFR0.TGRAN field, depending on whether the field
is signed or unsigned. A FEAT_LPA2 implementation additionally needs to be
validated for the 4K and 16K page sizes via feature-specific
ID_AA64MMFR0.TGRAN values. Hence this adds the FEAT_LPA2-specific
ID_AA64MMFR0.TGRAN[2] values per the ARM ARM (0487G.A).
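For context, a rough sketch of the kind of check these values feed into,
assuming the unsigned (16K) TGRAN encoding; the real checks live in head.S
and the cpufeature code:
	u64 mmfr0 = read_sysreg(id_aa64mmfr0_el1);
	u32 tgran = cpuid_feature_extract_unsigned_field(mmfr0,
					ID_AA64MMFR0_TGRAN_SHIFT);

	/* Is the configured page size supported at all? */
	bool ok = (tgran >= ID_AA64MMFR0_TGRAN_SUPPORTED_MIN &&
		   tgran <= ID_AA64MMFR0_TGRAN_SUPPORTED_MAX);

	/* 52-bit PA additionally requires the FEAT_LPA2 value */
	bool lpa2 = (tgran == ID_AA64MMFR0_TGRAN_LPA2);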
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/sysreg.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 5cbfaf6..deecde0 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -849,16 +849,19 @@
#define ID_AA64MMFR0_TGRAN4_NI 0xf
#define ID_AA64MMFR0_TGRAN4_SUPPORTED 0x0
+#define ID_AA64MMFR0_TGRAN4_LPA2 0x1
#define ID_AA64MMFR0_TGRAN64_NI 0xf
#define ID_AA64MMFR0_TGRAN64_SUPPORTED 0x0
#define ID_AA64MMFR0_TGRAN16_NI 0x0
#define ID_AA64MMFR0_TGRAN16_SUPPORTED 0x1
+#define ID_AA64MMFR0_TGRAN16_LPA2 0x2
#define ID_AA64MMFR0_PARANGE_48 0x5
#define ID_AA64MMFR0_PARANGE_52 0x6
#define ID_AA64MMFR0_TGRAN_2_SUPPORTED_DEFAULT 0x0
#define ID_AA64MMFR0_TGRAN_2_SUPPORTED_NONE 0x1
#define ID_AA64MMFR0_TGRAN_2_SUPPORTED_MIN 0x2
+#define ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2 0x3
#define ID_AA64MMFR0_TGRAN_2_SUPPORTED_MAX 0x7
#ifdef CONFIG_ARM64_PA_BITS_52
@@ -1030,10 +1033,12 @@
#define ID_AA64MMFR0_TGRAN_SHIFT ID_AA64MMFR0_TGRAN4_SHIFT
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_TGRAN4_SUPPORTED
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX 0x7
+#define ID_AA64MMFR0_TGRAN_LPA2 ID_AA64MMFR0_TGRAN4_LPA2
#elif defined(CONFIG_ARM64_16K_PAGES)
#define ID_AA64MMFR0_TGRAN_SHIFT ID_AA64MMFR0_TGRAN16_SHIFT
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_TGRAN16_SUPPORTED
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX 0xF
+#define ID_AA64MMFR0_TGRAN_LPA2 ID_AA64MMFR0_TGRAN16_LPA2
#elif defined(CONFIG_ARM64_64K_PAGES)
#define ID_AA64MMFR0_TGRAN_SHIFT ID_AA64MMFR0_TGRAN64_SHIFT
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_TGRAN64_SUPPORTED
--
2.7.4
* [RFC 05/10] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
` (3 preceding siblings ...)
2021-07-14 2:21 ` [RFC 04/10] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
` (4 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
Going forward, CONFIG_ARM64_PA_BITS_52 could be enabled on a system via two
different architecture features, i.e. FEAT_LPA for CONFIG_ARM64_64K_PAGES
and FEAT_LPA2 for CONFIG_ARM64_[4K|16K]_PAGES. But CONFIG_ARM64_PA_BITS_52
is currently available exclusively on the 64K page size config, and needs
to be freed up for the other page size configs to use when FEAT_LPA2 gets
enabled.
To decouple CONFIG_ARM64_PA_BITS_52 from CONFIG_ARM64_64K_PAGES, and also to
reduce #ifdefs while navigating the various page size configs, this adds two
internal config options, CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]. While here, it
also converts the existing 64K page size based FEAT_LPA implementation to
use CONFIG_ARM64_PA_BITS_52_LPA. The TTBR representation remains the same
for both FEAT_LPA and FEAT_LPA2. No functional change for the 64K page size
config.
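The net Kconfig wiring, once patch 10 is applied as well, is roughly:
	config ARM64_PA_BITS_52
		select ARM64_PA_BITS_52_LPA  if ARM64_64K_PAGES
		select ARM64_PA_BITS_52_LPA2 if ARM64_4K_PAGES || ARM64_16K_PAGES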
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/Kconfig | 7 +++++++
arch/arm64/include/asm/assembler.h | 12 ++++++------
arch/arm64/include/asm/pgtable-hwdef.h | 7 ++++---
arch/arm64/include/asm/pgtable.h | 6 +++---
arch/arm64/mm/pgd.c | 2 +-
5 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e07e7de..658a6fd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -934,6 +934,12 @@ config ARM64_VA_BITS
default 48 if ARM64_VA_BITS_48
default 52 if ARM64_VA_BITS_52
+config ARM64_PA_BITS_52_LPA
+ bool
+
+config ARM64_PA_BITS_52_LPA2
+ bool
+
choice
prompt "Physical address space size"
default ARM64_PA_BITS_48
@@ -948,6 +954,7 @@ config ARM64_PA_BITS_52
bool "52-bit (ARMv8.2)"
depends on ARM64_64K_PAGES
depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
+ select ARM64_PA_BITS_52_LPA if ARM64_64K_PAGES
help
Enable support for a 52-bit physical address space, introduced as
part of the ARMv8.2-LPA extension.
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 89faca0..fedc202 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -607,26 +607,26 @@ alternative_endif
.endm
.macro phys_to_pte, pte, phys
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
/*
* We assume \phys is 64K aligned and this is guaranteed by only
* supporting this configuration with 64K pages.
*/
orr \pte, \phys, \phys, lsr #36
and \pte, \pte, #PTE_ADDR_MASK
-#else
+#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
mov \pte, \phys
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
.endm
.macro pte_to_phys, phys, pte
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
ubfiz \phys, \pte, #(48 - 16 - 12), #16
bfxil \phys, \pte, #16, #32
lsl \phys, \phys, #16
-#else
+#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
and \phys, \pte, #PTE_ADDR_MASK
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
.endm
/*
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 1eb5574..f375bcf 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -155,13 +155,14 @@
#define PTE_PXN (_AT(pteval_t, 1) << 53) /* Privileged XN */
#define PTE_UXN (_AT(pteval_t, 1) << 54) /* User XN */
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
#define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
-#ifdef CONFIG_ARM64_PA_BITS_52
#define PTE_ADDR_HIGH (_AT(pteval_t, 0xf) << 12)
#define PTE_ADDR_MASK (PTE_ADDR_LOW | PTE_ADDR_HIGH)
-#else
+#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
+#define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
#define PTE_ADDR_MASK PTE_ADDR_LOW
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
/*
* AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f09bf5c..3c57fb2 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -66,14 +66,14 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
* Macros to convert between a physical address and its placement in a
* page table entry, taking care of 52-bit addresses.
*/
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
#define __pte_to_phys(pte) \
((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
#define __phys_to_pte_val(phys) (((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
-#else
+#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
#define __pte_to_phys(pte) (pte_val(pte) & PTE_ADDR_MASK)
#define __phys_to_pte_val(phys) (phys)
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
#define pte_pfn(pte) (__pte_to_phys(pte) >> PAGE_SHIFT)
#define pfn_pte(pfn,prot) \
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index 4a64089..090dfbe 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -40,7 +40,7 @@ void __init pgtable_cache_init(void)
if (PGD_SIZE == PAGE_SIZE)
return;
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
/*
* With 52-bit physical addresses, the architecture requires the
* top-level table to be aligned to at least 64 bytes.
--
2.7.4
* [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
` (4 preceding siblings ...)
2021-07-14 2:21 ` [RFC 05/10] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2] Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 15:38 ` Steven Price
2021-07-14 2:21 ` [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
` (3 subsequent siblings)
9 siblings, 1 reply; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
FEAT_LPA2 requires different PTE representation formats for both the 4K and
16K page size configs. This adds the FEAT_LPA2-specific PTE encodings as per
the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The updated
helpers will be used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52
on 4K and 16K page sizes. The TTBR encoding and the phys_to_ttbr() helper,
however, remain the same for FEAT_LPA2 as for FEAT_LPA. This also updates
the 'phys_to_pte' helper to accept a temporary register and changes the
impacted call sites.
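To make the shift constants below easier to follow: the top PA bits are
folded into low, otherwise-unused PTE bits, and the shift is the distance
between the two positions:
	/* FEAT_LPA  (64K):   PA[51:48] live in PTE[15:12] => 51 - 15 = 36 */
	pte = (phys | (phys >> 36)) & PTE_ADDR_MASK;

	/* FEAT_LPA2 (4K/16K): PA[51:50] live in PTE[9:8]  => 51 - 9  = 42 */
	pte = (phys | (phys >> 42)) & PTE_ADDR_MASK;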
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/assembler.h | 23 +++++++++++++++++++----
arch/arm64/include/asm/pgtable-hwdef.h | 4 ++++
arch/arm64/include/asm/pgtable.h | 4 ++++
arch/arm64/kernel/head.S | 25 +++++++++++++------------
4 files changed, 40 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index fedc202..0492543 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -606,7 +606,7 @@ alternative_endif
#endif
.endm
- .macro phys_to_pte, pte, phys
+ .macro phys_to_pte, pte, phys, tmp
#ifdef CONFIG_ARM64_PA_BITS_52_LPA
/*
* We assume \phys is 64K aligned and this is guaranteed by only
@@ -614,6 +614,17 @@ alternative_endif
*/
orr \pte, \phys, \phys, lsr #36
and \pte, \pte, #PTE_ADDR_MASK
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+ orr \pte, \phys, \phys, lsr #42
+
+ /*
+ * The 'tmp' is being used here to just prepare
+ * and hold PTE_ADDR_MASK which cannot be passed
+ * to the subsequent 'and' instruction.
+ */
+ mov \tmp, #PTE_ADDR_LOW
+ orr \tmp, \tmp, #PTE_ADDR_HIGH
+ and \pte, \pte, \tmp
#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
mov \pte, \phys
#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
@@ -621,9 +632,13 @@ alternative_endif
.macro pte_to_phys, phys, pte
#ifdef CONFIG_ARM64_PA_BITS_52_LPA
- ubfiz \phys, \pte, #(48 - 16 - 12), #16
- bfxil \phys, \pte, #16, #32
- lsl \phys, \phys, #16
+ ubfiz \phys, \pte, #(48 - PAGE_SHIFT - 12), #16
+ bfxil \phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
+ lsl \phys, \phys, #PAGE_SHIFT
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+ ubfiz \phys, \pte, #(52 - PAGE_SHIFT - 10), #10
+ bfxil \phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
+ lsl \phys, \phys, #PAGE_SHIFT
#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
and \phys, \pte, #PTE_ADDR_MASK
#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index f375bcf..c815a85 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -159,6 +159,10 @@
#define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
#define PTE_ADDR_HIGH (_AT(pteval_t, 0xf) << 12)
#define PTE_ADDR_MASK (PTE_ADDR_LOW | PTE_ADDR_HIGH)
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+#define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+#define PTE_ADDR_HIGH (_AT(pteval_t, 0x3) << 8)
+#define PTE_ADDR_MASK (PTE_ADDR_LOW | PTE_ADDR_HIGH)
#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
#define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
#define PTE_ADDR_MASK PTE_ADDR_LOW
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3c57fb2..5e7e402 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -70,6 +70,10 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
#define __pte_to_phys(pte) \
((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
#define __phys_to_pte_val(phys) (((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+#define __pte_to_phys(pte) \
+ ((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
+#define __phys_to_pte_val(phys) (((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
#define __pte_to_phys(pte) (pte_val(pte) & PTE_ADDR_MASK)
#define __phys_to_pte_val(phys) (phys)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c5c994a..6444147 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -134,9 +134,9 @@ SYM_CODE_END(preserve_boot_args)
* Corrupts: ptrs, tmp1, tmp2
* Returns: tbl -> next level table page address
*/
- .macro create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
+ .macro create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2, tmp3
add \tmp1, \tbl, #PAGE_SIZE
- phys_to_pte \tmp2, \tmp1
+ phys_to_pte \tmp2, \tmp1, \tmp3
orr \tmp2, \tmp2, #PMD_TYPE_TABLE // address of next table and entry type
lsr \tmp1, \virt, #\shift
sub \ptrs, \ptrs, #1
@@ -161,8 +161,8 @@ SYM_CODE_END(preserve_boot_args)
* Corrupts: index, tmp1
* Returns: rtbl
*/
- .macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1
-.Lpe\@: phys_to_pte \tmp1, \rtbl
+ .macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1, tmp2
+.Lpe\@: phys_to_pte \tmp1, \rtbl, \tmp2
orr \tmp1, \tmp1, \flags // tmp1 = table entry
str \tmp1, [\tbl, \index, lsl #3]
add \rtbl, \rtbl, \inc // rtbl = pa next level
@@ -224,31 +224,32 @@ SYM_CODE_END(preserve_boot_args)
* Preserves: vstart, vend, flags
* Corrupts: tbl, rtbl, istart, iend, tmp, count, sv
*/
- .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, tmp, count, sv
+ .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, \
+ tmp, tmp1, count, sv
add \rtbl, \tbl, #PAGE_SIZE
mov \sv, \rtbl
mov \count, #0
compute_indices \vstart, \vend, #PGDIR_SHIFT, \pgds, \istart, \iend, \count
- populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+ populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
mov \tbl, \sv
mov \sv, \rtbl
#if SWAPPER_PGTABLE_LEVELS > 3
compute_indices \vstart, \vend, #PUD_SHIFT, #PTRS_PER_PUD, \istart, \iend, \count
- populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+ populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
mov \tbl, \sv
mov \sv, \rtbl
#endif
#if SWAPPER_PGTABLE_LEVELS > 2
compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #PTRS_PER_PMD, \istart, \iend, \count
- populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+ populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
mov \tbl, \sv
#endif
compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #PTRS_PER_PTE, \istart, \iend, \count
bic \count, \phys, #SWAPPER_BLOCK_SIZE - 1
- populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp
+ populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp, \tmp1
.endm
/*
@@ -343,7 +344,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
#endif
mov x4, EXTRA_PTRS
- create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
+ create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6, x20
#else
/*
* If VA_BITS == 48, we don't have to configure an additional
@@ -356,7 +357,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
ldr_l x4, idmap_ptrs_per_pgd
adr_l x6, __idmap_text_end // __pa(__idmap_text_end)
- map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14
+ map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
/*
* Map the kernel image (starting with PHYS_OFFSET).
@@ -370,7 +371,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
sub x6, x6, x3 // _end - _text
add x6, x6, x5 // runtime __va(_end)
- map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14
+ map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
/*
* Since the page tables have been populated with non-cacheable
--
2.7.4
* [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
` (5 preceding siblings ...)
2021-07-14 2:21 ` [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 8:21 ` Suzuki K Poulose
2021-07-14 2:21 ` [RFC 08/10] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S Anshuman Khandual
` (2 subsequent siblings)
9 siblings, 1 reply; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
Detect the FEAT_LPA2 implementation early during boot when requested via
CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
enabling the 52-bit PA range, or to fall back to the default 48-bit PA
range if FEAT_LPA2 was requested but found not to be implemented.
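A C-level equivalent of the detection sequence (sketch only; the real code
runs with the MMU off, which is why it is written in assembly and why the
store to arm64_lpa2_enabled needs DC IVAC cache maintenance):
	u64 mmfr0 = read_sysreg(id_aa64mmfr0_el1);

	if (cpuid_feature_extract_unsigned_field(mmfr0,
			ID_AA64MMFR0_TGRAN_SHIFT) == ID_AA64MMFR0_TGRAN_LPA2)
		arm64_lpa2_enabled = 1;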
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/memory.h | 1 +
arch/arm64/kernel/head.S | 15 +++++++++++++++
arch/arm64/mm/mmu.c | 3 +++
arch/arm64/mm/proc.S | 9 +++++++++
4 files changed, 28 insertions(+)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 824a365..d0ca002 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -178,6 +178,7 @@
#include <asm/bug.h>
extern u64 vabits_actual;
+extern u64 arm64_lpa2_enabled;
extern s64 memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 6444147..9cf79ea 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
adrp x23, __PHYS_OFFSET
and x23, x23, MIN_KIMG_ALIGN - 1 // KASLR offset, defaults to 0
bl set_cpu_boot_mode_flag
+
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+ mrs x10, ID_AA64MMFR0_EL1
+ ubfx x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
+ cmp x10, #ID_AA64MMFR0_TGRAN_LPA2
+ b.ne 1f
+
+ mov x10, #1
+ adr_l x11, arm64_lpa2_enabled
+ str x10, [x11]
+ dmb sy
+ dc ivac, x11
+1:
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
bl __create_page_tables
/*
* The following calls CPU setup code, see arch/arm64/mm/proc.S for
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d745865..00b7595 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -48,6 +48,9 @@ u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
u64 __section(".mmuoff.data.write") vabits_actual;
EXPORT_SYMBOL(vabits_actual);
+u64 __section(".mmuoff.data.write") arm64_lpa2_enabled;
+EXPORT_SYMBOL(arm64_lpa2_enabled);
+
u64 kimage_voffset __ro_after_init;
EXPORT_SYMBOL(kimage_voffset);
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 1ae0c2b..672880c 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -423,6 +423,15 @@ SYM_FUNC_START(__cpu_setup)
TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+ ldr_l x10, arm64_lpa2_enabled
+ cmp x10, #1
+ b.ne 1f
+ mov_q x10, TCR_DS
+ orr tcr, tcr, x10
+1:
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
#ifdef CONFIG_ARM64_MTE
/*
* Update MAIR_EL1, GCR_EL1 and TFSR*_EL1 if MTE is supported
--
2.7.4
* [RFC 08/10] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
` (6 preceding siblings ...)
2021-07-14 2:21 ` [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 09/10] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented Anshuman Khandual
2021-07-14 2:21 ` [RFC 10/10] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES Anshuman Khandual
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
PTE[9:8], which holds the shareability attribute bits SH[1:0], could collide
with PA[51:50] when CONFIG_ARM64_PA_BITS_52 is enabled but FEAT_LPA2 is not
detected during boot. Dropping the PTE_SHARED and PMD_SECT_S attributes
completely in this scenario would create non-shared page table entries,
causing a regression.
Instead, define PTE_SHARED and PMD_SECT_S after accounting for the runtime
variable 'arm64_lpa2_enabled', thus maintaining the required shareability
attributes for both kernel and user space page table entries. This also
updates the ptdump handling of the page table entry shared attributes to
accommodate FEAT_LPA2 scenarios.
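A sketch of the resulting behaviour for the early swapper mappings
(mirroring the head.S hunk below):
	u64 flags = SWAPPER_MM_MMUFLAGS;	/* SH bits no longer included */

	if (!arm64_lpa2_enabled)
		flags |= PTE_SHARED_STATIC;	/* PTE[9:8] = SH[1:0], inner shareable */
	/* else PTE[9:8] carry PA[51:50]; shareability comes from TCR_EL1.SH{0,1} */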
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/kernel-pgtable.h | 4 ++--
arch/arm64/include/asm/pgtable-hwdef.h | 12 ++++++++++--
arch/arm64/kernel/head.S | 15 +++++++++++++++
arch/arm64/mm/ptdump.c | 26 ++++++++++++++++++++++++--
4 files changed, 51 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 3512184..db682b5 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -103,8 +103,8 @@
/*
* Initial memory map attributes.
*/
-#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF)
+#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF)
#if ARM64_KERNEL_USES_PMD_MAPS
#define SWAPPER_MM_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index c815a85..8a3b75e 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -116,13 +116,21 @@
#define PMD_TYPE_SECT (_AT(pmdval_t, 1) << 0)
#define PMD_TABLE_BIT (_AT(pmdval_t, 1) << 1)
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+#define PTE_SHARED (arm64_lpa2_enabled ? 0 : PTE_SHARED_STATIC)
+#define PMD_SECT_S (arm64_lpa2_enabled ? 0 : PMD_SECT_S_STATIC)
+#else /* !CONFIG_ARM64_PA_BITS_52_LPA2 */
+#define PTE_SHARED PTE_SHARED_STATIC
+#define PMD_SECT_S PMD_SECT_S_STATIC
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
/*
* Section
*/
#define PMD_SECT_VALID (_AT(pmdval_t, 1) << 0)
#define PMD_SECT_USER (_AT(pmdval_t, 1) << 6) /* AP[1] */
#define PMD_SECT_RDONLY (_AT(pmdval_t, 1) << 7) /* AP[2] */
-#define PMD_SECT_S (_AT(pmdval_t, 3) << 8)
+#define PMD_SECT_S_STATIC (_AT(pmdval_t, 3) << 8)
#define PMD_SECT_AF (_AT(pmdval_t, 1) << 10)
#define PMD_SECT_NG (_AT(pmdval_t, 1) << 11)
#define PMD_SECT_CONT (_AT(pmdval_t, 1) << 52)
@@ -146,7 +154,7 @@
#define PTE_TABLE_BIT (_AT(pteval_t, 1) << 1)
#define PTE_USER (_AT(pteval_t, 1) << 6) /* AP[1] */
#define PTE_RDONLY (_AT(pteval_t, 1) << 7) /* AP[2] */
-#define PTE_SHARED (_AT(pteval_t, 3) << 8) /* SH[1:0], inner shareable */
+#define PTE_SHARED_STATIC (_AT(pteval_t, 3) << 8) /* SH[1:0], inner shareable */
#define PTE_AF (_AT(pteval_t, 1) << 10) /* Access Flag */
#define PTE_NG (_AT(pteval_t, 1) << 11) /* nG */
#define PTE_GP (_AT(pteval_t, 1) << 50) /* BTI guarded */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 9cf79ea..5732da0 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -302,6 +302,21 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
mov x7, SWAPPER_MM_MMUFLAGS
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+ ldr_l x2, arm64_lpa2_enabled
+ cmp x2, #1
+ b.eq 1f
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
+ /*
+ * FEAT_LPA2 has not been detected during boot.
+ * Hence SWAPPER_MM_MMUFLAGS needs to have the
+ * regular sharability attributes in PTE[9:8].
+ * Same is also applicable when FEAT_LPA2 has
+ * not been requested in the first place.
+ */
+ orr x7, x7, PTE_SHARED_STATIC
+1:
/*
* Create the identity mapping.
*/
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 1c40353..be171cf 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -115,8 +115,8 @@ static const struct prot_bits pte_bits[] = {
.set = "NX",
.clear = "x ",
}, {
- .mask = PTE_SHARED,
- .val = PTE_SHARED,
+ .mask = PTE_SHARED_STATIC,
+ .val = PTE_SHARED_STATIC,
.set = "SHD",
.clear = " ",
}, {
@@ -211,6 +211,28 @@ static void dump_prot(struct pg_state *st, const struct prot_bits *bits,
for (i = 0; i < num; i++, bits++) {
const char *s;
+ if (IS_ENABLED(CONFIG_ARM64_PA_BITS_52_LPA2) &&
+ (bits->mask == PTE_SHARED_STATIC)) {
+ /*
+ * If FEAT_LPA2 has been detected and enabled
+ * sharing attributes for page table entries
+ * are inherited from TCR_EL1.SH1 as init_mm
+ * based mappings are enabled from TTBR1_EL1.
+ */
+ if (arm64_lpa2_enabled) {
+ if ((read_sysreg(tcr_el1) & TCR_SH1_INNER) == TCR_SH1_INNER)
+ pt_dump_seq_printf(st->seq, " SHD ");
+ else
+ pt_dump_seq_printf(st->seq, " ");
+ continue;
+ }
+ /*
+ * In case FEAT_LPA2 has not been detected and
+ * enabled sharing attributes should be found
+ * in the regular PTE positions. It just falls
+ * through regular PTE attribute handling.
+ */
+ }
if ((st->current_prot & bits->mask) == bits->val)
s = bits->set;
else
--
2.7.4
* [RFC 09/10] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
` (7 preceding siblings ...)
2021-07-14 2:21 ` [RFC 08/10] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 10/10] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES Anshuman Khandual
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
Kernels built with CONFIG_ARM64_PA_BITS_52 need to fall back to the 48-bit
PA range encodings when FEAT_LPA2 is not implemented, i.e. when TCR_EL1.DS
cannot be set. Hence modify the applicable PTE and TTBR encoding helpers to
accommodate this scenario via 'arm64_lpa2_enabled'.
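The reason a separate 48-bit mask is needed: under
CONFIG_ARM64_PA_BITS_52_LPA2 the config-time PTE_ADDR_MASK includes bits
[9:8], but on a CPU without FEAT_LPA2 those bits hold SH[1:0], so masking
with it would leak shareability bits into the physical address:
	phys = pte_val(pte) & PTE_ADDR_MASK;	/* wrong without LPA2: picks up SH[1:0] */
	phys = pte_val(pte) & PTE_ADDR_MASK_48;	/* correct 48-bit fallback */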
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/assembler.h | 13 +++++++++++++
arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
arch/arm64/include/asm/pgtable.h | 12 ++++++++++--
3 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0492543..844e9a0 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -615,6 +615,10 @@ alternative_endif
orr \pte, \phys, \phys, lsr #36
and \pte, \pte, #PTE_ADDR_MASK
#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+ ldr_l \tmp, arm64_lpa2_enabled
+ cmp \tmp, #1
+ b.ne .Lskip_lpa2\@
+
orr \pte, \phys, \phys, lsr #42
/*
@@ -625,6 +629,9 @@ alternative_endif
mov \tmp, #PTE_ADDR_LOW
orr \tmp, \tmp, #PTE_ADDR_HIGH
and \pte, \pte, \tmp
+
+.Lskip_lpa2\@:
+ mov \pte, \phys
#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
mov \pte, \phys
#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
@@ -636,9 +643,15 @@ alternative_endif
bfxil \phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
lsl \phys, \phys, #PAGE_SHIFT
#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+ ldr_l \phys, arm64_lpa2_enabled
+ cmp \phys, #1
+ b.ne .Lskip_lpa2\@
+
ubfiz \phys, \pte, #(52 - PAGE_SHIFT - 10), #10
bfxil \phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
lsl \phys, \phys, #PAGE_SHIFT
+.Lskip_lpa2\@:
+ and \phys, \pte, #PTE_ADDR_MASK_48
#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
and \phys, \pte, #PTE_ADDR_MASK
#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 8a3b75e..b98b764 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -176,6 +176,8 @@
#define PTE_ADDR_MASK PTE_ADDR_LOW
#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
+#define PTE_ADDR_MASK_48 (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+
/*
* AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
*/
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 5e7e402..97b3cd2 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -71,9 +71,17 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
#define __phys_to_pte_val(phys) (((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
-#define __pte_to_phys(pte) \
+#define __pte_to_phys_52(pte) \
((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
-#define __phys_to_pte_val(phys) (((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
+#define __phys_to_pte_val_52(phys) (((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
+
+#define __pte_to_phys_48(pte) (pte_val(pte) & PTE_ADDR_MASK_48)
+#define __phys_to_pte_val_48(phys) (phys)
+
+#define __pte_to_phys(pte) \
+ (arm64_lpa2_enabled ? __pte_to_phys_52(pte) : __pte_to_phys_48(pte))
+#define __phys_to_pte_val(phys) \
+ (arm64_lpa2_enabled ? __phys_to_pte_val_52(phys) : __phys_to_pte_val_48(phys))
#else /* !CONFIG_ARM64_PA_BITS_52_LPA */
#define __pte_to_phys(pte) (pte_val(pte) & PTE_ADDR_MASK)
#define __phys_to_pte_val(phys) (phys)
--
2.7.4
* [RFC 10/10] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
` (8 preceding siblings ...)
2021-07-14 2:21 ` [RFC 09/10] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented Anshuman Khandual
@ 2021-07-14 2:21 ` Anshuman Khandual
9 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-14 2:21 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse, steven.price, Anshuman Khandual
All required FEAT_LPA2 components for the 52-bit PA range are now in place.
Just enable CONFIG_ARM64_PA_BITS_52 on 4K and 16K pages, which selects
CONFIG_ARM64_PA_BITS_52_LPA2, activating the 52-bit PA range via FEAT_LPA2.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 658a6fd..bc7e5c6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -952,9 +952,9 @@ config ARM64_PA_BITS_48
config ARM64_PA_BITS_52
bool "52-bit (ARMv8.2)"
- depends on ARM64_64K_PAGES
depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
select ARM64_PA_BITS_52_LPA if ARM64_64K_PAGES
+ select ARM64_PA_BITS_52_LPA2 if (ARM64_4K_PAGES || ARM64_16K_PAGES)
help
Enable support for a 52-bit physical address space, introduced as
part of the ARMv8.2-LPA extension.
--
2.7.4
* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
2021-07-14 2:21 ` [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
@ 2021-07-14 8:21 ` Suzuki K Poulose
2021-07-16 7:06 ` Anshuman Khandual
0 siblings, 1 reply; 19+ messages in thread
From: Suzuki K Poulose @ 2021-07-14 8:21 UTC (permalink / raw)
To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
steven.price
On 14/07/2021 03:21, Anshuman Khandual wrote:
> Detect the FEAT_LPA2 implementation early during boot when requested via
> CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
> arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
> enabling the 52-bit PA range, or to fall back to the default 48-bit PA
> range if FEAT_LPA2 was requested but found not to be implemented.
>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> arch/arm64/include/asm/memory.h | 1 +
> arch/arm64/kernel/head.S | 15 +++++++++++++++
> arch/arm64/mm/mmu.c | 3 +++
> arch/arm64/mm/proc.S | 9 +++++++++
> 4 files changed, 28 insertions(+)
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 824a365..d0ca002 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -178,6 +178,7 @@
> #include <asm/bug.h>
>
> extern u64 vabits_actual;
> +extern u64 arm64_lpa2_enabled;
>
> extern s64 memstart_addr;
> /* PHYS_OFFSET - the physical address of the start of memory. */
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 6444147..9cf79ea 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
> adrp x23, __PHYS_OFFSET
> and x23, x23, MIN_KIMG_ALIGN - 1 // KASLR offset, defaults to 0
> bl set_cpu_boot_mode_flag
> +
> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
> + mrs x10, ID_AA64MMFR0_EL1
> + ubfx x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
> + cmp x10, #ID_AA64MMFR0_TGRAN_LPA2
> + b.ne 1f
For the sake of forward compatibility, this should be "b.lt"
Suzuki
* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
2021-07-14 2:21 ` [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
@ 2021-07-14 15:38 ` Steven Price
2021-07-16 7:20 ` Anshuman Khandual
0 siblings, 1 reply; 19+ messages in thread
From: Steven Price @ 2021-07-14 15:38 UTC (permalink / raw)
To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse
On 14/07/2021 03:21, Anshuman Khandual wrote:
> FEAT_LPA2 requires different PTE representation formats for both the 4K and
> 16K page size configs. This adds the FEAT_LPA2-specific PTE encodings as per
> the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The updated
> helpers will be used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52
> on 4K and 16K page sizes. The TTBR encoding and the phys_to_ttbr() helper,
> however, remain the same for FEAT_LPA2 as for FEAT_LPA. This also updates
> the 'phys_to_pte' helper to accept a temporary register and changes the
> impacted call sites.
>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> arch/arm64/include/asm/assembler.h | 23 +++++++++++++++++++----
> arch/arm64/include/asm/pgtable-hwdef.h | 4 ++++
> arch/arm64/include/asm/pgtable.h | 4 ++++
> arch/arm64/kernel/head.S | 25 +++++++++++++------------
> 4 files changed, 40 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index fedc202..0492543 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -606,7 +606,7 @@ alternative_endif
> #endif
> .endm
>
> - .macro phys_to_pte, pte, phys
> + .macro phys_to_pte, pte, phys, tmp
> #ifdef CONFIG_ARM64_PA_BITS_52_LPA
> /*
> * We assume \phys is 64K aligned and this is guaranteed by only
> @@ -614,6 +614,17 @@ alternative_endif
> */
> orr \pte, \phys, \phys, lsr #36
> and \pte, \pte, #PTE_ADDR_MASK
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> + orr \pte, \phys, \phys, lsr #42
> +
> + /*
> + * The 'tmp' is being used here to just prepare
> + * and hold PTE_ADDR_MASK which cannot be passed
> + * to the subsequent 'and' instruction.
> + */
> + mov \tmp, #PTE_ADDR_LOW
> + orr \tmp, \tmp, #PTE_ADDR_HIGH
> + and \pte, \pte, \tmp
Rather than adding an extra temporary register (and the fallout of
various other macros needing an extra register), this can be done with
two AND instructions:
/* PTE_ADDR_MASK cannot be encoded as an immediate, so
 * mask with PTE_ADDR_MASK plus two extra bits first,
 * then clear those two extra bits
 */
and \pte, \pte, #PTE_ADDR_MASK | (3 << 10)
and \pte, \pte, #~(3 << 10)
Steve
> #else /* !CONFIG_ARM64_PA_BITS_52_LPA */
> mov \pte, \phys
> #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
> @@ -621,9 +632,13 @@ alternative_endif
>
> .macro pte_to_phys, phys, pte
> #ifdef CONFIG_ARM64_PA_BITS_52_LPA
> - ubfiz \phys, \pte, #(48 - 16 - 12), #16
> - bfxil \phys, \pte, #16, #32
> - lsl \phys, \phys, #16
> + ubfiz \phys, \pte, #(48 - PAGE_SHIFT - 12), #16
> + bfxil \phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
> + lsl \phys, \phys, #PAGE_SHIFT
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> + ubfiz \phys, \pte, #(52 - PAGE_SHIFT - 10), #10
> + bfxil \phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
> + lsl \phys, \phys, #PAGE_SHIFT
> #else /* !CONFIG_ARM64_PA_BITS_52_LPA */
> and \phys, \pte, #PTE_ADDR_MASK
> #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index f375bcf..c815a85 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -159,6 +159,10 @@
> #define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
> #define PTE_ADDR_HIGH (_AT(pteval_t, 0xf) << 12)
> #define PTE_ADDR_MASK (PTE_ADDR_LOW | PTE_ADDR_HIGH)
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> +#define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
> +#define PTE_ADDR_HIGH (_AT(pteval_t, 0x3) << 8)
> +#define PTE_ADDR_MASK (PTE_ADDR_LOW | PTE_ADDR_HIGH)
> #else /* !CONFIG_ARM64_PA_BITS_52_LPA */
> #define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
> #define PTE_ADDR_MASK PTE_ADDR_LOW
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 3c57fb2..5e7e402 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -70,6 +70,10 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
> #define __pte_to_phys(pte) \
> ((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
> #define __phys_to_pte_val(phys) (((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> +#define __pte_to_phys(pte) \
> + ((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
> +#define __phys_to_pte_val(phys) (((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
> #else /* !CONFIG_ARM64_PA_BITS_52_LPA */
> #define __pte_to_phys(pte) (pte_val(pte) & PTE_ADDR_MASK)
> #define __phys_to_pte_val(phys) (phys)
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index c5c994a..6444147 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -134,9 +134,9 @@ SYM_CODE_END(preserve_boot_args)
> * Corrupts: ptrs, tmp1, tmp2
> * Returns: tbl -> next level table page address
> */
> - .macro create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
> + .macro create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2, tmp3
> add \tmp1, \tbl, #PAGE_SIZE
> - phys_to_pte \tmp2, \tmp1
> + phys_to_pte \tmp2, \tmp1, \tmp3
> orr \tmp2, \tmp2, #PMD_TYPE_TABLE // address of next table and entry type
> lsr \tmp1, \virt, #\shift
> sub \ptrs, \ptrs, #1
> @@ -161,8 +161,8 @@ SYM_CODE_END(preserve_boot_args)
> * Corrupts: index, tmp1
> * Returns: rtbl
> */
> - .macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1
> -.Lpe\@: phys_to_pte \tmp1, \rtbl
> + .macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1, tmp2
> +.Lpe\@: phys_to_pte \tmp1, \rtbl, \tmp2
> orr \tmp1, \tmp1, \flags // tmp1 = table entry
> str \tmp1, [\tbl, \index, lsl #3]
> add \rtbl, \rtbl, \inc // rtbl = pa next level
> @@ -224,31 +224,32 @@ SYM_CODE_END(preserve_boot_args)
> * Preserves: vstart, vend, flags
> * Corrupts: tbl, rtbl, istart, iend, tmp, count, sv
> */
> - .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, tmp, count, sv
> + .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, \
> + tmp, tmp1, count, sv
> add \rtbl, \tbl, #PAGE_SIZE
> mov \sv, \rtbl
> mov \count, #0
> compute_indices \vstart, \vend, #PGDIR_SHIFT, \pgds, \istart, \iend, \count
> - populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
> + populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
> mov \tbl, \sv
> mov \sv, \rtbl
>
> #if SWAPPER_PGTABLE_LEVELS > 3
> compute_indices \vstart, \vend, #PUD_SHIFT, #PTRS_PER_PUD, \istart, \iend, \count
> - populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
> + populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
> mov \tbl, \sv
> mov \sv, \rtbl
> #endif
>
> #if SWAPPER_PGTABLE_LEVELS > 2
> compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #PTRS_PER_PMD, \istart, \iend, \count
> - populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
> + populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
> mov \tbl, \sv
> #endif
>
> compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #PTRS_PER_PTE, \istart, \iend, \count
> bic \count, \phys, #SWAPPER_BLOCK_SIZE - 1
> - populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp
> + populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp, \tmp1
> .endm
>
> /*
> @@ -343,7 +344,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
> #endif
>
> mov x4, EXTRA_PTRS
> - create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
> + create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6, x20
> #else
> /*
> * If VA_BITS == 48, we don't have to configure an additional
> @@ -356,7 +357,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
> ldr_l x4, idmap_ptrs_per_pgd
> adr_l x6, __idmap_text_end // __pa(__idmap_text_end)
>
> - map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14
> + map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
>
> /*
> * Map the kernel image (starting with PHYS_OFFSET).
> @@ -370,7 +371,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
> sub x6, x6, x3 // _end - _text
> add x6, x6, x5 // runtime __va(_end)
>
> - map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14
> + map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
>
> /*
> * Since the page tables have been populated with non-cacheable
>
* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
2021-07-14 8:21 ` Suzuki K Poulose
@ 2021-07-16 7:06 ` Anshuman Khandual
2021-07-16 8:08 ` Suzuki K Poulose
0 siblings, 1 reply; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-16 7:06 UTC (permalink / raw)
To: Suzuki K Poulose, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
steven.price
On 7/14/21 1:51 PM, Suzuki K Poulose wrote:
> On 14/07/2021 03:21, Anshuman Khandual wrote:
>> Detect the FEAT_LPA2 implementation early during boot when requested via
>> CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
>> arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
>> enabling the 52-bit PA range, or to fall back to the default 48-bit PA
>> range if FEAT_LPA2 was requested but found not to be implemented.
>>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>> arch/arm64/include/asm/memory.h | 1 +
>> arch/arm64/kernel/head.S | 15 +++++++++++++++
>> arch/arm64/mm/mmu.c | 3 +++
>> arch/arm64/mm/proc.S | 9 +++++++++
>> 4 files changed, 28 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>> index 824a365..d0ca002 100644
>> --- a/arch/arm64/include/asm/memory.h
>> +++ b/arch/arm64/include/asm/memory.h
>> @@ -178,6 +178,7 @@
>> #include <asm/bug.h>
>> extern u64 vabits_actual;
>> +extern u64 arm64_lpa2_enabled;
>> extern s64 memstart_addr;
>> /* PHYS_OFFSET - the physical address of the start of memory. */
>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>> index 6444147..9cf79ea 100644
>> --- a/arch/arm64/kernel/head.S
>> +++ b/arch/arm64/kernel/head.S
>> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
>> adrp x23, __PHYS_OFFSET
>> and x23, x23, MIN_KIMG_ALIGN - 1 // KASLR offset, defaults to 0
>> bl set_cpu_boot_mode_flag
>> +
>> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
>> + mrs x10, ID_AA64MMFR0_EL1
>> + ubfx x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
>> + cmp x10, #ID_AA64MMFR0_TGRAN_LPA2
>> + b.ne 1f
>
> For the sake of forward compatibility, this should be "b.lt"
Right, I guess we could assume that the feature will be present from the
current ID_AA64MMFR0_TGRAN_LPA2 values onward in the future. But shouldn't
this also be capped at ID_AA64MMFR0_TGRAN_SUPPORTED_MAX, as the upper limit
is different for the 4K and 16K page sizes?
* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
2021-07-14 15:38 ` Steven Price
@ 2021-07-16 7:20 ` Anshuman Khandual
2021-07-16 10:02 ` Steven Price
0 siblings, 1 reply; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-16 7:20 UTC (permalink / raw)
To: Steven Price, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse
On 7/14/21 9:08 PM, Steven Price wrote:
> On 14/07/2021 03:21, Anshuman Khandual wrote:
>> FEAT_LPA2 requires different PTE representation formats for both the 4K and
>> 16K page size configs. This adds the FEAT_LPA2-specific PTE encodings as per
>> the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The updated
>> helpers will be used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52
>> on 4K and 16K page sizes. The TTBR encoding and the phys_to_ttbr() helper,
>> however, remain the same for FEAT_LPA2 as for FEAT_LPA. This also updates
>> the 'phys_to_pte' helper to accept a temporary register and changes the
>> impacted call sites.
>>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>> arch/arm64/include/asm/assembler.h | 23 +++++++++++++++++++----
>> arch/arm64/include/asm/pgtable-hwdef.h | 4 ++++
>> arch/arm64/include/asm/pgtable.h | 4 ++++
>> arch/arm64/kernel/head.S | 25 +++++++++++++------------
>> 4 files changed, 40 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>> index fedc202..0492543 100644
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -606,7 +606,7 @@ alternative_endif
>> #endif
>> .endm
>>
>> - .macro phys_to_pte, pte, phys
>> + .macro phys_to_pte, pte, phys, tmp
>> #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>> /*
>> * We assume \phys is 64K aligned and this is guaranteed by only
>> @@ -614,6 +614,17 @@ alternative_endif
>> */
>> orr \pte, \phys, \phys, lsr #36
>> and \pte, \pte, #PTE_ADDR_MASK
>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>> + orr \pte, \phys, \phys, lsr #42
>> +
>> + /*
>> + * The 'tmp' is being used here to just prepare
>> + * and hold PTE_ADDR_MASK which cannot be passed
>> + * to the subsequent 'and' instruction.
>> + */
>> + mov \tmp, #PTE_ADDR_LOW
>> + orr \tmp, \tmp, #PTE_ADDR_HIGH
>> + and \pte, \pte, \tmp
> Rather than adding an extra temporary register (and the fallout of
> various other macros needing an extra register), this can be done with
> two AND instructions:
I would really like to get rid of the 'tmp' register here as well, but I
could not figure out a way to accomplish it.
>
> /* PTE_ADDR_MASK cannot be encoded as an immediate, so
> * mask off all but two bits, followed by masking the
> * extra two bits
> */
> and \pte, \pte, #PTE_ADDR_MASK | (3 << 10)
> and \pte, \pte, #~(3 << 10)
I made the change as suggested:
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -626,9 +626,8 @@ alternative_endif
* and hold PTE_ADDR_MASK which cannot be passed
* to the subsequent 'and' instruction.
*/
- mov \tmp, #PTE_ADDR_LOW
- orr \tmp, \tmp, #PTE_ADDR_HIGH
- and \pte, \pte, \tmp
+ and \pte, \pte, #PTE_ADDR_MASK | (0x3 << 10)
+ and \pte, \pte, #~(0x3 << 10)
.Lskip_lpa2\@:
mov \pte, \phys
but it still fails to build (tested with 16K pages):
arch/arm64/kernel/head.S: Assembler messages:
arch/arm64/kernel/head.S:377: Error: immediate out of range at operand 3 -- `and x6,x6,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
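For readers following the bit manipulation: with FEAT_LPA2 the physical
address bits [51:50] cannot live in the PTE's low address field, so the
"lsr #42" folds them down into bits [9:8] (PTE_ADDR_HIGH) before masking.
Below is a hedged C sketch of that encoding and its inverse, using the 16K
constants visible in the assembler errors above; an illustration of the
scheme, not the patch itself:

#include <stdint.h>

#define PAGE_SHIFT      14                              /* 16K pages */
#define PTE_ADDR_LOW    (((1ULL << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
#define PTE_ADDR_HIGH   (0x3ULL << 8)                   /* holds PA[51:50] */
#define PTE_ADDR_MASK   (PTE_ADDR_LOW | PTE_ADDR_HIGH)

static uint64_t phys_to_pte(uint64_t phys)
{
        /* Fold PA[51:50] down to PTE[9:8], then keep only address fields. */
        return (phys | (phys >> 42)) & PTE_ADDR_MASK;
}

static uint64_t pte_to_phys(uint64_t pte)
{
        /* Inverse: lift PTE[9:8] back up to PA[51:50]. */
        return (pte & PTE_ADDR_LOW) | ((pte & PTE_ADDR_HIGH) << 42);
}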
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
2021-07-16 7:06 ` Anshuman Khandual
@ 2021-07-16 8:08 ` Suzuki K Poulose
2021-07-19 4:47 ` Anshuman Khandual
0 siblings, 1 reply; 19+ messages in thread
From: Suzuki K Poulose @ 2021-07-16 8:08 UTC (permalink / raw)
To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
steven.price
On 16/07/2021 08:06, Anshuman Khandual wrote:
>
> On 7/14/21 1:51 PM, Suzuki K Poulose wrote:
>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>> Detect the FEAT_LPA2 implementation early enough during boot when
>>> requested via CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
>>> arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
>>> enabling the 52-bit PA range, or to fall back to the default 48-bit PA
>>> range if FEAT_LPA2 was requested but is not implemented.
>>>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>> arch/arm64/include/asm/memory.h | 1 +
>>> arch/arm64/kernel/head.S | 15 +++++++++++++++
>>> arch/arm64/mm/mmu.c | 3 +++
>>> arch/arm64/mm/proc.S | 9 +++++++++
>>> 4 files changed, 28 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>>> index 824a365..d0ca002 100644
>>> --- a/arch/arm64/include/asm/memory.h
>>> +++ b/arch/arm64/include/asm/memory.h
>>> @@ -178,6 +178,7 @@
>>> #include <asm/bug.h>
>>> extern u64 vabits_actual;
>>> +extern u64 arm64_lpa2_enabled;
>>> extern s64 memstart_addr;
>>> /* PHYS_OFFSET - the physical address of the start of memory. */
>>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>>> index 6444147..9cf79ea 100644
>>> --- a/arch/arm64/kernel/head.S
>>> +++ b/arch/arm64/kernel/head.S
>>> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
>>> adrp x23, __PHYS_OFFSET
>>> and x23, x23, MIN_KIMG_ALIGN - 1 // KASLR offset, defaults to 0
>>> bl set_cpu_boot_mode_flag
>>> +
>>> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
>>> + mrs x10, ID_AA64MMFR0_EL1
>>> + ubfx x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
>>> + cmp x10, #ID_AA64MMFR0_TGRAN_LPA2
>>> + b.ne 1f
>>
>> For the sake of forward compatibility, this should be "b.lt"
> Right, I guess we can assume that the feature will remain present for all
> values from the current ID_AA64MMFR0_TGRAN_LPA2 onward. But should this
> not also be capped at ID_AA64MMFR0_TGRAN_SUPPORTED_MAX, since the upper
> limit differs between 4K and 16K page sizes?
Absolutely.
Cheers
Suzuki
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
2021-07-16 7:20 ` Anshuman Khandual
@ 2021-07-16 10:02 ` Steven Price
2021-07-16 14:37 ` Anshuman Khandual
0 siblings, 1 reply; 19+ messages in thread
From: Steven Price @ 2021-07-16 10:02 UTC (permalink / raw)
To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse
On 16/07/2021 08:20, Anshuman Khandual wrote:
>
>
> On 7/14/21 9:08 PM, Steven Price wrote:
>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>> FEAT_LPA2 requires different PTE representation formats for both the 4K
>>> and 16K page size configs. This adds new FEAT_LPA2 specific PTE encodings
>>> as per the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The
>>> updated helpers are used when FEAT_LPA2 gets enabled via
>>> CONFIG_ARM64_PA_BITS_52 on 4K and 16K page sizes. The TTBR encoding and
>>> the phys_to_ttbr() helper remain the same for FEAT_LPA2 as for FEAT_LPA.
>>> It also updates the 'phys_to_pte' helper to accept a temporary register
>>> and changes the impacted call sites.
>>>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>> arch/arm64/include/asm/assembler.h | 23 +++++++++++++++++++----
>>> arch/arm64/include/asm/pgtable-hwdef.h | 4 ++++
>>> arch/arm64/include/asm/pgtable.h | 4 ++++
>>> arch/arm64/kernel/head.S | 25 +++++++++++++------------
>>> 4 files changed, 40 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>>> index fedc202..0492543 100644
>>> --- a/arch/arm64/include/asm/assembler.h
>>> +++ b/arch/arm64/include/asm/assembler.h
>>> @@ -606,7 +606,7 @@ alternative_endif
>>> #endif
>>> .endm
>>>
>>> - .macro phys_to_pte, pte, phys
>>> + .macro phys_to_pte, pte, phys, tmp
>>> #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>>> /*
>>> * We assume \phys is 64K aligned and this is guaranteed by only
>>> @@ -614,6 +614,17 @@ alternative_endif
>>> */
>>> orr \pte, \phys, \phys, lsr #36
>>> and \pte, \pte, #PTE_ADDR_MASK
>>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>>> + orr \pte, \phys, \phys, lsr #42
>>> +
>>> + /*
>>> + * The 'tmp' is being used here to just prepare
>>> + * and hold PTE_ADDR_MASK which cannot be passed
>>> + * to the subsequent 'and' instruction.
>>> + */
>>> + mov \tmp, #PTE_ADDR_LOW
>>> + orr \tmp, \tmp, #PTE_ADDR_HIGH
>>> + and \pte, \pte, \tmp
>> Rather than adding an extra temporary register (and the fallout of
>> various other macros needing an extra register), this can be done with
>> two AND instructions:
>
> I would really like to get rid of the 'tmp' register here as well, but I
> could not figure out a way to accomplish it.
>
>>
>> /* PTE_ADDR_MASK cannot be encoded as an immediate, so
>> * mask off all but two bits, followed by masking the
>> * extra two bits
>> */
>> and \pte, \pte, #PTE_ADDR_MASK | (3 << 10)
>> and \pte, \pte, #~(3 << 10)
>
> I made the change as suggested:
>
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -626,9 +626,8 @@ alternative_endif
> * and hold PTE_ADDR_MASK which cannot be passed
> * to the subsequent 'and' instruction.
> */
> - mov \tmp, #PTE_ADDR_LOW
> - orr \tmp, \tmp, #PTE_ADDR_HIGH
> - and \pte, \pte, \tmp
> + and \pte, \pte, #PTE_ADDR_MASK | (0x3 << 10)
> + and \pte, \pte, #~(0x3 << 10)
>
> .Lskip_lpa2\@:
> mov \pte, \phys
>
>
> but it still fails to build (tested with 16K pages):
>
> arch/arm64/kernel/head.S: Assembler messages:
> arch/arm64/kernel/head.S:377: Error: immediate out of range at operand 3 -- `and x6,x6,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>
Ah, I'd only tested this for 4K; 16K requires a different set of masks.
The bits we need to cover run from just below PAGE_SHIFT down to bit 10,
just above the top of PTE_ADDR_HIGH. So we can compute the mask for both
4K and 16K with GENMASK(PAGE_SHIFT - 1, 10):
and \pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
and \pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)
This compiles (for both 4K and 16K) and the assembly looks correct, but
I've not done any other testing.
Steve
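The trick works because the GENMASK(PAGE_SHIFT - 1, 10) "gap" bits are
disjoint from PTE_ADDR_MASK: ANDing with (mask | gap) and then with ~gap
leaves exactly (x & mask), while each intermediate immediate is a contiguous
bit pattern that the A64 AND instruction can encode. A small hedged C check
of the mask algebra (not of the immediate encodability itself):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GENMASK(h, l)   ((~0ULL << (l)) & (~0ULL >> (63 - (h))))

static uint64_t pte_addr_mask(int page_shift)
{
        /* PTE_ADDR_LOW | PTE_ADDR_HIGH, per the expansions in the errors */
        return ((((1ULL << (50 - page_shift)) - 1) << page_shift) |
                (0x3ULL << 8));
}

int main(void)
{
        uint64_t patterns[] = { 0, ~0ULL, 0xAAAAAAAAAAAAAAAAULL,
                                0x000F123456789A5FULL };

        for (int shift = 12; shift <= 14; shift += 2) {   /* 4K and 16K */
                uint64_t mask = pte_addr_mask(shift);
                uint64_t gap  = GENMASK(shift - 1, 10);

                assert((mask & gap) == 0);  /* disjoint, so the algebra holds */
                for (unsigned int i = 0; i < 4; i++) {
                        uint64_t x = patterns[i];
                        assert(((x & (mask | gap)) & ~gap) == (x & mask));
                }
                printf("PAGE_SHIFT %d: two-AND sequence matches PTE_ADDR_MASK\n",
                       shift);
        }
        return 0;
}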
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
2021-07-16 10:02 ` Steven Price
@ 2021-07-16 14:37 ` Anshuman Khandual
0 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-16 14:37 UTC (permalink / raw)
To: Steven Price, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
james.morse
On 7/16/21 3:32 PM, Steven Price wrote:
> On 16/07/2021 08:20, Anshuman Khandual wrote:
>>
>>
>> On 7/14/21 9:08 PM, Steven Price wrote:
>>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>>> FEAT_LPA2 requires different PTE representation formats for both the 4K
>>>> and 16K page size configs. This adds new FEAT_LPA2 specific PTE encodings
>>>> as per the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The
>>>> updated helpers are used when FEAT_LPA2 gets enabled via
>>>> CONFIG_ARM64_PA_BITS_52 on 4K and 16K page sizes. The TTBR encoding and
>>>> the phys_to_ttbr() helper remain the same for FEAT_LPA2 as for FEAT_LPA.
>>>> It also updates the 'phys_to_pte' helper to accept a temporary register
>>>> and changes the impacted call sites.
>>>>
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>> arch/arm64/include/asm/assembler.h | 23 +++++++++++++++++++----
>>>> arch/arm64/include/asm/pgtable-hwdef.h | 4 ++++
>>>> arch/arm64/include/asm/pgtable.h | 4 ++++
>>>> arch/arm64/kernel/head.S | 25 +++++++++++++------------
>>>> 4 files changed, 40 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>>>> index fedc202..0492543 100644
>>>> --- a/arch/arm64/include/asm/assembler.h
>>>> +++ b/arch/arm64/include/asm/assembler.h
>>>> @@ -606,7 +606,7 @@ alternative_endif
>>>> #endif
>>>> .endm
>>>>
>>>> - .macro phys_to_pte, pte, phys
>>>> + .macro phys_to_pte, pte, phys, tmp
>>>> #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>>>> /*
>>>> * We assume \phys is 64K aligned and this is guaranteed by only
>>>> @@ -614,6 +614,17 @@ alternative_endif
>>>> */
>>>> orr \pte, \phys, \phys, lsr #36
>>>> and \pte, \pte, #PTE_ADDR_MASK
>>>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>>>> + orr \pte, \phys, \phys, lsr #42
>>>> +
>>>> + /*
>>>> + * The 'tmp' is being used here to just prepare
>>>> + * and hold PTE_ADDR_MASK which cannot be passed
>>>> + * to the subsequent 'and' instruction.
>>>> + */
>>>> + mov \tmp, #PTE_ADDR_LOW
>>>> + orr \tmp, \tmp, #PTE_ADDR_HIGH
>>>> + and \pte, \pte, \tmp
>>> Rather than adding an extra temporary register (and the fallout of
>>> various other macros needing an extra register), this can be done with
>>> two AND instructions:
>>
>> I would really like to get rid of the 'tmp' register here as well, but I
>> could not figure out a way to accomplish it.
>>
>>>
>>> /* PTE_ADDR_MASK cannot be encoded as an immediate, so
>>> * mask off all but two bits, followed by masking the
>>> * extra two bits
>>> */
>>> and \pte, \pte, #PTE_ADDR_MASK | (3 << 10)
>>> and \pte, \pte, #~(3 << 10)
>>
>> I made the change as suggested:
>>
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -626,9 +626,8 @@ alternative_endif
>> * and hold PTE_ADDR_MASK which cannot be passed
>> * to the subsequent 'and' instruction.
>> */
>> - mov \tmp, #PTE_ADDR_LOW
>> - orr \tmp, \tmp, #PTE_ADDR_HIGH
>> - and \pte, \pte, \tmp
>> + and \pte, \pte, #PTE_ADDR_MASK | (0x3 << 10)
>> + and \pte, \pte, #~(0x3 << 10)
>>
>> .Lskip_lpa2\@:
>> mov \pte, \phys
>>
>>
>> but it still fails to build (tested with 16K pages):
>>
>> arch/arm64/kernel/head.S: Assembler messages:
>> arch/arm64/kernel/head.S:377: Error: immediate out of range at operand 3 -- `and x6,x6,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>>
>
> Ah, I'd only tested this for 4K; 16K requires a different set of masks.
>
> The bits we need to cover run from just below PAGE_SHIFT down to bit 10,
> just above the top of PTE_ADDR_HIGH. So we can compute the mask for both 4K
Okay.
> and 16K with GENMASK(PAGE_SHIFT - 1, 10):
>
> and \pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
> and \pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)
>
> This compiles (for both 4K and 16K) and the assembly looks correct, but
> I've not done any other testing.
Yeah, it works; I will make the change.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
2021-07-16 8:08 ` Suzuki K Poulose
@ 2021-07-19 4:47 ` Anshuman Khandual
0 siblings, 0 replies; 19+ messages in thread
From: Anshuman Khandual @ 2021-07-19 4:47 UTC (permalink / raw)
To: Suzuki K Poulose, linux-arm-kernel, linux-kernel, linux-mm
Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
steven.price
On 7/16/21 1:38 PM, Suzuki K Poulose wrote:
> On 16/07/2021 08:06, Anshuman Khandual wrote:
>>
>> On 7/14/21 1:51 PM, Suzuki K Poulose wrote:
>>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>>> Detect the FEAT_LPA2 implementation early enough during boot when
>>>> requested via CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
>>>> arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
>>>> enabling the 52-bit PA range, or to fall back to the default 48-bit PA
>>>> range if FEAT_LPA2 was requested but is not implemented.
>>>>
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>> arch/arm64/include/asm/memory.h | 1 +
>>>> arch/arm64/kernel/head.S | 15 +++++++++++++++
>>>> arch/arm64/mm/mmu.c | 3 +++
>>>> arch/arm64/mm/proc.S | 9 +++++++++
>>>> 4 files changed, 28 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>>>> index 824a365..d0ca002 100644
>>>> --- a/arch/arm64/include/asm/memory.h
>>>> +++ b/arch/arm64/include/asm/memory.h
>>>> @@ -178,6 +178,7 @@
>>>> #include <asm/bug.h>
>>>> extern u64 vabits_actual;
>>>> +extern u64 arm64_lpa2_enabled;
>>>> extern s64 memstart_addr;
>>>> /* PHYS_OFFSET - the physical address of the start of memory. */
>>>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>>>> index 6444147..9cf79ea 100644
>>>> --- a/arch/arm64/kernel/head.S
>>>> +++ b/arch/arm64/kernel/head.S
>>>> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
>>>> adrp x23, __PHYS_OFFSET
>>>> and x23, x23, MIN_KIMG_ALIGN - 1 // KASLR offset, defaults to 0
>>>> bl set_cpu_boot_mode_flag
>>>> +
>>>> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
>>>> + mrs x10, ID_AA64MMFR0_EL1
>>>> + ubfx x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
>>>> + cmp x10, #ID_AA64MMFR0_TGRAN_LPA2
>>>> + b.ne 1f
>>>
>>> For the sake of forward compatibility, this should be "b.lt"
>> Right, I guess we can assume that the feature will remain present for all
>> values from the current ID_AA64MMFR0_TGRAN_LPA2 onward. But should this
>> not also be capped at ID_AA64MMFR0_TGRAN_SUPPORTED_MAX, since the upper
>> limit differs between 4K and 16K page sizes?
>
> Absolutely.
The ID_AA64MMFR0_TGRAN_SUPPORTED_MAX check is not required there, as
__enable_mmu() already performs the necessary boundary check for the given
page size.
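For context, a hedged sketch of the kind of boundary check being referred
to: __enable_mmu() validates the TGRAN field against the supported minimum
and maximum for the configured page size before turning the MMU on, which
is why an extra cap in the LPA2 probe would be redundant. The shift and
bounds below are placeholders, not the real per-page-size values:

#include <stdbool.h>
#include <stdint.h>

/* Placeholder values; the real per-page-size bounds are in asm/sysreg.h. */
#define ID_AA64MMFR0_TGRAN_SHIFT           28
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN    1
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX    7

/* Mirrors the boundary check __enable_mmu() performs in assembly. */
static bool granule_supported(uint64_t mmfr0)
{
        uint64_t tgran = (mmfr0 >> ID_AA64MMFR0_TGRAN_SHIFT) & 0xf;

        return tgran >= ID_AA64MMFR0_TGRAN_SUPPORTED_MIN &&
               tgran <= ID_AA64MMFR0_TGRAN_SUPPORTED_MAX;
}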
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2021-07-19 4:46 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-14 2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
2021-07-14 2:21 ` [RFC 01/10] mm/mmap: Dynamically initialize protection_map[] Anshuman Khandual
2021-07-14 2:21 ` [RFC 02/10] arm64/mm: Consolidate TCR_EL1 fields Anshuman Khandual
2021-07-14 2:21 ` [RFC 03/10] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field Anshuman Khandual
2021-07-14 2:21 ` [RFC 04/10] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Anshuman Khandual
2021-07-14 2:21 ` [RFC 05/10] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2] Anshuman Khandual
2021-07-14 2:21 ` [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
2021-07-14 15:38 ` Steven Price
2021-07-16 7:20 ` Anshuman Khandual
2021-07-16 10:02 ` Steven Price
2021-07-16 14:37 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
2021-07-14 8:21 ` Suzuki K Poulose
2021-07-16 7:06 ` Anshuman Khandual
2021-07-16 8:08 ` Suzuki K Poulose
2021-07-19 4:47 ` Anshuman Khandual
2021-07-14 2:21 ` [RFC 08/10] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S Anshuman Khandual
2021-07-14 2:21 ` [RFC 09/10] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented Anshuman Khandual
2021-07-14 2:21 ` [RFC 10/10] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES Anshuman Khandual