LKML Archive on lore.kernel.org
* [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2
@ 2018-10-01 15:54 Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment Punit Agrawal
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose

This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage 2, a
feature that is useful on cores that support large TLB mappings
(e.g., 1GB for a 4K granule).
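
For reference, the "1GB for 4K granule" figure follows from the table
geometry. The sketch below is illustrative only and not part of the
series; block_size() is a hypothetical helper and it assumes 8-byte
descriptors, so each table level resolves (page_shift - 3) address bits.

```c
#include <stdint.h>

/*
 * Bytes covered by one block entry, assuming 8-byte descriptors.
 * levels_above_pte is 1 for a PMD block and 2 for a PUD block.
 * Hypothetical helper, for illustration only.
 */
static uint64_t block_size(unsigned int page_shift,
			   unsigned int levels_above_pte)
{
	unsigned int bits_per_level = page_shift - 3;

	return (uint64_t)1 << (page_shift + levels_above_pte * bits_per_level);
}
```

With a 4K granule (page_shift = 12) this gives 2MB at the PMD level and
1GB at the PUD level.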

The only change in this version is to update the kvm_stage2_has_pud()
helper for arm to use CONFIG_PGTABLE_LEVELS.

The patches are based on v6 of the dynamic IPA support.

The patches have been tested on an AMD Seattle system with the following
hugepage sizes: 64K, 32M, 1G.

Thanks,
Punit

v7 -> v8

* Add kvm_stage2_has_pud() helper on arm32
* Rebased to v6 of 52bit dynamic IPA support

v6 -> v7

* Restrict thp check to exclude hugetlbfs pages - Patch 1
* Don't update PUD entry if there's no change - Patch 9
* Add check for PUD level in stage 2 - Patch 9

v5 -> v6

* Split Patch 1 to move out the refactoring of exec permissions on
  page table entries.
* Patch 4 - Initialise p*dpp pointers in stage2_get_leaf_entry()
* Patch 5 - Trigger a BUG() in kvm_pud_pfn() on arm

v4 -> v5:
* Patch 1 - Drop helper stage2_should_exec() and refactor the
  condition to decide if a page table entry should be marked
  executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
  later patches
* Patch 7 - Use stage 2 accessors instead of using the page table
  helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
  of levels of stage 2 tables differs from stage 1

v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix incorrect hunk resulting from a rebase [4/4]

[0] https://www.spinics.net/lists/kvm-arm/msg32753.html
[1] https://lkml.org/lkml/2018/9/26/936

Punit Agrawal (9):
  KVM: arm/arm64: Ensure only THP is candidate for adjustment
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Support PUD hugepage in stage2_is_exec()
  KVM: arm64: Support handling access faults for PUD hugepages
  KVM: arm64: Update age handlers to support PUD hugepages
  KVM: arm64: Add support for creating PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h         |  61 +++++
 arch/arm/include/asm/stage2_pgtable.h  |   9 +
 arch/arm64/include/asm/kvm_mmu.h       |  48 ++++
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h       |   9 +
 virt/kvm/arm/mmu.c                     | 320 +++++++++++++++++++------
 6 files changed, 373 insertions(+), 78 deletions(-)

-- 
2.18.0



* [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-03 10:50   ` Marc Zyngier
  2018-10-31 14:36   ` Christoffer Dall
  2018-10-01 15:54 ` [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort() Punit Agrawal
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Christoffer Dall, stable

PageTransCompoundMap() returns true for both hugetlbfs and THP
hugepages. As a result, stage 2 faults for unsupported hugepage sizes
(e.g., a 64K hugepage with 4K pages) are incorrectly treated as THP
faults.

Tighten the check to filter out hugetlbfs pages. This also leads to
consistently mapping all unsupported hugepage sizes as PTE level
entries at stage 2.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org # v4.13+
---
 virt/kvm/arm/mmu.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 7e477b3cae5b..c23a1b323aad 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
 {
 	kvm_pfn_t pfn = *pfnp;
 	gfn_t gfn = *ipap >> PAGE_SHIFT;
+	struct page *page = pfn_to_page(pfn);
 
-	if (PageTransCompoundMap(pfn_to_page(pfn))) {
+	/*
+	 * PageTransCompoundMap() returns true for THP and
+	 * hugetlbfs. Make sure the adjustment is done only for THP
+	 * pages.
+	 */
+	if (!PageHuge(page) && PageTransCompoundMap(page)) {
 		unsigned long mask;
 		/*
 		 * The address we faulted on is backed by a transparent huge
-- 
2.18.0
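
A user-space model of the tightened check in patch 1/9 may help when
reviewing it. This is only a sketch: struct model_page and both
functions are hypothetical stand-ins, with the two flags representing
what PageTransCompoundMap() and PageHuge() would report for the page.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; fields are illustrative only. */
struct model_page {
	bool compound_mapped;	/* result PageTransCompoundMap() would give */
	bool hugetlbfs;		/* result PageHuge() would give */
};

/* Check before the patch: hugetlbfs pages also pass. */
static bool old_is_thp_candidate(const struct model_page *p)
{
	return p->compound_mapped;
}

/* Check after the patch: hugetlbfs pages are filtered out first. */
static bool new_is_thp_candidate(const struct model_page *p)
{
	return !p->hugetlbfs && p->compound_mapped;
}
```

The two checks differ only for a page that is both compound-mapped and
hugetlbfs-backed, which is the case the commit message describes.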



* [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort()
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 17:44   ` Suzuki K Poulose
  2018-10-03 15:20   ` Marc Zyngier
  2018-10-01 15:54 ` [PATCH v8 3/9] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault Punit Agrawal
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Christoffer Dall

The code for operations such as marking the pfn as dirty and
performing dcache/icache maintenance during stage 2 fault handling is
duplicated between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index c23a1b323aad..5b76ee204000 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1490,7 +1490,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	kvm_pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool logging_active = memslot_is_logging(memslot);
-	unsigned long flags = 0;
+	unsigned long vma_pagesize, flags = 0;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1510,10 +1510,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+	vma_pagesize = vma_kernel_pagesize(vma);
+	if (vma_pagesize == PMD_SIZE && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
+		/*
+		 * Fallback to PTE if it's not one of the Stage 2
+		 * supported hugepage sizes
+		 */
+		vma_pagesize = PAGE_SIZE;
+
 		/*
 		 * Pages belonging to memslots that don't have the same
 		 * alignment for userspace and IPA cannot be mapped using
@@ -1579,23 +1586,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
 
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte) {
+		/*
+		 * Only PMD_SIZE transparent hugepages(THP) are
+		 * currently supported. This code will need to be
+		 * updated to support other THP sizes.
+		 */
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+		if (hugetlb)
+			vma_pagesize = PMD_SIZE;
+	}
+
+	if (writable)
+		kvm_set_pfn_dirty(pfn);
 
-	if (hugetlb) {
+	if (fault_status != FSC_PERM)
+		clean_dcache_guest_page(pfn, vma_pagesize);
+
+	if (exec_fault)
+		invalidate_icache_guest_page(pfn, vma_pagesize);
+
+	if (hugetlb && vma_pagesize == PMD_SIZE) {
 		pmd_t new_pmd = pfn_pmd(pfn, mem_type);
 		new_pmd = pmd_mkhuge(new_pmd);
-		if (writable) {
+		if (writable)
 			new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-			kvm_set_pfn_dirty(pfn);
-		}
-
-		if (fault_status != FSC_PERM)
-			clean_dcache_guest_page(pfn, PMD_SIZE);
 
 		if (exec_fault) {
 			new_pmd = kvm_s2pmd_mkexec(new_pmd);
-			invalidate_icache_guest_page(pfn, PMD_SIZE);
 		} else if (fault_status == FSC_PERM) {
 			/* Preserve execute if XN was already cleared */
 			if (stage2_is_exec(kvm, fault_ipa))
@@ -1608,16 +1626,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 
 		if (writable) {
 			new_pte = kvm_s2pte_mkwrite(new_pte);
-			kvm_set_pfn_dirty(pfn);
 			mark_page_dirty(kvm, gfn);
 		}
 
-		if (fault_status != FSC_PERM)
-			clean_dcache_guest_page(pfn, PAGE_SIZE);
-
 		if (exec_fault) {
 			new_pte = kvm_s2pte_mkexec(new_pte);
-			invalidate_icache_guest_page(pfn, PAGE_SIZE);
 		} else if (fault_status == FSC_PERM) {
 			/* Preserve execute if XN was already cleared */
 			if (stage2_is_exec(kvm, fault_ipa))
-- 
2.18.0
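
The way vma_pagesize ends up selecting the mapping size in patch 2/9
can be modelled as below. This is a simplified sketch, not the kernel
code: stage2_map_size() and the MODEL_* sizes (4K pages, 2MB PMD) are
assumptions made for illustration, and force_pte and thp_backed are
passed in rather than derived from the memslot and the pfn.

```c
#include <stdbool.h>

#define MODEL_PAGE_SIZE	(1UL << 12)	/* assumed 4K granule */
#define MODEL_PMD_SIZE	(1UL << 21)	/* assumed 2MB PMD block */

/*
 * Returns the size used for cache maintenance and for the stage 2
 * entry. thp_backed models transparent_hugepage_adjust() succeeding.
 */
static unsigned long stage2_map_size(unsigned long vma_pagesize,
				     bool logging_active, bool force_pte,
				     bool thp_backed)
{
	bool hugetlb = false;

	if (vma_pagesize == MODEL_PMD_SIZE && !logging_active)
		hugetlb = true;
	else
		vma_pagesize = MODEL_PAGE_SIZE;	/* fall back to PTE */

	if (!hugetlb && !force_pte && thp_backed)
		vma_pagesize = MODEL_PMD_SIZE;

	return vma_pagesize;
}
```

An unsupported hugetlbfs size such as 64K falls through to the PTE case
unless the backing page turns out to be a THP.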



* [PATCH v8 3/9] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort() Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 4/9] KVM: arm/arm64: Introduce helpers to manipulate page table entries Punit Agrawal
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Christoffer Dall

The stage 2 fault handler marks a page as executable if it is handling
an execution fault, or if it is handling a permission fault on an entry
that is already executable, in which case the executable bit needs to
be preserved.

The logic to decide if the page should be marked executable is
duplicated for PMD and PTE entries. To avoid creating another copy
when support for PUD hugepages is introduced, refactor the code to
share the checks needed to mark a page table entry as executable.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5b76ee204000..ec64d21c6571 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1481,7 +1481,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  unsigned long fault_status)
 {
 	int ret;
-	bool write_fault, exec_fault, writable, hugetlb = false, force_pte = false;
+	bool write_fault, writable, hugetlb = false, force_pte = false;
+	bool exec_fault, needs_exec;
 	unsigned long mmu_seq;
 	gfn_t gfn = fault_ipa >> PAGE_SHIFT;
 	struct kvm *kvm = vcpu->kvm;
@@ -1606,19 +1607,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (exec_fault)
 		invalidate_icache_guest_page(pfn, vma_pagesize);
 
+	/*
+	 * If we took an execution fault we have made the
+	 * icache/dcache coherent above and should now let the s2
+	 * mapping be executable.
+	 *
+	 * Write faults (!exec_fault && FSC_PERM) are orthogonal to
+	 * execute permissions, and we preserve whatever we have.
+	 */
+	needs_exec = exec_fault ||
+		(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+
 	if (hugetlb && vma_pagesize == PMD_SIZE) {
 		pmd_t new_pmd = pfn_pmd(pfn, mem_type);
 		new_pmd = pmd_mkhuge(new_pmd);
 		if (writable)
 			new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
-		if (exec_fault) {
+		if (needs_exec)
 			new_pmd = kvm_s2pmd_mkexec(new_pmd);
-		} else if (fault_status == FSC_PERM) {
-			/* Preserve execute if XN was already cleared */
-			if (stage2_is_exec(kvm, fault_ipa))
-				new_pmd = kvm_s2pmd_mkexec(new_pmd);
-		}
 
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
@@ -1629,13 +1636,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			mark_page_dirty(kvm, gfn);
 		}
 
-		if (exec_fault) {
+		if (needs_exec)
 			new_pte = kvm_s2pte_mkexec(new_pte);
-		} else if (fault_status == FSC_PERM) {
-			/* Preserve execute if XN was already cleared */
-			if (stage2_is_exec(kvm, fault_ipa))
-				new_pte = kvm_s2pte_mkexec(new_pte);
-		}
 
 		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
-- 
2.18.0
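
The needs_exec condition from patch 3/9 can be written as a small pure
function for reasoning about the cases. This is a sketch: enum
model_fault and the function signature are hypothetical, and
already_exec stands in for the result of stage2_is_exec().

```c
#include <stdbool.h>

/* Hypothetical fault classes; only the permission case matters here. */
enum model_fault { MODEL_FSC_FAULT, MODEL_FSC_ACCESS, MODEL_FSC_PERM };

/*
 * An execution fault always makes the mapping executable. A permission
 * fault that is not an execution fault keeps whatever the existing
 * entry had.
 */
static bool needs_exec(bool exec_fault, enum model_fault status,
		       bool already_exec)
{
	return exec_fault || (status == MODEL_FSC_PERM && already_exec);
}
```

Note that a translation fault on a data access never yields an
executable mapping, regardless of already_exec.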



* [PATCH v8 4/9] KVM: arm/arm64: Introduce helpers to manipulate page table entries
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
                   ` (2 preceding siblings ...)
  2018-10-01 15:54 ` [PATCH v8 3/9] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 5/9] KVM: arm64: Support dirty page tracking for PUD hugepages Punit Agrawal
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Russell King, Catalin Marinas

Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |  5 +++++
 arch/arm64/include/asm/kvm_mmu.h |  5 +++++
 virt/kvm/arm/mmu.c               | 14 ++++++++------
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 5ad1a54f98dc..e77212e53e77 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -82,6 +82,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pud(pmdp)	__pud(__pa(pmdp) | PMD_TYPE_TABLE)
 #define kvm_mk_pgd(pudp)	({ BUILD_BUG(); 0; })
 
+#define kvm_pfn_pte(pfn, prot)	pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot)	pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)	pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 77b1af9e64db..baabea0cbb66 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,6 +184,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)					\
 	__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_pfn_pte(pfn, prot)		pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot)		pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)		pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ec64d21c6571..21079eb5bc15 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -607,7 +607,7 @@ static void create_hyp_pte_mappings(pmd_t *pmd, unsigned long start,
 	addr = start;
 	do {
 		pte = pte_offset_kernel(pmd, addr);
-		kvm_set_pte(pte, pfn_pte(pfn, prot));
+		kvm_set_pte(pte, kvm_pfn_pte(pfn, prot));
 		get_page(virt_to_page(pte));
 		pfn++;
 	} while (addr += PAGE_SIZE, addr != end);
@@ -1202,7 +1202,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 	pfn = __phys_to_pfn(pa);
 
 	for (addr = guest_ipa; addr < end; addr += PAGE_SIZE) {
-		pte_t pte = pfn_pte(pfn, PAGE_S2_DEVICE);
+		pte_t pte = kvm_pfn_pte(pfn, PAGE_S2_DEVICE);
 
 		if (writable)
 			pte = kvm_s2pte_mkwrite(pte);
@@ -1619,8 +1619,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
 
 	if (hugetlb && vma_pagesize == PMD_SIZE) {
-		pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-		new_pmd = pmd_mkhuge(new_pmd);
+		pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+		new_pmd = kvm_pmd_mkhuge(new_pmd);
+
 		if (writable)
 			new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1629,7 +1631,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
-		pte_t new_pte = pfn_pte(pfn, mem_type);
+		pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
 		if (writable) {
 			new_pte = kvm_s2pte_mkwrite(new_pte);
@@ -1886,7 +1888,7 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
 	 * just like a translation fault and clean the cache to the PoC.
 	 */
 	clean_dcache_guest_page(pfn, PAGE_SIZE);
-	stage2_pte = pfn_pte(pfn, PAGE_S2);
+	stage2_pte = kvm_pfn_pte(pfn, PAGE_S2);
 	handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &stage2_pte);
 }
 
-- 
2.18.0



* [PATCH v8 5/9] KVM: arm64: Support dirty page tracking for PUD hugepages
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
                   ` (3 preceding siblings ...)
  2018-10-01 15:54 ` [PATCH v8 4/9] KVM: arm/arm64: Introduce helpers to manipulate page table entries Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 6/9] KVM: arm64: Support PUD hugepage in stage2_is_exec() Punit Agrawal
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Russell King, Catalin Marinas

In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.

Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   | 15 +++++++++++++++
 arch/arm64/include/asm/kvm_mmu.h | 10 ++++++++++
 virt/kvm/arm/mmu.c               | 11 +++++++----
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index e77212e53e77..9ec09f4cc284 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -87,6 +87,21 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pmd_mkhuge(pmd)	pmd_mkhuge(pmd)
 
+/*
+ * The following kvm_*pud*() functions are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+	BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	BUG();
+	return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index baabea0cbb66..3cc342177474 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -251,6 +251,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
 	return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+	kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+	return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 21079eb5bc15..9c48f2ca6583 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1347,9 +1347,12 @@ static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
 	do {
 		next = stage2_pud_addr_end(kvm, addr, end);
 		if (!stage2_pud_none(kvm, *pud)) {
-			/* TODO:PUD not supported, revisit later if supported */
-			BUG_ON(stage2_pud_huge(kvm, *pud));
-			stage2_wp_pmds(kvm, pud, addr, next);
+			if (stage2_pud_huge(kvm, *pud)) {
+				if (!kvm_s2pud_readonly(pud))
+					kvm_set_s2pud_readonly(pud);
+			} else {
+				stage2_wp_pmds(kvm, pud, addr, next);
+			}
 		}
 	} while (pud++, addr = next, addr != end);
 }
@@ -1392,7 +1395,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
  *
  * Called to start logging dirty pages after memory region
  * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
  * Afterwards read of dirty page log can be called.
  *
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
-- 
2.18.0
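
The write-protect step added to stage2_wp_puds() in patch 5/9 can be
modelled on a raw 64-bit descriptor. This is a sketch under stated
assumptions: the MODEL_* names and functions are hypothetical, the
read/write encoding matches the PMD_S2_RDWR definition quoted in this
series, and the read-only encoding (1 << 6) is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_S2_RDONLY	((uint64_t)1 << 6)	/* assumed HAP[2:1] read-only */
#define MODEL_S2_RDWR	((uint64_t)3 << 6)	/* HAP[2:1], as PMD_S2_RDWR */

static bool s2_readonly(uint64_t desc)
{
	return (desc & MODEL_S2_RDWR) == MODEL_S2_RDONLY;
}

static uint64_t s2_mk_readonly(uint64_t desc)
{
	return (desc & ~MODEL_S2_RDWR) | MODEL_S2_RDONLY;
}

/*
 * Mirrors the new branch: only rewrite the entry if it is not already
 * read-only. Returns true if the descriptor was changed.
 */
static bool wp_block(uint64_t *desc)
{
	if (s2_readonly(*desc))
		return false;

	*desc = s2_mk_readonly(*desc);
	return true;
}
```

Skipping entries that are already read-only avoids a redundant store,
which is the same reason the patch checks kvm_s2pud_readonly() first.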



* [PATCH v8 6/9] KVM: arm64: Support PUD hugepage in stage2_is_exec()
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
                   ` (4 preceding siblings ...)
  2018-10-01 15:54 ` [PATCH v8 5/9] KVM: arm64: Support dirty page tracking for PUD hugepages Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 7/9] KVM: arm64: Support handling access faults for PUD hugepages Punit Agrawal
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Christoffer Dall, Russell King,
	Catalin Marinas

In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries are used to perform
i-cache invalidation on first execute.

Provide trivial implementations of arm32 helpers to allow sharing of
code.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h         |  6 +++
 arch/arm64/include/asm/kvm_mmu.h       |  5 +++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 virt/kvm/arm/mmu.c                     | 53 +++++++++++++++++++++++---
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 9ec09f4cc284..26a2ab05b3f6 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -102,6 +102,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
 	return false;
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+	BUG();
+	return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 3cc342177474..c06ef3be8ca9 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
 	return kvm_s2pte_readonly((pte_t *)pudp);
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+	return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..10ae592b78b8 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN		(_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_XN		(_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 9c48f2ca6583..5fd1eae7d964 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1083,23 +1083,66 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 	return 0;
 }
 
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+/*
+ * stage2_get_leaf_entry - walk the stage2 VM page tables and return
+ * true if a valid and present leaf-entry is found. A pointer to the
+ * leaf-entry is returned in the appropriate level variable - pudpp,
+ * pmdpp, ptepp.
+ */
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
+				  pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
 {
+	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
 
-	pmdp = stage2_get_pmd(kvm, NULL, addr);
+	*pudpp = NULL;
+	*pmdpp = NULL;
+	*ptepp = NULL;
+
+	pudp = stage2_get_pud(kvm, NULL, addr);
+	if (!pudp || stage2_pud_none(kvm, *pudp) || !stage2_pud_present(kvm, *pudp))
+		return false;
+
+	if (stage2_pud_huge(kvm, *pudp)) {
+		*pudpp = pudp;
+		return true;
+	}
+
+	pmdp = stage2_pmd_offset(kvm, pudp, addr);
 	if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
 		return false;
 
-	if (pmd_thp_or_huge(*pmdp))
-		return kvm_s2pmd_exec(pmdp);
+	if (pmd_thp_or_huge(*pmdp)) {
+		*pmdpp = pmdp;
+		return true;
+	}
 
 	ptep = pte_offset_kernel(pmdp, addr);
 	if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
 		return false;
 
-	return kvm_s2pte_exec(ptep);
+	*ptepp = ptep;
+	return true;
+}
+
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+	pud_t *pudp;
+	pmd_t *pmdp;
+	pte_t *ptep;
+	bool found;
+
+	found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
+	if (!found)
+		return false;
+
+	if (pudp)
+		return kvm_s2pud_exec(pudp);
+	else if (pmdp)
+		return kvm_s2pmd_exec(pmdp);
+	else
+		return kvm_s2pte_exec(ptep);
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-- 
2.18.0
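
The calling convention of stage2_get_leaf_entry() in patch 6/9 (at most
one of the three out-pointers is set) and the XN test can be modelled
as below. This is a sketch: struct s2_leaf and leaf_is_exec() are
hypothetical, descriptors are plain 64-bit values, and the same XN
encoding is assumed at every level for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_S2_XN	((uint64_t)2 << 53)	/* XN[1:0], as PUD_S2_XN */

/* Hypothetical walk result: at most one pointer is non-NULL. */
struct s2_leaf {
	const uint64_t *pud;
	const uint64_t *pmd;
	const uint64_t *pte;
};

/* Executable means the XN bit is clear on whichever leaf was found. */
static bool leaf_is_exec(const struct s2_leaf *l)
{
	const uint64_t *desc = l->pud ? l->pud : l->pmd ? l->pmd : l->pte;

	if (!desc)
		return false;	/* no valid leaf entry */

	return !(*desc & MODEL_S2_XN);
}
```

The order of the checks matches stage2_is_exec(): PUD first, then PMD,
then PTE.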



* [PATCH v8 7/9] KVM: arm64: Support handling access faults for PUD hugepages
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
                   ` (5 preceding siblings ...)
  2018-10-01 15:54 ` [PATCH v8 6/9] KVM: arm64: Support PUD hugepage in stage2_is_exec() Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 8/9] KVM: arm64: Update age handlers to support " Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2 Punit Agrawal
  8 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Christoffer Dall, Russell King,
	Catalin Marinas

In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.

Provide trivial helpers for arm32 to allow sharing of code.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |  9 +++++++++
 arch/arm64/include/asm/kvm_mmu.h |  7 +++++++
 arch/arm64/include/asm/pgtable.h |  6 ++++++
 virt/kvm/arm/mmu.c               | 22 +++++++++++-----------
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 26a2ab05b3f6..95b34aad0dc8 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -85,6 +85,9 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot)	pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot)	pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)	({ BUG(); 0; })
+
+
 #define kvm_pmd_mkhuge(pmd)	pmd_mkhuge(pmd)
 
 /*
@@ -108,6 +111,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
 	return false;
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+	BUG();
+	return pud;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c06ef3be8ca9..b93e5167728f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -187,6 +187,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot)		pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot)		pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)		pud_pfn(pud)
+
 #define kvm_pmd_mkhuge(pmd)		pmd_mkhuge(pmd)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -266,6 +268,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
 	return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+	return pud_mkyoung(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..a64a5c35beb1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
 	return __pte(pud_val(pud));
 }
 
+static inline pud_t pte_pud(pte_t pte)
+{
+	return __pud(pte_val(pte));
+}
+
 static inline pmd_t pud_pmd(pud_t pud)
 {
 	return __pmd(pud_val(pud));
@@ -380,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)	__pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)	pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_mkyoung(pud)	pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud)		pte_write(pud_pte(pud))
 
 #define __pud_to_phys(pud)	__pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5fd1eae7d964..1401dc015a22 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1706,6 +1706,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
  */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
+	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 	kvm_pfn_t pfn;
@@ -1715,24 +1716,23 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 
 	spin_lock(&vcpu->kvm->mmu_lock);
 
-	pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
-	if (!pmd || pmd_none(*pmd))	/* Nothing there */
+	if (!stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte))
 		goto out;
 
-	if (pmd_thp_or_huge(*pmd)) {	/* THP, HugeTLB */
+	if (pud) {		/* HugeTLB */
+		*pud = kvm_s2pud_mkyoung(*pud);
+		pfn = kvm_pud_pfn(*pud);
+		pfn_valid = true;
+	} else if (pmd) {	/* THP, HugeTLB */
 		*pmd = pmd_mkyoung(*pmd);
 		pfn = pmd_pfn(*pmd);
 		pfn_valid = true;
-		goto out;
+	} else {
+		*pte = pte_mkyoung(*pte);	/* Just a page... */
+		pfn = pte_pfn(*pte);
+		pfn_valid = true;
 	}
 
-	pte = pte_offset_kernel(pmd, fault_ipa);
-	if (pte_none(*pte))		/* Nothing there either */
-		goto out;
-
-	*pte = pte_mkyoung(*pte);	/* Just a page... */
-	pfn = pte_pfn(*pte);
-	pfn_valid = true;
 out:
 	spin_unlock(&vcpu->kvm->mmu_lock);
 	if (pfn_valid)
-- 
2.18.0


^ permalink raw reply	[flat|nested] 21+ messages in thread
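
The three-way branch in the handle_access_fault() hunk above depends on the leaf lookup reporting its result through exactly one of three out-pointers. The following is a minimal user-space sketch of that dispatch pattern, not kernel code. The names desc_t, struct leaf and mark_young_and_get_pfn() are hypothetical. The access-flag position (bit 10), the 4K granule and the address mask are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pud_t/pmd_t/pte_t: one 64-bit descriptor. */
typedef uint64_t desc_t;

#define DESC_AF		(UINT64_C(1) << 10)	/* access flag, assumed position */
#define DESC_PFN_SHIFT	12			/* 4K granule assumed */
#define DESC_ADDR_MASK	UINT64_C(0x0000fffffffff000)	/* assumed bits 47:12 */

/*
 * At most one pointer is non-NULL, mirroring how the patch uses the
 * pud/pmd/pte values filled in by stage2_get_leaf_entry().
 */
struct leaf {
	desc_t *pud;
	desc_t *pmd;
	desc_t *pte;
};

/*
 * Mark whichever leaf was found as young and report its pfn.
 * Returns false when no leaf is present.
 */
static bool mark_young_and_get_pfn(struct leaf l, uint64_t *pfn)
{
	desc_t *d = l.pud ? l.pud : l.pmd ? l.pmd : l.pte;

	if (!d)
		return false;

	/* Exactly one level may claim the leaf. */
	assert(!!l.pud + !!l.pmd + !!l.pte == 1);

	*d |= DESC_AF;
	*pfn = (*d & DESC_ADDR_MASK) >> DESC_PFN_SHIFT;
	return true;
}
```

Calling mark_young_and_get_pfn() with only the pmd member set models the THP/HugeTLB branch. The real helpers differ per level (for example, the patch's pud_pfn() masks with PUD_MASK), which this sketch deliberately ignores.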

* [PATCH v8 8/9] KVM: arm64: Update age handlers to support PUD hugepages
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
                   ` (6 preceding siblings ...)
  2018-10-01 15:54 ` [PATCH v8 7/9] KVM: arm64: Support handling access faults for PUD hugepages Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 15:54 ` [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2 Punit Agrawal
  8 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Christoffer Dall, Russell King,
	Catalin Marinas

In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.

Provide trivial helpers for arm32 to allow sharing code.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |  6 +++++
 arch/arm64/include/asm/kvm_mmu.h |  5 ++++
 arch/arm64/include/asm/pgtable.h |  1 +
 virt/kvm/arm/mmu.c               | 39 ++++++++++++++++----------------
 4 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 95b34aad0dc8..a42b9505c9a7 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -117,6 +117,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
 	return pud;
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+	BUG();
+	return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index b93e5167728f..3baf72705dcc 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -273,6 +273,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
 	return pud_mkyoung(pud);
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+	return pud_young(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a64a5c35beb1..4d9476e420d9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)	__pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)	pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_young(pud)		pte_young(pud_pte(pud))
 #define pud_mkyoung(pud)	pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud)		pte_write(pud_pte(pud))
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1401dc015a22..1cf84507bbd6 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1225,6 +1225,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
 	return stage2_ptep_test_and_clear_young((pte_t *)pmd);
 }
 
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+	return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
 /**
  * kvm_phys_addr_ioremap - map a device range to guest IPA
  *
@@ -1940,42 +1945,38 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
 
 static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data)
 {
+	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 
-	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-	pmd = stage2_get_pmd(kvm, NULL, gpa);
-	if (!pmd || pmd_none(*pmd))	/* Nothing there */
+	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+	if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
 		return 0;
 
-	if (pmd_thp_or_huge(*pmd))	/* THP, HugeTLB */
+	if (pud)
+		return stage2_pudp_test_and_clear_young(pud);
+	else if (pmd)
 		return stage2_pmdp_test_and_clear_young(pmd);
-
-	pte = pte_offset_kernel(pmd, gpa);
-	if (pte_none(*pte))
-		return 0;
-
-	return stage2_ptep_test_and_clear_young(pte);
+	else
+		return stage2_ptep_test_and_clear_young(pte);
 }
 
 static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data)
 {
+	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 
-	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-	pmd = stage2_get_pmd(kvm, NULL, gpa);
-	if (!pmd || pmd_none(*pmd))	/* Nothing there */
+	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+	if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
 		return 0;
 
-	if (pmd_thp_or_huge(*pmd))		/* THP, HugeTLB */
+	if (pud)
+		return kvm_s2pud_young(*pud);
+	else if (pmd)
 		return pmd_young(*pmd);
-
-	pte = pte_offset_kernel(pmd, gpa);
-	if (!pte_none(*pte))		/* Just a page... */
+	else
 		return pte_young(*pte);
-
-	return 0;
 }
 
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
-- 
2.18.0


^ permalink raw reply	[flat|nested] 21+ messages in thread
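
The age handlers in the patch above apply one operation at whichever level holds the leaf: test the access flag and, for kvm_age_hva_handler(), clear it. The sketch below models this in user space with C11 atomics. It is not the kernel's stage2_ptep_test_and_clear_young(). The names desc_test_and_clear_young(), desc_young() and DESC_AF are hypothetical, and bit 10 for the access flag is an assumption. Because the flag sits in the same position at every level in this model, one helper serves PUD, PMD and PTE entries alike, which is the property the patch's casts to pte_t * depend on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_AF	(UINT64_C(1) << 10)	/* access flag, assumed position */

/*
 * Return whether the entry was young and clear the flag in one step.
 * atomic_fetch_and() returns the value held before the AND.
 */
static bool desc_test_and_clear_young(_Atomic uint64_t *desc)
{
	uint64_t old;

	assert(desc != NULL);
	old = atomic_fetch_and(desc, ~DESC_AF);
	return (old & DESC_AF) != 0;
}

/* Read-only variant, corresponding to the test-age handler. */
static bool desc_young(_Atomic uint64_t *desc)
{
	assert(desc != NULL);
	return (atomic_load(desc) & DESC_AF) != 0;
}
```

A second call to desc_test_and_clear_young() on the same entry returns false until something sets the flag again, which is how repeated aging passes distinguish recently used mappings from idle ones.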

* [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2
  2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
                   ` (7 preceding siblings ...)
  2018-10-01 15:54 ` [PATCH v8 8/9] KVM: arm64: Update age handlers to support " Punit Agrawal
@ 2018-10-01 15:54 ` Punit Agrawal
  2018-10-01 21:30   ` Suzuki K Poulose
  8 siblings, 1 reply; 21+ messages in thread
From: Punit Agrawal @ 2018-10-01 15:54 UTC (permalink / raw)
  To: kvmarm
  Cc: Punit Agrawal, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, Christoffer Dall, Russell King,
	Catalin Marinas

KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h         |  20 +++++
 arch/arm/include/asm/stage2_pgtable.h  |   9 +++
 arch/arm64/include/asm/kvm_mmu.h       |  16 ++++
 arch/arm64/include/asm/pgtable-hwdef.h |   2 +
 arch/arm64/include/asm/pgtable.h       |   2 +
 virt/kvm/arm/mmu.c                     | 106 +++++++++++++++++++++++--
 6 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a42b9505c9a7..da5f078ae68c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -84,11 +84,14 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot)	pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot)	pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot)	(__pud(0))
 
 #define kvm_pud_pfn(pud)	({ BUG(); 0; })
 
 
 #define kvm_pmd_mkhuge(pmd)	pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)	({ BUG(); pud; })
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -105,6 +108,23 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
 	return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+	BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+	BUG();
+	return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+	BUG();
+	return pud;
+}
+
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
 	BUG();
diff --git a/arch/arm/include/asm/stage2_pgtable.h b/arch/arm/include/asm/stage2_pgtable.h
index f6a7ea805232..a4ec25360e50 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -68,4 +68,13 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
 #define stage2_pmd_table_empty(kvm, pmdp)	kvm_page_empty(pmdp)
 #define stage2_pud_table_empty(kvm, pudp)	false
 
+static inline bool kvm_stage2_has_pud(struct kvm *kvm)
+{
+#if CONFIG_PGTABLE_LEVELS > 3
+	return true;
+#else
+	return false;
+#endif
+}
+
 #endif	/* __ARM_S2_PGTABLE_H_ */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 3baf72705dcc..b4e9c2cceecb 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,12 +184,16 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)					\
 	__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_set_pud(pudp, pud)		set_pud(pudp, pud)
+
 #define kvm_pfn_pte(pfn, prot)		pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot)		pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot)		pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)		pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)		pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)		pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -203,6 +207,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
 	return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+	pud_val(pud) |= PUD_S2_RDWR;
+	return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
 	pte_val(pte) &= ~PTE_S2_XN;
@@ -215,6 +225,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
 	return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+	pud_val(pud) &= ~PUD_S2_XN;
+	return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
 	pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN		(_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN		(_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pud_mkyoung(pud)	pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud)		pte_write(pud_pte(pud))
 
+#define pud_mkhuge(pud)		(__pud(pud_val(pud) & ~PUD_TABLE_BIT))
+
 #define __pud_to_phys(pud)	__pte_to_phys(pud_pte(pud))
 #define __phys_to_pud_val(phys)	__phys_to_pte_val(phys)
 #define pud_pfn(pud)		((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1cf84507bbd6..5b72ddf25efc 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -115,6 +115,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
 	put_page(virt_to_page(pmd));
 }
 
+/**
+ * stage2_dissolve_pud() - clear and flush huge PUD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr:	IPA
+ * @pudp:	pud pointer for IPA
+ *
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+static void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr, pud_t *pudp)
+{
+	if (!stage2_pud_huge(kvm, *pudp))
+		return;
+
+	stage2_pud_clear(kvm, pudp);
+	kvm_tlb_flush_vmid_ipa(kvm, addr);
+	put_page(virt_to_page(pudp));
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -1022,7 +1041,7 @@ static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
 	pmd_t *pmd;
 
 	pud = stage2_get_pud(kvm, cache, addr);
-	if (!pud)
+	if (!pud || stage2_pud_huge(kvm, *pud))
 		return NULL;
 
 	if (stage2_pud_none(kvm, *pud)) {
@@ -1083,6 +1102,36 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 	return 0;
 }
 
+static int stage2_set_pud_huge(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+			       phys_addr_t addr, const pud_t *new_pudp)
+{
+	pud_t *pudp, old_pud;
+
+	pudp = stage2_get_pud(kvm, cache, addr);
+	VM_BUG_ON(!pudp);
+
+	old_pud = *pudp;
+
+	/*
+	 * A large number of vcpus faulting on the same stage 2 entry,
+	 * can lead to a refault due to the
+	 * stage2_pud_clear()/tlb_flush(). Skip updating the page
+	 * tables if there is no change.
+	 */
+	if (pud_val(old_pud) == pud_val(*new_pudp))
+		return 0;
+
+	if (stage2_pud_present(kvm, old_pud)) {
+		stage2_pud_clear(kvm, pudp);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+	} else {
+		get_page(virt_to_page(pudp));
+	}
+
+	kvm_set_pud(pudp, *new_pudp);
+	return 0;
+}
+
 /*
  * stage2_get_leaf_entry - walk the stage2 VM page tables and return
  * true if a valid and present leaf-entry is found. A pointer to the
@@ -1149,6 +1198,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 			  phys_addr_t addr, const pte_t *new_pte,
 			  unsigned long flags)
 {
+	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
 	bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
@@ -1157,7 +1207,31 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 	VM_BUG_ON(logging_active && !cache);
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
-	pmd = stage2_get_pmd(kvm, cache, addr);
+	pud = stage2_get_pud(kvm, cache, addr);
+	if (!pud) {
+		/*
+		 * Ignore calls from kvm_set_spte_hva for unallocated
+		 * address ranges.
+		 */
+		return 0;
+	}
+
+	/*
+	 * While dirty page logging - dissolve huge PUD, then continue
+	 * on to allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pud(kvm, addr, pud);
+
+	if (stage2_pud_none(kvm, *pud)) {
+		if (!cache)
+			return 0; /* ignore calls from kvm_set_spte_hva */
+		pmd = mmu_memory_cache_alloc(cache);
+		stage2_pud_populate(kvm, pud, pmd);
+		get_page(virt_to_page(pud));
+	}
+
+	pmd = stage2_pmd_offset(kvm, pud, addr);
 	if (!pmd) {
 		/*
 		 * Ignore calls from kvm_set_spte_hva for unallocated
@@ -1563,13 +1637,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	}
 
 	vma_pagesize = vma_kernel_pagesize(vma);
-	if (vma_pagesize == PMD_SIZE && !logging_active) {
+	/*
+	 * PUD level may not exist for a VM but PMD is guaranteed to
+	 * exist.
+	 */
+	if ((vma_pagesize == PMD_SIZE ||
+	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pud(kvm))) &&
+	    !logging_active) {
+		struct hstate *h = hstate_vma(vma);
+
 		hugetlb = true;
-		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
+		gfn = (fault_ipa & huge_page_mask(h)) >> PAGE_SHIFT;
 	} else {
 		/*
 		 * Fallback to PTE if it's not one of the Stage 2
-		 * supported hugepage sizes
+		 * supported hugepage sizes or the corresponding level
+		 * doesn't exist
 		 */
 		vma_pagesize = PAGE_SIZE;
 
@@ -1669,7 +1752,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	needs_exec = exec_fault ||
 		(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
 
-	if (hugetlb && vma_pagesize == PMD_SIZE) {
+	if (hugetlb && vma_pagesize == PUD_SIZE) {
+		pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
+
+		new_pud = kvm_pud_mkhuge(new_pud);
+		if (writable)
+			new_pud = kvm_s2pud_mkwrite(new_pud);
+
+		if (needs_exec)
+			new_pud = kvm_s2pud_mkexec(new_pud);
+
+		ret = stage2_set_pud_huge(kvm, memcache, fault_ipa, &new_pud);
+	} else if (hugetlb && vma_pagesize == PMD_SIZE) {
 		pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
 
 		new_pmd = kvm_pmd_mkhuge(new_pmd);
-- 
2.18.0


^ permalink raw reply	[flat|nested] 21+ messages in thread
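
The PUD branch added to user_mem_abort() above builds the new entry in a fixed order (mkhuge, then mkwrite, then mkexec) and aligns the faulting IPA down to the hugepage boundary. The sketch below replays that bit manipulation in user space. The PUD_S2_* values are copied from the pgtable-hwdef.h hunk in the patch. The position of PUD_TABLE_BIT (bit 1), the 4K granule and the helper names make_s2_pud() and huge_gfn() are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the pgtable-hwdef.h hunk in the patch. */
#define PUD_S2_RDONLY	(UINT64_C(1) << 6)	/* HAP[2:1] */
#define PUD_S2_RDWR	(UINT64_C(3) << 6)	/* HAP[2:1] */
#define PUD_S2_XN	(UINT64_C(2) << 53)	/* XN[1:0] */

/* Assumptions: bit 1 marks a table entry; 4K granule, so 1G PUD blocks. */
#define PUD_TABLE_BIT	(UINT64_C(1) << 1)
#define PAGE_SHIFT	12
#define PUD_SHIFT	30
#define PUD_SIZE	(UINT64_C(1) << PUD_SHIFT)

/* Same order of operations as the PUD branch of user_mem_abort(). */
static uint64_t make_s2_pud(uint64_t base, bool writable, bool needs_exec)
{
	uint64_t pud = base & ~PUD_TABLE_BIT;	/* mkhuge: block, not table */

	if (writable)
		pud |= PUD_S2_RDWR;		/* mkwrite */
	if (needs_exec)
		pud &= ~PUD_S2_XN;		/* mkexec */
	return pud;
}

/* Frame number of the first page of the hugepage containing fault_ipa. */
static uint64_t huge_gfn(uint64_t fault_ipa, uint64_t huge_size)
{
	/* huge_size must be a non-zero power of two. */
	assert(huge_size && (huge_size & (huge_size - 1)) == 0);
	return (fault_ipa & ~(huge_size - 1)) >> PAGE_SHIFT;
}
```

With a 1G block, huge_gfn(0x8abcdef0, PUD_SIZE) yields frame 0x80000, whereas a 2M block would yield 0x8aa00. This is why the patch replaces the fixed PMD_MASK with a mask derived from the VMA's hstate.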

* Re: [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort()
  2018-10-01 15:54 ` [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort() Punit Agrawal
@ 2018-10-01 17:44   ` Suzuki K Poulose
  2018-10-03 15:20   ` Marc Zyngier
  1 sibling, 0 replies; 21+ messages in thread
From: Suzuki K Poulose @ 2018-10-01 17:44 UTC (permalink / raw)
  To: Punit Agrawal, kvmarm
  Cc: marc.zyngier, will.deacon, linux-kernel, linux-arm-kernel,
	Christoffer Dall

On 10/01/2018 04:54 PM, Punit Agrawal wrote:
> The code for operations such as marking the pfn as dirty, and
> dcache/icache maintenance during stage 2 fault handling is duplicated
> between normal pages and PMD hugepages.
> 
> Instead of creating another copy of the operations when we introduce
> PUD hugepages, let's share them across the different pagesizes.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> ---

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2
  2018-10-01 15:54 ` [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2 Punit Agrawal
@ 2018-10-01 21:30   ` Suzuki K Poulose
  2018-10-02  8:52     ` Punit Agrawal
  0 siblings, 1 reply; 21+ messages in thread
From: Suzuki K Poulose @ 2018-10-01 21:30 UTC (permalink / raw)
  To: Punit Agrawal, kvmarm
  Cc: marc.zyngier, will.deacon, linux-kernel, linux-arm-kernel,
	Christoffer Dall, Russell King, Catalin Marinas

On 10/01/2018 04:54 PM, Punit Agrawal wrote:
> KVM only supports PMD hugepages at stage 2. Now that the various page
> handling routines are updated, extend the stage 2 fault handling to
> map in PUD hugepages.
> 
> Addition of PUD hugepage support enables additional page sizes (e.g.,
> 1G with 4K granule) which can be useful on cores that support mapping
> larger block sizes in the TLB entries.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>   arch/arm/include/asm/kvm_mmu.h         |  20 +++++
>   arch/arm/include/asm/stage2_pgtable.h  |   9 +++
>   arch/arm64/include/asm/kvm_mmu.h       |  16 ++++
>   arch/arm64/include/asm/pgtable-hwdef.h |   2 +
>   arch/arm64/include/asm/pgtable.h       |   2 +
>   virt/kvm/arm/mmu.c                     | 106 +++++++++++++++++++++++--
>   6 files changed, 149 insertions(+), 6 deletions(-)
> 

...

> diff --git a/arch/arm/include/asm/stage2_pgtable.h b/arch/arm/include/asm/stage2_pgtable.h
> index f6a7ea805232..a4ec25360e50 100644
> --- a/arch/arm/include/asm/stage2_pgtable.h
> +++ b/arch/arm/include/asm/stage2_pgtable.h
> @@ -68,4 +68,13 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
>   #define stage2_pmd_table_empty(kvm, pmdp)	kvm_page_empty(pmdp)
>   #define stage2_pud_table_empty(kvm, pudp)	false
>   
> +static inline bool kvm_stage2_has_pud(struct kvm *kvm)
> +{
> +#if CONFIG_PGTABLE_LEVELS > 3
> +	return true;
> +#else
> +	return false;
> +#endif

nit: We can only have PGTABLE_LEVELS=3 on ARM with LPAE.
AFAICT, this can always be set to false for ARM.

> +}
> +

...

> @@ -1669,7 +1752,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   	needs_exec = exec_fault ||
>   		(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
>   
> -	if (hugetlb && vma_pagesize == PMD_SIZE) {
> +	if (hugetlb && vma_pagesize == PUD_SIZE) {
> +		pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
> +
> +		new_pud = kvm_pud_mkhuge(new_pud);
> +		if (writable)
> +			new_pud = kvm_s2pud_mkwrite(new_pud);
> +
> +		if (needs_exec)
> +			new_pud = kvm_s2pud_mkexec(new_pud);
> +
> +		ret = stage2_set_pud_huge(kvm, memcache, fault_ipa, &new_pud);
> +	} else if (hugetlb && vma_pagesize == PMD_SIZE) {
>   		pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
>   
>   		new_pmd = kvm_pmd_mkhuge(new_pmd);
> 


Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2
  2018-10-01 21:30   ` Suzuki K Poulose
@ 2018-10-02  8:52     ` Punit Agrawal
  0 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-02  8:52 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: kvmarm, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, Christoffer Dall, Russell King,
	Catalin Marinas

Suzuki K Poulose <suzuki.poulose@arm.com> writes:

> On 10/01/2018 04:54 PM, Punit Agrawal wrote:
>> KVM only supports PMD hugepages at stage 2. Now that the various page
>> handling routines are updated, extend the stage 2 fault handling to
>> map in PUD hugepages.
>>
>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>> 1G with 4K granule) which can be useful on cores that support mapping
>> larger block sizes in the TLB entries.
>>
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: Russell King <linux@armlinux.org.uk>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> ---
>>   arch/arm/include/asm/kvm_mmu.h         |  20 +++++
>>   arch/arm/include/asm/stage2_pgtable.h  |   9 +++
>>   arch/arm64/include/asm/kvm_mmu.h       |  16 ++++
>>   arch/arm64/include/asm/pgtable-hwdef.h |   2 +
>>   arch/arm64/include/asm/pgtable.h       |   2 +
>>   virt/kvm/arm/mmu.c                     | 106 +++++++++++++++++++++++--
>>   6 files changed, 149 insertions(+), 6 deletions(-)
>>
>
> ...
>
>> diff --git a/arch/arm/include/asm/stage2_pgtable.h b/arch/arm/include/asm/stage2_pgtable.h
>> index f6a7ea805232..a4ec25360e50 100644
>> --- a/arch/arm/include/asm/stage2_pgtable.h
>> +++ b/arch/arm/include/asm/stage2_pgtable.h
>> @@ -68,4 +68,13 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
>>   #define stage2_pmd_table_empty(kvm, pmdp)	kvm_page_empty(pmdp)
>>   #define stage2_pud_table_empty(kvm, pudp)	false
>>   +static inline bool kvm_stage2_has_pud(struct kvm *kvm)
>> +{
>> +#if CONFIG_PGTABLE_LEVELS > 3
>> +	return true;
>> +#else
>> +	return false;
>> +#endif
>
> nit: We can only have PGTABLE_LEVELS=3 on ARM with LPAE.
> AFAICT, this can always be set to false for ARM.

I debated this and veered towards being generic, but I'm not committed
either way.

I've updated this locally but will wait for further comments before
re-posting.

>
>> +}
>> +
>
> ...
>
>> @@ -1669,7 +1752,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>   	needs_exec = exec_fault ||
>>   		(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
>>   -	if (hugetlb && vma_pagesize == PMD_SIZE) {
>> +	if (hugetlb && vma_pagesize == PUD_SIZE) {
>> +		pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
>> +
>> +		new_pud = kvm_pud_mkhuge(new_pud);
>> +		if (writable)
>> +			new_pud = kvm_s2pud_mkwrite(new_pud);
>> +
>> +		if (needs_exec)
>> +			new_pud = kvm_s2pud_mkexec(new_pud);
>> +
>> +		ret = stage2_set_pud_huge(kvm, memcache, fault_ipa, &new_pud);
>> +	} else if (hugetlb && vma_pagesize == PMD_SIZE) {
>>   		pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
>>     		new_pmd = kvm_pmd_mkhuge(new_pmd);
>>
>
>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Thanks a lot for going through the series.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment
  2018-10-01 15:54 ` [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment Punit Agrawal
@ 2018-10-03 10:50   ` Marc Zyngier
  2018-10-03 10:59     ` Punit Agrawal
  2018-10-31 14:36   ` Christoffer Dall
  1 sibling, 1 reply; 21+ messages in thread
From: Marc Zyngier @ 2018-10-03 10:50 UTC (permalink / raw)
  To: Punit Agrawal, kvmarm
  Cc: will.deacon, linux-kernel, linux-arm-kernel, suzuki.poulose,
	Christoffer Dall, stable

On 01/10/18 16:54, Punit Agrawal wrote:
> PageTransCompoundMap() returns true for hugetlbfs and THP
> hugepages. This behaviour incorrectly leads to stage 2 faults for
> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
> treated as THP faults.
> 
> Tighten the check to filter out hugetlbfs pages. This also leads to
> consistently mapping all unsupported hugepage sizes as PTE level
> entries at stage 2.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: stable@vger.kernel.org # v4.13+

FWIW, I've cherry-picked that single patch from the series and queued it 
for 4.20.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment
  2018-10-03 10:50   ` Marc Zyngier
@ 2018-10-03 10:59     ` Punit Agrawal
  0 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-03 10:59 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvmarm, will.deacon, stable, linux-kernel, linux-arm-kernel

Marc Zyngier <marc.zyngier@arm.com> writes:

> On 01/10/18 16:54, Punit Agrawal wrote:
>> PageTransCompoundMap() returns true for hugetlbfs and THP
>> hugepages. This behaviour incorrectly leads to stage 2 faults for
>> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
>> treated as THP faults.
>>
>> Tighten the check to filter out hugetlbfs pages. This also leads to
>> consistently mapping all unsupported hugepage sizes as PTE level
>> entries at stage 2.
>>
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: stable@vger.kernel.org # v4.13+
>
> FWIW, I've cherry-picked that single patch from the series and queued
> it for 4.20.

Thanks for picking up the fix.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort()
  2018-10-01 15:54 ` [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort() Punit Agrawal
  2018-10-01 17:44   ` Suzuki K Poulose
@ 2018-10-03 15:20   ` Marc Zyngier
  2018-10-03 16:25     ` Punit Agrawal
  1 sibling, 1 reply; 21+ messages in thread
From: Marc Zyngier @ 2018-10-03 15:20 UTC (permalink / raw)
  To: Punit Agrawal, kvmarm
  Cc: will.deacon, linux-kernel, linux-arm-kernel, suzuki.poulose,
	Christoffer Dall

Hi Punit,

On 01/10/18 16:54, Punit Agrawal wrote:
> The code for operations such as marking the pfn as dirty, and
> dcache/icache maintenance during stage 2 fault handling is duplicated
> between normal pages and PMD hugepages.
> 
> Instead of creating another copy of the operations when we introduce
> PUD hugepages, let's share them across the different pagesizes.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> ---
>   virt/kvm/arm/mmu.c | 45 +++++++++++++++++++++++++++++----------------
>   1 file changed, 29 insertions(+), 16 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index c23a1b323aad..5b76ee204000 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1490,7 +1490,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   	kvm_pfn_t pfn;
>   	pgprot_t mem_type = PAGE_S2;
>   	bool logging_active = memslot_is_logging(memslot);
> -	unsigned long flags = 0;
> +	unsigned long vma_pagesize, flags = 0;
>   
>   	write_fault = kvm_is_write_fault(vcpu);
>   	exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
> @@ -1510,10 +1510,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   		return -EFAULT;
>   	}
>   
> -	if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
> +	vma_pagesize = vma_kernel_pagesize(vma);
> +	if (vma_pagesize == PMD_SIZE && !logging_active) {
>   		hugetlb = true;
>   		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>   	} else {
> +		/*
> +		 * Fallback to PTE if it's not one of the Stage 2
> +		 * supported hugepage sizes
> +		 */
> +		vma_pagesize = PAGE_SIZE;
> +
>   		/*
>   		 * Pages belonging to memslots that don't have the same
>   		 * alignment for userspace and IPA cannot be mapped using
> @@ -1579,23 +1586,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   	if (mmu_notifier_retry(kvm, mmu_seq))
>   		goto out_unlock;
>   
> -	if (!hugetlb && !force_pte)
> +	if (!hugetlb && !force_pte) {
> +		/*
> +		 * Only PMD_SIZE transparent hugepages(THP) are
> +		 * currently supported. This code will need to be
> +		 * updated to support other THP sizes.
> +		 */
>   		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
> +		if (hugetlb)
> +			vma_pagesize = PMD_SIZE;
> +	}
> +
> +	if (writable)
> +		kvm_set_pfn_dirty(pfn);
>   
> -	if (hugetlb) {
> +	if (fault_status != FSC_PERM)
> +		clean_dcache_guest_page(pfn, vma_pagesize);
> +
> +	if (exec_fault)
> +		invalidate_icache_guest_page(pfn, vma_pagesize);
> +
> +	if (hugetlb && vma_pagesize == PMD_SIZE) {

Can you end up in a situation where hugetlb==false and vma_pagesize ==
PMD_SIZE? If that's the case, then the above CMOs are not done on the
same granularity as they were done before this patch. If that cannot
happen, then the above condition can be simplified.

Which one is it?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 21+ messages in thread
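
The question above can be explored by modelling only the control flow visible in the quoted hunks. The sketch below is a user-space reading of those hunks and does not stand in for the author's reply. The names struct fault_state, model_fault() and hugetlb_tracks_pmd_size() are hypothetical. force_pte and the result of transparent_hugepage_adjust() are treated as free inputs, and the 4K/2M sizes are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE	4096UL		/* 4K granule assumed */
#define PMD_SIZE	(2UL << 20)	/* 2M PMD blocks assumed */

struct fault_state {
	bool hugetlb;
	unsigned long vma_pagesize;
};

/*
 * Follows the order of the quoted hunks. thp_adjusts stands in for the
 * return value of transparent_hugepage_adjust().
 */
static struct fault_state model_fault(unsigned long vma_kernel_pagesize,
				      bool logging_active, bool force_pte,
				      bool thp_adjusts)
{
	struct fault_state s = { false, vma_kernel_pagesize };

	assert(vma_kernel_pagesize >= PAGE_SIZE);

	if (s.vma_pagesize == PMD_SIZE && !logging_active)
		s.hugetlb = true;
	else
		s.vma_pagesize = PAGE_SIZE;	/* fall back to PTE */

	if (!s.hugetlb && !force_pte) {
		s.hugetlb = thp_adjusts;
		if (s.hugetlb)
			s.vma_pagesize = PMD_SIZE;
	}
	return s;
}

/* True when hugetlb and (vma_pagesize == PMD_SIZE) agree for every input. */
static bool hugetlb_tracks_pmd_size(void)
{
	static const unsigned long sizes[] = {
		PAGE_SIZE, 64UL << 10, PMD_SIZE, 1UL << 30
	};

	for (unsigned int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		for (int bits = 0; bits < 8; bits++) {
			struct fault_state s = model_fault(sizes[i], bits & 1,
							   bits & 2, bits & 4);

			if (s.hugetlb != (s.vma_pagesize == PMD_SIZE))
				return false;
		}
	}
	return true;
}
```

Under this model, hugetlb is false whenever vma_pagesize is not PMD_SIZE and true otherwise, so the two tests in the final condition coincide at this point in the series. Whether that still holds once PUD_SIZE is introduced later in the series is a separate matter that the model does not cover.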

* Re: [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort()
  2018-10-03 15:20   ` Marc Zyngier
@ 2018-10-03 16:25     ` Punit Agrawal
  0 siblings, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-03 16:25 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvmarm, will.deacon, linux-kernel, linux-arm-kernel

Marc Zyngier <marc.zyngier@arm.com> writes:

> Hi Punit,
>
> On 01/10/18 16:54, Punit Agrawal wrote:
>> The code for operations such as marking the pfn as dirty, and
>> dcache/icache maintenance during stage 2 fault handling is duplicated
>> between normal pages and PMD hugepages.
>>
>> Instead of creating another copy of the operations when we introduce
>> PUD hugepages, let's share them across the different pagesizes.
>>
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>   virt/kvm/arm/mmu.c | 45 +++++++++++++++++++++++++++++----------------
>>   1 file changed, 29 insertions(+), 16 deletions(-)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index c23a1b323aad..5b76ee204000 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1490,7 +1490,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>   	kvm_pfn_t pfn;
>>   	pgprot_t mem_type = PAGE_S2;
>>   	bool logging_active = memslot_is_logging(memslot);
>> -	unsigned long flags = 0;
>> +	unsigned long vma_pagesize, flags = 0;
>>   
>>   	write_fault = kvm_is_write_fault(vcpu);
>>   	exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
>> @@ -1510,10 +1510,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>   		return -EFAULT;
>>   	}
>>   
>> -	if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
>> +	vma_pagesize = vma_kernel_pagesize(vma);
>> +	if (vma_pagesize == PMD_SIZE && !logging_active) {
>>   		hugetlb = true;
>>   		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>>   	} else {
>> +		/*
>> +		 * Fallback to PTE if it's not one of the Stage 2
>> +		 * supported hugepage sizes
>> +		 */
>> +		vma_pagesize = PAGE_SIZE;
>> +
>>   		/*
>>   		 * Pages belonging to memslots that don't have the same
>>   		 * alignment for userspace and IPA cannot be mapped using
>> @@ -1579,23 +1586,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>   	if (mmu_notifier_retry(kvm, mmu_seq))
>>   		goto out_unlock;
>>   
>> -	if (!hugetlb && !force_pte)
>> +	if (!hugetlb && !force_pte) {
>> +		/*
>> +		 * Only PMD_SIZE transparent hugepages(THP) are
>> +		 * currently supported. This code will need to be
>> +		 * updated to support other THP sizes.
>> +		 */
>>   		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>> +		if (hugetlb)
>> +			vma_pagesize = PMD_SIZE;
>> +	}
>> +
>> +	if (writable)
>> +		kvm_set_pfn_dirty(pfn);
>>   
>> -	if (hugetlb) {
>> +	if (fault_status != FSC_PERM)
>> +		clean_dcache_guest_page(pfn, vma_pagesize);
>> +
>> +	if (exec_fault)
>> +		invalidate_icache_guest_page(pfn, vma_pagesize);
>> +
>> +	if (hugetlb && vma_pagesize == PMD_SIZE) {
>
> Can you end up in a situation where hugetlb==false and vma_pagesize ==
> PMD_SIZE? If that's the case, then the above CMOs are not done on the
> same granularity as they were done before this patch. If that cannot
> happen, then the above condition can be simplified.
>
> Which one is it?

hugetlb is a hangover from when we didn't have vma_pagesize. I think we
can drop it and rely on the pagesize to control the size of mapping we
put down.

Let me give that a try.
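
The question above can also be checked mechanically. What follows is a
minimal user-space sketch, not kernel code: it models only the page-size
selection visible in the hunks of this patch. The names model_fault(),
struct fault_result and the MODEL_* constants are made up for
illustration, a 4K granule with 2M PMD blocks is assumed, and force_pte
and thp_backed are treated as free inputs rather than derived from a
memslot or a page.

```c
#include <assert.h>   /* used by callers of check_invariant() */
#include <stdbool.h>

#define MODEL_PAGE_SIZE (1UL << 12)	/* assumption: 4K granule */
#define MODEL_PMD_SIZE  (1UL << 21)	/* assumption: 2M PMD block */

struct fault_result {
	bool hugetlb;
	unsigned long vma_pagesize;
};

/*
 * vma_kernel_size: value vma_kernel_pagesize() would report
 * logging_active:  dirty logging enabled on the memslot
 * force_pte:       block mappings are not allowed for this fault
 * thp_backed:      transparent_hugepage_adjust() would return true
 */
struct fault_result model_fault(unsigned long vma_kernel_size,
				bool logging_active, bool force_pte,
				bool thp_backed)
{
	struct fault_result r = { false, vma_kernel_size };

	if (r.vma_pagesize == MODEL_PMD_SIZE && !logging_active)
		r.hugetlb = true;
	else
		r.vma_pagesize = MODEL_PAGE_SIZE;	/* fall back to PTE */

	if (!r.hugetlb && !force_pte && thp_backed) {
		r.hugetlb = true;
		r.vma_pagesize = MODEL_PMD_SIZE;
	}
	return r;
}

/* Assert hugetlb <=> (vma_pagesize == PMD_SIZE) for every modelled input. */
void check_invariant(void)
{
	const unsigned long sizes[] = { MODEL_PAGE_SIZE, 1UL << 16,
					MODEL_PMD_SIZE, 1UL << 30 };
	unsigned int s, bits;

	for (s = 0; s < sizeof(sizes) / sizeof(sizes[0]); s++) {
		for (bits = 0; bits < 8; bits++) {
			struct fault_result r =
				model_fault(sizes[s], (bits & 1) != 0,
					    (bits & 2) != 0, (bits & 4) != 0);

			assert(r.hugetlb ==
			       (r.vma_pagesize == MODEL_PMD_SIZE));
		}
	}
}
```

Under these assumptions hugetlb and (vma_pagesize == PMD_SIZE) always
agree, which is consistent with dropping the flag and keying the mapping
size off vma_pagesize alone.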

Thanks for taking a look.

>
>
> Thanks,
>
> 	M.

* Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment
  2018-10-01 15:54 ` [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment Punit Agrawal
  2018-10-03 10:50   ` Marc Zyngier
@ 2018-10-31 14:36   ` Christoffer Dall
  2018-10-31 14:52     ` Punit Agrawal
  1 sibling, 1 reply; 21+ messages in thread
From: Christoffer Dall @ 2018-10-31 14:36 UTC (permalink / raw)
  To: Punit Agrawal
  Cc: kvmarm, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, stable

On Mon, Oct 01, 2018 at 04:54:35PM +0100, Punit Agrawal wrote:
> PageTransCompoundMap() returns true for hugetlbfs and THP
> hugepages. This behaviour incorrectly leads to stage 2 faults for
> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
> treated as THP faults.
> 
> Tighten the check to filter out hugetlbfs pages. This also leads to
> consistently mapping all unsupported hugepage sizes as PTE level
> entries at stage 2.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: stable@vger.kernel.org # v4.13+


Hmm, this function is only actually called from user_mem_abort() if we
have (!hugetlb), so I'm not sure the cc stable here was actually
warranted, nor that this patch is strictly necessary.

It doesn't hurt, and makes the code potentially more robust for the
future though.

Am I missing something?

Thanks,

    Christoffer

> ---
>  virt/kvm/arm/mmu.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 7e477b3cae5b..c23a1b323aad 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>  {
>  	kvm_pfn_t pfn = *pfnp;
>  	gfn_t gfn = *ipap >> PAGE_SHIFT;
> +	struct page *page = pfn_to_page(pfn);
>  
> -	if (PageTransCompoundMap(pfn_to_page(pfn))) {
> +	/*
> +	 * PageTransCompoundMap() returns true for THP and
> +	 * hugetlbfs. Make sure the adjustment is done only for THP
> +	 * pages.
> +	 */
> +	if (!PageHuge(page) && PageTransCompoundMap(page)) {
>  		unsigned long mask;
>  		/*
>  		 * The address we faulted on is backed by a transparent huge
> -- 
> 2.18.0
> 

* Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment
  2018-10-31 14:36   ` Christoffer Dall
@ 2018-10-31 14:52     ` Punit Agrawal
  2018-10-31 17:15       ` Punit Agrawal
  2018-11-01  8:31       ` Christoffer Dall
  0 siblings, 2 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-31 14:52 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvmarm, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, stable

Christoffer Dall <christoffer.dall@arm.com> writes:

> On Mon, Oct 01, 2018 at 04:54:35PM +0100, Punit Agrawal wrote:
>> PageTransCompoundMap() returns true for hugetlbfs and THP
>> hugepages. This behaviour incorrectly leads to stage 2 faults for
>> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
>> treated as THP faults.
>> 
>> Tighten the check to filter out hugetlbfs pages. This also leads to
>> consistently mapping all unsupported hugepage sizes as PTE level
>> entries at stage 2.
>> 
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: stable@vger.kernel.org # v4.13+
>
>
> Hmm, this function is only actually called from user_mem_abort() if we
> have (!hugetlb), so I'm not sure the cc stable here was actually
> warranted, nor that this patch is strictly necessary.
>
> It doesn't hurt, and makes the code potentially more robust for the
> future though.
>
> Am I missing something?

!hugetlb is only true for hugepage sizes supported at stage 2. The
function also got called for unsupported hugepage size at stage 2, e.g.,
64k hugepage with 4k page size, which then ended up doing the wrong
thing.

Hope that adds some context. I should've added this to the commit log.
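
To make the failing case concrete, here is a small user-space sketch. It
is an illustration only: struct model_page and the helpers old_check(),
new_check() and treated_as_thp() are hypothetical stand-ins, with the
two booleans modelling what PageTransCompoundMap() and PageHuge() would
report for a page. The caller's gate is reduced to the PMD_SIZE
comparison; logging and memslot alignment are ignored, and a 2M PMD is
assumed.

```c
#include <assert.h>   /* used by the usage example */
#include <stdbool.h>

#define MODEL_PMD_SIZE (1UL << 21)	/* assumption: 2M PMD block */

/* Hypothetical stand-in for the bits of struct page that matter here. */
struct model_page {
	bool compound;	/* models PageTransCompoundMap() returning true */
	bool hugetlbfs;	/* models PageHuge() returning true */
};

/* Check before the patch: any compound-mapped page qualifies. */
bool old_check(const struct model_page *p)
{
	return p->compound;
}

/* Check after the patch: hugetlbfs pages are filtered out first. */
bool new_check(const struct model_page *p)
{
	return !p->hugetlbfs && p->compound;
}

/*
 * Model of the caller: hugetlb is set only for the hugepage size that
 * stage 2 supports, so the THP adjustment runs for everything else,
 * including hugetlbfs sizes that stage 2 cannot map as a block.
 */
bool treated_as_thp(unsigned long vma_kernel_size,
		    const struct model_page *p,
		    bool (*check)(const struct model_page *))
{
	bool hugetlb = (vma_kernel_size == MODEL_PMD_SIZE);

	if (hugetlb)
		return false;	/* adjustment is never attempted */
	return check(p);
}
```

With a 64K hugetlbfs page (compound and hugetlbfs both true) and a VMA
page size of 64K, treated_as_thp() returns true with old_check() and
false with new_check(); a THP-backed page in a 4K VMA is accepted by
both.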

>
> Thanks,
>
>     Christoffer
>
>> ---
>>  virt/kvm/arm/mmu.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 7e477b3cae5b..c23a1b323aad 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>>  {
>>  	kvm_pfn_t pfn = *pfnp;
>>  	gfn_t gfn = *ipap >> PAGE_SHIFT;
>> +	struct page *page = pfn_to_page(pfn);
>>  
>> -	if (PageTransCompoundMap(pfn_to_page(pfn))) {
>> +	/*
>> +	 * PageTransCompoundMap() returns true for THP and
>> +	 * hugetlbfs. Make sure the adjustment is done only for THP
>> +	 * pages.
>> +	 */
>> +	if (!PageHuge(page) && PageTransCompoundMap(page)) {
>>  		unsigned long mask;
>>  		/*
>>  		 * The address we faulted on is backed by a transparent huge
>> -- 
>> 2.18.0
>> 

* Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment
  2018-10-31 14:52     ` Punit Agrawal
@ 2018-10-31 17:15       ` Punit Agrawal
  2018-11-01  8:31       ` Christoffer Dall
  1 sibling, 0 replies; 21+ messages in thread
From: Punit Agrawal @ 2018-10-31 17:15 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: marc.zyngier, will.deacon, linux-kernel, stable, kvmarm,
	linux-arm-kernel

Punit Agrawal <punit.agrawal@arm.com> writes:

> Christoffer Dall <christoffer.dall@arm.com> writes:
>
>> On Mon, Oct 01, 2018 at 04:54:35PM +0100, Punit Agrawal wrote:
>>> PageTransCompoundMap() returns true for hugetlbfs and THP
>>> hugepages. This behaviour incorrectly leads to stage 2 faults for
>>> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
>>> treated as THP faults.
>>> 
>>> Tighten the check to filter out hugetlbfs pages. This also leads to
>>> consistently mapping all unsupported hugepage sizes as PTE level
>>> entries at stage 2.
>>> 
>>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>>> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
>>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>> Cc: stable@vger.kernel.org # v4.13+
>>
>>
>> Hmm, this function is only actually called from user_mem_abort() if we
>> have (!hugetlb), so I'm not sure the cc stable here was actually
>> warranted, nor that this patch is strictly necessary.
>>
>> It doesn't hurt, and makes the code potentially more robust for the
>> future though.
>>
>> Am I missing something?
>
> !hugetlb is only true for hugepage sizes supported at stage 2. 

Of course I meant "hugetlb" above (Note the lack of "!").

> The function also got called for unsupported hugepage size at stage 2,
> e.g., 64k hugepage with 4k page size, which then ended up doing the
> wrong thing.
>
> Hope that adds some context. I should've added this to the commit log.
>
>>
>> Thanks,
>>
>>     Christoffer
>>
>>> ---
>>>  virt/kvm/arm/mmu.c | 8 +++++++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 7e477b3cae5b..c23a1b323aad 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>>>  {
>>>  	kvm_pfn_t pfn = *pfnp;
>>>  	gfn_t gfn = *ipap >> PAGE_SHIFT;
>>> +	struct page *page = pfn_to_page(pfn);
>>>  
>>> -	if (PageTransCompoundMap(pfn_to_page(pfn))) {
>>> +	/*
>>> +	 * PageTransCompoundMap() returns true for THP and
>>> +	 * hugetlbfs. Make sure the adjustment is done only for THP
>>> +	 * pages.
>>> +	 */
>>> +	if (!PageHuge(page) && PageTransCompoundMap(page)) {
>>>  		unsigned long mask;
>>>  		/*
>>>  		 * The address we faulted on is backed by a transparent huge
>>> -- 
>>> 2.18.0
>>> 
* Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment
  2018-10-31 14:52     ` Punit Agrawal
  2018-10-31 17:15       ` Punit Agrawal
@ 2018-11-01  8:31       ` Christoffer Dall
  1 sibling, 0 replies; 21+ messages in thread
From: Christoffer Dall @ 2018-11-01  8:31 UTC (permalink / raw)
  To: Punit Agrawal
  Cc: kvmarm, marc.zyngier, will.deacon, linux-kernel,
	linux-arm-kernel, suzuki.poulose, stable

On Wed, Oct 31, 2018 at 02:52:20PM +0000, Punit Agrawal wrote:
> Christoffer Dall <christoffer.dall@arm.com> writes:
> 
> > On Mon, Oct 01, 2018 at 04:54:35PM +0100, Punit Agrawal wrote:
> >> PageTransCompoundMap() returns true for hugetlbfs and THP
> >> hugepages. This behaviour incorrectly leads to stage 2 faults for
> >> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
> >> treated as THP faults.
> >> 
> >> Tighten the check to filter out hugetlbfs pages. This also leads to
> >> consistently mapping all unsupported hugepage sizes as PTE level
> >> entries at stage 2.
> >> 
> >> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> >> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
> >> Cc: Christoffer Dall <christoffer.dall@arm.com>
> >> Cc: Marc Zyngier <marc.zyngier@arm.com>
> >> Cc: stable@vger.kernel.org # v4.13+
> >
> >
> > Hmm, this function is only actually called from user_mem_abort() if we
> > have (!hugetlb), so I'm not sure the cc stable here was actually
> > warranted, nor that this patch is strictly necessary.
> >
> > It doesn't hurt, and makes the code potentially more robust for the
> > future though.
> >
> > Am I missing something?
> 
> !hugetlb is only true for hugepage sizes supported at stage 2. The
> function also got called for unsupported hugepage size at stage 2, e.g.,
> 64k hugepage with 4k page size, which then ended up doing the wrong
> thing.
> 
> Hope that adds some context. I should've added this to the commit log.
> 

To be fair you did say that this was for unsupported hugepage sizes.

Thanks for the explanation.


    Christoffer

Thread overview: 21+ messages
2018-10-01 15:54 [PATCH v8 0/9] KVM: Support PUD hugepage at stage 2 Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment Punit Agrawal
2018-10-03 10:50   ` Marc Zyngier
2018-10-03 10:59     ` Punit Agrawal
2018-10-31 14:36   ` Christoffer Dall
2018-10-31 14:52     ` Punit Agrawal
2018-10-31 17:15       ` Punit Agrawal
2018-11-01  8:31       ` Christoffer Dall
2018-10-01 15:54 ` [PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort() Punit Agrawal
2018-10-01 17:44   ` Suzuki K Poulose
2018-10-03 15:20   ` Marc Zyngier
2018-10-03 16:25     ` Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 3/9] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 4/9] KVM: arm/arm64: Introduce helpers to manipulate page table entries Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 5/9] KVM: arm64: Support dirty page tracking for PUD hugepages Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 6/9] KVM: arm64: Support PUD hugepage in stage2_is_exec() Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 7/9] KVM: arm64: Support handling access faults for PUD hugepages Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 8/9] KVM: arm64: Update age handlers to support " Punit Agrawal
2018-10-01 15:54 ` [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2 Punit Agrawal
2018-10-01 21:30   ` Suzuki K Poulose
2018-10-02  8:52     ` Punit Agrawal
