LKML Archive on lore.kernel.org
* [PATCH v1 0/2] Do some code cleanups related to mm
@ 2021-08-28  4:23 Qi Zheng
  2021-08-28  4:23 ` [PATCH v1 1/2] mm: introduce pmd_install() helper Qi Zheng
  2021-08-28  4:23 ` [PATCH v1 2/2] mm: remove redundant smp_wmb() Qi Zheng
  0 siblings, 2 replies; 8+ messages in thread
From: Qi Zheng @ 2021-08-28  4:23 UTC (permalink / raw)
  To: akpm, tglx, hannes, mhocko, vdavydov.dev, kirill.shutemov,
	mika.penttila, david
  Cc: linux-doc, linux-kernel, linux-mm, songmuchun, Qi Zheng

Hi,

This patch series aims to do some code cleanups related to mm.

This series is based on next-20210827.

Comments and suggestions are welcome.

Thanks,
Qi.

Qi Zheng (2):
  mm: introduce pmd_install() helper
  mm: remove redundant smp_wmb()

 include/linux/mm.h  |  1 +
 mm/filemap.c        | 11 ++------
 mm/memory.c         | 81 ++++++++++++++++++++++++-----------------------------
 mm/sparse-vmemmap.c |  2 +-
 4 files changed, 40 insertions(+), 55 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v1 1/2] mm: introduce pmd_install() helper
  2021-08-28  4:23 [PATCH v1 0/2] Do some code cleanups related to mm Qi Zheng
@ 2021-08-28  4:23 ` Qi Zheng
  2021-08-28  5:25   ` Muchun Song
  2021-08-28  4:23 ` [PATCH v1 2/2] mm: remove redundant smp_wmb() Qi Zheng
  1 sibling, 1 reply; 8+ messages in thread
From: Qi Zheng @ 2021-08-28  4:23 UTC (permalink / raw)
  To: akpm, tglx, hannes, mhocko, vdavydov.dev, kirill.shutemov,
	mika.penttila, david
  Cc: linux-doc, linux-kernel, linux-mm, songmuchun, Qi Zheng

Currently we have the same few lines repeated three times in the
code. Deduplicate them with the newly introduced pmd_install() helper.
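
The duplicated lines are the classic double-checked "install if still
empty" idiom: test the slot locklessly, take the lock, re-test under it,
and only then populate. A minimal userspace sketch of the idiom, with a
pthread mutex standing in for the page-table spinlock (names here are
illustrative, not kernel API):

```c
#include <pthread.h>
#include <stddef.h>

/* Sketch of the check-lock-recheck pattern that pmd_install() factors
 * out. "slot" plays the role of *pmd and "nr_tables" of the mm page
 * table accounting; neither is a kernel interface. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static void *slot;
static long nr_tables;

static void slot_install(void **prealloc)
{
	pthread_mutex_lock(&table_lock);
	if (slot == NULL) {		/* Has another populated it ? */
		nr_tables++;
		slot = *prealloc;	/* pmd_populate() analogue */
		*prealloc = NULL;	/* tell the caller it was consumed */
	}
	pthread_mutex_unlock(&table_lock);
}
```

A caller whose prealloc pointer is still non-NULL afterwards knows the
slot was already populated and frees the unused table, exactly as
__pte_alloc() does with pte_free().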

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h |  1 +
 mm/filemap.c       | 11 ++---------
 mm/memory.c        | 34 ++++++++++++++++------------------
 3 files changed, 19 insertions(+), 27 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a3cc83d64564..0af420a7e382 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2463,6 +2463,7 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 	return ptl;
 }
 
+extern void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte);
 extern void __init pagecache_init(void);
 extern void __init free_area_init_memoryless_node(int nid);
 extern void free_initmem(void);
diff --git a/mm/filemap.c b/mm/filemap.c
index c90b6e4984c9..923cbba1bf37 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3209,15 +3209,8 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
 	    }
 	}
 
-	if (pmd_none(*vmf->pmd)) {
-		vmf->ptl = pmd_lock(mm, vmf->pmd);
-		if (likely(pmd_none(*vmf->pmd))) {
-			mm_inc_nr_ptes(mm);
-			pmd_populate(mm, vmf->pmd, vmf->prealloc_pte);
-			vmf->prealloc_pte = NULL;
-		}
-		spin_unlock(vmf->ptl);
-	}
+	if (pmd_none(*vmf->pmd))
+		pmd_install(mm, vmf->pmd, &vmf->prealloc_pte);
 
 	/* See comment in handle_pte_fault() */
 	if (pmd_devmap_trans_unstable(vmf->pmd)) {
diff --git a/mm/memory.c b/mm/memory.c
index 39e7a1495c3c..ef7b1762e996 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -433,9 +433,20 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	}
 }
 
+void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
+{
+	spinlock_t *ptl = pmd_lock(mm, pmd);
+
+	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
+		mm_inc_nr_ptes(mm);
+		pmd_populate(mm, pmd, *pte);
+		*pte = NULL;
+	}
+	spin_unlock(ptl);
+}
+
 int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
 {
-	spinlock_t *ptl;
 	pgtable_t new = pte_alloc_one(mm);
 	if (!new)
 		return -ENOMEM;
@@ -455,13 +466,7 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
 	 */
 	smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
 
-	ptl = pmd_lock(mm, pmd);
-	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
-		mm_inc_nr_ptes(mm);
-		pmd_populate(mm, pmd, new);
-		new = NULL;
-	}
-	spin_unlock(ptl);
+	pmd_install(mm, pmd, &new);
 	if (new)
 		pte_free(mm, new);
 	return 0;
@@ -4027,17 +4032,10 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 				return ret;
 		}
 
-		if (vmf->prealloc_pte) {
-			vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-			if (likely(pmd_none(*vmf->pmd))) {
-				mm_inc_nr_ptes(vma->vm_mm);
-				pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte);
-				vmf->prealloc_pte = NULL;
-			}
-			spin_unlock(vmf->ptl);
-		} else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) {
+		if (vmf->prealloc_pte)
+			pmd_install(vma->vm_mm, vmf->pmd, &vmf->prealloc_pte);
+		else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
 			return VM_FAULT_OOM;
-		}
 	}
 
 	/* See comment in handle_pte_fault() */
-- 
2.11.0



* [PATCH v1 2/2] mm: remove redundant smp_wmb()
  2021-08-28  4:23 [PATCH v1 0/2] Do some code cleanups related to mm Qi Zheng
  2021-08-28  4:23 ` [PATCH v1 1/2] mm: introduce pmd_install() helper Qi Zheng
@ 2021-08-28  4:23 ` Qi Zheng
  2021-08-31 10:02   ` David Hildenbrand
  2021-08-31 10:20   ` Vlastimil Babka
  1 sibling, 2 replies; 8+ messages in thread
From: Qi Zheng @ 2021-08-28  4:23 UTC (permalink / raw)
  To: akpm, tglx, hannes, mhocko, vdavydov.dev, kirill.shutemov,
	mika.penttila, david
  Cc: linux-doc, linux-kernel, linux-mm, songmuchun, Qi Zheng

The smp_wmb() in __pte_alloc() is used to ensure that
all pte setup is visible before the pte is made visible
to other CPUs by being put into page tables. We only
need this when the pte is actually populated, so move
it to pte_install(). __pte_alloc_kernel(), __p4d_alloc(),
__pud_alloc() and __pmd_alloc() are similar cases.

We can also defer the smp_wmb() to the place where the pmd
entry is actually populated with the preallocated pte. There
are two kinds of users of preallocated ptes: one is filemap
& finish_fault(), the other is THP. The former does not need
another smp_wmb() because the smp_wmb() has already been done
by pte_install(). Fortunately, the latter does not need
another smp_wmb() either, because there is already an
smp_wmb() before populating the new pte when THP uses a
preallocated pte to split a huge pmd.
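
The write-side ordering being moved here can be sketched in userspace
with C11 atomics, where a release store plays the role of the
smp_wmb() + pmd_populate() pair (names are illustrative, not kernel
API):

```c
#include <stdatomic.h>
#include <string.h>

#define NENT 8

/* A "pte page" worth of entries published through a "pmd" pointer. */
struct table { unsigned long entry[NENT]; };

static _Atomic(struct table *) pmd_slot;

static void publish(struct table *t)
{
	memset(t->entry, 0, sizeof(t->entry));	/* pte page clearing */
	/* Only the actual publication needs the barrier, which is why
	 * the smp_wmb() can sit next to pmd_populate(): everything
	 * stored to *t above must be visible before the pointer is. */
	atomic_store_explicit(&pmd_slot, t, memory_order_release);
}

/* Lockless reader; the acquire load pairs with the release store the
 * way the data-dependent loads of a page-table walk pair with the
 * smp_wmb() on the write side. */
static struct table *walk(void)
{
	return atomic_load_explicit(&pmd_slot, memory_order_acquire);
}
```

If an allocation path never publishes (e.g. a preallocated pte that
ends up unused), no barrier is paid at all, which is the point of
deferring it.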

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/memory.c         | 47 ++++++++++++++++++++---------------------------
 mm/sparse-vmemmap.c |  2 +-
 2 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ef7b1762e996..9c7534187454 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
 
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
 		mm_inc_nr_ptes(mm);
+		/*
+		 * Ensure all pte setup (eg. pte page lock and page clearing) are
+		 * visible before the pte is made visible to other CPUs by being
+		 * put into page tables.
+		 *
+		 * The other side of the story is the pointer chasing in the page
+		 * table walking code (when walking the page table without locking;
+		 * ie. most of the time). Fortunately, these data accesses consist
+		 * of a chain of data-dependent loads, meaning most CPUs (alpha
+		 * being the notable exception) will already guarantee loads are
+		 * seen in-order. See the alpha page table accessors for the
+		 * smp_rmb() barriers in page table walking code.
+		 */
+		smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
 		pmd_populate(mm, pmd, *pte);
 		*pte = NULL;
 	}
@@ -451,21 +465,6 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
 	if (!new)
 		return -ENOMEM;
 
-	/*
-	 * Ensure all pte setup (eg. pte page lock and page clearing) are
-	 * visible before the pte is made visible to other CPUs by being
-	 * put into page tables.
-	 *
-	 * The other side of the story is the pointer chasing in the page
-	 * table walking code (when walking the page table without locking;
-	 * ie. most of the time). Fortunately, these data accesses consist
-	 * of a chain of data-dependent loads, meaning most CPUs (alpha
-	 * being the notable exception) will already guarantee loads are
-	 * seen in-order. See the alpha page table accessors for the
-	 * smp_rmb() barriers in page table walking code.
-	 */
-	smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
-
 	pmd_install(mm, pmd, &new);
 	if (new)
 		pte_free(mm, new);
@@ -478,10 +477,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&init_mm.page_table_lock);
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
+		smp_wmb(); /* See comment in pmd_install() */
 		pmd_populate_kernel(&init_mm, pmd, new);
 		new = NULL;
 	}
@@ -3857,7 +3855,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	ret = vma->vm_ops->fault(vmf);
@@ -3919,7 +3916,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
@@ -4144,7 +4140,6 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
 		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
@@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&mm->page_table_lock);
 	if (pgd_present(*pgd))		/* Another has populated it */
 		p4d_free(mm, new);
-	else
+	else {
+		smp_wmb(); /* See comment in pmd_install() */
 		pgd_populate(mm, pgd, new);
+	}
 	spin_unlock(&mm->page_table_lock);
 	return 0;
 }
@@ -4842,11 +4837,10 @@ int __pud_alloc(struct mm_struct *mm, p4d_t *p4d, unsigned long address)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&mm->page_table_lock);
 	if (!p4d_present(*p4d)) {
 		mm_inc_nr_puds(mm);
+		smp_wmb(); /* See comment in pmd_install() */
 		p4d_populate(mm, p4d, new);
 	} else	/* Another has populated it */
 		pud_free(mm, new);
@@ -4867,11 +4861,10 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	ptl = pud_lock(mm, pud);
 	if (!pud_present(*pud)) {
 		mm_inc_nr_pmds(mm);
+		smp_wmb(); /* See comment in pmd_install() */
 		pud_populate(mm, pud, new);
 	} else	/* Another has populated it */
 		pmd_free(mm, new);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index bdce883f9286..db6df27c852a 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -76,7 +76,7 @@ static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start,
 		set_pte_at(&init_mm, addr, pte, entry);
 	}
 
-	/* Make pte visible before pmd. See comment in __pte_alloc(). */
+	/* Make pte visible before pmd. See comment in pmd_install(). */
 	smp_wmb();
 	pmd_populate_kernel(&init_mm, pmd, pgtable);
 
-- 
2.11.0



* Re: [PATCH v1 1/2] mm: introduce pmd_install() helper
  2021-08-28  4:23 ` [PATCH v1 1/2] mm: introduce pmd_install() helper Qi Zheng
@ 2021-08-28  5:25   ` Muchun Song
  0 siblings, 0 replies; 8+ messages in thread
From: Muchun Song @ 2021-08-28  5:25 UTC (permalink / raw)
  To: Qi Zheng
  Cc: Andrew Morton, Thomas Gleixner, Johannes Weiner, Michal Hocko,
	Vladimir Davydov, Kirill A. Shutemov, mika.penttila,
	David Hildenbrand, linux-doc, LKML, Linux Memory Management List

On Sat, Aug 28, 2021 at 12:23 PM Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>
> Currently we have the same few lines repeated three times in the
> code. Deduplicate them with the newly introduced pmd_install() helper.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>


* Re: [PATCH v1 2/2] mm: remove redundant smp_wmb()
  2021-08-28  4:23 ` [PATCH v1 2/2] mm: remove redundant smp_wmb() Qi Zheng
@ 2021-08-31 10:02   ` David Hildenbrand
  2021-08-31 12:36     ` Qi Zheng
  2021-08-31 10:20   ` Vlastimil Babka
  1 sibling, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2021-08-31 10:02 UTC (permalink / raw)
  To: Qi Zheng, akpm, tglx, hannes, mhocko, vdavydov.dev,
	kirill.shutemov, mika.penttila
  Cc: linux-doc, linux-kernel, linux-mm, songmuchun

On 28.08.21 06:23, Qi Zheng wrote:
> The smp_wmb() which is in the __pte_alloc() is used to
> ensure all ptes setup is visible before the pte is made
> visible to other CPUs by being put into page tables. We
> only need this when the pte is actually populated, so
> move it to pte_install(). __pte_alloc_kernel(),
> __p4d_alloc(), __pud_alloc() and __pmd_alloc() are similar
> to this case.
> 
> We can also defer smp_wmb() to the place where the pmd entry
> is really populated by preallocated pte. There are two kinds
> of user of preallocated pte, one is filemap & finish_fault(),
> another is THP. The former does not need another smp_wmb()
> because the smp_wmb() has been done by pte_install().
> Fortunately, the latter also does not need another smp_wmb()
> because there is already a smp_wmb() before populating the
> new pte when the THP uses a preallocated pte to split a huge
> pmd.
> 
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
> ---
>   mm/memory.c         | 47 ++++++++++++++++++++---------------------------
>   mm/sparse-vmemmap.c |  2 +-
>   2 files changed, 21 insertions(+), 28 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index ef7b1762e996..9c7534187454 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
>   
>   	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
>   		mm_inc_nr_ptes(mm);
> +		/*
> +		 * Ensure all pte setup (eg. pte page lock and page clearing) are
> +		 * visible before the pte is made visible to other CPUs by being
> +		 * put into page tables.
> +		 *
> +		 * The other side of the story is the pointer chasing in the page
> +		 * table walking code (when walking the page table without locking;
> +		 * ie. most of the time). Fortunately, these data accesses consist
> +		 * of a chain of data-dependent loads, meaning most CPUs (alpha
> +		 * being the notable exception) will already guarantee loads are
> +		 * seen in-order. See the alpha page table accessors for the
> +		 * smp_rmb() barriers in page table walking code.
> +		 */
> +		smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
>   		pmd_populate(mm, pmd, *pte);
>   		*pte = NULL;
>   	}
> @@ -451,21 +465,6 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
>   	if (!new)
>   		return -ENOMEM;
>   
> -	/*
> -	 * Ensure all pte setup (eg. pte page lock and page clearing) are
> -	 * visible before the pte is made visible to other CPUs by being
> -	 * put into page tables.
> -	 *
> -	 * The other side of the story is the pointer chasing in the page
> -	 * table walking code (when walking the page table without locking;
> -	 * ie. most of the time). Fortunately, these data accesses consist
> -	 * of a chain of data-dependent loads, meaning most CPUs (alpha
> -	 * being the notable exception) will already guarantee loads are
> -	 * seen in-order. See the alpha page table accessors for the
> -	 * smp_rmb() barriers in page table walking code.
> -	 */
> -	smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
> -
>   	pmd_install(mm, pmd, &new);
>   	if (new)
>   		pte_free(mm, new);
> @@ -478,10 +477,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
>   	if (!new)
>   		return -ENOMEM;
>   
> -	smp_wmb(); /* See comment in __pte_alloc */
> -
>   	spin_lock(&init_mm.page_table_lock);
>   	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
> +		smp_wmb(); /* See comment in pmd_install() */
>   		pmd_populate_kernel(&init_mm, pmd, new);
>   		new = NULL;
>   	}
> @@ -3857,7 +3855,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>   		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>   		if (!vmf->prealloc_pte)
>   			return VM_FAULT_OOM;
> -		smp_wmb(); /* See comment in __pte_alloc() */
>   	}
>   
>   	ret = vma->vm_ops->fault(vmf);
> @@ -3919,7 +3916,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
>   		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>   		if (!vmf->prealloc_pte)
>   			return VM_FAULT_OOM;
> -		smp_wmb(); /* See comment in __pte_alloc() */
>   	}
>   
>   	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
> @@ -4144,7 +4140,6 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
>   		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
>   		if (!vmf->prealloc_pte)
>   			return VM_FAULT_OOM;
> -		smp_wmb(); /* See comment in __pte_alloc() */
>   	}
>   
>   	return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
> @@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
>   	if (!new)
>   		return -ENOMEM;
>   
> -	smp_wmb(); /* See comment in __pte_alloc */
> -
>   	spin_lock(&mm->page_table_lock);
>   	if (pgd_present(*pgd))		/* Another has populated it */
>   		p4d_free(mm, new);
> -	else
> +	else {
> +		smp_wmb(); /* See comment in pmd_install() */
>   		pgd_populate(mm, pgd, new);
> +	}

Nit:

if () {

} else {

}

see Documentation/process/coding-style.rst

"This does not apply if only one branch of a conditional statement is a 
single statement; in the latter case use braces in both branches:"


Apart from that, I think this is fine,

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



* Re: [PATCH v1 2/2] mm: remove redundant smp_wmb()
  2021-08-28  4:23 ` [PATCH v1 2/2] mm: remove redundant smp_wmb() Qi Zheng
  2021-08-31 10:02   ` David Hildenbrand
@ 2021-08-31 10:20   ` Vlastimil Babka
  2021-08-31 12:52     ` Qi Zheng
  1 sibling, 1 reply; 8+ messages in thread
From: Vlastimil Babka @ 2021-08-31 10:20 UTC (permalink / raw)
  To: Qi Zheng, akpm, tglx, hannes, mhocko, vdavydov.dev,
	kirill.shutemov, mika.penttila, david
  Cc: linux-doc, linux-kernel, linux-mm, songmuchun

On 8/28/21 06:23, Qi Zheng wrote:
> The smp_wmb() which is in the __pte_alloc() is used to
> ensure all ptes setup is visible before the pte is made
> visible to other CPUs by being put into page tables. We
> only need this when the pte is actually populated, so
> move it to pte_install(). __pte_alloc_kernel(),

It's named pmd_install()?

> __p4d_alloc(), __pud_alloc() and __pmd_alloc() are similar
> to this case.
> 
> We can also defer smp_wmb() to the place where the pmd entry
> is really populated by preallocated pte. There are two kinds
> of user of preallocated pte, one is filemap & finish_fault(),
> another is THP. The former does not need another smp_wmb()
> because the smp_wmb() has been done by pte_install().

Same here.

> Fortunately, the latter also does not need another smp_wmb()
> because there is already a smp_wmb() before populating the
> new pte when the THP uses a preallocated pte to split a huge
> pmd.
> 
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/memory.c         | 47 ++++++++++++++++++++---------------------------
>  mm/sparse-vmemmap.c |  2 +-
>  2 files changed, 21 insertions(+), 28 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index ef7b1762e996..9c7534187454 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
>  
>  	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
>  		mm_inc_nr_ptes(mm);
> +		/*
> +		 * Ensure all pte setup (eg. pte page lock and page clearing) are
> +		 * visible before the pte is made visible to other CPUs by being
> +		 * put into page tables.
> +		 *
> +		 * The other side of the story is the pointer chasing in the page
> +		 * table walking code (when walking the page table without locking;
> +		 * ie. most of the time). Fortunately, these data accesses consist
> +		 * of a chain of data-dependent loads, meaning most CPUs (alpha
> +		 * being the notable exception) will already guarantee loads are
> +		 * seen in-order. See the alpha page table accessors for the
> +		 * smp_rmb() barriers in page table walking code.
> +		 */
> +		smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */

So, could it? :)


* Re: [PATCH v1 2/2] mm: remove redundant smp_wmb()
  2021-08-31 10:02   ` David Hildenbrand
@ 2021-08-31 12:36     ` Qi Zheng
  0 siblings, 0 replies; 8+ messages in thread
From: Qi Zheng @ 2021-08-31 12:36 UTC (permalink / raw)
  To: David Hildenbrand, akpm, tglx, hannes, mhocko, vdavydov.dev,
	kirill.shutemov, mika.penttila
  Cc: linux-doc, linux-kernel, linux-mm, songmuchun



On 2021/8/31 PM6:02, David Hildenbrand wrote:
> On 28.08.21 06:23, Qi Zheng wrote:
>> The smp_wmb() which is in the __pte_alloc() is used to
>> ensure all ptes setup is visible before the pte is made
>> visible to other CPUs by being put into page tables. We
>> only need this when the pte is actually populated, so
>> move it to pte_install(). __pte_alloc_kernel(),
>> __p4d_alloc(), __pud_alloc() and __pmd_alloc() are similar
>> to this case.
>>
>> We can also defer smp_wmb() to the place where the pmd entry
>> is really populated by preallocated pte. There are two kinds
>> of user of preallocated pte, one is filemap & finish_fault(),
>> another is THP. The former does not need another smp_wmb()
>> because the smp_wmb() has been done by pte_install().
>> Fortunately, the latter also does not need another smp_wmb()
>> because there is already a smp_wmb() before populating the
>> new pte when the THP uses a preallocated pte to split a huge
>> pmd.
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>>   mm/memory.c         | 47 
>> ++++++++++++++++++++---------------------------
>>   mm/sparse-vmemmap.c |  2 +-
>>   2 files changed, 21 insertions(+), 28 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index ef7b1762e996..9c7534187454 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t 
>> *pmd, pgtable_t *pte)
>>       if (likely(pmd_none(*pmd))) {    /* Has another populated it ? */
>>           mm_inc_nr_ptes(mm);
>> +        /*
>> +         * Ensure all pte setup (eg. pte page lock and page clearing) 
>> are
>> +         * visible before the pte is made visible to other CPUs by being
>> +         * put into page tables.
>> +         *
>> +         * The other side of the story is the pointer chasing in the 
>> page
>> +         * table walking code (when walking the page table without 
>> locking;
>> +         * ie. most of the time). Fortunately, these data accesses 
>> consist
>> +         * of a chain of data-dependent loads, meaning most CPUs (alpha
>> +         * being the notable exception) will already guarantee loads are
>> +         * seen in-order. See the alpha page table accessors for the
>> +         * smp_rmb() barriers in page table walking code.
>> +         */
>> +        smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
>>           pmd_populate(mm, pmd, *pte);
>>           *pte = NULL;
>>       }
>> @@ -451,21 +465,6 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
>>       if (!new)
>>           return -ENOMEM;
>> -    /*
>> -     * Ensure all pte setup (eg. pte page lock and page clearing) are
>> -     * visible before the pte is made visible to other CPUs by being
>> -     * put into page tables.
>> -     *
>> -     * The other side of the story is the pointer chasing in the page
>> -     * table walking code (when walking the page table without locking;
>> -     * ie. most of the time). Fortunately, these data accesses consist
>> -     * of a chain of data-dependent loads, meaning most CPUs (alpha
>> -     * being the notable exception) will already guarantee loads are
>> -     * seen in-order. See the alpha page table accessors for the
>> -     * smp_rmb() barriers in page table walking code.
>> -     */
>> -    smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
>> -
>>       pmd_install(mm, pmd, &new);
>>       if (new)
>>           pte_free(mm, new);
>> @@ -478,10 +477,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
>>       if (!new)
>>           return -ENOMEM;
>> -    smp_wmb(); /* See comment in __pte_alloc */
>> -
>>       spin_lock(&init_mm.page_table_lock);
>>       if (likely(pmd_none(*pmd))) {    /* Has another populated it ? */
>> +        smp_wmb(); /* See comment in pmd_install() */
>>           pmd_populate_kernel(&init_mm, pmd, new);
>>           new = NULL;
>>       }
>> @@ -3857,7 +3855,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>>           vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>>           if (!vmf->prealloc_pte)
>>               return VM_FAULT_OOM;
>> -        smp_wmb(); /* See comment in __pte_alloc() */
>>       }
>>       ret = vma->vm_ops->fault(vmf);
>> @@ -3919,7 +3916,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, 
>> struct page *page)
>>           vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>>           if (!vmf->prealloc_pte)
>>               return VM_FAULT_OOM;
>> -        smp_wmb(); /* See comment in __pte_alloc() */
>>       }
>>       vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
>> @@ -4144,7 +4140,6 @@ static vm_fault_t do_fault_around(struct 
>> vm_fault *vmf)
>>           vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
>>           if (!vmf->prealloc_pte)
>>               return VM_FAULT_OOM;
>> -        smp_wmb(); /* See comment in __pte_alloc() */
>>       }
>>       return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
>> @@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pgd_t 
>> *pgd, unsigned long address)
>>       if (!new)
>>           return -ENOMEM;
>> -    smp_wmb(); /* See comment in __pte_alloc */
>> -
>>       spin_lock(&mm->page_table_lock);
>>       if (pgd_present(*pgd))        /* Another has populated it */
>>           p4d_free(mm, new);
>> -    else
>> +    else {
>> +        smp_wmb(); /* See comment in pmd_install() */
>>           pgd_populate(mm, pgd, new);
>> +    }
> 
> Nit:
> 
> if () {
> 
> } else {
> 
> }
> 
> see Documentation/process/coding-style.rst
> 
> "This does not apply if only one branch of a conditional statement is a 
> single statement; in the latter case use braces in both branches:"

Got it.

> 
> 
> Apart from that, I think this is fine,
> 
> Acked-by: David Hildenbrand <david@redhat.com>
> 

Thanks,
Qi



* Re: [PATCH v1 2/2] mm: remove redundant smp_wmb()
  2021-08-31 10:20   ` Vlastimil Babka
@ 2021-08-31 12:52     ` Qi Zheng
  0 siblings, 0 replies; 8+ messages in thread
From: Qi Zheng @ 2021-08-31 12:52 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, tglx, hannes, mhocko, vdavydov.dev,
	kirill.shutemov, mika.penttila, david
  Cc: linux-doc, linux-kernel, linux-mm, songmuchun



On 2021/8/31 PM6:20, Vlastimil Babka wrote:
> On 8/28/21 06:23, Qi Zheng wrote:
>> The smp_wmb() which is in the __pte_alloc() is used to
>> ensure all ptes setup is visible before the pte is made
>> visible to other CPUs by being put into page tables. We
>> only need this when the pte is actually populated, so
>> move it to pte_install(). __pte_alloc_kernel(),
> 
> It's named pmd_install()?

Yes, I will update it in the next version.

> 
>> __p4d_alloc(), __pud_alloc() and __pmd_alloc() are similar
>> to this case.
>>
>> We can also defer smp_wmb() to the place where the pmd entry
>> is really populated by preallocated pte. There are two kinds
>> of user of preallocated pte, one is filemap & finish_fault(),
>> another is THP. The former does not need another smp_wmb()
>> because the smp_wmb() has been done by pte_install().
> 
> Same here.
> 
>> Fortunately, the latter also does not need another smp_wmb()
>> because there is already a smp_wmb() before populating the
>> new pte when the THP uses a preallocated pte to split a huge
>> pmd.
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>>   mm/memory.c         | 47 ++++++++++++++++++++---------------------------
>>   mm/sparse-vmemmap.c |  2 +-
>>   2 files changed, 21 insertions(+), 28 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index ef7b1762e996..9c7534187454 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
>>   
>>   	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
>>   		mm_inc_nr_ptes(mm);
>> +		/*
>> +		 * Ensure all pte setup (eg. pte page lock and page clearing) are
>> +		 * visible before the pte is made visible to other CPUs by being
>> +		 * put into page tables.
>> +		 *
>> +		 * The other side of the story is the pointer chasing in the page
>> +		 * table walking code (when walking the page table without locking;
>> +		 * ie. most of the time). Fortunately, these data accesses consist
>> +		 * of a chain of data-dependent loads, meaning most CPUs (alpha
>> +		 * being the notable exception) will already guarantee loads are
>> +		 * seen in-order. See the alpha page table accessors for the
>> +		 * smp_rmb() barriers in page table walking code.
>> +		 */
>> +		smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
> 
> So, could it? :)
> 

Yes, it could, but we don't have smp_wmb__after_spin_lock() now.




end of thread (newest: 2021-08-31 12:52 UTC)

Thread overview: 8+ messages
2021-08-28  4:23 [PATCH v1 0/2] Do some code cleanups related to mm Qi Zheng
2021-08-28  4:23 ` [PATCH v1 1/2] mm: introduce pmd_install() helper Qi Zheng
2021-08-28  5:25   ` Muchun Song
2021-08-28  4:23 ` [PATCH v1 2/2] mm: remove redundant smp_wmb() Qi Zheng
2021-08-31 10:02   ` David Hildenbrand
2021-08-31 12:36     ` Qi Zheng
2021-08-31 10:20   ` Vlastimil Babka
2021-08-31 12:52     ` Qi Zheng
