LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/2] page munlock issues when breaking up COW
@ 2011-02-08  0:47 Michel Lespinasse
  2011-02-08  0:47 ` [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page() Michel Lespinasse
  2011-02-08  0:47 ` [PATCH 2/2] mlock: do not munlock pages in __do_fault() Michel Lespinasse
  0 siblings, 2 replies; 7+ messages in thread
From: Michel Lespinasse @ 2011-02-08  0:47 UTC (permalink / raw)
  To: linux-mm, Lee Schermerhorn
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Hugh Dickins, Rik van Riel,
	Andrea Arcangeli, linux-kernel

It looks like there is a race in the do_wp_page() code that munlocks the
old page after breaking up COW. The pte still points to that old page,
so I don't see that we are protected against vmscan mlocking back the
page right away. This can be easily worked around by moving that code to
the end of do_wp_page(), after the pte has been pointed to the new page.

Also, the corresponding code in __do_fault() seems entirely unnecessary,
since there was never a pte pointing to the old page in our vma.

Michel Lespinasse (2):
  mlock: fix race when munlocking pages in do_wp_page()
  mlock: do not munlock pages in __do_fault()

 mm/memory.c |   32 ++++++++++++--------------------
 1 files changed, 12 insertions(+), 20 deletions(-)

-- 
1.7.3.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page()
  2011-02-08  0:47 [PATCH 0/2] page munlock issues when breaking up COW Michel Lespinasse
@ 2011-02-08  0:47 ` Michel Lespinasse
  2011-02-08  1:45   ` KAMEZAWA Hiroyuki
  2011-02-08 18:28   ` Hugh Dickins
  2011-02-08  0:47 ` [PATCH 2/2] mlock: do not munlock pages in __do_fault() Michel Lespinasse
  1 sibling, 2 replies; 7+ messages in thread
From: Michel Lespinasse @ 2011-02-08  0:47 UTC (permalink / raw)
  To: linux-mm, Lee Schermerhorn
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Hugh Dickins, Rik van Riel,
	Andrea Arcangeli, linux-kernel

vmscan can lazily find pages that are mapped within VM_LOCKED vmas,
and set the PageMlocked bit on these pages, transfering them onto the
unevictable list. When do_wp_page() breaks COW within a VM_LOCKED vma,
it may need to clear PageMlocked on the old page and set it on the
new page instead.

This change fixes an issue where do_wp_page() was clearing PageMlocked on
the old page while the pte was still pointing to it (as well as rmap).
Therefore, we were not protected against vmscan immediately trasnfering
the old page back onto the unevictable list. This could cause pages to
get stranded there forever.

I propose to move the corresponding code to the end of do_wp_page(),
after the pte (and rmap) have been pointed to the new page. Additionally,
we can use munlock_vma_page() instead of clear_page_mlock(), so that
the old page stays mlocked if there are still other VM_LOCKED vmas
mapping it.

Signed-off-by: Michel Lespinasse <walken@google.com>
---
 mm/memory.c |   26 ++++++++++++--------------
 1 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 31250fa..32df03c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2219,7 +2219,6 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
 							 &ptl);
 			if (!pte_same(*page_table, orig_pte)) {
 				unlock_page(old_page);
-				page_cache_release(old_page);
 				goto unlock;
 			}
 			page_cache_release(old_page);
@@ -2289,7 +2288,6 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
 							 &ptl);
 			if (!pte_same(*page_table, orig_pte)) {
 				unlock_page(old_page);
-				page_cache_release(old_page);
 				goto unlock;
 			}
 
@@ -2367,16 +2365,6 @@ gotten:
 	}
 	__SetPageUptodate(new_page);
 
-	/*
-	 * Don't let another task, with possibly unlocked vma,
-	 * keep the mlocked page.
-	 */
-	if ((vma->vm_flags & VM_LOCKED) && old_page) {
-		lock_page(old_page);	/* for LRU manipulation */
-		clear_page_mlock(old_page);
-		unlock_page(old_page);
-	}
-
 	if (mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))
 		goto oom_free_new;
 
@@ -2444,10 +2432,20 @@ gotten:
 
 	if (new_page)
 		page_cache_release(new_page);
-	if (old_page)
-		page_cache_release(old_page);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
+	if (old_page) {
+		/*
+		 * Don't let another task, with possibly unlocked vma,
+		 * keep the mlocked page.
+		 */
+		if ((ret & VM_FAULT_WRITE) && (vma->vm_flags & VM_LOCKED)) {
+			lock_page(old_page);	/* LRU manipulation */
+			munlock_vma_page(old_page);
+			unlock_page(old_page);
+		}
+		page_cache_release(old_page);
+	}
 	return ret;
 oom_free_new:
 	page_cache_release(new_page);
-- 
1.7.3.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/2] mlock: do not munlock pages in __do_fault()
  2011-02-08  0:47 [PATCH 0/2] page munlock issues when breaking up COW Michel Lespinasse
  2011-02-08  0:47 ` [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page() Michel Lespinasse
@ 2011-02-08  0:47 ` Michel Lespinasse
  2011-02-08  1:47   ` KAMEZAWA Hiroyuki
  2011-02-08 18:29   ` Hugh Dickins
  1 sibling, 2 replies; 7+ messages in thread
From: Michel Lespinasse @ 2011-02-08  0:47 UTC (permalink / raw)
  To: linux-mm, Lee Schermerhorn
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Hugh Dickins, Rik van Riel,
	Andrea Arcangeli, linux-kernel

If the page is going to be written to, __do_page needs to break COW.
However, the old page (before breaking COW) was never mapped mapped into
the current pte (__do_fault is only called when the pte is not present),
so vmscan can't have marked the old page as PageMlocked due to being
mapped in __do_fault's VMA. Therefore, __do_fault() does not need to worry
about clearing PageMlocked() on the old page.

Signed-off-by: Michel Lespinasse <walken@google.com>
---
 mm/memory.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 32df03c..8e8c1832 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3051,12 +3051,6 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 				goto out;
 			}
 			charged = 1;
-			/*
-			 * Don't let another task, with possibly unlocked vma,
-			 * keep the mlocked page.
-			 */
-			if (vma->vm_flags & VM_LOCKED)
-				clear_page_mlock(vmf.page);
 			copy_user_highpage(page, vmf.page, address, vma);
 			__SetPageUptodate(page);
 		} else {
-- 
1.7.3.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page()
  2011-02-08  0:47 ` [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page() Michel Lespinasse
@ 2011-02-08  1:45   ` KAMEZAWA Hiroyuki
  2011-02-08 18:28   ` Hugh Dickins
  1 sibling, 0 replies; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-02-08  1:45 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Lee Schermerhorn, Andrew Morton, Hugh Dickins,
	Rik van Riel, Andrea Arcangeli, linux-kernel

On Mon,  7 Feb 2011 16:47:35 -0800
Michel Lespinasse <walken@google.com> wrote:

> vmscan can lazily find pages that are mapped within VM_LOCKED vmas,
> and set the PageMlocked bit on these pages, transfering them onto the
> unevictable list. When do_wp_page() breaks COW within a VM_LOCKED vma,
> it may need to clear PageMlocked on the old page and set it on the
> new page instead.
> 
> This change fixes an issue where do_wp_page() was clearing PageMlocked on
> the old page while the pte was still pointing to it (as well as rmap).
> Therefore, we were not protected against vmscan immediately trasnfering
> the old page back onto the unevictable list. This could cause pages to
> get stranded there forever.
> 
> I propose to move the corresponding code to the end of do_wp_page(),
> after the pte (and rmap) have been pointed to the new page. Additionally,
> we can use munlock_vma_page() instead of clear_page_mlock(), so that
> the old page stays mlocked if there are still other VM_LOCKED vmas
> mapping it.
> 
> Signed-off-by: Michel Lespinasse <walken@google.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] mlock: do not munlock pages in __do_fault()
  2011-02-08  0:47 ` [PATCH 2/2] mlock: do not munlock pages in __do_fault() Michel Lespinasse
@ 2011-02-08  1:47   ` KAMEZAWA Hiroyuki
  2011-02-08 18:29   ` Hugh Dickins
  1 sibling, 0 replies; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-02-08  1:47 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Lee Schermerhorn, Andrew Morton, Hugh Dickins,
	Rik van Riel, Andrea Arcangeli, linux-kernel

On Mon,  7 Feb 2011 16:47:36 -0800
Michel Lespinasse <walken@google.com> wrote:

> If the page is going to be written to, __do_page needs to break COW.
> However, the old page (before breaking COW) was never mapped mapped into
> the current pte (__do_fault is only called when the pte is not present),
> so vmscan can't have marked the old page as PageMlocked due to being
> mapped in __do_fault's VMA. Therefore, __do_fault() does not need to worry
> about clearing PageMlocked() on the old page.
> 
> Signed-off-by: Michel Lespinasse <walken@google.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page()
  2011-02-08  0:47 ` [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page() Michel Lespinasse
  2011-02-08  1:45   ` KAMEZAWA Hiroyuki
@ 2011-02-08 18:28   ` Hugh Dickins
  1 sibling, 0 replies; 7+ messages in thread
From: Hugh Dickins @ 2011-02-08 18:28 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Lee Schermerhorn, Andrew Morton, KAMEZAWA Hiroyuki,
	Rik van Riel, Andrea Arcangeli, linux-kernel

On Mon, Feb 7, 2011 at 4:47 PM, Michel Lespinasse <walken@google.com> wrote:
>
> vmscan can lazily find pages that are mapped within VM_LOCKED vmas,
> and set the PageMlocked bit on these pages, transfering them onto the
> unevictable list. When do_wp_page() breaks COW within a VM_LOCKED vma,
> it may need to clear PageMlocked on the old page and set it on the
> new page instead.
>
> This change fixes an issue where do_wp_page() was clearing PageMlocked on
> the old page while the pte was still pointing to it (as well as rmap).
> Therefore, we were not protected against vmscan immediately trasnfering
> the old page back onto the unevictable list. This could cause pages to
> get stranded there forever.
>
> I propose to move the corresponding code to the end of do_wp_page(),
> after the pte (and rmap) have been pointed to the new page. Additionally,
> we can use munlock_vma_page() instead of clear_page_mlock(), so that
> the old page stays mlocked if there are still other VM_LOCKED vmas
> mapping it.
>
> Signed-off-by: Michel Lespinasse <walken@google.com>

Acked-by: Hugh Dickins <hughd@google.com>

(but I have to say, do_wp_page() grows even ughlier!)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] mlock: do not munlock pages in __do_fault()
  2011-02-08  0:47 ` [PATCH 2/2] mlock: do not munlock pages in __do_fault() Michel Lespinasse
  2011-02-08  1:47   ` KAMEZAWA Hiroyuki
@ 2011-02-08 18:29   ` Hugh Dickins
  1 sibling, 0 replies; 7+ messages in thread
From: Hugh Dickins @ 2011-02-08 18:29 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux-mm, Lee Schermerhorn, Andrew Morton, KAMEZAWA Hiroyuki,
	Rik van Riel, Andrea Arcangeli, linux-kernel

On Mon, Feb 7, 2011 at 4:47 PM, Michel Lespinasse <walken@google.com> wrote:
> If the page is going to be written to, __do_page needs to break COW.
> However, the old page (before breaking COW) was never mapped mapped into
> the current pte (__do_fault is only called when the pte is not present),
> so vmscan can't have marked the old page as PageMlocked due to being
> mapped in __do_fault's VMA. Therefore, __do_fault() does not need to worry
> about clearing PageMlocked() on the old page.
>
> Signed-off-by: Michel Lespinasse <walken@google.com>

Acked-by: Hugh Dickins <hughd@google.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-02-08 18:29 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-08  0:47 [PATCH 0/2] page munlock issues when breaking up COW Michel Lespinasse
2011-02-08  0:47 ` [PATCH 1/2] mlock: fix race when munlocking pages in do_wp_page() Michel Lespinasse
2011-02-08  1:45   ` KAMEZAWA Hiroyuki
2011-02-08 18:28   ` Hugh Dickins
2011-02-08  0:47 ` [PATCH 2/2] mlock: do not munlock pages in __do_fault() Michel Lespinasse
2011-02-08  1:47   ` KAMEZAWA Hiroyuki
2011-02-08 18:29   ` Hugh Dickins

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).