LKML Archive on lore.kernel.org
* [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
@ 2011-01-30  7:15 Tao Ma
  2011-01-30 10:26 ` Michel Lespinasse
  0 siblings, 1 reply; 5+ messages in thread
From: Tao Ma @ 2011-01-30  7:15 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, Michel Lespinasse, Andrew Morton

From: Tao Ma <boyu.mt@taobao.com>

Commit 5ecfda0 optimized mlock, but the change causes a very basic
mlock test case to fail. This patch reverts it, with a tiny
modification so that it applies cleanly to the latest 2.6.38-rc2 kernel.
The test program is attached below.

#include <sys/mman.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
	char *buf, *testfile = "test_mmap";
	int fd, file_len = 40960, ret = -1;

	fd = open(testfile, O_RDWR);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	if (ftruncate(fd, file_len) < 0) {
		perror("ftruncate");
		goto out;
	}

	buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		goto out;
	}

	if (mlock(buf, file_len) < 0) {
		perror("mlock");
		goto out;
	}

	munlock(buf, file_len);
	munmap(buf, file_len);
	ret = 0;
out:
	close(fd);
	return ret;
}

Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
 mm/mlock.c |    8 ++------
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 13e81ee..76e106c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -170,12 +170,8 @@ static long __mlock_vma_pages_range(struct vm_area_struct *vma,
 	VM_BUG_ON(!rwsem_is_locked(&mm->mmap_sem));
 
 	gup_flags = FOLL_TOUCH;
-	/*
-	 * We want to touch writable mappings with a write fault in order
-	 * to break COW, except for shared mappings because these don't COW
-	 * and we would not want to dirty them for nothing.
-	 */
-	if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
+
+	if (vma->vm_flags & VM_WRITE)
 		gup_flags |= FOLL_WRITE;
 
 	if (vma->vm_flags & VM_LOCKED)
-- 
1.6.3.GIT



* Re: [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
  2011-01-30  7:15 [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback Tao Ma
@ 2011-01-30 10:26 ` Michel Lespinasse
  2011-01-30 13:57   ` Tao Ma
  2011-01-31 11:43   ` KOSAKI Motohiro
  0 siblings, 2 replies; 5+ messages in thread
From: Michel Lespinasse @ 2011-01-30 10:26 UTC (permalink / raw)
  To: Tao Ma; +Cc: linux-mm, linux-kernel, Andrew Morton

On Sat, Jan 29, 2011 at 11:15 PM, Tao Ma <tm@tao.ma> wrote:
>        buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
>        if (buf == MAP_FAILED) {
>                perror("mmap");
>                goto out;
>        }
>
>        if (mlock(buf, file_len) < 0) {
>                perror("mlock");
>                goto out;
>        }

Thanks Tao for tracing this to an individual change. I can reproduce
this on my system. The issue is that the file is mapped without the
PROT_READ permission, so mlock can't fault in the pages. Up to 2.6.37
this worked because mlock was using a write.
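
For comparison, the same sequence can be run with different protection
bits to see which mappings mlock can fault in. This is only a minimal
sketch: map_and_lock() is a hypothetical helper, not part of the test
program or of blktrace.

```c
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the thread): map `len` bytes of `fd` as a
 * shared mapping with protection `prot`, mlock the range, then undo both.
 * Returns 0 if both mmap() and mlock() succeed, -1 otherwise.
 */
int map_and_lock(int fd, size_t len, int prot)
{
	int ret = -1;
	void *buf = mmap(NULL, len, prot, MAP_SHARED, fd, 0);

	if (buf == MAP_FAILED)
		return -1;

	if (mlock(buf, len) == 0) {
		munlock(buf, len);
		ret = 0;
	}

	munmap(buf, len);
	return ret;
}
```

Calling it with PROT_WRITE alone corresponds to the failing test case;
with PROT_READ | PROT_WRITE the pages can be faulted in with a read, so
the result should not depend on whether mlock uses a write fault.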

The test case does show there was a behavior change; however it's not
clear to me that the tested behavior is valid.

I can see two possible resolutions:

1- do nothing, if we can agree that the test case is invalid

2- restore the previous behavior for writable, non-readable, shared
mappings while preserving the optimization for read/write shared
mappings. The test would then look like:
        if ((vma->vm_flags & VM_WRITE) &&
            (vma->vm_flags & (VM_READ | VM_SHARED)) != VM_SHARED)
                gup_flags |= FOLL_WRITE;
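
The checks under discussion can be tabulated in user space to see which
writable mappings each one faults in with a write. This is a sketch under
assumptions: the VM_* values below are stand-ins chosen for illustration,
not taken from kernel headers, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in flag bits for illustration only (assumed values). */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_SHARED 0x8UL

/* Check from 5ecfda0: write fault only for private writable mappings. */
bool write_fault_optimized(unsigned long f)
{
	return (f & (VM_WRITE | VM_SHARED)) == VM_WRITE;
}

/* Check in the revert patch: write fault for any writable mapping. */
bool write_fault_reverted(unsigned long f)
{
	return (f & VM_WRITE) != 0;
}

/* The combined condition, evaluated exactly as written above. */
bool write_fault_proposed(unsigned long f)
{
	return (f & VM_WRITE) && (f & (VM_READ | VM_SHARED)) != VM_SHARED;
}

/* Print one row per writable mapping type. */
void print_table(void)
{
	static const unsigned long cases[] = {
		VM_WRITE,
		VM_READ | VM_WRITE,
		VM_WRITE | VM_SHARED,
		VM_READ | VM_WRITE | VM_SHARED,
	};

	puts("read shared | optimized reverted proposed");
	for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		unsigned long f = cases[i];

		printf("%4d %6d | %9d %8d %8d\n",
		       (f & VM_READ) != 0, (f & VM_SHARED) != 0,
		       write_fault_optimized(f), write_fault_reverted(f),
		       write_fault_proposed(f));
	}
}
```

Evaluated literally, the combined condition selects a write fault for
read/write shared mappings and not for write-only shared ones, which
looks like the reverse of the description. Comparing against
(VM_READ | VM_SHARED) instead of VM_SHARED would match the description.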

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


* Re: [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
  2011-01-30 10:26 ` Michel Lespinasse
@ 2011-01-30 13:57   ` Tao Ma
  2011-01-31 11:43   ` KOSAKI Motohiro
  1 sibling, 0 replies; 5+ messages in thread
From: Tao Ma @ 2011-01-30 13:57 UTC (permalink / raw)
  To: Michel Lespinasse; +Cc: linux-mm, linux-kernel, Andrew Morton

On 01/30/2011 06:26 PM, Michel Lespinasse wrote:
> On Sat, Jan 29, 2011 at 11:15 PM, Tao Ma <tm@tao.ma> wrote:
>>         buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
>>         if (buf == MAP_FAILED) {
>>                 perror("mmap");
>>                 goto out;
>>         }
>>
>>         if (mlock(buf, file_len) < 0) {
>>                 perror("mlock");
>>                 goto out;
>>         }
> Thanks Tao for tracing this to an individual change. I can reproduce
> this on my system. The issue is that the file is mapped without the
> PROT_READ permission, so mlock can't fault in the pages. Up to 2.6.37
> this worked because mlock was using a write.
>
> The test case does show there was a behavior change; however it's not
> clear to me that the tested behavior is valid.
>
> I can see two possible resolutions:
>
> 1- do nothing, if we can agree that the test case is invalid
The test case does exist in the real world and is widely used. ;)
It is blktrace:
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/blktrace.git
I can also paste the code here.
In setup_mmap() in blktrace.c:
mip->fs_buf = my_mmap(NULL, mip->fs_buf_len, PROT_WRITE,
                                       MAP_SHARED, fd,
                                       mip->fs_size - mip->fs_off);
> 2- restore the previous behavior for writable, non-readable, shared
> mappings while preserving the optimization for read/write shared
> mappings. The test would then look like:
>         if ((vma->vm_flags & VM_WRITE) &&
>             (vma->vm_flags & (VM_READ | VM_SHARED)) != VM_SHARED)
>                 gup_flags |= FOLL_WRITE;
I am not sure whether that is proper or not. I guess a fat comment is
needed here to explain the corner case. Do you have any statistics
showing that your change improves performance a lot? If so, I agree with
you. Otherwise, I would prefer to revert to the original design.

Regards,
Tao


* Re: [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
  2011-01-30 10:26 ` Michel Lespinasse
  2011-01-30 13:57   ` Tao Ma
@ 2011-01-31 11:43   ` KOSAKI Motohiro
  2011-01-31 15:41     ` [PATCH v2] mlock: set VM_WRITE in case we don't have read permission Tao Ma
  1 sibling, 1 reply; 5+ messages in thread
From: KOSAKI Motohiro @ 2011-01-31 11:43 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: kosaki.motohiro, Tao Ma, linux-mm, linux-kernel, Andrew Morton

> On Sat, Jan 29, 2011 at 11:15 PM, Tao Ma <tm@tao.ma> wrote:
> >        buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
> >        if (buf == MAP_FAILED) {
> >                perror("mmap");
> >                goto out;
> >        }
> >
> >        if (mlock(buf, file_len) < 0) {
> >                perror("mlock");
> >                goto out;
> >        }
> 
> Thanks Tao for tracing this to an individual change. I can reproduce
> this on my system. The issue is that the file is mapped without the
> PROT_READ permission, so mlock can't fault in the pages. Up to 2.6.37
> this worked because mlock was using a write.
> 
> The test case does show there was a behavior change; however it's not
> clear to me that the tested behavior is valid.
> 
> I can see two possible resolutions:

Please don't ever ignore a bug report.


> 1- do nothing, if we can agree that the test case is invalid
> 
> 2- restore the previous behavior for writable, non-readable, shared
> mappings while preserving the optimization for read/write shared
> mappings. The test would then look like:
>         if ((vma->vm_flags & VM_WRITE) &&
>             (vma->vm_flags & (VM_READ | VM_SHARED)) != VM_SHARED)
>                 gup_flags |= FOLL_WRITE;

Maybe two separate conditions would be cleaner. Like this:

	/*
	 * We want to touch writable mappings with a write fault in order
	 * to break COW, except for shared mappings because these don't COW
	 * and we would not want to dirty them for nothing.
	 */
	if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
 		gup_flags |= FOLL_WRITE;

	/*
	 * We don't have readable permission. Therefore we can't use a read
	 * operation even though it's faster.
	 */
	if ((vma->vm_flags & (VM_READ|VM_WRITE)) == VM_WRITE)
 		gup_flags |= FOLL_WRITE;
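
The effect of the two separate checks can be verified in user space
against the behavior described earlier in the thread: a write fault for
every writable mapping except read/write shared ones. This is a minimal
sketch: the VM_* values are stand-ins chosen for illustration, not taken
from kernel headers, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits for illustration only (assumed values). */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_SHARED 0x8UL

/* FOLL_WRITE decision made by the two separate checks. */
bool two_checks(unsigned long f)
{
	bool write = false;

	/* Private writable mapping: write fault to break COW. */
	if ((f & (VM_WRITE | VM_SHARED)) == VM_WRITE)
		write = true;

	/* Writable but not readable: a read fault cannot work. */
	if ((f & (VM_READ | VM_WRITE)) == VM_WRITE)
		write = true;

	return write;
}

/* Described behavior: writable, unless both readable and shared. */
bool intended(unsigned long f)
{
	return (f & VM_WRITE) &&
	       (f & (VM_READ | VM_SHARED)) != (VM_READ | VM_SHARED);
}

/* Compare both predicates over every combination of the three flags. */
void check_all_combinations(void)
{
	for (int r = 0; r <= 1; r++)
		for (int w = 0; w <= 1; w++)
			for (int s = 0; s <= 1; s++) {
				unsigned long f = (r ? VM_READ : 0) |
						  (w ? VM_WRITE : 0) |
						  (s ? VM_SHARED : 0);

				assert(two_checks(f) == intended(f));
			}
}
```

With these checks the write-only shared mapping from the test case gets
a write fault again, while read/write shared mappings still use a read
fault and are not dirtied.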


Thanks.




* [PATCH v2] mlock: set VM_WRITE in case we don't have read permission.
  2011-01-31 11:43   ` KOSAKI Motohiro
@ 2011-01-31 15:41     ` Tao Ma
  0 siblings, 0 replies; 5+ messages in thread
From: Tao Ma @ 2011-01-31 15:41 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, KOSAKI Motohiro, Michel Lespinasse, Andrew Morton

From: Tao Ma <boyu.mt@taobao.com>

Commit 5ecfda0 optimized mlock, but the change causes a very basic
mlock test case to fail. This patch adds another check so that
FOLL_WRITE is still set if we don't have read permission. Thanks to
KOSAKI for the suggestion.
The test program is attached below.

#include <sys/mman.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
	char *buf, *testfile = "test_mmap";
	int fd, file_len = 40960, ret = -1;

	fd = open(testfile, O_RDWR);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	if (ftruncate(fd, file_len) < 0) {
		perror("ftruncate");
		goto out;
	}

	buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		goto out;
	}

	if (mlock(buf, file_len) < 0) {
		perror("mlock");
		goto out;
	}

	munlock(buf, file_len);
	munmap(buf, file_len);
	ret = 0;
out:
	close(fd);
	return ret;
}

Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
 mm/mlock.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 13e81ee..8508c5a 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -178,6 +178,13 @@ static long __mlock_vma_pages_range(struct vm_area_struct *vma,
 	if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
 		gup_flags |= FOLL_WRITE;
 
+	/*
+	 * We don't have readable permission. Therefore we can't use read
+	 * operation even though it's faster.
+	 */
+	if ((vma->vm_flags & (VM_READ|VM_WRITE)) == VM_WRITE)
+		gup_flags |= FOLL_WRITE;
+
 	if (vma->vm_flags & VM_LOCKED)
 		gup_flags |= FOLL_MLOCK;
 
-- 
1.6.3.GIT


