LKML Archive on lore.kernel.org
* [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
@ 2011-01-30 7:15 Tao Ma
2011-01-30 10:26 ` Michel Lespinasse
0 siblings, 1 reply; 5+ messages in thread
From: Tao Ma @ 2011-01-30 7:15 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel, Michel Lespinasse, Andrew Morton
From: Tao Ma <boyu.mt@taobao.com>
In commit 5ecfda0, we do some optimization in mlock, but it causes
a very basic test case (attached below) of mlock to fail. So
this patch reverts it, with a tiny modification so that it
applies cleanly to the latest 2.6.38-rc2 kernel.
The test program is attached below.
#include <sys/mman.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    char *buf, *testfile = "test_mmap";
    int fd, file_len = 40960, ret = -1;

    fd = open(testfile, O_RDWR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    if (ftruncate(fd, file_len) < 0) {
        perror("ftruncate");
        goto out;
    }

    buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        goto out;
    }

    if (mlock(buf, file_len) < 0) {
        perror("mlock");
        goto out;
    }

    munlock(buf, file_len);
    munmap(buf, file_len);
    ret = 0;
out:
    close(fd);
    return ret;
}
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
mm/mlock.c | 8 ++------
1 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/mm/mlock.c b/mm/mlock.c
index 13e81ee..76e106c 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -170,12 +170,8 @@ static long __mlock_vma_pages_range(struct vm_area_struct *vma,
VM_BUG_ON(!rwsem_is_locked(&mm->mmap_sem));
gup_flags = FOLL_TOUCH;
- /*
- * We want to touch writable mappings with a write fault in order
- * to break COW, except for shared mappings because these don't COW
- * and we would not want to dirty them for nothing.
- */
- if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
+
+ if (vma->vm_flags & VM_WRITE)
gup_flags |= FOLL_WRITE;
if (vma->vm_flags & VM_LOCKED)
--
1.6.3.GIT
* Re: [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
2011-01-30 7:15 [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback Tao Ma
@ 2011-01-30 10:26 ` Michel Lespinasse
2011-01-30 13:57 ` Tao Ma
2011-01-31 11:43 ` KOSAKI Motohiro
0 siblings, 2 replies; 5+ messages in thread
From: Michel Lespinasse @ 2011-01-30 10:26 UTC (permalink / raw)
To: Tao Ma; +Cc: linux-mm, linux-kernel, Andrew Morton
On Sat, Jan 29, 2011 at 11:15 PM, Tao Ma <tm@tao.ma> wrote:
> buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
> if (buf == MAP_FAILED) {
> perror("mmap");
> goto out;
> }
>
> if (mlock(buf, file_len) < 0) {
> perror("mlock");
> goto out;
> }
Thanks Tao for tracing this to an individual change. I can reproduce
this on my system. The issue is that the file is mapped without the
PROT_READ permission, so mlock can't fault in the pages. Up to 2.6.37
this worked because mlock was using a write.
The test case does show there was a behavior change; however it's not
clear to me that the tested behavior is valid.
I can see two possible resolutions:
1- do nothing, if we can agree that the test case is invalid
2- restore the previous behavior for writable, non-readable, shared
mappings while preserving the optimization for read/write shared
mappings. The test would then look like:
if ((vma->vm_flags & VM_WRITE) && (vma->vm_flags & (VM_READ |
VM_SHARED)) != VM_SHARED)
gup_flags |= FOLL_WRITE;
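
To see which mappings the single-expression test above selects, it can be tabulated in user space. The sketch below is only an illustration: the VM_* values are stand-ins for the kernel's bit definitions, and proposed_wants_write(), intended_wants_write() and print_table() are hypothetical helper names, not kernel functions. intended_wants_write() encodes one reading of the stated goal (write-fault every writable mapping except shared mappings that are also readable), which is an assumption about the intent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in flag bits for illustration; the real definitions live in the kernel headers. */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_SHARED 0x8UL

/* The single-expression test as written in the message above. */
bool proposed_wants_write(unsigned long vm_flags)
{
    return (vm_flags & VM_WRITE) &&
           (vm_flags & (VM_READ | VM_SHARED)) != VM_SHARED;
}

/*
 * One reading of the stated goal (an assumption): write-fault every
 * writable mapping except shared mappings that are also readable.
 */
bool intended_wants_write(unsigned long vm_flags)
{
    return (vm_flags & VM_WRITE) &&
           (vm_flags & (VM_READ | VM_SHARED)) != (VM_READ | VM_SHARED);
}

/* Print both predicates for every writable combination of READ and SHARED. */
void print_table(void)
{
    static const unsigned long combos[] = {
        VM_WRITE,
        VM_WRITE | VM_READ,
        VM_WRITE | VM_SHARED,
        VM_WRITE | VM_READ | VM_SHARED,
    };

    for (size_t i = 0; i < sizeof(combos) / sizeof(combos[0]); i++) {
        unsigned long f = combos[i];

        assert(f & VM_WRITE); /* the table only covers writable mappings */
        printf("read=%d shared=%d proposed=%d intended=%d\n",
               !!(f & VM_READ), !!(f & VM_SHARED),
               proposed_wants_write(f), intended_wants_write(f));
    }
}
```

Calling print_table() from a main() suggests that the expression as written is false for the write-only shared case (the one the test program exercises) and true for read/write shared mappings. If that reading is right, comparing against (VM_READ | VM_SHARED) rather than VM_SHARED would give the behavior described in option 2.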
--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
* Re: [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
2011-01-30 10:26 ` Michel Lespinasse
@ 2011-01-30 13:57 ` Tao Ma
2011-01-31 11:43 ` KOSAKI Motohiro
1 sibling, 0 replies; 5+ messages in thread
From: Tao Ma @ 2011-01-30 13:57 UTC (permalink / raw)
To: Michel Lespinasse; +Cc: linux-mm, linux-kernel, Andrew Morton
On 01/30/2011 06:26 PM, Michel Lespinasse wrote:
> On Sat, Jan 29, 2011 at 11:15 PM, Tao Ma<tm@tao.ma> wrote:
>> buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
>> if (buf == MAP_FAILED) {
>> perror("mmap");
>> goto out;
>> }
>>
>> if (mlock(buf, file_len)< 0) {
>> perror("mlock");
>> goto out;
>> }
> Thanks Tao for tracing this to an individual change. I can reproduce
> this on my system. The issue is that the file is mapped without the
> PROT_READ permission, so mlock can't fault in the pages. Up to 2.6.37
> this worked because mlock was using a write.
>
> The test case does show there was a behavior change; however it's not
> clear to me that the tested behavior is valid.
>
> I can see two possible resolutions:
>
> 1- do nothing, if we can agree that the test case is invalid
The test case does exist in the real world and is widely used. ;)
It is blktrace.
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/blktrace.git
I can paste the code here as well.
In blktrace.c setup_mmap:
mip->fs_buf = my_mmap(NULL, mip->fs_buf_len, PROT_WRITE,
MAP_SHARED, fd,
mip->fs_size - mip->fs_off);
> 2- restore the previous behavior for writable, non-readable, shared
> mappings while preserving the optimization for read/write shared
> mappings. The test would then look like:
> if ((vma->vm_flags& VM_WRITE)&& (vma->vm_flags& (VM_READ |
> VM_SHARED)) != VM_SHARED)
> gup_flags |= FOLL_WRITE;
I am not sure whether this is proper or not. I guess a fat comment is
needed here to explain the corner case. Do you have some statistics showing
that your change improves performance a lot? If so, I agree with you.
Otherwise, I would prefer to revert to the original design.
Regards,
Tao
* Re: [PATCH] mlock: revert the optimization for dirtying pages and triggering writeback.
2011-01-30 10:26 ` Michel Lespinasse
2011-01-30 13:57 ` Tao Ma
@ 2011-01-31 11:43 ` KOSAKI Motohiro
2011-01-31 15:41 ` [PATCH v2] mlock: set VM_WRITE in case we don't have read permission Tao Ma
1 sibling, 1 reply; 5+ messages in thread
From: KOSAKI Motohiro @ 2011-01-31 11:43 UTC (permalink / raw)
To: Michel Lespinasse
Cc: kosaki.motohiro, Tao Ma, linux-mm, linux-kernel, Andrew Morton
> On Sat, Jan 29, 2011 at 11:15 PM, Tao Ma <tm@tao.ma> wrote:
> > buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
> > if (buf == MAP_FAILED) {
> > perror("mmap");
> > goto out;
> > }
> >
> > if (mlock(buf, file_len) < 0) {
> > perror("mlock");
> > goto out;
> > }
>
> Thanks Tao for tracing this to an individual change. I can reproduce
> this on my system. The issue is that the file is mapped without the
> PROT_READ permission, so mlock can't fault in the pages. Up to 2.6.37
> this worked because mlock was using a write.
>
> The test case does show there was a behavior change; however it's not
> clear to me that the tested behavior is valid.
>
> I can see two possible resolutions:
Please never ignore a bug report.
> 1- do nothing, if we can agree that the test case is invalid
>
> 2- restore the previous behavior for writable, non-readable, shared
> mappings while preserving the optimization for read/write shared
> mappings. The test would then look like:
> if ((vma->vm_flags & VM_WRITE) && (vma->vm_flags & (VM_READ |
> VM_SHARED)) != VM_SHARED)
> gup_flags |= FOLL_WRITE;
Maybe two separate conditions would be cleaner. Like this:
/*
* We want to touch writable mappings with a write fault in order
* to break COW, except for shared mappings because these don't COW
* and we would not want to dirty them for nothing.
*/
if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
gup_flags |= FOLL_WRITE;
/*
* We don't have read permission. Therefore we can't use a read operation
* even though it's faster.
*/
if ((vma->vm_flags & (VM_READ|VM_WRITE)) == VM_WRITE)
gup_flags |= FOLL_WRITE;
Thanks.
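
The two checks can be modeled in user space to see which flag combinations end up with a write fault. This is a minimal sketch, assuming stand-in values for the VM_* and FOLL_* bits (the real ones are defined in the kernel headers) and a hypothetical helper name, mlock_gup_flags(). It mirrors only the flag logic quoted in this thread, not the rest of __mlock_vma_pages_range().

```c
#include <assert.h>

/* Stand-in bit values for illustration only. */
#define VM_READ    0x0001UL
#define VM_WRITE   0x0002UL
#define VM_SHARED  0x0008UL
#define VM_LOCKED  0x2000UL

#define FOLL_WRITE 0x01U
#define FOLL_TOUCH 0x02U
#define FOLL_MLOCK 0x40U

/* Hypothetical helper: compute the fault flags the way the two checks do. */
unsigned int mlock_gup_flags(unsigned long vm_flags)
{
    unsigned int gup_flags = FOLL_TOUCH;

    /* Private writable mapping: use a write fault to break COW. */
    if ((vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
        gup_flags |= FOLL_WRITE;

    /* Writable but not readable: a read fault cannot succeed, so write-fault. */
    if ((vm_flags & (VM_READ | VM_WRITE)) == VM_WRITE)
        gup_flags |= FOLL_WRITE;

    if (vm_flags & VM_LOCKED)
        gup_flags |= FOLL_MLOCK;

    /* FOLL_TOUCH is always requested in this model. */
    assert(gup_flags & FOLL_TOUCH);
    return gup_flags;
}
```

With these stand-in values, mlock_gup_flags(VM_WRITE | VM_SHARED) includes FOLL_WRITE (the write-only shared mapping from the test program), while mlock_gup_flags(VM_READ | VM_WRITE | VM_SHARED) does not, so the optimization for read/write shared mappings is kept.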
* [PATCH v2] mlock: set VM_WRITE in case we don't have read permission.
2011-01-31 11:43 ` KOSAKI Motohiro
@ 2011-01-31 15:41 ` Tao Ma
0 siblings, 0 replies; 5+ messages in thread
From: Tao Ma @ 2011-01-31 15:41 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel, KOSAKI Motohiro, Michel Lespinasse, Andrew Morton
From: Tao Ma <boyu.mt@taobao.com>
In commit 5ecfda0, we do some optimization in mlock, but it causes
a very basic test case (attached below) of mlock to fail. So
this patch adds another check: if we don't have read permission,
we still set the FOLL_WRITE flag. Thanks to KOSAKI for the suggestion.
The test program is attached below.
#include <sys/mman.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    char *buf, *testfile = "test_mmap";
    int fd, file_len = 40960, ret = -1;

    fd = open(testfile, O_RDWR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    if (ftruncate(fd, file_len) < 0) {
        perror("ftruncate");
        goto out;
    }

    buf = mmap(NULL, file_len, PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        goto out;
    }

    if (mlock(buf, file_len) < 0) {
        perror("mlock");
        goto out;
    }

    munlock(buf, file_len);
    munmap(buf, file_len);
    ret = 0;
out:
    close(fd);
    return ret;
}
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
---
mm/mlock.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/mm/mlock.c b/mm/mlock.c
index 13e81ee..8508c5a 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -178,6 +178,13 @@ static long __mlock_vma_pages_range(struct vm_area_struct *vma,
if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
gup_flags |= FOLL_WRITE;
+ /*
+ * We don't have readable permission. Therefore we can't use read
+ * operation even though it's faster.
+ */
+ if ((vma->vm_flags & (VM_READ|VM_WRITE)) == VM_WRITE)
+ gup_flags |= FOLL_WRITE;
+
if (vma->vm_flags & VM_LOCKED)
gup_flags |= FOLL_MLOCK;
--
1.6.3.GIT