LKML Archive on lore.kernel.org
* [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
@ 2015-02-02 16:55 Mel Gorman
  2015-02-02 22:05 ` Andrew Morton
  2015-02-02 22:22 ` Dave Hansen
  0 siblings, 2 replies; 31+ messages in thread
From: Mel Gorman @ 2015-02-02 16:55 UTC (permalink / raw)
  To: linux-mm; +Cc: Minchan Kim, Vlastimil Babka, Andrew Morton, linux-kernel

glibc malloc changed behaviour in glibc 2.10 to have per-thread arenas
instead of creating new arenas if the existing ones were contended.
The decision appears to have been made so the allocator scales better but the
downside is that madvise(MADV_DONTNEED) is now called for these per-thread
arenas during free. This tears down pages that would have previously
remained. There is nothing wrong with this decision from a functional point
of view but any threaded application that frequently allocates/frees the
same-sized region is going to incur the full teardown and refault costs.

This is extremely obvious in the ebizzy benchmark. At its core, threads are
frequently freeing and allocating buffers of the same size. It is much faster
on distributions with older versions of glibc. Profiles showed that a large
amount of system CPU time was spent on tearing down and refaulting pages.

This patch identifies when a thread is frequently calling MADV_DONTNEED
on the same region of memory and starts ignoring the hint. On an 8-core
single-socket machine this was the impact on ebizzy using glibc 2.19.

ebizzy Overall Throughput
                       3.19.0-rc6             3.19.0-rc6
                          vanilla           madvise-v1r1
Hmean    Rsec-1     12619.93 (  0.00%)   34807.02 (175.81%)
Hmean    Rsec-3     33434.19 (  0.00%)  100733.77 (201.29%)
Hmean    Rsec-5     45796.68 (  0.00%)  134257.34 (193.16%)
Hmean    Rsec-7     53146.93 (  0.00%)  145512.85 (173.79%)
Hmean    Rsec-12    55132.87 (  0.00%)  145560.86 (164.02%)
Hmean    Rsec-18    54846.52 (  0.00%)  145120.79 (164.59%)
Hmean    Rsec-24    54368.95 (  0.00%)  142733.89 (162.53%)
Hmean    Rsec-30    54388.86 (  0.00%)  141424.09 (160.02%)
Hmean    Rsec-32    54047.11 (  0.00%)  139151.76 (157.46%)

And the system CPU usage was also much reduced

           3.19.0-rc6    3.19.0-rc6
              vanilla  madvise-v1r1
User          2647.19       8347.26
System        5742.90         42.42
Elapsed       1350.60       1350.65

It's even more ridiculous on a 4 socket machine

ebizzy Overall Throughput
                       3.19.0-rc6             3.19.0-rc6
                          vanilla           madvise-v1r1
Hmean    Rsec-1      5354.37 (  0.00%)   12838.61 ( 139.78%)
Hmean    Rsec-4     10338.41 (  0.00%)   50514.52 ( 388.61%)
Hmean    Rsec-7      7766.33 (  0.00%)   88555.30 (1040.25%)
Hmean    Rsec-12     7188.40 (  0.00%)  154180.78 (2044.86%)
Hmean    Rsec-21     7001.82 (  0.00%)  266555.51 (3706.95%)
Hmean    Rsec-30     8975.08 (  0.00%)  314369.88 (3402.70%)
Hmean    Rsec-48    12136.53 (  0.00%)  358525.74 (2854.10%)
Hmean    Rsec-79    12607.37 (  0.00%)  341646.49 (2609.89%)
Hmean    Rsec-110   12563.37 (  0.00%)  338058.65 (2590.83%)
Hmean    Rsec-141   11701.85 (  0.00%)  331255.78 (2730.80%)
Hmean    Rsec-172   10987.39 (  0.00%)  312003.62 (2739.65%)
Hmean    Rsec-192   12050.46 (  0.00%)  296401.88 (2359.67%)

           3.19.0-rc6    3.19.0-rc6
              vanilla  madvise-v1r1
User          4136.44      53506.65
System       50262.68        906.49
Elapsed       1802.07       1801.99

Note in both cases that the elapsed time is similar because the benchmark
is configured to run for a fixed duration.

MADV_FREE would have a lower cost if the underlying allocator used it but
there is no guarantee that allocators will use it. Arguably the kernel has
no business preventing an application developer shooting themselves in a
foot but this is a case where it's relatively easy to detect the bad
behaviour and avoid it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 fs/exec.c             |  4 ++++
 include/linux/sched.h |  5 +++++
 kernel/fork.c         |  5 +++++
 mm/madvise.c          | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 70 insertions(+)

diff --git a/fs/exec.c b/fs/exec.c
index ad8798e26be9..5c691fcc32f4 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1551,6 +1551,10 @@ static int do_execveat_common(int fd, struct filename *filename,
 	current->in_execve = 0;
 	acct_update_integrals(current);
 	task_numa_free(current);
+	if (current->madvise_state) {
+		kfree(current->madvise_state);
+		current->madvise_state = NULL;
+	}
 	free_bprm(bprm);
 	kfree(pathbuf);
 	putname(filename);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8db31ef98d2f..b6706bdb27fd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1271,6 +1271,9 @@ enum perf_event_task_context {
 	perf_nr_task_contexts,
 };
 
+/* mm/madvise.c */
+struct madvise_state_info;
+
 struct task_struct {
 	volatile long state;	/* -1 unrunnable, 0 runnable, >0 stopped */
 	void *stack;
@@ -1637,6 +1640,8 @@ struct task_struct {
 
 	struct page_frag task_frag;
 
+	struct madvise_state_info *madvise_state;
+
 #ifdef CONFIG_TASK_DELAY_ACCT
 	struct task_delay_info *delays;
 #endif
diff --git a/kernel/fork.c b/kernel/fork.c
index 4dc2ddade9f1..6d8dd1379240 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -246,6 +246,11 @@ void __put_task_struct(struct task_struct *tsk)
 	delayacct_tsk_free(tsk);
 	put_signal_struct(tsk->signal);
 
+	if (current->madvise_state) {
+		kfree(current->madvise_state);
+		current->madvise_state = NULL;
+	}
+
 	if (!profile_handoff_task(tsk))
 		free_task(tsk);
 }
diff --git a/mm/madvise.c b/mm/madvise.c
index a271adc93289..907bb0922711 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -19,6 +19,7 @@
 #include <linux/blkdev.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
+#include <linux/vmacache.h>
 
 /*
  * Any behaviour which results in changes to the vma->vm_flags needs to
@@ -251,6 +252,57 @@ static long madvise_willneed(struct vm_area_struct *vma,
 	return 0;
 }
 
+#define MADVISE_HASH		VMACACHE_HASH
+#define MADVISE_STATE_SIZE	VMACACHE_SIZE
+#define MADVISE_THRESHOLD	8
+
+struct madvise_state_info {
+	unsigned long start;
+	unsigned long end;
+	int count;
+	unsigned long jiffies;
+};
+
+/* Returns true if userspace is continually dropping the same address range */
+static bool ignore_madvise_hint(unsigned long start, unsigned long end)
+{
+	int i;
+
+	if (!current->madvise_state)
+		current->madvise_state = kzalloc(sizeof(struct madvise_state_info) * MADVISE_STATE_SIZE, GFP_KERNEL);
+	if (!current->madvise_state)
+		return false;
+
+	i = VMACACHE_HASH(start);
+	if (current->madvise_state[i].start != start ||
+	    current->madvise_state[i].end != end) {
+		/* cache miss */
+		current->madvise_state[i].start = start;
+		current->madvise_state[i].end = end;
+		current->madvise_state[i].count = 0;
+		current->madvise_state[i].jiffies = jiffies;
+	} else {
+		/* cache hit */
+		unsigned long reset = current->madvise_state[i].jiffies + HZ;
+		if (time_after(jiffies, reset)) {
+			/*
+			 * If it is a second since the last madvise on this
+			 * range or since madvise hints got ignored then reset
+			 * the counts and apply the hint again.
+			 */
+			current->madvise_state[i].count = 0;
+			current->madvise_state[i].jiffies = jiffies;
+		} else
+			current->madvise_state[i].count++;
+
+		if (current->madvise_state[i].count > MADVISE_THRESHOLD)
+			return true;
+		current->madvise_state[i].jiffies = jiffies;
+	}
+
+	return false;
+}
+
 /*
  * Application no longer needs these pages.  If the pages are dirty,
  * it's OK to just throw them away.  The app will be more careful about
@@ -278,6 +330,10 @@ static long madvise_dontneed(struct vm_area_struct *vma,
 	if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
 		return -EINVAL;
 
+	/* Ignore hint if madvise is continually dropping the same range */
+	if (ignore_madvise_hint(start, end))
+		return 0;
+
 	if (unlikely(vma->vm_flags & VM_NONLINEAR)) {
 		struct zap_details details = {
 			.nonlinear_vma = vma,

^ permalink raw reply	[flat|nested] 31+ messages in thread
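
[Editorial sketch] The hit/miss/reset logic of ignore_madvise_hint() above can be modelled in plain userspace C, which makes it easier to see exactly when the hint starts being dropped. This is only a sketch under stated assumptions and not the kernel code: SLOTS, WINDOW, slot_for() and the explicit `now` argument are hypothetical stand-ins for VMACACHE_SIZE, HZ, VMACACHE_HASH() and jiffies, and the wraparound-safe time_after() comparison is replaced by a plain subtraction.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOTS      4    /* assumption: stand-in for VMACACHE_SIZE */
#define THRESHOLD  8    /* mirrors MADVISE_THRESHOLD in the patch */
#define WINDOW     100  /* assumption: ticks per second, stand-in for HZ */

struct range_state {
	unsigned long start;
	unsigned long end;
	int count;
	unsigned long stamp;   /* last time the hint was applied */
};

struct range_cache {
	struct range_state slot[SLOTS];
};

/* Hypothetical hash: 4 KiB page number modulo the slot count. */
static size_t slot_for(unsigned long start)
{
	return (start >> 12) % SLOTS;
}

/* Returns true when the same [start, end) range keeps being dropped. */
static bool should_ignore(struct range_cache *c, unsigned long start,
			  unsigned long end, unsigned long now)
{
	struct range_state *s = &c->slot[slot_for(start)];

	if (s->start != start || s->end != end) {
		/* cache miss: start tracking this range */
		s->start = start;
		s->end = end;
		s->count = 0;
		s->stamp = now;
		return false;
	}

	/* cache hit */
	if (now - s->stamp > WINDOW)
		s->count = 0;          /* quiet for a while: apply hints again */
	else
		s->count++;

	if (s->count > THRESHOLD)
		return true;           /* stamp is deliberately left untouched */

	s->stamp = now;
	return false;
}
```

In this model the first call on a range is a miss, the next eight hits are still applied, and every further hit inside the window is ignored until WINDOW ticks have passed since the last applied hint. As in the patch, a zero-initialised cache treats the range [0, 0) as already tracked.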
* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 16:55 [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints Mel Gorman
@ 2015-02-02 22:05 ` Andrew Morton
  2015-02-02 22:18 ` Mel Gorman
  2015-02-02 22:22 ` Dave Hansen
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2015-02-02 22:05 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, Minchan Kim, Vlastimil Babka, linux-kernel

On Mon, 2 Feb 2015 16:55:25 +0000 Mel Gorman <mgorman@suse.de> wrote:

> glibc malloc changed behaviour in glibc 2.10 to have per-thread arenas
> instead of creating new arenas if the existing ones were contended.
> The decision appears to have been made so the allocator scales better but the
> downside is that madvise(MADV_DONTNEED) is now called for these per-thread
> arenas during free. This tears down pages that would have previously
> remained. There is nothing wrong with this decision from a functional point
> of view but any threaded application that frequently allocates/frees the
> same-sized region is going to incur the full teardown and refault costs.

MADV_DONTNEED has been there for many years. How could this problem
not have been noticed during glibc 2.10 development/testing? Is there
some more recent kernel change which is triggering this?

> This patch identifies when a thread is frequently calling MADV_DONTNEED
> on the same region of memory and starts ignoring the hint.

That's pretty nasty-looking :(

And presumably there are all sorts of behaviours which will still
trigger the problem but which will avoid the start/end equality test in
ignore_madvise_hint()?

Really, this is a glibc problem and only a glibc problem.
MADV_DONTNEED is unavoidably expensive and glibc is calling
MADV_DONTNEED for a region which it *does* need. Is there something
preventing this from being addressed within glibc?

* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 22:05 ` Andrew Morton
@ 2015-02-02 22:18 ` Mel Gorman
  2015-02-02 22:35 ` Andrew Morton
  2015-02-05 21:44 ` Rik van Riel
  0 siblings, 2 replies; 31+ messages in thread
From: Mel Gorman @ 2015-02-02 22:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, Minchan Kim, Vlastimil Babka, linux-kernel

On Mon, Feb 02, 2015 at 02:05:06PM -0800, Andrew Morton wrote:
> On Mon, 2 Feb 2015 16:55:25 +0000 Mel Gorman <mgorman@suse.de> wrote:
>
> > glibc malloc changed behaviour in glibc 2.10 to have per-thread arenas
> > instead of creating new arenas if the existing ones were contended.
> > The decision appears to have been made so the allocator scales better but the
> > downside is that madvise(MADV_DONTNEED) is now called for these per-thread
> > arenas during free. This tears down pages that would have previously
> > remained. There is nothing wrong with this decision from a functional point
> > of view but any threaded application that frequently allocates/frees the
> > same-sized region is going to incur the full teardown and refault costs.
>
> MADV_DONTNEED has been there for many years. How could this problem
> not have been noticed during glibc 2.10 development/testing?

I do not know. I only spotted it due to switching distributions. Looping
allocations and frees of the same sizes is considered inefficient and it
might have been dismissed on those grounds. It's probably less noticeable
when it only affects threaded applications.

> Is there
> some more recent kernel change which is triggering this?
>

Not that I'm aware of.

> > This patch identifies when a thread is frequently calling MADV_DONTNEED
> > on the same region of memory and starts ignoring the hint.
>
> That's pretty nasty-looking :(
>

Yep, it is but we're very limited in terms of what we can do within the
kernel here.

> And presumably there are all sorts of behaviours which will still
> trigger the problem but which will avoid the start/end equality test in
> ignore_madvise_hint()?
>

Yes. I would expect that a simple pattern of multiple allocs followed by
multiple frees in a loop would also trigger it.

> Really, this is a glibc problem and only a glibc problem.
> MADV_DONTNEED is unavoidably expensive and glibc is calling
> MADV_DONTNEED for a region which it *does* need.

To be fair to glibc, it calls it on a region it *thinks* it doesn't need only
to reuse it immediately afterwards because of how the benchmark is
implemented.

> Is there something
> preventing this from being addressed within glibc?

I doubt it other than I expect they'll punt it back and blame either the
application for being stupid or the kernel for being slow.

-- 
Mel Gorman
SUSE Labs

* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 22:18 ` Mel Gorman
@ 2015-02-02 22:35 ` Andrew Morton
  2015-02-03  0:26 ` Davidlohr Bueso
  2015-02-03 10:50 ` Mel Gorman
  2015-02-05 21:44 ` Rik van Riel
  1 sibling, 2 replies; 31+ messages in thread
From: Andrew Morton @ 2015-02-02 22:35 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-mm, Minchan Kim, Vlastimil Babka, linux-kernel

On Mon, 2 Feb 2015 22:18:24 +0000 Mel Gorman <mgorman@suse.de> wrote:

> > Is there something
> > preventing this from being addressed within glibc?
>
> I doubt it other than I expect they'll punt it back and blame either the
> application for being stupid or the kernel for being slow.

*Is* the application being stupid? What is it actually doing?
Something like

pthread_routine()
{
	p = malloc(X);
	do_some(work);
	free(p);
	return;
}

?

If so, that doesn't seem stupid?

* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 22:35 ` Andrew Morton
@ 2015-02-03  0:26 ` Davidlohr Bueso
  2015-02-03 10:50 ` Mel Gorman
  1 sibling, 0 replies; 31+ messages in thread
From: Davidlohr Bueso @ 2015-02-03 0:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, linux-mm, Minchan Kim, Vlastimil Babka, linux-kernel

On Mon, 2015-02-02 at 14:35 -0800, Andrew Morton wrote:
> On Mon, 2 Feb 2015 22:18:24 +0000 Mel Gorman <mgorman@suse.de> wrote:
>
> > > Is there something
> > > preventing this from being addressed within glibc?
> >
> > I doubt it other than I expect they'll punt it back and blame either the
> > application for being stupid or the kernel for being slow.
>
> *Is* the application being stupid? What is it actually doing?
> Something like
>
> pthread_routine()
> {
> 	p = malloc(X);
> 	do_some(work);
> 	free(p);

Ebizzy adds a time based loop in there. But yeah, pretty much a standard
pthread model.

* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 22:35 ` Andrew Morton
  2015-02-03  0:26 ` Davidlohr Bueso
@ 2015-02-03 10:50 ` Mel Gorman
  1 sibling, 0 replies; 31+ messages in thread
From: Mel Gorman @ 2015-02-03 10:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, Minchan Kim, Vlastimil Babka, linux-kernel

On Mon, Feb 02, 2015 at 02:35:41PM -0800, Andrew Morton wrote:
> On Mon, 2 Feb 2015 22:18:24 +0000 Mel Gorman <mgorman@suse.de> wrote:
>
> > > Is there something
> > > preventing this from being addressed within glibc?
> >
> > I doubt it other than I expect they'll punt it back and blame either the
> > application for being stupid or the kernel for being slow.
>
> *Is* the application being stupid? What is it actually doing?

Only a little. There is little simulated think time between the allocation
and the subsequent free. It means the cost of alloc/free dominates where in
"real" applications they would either be reusing buffers if they were
constantly needed or the think time would mask the cost of the free.

> Something like
>
> pthread_routine()
> {
> 	p = malloc(X);
> 	do_some(work);
> 	free(p);
> 	return;
> }
>

Pretty much. There is a search_mem() function that

alloc(copy_size)
memcpy
search
free(copy)

A real application might try and avoid the copy or reuse buffers if they
encountered this particular problem.

-- 
Mel Gorman
SUSE Labs

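
[Editorial sketch] The alloc/memcpy/search/free pattern described above, and the buffer-reuse alternative mentioned at the end of the message, might look roughly like this in C. It is not ebizzy's actual search_mem(): the names search_copy() and search_reuse() and the linear byte search are hypothetical, chosen only to illustrate the shape of the loop body.

```c
#include <stdlib.h>
#include <string.h>

/* Linear search helper: index of the first byte equal to key, or -1. */
static long find_byte(const unsigned char *buf, size_t len, unsigned char key)
{
	for (size_t i = 0; i < len; i++) {
		if (buf[i] == key)
			return (long)i;
	}
	return -1;
}

/*
 * Allocate a copy, search it, free it. Each call pays for a fresh
 * allocation; with large buffers the allocator may release the pages
 * again on free(), which is the cost discussed in this thread.
 * Returns -1 if the key is absent or the allocation fails.
 */
static long search_copy(const unsigned char *src, size_t len, unsigned char key)
{
	unsigned char *copy = malloc(len);
	long found;

	if (!copy)
		return -1;
	memcpy(copy, src, len);
	found = find_byte(copy, len, key);
	free(copy);
	return found;
}

/*
 * Same work, but the caller owns one scratch buffer of at least len
 * bytes and reuses it across calls, so no allocation or free happens
 * per iteration.
 */
static long search_reuse(unsigned char *scratch, const unsigned char *src,
			 size_t len, unsigned char key)
{
	memcpy(scratch, src, len);
	return find_byte(scratch, len, key);
}
```

Both functions return the same result for the same input; the difference lies only in how often the allocator is asked for memory.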
* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 22:18 ` Mel Gorman
  2015-02-02 22:35 ` Andrew Morton
@ 2015-02-05 21:44 ` Rik van Riel
  1 sibling, 0 replies; 31+ messages in thread
From: Rik van Riel @ 2015-02-05 21:44 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton
  Cc: linux-mm, Minchan Kim, Vlastimil Babka, linux-kernel

On 02/02/2015 05:18 PM, Mel Gorman wrote:
> On Mon, Feb 02, 2015 at 02:05:06PM -0800, Andrew Morton wrote:
>> On Mon, 2 Feb 2015 16:55:25 +0000 Mel Gorman <mgorman@suse.de> wrote:
>>
>>> glibc malloc changed behaviour in glibc 2.10 to have per-thread arenas
>>> instead of creating new arenas if the existing ones were contended.
>>> The decision appears to have been made so the allocator scales better but the
>>> downside is that madvise(MADV_DONTNEED) is now called for these per-thread
>>> arenas during free. This tears down pages that would have previously
>>> remained. There is nothing wrong with this decision from a functional point
>>> of view but any threaded application that frequently allocates/frees the
>>> same-sized region is going to incur the full teardown and refault costs.
>>
>> MADV_DONTNEED has been there for many years. How could this problem
>> not have been noticed during glibc 2.10 development/testing?
>
> I do not know. I only spotted it due to switching distributions. Looping
> allocations and frees of the same sizes is considered inefficient and it
> might have been dismissed on those grounds. It's probably less noticeable
> when it only affects threaded applications.
>
>> Is there
>> some more recent kernel change which is triggering this?
>>
>
> Not that I'm aware of.
>
>>> This patch identifies when a thread is frequently calling MADV_DONTNEED
>>> on the same region of memory and starts ignoring the hint.
>>
>> That's pretty nasty-looking :(
>>
>
> Yep, it is but we're very limited in terms of what we can do within the
> kernel here.
>
>> And presumably there are all sorts of behaviours which will still
>> trigger the problem but which will avoid the start/end equality test in
>> ignore_madvise_hint()?
>>
>
> Yes. I would expect that a simple pattern of multiple allocs followed by
> multiple frees in a loop would also trigger it.
>
>> Really, this is a glibc problem and only a glibc problem.
>> MADV_DONTNEED is unavoidably expensive and glibc is calling
>> MADV_DONTNEED for a region which it *does* need.
>
> To be fair to glibc, it calls it on a region it *thinks* it doesn't need only
> to reuse it immediately afterwards because of how the benchmark is
> implemented.
>
>> Is there something
>> preventing this from being addressed within glibc?
>
> I doubt it other than I expect they'll punt it back and blame either the
> application for being stupid or the kernel for being slow.

This sounds like something that could benefit from Minchan's
MADV_FREE, instead of MADV_DONTNEED.

If non page aligned malloc/free does not depend on pages being
zeroed, I suspect an MADV_DONTNEED resulting from a malloc/free
loop also does not depend on it.

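
[Editorial sketch] MADV_FREE was still a proposal when this thread was written; it was merged later, in Linux 4.5. A userspace allocator that wants the cheaper lazy release while staying usable on older kernels could try MADV_FREE first and fall back to MADV_DONTNEED. The helper below is a minimal sketch of that idea; the name lazy_release() is hypothetical and this is not how glibc or any particular allocator implements it.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Tell the kernel that [addr, addr + len) is no longer needed.
 * Prefers MADV_FREE (pages are reclaimed lazily and may keep their old
 * contents until then) and falls back to MADV_DONTNEED when the running
 * kernel or the C library headers do not know MADV_FREE.
 * Returns 0 on success, -1 on failure with errno set by madvise().
 */
static int lazy_release(void *addr, size_t len)
{
#ifdef MADV_FREE
	if (madvise(addr, len, MADV_FREE) == 0)
		return 0;
	if (errno != EINVAL)
		return -1;
	/* EINVAL: MADV_FREE not accepted here, try the older hint. */
#endif
	return madvise(addr, len, MADV_DONTNEED) == 0 ? 0 : -1;
}
```

After a successful call the caller must not assume anything about the contents of the range: it may read back as the old data or as zeroes, depending on which hint was applied and on whether the pages were reclaimed in the meantime.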
* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 16:55 [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints Mel Gorman
  2015-02-02 22:05 ` Andrew Morton
@ 2015-02-02 22:22 ` Dave Hansen
  2015-02-03  8:19 ` MADV_DONTNEED semantics? Was: " Vlastimil Babka
  2015-02-03  9:47 ` Mel Gorman
  1 sibling, 2 replies; 31+ messages in thread
From: Dave Hansen @ 2015-02-02 22:22 UTC (permalink / raw)
  To: Mel Gorman, linux-mm
  Cc: Minchan Kim, Vlastimil Babka, Andrew Morton, linux-kernel

On 02/02/2015 08:55 AM, Mel Gorman wrote:
> This patch identifies when a thread is frequently calling MADV_DONTNEED
> on the same region of memory and starts ignoring the hint. On an 8-core
> single-socket machine this was the impact on ebizzy using glibc 2.19.

The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
called:

>      MADV_DONTNEED
>               Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources
>               associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the
>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.

So if we have anything depending on the behavior that it's _always_
zero-filled after an MADV_DONTNEED, this will break it.

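
[Editorial sketch] The zero-fill behaviour quoted from the man page can be observed from userspace with a private anonymous mapping. The function below is a small probe, assuming Linux and a mapping without an underlying file; the name dontneed_zero_fills() is hypothetical. A kernel carrying the RFC patch could answer differently once the same range has been dropped often enough, which is exactly the concern raised above.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Dirty one private anonymous page, drop it with MADV_DONTNEED and read
 * it back.
 * Returns 1 if the page reads as zero afterwards, 0 if the old contents
 * survived, -1 if any of the system calls failed.
 */
static int dontneed_zero_fills(void)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	int result;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	p[0] = 0xAA;                      /* make the page dirty */
	if (madvise((void *)p, (size_t)page, MADV_DONTNEED) != 0) {
		munmap((void *)p, (size_t)page);
		return -1;
	}

	result = (p[0] == 0) ? 1 : 0;     /* refault and inspect the page */
	munmap((void *)p, (size_t)page);
	return result;
}
```

An application that relies on the documented semantics expects this probe to report 1 every time the madvise() call succeeds.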
* MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-02 22:22 ` Dave Hansen
@ 2015-02-03  8:19 ` Vlastimil Babka
  2015-02-03 10:53 ` Kirill A. Shutemov
  2015-02-03 11:16 ` Mel Gorman
  2015-02-03  9:47 ` Mel Gorman
  1 sibling, 2 replies; 31+ messages in thread
From: Vlastimil Babka @ 2015-02-03 8:19 UTC (permalink / raw)
  To: Dave Hansen, Mel Gorman, linux-mm
  Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-api, mtk.manpages,
	linux-man

[CC linux-api, man pages]

On 02/02/2015 11:22 PM, Dave Hansen wrote:
> On 02/02/2015 08:55 AM, Mel Gorman wrote:
>> This patch identifies when a thread is frequently calling MADV_DONTNEED
>> on the same region of memory and starts ignoring the hint. On an 8-core
>> single-socket machine this was the impact on ebizzy using glibc 2.19.
>
> The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> called:
>
>> MADV_DONTNEED
>> Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources
>> associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the
>> underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
>
> So if we have anything depending on the behavior that it's _always_
> zero-filled after an MADV_DONTNEED, this will break it.

OK, so that's a third person (including me) who understood it as a zero-fill
guarantee. I think the man page should be clarified (if it's indeed not
guaranteed), or we have a bug.

The implementation actually skips MADV_DONTNEED for
VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.

I'm not sure about VM_PFNMAP, these are probably special enough. For mlock, one
could expect that mlocking and MADV_DONTNEED would be in some opposition, but
it's not documented in the manpage AFAIK. Neither is the hugetlb case, which
could be really unexpected by the user.

Next, what the man page says about guarantees:

"The kernel is free to ignore the advice."

- that would suggest that nothing is guaranteed

"This call does not influence the semantics of the application (except in the
case of MADV_DONTNEED)"

- that depends if the reader understands it as "does influence by MADV_DONTNEED"
or "may influence by MADV_DONTNEED"

- btw, isn't MADV_DONTFORK another exception that does influence the semantics?
And since it's mentioned as a workaround for some hardware, is it OK to ignore
this advice?

And the part you already cited:

"Subsequent accesses of pages in this range will succeed, but will result either
in reloading of the memory contents from the underlying mapped file (see
mmap(2)) or zero-fill on-demand pages for mappings without an underlying file."

- The word "will result" did sound as a guarantee at least to me. So here it
could be changed to "may result (unless the advice is ignored)"?

And if we agree that there is indeed no guarantee, what's the actual semantic
difference from MADV_FREE? I guess none? So there's only a possible performance
difference?

Vlastimil

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-03  8:19 ` MADV_DONTNEED semantics? Was: " Vlastimil Babka
@ 2015-02-03 10:53 ` Kirill A. Shutemov
  2015-02-03 11:42 ` Vlastimil Babka
  2015-02-03 11:16 ` Mel Gorman
  1 sibling, 1 reply; 31+ messages in thread
From: Kirill A. Shutemov @ 2015-02-03 10:53 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton,
	linux-kernel, linux-api, mtk.manpages, linux-man

On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> [CC linux-api, man pages]
>
> On 02/02/2015 11:22 PM, Dave Hansen wrote:
> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
> >> on the same region of memory and starts ignoring the hint. On an 8-core
> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
> >
> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> > called:
> >
> >> MADV_DONTNEED
> >> Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources
> >> associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the
> >> underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> >
> > So if we have anything depending on the behavior that it's _always_
> > zero-filled after an MADV_DONTNEED, this will break it.
>
> OK, so that's a third person (including me) who understood it as a zero-fill
> guarantee. I think the man page should be clarified (if it's indeed not
> guaranteed), or we have a bug.
>
> The implementation actually skips MADV_DONTNEED for
> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.

It doesn't skip. It fails with -EINVAL. Or I miss something.

> - The word "will result" did sound as a guarantee at least to me. So here it
> could be changed to "may result (unless the advice is ignored)"?

It's too late to fix documentation. Applications already depend on the
behaviour.

-- 
 Kirill A. Shutemov

* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
  2015-02-03 10:53 ` Kirill A. Shutemov
@ 2015-02-03 11:42 ` Vlastimil Babka
  2015-02-03 16:20 ` Michael Kerrisk (man-pages)
  2015-02-04  0:09 ` Minchan Kim
  0 siblings, 2 replies; 31+ messages in thread
From: Vlastimil Babka @ 2015-02-03 11:42 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton,
	linux-kernel, linux-api, mtk.manpages, linux-man

On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
>> [CC linux-api, man pages]
>>
>> On 02/02/2015 11:22 PM, Dave Hansen wrote:
>> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
>> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
>> >> on the same region of memory and starts ignoring the hint. On an 8-core
>> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
>> >
>> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
>> > called:
>> >
>> >> MADV_DONTNEED
>> >> Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources
>> >> associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the
>> >> underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
>> >
>> > So if we have anything depending on the behavior that it's _always_
>> > zero-filled after an MADV_DONTNEED, this will break it.
>>
>> OK, so that's a third person (including me) who understood it as a zero-fill
>> guarantee. I think the man page should be clarified (if it's indeed not
>> guaranteed), or we have a bug.
>>
>> The implementation actually skips MADV_DONTNEED for
>> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
>
> It doesn't skip. It fails with -EINVAL. Or I miss something.

No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in
this case:

* The application is attempting to release locked or shared pages (with
MADV_DONTNEED).

- that covers mlocking ok, not sure if the rest fits the "shared pages" case
though. I don't see any check for other kinds of shared pages in the code.

>> - The word "will result" did sound as a guarantee at least to me. So here it
>> could be changed to "may result (unless the advice is ignored)"?
>
> It's too late to fix documentation. Applications already depend on the
> behaviour.

Right, so as long as they check for EINVAL, it should be safe. It appears that
jemalloc does.

I still wouldn't be sure just by reading the man page that the clearing is
guaranteed whenever I don't get an error return value, though,

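
[Editorial sketch] "As long as they check for EINVAL" can be read as: treat the range as zeroed only when madvise() reports success, and clear it by hand otherwise. The helper below illustrates that reading for private anonymous memory. It is an assumption-laden sketch, not jemalloc's code; the name release_or_clear() is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Make sure [addr, addr + len) reads as zero afterwards.
 * First asks the kernel to drop the pages with MADV_DONTNEED; if the
 * call fails for any reason (for example EINVAL on a locked or hugetlb
 * mapping), the range is zeroed with memset() instead.
 * Only meaningful for private anonymous mappings, where a successful
 * MADV_DONTNEED is documented to give zero-fill-on-demand pages.
 * Returns 0 if madvise() succeeded, 1 if the memset() fallback ran.
 */
static int release_or_clear(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_DONTNEED) == 0)
		return 0;

	memset(addr, 0, len);
	return 1;
}
```

With this shape the caller sees zeroed memory on either path, so its correctness depends only on the kernel honouring every MADV_DONTNEED call that returns success, which is the guarantee being debated in this subthread.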
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 11:42 ` Vlastimil Babka @ 2015-02-03 16:20 ` Michael Kerrisk (man-pages) 2015-02-04 13:46 ` Vlastimil Babka 2015-02-04 0:09 ` Minchan Kim 1 sibling, 1 reply; 31+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-02-03 16:20 UTC (permalink / raw) To: Vlastimil Babka, Kirill A. Shutemov Cc: mtk.manpages, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, linux-kernel, linux-api, linux-man, Hugh Dickins Hello Vlastimil Thanks for CCing me into this thread. On 02/03/2015 12:42 PM, Vlastimil Babka wrote: > On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote: >> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote: >>> [CC linux-api, man pages] >>> >>> On 02/02/2015 11:22 PM, Dave Hansen wrote: >>>> On 02/02/2015 08:55 AM, Mel Gorman wrote: >>>>> This patch identifies when a thread is frequently calling MADV_DONTNEED >>>>> on the same region of memory and starts ignoring the hint. On an 8-core >>>>> single-socket machine this was the impact on ebizzy using glibc 2.19. >>>> >>>> The manpage, at least, claims that we zero-fill after MADV_DONTNEED is >>>> called: >>>> >>>>> MADV_DONTNEED >>>>> Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources >>>>> associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the >>>>> underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file. >>>> >>>> So if we have anything depending on the behavior that it's _always_ >>>> zero-filled after an MADV_DONTNEED, this will break it. >>> >>> OK, so that's a third person (including me) who understood it as a zero-fill >>> guarantee. I think the man page should be clarified (if it's indeed not >>> guaranteed), or we have a bug. 
>>> >>> The implementation actually skips MADV_DONTNEED for >>> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's. >> >> It doesn't skip. It fails with -EINVAL. Or I miss something. > > No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in > this case: > > * The application is attempting to release locked or shared pages (with > MADV_DONTNEED). Yes, there is that. But the page could be more explicit when discussing MADV_DONTNEED in the main text. I've done that. > - that covers mlocking ok, not sure if the rest fits the "shared pages" case > though. I dont see any check for other kinds of shared pages in the code. Agreed. "shared" here seems confused. I've removed it. And I've added mention of "Huge TLB pages" for this error. >>> - The word "will result" did sound as a guarantee at least to me. So here it >>> could be changed to "may result (unless the advice is ignored)"? >> >> It's too late to fix documentation. Applications already depends on the >> beheviour. > > Right, so as long as they check for EINVAL, it should be safe. It appears that > jemalloc does. So, first a brief question: in the cases where the call does not error out, are we agreed that in the current implementation, MADV_DONTNEED will always result in zero-filled pages when the region is faulted back in (when we consider pages that are not backed by a file)? > I still wouldnt be sure just by reading the man page that the clearing is > guaranteed whenever I dont get an error return value, though, I'm not quite sure what you want here. I mean: if there's an error, then the DONTNEED action didn't occur, right? Therefore, there won't be zero-filled pages. But, for what it's worth, I added "If the operation succeeds" at the start of that sentence beginning "Subsequent accesses...". 
Now, some history, explaining why the page is a bit of a mess, and for that matter why I could really use more help on it from MM folk (especially in the form of actual patches [1], rather than notes about deficiencies in the documentation), because:

***I simply cannot keep up with all of the details***.

Once upon a time (Linux 2.4), there was madvise() with just 5 flags:

    MADV_NORMAL
    MADV_RANDOM
    MADV_SEQUENTIAL
    MADV_WILLNEED
    MADV_DONTNEED

And already a dozen years ago, *I* added the text about MADV_DONTNEED. Back then, I believe it was true. I'm not sure if it's still true now, but I assume for the moment that it is, and await feedback. And the text saying that the call does not affect the semantics of memory access dates back even further (and was then true, MADV_DONTNEED aside).

Those 5 flags have analogs in POSIX's posix_madvise() (albeit, there is a semantic mismatch between the destructive MADV_DONTNEED and POSIX's nondestructive POSIX_MADV_DONTNEED). They also appear on most other implementations.

Since the original implementation, numerous pieces of cruft^W^W^W excellent new flags have been overloaded into this one system call. Some of those certainly violated the "does not change the semantics of the application" statement, but, sadly, the kernel developers who implemented MADV_REMOVE or MADV_DONTFORK did not think to send a patch to the man page for those new flags, one that might have noted that the semantics of the application are changed by such flags. Equally sadly, I neglected to review the page as a whole when *I* added documentation of these flags, otherwise I might have caught that detail.

So, just to repeat, I could really use more help on it from MM folk in the form of actual patches to the man page.
Thanks, Michael [1] https://www.kernel.org/doc/man-pages/patches.html -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/
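The EINVAL case discussed above (MADV_DONTNEED applied to locked pages) can be observed from user space. The following is only a minimal sketch, assuming a Linux system where the process is allowed to mlock(2) one page; the helper name dontneed_result_on_locked_page() is hypothetical and not part of any patch in this thread.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): mlock() one anonymous page
 * and call madvise(MADV_DONTNEED) on it.
 *
 * Returns 0 if madvise() succeeded, the errno value if madvise() failed,
 * and -1 if the mapping or the mlock() setup could not be done.
 */
int dontneed_result_on_locked_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int result = -1;
    if (mlock(p, (size_t)page) == 0) {
        /* The range is locked now; per the thread this hint should fail. */
        if (madvise(p, (size_t)page, MADV_DONTNEED) == 0)
            result = 0;
        else
            result = errno;   /* saved before any later call can change it */
        munlock(p, (size_t)page);
    }

    munmap(p, (size_t)page);
    return result;
}
```

A caller that depends on the pages being dropped would compare the return value against EINVAL and fall back to clearing the memory itself in that case. The -1 return covers environments where RLIMIT_MEMLOCK forbids locking even a single page.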
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 16:20 ` Michael Kerrisk (man-pages) @ 2015-02-04 13:46 ` Vlastimil Babka 2015-02-04 14:00 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 31+ messages in thread From: Vlastimil Babka @ 2015-02-04 13:46 UTC (permalink / raw) To: Michael Kerrisk (man-pages), Kirill A. Shutemov Cc: Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, linux-kernel, linux-api, linux-man, Hugh Dickins On 02/03/2015 05:20 PM, Michael Kerrisk (man-pages) wrote: > Hello Vlastimil > > Thanks for CCing me into this thread. NP > On 02/03/2015 12:42 PM, Vlastimil Babka wrote: >> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote: >>> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote: >>> >>> It doesn't skip. It fails with -EINVAL. Or I miss something. >> >> No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in >> this case: >> >> * The application is attempting to release locked or shared pages (with >> MADV_DONTNEED). > > Yes, there is that. But the page could be more explicit when discussing > MADV_DONTNEED in the main text. I've done that. > >> - that covers mlocking ok, not sure if the rest fits the "shared pages" case >> though. I dont see any check for other kinds of shared pages in the code. > > Agreed. "shared" here seems confused. I've removed it. And I've > added mention of "Huge TLB pages" for this error. > Thanks. >>>> - The word "will result" did sound as a guarantee at least to me. So here it >>>> could be changed to "may result (unless the advice is ignored)"? >>> >>> It's too late to fix documentation. Applications already depends on the >>> beheviour. >> >> Right, so as long as they check for EINVAL, it should be safe. It appears that >> jemalloc does. 
> > So, first a brief question: in the cases where the call does not error out, > are we agreed that in the current implementation, MADV_DONTNEED will > always result in zero-filled pages when the region is faulted back in > (when we consider pages that are not backed by a file)? I'd agree at this point. Also we should probably mention anonymously shared pages (shmem). I think they behave the same as file here. >> I still wouldnt be sure just by reading the man page that the clearing is >> guaranteed whenever I dont get an error return value, though, > > I'm not quite sure what you want here. I mean: if there's an error, I was just reiterating that the guarantee is not clear from if you consider all the statements in the man page. > then the DONTNEED action didn't occur, right? Therefore, there won't > be zero-filled pages. But, for what it's worth, I added "If the > operation succeeds" at the start of that sentence beginning "Subsequent > accesses...". Yes, that should clarify it. Thanks! > Now, some history, explaining why the page is a bit of a mess, > and for that matter why I could really use more help on it from MM > folk (especially in the form of actual patches [1], rather than notes > about deficiencies in the documentation), because: > > ***I simply cannot keep up with all of the details***. I see, and expected it would be like this. I would just send patch if the situation was clear, but here we should agree first, and I thought you should be involved from the beginning. > Once upon a time (Linux 2.4), there was madvise() with just 5 flags: > > MADV_NORMAL > MADV_RANDOM > MADV_SEQUENTIAL > MADV_WILLNEED > MADV_DONTNEED > > And already a dozen years ago, *I* added the text about MADV_DONTNEED. > Back then, I believe it was true. I'm not sure if it's still true now, > but I assume for the moment that it is, and await feedback. 
And the > text saying that the call does not affect the semantics of memory > access dates back even further (and was then true, MADV_DONTNEED aside). > > Those 5 flags have analogs in POSIX's posix_madvise() (albeit, there > is a semantic mismatch between the destructive MADV_DONTNEED and > POSIX's nondestructive POSIX_MADV_DONTNEED). They also appear > on most other implementations. > > Since the original implementation, numerous pieces of cruft^W^W^W > excellent new flags have been overloaded into this one system call. > Some of those certainly violated the "does not change the semantics > of the application" statement, but, sadly, the kernel developers who > implemented MADV_REMOVE or MADV_DONTFORK did not think to send a > patch to the man page for those new flags, one that might have noted > that the semantics of the application are changed by such flags. Equally > sadly, I did overlook to scan the bigger page when *I* added > documentation of these flags to those pages, otherwise I might have > caught that detail. > > So, just to repeat, I could really use more help on it from MM > folk in the form of actual patches to the man page. Thanks for the background. I'll try to remember to check for man-pages part when I review some api changing patch. > Thanks, > > Michael > > [1] https://www.kernel.org/doc/man-pages/patches.html > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-04 13:46 ` Vlastimil Babka @ 2015-02-04 14:00 ` Michael Kerrisk (man-pages) 2015-02-04 17:02 ` Vlastimil Babka 0 siblings, 1 reply; 31+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-02-04 14:00 UTC (permalink / raw) To: Vlastimil Babka Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins Hello Vlastimil, On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote: > On 02/03/2015 05:20 PM, Michael Kerrisk (man-pages) wrote: >> >> On 02/03/2015 12:42 PM, Vlastimil Babka wrote: >>> >>> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote: >>>> >>>> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote: >>>> >>>> It doesn't skip. It fails with -EINVAL. Or I miss something. >>> >>> >>> No, I missed that. Thanks for pointing out. The manpage also explains >>> EINVAL in >>> this case: >>> >>> * The application is attempting to release locked or shared pages (with >>> MADV_DONTNEED). >> >> Yes, there is that. But the page could be more explicit when discussing >> MADV_DONTNEED in the main text. I've done that. >> >>> - that covers mlocking ok, not sure if the rest fits the "shared pages" >>> case >>> though. I dont see any check for other kinds of shared pages in the code. >> >> Agreed. "shared" here seems confused. I've removed it. And I've >> added mention of "Huge TLB pages" for this error. > > Thanks. I also added those cases for MADV_REMOVE, BTW. >>>>> - The word "will result" did sound as a guarantee at least to me. So >>>>> here it >>>>> could be changed to "may result (unless the advice is ignored)"? >>>> >>>> It's too late to fix documentation. Applications already depends on the >>>> beheviour. >>> >>> Right, so as long as they check for EINVAL, it should be safe. It appears >>> that >>> jemalloc does. 
>> >> >> So, first a brief question: in the cases where the call does not error >> out, >> are we agreed that in the current implementation, MADV_DONTNEED will >> always result in zero-filled pages when the region is faulted back in >> (when we consider pages that are not backed by a file)? > > > I'd agree at this point. Thanks for the confirmation. > Also we should probably mention anonymously shared pages (shmem). I think > they behave the same as file here. You mean tmpfs here, right? (I don't keep all of the synonyms straight.) >>> I still wouldnt be sure just by reading the man page that the clearing is >>> guaranteed whenever I dont get an error return value, though, >> >> I'm not quite sure what you want here. I mean: if there's an error, > > I was just reiterating that the guarantee is not clear from if you consider > all the statements in the man page. > >> then the DONTNEED action didn't occur, right? Therefore, there won't >> be zero-filled pages. But, for what it's worth, I added "If the >> operation succeeds" at the start of that sentence beginning "Subsequent >> accesses...". > > Yes, that should clarify it. Thanks! Okay. >> Now, some history, explaining why the page is a bit of a mess, >> and for that matter why I could really use more help on it from MM >> folk (especially in the form of actual patches [1], rather than notes >> about deficiencies in the documentation), because: >> >> ***I simply cannot keep up with all of the details***. > > I see, and expected it would be like this. I would just send patch if the > situation was clear, but here we should agree first, and I thought you > should be involved from the beginning. Sorry -- I should have made it clearer, this statement was not targeted at you personally, or even necessarily at this particular thread. It was a general comment, that came up sharply to me as I looked at how much cruft there is in the madvise() page. 
>> Once upon a time (Linux 2.4), there was madvise() with just 5 flags: >> >> MADV_NORMAL >> MADV_RANDOM >> MADV_SEQUENTIAL >> MADV_WILLNEED >> MADV_DONTNEED >> >> And already a dozen years ago, *I* added the text about MADV_DONTNEED. >> Back then, I believe it was true. I'm not sure if it's still true now, >> but I assume for the moment that it is, and await feedback. And the >> text saying that the call does not affect the semantics of memory >> access dates back even further (and was then true, MADV_DONTNEED aside). >> >> Those 5 flags have analogs in POSIX's posix_madvise() (albeit, there >> is a semantic mismatch between the destructive MADV_DONTNEED and >> POSIX's nondestructive POSIX_MADV_DONTNEED). They also appear >> on most other implementations. >> >> Since the original implementation, numerous pieces of cruft^W^W^W >> excellent new flags have been overloaded into this one system call. >> Some of those certainly violated the "does not change the semantics >> of the application" statement, but, sadly, the kernel developers who >> implemented MADV_REMOVE or MADV_DONTFORK did not think to send a >> patch to the man page for those new flags, one that might have noted >> that the semantics of the application are changed by such flags. Equally >> sadly, I did overlook to scan the bigger page when *I* added >> documentation of these flags to those pages, otherwise I might have >> caught that detail. >> >> So, just to repeat, I could really use more help on it from MM >> folk in the form of actual patches to the man page. > > Thanks for the background. I'll try to remember to check for man-pages part > when I review some api changing patch. That would be great. Thanks, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-04 14:00 ` Michael Kerrisk (man-pages) @ 2015-02-04 17:02 ` Vlastimil Babka 2015-02-04 19:24 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 31+ messages in thread From: Vlastimil Babka @ 2015-02-04 17:02 UTC (permalink / raw) To: mtk.manpages Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote: > Hello Vlastimil, > > On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote: >>>> - that covers mlocking ok, not sure if the rest fits the "shared pages" >>>> case >>>> though. I dont see any check for other kinds of shared pages in the code. >>> >>> Agreed. "shared" here seems confused. I've removed it. And I've >>> added mention of "Huge TLB pages" for this error. >> >> Thanks. > > I also added those cases for MADV_REMOVE, BTW. Right. There's also the following for MADV_REMOVE that needs updating: "Currently, only shmfs/tmpfs supports this; other filesystems return with the error ENOSYS." - it's not just shmem/tmpfs anymore. It should be best to refer to fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to date. - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is listed in the ERRORS section. >>>>>> - The word "will result" did sound as a guarantee at least to me. So >>>>>> here it >>>>>> could be changed to "may result (unless the advice is ignored)"? >>>>> >>>>> It's too late to fix documentation. Applications already depends on the >>>>> beheviour. >>>> >>>> Right, so as long as they check for EINVAL, it should be safe. It appears >>>> that >>>> jemalloc does. 
>>> >>> >>> So, first a brief question: in the cases where the call does not error >>> out, >>> are we agreed that in the current implementation, MADV_DONTNEED will >>> always result in zero-filled pages when the region is faulted back in >>> (when we consider pages that are not backed by a file)? >> >> >> I'd agree at this point. > > Thanks for the confirmation. > >> Also we should probably mention anonymously shared pages (shmem). I think >> they behave the same as file here. > > You mean tmpfs here, right? (I don't keep all of the synonyms straight.) shmem is tmpfs (that by itself would fit under "files" just fine), but also sys V segments created by shmget(2) and also mappings created by mmap with MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to refer to the full list. Thanks, Vlastimil ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-04 17:02 ` Vlastimil Babka @ 2015-02-04 19:24 ` Michael Kerrisk (man-pages) 2015-02-05 1:07 ` Minchan Kim 2015-02-05 15:41 ` Michal Hocko 0 siblings, 2 replies; 31+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-02-04 19:24 UTC (permalink / raw) To: Vlastimil Babka Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote: > On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote: >> >> Hello Vlastimil, >> >> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote: >>>>> >>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages" >>>>> case >>>>> though. I dont see any check for other kinds of shared pages in the >>>>> code. >>>> >>>> >>>> Agreed. "shared" here seems confused. I've removed it. And I've >>>> added mention of "Huge TLB pages" for this error. >>> >>> >>> Thanks. >> >> >> I also added those cases for MADV_REMOVE, BTW. > > > Right. There's also the following for MADV_REMOVE that needs updating: > > "Currently, only shmfs/tmpfs supports this; other filesystems return with > the error ENOSYS." > > - it's not just shmem/tmpfs anymore. It should be best to refer to > fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to > date. > > - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is > listed in the ERRORS section. Yup, I recently added that as well, based on a patch from Jan Chaloupka. >>>>>>> - The word "will result" did sound as a guarantee at least to me. So >>>>>>> here it >>>>>>> could be changed to "may result (unless the advice is ignored)"? >>>>>> >>>>>> It's too late to fix documentation. Applications already depends on >>>>>> the >>>>>> beheviour. >>>>> >>>>> Right, so as long as they check for EINVAL, it should be safe. 
>>>>> It appears that jemalloc does.
>>>>
>>>> So, first a brief question: in the cases where the call does not error
>>>> out, are we agreed that in the current implementation, MADV_DONTNEED will
>>>> always result in zero-filled pages when the region is faulted back in
>>>> (when we consider pages that are not backed by a file)?
>>>
>>> I'd agree at this point.
>>
>> Thanks for the confirmation.
>>
>>> Also we should probably mention anonymously shared pages (shmem). I think
>>> they behave the same as file here.
>>
>> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)
>
> shmem is tmpfs (that by itself would fit under "files" just fine), but also
> sys V segments created by shmget(2) and also mappings created by mmap with
> MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to
> refer to the full list.

So, how about this text:

    After a successful MADV_DONTNEED operation, the semantics
    of memory access in the specified region are changed:
    subsequent accesses of pages in the range will succeed,
    but will result in either reloading of the memory contents
    from the underlying mapped file (for shared file mappings,
    shared anonymous mappings, and shmem-based techniques such
    as System V shared memory segments) or zero-fill-on-demand
    pages for anonymous private mappings.

Thanks,

Michael
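The distinction drawn in the proposed text (zero-fill for anonymous private mappings, contents retained for shared anonymous mappings) can be checked with a small program. This is a sketch under the assumption that the kernel behaves as the participants describe for the current implementation; first_byte_after_dontneed() is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page with the given sharing flag
 * (MAP_PRIVATE or MAP_SHARED), fill it with 0xAA, apply MADV_DONTNEED, and
 * return the first byte read back afterwards.
 *
 * Returns the byte value (0..255), or -1 if mmap() or madvise() failed.
 */
int first_byte_after_dontneed(int sharing_flag)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, (size_t)page);

    int result = -1;
    if (madvise(p, (size_t)page, MADV_DONTNEED) == 0) {
        /* This read faults the page back in after the hint. */
        result = *(volatile unsigned char *)p;
    }

    munmap(p, (size_t)page);
    return result;
}
```

With MAP_PRIVATE the page is expected to come back zero-filled, so the helper should return 0. With MAP_SHARED the data lives in the shmem object behind the mapping, so the helper should return 0xAA, which matches the "behave the same as file" remark above.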
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-04 19:24 ` Michael Kerrisk (man-pages) @ 2015-02-05 1:07 ` Minchan Kim 2015-02-06 15:41 ` Michael Kerrisk (man-pages) 2015-02-05 15:41 ` Michal Hocko 1 sibling, 1 reply; 31+ messages in thread From: Minchan Kim @ 2015-02-05 1:07 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins Hello, On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote: > On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote: > > On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote: > >> > >> Hello Vlastimil, > >> > >> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote: > >>>>> > >>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages" > >>>>> case > >>>>> though. I dont see any check for other kinds of shared pages in the > >>>>> code. > >>>> > >>>> > >>>> Agreed. "shared" here seems confused. I've removed it. And I've > >>>> added mention of "Huge TLB pages" for this error. > >>> > >>> > >>> Thanks. > >> > >> > >> I also added those cases for MADV_REMOVE, BTW. > > > > > > Right. There's also the following for MADV_REMOVE that needs updating: > > > > "Currently, only shmfs/tmpfs supports this; other filesystems return with > > the error ENOSYS." > > > > - it's not just shmem/tmpfs anymore. It should be best to refer to > > fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to > > date. > > > > - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is > > listed in the ERRORS section. > > Yup, I recently added that as well, based on a patch from Jan Chaloupka. > > >>>>>>> - The word "will result" did sound as a guarantee at least to me. So > >>>>>>> here it > >>>>>>> could be changed to "may result (unless the advice is ignored)"? 
> >>>>>>
> >>>>>> It's too late to fix documentation. Applications already depend on
> >>>>>> the behaviour.
> >>>>>
> >>>>> Right, so as long as they check for EINVAL, it should be safe. It
> >>>>> appears that jemalloc does.
> >>>>
> >>>> So, first a brief question: in the cases where the call does not error
> >>>> out, are we agreed that in the current implementation, MADV_DONTNEED will
> >>>> always result in zero-filled pages when the region is faulted back in
> >>>> (when we consider pages that are not backed by a file)?
> >>>
> >>> I'd agree at this point.
> >>
> >> Thanks for the confirmation.
> >>
> >>> Also we should probably mention anonymously shared pages (shmem). I think
> >>> they behave the same as file here.
> >>
> >> You mean tmpfs here, right? (I don't keep all of the synonyms straight.)
> >
> > shmem is tmpfs (that by itself would fit under "files" just fine), but also
> > sys V segments created by shmget(2) and also mappings created by mmap with
> > MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to
> > refer to the full list.
>
> So, how about this text:
>
>     After a successful MADV_DONTNEED operation, the semantics
>     of memory access in the specified region are changed:
>     subsequent accesses of pages in the range will succeed,
>     but will result in either reloading of the memory contents
>     from the underlying mapped file (for shared file mappings,
>     shared anonymous mappings, and shmem-based techniques such
>     as System V shared memory segments) or zero-fill-on-demand
>     pages for anonymous private mappings.

Hmm, I'd like to clarify.

Whether it was intentional or not, some userspace developers have assumed that the syscall drops the pages immediately when it returns without error, so that they will see more free pages (i.e., the RSS of the process will decrease) while the VMA is kept. Can we rely on that?

And we should update the error section, too. "locked" covers mlock(2), and you said you will add hugetlb. Then what about VM_PFNMAP? In that case, it fails too. How should we describe VM_PFNMAP? As a special mapping for some drivers?

One more thing: "The kernel is free to ignore the advice". It conflicts with "This call does not influence the semantics of the application (except in the case of MADV_DONTNEED)", so is it okay to read it as "The kernel is free to ignore the advice, except for MADV_DONTNEED"?

Thanks.

--
Kind regards,
Minchan Kim
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-05 1:07 ` Minchan Kim @ 2015-02-06 15:41 ` Michael Kerrisk (man-pages) 2015-02-09 6:46 ` Minchan Kim 0 siblings, 1 reply; 31+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-02-06 15:41 UTC (permalink / raw) To: Minchan Kim Cc: mtk.manpages, Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins On 02/05/2015 02:07 AM, Minchan Kim wrote: > Hello, > > On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote: >> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote: >>> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote: >>>> >>>> Hello Vlastimil, >>>> >>>> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote: >>>>>>> >>>>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages" >>>>>>> case >>>>>>> though. I dont see any check for other kinds of shared pages in the >>>>>>> code. >>>>>> >>>>>> >>>>>> Agreed. "shared" here seems confused. I've removed it. And I've >>>>>> added mention of "Huge TLB pages" for this error. >>>>> >>>>> >>>>> Thanks. >>>> >>>> >>>> I also added those cases for MADV_REMOVE, BTW. >>> >>> >>> Right. There's also the following for MADV_REMOVE that needs updating: >>> >>> "Currently, only shmfs/tmpfs supports this; other filesystems return with >>> the error ENOSYS." >>> >>> - it's not just shmem/tmpfs anymore. It should be best to refer to >>> fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to >>> date. >>> >>> - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is >>> listed in the ERRORS section. >> >> Yup, I recently added that as well, based on a patch from Jan Chaloupka. >> >>>>>>>>> - The word "will result" did sound as a guarantee at least to me. 
So >>>>>>>>> here it >>>>>>>>> could be changed to "may result (unless the advice is ignored)"? >>>>>>>> >>>>>>>> It's too late to fix documentation. Applications already depends on >>>>>>>> the >>>>>>>> beheviour. >>>>>>> >>>>>>> Right, so as long as they check for EINVAL, it should be safe. It >>>>>>> appears >>>>>>> that >>>>>>> jemalloc does. >>>>>> >>>>>> So, first a brief question: in the cases where the call does not error >>>>>> out, >>>>>> are we agreed that in the current implementation, MADV_DONTNEED will >>>>>> always result in zero-filled pages when the region is faulted back in >>>>>> (when we consider pages that are not backed by a file)? >>>>> >>>>> I'd agree at this point. >>>> >>>> Thanks for the confirmation. >>>> >>>>> Also we should probably mention anonymously shared pages (shmem). I think >>>>> they behave the same as file here. >>>> >>>> You mean tmpfs here, right? (I don't keep all of the synonyms straight.) >>> >>> shmem is tmpfs (that by itself would fit under "files" just fine), but also >>> sys V segments created by shmget(2) and also mappings created by mmap with >>> MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to >>> refer to the full list. >> >> So, how about this text: >> >> After a successful MADV_DONTNEED operation, the seman‐ >> tics of memory access in the specified region are >> changed: subsequent accesses of pages in the range >> will succeed, but will result in either reloading of >> the memory contents from the underlying mapped file >> (for shared file mappings, shared anonymous mappings, >> and shmem-based techniques such as System V shared >> memory segments) or zero-fill-on-demand pages for >> anonymous private mappings. > > Hmm, I'd like to clarify. > > Whether it was intention or not, some of userspace developers thought > about that syscall drop pages instantly if was no-error return so that > they will see more free pages(ie, rss for the process will be decreased) > with keeping the VMA. 
> Can we rely on it?

I do not know. Michael?

> And we should make error section, too.
> "locked" covers mlock(2) and you said you will add hugetlb. Then,
> VM_PFNMAP? In that case, it fails. How can we say about VM_PFNMAP?
> special mapping for some drivers?

I'm open for offers on what to add.

> One more thing, "The kernel is free to ignore the advice".
> It conflicts "This call does not influence the semantics of the
> application (except in the case of MADV_DONTNEED)" so
> is it okay we can believe "The kernel is free to ignore the advice
> except MADV_DONTNEED"?

I decided to just drop the sentence

    The kernel is free to ignore the advice.

It creates misunderstandings, and does not really add information.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-06 15:41 ` Michael Kerrisk (man-pages) @ 2015-02-09 6:46 ` Minchan Kim 2015-02-09 9:13 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 31+ messages in thread From: Minchan Kim @ 2015-02-09 6:46 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins Hello, Michael On Fri, Feb 06, 2015 at 04:41:12PM +0100, Michael Kerrisk (man-pages) wrote: > On 02/05/2015 02:07 AM, Minchan Kim wrote: > > Hello, > > > > On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote: > >> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote: > >>> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote: > >>>> > >>>> Hello Vlastimil, > >>>> > >>>> On 4 February 2015 at 14:46, Vlastimil Babka <vbabka@suse.cz> wrote: > >>>>>>> > >>>>>>> - that covers mlocking ok, not sure if the rest fits the "shared pages" > >>>>>>> case > >>>>>>> though. I dont see any check for other kinds of shared pages in the > >>>>>>> code. > >>>>>> > >>>>>> > >>>>>> Agreed. "shared" here seems confused. I've removed it. And I've > >>>>>> added mention of "Huge TLB pages" for this error. > >>>>> > >>>>> > >>>>> Thanks. > >>>> > >>>> > >>>> I also added those cases for MADV_REMOVE, BTW. > >>> > >>> > >>> Right. There's also the following for MADV_REMOVE that needs updating: > >>> > >>> "Currently, only shmfs/tmpfs supports this; other filesystems return with > >>> the error ENOSYS." > >>> > >>> - it's not just shmem/tmpfs anymore. It should be best to refer to > >>> fallocate(2) option FALLOC_FL_PUNCH_HOLE which seems to be (more) up to > >>> date. > >>> > >>> - AFAICS it doesn't return ENOSYS but EOPNOTSUPP. Also neither error code is > >>> listed in the ERRORS section. 
> >> > >> Yup, I recently added that as well, based on a patch from Jan Chaloupka. > >> > >>>>>>>>> - The word "will result" did sound as a guarantee at least to me. So > >>>>>>>>> here it > >>>>>>>>> could be changed to "may result (unless the advice is ignored)"? > >>>>>>>> > >>>>>>>> It's too late to fix documentation. Applications already depends on > >>>>>>>> the > >>>>>>>> beheviour. > >>>>>>> > >>>>>>> Right, so as long as they check for EINVAL, it should be safe. It > >>>>>>> appears > >>>>>>> that > >>>>>>> jemalloc does. > >>>>>> > >>>>>> So, first a brief question: in the cases where the call does not error > >>>>>> out, > >>>>>> are we agreed that in the current implementation, MADV_DONTNEED will > >>>>>> always result in zero-filled pages when the region is faulted back in > >>>>>> (when we consider pages that are not backed by a file)? > >>>>> > >>>>> I'd agree at this point. > >>>> > >>>> Thanks for the confirmation. > >>>> > >>>>> Also we should probably mention anonymously shared pages (shmem). I think > >>>>> they behave the same as file here. > >>>> > >>>> You mean tmpfs here, right? (I don't keep all of the synonyms straight.) > >>> > >>> shmem is tmpfs (that by itself would fit under "files" just fine), but also > >>> sys V segments created by shmget(2) and also mappings created by mmap with > >>> MAP_SHARED | MAP_ANONYMOUS. I'm not sure if there's a single manpage to > >>> refer to the full list. > >> > >> So, how about this text: > >> > >> After a successful MADV_DONTNEED operation, the seman‐ > >> tics of memory access in the specified region are > >> changed: subsequent accesses of pages in the range > >> will succeed, but will result in either reloading of > >> the memory contents from the underlying mapped file > >> (for shared file mappings, shared anonymous mappings, > >> and shmem-based techniques such as System V shared > >> memory segments) or zero-fill-on-demand pages for > >> anonymous private mappings. 
> > > > Hmm, I'd like to clarify. > > > > Whether it was intentional or not, some userspace developers thought > > that the syscall drops pages instantly if it returns without error, so that > > they will see more free pages (i.e., the RSS of the process will decrease) > > while keeping the VMA. Can we rely on it? > > I do not know. Michael? It's important to identify the difference between MADV_DONTNEED and MADV_FREE, so it would be better to clear this up while we have the chance. > > > And we should update the errors section, too. > > "locked" covers mlock(2) and you said you will add hugetlb. Then, > > VM_PFNMAP? In that case, it fails. What can we say about VM_PFNMAP? > > A special mapping for some drivers? > > I'm open for offers on what to add. I suggest quoting LWN, http://lwn.net/Articles/162860/: "*special mapping* which is not made up of "normal" pages. It is usually created by device drivers which map special memory areas into user space" > > > One more thing, "The kernel is free to ignore the advice". > > It conflicts with "This call does not influence the semantics of the > > application (except in the case of MADV_DONTNEED)", so > > is it okay to read it as "The kernel is free to ignore the advice > > except for MADV_DONTNEED"? > > I decided to just drop the sentence > > The kernel is free to ignore the advice. > > It creates misunderstandings, and does not really add information. Sounds good. > > Cheers, > > Michael > > -- > Michael Kerrisk > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ > Linux/UNIX System Programming Training: http://man7.org/training/ -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-09 6:46 ` Minchan Kim @ 2015-02-09 9:13 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 31+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-02-09 9:13 UTC (permalink / raw) To: Minchan Kim Cc: mtk.manpages, Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins Hello Minchan On 02/09/2015 07:46 AM, Minchan Kim wrote: > Hello, Michael > > On Fri, Feb 06, 2015 at 04:41:12PM +0100, Michael Kerrisk (man-pages) wrote: >> On 02/05/2015 02:07 AM, Minchan Kim wrote: >>> Hello, >>> >>> On Wed, Feb 04, 2015 at 08:24:27PM +0100, Michael Kerrisk (man-pages) wrote: >>>> On 4 February 2015 at 18:02, Vlastimil Babka <vbabka@suse.cz> wrote: >>>>> On 02/04/2015 03:00 PM, Michael Kerrisk (man-pages) wrote: [...] >>> And we should make error section, too. >>> "locked" covers mlock(2) and you said you will add hugetlb. Then, >>> VM_PFNMAP? In that case, it fails. How can we say about VM_PFNMAP? >>> special mapping for some drivers? >> >> I'm open for offers on what to add. > > I suggests from quote "LWN" http://lwn.net/Articles/162860/ > "*special mapping* which is not made up of "normal" pages. > It is usually created by device drivers which map special memory areas > into user space" Thanks. I've added mention of VM_PFNMAP in the discussion of both MADV_DONTNEED and MADV_REMOVE, and noted that both of those operations will give an error when applied to VM_PFNMAP pages. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-04 19:24 ` Michael Kerrisk (man-pages) 2015-02-05 1:07 ` Minchan Kim @ 2015-02-05 15:41 ` Michal Hocko 2015-02-06 15:57 ` Michael Kerrisk (man-pages) 1 sibling, 1 reply; 31+ messages in thread From: Michal Hocko @ 2015-02-05 15:41 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins On Wed 04-02-15 20:24:27, Michael Kerrisk wrote: [...] > So, how about this text: > > After a successful MADV_DONTNEED operation, the seman‐ > tics of memory access in the specified region are > changed: subsequent accesses of pages in the range > will succeed, but will result in either reloading of > the memory contents from the underlying mapped file " result in either providing the up-to-date contents of the underlying mapped file " Would be more precise IMO because reload might be interpreted as a major fault which is not necessarily the case (see below). > (for shared file mappings, shared anonymous mappings, > and shmem-based techniques such as System V shared > memory segments) or zero-fill-on-demand pages for > anonymous private mappings. Yes, this wording is better because many users are not aware of MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't mention that. I am just wondering whether it makes sense to mention that MADV_DONTNEED for shared mappings might be surprising and not freeing the backing pages thus not really freeing memory until there is a memory pressure. But maybe this is too implementation specific for a man page. What about the following wording on top of yours? " Please note that the MADV_DONTNEED hint on shared mappings might not lead to immediate freeing of pages in the range. The kernel is free to delay this until an appropriate moment. RSS of the calling process will be reduced however. 
" -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-05 15:41 ` Michal Hocko @ 2015-02-06 15:57 ` Michael Kerrisk (man-pages) 2015-02-06 20:45 ` Michal Hocko 2015-02-09 6:50 ` Minchan Kim 0 siblings, 2 replies; 31+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-02-06 15:57 UTC (permalink / raw) To: Michal Hocko Cc: mtk.manpages, Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins Hi Michael On 02/05/2015 04:41 PM, Michal Hocko wrote: > On Wed 04-02-15 20:24:27, Michael Kerrisk wrote: > [...] >> So, how about this text: >> >> After a successful MADV_DONTNEED operation, the seman‐ >> tics of memory access in the specified region are >> changed: subsequent accesses of pages in the range >> will succeed, but will result in either reloading of >> the memory contents from the underlying mapped file > > " > result in either providing the up-to-date contents of the underlying > mapped file > " Thanks! I did something like that. See below. > Would be more precise IMO because reload might be interpreted as a major > fault which is not necessarily the case (see below). > >> (for shared file mappings, shared anonymous mappings, >> and shmem-based techniques such as System V shared >> memory segments) or zero-fill-on-demand pages for >> anonymous private mappings. > > Yes, this wording is better because many users are not aware of > MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't > mention that. (Michal, would you have a text to propose to add to the mmap(2) page? Maybe it would be useful to add something there.) > > I am just wondering whether it makes sense to mention that MADV_DONTNEED > for shared mappings might be surprising and not freeing the backing > pages thus not really freeing memory until there is a memory > pressure. But maybe this is too implementation specific for a man > page. 
What about the following wording on top of yours? > " > Please note that the MADV_DONTNEED hint on shared mappings might not > lead to immediate freeing of pages in the range. The kernel is free to > delay this until an appropriate moment. RSS of the calling process will > be reduced however. > " Thanks! I added this, but dropped in the word "immediately" in the last sentence, since I assume that was implied. So now we have: After a successful MADV_DONTNEED operation, the seman‐ tics of memory access in the specified region are changed: subsequent accesses of pages in the range will succeed, but will result in either repopulating the mem‐ ory contents from the up-to-date contents of the under‐ lying mapped file (for shared file mappings, shared anonymous mappings, and shmem-based techniques such as System V shared memory segments) or zero-fill-on-demand pages for anonymous private mappings. Note that, when applied to shared mappings, MADV_DONT‐ NEED might not lead to immediate freeing of the pages in the range. The kernel is free to delay freeing the pages until an appropriate moment. The resident set size (RSS) of the calling process will be immediately reduced however. The current draft of the page can be found in a branch, http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_madvise Thanks, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 31+ messages in thread
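A minimal sketch of the private anonymous case described in the draft text above, assuming Linux with glibc. The helper name dontneed_private_anon_reads_zero() is hypothetical and does not come from the thread or from the man page. It dirties one page, calls madvise(MADV_DONTNEED), and checks that the next read sees zero-filled memory.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): returns 1 if a private
 * anonymous page reads back as all zeroes after MADV_DONTNEED, 0 if
 * the old contents survived, and -1 if mmap() or madvise() failed.
 */
int dontneed_private_anon_reads_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);               /* fault the page in and dirty it */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int all_zero = 1;
    for (size_t i = 0; i < len; i++) {  /* the first read refaults the page */
        if (p[i] != 0) {
            all_zero = 0;
            break;
        }
    }

    munmap(p, len);
    return all_zero;
}
```

On a kernel that follows the semantics agreed in this thread, the helper is expected to return 1. It exercises only the anonymous private mapping case, not file-backed or shared mappings.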
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-06 15:57 ` Michael Kerrisk (man-pages) @ 2015-02-06 20:45 ` Michal Hocko 2015-02-09 6:50 ` Minchan Kim 1 sibling, 0 replies; 31+ messages in thread From: Michal Hocko @ 2015-02-06 20:45 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Minchan Kim, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins On Fri 06-02-15 16:57:50, Michael Kerrisk wrote: [...] > > Yes, this wording is better because many users are not aware of > > MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't > > mention that. > > (Michal, would you have a text to propose to add to the mmap(2) page? > Maybe it would be useful to add something there.) I am half way on vacation, but I can cook a patch after I am back after week. > > I am just wondering whether it makes sense to mention that MADV_DONTNEED > > for shared mappings might be surprising and not freeing the backing > > pages thus not really freeing memory until there is a memory > > pressure. But maybe this is too implementation specific for a man > > page. What about the following wording on top of yours? > > " > > Please note that the MADV_DONTNEED hint on shared mappings might not > > lead to immediate freeing of pages in the range. The kernel is free to > > delay this until an appropriate moment. RSS of the calling process will > > be reduced however. > > " > > Thanks! I added this, but dropped in the word "immediately" in the last > sentence, since I assume that was implied. 
So now we have: > > After a successful MADV_DONTNEED operation, the seman‐ > tics of memory access in the specified region are > changed: subsequent accesses of pages in the range will > succeed, but will result in either repopulating the mem‐ > ory contents from the up-to-date contents of the under‐ > lying mapped file (for shared file mappings, shared > anonymous mappings, and shmem-based techniques such as > System V shared memory segments) or zero-fill-on-demand > pages for anonymous private mappings. > > Note that, when applied to shared mappings, MADV_DONT‐ > NEED might not lead to immediate freeing of the pages in > the range. The kernel is free to delay freeing the > pages until an appropriate moment. The resident set > size (RSS) of the calling process will be immediately > reduced however. This sounds good to me and it is definitely much better than the current state. Thanks! > The current draft of the page can be found in a branch, > http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_madvise > > Thanks, > > Michael > > > > -- > Michael Kerrisk > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ > Linux/UNIX System Programming Training: http://man7.org/training/ -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 31+ messages in thread
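A minimal sketch of the shared anonymous case from the wording above, assuming Linux with glibc. The helper name dontneed_shared_anon_keeps_data() is hypothetical. Because a MAP_SHARED | MAP_ANONYMOUS mapping is backed by shmem, the sketch assumes the contents are repopulated from that backing object rather than zero-filled.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): returns 1 if a shared
 * anonymous page still holds its data after MADV_DONTNEED, 0 if the
 * data changed, and -1 if mmap() or madvise() failed.
 */
int dontneed_shared_anon_keeps_data(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0x5A, len);               /* write a recognisable pattern */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int kept = 1;
    for (size_t i = 0; i < len; i++) {  /* refault from the shmem object */
        if (p[i] != 0x5A) {
            kept = 0;
            break;
        }
    }

    munmap(p, len);
    return kept;
}
```

The mapping is dropped from the caller's page tables, but the data lives on in the backing object, which is why the kernel may delay freeing the pages themselves.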
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-06 15:57 ` Michael Kerrisk (man-pages) 2015-02-06 20:45 ` Michal Hocko @ 2015-02-09 6:50 ` Minchan Kim 1 sibling, 0 replies; 31+ messages in thread From: Minchan Kim @ 2015-02-09 6:50 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Michal Hocko, Vlastimil Babka, Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Andrew Morton, lkml, Linux API, linux-man, Hugh Dickins On Fri, Feb 06, 2015 at 04:57:50PM +0100, Michael Kerrisk (man-pages) wrote: > Hi Michael > > On 02/05/2015 04:41 PM, Michal Hocko wrote: > > On Wed 04-02-15 20:24:27, Michael Kerrisk wrote: > > [...] > >> So, how about this text: > >> > >> After a successful MADV_DONTNEED operation, the seman‐ > >> tics of memory access in the specified region are > >> changed: subsequent accesses of pages in the range > >> will succeed, but will result in either reloading of > >> the memory contents from the underlying mapped file > > > > " > > result in either providing the up-to-date contents of the underlying > > mapped file > > " > > Thanks! I did something like that. See below. > > > Would be more precise IMO because reload might be interpreted as a major > > fault which is not necessarily the case (see below). > > > >> (for shared file mappings, shared anonymous mappings, > >> and shmem-based techniques such as System V shared > >> memory segments) or zero-fill-on-demand pages for > >> anonymous private mappings. > > > > Yes, this wording is better because many users are not aware of > > MAP_ANON|MAP_SHARED being file backed in fact and mmap man page doesn't > > mention that. > > (Michal, would you have a text to propose to add to the mmap(2) page? > Maybe it would be useful to add something there.) 
> > > > > I am just wondering whether it makes sense to mention that MADV_DONTNEED > > for shared mappings might be surprising and not freeing the backing > > pages thus not really freeing memory until there is a memory > > pressure. But maybe this is too implementation specific for a man > > page. What about the following wording on top of yours? > > " > > Please note that the MADV_DONTNEED hint on shared mappings might not > > lead to immediate freeing of pages in the range. The kernel is free to > > delay this until an appropriate moment. RSS of the calling process will > > be reduced however. > > " > > Thanks! I added this, but dropped in the word "immediately" in the last > sentence, since I assume that was implied. So now we have: > > After a successful MADV_DONTNEED operation, the seman‐ > tics of memory access in the specified region are > changed: subsequent accesses of pages in the range will > succeed, but will result in either repopulating the mem‐ > ory contents from the up-to-date contents of the under‐ > lying mapped file (for shared file mappings, shared > anonymous mappings, and shmem-based techniques such as > System V shared memory segments) or zero-fill-on-demand > pages for anonymous private mappings. > > Note that, when applied to shared mappings, MADV_DONT‐ > NEED might not lead to immediate freeing of the pages in > the range. The kernel is free to delay freeing the > pages until an appropriate moment. The resident set > size (RSS) of the calling process will be immediately > reduced however. Looks good. So, I can parse it that anonymous private mappings will lead to immediate freeing of the pages in the range so it's clearly different with MADV_FREE. 
> > The current draft of the page can be found in a branch, > http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_madvise > > Thanks, > > Michael > > > > -- > Michael Kerrisk > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ > Linux/UNIX System Programming Training: http://man7.org/training/ -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 11:42 ` Vlastimil Babka 2015-02-03 16:20 ` Michael Kerrisk (man-pages) @ 2015-02-04 0:09 ` Minchan Kim 1 sibling, 0 replies; 31+ messages in thread From: Minchan Kim @ 2015-02-04 0:09 UTC (permalink / raw) To: Vlastimil Babka Cc: Kirill A. Shutemov, Dave Hansen, Mel Gorman, linux-mm, Andrew Morton, linux-kernel, linux-api, mtk.manpages, linux-man, Rik van Riel On Tue, Feb 03, 2015 at 12:42:53PM +0100, Vlastimil Babka wrote: > On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote: > > On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote: > >> [CC linux-api, man pages] > >> > >> On 02/02/2015 11:22 PM, Dave Hansen wrote: > >> > On 02/02/2015 08:55 AM, Mel Gorman wrote: > >> >> This patch identifies when a thread is frequently calling MADV_DONTNEED > >> >> on the same region of memory and starts ignoring the hint. On an 8-core > >> >> single-socket machine this was the impact on ebizzy using glibc 2.19. > >> > > >> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is > >> > called: > >> > > >> >> MADV_DONTNEED > >> >> Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources > >> >> associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the > >> >> underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file. > >> > > >> > So if we have anything depending on the behavior that it's _always_ > >> > zero-filled after an MADV_DONTNEED, this will break it. > >> > >> OK, so that's a third person (including me) who understood it as a zero-fill > >> guarantee. I think the man page should be clarified (if it's indeed not > >> guaranteed), or we have a bug. 
> >> > >> The implementation actually skips MADV_DONTNEED for > >> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's. > > > > It doesn't skip. It fails with -EINVAL. Or am I missing something? > > No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in > this case: > > * The application is attempting to release locked or shared pages (with > MADV_DONTNEED). > > - that covers mlocking ok, not sure if the rest fits the "shared pages" case > though. I don't see any check for other kinds of shared pages in the code. > > >> - The word "will result" did sound as a guarantee at least to me. So here it > >> could be changed to "may result (unless the advice is ignored)"? > > > > It's too late to fix documentation. Applications already depend on the > > behaviour. > > Right, so as long as they check for EINVAL, it should be safe. It appears that > jemalloc does. > > I still wouldn't be sure just by reading the man page that the clearing is > guaranteed whenever I don't get an error return value, though, > IMHO, the man page says "MADV_DONTNEED: Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file." A heap allocated by malloc(3) consists of anonymous pages, so it is a mapping without an underlying file and userspace can expect zero-fill. The man page says "EINVAL: The application is attempting to release locked or shared pages (with MADV_DONTNEED)". So a user can expect that the call on an area allocated by malloc(3) will always succeed as long as he doesn't call mlock. The man page says "madvise: This call does not influence the semantics of the application (except in the case of MADV_DONTNEED)". So we shouldn't break MADV_DONTNEED's semantics of freeing pages instantly. It's a long-standing semantic, and it was one of the contentious issues with MADV_FREE when Rik tried, a long time ago, to replace MADV_DONTNEED with MADV_FREE.
-- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 31+ messages in thread
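A minimal sketch of the EINVAL case discussed above, assuming Linux with glibc and a RLIMIT_MEMLOCK that allows locking one page. The helper name dontneed_on_mlocked_errno() is hypothetical. Only the mlock(2) case is exercised; the hugetlb and VM_PFNMAP cases mentioned in the thread are not covered.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): locks one private anonymous
 * page with mlock(2), then calls madvise(MADV_DONTNEED) on it.
 * Returns the errno left by madvise() (0 if it succeeded), or -1 if
 * the mapping could not be set up or locked.
 */
int dontneed_on_mlocked_errno(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (mlock(p, len) != 0) {           /* e.g. RLIMIT_MEMLOCK too low */
        munmap(p, len);
        return -1;
    }

    int err = 0;
    if (madvise(p, len, MADV_DONTNEED) != 0)
        err = errno;                    /* EINVAL is expected for a locked VMA */

    munlock(p, len);
    munmap(p, len);
    return err;
}
```

An allocator that relies on zero-fill after MADV_DONTNEED would need to check this return value, since a process that called mlockall(2) would hit the same error path.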
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 8:19 ` MADV_DONTNEED semantics? Was: " Vlastimil Babka 2015-02-03 10:53 ` Kirill A. Shutemov @ 2015-02-03 11:16 ` Mel Gorman 2015-02-03 15:21 ` Michal Hocko 1 sibling, 1 reply; 31+ messages in thread From: Mel Gorman @ 2015-02-03 11:16 UTC (permalink / raw) To: Vlastimil Babka Cc: Dave Hansen, linux-mm, Minchan Kim, Andrew Morton, linux-kernel, linux-api, mtk.manpages, linux-man On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote: > [CC linux-api, man pages] > > On 02/02/2015 11:22 PM, Dave Hansen wrote: > > On 02/02/2015 08:55 AM, Mel Gorman wrote: > >> This patch identifies when a thread is frequently calling MADV_DONTNEED > >> on the same region of memory and starts ignoring the hint. On an 8-core > >> single-socket machine this was the impact on ebizzy using glibc 2.19. > > > > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is > > called: > > > >> MADV_DONTNEED > >> Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources > >> associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the > >> underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file. > > > > So if we have anything depending on the behavior that it's _always_ > > zero-filled after an MADV_DONTNEED, this will break it. > > OK, so that's a third person (including me) who understood it as a zero-fill > guarantee. I think the man page should be clarified (if it's indeed not > guaranteed), or we have a bug. > > The implementation actually skips MADV_DONTNEED for > VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's. > This was the first reason why I did not consider the zero-filling to be a guarantee. 
That said, at this point I'm also not considering pushing this patch towards the kernel. I agree that this is a glibc bug so I've dropped a line to some glibc people to see what they think the approach should be. > I'm not sure about VM_PFNMAP, these are probably special enough. For mlock, one > could expect that mlocking and MADV_DONTNEED would be in some opposition, but > it's not documented in the manpage AFAIK. Neither is the hugetlb case, which > could be really unexpected by the user. > The equivalent POSIX page also lacks details on how exactly this flag should behave. hugetlb is sort of special in that it's always backed by a ram-based file where the contents can be refaulted. It gets hairy when the mapping has been created to look anonymous but is not anonymous really. The semantics of hugetlb have always been fuzzy. > Next, what the man page says about guarantees: > > "The kernel is free to ignore the advice." > > - that would suggest that nothing is guaranteed > Yep, another reason why I did not clear the page when ignoring the hint. > "This call does not influence the semantics of the application (except in the > case of MADV_DONTNEED)" > > - that depends if the reader understands it as "does influence by MADV_DONTNEED" > or "may influence by MADV_DONTNEED" > > - btw, isn't MADV_DONTFORK another exception that does influence the semantics? > And since it's mentioned as a workaround for some hardware, is it OK to ignore > this advice? > MADV_DONTFORK is also a Linux-specific extension. It happens to be one that if it gets ignored then the application will be very surprised. > And the part you already cited: > > "Subsequent accesses of pages in this range will succeed, but will result either > in reloading of the memory contents from the underlying mapped file (see > mmap(2)) or zero-fill on-demand pages for mappings without an underlying file." > > - The word "will result" did sound as a guarantee at least to me.
So here it > could be changed to "may result (unless the advice is ignored)"? > The wording should be "may result" as there are circumstances where it gets ignored even without this prototype patch. > And if we agree that there is indeed no guarantee, what's the actual semantic > difference from MADV_FREE? I guess none? So there's only a possible performance > difference? > Timing. MADV_DONTNEED, if it has an effect, is immediate, is a heavier operation, and RSS is reduced. MADV_FREE only has an impact in the future if there is memory pressure. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 31+ messages in thread
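A minimal sketch of the "immediate" effect described above, assuming Linux with glibc and a private anonymous mapping. The helper name residency_around_dontneed() is hypothetical. It uses mincore(2) as a stand-in for watching RSS: the page is reported resident after it is touched and not resident right after MADV_DONTNEED. MADV_FREE is not exercised here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): stores in *before and *after
 * whether a single private anonymous page is resident, according to
 * mincore(2), before and after madvise(MADV_DONTNEED).
 * Returns 0 on success and -1 if any system call failed.
 */
int residency_around_dontneed(int *before, int *after)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;
    unsigned char vec = 0;              /* one status byte per page */

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 1, len);                  /* touch the page so it becomes resident */

    if (mincore(p, len, &vec) != 0)
        goto fail;
    *before = vec & 1;

    if (madvise(p, len, MADV_DONTNEED) != 0)
        goto fail;

    if (mincore(p, len, &vec) != 0)
        goto fail;
    *after = vec & 1;

    munmap(p, len);
    return 0;

fail:
    munmap(p, len);
    return -1;
}
```

The same check on a shared mapping could report the page as still cached, which matches the point made elsewhere in the thread that the pages of shared mappings need not be freed at once.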
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 11:16 ` Mel Gorman @ 2015-02-03 15:21 ` Michal Hocko 2015-02-03 16:25 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 31+ messages in thread From: Michal Hocko @ 2015-02-03 15:21 UTC (permalink / raw) To: Mel Gorman Cc: Vlastimil Babka, Dave Hansen, linux-mm, Minchan Kim, Andrew Morton, linux-kernel, linux-api, mtk.manpages, linux-man On Tue 03-02-15 11:16:00, Mel Gorman wrote: > On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote: [...] > > And if we agree that there is indeed no guarantee, what's the actual semantic > > difference from MADV_FREE? I guess none? So there's only a possible performance > > difference? > > > > Timing. MADV_DONTNEED, if it has an effect, is immediate, is a heavier > operation, and RSS is reduced. MADV_FREE only has an impact in the future > if there is memory pressure. JFTR, the man page for MADV_FREE has been proposed already (https://lkml.org/lkml/2014/12/5/63 should be the last version AFAIR). I do not see it in the man-pages git tree, but the patch was not in time for 3.19, so I guess it will only appear in 3.20. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 15:21 ` Michal Hocko @ 2015-02-03 16:25 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 31+ messages in thread From: Michael Kerrisk (man-pages) @ 2015-02-03 16:25 UTC (permalink / raw) To: Michal Hocko, Mel Gorman Cc: mtk.manpages, minchan Kim, Dave Hansen, linux-mm, Minchan Kim, Andrew Morton, linux-kernel, linux-api, linux-man On 02/03/2015 04:21 PM, Michal Hocko wrote: > On Tue 03-02-15 11:16:00, Mel Gorman wrote: >> On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote: > [...] >>> And if we agree that there is indeed no guarantee, what's the actual semantic >>> difference from MADV_FREE? I guess none? So there's only a possible performance >>> difference? >>> >> >> Timing. MADV_DONTNEED, if it has an effect, is immediate, is a heavier >> operation, and RSS is reduced. MADV_FREE only has an impact in the future >> if there is memory pressure. > > JFTR, the man page for MADV_FREE has been proposed already > (https://lkml.org/lkml/2014/12/5/63 should be the last version AFAIR). I > do not see it in the man-pages git tree, but the patch was not in time > for 3.19, so I guess it will only appear in 3.20. > Yikes! That patch was buried in the bottom of a locked filing cabinet in a disused lavatory. I unfortunately don't read every thread that comes my way, especially if it doesn't look like a man-pages patch (i.e., falls in the middle of an LKML thread that starts on another topic, and doesn't see linux-man@). I'll respond to that patch soon. (There are some problems that mean I could not accept it, AFAICT.) Thanks, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-02 22:22 ` Dave Hansen 2015-02-03 8:19 ` MADV_DONTNEED semantics? Was: " Vlastimil Babka @ 2015-02-03 9:47 ` Mel Gorman 2015-02-03 10:47 ` Kirill A. Shutemov 1 sibling, 1 reply; 31+ messages in thread From: Mel Gorman @ 2015-02-03 9:47 UTC (permalink / raw) To: Dave Hansen Cc: linux-mm, Minchan Kim, Vlastimil Babka, Andrew Morton, linux-kernel On Mon, Feb 02, 2015 at 02:22:36PM -0800, Dave Hansen wrote: > On 02/02/2015 08:55 AM, Mel Gorman wrote: > > This patch identifies when a thread is frequently calling MADV_DONTNEED > > on the same region of memory and starts ignoring the hint. On an 8-core > > single-socket machine this was the impact on ebizzy using glibc 2.19. > > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is > called: > It also claims that the kernel is free to ignore the advice. > > MADV_DONTNEED > > Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources > > associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the > > underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file. > > So if we have anything depending on the behavior that it's _always_ > zero-filled after an MADV_DONTNEED, this will break it. True. I'd be surprised if any application depended on that but to be safe, an ignored hint could clear the pages. It would still be cheaper than a full teardown and refault. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 9:47 ` Mel Gorman @ 2015-02-03 10:47 ` Kirill A. Shutemov 2015-02-03 11:21 ` Mel Gorman 0 siblings, 1 reply; 31+ messages in thread From: Kirill A. Shutemov @ 2015-02-03 10:47 UTC (permalink / raw) To: Mel Gorman Cc: Dave Hansen, linux-mm, Minchan Kim, Vlastimil Babka, Andrew Morton, linux-kernel On Tue, Feb 03, 2015 at 09:47:18AM +0000, Mel Gorman wrote: > On Mon, Feb 02, 2015 at 02:22:36PM -0800, Dave Hansen wrote: > > On 02/02/2015 08:55 AM, Mel Gorman wrote: > > > This patch identifies when a thread is frequently calling MADV_DONTNEED > > > on the same region of memory and starts ignoring the hint. On an 8-core > > > single-socket machine this was the impact on ebizzy using glibc 2.19. > > > > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is > > called: > > > > It also claims that the kernel is free to ignore the advice. > > > > MADV_DONTNEED > > > Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources > > > associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the > > > underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file. > > > > So if we have anything depending on the behavior that it's _always_ > > zero-filled after an MADV_DONTNEED, this will break it. > > True. I'd be surprised if any application depended on that IIUC, jemalloc depends on this[1]. [1] https://github.com/jemalloc/jemalloc/blob/dev/src/chunk_mmap.c#L117 -- Kirill A. Shutemov ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints 2015-02-03 10:47 ` Kirill A. Shutemov @ 2015-02-03 11:21 ` Mel Gorman 0 siblings, 0 replies; 31+ messages in thread From: Mel Gorman @ 2015-02-03 11:21 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Dave Hansen, linux-mm, Minchan Kim, Vlastimil Babka, Andrew Morton, linux-kernel On Tue, Feb 03, 2015 at 12:47:56PM +0200, Kirill A. Shutemov wrote: > On Tue, Feb 03, 2015 at 09:47:18AM +0000, Mel Gorman wrote: > > On Mon, Feb 02, 2015 at 02:22:36PM -0800, Dave Hansen wrote: > > > On 02/02/2015 08:55 AM, Mel Gorman wrote: > > > > This patch identifies when a thread is frequently calling MADV_DONTNEED > > > > on the same region of memory and starts ignoring the hint. On an 8-core > > > > single-socket machine this was the impact on ebizzy using glibc 2.19. > > > > > > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is > > > called: > > > > > > > It also claims that the kernel is free to ignore the advice. > > > > > > MADV_DONTNEED > > > > Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources > > > > associated with it.) Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents from the > > > > underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file. > > > > > > So if we have anything depending on the behavior that it's _always_ > > > zero-filled after an MADV_DONTNEED, this will break it. > > > > True. I'd be surprised if any application depended on that > > IIUC, jemalloc depends on this[1]. > > [1] https://github.com/jemalloc/jemalloc/blob/dev/src/chunk_mmap.c#L117 > Hope they never back regions with hugetlb then or fall apart if the process called mlockall -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 31+ messages in thread