From: Michal Hocko <mhocko@suse.cz>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>, Shaohua Li <shli@kernel.org>,
	Yalin.Wang@sonymobile.com
Subject: Re: [PATCH RFC 1/4] mm: throttle MADV_FREE
Date: Tue, 24 Feb 2015 16:43:18 +0100
Message-ID: <20150224154318.GA14939@dhcp22.suse.cz>
In-Reply-To: <1424765897-27377-1-git-send-email-minchan@kernel.org>

On Tue 24-02-15 17:18:14, Minchan Kim wrote:
> Recently, Shaohua reported that MADV_FREE is much slower than
> MADV_DONTNEED in his MADV_FREE bomb test. The reason is that many
> applications stalled in direct reclaim because kswapd's reclaim
> speed isn't faster than the applications' allocation speed,
> which causes lots of stalls and lock contention.

I am not sure I understand this correctly. So the issue is that there is
a huge number of MADV_FREE pages on the LRU and they are not close to the
tail of the list, so reclaim has to do a lot of work before it starts
dropping them?

> This patch throttles MADV_FREEing so that it takes effect only if
> there are enough free pages in the system that background/direct
> reclaim will not be triggered. Otherwise, MADV_FREE falls back to
> MADV_DONTNEED because there is no point in delaying freeing if we
> know the system is under memory pressure.

Hmm, this still conforms to the documentation because the kernel is
free to free pages at its convenience. I am not sure this is a good
idea, though. Why should some MADV_FREE calls be treated differently?
Wouldn't that lead to hard-to-predict behavior? E.g. LIFO reused blocks
would work without long stalls most of the time, except when there is
memory pressure.

The comparison to MADV_DONTNEED is not very fair IMHO because the scope
of the two calls is different.

> When I tested the patch on my 3G machine with 12 CPUs and 8G swap,
> test: 12 processes
> 
> loop = 5;
> mmap(512M);

Who is eating the rest of the memory?

> while (loop--) {
> 	memset(512M);
> 	madvise(MADV_FREE or MADV_DONTNEED);
> }
> 
> 1) dontneed: 6.78user 234.09system 0:48.89elapsed
> 2) madvfree: 6.03user 401.17system 1:30.67elapsed
> 3) madvfree + this patch: 5.68user 113.42system 0:36.52elapsed
> 
> It's clearly a win.
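
For anyone who wants to try the quoted workload, here is a minimal
userspace sketch of what one of the test processes does. It is not the
original test program: the function name run_madvise_loop is
hypothetical, and the fallback value 8 for MADV_FREE is an assumption
for headers that do not define it (a kernel without MADV_FREE support
will reject that advice). The quoted test ran 12 such processes
concurrently with a 512M mapping and loop = 5.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and madvise() in glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8		/* assumption: value used by kernels that support it */
#endif

/*
 * Hypothetical helper: map 'len' bytes of anonymous memory, then
 * 'loops' times dirty every page and hand the range back to the
 * kernel with 'advice' (MADV_FREE or MADV_DONTNEED).
 * Returns 0 on success, -1 on failure.
 */
static int run_madvise_loop(size_t len, int loops, int advice)
{
	char *buf;

	assert(len > 0 && loops >= 0);

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	while (loops--) {
		/* touch every page so there is something to free */
		memset(buf, 1, len);
		if (madvise(buf, len, advice) != 0) {
			munmap(buf, len);
			return -1;
		}
	}
	return munmap(buf, len);
}
```

Calling run_madvise_loop(512UL << 20, 5, MADV_DONTNEED) from main() in
12 processes at once, timed with time(1), would approximate case 1)
above; passing MADV_FREE would approximate case 2).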
> 
> Reported-by: Shaohua Li <shli@kernel.org>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

I don't know. This looks like a hack with hard-to-predict consequences
that might trigger pathological corner cases.

> ---
>  mm/madvise.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 6d0fcb8921c2..81bb26ecf064 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -523,8 +523,17 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
>  		 * XXX: In this implementation, MADV_FREE works like
>  		 * MADV_DONTNEED on swapless system or full swap.
>  		 */
> -		if (get_nr_swap_pages() > 0)
> -			return madvise_free(vma, prev, start, end);
> +		if (get_nr_swap_pages() > 0) {
> +			unsigned long threshold;
> +			/*
> +			 * If we have trobule with memory pressure(ie,
> +			 * under high watermark), free pages instantly.
> +			 */
> +			threshold = min_free_kbytes >> (PAGE_SHIFT - 10);
> +			threshold = threshold + (threshold >> 1);

Why threshold += threshold >> 1 ?
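
To make the arithmetic in the hunk concrete, here is a small userspace
sketch of the two threshold lines. It assumes 4 KiB pages (PAGE_SHIFT of
12); the helper names madv_free_threshold_pages and use_lazy_free are
hypothetical and not part of the patch. The first shift converts
min_free_kbytes from kB to pages, and the second line adds half again,
giving 1.5 times that value. Going by the comment in the patch, the
factor appears intended to approximate the high watermark, but that
reading is an assumption.

```c
#include <assert.h>

/* Assumed for illustration: 4 KiB pages. The real PAGE_SHIFT is per-arch. */
#define PAGE_SHIFT 12

/* Hypothetical helper mirroring the two threshold lines of the patch. */
static unsigned long madv_free_threshold_pages(unsigned long min_free_kbytes)
{
	unsigned long threshold;

	/* the kB -> pages shift below only makes sense for pages >= 1 kB */
	assert(PAGE_SHIFT >= 10);

	/* one page is 2^(PAGE_SHIFT - 10) kB, so this converts kB to pages */
	threshold = min_free_kbytes >> (PAGE_SHIFT - 10);

	/* threshold + threshold / 2, i.e. 1.5 times the value in pages */
	return threshold + (threshold >> 1);
}

/*
 * Hypothetical helper: the decision the patched code makes.
 * Nonzero means MADV_FREE stays lazy; zero means it falls back
 * to the MADV_DONTNEED path.
 */
static int use_lazy_free(unsigned long nr_free_pages,
			 unsigned long min_free_kbytes)
{
	return nr_free_pages > madv_free_threshold_pages(min_free_kbytes);
}
```

With min_free_kbytes = 4096 (a sample value, not taken from this
thread), the conversion gives 1024 pages and the threshold is 1536
pages, so lazy freeing is used only while more than 1536 pages are free.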

> +			if (nr_free_pages() > threshold)
> +				return madvise_free(vma, prev, start, end);
> +		}
>  		/* passthrough */
>  	case MADV_DONTNEED:
>  		return madvise_dontneed(vma, prev, start, end);
> -- 
> 1.9.1

-- 
Michal Hocko
SUSE Labs
