LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Vladimir Davydov <vdavydov@parallels.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH -mm 3/3] slub: make dead caches discard free slabs immediately
Date: Tue, 27 Jan 2015 17:00:09 +0900	[thread overview]
Message-ID: <20150127080009.GB11358@js1304-P5Q-DELUXE> (raw)
In-Reply-To: <42d95683e3c7f4bb00be4d777e2b334e8981d552.1422275084.git.vdavydov@parallels.com>

On Mon, Jan 26, 2015 at 03:55:29PM +0300, Vladimir Davydov wrote:
> To speed up further allocations SLUB may store empty slabs in per
> cpu/node partial lists instead of freeing them immediately. This
> prevents per memcg caches destruction, because kmem caches created for a
> memory cgroup are only destroyed after the last page charged to the
> cgroup is freed.
> 
> To fix this issue, this patch resurrects approach first proposed in [1].
> It forbids SLUB to cache empty slabs after the memory cgroup that the
> cache belongs to was destroyed. It is achieved by setting kmem_cache's
> cpu_partial and min_partial constants to 0 and tuning put_cpu_partial()
> so that it would drop frozen empty slabs immediately if cpu_partial = 0.
> 
> The runtime overhead is minimal. From all the hot functions, we only
> touch relatively cold put_cpu_partial(): we make it call
> unfreeze_partials() after freezing a slab that belongs to an offline
> memory cgroup. Since slab freezing exists to avoid moving slabs from/to
> a partial list on free/alloc, and there can't be allocations from dead
> caches, it shouldn't cause any overhead. We do have to disable
> preemption for put_cpu_partial() to achieve that though.
> 
> The original patch was accepted well and even merged to the mm tree.
> However, I decided to withdraw it due to changes happening to the memcg
> core at that time. I had an idea of introducing per-memcg shrinkers for
> kmem caches, but now, as memcg has finally settled down, I do not see it
> as an option, because SLUB shrinker would be too costly to call since
> SLUB does not keep free slabs on a separate list. Besides, we currently
> do not even call per-memcg shrinkers for offline memcgs. Overall, it
> would introduce much more complexity to both SLUB and memcg than this
> small patch.
> 
> Regarding to SLAB, there's no problem with it, because it shrinks
> per-cpu/node caches periodically. Thanks to list_lru reparenting, we no
> longer keep entries for offline cgroups in per-memcg arrays (such as
> memcg_cache_params->memcg_caches), so we do not have to bother if a
> per-memcg cache will be shrunk a bit later than it could be.
> 
> [1] http://thread.gmane.org/gmane.linux.kernel.mm/118649/focus=118650
> 
> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
> ---
>  mm/slab.c        |    2 +-
>  mm/slab.h        |    2 +-
>  mm/slab_common.c |   15 +++++++++++++--
>  mm/slob.c        |    2 +-
>  mm/slub.c        |   25 ++++++++++++++++++++-----
>  5 files changed, 36 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 279c44d6d8e1..f0514df07b85 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2400,7 +2400,7 @@ static int __cache_shrink(struct kmem_cache *cachep)
>  	return (ret ? 1 : 0);
>  }
>  
> -void __kmem_cache_shrink(struct kmem_cache *cachep)
> +void __kmem_cache_shrink(struct kmem_cache *cachep, bool deactivate)
>  {
>  	__cache_shrink(cachep);
>  }
> diff --git a/mm/slab.h b/mm/slab.h
> index c036e520d2cf..041260197984 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -138,7 +138,7 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
>  #define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)
>  
>  int __kmem_cache_shutdown(struct kmem_cache *);
> -void __kmem_cache_shrink(struct kmem_cache *);
> +void __kmem_cache_shrink(struct kmem_cache *, bool);
>  void slab_kmem_cache_release(struct kmem_cache *);
>  
>  struct seq_file;
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 6803639fdff0..472ab7fcffd4 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -549,10 +549,13 @@ void memcg_deactivate_kmem_caches(struct mem_cgroup *memcg)
>  {
>  	int idx;
>  	struct memcg_cache_array *arr;
> -	struct kmem_cache *s;
> +	struct kmem_cache *s, *c;
>  
>  	idx = memcg_cache_id(memcg);
>  
> +	get_online_cpus();
> +	get_online_mems();
> +
>  	mutex_lock(&slab_mutex);
>  	list_for_each_entry(s, &slab_caches, list) {
>  		if (!is_root_cache(s))
> @@ -560,9 +563,17 @@ void memcg_deactivate_kmem_caches(struct mem_cgroup *memcg)
>  
>  		arr = rcu_dereference_protected(s->memcg_params.memcg_caches,
>  						lockdep_is_held(&slab_mutex));
> +		c = arr->entries[idx];
> +		if (!c)
> +			continue;
> +
> +		__kmem_cache_shrink(c, true);
>  		arr->entries[idx] = NULL;
>  	}
>  	mutex_unlock(&slab_mutex);
> +
> +	put_online_mems();
> +	put_online_cpus();
>  }
>  
>  void memcg_destroy_kmem_caches(struct mem_cgroup *memcg)
> @@ -646,7 +657,7 @@ void kmem_cache_shrink(struct kmem_cache *cachep)
>  {
>  	get_online_cpus();
>  	get_online_mems();
> -	__kmem_cache_shrink(cachep);
> +	__kmem_cache_shrink(cachep, false);
>  	put_online_mems();
>  	put_online_cpus();
>  }
> diff --git a/mm/slob.c b/mm/slob.c
> index 043a14b6ccbe..e63ff9d926dc 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -618,7 +618,7 @@ int __kmem_cache_shutdown(struct kmem_cache *c)
>  	return 0;
>  }
>  
> -void __kmem_cache_shrink(struct kmem_cache *c)
> +void __kmem_cache_shrink(struct kmem_cache *c, bool deactivate)
>  {
>  }
>  
> diff --git a/mm/slub.c b/mm/slub.c
> index c09d93dde40e..6f57824af019 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2007,6 +2007,7 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
>  	int pages;
>  	int pobjects;
>  
> +	preempt_disable();
>  	do {
>  		pages = 0;
>  		pobjects = 0;
> @@ -2040,6 +2041,14 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
>  
>  	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page)
>  								!= oldpage);
> +	if (unlikely(!s->cpu_partial)) {
> +		unsigned long flags;
> +
> +		local_irq_save(flags);
> +		unfreeze_partials(s, this_cpu_ptr(s->cpu_slab));
> +		local_irq_restore(flags);
> +	}
> +	preempt_enable();
>  #endif
>  }
>  
> @@ -3368,7 +3377,7 @@ EXPORT_SYMBOL(kfree);
>   * being allocated from last increasing the chance that the last objects
>   * are freed in them.
>   */
> -void __kmem_cache_shrink(struct kmem_cache *s)
> +void __kmem_cache_shrink(struct kmem_cache *s, bool deactivate)
>  {
>  	int node;
>  	int i;
> @@ -3381,6 +3390,15 @@ void __kmem_cache_shrink(struct kmem_cache *s)
>  		kmalloc(sizeof(struct list_head) * objects, GFP_KERNEL);
>  	unsigned long flags;
>  
> +	if (deactivate) {
> +		/*
> +		 * Disable empty slabs caching. Used to avoid pinning offline
> +		 * memory cgroups by freeable kmem pages.
> +		 */
> +		s->cpu_partial = 0;
> +		s->min_partial = 0;
> +	}
> +

Hello,

Maybe, kick_all_cpus_sync() is needed here since object would
be freed asynchronously so they can't see this updated value.

Thanks.

  reply	other threads:[~2015-01-27  7:58 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-26 12:55 [PATCH -mm 0/3] " Vladimir Davydov
2015-01-26 12:55 ` [PATCH -mm 1/3] slub: don't fail kmem_cache_shrink if slab placement optimization fails Vladimir Davydov
2015-01-26 15:48   ` Christoph Lameter
2015-01-26 17:01     ` Vladimir Davydov
2015-01-26 18:24       ` Christoph Lameter
2015-01-26 19:36         ` Vladimir Davydov
2015-01-26 19:53           ` Christoph Lameter
2015-01-27 12:58             ` Vladimir Davydov
2015-01-27 17:02               ` Christoph Lameter
2015-01-28 15:00                 ` Vladimir Davydov
2015-01-26 12:55 ` [PATCH -mm 2/3] slab: zap kmem_cache_shrink return value Vladimir Davydov
2015-01-26 15:49   ` Christoph Lameter
2015-01-26 17:04     ` Vladimir Davydov
2015-01-26 18:26       ` Christoph Lameter
2015-01-26 19:48         ` Vladimir Davydov
2015-01-26 19:55           ` Christoph Lameter
2015-01-26 20:16             ` Vladimir Davydov
2015-01-26 20:28               ` Christoph Lameter
2015-01-26 20:43                 ` Vladimir Davydov
2015-01-26 12:55 ` [PATCH -mm 3/3] slub: make dead caches discard free slabs immediately Vladimir Davydov
2015-01-27  8:00   ` Joonsoo Kim [this message]
2015-01-27  8:23     ` Vladimir Davydov
2015-01-27  9:21       ` Joonsoo Kim
2015-01-27  9:28         ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150127080009.GB11358@js1304-P5Q-DELUXE \
    --to=iamjoonsoo.kim@lge.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=vdavydov@parallels.com \
    --subject='Re: [PATCH -mm 3/3] slub: make dead caches discard free slabs immediately' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).