LKML Archive on lore.kernel.org
From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: linux-mm@kvack.org, YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	Paul Menage <menage@google.com>,
	lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	David Rientjes <rientjes@google.com>,
	Pavel Emelianov <xemul@openvz.org>,
	Dhaval Giani <dhaval@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [mm] [PATCH 3/4] Memory cgroup hierarchical reclaim
Date: Sun, 02 Nov 2008 11:14:48 +0530
Message-ID: <490D3E50.9070606@linux.vnet.ibm.com>
In-Reply-To: <20081102143707.1bf7e2d0.kamezawa.hiroyu@jp.fujitsu.com>

KAMEZAWA Hiroyuki wrote:
> On Sun, 02 Nov 2008 00:18:49 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
>> This patch introduces hierarchical reclaim. When an ancestor goes over its
>> limit, the charging routine points to the parent that is above its limit.
>> The reclaim process then starts from the last scanned child of the ancestor
>> and reclaims until the ancestor goes below its limit.
>>
>> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
>> ---
>>
>>  mm/memcontrol.c |  153 +++++++++++++++++++++++++++++++++++++++++++++++---------
>>  1 file changed, 129 insertions(+), 24 deletions(-)
>>
>> diff -puN mm/memcontrol.c~memcg-hierarchical-reclaim mm/memcontrol.c
>> --- linux-2.6.28-rc2/mm/memcontrol.c~memcg-hierarchical-reclaim	2008-11-02 00:14:59.000000000 +0530
>> +++ linux-2.6.28-rc2-balbir/mm/memcontrol.c	2008-11-02 00:14:59.000000000 +0530
>> @@ -132,6 +132,11 @@ struct mem_cgroup {
>>  	 * statistics.
>>  	 */
>>  	struct mem_cgroup_stat stat;
>> +	/*
>> +	 * While reclaiming in a hierarchy, we cache the last child we
>> +	 * reclaimed from.
>> +	 */
>> +	struct mem_cgroup *last_scanned_child;
>>  };
>>  static struct mem_cgroup init_mem_cgroup;
>>  
>> @@ -467,6 +472,125 @@ unsigned long mem_cgroup_isolate_pages(u
>>  	return nr_taken;
>>  }
>>  
>> +static struct mem_cgroup *
>> +mem_cgroup_from_res_counter(struct res_counter *counter)
>> +{
>> +	return container_of(counter, struct mem_cgroup, res);
>> +}
>> +
>> +/*
>> + * Dance down the hierarchy if needed to reclaim memory. We remember the
>> + * last child we reclaimed from, so that we don't end up penalizing
>> + * one child extensively based on its position in the children list
>> + */
>> +static int
>> +mem_cgroup_hierarchical_reclaim(struct mem_cgroup *mem, gfp_t gfp_mask)
>> +{
>> +	struct cgroup *cg, *cg_current, *cgroup;
>> +	struct mem_cgroup *mem_child;
>> +	int ret = 0;
>> +
>> +	if (try_to_free_mem_cgroup_pages(mem, gfp_mask))
>> +		return 0;
>> +
>> +	/*
>> +	 * try_to_free_mem_cgroup_pages() might not give us a full
>> +	 * picture of reclaim. Some pages are reclaimed and might be
>> +	 * moved to swap cache or just unmapped from the cgroup.
>> +	 * Check the limit again to see if the reclaim reduced the
>> +	 * current usage of the cgroup before giving up
>> +	 */
>> +	if (res_counter_check_under_limit(&mem->res))
>> +		return 0;
>> +
>> +	/*
>> +	 * Scan all children under the mem_cgroup mem
>> +	 */
>> +	if (!mem->last_scanned_child)
>> +		cgroup = list_first_entry(&mem->css.cgroup->children,
>> +				struct cgroup, sibling);
>> +	else
>> +		cgroup = mem->last_scanned_child->css.cgroup;
>> +
>> +	cg_current = cgroup;
>> +
>> +	/*
>> +	 * We iterate twice. Children are inserted with list_add(), so
>> +	 * the list behaves like a stack, and
>> +	 * list_for_each_entry_safe_from() stops after it reaches the
>> +	 * first-created child at the tail. The two loops together make
>> +	 * the scan independent of insertion order and give us a full
>> +	 * pass over all list entries for reclaim.
>> +	 */
>> +	list_for_each_entry_safe_from(cgroup, cg, &cg_current->parent->children,
>> +						 sibling) {
>> +		mem_child = mem_cgroup_from_cont(cgroup);
>> +
>> +		/*
>> +		 * Move beyond last scanned child
>> +		 */
>> +		if (mem_child == mem->last_scanned_child)
>> +			continue;
>> +
>> +		ret = try_to_free_mem_cgroup_pages(mem_child, gfp_mask);
>> +		mem->last_scanned_child = mem_child;
>> +
>> +		if (res_counter_check_under_limit(&mem->res)) {
>> +			ret = 0;
>> +			goto done;
>> +		}
>> +	}
> 
> Is this safe against cgroup create/remove ? cgroup_mutex is held ?

Yes, I thought about it. With this setup, every parent is busy because it has
children and hence cannot be removed. The leaf child itself has tasks, so it
cannot be removed either. IOW, the walk should be safe against removal.
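
The removal argument above can be sketched in userspace C. This is only an
illustration, assuming the usual rule that a group with children or attached
tasks is busy; struct node and node_try_remove() are hypothetical names, not
kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a cgroup; not the kernel struct. */
struct node {
	struct node *parent;	/* NULL for the root of the sketch */
	int nr_children;	/* live child groups */
	int nr_tasks;		/* tasks attached to this group */
};

/*
 * Model of the removal rule: a group that still has children or
 * tasks is busy and cannot go away.
 */
static int node_try_remove(struct node *n)
{
	assert(n != NULL);

	if (n->nr_children > 0 || n->nr_tasks > 0)
		return -EBUSY;

	/* Removal succeeded: the parent has one child fewer. */
	if (n->parent)
		n->parent->nr_children--;
	return 0;
}
```

In this model a child with neither tasks nor children passes the check, so
the argument holds only while every group on the walk stays busy.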

For creation we might need to hold the mutex; I'll review that part of the code.
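
A rough userspace sketch of the creation case: the child walk and child
creation take the same lock, so a new child cannot be linked in mid-walk. All
names here (struct group, struct parent, parent_add_child(), parent_reclaim())
are hypothetical, the pthread mutex only stands in for cgroup_mutex, and an
array replaces the kernel's list of children.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical child group: only its usage matters for the sketch. */
struct group {
	long usage;
};

struct parent {
	pthread_mutex_t lock;		/* stands in for cgroup_mutex */
	struct group *children[MAX_CHILDREN];
	int nr_children;
	int last_scanned;		/* index of last scanned child, -1 if none */
	long usage, limit;		/* usage includes all children */
};

/* Creation takes the same lock as the reclaim walk. */
static int parent_add_child(struct parent *p, struct group *g)
{
	int ret = -1;

	pthread_mutex_lock(&p->lock);
	if (p->nr_children < MAX_CHILDREN) {
		p->children[p->nr_children++] = g;
		ret = 0;
	}
	pthread_mutex_unlock(&p->lock);
	return ret;
}

/*
 * Reclaim up to 'batch' from each child in turn, starting after the
 * last scanned child, until the parent is under its limit.
 * Returns 0 if the parent ended up under its limit, -1 otherwise.
 */
static int parent_reclaim(struct parent *p, long batch)
{
	int ret = -1;

	pthread_mutex_lock(&p->lock);
	if (p->usage < p->limit)
		ret = 0;
	for (int i = 0; ret && i < p->nr_children; i++) {
		int idx = (p->last_scanned + 1) % p->nr_children;
		struct group *g = p->children[idx];
		long freed = g->usage < batch ? g->usage : batch;

		g->usage -= freed;
		p->usage -= freed;
		p->last_scanned = idx;	/* next call resumes after this child */
		if (p->usage < p->limit)
			ret = 0;
	}
	pthread_mutex_unlock(&p->lock);
	return ret;
}
```

Holding one lock across the whole walk is the simplest choice for the sketch.
Whether the real reclaim path can afford to hold cgroup_mutex is exactly the
part that still needs review.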

Thanks for the comments,

-- 
	Balbir

