LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Kirill Korotaev <dev@sw.ru>
To: Herbert Poetzl <herbert@13thfloor.at>
Cc: Paul Menage <menage@google.com>,
	akpm@osdl.org, pj@sgi.com, sekharan@us.ibm.com, xemul@sw.ru,
	serue@us.ibm.com, vatsa@in.ibm.com, rohitseth@google.com,
	winget@google.com, containers@lists.osdl.org,
	ckrm-tech@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/6] containers: BeanCounters over generic process containers
Date: Mon, 25 Dec 2006 13:16:48 +0300	[thread overview]
Message-ID: <458FA510.3040104@sw.ru> (raw)
In-Reply-To: <20061223194955.GA30764@MAIL.13thfloor.at>

Herbert,

>>This patch implements the BeanCounter resource control abstraction
>>over generic process containers. It contains the beancounter core
>>code, plus the numfiles resource counter. It doesn't currently contain
>>any of the memory tracking code or the code for switching beancounter
>>context in interrupts.
> 
> 
> I don't like it, it looks bloated and probably
> adds plenty of overhead (similar to the OVZ
> implementation where this seems to be taken from)
> here are some comments/questions:
Look like you have commented anything, but OpenVZ :)
sure you don't like it, cause it doesn't add racy dcache accounting and works :)
speaking about overhead: have you done a single measurement of BC code?

>>Currently all the beancounters resource counters are lumped into a
>>single hierarchy; ideally it would be possible for each resource
>>counter to be a separate container subsystem, allowing them to be
>>connected to different hierarchies.
>>
>>+static inline void bc_uncharge(struct beancounter *bc, int res_id,
>>+		unsigned long val)
>>+{
>>+	unsigned long flags;
>>+
>>+	spin_lock_irqsave(&bc->bc_lock, flags);
>>+	bc_uncharge_locked(bc, res_id, val);
>>+	spin_unlock_irqrestore(&bc->bc_lock, flags);
> 
> 
> why use a spinlock, when we could use atomic
> counters?
1. some resources can contribute to multiple BC params, thus need to be accounted
   atomically into 2 counters.
2. spin_lock()/spin_unlock() has the same 1 lock operation on SMP on most archs
3. using atomic counters you can't get a snapshot of current usages.
4. all the performance critical resources should be handled with pre-charges,
   as it is done in TCP/IP accounting in mainstream and as it is done for
   files/kmemsize in OpenVZ.

>>+int bc_charge_locked(struct beancounter *bc, int res, unsigned long val,
>>+		int strict, unsigned long flags)
>>+{
>>+	struct bc_resource_parm *parm;
>>+	unsigned long new_held;
>>+
>>+	BUG_ON(val > BC_MAXVALUE);
>>+
>>+	parm = &bc->bc_parms[res];
>>+	new_held = parm->held + val;
>>+
>>+	switch (strict) {
>>+	case BC_LIMIT:
>>+		if (new_held > parm->limit)
>>+			break;
>>+		/* fallthrough */
>>+	case BC_BARRIER:
>>+		if (new_held > parm->barrier) {
>>+			if (strict == BC_BARRIER)
>>+				break;
>>+			if (parm->held < parm->barrier &&
>>+					bc_resources[res]->bcr_barrier_hit)
>>+				bc_resources[res]->bcr_barrier_hit(bc);
>>+		}
> 
> 
> why do barrier checks with every accounting? 
> there are probably a few cases where the
> checks could be independant from the accounting
<<<< cause it simply doesn't worth optimizing.
its overhead is minor compared to accounting itself (atomic operations).

>>+		/* fallthrough */
>>+	case BC_FORCE:
>>+		parm->held = new_held;
>>+		bc_adjust_maxheld(parm);
> 
> 
> in what cases do we want to cross the barrier?
in this patchset it is not used AFAICS.
however, it was taken from the full BC patch where it is used to handle 
resource denials as most gracefully as possible.


>>+		return 0;
>>+	default:
>>+		BUG();
>>+	}
>>+
>>+	if (bc_resources[res]->bcr_limit_hit)
>>+		return bc_resources[res]->bcr_limit_hit(bc, val, flags);
>>+
>>+	parm->failcnt++;
>>+	return -ENOMEM;
> 
> 
>>+int bc_file_charge(struct file *file)
>>+{
>>+	int sev;
>>+	struct beancounter *bc;
>>+
>>+	task_lock(current);
> 
> 
> why do we lock current? it won't go away that
> easily, and for switching the bc, it might be 
> better to use RCU or a separate lock, no?
<<<< I guess it's countainers patch issue...

>>+	bc = task_bc(current);
>>+	css_get_current(&bc->css);
>>+	task_unlock(current);
>>+
>>+	sev = (capable(CAP_SYS_ADMIN) ? BC_LIMIT : BC_BARRIER);
>>+
>>+	if (bc_charge(bc, BC_NUMFILES, 1, sev)) {
>>+		css_put(&bc->css);
>>+		return -EMFILE;
>>+	}
>>+
>>+	file->f_bc = bc;
>>+	return 0;
>>+}
> 
> 
> also note that certain limits are much more
> complicated than the (very simple) file limits
> and the code will be called at higher frequency
Agree with this. This patch doesn't prove that BCs can be integrated to the
containers infrastructure.

> how to handle requests like:
>  try to get as 64 files or as many as available
>  whatever is smaller
Do you see any problems with these except for not-needed-anywhere-now? P)

Kirill


  parent reply	other threads:[~2006-12-25 10:07 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-22 14:14 [PATCH 0/6] containers: Generic Process Containers (V6) Paul Menage
2006-12-22 14:14 ` [PATCH 1/6] containers: Generic container system abstracted from cpusets code Paul Menage
2006-12-30 13:10   ` Eric W. Biederman
2006-12-31  5:17     ` Paul Jackson
2007-01-02 22:15     ` Paul Menage
2006-12-22 14:14 ` [PATCH 2/6] containers: Cpusets hooked into containers Paul Menage
2006-12-22 14:14 ` [PATCH 3/6] containers: Add generic multi-subsystem API to containers Paul Menage
2007-01-10 15:56   ` [ckrm-tech] " Balbir Singh
2007-01-11 22:53     ` Paul Menage
2007-01-12  6:29       ` Balbir Singh
2007-01-12  8:10         ` Paul Menage
2007-01-12  8:22           ` Balbir Singh
2007-01-20 17:27           ` Balbir Singh
2006-12-22 14:14 ` [PATCH 4/6] containers: Simple CPU accounting container subsystem Paul Menage
2007-01-10 14:21   ` [ckrm-tech] " Balbir Singh
2007-01-12  0:33     ` Paul Menage
2007-01-12  6:24       ` Balbir Singh
2007-01-12  8:15         ` Paul Menage
2007-01-12  8:26           ` Balbir Singh
2007-01-12 17:32             ` Paul Menage
2007-01-15  9:01       ` [PATCH 0/1] Add mount/umount callbacks to containers (Re: [ckrm-tech] [PATCH 4/6] containers: Simple CPU accounting container subsystem) Balbir Singh
2007-01-15  9:04         ` [PATCH 1/1] Fix a panic while mouting containers on powerpc and some other small cleanups " Balbir Singh
2007-01-15  9:22           ` Paul Menage
2007-01-15  9:51             ` [ckrm-tech] [PATCH 1/1] Fix a panic while mouting containers on powerpc and some other small cleanups (Re: " Balbir Singh
2007-01-15 10:01               ` Paul Menage
2007-01-15 10:10                 ` Balbir Singh
2006-12-22 14:14 ` [PATCH 5/6] containers: Resource Groups over generic containers Paul Menage
2006-12-22 14:14 ` [PATCH 6/6] containers: BeanCounters over generic process containers Paul Menage
2006-12-23 19:49   ` Herbert Poetzl
2006-12-24 11:32     ` Paul Menage
2006-12-25 10:16     ` Kirill Korotaev [this message]
2006-12-26  0:54       ` Paul Menage
2006-12-25 10:35     ` Pavel Emelianov
2007-01-03 14:43 ` [PATCH 0/6] containers: Generic Process Containers (V6) Serge E. Hallyn
2007-01-05  0:25   ` Paul Menage
2007-01-12 18:42     ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=458FA510.3040104@sw.ru \
    --to=dev@sw.ru \
    --cc=akpm@osdl.org \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=containers@lists.osdl.org \
    --cc=herbert@13thfloor.at \
    --cc=linux-kernel@vger.kernel.org \
    --cc=menage@google.com \
    --cc=pj@sgi.com \
    --cc=rohitseth@google.com \
    --cc=sekharan@us.ibm.com \
    --cc=serue@us.ibm.com \
    --cc=vatsa@in.ibm.com \
    --cc=winget@google.com \
    --cc=xemul@sw.ru \
    --subject='Re: [PATCH 6/6] containers: BeanCounters over generic process containers' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).