LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Hugh Dickins <hugh@veritas.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>,
	Pavel Emelyanov <xemul@openvz.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sudhir Kumar <skumar@linux.vnet.ibm.com>,
	YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	Paul Menage <menage@google.com>,
	lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
	taka@valinux.co.jp, linux-mm@kvack.org,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH] Move memory controller allocations to their own slabs (v2)
Date: Wed, 12 Mar 2008 14:38:01 +1100	[thread overview]
Message-ID: <200803121438.02857.nickpiggin@yahoo.com.au> (raw)
In-Reply-To: <Pine.LNX.4.64.0803111256110.18261@blonde.site>

On Wednesday 12 March 2008 00:05, Hugh Dickins wrote:
> On Tue, 11 Mar 2008, Balbir Singh wrote:
> > On my 64 bit powerpc system (structure size could be different on other
> > systems)
> >
> > 1. sizeof page_cgroup is 40 bytes
> >    which means kmalloc will allocate 64 bytes
> > 2. With 4K pagesize SLAB with HWCACHE_ALIGN, 59 objects are packed per
> > slab 3. With SLUB the value is 102 per slab
>
> I expect you got those numbers with 2.6.25-rc4?  Towards the end of -rc5
> there's a patch from Nick to make SLUB's treatment of HWCACHE_ALIGN the
> same as SLAB's, so I expect you'd be back to a similar poor density with
> SLUB too.  (But I'm replying without actually testing it out myself.)

Yes, that will be the case.

With a 64 byte cacheline size, 

page_cgroup: |----|----|----|----|----|----|----|----
cacheline:   |-------|-------|-------|-------|-------

So if you are accessing each field in a random page_cgroup, then it
will statistically cost you 1.5 cachelines (really: half the time, it
takes 2 cachelines).

If you HWCACHE_ALIGN this, it will cost you just 1 line.

Long live the size/speed tradeoff ;)

Whether this improvement is actually noticable is a different matter.


> I think you'd need a strong reason to choose HWCACHE_ALIGN for these.
>
> Consider: the (normal configuration) x86_64 struct page size was 56
> bytes for a long time (and still is without MEM_RES_CTLR), but we've
> never inserted padding to make that a round 64 bytes (and they would
> benefit additionally from some simpler arithmetic, not the case with
> page_cgroups).

I actually still think padding struct page will be a good idea. And
hopefully we can get rid of the page_cgroups pointer out of struct
page in order that we can have both config cases padded to 64 bytes
and use the extra space for larger refcounters maybe ;)

I simply haven't tried to run a realistic workload where it would
matter a great deal. I have a feeling that the oltp workloads could
see some improvement, because they're mainly just using the kernel
for direct IO, struct page is one of the few data structures that
they actually spend any cache on.


> Though it's good to avoid unnecessary sharing and 
> multiple cacheline accesses, it's not so good as to justify almost
> doubling the size of a very very common structure.  I think.

Anyway, back on topic from my rant, yes I agree with Hugh: I think
there should be some demonstration of a speedup (even in a "best
case" situation), before spending more bytes on a common data
structure.


  reply	other threads:[~2008-03-12  3:38 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-11  6:18 Balbir Singh
2008-03-11  8:11 ` Pavel Emelyanov
2008-03-11  8:15   ` Balbir Singh
2008-03-11  8:35     ` Pavel Emelyanov
2008-03-11 11:09       ` Balbir Singh
2008-03-11 13:05         ` Hugh Dickins
2008-03-12  3:38           ` Nick Piggin [this message]
2008-03-12  3:48           ` Balbir Singh
2008-03-11 12:55 ` Hugh Dickins
2008-03-11 18:47   ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200803121438.02857.nickpiggin@yahoo.com.au \
    --to=nickpiggin@yahoo.com.au \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=hugh@veritas.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=menage@google.com \
    --cc=rientjes@google.com \
    --cc=skumar@linux.vnet.ibm.com \
    --cc=taka@valinux.co.jp \
    --cc=xemul@openvz.org \
    --cc=yamamoto@valinux.co.jp \
    --subject='Re: [PATCH] Move memory controller allocations to their own slabs (v2)' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).