LKML Archive on lore.kernel.org
* Tiny cpusets -- cpusets for small systems?
@ 2008-02-23 12:09 Paul Jackson
  2008-02-23 15:09 ` Paul Menage
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Paul Jackson @ 2008-02-23 12:09 UTC (permalink / raw)
  To: KOSAKI Motohiro, Max Krasnyanskiy
  Cc: Alan Cox, Andreas Dilger, Andrew Morton, Daniel Spang,
	Ingo Molnar, Jon Masters, Marcelo Tosatti, Paul Menage,
	Pavel Machek, Peter Zijlstra, Rik van Riel, LKML

A couple of proposals have been made recently by people working on Linux
for smaller systems, to improve realtime isolation and memory
pressure handling:

(1) cpu isolation for hard(er) realtime
	http://lkml.org/lkml/2008/2/21/517
	Max Krasnyanskiy <maxk@qualcomm.com>
	[PATCH sched-devel 0/7] CPU isolation extensions

(2) notify user space of tight memory
	http://lkml.org/lkml/2008/2/9/144
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
	[PATCH 0/8][for -mm] mem_notify v6

In both cases, some of us have responded "why not use cpusets", and the
original submitters have replied "cpusets are too fat"  (well, they
were more diplomatic than that, but I guess I can say that ;)

I wonder if there might be room for a "tiny cpusets" configuration
option:
  * provide the same hooks to the rest of the kernel, and
  * provide the same syntactic interface to user space, but
  * with more limited semantics.

The primary semantic limit I'd suggest would be supporting exactly
one layer depth of cpusets, not a full hierarchy.  So one could still
successfully issue from user space 'mkdir /dev/cpuset/foo', but trying
to do 'mkdir /dev/cpuset/foo/bar' would fail.  This reminds me of
very early FAT file systems, which had just a single, fixed size
root directory ;).  There might even be a configurable fixed upper
limit on how many /dev/cpuset/* directories were allowed, further
simplifying the locking and dynamic memory behavior of this apparatus.

Some other features that aren't so easy to implement, and which have
less value on small systems, such as notify_on_release, could also be
stubbed out and always disabled, simply returning error if requested
to be enabled from user space.  The recent, chunky piece of code
needed to compute dynamic sched domains from the cpuset hierarchy
probably admits of a simpler variant in the tiny cpuset configuration.

I suppose it would still be a vfs-based pseudo file system (even
embedded Linux still has that infrastructure), except that the vfs
operator functions could be simpler, as this would really be just
a flat set of cpumask_t's and nodemask_t's at the core of the
implementation, not an arbitrarily nested hierarchy of them.  See
further my comments on cgroups, below.
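
Roughly, I imagine the core of such a tiny cpuset implementation being not
much more than the following sketch (the names, the fixed limit and the
error codes are invented for illustration, not taken from any patch):

#define TINY_CPUSETS_MAX	8		/* hypothetical fixed limit */

struct tiny_cpuset {
	char		name[16];	/* directory name under /dev/cpuset */
	cpumask_t	cpus_allowed;
	nodemask_t	mems_allowed;
	int		in_use;
};

static struct tiny_cpuset tiny_cpusets[TINY_CPUSETS_MAX];

/* mkdir: only one level below the root cpuset is ever allowed */
static int tiny_cpuset_mkdir(struct tiny_cpuset *parent, const char *name)
{
	int i;

	if (parent != NULL)		/* 'mkdir /dev/cpuset/foo/bar' fails */
		return -EPERM;
	for (i = 0; i < TINY_CPUSETS_MAX; i++)
		if (!tiny_cpusets[i].in_use)
			break;
	if (i == TINY_CPUSETS_MAX)	/* fixed upper limit reached */
		return -ENOSPC;
	strlcpy(tiny_cpusets[i].name, name, sizeof(tiny_cpusets[i].name));
	tiny_cpusets[i].cpus_allowed = cpu_online_map;	/* start wide open */
	tiny_cpusets[i].mems_allowed = node_online_map;
	tiny_cpusets[i].in_use = 1;
	return 0;
}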

The rest of the kernel would see no difference ... except that some
of the cpuset_*() hooks would return more quickly.  This tiny cpuset
option would provide the same kernel hooks as are now provided by
the defines and inline stubs, in the "#else" to "#endif" half of the
"#ifdef CONFIG_CPUSETS" code lines in linux/cpuset.h.

User space would see the same API, except that some valid operations
on full cpusets, such as a nested mkdir, would fail on tiny cpusets.

How this extends to cgroups I don't know; for now I suspect that most
cgroup module development is motivated by the needs of larger systems,
not smaller systems.  However, cpusets is now a module client of
cgroups, and it is cgroups that now provides cpusets with its interface
to the vfs infrastructure.  It would seem unfortunate if this relation
was not continued with tiny cpusets.  Perhaps someone can imagine a tiny
cgroups?  This might be the most difficult part of this proposal.

Looking at some IA64 sn2 config builds I have laying about, I see the
following text sizes for a couple of versions, showing the growth of
the cpuset/cgroup apparatus over time:

	25933	2.6.18-rc3-mm1/kernel/cpuset.o (Aug 2006)
vs.
	37823	2.6.25-rc2-mm1/kernel/cgroup.o (Feb 2008)
	19558	2.6.25-rc2-mm1/kernel/cpuset.o

So the total has grown from 25933 to 57381 text bytes (note that
this is IA64 arch; most arch's will have proportionately smaller
text sizes.)

Unfortunately, ideas without code are usually met with the sound of
silence, as well they should be.  Furthermore, I can promise that I
have no time to design or develop this myself; my good employer is
quite focused on the other end of things - the big honkin NUMA and
cluster systems.


-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214


* Re: Tiny cpusets -- cpusets for small systems?
  2008-02-23 12:09 Tiny cpusets -- cpusets for small systems? Paul Jackson
@ 2008-02-23 15:09 ` Paul Menage
  2008-02-23 15:57   ` Paul Jackson
  2008-02-24  3:54 ` Max Krasnyansky
  2008-02-25  3:10 ` KOSAKI Motohiro
  2 siblings, 1 reply; 8+ messages in thread
From: Paul Menage @ 2008-02-23 15:09 UTC (permalink / raw)
  To: Paul Jackson
  Cc: KOSAKI Motohiro, Max Krasnyanskiy, Alan Cox, Andreas Dilger,
	Andrew Morton, Daniel Spang, Ingo Molnar, Jon Masters,
	Marcelo Tosatti, Pavel Machek, Peter Zijlstra, Rik van Riel,
	LKML

On Sat, Feb 23, 2008 at 4:09 AM, Paul Jackson <pj@sgi.com> wrote:
> A couple of proposals have been made recently by people working Linux
>  on smaller systems, for improving realtime isolation and memory
>  pressure handling:
>
>  (1) cpu isolation for hard(er) realtime
>         http://lkml.org/lkml/2008/2/21/517
>         Max Krasnyanskiy <maxk@qualcomm.com>
>         [PATCH sched-devel 0/7] CPU isolation extensions
>
>  (2) notify user space of tight memory
>         http://lkml.org/lkml/2008/2/9/144
>         KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>         [PATCH 0/8][for -mm] mem_notify v6
>
>  In both cases, some of us have responded "why not use cpusets", and the
>  original submitters have replied "cpusets are too fat"  (well, they
>  were more diplomatic than that, but I guess I can say that ;)

Having read those threads, it looks to me as though:

- the parts of Max's problem that would be solved by cpusets can be
mostly accomplished just via sched_setaffinity()

- Motohiro wants to add a new system-wide API that you would also like
to have available on a per-cpuset basis. (Why not just add two access
points for the same feature?)

I don't think that either of these would be enough to justify big
changes to cpusets or cgroups, although eliminating bloat is always a
good thing.

>  The primary semantic limit I'd suggest would be supporting exactly
>  one layer depth of cpusets, not a full hierarchy.  So one could still
>  successfully issue from user space 'mkdir /dev/cpuset/foo', but trying
>  to do 'mkdir /dev/cpuset/foo/bar' would fail.  This reminds me of
>  very early FAT file systems, which had just a single, fixed size
>  root directory ;).  There might even be a configurable fixed upper
>  limit on how many /dev/cpuset/* directories were allowed, further
>  simplifying the locking and dynamic memory behavior of this apparatus.

I'm not sure that either of these would make much difference to the
overall footprint.

A single layer of cpusets would allow you to simplify
validate_change() but not much else.

I don't see how a fixed upper limit on the number of cpusets makes the
locking sufficiently simpler to save much code.

>
>  How this extends to cgroups I don't know; for now I suspect that most
>  cgroup module development is motivated by the needs of larger systems,
>  not smaller systems.  However, cpusets is now a module client of
>  cgroups, and it is cgroups that now provides cpusets with its interface
>  to the vfs infrastructure.  It would seem unfortunate if this relation
>  was not continued with tiny cpusets.  Perhaps someone can imagine a tiny
>  cgroups?  This might be the most difficult part of this proposal.

If we wanted to go this way, I can imagine a cgroups config option
that forces just a single hierarchy, which would allow a bunch of
simplifications that would save plenty of text.

>
>  Looking at some IA64 sn2 config builds I have laying about, I see the
>  following text sizes for a couple of versions, showing the growth of
>  the cpuset/cgroup apparatus over time:
>
>         25933   2.6.18-rc3-mm1/kernel/cpuset.o (Aug 2006)
>  vs.
>         37823   2.6.25-rc2-mm1/kernel/cgroup.o (Feb 2008)
>         19558   2.6.25-rc2-mm1/kernel/cpuset.o
>
>  So the total has grown from 25933 to 57381 text bytes (note that
>  this is IA64 arch; most arch's will have proportionately smaller
>  text sizes.)

On x86_64 they're:

cgroup.o: 17348
cpuset.o: 8533

Paul


* Re: Tiny cpusets -- cpusets for small systems?
  2008-02-23 15:09 ` Paul Menage
@ 2008-02-23 15:57   ` Paul Jackson
  2008-03-12 15:01     ` Paul Mundt
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Jackson @ 2008-02-23 15:57 UTC (permalink / raw)
  To: Paul Menage
  Cc: kosaki.motohiro, maxk, alan, adilger, akpm, daniel.spang, mingo,
	jonathan, marcelo, pavel, a.p.zijlstra, riel, linux-kernel

Paul M wrote:
> I don't think that either of these would be enough to justify big
> changes to cpusets or cgroups, although eliminating bloat is always a
> good thing.

My "tiny cpuset" idea doesn't so much eliminate bloat, as provide a
thin alternative, along side of the existing fat alternative.  So
far as kernel source goes, it would get bigger, not smaller, with now
two CONFIG choices for cpusets, fat or tiny.

The odds are, however, given that one of us has just promised not to
code this, and the other of us doesn't figure it's worth it, this
idea will not live long.  Someone would have to step up from the
embedded side with a coded version that saved a nice chunk of memory
(from their perspective) to get this off the ground, and no telling
whether even that would meet with a warm reception.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214


* Re: Tiny cpusets -- cpusets for small systems?
  2008-02-23 12:09 Tiny cpusets -- cpusets for small systems? Paul Jackson
  2008-02-23 15:09 ` Paul Menage
@ 2008-02-24  3:54 ` Max Krasnyansky
  2008-02-26  2:05   ` Paul Jackson
  2008-02-25  3:10 ` KOSAKI Motohiro
  2 siblings, 1 reply; 8+ messages in thread
From: Max Krasnyansky @ 2008-02-24  3:54 UTC (permalink / raw)
  To: Paul Jackson
  Cc: KOSAKI Motohiro, Alan Cox, Andreas Dilger, Andrew Morton,
	Daniel Spang, Ingo Molnar, Jon Masters, Marcelo Tosatti,
	Paul Menage, Pavel Machek, Peter Zijlstra, Rik van Riel, LKML

Hi Paul,

> A couple of proposals have been made recently by people working Linux
> on smaller systems, for improving realtime isolation and memory
> pressure handling:
> 
> (1) cpu isolation for hard(er) realtime
> 	http://lkml.org/lkml/2008/2/21/517
> 	Max Krasnyanskiy <maxk@qualcomm.com>
> 	[PATCH sched-devel 0/7] CPU isolation extensions
> 
> (2) notify user space of tight memory
> 	http://lkml.org/lkml/2008/2/9/144
> 	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> 	[PATCH 0/8][for -mm] mem_notify v6
> 
> In both cases, some of us have responded "why not use cpusets", and the
> original submitters have replied "cpusets are too fat"  (well, they
> were more diplomatic than that, but I guess I can say that ;)

My primary issue with cpusets (from the CPU isolation perspective, that is)
was not the fatness. I did make a couple of comments like "on a dual-cpu box
I do not need cpusets to manage the CPUs", but that's not directly related to
CPU isolation.
For CPU isolation in particular I need code like this:

int select_irq_affinity(unsigned int irq)
{
        cpumask_t usable_cpus;

        /* route the IRQ only to online CPUs that are not isolated */
        cpus_andnot(usable_cpus, cpu_online_map, cpu_isolated_map);
        irq_desc[irq].affinity = usable_cpus;
        irq_desc[irq].chip->set_affinity(irq, usable_cpus);
        return 0;
}

How would you implement that with cpusets?
I haven't seen your patches, but I'd imagine that they would still need locks
and iterators for the "Is CPU N isolated" functionality.

So I see cpusets as a higher-level API/mechanism and cpu_isolated_map as the
lower-level mechanism that actually makes the kernel aware of what's isolated
and what's not.  Kind of like the sched domain/cpuset relationship: cpusets
affect sched domains, but the scheduler does not use cpusets directly.
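
For example, with a global map the "Is CPU N isolated" check is just a bit
test and needs no locking at all -- a sketch, assuming the cpu_isolated_map
from my patch set:

static inline int cpu_isolated(int cpu)
{
        return cpu_isset(cpu, cpu_isolated_map);
}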

> I wonder if there might be room for a "tiny cpusets" configuration option:
>   * provide the same hooks to the rest of the kernel, and
>   * provide the same syntactic interface to user space, but
>   * with more limited semantics.
> 
> The primary semantic limit I'd suggest would be supporting exactly
> one layer depth of cpusets, not a full hierarchy.  So one could still
> successfully issue from user space 'mkdir /dev/cpuset/foo', but trying
> to do 'mkdir /dev/cpuset/foo/bar' would fail.  This reminds me of
> very early FAT file systems, which had just a single, fixed size
> root directory ;).  There might even be a configurable fixed upper
> limit on how many /dev/cpuset/* directories were allowed, further
> simplifying the locking and dynamic memory behavior of this apparatus.
In the foreseeable future 2-8 cores will be the most common configuration.
Do you think that cpusets are needed/useful for those machines?
The reason I'm asking is that, given the restrictions you mentioned
above, it seems that you might as well just do
	taskset -c 1,2,3 app1
	taskset -c 3,4,5 app2
Yes, it's not quite the same of course, but IMO it covers most cases. That's what
we do on 2-4 cores these days, and we are quite happy with that. We either let the
specialized apps manage their thread affinities themselves or use "taskset" to
manage the apps.
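
(For reference, taskset itself is just a thin wrapper around
sched_setaffinity(); roughly, it does something like the sketch below before
exec'ing the app:)

/* roughly what "taskset -c 1,2,3 app1" ends up doing */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        cpu_set_t mask;
        int cpu;

        if (argc < 2)
                return 1;

        CPU_ZERO(&mask);
        for (cpu = 1; cpu <= 3; cpu++)
                CPU_SET(cpu, &mask);

        if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {
                perror("sched_setaffinity");
                return 1;
        }
        execvp(argv[1], &argv[1]);      /* run the app pinned to cpus 1-3 */
        perror("execvp");
        return 1;
}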

> User space would see the same API, except that some valid operations
> on full cpusets, such as a nested mkdir, would fail on tiny cpusets.
Speaking of the user-space API: I guess this is not directly related to the
tiny-cpusets proposal but rather to cpusets in general.
The stuff that I'm working on these days (wireless base stations) is designed
with the following model:
	cpuN - runs soft-RT networking and management code
	cpuN+1 to cpuN+x - are used as dedicated engines
The simplest example would be
	cpu0 - runs IP, L2 and control plane
	cpu1 - runs hard-RT MAC

So if CPU isolation is implemented on top of cpusets, what kind of API do
you envision for such an app?  I mean, currently cpusets seem to deal mostly
with entire processes, whereas in this case we're really dealing with threads:
different threads of the same process require different policies, some must run
on isolated cpus and some must not.  I guess one could write a thread's pid into
the cpusets fs, but that's not very convenient; pthread_set_affinity() is exactly
what's needed.
Personally I do not see much use for cpusets for those kinds of designs, but maybe
I'm missing something.  I got really excited when cpusets were first merged into
mainline, but after looking closer I could not really find a use for them, at
least not for our apps.
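
To make the comparison concrete, here is roughly what the two options look
like from the app's point of view (just a sketch: the /dev/cpuset/rt path is
made up, and the actual glibc call is spelled pthread_setaffinity_np()):

#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* (a) cpusets: write this thread's tid into some cpuset's "tasks" file,
 * e.g. "/dev/cpuset/rt/tasks" (hypothetical setup) */
static void place_via_cpuset(const char *tasks_path)
{
        FILE *f = fopen(tasks_path, "w");

        if (f) {
                fprintf(f, "%ld\n", (long)syscall(SYS_gettid));
                fclose(f);
        }
}

/* (b) plain affinity: bind the calling thread to a single cpu directly */
static void place_via_affinity(int cpu)
{
        cpu_set_t mask;

        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
}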

Max


* Re: Tiny cpusets -- cpusets for small systems?
  2008-02-23 12:09 Tiny cpusets -- cpusets for small systems? Paul Jackson
  2008-02-23 15:09 ` Paul Menage
  2008-02-24  3:54 ` Max Krasnyansky
@ 2008-02-25  3:10 ` KOSAKI Motohiro
  2 siblings, 0 replies; 8+ messages in thread
From: KOSAKI Motohiro @ 2008-02-25  3:10 UTC (permalink / raw)
  To: Paul Jackson, Pavel Machek
  Cc: kosaki.motohiro, Max Krasnyanskiy, Alan Cox, Andreas Dilger,
	Andrew Morton, Daniel Spang, Ingo Molnar, Jon Masters,
	Marcelo Tosatti, Paul Menage, Peter Zijlstra, Rik van Riel, LKML

Hi Paul

> Looking at some IA64 sn2 config builds I have laying about, I see the
> following text sizes for a couple of versions, showing the growth of
> the cpuset/cgroup apparatus over time:
> 
> 	25933	2.6.18-rc3-mm1/kernel/cpuset.o (Aug 2006)
> vs.
> 	37823	2.6.25-rc2-mm1/kernel/cgroup.o (Feb 2008)
> 	19558	2.6.25-rc2-mm1/kernel/cpuset.o
> 
> So the total has grown from 25933 to 57381 text bytes (note that
> this is IA64 arch; most arch's will have proportionately smaller
> text sizes.)

hm, interesting.
but unfortunately cpusets have more dependencies than just their own code
(i.e. CONFIG_SMP).

To make things worse, some embedded CPUs have poor or no atomic instruction
support; on those, turning on CONFIG_SMP becomes a large performance
regression ;)


I am no longer an embedded engineer, so I might have made a mistake here.
(BTW: I am a large-server engineer now.)

But not thinking about that dependency is wrong, maybe.


Pavel, what do you think about it?

- kosaki




* Re: Tiny cpusets -- cpusets for small systems?
  2008-02-24  3:54 ` Max Krasnyansky
@ 2008-02-26  2:05   ` Paul Jackson
  2008-02-26  2:37     ` Max Krasnyanskiy
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Jackson @ 2008-02-26  2:05 UTC (permalink / raw)
  To: Max Krasnyansky
  Cc: kosaki.motohiro, alan, adilger, akpm, daniel.spang, mingo,
	jonathan, marcelo, menage, pavel, a.p.zijlstra, riel,
	linux-kernel

> So. I see cpusets as a higher level API/mechanism and cpu_isolated_map as lower
> level mechanism that actually makes kernel aware of what's isolated what's not.
> Kind of like sched domain/cpuset relationship. ie cpusets affect sched domains
> but scheduler does not use cpusets directly.

One could use cpusets to control the setting of cpu_isolated_map,
separate from the code such as your select_irq_affinity() that
uses it.
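
For example (purely hypothetical -- no such cpuset flag exists today), a
per-cpuset "cpu_isolated" flag could be folded into the global map whenever
such a cpuset changes, roughly:

/* hypothetical: rebuild cpu_isolated_map from cpusets flagged as isolated */
static void rebuild_cpu_isolated_map(void)
{
	cpumask_t map = CPU_MASK_NONE;
	struct cpuset *cs;

	for_each_isolated_cpuset(cs)		/* hypothetical iterator */
		cpus_or(map, map, cs->cpus_allowed);

	cpu_isolated_map = map;
}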


> In a foreseeable future 2-8 cores will be most common configuration.
> Do you think that cpusets are needed/useful for those machines ?
> The reason I'm asking is because given the restrictions you mentioned
> above it seems that you might as well just do
> 	taskset -c 1,2,3 app1
> 	taskset -c 3,4,5 app2 

People tend to manage the CPU and memory placement of the threads
and processes within a single co-operating job using taskset
(sched_setaffinity) and numactl (mbind, set_mempolicy.)

They tend to manage the placement of multiple unrelated jobs onto
a single system, whether on separate or shared CPUs and nodes,
using cpusets.

Something like cpu_isolated_map looks to me like a system-wide
mechanism, which should, like sched_domains, be managed system-wide.
Managing it with a mechanism that encourages each thread to update
it directly, as if that thread owned the system, will break down,
resulting in conflicting updates, as multiple, insufficiently
co-operating threads issue conflicting settings.


> Stuff that I'm working on this days (wireless basestations) is designed
> with the  following model:
> 	cpuN - runs soft-RT networking and management code
> 	cpuN+1 to cpuN+x - are used as dedicated engines
> ie Simplest example would be
> 	cpu0 - runs IP, L2 and control plane
> 	cpu1 - runs hard-RT MAC 
> 
> So if CPU isolation is implemented on top of the cpusets what kind of API do 
> you envision for such an app ?

That depends on what more API is needed.  Do we need to place
irqs better ... cpusets might not be a natural for that use.
Aren't irqs directed to specific CPUs, not to hierarchically
nested subsets of CPUs?

Separate question:
  Is it desired that the dedicated CPUs cpuN+1 ... cpuN+x even appear
  as general purpose systems running a Linux kernel in your systems?
  These dedicated engines seem more like intelligent devices to me,
  such as disk controllers, which the kernel controls via device
  drivers, not by loading itself on them too.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214


* Re: Tiny cpusets -- cpusets for small systems?
  2008-02-26  2:05   ` Paul Jackson
@ 2008-02-26  2:37     ` Max Krasnyanskiy
  0 siblings, 0 replies; 8+ messages in thread
From: Max Krasnyanskiy @ 2008-02-26  2:37 UTC (permalink / raw)
  To: Paul Jackson
  Cc: kosaki.motohiro, alan, adilger, akpm, daniel.spang, mingo,
	jonathan, marcelo, menage, pavel, a.p.zijlstra, riel,
	linux-kernel

Paul Jackson wrote:
>> So. I see cpusets as a higher level API/mechanism and cpu_isolated_map as lower
>> level mechanism that actually makes kernel aware of what's isolated what's not.
>> Kind of like sched domain/cpuset relationship. ie cpusets affect sched domains
>> but scheduler does not use cpusets directly.
> 
> One could use cpusets to control the setting of cpu_isolated_map,
> separate from the code such as your select_irq_affinity() that
> uses it.
Yes, that's what I proposed too, in one of the CPU isolation threads with
Peter.  The only issue is that you need to simulate a CPU_DOWN hotplug event
in order to clean up what's already running on those CPUs.

>> In a foreseeable future 2-8 cores will be most common configuration.
>> Do you think that cpusets are needed/useful for those machines ?
>> The reason I'm asking is because given the restrictions you mentioned
>> above it seems that you might as well just do
>> 	taskset -c 1,2,3 app1
>> 	taskset -c 3,4,5 app2 
> 
> People tend to manage the CPU and memory placement of the threads
> and processes within a single co-operating job using taskset
> (sched_setaffinity) and numactl (mbind, set_mempolicy.)
> 
> They tend to manage the placement of multiple unrelated jobs onto
> a single system, whether on separate or shared CPUs and nodes,
> using cpusets.
 >
> Something like cpu_isolated_map looks to me like a system-wide
> mechanism, which should, like sched_domains, be managed system-wide.
> Managing it with a mechanism that encourages each thread to update
> it directly, as if that thread owned the system, will break down,
> resulting in conflicting updates, as multiple, insufficiently
> co-operating threads issue conflicting settings.
I'm not sure how to interpret that. I think you might have mixed a couple of
things I asked about in one reply ;-).
The question was: given the restrictions you talked about when you explained
the tiny-cpusets functionality, how much does one gain from using them
compared to taskset/numactl?  ie on machines with 2-8 cores it's fairly easy
to manage cpus with simple affinity masks.

The second part of your reply seems to imply that I somehow made you think 
that I suggested that cpu_isolated_map is managed per thread. That is of 
course not the case. It's definitely a system-wide mechanism and individual 
threads have nothing to do with it.
btw I just re-read my prev reply. I definitely did not say anything about 
threads managing cpu_isolated_map :).

>> Stuff that I'm working on this days (wireless basestations) is designed
>> with the  following model:
>> 	cpuN - runs soft-RT networking and management code
>> 	cpuN+1 to cpuN+x - are used as dedicated engines
>> ie Simplest example would be
>> 	cpu0 - runs IP, L2 and control plane
>> 	cpu1 - runs hard-RT MAC 
>>
>> So if CPU isolation is implemented on top of the cpusets what kind of API do 
>> you envision for such an app ?
> 
> That depends on what more API is needed.  Do we need to place
> irqs better ... cpusets might not be a natural for that use.
> Aren't irqs directed to specific CPUs, not to hierarchically
> nested subsets of CPUs.

You clipped the part where I elaborated, which was:
>> So if CPU isolation is implemented on top of the cpusets what kind of API do 
>> you envision for such an app ? I mean currently cpusets seems to be mostly dealing
>> with entire processes, whereas in this case we're really dealing with the threads. 
>> ie Different threads of the same process require different policies, some must run
>> on isolated cpus some must not. I guess one could write a thread's pid into cpusets
>> fs but that's not very convenient. pthread_set_affinity() is exactly what's needed.
In other words, how would an app place its individual threads into the
different cpusets?
The IRQ stuff is separate; as we said above, cpusets could simply update
cpu_isolated_map, which would take care of IRQs. I was talking specifically
about thread management.

> Separate question:
>   Is it desired that the dedicated CPUs cpuN+1 ... cpuN+x even appear
>   as general purpose systems running a Linux kernel in your systems?
>   These dedicated engines seem more like intelligent devices to me,
>   such as disk controllers, which the kernel controls via device
>   drivers, not by loading itself on them too.
We still want to be able to run normal threads on them, which means IPIs,
memory management, etc. are still needed. So yes, they had better show up as
normal CPUs :)
Also, with dynamic isolation you can, for example, un-isolate a cpu when you're
compiling stuff on the machine and then isolate it when you're running the
special app(s).

Max


* Re: Tiny cpusets -- cpusets for small systems?
  2008-02-23 15:57   ` Paul Jackson
@ 2008-03-12 15:01     ` Paul Mundt
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Mundt @ 2008-03-12 15:01 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Paul Menage, kosaki.motohiro, maxk, alan, adilger, akpm,
	daniel.spang, mingo, jonathan, marcelo, pavel, a.p.zijlstra,
	riel, linux-kernel

On Sat, Feb 23, 2008 at 09:57:52AM -0600, Paul Jackson wrote:
> Paul M wrote:
> > I don't think that either of these would be enough to justify big
> > changes to cpusets or cgroups, although eliminating bloat is always a
> > good thing.
> 
> My "tiny cpuset" idea doesn't so much eliminate bloat, as provide a
> thin alternative, along side of the existing fat alternative.  So
> far as kernel source goes, it would get bigger, not smaller, with now
> two CONFIG choices for cpusets, fat or tiny.
> 
> The odds are, however, given that one of us has just promised not to
> code this, and the other of us doesn't figure it's worth it, this
> idea will not live long.  Someone would have to step up from the
> embedded side with a coded version that saved a nice chunk of memory
> (from their perspective) to get this off the ground, and no telling
> whether even that would meet with a warm reception.
> 
This has actually been on my TODO list for a while, though not quite in
the way that you outlined in your initial post. A big reason why cpusets
are fat at the moment is that there's no isolation or configurability of
the individual features supported and exposed, leaving one with an
all-or-nothing scenario.

Both the SMP and NUMA bits are fairly orthogonal, and I think isolating
those and making each configurable would already help trim things down
quite a bit (ie, nodemask handling, scheduler domains, etc.). The
filesystem bits are not really that heavy comparatively, so rather than
working on a tiny cpuset implementation, simply splitting up the existing
implementation seems like a much saner approach, and one that can be done
incrementally.
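
In struct cpuset terms I'm thinking of something along these lines (just a
sketch with a hypothetical config option name, not a patch):

/* sketch only: a hypothetical CONFIG_CPUSETS_NUMA option that splits the
 * nodemask/NUMA side out of struct cpuset, leaving the cpumask side in */
struct cpuset {
	struct cgroup_subsys_state css;
	unsigned long flags;
	cpumask_t cpus_allowed;		/* CPU placement: always built in */
#ifdef CONFIG_CPUSETS_NUMA		/* hypothetical option */
	nodemask_t mems_allowed;	/* memory placement: compiled out otherwise */
#endif
};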

While we're on the topic of cpuset reform, co-processors are another
thing that cpusets is in pretty good shape to handle (particularly in
terms of carving up large grid and dataflow processors and things of that
nature, which we do have customer use cases for in embedded space today).

I'll try to follow up to this thread with an initial patch series in the
not-too-distant future.


end of thread

Thread overview: 8+ messages
2008-02-23 12:09 Tiny cpusets -- cpusets for small systems? Paul Jackson
2008-02-23 15:09 ` Paul Menage
2008-02-23 15:57   ` Paul Jackson
2008-03-12 15:01     ` Paul Mundt
2008-02-24  3:54 ` Max Krasnyansky
2008-02-26  2:05   ` Paul Jackson
2008-02-26  2:37     ` Max Krasnyanskiy
2008-02-25  3:10 ` KOSAKI Motohiro
