LKML Archive on lore.kernel.org
* Per-CPU data as a structure
From: Julio M. Merino Vidal @ 2007-05-03 15:16 UTC
  To: linux-kernel

Hello,

At the moment, data specific to a CPU is stored in different, fixed-size
separate arrays by means of the "percpu framework".  I'm working
on some changes to modify the way some CPUs are represented, and I'm
wondering what the rationale behind such a representation is.

At first sight, it'd seem more reasonable to have a structure holding  
all the information that is CPU-specific (as is done with any object  
represented within the system).  After searching the mail archives, I  
see that similar changes were proposed before, but those threads did  
not seem to get any reply (so I'm assuming that the changes were not  
desired).

Similarly, and if I understood it correctly, the PDA (Per-processor
Data Area) also aims to do the above, but at the moment it only
contains some fields and is not defined on all platforms.  There are
still a lot of uses of the percpu functionality (e.g., in
kernel/sched.c).

Part of my changes introduce a new structure that is able to  
represent any kind of CPU (and which each platform can extend to add  
new information to it).  It is supposed to supersede the per-cpu  
definitions.  I bet this could also be redone by using percpu in some  
way...  The thing is I am willing to share my work when I've finished  
it (it is still very much work-in-progress), but first I'm interested  
to know if adding this new structure is a crazy idea (meaning I  
should stick to percpu wherever possible) or something that can be  
accepted later on.

Summarizing, my questions are:
- Why is the code currently using multiple separate arrays (percpu)
   to hold CPU information instead of a structure?
- Could this structure-based approach (instead of all these separate
   arrays) be considered for inclusion into the system?

As far as I can tell, the advantage of percpu is that you can define  
new "fields" anywhere in the code and independently from the rest of  
the system.  Also, I seem to understand that there are performance  
advantages related to this.  But on the other hand, percpu seems like  
an unnatural approach to "reimplement" regular structures.

Thank you very much.

-- 
Julio M. Merino Vidal <jmerino@ac.upc.edu>




* Re: Per-CPU data as a structure
From: Andi Kleen @ 2007-05-03 17:19 UTC
  To: Julio M. Merino Vidal; +Cc: linux-kernel

"Julio M. Merino Vidal" <jmerino@ac.upc.edu> writes:
 
> Similarly, and if I understood it correctly, the PDA (Per-processor
> Data Area) also aims to do the above, but at the moment it only
> contains some fields and is not defined on all platforms.  There are
> still a lot of uses of the percpu functionality (e.g., in
> kernel/sched.c).

PDA is an earlier version of percpu; it can still be accessed more
efficiently, so it is kept for some low-level code.
 
> As far as I can tell, the advantage of percpu is that you can define
> new "fields" anywhere in the code and independently from the rest of
> the system. 

- Independent maintenance, as you noted
- Fast access and relatively compact code
- Avoids false sharing by keeping cache lines of different CPUs separate
- Doesn't waste a lot of memory on padding, which NR_CPUS arrays usually
need to avoid the previous point.

Any replacement that doesn't have these properties too will probably
not be useful.
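
The false-sharing and padding points can be made concrete with a small
userspace sketch. It is only a model under stated assumptions: the
64-byte CACHE_LINE, the NR_CPUS_MODEL limit, and all struct and function
names are hypothetical, and it does not reproduce the kernel's percpu
implementation. It compares several arrays whose elements are each
padded to a cache line against one packed block per CPU.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE    64   /* assumed cache line size in bytes */
#define NR_CPUS_MODEL 8    /* hypothetical compile-time CPU limit */
#define NR_VARS       3    /* number of per-CPU counters in this model */

/* Approach 1: one array per variable, each element padded to a full
 * cache line so that two CPUs never write to the same line. */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

/* Approach 2: all of one CPU's variables packed into a single block;
 * only the block as a whole is rounded up to a cache line. */
struct cpu_block {
    long vars[NR_VARS];
};

static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Bytes used by NR_VARS separately padded arrays. */
size_t padded_arrays_bytes(void)
{
    return (size_t)NR_VARS * NR_CPUS_MODEL * sizeof(struct padded_counter);
}

/* Bytes used by one cache-line-rounded block per CPU. */
size_t per_cpu_blocks_bytes(void)
{
    return (size_t)NR_CPUS_MODEL *
           round_up(sizeof(struct cpu_block), CACHE_LINE);
}
```

With these assumed sizes, the padded arrays take three cache lines per
CPU while the packed blocks take one, and neither layout lets two CPUs
share a line.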

-Andi



* Re: Per-CPU data as a structure
From: Julio M. Merino Vidal @ 2007-05-04  8:36 UTC
  To: Andi Kleen; +Cc: linux-kernel

Andi Kleen wrote:
>> As far as I can tell, the advantage of percpu is that you can define
>> new "fields" anywhere in the code and independently from the rest of
>> the system. 
>>     
>
> - Independent maintenance, as you noted
> - Fast access and relatively compact code
> - Avoids false sharing by keeping cache lines of different CPUs separate
> - Doesn't waste a lot of memory on padding, which NR_CPUS arrays usually
> need to avoid the previous point.
>
> Any replacement that doesn't have these properties too will probably
> not be useful.
>   
Thank you for the details.  I'll try to stick to per-cpu wherever 
possible for now.

Anyway, what do you think about adding the above text to the code (percpu.h
maybe) as documentation?  See the patch below.  (Dunno if the Signed-off-by
line is appropriate as most of the text is yours.)

Signed-off-by: Julio M. Merino Vidal <jmerino@ac.upc.edu>

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 600e3d3..b8e8b8c 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -1,6 +1,21 @@
 #ifndef __LINUX_PERCPU_H
 #define __LINUX_PERCPU_H
 
+/*
+ * percpu provides a mechanism to define variables that are specific to each
+ * CPU in the system.
+ *
+ * Each variable is defined as an independent array of NR_CPUS elements.
+ * This approach is used instead of a per-CPU structure because it has the
+ * following advantages:
+ * - Independent maintenance: a source file can define new per-CPU
+ *   variables without distorting others.
+ * - Fast access and relatively compact code.
+ * - Avoids false sharing by keeping cache lines of different CPUs separate.
+ * - Doesn't waste a lot of memory on padding, which NR_CPUS arrays usually
+ *   need to avoid the previous point.
+ */
+
 #include <linux/spinlock.h> /* For preempt_disable() */
 #include <linux/slab.h> /* For kmalloc() */
 #include <linux/smp.h>


Kind regards.


* Re: Per-CPU data as a structure
From: Eric Dumazet @ 2007-05-04  8:55 UTC
  To: Julio M. Merino Vidal; +Cc: Andi Kleen, linux-kernel

On Fri, 04 May 2007 10:36:37 +0200
"Julio M. Merino Vidal" <jmerino@ac.upc.edu> wrote:


> 
> Anyway, what do you think about adding the above text to the code (percpu.h
> maybe) as documentation?  See the patch below.  (Dunno if the Signed-off-by
> line is appropriate as most of the text is yours.)
> 
> Signed-off-by: Julio M. Merino Vidal <jmerino@ac.upc.edu>
> 
> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 600e3d3..b8e8b8c 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -1,6 +1,21 @@
>  #ifndef __LINUX_PERCPU_H
>  #define __LINUX_PERCPU_H
>  
> +/*
> + * percpu provides a mechanism to define variables that are specific to each
> + * CPU in the system.
> + *
> + * Each variable is defined as an independent array of NR_CPUS elements.
> + * This approach is used instead of a per-CPU structure because it has the
> + * following advantages:
> + * - Independent maintenance: a source file can define new per-CPU
> + *   variables without distorting others.
> + * - Fast access and relatively compact code.
> + * - Avoids false sharing by keeping cache lines of different CPUs separate.
> + * - Doesn't waste a lot of memory on padding, which NR_CPUS arrays usually
> + *   need to avoid the previous point.
> + */
> +
>  #include <linux/spinlock.h> /* For preempt_disable() */
>  #include <linux/slab.h> /* For kmalloc() */
>  #include <linux/smp.h>

Documentation is good, and percpu probably lacks it, but please put it in a Documentation/percpu.txt file, which is the right place for it.

That way you can write really extensive documentation, and you won't slow down kernel compiles...

I suggest you document all variants (get_cpu_var(), __get_cpu_var(), ...) with examples of use.

Also, please note that per-CPU data is not allocated NR_CPUS times; the amount depends on the number of possible CPUs. So if you boot an SMP kernel on a single-CPU desktop, the kernel allocates only the space it needs.

So per-CPU data also has a space-saving advantage over structures declared with [NR_CPUS] arrays.
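
The space argument is simple arithmetic, sketched below in userspace C.
Everything here is an assumption made for illustration: the bitmask
standing in for the set of possible CPUs, the function names, and the
sizes used in the usage note are hypothetical and are not taken from
the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Count the set bits in a mask where bit n means "CPU n is possible".
 * The mask is a stand-in for this sketch, not a kernel data structure. */
unsigned count_possible_cpus(unsigned long possible_mask)
{
    unsigned n = 0;

    while (possible_mask) {
        n += (unsigned)(possible_mask & 1UL);
        possible_mask >>= 1;
    }
    return n;
}

/* Bytes reserved if every per-CPU variable is a static [nr_cpus] array. */
size_t static_array_bytes(unsigned nr_cpus, size_t bytes_per_cpu)
{
    return (size_t)nr_cpus * bytes_per_cpu;
}

/* Bytes reserved if one area is allocated per possible CPU only. */
size_t possible_cpu_bytes(unsigned long possible_mask, unsigned nr_cpus,
                          size_t bytes_per_cpu)
{
    unsigned possible = count_possible_cpus(possible_mask);

    assert(possible <= nr_cpus);
    return (size_t)possible * bytes_per_cpu;
}
```

For a hypothetical limit of 32 CPUs and 4096 bytes of per-CPU data,
static_array_bytes(32, 4096) is 131072, while
possible_cpu_bytes(0x1UL, 32, 4096) for a machine with one possible CPU
is 4096.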

Thank you



* Re: Per-CPU data as a structure
From: Andi Kleen @ 2007-05-04 14:07 UTC
  To: Julio M. Merino Vidal; +Cc: Andi Kleen, linux-kernel

> +/*
> + * percpu provides a mechanism to define variables that are specific to each
> + * CPU in the system.
> + *
> + * Each variable is defined as an independent array of NR_CPUS elements.

The term "independent array" seems misleading to me. There isn't really
an array anywhere. Perhaps explain it in more detail.
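
One way to picture "no array anywhere" is a userspace model in which
each possible CPU gets its own separately allocated copy of a template,
and a variable is just an offset into that copy. This is a sketch under
assumptions, not the kernel's code: struct percpu_template, struct
percpu_model, and the percpu_model_*() functions are hypothetical names
invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Template holding the initial value of every per-CPU variable.
 * In this model a "variable" is just an offset into the template. */
struct percpu_template {
    long nr_switches;
    int  last_irq;
};

static const struct percpu_template template_init = { 0, -1 };

struct percpu_model {
    unsigned nr_possible;  /* number of areas currently allocated */
    char **area;           /* one separate copy per possible CPU */
};

void percpu_model_free(struct percpu_model *m)
{
    unsigned i;

    for (i = 0; i < m->nr_possible; i++)
        free(m->area[i]);
    free(m->area);
    m->area = NULL;
    m->nr_possible = 0;
}

/* Allocate and initialise one area per possible CPU; 0 on success. */
int percpu_model_init(struct percpu_model *m, unsigned nr_possible)
{
    unsigned i;

    m->nr_possible = 0;
    m->area = NULL;
    if (nr_possible == 0)
        return -1;
    m->area = calloc(nr_possible, sizeof(*m->area));
    if (!m->area)
        return -1;
    for (i = 0; i < nr_possible; i++) {
        m->area[i] = malloc(sizeof(template_init));
        if (!m->area[i]) {
            percpu_model_free(m);
            return -1;
        }
        memcpy(m->area[i], &template_init, sizeof(template_init));
        m->nr_possible = i + 1;
    }
    return 0;
}

/* Address of a variable for one CPU: that CPU's area plus the offset. */
void *percpu_model_ptr(const struct percpu_model *m, unsigned cpu,
                       size_t offset)
{
    assert(cpu < m->nr_possible);
    assert(offset < sizeof(struct percpu_template));
    return m->area[cpu] + offset;
}
```

A caller would pass offsetof(struct percpu_template, nr_switches) as the
offset; writing through the pointer for CPU 0 leaves CPU 1's copy
untouched, because the two copies live in different allocations rather
than in neighbouring array slots.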

If you write documentation please write it in Kerneldoc format so 
that it can be automatically extracted. See
Documentation/kernel-doc-nano-HOWTO.txt

-Andi


