LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/7]: uninline some net related static inline in .h bloaters
@ 2008-03-27 12:37 Ilpo Järvinen
  2008-03-27 12:38 ` [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot Ilpo Järvinen
  2008-03-28  0:54 ` [PATCH 0/7]: uninline some net related static inline in .h bloaters David Miller
  0 siblings, 2 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:37 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo

Hi all,

As suggested by Andrew, I rerun the uninlining of all what's in .h
(v2.6.25-rc2-mm1 this time) to fix the (un)likely profiling overhead
which showed up in the allyesconfig results. The config was manually
tweaked from allyesconfig to not include a number of debug related
options [1]. Other setup: 32-bit x86, gcc (GCC) 4.1.2 20070626
(Red Hat 4.1.2-13). I also tweaked the uninlining machinery a bit and
got even better coverage than last time.

IS_ERR is now successfully off the top :-). The numbers are smaller
than what was previously measured, esp. when a function has a large
number of call-sites and has (un)likely, nevertheless, mostly the
same functions show up among top bloaters as with plain allyesconfig.
Also my earlier, smaller scale tests yielded similar conclusion.
Interesting new comers, however, are those list related functions
which had non-inlined debug versions with allyesconfig.

Ok, here's the top of the list (5000+ bytes):

-60744  855 funcs, 861 +, 61605 -, diff: -60744 --- skb_put 
-33129  42 funcs, 199 +, 33328 -, diff: -33129 --- cfi_build_cmd 
-32338  1182 funcs, 594 +, 32932 -, diff: -32338 --- atomic_dec_and_test 
-22906  1208 funcs, 462 +, 23368 -, diff: -22906 --- list_del 
-18399  468 funcs, 282 +, 18681 -, diff: -18399 --- netif_wake_queue 
-14283  14 funcs, 365 +, 14648 -, diff: -14283 --- cfi_send_gen_cmd 
-13890  341 funcs, 189 +, 14079 -, diff: -13890 --- skb_push 
-12853  35 funcs, 114 +, 12967 -, diff: -12853 --- le_key_k_type 
-12178  382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb 
-11675  1178 funcs, 1471 +, 13146 -, diff: -11675 --- list_add_tail 
-10996  1688 funcs, 1144 +, 12140 -, diff: -10996 --- raw_local_irq_enable 
-10838  1832 funcs, 3763 +, 14601 -, diff: -10838 --- __list_add 
-10283  401 funcs, 208 +, 10491 -, diff: -10283 --- __dev_alloc_skb 
-9697  338 funcs, 221 +, 9918 -, diff: -9697 --- skb_pull 
-8963  836 funcs, 1903 +, 10866 -, diff: -8963 --- list_add 
-7939  29 funcs, 304 +, 8243 -, diff: -7939 --- __vlan_hwaccel_rx 
-7937  238 funcs, 314 +, 8251 -, diff: -7937 --- i_size_read 
-7768  106 funcs, 254 +, 8022 -, diff: -7768 --- __nlmsg_put 
-7569  376 funcs, 246 +, 7815 -, diff: -7569 --- __skb_pull 
-7467  658 funcs, 503 +, 7970 -, diff: -7467 --- list_del_init 
-7360  192 funcs, 131 +, 7491 -, diff: -7360 --- skb_trim 
-7257  186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release 
-6943  234 funcs, 109 +, 7052 -, diff: -6943 --- __skb_trim 
-6923  71 funcs, 296 +, 7219 -, diff: -6923 --- nlmsg_put 
-6662  475 funcs, 410 +, 7072 -, diff: -6662 --- add_timer 
-6535  380 funcs, 1491 +, 8026 -, diff: -6535 --- ___arch__swab64 
-6457  372 funcs, 1429 +, 7886 -, diff: -6457 --- __fswab64 
-5625  48 funcs, 79 +, 5704 -, diff: -5625 --- tty_insert_flip_char 
-5615  25 funcs, 284 +, 5899 -, diff: -5615 --- vlan_hwaccel_receive_skb 
-5404  23 funcs, 509 +, 5913 -, diff: -5404 --- jhash 
-5396  214 funcs, 155 +, 5551 -, diff: -5396 --- dev_to_shost 
-5310  110 funcs, 133 +, 5443 -, diff: -5310 --- skb_header_pointer 
-5169  403 funcs, 131 +, 5300 -, diff: -5169 --- pci_write_config_byte 

Please note that because one inline was tested (function uninlined)
at a time, the actual benefits of removing multiple inlines may well
be below what the sum of those individually is (especially when
something calls __-func with equal name, e.g., dev_alloc_skb that
basically just calls __dev_alloc_skb).

~210 are 1000+ bytes (was ~250 with allyesconfig) and ~350 in 500+
(was ~440 previously). Full list has small number of entries without
any details indicating compile failures, and a bit larger set of cases
where the static inline was preprocessed away due to #if blocks. Except
for that, it should be self explinary:
  http://www.cs.helsinki.fi/u/ijjarvin/inlines/sorted.v2.6.25-rc2-mm1

I include couple of net related uninlines in this patch series. I don't
include the jhash to lib/ patch this time (was there previously) because
I haven't had time to finish it up.

Another view to the results showing size of the uninlined bodies is
available in here:
  http://www.cs.helsinki.fi/u/ijjarvin/inlines/bodies.v2.6.25-rc2-mm1

The tools I used are available here except the site-specific
distribute machinery (in addition one needs pretty late
codiff from Arnaldo's toolset because there were some inline
related bugs fixed lately):

  http://www.cs.helsinki.fi/u/ijjarvin/inline-tools.git/

As stated earlier, similar analysis should be performed on .h files
not under include/, but it would require minor modifications to
those tools.

--
 i.

[1] http://www.cs.helsinki.fi/u/ijjarvin/inlines/config.nodebug.v2.6.25-rc2-mm1




^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-27 12:37 [PATCH 0/7]: uninline some net related static inline in .h bloaters Ilpo Järvinen
@ 2008-03-27 12:38 ` Ilpo Järvinen
  2008-03-27 12:38   ` [PATCH 2/7] [NET]: uninline skb_pull, " Ilpo Järvinen
                     ` (2 more replies)
  2008-03-28  0:54 ` [PATCH 0/7]: uninline some net related static inline in .h bloaters David Miller
  1 sibling, 3 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:38 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo, Ilpo J�rvinen

Allyesconfig (v2.6.24-mm1):

~500 files changed
...
 869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
  skb_put                       | +104

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):

-60744  855 funcs, 861 +, 61605 -, diff: -60744 --- skb_put
  skb_put                       |  +57

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/skbuff.h |   21 +--------------------
 net/core/skbuff.c      |   21 +++++++++++++++++++++
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7beb239..f085955 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -892,6 +892,7 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
 /*
  *	Add data to an sk_buff
  */
+extern unsigned char *skb_put(struct sk_buff *skb, unsigned int len);
 static inline unsigned char *__skb_put(struct sk_buff *skb, unsigned int len)
 {
 	unsigned char *tmp = skb_tail_pointer(skb);
@@ -901,26 +902,6 @@ static inline unsigned char *__skb_put(struct sk_buff *skb, unsigned int len)
 	return tmp;
 }
 
-/**
- *	skb_put - add data to a buffer
- *	@skb: buffer to use
- *	@len: amount of data to add
- *
- *	This function extends the used data area of the buffer. If this would
- *	exceed the total buffer size the kernel will panic. A pointer to the
- *	first byte of the extra data is returned.
- */
-static inline unsigned char *skb_put(struct sk_buff *skb, unsigned int len)
-{
-	unsigned char *tmp = skb_tail_pointer(skb);
-	SKB_LINEAR_ASSERT(skb);
-	skb->tail += len;
-	skb->len  += len;
-	if (unlikely(skb->tail > skb->end))
-		skb_over_panic(skb, len, current_text_addr());
-	return tmp;
-}
-
 static inline unsigned char *__skb_push(struct sk_buff *skb, unsigned int len)
 {
 	skb->data -= len;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0d0fd28..3402eca 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -857,6 +857,27 @@ free_skb:
 	return err;
 }
 
+/**
+ *	skb_put - add data to a buffer
+ *	@skb: buffer to use
+ *	@len: amount of data to add
+ *
+ *	This function extends the used data area of the buffer. If this would
+ *	exceed the total buffer size the kernel will panic. A pointer to the
+ *	first byte of the extra data is returned.
+ */
+unsigned char *skb_put(struct sk_buff *skb, unsigned int len)
+{
+	unsigned char *tmp = skb_tail_pointer(skb);
+	SKB_LINEAR_ASSERT(skb);
+	skb->tail += len;
+	skb->len  += len;
+	if (unlikely(skb->tail > skb->end))
+		skb_over_panic(skb, len, __builtin_return_address(0));
+	return tmp;
+}
+EXPORT_SYMBOL(skb_put);
+
 /* Trims skb to length len. It can change skb pointers.
  */
 
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/7] [NET]: uninline skb_pull, de-bloats a lot
  2008-03-27 12:38 ` [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot Ilpo Järvinen
@ 2008-03-27 12:38   ` Ilpo Järvinen
  2008-03-27 12:38     ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Ilpo Järvinen
  2008-03-27 19:10   ` [PATCH 1/7] [NET]: uninline skb_put, " Joe Perches
  2008-03-28  0:43   ` David Miller
  2 siblings, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:38 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo, Ilpo J�rvinen

Allyesconfig (v2.6.24-mm1):

-28162  354 funcs, 3005 +, 31167 -, diff: -28162 --- skb_pull

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):

-9697  338 funcs, 221 +, 9918 -, diff: -9697 --- skb_pull
skb_pull                      |  +44

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/skbuff.h |   16 +---------------
 net/core/skbuff.c      |   16 ++++++++++++++++
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f085955..6d6cde7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -927,6 +927,7 @@ static inline unsigned char *skb_push(struct sk_buff *skb, unsigned int len)
 	return skb->data;
 }
 
+extern unsigned char *skb_pull(struct sk_buff *skb, unsigned int len);
 static inline unsigned char *__skb_pull(struct sk_buff *skb, unsigned int len)
 {
 	skb->len -= len;
@@ -934,21 +935,6 @@ static inline unsigned char *__skb_pull(struct sk_buff *skb, unsigned int len)
 	return skb->data += len;
 }
 
-/**
- *	skb_pull - remove data from the start of a buffer
- *	@skb: buffer to use
- *	@len: amount of data to remove
- *
- *	This function removes data from the start of a buffer, returning
- *	the memory to the headroom. A pointer to the next data in the buffer
- *	is returned. Once the data has been pulled future pushes will overwrite
- *	the old data.
- */
-static inline unsigned char *skb_pull(struct sk_buff *skb, unsigned int len)
-{
-	return unlikely(len > skb->len) ? NULL : __skb_pull(skb, len);
-}
-
 extern unsigned char *__pskb_pull_tail(struct sk_buff *skb, int delta);
 
 static inline unsigned char *__pskb_pull(struct sk_buff *skb, unsigned int len)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3402eca..cf489b6 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -878,6 +878,22 @@ unsigned char *skb_put(struct sk_buff *skb, unsigned int len)
 }
 EXPORT_SYMBOL(skb_put);
 
+/**
+ *	skb_pull - remove data from the start of a buffer
+ *	@skb: buffer to use
+ *	@len: amount of data to remove
+ *
+ *	This function removes data from the start of a buffer, returning
+ *	the memory to the headroom. A pointer to the next data in the buffer
+ *	is returned. Once the data has been pulled future pushes will overwrite
+ *	the old data.
+ */
+unsigned char *skb_pull(struct sk_buff *skb, unsigned int len)
+{
+	return unlikely(len > skb->len) ? NULL : __skb_pull(skb, len);
+}
+EXPORT_SYMBOL(skb_pull);
+
 /* Trims skb to length len. It can change skb pointers.
  */
 
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-27 12:38   ` [PATCH 2/7] [NET]: uninline skb_pull, " Ilpo Järvinen
@ 2008-03-27 12:38     ` Ilpo Järvinen
  2008-03-27 12:38       ` [PATCH 4/7] [NET]: uninline skb_push, " Ilpo Järvinen
                         ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:38 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo, Ilpo J�rvinen

Allyesconfig (v2.6.24-mm1):

-23668  392 funcs, 104 +, 23772 -, diff: -23668 --- dev_alloc_skb

Without many debug CONFIGs (v2.6.25-rc2-mm1):

-12178  382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb
dev_alloc_skb                 |  +37

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/skbuff.h |   17 +----------------
 net/core/skbuff.c      |   18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6d6cde7..01a11b0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1272,22 +1272,7 @@ static inline struct sk_buff *__dev_alloc_skb(unsigned int length,
 	return skb;
 }
 
-/**
- *	dev_alloc_skb - allocate an skbuff for receiving
- *	@length: length to allocate
- *
- *	Allocate a new &sk_buff and assign it a usage count of one. The
- *	buffer has unspecified headroom built in. Users should allocate
- *	the headroom they think they need without accounting for the
- *	built in space. The built in space is used for optimisations.
- *
- *	%NULL is returned if there is no free memory. Although this function
- *	allocates memory it can be called from an interrupt.
- */
-static inline struct sk_buff *dev_alloc_skb(unsigned int length)
-{
-	return __dev_alloc_skb(length, GFP_ATOMIC);
-}
+extern struct sk_buff *dev_alloc_skb(unsigned int length);
 
 extern struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 		unsigned int length, gfp_t gfp_mask);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index cf489b6..0daf5c0 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -263,6 +263,24 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 	return skb;
 }
 
+/**
+ *	dev_alloc_skb - allocate an skbuff for receiving
+ *	@length: length to allocate
+ *
+ *	Allocate a new &sk_buff and assign it a usage count of one. The
+ *	buffer has unspecified headroom built in. Users should allocate
+ *	the headroom they think they need without accounting for the
+ *	built in space. The built in space is used for optimisations.
+ *
+ *	%NULL is returned if there is no free memory. Although this function
+ *	allocates memory it can be called from an interrupt.
+ */
+struct sk_buff *dev_alloc_skb(unsigned int length)
+{
+	return __dev_alloc_skb(length, GFP_ATOMIC);
+}
+EXPORT_SYMBOL(dev_alloc_skb);
+
 static void skb_drop_list(struct sk_buff **listp)
 {
 	struct sk_buff *list = *listp;
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/7] [NET]: uninline skb_push, de-bloats a lot
  2008-03-27 12:38     ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Ilpo Järvinen
@ 2008-03-27 12:38       ` Ilpo Järvinen
  2008-03-27 12:38         ` [PATCH 5/7] [NET]: uninline dst_release Ilpo Järvinen
  2008-03-28  0:52         ` [PATCH 4/7] [NET]: uninline skb_push, de-bloats a lot David Miller
  2008-03-27 23:36       ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Denys Vlasenko
  2008-03-28  0:51       ` David Miller
  2 siblings, 2 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:38 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo, Ilpo J�rvinen

Allyesconfig (v2.6.24-mm1):

-21593  356 funcs, 2418 +, 24011 -, diff: -21593 --- skb_push

Without many debug related CONFIGs (v2.6.25-rc2-mm1):

-13890  341 funcs, 189 +, 14079 -, diff: -13890 --- skb_push
skb_push                      |  +46

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/skbuff.h |   19 +------------------
 net/core/skbuff.c      |   19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 01a11b0..1baf4d4 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -902,6 +902,7 @@ static inline unsigned char *__skb_put(struct sk_buff *skb, unsigned int len)
 	return tmp;
 }
 
+extern unsigned char *skb_push(struct sk_buff *skb, unsigned int len);
 static inline unsigned char *__skb_push(struct sk_buff *skb, unsigned int len)
 {
 	skb->data -= len;
@@ -909,24 +910,6 @@ static inline unsigned char *__skb_push(struct sk_buff *skb, unsigned int len)
 	return skb->data;
 }
 
-/**
- *	skb_push - add data to the start of a buffer
- *	@skb: buffer to use
- *	@len: amount of data to add
- *
- *	This function extends the used data area of the buffer at the buffer
- *	start. If this would exceed the total buffer headroom the kernel will
- *	panic. A pointer to the first byte of the extra data is returned.
- */
-static inline unsigned char *skb_push(struct sk_buff *skb, unsigned int len)
-{
-	skb->data -= len;
-	skb->len  += len;
-	if (unlikely(skb->data<skb->head))
-		skb_under_panic(skb, len, current_text_addr());
-	return skb->data;
-}
-
 extern unsigned char *skb_pull(struct sk_buff *skb, unsigned int len);
 static inline unsigned char *__skb_pull(struct sk_buff *skb, unsigned int len)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0daf5c0..a37127b 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -897,6 +897,25 @@ unsigned char *skb_put(struct sk_buff *skb, unsigned int len)
 EXPORT_SYMBOL(skb_put);
 
 /**
+ *	skb_push - add data to the start of a buffer
+ *	@skb: buffer to use
+ *	@len: amount of data to add
+ *
+ *	This function extends the used data area of the buffer at the buffer
+ *	start. If this would exceed the total buffer headroom the kernel will
+ *	panic. A pointer to the first byte of the extra data is returned.
+ */
+unsigned char *skb_push(struct sk_buff *skb, unsigned int len)
+{
+	skb->data -= len;
+	skb->len  += len;
+	if (unlikely(skb->data<skb->head))
+		skb_under_panic(skb, len, __builtin_return_address(0));
+	return skb->data;
+}
+EXPORT_SYMBOL(skb_push);
+
+/**
  *	skb_pull - remove data from the start of a buffer
  *	@skb: buffer to use
  *	@len: amount of data to remove
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 5/7] [NET]: uninline dst_release
  2008-03-27 12:38       ` [PATCH 4/7] [NET]: uninline skb_push, " Ilpo Järvinen
@ 2008-03-27 12:38         ` Ilpo Järvinen
  2008-03-27 12:38           ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats Ilpo Järvinen
  2008-03-28  0:53           ` [PATCH 5/7] [NET]: uninline dst_release David Miller
  2008-03-28  0:52         ` [PATCH 4/7] [NET]: uninline skb_push, de-bloats a lot David Miller
  1 sibling, 2 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:38 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo, Ilpo J�rvinen

Codiff stats (allyesconfig, v2.6.24-mm1):
-16420  187 funcs, 103 +, 16523 -, diff: -16420 --- dst_release

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7257  186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release
dst_release                   |  +40

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/net/dst.h |   10 +---------
 net/core/dst.c    |   10 ++++++++++
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index ae13370..002500e 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -163,15 +163,7 @@ struct dst_entry * dst_clone(struct dst_entry * dst)
 	return dst;
 }
 
-static inline
-void dst_release(struct dst_entry * dst)
-{
-	if (dst) {
-		WARN_ON(atomic_read(&dst->__refcnt) < 1);
-		smp_mb__before_atomic_dec();
-		atomic_dec(&dst->__refcnt);
-	}
-}
+extern void dst_release(struct dst_entry *dst);
 
 /* Children define the path of the packet through the
  * Linux networking.  Thus, destinations are stackable.
diff --git a/net/core/dst.c b/net/core/dst.c
index 694cd2a..fe03266 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -259,6 +259,16 @@ again:
 	return NULL;
 }
 
+void dst_release(struct dst_entry *dst)
+{
+	if (dst) {
+		WARN_ON(atomic_read(&dst->__refcnt) < 1);
+		smp_mb__before_atomic_dec();
+		atomic_dec(&dst->__refcnt);
+	}
+}
+EXPORT_SYMBOL(dst_release);
+
 /* Dirty hack. We did it in 2.2 (in __dst_free),
  * we have _very_ good reasons not to repeat
  * this mistake in 2.3, but we have no choice
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 6/7] [NET]: uninline skb_trim, de-bloats
  2008-03-27 12:38         ` [PATCH 5/7] [NET]: uninline dst_release Ilpo Järvinen
@ 2008-03-27 12:38           ` Ilpo Järvinen
  2008-03-27 12:38             ` [PATCH 7/7] [SCTP]: Remove sctp_add_cmd_sf wrapper bloat Ilpo Järvinen
  2008-03-28  0:54             ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats David Miller
  2008-03-28  0:53           ` [PATCH 5/7] [NET]: uninline dst_release David Miller
  1 sibling, 2 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:38 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo, Ilpo J�rvinen

Allyesconfig (v2.6.24-mm1):
-10976  209 funcs, 123 +, 11099 -, diff: -10976 --- skb_trim

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7360  192 funcs, 131 +, 7491 -, diff: -7360 --- skb_trim
skb_trim                      |  +42

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/linux/skbuff.h |   16 +---------------
 net/core/skbuff.c      |   16 ++++++++++++++++
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 1baf4d4..ff72145 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1158,21 +1158,7 @@ static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
 	skb_set_tail_pointer(skb, len);
 }
 
-/**
- *	skb_trim - remove end from a buffer
- *	@skb: buffer to alter
- *	@len: new length
- *
- *	Cut the length of a buffer down by removing data from the tail. If
- *	the buffer is already under the length specified it is not modified.
- *	The skb must be linear.
- */
-static inline void skb_trim(struct sk_buff *skb, unsigned int len)
-{
-	if (skb->len > len)
-		__skb_trim(skb, len);
-}
-
+extern void skb_trim(struct sk_buff *skb, unsigned int len);
 
 static inline int __pskb_trim(struct sk_buff *skb, unsigned int len)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a37127b..86e5682 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -931,6 +931,22 @@ unsigned char *skb_pull(struct sk_buff *skb, unsigned int len)
 }
 EXPORT_SYMBOL(skb_pull);
 
+/**
+ *	skb_trim - remove end from a buffer
+ *	@skb: buffer to alter
+ *	@len: new length
+ *
+ *	Cut the length of a buffer down by removing data from the tail. If
+ *	the buffer is already under the length specified it is not modified.
+ *	The skb must be linear.
+ */
+void skb_trim(struct sk_buff *skb, unsigned int len)
+{
+	if (skb->len > len)
+		__skb_trim(skb, len);
+}
+EXPORT_SYMBOL(skb_trim);
+
 /* Trims skb to length len. It can change skb pointers.
  */
 
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 7/7] [SCTP]: Remove sctp_add_cmd_sf wrapper bloat
  2008-03-27 12:38           ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats Ilpo Järvinen
@ 2008-03-27 12:38             ` Ilpo Järvinen
  2008-03-28  0:54               ` David Miller
  2008-03-28  0:54             ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats David Miller
  1 sibling, 1 reply; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-27 12:38 UTC (permalink / raw)
  To: Andrew Morton, David Miller, netdev, linux-kernel
  Cc: Arnaldo Carvalho de Melo, Ilpo J�rvinen, Vlad Yasevich

With a was number of callsites sctp_add_cmd_sf wrapper bloats
kernel by some amount. Due to unlikely tracking allyesconfig,
with the initial result were around ~7kB (thus caught my
attention) while a non-debug config produced only ~2.3kB effect.

I (ij) proposed first a patch to uninline it but Vlad responded
with a patch that removed the only sctp_add_cmd call which is
wrapped by sctp_add_cmd_sf (I wasn't sure if I could do that).
I did minor cleanup to Vlad's patch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 include/net/sctp/command.h |    3 +--
 include/net/sctp/sm.h      |    8 --------
 net/sctp/command.c         |   10 ++--------
 net/sctp/sm_statefuns.c    |    8 ++------
 4 files changed, 5 insertions(+), 24 deletions(-)

diff --git a/include/net/sctp/command.h b/include/net/sctp/command.h
index 10ae2da..4263af8 100644
--- a/include/net/sctp/command.h
+++ b/include/net/sctp/command.h
@@ -205,12 +205,11 @@ typedef struct {
 int sctp_init_cmd_seq(sctp_cmd_seq_t *seq);
 
 /* Add a command to an sctp_cmd_seq_t.
- * Return 0 if the command sequence is full.
  *
  * Use the SCTP_* constructors defined by SCTP_ARG_CONSTRUCTOR() above
  * to wrap data which goes in the obj argument.
  */
-int sctp_add_cmd(sctp_cmd_seq_t *seq, sctp_verb_t verb, sctp_arg_t obj);
+void sctp_add_cmd_sf(sctp_cmd_seq_t *seq, sctp_verb_t verb, sctp_arg_t obj);
 
 /* Return the next command structure in an sctp_cmd_seq.
  * Return NULL at the end of the sequence.
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index ef9e7ed..2481173 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -385,14 +385,6 @@ static inline int ADDIP_SERIAL_gte(__u16 s, __u16 t)
 	return (((s) == (t)) || (((t) - (s)) & ADDIP_SERIAL_SIGN_BIT));
 }
 
-
-/* Run sctp_add_cmd() generating a BUG() if there is a failure.  */
-static inline void sctp_add_cmd_sf(sctp_cmd_seq_t *seq, sctp_verb_t verb, sctp_arg_t obj)
-{
-	if (unlikely(!sctp_add_cmd(seq, verb, obj)))
-		BUG();
-}
-
 /* Check VTAG of the packet matches the sender's own tag. */
 static inline int
 sctp_vtag_verify(const struct sctp_chunk *chunk,
diff --git a/net/sctp/command.c b/net/sctp/command.c
index bb97733..c004401 100644
--- a/net/sctp/command.c
+++ b/net/sctp/command.c
@@ -52,18 +52,12 @@ int sctp_init_cmd_seq(sctp_cmd_seq_t *seq)
 /* Add a command to a sctp_cmd_seq_t.
  * Return 0 if the command sequence is full.
  */
-int sctp_add_cmd(sctp_cmd_seq_t *seq, sctp_verb_t verb, sctp_arg_t obj)
+void sctp_add_cmd_sf(sctp_cmd_seq_t *seq, sctp_verb_t verb, sctp_arg_t obj)
 {
-	if (seq->next_free_slot >= SCTP_MAX_NUM_COMMANDS)
-		goto fail;
+	BUG_ON(seq->next_free_slot >= SCTP_MAX_NUM_COMMANDS);
 
 	seq->cmds[seq->next_free_slot].verb = verb;
 	seq->cmds[seq->next_free_slot++].obj = obj;
-
-	return 1;
-
-fail:
-	return 0;
 }
 
 /* Return the next command structure in a sctp_cmd_seq.
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 6545b5f..b534dbe 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3135,12 +3135,8 @@ sctp_disposition_t sctp_sf_operr_notify(const struct sctp_endpoint *ep,
 		if (!ev)
 			goto nomem;
 
-		if (!sctp_add_cmd(commands, SCTP_CMD_EVENT_ULP,
-				  SCTP_ULPEVENT(ev))) {
-			sctp_ulpevent_free(ev);
-			goto nomem;
-		}
-
+		sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP,
+				SCTP_ULPEVENT(ev));
 		sctp_add_cmd_sf(commands, SCTP_CMD_PROCESS_OPERR,
 				SCTP_CHUNK(chunk));
 	}
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-27 12:38 ` [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot Ilpo Järvinen
  2008-03-27 12:38   ` [PATCH 2/7] [NET]: uninline skb_pull, " Ilpo Järvinen
@ 2008-03-27 19:10   ` Joe Perches
  2008-03-27 22:04     ` David Miller
  2008-03-28  0:43   ` David Miller
  2 siblings, 1 reply; 29+ messages in thread
From: Joe Perches @ 2008-03-27 19:10 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Andrew Morton, David Miller, netdev, linux-kernel,
	Arnaldo Carvalho de Melo, Matt Mackall

On Thu, 2008-03-27 at 14:38 +0200, Ilpo Järvinen wrote:
> Allyesconfig (v2.6.24-mm1):

I think this change is only good in severely memory
limited uses.  This will very likely negatively impact
high speed networking.  It's a speed/size trade off.

Perhaps a CONFIG option to reduce net code size would
be appropriate.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-27 19:10   ` [PATCH 1/7] [NET]: uninline skb_put, " Joe Perches
@ 2008-03-27 22:04     ` David Miller
  2008-03-28  0:02       ` Joe Perches
  2008-03-28  0:11       ` Matt Mackall
  0 siblings, 2 replies; 29+ messages in thread
From: David Miller @ 2008-03-27 22:04 UTC (permalink / raw)
  To: joe; +Cc: ilpo.jarvinen, akpm, netdev, linux-kernel, acme, mpm

From: Joe Perches <joe@perches.com>
Date: Thu, 27 Mar 2008 12:10:50 -0700

> On Thu, 2008-03-27 at 14:38 +0200, Ilpo Järvinen wrote:
> > Allyesconfig (v2.6.24-mm1):
> 
> I think this change is only good in severely memory
> limited uses.  This will very likely negatively impact
> high speed networking.  It's a speed/size trade off.

I severely doubt this, the bulk of the overhead of
skb_put() is the atomic operation, not whether the
instructions get executed inline or not.

Please run some tests before making claims like this.
I might give you some slack in this area if you've
been doing networking performance tuning for 10+ years
but you haven't.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-27 12:38     ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Ilpo Järvinen
  2008-03-27 12:38       ` [PATCH 4/7] [NET]: uninline skb_push, " Ilpo Järvinen
@ 2008-03-27 23:36       ` Denys Vlasenko
  2008-03-28  0:52         ` David Miller
  2008-03-28  0:51       ` David Miller
  2 siblings, 1 reply; 29+ messages in thread
From: Denys Vlasenko @ 2008-03-27 23:36 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Andrew Morton, David Miller, netdev, linux-kernel,
	Arnaldo Carvalho de Melo

On Thursday 27 March 2008 13:38, Ilpo Järvinen wrote:
> Allyesconfig (v2.6.24-mm1):
> 
> -23668  392 funcs, 104 +, 23772 -, diff: -23668 --- dev_alloc_skb
> 
> Without many debug CONFIGs (v2.6.25-rc2-mm1):
> 
> -12178  382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb
> dev_alloc_skb                 |  +37


This will be very confusing for casual reader:

> +struct sk_buff *dev_alloc_skb(unsigned int length)
> +{
> +	return __dev_alloc_skb(length, GFP_ATOMIC);
> +}
> +EXPORT_SYMBOL(dev_alloc_skb);

"what, they want to save one push instruction per callsite??".

Can you add a comment which explains the intent?

+struct sk_buff *dev_alloc_skb(unsigned int length)
+{
+       /* There is more code here than it seems:
+        * __dev_alloc_skb is an inline */
+	return __dev_alloc_skb(length, GFP_ATOMIC);
+}
+EXPORT_SYMBOL(dev_alloc_skb);


Another good chunk of code size saving can be achieved
by introducing dev_alloc_skb_or_warn(), and using it
in places like these:

drivers/net/irda/nsc-ircc.c:

                        skb = dev_alloc_skb(len+1);
                        if (skb == NULL)  {
                                IRDA_WARNING("%s(), memory squeeze, "
                                             "dropping frame.\n",
                                             __FUNCTION__);

drivers/net/appletalk/ltpc.c:

        skb = dev_alloc_skb(3+sklen);
        if (skb == NULL)
        {
                printk("%s: dropping packet due to memory squeeze.\n",

net/econet/af_econet.c:

        newskb = alloc_skb((len - sizeof(struct aunhdr) + 15) & ~15,
                           GFP_ATOMIC);
        if (newskb == NULL)
        {
                printk(KERN_DEBUG "AUN: memory squeeze, dropping packet.\n");
                /* Send nack and hope sender tries again */
                goto bad;
        }

(hmm, this last one also wants s/alloc_skb(GFP_ATOMIC)/dev_alloc_skb/)

--
vda

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-27 22:04     ` David Miller
@ 2008-03-28  0:02       ` Joe Perches
  2008-03-28  0:04         ` David Miller
  2008-03-28  0:11       ` Matt Mackall
  1 sibling, 1 reply; 29+ messages in thread
From: Joe Perches @ 2008-03-28  0:02 UTC (permalink / raw)
  To: David Miller; +Cc: ilpo.jarvinen, akpm, netdev, linux-kernel, acme, mpm

On Thu, 2008-03-27 at 15:04 -0700, David Miller wrote:
> From: Joe Perches <joe@perches.com>
> > This will very likely negatively impact
> > high speed networking.
> Please run some tests before making claims like this.

I think this is the sort of change you might have argued
against accepting without performance testing not long ago.

> I might give you some slack in this area if you've
> been doing networking performance tuning for 10+ years
> but you haven't.

I've done some work in network performance testing.

cheers, Joe


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-28  0:02       ` Joe Perches
@ 2008-03-28  0:04         ` David Miller
  0 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:04 UTC (permalink / raw)
  To: joe; +Cc: ilpo.jarvinen, akpm, netdev, linux-kernel, acme, mpm

From: Joe Perches <joe@perches.com>
Date: Thu, 27 Mar 2008 17:02:37 -0700

> I think this is the sort of change you might have argued
> against accepting without performance testing not long ago.

Look at how complicated the assembler of this function
is.  Any inlining performance gains will be trumped by
the instruction cache usage bloat.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-27 22:04     ` David Miller
  2008-03-28  0:02       ` Joe Perches
@ 2008-03-28  0:11       ` Matt Mackall
  2008-03-28  0:54         ` Joe Perches
  1 sibling, 1 reply; 29+ messages in thread
From: Matt Mackall @ 2008-03-28  0:11 UTC (permalink / raw)
  To: David Miller; +Cc: joe, ilpo.jarvinen, akpm, netdev, linux-kernel, acme


On Thu, 2008-03-27 at 15:04 -0700, David Miller wrote:
> From: Joe Perches <joe@perches.com>
> Date: Thu, 27 Mar 2008 12:10:50 -0700
> 
> > On Thu, 2008-03-27 at 14:38 +0200, Ilpo Järvinen wrote:
> > > Allyesconfig (v2.6.24-mm1):
> > 
> > I think this change is only good in severely memory
> > limited uses.  This will very likely negatively impact
> > high speed networking.  It's a speed/size trade off.
> 
> I severely doubt this, the bulk of the overhead of
> skb_put() is the atomic operation, not whether the
> instructions get executed inline or not.

More generally, we have to weigh the cost of a function call against the
cost of a cache miss here or -somewhere else-. That is, running multiple
copies of this code inline means that much other code gets pushed out of
cache. Further, consolidating multiple copies of this code into one
means it's that much more likely to already be in cache when we hit it.

In the 486 era, when CPU performance was close to 1:1 with memory,
branches were more expensive than sequential memory fetches, and
registers were scarce, inlining made a fair amount of sense.

But now we've moved very far away from that indeed: CPU is orders of
magnitude faster than memory, branches are quite cheap, and register
are.. well, not quite as scarce. All of that means that a cache miss is
much more expensive than a function call.

Inlining typical only makes sense for fairly trivial transformations
(something you'd consider doing with a macro) where the code to set up a
function call is comparable to the size of the function itself and code
in innermost loops. And if the inline is in a header file, it's probably
not in the latter class.

In the case of this patch, removing 60-100k from the network stack means
we're almost certainly avoiding a lot of cache misses in the big picture
while taking a few cycle hit per packet in the smallest scale.

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-27 12:38 ` [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot Ilpo Järvinen
  2008-03-27 12:38   ` [PATCH 2/7] [NET]: uninline skb_pull, " Ilpo Järvinen
  2008-03-27 19:10   ` [PATCH 1/7] [NET]: uninline skb_put, " Joe Perches
@ 2008-03-28  0:43   ` David Miller
  2 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:43 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: akpm, netdev, linux-kernel, acme

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 27 Mar 2008 14:38:00 +0200

> Allyesconfig (v2.6.24-mm1):
> 
> ~500 files changed
> ...
>  869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
>   skb_put                       | +104
> 
> Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
> 
> -60744  855 funcs, 861 +, 61605 -, diff: -60744 --- skb_put
>   skb_put                       |  +57
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-27 12:38     ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Ilpo Järvinen
  2008-03-27 12:38       ` [PATCH 4/7] [NET]: uninline skb_push, " Ilpo Järvinen
  2008-03-27 23:36       ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Denys Vlasenko
@ 2008-03-28  0:51       ` David Miller
  2 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:51 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: akpm, netdev, linux-kernel, acme

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 27 Mar 2008 14:38:02 +0200

> Allyesconfig (v2.6.24-mm1):
> 
> -23668  392 funcs, 104 +, 23772 -, diff: -23668 --- dev_alloc_skb
> 
> Without many debug CONFIGs (v2.6.25-rc2-mm1):
> 
> -12178  382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb
> dev_alloc_skb                 |  +37
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-27 23:36       ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Denys Vlasenko
@ 2008-03-28  0:52         ` David Miller
  2008-03-28  1:42           ` Arnaldo Carvalho de Melo
  2008-03-28 22:34           ` Denys Vlasenko
  0 siblings, 2 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:52 UTC (permalink / raw)
  To: vda.linux; +Cc: ilpo.jarvinen, akpm, netdev, linux-kernel, acme

From: Denys Vlasenko <vda.linux@googlemail.com>
Date: Fri, 28 Mar 2008 00:36:59 +0100

> Can you add a comment which explains the intent?
> 
> +struct sk_buff *dev_alloc_skb(unsigned int length)
> +{
> +       /* There is more code here than it seems:
> +        * __dev_alloc_skb is an inline */
> +	return __dev_alloc_skb(length, GFP_ATOMIC);
> +}
> +EXPORT_SYMBOL(dev_alloc_skb);

I've applied his patch already, if you want this comment
please submit the patch to add it and also please use
the correct formatting of the comment.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/7] [NET]: uninline skb_push, de-bloats a lot
  2008-03-27 12:38       ` [PATCH 4/7] [NET]: uninline skb_push, " Ilpo Järvinen
  2008-03-27 12:38         ` [PATCH 5/7] [NET]: uninline dst_release Ilpo Järvinen
@ 2008-03-28  0:52         ` David Miller
  1 sibling, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:52 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: akpm, netdev, linux-kernel, acme

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 27 Mar 2008 14:38:03 +0200

> Allyesconfig (v2.6.24-mm1):
> 
> -21593  356 funcs, 2418 +, 24011 -, diff: -21593 --- skb_push
> 
> Without many debug related CONFIGs (v2.6.25-rc2-mm1):
> 
> -13890  341 funcs, 189 +, 14079 -, diff: -13890 --- skb_push
> skb_push                      |  +46
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] [NET]: uninline dst_release
  2008-03-27 12:38         ` [PATCH 5/7] [NET]: uninline dst_release Ilpo Järvinen
  2008-03-27 12:38           ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats Ilpo Järvinen
@ 2008-03-28  0:53           ` David Miller
  1 sibling, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:53 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: akpm, netdev, linux-kernel, acme

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 27 Mar 2008 14:38:04 +0200

> Codiff stats (allyesconfig, v2.6.24-mm1):
> -16420  187 funcs, 103 +, 16523 -, diff: -16420 --- dst_release
> 
> Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
> -7257  186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release
> dst_release                   |  +40
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/7] [NET]: uninline skb_trim, de-bloats
  2008-03-27 12:38           ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats Ilpo Järvinen
  2008-03-27 12:38             ` [PATCH 7/7] [SCTP]: Remove sctp_add_cmd_sf wrapper bloat Ilpo Järvinen
@ 2008-03-28  0:54             ` David Miller
  1 sibling, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:54 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: akpm, netdev, linux-kernel, acme

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 27 Mar 2008 14:38:05 +0200

> Allyesconfig (v2.6.24-mm1):
> -10976  209 funcs, 123 +, 11099 -, diff: -10976 --- skb_trim
> 
> Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
> -7360  192 funcs, 131 +, 7491 -, diff: -7360 --- skb_trim
> skb_trim                      |  +42
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 7/7] [SCTP]: Remove sctp_add_cmd_sf wrapper bloat
  2008-03-27 12:38             ` [PATCH 7/7] [SCTP]: Remove sctp_add_cmd_sf wrapper bloat Ilpo Järvinen
@ 2008-03-28  0:54               ` David Miller
  0 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:54 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: akpm, netdev, linux-kernel, acme, vladislav.yasevich

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 27 Mar 2008 14:38:06 +0200

> With a was number of callsites sctp_add_cmd_sf wrapper bloats
> kernel by some amount. Due to unlikely tracking allyesconfig,
> with the initial result were around ~7kB (thus caught my
> attention) while a non-debug config produced only ~2.3kB effect.
> 
> I (ij) proposed first a patch to uninline it but Vlad responded
> with a patch that removed the only sctp_add_cmd call which is
> wrapped by sctp_add_cmd_sf (I wasn't sure if I could do that).
> I did minor cleanup to Vlad's patch.
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-28  0:11       ` Matt Mackall
@ 2008-03-28  0:54         ` Joe Perches
  2008-03-28  1:18           ` Matt Mackall
  0 siblings, 1 reply; 29+ messages in thread
From: Joe Perches @ 2008-03-28  0:54 UTC (permalink / raw)
  To: Matt Mackall
  Cc: David Miller, ilpo.jarvinen, akpm, netdev, linux-kernel, acme

On Thu, 2008-03-27 at 19:11 -0500, Matt Mackall wrote:
> In the 486 era, when CPU performance was close to 1:1 with memory,
> branches were more expensive than sequential memory fetches, and
> registers were scarce, inlining made a fair amount of sense.
> 
> But now we've moved very far away from that indeed:

Systems have certainly improved but Linux is used in a
wide variety of CPU Hz, memory & register architectures.

Some of those systems haven't changed at all.
Some of those systems have sufficient cache for a
networking stack.

I think this change could negatively impact some of these
different uses and systems.

> In the case of this patch, removing 60-100k from the network stack means
> we're almost certainly avoiding a lot of cache misses in the big picture
> while taking a few cycle hit per packet in the smallest scale.

I think the quantities of big v small are instance dependent
and it might be prudent to have the capability to keep these
functions inline.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7]: uninline some net related static inline in .h bloaters
  2008-03-27 12:37 [PATCH 0/7]: uninline some net related static inline in .h bloaters Ilpo Järvinen
  2008-03-27 12:38 ` [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot Ilpo Järvinen
@ 2008-03-28  0:54 ` David Miller
  1 sibling, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28  0:54 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: akpm, netdev, linux-kernel, acme

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 27 Mar 2008 14:37:59 +0200

> As suggested by Andrew, I rerun the uninlining of all what's in .h
> (v2.6.25-rc2-mm1 this time) to fix the (un)likely profiling overhead
> which showed up in the allyesconfig results. The config was manually
> tweaked from allyesconfig to not include a number of debug related
> options [1]. Other setup: 32-bit x86, gcc (GCC) 4.1.2 20070626
> (Red Hat 4.1.2-13). I also tweaked the uninlining machinery a bit and
> got even better coverage than last time.

I applied everything and will push to net-2.6.26 after some
build tests.

Thanks!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-28  0:54         ` Joe Perches
@ 2008-03-28  1:18           ` Matt Mackall
  2008-03-28 19:56             ` Ilpo Järvinen
  0 siblings, 1 reply; 29+ messages in thread
From: Matt Mackall @ 2008-03-28  1:18 UTC (permalink / raw)
  To: Joe Perches; +Cc: David Miller, ilpo.jarvinen, akpm, netdev, linux-kernel, acme


On Thu, 2008-03-27 at 17:54 -0700, Joe Perches wrote:
> On Thu, 2008-03-27 at 19:11 -0500, Matt Mackall wrote:
> > In the 486 era, when CPU performance was close to 1:1 with memory,
> > branches were more expensive than sequential memory fetches, and
> > registers were scarce, inlining made a fair amount of sense.
> > 
> > But now we've moved very far away from that indeed:
> 
> Systems have certainly improved but Linux is used in a
> wide variety of CPU Hz, memory & register architectures.
> 
> Some of those systems haven't changed at all.

It's true. In particular, 486s haven't changed at all since the 486 era.
What's changed is that people no longer run 486s to go fast, they run
them to save money. Saving memory is a win for those people.

The same goes for embedded systems. Saving memory is much higher on the
priority scale than performance. And the fact that saving memory on the
low end aligns very nicely with increasing performance on the high end
means that's the direction we're going.

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-28  0:52         ` David Miller
@ 2008-03-28  1:42           ` Arnaldo Carvalho de Melo
  2008-03-28 22:34           ` Denys Vlasenko
  1 sibling, 0 replies; 29+ messages in thread
From: Arnaldo Carvalho de Melo @ 2008-03-28  1:42 UTC (permalink / raw)
  To: David Miller; +Cc: vda.linux, ilpo.jarvinen, akpm, netdev, linux-kernel, acme

Em Thu, Mar 27, 2008 at 05:52:12PM -0700, David Miller escreveu:
> From: Denys Vlasenko <vda.linux@googlemail.com>
> Date: Fri, 28 Mar 2008 00:36:59 +0100
> 
> > Can you add a comment which explains the intent?
> > 
> > +struct sk_buff *dev_alloc_skb(unsigned int length)
> > +{
> > +       /* There is more code here than it seems:
> > +        * __dev_alloc_skb is an inline */
> > +	return __dev_alloc_skb(length, GFP_ATOMIC);
> > +}
> > +EXPORT_SYMBOL(dev_alloc_skb);
> 
> I've applied his patch already, if you want this comment
> please submit the patch to add it and also please use
> the correct formatting of the comment.

Which is, of course:

	/*
	 * There is more code here than it seems:
	 * __def_alloc_skb is an inline
	 */

Ilpo: Keep up the debloating jihad/crusade/campaign/surge/kamppailla! 8-)

- arnaldo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot
  2008-03-28  1:18           ` Matt Mackall
@ 2008-03-28 19:56             ` Ilpo Järvinen
  0 siblings, 0 replies; 29+ messages in thread
From: Ilpo Järvinen @ 2008-03-28 19:56 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Joe Perches, David Miller, Andrew Morton, Netdev, LKML,
	Arnaldo Carvalho de Melo

On Thu, 27 Mar 2008, Matt Mackall wrote:
> On Thu, 2008-03-27 at 17:54 -0700, Joe Perches wrote:
> > On Thu, 2008-03-27 at 19:11 -0500, Matt Mackall wrote:
> > > In the 486 era, when CPU performance was close to 1:1 with memory,
> > > branches were more expensive than sequential memory fetches, and
> > > registers were scarce, inlining made a fair amount of sense.
> > > 
> > > But now we've moved very far away from that indeed:
> > 
> > Systems have certainly improved but Linux is used in a
> > wide variety of CPU Hz, memory & register architectures.
> > 
> > Some of those systems haven't changed at all.
> 
> It's true. In particular, 486s haven't changed at all since the 486 era.
> What's changed is that people no longer run 486s to go fast, they run
> them to save money. Saving memory is a win for those people.

I'd guess though that they won't get that big size saving because they
probably have carefully tuned configs as well.

> The same goes for embedded systems. Saving memory is much higher on the
> priority scale than performance. And the fact that saving memory on the
> low end aligns very nicely with increasing performance on the high end
> means that's the direction we're going.

Also if some TLB misses are avoided due to uninlining, it may well 
rebalance the equation (each config & scenario is quite unique though).

-- 
 i.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-28  0:52         ` David Miller
  2008-03-28  1:42           ` Arnaldo Carvalho de Melo
@ 2008-03-28 22:34           ` Denys Vlasenko
  2008-03-28 22:57             ` David Miller
  2008-03-29  4:52             ` Bill Fink
  1 sibling, 2 replies; 29+ messages in thread
From: Denys Vlasenko @ 2008-03-28 22:34 UTC (permalink / raw)
  To: David Miller; +Cc: ilpo.jarvinen, akpm, netdev, linux-kernel, acme

[-- Attachment #1: Type: text/plain, Size: 695 bytes --]

On Friday 28 March 2008 01:52, David Miller wrote:
> From: Denys Vlasenko <vda.linux@googlemail.com>
> Date: Fri, 28 Mar 2008 00:36:59 +0100
> 
> > Can you add a comment which explains the intent?
> > 
> > +struct sk_buff *dev_alloc_skb(unsigned int length)
> > +{
> > +       /* There is more code here than it seems:
> > +        * __dev_alloc_skb is an inline */
> > +	return __dev_alloc_skb(length, GFP_ATOMIC);
> > +}
> > +EXPORT_SYMBOL(dev_alloc_skb);
> 
> I've applied his patch already, if you want this comment
> please submit the patch to add it and also please use
> the correct formatting of the comment.

Here it is.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
--
vda

[-- Attachment #2: dev_alloc_skb_comment.diff --]
[-- Type: text/x-diff, Size: 374 bytes --]

--- net-2.6.26/net/core/skbuff.c	Fri Mar 28 23:31:05 2008
+++ net-2.6.26.comment/net/core/skbuff.c	Fri Mar 28 23:33:03 2008
@@ -277,6 +277,10 @@
  */
 struct sk_buff *dev_alloc_skb(unsigned int length)
 {
+	/*
+	 * There is more code here than it seems:
+	 * __def_alloc_skb is an inline
+	 */
 	return __dev_alloc_skb(length, GFP_ATOMIC);
 }
 EXPORT_SYMBOL(dev_alloc_skb);

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-28 22:34           ` Denys Vlasenko
@ 2008-03-28 22:57             ` David Miller
  2008-03-29  4:52             ` Bill Fink
  1 sibling, 0 replies; 29+ messages in thread
From: David Miller @ 2008-03-28 22:57 UTC (permalink / raw)
  To: vda.linux; +Cc: ilpo.jarvinen, akpm, netdev, linux-kernel, acme

From: Denys Vlasenko <vda.linux@googlemail.com>
Date: Fri, 28 Mar 2008 23:34:29 +0100

> Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>

Applied, thanks!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/7] [NET]: uninline dev_alloc_skb, de-bloats a lot
  2008-03-28 22:34           ` Denys Vlasenko
  2008-03-28 22:57             ` David Miller
@ 2008-03-29  4:52             ` Bill Fink
  1 sibling, 0 replies; 29+ messages in thread
From: Bill Fink @ 2008-03-29  4:52 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: David Miller, ilpo.jarvinen, akpm, netdev, linux-kernel, acme

Typo in patch (see below):

On Fri, 28 Mar 2008, Denys Vlasenko wrote:

> On Friday 28 March 2008 01:52, David Miller wrote:
> > From: Denys Vlasenko <vda.linux@googlemail.com>
> > Date: Fri, 28 Mar 2008 00:36:59 +0100
> > 
> > > Can you add a comment which explains the intent?
> > > 
> > > +struct sk_buff *dev_alloc_skb(unsigned int length)
> > > +{
> > > +       /* There is more code here than it seems:
> > > +        * __dev_alloc_skb is an inline */
> > > +	return __dev_alloc_skb(length, GFP_ATOMIC);
> > > +}
> > > +EXPORT_SYMBOL(dev_alloc_skb);
> > 
> > I've applied his patch already, if you want this comment
> > please submit the patch to add it and also please use
> > the correct formatting of the comment.
> 
> Here it is.
> 
> Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
> --
> vda
> 
> --- net-2.6.26/net/core/skbuff.c	Fri Mar 28 23:31:05 2008
> +++ net-2.6.26.comment/net/core/skbuff.c	Fri Mar 28 23:33:03 2008
> @@ -277,6 +277,10 @@
>   */
>  struct sk_buff *dev_alloc_skb(unsigned int length)
>  {
> +	/*
> +	 * There is more code here than it seems:
> +	 * __def_alloc_skb is an inline
               ^
               v

> +	 */
>  	return __dev_alloc_skb(length, GFP_ATOMIC);
>  }
>  EXPORT_SYMBOL(dev_alloc_skb);

						-Bill

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2008-03-29  4:52 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-27 12:37 [PATCH 0/7]: uninline some net related static inline in .h bloaters Ilpo Järvinen
2008-03-27 12:38 ` [PATCH 1/7] [NET]: uninline skb_put, de-bloats a lot Ilpo Järvinen
2008-03-27 12:38   ` [PATCH 2/7] [NET]: uninline skb_pull, " Ilpo Järvinen
2008-03-27 12:38     ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Ilpo Järvinen
2008-03-27 12:38       ` [PATCH 4/7] [NET]: uninline skb_push, " Ilpo Järvinen
2008-03-27 12:38         ` [PATCH 5/7] [NET]: uninline dst_release Ilpo Järvinen
2008-03-27 12:38           ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats Ilpo Järvinen
2008-03-27 12:38             ` [PATCH 7/7] [SCTP]: Remove sctp_add_cmd_sf wrapper bloat Ilpo Järvinen
2008-03-28  0:54               ` David Miller
2008-03-28  0:54             ` [PATCH 6/7] [NET]: uninline skb_trim, de-bloats David Miller
2008-03-28  0:53           ` [PATCH 5/7] [NET]: uninline dst_release David Miller
2008-03-28  0:52         ` [PATCH 4/7] [NET]: uninline skb_push, de-bloats a lot David Miller
2008-03-27 23:36       ` [PATCH 3/7] [NET]: uninline dev_alloc_skb, " Denys Vlasenko
2008-03-28  0:52         ` David Miller
2008-03-28  1:42           ` Arnaldo Carvalho de Melo
2008-03-28 22:34           ` Denys Vlasenko
2008-03-28 22:57             ` David Miller
2008-03-29  4:52             ` Bill Fink
2008-03-28  0:51       ` David Miller
2008-03-27 19:10   ` [PATCH 1/7] [NET]: uninline skb_put, " Joe Perches
2008-03-27 22:04     ` David Miller
2008-03-28  0:02       ` Joe Perches
2008-03-28  0:04         ` David Miller
2008-03-28  0:11       ` Matt Mackall
2008-03-28  0:54         ` Joe Perches
2008-03-28  1:18           ` Matt Mackall
2008-03-28 19:56             ` Ilpo Järvinen
2008-03-28  0:43   ` David Miller
2008-03-28  0:54 ` [PATCH 0/7]: uninline some net related static inline in .h bloaters David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).