LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] ipvs: Make the synchronization interval controllable
@ 2008-02-06 21:39 Sven Wegener
  2008-02-06 22:09 ` David Rientjes
  2008-02-08 14:23 ` Andi Kleen
  0 siblings, 2 replies; 8+ messages in thread
From: Sven Wegener @ 2008-02-06 21:39 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

The default synchronization interval of 1000 milliseconds is too high for a
heavily loaded director. Collecting the connection information from one second
and then sending it out in a burst will overflow the socket buffer and lead to
synchronization information being dropped. Make the interval controllable by a
sysctl variable so that users can tune it.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---
 Documentation/networking/ipvs-sysctl.txt |    9 +++++++++
 net/ipv4/ipvs/ip_vs_ctl.c                |    9 ++++++++-
 net/ipv4/ipvs/ip_vs_sync.c               |    6 ++++--
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
index 4ccdbca..1389e2f 100644
--- a/Documentation/networking/ipvs-sysctl.txt
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -141,3 +141,12 @@ sync_threshold - INTEGER
         synchronized, every time the number of its incoming packets
         modulus 50 equals the threshold. The range of the threshold is
         from 0 to 49.
+
+sync_interval - INTEGER
+	default 1000
+
+	The information from synchronization is buffered and sent out at
+	regular intervals by a kernel thread. The interval (in ms) is
+	controlled by this value. The default is too high for a heavily loaded
+	director. If you get a lot of "ip_vs_send_async error" messages from
+	your kernel, then you should lower this value.
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index 94c5767..2781505 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -81,7 +81,7 @@ int sysctl_ip_vs_expire_nodest_conn = 0;
 int sysctl_ip_vs_expire_quiescent_template = 0;
 int sysctl_ip_vs_sync_threshold[2] = { 3, 50 };
 int sysctl_ip_vs_nat_icmp_send = 0;
-
+extern int sysctl_ip_vs_sync_interval;
 
 #ifdef CONFIG_IP_VS_DEBUG
 static int sysctl_ip_vs_debug_level = 0;
@@ -1582,6 +1582,13 @@ static struct ctl_table vs_vars[] = {
 		.proc_handler	= &proc_do_sync_threshold,
 	},
 	{
+		.procname	= "sync_interval",
+		.data		= &sysctl_ip_vs_sync_interval,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+	{
 		.procname	= "nat_icmp_send",
 		.data		= &sysctl_ip_vs_nat_icmp_send,
 		.maxlen		= sizeof(int),
diff --git a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c
index 948378d..9b57ad3 100644
--- a/net/ipv4/ipvs/ip_vs_sync.c
+++ b/net/ipv4/ipvs/ip_vs_sync.c
@@ -143,6 +143,8 @@ char ip_vs_backup_mcast_ifn[IP_VS_IFNAME_MAXLEN];
 /* multicast addr */
 static struct sockaddr_in mcast_addr;
 
+/* milliseconds between synchronization runs */
+int sysctl_ip_vs_sync_interval = 1000;
 
 static inline void sb_queue_tail(struct ip_vs_sync_buff *sb)
 {
@@ -701,7 +703,7 @@ static void sync_master_loop(void)
 		if (stop_master_sync)
 			break;
 
-		msleep_interruptible(1000);
+		msleep_interruptible(sysctl_ip_vs_sync_interval);
 	}
 
 	/* clean up the sync_buff queue */
@@ -758,7 +760,7 @@ static void sync_backup_loop(void)
 		if (stop_backup_sync)
 			break;
 
-		msleep_interruptible(1000);
+		msleep_interruptible(sysctl_ip_vs_sync_interval);
 	}
 
 	/* release the sending multicast socket */
-- 
1.5.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] ipvs: Make the synchronization interval controllable
  2008-02-06 21:39 [PATCH] ipvs: Make the synchronization interval controllable Sven Wegener
@ 2008-02-06 22:09 ` David Rientjes
  2008-02-06 23:12   ` Sven Wegener
  2008-02-08 14:23 ` Andi Kleen
  1 sibling, 1 reply; 8+ messages in thread
From: David Rientjes @ 2008-02-06 22:09 UTC (permalink / raw)
  To: Sven Wegener; +Cc: netdev, linux-kernel

On Wed, 6 Feb 2008, Sven Wegener wrote:

> diff --git a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c
> index 948378d..9b57ad3 100644
> --- a/net/ipv4/ipvs/ip_vs_sync.c
> +++ b/net/ipv4/ipvs/ip_vs_sync.c
> @@ -143,6 +143,8 @@ char ip_vs_backup_mcast_ifn[IP_VS_IFNAME_MAXLEN];
>  /* multicast addr */
>  static struct sockaddr_in mcast_addr;
>  
> +/* milliseconds between synchronization runs */
> +int sysctl_ip_vs_sync_interval = 1000;
>  
>  static inline void sb_queue_tail(struct ip_vs_sync_buff *sb)
>  {

How useful is a negative ip_vs_sync_interval?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ipvs: Make the synchronization interval controllable
  2008-02-06 22:09 ` David Rientjes
@ 2008-02-06 23:12   ` Sven Wegener
  2008-02-06 23:57     ` Sven Wegener
  2008-02-07  6:54     ` David Rientjes
  0 siblings, 2 replies; 8+ messages in thread
From: Sven Wegener @ 2008-02-06 23:12 UTC (permalink / raw)
  To: David Rientjes; +Cc: netdev, linux-kernel

On Wed, 6 Feb 2008, David Rientjes wrote:

> On Wed, 6 Feb 2008, Sven Wegener wrote:
>
>> diff --git a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c
>> index 948378d..9b57ad3 100644
>> --- a/net/ipv4/ipvs/ip_vs_sync.c
>> +++ b/net/ipv4/ipvs/ip_vs_sync.c
>> @@ -143,6 +143,8 @@ char ip_vs_backup_mcast_ifn[IP_VS_IFNAME_MAXLEN];
>>  /* multicast addr */
>>  static struct sockaddr_in mcast_addr;
>>
>> +/* milliseconds between synchronization runs */
>> +int sysctl_ip_vs_sync_interval = 1000;
>>
>>  static inline void sb_queue_tail(struct ip_vs_sync_buff *sb)
>>  {
>
> How useful is a negative ip_vs_sync_interval?

Negative values will be converted to MAX_JIFFY_OFFSET by msecs_to_jiffies 
and result in a very long interval. A too long interval will be a good way 
to get your system OOM. We could use an unsigned int or even restrict the 
value with proc_dointvec_minmax. I'd prefer the latter, that's what I 
already had in my mind and it also protects from unintentionally choosing 
a too long interval.

Sven

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] ipvs: Make the synchronization interval controllable
  2008-02-06 23:12   ` Sven Wegener
@ 2008-02-06 23:57     ` Sven Wegener
  2008-02-07  4:28       ` Simon Horman
  2008-02-07  6:54     ` David Rientjes
  1 sibling, 1 reply; 8+ messages in thread
From: Sven Wegener @ 2008-02-06 23:57 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

The default synchronization interval of 1000 milliseconds is too high for a
heavily loaded director. Collecting the connection information from one second
and then sending it out in a burst will overflow the socket buffer and lead to
synchronization information being dropped. Make the interval controllable by a
sysctl variable so that users can tune it. We enforce a lower limit of 0 and an
upper limit of 2000 ms on the interval. A too large interval can make the
synchronization buffer consume too much memory and will also delay the exit of
the kernel threads.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
---

Changes from the last version include the addition of the range enforcement.
Also place the definitions of the variables where all other ipvs sysctl
variables are.

Documentation/networking/ipvs-sysctl.txt |   10 ++++++++++
 include/net/ip_vs.h                      |    1 +
 net/ipv4/ipvs/ip_vs_ctl.c                |   12 ++++++++++++
 net/ipv4/ipvs/ip_vs_sync.c               |    4 ++--
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
index 4ccdbca..bb4eb9a 100644
--- a/Documentation/networking/ipvs-sysctl.txt
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -141,3 +141,13 @@ sync_threshold - INTEGER
         synchronized, every time the number of its incoming packets
         modulus 50 equals the threshold. The range of the threshold is
         from 0 to 49.
+
+sync_interval - INTEGER
+	default 1000
+
+	The information from synchronization is buffered and sent out at a
+	regular interval by a kernel thread. The interval (in ms) is
+	controlled by this value. The default is too high for a heavily loaded
+	director. If you get a lot of "ip_vs_send_async error" messages from
+	your kernel, then you should lower this value. The value of the
+	interval can be chosen from the range from 0 to 2000.
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 56f3c94..9c4498b 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -854,6 +854,7 @@ extern int sysctl_ip_vs_cache_bypass;
 extern int sysctl_ip_vs_expire_nodest_conn;
 extern int sysctl_ip_vs_expire_quiescent_template;
 extern int sysctl_ip_vs_sync_threshold[2];
+extern int sysctl_ip_vs_sync_interval;
 extern int sysctl_ip_vs_nat_icmp_send;
 extern struct ip_vs_stats ip_vs_stats;
 extern struct ctl_path net_vs_ctl_path[];
diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c
index 94c5767..c6322f7 100644
--- a/net/ipv4/ipvs/ip_vs_ctl.c
+++ b/net/ipv4/ipvs/ip_vs_ctl.c
@@ -80,8 +80,11 @@ int sysctl_ip_vs_cache_bypass = 0;
 int sysctl_ip_vs_expire_nodest_conn = 0;
 int sysctl_ip_vs_expire_quiescent_template = 0;
 int sysctl_ip_vs_sync_threshold[2] = { 3, 50 };
+int sysctl_ip_vs_sync_interval = 1000;
 int sysctl_ip_vs_nat_icmp_send = 0;
 
+static int ip_vs_sync_interval_min = 0;
+static int ip_vs_sync_interval_max = 2000;
 
 #ifdef CONFIG_IP_VS_DEBUG
 static int sysctl_ip_vs_debug_level = 0;
@@ -1582,6 +1585,15 @@ static struct ctl_table vs_vars[] = {
 		.proc_handler	= &proc_do_sync_threshold,
 	},
 	{
+		.procname	= "sync_interval",
+		.data		= &sysctl_ip_vs_sync_interval,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_minmax,
+		.extra1		= &ip_vs_sync_interval_min,
+		.extra2		= &ip_vs_sync_interval_max,
+	},
+	{
 		.procname	= "nat_icmp_send",
 		.data		= &sysctl_ip_vs_nat_icmp_send,
 		.maxlen		= sizeof(int),
diff --git a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c
index 948378d..10ab1b7 100644
--- a/net/ipv4/ipvs/ip_vs_sync.c
+++ b/net/ipv4/ipvs/ip_vs_sync.c
@@ -701,7 +701,7 @@ static void sync_master_loop(void)
 		if (stop_master_sync)
 			break;
 
-		msleep_interruptible(1000);
+		msleep_interruptible(sysctl_ip_vs_sync_interval);
 	}
 
 	/* clean up the sync_buff queue */
@@ -758,7 +758,7 @@ static void sync_backup_loop(void)
 		if (stop_backup_sync)
 			break;
 
-		msleep_interruptible(1000);
+		msleep_interruptible(sysctl_ip_vs_sync_interval);
 	}
 
 	/* release the sending multicast socket */
-- 
1.5.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] ipvs: Make the synchronization interval controllable
  2008-02-06 23:57     ` Sven Wegener
@ 2008-02-07  4:28       ` Simon Horman
  0 siblings, 0 replies; 8+ messages in thread
From: Simon Horman @ 2008-02-07  4:28 UTC (permalink / raw)
  To: Sven Wegener; +Cc: netdev, linux-kernel

On Thu, Feb 07, 2008 at 12:57:43AM +0100, Sven Wegener wrote:
> The default synchronization interval of 1000 milliseconds is too high for a
> heavily loaded director. Collecting the connection information from one second
> and then sending it out in a burst will overflow the socket buffer and lead to
> synchronization information being dropped. Make the interval controllable by a
> sysctl variable so that users can tune it. We enforce a lower limit of 0 and an
> upper limit of 2000 ms on the interval. A too large interval can make the
> synchronization buffer consume too much memory and will also delay the exit of
> the kernel threads.
> 
> Signed-off-by: Sven Wegener <sven.wegener@stealer.net>

Hi Sven,

I have no problems with the spirit of this patch, and the range checking
seems good to me.  Though I wonder if 2000 might not be an exessively
low maximum.  I can't think of a usage case off-hand, but I'm not sure I
understand why a user shouldn't be able to set it to much (rediculously)
higher values.

In any case, thats not a big deal.

Acked-by: Simon Horman <horms@verge.net.au>

> ---
> 
> Changes from the last version include the addition of the range enforcement.
> Also place the definitions of the variables where all other ipvs sysctl
> variables are.

-- 
Horms


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ipvs: Make the synchronization interval controllable
  2008-02-06 23:12   ` Sven Wegener
  2008-02-06 23:57     ` Sven Wegener
@ 2008-02-07  6:54     ` David Rientjes
  1 sibling, 0 replies; 8+ messages in thread
From: David Rientjes @ 2008-02-07  6:54 UTC (permalink / raw)
  To: Sven Wegener; +Cc: netdev, linux-kernel

On Thu, 7 Feb 2008, Sven Wegener wrote:

> Negative values will be converted to MAX_JIFFY_OFFSET by msecs_to_jiffies and
> result in a very long interval. A too long interval will be a good way to get
> your system OOM. We could use an unsigned int or even restrict the value with
> proc_dointvec_minmax. I'd prefer the latter, that's what I already had in my
> mind and it also protects from unintentionally choosing a too long interval.
> 

Yeah, you're definitely going to want an upper bound on acceptable values 
and not allow anything negative for the aforementioned reason.

		David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ipvs: Make the synchronization interval controllable
  2008-02-06 21:39 [PATCH] ipvs: Make the synchronization interval controllable Sven Wegener
  2008-02-06 22:09 ` David Rientjes
@ 2008-02-08 14:23 ` Andi Kleen
  2008-02-08 15:57   ` Krzysztof Oledzki
  1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2008-02-08 14:23 UTC (permalink / raw)
  To: Sven Wegener; +Cc: netdev, linux-kernel

Sven Wegener <sven.wegener@stealer.net> writes:

> The default synchronization interval of 1000 milliseconds is too high for a
> heavily loaded director. Collecting the connection information from one second
> and then sending it out in a burst will overflow the socket buffer and lead to
> synchronization information being dropped. Make the interval controllable by a
> sysctl variable so that users can tune it.

It would be better if the defaults just worked under all circumstances.
So why not just lower the default?

Or the code could detect overflowing socket buffers and lower the 
value dynamically.

-Andi

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ipvs: Make the synchronization interval controllable
  2008-02-08 14:23 ` Andi Kleen
@ 2008-02-08 15:57   ` Krzysztof Oledzki
  0 siblings, 0 replies; 8+ messages in thread
From: Krzysztof Oledzki @ 2008-02-08 15:57 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Sven Wegener, netdev, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 798 bytes --]



On Fri, 8 Feb 2008, Andi Kleen wrote:

> Sven Wegener <sven.wegener@stealer.net> writes:
>
>> The default synchronization interval of 1000 milliseconds is too high for a
>> heavily loaded director. Collecting the connection information from one second
>> and then sending it out in a burst will overflow the socket buffer and lead to
>> synchronization information being dropped. Make the interval controllable by a
>> sysctl variable so that users can tune it.
>
> It would be better if the defaults just worked under all circumstances.
> So why not just lower the default?
>
> Or the code could detect overflowing socket buffers and lower the
> value dynamically.

We can also start sending when amount of data reaches defined level.

Best regards,

 				Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2008-02-08 15:58 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-06 21:39 [PATCH] ipvs: Make the synchronization interval controllable Sven Wegener
2008-02-06 22:09 ` David Rientjes
2008-02-06 23:12   ` Sven Wegener
2008-02-06 23:57     ` Sven Wegener
2008-02-07  4:28       ` Simon Horman
2008-02-07  6:54     ` David Rientjes
2008-02-08 14:23 ` Andi Kleen
2008-02-08 15:57   ` Krzysztof Oledzki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).