LKML Archive on
From: Julian Anastasov <>
To: Chris Caputo <>
Cc: Wensong Zhang <>,
	Simon Horman <>
Subject: Re: [PATCH 1/3] IPVS: add wlib & wlip schedulers
Date: Tue, 27 Jan 2015 10:36:34 +0200 (EET)
Message-ID: <>
In-Reply-To: <>


On Fri, 23 Jan 2015, Julian Anastasov wrote:

> On Tue, 20 Jan 2015, Chris Caputo wrote:
> > My application consists of incoming TCP streams being load balanced to 
> > servers which receive the feeds. These are long lived multi-gigabyte 
> > streams, and so I believe the estimator's 2-second timer is fine. As an 
> > example:
> > 
> > # cat /proc/net/ip_vs_stats
> >    Total Incoming Outgoing         Incoming         Outgoing
> >    Conns  Packets  Packets            Bytes            Bytes
> >      9AB  58B7C17        0      1237CA2C325                0
> > 
> >  Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
> >        1     387C        0          B16C4AE                0
> 	Not sure, may be everything here should be u64 because
> we have shifted values. I'll need some days to investigate
> this issue...

	For now I don't see much hope for schedulers that rely
on the IPVS byte/packet stats, because the estimator updates them
slowly (every 2 seconds). If we reduce this period we can cause
performance problems for other users.
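
	To illustrate the lag, here is a rough user-space model of a
periodic rate estimator. It is only a sketch: the names (est_model,
est_model_tick) and the 1/4 smoothing factor are assumptions made for
this example, not a copy of the kernel estimator code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model, not the kernel estimator itself. */
#define EST_PERIOD_SEC 2	/* update period discussed above */
#define EST_DIV        4	/* assumed smoothing: move 1/4 of the way per tick */

struct est_model {
	uint64_t last_bytes;	/* cumulative byte counter seen at the previous tick */
	uint64_t bps;		/* smoothed bytes/s estimate */
};

/* Called once per period with the current cumulative byte counter. */
static void est_model_tick(struct est_model *e, uint64_t total_bytes)
{
	int64_t rate, delta;

	assert(total_bytes >= e->last_bytes);	/* counters only grow */

	/* Average rate over the last period. */
	rate = (int64_t)((total_bytes - e->last_bytes) / EST_PERIOD_SEC);
	delta = rate - (int64_t)e->bps;

	e->last_bytes = total_bytes;
	/* Move the estimate only part of the way towards the new sample. */
	e->bps = (uint64_t)((int64_t)e->bps + delta / EST_DIV);
}
```

	In this model a server that jumps from idle to a steady
1000 bytes/s is reported at 250 bytes/s after the first tick and at
437 bytes/s after the second, so a scheduler reading the estimate
keeps treating it as lightly loaded for several seconds.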

	Every *-LEAST-* algorithm (e.g. LC, WLC) needs current
information to make a decision for every new connection. OTOH, all
*-ROUND-ROBIN-* algorithms (RR, WRR) use information (weights)
from user space, so the kernel performs as expected.

	Currently, LC/WLC use feedback from the 3-way TCP handshake,
see ip_vs_dest_conn_overhead(), where established connections
carry much more weight. Such feedback from real servers is
usually delayed by microseconds, up to milliseconds, and by more
if it depends on the clients.
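
	A simplified user-space model of that overhead calculation
could look like the sketch below. The struct and function names are
made up for the example, and the factor of 256 for established
connections is written from memory, so treat it as an assumption and
check the helper in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the connection counters of one real server. */
struct dest_model {
	int activeconns;	/* established connections */
	int inactconns;		/* connections that are not established */
};

/* Established connections dominate: assumed weight of 256 (shift by 8). */
static int64_t dest_model_overhead(const struct dest_model *d)
{
	assert(d->activeconns >= 0 && d->inactconns >= 0);
	return ((int64_t)d->activeconns << 8) + d->inactconns;
}

/* LC-style pick: index of the server with the lowest overhead, -1 if none. */
static int pick_least_conn(const struct dest_model *dests, int n)
{
	int i, best = -1;

	for (i = 0; i < n; i++) {
		if (best < 0 ||
		    dest_model_overhead(&dests[i]) <
		    dest_model_overhead(&dests[best]))
			best = i;
	}
	return best;
}
```

	The counters change as soon as the handshake completes, which
is why this kind of feedback arrives so much faster than anything
derived from the estimator.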

	The proposed schedulers have a round-robin function, but
only among the least loaded servers, so it is not dominant
and we still suffer from the slow feedback from the estimator.

	For load information that is not present in the kernel,
a user space daemon is needed to determine the weights to use
with WRR. It can take actual stats from the real servers; for
example, it can take non-IPVS traffic into account.
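
	One possible way for such a daemon to turn a measured load
into a WRR weight is sketched below. This is only an illustration:
weight_from_load(), MAX_WEIGHT and the linear mapping are all
assumptions of this example, and how the load and capacity figures are
collected from the real servers is left open.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_WEIGHT 100	/* hypothetical upper bound chosen for this sketch */

/*
 * Map the load reported by a real server to a WRR weight:
 * an idle server gets MAX_WEIGHT, a saturated one gets 1.
 * The weight never drops to 0, so the server stays schedulable.
 */
static int weight_from_load(uint64_t load_bps, uint64_t capacity_bps)
{
	uint64_t free_bps;

	assert(capacity_bps > 0);

	if (load_bps >= capacity_bps)
		return 1;

	free_bps = capacity_bps - load_bps;
	/* Scale the free capacity linearly into 1..MAX_WEIGHT. */
	return (int)(1 + free_bps * (MAX_WEIGHT - 1) / capacity_bps);
}
```

	The daemon would then apply the result to the real server,
for example with ipvsadm -e ... -w <weight>, at whatever interval
suits the traffic.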

	As an alternative, it is possible to implement some new svc
method that can be called for every packet, for example, in
ip_vs_in_stats(). It does not look fatal to add some fields to
struct ip_vs_dest that only specific schedulers will update,
for example, byte/packet counters. Of course, the spin_locks
the scheduler must use will suffer on many CPUs. Such info can
also be attached as an allocated structure via an RCU pointer,
dest->sched_info, where data and corresponding methods can be
stored. It will need careful RCU-style updates, especially when
the scheduler is changed in the svc. If you think such an idea can
work, we can discuss the RCU and scheduler changes that are needed.
The proposed schedulers would have to implement the counters, their
own estimator and a WRR function.
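
	To make the per-packet accounting idea a bit more concrete,
here is a small user-space sketch. All names in it (sched_info_model,
sched_info_account, sched_info_bytes) are hypothetical, and it uses
C11 atomics instead of the spin_locks mentioned above only to keep the
example short; it says nothing about the RCU handling of the pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical per-destination scheduler data, standing in for what
 * could be attached via dest->sched_info.
 */
struct sched_info_model {
	atomic_uint_fast64_t bytes;	/* bytes seen for this destination */
	atomic_uint_fast64_t pkts;	/* packets seen for this destination */
};

/* Per-packet hook: what a new svc method might do for each packet. */
static void sched_info_account(struct sched_info_model *si, unsigned int len)
{
	assert(len > 0);
	atomic_fetch_add_explicit(&si->bytes, len, memory_order_relaxed);
	atomic_fetch_add_explicit(&si->pkts, 1, memory_order_relaxed);
}

/* Scheduler side: read the current byte count when picking a server. */
static uint64_t sched_info_bytes(struct sched_info_model *si)
{
	return atomic_load_explicit(&si->bytes, memory_order_relaxed);
}
```

	Even without locks, every packet still writes to a shared
counter here, so the cost on many CPUs remains the main concern with
this approach.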

	Another variant can be to extend WRR with some
support for automatic dynamic-weight updates depending on
parameters: -s wrr --sched-flags {wlip,wlib,...}

	or to use a new option --sched-param that can also
provide info for the wrr estimator, etc. In any case, the
extended WRR scheduler will need the above support to check
every packet.


Julian Anastasov <>
