LKML Archive on lore.kernel.org
From: Chris Caputo <ccaputo@alt.net>
To: Julian Anastasov <ja@ssi.bg>
Cc: Wensong Zhang <wensong@linux-vs.org>,
Simon Horman <horms@verge.net.au>,
lvs-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] IPVS: add wlib & wlip schedulers
Date: Fri, 23 Jan 2015 04:16:47 +0000 (UTC)
Message-ID: <Pine.LNX.4.64.1501230410450.8217@nacho.alt.net>
In-Reply-To: <alpine.LFD.2.11.1501222308060.2572@ja.home.ssi.bg>
On Fri, 23 Jan 2015, Julian Anastasov wrote:
> Hello,
>
> On Tue, 20 Jan 2015, Chris Caputo wrote:
> > My application consists of incoming TCP streams being load balanced to
> > servers which receive the feeds. These are long lived multi-gigabyte
> > streams, and so I believe the estimator's 2-second timer is fine. As an
> > example:
> >
> > # cat /proc/net/ip_vs_stats
> >    Total Incoming Outgoing         Incoming         Outgoing
> >    Conns  Packets  Packets            Bytes            Bytes
> >      9AB  58B7C17        0      1237CA2C325                0
> >
> >  Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
> >        1     387C        0          B16C4AE                0
>
> All other schedulers react and see different
> picture after every new connection. The worst example
> is WLC where slow-start mechanism is desired because
> idle server can be overloaded before the load is noticed
> properly. Even WRR accounts every connection in its state.
>
> Your setup may expect low number of connections per
> second but for other kind of setups sending all connections
> to same server for 2 seconds looks scary. In fact, what
> changes is the position, so we rotate only among the
> least loaded servers that look equally loaded but it is
> one server in the common case. And as our stats are per
> CPU and designed for human reading, it is difficult to
> read them often for other purposes. We need a good idea
> to solve this problem, so that we can have faster feedback
> after every scheduling.
This is exactly why my wlib/wlip code is a hybrid of wlc and rr. The last
location is saved, and the search starts after it. Thus when traffic is
zero, round-robin occurs. When flows already exist, bursts of new
connections are placed poorly because the same last estimate is reused,
but working around that seems complex.
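
To illustrate the hybrid described above, here is a minimal user-space
sketch, not the patch code itself. The struct dest layout and the
pick_dest() name are hypothetical, and the rate field stands in for
whatever the estimator last reported (bytes/s for wlib, packets/s for
wlip). It assumes rate * weight fits in 64 bits.

```c
#include <stdint.h>

/* Hypothetical destination record; names are illustrative only. */
struct dest {
	uint64_t rate;   /* last estimated incoming rate for this server */
	uint32_t weight; /* 0 means the server is not eligible */
};

/*
 * Return the index of the destination with the lowest rate/weight
 * ratio. The scan starts just after "last" (the previous pick, or -1
 * on first use), and a later candidate replaces the current best only
 * if it is strictly better, so equal ratios rotate round-robin.
 * Returns -1 if no destination has a positive weight.
 */
static int pick_dest(const struct dest *d, int n, int last)
{
	int best = -1;

	for (int i = 1; i <= n; i++) {
		int idx = (last + i) % n;

		if (d[idx].weight == 0)
			continue;
		/* rate[idx]/weight[idx] < rate[best]/weight[best],
		 * cross-multiplied to avoid division (overflow is
		 * ignored in this sketch). */
		if (best < 0 ||
		    d[idx].rate * d[best].weight <
		    d[best].rate * d[idx].weight)
			best = idx;
	}
	return best;
}
```

With three equally weighted servers and all rates at zero, successive
calls that feed the previous result back in as "last" return 0, 1, 2, 0.
If servers 0 and 2 then show nonzero rates while server 1 still shows
zero, every call returns 1 until the rates are refreshed, which is the
burst behaviour mentioned above.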
> > > May be not so useful idea: use sum of both directions
> > > or control it with svc->flags & IP_VS_SVC_F_SCHED_WLIB_xxx
> > > flags, see how "sh" scheduler supports flags. I.e.
> > > inbps + outbps.
> >
> > I see a user-mode option as increasing complexity. For example,
> > keepalived users would need to have keepalived patched to support the new
> > algorithm, due to flags, rather than just configuring "wlib" or "wlip" and
> > it just working.
>
> That is also true.
>
> > I think I'd rather see a wlob/wlop version for users that want to
> > load-balance based on outgoing bytes/packets, and a wlb/wlp version for
> > users that want them summed.
>
> ok
>
> > From: Chris Caputo <ccaputo@alt.net>
> >
> > IPVS: Change inbps and outbps to 64-bits so that estimator handles faster
> > flows. Also increases maximum viewable at user level from ~2.15Gbits/s to
> > ~34.35Gbits/s.
>
> Yep, we are limited from u32 in user space structs.
> I have to think how to solve this problem.
>
> 1gbit => ~1.5 million pps
> 10gbit => ~15 million pps
> 100gbit => ~150 million pps
>
> > Signed-off-by: Chris Caputo <ccaputo@alt.net>
> > ---
> > diff -uprN linux-3.19-rc5-stock/include/net/ip_vs.h linux-3.19-rc5/include/net/ip_vs.h
> > --- linux-3.19-rc5-stock/include/net/ip_vs.h 2015-01-18 06:02:20.000000000 +0000
> > +++ linux-3.19-rc5/include/net/ip_vs.h 2015-01-20 08:01:15.548177969 +0000
> > @@ -390,8 +390,8 @@ struct ip_vs_estimator {
> >  	u32			cps;
> >  	u32			inpps;
> >  	u32			outpps;
> > -	u32			inbps;
> > -	u32			outbps;
> > +	u64			inbps;
> > +	u64			outbps;
>
> Not sure, may be everything here should be u64 because
> we have shifted values. I'll need some days to investigate
> this issue...
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
Sounds good and thanks!
Chris
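
As a rough check of two figures quoted in this message, the sketch below
recomputes the u32 bytes/s ceiling and the packets-per-second numbers.
The helper names are hypothetical. The pps calculation assumes
minimum-size Ethernet frames (64-byte frame plus 8-byte preamble and
12-byte inter-frame gap, 84 bytes on the wire), which is an assumption
about how the quoted numbers were derived, not something stated in the
thread.

```c
#include <stdint.h>

/* Largest rate, in Gbit/s, that fits when bytes/s is held in a u32. */
static double u32_bytes_per_sec_in_gbits(void)
{
	return (double)UINT32_MAX * 8.0 / 1e9;
}

/*
 * Packets per second at line rate, assuming minimum-size Ethernet
 * frames: 64 + 8 + 12 = 84 bytes (672 bits) per packet on the wire.
 */
static uint64_t min_frame_pps(uint64_t link_bits_per_sec)
{
	return link_bits_per_sec / (84 * 8);
}
```

Under those assumptions the u32 ceiling comes out at about 34.36 Gbit/s,
in line with the ~34.35Gbits/s figure, and min_frame_pps() gives roughly
1.49, 14.9 and 149 million pps for 1, 10 and 100 Gbit/s links.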