From: Willy Tarreau <w@1wt.eu>
To: David Miller <davem@davemloft.net>
Cc: shemminger@linux-foundation.org, rkuhn@e18.physik.tu-muenchen.de,
	andi@firstfloor.org, dada1@cosmosbay.com,
	jengelh@linux01.gwdg.de, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
Subject: Re: [PATCH] tcp_cubic: use 32 bit math
Date: Sat, 10 Mar 2007 12:48:26 +0100
Message-ID: <20070310114826.GB1608@1wt.eu>
In-Reply-To: <20070307.195135.74749102.davem@davemloft.net>

On Wed, Mar 07, 2007 at 07:51:35PM -0800, David Miller wrote:
> From: Stephen Hemminger <shemminger@linux-foundation.org>
> Date: Wed, 07 Mar 2007 19:10:47 -0800
> 
> > David Miller wrote:
> > > What about Willy Tarreau's supposedly even faster variant?
> > > Or does this incorporate that set of improvements?
> > >   
> > That's what this is:
> >     x = (2 * x + (uint32_t)div64_64(a, (uint64_t)x*(uint64_t)x)) / 3;
> 
> Great, thanks for the clarification.

Oh BTW, I have a newer version which computes a first approximation of
cbrt() before the div64_64, which allows us to reduce from 3 div64
to only 2 div64. This results in a version which is about twice as fast
as the initial one (ncubic), but with slightly less accuracy (0.286%
avg error compared to 0.247%). But I see that other functions such as
hcbrt() had a 1.5% avg error, so I think this is not dramatic.
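
For readers following along, here is a minimal sketch of the idea: take a
cheap first approximation from the position of the highest set bit, then
refine it with the Newton-Raphson step quoted above,
x = (2 * x + a / (x * x)) / 3. This is not the ncubic32b3 code from the
benchmark. The names fls64_sketch() and cbrt_sketch() are made up for
illustration, plain C division stands in for div64_64(), and the loop runs
until the estimate stops decreasing (giving the exact integer cube root)
instead of using a fixed count of two divides.

```c
#include <stdint.h>

/* 1-based position of the highest set bit; returns 0 for v == 0.
 * Hypothetical stand-in for the kernel's fls() helpers. */
static int fls64_sketch(uint64_t v)
{
    int n = 0;

    while (v) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Integer cube root: returns floor(cbrt(a)). */
static uint32_t cbrt_sketch(uint64_t a)
{
    uint64_t x, y;
    int b;

    if (a == 0)
        return 0;

    /* First approximation: a < 2^b, so 2^ceil(b/3) is >= cbrt(a). */
    b = fls64_sketch(a);
    x = (uint64_t)1 << ((b + 2) / 3);

    /* Newton-Raphson step; starting above the root, the estimate
     * decreases until it reaches floor(cbrt(a)). */
    for (;;) {
        y = (2 * x + a / (x * x)) / 3;
        if (y >= x)
            break;
        x = y;
    }
    return (uint32_t)x;
}
```

A better first approximation means fewer refinement steps, and each step
costs one 64-bit divide, which is where the saving comes from.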

Also, I managed to remove all other divides, to be kind to CPUs
having a slow divide instruction or no divide at all. Since we compute
on a limited range (22 bits), we can multiply then shift right. It shows
even slightly better times on Pentium-M and Athlon, with a slightly
higher avg error (0.297% compared to 0.286%), and slightly smaller
code.
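
To illustrate the multiply-then-shift trick, here is a small sketch of a
divide by 3 done without a divide instruction. The constants are my own
choice for the example and are not taken from the patch: 341/1024 is a bit
below 1/3, so the 32-bit version is approximate (which fits with the
slightly higher avg error), while the 64-bit version with 0xAAAAAAAB is
exact for any 32-bit input. The function names are hypothetical.

```c
#include <stdint.h>

/* Approximate x / 3 using only 32-bit math: (x * 341) >> 10.
 * Requires x < 2^22 so that x * 341 cannot overflow 32 bits.
 * The result is slightly low, because 341/1024 < 1/3. */
static uint32_t div3_approx(uint32_t x)
{
    return (x * 341u) >> 10;
}

/* Exact x / 3 for any 32-bit x, at the cost of a 64-bit multiply:
 * 0xAAAAAAAB is (2^33 + 1) / 3. */
static uint32_t div3_exact(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABull) >> 33);
}
```

The 22-bit limit on the input is what leaves enough headroom for the
multiply to stay within 32 bits.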

I just have to clean the experiments out of my code to provide a patch.
David, Stephen, are you interested?

$ ./bictcp
fls(0)=0, fls(1)=1, fls(256)=9
Calibrating
Function     clocks  mean(us) max(us)  std(us)  Avg error
bictcp          936     0.61    24.28     1.99 0.172%
ocubic          886     0.57    23.51     3.18 0.274%
ncubic          644     0.42    16.59     2.18 0.247%
ncubic32        444     0.29    11.47     1.50 0.247%
ncubic32_1      444     0.29    11.56     1.88 0.238%
ncubic32b3      337     0.22     8.67     0.88 0.286%
ncubic_ndiv3    329     0.21     8.46     0.69 0.297%
acbrt           707     0.46    18.05     0.80 0.275%
hcbrt           644     0.42    16.44     0.51 1.580%


Regards,
Willy

