LKML Archive on lore.kernel.org
From: Willy Tarreau <w@1wt.eu>
To: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>,
	rkuhn@e18.physik.tu-muenchen.de, andi@firstfloor.org,
	dada1@cosmosbay.com, jengelh@linux01.gwdg.de,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] tcp_cubic: use 32 bit math
Date: Wed, 21 Mar 2007 20:15:37 +0100
Message-ID: <20070321191537.GA28914@1wt.eu>
In-Reply-To: <20070321115419.483a655a@freekitty>

Hi Stephen,

On Wed, Mar 21, 2007 at 11:54:19AM -0700, Stephen Hemminger wrote:
> On Tue, 13 Mar 2007 21:50:20 +0100
> Willy Tarreau <w@1wt.eu> wrote:

[...] ( cut my boring part )

> > Here are the results classed by speed :
> > 
> > /* Sample output on a Pentium-M 600 MHz :
> > 
> > Function          clocks mean(us)  max(us)  std(us) Avg err size
> > ncubic_tab0           79     0.66     7.20     1.04  0.613%  160
> > ncubic_0div           84     0.70     7.64     1.57  4.521%  192
> > ncubic_1div          178     1.48    16.27     1.81  0.443%  336
> > ncubic_tab1          179     1.49    16.34     1.85  0.195%  320
> > ncubic_ndiv3         263     2.18    24.04     3.59  0.250%  512
> > ncubic_2div          270     2.24    24.70     2.77  0.187%  512
> > ncubic32_1           359     2.98    32.81     3.59  0.238%  544
> > ncubic_3div          361     2.99    33.08     3.79  0.170%  656
> > ncubic32             364     3.02    33.29     3.51  0.247%  544
> > ncubic               529     4.39    48.39     4.92  0.247%  720
> > hcbrt                539     4.47    49.25     5.98  1.580%   96
> > ocubic               732     4.93    61.83     7.22  0.274%  320
> > acbrt                842     6.98    76.73     8.55  0.275%  192
> > bictcp              1032     6.95    86.30     9.04  0.172%  768
> > 

[...]

> The following version of div64_64 is faster because do_div already
> optimized for the 32 bit case..

Cool, this is interesting because I had wanted to optimize it myself but
could not see where to start. You seem to get very good results. BTW,
you did not attach your changes.
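
For readers without the patch at hand, the idea in the quoted remark can be
sketched in user space roughly as follows. This is only an illustration under
stated assumptions, not Stephen's actual change, which is not included in this
message. The names div64_64_sketch() and fls32() are hypothetical, and a plain
C division by a 32-bit divisor stands in for the kernel's do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1-based position of the highest set bit, 0 if x == 0. */
static unsigned int fls32(uint32_t x)
{
    unsigned int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/*
 * Sketch of a 64/64 division that only ever divides by a 32-bit value.
 * If the divisor fits in 32 bits, the result is exact. Otherwise both
 * operands are shifted right by the same amount until the divisor fits,
 * so the result of this sketch is an approximation in that case.
 */
static uint64_t div64_64_sketch(uint64_t dividend, uint64_t divisor)
{
    uint32_t high = (uint32_t)(divisor >> 32);
    uint32_t d;

    assert(divisor != 0);

    if (high) {
        unsigned int shift = fls32(high);   /* 1..32 */

        d = (uint32_t)(divisor >> shift);
        dividend >>= shift;
    } else {
        d = (uint32_t)divisor;
    }

    /* Stand-in for do_div(): 64-bit dividend, 32-bit divisor. */
    return dividend / d;
}
```

The point of the shape above is that the common case (a divisor below 2^32)
goes straight to a single 64-by-32 divide, which is the case a 32-bit CPU
handles cheaply.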

However, one thing I do not understand is why your avg error is about 1/3
below the original one. Was there a precision bug in the original div64_64,
or did you extend the range of values used in the test?

Or perhaps you built with -ffast-math and the original cbrt() is less
precise in this case?
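
One way to take the reference function out of the picture when comparing
"Avg err" figures is to measure against an exact integer cube root rather
than a floating-point cbrt(). The sketch below is an assumption about how
such a check could look, not the benchmark code from this thread: all three
function names are hypothetical, and icbrt_newton() is a simple stand-in for
whichever approximation is being tested.

```c
#include <assert.h>
#include <stdint.h>

/* Exact floor(cbrt(a)) by binary search; slow, but independent of libm. */
static uint32_t icbrt_exact(uint64_t a)
{
    uint64_t lo = 0, hi = 2642245;  /* largest value whose cube fits in 64 bits */

    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;

        if (mid * mid * mid <= a)
            lo = mid;
        else
            hi = mid - 1;
    }
    return (uint32_t)lo;
}

/* Stand-in approximation: power-of-two first guess, then Newton steps. */
static uint32_t icbrt_newton(uint64_t a, int iterations)
{
    unsigned int bits = 0;
    uint64_t x;

    if (a == 0)
        return 0;
    for (uint64_t t = a; t; t >>= 1)
        bits++;
    x = 1ULL << ((bits + 2) / 3);   /* first guess, not below cbrt(a) */
    for (int i = 0; i < iterations; i++)
        x = (2 * x + a / (x * x)) / 3;
    return (uint32_t)x;
}

/*
 * Mean of |approx - exact| / exact over first, first + step, ... up to last.
 * The caller must keep last + step below 2^64.
 */
static double avg_rel_error(uint64_t first, uint64_t last, uint64_t step,
                            int iterations)
{
    double sum = 0.0;
    uint64_t n = 0;

    assert(first >= 1 && step >= 1);
    for (uint64_t a = first; a <= last; a += step) {
        uint32_t ref = icbrt_exact(a);
        uint32_t got = icbrt_newton(a, iterations);

        sum += (double)(got > ref ? got - ref : ref - got) / ref;
        n++;
    }
    return n ? sum / (double)n : 0.0;
}
```

With a reference like this, a change in the reported average error can only
come from the function under test or from the set of input values, not from
how the compiler flags affect cbrt().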

> I get the following results on ULV Core Solo (ie slow current processor)
> and the following on 64bit Core Duo. ncubic_tab1 seems like
> the best (no additional error and about as fast)

OK. It was the one I preferred too unless tab0's avg error was acceptable.

> ULV Core Solo
> 
> Function          clocks mean(us)  max(us)  std(us) Avg err size
> ncubic_tab0          192    11.24    45.10    15.28  0.450% -2262
> ncubic_0div          201    11.77    47.23    27.40  3.357% -2404
> ncubic_1div          324    19.02    76.32    25.82  0.189% -2567
> ncubic_tab1          326    19.13    76.73    23.71  0.043% -2059
> ncubic_2div          456    26.72   108.92   493.16  0.028% -2790
> ncubic_ndiv3         463    27.15   133.37  1889.39  0.104% -3344
> ncubic32             549    32.18   130.59   508.97  0.041% -3794
> ncubic32_1           574    33.66   138.32   548.48  0.029% -3604
> ncubic_3div          581    34.04   140.24   608.55  0.018% -3050
> ncubic               733    42.92   173.35   523.19  0.041%  299
> ocubic              1046    61.25   283.68  3305.65  0.027% -2232
> acbrt               1149    67.32   284.91  1941.55  0.029%  168
> bictcp              1663    97.41   394.29   604.86  0.017%  628
> 
> Core 2 Duo
> 
> Function          clocks mean(us)  max(us)  std(us) Avg err size
> ncubic_0div           74     0.03     1.60     0.07  3.357% -2101
> ncubic_tab0           74     0.03     1.60     0.04  0.450% -2029
> ncubic_1div          142     0.07     3.11     1.05  0.189% -2195
> ncubic_tab1          144     0.07     3.18     1.02  0.043% -1638
> ncubic_2div          216     0.10     4.74     1.07  0.028% -2326
> ncubic_ndiv3         219     0.10     4.76     1.04  0.104% -2709
> ncubic32             269     0.13     5.87     1.13  0.041% -1500
> ncubic32_1           272     0.13     5.92     1.10  0.029% -2881
> ncubic               273     0.13     5.96     1.13  0.041% -1763
> ncubic_3div          290     0.14     6.32     1.01  0.018% -2499
> acbrt                430     0.20     9.42     1.18  0.029%   77
> ocubic               444     0.21     9.82     1.82  0.027% -1924
> bictcp               549     0.26    12.06     1.68  0.017%  236

Thanks,
Willy

