LKML Archive on lore.kernel.org
From: Willy Tarreau <w@1wt.eu>
To: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>,
rkuhn@e18.physik.tu-muenchen.de, andi@firstfloor.org,
dada1@cosmosbay.com, jengelh@linux01.gwdg.de,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] tcp_cubic: use 32 bit math
Date: Wed, 21 Mar 2007 20:15:37 +0100
Message-ID: <20070321191537.GA28914@1wt.eu>
In-Reply-To: <20070321115419.483a655a@freekitty>
Hi Stephen,
On Wed, Mar 21, 2007 at 11:54:19AM -0700, Stephen Hemminger wrote:
> On Tue, 13 Mar 2007 21:50:20 +0100
> Willy Tarreau <w@1wt.eu> wrote:
[...] ( cut my boring part )
> > Here are the results classed by speed :
> >
> > /* Sample output on a Pentium-M 600 MHz :
> >
> > Function     clocks mean(us)  max(us)  std(us) Avg err  size
> > ncubic_tab0      79     0.66     7.20     1.04  0.613%   160
> > ncubic_0div      84     0.70     7.64     1.57  4.521%   192
> > ncubic_1div     178     1.48    16.27     1.81  0.443%   336
> > ncubic_tab1     179     1.49    16.34     1.85  0.195%   320
> > ncubic_ndiv3    263     2.18    24.04     3.59  0.250%   512
> > ncubic_2div     270     2.24    24.70     2.77  0.187%   512
> > ncubic32_1      359     2.98    32.81     3.59  0.238%   544
> > ncubic_3div     361     2.99    33.08     3.79  0.170%   656
> > ncubic32        364     3.02    33.29     3.51  0.247%   544
> > ncubic          529     4.39    48.39     4.92  0.247%   720
> > hcbrt           539     4.47    49.25     5.98  1.580%    96
> > ocubic          732     4.93    61.83     7.22  0.274%   320
> > acbrt           842     6.98    76.73     8.55  0.275%   192
> > bictcp         1032     6.95    86.30     9.04  0.172%   768
> >
[...]
> The following version of div64_64 is faster because do_div already
> optimized for the 32 bit case..
Cool, this is interesting because I first wanted to optimize it but could
not figure out where to start. You seem to get very good results. BTW,
you did not attach your changes.
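
To make the point about the 32 bit case concrete, here is a minimal userspace
sketch of how a 64-by-64 division can be routed through a single 64-by-32
divide. It is only an assumption about the general shape of such a routine,
not the changes referred to above (which were not attached). The names
fls32() and div64_64_sketch() are hypothetical, and a plain C division stands
in for the kernel's do_div().

```c
#include <stdint.h>

/* 1-based position of the highest set bit, 0 if x == 0.
 * Hypothetical stand-in for the kernel's fls(). */
static unsigned int fls32(uint32_t x)
{
    unsigned int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Hypothetical model: divide a 64-bit value by a 64-bit value using
 * only one 64-by-32 division. The caller must pass divisor != 0. */
static uint64_t div64_64_sketch(uint64_t dividend, uint64_t divisor)
{
    uint32_t high = (uint32_t)(divisor >> 32);
    uint32_t d;

    if (high) {
        /* Divisor needs more than 32 bits: scale both operands
         * down until it fits. This loses low-order bits. */
        unsigned int shift = fls32(high);

        d = (uint32_t)(divisor >> shift);
        dividend >>= shift;
    } else {
        /* Fast path: the divisor already fits in 32 bits. */
        d = (uint32_t)divisor;
    }

    return dividend / d; /* models a 64-by-32 do_div() */
}
```

In this sketch the result is exact whenever the divisor fits in 32 bits. For
larger divisors the shift discards low-order bits, so the quotient can differ
slightly from the exact one.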
However, one thing I do not understand is why your avg error is about 1/3
below the original one. Was there a precision bug in the original div64_64,
or did you extend the values used in the test?
Or perhaps you built with -ffast-math and the original cbrt() is less
precise in this case?
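
To illustrate why the choice of reference and of test values matters for an
"Avg err" figure, here is a minimal sketch of one way to measure it. How the
benchmark in this thread actually computes its error column is not shown here,
so everything below is an assumption: cbrt_exact() is a hypothetical integer
reference (instead of libm's cbrt()), cbrt_approx() is a hypothetical
candidate using a single Newton-Raphson step, and avg_rel_error() averages
the relative difference over a caller-chosen range.

```c
#include <stdint.h>

/* Exact floor(cbrt(a)) by binary search; reference only. */
static uint32_t cbrt_exact(uint64_t a)
{
    uint64_t lo = 0, hi = 2642245; /* largest x with x^3 < 2^64 */

    while (lo < hi) {
        uint64_t mid = (lo + hi + 1) / 2;

        if (mid * mid * mid <= a)
            lo = mid;
        else
            hi = mid - 1;
    }
    return (uint32_t)lo;
}

/* Hypothetical candidate: power-of-two guess plus one Newton step. */
static uint32_t cbrt_approx(uint64_t a)
{
    unsigned int bits = 0;
    uint64_t t = a, x;

    if (a == 0)
        return 0;
    while (t) {
        bits++;
        t >>= 1;
    }
    x = 1ULL << ((bits + 2) / 3);      /* 2^ceil(bits/3) */
    x = (2 * x + a / (x * x)) / 3;     /* one Newton-Raphson step */
    return (uint32_t)x;
}

/* Mean of |fn(a) - exact| / exact over count samples starting at
 * first (must be >= 1 so the reference is never zero). */
static double avg_rel_error(uint32_t (*fn)(uint64_t), uint64_t first,
                            uint64_t step, unsigned int count)
{
    double sum = 0.0;
    unsigned int i;

    for (i = 0; i < count; i++) {
        uint64_t a = first + (uint64_t)i * step;
        uint32_t ref = cbrt_exact(a);
        uint32_t got = fn(a);
        double diff = got > ref ? (double)(got - ref)
                                : (double)(ref - got);

        sum += diff / ref;
    }
    return count ? sum / count : 0.0;
}
```

Calling avg_rel_error(cbrt_approx, first, step, count) with different ranges
gives different averages for the same function, which is one way two runs of
a benchmark could disagree on the error without any change in the code under
test.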
> I get the following results on ULV Core Solo (ie slow current processor)
> and the following on 64bit Core Duo. ncubic_tab1 seems like
> the best (no additional error and about as fast)
OK. It was the one I preferred too unless tab0's avg error was acceptable.
> ULV Core Solo
>
> Function     clocks mean(us)  max(us)  std(us) Avg err  size
> ncubic_tab0     192    11.24    45.10    15.28  0.450% -2262
> ncubic_0div     201    11.77    47.23    27.40  3.357% -2404
> ncubic_1div     324    19.02    76.32    25.82  0.189% -2567
> ncubic_tab1     326    19.13    76.73    23.71  0.043% -2059
> ncubic_2div     456    26.72   108.92   493.16  0.028% -2790
> ncubic_ndiv3    463    27.15   133.37  1889.39  0.104% -3344
> ncubic32        549    32.18   130.59   508.97  0.041% -3794
> ncubic32_1      574    33.66   138.32   548.48  0.029% -3604
> ncubic_3div     581    34.04   140.24   608.55  0.018% -3050
> ncubic          733    42.92   173.35   523.19  0.041%   299
> ocubic         1046    61.25   283.68  3305.65  0.027% -2232
> acbrt          1149    67.32   284.91  1941.55  0.029%   168
> bictcp         1663    97.41   394.29   604.86  0.017%   628
>
> Core 2 Duo
>
> Function     clocks mean(us)  max(us)  std(us) Avg err  size
> ncubic_0div      74     0.03     1.60     0.07  3.357% -2101
> ncubic_tab0      74     0.03     1.60     0.04  0.450% -2029
> ncubic_1div     142     0.07     3.11     1.05  0.189% -2195
> ncubic_tab1     144     0.07     3.18     1.02  0.043% -1638
> ncubic_2div     216     0.10     4.74     1.07  0.028% -2326
> ncubic_ndiv3    219     0.10     4.76     1.04  0.104% -2709
> ncubic32        269     0.13     5.87     1.13  0.041% -1500
> ncubic32_1      272     0.13     5.92     1.10  0.029% -2881
> ncubic          273     0.13     5.96     1.13  0.041% -1763
> ncubic_3div     290     0.14     6.32     1.01  0.018% -2499
> acbrt           430     0.20     9.42     1.18  0.029%    77
> ocubic          444     0.21     9.82     1.82  0.027% -1924
> bictcp          549     0.26    12.06     1.68  0.017%   236
Thanks,
Willy