LKML Archive on lore.kernel.org
From: "Alexander van Heukelum" <heukelum@fastmail.fm>
To: "Ingo Molnar" <mingo@elte.hu>,
	"Alexander van Heukelum" <heukelum@mailshack.com>
Cc: "Thomas Gleixner" <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"LKML" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: Change x86 to use generic find_next_bit
Date: Sun, 09 Mar 2008 22:13:15 +0100
Message-ID: <1205097195.13205.1241421773@webmail.messagingengine.com>
In-Reply-To: <20080309201016.GA28454@elte.hu>

On Sun, 9 Mar 2008 21:10:16 +0100, "Ingo Molnar" <mingo@elte.hu> said:
> > 		Athlon		Xeon		Opteron 32/64bit
> > x86-specific:	0m3.692s	0m2.820s	0m3.196s / 0m2.480s
> > generic:	0m2.622s	0m1.662s	0m2.100s / 0m1.572s
> 
> ok, that's rather convincing.
> 
> the generic version in lib/find_next_bit.c is open-coded C which gcc can 
> optimize pretty nicely.
> 
> the hand-coded assembly versions in arch/x86/lib/bitops_32.c mostly use 
> the special x86 'bit search forward' (BSF) instruction - which i know 
> from the days when the scheduler relied on it has some non-trivial setup 
> costs. So especially when there's _small_ bitmasks involved, it's more 
> expensive.

Hi,

BSF is fine; it doesn't need any special setup. The problem is probably
that the old versions use find_first_bit and find_first_zero_bit,
which are also hand-optimized versions... and they use "repe scasl/q".
That's another little project ;).
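
For illustration only, here is a minimal sketch of a word-at-a-time
search for the first set bit in plain C. It is not the kernel's code:
the name sketch_find_first_bit and the SKETCH_BITS_PER_LONG macro are
hypothetical, and it assumes the GCC/Clang builtin __builtin_ctzl,
which on x86 is typically compiled to BSF (or TZCNT). The loop skips
zero words with an ordinary compare instead of "repe scas".

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper, not the kernel's find_first_bit.
 * Returns the index of the first set bit, or 'size' if none is set
 * within the first 'size' bits.
 */
static unsigned long sketch_find_first_bit(const unsigned long *addr,
                                           unsigned long size)
{
    /* Number of words needed to hold 'size' bits, rounded up. */
    unsigned long words =
        (size + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
    unsigned long i;

    for (i = 0; i < words; i++) {
        if (addr[i] != 0) {
            /* Count trailing zeros: index of the lowest set bit. */
            unsigned long bit = i * SKETCH_BITS_PER_LONG +
                                (unsigned long)__builtin_ctzl(addr[i]);

            /* Bits at or beyond 'size' do not count as found. */
            return bit < size ? bit : size;
        }
    }
    return size;
}
```

Whether this beats the string-instruction version on a given CPU is
something only a measurement can tell.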

> > If the bitmap size is not a multiple of BITS_PER_LONG, and no set 
> > (cleared) bit is found, find_next_bit (find_next_zero_bit) returns a 
> > value outside of the range [0,size]. The generic version always 
> > returns exactly size. The generic version also uses unsigned long 
> > everywhere, while the x86 versions use a mishmash of int, unsigned 
> > (int), long and unsigned long.
> 
> i'm not surprised that the hand-coded assembly versions had a bug ...

Not surprised about the bug, but it was in fact noticed, and fixed
in x86_64!
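
To make the return-value point concrete, here is a rough sketch of a
find_next_bit that always returns exactly size when nothing is found,
including when size is not a multiple of BITS_PER_LONG. It is a
simplified illustration, not a copy of lib/find_next_bit.c: the names
sketch_find_next_bit and SKETCH_BITS_PER_LONG are hypothetical, and
__builtin_ctzl is a GCC/Clang builtin assumed to be available.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical illustration of the "return exactly size" contract.
 * Uses unsigned long for every index, as the generic version does.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    unsigned long word, idx, bit;

    if (offset >= size)
        return size;

    idx = offset / SKETCH_BITS_PER_LONG;
    /* Ignore the bits below 'offset' in the first word. */
    word = addr[idx] & (~0UL << (offset % SKETCH_BITS_PER_LONG));

    while (word == 0) {
        idx++;
        if (idx * SKETCH_BITS_PER_LONG >= size)
            return size;
        word = addr[idx];
    }

    bit = idx * SKETCH_BITS_PER_LONG + (unsigned long)__builtin_ctzl(word);

    /* A set bit in the tail of the last word, past 'size', is a miss. */
    return bit < size ? bit : size;
}
```

The final clamp is what keeps stray bits in a partial last word from
producing a result larger than size.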

> [ this means we have to test it quite carefully though, as lots of code 
>   only ever gets tested on x86 so code could have built dependency on 
>   the buggy behavior. ]

Agreed.

> > Using the generic version does give a slightly bigger kernel, though.
> > 
> > defconfig:	   text    data     bss     dec     hex filename
> > x86-specific:	4738555  481232  626688 5846475  5935cb vmlinux (32 bit)
> > generic:	4738621  481232  626688 5846541  59360d vmlinux (32 bit)
> > x86-specific:	5392395  846568  724424 6963387  6a40bb vmlinux (64 bit)
> > generic:	5392458  846568  724424 6963450  6a40fa vmlinux (64 bit)
> 
> i'd not worry about that too much. Have you tried to build with:

I don't, but I needed to compile something to test the build anyhow ;)

>   CONFIG_CC_OPTIMIZE_FOR_SIZE=y
>   CONFIG_OPTIMIZE_INLINING=y

This was defconfig in -x86#testing; both options were already enabled.
Here is what you get with those options turned off ;).

                   text    data     bss     dec     hex filename
x86-specific:   5543996  481232  626688 6651916  65800c vmlinux (32 bit)
generic:        5543880  481232  626688 6651800  657f98 vmlinux (32 bit)
x86-specific:   6111834  846568  724424 7682826  753b0a vmlinux (64 bit)
generic:        6111882  846568  724424 7682874  753b3a vmlinux (64 bit)

(and I double-checked the i386 results)

> (the latter only available in x86.git)
> 
> > Patch is against -x86#testing. It compiles.
> 
> i've picked it up into x86.git, lets see how it goes in practice.

Thanks,
    Alexander

> 	Ingo
-- 
  Alexander van Heukelum
  heukelum@fastmail.fm