LKML Archive on lore.kernel.org
From: David Laight <David.Laight@ACULAB.COM>
To: 'Andy Shevchenko' <andriy.shevchenko@linux.intel.com>
Cc: "'Vaittinen, Matti'" <Matti.Vaittinen@fi.rohmeurope.com>,
	Matti Vaittinen <mazziesaccount@gmail.com>,
	Liam Girdwood <lgirdwood@gmail.com>,
	Mark Brown <broonie@kernel.org>, Jiri Kosina <trivial@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Yury Norov <yury.norov@gmail.com>,
	"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH 1/4] bitops: Add single_bit_set()
Date: Tue, 23 Nov 2021 14:36:37 +0000
Message-ID: <74084f269e594286ae5dc88d1f4ca27f@AcuMS.aculab.com>
In-Reply-To: <YZzv93tdAJ5V6MT2@smile.fi.intel.com>

From: 'Andy Shevchenko'
> Sent: 23 November 2021 13:43
> 
> On Tue, Nov 23, 2021 at 10:58:44AM +0000, David Laight wrote:
> > From: Andy Shevchenko
> > > On Tue, Nov 23, 2021 at 10:42:45AM +0000, David Laight wrote:
> > > > From: Vaittinen, Matti
> > > > > Sent: 22 November 2021 13:19
> > > > > On 11/22/21 14:57, Andy Shevchenko wrote:
> > > > > > On Mon, Nov 22, 2021 at 12:42:21PM +0000, Vaittinen, Matti wrote:
> > > > > >> On 11/22/21 13:28, Andy Shevchenko wrote:
> > > > > >>> On Mon, Nov 22, 2021 at 01:03:25PM +0200, Matti Vaittinen wrote:
> > > > > >
> > > > > > What do you mean by this?
> > > > > >
> > > > > > hweight() will return you the number of the non-zero elements in the set.
> > > > >
> > > > > Exactly. The function I added did only check if given set of bits had
> > > > > only one bit set.
> > > >
> > > > Checking for exactly one bit can use the (x & (x - 1)) check on
> > > > non-zero values - which may even be better on some cpus with a
> > > > popcnt instruction.
> > >
> > > In the discussed case the value pretty much can be 0, meaning you have
> > > to add an additional test which I believe diminishes all efforts for
> > > the is_power_of_2() call.
> >
> > I wouldn't have thought so.
> > Code would be:
> > 	if (!scan_for_non_zero())
> > 		return 0;
> > 	if (!is_power_of_2())
> > 		return 0;
> > 	return scan_for_non_zero() ? 0 : 1;
> >
> > Hand-crafting asm you'd actually check for (x - 1) generating
> > carry in the initial scan.
> 
> Have you done any benchmarks? Can we see them?
> 
> > The latency of popcnt is worse than arithmetic on a lot of x86 cpus.

Well, on AMD Piledriver and Bulldozer (etc) 64-bit popcnt has a latency of 4.
On Bobcat the latency is 12.
Excavator and Ryzen are better.
Intel are ok except for the Atoms (Silvermont/Goldmont).
So popcnt isn't going to help on those.

But run on a cpu without a popcnt instruction and the performance will
really be horrid.
At best the gain for using popcnt is marginal.
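
For reference, the two single-word tests being compared look roughly like
this in C. This is only a sketch: the function names are made up for
illustration, and __builtin_popcountll() is the GCC/Clang builtin, used here
as a stand-in for whatever popcount helper the real code would call. Built
without popcnt support the builtin falls back to a software routine, which is
the slow case mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Arithmetic test: x - 1 clears the lowest set bit and sets every bit
 * below it, so x & (x - 1) is zero only when at most one bit was set.
 * The explicit x != 0 check rejects the all-zero word.
 */
static bool one_bit_arith(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * popcount test (illustrative name): count the set bits, compare with 1.
 * No separate zero check is needed, but the count itself costs more on
 * cpus with a slow or missing popcnt instruction.
 */
static bool one_bit_popcnt(uint64_t x)
{
	return __builtin_popcountll(x) == 1;
}
```

Both give the same answer for every input; the only question is which one
is cheaper on a given cpu.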

If you want to try a benchmark then code up (and debug):
	%rsi = buf + length	// pointer to end of bitmap
	%rcx = -length		// in bytes
1:	jrcxz	8f		// jumps if all zeros
	mov	(%rsi, %rcx), %rax
	mov	%rax, %rdx
	sub	$1, %rax	// sets carry if the word was zero
	lea	8(%rcx), %rcx	// lea leaves the flags alone
	jc	1b		// jump if zero word
	and	%rdx, %rax
	jnz	8f		// jump if >1 bit set
2:	jrcxz	9f		// rest all zero: exactly one bit set
	cmp	(%rsi, %rcx), %rax	// %rax is zero here
	lea	8(%rcx), %rcx
	jz	2b
8:	xor	%eax, %eax
	ret
9:	inc	%eax
	ret

I think that is (about) right.
The initial loop may be 3 clocks per iteration on a recent Intel cpu.
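
The same three-phase scan can be written in plain C as a baseline to
benchmark against. This is a sketch under assumptions, not code from the
patch: the name single_bit_in_words() is hypothetical, the bitmap is taken
to be an array of 64-bit words, and the length is given in words rather
than bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if exactly one bit is set in buf[0..nwords). */
static bool single_bit_in_words(const uint64_t *buf, size_t nwords)
{
	size_t i = 0;

	/* Phase 1: skip leading zero words. */
	while (i < nwords && buf[i] == 0)
		i++;
	if (i == nwords)
		return false;		/* all zeros */

	/* Phase 2: the first non-zero word must have a single bit set. */
	if (buf[i] & (buf[i] - 1))
		return false;		/* >1 bit set in this word */

	/* Phase 3: every remaining word must be zero. */
	for (i++; i < nwords; i++)
		if (buf[i] != 0)
			return false;

	return true;
}
```

A compiler will not necessarily fold the zero test and the (x - 1) step
into one carry check the way the hand-written loop does, so the two may
well time differently.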

But I suspect the only real gains are on cpus without popcnt.
It isn't as though you'll be doing this as often as (say)
the IP checksum function - which I have benchmarked.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


Thread overview: 25+ messages
2021-11-22 11:03 [PATCH 0/4] Provide event map helper for regulator drivers Matti Vaittinen
2021-11-22 11:03 ` [PATCH 1/4] bitops: Add single_bit_set() Matti Vaittinen
2021-11-22 11:28   ` Andy Shevchenko
2021-11-22 12:42     ` Vaittinen, Matti
2021-11-22 12:57       ` Andy Shevchenko
2021-11-22 13:00         ` Andy Shevchenko
2021-11-22 13:18         ` Vaittinen, Matti
2021-11-23 10:42           ` David Laight
2021-11-23 10:47             ` Andy Shevchenko
2021-11-23 10:58               ` David Laight
2021-11-23 13:43                 ` 'Andy Shevchenko'
2021-11-23 14:36                   ` David Laight [this message]
2021-11-23 11:42             ` Vaittinen, Matti
2021-11-22 17:54         ` Yury Norov
2021-11-22 19:56           ` Andy Shevchenko
2021-11-23  7:51             ` Yury Norov
2021-11-23  5:26           ` Vaittinen, Matti
2021-11-23  7:33             ` Yury Norov
2021-11-23  9:03               ` Andy Shevchenko
2021-11-23  9:10                 ` Geert Uytterhoeven
2021-11-22 11:03 ` [PATCH 2/4] regulators: Add regulator_err2notif() helper Matti Vaittinen
2021-11-22 11:04 ` [PATCH 3/4] regulators: irq_helper: Provide helper for trivial IRQ notifications Matti Vaittinen
2021-11-22 11:48   ` Andy Shevchenko
2021-11-22 12:44     ` Vaittinen, Matti
2021-11-22 11:04 ` [PATCH 4/4] regulator: Drop unnecessary struct member Matti Vaittinen
