From: Segher Boessenkool <>
To: Gabriel Paubert <>
Cc: LKML <>, Steven Rostedt <>
Subject: Re: [PATCH] add strncmp to PowerPC
Date: Mon, 3 Mar 2008 20:08:59 +0100

>> Even if it was logically faster (which I still doubt) it's a hell of
>> a lot of cache lines to waste.

Yeah, 1 on 64-bit and 3 on 32-bit, that's a terrible lot.</sarcasm>

> Indeed, but there are some corner cases that the C code handles. Like
> a length of 0 which may lead to infinite loop in the asm code.
> OTOH, I'm a bit surprised by the extsb instructions in the compiler
> generated code. We don't compile with -fsigned-char, do we? The clrldi
> instructions are also extremely stupid.

Those are both necessary to be equivalent to the C code, which uses
signed char explicitly.  It is generally considered a Good Thing(tm)
for the compiler to generate assembler code equivalent to the C code,
even if the C code is wrong.
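
To illustrate why the signedness matters, here is a minimal sketch that
compares a single byte pair both ways. The helper names are hypothetical
and not taken from the kernel source. The C standard specifies that
strncmp() compares the characters as unsigned char, so a byte with the
high bit set should sort after plain ASCII, not before it.

```c
/* Hypothetical helper: compares the first bytes as signed char, which is
 * the behaviour attributed to the existing C code in this thread. */
static int cmp_as_signed(const char *s1, const char *s2)
{
        return (signed char)*s1 - (signed char)*s2;
}

/* Hypothetical helper: compares the first bytes as unsigned char, which
 * is what the C standard specifies for strncmp(). */
static int cmp_as_unsigned(const char *s1, const char *s2)
{
        return (unsigned char)*s1 - (unsigned char)*s2;
}
```

With the inputs "\x80" and "\x01", the two helpers disagree on the sign
of the result (assuming the usual two's complement conversion that GCC
performs), and that is the kind of difference the sign extension in the
generated code is faithfully preserving.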

> Now that I think a bit more about it, I believe that the C version is
> incorrect

It is.  It's a great entry for the IOCCC as well.

I just tested the following (can't guarantee it's correct, just a PoC):

int strncmp(const char *s1, const char *s2, unsigned long /*size_t*/ len)
{
         while (len--) {
                 unsigned char c1, c2;
                 c1 = *s1++;
                 c2 = *s2++;
                 int cmp = c1 - c2;
                 if (cmp)
                         return cmp;
                 if (c1 == 0 || c2 == 0)
                         break;
         }
         return 0;
}
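
The corner cases raised earlier in the thread (a length of 0, and bytes
with the high bit set) can be checked against a PoC of this shape. The
sketch below restates the loop under the hypothetical name poc_strncmp so
that it does not clash with the libc function. It is an illustration for
testing on a host machine, not the kernel implementation.

```c
/* Hypothetical restatement of the PoC loop, renamed to avoid clashing
 * with the C library's strncmp(). */
static int poc_strncmp(const char *s1, const char *s2, unsigned long len)
{
        /* A len of 0 skips the loop entirely and returns 0. */
        while (len--) {
                unsigned char c1 = *s1++;
                unsigned char c2 = *s2++;
                int cmp = c1 - c2;

                if (cmp)
                        return cmp;
                /* Stop at the terminating NUL even if len is larger. */
                if (c1 == 0 || c2 == 0)
                        break;
        }
        return 0;
}
```

Calling poc_strncmp("abc", "abd", 0) never enters the loop, so the
zero-length case cannot spin, and comparing "\x80" against "\x01" yields
a positive value because both bytes are read as unsigned char.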

which generates (with GCC-4.2.3)

         addi 5,5,1
         mtctr 5
         bdz .L11
         lbz 0,0(3)
         addi 3,3,1
         lbz 9,0(4)
         addi 4,4,1
         cmpwi 7,0,0
         subf. 0,9,0
         cmpwi 6,9,0
         bne- 0,.L4
         beq- 7,.L4
         bne+ 6,.L2
         mr 3,0
         li 0,0
         mr 3,0

which isn't horrid, although it does some weirdish things obviously.

Current GCC-4.4.0 generates

         addi 5,5,1
         mr 10,3
         mtctr 5
         li 11,0
         bdz .L7
         .p2align 4,,15
         lbzx 0,10,11
         lbzx 9,4,11
         addi 11,11,1
         subf. 3,9,0
         cmpwi 6,9,0
         cmpwi 7,0,0
         bnelr 0
         beqlr 7
         beqlr 6
         bdnz .L4
         li 3,0

which is about as good as it can get (well, it didn't realise you
only need to test one of c1, c2 for zero.  Did I say this was just
proof-of-concept code?)
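
The single zero test mentioned above can be sketched as follows. The
function name is hypothetical. The reasoning is that the zero check is
only reached when cmp is 0, which means c1 equals c2, so testing c1 alone
covers both. Whether a given compiler then emits one compare fewer is not
something this sketch claims.

```c
/* Hypothetical variant of the PoC that tests only c1 for NUL. */
static int poc_strncmp_one_test(const char *s1, const char *s2,
                                unsigned long len)
{
        while (len--) {
                unsigned char c1 = *s1++;
                unsigned char c2 = *s2++;
                int cmp = c1 - c2;

                if (cmp)
                        return cmp;
                /* cmp == 0 here, so c1 == c2: one zero test is enough. */
                if (c1 == 0)
                        break;
        }
        return 0;
}
```

If one string is a proper prefix of the other, the difference at the
shorter string's NUL is already nonzero, so the function returns from the
cmp check before the zero test is ever consulted.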


