LKML Archive on lore.kernel.org
* Re: somebody dropped a (warning) bomb
       [not found]                         ` <7Oc8t-NS-1@gated-at.bofh.it>
@ 2007-02-15 20:08                           ` Bodo Eggert
  2007-02-16 11:21                             ` Sergei Organov
  2007-02-16 12:46                             ` Sergei Organov
  0 siblings, 2 replies; 67+ messages in thread
From: Bodo Eggert @ 2007-02-15 20:08 UTC (permalink / raw)
  To: Sergei Organov, Linus Torvalds,
	J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

Sergei Organov <osv@javad.com> wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:

>> Exactly because "char" *by*definition* is "indeterminate sign" as far as
>> something like "strlen()" is concerned.
> 
> Thanks, I now understand that you either don't see the difference
> between "indeterminate" and "implementation-defined" in this context or
> consider it non-essential, so I think I've got your point.

If you don't code for a specific compiler with specific settings, there is
no implementation defining the signedness of char, and each part of the code
using char* will be wrong unless it handles both cases correctly.

Therefore it's either always wrong to call your char* function with char*,
unsigned char* _and_ signed char* unless you can guarantee not to overflow any
of them, or it's always correct to call char* functions with any kind of these.

Of course, if you happen to code for specific compiler settings, one signedness
of char will become real and one warning will be legit. And if pigs fly, they
should wear goggles to protect their eyes ...

>> THE FACT IS, THAT "strlen()" IS DEFINED UNIVERSALLY AS TAKING "char *".
> 
> So just don't pass it "unsigned char*". End of story.

Using signed chars for strings is wrong in most countries on earth. It was
wrong when the first IBM PC came out in 1981, and creating a compiler in
1987 defaulting to signed char is a sure sign of originating from an
isolated country and knowing nothing about this planet. Using signed chars
in comparisons is especially wrong, and casting each char to unsigned before
comparing them is likely to be forgotten. Unsigned character strings are
useless because there is no such thing as char(-23), and if these strings
weren't cast to signed inside all IO functions, they wouldn't work correctly.
Only because many programmers don't compare chars, most programs will work
outside the USA. I repeat: Thanks to using signed chars, the programs only
work /by/ /accident/! Promoting the use of signed char strings is promoting
bugs and endangering the stability of all our systems. You should stop this
bullshit now, instead of increasing the pile.

>> That BY DEFINITION means that "strlen()" cannot care about the sign,
>> because the sign IS NOT DEFINED UNIVERSALLY!
>>
>> And if you cannot accept that fact, it's your problem. Not mine.
> 
I never had problems either with strlen() or with this warning, so I was
curious why the warning is such a problem for the kernel.

The warning is a problem because either it clutters your build log, hiding
real bugs; or you typecast whenever a char* function is called, hiding errors;
or you disable it globally, hiding real signedness bugs. Whichever way you
take it, it's wrong.
-- 
Funny quotes:
12. He who laughs last thinks slowest.

Eat this, spammer: uv@k.7eggert.dyndns.org mt4siyETc@65egwl.7eggert.dyndns.org

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-15 20:08                           ` somebody dropped a (warning) bomb Bodo Eggert
@ 2007-02-16 11:21                             ` Sergei Organov
  2007-02-16 14:51                               ` Bodo Eggert
  2007-02-16 12:46                             ` Sergei Organov
  1 sibling, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-16 11:21 UTC (permalink / raw)
  To: 7eggert
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Bodo Eggert <7eggert@gmx.de> writes:
> Sergei Organov <osv@javad.com> wrote:
>> Linus Torvalds <torvalds@linux-foundation.org> writes:
[...]
> Using signed chars for strings is wrong in most countries on earth. It was
> wrong when the first IBM PC came out in 1981, and creating a compiler in
> 1987 defaulting to signed char is a sure sign of originating from an
> isolated country and knowing nothing about this planet.

Maybe I'm wrong, but wasn't it the case that C had no "signed char" type
at that time, so "char" had to be abused as the "tiny signed integer"?

> Using signed chars in comparisons is especially wrong, and casting
> each char to unsigned before comparing them is likely to be
> forgotten.

If we start talking about the C language, my opinion is that it's a C
problem that it allows numeric comparison of "char" variables at
all. If one actually needs to compare alphabetic characters numerically,
he should first cast to the required integer type.

> Unsigned character strings are useless because there is no such thing
> as char(-23), and if these strings weren't cast to signed inside all
> IO functions, they wouldn't work correctly.

Didn't you swap "signed" and "unsigned" by mistake in this phrase? Or
do you indeed think that using "unsigned char*" for strings is useless?

> Only because many programmers don't compare chars, most programs will
> work outside the USA.

Comparison of characters being numeric is not a very good property of
the C language.

> I repeat: Thanks to using signed chars, the programs only
> work /by/ /accident/! Promoting the use of signed char strings is promoting
> bugs and endangering the stability of all our systems. You should stop this
> bullshit now, instead of increasing the pile.

Where did you see me promoting the use of "signed char strings"?! If you
don't like the fact that in the C language characters are "char", strings
are "char*", and the sign of char is implementation-defined, please
argue with the C committee, not with me.

Or use -funsigned-char to get a dialect of C that fits your requirements
better.

-- Sergei.


* Re: somebody dropped a (warning) bomb
  2007-02-15 20:08                           ` somebody dropped a (warning) bomb Bodo Eggert
  2007-02-16 11:21                             ` Sergei Organov
@ 2007-02-16 12:46                             ` Sergei Organov
  2007-02-16 17:40                               ` Bodo Eggert
  1 sibling, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-16 12:46 UTC (permalink / raw)
  To: 7eggert
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Bodo Eggert <7eggert@gmx.de> writes:

> Sergei Organov <osv@javad.com> wrote:
>> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>>> Exactly because "char" *by*definition* is "indeterminate sign" as far as
>>> something like "strlen()" is concerned.
>> 
>> Thanks, I now understand that you either don't see the difference
>> between "indeterminate" and "implementation-defined" in this context or
>> consider it non-essential, so I think I've got your point.
>
> If you don't code for a specific compiler with specific settings, there is
> no implementation defining the signedness of char, and each part of the code
> using char* will be wrong unless it handles both cases correctly.

The problem here is that due to historical reasons, there could be code
out there that abuses "char" for "signed char" (not sure about "unsigned
char"). Old code and old habits are rather persistent.

> Therefore it's either always wrong to call your char* function with char*,
> unsigned char* _and_ signed char* unless you can guarantee not to overflow any
> of them, or it's always correct to call char* functions with any kind
> of these.

How are you sure those who wrote foo(char*) agree with your opinion or
even understand all the involved issues?

> Of course, if you happen to code for specific compiler settings, one signedness
> of char will become real and one warning will be legit. And if pigs fly, they
> should wear goggles to protect their eyes ...

And if people were writing perfect portable code all the time, there
would be no need for warnings in the first place...

-- Sergei.


* Re: somebody dropped a (warning) bomb
  2007-02-16 11:21                             ` Sergei Organov
@ 2007-02-16 14:51                               ` Bodo Eggert
  2007-02-19 11:56                                 ` Sergei Organov
  0 siblings, 1 reply; 67+ messages in thread
From: Bodo Eggert @ 2007-02-16 14:51 UTC (permalink / raw)
  To: Sergei Organov
  Cc: 7eggert, Linus Torvalds,
	J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Fri, 16 Feb 2007, Sergei Organov wrote:
> Bodo Eggert <7eggert@gmx.de> writes:
> > Sergei Organov <osv@javad.com> wrote:
> >> Linus Torvalds <torvalds@linux-foundation.org> writes:

> [...]
> > Using signed chars for strings is wrong in most countries on earth. It was
> > wrong when the first IBM PC came out in 1981, and creating a compiler in
> > 1987 defaulting to signed char is a sure sign of originating from an
> > isolated country and knowing nothing about this planet.
> 
> Maybe I'm wrong, but wasn't it the case that C had no "signed char" type
> that time, so "char" had to be abused for the "tiny signed integer"?

Mentioning the availability of unsigned char was a change in K&R's
book "The C Programming Language" from 1987 to 1988:

"For some time, type unsigned char has been available. The standard 
introduces the signed keyword to make signedness explicit for char and 
other integral objects."

So if there was no "signed char" before, the blame passes on to those who
didn't introduce it alongside "unsigned char".

> > Using signed chars in comparisons is especially wrong, and casting
> > each char to unsigned before comparing them is likely to be
> > forgotten.
> 
> If we start talking about the C language, my opinion is that it's C
> problem that it allows numeric comparison of "char" variables at
> all. If one actually needs to compare alphabetic characters numerically,
> he should first cast to required integer type.

Char values are indexes in a character set. Taking the difference of
indexes is legal. Other languages have ord('x'), but it's only an
expensive typecast (byte('x') does exactly the same thing).

> > Unsigned character strings are useless because there is no such thing
> > as char(-23), and if these strings weren't cast to signed inside all
> > IO functions, they wouldn't work correctly.
> 
> Didn't you swap "signed" and "unsigned" by mistake in this phrase? Or
> do you indeed think that using "unsigned char*" for strings is useless?

ACK

> > Only because many programmers don't compare chars, most programs will
> > work outside the USA.
> 
> Comparison of characters being numeric is not a very good property of
> the C language.

NACK, as above

> > I repeat: Thanks to using signed chars, the programs only
> > work /by/ /accident/! Promoting the use of signed char strings is promoting
> > bugs and endangering the stability of all our systems. You should stop this
> > bullshit now, instead of increasing the pile.
> 
> Where did you see me promoting the use of "signed char strings"?!

char strings are often signed char strings, only using unsigned char 
prevents this in a portable way. If you care about ordinal values of
your strings, use unsigned.

> If you
> don't like the fact that in C language characters are "char", strings
> are "char*", and the sign of char is implementation-defined, please
> argue with the C committee, not with me.
> 
> Or use -funsigned-char to get a dialect of C that fits your requirements
> better.

If you restrict yourself to a specific C compiler, you are doomed, and 
that's not a good start.
-- 
What about those red balls they have on car aerials so you can spot your car
in a park. I think all cars should have them!
	-- Homer Simpson


* Re: somebody dropped a (warning) bomb
  2007-02-16 12:46                             ` Sergei Organov
@ 2007-02-16 17:40                               ` Bodo Eggert
  2007-02-19 12:17                                 ` Sergei Organov
  0 siblings, 1 reply; 67+ messages in thread
From: Bodo Eggert @ 2007-02-16 17:40 UTC (permalink / raw)
  To: Sergei Organov
  Cc: 7eggert, Linus Torvalds,
	J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Fri, 16 Feb 2007, Sergei Organov wrote:
> Bodo Eggert <7eggert@gmx.de> writes:
> > Sergei Organov <osv@javad.com> wrote:
> >> Linus Torvalds <torvalds@linux-foundation.org> writes:

> > If you don't code for a specific compiler with specific settings, there is
> > no implementation defining the signedness of char, and each part of the code
> > using char* will be wrong unless it handles both cases correctly.
> 
> The problem here is that due to historical reasons, there could be code
> out there that abuses "char" for "signed char" (not sure about "unsigned
> char"). Old code and old habits are rather persistent.

There could be code using trigraphs ... and gcc has an option for that.

If this code uses signed chars, using it on unsigned-char-archs is broken
and should be warned about, but the compiler will not warn about this
because this code will not use "signed char" and therefore it's bug-to-bug
syntax compatible, waiting for a semantic breakdown.

I'll say it again: Either the code using unspecified chars is correct, or
it isn't. If it's correct, neither using it with signed nor with unsigned
chars is a bug and you should not warn at all; and if it's not correct,
you should always warn. Instead, gcc warns on "code compiles for $arch".

> > Therefore it's either always wrong to call your char* function with char*,
> > unsigned char* _and_ signed char* unless you can guarantee not to overflow any
> > of them, or it's always correct to call char* functions with any kind
> > of these.
> 
> How are you sure those who wrote foo(char*) agree with your opinion or
> even understand all the involved issues?

Let's assume we have this piece of buggy code. We compile it on an unsigned
char architecture. No warning. *BOOM*


Let's assume there is correct code, and we use it as designed:
Warning: Wrong arch
Warning: Wrong arch
Warning: Wrong arch
Warning: real issue 
Warning: Wrong arch
Warning: Wrong arch
Warning: Wrong arch
Warning: Wrong arch
<scroll off screen/>
Warning: Wrong arch
Warning: Wrong arch
Warning: Wrong arch
Warning: Wrong arch
Warning: Wrong arch

You don't see "real issue". *BOOM*


What can you do about this warning? Let's assume we cast everywhere:

struct foo * p;
printf(strlen((char *)p)); *BOOM*


Let's assume we disable this warning:

int f(unsigned short x)
{
	if (!x)
		return 0;
	return (int) x + f(x-1);
}

f(-1); *BOOM*


Therefore unless you program for one arch with one set of compiler flags,
this warning is useless, and I did not see much code explicitly designed
to be non-portable.

Warning on wrong signedness is good, but if you can't enable it on 
portable code, it's useless.

-- 
Funny quotes:
39. Ever wonder about those people who spend $2.00 apiece on those little
    bottles of Evian water? Try spelling Evian backwards: NAIVE


* Re: somebody dropped a (warning) bomb
  2007-02-16 14:51                               ` Bodo Eggert
@ 2007-02-19 11:56                                 ` Sergei Organov
  0 siblings, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-19 11:56 UTC (permalink / raw)
  To: Bodo Eggert
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Bodo Eggert <7eggert@gmx.de> writes:
> On Fri, 16 Feb 2007, Sergei Organov wrote:
>> Bodo Eggert <7eggert@gmx.de> writes:
>> > Sergei Organov <osv@javad.com> wrote:
>> >> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
[...]
>> If we start talking about the C language, my opinion is that it's C
>> problem that it allows numeric comparison of "char" variables at
>> all. If one actually needs to compare alphabetic characters numerically,
>> he should first cast to required integer type.
>
> Char values are indexes in a character set. Taking the difference of
> indexes is legal. Other languages have ord('x'), but it's only an
> expensive typecast (byte('x') does exactly the same thing).

Why should this typecast be expensive? Is

   char c1, c2;
   ...
   if((unsigned char)c1 < (unsigned char)c2)

any more expensive than if(c1 < c2)?

[...]

>> Comparison of characters being numeric is not a very good property of
>> the C language.
>
> NACK, as above

The point was not to entirely disable it or to make it any more
expensive, but to somehow force the programmer to be explicit that he indeed
needs numeric comparison of characters' indexes. Well, I do realize that
the above is not in the C "spirit", which is to allow implicit conversions
almost everywhere.

>> > I repeat: Thanks to using signed chars, the programs only
>> > work /by/ /accident/! Promoting the use of signed char strings is promoting
>> > bugs and endangering the stability of all our systems. You should stop this
>> > bullshit now, instead of increasing the pile.
>> 
>> Where did you see me promoting the use of "signed char strings"?!
>
> char strings are often signed char strings, only using unsigned char 
> prevents this in a portable way. If you care about ordinal values of
> your strings, use unsigned.

It means that (as I don't want to change the type of my
strings/characters in case I suddenly start to care about ordinal values
of the characters) I must always use "unsigned char" for storing
characters in practice. Right?

There are at least three problems then:

1. The type of "abc" remains "char*", so effectively I'll have two
   different representations of strings.

2. I can't mix two different representations of strings in C without
   compromising type safety. In C, the type for characters is "char",
   that by definition is different type than "unsigned char".

3. As I now represent characters by unsigned tiny integers, I lose the
   distinct type for unsigned tiny integers. Effectively this brings me
   back in time to the old days when C didn't have a distinct type for
   signed tiny integers.

Therefore, your "solution" to one problem creates a bunch of
others. Worse yet, you in fact don't solve the initial problem of comparing
strings. Take, for example, the KOI8-R encoding. Comparing characters from
this encoding using the "unsigned char" representation of character indexes
makes no more sense than comparing "signed char" representations, as
both comparisons will give meaningless results, due to the fact that
the order of (some of) the characters in the table doesn't match their
alphabetical order.

>> If you don't like the fact that in C language characters are "char",
>> strings are "char*", and the sign of char is implementation-defined,
>> please argue with the C committee, not with me.
>> 
>> Or use -funsigned-char to get a dialect of C that fits your requirements
>> better.
>
> If you restrict yourself to a specific C compiler, you are doomed, and 
> that's not a good start.

Isn't it you who insists that "char" should always be unsigned? It's
your problem then to use a compiler that understands that imaginary
language which meets the requirement.

-- Sergei.


* Re: somebody dropped a (warning) bomb
  2007-02-16 17:40                               ` Bodo Eggert
@ 2007-02-19 12:17                                 ` Sergei Organov
  0 siblings, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-19 12:17 UTC (permalink / raw)
  To: Bodo Eggert
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Bodo Eggert <7eggert@gmx.de> writes:
> On Fri, 16 Feb 2007, Sergei Organov wrote:

[...]

> I'll say it again: Either the code using unspecified chars is correct, or
> it isn't. If it's correct, neither using it with signed nor with unsigned
> chars is a bug and you should not warn at all; and if it's not correct,
> you should always warn. Instead, gcc warns on "code compiles for
> $arch".

Here is where we disagree. In my opinion, no matter what the sign of
char for given architecture is, it's incorrect to pass either "signed
char*" or "unsigned char*" argument to a function expecting "char*".

>> > Therefore it's either always wrong to call your char* function with char*,
>> > unsigned char* _and_ signed char* unless you can guarantee not to overflow any
>> > of them, or it's always correct to call char* functions with any kind
>> > of these.
>> 
>> How are you sure those who wrote foo(char*) agree with your opinion or
>> even understand all the involved issues?
>
> Let's assume we have this piece of buggy code. We compile it on an unsigned
> char architecture. No warning. *BOOM*

There should be warning, -- that's my point. "char" is different. "char"
is distinct from either "signed char" or "unsigned char". Always. At
least it's how C is defined.

> Let's asume there is correct code, and we use it as designed:
> Warning: Wrong arch
> Warning: Wrong arch
> Warning: Wrong arch
> Warning: real issue 
> Warning: Wrong arch
> Warning: Wrong arch
> Warning: Wrong arch
> Warning: Wrong arch
> <scroll off screen/>
> Warning: Wrong arch
> Warning: Wrong arch
> Warning: Wrong arch
> Warning: Wrong arch
> Warning: Wrong arch
>
> You don't see "real issue". *BOOM*
>
>
> What can you do about this warning? Let's assume we cast everywhere:

I already gave an answer in response to Linus:

static inline size_t ustrlen(const unsigned char *s)
{
    return strlen((const char *)s);
}

>
> struct foo * p;
> printf(strlen((char *)p)); *BOOM*

printf(ustrlen(p)); *NO BOOM*
unsigned char* u;
printf(ustrlen(u)); *NO WARNING*

-- Sergei.


* Re: somebody dropped a (warning) bomb
  2007-02-15 19:02                                       ` Linus Torvalds
  2007-02-16  4:26                                         ` Rene Herman
@ 2007-02-19 13:58                                         ` Sergei Organov
  1 sibling, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-19 13:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka Enberg,
	J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, 15 Feb 2007, Sergei Organov wrote:
>> 
>> I agree that if the warning has no true positives, it sucks. The problem
>> is that somehow I doubt it has none. And the reasons for the doubt are:
>
> Why do you harp on "no true positives"?

Because if somebody is capable of proving a warning has no true positives,
I immediately agree it's useless. I just wanted to check if it's indeed
the case. It seems it is not.

> That's a pointless thing. You can make *any* warning have "true 
> positives". My personal favorite is the unconditional warning:
>
> 	warning: user is an idiot
>
> and I _guarantee_ you that it has a lot of true positives.

Yes, but there is obviously 0 correlation between the warning and the
user being an idiot (except that non-idiot will probably find a way to
either turn this warning off, or filter it out).

I already agreed that a warning may have true positives and still be
bad; and a warning having "no false negatives" is obviously a nice
one, though unfortunately not that common in practice; and a warning
that has no true positives is useless. What we are arguing about are
the intermediate cases that are most common in practice.

As programmers, we in fact are interested in the correlation between a
warning and an actual problem in the software, but for the kind of warnings
we are discussing (sign-correctness and type-safety in general), the
correlation depends on the style the program is written in, making it
difficult to use as a criterion in the discussion.

> It's the "no false negatives" angle you should look at.

I'd like to, but then I'll need to classify most of (all?) the GCC
warnings as bad, I'm afraid.

> THAT is what matters. The reason we don't see a lot of warnings about 
> idiot users is not that people don't do stupid things, but that 
> *sometimes* they actually don't do something stupid.
>
> Yeah, I know, it's far-fetched, but still.
>
> In other words, you're barking up *exactly* the wrong tree. You're looking 
> at it the wrong way around.
>
> Think of it this way: in science, a theory is proven to be bad by a single 
> undeniable fact just showing that it's wrong.
>
> The same is largely true of a warning. If the warning sometimes happens 
> for code that is perfectly fine, the warning is bad.

Consider:

int find_first_zero(const int *a)
{
	int index = 0;
	while(*a++ != 0)
		index++;
	return index;
}

unsigned int array[] = { 1, 2, 3, 0, 25, 14 };
int index = find_first_zero(array);  *WARNING*

So, by your logic, the -Wpointer-sign warning is bad in general? If not,
then how will you "fix" the code above? And the upside of the fix is
what?

IMHO, the "problem" is that this warning is about violation of type
safety. Type safety by itself doesn't guarantee the code is perfectly
fine. Violation of type safety doesn't necessarily mean the code is
broken. A warning that warns about violation of type safety has the same
properties.  Now, it's up to you to decide if you want warnings on type
safety or not.

What you are saying, on the other hand, for me reads like this: "I do
want warnings on violations of type safety in general, except for
"signed char" or "unsigned char" being used instead of "char"". I think
I do understand your reasons, -- I just don't agree with them.

-- Sergei.


* Re: somebody dropped a (warning) bomb
  2007-02-16  4:26                                         ` Rene Herman
@ 2007-02-19 11:58                                           ` Sergei Organov
  0 siblings, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-19 11:58 UTC (permalink / raw)
  To: Rene Herman
  Cc: Linus Torvalds, Pekka Enberg,
	J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Rene Herman <rene.herman@gmail.com> writes:
[...]
> Given char's special nature, shouldn't the conclusion of this thread
> have long been simply that gcc needs -Wno-char-pointer-sign? (with
> whatever default, as far as I'm concerned).

I entirely agree that all the char business in C is messy enough to
justify separate warning switch(es) in GCC.

However, I still insist that the problem with the code:

   void foo(char *c);
   unsigned char *u;
   signed char *s;
   ...
   foo(u);
   foo(s);

is not (only) in signedness, as neither 'u' nor 's' has a type compatible
with "char*", no matter what the sign of "char" is, so if one cares
about type safety he needs warnings on both invocations of foo().

-- Sergei.


* Re: somebody dropped a (warning) bomb
  2007-02-15 19:02                                       ` Linus Torvalds
@ 2007-02-16  4:26                                         ` Rene Herman
  2007-02-19 11:58                                           ` Sergei Organov
  2007-02-19 13:58                                         ` Sergei Organov
  1 sibling, 1 reply; 67+ messages in thread
From: Rene Herman @ 2007-02-16  4:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sergei Organov, Pekka Enberg,
	"J.A. Magallón",
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

On 02/15/2007 08:02 PM, Linus Torvalds wrote:

> Think of it this way: in science, a theory is proven to be bad by a
> single undeniable fact just showing that it's wrong.
> 
> The same is largely true of a warning. If the warning sometimes
> happens for code that is perfectly fine, the warning is bad.

Slight difference; if a compulsory warning sometimes happens for code 
that is perfectly fine, the warning is bad. I do want to be _able_ to 
get as many warnings as a compiler can muster though.

Given char's special nature, shouldn't the conclusion of this thread 
have long been simply that gcc needs -Wno-char-pointer-sign? (with 
whatever default, as far as I'm concerned).

Rene.


* Re: somebody dropped a (warning) bomb
  2007-02-15 15:57                                   ` Linus Torvalds
  2007-02-15 18:53                                     ` Sergei Organov
@ 2007-02-15 22:32                                     ` Lennart Sorensen
  1 sibling, 0 replies; 67+ messages in thread
From: Lennart Sorensen @ 2007-02-15 22:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sergei Organov, Pekka Enberg, J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Thu, Feb 15, 2007 at 07:57:05AM -0800, Linus Torvalds wrote:
> I agree that it's "unnecessary code", and in many ways exactly the same 
> thing. I just happen to believe that casts tend to be a lot more dangerous 
> than extraneous initializations. Casts have this nasty tendency of hiding 
> *other* problems too (ie you later change the thing you cast completely, 
> and now the compiler doesn't warn about a *real* bug, because the cast 
> will just silently force it to a wrong type).

Well one way you could cast it and still have the compiler tell you if
the type ever changes is doing something stupid like:

char* unsigned_char_star_to_char_star(unsigned char* in) {
	return (char*)in;
}

Then call your strlen with:
strlen(unsigned_char_star_to_char_star(my_unfortunate_unsigned_char_string));

If you ever changed the type, the compiler would warn you again, since the
conversion function doesn't accept anything other than unsigned char *
being passed in for "conversion".  That seems safer than a direct cast,
with which you get no type checking at all.

I hope it doesn't come to this though.  Hopefully gcc will change its
behaviour back or give an option of --dontcomplainaboutcharstar :)

--
Len Sorensen


* Re: somebody dropped a (warning) bomb
  2007-02-13 22:21                       ` Olivier Galibert
  2007-02-14 12:52                         ` Sergei Organov
@ 2007-02-15 20:06                         ` Sergei Organov
  1 sibling, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-15 20:06 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Olivier Galibert <galibert@pobox.com> writes:

> On Tue, Feb 13, 2007 at 09:06:24PM +0300, Sergei Organov wrote:
>> I agree that making strxxx() family special is not a good idea. So what
>> do we do for a random foo(char*) called with an 'unsigned char*'
>> argument? Silence? Hmmm... It's not immediately obvious that it's indeed
>> harmless. Yet another -Wxxx option to GCC to silence this particular
>> case?
>
> Silence would be good.  "char *" has a special status in C, it can be:
> - pointer to a char/to an array of chars (standard interpretation)
> - pointer to a string
> - generic pointer to memory you can read(/write)
>
> Check the aliasing rules if you don't believe be on the third one.

Generic pointer still reads 'void*', doesn't it? If you don't believe me,
check memcpy() and friends. From the POV of aliasing rules, "char*",
"unsigned char*", and "signed char*" are all the same, as the aliasing
rules mention "character type", which the standard defines
like this:

  "The three types char, signed char, and unsigned char are collectively
   called the character types."
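As an illustration of what that clause permits (just a sketch), peeking at any object's bytes through an unsigned char * is exactly the kind of access the aliasing rules bless:

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* Reading any object's representation through unsigned char * is
 * sanctioned by the aliasing rules, because unsigned char is one of
 * the "character types". */
static void dump_bytes(const void *obj, size_t n)
{
        const unsigned char *p = obj;   /* legal alias for any object */
        for (size_t i = 0; i < n; i++)
                printf("%02x ", p[i]);
        printf("\n");
}
```

The same access through, say, a `short *` would violate the aliasing rules; the character types are the carve-out.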

> And it's *way* more often cases 2 and 3 than 1 for the simple reason
> that the signedness of char is unpredictable.

Are the first two different other than by 0-termination? If not, then
differentiating 1 and 2 by signedness doesn't make sense. If we throw
away the signedness argument, then yes, 2 is more common than 1, for the
obvious reason that it's much more convenient to use 0-terminated arrays
of characters in C than plain arrays of characters.

For 3, I'd use "unsigned char*" (or, sometimes, maybe, "signed char*") to
do the actual access (due to aliasing rules), but "void*" as the argument
type, as an arbitrary pointer is spelled "void*" in C.

So, it's 2 that I'd consider the most common usage for "char*". Strings
of characters are idiomatically "char*" in C.

> As a result, a signedness warning between char * and (un)signed char *
> is 99.99% of the time stupid.

As a result of what? Of the fact that strings in C are "char*"? Of the
fact that one can use "char*" to access arbitrary memory? If the latter,
then "char*", "signed char*", and "unsigned char*" are all equal, and
you might as well argue that there should be no warning between "signed
char*" and "unsigned char*" either, as any of them could be used to
access memory.

Anyway, I've already stated that I don't think it's signedness that
matters here. The standard explicitly says "char", "signed char", and
"unsigned char" are all distinct types, so the warning should be about
incompatible pointer types instead. And I don't even dispute that 99.99%
number, I just want somebody to explain to me how dangerous the
remaining 0.01% are, and whether it isn't in fact exactly 0%.

[...]
>> I'm afraid I don't follow. Do we have a way to say "I want an int of
>> indeterminate sign" in C?
>
> Almost completely.  The rules on aliasing say you can convert pointer
> between signed and unsigned variants and the accesses will be
> unsurprising.

Only from the point of view of aliasing. Aliasing rules don't disallow
other surprises.

>
> The only problem is that the implicit conversion of incompatible
> pointer parameters to a function looks impossible in the draft I have.

Why do you think it's impossible according to the draft?

> Probably has been corrected in the final version.

I doubt it's impossible either in the draft or in the final version.

> In any case, having for instance unsigned int * in a prototype really
> means in the language "I want a pointer to integers, and I'm probably
> going to use them as unsigned, so beware".

> For the special case of char, since the beware version would require a
> signed or unsigned tag, it really means indeterminate.

IMHO, it would read "I want a pointer to alphabetic characters, and I'm
probably going to use them as alphabetic characters, so beware". No
signedness involved.

For example, suppose we have 3 functions that compare two
zero-terminated arrays:

int compare1(char *c1, char *c2);
int compare2(unsigned char *c1, unsigned char *c2);
int compare3(signed char *c1, signed char *c2);

the first one might compare its arguments alphabetically, while the
second and the third compare them numerically. All 3 are different, so
passing the wrong type of argument to any of them might be dangerous.
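A sketch of how the second and the third could actually disagree (the byte value 0xE9 is 'é' in Latin-1, chosen purely as an illustration; the conversion of 0xE9 to signed char is implementation-defined but gives -23 on ordinary 2's-complement targets):

```c
#include <assert.h>

/* The byte 0xE9 is -23 as signed char but 233 as unsigned char on a
 * typical 2's-complement machine, so it sorts below 'a' (97) in one
 * view and above it in the other -- compare2() and compare3() could
 * therefore order the same two arrays differently. */
static int below_a_signed(signed char c)
{
        return c < (signed char)'a';
}

static int below_a_unsigned(unsigned char c)
{
        return c < (unsigned char)'a';
}
```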

> C is sometimes called a high-level assembler for a reason :-)

I'm afraid the times when it actually was are far in the past, for
better or worse.

>> The same way there doesn't seem to be a way to say "I want a char of
>> indeterminate sign". :( So no, strlen() doesn't actually say that, no
>> matter if we like it or not. It actually says "I want a char with
>> implementation-defined sign".
>
> In this day and age it means "I want a 0-terminated string".

IMHO, you've forgotten to say "of alphabetic characters".

> Everything else is explicitely signed char * or unsigned char *, often
> through typedefs in the signed case.

Those are for signed and unsigned tiny integers, with confusing names
that are just historical baggage.

>> In fact it's implementation-defined, and this may make a difference
>> here. strlen(), being part of C library, could be specifically
>> implemented for given architecture, and as architecture is free to
>> define the sign of "char", strlen() could in theory rely on particular
>> sign of "char" as defined for given architecture. [Not that I think that
>> any strlen() implementation actually depends on sign.]
>
> That would require pointers tagged in a way or another, you can't
> distinguish between pointers to differently-signed versions of the
> same integer type otherwise (they're required to have same size and
> alignment).  You don't have that on modern architectures.

How is that relevant? By this argument, e.g., foo(int*) can't actually
rely on the sign of its argument either, so any warning about the
signedness of a pointer argument is crap.

>> Can we assure that no function taking 'char*' ever cares about the sign?
>> I'm not sure, and I'm not a language lawyer, but if it's indeed the
>> case, I'd probably agree that it might be a good idea for GCC to extend
>> the C language so that function argument declared "char*" means either
>> of "char*", "signed char*", or "unsigned char*" even though there is no
>> precedent in the language.
>
> It's a warning you're talking about.  That means it is _legal_ in the
> language (even if maybe implementation defined, but legal still).
> Otherwise it would be an error.

It's legal to pass "unsigned int*" to a function requesting "int*", yet
the warning about it is still useful, isn't it?

>> BTW, the next logical step would be for "int*" argument to stop meaning
>> "signed int*" and become any of "int*", "signed int*" or "unsigned
>> int*". Isn't it cool to be able to declare a function that won't produce
>> warning no matter what int is passed to it? ;)
>
> No, it wouldn't be logical, because char is *special*.

Yes, it's special, but in what way? I fail to see in the standard
that "char" means either "signed char" or "unsigned char".
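For what it's worth, the distinction can be demonstrated mechanically; this sketch uses C11 _Generic, a later addition to the language, so treat it as an after-the-fact illustration rather than something available here:

```c
#include <string.h>
#include <assert.h>

/* _Generic dispatches on the static type of its operand.  Listing all
 * three character types as separate associations is valid precisely
 * because "char" is a distinct third type; if it were a synonym for
 * either of the other two, this would be a constraint violation. */
#define CHAR_KIND(x) _Generic((x),          \
        char:          "plain char",        \
        signed char:   "signed char",       \
        unsigned char: "unsigned char")
```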

>> Yes, indeed. So the real problem of the C language is inconsistency
>> between strxxx() and isxxx() families of functions? If so, what is 
>> wrong with actually fixing the problem, say, by using wrappers over
>> isxxx()? Checking... The kernel already uses isxxx() that are macros
>> that do conversion to "unsigned char" themselves, and a few invocations
>> of isspace() I've checked pass "char" as argument. So that's not a real
>> problem for the kernel, right?
>
> Because a cast to silence a warning silences every possible warning
> even if the then-pointer turns for instance into an integer through an
> unrelated change.

The isxxx() family doesn't have pointer arguments, so this is
irrelevant. The isxxx() family, as implemented in the kernel, casts its
argument to "unsigned char", which is the right thing to do.
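The kernel-style macro boils down to something like this sketch (the wrapper name is made up); the cast matters because passing a negative char value other than EOF to the C library's isxxx() is undefined behaviour:

```c
#include <ctype.h>
#include <assert.h>

/* Sketch: cast to unsigned char first, since the C library's isxxx()
 * functions are only defined for values representable as unsigned
 * char (plus EOF).  A plain char holding, say, 0xE9 would otherwise
 * pass a negative int on signed-char targets. */
static inline int isspace_chr(char c)
{
        return isspace((unsigned char)c);
}
```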

>> As the isxxx() family does not seem to be a real problem, at least in
>> the context of the kernel source base, I'd like to learn other reasons
>> to use "unsigned char" for doing strings either in general or
>> specifically in the Linux kernel.
>
> Everybody who has ever done text manipulation in languages other than
> english knows for a fact that chars must be unsigned, always.  The
> current utf8 support frenzy is driving that home even harder.

Well, maybe, but then it won't be C anymore. Strings in C are "char*",
and the sign of "char" is implementation-defined. Though you can get
what you are asking for even now by using -funsigned-char.

>> OK, provided there are actually sound reasons to use "unsigned char*"
>> for strings, isn't it safer to always use it, and re-define strxxx() for
>> the kernel to take that one as argument? Or are there reasons to use
>> "char*" (or "signed char*") for strings as well? I'm afraid that if two
>> or three types are allowed to be used for strings, some inconsistencies
>> here or there will remain no matter what.
>
> "blahblahblah"'s type is const char *.

Yes, how did I forget about that? So be it, strings in C are "char*".

>> Another option could be to always use "char*" for strings and always compile
>> the kernel with -funsigned-char. At least it will be consistent with the
>> C idea of strings being "char*".
>
> The best option is to have gcc stop being stupid.

Any compiler is stupid. AI hasn't matured enough yet ;)

> -funsigned-char creates an assumption which is actually invisible in
> the code itself, making it a ticking bomb for reuse in other projects.

Well, on one hand you are asking for char to always be unsigned, but on
the other hand you reject the gcc option that provides exactly that. The
only option that remains, I'm afraid, is to argue to the C committee
that char should always be unsigned.

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-15 18:53                                     ` Sergei Organov
@ 2007-02-15 19:02                                       ` Linus Torvalds
  2007-02-16  4:26                                         ` Rene Herman
  2007-02-19 13:58                                         ` Sergei Organov
  0 siblings, 2 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-15 19:02 UTC (permalink / raw)
  To: Sergei Organov
  Cc: Pekka Enberg,
	J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton



On Thu, 15 Feb 2007, Sergei Organov wrote:
> 
> I agree that if the warning has no true positives, it sucks. The problem
> is that somehow I doubt it has none. And the reasons for the doubt are:

Why do you harp on "no true positives"?

That's a pointless thing. You can make *any* warning have "true 
positives". My personal favorite is the unconditional warning:

	warning: user is an idiot

and I _guarantee_ you that it has a lot of true positives.

It's the "no false positives" angle you should look at.

THAT is what matters. The reason we don't see a lot of warnings about 
idiot users is not that people don't do stupid things, but that 
*sometimes* they actually don't do something stupid.

Yeah, I know, it's far-fetched, but still.

In other words, you're barking up *exactly* the wrong tree. You're looking 
at it the wrong way around.

Think of it this way: in science, a theory is proven to be bad by a single 
undeniable fact just showing that it's wrong.

The same is largely true of a warning. If the warning sometimes happens 
for code that is perfectly fine, the warning is bad.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-15 15:57                                   ` Linus Torvalds
@ 2007-02-15 18:53                                     ` Sergei Organov
  2007-02-15 19:02                                       ` Linus Torvalds
  2007-02-15 22:32                                     ` Lennart Sorensen
  1 sibling, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-15 18:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka Enberg, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, 15 Feb 2007, Sergei Organov wrote:

[...Skip things I agree with...]

>> > But if you have 
>> >
>> > 	unsigned char *mystring;
>> >
>> > 	len = strlen(mystring);
>> >
>> > then please tell me how to fix that warning without making the code 
>> > *worse* from a type-safety standpoint? You CANNOT. You'd have to cast it 
>> > explicitly (or assign it through a "void *" variable), both of which 
>> > actually MAKE THE TYPE-CHECKING PROBLEM WORSE!
>> 
>> Because instead of trying to fix the code, you insist on hiding the
>> warning.
>
> No. I'm saying that I have *reasons* that I need my string to be unsigned. 
> That makes your first "fix" totally bogus:

Somehow I still suppose that most of those warnings could be fixed by
using "char*" instead. Yes, I believe you, you have reasons. But so far
I've only heard about isxxx(), which turned out not to be a reason in
practice (in the context of the kernel tree). As you didn't tell me
about other reasons, my expectation is that they are not in fact very
common. I'm curious what the actual reasons are, besides the fact that
quite a lot of code is apparently written this way in the kernel. The
latter is indeed a very sound reason for the warning to suck for kernel
developers, but it might not be a reason for it to suck in general.

>> There are at least two actual fixes:
>> 
>> 1. Do 'char *mystring' instead, as otherwise compiler can't figure out
>>    that you are going to use 'mystring' to point to zero-terminated
>>    array of characters.
>
> And the second fix is a fix, but it is, again, worse than the warning:
>
>> 2. Use "len = ustrlen(mystring)" if you indeed need something like C
>>    strings
>
> Sure, I can do that, but the upside is what?

1. You make it explicit, in a type-safe manner, that you are sure it's
   safe to call "strlen()" with an "unsigned char*" argument. We all agree
   about it. Let's write it down in the program. And we can already add
   "strcmp()" to the list. And, say, ustrdup() and ustrchr() returning
   "unsigned char *" would be more type-safe as well.

2. You make it more explicit that you indeed mean to use your own
   representation of strings, different from what C idiomatically uses
   for strings, along with an explicitly defined set of allowed
   operations on this representation.

The actual divergence of opinion seems to be that you are sure it's safe
to call any foo(char*) with "unsigned char*" argument, and I'm not, due
to the reasons described below. If you are right, then there is indeed
little or no point in doing this work.

> Getting rid of a warning that was bogus to begin with?

I agree that if the warning has no true positives, it sucks. The problem
is that somehow I doubt it has none. And the reasons for the doubt are:

1. The C standard explicitly says that "char" is a different type from
   either "signed char" or "unsigned char". There should be some reason
   for that; otherwise they'd simply have written that "char" is a synonym
   for one of "signed char" or "unsigned char", depending on the
   implementation.

2. If "char" in fact matches one of the other two types on a given
   architecture, it is definitely different from the remaining one, so
   substituting that one for "char" might be unsafe. As a consequence,
   for a portable program it might be unsafe to substitute either signed
   or unsigned for "char", as "char" may happen to differ from either
   of them.

While those doubts aren't resolved, I prefer to at least be careful with
things that I don't entirely understand, so I wish the compiler would
give me a warning about subtle semantic differences (if any).

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-15 13:20                                 ` Sergei Organov
@ 2007-02-15 15:57                                   ` Linus Torvalds
  2007-02-15 18:53                                     ` Sergei Organov
  2007-02-15 22:32                                     ` Lennart Sorensen
  0 siblings, 2 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-15 15:57 UTC (permalink / raw)
  To: Sergei Organov
  Cc: Pekka Enberg, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton



On Thu, 15 Feb 2007, Sergei Organov wrote:
> 
> Yes, there is. There are at least 2 drawbacks. The minor one is the
> initialization eating cycles. The major one is that if I later change
> the code and a path appears that by mistake uses that fake value, I
> won't get the warning anymore. The latter drawback is somewhat similar
> to the drawbacks of explicit type casts.

I actually agree - it would clearly be *nicer* if gcc was accurate with 
the warning. I'm just saying that usually the patch to shut gcc up is 
quite benign.

But yes, it basically disabled the warning, and just *maybe* the warning 
was valid, and it disabled it by initializing something to the wrong 
value. 

That's why warnings with false positives are often much worse than not 
having the warning at all: you start dismissing them, and not even 
bothering to fix them "properly". That's especially true if there _isn't_ 
a proper fix.

I personally much prefer a compiler that doesn't warn a lot, but *when* it 
warns, it's 100% on the money. I think that compilers that warn about 
things that "may" be broken are bogus. It needs to be iron-clad, with no 
question about whether the warning is appropriate!

Which is, of course, why we then shut up some gcc warnings, and which gets 
us back to the start of the discussion.. ;)

> [I'd personally try to reorganize the code instead, so that it
> becomes obvious to the compiler that the variable can't be used
> uninitialized. Quite often that makes the code better; at least that
> has been my experience so far.]

It's true that it sometimes works that way. Not always, though.

The gcc "uninitialized variable" thing is *usually* a good sign that the 
code wasn't straightforward for the compiler to follow, and if the 
compiler cannot follow it, then probably neither can most programmers. So 
together with the fact that it's at least not _syntactically_ ugly to shut 
up, it's not a warning I personally object to most of the time (unlike, 
say, the pointer-sign one ;)

> [Did I already say that I think it's the wrong warning to be given in
> this particular case, as the problem with the code is not signedness?]
> 
> 1. Don't try to hide the warning, at least not immediately, -- consider
>    fixing the code first. Ah, sorry, you've already decided for yourself
>    that the warning is idiotic, so there is no reason to try to...

This is actually something we've tried. 

The problem with that approach is that once you have tons of warnings that 
are "expected", suddenly *all* warnings just become noise. Not just the 
bogus one. It really *is* a matter of: warnings should either be 100% 
solid, or they should not be there at all.

And this is not just about compiler warnings. We've added some warnings of 
our own (well, with compiler help, of course): things like the 
"deprecated" warnings etc. I have often ended up shutting them up, and one 
reason for that is that yes, I have literally added bugs to the kernel 
that the compiler really *did* warn about, but because there were so many 
other pointless warnings, I didn't notice!

I didn't make that up. I forget what the bug was, but it literally wasn't 
more than a couple of months ago that I passed in the wrong type to a 
function, and the compiler warned about it in big letters. It was just 
totally hidden by all the other warning crap.

> 2. If you add a cast, it's not in contrast, it's rather similar to fake
>    initialization above as they have somewhat similar drawback.

I agree that it's "unnecessary code", and in many ways exactly the same 
thing. I just happen to believe that casts tend to be a lot more dangerous 
than extraneous initializations. Casts have this nasty tendency of hiding 
*other* problems too (ie you later change the thing you cast completely, 
and now the compiler doesn't warn about a *real* bug, because the cast 
will just silently force it to a wrong type).

And yeah, that may well be a personal hangup of mine. A lot of people 
think casts in C are normal. Me, I actually consider C to be a type-safe 
language (!) but it's only type-safe if you are careful, and avoiding 
casts is one of the rules.

Others claim that C isn't type-safe at all, and I think that comes from 
their having been involved in projects where people didn't share the same 
"casts are fine, but only when you use them really, really carefully" 
attitude.

> > But if you have 
> >
> > 	unsigned char *mystring;
> >
> > 	len = strlen(mystring);
> >
> > then please tell me how to fix that warning without making the code 
> > *worse* from a type-safety standpoint? You CANNOT. You'd have to cast it 
> > explicitly (or assign it through a "void *" variable), both of which 
> > actually MAKE THE TYPE-CHECKING PROBLEM WORSE!
> 
> Because instead of trying to fix the code, you insist on hiding the
> warning.

No. I'm saying that I have *reasons* that I need my string to be unsigned. 
That makes your first "fix" totally bogus:

> There are at least two actual fixes:
> 
> 1. Do 'char *mystring' instead, as otherwise compiler can't figure out
>    that you are going to use 'mystring' to point to zero-terminated
>    array of characters.

And the second fix is a fix, but it is, again, worse than the warning:

> 2. Use "len = ustrlen(mystring)" if you indeed need something like C
>    strings

Sure, I can do that, but the upside is what?

Getting rid of a warning that was bogus to begin with?

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 20:24                           ` Linus Torvalds
@ 2007-02-15 15:15                             ` Sergei Organov
  0 siblings, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-15 15:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, 13 Feb 2007, Sergei Organov wrote:
[...]
> BUT (and this is a big but) within the discussion of "strlen()", that is 
> no longer true. "strlen()" exists _outside_ of a single particular 
> implementation. As such, "implementation-defined" is no longer something 
> that "strlen()" can depend on.
>
> As an example of this argument, try this:
>
> 	#include <string.h>
> 	#include <stdio.h>
>
> 	int main(int argc, char **argv)
> 	{
> 		char a1[] = { -1, 0 }, a2[] = { 1, 0 };
>
> 		printf("%d %d\n", a1[0] < a2[0], strcmp(a1, a2) < 0);
> 		return 0;
> 	}
>
> and *before* you compile it, try to guess what the output is.

Well, I'll try to play fair, so I didn't yet compile it. Now, strcmp()
is defined in the C standard so that its behavior doesn't depend on the
sign of char:

   "The sign of a nonzero value returned by the comparison functions
    memcmp, strcmp, and strncmp is determined by the sign of the
    difference between the values of the first pair of characters (both
    interpreted as unsigned char) that differ in the objects being
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    compared."

[Therefore, at least I don't need to build GCC multilib'ed on
-fsigned/-funsigned-char to get consistent results even if strcmp() in
fact lives in a library; that was my first thought before I referred to
the standard ;)]

Suppose the char is signed. Then a1[0] < a2[0] (= -1 < 1) should be
true. On 2's-complement implementation with 8bit char, -1 converted by
strcmp() to unsigned char should be 0xFF, and 1 converted should be
1. So strcmp() should be equivalent to (0xFF < 1) that is false. So
I'd expect

 1 0

result on implementation with signed char.

Now suppose the char is unsigned. Then on 2's-complement implementation
with 8bit-byte CPU, a1[0] should be 0xFF, and a2[0] should be 1. The
result from strcmp() won't change. So I'd expect

 0 0

result on implementation with unsigned char.

Now I'm going to compile it (I must admit I'm slightly afraid of getting
surprising results, so I've re-read my above reasoning before
compiling):

osv@osv tmp$ cat strcmp.c
#include <stdio.h>
#include <string.h>
int main()
{
  char a1[] = { -1, 0 }, a2[] = { 1, 0 };

  printf("%d %d\n", a1[0] < a2[0], strcmp(a1, a2) < 0);
  return 0;
}
osv@osv tmp$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
...
gcc version 4.1.2 20061028 (prerelease) (Debian 4.1.1-19)
osv@osv tmp$ gcc -O2 strcmp.c -o strcmp && ./strcmp
1 0

> And when that confuses you,

It didn't, or did I miss something? Is char unsigned by default?

> try to compile it using gcc with the 
> "-funsigned-char" flag (or "-fsigned-char" if you started out on an 
> architecture where char was unsigned by default)

osv@osv tmp$ gcc -O2 -fsigned-char strcmp.c -o strcmp && ./strcmp
1 0
osv@osv tmp$ gcc -O2 -funsigned-char strcmp.c -o strcmp && ./strcmp
0 0
osv@osv tmp$

From the above, char is apparently signed by default here, so what?

> And when you really *really* think about it afterwards, I think you'll go 
> "Ahh.. Linus is right". It's more than "implementation-defined": it really 
> is totally indeterminate for code like this.

The fact is that strcmp() is explicitly defined in the C standard so
that it gives the same result no matter what the sign of the "char"
type is. Therefore, it obviously can't be used to determine the sign of
"char", so from the POV of usage of strcmp(), the sign of its argument
is indeed "indeterminate". What I fail to see is how this fact could
help in proving that for any function taking a "char*" argument the
sign of char is indeterminate.

Anyway, it seems that you still miss (or ignore) my point that it's not
(only) the sign of "char" that makes it suspect to call functions
requesting a "char*" argument with an "unsigned char*" value.

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 23:21                               ` Linus Torvalds
@ 2007-02-15 13:20                                 ` Sergei Organov
  2007-02-15 15:57                                   ` Linus Torvalds
  0 siblings, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-15 13:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka Enberg, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, 13 Feb 2007, Sergei Organov wrote:
>> 
>> Sorry, what do you do with "variable 'xxx' might be used uninitialized"
>> warning when it's false? Turn it off? Annotate the source? Assign fake
>> initialization value? Change the compiler so that it does "the effort"
>> for you? Never encountered false positive from this warning?
>
> The thing is, that warning is at least easy to shut up. 
>
> You just add a fake initialization. There isn't any real downside.

Yes, there is. There are at least 2 drawbacks. The minor one is the
initialization eating cycles. The major one is that if I later change
the code and a path appears that by mistake uses that fake value, I
won't get the warning anymore. The latter drawback is somewhat similar
to the drawbacks of explicit type casts.

[I'd personally try to reorganize the code instead, so that it becomes
obvious to the compiler that the variable can't be used uninitialized.
Quite often that makes the code better; at least that has been my
experience so far.]

> In contrast, to shut up the idiotic pointer-sign warning, you have to add 
> a cast.

[Did I already say that I think it's the wrong warning to be given in
this particular case, as the problem with the code is not signedness?]

1. Don't try to hide the warning, at least not immediately, -- consider
   fixing the code first. Ah, sorry, you've already decided for yourself
   that the warning is idiotic, so there is no reason to try to...

2. If you add a cast, it's not in contrast, it's rather similar to fake
   initialization above as they have somewhat similar drawback.

> Now, some people seem to think that adding casts is a way of life, and 
> think that C programs always have a lot of casts.

Hopefully I'm not one of those.

[...]
> But if you have 
>
> 	unsigned char *mystring;
>
> 	len = strlen(mystring);
>
> then please tell me how to fix that warning without making the code 
> *worse* from a type-safety standpoint? You CANNOT. You'd have to cast it 
> explicitly (or assign it through a "void *" variable), both of which 
> actually MAKE THE TYPE-CHECKING PROBLEM WORSE!

Because instead of trying to fix the code, you insist on hiding the
warning. There are at least two actual fixes:

1. Do 'char *mystring' instead, as otherwise the compiler can't figure
   out that you are going to use 'mystring' to point to a zero-terminated
   array of characters.

2. Use "len = ustrlen(mystring)" if you indeed need something like C
   strings, but built from "unsigned char" elements. Similar to (1), this
   will explain your intent to the stupid compiler. Should I also
   mention that due to the definition of strlen(), the implementation of
   ustrlen() is a one-liner (thnx Andrew):

static inline size_t ustrlen(const unsigned char *s)
{
    return strlen((const char *)s);
}
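With that in place, the call site needs no cast at all and keeps full type checking (a sketch; the definition is repeated below only to keep the snippet self-contained):

```c
#include <string.h>
#include <assert.h>

/* Repeated from above only so this snippet stands alone. */
static inline size_t ustrlen(const unsigned char *s)
{
        return strlen((const char *)s);
}

/* An unsigned char * now type-checks with no cast and no
 * -Wpointer-sign warning at the call site; passing any other pointer
 * type would still draw a diagnostic. */
static size_t demo(void)
{
        unsigned char mystring[] = "hello";
        return ustrlen(mystring);
}
```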

Overall, describe your intent to the compiler more clearly, as reading
the programmer's mind is not the kind of business compilers are strong
at.

> See? The warning made no sense to begin with, and it warns about something 
> where the alternatives are worse than what it warned about.
>
> Ergo. It's a crap warning.

No, I'm still not convinced. Well, I'll try to explain my point
differently. What would you think about the following line, should you
encounter it in real code:

  len = strlen(my_tiny_integers);

I'd think that as 'my_tiny_integers' is used as an argument to strlen(),
it might be a zero-terminated array of characters. But why does it have
such a name then?!  Maybe it in fact holds an array of integers where 0
is not necessarily terminating, and the intent of this code is just to
find the first occurrence of 0? It's confusing, and I'd probably go check
the declaration. The declaration happens to be "unsigned char
*my_tiny_integers", which at least is rather consistent with the variable
name, but doesn't resolve my initial doubts. So I need to go check other
uses of this 'my_tiny_integers' variable to figure out what's actually
going on here. Does that sound like good programming practice? I don't
think so.

Now, you probably realize that in the example you gave above you've
effectively declared "mystring" to be a pointer to
"tiny_unsigned_integer". As the compiler (fortunately) doesn't guess
intent from variable names, it can't give us a warning at the point of
declaration, but I still think we must be thankful that it gave us a
warning (even though gcc gives the wrong kind of warning) at the point
of dubious use.

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 22:21                       ` Olivier Galibert
@ 2007-02-14 12:52                         ` Sergei Organov
  2007-02-15 20:06                         ` Sergei Organov
  1 sibling, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-14 12:52 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

Olivier Galibert <galibert@pobox.com> writes:
> On Tue, Feb 13, 2007 at 09:06:24PM +0300, Sergei Organov wrote:
[...]
>> May I suggest another definition for a warning being entirely sucks?
>> "The warning is entirely sucks if and only if it never has true
>> positives." In all other cases it's only more or less sucks, IMHO.
>
> That means a warning that triggers on every line saying "there may be
> a bug there" does not entirely suck?

You got me. That warning indeed sucks entirely. My definition was
poor.

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 20:29                             ` Sergei Organov
  2007-02-13 21:31                               ` Jeff Garzik
@ 2007-02-13 23:21                               ` Linus Torvalds
  2007-02-15 13:20                                 ` Sergei Organov
  1 sibling, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-13 23:21 UTC (permalink / raw)
  To: Sergei Organov
  Cc: Pekka Enberg, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton



On Tue, 13 Feb 2007, Sergei Organov wrote:
> 
> Sorry, what do you do with "variable 'xxx' might be used uninitialized"
> warning when it's false? Turn it off? Annotate the source? Assign fake
> initialization value? Change the compiler so that it does "the effort"
> for you? Never encountered false positive from this warning?

The thing is, that warning is at least easy to shut up. 

You just add a fake initialization. There isn't any real downside.

In contrast, to shut up the idiotic pointer-sign warning, you have to add 
a cast.

Now, some people seem to think that adding casts is a way of life, and 
think that C programs always have a lot of casts. That's NOT TRUE. It's 
actually possible to avoid casts, and good C practice will do that to 
quite a high degree, because casts in C are *dangerous*. 

A language that doesn't allow arbitrary type-casting is a horrible 
language (because it doesn't allow you to "get out" of the language type 
system), and typecasts in C are really important. But they are an 
important feature that should be used with caution, and as little as 
possible, because they (by design, of course) break the type rules.

Now, since the _only_ reason for the -Wpointer-sign warning in the first 
place is to warn about breaking type rules, if the way to shut it up is to 
break them EVEN MORE, then the warning is obviously totally broken. IT 
CAUSES PEOPLE TO WRITE WORSE CODE! Which was against the whole point of 
having the warning in the first place.

This is why certain warnings are fine. For example, the warning about

	if (a=b)
		..

is obviously warning about totally valid C code, but it's _still_ a fine 
warning, because it's actually very easy to make that warning go away AND 
IMPROVE THE CODE at the same time. Even if you actually meant to write the 
assignment, you can write it as

	if ((a = b) != 0)
		..

or 

	a = b;
	if (a)
		..

both of which are actually more readable.

But if you have 

	unsigned char *mystring;

	len = strlen(mystring);

then please tell me how to fix that warning without making the code 
*worse* from a type-safety standpoint? You CANNOT. You'd have to cast it 
explicitly (or assign it through a "void *" variable), both of which 
actually MAKE THE TYPE-CHECKING PROBLEM WORSE!

See? The warning made no sense to begin with, and it warns about something 
where the alternatives are worse than what it warned about.

Ergo. It's a crap warning.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 18:06                     ` Sergei Organov
  2007-02-13 18:26                       ` Pekka Enberg
  2007-02-13 19:25                       ` Linus Torvalds
@ 2007-02-13 22:21                       ` Olivier Galibert
  2007-02-14 12:52                         ` Sergei Organov
  2007-02-15 20:06                         ` Sergei Organov
  2 siblings, 2 replies; 67+ messages in thread
From: Olivier Galibert @ 2007-02-13 22:21 UTC (permalink / raw)
  To: Sergei Organov
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

On Tue, Feb 13, 2007 at 09:06:24PM +0300, Sergei Organov wrote:
> I agree that making strxxx() family special is not a good idea. So what
> do we do for a random foo(char*) called with an 'unsigned char*'
> argument? Silence? Hmmm... It's not immediately obvious that it's indeed
> harmless. Yet another -Wxxx option to GCC to silence this particular
> case?

Silence would be good.  "char *" has a special status in C, it can be:
- pointer to a char/to an array of chars (standard interpretation)
- pointer to a string
- generic pointer to memory you can read(/write)

Check the aliasing rules if you don't believe me on the third one.
And it's *way* more often cases 2 and 3 than 1 for the simple reason
that the signedness of char is unpredictable.  As a result, a
signedness warning between char * and (un)signed char * is 99.99% of
the time stupid.


> May I suggest another definition of a warning that entirely sucks?
> "A warning entirely sucks if and only if it never has true
> positives." In all other cases it only more or less sucks, IMHO.

That means a warning that triggers on every line saying "there may be
a bug there" does not entirely suck?


> I'm afraid I don't follow. Do we have a way to say "I want an int of
> indeterminate sign" in C?

Almost completely.  The rules on aliasing say you can convert pointer
between signed and unsigned variants and the accesses will be
unsurprising.  The only problem is that the implicit conversion of
incompatible pointer parameters to a function looks impossible in the
draft I have.  Probably has been corrected in the final version.

In any case, having for instance unsigned int * in a prototype really
means in the language "I want a pointer to integers, and I'm probably
going to use them as unsigned, so beware".  For the special case of
char, since the beware version would require a signed or unsigned tag,
it really means indeterminate.

C is sometimes called a high-level assembler for a reason :-)


> The same way there doesn't seem to be a way
> to say "I want a char of indeterminate sign". :( So no, strlen() doesn't
> actually say that, no matter if we like it or not. It actually says "I
> want a char with implementation-defined sign".

In this day and age it means "I want a 0-terminated string".
Everything else is explicitly signed char * or unsigned char *, often
through typedefs in the signed case.


> In fact it's implementation-defined, and this may make a difference
> here. strlen(), being part of C library, could be specifically
> implemented for given architecture, and as architecture is free to
> define the sign of "char", strlen() could in theory rely on particular
> sign of "char" as defined for given architecture. [Not that I think that
> any strlen() implementation actually depends on sign.]

That would require pointers tagged in one way or another; you can't
distinguish between pointers to differently-signed versions of the
same integer type otherwise (they're required to have same size and
alignment).  You don't have that on modern architectures.


> Can we assure that no function taking 'char*' ever cares about the sign?
> I'm not sure, and I'm not a language lawyer, but if it's indeed the
> case, I'd probably agree that it might be a good idea for GCC to extend
> the C language so that function argument declared "char*" means either
> of "char*", "signed char*", or "unsigned char*" even though there is no
> precedent in the language.

It's a warning you're talking about.  That means it is _legal_ in the
language (even if maybe implementation defined, but legal still).
Otherwise it would be an error.


> BTW, the next logical step would be for "int*" argument to stop meaning
> "signed int*" and become any of "int*", "signed int*" or "unsigned
> int*". Isn't it cool to be able to declare a function that won't produce
> warning no matter what int is passed to it? ;)

No, it wouldn't be logical, because char is *special*.


> Yes, indeed. So the real problem of the C language is inconsistency
> between strxxx() and isxxx() families of functions? If so, what is 
> wrong with actually fixing the problem, say, by using wrappers over
> isxxx()? Checking... The kernel already uses isxxx() that are macros
> that do conversion to "unsigned char" themselves, and a few invocations
> of isspace() I've checked pass "char" as argument. So that's not a real
> problem for the kernel, right?

Because a cast to silence a warning silences every possible warning
even if the then-pointer turns for instance into an integer through an
unrelated change.  Think for instance about an error_t going from
const char * (error string) to int (error code) through a patch, which
happened to be passed to a utf8_to_whatever conversion function that
takes a const unsigned char * as a parameter.  Casting would hide the
impact of changing the type.


> As the isxxx() family does not seem to be a real problem, at least in
> the context of the kernel source base, I'd like to learn other reasons
> to use "unsigned char" for doing strings either in general or
> specifically in the Linux kernel.

Everybody who has ever done text manipulation in languages other than
english knows for a fact that chars must be unsigned, always.  The
current utf8 support frenzy is driving that home even harder.


> OK, provided there are actually sound reasons to use "unsigned char*"
> for strings, isn't it safer to always use it, and re-define strxxx() for
> the kernel to take that one as argument? Or are there reasons to use
> "char*" (or "signed char*") for strings as well? I'm afraid that if two
> or three types are allowed to be used for strings, some inconsistencies
> here or there will remain no matter what.

"blahblahblah"'s type is const char *.


> Another option could be to always use "char*" for strings and always compile
> the kernel with -funsigned-char. At least it will be consistent with the
> C idea of strings being "char*".

The best option is to have gcc stop being stupid.  -funsigned-char
creates an assumption which is actually invisible in the code itself,
making it a ticking bomb for reuse in other projects.

  OG.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 20:29                             ` Sergei Organov
@ 2007-02-13 21:31                               ` Jeff Garzik
  2007-02-13 23:21                               ` Linus Torvalds
  1 sibling, 0 replies; 67+ messages in thread
From: Jeff Garzik @ 2007-02-13 21:31 UTC (permalink / raw)
  To: Sergei Organov
  Cc: Pekka Enberg, Linus Torvalds,
	J.A. Magallón, Jan Engelhardt,
	Linux Kernel Mailing List, Andrew Morton

On Tue, Feb 13, 2007 at 11:29:44PM +0300, Sergei Organov wrote:
> Sorry, what do you do with "variable 'xxx' might be used uninitialized"
> warning when it's false? Turn it off? Annotate the source? Assign fake
> initialization value? Change the compiler so that it does "the effort"
> for you? Never encountered false positive from this warning?

You pull git://git.kernel.org/...jgarzik/misc-2.6.git#gccbug

	Jeff




^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 19:25                       ` Linus Torvalds
  2007-02-13 19:59                         ` Sergei Organov
@ 2007-02-13 21:13                         ` Rob Landley
  1 sibling, 0 replies; 67+ messages in thread
From: Rob Landley @ 2007-02-13 21:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sergei Organov, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

On Tuesday 13 February 2007 2:25 pm, Linus Torvalds wrote:
> THE FACT IS, THAT "strlen()" IS DEFINED UNIVERSALLY AS TAKING "char *".
> 
> That BY DEFINITION means that "strlen()" cannot care about the sign, 
> because the sign IS NOT DEFINED UNIVERSALLY!
> 
> And if you cannot accept that fact, it's your problem. Not mine.
> 
> The warning is CRAP. End of story.

In busybox we fed the compiler -funsigned-char to make it shut up.  (And so we 
had consistent bugs between arm and x86.)

Building tcc I had to feed it -fsigned-char because -funsigned-char broke 
stuff. :)

Rob
-- 
"Perfection is reached, not when there is no longer anything to add, but
when there is no longer anything to take away." - Antoine de Saint-Exupery

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 19:43                           ` Pekka Enberg
@ 2007-02-13 20:29                             ` Sergei Organov
  2007-02-13 21:31                               ` Jeff Garzik
  2007-02-13 23:21                               ` Linus Torvalds
  0 siblings, 2 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-13 20:29 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

"Pekka Enberg" <penberg@cs.helsinki.fi> writes:
> On 2/13/07, Sergei Organov <osv@javad.com> wrote:
>> With almost any warning out there one makes more or less efforts to
>> suppress the warning where it gives false positives, isn't it?
>
> Yes, as long it's the _compiler_ that's doing the effort.
> You shouldn't need to annotate the source just to make the compiler
> shut up.

Sorry, what do you do with "variable 'xxx' might be used uninitialized"
warning when it's false? Turn it off? Annotate the source? Assign fake
initialization value? Change the compiler so that it does "the effort"
for you? Never encountered false positive from this warning?

> Once the compiler starts issuing enough false positives, it's
> time to turn off that warning completely.

Yes, I don't argue that. I said "otherwise the warning more or less
sucks", and then it's up to programmers to decide whether it sucks
enough to be turned off. The decision then depends on the importance of
its true positives. Only if a warning never has true positives is it
unconditional, total, unhelpful crap, -- that was my point.

> Therefore, the only sane strategy for a warning is to aim for zero
> false positives.

Sure. But unfortunately this is an unreachable aim in most cases.

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 19:59                         ` Sergei Organov
@ 2007-02-13 20:24                           ` Linus Torvalds
  2007-02-15 15:15                             ` Sergei Organov
  0 siblings, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-13 20:24 UTC (permalink / raw)
  To: Sergei Organov
  Cc: J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Tue, 13 Feb 2007, Sergei Organov wrote:

> > Exactly because "char" *by*definition* is "indeterminate sign" as far as 
> > something like "strlen()" is concerned.
> 
> Thanks, I now understand that you either don't see the difference
> between "indeterminate" and "implementation-defined" in this context or
> consider it non-essential, so I think I've got your point.

The thing is, "implementation-defined" does actually matter when you can 
_depend_ on it.

For example, there's a *huge* difference between "undefined" and 
"implementation-defined". A program can actually depend on something like

	char c = 0x80;

	if (c < 0)
		..

always having the *same* behaviour for a particular compiler (with a 
particular set of compile flags - some compilers have flags to change the 
sign behaviour).

So yes, there "implementation defined" actually has a real and worthwhile 
meaning. It is guaranteed to have _some_ semantics within the confines of 
that program, and they are defined to be consistent (again, within the 
confines of that particular program).

So I agree that "implementation defined" doesn't always mean 
"meaningless".

BUT (and this is a big but) within the discussion of "strlen()", that is 
no longer true. "strlen()" exists _outside_ of a single particular 
implementation. As such, "implementation-defined" is no longer something 
that "strlen()" can depend on.

As an example of this argument, try this:

	#include <string.h>
	#include <stdio.h>

	int main(int argc, char **argv)
	{
		char a1[] = { -1, 0 }, a2[] = { 1, 0 };

		printf("%d %d\n", a1[0] < a2[0], strcmp(a1, a2) < 0);
		return 0;
	}

and *before* you compile it, try to guess what the output is.

And when that confuses you, try to compile it using gcc with the 
"-funsigned-char" flag (or "-fsigned-char" if you started out on an 
architecture where char was unsigned by default)

And when you really *really* think about it afterwards, I think you'll go 
"Ahh.. Linus is right". It's more than "implementation-defined": it really 
is totally indeterminate for code like this.

(Yeah, the above isn't "strlen()", but it's an even subtler issue with 
EXACTLY THE SAME PROBLEM! And one where you can actually see the _effects_ 
of it)

			Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 19:25                       ` Linus Torvalds
@ 2007-02-13 19:59                         ` Sergei Organov
  2007-02-13 20:24                           ` Linus Torvalds
  2007-02-13 21:13                         ` Rob Landley
  1 sibling, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-13 19:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, 13 Feb 2007, Sergei Organov wrote:
>> >
>> > 	"I want a char of indeterminate sign"!
>> 
>> I'm afraid I don't follow. Do we have a way to say "I want an int of
>> indeterminate sign" in C? The same way there doesn't seem to be a way
>> to say "I want a char of indeterminate sign".
>
> You're wrong.

Sure, I knew it from the very beginning ;)

>
> Exactly because "char" *by*definition* is "indeterminate sign" as far as 
> something like "strlen()" is concerned.

Thanks, I now understand that you either don't see the difference
between "indeterminate" and "implementation-defined" in this context or
consider it non-essential, so I think I've got your point.

> "char" is _special_. Char is _different_. Char is *not* "int".

Sure.

>
>> So no, strlen() doesn't actually say that, no matter if we like it or 
>> not. It actually says "I want a char with implementation-defined sign".
>
> You're arguing some strange semantic difference in the English
> language.

Didn't I explain further exactly what I meant (in the part you've
skipped)? OK, never mind.

> I'm not really interested.

OK, no problem.

>
> THE FACT IS, THAT "strlen()" IS DEFINED UNIVERSALLY AS TAKING "char *".

So just don't pass it "unsigned char*". End of story.

>
> That BY DEFINITION means that "strlen()" cannot care about the sign, 
> because the sign IS NOT DEFINED UNIVERSALLY!
>
> And if you cannot accept that fact, it's your problem. Not mine.

I never had problems either with strlen() or with this warning, so I
was curious why the warning is such a problem for the kernel. I'm sorry
I took your time and haven't got an answer that I entirely agree
with. Thank you very much for your explanations anyway.

> The warning is CRAP. End of story.

Amen!

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 19:14                         ` Sergei Organov
@ 2007-02-13 19:43                           ` Pekka Enberg
  2007-02-13 20:29                             ` Sergei Organov
  0 siblings, 1 reply; 67+ messages in thread
From: Pekka Enberg @ 2007-02-13 19:43 UTC (permalink / raw)
  To: Sergei Organov
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

On 2/13/07, Sergei Organov <osv@javad.com> wrote:
> With almost any warning out there one makes more or less effort to
> suppress the warning where it gives false positives, doesn't one?

Yes, as long it's the _compiler_ that's doing the effort. You
shouldn't need to annotate the source just to make the compiler shut
up. Once the compiler starts issuing enough false positives, it's time
to turn off that warning completely. Therefore, the only sane strategy
for a warning is to aim for zero false positives.

                                 Pekka

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 18:06                     ` Sergei Organov
  2007-02-13 18:26                       ` Pekka Enberg
@ 2007-02-13 19:25                       ` Linus Torvalds
  2007-02-13 19:59                         ` Sergei Organov
  2007-02-13 21:13                         ` Rob Landley
  2007-02-13 22:21                       ` Olivier Galibert
  2 siblings, 2 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-13 19:25 UTC (permalink / raw)
  To: Sergei Organov
  Cc: J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Tue, 13 Feb 2007, Sergei Organov wrote:
> >
> > 	"I want a char of indeterminate sign"!
> 
> I'm afraid I don't follow. Do we have a way to say "I want an int of
> indeterminate sign" in C? The same way there doesn't seem to be a way
> to say "I want a char of indeterminate sign".

You're wrong.

Exactly because "char" *by*definition* is "indeterminate sign" as far as 
something like "strlen()" is concerned.

"char" is _special_. Char is _different_. Char is *not* "int".

> So no, strlen() doesn't actually say that, no matter if we like it or 
> not. It actually says "I want a char with implementation-defined sign".

You're arguing some strange semantic difference in the English language.

I'm not really interested.

THE FACT IS, THAT "strlen()" IS DEFINED UNIVERSALLY AS TAKING "char *".

That BY DEFINITION means that "strlen()" cannot care about the sign, 
because the sign IS NOT DEFINED UNIVERSALLY!

And if you cannot accept that fact, it's your problem. Not mine.

The warning is CRAP. End of story.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 18:26                       ` Pekka Enberg
@ 2007-02-13 19:14                         ` Sergei Organov
  2007-02-13 19:43                           ` Pekka Enberg
  0 siblings, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-13 19:14 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

"Pekka Enberg" <penberg@cs.helsinki.fi> writes:

> On 2/13/07, Sergei Organov <osv@javad.com> wrote:
>> May I suggest another definition of a warning that entirely sucks?
>> "A warning entirely sucks if and only if it never has true
>> positives." In all other cases it only more or less sucks, IMHO.
>
> You're totally missing the point.

No, I'm not.

> False positives are not a minor annoyance, they're actively harmful as
> they hide other _useful_ warnings.

Every warning I'm aware of does have false positives. They are indeed
harmful, so one takes steps to get rid of them. If a warning had no
false positives, it would be an error, not a warning, in the first
place.

> So, you really want warnings to be about things that can and
> should be fixed.

That's the definition of a "perfect" warning, which is actually called
an "error".

> So you really should aim for _zero false positives_ even if you risk
> not detecting some real positives.

With almost any warning out there one makes more or less effort to
suppress the warning where it gives false positives, doesn't one? At
least that's my experience so far.

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-13 18:06                     ` Sergei Organov
@ 2007-02-13 18:26                       ` Pekka Enberg
  2007-02-13 19:14                         ` Sergei Organov
  2007-02-13 19:25                       ` Linus Torvalds
  2007-02-13 22:21                       ` Olivier Galibert
  2 siblings, 1 reply; 67+ messages in thread
From: Pekka Enberg @ 2007-02-13 18:26 UTC (permalink / raw)
  To: Sergei Organov
  Cc: Linus Torvalds, J.A. Magallón,
	Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List,
	Andrew Morton

On 2/13/07, Sergei Organov <osv@javad.com> wrote:
> May I suggest another definition of a warning that entirely sucks?
> "A warning entirely sucks if and only if it never has true
> positives." In all other cases it only more or less sucks, IMHO.

You're totally missing the point. False positives are not a minor
annoyance, they're actively harmful as they hide other _useful_
warnings. So, you really want warnings to be about things that can and
should be fixed. So you really should aim for _zero false positives_
even if you risk not detecting some real positives.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-12 16:26                   ` Linus Torvalds
@ 2007-02-13 18:06                     ` Sergei Organov
  2007-02-13 18:26                       ` Pekka Enberg
                                         ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-13 18:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Mon, 12 Feb 2007, Sergei Organov wrote:
>> 
>> Why strlen() should be allowed to be called with an incompatible pointer
>> type? My point is that gcc should issue *different warning*, -- the same
>> warning it issues here:
>
> I agree that "strlen()" per se isn't different.

I agree that making strxxx() family special is not a good idea. So what
do we do for a random foo(char*) called with an 'unsigned char*'
argument? Silence? Hmmm... It's not immediately obvious that it's indeed
harmless. Yet another -Wxxx option to GCC to silence this particular
case?

> A warning is only as good as
>  (a) the thing it warns about
>  (b) the thing you can do about it
> 
> And THAT is the fundamental problem with that *idiotic* warning. Yes, it's 
> technically correct. Yes, it's "proper C grammar". But if you can't get 
> over the hump of realizing that there is a difference between "grammar" 
> and "intelligent speech", you shouldn't be doing compilers.
> 
> So the warning sucks because:

May I suggest another definition of a warning that entirely sucks?
"A warning entirely sucks if and only if it never has true
positives." In all other cases it only more or less sucks, IMHO.

>  - the thing it warns about (passing "unsigned char" to something that 
>    doesn't specify a sign at all!) is not something that sounds wrong in 
>    the first place. Yes, it's unsigned. But no, the thing it is passed to 
>    didn't specify that it wanted a "signed" thing in the first place. The 
>    "strlen()" function literally says
>
> 	"I want a char of indeterminate sign"!

I'm afraid I don't follow. Do we have a way to say "I want an int of
indeterminate sign" in C? The same way there doesn't seem to be a way
to say "I want a char of indeterminate sign". :( So no, strlen() doesn't
actually say that, no matter if we like it or not. It actually says "I
want a char with implementation-defined sign".

>    which implies that strlen really doesn't care about the sign. The same 
>    is true of *any* function that takes a "char *". Such a function 
>    doesn't care, and fundamentally CANNOT care about the sign, since it's 
>    not even defined!

In fact it's implementation-defined, and this may make a difference
here. strlen(), being part of C library, could be specifically
implemented for given architecture, and as architecture is free to
define the sign of "char", strlen() could in theory rely on particular
sign of "char" as defined for given architecture. [Not that I think that
any strlen() implementation actually depends on sign.]

>    So the warning fails the (a) criterion. The warning isn't valid, 
>    because the thing it warns about isn't a valid problem!

Can we assure that no function taking 'char*' ever cares about the sign?
I'm not sure, and I'm not a language lawyer, but if it's indeed the
case, I'd probably agree that it might be a good idea for GCC to extend
the C language so that function argument declared "char*" means either
of "char*", "signed char*", or "unsigned char*" even though there is no
precedent in the language.

BTW, the next logical step would be for "int*" argument to stop meaning
"signed int*" and become any of "int*", "signed int*" or "unsigned
int*". Isn't it cool to be able to declare a function that won't produce
warning no matter what int is passed to it? ;)

>  - it _also_ fails the (b) criterion, because quite often there is nothing 
>    you can do about it. Yes, you can add a cast, but adding a cast 
>    actually causes _worse_ code (but the warning is certainly gone). But 
>    that makes the _other_ argument for the warning totally point-less: if 
>    the reason for the warning was "bad code", then having the warning is 
>    actively BAD, because the end result is actually "worse code".
>
> See? The second point is why it's important to also realize that there is 
> a lot of real and valid code that actually _does_ pass "strlen()" an 
> unsigned string.

And here we finally get to the practice...

> There are tons of reasons for that to happen: the part of the program
> that _does_ care wants to use a "unsigned char" array, because it ends
> up doing things like "isspace(array[x])", and that is not well-defined
> if you use a "char *" array.

Yes, indeed. So the real problem of the C language is inconsistency
between strxxx() and isxxx() families of functions? If so, what is 
wrong with actually fixing the problem, say, by using wrappers over
isxxx()? Checking... The kernel already uses isxxx() that are macros
that do conversion to "unsigned char" themselves, and a few invocations
of isspace() I've checked pass "char" as argument. So that's not a real
problem for the kernel, right?

> So there are lots of reasons to use "unsigned char" arrays for strings. 
> Look it up. Look up any half-way reasonable man-page for the "isspace()" 
> kind of functions, and if they don't actually explicitly say that you 
> should use unsigned characters for it, those man-pages are crap.

Well, even C99 itself says so. But the kernel already handles this
issue nicely without requiring a conversion of char to unsigned char,
doesn't it?

> Because those functions really *are* defined in "int", but it's the
> same kind of namespace that "getchar()" works in (ie "unsigned char" +
> EOF, where EOF _usually_ is -1, although other values are certainly
> technically legal too).
>
> So:
>
>  - in practice, a lot of "good programming" uses "unsigned char" pointers 
>    for doing strings. There are LOTS of reasons for that, but "isspace()" 
>    and friends is the most obvious one.

As the isxxx() family does not seem to be a real problem, at least in
the context of the kernel source base, I'd like to learn other reasons
to use "unsigned char" for doing strings either in general or
specifically in the Linux kernel.

>  - if you can't call "strlen()" on your strings without the compiler 
>    warning, there's two choices: the compiler warning is CRAP, or your 
>    program is bad. But as I just showed you, "unsigned char *" is actually 
>    often the *right* thing to use for string work, so it clearly wasn't 
>    the program that was bad.

OK, provided there are actually sound reasons to use "unsigned char*"
for strings, isn't it safer to always use it, and re-define strxxx() for
the kernel to take that one as argument? Or are there reasons to use
"char*" (or "signed char*") for strings as well? I'm afraid that if two
or three types are allowed to be used for strings, some inconsistencies
here or there will remain no matter what.

Another option could be to always use "char*" for strings and always compile
the kernel with -funsigned-char. At least it would be consistent with the
C idea of strings being "char*".

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 12:34                   ` Jan Engelhardt
  2007-02-09 13:16                     ` linux-os (Dick Johnson)
@ 2007-02-13 15:14                     ` Dick Streefland
  1 sibling, 0 replies; 67+ messages in thread
From: Dick Streefland @ 2007-02-13 15:14 UTC (permalink / raw)
  To: linux-kernel

Jan Engelhardt <jengelh@linux01.gwdg.de> wrote:
| Further, to answer again the question whether they generate signed or
| unsigned comparisons: have you ever seen a computer that addresses memory with
| negative numbers? Since the answer is most likely no, signed comparisons make
| no sense to me.

The Transputer has (had?) a signed address space:

http://maven.smith.edu/~thiebaut/transputer/chapter2/chap2-3.html

-- 
Dick Streefland                      ////                      Altium BV
dick.streefland@altium.nl           (@ @)          http://www.altium.com
--------------------------------oOO--(_)--OOo---------------------------


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 22:05                           ` Jan Engelhardt
  2007-02-09 22:58                             ` Martin Mares
@ 2007-02-12 18:50                             ` linux-os (Dick Johnson)
  1 sibling, 0 replies; 67+ messages in thread
From: linux-os (Dick Johnson) @ 2007-02-12 18:50 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Linus Torvalds, David Rientjes, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton


On Fri, 9 Feb 2007, Jan Engelhardt wrote:

>
> On Feb 9 2007 15:29, linux-os (Dick Johnson) wrote:
>>>
>>> I was referring to "absolute memory", not the offset magic that assembler
>>> allows. After all, (reg+relativeOffset) will yield an absolute address.
>>> What I was out at: for machines that have more than 2 GB of memory, you
>>> don't call the address that is given by 0x80000000U actually "byte
>>> -2147483648", but "byte 2147483648".
>>
>> Don't make any large bets on that!
>>
>> char foo()
>> {
>>    volatile char *p = (char *)0x80000000;
>>    return *p;
>> }
>> Optimized....
>> 	.file	"zzz.c"
>> 	.text
>> 	.p2align 2,,3
>> .globl foo
>> 	.type	foo, @function
>> foo:
>> 	pushl	%ebp
>> 	movb	-2147483648, %al
>> 	movl	%esp, %ebp
>> 	movsbl	%al,%eax
>> 	leave
>> 	ret
>> 	.size	foo, .-foo
>> 	.section	.note.GNU-stack,"",@progbits
>> 	.ident	"GCC: (GNU) 3.3.3 20040412 (Red Hat Linux 3.3.3-7)"
>
> 00000000 <foo>:
>   0:   55                      push   %ebp
>   1:   0f b6 05 00 00 00 80    movzbl 0x80000000,%eax
>   8:   89 e5                   mov    %esp,%ebp
>   a:   5d                      pop    %ebp
>   b:   0f be c0                movsbl %al,%eax
>   e:   c3                      ret
>
> You do know that there is a bijection between the set of signed [32bit]
> integers and unsigned [32bit] integers, don't you?
> For the CPU, it's just bits. Being signed or unsigned is not important
> when just accessing memory.

I would have normally just let that comment go, but it is symptomatic
of a complete misunderstanding of how many (most) processors address
memory. This may come about because modern compilers tend to isolate
programmers from the underlying mechanisms. There isn't a 1:1
correspondence between a signed displacement and an unsigned one.
Signed displacements are mandatory with processors that provide
protection. Here is the reason: assume that you could execute code
anywhere (a flat module) and you didn't have 'C' complaining about
NULL pointers. If you were to execute code at memory location 0 that
told the processor to jump to location -1 in absolute memory, you
need to have the processor generate a bounds error trap, not to wrap
to offset 0xffffffff. The same thing is true if code is executing
at or near offset 0xffffffff. If it jumps to a location that would wrap
past 0xffffffff, one needs to generate a trap as well. Also, except for
the "FAR" jumps and FAR calls, where the segment register is part of
the operand, all programmed jumps or calls are relative, i.e., based
upon the current program-counter's value. This means that the program-
counter will have a value added to, or subtracted from, its current
value to control program flow. For such addition to work either as
an addition of signed numbers or as a relative displacement, the
math must use signed (two's complement) arithmetic.

If you look at the opcode output from the compiler, when performing
conditional jumps, you will note that the jump usually doesn't
include a 32-bit address. Instead, these are mostly short jumps, using
an 8-bit signed displacement (-128 to +127 bytes). Other conditional
jumps use 32-bit signed displacements.

The exact same addressing mechanism exists for memory read and write
accesses. They are all signed displacements.

> It will, when a comparison is involved, but
> that was not the point here. void* comparisons are unsigned. Period.
> Because a compiler doing signed comparisons will "map" the memory [from
> 2 GB to 4 GB] as part of the signed comparison before the memory [from 0
> GB to 2 GB], which collides with - let's call it - "the world view".
>
> Jan
> -- 
> ft: http://freshmeat.net/p/chaostables/
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.61 BogoMips).
New book: http://www.AbominableFirebug.com/

****************************************************************
The information transmitted in this message is confidential and may be privileged.  Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited.  If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to DeliveryErrors@analogic.com - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-12 11:12                 ` Sergei Organov
@ 2007-02-12 16:26                   ` Linus Torvalds
  2007-02-13 18:06                     ` Sergei Organov
  0 siblings, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-12 16:26 UTC (permalink / raw)
  To: Sergei Organov
  Cc: J.A. Magallón, Jan Engelhardt,
	Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Mon, 12 Feb 2007, Sergei Organov wrote:
> 
> Why strlen() should be allowed to be called with an incompatible pointer
> type? My point is that gcc should issue *different warning*, -- the same
> warning it issues here:

I agree that "strlen()" per se isn't different.

The issue is not that the warning isn't "technically correct". IT IS. 

Nobody should ever argue that the warning isn't "correct". I hope people 
didn't think I argued that.

I've argued that the warning is STUPID. That's a totally different thing.

I can say totally idiotic things in perfectly reasonable English grammar 
and spelling. Does that make the things I say "good"? No.

The same is true of this gcc warning. It's technically perfectly 
reasonable both in English grammar and spelling (well, as far as 
any compiler warning ever is) _and_ in "C grammar and spelling" too.

But being grammatically correct does not make it "smart". IT IS STILL 
STUPID.

Can people not see the difference between "grammatically correct" and 
"intelligent"? People on the internet seem to have often acquired the 
understanding that "bad grammar and spelling" => "stupid", and yes, there 
is definitely some kind of correlation there. But as any logician and 
mathematician hopefully knows, "a => b" does NOT imply "!a => !b".

Some people think that "warnings are always good". HELL NO!

A warning is only as good as
 (a) the thing it warns about
 (b) the thing you can do about it

And THAT is the fundamental problem with that *idiotic* warning. Yes, it's 
technically correct. Yes, it's "proper C grammar". But if you can't get 
over the hump of realizing that there is a difference between "grammar" 
and "intelligent speech", you shouldn't be doing compilers.

So the warning sucks because:

 - the thing it warns about (passing "unsigned char" to something that 
   doesn't specify a sign at all!) is not something that sounds wrong in 
   the first place. Yes, it's unsigned. But no, the thing it is passed to 
   didn't specify that it wanted a "signed" thing in the first place. The 
   "strlen()" function literally says

	"I want a char of indeterminate sign"!

   which implies that strlen really doesn't care about the sign. The same 
   is true of *any* function that takes a "char *". Such a function 
   doesn't care, and fundamentally CANNOT care about the sign, since it's 
   not even defined!

   So the warning fails the (a) criterion. The warning isn't valid, 
   because the thing it warns about isn't a valid problem!

 - it _also_ fails the (b) criterion, because quite often there is nothing 
   you can do about it. Yes, you can add a cast, but adding a cast 
   actually causes _worse_ code (but the warning is certainly gone). But 
   that makes the _other_ argument for the warning totally point-less: if 
   the reason for the warning was "bad code", then having the warning is 
   actively BAD, because the end result is actually "worse code".

See? The second point is why it's important to also realize that there is 
a lot of real and valid code that actually _does_ pass "strlen()" an 
unsigned string. There are tons of reasons for that to happen: the part of 
the program that _does_ care wants to use a "unsigned char" array, because 
it ends up doing things like "isspace(array[x])", and that is not 
well-defined if you use a "char *" array.

So there are lots of reasons to use "unsigned char" arrays for strings. 
Look it up. Look up any half-way reasonable man-page for the "isspace()" 
kind of functions, and if they don't actually explicitly say that you 
should use unsigned characters for it, those man-pages are crap. Because 
those functions really *are* defined in "int", but it's the same kind of 
namespace that "getchar()" works in (ie "unsigned char" + EOF, where EOF 
_usually_ is -1, although other values are certainly technically legal 
too).

So:

 - in practice, a lot of "good programming" uses "unsigned char" pointers 
   for doing strings. There are LOTS of reasons for that, but "isspace()" 
   and friends is the most obvious one.

 - if you can't call "strlen()" on your strings without the compiler 
   warning, there's two choices: the compiler warning is CRAP, or your 
   program is bad. But as I just showed you, "unsigned char *" is actually 
   often the *right* thing to use for string work, so it clearly wasn't 
   the program that was bad.

So *please* understand:

 - yes, the warning is "correct" from a C grammatical standpoint

 - the warning is STILL CRAP, because grammar isn't the only thing about a 
   computer language. Sane usage is MUCH MORE important than any grammar. 

Thus ends the sacred teachings of Linus "always right" Torvalds. Go and 
ponder these words, and please send me all your money (certified checks 
only, please - sending small unmarked bills is against USPS rules) to show 
your support of the holy church of good taste.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 15:58               ` Linus Torvalds
@ 2007-02-12 11:12                 ` Sergei Organov
  2007-02-12 16:26                   ` Linus Torvalds
  0 siblings, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-12 11:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J.A. Magallón, Jan Engelhardt, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Fri, 9 Feb 2007, Sergei Organov wrote:
>> 
>> As far as I can read the C99 standard, the "char", "signed char", and
>> "unsigned char", are all different types:
>
> Indeed. Search for "pseudo-unsigned", and you'll see more. There are 
> actually cases where "char" can act differently from _both_ "unsigned 
> char" and "signed char".
>
>> If so, then the code above is broken no matter what representation of
>> "char" is chosen for given arch, as strcmp() takes "char*" arguments,
>> that are not compatible with either "signed char*" or "unsigned char*".
>
> ..and my argument is that a warning which doesn't allow you to call 
> "strlen()" on a "unsigned char" array without triggering is a bogus 
> warning, and must be removed.

Why strlen() should be allowed to be called with an incompatible pointer
type? My point is that gcc should issue *different warning*, -- the same
warning it issues here:

$ cat incompat.c
void foo(int *p);
void boo(long *p) { foo(p); }
$ gcc -W -Wall -c incompat.c
incompat.c:2: warning: passing argument 1 of 'foo' from incompatible pointer type

Calling strlen(char*) with an "unsigned char*" argument does pass an
argument of incompatible pointer type, because in C "char" and
"unsigned char" are two incompatible types; it has nothing to do
with signedness.

[...]

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 22:05                           ` Jan Engelhardt
@ 2007-02-09 22:58                             ` Martin Mares
  2007-02-12 18:50                             ` linux-os (Dick Johnson)
  1 sibling, 0 replies; 67+ messages in thread
From: Martin Mares @ 2007-02-09 22:58 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: linux-os (Dick Johnson),
	Linus Torvalds, David Rientjes, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton

Hello!

> void* comparisons are unsigned. Period.

As far as the C standard is concerned, there is no relationship between
comparison on pointers and comparison of their values casted to uintptr_t.
The address space needn't be linear and on some machines it isn't. So
speaking about signedness of pointer comparisons doesn't make sense,
except for concrete implementations.

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj@ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Top ten reasons to procrastinate: 1.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 20:29                         ` linux-os (Dick Johnson)
@ 2007-02-09 22:05                           ` Jan Engelhardt
  2007-02-09 22:58                             ` Martin Mares
  2007-02-12 18:50                             ` linux-os (Dick Johnson)
  0 siblings, 2 replies; 67+ messages in thread
From: Jan Engelhardt @ 2007-02-09 22:05 UTC (permalink / raw)
  To: linux-os (Dick Johnson)
  Cc: Linus Torvalds, David Rientjes, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton


On Feb 9 2007 15:29, linux-os (Dick Johnson) wrote:
>>
>> I was referring to "absolute memory", not the offset magic that assembler
>> allows. After all, (reg+relativeOffset) will yield an absolute address.
>> What I was out at: for machines that have more than 2 GB of memory, you
>> don't call the address that is given by 0x80000000U actually "byte
>> -2147483648", but "byte 2147483648".
>
>Don't make any large bets on that!
>
>char foo()
>{
>    volatile char *p = (char *)0x80000000;
>    return *p;
>}
>Optimized....
> 	.file	"zzz.c"
> 	.text
> 	.p2align 2,,3
>.globl foo
> 	.type	foo, @function
>foo:
> 	pushl	%ebp
> 	movb	-2147483648, %al
> 	movl	%esp, %ebp
> 	movsbl	%al,%eax
> 	leave
> 	ret
> 	.size	foo, .-foo
> 	.section	.note.GNU-stack,"",@progbits
> 	.ident	"GCC: (GNU) 3.3.3 20040412 (Red Hat Linux 3.3.3-7)"

00000000 <foo>:
   0:   55                      push   %ebp
   1:   0f b6 05 00 00 00 80    movzbl 0x80000000,%eax
   8:   89 e5                   mov    %esp,%ebp
   a:   5d                      pop    %ebp
   b:   0f be c0                movsbl %al,%eax
   e:   c3                      ret    

You do know that there is a bijection between the set of signed [32bit] 
integers and unsigned [32bit] integers, don't you?
For the CPU, it's just bits. Being signed or unsigned is not important 
when just accessing memory. It will, when a comparison is involved, but 
that was not the point here. void* comparisons are unsigned. Period.
Because a compiler doing signed comparisons will "map" the memory [from 
2 GB to 4 GB] as part of the signed comparison before the memory [from 0 
GB to 2 GB], which collides with - let's call it - "the world view".


Jan
-- 
ft: http://freshmeat.net/p/chaostables/

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 17:45                       ` Jan Engelhardt
@ 2007-02-09 20:29                         ` linux-os (Dick Johnson)
  2007-02-09 22:05                           ` Jan Engelhardt
  0 siblings, 1 reply; 67+ messages in thread
From: linux-os (Dick Johnson) @ 2007-02-09 20:29 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Linus Torvalds, David Rientjes, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton


On Fri, 9 Feb 2007, Jan Engelhardt wrote:

>
> On Feb 9 2007 08:16, linux-os (Dick Johnson) wrote:
>> On Fri, 9 Feb 2007, Jan Engelhardt wrote:
>>> On Feb 8 2007 16:42, Linus Torvalds wrote:
>>>>
>>> Further, giving again answer to the question whether they generate
>>> signed or unsigned comparisons: Have you ever seen a computer which
>>> addresses memory with negative numbers? Since the answer is most likely
>>> no, signed comparisons would
>>
>> Yes, most all do. Indexed memory operands are signed displacements. See
>> page 2-16, Intel486 Microprocessor Programmer's Reference Manual. This
>
> I was referring to "absolute memory", not the offset magic that assembler
> allows. After all, (reg+relativeOffset) will yield an absolute address.
> What I was out at: for machines that have more than 2 GB of memory, you
> don't call the address that is given by 0x80000000U actually "byte
> -2147483648", but "byte 2147483648".

Don't make any large bets on that!

char foo()
{
    volatile char *p = (char *)0x80000000;
    return *p;
}
Optimized....
 	.file	"zzz.c"
 	.text
 	.p2align 2,,3
.globl foo
 	.type	foo, @function
foo:
 	pushl	%ebp
 	movb	-2147483648, %al
 	movl	%esp, %ebp
 	movsbl	%al,%eax
 	leave
 	ret
 	.size	foo, .-foo
 	.section	.note.GNU-stack,"",@progbits
 	.ident	"GCC: (GNU) 3.3.3 20040412 (Red Hat Linux 3.3.3-7)"

Not optimized...

 	.file	"zzz.c"
 	.text
.globl foo
 	.type	foo, @function
foo:
 	pushl	%ebp
 	movl	%esp, %ebp
 	subl	$4, %esp
 	movl	$-2147483648, -4(%ebp)
 	movl	-4(%ebp), %eax
 	movb	(%eax), %al
 	movsbl	%al,%eax
 	leave
 	ret
 	.size	foo, .-foo
 	.section	.note.GNU-stack,"",@progbits
 	.ident	"GCC: (GNU) 3.3.3 20040412 (Red Hat Linux 3.3.3-7)"

... Still negative numbers!

>
>> gets hidden by higher-level languages such as 'C'. It is also non-obvious
>> when using the AT&T (GNU) assembly language. However, the Intel documentation
>> shows it clearly:
>> 			MOV	AL,  [EBX+0xFFFFFFFF]	; -1
>
> It can be debated what exactly this is... negative offset or "using
> overflow to get where we want".
>

No debate at all, and no overflow. The construction is even used in
'C' to optimize memory access while unrolling loops:


/*-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/
/*
  *   This converts an array of integers to an array of double floats.
  */
void int2float(double *fp, int *ip, size_t len)
{
     while(len >= 0x10)
     {
         fp  += 0x10;
         ip  += 0x10;
         len -= 0x10;
         fp[-0x10] = (double) ip[-0x10];
         fp[-0x0f] = (double) ip[-0x0f];
         fp[-0x0e] = (double) ip[-0x0e];
         fp[-0x0d] = (double) ip[-0x0d];
         fp[-0x0c] = (double) ip[-0x0c];
         fp[-0x0b] = (double) ip[-0x0b];
         fp[-0x0a] = (double) ip[-0x0a];
         fp[-0x09] = (double) ip[-0x09];
         fp[-0x08] = (double) ip[-0x08];
         fp[-0x07] = (double) ip[-0x07];
         fp[-0x06] = (double) ip[-0x06];
         fp[-0x05] = (double) ip[-0x05];
         fp[-0x04] = (double) ip[-0x04];
         fp[-0x03] = (double) ip[-0x03];
         fp[-0x02] = (double) ip[-0x02];
         fp[-0x01] = (double) ip[-0x01];
     }
     while(len--)
        *fp++ = (double) *ip++;
}

This makes certain that each memory access uses the same type
and the address-calculation for the next memory access can be done
in parallel with the current fetch. This is one of the methods
used to save a few machine cycles. Doesn't mean much until you
end up with large arrays such as used in image processing. Then,
it may be the difference that makes or breaks the project!

>
> Jan
> -- 
> ft: http://freshmeat.net/p/chaostables/
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.61 BogoMips).
New book: http://www.AbominableFirebug.com/

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  3:27                   ` D. Hazelton
@ 2007-02-09 19:54                     ` Pete Zaitcev
  0 siblings, 0 replies; 67+ messages in thread
From: Pete Zaitcev @ 2007-02-09 19:54 UTC (permalink / raw)
  To: D. Hazelton; +Cc: Kernel Mailing List

On Thu, 8 Feb 2007 22:27:43 -0500, "D. Hazelton" <dhazelton@enter.net> wrote:

> So almost all the rules around the signs of types are because of a single, 
> historical machine. Hence the rules about "char" being unsigned by default 
> and "int" being signed by default are because of the nature of the PDP-11. 
>[...]
> Now: There is no reason for the behavior that came from the nature of the 
> PDP11 to have survived, but because it was in "The White Book" it made it 
> through the ANSI standardization process.
> 
> Now that this history lesson is over...

... we can remember that char is signed on PDP-11. Mvu hah hah hah ha,
only serious. Check how movb works.

-- Pete

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 13:16                     ` linux-os (Dick Johnson)
@ 2007-02-09 17:45                       ` Jan Engelhardt
  2007-02-09 20:29                         ` linux-os (Dick Johnson)
  0 siblings, 1 reply; 67+ messages in thread
From: Jan Engelhardt @ 2007-02-09 17:45 UTC (permalink / raw)
  To: linux-os (Dick Johnson)
  Cc: Linus Torvalds, David Rientjes, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton


On Feb 9 2007 08:16, linux-os (Dick Johnson) wrote:
>On Fri, 9 Feb 2007, Jan Engelhardt wrote:
>> On Feb 8 2007 16:42, Linus Torvalds wrote:
>>>
>> Further, giving again answer to the question whether they generate
>> signed or unsigned comparisons: Have you ever seen a computer which
>> addresses memory with negative numbers? Since the answer is most likely
>> no, signed comparisons would
>
>Yes, most all do. Indexed memory operands are signed displacements. See
>page 2-16, Intel486 Microprocessor Programmer's Reference Manual. This

I was referring to "absolute memory", not the offset magic that assembler
allows. After all, (reg+relativeOffset) will yield an absolute address.
What I was out at: for machines that have more than 2 GB of memory, you
don't call the address that is given by 0x80000000U actually "byte
-2147483648", but "byte 2147483648".

>gets hidden by higher-level languages such as 'C'. It is also non-obvious
>when using the AT&T (GNU) assembly language. However, the Intel documentation
>shows it clearly:
> 			MOV	AL,  [EBX+0xFFFFFFFF]	; -1

It can be debated what exactly this is... negative offset or "using
overflow to get where we want".



Jan
-- 
ft: http://freshmeat.net/p/chaostables/

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 12:38             ` Sergei Organov
@ 2007-02-09 15:58               ` Linus Torvalds
  2007-02-12 11:12                 ` Sergei Organov
  0 siblings, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-09 15:58 UTC (permalink / raw)
  To: Sergei Organov
  Cc: J.A. Magallón, Jan Engelhardt, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton



On Fri, 9 Feb 2007, Sergei Organov wrote:
> 
> As far as I can read the C99 standard, the "char", "signed char", and
> "unsigned char", are all different types:

Indeed. Search for "pseudo-unsigned", and you'll see more. There are 
actually cases where "char" can act differently from _both_ "unsigned 
char" and "signed char".

> If so, then the code above is broken no matter what representation of
> "char" is chosen for given arch, as strcmp() takes "char*" arguments,
> that are not compatible with either "signed char*" or "unsigned char*".

..and my argument is that a warning which doesn't allow you to call 
"strlen()" on a "unsigned char" array without triggering is a bogus 
warning, and must be removed.

Which is all I've ever said from the beginning.

I've never disputed that "char" and "unsigned char" are different types. 
They *clearly* are different types. 

But that doesn't make the warning valid. There is a real need for NOT 
warning about sign differences.

But this argument has apparently again broken down and become a "language 
lawyer" argument instead of being about what a reasonable compiler 
actually can and should do.

As long as gcc warns about idiotic things, the kernel will keep shutting 
that warning off. I'd _like_ to use the warning, but if the gcc 
implementation is on crack, we clearly cannot.

There's simply an issue of "quality of implementation". In the case of 
this warning, gcc is just not very high quality.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  1:18                       ` David Rientjes
@ 2007-02-09 15:38                         ` Linus Torvalds
  0 siblings, 0 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-09 15:38 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, David Rientjes wrote:
> 
> Yes, I read the 4.1.1 docs:
> 
> 	By default, such a bit-field is signed, because this is
> 	consistent: the basic integer types such as int are signed
> 	types.

Ok, so the gcc people have added some language. So what? The fact is, that 
language has absolutely nothing to do with the C language, and doesn't 
change anything at all.

> That is the whole basis for my argument, when you declare something "int," 
> most programmers would consider that to be SIGNED regardless of whether it 
> is 32 bits, 13 bits, or 1 bit.

And MY argument is that YOUR argument is CRAP.

The fact is, signs of bitfields, chars and enums aren't well-defined. 
That's a FACT.

Your argument makes no sense. It's like arguing against "gravity", or like 
arguing against the fact that the earth circles around the sun. It is 
STUPID to argue against facts.

Your argument is exactly the same argument that people used to say that 
the earth is the center of the universe: "because it makes sense that 
we're the center". 

The fact that "most programmers" or "it makes sense" doesn't make anything 
at all true. Only verifiable FACTS make something true.

There's a great saying: "Reality is that which, when you stop believing in 
it, doesn't go away." - Philip K Dick.

So wake up and smell the coffee: reality is that bitfields don't have a 
specified sign. It's just a FACT. Whether you _like_ it or not is simply 
not relevant.

The same is true of "char". You can argue that it should be signed by 
default, since "short", "int" and "long" are signed. But THAT IS NOT TRUE.

Arguing against reality really isn't very smart.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 18:42   ` Jan Engelhardt
  2007-02-08 19:53     ` Linus Torvalds
@ 2007-02-09 15:10     ` Sergei Organov
  1 sibling, 0 replies; 67+ messages in thread
From: Sergei Organov @ 2007-02-09 15:10 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Linux Kernel Mailing List

Jan Engelhardt <jengelh@linux01.gwdg.de> writes:
> On Feb 8 2007 08:33, Linus Torvalds wrote:
>>
[...]
> What C needs is a distinction between char and int8_t, rendering "char" 
> an unsigned at all times basically and making "unsigned char" and 
> "signed char" illegal types in turn.

AFAIK, C already has 3 *distinct* types, "char", "signed char", and
"unsigned char". So your int8_t reads "signed char" in C, uint8_t reads
"unsigned char" in C, while "char" is yet another type that is not
compatible with those two. Yes, the names for these types are somewhat
confusing, but that's due to historical reasons, I think.

Overall,

signed char   == tiny signed int   == tiny int
unsigned char == tiny unsigned int
char != signed char
char != unsigned char

where

tiny == short short ;)

Rules of thumb: don't use bare char to store integers; don't use
signed/unsigned char to store characters.

[strxxx() routines take char* arguments as they work on '\0'-terminated
character arrays (strings). Using them on arrays of tiny integers is
unsafe no matter how char behaves on given platform (like signed or like
unsigned type).]

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09 12:34                   ` Jan Engelhardt
@ 2007-02-09 13:16                     ` linux-os (Dick Johnson)
  2007-02-09 17:45                       ` Jan Engelhardt
  2007-02-13 15:14                     ` Dick Streefland
  1 sibling, 1 reply; 67+ messages in thread
From: linux-os (Dick Johnson) @ 2007-02-09 13:16 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Linus Torvalds, David Rientjes, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton


On Fri, 9 Feb 2007, Jan Engelhardt wrote:

>
> On Feb 8 2007 16:42, Linus Torvalds wrote:
>>
>> Most C types don't, and some you can't even tell (do pointers generate
>> "signed" or "unsigned" comparisons?
>
> I'd say "neither", because both
>
>    signed void *ptr; and
>    unsigned void *xyz;
>
> are impossible (and dull at that). That's why you explicitly will
> have to cast one operand in a comparison to make it evident
> what sort of comparison is intended, i.e.
>
>    if((unsigned long)ptr < PAGE_OFFSET)
>
> Further, giving again answer to the question whether they generate signed or
> unsigned comparisons: Have you ever seen a computer which addresses memory with
> negative numbers? Since the answer is most likely no, signed comparisons would

Yes, most all do. Indexed memory operands are signed displacements. See
page 2-16, Intel486 Microprocessor Programmer's Reference Manual. This
gets hidden by higher-level languages such as 'C'. It is also non-obvious
when using the AT&T (GNU) assembly language. However, the Intel documentation
shows it clearly:
 			MOV	AL,  [EBX+0xFFFFFFFF]	; -1

...will put the byte existing at the location just BEFORE the offset in
EBX, into register AL, while:

 			MOV	AL,  [EBX+0x00000001]  ; +1

...will put the byte existing at the location just AFTER the offset in
EBX, into register AL.

Also, the documentation specifically states that sign extension is
provided for all effective address calculations.

> not make sense for me.
>
>> I'll argue that a compiler that
>> generates signed comparisons for them is broken, but it tends to be
>> something you can only see with a standards- conforming proghram if you
>> can allocate memory across the sign boundary, which may or may not be
>> true..)
>
> Jan
> -- 
> ft: http://freshmeat.net/p/chaostables/
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.16.24 on an i686 machine (5592.61 BogoMips).
New book: http://www.AbominableFirebug.com/

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 22:03           ` Linus Torvalds
  2007-02-08 22:19             ` Willy Tarreau
  2007-02-09  0:03             ` J.A. Magallón
@ 2007-02-09 12:38             ` Sergei Organov
  2007-02-09 15:58               ` Linus Torvalds
  2 siblings, 1 reply; 67+ messages in thread
From: Sergei Organov @ 2007-02-09 12:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J.A. Magallón, Jan Engelhardt, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, 8 Feb 2007, Linus Torvalds wrote:
>> 
>> But THE CALLER CANNOT AND MUST NOT CARE! Because the sign of "char" is 
>> implementation-defined, so if you call "strcmp()", you are already 
>> basically saying: I don't care (and I _cannot_ care) what sign you are 
>> using.
>
> Let me explain it another way.
>
> Say you use
>
> 	signed char *myname, *yourname;
>
> 	if (strcmp(myname,yourname) < 0)
> 		printf("Ha, I win!\n")
>
> and you compile this on an architecture where "char" is signed even 
> without the explicit "signed".

As far as I can read the C99 standard, "char", "signed char", and
"unsigned char" are all different types:

 "The three types char, signed char, and unsigned char are collectively called
  the character types. The implementation shall define char to have the same range,
  representation, and behavior as either signed char or unsigned
  char.(35)
  ....
  (35) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or
       SCHAR_MIN, and this can be used to distinguish the two
       options. Irrespective of the choice made, char is a separate type
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       from the other two and is not compatible with either.
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 "

If so, then the code above is broken no matter which representation of
"char" is chosen for a given arch, as strcmp() takes "char*" arguments,
which are compatible with neither "signed char*" nor "unsigned char*".

-- Sergei.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  0:42                 ` Linus Torvalds
                                     ` (2 preceding siblings ...)
  2007-02-09  3:27                   ` D. Hazelton
@ 2007-02-09 12:34                   ` Jan Engelhardt
  2007-02-09 13:16                     ` linux-os (Dick Johnson)
  2007-02-13 15:14                     ` Dick Streefland
  3 siblings, 2 replies; 67+ messages in thread
From: Jan Engelhardt @ 2007-02-09 12:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Rientjes, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton


On Feb 8 2007 16:42, Linus Torvalds wrote:
>
>Most C types don't, and some you can't even tell (do pointers generate 
>"signed" or "unsigned" comparisons?

I'd say "neither", because both

    signed void *ptr; and
    unsigned void *xyz;

are impossible (and dull at that). That's why you will have to
explicitly cast one operand in a comparison to make it evident
what sort of comparison is intended, e.g.

    if((unsigned long)ptr < PAGE_OFFSET)

Further, to answer again the question of whether they generate signed or
unsigned comparisons: have you ever seen a computer that addresses memory with
negative numbers? Since the answer is most likely no, signed comparisons would
not make sense to me.

> I'll argue that a compiler that 
>generates signed comparisons for them is broken, but it tends to be 
>something you can only see with a standards-conforming program if you 
>can allocate memory across the sign boundary, which may or may not be 
>true..)

Jan
-- 
ft: http://freshmeat.net/p/chaostables/

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  0:42                 ` Linus Torvalds
  2007-02-09  0:59                   ` Linus Torvalds
  2007-02-09  0:59                   ` David Rientjes
@ 2007-02-09  3:27                   ` D. Hazelton
  2007-02-09 19:54                     ` Pete Zaitcev
  2007-02-09 12:34                   ` Jan Engelhardt
  3 siblings, 1 reply; 67+ messages in thread
From: D. Hazelton @ 2007-02-09  3:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Rientjes, Jan Engelhardt, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton

On Thursday 08 February 2007 19:42, Linus Torvalds wrote:
<snip>
> Most C types don't, and some you can't even tell (do pointers generate
> "signed" or "unsigned" comparisons? I'll argue that a compiler that
> generates signed comparisons for them is broken, but it tends to be
> something you can only see with a standards-conforming program if you
> can allocate memory across the sign boundary, which may or may not be
> true..)

And this is because, as Dennis Ritchie says in "The Development of the C 
Language" (http://cm.bell-labs.com/cm/cs/who/dmr/chist.html), C evolved from a 
typeless language, B, which in turn had originated as a stripped-down 
version of BCPL. B and BCPL used a "cell addressing" scheme, but because of 
the move off the PDP-7, and the difficulties Ken Thompson ran into while 
rewriting the Unix code in B (the original Unix was written in assembler), 
types were added: 'char' and 'int'. These represented the byte and word 
addressing modes of the PDP-11, and were needed because Ritchie was in the 
process of having the compiler generate assembler instead of "threaded code". 
This "NB" language, after also having data structures added and the rules for 
defining pointers finalized, was renamed "C" by Ritchie.

So almost all the rules around the signs of types are because of a single, 
historical machine. Hence the rules about "char" being unsigned by default 
and "int" being signed by default are because of the nature of the PDP-11. 
The implementation-defined nature of bitfields is because Dennis Ritchie had 
freedom there: the PDP-11 didn't have anything like them in the hardware, and 
they were being stuffed in to make Thompson's life easier.

Now: There is no reason for the behavior that came from the nature of the 
PDP-11 to have survived, but because it was in "The White Book" it made it 
through the ANSI standardization process.

Now that this history lesson is over...

DRH

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  1:11                     ` Linus Torvalds
@ 2007-02-09  1:18                       ` David Rientjes
  2007-02-09 15:38                         ` Linus Torvalds
  0 siblings, 1 reply; 67+ messages in thread
From: David Rientjes @ 2007-02-09  1:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Thu, 8 Feb 2007, Linus Torvalds wrote:

> Even gcc DOES DIFFERENT THINGS! Have you even read the docs?
> 
> 	By default it is treated as signed int but this may be changed by 
> 	the -funsigned-bitfields option.
> 

Yes, I read the 4.1.1 docs:

	By default, such a bit-field is signed, because this is
	consistent: the basic integer types such as int are signed
	types.

That is the whole basis for my argument, when you declare something "int," 
most programmers would consider that to be SIGNED regardless of whether it 
is 32 bits, 13 bits, or 1 bit.

No doubt it is configurable because of the existence of brain-dead 
compilers that treat them as unsigned.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  0:59                   ` David Rientjes
@ 2007-02-09  1:11                     ` Linus Torvalds
  2007-02-09  1:18                       ` David Rientjes
  0 siblings, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-09  1:11 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, David Rientjes wrote:
>
> Maybe you should read my first post, we're talking about gcc's behavior 
> here, not the C standard.

Give it up, David.

Even gcc DOES DIFFERENT THINGS! Have you even read the docs?

	By default it is treated as signed int but this may be changed by 
	the -funsigned-bitfields option.

So even for gcc, it's just a default. I think you can even change the 
default in the spec file.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  0:42                 ` Linus Torvalds
  2007-02-09  0:59                   ` Linus Torvalds
@ 2007-02-09  0:59                   ` David Rientjes
  2007-02-09  1:11                     ` Linus Torvalds
  2007-02-09  3:27                   ` D. Hazelton
  2007-02-09 12:34                   ` Jan Engelhardt
  3 siblings, 1 reply; 67+ messages in thread
From: David Rientjes @ 2007-02-09  0:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Thu, 8 Feb 2007, Linus Torvalds wrote:

> No it's not.
> 
> You just don't understand the C language. 
> 
> And if you don't understand the C language, you can't say "that's what the 
> int says". It says no such thing.
> 
> The C language clearly says that bitfields have implementation-defined 
> types. So when you see
> 
> 	struct dummy {
> 		int flag:1;
> 	} a_variable;
> 
> if you don't read that as "oh, the sign of 'flag' is implementation- 
> defined", then you simply aren't reading it right.
> 

Maybe you should read my first post, we're talking about gcc's behavior 
here, not the C standard.  My criticism was that any compiler that makes 
a_variable.flag unsigned is brain-dead and I was arguing in favor of gcc 
treating plain int bitfields as signed ints (6.7.2, 6.7.2.1).  This has 
_nothing_ to do with the fact that the standard leaves it implementation 
defined.

Naturally you should define its signedness explicitly in your code since it 
is implementation-defined.  That's not the point.

Just because a compiler CAN consider a_variable.flag as unsigned doesn't 
mean it makes sense.  It makes no sense, and thus is brain-dead.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  0:42                 ` Linus Torvalds
@ 2007-02-09  0:59                   ` Linus Torvalds
  2007-02-09  0:59                   ` David Rientjes
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-09  0:59 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, Linus Torvalds wrote:
> 
> The C language clearly says that bitfields have implementation-defined 
> types.

Btw, this isn't even some "theoretical discussion". LOTS of compilers do 
unsigned bitfields. It may even be the majority. There may (or may not) be 
some correlation with what the default for "char" is, I haven't looked 
into it.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  0:24               ` David Rientjes
@ 2007-02-09  0:42                 ` Linus Torvalds
  2007-02-09  0:59                   ` Linus Torvalds
                                     ` (3 more replies)
  0 siblings, 4 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-09  0:42 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, David Rientjes wrote:
> 
> Your struct:
> 
> 	struct dummy {
> 		int flag:1;
> 	} a_variable;
> 
> should expect a_variable.flag to be signed, that's what the int says.  

No it's not.

You just don't understand the C language. 

And if you don't understand the C language, you can't say "that's what the 
int says". It says no such thing.

The C language clearly says that bitfields have implementation-defined 
types. So when you see

	struct dummy {
		int flag:1;
	} a_variable;

if you don't read that as "oh, the sign of 'flag' is implementation- 
defined", then you simply aren't reading it right.

Is it "intuitive"? Nobody ever really called C an _intuitive_ language. C 
has a lot of rules that you simply have to know. The bitfield sign rule is 
just one such rule.

> There is no special case here with regard to type.

Sure there is. Read the spec.

I don't understand why you are arguing. YOU ARE WRONG.

Bitfields simply have implementation-defined signedness. As do enums. As 
does "char". It really is that simple. 

The *real* special case is actually "int" and "long". In many ways, the 
fact that those *do* have a well-specified signedness is actually the 
exception rather than the rule.

Most C types don't, and some you can't even tell (do pointers generate 
"signed" or "unsigned" comparisons? I'll argue that a compiler that 
generates signed comparisons for them is broken, but it tends to be 
something you can only see with a standards-conforming program if you 
can allocate memory across the sign boundary, which may or may not be 
true..)

			Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 23:37             ` Linus Torvalds
@ 2007-02-09  0:24               ` David Rientjes
  2007-02-09  0:42                 ` Linus Torvalds
  0 siblings, 1 reply; 67+ messages in thread
From: David Rientjes @ 2007-02-09  0:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Thu, 8 Feb 2007, Linus Torvalds wrote:

> No, making bitfields unsigned is actually usually a good idea. It allows 
> you to often generate better code, and it actually tends to be what 
> programmers _expect_. A lot of people seem to be surprised to hear that a 
> one-bit bitfield actually often encodes -1/0, and not 0/1.
> 

Your struct:

	struct dummy {
		int flag:1;
	} a_variable;

should expect a_variable.flag to be signed, that's what the int says.  
There is no special case here with regard to type.  It's traditional K&R 
that writing signed flag:1 here is redundant (K&R pg. 211).  Now whether 
that's what you wanted or not is irrelevant.  As typical with C, it gives 
you exactly what you asked for.

You're arguing for inconsistency in how bitfields are qualified based on 
their size.  If you declare a bitfield as int byte:8, then you're going 
to expect this to behave exactly like a signed char, which it does.

	struct {
		int byte:8;
	} __attribute__((packed)) b_variable;

b_variable.byte is identical to a signed char.

Just because a_variable.flag happens to be one bit, you cannot say that 
programmers _expect_ it to be unsigned, or you would also expect 
b_variable.byte to act as an unsigned char.  signed is the default 
behavior for all types besides char, so the behavior is appropriate based 
on what most programmers would expect.

> So unsigned bitfields are not only traditional K&R, they are also usually 
> _faster_ (which is probably why they are traditional K&R - along with 
> allowing "char" to be unsigned by default). Don't knock them.  It's much 
> better to just remember that bitfields simply don't _have_ any standard 
> sign unless you specify it explicitly, than saying "it should be signed 
> because 'int' is signed".
> 

Of course they're faster: they don't require sign extension.  But 
you're adding additional semantics to the language by your interpretation 
of what is expected by the programmer in the bitfield case as opposed to a 
normal variable definition.

> I will actually argue that having signed bit-fields is almost always a 
> bug, and that as a result you should _never_ use "int" at all. Especially 
> as you might as well just write it as
> 
> 	signed a:1;
> 
> if you really want a signed bitfield.
> 

How can signed bitfields almost always be a bug?  I'm the programmer and I 
want to store my data in a variable that I define, so I must follow the 
twos-complement rule and allot a sign bit if I declare it as a signed 
value.  In C, you do this with "int"; otherwise, you use "unsigned".

> So I would really recommend that you never use "int a:<bits>" AT ALL, 
> because there really is never any good reason to do so. Do it as
> 
> 	unsigned a:3;
> 	signed b:2;
> 
> but never 
> 
> 	int c:4;
> 
> because the latter really isn't sensible.
> 

That _is_ sensible because anything you declare "int" is going to be 
signed as far as gcc is concerned.  "c" just happens to be 4-bits wide 
instead of 32.  But for everything else, it's the same as an int.  I know 
exactly what I'm getting by writing "int c:4", as would most programmers 
who use bitfields to begin with.

		David

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-09  0:03             ` J.A. Magallón
@ 2007-02-09  0:22               ` Linus Torvalds
  0 siblings, 0 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-09  0:22 UTC (permalink / raw)
  To: J.A. Magallón
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1251 bytes --]



On Fri, 9 Feb 2007, J.A. Magallón wrote:
> 
> That's the point. Mmmm, I think I see it the other way around. I defined
> a variable as 'signed' or 'unsigned', because the sign info matters for me.
> And gcc warns about using a function on it that will _ignore_ or even
> misinterpret that info. Could it be a BUG ? Yes.

Sure. The other way of seeing it is that *anything*  could be a bug.

Could adding 1 to "a" be a bug? Yes. "a" might overflow. So maybe the 
compiler should warn about that too?

So do you think a compiler should warn when you do

	int a = i + 1;

and say "warning: Expression on line x might overflow"?

Could it be a BUG? Hell yeah.

Is warning for things that _could_ be bugs sane? Hell NO.

> Linux/x86, gcc 4.1.2-0.20070115:
> werewolf:~> gcc -Wpointer-sign -c t.c
> t.c: In function ‘f’:
> t.c:10: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
> t.c:11: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness

Yeah, and that's what I think is crazy.

Is it consistent? Yes. Does it help people? No.

A warning that is consistent is not necessarily a good warning. It needs 
to MAKE SENSE too. And this one doesn't.  I'm sorry if you can't see that.

		Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 22:03           ` Linus Torvalds
  2007-02-08 22:19             ` Willy Tarreau
@ 2007-02-09  0:03             ` J.A. Magallón
  2007-02-09  0:22               ` Linus Torvalds
  2007-02-09 12:38             ` Sergei Organov
  2 siblings, 1 reply; 67+ messages in thread
From: J.A. Magallón @ 2007-02-09  0:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Thu, 8 Feb 2007 14:03:06 -0800 (PST), Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Thu, 8 Feb 2007, Linus Torvalds wrote:
> > 
> > But THE CALLER CANNOT AND MUST NOT CARE! Because the sign of "char" is 
> > implementation-defined, so if you call "strcmp()", you are already 
> > basically saying: I don't care (and I _cannot_ care) what sign you are 
> > using.
> 
> Let me explain it another way.
> 
> Say you use
> 
> 	signed char *myname, *yourname;
> 
> 	if (strcmp(myname,yourname) < 0)
> 		printf("Ha, I win!\n")
> 
> and you compile this on an architecture where "char" is signed even 
> without the explicit "signed".
> 
> What should happen?
> 
> Should you get a warning? The types really *are* the same, so getting a 
> warning sounds obviously insane. But on the other hand, if you really care 
> about the sign that strcmp() uses internally, the code is wrong *anyway*, 

That's the point. Mmmm, I think I see it the other way around. I defined
a variable as 'signed' or 'unsigned', because the sign info matters for me.
And gcc warns about using a function on it that will _ignore_ or even
misinterpret that info. Could it be a BUG ? Yes.

> because with another compiler, or with the *same* compiler on another 
> architecture or some other compiler flags, the very same code is buggy.
> 
> In other words, either you should get a warning *regardless* of whether 
> the sign actually matches or not, or you shouldn't get a warning at all 
> for the above code. Either it's buggy code, or it isn't.
> 

In fact, I tried to look for an arch where this gives only _one_ error:

#include <string.h>

void f()
{
	const unsigned char *a;
	const signed char *b;

	int la,lb;

	la = strlen(a);
	lb = strlen(b);
}

and guess what: all of them give _both_ warnings.

Linux/x86, gcc 4.1.2-0.20070115:
werewolf:~> gcc -Wpointer-sign -c t.c
t.c: In function ‘f’:
t.c:10: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
t.c:11: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness

OSX ppc, gcc-4.0.1
belly:~> gcc -Wpointer-sign -c t.c
t.c: In function ‘f’:
t.c:10: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
t.c:11: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness

Solaris5.7 Ultra, cc WorkShop Compilers 5.0
den:~> cc -Xa -c t.c                                                           
ucbcc: Warning: "-Xa" redefines compatibility mode from "SunC transition" to "ANSI"
"t.c", line 10: warning: argument #1 is incompatible with prototype:
        prototype: pointer to const char : "/usr/include/string.h", line 71
        argument : pointer to const uchar
"t.c", line 11: warning: argument #1 is incompatible with prototype:
        prototype: pointer to const char : "/usr/include/string.h", line 71
        argument : pointer to const schar

This makes sense.
Could someone try it on something psychedelic, like ARM ;)?

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.1 (Cooker) for i586
Linux 2.6.19-jam06 (gcc 4.1.2 20070115 (prerelease) (4.1.2-0.20070115.1mdv2007.1)) #1 SMP PREEMPT

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 23:12           ` David Rientjes
@ 2007-02-08 23:37             ` Linus Torvalds
  2007-02-09  0:24               ` David Rientjes
  0 siblings, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-08 23:37 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, David Rientjes wrote:
> 
> And a compiler that makes a_variable.flag unsigned would be brain-dead 
> because "int" is always signed.

No, making bitfields unsigned is actually usually a good idea. It allows 
you to often generate better code, and it actually tends to be what 
programmers _expect_. A lot of people seem to be surprised to hear that a 
one-bit bitfield actually often encodes -1/0, and not 0/1.

So unsigned bitfields are not only traditional K&R, they are also usually 
_faster_ (which is probably why they are traditional K&R - along with 
allowing "char" to be unsigned by default). Don't knock them.  It's much 
better to just remember that bitfields simply don't _have_ any standard 
sign unless you specify it explicitly, than saying "it should be signed 
because 'int' is signed".

I will actually argue that having signed bit-fields is almost always a 
bug, and that as a result you should _never_ use "int" at all. Especially 
as you might as well just write it as

	signed a:1;

if you really want a signed bitfield.

So I would really recommend that you never use "int a:<bits>" AT ALL, 
because there really is never any good reason to do so. Do it as

	unsigned a:3;
	signed b:2;

but never 

	int c:4;

because the latter really isn't sensible.

"sparse" will actually complain about single-bit signed bitfields, and it 
found a number of cases where people used that "int x:1" kind of syntax. 

Just don't do it.

			Linus

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 21:37         ` Linus Torvalds
@ 2007-02-08 23:12           ` David Rientjes
  2007-02-08 23:37             ` Linus Torvalds
  0 siblings, 1 reply; 67+ messages in thread
From: David Rientjes @ 2007-02-08 23:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Thu, 8 Feb 2007, Linus Torvalds wrote:

> Same goes for
> 
> 	struct dummy {
> 		int flag:1;
> 	} a_variable;
> 
> which could make "a_variable.flag" be either signed or unsigned. In the 
> absense of an explicit "signed" or "unsigned" by the programmer, you 
> really won't even _know_ whether "flag" can contain the values {0,1} or 
> {-1,0}.
> 
> Normally you wouldn't ever even care. A one-bit bitfield would only ever 
> really be tested against 0, but it really is possible that when you assign 
> "1" to that value, and read it back, it will read back -1. Try it. 
> 
> Those are the three only types I can think of, but the point is, they 
> really don't have any standard-defined sign. It's up to the compiler. 
> 

And a compiler that makes a_variable.flag unsigned would be brain-dead 
because "int" is always signed.  The signed specifier is only useful for 
forcing chars to carry a sign and it's redundant with any other types.  
Using bitfields requires the knowledge of the sign bit in twos-complement 
just like when you use a signed qualifier for any other definition.  
Having inconsistent behavior on whether it is unsigned or signed based on 
how large the bitfield is (i.e. one bit compared to multiple bits) is 
inappropriate.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 22:03           ` Linus Torvalds
@ 2007-02-08 22:19             ` Willy Tarreau
  2007-02-09  0:03             ` J.A. Magallón
  2007-02-09 12:38             ` Sergei Organov
  2 siblings, 0 replies; 67+ messages in thread
From: Willy Tarreau @ 2007-02-08 22:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J.A. Magallón, Jan Engelhardt, Jeff Garzik,
	Linux Kernel Mailing List, Andrew Morton

On Thu, Feb 08, 2007 at 02:03:06PM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 8 Feb 2007, Linus Torvalds wrote:
> > 
> > But THE CALLER CANNOT AND MUST NOT CARE! Because the sign of "char" is 
> > implementation-defined, so if you call "strcmp()", you are already 
> > basically saying: I don't care (and I _cannot_ care) what sign you are 
> > using.
> 
> Let me explain it another way.
> 
> Say you use
> 
> 	signed char *myname, *yourname;
> 
> 	if (strcmp(myname,yourname) < 0)
> 		printf("Ha, I win!\n")
> 
> and you compile this on an architecture where "char" is signed even 
> without the explicit "signed".
> 
> What should happen?
> 
> Should you get a warning? The types really *are* the same, so getting a 
> warning sounds obviously insane. But on the other hand, if you really care 
> about the sign that strcmp() uses internally, the code is wrong *anyway*, 
> because with another compiler, or with the *same* compiler on another 
> architecture or some other compiler flags, the very same code is buggy.
> 
> In other words, either you should get a warning *regardless* of whether 
> the sign actually matches or not, or you shouldn't get a warning at all 
> for the above code. Either it's buggy code, or it isn't.
> 
> Warning only when the sign doesn't _happen_ to match is crap. In that 
> case, it's not a warning about bad code, it's a warning about a bad 
> *compiler*.
> 
> My suggestion is that if you *really* care about the sign so much that you 
> want the sign warning, make it really obvious to the compiler. Don't ever 
> call functions that have implicit signs. Make even "int" arguments (which 
> is well-defined in its sign) use "signed int", and then you can make the 
> compiler warn if anybody ever passes it an "unsigned int".
> 
> Never mind even a pointer - if somebody actually took the time and effort 
> to spell out "signed int" in a function prototype, and you pass that 
> function an unsigned integer, maybe a warning is perfectly fine. Clearly 
> the programmer really cared, and if he didn't care about the sign that 
> much, he could have used just "int".
> 
> Conversely, if somebody has a function with an "unsigned int" prototype, 
> and you pass it a regular "int", a compiler shouldn't complain, because an 
> "int" will just silently promote to unsigned. But perhaps the programmer 
> passes it something that he had _explicitly_ marked with "signed int". 
> Would it make sense to warn then? Makes sense to me.
> 
> And no, none of this is about "strict C standards". All of it is about 
> "what makes sense". It simply doesn't make sense to complain about the 
> sign of "char", because it's not something that has a very hard 
> definition. Similarly, you shouldn't complain about regular "int" 
> conversions, because they are normal, and the standard defines them, but 
> maybe you can take a hint when the programmer gives you a hint by doing 
> something that is "obviously unnecessary", like explicitly saying that 
> "signed int" thing.
> 
> Just an idea. 
>
> 			Linus

I perfectly agree with your analysis here. I'm used to specifying "signed"
or "unsigned" when I really expect such a sign, whether it's an int or a
char, and because of this, I've spent a lot of time stuffing useless
casts everywhere to shut up stupid warnings with common functions such as
strcmp(), or even my own functions when I don't care about the sign. IOW,
the compiler's excessive warnings discourage you from being strict with
your types. That could be called "funny" if it did not make us waste
so much time.

I would really like to get the "I don't care" possibility by default.

Willy


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: somebody dropped a (warning) bomb
  2007-02-08 21:42         ` Linus Torvalds
@ 2007-02-08 22:03           ` Linus Torvalds
  2007-02-08 22:19             ` Willy Tarreau
                               ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-08 22:03 UTC (permalink / raw)
  To: J.A. Magallón
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, Linus Torvalds wrote:
> 
> But THE CALLER CANNOT AND MUST NOT CARE! Because the sign of "char" is 
> implementation-defined, so if you call "strcmp()", you are already 
> basically saying: I don't care (and I _cannot_ care) what sign you are 
> using.

Let me explain it another way.

Say you use

	signed char *myname, *yourname;

	if (strcmp(myname,yourname) < 0)
		printf("Ha, I win!\n")

and you compile this on an architecture where "char" is signed even 
without the explicit "signed".

What should happen?

Should you get a warning? The types really *are* the same, so getting a 
warning sounds obviously insane. But on the other hand, if you really care 
about the sign that strcmp() uses internally, the code is wrong *anyway*, 
because with another compiler, or with the *same* compiler on another 
architecture or some other compiler flags, the very same code is buggy.

In other words, either you should get a warning *regardless* of whether 
the sign actually matches or not, or you shouldn't get a warning at all 
for the above code. Either it's buggy code, or it isn't.

Warning only when the sign doesn't _happen_ to match is crap. In that 
case, it's not a warning about bad code, it's a warning about a bad 
*compiler*.

My suggestion is that if you *really* care about the sign so much that you 
want the sign warning, make it really obvious to the compiler. Don't ever 
call functions that have implicit signs. Make even "int" arguments (which 
is well-defined in its sign) use "signed int", and then you can make the 
compiler warn if anybody ever passes it an "unsigned int".

Never mind even a pointer - if somebody actually took the time and effort 
to spell out "signed int" in a function prototype, and you pass that 
function an unsigned integer, maybe a warning is perfectly fine. Clearly 
the programmer really cared, and if he didn't care about the sign that 
much, he could have used just "int".

Conversely, if somebody has a function with a "unsigned int" prototype, 
and you pass it a regular "int", a compiler shouldn't complain, because an 
"int" will just silently promote to unsigned. But perhaps the programmer 
passes it something that he had _explicitly_ marked with "signed int". 
Would it make sense to warn then? Makes sense to me.

And no, none of this is about "strict C standards". All of it is about 
"what makes sense". It simply doesn't make sense to complain about the 
sign of "char", because it's not something that has a very hard 
definition. Similarly, you shouldn't complain about regular "int" 
conversions, because they are normal, and the standard defines them, but 
maybe you can take a hint when the programmer gives you a hint by doing 
something that is "obviously unnecessary", like explicitly saying that 
"signed int" thing.

Just an idea. 

			Linus


* Re: somebody dropped a (warning) bomb
  2007-02-08 21:13       ` J.A. Magallón
@ 2007-02-08 21:42         ` Linus Torvalds
  2007-02-08 22:03           ` Linus Torvalds
  0 siblings, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-08 21:42 UTC (permalink / raw)
  To: J.A. Magallón
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton




On Thu, 8 Feb 2007, J.A. Magallón wrote:
> 
> Perhaps the problem is that gcc warns you something like 'if you care
> anything about the sign behaviour of what you send to strlen() you're doomed'.

But you *cannot* care. That's the point. The warning is insane. For any 
function that takes "char *", you very fundamentally *cannot* care about 
the signedness of the argument, because doing so would be wrong. If you 
do care, you're already buggy.

> Ie, you declared the var unsigned, so you care about it. But I can do
> anything without any regard to the sign.

That makes no sense. 

First off, "strlen()" doesn't care about the sign of the buffer, so it's a 
bad example anyway. But take something that *can* care, namely "strcmp()", 
which can return different things depending on whether "char" is signed or 
not (is 0xff larger than 0x00? Depends on whether char is signed or 
not..).

But THE CALLER CANNOT AND MUST NOT CARE! Because the sign of "char" is 
implementation-defined, so if you call "strcmp()", you are already 
basically saying: I don't care (and I _cannot_ care) what sign you are 
using.

So having the compiler warn about it is insane. It's like warning about 
the fact that it's Thursday today. It's not something the programmer cares 
about.

		Linus


* Re: somebody dropped a (warning) bomb
  2007-02-08 21:10       ` Jan Engelhardt
@ 2007-02-08 21:37         ` Linus Torvalds
  2007-02-08 23:12           ` David Rientjes
  0 siblings, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-08 21:37 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, Jan Engelhardt wrote:
> 
> Thank you for this insight, I don't usually read standards, only RFCs :)
> Uh, does that also apply to the longer types, int, long etc.? I hope not.

No. An "int" or "long" is always signed.

But bitfields, enums or "char" can - and do - have signedness that depends 
on the architecture and/or compiler writers whims.

So if you do

	enum my_enum {
		orange = 1,
		blue = 2,
		green = 3
	};

you don't know if "enum my_enum" is signed or not. The compiler could 
decide to use "int". Or it could decide to use "unsigned char" for it. You 
simply don't know.

Same goes for

	struct dummy {
		int flag:1;
	} a_variable;

which could make "a_variable.flag" be either signed or unsigned. In the 
absence of an explicit "signed" or "unsigned" by the programmer, you 
really won't even _know_ whether "flag" can contain the values {0,1} or 
{-1,0}.

Normally you wouldn't ever even care. A one-bit bitfield would only ever 
really be tested against 0, but it really is possible that when you assign 
"1" to that value, and read it back, it will read back -1. Try it. 

Those are the only three types I can think of, but the point is, they 
really don't have any standard-defined sign. It's up to the compiler. 

Now, passing a "pointer to bitfield" is impossible, and if you pass a 
pointer to an enum you'd better *always* use the same enum anyway (because 
not only is the signedness implementation-defined, the _size_ of the enum 
is also implementation-defined), so in those two cases, -Wpointer-sign 
doesn't hurt.

But in the case of "char", it really is insane. It's doubly insane exactly 
because the C standard defines a lot of standard functions that take "char 
*", and that make tons of sense with unsigned chars too, and you may have 
100% good reasons to use "unsigned char" for many of them.

> >It really is that simple. gcc is broken. The C language isn't, it's purely 
> >a broken compiler issue.
> 
> Maybe you could send in a patch to gcc that fixes the issue?

I've butted heads with too many broken gcc decisions. I've occasionally 
used to try to discuss these kinds of things on the gcc mailing lists, but 
I got too fed up with language lawyers that said "the standard _allows_ us 
to be total *ssholes, so stop complaining". It seemed to be an almost 
universal defense.

(And yes, the standard allows any warnings at all. So technically, gcc can 
warn about whatever hell it wants. It doesn't make it a good idea).

			Linus


* Re: somebody dropped a (warning) bomb
  2007-02-08 19:53     ` Linus Torvalds
  2007-02-08 21:10       ` Jan Engelhardt
@ 2007-02-08 21:13       ` J.A. Magallón
  2007-02-08 21:42         ` Linus Torvalds
  1 sibling, 1 reply; 67+ messages in thread
From: J.A. Magallón @ 2007-02-08 21:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Engelhardt, Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Thu, 8 Feb 2007 11:53:38 -0800 (PST), Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Thu, 8 Feb 2007, Jan Engelhardt wrote:
> > 
> > I generally have to agree with you about the unsigned char* vs char*. It 
> > is a problem of the C language that char can be signed and unsigned, and 
> > that people, as a result, have used it for storing 
> > "shorter_than_short_t"s.
> > 
> > What C needs is a distinction between char and int8_t, rendering "char" 
> > an unsigned at all times basically and making "unsigned char" and 
> > "signed char" illegal types in turn.
> 
> No, it's really more fundamental than that.
> 
> Exactly because "char *" doesn't have a defined sign, only a TOTALLY 
> INCOMPETENT compiler will warn about its signedness.
> 

Perhaps the problem is that gcc warns you something like 'if you care
anything about the sign behaviour of what you send to strlen() you're doomed'.
Ie, you declared the var unsigned, so you care about it. But I can do
anything without any regard to the sign.

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.1 (Cooker) for i586
Linux 2.6.19-jam06 (gcc 4.1.2 20070115 (prerelease) (4.1.2-0.20070115.1mdv2007.1)) #1 SMP PREEMPT


* Re: somebody dropped a (warning) bomb
  2007-02-08 19:53     ` Linus Torvalds
@ 2007-02-08 21:10       ` Jan Engelhardt
  2007-02-08 21:37         ` Linus Torvalds
  2007-02-08 21:13       ` J.A. Magallón
  1 sibling, 1 reply; 67+ messages in thread
From: Jan Engelhardt @ 2007-02-08 21:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Linux Kernel Mailing List, Andrew Morton


On Feb 8 2007 11:53, Linus Torvalds wrote:
>On Thu, 8 Feb 2007, Jan Engelhardt wrote:
>> 
>Exactly because "char *" doesn't have a defined sign,
>The user has clearly stated "I don't care about the sign". If a compiler 
>complains about us passing "unsigned char *" (or, if "char" is naturally 
>unsigned on that platform, "signed char *") to strcmp then that compiler 
>IS BROKEN. Because "strcmp()" takes "char *", which simply DOES NOT HAVE a 
>uniquely defined sign. 

Thank you for this insight, I don't usually read standards, only RFCs :)
Uh, does that also apply to the longer types, int, long etc.? I hope not.

>only a TOTALLY INCOMPETENT compiler will warn about its signedness.
>That's why we can't have -Wpointer-sign on by default. The gcc warning is 
>simply *crap*.
>[...]
>It really is that simple. gcc is broken. The C language isn't, it's purely 
>a broken compiler issue.

Maybe you could send in a patch to gcc that fixes the issue?


Jan
-- 


* Re: somebody dropped a (warning) bomb
  2007-02-08 18:42   ` Jan Engelhardt
@ 2007-02-08 19:53     ` Linus Torvalds
  2007-02-08 21:10       ` Jan Engelhardt
  2007-02-08 21:13       ` J.A. Magallón
  2007-02-09 15:10     ` Sergei Organov
  1 sibling, 2 replies; 67+ messages in thread
From: Linus Torvalds @ 2007-02-08 19:53 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Jeff Garzik, Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, Jan Engelhardt wrote:
> 
> I generally have to agree with you about the unsigned char* vs char*. It 
> is a problem of the C language that char can be signed and unsigned, and 
> that people, as a result, have used it for storing 
> "shorter_than_short_t"s.
> 
> What C needs is a distinction between char and int8_t, rendering "char" 
> an unsigned at all times basically and making "unsigned char" and 
> "signed char" illegal types in turn.

No, it's really more fundamental than that.

Exactly because "char *" doesn't have a defined sign, only a TOTALLY 
INCOMPETENT compiler will warn about its signedness.

The user has clearly stated "I don't care about the sign". If a compiler 
complains about us passing "unsigned char *" (or, if "char" is naturally 
unsigned on that platform, "signed char *") to strcmp then that compiler 
IS BROKEN. Because "strcmp()" takes "char *", which simply DOES NOT HAVE a 
uniquely defined sign. 

That's why we can't have -Wpointer-sign on by default. The gcc warning is 
simply *crap*.

If you were to have

	extern int strcmp(signed char *, signed char *);

and would pass *that* an "unsigned char", it would be a real and valid 
sign error. But just plain "char *" isn't signed or unsigned. It's 
"implementation defined", and as such, if the programmer used it, the 
programmer clearly doesn't care. If he cared, he would just state the 
signedness explicitly.

It really is that simple. gcc is broken. The C language isn't, it's purely 
a broken compiler issue.

The other things in the kernel I'd be willing to fix up. But I simply 
refuse to work around a known-buggy warning.

		Linus


* Re: somebody dropped a (warning) bomb
  2007-02-08 16:33 ` Linus Torvalds
@ 2007-02-08 18:42   ` Jan Engelhardt
  2007-02-08 19:53     ` Linus Torvalds
  2007-02-09 15:10     ` Sergei Organov
  0 siblings, 2 replies; 67+ messages in thread
From: Jan Engelhardt @ 2007-02-08 18:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Linux Kernel Mailing List, Andrew Morton

On Feb 8 2007 08:33, Linus Torvalds wrote:
>
>And the thing is, sometimes -Wpointer-sign-compare is just horribly 
>broken. For example, you cannot write a "strlen()" that doesn't 
>complain about "unsigned char *" vs "char *". And that is just a BUG.
>
>I'm continually amazed at how totally clueless some gcc warnings are. 
>You'd think that the people writing them are competent. And then 
>sometimes they show just how totally incompetent they are, by making it 
>impossible to use "unsigned char" arrays together with standard 
>functions.

I generally have to agree with you about the unsigned char* vs char*. It 
is a problem of the C language that char can be signed and unsigned, and 
that people, as a result, have used it for storing 
"shorter_than_short_t"s.

What C needs is a distinction between char and int8_t, rendering "char" 
an unsigned at all times basically and making "unsigned char" and 
"signed char" illegal types in turn.


Jan
-- 
ft: http://freshmeat.net/p/chaostables/


* Re: somebody dropped a (warning) bomb
  2007-02-08 15:00 Jeff Garzik
  2007-02-08 16:33 ` Linus Torvalds
@ 2007-02-08 16:35 ` Kumar Gala
  1 sibling, 0 replies; 67+ messages in thread
From: Kumar Gala @ 2007-02-08 16:35 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linux Kernel Mailing List, Andrew Morton, Linus Torvalds

On Thu, 8 Feb 2007, Jeff Garzik wrote:

> Just did a pull for the first time since 2.6.20, and a /megaton/ of new 
> warnings have appeared, on Fedora Core 6/x86-64 "make allyesconfig".
>
> All of the new warnings spew appear to be "pointers differ in signedness" 
> warning.  Granted, we should care about these, but even a "silent" build 
> (make -s) now spews far more output than any human could possibly care about, 
> and digging important nuggets out of all this noise is beginning to require 
> automated tools rather than just human eyes.

I think this is because of changes to Kbuild and the ongoing discussion 
related to it.  On PPC I see that we are losing options via $(call 
cc-option..)

- k


* Re: somebody dropped a (warning) bomb
  2007-02-08 15:00 Jeff Garzik
@ 2007-02-08 16:33 ` Linus Torvalds
  2007-02-08 18:42   ` Jan Engelhardt
  2007-02-08 16:35 ` Kumar Gala
  1 sibling, 1 reply; 67+ messages in thread
From: Linus Torvalds @ 2007-02-08 16:33 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linux Kernel Mailing List, Andrew Morton



On Thu, 8 Feb 2007, Jeff Garzik wrote:
>
> Just did a pull for the first time since 2.6.20, and a /megaton/ of new
> warnings have appeared, on Fedora Core 6/x86-64 "make allyesconfig".

That was due to the Makefile rule breakage. Hopefully it now _really_ is 
fixed. The

	CFLAGS += $(call cc-option,-Wno-pointer-sign,)

part effectively got dropped.

> All of the new warnings spew appear to be "pointers differ in signedness"
> warning.  Granted, we should care about these

The thing is, I'm _really_ sad we have to use that -Wno-pointer-sign 
thing. It's a real and valid warning. It finds really ugly code issues 
(unlike a lot of the other things). So it got dropped by mistake, but that 
is one warning that I'd actually like to resurrect (unlike a lot of other 
gcc warnings that I just think are crap to begin with).

However, we've got a metric shitload of code that passes in pointers to 
the wrong sign, so a lot of it would involve mindless changes that are 
likely to break real things. 

And the thing is, sometimes -Wpointer-sign-compare is just horribly 
broken. For example, you cannot write a "strlen()" that doesn't complain 
about "unsigned char *" vs "char *". And that is just a BUG.

I'm continually amazed at how totally clueless some gcc warnings are. 
You'd think that the people writing them are competent. And then sometimes 
they show just how totally incompetent they are, by making it impossible 
to use "unsigned char" arrays together with standard functions.

So a _lot_ of the warnings are real. But with no way to shut up the ones 
that aren't, it sadly doesn't matter one whit ;(

(And adding casts is just much worse than shutting up the warning 
entirely)

There should be a rule about warnings: if you can't shut up stylistic 
warnings on a site-by-site level without adding horribly syntactic stuff 
that is worse than what the warning is all about, the warning shouldn't 
exist.

			Linus


* somebody dropped a (warning) bomb
@ 2007-02-08 15:00 Jeff Garzik
  2007-02-08 16:33 ` Linus Torvalds
  2007-02-08 16:35 ` Kumar Gala
  0 siblings, 2 replies; 67+ messages in thread
From: Jeff Garzik @ 2007-02-08 15:00 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Andrew Morton, Linus Torvalds

Just did a pull for the first time since 2.6.20, and a /megaton/ of new 
warnings have appeared, on Fedora Core 6/x86-64 "make allyesconfig".

All of the new warnings spew appear to be "pointers differ in 
signedness" warning.  Granted, we should care about these, but even a 
"silent" build (make -s) now spews far more output than any human could 
possibly care about, and digging important nuggets out of all this noise 
is beginning to require automated tools rather than just human eyes.

	Jeff




end of thread, other threads:[~2007-02-19 13:59 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <7Mj5f-3oz-21@gated-at.bofh.it>
     [not found] ` <7MktH-5EW-35@gated-at.bofh.it>
     [not found]   ` <7Mmvy-vj-17@gated-at.bofh.it>
     [not found]     ` <7MnBC-2fk-13@gated-at.bofh.it>
     [not found]       ` <7MoQx-4p8-11@gated-at.bofh.it>
     [not found]         ` <7MpjE-50z-7@gated-at.bofh.it>
     [not found]           ` <7MpCS-5Fe-9@gated-at.bofh.it>
     [not found]             ` <7MDd7-17w-1@gated-at.bofh.it>
     [not found]               ` <7MGkB-62k-31@gated-at.bofh.it>
     [not found]                 ` <7NHoe-2Mb-37@gated-at.bofh.it>
     [not found]                   ` <7NMe9-1ZN-7@gated-at.bofh.it>
     [not found]                     ` <7Oagl-6bO-1@gated-at.bofh.it>
     [not found]                       ` <7ObvW-89N-23@gated-at.bofh.it>
     [not found]                         ` <7Oc8t-NS-1@gated-at.bofh.it>
2007-02-15 20:08                           ` somebody dropped a (warning) bomb Bodo Eggert
2007-02-16 11:21                             ` Sergei Organov
2007-02-16 14:51                               ` Bodo Eggert
2007-02-19 11:56                                 ` Sergei Organov
2007-02-16 12:46                             ` Sergei Organov
2007-02-16 17:40                               ` Bodo Eggert
2007-02-19 12:17                                 ` Sergei Organov
2007-02-08 15:00 Jeff Garzik
2007-02-08 16:33 ` Linus Torvalds
2007-02-08 18:42   ` Jan Engelhardt
2007-02-08 19:53     ` Linus Torvalds
2007-02-08 21:10       ` Jan Engelhardt
2007-02-08 21:37         ` Linus Torvalds
2007-02-08 23:12           ` David Rientjes
2007-02-08 23:37             ` Linus Torvalds
2007-02-09  0:24               ` David Rientjes
2007-02-09  0:42                 ` Linus Torvalds
2007-02-09  0:59                   ` Linus Torvalds
2007-02-09  0:59                   ` David Rientjes
2007-02-09  1:11                     ` Linus Torvalds
2007-02-09  1:18                       ` David Rientjes
2007-02-09 15:38                         ` Linus Torvalds
2007-02-09  3:27                   ` D. Hazelton
2007-02-09 19:54                     ` Pete Zaitcev
2007-02-09 12:34                   ` Jan Engelhardt
2007-02-09 13:16                     ` linux-os (Dick Johnson)
2007-02-09 17:45                       ` Jan Engelhardt
2007-02-09 20:29                         ` linux-os (Dick Johnson)
2007-02-09 22:05                           ` Jan Engelhardt
2007-02-09 22:58                             ` Martin Mares
2007-02-12 18:50                             ` linux-os (Dick Johnson)
2007-02-13 15:14                     ` Dick Streefland
2007-02-08 21:13       ` J.A. Magallón
2007-02-08 21:42         ` Linus Torvalds
2007-02-08 22:03           ` Linus Torvalds
2007-02-08 22:19             ` Willy Tarreau
2007-02-09  0:03             ` J.A. Magallón
2007-02-09  0:22               ` Linus Torvalds
2007-02-09 12:38             ` Sergei Organov
2007-02-09 15:58               ` Linus Torvalds
2007-02-12 11:12                 ` Sergei Organov
2007-02-12 16:26                   ` Linus Torvalds
2007-02-13 18:06                     ` Sergei Organov
2007-02-13 18:26                       ` Pekka Enberg
2007-02-13 19:14                         ` Sergei Organov
2007-02-13 19:43                           ` Pekka Enberg
2007-02-13 20:29                             ` Sergei Organov
2007-02-13 21:31                               ` Jeff Garzik
2007-02-13 23:21                               ` Linus Torvalds
2007-02-15 13:20                                 ` Sergei Organov
2007-02-15 15:57                                   ` Linus Torvalds
2007-02-15 18:53                                     ` Sergei Organov
2007-02-15 19:02                                       ` Linus Torvalds
2007-02-16  4:26                                         ` Rene Herman
2007-02-19 11:58                                           ` Sergei Organov
2007-02-19 13:58                                         ` Sergei Organov
2007-02-15 22:32                                     ` Lennart Sorensen
2007-02-13 19:25                       ` Linus Torvalds
2007-02-13 19:59                         ` Sergei Organov
2007-02-13 20:24                           ` Linus Torvalds
2007-02-15 15:15                             ` Sergei Organov
2007-02-13 21:13                         ` Rob Landley
2007-02-13 22:21                       ` Olivier Galibert
2007-02-14 12:52                         ` Sergei Organov
2007-02-15 20:06                         ` Sergei Organov
2007-02-09 15:10     ` Sergei Organov
2007-02-08 16:35 ` Kumar Gala
