LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: [PATCH] 64bit unaligned access on 32bit kernel
       [not found] ` <20060306.203218.69025300.nemoto@toshiba-tops.co.jp>
@ 2006-03-07  1:05   ` Andrew Morton
  2006-03-07  2:03     ` Atsushi Nemoto
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Andrew Morton @ 2006-03-07  1:05 UTC (permalink / raw)
  To: Atsushi Nemoto; +Cc: ralf, linux-mips, linux-kernel

Atsushi Nemoto <anemo@mba.ocn.ne.jp> wrote:
>
> >>>>> On Tue, 30 Aug 2005 11:40:56 +0100, Ralf Baechle <ralf@linux-mips.org> said:
> > I've rewriten Atushi's fix for the 64-bit put_unaligned on 32-bit
> > systems bug to generate more efficient code.
> 
> > This case has buzilla URL http://bugzilla.kernel.org/show_bug.cgi?id=5138.
> 
> > Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
> ...
> >  #define __get_unaligned(ptr, size) ({		\
> >  	const void *__gu_p = ptr;		\
> > -	unsigned long val;			\
> > +	__typeof__(*(ptr)) val;			\
> >  	switch (size) {				\
> >  	case 1:					\
> >  		val = *(const __u8 *)__gu_p;	\
> 
> It looks gcc 4.x strike back.  If the 'ptr' is a const, this code
> cause "assignment of read-only variable" error on gcc 4.x.  Let's step
> a back, or do you have any other good idea?
> 
> 
> Use __u64 instead of __typeof__(*(ptr)) for temporary variable to get
> rid of errors on gcc 4.x.
> 
> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
> 
> diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h
> index 4dc8ddb..09ec447 100644
> --- a/include/asm-generic/unaligned.h
> +++ b/include/asm-generic/unaligned.h
> @@ -78,7 +78,7 @@ static inline void __ustw(__u16 val, __u
>  
>  #define __get_unaligned(ptr, size) ({		\
>  	const void *__gu_p = ptr;		\
> -	__typeof__(*(ptr)) val;			\
> +	__u64 val;				\
>  	switch (size) {				\
>  	case 1:					\
>  		val = *(const __u8 *)__gu_p;	\
> @@ -95,7 +95,7 @@ static inline void __ustw(__u16 val, __u
>  	default:				\
>  		bad_unaligned_access_length();	\
>  	};					\
> -	val;					\
> +	(__typeof__(*(ptr)))val;		\
>  })
>  
>  #define __put_unaligned(val, ptr, size)		\

I worry about what impact that change might have on code generation. 
Hopefully none, if gcc is good enough.

But I cannot think of a better fix.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] 64bit unaligned access on 32bit kernel
  2006-03-07  1:05   ` [PATCH] 64bit unaligned access on 32bit kernel Andrew Morton
@ 2006-03-07  2:03     ` Atsushi Nemoto
  2006-03-07 18:09     ` Ralf Baechle
  2007-02-14 21:42     ` [PATCH] Optimize generic get_unaligned / put_unaligned implementations Ralf Baechle
  2 siblings, 0 replies; 18+ messages in thread
From: Atsushi Nemoto @ 2006-03-07  2:03 UTC (permalink / raw)
  To: akpm; +Cc: ralf, linux-mips, linux-kernel

>>>>> On Mon, 6 Mar 2006 17:05:52 -0800, Andrew Morton <akpm@osdl.org> said:
>> Use __u64 instead of __typeof__(*(ptr)) for temporary variable to
>> get rid of errors on gcc 4.x.

akpm> I worry about what impact that change might have on code
akpm> generation.  Hopefully none, if gcc is good enough.

akpm> But I cannot think of a better fix.

As I tested on MIPS gcc 3.x, the impact is not none, but not so huge.
And it becomes much smaller with gcc 4.x.

---
Atsushi Nemoto

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] 64bit unaligned access on 32bit kernel
  2006-03-07  1:05   ` [PATCH] 64bit unaligned access on 32bit kernel Andrew Morton
  2006-03-07  2:03     ` Atsushi Nemoto
@ 2006-03-07 18:09     ` Ralf Baechle
  2006-03-08  4:58       ` Atsushi Nemoto
  2007-02-14 21:42     ` [PATCH] Optimize generic get_unaligned / put_unaligned implementations Ralf Baechle
  2 siblings, 1 reply; 18+ messages in thread
From: Ralf Baechle @ 2006-03-07 18:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Mon, Mar 06, 2006 at 05:05:52PM -0800, Andrew Morton wrote:

> I worry about what impact that change might have on code generation. 
> Hopefully none, if gcc is good enough.
> 
> But I cannot think of a better fix.

Below's fix results in exactly the same code size on all compilers and
configurations I've tested it.

I also have another more elegant fix which as a side effect makes
get_unaligned work for arbitrary data types but it that one results in a
slight code bloat:

gcc 4.1.0 ip22 64-bit
   text    data     bss     dec     hex filename
2717213  337920  167968 3223101  312e3d vmlinux
2717277  337920  167968 3223165  312e7d vmlinux         unaligned-4.patch

  Ralf


Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h
index 4dc8ddb..9a63564 100644
--- a/include/asm-generic/unaligned.h
+++ b/include/asm-generic/unaligned.h
@@ -26,35 +26,13 @@
  * the linker will alert us to the problem via an unresolved reference
  * error.
  */
-extern void bad_unaligned_access_length(void) __attribute__((noreturn));
+extern int bad_unaligned_access_length(void) __attribute__((noreturn));
 
 struct __una_u64 { __u64 x __attribute__((packed)); };
 struct __una_u32 { __u32 x __attribute__((packed)); };
 struct __una_u16 { __u16 x __attribute__((packed)); };
 
 /*
- * Elemental unaligned loads 
- */
-
-static inline __u64 __uldq(const __u64 *addr)
-{
-	const struct __una_u64 *ptr = (const struct __una_u64 *) addr;
-	return ptr->x;
-}
-
-static inline __u32 __uldl(const __u32 *addr)
-{
-	const struct __una_u32 *ptr = (const struct __una_u32 *) addr;
-	return ptr->x;
-}
-
-static inline __u16 __uldw(const __u16 *addr)
-{
-	const struct __una_u16 *ptr = (const struct __una_u16 *) addr;
-	return ptr->x;
-}
-
-/*
  * Elemental unaligned stores 
  */
 
@@ -76,26 +54,16 @@ static inline void __ustw(__u16 val, __u
 	ptr->x = val;
 }
 
-#define __get_unaligned(ptr, size) ({		\
-	const void *__gu_p = ptr;		\
-	__typeof__(*(ptr)) val;			\
-	switch (size) {				\
-	case 1:					\
-		val = *(const __u8 *)__gu_p;	\
-		break;				\
-	case 2:					\
-		val = __uldw(__gu_p);		\
-		break;				\
-	case 4:					\
-		val = __uldl(__gu_p);		\
-		break;				\
-	case 8:					\
-		val = __uldq(__gu_p);		\
-		break;				\
-	default:				\
-		bad_unaligned_access_length();	\
-	};					\
-	val;					\
+#define __get_unaligned(ptr, size)						\
+({										\
+	const void *__gu_p = ptr;						\
+	int __sz = size;							\
+										\
+	((__sz == 1) ? (__typeof__(*(ptr)))*(const __u8 *)__gu_p		\
+	: ((__sz == 2) ? (__typeof__(*(ptr)))((struct __una_u16 *)__gu_p)->x	\
+	: ((__sz == 4) ? (__typeof__(*(ptr)))((struct __una_u32 *)__gu_p)->x	\
+	: ((__sz == 8) ? (__typeof__(*(ptr)))((struct __una_u64 *)__gu_p)->x	\
+	: bad_unaligned_access_length()))));					\
 })
 
 #define __put_unaligned(val, ptr, size)		\

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] 64bit unaligned access on 32bit kernel
  2006-03-07 18:09     ` Ralf Baechle
@ 2006-03-08  4:58       ` Atsushi Nemoto
  2006-03-08  5:12         ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Atsushi Nemoto @ 2006-03-08  4:58 UTC (permalink / raw)
  To: ralf; +Cc: akpm, linux-mips, linux-kernel

>>>>> On Tue, 7 Mar 2006 18:09:07 +0000, Ralf Baechle <ralf@linux-mips.org> said:
ralf> Below's fix results in exactly the same code size on all
ralf> compilers and configurations I've tested it.

ralf> I also have another more elegant fix which as a side effect
ralf> makes get_unaligned work for arbitrary data types but it that
ralf> one results in a slight code bloat:

I tested the patch attached on several MIPS kernel (big/little,
32bit/64bit) with gcc 3.4.5.  In most (but not all) case, Ralf's fix
resulted better than the previous fix.

Acked-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>

---
Atsushi Nemoto

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] 64bit unaligned access on 32bit kernel
  2006-03-08  4:58       ` Atsushi Nemoto
@ 2006-03-08  5:12         ` Andrew Morton
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2006-03-08  5:12 UTC (permalink / raw)
  To: Atsushi Nemoto; +Cc: ralf, linux-mips, linux-kernel

Atsushi Nemoto <anemo@mba.ocn.ne.jp> wrote:
>
> >>>>> On Tue, 7 Mar 2006 18:09:07 +0000, Ralf Baechle <ralf@linux-mips.org> said:
> ralf> Below's fix results in exactly the same code size on all
> ralf> compilers and configurations I've tested it.
> 
> ralf> I also have another more elegant fix which as a side effect
> ralf> makes get_unaligned work for arbitrary data types but it that
> ralf> one results in a slight code bloat:
> 
> I tested the patch attached on several MIPS kernel (big/little,
> 32bit/64bit) with gcc 3.4.5.  In most (but not all) case, Ralf's fix
> resulted better than the previous fix.
> 

hmm, well, your earlier patch has had more testing on various platforms,
for what that's worth.  I think for 2.6.16 we should run with that.  If you
want to prepare a patch which implements Ralf's version for post-2.6.16
then that would be good, thanks.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2006-03-07  1:05   ` [PATCH] 64bit unaligned access on 32bit kernel Andrew Morton
  2006-03-07  2:03     ` Atsushi Nemoto
  2006-03-07 18:09     ` Ralf Baechle
@ 2007-02-14 21:42     ` Ralf Baechle
  2007-02-15  4:39       ` Andrew Morton
  2 siblings, 1 reply; 18+ messages in thread
From: Ralf Baechle @ 2007-02-14 21:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

Time for a little bit of dead horse flogging.

On Mon, Mar 06, 2006 at 05:05:52PM -0800, Andrew Morton wrote:

> > --- a/include/asm-generic/unaligned.h
> > +++ b/include/asm-generic/unaligned.h
> > @@ -78,7 +78,7 @@ static inline void __ustw(__u16 val, __u
> >  
> >  #define __get_unaligned(ptr, size) ({		\
> >  	const void *__gu_p = ptr;		\
> > -	__typeof__(*(ptr)) val;			\
> > +	__u64 val;				\
> >  	switch (size) {				\
> >  	case 1:					\
> >  		val = *(const __u8 *)__gu_p;	\
> > @@ -95,7 +95,7 @@ static inline void __ustw(__u16 val, __u
> >  	default:				\
> >  		bad_unaligned_access_length();	\
> >  	};					\
> > -	val;					\
> > +	(__typeof__(*(ptr)))val;		\
> >  })
> >  
> >  #define __put_unaligned(val, ptr, size)		\
> 
> I worry about what impact that change might have on code generation. 
> Hopefully none, if gcc is good enough.
> 
> But I cannot think of a better fix.

It does inflate the code but back then we agreed to go for Atsushi's patch
because it was fairly obviously correct.  This patch obviously is less
obvious but generates fairly decent, works for arbitrary data types and
cuts down the size of unaligned.h from 122 lines to 44 so it must be good.

  Ralf

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h
index 09ec447..d7fda33 100644
--- a/include/asm-generic/unaligned.h
+++ b/include/asm-generic/unaligned.h
@@ -1,122 +1,44 @@
-#ifndef _ASM_GENERIC_UNALIGNED_H_
-#define _ASM_GENERIC_UNALIGNED_H_
-
 /*
- * For the benefit of those who are trying to port Linux to another
- * architecture, here are some C-language equivalents. 
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
  *
- * This is based almost entirely upon Richard Henderson's
- * asm-alpha/unaligned.h implementation.  Some comments were
- * taken from David Mosberger's asm-ia64/unaligned.h header.
+ * Copyright (C) 2006 MIPS Technologies, Inc.
+ *   written by Ralf Baechle <ralf@linux-mips.org>
  */
+#ifndef __ASM_GENERIC_UNALIGNED_H
+#define __ASM_GENERIC_UNALIGNED_H
 
 #include <linux/types.h>
 
-/* 
- * The main single-value unaligned transfer routines.
- */
-#define get_unaligned(ptr) \
-	__get_unaligned((ptr), sizeof(*(ptr)))
-#define put_unaligned(x,ptr) \
-	__put_unaligned((__u64)(x), (ptr), sizeof(*(ptr)))
-
-/*
- * This function doesn't actually exist.  The idea is that when
- * someone uses the macros below with an unsupported size (datatype),
- * the linker will alert us to the problem via an unresolved reference
- * error.
- */
-extern void bad_unaligned_access_length(void) __attribute__((noreturn));
-
-struct __una_u64 { __u64 x __attribute__((packed)); };
-struct __una_u32 { __u32 x __attribute__((packed)); };
-struct __una_u16 { __u16 x __attribute__((packed)); };
-
-/*
- * Elemental unaligned loads 
- */
-
-static inline __u64 __uldq(const __u64 *addr)
-{
-	const struct __una_u64 *ptr = (const struct __una_u64 *) addr;
-	return ptr->x;
-}
-
-static inline __u32 __uldl(const __u32 *addr)
-{
-	const struct __una_u32 *ptr = (const struct __una_u32 *) addr;
-	return ptr->x;
-}
-
-static inline __u16 __uldw(const __u16 *addr)
-{
-	const struct __una_u16 *ptr = (const struct __una_u16 *) addr;
-	return ptr->x;
-}
-
 /*
- * Elemental unaligned stores 
+ * The unused member __un_foo is there to suppress "warning: ´packed´
+ * attribute ignored for field of type ´union <anonymous>´" messages if
+ * ptr is char *.
  */
 
-static inline void __ustq(__u64 val, __u64 *addr)
-{
-	struct __una_u64 *ptr = (struct __una_u64 *) addr;
-	ptr->x = val;
-}
-
-static inline void __ustl(__u32 val, __u32 *addr)
-{
-	struct __una_u32 *ptr = (struct __una_u32 *) addr;
-	ptr->x = val;
-}
-
-static inline void __ustw(__u16 val, __u16 *addr)
-{
-	struct __una_u16 *ptr = (struct __una_u16 *) addr;
-	ptr->x = val;
-}
-
-#define __get_unaligned(ptr, size) ({		\
-	const void *__gu_p = ptr;		\
-	__u64 val;				\
-	switch (size) {				\
-	case 1:					\
-		val = *(const __u8 *)__gu_p;	\
-		break;				\
-	case 2:					\
-		val = __uldw(__gu_p);		\
-		break;				\
-	case 4:					\
-		val = __uldl(__gu_p);		\
-		break;				\
-	case 8:					\
-		val = __uldq(__gu_p);		\
-		break;				\
-	default:				\
-		bad_unaligned_access_length();	\
-	};					\
-	(__typeof__(*(ptr)))val;		\
+#define get_unaligned(ptr)						\
+({									\
+	const struct {							\
+		union {							\
+			const int __un_foo[0];				\
+			const __typeof__(*(ptr)) __un;			\
+		} __un __attribute__ ((packed));			\
+	} * const __gu_p = (void *) (ptr);				\
+									\
+	__gu_p->__un.__un;						\
 })
 
-#define __put_unaligned(val, ptr, size)		\
-do {						\
-	void *__gu_p = ptr;			\
-	switch (size) {				\
-	case 1:					\
-		*(__u8 *)__gu_p = val;		\
-	        break;				\
-	case 2:					\
-		__ustw(val, __gu_p);		\
-		break;				\
-	case 4:					\
-		__ustl(val, __gu_p);		\
-		break;				\
-	case 8:					\
-		__ustq(val, __gu_p);		\
-		break;				\
-	default:				\
-	    	bad_unaligned_access_length();	\
-	};					\
+#define put_unaligned(val, ptr)						\
+do {									\
+	struct {							\
+		union {							\
+			const int __un_foo[0];				\
+			__typeof__(*(ptr)) __un;			\
+		} __un __attribute__ ((packed));			\
+	} * const __gu_p = (void *) (ptr);				\
+									\
+	__gu_p->__un.__un = (val);					\
 } while(0)
 
-#endif /* _ASM_GENERIC_UNALIGNED_H */
+#endif /* __ASM_GENERIC_UNALIGNED_H */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-14 21:42     ` [PATCH] Optimize generic get_unaligned / put_unaligned implementations Ralf Baechle
@ 2007-02-15  4:39       ` Andrew Morton
  2007-02-15  8:35         ` Marcel Holtmann
  2007-02-15 14:34         ` Ralf Baechle
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2007-02-15  4:39 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Wed, 14 Feb 2007 21:42:26 +0000 Ralf Baechle <ralf@linux-mips.org> wrote:

> Time for a little bit of dead horse flogging.
> 
> On Mon, Mar 06, 2006 at 05:05:52PM -0800, Andrew Morton wrote:
> 
> > > --- a/include/asm-generic/unaligned.h
> > > +++ b/include/asm-generic/unaligned.h
> > > @@ -78,7 +78,7 @@ static inline void __ustw(__u16 val, __u
> > >  
> > >  #define __get_unaligned(ptr, size) ({		\
> > >  	const void *__gu_p = ptr;		\
> > > -	__typeof__(*(ptr)) val;			\
> > > +	__u64 val;				\
> > >  	switch (size) {				\
> > >  	case 1:					\
> > >  		val = *(const __u8 *)__gu_p;	\
> > > @@ -95,7 +95,7 @@ static inline void __ustw(__u16 val, __u
> > >  	default:				\
> > >  		bad_unaligned_access_length();	\
> > >  	};					\
> > > -	val;					\
> > > +	(__typeof__(*(ptr)))val;		\
> > >  })
> > >  
> > >  #define __put_unaligned(val, ptr, size)		\
> > 
> > I worry about what impact that change might have on code generation. 
> > Hopefully none, if gcc is good enough.
> > 
> > But I cannot think of a better fix.
> 
> It does inflate the code but back then we agreed to go for Atsushi's patch
> because it was fairly obviously correct.  This patch obviously is less
> obvious but generates fairly decent, works for arbitrary data types and
> cuts down the size of unaligned.h from 122 lines to 44 so it must be good.
> 
> ...
>
> +#define get_unaligned(ptr)						\
> +({									\
> +	const struct {							\
> +		union {							\
> +			const int __un_foo[0];				\
> +			const __typeof__(*(ptr)) __un;			\
> +		} __un __attribute__ ((packed));			\
> +	} * const __gu_p = (void *) (ptr);				\
> +									\
> +	__gu_p->__un.__un;						\
>  })

Can someone please tell us how this magic works?  (And it does appear to
work).

It seems to assuming that the compiler will assume that members of packed
structures can have arbitrary alignment, even if that alignment is obvious.

Which makes sense, but I'd like to see chapter-and-verse from the spec or
from the gcc docs so we can rely upon it working on all architectures and
compilers from now until ever more.

IOW: your changlogging sucks ;)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15  4:39       ` Andrew Morton
@ 2007-02-15  8:35         ` Marcel Holtmann
  2007-02-15 14:34         ` Ralf Baechle
  1 sibling, 0 replies; 18+ messages in thread
From: Marcel Holtmann @ 2007-02-15  8:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ralf Baechle, Atsushi Nemoto, linux-mips, linux-kernel

Hi Andrew,

> > +#define get_unaligned(ptr)						\
> > +({									\
> > +	const struct {							\
> > +		union {							\
> > +			const int __un_foo[0];				\
> > +			const __typeof__(*(ptr)) __un;			\
> > +		} __un __attribute__ ((packed));			\
> > +	} * const __gu_p = (void *) (ptr);				\
> > +									\
> > +	__gu_p->__un.__un;						\
> >  })
> 
> Can someone please tell us how this magic works?  (And it does appear to
> work).
> 
> It seems to assuming that the compiler will assume that members of packed
> structures can have arbitrary alignment, even if that alignment is obvious.
> 
> Which makes sense, but I'd like to see chapter-and-verse from the spec or
> from the gcc docs so we can rely upon it working on all architectures and
> compilers from now until ever more.

I am far away from having any knowledge about the GCC internals and the
reason why this code works, but I've been told the generic way of
handling unaligned access is this:

#define get_unaligned(ptr)                      \
({                                              \
        struct __attribute__((packed)) {        \
                typeof(*(ptr)) __v;             \
        } *__p = (void *) (ptr);                \
        __p->__v;                               \
})

#define put_unaligned(val, ptr)                 \
do {                                            \
        struct __attribute__((packed)) {        \
                typeof(*(ptr)) __v;             \
        } *__p = (void *) (ptr);                \
        __p->__v = (val);                       \
} while(0)

Actually I am using this code in the Bluetooth userspace library for
over two years now without any complaints.

Regards

Marcel



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15  4:39       ` Andrew Morton
  2007-02-15  8:35         ` Marcel Holtmann
@ 2007-02-15 14:34         ` Ralf Baechle
  2007-02-15 21:53           ` Andrew Morton
  1 sibling, 1 reply; 18+ messages in thread
From: Ralf Baechle @ 2007-02-15 14:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Wed, Feb 14, 2007 at 08:39:03PM -0800, Andrew Morton wrote:

> Can someone please tell us how this magic works?  (And it does appear to
> work).
> 
> It seems to assuming that the compiler will assume that members of packed
> structures can have arbitrary alignment, even if that alignment is obvious.
> 
> Which makes sense, but I'd like to see chapter-and-verse from the spec or
> from the gcc docs so we can rely upon it working on all architectures and
> compilers from now until ever more.
> 
> IOW: your changlogging sucks ;)

It was my entry for the next edition of the C Puzzle Book ;-)

The whole union thing was only needed to get rid of a warning but Marcel's
solution does the same thing by attaching the packed keyword to the entire
structure instead, so this patch is now using his macros but using __packed
instead.

  Ralf

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h
index 09ec447..60d94fc 100644
--- a/include/asm-generic/unaligned.h
+++ b/include/asm-generic/unaligned.h
@@ -1,122 +1,27 @@
-#ifndef _ASM_GENERIC_UNALIGNED_H_
-#define _ASM_GENERIC_UNALIGNED_H_
-
-/*
- * For the benefit of those who are trying to port Linux to another
- * architecture, here are some C-language equivalents. 
- *
- * This is based almost entirely upon Richard Henderson's
- * asm-alpha/unaligned.h implementation.  Some comments were
- * taken from David Mosberger's asm-ia64/unaligned.h header.
- */
-
-#include <linux/types.h>
-
-/* 
- * The main single-value unaligned transfer routines.
- */
-#define get_unaligned(ptr) \
-	__get_unaligned((ptr), sizeof(*(ptr)))
-#define put_unaligned(x,ptr) \
-	__put_unaligned((__u64)(x), (ptr), sizeof(*(ptr)))
-
 /*
- * This function doesn't actually exist.  The idea is that when
- * someone uses the macros below with an unsupported size (datatype),
- * the linker will alert us to the problem via an unresolved reference
- * error.
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
  */
-extern void bad_unaligned_access_length(void) __attribute__((noreturn));
-
-struct __una_u64 { __u64 x __attribute__((packed)); };
-struct __una_u32 { __u32 x __attribute__((packed)); };
-struct __una_u16 { __u16 x __attribute__((packed)); };
-
-/*
- * Elemental unaligned loads 
- */
-
-static inline __u64 __uldq(const __u64 *addr)
-{
-	const struct __una_u64 *ptr = (const struct __una_u64 *) addr;
-	return ptr->x;
-}
-
-static inline __u32 __uldl(const __u32 *addr)
-{
-	const struct __una_u32 *ptr = (const struct __una_u32 *) addr;
-	return ptr->x;
-}
-
-static inline __u16 __uldw(const __u16 *addr)
-{
-	const struct __una_u16 *ptr = (const struct __una_u16 *) addr;
-	return ptr->x;
-}
-
-/*
- * Elemental unaligned stores 
- */
-
-static inline void __ustq(__u64 val, __u64 *addr)
-{
-	struct __una_u64 *ptr = (struct __una_u64 *) addr;
-	ptr->x = val;
-}
-
-static inline void __ustl(__u32 val, __u32 *addr)
-{
-	struct __una_u32 *ptr = (struct __una_u32 *) addr;
-	ptr->x = val;
-}
+#ifndef __ASM_GENERIC_UNALIGNED_H
+#define __ASM_GENERIC_UNALIGNED_H
 
-static inline void __ustw(__u16 val, __u16 *addr)
-{
-	struct __una_u16 *ptr = (struct __una_u16 *) addr;
-	ptr->x = val;
-}
+#include <linux/compiler.h>
 
-#define __get_unaligned(ptr, size) ({		\
-	const void *__gu_p = ptr;		\
-	__u64 val;				\
-	switch (size) {				\
-	case 1:					\
-		val = *(const __u8 *)__gu_p;	\
-		break;				\
-	case 2:					\
-		val = __uldw(__gu_p);		\
-		break;				\
-	case 4:					\
-		val = __uldl(__gu_p);		\
-		break;				\
-	case 8:					\
-		val = __uldq(__gu_p);		\
-		break;				\
-	default:				\
-		bad_unaligned_access_length();	\
-	};					\
-	(__typeof__(*(ptr)))val;		\
+#define get_unaligned(ptr)					\
+({								\
+	struct __packed {					\
+		typeof(*(ptr)) __v;				\
+	} *__p = (void *) (ptr);				\
+	__p->__v;						\
 })
 
-#define __put_unaligned(val, ptr, size)		\
-do {						\
-	void *__gu_p = ptr;			\
-	switch (size) {				\
-	case 1:					\
-		*(__u8 *)__gu_p = val;		\
-	        break;				\
-	case 2:					\
-		__ustw(val, __gu_p);		\
-		break;				\
-	case 4:					\
-		__ustl(val, __gu_p);		\
-		break;				\
-	case 8:					\
-		__ustq(val, __gu_p);		\
-		break;				\
-	default:				\
-	    	bad_unaligned_access_length();	\
-	};					\
+#define put_unaligned(val, ptr)					\
+do {								\
+	struct __packed {					\
+		typeof(*(ptr)) __v;				\
+	} *__p = (void *) (ptr);				\
+	__p->__v = (val);					\
 } while(0)
 
-#endif /* _ASM_GENERIC_UNALIGNED_H */
+#endif /* __ASM_GENERIC_UNALIGNED_H */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15 14:34         ` Ralf Baechle
@ 2007-02-15 21:53           ` Andrew Morton
  2007-02-15 22:18             ` Ralf Baechle
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2007-02-15 21:53 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Thu, 15 Feb 2007 14:34:41 +0000
Ralf Baechle <ralf@linux-mips.org> wrote:

> On Wed, Feb 14, 2007 at 08:39:03PM -0800, Andrew Morton wrote:
> 
> > Can someone please tell us how this magic works?  (And it does appear to
> > work).
> > 
> > It seems to assuming that the compiler will assume that members of packed
> > structures can have arbitrary alignment, even if that alignment is obvious.
> > 
> > Which makes sense, but I'd like to see chapter-and-verse from the spec or
> > from the gcc docs so we can rely upon it working on all architectures and
> > compilers from now until ever more.
> > 
> > IOW: your changlogging sucks ;)
> 
> It was my entry for the next edition of the C Puzzle Book ;-)
> 
> The whole union thing was only needed to get rid of a warning but Marcel's
> solution does the same thing by attaching the packed keyword to the entire
> structure instead, so this patch is now using his macros but using __packed
> instead.

How do we know this trick will work as-designed across all versions of gcc
and icc (at least) and for all architectures and for all sets of compiler
options?

Basically, it has to be guaranteed by a C standard.  Is it?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15 21:53           ` Andrew Morton
@ 2007-02-15 22:18             ` Ralf Baechle
  2007-02-15 23:05               ` Jeremy Fitzhardinge
  2007-02-15 23:38               ` Andrew Morton
  0 siblings, 2 replies; 18+ messages in thread
From: Ralf Baechle @ 2007-02-15 22:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Thu, Feb 15, 2007 at 01:53:58PM -0800, Andrew Morton wrote:

> > The whole union thing was only needed to get rid of a warning but Marcel's
> > solution does the same thing by attaching the packed keyword to the entire
> > structure instead, so this patch is now using his macros but using __packed
> > instead.
> 
> How do we know this trick will work as-designed across all versions of gcc
> and icc (at least) and for all architectures and for all sets of compiler
> options?
> 
> Basically, it has to be guaranteed by a C standard.  Is it?

Gcc info page says:

[...]
`packed'
     The `packed' attribute specifies that a variable or structure field
     should have the smallest possible alignment--one byte for a
     variable, and one bit for a field, unless you specify a larger
     value with the `aligned' attribute.
[...]

Qed?

  Ralf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15 22:18             ` Ralf Baechle
@ 2007-02-15 23:05               ` Jeremy Fitzhardinge
  2007-02-15 23:38               ` Andrew Morton
  1 sibling, 0 replies; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2007-02-15 23:05 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Andrew Morton, Atsushi Nemoto, linux-mips, linux-kernel

Ralf Baechle wrote:
> Gcc info page says:
>
> [...]
> `packed'
>      The `packed' attribute specifies that a variable or structure field
>      should have the smallest possible alignment--one byte for a
>      variable, and one bit for a field, unless you specify a larger
>      value with the `aligned' attribute.
> [...]
>
> Qed?

So that the compiler has to assume that if its accessing this __packed
structure, it may be embedded unaligned within something else? And
because the pointer is cast through (void *) it isn't allowed to use
alias analysis to notice that the pointer wasn't originally (apparently)
unaligned.

Seems sound to me.

    J

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15 22:18             ` Ralf Baechle
  2007-02-15 23:05               ` Jeremy Fitzhardinge
@ 2007-02-15 23:38               ` Andrew Morton
  2007-02-16  0:13                 ` Jeremy Fitzhardinge
  2007-02-16  0:43                 ` Ralf Baechle
  1 sibling, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2007-02-15 23:38 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Thu, 15 Feb 2007 22:18:39 +0000
Ralf Baechle <ralf@linux-mips.org> wrote:

> On Thu, Feb 15, 2007 at 01:53:58PM -0800, Andrew Morton wrote:
> 
> > > The whole union thing was only needed to get rid of a warning but Marcel's
> > > solution does the same thing by attaching the packed keyword to the entire
> > > structure instead, so this patch is now using his macros but using __packed
> > > instead.
> > 
> > How do we know this trick will work as-designed across all versions of gcc
> > and icc (at least) and for all architectures and for all sets of compiler
> > options?
> > 
> > Basically, it has to be guaranteed by a C standard.  Is it?
> 
> Gcc info page says:
> 
> [...]
> `packed'
>      The `packed' attribute specifies that a variable or structure field
>      should have the smallest possible alignment--one byte for a
>      variable, and one bit for a field, unless you specify a larger
>      value with the `aligned' attribute.
> [...]
> 

hm.  So if I have

	struct bar {
		unsigned long b;
	} __attribute__((packed));

	struct foo {
		unsigned long u;
		struct bar b;
	};

then the compiler can see that foo.b.b is well-aligned, regardless of the
packedness.

Plus some crazy people compile the kernel with icc (or at least they used
to).  What happens there?

> Qed?

worried.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15 23:38               ` Andrew Morton
@ 2007-02-16  0:13                 ` Jeremy Fitzhardinge
  2007-02-16  0:43                 ` Ralf Baechle
  1 sibling, 0 replies; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2007-02-16  0:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ralf Baechle, Atsushi Nemoto, linux-mips, linux-kernel

Andrew Morton wrote:
> hm.  So if I have
>
> 	struct bar {
> 		unsigned long b;
> 	} __attribute__((packed));
>
> 	struct foo {
> 		unsigned long u;
> 		struct bar b;
> 	};
>
> then the compiler can see that foo.b.b is well-aligned, regardless of the
> packedness.

In Ralf's code, the structure is anonymous, and is used to declare a
pointer type, which is initialized from a void *.  So I think the
compiler isn't allowed to assume anything about its alignment.

    J

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-15 23:38               ` Andrew Morton
  2007-02-16  0:13                 ` Jeremy Fitzhardinge
@ 2007-02-16  0:43                 ` Ralf Baechle
  2007-02-16  1:27                   ` Andrew Morton
  1 sibling, 1 reply; 18+ messages in thread
From: Ralf Baechle @ 2007-02-16  0:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Thu, Feb 15, 2007 at 03:38:23PM -0800, Andrew Morton wrote:

> hm.  So if I have
> 
> 	struct bar {
> 		unsigned long b;
> 	} __attribute__((packed));
> 
> 	struct foo {
> 		unsigned long u;
> 		struct bar b;
> 	};
> 
> then the compiler can see that foo.b.b is well-aligned, regardless of the
> packedness.
> 
> Plus some crazy people compile the kernel with icc (or at least they used
> to).  What happens there?

A quick grep for __attribute__((packed)) and __packed find around 900 hits,
I'd probably find more if I'd look for syntactical variations.  Some hits
are in arch/{i386,x86_64,ia64}.  At a glance it seems hard to configure a
useful x86 kernel that doesn't involve any packed attribute.  I take that
as statistical proof that icc either has doesn't really work for building
the kernel or groks packing.  Any compiler not implementing gcc extensions
is lost at building the kernel but that's old news.

  Ralf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-16  0:43                 ` Ralf Baechle
@ 2007-02-16  1:27                   ` Andrew Morton
  2007-02-16  1:59                     ` Ralf Baechle
  2007-02-20 13:50                     ` Pavel Machek
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2007-02-16  1:27 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Fri, 16 Feb 2007 00:43:17 +0000
Ralf Baechle <ralf@linux-mips.org> wrote:

> On Thu, Feb 15, 2007 at 03:38:23PM -0800, Andrew Morton wrote:
> 
> > hm.  So if I have
> > 
> > 	struct bar {
> > 		unsigned long b;
> > 	} __attribute__((packed));
> > 
> > 	struct foo {
> > 		unsigned long u;
> > 		struct bar b;
> > 	};
> > 
> > then the compiler can see that foo.b.b is well-aligned, regardless of the
> > packedness.
> > 
> > Plus some crazy people compile the kernel with icc (or at least they used
> > to).  What happens there?
> 
> A quick grep for __attribute__((packed)) and __packed find around 900 hits,
> I'd probably find more if I'd look for syntactical variations.  Some hits
> are in arch/{i386,x86_64,ia64}.  At a glance it seems hard to configure a
> useful x86 kernel that doesn't involve any packed attribute.  I take that
> as statistical proof that icc either has doesn't really work for building
> the kernel or groks packing.  Any compiler not implementing gcc extensions
> is lost at building the kernel but that's old news.
> 

No, icc surely supports attribute(packed).  My point is that we shouldn't
rely upon the gcc info file for this, because other compilers can (or
could) be used to build the kernel.

So it would be safer if the C spec said (or could be interpreted to say)
"members of packed structures are always copied bytewise".  So then we
can be reasonably confident that this change won't break the use of
those compilers.

But then, I don't even know if any C standard says anything about packing.

Ho hum.  Why are we talking about this, anyway?  Does the patch make the
code faster?  Or just nicer?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-16  1:27                   ` Andrew Morton
@ 2007-02-16  1:59                     ` Ralf Baechle
  2007-02-20 13:50                     ` Pavel Machek
  1 sibling, 0 replies; 18+ messages in thread
From: Ralf Baechle @ 2007-02-16  1:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Atsushi Nemoto, linux-mips, linux-kernel

On Thu, Feb 15, 2007 at 05:27:20PM -0800, Andrew Morton wrote:

> No, icc surely supports attribute(packed).  My point is that we shouldn't
> rely upon the gcc info file for this, because other compilers can (or
> could) be used to build the kernel.
> 
> So it would be safer if the C spec said (or could be interpreted to say)
> "members of packed structures are always copied bytewise".  So then we
> can be reasonably confident that this change won't break the use of
> those compilers.
> 
> But then, I don't even know if any C standard says anything about packing.

Memory layout and alignment of structures and members are implementation
defined according to the C standard; the standard provides no means to
influence these.  So it takes a compiler extension such as gcc's
__attribute__().

> Ho hum.  Why are we talking about this, anyway?  Does the patch make the
> code faster?  Or just nicer?

Smaller binary and from looking at the disassembly a tad faster also.

  Ralf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Optimize generic get_unaligned / put_unaligned implementations.
  2007-02-16  1:27                   ` Andrew Morton
  2007-02-16  1:59                     ` Ralf Baechle
@ 2007-02-20 13:50                     ` Pavel Machek
  1 sibling, 0 replies; 18+ messages in thread
From: Pavel Machek @ 2007-02-20 13:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ralf Baechle, Atsushi Nemoto, linux-mips, linux-kernel

Hi!

> > > hm.  So if I have
> > > 
> > > 	struct bar {
> > > 		unsigned long b;
> > > 	} __attribute__((packed));
> > > 
> > > 	struct foo {
> > > 		unsigned long u;
> > > 		struct bar b;
> > > 	};
> > > 
> > > then the compiler can see that foo.b.b is well-aligned, regardless of the
> > > packedness.
> > > 
> > > Plus some crazy people compile the kernel with icc (or at least they used
> > > to).  What happens there?
> > 
> > A quick grep for __attribute__((packed)) and __packed find around 900 hits,
> > I'd probably find more if I'd look for syntactical variations.  Some hits
> > are in arch/{i386,x86_64,ia64}.  At a glance it seems hard to configure a
> > useful x86 kernel that doesn't involve any packed attribute.  I take that
> > as statistical proof that icc either has doesn't really work for building
> > the kernel or groks packing.  Any compiler not implementing gcc extensions
> > is lost at building the kernel but that's old news.
> > 
> 
> No, icc surely supports attribute(packed).  My point is that we shouldn't
> rely upon the gcc info file for this, because other compilers can (or
> could) be used to build the kernel.

Well, icc should be gcc compatible. If it is not, it is icc bug.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-02-20 14:00 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20050830104056.GA4710@linux-mips.org>
     [not found] ` <20060306.203218.69025300.nemoto@toshiba-tops.co.jp>
2006-03-07  1:05   ` [PATCH] 64bit unaligned access on 32bit kernel Andrew Morton
2006-03-07  2:03     ` Atsushi Nemoto
2006-03-07 18:09     ` Ralf Baechle
2006-03-08  4:58       ` Atsushi Nemoto
2006-03-08  5:12         ` Andrew Morton
2007-02-14 21:42     ` [PATCH] Optimize generic get_unaligned / put_unaligned implementations Ralf Baechle
2007-02-15  4:39       ` Andrew Morton
2007-02-15  8:35         ` Marcel Holtmann
2007-02-15 14:34         ` Ralf Baechle
2007-02-15 21:53           ` Andrew Morton
2007-02-15 22:18             ` Ralf Baechle
2007-02-15 23:05               ` Jeremy Fitzhardinge
2007-02-15 23:38               ` Andrew Morton
2007-02-16  0:13                 ` Jeremy Fitzhardinge
2007-02-16  0:43                 ` Ralf Baechle
2007-02-16  1:27                   ` Andrew Morton
2007-02-16  1:59                     ` Ralf Baechle
2007-02-20 13:50                     ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).