LKML Archive on lore.kernel.org
* I/O memory barriers vs SMP memory barriers
       [not found]         ` <tnxlkhpgslz.fsf@arm.com>
@ 2007-03-23 13:43           ` David Howells
  2007-03-23 15:08             ` Lennert Buytenhek
                               ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: David Howells @ 2007-03-23 13:43 UTC
  To: Lennert Buytenhek
  Cc: Catalin Marinas, ARM Linux Mailing List, Dan Williams,
	linux-kernel, torvalds, paulmck


[Resend - this time with a comma in the addresses, not a dot]

Lennert Buytenhek <buytenh@wantstofly.org> wrote:

> [ background: On ARM, SMP synchronisation does need barriers but device
>   synchronisation does not.  The question is that given this, whether
>   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
>   supposed to sync against other CPUs or not, or whether only smp_mb()
>   can be used for this.)  ]

Hmmmm...

I see your problem.  I think the right way to deal with this is to get rid of
mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
io_rmb(), ...

I think that there are only two places you should be using explicit memory
barriers:

 (1) To control inter-CPU effects on an SMP system.

 (2) To control CPU vs device effects.

> On Thu, Mar 22, 2007 at 04:17:44PM +0000, Catalin Marinas wrote:
> 
> > Is the requirement for mb() to act correctly in the SMP case as well?
> 
> That's what the docs seem to suggest.  A couple of snippets from
> memory-barriers.txt:
> 
> [1]  A write memory barrier gives a guarantee that all the STORE operations
>      specified before the barrier will appear to happen before all the STORE
>      operations specified after the barrier with respect to the other
>      components of the system.
> 
> [2]  A read barrier is a data dependency barrier plus a guarantee that all the
>      LOAD operations specified before the barrier will appear to happen before
>      all the LOAD operations specified after the barrier with respect to the
>      other components of the system.
> 
> [3]     TYPE            MANDATORY               SMP CONDITIONAL
>         =============== ======================= ===========================
>         GENERAL         mb()                    smp_mb()
>         WRITE           wmb()                   smp_wmb()
>         READ            rmb()                   smp_rmb()
>         DATA DEPENDENCY read_barrier_depends()  smp_read_barrier_depends()
> 
> [4]  Mandatory barriers should not be used to control SMP effects,
>      since mandatory barriers unnecessarily impose overhead on UP
>      systems.
> 
> Note the wording of 'other components of the system' in [1] and [2] --
> the way I read it, this includes devices as well as other CPUs.

Yes, but I suppose which "other components" are meant may depend on the class
of barrier used.

> [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> read_barrier_depends()) SHOULD not be used to control SMP effects, but
> it does not say that they MUST not.

As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(), etc.,
so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't be used if
smp_mb() is sufficient as that may impact performance on a UP system.

Really, mb() should only be used with respect to I/O.
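
To make the UP cost concrete, here is a rough userspace model of the two
barrier families.  The macro names mirror the kernel's, but the definitions
are illustrative assumptions, not any architecture's real header: MODEL_SMP
is a hypothetical stand-in for CONFIG_SMP, and the hw_barriers counter is
made up purely so the difference can be observed.

```c
/* Userspace model of mandatory vs SMP-conditional barriers.
 * Assumptions: MODEL_SMP stands in for CONFIG_SMP; hw_barriers is a
 * made-up counter recording each hardware barrier that gets issued. */
static int hw_barriers;

/* Compiler barrier: constrains the compiler only, emits no instruction. */
#define barrier()  __asm__ __volatile__("" ::: "memory")

/* Hardware barrier, modelled with the GCC/Clang full-fence builtin. */
#define hw_mb()    do { __sync_synchronize(); hw_barriers++; } while (0)

/* Mandatory barrier: always a hardware barrier, even on UP. */
#define mb()       hw_mb()

/* SMP-conditional barrier: hardware barrier on SMP, compiler barrier on UP. */
#ifdef MODEL_SMP
#define smp_mb()   hw_mb()
#else
#define smp_mb()   barrier()
#endif

/* Report how many hardware barriers each primitive costs in this build. */
static int cost_of_mb(void)
{
	hw_barriers = 0;
	mb();
	return hw_barriers;
}

static int cost_of_smp_mb(void)
{
	hw_barriers = 0;
	smp_mb();
	return hw_barriers;
}
```

Built without MODEL_SMP (the UP case), cost_of_mb() returns 1 while
cost_of_smp_mb() returns 0, which is the overhead being talked about here.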

> > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > case.
> 
> The exact wording is:
> 
> 	[!] Note that SMP memory barriers _must_ be used to control the
> 	ordering of references to shared memory on SMP systems, though
> 	the use of locking instead is sufficient.
> 
> This can IMHO be interpreted in two ways:
> 1. If you want to control ordering of references to shared memory on
>    SMP systems, you must use SMP memory barriers and not any other kind
>    of memory barrier.

If the shared memory is purely an inter-CPU effect, yes.  If the shared memory
is actually a device with side effects, then I/O safe memory barriers are
required - mb() and co.  Note that there must _also_ be safety wrt other
CPUs in the system, as other CPUs may also try to access the device.

> 2. If you want to control ordering of references to shared memory on
>    SMP systems, you must use memory barriers, and the SMP memory barrier
>    is the most appropriate barrier type to use.

You may use locking instead to control inter-CPU effects.  Locks imply one-way
permeable SMP-class memory barriers.
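
As a sketch of what "one-way permeable" means, a lock can be modelled in
userspace with C11 acquire/release operations.  The names model_lock(),
model_unlock() and increment_shared() are hypothetical, and this is not how
any kernel spinlock is actually implemented.

```c
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;
static int shared_counter;

/* Acquire: accesses in the critical section cannot move before this point,
 * though earlier accesses may still drift into the critical section. */
static void model_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

/* Release: accesses in the critical section cannot move after this point,
 * though later accesses may still drift into the critical section. */
static void model_unlock(void)
{
	atomic_flag_clear_explicit(&lock, memory_order_release);
}

/* The counter update needs no explicit smp_*mb(); the lock orders it. */
static int increment_shared(void)
{
	model_lock();
	int v = ++shared_counter;
	model_unlock();
	return v;
}
```

Each half is a barrier in one direction only, so a lock/unlock pair confines
the critical section without being a full barrier on either side.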

> I'm thinking that interpretation 2 is what was intended.  Interpretation 1
> doesn't seem consistent with the rest of the document, but if interpretation 1
> _is_ what was intended, we're off the hook and mb() and friends can be NOPs on
> ARM.  (But it'd probably still need a thorough audit... :-/ )

I think the best way to do an audit would be to make mb() and co. deprecated,
pending obsolete, and to replace them with io_mb() and co.  That way people
would have to eyeball any usages of mb() and co.

> > This means that if code uses mb() to control SMP sharing, it is broken.
> 
> I'm not so sure.

If it's _purely_ to control inter-CPU SMP sharing, then yes, it's broken.  It
must use either a lock or an smp_*mb() barrier.
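
A purely inter-CPU case looks roughly like the following userspace model.
The smp_wmb()/smp_rmb() macros here are assumptions mapped onto C11 fences
(which are somewhat stronger than the kernel primitives need to be), and
publish(), try_consume(), payload and ready are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the kernel primitives, modelled with C11 fences. */
#define smp_wmb()  atomic_thread_fence(memory_order_release)
#define smp_rmb()  atomic_thread_fence(memory_order_acquire)

static int payload;		/* ordinary shared memory */
static atomic_int ready;	/* flag telling the other CPU the data is valid */

/* Writer side: the payload store must be visible before the flag store. */
static void publish(int value)
{
	payload = value;
	smp_wmb();
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: the flag load must be ordered before the payload load. */
static bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;
	smp_rmb();
	*out = payload;
	return true;
}
```

No device is involved, so on a UP build both barriers could collapse to
compiler barriers without changing the result.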

Of course, Linus may disagree...

David



* Re: I/O memory barriers vs SMP memory barriers
  2007-03-23 13:43           ` I/O memory barriers vs SMP memory barriers David Howells
@ 2007-03-23 15:08             ` Lennert Buytenhek
  2007-03-24 20:16             ` Benjamin Herrenschmidt
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Lennert Buytenhek @ 2007-03-23 15:08 UTC
  To: David Howells
  Cc: Catalin Marinas, ARM Linux Mailing List, Dan Williams,
	linux-kernel, torvalds, paulmck

On Fri, Mar 23, 2007 at 01:43:53PM +0000, David Howells wrote:

> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.)  ]
> 
> Hmmmm...
> 
> I see your problem.  I think the right way to deal with this is to get
> rid of mb(), rmb(), wmb() and read_barrier_depends() and replace them
> with io_mb(), io_rmb(), ...

There are actually three different cases of interest on ARM:
1. direct-mapped and vmalloc()ed kernel memory
2. coherent DMA memory
3. I/O memory (device mappings)

smp_*() only make sense on (1).  Here, you'd want a hardware barrier
on SMP systems, and just a compiler barrier on UP systems.

For (2), most ARM systems use uncached mappings of kernel memory, which
are strongly ordered, and you don't need hardware barriers.  However,
some ARM systems are cache coherent, and they can use ordinary mappings
for (2) (i.e. kmalloc), _but_, such ordinary mappings are weakly ordered,
and so on those systems, you _would_ need hardware barriers for (2).

For (3), device memory (i.e. I/O mappings) is strongly ordered on all
ARM platforms.

(And of course, then there's the synchronisation issues _between_ the
different mapping types.)
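
The three cases can be summarised as a small decision function.  This is
only a sketch of the description above: the enum, the function name and the
cache_coherent_dma parameter are all hypothetical, and it ignores ordering
between different mapping types.

```c
#include <stdbool.h>

enum arm_mapping {
	MAP_KERNEL_MEMORY,	/* (1) direct-mapped and vmalloc()ed memory */
	MAP_COHERENT_DMA,	/* (2) coherent DMA memory */
	MAP_IO_MEMORY,		/* (3) I/O memory (device mappings) */
};

/* Does ordering accesses within one mapping type need a hardware barrier,
 * or is a compiler barrier enough? */
static bool needs_hw_barrier(enum arm_mapping m, bool smp,
			     bool cache_coherent_dma)
{
	switch (m) {
	case MAP_KERNEL_MEMORY:
		/* Weakly ordered: hardware barrier on SMP only. */
		return smp;
	case MAP_COHERENT_DMA:
		/* Uncached mappings are strongly ordered; ordinary mappings
		 * on cache-coherent systems are weakly ordered. */
		return cache_coherent_dma;
	case MAP_IO_MEMORY:
		/* Strongly ordered on all ARM platforms. */
		return false;
	}
	return true;	/* unknown mapping type: be conservative */
}
```

Laid out like this, it is easy to see that a single mb() cannot be optimal
for all three cases at once.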

Anyway, we could split the barrier types into three groups, or even
more groups (I bet that on, say, ia64, there's at least a couple more
different scenarios of interest), however, I'm really worried that the
Average Joe Driver Writer's head is just going to explode.


> > [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> > read_barrier_depends()) SHOULD not be used to control SMP effects, but
> > it does not say that they MUST not.
> 
> As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(),
> etc.,

Not (anymore) on ARM:

	http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9623b3732d11b0a18d9af3419f680d27ea24b014

The question is whether this change was correct.


> so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't
> be used if smp_mb() is sufficient as that may impact performance on
> a UP system.

There are two different statements that can be made about mb():

1. You shouldn't use mb() to synchronise with other CPUs as that is
   unnecessarily slow.

2. You must not use mb() to synchronise with other CPUs as that is
   wrong.

Which is it, (1) or (2)?  The memory-barriers.txt document confuses
these two issues, and you confuse these two issues, but there is a
_fundamental_ _semantic_ _difference_ between these two statements.
Let's not confuse them.


> Really, mb() should only be used with respect to I/O.

OK.  Can we clarify the docs on this point, please?


> > > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > > case.
> > 
> > The exact wording is:
> > 
> > 	[!] Note that SMP memory barriers _must_ be used to control the
> > 	ordering of references to shared memory on SMP systems, though
> > 	the use of locking instead is sufficient.
> > 
> > This can IMHO be interpreted in two ways:
> > 1. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use SMP memory barriers and not any other kind
> >    of memory barrier.
> 
> If the shared memory is purely an inter-CPU effect, yes.  If the shared
> memory is actually a device with side effects, then I/O safe memory
> barriers are required - mb() and co.  Note that there must _also_ be
> safety wrt other CPUs in the system, as other CPUs may also try to
> access the device.

I was not making any statement, I was just giving two possible
interpretations of the above-quoted snippet from memory-barriers.txt.

Yes, I'm aware of the issues you mention, and yes, all the other
necessary guarantees are provided on the ARM platform.


> > 2. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use memory barriers, and the SMP memory barrier
> >    is the most appropriate barrier type to use.
> 
> You may use locking instead to control inter-CPU effects.  Locks imply
> one-way permeable SMP-class memory barriers.

Again, I was not trying to make a statement here, just giving a
possible interpretation of a statement in memory-barriers.txt.


> > I'm thinking that interpretation 2 is what was intended.
> > Interpretation 1 doesn't seem consistent with the rest of the
> > document, but if interpretation 1 _is_ what was intended, we're
> > off the hook and mb() and friends can be NOPs on ARM.  (But it'd
> > probably still need a thorough audit... :-/ )
> 
> I think the best way to do an audit would be to make mb() and co.
> deprecated, pending obsolete, and to replace them with io_mb() and
> co.  That way people would have to eyeball any usages of mb() and
> co.

Sounds OK to me.  Then again, I have an idea of what all the different
types of barriers do.. Joe Driver Writer might not.


> > > This means that if code uses mb() to control SMP sharing, it is
> > > broken.
> > 
> > I'm not so sure.
> 
> If it's _purely_ to control inter-CPU SMP sharing, then yes, it's
> broken.

Bah.  Until the docs are changed to be utterly and unambiguously
clear on this point, that's just rubbish.


cheers,
Lennert


* Re: I/O memory barriers vs SMP memory barriers
  2007-03-23 13:43           ` I/O memory barriers vs SMP memory barriers David Howells
  2007-03-23 15:08             ` Lennert Buytenhek
@ 2007-03-24 20:16             ` Benjamin Herrenschmidt
  2007-03-25 21:15             ` Paul E. McKenney
  2007-03-26 10:07             ` David Howells
  3 siblings, 0 replies; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-24 20:16 UTC
  To: David Howells
  Cc: Lennert Buytenhek, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds, paulmck

On Fri, 2007-03-23 at 13:43 +0000, David Howells wrote:
> [Resend - this time with a comma in the addresses, not a dot]
> 
> Lennert Buytenhek <buytenh@wantstofly.org> wrote:
> 
> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.)  ]
> 
> Hmmmm...
> 
> I see your problem.  I think the right way to deal with this is to get rid of
> mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
> io_rmb(), ...

Hrm... I'm not sure I like the io_* name, I think it's even more
confusing, people will never know when to use what ...

Maybe we should dig out again my attempt at properly defining semantics
of IO accessors and related barriers and extend it to include CPU vs.
DMA barriers.

Ben.




* Re: I/O memory barriers vs SMP memory barriers
  2007-03-23 13:43           ` I/O memory barriers vs SMP memory barriers David Howells
  2007-03-23 15:08             ` Lennert Buytenhek
  2007-03-24 20:16             ` Benjamin Herrenschmidt
@ 2007-03-25 21:15             ` Paul E. McKenney
  2007-03-25 21:38               ` Lennert Buytenhek
  2007-03-26 10:04               ` David Howells
  2007-03-26 10:07             ` David Howells
  3 siblings, 2 replies; 11+ messages in thread
From: Paul E. McKenney @ 2007-03-25 21:15 UTC
  To: David Howells
  Cc: Lennert Buytenhek, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds

On Fri, Mar 23, 2007 at 01:43:53PM +0000, David Howells wrote:
> 
> [Resend - this time with a comma in the addresses, not a dot]
> 
> Lennert Buytenhek <buytenh@wantstofly.org> wrote:
> 
> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.)  ]
> 
> Hmmmm...
> 
> I see your problem.  I think the right way to deal with this is to get rid of
> mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
> io_rmb(), ...

We will get combinatorial explosion if we aren't -extremely- careful:

1.	Orders only normal memory accesses, which is all that is required
	of smp_*().

2.	Orders both normal and device accesses -- mmiowb().

3.	Orders memory accesses and device accesses, but not necessarily
	the union of the two -- mb(), rmb(), wmb().

4.	Orders only device accesses, which is what seems to be looked
	for here.

						Thanx, Paul

> I think that there are only two places you should be using explicit memory
> barriers:
> 
>  (1) To control inter-CPU effects on an SMP system.
> 
>  (2) To control CPU vs device effects.
> 
> > On Thu, Mar 22, 2007 at 04:17:44PM +0000, Catalin Marinas wrote:
> > 
> > > Is the requirement for mb() to act correctly in the SMP case as well?
> > 
> > That's what the docs seem to suggest.  A couple of snippets from
> > memory-barriers.txt:
> > 
> > [1]  A write memory barrier gives a guarantee that all the STORE operations
> >      specified before the barrier will appear to happen before all the STORE
> >      operations specified after the barrier with respect to the other
> >      components of the system.
> > 
> > [2]  A read barrier is a data dependency barrier plus a guarantee that all the
> >      LOAD operations specified before the barrier will appear to happen before
> >      all the LOAD operations specified after the barrier with respect to the
> >      other components of the system.
> > 
> > [3]     TYPE            MANDATORY               SMP CONDITIONAL
> >         =============== ======================= ===========================
> >         GENERAL         mb()                    smp_mb()
> >         WRITE           wmb()                   smp_wmb()
> >         READ            rmb()                   smp_rmb()
> >         DATA DEPENDENCY read_barrier_depends()  smp_read_barrier_depends()
> > 
> > [4]  Mandatory barriers should not be used to control SMP effects,
> >      since mandatory barriers unnecessarily impose overhead on UP
> >      systems.
> > 
> > Note the wording of 'other components of the system' in [1] and [2] --
> > the way I read it, this includes devices as well as other CPUs.
> 
> Yes, but I suppose which "other components" are meant may depend on the class
> of barrier used.
> 
> > [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> > read_barrier_depends()) SHOULD not be used to control SMP effects, but
> > it does not say that they MUST not.
> 
> As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(), etc.,
> so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't be used if
> smp_mb() is sufficient as that may impact performance on a UP system.
> 
> Really, mb() should only be used with respect to I/O.
> 
> > > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > > case.
> > 
> > The exact wording is:
> > 
> > 	[!] Note that SMP memory barriers _must_ be used to control the
> > 	ordering of references to shared memory on SMP systems, though
> > 	the use of locking instead is sufficient.
> > 
> > This can IMHO be interpreted in two ways:
> > 1. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use SMP memory barriers and not any other kind
> >    of memory barrier.
> 
> If the shared memory is purely an inter-CPU effect, yes.  If the shared memory
> is actually a device with side effects, then I/O safe memory barriers are
> required - mb() and co.  Note that there must _also_ be safety wrt other
> CPUs in the system, as other CPUs may also try to access the device.
> 
> > 2. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use memory barriers, and the SMP memory barrier
> >    is the most appropriate barrier type to use.
> 
> You may use locking instead to control inter-CPU effects.  Locks imply one-way
> permeable SMP-class memory barriers.
> 
> > I'm thinking that interpretation 2 is what was intended.  Interpretation 1
> > doesn't seem consistent with the rest of the document, but if
> > interpretation 1 _is_ what was intended, we're off the hook and mb() and
> > friends can be NOPs on ARM.  (But it'd probably still need a thorough
> > audit... :-/ )
> 
> I think the best way to do an audit would be to make mb() and co. deprecated,
> pending obsolete, and to replace them with io_mb() and co.  That way people
> would have to eyeball any usages of mb() and co.
> 
> > > This means that if code uses mb() to control SMP sharing, it is broken.
> > 
> > I'm not so sure.
> 
> If it's _purely_ to control inter-CPU SMP sharing, then yes, it's broken.  It
> must use either a lock or an smp_*mb() barrier.
> 
> Of course, Linus may disagree...
> 
> David
> 


* Re: I/O memory barriers vs SMP memory barriers
  2007-03-25 21:15             ` Paul E. McKenney
@ 2007-03-25 21:38               ` Lennert Buytenhek
  2007-03-26  3:24                 ` Paul E. McKenney
  2007-03-26 10:04               ` David Howells
  1 sibling, 1 reply; 11+ messages in thread
From: Lennert Buytenhek @ 2007-03-25 21:38 UTC
  To: Paul E. McKenney
  Cc: David Howells, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds

On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:

> > > [ background: On ARM, SMP synchronisation does need barriers but device
> > >   synchronisation does not.  The question is that given this, whether
> > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > >   can be used for this.)  ]
> > 
> > Hmmmm...
> > 
> > [snip]
> 
> 3.	Orders memory accesses and device accesses, but not necessarily
> 	the union of the two -- mb(), rmb(), wmb().

If mb/rmb/wmb are required to order normal memory accesses, that means
that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

Does everybody agree on these semantics, though?  At least David seems
to think that mb/rmb/wmb aren't required to order normal memory accesses
against each other..


> 4.	Orders only device accesses, which is what seems to be looked
> 	for here.

Yes.  (As above, on ARM, SMP synchronisation does need barriers but
device synchronisation does not.  If mb/rmb/wmb were only required to
synchronise device accesses, they could have been regular compiler
barriers on ARM, but if they are also required to synchronise normal
memory accesses against each other, they have to map to hardware
barriers.)


* Re: I/O memory barriers vs SMP memory barriers
  2007-03-25 21:38               ` Lennert Buytenhek
@ 2007-03-26  3:24                 ` Paul E. McKenney
  2007-03-26  8:46                   ` Lennert Buytenhek
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2007-03-26  3:24 UTC
  To: Lennert Buytenhek
  Cc: David Howells, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds

On Sun, Mar 25, 2007 at 11:38:43PM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:
> 
> > > > [ background: On ARM, SMP synchronisation does need barriers but device
> > > >   synchronisation does not.  The question is that given this, whether
> > > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > > >   can be used for this.)  ]
> > > 
> > > Hmmmm...
> > > 
> > > [snip]
> > 
> > 3.	Orders memory accesses and device accesses, but not necessarily
> > 	the union of the two -- mb(), rmb(), wmb().
> 
> If mb/rmb/wmb are required to order normal memory accesses, that means
> that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

This was on UP ARM systems, right?  Assuming that ARM CPUs respect the
usual CPU-self-consistency semantics, and given the background that
device accesses are ordered, then it might well be OK to have mb/rmb/wmb
be barrier() on UP ARM systems.

Most likely not on SMP ARM systems, however.

> Does everybody agree on these semantics, though?  At least David seems
> to think that mb/rmb/wmb aren't required to order normal memory accesses
> against each other..

Not on UP.  On SMP, ordering is (almost certainly) required.

> > 4.	Orders only device accesses, which is what seems to be looked
> > 	for here.
> 
> Yes.  (As above, on ARM, SMP synchronisation does need barriers but
> device synchronisation does not.  If mb/rmb/wmb were only required to
> synchronise device accesses, they could have been regular compiler
> barriers on ARM, but if they are also required to synchronise normal
> memory accesses against each other, they have to map to hardware
> barriers.)

Again, for kernels built for UP, you might well be able to make the
mb() primitives be barrier().  I don't see it for SMP, though.

						Thanx, Paul


* Re: I/O memory barriers vs SMP memory barriers
  2007-03-26  3:24                 ` Paul E. McKenney
@ 2007-03-26  8:46                   ` Lennert Buytenhek
  2007-03-26 20:07                     ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Lennert Buytenhek @ 2007-03-26  8:46 UTC
  To: Paul E. McKenney
  Cc: David Howells, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds

On Sun, Mar 25, 2007 at 08:24:18PM -0700, Paul E. McKenney wrote:

> > > > > [ background: On ARM, SMP synchronisation does need barriers but device
> > > > >   synchronisation does not.  The question is that given this, whether
> > > > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > > > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > > > >   can be used for this.)  ]
> > > > 
> > > > Hmmmm...
> > > > 
> > > > [snip]
> > > 
> > > 3.	Orders memory accesses and device accesses, but not necessarily
> > > 	the union of the two -- mb(), rmb(), wmb().
> > 
> > If mb/rmb/wmb are required to order normal memory accesses, that means
> > that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> > to always define mb/rmb/wmb as barrier() on ARM systems was wrong.
> 
> This was on UP ARM systems, right?

No.

If you look at commit 9623b3732d11b0a18d9af3419f680d27ea24b014, you can
see that it defines mb/rmb/wmb as barrier() on both ARM UP and SMP systems.
The UP part is obviously fine, the SMP part is what is under debate here.


> Assuming that ARM CPUs respect the usual CPU-self-consistency
> semantics, and given the background that device accesses are ordered,
> then it might well be OK to have mb/rmb/wmb be barrier() on UP ARM
> systems.
> 
> Most likely not on SMP ARM systems, however.

Given the semantics above, mb/rmb/wmb can obviously be just barrier()s
on ARM UP systems.. I don't think anyone ever disagreed about that.


> > Does everybody agree on these semantics, though?  At least David
> > seems to think that mb/rmb/wmb aren't required to order normal
> > memory accesses against each other..
> 
> Not on UP.  On SMP, ordering is (almost certainly) required.

'almost certainly'?  That sounds like there is a possibility that it
wouldn't have to?  What does this depend on?

At least David and Catalin seem to disagree with the statement
that mb/rmb/wmb should order accesses from different CPUs.  And
memory-barriers.txt is pretty vague about this..


* Re: I/O memory barriers vs SMP memory barriers 
  2007-03-25 21:15             ` Paul E. McKenney
  2007-03-25 21:38               ` Lennert Buytenhek
@ 2007-03-26 10:04               ` David Howells
  1 sibling, 0 replies; 11+ messages in thread
From: David Howells @ 2007-03-26 10:04 UTC
  To: Lennert Buytenhek
  Cc: Paul E. McKenney, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds, dhowells

Lennert Buytenhek <buytenh@wantstofly.org> wrote:

> Does everybody agree on these semantics, though?  At least David seems
> to think that mb/rmb/wmb aren't required to order normal memory accesses
> against each other..

Ummm...  I've just realised that your statement here is ambiguous.  When you
say "aren't required to", do you mean "aren't necessary to" or do you mean
"don't have to"?  Isn't English a fun language?

Anyway, what I meant is that mb() and co. as they stand _must_ do everything
smp_mb() and co do respectively, _in_ _addition_ to other side effects.

	mb() implies smp_mb()
	rmb() implies smp_rmb()
	wmb() implies smp_wmb()
	...

David


* Re: I/O memory barriers vs SMP memory barriers 
  2007-03-23 13:43           ` I/O memory barriers vs SMP memory barriers David Howells
                               ` (2 preceding siblings ...)
  2007-03-25 21:15             ` Paul E. McKenney
@ 2007-03-26 10:07             ` David Howells
  3 siblings, 0 replies; 11+ messages in thread
From: David Howells @ 2007-03-26 10:07 UTC
  To: Benjamin Herrenschmidt
  Cc: Lennert Buytenhek, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds, paulmck

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> Hrm... I'm not sure I like the io_* name, I think it's even more
> confusing, people will never know when to use what ...

I'd've thought it more obvious, but given there are several types of I/O, some
of which might require different barriering to others, I can see your point.

However, I think mb() unadorned is also confusing.

> Maybe we should dig out again my attempt at properly defining semantics
> of IO accessors and related barriers and extend it to include CPU vs.
> DMA barriers.

That could be useful.

David


* Re: I/O memory barriers vs SMP memory barriers
  2007-03-26  8:46                   ` Lennert Buytenhek
@ 2007-03-26 20:07                     ` Paul E. McKenney
  2007-03-28 18:36                       ` Lennert Buytenhek
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2007-03-26 20:07 UTC
  To: Lennert Buytenhek
  Cc: David Howells, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds

On Mon, Mar 26, 2007 at 10:46:39AM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 08:24:18PM -0700, Paul E. McKenney wrote:
> 
> > > > > > [ background: On ARM, SMP synchronisation does need barriers but device
> > > > > >   synchronisation does not.  The question is that given this, whether
> > > > > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > > > > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > > > > >   can be used for this.)  ]
> > > > > 
> > > > > Hmmmm...
> > > > > 
> > > > > [snip]
> > > > 
> > > > 3.	Orders memory accesses and device accesses, but not necessarily
> > > > 	the union of the two -- mb(), rmb(), wmb().
> > > 
> > > If mb/rmb/wmb are required to order normal memory accesses, that means
> > > that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> > > to always define mb/rmb/wmb as barrier() on ARM systems was wrong.
> > 
> > This was on UP ARM systems, right?
> 
> No.
> 
> If you look at commit 9623b3732d11b0a18d9af3419f680d27ea24b014, you can
> see that it defines mb/rmb/wmb as barrier() on both ARM UP and SMP systems.
> The UP part is obviously fine, the SMP part is what is under debate here.

Yep, looks wrong to me.

> > Assuming that ARM CPUs respect the usual CPU-self-consistency
> > semantics, and given the background that device accesses are ordered,
> > then it might well be OK to have mb/rmb/wmb be barrier() on UP ARM
> > systems.
> > 
> > Most likely not on SMP ARM systems, however.
> 
> Given the semantics above, mb/rmb/wmb can obviously be just barrier()s
> on ARM UP systems.. I don't think anyone ever disagreed about that.

Good.

> > > Does everybody agree on these semantics, though?  At least David
> > > seems to think that mb/rmb/wmb aren't required to order normal
> > > memory accesses against each other..
> > 
> > Not on UP.  On SMP, ordering is (almost certainly) required.
> 
> 'almost certainly'?  That sounds like there is a possibility that it
> wouldn't have to?  What does this depend on?

The underlying memory model of the CPU.  For sequentially consistent
systems, only compiler barriers are required.  There are very few such
systems -- MIPS and PA-RISC, if I remember correctly.  Performance
dictates otherwise.

I believe that MIPS is -not- sequentially consistent, but have not yet
purchased an architecture reference manual.

> At least David and Catalin seem to disagree with the statement
> that mb/rmb/wmb should order accesses from different CPUs.  And
> memory-barriers.txt is pretty vague about this..

mb() needs to do everything that smp_mb() does, ditto for rmb() and
wmb().  There really are cases where both I/O and memory accesses
need to be ordered, so just providing separate memory ordering and
I/O ordering is not enough.
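
A typical case where both kinds of access must be ordered is a driver that
fills in a descriptor in memory and then rings a device doorbell.  The
following is a userspace model only: ring, doorbell, fake_writel() and
post_buffer() are hypothetical, the "register" is just a volatile variable,
and wmb() is approximated with the GCC/Clang full-fence builtin.

```c
#include <stdint.h>

/* Stand-in for the kernel's wmb(); the builtin is a full fence. */
#define wmb()  __sync_synchronize()

struct desc {
	uint32_t addr;
	uint32_t len;
};

static struct desc ring[4];		/* models coherent DMA memory */
static volatile uint32_t doorbell;	/* models an MMIO register */

/* Hypothetical MMIO write helper; a real driver would use writel(). */
static void fake_writel(uint32_t val, volatile uint32_t *reg)
{
	*reg = val;
}

static void post_buffer(unsigned int slot, uint32_t addr, uint32_t len)
{
	/* Normal memory accesses: fill in the descriptor. */
	ring[slot].addr = addr;
	ring[slot].len = len;

	/* The descriptor must be visible before the device is told to
	 * fetch it, so memory stores are ordered against the I/O store. */
	wmb();

	/* Device access: tell the device which slot is ready. */
	fake_writel(slot + 1, &doorbell);
}
```

A barrier that ordered only device accesses, or only normal memory accesses,
would not be sufficient at the wmb() in post_buffer().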

Given that ARM device drivers are accessing MMIO locations, which are
often slow anyway, how much is ARM really gaining by dropping memory
barriers when only I/O accesses need be ordered?  Is it measurable?
If not, there is no point in adding yet another set of combinatorial
choices to the memory-barrier API.

						Thanx, Paul


* Re: I/O memory barriers vs SMP memory barriers
  2007-03-26 20:07                     ` Paul E. McKenney
@ 2007-03-28 18:36                       ` Lennert Buytenhek
  0 siblings, 0 replies; 11+ messages in thread
From: Lennert Buytenhek @ 2007-03-28 18:36 UTC
  To: Paul E. McKenney
  Cc: David Howells, Catalin Marinas, ARM Linux Mailing List,
	Dan Williams, linux-kernel, torvalds

On Mon, Mar 26, 2007 at 01:07:11PM -0700, Paul E. McKenney wrote:

> > > > Does everybody agree on these semantics, though?  At least David
> > > > seems to think that mb/rmb/wmb aren't required to order normal
> > > > memory accesses against each other..
> > > 
> > > Not on UP.  On SMP, ordering is (almost certainly) required.
> > 
> > 'almost certainly'?  That sounds like there is a possibility that it
> > wouldn't have to?  What does this depend on?
> 
> The underlying memory model of the CPU.  For sequentially consistent
> systems, only compiler barriers are required.  There are very few such
> systems -- MIPS and PA-RISC, if I remember correctly.  Performance
> dictates otherwise.
> 
> I believe that MIPS is -not- sequentially consistent, but have not yet
> purchased an architecture reference manual.

ARM Normal memory (RAM) accesses are weakly ordered, so on SMP, you
need barriers.  (SMP ARM systems are the definite minority, though.)

(For ARM UP, we generally don't care, since most have virtual caches
and are not I/O coherent, and so DMA coherent mappings will be done
as uncached mappings, and uncached mappings are strongly ordered --
except on XScale V3, which supports I/O coherency, and so you need to
use barriers when operating on DMA coherent memory because DMA coherent
mappings are done as Normal memory (which is weakly ordered) when I/O
coherency is enabled.)


> Given that ARM device drivers are accessing MMIO locations, which are
> often slow anyway, how much is ARM really gaining by dropping memory
> barriers when only I/O accesses need be ordered?  Is it measurable?

No idea -- I assume Catalin has looked at this.

