LKML Archive on
help / color / mirror / Atom feed
From: Alan Stern <>
To: Marco Elver <>
Cc: "Paul E. McKenney" <>,
	Boqun Feng <>,
	Andrea Parri <>,
	Will Deacon <>,
	Mark Rutland <>,
	Dmitry Vyukov <>,,
Subject: Re: LKMM: Read dependencies of writes ordered by dma_wmb()?
Date: Mon, 16 Aug 2021 10:59:45 -0400	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On Mon, Aug 16, 2021 at 12:12:01PM +0200, Marco Elver wrote:
> Hello,
> Commit c58a801701693 added a paragraph to the LKMM:
> 	+Although we said that plain accesses are not linked by the ppo
> 	+relation, they do contribute to it indirectly.  Namely, when there is
> 	+an address dependency from a marked load R to a plain store W,
> 	+followed by smp_wmb() and then a marked store W', the LKMM creates a
> 	+ppo link from R to W'.
> Defining that certain _marked reads_ will also be ordered by smp_wmb().
> But otherwise, other reads (especially plain reads!) will _never_ be
> ordered by smp_wmb(). Is my understanding correct?

The ordering is indirect, but yes.

> I am asking because KCSAN is growing limited support for weak memory
> modeling and memory barriers, and I'm trying to figure out if I'm seeing
> a false positive or genuinely allowed race.
> One caveat is the case I'm trying to understand doesn't involve just 2
> CPUs but also a device. And for now, I'm assuming that dma_wmb() is as
> strong as smp_wmb() also wrt other CPUs (but my guess is this
> assumption is already too strong).

I'm not sure that is right.  dma_wmb affects the visibility of writes to 
a DMA buffer from the point of view of the device, not necessarily from 
the point of view of other CPUs.  At least, there doesn't seem to be any 
claim in memory-barriers.txt that it does so.

> The whole area of the memory model that includes talking to devices and
> devices talking back to CPUs seems quite murky, and need to confirm that
> I either got it right or wrong. :-)

The LKMM itself doesn't include anything about device I/O.  So you're 
already going beyond the known limits.  :-)


> KCSAN is saying there is a potential use-after-free read of an skb due
> to the read to 0xffff8880077b5570 potentially being delayed/reordered
> later. If the memory was reallocated and reused concurrently, the read
> could read garbage data:
> 1.	The e1000 driver is being instructed to transmit in
> 	e1000_xmit_frame(). Here it uses the data in the skb in various
> 	places (e.g. in skb_headlen() above) to set up a new element in
> 	the ring buffer to be consumed by the device via DMA.

You mean here the driver reads some stuff from the skb, right?  And 
various writes depend on the data that was read, but these dependencies 
aren't evident to the memory model because they all involve plain 

> 2.	Eventually it calls e1000_tx_queue(), which seems to publish the
> 	next entry into the ring buffer and finally calls dma_wmb().
> 	Until this point I see no other barriers (although there's a
> 	writel(), but it doesn't always seem to be called).

And potentially those reads from above could be delayed (or repeated) 
after this point.

But you're missing something.  What matters isn't the dma_wmb.  Rather, 
it's the call which transfers ownership of the buffer to the device.  
That call must certainly include its own memory barrier, meaning that 
the reads must complete before the call returns.  We don't depend on a 
dma_wmb which might or might not be present to enforce this ordering.

Unless this buffer mapping is supposed to be coherent, of course, in 
which case there would be no ownership transfers.

> 3.	e1000_clean_tx_irq() is called on another CPU after transmit
> 	completes, and we know the device has consumed that entry from
> 	the ring buffer. At this point the driver then says that the
> 	associated skb can be kfree()'d.
> 4.	If I interpreted dma_wmb() (and smp_wmb()) right, plain reads
> 	may be reordered after it, irrespective if a write that depended
> 	on such reads was ordered by the wmb(). Which means the
> 	reordering of the plain reads accessing the skb before it may in
> 	fact happen concurrently with the kfree() of skb if reordered
> 	after. For example reordered to the very end of
> 	e1000_xmit_frame() (line 3282) as KCSAN simulated in this case.
> Is the above result allowed by the kernel's memory model?

This can't happen, for the reason explained above, if the buffer is 
non-coherent.  But if the DMA mapping is coherent, this does sound like 
a bug.

> In practice, my guess is no compiler and architecture combination would
> allow this today; or is there an arch where it could?

Probably not; reordering of reads tends to take place over time 
scales a lot shorter than lengthy I/O operations.


  reply	other threads:[~2021-08-16 14:59 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-16 10:12 Marco Elver
2021-08-16 14:59 ` Alan Stern [this message]
2021-08-16 17:23   ` Marco Elver
2021-08-16 19:21     ` Alan Stern
2021-08-16 20:50       ` Paul E. McKenney
2021-08-17 12:14         ` Marco Elver
2021-08-17 12:28         ` Will Deacon
2021-08-17 13:27           ` Marco Elver
2021-08-17 13:53           ` Paul E. McKenney
2021-08-18 11:39             ` Will Deacon
2021-08-18 23:17               ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \
    --subject='Re: LKMM: Read dependencies of writes ordered by dma_wmb()?' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).