LKML Archive on lore.kernel.org
 help / color / Atom feed
From: "Ahmed S. Darwish" <darwish.07@gmail.com>
To: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Olivier Sobrie <olivier@sobrie.be>,
	Oliver Hartkopp <socketcan@hartkopp.net>,
	Wolfgang Grandegger <wg@grandegger.com>,
	Andri Yngvason <andri.yngvason@marel.com>,
	Linux-CAN <linux-can@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org
Subject: Re: [PATCH v4 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions
Date: Sat, 14 Mar 2015 10:38:20 -0400
Message-ID: <20150314143820.GA22071@linux> (raw)
In-Reply-To: <55043A7E.7040001@pengutronix.de>

Hi Marc,

On Sat, Mar 14, 2015 at 02:41:18PM +0100, Marc Kleine-Budde wrote:
> On 03/14/2015 02:02 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> > 
> > A number of tx queue wake-up events went missing due to the
> > outlined scenario below. Start state is a pool of 16 tx URBs,
> > active tx_urbs count = 15, with the netdev tx queue open.
> > 
> > CPU #1 [softirq]                         CPU #2 [softirq]
> > start_xmit()                             tx_acknowledge()
> > ................                         ................
> > 
> > atomic_inc(&tx_urbs);
> > if (atomic_read(&tx_urbs) >= 16) {
> >                         -->
> >                                          atomic_dec(&tx_urbs);
> >                                          netif_wake_queue();
> >                                          return;
> >                         <--
> >     netif_stop_queue();
> > }
> > 
> > At the end, the correct state expected is a 15 tx_urbs count
> > value with the tx queue state _open_. Due to the race, we get
> > the same tx_urbs value but with the tx queue state _stopped_.
> > The wake-up event is completely lost.
> > 
> > Thus avoid hand-rolled concurrency mechanisms and use a proper
> > lock for contexts and tx queue protection.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
> 
> Applied to can. This will go into David's net tree and finally into
> net-next. Then I'll apply patches 2+3. Nag me, if I forget about them ;)
> 

Thanks! :-)

So if I've understood correctly, this patch will go to -rc5 and
the rest will go into net-next?

If so, IMHO patch #2 should also go to -rc5. I know it's usually
frowned up on to add further patches at this late -rc stage, but
here's my logic:

The original driver code just _arbitrarily_ assumed a MAX_TX_URB
value of 16 parallel transmissions. This value was chosen, it seems,
because the driver was heavily based on esd_usb2.c and the esd code
just did so :-(

Meanwhile, in the Kvaser hardware at hand, if I've increased the
driver's max parallel transmissions little above the recommended
limit reported by firmware, the firmware breaks up badly, reports a
massive list of internal errors, and the candump traces becomes
sort of an internal mess hardly related to the actual frames sent
and received.

In my case, I was lucky that the brand we own here (*) had a higher
max outstanding transmissions value than 16. But if there is hardware
out there with a max oustanding tx support < 16 (#), such hardware
will break badly under a heavy transmission load :-(

(*) http://www.kvaser.com/products/kvaser-usb-hs-ii-hsls/

(#) There are a huge list of Kvaser products having the same controller
    but with different performance metrics, so this is quite a
    possiblity.

Thanks,
Darwish

  reply index

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-26 15:20 [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Ahmed S. Darwish
2015-02-26 15:22 ` [PATCH 2/5] can: kvaser_usb: Read all messages in a bulk-in URB buffer Ahmed S. Darwish
2015-02-26 15:24   ` [PATCH 3/5] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-02-26 15:25     ` [PATCH 4/5] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-02-26 15:29       ` [PATCH 5/5] can: kvaser_usb: Fix tx queue start/stop race conditions Ahmed S. Darwish
2015-03-14 14:26   ` [PATCH 2/5] can: kvaser_usb: Read all messages in a bulk-in URB buffer Marc Kleine-Budde
2015-03-04  9:15 ` [PATCH 1/5] can: kvaser_usb: Avoid double free on URB submission failures Marc Kleine-Budde
2015-03-09 12:32   ` Ahmed S. Darwish
2015-03-09 12:56     ` Marc Kleine-Budde
2015-03-11 15:23 ` [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Ahmed S. Darwish
2015-03-11 15:28   ` [PATCH v2 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-03-11 15:30     ` [PATCH v2 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-03-11 15:36   ` [PATCH v2 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Marc Kleine-Budde
2015-03-11 15:57     ` Ahmed S. Darwish
2015-03-11 17:37 ` [PATCH v3 " Ahmed S. Darwish
2015-03-11 17:39   ` [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-03-11 17:39     ` [PATCH v3 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-03-11 21:53     ` [PATCH v3 2/3] can: kvaser_usb: Utilize all possible tx URBs Marc Kleine-Budde
2015-03-12 10:52       ` Ahmed S. Darwish
2015-03-12 11:29         ` Marc Kleine-Budde
2015-03-11 21:43   ` [PATCH v3 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Marc Kleine-Budde
2015-03-12 19:30     ` Ahmed S. Darwish
2015-03-14 13:02 ` [PATCH v4 " Ahmed S. Darwish
2015-03-14 13:09   ` [PATCH v4 2/3] can: kvaser_usb: Utilize all possible tx URBs Ahmed S. Darwish
2015-03-14 13:11     ` [PATCH v4 3/3] can: kvaser_usb: Use can-dev unregistration mechanism Ahmed S. Darwish
2015-03-14 15:26       ` Marc Kleine-Budde
2015-03-14 15:41         ` Ahmed S. Darwish
2015-03-14 15:55           ` Marc Kleine-Budde
2015-03-14 16:06             ` Ahmed S. Darwish
2015-03-14 13:41   ` [PATCH v4 1/3] can: kvaser_usb: Fix tx queue start/stop race conditions Marc Kleine-Budde
2015-03-14 14:38     ` Ahmed S. Darwish [this message]
2015-03-14 14:58       ` Marc Kleine-Budde
2015-03-14 15:19         ` Ahmed S. Darwish
2015-03-15 15:03 ` [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value Ahmed S. Darwish
2015-03-15 15:10   ` [PATCH v5 2/2] can: kvaser_usb: Fix sparse warning __le16 degrades to integer Ahmed S. Darwish
2015-03-15 18:08   ` [PATCH v5 1/2] can: kvaser_usb: Comply with firmware max tx URBs value Marc Kleine-Budde
2015-03-16 12:16     ` Ahmed S. Darwish
2015-03-16 12:56       ` Marc Kleine-Budde

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150314143820.GA22071@linux \
    --to=darwish.07@gmail.com \
    --cc=andri.yngvason@marel.com \
    --cc=linux-can@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mkl@pengutronix.de \
    --cc=netdev@vger.kernel.org \
    --cc=olivier@sobrie.be \
    --cc=socketcan@hartkopp.net \
    --cc=wg@grandegger.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lkml.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lkml.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lkml.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lkml.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lkml.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lkml.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lkml.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lkml.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lkml.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lkml.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lkml.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lkml.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git