LKML Archive on lore.kernel.org
 help / color / Atom feed
From: "Ahmed S. Darwish" <darwish.07@gmail.com>
To: Andri Yngvason <andri.yngvason@marel.com>
Cc: Wolfgang Grandegger <wg@grandegger.com>,
	Olivier Sobrie <olivier@sobrie.be>,
	Oliver Hartkopp <socketcan@hartkopp.net>,
	Marc Kleine-Budde <mkl@pengutronix.de>,
	Linux-CAN <linux-can@vger.kernel.org>,
	netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling
Date: Sun, 25 Jan 2015 04:43:00 +0200
Message-ID: <20150125024300.GA6028@Darwish.PC> (raw)
In-Reply-To: <20150123103213.6205.54618@shannon>

On Fri, Jan 23, 2015 at 10:32:13AM +0000, Andri Yngvason wrote:
> Quoting Ahmed S. Darwish (2015-01-23 06:07:34)
> > On Wed, Jan 21, 2015 at 05:13:45PM +0100, Wolfgang Grandegger wrote:
> > > On Wed, 21 Jan 2015 10:36:47 -0500, "Ahmed S. Darwish"
> > > <darwish.07@gmail.com> wrote:
> > > > On Wed, Jan 21, 2015 at 03:00:15PM +0000, Andri Yngvason wrote:
> > > >> Quoting Ahmed S. Darwish (2015-01-21 14:43:23)
> > > >> > Hi!
> > > > 
> > > > ...
> > > > 
> > > >> > <-- Unplug the cable -->
> > > >> > 
> > > >> >  (000.009106)  can0  20000080   [8]  00 00 00 00 00 00 08 00  
> > > >> >  ERRORFRAME
> > > >> >         bus-error
> > > >> >         error-counter-tx-rx{{8}{0}}
> > > >> >  (000.001872)  can0  20000080   [8]  00 00 00 00 00 00 10 00  
> > > 
> > > For a bus-errors I would also expcect some more information in the
> > > data[2..3] fields. But these are always zero.
> > > 
> > 
> > M16C error factors made it possible to report things like
> > CAN_ERR_PROT_FORM/STUFF/BIT0/BIT1/TX in data[2], and
> > CAN_ERR_PROT_LOC_ACK/CRC_DEL in data[3].
> > 
> > Unfortunately such error factors are only reported in Leaf, but
> > not in USBCan-II due to the wire format change in the error event:
> > 
> >         struct leaf_msg_error_event {
> >                 u8 tid;
> >                 u8 flags;
> >                 __le16 time[3];
> >                 u8 channel;
> >                 u8 padding;
> >                 u8 tx_errors_count;
> >                 u8 rx_errors_count;
> >                 u8 status;
> >                 u8 error_factor;
> >         } __packed;
> > 
> >         struct usbcan_msg_error_event {
> >                 u8 tid;
> >                 u8 padding;
> >                 u8 tx_errors_count_ch0;
> >                 u8 rx_errors_count_ch0;
> >                 u8 tx_errors_count_ch1;
> >                 u8 rx_errors_count_ch1;
> >                 u8 status_ch0;
> >                 u8 status_ch1;
> >                 __le16 time;
> >         } __packed;
> > 
> > I speculate that the wire format was changed due to controller
> > bugs in the USBCan-II, which was slightly mentioned in their
> > data sheets here:
> > 
> >         http://www.kvaser.com/canlib-webhelp/page_hardware_specific_can_controllers.html
> > 
> > So it seems there's really no way for filling such bus error
> > info given the very limited amount of data exported :-(
> >
> We experienced similar problems with FlexCAN.

Hmm, I'll have a look there then...

Although my initial instincts imply that the FlexCAN driver has
access to the raw CAN registers, something I'm unable to do here.
But maybe there's some black magic I'm missing :-)

[...]

> > 
> > I've dumped _every_ message I receive from the firmware while
> > disconnecting the CAN bus, waiting a while, and connecting it again.
> > I really received _nothing_ from the firmware when the CAN bus was
> > reconnected and the data packets were flowing again. Not even a
> > single CHIP_STATE_EVENT, even after waiting for a long time.
> > 
> > So it's basically:
> > ...
> > ERR EVENT, txerr=128, rxerr=0
> > ERR EVENT, txerr=128, rxerr=0
> > ERR EVENT, txerr=128, rxerr=0
> > ...
> > 
> > then complete silence, except the data frames. I've even tried with
> > different versions of the firmware, but the same behaviour persisted.
> > 
> > > > So, What can the driver do given the above?
> > > 
> > > Little if the notification does not come.
> > > 
> > 
> > We can poll the state by sending CMD_GET_CHIP_STATE to the firmware,
> > and it will hopefully reply with a CHIP_STATE_EVENT response
> > containing the new txerr and rxerr values that we can use for
> > reverse state transitions.
> >
> > But do we _really_ want to go through the path? I feel that it will
> > open some cans of worms w.r.t. concurrent access to both the netdev
> > and USB stacks from a single driver.
> >
> Honestly, I don't know.
> >
> > A possible solution can be setting up a kernel thread that queries
> > for a CHIP_STATE_EVENT every second?
> > 
> Have you considered polling in kvaser_usb_tx_acknowledge? You could do something
> like:
> if(unlikely(dev->can.state != CAN_STATE_ERROR_ACTIVE))
> {
>     request_state();
> }
> 

OK, I have four important updates on this issue:

a) My initial testing was done on high-speed channel, at a bitrate
   of 50K. After setting the bus to a more reasonable bitrate 500K
   or 1M, I was _consistently_ able to receive CHIP_STATE_EVENTs
   when plugging the CAN connector again after an unplug.

b) The error counters on this device do not get reset on plugging
   after an unplug. I've setup a kernel thread [2] that queries
   the chip state event every second, and the error counters stays
   the same all the time. [1]

c) There's a single case when the erro counters do indeed get
   reversed, and it happens only when introducing some noise in
   the bus after the re-plug. In that case, the new error events
   get raised with new error counters starting from 0/1 again.

d) I've discovered a bug that forbids the CAN state from
   returning to ERROR_ACTIVE in case of the error counters
   numbers getting decreased. But independent from that bug, the
   verbose debugging messages clearly imply that we only get the
   error counters decreased in the case mentioned at `c)' above.

So from [1] and [2], it's now clear that the device do not reset
these counters back in the re-plug case. I'll give a check to
flexcan as advised, but unfortunately I don't really think there's
much I can do about this.

[1]

[  877.207082] CAN_ERROR_: channel=0, txerr=88, rxerr=0
[  877.207090] CAN_ERROR_: channel=0, txerr=136, rxerr=0
[  877.207094] CAN_ERROR_: channel=0, txerr=144, rxerr=0
[  877.207098] CAN_ERROR_: channel=0, txerr=152, rxerr=0
[  877.207100] CAN_ERROR_: channel=0, txerr=160, rxerr=0
[  877.207102] CAN_ERROR_: channel=0, txerr=168, rxerr=0
[  877.208075] CAN_ERROR_: channel=0, txerr=200, rxerr=0 

(( The above error event, staying the same at txerr=200 keeps
   flooding the bus until the CAN cable is re-plugged ))

[  878.225116] CHIP_STATE: channel=0, txerr=200, rxerr=0
[  878.225143] CHIP_STATE: channel=1, txerr=0, rxerr=0
[  879.265167] CHIP_STATE: channel=0, txerr=200, rxerr=0
[  879.267152] CHIP_STATE: channel=1, txerr=0, rxerr=0
[  879.265167] CHIP_STATE: channel=0, txerr=200, rxerr=0
[  879.267152] CHIP_STATE: channel=1, txerr=0, rxerr=0

(( The same counters get repeated every second ))

[2] State was polled using:

static int kvaser_usb_poll_chip_state(void *vpriv) {
      struct kvaser_usb_net_priv *priv = vpriv;

      while (!kthread_should_stop()) {
              kvaser_usb_simple_msg_async(priv, CMD_GET_CHIP_STATE);
              ssleep(1);
      }

      return 0;
}

> I don't think that anything beyond that would be worth pursuing.
> 

I agree, but given the new input, it seems that our problem
extends to the error counters themselves not getting decreased
on re-plug. So, even polling will not solve the issue: we'll
get the same txerr/rxerr values again and again :-(

> Best regards,
> Andri

Regards,
Darwish


  reply index

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-23 15:46 [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Ahmed S. Darwish
2014-12-23 15:53 ` [PATCH] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2014-12-24 12:36   ` Olivier Sobrie
2014-12-24 15:04     ` Ahmed S. Darwish
2014-12-28 21:51       ` Olivier Sobrie
2014-12-30 15:33         ` Ahmed S. Darwish
2014-12-31 12:13           ` Olivier Sobrie
2014-12-24 12:31 ` [PATCH] can: kvaser_usb: Don't free packets when tight on URBs Olivier Sobrie
2014-12-24 15:52   ` Ahmed S. Darwish
2014-12-24 23:56 ` [PATCH v2 1/4] " Ahmed S. Darwish
2014-12-24 23:59   ` [PATCH v2 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
2014-12-25  0:02     ` [PATCH v2 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
2014-12-25  0:04       ` [PATCH v2 4/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2014-12-28 21:54       ` [PATCH v2 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Olivier Sobrie
2014-12-25  2:50   ` [PATCH v2 1/4] can: kvaser_usb: Don't free packets when tight on URBs Greg KH
2014-12-25  9:38     ` Ahmed S. Darwish
2014-12-28 21:52   ` Olivier Sobrie
2015-01-01 21:59 ` [PATCH] " Stephen Hemminger
2015-01-03 14:34   ` Ahmed S. Darwish
2015-01-05 17:49 ` [PATCH v3 1/4] " Ahmed S. Darwish
2015-01-05 17:52   ` [PATCH v3 2/4] can: kvaser_usb: Reset all URB tx contexts upon channel close Ahmed S. Darwish
2015-01-05 17:57     ` [PATCH v3 3/4] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels Ahmed S. Darwish
2015-01-05 18:31       ` [PATCH v3 4/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2015-01-08 11:53         ` Marc Kleine-Budde
2015-01-08 15:19           ` Ahmed S. Darwish
2015-01-12 11:51             ` Marc Kleine-Budde
2015-01-12 12:26               ` Ahmed S. Darwish
2015-01-12 12:34                 ` Marc Kleine-Budde
2015-01-09  3:06           ` Ahmed S. Darwish
2015-01-09 14:05             ` Marc Kleine-Budde
2015-01-08  9:59   ` [PATCH v3 1/4] can: kvaser_usb: Don't free packets when tight on URBs Marc Kleine-Budde
2015-01-11 20:05 ` [PATCH v4 00/04] can: Introduce support for Kvaser USBCAN-II devices Ahmed S. Darwish
2015-01-11 20:11   ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() devices Ahmed S. Darwish
2015-01-11 20:15     ` [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM Ahmed S. Darwish
2015-01-11 20:36       ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2015-01-11 20:45         ` [PATCH v4 4/4] can: kvaser_usb: Retry first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
2015-01-11 20:51           ` Marc Kleine-Budde
2015-01-12 10:14             ` Ahmed S. Darwish
2015-01-12 10:25               ` Marc Kleine-Budde
2015-01-12 13:33               ` Olivier Sobrie
2015-01-12 13:50                 ` Ahmed S. Darwish
2015-01-12 11:20         ` [PATCH v4 3/4] can: kvaser_usb: Add support for the Usbcan-II family Ahmed S. Darwish
2015-01-12 11:43         ` Marc Kleine-Budde
2015-01-12 12:07           ` Ahmed S. Darwish
2015-01-12 12:36             ` Ahmed S. Darwish
2015-01-12 13:53         ` Olivier Sobrie
2015-01-18 20:12           ` Ahmed S. Darwish
2015-01-18 20:13             ` Marc Kleine-Budde
2015-01-12 11:09       ` [PATCH v4 2/4] can: kvaser_usb: Update error counters before exiting on OOM Marc Kleine-Budde
2015-01-12 20:36         ` Ahmed S. Darwish
2015-01-16 14:39           ` Marc Kleine-Budde
2015-01-11 20:49     ` [PATCH v4 01/04] can: kvaser_usb: Don't dereference skb after a netif_rx() Ahmed S. Darwish
2015-01-12 11:05       ` Marc Kleine-Budde
2015-01-20 21:44 ` [PATCH v5 1/5] can: kvaser_usb: Update net interface state before exiting on OOM Ahmed S. Darwish
2015-01-20 21:45   ` [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
2015-01-20 21:47     ` [PATCH v5 3/5] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
2015-01-20 21:48       ` [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
2015-01-20 21:50         ` [PATCH v5 5/5] can: kvaser_usb: Add support for the USBcan-II family Ahmed S. Darwish
2015-01-21 12:24         ` [PATCH v5 4/5] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Sergei Shtylyov
2015-01-25 11:59           ` Ahmed S. Darwish
2015-01-21 10:33     ` [PATCH v5 2/5] can: kvaser_usb: Consolidate and unify state change handling Andri Yngvason
2015-01-21 10:44       ` Marc Kleine-Budde
2015-01-21 11:00         ` Andri Yngvason
2015-01-21 11:53       ` Wolfgang Grandegger
2015-01-21 14:43         ` Ahmed S. Darwish
2015-01-21 15:00           ` Andri Yngvason
2015-01-21 15:36             ` Ahmed S. Darwish
2015-01-21 16:13               ` Wolfgang Grandegger
2015-01-23  6:07                 ` Ahmed S. Darwish
2015-01-23 10:32                   ` Andri Yngvason
2015-01-25  2:43                     ` Ahmed S. Darwish [this message]
2015-01-26 10:21                       ` Andri Yngvason
2015-01-21 16:37               ` Andri Yngvason
2015-01-21 16:20     ` Andri Yngvason
2015-01-21 22:59       ` Marc Kleine-Budde
2015-01-22 10:14         ` Andri Yngvason
2015-01-25  2:49           ` Ahmed S. Darwish
2015-01-25  3:21       ` Ahmed S. Darwish
2015-01-26  5:17 ` [PATCH v6 0/7] can: kvaser_usb: Leaf bugfixes and USBCan-II support Ahmed S. Darwish
2015-01-26  5:20   ` [PATCH v6 1/7] can: kvaser_usb: Do not sleep in atomic context Ahmed S. Darwish
2015-01-26  5:22     ` [PATCH v6 2/7] can: kvaser_usb: Send correct context to URB completion Ahmed S. Darwish
2015-01-26  5:24       ` [PATCH v6 3/7] can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT Ahmed S. Darwish
2015-01-26  5:25         ` [PATCH v6 4/7] can: kvaser_usb: Fix state handling upon BUS_ERROR events Ahmed S. Darwish
2015-01-26  5:27           ` [PATCH v6 5/7] can: kvaser_usb: Update interface state before exiting on OOM Ahmed S. Darwish
2015-01-26  5:29             ` [PATCH v6 6/7] can: kvaser_usb: Consolidate and unify state change handling Ahmed S. Darwish
2015-01-26  5:33               ` [PATCH v6 7/7] can: kvaser_usb: Add support for the USBcan-II family Ahmed S. Darwish
2015-01-26 10:34                 ` Andri Yngvason
2015-01-26 10:26               ` [PATCH v6 6/7] can: kvaser_usb: Consolidate and unify state change handling Andri Yngvason
2015-01-26 10:28             ` [PATCH v6 5/7] can: kvaser_usb: Update interface state before exiting on OOM Andri Yngvason
2015-01-26 10:50               ` Marc Kleine-Budde
2015-01-26 10:52                 ` Andri Yngvason
2015-01-26 10:53                   ` Marc Kleine-Budde
2015-01-26  9:55   ` [PATCH v6 0/7] can: kvaser_usb: Leaf bugfixes and USBCan-II support Marc Kleine-Budde
2015-01-26 10:07   ` Marc Kleine-Budde
2015-01-26 10:56     ` Marc Kleine-Budde

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150125024300.GA6028@Darwish.PC \
    --to=darwish.07@gmail.com \
    --cc=andri.yngvason@marel.com \
    --cc=linux-can@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mkl@pengutronix.de \
    --cc=netdev@vger.kernel.org \
    --cc=olivier@sobrie.be \
    --cc=socketcan@hartkopp.net \
    --cc=wg@grandegger.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lkml.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lkml.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lkml.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lkml.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lkml.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lkml.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lkml.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lkml.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lkml.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lkml.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lkml.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lkml.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git